* [dpdk-dev] [PATCH v1 0/3] Add read-write concurrency to rte_hash library
@ 2018-06-08 10:51 Yipeng Wang
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 1/3] hash: add read and write concurrency support Yipeng Wang
` (6 more replies)
0 siblings, 7 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-06-08 10:51 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, john.mcnamara, bruce.richardson,
honnappa.nagarahalli, vguvva, brijesh.s.singh
This patch set adds read-write concurrency support to rte_hash.
A new flag value is added so that the user can indicate at creation time
whether read-write concurrency is needed. Test cases are implemented for
both functional and performance testing.
The new concurrency model is based on rte_rwlock. When Intel TSX is
available and the user requests it, the TM (transactional memory) version
of rte_rwlock is used. Both multi-writer and read-write concurrency are
now protected by rte_rwlock instead of the x86-specific RTM instructions,
so the x86-specific header rte_cuckoo_hash_x86.h is removed and its code
is folded into the main .c file.
A new rte_hash_count API is proposed to report how many keys are
currently stored in the hash table.
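As a usage illustration (not part of the patches themselves), below is a
minimal sketch of creating a table with the new flag; the table name and
sizes are hypothetical and error handling is omitted:

#include <rte_hash.h>
#include <rte_jhash.h>
#include <rte_lcore.h>

/* Sketch only: create a table whose lookups and insertions may run
 * concurrently. RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY is the flag added by
 * this patch set; OR in RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT as well to
 * request the TM (lock elision) variant of the rwlock where TSX exists.
 */
static struct rte_hash *
create_rw_table(void)
{
        struct rte_hash_parameters params = {
                .name = "rw_example",           /* hypothetical name */
                .entries = 1024,
                .key_len = sizeof(uint32_t),
                .hash_func = rte_jhash,
                .hash_func_init_val = 0,
                .socket_id = rte_socket_id(),
                .extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY,
        };

        return rte_hash_create(&params);
}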
Yipeng Wang (3):
hash: add read and write concurrency support
test: add test case for read write concurrency
hash: add new API function to query the key count
lib/librte_hash/rte_cuckoo_hash.c | 658 +++++++++++++++++++++-------------
lib/librte_hash/rte_cuckoo_hash.h | 16 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 164 ---------
lib/librte_hash/rte_hash.h | 14 +
lib/librte_hash/rte_hash_version.map | 8 +
test/test/Makefile | 1 +
test/test/test_hash.c | 12 +
test/test/test_hash_multiwriter.c | 9 +
test/test/test_hash_perf.c | 36 +-
test/test/test_hash_readwrite.c | 649 +++++++++++++++++++++++++++++++++
10 files changed, 1135 insertions(+), 432 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
create mode 100644 test/test/test_hash_readwrite.c
--
2.7.4
* [dpdk-dev] [PATCH v1 1/3] hash: add read and write concurrency support
2018-06-08 10:51 [dpdk-dev] [PATCH v1 0/3] Add read-write concurrency to rte_hash library Yipeng Wang
@ 2018-06-08 10:51 ` Yipeng Wang
2018-06-26 14:59 ` De Lara Guarch, Pablo
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 2/3] test: add test case for read write concurrency Yipeng Wang
` (5 subsequent siblings)
6 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-06-08 10:51 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, john.mcnamara, bruce.richardson,
honnappa.nagarahalli, vguvva, brijesh.s.singh
The existing implementation of librte_hash does not support read-write
concurrency. This commit implements read-write safety using rte_rwlock,
and the TM version of rte_rwlock when hardware transactional memory is
available. Both multi-writer and read-write concurrency are now protected
by rte_rwlock. The x86-specific header file is removed, since the
x86-specific RTM functions are no longer called directly by rte_hash.
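As a rough sketch of what this enables (reader_fn, writer_fn and the key
value are hypothetical; the table is assumed to be created with
RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY), lookups and insertions can now run
on different lcores without external locking:

#include <stdio.h>
#include <stdint.h>
#include <rte_hash.h>
#include <rte_launch.h>

/* Sketch only: a lookup lcore that may run while another lcore inserts;
 * the library takes its internal reader lock (or the TM variant) itself.
 */
static int
reader_fn(void *arg)
{
        struct rte_hash *h = arg;
        uint32_t key = 42;                      /* hypothetical key */
        void *data;

        if (rte_hash_lookup_data(h, &key, &data) >= 0)
                printf("found %p\n", data);
        return 0;
}

/* Sketch only: a writer lcore; rte_hash_add_key_data() now takes the
 * internal writer lock itself.
 */
static int
writer_fn(void *arg)
{
        struct rte_hash *h = arg;
        uint32_t key = 42;

        return rte_hash_add_key_data(h, &key, (void *)(uintptr_t)7);
}

/* Launched concurrently, e.g.:
 *      rte_eal_remote_launch(reader_fn, h, lcore_a);
 *      rte_eal_remote_launch(writer_fn, h, lcore_b);
 */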
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 627 +++++++++++++++++++++-------------
lib/librte_hash/rte_cuckoo_hash.h | 16 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 164 ---------
lib/librte_hash/rte_hash.h | 3 +
4 files changed, 390 insertions(+), 420 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index a07543a..a5bb4d4 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -31,9 +31,6 @@
#include "rte_hash.h"
#include "rte_cuckoo_hash.h"
-#if defined(RTE_ARCH_X86)
-#include "rte_cuckoo_hash_x86.h"
-#endif
TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
@@ -93,8 +90,10 @@ rte_hash_create(const struct rte_hash_parameters *params)
void *buckets = NULL;
char ring_name[RTE_RING_NAMESIZE];
unsigned num_key_slots;
- unsigned hw_trans_mem_support = 0;
unsigned i;
+ unsigned int hw_trans_mem_support = 0, multi_writer_support = 0;
+ unsigned int readwrite_concur_support = 0;
+
rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
hash_list = RTE_TAILQ_CAST(rte_hash_tailq.head, rte_hash_list);
@@ -118,8 +117,16 @@ rte_hash_create(const struct rte_hash_parameters *params)
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
hw_trans_mem_support = 1;
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD)
+ multi_writer_support = 1;
+
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY) {
+ readwrite_concur_support = 1;
+ multi_writer_support = 1;
+ }
+
/* Store all keys and leave the first entry as a dummy entry for lookup_bulk */
- if (hw_trans_mem_support)
+ if (multi_writer_support)
/*
* Increase number of slots by total number of indices
* that can be stored in the lcore caches
@@ -233,7 +240,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->cmp_jump_table_idx = KEY_OTHER_BYTES;
#endif
- if (hw_trans_mem_support) {
+ if (multi_writer_support) {
h->local_free_slots = rte_zmalloc_socket(NULL,
sizeof(struct lcore_cache) * RTE_MAX_LCORE,
RTE_CACHE_LINE_SIZE, params->socket_id);
@@ -261,6 +268,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->key_store = k;
h->free_slots = r;
h->hw_trans_mem_support = hw_trans_mem_support;
+ h->multi_writer_support = multi_writer_support;
+ h->readwrite_concur_support = readwrite_concur_support;
#if defined(RTE_ARCH_X86)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
@@ -271,21 +280,14 @@ rte_hash_create(const struct rte_hash_parameters *params)
#endif
h->sig_cmp_fn = RTE_HASH_COMPARE_SCALAR;
- /* Turn on multi-writer only with explicit flat from user and TM
+ /* Turn on multi-writer only with explicit flag from user and TM
* support.
*/
- if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD) {
- if (h->hw_trans_mem_support) {
- h->add_key = ADD_KEY_MULTIWRITER_TM;
- } else {
- h->add_key = ADD_KEY_MULTIWRITER;
- h->multiwriter_lock = rte_malloc(NULL,
- sizeof(rte_spinlock_t),
+ if (h->multi_writer_support) {
+ h->readwrite_lock = rte_malloc(NULL, sizeof(rte_rwlock_t),
LCORE_CACHE_SIZE);
- rte_spinlock_init(h->multiwriter_lock);
- }
- } else
- h->add_key = ADD_KEY_SINGLEWRITER;
+ rte_rwlock_init(h->readwrite_lock);
+ }
/* Populate free slots ring. Entry zero is reserved for key misses. */
for (i = 1; i < params->entries + 1; i++)
@@ -335,11 +337,10 @@ rte_hash_free(struct rte_hash *h)
rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
- if (h->hw_trans_mem_support)
+ if (h->multi_writer_support) {
rte_free(h->local_free_slots);
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_free(h->multiwriter_lock);
+ rte_free(h->readwrite_lock);
+ }
rte_ring_free(h->free_slots);
rte_free(h->key_store);
rte_free(h->buckets);
@@ -386,77 +387,50 @@ rte_hash_reset(struct rte_hash *h)
for (i = 1; i < h->entries + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
/* Reset local caches per lcore */
for (i = 0; i < RTE_MAX_LCORE; i++)
h->local_free_slots[i].len = 0;
}
}
-/* Search for an entry that can be pushed to its alternative location */
-static inline int
-make_space_bucket(const struct rte_hash *h, struct rte_hash_bucket *bkt,
- unsigned int *nr_pushes)
-{
- unsigned i, j;
- int ret;
- uint32_t next_bucket_idx;
- struct rte_hash_bucket *next_bkt[RTE_HASH_BUCKET_ENTRIES];
- /*
- * Push existing item (search for bucket with space in
- * alternative locations) to its alternative location
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Search for space in alternative locations */
- next_bucket_idx = bkt->sig_alt[i] & h->bucket_bitmask;
- next_bkt[i] = &h->buckets[next_bucket_idx];
- for (j = 0; j < RTE_HASH_BUCKET_ENTRIES; j++) {
- if (next_bkt[i]->key_idx[j] == EMPTY_SLOT)
- break;
- }
-
- if (j != RTE_HASH_BUCKET_ENTRIES)
- break;
- }
-
- /* Alternative location has spare room (end of recursive function) */
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- next_bkt[i]->sig_alt[j] = bkt->sig_current[i];
- next_bkt[i]->sig_current[j] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[j] = bkt->key_idx[i];
- return i;
- }
+static inline void
+__hash_rw_writer_lock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_lock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_lock(h->readwrite_lock);
+}
- /* Pick entry that has not been pushed yet */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++)
- if (bkt->flag[i] == 0)
- break;
- /* All entries have been pushed, so entry cannot be added */
- if (i == RTE_HASH_BUCKET_ENTRIES || ++(*nr_pushes) > RTE_HASH_MAX_PUSHES)
- return -ENOSPC;
+static inline void
+__hash_rw_reader_lock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_lock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_lock(h->readwrite_lock);
+}
- /* Set flag to indicate that this entry is going to be pushed */
- bkt->flag[i] = 1;
+static inline void
+__hash_rw_writer_unlock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_unlock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_unlock(h->readwrite_lock);
+}
- /* Need room in alternative bucket to insert the pushed entry */
- ret = make_space_bucket(h, next_bkt[i], nr_pushes);
- /*
- * After recursive function.
- * Clear flags and insert the pushed entry
- * in its alternative location if successful,
- * or return error
- */
- bkt->flag[i] = 0;
- if (ret >= 0) {
- next_bkt[i]->sig_alt[ret] = bkt->sig_current[i];
- next_bkt[i]->sig_current[ret] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[ret] = bkt->key_idx[i];
- return i;
- } else
- return ret;
+static inline void
+__hash_rw_reader_unlock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_unlock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_unlock(h->readwrite_lock);
}
/*
@@ -469,32 +443,236 @@ enqueue_slot_back(const struct rte_hash *h,
struct lcore_cache *cached_free_slots,
void *slot_id)
{
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
cached_free_slots->objs[cached_free_slots->len] = slot_id;
cached_free_slots->len++;
} else
rte_ring_sp_enqueue(h->free_slots, slot_id);
}
+/* Search a key from bucket and update its data */
+static inline int32_t
+search_and_update(const struct rte_hash *h, void *data, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig, hash_sig_t alt_hash)
+{
+ int i;
+ struct rte_hash_key *k, *keys = h->key_store;
+
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (bkt->sig_current[i] == sig &&
+ bkt->sig_alt[i] == alt_hash) {
+ k = (struct rte_hash_key *) ((char *)keys +
+ bkt->key_idx[i] * h->key_entry_size);
+ if (rte_hash_cmp_eq(key, k->key, h) == 0) {
+ /* Update data */
+ k->pdata = data;
+ /*
+ * Return index where key is stored,
+ * subtracting the first dummy index
+ */
+ return bkt->key_idx[i] - 1;
+ }
+ }
+ }
+ return -1;
+}
+
+
+/* Only tries to insert at one bucket (@prim_bkt) without trying to push
+ * buckets around
+ */
+static inline int32_t
+rte_hash_cuckoo_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *prim_bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ unsigned int i;
+ struct rte_hash_bucket *cur_bkt = prim_bkt;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+ /* Check if key is already inserted */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ /* Insert new entry if there is room in the primary
+ * bucket.
+ */
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ /* Check if slot is available */
+ if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
+ prim_bkt->sig_current[i] = sig;
+ prim_bkt->sig_alt[i] = alt_hash;
+ prim_bkt->key_idx[i] = new_idx;
+ break;
+ }
+ }
+ __hash_rw_writer_unlock(h);
+
+ if (i != RTE_HASH_BUCKET_ENTRIES)
+ return 0;
+
+ /* no empty entry */
+ return -1;
+}
+
+/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
+ * the path head with new entry (sig, alt_hash, new_idx)
+ */
+static inline int
+rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *alt_bkt,
+ const struct rte_hash_key *key, void *data,
+ struct queue_node *leaf, uint32_t leaf_slot,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ uint32_t prev_alt_bkt_idx;
+ struct rte_hash_bucket *cur_bkt = bkt;
+ struct queue_node *prev_node, *curr_node = leaf;
+ struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
+ uint32_t prev_slot, curr_slot = leaf_slot;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+
+ /* Check if key is already inserted */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ ret = search_and_update(h, data, key, alt_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ while (likely(curr_node->prev != NULL)) {
+ prev_node = curr_node->prev;
+ prev_bkt = prev_node->bkt;
+ prev_slot = curr_node->prev_slot;
+
+ prev_alt_bkt_idx =
+ prev_bkt->sig_alt[prev_slot] & h->bucket_bitmask;
+
+ if (unlikely(&h->buckets[prev_alt_bkt_idx]
+ != curr_bkt)) {
+ __hash_rw_writer_unlock(h);
+ return -1;
+ }
+
+ /* Need to swap current/alt sig to allow later
+ * Cuckoo insert to move elements back to its
+ * primary bucket if available
+ */
+ curr_bkt->sig_alt[curr_slot] =
+ prev_bkt->sig_current[prev_slot];
+ curr_bkt->sig_current[curr_slot] =
+ prev_bkt->sig_alt[prev_slot];
+ curr_bkt->key_idx[curr_slot] =
+ prev_bkt->key_idx[prev_slot];
+
+ curr_slot = prev_slot;
+ curr_node = prev_node;
+ curr_bkt = curr_node->bkt;
+ }
+
+ curr_bkt->sig_current[curr_slot] = sig;
+ curr_bkt->sig_alt[curr_slot] = alt_hash;
+ curr_bkt->key_idx[curr_slot] = new_idx;
+
+ __hash_rw_writer_unlock(h);
+
+ return 0;
+
+}
+
+/*
+ * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
+ * Cuckoo
+ */
+static inline int
+rte_hash_cuckoo_make_space_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash,
+ uint32_t new_idx, int32_t *ret_val)
+{
+ unsigned int i;
+ struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
+ struct queue_node *tail, *head;
+ struct rte_hash_bucket *curr_bkt, *alt_bkt;
+
+ tail = queue;
+ head = queue + 1;
+ tail->bkt = bkt;
+ tail->prev = NULL;
+ tail->prev_slot = -1;
+
+ /* Cuckoo bfs Search */
+ while (likely(tail != head && head <
+ queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
+ RTE_HASH_BUCKET_ENTRIES)) {
+ curr_bkt = tail->bkt;
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
+ int32_t ret = rte_hash_cuckoo_move_insert_mw(h,
+ bkt, sec_bkt, key, data,
+ tail, i, sig, alt_hash,
+ new_idx, ret_val);
+ if (likely(ret != -1))
+ return ret;
+ }
+
+ /* Enqueue new node and keep prev node info */
+ alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
+ & h->bucket_bitmask]);
+ head->bkt = alt_bkt;
+ head->prev = tail;
+ head->prev_slot = i;
+ head++;
+ }
+ tail++;
+ }
+
+ return -ENOSPC;
+}
+
+
static inline int32_t
__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
hash_sig_t sig, void *data)
{
hash_sig_t alt_hash;
uint32_t prim_bucket_idx, sec_bucket_idx;
- unsigned i;
struct rte_hash_bucket *prim_bkt, *sec_bkt;
- struct rte_hash_key *new_k, *k, *keys = h->key_store;
+ struct rte_hash_key *new_k, *keys = h->key_store;
void *slot_id = NULL;
uint32_t new_idx;
int ret;
unsigned n_slots;
unsigned lcore_id;
struct lcore_cache *cached_free_slots = NULL;
- unsigned int nr_pushes = 0;
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_lock(h->multiwriter_lock);
+ int32_t ret_val;
prim_bucket_idx = sig & h->bucket_bitmask;
prim_bkt = &h->buckets[prim_bucket_idx];
@@ -506,7 +684,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
rte_prefetch0(sec_bkt);
/* Get a new slot for storing the new key */
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Try to get a free slot from the local cache */
@@ -516,8 +694,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
cached_free_slots->objs,
LCORE_CACHE_SIZE, NULL);
if (n_slots == 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
cached_free_slots->len += n_slots;
@@ -528,8 +705,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
slot_id = cached_free_slots->objs[cached_free_slots->len];
} else {
if (rte_ring_sc_dequeue(h->free_slots, &slot_id) != 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
}
@@ -538,116 +714,60 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
new_idx = (uint32_t)((uintptr_t) slot_id);
/* Check if key is already inserted in primary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (prim_bkt->sig_current[i] == sig &&
- prim_bkt->sig_alt[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- prim_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = prim_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
+ ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
+ if (ret != -1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret;
}
/* Check if key is already inserted in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (sec_bkt->sig_alt[i] == sig &&
- sec_bkt->sig_current[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- sec_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = sec_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret;
}
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
new_k->pdata = data;
-#if defined(RTE_ARCH_X86) /* currently only x86 support HTM */
- if (h->add_key == ADD_KEY_MULTIWRITER_TM) {
- ret = rte_hash_cuckoo_insert_mw_tm(prim_bkt,
- sig, alt_hash, new_idx);
- if (ret >= 0)
- return new_idx - 1;
- /* Primary bucket full, need to make space for new entry */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, prim_bkt, sig,
- alt_hash, new_idx);
+ /* Find an empty slot and insert */
+ ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- if (ret >= 0)
- return new_idx - 1;
+ /* Primary bucket full, need to make space for new entry */
+ ret = rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- /* Also search secondary bucket to get better occupancy */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, sec_bkt, sig,
- alt_hash, new_idx);
+ /* Also search secondary bucket to get better occupancy */
+ ret = rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
- if (ret >= 0)
- return new_idx - 1;
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
} else {
-#endif
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
-
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-
- /* Primary bucket full, need to make space for new entry
- * After recursive function.
- * Insert the new entry in the position of the pushed entry
- * if successful or return error and
- * store the new slot back in the ring
- */
- ret = make_space_bucket(h, prim_bkt, &nr_pushes);
- if (ret >= 0) {
- prim_bkt->sig_current[ret] = sig;
- prim_bkt->sig_alt[ret] = alt_hash;
- prim_bkt->key_idx[ret] = new_idx;
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-#if defined(RTE_ARCH_X86)
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret;
}
-#endif
- /* Error in addition, store new slot back in the ring and return error */
- enqueue_slot_back(h, cached_free_slots, (void *)((uintptr_t) new_idx));
-
-failure:
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return ret;
}
+
int32_t
rte_hash_add_key_with_hash(const struct rte_hash *h,
const void *key, hash_sig_t sig)
@@ -690,25 +810,20 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data)
else
return ret;
}
+
+/* Search one bucket to find the match key */
static inline int32_t
-__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig, void **data)
+search_one_bucket(const struct rte_hash *h, const void *key, hash_sig_t sig,
+ void **data, struct rte_hash_bucket *bkt)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
+ int i;
struct rte_hash_key *k, *keys = h->key_store;
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
-
- /* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
if (bkt->sig_current[i] == sig &&
bkt->key_idx[i] != EMPTY_SLOT) {
k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
+ bkt->key_idx[i] * h->key_entry_size);
if (rte_hash_cmp_eq(key, k->key, h) == 0) {
if (data != NULL)
*data = k->pdata;
@@ -720,6 +835,29 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig, void **data)
+{
+ uint32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int ret;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+ __hash_rw_reader_lock(h);
+
+ /* Check if key is in primary location */
+ ret = search_one_bucket(h, key, sig, data, bkt);
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
+ return ret;
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
@@ -727,23 +865,13 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
bkt = &h->buckets[bucket_idx];
/* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->sig_alt[i] == sig) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- if (data != NULL)
- *data = k->pdata;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- return bkt->key_idx[i] - 1;
- }
- }
+ ret = search_one_bucket(h, key, alt_hash, data, bkt);
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
+ return ret;
}
+ __hash_rw_reader_unlock(h);
return -ENOENT;
}
@@ -784,8 +912,7 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
struct lcore_cache *cached_free_slots;
bkt->sig_current[i] = NULL_SIGNATURE;
- bkt->sig_alt[i] = NULL_SIGNATURE;
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Cache full, need to free it. */
@@ -806,19 +933,14 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
}
}
+/* Search one bucket and remove the matched key */
static inline int32_t
-__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig)
+search_and_remove(const struct rte_hash *h, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig,
+ int32_t *ret_val)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
struct rte_hash_key *k, *keys = h->key_store;
- int32_t ret;
-
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
+ unsigned int i;
/* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
@@ -833,38 +955,45 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
* Return index where key is stored,
* subtracting the first dummy index
*/
- ret = bkt->key_idx[i] - 1;
+ *ret_val = bkt->key_idx[i] - 1;
bkt->key_idx[i] = EMPTY_SLOT;
- return ret;
+ return 0;
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig)
+{
+ int32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int32_t ret_val;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+ __hash_rw_writer_lock(h);
+ /* look for key in primary bucket */
+ if (!search_and_remove(h, key, bkt, sig, &ret_val)) {
+ __hash_rw_writer_unlock(h);
+ return ret_val;
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
- /* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->key_idx[i] != EMPTY_SLOT) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- remove_entry(h, bkt, i);
-
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = bkt->key_idx[i] - 1;
- bkt->key_idx[i] = EMPTY_SLOT;
- return ret;
- }
- }
+ /* look for key in secondary bucket */
+ if (!search_and_remove(h, key, bkt, alt_hash, &ret_val)) {
+ __hash_rw_writer_unlock(h);
+ return ret_val;
}
-
+ __hash_rw_writer_unlock(h);
return -ENOENT;
}
@@ -1034,6 +1163,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
}
}
+ __hash_rw_reader_lock(h);
+
/* Compare keys, first hits in primary first */
for (i = 0; i < num_keys; i++) {
positions[i] = -ENOENT;
@@ -1088,6 +1219,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
continue;
}
+ __hash_rw_reader_unlock(h);
+
if (hit_mask != NULL)
*hit_mask = hits;
}
@@ -1146,7 +1279,7 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
bucket_idx = *next / RTE_HASH_BUCKET_ENTRIES;
idx = *next % RTE_HASH_BUCKET_ENTRIES;
}
-
+ __hash_rw_reader_lock(h);
/* Get position of entry in key table */
position = h->buckets[bucket_idx].key_idx[idx];
next_key = (struct rte_hash_key *) ((char *)h->key_store +
@@ -1155,6 +1288,8 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
*key = next_key->key;
*data = next_key->pdata;
+ __hash_rw_reader_unlock(h);
+
/* Increment iterator */
(*next)++;
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index 7a54e55..40b6be0 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -88,11 +88,6 @@ const rte_hash_cmp_eq_t cmp_jump_table[NUM_KEY_CMP_CASES] = {
#endif
-enum add_key_case {
- ADD_KEY_SINGLEWRITER = 0,
- ADD_KEY_MULTIWRITER,
- ADD_KEY_MULTIWRITER_TM,
-};
/** Number of items per bucket. */
#define RTE_HASH_BUCKET_ENTRIES 8
@@ -155,18 +150,19 @@ struct rte_hash {
struct rte_ring *free_slots;
/**< Ring that stores all indexes of the free slots in the key table */
- uint8_t hw_trans_mem_support;
- /**< Hardware transactional memory support */
+
struct lcore_cache *local_free_slots;
/**< Local cache per lcore, storing some indexes of the free slots */
- enum add_key_case add_key; /**< Multi-writer hash add behavior */
-
- rte_spinlock_t *multiwriter_lock; /**< Multi-writer spinlock for w/o TM */
/* Fields used in lookup */
uint32_t key_len __rte_cache_aligned;
/**< Length of hash key. */
+ uint8_t hw_trans_mem_support;
+ uint8_t multi_writer_support;
+ uint8_t readwrite_concur_support;
+ /**< Flags for TM, multi-writer and read-write concurrency support */
+ rte_rwlock_t *readwrite_lock; /**< Read-write lock protecting the table */
rte_hash_function hash_func; /**< Function used to calculate hash. */
uint32_t hash_func_init_val; /**< Init value used by hash_func. */
rte_hash_cmp_eq_t rte_hash_custom_cmp_eq;
diff --git a/lib/librte_hash/rte_cuckoo_hash_x86.h b/lib/librte_hash/rte_cuckoo_hash_x86.h
deleted file mode 100644
index 2c5b017..0000000
--- a/lib/librte_hash/rte_cuckoo_hash_x86.h
+++ /dev/null
@@ -1,164 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2016 Intel Corporation
- */
-
-/* rte_cuckoo_hash_x86.h
- * This file holds all x86 specific Cuckoo Hash functions
- */
-
-/* Only tries to insert at one bucket (@prim_bkt) without trying to push
- * buckets around
- */
-static inline unsigned
-rte_hash_cuckoo_insert_mw_tm(struct rte_hash_bucket *prim_bkt,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned i, status;
- unsigned try = 0;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- /* Insert new entry if there is room in the primary
- * bucket.
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
- rte_xend();
-
- if (i != RTE_HASH_BUCKET_ENTRIES)
- return 0;
-
- break; /* break off try loop if transaction commits */
- } else {
- /* If we abort we give up this cuckoo path. */
- try++;
- rte_pause();
- }
- }
-
- return -1;
-}
-
-/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
- * the path head with new entry (sig, alt_hash, new_idx)
- */
-static inline int
-rte_hash_cuckoo_move_insert_mw_tm(const struct rte_hash *h,
- struct queue_node *leaf, uint32_t leaf_slot,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned try = 0;
- unsigned status;
- uint32_t prev_alt_bkt_idx;
-
- struct queue_node *prev_node, *curr_node = leaf;
- struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
- uint32_t prev_slot, curr_slot = leaf_slot;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- while (likely(curr_node->prev != NULL)) {
- prev_node = curr_node->prev;
- prev_bkt = prev_node->bkt;
- prev_slot = curr_node->prev_slot;
-
- prev_alt_bkt_idx
- = prev_bkt->sig_alt[prev_slot]
- & h->bucket_bitmask;
-
- if (unlikely(&h->buckets[prev_alt_bkt_idx]
- != curr_bkt)) {
- rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
- }
-
- /* Need to swap current/alt sig to allow later
- * Cuckoo insert to move elements back to its
- * primary bucket if available
- */
- curr_bkt->sig_alt[curr_slot] =
- prev_bkt->sig_current[prev_slot];
- curr_bkt->sig_current[curr_slot] =
- prev_bkt->sig_alt[prev_slot];
- curr_bkt->key_idx[curr_slot]
- = prev_bkt->key_idx[prev_slot];
-
- curr_slot = prev_slot;
- curr_node = prev_node;
- curr_bkt = curr_node->bkt;
- }
-
- curr_bkt->sig_current[curr_slot] = sig;
- curr_bkt->sig_alt[curr_slot] = alt_hash;
- curr_bkt->key_idx[curr_slot] = new_idx;
-
- rte_xend();
-
- return 0;
- }
-
- /* If we abort we give up this cuckoo path, since most likely it's
- * no longer valid as TSX detected data conflict
- */
- try++;
- rte_pause();
- }
-
- return -1;
-}
-
-/*
- * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
- * Cuckoo
- */
-static inline int
-rte_hash_cuckoo_make_space_mw_tm(const struct rte_hash *h,
- struct rte_hash_bucket *bkt,
- hash_sig_t sig, hash_sig_t alt_hash,
- uint32_t new_idx)
-{
- unsigned i;
- struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
- struct queue_node *tail, *head;
- struct rte_hash_bucket *curr_bkt, *alt_bkt;
-
- tail = queue;
- head = queue + 1;
- tail->bkt = bkt;
- tail->prev = NULL;
- tail->prev_slot = -1;
-
- /* Cuckoo bfs Search */
- while (likely(tail != head && head <
- queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
- RTE_HASH_BUCKET_ENTRIES)) {
- curr_bkt = tail->bkt;
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
- if (likely(rte_hash_cuckoo_move_insert_mw_tm(h,
- tail, i, sig,
- alt_hash, new_idx) == 0))
- return 0;
- }
-
- /* Enqueue new node and keep prev node info */
- alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
- & h->bucket_bitmask]);
- head->bkt = alt_bkt;
- head->prev = tail;
- head->prev_slot = i;
- head++;
- }
- tail++;
- }
-
- return -ENOSPC;
-}
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index f71ca9f..ecb49e4 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -34,6 +34,9 @@ extern "C" {
/** Default behavior of insertion, single writer/multi writer */
#define RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD 0x02
+/** Flag to support reader writer concurrency */
+#define RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY 0x04
+
/** Signature of key that is stored internally. */
typedef uint32_t hash_sig_t;
--
2.7.4
* [dpdk-dev] [PATCH v1 2/3] test: add test case for read write concurrency
2018-06-08 10:51 [dpdk-dev] [PATCH v1 0/3] Add read-write concurrency to rte_hash library Yipeng Wang
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 1/3] hash: add read and write concurrency support Yipeng Wang
@ 2018-06-08 10:51 ` Yipeng Wang
2018-06-26 15:48 ` De Lara Guarch, Pablo
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 3/3] hash: add new API function to query the key count Yipeng Wang
` (4 subsequent siblings)
6 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-06-08 10:51 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, john.mcnamara, bruce.richardson,
honnappa.nagarahalli, vguvva, brijesh.s.singh
This commit adds a new test case for read-write concurrency, covering
both functional and performance testing.
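The new case registers itself as hash_readwrite_autotest (see the
REGISTER_TEST_COMMAND call at the end of the new file), so it can be run
from the DPDK test application in the same way as the existing hash
autotests.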
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
test/test/Makefile | 1 +
test/test/test_hash_perf.c | 36 ++-
test/test/test_hash_readwrite.c | 649 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 675 insertions(+), 11 deletions(-)
create mode 100644 test/test/test_hash_readwrite.c
diff --git a/test/test/Makefile b/test/test/Makefile
index eccc8ef..6ce66c9 100644
--- a/test/test/Makefile
+++ b/test/test/Makefile
@@ -113,6 +113,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
diff --git a/test/test/test_hash_perf.c b/test/test/test_hash_perf.c
index a81d0c7..33dcb9f 100644
--- a/test/test/test_hash_perf.c
+++ b/test/test/test_hash_perf.c
@@ -76,7 +76,8 @@ static struct rte_hash_parameters ut_params = {
};
static int
-create_table(unsigned with_data, unsigned table_index)
+create_table(unsigned int with_data, unsigned int table_index,
+ unsigned int with_locks)
{
char name[RTE_HASH_NAMESIZE];
@@ -86,6 +87,14 @@ create_table(unsigned with_data, unsigned table_index)
else
sprintf(name, "test_hash%d", hashtest_key_lens[table_index]);
+
+ if (with_locks)
+ ut_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT
+ | RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ ut_params.extra_flag = 0;
+
ut_params.name = name;
ut_params.key_len = hashtest_key_lens[table_index];
ut_params.socket_id = rte_socket_id();
@@ -459,7 +468,7 @@ reset_table(unsigned table_index)
}
static int
-run_all_tbl_perf_tests(unsigned with_pushes)
+run_all_tbl_perf_tests(unsigned int with_pushes, unsigned int with_locks)
{
unsigned i, j, with_data, with_hash;
@@ -468,7 +477,7 @@ run_all_tbl_perf_tests(unsigned with_pushes)
for (with_data = 0; with_data <= 1; with_data++) {
for (i = 0; i < NUM_KEYSIZES; i++) {
- if (create_table(with_data, i) < 0)
+ if (create_table(with_data, i, with_locks) < 0)
return -1;
if (get_input_keys(with_pushes, i) < 0)
@@ -611,15 +620,20 @@ fbk_hash_perf_test(void)
static int
test_hash_perf(void)
{
- unsigned with_pushes;
-
- for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
- if (with_pushes == 0)
- printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ unsigned int with_pushes, with_locks;
+ for (with_locks = 0; with_locks <= 1; with_locks++) {
+ if (with_locks)
+ printf("\nWith locks in the code\n");
else
- printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
- if (run_all_tbl_perf_tests(with_pushes) < 0)
- return -1;
+ printf("\nWithout locks in the code\n");
+ for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
+ if (with_pushes == 0)
+ printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ else
+ printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
+ if (run_all_tbl_perf_tests(with_pushes, with_locks) < 0)
+ return -1;
+ }
}
if (fbk_hash_perf_test() < 0)
return -1;
diff --git a/test/test/test_hash_readwrite.c b/test/test/test_hash_readwrite.c
new file mode 100644
index 0000000..ef3bbe5
--- /dev/null
+++ b/test/test/test_hash_readwrite.c
@@ -0,0 +1,649 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <locale.h>
+
+#include <rte_cycles.h>
+#include <rte_hash.h>
+#include <rte_hash_crc.h>
+#include <rte_launch.h>
+#include <rte_malloc.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+
+#include "test.h"
+
+
+#define RTE_APP_TEST_HASH_MULTIWRITER_FAILED 0
+
+#define TOTAL_ENTRY (16*1024*1024)
+#define TOTAL_INSERT (15*1024*1024)
+
+#define NUM_TEST 3
+
+
+unsigned int core_cnt[NUM_TEST] = {2, 4, 8};
+
+struct perf {
+ uint32_t single_read;
+ uint32_t single_write;
+ uint32_t read_only[NUM_TEST];
+ uint32_t write_only[NUM_TEST];
+ uint32_t read_write_r[NUM_TEST];
+ uint32_t read_write_w[NUM_TEST];
+};
+
+static struct perf htm_results, non_htm_results;
+
+struct {
+ uint32_t *keys;
+ uint32_t *found;
+ uint32_t nb_tsx_insertion;
+ uint32_t rounded_nb_total_tsx_insertion;
+ struct rte_hash *h;
+} tbl_multiwriter_test_params;
+
+static rte_atomic64_t gcycles;
+static rte_atomic64_t ginsertions;
+
+static rte_atomic64_t gread_cycles;
+static rte_atomic64_t gwrite_cycles;
+
+static rte_atomic64_t greads;
+static rte_atomic64_t gwrites;
+
+static int use_htm;
+
+static int reader_faster;
+
+static int
+test_hash_readwrite_worker(__attribute__((unused)) void *arg)
+{
+ uint64_t i, offset;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+
+ offset = (lcore_id - rte_get_master_lcore())
+ * tbl_multiwriter_test_params.nb_tsx_insertion;
+
+ printf("Core #%d inserting and reading %d: %'"PRId64" - %'"PRId64"\n",
+ lcore_id, tbl_multiwriter_test_params.nb_tsx_insertion,
+ offset, offset + tbl_multiwriter_test_params.nb_tsx_insertion);
+
+ begin = rte_rdtsc_precise();
+
+
+ for (i = offset;
+ i < offset + tbl_multiwriter_test_params.nb_tsx_insertion;
+ i++) {
+
+ if (rte_hash_lookup(tbl_multiwriter_test_params.h,
+ tbl_multiwriter_test_params.keys + i) > 0)
+ break;
+
+ ret = rte_hash_add_key(tbl_multiwriter_test_params.h,
+ tbl_multiwriter_test_params.keys + i);
+ if (ret < 0)
+ break;
+
+ if (rte_hash_lookup(tbl_multiwriter_test_params.h,
+ tbl_multiwriter_test_params.keys + i) != ret)
+ break;
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gcycles, cycles);
+ rte_atomic64_add(&ginsertions, i - offset);
+
+ for (; i < offset + tbl_multiwriter_test_params.nb_tsx_insertion; i++)
+ tbl_multiwriter_test_params.keys[i]
+ = RTE_APP_TEST_HASH_MULTIWRITER_FAILED;
+
+
+ return 0;
+}
+
+
+static int
+init_params(void)
+{
+ unsigned int i;
+
+ uint32_t *keys = NULL;
+ uint32_t *found = NULL;
+ struct rte_hash *handle;
+ char name[RTE_HASH_NAMESIZE];
+
+
+ struct rte_hash_parameters hash_params = {
+ .entries = TOTAL_ENTRY,
+ .key_len = sizeof(uint32_t),
+ .hash_func = rte_hash_crc,
+ .hash_func_init_val = 0,
+ .socket_id = rte_socket_id(),
+ };
+ if (use_htm)
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT
+ | RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+
+ snprintf(name, 32, "tests");
+ hash_params.name = name;
+
+ handle = rte_hash_create(&hash_params);
+ if (handle == NULL) {
+ printf("hash creation failed");
+ return -1;
+ }
+
+ tbl_multiwriter_test_params.h = handle;
+ keys = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+
+ if (keys == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err1;
+ }
+
+ found = rte_zmalloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+ if (found == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ goto err2;
+ }
+
+
+ tbl_multiwriter_test_params.keys = keys;
+ tbl_multiwriter_test_params.found = found;
+
+ for (i = 0; i < TOTAL_ENTRY; i++)
+ keys[i] = i;
+
+ return 0;
+
+err2:
+ rte_free(keys);
+err1:
+ rte_hash_free(handle);
+
+ return -1;
+}
+
+static int
+test_hash_readwrite_functional(void)
+{
+ unsigned int i;
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+
+ rte_atomic64_init(&gcycles);
+ rte_atomic64_clear(&gcycles);
+
+ rte_atomic64_init(&ginsertions);
+ rte_atomic64_clear(&ginsertions);
+
+ if (init_params() != 0)
+ goto err;
+
+ tbl_multiwriter_test_params.nb_tsx_insertion =
+ TOTAL_INSERT / rte_lcore_count();
+
+ tbl_multiwriter_test_params.rounded_nb_total_tsx_insertion =
+ tbl_multiwriter_test_params.nb_tsx_insertion
+ * rte_lcore_count();
+
+ printf("++++++++Start function tests:+++++++++\n");
+
+ /* Fire all threads. */
+ rte_eal_mp_remote_launch(test_hash_readwrite_worker,
+ NULL, CALL_MASTER);
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_multiwriter_test_params.h, &next_key,
+ &next_data, &iter) >= 0) {
+ /* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_multiwriter_test_params.found[i]++;
+ }
+
+ for (i = 0;
+ i < tbl_multiwriter_test_params.rounded_nb_total_tsx_insertion;
+ i++) {
+ if (tbl_multiwriter_test_params.keys[i]
+ != RTE_APP_TEST_HASH_MULTIWRITER_FAILED) {
+ if (tbl_multiwriter_test_params.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_multiwriter_test_params.found[i] == 0) {
+ lost_keys++;
+ printf("key %d is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during multiwriter insertion.\n");
+
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gcycles)/
+ rte_atomic64_read(&ginsertions);
+
+ printf(" cycles per insertion and lookup: %llu\n",
+ cycles_per_insertion);
+
+ rte_free(tbl_multiwriter_test_params.found);
+ rte_free(tbl_multiwriter_test_params.keys);
+ rte_hash_free(tbl_multiwriter_test_params.h);
+ printf("+++++++++Complete function tests+++++++++\n");
+ return 0;
+
+err_free:
+ rte_free(tbl_multiwriter_test_params.found);
+ rte_free(tbl_multiwriter_test_params.keys);
+ rte_hash_free(tbl_multiwriter_test_params.h);
+err:
+ return -1;
+}
+
+
+static int
+test_rw_reader(__attribute__((unused)) void *arg)
+{
+ unsigned int i;
+ uint64_t begin, cycles;
+ uint64_t read_cnt = (uint64_t)((uintptr_t)arg);
+
+ begin = rte_rdtsc_precise();
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_multiwriter_test_params.h,
+ tbl_multiwriter_test_params.keys + i,
+ &data);
+ if (i != (uint64_t)data) {
+ printf("lookup find wrong value %d, %ld\n", i,
+ (uint64_t)data);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gread_cycles, cycles);
+ rte_atomic64_add(&greads, i);
+ return 0;
+
+}
+
+static int
+test_rw_writer(__attribute__((unused)) void *arg)
+{
+ unsigned int i;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+ uint64_t start_coreid = (uint64_t)arg;
+ uint64_t offset;
+
+ offset = TOTAL_INSERT / 2 + (lcore_id - start_coreid)
+ * tbl_multiwriter_test_params.nb_tsx_insertion;
+ begin = rte_rdtsc_precise();
+ for (i = offset; i < offset +
+ tbl_multiwriter_test_params.nb_tsx_insertion;
+ i++) {
+ ret = rte_hash_add_key_data(tbl_multiwriter_test_params.h,
+ tbl_multiwriter_test_params.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("writer failed %d\n", i);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gwrite_cycles, cycles);
+ rte_atomic64_add(&gwrites,
+ tbl_multiwriter_test_params.nb_tsx_insertion);
+ return 0;
+}
+
+
+static int
+test_hash_readwrite_perf(struct perf *perf_results)
+{
+ unsigned int i, n;
+ int ret;
+ int start_coreid;
+ uint64_t read_cnt;
+
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+
+ uint64_t start = 0, end = 0;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gwrites);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&greads);
+
+ rte_atomic64_init(&gread_cycles);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_init(&gwrite_cycles);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ if (init_params() != 0)
+ goto err;
+
+ if (reader_faster) {
+ printf("++++++Start perf test: reader++++++++\n");
+ read_cnt = TOTAL_INSERT / 10;
+ } else {
+ printf("++++++Start perf test: writer++++++++\n");
+ read_cnt = TOTAL_INSERT / 2;
+ }
+
+
+ /* We first test single thread performance */
+ start = rte_rdtsc_precise();
+ /* Insert half of the keys */
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_multiwriter_test_params.h,
+ tbl_multiwriter_test_params.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_write = end / i;
+
+ start = rte_rdtsc_precise();
+
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_multiwriter_test_params.h,
+ tbl_multiwriter_test_params.keys + i,
+ &data);
+ if (i != (uint64_t)data) {
+ printf("lookup find wrong value %d, %ld\n", i,
+ (uint64_t)data);
+ break;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_read = end / i;
+
+
+
+ for (n = 0; n < NUM_TEST; n++) {
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_multiwriter_test_params.h);
+
+ tbl_multiwriter_test_params.nb_tsx_insertion = TOTAL_INSERT /
+ 2 / core_cnt[n];
+ tbl_multiwriter_test_params.rounded_nb_total_tsx_insertion =
+ TOTAL_INSERT / 2 +
+ tbl_multiwriter_test_params.nb_tsx_insertion *
+ core_cnt[n];
+
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(
+ tbl_multiwriter_test_params.h,
+ tbl_multiwriter_test_params.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+ /* Then test multiple thread case but only all reads or
+ * all writes
+ */
+
+ /* Test only reader cases */
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)read_cnt, i);
+
+ rte_eal_mp_wait_lcore();
+
+ start_coreid = i;
+ /* Test only writer cases */
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+
+
+ rte_eal_mp_wait_lcore();
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles)/
+ rte_atomic64_read(&greads);
+ perf_results->read_only[n] = cycles_per_insertion;
+ printf("Reader only: cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles)/
+ rte_atomic64_read(&gwrites);
+ perf_results->write_only[n] = cycles_per_insertion;
+ printf("Writer only: cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_multiwriter_test_params.h);
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(
+ tbl_multiwriter_test_params.h,
+ tbl_multiwriter_test_params.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+
+ start_coreid = core_cnt[n] + 1;
+
+ if (reader_faster) {
+ for (i = core_cnt[n] + 1; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)read_cnt, i);
+ } else {
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)read_cnt, i);
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ }
+
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_multiwriter_test_params.h,
+ &next_key, &next_data, &iter) >= 0) {
+ /* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_multiwriter_test_params.found[i]++;
+ }
+
+
+ for (i = 0; i <
+ tbl_multiwriter_test_params.rounded_nb_total_tsx_insertion;
+ i++) {
+ if (tbl_multiwriter_test_params.keys[i]
+ != RTE_APP_TEST_HASH_MULTIWRITER_FAILED) {
+ if (tbl_multiwriter_test_params.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_multiwriter_test_params.found[i] == 0) {
+ lost_keys++;
+ printf("key %d is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during multiwriter insertion.\n");
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ perf_results->read_write_r[n] = cycles_per_insertion;
+ printf("Read-write cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles) /
+ rte_atomic64_read(&gwrites);
+ perf_results->read_write_w[n] = cycles_per_insertion;
+ printf("Read-write cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+ }
+
+ rte_free(tbl_multiwriter_test_params.found);
+ rte_free(tbl_multiwriter_test_params.keys);
+ rte_hash_free(tbl_multiwriter_test_params.h);
+ return 0;
+
+err_free:
+ rte_free(tbl_multiwriter_test_params.found);
+ rte_free(tbl_multiwriter_test_params.keys);
+ rte_hash_free(tbl_multiwriter_test_params.h);
+
+err:
+ return -1;
+}
+
+
+static int
+test_hash_readwrite_main(void)
+{
+ if (rte_lcore_count() == 1) {
+ printf("More than one lcore is required "
+ "to do read write test\n");
+ return 0;
+ }
+
+
+ setlocale(LC_NUMERIC, "");
+
+ if (rte_tm_supported()) {
+ printf("Hardware transactional memory (lock elision) "
+ "is supported\n");
+
+ printf("Test multi-writer with Hardware "
+ "transactional memory\n");
+
+ use_htm = 1;
+ if (test_hash_readwrite_functional() < 0)
+ return -1;
+
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&htm_results) < 0)
+ return -1;
+
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&htm_results) < 0)
+ return -1;
+ } else {
+ printf("Hardware transactional memory (lock elision) "
+ "is NOT supported\n");
+ }
+
+ printf("Test multi-writer without Hardware transactional memory\n");
+ use_htm = 0;
+ if (test_hash_readwrite_functional() < 0)
+ return -1;
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&non_htm_results) < 0)
+ return -1;
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&non_htm_results) < 0)
+ return -1;
+
+
+ printf("Results summary:\n");
+
+ int i;
+
+ printf("single read: %u\n", htm_results.single_read);
+ printf("single write: %u\n", htm_results.single_write);
+ for (i = 0; i < NUM_TEST; i++) {
+ printf("core_cnt: %u\n", core_cnt[i]);
+ printf("HTM:\n");
+ printf("read only: %u\n", htm_results.read_only[i]);
+ printf("write only: %u\n", htm_results.write_only[i]);
+ printf("read-write read: %u\n", htm_results.read_write_r[i]);
+ printf("read-write write: %u\n", htm_results.read_write_w[i]);
+
+ printf("non HTM:\n");
+ printf("read only: %u\n", non_htm_results.read_only[i]);
+ printf("write only: %u\n", non_htm_results.write_only[i]);
+ printf("read-write read: %u\n",
+ non_htm_results.read_write_r[i]);
+ printf("read-write write: %u\n",
+ non_htm_results.read_write_w[i]);
+ }
+
+ return 0;
+}
+
+REGISTER_TEST_COMMAND(hash_readwrite_autotest, test_hash_readwrite_main);
--
2.7.4
* [dpdk-dev] [PATCH v1 3/3] hash: add new API function to query the key count
2018-06-08 10:51 [dpdk-dev] [PATCH v1 0/3] Add read-write concurrency to rte_hash library Yipeng Wang
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 1/3] hash: add read and write concurrency support Yipeng Wang
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 2/3] test: add test case for read write concurrency Yipeng Wang
@ 2018-06-08 10:51 ` Yipeng Wang
2018-06-26 16:11 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library Yipeng Wang
` (3 subsequent siblings)
6 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-06-08 10:51 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, john.mcnamara, bruce.richardson,
honnappa.nagarahalli, vguvva, brijesh.s.singh
Add a new function, rte_hash_count, to return the number of keys that
are currently stored in the hash table. Corresponding test functions are
added to the hash_test and hash_multiwriter tests.
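As a usage illustration (not part of the patch), a minimal sketch of
querying occupancy with the new API, assuming a table handle created
elsewhere:

#include <stdio.h>
#include <stdint.h>
#include <rte_hash.h>

/* Sketch only: report how many keys a table currently holds.
 * rte_hash_count() returns -EINVAL for an invalid handle.
 */
static void
print_occupancy(struct rte_hash *h)
{
        int32_t cnt = rte_hash_count(h);

        if (cnt < 0)
                printf("invalid hash table\n");
        else
                printf("%d keys currently stored\n", (int)cnt);
}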
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 39 +++++++++++++++++++++++++++++++-----
lib/librte_hash/rte_hash.h | 11 ++++++++++
lib/librte_hash/rte_hash_version.map | 8 ++++++++
test/test/test_hash.c | 12 +++++++++++
test/test/test_hash_multiwriter.c | 9 +++++++++
5 files changed, 74 insertions(+), 5 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index a5bb4d4..3dff871 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -133,13 +133,13 @@ rte_hash_create(const struct rte_hash_parameters *params)
* except for the first cache
*/
num_key_slots = params->entries + (RTE_MAX_LCORE - 1) *
- LCORE_CACHE_SIZE + 1;
+ (LCORE_CACHE_SIZE - 1) + 1;
else
num_key_slots = params->entries + 1;
snprintf(ring_name, sizeof(ring_name), "HT_%s", params->name);
/* Create ring (Dummy slot index is not enqueued) */
- r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots - 1),
+ r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots),
params->socket_id, 0);
if (r == NULL) {
RTE_LOG(ERR, HASH, "memory allocation failed\n");
@@ -290,7 +290,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
}
/* Populate free slots ring. Entry zero is reserved for key misses. */
- for (i = 1; i < params->entries + 1; i++)
+ for (i = 1; i < num_key_slots; i++)
rte_ring_sp_enqueue(r, (void *)((uintptr_t) i));
te->data = (void *) h;
@@ -371,7 +371,7 @@ void
rte_hash_reset(struct rte_hash *h)
{
void *ptr;
- unsigned i;
+ uint32_t tot_ring_cnt, i;
if (h == NULL)
return;
@@ -384,7 +384,13 @@ rte_hash_reset(struct rte_hash *h)
rte_pause();
/* Repopulate the free slots ring. Entry zero is reserved for key misses */
- for (i = 1; i < h->entries + 1; i++)
+ if (h->multi_writer_support)
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ else
+ tot_ring_cnt = h->entries;
+
+ for (i = 1; i < tot_ring_cnt + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
if (h->multi_writer_support) {
@@ -394,6 +400,29 @@ rte_hash_reset(struct rte_hash *h)
}
}
+int32_t
+rte_hash_count(struct rte_hash *h)
+{
+ uint32_t tot_ring_cnt, cached_cnt = 0;
+ uint32_t i, ret;
+
+ if (h == NULL || h->free_slots == NULL)
+ return -EINVAL;
+
+ if (h->multi_writer_support) {
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ for (i = 0; i < RTE_MAX_LCORE; i++)
+ cached_cnt += h->local_free_slots[i].len;
+
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots) -
+ cached_cnt;
+ } else {
+ tot_ring_cnt = h->entries;
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots);
+ }
+ return ret;
+}
static inline void
__hash_rw_writer_lock(const struct rte_hash *h)
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index ecb49e4..b8f1f31 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -127,6 +127,17 @@ void
rte_hash_reset(struct rte_hash *h);
/**
+ * Return the number of keys in the hash table
+ * @param h
+ * Hash table to query from
+ * @return
+ * - -EINVAL if parameters are invalid
+ * - A value indicating how many keys inserted in the table.
+ */
+int32_t
+rte_hash_count(struct rte_hash *h);
+
+/**
* Add a key-value pair to an existing hash table.
* This operation is not multi-thread safe
* and should only be called from one thread.
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index 52a2576..e216ac8 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -45,3 +45,11 @@ DPDK_16.07 {
rte_hash_get_key_with_position;
} DPDK_2.2;
+
+
+DPDK_18.08 {
+ global:
+
+ rte_hash_count;
+
+} DPDK_16.07;
diff --git a/test/test/test_hash.c b/test/test/test_hash.c
index edf41f5..b3db9fd 100644
--- a/test/test/test_hash.c
+++ b/test/test/test_hash.c
@@ -1103,6 +1103,7 @@ static int test_average_table_utilization(void)
unsigned i, j;
unsigned added_keys, average_keys_added = 0;
int ret;
+ unsigned int cnt;
printf("\n# Running test to determine average utilization"
"\n before adding elements begins to fail\n");
@@ -1121,13 +1122,24 @@ static int test_average_table_utilization(void)
for (i = 0; i < ut_params.key_len; i++)
simple_key[i] = rte_rand() % 255;
ret = rte_hash_add_key(handle, simple_key);
+ if (ret < 0)
+ break;
}
+
if (ret != -ENOSPC) {
printf("Unexpected error when adding keys\n");
rte_hash_free(handle);
return -1;
}
+ cnt = rte_hash_count(handle);
+ if (cnt != added_keys) {
+ printf("rte_hash_count returned wrong value %u, %u,"
+ "%u\n", j, added_keys, cnt);
+ rte_hash_free(handle);
+ return -1;
+ }
+
average_keys_added += added_keys;
/* Reset the table */
diff --git a/test/test/test_hash_multiwriter.c b/test/test/test_hash_multiwriter.c
index ef5fce3..ae3ce3b 100644
--- a/test/test/test_hash_multiwriter.c
+++ b/test/test/test_hash_multiwriter.c
@@ -116,6 +116,7 @@ test_hash_multiwriter(void)
uint32_t duplicated_keys = 0;
uint32_t lost_keys = 0;
+ uint32_t count;
snprintf(name, 32, "test%u", calledCount++);
hash_params.name = name;
@@ -163,6 +164,14 @@ test_hash_multiwriter(void)
NULL, CALL_MASTER);
rte_eal_mp_wait_lcore();
+
+ count = rte_hash_count(handle);
+ if (count != rounded_nb_total_tsx_insertion) {
+ printf("rte_hash_count returned wrong value %u, %d\n",
+ rounded_nb_total_tsx_insertion, count);
+ goto err3;
+ }
+
while (rte_hash_iterate(handle, &next_key, &next_data, &iter) >= 0) {
/* Search for the key in the list of keys added .*/
i = *(const uint32_t *)next_key;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v1 1/3] hash: add read and write concurrency support
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 1/3] hash: add read and write concurrency support Yipeng Wang
@ 2018-06-26 14:59 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-06-26 14:59 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Wang, Yipeng1, Mcnamara, John, Richardson, Bruce,
honnappa.nagarahalli, vguvva, brijesh.s.singh
Hi Yipeng,
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Yipeng Wang
> Sent: Friday, June 8, 2018 11:51 AM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Mcnamara,
> John <john.mcnamara@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [dpdk-dev] [PATCH v1 1/3] hash: add read and write concurrency
> support
>
> The existing implementation of librte_hash does not support read-write
> concurrency. This commit implements read-write safety using rte_rwlock and
> rte_rwlock TM version if hardware transactional memory is available.
>
> Both multi-writer and read-write concurrency is protected by rte_rwlock now.
> The x86 specific header file is removed since the x86 specific RTM function is not
> called directly by rte hash now.
Sorry for the late review.
Could you split this patch to make it easier to review?
I suggest first refactoring the code, adding and using the helpers search_one_bucket, search_and_update
and search_and_remove, and then adding the rest of the patch.
Since you are removing rte_cuckoo_hash_x86.h, you need to remove it from meson.build file too.
More comments below.
Thanks,
Pablo
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> ---
> lib/librte_hash/rte_cuckoo_hash.c | 627 +++++++++++++++++++++-------------
> lib/librte_hash/rte_cuckoo_hash.h | 16 +-
> lib/librte_hash/rte_cuckoo_hash_x86.h | 164 ---------
> lib/librte_hash/rte_hash.h | 3 +
> 4 files changed, 390 insertions(+), 420 deletions(-) delete mode 100644
> lib/librte_hash/rte_cuckoo_hash_x86.h
>
> diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> b/lib/librte_hash/rte_cuckoo_hash.c
> index a07543a..a5bb4d4 100644
> --- a/lib/librte_hash/rte_cuckoo_hash.c
> +++ b/lib/librte_hash/rte_cuckoo_hash.c
...
> + if (h->multi_writer_support) {
> + h->readwrite_lock = rte_malloc(NULL, sizeof(rte_rwlock_t),
> LCORE_CACHE_SIZE);
Check if malloc was successful, if not go to err_unlock.
> - rte_spinlock_init(h->multiwriter_lock);
> - }
> - } else
> - h->add_key = ADD_KEY_SINGLEWRITER;
> + rte_rwlock_init(h->readwrite_lock);
> + }
>
...
> +static inline int32_t
> +__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> + hash_sig_t sig)
> +{
> + int32_t bucket_idx;
Shouldn't this be "uint32_t"?
> + hash_sig_t alt_hash;
> + struct rte_hash_bucket *bkt;
> + int32_t ret_val;
> +
> + bucket_idx = sig & h->bucket_bitmask;
...
>
> diff --git a/lib/librte_hash/rte_cuckoo_hash.h
> b/lib/librte_hash/rte_cuckoo_hash.h
> index 7a54e55..40b6be0 100644
> --- a/lib/librte_hash/rte_cuckoo_hash.h
> +++ b/lib/librte_hash/rte_cuckoo_hash.h
...
>
> uint32_t key_len __rte_cache_aligned;
> /**< Length of hash key. */
> + uint8_t hw_trans_mem_support;
> + uint8_t multi_writer_support;
Add comments for these two fields.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v1 2/3] test: add test case for read write concurrency
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 2/3] test: add test case for read write concurrency Yipeng Wang
@ 2018-06-26 15:48 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-06-26 15:48 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Mcnamara, John, Richardson, Bruce, honnappa.nagarahalli,
vguvva, brijesh.s.singh
Hi Yipeng,
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, June 8, 2018 11:51 AM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Mcnamara,
> John <john.mcnamara@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v1 2/3] test: add test case for read write concurrency
>
> This commit adds a new test case for testing read/write concurrency.
Could you split this patch into two? One adding lock "support" in the current
performance test code and another one adding the new readwrite tests?
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> ---
> test/test/Makefile | 1 +
> test/test/test_hash_perf.c | 36 ++-
> test/test/test_hash_readwrite.c | 649
...
> +++ b/test/test/test_hash_readwrite.c
...
> +struct {
> + uint32_t *keys;
> + uint32_t *found;
> + uint32_t nb_tsx_insertion;
> + uint32_t rounded_nb_total_tsx_insertion;
> + struct rte_hash *h;
> +} tbl_multiwriter_test_params;
I think " rounded_nb_total_tsx_insertion" and " tbl_multiwriter_test_params"
are too long, and that's why you need to split a long line into two down below.
Could you shorten these names a bit? You can change "multiwriter" to "mw",
and "rounded_nb_total_tsx_insertion" to "total_nb_tsx_inserts".
...
> + snprintf(name, 32, "tests");
> + hash_params.name = name;
You can set hash_params.name = "tests" directly.
> +
> + handle = rte_hash_create(&hash_params);
> + if (handle == NULL) {
> + printf("hash creation failed");
> + return -1;
> + }
...
> +
> +err2:
> + rte_free(keys);
> +err1:
> + rte_hash_free(handle);
I think you can have just one label "err" with these two frees.
If the variables haven't been set, they will be NULL and that's allowed.
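For instance, a sketch of the single-label form:

	err:
		rte_free(keys);		/* rte_free(NULL) is a no-op */
		rte_hash_free(handle);	/* likewise returns early on NULL */
		return -1;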
> +
> + return -1;
> +}
> +
...
> + begin = rte_rdtsc_precise();
> + for (i = 0; i < read_cnt; i++) {
> + void *data;
> + rte_hash_lookup_data(tbl_multiwriter_test_params.h,
> + tbl_multiwriter_test_params.keys + i,
> + &data);
> + if (i != (uint64_t)data) {
I see the following error, among others, when compiling with 32-bit gcc.
test/test/test_hash_readwrite.c:281:12: error:
cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
if (i != (uint64_t)data) {
^
> + printf("lookup find wrong value %d, %ld\n", i,
> + (uint64_t)data);
...
> +
> + if (reader_faster) {
> + unsigned long long int cycles_per_insertion =
> + rte_atomic64_read(&gread_cycles)/
> + rte_atomic64_read(&greads);
> + perf_results->read_only[n] = cycles_per_insertion;
> + printf("Reader only: cycles per lookup: %llu\n",
> + cycles_per_insertion);
> + }
> +
> + else {
} else {
...
> + use_htm = 1;
> + if (test_hash_readwrite_functional() < 0)
> + return -1;
> +
> + reader_faster = 1;
Maybe a comment about this reader_faster would be good.
Also, can we declare this variable and use_htm as locals and pass them to the functions,
instead of declaring them as global?
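A possible signature sketch (the struct/type names here are placeholders):

	static int
	test_hash_readwrite_perf(struct perf *perf_results, int use_htm,
					int reader_faster);

so that call sites pass the two modes explicitly, e.g.
test_hash_readwrite_perf(&htm_results, 1, 1).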
> + if (test_hash_readwrite_perf(&htm_results) < 0)
> + return -1;
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v1 3/3] hash: add new API function to query the key count
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 3/3] hash: add new API function to query the key count Yipeng Wang
@ 2018-06-26 16:11 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-06-26 16:11 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Mcnamara, John, Richardson, Bruce, honnappa.nagarahalli,
vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, June 8, 2018 11:51 AM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Mcnamara,
> John <john.mcnamara@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v1 3/3] hash: add new API function to query the key count
>
> Add a new function, rte_hash_count, to return the number of keys that are
> currently stored in the hash table. Corresponding test functions are added into
> hash_test and hash_multiwriter test.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> ---
> lib/librte_hash/rte_cuckoo_hash.c | 39
> +++++++++++++++++++++++++++++++-----
> lib/librte_hash/rte_hash.h | 11 ++++++++++
> lib/librte_hash/rte_hash_version.map | 8 ++++++++
> test/test/test_hash.c | 12 +++++++++++
> test/test/test_hash_multiwriter.c | 9 +++++++++
> 5 files changed, 74 insertions(+), 5 deletions(-)
>
> diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> b/lib/librte_hash/rte_cuckoo_hash.c
> index a5bb4d4..3dff871 100644
> --- a/lib/librte_hash/rte_cuckoo_hash.c
> +++ b/lib/librte_hash/rte_cuckoo_hash.c
> @@ -133,13 +133,13 @@ rte_hash_create(const struct rte_hash_parameters
> *params)
> * except for the first cache
> */
> num_key_slots = params->entries + (RTE_MAX_LCORE - 1) *
> - LCORE_CACHE_SIZE + 1;
> + (LCORE_CACHE_SIZE - 1) + 1;
This and the other changes made outside the new rte_hash_count API can be done in a different patch.
If this was an issue in the calculation of key slots, it should be marked as a fix, and then
a patch with the new API can follow, with the tests.
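To make the size change concrete: with, say, RTE_MAX_LCORE = 128 and
LCORE_CACHE_SIZE = 64 (both are build-time constants), the old formula
reserves entries + 127 * 64 + 1 key slots, while the corrected one
reserves entries + 127 * 63 + 1, i.e. 127 fewer slots.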
...
>
> +int32_t
> +rte_hash_count(struct rte_hash *h)
> +{
> + uint32_t tot_ring_cnt, cached_cnt = 0;
> + uint32_t i, ret;
> +
> + if (h == NULL || h->free_slots == NULL)
I don't think the check on free_slots is necessary,
since rte_hash_create is already checking that.
...
> --- a/lib/librte_hash/rte_hash.h
> +++ b/lib/librte_hash/rte_hash.h
> @@ -127,6 +127,17 @@ void
> rte_hash_reset(struct rte_hash *h);
>
> /**
> + * Return the number of keys in the hash table
> + * @param h
> + * Hash table to query from
> + * @return
> + * - -EINVAL if parameters are invalid
> + * - A value indicating how many keys inserted in the table.
"how many keys were inserted"
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library
2018-06-08 10:51 [dpdk-dev] [PATCH v1 0/3] Add read-write concurrency to rte_hash library Yipeng Wang
` (2 preceding siblings ...)
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 3/3] hash: add new API function to query the key count Yipeng Wang
@ 2018-06-29 12:24 ` Yipeng Wang
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 1/6] hash: make duplicated code into functions Yipeng Wang
` (5 more replies)
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (2 subsequent siblings)
6 siblings, 6 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-06-29 12:24 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This patch set adds the read-write concurrency support in rte_hash.
A new flag value is added to indicate if read-write concurrency is needed
during creation time. Test cases are implemented to do functional and
performance tests.
The new concurrency model is based on rte_rwlock. When Intel TSX is
available and the users indicate to use it, the TM version of the
rte_rwlock will be called. Both multi-writer and read-write concurrency
are protected by the rte_rwlock instead of the x86 specific RTM
instructions, so the x86 specific header rte_cuckoo_hash_x86.h is removed
and the code is infused into the main .c file.
A new rte_hash_count API is proposed to count how many keys are inserted
into the hash table.
v2->v1:
1. Split each commit into two commits for easier review (Pablo).
2. Add more comments in various places (Pablo).
3. hash: In key insertion function, move duplicated key checking to
earlier location and protect it using locks. Checking duplicated key
should happen first and data updates should be protected.
4. hash: In lookup bulk function, put signature comparison in lock,
since writers could happen between signature match on two buckets.
5. hash: Add write locks to reset function as well to protect resets.
6. test: Fix 32-bit compilation error in read-write test (Pablo).
7. test: Check total physical core count in read-write test. Don't
test with thread counts larger than the physical core count.
8. Other minor fixes such as typos (Pablo).
Yipeng Wang (6):
hash: make duplicated code into functions
hash: add read and write concurrency support
test: add tests in hash table perf test
test: add test case for read write concurrency
hash: fix to have more accurate key slot size
hash: add new API function to query the key count
lib/librte_hash/meson.build | 1 -
lib/librte_hash/rte_cuckoo_hash.c | 695 +++++++++++++++++++++-------------
lib/librte_hash/rte_cuckoo_hash.h | 18 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 164 --------
lib/librte_hash/rte_hash.h | 14 +
lib/librte_hash/rte_hash_version.map | 8 +
test/test/Makefile | 1 +
test/test/test_hash.c | 12 +
test/test/test_hash_multiwriter.c | 9 +
test/test/test_hash_perf.c | 36 +-
test/test/test_hash_readwrite.c | 645 +++++++++++++++++++++++++++++++
11 files changed, 1155 insertions(+), 448 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
create mode 100644 test/test/test_hash_readwrite.c
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v2 1/6] hash: make duplicated code into functions
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library Yipeng Wang
@ 2018-06-29 12:24 ` Yipeng Wang
2018-07-06 10:04 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 2/6] hash: add read and write concurrency support Yipeng Wang
` (4 subsequent siblings)
5 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-06-29 12:24 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit refactors the hash table lookup/add/del code
to remove code duplication: the same processing can now be applied to
the primary and the secondary bucket alike.
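The helper is invoked once per bucket, with the two hash values swapped
for the secondary lookup, e.g. (taken from the diff below):

	/* primary bucket: sig is the current signature */
	ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
	...
	/* secondary bucket: the roles of the two hashes are reversed */
	ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);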
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 186 ++++++++++++++++++--------------------
1 file changed, 90 insertions(+), 96 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index a07543a..574764f 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -476,6 +476,33 @@ enqueue_slot_back(const struct rte_hash *h,
rte_ring_sp_enqueue(h->free_slots, slot_id);
}
+/* Search a key from bucket and update its data */
+static inline int32_t
+search_and_update(const struct rte_hash *h, void *data, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig, hash_sig_t alt_hash)
+{
+ int i;
+ struct rte_hash_key *k, *keys = h->key_store;
+
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (bkt->sig_current[i] == sig &&
+ bkt->sig_alt[i] == alt_hash) {
+ k = (struct rte_hash_key *) ((char *)keys +
+ bkt->key_idx[i] * h->key_entry_size);
+ if (rte_hash_cmp_eq(key, k->key, h) == 0) {
+ /* Update data */
+ k->pdata = data;
+ /*
+ * Return index where key is stored,
+ * subtracting the first dummy index
+ */
+ return bkt->key_idx[i] - 1;
+ }
+ }
+ }
+ return -1;
+}
+
static inline int32_t
__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
hash_sig_t sig, void *data)
@@ -484,7 +511,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
uint32_t prim_bucket_idx, sec_bucket_idx;
unsigned i;
struct rte_hash_bucket *prim_bkt, *sec_bkt;
- struct rte_hash_key *new_k, *k, *keys = h->key_store;
+ struct rte_hash_key *new_k, *keys = h->key_store;
void *slot_id = NULL;
uint32_t new_idx;
int ret;
@@ -538,46 +565,14 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
new_idx = (uint32_t)((uintptr_t) slot_id);
/* Check if key is already inserted in primary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (prim_bkt->sig_current[i] == sig &&
- prim_bkt->sig_alt[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- prim_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = prim_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
- }
+ ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
+ if (ret != -1)
+ goto failure;
/* Check if key is already inserted in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (sec_bkt->sig_alt[i] == sig &&
- sec_bkt->sig_current[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- sec_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = sec_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
- }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1)
+ goto failure;
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
@@ -690,20 +685,15 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data)
else
return ret;
}
+
+/* Search one bucket to find the match key */
static inline int32_t
-__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig, void **data)
+search_one_bucket(const struct rte_hash *h, const void *key, hash_sig_t sig,
+ void **data, struct rte_hash_bucket *bkt)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
+ int i;
struct rte_hash_key *k, *keys = h->key_store;
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
-
- /* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
if (bkt->sig_current[i] == sig &&
bkt->key_idx[i] != EMPTY_SLOT) {
@@ -720,6 +710,26 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig, void **data)
+{
+ uint32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int ret;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+
+ /* Check if key is in primary location */
+ ret = search_one_bucket(h, key, sig, data, bkt);
+ if (ret != -1)
+ return ret;
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
@@ -727,22 +737,9 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
bkt = &h->buckets[bucket_idx];
/* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->sig_alt[i] == sig) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- if (data != NULL)
- *data = k->pdata;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- return bkt->key_idx[i] - 1;
- }
- }
- }
+ ret = search_one_bucket(h, key, alt_hash, data, bkt);
+ if (ret != -1)
+ return ret;
return -ENOENT;
}
@@ -806,19 +803,14 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
}
}
+/* Search one bucket and remove the matched key */
static inline int32_t
-__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig)
+search_and_remove(const struct rte_hash *h, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig,
+ int32_t *ret_val)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
struct rte_hash_key *k, *keys = h->key_store;
- int32_t ret;
-
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
+ unsigned int i;
/* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
@@ -833,37 +825,39 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
* Return index where key is stored,
* subtracting the first dummy index
*/
- ret = bkt->key_idx[i] - 1;
+ *ret_val = bkt->key_idx[i] - 1;
bkt->key_idx[i] = EMPTY_SLOT;
- return ret;
+ return 0;
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig)
+{
+ uint32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int32_t ret_val;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+ /* look for key in primary bucket */
+ if (!search_and_remove(h, key, bkt, sig, &ret_val))
+ return ret_val;
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
- /* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->key_idx[i] != EMPTY_SLOT) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- remove_entry(h, bkt, i);
-
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = bkt->key_idx[i] - 1;
- bkt->key_idx[i] = EMPTY_SLOT;
- return ret;
- }
- }
- }
+ /* look for key in secondary bucket */
+ if (!search_and_remove(h, key, bkt, alt_hash, &ret_val))
+ return ret_val;
return -ENOENT;
}
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v2 2/6] hash: add read and write concurrency support
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library Yipeng Wang
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 1/6] hash: make duplicated code into functions Yipeng Wang
@ 2018-06-29 12:24 ` Yipeng Wang
2018-07-06 17:11 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 3/6] test: add tests in hash table perf test Yipeng Wang
` (3 subsequent siblings)
5 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-06-29 12:24 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
The existing implementation of librte_hash does not support read-write
concurrency. This commit implements read-write safety using rte_rwlock,
or the TM version of rte_rwlock when hardware transactional memory is
available.
Both multi-writer and read-write concurrency are now protected by
rte_rwlock. The x86 specific header file is removed since the x86
specific RTM functions are no longer called directly by rte_hash.
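From an application's point of view, the new behavior is opt-in via the
extra flag at creation time; a minimal sketch (table parameters are
illustrative):

	struct rte_hash_parameters params = {
		.name = "rw_demo",
		.entries = 1024,
		.key_len = sizeof(uint32_t),
		.socket_id = rte_socket_id(),
		/* lookups may now run concurrently with add/delete */
		.extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY,
	};
	struct rte_hash *h = rte_hash_create(&params);

OR-ing in RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT additionally selects
the TM variant of the lock when the hardware supports it.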
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/meson.build | 1 -
lib/librte_hash/rte_cuckoo_hash.c | 507 ++++++++++++++++++++++------------
lib/librte_hash/rte_cuckoo_hash.h | 18 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 164 -----------
lib/librte_hash/rte_hash.h | 3 +
5 files changed, 338 insertions(+), 355 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
index e139e1d..efc06ed 100644
--- a/lib/librte_hash/meson.build
+++ b/lib/librte_hash/meson.build
@@ -6,7 +6,6 @@ headers = files('rte_cmp_arm64.h',
'rte_cmp_x86.h',
'rte_crc_arm64.h',
'rte_cuckoo_hash.h',
- 'rte_cuckoo_hash_x86.h',
'rte_fbk_hash.h',
'rte_hash_crc.h',
'rte_hash.h',
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 574764f..d2c7629 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -31,9 +31,6 @@
#include "rte_hash.h"
#include "rte_cuckoo_hash.h"
-#if defined(RTE_ARCH_X86)
-#include "rte_cuckoo_hash_x86.h"
-#endif
TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
@@ -93,8 +90,10 @@ rte_hash_create(const struct rte_hash_parameters *params)
void *buckets = NULL;
char ring_name[RTE_RING_NAMESIZE];
unsigned num_key_slots;
- unsigned hw_trans_mem_support = 0;
unsigned i;
+ unsigned int hw_trans_mem_support = 0, multi_writer_support = 0;
+ unsigned int readwrite_concur_support = 0;
+
rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
hash_list = RTE_TAILQ_CAST(rte_hash_tailq.head, rte_hash_list);
@@ -118,8 +117,16 @@ rte_hash_create(const struct rte_hash_parameters *params)
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
hw_trans_mem_support = 1;
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD)
+ multi_writer_support = 1;
+
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY) {
+ readwrite_concur_support = 1;
+ multi_writer_support = 1;
+ }
+
/* Store all keys and leave the first entry as a dummy entry for lookup_bulk */
- if (hw_trans_mem_support)
+ if (multi_writer_support)
/*
* Increase number of slots by total number of indices
* that can be stored in the lcore caches
@@ -233,7 +240,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->cmp_jump_table_idx = KEY_OTHER_BYTES;
#endif
- if (hw_trans_mem_support) {
+ if (multi_writer_support) {
h->local_free_slots = rte_zmalloc_socket(NULL,
sizeof(struct lcore_cache) * RTE_MAX_LCORE,
RTE_CACHE_LINE_SIZE, params->socket_id);
@@ -261,6 +268,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->key_store = k;
h->free_slots = r;
h->hw_trans_mem_support = hw_trans_mem_support;
+ h->multi_writer_support = multi_writer_support;
+ h->readwrite_concur_support = readwrite_concur_support;
#if defined(RTE_ARCH_X86)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
@@ -271,21 +280,17 @@ rte_hash_create(const struct rte_hash_parameters *params)
#endif
h->sig_cmp_fn = RTE_HASH_COMPARE_SCALAR;
- /* Turn on multi-writer only with explicit flat from user and TM
+ /* Turn on multi-writer only with explicit flag from user and TM
* support.
*/
- if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD) {
- if (h->hw_trans_mem_support) {
- h->add_key = ADD_KEY_MULTIWRITER_TM;
- } else {
- h->add_key = ADD_KEY_MULTIWRITER;
- h->multiwriter_lock = rte_malloc(NULL,
- sizeof(rte_spinlock_t),
+ if (h->multi_writer_support) {
+ h->readwrite_lock = rte_malloc(NULL, sizeof(rte_rwlock_t),
LCORE_CACHE_SIZE);
- rte_spinlock_init(h->multiwriter_lock);
- }
- } else
- h->add_key = ADD_KEY_SINGLEWRITER;
+ if (h->readwrite_lock == NULL)
+ goto err_unlock;
+
+ rte_rwlock_init(h->readwrite_lock);
+ }
/* Populate free slots ring. Entry zero is reserved for key misses. */
for (i = 1; i < params->entries + 1; i++)
@@ -335,11 +340,10 @@ rte_hash_free(struct rte_hash *h)
rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
- if (h->hw_trans_mem_support)
+ if (h->multi_writer_support) {
rte_free(h->local_free_slots);
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_free(h->multiwriter_lock);
+ rte_free(h->readwrite_lock);
+ }
rte_ring_free(h->free_slots);
rte_free(h->key_store);
rte_free(h->buckets);
@@ -366,6 +370,44 @@ rte_hash_secondary_hash(const hash_sig_t primary_hash)
return primary_hash ^ ((tag + 1) * alt_bits_xor);
}
+/* Read write locks implemented using rte_rwlock */
+static inline void
+__hash_rw_writer_lock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_lock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_lock(h->readwrite_lock);
+}
+
+
+static inline void
+__hash_rw_reader_lock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_lock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_lock(h->readwrite_lock);
+}
+
+static inline void
+__hash_rw_writer_unlock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_unlock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_unlock(h->readwrite_lock);
+}
+
+static inline void
+__hash_rw_reader_unlock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_unlock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_unlock(h->readwrite_lock);
+}
+
void
rte_hash_reset(struct rte_hash *h)
{
@@ -375,6 +417,7 @@ rte_hash_reset(struct rte_hash *h)
if (h == NULL)
return;
+ __hash_rw_writer_lock(h);
memset(h->buckets, 0, h->num_buckets * sizeof(struct rte_hash_bucket));
memset(h->key_store, 0, h->key_entry_size * (h->entries + 1));
@@ -386,77 +429,12 @@ rte_hash_reset(struct rte_hash *h)
for (i = 1; i < h->entries + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
/* Reset local caches per lcore */
for (i = 0; i < RTE_MAX_LCORE; i++)
h->local_free_slots[i].len = 0;
}
-}
-
-/* Search for an entry that can be pushed to its alternative location */
-static inline int
-make_space_bucket(const struct rte_hash *h, struct rte_hash_bucket *bkt,
- unsigned int *nr_pushes)
-{
- unsigned i, j;
- int ret;
- uint32_t next_bucket_idx;
- struct rte_hash_bucket *next_bkt[RTE_HASH_BUCKET_ENTRIES];
-
- /*
- * Push existing item (search for bucket with space in
- * alternative locations) to its alternative location
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Search for space in alternative locations */
- next_bucket_idx = bkt->sig_alt[i] & h->bucket_bitmask;
- next_bkt[i] = &h->buckets[next_bucket_idx];
- for (j = 0; j < RTE_HASH_BUCKET_ENTRIES; j++) {
- if (next_bkt[i]->key_idx[j] == EMPTY_SLOT)
- break;
- }
-
- if (j != RTE_HASH_BUCKET_ENTRIES)
- break;
- }
-
- /* Alternative location has spare room (end of recursive function) */
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- next_bkt[i]->sig_alt[j] = bkt->sig_current[i];
- next_bkt[i]->sig_current[j] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[j] = bkt->key_idx[i];
- return i;
- }
-
- /* Pick entry that has not been pushed yet */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++)
- if (bkt->flag[i] == 0)
- break;
-
- /* All entries have been pushed, so entry cannot be added */
- if (i == RTE_HASH_BUCKET_ENTRIES || ++(*nr_pushes) > RTE_HASH_MAX_PUSHES)
- return -ENOSPC;
-
- /* Set flag to indicate that this entry is going to be pushed */
- bkt->flag[i] = 1;
-
- /* Need room in alternative bucket to insert the pushed entry */
- ret = make_space_bucket(h, next_bkt[i], nr_pushes);
- /*
- * After recursive function.
- * Clear flags and insert the pushed entry
- * in its alternative location if successful,
- * or return error
- */
- bkt->flag[i] = 0;
- if (ret >= 0) {
- next_bkt[i]->sig_alt[ret] = bkt->sig_current[i];
- next_bkt[i]->sig_current[ret] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[ret] = bkt->key_idx[i];
- return i;
- } else
- return ret;
-
+ __hash_rw_writer_unlock(h);
}
/*
@@ -469,7 +447,7 @@ enqueue_slot_back(const struct rte_hash *h,
struct lcore_cache *cached_free_slots,
void *slot_id)
{
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
cached_free_slots->objs[cached_free_slots->len] = slot_id;
cached_free_slots->len++;
} else
@@ -503,13 +481,199 @@ search_and_update(const struct rte_hash *h, void *data, const void *key,
return -1;
}
+/* Only tries to insert at one bucket (@prim_bkt) without trying to push
+ * buckets around.
+ * Return 1 if the key matches an existing entry, 0 on success, and -1
+ * if no empty entry is available.
+ */
+static inline int32_t
+rte_hash_cuckoo_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *prim_bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ unsigned int i;
+ struct rte_hash_bucket *cur_bkt = prim_bkt;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+ /* Check if key was inserted after last check but before this
+ * protected region in case of inserting duplicated keys.
+ */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ /* Insert new entry if there is room in the primary
+ * bucket.
+ */
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ /* Check if slot is available */
+ if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
+ prim_bkt->sig_current[i] = sig;
+ prim_bkt->sig_alt[i] = alt_hash;
+ prim_bkt->key_idx[i] = new_idx;
+ break;
+ }
+ }
+ __hash_rw_writer_unlock(h);
+
+ if (i != RTE_HASH_BUCKET_ENTRIES)
+ return 0;
+
+ /* no empty entry */
+ return -1;
+}
+
+/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
+ * the path head with new entry (sig, alt_hash, new_idx)
+ * Return 1 if a matching key is found, -1 if the cuckoo path is
+ * invalidated (insertion fails), and 0 on success.
+ */
+static inline int
+rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *alt_bkt,
+ const struct rte_hash_key *key, void *data,
+ struct queue_node *leaf, uint32_t leaf_slot,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ uint32_t prev_alt_bkt_idx;
+ struct rte_hash_bucket *cur_bkt = bkt;
+ struct queue_node *prev_node, *curr_node = leaf;
+ struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
+ uint32_t prev_slot, curr_slot = leaf_slot;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+
+ /* Check if key was inserted after last check but before this
+ * protected region.
+ */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ ret = search_and_update(h, data, key, alt_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ while (likely(curr_node->prev != NULL)) {
+ prev_node = curr_node->prev;
+ prev_bkt = prev_node->bkt;
+ prev_slot = curr_node->prev_slot;
+
+ prev_alt_bkt_idx =
+ prev_bkt->sig_alt[prev_slot] & h->bucket_bitmask;
+
+ if (unlikely(&h->buckets[prev_alt_bkt_idx]
+ != curr_bkt)) {
+ __hash_rw_writer_unlock(h);
+ return -1;
+ }
+
+ /* Need to swap current/alt sig to allow later
+ * Cuckoo insert to move elements back to its
+ * primary bucket if available
+ */
+ curr_bkt->sig_alt[curr_slot] =
+ prev_bkt->sig_current[prev_slot];
+ curr_bkt->sig_current[curr_slot] =
+ prev_bkt->sig_alt[prev_slot];
+ curr_bkt->key_idx[curr_slot] =
+ prev_bkt->key_idx[prev_slot];
+
+ curr_slot = prev_slot;
+ curr_node = prev_node;
+ curr_bkt = curr_node->bkt;
+ }
+
+ curr_bkt->sig_current[curr_slot] = sig;
+ curr_bkt->sig_alt[curr_slot] = alt_hash;
+ curr_bkt->key_idx[curr_slot] = new_idx;
+
+ __hash_rw_writer_unlock(h);
+
+ return 0;
+
+}
+
+/*
+ * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
+ * Cuckoo
+ */
+static inline int
+rte_hash_cuckoo_make_space_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash,
+ uint32_t new_idx, int32_t *ret_val)
+{
+ unsigned int i;
+ struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
+ struct queue_node *tail, *head;
+ struct rte_hash_bucket *curr_bkt, *alt_bkt;
+
+ tail = queue;
+ head = queue + 1;
+ tail->bkt = bkt;
+ tail->prev = NULL;
+ tail->prev_slot = -1;
+
+ /* Cuckoo bfs Search */
+ while (likely(tail != head && head <
+ queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
+ RTE_HASH_BUCKET_ENTRIES)) {
+ curr_bkt = tail->bkt;
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
+ int32_t ret = rte_hash_cuckoo_move_insert_mw(h,
+ bkt, sec_bkt, key, data,
+ tail, i, sig, alt_hash,
+ new_idx, ret_val);
+ if (likely(ret != -1))
+ return ret;
+ }
+
+ /* Enqueue new node and keep prev node info */
+ alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
+ & h->bucket_bitmask]);
+ head->bkt = alt_bkt;
+ head->prev = tail;
+ head->prev_slot = i;
+ head++;
+ }
+ tail++;
+ }
+
+ return -ENOSPC;
+}
+
static inline int32_t
__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
hash_sig_t sig, void *data)
{
hash_sig_t alt_hash;
uint32_t prim_bucket_idx, sec_bucket_idx;
- unsigned i;
struct rte_hash_bucket *prim_bkt, *sec_bkt;
struct rte_hash_key *new_k, *keys = h->key_store;
void *slot_id = NULL;
@@ -518,10 +682,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
unsigned n_slots;
unsigned lcore_id;
struct lcore_cache *cached_free_slots = NULL;
- unsigned int nr_pushes = 0;
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_lock(h->multiwriter_lock);
+ int32_t ret_val;
prim_bucket_idx = sig & h->bucket_bitmask;
prim_bkt = &h->buckets[prim_bucket_idx];
@@ -532,8 +693,24 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
sec_bkt = &h->buckets[sec_bucket_idx];
rte_prefetch0(sec_bkt);
- /* Get a new slot for storing the new key */
- if (h->hw_trans_mem_support) {
+ /* Check if key is already inserted in primary location */
+ __hash_rw_writer_lock(h);
+ ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ return ret;
+ }
+
+ /* Check if key is already inserted in secondary location */
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ return ret;
+ }
+ __hash_rw_writer_unlock(h);
+
+ /* Didn't find a match, so get a new slot for storing the new key */
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Try to get a free slot from the local cache */
@@ -543,8 +720,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
cached_free_slots->objs,
LCORE_CACHE_SIZE, NULL);
if (n_slots == 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
cached_free_slots->len += n_slots;
@@ -555,92 +731,50 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
slot_id = cached_free_slots->objs[cached_free_slots->len];
} else {
if (rte_ring_sc_dequeue(h->free_slots, &slot_id) != 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
}
new_k = RTE_PTR_ADD(keys, (uintptr_t)slot_id * h->key_entry_size);
- rte_prefetch0(new_k);
new_idx = (uint32_t)((uintptr_t) slot_id);
-
- /* Check if key is already inserted in primary location */
- ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
- if (ret != -1)
- goto failure;
-
- /* Check if key is already inserted in secondary location */
- ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
- if (ret != -1)
- goto failure;
-
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
new_k->pdata = data;
-#if defined(RTE_ARCH_X86) /* currently only x86 support HTM */
- if (h->add_key == ADD_KEY_MULTIWRITER_TM) {
- ret = rte_hash_cuckoo_insert_mw_tm(prim_bkt,
- sig, alt_hash, new_idx);
- if (ret >= 0)
- return new_idx - 1;
- /* Primary bucket full, need to make space for new entry */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, prim_bkt, sig,
- alt_hash, new_idx);
+ /* Find an empty slot and insert */
+ ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- if (ret >= 0)
- return new_idx - 1;
+ /* Primary bucket full, need to make space for new entry */
+ ret = rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- /* Also search secondary bucket to get better occupancy */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, sec_bkt, sig,
- alt_hash, new_idx);
+ /* Also search secondary bucket to get better occupancy */
+ ret = rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
- if (ret >= 0)
- return new_idx - 1;
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
} else {
-#endif
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
-
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-
- /* Primary bucket full, need to make space for new entry
- * After recursive function.
- * Insert the new entry in the position of the pushed entry
- * if successful or return error and
- * store the new slot back in the ring
- */
- ret = make_space_bucket(h, prim_bkt, &nr_pushes);
- if (ret >= 0) {
- prim_bkt->sig_current[ret] = sig;
- prim_bkt->sig_alt[ret] = alt_hash;
- prim_bkt->key_idx[ret] = new_idx;
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-#if defined(RTE_ARCH_X86)
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret;
}
-#endif
- /* Error in addition, store new slot back in the ring and return error */
- enqueue_slot_back(h, cached_free_slots, (void *)((uintptr_t) new_idx));
-
-failure:
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return ret;
}
int32_t
@@ -725,12 +859,14 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
bucket_idx = sig & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
+ __hash_rw_reader_lock(h);
/* Check if key is in primary location */
ret = search_one_bucket(h, key, sig, data, bkt);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
return ret;
-
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
@@ -738,9 +874,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
/* Check if key is in secondary location */
ret = search_one_bucket(h, key, alt_hash, data, bkt);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
return ret;
-
+ }
+ __hash_rw_reader_unlock(h);
return -ENOENT;
}
@@ -782,7 +920,7 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
bkt->sig_current[i] = NULL_SIGNATURE;
bkt->sig_alt[i] = NULL_SIGNATURE;
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Cache full, need to free it. */
@@ -846,19 +984,23 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
bucket_idx = sig & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
+ __hash_rw_writer_lock(h);
/* look for key in primary bucket */
- if (!search_and_remove(h, key, bkt, sig, &ret_val))
+ if (!search_and_remove(h, key, bkt, sig, &ret_val)) {
+ __hash_rw_writer_unlock(h);
return ret_val;
-
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
/* look for key in secondary bucket */
- if (!search_and_remove(h, key, bkt, alt_hash, &ret_val))
+ if (!search_and_remove(h, key, bkt, alt_hash, &ret_val)) {
+ __hash_rw_writer_unlock(h);
return ret_val;
-
+ }
+ __hash_rw_writer_unlock(h);
return -ENOENT;
}
@@ -1000,6 +1142,7 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
rte_prefetch0(secondary_bkt[i]);
}
+ __hash_rw_reader_lock(h);
/* Compare signatures and prefetch key slot of first hit */
for (i = 0; i < num_keys; i++) {
compare_signatures(&prim_hitmask[i], &sec_hitmask[i],
@@ -1082,6 +1225,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
continue;
}
+ __hash_rw_reader_unlock(h);
+
if (hit_mask != NULL)
*hit_mask = hits;
}
@@ -1140,7 +1285,7 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
bucket_idx = *next / RTE_HASH_BUCKET_ENTRIES;
idx = *next % RTE_HASH_BUCKET_ENTRIES;
}
-
+ __hash_rw_reader_lock(h);
/* Get position of entry in key table */
position = h->buckets[bucket_idx].key_idx[idx];
next_key = (struct rte_hash_key *) ((char *)h->key_store +
@@ -1149,6 +1294,8 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
*key = next_key->key;
*data = next_key->pdata;
+ __hash_rw_reader_unlock(h);
+
/* Increment iterator */
(*next)++;
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index 7a54e55..db4d1a0 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -88,11 +88,6 @@ const rte_hash_cmp_eq_t cmp_jump_table[NUM_KEY_CMP_CASES] = {
#endif
-enum add_key_case {
- ADD_KEY_SINGLEWRITER = 0,
- ADD_KEY_MULTIWRITER,
- ADD_KEY_MULTIWRITER_TM,
-};
/** Number of items per bucket. */
#define RTE_HASH_BUCKET_ENTRIES 8
@@ -155,18 +150,20 @@ struct rte_hash {
struct rte_ring *free_slots;
/**< Ring that stores all indexes of the free slots in the key table */
- uint8_t hw_trans_mem_support;
- /**< Hardware transactional memory support */
+
struct lcore_cache *local_free_slots;
/**< Local cache per lcore, storing some indexes of the free slots */
- enum add_key_case add_key; /**< Multi-writer hash add behavior */
-
- rte_spinlock_t *multiwriter_lock; /**< Multi-writer spinlock for w/o TM */
/* Fields used in lookup */
uint32_t key_len __rte_cache_aligned;
/**< Length of hash key. */
+ uint8_t hw_trans_mem_support;
+ /**< If hardware transactional memory is used. */
+ uint8_t multi_writer_support;
+ /**< If multi-writer support is enabled. */
+ uint8_t readwrite_concur_support;
+ /**< If read-write concurrency support is enabled */
rte_hash_function hash_func; /**< Function used to calculate hash. */
uint32_t hash_func_init_val; /**< Init value used by hash_func. */
rte_hash_cmp_eq_t rte_hash_custom_cmp_eq;
@@ -184,6 +181,7 @@ struct rte_hash {
/**< Table with buckets storing all the hash values and key indexes
* to the key table.
*/
+ rte_rwlock_t *readwrite_lock; /**< Read-write lock thread-safety. */
} __rte_cache_aligned;
struct queue_node {
diff --git a/lib/librte_hash/rte_cuckoo_hash_x86.h b/lib/librte_hash/rte_cuckoo_hash_x86.h
deleted file mode 100644
index 2c5b017..0000000
--- a/lib/librte_hash/rte_cuckoo_hash_x86.h
+++ /dev/null
@@ -1,164 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2016 Intel Corporation
- */
-
-/* rte_cuckoo_hash_x86.h
- * This file holds all x86 specific Cuckoo Hash functions
- */
-
-/* Only tries to insert at one bucket (@prim_bkt) without trying to push
- * buckets around
- */
-static inline unsigned
-rte_hash_cuckoo_insert_mw_tm(struct rte_hash_bucket *prim_bkt,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned i, status;
- unsigned try = 0;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- /* Insert new entry if there is room in the primary
- * bucket.
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
- rte_xend();
-
- if (i != RTE_HASH_BUCKET_ENTRIES)
- return 0;
-
- break; /* break off try loop if transaction commits */
- } else {
- /* If we abort we give up this cuckoo path. */
- try++;
- rte_pause();
- }
- }
-
- return -1;
-}
-
-/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
- * the path head with new entry (sig, alt_hash, new_idx)
- */
-static inline int
-rte_hash_cuckoo_move_insert_mw_tm(const struct rte_hash *h,
- struct queue_node *leaf, uint32_t leaf_slot,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned try = 0;
- unsigned status;
- uint32_t prev_alt_bkt_idx;
-
- struct queue_node *prev_node, *curr_node = leaf;
- struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
- uint32_t prev_slot, curr_slot = leaf_slot;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- while (likely(curr_node->prev != NULL)) {
- prev_node = curr_node->prev;
- prev_bkt = prev_node->bkt;
- prev_slot = curr_node->prev_slot;
-
- prev_alt_bkt_idx
- = prev_bkt->sig_alt[prev_slot]
- & h->bucket_bitmask;
-
- if (unlikely(&h->buckets[prev_alt_bkt_idx]
- != curr_bkt)) {
- rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
- }
-
- /* Need to swap current/alt sig to allow later
- * Cuckoo insert to move elements back to its
- * primary bucket if available
- */
- curr_bkt->sig_alt[curr_slot] =
- prev_bkt->sig_current[prev_slot];
- curr_bkt->sig_current[curr_slot] =
- prev_bkt->sig_alt[prev_slot];
- curr_bkt->key_idx[curr_slot]
- = prev_bkt->key_idx[prev_slot];
-
- curr_slot = prev_slot;
- curr_node = prev_node;
- curr_bkt = curr_node->bkt;
- }
-
- curr_bkt->sig_current[curr_slot] = sig;
- curr_bkt->sig_alt[curr_slot] = alt_hash;
- curr_bkt->key_idx[curr_slot] = new_idx;
-
- rte_xend();
-
- return 0;
- }
-
- /* If we abort we give up this cuckoo path, since most likely it's
- * no longer valid as TSX detected data conflict
- */
- try++;
- rte_pause();
- }
-
- return -1;
-}
-
-/*
- * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
- * Cuckoo
- */
-static inline int
-rte_hash_cuckoo_make_space_mw_tm(const struct rte_hash *h,
- struct rte_hash_bucket *bkt,
- hash_sig_t sig, hash_sig_t alt_hash,
- uint32_t new_idx)
-{
- unsigned i;
- struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
- struct queue_node *tail, *head;
- struct rte_hash_bucket *curr_bkt, *alt_bkt;
-
- tail = queue;
- head = queue + 1;
- tail->bkt = bkt;
- tail->prev = NULL;
- tail->prev_slot = -1;
-
- /* Cuckoo bfs Search */
- while (likely(tail != head && head <
- queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
- RTE_HASH_BUCKET_ENTRIES)) {
- curr_bkt = tail->bkt;
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
- if (likely(rte_hash_cuckoo_move_insert_mw_tm(h,
- tail, i, sig,
- alt_hash, new_idx) == 0))
- return 0;
- }
-
- /* Enqueue new node and keep prev node info */
- alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
- & h->bucket_bitmask]);
- head->bkt = alt_bkt;
- head->prev = tail;
- head->prev_slot = i;
- head++;
- }
- tail++;
- }
-
- return -ENOSPC;
-}
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index f71ca9f..ecb49e4 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -34,6 +34,9 @@ extern "C" {
/** Default behavior of insertion, single writer/multi writer */
#define RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD 0x02
+/** Flag to support reader writer concurrency */
+#define RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY 0x04
+
/** Signature of key that is stored internally. */
typedef uint32_t hash_sig_t;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v2 3/6] test: add tests in hash table perf test
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library Yipeng Wang
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 1/6] hash: make duplicated code into functions Yipeng Wang
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 2/6] hash: add read and write concurrency support Yipeng Wang
@ 2018-06-29 12:24 ` Yipeng Wang
2018-07-06 17:17 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 4/6] test: add test case for read write concurrency Yipeng Wang
` (2 subsequent siblings)
5 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-06-29 12:24 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
New code is added to support read-write concurrency for
rte_hash. Due to the newly added code in the critical path,
the perf test is modified to show any performance impact.
It is still a single-thread test.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
test/test/test_hash_perf.c | 36 +++++++++++++++++++++++++-----------
1 file changed, 25 insertions(+), 11 deletions(-)
diff --git a/test/test/test_hash_perf.c b/test/test/test_hash_perf.c
index a81d0c7..33dcb9f 100644
--- a/test/test/test_hash_perf.c
+++ b/test/test/test_hash_perf.c
@@ -76,7 +76,8 @@ static struct rte_hash_parameters ut_params = {
};
static int
-create_table(unsigned with_data, unsigned table_index)
+create_table(unsigned int with_data, unsigned int table_index,
+ unsigned int with_locks)
{
char name[RTE_HASH_NAMESIZE];
@@ -86,6 +87,14 @@ create_table(unsigned with_data, unsigned table_index)
else
sprintf(name, "test_hash%d", hashtest_key_lens[table_index]);
+
+ if (with_locks)
+ ut_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT
+ | RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ ut_params.extra_flag = 0;
+
ut_params.name = name;
ut_params.key_len = hashtest_key_lens[table_index];
ut_params.socket_id = rte_socket_id();
@@ -459,7 +468,7 @@ reset_table(unsigned table_index)
}
static int
-run_all_tbl_perf_tests(unsigned with_pushes)
+run_all_tbl_perf_tests(unsigned int with_pushes, unsigned int with_locks)
{
unsigned i, j, with_data, with_hash;
@@ -468,7 +477,7 @@ run_all_tbl_perf_tests(unsigned with_pushes)
for (with_data = 0; with_data <= 1; with_data++) {
for (i = 0; i < NUM_KEYSIZES; i++) {
- if (create_table(with_data, i) < 0)
+ if (create_table(with_data, i, with_locks) < 0)
return -1;
if (get_input_keys(with_pushes, i) < 0)
@@ -611,15 +620,20 @@ fbk_hash_perf_test(void)
static int
test_hash_perf(void)
{
- unsigned with_pushes;
-
- for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
- if (with_pushes == 0)
- printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ unsigned int with_pushes, with_locks;
+ for (with_locks = 0; with_locks <= 1; with_locks++) {
+ if (with_locks)
+ printf("\nWith locks in the code\n");
else
- printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
- if (run_all_tbl_perf_tests(with_pushes) < 0)
- return -1;
+ printf("\nWithout locks in the code\n");
+ for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
+ if (with_pushes == 0)
+ printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ else
+ printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
+ if (run_all_tbl_perf_tests(with_pushes, with_locks) < 0)
+ return -1;
+ }
}
if (fbk_hash_perf_test() < 0)
return -1;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v2 4/6] test: add test case for read write concurrency
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library Yipeng Wang
` (2 preceding siblings ...)
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 3/6] test: add tests in hash table perf test Yipeng Wang
@ 2018-06-29 12:24 ` Yipeng Wang
2018-07-06 17:31 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 5/6] hash: fix to have more accurate key slot size Yipeng Wang
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 6/6] hash: add new API function to query the key count Yipeng Wang
5 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-06-29 12:24 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit adds a new test case for testing read/write concurrency.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
test/test/Makefile | 1 +
test/test/test_hash_readwrite.c | 645 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 646 insertions(+)
create mode 100644 test/test/test_hash_readwrite.c
diff --git a/test/test/Makefile b/test/test/Makefile
index eccc8ef..6ce66c9 100644
--- a/test/test/Makefile
+++ b/test/test/Makefile
@@ -113,6 +113,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
diff --git a/test/test/test_hash_readwrite.c b/test/test/test_hash_readwrite.c
new file mode 100644
index 0000000..db2ded5
--- /dev/null
+++ b/test/test/test_hash_readwrite.c
@@ -0,0 +1,645 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <locale.h>
+
+#include <rte_cycles.h>
+#include <rte_hash.h>
+#include <rte_hash_crc.h>
+#include <rte_launch.h>
+#include <rte_malloc.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+
+#include "test.h"
+
+
+#define RTE_RWTEST_FAIL 0
+
+#define TOTAL_ENTRY (16*1024*1024)
+#define TOTAL_INSERT (15*1024*1024)
+
+#define NUM_TEST 3
+unsigned int core_cnt[NUM_TEST] = {2, 4, 8};
+
+
+struct perf {
+ uint32_t single_read;
+ uint32_t single_write;
+ uint32_t read_only[NUM_TEST];
+ uint32_t write_only[NUM_TEST];
+ uint32_t read_write_r[NUM_TEST];
+ uint32_t read_write_w[NUM_TEST];
+};
+
+static struct perf htm_results, non_htm_results;
+
+struct {
+ uint32_t *keys;
+ uint32_t *found;
+ uint32_t num_insert;
+ uint32_t rounded_tot_insert;
+ struct rte_hash *h;
+} tbl_rw_test_param;
+
+static rte_atomic64_t gcycles;
+static rte_atomic64_t ginsertions;
+
+static rte_atomic64_t gread_cycles;
+static rte_atomic64_t gwrite_cycles;
+
+static rte_atomic64_t greads;
+static rte_atomic64_t gwrites;
+
+static int
+test_hash_readwrite_worker(__attribute__((unused)) void *arg)
+{
+ uint64_t i, offset;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+
+ offset = (lcore_id - rte_get_master_lcore())
+ * tbl_rw_test_param.num_insert;
+
+ printf("Core #%d inserting and reading %d: %'"PRId64" - %'"PRId64"\n",
+ lcore_id, tbl_rw_test_param.num_insert,
+ offset, offset + tbl_rw_test_param.num_insert);
+
+ begin = rte_rdtsc_precise();
+
+
+ for (i = offset; i < offset + tbl_rw_test_param.num_insert; i++) {
+
+ if (rte_hash_lookup(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i) > 0)
+ break;
+
+ ret = rte_hash_add_key(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i);
+ if (ret < 0)
+ break;
+
+ if (rte_hash_lookup(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i) != ret)
+ break;
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gcycles, cycles);
+ rte_atomic64_add(&ginsertions, i - offset);
+
+ for (; i < offset + tbl_rw_test_param.num_insert; i++)
+ tbl_rw_test_param.keys[i] = RTE_RWTEST_FAIL;
+
+ return 0;
+}
+
+
+static int
+init_params(int use_htm)
+{
+ unsigned int i;
+
+ uint32_t *keys = NULL;
+ uint32_t *found = NULL;
+ struct rte_hash *handle;
+
+ struct rte_hash_parameters hash_params = {
+ .entries = TOTAL_ENTRY,
+ .key_len = sizeof(uint32_t),
+ .hash_func = rte_hash_crc,
+ .hash_func_init_val = 0,
+ .socket_id = rte_socket_id(),
+ };
+ if (use_htm)
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT |
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+
+ hash_params.name = "tests";
+
+ handle = rte_hash_create(&hash_params);
+ if (handle == NULL) {
+ printf("hash creation failed");
+ return -1;
+ }
+
+ tbl_rw_test_param.h = handle;
+ keys = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+
+ if (keys == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ found = rte_zmalloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+ if (found == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ goto err;
+ }
+
+
+ tbl_rw_test_param.keys = keys;
+ tbl_rw_test_param.found = found;
+
+ for (i = 0; i < TOTAL_ENTRY; i++)
+ keys[i] = i;
+
+ return 0;
+
+err:
+ rte_free(keys);
+ rte_hash_free(handle);
+
+ return -1;
+}
+
+static int
+test_hash_readwrite_functional(int use_htm)
+{
+ unsigned int i;
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+
+ rte_atomic64_init(&gcycles);
+ rte_atomic64_clear(&gcycles);
+
+ rte_atomic64_init(&ginsertions);
+ rte_atomic64_clear(&ginsertions);
+
+ if (init_params(use_htm) != 0)
+ goto err;
+
+ tbl_rw_test_param.num_insert =
+ TOTAL_INSERT / rte_lcore_count();
+
+ tbl_rw_test_param.rounded_tot_insert =
+ tbl_rw_test_param.num_insert
+ * rte_lcore_count();
+
+ printf("++++++++Start function tests:+++++++++\n");
+
+ /* Fire all threads. */
+ rte_eal_mp_remote_launch(test_hash_readwrite_worker,
+ NULL, CALL_MASTER);
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_rw_test_param.h, &next_key,
+ &next_data, &iter) >= 0) {
+		/* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_rw_test_param.found[i]++;
+ }
+
+ for (i = 0;
+ i < tbl_rw_test_param.rounded_tot_insert; i++) {
+ if (tbl_rw_test_param.keys[i] != RTE_RWTEST_FAIL) {
+ if (tbl_rw_test_param.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_rw_test_param.found[i] == 0) {
+ lost_keys++;
+ printf("key %d is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during read-write test.\n");
+
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gcycles) /
+ rte_atomic64_read(&ginsertions);
+
+ printf("cycles per insertion and lookup: %llu\n", cycles_per_insertion);
+
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+ printf("+++++++++Complete function tests+++++++++\n");
+ return 0;
+
+err_free:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+err:
+ return -1;
+}
+
+static int
+test_rw_reader(__attribute__((unused)) void *arg)
+{
+ uint64_t i;
+ uint64_t begin, cycles;
+ uint64_t read_cnt = (uint64_t)((uintptr_t)arg);
+
+ begin = rte_rdtsc_precise();
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ &data);
+ if (i != (uint64_t)(uintptr_t)data) {
+ printf("lookup find wrong value %"PRIu64","
+ "%"PRIu64"\n", i,
+ (uint64_t)(uintptr_t)data);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gread_cycles, cycles);
+ rte_atomic64_add(&greads, i);
+ return 0;
+}
+
+static int
+test_rw_writer(__attribute__((unused)) void *arg)
+{
+ uint64_t i;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+ uint64_t start_coreid = (uint64_t)(uintptr_t)arg;
+ uint64_t offset;
+
+ offset = TOTAL_INSERT / 2 + (lcore_id - start_coreid)
+ * tbl_rw_test_param.num_insert;
+ begin = rte_rdtsc_precise();
+ for (i = offset; i < offset + tbl_rw_test_param.num_insert; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("writer failed %"PRIu64"\n", i);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gwrite_cycles, cycles);
+ rte_atomic64_add(&gwrites, tbl_rw_test_param.num_insert);
+ return 0;
+}
+
+static int
+test_hash_readwrite_perf(struct perf *perf_results, int use_htm,
+ int reader_faster)
+{
+ unsigned int n;
+ int ret;
+ int start_coreid;
+ uint64_t i, read_cnt;
+
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+
+ uint64_t start = 0, end = 0;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gwrites);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&greads);
+
+ rte_atomic64_init(&gread_cycles);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_init(&gwrite_cycles);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ if (init_params(use_htm) != 0)
+ goto err;
+
+	/*
+	 * Run a test where either the readers or the writers finish faster.
+	 * When readers finish faster, we time the readers, and when writers
+	 * finish faster, we time the writers.
+	 * Dividing by 10 or 2 is just an experimental choice to vary the
+	 * workload of the readers.
+	 */
+ if (reader_faster) {
+ printf("++++++Start perf test: reader++++++++\n");
+ read_cnt = TOTAL_INSERT / 10;
+ } else {
+ printf("++++++Start perf test: writer++++++++\n");
+ read_cnt = TOTAL_INSERT / 2;
+ }
+
+
+ /* We first test single thread performance */
+ start = rte_rdtsc_precise();
+ /* Insert half of the keys */
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_write = end / i;
+
+ start = rte_rdtsc_precise();
+
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ &data);
+ if (i != (uint64_t)(uintptr_t)data) {
+ printf("lookup find wrong value"
+ " %"PRIu64",%"PRIu64"\n", i,
+ (uint64_t)(uintptr_t)data);
+ break;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_read = end / i;
+
+ for (n = 0; n < NUM_TEST; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < core_cnt[n] * 2 + 1)
+ goto finish;
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_rw_test_param.h);
+
+ tbl_rw_test_param.num_insert = TOTAL_INSERT / 2 / core_cnt[n];
+ tbl_rw_test_param.rounded_tot_insert = TOTAL_INSERT / 2 +
+ tbl_rw_test_param.num_insert *
+ core_cnt[n];
+
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+ /* Then test multiple thread case but only all reads or
+ * all writes
+ */
+
+ /* Test only reader cases */
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+
+ rte_eal_mp_wait_lcore();
+
+ start_coreid = i;
+ /* Test only writer cases */
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+
+
+ rte_eal_mp_wait_lcore();
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ perf_results->read_only[n] = cycles_per_insertion;
+ printf("Reader only: cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles) /
+ rte_atomic64_read(&gwrites);
+ perf_results->write_only[n] = cycles_per_insertion;
+ printf("Writer only: cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_rw_test_param.h);
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+
+ start_coreid = core_cnt[n] + 1;
+
+ if (reader_faster) {
+ for (i = core_cnt[n] + 1; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+ } else {
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ }
+
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_rw_test_param.h,
+ &next_key, &next_data, &iter) >= 0) {
+			/* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_rw_test_param.found[i]++;
+ }
+
+
+ for (i = 0; i <
+ tbl_rw_test_param.rounded_tot_insert; i++) {
+ if (tbl_rw_test_param.keys[i] != RTE_RWTEST_FAIL) {
+ if (tbl_rw_test_param.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_rw_test_param.found[i] == 0) {
+ lost_keys++;
+ printf("key %"PRIu64" is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during read-write test.\n");
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ perf_results->read_write_r[n] = cycles_per_insertion;
+ printf("Read-write cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles) /
+ rte_atomic64_read(&gwrites);
+ perf_results->read_write_w[n] = cycles_per_insertion;
+ printf("Read-write cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+ }
+
+finish:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+ return 0;
+
+err_free:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+
+err:
+ return -1;
+}
+
+
+static int
+test_hash_readwrite_main(void)
+{
+	/*
+	 * Variables used to choose different tests.
+	 * use_htm indicates if hardware transactional memory should be used.
+	 * reader_faster indicates if the reader threads should finish earlier
+	 * than the writer threads. This is used to time either the reader
+	 * threads or the writer threads for performance numbers.
+	 */
+ int use_htm, reader_faster;
+
+ if (rte_lcore_count() == 1) {
+ printf("More than one lcore is required "
+ "to do read write test\n");
+ return 0;
+ }
+
+
+ setlocale(LC_NUMERIC, "");
+
+ if (rte_tm_supported()) {
+ printf("Hardware transactional memory (lock elision) "
+ "is supported\n");
+
+ printf("Test read-write with Hardware transactional memory\n");
+
+ use_htm = 1;
+ if (test_hash_readwrite_functional(use_htm) < 0)
+ return -1;
+
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+ } else {
+ printf("Hardware transactional memory (lock elision) "
+ "is NOT supported\n");
+ }
+
+ printf("Test read-write without Hardware transactional memory\n");
+ use_htm = 0;
+ if (test_hash_readwrite_functional(use_htm) < 0)
+ return -1;
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&non_htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&non_htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+
+
+ printf("Results summary:\n");
+
+ int i;
+
+ printf("single read: %u\n", htm_results.single_read);
+ printf("single write: %u\n", htm_results.single_write);
+ for (i = 0; i < NUM_TEST; i++) {
+ printf("core_cnt: %u\n", core_cnt[i]);
+ printf("HTM:\n");
+ printf("read only: %u\n", htm_results.read_only[i]);
+ printf("write only: %u\n", htm_results.write_only[i]);
+ printf("read-write read: %u\n", htm_results.read_write_r[i]);
+ printf("read-write write: %u\n", htm_results.read_write_w[i]);
+
+ printf("non HTM:\n");
+ printf("read only: %u\n", non_htm_results.read_only[i]);
+ printf("write only: %u\n", non_htm_results.write_only[i]);
+ printf("read-write read: %u\n",
+ non_htm_results.read_write_r[i]);
+ printf("read-write write: %u\n",
+ non_htm_results.read_write_w[i]);
+ }
+
+ return 0;
+}
+
+REGISTER_TEST_COMMAND(hash_readwrite_autotest, test_hash_readwrite_main);
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v2 5/6] hash: fix to have more accurate key slot size
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library Yipeng Wang
` (3 preceding siblings ...)
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 4/6] test: add test case for read write concurrency Yipeng Wang
@ 2018-06-29 12:24 ` Yipeng Wang
2018-07-06 17:32 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 6/6] hash: add new API function to query the key count Yipeng Wang
5 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-06-29 12:24 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit calculates the needed key slot size more
accurately. The previous local cache fix requires
the free slot ring to be larger than actually needed,
and the calculation of that value was inaccurate.
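As a concrete example (assuming LCORE_CACHE_SIZE = 64 and
RTE_MAX_LCORE = 128; both vary by build): for a table with 1024
entries, the old formula sizes the ring for
1024 + 127 * 64 + 1 = 9153 key slots, while the corrected formula
gives 1024 + 127 * 63 + 1 = 9026, i.e. one slot fewer per
additional lcore cache.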
Fixes: 5915699153d7 ("hash: fix scaling by reducing contention")
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index d2c7629..4a90049 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -133,13 +133,13 @@ rte_hash_create(const struct rte_hash_parameters *params)
* except for the first cache
*/
num_key_slots = params->entries + (RTE_MAX_LCORE - 1) *
- LCORE_CACHE_SIZE + 1;
+ (LCORE_CACHE_SIZE - 1) + 1;
else
num_key_slots = params->entries + 1;
snprintf(ring_name, sizeof(ring_name), "HT_%s", params->name);
/* Create ring (Dummy slot index is not enqueued) */
- r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots - 1),
+ r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots),
params->socket_id, 0);
if (r == NULL) {
RTE_LOG(ERR, HASH, "memory allocation failed\n");
@@ -293,7 +293,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
}
/* Populate free slots ring. Entry zero is reserved for key misses. */
- for (i = 1; i < params->entries + 1; i++)
+ for (i = 1; i < num_key_slots; i++)
rte_ring_sp_enqueue(r, (void *)((uintptr_t) i));
te->data = (void *) h;
@@ -412,7 +412,7 @@ void
rte_hash_reset(struct rte_hash *h)
{
void *ptr;
- unsigned i;
+ uint32_t tot_ring_cnt, i;
if (h == NULL)
return;
@@ -426,7 +426,13 @@ rte_hash_reset(struct rte_hash *h)
rte_pause();
/* Repopulate the free slots ring. Entry zero is reserved for key misses */
- for (i = 1; i < h->entries + 1; i++)
+ if (h->multi_writer_support)
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ else
+ tot_ring_cnt = h->entries;
+
+ for (i = 1; i < tot_ring_cnt + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
if (h->multi_writer_support) {
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v2 6/6] hash: add new API function to query the key count
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library Yipeng Wang
` (4 preceding siblings ...)
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 5/6] hash: fix to have more accurate key slot size Yipeng Wang
@ 2018-06-29 12:24 ` Yipeng Wang
2018-07-06 17:36 ` De Lara Guarch, Pablo
5 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-06-29 12:24 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
Add a new function, rte_hash_count, to return the number of keys that
are currently stored in the hash table. Corresponding test functions are
added into hash_test and hash_multiwriter test.
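Typical usage, for an existing table handle (a sketch, not part of
the patch):

	int32_t cnt = rte_hash_count(handle);
	if (cnt < 0)
		printf("invalid hash table\n");
	else
		printf("%d keys currently in the table\n", cnt);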
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 24 ++++++++++++++++++++++++
lib/librte_hash/rte_hash.h | 11 +++++++++++
lib/librte_hash/rte_hash_version.map | 8 ++++++++
test/test/test_hash.c | 12 ++++++++++++
test/test/test_hash_multiwriter.c | 9 +++++++++
5 files changed, 64 insertions(+)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 4a90049..3f3d023 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -370,6 +370,30 @@ rte_hash_secondary_hash(const hash_sig_t primary_hash)
return primary_hash ^ ((tag + 1) * alt_bits_xor);
}
+int32_t
+rte_hash_count(struct rte_hash *h)
+{
+ uint32_t tot_ring_cnt, cached_cnt = 0;
+ uint32_t i, ret;
+
+ if (h == NULL)
+ return -EINVAL;
+
+ if (h->multi_writer_support) {
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ for (i = 0; i < RTE_MAX_LCORE; i++)
+ cached_cnt += h->local_free_slots[i].len;
+
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots) -
+ cached_cnt;
+ } else {
+ tot_ring_cnt = h->entries;
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots);
+ }
+ return ret;
+}
+
/* Read write locks implemented using rte_rwlock */
static inline void
__hash_rw_writer_lock(const struct rte_hash *h)
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index ecb49e4..1e4ba35 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -127,6 +127,17 @@ void
rte_hash_reset(struct rte_hash *h);
/**
+ * Return the number of keys in the hash table
+ * @param h
+ * Hash table to query from
+ * @return
+ * - -EINVAL if parameters are invalid
+ * - A value indicating how many keys were inserted in the table.
+ */
+int32_t
+rte_hash_count(struct rte_hash *h);
+
+/**
* Add a key-value pair to an existing hash table.
* This operation is not multi-thread safe
* and should only be called from one thread.
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index 52a2576..e216ac8 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -45,3 +45,11 @@ DPDK_16.07 {
rte_hash_get_key_with_position;
} DPDK_2.2;
+
+
+DPDK_18.08 {
+ global:
+
+ rte_hash_count;
+
+} DPDK_16.07;
diff --git a/test/test/test_hash.c b/test/test/test_hash.c
index edf41f5..b3db9fd 100644
--- a/test/test/test_hash.c
+++ b/test/test/test_hash.c
@@ -1103,6 +1103,7 @@ static int test_average_table_utilization(void)
unsigned i, j;
unsigned added_keys, average_keys_added = 0;
int ret;
+ unsigned int cnt;
printf("\n# Running test to determine average utilization"
"\n before adding elements begins to fail\n");
@@ -1121,13 +1122,24 @@ static int test_average_table_utilization(void)
for (i = 0; i < ut_params.key_len; i++)
simple_key[i] = rte_rand() % 255;
ret = rte_hash_add_key(handle, simple_key);
+ if (ret < 0)
+ break;
}
+
if (ret != -ENOSPC) {
printf("Unexpected error when adding keys\n");
rte_hash_free(handle);
return -1;
}
+ cnt = rte_hash_count(handle);
+ if (cnt != added_keys) {
+ printf("rte_hash_count returned wrong value %u, %u,"
+ "%u\n", j, added_keys, cnt);
+ rte_hash_free(handle);
+ return -1;
+ }
+
average_keys_added += added_keys;
/* Reset the table */
diff --git a/test/test/test_hash_multiwriter.c b/test/test/test_hash_multiwriter.c
index ef5fce3..ae3ce3b 100644
--- a/test/test/test_hash_multiwriter.c
+++ b/test/test/test_hash_multiwriter.c
@@ -116,6 +116,7 @@ test_hash_multiwriter(void)
uint32_t duplicated_keys = 0;
uint32_t lost_keys = 0;
+ uint32_t count;
snprintf(name, 32, "test%u", calledCount++);
hash_params.name = name;
@@ -163,6 +164,14 @@ test_hash_multiwriter(void)
NULL, CALL_MASTER);
rte_eal_mp_wait_lcore();
+
+ count = rte_hash_count(handle);
+ if (count != rounded_nb_total_tsx_insertion) {
+ printf("rte_hash_count returned wrong value %u, %d\n",
+ rounded_nb_total_tsx_insertion, count);
+ goto err3;
+ }
+
while (rte_hash_iterate(handle, &next_key, &next_data, &iter) >= 0) {
/* Search for the key in the list of keys added. */
i = *(const uint32_t *)next_key;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/6] hash: make duplicated code into functions
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 1/6] hash: make duplicated code into functions Yipeng Wang
@ 2018-07-06 10:04 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-06 10:04 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
Hi Yipeng,
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, June 29, 2018 1:25 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v2 1/6] hash: make duplicated code into functions
>
> This commit refactors the hash table lookup/add/del code to remove some code
> duplication. Processing on primary bucket can also apply to secondary bucket
> with same code.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> ---
> lib/librte_hash/rte_cuckoo_hash.c | 186 ++++++++++++++++++--------------------
...
> +/* Search one bucket to find the match key */
> static inline int32_t
> -__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> - hash_sig_t sig, void **data)
> +search_one_bucket(const struct rte_hash *h, const void *key, hash_sig_t sig,
> + void **data, struct rte_hash_bucket *bkt)
Use "const" in "struct rte_hash_bucket".
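I.e. something like:

	search_one_bucket(const struct rte_hash *h, const void *key,
			hash_sig_t sig, void **data,
			const struct rte_hash_bucket *bkt)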
...
> +search_and_remove(const struct rte_hash *h, const void *key,
> + struct rte_hash_bucket *bkt, hash_sig_t sig,
> + int32_t *ret_val)
> {
> - uint32_t bucket_idx;
> - hash_sig_t alt_hash;
> - unsigned i;
> - struct rte_hash_bucket *bkt;
> struct rte_hash_key *k, *keys = h->key_store;
> - int32_t ret;
> -
> - bucket_idx = sig & h->bucket_bitmask;
> - bkt = &h->buckets[bucket_idx];
> + unsigned int i;
>
> /* Check if key is in primary location */
> for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { @@ -833,37 +825,39
> @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> * Return index where key is stored,
> * subtracting the first dummy index
> */
> - ret = bkt->key_idx[i] - 1;
> + *ret_val = bkt->key_idx[i] - 1;
> bkt->key_idx[i] = EMPTY_SLOT;
> - return ret;
> + return 0;
You can store ret_val and return it, instead of returning 0,
so the function is similar to the other search functions.
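I.e. something like:

	*ret_val = bkt->key_idx[i] - 1;
	bkt->key_idx[i] = EMPTY_SLOT;
	return *ret_val;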
> }
> }
> }
> + return -1;
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v2 2/6] hash: add read and write concurrency support
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 2/6] hash: add read and write concurrency support Yipeng Wang
@ 2018-07-06 17:11 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-06 17:11 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, June 29, 2018 1:25 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v2 2/6] hash: add read and write concurrency support
>
> The existing implementation of librte_hash does not support read-write
> concurrency. This commit implements read-write safety using rte_rwlock and
> rte_rwlock TM version if hardware transactional memory is available.
>
> Both multi-writer and read-write concurrency is protected by rte_rwlock now.
> The x86 specific header file is removed since the x86 specific RTM function is not
> called directly by rte hash now.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> ---
> lib/librte_hash/meson.build | 1 -
> lib/librte_hash/rte_cuckoo_hash.c | 507 ++++++++++++++++++++++------------
> lib/librte_hash/rte_cuckoo_hash.h | 18 +-
> lib/librte_hash/rte_cuckoo_hash_x86.h | 164 -----------
> lib/librte_hash/rte_hash.h | 3 +
> 5 files changed, 338 insertions(+), 355 deletions(-) delete mode 100644
> lib/librte_hash/rte_cuckoo_hash_x86.h
>
...
> --- a/lib/librte_hash/rte_cuckoo_hash.c
> +++ b/lib/librte_hash/rte_cuckoo_hash.c
> @@ -31,9 +31,6 @@
...
> + if (h->multi_writer_support) {
> + h->readwrite_lock = rte_malloc(NULL, sizeof(rte_rwlock_t),
> LCORE_CACHE_SIZE);
I think LCORE_CACHE_SIZE should be RTE_CACHE_LINE_SIZE (same value, but different meaning).
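I.e.:

	h->readwrite_lock = rte_malloc(NULL, sizeof(rte_rwlock_t),
					RTE_CACHE_LINE_SIZE);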
> - rte_spinlock_init(h->multiwriter_lock);
> - }
...
> +
> + /* Didnt' find a match, so get a new slot for storing the new key */
Typo: Didn't
> + if (h->multi_writer_support) {
> lcore_id = rte_lcore_id();
> cached_free_slots = &h->local_free_slots[lcore_id];
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v2 3/6] test: add tests in hash table perf test
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 3/6] test: add tests in hash table perf test Yipeng Wang
@ 2018-07-06 17:17 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-06 17:17 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, June 29, 2018 1:25 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v2 3/6] test: add tests in hash table perf test
>
> New code is added to support read-write concurrency for rte_hash. Due to the
> newly added code in critial path, the perf test is modified to show any
> performance impact.
> It is still a single-thread test.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v2 4/6] test: add test case for read write concurrency
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 4/6] test: add test case for read write concurrency Yipeng Wang
@ 2018-07-06 17:31 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-06 17:31 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, June 29, 2018 1:25 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v2 4/6] test: add test case for read write concurrency
>
> This commits add a new test case for testing read/write concurrency.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> ---
> test/test/Makefile | 1 +
> test/test/test_hash_readwrite.c | 645
> ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 646 insertions(+)
> create mode 100644 test/test/test_hash_readwrite.c
>
> diff --git a/test/test/Makefile b/test/test/Makefile index eccc8ef..6ce66c9
> 100644
> --- a/test/test/Makefile
> +++ b/test/test/Makefile
> @@ -113,6 +113,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) +=
> test_hash_perf.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite.c
>
> SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
> SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c diff --git
> a/test/test/test_hash_readwrite.c b/test/test/test_hash_readwrite.c new file
> mode 100644 index 0000000..db2ded5
> --- /dev/null
> +++ b/test/test/test_hash_readwrite.c
> @@ -0,0 +1,645 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Intel Corporation
> + */
> +
> +#include <inttypes.h>
> +#include <locale.h>
> +
> +#include <rte_cycles.h>
> +#include <rte_hash.h>
> +#include <rte_hash_crc.h>
> +#include <rte_launch.h>
> +#include <rte_malloc.h>
> +#include <rte_random.h>
> +#include <rte_spinlock.h>
> +
> +#include "test.h"
> +
> +
> +#define RTE_RWTEST_FAIL 0
> +
> +#define TOTAL_ENTRY (16*1024*1024)
> +#define TOTAL_INSERT (15*1024*1024)
> +
> +#define NUM_TEST 3
> +unsigned int core_cnt[NUM_TEST] = {2, 4, 8};
> +
> +
General comment. Remove extra blank lines (one is enough).
...
> + while (rte_hash_iterate(tbl_rw_test_param.h, &next_key,
> + &next_data, &iter) >= 0) {
> + /* Search for the key in the list of keys added .*/
> + i = *(const uint32_t *)next_key;
> + tbl_rw_test_param.found[i]++;
> + }
> +
> + for (i = 0;
> + i < tbl_rw_test_param.rounded_tot_insert; i++) {
This can go in a single line.
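I.e.:

	for (i = 0; i < tbl_rw_test_param.rounded_tot_insert; i++) {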
> + if (tbl_rw_test_param.keys[i] != RTE_RWTEST_FAIL) {
> + if (tbl_rw_test_param.found[i] > 1) {
> + duplicated_keys++;
> + break;
...
> +
> + for (i = 0; i <
> + tbl_rw_test_param.rounded_tot_insert; i++) {
This can go in a single line.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v2 5/6] hash: fix to have more accurate key slot size
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 5/6] hash: fix to have more accurate key slot size Yipeng Wang
@ 2018-07-06 17:32 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-06 17:32 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, June 29, 2018 1:25 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v2 5/6] hash: fix to have more accurate key slot size
>
> This commit calculates the needed key slot size more accurately. The previous
> local cache fix requires the free slot ring to be larger than actually needed.
> The calculation of the value is inaccurate.
>
> Fixes: 5915699153d7 ("hash: fix scaling by reducing contention")
Missing "Cc: stable@dpdk.org".
Also, could you add this as the first patch of the set?
That way, it would be easier to backport (the patch will need modification, otherwise it won't compile).
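I.e. the tags would become:

	Fixes: 5915699153d7 ("hash: fix scaling by reducing contention")
	Cc: stable@dpdk.org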
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v2 6/6] hash: add new API function to query the key count
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 6/6] hash: add new API function to query the key count Yipeng Wang
@ 2018-07-06 17:36 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-06 17:36 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, June 29, 2018 1:25 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v2 6/6] hash: add new API function to query the key count
>
> Add a new function, rte_hash_count, to return the number of keys that are
> currently stored in the hash table. Corresponding test functions are added into
> hash_test and hash_multiwriter test.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> ---
> lib/librte_hash/rte_cuckoo_hash.c | 24 ++++++++++++++++++++++++
> lib/librte_hash/rte_hash.h | 11 +++++++++++
> lib/librte_hash/rte_hash_version.map | 8 ++++++++
> test/test/test_hash.c | 12 ++++++++++++
> test/test/test_hash_multiwriter.c | 9 +++++++++
> 5 files changed, 64 insertions(+)
>
> diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> b/lib/librte_hash/rte_cuckoo_hash.c
> index 4a90049..3f3d023 100644
> --- a/lib/librte_hash/rte_cuckoo_hash.c
> +++ b/lib/librte_hash/rte_cuckoo_hash.c
> @@ -370,6 +370,30 @@ rte_hash_secondary_hash(const hash_sig_t
> primary_hash)
> return primary_hash ^ ((tag + 1) * alt_bits_xor); }
>
> +int32_t
> +rte_hash_count(struct rte_hash *h)
Parameter should be "const".
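I.e.:

	int32_t
	rte_hash_count(const struct rte_hash *h);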
For the rest of the patch:
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library
2018-06-08 10:51 [dpdk-dev] [PATCH v1 0/3] Add read-write concurrency to rte_hash library Yipeng Wang
` (3 preceding siblings ...)
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library Yipeng Wang
@ 2018-07-06 19:46 ` Yipeng Wang
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
` (7 more replies)
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
6 siblings, 8 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-06 19:46 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This patch set adds the read-write concurrency support in rte_hash.
A new flag value is added to indicate if read-write concurrency is needed
during creation time. Test cases are implemented to do functional and
performance tests.
The new concurrency model is based on rte_rwlock. When Intel TSX is
available and the users indicate to use it, the TM version of the
rte_rwlock will be called. Both multi-writer and read-write concurrency
are protected by the rte_rwlock instead of the x86 specific RTM
instructions, so the x86 specific header rte_cuckoo_hash_x86.h is removed
and the code is infused into the main .c file.
A new rte_hash_count API is proposed to count how many keys are inserted
into the hash table.
v2->v3:
1. hash: Concurrency bug fix: after beginning the cuckoo path move,
the last empty slot needs to be verified again in case other writers
raced into this slot and occupied it. A new commit is added to do this
bug fix since it applies to master head as well.
2. hash: Concurrency bug fix: if the cuckoo path is detected to be
invalid, the current slot needs to be emptied since it is duplicated
to its target bucket.
3. hash: "const" is used for types in multiple locations. (Pablo)
4. hash: the rte_malloc call for the read-write lock used the wrong
align argument. A similar fix applies to master head so a new commit
is created. (Pablo)
5. hash: ring size calculation fix is moved to front. (Pablo)
6. hash: search-and-remove function is refactored to be more aligned
with the other search functions. (Pablo)
7. test: use jhash in the functional test for read-write concurrency,
because jhash with sequential keys incurs more cuckoo path moves.
8. Multiple coding style, typo, and commit message fixes. (Pablo)
v1->v2:
1. Split each commit into two commits for easier review (Pablo).
2. Add more comments in various places (Pablo).
3. hash: In the key insertion function, move duplicated key checking to
an earlier location and protect it using locks. Checking for duplicated
keys should happen first and data updates should be protected.
4. hash: In the lookup bulk function, put the signature comparison in
the lock, since writes could happen between the signature matches on
the two buckets.
5. hash: Add write locks to the reset function as well to protect resets.
6. test: Fix 32-bit compilation error in read-write test (Pablo).
7. test: Check total physical core count in read-write test. Don't
test with a thread count larger than the physical core count.
8. Other minor fixes such as typos (Pablo).
Yipeng Wang (8):
hash: fix multiwriter lock memory allocation
hash: fix a multi-writer bug
hash: fix to have more accurate key slot size
hash: make duplicated code into functions
hash: add read and write concurrency support
test: add tests in hash table perf test
test: add test case for read write concurrency
hash: add new API function to query the key count
lib/librte_hash/meson.build | 1 -
lib/librte_hash/rte_cuckoo_hash.c | 705 +++++++++++++++++++++-------------
lib/librte_hash/rte_cuckoo_hash.h | 18 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 164 --------
lib/librte_hash/rte_hash.h | 14 +
lib/librte_hash/rte_hash_version.map | 8 +
test/test/Makefile | 1 +
test/test/test_hash.c | 12 +
test/test/test_hash_multiwriter.c | 9 +
test/test/test_hash_perf.c | 36 +-
test/test/test_hash_readwrite.c | 646 +++++++++++++++++++++++++++++++
11 files changed, 1167 insertions(+), 447 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
create mode 100644 test/test/test_hash_readwrite.c
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v3 1/8] hash: fix multiwriter lock memory allocation
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
@ 2018-07-06 19:46 ` Yipeng Wang
2018-07-09 11:26 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 2/8] hash: fix a multi-writer bug Yipeng Wang
` (6 subsequent siblings)
7 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-07-06 19:46 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
When allocating the multiwriter_lock with rte_malloc, the align
argument should be RTE_CACHE_LINE_SIZE rather than LCORE_CACHE_SIZE.
Also, there should be a check to verify that the rte_malloc call
succeeded.
Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
Cc: stable@dpdk.org
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index a07543a..80dcf41 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -281,7 +281,10 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->add_key = ADD_KEY_MULTIWRITER;
h->multiwriter_lock = rte_malloc(NULL,
sizeof(rte_spinlock_t),
- LCORE_CACHE_SIZE);
+ RTE_CACHE_LINE_SIZE);
+ if (h->multiwriter_lock == NULL)
+ goto err_unlock;
+
rte_spinlock_init(h->multiwriter_lock);
}
} else
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v3 2/8] hash: fix a multi-writer bug
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
@ 2018-07-06 19:46 ` Yipeng Wang
2018-07-09 14:16 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 3/8] hash: fix to have more accurate key slot size Yipeng Wang
` (5 subsequent siblings)
7 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-07-06 19:46 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
The current multi-writer implementation uses Intel TSX to
protect the cuckoo path move but not the cuckoo path search.
After the search, we need to verify again, at the beginning of the
TSX region, that the empty slot found still exists; otherwise
another writer could have occupied it before the TSX region was
entered. The current code does not do this verification.
Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
Cc: stable@dpdk.org
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash_x86.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_hash/rte_cuckoo_hash_x86.h b/lib/librte_hash/rte_cuckoo_hash_x86.h
index 2c5b017..981d7bd 100644
--- a/lib/librte_hash/rte_cuckoo_hash_x86.h
+++ b/lib/librte_hash/rte_cuckoo_hash_x86.h
@@ -66,6 +66,9 @@ rte_hash_cuckoo_move_insert_mw_tm(const struct rte_hash *h,
while (try < RTE_HASH_TSX_MAX_RETRY) {
status = rte_xbegin();
if (likely(status == RTE_XBEGIN_STARTED)) {
+ /* In case empty slot was gone before entering TSX */
+ if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT)
+ rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
while (likely(curr_node->prev != NULL)) {
prev_node = curr_node->prev;
prev_bkt = prev_node->bkt;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v3 3/8] hash: fix to have more accurate key slot size
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 2/8] hash: fix a multi-writer bug Yipeng Wang
@ 2018-07-06 19:46 ` Yipeng Wang
2018-07-09 14:20 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 4/8] hash: make duplicated code into functions Yipeng Wang
` (4 subsequent siblings)
7 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-07-06 19:46 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit calculates the needed key slot size more
accurately. The previous local cache fix requires
the free slot ring to be larger than actually needed,
and the calculation of that value was inaccurate.
Fixes: 5915699153d7 ("hash: fix scaling by reducing contention")
Cc: stable@dpdk.org
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 80dcf41..11602af 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -126,13 +126,13 @@ rte_hash_create(const struct rte_hash_parameters *params)
* except for the first cache
*/
num_key_slots = params->entries + (RTE_MAX_LCORE - 1) *
- LCORE_CACHE_SIZE + 1;
+ (LCORE_CACHE_SIZE - 1) + 1;
else
num_key_slots = params->entries + 1;
snprintf(ring_name, sizeof(ring_name), "HT_%s", params->name);
/* Create ring (Dummy slot index is not enqueued) */
- r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots - 1),
+ r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots),
params->socket_id, 0);
if (r == NULL) {
RTE_LOG(ERR, HASH, "memory allocation failed\n");
@@ -291,7 +291,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->add_key = ADD_KEY_SINGLEWRITER;
/* Populate free slots ring. Entry zero is reserved for key misses. */
- for (i = 1; i < params->entries + 1; i++)
+ for (i = 1; i < num_key_slots; i++)
rte_ring_sp_enqueue(r, (void *)((uintptr_t) i));
te->data = (void *) h;
@@ -373,7 +373,7 @@ void
rte_hash_reset(struct rte_hash *h)
{
void *ptr;
- unsigned i;
+ uint32_t tot_ring_cnt, i;
if (h == NULL)
return;
@@ -386,7 +386,13 @@ rte_hash_reset(struct rte_hash *h)
rte_pause();
/* Repopulate the free slots ring. Entry zero is reserved for key misses */
- for (i = 1; i < h->entries + 1; i++)
+ if (h->hw_trans_mem_support)
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ else
+ tot_ring_cnt = h->entries;
+
+ for (i = 1; i < tot_ring_cnt + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
if (h->hw_trans_mem_support) {
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v3 4/8] hash: make duplicated code into functions
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (2 preceding siblings ...)
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 3/8] hash: fix to have more accurate key slot size Yipeng Wang
@ 2018-07-06 19:46 ` Yipeng Wang
2018-07-09 14:25 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 5/8] hash: add read and write concurrency support Yipeng Wang
` (3 subsequent siblings)
7 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-07-06 19:46 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit refactors the hash table lookup/add/del code
to remove some code duplication. Processing on primary bucket can
also apply to secondary bucket with same code.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 186 +++++++++++++++++++-------------------
1 file changed, 91 insertions(+), 95 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 11602af..109da92 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -485,6 +485,33 @@ enqueue_slot_back(const struct rte_hash *h,
rte_ring_sp_enqueue(h->free_slots, slot_id);
}
+/* Search a key from bucket and update its data */
+static inline int32_t
+search_and_update(const struct rte_hash *h, void *data, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig, hash_sig_t alt_hash)
+{
+ int i;
+ struct rte_hash_key *k, *keys = h->key_store;
+
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (bkt->sig_current[i] == sig &&
+ bkt->sig_alt[i] == alt_hash) {
+ k = (struct rte_hash_key *) ((char *)keys +
+ bkt->key_idx[i] * h->key_entry_size);
+ if (rte_hash_cmp_eq(key, k->key, h) == 0) {
+ /* Update data */
+ k->pdata = data;
+ /*
+ * Return index where key is stored,
+ * subtracting the first dummy index
+ */
+ return bkt->key_idx[i] - 1;
+ }
+ }
+ }
+ return -1;
+}
+
static inline int32_t
__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
hash_sig_t sig, void *data)
@@ -493,7 +520,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
uint32_t prim_bucket_idx, sec_bucket_idx;
unsigned i;
struct rte_hash_bucket *prim_bkt, *sec_bkt;
- struct rte_hash_key *new_k, *k, *keys = h->key_store;
+ struct rte_hash_key *new_k, *keys = h->key_store;
void *slot_id = NULL;
uint32_t new_idx;
int ret;
@@ -547,46 +574,14 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
new_idx = (uint32_t)((uintptr_t) slot_id);
/* Check if key is already inserted in primary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (prim_bkt->sig_current[i] == sig &&
- prim_bkt->sig_alt[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- prim_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = prim_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
- }
+ ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
+ if (ret != -1)
+ goto failure;
/* Check if key is already inserted in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (sec_bkt->sig_alt[i] == sig &&
- sec_bkt->sig_current[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- sec_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = sec_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
- }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1)
+ goto failure;
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
@@ -699,20 +694,15 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data)
else
return ret;
}
+
+/* Search one bucket to find the match key */
static inline int32_t
-__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig, void **data)
+search_one_bucket(const struct rte_hash *h, const void *key, hash_sig_t sig,
+ void **data, const struct rte_hash_bucket *bkt)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
+ int i;
struct rte_hash_key *k, *keys = h->key_store;
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
-
- /* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
if (bkt->sig_current[i] == sig &&
bkt->key_idx[i] != EMPTY_SLOT) {
@@ -729,6 +719,26 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig, void **data)
+{
+ uint32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int ret;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+
+ /* Check if key is in primary location */
+ ret = search_one_bucket(h, key, sig, data, bkt);
+ if (ret != -1)
+ return ret;
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
@@ -736,22 +746,9 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
bkt = &h->buckets[bucket_idx];
/* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->sig_alt[i] == sig) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- if (data != NULL)
- *data = k->pdata;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- return bkt->key_idx[i] - 1;
- }
- }
- }
+ ret = search_one_bucket(h, key, alt_hash, data, bkt);
+ if (ret != -1)
+ return ret;
return -ENOENT;
}
@@ -815,20 +812,15 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
}
}
+/* Search one bucket and remove the matched key */
static inline int32_t
-__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig)
+search_and_remove(const struct rte_hash *h, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
struct rte_hash_key *k, *keys = h->key_store;
+ unsigned int i;
int32_t ret;
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
-
/* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
if (bkt->sig_current[i] == sig &&
@@ -838,41 +830,45 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
if (rte_hash_cmp_eq(key, k->key, h) == 0) {
remove_entry(h, bkt, i);
+ ret = bkt->key_idx[i] - 1;
+ bkt->key_idx[i] = EMPTY_SLOT;
/*
* Return index where key is stored,
* subtracting the first dummy index
*/
- ret = bkt->key_idx[i] - 1;
- bkt->key_idx[i] = EMPTY_SLOT;
return ret;
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig)
+{
+ uint32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int32_t ret;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+ /* look for key in primary bucket */
+ ret = search_and_remove(h, key, bkt, sig);
+ if (ret != -1)
+ return ret;
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
- /* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->key_idx[i] != EMPTY_SLOT) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- remove_entry(h, bkt, i);
-
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = bkt->key_idx[i] - 1;
- bkt->key_idx[i] = EMPTY_SLOT;
- return ret;
- }
- }
- }
+ /* look for key in secondary bucket */
+ ret = search_and_remove(h, key, bkt, alt_hash);
+ if (ret != -1)
+ return ret;
return -ENOENT;
}
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v3 5/8] hash: add read and write concurrency support
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (3 preceding siblings ...)
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 4/8] hash: make duplicated code into functions Yipeng Wang
@ 2018-07-06 19:46 ` Yipeng Wang
2018-07-09 14:28 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 6/8] test: add tests in hash table perf test Yipeng Wang
` (2 subsequent siblings)
7 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-07-06 19:46 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
The existing implementation of librte_hash does not support read-write
concurrency. This commit implements read-write safety using rte_rwlock,
and the TM version of rte_rwlock when hardware transactional memory is
available.
Both multi-writer and read-write concurrency are now protected by
rte_rwlock. The x86-specific header file is removed since the
x86-specific RTM functions are no longer called directly by rte_hash.
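As an illustration of the new mode, here is a minimal sketch (not part of
the patch; the table name, size, and key type are placeholder choices): an
application requests read-write concurrency at creation time, optionally
OR-ing in the transactional-memory flag on TSX-capable CPUs to select the
TM variants of the lock.
#include <stdint.h>
#include <rte_hash.h>
#include <rte_jhash.h>
#include <rte_lcore.h>
static struct rte_hash *
create_rw_safe_table(void)
{
	struct rte_hash_parameters params = {
		.name = "rw_example",		/* placeholder name */
		.entries = 1024,		/* placeholder size */
		.key_len = sizeof(uint32_t),
		.hash_func = rte_jhash,
		.socket_id = rte_socket_id(),
		/* Add RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT on TSX-capable
		 * CPUs to use the TM variants of the rwlock.
		 */
		.extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY,
	};
	return rte_hash_create(&params);	/* NULL on failure */
}
With the flag set, concurrent calls such as rte_hash_lookup() and
rte_hash_add_key() are serialized internally by the reader/writer lock
(or its TM variant) introduced in this patch.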
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/meson.build | 1 -
lib/librte_hash/rte_cuckoo_hash.c | 520 ++++++++++++++++++++++------------
lib/librte_hash/rte_cuckoo_hash.h | 18 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 167 -----------
lib/librte_hash/rte_hash.h | 3 +
5 files changed, 348 insertions(+), 361 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
index e139e1d..efc06ed 100644
--- a/lib/librte_hash/meson.build
+++ b/lib/librte_hash/meson.build
@@ -6,7 +6,6 @@ headers = files('rte_cmp_arm64.h',
'rte_cmp_x86.h',
'rte_crc_arm64.h',
'rte_cuckoo_hash.h',
- 'rte_cuckoo_hash_x86.h',
'rte_fbk_hash.h',
'rte_hash_crc.h',
'rte_hash.h',
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 109da92..032e213 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -31,9 +31,6 @@
#include "rte_hash.h"
#include "rte_cuckoo_hash.h"
-#if defined(RTE_ARCH_X86)
-#include "rte_cuckoo_hash_x86.h"
-#endif
TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
@@ -93,8 +90,10 @@ rte_hash_create(const struct rte_hash_parameters *params)
void *buckets = NULL;
char ring_name[RTE_RING_NAMESIZE];
unsigned num_key_slots;
- unsigned hw_trans_mem_support = 0;
unsigned i;
+ unsigned int hw_trans_mem_support = 0, multi_writer_support = 0;
+ unsigned int readwrite_concur_support = 0;
+
rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
hash_list = RTE_TAILQ_CAST(rte_hash_tailq.head, rte_hash_list);
@@ -118,8 +117,16 @@ rte_hash_create(const struct rte_hash_parameters *params)
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
hw_trans_mem_support = 1;
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD)
+ multi_writer_support = 1;
+
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY) {
+ readwrite_concur_support = 1;
+ multi_writer_support = 1;
+ }
+
/* Store all keys and leave the first entry as a dummy entry for lookup_bulk */
- if (hw_trans_mem_support)
+ if (multi_writer_support)
/*
* Increase number of slots by total number of indices
* that can be stored in the lcore caches
@@ -233,7 +240,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->cmp_jump_table_idx = KEY_OTHER_BYTES;
#endif
- if (hw_trans_mem_support) {
+ if (multi_writer_support) {
h->local_free_slots = rte_zmalloc_socket(NULL,
sizeof(struct lcore_cache) * RTE_MAX_LCORE,
RTE_CACHE_LINE_SIZE, params->socket_id);
@@ -261,6 +268,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->key_store = k;
h->free_slots = r;
h->hw_trans_mem_support = hw_trans_mem_support;
+ h->multi_writer_support = multi_writer_support;
+ h->readwrite_concur_support = readwrite_concur_support;
#if defined(RTE_ARCH_X86)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
@@ -271,24 +280,17 @@ rte_hash_create(const struct rte_hash_parameters *params)
#endif
h->sig_cmp_fn = RTE_HASH_COMPARE_SCALAR;
- /* Turn on multi-writer only with explicit flat from user and TM
+ /* Turn on multi-writer only with explicit flag from user and TM
* support.
*/
- if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD) {
- if (h->hw_trans_mem_support) {
- h->add_key = ADD_KEY_MULTIWRITER_TM;
- } else {
- h->add_key = ADD_KEY_MULTIWRITER;
- h->multiwriter_lock = rte_malloc(NULL,
- sizeof(rte_spinlock_t),
- RTE_CACHE_LINE_SIZE);
- if (h->multiwriter_lock == NULL)
- goto err_unlock;
-
- rte_spinlock_init(h->multiwriter_lock);
- }
- } else
- h->add_key = ADD_KEY_SINGLEWRITER;
+ if (h->multi_writer_support) {
+ h->readwrite_lock = rte_malloc(NULL, sizeof(rte_rwlock_t),
+ RTE_CACHE_LINE_SIZE);
+ if (h->readwrite_lock == NULL)
+ goto err_unlock;
+
+ rte_rwlock_init(h->readwrite_lock);
+ }
/* Populate free slots ring. Entry zero is reserved for key misses. */
for (i = 1; i < num_key_slots; i++)
@@ -338,11 +340,10 @@ rte_hash_free(struct rte_hash *h)
rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
- if (h->hw_trans_mem_support)
+ if (h->multi_writer_support) {
rte_free(h->local_free_slots);
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_free(h->multiwriter_lock);
+ rte_free(h->readwrite_lock);
+ }
rte_ring_free(h->free_slots);
rte_free(h->key_store);
rte_free(h->buckets);
@@ -369,6 +370,44 @@ rte_hash_secondary_hash(const hash_sig_t primary_hash)
return primary_hash ^ ((tag + 1) * alt_bits_xor);
}
+/* Read write locks implemented using rte_rwlock */
+static inline void
+__hash_rw_writer_lock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_lock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_lock(h->readwrite_lock);
+}
+
+
+static inline void
+__hash_rw_reader_lock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_lock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_lock(h->readwrite_lock);
+}
+
+static inline void
+__hash_rw_writer_unlock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_unlock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_unlock(h->readwrite_lock);
+}
+
+static inline void
+__hash_rw_reader_unlock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_unlock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_unlock(h->readwrite_lock);
+}
+
void
rte_hash_reset(struct rte_hash *h)
{
@@ -378,6 +417,7 @@ rte_hash_reset(struct rte_hash *h)
if (h == NULL)
return;
+ __hash_rw_writer_lock(h);
memset(h->buckets, 0, h->num_buckets * sizeof(struct rte_hash_bucket));
memset(h->key_store, 0, h->key_entry_size * (h->entries + 1));
@@ -386,7 +426,7 @@ rte_hash_reset(struct rte_hash *h)
rte_pause();
/* Repopulate the free slots ring. Entry zero is reserved for key misses */
- if (h->hw_trans_mem_support)
+ if (h->multi_writer_support)
tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
(LCORE_CACHE_SIZE - 1);
else
@@ -395,77 +435,12 @@ rte_hash_reset(struct rte_hash *h)
for (i = 1; i < tot_ring_cnt + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
/* Reset local caches per lcore */
for (i = 0; i < RTE_MAX_LCORE; i++)
h->local_free_slots[i].len = 0;
}
-}
-
-/* Search for an entry that can be pushed to its alternative location */
-static inline int
-make_space_bucket(const struct rte_hash *h, struct rte_hash_bucket *bkt,
- unsigned int *nr_pushes)
-{
- unsigned i, j;
- int ret;
- uint32_t next_bucket_idx;
- struct rte_hash_bucket *next_bkt[RTE_HASH_BUCKET_ENTRIES];
-
- /*
- * Push existing item (search for bucket with space in
- * alternative locations) to its alternative location
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Search for space in alternative locations */
- next_bucket_idx = bkt->sig_alt[i] & h->bucket_bitmask;
- next_bkt[i] = &h->buckets[next_bucket_idx];
- for (j = 0; j < RTE_HASH_BUCKET_ENTRIES; j++) {
- if (next_bkt[i]->key_idx[j] == EMPTY_SLOT)
- break;
- }
-
- if (j != RTE_HASH_BUCKET_ENTRIES)
- break;
- }
-
- /* Alternative location has spare room (end of recursive function) */
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- next_bkt[i]->sig_alt[j] = bkt->sig_current[i];
- next_bkt[i]->sig_current[j] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[j] = bkt->key_idx[i];
- return i;
- }
-
- /* Pick entry that has not been pushed yet */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++)
- if (bkt->flag[i] == 0)
- break;
-
- /* All entries have been pushed, so entry cannot be added */
- if (i == RTE_HASH_BUCKET_ENTRIES || ++(*nr_pushes) > RTE_HASH_MAX_PUSHES)
- return -ENOSPC;
-
- /* Set flag to indicate that this entry is going to be pushed */
- bkt->flag[i] = 1;
-
- /* Need room in alternative bucket to insert the pushed entry */
- ret = make_space_bucket(h, next_bkt[i], nr_pushes);
- /*
- * After recursive function.
- * Clear flags and insert the pushed entry
- * in its alternative location if successful,
- * or return error
- */
- bkt->flag[i] = 0;
- if (ret >= 0) {
- next_bkt[i]->sig_alt[ret] = bkt->sig_current[i];
- next_bkt[i]->sig_current[ret] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[ret] = bkt->key_idx[i];
- return i;
- } else
- return ret;
-
+ __hash_rw_writer_unlock(h);
}
/*
@@ -478,7 +453,7 @@ enqueue_slot_back(const struct rte_hash *h,
struct lcore_cache *cached_free_slots,
void *slot_id)
{
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
cached_free_slots->objs[cached_free_slots->len] = slot_id;
cached_free_slots->len++;
} else
@@ -512,13 +487,207 @@ search_and_update(const struct rte_hash *h, void *data, const void *key,
return -1;
}
+/* Only tries to insert at one bucket (@prim_bkt) without trying to push
+ * buckets around.
+ * return 1 if it matches an existing key, return 0 on success, return -1 if
+ * there is no empty entry.
+ */
+static inline int32_t
+rte_hash_cuckoo_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *prim_bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ unsigned int i;
+ struct rte_hash_bucket *cur_bkt = prim_bkt;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+ /* Check if key was inserted after last check but before this
+ * protected region in case of inserting duplicated keys.
+ */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ /* Insert new entry if there is room in the primary
+ * bucket.
+ */
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ /* Check if slot is available */
+ if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
+ prim_bkt->sig_current[i] = sig;
+ prim_bkt->sig_alt[i] = alt_hash;
+ prim_bkt->key_idx[i] = new_idx;
+ break;
+ }
+ }
+ __hash_rw_writer_unlock(h);
+
+ if (i != RTE_HASH_BUCKET_ENTRIES)
+ return 0;
+
+ /* no empty entry */
+ return -1;
+}
+
+/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
+ * the path head with new entry (sig, alt_hash, new_idx)
+ * return 1 if a matching key is found, return -1 if the cuckoo path is
+ * invalidated, return 0 on success.
+ */
+static inline int
+rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *alt_bkt,
+ const struct rte_hash_key *key, void *data,
+ struct queue_node *leaf, uint32_t leaf_slot,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ uint32_t prev_alt_bkt_idx;
+ struct rte_hash_bucket *cur_bkt = bkt;
+ struct queue_node *prev_node, *curr_node = leaf;
+ struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
+ uint32_t prev_slot, curr_slot = leaf_slot;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+
+ /* In case empty slot was gone before entering protected region */
+ if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT) {
+ __hash_rw_writer_unlock(h);
+ return -1;
+ }
+
+ /* Check if key was inserted after last check but before this
+ * protected region.
+ */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ ret = search_and_update(h, data, key, alt_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ while (likely(curr_node->prev != NULL)) {
+ prev_node = curr_node->prev;
+ prev_bkt = prev_node->bkt;
+ prev_slot = curr_node->prev_slot;
+
+ prev_alt_bkt_idx =
+ prev_bkt->sig_alt[prev_slot] & h->bucket_bitmask;
+
+ if (unlikely(&h->buckets[prev_alt_bkt_idx]
+ != curr_bkt)) {
+ /* revert it to empty, otherwise duplicated keys */
+ curr_bkt->key_idx[curr_slot] = EMPTY_SLOT;
+ __hash_rw_writer_unlock(h);
+ return -1;
+ }
+
+ /* Need to swap current/alt sig to allow later
+ * Cuckoo insert to move elements back to its
+ * primary bucket if available
+ */
+ curr_bkt->sig_alt[curr_slot] =
+ prev_bkt->sig_current[prev_slot];
+ curr_bkt->sig_current[curr_slot] =
+ prev_bkt->sig_alt[prev_slot];
+ curr_bkt->key_idx[curr_slot] =
+ prev_bkt->key_idx[prev_slot];
+
+ curr_slot = prev_slot;
+ curr_node = prev_node;
+ curr_bkt = curr_node->bkt;
+ }
+
+ curr_bkt->sig_current[curr_slot] = sig;
+ curr_bkt->sig_alt[curr_slot] = alt_hash;
+ curr_bkt->key_idx[curr_slot] = new_idx;
+
+ __hash_rw_writer_unlock(h);
+
+ return 0;
+
+}
+
+/*
+ * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
+ * Cuckoo
+ */
+static inline int
+rte_hash_cuckoo_make_space_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash,
+ uint32_t new_idx, int32_t *ret_val)
+{
+ unsigned int i;
+ struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
+ struct queue_node *tail, *head;
+ struct rte_hash_bucket *curr_bkt, *alt_bkt;
+
+ tail = queue;
+ head = queue + 1;
+ tail->bkt = bkt;
+ tail->prev = NULL;
+ tail->prev_slot = -1;
+
+ /* Cuckoo bfs Search */
+ while (likely(tail != head && head <
+ queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
+ RTE_HASH_BUCKET_ENTRIES)) {
+ curr_bkt = tail->bkt;
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
+ int32_t ret = rte_hash_cuckoo_move_insert_mw(h,
+ bkt, sec_bkt, key, data,
+ tail, i, sig, alt_hash,
+ new_idx, ret_val);
+ if (likely(ret != -1))
+ return ret;
+ }
+
+ /* Enqueue new node and keep prev node info */
+ alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
+ & h->bucket_bitmask]);
+ head->bkt = alt_bkt;
+ head->prev = tail;
+ head->prev_slot = i;
+ head++;
+ }
+ tail++;
+ }
+
+ return -ENOSPC;
+}
+
static inline int32_t
__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
hash_sig_t sig, void *data)
{
hash_sig_t alt_hash;
uint32_t prim_bucket_idx, sec_bucket_idx;
- unsigned i;
struct rte_hash_bucket *prim_bkt, *sec_bkt;
struct rte_hash_key *new_k, *keys = h->key_store;
void *slot_id = NULL;
@@ -527,10 +696,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
unsigned n_slots;
unsigned lcore_id;
struct lcore_cache *cached_free_slots = NULL;
- unsigned int nr_pushes = 0;
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_lock(h->multiwriter_lock);
+ int32_t ret_val;
prim_bucket_idx = sig & h->bucket_bitmask;
prim_bkt = &h->buckets[prim_bucket_idx];
@@ -541,8 +707,24 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
sec_bkt = &h->buckets[sec_bucket_idx];
rte_prefetch0(sec_bkt);
- /* Get a new slot for storing the new key */
- if (h->hw_trans_mem_support) {
+ /* Check if key is already inserted in primary location */
+ __hash_rw_writer_lock(h);
+ ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ return ret;
+ }
+
+ /* Check if key is already inserted in secondary location */
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ return ret;
+ }
+ __hash_rw_writer_unlock(h);
+
+ /* Did not find a match, so get a new slot for storing the new key */
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Try to get a free slot from the local cache */
@@ -552,8 +734,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
cached_free_slots->objs,
LCORE_CACHE_SIZE, NULL);
if (n_slots == 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
cached_free_slots->len += n_slots;
@@ -564,92 +745,50 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
slot_id = cached_free_slots->objs[cached_free_slots->len];
} else {
if (rte_ring_sc_dequeue(h->free_slots, &slot_id) != 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
}
new_k = RTE_PTR_ADD(keys, (uintptr_t)slot_id * h->key_entry_size);
- rte_prefetch0(new_k);
new_idx = (uint32_t)((uintptr_t) slot_id);
-
- /* Check if key is already inserted in primary location */
- ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
- if (ret != -1)
- goto failure;
-
- /* Check if key is already inserted in secondary location */
- ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
- if (ret != -1)
- goto failure;
-
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
new_k->pdata = data;
-#if defined(RTE_ARCH_X86) /* currently only x86 support HTM */
- if (h->add_key == ADD_KEY_MULTIWRITER_TM) {
- ret = rte_hash_cuckoo_insert_mw_tm(prim_bkt,
- sig, alt_hash, new_idx);
- if (ret >= 0)
- return new_idx - 1;
- /* Primary bucket full, need to make space for new entry */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, prim_bkt, sig,
- alt_hash, new_idx);
+ /* Find an empty slot and insert */
+ ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- if (ret >= 0)
- return new_idx - 1;
+ /* Primary bucket full, need to make space for new entry */
+ ret = rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- /* Also search secondary bucket to get better occupancy */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, sec_bkt, sig,
- alt_hash, new_idx);
+ /* Also search secondary bucket to get better occupancy */
+ ret = rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, data,
+ alt_hash, sig, new_idx, &ret_val);
- if (ret >= 0)
- return new_idx - 1;
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
} else {
-#endif
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
-
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-
- /* Primary bucket full, need to make space for new entry
- * After recursive function.
- * Insert the new entry in the position of the pushed entry
- * if successful or return error and
- * store the new slot back in the ring
- */
- ret = make_space_bucket(h, prim_bkt, &nr_pushes);
- if (ret >= 0) {
- prim_bkt->sig_current[ret] = sig;
- prim_bkt->sig_alt[ret] = alt_hash;
- prim_bkt->key_idx[ret] = new_idx;
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-#if defined(RTE_ARCH_X86)
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret;
}
-#endif
- /* Error in addition, store new slot back in the ring and return error */
- enqueue_slot_back(h, cached_free_slots, (void *)((uintptr_t) new_idx));
-
-failure:
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return ret;
}
int32_t
@@ -734,12 +873,14 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
bucket_idx = sig & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
+ __hash_rw_reader_lock(h);
/* Check if key is in primary location */
ret = search_one_bucket(h, key, sig, data, bkt);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
return ret;
-
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
@@ -747,9 +888,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
/* Check if key is in secondary location */
ret = search_one_bucket(h, key, alt_hash, data, bkt);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
return ret;
-
+ }
+ __hash_rw_reader_unlock(h);
return -ENOENT;
}
@@ -791,7 +934,7 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
bkt->sig_current[i] = NULL_SIGNATURE;
bkt->sig_alt[i] = NULL_SIGNATURE;
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Cache full, need to free it. */
@@ -855,10 +998,13 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
bucket_idx = sig & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
+ __hash_rw_writer_lock(h);
/* look for key in primary bucket */
ret = search_and_remove(h, key, bkt, sig);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
return ret;
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
@@ -867,9 +1013,12 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
/* look for key in secondary bucket */
ret = search_and_remove(h, key, bkt, alt_hash);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
return ret;
+ }
+ __hash_rw_writer_unlock(h);
return -ENOENT;
}
@@ -1011,6 +1160,7 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
rte_prefetch0(secondary_bkt[i]);
}
+ __hash_rw_reader_lock(h);
/* Compare signatures and prefetch key slot of first hit */
for (i = 0; i < num_keys; i++) {
compare_signatures(&prim_hitmask[i], &sec_hitmask[i],
@@ -1093,6 +1243,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
continue;
}
+ __hash_rw_reader_unlock(h);
+
if (hit_mask != NULL)
*hit_mask = hits;
}
@@ -1151,7 +1303,7 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
bucket_idx = *next / RTE_HASH_BUCKET_ENTRIES;
idx = *next % RTE_HASH_BUCKET_ENTRIES;
}
-
+ __hash_rw_reader_lock(h);
/* Get position of entry in key table */
position = h->buckets[bucket_idx].key_idx[idx];
next_key = (struct rte_hash_key *) ((char *)h->key_store +
@@ -1160,6 +1312,8 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
*key = next_key->key;
*data = next_key->pdata;
+ __hash_rw_reader_unlock(h);
+
/* Increment iterator */
(*next)++;
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index 7a54e55..db4d1a0 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -88,11 +88,6 @@ const rte_hash_cmp_eq_t cmp_jump_table[NUM_KEY_CMP_CASES] = {
#endif
-enum add_key_case {
- ADD_KEY_SINGLEWRITER = 0,
- ADD_KEY_MULTIWRITER,
- ADD_KEY_MULTIWRITER_TM,
-};
/** Number of items per bucket. */
#define RTE_HASH_BUCKET_ENTRIES 8
@@ -155,18 +150,20 @@ struct rte_hash {
struct rte_ring *free_slots;
/**< Ring that stores all indexes of the free slots in the key table */
- uint8_t hw_trans_mem_support;
- /**< Hardware transactional memory support */
+
struct lcore_cache *local_free_slots;
/**< Local cache per lcore, storing some indexes of the free slots */
- enum add_key_case add_key; /**< Multi-writer hash add behavior */
-
- rte_spinlock_t *multiwriter_lock; /**< Multi-writer spinlock for w/o TM */
/* Fields used in lookup */
uint32_t key_len __rte_cache_aligned;
/**< Length of hash key. */
+ uint8_t hw_trans_mem_support;
+ /**< If hardware transactional memory is used. */
+ uint8_t multi_writer_support;
+ /**< If multi-writer support is enabled. */
+ uint8_t readwrite_concur_support;
+ /**< If read-write concurrency support is enabled */
rte_hash_function hash_func; /**< Function used to calculate hash. */
uint32_t hash_func_init_val; /**< Init value used by hash_func. */
rte_hash_cmp_eq_t rte_hash_custom_cmp_eq;
@@ -184,6 +181,7 @@ struct rte_hash {
/**< Table with buckets storing all the hash values and key indexes
* to the key table.
*/
+ rte_rwlock_t *readwrite_lock; /**< Read-write lock thread-safety. */
} __rte_cache_aligned;
struct queue_node {
diff --git a/lib/librte_hash/rte_cuckoo_hash_x86.h b/lib/librte_hash/rte_cuckoo_hash_x86.h
deleted file mode 100644
index 981d7bd..0000000
--- a/lib/librte_hash/rte_cuckoo_hash_x86.h
+++ /dev/null
@@ -1,167 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2016 Intel Corporation
- */
-
-/* rte_cuckoo_hash_x86.h
- * This file holds all x86 specific Cuckoo Hash functions
- */
-
-/* Only tries to insert at one bucket (@prim_bkt) without trying to push
- * buckets around
- */
-static inline unsigned
-rte_hash_cuckoo_insert_mw_tm(struct rte_hash_bucket *prim_bkt,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned i, status;
- unsigned try = 0;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- /* Insert new entry if there is room in the primary
- * bucket.
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
- rte_xend();
-
- if (i != RTE_HASH_BUCKET_ENTRIES)
- return 0;
-
- break; /* break off try loop if transaction commits */
- } else {
- /* If we abort we give up this cuckoo path. */
- try++;
- rte_pause();
- }
- }
-
- return -1;
-}
-
-/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
- * the path head with new entry (sig, alt_hash, new_idx)
- */
-static inline int
-rte_hash_cuckoo_move_insert_mw_tm(const struct rte_hash *h,
- struct queue_node *leaf, uint32_t leaf_slot,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned try = 0;
- unsigned status;
- uint32_t prev_alt_bkt_idx;
-
- struct queue_node *prev_node, *curr_node = leaf;
- struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
- uint32_t prev_slot, curr_slot = leaf_slot;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- /* In case empty slot was gone before entering TSX */
- if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT)
- rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
- while (likely(curr_node->prev != NULL)) {
- prev_node = curr_node->prev;
- prev_bkt = prev_node->bkt;
- prev_slot = curr_node->prev_slot;
-
- prev_alt_bkt_idx
- = prev_bkt->sig_alt[prev_slot]
- & h->bucket_bitmask;
-
- if (unlikely(&h->buckets[prev_alt_bkt_idx]
- != curr_bkt)) {
- rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
- }
-
- /* Need to swap current/alt sig to allow later
- * Cuckoo insert to move elements back to its
- * primary bucket if available
- */
- curr_bkt->sig_alt[curr_slot] =
- prev_bkt->sig_current[prev_slot];
- curr_bkt->sig_current[curr_slot] =
- prev_bkt->sig_alt[prev_slot];
- curr_bkt->key_idx[curr_slot]
- = prev_bkt->key_idx[prev_slot];
-
- curr_slot = prev_slot;
- curr_node = prev_node;
- curr_bkt = curr_node->bkt;
- }
-
- curr_bkt->sig_current[curr_slot] = sig;
- curr_bkt->sig_alt[curr_slot] = alt_hash;
- curr_bkt->key_idx[curr_slot] = new_idx;
-
- rte_xend();
-
- return 0;
- }
-
- /* If we abort we give up this cuckoo path, since most likely it's
- * no longer valid as TSX detected data conflict
- */
- try++;
- rte_pause();
- }
-
- return -1;
-}
-
-/*
- * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
- * Cuckoo
- */
-static inline int
-rte_hash_cuckoo_make_space_mw_tm(const struct rte_hash *h,
- struct rte_hash_bucket *bkt,
- hash_sig_t sig, hash_sig_t alt_hash,
- uint32_t new_idx)
-{
- unsigned i;
- struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
- struct queue_node *tail, *head;
- struct rte_hash_bucket *curr_bkt, *alt_bkt;
-
- tail = queue;
- head = queue + 1;
- tail->bkt = bkt;
- tail->prev = NULL;
- tail->prev_slot = -1;
-
- /* Cuckoo bfs Search */
- while (likely(tail != head && head <
- queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
- RTE_HASH_BUCKET_ENTRIES)) {
- curr_bkt = tail->bkt;
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
- if (likely(rte_hash_cuckoo_move_insert_mw_tm(h,
- tail, i, sig,
- alt_hash, new_idx) == 0))
- return 0;
- }
-
- /* Enqueue new node and keep prev node info */
- alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
- & h->bucket_bitmask]);
- head->bkt = alt_bkt;
- head->prev = tail;
- head->prev_slot = i;
- head++;
- }
- tail++;
- }
-
- return -ENOSPC;
-}
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index f71ca9f..ecb49e4 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -34,6 +34,9 @@ extern "C" {
/** Default behavior of insertion, single writer/multi writer */
#define RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD 0x02
+/** Flag to support reader writer concurrency */
+#define RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY 0x04
+
/** Signature of key that is stored internally. */
typedef uint32_t hash_sig_t;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v3 6/8] test: add tests in hash table perf test
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (4 preceding siblings ...)
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 5/8] hash: add read and write concurrency support Yipeng Wang
@ 2018-07-06 19:46 ` Yipeng Wang
2018-07-09 15:33 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 7/8] test: add test case for read write concurrency Yipeng Wang
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 8/8] hash: add new API function to query the key count Yipeng Wang
7 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-07-06 19:46 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
New code is added to support read-write concurrency for
rte_hash. Due to the newly added code in the critical path,
the perf test is modified to show any performance impact.
It is still a single-thread test.
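For reference, the measurement follows the usual cycles-per-operation
pattern around rte_rdtsc_precise(); below is a minimal sketch of that
pattern (the helper name and its parameters are hypothetical, not names
from the test).
#include <stdint.h>
#include <rte_cycles.h>
#include <rte_hash.h>
/* Hypothetical helper: average cycles per lookup over num_ops keys. */
static uint64_t
cycles_per_lookup(const struct rte_hash *h, uint32_t *keys, uint64_t num_ops)
{
	uint64_t i, begin;
	begin = rte_rdtsc_precise();
	for (i = 0; i < num_ops; i++)
		rte_hash_lookup(h, &keys[i]);	/* operation under test */
	return (rte_rdtsc_precise() - begin) / num_ops;
}
The with_locks knob added below simply toggles extra_flag between 0 and the
TRANS_MEM_SUPPORT | RW_CONCURRENCY combination, so the same loops measure
both configurations.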
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
test/test/test_hash_perf.c | 36 +++++++++++++++++++++++++-----------
1 file changed, 25 insertions(+), 11 deletions(-)
diff --git a/test/test/test_hash_perf.c b/test/test/test_hash_perf.c
index a81d0c7..33dcb9f 100644
--- a/test/test/test_hash_perf.c
+++ b/test/test/test_hash_perf.c
@@ -76,7 +76,8 @@ static struct rte_hash_parameters ut_params = {
};
static int
-create_table(unsigned with_data, unsigned table_index)
+create_table(unsigned int with_data, unsigned int table_index,
+ unsigned int with_locks)
{
char name[RTE_HASH_NAMESIZE];
@@ -86,6 +87,14 @@ create_table(unsigned with_data, unsigned table_index)
else
sprintf(name, "test_hash%d", hashtest_key_lens[table_index]);
+
+ if (with_locks)
+ ut_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT
+ | RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ ut_params.extra_flag = 0;
+
ut_params.name = name;
ut_params.key_len = hashtest_key_lens[table_index];
ut_params.socket_id = rte_socket_id();
@@ -459,7 +468,7 @@ reset_table(unsigned table_index)
}
static int
-run_all_tbl_perf_tests(unsigned with_pushes)
+run_all_tbl_perf_tests(unsigned int with_pushes, unsigned int with_locks)
{
unsigned i, j, with_data, with_hash;
@@ -468,7 +477,7 @@ run_all_tbl_perf_tests(unsigned with_pushes)
for (with_data = 0; with_data <= 1; with_data++) {
for (i = 0; i < NUM_KEYSIZES; i++) {
- if (create_table(with_data, i) < 0)
+ if (create_table(with_data, i, with_locks) < 0)
return -1;
if (get_input_keys(with_pushes, i) < 0)
@@ -611,15 +620,20 @@ fbk_hash_perf_test(void)
static int
test_hash_perf(void)
{
- unsigned with_pushes;
-
- for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
- if (with_pushes == 0)
- printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ unsigned int with_pushes, with_locks;
+ for (with_locks = 0; with_locks <= 1; with_locks++) {
+ if (with_locks)
+ printf("\nWith locks in the code\n");
else
- printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
- if (run_all_tbl_perf_tests(with_pushes) < 0)
- return -1;
+ printf("\nWithout locks in the code\n");
+ for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
+ if (with_pushes == 0)
+ printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ else
+ printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
+ if (run_all_tbl_perf_tests(with_pushes, with_locks) < 0)
+ return -1;
+ }
}
if (fbk_hash_perf_test() < 0)
return -1;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v3 7/8] test: add test case for read write concurrency
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (5 preceding siblings ...)
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 6/8] test: add tests in hash table perf test Yipeng Wang
@ 2018-07-06 19:46 ` Yipeng Wang
2018-07-09 16:24 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 8/8] hash: add new API function to query the key count Yipeng Wang
7 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-07-06 19:46 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit adds a new test case for testing read/write concurrency.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
test/test/Makefile | 1 +
test/test/test_hash_readwrite.c | 646 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 647 insertions(+)
create mode 100644 test/test/test_hash_readwrite.c
diff --git a/test/test/Makefile b/test/test/Makefile
index eccc8ef..6ce66c9 100644
--- a/test/test/Makefile
+++ b/test/test/Makefile
@@ -113,6 +113,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
diff --git a/test/test/test_hash_readwrite.c b/test/test/test_hash_readwrite.c
new file mode 100644
index 0000000..39a2bbb
--- /dev/null
+++ b/test/test/test_hash_readwrite.c
@@ -0,0 +1,646 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <locale.h>
+
+#include <rte_cycles.h>
+#include <rte_hash.h>
+#include <rte_hash_crc.h>
+#include <rte_jhash.h>
+#include <rte_launch.h>
+#include <rte_malloc.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+
+#include "test.h"
+
+#define RTE_RWTEST_FAIL 0
+
+#define TOTAL_ENTRY (16*1024*1024)
+#define TOTAL_INSERT (15*1024*1024)
+
+#define NUM_TEST 3
+unsigned int core_cnt[NUM_TEST] = {2, 4, 8};
+
+struct perf {
+ uint32_t single_read;
+ uint32_t single_write;
+ uint32_t read_only[NUM_TEST];
+ uint32_t write_only[NUM_TEST];
+ uint32_t read_write_r[NUM_TEST];
+ uint32_t read_write_w[NUM_TEST];
+};
+
+static struct perf htm_results, non_htm_results;
+
+struct {
+ uint32_t *keys;
+ uint32_t *found;
+ uint32_t num_insert;
+ uint32_t rounded_tot_insert;
+ struct rte_hash *h;
+} tbl_rw_test_param;
+
+static rte_atomic64_t gcycles;
+static rte_atomic64_t ginsertions;
+
+static rte_atomic64_t gread_cycles;
+static rte_atomic64_t gwrite_cycles;
+
+static rte_atomic64_t greads;
+static rte_atomic64_t gwrites;
+
+static int
+test_hash_readwrite_worker(__attribute__((unused)) void *arg)
+{
+ uint64_t i, offset;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+
+ offset = (lcore_id - rte_get_master_lcore())
+ * tbl_rw_test_param.num_insert;
+
+ printf("Core #%d inserting and reading %d: %'"PRId64" - %'"PRId64"\n",
+ lcore_id, tbl_rw_test_param.num_insert,
+ offset, offset + tbl_rw_test_param.num_insert);
+
+ begin = rte_rdtsc_precise();
+
+
+ for (i = offset; i < offset + tbl_rw_test_param.num_insert; i++) {
+
+ if (rte_hash_lookup(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i) > 0)
+ break;
+
+ ret = rte_hash_add_key(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i);
+ if (ret < 0)
+ break;
+
+ if (rte_hash_lookup(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i) != ret)
+ break;
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gcycles, cycles);
+ rte_atomic64_add(&ginsertions, i - offset);
+
+ for (; i < offset + tbl_rw_test_param.num_insert; i++)
+ tbl_rw_test_param.keys[i] = RTE_RWTEST_FAIL;
+
+ return 0;
+}
+
+static int
+init_params(int use_htm, int use_jhash)
+{
+ unsigned int i;
+
+ uint32_t *keys = NULL;
+ uint32_t *found = NULL;
+ struct rte_hash *handle;
+
+ struct rte_hash_parameters hash_params = {
+ .entries = TOTAL_ENTRY,
+ .key_len = sizeof(uint32_t),
+ .hash_func_init_val = 0,
+ .socket_id = rte_socket_id(),
+ };
+ if (use_jhash)
+ hash_params.hash_func = rte_jhash;
+ else
+ hash_params.hash_func = rte_hash_crc;
+
+ if (use_htm)
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT |
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+
+ hash_params.name = "tests";
+
+ handle = rte_hash_create(&hash_params);
+ if (handle == NULL) {
+ printf("hash creation failed");
+ return -1;
+ }
+
+ tbl_rw_test_param.h = handle;
+ keys = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+
+ if (keys == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ found = rte_zmalloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+ if (found == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ goto err;
+ }
+
+
+ tbl_rw_test_param.keys = keys;
+ tbl_rw_test_param.found = found;
+
+ for (i = 0; i < TOTAL_ENTRY; i++)
+ keys[i] = i;
+
+ return 0;
+
+err:
+ rte_free(keys);
+ rte_hash_free(handle);
+
+ return -1;
+}
+
+static int
+test_hash_readwrite_functional(int use_htm)
+{
+ unsigned int i;
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+ int use_jhash = 1;
+
+ rte_atomic64_init(&gcycles);
+ rte_atomic64_clear(&gcycles);
+
+ rte_atomic64_init(&ginsertions);
+ rte_atomic64_clear(&ginsertions);
+
+ if (init_params(use_htm, use_jhash) != 0)
+ goto err;
+
+ tbl_rw_test_param.num_insert =
+ TOTAL_INSERT / rte_lcore_count();
+
+ tbl_rw_test_param.rounded_tot_insert =
+ tbl_rw_test_param.num_insert
+ * rte_lcore_count();
+
+ printf("++++++++Start function tests:+++++++++\n");
+
+ /* Fire all threads. */
+ rte_eal_mp_remote_launch(test_hash_readwrite_worker,
+ NULL, CALL_MASTER);
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_rw_test_param.h, &next_key,
+ &next_data, &iter) >= 0) {
+ /* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_rw_test_param.found[i]++;
+ }
+
+ for (i = 0; i < tbl_rw_test_param.rounded_tot_insert; i++) {
+ if (tbl_rw_test_param.keys[i] != RTE_RWTEST_FAIL) {
+ if (tbl_rw_test_param.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_rw_test_param.found[i] == 0) {
+ lost_keys++;
+ printf("key %d is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during read-write test.\n");
+
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gcycles) /
+ rte_atomic64_read(&ginsertions);
+
+ printf("cycles per insertion and lookup: %llu\n", cycles_per_insertion);
+
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+ printf("+++++++++Complete function tests+++++++++\n");
+ return 0;
+
+err_free:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+err:
+ return -1;
+}
+
+static int
+test_rw_reader(__attribute__((unused)) void *arg)
+{
+ uint64_t i;
+ uint64_t begin, cycles;
+ uint64_t read_cnt = (uint64_t)((uintptr_t)arg);
+
+ begin = rte_rdtsc_precise();
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ &data);
+ if (i != (uint64_t)(uintptr_t)data) {
+ printf("lookup find wrong value %"PRIu64","
+ "%"PRIu64"\n", i,
+ (uint64_t)(uintptr_t)data);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gread_cycles, cycles);
+ rte_atomic64_add(&greads, i);
+ return 0;
+}
+
+static int
+test_rw_writer(__attribute__((unused)) void *arg)
+{
+ uint64_t i;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+ uint64_t start_coreid = (uint64_t)(uintptr_t)arg;
+ uint64_t offset;
+
+ offset = TOTAL_INSERT / 2 + (lcore_id - start_coreid)
+ * tbl_rw_test_param.num_insert;
+ begin = rte_rdtsc_precise();
+ for (i = offset; i < offset + tbl_rw_test_param.num_insert; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("writer failed %"PRIu64"\n", i);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gwrite_cycles, cycles);
+ rte_atomic64_add(&gwrites, tbl_rw_test_param.num_insert);
+ return 0;
+}
+
+static int
+test_hash_readwrite_perf(struct perf *perf_results, int use_htm,
+ int reader_faster)
+{
+ unsigned int n;
+ int ret;
+ int start_coreid;
+ uint64_t i, read_cnt;
+
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+ int use_jhash = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+
+ uint64_t start = 0, end = 0;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gwrites);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&greads);
+
+ rte_atomic64_init(&gread_cycles);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_init(&gwrite_cycles);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ if (init_params(use_htm, use_jhash) != 0)
+ goto err;
+
+ /*
+ * Test whether readers or writers finish faster. When readers finish
+ * faster, we time the readers; when writers finish faster, we time
+ * the writers.
+ * Dividing by 10 or 2 is just an experimental choice to vary the
+ * workload of the readers.
+ */
+ if (reader_faster) {
+ printf("++++++Start perf test: reader++++++++\n");
+ read_cnt = TOTAL_INSERT / 10;
+ } else {
+ printf("++++++Start perf test: writer++++++++\n");
+ read_cnt = TOTAL_INSERT / 2;
+ }
+
+
+ /* We first test single thread performance */
+ start = rte_rdtsc_precise();
+ /* Insert half of the keys */
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_write = end / i;
+
+ start = rte_rdtsc_precise();
+
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ &data);
+ if (i != (uint64_t)(uintptr_t)data) {
+ printf("lookup find wrong value"
+ " %"PRIu64",%"PRIu64"\n", i,
+ (uint64_t)(uintptr_t)data);
+ break;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_read = end / i;
+
+ for (n = 0; n < NUM_TEST; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < core_cnt[n] * 2 + 1)
+ goto finish;
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_rw_test_param.h);
+
+ tbl_rw_test_param.num_insert = TOTAL_INSERT / 2 / core_cnt[n];
+ tbl_rw_test_param.rounded_tot_insert = TOTAL_INSERT / 2 +
+ tbl_rw_test_param.num_insert *
+ core_cnt[n];
+
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+ /* Then test multiple thread case but only all reads or
+ * all writes
+ */
+
+ /* Test only reader cases */
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+
+ rte_eal_mp_wait_lcore();
+
+ start_coreid = i;
+ /* Test only writer cases */
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+
+
+ rte_eal_mp_wait_lcore();
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ perf_results->read_only[n] = cycles_per_insertion;
+ printf("Reader only: cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles) /
+ rte_atomic64_read(&gwrites);
+ perf_results->write_only[n] = cycles_per_insertion;
+ printf("Writer only: cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_rw_test_param.h);
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+
+ start_coreid = core_cnt[n] + 1;
+
+ if (reader_faster) {
+ for (i = core_cnt[n] + 1; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+ } else {
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ }
+
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_rw_test_param.h,
+ &next_key, &next_data, &iter) >= 0) {
+ /* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_rw_test_param.found[i]++;
+ }
+
+
+ for (i = 0; i < tbl_rw_test_param.rounded_tot_insert; i++) {
+ if (tbl_rw_test_param.keys[i] != RTE_RWTEST_FAIL) {
+ if (tbl_rw_test_param.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_rw_test_param.found[i] == 0) {
+ lost_keys++;
+ printf("key %"PRIu64" is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during read-write test.\n");
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ perf_results->read_write_r[n] = cycles_per_insertion;
+ printf("Read-write cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles) /
+ rte_atomic64_read(&gwrites);
+ perf_results->read_write_w[n] = cycles_per_insertion;
+ printf("Read-write cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+ }
+
+finish:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+ return 0;
+
+err_free:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+
+err:
+ return -1;
+}
+
+static int
+test_hash_readwrite_main(void)
+{
+ /*
+ * Variables used to choose different tests.
+ * use_htm indicates if hardware transactional memory should be used.
+ * reader_faster indicates if the reader threads should finish earlier
+ * than the writer threads. This is used to time either the reader
+ * threads or the writer threads for performance numbers.
+ */
+ int use_htm, reader_faster;
+
+ if (rte_lcore_count() == 1) {
+ printf("More than one lcore is required "
+ "to do read write test\n");
+ return 0;
+ }
+
+
+ setlocale(LC_NUMERIC, "");
+
+ if (rte_tm_supported()) {
+ printf("Hardware transactional memory (lock elision) "
+ "is supported\n");
+
+ printf("Test read-write with Hardware transactional memory\n");
+
+ use_htm = 1;
+ if (test_hash_readwrite_functional(use_htm) < 0)
+ return -1;
+
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+ } else {
+ printf("Hardware transactional memory (lock elision) "
+ "is NOT supported\n");
+ }
+
+ printf("Test read-write without Hardware transactional memory\n");
+ use_htm = 0;
+ if (test_hash_readwrite_functional(use_htm) < 0)
+ return -1;
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&non_htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&non_htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+
+
+ printf("Results summary:\n");
+
+ int i;
+
+ printf("single read: %u\n", htm_results.single_read);
+ printf("single write: %u\n", htm_results.single_write);
+ for (i = 0; i < NUM_TEST; i++) {
+ printf("core_cnt: %u\n", core_cnt[i]);
+ printf("HTM:\n");
+ printf("read only: %u\n", htm_results.read_only[i]);
+ printf("write only: %u\n", htm_results.write_only[i]);
+ printf("read-write read: %u\n", htm_results.read_write_r[i]);
+ printf("read-write write: %u\n", htm_results.read_write_w[i]);
+
+ printf("non HTM:\n");
+ printf("read only: %u\n", non_htm_results.read_only[i]);
+ printf("write only: %u\n", non_htm_results.write_only[i]);
+ printf("read-write read: %u\n",
+ non_htm_results.read_write_r[i]);
+ printf("read-write write: %u\n",
+ non_htm_results.read_write_w[i]);
+ }
+
+ return 0;
+}
+
+REGISTER_TEST_COMMAND(hash_readwrite_autotest, test_hash_readwrite_main);
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v3 8/8] hash: add new API function to query the key count
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (6 preceding siblings ...)
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 7/8] test: add test case for read write concurrency Yipeng Wang
@ 2018-07-06 19:46 ` Yipeng Wang
2018-07-09 16:22 ` De Lara Guarch, Pablo
7 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-07-06 19:46 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
Add a new function, rte_hash_count, to return the number of keys that
are currently stored in the hash table. Corresponding test functions are
added into hash_test and hash_multiwriter test.
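A minimal usage sketch (not part of the patch; the helper name is
hypothetical and h is an existing handle created elsewhere):
#include <stdio.h>
#include <rte_hash.h>
/* Hypothetical helper: report the current key count of a table. */
static void
report_key_count(const struct rte_hash *h)
{
	int32_t cnt = rte_hash_count(h);
	if (cnt < 0)
		printf("invalid handle (%d)\n", cnt); /* -EINVAL when h is NULL */
	else
		printf("table currently holds %d keys\n", cnt);
}
Internally, the count is derived from the free-slot ring (and from the
per-lcore caches when multi-writer support is enabled), as shown in the
implementation below.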
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 24 ++++++++++++++++++++++++
lib/librte_hash/rte_hash.h | 11 +++++++++++
lib/librte_hash/rte_hash_version.map | 8 ++++++++
test/test/test_hash.c | 12 ++++++++++++
test/test/test_hash_multiwriter.c | 9 +++++++++
5 files changed, 64 insertions(+)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 032e213..739513f 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -370,6 +370,30 @@ rte_hash_secondary_hash(const hash_sig_t primary_hash)
return primary_hash ^ ((tag + 1) * alt_bits_xor);
}
+int32_t
+rte_hash_count(const struct rte_hash *h)
+{
+ uint32_t tot_ring_cnt, cached_cnt = 0;
+ uint32_t i, ret;
+
+ if (h == NULL)
+ return -EINVAL;
+
+ if (h->multi_writer_support) {
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ for (i = 0; i < RTE_MAX_LCORE; i++)
+ cached_cnt += h->local_free_slots[i].len;
+
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots) -
+ cached_cnt;
+ } else {
+ tot_ring_cnt = h->entries;
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots);
+ }
+ return ret;
+}
+
/* Read write locks implemented using rte_rwlock */
static inline void
__hash_rw_writer_lock(const struct rte_hash *h)
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index ecb49e4..1f1a276 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -127,6 +127,17 @@ void
rte_hash_reset(struct rte_hash *h);
/**
+ * Return the number of keys in the hash table
+ * @param h
+ * Hash table to query from
+ * @return
+ * - -EINVAL if parameters are invalid
+ * - A value indicating how many keys were inserted in the table.
+ */
+int32_t
+rte_hash_count(const struct rte_hash *h);
+
+/**
* Add a key-value pair to an existing hash table.
* This operation is not multi-thread safe
* and should only be called from one thread.
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index 52a2576..e216ac8 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -45,3 +45,11 @@ DPDK_16.07 {
rte_hash_get_key_with_position;
} DPDK_2.2;
+
+
+DPDK_18.08 {
+ global:
+
+ rte_hash_count;
+
+} DPDK_16.07;
diff --git a/test/test/test_hash.c b/test/test/test_hash.c
index edf41f5..b3db9fd 100644
--- a/test/test/test_hash.c
+++ b/test/test/test_hash.c
@@ -1103,6 +1103,7 @@ static int test_average_table_utilization(void)
unsigned i, j;
unsigned added_keys, average_keys_added = 0;
int ret;
+ unsigned int cnt;
printf("\n# Running test to determine average utilization"
"\n before adding elements begins to fail\n");
@@ -1121,13 +1122,24 @@ static int test_average_table_utilization(void)
for (i = 0; i < ut_params.key_len; i++)
simple_key[i] = rte_rand() % 255;
ret = rte_hash_add_key(handle, simple_key);
+ if (ret < 0)
+ break;
}
+
if (ret != -ENOSPC) {
printf("Unexpected error when adding keys\n");
rte_hash_free(handle);
return -1;
}
+ cnt = rte_hash_count(handle);
+ if (cnt != added_keys) {
+ printf("rte_hash_count returned wrong value %u, %u,"
+ "%u\n", j, added_keys, cnt);
+ rte_hash_free(handle);
+ return -1;
+ }
+
average_keys_added += added_keys;
/* Reset the table */
diff --git a/test/test/test_hash_multiwriter.c b/test/test/test_hash_multiwriter.c
index ef5fce3..ae3ce3b 100644
--- a/test/test/test_hash_multiwriter.c
+++ b/test/test/test_hash_multiwriter.c
@@ -116,6 +116,7 @@ test_hash_multiwriter(void)
uint32_t duplicated_keys = 0;
uint32_t lost_keys = 0;
+ uint32_t count;
snprintf(name, 32, "test%u", calledCount++);
hash_params.name = name;
@@ -163,6 +164,14 @@ test_hash_multiwriter(void)
NULL, CALL_MASTER);
rte_eal_mp_wait_lcore();
+
+ count = rte_hash_count(handle);
+ if (count != rounded_nb_total_tsx_insertion) {
+ printf("rte_hash_count returned wrong value %u, %d\n",
+ rounded_nb_total_tsx_insertion, count);
+ goto err3;
+ }
+
while (rte_hash_iterate(handle, &next_key, &next_data, &iter) >= 0) {
/* Search for the key in the list of keys added .*/
i = *(const uint32_t *)next_key;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library
2018-06-08 10:51 [dpdk-dev] [PATCH v1 0/3] Add read-write concurrency to rte_hash library Yipeng Wang
` (4 preceding siblings ...)
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
@ 2018-07-09 10:44 ` Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
` (8 more replies)
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
6 siblings, 9 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-09 10:44 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This patch set adds the read-write concurrency support in rte_hash.
A new flag value is added to indicate if read-write concurrency is needed
during creation time. Test cases are implemented to do functional and
performance tests.
The new concurrency model is based on rte_rwlock. When Intel TSX is
available and the user requests it, the TM version of the
rte_rwlock will be used. Both multi-writer and read-write concurrency
are protected by the rte_rwlock instead of the x86-specific RTM
instructions, so the x86-specific header rte_cuckoo_hash_x86.h is
removed and its code is folded into the main .c file.
A new rte_hash_count API is proposed to count how many keys are inserted
into the hash table.
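For illustration, a minimal creation-time sketch showing how the new
flag is intended to be used (the parameter values are examples only,
not part of the patch set):

    struct rte_hash_parameters params = {
        .name = "rw_example",            /* example name */
        .entries = 1024,                 /* example size */
        .key_len = sizeof(uint32_t),
        .hash_func = rte_jhash,
        .hash_func_init_val = 0,
        .socket_id = rte_socket_id(),
        /* rwlock-based protection; TRANS_MEM_SUPPORT additionally
         * selects the TM lock variant on CPUs with Intel TSX */
        .extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY |
                RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT,
    };
    struct rte_hash *h = rte_hash_create(&params);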
v3->v4:
1. Change commit message titles as Pablo suggested. (Pablo)
2. hash: remove unnecessary changes in commit 4. (Pablo)
3. test: remove unnecessary double blank lines. (Pablo)
4. Add Pablo's ack in commit message.
v2->v3:
1. hash: Concurrency bug fix: after the cuckoo path moving begins,
the last empty slot needs to be verified again in case other writers
raced into this slot and occupied it. A new commit is added to do this
bug fix since it applies to the master head as well.
2. hash: Concurrency bug fix: if the cuckoo path is detected to be
invalid, the current slot needs to be emptied since it has been
duplicated to its target bucket.
3. hash: "const" is used for types in multiple locations. (Pablo)
4. hash: the rte_malloc call for the read-write lock used the wrong
align argument. A similar fix applies to the master head, so a new
commit is created. (Pablo)
5. hash: the ring size calculation fix is moved to the front. (Pablo)
6. hash: the search-and-remove function is refactored to be more
aligned with the other search functions. (Pablo)
7. test: use jhash in the functional test for read-write concurrency,
because jhash with sequential keys incurs more cuckoo path moves.
8. Multiple coding style, typo, and commit message fixes. (Pablo)
v1->v2:
1. Split each commit into two commits for easier review (Pablo).
2. Add more comments in various places (Pablo).
3. hash: In the key insertion function, move the duplicated-key check
to an earlier location and protect it using locks. Checking for a
duplicated key should happen first, and data updates should be
protected.
4. hash: In the lookup bulk function, put the signature comparison
inside the lock, since writes could happen between the signature
matches on the two buckets.
5. hash: Add write locks to the reset function as well to protect
resets.
6. test: Fix a 32-bit compilation error in the read-write test (Pablo).
7. test: Check the total physical core count in the read-write test.
Do not test with a thread count larger than the physical core count.
8. Other minor fixes such as typos (Pablo).
Yipeng Wang (8):
hash: fix multiwriter lock memory allocation
hash: fix a multi-writer race condition
hash: fix key slot size accuracy
hash: make duplicated code into functions
hash: add read and write concurrency support
test: add tests in hash table perf test
test: add test case for read write concurrency
hash: add new API function to query the key count
lib/librte_hash/meson.build | 1 -
lib/librte_hash/rte_cuckoo_hash.c | 701 +++++++++++++++++++++-------------
lib/librte_hash/rte_cuckoo_hash.h | 18 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 164 --------
lib/librte_hash/rte_hash.h | 14 +
lib/librte_hash/rte_hash_version.map | 8 +
test/test/Makefile | 1 +
test/test/test_hash.c | 12 +
test/test/test_hash_multiwriter.c | 9 +
test/test/test_hash_perf.c | 36 +-
test/test/test_hash_readwrite.c | 637 ++++++++++++++++++++++++++++++
11 files changed, 1156 insertions(+), 445 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
create mode 100644 test/test/test_hash_readwrite.c
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v4 1/8] hash: fix multiwriter lock memory allocation
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
@ 2018-07-09 10:44 ` Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 2/8] hash: fix a multi-writer race condition Yipeng Wang
` (7 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-09 10:44 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
When allocating multiwriter_lock with rte_malloc, the align
argument should be RTE_CACHE_LINE_SIZE rather than LCORE_CACHE_SIZE.
There should also be a check to verify that the rte_malloc call
succeeded.
Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
Cc: stable@dpdk.org
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index a07543a..80dcf41 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -281,7 +281,10 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->add_key = ADD_KEY_MULTIWRITER;
h->multiwriter_lock = rte_malloc(NULL,
sizeof(rte_spinlock_t),
- LCORE_CACHE_SIZE);
+ RTE_CACHE_LINE_SIZE);
+ if (h->multiwriter_lock == NULL)
+ goto err_unlock;
+
rte_spinlock_init(h->multiwriter_lock);
}
} else
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v4 2/8] hash: fix a multi-writer race condition
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
@ 2018-07-09 10:44 ` Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 3/8] hash: fix key slot size accuracy Yipeng Wang
` (6 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-09 10:44 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
The current multi-writer implementation uses Intel TSX to
protect the cuckoo path moving but not the cuckoo path
searching. After the search, we need to verify again at the
beginning of the TSX region that the empty slot found still
exists; otherwise another writer could have occupied it before
the TSX region started. The current code does not perform this
verification.
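The window being closed can be sketched as a timeline (illustrative
only, not code from the patch):

    /* writer A: BFS search finds empty slot S for its cuckoo path */
    /* writer B: commits a transaction that fills slot S           */
    /* writer A: rte_xbegin() ... shifts entries into S            */
    /*
     * S is no longer empty, so B's entry would be silently
     * overwritten. Re-checking key_idx for EMPTY_SLOT at the start
     * of the transaction and aborting otherwise makes A retry with
     * a fresh cuckoo path instead.
     */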
Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
Cc: stable@dpdk.org
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
lib/librte_hash/rte_cuckoo_hash_x86.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_hash/rte_cuckoo_hash_x86.h b/lib/librte_hash/rte_cuckoo_hash_x86.h
index 2c5b017..981d7bd 100644
--- a/lib/librte_hash/rte_cuckoo_hash_x86.h
+++ b/lib/librte_hash/rte_cuckoo_hash_x86.h
@@ -66,6 +66,9 @@ rte_hash_cuckoo_move_insert_mw_tm(const struct rte_hash *h,
while (try < RTE_HASH_TSX_MAX_RETRY) {
status = rte_xbegin();
if (likely(status == RTE_XBEGIN_STARTED)) {
+ /* In case empty slot was gone before entering TSX */
+ if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT)
+ rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
while (likely(curr_node->prev != NULL)) {
prev_node = curr_node->prev;
prev_bkt = prev_node->bkt;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v4 3/8] hash: fix key slot size accuracy
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 2/8] hash: fix a multi-writer race condition Yipeng Wang
@ 2018-07-09 10:44 ` Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 4/8] hash: make duplicated code into functions Yipeng Wang
` (5 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-09 10:44 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit calculates the needed key slot size more
accurately. The earlier local-cache fix requires the free-slot
ring to be larger than the number of table entries, but the
required size was calculated inaccurately.
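As a worked example of the corrected sizing (the entry count below is
illustrative; LCORE_CACHE_SIZE is 64 and RTE_MAX_LCORE is assumed to
be 128 here):

    num_key_slots = entries + (RTE_MAX_LCORE - 1) * (LCORE_CACHE_SIZE - 1) + 1
                  = 1024 + 127 * 63 + 1
                  = 9026

The formula accounts for up to LCORE_CACHE_SIZE - 1 free slots stranded
in each other lcore's local cache, plus the dummy slot reserved for key
misses; the free-slot ring is then created with
rte_align32pow2(num_key_slots) so all of them fit.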
Fixes: 5915699153d7 ("hash: fix scaling by reducing contention")
Cc: stable@dpdk.org
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 80dcf41..11602af 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -126,13 +126,13 @@ rte_hash_create(const struct rte_hash_parameters *params)
* except for the first cache
*/
num_key_slots = params->entries + (RTE_MAX_LCORE - 1) *
- LCORE_CACHE_SIZE + 1;
+ (LCORE_CACHE_SIZE - 1) + 1;
else
num_key_slots = params->entries + 1;
snprintf(ring_name, sizeof(ring_name), "HT_%s", params->name);
/* Create ring (Dummy slot index is not enqueued) */
- r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots - 1),
+ r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots),
params->socket_id, 0);
if (r == NULL) {
RTE_LOG(ERR, HASH, "memory allocation failed\n");
@@ -291,7 +291,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->add_key = ADD_KEY_SINGLEWRITER;
/* Populate free slots ring. Entry zero is reserved for key misses. */
- for (i = 1; i < params->entries + 1; i++)
+ for (i = 1; i < num_key_slots; i++)
rte_ring_sp_enqueue(r, (void *)((uintptr_t) i));
te->data = (void *) h;
@@ -373,7 +373,7 @@ void
rte_hash_reset(struct rte_hash *h)
{
void *ptr;
- unsigned i;
+ uint32_t tot_ring_cnt, i;
if (h == NULL)
return;
@@ -386,7 +386,13 @@ rte_hash_reset(struct rte_hash *h)
rte_pause();
/* Repopulate the free slots ring. Entry zero is reserved for key misses */
- for (i = 1; i < h->entries + 1; i++)
+ if (h->hw_trans_mem_support)
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ else
+ tot_ring_cnt = h->entries;
+
+ for (i = 1; i < tot_ring_cnt + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
if (h->hw_trans_mem_support) {
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v4 4/8] hash: make duplicated code into functions
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (2 preceding siblings ...)
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 3/8] hash: fix key slot size accuracy Yipeng Wang
@ 2018-07-09 10:44 ` Yipeng Wang
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 5/8] hash: add read and write concurrency support Yipeng Wang
` (4 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-09 10:44 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit refactors the hash table lookup/add/del code
to remove some code duplication: the processing applied to the
primary bucket also applies to the secondary bucket, so it is
factored into shared helper functions.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 182 +++++++++++++++++++-------------------
1 file changed, 89 insertions(+), 93 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 11602af..b812f33 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -485,6 +485,33 @@ enqueue_slot_back(const struct rte_hash *h,
rte_ring_sp_enqueue(h->free_slots, slot_id);
}
+/* Search a key from bucket and update its data */
+static inline int32_t
+search_and_update(const struct rte_hash *h, void *data, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig, hash_sig_t alt_hash)
+{
+ int i;
+ struct rte_hash_key *k, *keys = h->key_store;
+
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (bkt->sig_current[i] == sig &&
+ bkt->sig_alt[i] == alt_hash) {
+ k = (struct rte_hash_key *) ((char *)keys +
+ bkt->key_idx[i] * h->key_entry_size);
+ if (rte_hash_cmp_eq(key, k->key, h) == 0) {
+ /* Update data */
+ k->pdata = data;
+ /*
+ * Return index where key is stored,
+ * subtracting the first dummy index
+ */
+ return bkt->key_idx[i] - 1;
+ }
+ }
+ }
+ return -1;
+}
+
static inline int32_t
__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
hash_sig_t sig, void *data)
@@ -493,7 +520,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
uint32_t prim_bucket_idx, sec_bucket_idx;
unsigned i;
struct rte_hash_bucket *prim_bkt, *sec_bkt;
- struct rte_hash_key *new_k, *k, *keys = h->key_store;
+ struct rte_hash_key *new_k, *keys = h->key_store;
void *slot_id = NULL;
uint32_t new_idx;
int ret;
@@ -547,46 +574,14 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
new_idx = (uint32_t)((uintptr_t) slot_id);
/* Check if key is already inserted in primary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (prim_bkt->sig_current[i] == sig &&
- prim_bkt->sig_alt[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- prim_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = prim_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
- }
+ ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
+ if (ret != -1)
+ goto failure;
/* Check if key is already inserted in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (sec_bkt->sig_alt[i] == sig &&
- sec_bkt->sig_current[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- sec_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = sec_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
- }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1)
+ goto failure;
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
@@ -699,20 +694,15 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data)
else
return ret;
}
+
+/* Search one bucket to find the match key */
static inline int32_t
-__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig, void **data)
+search_one_bucket(const struct rte_hash *h, const void *key, hash_sig_t sig,
+ void **data, const struct rte_hash_bucket *bkt)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
+ int i;
struct rte_hash_key *k, *keys = h->key_store;
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
-
- /* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
if (bkt->sig_current[i] == sig &&
bkt->key_idx[i] != EMPTY_SLOT) {
@@ -729,6 +719,26 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig, void **data)
+{
+ uint32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int ret;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+
+ /* Check if key is in primary location */
+ ret = search_one_bucket(h, key, sig, data, bkt);
+ if (ret != -1)
+ return ret;
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
@@ -736,22 +746,9 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
bkt = &h->buckets[bucket_idx];
/* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->sig_alt[i] == sig) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- if (data != NULL)
- *data = k->pdata;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- return bkt->key_idx[i] - 1;
- }
- }
- }
+ ret = search_one_bucket(h, key, alt_hash, data, bkt);
+ if (ret != -1)
+ return ret;
return -ENOENT;
}
@@ -815,20 +812,15 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
}
}
+/* Search one bucket and remove the matched key */
static inline int32_t
-__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig)
+search_and_remove(const struct rte_hash *h, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
struct rte_hash_key *k, *keys = h->key_store;
+ unsigned int i;
int32_t ret;
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
-
/* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
if (bkt->sig_current[i] == sig &&
@@ -848,31 +840,35 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig)
+{
+ uint32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int32_t ret;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+ /* look for key in primary bucket */
+ ret = search_and_remove(h, key, bkt, sig);
+ if (ret != -1)
+ return ret;
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
- /* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->key_idx[i] != EMPTY_SLOT) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- remove_entry(h, bkt, i);
-
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = bkt->key_idx[i] - 1;
- bkt->key_idx[i] = EMPTY_SLOT;
- return ret;
- }
- }
- }
+ /* look for key in secondary bucket */
+ ret = search_and_remove(h, key, bkt, alt_hash);
+ if (ret != -1)
+ return ret;
return -ENOENT;
}
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v4 5/8] hash: add read and write concurrency support
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (3 preceding siblings ...)
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 4/8] hash: make duplicated code into functions Yipeng Wang
@ 2018-07-09 10:45 ` Yipeng Wang
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 6/8] test: add tests in hash table perf test Yipeng Wang
` (3 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-09 10:45 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
The existing implementation of librte_hash does not support read-write
concurrency. This commit implements read-write safety using rte_rwlock,
and the TM version of rte_rwlock if hardware transactional memory is
available. Both multi-writer and read-write concurrency are now
protected by rte_rwlock. The x86-specific header file is removed since
the x86-specific RTM functions are no longer called directly by
rte_hash.
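For illustration, a sketch of the usage this enables (the worker
functions and key value are assumptions, not part of the patch). With
RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY set at creation, the application
needs no external locking:

    static int
    reader_fn(void *arg)
    {
        struct rte_hash *h = arg;
        uint32_t key = 42;            /* example key */

        rte_hash_lookup(h, &key);     /* safe alongside writers */
        return 0;
    }

    static int
    writer_fn(void *arg)
    {
        struct rte_hash *h = arg;
        uint32_t key = 42;            /* example key */

        rte_hash_add_key(h, &key);    /* safe alongside readers */
        return 0;
    }

    /* e.g. rte_eal_remote_launch(reader_fn, h, lcore_a);
     *      rte_eal_remote_launch(writer_fn, h, lcore_b);
     */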
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
lib/librte_hash/meson.build | 1 -
lib/librte_hash/rte_cuckoo_hash.c | 520 ++++++++++++++++++++++------------
lib/librte_hash/rte_cuckoo_hash.h | 18 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 167 -----------
lib/librte_hash/rte_hash.h | 3 +
5 files changed, 348 insertions(+), 361 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
index e139e1d..efc06ed 100644
--- a/lib/librte_hash/meson.build
+++ b/lib/librte_hash/meson.build
@@ -6,7 +6,6 @@ headers = files('rte_cmp_arm64.h',
'rte_cmp_x86.h',
'rte_crc_arm64.h',
'rte_cuckoo_hash.h',
- 'rte_cuckoo_hash_x86.h',
'rte_fbk_hash.h',
'rte_hash_crc.h',
'rte_hash.h',
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index b812f33..35631cc 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -31,9 +31,6 @@
#include "rte_hash.h"
#include "rte_cuckoo_hash.h"
-#if defined(RTE_ARCH_X86)
-#include "rte_cuckoo_hash_x86.h"
-#endif
TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
@@ -93,8 +90,10 @@ rte_hash_create(const struct rte_hash_parameters *params)
void *buckets = NULL;
char ring_name[RTE_RING_NAMESIZE];
unsigned num_key_slots;
- unsigned hw_trans_mem_support = 0;
unsigned i;
+ unsigned int hw_trans_mem_support = 0, multi_writer_support = 0;
+ unsigned int readwrite_concur_support = 0;
+
rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
hash_list = RTE_TAILQ_CAST(rte_hash_tailq.head, rte_hash_list);
@@ -118,8 +117,16 @@ rte_hash_create(const struct rte_hash_parameters *params)
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
hw_trans_mem_support = 1;
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD)
+ multi_writer_support = 1;
+
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY) {
+ readwrite_concur_support = 1;
+ multi_writer_support = 1;
+ }
+
/* Store all keys and leave the first entry as a dummy entry for lookup_bulk */
- if (hw_trans_mem_support)
+ if (multi_writer_support)
/*
* Increase number of slots by total number of indices
* that can be stored in the lcore caches
@@ -233,7 +240,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->cmp_jump_table_idx = KEY_OTHER_BYTES;
#endif
- if (hw_trans_mem_support) {
+ if (multi_writer_support) {
h->local_free_slots = rte_zmalloc_socket(NULL,
sizeof(struct lcore_cache) * RTE_MAX_LCORE,
RTE_CACHE_LINE_SIZE, params->socket_id);
@@ -261,6 +268,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->key_store = k;
h->free_slots = r;
h->hw_trans_mem_support = hw_trans_mem_support;
+ h->multi_writer_support = multi_writer_support;
+ h->readwrite_concur_support = readwrite_concur_support;
#if defined(RTE_ARCH_X86)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
@@ -271,24 +280,17 @@ rte_hash_create(const struct rte_hash_parameters *params)
#endif
h->sig_cmp_fn = RTE_HASH_COMPARE_SCALAR;
- /* Turn on multi-writer only with explicit flat from user and TM
+ /* Turn on multi-writer only with explicit flag from user and TM
* support.
*/
- if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD) {
- if (h->hw_trans_mem_support) {
- h->add_key = ADD_KEY_MULTIWRITER_TM;
- } else {
- h->add_key = ADD_KEY_MULTIWRITER;
- h->multiwriter_lock = rte_malloc(NULL,
- sizeof(rte_spinlock_t),
- RTE_CACHE_LINE_SIZE);
- if (h->multiwriter_lock == NULL)
- goto err_unlock;
-
- rte_spinlock_init(h->multiwriter_lock);
- }
- } else
- h->add_key = ADD_KEY_SINGLEWRITER;
+ if (h->multi_writer_support) {
+ h->readwrite_lock = rte_malloc(NULL, sizeof(rte_rwlock_t),
+ RTE_CACHE_LINE_SIZE);
+ if (h->readwrite_lock == NULL)
+ goto err_unlock;
+
+ rte_rwlock_init(h->readwrite_lock);
+ }
/* Populate free slots ring. Entry zero is reserved for key misses. */
for (i = 1; i < num_key_slots; i++)
@@ -338,11 +340,10 @@ rte_hash_free(struct rte_hash *h)
rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
- if (h->hw_trans_mem_support)
+ if (h->multi_writer_support) {
rte_free(h->local_free_slots);
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_free(h->multiwriter_lock);
+ rte_free(h->readwrite_lock);
+ }
rte_ring_free(h->free_slots);
rte_free(h->key_store);
rte_free(h->buckets);
@@ -369,6 +370,44 @@ rte_hash_secondary_hash(const hash_sig_t primary_hash)
return primary_hash ^ ((tag + 1) * alt_bits_xor);
}
+/* Read write locks implemented using rte_rwlock */
+static inline void
+__hash_rw_writer_lock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_lock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_lock(h->readwrite_lock);
+}
+
+
+static inline void
+__hash_rw_reader_lock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_lock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_lock(h->readwrite_lock);
+}
+
+static inline void
+__hash_rw_writer_unlock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_unlock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_unlock(h->readwrite_lock);
+}
+
+static inline void
+__hash_rw_reader_unlock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_unlock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_unlock(h->readwrite_lock);
+}
+
void
rte_hash_reset(struct rte_hash *h)
{
@@ -378,6 +417,7 @@ rte_hash_reset(struct rte_hash *h)
if (h == NULL)
return;
+ __hash_rw_writer_lock(h);
memset(h->buckets, 0, h->num_buckets * sizeof(struct rte_hash_bucket));
memset(h->key_store, 0, h->key_entry_size * (h->entries + 1));
@@ -386,7 +426,7 @@ rte_hash_reset(struct rte_hash *h)
rte_pause();
/* Repopulate the free slots ring. Entry zero is reserved for key misses */
- if (h->hw_trans_mem_support)
+ if (h->multi_writer_support)
tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
(LCORE_CACHE_SIZE - 1);
else
@@ -395,77 +435,12 @@ rte_hash_reset(struct rte_hash *h)
for (i = 1; i < tot_ring_cnt + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
/* Reset local caches per lcore */
for (i = 0; i < RTE_MAX_LCORE; i++)
h->local_free_slots[i].len = 0;
}
-}
-
-/* Search for an entry that can be pushed to its alternative location */
-static inline int
-make_space_bucket(const struct rte_hash *h, struct rte_hash_bucket *bkt,
- unsigned int *nr_pushes)
-{
- unsigned i, j;
- int ret;
- uint32_t next_bucket_idx;
- struct rte_hash_bucket *next_bkt[RTE_HASH_BUCKET_ENTRIES];
-
- /*
- * Push existing item (search for bucket with space in
- * alternative locations) to its alternative location
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Search for space in alternative locations */
- next_bucket_idx = bkt->sig_alt[i] & h->bucket_bitmask;
- next_bkt[i] = &h->buckets[next_bucket_idx];
- for (j = 0; j < RTE_HASH_BUCKET_ENTRIES; j++) {
- if (next_bkt[i]->key_idx[j] == EMPTY_SLOT)
- break;
- }
-
- if (j != RTE_HASH_BUCKET_ENTRIES)
- break;
- }
-
- /* Alternative location has spare room (end of recursive function) */
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- next_bkt[i]->sig_alt[j] = bkt->sig_current[i];
- next_bkt[i]->sig_current[j] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[j] = bkt->key_idx[i];
- return i;
- }
-
- /* Pick entry that has not been pushed yet */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++)
- if (bkt->flag[i] == 0)
- break;
-
- /* All entries have been pushed, so entry cannot be added */
- if (i == RTE_HASH_BUCKET_ENTRIES || ++(*nr_pushes) > RTE_HASH_MAX_PUSHES)
- return -ENOSPC;
-
- /* Set flag to indicate that this entry is going to be pushed */
- bkt->flag[i] = 1;
-
- /* Need room in alternative bucket to insert the pushed entry */
- ret = make_space_bucket(h, next_bkt[i], nr_pushes);
- /*
- * After recursive function.
- * Clear flags and insert the pushed entry
- * in its alternative location if successful,
- * or return error
- */
- bkt->flag[i] = 0;
- if (ret >= 0) {
- next_bkt[i]->sig_alt[ret] = bkt->sig_current[i];
- next_bkt[i]->sig_current[ret] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[ret] = bkt->key_idx[i];
- return i;
- } else
- return ret;
-
+ __hash_rw_writer_unlock(h);
}
/*
@@ -478,7 +453,7 @@ enqueue_slot_back(const struct rte_hash *h,
struct lcore_cache *cached_free_slots,
void *slot_id)
{
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
cached_free_slots->objs[cached_free_slots->len] = slot_id;
cached_free_slots->len++;
} else
@@ -512,13 +487,207 @@ search_and_update(const struct rte_hash *h, void *data, const void *key,
return -1;
}
+/* Only tries to insert at one bucket (@prim_bkt) without trying to push
+ * buckets around.
+ * return 1 if matching existing key, return 0 if succeeds, return -1 for no
+ * empty entry.
+ */
+static inline int32_t
+rte_hash_cuckoo_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *prim_bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ unsigned int i;
+ struct rte_hash_bucket *cur_bkt = prim_bkt;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+ /* Check if key was inserted after last check but before this
+ * protected region in case of inserting duplicated keys.
+ */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ /* Insert new entry if there is room in the primary
+ * bucket.
+ */
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ /* Check if slot is available */
+ if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
+ prim_bkt->sig_current[i] = sig;
+ prim_bkt->sig_alt[i] = alt_hash;
+ prim_bkt->key_idx[i] = new_idx;
+ break;
+ }
+ }
+ __hash_rw_writer_unlock(h);
+
+ if (i != RTE_HASH_BUCKET_ENTRIES)
+ return 0;
+
+ /* no empty entry */
+ return -1;
+}
+
+/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
+ * the path head with new entry (sig, alt_hash, new_idx)
+ * return 1 if matched key found, return -1 if cuckoo path invalided and fail,
+ * return 0 if succeeds.
+ */
+static inline int
+rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *alt_bkt,
+ const struct rte_hash_key *key, void *data,
+ struct queue_node *leaf, uint32_t leaf_slot,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ uint32_t prev_alt_bkt_idx;
+ struct rte_hash_bucket *cur_bkt = bkt;
+ struct queue_node *prev_node, *curr_node = leaf;
+ struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
+ uint32_t prev_slot, curr_slot = leaf_slot;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+
+ /* In case empty slot was gone before entering protected region */
+ if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT) {
+ __hash_rw_writer_unlock(h);
+ return -1;
+ }
+
+ /* Check if key was inserted after last check but before this
+ * protected region.
+ */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ ret = search_and_update(h, data, key, alt_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ while (likely(curr_node->prev != NULL)) {
+ prev_node = curr_node->prev;
+ prev_bkt = prev_node->bkt;
+ prev_slot = curr_node->prev_slot;
+
+ prev_alt_bkt_idx =
+ prev_bkt->sig_alt[prev_slot] & h->bucket_bitmask;
+
+ if (unlikely(&h->buckets[prev_alt_bkt_idx]
+ != curr_bkt)) {
+ /* revert it to empty, otherwise duplicated keys */
+ curr_bkt->key_idx[curr_slot] = EMPTY_SLOT;
+ __hash_rw_writer_unlock(h);
+ return -1;
+ }
+
+ /* Need to swap current/alt sig to allow later
+ * Cuckoo insert to move elements back to its
+ * primary bucket if available
+ */
+ curr_bkt->sig_alt[curr_slot] =
+ prev_bkt->sig_current[prev_slot];
+ curr_bkt->sig_current[curr_slot] =
+ prev_bkt->sig_alt[prev_slot];
+ curr_bkt->key_idx[curr_slot] =
+ prev_bkt->key_idx[prev_slot];
+
+ curr_slot = prev_slot;
+ curr_node = prev_node;
+ curr_bkt = curr_node->bkt;
+ }
+
+ curr_bkt->sig_current[curr_slot] = sig;
+ curr_bkt->sig_alt[curr_slot] = alt_hash;
+ curr_bkt->key_idx[curr_slot] = new_idx;
+
+ __hash_rw_writer_unlock(h);
+
+ return 0;
+
+}
+
+/*
+ * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
+ * Cuckoo
+ */
+static inline int
+rte_hash_cuckoo_make_space_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash,
+ uint32_t new_idx, int32_t *ret_val)
+{
+ unsigned int i;
+ struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
+ struct queue_node *tail, *head;
+ struct rte_hash_bucket *curr_bkt, *alt_bkt;
+
+ tail = queue;
+ head = queue + 1;
+ tail->bkt = bkt;
+ tail->prev = NULL;
+ tail->prev_slot = -1;
+
+ /* Cuckoo bfs Search */
+ while (likely(tail != head && head <
+ queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
+ RTE_HASH_BUCKET_ENTRIES)) {
+ curr_bkt = tail->bkt;
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
+ int32_t ret = rte_hash_cuckoo_move_insert_mw(h,
+ bkt, sec_bkt, key, data,
+ tail, i, sig, alt_hash,
+ new_idx, ret_val);
+ if (likely(ret != -1))
+ return ret;
+ }
+
+ /* Enqueue new node and keep prev node info */
+ alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
+ & h->bucket_bitmask]);
+ head->bkt = alt_bkt;
+ head->prev = tail;
+ head->prev_slot = i;
+ head++;
+ }
+ tail++;
+ }
+
+ return -ENOSPC;
+}
+
static inline int32_t
__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
hash_sig_t sig, void *data)
{
hash_sig_t alt_hash;
uint32_t prim_bucket_idx, sec_bucket_idx;
- unsigned i;
struct rte_hash_bucket *prim_bkt, *sec_bkt;
struct rte_hash_key *new_k, *keys = h->key_store;
void *slot_id = NULL;
@@ -527,10 +696,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
unsigned n_slots;
unsigned lcore_id;
struct lcore_cache *cached_free_slots = NULL;
- unsigned int nr_pushes = 0;
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_lock(h->multiwriter_lock);
+ int32_t ret_val;
prim_bucket_idx = sig & h->bucket_bitmask;
prim_bkt = &h->buckets[prim_bucket_idx];
@@ -541,8 +707,24 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
sec_bkt = &h->buckets[sec_bucket_idx];
rte_prefetch0(sec_bkt);
- /* Get a new slot for storing the new key */
- if (h->hw_trans_mem_support) {
+ /* Check if key is already inserted in primary location */
+ __hash_rw_writer_lock(h);
+ ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ return ret;
+ }
+
+ /* Check if key is already inserted in secondary location */
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ return ret;
+ }
+ __hash_rw_writer_unlock(h);
+
+ /* Did not find a match, so get a new slot for storing the new key */
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Try to get a free slot from the local cache */
@@ -552,8 +734,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
cached_free_slots->objs,
LCORE_CACHE_SIZE, NULL);
if (n_slots == 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
cached_free_slots->len += n_slots;
@@ -564,92 +745,50 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
slot_id = cached_free_slots->objs[cached_free_slots->len];
} else {
if (rte_ring_sc_dequeue(h->free_slots, &slot_id) != 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
}
new_k = RTE_PTR_ADD(keys, (uintptr_t)slot_id * h->key_entry_size);
- rte_prefetch0(new_k);
new_idx = (uint32_t)((uintptr_t) slot_id);
-
- /* Check if key is already inserted in primary location */
- ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
- if (ret != -1)
- goto failure;
-
- /* Check if key is already inserted in secondary location */
- ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
- if (ret != -1)
- goto failure;
-
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
new_k->pdata = data;
-#if defined(RTE_ARCH_X86) /* currently only x86 support HTM */
- if (h->add_key == ADD_KEY_MULTIWRITER_TM) {
- ret = rte_hash_cuckoo_insert_mw_tm(prim_bkt,
- sig, alt_hash, new_idx);
- if (ret >= 0)
- return new_idx - 1;
- /* Primary bucket full, need to make space for new entry */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, prim_bkt, sig,
- alt_hash, new_idx);
+ /* Find an empty slot and insert */
+ ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- if (ret >= 0)
- return new_idx - 1;
+ /* Primary bucket full, need to make space for new entry */
+ ret = rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- /* Also search secondary bucket to get better occupancy */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, sec_bkt, sig,
- alt_hash, new_idx);
+ /* Also search secondary bucket to get better occupancy */
+ ret = rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, data,
+ alt_hash, sig, new_idx, &ret_val);
- if (ret >= 0)
- return new_idx - 1;
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
} else {
-#endif
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
-
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-
- /* Primary bucket full, need to make space for new entry
- * After recursive function.
- * Insert the new entry in the position of the pushed entry
- * if successful or return error and
- * store the new slot back in the ring
- */
- ret = make_space_bucket(h, prim_bkt, &nr_pushes);
- if (ret >= 0) {
- prim_bkt->sig_current[ret] = sig;
- prim_bkt->sig_alt[ret] = alt_hash;
- prim_bkt->key_idx[ret] = new_idx;
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-#if defined(RTE_ARCH_X86)
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret;
}
-#endif
- /* Error in addition, store new slot back in the ring and return error */
- enqueue_slot_back(h, cached_free_slots, (void *)((uintptr_t) new_idx));
-
-failure:
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return ret;
}
int32_t
@@ -734,12 +873,14 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
bucket_idx = sig & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
+ __hash_rw_reader_lock(h);
/* Check if key is in primary location */
ret = search_one_bucket(h, key, sig, data, bkt);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
return ret;
-
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
@@ -747,9 +888,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
/* Check if key is in secondary location */
ret = search_one_bucket(h, key, alt_hash, data, bkt);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
return ret;
-
+ }
+ __hash_rw_reader_unlock(h);
return -ENOENT;
}
@@ -791,7 +934,7 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
bkt->sig_current[i] = NULL_SIGNATURE;
bkt->sig_alt[i] = NULL_SIGNATURE;
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Cache full, need to free it. */
@@ -855,10 +998,13 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
bucket_idx = sig & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
+ __hash_rw_writer_lock(h);
/* look for key in primary bucket */
ret = search_and_remove(h, key, bkt, sig);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
return ret;
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
@@ -867,9 +1013,12 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
/* look for key in secondary bucket */
ret = search_and_remove(h, key, bkt, alt_hash);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
return ret;
+ }
+ __hash_rw_writer_unlock(h);
return -ENOENT;
}
@@ -1011,6 +1160,7 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
rte_prefetch0(secondary_bkt[i]);
}
+ __hash_rw_reader_lock(h);
/* Compare signatures and prefetch key slot of first hit */
for (i = 0; i < num_keys; i++) {
compare_signatures(&prim_hitmask[i], &sec_hitmask[i],
@@ -1093,6 +1243,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
continue;
}
+ __hash_rw_reader_unlock(h);
+
if (hit_mask != NULL)
*hit_mask = hits;
}
@@ -1151,7 +1303,7 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
bucket_idx = *next / RTE_HASH_BUCKET_ENTRIES;
idx = *next % RTE_HASH_BUCKET_ENTRIES;
}
-
+ __hash_rw_reader_lock(h);
/* Get position of entry in key table */
position = h->buckets[bucket_idx].key_idx[idx];
next_key = (struct rte_hash_key *) ((char *)h->key_store +
@@ -1160,6 +1312,8 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
*key = next_key->key;
*data = next_key->pdata;
+ __hash_rw_reader_unlock(h);
+
/* Increment iterator */
(*next)++;
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index 7a54e55..db4d1a0 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -88,11 +88,6 @@ const rte_hash_cmp_eq_t cmp_jump_table[NUM_KEY_CMP_CASES] = {
#endif
-enum add_key_case {
- ADD_KEY_SINGLEWRITER = 0,
- ADD_KEY_MULTIWRITER,
- ADD_KEY_MULTIWRITER_TM,
-};
/** Number of items per bucket. */
#define RTE_HASH_BUCKET_ENTRIES 8
@@ -155,18 +150,20 @@ struct rte_hash {
struct rte_ring *free_slots;
/**< Ring that stores all indexes of the free slots in the key table */
- uint8_t hw_trans_mem_support;
- /**< Hardware transactional memory support */
+
struct lcore_cache *local_free_slots;
/**< Local cache per lcore, storing some indexes of the free slots */
- enum add_key_case add_key; /**< Multi-writer hash add behavior */
-
- rte_spinlock_t *multiwriter_lock; /**< Multi-writer spinlock for w/o TM */
/* Fields used in lookup */
uint32_t key_len __rte_cache_aligned;
/**< Length of hash key. */
+ uint8_t hw_trans_mem_support;
+ /**< If hardware transactional memory is used. */
+ uint8_t multi_writer_support;
+ /**< If multi-writer support is enabled. */
+ uint8_t readwrite_concur_support;
+ /**< If read-write concurrency support is enabled */
rte_hash_function hash_func; /**< Function used to calculate hash. */
uint32_t hash_func_init_val; /**< Init value used by hash_func. */
rte_hash_cmp_eq_t rte_hash_custom_cmp_eq;
@@ -184,6 +181,7 @@ struct rte_hash {
/**< Table with buckets storing all the hash values and key indexes
* to the key table.
*/
+ rte_rwlock_t *readwrite_lock; /**< Read-write lock thread-safety. */
} __rte_cache_aligned;
struct queue_node {
diff --git a/lib/librte_hash/rte_cuckoo_hash_x86.h b/lib/librte_hash/rte_cuckoo_hash_x86.h
deleted file mode 100644
index 981d7bd..0000000
--- a/lib/librte_hash/rte_cuckoo_hash_x86.h
+++ /dev/null
@@ -1,167 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2016 Intel Corporation
- */
-
-/* rte_cuckoo_hash_x86.h
- * This file holds all x86 specific Cuckoo Hash functions
- */
-
-/* Only tries to insert at one bucket (@prim_bkt) without trying to push
- * buckets around
- */
-static inline unsigned
-rte_hash_cuckoo_insert_mw_tm(struct rte_hash_bucket *prim_bkt,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned i, status;
- unsigned try = 0;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- /* Insert new entry if there is room in the primary
- * bucket.
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
- rte_xend();
-
- if (i != RTE_HASH_BUCKET_ENTRIES)
- return 0;
-
- break; /* break off try loop if transaction commits */
- } else {
- /* If we abort we give up this cuckoo path. */
- try++;
- rte_pause();
- }
- }
-
- return -1;
-}
-
-/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
- * the path head with new entry (sig, alt_hash, new_idx)
- */
-static inline int
-rte_hash_cuckoo_move_insert_mw_tm(const struct rte_hash *h,
- struct queue_node *leaf, uint32_t leaf_slot,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned try = 0;
- unsigned status;
- uint32_t prev_alt_bkt_idx;
-
- struct queue_node *prev_node, *curr_node = leaf;
- struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
- uint32_t prev_slot, curr_slot = leaf_slot;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- /* In case empty slot was gone before entering TSX */
- if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT)
- rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
- while (likely(curr_node->prev != NULL)) {
- prev_node = curr_node->prev;
- prev_bkt = prev_node->bkt;
- prev_slot = curr_node->prev_slot;
-
- prev_alt_bkt_idx
- = prev_bkt->sig_alt[prev_slot]
- & h->bucket_bitmask;
-
- if (unlikely(&h->buckets[prev_alt_bkt_idx]
- != curr_bkt)) {
- rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
- }
-
- /* Need to swap current/alt sig to allow later
- * Cuckoo insert to move elements back to its
- * primary bucket if available
- */
- curr_bkt->sig_alt[curr_slot] =
- prev_bkt->sig_current[prev_slot];
- curr_bkt->sig_current[curr_slot] =
- prev_bkt->sig_alt[prev_slot];
- curr_bkt->key_idx[curr_slot]
- = prev_bkt->key_idx[prev_slot];
-
- curr_slot = prev_slot;
- curr_node = prev_node;
- curr_bkt = curr_node->bkt;
- }
-
- curr_bkt->sig_current[curr_slot] = sig;
- curr_bkt->sig_alt[curr_slot] = alt_hash;
- curr_bkt->key_idx[curr_slot] = new_idx;
-
- rte_xend();
-
- return 0;
- }
-
- /* If we abort we give up this cuckoo path, since most likely it's
- * no longer valid as TSX detected data conflict
- */
- try++;
- rte_pause();
- }
-
- return -1;
-}
-
-/*
- * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
- * Cuckoo
- */
-static inline int
-rte_hash_cuckoo_make_space_mw_tm(const struct rte_hash *h,
- struct rte_hash_bucket *bkt,
- hash_sig_t sig, hash_sig_t alt_hash,
- uint32_t new_idx)
-{
- unsigned i;
- struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
- struct queue_node *tail, *head;
- struct rte_hash_bucket *curr_bkt, *alt_bkt;
-
- tail = queue;
- head = queue + 1;
- tail->bkt = bkt;
- tail->prev = NULL;
- tail->prev_slot = -1;
-
- /* Cuckoo bfs Search */
- while (likely(tail != head && head <
- queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
- RTE_HASH_BUCKET_ENTRIES)) {
- curr_bkt = tail->bkt;
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
- if (likely(rte_hash_cuckoo_move_insert_mw_tm(h,
- tail, i, sig,
- alt_hash, new_idx) == 0))
- return 0;
- }
-
- /* Enqueue new node and keep prev node info */
- alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
- & h->bucket_bitmask]);
- head->bkt = alt_bkt;
- head->prev = tail;
- head->prev_slot = i;
- head++;
- }
- tail++;
- }
-
- return -ENOSPC;
-}
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index f71ca9f..ecb49e4 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -34,6 +34,9 @@ extern "C" {
/** Default behavior of insertion, single writer/multi writer */
#define RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD 0x02
+/** Flag to support reader writer concurrency */
+#define RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY 0x04
+
/** Signature of key that is stored internally. */
typedef uint32_t hash_sig_t;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v4 6/8] test: add tests in hash table perf test
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (4 preceding siblings ...)
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 5/8] hash: add read and write concurrency support Yipeng Wang
@ 2018-07-09 10:45 ` Yipeng Wang
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 7/8] test: add test case for read write concurrency Yipeng Wang
` (2 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-09 10:45 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
New code is added to support read-write concurrency for
rte_hash. Since the newly added code sits in the critical path,
the perf test is modified to show any performance impact.
It is still a single-thread test.
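The measurement pattern is roughly the following sketch (loop bounds,
keys, and variable declarations are assumptions):

    uint64_t begin = rte_rdtsc_precise();

    for (i = 0; i < num_lookups; i++)
        rte_hash_lookup(h, &keys[i]);

    uint64_t ticks = rte_rdtsc_precise() - begin;
    printf("avg cycles per lookup: %" PRIu64 "\n", ticks / num_lookups);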
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
test/test/test_hash_perf.c | 36 +++++++++++++++++++++++++-----------
1 file changed, 25 insertions(+), 11 deletions(-)
diff --git a/test/test/test_hash_perf.c b/test/test/test_hash_perf.c
index a81d0c7..33dcb9f 100644
--- a/test/test/test_hash_perf.c
+++ b/test/test/test_hash_perf.c
@@ -76,7 +76,8 @@ static struct rte_hash_parameters ut_params = {
};
static int
-create_table(unsigned with_data, unsigned table_index)
+create_table(unsigned int with_data, unsigned int table_index,
+ unsigned int with_locks)
{
char name[RTE_HASH_NAMESIZE];
@@ -86,6 +87,14 @@ create_table(unsigned with_data, unsigned table_index)
else
sprintf(name, "test_hash%d", hashtest_key_lens[table_index]);
+
+ if (with_locks)
+ ut_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT
+ | RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ ut_params.extra_flag = 0;
+
ut_params.name = name;
ut_params.key_len = hashtest_key_lens[table_index];
ut_params.socket_id = rte_socket_id();
@@ -459,7 +468,7 @@ reset_table(unsigned table_index)
}
static int
-run_all_tbl_perf_tests(unsigned with_pushes)
+run_all_tbl_perf_tests(unsigned int with_pushes, unsigned int with_locks)
{
unsigned i, j, with_data, with_hash;
@@ -468,7 +477,7 @@ run_all_tbl_perf_tests(unsigned with_pushes)
for (with_data = 0; with_data <= 1; with_data++) {
for (i = 0; i < NUM_KEYSIZES; i++) {
- if (create_table(with_data, i) < 0)
+ if (create_table(with_data, i, with_locks) < 0)
return -1;
if (get_input_keys(with_pushes, i) < 0)
@@ -611,15 +620,20 @@ fbk_hash_perf_test(void)
static int
test_hash_perf(void)
{
- unsigned with_pushes;
-
- for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
- if (with_pushes == 0)
- printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ unsigned int with_pushes, with_locks;
+ for (with_locks = 0; with_locks <= 1; with_locks++) {
+ if (with_locks)
+ printf("\nWith locks in the code\n");
else
- printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
- if (run_all_tbl_perf_tests(with_pushes) < 0)
- return -1;
+ printf("\nWithout locks in the code\n");
+ for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
+ if (with_pushes == 0)
+ printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ else
+ printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
+ if (run_all_tbl_perf_tests(with_pushes, with_locks) < 0)
+ return -1;
+ }
}
if (fbk_hash_perf_test() < 0)
return -1;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v4 7/8] test: add test case for read write concurrency
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (5 preceding siblings ...)
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 6/8] test: add tests in hash table perf test Yipeng Wang
@ 2018-07-09 10:45 ` Yipeng Wang
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 8/8] hash: add new API function to query the key count Yipeng Wang
2018-07-10 18:00 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Honnappa Nagarahalli
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-09 10:45 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit adds a new test case for testing read-write concurrency.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
test/test/Makefile | 1 +
test/test/test_hash_readwrite.c | 637 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 638 insertions(+)
create mode 100644 test/test/test_hash_readwrite.c
diff --git a/test/test/Makefile b/test/test/Makefile
index eccc8ef..6ce66c9 100644
--- a/test/test/Makefile
+++ b/test/test/Makefile
@@ -113,6 +113,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
diff --git a/test/test/test_hash_readwrite.c b/test/test/test_hash_readwrite.c
new file mode 100644
index 0000000..55ae33d
--- /dev/null
+++ b/test/test/test_hash_readwrite.c
@@ -0,0 +1,637 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <locale.h>
+
+#include <rte_cycles.h>
+#include <rte_hash.h>
+#include <rte_hash_crc.h>
+#include <rte_jhash.h>
+#include <rte_launch.h>
+#include <rte_malloc.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+
+#include "test.h"
+
+#define RTE_RWTEST_FAIL 0
+
+#define TOTAL_ENTRY (16*1024*1024)
+#define TOTAL_INSERT (15*1024*1024)
+
+#define NUM_TEST 3
+unsigned int core_cnt[NUM_TEST] = {2, 4, 8};
+
+struct perf {
+ uint32_t single_read;
+ uint32_t single_write;
+ uint32_t read_only[NUM_TEST];
+ uint32_t write_only[NUM_TEST];
+ uint32_t read_write_r[NUM_TEST];
+ uint32_t read_write_w[NUM_TEST];
+};
+
+static struct perf htm_results, non_htm_results;
+
+struct {
+ uint32_t *keys;
+ uint32_t *found;
+ uint32_t num_insert;
+ uint32_t rounded_tot_insert;
+ struct rte_hash *h;
+} tbl_rw_test_param;
+
+static rte_atomic64_t gcycles;
+static rte_atomic64_t ginsertions;
+
+static rte_atomic64_t gread_cycles;
+static rte_atomic64_t gwrite_cycles;
+
+static rte_atomic64_t greads;
+static rte_atomic64_t gwrites;
+
+static int
+test_hash_readwrite_worker(__attribute__((unused)) void *arg)
+{
+ uint64_t i, offset;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+
+ offset = (lcore_id - rte_get_master_lcore())
+ * tbl_rw_test_param.num_insert;
+
+ printf("Core #%d inserting and reading %d: %'"PRId64" - %'"PRId64"\n",
+ lcore_id, tbl_rw_test_param.num_insert,
+ offset, offset + tbl_rw_test_param.num_insert);
+
+ begin = rte_rdtsc_precise();
+
+ for (i = offset; i < offset + tbl_rw_test_param.num_insert; i++) {
+
+ if (rte_hash_lookup(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i) > 0)
+ break;
+
+ ret = rte_hash_add_key(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i);
+ if (ret < 0)
+ break;
+
+ if (rte_hash_lookup(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i) != ret)
+ break;
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gcycles, cycles);
+ rte_atomic64_add(&ginsertions, i - offset);
+
+ for (; i < offset + tbl_rw_test_param.num_insert; i++)
+ tbl_rw_test_param.keys[i] = RTE_RWTEST_FAIL;
+
+ return 0;
+}
+
+static int
+init_params(int use_htm, int use_jhash)
+{
+ unsigned int i;
+
+ uint32_t *keys = NULL;
+ uint32_t *found = NULL;
+ struct rte_hash *handle;
+
+ struct rte_hash_parameters hash_params = {
+ .entries = TOTAL_ENTRY,
+ .key_len = sizeof(uint32_t),
+ .hash_func_init_val = 0,
+ .socket_id = rte_socket_id(),
+ };
+ if (use_jhash)
+ hash_params.hash_func = rte_jhash;
+ else
+ hash_params.hash_func = rte_hash_crc;
+
+ if (use_htm)
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT |
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+
+ hash_params.name = "tests";
+
+ handle = rte_hash_create(&hash_params);
+ if (handle == NULL) {
+ printf("hash creation failed");
+ return -1;
+ }
+
+ tbl_rw_test_param.h = handle;
+ keys = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+
+ if (keys == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ found = rte_zmalloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+ if (found == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ goto err;
+ }
+
+ tbl_rw_test_param.keys = keys;
+ tbl_rw_test_param.found = found;
+
+ for (i = 0; i < TOTAL_ENTRY; i++)
+ keys[i] = i;
+
+ return 0;
+
+err:
+ rte_free(keys);
+ rte_hash_free(handle);
+
+ return -1;
+}
+
+static int
+test_hash_readwrite_functional(int use_htm)
+{
+ unsigned int i;
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+ int use_jhash = 1;
+
+ rte_atomic64_init(&gcycles);
+ rte_atomic64_clear(&gcycles);
+
+ rte_atomic64_init(&ginsertions);
+ rte_atomic64_clear(&ginsertions);
+
+ if (init_params(use_htm, use_jhash) != 0)
+ goto err;
+
+ tbl_rw_test_param.num_insert =
+ TOTAL_INSERT / rte_lcore_count();
+
+ tbl_rw_test_param.rounded_tot_insert =
+ tbl_rw_test_param.num_insert
+ * rte_lcore_count();
+
+ printf("++++++++Start function tests:+++++++++\n");
+
+ /* Fire all threads. */
+ rte_eal_mp_remote_launch(test_hash_readwrite_worker,
+ NULL, CALL_MASTER);
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_rw_test_param.h, &next_key,
+ &next_data, &iter) >= 0) {
+ /* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_rw_test_param.found[i]++;
+ }
+
+ for (i = 0; i < tbl_rw_test_param.rounded_tot_insert; i++) {
+ if (tbl_rw_test_param.keys[i] != RTE_RWTEST_FAIL) {
+ if (tbl_rw_test_param.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_rw_test_param.found[i] == 0) {
+ lost_keys++;
+ printf("key %d is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during read-write test.\n");
+
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gcycles) /
+ rte_atomic64_read(&ginsertions);
+
+ printf("cycles per insertion and lookup: %llu\n", cycles_per_insertion);
+
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+ printf("+++++++++Complete function tests+++++++++\n");
+ return 0;
+
+err_free:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+err:
+ return -1;
+}
+
+static int
+test_rw_reader(void *arg)
+{
+ uint64_t i;
+ uint64_t begin, cycles;
+ uint64_t read_cnt = (uint64_t)((uintptr_t)arg);
+
+ begin = rte_rdtsc_precise();
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ &data);
+ if (i != (uint64_t)(uintptr_t)data) {
+ printf("lookup find wrong value %"PRIu64","
+ "%"PRIu64"\n", i,
+ (uint64_t)(uintptr_t)data);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gread_cycles, cycles);
+ rte_atomic64_add(&greads, i);
+ return 0;
+}
+
+static int
+test_rw_writer(void *arg)
+{
+ uint64_t i;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+ uint64_t start_coreid = (uint64_t)(uintptr_t)arg;
+ uint64_t offset;
+
+ offset = TOTAL_INSERT / 2 + (lcore_id - start_coreid)
+ * tbl_rw_test_param.num_insert;
+ begin = rte_rdtsc_precise();
+ for (i = offset; i < offset + tbl_rw_test_param.num_insert; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("writer failed %"PRIu64"\n", i);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gwrite_cycles, cycles);
+ rte_atomic64_add(&gwrites, tbl_rw_test_param.num_insert);
+ return 0;
+}
+
+static int
+test_hash_readwrite_perf(struct perf *perf_results, int use_htm,
+ int reader_faster)
+{
+ unsigned int n;
+ int ret;
+ int start_coreid;
+ uint64_t i, read_cnt;
+
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+ int use_jhash = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+
+ uint64_t start = 0, end = 0;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gwrites);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&greads);
+
+ rte_atomic64_init(&gread_cycles);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_init(&gwrite_cycles);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ if (init_params(use_htm, use_jhash) != 0)
+ goto err;
+
+ /*
+ * Run either a "readers finish faster" or a "writers finish faster"
+ * test. When readers finish faster, we time the readers; when
+ * writers finish faster, we time the writers.
+ * Dividing by 10 or 2 is just an experimental choice to vary the
+ * workload of readers.
+ */
+ if (reader_faster) {
+ printf("++++++Start perf test: reader++++++++\n");
+ read_cnt = TOTAL_INSERT / 10;
+ } else {
+ printf("++++++Start perf test: writer++++++++\n");
+ read_cnt = TOTAL_INSERT / 2;
+ }
+
+ /* We first test single thread performance */
+ start = rte_rdtsc_precise();
+ /* Insert half of the keys */
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_write = end / i;
+
+ start = rte_rdtsc_precise();
+
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ &data);
+ if (i != (uint64_t)(uintptr_t)data) {
+ printf("lookup find wrong value"
+ " %"PRIu64",%"PRIu64"\n", i,
+ (uint64_t)(uintptr_t)data);
+ break;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_read = end / i;
+
+ for (n = 0; n < NUM_TEST; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < core_cnt[n] * 2 + 1)
+ goto finish;
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_rw_test_param.h);
+
+ tbl_rw_test_param.num_insert = TOTAL_INSERT / 2 / core_cnt[n];
+ tbl_rw_test_param.rounded_tot_insert = TOTAL_INSERT / 2 +
+ tbl_rw_test_param.num_insert *
+ core_cnt[n];
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+ /* Then test multiple thread case but only all reads or
+ * all writes
+ */
+
+ /* Test only reader cases */
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+
+ rte_eal_mp_wait_lcore();
+
+ start_coreid = i;
+ /* Test only writer cases */
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+
+ rte_eal_mp_wait_lcore();
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ perf_results->read_only[n] = cycles_per_insertion;
+ printf("Reader only: cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles) /
+ rte_atomic64_read(&gwrites);
+ perf_results->write_only[n] = cycles_per_insertion;
+ printf("Writer only: cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_rw_test_param.h);
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+ start_coreid = core_cnt[n] + 1;
+
+ if (reader_faster) {
+ for (i = core_cnt[n] + 1; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+ } else {
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ }
+
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_rw_test_param.h,
+ &next_key, &next_data, &iter) >= 0) {
+ /* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_rw_test_param.found[i]++;
+ }
+
+ for (i = 0; i < tbl_rw_test_param.rounded_tot_insert; i++) {
+ if (tbl_rw_test_param.keys[i] != RTE_RWTEST_FAIL) {
+ if (tbl_rw_test_param.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_rw_test_param.found[i] == 0) {
+ lost_keys++;
+ printf("key %"PRIu64" is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during read-write test.\n");
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ perf_results->read_write_r[n] = cycles_per_insertion;
+ printf("Read-write cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles) /
+ rte_atomic64_read(&gwrites);
+ perf_results->read_write_w[n] = cycles_per_insertion;
+ printf("Read-write cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+ }
+
+finish:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+ return 0;
+
+err_free:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+
+err:
+ return -1;
+}
+
+static int
+test_hash_readwrite_main(void)
+{
+ /*
+ * Variables used to choose different tests.
+ * use_htm indicates if hardware transactional memory should be used.
+ * reader_faster indicates if the reader threads should finish earlier
+ * than writer threads. This is to timing either reader threads or
+ * writer threads for performance numbers.
+ */
+ int use_htm, reader_faster;
+
+ if (rte_lcore_count() == 1) {
+ printf("More than one lcore is required "
+ "to do read write test\n");
+ return 0;
+ }
+
+ setlocale(LC_NUMERIC, "");
+
+ if (rte_tm_supported()) {
+ printf("Hardware transactional memory (lock elision) "
+ "is supported\n");
+
+ printf("Test read-write with Hardware transactional memory\n");
+
+ use_htm = 1;
+ if (test_hash_readwrite_functional(use_htm) < 0)
+ return -1;
+
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+ } else {
+ printf("Hardware transactional memory (lock elision) "
+ "is NOT supported\n");
+ }
+
+ printf("Test read-write without Hardware transactional memory\n");
+ use_htm = 0;
+ if (test_hash_readwrite_functional(use_htm) < 0)
+ return -1;
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&non_htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&non_htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+
+ printf("Results summary:\n");
+
+ int i;
+
+ printf("single read: %u\n", htm_results.single_read);
+ printf("single write: %u\n", htm_results.single_write);
+ for (i = 0; i < NUM_TEST; i++) {
+ printf("core_cnt: %u\n", core_cnt[i]);
+ printf("HTM:\n");
+ printf("read only: %u\n", htm_results.read_only[i]);
+ printf("write only: %u\n", htm_results.write_only[i]);
+ printf("read-write read: %u\n", htm_results.read_write_r[i]);
+ printf("read-write write: %u\n", htm_results.read_write_w[i]);
+
+ printf("non HTM:\n");
+ printf("read only: %u\n", non_htm_results.read_only[i]);
+ printf("write only: %u\n", non_htm_results.write_only[i]);
+ printf("read-write read: %u\n",
+ non_htm_results.read_write_r[i]);
+ printf("read-write write: %u\n",
+ non_htm_results.read_write_w[i]);
+ }
+
+ return 0;
+}
+
+REGISTER_TEST_COMMAND(hash_readwrite_autotest, test_hash_readwrite_main);
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v4 8/8] hash: add new API function to query the key count
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (6 preceding siblings ...)
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 7/8] test: add test case for read write concurrency Yipeng Wang
@ 2018-07-09 10:45 ` Yipeng Wang
2018-07-10 18:00 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Honnappa Nagarahalli
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-09 10:45 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
Add a new function, rte_hash_count, to return the number of keys that
are currently stored in the hash table. Corresponding test functions are
added into hash_test and hash_multiwriter test.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
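As a usage illustration (not part of this patch; the table name and sizes
below are made up), a minimal sketch of the new API, assuming EAL is
already initialized:

	#include <stdio.h>
	#include <stdint.h>
	#include <rte_hash.h>
	#include <rte_jhash.h>
	#include <rte_lcore.h>

	static int
	count_demo(void)
	{
		struct rte_hash_parameters params = {
			.name = "count_demo",	/* hypothetical name */
			.entries = 1024,
			.key_len = sizeof(uint32_t),
			.hash_func = rte_jhash,
			.socket_id = rte_socket_id(),
		};
		struct rte_hash *h = rte_hash_create(&params);
		uint32_t key = 42;

		if (h == NULL)
			return -1;
		if (rte_hash_add_key(h, &key) < 0) {
			rte_hash_free(h);
			return -1;
		}
		/* One key was inserted, so the count should be 1 */
		printf("keys in table: %d\n", rte_hash_count(h));
		rte_hash_free(h);
		return 0;
	}

rte_hash_count() returns -EINVAL for a NULL handle, as the implementation
below shows.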
lib/librte_hash/rte_cuckoo_hash.c | 24 ++++++++++++++++++++++++
lib/librte_hash/rte_hash.h | 11 +++++++++++
lib/librte_hash/rte_hash_version.map | 8 ++++++++
test/test/test_hash.c | 12 ++++++++++++
test/test/test_hash_multiwriter.c | 8 ++++++++
5 files changed, 63 insertions(+)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 35631cc..bb67ade 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -370,6 +370,30 @@ rte_hash_secondary_hash(const hash_sig_t primary_hash)
return primary_hash ^ ((tag + 1) * alt_bits_xor);
}
+int32_t
+rte_hash_count(const struct rte_hash *h)
+{
+ uint32_t tot_ring_cnt, cached_cnt = 0;
+ uint32_t i, ret;
+
+ if (h == NULL)
+ return -EINVAL;
+
+ if (h->multi_writer_support) {
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ for (i = 0; i < RTE_MAX_LCORE; i++)
+ cached_cnt += h->local_free_slots[i].len;
+
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots) -
+ cached_cnt;
+ } else {
+ tot_ring_cnt = h->entries;
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots);
+ }
+ return ret;
+}
+
/* Read write locks implemented using rte_rwlock */
static inline void
__hash_rw_writer_lock(const struct rte_hash *h)
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index ecb49e4..1f1a276 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -127,6 +127,17 @@ void
rte_hash_reset(struct rte_hash *h);
/**
+ * Return the number of keys in the hash table
+ * @param h
+ * Hash table to query from
+ * @return
+ * - -EINVAL if parameters are invalid
+ * - A value indicating how many keys were inserted in the table.
+ */
+int32_t
+rte_hash_count(const struct rte_hash *h);
+
+/**
* Add a key-value pair to an existing hash table.
* This operation is not multi-thread safe
* and should only be called from one thread.
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index 52a2576..e216ac8 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -45,3 +45,11 @@ DPDK_16.07 {
rte_hash_get_key_with_position;
} DPDK_2.2;
+
+
+DPDK_18.08 {
+ global:
+
+ rte_hash_count;
+
+} DPDK_16.07;
diff --git a/test/test/test_hash.c b/test/test/test_hash.c
index edf41f5..b3db9fd 100644
--- a/test/test/test_hash.c
+++ b/test/test/test_hash.c
@@ -1103,6 +1103,7 @@ static int test_average_table_utilization(void)
unsigned i, j;
unsigned added_keys, average_keys_added = 0;
int ret;
+ unsigned int cnt;
printf("\n# Running test to determine average utilization"
"\n before adding elements begins to fail\n");
@@ -1121,13 +1122,24 @@ static int test_average_table_utilization(void)
for (i = 0; i < ut_params.key_len; i++)
simple_key[i] = rte_rand() % 255;
ret = rte_hash_add_key(handle, simple_key);
+ if (ret < 0)
+ break;
}
+
if (ret != -ENOSPC) {
printf("Unexpected error when adding keys\n");
rte_hash_free(handle);
return -1;
}
+ cnt = rte_hash_count(handle);
+ if (cnt != added_keys) {
+ printf("rte_hash_count returned wrong value %u, %u,"
+ "%u\n", j, added_keys, cnt);
+ rte_hash_free(handle);
+ return -1;
+ }
+
average_keys_added += added_keys;
/* Reset the table */
diff --git a/test/test/test_hash_multiwriter.c b/test/test/test_hash_multiwriter.c
index ef5fce3..f182f40 100644
--- a/test/test/test_hash_multiwriter.c
+++ b/test/test/test_hash_multiwriter.c
@@ -116,6 +116,7 @@ test_hash_multiwriter(void)
uint32_t duplicated_keys = 0;
uint32_t lost_keys = 0;
+ uint32_t count;
snprintf(name, 32, "test%u", calledCount++);
hash_params.name = name;
@@ -163,6 +164,13 @@ test_hash_multiwriter(void)
NULL, CALL_MASTER);
rte_eal_mp_wait_lcore();
+ count = rte_hash_count(handle);
+ if (count != rounded_nb_total_tsx_insertion) {
+ printf("rte_hash_count returned wrong value %u, %d\n",
+ rounded_nb_total_tsx_insertion, count);
+ goto err3;
+ }
+
while (rte_hash_iterate(handle, &next_key, &next_data, &iter) >= 0) {
/* Search for the key in the list of keys added. */
i = *(const uint32_t *)next_key;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/8] hash: fix multiwriter lock memory allocation
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
@ 2018-07-09 11:26 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-09 11:26 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, July 6, 2018 8:47 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v3 1/8] hash: fix multiwriter lock memory allocation
>
> When malloc for multiwriter_lock, the align should be RTE_CACHE_LINE_SIZE
> rather than LCORE_CACHE_SIZE.
>
> Also there should be check to verify the success of rte_malloc.
>
> Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
> Cc: stable@dpdk.org
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v3 2/8] hash: fix a multi-writer bug
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 2/8] hash: fix a multi-writer bug Yipeng Wang
@ 2018-07-09 14:16 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-09 14:16 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, July 6, 2018 8:47 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v3 2/8] hash: fix a multi-writer bug
Just a minor comment on the title. It should be a summary of the commit
(try to summarize the bug, instead of using the word "bug").
Keep my ack for the next version (like in other patches).
>
> Current multi-writer implementation uses Intel TSX to protect the cuckoo path
> moving but not the cuckoo path searching. After searching, we need to verify
> again if the same empty slot still exists at the beginning of the TSX region.
> Otherwise another writer could occupy the empty slot before the TSX region.
> Current code does not verify.
>
> Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
> Cc: stable@dpdk.org
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v3 3/8] hash: fix to have more accurate key slot size
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 3/8] hash: fix to have more accurate key slot size Yipeng Wang
@ 2018-07-09 14:20 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-09 14:20 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, July 6, 2018 8:47 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v3 3/8] hash: fix to have more accurate key slot size
Titles should always start with a verb, whereas in your case it starts with a noun.
Better to change it to "fix key slot size accuracy"?
Apart from that:
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
>
> This commit calculates the needed key slot size more accurately. The previous
> local cache fix requires the free slot ring to be larger than actually needed.
> The calculation of the value is inaccurate.
>
> Fixes: 5915699153d7 ("hash: fix scaling by reducing contention")
> Cc: stable@dpdk.org
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v3 4/8] hash: make duplicated code into functions
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 4/8] hash: make duplicated code into functions Yipeng Wang
@ 2018-07-09 14:25 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-09 14:25 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, July 6, 2018 8:47 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v3 4/8] hash: make duplicated code into functions
>
> This commit refactors the hash table lookup/add/del code to remove some code
> duplication. Processing on primary bucket can also apply to secondary bucket
> with same code.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> ---
> lib/librte_hash/rte_cuckoo_hash.c | 186 +++++++++++++++++++-------------------
...
> @@ -838,41 +830,45 @@ __rte_hash_del_key_with_hash(const struct rte_hash
> *h, const void *key,
> if (rte_hash_cmp_eq(key, k->key, h) == 0) {
> remove_entry(h, bkt, i);
>
> + ret = bkt->key_idx[i] - 1;
> + bkt->key_idx[i] = EMPTY_SLOT;
> /*
> * Return index where key is stored,
> * subtracting the first dummy index
> */
> - ret = bkt->key_idx[i] - 1;
> - bkt->key_idx[i] = EMPTY_SLOT;
Actually, this change doesn't look needed, right?
It looks like you are just moving the two lines before the comment.
Apart from this,
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v3 5/8] hash: add read and write concurrency support
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 5/8] hash: add read and write concurrency support Yipeng Wang
@ 2018-07-09 14:28 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-09 14:28 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, July 6, 2018 8:47 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v3 5/8] hash: add read and write concurrency support
>
> The existing implementation of librte_hash does not support read-write
> concurrency. This commit implements read-write safety using rte_rwlock and
> rte_rwlock TM version if hardware transactional memory is available.
>
> Both multi-writer and read-write concurrency is protected by rte_rwlock now.
> The x86 specific header file is removed since the x86 specific RTM function is not
> called directly by rte hash now.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/8] test: add tests in hash table perf test
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 6/8] test: add tests in hash table perf test Yipeng Wang
@ 2018-07-09 15:33 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-09 15:33 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, July 6, 2018 8:47 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v3 6/8] test: add tests in hash table perf test
>
> New code is added to support read-write concurrency for rte_hash. Due to the
> newly added code in critical path, the perf test is modified to show any
> performance impact.
> It is still a single-thread test.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v3 8/8] hash: add new API function to query the key count
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 8/8] hash: add new API function to query the key count Yipeng Wang
@ 2018-07-09 16:22 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-09 16:22 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, July 6, 2018 8:47 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v3 8/8] hash: add new API function to query the key count
>
> Add a new function, rte_hash_count, to return the number of keys that are
> currently stored in the hash table. Corresponding test functions are added into
> hash_test and hash_multiwriter test.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v3 7/8] test: add test case for read write concurrency
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 7/8] test: add test case for read write concurrency Yipeng Wang
@ 2018-07-09 16:24 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 65+ messages in thread
From: De Lara Guarch, Pablo @ 2018-07-09 16:24 UTC (permalink / raw)
To: Wang, Yipeng1
Cc: dev, Richardson, Bruce, honnappa.nagarahalli, vguvva, brijesh.s.singh
> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Friday, July 6, 2018 8:47 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com;
> vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
> Subject: [PATCH v3 7/8] test: add test case for read write concurrency
>
> This commits add a new test case for testing read/write concurrency.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
There are still some double blank lines that I think are not necessary.
Once fixed, keep my ack:
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v5 0/8] Add read-write concurrency to rte_hash library
2018-06-08 10:51 [dpdk-dev] [PATCH v1 0/3] Add read-write concurrency to rte_hash library Yipeng Wang
` (5 preceding siblings ...)
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
@ 2018-07-10 16:59 ` Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
` (8 more replies)
6 siblings, 9 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-10 16:59 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This patch set adds the read-write concurrency support in rte_hash.
A new flag value is added to indicate if read-write concurrency is needed
during creation time. Test cases are implemented to do functional and
performance tests.
The new concurrency model is based on rte_rwlock. When Intel TSX is
available and the users indicate to use it, the TM version of the
rte_rwlock will be called. Both multi-writer and read-write concurrency
are protected by the rte_rwlock instead of the x86 specific RTM
instructions, so the x86 specific header rte_cuckoo_hash_x86.h is removed
and the code is infused into the main .c file.
A new rte_hash_count API is proposed to count how many keys are inserted
into the hash table.
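For reference, a minimal creation sketch with the new flag follows; the
table name and sizes are hypothetical, not taken from these patches:

	struct rte_hash_parameters params = {
		.name = "rw_demo",		/* hypothetical name */
		.entries = 1024 * 1024,
		.key_len = sizeof(uint32_t),
		.hash_func = rte_jhash,
		.socket_id = rte_socket_id(),
		/* Request read-write concurrency; patch 5/8 also turns
		 * on multi-writer support when this flag is set.
		 */
		.extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY,
	};
	struct rte_hash *h = rte_hash_create(&params);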
v4->v5:
One typo fix in a commit message that caused a patch check warning.
v3->v4:
1. Change commit message titles as Pablo suggested. (Pablo)
2. hash: remove unnecessary changes in commit 4. (Pablo)
3. test: remove unnecessary double blank lines. (Pablo)
4. Add Pablo's ack in commit message.
v2->v3:
1. hash: Concurrency bug fix: after beginning the cuckoo path move,
the last empty slot needs to be verified again in case another writer
raced into this slot and occupied it. A new commit is added for this
bug fix since it applies to the master head as well.
2. hash: Concurrency bug fix: if cuckoo path is detected to be invalid,
the current slot needs to be emptied since it is duplicated to its
target bucket.
3. hash: "const" is used for types in multiple locations. (Pablo)
4. hash: the rte_malloc call for the read-write lock used the wrong
align argument. A similar fix applies to the master head, so a new
commit is created. (Pablo)
5. hash: ring size calculation fix is moved to front. (Pablo)
6. hash: search-and-remove function is refactored to be more aligned
with other search function. (Pablo)
7. test: use jhash in the functional test for read-write concurrency,
because jhash with sequential keys incurs more cuckoo path moves.
v1->v2:
1. Split each commit into two commits for easier review (Pablo).
2. Add more comments in various places (Pablo).
3. hash: In the key insertion function, move duplicate-key checking to
an earlier location and protect it using locks. Checking for duplicate
keys should happen first, and data updates should be protected.
4. hash: In the bulk lookup function, put the signature comparison under
the lock, since writes could happen between the signature matches on the
two buckets.
5. hash: Add write locks to the reset function as well to protect resets.
6. test: Fix 32-bit compilation error in the read-write test (Pablo).
7. test: Check the total physical core count in the read-write test.
Don't test with a thread count larger than the physical core count.
8. Other minor fixes such as typos (Pablo).
Yipeng Wang (8):
hash: fix multiwriter lock memory allocation
hash: fix a multi-writer race condition
hash: fix key slot size accuracy
hash: make duplicated code into functions
hash: add read and write concurrency support
test: add tests in hash table perf test
test: add test case for read write concurrency
hash: add new API function to query the key count
lib/librte_hash/meson.build | 1 -
lib/librte_hash/rte_cuckoo_hash.c | 701 +++++++++++++++++++++-------------
lib/librte_hash/rte_cuckoo_hash.h | 18 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 164 --------
lib/librte_hash/rte_hash.h | 14 +
lib/librte_hash/rte_hash_version.map | 8 +
test/test/Makefile | 1 +
test/test/test_hash.c | 12 +
test/test/test_hash_multiwriter.c | 8 +
test/test/test_hash_perf.c | 36 +-
test/test/test_hash_readwrite.c | 637 ++++++++++++++++++++++++++++++
11 files changed, 1155 insertions(+), 445 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
create mode 100644 test/test/test_hash_readwrite.c
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v5 1/8] hash: fix multiwriter lock memory allocation
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
@ 2018-07-10 16:59 ` Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 2/8] hash: fix a multi-writer race condition Yipeng Wang
` (7 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-10 16:59 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
When allocating multiwriter_lock with rte_malloc, the align
argument should be RTE_CACHE_LINE_SIZE rather than LCORE_CACHE_SIZE.
There should also be a check to verify that the rte_malloc call
succeeded.
Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
Cc: stable@dpdk.org
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index a07543a..80dcf41 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -281,7 +281,10 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->add_key = ADD_KEY_MULTIWRITER;
h->multiwriter_lock = rte_malloc(NULL,
sizeof(rte_spinlock_t),
- LCORE_CACHE_SIZE);
+ RTE_CACHE_LINE_SIZE);
+ if (h->multiwriter_lock == NULL)
+ goto err_unlock;
+
rte_spinlock_init(h->multiwriter_lock);
}
} else
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v5 2/8] hash: fix a multi-writer race condition
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
@ 2018-07-10 16:59 ` Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 3/8] hash: fix key slot size accuracy Yipeng Wang
` (6 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-10 16:59 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
The current multi-writer implementation uses Intel TSX to
protect the cuckoo path move but not the cuckoo path
search. After searching, we need to verify again, at the
beginning of the TSX region, that the same empty slot still
exists; otherwise another writer could have occupied it before
the TSX region was entered. The current code does not verify this.
Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
Cc: stable@dpdk.org
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
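To make the race window concrete, a rough sketch of the interleaving
being fixed (the writer names are illustrative, not from the patch):

	/* Writer A: searches a cuckoo path, finds
	 * curr_bkt->key_idx[curr_slot] == EMPTY_SLOT, then is preempted
	 * before rte_xbegin().
	 * Writer B: inserts a key into that same slot and commits.
	 * Writer A: enters the TSX region and, without the re-check
	 * below, would move an entry onto B's occupied slot.
	 */
	if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT)
		rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);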
lib/librte_hash/rte_cuckoo_hash_x86.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_hash/rte_cuckoo_hash_x86.h b/lib/librte_hash/rte_cuckoo_hash_x86.h
index 2c5b017..981d7bd 100644
--- a/lib/librte_hash/rte_cuckoo_hash_x86.h
+++ b/lib/librte_hash/rte_cuckoo_hash_x86.h
@@ -66,6 +66,9 @@ rte_hash_cuckoo_move_insert_mw_tm(const struct rte_hash *h,
while (try < RTE_HASH_TSX_MAX_RETRY) {
status = rte_xbegin();
if (likely(status == RTE_XBEGIN_STARTED)) {
+ /* In case empty slot was gone before entering TSX */
+ if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT)
+ rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
while (likely(curr_node->prev != NULL)) {
prev_node = curr_node->prev;
prev_bkt = prev_node->bkt;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v5 3/8] hash: fix key slot size accuracy
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 2/8] hash: fix a multi-writer race condition Yipeng Wang
@ 2018-07-10 16:59 ` Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 4/8] hash: make duplicated code into functions Yipeng Wang
` (5 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-10 16:59 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit calculates the needed number of key slots more
accurately. The previous local cache fix required
the free slot ring to be larger than actually needed,
but the calculation of that value was inaccurate.
Fixes: 5915699153d7 ("hash: fix scaling by reducing contention")
Cc: stable@dpdk.org
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
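For illustration only (the constants below are hypothetical, not taken
from this patch), the corrected sizing assumes each lcore cache except
the first can hold at most LCORE_CACHE_SIZE - 1 slot indices:

	/* e.g. entries = 1024, RTE_MAX_LCORE = 128, LCORE_CACHE_SIZE = 64 */
	unsigned int num_key_slots = 1024 + (128 - 1) * (64 - 1) + 1; /* 9026 */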
lib/librte_hash/rte_cuckoo_hash.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 80dcf41..11602af 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -126,13 +126,13 @@ rte_hash_create(const struct rte_hash_parameters *params)
* except for the first cache
*/
num_key_slots = params->entries + (RTE_MAX_LCORE - 1) *
- LCORE_CACHE_SIZE + 1;
+ (LCORE_CACHE_SIZE - 1) + 1;
else
num_key_slots = params->entries + 1;
snprintf(ring_name, sizeof(ring_name), "HT_%s", params->name);
/* Create ring (Dummy slot index is not enqueued) */
- r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots - 1),
+ r = rte_ring_create(ring_name, rte_align32pow2(num_key_slots),
params->socket_id, 0);
if (r == NULL) {
RTE_LOG(ERR, HASH, "memory allocation failed\n");
@@ -291,7 +291,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->add_key = ADD_KEY_SINGLEWRITER;
/* Populate free slots ring. Entry zero is reserved for key misses. */
- for (i = 1; i < params->entries + 1; i++)
+ for (i = 1; i < num_key_slots; i++)
rte_ring_sp_enqueue(r, (void *)((uintptr_t) i));
te->data = (void *) h;
@@ -373,7 +373,7 @@ void
rte_hash_reset(struct rte_hash *h)
{
void *ptr;
- unsigned i;
+ uint32_t tot_ring_cnt, i;
if (h == NULL)
return;
@@ -386,7 +386,13 @@ rte_hash_reset(struct rte_hash *h)
rte_pause();
/* Repopulate the free slots ring. Entry zero is reserved for key misses */
- for (i = 1; i < h->entries + 1; i++)
+ if (h->hw_trans_mem_support)
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ else
+ tot_ring_cnt = h->entries;
+
+ for (i = 1; i < tot_ring_cnt + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
if (h->hw_trans_mem_support) {
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v5 4/8] hash: make duplicated code into functions
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
` (2 preceding siblings ...)
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 3/8] hash: fix key slot size accuracy Yipeng Wang
@ 2018-07-10 16:59 ` Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support Yipeng Wang
` (4 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-10 16:59 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit refactors the hash table lookup/add/del code
to remove some code duplication. The processing done on the primary
bucket can also be applied to the secondary bucket with the same code.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 182 +++++++++++++++++++-------------------
1 file changed, 89 insertions(+), 93 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 11602af..b812f33 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -485,6 +485,33 @@ enqueue_slot_back(const struct rte_hash *h,
rte_ring_sp_enqueue(h->free_slots, slot_id);
}
+/* Search a key from bucket and update its data */
+static inline int32_t
+search_and_update(const struct rte_hash *h, void *data, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig, hash_sig_t alt_hash)
+{
+ int i;
+ struct rte_hash_key *k, *keys = h->key_store;
+
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (bkt->sig_current[i] == sig &&
+ bkt->sig_alt[i] == alt_hash) {
+ k = (struct rte_hash_key *) ((char *)keys +
+ bkt->key_idx[i] * h->key_entry_size);
+ if (rte_hash_cmp_eq(key, k->key, h) == 0) {
+ /* Update data */
+ k->pdata = data;
+ /*
+ * Return index where key is stored,
+ * subtracting the first dummy index
+ */
+ return bkt->key_idx[i] - 1;
+ }
+ }
+ }
+ return -1;
+}
+
static inline int32_t
__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
hash_sig_t sig, void *data)
@@ -493,7 +520,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
uint32_t prim_bucket_idx, sec_bucket_idx;
unsigned i;
struct rte_hash_bucket *prim_bkt, *sec_bkt;
- struct rte_hash_key *new_k, *k, *keys = h->key_store;
+ struct rte_hash_key *new_k, *keys = h->key_store;
void *slot_id = NULL;
uint32_t new_idx;
int ret;
@@ -547,46 +574,14 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
new_idx = (uint32_t)((uintptr_t) slot_id);
/* Check if key is already inserted in primary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (prim_bkt->sig_current[i] == sig &&
- prim_bkt->sig_alt[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- prim_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = prim_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
- }
+ ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
+ if (ret != -1)
+ goto failure;
/* Check if key is already inserted in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (sec_bkt->sig_alt[i] == sig &&
- sec_bkt->sig_current[i] == alt_hash) {
- k = (struct rte_hash_key *) ((char *)keys +
- sec_bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Enqueue index of free slot back in the ring. */
- enqueue_slot_back(h, cached_free_slots, slot_id);
- /* Update data */
- k->pdata = data;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = sec_bkt->key_idx[i] - 1;
- goto failure;
- }
- }
- }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1)
+ goto failure;
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
@@ -699,20 +694,15 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data)
else
return ret;
}
+
+/* Search one bucket to find the match key */
static inline int32_t
-__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig, void **data)
+search_one_bucket(const struct rte_hash *h, const void *key, hash_sig_t sig,
+ void **data, const struct rte_hash_bucket *bkt)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
+ int i;
struct rte_hash_key *k, *keys = h->key_store;
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
-
- /* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
if (bkt->sig_current[i] == sig &&
bkt->key_idx[i] != EMPTY_SLOT) {
@@ -729,6 +719,26 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig, void **data)
+{
+ uint32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int ret;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+
+ /* Check if key is in primary location */
+ ret = search_one_bucket(h, key, sig, data, bkt);
+ if (ret != -1)
+ return ret;
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
@@ -736,22 +746,9 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
bkt = &h->buckets[bucket_idx];
/* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->sig_alt[i] == sig) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- if (data != NULL)
- *data = k->pdata;
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- return bkt->key_idx[i] - 1;
- }
- }
- }
+ ret = search_one_bucket(h, key, alt_hash, data, bkt);
+ if (ret != -1)
+ return ret;
return -ENOENT;
}
@@ -815,20 +812,15 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
}
}
+/* Search one bucket and remove the matched key */
static inline int32_t
-__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
- hash_sig_t sig)
+search_and_remove(const struct rte_hash *h, const void *key,
+ struct rte_hash_bucket *bkt, hash_sig_t sig)
{
- uint32_t bucket_idx;
- hash_sig_t alt_hash;
- unsigned i;
- struct rte_hash_bucket *bkt;
struct rte_hash_key *k, *keys = h->key_store;
+ unsigned int i;
int32_t ret;
- bucket_idx = sig & h->bucket_bitmask;
- bkt = &h->buckets[bucket_idx];
-
/* Check if key is in primary location */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
if (bkt->sig_current[i] == sig &&
@@ -848,31 +840,35 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
}
}
}
+ return -1;
+}
+
+static inline int32_t
+__rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
+ hash_sig_t sig)
+{
+ uint32_t bucket_idx;
+ hash_sig_t alt_hash;
+ struct rte_hash_bucket *bkt;
+ int32_t ret;
+
+ bucket_idx = sig & h->bucket_bitmask;
+ bkt = &h->buckets[bucket_idx];
+
+ /* look for key in primary bucket */
+ ret = search_and_remove(h, key, bkt, sig);
+ if (ret != -1)
+ return ret;
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
- /* Check if key is in secondary location */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == alt_hash &&
- bkt->key_idx[i] != EMPTY_SLOT) {
- k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
- if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- remove_entry(h, bkt, i);
-
- /*
- * Return index where key is stored,
- * subtracting the first dummy index
- */
- ret = bkt->key_idx[i] - 1;
- bkt->key_idx[i] = EMPTY_SLOT;
- return ret;
- }
- }
- }
+ /* look for key in secondary bucket */
+ ret = search_and_remove(h, key, bkt, alt_hash);
+ if (ret != -1)
+ return ret;
return -ENOENT;
}
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
` (3 preceding siblings ...)
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 4/8] hash: make duplicated code into functions Yipeng Wang
@ 2018-07-10 16:59 ` Yipeng Wang
2018-07-11 20:49 ` Stephen Hemminger
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 6/8] test: add tests in hash table perf test Yipeng Wang
` (3 subsequent siblings)
8 siblings, 1 reply; 65+ messages in thread
From: Yipeng Wang @ 2018-07-10 16:59 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
The existing implementation of librte_hash does not support read-write
concurrency. This commit implements read-write safety using rte_rwlock,
and the TM version of rte_rwlock if hardware transactional memory is
available. Both multi-writer and read-write concurrency are now protected
by rte_rwlock. The x86-specific header file is removed since the
x86-specific RTM functions are no longer called directly by rte_hash.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
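As a sketch of how an application might choose the flags (mirroring
init_params() in the read-write test later in this series; the helper
name is made up):

	#include <rte_hash.h>
	#include <rte_spinlock.h>	/* for rte_tm_supported() */

	static uint8_t
	pick_hash_flags(void)
	{
		/* Use the TM (lock elision) rwlock path only when the CPU
		 * supports hardware transactional memory; plain rwlocks
		 * protect the table otherwise.
		 */
		if (rte_tm_supported())
			return RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT |
				RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
		return RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
	}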
lib/librte_hash/meson.build | 1 -
lib/librte_hash/rte_cuckoo_hash.c | 520 ++++++++++++++++++++++------------
lib/librte_hash/rte_cuckoo_hash.h | 18 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 167 -----------
lib/librte_hash/rte_hash.h | 3 +
5 files changed, 348 insertions(+), 361 deletions(-)
delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
index e139e1d..efc06ed 100644
--- a/lib/librte_hash/meson.build
+++ b/lib/librte_hash/meson.build
@@ -6,7 +6,6 @@ headers = files('rte_cmp_arm64.h',
'rte_cmp_x86.h',
'rte_crc_arm64.h',
'rte_cuckoo_hash.h',
- 'rte_cuckoo_hash_x86.h',
'rte_fbk_hash.h',
'rte_hash_crc.h',
'rte_hash.h',
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index b812f33..35631cc 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -31,9 +31,6 @@
#include "rte_hash.h"
#include "rte_cuckoo_hash.h"
-#if defined(RTE_ARCH_X86)
-#include "rte_cuckoo_hash_x86.h"
-#endif
TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
@@ -93,8 +90,10 @@ rte_hash_create(const struct rte_hash_parameters *params)
void *buckets = NULL;
char ring_name[RTE_RING_NAMESIZE];
unsigned num_key_slots;
- unsigned hw_trans_mem_support = 0;
unsigned i;
+ unsigned int hw_trans_mem_support = 0, multi_writer_support = 0;
+ unsigned int readwrite_concur_support = 0;
+
rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
hash_list = RTE_TAILQ_CAST(rte_hash_tailq.head, rte_hash_list);
@@ -118,8 +117,16 @@ rte_hash_create(const struct rte_hash_parameters *params)
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
hw_trans_mem_support = 1;
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD)
+ multi_writer_support = 1;
+
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY) {
+ readwrite_concur_support = 1;
+ multi_writer_support = 1;
+ }
+
/* Store all keys and leave the first entry as a dummy entry for lookup_bulk */
- if (hw_trans_mem_support)
+ if (multi_writer_support)
/*
* Increase number of slots by total number of indices
* that can be stored in the lcore caches
@@ -233,7 +240,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->cmp_jump_table_idx = KEY_OTHER_BYTES;
#endif
- if (hw_trans_mem_support) {
+ if (multi_writer_support) {
h->local_free_slots = rte_zmalloc_socket(NULL,
sizeof(struct lcore_cache) * RTE_MAX_LCORE,
RTE_CACHE_LINE_SIZE, params->socket_id);
@@ -261,6 +268,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->key_store = k;
h->free_slots = r;
h->hw_trans_mem_support = hw_trans_mem_support;
+ h->multi_writer_support = multi_writer_support;
+ h->readwrite_concur_support = readwrite_concur_support;
#if defined(RTE_ARCH_X86)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
@@ -271,24 +280,17 @@ rte_hash_create(const struct rte_hash_parameters *params)
#endif
h->sig_cmp_fn = RTE_HASH_COMPARE_SCALAR;
- /* Turn on multi-writer only with explicit flat from user and TM
+ /* Turn on multi-writer only with explicit flag from user and TM
* support.
*/
- if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD) {
- if (h->hw_trans_mem_support) {
- h->add_key = ADD_KEY_MULTIWRITER_TM;
- } else {
- h->add_key = ADD_KEY_MULTIWRITER;
- h->multiwriter_lock = rte_malloc(NULL,
- sizeof(rte_spinlock_t),
- RTE_CACHE_LINE_SIZE);
- if (h->multiwriter_lock == NULL)
- goto err_unlock;
-
- rte_spinlock_init(h->multiwriter_lock);
- }
- } else
- h->add_key = ADD_KEY_SINGLEWRITER;
+ if (h->multi_writer_support) {
+ h->readwrite_lock = rte_malloc(NULL, sizeof(rte_rwlock_t),
+ RTE_CACHE_LINE_SIZE);
+ if (h->readwrite_lock == NULL)
+ goto err_unlock;
+
+ rte_rwlock_init(h->readwrite_lock);
+ }
/* Populate free slots ring. Entry zero is reserved for key misses. */
for (i = 1; i < num_key_slots; i++)
@@ -338,11 +340,10 @@ rte_hash_free(struct rte_hash *h)
rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
- if (h->hw_trans_mem_support)
+ if (h->multi_writer_support) {
rte_free(h->local_free_slots);
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_free(h->multiwriter_lock);
+ rte_free(h->readwrite_lock);
+ }
rte_ring_free(h->free_slots);
rte_free(h->key_store);
rte_free(h->buckets);
@@ -369,6 +370,44 @@ rte_hash_secondary_hash(const hash_sig_t primary_hash)
return primary_hash ^ ((tag + 1) * alt_bits_xor);
}
+/* Read write locks implemented using rte_rwlock */
+static inline void
+__hash_rw_writer_lock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_lock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_lock(h->readwrite_lock);
+}
+
+
+static inline void
+__hash_rw_reader_lock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_lock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_lock(h->readwrite_lock);
+}
+
+static inline void
+__hash_rw_writer_unlock(const struct rte_hash *h)
+{
+ if (h->multi_writer_support && h->hw_trans_mem_support)
+ rte_rwlock_write_unlock_tm(h->readwrite_lock);
+ else if (h->multi_writer_support)
+ rte_rwlock_write_unlock(h->readwrite_lock);
+}
+
+static inline void
+__hash_rw_reader_unlock(const struct rte_hash *h)
+{
+ if (h->readwrite_concur_support && h->hw_trans_mem_support)
+ rte_rwlock_read_unlock_tm(h->readwrite_lock);
+ else if (h->readwrite_concur_support)
+ rte_rwlock_read_unlock(h->readwrite_lock);
+}
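+
+/*
+ * Lock selection summary: writers serialize through the rwlock whenever
+ * multi-writer support is enabled, while readers take it only when
+ * read-write concurrency is requested. With TSX available, the _tm
+ * variants elide the lock and fall back to it on transaction abort.
+ */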
+
void
rte_hash_reset(struct rte_hash *h)
{
@@ -378,6 +417,7 @@ rte_hash_reset(struct rte_hash *h)
if (h == NULL)
return;
+ __hash_rw_writer_lock(h);
memset(h->buckets, 0, h->num_buckets * sizeof(struct rte_hash_bucket));
memset(h->key_store, 0, h->key_entry_size * (h->entries + 1));
@@ -386,7 +426,7 @@ rte_hash_reset(struct rte_hash *h)
rte_pause();
/* Repopulate the free slots ring. Entry zero is reserved for key misses */
- if (h->hw_trans_mem_support)
+ if (h->multi_writer_support)
tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
(LCORE_CACHE_SIZE - 1);
else
@@ -395,77 +435,12 @@ rte_hash_reset(struct rte_hash *h)
for (i = 1; i < tot_ring_cnt + 1; i++)
rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
/* Reset local caches per lcore */
for (i = 0; i < RTE_MAX_LCORE; i++)
h->local_free_slots[i].len = 0;
}
-}
-
-/* Search for an entry that can be pushed to its alternative location */
-static inline int
-make_space_bucket(const struct rte_hash *h, struct rte_hash_bucket *bkt,
- unsigned int *nr_pushes)
-{
- unsigned i, j;
- int ret;
- uint32_t next_bucket_idx;
- struct rte_hash_bucket *next_bkt[RTE_HASH_BUCKET_ENTRIES];
-
- /*
- * Push existing item (search for bucket with space in
- * alternative locations) to its alternative location
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Search for space in alternative locations */
- next_bucket_idx = bkt->sig_alt[i] & h->bucket_bitmask;
- next_bkt[i] = &h->buckets[next_bucket_idx];
- for (j = 0; j < RTE_HASH_BUCKET_ENTRIES; j++) {
- if (next_bkt[i]->key_idx[j] == EMPTY_SLOT)
- break;
- }
-
- if (j != RTE_HASH_BUCKET_ENTRIES)
- break;
- }
-
- /* Alternative location has spare room (end of recursive function) */
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- next_bkt[i]->sig_alt[j] = bkt->sig_current[i];
- next_bkt[i]->sig_current[j] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[j] = bkt->key_idx[i];
- return i;
- }
-
- /* Pick entry that has not been pushed yet */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++)
- if (bkt->flag[i] == 0)
- break;
-
- /* All entries have been pushed, so entry cannot be added */
- if (i == RTE_HASH_BUCKET_ENTRIES || ++(*nr_pushes) > RTE_HASH_MAX_PUSHES)
- return -ENOSPC;
-
- /* Set flag to indicate that this entry is going to be pushed */
- bkt->flag[i] = 1;
-
- /* Need room in alternative bucket to insert the pushed entry */
- ret = make_space_bucket(h, next_bkt[i], nr_pushes);
- /*
- * After recursive function.
- * Clear flags and insert the pushed entry
- * in its alternative location if successful,
- * or return error
- */
- bkt->flag[i] = 0;
- if (ret >= 0) {
- next_bkt[i]->sig_alt[ret] = bkt->sig_current[i];
- next_bkt[i]->sig_current[ret] = bkt->sig_alt[i];
- next_bkt[i]->key_idx[ret] = bkt->key_idx[i];
- return i;
- } else
- return ret;
-
+ __hash_rw_writer_unlock(h);
}
/*
@@ -478,7 +453,7 @@ enqueue_slot_back(const struct rte_hash *h,
struct lcore_cache *cached_free_slots,
void *slot_id)
{
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
cached_free_slots->objs[cached_free_slots->len] = slot_id;
cached_free_slots->len++;
} else
@@ -512,13 +487,207 @@ search_and_update(const struct rte_hash *h, void *data, const void *key,
return -1;
}
+/* Only tries to insert at one bucket (@prim_bkt) without trying to push
+ * buckets around.
+ * return 1 if a matching existing key is found, return 0 on success,
+ * return -1 if no empty entry is available.
+ */
+static inline int32_t
+rte_hash_cuckoo_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *prim_bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ unsigned int i;
+ struct rte_hash_bucket *cur_bkt = prim_bkt;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+ /* Check if key was inserted after last check but before this
+ * protected region in case of inserting duplicated keys.
+ */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ /* Insert new entry if there is room in the primary
+ * bucket.
+ */
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ /* Check if slot is available */
+ if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
+ prim_bkt->sig_current[i] = sig;
+ prim_bkt->sig_alt[i] = alt_hash;
+ prim_bkt->key_idx[i] = new_idx;
+ break;
+ }
+ }
+ __hash_rw_writer_unlock(h);
+
+ if (i != RTE_HASH_BUCKET_ENTRIES)
+ return 0;
+
+ /* no empty entry */
+ return -1;
+}
+
+/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
+ * the path head with new entry (sig, alt_hash, new_idx)
+ * return 1 if a matching key is found, return -1 if the cuckoo path is
+ * invalidated and the insert fails, return 0 on success.
+ */
+static inline int
+rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *alt_bkt,
+ const struct rte_hash_key *key, void *data,
+ struct queue_node *leaf, uint32_t leaf_slot,
+ hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx,
+ int32_t *ret_val)
+{
+ uint32_t prev_alt_bkt_idx;
+ struct rte_hash_bucket *cur_bkt = bkt;
+ struct queue_node *prev_node, *curr_node = leaf;
+ struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
+ uint32_t prev_slot, curr_slot = leaf_slot;
+ int32_t ret;
+
+ __hash_rw_writer_lock(h);
+
+ /* In case empty slot was gone before entering protected region */
+ if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT) {
+ __hash_rw_writer_unlock(h);
+ return -1;
+ }
+
+ /* Check if key was inserted after last check but before this
+ * protected region.
+ */
+ ret = search_and_update(h, data, key, cur_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ ret = search_and_update(h, data, key, alt_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ *ret_val = ret;
+ return 1;
+ }
+
+ while (likely(curr_node->prev != NULL)) {
+ prev_node = curr_node->prev;
+ prev_bkt = prev_node->bkt;
+ prev_slot = curr_node->prev_slot;
+
+ prev_alt_bkt_idx =
+ prev_bkt->sig_alt[prev_slot] & h->bucket_bitmask;
+
+ if (unlikely(&h->buckets[prev_alt_bkt_idx]
+ != curr_bkt)) {
+ /* revert it to empty, otherwise duplicated keys */
+ curr_bkt->key_idx[curr_slot] = EMPTY_SLOT;
+ __hash_rw_writer_unlock(h);
+ return -1;
+ }
+
+ /* Need to swap current/alt sig to allow later
+ * Cuckoo insert to move elements back to its
+ * primary bucket if available
+ */
+ curr_bkt->sig_alt[curr_slot] =
+ prev_bkt->sig_current[prev_slot];
+ curr_bkt->sig_current[curr_slot] =
+ prev_bkt->sig_alt[prev_slot];
+ curr_bkt->key_idx[curr_slot] =
+ prev_bkt->key_idx[prev_slot];
+
+ curr_slot = prev_slot;
+ curr_node = prev_node;
+ curr_bkt = curr_node->bkt;
+ }
+
+ curr_bkt->sig_current[curr_slot] = sig;
+ curr_bkt->sig_alt[curr_slot] = alt_hash;
+ curr_bkt->key_idx[curr_slot] = new_idx;
+
+ __hash_rw_writer_unlock(h);
+
+ return 0;
+
+}
+
+/*
+ * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
+ * Cuckoo
+ */
+static inline int
+rte_hash_cuckoo_make_space_mw(const struct rte_hash *h,
+ struct rte_hash_bucket *bkt,
+ struct rte_hash_bucket *sec_bkt,
+ const struct rte_hash_key *key, void *data,
+ hash_sig_t sig, hash_sig_t alt_hash,
+ uint32_t new_idx, int32_t *ret_val)
+{
+ unsigned int i;
+ struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
+ struct queue_node *tail, *head;
+ struct rte_hash_bucket *curr_bkt, *alt_bkt;
+
+ tail = queue;
+ head = queue + 1;
+ tail->bkt = bkt;
+ tail->prev = NULL;
+ tail->prev_slot = -1;
+
+ /* Cuckoo bfs Search */
+ while (likely(tail != head && head <
+ queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
+ RTE_HASH_BUCKET_ENTRIES)) {
+ curr_bkt = tail->bkt;
+ for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+ if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
+ int32_t ret = rte_hash_cuckoo_move_insert_mw(h,
+ bkt, sec_bkt, key, data,
+ tail, i, sig, alt_hash,
+ new_idx, ret_val);
+ if (likely(ret != -1))
+ return ret;
+ }
+
+ /* Enqueue new node and keep prev node info */
+ alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
+ & h->bucket_bitmask]);
+ head->bkt = alt_bkt;
+ head->prev = tail;
+ head->prev_slot = i;
+ head++;
+ }
+ tail++;
+ }
+
+ return -ENOSPC;
+}
+
static inline int32_t
__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
hash_sig_t sig, void *data)
{
hash_sig_t alt_hash;
uint32_t prim_bucket_idx, sec_bucket_idx;
- unsigned i;
struct rte_hash_bucket *prim_bkt, *sec_bkt;
struct rte_hash_key *new_k, *keys = h->key_store;
void *slot_id = NULL;
@@ -527,10 +696,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
unsigned n_slots;
unsigned lcore_id;
struct lcore_cache *cached_free_slots = NULL;
- unsigned int nr_pushes = 0;
-
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_lock(h->multiwriter_lock);
+ int32_t ret_val;
prim_bucket_idx = sig & h->bucket_bitmask;
prim_bkt = &h->buckets[prim_bucket_idx];
@@ -541,8 +707,24 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
sec_bkt = &h->buckets[sec_bucket_idx];
rte_prefetch0(sec_bkt);
- /* Get a new slot for storing the new key */
- if (h->hw_trans_mem_support) {
+ /* Check if key is already inserted in primary location */
+ __hash_rw_writer_lock(h);
+ ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ return ret;
+ }
+
+ /* Check if key is already inserted in secondary location */
+ ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
+ return ret;
+ }
+ __hash_rw_writer_unlock(h);
+
+ /* Did not find a match, so get a new slot for storing the new key */
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Try to get a free slot from the local cache */
@@ -552,8 +734,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
cached_free_slots->objs,
LCORE_CACHE_SIZE, NULL);
if (n_slots == 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
cached_free_slots->len += n_slots;
@@ -564,92 +745,50 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
slot_id = cached_free_slots->objs[cached_free_slots->len];
} else {
if (rte_ring_sc_dequeue(h->free_slots, &slot_id) != 0) {
- ret = -ENOSPC;
- goto failure;
+ return -ENOSPC;
}
}
new_k = RTE_PTR_ADD(keys, (uintptr_t)slot_id * h->key_entry_size);
- rte_prefetch0(new_k);
new_idx = (uint32_t)((uintptr_t) slot_id);
-
- /* Check if key is already inserted in primary location */
- ret = search_and_update(h, data, key, prim_bkt, sig, alt_hash);
- if (ret != -1)
- goto failure;
-
- /* Check if key is already inserted in secondary location */
- ret = search_and_update(h, data, key, sec_bkt, alt_hash, sig);
- if (ret != -1)
- goto failure;
-
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
new_k->pdata = data;
-#if defined(RTE_ARCH_X86) /* currently only x86 support HTM */
- if (h->add_key == ADD_KEY_MULTIWRITER_TM) {
- ret = rte_hash_cuckoo_insert_mw_tm(prim_bkt,
- sig, alt_hash, new_idx);
- if (ret >= 0)
- return new_idx - 1;
- /* Primary bucket full, need to make space for new entry */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, prim_bkt, sig,
- alt_hash, new_idx);
+ /* Find an empty slot and insert */
+ ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- if (ret >= 0)
- return new_idx - 1;
+ /* Primary bucket full, need to make space for new entry */
+ ret = rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, data,
+ sig, alt_hash, new_idx, &ret_val);
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
+ }
- /* Also search secondary bucket to get better occupancy */
- ret = rte_hash_cuckoo_make_space_mw_tm(h, sec_bkt, sig,
- alt_hash, new_idx);
+ /* Also search secondary bucket to get better occupancy */
+ ret = rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, data,
+ alt_hash, sig, new_idx, &ret_val);
- if (ret >= 0)
- return new_idx - 1;
+ if (ret == 0)
+ return new_idx - 1;
+ else if (ret == 1) {
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret_val;
} else {
-#endif
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
-
- if (i != RTE_HASH_BUCKET_ENTRIES) {
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-
- /* Primary bucket full, need to make space for new entry
- * After recursive function.
- * Insert the new entry in the position of the pushed entry
- * if successful or return error and
- * store the new slot back in the ring
- */
- ret = make_space_bucket(h, prim_bkt, &nr_pushes);
- if (ret >= 0) {
- prim_bkt->sig_current[ret] = sig;
- prim_bkt->sig_alt[ret] = alt_hash;
- prim_bkt->key_idx[ret] = new_idx;
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return new_idx - 1;
- }
-#if defined(RTE_ARCH_X86)
+ enqueue_slot_back(h, cached_free_slots, slot_id);
+ return ret;
}
-#endif
- /* Error in addition, store new slot back in the ring and return error */
- enqueue_slot_back(h, cached_free_slots, (void *)((uintptr_t) new_idx));
-
-failure:
- if (h->add_key == ADD_KEY_MULTIWRITER)
- rte_spinlock_unlock(h->multiwriter_lock);
- return ret;
}
int32_t
@@ -734,12 +873,14 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
bucket_idx = sig & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
+ __hash_rw_reader_lock(h);
/* Check if key is in primary location */
ret = search_one_bucket(h, key, sig, data, bkt);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
return ret;
-
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
bucket_idx = alt_hash & h->bucket_bitmask;
@@ -747,9 +888,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
/* Check if key is in secondary location */
ret = search_one_bucket(h, key, alt_hash, data, bkt);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
return ret;
-
+ }
+ __hash_rw_reader_unlock(h);
return -ENOENT;
}
@@ -791,7 +934,7 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
bkt->sig_current[i] = NULL_SIGNATURE;
bkt->sig_alt[i] = NULL_SIGNATURE;
- if (h->hw_trans_mem_support) {
+ if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
/* Cache full, need to free it. */
@@ -855,10 +998,13 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
bucket_idx = sig & h->bucket_bitmask;
bkt = &h->buckets[bucket_idx];
+ __hash_rw_writer_lock(h);
/* look for key in primary bucket */
ret = search_and_remove(h, key, bkt, sig);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
return ret;
+ }
/* Calculate secondary hash */
alt_hash = rte_hash_secondary_hash(sig);
@@ -867,9 +1013,12 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
/* look for key in secondary bucket */
ret = search_and_remove(h, key, bkt, alt_hash);
- if (ret != -1)
+ if (ret != -1) {
+ __hash_rw_writer_unlock(h);
return ret;
+ }
+ __hash_rw_writer_unlock(h);
return -ENOENT;
}
@@ -1011,6 +1160,7 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
rte_prefetch0(secondary_bkt[i]);
}
+ __hash_rw_reader_lock(h);
/* Compare signatures and prefetch key slot of first hit */
for (i = 0; i < num_keys; i++) {
compare_signatures(&prim_hitmask[i], &sec_hitmask[i],
@@ -1093,6 +1243,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
continue;
}
+ __hash_rw_reader_unlock(h);
+
if (hit_mask != NULL)
*hit_mask = hits;
}
@@ -1151,7 +1303,7 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
bucket_idx = *next / RTE_HASH_BUCKET_ENTRIES;
idx = *next % RTE_HASH_BUCKET_ENTRIES;
}
-
+ __hash_rw_reader_lock(h);
/* Get position of entry in key table */
position = h->buckets[bucket_idx].key_idx[idx];
next_key = (struct rte_hash_key *) ((char *)h->key_store +
@@ -1160,6 +1312,8 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
*key = next_key->key;
*data = next_key->pdata;
+ __hash_rw_reader_unlock(h);
+
/* Increment iterator */
(*next)++;
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index 7a54e55..db4d1a0 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -88,11 +88,6 @@ const rte_hash_cmp_eq_t cmp_jump_table[NUM_KEY_CMP_CASES] = {
#endif
-enum add_key_case {
- ADD_KEY_SINGLEWRITER = 0,
- ADD_KEY_MULTIWRITER,
- ADD_KEY_MULTIWRITER_TM,
-};
/** Number of items per bucket. */
#define RTE_HASH_BUCKET_ENTRIES 8
@@ -155,18 +150,20 @@ struct rte_hash {
struct rte_ring *free_slots;
/**< Ring that stores all indexes of the free slots in the key table */
- uint8_t hw_trans_mem_support;
- /**< Hardware transactional memory support */
+
struct lcore_cache *local_free_slots;
/**< Local cache per lcore, storing some indexes of the free slots */
- enum add_key_case add_key; /**< Multi-writer hash add behavior */
-
- rte_spinlock_t *multiwriter_lock; /**< Multi-writer spinlock for w/o TM */
/* Fields used in lookup */
uint32_t key_len __rte_cache_aligned;
/**< Length of hash key. */
+ uint8_t hw_trans_mem_support;
+ /**< If hardware transactional memory is used. */
+ uint8_t multi_writer_support;
+ /**< If multi-writer support is enabled. */
+ uint8_t readwrite_concur_support;
+ /**< If read-write concurrency support is enabled */
rte_hash_function hash_func; /**< Function used to calculate hash. */
uint32_t hash_func_init_val; /**< Init value used by hash_func. */
rte_hash_cmp_eq_t rte_hash_custom_cmp_eq;
@@ -184,6 +181,7 @@ struct rte_hash {
/**< Table with buckets storing all the hash values and key indexes
* to the key table.
*/
+ rte_rwlock_t *readwrite_lock; /**< Read-write lock thread-safety. */
} __rte_cache_aligned;
struct queue_node {
diff --git a/lib/librte_hash/rte_cuckoo_hash_x86.h b/lib/librte_hash/rte_cuckoo_hash_x86.h
deleted file mode 100644
index 981d7bd..0000000
--- a/lib/librte_hash/rte_cuckoo_hash_x86.h
+++ /dev/null
@@ -1,167 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2016 Intel Corporation
- */
-
-/* rte_cuckoo_hash_x86.h
- * This file holds all x86 specific Cuckoo Hash functions
- */
-
-/* Only tries to insert at one bucket (@prim_bkt) without trying to push
- * buckets around
- */
-static inline unsigned
-rte_hash_cuckoo_insert_mw_tm(struct rte_hash_bucket *prim_bkt,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned i, status;
- unsigned try = 0;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- /* Insert new entry if there is room in the primary
- * bucket.
- */
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- /* Check if slot is available */
- if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
- prim_bkt->sig_current[i] = sig;
- prim_bkt->sig_alt[i] = alt_hash;
- prim_bkt->key_idx[i] = new_idx;
- break;
- }
- }
- rte_xend();
-
- if (i != RTE_HASH_BUCKET_ENTRIES)
- return 0;
-
- break; /* break off try loop if transaction commits */
- } else {
- /* If we abort we give up this cuckoo path. */
- try++;
- rte_pause();
- }
- }
-
- return -1;
-}
-
-/* Shift buckets along provided cuckoo_path (@leaf and @leaf_slot) and fill
- * the path head with new entry (sig, alt_hash, new_idx)
- */
-static inline int
-rte_hash_cuckoo_move_insert_mw_tm(const struct rte_hash *h,
- struct queue_node *leaf, uint32_t leaf_slot,
- hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx)
-{
- unsigned try = 0;
- unsigned status;
- uint32_t prev_alt_bkt_idx;
-
- struct queue_node *prev_node, *curr_node = leaf;
- struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
- uint32_t prev_slot, curr_slot = leaf_slot;
-
- while (try < RTE_HASH_TSX_MAX_RETRY) {
- status = rte_xbegin();
- if (likely(status == RTE_XBEGIN_STARTED)) {
- /* In case empty slot was gone before entering TSX */
- if (curr_bkt->key_idx[curr_slot] != EMPTY_SLOT)
- rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
- while (likely(curr_node->prev != NULL)) {
- prev_node = curr_node->prev;
- prev_bkt = prev_node->bkt;
- prev_slot = curr_node->prev_slot;
-
- prev_alt_bkt_idx
- = prev_bkt->sig_alt[prev_slot]
- & h->bucket_bitmask;
-
- if (unlikely(&h->buckets[prev_alt_bkt_idx]
- != curr_bkt)) {
- rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
- }
-
- /* Need to swap current/alt sig to allow later
- * Cuckoo insert to move elements back to its
- * primary bucket if available
- */
- curr_bkt->sig_alt[curr_slot] =
- prev_bkt->sig_current[prev_slot];
- curr_bkt->sig_current[curr_slot] =
- prev_bkt->sig_alt[prev_slot];
- curr_bkt->key_idx[curr_slot]
- = prev_bkt->key_idx[prev_slot];
-
- curr_slot = prev_slot;
- curr_node = prev_node;
- curr_bkt = curr_node->bkt;
- }
-
- curr_bkt->sig_current[curr_slot] = sig;
- curr_bkt->sig_alt[curr_slot] = alt_hash;
- curr_bkt->key_idx[curr_slot] = new_idx;
-
- rte_xend();
-
- return 0;
- }
-
- /* If we abort we give up this cuckoo path, since most likely it's
- * no longer valid as TSX detected data conflict
- */
- try++;
- rte_pause();
- }
-
- return -1;
-}
-
-/*
- * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
- * Cuckoo
- */
-static inline int
-rte_hash_cuckoo_make_space_mw_tm(const struct rte_hash *h,
- struct rte_hash_bucket *bkt,
- hash_sig_t sig, hash_sig_t alt_hash,
- uint32_t new_idx)
-{
- unsigned i;
- struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
- struct queue_node *tail, *head;
- struct rte_hash_bucket *curr_bkt, *alt_bkt;
-
- tail = queue;
- head = queue + 1;
- tail->bkt = bkt;
- tail->prev = NULL;
- tail->prev_slot = -1;
-
- /* Cuckoo bfs Search */
- while (likely(tail != head && head <
- queue + RTE_HASH_BFS_QUEUE_MAX_LEN -
- RTE_HASH_BUCKET_ENTRIES)) {
- curr_bkt = tail->bkt;
- for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (curr_bkt->key_idx[i] == EMPTY_SLOT) {
- if (likely(rte_hash_cuckoo_move_insert_mw_tm(h,
- tail, i, sig,
- alt_hash, new_idx) == 0))
- return 0;
- }
-
- /* Enqueue new node and keep prev node info */
- alt_bkt = &(h->buckets[curr_bkt->sig_alt[i]
- & h->bucket_bitmask]);
- head->bkt = alt_bkt;
- head->prev = tail;
- head->prev_slot = i;
- head++;
- }
- tail++;
- }
-
- return -ENOSPC;
-}
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index f71ca9f..ecb49e4 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -34,6 +34,9 @@ extern "C" {
/** Default behavior of insertion, single writer/multi writer */
#define RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD 0x02
+/** Flag to support reader writer concurrency */
+#define RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY 0x04
+
/** Signature of key that is stored internally. */
typedef uint32_t hash_sig_t;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v5 6/8] test: add tests in hash table perf test
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
` (4 preceding siblings ...)
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support Yipeng Wang
@ 2018-07-10 16:59 ` Yipeng Wang
2018-07-10 17:00 ` [dpdk-dev] [PATCH v5 7/8] test: add test case for read write concurrency Yipeng Wang
` (2 subsequent siblings)
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-10 16:59 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
New code is added to support read-write concurrency for
rte_hash. Due to the newly added code in the critical path,
the perf test is modified to show any performance impact.
It is still a single-thread test.
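The measurement idiom the test relies on, shown in isolation (a sketch;
the table handle and key array are assumed to be set up elsewhere):

#include <rte_cycles.h>
#include <rte_hash.h>

/* Sketch: average lookup cost in cycles, computed the way the perf
 * test does it. 'h' must be a populated table and 'keys' its keys. */
static uint64_t
avg_lookup_cycles(const struct rte_hash *h, uint32_t *keys, unsigned int n)
{
	unsigned int i;
	uint64_t begin = rte_rdtsc_precise();

	for (i = 0; i < n; i++)
		rte_hash_lookup(h, &keys[i]);

	return (rte_rdtsc_precise() - begin) / n;
}

Running this once with extra_flag = 0 and once with the lock flags set
exposes the single-thread overhead the locks add.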
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
test/test/test_hash_perf.c | 36 +++++++++++++++++++++++++-----------
1 file changed, 25 insertions(+), 11 deletions(-)
diff --git a/test/test/test_hash_perf.c b/test/test/test_hash_perf.c
index a81d0c7..33dcb9f 100644
--- a/test/test/test_hash_perf.c
+++ b/test/test/test_hash_perf.c
@@ -76,7 +76,8 @@ static struct rte_hash_parameters ut_params = {
};
static int
-create_table(unsigned with_data, unsigned table_index)
+create_table(unsigned int with_data, unsigned int table_index,
+ unsigned int with_locks)
{
char name[RTE_HASH_NAMESIZE];
@@ -86,6 +87,14 @@ create_table(unsigned with_data, unsigned table_index)
else
sprintf(name, "test_hash%d", hashtest_key_lens[table_index]);
+
+ if (with_locks)
+ ut_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT
+ | RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ ut_params.extra_flag = 0;
+
ut_params.name = name;
ut_params.key_len = hashtest_key_lens[table_index];
ut_params.socket_id = rte_socket_id();
@@ -459,7 +468,7 @@ reset_table(unsigned table_index)
}
static int
-run_all_tbl_perf_tests(unsigned with_pushes)
+run_all_tbl_perf_tests(unsigned int with_pushes, unsigned int with_locks)
{
unsigned i, j, with_data, with_hash;
@@ -468,7 +477,7 @@ run_all_tbl_perf_tests(unsigned with_pushes)
for (with_data = 0; with_data <= 1; with_data++) {
for (i = 0; i < NUM_KEYSIZES; i++) {
- if (create_table(with_data, i) < 0)
+ if (create_table(with_data, i, with_locks) < 0)
return -1;
if (get_input_keys(with_pushes, i) < 0)
@@ -611,15 +620,20 @@ fbk_hash_perf_test(void)
static int
test_hash_perf(void)
{
- unsigned with_pushes;
-
- for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
- if (with_pushes == 0)
- printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ unsigned int with_pushes, with_locks;
+ for (with_locks = 0; with_locks <= 1; with_locks++) {
+ if (with_locks)
+ printf("\nWith locks in the code\n");
else
- printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
- if (run_all_tbl_perf_tests(with_pushes) < 0)
- return -1;
+ printf("\nWithout locks in the code\n");
+ for (with_pushes = 0; with_pushes <= 1; with_pushes++) {
+ if (with_pushes == 0)
+ printf("\nALL ELEMENTS IN PRIMARY LOCATION\n");
+ else
+ printf("\nELEMENTS IN PRIMARY OR SECONDARY LOCATION\n");
+ if (run_all_tbl_perf_tests(with_pushes, with_locks) < 0)
+ return -1;
+ }
}
if (fbk_hash_perf_test() < 0)
return -1;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v5 7/8] test: add test case for read write concurrency
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
` (5 preceding siblings ...)
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 6/8] test: add tests in hash table perf test Yipeng Wang
@ 2018-07-10 17:00 ` Yipeng Wang
2018-07-10 17:00 ` [dpdk-dev] [PATCH v5 8/8] hash: add new API function to query the key count Yipeng Wang
2018-07-12 21:03 ` [dpdk-dev] [PATCH v5 0/8] Add read-write concurrency to rte_hash library Thomas Monjalon
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-10 17:00 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
This commit adds a new test case for testing read/write concurrency.
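The pattern the test exercises looks roughly like this (a sketch; the
table size and per-lcore key slicing are illustrative):

#include <rte_hash.h>
#include <rte_jhash.h>
#include <rte_lcore.h>

/* Create a table that allows concurrent readers and writers. */
static struct rte_hash *
create_rw_table(void)
{
	struct rte_hash_parameters p = {
		.name = "rw_example",	/* hypothetical name */
		.entries = 1024,
		.key_len = sizeof(uint32_t),
		.hash_func = rte_jhash,
		.socket_id = rte_socket_id(),
		.extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY,
	};

	return rte_hash_create(&p);
}

/* Per-lcore worker: add a disjoint slice of keys, verifying each
 * insertion with an immediate lookup; launched on every worker lcore
 * with rte_eal_remote_launch(). */
static int
rw_worker(void *arg)
{
	struct rte_hash *tbl = arg;
	uint32_t i, base = rte_lcore_id() * 256;	/* illustrative slice */

	for (i = base; i < base + 256; i++) {
		if (rte_hash_add_key(tbl, &i) < 0)
			return -1;
		if (rte_hash_lookup(tbl, &i) < 0)
			return -1;
	}
	return 0;
}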
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
test/test/Makefile | 1 +
test/test/test_hash_readwrite.c | 637 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 638 insertions(+)
create mode 100644 test/test/test_hash_readwrite.c
diff --git a/test/test/Makefile b/test/test/Makefile
index eccc8ef..6ce66c9 100644
--- a/test/test/Makefile
+++ b/test/test/Makefile
@@ -113,6 +113,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
diff --git a/test/test/test_hash_readwrite.c b/test/test/test_hash_readwrite.c
new file mode 100644
index 0000000..55ae33d
--- /dev/null
+++ b/test/test/test_hash_readwrite.c
@@ -0,0 +1,637 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <locale.h>
+
+#include <rte_cycles.h>
+#include <rte_hash.h>
+#include <rte_hash_crc.h>
+#include <rte_jhash.h>
+#include <rte_launch.h>
+#include <rte_malloc.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+
+#include "test.h"
+
+#define RTE_RWTEST_FAIL 0
+
+#define TOTAL_ENTRY (16*1024*1024)
+#define TOTAL_INSERT (15*1024*1024)
+
+#define NUM_TEST 3
+unsigned int core_cnt[NUM_TEST] = {2, 4, 8};
+
+struct perf {
+ uint32_t single_read;
+ uint32_t single_write;
+ uint32_t read_only[NUM_TEST];
+ uint32_t write_only[NUM_TEST];
+ uint32_t read_write_r[NUM_TEST];
+ uint32_t read_write_w[NUM_TEST];
+};
+
+static struct perf htm_results, non_htm_results;
+
+struct {
+ uint32_t *keys;
+ uint32_t *found;
+ uint32_t num_insert;
+ uint32_t rounded_tot_insert;
+ struct rte_hash *h;
+} tbl_rw_test_param;
+
+static rte_atomic64_t gcycles;
+static rte_atomic64_t ginsertions;
+
+static rte_atomic64_t gread_cycles;
+static rte_atomic64_t gwrite_cycles;
+
+static rte_atomic64_t greads;
+static rte_atomic64_t gwrites;
+
+static int
+test_hash_readwrite_worker(__attribute__((unused)) void *arg)
+{
+ uint64_t i, offset;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+
+ offset = (lcore_id - rte_get_master_lcore())
+ * tbl_rw_test_param.num_insert;
+
+ printf("Core #%d inserting and reading %d: %'"PRId64" - %'"PRId64"\n",
+ lcore_id, tbl_rw_test_param.num_insert,
+ offset, offset + tbl_rw_test_param.num_insert);
+
+ begin = rte_rdtsc_precise();
+
+ for (i = offset; i < offset + tbl_rw_test_param.num_insert; i++) {
+
+ if (rte_hash_lookup(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i) > 0)
+ break;
+
+ ret = rte_hash_add_key(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i);
+ if (ret < 0)
+ break;
+
+ if (rte_hash_lookup(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i) != ret)
+ break;
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gcycles, cycles);
+ rte_atomic64_add(&ginsertions, i - offset);
+
+ for (; i < offset + tbl_rw_test_param.num_insert; i++)
+ tbl_rw_test_param.keys[i] = RTE_RWTEST_FAIL;
+
+ return 0;
+}
+
+static int
+init_params(int use_htm, int use_jhash)
+{
+ unsigned int i;
+
+ uint32_t *keys = NULL;
+ uint32_t *found = NULL;
+ struct rte_hash *handle;
+
+ struct rte_hash_parameters hash_params = {
+ .entries = TOTAL_ENTRY,
+ .key_len = sizeof(uint32_t),
+ .hash_func_init_val = 0,
+ .socket_id = rte_socket_id(),
+ };
+ if (use_jhash)
+ hash_params.hash_func = rte_jhash;
+ else
+ hash_params.hash_func = rte_hash_crc;
+
+ if (use_htm)
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT |
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ else
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+
+ hash_params.name = "tests";
+
+ handle = rte_hash_create(&hash_params);
+ if (handle == NULL) {
+ printf("hash creation failed");
+ return -1;
+ }
+
+ tbl_rw_test_param.h = handle;
+ keys = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+
+ if (keys == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ found = rte_zmalloc(NULL, sizeof(uint32_t) * TOTAL_ENTRY, 0);
+ if (found == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ goto err;
+ }
+
+ tbl_rw_test_param.keys = keys;
+ tbl_rw_test_param.found = found;
+
+ for (i = 0; i < TOTAL_ENTRY; i++)
+ keys[i] = i;
+
+ return 0;
+
+err:
+ rte_free(keys);
+ rte_hash_free(handle);
+
+ return -1;
+}
+
+static int
+test_hash_readwrite_functional(int use_htm)
+{
+ unsigned int i;
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+ int use_jhash = 1;
+
+ rte_atomic64_init(&gcycles);
+ rte_atomic64_clear(&gcycles);
+
+ rte_atomic64_init(&ginsertions);
+ rte_atomic64_clear(&ginsertions);
+
+ if (init_params(use_htm, use_jhash) != 0)
+ goto err;
+
+ tbl_rw_test_param.num_insert =
+ TOTAL_INSERT / rte_lcore_count();
+
+ tbl_rw_test_param.rounded_tot_insert =
+ tbl_rw_test_param.num_insert
+ * rte_lcore_count();
+
+ printf("++++++++Start function tests:+++++++++\n");
+
+ /* Fire all threads. */
+ rte_eal_mp_remote_launch(test_hash_readwrite_worker,
+ NULL, CALL_MASTER);
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_rw_test_param.h, &next_key,
+ &next_data, &iter) >= 0) {
+ /* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_rw_test_param.found[i]++;
+ }
+
+ for (i = 0; i < tbl_rw_test_param.rounded_tot_insert; i++) {
+ if (tbl_rw_test_param.keys[i] != RTE_RWTEST_FAIL) {
+ if (tbl_rw_test_param.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_rw_test_param.found[i] == 0) {
+ lost_keys++;
+ printf("key %d is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during read-write test.\n");
+
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gcycles) /
+ rte_atomic64_read(&ginsertions);
+
+ printf("cycles per insertion and lookup: %llu\n", cycles_per_insertion);
+
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+ printf("+++++++++Complete function tests+++++++++\n");
+ return 0;
+
+err_free:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+err:
+ return -1;
+}
+
+static int
+test_rw_reader(__attribute__((unused)) void *arg)
+{
+ uint64_t i;
+ uint64_t begin, cycles;
+ uint64_t read_cnt = (uint64_t)((uintptr_t)arg);
+
+ begin = rte_rdtsc_precise();
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ &data);
+ if (i != (uint64_t)(uintptr_t)data) {
+ printf("lookup find wrong value %"PRIu64","
+ "%"PRIu64"\n", i,
+ (uint64_t)(uintptr_t)data);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gread_cycles, cycles);
+ rte_atomic64_add(&greads, i);
+ return 0;
+}
+
+static int
+test_rw_writer(__attribute__((unused)) void *arg)
+{
+ uint64_t i;
+ uint32_t lcore_id = rte_lcore_id();
+ uint64_t begin, cycles;
+ int ret;
+ uint64_t start_coreid = (uint64_t)(uintptr_t)arg;
+ uint64_t offset;
+
+ offset = TOTAL_INSERT / 2 + (lcore_id - start_coreid)
+ * tbl_rw_test_param.num_insert;
+ begin = rte_rdtsc_precise();
+ for (i = offset; i < offset + tbl_rw_test_param.num_insert; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("writer failed %"PRIu64"\n", i);
+ break;
+ }
+ }
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gwrite_cycles, cycles);
+ rte_atomic64_add(&gwrites, tbl_rw_test_param.num_insert);
+ return 0;
+}
+
+static int
+test_hash_readwrite_perf(struct perf *perf_results, int use_htm,
+ int reader_faster)
+{
+ unsigned int n;
+ int ret;
+ int start_coreid;
+ uint64_t i, read_cnt;
+
+ const void *next_key;
+ void *next_data;
+ uint32_t iter = 0;
+ int use_jhash = 0;
+
+ uint32_t duplicated_keys = 0;
+ uint32_t lost_keys = 0;
+
+ uint64_t start = 0, end = 0;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gwrites);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&greads);
+
+ rte_atomic64_init(&gread_cycles);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_init(&gwrite_cycles);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ if (init_params(use_htm, use_jhash) != 0)
+ goto err;
+
+ /*
+ * Run either a readers-finish-faster or a writers-finish-faster test.
+ * When readers finish faster, we time the readers; when writers
+ * finish faster, we time the writers.
+ * Dividing by 10 or 2 is just an experimental choice to vary the
+ * readers' workload.
+ */
+ if (reader_faster) {
+ printf("++++++Start perf test: reader++++++++\n");
+ read_cnt = TOTAL_INSERT / 10;
+ } else {
+ printf("++++++Start perf test: writer++++++++\n");
+ read_cnt = TOTAL_INSERT / 2;
+ }
+
+ /* We first test single thread performance */
+ start = rte_rdtsc_precise();
+ /* Insert half of the keys */
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_write = end / i;
+
+ start = rte_rdtsc_precise();
+
+ for (i = 0; i < read_cnt; i++) {
+ void *data;
+ rte_hash_lookup_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ &data);
+ if (i != (uint64_t)(uintptr_t)data) {
+ printf("lookup find wrong value"
+ " %"PRIu64",%"PRIu64"\n", i,
+ (uint64_t)(uintptr_t)data);
+ break;
+ }
+ }
+ end = rte_rdtsc_precise() - start;
+ perf_results->single_read = end / i;
+
+ for (n = 0; n < NUM_TEST; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < core_cnt[n] * 2 + 1)
+ goto finish;
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_rw_test_param.h);
+
+ tbl_rw_test_param.num_insert = TOTAL_INSERT / 2 / core_cnt[n];
+ tbl_rw_test_param.rounded_tot_insert = TOTAL_INSERT / 2 +
+ tbl_rw_test_param.num_insert *
+ core_cnt[n];
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+ /* Then test multiple thread case but only all reads or
+ * all writes
+ */
+
+ /* Test only reader cases */
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+
+ rte_eal_mp_wait_lcore();
+
+ start_coreid = i;
+ /* Test only writer cases */
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+
+ rte_eal_mp_wait_lcore();
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ perf_results->read_only[n] = cycles_per_insertion;
+ printf("Reader only: cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles) /
+ rte_atomic64_read(&gwrites);
+ perf_results->write_only[n] = cycles_per_insertion;
+ printf("Writer only: cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+ rte_atomic64_clear(&gwrites);
+ rte_atomic64_clear(&gwrite_cycles);
+
+ rte_hash_reset(tbl_rw_test_param.h);
+
+ for (i = 0; i < TOTAL_INSERT / 2; i++) {
+ ret = rte_hash_add_key_data(tbl_rw_test_param.h,
+ tbl_rw_test_param.keys + i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("Failed to insert half of keys\n");
+ goto err_free;
+ }
+ }
+
+ start_coreid = core_cnt[n] + 1;
+
+ if (reader_faster) {
+ for (i = core_cnt[n] + 1; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+ } else {
+ for (i = 1; i <= core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rw_reader,
+ (void *)(uintptr_t)read_cnt, i);
+ for (; i <= core_cnt[n] * 2; i++)
+ rte_eal_remote_launch(test_rw_writer,
+ (void *)((uintptr_t)start_coreid), i);
+ }
+
+ rte_eal_mp_wait_lcore();
+
+ while (rte_hash_iterate(tbl_rw_test_param.h,
+ &next_key, &next_data, &iter) >= 0) {
+ /* Search for the key in the list of keys added. */
+ i = *(const uint32_t *)next_key;
+ tbl_rw_test_param.found[i]++;
+ }
+
+ for (i = 0; i < tbl_rw_test_param.rounded_tot_insert; i++) {
+ if (tbl_rw_test_param.keys[i] != RTE_RWTEST_FAIL) {
+ if (tbl_rw_test_param.found[i] > 1) {
+ duplicated_keys++;
+ break;
+ }
+ if (tbl_rw_test_param.found[i] == 0) {
+ lost_keys++;
+ printf("key %"PRIu64" is lost\n", i);
+ break;
+ }
+ }
+ }
+
+ if (duplicated_keys > 0) {
+ printf("%d key duplicated\n", duplicated_keys);
+ goto err_free;
+ }
+
+ if (lost_keys > 0) {
+ printf("%d key lost\n", lost_keys);
+ goto err_free;
+ }
+
+ printf("No key corrupted during read-write test.\n");
+
+ if (reader_faster) {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ perf_results->read_write_r[n] = cycles_per_insertion;
+ printf("Read-write cycles per lookup: %llu\n",
+ cycles_per_insertion);
+ }
+
+ else {
+ unsigned long long int cycles_per_insertion =
+ rte_atomic64_read(&gwrite_cycles) /
+ rte_atomic64_read(&gwrites);
+ perf_results->read_write_w[n] = cycles_per_insertion;
+ printf("Read-write cycles per writes: %llu\n",
+ cycles_per_insertion);
+ }
+ }
+
+finish:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+ return 0;
+
+err_free:
+ rte_free(tbl_rw_test_param.found);
+ rte_free(tbl_rw_test_param.keys);
+ rte_hash_free(tbl_rw_test_param.h);
+
+err:
+ return -1;
+}
+
+static int
+test_hash_readwrite_main(void)
+{
+ /*
+ * Variables used to choose different tests.
+ * use_htm indicates if hardware transactional memory should be used.
+ * reader_faster indicates if the reader threads should finish earlier
+ * than the writer threads. This is used to time either the reader
+ * threads or the writer threads for performance numbers.
+ */
+ int use_htm, reader_faster;
+
+ if (rte_lcore_count() == 1) {
+ printf("More than one lcore is required "
+ "to do read write test\n");
+ return 0;
+ }
+
+ setlocale(LC_NUMERIC, "");
+
+ if (rte_tm_supported()) {
+ printf("Hardware transactional memory (lock elision) "
+ "is supported\n");
+
+ printf("Test read-write with Hardware transactional memory\n");
+
+ use_htm = 1;
+ if (test_hash_readwrite_functional(use_htm) < 0)
+ return -1;
+
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+ } else {
+ printf("Hardware transactional memory (lock elision) "
+ "is NOT supported\n");
+ }
+
+ printf("Test read-write without Hardware transactional memory\n");
+ use_htm = 0;
+ if (test_hash_readwrite_functional(use_htm) < 0)
+ return -1;
+ reader_faster = 1;
+ if (test_hash_readwrite_perf(&non_htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+ reader_faster = 0;
+ if (test_hash_readwrite_perf(&non_htm_results, use_htm,
+ reader_faster) < 0)
+ return -1;
+
+ printf("Results summary:\n");
+
+ int i;
+
+ printf("single read: %u\n", htm_results.single_read);
+ printf("single write: %u\n", htm_results.single_write);
+ for (i = 0; i < NUM_TEST; i++) {
+ printf("core_cnt: %u\n", core_cnt[i]);
+ printf("HTM:\n");
+ printf("read only: %u\n", htm_results.read_only[i]);
+ printf("write only: %u\n", htm_results.write_only[i]);
+ printf("read-write read: %u\n", htm_results.read_write_r[i]);
+ printf("read-write write: %u\n", htm_results.read_write_w[i]);
+
+ printf("non HTM:\n");
+ printf("read only: %u\n", non_htm_results.read_only[i]);
+ printf("write only: %u\n", non_htm_results.write_only[i]);
+ printf("read-write read: %u\n",
+ non_htm_results.read_write_r[i]);
+ printf("read-write write: %u\n",
+ non_htm_results.read_write_w[i]);
+ }
+
+ return 0;
+}
+
+REGISTER_TEST_COMMAND(hash_readwrite_autotest, test_hash_readwrite_main);
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* [dpdk-dev] [PATCH v5 8/8] hash: add new API function to query the key count
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
` (6 preceding siblings ...)
2018-07-10 17:00 ` [dpdk-dev] [PATCH v5 7/8] test: add test case for read write concurrency Yipeng Wang
@ 2018-07-10 17:00 ` Yipeng Wang
2018-07-12 21:03 ` [dpdk-dev] [PATCH v5 0/8] Add read-write concurrency to rte_hash library Thomas Monjalon
8 siblings, 0 replies; 65+ messages in thread
From: Yipeng Wang @ 2018-07-10 17:00 UTC (permalink / raw)
To: pablo.de.lara.guarch
Cc: dev, yipeng1.wang, bruce.richardson, honnappa.nagarahalli,
vguvva, brijesh.s.singh
Add a new function, rte_hash_count, to return the number of keys that
are currently stored in the hash table. Corresponding test functions are
added into hash_test and hash_multiwriter test.
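Querying the count is a one-liner; a minimal sketch of its use:

#include <stdio.h>
#include <rte_hash.h>

/* Sketch: report how many keys a table currently holds. */
static void
print_occupancy(const struct rte_hash *h)
{
	int32_t cnt = rte_hash_count(h);

	if (cnt < 0)	/* -EINVAL when h is NULL */
		printf("invalid hash table\n");
	else
		printf("%d keys stored\n", cnt);
}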
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 24 ++++++++++++++++++++++++
lib/librte_hash/rte_hash.h | 11 +++++++++++
lib/librte_hash/rte_hash_version.map | 8 ++++++++
test/test/test_hash.c | 12 ++++++++++++
test/test/test_hash_multiwriter.c | 8 ++++++++
5 files changed, 63 insertions(+)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 35631cc..bb67ade 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -370,6 +370,30 @@ rte_hash_secondary_hash(const hash_sig_t primary_hash)
return primary_hash ^ ((tag + 1) * alt_bits_xor);
}
+int32_t
+rte_hash_count(const struct rte_hash *h)
+{
+ uint32_t tot_ring_cnt, cached_cnt = 0;
+ uint32_t i, ret;
+
+ if (h == NULL)
+ return -EINVAL;
+
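+ /* The count is derived rather than tracked: total key slots minus
+  * those still free in the global ring and, with multi-writer support,
+  * minus those parked in the per-lcore free caches. */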
+ if (h->multi_writer_support) {
+ tot_ring_cnt = h->entries + (RTE_MAX_LCORE - 1) *
+ (LCORE_CACHE_SIZE - 1);
+ for (i = 0; i < RTE_MAX_LCORE; i++)
+ cached_cnt += h->local_free_slots[i].len;
+
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots) -
+ cached_cnt;
+ } else {
+ tot_ring_cnt = h->entries;
+ ret = tot_ring_cnt - rte_ring_count(h->free_slots);
+ }
+ return ret;
+}
+
/* Read write locks implemented using rte_rwlock */
static inline void
__hash_rw_writer_lock(const struct rte_hash *h)
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index ecb49e4..1f1a276 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -127,6 +127,17 @@ void
rte_hash_reset(struct rte_hash *h);
/**
+ * Return the number of keys in the hash table
+ * @param h
+ * Hash table to query from
+ * @return
+ * - -EINVAL if parameters are invalid
+ * - A value indicating how many keys were inserted in the table.
+ */
+int32_t
+rte_hash_count(const struct rte_hash *h);
+
+/**
* Add a key-value pair to an existing hash table.
* This operation is not multi-thread safe
* and should only be called from one thread.
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index 52a2576..e216ac8 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -45,3 +45,11 @@ DPDK_16.07 {
rte_hash_get_key_with_position;
} DPDK_2.2;
+
+
+DPDK_18.08 {
+ global:
+
+ rte_hash_count;
+
+} DPDK_16.07;
diff --git a/test/test/test_hash.c b/test/test/test_hash.c
index edf41f5..b3db9fd 100644
--- a/test/test/test_hash.c
+++ b/test/test/test_hash.c
@@ -1103,6 +1103,7 @@ static int test_average_table_utilization(void)
unsigned i, j;
unsigned added_keys, average_keys_added = 0;
int ret;
+ unsigned int cnt;
printf("\n# Running test to determine average utilization"
"\n before adding elements begins to fail\n");
@@ -1121,13 +1122,24 @@ static int test_average_table_utilization(void)
for (i = 0; i < ut_params.key_len; i++)
simple_key[i] = rte_rand() % 255;
ret = rte_hash_add_key(handle, simple_key);
+ if (ret < 0)
+ break;
}
+
if (ret != -ENOSPC) {
printf("Unexpected error when adding keys\n");
rte_hash_free(handle);
return -1;
}
+ cnt = rte_hash_count(handle);
+ if (cnt != added_keys) {
+ printf("rte_hash_count returned wrong value %u, %u,"
+ "%u\n", j, added_keys, cnt);
+ rte_hash_free(handle);
+ return -1;
+ }
+
average_keys_added += added_keys;
/* Reset the table */
diff --git a/test/test/test_hash_multiwriter.c b/test/test/test_hash_multiwriter.c
index ef5fce3..f182f40 100644
--- a/test/test/test_hash_multiwriter.c
+++ b/test/test/test_hash_multiwriter.c
@@ -116,6 +116,7 @@ test_hash_multiwriter(void)
uint32_t duplicated_keys = 0;
uint32_t lost_keys = 0;
+ uint32_t count;
snprintf(name, 32, "test%u", calledCount++);
hash_params.name = name;
@@ -163,6 +164,13 @@ test_hash_multiwriter(void)
NULL, CALL_MASTER);
rte_eal_mp_wait_lcore();
+ count = rte_hash_count(handle);
+ if (count != rounded_nb_total_tsx_insertion) {
+ printf("rte_hash_count returned wrong value %u, %d\n",
+ rounded_nb_total_tsx_insertion, count);
+ goto err3;
+ }
+
while (rte_hash_iterate(handle, &next_key, &next_data, &iter) >= 0) {
/* Search for the key in the list of keys added. */
i = *(const uint32_t *)next_key;
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
` (7 preceding siblings ...)
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 8/8] hash: add new API function to query the key count Yipeng Wang
@ 2018-07-10 18:00 ` Honnappa Nagarahalli
2018-07-12 1:31 ` Wang, Yipeng1
8 siblings, 1 reply; 65+ messages in thread
From: Honnappa Nagarahalli @ 2018-07-10 18:00 UTC (permalink / raw)
To: Yipeng Wang, pablo.de.lara.guarch
Cc: dev, bruce.richardson, vguvva, brijesh.s.singh, nd
Hi Yipeng/Pablo,
Apologies for my delayed comments
-----Original Message-----
From: Yipeng Wang <yipeng1.wang@intel.com>
Sent: Monday, July 9, 2018 5:45 AM
To: pablo.de.lara.guarch@intel.com
Cc: dev@dpdk.org; yipeng1.wang@intel.com; bruce.richardson@intel.com; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
Subject: [PATCH v4 0/8] Add read-write concurrency to rte_hash library
This patch set adds the read-write concurrency support in rte_hash.
A new flag value is added to indicate if read-write concurrency is needed during creation time. Test cases are implemented to do functional and performance tests.
The new concurrency model is based on rte_rwlock. When Intel TSX is available and the users indicate to use it, the TM version of the rte_rwlock will be called. Both multi-writer and read-write concurrency are protected by the rte_rwlock instead of the x86 specific RTM instructions, so the x86 specific header rte_cuckoo_hash_x86.h is removed and the code is infused into the main .c file.
IMO, at a high-level, there are two use cases for the rte_hash library:
1) Writers are on the control plane and readers are on the data plane
2) Writers and readers are both on the data plane
This distinction matters because, in case 1), writers are pre-emptible. On platforms without TSX (I do not know how Intel TSX works), the rte_rwlock implementation is blocking. A writer on the control plane can take the lock and get pre-empted. Since rte_rwlock is used for read-write concurrency, it will block the readers (on the data plane) for an extended duration. I think support for RCU is required to solve this issue.
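To make the concern concrete, a sketch of the scenario (illustrative code, plain rwlock API):

#include <rte_rwlock.h>

static rte_rwlock_t lock = RTE_RWLOCK_INITIALIZER;

/* Control-plane thread: pre-emptible. */
static void
update_table(void)
{
	rte_rwlock_write_lock(&lock);
	/* ... the scheduler may pre-empt the writer here, with the
	 * lock still held ... */
	rte_rwlock_write_unlock(&lock);
}

/* Data-plane lcore: busy-polls and blocks until the writer resumes. */
static void
poll_lookup(void)
{
	rte_rwlock_read_lock(&lock);
	/* ... lookup ... */
	rte_rwlock_read_unlock(&lock);
}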
A new rte_hash_count API is proposed to count how many keys are inserted into the hash table.
v3->v4:
1. Change commit message titles as Pablo suggested. (Pablo)
2. hash: remove unnecessary changes in commit 4. (Pablo)
3. test: remove unnecessary double blank lines. (Pablo)
4. Add Pablo's ack in the commit messages.
v2->v3:
1. hash: Concurrency bug fix: after beginning a cuckoo path move, the last empty slot needs to be verified again, in case other writers raced into this slot and occupied it. A new commit is added for this bug fix since it applies to the master head as well.
2. hash: Concurrency bug fix: if the cuckoo path is detected to be invalid, the current slot needs to be emptied since it is duplicated in its target bucket.
3. hash: "const" is used for types in multiple locations. (Pablo)
4. hash: rte_malloc for the reader-writer lock used the wrong align argument. A similar fix applies to the master head, so a new commit is created. (Pablo)
5. hash: the ring size calculation fix is moved to the front. (Pablo)
6. hash: the search-and-remove function is refactored to be more aligned with the other search functions. (Pablo)
7. test: use jhash in the functional test for read-write concurrency, because jhash with sequential keys incurs more cuckoo path moves.
8. Multiple coding style, typo, and commit message fixes. (Pablo)
v1->v2:
1. Split each commit into two commits for easier review (Pablo).
2. Add more comments in various places (Pablo).
3. hash: In the key insertion function, move the duplicated-key check to an earlier location and protect it with locks. Checking for duplicated keys should happen first, and data updates should be protected.
4. hash: In the bulk lookup function, put the signature comparison under the lock, since writes could happen between the signature matches on the two buckets.
5. hash: Add write locks to the reset function as well to protect resets.
6. test: Fix a 32-bit compilation error in the read-write test (Pablo).
7. test: Check the total physical core count in the read-write test. Don't test with a thread count larger than the physical core count.
8. Other minor fixes such as typos (Pablo).
Yipeng Wang (8):
hash: fix multiwriter lock memory allocation
hash: fix a multi-writer race condition
hash: fix key slot size accuracy
hash: make duplicated code into functions
hash: add read and write concurrency support
test: add tests in hash table perf test
test: add test case for read write concurrency
hash: add new API function to query the key count
lib/librte_hash/meson.build | 1 -
lib/librte_hash/rte_cuckoo_hash.c | 701 +++++++++++++++++++++-------------
lib/librte_hash/rte_cuckoo_hash.h | 18 +-
lib/librte_hash/rte_cuckoo_hash_x86.h | 164 --------
lib/librte_hash/rte_hash.h | 14 +
lib/librte_hash/rte_hash_version.map | 8 +
test/test/Makefile | 1 +
test/test/test_hash.c | 12 +
test/test/test_hash_multiwriter.c | 9 +
test/test/test_hash_perf.c | 36 +-
test/test/test_hash_readwrite.c | 637 ++++++++++++++++++++++++++++++
11 files changed, 1156 insertions(+), 445 deletions(-) delete mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h
create mode 100644 test/test/test_hash_readwrite.c
--
2.7.4
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support Yipeng Wang
@ 2018-07-11 20:49 ` Stephen Hemminger
2018-07-12 1:22 ` Wang, Yipeng1
0 siblings, 1 reply; 65+ messages in thread
From: Stephen Hemminger @ 2018-07-11 20:49 UTC (permalink / raw)
To: Yipeng Wang
Cc: pablo.de.lara.guarch, dev, bruce.richardson,
honnappa.nagarahalli, vguvva, brijesh.s.singh
On Tue, 10 Jul 2018 09:59:58 -0700
Yipeng Wang <yipeng1.wang@intel.com> wrote:
> +
> +static inline void
> +__hash_rw_reader_lock(const struct rte_hash *h)
> +{
> + if (h->readwrite_concur_support && h->hw_trans_mem_support)
> + rte_rwlock_read_lock_tm(h->readwrite_lock);
> + else if (h->readwrite_concur_support)
> + rte_rwlock_read_lock(h->readwrite_lock);
> +}
> +
> +static inline void
> +__hash_rw_writer_unlock(const struct rte_hash *h)
> +{
> + if (h->multi_writer_support && h->hw_trans_mem_support)
> + rte_rwlock_write_unlock_tm(h->readwrite_lock);
> + else if (h->multi_writer_support)
> + rte_rwlock_write_unlock(h->readwrite_lock);
> +}
> +
> +static inline void
> +__hash_rw_reader_unlock(const struct rte_hash *h)
> +{
> + if (h->readwrite_concur_support && h->hw_trans_mem_support)
> + rte_rwlock_read_unlock_tm(h->readwrite_lock);
> + else if (h->readwrite_concur_support)
> + rte_rwlock_read_unlock(h->readwrite_lock);
> +}
> +
For small windows, reader-writer locks are slower than a spin lock
because there are more cache bounces.
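For illustration, a typical counter-based read-lock fast path looks roughly like this (a sketch, not DPDK's exact code); note that every reader does an atomic RMW on the same cache line:

#include <stdint.h>

typedef struct {
	volatile int32_t cnt;	/* negative: writer holds it, >= 0: reader count */
} rwlock_t;

static void
read_lock(rwlock_t *rw)
{
	int32_t x;

	do {
		x = rw->cnt;
		/* Each successful CAS pulls the lock's cache line into
		 * this core in exclusive state: the bounce. */
	} while (x < 0 ||
		 !__sync_bool_compare_and_swap(&rw->cnt, x, x + 1));
}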
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support
2018-07-11 20:49 ` Stephen Hemminger
@ 2018-07-12 1:22 ` Wang, Yipeng1
2018-07-12 20:30 ` Thomas Monjalon
0 siblings, 1 reply; 65+ messages in thread
From: Wang, Yipeng1 @ 2018-07-12 1:22 UTC (permalink / raw)
To: Stephen Hemminger
Cc: De Lara Guarch, Pablo, dev, Richardson, Bruce,
honnappa.nagarahalli, vguvva, brijesh.s.singh, Wang, Ren,
Gobriel, Sameh, Tai, Charlie
Hi, Stephen,
You are correct, and we understand that a spinlock might be slightly faster than a counter-based rwlock in this case. However, the counter-based rwlock is only the exception path, taken when TSX fails.
If the performance of this exception path is a big concern, a more optimal read-write lock scheme (e.g. TLRW) should be introduced into rte_rwlock in the future.
Thanks
Yipeng
>-----Original Message-----
>From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>
>For small windows, reader-writer locks are slower than a spin lock
>because there are more cache bounces.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library
2018-07-10 18:00 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Honnappa Nagarahalli
@ 2018-07-12 1:31 ` Wang, Yipeng1
2018-07-12 2:36 ` Honnappa Nagarahalli
0 siblings, 1 reply; 65+ messages in thread
From: Wang, Yipeng1 @ 2018-07-12 1:31 UTC (permalink / raw)
To: Honnappa Nagarahalli, De Lara Guarch, Pablo
Cc: dev, Richardson, Bruce, vguvva, brijesh.s.singh, nd, Tai,
Charlie, Wang, Ren, Gobriel, Sameh
Hi, Honnappa,
Thanks for the comment.
RCU can handle one of the concurrency issues (key deletion during lookup) that has been discussed before, but we think it is not easy for RCU alone to
protect readers from cuckoo path displacement: during a displacement a key moves between buckets, so a reader can miss it in both the old and the new location even though it was never deleted. Also, the RCU solution requires user interaction.
We agree that the current rwlock does not support a preemptible writer, but a more advanced rwlock with priority could be
implemented in the rwlock library in the future.
Thanks
Yipeng
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
>Sent: Tuesday, July 10, 2018 11:00 AM
>To: Wang, Yipeng1 <yipeng1.wang@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com; nd
><nd@arm.com>
>Subject: RE: [PATCH v4 0/8] Add read-write concurrency to rte_hash library
>
>Hi Yipeng/Pablo,
> Apologies for my delayed comments
>
>-----Original Message-----
>From: Yipeng Wang <yipeng1.wang@intel.com>
>Sent: Monday, July 9, 2018 5:45 AM
>To: pablo.de.lara.guarch@intel.com
>Cc: dev@dpdk.org; yipeng1.wang@intel.com; bruce.richardson@intel.com; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
>vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
>Subject: [PATCH v4 0/8] Add read-write concurrency to rte_hash library
>
>This patch set adds the read-write concurrency support in rte_hash.
>A new flag value is added to indicate if read-write concurrency is needed during creation time. Test cases are implemented to do
>functional and performance tests.
>
>The new concurrency model is based on rte_rwlock. When Intel TSX is available and the users indicate to use it, the TM version of the
>rte_rwlock will be called. Both multi-writer and read-write concurrency are protected by the rte_rwlock instead of the x86 specific
>RTM instructions, so the x86 specific header rte_cuckoo_hash_x86.h is removed and the code is infused into the main .c file.
>
>IMO, at a high-level, there are two use cases for the rte_hash library:
>1) Writers are on control plane and data plane are readers
>2) Writers and readers are on data plane
>
>This distinction is required as in the case of 1) writers are pre-emptible. If I consider platforms without TSX (I do not know how Intel
>TSX works), the rte_rwlock implementation is blocking. A writer on the control plane can take the lock and get pre-empted. Since
>rte_rwlock is used for read-write concurrency, it will block the readers (on the data plane) for an extended duration. I think support
>for RCU is required to solve this issue.
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library
2018-07-12 1:31 ` Wang, Yipeng1
@ 2018-07-12 2:36 ` Honnappa Nagarahalli
2018-07-13 1:47 ` Wang, Yipeng1
0 siblings, 1 reply; 65+ messages in thread
From: Honnappa Nagarahalli @ 2018-07-12 2:36 UTC (permalink / raw)
To: Wang, Yipeng1, De Lara Guarch, Pablo
Cc: dev, Richardson, Bruce, vguvva, brijesh.s.singh, nd, Tai,
Charlie, Wang, Ren, Gobriel, Sameh
Hi Yipeng,
I agree with you on RCU. It solves only part of the problem and requires application involvement. Use of atomics is required to solve a few more problems (a schematic sketch follows at the end of this message).
Not solving the preemptible writer issue will change the behavior for existing applications; is that OK?
I will submit a RFC with my ideas.
Thank you,
Honnappa
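A schematic sketch of the kind of atomics-based protection referred to
above (invented for illustration; not the actual RFC): the writer
publishes a filled slot with a release store of its signature, and the
reader checks the signature with an acquire load, so it never uses a
half-written slot:

#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

struct slot {
    uint8_t key[16];
    uint64_t value;
    atomic_uint sig;    /* 0 means empty; always written last */
};

static void
publish(struct slot *s, const uint8_t *key, uint64_t value, unsigned int sig)
{
    memcpy(s->key, key, sizeof(s->key));
    s->value = value;
    /* release: key and value become visible before the signature */
    atomic_store_explicit(&s->sig, sig, memory_order_release);
}

static int
lookup(struct slot *s, const uint8_t *key, unsigned int sig, uint64_t *value)
{
    /* acquire: pairs with the release store in publish() */
    if (atomic_load_explicit(&s->sig, memory_order_acquire) != sig)
        return 0;
    if (memcmp(s->key, key, sizeof(s->key)) != 0)
        return 0;
    *value = s->value;
    return 1;
}

Note that this ordering alone does not handle slot reuse after deletion,
which is where the RCU part of the discussion comes in.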
-----Original Message-----
From: Wang, Yipeng1 <yipeng1.wang@intel.com>
Sent: Wednesday, July 11, 2018 8:31 PM
To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com; nd <nd@arm.com>; Tai, Charlie <charlie.tai@intel.com>; Wang, Ren <ren.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>
Subject: RE: [PATCH v4 0/8] Add read-write concurrency to rte_hash library
Hi, Honnappa,
Thanks for the comment.
RCU can handle one of the concurrency issues (key deletion during lookup) that has been discussed before, but we think it is not easy for RCU alone to protect readers from cuckoo path displacement. Also, the RCU solution requires user interaction.
We agree that the current rwlock does not support preemptable writer. But a more advanced rwlock with priority could be implemented in the future into the rwlock library.
Thanks
Yipeng
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
>Sent: Tuesday, July 10, 2018 11:00 AM
>To: Wang, Yipeng1 <yipeng1.wang@intel.com>; De Lara Guarch, Pablo
><pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>;
>vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com; nd <nd@arm.com>
>Subject: RE: [PATCH v4 0/8] Add read-write concurrency to rte_hash
>library
>
>Hi Yipeng/Pablo,
> Apologies for my delayed comments
>
>-----Original Message-----
>From: Yipeng Wang <yipeng1.wang@intel.com>
>Sent: Monday, July 9, 2018 5:45 AM
>To: pablo.de.lara.guarch@intel.com
>Cc: dev@dpdk.org; yipeng1.wang@intel.com; bruce.richardson@intel.com;
>Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
>vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
>Subject: [PATCH v4 0/8] Add read-write concurrency to rte_hash library
>
>This patch set adds the read-write concurrency support in rte_hash.
>A new flag value is added to indicate if read-write concurrency is
>needed during creation time. Test cases are implemented to do functional and performance tests.
>
>The new concurrency model is based on rte_rwlock. When Intel TSX is
>available and the users indicate to use it, the TM version of the
>rte_rwlock will be called. Both multi-writer and read-write concurrency are protected by the rte_rwlock instead of the x86 specific RTM instructions, so the x86 specific header rte_cuckoo_hash_x86.h is removed and the code is infused into the main .c file.
>
>IMO, at a high-level, there are two use cases for the rte_hash library:
>1) Writers are on control plane and data plane are readers
>2) Writers and readers are on data plane
>
>This distinction is required as in the case of 1) writers are
>pre-emptible. If I consider platforms without TSX (I do not know how
>Intel TSX works), the rte_rwlock implementation is blocking. A writer
>on the control plane can take the lock and get pre-empted. Since rte_rwlock is used for read-write concurrency, it will block the readers (on the data plane) for an extended duration. I think support for RCU is required to solve this issue.
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support
2018-07-12 1:22 ` Wang, Yipeng1
@ 2018-07-12 20:30 ` Thomas Monjalon
2018-07-13 1:55 ` Wang, Yipeng1
0 siblings, 1 reply; 65+ messages in thread
From: Thomas Monjalon @ 2018-07-12 20:30 UTC (permalink / raw)
To: Wang, Yipeng1, Stephen Hemminger
Cc: dev, De Lara Guarch, Pablo, Richardson, Bruce,
honnappa.nagarahalli, vguvva, brijesh.s.singh, Wang, Ren,
Gobriel, Sameh, Tai, Charlie
12/07/2018 03:22, Wang, Yipeng1:
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>
> > For small windows, reader-writer locks are slower than a spin lock
> > because there are more cache bounces.
>
> Hi, Stephen,
>
> You are correct and we understand that spinlock might be slightly faster than counter based rwlock in this case. However, the counter based rwlock is the exception path when TSX fails.
>
> If performance of this exception path is a big concern, a more optimal read-write lock scheme (e.g. TLRW) should be introduced into rte_rwlock in the future.
Something like this?
eal/rwlocks: Try read/write and relock write to read locks added
https://patches.dpdk.org/patch/40254/
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/8] Add read-write concurrency to rte_hash library
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
` (7 preceding siblings ...)
2018-07-10 17:00 ` [dpdk-dev] [PATCH v5 8/8] hash: add new API function to query the key count Yipeng Wang
@ 2018-07-12 21:03 ` Thomas Monjalon
8 siblings, 0 replies; 65+ messages in thread
From: Thomas Monjalon @ 2018-07-12 21:03 UTC (permalink / raw)
To: Yipeng Wang
Cc: dev, pablo.de.lara.guarch, bruce.richardson,
honnappa.nagarahalli, vguvva, brijesh.s.singh
10/07/2018 18:59, Yipeng Wang:
> This patch set adds the read-write concurrency support in rte_hash.
> A new flag value is added to indicate if read-write concurrency is needed
> during creation time. Test cases are implemented to do functional and
> performance tests.
>
> The new concurrency model is based on rte_rwlock. When Intel TSX is
> available and the users indicate to use it, the TM version of the
> rte_rwlock will be called. Both multi-writer and read-write concurrency
> are protected by the rte_rwlock instead of the x86 specific RTM
> instructions, so the x86 specific header rte_cuckoo_hash_x86.h is removed
> and the code is infused into the main .c file.
>
> A new rte_hash_count API is proposed to count how many keys are inserted
> into the hash table.
>
> Yipeng Wang (8):
> hash: fix multiwriter lock memory allocation
> hash: fix a multi-writer race condition
> hash: fix key slot size accuracy
> hash: make duplicated code into functions
> hash: add read and write concurrency support
> test: add tests in hash table perf test
> test: add test case for read write concurrency
> hash: add new API function to query the key count
Applied, thanks
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library
2018-07-12 2:36 ` Honnappa Nagarahalli
@ 2018-07-13 1:47 ` Wang, Yipeng1
0 siblings, 0 replies; 65+ messages in thread
From: Wang, Yipeng1 @ 2018-07-13 1:47 UTC (permalink / raw)
To: Honnappa Nagarahalli, De Lara Guarch, Pablo
Cc: dev, Richardson, Bruce, vguvva, brijesh.s.singh, nd, Tai,
Charlie, Wang, Ren, Gobriel, Sameh
Hi,
I guess your proposal is to let the user application handle all the concurrency using RCU and locks. That could be complementary
and orthogonal to our current implementation. Users could choose this approach if TSX is not available, or if they want more control over the
concurrency implementation. A new flag value could be added to allow this mode.
Please copy us when sending out the RFC; we will be happy to review it.
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
>Sent: Wednesday, July 11, 2018 7:36 PM
>To: Wang, Yipeng1 <yipeng1.wang@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com; nd
><nd@arm.com>; Tai, Charlie <charlie.tai@intel.com>; Wang, Ren <ren.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>
>Subject: RE: [PATCH v4 0/8] Add read-write concurrency to rte_hash library
>
>Hi Yipeng,
> I agree with you on RCU. It solves only part of the problem and requires application involvement. Use of atomics is required to
>solve a few more problems.
>
>Not solving the preemptible writer issue will change the behavior for existing applications; is that OK?
>
>I will submit a RFC with my ideas.
>
>Thank you,
>Honnappa
>
>-----Original Message-----
>From: Wang, Yipeng1 <yipeng1.wang@intel.com>
>Sent: Wednesday, July 11, 2018 8:31 PM
>To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com; nd
><nd@arm.com>; Tai, Charlie <charlie.tai@intel.com>; Wang, Ren <ren.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>
>Subject: RE: [PATCH v4 0/8] Add read-write concurrency to rte_hash library
>
>Hi, Honnappa,
>
>Thanks for the comment.
>
>RCU can handle one of the concurrency issues (key deletion during lookup) that has been discussed before, but we think it is not easy for
>RCU alone to protect readers from cuckoo path displacement. Also, the RCU solution requires user interaction.
>
>We agree that the current rwlock does not support preemptable writer. But a more advanced rwlock with priority could be
>implemented in the future into the rwlock library.
>
>Thanks
>Yipeng
>
>>-----Original Message-----
>>From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
>>Sent: Tuesday, July 10, 2018 11:00 AM
>>To: Wang, Yipeng1 <yipeng1.wang@intel.com>; De Lara Guarch, Pablo
>><pablo.de.lara.guarch@intel.com>
>>Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>;
>>vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com; nd <nd@arm.com>
>>Subject: RE: [PATCH v4 0/8] Add read-write concurrency to rte_hash
>>library
>>
>>Hi Yipeng/Pablo,
>> Apologies for my delayed comments
>>
>>-----Original Message-----
>>From: Yipeng Wang <yipeng1.wang@intel.com>
>>Sent: Monday, July 9, 2018 5:45 AM
>>To: pablo.de.lara.guarch@intel.com
>>Cc: dev@dpdk.org; yipeng1.wang@intel.com; bruce.richardson@intel.com;
>>Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
>>vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com
>>Subject: [PATCH v4 0/8] Add read-write concurrency to rte_hash library
>>
>>This patch set adds the read-write concurrency support in rte_hash.
>>A new flag value is added to indicate if read-write concurrency is
>>needed during creation time. Test cases are implemented to do functional and performance tests.
>>
>>The new concurrency model is based on rte_rwlock. When Intel TSX is
>>available and the users indicate to use it, the TM version of the
>>rte_rwlock will be called. Both multi-writer and read-write concurrency are protected by the rte_rwlock instead of the x86 specific
>RTM instructions, so the x86 specific header rte_cuckoo_hash_x86.h is removed and the code is infused into the main .c file.
>>
>>IMO, at a high-level, there are two use cases for the rte_hash library:
>>1) Writers are on control plane and data plane are readers
>>2) Writers and readers are on data plane
>>
>>This distinction is required as in the case of 1) writers are
>>pre-emptible. If I consider platforms without TSX (I do not know how
>>Intel TSX works), the rte_rwlock implementation is blocking. A writer
>>on the control plane can take the lock and get pre-empted. Since rte_rwlock is used for read-write concurrency, it will block the
>readers (on the data plane) for an extended duration. I think support for RCU is required to solve this issue.
>>
>>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support
2018-07-12 20:30 ` Thomas Monjalon
@ 2018-07-13 1:55 ` Wang, Yipeng1
2018-08-17 12:51 ` ASM
0 siblings, 1 reply; 65+ messages in thread
From: Wang, Yipeng1 @ 2018-07-13 1:55 UTC (permalink / raw)
To: Thomas Monjalon, Stephen Hemminger, myravjev
Cc: dev, De Lara Guarch, Pablo, Richardson, Bruce,
honnappa.nagarahalli, vguvva, brijesh.s.singh, Wang, Ren,
Gobriel, Sameh, Tai, Charlie
Thanks for pointing me to this patch.
I guess the try-locks still do not solve the overhead of multiple readers contending on the counter; they just provide a non-blocking version of the same algorithm.
The relock function looks interesting, but it would be helpful if a specific use case or example were given. Specifically, who should call this function, the reader or the writer? Leonid, could you provide more context?
The TLRW example I gave is potentially a better rw-lock algorithm (paper reference: D. Dice and N. Shavit, "TLRW: Return of the Read-Write Lock"). In such a scheme, readers do not contend on a single shared reader counter, which is what introduces the heavy cache bouncing Stephen mentioned. Maybe we should introduce a similar algorithm into the rte_rwlock library; a simplified sketch of the byte-lock idea follows.
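A simplified sketch of the byte-lock idea from the TLRW paper, written
with C11 atomics rather than any DPDK API; reader slot assignment,
sizing, and writer fairness are all omitted for brevity:

#include <stdatomic.h>
#include <stdbool.h>

#define MAX_READERS 64

struct reader_slot {
    _Alignas(64) atomic_bool active;   /* one cache line per reader */
};

struct byte_lock {
    struct reader_slot readers[MAX_READERS];
    _Alignas(64) atomic_bool writer;
};

static void
read_lock(struct byte_lock *l, unsigned int slot)
{
    for (;;) {
        /* announce this reader in its own slot (no shared counter) */
        atomic_store_explicit(&l->readers[slot].active, true,
                              memory_order_seq_cst);
        if (!atomic_load_explicit(&l->writer, memory_order_seq_cst))
            return;                    /* no writer: section entered */
        /* a writer is active: step back and wait for it to finish */
        atomic_store_explicit(&l->readers[slot].active, false,
                              memory_order_release);
        while (atomic_load_explicit(&l->writer, memory_order_relaxed))
            ;
    }
}

static void
read_unlock(struct byte_lock *l, unsigned int slot)
{
    atomic_store_explicit(&l->readers[slot].active, false,
                          memory_order_release);
}

static void
write_lock(struct byte_lock *l)
{
    bool expected = false;
    /* serialize writers on the single writer flag */
    while (!atomic_compare_exchange_weak(&l->writer, &expected, true))
        expected = false;
    /* drain the readers: wait until every per-reader slot is clear */
    for (unsigned int i = 0; i < MAX_READERS; i++)
        while (atomic_load_explicit(&l->readers[i].active,
                                    memory_order_acquire))
            ;
}

static void
write_unlock(struct byte_lock *l)
{
    atomic_store_explicit(&l->writer, false, memory_order_release);
}

An uncontended reader here touches only its own cache line plus a
read-shared writer flag, instead of all readers bouncing one counter.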
>-----Original Message-----
>From: Thomas Monjalon [mailto:thomas@monjalon.net]
>Sent: Thursday, July 12, 2018 1:30 PM
>To: Wang, Yipeng1 <yipeng1.wang@intel.com>; Stephen Hemminger <stephen@networkplumber.org>
>Cc: dev@dpdk.org; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>;
>honnappa.nagarahalli@arm.com; vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com; Wang, Ren <ren.wang@intel.com>;
>Gobriel, Sameh <sameh.gobriel@intel.com>; Tai, Charlie <charlie.tai@intel.com>
>Subject: Re: [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support
>
>12/07/2018 03:22, Wang, Yipeng1:
>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>>
>> > For small windows, reader-writer locks are slower than a spin lock
>> > because there are more cache bounces.
>>
>> Hi, Stephen,
>>
>> You are correct and we understand that spinlock might be slightly faster than counter based rwlock in this case. However, the
>counter based rwlock is the exception path when TSX fails.
>>
>> If performance of this exception path is a big concern, a more optimal read-write lock scheme (e.g. TLRW) should be introduced into
>rte_rwlock in the future.
>
>Something like this?
> eal/rwlocks: Try read/write and relock write to read locks added
> https://patches.dpdk.org/patch/40254/
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support
2018-07-13 1:55 ` Wang, Yipeng1
@ 2018-08-17 12:51 ` ASM
0 siblings, 0 replies; 65+ messages in thread
From: ASM @ 2018-08-17 12:51 UTC (permalink / raw)
To: yipeng1.wang
Cc: thomas, stephen, dev, pablo.de.lara.guarch, bruce.richardson,
honnappa.nagarahalli, vguvva, brijesh.s.singh, ren.wang,
sameh.gobriel, charlie.tai
> I guess the try-locks still do not solve the overhead of multiple readers contending on the counter; they just provide a non-blocking version of the same algorithm.
The DPDK project does not use rwlocks to solve the overhead problem of
multiple readers anywhere (at least I did not find such a use). For
non-critical sections it is often simply easier to write the code with an
rwlock. The try-locks, the relock-to-read, and the other additions exist
only to simplify such non-critical code further. For example, a process
may need to change data (write lock) and then reread the contents (read
lock); it could release the write lock and take the read lock, but
relocking from write to read is easier (see the sketch below). A process
in the write section can switch to the read context and hand control to
the reader function.
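A sketch of the write-then-reread pattern just described, assuming a
hypothetical rte_rwlock_write_relock_read() downgrade in the spirit of
the patch at https://patches.dpdk.org/patch/40254/ (the name and exact
semantics there may differ):

#include <rte_rwlock.h>

static rte_rwlock_t lock = RTE_RWLOCK_INITIALIZER;
static int shared_value;

static int
update_then_reread(int v)
{
    int observed;

    rte_rwlock_write_lock(&lock);
    shared_value = v;                      /* modify under the write lock */
    rte_rwlock_write_relock_read(&lock);   /* hypothetical downgrade */
    observed = shared_value;               /* reread under the read lock */
    rte_rwlock_read_unlock(&lock);
    return observed;
}

Without the downgrade, the writer would have to unlock fully and then
take the read lock, leaving a window in which another writer could change
the data before it is reread.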
I looked for third-party rwlock solutions and decided that rte_rwlock is
the best choice, especially if it gains a relock-to-read and try-locks.
P.S. To actually solve the overhead problem of multiple readers (when
using locks is absolutely required), the best practice may be an MCS lock,
and an rwlock built on top of MCS locks.
---
Best regards,
Leonid Myravjev
On Fri, 13 Jul 2018 at 04:55, Wang, Yipeng1 <yipeng1.wang@intel.com> wrote:
>
> Thanks for pointing me to this patch.
>
> I guess the try-locks still do not solve the overhead of multiple readers contending on the counter; they just provide a non-blocking version of the same algorithm.
>
> The relock function looks interesting, but it would be helpful if a specific use case or example were given. Specifically, who should call this function, the reader or the writer? Leonid, could you provide more context?
>
> The TLRW example I gave is potentially a better rw-lock algorithm (paper reference: D. Dice and N. Shavit, "TLRW: Return of the Read-Write Lock"). In such a scheme, readers do not contend on a single shared reader counter, which is what introduces the heavy cache bouncing Stephen mentioned. Maybe we should introduce a similar algorithm into the rte_rwlock library.
>
> >-----Original Message-----
> >From: Thomas Monjalon [mailto:thomas@monjalon.net]
> >Sent: Thursday, July 12, 2018 1:30 PM
> >To: Wang, Yipeng1 <yipeng1.wang@intel.com>; Stephen Hemminger <stephen@networkplumber.org>
> >Cc: dev@dpdk.org; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>;
> >honnappa.nagarahalli@arm.com; vguvva@caviumnetworks.com; brijesh.s.singh@gmail.com; Wang, Ren <ren.wang@intel.com>;
> >Gobriel, Sameh <sameh.gobriel@intel.com>; Tai, Charlie <charlie.tai@intel.com>
> >Subject: Re: [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support
> >
> >12/07/2018 03:22, Wang, Yipeng1:
> >> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> >>
> >> > For small windows, reader-writer locks are slower than a spin lock
> >> > because there are more cache bounces.
> >>
> >> Hi, Stephen,
> >>
> >> You are correct and we understand that spinlock might be slightly faster than counter based rwlock in this case. However, the
> >counter based rwlock is the exception path when TSX fails.
> >>
> >> If performance of this exception path is a big concern, a more optimal read-write lock scheme (e.g. TLRW) should be introduced into
> >rte_rwlock in the future.
> >
> >Something like this?
> > eal/rwlocks: Try read/write and relock write to read locks added
> > https://patches.dpdk.org/patch/40254/
> >
> >
>
^ permalink raw reply [flat|nested] 65+ messages in thread
end of thread, other threads:[~2018-08-17 12:52 UTC | newest]
Thread overview: 65+ messages
2018-06-08 10:51 [dpdk-dev] [PATCH v1 0/3] Add read-write concurrency to rte_hash library Yipeng Wang
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 1/3] hash: add read and write concurrency support Yipeng Wang
2018-06-26 14:59 ` De Lara Guarch, Pablo
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 2/3] test: add test case for read write concurrency Yipeng Wang
2018-06-26 15:48 ` De Lara Guarch, Pablo
2018-06-08 10:51 ` [dpdk-dev] [PATCH v1 3/3] hash: add new API function to query the key count Yipeng Wang
2018-06-26 16:11 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 0/6] Add read-write concurrency to rte_hash library Yipeng Wang
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 1/6] hash: make duplicated code into functions Yipeng Wang
2018-07-06 10:04 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 2/6] hash: add read and write concurrency support Yipeng Wang
2018-07-06 17:11 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 3/6] test: add tests in hash table perf test Yipeng Wang
2018-07-06 17:17 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 4/6] test: add test case for read write concurrency Yipeng Wang
2018-07-06 17:31 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 5/6] hash: fix to have more accurate key slot size Yipeng Wang
2018-07-06 17:32 ` De Lara Guarch, Pablo
2018-06-29 12:24 ` [dpdk-dev] [PATCH v2 6/6] hash: add new API function to query the key count Yipeng Wang
2018-07-06 17:36 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
2018-07-09 11:26 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 2/8] hash: fix a multi-writer bug Yipeng Wang
2018-07-09 14:16 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 3/8] hash: fix to have more accurate key slot size Yipeng Wang
2018-07-09 14:20 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 4/8] hash: make duplicated code into functions Yipeng Wang
2018-07-09 14:25 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 5/8] hash: add read and write concurrency support Yipeng Wang
2018-07-09 14:28 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 6/8] test: add tests in hash table perf test Yipeng Wang
2018-07-09 15:33 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 7/8] test: add test case for read write concurrency Yipeng Wang
2018-07-09 16:24 ` De Lara Guarch, Pablo
2018-07-06 19:46 ` [dpdk-dev] [PATCH v3 8/8] hash: add new API function to query the key count Yipeng Wang
2018-07-09 16:22 ` De Lara Guarch, Pablo
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 2/8] hash: fix a multi-writer race condition Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 3/8] hash: fix key slot size accuracy Yipeng Wang
2018-07-09 10:44 ` [dpdk-dev] [PATCH v4 4/8] hash: make duplicated code into functions Yipeng Wang
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 5/8] hash: add read and write concurrency support Yipeng Wang
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 6/8] test: add tests in hash table perf test Yipeng Wang
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 7/8] test: add test case for read write concurrency Yipeng Wang
2018-07-09 10:45 ` [dpdk-dev] [PATCH v4 8/8] hash: add new API function to query the key count Yipeng Wang
2018-07-10 18:00 ` [dpdk-dev] [PATCH v4 0/8] Add read-write concurrency to rte_hash library Honnappa Nagarahalli
2018-07-12 1:31 ` Wang, Yipeng1
2018-07-12 2:36 ` Honnappa Nagarahalli
2018-07-13 1:47 ` Wang, Yipeng1
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 " Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 1/8] hash: fix multiwriter lock memory allocation Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 2/8] hash: fix a multi-writer race condition Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 3/8] hash: fix key slot size accuracy Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 4/8] hash: make duplicated code into functions Yipeng Wang
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 5/8] hash: add read and write concurrency support Yipeng Wang
2018-07-11 20:49 ` Stephen Hemminger
2018-07-12 1:22 ` Wang, Yipeng1
2018-07-12 20:30 ` Thomas Monjalon
2018-07-13 1:55 ` Wang, Yipeng1
2018-08-17 12:51 ` ASM
2018-07-10 16:59 ` [dpdk-dev] [PATCH v5 6/8] test: add tests in hash table perf test Yipeng Wang
2018-07-10 17:00 ` [dpdk-dev] [PATCH v5 7/8] test: add test case for read write concurrency Yipeng Wang
2018-07-10 17:00 ` [dpdk-dev] [PATCH v5 8/8] hash: add new API function to query the key count Yipeng Wang
2018-07-12 21:03 ` [dpdk-dev] [PATCH v5 0/8] Add read-write concurrency to rte_hash library Thomas Monjalon