* [dpdk-dev] [PATCH v3 0/7] Address reader-writer concurrency in rte_hash
@ 2018-10-12 6:31 Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 1/7] hash: separate multi-writer from rw-concurrency Honnappa Nagarahalli
` (6 more replies)
0 siblings, 7 replies; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-12 6:31 UTC (permalink / raw)
To: bruce.richardson, pablo.de.lara.guarch
Cc: dev, yipeng1.wang, honnappa.nagarahalli, dharmik.thakkar, gavin.hu, nd
This patch has dependency on the following patches in the order:
http://patchwork.dpdk.org/cover/45611/
http://patchwork.dpdk.org/project/dpdk/list/?series=1822
Currently, reader-writer concurrency problems in rte_hash are
addressed using reader-writer locks. Use of reader-writer locks
results in following issues:
1) In many of the use cases for the hash table, writer threads
are running on control plane. If the writer is preempted while
holding the lock, it will block the readers for an extended period
resulting in packet drops. This problem seems to apply for platforms
with transactional memory support as well because of the algorithm
used for rte_rwlock_write_lock_tm:
static inline void
rte_rwlock_write_lock_tm(rte_rwlock_t *rwl)
{
if (likely(rte_try_tm(&rwl->cnt)))
return;
rte_rwlock_write_lock(rwl);
}
i.e. there is a posibility of using rte_rwlock_write_lock in
failure cases.
2) Reader-writer lock based solution does not address the following
issue.
rte_hash_lookup_xxx APIs return the index of the element in
the key store. Application(reader) can use that index to reference
other data structures in its scope. Because of this, the
index should not be freed till the application completes
using the index.
3) Since writer blocks all the readers, the hash lookup
rate comes down significantly when there is activity on the writer.
This happens even for unrelated entries. Performance numbers
given below clearly indicate this.
Lock-free solution is required to solve these problems. This patch
series adds the lock-free capabilities in the following steps:
1) Add support to not free the key-store index upon calling
rte_hash_del_xxx APIs. This solves the issue in 2).
2) Correct the alignment for the key store entry to prep for
memory ordering.
3) Add memory ordering to prevent race conditions when a new key
is added to the table.
4) Reader-writer concurrency issue, caused by moving the keys
to their alternate locations during key insert, is solved
by introducing an atomic global counter indicating a change
in table.
5) This solution also has to solve the issue of readers using
key store element even after the key is deleted from
control plane.
To solve this issue, the hash_del_key_xxx APIs do not free
the key store element when lock-free algorithm is enabled.
The key store element has to be freed using the newly introduced
rte_hash_free_key_with_position API. It needs to be called once
all the readers have stopped using the key store element. How this
is determined is outside the scope of this patch (RCU is one such
mechanism that the application can use).
6) Finally, a lock free reader-writer concurrency flag is added
to enable this feature at run time.
Performance numbers can be got from the additional test case added
as part of this patch.
v2->v3
1) Rebased on top of:
http://patchwork.dpdk.org/cover/45611/
http://patchwork.dpdk.org/project/dpdk/list/?series=1822
2) Added comments to RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF
to indicate multi writer support (Yipeng)
3) Updated the comments for rte_hash_add_key_data_xxx APIs
to free the 'data' only after the readers have completed
using 'data' (Yipeng)
4) Extendable tables are not supported when lock free algorithm
is requested.
v1->v2
1) Separate multi-writer capability from rw concurrency
2) Add do not recycle on delete feature (Yipeng)
3) Add Arm copyright
4) Add test case to test lock-free algorithm and multi-writer
test case (Yipeng)
5) Additional API documentation to indicate RCU usage (Yipeng)
6) Additional documentation on rte_hash_reset API (Yipeng)
7) Allocate memory for the global counter and avoid API
changes (Yipeng)
Dharmik Thakkar (1):
test/hash: read-write lock-free concurrency test
Honnappa Nagarahalli (6):
hash: separate multi-writer from rw-concurrency
hash: support do not recycle on delete
hash: correct key store element alignment
hash: add memory ordering to avoid race conditions
hash: fix rw concurrency while moving keys
hash: enable lock-free reader-writer concurrency
lib/librte_hash/rte_cuckoo_hash.c | 498 ++++++++++++----
lib/librte_hash/rte_cuckoo_hash.h | 17 +-
lib/librte_hash/rte_hash.h | 82 ++-
lib/librte_hash/rte_hash_version.map | 7 +
test/test/Makefile | 1 +
test/test/meson.build | 1 +
test/test/test_hash.c | 140 ++++-
test/test/test_hash_readwrite.c | 6 +-
test/test/test_hash_readwrite_lf.c | 1084 ++++++++++++++++++++++++++++++++++
9 files changed, 1704 insertions(+), 132 deletions(-)
create mode 100644 test/test/test_hash_readwrite_lf.c
--
2.7.4
^ permalink raw reply [flat|nested] 20+ messages in thread
* [dpdk-dev] [PATCH v3 1/7] hash: separate multi-writer from rw-concurrency
2018-10-12 6:31 [dpdk-dev] [PATCH v3 0/7] Address reader-writer concurrency in rte_hash Honnappa Nagarahalli
@ 2018-10-12 6:31 ` Honnappa Nagarahalli
2018-10-13 1:02 ` Wang, Yipeng1
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 2/7] hash: support do not recycle on delete Honnappa Nagarahalli
` (5 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-12 6:31 UTC (permalink / raw)
To: bruce.richardson, pablo.de.lara.guarch
Cc: dev, yipeng1.wang, honnappa.nagarahalli, dharmik.thakkar, gavin.hu, nd
RW concurrency is required with single writer and multiple reader
usecase as well. Hence, multi-writer should not be enabled by default when
RW concurrency is enabled.
Fixes: f2e3001b53ec ("hash: support read/write concurrency")
Cc: yipeng1.wang@intel.com
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 27 ++++++++++++++++-----------
lib/librte_hash/rte_cuckoo_hash.h | 2 ++
test/test/test_hash_readwrite.c | 6 ++++--
3 files changed, 22 insertions(+), 13 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 750caf8..7bbe9e9 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -139,6 +139,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
unsigned int hw_trans_mem_support = 0, multi_writer_support = 0;
unsigned int ext_table_support = 0;
unsigned int readwrite_concur_support = 0;
+ unsigned int writer_takes_lock = 0;
rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
@@ -162,12 +163,14 @@ rte_hash_create(const struct rte_hash_parameters *params)
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
hw_trans_mem_support = 1;
- if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD)
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD) {
multi_writer_support = 1;
+ writer_takes_lock = 1;
+ }
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY) {
readwrite_concur_support = 1;
- multi_writer_support = 1;
+ writer_takes_lock = 1;
}
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_EXT_TABLE)
@@ -355,6 +358,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->multi_writer_support = multi_writer_support;
h->readwrite_concur_support = readwrite_concur_support;
h->ext_table_support = ext_table_support;
+ h->writer_takes_lock = writer_takes_lock;
#if defined(RTE_ARCH_X86)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2))
@@ -363,10 +367,11 @@ rte_hash_create(const struct rte_hash_parameters *params)
#endif
h->sig_cmp_fn = RTE_HASH_COMPARE_SCALAR;
- /* Turn on multi-writer only with explicit flag from user and TM
- * support.
+ /* Writer threads need to take the lock when:
+ * 1) RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY is enabled OR
+ * 2) RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD is enabled
*/
- if (h->multi_writer_support) {
+ if (h->writer_takes_lock) {
h->readwrite_lock = rte_malloc(NULL, sizeof(rte_rwlock_t),
RTE_CACHE_LINE_SIZE);
if (h->readwrite_lock == NULL)
@@ -425,10 +430,10 @@ rte_hash_free(struct rte_hash *h)
rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
- if (h->multi_writer_support) {
+ if (h->multi_writer_support)
rte_free(h->local_free_slots);
+ if (h->writer_takes_lock)
rte_free(h->readwrite_lock);
- }
rte_ring_free(h->free_slots);
rte_ring_free(h->free_ext_bkts);
rte_free(h->key_store);
@@ -473,9 +478,9 @@ rte_hash_count(const struct rte_hash *h)
static inline void
__hash_rw_writer_lock(const struct rte_hash *h)
{
- if (h->multi_writer_support && h->hw_trans_mem_support)
+ if (h->writer_takes_lock && h->hw_trans_mem_support)
rte_rwlock_write_lock_tm(h->readwrite_lock);
- else if (h->multi_writer_support)
+ else if (h->writer_takes_lock)
rte_rwlock_write_lock(h->readwrite_lock);
}
@@ -491,9 +496,9 @@ __hash_rw_reader_lock(const struct rte_hash *h)
static inline void
__hash_rw_writer_unlock(const struct rte_hash *h)
{
- if (h->multi_writer_support && h->hw_trans_mem_support)
+ if (h->writer_takes_lock && h->hw_trans_mem_support)
rte_rwlock_write_unlock_tm(h->readwrite_lock);
- else if (h->multi_writer_support)
+ else if (h->writer_takes_lock)
rte_rwlock_write_unlock(h->readwrite_lock);
}
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index 7753cd8..1137651 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -166,6 +166,8 @@ struct rte_hash {
uint8_t readwrite_concur_support;
/**< If read-write concurrency support is enabled */
uint8_t ext_table_support; /**< Enable extendable bucket table */
+ uint8_t writer_takes_lock;
+ /**< Indicates if the writer threads need to take lock */
rte_hash_function hash_func; /**< Function used to calculate hash. */
uint32_t hash_func_init_val; /**< Init value used by hash_func. */
rte_hash_cmp_eq_t rte_hash_custom_cmp_eq;
diff --git a/test/test/test_hash_readwrite.c b/test/test/test_hash_readwrite.c
index 2a4f7b9..a8fadd0 100644
--- a/test/test/test_hash_readwrite.c
+++ b/test/test/test_hash_readwrite.c
@@ -122,10 +122,12 @@ init_params(int use_htm, int use_jhash)
if (use_htm)
hash_params.extra_flag =
RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT |
- RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY |
+ RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD;
else
hash_params.extra_flag =
- RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY |
+ RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD;
hash_params.name = "tests";
--
2.7.4
^ permalink raw reply [flat|nested] 20+ messages in thread
* [dpdk-dev] [PATCH v3 2/7] hash: support do not recycle on delete
2018-10-12 6:31 [dpdk-dev] [PATCH v3 0/7] Address reader-writer concurrency in rte_hash Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 1/7] hash: separate multi-writer from rw-concurrency Honnappa Nagarahalli
@ 2018-10-12 6:31 ` Honnappa Nagarahalli
2018-10-13 1:17 ` Wang, Yipeng1
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 3/7] hash: correct key store element alignment Honnappa Nagarahalli
` (4 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-12 6:31 UTC (permalink / raw)
To: bruce.richardson, pablo.de.lara.guarch
Cc: dev, yipeng1.wang, honnappa.nagarahalli, dharmik.thakkar, gavin.hu, nd
rte_hash_lookup_xxx APIs return the index of the element in
the key store. Application(reader) can use that index to reference
other data structures in its scope. Because of this, the
index should not be recycled till the application completes
using the index.
RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is introduced to support this.
When this flag is enabled rte_hash_del_xxx APIs do not free the
key-store index/internal memory associated with the deleted
entry. The new API rte_hash_free_key_with_position should be called
to free the key-store index/internal memory after calling
rte_hash_del_xxx APIs.
Suggested-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 50 +++++++++++++-
lib/librte_hash/rte_cuckoo_hash.h | 8 +++
lib/librte_hash/rte_hash.h | 40 +++++++++++
test/test/test_hash.c | 140 +++++++++++++++++++++++++++++++++++++-
4 files changed, 234 insertions(+), 4 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 7bbe9e9..8de0de3 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -140,6 +140,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
unsigned int ext_table_support = 0;
unsigned int readwrite_concur_support = 0;
unsigned int writer_takes_lock = 0;
+ unsigned int recycle_on_del = 1;
rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
@@ -176,6 +177,9 @@ rte_hash_create(const struct rte_hash_parameters *params)
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_EXT_TABLE)
ext_table_support = 1;
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL)
+ recycle_on_del = 0;
+
/* Store all keys and leave the first entry as a dummy entry for lookup_bulk */
if (multi_writer_support)
/*
@@ -359,6 +363,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->readwrite_concur_support = readwrite_concur_support;
h->ext_table_support = ext_table_support;
h->writer_takes_lock = writer_takes_lock;
+ h->recycle_on_del = recycle_on_del;
#if defined(RTE_ARCH_X86)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2))
@@ -1121,7 +1126,6 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
unsigned lcore_id, n_slots;
struct lcore_cache *cached_free_slots;
- bkt->sig_current[i] = NULL_SIGNATURE;
if (h->multi_writer_support) {
lcore_id = rte_lcore_id();
cached_free_slots = &h->local_free_slots[lcore_id];
@@ -1183,7 +1187,12 @@ search_and_remove(const struct rte_hash *h, const void *key,
k = (struct rte_hash_key *) ((char *)keys +
bkt->key_idx[i] * h->key_entry_size);
if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- remove_entry(h, bkt, i);
+ bkt->sig_current[i] = NULL_SIGNATURE;
+ /* Do not free the key store element if
+ * recycle_on_del is disabled.
+ */
+ if (h->recycle_on_del)
+ remove_entry(h, bkt, i);
/* Return index where key is stored,
* subtracting the first dummy index
@@ -1301,6 +1310,43 @@ rte_hash_get_key_with_position(const struct rte_hash *h, const int32_t position,
return 0;
}
+int __rte_experimental
+rte_hash_free_key_with_position(const struct rte_hash *h,
+ const int32_t position)
+{
+ RETURN_IF_TRUE(((h == NULL) || (position == EMPTY_SLOT)), -EINVAL);
+
+ unsigned int lcore_id, n_slots;
+ struct lcore_cache *cached_free_slots;
+ const int32_t total_entries = h->num_buckets * RTE_HASH_BUCKET_ENTRIES;
+
+ /* Out of bounds */
+ if (position >= total_entries)
+ return -EINVAL;
+
+ if (h->multi_writer_support) {
+ lcore_id = rte_lcore_id();
+ cached_free_slots = &h->local_free_slots[lcore_id];
+ /* Cache full, need to free it. */
+ if (cached_free_slots->len == LCORE_CACHE_SIZE) {
+ /* Need to enqueue the free slots in global ring. */
+ n_slots = rte_ring_mp_enqueue_burst(h->free_slots,
+ cached_free_slots->objs,
+ LCORE_CACHE_SIZE, NULL);
+ cached_free_slots->len -= n_slots;
+ }
+ /* Put index of new free slot in cache. */
+ cached_free_slots->objs[cached_free_slots->len] =
+ (void *)((uintptr_t)position);
+ cached_free_slots->len++;
+ } else {
+ rte_ring_sp_enqueue(h->free_slots,
+ (void *)((uintptr_t)position));
+ }
+
+ return 0;
+}
+
static inline void
compare_signatures(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,
const struct rte_hash_bucket *prim_bkt,
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index 1137651..5225402 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -166,6 +166,14 @@ struct rte_hash {
uint8_t readwrite_concur_support;
/**< If read-write concurrency support is enabled */
uint8_t ext_table_support; /**< Enable extendable bucket table */
+ uint8_t recycle_on_del;
+ /**< If internal memory/key-store entry should be
+ * freed on calling the rte_hash_del_xxx APIs.
+ * If this is set, rte_hash_free_key_with_position must be
+ * called to free the internal memory associated with
+ * the deleted entry.
+ * This flag is enabled by default.
+ */
uint8_t writer_takes_lock;
/**< Indicates if the writer threads need to take lock */
rte_hash_function hash_func; /**< Function used to calculate hash. */
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 6ace64e..3e5d336 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -14,6 +14,8 @@
#include <stdint.h>
#include <stddef.h>
+#include <rte_compat.h>
+
#ifdef __cplusplus
extern "C" {
#endif
@@ -40,6 +42,11 @@ extern "C" {
/** Flag to indicate the extendabe bucket table feature should be used */
#define RTE_HASH_EXTRA_FLAGS_EXT_TABLE 0x08
+/** Flag to disable freeing of internal memory/indices on hash delete.
+ * Refer to rte_hash_del_xxx APIs for more details.
+ */
+#define RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL 0x10
+
/**
* The type of hash value of a key.
* It should be a value of at least 32bit with fully random pattern.
@@ -236,6 +243,10 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t
* and should only be called from one thread by default.
* Thread safety can be enabled by setting flag during
* table creation.
+ * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is enabled,
+ * the hash library's internal memory/index will not be freed by this
+ * API. rte_hash_free_key_with_position API must be called additionally
+ * to free the internal memory/index associated with the key.
*
* @param h
* Hash table to remove the key from.
@@ -257,6 +268,10 @@ rte_hash_del_key(const struct rte_hash *h, const void *key);
* and should only be called from one thread by default.
* Thread safety can be enabled by setting flag during
* table creation.
+ * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is enabled,
+ * the hash library's internal memory/index will not be freed by this
+ * API. rte_hash_free_key_with_position API must be called additionally
+ * to free the internal memory/index associated with the key.
*
* @param h
* Hash table to remove the key from.
@@ -296,6 +311,31 @@ rte_hash_get_key_with_position(const struct rte_hash *h, const int32_t position,
void **key);
/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Free hash library's internal memory/index given the position
+ * of the key. This operation is not multi-thread safe and should
+ * only be called from one thread by default. Thread safety
+ * can be enabled by setting flag during table creation.
+ * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is enabled,
+ * the hash library's internal memory/index must be freed using this API
+ * after the key is deleted using rte_hash_del_key_xxx APIs.
+ * This API does not validate if the key is already freed.
+ *
+ * @param h
+ * Hash table to free the key from.
+ * @param position
+ * Position returned when the key was deleted.
+ * @return
+ * - 0 if freed successfully
+ * - -EINVAL if the parameters are invalid.
+ */
+int __rte_experimental
+rte_hash_free_key_with_position(const struct rte_hash *h,
+ const int32_t position);
+
+/**
* Find a key-value pair in the hash table.
* This operation is multi-thread safe with regarding to other lookup threads.
* Read-write concurrency can be enabled by setting flag during
diff --git a/test/test/test_hash.c b/test/test/test_hash.c
index 815c734..19131e9 100644
--- a/test/test/test_hash.c
+++ b/test/test/test_hash.c
@@ -260,6 +260,13 @@ static void run_hash_func_tests(void)
* - lookup (hit)
* - delete
* - lookup (miss)
+ *
+ * Repeat the test case when 'free on delete' is disabled.
+ * - add
+ * - lookup (hit)
+ * - delete
+ * - lookup (miss)
+ * - free
*/
static int test_add_delete(void)
{
@@ -295,10 +302,12 @@ static int test_add_delete(void)
/* repeat test with precomputed hash functions */
hash_sig_t hash_value;
- int pos1, expectedPos1;
+ int pos1, expectedPos1, delPos1;
+ ut_params.extra_flag = RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL;
handle = rte_hash_create(&ut_params);
RETURN_IF_ERROR(handle == NULL, "hash creation failed");
+ ut_params.extra_flag = 0;
hash_value = rte_hash_hash(handle, &keys[0]);
pos1 = rte_hash_add_key_with_hash(handle, &keys[0], hash_value);
@@ -315,12 +324,18 @@ static int test_add_delete(void)
print_key_info("Del", &keys[0], pos1);
RETURN_IF_ERROR(pos1 != expectedPos1,
"failed to delete key (pos1=%d)", pos1);
+ delPos1 = pos1;
pos1 = rte_hash_lookup_with_hash(handle, &keys[0], hash_value);
print_key_info("Lkp", &keys[0], pos1);
RETURN_IF_ERROR(pos1 != -ENOENT,
"fail: found key after deleting! (pos1=%d)", pos1);
+ pos1 = rte_hash_free_key_with_position(handle, delPos1);
+ print_key_info("Free", &keys[0], delPos1);
+ RETURN_IF_ERROR(pos1 != 0,
+ "failed to free key (pos1=%d)", delPos1);
+
rte_hash_free(handle);
return 0;
@@ -391,6 +406,84 @@ static int test_add_update_delete(void)
}
/*
+ * Sequence of operations for a single key with 'disable free on del' set:
+ * - delete: miss
+ * - add
+ * - lookup: hit
+ * - add: update
+ * - lookup: hit (updated data)
+ * - delete: hit
+ * - delete: miss
+ * - lookup: miss
+ * - free: hit
+ * - lookup: miss
+ */
+static int test_add_update_delete_free(void)
+{
+ struct rte_hash *handle;
+ int pos0, expectedPos0, delPos0, result;
+
+ ut_params.name = "test2";
+ ut_params.extra_flag = RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL;
+ handle = rte_hash_create(&ut_params);
+ RETURN_IF_ERROR(handle == NULL, "hash creation failed");
+ ut_params.extra_flag = 0;
+
+ pos0 = rte_hash_del_key(handle, &keys[0]);
+ print_key_info("Del", &keys[0], pos0);
+ RETURN_IF_ERROR(pos0 != -ENOENT,
+ "fail: found non-existent key (pos0=%d)", pos0);
+
+ pos0 = rte_hash_add_key(handle, &keys[0]);
+ print_key_info("Add", &keys[0], pos0);
+ RETURN_IF_ERROR(pos0 < 0, "failed to add key (pos0=%d)", pos0);
+ expectedPos0 = pos0;
+
+ pos0 = rte_hash_lookup(handle, &keys[0]);
+ print_key_info("Lkp", &keys[0], pos0);
+ RETURN_IF_ERROR(pos0 != expectedPos0,
+ "failed to find key (pos0=%d)", pos0);
+
+ pos0 = rte_hash_add_key(handle, &keys[0]);
+ print_key_info("Add", &keys[0], pos0);
+ RETURN_IF_ERROR(pos0 != expectedPos0,
+ "failed to re-add key (pos0=%d)", pos0);
+
+ pos0 = rte_hash_lookup(handle, &keys[0]);
+ print_key_info("Lkp", &keys[0], pos0);
+ RETURN_IF_ERROR(pos0 != expectedPos0,
+ "failed to find key (pos0=%d)", pos0);
+
+ delPos0 = rte_hash_del_key(handle, &keys[0]);
+ print_key_info("Del", &keys[0], delPos0);
+ RETURN_IF_ERROR(delPos0 != expectedPos0,
+ "failed to delete key (pos0=%d)", delPos0);
+
+ pos0 = rte_hash_del_key(handle, &keys[0]);
+ print_key_info("Del", &keys[0], pos0);
+ RETURN_IF_ERROR(pos0 != -ENOENT,
+ "fail: deleted already deleted key (pos0=%d)", pos0);
+
+ pos0 = rte_hash_lookup(handle, &keys[0]);
+ print_key_info("Lkp", &keys[0], pos0);
+ RETURN_IF_ERROR(pos0 != -ENOENT,
+ "fail: found key after deleting! (pos0=%d)", pos0);
+
+ result = rte_hash_free_key_with_position(handle, delPos0);
+ print_key_info("Free", &keys[0], delPos0);
+ RETURN_IF_ERROR(result != 0,
+ "failed to free key (pos1=%d)", delPos0);
+
+ pos0 = rte_hash_lookup(handle, &keys[0]);
+ print_key_info("Lkp", &keys[0], pos0);
+ RETURN_IF_ERROR(pos0 != -ENOENT,
+ "fail: found key after deleting! (pos0=%d)", pos0);
+
+ rte_hash_free(handle);
+ return 0;
+}
+
+/*
* Sequence of operations for retrieving a key with its position
*
* - create table
@@ -399,11 +492,20 @@ static int test_add_update_delete(void)
* - delete key
* - try to get the deleted key: miss
*
+ * Repeat the test case when 'free on delete' is disabled.
+ * - create table
+ * - add key
+ * - get the key with its position: hit
+ * - delete key
+ * - try to get the deleted key: hit
+ * - free key
+ * - try to get the deleted key: miss
+ *
*/
static int test_hash_get_key_with_position(void)
{
struct rte_hash *handle = NULL;
- int pos, expectedPos, result;
+ int pos, expectedPos, delPos, result;
void *key;
ut_params.name = "hash_get_key_w_pos";
@@ -427,6 +529,38 @@ static int test_hash_get_key_with_position(void)
RETURN_IF_ERROR(result != -ENOENT, "non valid key retrieved");
rte_hash_free(handle);
+
+ ut_params.name = "hash_get_key_w_pos";
+ ut_params.extra_flag = RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL;
+ handle = rte_hash_create(&ut_params);
+ RETURN_IF_ERROR(handle == NULL, "hash creation failed");
+ ut_params.extra_flag = 0;
+
+ pos = rte_hash_add_key(handle, &keys[0]);
+ print_key_info("Add", &keys[0], pos);
+ RETURN_IF_ERROR(pos < 0, "failed to add key (pos0=%d)", pos);
+ expectedPos = pos;
+
+ result = rte_hash_get_key_with_position(handle, pos, &key);
+ RETURN_IF_ERROR(result != 0, "error retrieving a key");
+
+ delPos = rte_hash_del_key(handle, &keys[0]);
+ print_key_info("Del", &keys[0], delPos);
+ RETURN_IF_ERROR(delPos != expectedPos,
+ "failed to delete key (pos0=%d)", delPos);
+
+ result = rte_hash_get_key_with_position(handle, delPos, &key);
+ RETURN_IF_ERROR(result != -ENOENT, "non valid key retrieved");
+
+ result = rte_hash_free_key_with_position(handle, delPos);
+ print_key_info("Free", &keys[0], delPos);
+ RETURN_IF_ERROR(result != 0,
+ "failed to free key (pos1=%d)", delPos);
+
+ result = rte_hash_get_key_with_position(handle, delPos, &key);
+ RETURN_IF_ERROR(result != -ENOENT, "non valid key retrieved");
+
+ rte_hash_free(handle);
return 0;
}
@@ -1609,6 +1743,8 @@ test_hash(void)
return -1;
if (test_add_update_delete() < 0)
return -1;
+ if (test_add_update_delete_free() < 0)
+ return -1;
if (test_five_keys() < 0)
return -1;
if (test_full_bucket() < 0)
--
2.7.4
^ permalink raw reply [flat|nested] 20+ messages in thread
* [dpdk-dev] [PATCH v3 3/7] hash: correct key store element alignment
2018-10-12 6:31 [dpdk-dev] [PATCH v3 0/7] Address reader-writer concurrency in rte_hash Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 1/7] hash: separate multi-writer from rw-concurrency Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 2/7] hash: support do not recycle on delete Honnappa Nagarahalli
@ 2018-10-12 6:31 ` Honnappa Nagarahalli
2018-10-13 1:20 ` Wang, Yipeng1
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 4/7] hash: add memory ordering to avoid race conditions Honnappa Nagarahalli
` (3 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-12 6:31 UTC (permalink / raw)
To: bruce.richardson, pablo.de.lara.guarch
Cc: dev, yipeng1.wang, honnappa.nagarahalli, dharmik.thakkar, gavin.hu, nd
Correct the key store array element alignment. This is required to
make 'pdata' in 'struct rte_hash_key' align on the correct boundary.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 4 +++-
lib/librte_hash/rte_cuckoo_hash.h | 2 +-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 8de0de3..13646d0 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -279,7 +279,9 @@ rte_hash_create(const struct rte_hash_parameters *params)
rte_ring_sp_enqueue(r_ext, (void *)((uintptr_t) i));
}
- const uint32_t key_entry_size = sizeof(struct rte_hash_key) + params->key_len;
+ const uint32_t key_entry_size =
+ RTE_ALIGN(sizeof(struct rte_hash_key) + params->key_len,
+ KEY_ALIGNMENT);
const uint64_t key_tbl_size = (uint64_t) key_entry_size * num_key_slots;
k = rte_zmalloc_socket(NULL, key_tbl_size,
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index 5225402..b823d37 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -123,7 +123,7 @@ struct rte_hash_key {
};
/* Variable key size */
char key[0];
-} __attribute__((aligned(KEY_ALIGNMENT)));
+};
/* All different signature compare functions */
enum rte_hash_sig_compare_function {
--
2.7.4
^ permalink raw reply [flat|nested] 20+ messages in thread
* [dpdk-dev] [PATCH v3 4/7] hash: add memory ordering to avoid race conditions
2018-10-12 6:31 [dpdk-dev] [PATCH v3 0/7] Address reader-writer concurrency in rte_hash Honnappa Nagarahalli
` (2 preceding siblings ...)
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 3/7] hash: correct key store element alignment Honnappa Nagarahalli
@ 2018-10-12 6:31 ` Honnappa Nagarahalli
2018-10-13 1:56 ` Wang, Yipeng1
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 5/7] hash: fix rw concurrency while moving keys Honnappa Nagarahalli
` (2 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-12 6:31 UTC (permalink / raw)
To: bruce.richardson, pablo.de.lara.guarch
Cc: dev, yipeng1.wang, honnappa.nagarahalli, dharmik.thakkar, gavin.hu, nd
Only race condition that can occur is - using the key store element
before the key write is completed. Hence, while inserting the element
the release memory order is used. Any other race condition is caught
by the key comparison. Memory orderings are added only where needed.
For ex: reads in the writer's context do not need memory ordering
as there is a single writer.
key_idx in the bucket entry and pdata in the key store element are
used for synchronisation. key_idx is used to release an inserted
entry in the bucket to the reader. Use of pdata for synchronisation
is required due to updation of an existing entry where-in only
the pdata is updated without updating key_idx.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 116 ++++++++++++++++++++++++++++----------
1 file changed, 87 insertions(+), 29 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 13646d0..e55725a 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2016 Intel Corporation
+ * Copyright(c) 2018 Arm Limited
*/
#include <string.h>
@@ -585,7 +586,9 @@ enqueue_slot_back(const struct rte_hash *h,
rte_ring_sp_enqueue(h->free_slots, slot_id);
}
-/* Search a key from bucket and update its data */
+/* Search a key from bucket and update its data.
+ * Writer holds the lock before calling this.
+ */
static inline int32_t
search_and_update(const struct rte_hash *h, void *data, const void *key,
struct rte_hash_bucket *bkt, uint16_t sig)
@@ -598,8 +601,13 @@ search_and_update(const struct rte_hash *h, void *data, const void *key,
k = (struct rte_hash_key *) ((char *)keys +
bkt->key_idx[i] * h->key_entry_size);
if (rte_hash_cmp_eq(key, k->key, h) == 0) {
- /* Update data */
- k->pdata = data;
+ /* 'pdata' acts as the synchronization point
+ * when an existing hash entry is updated.
+ * Key is not updated in this case.
+ */
+ __atomic_store_n(&k->pdata,
+ data,
+ __ATOMIC_RELEASE);
/*
* Return index where key is stored,
* subtracting the first dummy index
@@ -655,7 +663,15 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h,
/* Check if slot is available */
if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
prim_bkt->sig_current[i] = sig;
- prim_bkt->key_idx[i] = new_idx;
+ /* Key can be of arbitrary length, so it is
+ * not possible to store it atomically.
+ * Hence the new key element's memory stores
+ * (key as well as data) should be complete
+ * before it is referenced.
+ */
+ __atomic_store_n(&prim_bkt->key_idx[i],
+ new_idx,
+ __ATOMIC_RELEASE);
break;
}
}
@@ -739,8 +755,10 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
*/
curr_bkt->sig_current[curr_slot] =
prev_bkt->sig_current[prev_slot];
- curr_bkt->key_idx[curr_slot] =
- prev_bkt->key_idx[prev_slot];
+ /* Release the updated bucket entry */
+ __atomic_store_n(&curr_bkt->key_idx[curr_slot],
+ prev_bkt->key_idx[prev_slot],
+ __ATOMIC_RELEASE);
curr_slot = prev_slot;
curr_node = prev_node;
@@ -748,7 +766,10 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
}
curr_bkt->sig_current[curr_slot] = sig;
- curr_bkt->key_idx[curr_slot] = new_idx;
+ /* Release the new bucket entry */
+ __atomic_store_n(&curr_bkt->key_idx[curr_slot],
+ new_idx,
+ __ATOMIC_RELEASE);
__hash_rw_writer_unlock(h);
@@ -889,8 +910,15 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
new_idx = (uint32_t)((uintptr_t) slot_id);
/* Copy key */
rte_memcpy(new_k->key, key, h->key_len);
- new_k->pdata = data;
-
+ /* Key can be of arbitrary length, so it is not possible to store
+ * it atomically. Hence the new key element's memory stores
+ * (key as well as data) should be complete before it is referenced.
+ * 'pdata' acts as the synchronization point when an existing hash
+ * entry is updated.
+ */
+ __atomic_store_n(&new_k->pdata,
+ data,
+ __ATOMIC_RELEASE);
/* Find an empty slot and insert */
ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data,
@@ -1034,21 +1062,27 @@ search_one_bucket(const struct rte_hash *h, const void *key, uint16_t sig,
void **data, const struct rte_hash_bucket *bkt)
{
int i;
+ uint32_t key_idx;
+ void *pdata;
struct rte_hash_key *k, *keys = h->key_store;
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == sig &&
- bkt->key_idx[i] != EMPTY_SLOT) {
+ key_idx = __atomic_load_n(&bkt->key_idx[i],
+ __ATOMIC_ACQUIRE);
+ if (bkt->sig_current[i] == sig && key_idx != EMPTY_SLOT) {
k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
+ key_idx * h->key_entry_size);
+ pdata = __atomic_load_n(&k->pdata,
+ __ATOMIC_ACQUIRE);
+
if (rte_hash_cmp_eq(key, k->key, h) == 0) {
if (data != NULL)
- *data = k->pdata;
+ *data = pdata;
/*
* Return index where key is stored,
* subtracting the first dummy index
*/
- return bkt->key_idx[i] - 1;
+ return key_idx - 1;
}
}
}
@@ -1173,21 +1207,25 @@ __rte_hash_compact_ll(struct rte_hash_bucket *cur_bkt, int pos) {
}
}
-/* Search one bucket and remove the matched key */
+/* Search one bucket and remove the matched key.
+ * Writer is expected to hold the lock while calling this
+ * function.
+ */
static inline int32_t
search_and_remove(const struct rte_hash *h, const void *key,
struct rte_hash_bucket *bkt, uint16_t sig, int *pos)
{
struct rte_hash_key *k, *keys = h->key_store;
unsigned int i;
- int32_t ret;
+ uint32_t key_idx;
/* Check if key is in bucket */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
- if (bkt->sig_current[i] == sig &&
- bkt->key_idx[i] != EMPTY_SLOT) {
+ key_idx = __atomic_load_n(&bkt->key_idx[i],
+ __ATOMIC_ACQUIRE);
+ if (bkt->sig_current[i] == sig && key_idx != EMPTY_SLOT) {
k = (struct rte_hash_key *) ((char *)keys +
- bkt->key_idx[i] * h->key_entry_size);
+ key_idx * h->key_entry_size);
if (rte_hash_cmp_eq(key, k->key, h) == 0) {
bkt->sig_current[i] = NULL_SIGNATURE;
/* Do not free the key store element if
@@ -1196,13 +1234,16 @@ search_and_remove(const struct rte_hash *h, const void *key,
if (h->recycle_on_del)
remove_entry(h, bkt, i);
- /* Return index where key is stored,
+ __atomic_store_n(&bkt->key_idx[i],
+ EMPTY_SLOT,
+ __ATOMIC_RELEASE);
+
+ *pos = i;
+ /*
+ * Return index where key is stored,
* subtracting the first dummy index
*/
- ret = bkt->key_idx[i] - 1;
- bkt->key_idx[i] = EMPTY_SLOT;
- *pos = i;
- return ret;
+ return key_idx - 1;
}
}
}
@@ -1402,6 +1443,7 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
uint32_t prim_hitmask[RTE_HASH_LOOKUP_BULK_MAX] = {0};
uint32_t sec_hitmask[RTE_HASH_LOOKUP_BULK_MAX] = {0};
struct rte_hash_bucket *cur_bkt, *next_bkt;
+ void *pdata[RTE_HASH_LOOKUP_BULK_MAX];
/* Prefetch first keys */
for (i = 0; i < PREFETCH_OFFSET && i < num_keys; i++)
@@ -1480,18 +1522,25 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
uint32_t hit_index =
__builtin_ctzl(prim_hitmask[i]) >> 1;
- uint32_t key_idx = primary_bkt[i]->key_idx[hit_index];
+ uint32_t key_idx =
+ __atomic_load_n(
+ &primary_bkt[i]->key_idx[hit_index],
+ __ATOMIC_ACQUIRE);
const struct rte_hash_key *key_slot =
(const struct rte_hash_key *)(
(const char *)h->key_store +
key_idx * h->key_entry_size);
+
+ if (key_idx != EMPTY_SLOT)
+ pdata[i] = __atomic_load_n(&key_slot->pdata,
+ __ATOMIC_ACQUIRE);
/*
* If key index is 0, do not compare key,
* as it is checking the dummy slot
*/
if (!!key_idx & !rte_hash_cmp_eq(key_slot->key, keys[i], h)) {
if (data != NULL)
- data[i] = key_slot->pdata;
+ data[i] = pdata[i];
hits |= 1ULL << i;
positions[i] = key_idx - 1;
@@ -1504,11 +1553,19 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
uint32_t hit_index =
__builtin_ctzl(sec_hitmask[i]) >> 1;
- uint32_t key_idx = secondary_bkt[i]->key_idx[hit_index];
+ uint32_t key_idx =
+ __atomic_load_n(
+ &secondary_bkt[i]->key_idx[hit_index],
+ __ATOMIC_ACQUIRE);
const struct rte_hash_key *key_slot =
(const struct rte_hash_key *)(
(const char *)h->key_store +
key_idx * h->key_entry_size);
+
+ if (key_idx != EMPTY_SLOT)
+ pdata[i] = __atomic_load_n(&key_slot->pdata,
+ __ATOMIC_ACQUIRE);
+
/*
* If key index is 0, do not compare key,
* as it is checking the dummy slot
@@ -1516,7 +1573,7 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
if (!!key_idx & !rte_hash_cmp_eq(key_slot->key, keys[i], h)) {
if (data != NULL)
- data[i] = key_slot->pdata;
+ data[i] = pdata[i];
hits |= 1ULL << i;
positions[i] = key_idx - 1;
@@ -1612,7 +1669,8 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
idx = *next % RTE_HASH_BUCKET_ENTRIES;
/* If current position is empty, go to the next one */
- while ((position = h->buckets[bucket_idx].key_idx[idx]) == EMPTY_SLOT) {
+ while ((position = __atomic_load_n(&h->buckets[bucket_idx].key_idx[idx],
+ __ATOMIC_ACQUIRE)) == EMPTY_SLOT) {
(*next)++;
/* End of table */
if (*next == total_entries)
--
2.7.4
^ permalink raw reply [flat|nested] 20+ messages in thread
* [dpdk-dev] [PATCH v3 5/7] hash: fix rw concurrency while moving keys
2018-10-12 6:31 [dpdk-dev] [PATCH v3 0/7] Address reader-writer concurrency in rte_hash Honnappa Nagarahalli
` (3 preceding siblings ...)
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 4/7] hash: add memory ordering to avoid race conditions Honnappa Nagarahalli
@ 2018-10-12 6:31 ` Honnappa Nagarahalli
2018-10-13 2:07 ` Wang, Yipeng1
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 6/7] hash: enable lock-free reader-writer concurrency Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 7/7] test/hash: read-write lock-free concurrency test Honnappa Nagarahalli
6 siblings, 1 reply; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-12 6:31 UTC (permalink / raw)
To: bruce.richardson, pablo.de.lara.guarch
Cc: dev, yipeng1.wang, honnappa.nagarahalli, dharmik.thakkar, gavin.hu, nd
Reader-writer concurrency issue, caused by moving the keys
to their alternative locations during key insert, is solved
by introducing a global counter(tbl_chng_cnt) indicating a
change in table.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 305 +++++++++++++++++++++++++-------------
lib/librte_hash/rte_cuckoo_hash.h | 3 +
2 files changed, 208 insertions(+), 100 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index e55725a..262162c 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -142,6 +142,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
unsigned int readwrite_concur_support = 0;
unsigned int writer_takes_lock = 0;
unsigned int recycle_on_del = 1;
+ uint32_t *tbl_chng_cnt = NULL;
rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
@@ -293,6 +294,14 @@ rte_hash_create(const struct rte_hash_parameters *params)
goto err_unlock;
}
+ tbl_chng_cnt = rte_zmalloc_socket(NULL, sizeof(uint32_t),
+ RTE_CACHE_LINE_SIZE, params->socket_id);
+
+ if (tbl_chng_cnt == NULL) {
+ RTE_LOG(ERR, HASH, "memory allocation failed\n");
+ goto err_unlock;
+ }
+
/*
* If x86 architecture is used, select appropriate compare function,
* which may use x86 intrinsics, otherwise use memcmp
@@ -361,6 +370,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
default_hash_func : params->hash_func;
h->key_store = k;
h->free_slots = r;
+ h->tbl_chng_cnt = tbl_chng_cnt;
+ *h->tbl_chng_cnt = 0;
h->hw_trans_mem_support = hw_trans_mem_support;
h->multi_writer_support = multi_writer_support;
h->readwrite_concur_support = readwrite_concur_support;
@@ -407,6 +418,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
rte_free(buckets);
rte_free(buckets_ext);
rte_free(k);
+ rte_free(tbl_chng_cnt);
return NULL;
}
@@ -447,6 +459,7 @@ rte_hash_free(struct rte_hash *h)
rte_free(h->key_store);
rte_free(h->buckets);
rte_free(h->buckets_ext);
+ rte_free(h->tbl_chng_cnt);
rte_free(h);
rte_free(te);
}
@@ -531,6 +544,7 @@ rte_hash_reset(struct rte_hash *h)
__hash_rw_writer_lock(h);
memset(h->buckets, 0, h->num_buckets * sizeof(struct rte_hash_bucket));
memset(h->key_store, 0, h->key_entry_size * (h->entries + 1));
+ *h->tbl_chng_cnt = 0;
/* clear the free ring */
while (rte_ring_dequeue(h->free_slots, &ptr) == 0)
@@ -744,11 +758,27 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
if (unlikely(&h->buckets[prev_alt_bkt_idx]
!= curr_bkt)) {
/* revert it to empty, otherwise duplicated keys */
- curr_bkt->key_idx[curr_slot] = EMPTY_SLOT;
+ __atomic_store_n(&curr_bkt->key_idx[curr_slot],
+ EMPTY_SLOT,
+ __ATOMIC_RELEASE);
__hash_rw_writer_unlock(h);
return -1;
}
+ /* Inform the previous move. The current move need
+ * not be informed now as the current bucket entry
+ * is present in both primary and secondary.
+ * Since there is one writer, load acquires on
+ * tbl_chng_cnt are not required.
+ */
+ __atomic_store_n(h->tbl_chng_cnt,
+ *h->tbl_chng_cnt + 1,
+ __ATOMIC_RELEASE);
+ /* The stores to sig_alt and sig_current should not
+ * move above the store to tbl_chng_cnt.
+ */
+ __atomic_thread_fence(__ATOMIC_RELEASE);
+
/* Need to swap current/alt sig to allow later
* Cuckoo insert to move elements back to its
* primary bucket if available
@@ -765,6 +795,20 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
curr_bkt = curr_node->bkt;
}
+ /* Inform the previous move. The current move need
+ * not be informed now as the current bucket entry
+ * is present in both primary and secondary.
+ * Since there is one writer, load acquires on
+ * tbl_chng_cnt are not required.
+ */
+ __atomic_store_n(h->tbl_chng_cnt,
+ *h->tbl_chng_cnt + 1,
+ __ATOMIC_RELEASE);
+ /* The stores to sig_alt and sig_current should not
+ * move above the store to tbl_chng_cnt.
+ */
+ __atomic_thread_fence(__ATOMIC_RELEASE);
+
curr_bkt->sig_current[curr_slot] = sig;
/* Release the new bucket entry */
__atomic_store_n(&curr_bkt->key_idx[curr_slot],
@@ -1095,34 +1139,62 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
{
uint32_t prim_bucket_idx, sec_bucket_idx;
struct rte_hash_bucket *bkt, *cur_bkt;
+ uint32_t cnt_b, cnt_a;
int ret;
uint16_t short_sig;
short_sig = get_short_sig(sig);
prim_bucket_idx = get_prim_bucket_index(h, sig);
sec_bucket_idx = get_alt_bucket_index(h, prim_bucket_idx, short_sig);
- bkt = &h->buckets[prim_bucket_idx];
__hash_rw_reader_lock(h);
- /* Check if key is in primary location */
- ret = search_one_bucket(h, key, short_sig, data, bkt);
- if (ret != -1) {
- __hash_rw_reader_unlock(h);
- return ret;
- }
- /* Calculate secondary hash */
- bkt = &h->buckets[sec_bucket_idx];
+ do {
+ /* Load the table change counter before the lookup
+ * starts. Acquire semantics will make sure that
+ * loads in search_one_bucket are not hoisted.
+ */
+ cnt_b = __atomic_load_n(h->tbl_chng_cnt,
+ __ATOMIC_ACQUIRE);
- /* Check if key is in secondary location */
- FOR_EACH_BUCKET(cur_bkt, bkt) {
- ret = search_one_bucket(h, key, short_sig, data, cur_bkt);
+ /* Check if key is in primary location */
+ bkt = &h->buckets[prim_bucket_idx];
+ ret = search_one_bucket(h, key, short_sig, data, bkt);
if (ret != -1) {
__hash_rw_reader_unlock(h);
return ret;
}
- }
+ /* Calculate secondary hash */
+ bkt = &h->buckets[sec_bucket_idx];
+
+ /* Check if key is in secondary location */
+ FOR_EACH_BUCKET(cur_bkt, bkt) {
+ ret = search_one_bucket(h, key, short_sig,
+ data, cur_bkt);
+ if (ret != -1) {
+ __hash_rw_reader_unlock(h);
+ return ret;
+ }
+ }
+
+ /* The loads of sig_current in search_one_bucket
+ * should not move below the load from tbl_chng_cnt.
+ */
+ __atomic_thread_fence(__ATOMIC_ACQUIRE);
+ /* Re-read the table change counter to check if the
+ * table has changed during search. If yes, re-do
+ * the search.
+ * This load should not get hoisted. The load
+ * acquires on cnt_b, key index in primary bucket
+ * and key index in secondary bucket will make sure
+ * that it does not get hoisted.
+ */
+ cnt_a = __atomic_load_n(h->tbl_chng_cnt,
+ __ATOMIC_ACQUIRE);
+ } while (cnt_b != cnt_a);
+
__hash_rw_reader_unlock(h);
+
return -ENOENT;
}
@@ -1444,6 +1516,7 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
uint32_t sec_hitmask[RTE_HASH_LOOKUP_BULK_MAX] = {0};
struct rte_hash_bucket *cur_bkt, *next_bkt;
void *pdata[RTE_HASH_LOOKUP_BULK_MAX];
+ uint32_t cnt_b, cnt_a;
/* Prefetch first keys */
for (i = 0; i < PREFETCH_OFFSET && i < num_keys; i++)
@@ -1485,106 +1558,138 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
}
__hash_rw_reader_lock(h);
- /* Compare signatures and prefetch key slot of first hit */
- for (i = 0; i < num_keys; i++) {
- compare_signatures(&prim_hitmask[i], &sec_hitmask[i],
+ do {
+ /* Load the table change counter before the lookup
+ * starts. Acquire semantics will make sure that
+ * loads in compare_signatures are not hoisted.
+ */
+ cnt_b = __atomic_load_n(h->tbl_chng_cnt,
+ __ATOMIC_ACQUIRE);
+
+ /* Compare signatures and prefetch key slot of first hit */
+ for (i = 0; i < num_keys; i++) {
+ compare_signatures(&prim_hitmask[i], &sec_hitmask[i],
primary_bkt[i], secondary_bkt[i],
sig[i], h->sig_cmp_fn);
- if (prim_hitmask[i]) {
- uint32_t first_hit =
- __builtin_ctzl(prim_hitmask[i]) >> 1;
- uint32_t key_idx = primary_bkt[i]->key_idx[first_hit];
- const struct rte_hash_key *key_slot =
- (const struct rte_hash_key *)(
- (const char *)h->key_store +
- key_idx * h->key_entry_size);
- rte_prefetch0(key_slot);
- continue;
- }
+ if (prim_hitmask[i]) {
+ uint32_t first_hit =
+ __builtin_ctzl(prim_hitmask[i])
+ >> 1;
+ uint32_t key_idx =
+ primary_bkt[i]->key_idx[first_hit];
+ const struct rte_hash_key *key_slot =
+ (const struct rte_hash_key *)(
+ (const char *)h->key_store +
+ key_idx * h->key_entry_size);
+ rte_prefetch0(key_slot);
+ continue;
+ }
- if (sec_hitmask[i]) {
- uint32_t first_hit =
- __builtin_ctzl(sec_hitmask[i]) >> 1;
- uint32_t key_idx = secondary_bkt[i]->key_idx[first_hit];
- const struct rte_hash_key *key_slot =
- (const struct rte_hash_key *)(
- (const char *)h->key_store +
- key_idx * h->key_entry_size);
- rte_prefetch0(key_slot);
+ if (sec_hitmask[i]) {
+ uint32_t first_hit =
+ __builtin_ctzl(sec_hitmask[i])
+ >> 1;
+ uint32_t key_idx =
+ secondary_bkt[i]->key_idx[first_hit];
+ const struct rte_hash_key *key_slot =
+ (const struct rte_hash_key *)(
+ (const char *)h->key_store +
+ key_idx * h->key_entry_size);
+ rte_prefetch0(key_slot);
+ }
}
- }
- /* Compare keys, first hits in primary first */
- for (i = 0; i < num_keys; i++) {
- positions[i] = -ENOENT;
- while (prim_hitmask[i]) {
- uint32_t hit_index =
- __builtin_ctzl(prim_hitmask[i]) >> 1;
-
- uint32_t key_idx =
- __atomic_load_n(
- &primary_bkt[i]->key_idx[hit_index],
- __ATOMIC_ACQUIRE);
- const struct rte_hash_key *key_slot =
- (const struct rte_hash_key *)(
- (const char *)h->key_store +
- key_idx * h->key_entry_size);
-
- if (key_idx != EMPTY_SLOT)
- pdata[i] = __atomic_load_n(&key_slot->pdata,
- __ATOMIC_ACQUIRE);
- /*
- * If key index is 0, do not compare key,
- * as it is checking the dummy slot
- */
- if (!!key_idx & !rte_hash_cmp_eq(key_slot->key, keys[i], h)) {
- if (data != NULL)
- data[i] = pdata[i];
+ /* Compare keys, first hits in primary first */
+ for (i = 0; i < num_keys; i++) {
+ positions[i] = -ENOENT;
+ while (prim_hitmask[i]) {
+ uint32_t hit_index =
+ __builtin_ctzl(prim_hitmask[i])
+ >> 1;
+ uint32_t key_idx =
+ __atomic_load_n(
+ &primary_bkt[i]->key_idx[hit_index],
+ __ATOMIC_ACQUIRE);
+ const struct rte_hash_key *key_slot =
+ (const struct rte_hash_key *)(
+ (const char *)h->key_store +
+ key_idx * h->key_entry_size);
- hits |= 1ULL << i;
- positions[i] = key_idx - 1;
- goto next_key;
+ if (key_idx != EMPTY_SLOT)
+ pdata[i] = __atomic_load_n(
+ &key_slot->pdata,
+ __ATOMIC_ACQUIRE);
+ /*
+ * If key index is 0, do not compare key,
+ * as it is checking the dummy slot
+ */
+ if (!!key_idx &
+ !rte_hash_cmp_eq(
+ key_slot->key, keys[i], h)) {
+ if (data != NULL)
+ data[i] = pdata[i];
+
+ hits |= 1ULL << i;
+ positions[i] = key_idx - 1;
+ goto next_key;
+ }
+ prim_hitmask[i] &= ~(3ULL << (hit_index << 1));
}
- prim_hitmask[i] &= ~(3ULL << (hit_index << 1));
- }
-
- while (sec_hitmask[i]) {
- uint32_t hit_index =
- __builtin_ctzl(sec_hitmask[i]) >> 1;
- uint32_t key_idx =
- __atomic_load_n(
- &secondary_bkt[i]->key_idx[hit_index],
- __ATOMIC_ACQUIRE);
- const struct rte_hash_key *key_slot =
- (const struct rte_hash_key *)(
- (const char *)h->key_store +
- key_idx * h->key_entry_size);
-
- if (key_idx != EMPTY_SLOT)
- pdata[i] = __atomic_load_n(&key_slot->pdata,
- __ATOMIC_ACQUIRE);
-
- /*
- * If key index is 0, do not compare key,
- * as it is checking the dummy slot
- */
+ while (sec_hitmask[i]) {
+ uint32_t hit_index =
+ __builtin_ctzl(sec_hitmask[i])
+ >> 1;
+ uint32_t key_idx =
+ __atomic_load_n(
+ &secondary_bkt[i]->key_idx[hit_index],
+ __ATOMIC_ACQUIRE);
+ const struct rte_hash_key *key_slot =
+ (const struct rte_hash_key *)(
+ (const char *)h->key_store +
+ key_idx * h->key_entry_size);
- if (!!key_idx & !rte_hash_cmp_eq(key_slot->key, keys[i], h)) {
- if (data != NULL)
- data[i] = pdata[i];
+ if (key_idx != EMPTY_SLOT)
+ pdata[i] = __atomic_load_n(
+ &key_slot->pdata,
+ __ATOMIC_ACQUIRE);
+ /*
+ * If key index is 0, do not compare key,
+ * as it is checking the dummy slot
+ */
- hits |= 1ULL << i;
- positions[i] = key_idx - 1;
- goto next_key;
+ if (!!key_idx &
+ !rte_hash_cmp_eq(
+ key_slot->key, keys[i], h)) {
+ if (data != NULL)
+ data[i] = pdata[i];
+
+ hits |= 1ULL << i;
+ positions[i] = key_idx - 1;
+ goto next_key;
+ }
+ sec_hitmask[i] &= ~(3ULL << (hit_index << 1));
}
- sec_hitmask[i] &= ~(3ULL << (hit_index << 1));
+next_key:
+ continue;
}
-next_key:
- continue;
- }
+ /* The loads of sig_current in compare_signatures
+ * should not move below the load from tbl_chng_cnt.
+ */
+ __atomic_thread_fence(__ATOMIC_ACQUIRE);
+ /* Re-read the table change counter to check if the
+ * table has changed during search. If yes, re-do
+ * the search.
+ * This load should not get hoisted. The load
+ * acquires on cnt_b, primary key index and secondary
+ * key index will make sure that it does not get
+ * hoisted.
+ */
+ cnt_a = __atomic_load_n(h->tbl_chng_cnt,
+ __ATOMIC_ACQUIRE);
+ } while (cnt_b != cnt_a);
/* all found, do not need to go through ext bkt */
if ((hits == ((1ULL << num_keys) - 1)) || !h->ext_table_support) {
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index b823d37..728cd17 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2016 Intel Corporation
+ * Copyright(c) 2018 Arm Limited
*/
/* rte_cuckoo_hash.h
@@ -196,6 +197,8 @@ struct rte_hash {
rte_rwlock_t *readwrite_lock; /**< Read-write lock thread-safety. */
struct rte_hash_bucket *buckets_ext; /**< Extra buckets array */
struct rte_ring *free_ext_bkts; /**< Ring of indexes of free buckets */
+ uint32_t *tbl_chng_cnt;
+ /**< Indicates if the hash table changed from last read. */
} __rte_cache_aligned;
struct queue_node {
--
2.7.4
^ permalink raw reply [flat|nested] 20+ messages in thread
* [dpdk-dev] [PATCH v3 6/7] hash: enable lock-free reader-writer concurrency
2018-10-12 6:31 [dpdk-dev] [PATCH v3 0/7] Address reader-writer concurrency in rte_hash Honnappa Nagarahalli
` (4 preceding siblings ...)
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 5/7] hash: fix rw concurrency while moving keys Honnappa Nagarahalli
@ 2018-10-12 6:31 ` Honnappa Nagarahalli
2018-10-13 2:32 ` Wang, Yipeng1
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 7/7] test/hash: read-write lock-free concurrency test Honnappa Nagarahalli
6 siblings, 1 reply; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-12 6:31 UTC (permalink / raw)
To: bruce.richardson, pablo.de.lara.guarch
Cc: dev, yipeng1.wang, honnappa.nagarahalli, dharmik.thakkar, gavin.hu, nd
Add the flag to enable reader-writer concurrency during
run time. The rte_hash_del_xxx APIs do not free the keystore
element when this flag is enabled. Hence a new API,
rte_hash_free_key_with_position, to free the key store element
is added.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
---
lib/librte_hash/rte_cuckoo_hash.c | 82 ++++++++++++++++++++++++------------
lib/librte_hash/rte_cuckoo_hash.h | 2 +
lib/librte_hash/rte_hash.h | 58 +++++++++++++++++++++----
lib/librte_hash/rte_hash_version.map | 7 +++
4 files changed, 114 insertions(+), 35 deletions(-)
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 262162c..4622ece 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -143,6 +143,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
unsigned int writer_takes_lock = 0;
unsigned int recycle_on_del = 1;
uint32_t *tbl_chng_cnt = NULL;
+ unsigned int readwrite_concur_lf_support = 0;
rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
@@ -162,6 +163,24 @@ rte_hash_create(const struct rte_hash_parameters *params)
return NULL;
}
+ /* Validate correct usage of extra options */
+ if ((params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY) &&
+ (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF)) {
+ rte_errno = EINVAL;
+ RTE_LOG(ERR, HASH, "rte_hash_create choose rw concurrency or "
+ "rw concurrency lock free\n");
+ return NULL;
+ }
+
+ if ((params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF) &&
+ (params->extra_flag & RTE_HASH_EXTRA_FLAGS_EXT_TABLE)) {
+ rte_errno = EINVAL;
+ RTE_LOG(ERR, HASH, "rte_hash_create extendable bucket "
+ "feature not supported with rw concurrency "
+ "lock free\n");
+ return NULL;
+ }
+
/* Check extra flags field to check extra options. */
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
hw_trans_mem_support = 1;
@@ -182,6 +201,12 @@ rte_hash_create(const struct rte_hash_parameters *params)
if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL)
recycle_on_del = 0;
+ if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF) {
+ readwrite_concur_lf_support = 1;
+ /* Disable freeing internal memory/index on delete */
+ recycle_on_del = 0;
+ }
+
/* Store all keys and leave the first entry as a dummy entry for lookup_bulk */
if (multi_writer_support)
/*
@@ -378,6 +403,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
h->ext_table_support = ext_table_support;
h->writer_takes_lock = writer_takes_lock;
h->recycle_on_del = recycle_on_del;
+ h->readwrite_concur_lf_support = readwrite_concur_lf_support;
#if defined(RTE_ARCH_X86)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2))
@@ -765,19 +791,21 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
return -1;
}
- /* Inform the previous move. The current move need
- * not be informed now as the current bucket entry
- * is present in both primary and secondary.
- * Since there is one writer, load acquires on
- * tbl_chng_cnt are not required.
- */
- __atomic_store_n(h->tbl_chng_cnt,
- *h->tbl_chng_cnt + 1,
- __ATOMIC_RELEASE);
- /* The stores to sig_alt and sig_current should not
- * move above the store to tbl_chng_cnt.
- */
- __atomic_thread_fence(__ATOMIC_RELEASE);
+ if (h->readwrite_concur_lf_support) {
+ /* Inform the previous move. The current move need
+ * not be informed now as the current bucket entry
+ * is present in both primary and secondary.
+ * Since there is one writer, load acquires on
+ * tbl_chng_cnt are not required.
+ */
+ __atomic_store_n(h->tbl_chng_cnt,
+ *h->tbl_chng_cnt + 1,
+ __ATOMIC_RELEASE);
+ /* The stores to sig_alt and sig_current should not
+ * move above the store to tbl_chng_cnt.
+ */
+ __atomic_thread_fence(__ATOMIC_RELEASE);
+ }
/* Need to swap current/alt sig to allow later
* Cuckoo insert to move elements back to its
@@ -795,19 +823,21 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
curr_bkt = curr_node->bkt;
}
- /* Inform the previous move. The current move need
- * not be informed now as the current bucket entry
- * is present in both primary and secondary.
- * Since there is one writer, load acquires on
- * tbl_chng_cnt are not required.
- */
- __atomic_store_n(h->tbl_chng_cnt,
- *h->tbl_chng_cnt + 1,
- __ATOMIC_RELEASE);
- /* The stores to sig_alt and sig_current should not
- * move above the store to tbl_chng_cnt.
- */
- __atomic_thread_fence(__ATOMIC_RELEASE);
+ if (h->readwrite_concur_lf_support) {
+ /* Inform the previous move. The current move need
+ * not be informed now as the current bucket entry
+ * is present in both primary and secondary.
+ * Since there is one writer, load acquires on
+ * tbl_chng_cnt are not required.
+ */
+ __atomic_store_n(h->tbl_chng_cnt,
+ *h->tbl_chng_cnt + 1,
+ __ATOMIC_RELEASE);
+ /* The stores to sig_alt and sig_current should not
+ * move above the store to tbl_chng_cnt.
+ */
+ __atomic_thread_fence(__ATOMIC_RELEASE);
+ }
curr_bkt->sig_current[curr_slot] = sig;
/* Release the new bucket entry */
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index 728cd17..0c23957 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -175,6 +175,8 @@ struct rte_hash {
* the deleted entry.
* This flag is enabled by default.
*/
+ uint8_t readwrite_concur_lf_support;
+ /**< If read-write concurrency lock free support is enabled */
uint8_t writer_takes_lock;
/**< Indicates if the writer threads need to take lock */
rte_hash_function hash_func; /**< Function used to calculate hash. */
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 3e5d336..22def5e 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -44,9 +44,18 @@ extern "C" {
/** Flag to disable freeing of internal memory/indices on hash delete.
* Refer to rte_hash_del_xxx APIs for more details.
+ * This is enabled by default when RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF
+ * is enabled.
*/
#define RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL 0x10
+/** Flag to support lock free reader writer concurrency. Writer can be
+ * single writer/multi writer.
+ * Currently, extended bucket table feature is not supported with
+ * this feature.
+ */
+#define RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF 0x20
+
/**
* The type of hash value of a key.
* It should be a value of at least 32bit with fully random pattern.
@@ -132,7 +141,11 @@ void
rte_hash_free(struct rte_hash *h);
/**
- * Reset all hash structure, by zeroing all entries
+ * Reset all hash structure, by zeroing all entries.
+ * When RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
+ * it is application's responsibility to make sure that
+ * none of the readers are referencing the hash table.
+ *
* @param h
* Hash table to reset
*/
@@ -156,6 +169,10 @@ rte_hash_count(const struct rte_hash *h);
* and should only be called from one thread by default.
* Thread safety can be enabled by setting flag during
* table creation.
+ * The writer needs to be aware if this API is called to update
+ * an existing entry. The application should free any memory
+ * allocated for the existing 'data' only after all the readers
+ * have stopped referrencing it.
*
* @param h
* Hash table to add the key to.
@@ -178,6 +195,10 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data);
* and should only be called from one thread by default.
* Thread safety can be enabled by setting flag during
* table creation.
+ * The writer needs to be aware if this API is called to update
+ * an existing entry. The application should free any memory
+ * allocated for the existing 'data' only after all the readers
+ * have stopped referencing it.
*
* @param h
* Hash table to add the key to.
@@ -243,10 +264,15 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t
* and should only be called from one thread by default.
* Thread safety can be enabled by setting flag during
* table creation.
- * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is enabled,
- * the hash library's internal memory/index will not be freed by this
+ * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL or
+ * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
+ * the hash library's internal memory will not be freed by this
* API. rte_hash_free_key_with_position API must be called additionally
- * to free the internal memory/index associated with the key.
+ * to free any internal memory associated with the key.
+ * If RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
+ * rte_hash_free_key_with_position API should be called after all
+ * the readers have stopped referencing the entry corresponding to
+ * this key. RCU mechanisms can be used to determine such a state.
*
* @param h
* Hash table to remove the key from.
@@ -258,6 +284,8 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t
* - A positive value that can be used by the caller as an offset into an
* array of user data. This value is unique for this key, and is the same
* value that was returned when the key was added.
+ * When lock free concurrency is enabled, this value should be used
+ * while calling the rte_hash_free_key_with_position API.
*/
int32_t
rte_hash_del_key(const struct rte_hash *h, const void *key);
@@ -268,10 +296,15 @@ rte_hash_del_key(const struct rte_hash *h, const void *key);
* and should only be called from one thread by default.
* Thread safety can be enabled by setting flag during
* table creation.
- * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is enabled,
- * the hash library's internal memory/index will not be freed by this
+ * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL or
+ * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
+ * the hash library's internal memory will not be freed by this
* API. rte_hash_free_key_with_position API must be called additionally
- * to free the internal memory/index associated with the key.
+ * to free any internal memory associated with the key.
+ * If RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
+ * rte_hash_free_key_with_position API should be called after all
+ * the readers have stopped referencing the entry corresponding to
+ * this key. RCU mechanisms can be used to determine such a state.
*
* @param h
* Hash table to remove the key from.
@@ -285,6 +318,8 @@ rte_hash_del_key(const struct rte_hash *h, const void *key);
* - A positive value that can be used by the caller as an offset into an
* array of user data. This value is unique for this key, and is the same
* value that was returned when the key was added.
+ * When lock free concurrency is enabled, this value should be used
+ * while calling the rte_hash_free_key_with_position API.
*/
int32_t
rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t sig);
@@ -318,10 +353,15 @@ rte_hash_get_key_with_position(const struct rte_hash *h, const int32_t position,
* of the key. This operation is not multi-thread safe and should
* only be called from one thread by default. Thread safety
* can be enabled by setting flag during table creation.
- * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is enabled,
- * the hash library's internal memory/index must be freed using this API
+ * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL or
+ * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
+ * the hash library's internal memory must be freed using this API
* after the key is deleted using rte_hash_del_key_xxx APIs.
* This API does not validate if the key is already freed.
+ * If RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
+ * this API should be called only after all the readers have stopped
+ * referencing the entry corresponding to this key. RCU mechanisms can
+ * be used to determine such a state.
*
* @param h
* Hash table to free the key from.
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index e216ac8..734ae28 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -53,3 +53,10 @@ DPDK_18.08 {
rte_hash_count;
} DPDK_16.07;
+
+EXPERIMENTAL {
+ global:
+
+ rte_hash_free_key_with_position;
+
+};
--
2.7.4
^ permalink raw reply [flat|nested] 20+ messages in thread
* [dpdk-dev] [PATCH v3 7/7] test/hash: read-write lock-free concurrency test
2018-10-12 6:31 [dpdk-dev] [PATCH v3 0/7] Address reader-writer concurrency in rte_hash Honnappa Nagarahalli
` (5 preceding siblings ...)
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 6/7] hash: enable lock-free reader-writer concurrency Honnappa Nagarahalli
@ 2018-10-12 6:31 ` Honnappa Nagarahalli
2018-10-13 2:52 ` Wang, Yipeng1
6 siblings, 1 reply; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-12 6:31 UTC (permalink / raw)
To: bruce.richardson, pablo.de.lara.guarch
Cc: dev, yipeng1.wang, honnappa.nagarahalli, dharmik.thakkar, gavin.hu, nd
From: Dharmik Thakkar <dharmik.thakkar@arm.com>
Unit tests to check for hash lookup perf with lock-free enabled and
with lock-free disabled.
Unit tests performed with readers running in parallel with writers.
Tests include:
- hash lookup on existing keys with:
- hash add causing NO key-shifts of existing keys in the table
- hash lookup on existing keys likely to be on shift-path with:
- hash add causing key-shifts of existing keys in the table
- hash lookup on existing keys NOT likely to be on shift-path with:
- hash add causing key-shifts of existing keys in the table
- hash lookup on non-existing keys with:
- hash add causing NO key-shifts of existing keys in the table
- hash add causing key-shifts of existing keys in the table
- hash lookup on keys likely to be on shift-path with:
- multiple writers causing key-shifts of existing keys in the table
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
test/test/Makefile | 1 +
test/test/meson.build | 1 +
test/test/test_hash_readwrite_lf.c | 1084 ++++++++++++++++++++++++++++++++++++
3 files changed, 1086 insertions(+)
create mode 100644 test/test/test_hash_readwrite_lf.c
diff --git a/test/test/Makefile b/test/test/Makefile
index 5d8b1dc..0e0e6c4 100644
--- a/test/test/Makefile
+++ b/test/test/Makefile
@@ -116,6 +116,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite_lf.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
diff --git a/test/test/meson.build b/test/test/meson.build
index d696f5e..bc3350f 100644
--- a/test/test/meson.build
+++ b/test/test/meson.build
@@ -46,6 +46,7 @@ test_sources = files('commands.c',
'test_hash_multiwriter.c',
'test_hash_readwrite.c',
'test_hash_perf.c',
+ 'test_hash_readwrite_lf.c',
'test_hash_scaling.c',
'test_interrupts.c',
'test_kni.c',
diff --git a/test/test/test_hash_readwrite_lf.c b/test/test/test_hash_readwrite_lf.c
new file mode 100644
index 0000000..841e989
--- /dev/null
+++ b/test/test/test_hash_readwrite_lf.c
@@ -0,0 +1,1084 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Arm Limited
+ */
+
+#include <inttypes.h>
+#include <locale.h>
+
+#include <rte_cycles.h>
+#include <rte_hash.h>
+#include <rte_hash_crc.h>
+#include <rte_jhash.h>
+#include <rte_launch.h>
+#include <rte_malloc.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+
+#include "test.h"
+
+#ifndef RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF
+#define RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF 0
+#endif
+
+#define RUN_WITH_HTM_DISABLED 0
+
+#if (RUN_WITH_HTM_DISABLED)
+
+#define TOTAL_ENTRY (5*1024)
+#define TOTAL_INSERT (5*1024)
+
+#else
+
+#define TOTAL_ENTRY (16*1024*1024)
+#define TOTAL_INSERT (16*1024*1024)
+
+#endif
+
+#define READ_FAIL 0
+#define READ_PASS_NO_KEY_SHIFTS 1
+#define READ_PASS_SHIFT_PATH 2
+#define READ_PASS_NON_SHIFT_PATH 3
+
+#define NUM_TEST 3
+unsigned int rwc_core_cnt[NUM_TEST] = {1, 2, 4};
+
+struct rwc_perf {
+ uint32_t w_no_ks_r_pass[NUM_TEST];
+ uint32_t w_no_ks_r_fail[NUM_TEST];
+ uint32_t w_ks_r_pass_nsp[NUM_TEST];
+ uint32_t w_ks_r_pass_sp[NUM_TEST];
+ uint32_t w_ks_r_fail[NUM_TEST];
+ uint32_t multi_rw[NUM_TEST - 1][NUM_TEST];
+};
+
+static struct rwc_perf rwc_lf_results, rwc_non_lf_results;
+
+struct {
+ uint32_t *keys;
+ uint32_t *keys_no_ks;
+ uint32_t *keys_ks;
+ uint32_t *keys_absent;
+ uint32_t *keys_shift_path;
+ uint32_t *keys_non_shift_path;
+ uint32_t count_keys_no_ks;
+ uint32_t count_keys_ks;
+ uint32_t count_keys_absent;
+ uint32_t count_keys_shift_path;
+ uint32_t count_keys_non_shift_path;
+ uint32_t single_insert;
+ struct rte_hash *h;
+} tbl_rwc_test_param;
+
+static rte_atomic64_t gread_cycles;
+static rte_atomic64_t greads;
+
+static volatile uint8_t writer_done;
+static volatile uint8_t multi_writer_done[4];
+uint8_t num_test;
+uint8_t htm;
+
+uint16_t enabled_core_ids[RTE_MAX_LCORE];
+
+uint8_t *scanned_bkts;
+
+static inline int
+get_enabled_cores_list(void)
+{
+ uint32_t i = 0;
+ uint16_t core_id;
+ uint32_t max_cores = rte_lcore_count();
+ for (core_id = 0; core_id < RTE_MAX_LCORE && i < max_cores; core_id++) {
+ if (rte_lcore_is_enabled(core_id)) {
+ enabled_core_ids[i] = core_id;
+ i++;
+ }
+ }
+
+ if (i != rte_lcore_count()) {
+ printf("Number of enabled cores in list is different from "
+ "number given by rte_lcore_count()\n");
+ return -1;
+ }
+ return 0;
+}
+
+static inline int
+check_bucket(uint32_t bkt_idx, uint32_t key)
+{
+ uint32_t iter;
+ uint32_t prev_iter;
+ uint32_t diff;
+ uint32_t count = 0;
+ const void *next_key;
+ void *next_data;
+
+ /* Temporary bucket to hold the keys */
+ uint32_t keys_in_bkt[8];
+
+ iter = bkt_idx * 8;
+ prev_iter = iter;
+ while (rte_hash_iterate(tbl_rwc_test_param.h,
+ &next_key, &next_data, &iter) >= 0) {
+
+ /* Check for duplicate entries */
+ if (*(const uint32_t *)next_key == key)
+ return 1;
+
+ /* Identify if there is any free entry in the bucket */
+ diff = iter - prev_iter;
+ if (diff > 1)
+ break;
+
+ prev_iter = iter;
+ keys_in_bkt[count] = *(const uint32_t *)next_key;
+ count++;
+
+ /* All entries in the bucket are occupied */
+ if (count == 8) {
+
+ /*
+ * Check if bucket was not scanned before, to avoid
+ * duplicate keys.
+ */
+ if (scanned_bkts[bkt_idx] == 0) {
+ /*
+ * Since this bucket (pointed to by bkt_idx) is
+ * full, it is likely that key(s) in this
+ * bucket will be on the shift path, when
+ * collision occurs. Thus, add it to
+ * keys_shift_path.
+ */
+ memcpy(tbl_rwc_test_param.keys_shift_path +
+ tbl_rwc_test_param.count_keys_shift_path
+ , keys_in_bkt, 32);
+ tbl_rwc_test_param.count_keys_shift_path += 8;
+ scanned_bkts[bkt_idx] = 1;
+ }
+ return -1;
+ }
+ }
+ return 0;
+}
+
+static int
+generate_keys(void)
+{
+ uint32_t *keys = NULL;
+ uint32_t *keys_no_ks = NULL;
+ uint32_t *keys_ks = NULL;
+ uint32_t *keys_absent = NULL;
+ uint32_t *keys_non_shift_path = NULL;
+ uint32_t *found = NULL;
+ uint32_t count_keys_no_ks = 0;
+ uint32_t count_keys_ks = 0;
+ uint32_t i;
+
+ /*
+ * keys will consist of a) keys whose addition to the hash table
+ * will result in shifting of the existing keys to their alternate
+ * locations b) keys whose addition to the hash table will not result
+ * in shifting of the existing keys.
+ */
+ keys = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_INSERT, 0);
+ if (keys == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ /*
+ * keys_no_ks (no key-shifts): Subset of 'keys' - consists of keys that
+ * will NOT result in shifting of the existing keys to their alternate
+ * locations. Roughly around 900K keys.
+ */
+ keys_no_ks = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_INSERT, 0);
+ if (keys_no_ks == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ /*
+ * keys_ks (key-shifts): Subset of 'keys' - consists of keys that will
+ * result in shifting of the existing keys to their alternate locations.
+ * Roughly around 146K keys. There might be repeating keys. More code is
+ * required to filter out these keys which will complicate the test case
+ */
+ keys_ks = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_INSERT, 0);
+ if (keys_ks == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ /* Used to identify keys not inserted in the hash table */
+ found = rte_zmalloc(NULL, sizeof(uint32_t) * TOTAL_INSERT, 0);
+ if (found == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ /*
+ * This consist of keys not inserted to the hash table.
+ * Used to test perf of lookup on keys that do not exist in the table.
+ */
+ keys_absent = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_INSERT, 0);
+ if (keys_absent == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ /*
+ * This consist of keys which are likely to be on the shift
+ * path (i.e. being moved to alternate location), when collision occurs
+ * on addition of a key to an already full primary bucket.
+ * Used to test perf of lookup on keys that are on the shift path.
+ */
+ tbl_rwc_test_param.keys_shift_path = rte_malloc(NULL, sizeof(uint32_t) *
+ TOTAL_INSERT, 0);
+ if (tbl_rwc_test_param.keys_shift_path == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ /*
+ * This consist of keys which are never on the shift
+ * path (i.e. being moved to alternate location), when collision occurs
+ * on addition of a key to an already full primary bucket.
+ * Used to test perf of lookup on keys that are not on the shift path.
+ */
+ keys_non_shift_path = rte_malloc(NULL, sizeof(uint32_t) * TOTAL_INSERT,
+ 0);
+ if (keys_non_shift_path == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ /*
+ * Used to mark bkts in which at least one key was shifted to its
+ * alternate location
+ */
+ scanned_bkts = rte_malloc(NULL, sizeof(uint8_t) * TOTAL_INSERT / 8, 0);
+ if (scanned_bkts == NULL) {
+ printf("RTE_MALLOC failed\n");
+ goto err;
+ }
+
+ tbl_rwc_test_param.keys = keys;
+ tbl_rwc_test_param.keys_no_ks = keys_no_ks;
+ tbl_rwc_test_param.keys_ks = keys_ks;
+ tbl_rwc_test_param.keys_absent = keys_absent;
+ tbl_rwc_test_param.keys_non_shift_path = keys_non_shift_path;
+
+ hash_sig_t sig;
+ uint32_t prim_bucket_idx;
+ int ret;
+ uint32_t num_buckets;
+ uint32_t bucket_bitmask;
+ num_buckets = TOTAL_ENTRY/8;
+ bucket_bitmask = num_buckets - 1;
+
+ /* Generate keys by adding previous two keys, neglect overflow */
+ keys[0] = 0;
+ keys[1] = 1;
+ for (i = 2; i < TOTAL_INSERT; i++)
+ keys[i] = keys[i-1] + keys[i-2];
+
+ /* Segregate keys into keys_no_ks and keys_ks */
+ for (i = 0; i < TOTAL_INSERT; i++) {
+ /* Check if primary bucket has space.*/
+ sig = rte_hash_hash(tbl_rwc_test_param.h,
+ tbl_rwc_test_param.keys+i);
+ prim_bucket_idx = sig & bucket_bitmask;
+ ret = check_bucket(prim_bucket_idx, keys[i]);
+ if (ret < 0) {
+ /*
+ * Primary bucket is full, this key will result in
+ * shifting of the keys to their alternate locations.
+ */
+ keys_ks[count_keys_ks] = keys[i];
+ count_keys_ks++;
+ } else if (ret == 0) {
+ /*
+ * Primary bucket has space, this key will not result in
+ * shifting of the keys. Hence, add key to the table.
+ */
+ ret = rte_hash_add_key_data(tbl_rwc_test_param.h,
+ keys+i,
+ (void *)((uintptr_t)i));
+ if (ret < 0) {
+ printf("writer failed %"PRIu32"\n", i);
+ break;
+ }
+ keys_no_ks[count_keys_no_ks] = keys[i];
+ count_keys_no_ks++;
+ }
+ }
+
+ for (i = 0; i < count_keys_no_ks; i++) {
+ /* Identify keys in keys_no_ks with value less than 1M */
+ if (keys_no_ks[i] < TOTAL_INSERT)
+ found[keys_no_ks[i]]++;
+ }
+
+ for (i = 0; i < count_keys_ks; i++) {
+ /* Identify keys in keys_ks with value less than 1M */
+ if (keys_ks[i] < TOTAL_INSERT)
+ found[keys_ks[i]]++;
+ }
+
+ uint32_t count_keys_absent = 0;
+ for (i = 0; i < TOTAL_INSERT; i++) {
+ /* Identify missing keys between 0 and 1M */
+ if (found[i] == 0)
+ keys_absent[count_keys_absent++] = i;
+ }
+
+ /* Find keys that will not be on the shift path */
+ uint32_t iter;
+ const void *next_key;
+ void *next_data;
+ uint32_t count = 0;
+ for (i = 0; i < TOTAL_INSERT / 8; i++) {
+ /* Check bucket for no keys shifted to alternate locations */
+ if (scanned_bkts[i] == 0) {
+ iter = i * 8;
+ while (rte_hash_iterate(tbl_rwc_test_param.h,
+ &next_key, &next_data, &iter) >= 0) {
+
+ /* Check if key belongs to the current bucket */
+ if (i >= (iter-1)/8)
+ keys_non_shift_path[count++]
+ = *(const uint32_t *)next_key;
+ else
+ break;
+ }
+ }
+ }
+
+ tbl_rwc_test_param.count_keys_no_ks = count_keys_no_ks;
+ tbl_rwc_test_param.count_keys_ks = count_keys_ks;
+ tbl_rwc_test_param.count_keys_absent = count_keys_absent;
+ tbl_rwc_test_param.count_keys_non_shift_path = count;
+
+ printf("\nCount of keys NOT causing shifting of existing keys to "
+ "alternate location: %d\n", tbl_rwc_test_param.count_keys_no_ks);
+ printf("\nCount of keys causing shifting of existing keys to alternate "
+ "locations: %d\n\n", tbl_rwc_test_param.count_keys_ks);
+ printf("Count of absent keys that will never be added to the hash "
+ "table: %d\n\n", tbl_rwc_test_param.count_keys_absent);
+ printf("Count of keys likely to be on the shift path: %d\n\n",
+ tbl_rwc_test_param.count_keys_shift_path);
+ printf("Count of keys not likely to be on the shift path: %d\n\n",
+ tbl_rwc_test_param.count_keys_non_shift_path);
+
+ rte_free(found);
+ rte_hash_free(tbl_rwc_test_param.h);
+ return 0;
+
+err:
+ rte_free(keys);
+ rte_free(keys_no_ks);
+ rte_free(keys_ks);
+ rte_free(keys_absent);
+ rte_free(found);
+ rte_free(tbl_rwc_test_param.keys_shift_path);
+ rte_free(scanned_bkts);
+ return -1;
+}
+
+static int
+init_params(int rwc_lf, int use_jhash)
+{
+ struct rte_hash *handle;
+
+ struct rte_hash_parameters hash_params = {
+ .entries = TOTAL_ENTRY,
+ .key_len = sizeof(uint32_t),
+ .hash_func_init_val = 0,
+ .socket_id = rte_socket_id(),
+ };
+
+ if (use_jhash)
+ hash_params.hash_func = rte_jhash;
+ else
+ hash_params.hash_func = rte_hash_crc;
+
+ if (rwc_lf)
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF |
+ RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD;
+ else if (htm)
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT |
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY |
+ RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD;
+ else
+ hash_params.extra_flag =
+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY |
+ RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD;
+
+ hash_params.name = "tests";
+
+ handle = rte_hash_create(&hash_params);
+ if (handle == NULL) {
+ printf("hash creation failed");
+ return -1;
+ }
+
+ tbl_rwc_test_param.h = handle;
+ return 0;
+}
+
+static int
+test_rwc_reader(__attribute__((unused)) void *arg)
+{
+ uint32_t i;
+ int ret;
+ uint64_t begin, cycles;
+ uint32_t loop_cnt = 0;
+ uint8_t read_type = (uint8_t)((uintptr_t)arg);
+ uint32_t read_cnt;
+ uint32_t *keys;
+
+ if (read_type == READ_FAIL) {
+ keys = tbl_rwc_test_param.keys_absent;
+ read_cnt = tbl_rwc_test_param.count_keys_absent;
+ } else if (read_type == READ_PASS_NO_KEY_SHIFTS) {
+ keys = tbl_rwc_test_param.keys_no_ks;
+ read_cnt = tbl_rwc_test_param.count_keys_no_ks;
+ } else if (read_type == READ_PASS_SHIFT_PATH) {
+ keys = tbl_rwc_test_param.keys_shift_path;
+ read_cnt = tbl_rwc_test_param.count_keys_shift_path;
+ } else {
+ keys = tbl_rwc_test_param.keys_non_shift_path;
+ read_cnt = tbl_rwc_test_param.count_keys_non_shift_path;
+ }
+
+ begin = rte_rdtsc_precise();
+ do {
+ for (i = 0; i < read_cnt; i++) {
+ ret = rte_hash_lookup(tbl_rwc_test_param.h, keys + i);
+ if ((read_type == READ_FAIL && ret != -ENOENT)
+ || (read_type != READ_FAIL && ret == -ENOENT)) {
+ printf("lookup failed! %"PRIu32"\n", keys[i]);
+ return -1;
+ }
+ }
+ loop_cnt++;
+ } while (!writer_done);
+
+ cycles = rte_rdtsc_precise() - begin;
+ rte_atomic64_add(&gread_cycles, cycles);
+ rte_atomic64_add(&greads, i*loop_cnt);
+ return 0;
+}
+
+static int
+write_keys(uint8_t key_shift)
+{
+ uint32_t i;
+ int ret;
+ uint32_t key_cnt;
+ uint32_t *keys;
+ if (key_shift) {
+ key_cnt = tbl_rwc_test_param.count_keys_ks;
+ keys = tbl_rwc_test_param.keys_ks;
+ } else {
+ key_cnt = tbl_rwc_test_param.count_keys_no_ks;
+ keys = tbl_rwc_test_param.keys_no_ks;
+ }
+ for (i = 0; i < key_cnt; i++) {
+ ret = rte_hash_add_key(tbl_rwc_test_param.h, keys + i);
+ if (!key_shift && ret < 0) {
+ printf("writer failed %"PRIu32"\n", i);
+ return -1;
+ }
+ }
+ return 0;
+}
+
+static int
+test_rwc_multi_writer(__attribute__((unused)) void *arg)
+{
+ uint32_t i, offset;
+ uint32_t pos_core = (uint32_t)((uintptr_t)arg);
+ offset = pos_core * tbl_rwc_test_param.single_insert;
+ for (i = offset; i < offset + tbl_rwc_test_param.single_insert; i++)
+ rte_hash_add_key(tbl_rwc_test_param.h,
+ tbl_rwc_test_param.keys_ks + i);
+ multi_writer_done[pos_core] = 1;
+ return 0;
+}
+
+/*
+ * Test lookup perf:
+ * Reader(s) lookup keys present in the table.
+ */
+static int
+test_hash_add_no_ks_lookup_pass(struct rwc_perf *rwc_perf_results, int rwc_lf)
+{
+ unsigned int n;
+ uint64_t i;
+ int use_jhash = 0;
+ uint8_t key_shift = 0;
+ uint8_t read_type = READ_PASS_NO_KEY_SHIFTS;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gread_cycles);
+
+ if (init_params(rwc_lf, use_jhash) != 0)
+ goto err;
+ printf("\nTest: Hash add - no key-shifts, read - pass\n");
+ for (n = 0; n < num_test; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < rwc_core_cnt[n] + 1)
+ goto finish;
+
+ printf("\nNumber of readers: %u\n", rwc_core_cnt[n]);
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+
+ rte_hash_reset(tbl_rwc_test_param.h);
+ writer_done = 0;
+ if (write_keys(key_shift) < 0)
+ goto err;
+ writer_done = 1;
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rwc_reader,
+ (void *)(uintptr_t)read_type,
+ enabled_core_ids[i]);
+ rte_eal_mp_wait_lcore();
+
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ if (lcore_config[i].ret < 0)
+ goto err;
+
+ unsigned long long cycles_per_lookup =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ rwc_perf_results->w_no_ks_r_pass[n] = cycles_per_lookup;
+ printf("Cycles per lookup: %llu\n", cycles_per_lookup);
+ }
+
+finish:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return 0;
+
+err:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return -1;
+}
+
+/*
+ * Test lookup perf:
+ * Reader(s) lookup keys absent in the table while
+ * 'Main' thread adds with no key-shifts.
+ */
+static int
+test_hash_add_no_ks_lookup_fail(struct rwc_perf *rwc_perf_results, int rwc_lf)
+{
+ unsigned int n;
+ uint64_t i;
+ int use_jhash = 0;
+ uint8_t key_shift = 0;
+ uint8_t read_type = READ_FAIL;
+ int ret;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gread_cycles);
+
+ if (init_params(rwc_lf, use_jhash) != 0)
+ goto err;
+ printf("\nTest: Hash add - no key-shifts, Hash lookup - fail\n");
+ for (n = 0; n < num_test; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < rwc_core_cnt[n] + 1)
+ goto finish;
+
+ printf("\nNumber of readers: %u\n", rwc_core_cnt[n]);
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+
+ rte_hash_reset(tbl_rwc_test_param.h);
+ writer_done = 0;
+
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rwc_reader,
+ (void *)(uintptr_t)read_type,
+ enabled_core_ids[i]);
+ ret = write_keys(key_shift);
+ writer_done = 1;
+ rte_eal_mp_wait_lcore();
+
+ if (ret < 0)
+ goto err;
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ if (lcore_config[i].ret < 0)
+ goto err;
+
+ unsigned long long cycles_per_lookup =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ rwc_perf_results->w_no_ks_r_fail[n] = cycles_per_lookup;
+ printf("Cycles per lookup: %llu\n", cycles_per_lookup);
+ }
+
+finish:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return 0;
+
+err:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return -1;
+}
+
+/*
+ * Test lookup perf:
+ * Reader(s) lookup keys present in the table and not likely to be on the
+ * shift path while 'Main' thread adds keys causing key-shifts.
+ */
+static int
+test_hash_add_ks_lookup_pass_non_sp(struct rwc_perf *rwc_perf_results,
+ int rwc_lf)
+{
+ unsigned int n;
+ uint64_t i;
+ int use_jhash = 0;
+ int ret;
+ uint8_t key_shift;
+ uint8_t read_type = READ_PASS_NON_SHIFT_PATH;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gread_cycles);
+
+ if (init_params(rwc_lf, use_jhash) != 0)
+ goto err;
+ printf("\nTest: Hash add - key shift, Hash lookup - pass"
+ " (non-shift-path)\n");
+ for (n = 0; n < num_test; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < rwc_core_cnt[n] + 1)
+ goto finish;
+
+ printf("\nNumber of readers: %u\n", rwc_core_cnt[n]);
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+
+ rte_hash_reset(tbl_rwc_test_param.h);
+ writer_done = 0;
+ key_shift = 0;
+ if (write_keys(key_shift) < 0)
+ goto err;
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rwc_reader,
+ (void *)(uintptr_t)read_type,
+ enabled_core_ids[i]);
+ key_shift = 1;
+ ret = write_keys(key_shift);
+ writer_done = 1;
+ rte_eal_mp_wait_lcore();
+
+ if (ret < 0)
+ goto err;
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ if (lcore_config[i].ret < 0)
+ goto err;
+
+ unsigned long long cycles_per_lookup =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ rwc_perf_results->w_ks_r_pass_nsp[n] = cycles_per_lookup;
+ printf("Cycles per lookup: %llu\n", cycles_per_lookup);
+ }
+
+finish:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return 0;
+
+err:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return -1;
+}
+
+/*
+ * Test lookup perf:
+ * Reader(s) lookup keys present in the table and likely on the shift-path while
+ * 'Main' thread adds keys causing key-shifts.
+ */
+static int
+test_hash_add_ks_lookup_pass_sp(struct rwc_perf *rwc_perf_results, int rwc_lf)
+{
+ unsigned int n;
+ uint64_t i;
+ int use_jhash = 0;
+ int ret;
+ uint8_t key_shift;
+ uint8_t read_type = READ_PASS_SHIFT_PATH;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gread_cycles);
+
+ if (init_params(rwc_lf, use_jhash) != 0)
+ goto err;
+ printf("\nTest: Hash add - key shift, Hash lookup - pass (shift-path)"
+ "\n");
+
+ for (n = 0; n < num_test; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < rwc_core_cnt[n])
+ goto finish;
+
+ printf("\nNumber of readers: %u\n", rwc_core_cnt[n]);
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+
+ rte_hash_reset(tbl_rwc_test_param.h);
+ writer_done = 0;
+ key_shift = 0;
+ if (write_keys(key_shift) < 0)
+ goto err;
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rwc_reader,
+ (void *)(uintptr_t)read_type,
+ enabled_core_ids[i]);
+ key_shift = 1;
+ ret = write_keys(key_shift);
+ writer_done = 1;
+ rte_eal_mp_wait_lcore();
+
+ if (ret < 0)
+ goto err;
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ if (lcore_config[i].ret < 0)
+ goto err;
+
+ unsigned long long cycles_per_lookup =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ rwc_perf_results->w_ks_r_pass_sp[n] = cycles_per_lookup;
+ printf("Cycles per lookup: %llu\n", cycles_per_lookup);
+ }
+
+finish:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return 0;
+
+err:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return -1;
+}
+
+/*
+ * Test lookup perf:
+ * Reader(s) lookup keys absent in the table while
+ * 'Main' thread adds keys causing key-shifts.
+ */
+static int
+test_hash_add_ks_lookup_fail(struct rwc_perf *rwc_perf_results, int rwc_lf)
+{
+ unsigned int n;
+ uint64_t i;
+ int use_jhash = 0;
+ int ret;
+ uint8_t key_shift;
+ uint8_t read_type = READ_FAIL;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gread_cycles);
+
+ if (init_params(rwc_lf, use_jhash) != 0)
+ goto err;
+ printf("\nTest: Hash add - key shift, Hash lookup - fail\n");
+ for (n = 0; n < num_test; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < rwc_core_cnt[n] + 1)
+ goto finish;
+
+ printf("\nNumber of readers: %u\n", rwc_core_cnt[n]);
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+
+ rte_hash_reset(tbl_rwc_test_param.h);
+ writer_done = 0;
+ key_shift = 0;
+ if (write_keys(key_shift) < 0)
+ goto err;
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rwc_reader,
+ (void *)(uintptr_t)read_type,
+ enabled_core_ids[i]);
+ key_shift = 1;
+ ret = write_keys(key_shift);
+ writer_done = 1;
+ rte_eal_mp_wait_lcore();
+
+ if (ret < 0)
+ goto err;
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ if (lcore_config[i].ret < 0)
+ goto err;
+
+ unsigned long long cycles_per_lookup =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ rwc_perf_results->w_ks_r_fail[n] = cycles_per_lookup;
+ printf("Cycles per lookup: %llu\n", cycles_per_lookup);
+ }
+
+finish:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return 0;
+
+err:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return -1;
+}
+
+/*
+ * Test lookup perf:
+ * Reader(s) lookup keys present in the table and likely on the shift-path while
+ * Writers add keys causing key-shifts.
+ */
+static int
+test_hash_multi_add_lookup(struct rwc_perf *rwc_perf_results, int rwc_lf)
+{
+ unsigned int n, m;
+ uint64_t i;
+ int use_jhash = 0;
+ uint8_t key_shift;
+ uint8_t read_type = READ_PASS_SHIFT_PATH;
+
+ rte_atomic64_init(&greads);
+ rte_atomic64_init(&gread_cycles);
+
+ if (init_params(rwc_lf, use_jhash) != 0)
+ goto err;
+ printf("\nTest: Multi-add-lookup\n");
+ uint8_t pos_core;
+ for (m = 1; m < num_test; m++) {
+ /* Calculate keys added by each writer */
+ tbl_rwc_test_param.single_insert =
+ tbl_rwc_test_param.count_keys_ks / rwc_core_cnt[m];
+
+ for (n = 0; n < num_test; n++) {
+ unsigned int tot_lcore = rte_lcore_count();
+ if (tot_lcore < rwc_core_cnt[n] + rwc_core_cnt[m] + 1)
+ goto finish;
+
+ printf("\nNumber of writers: %u", rwc_core_cnt[m]);
+ printf("\nNumber of readers: %u\n", rwc_core_cnt[n]);
+
+ rte_atomic64_clear(&greads);
+ rte_atomic64_clear(&gread_cycles);
+
+ rte_hash_reset(tbl_rwc_test_param.h);
+ writer_done = 0;
+ for (i = 0; i < 4; i++)
+ multi_writer_done[i] = 0;
+ key_shift = 0;
+ if (write_keys(key_shift) < 0)
+ goto err;
+
+ /* Launch reader(s) */
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ rte_eal_remote_launch(test_rwc_reader,
+ (void *)(uintptr_t)read_type,
+ enabled_core_ids[i]);
+ key_shift = 1;
+ pos_core = 0;
+
+ /* Launch writers */
+ for (; i <= rwc_core_cnt[m] + rwc_core_cnt[n]; i++) {
+ rte_eal_remote_launch(test_rwc_multi_writer,
+ (void *)(uintptr_t)pos_core,
+ enabled_core_ids[i]);
+ pos_core++;
+ }
+
+ /* Wait for writers to complete */
+ for (i = 0; i < rwc_core_cnt[m]; i++)
+ while
+ (multi_writer_done[i] == 0);
+ writer_done = 1;
+
+ rte_eal_mp_wait_lcore();
+
+ for (i = 1; i <= rwc_core_cnt[n]; i++)
+ if (lcore_config[i].ret < 0)
+ goto err;
+
+ unsigned long long cycles_per_lookup =
+ rte_atomic64_read(&gread_cycles) /
+ rte_atomic64_read(&greads);
+ rwc_perf_results->multi_rw[m][n] = cycles_per_lookup;
+ printf("Cycles per lookup: %llu\n", cycles_per_lookup);
+ }
+ }
+
+finish:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return 0;
+
+err:
+ rte_hash_free(tbl_rwc_test_param.h);
+ return -1;
+}
+
+static int
+test_hash_readwrite_lf_main(void)
+{
+ /*
+ * Variables used to choose different tests.
+ * rwc_lf indicates if read-write concurrency lock-free support is
+ * enabled.
+ * htm indicates if Hardware transactional memory support is enabled.
+ */
+ int rwc_lf = 0;
+ int use_jhash = 0;
+ num_test = NUM_TEST;
+ if (rte_lcore_count() == 1) {
+ printf("More than one lcore is required "
+ "to do read write lock-free concurrency test\n");
+ return -1;
+ }
+
+ setlocale(LC_NUMERIC, "");
+
+ if (rte_tm_supported())
+ htm = 1;
+ else
+ htm = 0;
+
+ if (init_params(rwc_lf, use_jhash) != 0)
+ return -1;
+ if (generate_keys() != 0)
+ return -1;
+ if (get_enabled_cores_list() != 0)
+ return -1;
+
+ if (RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF) {
+ rwc_lf = 1;
+ printf("Test lookup with read-write concurrency lock free support"
+ " enabled\n");
+ if (test_hash_add_no_ks_lookup_pass(&rwc_lf_results, rwc_lf)
+ < 0)
+ return -1;
+ if (test_hash_add_no_ks_lookup_fail(&rwc_lf_results, rwc_lf)
+ < 0)
+ return -1;
+ if (test_hash_add_ks_lookup_pass_non_sp(&rwc_lf_results, rwc_lf)
+ < 0)
+ return -1;
+ if (test_hash_add_ks_lookup_pass_sp(&rwc_lf_results, rwc_lf)
+ < 0)
+ return -1;
+ if (test_hash_add_ks_lookup_fail(&rwc_lf_results, rwc_lf) < 0)
+ return -1;
+ if (test_hash_multi_add_lookup(&rwc_lf_results, rwc_lf) < 0)
+ return -1;
+ }
+ printf("\nTest lookup with read-write concurrency lock free support"
+ " disabled\n");
+ rwc_lf = 0;
+ if (!htm) {
+ printf("With HTM Disabled\n");
+ if (!RUN_WITH_HTM_DISABLED) {
+ printf("Enable RUN_WITH_HTM_DISABLED to test with"
+ " lock-free disabled");
+ goto results;
+ }
+ } else
+ printf("With HTM Enabled\n");
+ if (test_hash_add_no_ks_lookup_pass(&rwc_non_lf_results, rwc_lf) < 0)
+ return -1;
+ if (test_hash_add_no_ks_lookup_fail(&rwc_non_lf_results, rwc_lf) < 0)
+ return -1;
+ if (test_hash_add_ks_lookup_pass_non_sp(&rwc_non_lf_results, rwc_lf)
+ < 0)
+ return -1;
+ if (test_hash_add_ks_lookup_pass_sp(&rwc_non_lf_results, rwc_lf) < 0)
+ return -1;
+ if (test_hash_add_ks_lookup_fail(&rwc_non_lf_results, rwc_lf) < 0)
+ return -1;
+ if (test_hash_multi_add_lookup(&rwc_non_lf_results, rwc_lf) < 0)
+ return -1;
+results:
+ printf("\n\t\t\t\t\t\t********** Results summary **********\n\n");
+ printf("_______\t\t_______\t\t_________\t___\t\t_________\t\t\t\t\t\t"
+ "_________________\n");
+ int i, j;
+ printf("Writers\t\tReaders\t\tLock-free\tHTM\t\tTest-case\t\t\t\t\t\t"
+ "Cycles per lookup\n");
+ printf("_______\t\t_______\t\t_________\t___\t\t_________\t\t\t\t\t\t"
+ "_________________\n");
+ for (i = 0; i < NUM_TEST; i++) {
+ printf("%u\t\t%u\t\t", 1, rwc_core_cnt[i]);
+ printf("Enabled\t\t");
+ printf("N/A\t\t");
+ printf("Hash add - no key-shifts, lookup - pass\t\t\t\t%u\n\t\t"
+ "\t\t\t\t\t\t", rwc_lf_results.w_no_ks_r_pass[i]);
+ printf("Hash add - no key-shifts, lookup - fail\t\t\t\t%u\n\t\t"
+ "\t\t\t\t\t\t", rwc_lf_results.w_no_ks_r_fail[i]);
+ printf("Hash add - key-shifts, lookup - pass (non-shift-path)\t"
+ "\t%u\n\t\t\t\t\t\t\t\t",
+ rwc_lf_results.w_ks_r_pass_nsp[i]);
+ printf("Hash add - key-shifts, lookup - pass (shift-path)\t\t%u"
+ "\n\t\t\t\t\t\t\t\t", rwc_lf_results.w_ks_r_pass_sp[i]);
+ printf("Hash add - key-shifts, Hash lookup fail\t\t\t\t%u\n\n"
+ "\t\t\t\t", rwc_lf_results.w_ks_r_fail[i]);
+
+ printf("Disabled\t");
+ if (htm)
+ printf("Enabled\t\t");
+ else
+ printf("Disabled\t");
+ printf("Hash add - no key-shifts, lookup - pass\t\t\t\t%u\n\t\t"
+ "\t\t\t\t\t\t", rwc_non_lf_results.w_no_ks_r_pass[i]);
+ printf("Hash add - no key-shifts, lookup - fail\t\t\t\t%u\n\t\t"
+ "\t\t\t\t\t\t", rwc_non_lf_results.w_no_ks_r_fail[i]);
+ printf("Hash add - key-shifts, lookup - pass (non-shift-path)\t"
+ "\t%u\n\t\t\t\t\t\t\t\t",
+ rwc_non_lf_results.w_ks_r_pass_nsp[i]);
+ printf("Hash add - key-shifts, lookup - pass (shift-path)\t\t%u"
+ "\n\t\t\t\t\t\t\t\t",
+ rwc_non_lf_results.w_ks_r_pass_sp[i]);
+ printf("Hash add - key-shifts, Hash lookup fail\t\t\t\t%u\n",
+ rwc_non_lf_results.w_ks_r_fail[i]);
+
+ printf("_______\t\t_______\t\t_________\t___\t\t_________\t\t\t\t"
+ "\t\t_________________\n");
+ }
+
+ for (i = 1; i < NUM_TEST; i++) {
+ for (j = 0; j < NUM_TEST; j++) {
+ printf("%u", rwc_core_cnt[i]);
+ printf("\t\t%u\t\t", rwc_core_cnt[j]);
+ printf("Enabled\t\t");
+ printf("N/A\t\t");
+ printf("Multi-add-lookup\t\t\t\t\t\t%u\n\n\t\t\t\t",
+ rwc_lf_results.multi_rw[i][j]);
+ printf("Disabled\t");
+ if (htm)
+ printf("Enabled\t\t");
+ else
+ printf("Disabled\t");
+ printf("Multi-add-lookup\t\t\t\t\t\t%u\n",
+ rwc_non_lf_results.multi_rw[i][j]);
+
+ printf("_______\t\t_______\t\t_________\t___\t\t"
+ "_________\t\t\t\t\t\t_________________\n");
+ }
+ }
+
+ rte_free(tbl_rwc_test_param.keys);
+ rte_free(tbl_rwc_test_param.keys_no_ks);
+ rte_free(tbl_rwc_test_param.keys_ks);
+ rte_free(tbl_rwc_test_param.keys_absent);
+ rte_free(tbl_rwc_test_param.keys_shift_path);
+ rte_free(scanned_bkts);
+ return 0;
+}
+
+REGISTER_TEST_COMMAND(hash_readwrite_lf_autotest, test_hash_readwrite_lf_main);
--
2.7.4
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/7] hash: separate multi-writer from rw-concurrency
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 1/7] hash: separate multi-writer from rw-concurrency Honnappa Nagarahalli
@ 2018-10-13 1:02 ` Wang, Yipeng1
2018-10-15 20:15 ` Honnappa Nagarahalli
0 siblings, 1 reply; 20+ messages in thread
From: Wang, Yipeng1 @ 2018-10-13 1:02 UTC (permalink / raw)
To: Honnappa Nagarahalli, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, dharmik.thakkar, gavin.hu, nd, Gobriel, Sameh
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:honnappa.nagarahalli@arm.com]
>Sent: Thursday, October 11, 2018 11:32 PM
>To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; honnappa.nagarahalli@arm.com; dharmik.thakkar@arm.com;
>gavin.hu@arm.com; nd@arm.com
>Subject: [PATCH v3 1/7] hash: separate multi-writer from rw-concurrency
>
>RW concurrency is required with single writer and multiple reader
>usecase as well. Hence, multi-writer should not be enabled by default when
>RW concurrency is enabled.
>
>Fixes: f2e3001b53ec ("hash: support read/write concurrency")
>Cc: yipeng1.wang@intel.com
>
>Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>Reviewed-by: Gavin Hu <gavin.hu@arm.com>
>---
> lib/librte_hash/rte_cuckoo_hash.c | 27 ++++++++++++++++-----------
> lib/librte_hash/rte_cuckoo_hash.h | 2 ++
> test/test/test_hash_readwrite.c | 6 ++++--
> 3 files changed, 22 insertions(+), 13 deletions(-)
>
>+ uint8_t writer_takes_lock;
>+ /**< Indicates if the writer threads need to take lock */
[Wang, Yipeng]
Our understanding is that the difference between writer_takes_lock and multi_writer_support flag now is that for the multi-writer case
we still have the local cache for key-data pair slot. Please correct me if I am wrong.
But the name is confusing because writer_takes_lock implies multi-writer support. Especially the comment here says
that writer needs a lock, which means multi-writer is supported. So conceptually it does not have different meaning than the multi_writer_support
by just reading the name.
If you want to distinguish these two implementation (with vs. without cache), maybe change the name of multi-writer flag to use_local_cache flag?
And the previous locking mechanism need to enable this flag for performance reasons, while the LF does not.
Or just keep the cache for both cases, and I don't think the local cache will add too much overhead.
> rte_hash_function hash_func; /**< Function used to calculate hash. */
> uint32_t hash_func_init_val; /**< Init value used by hash_func. */
> rte_hash_cmp_eq_t rte_hash_custom_cmp_eq;
>diff --git a/test/test/test_hash_readwrite.c b/test/test/test_hash_readwrite.c
>index 2a4f7b9..a8fadd0 100644
>--- a/test/test/test_hash_readwrite.c
>+++ b/test/test/test_hash_readwrite.c
>@@ -122,10 +122,12 @@ init_params(int use_htm, int use_jhash)
> if (use_htm)
> hash_params.extra_flag =
> RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT |
>- RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
>+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY |
>+ RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD;
[Wang, Yipeng]
Could you double check that if current applications do not change their code, there is
no functional issue will be introduced by this change, otherwise this would be an API change.
I believe it will have performance implication though.
Otherwise I am OK with this patch.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 2/7] hash: support do not recycle on delete
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 2/7] hash: support do not recycle on delete Honnappa Nagarahalli
@ 2018-10-13 1:17 ` Wang, Yipeng1
2018-10-16 1:25 ` Honnappa Nagarahalli
0 siblings, 1 reply; 20+ messages in thread
From: Wang, Yipeng1 @ 2018-10-13 1:17 UTC (permalink / raw)
To: Honnappa Nagarahalli, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, dharmik.thakkar, gavin.hu, nd, Gobriel, Sameh
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:honnappa.nagarahalli@arm.com]
>Sent: Thursday, October 11, 2018 11:32 PM
>To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; honnappa.nagarahalli@arm.com; dharmik.thakkar@arm.com;
>gavin.hu@arm.com; nd@arm.com
>Subject: [PATCH v3 2/7] hash: support do not recycle on delete
>
>rte_hash_lookup_xxx APIs return the index of the element in
>the key store. Application(reader) can use that index to reference
>other data structures in its scope. Because of this, the
>index should not be recycled till the application completes
>using the index.
>RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is introduced to support this.
>When this flag is enabled rte_hash_del_xxx APIs do not free the
>key-store index/internal memory associated with the deleted
>entry. The new API rte_hash_free_key_with_position should be called
>to free the key-store index/internal memory after calling
>rte_hash_del_xxx APIs.
>
> uint8_t ext_table_support; /**< Enable extendable bucket table */
>+ uint8_t recycle_on_del;
>+ /**< If internal memory/key-store entry should be
>+ * freed on calling the rte_hash_del_xxx APIs.
>+ * If this is set, rte_hash_free_key_with_position must be
[Wang, Yipeng] If this is *not* set?
>+/** Flag to disable freeing of internal memory/indices on hash delete.
>+ * Refer to rte_hash_del_xxx APIs for more details.
>+ */
>+#define RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL 0x10
>+
[Wang, Yipeng] Maybe call it FREE_AFTER_DEL or NO_FREE_ON_DEL? Recycle_on_del
Sounds like we do the recycle at the delete time, which is opposite to the meaning.
Change *recycle* to *free* to be consistent with the function API name.
I guess I suggested to use *recycle* at beginning, but as a second thought,
I think *free* is more user friendly than recycle. Recycle makes more sense to developers.
And you already use *free* for the function name.
> /**
> * The type of hash value of a key.
> * It should be a value of at least 32bit with fully random pattern.
>@@ -236,6 +243,10 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t
> * and should only be called from one thread by default.
> * Thread safety can be enabled by setting flag during
> * table creation.
>+ * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is enabled,
>+ * the hash library's internal memory/index will not be freed by this
>+ * API. rte_hash_free_key_with_position API must be called additionally
>+ * to free the internal memory/index associated with the key.
[Wang, Yipeng] Maybe more explicit on the use case for this flag: This behavior is useful
for multi-threading applications which may still have threads referencing the position after deletion
(or other words of which you think more clear).
Otherwise
Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 3/7] hash: correct key store element alignment
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 3/7] hash: correct key store element alignment Honnappa Nagarahalli
@ 2018-10-13 1:20 ` Wang, Yipeng1
2018-10-16 23:26 ` Honnappa Nagarahalli
0 siblings, 1 reply; 20+ messages in thread
From: Wang, Yipeng1 @ 2018-10-13 1:20 UTC (permalink / raw)
To: Honnappa Nagarahalli, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, dharmik.thakkar, gavin.hu, nd, Gobriel, Sameh
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:honnappa.nagarahalli@arm.com]
>Sent: Thursday, October 11, 2018 11:32 PM
>To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; honnappa.nagarahalli@arm.com; dharmik.thakkar@arm.com;
>gavin.hu@arm.com; nd@arm.com
>Subject: [PATCH v3 3/7] hash: correct key store element alignment
[Wang, Yipeng] "correct" -> "improve"?
>
>Correct the key store array element alignment. This is required to
>make 'pdata' in 'struct rte_hash_key' align on the correct boundary.
[Wang, Yipeng]
More explanation in commit message is appreciated, because people may not understand
what is "correct" boundary.
e.g. Previously pdata could spread across multiple cache lines, which makes the access of pdata
non-atomic which may have performance implications.
Otherwise
Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 4/7] hash: add memory ordering to avoid race conditions
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 4/7] hash: add memory ordering to avoid race conditions Honnappa Nagarahalli
@ 2018-10-13 1:56 ` Wang, Yipeng1
2018-10-16 23:28 ` Honnappa Nagarahalli
0 siblings, 1 reply; 20+ messages in thread
From: Wang, Yipeng1 @ 2018-10-13 1:56 UTC (permalink / raw)
To: Honnappa Nagarahalli, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, dharmik.thakkar, gavin.hu, nd, Gobriel, Sameh
When I applied this commit:
fatal: sha1 information is lacking or useless (lib/librte_hash/rte_cuckoo_hash.c).
Please double check.
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:honnappa.nagarahalli@arm.com]
>Sent: Thursday, October 11, 2018 11:32 PM
>To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; honnappa.nagarahalli@arm.com; dharmik.thakkar@arm.com;
>gavin.hu@arm.com; nd@arm.com
>Subject: [PATCH v3 4/7] hash: add memory ordering to avoid race conditions
>
>Only race condition that can occur is - using the key store element
>before the key write is completed. Hence, while inserting the element
>the release memory order is used. Any other race condition is caught
>by the key comparison. Memory orderings are added only where needed.
>For ex: reads in the writer's context do not need memory ordering
>as there is a single writer.
[Wang, Yipeng]
I remember we discussed that this commit itself does not fix any bug/issue, or enabling feature, it is supposed to work with the following patches
to enable lock-free read-write concurrency. You separated the commits for easier review.
If you plan to merge these commits and change the commit message it would be fine. Otherwise the current message title and content
Is misleading. It sounds like a bug fix but actually not.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 5/7] hash: fix rw concurrency while moving keys
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 5/7] hash: fix rw concurrency while moving keys Honnappa Nagarahalli
@ 2018-10-13 2:07 ` Wang, Yipeng1
0 siblings, 0 replies; 20+ messages in thread
From: Wang, Yipeng1 @ 2018-10-13 2:07 UTC (permalink / raw)
To: Honnappa Nagarahalli, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, dharmik.thakkar, gavin.hu, nd, Gobriel, Sameh
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:honnappa.nagarahalli@arm.com]
>Sent: Thursday, October 11, 2018 11:32 PM
>To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; honnappa.nagarahalli@arm.com; dharmik.thakkar@arm.com;
>gavin.hu@arm.com; nd@arm.com
>Subject: [PATCH v3 5/7] hash: fix rw concurrency while moving keys
>
>Reader-writer concurrency issue, caused by moving the keys
>to their alternative locations during key insert, is solved
>by introducing a global counter(tbl_chng_cnt) indicating a
>change in table.
>
[Wang, Yipeng]
Similar to my comment on previous commit, the commit message is misleading. It is not a bug fix. This commit
together with the next enables the LF feature.
>Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>Reviewed-by: Gavin Hu <gavin.hu@arm.com>
>Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
>Reviewed-by: Steve Capper <steve.capper@arm.com>
>Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
>---
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/7] hash: enable lock-free reader-writer concurrency
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 6/7] hash: enable lock-free reader-writer concurrency Honnappa Nagarahalli
@ 2018-10-13 2:32 ` Wang, Yipeng1
2018-10-17 13:54 ` Honnappa Nagarahalli
0 siblings, 1 reply; 20+ messages in thread
From: Wang, Yipeng1 @ 2018-10-13 2:32 UTC (permalink / raw)
To: Honnappa Nagarahalli, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, dharmik.thakkar, gavin.hu, nd, Gobriel, Sameh
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:honnappa.nagarahalli@arm.com]
>Sent: Thursday, October 11, 2018 11:32 PM
>To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; honnappa.nagarahalli@arm.com; dharmik.thakkar@arm.com;
>gavin.hu@arm.com; nd@arm.com
>Subject: [PATCH v3 6/7] hash: enable lock-free reader-writer concurrency
>
>Add the flag to enable reader-writer concurrency during
>run time. The rte_hash_del_xxx APIs do not free the keystore
>element when this flag is enabled. Hence a new API,
>rte_hash_free_key_with_position, to free the key store element
>is added.
>
>Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>Reviewed-by: Gavin Hu <gavin.hu@arm.com>
>Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
>Reviewed-by: Steve Capper <steve.capper@arm.com>
>Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
>
>+/** Flag to support lock free reader writer concurrency. Writer can be
>+ * single writer/multi writer.
[Wang, Yipeng] "Writer can be multi-writer when the RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD flag is also set."
>+ * Currently, extended bucket table feature is not supported with
[Wang, Yipeng] "extendable bucket table", I also used wrong name sometimes but please use this one.
>@@ -156,6 +169,10 @@ rte_hash_count(const struct rte_hash *h);
> * and should only be called from one thread by default.
> * Thread safety can be enabled by setting flag during
> * table creation.
>+ * The writer needs to be aware if this API is called to update
>+ * an existing entry. The application should free any memory
>+ * allocated for the existing 'data' only after all the readers
>+ * have stopped referrencing it.
> *
[Wang, Yipeng]
This comment to me is assuming a specific user use case, and not the library's responsibility.
How about a more general description:
If the added key is already in the table, the function will update the data of the key.
In such case, it is the user's responsibility to properly handle the old data if the old data
is still being referenced by other threads.
Please let me know if I understand it wrong.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 7/7] test/hash: read-write lock-free concurrency test
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 7/7] test/hash: read-write lock-free concurrency test Honnappa Nagarahalli
@ 2018-10-13 2:52 ` Wang, Yipeng1
0 siblings, 0 replies; 20+ messages in thread
From: Wang, Yipeng1 @ 2018-10-13 2:52 UTC (permalink / raw)
To: Honnappa Nagarahalli, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, dharmik.thakkar, gavin.hu, nd, Gobriel, Sameh
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:honnappa.nagarahalli@arm.com]
>Sent: Thursday, October 11, 2018 11:32 PM
>To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>; honnappa.nagarahalli@arm.com; dharmik.thakkar@arm.com;
>gavin.hu@arm.com; nd@arm.com
>Subject: [PATCH v3 7/7] test/hash: read-write lock-free concurrency test
>
>From: Dharmik Thakkar <dharmik.thakkar@arm.com>
>
>Unit tests to check for hash lookup perf with lock-free enabled and
>with lock-free disabled.
>Unit tests performed with readers running in parallel with writers.
>
>Tests include:
>
>- hash lookup on existing keys with:
> - hash add causing NO key-shifts of existing keys in the table
>
>- hash lookup on existing keys likely to be on shift-path with:
> - hash add causing key-shifts of existing keys in the table
>
>- hash lookup on existing keys NOT likely to be on shift-path with:
> - hash add causing key-shifts of existing keys in the table
>
>- hash lookup on non-existing keys with:
> - hash add causing NO key-shifts of existing keys in the table
> - hash add causing key-shifts of existing keys in the table
>
>- hash lookup on keys likely to be on shift-path with:
> - multiple writers causing key-shifts of existing keys in the table
>
>Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
>Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>Reviewed-by: Gavin Hu <gavin.hu@arm.com>
>---
> test/test/Makefile | 1 +
> test/test/meson.build | 1 +
> test/test/test_hash_readwrite_lf.c | 1084 ++++++++++++++++++++++++++++++++++++
> 3 files changed, 1086 insertions(+)
> create mode 100644 test/test/test_hash_readwrite_lf.c
>
>diff --git a/test/test/Makefile b/test/test/Makefile
>index 5d8b1dc..0e0e6c4 100644
>--- a/test/test/Makefile
>+++ b/test/test/Makefile
>@@ -116,6 +116,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite.c
>+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_readwrite_lf.c
>
> SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
> SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
>diff --git a/test/test/meson.build b/test/test/meson.build
>index d696f5e..bc3350f 100644
>--- a/test/test/meson.build
>+++ b/test/test/meson.build
>@@ -46,6 +46,7 @@ test_sources = files('commands.c',
> 'test_hash_multiwriter.c',
> 'test_hash_readwrite.c',
> 'test_hash_perf.c',
>+ 'test_hash_readwrite_lf.c',
> 'test_hash_scaling.c',
> 'test_interrupts.c',
> 'test_kni.c',
>diff --git a/test/test/test_hash_readwrite_lf.c b/test/test/test_hash_readwrite_lf.c
>new file mode 100644
>index 0000000..841e989
>--- /dev/null
>+++ b/test/test/test_hash_readwrite_lf.c
>@@ -0,0 +1,1084 @@
>+/* SPDX-License-Identifier: BSD-3-Clause
>+ * Copyright(c) 2018 Arm Limited
>+ */
>+
>+#include <inttypes.h>
>+#include <locale.h>
>+
>+#include <rte_cycles.h>
>+#include <rte_hash.h>
>+#include <rte_hash_crc.h>
>+#include <rte_jhash.h>
>+#include <rte_launch.h>
>+#include <rte_malloc.h>
>+#include <rte_random.h>
>+#include <rte_spinlock.h>
>+
>+#include "test.h"
>+
>+#ifndef RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF
>+#define RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF 0
>+#endif
>+
>+#define RUN_WITH_HTM_DISABLED 0
>+
>+#if (RUN_WITH_HTM_DISABLED)
>+
>+#define TOTAL_ENTRY (5*1024)
>+#define TOTAL_INSERT (5*1024)
>+
>+#else
>+
>+#define TOTAL_ENTRY (16*1024*1024)
>+#define TOTAL_INSERT (16*1024*1024)
[Wang, Yipeng]
You can make the number smaller since a previous patch on multi-write test
suggests to use a smaller number (how about 5 million) to reduce the total testing time.
>+
>+static rte_atomic64_t gread_cycles;
>+static rte_atomic64_t greads;
>+
>+static volatile uint8_t writer_done;
>+static volatile uint8_t multi_writer_done[4];
>+uint8_t num_test;
>+uint8_t htm;
[Wang, Yipeng] It would be better to make the variables local.
>+
>+ for (i = 0; i < count_keys_no_ks; i++) {
>+ /* Identify keys in keys_no_ks with value less than 1M */
[Wang, Yipeng] TOTAL_INSERT is 16M. Other comments too.
>+ if (keys_no_ks[i] < TOTAL_INSERT)
>+ found[keys_no_ks[i]]++;
>+ }
>+
>+ for (i = 0; i < count_keys_ks; i++) {
>+ /* Identify keys in keys_ks with value less than 1M */
>+ if (keys_ks[i] < TOTAL_INSERT)
>+ found[keys_ks[i]]++;
>+ }
>+
>+ uint32_t count_keys_absent = 0;
>+ for (i = 0; i < TOTAL_INSERT; i++) {
>+ /* Identify missing keys between 0 and 1M */
>+/*
>+ * Test lookup perf:
[Wang, Yipeng] add comment: for multi-writer
>+ * Reader(s) lookup keys present in the table and likely on the shift-path while
>+ * Writers add keys causing key-shifts.
>+ */
>+static int
>+test_hash_multi_add_lookup(struct rwc_perf *rwc_perf_results, int rwc_lf)
>+{
>+ unsigned int n, m;
>+ uint64_t i;
>+static int
>+test_hash_readwrite_lf_main(void)
>+{
>+ /*
>+ * Variables used to choose different tests.
>+ * rwc_lf indicates if read-write concurrency lock-free support is
>+ * enabled.
>+ * htm indicates if Hardware transactional memory support is enabled.
>+ */
>+ int rwc_lf = 0;
>+ int use_jhash = 0;
>+ num_test = NUM_TEST;
>+ if (rte_lcore_count() == 1) {
>+ printf("More than one lcore is required "
>+ "to do read write lock-free concurrency test\n");
>+ return -1;
>+ }
>+
>+ setlocale(LC_NUMERIC, "");
>+
>+ if (rte_tm_supported())
>+ htm = 1;
>+ else
>+ htm = 0;
>+
>+ if (init_params(rwc_lf, use_jhash) != 0)
>+ return -1;
>+ if (generate_keys() != 0)
[Wang, Yipeng] Generate_keys may run for a long time, so print something to indicate
the test has started.
>+ return -1;
>+ if (get_enabled_cores_list() != 0)
>+ return -1;
>+
[Wang, Yipeng]
Please change/add multi-lookup(bulk lookup) to all the tests. Bulk lookup is usually more commonly used and people care about
as a performance test.
It would make sense to change other multi-thread performance unit test to use bulk lookup as well in future.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/7] hash: separate multi-writer from rw-concurrency
2018-10-13 1:02 ` Wang, Yipeng1
@ 2018-10-15 20:15 ` Honnappa Nagarahalli
0 siblings, 0 replies; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-15 20:15 UTC (permalink / raw)
To: Wang, Yipeng1, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, Dharmik Thakkar, Gavin Hu (Arm Technology China),
nd, Gobriel, Sameh
Hi Yipeng,
Thank you for the review, appreciate your efforts.
Thank you,
Honnappa
> >
> >RW concurrency is required with single writer and multiple reader
> >usecase as well. Hence, multi-writer should not be enabled by default
> >when RW concurrency is enabled.
> >
> >Fixes: f2e3001b53ec ("hash: support read/write concurrency")
> >Cc: yipeng1.wang@intel.com
> >
> >Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> >Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> >---
> > lib/librte_hash/rte_cuckoo_hash.c | 27 ++++++++++++++++-----------
> >lib/librte_hash/rte_cuckoo_hash.h | 2 ++
> > test/test/test_hash_readwrite.c | 6 ++++--
> > 3 files changed, 22 insertions(+), 13 deletions(-)
> >
> >+ uint8_t writer_takes_lock;
> >+ /**< Indicates if the writer threads need to take lock */
>
> [Wang, Yipeng]
> Our understanding is that the difference between writer_takes_lock and
> multi_writer_support flag now is that for the multi-writer case we still have
> the local cache for key-data pair slot. Please correct me if I am wrong.
Yes, that is correct. Need for lock and need for local cache are separated.
>
> But the name is confusing because writer_takes_lock implies multi-writer
> support. Especially the comment here says that writer needs a lock, which
> means multi-writer is supported. So conceptually it does not have different
> meaning than the multi_writer_support by just reading the name.
>
> If you want to distinguish these two implementation (with vs. without cache),
> maybe change the name of multi-writer flag to use_local_cache flag?
Agree, I will change the 'multi-writer' name to 'use_local_cache'. I will keep the 'writer_takes_lock' flag.
> And the previous locking mechanism need to enable this flag for performance
> reasons, while the LF does not.
> Or just keep the cache for both cases, and I don't think the local cache will
> add too much overhead.
Agree, it might not make much of a performance difference. My goal is to reduce the memory used when multi-writer is not enabled.
>
> > rte_hash_function hash_func; /**< Function used to calculate hash.
> */
> > uint32_t hash_func_init_val; /**< Init value used by hash_func. */
> > rte_hash_cmp_eq_t rte_hash_custom_cmp_eq; diff --git
> >a/test/test/test_hash_readwrite.c b/test/test/test_hash_readwrite.c
> >index 2a4f7b9..a8fadd0 100644
> >--- a/test/test/test_hash_readwrite.c
> >+++ b/test/test/test_hash_readwrite.c
> >@@ -122,10 +122,12 @@ init_params(int use_htm, int use_jhash)
> > if (use_htm)
> > hash_params.extra_flag =
> > RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT |
> >- RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY;
> >+ RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY |
> >+ RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD;
> [Wang, Yipeng]
> Could you double check that if current applications do not change their code,
> there is no functional issue will be introduced by this change, otherwise this
> would be an API change.
> I believe it will have performance implication though.
It is not advertised that multi-writer add is assumed when read-write concurrency is enabled. I think we should be ok.
The functionality will be fine. Only difference is that the local-cache will not be enabled without this flag. So, yes, there will be performance implication.
>
> Otherwise I am OK with this patch.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 2/7] hash: support do not recycle on delete
2018-10-13 1:17 ` Wang, Yipeng1
@ 2018-10-16 1:25 ` Honnappa Nagarahalli
0 siblings, 0 replies; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-16 1:25 UTC (permalink / raw)
To: Wang, Yipeng1, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, Dharmik Thakkar, Gavin Hu (Arm Technology China),
nd, Gobriel, Sameh
> >
> >rte_hash_lookup_xxx APIs return the index of the element in the key
> >store. Application(reader) can use that index to reference other data
> >structures in its scope. Because of this, the index should not be
> >recycled till the application completes using the index.
> >RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is introduced to support this.
> >When this flag is enabled rte_hash_del_xxx APIs do not free the
> >key-store index/internal memory associated with the deleted entry. The
> >new API rte_hash_free_key_with_position should be called to free the
> >key-store index/internal memory after calling rte_hash_del_xxx APIs.
> >
> > uint8_t ext_table_support; /**< Enable extendable bucket table */
> >+ uint8_t recycle_on_del;
> >+ /**< If internal memory/key-store entry should be
> >+ * freed on calling the rte_hash_del_xxx APIs.
> >+ * If this is set, rte_hash_free_key_with_position must be
> [Wang, Yipeng] If this is *not* set?
Agree.
>
> >+/** Flag to disable freeing of internal memory/indices on hash delete.
> >+ * Refer to rte_hash_del_xxx APIs for more details.
> >+ */
> >+#define RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL 0x10
> >+
> [Wang, Yipeng] Maybe call it FREE_AFTER_DEL or NO_FREE_ON_DEL?
> Recycle_on_del Sounds like we do the recycle at the delete time, which is
> opposite to the meaning.
>
> Change *recycle* to *free* to be consistent with the function API name.
> I guess I suggested to use *recycle* at beginning, but as a second thought, I
> think *free* is more user friendly than recycle. Recycle makes more sense to
> developers.
> And you already use *free* for the function name.
Will change it to 'NO_FREE_ON_DEL'. It will be disabled by default. Will be enabled by default for RW-Concurrency-LF.
>
> > /**
> > * The type of hash value of a key.
> > * It should be a value of at least 32bit with fully random pattern.
> >@@ -236,6 +243,10 @@ rte_hash_add_key_with_hash(const struct rte_hash
> >*h, const void *key, hash_sig_t
> > * and should only be called from one thread by default.
> > * Thread safety can be enabled by setting flag during
> > * table creation.
> >+ * If RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL is enabled,
> >+ * the hash library's internal memory/index will not be freed by this
> >+ * API. rte_hash_free_key_with_position API must be called
> >+ additionally
> >+ * to free the internal memory/index associated with the key.
> [Wang, Yipeng] Maybe more explicit on the use case for this flag: This
> behavior is useful for multi-threading applications which may still have
> threads referencing the position after deletion (or other words of which you
> think more clear).
I don't think we should address any particular use case here as there can be many which we might not have imagined. IMO, it is better to do this in documentation where we can talk about use cases (that we know and want to address) and how to use these APIs.
>
> Otherwise
> Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 3/7] hash: correct key store element alignment
2018-10-13 1:20 ` Wang, Yipeng1
@ 2018-10-16 23:26 ` Honnappa Nagarahalli
0 siblings, 0 replies; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-16 23:26 UTC (permalink / raw)
To: Wang, Yipeng1, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, Dharmik Thakkar, Gavin Hu (Arm Technology China),
nd, Gobriel, Sameh
> >-----Original Message-----
> >From: Honnappa Nagarahalli [mailto:honnappa.nagarahalli@arm.com]
> >Sent: Thursday, October 11, 2018 11:32 PM
> >To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch,
> >Pablo <pablo.de.lara.guarch@intel.com>
> >Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>;
> >honnappa.nagarahalli@arm.com; dharmik.thakkar@arm.com;
> >gavin.hu@arm.com; nd@arm.com
> >Subject: [PATCH v3 3/7] hash: correct key store element alignment
> [Wang, Yipeng] "correct" -> "improve"?
I think 'fix' is the right word to use. If we look at the existing code, original author seems to have tried to align it on certain boundary:
struct rte_hash_key {
union {
uintptr_t idata;
void *pdata;
};
/* Variable key size */
char key[0];
} __attribute__((aligned(KEY_ALIGNMENT)));
But, this does not align every element of the key-store on the alignment boundary. This patch fixes it.
I think what is missing is the "Fixes" tag. I will add that.
I found this bug because I made the store/load of 'pdata' atomic.
> >
> >Correct the key store array element alignment. This is required to make
> >'pdata' in 'struct rte_hash_key' align on the correct boundary.
> [Wang, Yipeng]
> More explanation in commit message is appreciated, because people may not
> understand what is "correct" boundary.
> e.g. Previously pdata could spread across multiple cache lines, which makes
> the access of pdata non-atomic which may have performance implications.
>
I will add more explanation related to atomic access.
> Otherwise
> Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 4/7] hash: add memory ordering to avoid race conditions
2018-10-13 1:56 ` Wang, Yipeng1
@ 2018-10-16 23:28 ` Honnappa Nagarahalli
0 siblings, 0 replies; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-16 23:28 UTC (permalink / raw)
To: Wang, Yipeng1, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, Dharmik Thakkar, Gavin Hu (Arm Technology China),
nd, Gobriel, Sameh
> When I applied this commit:
> fatal: sha1 information is lacking or useless
> (lib/librte_hash/rte_cuckoo_hash.c).
> Please double check.
I will verify the patches when I send out the next version.
>
> >-----Original Message-----
> >From: Honnappa Nagarahalli [mailto:honnappa.nagarahalli@arm.com]
> >Sent: Thursday, October 11, 2018 11:32 PM
> >To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch,
> >Pablo <pablo.de.lara.guarch@intel.com>
> >Cc: dev@dpdk.org; Wang, Yipeng1 <yipeng1.wang@intel.com>;
> >honnappa.nagarahalli@arm.com; dharmik.thakkar@arm.com;
> >gavin.hu@arm.com; nd@arm.com
> >Subject: [PATCH v3 4/7] hash: add memory ordering to avoid race
> >conditions
> >
> >Only race condition that can occur is - using the key store element
> >before the key write is completed. Hence, while inserting the element
> >the release memory order is used. Any other race condition is caught by
> >the key comparison. Memory orderings are added only where needed.
> >For ex: reads in the writer's context do not need memory ordering as
> >there is a single writer.
> [Wang, Yipeng]
> I remember we discussed that this commit itself does not fix any bug/issue, or
> enabling feature, it is supposed to work with the following patches to enable
> lock-free read-write concurrency. You separated the commits for easier
> review.
>
> If you plan to merge these commits and change the commit message it would
> be fine. Otherwise the current message title and content Is misleading. It
> sounds like a bug fix but actually not.
I think it is time to merge this commit. I will do it in the next version.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/7] hash: enable lock-free reader-writer concurrency
2018-10-13 2:32 ` Wang, Yipeng1
@ 2018-10-17 13:54 ` Honnappa Nagarahalli
0 siblings, 0 replies; 20+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-17 13:54 UTC (permalink / raw)
To: Wang, Yipeng1, Richardson, Bruce, De Lara Guarch, Pablo
Cc: dev, Dharmik Thakkar, Gavin Hu (Arm Technology China),
nd, Gobriel, Sameh
> >Subject: [PATCH v3 6/7] hash: enable lock-free reader-writer
> >concurrency
> >
> >Add the flag to enable reader-writer concurrency during run time. The
> >rte_hash_del_xxx APIs do not free the keystore element when this flag
> >is enabled. Hence a new API, rte_hash_free_key_with_position, to free
> >the key store element is added.
> >
> >Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> >Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> >Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> >Reviewed-by: Steve Capper <steve.capper@arm.com>
> >Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com>
> >
> >+/** Flag to support lock free reader writer concurrency. Writer can be
> >+ * single writer/multi writer.
> [Wang, Yipeng] "Writer can be multi-writer when the
> RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD flag is also set."
Agree. What I am trying to say is that lock free reader writer concurrency is support for both single writer or multi writer use cases. This was one of your questions in the earlier version.
> >+ * Currently, extended bucket table feature is not supported with
> [Wang, Yipeng] "extendable bucket table", I also used wrong name sometimes
> but please use this one.
> >@@ -156,6 +169,10 @@ rte_hash_count(const struct rte_hash *h);
> > * and should only be called from one thread by default.
> > * Thread safety can be enabled by setting flag during
> > * table creation.
> >+ * The writer needs to be aware if this API is called to update
> >+ * an existing entry. The application should free any memory
> >+ * allocated for the existing 'data' only after all the readers
> >+ * have stopped referrencing it.
> > *
> [Wang, Yipeng]
> This comment to me is assuming a specific user use case, and not the library's
> responsibility.
> How about a more general description:
> If the added key is already in the table, the function will update the data of the
> key.
> In such case, it is the user's responsibility to properly handle the old data if
> the old data is still being referenced by other threads.
> Please let me know if I understand it wrong.
Your understanding is correct. I have changed the wordings.
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2018-10-17 13:54 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-12 6:31 [dpdk-dev] [PATCH v3 0/7] Address reader-writer concurrency in rte_hash Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 1/7] hash: separate multi-writer from rw-concurrency Honnappa Nagarahalli
2018-10-13 1:02 ` Wang, Yipeng1
2018-10-15 20:15 ` Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 2/7] hash: support do not recycle on delete Honnappa Nagarahalli
2018-10-13 1:17 ` Wang, Yipeng1
2018-10-16 1:25 ` Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 3/7] hash: correct key store element alignment Honnappa Nagarahalli
2018-10-13 1:20 ` Wang, Yipeng1
2018-10-16 23:26 ` Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 4/7] hash: add memory ordering to avoid race conditions Honnappa Nagarahalli
2018-10-13 1:56 ` Wang, Yipeng1
2018-10-16 23:28 ` Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 5/7] hash: fix rw concurrency while moving keys Honnappa Nagarahalli
2018-10-13 2:07 ` Wang, Yipeng1
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 6/7] hash: enable lock-free reader-writer concurrency Honnappa Nagarahalli
2018-10-13 2:32 ` Wang, Yipeng1
2018-10-17 13:54 ` Honnappa Nagarahalli
2018-10-12 6:31 ` [dpdk-dev] [PATCH v3 7/7] test/hash: read-write lock-free concurrency test Honnappa Nagarahalli
2018-10-13 2:52 ` Wang, Yipeng1
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).