* [PATCH 0/2] introduce NitroSketch Mode into membership library
@ 2022-08-10 7:45 Leyi Rong
2022-08-10 7:45 ` [PATCH 1/2] member: implement NitroSketch mode Leyi Rong
` (5 more replies)
0 siblings, 6 replies; 34+ messages in thread
From: Leyi Rong @ 2022-08-10 7:45 UTC (permalink / raw)
To: yipeng1.wang, zaoxingliu, sameh.gobriel; +Cc: dev, Leyi Rong
This patchset introduce a brand new NitroSketch Mode into membership
library. This algorithm provides high-fidelity approximate measurements
and appears as a promissing alternative to triditional approches such as
packet sampling.
Leyi Rong (2):
member: implement NitroSketch mode
test/member: add functional and perf tests for sketch
app/test/test_member.c | 258 ++++++++++++
app/test/test_member_perf.c | 153 ++++++-
lib/member/meson.build | 37 +-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 149 ++++++-
lib/member/rte_member_heap.h | 420 +++++++++++++++++++
lib/member/rte_member_sketch.c | 583 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 96 +++++
lib/member/rte_member_sketch_avx512.c | 69 +++
lib/member/rte_member_sketch_avx512.h | 36 ++
lib/member/rte_xxh64_avx512.h | 117 ++++++
11 files changed, 1985 insertions(+), 8 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 1/2] member: implement NitroSketch mode
2022-08-10 7:45 [PATCH 0/2] introduce NitroSketch Mode into membership library Leyi Rong
@ 2022-08-10 7:45 ` Leyi Rong
2022-08-16 8:59 ` Ferruh Yigit
2022-08-10 7:45 ` [PATCH 2/2] test/member: add functional and perf tests for sketch Leyi Rong
` (4 subsequent siblings)
5 siblings, 1 reply; 34+ messages in thread
From: Leyi Rong @ 2022-08-10 7:45 UTC (permalink / raw)
To: yipeng1.wang, zaoxingliu, sameh.gobriel; +Cc: dev, Leyi Rong
Sketching algorithm provide high-fidelity approximate measurements and
appears as a promising alternative to traditional approches such as
packet sampling.
NitroSketch [1] is a software sketching framework that optimizes
performance, provides accuracy guarantees, and supports a variety of
sketches.
This commit adds a new data structure called sketch into
membership library. This new data structure is an efficient
way to profile the traffic for heavy hitters. Also use min-heap
structure to maintain the top-k flow keys.
[1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
lib/member/meson.build | 37 +-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 149 ++++++-
lib/member/rte_member_heap.h | 420 +++++++++++++++++++
lib/member/rte_member_sketch.c | 583 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 96 +++++
lib/member/rte_member_sketch_avx512.c | 69 +++
lib/member/rte_member_sketch_avx512.h | 36 ++
lib/member/rte_xxh64_avx512.h | 117 ++++++
9 files changed, 1578 insertions(+), 4 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
diff --git a/lib/member/meson.build b/lib/member/meson.build
index e06fddc240..ab9d48544a 100644
--- a/lib/member/meson.build
+++ b/lib/member/meson.build
@@ -7,6 +7,41 @@ if is_windows
subdir_done()
endif
-sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c')
+sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c', 'rte_member_sketch.c')
headers = files('rte_member.h')
deps += ['hash']
+includes += include_directories('../hash', '../ring')
+
+# compile AVX512 version if:
+# we are building 64-bit binary AND binutils can generate proper code
+if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
+ # compile AVX512 version if either:
+ # a. we have AVX512 supported in minimum instruction set
+ # baseline
+ # b. it's not minimum instruction set, but supported by
+ # compiler
+ #
+ # in former case, just add avx512 C file to files list
+ # in latter case, compile c file to static lib, using correct
+ # compiler flags, and then have the .o file from static lib
+ # linked into main lib.
+ sketch_avx512_cpu_support = (
+ cc.get_define('__AVX512F__', args: machine_args) != ''
+ )
+
+ if sketch_avx512_cpu_support == true
+ cflags += ['-DCC_AVX512_SUPPORT']
+ if cc.has_argument('-mavx512f')
+ cflags += ['-mavx', '-mavx2', '-mavx512f', '-mavx512ifma', '-march=icelake-server']
+ endif
+ sources += files('rte_member_sketch_avx512.c')
+ elif cc.has_argument('-mavx512f')
+ cflags += '-DCC_AVX512_SUPPORT'
+ sketch_avx512_tmp = static_library('sketch_avx512_tmp',
+ 'rte_member_sketch_avx512.c',
+ include_directories: includes,
+ dependencies: static_rte_eal,
+ c_args: cflags + ['-mavx', '-mavx2', '-mavx512f', '-mavx512ifma', '-march=icelake-server'])
+ objs += sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
+ endif
+endif
diff --git a/lib/member/rte_member.c b/lib/member/rte_member.c
index 7e1632e6b5..8f859f7fbd 100644
--- a/lib/member/rte_member.c
+++ b/lib/member/rte_member.c
@@ -9,10 +9,12 @@
#include <rte_malloc.h>
#include <rte_errno.h>
#include <rte_tailq.h>
+#include <rte_ring_elem.h>
#include "rte_member.h"
#include "rte_member_ht.h"
#include "rte_member_vbf.h"
+#include "rte_member_sketch.h"
TAILQ_HEAD(rte_member_list, rte_tailq_entry);
static struct rte_tailq_elem rte_member_tailq = {
@@ -72,6 +74,9 @@ rte_member_free(struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_free_vbf(setsum);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_free_sketch(setsum);
+ break;
default:
break;
}
@@ -86,6 +91,8 @@ rte_member_create(const struct rte_member_parameters *params)
struct rte_member_list *member_list;
struct rte_member_setsum *setsum;
int ret;
+ char ring_name[RTE_RING_NAMESIZE];
+ struct rte_ring *sketch_key_ring = NULL;
if (params == NULL) {
rte_errno = EINVAL;
@@ -100,6 +107,16 @@ rte_member_create(const struct rte_member_parameters *params)
return NULL;
}
+ if (params->type == RTE_MEMBER_TYPE_SKETCH) {
+ snprintf(ring_name, sizeof(ring_name), "SK_%s", params->name);
+ sketch_key_ring = rte_ring_create_elem(ring_name, sizeof(uint32_t),
+ rte_align32pow2(params->top_k), params->socket_id, 0);
+ if (sketch_key_ring == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Ring Memory allocation failed\n");
+ return NULL;
+ }
+ }
+
member_list = RTE_TAILQ_CAST(rte_member_tailq.head, rte_member_list);
rte_mcfg_tailq_write_lock();
@@ -145,6 +162,9 @@ rte_member_create(const struct rte_member_parameters *params)
case RTE_MEMBER_TYPE_VBF:
ret = rte_member_create_vbf(setsum, params);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ ret = rte_member_create_sketch(setsum, params, sketch_key_ring);
+ break;
default:
goto error_unlock_exit;
}
@@ -162,6 +182,7 @@ rte_member_create(const struct rte_member_parameters *params)
error_unlock_exit:
rte_free(te);
rte_free(setsum);
+ rte_ring_free(sketch_key_ring);
rte_mcfg_tailq_write_unlock();
return NULL;
}
@@ -178,6 +199,23 @@ rte_member_add(const struct rte_member_setsum *setsum, const void *key,
return rte_member_add_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_add_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch(setsum, key, set_id);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count)
+{
+ if (setsum == NULL || key == NULL || byte_count == 0)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch_byte_count(setsum, key, byte_count);
default:
return -EINVAL;
}
@@ -195,6 +233,8 @@ rte_member_lookup(const struct rte_member_setsum *setsum, const void *key,
return rte_member_lookup_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_lookup_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_lookup_sketch(setsum, key, set_id);
default:
return -EINVAL;
}
@@ -261,6 +301,36 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
}
}
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output)
+{
+ if (setsum == NULL || key == NULL || output == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_query_sketch(setsum, key, output);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count)
+{
+ if (setsum == NULL || key == NULL || count == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_report_heavyhitter_sketch(setsum, key, count);
+ default:
+ return -EINVAL;
+ }
+}
+
int
rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id)
@@ -272,6 +342,8 @@ rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
case RTE_MEMBER_TYPE_HT:
return rte_member_delete_ht(setsum, key, set_id);
/* current vBF implementation does not support delete function */
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_delete_sketch(setsum, key);
case RTE_MEMBER_TYPE_VBF:
default:
return -EINVAL;
@@ -290,6 +362,9 @@ rte_member_reset(const struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_reset_vbf(setsum);
return;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_reset_sketch(setsum);
+ return;
default:
return;
}
diff --git a/lib/member/rte_member.h b/lib/member/rte_member.h
index 2611015771..6fd591611c 100644
--- a/lib/member/rte_member.h
+++ b/lib/member/rte_member.h
@@ -39,6 +39,18 @@
* | | | not overwrite | |
* | | | existing key. | |
* +----------+---------------------+----------------+-------------------------+
+ * +==========+=============================+
+ * | type | sketch |
+ * +==========+=============================+
+ * |structure | counting bloom filter array |
+ * +----------+-----------------------------+
+ * |set id | 1: heavy set, 0: light set |
+ * | | |
+ * +----------+-----------------------------+
+ * |usages & | count size of a flow, |
+ * |properties| used for heavy hitter |
+ * | | detection. |
+ * +----------+-----------------------------+
* -->
*/
@@ -50,6 +62,7 @@ extern "C" {
#endif
#include <stdint.h>
+#include <stdbool.h>
#include <rte_common.h>
@@ -66,6 +79,19 @@ typedef uint16_t member_set_t;
/** Maximum number of characters in setsum name. */
#define RTE_MEMBER_NAMESIZE 32
+/**
+ * As packets skipped in the sampling-based algorithm, the accounting
+ * results accuracy is not guaranteed in the start stage. There should
+ * be a "convergence time" to achieve the accuracy after receiving enough
+ * packets.
+ * For sketch, use the flag if prefer always bounded mode, which only
+ * starts sampling after receiving enough packets to keep the results
+ * accuracy always bounded.
+ */
+#define RTE_MEMBER_SKETCH_ALWAYS_BOUNDED 0x01
+/** For sketch, use the flag if to count packet size instead of packet count */
+#define RTE_MEMBER_SKETCH_COUNT_BYTE 0x02
+
/** @internal Hash function used by membership library. */
#if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32)
#include <rte_hash_crc.h>
@@ -104,6 +130,7 @@ struct rte_member_parameters;
enum rte_member_setsum_type {
RTE_MEMBER_TYPE_HT = 0, /**< Hash table based set summary. */
RTE_MEMBER_TYPE_VBF, /**< Vector of bloom filters. */
+ RTE_MEMBER_TYPE_SKETCH,
RTE_MEMBER_NUM_TYPE
};
@@ -114,6 +141,19 @@ enum rte_member_sig_compare_function {
RTE_MEMBER_COMPARE_NUM
};
+/* sketch update function with different implementations. */
+typedef void (*sketch_update_fn_t)(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+/* sketch lookup function with different implementations. */
+typedef uint64_t (*sketch_lookup_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
+/* sketch delete function with different implementations. */
+typedef void (*sketch_delete_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
/** @internal setsummary structure. */
struct rte_member_setsum {
enum rte_member_setsum_type type; /* Type of the set summary. */
@@ -134,6 +174,21 @@ struct rte_member_setsum {
uint32_t bit_mask; /* Bit mask to get bit location in bf. */
uint32_t num_hashes; /* Number of hash values to index bf. */
+ /* Parameters for sketch */
+ float error_rate;
+ float sample_rate;
+ uint32_t num_col;
+ uint32_t num_row;
+ int always_bounded;
+ double converge_thresh;
+ uint32_t topk;
+ uint32_t count_byte;
+ uint64_t *hash_seeds;
+ sketch_update_fn_t sketch_update; /* Pointer to the sketch update function */
+ sketch_lookup_fn_t sketch_lookup; /* Pointer to the sketch lookup function */
+ sketch_delete_fn_t sketch_delete; /* Pointer to the sketch delete function */
+
+ void *runtime_var;
uint32_t mul_shift; /* vbf internal variable used during bit test. */
uint32_t div_shift; /* vbf internal variable used during bit test. */
@@ -143,6 +198,9 @@ struct rte_member_setsum {
/* Second cache line should start here. */
uint32_t socket_id; /* NUMA Socket ID for memory. */
char name[RTE_MEMBER_NAMESIZE]; /* Name of this set summary. */
+#ifdef RTE_ARCH_X86
+ bool use_avx512;
+#endif
} __rte_cache_aligned;
/**
@@ -261,8 +319,33 @@ struct rte_member_parameters {
*/
uint32_t sec_hash_seed;
+ /**
+ * For count(min) sketch data structure, error rate defines the accuracy
+ * required by the user. Higher accuracy leads to more memory usage, but
+ * the flow size is estimated more accurately.
+ */
+ float error_rate;
+
+ /**
+ * Sampling rate means the internal sample rate of the rows of the count
+ * min sketches. Lower sampling rate can reduce CPU overhead, but the
+ * data structure will require more time to converge statistically.
+ */
+ float sample_rate;
+
+ /**
+ * How many top heavy hitter to be reported. The library will internally
+ * keep the keys of heavy hitters for final report.
+ */
+ uint32_t top_k;
+
+ /**
+ * Extra flags that may passed in by user
+ */
+ uint32_t extra_flag;
+
int socket_id; /**< NUMA Socket ID for memory. */
-};
+} __rte_cache_aligned;
/**
* @warning
@@ -418,7 +501,7 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* RTE_MEMBER_NO_MATCH by default is set as 0.
* For HT mode, the set_id has range as [1, 0x7FFF], MSB is reserved.
* For vBF mode the set id is limited by the num_set parameter when create
- * the set-summary.
+ * the set-summary. For sketch mode, this id is ignored.
* @return
* HT (cache mode) and vBF should never fail unless the set_id is not in the
* valid range. In such case -EINVAL is returned.
@@ -429,12 +512,72 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* Return 0 for HT (cache mode) if the add does not cause
* eviction, return 1 otherwise. Return 0 for non-cache mode if success,
* -ENOSPC for full, and 1 if cuckoo eviction happens.
- * Always returns 0 for vBF mode.
+ * Always returns 0 for vBF mode and sketch.
*/
int
rte_member_add(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Add the packet byte size into the sketch.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param byte_count
+ * Add the byte count of the packet into the sketch.
+ * @return
+ * Return -EINVAL for invalid parameters, otherwise return 0.
+ */
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Query packet count for a certain flow-key.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param count
+ * The output packet count or byte count.
+ * @return
+ * Return -EINVAL for invalid parameters.
+ */
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *count);
+
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Report heavyhitter flow-keys into set-summary (SS).
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the output top-k key array.
+ * @param count
+ * Pointer of the output packet count or byte count array of the top-k keys.
+ * @return
+ * Return -EINVAL for invalid parameters. Return a positive integer indicate
+ * how many heavy hitters are reported.
+ */
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **keys, uint64_t *counts);
+
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice
diff --git a/lib/member/rte_member_heap.h b/lib/member/rte_member_heap.h
new file mode 100644
index 0000000000..1756a1eca5
--- /dev/null
+++ b/lib/member/rte_member_heap.h
@@ -0,0 +1,420 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#ifndef _RTE_MEMBER_HEAP_H_
+#define _RTE_MEMBER_HEAP_H_
+
+#include <rte_ring_elem.h>
+#include "rte_member.h"
+
+#define LCHILD(x) (2 * x + 1)
+#define RCHILD(x) (2 * x + 2)
+#define PARENT(x) ((x - 1) / 2)
+
+#define HASH_BKT_SIZE 16
+#define HASH_HP_MULTI 4
+#define HASH_RESIZE_MULTI 2
+
+struct hash_bkt {
+ uint16_t sig[HASH_BKT_SIZE];
+ uint16_t idx[HASH_BKT_SIZE];
+};
+
+struct hash {
+ uint16_t bkt_cnt;
+ uint16_t num_item;
+ uint32_t seed;
+ struct hash_bkt buckets[0];
+};
+
+struct node {
+ void *key;
+ uint64_t count;
+};
+
+struct minheap {
+ uint32_t key_len;
+ uint32_t size;
+ uint32_t socket;
+ struct hash *hashtable;
+ struct node *elem;
+};
+
+static int
+hash_table_insert(const void *key, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+
+ for (int i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].idx[i] == 0) {
+ table->buckets[idx].idx[i] = value;
+ table->buckets[idx].sig[i] = sig;
+ table->num_item++;
+ return 0;
+ }
+ }
+
+ return -ENOMEM;
+}
+
+static int
+hash_table_update(const void *key, int old_value, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+
+ for (int i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == old_value) {
+ table->buckets[idx].idx[i] = value;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_del(const void *key, uint16_t value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+
+ for (int i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == value) {
+ table->buckets[idx].idx[i] = 0;
+ table->num_item--;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_lookup(const void *key, int key_len, struct minheap *hp)
+{
+ struct hash *table = hp->hashtable;
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+
+ for (int i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] != 0) {
+ uint32_t hp_idx = table->buckets[idx].idx[i] - 1;
+
+ if (memcmp(hp->elem[hp_idx].key, key, hp->key_len) == 0)
+ return hp_idx;
+ }
+ }
+
+ return -ENOENT; /* key doesn't exist */
+}
+
+static int
+resize_hash_table(struct minheap *hp)
+{
+ uint32_t i;
+ uint32_t new_bkt_cnt;
+
+ while (1) {
+ new_bkt_cnt = hp->hashtable->bkt_cnt * HASH_RESIZE_MULTI;
+
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT load factor is [%f]\n",
+ hp->hashtable->num_item / ((float)hp->hashtable->bkt_cnt * HASH_BKT_SIZE));
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT resize happen!\n");
+ rte_free(hp->hashtable);
+ hp->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ new_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, hp->socket);
+
+ if (hp->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ return -ENOMEM;
+ }
+
+ hp->hashtable->bkt_cnt = new_bkt_cnt;
+
+ for (i = 0; i < hp->size; ++i) {
+ if (hash_table_insert(hp->elem[i].key,
+ i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR,
+ "Sketch Minheap HT resize insert fail!\n");
+ break;
+ }
+ }
+ if (i == hp->size)
+ break;
+ }
+
+ return 0;
+}
+
+/* find the item in the given minheap */
+static int
+rte_member_minheap_find(struct minheap *hp, const void *key)
+{
+ int idx = hash_table_lookup(key, hp->key_len, hp);
+ return idx;
+}
+
+static int
+rte_member_minheap_init(struct minheap *heap, int size,
+ uint32_t socket, uint32_t seed)
+{
+ heap->elem = rte_zmalloc_socket(NULL, sizeof(struct node) * size,
+ RTE_CACHE_LINE_SIZE, socket);
+ if (heap->elem == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap elem allocation failed\n");
+ return -ENOMEM;
+ }
+
+ uint32_t hash_bkt_cnt = rte_align32pow2(size * HASH_HP_MULTI) / HASH_BKT_SIZE;
+
+ if (hash_bkt_cnt == 0)
+ hash_bkt_cnt = 1;
+
+ heap->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ hash_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, socket);
+
+ if (heap->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ rte_free(heap->elem);
+ return -ENOMEM;
+ }
+
+ heap->hashtable->seed = seed;
+ heap->hashtable->bkt_cnt = hash_bkt_cnt;
+ heap->socket = socket;
+
+ return 0;
+}
+
+/* swap the minheap nodes */
+static __rte_always_inline void
+rte_member_heap_swap(struct node *n1, struct node *n2)
+{
+ struct node temp = *n1;
+ *n1 = *n2;
+ *n2 = temp;
+}
+
+/* heapify function */
+static void
+rte_member_heapify(struct minheap *hp, uint32_t idx, bool update_hash)
+{
+ uint32_t smallest;
+
+ if (LCHILD(idx) < hp->size &&
+ hp->elem[LCHILD(idx)].count < hp->elem[idx].count)
+ smallest = LCHILD(idx);
+ else
+ smallest = idx;
+
+ if (RCHILD(idx) < hp->size &&
+ hp->elem[RCHILD(idx)].count < hp->elem[smallest].count)
+ smallest = RCHILD(idx);
+
+ if (smallest != idx) {
+ rte_member_heap_swap(&(hp->elem[idx]), &(hp->elem[smallest]));
+
+ if (update_hash) {
+ if (hash_table_update(hp->elem[smallest].key, idx + 1, smallest + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+
+ if (hash_table_update(hp->elem[idx].key, smallest + 1, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+ }
+ rte_member_heapify(hp, smallest, update_hash);
+ }
+}
+
+/* insert a node into the minheap */
+static int
+rte_member_minheap_insert_node(struct minheap *hp, const void *key,
+ int counter, void *key_slot,
+ struct rte_ring *free_key_slot)
+{
+ struct node nd;
+ uint32_t slot_id;
+
+ if (rte_ring_sc_dequeue_elem(free_key_slot, &slot_id, sizeof(uint32_t)) != 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap get empty keyslot failed\n");
+ return -1;
+ }
+
+ nd.count = counter;
+ nd.key = RTE_PTR_ADD(key_slot, slot_id * hp->key_len);
+
+ memcpy(nd.key, key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* delete a key from the minheap */
+static int
+rte_member_minheap_delete_node(struct minheap *hp, const void *key,
+ void *key_slot, struct rte_ring *free_key_slot)
+{
+ int idx = rte_member_minheap_find(hp, key);
+ uint32_t offset = RTE_PTR_DIFF(hp->elem[idx].key, key_slot) / hp->key_len;
+
+ if (hash_table_del(key, idx + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ rte_ring_sp_enqueue_elem(free_key_slot, &offset, sizeof(uint32_t));
+
+ if (idx == (int)(hp->size - 1)) {
+ hp->size--;
+ return 0;
+ }
+
+ hp->elem[idx] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[idx].key, hp->size, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+ rte_member_heapify(hp, idx, true);
+
+ return 0;
+}
+
+/* replace a min node with a new key. */
+static int
+rte_member_minheap_replace_node(struct minheap *hp,
+ const void *new_key,
+ int new_counter)
+{
+ struct node nd;
+ void *recycle_key = NULL;
+
+ recycle_key = hp->elem[0].key;
+
+ if (hash_table_del(recycle_key, 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ hp->elem[0] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[0].key, hp->size, 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+
+ rte_member_heapify(hp, 0, true);
+
+ nd.count = new_counter;
+ nd.key = recycle_key;
+
+ memcpy(nd.key, new_key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(new_key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace insert failed\n");
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* sort the heap into a decending array */
+static void
+rte_member_heapsort(struct minheap *hp, struct node *result_array)
+{
+ struct minheap new_hp;
+
+ /* build a new heap for using the given array */
+ new_hp.size = hp->size;
+ new_hp.key_len = hp->key_len;
+ new_hp.elem = result_array;
+ memcpy(result_array, hp->elem, hp->size * sizeof(struct node));
+
+ /* sort the new heap */
+ while (new_hp.size > 1) {
+ rte_member_heap_swap(&(new_hp.elem[0]), &(new_hp.elem[new_hp.size - 1]));
+ new_hp.size--;
+ rte_member_heapify(&new_hp, 0, false);
+ }
+}
+
+static void
+rte_member_minheap_free(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ rte_free(hp->elem);
+ rte_free(hp->hashtable);
+}
+
+static void
+rte_member_minheap_reset(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ memset(hp->elem, 0, sizeof(struct node) * hp->size);
+ hp->size = 0;
+
+ memset((char *)hp->hashtable + sizeof(struct hash), 0,
+ hp->hashtable->bkt_cnt * sizeof(struct hash_bkt));
+ hp->hashtable->num_item = 0;
+}
+
+#endif /* _RTE_MEMBER_HEAP_H_ */
diff --git a/lib/member/rte_member_sketch.c b/lib/member/rte_member_sketch.c
new file mode 100644
index 0000000000..b0a258165f
--- /dev/null
+++ b/lib/member/rte_member_sketch.c
@@ -0,0 +1,583 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#include <math.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_errno.h>
+#include <rte_log.h>
+#include <rte_random.h>
+#include <rte_prefetch.h>
+#include <rte_ring_elem.h>
+
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+#include "rte_member_heap.h"
+
+#ifdef CC_AVX512_SUPPORT
+#include "rte_member_sketch_avx512.h"
+#endif /* CC_AVX512_SUPPORT */
+
+struct sketch_runtime {
+ uint64_t pkt_cnt;
+ uint32_t until_next;
+ int converged;
+ struct minheap heap;
+ struct node *report_array;
+ void *key_slots;
+ struct rte_ring *free_key_slots;
+} __rte_cache_aligned;
+
+/*
+ * Geometric sampling to calculate how many packets needs to be
+ * skipped until next update. This method can mitigate the CPU
+ * overheads compared with coin-toss sampling.
+ */
+static uint32_t
+draw_geometric(const struct rte_member_setsum *ss)
+{
+ double rand = 1;
+
+ if (ss->sample_rate == 1)
+ return 1;
+
+ while (rand == 1 || rand == 0)
+ rand = (double) rte_rand() / (UINT64_MAX);
+
+ return (uint32_t)ceil(log(1 - rand) / log(1 - ss->sample_rate));
+}
+
+static void
+isort(uint64_t *array, int n)
+{
+ for (int i = 1; i < n; i++) {
+ uint64_t t = array[i];
+ int j;
+
+ for (j = i - 1; j >= 0; j--) {
+ if (t < array[j])
+ array[j + 1] = array[j];
+ else
+ break;
+ }
+ array[j + 1] = t;
+ }
+}
+
+static __rte_always_inline void
+swap(uint64_t *a, uint64_t *b)
+{
+ uint64_t tmp = *a;
+ *a = *b;
+ *b = tmp;
+}
+
+static uint64_t
+medianof5(uint64_t a, uint64_t b, uint64_t c, uint64_t d, uint64_t e)
+{
+ if (a > b)
+ swap(&a, &b);
+ if (c > d)
+ swap(&c, &d);
+ if (a > c) {
+ if (d > e)
+ swap(&c, &e);
+ else {
+ swap(&c, &d);
+ swap(&d, &e);
+ }
+ } else {
+ if (b > e)
+ swap(&a, &e);
+ else {
+ swap(&a, &b);
+ swap(&b, &e);
+ }
+ }
+
+ if (a > c)
+ return a > d ? d : a;
+ else
+ return b > c ? c : b;
+}
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *ring)
+{
+ struct sketch_runtime *runtime;
+ uint32_t num_col;
+ uint32_t i;
+
+ if (params->sample_rate == 0 || params->sample_rate > 1) {
+ rte_errno = EINVAL;
+ RTE_MEMBER_LOG(ERR,
+ "Membership Sketch created with invalid parameters\n");
+ return -EINVAL;
+ }
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_COUNT_BYTE)
+ ss->count_byte = 1;
+
+ if (ss->count_byte == 1 &&
+ rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512IFMA) == 1) {
+#ifdef CC_AVX512_SUPPORT
+ ss->use_avx512 = true;
+#else
+ ss->use_avx512 = false;
+#endif
+ }
+
+ if (ss->use_avx512 == true) {
+ ss->num_row = NUM_ROW_VEC;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch AVX512 update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_avx512;
+ ss->sketch_lookup = sketch_lookup_avx512;
+ ss->sketch_delete = sketch_delete_avx512;
+ } else {
+ ss->num_row = NUM_ROW_SCALAR;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch SCALAR update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_scalar;
+ ss->sketch_lookup = sketch_lookup_scalar;
+ ss->sketch_delete = sketch_delete_scalar;
+ }
+
+ ss->socket_id = params->socket_id;
+
+ if (ss->count_byte == 0)
+ num_col = 4.0 / params->error_rate / params->sample_rate;
+ else if (ss->use_avx512 == true)
+ num_col = rte_align32pow2(4.0 / params->error_rate);
+ else
+ num_col = 4.0 / params->error_rate;
+
+ ss->table = rte_zmalloc_socket(NULL,
+ sizeof(uint64_t) * num_col * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->table == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Table memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->hash_seeds = rte_zmalloc_socket(NULL, sizeof(uint64_t) * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->hash_seeds == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Hashseeds memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->runtime_var = rte_zmalloc_socket(NULL, sizeof(struct sketch_runtime),
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->runtime_var == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime memory allocation failed\n");
+ rte_free(ss);
+ return -ENOMEM;
+ }
+ runtime = ss->runtime_var;
+
+ ss->num_col = num_col;
+ ss->sample_rate = params->sample_rate;
+ ss->prim_hash_seed = params->prim_hash_seed;
+ ss->sec_hash_seed = params->sec_hash_seed;
+ ss->error_rate = params->error_rate;
+ ss->topk = params->top_k;
+ ss->key_len = params->key_len;
+ runtime->heap.key_len = ss->key_len;
+
+ runtime->key_slots = rte_zmalloc_socket(NULL, ss->key_len * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->key_slots == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Key Slots allocation failed\n");
+ goto error;
+ }
+
+ runtime->free_key_slots = ring;
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime->free_key_slots,
+ &i, sizeof(uint32_t));
+
+ if (rte_member_minheap_init(&(runtime->heap), params->top_k,
+ ss->socket_id, params->prim_hash_seed) < 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap allocation failed\n");
+ goto error_runtime;
+ }
+
+ runtime->report_array = rte_zmalloc_socket(NULL, sizeof(struct node) * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->report_array == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime Report Array allocation failed\n");
+ goto error_runtime;
+ }
+
+ rte_srand(ss->prim_hash_seed);
+ for (uint32_t i = 0; i < ss->num_row; i++)
+ ss->hash_seeds[i] = rte_rand();
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_ALWAYS_BOUNDED)
+ ss->always_bounded = 1;
+
+ if (ss->always_bounded) {
+ double delta = 1.0 / (pow(2, ss->num_row));
+
+ ss->converge_thresh = 10 * pow(ss->error_rate, -2.0) * sqrt(log(1 / delta));
+ }
+
+ RTE_MEMBER_LOG(DEBUG, "Sketch created, "
+ "the total memory required is %u Bytes\n", ss->num_col * ss->num_row * 8);
+
+ return 0;
+
+error_runtime:
+ rte_member_minheap_free(&runtime->heap);
+ rte_ring_free(runtime->free_key_slots);
+ rte_free(runtime->key_slots);
+error:
+ rte_free(runtime);
+ rte_free(ss);
+
+ return -ENOMEM;
+}
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col[ss->num_row];
+ uint64_t count_row[ss->num_row];
+ uint32_t cur_row;
+ uint64_t count;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ rte_prefetch0(&count_array[cur_row * ss->num_col + col[cur_row]]);
+ }
+
+ /* if sample rate is 1, it is a regular count-min, we report the min */
+ if (ss->sample_rate == 1 || ss->count_byte == 1)
+ return count_min(ss, col);
+
+ /* otherwise we report the median number */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_row[cur_row] = count_array[cur_row * ss->num_col + col[cur_row]];
+
+ if (ss->num_row == 5)
+ return medianof5(count_row[0], count_row[1],
+ count_row[2], count_row[3], count_row[4]);
+
+ isort(count_row, ss->num_row);
+
+ if (ss->num_row % 2 == 0) {
+ count = (count_row[ss->num_row / 2] + count_row[ss->num_row / 2 - 1]) / 2;
+ return count;
+ }
+ /* ss->num_row % 2 != 0 */
+ count = count_row[ss->num_row / 2];
+
+ return count;
+}
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* set corresponding counter to 0 */
+ count_array[cur_row * ss->num_col + col[cur_row]] = 0;
+ }
+}
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ uint64_t *output)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ *output = count;
+
+ return 0;
+}
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ uint64_t count = ss->sketch_lookup(ss, runtime_var->heap.elem[i].key);
+
+ runtime_var->heap.elem[i].count = count;
+ }
+}
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key,
+ uint64_t *count)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = setsum->runtime_var;
+
+ rte_member_update_heap(setsum);
+ rte_member_heapsort(&(runtime_var->heap), runtime_var->report_array);
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ key[i] = runtime_var->report_array[i].key;
+ count[i] = runtime_var->report_array[i].count;
+ }
+
+ return runtime_var->heap.size;
+}
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *ss,
+ const void *key, member_set_t *set_id)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ if (runtime_var->heap.size > 0 && count >= runtime_var->heap.elem[0].count)
+ *set_id = 1;
+ else
+ *set_id = 0;
+
+ if (count == 0)
+ return 0;
+ else
+ return 1;
+}
+
+static void
+should_converge(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ /* For count min sketch - L1 norm */
+ if (runtime_var->pkt_cnt > ss->converge_thresh) {
+ runtime_var->converged = 1;
+ RTE_MEMBER_LOG(DEBUG, "Sketch converged, begin sampling "
+ "from key count %lu\n",
+ runtime_var->pkt_cnt);
+ }
+}
+
+static void
+sketch_update_row(const struct rte_member_setsum *ss, const void *key,
+ uint32_t count, uint32_t cur_row)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* sketch counter update */
+ count_array[cur_row * ss->num_col + col] +=
+ ceil(count / (ss->sample_rate));
+}
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+ count_array[cur_row * ss->num_col + col] += count;
+ }
+}
+
+static void
+heap_update(const struct rte_member_setsum *ss, const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t key_cnt = 0;
+ int found;
+
+ /* We also update the heap for this key */
+ key_cnt = ss->sketch_lookup(ss, key);
+ if (key_cnt > runtime_var->heap.elem[0].count) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ /* the key is found in the top-k heap */
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else if (runtime_var->heap.size < ss->topk) {
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ } else {
+ rte_member_minheap_replace_node(&runtime_var->heap, key, key_cnt);
+ }
+ } else if (runtime_var->heap.size < ss->topk) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ }
+}
+
+/*
+ * Add a single packet into the sketch.
+ * Sketch value is meatured by packet numbers in this mode.
+ */
+int
+rte_member_add_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ __rte_unused member_set_t set_id)
+{
+ uint32_t cur_row;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /*
+ * If sketch is mesured by byte count,
+ * the rte_member_add_sketch_byte_count routine should be used.
+ */
+ if (ss->count_byte == 1) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Byte Mode, "
+ "should use rte_member_add_byte_count()!\n");
+ return -EINVAL;
+ }
+
+ if (ss->sample_rate == 1) {
+ ss->sketch_update(ss, key, 1);
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* convergence stage if it's needed */
+ if (ss->always_bounded && !runtime_var->converged) {
+ ss->sketch_update(ss, key, 1);
+
+ if (!((++runtime_var->pkt_cnt) & (INTERVAL - 1)))
+ should_converge(ss);
+
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* should we skip this packet */
+ if (*until_next >= ss->num_row) {
+ *until_next -= ss->num_row;
+ return 0;
+ }
+ cur_row = *until_next;
+ do {
+ sketch_update_row(ss, key, 1, cur_row);
+ *until_next = draw_geometric(ss);
+ if (cur_row + *until_next >= ss->num_row)
+ break;
+ cur_row += *until_next;
+ } while (1);
+
+ *until_next -= (ss->num_row - cur_row);
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+/*
+ * Add the byte count of the packet into the sketch.
+ * Sketch value is meatured by byte count numbers in this mode.
+ */
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t byte_count)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /* should not call this API if not in count byte mode */
+ if (ss->count_byte == 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Pkt Mode, "
+ "should use rte_member_add()!\n");
+ return -EINVAL;
+ }
+
+ /* there's specific optimization for the sketch update */
+ ss->sketch_update(ss, key, byte_count);
+
+ if (*until_next != 0) {
+ *until_next = *until_next - 1;
+ return 0;
+ }
+
+ *until_next = draw_geometric(ss) - 1;
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *ss,
+ const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ int found;
+
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found < 0)
+ return -1;
+
+ ss->sketch_delete(ss, key);
+
+ return rte_member_minheap_delete_node
+ (&runtime_var->heap, key, runtime_var->key_slots, runtime_var->free_key_slots);
+}
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ rte_free(ss->table);
+ rte_member_minheap_free(&runtime_var->heap);
+ rte_free(runtime_var->key_slots);
+ rte_ring_free(runtime_var->free_key_slots);
+ rte_free(runtime_var);
+}
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t *sketch = ss->table;
+ uint32_t i;
+
+ memset(sketch, 0, sizeof(uint64_t) * ss->num_col * ss->num_row);
+ rte_member_minheap_reset(&runtime_var->heap);
+ rte_ring_reset(runtime_var->free_key_slots);
+
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime_var->free_key_slots, &i, sizeof(uint32_t));
+}
diff --git a/lib/member/rte_member_sketch.h b/lib/member/rte_member_sketch.h
new file mode 100644
index 0000000000..a5e633a74e
--- /dev/null
+++ b/lib/member/rte_member_sketch.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_H_
+#define _RTE_MEMBER_SKETCH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_ring_elem.h>
+
+#define NUM_ROW_SCALAR 5
+#define INTERVAL (1 << 15)
+
+#if !RTE_IS_POWER_OF_2(INTERVAL)
+#error sketch INTERVAL macro must be a power of 2
+#endif
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *r);
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *setsum,
+ const void *key, member_set_t *set_id);
+
+int
+rte_member_add_sketch(const struct rte_member_setsum *setsum,
+ const void *key,
+ member_set_t set_id);
+
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key, uint32_t byte_count);
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *setsum,
+ const void *key);
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output);
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss);
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *setsum);
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count);
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss);
+
+static __rte_always_inline uint64_t
+count_min(const struct rte_member_setsum *ss, const uint32_t *hash_results)
+{
+ uint64_t *count_array = ss->table;
+ uint64_t count;
+ uint32_t cur_row;
+ uint64_t min = UINT64_MAX;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ uint64_t cnt = count_array[cur_row * ss->num_col + hash_results[cur_row]];
+
+ if (cnt < min)
+ min = cnt;
+ }
+ count = min;
+
+ return count;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_H_ */
diff --git a/lib/member/rte_member_sketch_avx512.c b/lib/member/rte_member_sketch_avx512.c
new file mode 100644
index 0000000000..c83f4b6fd1
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.c
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include "rte_member_sketch_avx512.h"
+
+__rte_always_inline void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t num_col = ss->num_col;
+ uint32_t key_len = ss->key_len;
+ __m256i v_row_base;
+ __m256i v_hash_result;
+ __m512i current_sketch;
+ __m512i updated_sketch;
+ __m512i v_count;
+
+ const __m256i v_idx = _mm256_set_epi32(7, 6, 5, 4, 3, 2, 1, 0);
+ const __m256i v_col = _mm256_set1_epi32(num_col);
+
+ /* compute the hash result parallelly */
+ v_hash_result = rte_xxh64_sketch_avx512
+ (key, key_len, *(__m512i *)ss->hash_seeds, num_col);
+ v_row_base = _mm256_mullo_epi32(v_idx, v_col);
+ v_hash_result = _mm256_add_epi32(v_row_base, v_hash_result);
+
+ current_sketch =
+ _mm512_i32gather_epi64(v_hash_result, count_array, 8);
+ v_count = _mm512_set1_epi64(count);
+ updated_sketch = _mm512_add_epi64(current_sketch, v_count);
+ _mm512_i32scatter_epi64
+ ((void *)count_array, v_hash_result, updated_sketch, 8);
+}
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+
+ /* currently only for sketch byte count mode */
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds, ss->num_col);
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ return count_min(ss, col);
+}
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint64_t min = UINT64_MAX;
+ uint32_t cur_row;
+
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds,
+ RTE_ALIGN_FLOOR(ss->num_col, 32));
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ min = count_min(ss, col);
+
+ /* subtract the min value from all the counters */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_array[cur_row * ss->num_col + col[cur_row]] -= min;
+}
diff --git a/lib/member/rte_member_sketch_avx512.h b/lib/member/rte_member_sketch_avx512.h
new file mode 100644
index 0000000000..e7c25da643
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_AVX512_H_
+#define _RTE_MEMBER_SKETCH_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_vect.h>
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+#include "rte_xxh64_avx512.h"
+
+#define NUM_ROW_VEC 8
+
+void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_AVX512_H_ */
diff --git a/lib/member/rte_xxh64_avx512.h b/lib/member/rte_xxh64_avx512.h
new file mode 100644
index 0000000000..406c9ac256
--- /dev/null
+++ b/lib/member/rte_xxh64_avx512.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_XXH64_AVX512_H_
+#define _RTE_XXH64_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <immintrin.h>
+
+/* 0b1001111000110111011110011011000110000101111010111100101010000111 */
+static const uint64_t PRIME64_1 = 0x9E3779B185EBCA87ULL;
+/* 0b1100001010110010101011100011110100100111110101001110101101001111 */
+static const uint64_t PRIME64_2 = 0xC2B2AE3D27D4EB4FULL;
+/* 0b0001011001010110011001111011000110011110001101110111100111111001 */
+static const uint64_t PRIME64_3 = 0x165667B19E3779F9ULL;
+/* 0b1000010111101011110010100111011111000010101100101010111001100011 */
+static const uint64_t PRIME64_4 = 0x85EBCA77C2B2AE63ULL;
+/* 0b0010011111010100111010110010111100010110010101100110011111000101 */
+static const uint64_t PRIME64_5 = 0x27D4EB2F165667C5ULL;
+
+static __rte_always_inline __m512i
+xxh64_round_avx512(__m512i hash, __m512i input)
+{
+ hash = _mm512_madd52lo_epu64(hash,
+ input,
+ _mm512_set1_epi64(PRIME64_2));
+
+ hash = _mm512_rol_epi64(hash, 31);
+
+ return hash;
+}
+
+static __rte_always_inline __m512i
+xxh64_fmix_avx512(__m512i hash)
+{
+ hash = _mm512_xor_si512(hash, _mm512_srli_epi64(hash, 33));
+
+ return hash;
+}
+
+static __rte_always_inline __m256i
+rte_xxh64_sketch_avx512(const void *key, uint32_t key_len,
+ __m512i v_seed, uint32_t modulo)
+{
+ __m512i v_prime64_5, v_hash;
+ size_t remaining = key_len;
+ size_t offset = 0;
+ __m512i input;
+
+ v_prime64_5 = _mm512_set1_epi64(PRIME64_5);
+ v_hash = _mm512_add_epi64
+ (_mm512_add_epi64(v_seed, v_prime64_5),
+ _mm512_set1_epi64(key_len));
+
+ while (remaining >= 8) {
+ input = _mm512_set1_epi64(*(uint64_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ xxh64_round_avx512(_mm512_setzero_si512(), input));
+ v_hash = _mm512_madd52lo_epu64(_mm512_set1_epi64(PRIME64_4),
+ v_hash,
+ _mm512_set1_epi64(PRIME64_1));
+
+ remaining -= 8;
+ offset += 8;
+ }
+
+ if (remaining >= 4) {
+ input = _mm512_set1_epi64
+ (*(uint32_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_1)));
+ v_hash = _mm512_madd52lo_epu64
+ (_mm512_set1_epi64(PRIME64_3),
+ _mm512_rol_epi64(v_hash, 23),
+ _mm512_set1_epi64(PRIME64_2));
+
+ offset += 4;
+ remaining -= 4;
+ }
+
+ while (remaining != 0) {
+ input = _mm512_set1_epi64
+ (*(uint8_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_5)));
+ v_hash = _mm512_mullo_epi64
+ (_mm512_rol_epi64(v_hash, 11),
+ _mm512_set1_epi64(PRIME64_1));
+ offset++;
+ remaining--;
+ }
+
+ v_hash = xxh64_fmix_avx512(v_hash);
+
+ /*
+ * theoritically, such modular operations can be replaced by
+ * _mm512_rem_epi64(), but seems it depends on the complier's
+ * implementation. so here is the limitation that the modulo
+ * value should be power of 2.
+ */
+ __m512i v_hash_remainder = _mm512_set1_epi64((modulo - 1));
+
+ return _mm512_cvtepi64_epi32(_mm512_and_si512(v_hash, v_hash_remainder));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_XXH64_AVX512_H_ */
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 2/2] test/member: add functional and perf tests for sketch
2022-08-10 7:45 [PATCH 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-08-10 7:45 ` [PATCH 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-08-10 7:45 ` Leyi Rong
2022-08-11 2:33 ` Suanming Mou
2022-08-31 6:46 ` [PATCH v2 0/2] introduce NitroSketch Mode into membership library Leyi Rong
` (3 subsequent siblings)
5 siblings, 1 reply; 34+ messages in thread
From: Leyi Rong @ 2022-08-10 7:45 UTC (permalink / raw)
To: yipeng1.wang, zaoxingliu, sameh.gobriel; +Cc: dev, Leyi Rong
This patch adds functional and performance tests for sketch mode of
membership library.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
app/test/test_member.c | 258 ++++++++++++++++++++++++++++++++++++
app/test/test_member_perf.c | 153 ++++++++++++++++++++-
2 files changed, 407 insertions(+), 4 deletions(-)
diff --git a/app/test/test_member.c b/app/test/test_member.c
index 26a712439f..abe59bb9f8 100644
--- a/app/test/test_member.c
+++ b/app/test/test_member.c
@@ -4,6 +4,7 @@
/* This test is for membership library's simple feature test */
+#include <math.h>
#include "test.h"
#include <rte_memcpy.h>
@@ -28,6 +29,7 @@ test_member(void)
struct rte_member_setsum *setsum_ht;
struct rte_member_setsum *setsum_cache;
struct rte_member_setsum *setsum_vbf;
+struct rte_member_setsum *setsum_sketch;
/* 5-tuple key type */
struct flow_key {
@@ -108,6 +110,21 @@ static struct rte_member_parameters params = {
.socket_id = 0 /* NUMA Socket ID for memory. */
};
+/* for sketch definitions */
+#define TOP_K 10
+#define HH_PKT_SIZE 16
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define PRINT_OUT_COUNT 20
+
+#define SKETCH_LARGEST_KEY_SIZE 1000000
+#define SKETCH_TOTAL_KEY 500
+#define NUM_OF_KEY(key) {\
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (key + 1)) \
+}
+
+void *heavy_hitters[TOP_K];
+
/*
* Sequence of operations for find existing setsummary
*
@@ -684,6 +701,243 @@ perform_free(void)
rte_member_free(setsum_vbf);
}
+static void
+print_out_sketch_results(uint64_t *count_result, member_set_t *heavy_set,
+ uint32_t print_num, bool count_byte)
+{
+ uint32_t i;
+
+ for (i = 0; i < print_num; i++) {
+ if (count_byte)
+ printf("key %2u, count %8lu, real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)) *
+ HH_PKT_SIZE,
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i) * HH_PKT_SIZE) /
+ ((float)NUM_OF_KEY(i) * HH_PKT_SIZE));
+ else
+ printf("key %2u, count %8lu, real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)),
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i)) /
+ (float)NUM_OF_KEY(i));
+ }
+}
+
+static int
+sketch_test(uint32_t *keys, uint32_t total_pkt, int count_byte, int reset_test)
+{
+ uint32_t i;
+ uint64_t result_count[SKETCH_TOTAL_KEY];
+ member_set_t heavy_set[SKETCH_TOTAL_KEY];
+ uint64_t count[TOP_K];
+ int ret;
+ int hh_cnt;
+
+ setsum_sketch = rte_member_create(¶ms);
+ if (setsum_sketch == NULL) {
+ printf("Creation of setsums fail\n");
+ return -1;
+ }
+
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count(setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+
+ return -1;
+ }
+
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %lu\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ if (reset_test) {
+ printf("\nEntering Sketch Reset Test Process!\n");
+ rte_member_reset(setsum_sketch);
+
+ /* after reset, check some key's count */
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ printf("\nReinsert keys after Sketch Reset!\n");
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count
+ (setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %lu\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ printf("\nDelete some keys!\n");
+ uint32_t tmp_key = 0;
+
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+ tmp_key = 1;
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %lu\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+ }
+
+ rte_member_free(setsum_sketch);
+ return 0;
+}
+
+static int
+test_member_sketch(void)
+{
+ unsigned int i, j, index;
+ uint32_t total_pkt = 0;
+ uint32_t *keys;
+ int count_byte = 0;
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++)
+ total_pkt += ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1));
+
+ printf("\nTotal key count [%u] in Sketch Autotest\n", total_pkt);
+
+ keys = rte_zmalloc(NULL, sizeof(uint32_t) * total_pkt, 0);
+
+ if (keys == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ return -1;
+ }
+
+ index = 0;
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ for (j = 0; j < ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)); j++)
+ keys[index++] = i;
+ }
+
+ /* shuffle the keys */
+ for (i = index - 1; i > 0; i--) {
+ uint32_t swap_idx = rte_rand() % i;
+ uint32_t tmp_key = keys[i];
+
+ keys[i] = keys[swap_idx];
+ keys[swap_idx] = tmp_key;
+ }
+
+ params.key_len = 4;
+ params.name = "test_member_sketch";
+ params.type = RTE_MEMBER_TYPE_SKETCH;
+ params.error_rate = SKETCH_ERROR_RATE;
+ params.sample_rate = SKETCH_SAMPLE_RATE;
+ params.extra_flag = 0;
+ params.top_k = TOP_K;
+ params.prim_hash_seed = rte_rdtsc();
+ int reset_test = 0;
+
+ printf("Default sketching params: Error Rate: [%f]\tSample Rate: [%f]\tTopK: [%d]\n",
+ SKETCH_ERROR_RATE, SKETCH_SAMPLE_RATE, TOP_K);
+
+ printf("\n[Sketch with Fixed Sampling Rate Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
+ return -1;
+
+ params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ printf("\n[Sketch with Always Bounded Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
+ return -1;
+
+ count_byte = 1;
+ params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ printf("\n[Sketch with Packet Size Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
+ return -1;
+
+ count_byte = 0;
+ params.extra_flag = 0;
+ reset_test = 1;
+ printf("\nreset sketch test\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
+ return -1;
+
+ return 0;
+}
+
static int
test_member(void)
{
@@ -719,6 +973,10 @@ test_member(void)
return -1;
}
+ if (test_member_sketch() < 0) {
+ perform_free();
+ return -1;
+ }
perform_free();
return 0;
}
diff --git a/app/test/test_member_perf.c b/app/test/test_member_perf.c
index 978db0020a..b7eb3e4c66 100644
--- a/app/test/test_member_perf.c
+++ b/app/test/test_member_perf.c
@@ -13,6 +13,7 @@
#include <rte_random.h>
#include <rte_memcpy.h>
#include <rte_thash.h>
+#include <math.h>
#ifdef RTE_EXEC_ENV_WINDOWS
static int
@@ -26,7 +27,7 @@ test_member_perf(void)
#include <rte_member.h>
-#define NUM_KEYSIZES 10
+#define NUM_KEYSIZES RTE_DIM(hashtest_key_lens)
#define NUM_SHUFFLES 10
#define MAX_KEYSIZE 64
#define MAX_ENTRIES (1 << 19)
@@ -36,12 +37,23 @@ test_member_perf(void)
#define BURST_SIZE 64
#define VBF_FALSE_RATE 0.03
+/* for the heavy hitter detection */
+#define SKETCH_LARGEST_KEY_SIZE (1<<15)
+#define SKETCH_PKT_SIZE 16
+#define TOP_K 100
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define NUM_ADDS (KEYS_TO_ADD * 20)
+
static unsigned int test_socket_id;
enum sstype {
HT = 0,
CACHE,
VBF,
+ SKETCH,
+ SKETCH_BOUNDED,
+ SKETCH_BYTE,
NUM_TYPE
};
@@ -88,6 +100,7 @@ static member_set_t data[NUM_TYPE][/* Array to store the data */KEYS_TO_ADD];
/* Array to store all input keys */
static uint8_t keys[KEYS_TO_ADD][MAX_KEYSIZE];
+static uint8_t hh_keys[KEYS_TO_ADD][MAX_KEYSIZE];
/* Shuffle the keys that have been added, so lookups will be totally random */
static void
@@ -136,6 +149,10 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
{
unsigned int i, j;
int num_duplicates;
+ int distinct_key = 0;
+ int count_down = SKETCH_LARGEST_KEY_SIZE;
+ uint32_t swap_idx;
+ uint8_t temp_key[MAX_KEYSIZE];
params->key_size = hashtest_key_lens[cycle];
params->cycle = cycle;
@@ -176,6 +193,22 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
/* Shuffle the random values again */
shuffle_input_keys(params);
+ for (i = 0; i < KEYS_TO_ADD; i++) {
+ if (count_down == 0) {
+ distinct_key++;
+ count_down = ceil(SKETCH_LARGEST_KEY_SIZE / (distinct_key + 1));
+ }
+ memcpy(hh_keys[i], keys[distinct_key], params->key_size);
+ count_down--;
+ }
+
+ for (i = KEYS_TO_ADD - 1; i > 0; i--) {
+ swap_idx = rte_rand() % i;
+ memcpy(temp_key, hh_keys[i], params->key_size);
+ memcpy(hh_keys[i], hh_keys[swap_idx], params->key_size);
+ memcpy(hh_keys[swap_idx], temp_key, params->key_size);
+ }
+
/* For testing miss lookup, we insert half and lookup the other half */
unsigned int entry_cnt, bf_key_cnt;
if (!miss) {
@@ -208,6 +241,44 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
params->setsum[VBF] = rte_member_create(&member_params);
if (params->setsum[VBF] == NULL)
fprintf(stderr, "VBF create fail\n");
+
+ member_params.name = "test_member_sketch";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag = 0;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_bounded";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BOUNDED] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BOUNDED] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_byte";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BYTE] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BYTE] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+
for (i = 0; i < NUM_TYPE; i++) {
if (params->setsum[i] == NULL)
return -1;
@@ -243,6 +314,39 @@ timed_adds(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_adds_sketch(struct member_perf_params *params, int type)
+{
+ const uint64_t start_tsc = rte_rdtsc();
+ unsigned int i, j, a;
+ int32_t ret;
+
+ for (i = 0; i < NUM_ADDS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ if (type == SKETCH_BYTE)
+ ret = rte_member_add_byte_count(params->setsum[type],
+ &hh_keys[j], SKETCH_PKT_SIZE);
+ else
+ ret = rte_member_add(params->setsum[type], &hh_keys[j], 1);
+ if (ret < 0) {
+ printf("Error %d in rte_member_add - key=0x", ret);
+ for (a = 0; a < params->key_size; a++)
+ printf("%02x", hh_keys[j][a]);
+ printf("type: %d\n", type);
+
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][ADD] = time_taken / NUM_ADDS;
+
+ return 0;
+}
+
static int
timed_lookups(struct member_perf_params *params, int type)
{
@@ -279,6 +383,36 @@ timed_lookups(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_lookups_sketch(struct member_perf_params *params, int type)
+{
+ unsigned int i, j;
+
+ false_data[type][params->cycle] = 0;
+
+ const uint64_t start_tsc = rte_rdtsc();
+ member_set_t result;
+ int ret;
+
+ for (i = 0; i < NUM_LOOKUPS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ ret = rte_member_lookup(params->setsum[type], &hh_keys[j],
+ &result);
+ if (ret < 0) {
+ printf("lookup wrong internally");
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][LOOKUP] = time_taken / NUM_LOOKUPS;
+
+ return 0;
+}
+
static int
timed_lookups_bulk(struct member_perf_params *params, int type)
{
@@ -531,7 +665,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_adds(¶ms, j) < 0)
return exit_with_fail("timed_adds", ¶ms,
@@ -562,6 +696,17 @@ run_all_tbl_perf_tests(void)
/* Print a dot to show progress on operations */
}
+
+ for (j = SKETCH; j < NUM_TYPE; j++) {
+ if (timed_adds_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_adds_sketch", ¶ms, i, j);
+
+ if (timed_lookups_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_lookups_sketch", ¶ms, i, j);
+ }
+
printf(".");
fflush(stdout);
@@ -574,7 +719,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_miss_lookup(¶ms, j) < 0)
return exit_with_fail("timed_miss_lookup",
¶ms, i, j);
@@ -605,7 +750,7 @@ run_all_tbl_perf_tests(void)
"fr_multi_bulk", "false_positive_rate");
/* Key size not influence False rate so just print out one key size */
for (i = 0; i < 1; i++) {
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
printf("%-18d", hashtest_key_lens[i]);
printf("%-18d", j);
printf("%-18f", (float)false_data[j][i] / NUM_LOOKUPS);
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 2/2] test/member: add functional and perf tests for sketch
2022-08-10 7:45 ` [PATCH 2/2] test/member: add functional and perf tests for sketch Leyi Rong
@ 2022-08-11 2:33 ` Suanming Mou
2022-08-26 2:26 ` Rong, Leyi
0 siblings, 1 reply; 34+ messages in thread
From: Suanming Mou @ 2022-08-11 2:33 UTC (permalink / raw)
To: Leyi Rong, yipeng1.wang, zaoxingliu, sameh.gobriel; +Cc: dev
Hi,
> -----Original Message-----
> From: Leyi Rong <leyi.rong@intel.com>
> Sent: Wednesday, August 10, 2022 3:45 PM
> To: yipeng1.wang@intel.com; zaoxingliu@gmail.com; sameh.gobriel@intel.com
> Cc: dev@dpdk.org; Leyi Rong <leyi.rong@intel.com>
> Subject: [PATCH 2/2] test/member: add functional and perf tests for sketch
>
> This patch adds functional and performance tests for sketch mode of
> membership library.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> ---
> app/test/test_member.c | 258 ++++++++++++++++++++++++++++++++++++
> app/test/test_member_perf.c | 153 ++++++++++++++++++++-
> 2 files changed, 407 insertions(+), 4 deletions(-)
>
> diff --git a/app/test/test_member.c b/app/test/test_member.c index
> 26a712439f..abe59bb9f8 100644
> --- a/app/test/test_member.c
> +++ b/app/test/test_member.c
> @@ -4,6 +4,7 @@
>
> /* This test is for membership library's simple feature test */
>
> +#include <math.h>
> #include "test.h"
>
> #include <rte_memcpy.h>
> @@ -28,6 +29,7 @@ test_member(void)
> struct rte_member_setsum *setsum_ht;
> struct rte_member_setsum *setsum_cache; struct rte_member_setsum
> *setsum_vbf;
> +struct rte_member_setsum *setsum_sketch;
>
> /* 5-tuple key type */
> struct flow_key {
> @@ -108,6 +110,21 @@ static struct rte_member_parameters params = {
> .socket_id = 0 /* NUMA Socket ID for
> memory. */
> };
>
> +/* for sketch definitions */
> +#define TOP_K 10
> +#define HH_PKT_SIZE 16
> +#define SKETCH_ERROR_RATE 0.05
> +#define SKETCH_SAMPLE_RATE 0.001
> +#define PRINT_OUT_COUNT 20
> +
> +#define SKETCH_LARGEST_KEY_SIZE 1000000 #define SKETCH_TOTAL_KEY 500
> +#define NUM_OF_KEY(key) {\
> + (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (key + 1)) \ }
> +
> +void *heavy_hitters[TOP_K];
> +
> /*
> * Sequence of operations for find existing setsummary
> *
> @@ -684,6 +701,243 @@ perform_free(void)
> rte_member_free(setsum_vbf);
> }
>
> +static void
> +print_out_sketch_results(uint64_t *count_result, member_set_t *heavy_set,
> + uint32_t print_num, bool count_byte) {
> + uint32_t i;
> +
> + for (i = 0; i < print_num; i++) {
> + if (count_byte)
> + printf("key %2u, count %8lu, real count %8u, "
> + "heavy_set %u, deviation rate [%.04f]\n",
> + i, count_result[i],
> + (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE /
> (i + 1)) *
> + HH_PKT_SIZE,
> + heavy_set[i],
> + fabs((float)count_result[i] -
> (float)NUM_OF_KEY(i) * HH_PKT_SIZE) /
> + ((float)NUM_OF_KEY(i) * HH_PKT_SIZE));
> + else
> + printf("key %2u, count %8lu, real count %8u, "
> + "heavy_set %u, deviation rate [%.04f]\n",
> + i, count_result[i],
> + (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE /
> (i + 1)),
> + heavy_set[i],
> + fabs((float)count_result[i] -
> (float)NUM_OF_KEY(i)) /
> + (float)NUM_OF_KEY(i));
> + }
> +}
> +
> +static int
> +sketch_test(uint32_t *keys, uint32_t total_pkt, int count_byte, int
> +reset_test) {
> + uint32_t i;
> + uint64_t result_count[SKETCH_TOTAL_KEY];
> + member_set_t heavy_set[SKETCH_TOTAL_KEY];
> + uint64_t count[TOP_K];
> + int ret;
> + int hh_cnt;
> +
> + setsum_sketch = rte_member_create(¶ms);
> + if (setsum_sketch == NULL) {
> + printf("Creation of setsums fail\n");
> + return -1;
> + }
> +
> + for (i = 0; i < total_pkt; i++) {
> + if (count_byte)
> + ret = rte_member_add_byte_count(setsum_sketch,
> &keys[i], HH_PKT_SIZE);
> + else
> + ret = rte_member_add(setsum_sketch, &keys[i], 1);
> +
> + if (ret < 0) {
> + printf("rte_member_add Failed! Error [%d]\n", ret);
I'm afraid rte_member_free(setsum_sketch) is missing here.
Same code below with "return -1;" should be checked.
> +
> + return -1;
> + }
> + }
> +
> + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> + uint32_t tmp_key = i;
> +
> + rte_member_query_count(setsum_sketch, (void *)&tmp_key,
> &result_count[i]);
> + rte_member_lookup(setsum_sketch, (void *)&tmp_key,
> &heavy_set[i]);
> + }
> +
> + print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT,
> +count_byte);
> +
> + hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters,
> count);
> + if (hh_cnt < 0) {
> + printf("sketch report heavy hitter error!");
> +
> + return -1;
> + }
> +
> + printf("Report heavy hitters:");
> + for (i = 0; i < (unsigned int)hh_cnt; i++) {
> + printf("%u: %lu\t",
> + *((uint32_t *)heavy_hitters[i]), count[i]);
> + }
> + printf("\n");
> +
> + if (reset_test) {
> + printf("\nEntering Sketch Reset Test Process!\n");
> + rte_member_reset(setsum_sketch);
> +
> + /* after reset, check some key's count */
> + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> + uint32_t tmp_key = i;
> +
> + rte_member_query_count(setsum_sketch, (void
> *)&tmp_key, &result_count[i]);
> + rte_member_lookup(setsum_sketch, (void *)&tmp_key,
> &heavy_set[i]);
> + }
> +
> + print_out_sketch_results(result_count, heavy_set,
> PRINT_OUT_COUNT,
> +count_byte);
> +
> + printf("\nReinsert keys after Sketch Reset!\n");
> + for (i = 0; i < total_pkt; i++) {
> + if (count_byte)
> + ret = rte_member_add_byte_count
> + (setsum_sketch, &keys[i],
> HH_PKT_SIZE);
> + else
> + ret = rte_member_add(setsum_sketch, &keys[i],
> 1);
> +
> + if (ret < 0) {
> + printf("rte_member_add Failed! Error [%d]\n",
> ret);
> +
> + return -1;
> + }
> + }
> +
> + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> + uint32_t tmp_key = i;
> +
> + rte_member_query_count(setsum_sketch, (void
> *)&tmp_key, &result_count[i]);
> + rte_member_lookup(setsum_sketch, (void *)&tmp_key,
> &heavy_set[i]);
> + }
> +
> + print_out_sketch_results(result_count, heavy_set,
> PRINT_OUT_COUNT,
> +count_byte);
> +
> + hh_cnt = rte_member_report_heavyhitter(setsum_sketch,
> heavy_hitters, count);
> + if (hh_cnt < 0) {
> + printf("sketch report heavy hitter error!");
> +
> + return -1;
> + }
> + printf("Report heavy hitters:");
> + for (i = 0; i < (unsigned int)hh_cnt; i++) {
> + printf("%u: %lu\t",
> + *((uint32_t *)heavy_hitters[i]), count[i]);
> + }
> + printf("\n");
> +
> + printf("\nDelete some keys!\n");
> + uint32_t tmp_key = 0;
> +
> + rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
> + tmp_key = 1;
> + rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
> +
> + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> + uint32_t tmp_key = i;
> +
> + rte_member_query_count(setsum_sketch, (void
> *)&tmp_key, &result_count[i]);
> + rte_member_lookup(setsum_sketch, (void *)&tmp_key,
> &heavy_set[i]);
> + }
> +
> + print_out_sketch_results(result_count, heavy_set,
> PRINT_OUT_COUNT,
> +count_byte);
> +
> + hh_cnt = rte_member_report_heavyhitter(setsum_sketch,
> heavy_hitters, count);
> + if (hh_cnt < 0) {
> + printf("sketch report heavy hitter error!");
> +
> + return -1;
> + }
> + printf("Report heavy hitters:");
> + for (i = 0; i < (unsigned int)hh_cnt; i++) {
> + printf("%u: %lu\t",
> + *((uint32_t *)heavy_hitters[i]), count[i]);
> + }
> + printf("\n");
> + }
> +
> + rte_member_free(setsum_sketch);
> + return 0;
> +}
> +
> +static int
> +test_member_sketch(void)
> +{
> + unsigned int i, j, index;
> + uint32_t total_pkt = 0;
> + uint32_t *keys;
> + int count_byte = 0;
> +
> + for (i = 0; i < SKETCH_TOTAL_KEY; i++)
> + total_pkt += ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1));
> +
> + printf("\nTotal key count [%u] in Sketch Autotest\n", total_pkt);
> +
> + keys = rte_zmalloc(NULL, sizeof(uint32_t) * total_pkt, 0);
> +
> + if (keys == NULL) {
> + printf("RTE_ZMALLOC failed\n");
> + return -1;
> + }
> +
> + index = 0;
> + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> + for (j = 0; j < ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)); j++)
> + keys[index++] = i;
> + }
> +
> + /* shuffle the keys */
> + for (i = index - 1; i > 0; i--) {
> + uint32_t swap_idx = rte_rand() % i;
> + uint32_t tmp_key = keys[i];
> +
> + keys[i] = keys[swap_idx];
> + keys[swap_idx] = tmp_key;
> + }
> +
> + params.key_len = 4;
> + params.name = "test_member_sketch";
> + params.type = RTE_MEMBER_TYPE_SKETCH;
> + params.error_rate = SKETCH_ERROR_RATE;
> + params.sample_rate = SKETCH_SAMPLE_RATE;
> + params.extra_flag = 0;
> + params.top_k = TOP_K;
> + params.prim_hash_seed = rte_rdtsc();
> + int reset_test = 0;
> +
> + printf("Default sketching params: Error Rate: [%f]\tSample Rate:
> [%f]\tTopK: [%d]\n",
> + SKETCH_ERROR_RATE, SKETCH_SAMPLE_RATE, TOP_K);
> +
> + printf("\n[Sketch with Fixed Sampling Rate Mode]\n");
> + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
Same here, will keys memory allocated by rte_zmalloc be freed in other place?
> + return -1;
> +
> + params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
> + printf("\n[Sketch with Always Bounded Mode]\n");
> + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
> + return -1;
> +
> + count_byte = 1;
> + params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
> + printf("\n[Sketch with Packet Size Mode]\n");
> + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
> + return -1;
> +
> + count_byte = 0;
> + params.extra_flag = 0;
snip
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 1/2] member: implement NitroSketch mode
2022-08-10 7:45 ` [PATCH 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-08-16 8:59 ` Ferruh Yigit
2022-08-26 2:32 ` Rong, Leyi
0 siblings, 1 reply; 34+ messages in thread
From: Ferruh Yigit @ 2022-08-16 8:59 UTC (permalink / raw)
To: Leyi Rong, yipeng1.wang, zaoxingliu, sameh.gobriel; +Cc: dev
On 8/10/2022 8:45 AM, Leyi Rong wrote:
> Sketching algorithm provide high-fidelity approximate measurements and
> appears as a promising alternative to traditional approches such as
> packet sampling.
>
> NitroSketch [1] is a software sketching framework that optimizes
> performance, provides accuracy guarantees, and supports a variety of
> sketches.
>
> This commit adds a new data structure called sketch into
> membership library. This new data structure is an efficient
> way to profile the traffic for heavy hitters. Also use min-heap
> structure to maintain the top-k flow keys.
>
> [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
> Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
> Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
Hi Leyi,
If there will be a new version, can you please put the link of the paper
to the commit log, for reference.
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 2/2] test/member: add functional and perf tests for sketch
2022-08-11 2:33 ` Suanming Mou
@ 2022-08-26 2:26 ` Rong, Leyi
0 siblings, 0 replies; 34+ messages in thread
From: Rong, Leyi @ 2022-08-26 2:26 UTC (permalink / raw)
To: Suanming Mou, Wang, Yipeng1, zaoxingliu, Gobriel, Sameh; +Cc: dev
Hi Suanming,
The issues you've mentioned will be fixed in v2, thanks!
> -----Original Message-----
> From: Suanming Mou <suanmingm@nvidia.com>
> Sent: Thursday, August 11, 2022 10:33 AM
> To: Rong, Leyi <leyi.rong@intel.com>; Wang, Yipeng1
> <yipeng1.wang@intel.com>; zaoxingliu@gmail.com; Gobriel, Sameh
> <sameh.gobriel@intel.com>
> Cc: dev@dpdk.org
> Subject: RE: [PATCH 2/2] test/member: add functional and perf tests for sketch
>
> Hi,
>
> > -----Original Message-----
> > From: Leyi Rong <leyi.rong@intel.com>
> > Sent: Wednesday, August 10, 2022 3:45 PM
> > To: yipeng1.wang@intel.com; zaoxingliu@gmail.com;
> > sameh.gobriel@intel.com
> > Cc: dev@dpdk.org; Leyi Rong <leyi.rong@intel.com>
> > Subject: [PATCH 2/2] test/member: add functional and perf tests for
> > sketch
> >
> > This patch adds functional and performance tests for sketch mode of
> > membership library.
> >
> > Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> > Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> > ---
> > app/test/test_member.c | 258 ++++++++++++++++++++++++++++++++++++
> > app/test/test_member_perf.c | 153 ++++++++++++++++++++-
> > 2 files changed, 407 insertions(+), 4 deletions(-)
> >
> > diff --git a/app/test/test_member.c b/app/test/test_member.c index
> > 26a712439f..abe59bb9f8 100644
> > --- a/app/test/test_member.c
> > +++ b/app/test/test_member.c
> > @@ -4,6 +4,7 @@
> >
> > /* This test is for membership library's simple feature test */
> >
> > +#include <math.h>
> > #include "test.h"
> >
> > #include <rte_memcpy.h>
> > @@ -28,6 +29,7 @@ test_member(void)
> > struct rte_member_setsum *setsum_ht;
> > struct rte_member_setsum *setsum_cache; struct rte_member_setsum
> > *setsum_vbf;
> > +struct rte_member_setsum *setsum_sketch;
> >
> > /* 5-tuple key type */
> > struct flow_key {
> > @@ -108,6 +110,21 @@ static struct rte_member_parameters params = {
> > .socket_id = 0 /* NUMA Socket ID for
> > memory. */
> > };
> >
> > +/* for sketch definitions */
> > +#define TOP_K 10
> > +#define HH_PKT_SIZE 16
> > +#define SKETCH_ERROR_RATE 0.05
> > +#define SKETCH_SAMPLE_RATE 0.001
> > +#define PRINT_OUT_COUNT 20
> > +
> > +#define SKETCH_LARGEST_KEY_SIZE 1000000 #define SKETCH_TOTAL_KEY
> 500
> > +#define NUM_OF_KEY(key) {\
> > + (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (key + 1)) \ }
> > +
> > +void *heavy_hitters[TOP_K];
> > +
> > /*
> > * Sequence of operations for find existing setsummary
> > *
> > @@ -684,6 +701,243 @@ perform_free(void)
> > rte_member_free(setsum_vbf);
> > }
> >
> > +static void
> > +print_out_sketch_results(uint64_t *count_result, member_set_t *heavy_set,
> > + uint32_t print_num, bool count_byte) {
> > + uint32_t i;
> > +
> > + for (i = 0; i < print_num; i++) {
> > + if (count_byte)
> > + printf("key %2u, count %8lu, real count %8u, "
> > + "heavy_set %u, deviation rate [%.04f]\n",
> > + i, count_result[i],
> > + (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE /
> > (i + 1)) *
> > + HH_PKT_SIZE,
> > + heavy_set[i],
> > + fabs((float)count_result[i] -
> > (float)NUM_OF_KEY(i) * HH_PKT_SIZE) /
> > + ((float)NUM_OF_KEY(i) * HH_PKT_SIZE));
> > + else
> > + printf("key %2u, count %8lu, real count %8u, "
> > + "heavy_set %u, deviation rate [%.04f]\n",
> > + i, count_result[i],
> > + (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE /
> > (i + 1)),
> > + heavy_set[i],
> > + fabs((float)count_result[i] -
> > (float)NUM_OF_KEY(i)) /
> > + (float)NUM_OF_KEY(i));
> > + }
> > +}
> > +
> > +static int
> > +sketch_test(uint32_t *keys, uint32_t total_pkt, int count_byte, int
> > +reset_test) {
> > + uint32_t i;
> > + uint64_t result_count[SKETCH_TOTAL_KEY];
> > + member_set_t heavy_set[SKETCH_TOTAL_KEY];
> > + uint64_t count[TOP_K];
> > + int ret;
> > + int hh_cnt;
> > +
> > + setsum_sketch = rte_member_create(¶ms);
> > + if (setsum_sketch == NULL) {
> > + printf("Creation of setsums fail\n");
> > + return -1;
> > + }
> > +
> > + for (i = 0; i < total_pkt; i++) {
> > + if (count_byte)
> > + ret = rte_member_add_byte_count(setsum_sketch,
> > &keys[i], HH_PKT_SIZE);
> > + else
> > + ret = rte_member_add(setsum_sketch, &keys[i], 1);
> > +
> > + if (ret < 0) {
> > + printf("rte_member_add Failed! Error [%d]\n", ret);
>
> I'm afraid rte_member_free(setsum_sketch) is missing here.
> Same code below with "return -1;" should be checked.
>
> > +
> > + return -1;
> > + }
> > + }
> > +
> > + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> > + uint32_t tmp_key = i;
> > +
> > + rte_member_query_count(setsum_sketch, (void *)&tmp_key,
> > &result_count[i]);
> > + rte_member_lookup(setsum_sketch, (void *)&tmp_key,
> > &heavy_set[i]);
> > + }
> > +
> > + print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT,
> > +count_byte);
> > +
> > + hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters,
> > count);
> > + if (hh_cnt < 0) {
> > + printf("sketch report heavy hitter error!");
> > +
> > + return -1;
> > + }
> > +
> > + printf("Report heavy hitters:");
> > + for (i = 0; i < (unsigned int)hh_cnt; i++) {
> > + printf("%u: %lu\t",
> > + *((uint32_t *)heavy_hitters[i]), count[i]);
> > + }
> > + printf("\n");
> > +
> > + if (reset_test) {
> > + printf("\nEntering Sketch Reset Test Process!\n");
> > + rte_member_reset(setsum_sketch);
> > +
> > + /* after reset, check some key's count */
> > + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> > + uint32_t tmp_key = i;
> > +
> > + rte_member_query_count(setsum_sketch, (void
> > *)&tmp_key, &result_count[i]);
> > + rte_member_lookup(setsum_sketch, (void *)&tmp_key,
> > &heavy_set[i]);
> > + }
> > +
> > + print_out_sketch_results(result_count, heavy_set,
> > PRINT_OUT_COUNT,
> > +count_byte);
> > +
> > + printf("\nReinsert keys after Sketch Reset!\n");
> > + for (i = 0; i < total_pkt; i++) {
> > + if (count_byte)
> > + ret = rte_member_add_byte_count
> > + (setsum_sketch, &keys[i],
> > HH_PKT_SIZE);
> > + else
> > + ret = rte_member_add(setsum_sketch, &keys[i],
> > 1);
> > +
> > + if (ret < 0) {
> > + printf("rte_member_add Failed! Error [%d]\n",
> > ret);
> > +
> > + return -1;
> > + }
> > + }
> > +
> > + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> > + uint32_t tmp_key = i;
> > +
> > + rte_member_query_count(setsum_sketch, (void
> > *)&tmp_key, &result_count[i]);
> > + rte_member_lookup(setsum_sketch, (void *)&tmp_key,
> > &heavy_set[i]);
> > + }
> > +
> > + print_out_sketch_results(result_count, heavy_set,
> > PRINT_OUT_COUNT,
> > +count_byte);
> > +
> > + hh_cnt = rte_member_report_heavyhitter(setsum_sketch,
> > heavy_hitters, count);
> > + if (hh_cnt < 0) {
> > + printf("sketch report heavy hitter error!");
> > +
> > + return -1;
> > + }
> > + printf("Report heavy hitters:");
> > + for (i = 0; i < (unsigned int)hh_cnt; i++) {
> > + printf("%u: %lu\t",
> > + *((uint32_t *)heavy_hitters[i]), count[i]);
> > + }
> > + printf("\n");
> > +
> > + printf("\nDelete some keys!\n");
> > + uint32_t tmp_key = 0;
> > +
> > + rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
> > + tmp_key = 1;
> > + rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
> > +
> > + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> > + uint32_t tmp_key = i;
> > +
> > + rte_member_query_count(setsum_sketch, (void
> > *)&tmp_key, &result_count[i]);
> > + rte_member_lookup(setsum_sketch, (void *)&tmp_key,
> > &heavy_set[i]);
> > + }
> > +
> > + print_out_sketch_results(result_count, heavy_set,
> > PRINT_OUT_COUNT,
> > +count_byte);
> > +
> > + hh_cnt = rte_member_report_heavyhitter(setsum_sketch,
> > heavy_hitters, count);
> > + if (hh_cnt < 0) {
> > + printf("sketch report heavy hitter error!");
> > +
> > + return -1;
> > + }
> > + printf("Report heavy hitters:");
> > + for (i = 0; i < (unsigned int)hh_cnt; i++) {
> > + printf("%u: %lu\t",
> > + *((uint32_t *)heavy_hitters[i]), count[i]);
> > + }
> > + printf("\n");
> > + }
> > +
> > + rte_member_free(setsum_sketch);
> > + return 0;
> > +}
> > +
> > +static int
> > +test_member_sketch(void)
> > +{
> > + unsigned int i, j, index;
> > + uint32_t total_pkt = 0;
> > + uint32_t *keys;
> > + int count_byte = 0;
> > +
> > + for (i = 0; i < SKETCH_TOTAL_KEY; i++)
> > + total_pkt += ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1));
> > +
> > + printf("\nTotal key count [%u] in Sketch Autotest\n", total_pkt);
> > +
> > + keys = rte_zmalloc(NULL, sizeof(uint32_t) * total_pkt, 0);
> > +
> > + if (keys == NULL) {
> > + printf("RTE_ZMALLOC failed\n");
> > + return -1;
> > + }
> > +
> > + index = 0;
> > + for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
> > + for (j = 0; j < ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)); j++)
> > + keys[index++] = i;
> > + }
> > +
> > + /* shuffle the keys */
> > + for (i = index - 1; i > 0; i--) {
> > + uint32_t swap_idx = rte_rand() % i;
> > + uint32_t tmp_key = keys[i];
> > +
> > + keys[i] = keys[swap_idx];
> > + keys[swap_idx] = tmp_key;
> > + }
> > +
> > + params.key_len = 4;
> > + params.name = "test_member_sketch";
> > + params.type = RTE_MEMBER_TYPE_SKETCH;
> > + params.error_rate = SKETCH_ERROR_RATE;
> > + params.sample_rate = SKETCH_SAMPLE_RATE;
> > + params.extra_flag = 0;
> > + params.top_k = TOP_K;
> > + params.prim_hash_seed = rte_rdtsc();
> > + int reset_test = 0;
> > +
> > + printf("Default sketching params: Error Rate: [%f]\tSample Rate:
> > [%f]\tTopK: [%d]\n",
> > + SKETCH_ERROR_RATE, SKETCH_SAMPLE_RATE, TOP_K);
> > +
> > + printf("\n[Sketch with Fixed Sampling Rate Mode]\n");
> > + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
>
> Same here, will keys memory allocated by rte_zmalloc be freed in other place?
>
> > + return -1;
> > +
> > + params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
> > + printf("\n[Sketch with Always Bounded Mode]\n");
> > + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
> > + return -1;
> > +
> > + count_byte = 1;
> > + params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
> > + printf("\n[Sketch with Packet Size Mode]\n");
> > + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0)
> > + return -1;
> > +
> > + count_byte = 0;
> > + params.extra_flag = 0;
>
>
> snip
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 1/2] member: implement NitroSketch mode
2022-08-16 8:59 ` Ferruh Yigit
@ 2022-08-26 2:32 ` Rong, Leyi
0 siblings, 0 replies; 34+ messages in thread
From: Rong, Leyi @ 2022-08-26 2:32 UTC (permalink / raw)
To: Ferruh Yigit, Wang, Yipeng1, zaoxingliu, Gobriel, Sameh; +Cc: dev
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@xilinx.com>
> Sent: Tuesday, August 16, 2022 4:59 PM
> To: Rong, Leyi <leyi.rong@intel.com>; Wang, Yipeng1
> <yipeng1.wang@intel.com>; zaoxingliu@gmail.com; Gobriel, Sameh
> <sameh.gobriel@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [PATCH 1/2] member: implement NitroSketch mode
>
> On 8/10/2022 8:45 AM, Leyi Rong wrote:
> > Sketching algorithm provide high-fidelity approximate measurements and
> > appears as a promising alternative to traditional approches such as
> > packet sampling.
> >
> > NitroSketch [1] is a software sketching framework that optimizes
> > performance, provides accuracy guarantees, and supports a variety of
> > sketches.
> >
> > This commit adds a new data structure called sketch into membership
> > library. This new data structure is an efficient way to profile the
> > traffic for heavy hitters. Also use min-heap structure to maintain the
> > top-k flow keys.
> >
> > [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
> > Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
> > Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
>
> Hi Leyi,
>
> If there will be a new version, can you please put the link of the paper to the
> commit log, for reference.
Hi Ferruh,
Sure! Will attach the link in v2. Thanks!
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 0/2] introduce NitroSketch Mode into membership library
2022-08-10 7:45 [PATCH 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-08-10 7:45 ` [PATCH 1/2] member: implement NitroSketch mode Leyi Rong
2022-08-10 7:45 ` [PATCH 2/2] test/member: add functional and perf tests for sketch Leyi Rong
@ 2022-08-31 6:46 ` Leyi Rong
2022-08-31 6:46 ` [PATCH v2 1/2] member: implement NitroSketch mode Leyi Rong
2022-08-31 6:46 ` [PATCH v2 2/2] test/member: add functional and perf tests for sketch Leyi Rong
2022-09-15 2:14 ` [PATCH v3 0/2] introduce NitroSketch Mode into membership library Leyi Rong
` (2 subsequent siblings)
5 siblings, 2 replies; 34+ messages in thread
From: Leyi Rong @ 2022-08-31 6:46 UTC (permalink / raw)
To: ferruh.yigit, suanmingm, yipeng1.wang, zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
This patchset introduce a brand new NitroSketch Mode into membership
library. This algorithm provides high-fidelity approximate measurements
and appears as a promissing alternative to triditional approches such as
packet sampling.
---
v2:
- attach paper link to commit log.
- fix potential memory leaks in test_member.c.
- build error fix according to CI build fail log.
Leyi Rong (2):
member: implement NitroSketch mode
test/member: add functional and perf tests for sketch
app/test/test_member.c | 272 ++++++++++++
app/test/test_member_perf.c | 153 ++++++-
lib/member/meson.build | 38 +-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 151 ++++++-
lib/member/rte_member_heap.h | 424 ++++++++++++++++++
lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 97 +++++
lib/member/rte_member_sketch_avx512.c | 69 +++
lib/member/rte_member_sketch_avx512.h | 36 ++
lib/member/rte_xxh64_avx512.h | 117 +++++
lib/member/version.map | 3 +
12 files changed, 2021 insertions(+), 8 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 1/2] member: implement NitroSketch mode
2022-08-31 6:46 ` [PATCH v2 0/2] introduce NitroSketch Mode into membership library Leyi Rong
@ 2022-08-31 6:46 ` Leyi Rong
2022-09-13 14:56 ` David Marchand
2022-08-31 6:46 ` [PATCH v2 2/2] test/member: add functional and perf tests for sketch Leyi Rong
1 sibling, 1 reply; 34+ messages in thread
From: Leyi Rong @ 2022-08-31 6:46 UTC (permalink / raw)
To: ferruh.yigit, suanmingm, yipeng1.wang, zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
Sketching algorithm provide high-fidelity approximate measurements and
appears as a promising alternative to traditional approaches such as
packet sampling.
NitroSketch [1] is a software sketching framework that optimizes
performance, provides accuracy guarantees, and supports a variety of
sketches.
This commit adds a new data structure called sketch into
membership library. This new data structure is an efficient
way to profile the traffic for heavy hitters. Also use min-heap
structure to maintain the top-k flow keys.
[1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
lib/member/meson.build | 38 +-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 151 ++++++-
lib/member/rte_member_heap.h | 424 ++++++++++++++++++
lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 97 +++++
lib/member/rte_member_sketch_avx512.c | 69 +++
lib/member/rte_member_sketch_avx512.h | 36 ++
lib/member/rte_xxh64_avx512.h | 117 +++++
lib/member/version.map | 3 +
10 files changed, 1600 insertions(+), 4 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
diff --git a/lib/member/meson.build b/lib/member/meson.build
index e06fddc240..9b3418c25c 100644
--- a/lib/member/meson.build
+++ b/lib/member/meson.build
@@ -7,6 +7,42 @@ if is_windows
subdir_done()
endif
-sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c')
+sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c', 'rte_member_sketch.c')
headers = files('rte_member.h')
deps += ['hash']
+includes += include_directories('../hash', '../ring')
+
+# compile AVX512 version if:
+# we are building 64-bit binary AND binutils can generate proper code
+if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
+ # compile AVX512 version if either:
+ # a. we have AVX512 supported in minimum instruction set
+ # baseline
+ # b. it's not minimum instruction set, but supported by
+ # compiler
+ #
+ # in former case, just add avx512 C file to files list
+ # in latter case, compile c file to static lib, using correct
+ # compiler flags, and then have the .o file from static lib
+ # linked into main lib.
+ sketch_avx512_cpu_support = (
+ cc.get_define('__AVX512F__', args: machine_args) != ''
+ )
+
+ if sketch_avx512_cpu_support == true
+ cflags += ['-DCC_AVX512_SUPPORT']
+ if cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
+ cflags += ['-mavx512f', '-mavx512dq', '-mavx512ifma']
+ endif
+ sources += files('rte_member_sketch_avx512.c')
+ elif cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
+ cflags += ['-DCC_AVX512_SUPPORT']
+ cflags += ['-mavx512f', '-mavx512dq', '-mavx512ifma']
+ sketch_avx512_tmp = static_library('sketch_avx512_tmp',
+ 'rte_member_sketch_avx512.c',
+ include_directories: includes,
+ dependencies: static_rte_eal,
+ c_args: cflags)
+ objs += sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
+ endif
+endif
diff --git a/lib/member/rte_member.c b/lib/member/rte_member.c
index 7e1632e6b5..8f859f7fbd 100644
--- a/lib/member/rte_member.c
+++ b/lib/member/rte_member.c
@@ -9,10 +9,12 @@
#include <rte_malloc.h>
#include <rte_errno.h>
#include <rte_tailq.h>
+#include <rte_ring_elem.h>
#include "rte_member.h"
#include "rte_member_ht.h"
#include "rte_member_vbf.h"
+#include "rte_member_sketch.h"
TAILQ_HEAD(rte_member_list, rte_tailq_entry);
static struct rte_tailq_elem rte_member_tailq = {
@@ -72,6 +74,9 @@ rte_member_free(struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_free_vbf(setsum);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_free_sketch(setsum);
+ break;
default:
break;
}
@@ -86,6 +91,8 @@ rte_member_create(const struct rte_member_parameters *params)
struct rte_member_list *member_list;
struct rte_member_setsum *setsum;
int ret;
+ char ring_name[RTE_RING_NAMESIZE];
+ struct rte_ring *sketch_key_ring = NULL;
if (params == NULL) {
rte_errno = EINVAL;
@@ -100,6 +107,16 @@ rte_member_create(const struct rte_member_parameters *params)
return NULL;
}
+ if (params->type == RTE_MEMBER_TYPE_SKETCH) {
+ snprintf(ring_name, sizeof(ring_name), "SK_%s", params->name);
+ sketch_key_ring = rte_ring_create_elem(ring_name, sizeof(uint32_t),
+ rte_align32pow2(params->top_k), params->socket_id, 0);
+ if (sketch_key_ring == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Ring Memory allocation failed\n");
+ return NULL;
+ }
+ }
+
member_list = RTE_TAILQ_CAST(rte_member_tailq.head, rte_member_list);
rte_mcfg_tailq_write_lock();
@@ -145,6 +162,9 @@ rte_member_create(const struct rte_member_parameters *params)
case RTE_MEMBER_TYPE_VBF:
ret = rte_member_create_vbf(setsum, params);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ ret = rte_member_create_sketch(setsum, params, sketch_key_ring);
+ break;
default:
goto error_unlock_exit;
}
@@ -162,6 +182,7 @@ rte_member_create(const struct rte_member_parameters *params)
error_unlock_exit:
rte_free(te);
rte_free(setsum);
+ rte_ring_free(sketch_key_ring);
rte_mcfg_tailq_write_unlock();
return NULL;
}
@@ -178,6 +199,23 @@ rte_member_add(const struct rte_member_setsum *setsum, const void *key,
return rte_member_add_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_add_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch(setsum, key, set_id);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count)
+{
+ if (setsum == NULL || key == NULL || byte_count == 0)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch_byte_count(setsum, key, byte_count);
default:
return -EINVAL;
}
@@ -195,6 +233,8 @@ rte_member_lookup(const struct rte_member_setsum *setsum, const void *key,
return rte_member_lookup_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_lookup_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_lookup_sketch(setsum, key, set_id);
default:
return -EINVAL;
}
@@ -261,6 +301,36 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
}
}
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output)
+{
+ if (setsum == NULL || key == NULL || output == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_query_sketch(setsum, key, output);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count)
+{
+ if (setsum == NULL || key == NULL || count == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_report_heavyhitter_sketch(setsum, key, count);
+ default:
+ return -EINVAL;
+ }
+}
+
int
rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id)
@@ -272,6 +342,8 @@ rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
case RTE_MEMBER_TYPE_HT:
return rte_member_delete_ht(setsum, key, set_id);
/* current vBF implementation does not support delete function */
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_delete_sketch(setsum, key);
case RTE_MEMBER_TYPE_VBF:
default:
return -EINVAL;
@@ -290,6 +362,9 @@ rte_member_reset(const struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_reset_vbf(setsum);
return;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_reset_sketch(setsum);
+ return;
default:
return;
}
diff --git a/lib/member/rte_member.h b/lib/member/rte_member.h
index 2611015771..c133fa3ed7 100644
--- a/lib/member/rte_member.h
+++ b/lib/member/rte_member.h
@@ -39,6 +39,18 @@
* | | | not overwrite | |
* | | | existing key. | |
* +----------+---------------------+----------------+-------------------------+
+ * +==========+=============================+
+ * | type | sketch |
+ * +==========+=============================+
+ * |structure | counting bloom filter array |
+ * +----------+-----------------------------+
+ * |set id | 1: heavy set, 0: light set |
+ * | | |
+ * +----------+-----------------------------+
+ * |usages & | count size of a flow, |
+ * |properties| used for heavy hitter |
+ * | | detection. |
+ * +----------+-----------------------------+
* -->
*/
@@ -50,6 +62,8 @@ extern "C" {
#endif
#include <stdint.h>
+#include <stdbool.h>
+#include <inttypes.h>
#include <rte_common.h>
@@ -65,6 +79,20 @@ typedef uint16_t member_set_t;
#define RTE_MEMBER_BUCKET_ENTRIES 16
/** Maximum number of characters in setsum name. */
#define RTE_MEMBER_NAMESIZE 32
+/** Max value of the random number */
+#define RTE_RAND_MAX ~0LLU
+/**
+ * As packets skipped in the sampling-based algorithm, the accounting
+ * results accuracy is not guaranteed in the start stage. There should
+ * be a "convergence time" to achieve the accuracy after receiving enough
+ * packets.
+ * For sketch, use the flag if prefer always bounded mode, which only
+ * starts sampling after receiving enough packets to keep the results
+ * accuracy always bounded.
+ */
+#define RTE_MEMBER_SKETCH_ALWAYS_BOUNDED 0x01
+/** For sketch, use the flag if to count packet size instead of packet count */
+#define RTE_MEMBER_SKETCH_COUNT_BYTE 0x02
/** @internal Hash function used by membership library. */
#if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32)
@@ -104,6 +132,7 @@ struct rte_member_parameters;
enum rte_member_setsum_type {
RTE_MEMBER_TYPE_HT = 0, /**< Hash table based set summary. */
RTE_MEMBER_TYPE_VBF, /**< Vector of bloom filters. */
+ RTE_MEMBER_TYPE_SKETCH,
RTE_MEMBER_NUM_TYPE
};
@@ -114,6 +143,19 @@ enum rte_member_sig_compare_function {
RTE_MEMBER_COMPARE_NUM
};
+/* sketch update function with different implementations. */
+typedef void (*sketch_update_fn_t)(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+/* sketch lookup function with different implementations. */
+typedef uint64_t (*sketch_lookup_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
+/* sketch delete function with different implementations. */
+typedef void (*sketch_delete_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
/** @internal setsummary structure. */
struct rte_member_setsum {
enum rte_member_setsum_type type; /* Type of the set summary. */
@@ -134,6 +176,21 @@ struct rte_member_setsum {
uint32_t bit_mask; /* Bit mask to get bit location in bf. */
uint32_t num_hashes; /* Number of hash values to index bf. */
+ /* Parameters for sketch */
+ float error_rate;
+ float sample_rate;
+ uint32_t num_col;
+ uint32_t num_row;
+ int always_bounded;
+ double converge_thresh;
+ uint32_t topk;
+ uint32_t count_byte;
+ uint64_t *hash_seeds;
+ sketch_update_fn_t sketch_update; /* Pointer to the sketch update function */
+ sketch_lookup_fn_t sketch_lookup; /* Pointer to the sketch lookup function */
+ sketch_delete_fn_t sketch_delete; /* Pointer to the sketch delete function */
+
+ void *runtime_var;
uint32_t mul_shift; /* vbf internal variable used during bit test. */
uint32_t div_shift; /* vbf internal variable used during bit test. */
@@ -143,6 +200,9 @@ struct rte_member_setsum {
/* Second cache line should start here. */
uint32_t socket_id; /* NUMA Socket ID for memory. */
char name[RTE_MEMBER_NAMESIZE]; /* Name of this set summary. */
+#ifdef RTE_ARCH_X86
+ bool use_avx512;
+#endif
} __rte_cache_aligned;
/**
@@ -261,8 +321,33 @@ struct rte_member_parameters {
*/
uint32_t sec_hash_seed;
+ /**
+ * For count(min) sketch data structure, error rate defines the accuracy
+ * required by the user. Higher accuracy leads to more memory usage, but
+ * the flow size is estimated more accurately.
+ */
+ float error_rate;
+
+ /**
+ * Sampling rate means the internal sample rate of the rows of the count
+ * min sketches. Lower sampling rate can reduce CPU overhead, but the
+ * data structure will require more time to converge statistically.
+ */
+ float sample_rate;
+
+ /**
+ * How many top heavy hitter to be reported. The library will internally
+ * keep the keys of heavy hitters for final report.
+ */
+ uint32_t top_k;
+
+ /**
+ * Extra flags that may passed in by user
+ */
+ uint32_t extra_flag;
+
int socket_id; /**< NUMA Socket ID for memory. */
-};
+} __rte_cache_aligned;
/**
* @warning
@@ -418,7 +503,7 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* RTE_MEMBER_NO_MATCH by default is set as 0.
* For HT mode, the set_id has range as [1, 0x7FFF], MSB is reserved.
* For vBF mode the set id is limited by the num_set parameter when create
- * the set-summary.
+ * the set-summary. For sketch mode, this id is ignored.
* @return
* HT (cache mode) and vBF should never fail unless the set_id is not in the
* valid range. In such case -EINVAL is returned.
@@ -429,12 +514,72 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* Return 0 for HT (cache mode) if the add does not cause
* eviction, return 1 otherwise. Return 0 for non-cache mode if success,
* -ENOSPC for full, and 1 if cuckoo eviction happens.
- * Always returns 0 for vBF mode.
+ * Always returns 0 for vBF mode and sketch.
*/
int
rte_member_add(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Add the packet byte size into the sketch.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param byte_count
+ * Add the byte count of the packet into the sketch.
+ * @return
+ * Return -EINVAL for invalid parameters, otherwise return 0.
+ */
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Query packet count for a certain flow-key.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param count
+ * The output packet count or byte count.
+ * @return
+ * Return -EINVAL for invalid parameters.
+ */
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *count);
+
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Report heavyhitter flow-keys into set-summary (SS).
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param keys
+ * Pointer of the output top-k key array.
+ * @param counts
+ * Pointer of the output packet count or byte count array of the top-k keys.
+ * @return
+ * Return -EINVAL for invalid parameters. Return a positive integer indicate
+ * how many heavy hitters are reported.
+ */
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **keys, uint64_t *counts);
+
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice
diff --git a/lib/member/rte_member_heap.h b/lib/member/rte_member_heap.h
new file mode 100644
index 0000000000..3ced34160a
--- /dev/null
+++ b/lib/member/rte_member_heap.h
@@ -0,0 +1,424 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#ifndef _RTE_MEMBER_HEAP_H_
+#define _RTE_MEMBER_HEAP_H_
+
+#include <rte_ring_elem.h>
+#include "rte_member.h"
+
+#define LCHILD(x) (2 * x + 1)
+#define RCHILD(x) (2 * x + 2)
+#define PARENT(x) ((x - 1) / 2)
+
+#define HASH_BKT_SIZE 16
+#define HASH_HP_MULTI 4
+#define HASH_RESIZE_MULTI 2
+
+struct hash_bkt {
+ uint16_t sig[HASH_BKT_SIZE];
+ uint16_t idx[HASH_BKT_SIZE];
+};
+
+struct hash {
+ uint16_t bkt_cnt;
+ uint16_t num_item;
+ uint32_t seed;
+ struct hash_bkt buckets[0];
+};
+
+struct node {
+ void *key;
+ uint64_t count;
+};
+
+struct minheap {
+ uint32_t key_len;
+ uint32_t size;
+ uint32_t socket;
+ struct hash *hashtable;
+ struct node *elem;
+};
+
+static int
+hash_table_insert(const void *key, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].idx[i] == 0) {
+ table->buckets[idx].idx[i] = value;
+ table->buckets[idx].sig[i] = sig;
+ table->num_item++;
+ return 0;
+ }
+ }
+
+ return -ENOMEM;
+}
+
+static int
+hash_table_update(const void *key, int old_value, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == old_value) {
+ table->buckets[idx].idx[i] = value;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_del(const void *key, uint16_t value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == value) {
+ table->buckets[idx].idx[i] = 0;
+ table->num_item--;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_lookup(const void *key, int key_len, struct minheap *hp)
+{
+ struct hash *table = hp->hashtable;
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] != 0) {
+ uint32_t hp_idx = table->buckets[idx].idx[i] - 1;
+
+ if (memcmp(hp->elem[hp_idx].key, key, hp->key_len) == 0)
+ return hp_idx;
+ }
+ }
+
+ return -ENOENT; /* key doesn't exist */
+}
+
+static int
+resize_hash_table(struct minheap *hp)
+{
+ uint32_t i;
+ uint32_t new_bkt_cnt;
+
+ while (1) {
+ new_bkt_cnt = hp->hashtable->bkt_cnt * HASH_RESIZE_MULTI;
+
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT load factor is [%f]\n",
+ hp->hashtable->num_item / ((float)hp->hashtable->bkt_cnt * HASH_BKT_SIZE));
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT resize happen!\n");
+ rte_free(hp->hashtable);
+ hp->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ new_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, hp->socket);
+
+ if (hp->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ return -ENOMEM;
+ }
+
+ hp->hashtable->bkt_cnt = new_bkt_cnt;
+
+ for (i = 0; i < hp->size; ++i) {
+ if (hash_table_insert(hp->elem[i].key,
+ i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR,
+ "Sketch Minheap HT resize insert fail!\n");
+ break;
+ }
+ }
+ if (i == hp->size)
+ break;
+ }
+
+ return 0;
+}
+
+/* find the item in the given minheap */
+static int
+rte_member_minheap_find(struct minheap *hp, const void *key)
+{
+ int idx = hash_table_lookup(key, hp->key_len, hp);
+ return idx;
+}
+
+static int
+rte_member_minheap_init(struct minheap *heap, int size,
+ uint32_t socket, uint32_t seed)
+{
+ heap->elem = rte_zmalloc_socket(NULL, sizeof(struct node) * size,
+ RTE_CACHE_LINE_SIZE, socket);
+ if (heap->elem == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap elem allocation failed\n");
+ return -ENOMEM;
+ }
+
+ uint32_t hash_bkt_cnt = rte_align32pow2(size * HASH_HP_MULTI) / HASH_BKT_SIZE;
+
+ if (hash_bkt_cnt == 0)
+ hash_bkt_cnt = 1;
+
+ heap->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ hash_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, socket);
+
+ if (heap->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ rte_free(heap->elem);
+ return -ENOMEM;
+ }
+
+ heap->hashtable->seed = seed;
+ heap->hashtable->bkt_cnt = hash_bkt_cnt;
+ heap->socket = socket;
+
+ return 0;
+}
+
+/* swap the minheap nodes */
+static __rte_always_inline void
+rte_member_heap_swap(struct node *n1, struct node *n2)
+{
+ struct node temp = *n1;
+ *n1 = *n2;
+ *n2 = temp;
+}
+
+/* heapify function */
+static void
+rte_member_heapify(struct minheap *hp, uint32_t idx, bool update_hash)
+{
+ uint32_t smallest;
+
+ if (LCHILD(idx) < hp->size &&
+ hp->elem[LCHILD(idx)].count < hp->elem[idx].count)
+ smallest = LCHILD(idx);
+ else
+ smallest = idx;
+
+ if (RCHILD(idx) < hp->size &&
+ hp->elem[RCHILD(idx)].count < hp->elem[smallest].count)
+ smallest = RCHILD(idx);
+
+ if (smallest != idx) {
+ rte_member_heap_swap(&(hp->elem[idx]), &(hp->elem[smallest]));
+
+ if (update_hash) {
+ if (hash_table_update(hp->elem[smallest].key, idx + 1, smallest + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+
+ if (hash_table_update(hp->elem[idx].key, smallest + 1, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+ }
+ rte_member_heapify(hp, smallest, update_hash);
+ }
+}
+
+/* insert a node into the minheap */
+static int
+rte_member_minheap_insert_node(struct minheap *hp, const void *key,
+ int counter, void *key_slot,
+ struct rte_ring *free_key_slot)
+{
+ struct node nd;
+ uint32_t slot_id;
+
+ if (rte_ring_sc_dequeue_elem(free_key_slot, &slot_id, sizeof(uint32_t)) != 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap get empty keyslot failed\n");
+ return -1;
+ }
+
+ nd.count = counter;
+ nd.key = RTE_PTR_ADD(key_slot, slot_id * hp->key_len);
+
+ memcpy(nd.key, key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* delete a key from the minheap */
+static int
+rte_member_minheap_delete_node(struct minheap *hp, const void *key,
+ void *key_slot, struct rte_ring *free_key_slot)
+{
+ int idx = rte_member_minheap_find(hp, key);
+ uint32_t offset = RTE_PTR_DIFF(hp->elem[idx].key, key_slot) / hp->key_len;
+
+ if (hash_table_del(key, idx + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ rte_ring_sp_enqueue_elem(free_key_slot, &offset, sizeof(uint32_t));
+
+ if (idx == (int)(hp->size - 1)) {
+ hp->size--;
+ return 0;
+ }
+
+ hp->elem[idx] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[idx].key, hp->size, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+ rte_member_heapify(hp, idx, true);
+
+ return 0;
+}
+
+/* replace a min node with a new key. */
+static int
+rte_member_minheap_replace_node(struct minheap *hp,
+ const void *new_key,
+ int new_counter)
+{
+ struct node nd;
+ void *recycle_key = NULL;
+
+ recycle_key = hp->elem[0].key;
+
+ if (hash_table_del(recycle_key, 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ hp->elem[0] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[0].key, hp->size, 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+
+ rte_member_heapify(hp, 0, true);
+
+ nd.count = new_counter;
+ nd.key = recycle_key;
+
+ memcpy(nd.key, new_key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(new_key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace insert failed\n");
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* sort the heap into a descending array */
+static void
+rte_member_heapsort(struct minheap *hp, struct node *result_array)
+{
+ struct minheap new_hp;
+
+ /* build a new heap for using the given array */
+ new_hp.size = hp->size;
+ new_hp.key_len = hp->key_len;
+ new_hp.elem = result_array;
+ memcpy(result_array, hp->elem, hp->size * sizeof(struct node));
+
+ /* sort the new heap */
+ while (new_hp.size > 1) {
+ rte_member_heap_swap(&(new_hp.elem[0]), &(new_hp.elem[new_hp.size - 1]));
+ new_hp.size--;
+ rte_member_heapify(&new_hp, 0, false);
+ }
+}
+
+static void
+rte_member_minheap_free(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ rte_free(hp->elem);
+ rte_free(hp->hashtable);
+}
+
+static void
+rte_member_minheap_reset(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ memset(hp->elem, 0, sizeof(struct node) * hp->size);
+ hp->size = 0;
+
+ memset((char *)hp->hashtable + sizeof(struct hash), 0,
+ hp->hashtable->bkt_cnt * sizeof(struct hash_bkt));
+ hp->hashtable->num_item = 0;
+}
+
+#endif /* _RTE_MEMBER_HEAP_H_ */
diff --git a/lib/member/rte_member_sketch.c b/lib/member/rte_member_sketch.c
new file mode 100644
index 0000000000..524ba77620
--- /dev/null
+++ b/lib/member/rte_member_sketch.c
@@ -0,0 +1,594 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#include <math.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_errno.h>
+#include <rte_log.h>
+#include <rte_random.h>
+#include <rte_prefetch.h>
+#include <rte_ring_elem.h>
+
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+#include "rte_member_heap.h"
+
+#ifdef CC_AVX512_SUPPORT
+#include "rte_member_sketch_avx512.h"
+#endif /* CC_AVX512_SUPPORT */
+
+struct sketch_runtime {
+ uint64_t pkt_cnt;
+ uint32_t until_next;
+ int converged;
+ struct minheap heap;
+ struct node *report_array;
+ void *key_slots;
+ struct rte_ring *free_key_slots;
+} __rte_cache_aligned;
+
+/*
+ * Geometric sampling to calculate how many packets needs to be
+ * skipped until next update. This method can mitigate the CPU
+ * overheads compared with coin-toss sampling.
+ */
+static uint32_t
+draw_geometric(const struct rte_member_setsum *ss)
+{
+ double rand = 1;
+
+ if (ss->sample_rate == 1)
+ return 1;
+
+ while (rand == 1 || rand == 0)
+ rand = (double) rte_rand() / (double)(RTE_RAND_MAX);
+
+ return (uint32_t)ceil(log(1 - rand) / log(1 - ss->sample_rate));
+}
+
+static void
+isort(uint64_t *array, int n)
+{
+ int i;
+
+ for (i = 1; i < n; i++) {
+ uint64_t t = array[i];
+ int j;
+
+ for (j = i - 1; j >= 0; j--) {
+ if (t < array[j])
+ array[j + 1] = array[j];
+ else
+ break;
+ }
+ array[j + 1] = t;
+ }
+}
+
+static __rte_always_inline void
+swap(uint64_t *a, uint64_t *b)
+{
+ uint64_t tmp = *a;
+ *a = *b;
+ *b = tmp;
+}
+
+static uint64_t
+medianof5(uint64_t a, uint64_t b, uint64_t c, uint64_t d, uint64_t e)
+{
+ if (a > b)
+ swap(&a, &b);
+ if (c > d)
+ swap(&c, &d);
+ if (a > c) {
+ if (d > e)
+ swap(&c, &e);
+ else {
+ swap(&c, &d);
+ swap(&d, &e);
+ }
+ } else {
+ if (b > e)
+ swap(&a, &e);
+ else {
+ swap(&a, &b);
+ swap(&b, &e);
+ }
+ }
+
+ if (a > c)
+ return a > d ? d : a;
+ else
+ return b > c ? c : b;
+}
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *ring)
+{
+ struct sketch_runtime *runtime;
+ uint32_t num_col;
+ uint32_t i;
+
+ if (params->sample_rate == 0 || params->sample_rate > 1) {
+ rte_errno = EINVAL;
+ RTE_MEMBER_LOG(ERR,
+ "Membership Sketch created with invalid parameters\n");
+ return -EINVAL;
+ }
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_COUNT_BYTE)
+ ss->count_byte = 1;
+
+#ifdef RTE_ARCH_X86
+ if (ss->count_byte == 1 &&
+ rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512IFMA) == 1) {
+#ifdef CC_AVX512_SUPPORT
+ ss->use_avx512 = true;
+#else
+ ss->use_avx512 = false;
+#endif
+ }
+
+ if (ss->use_avx512 == true) {
+#ifdef CC_AVX512_SUPPORT
+ ss->num_row = NUM_ROW_VEC;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch AVX512 update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_avx512;
+ ss->sketch_lookup = sketch_lookup_avx512;
+ ss->sketch_delete = sketch_delete_avx512;
+#endif
+ } else
+#endif
+ {
+ ss->num_row = NUM_ROW_SCALAR;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch SCALAR update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_scalar;
+ ss->sketch_lookup = sketch_lookup_scalar;
+ ss->sketch_delete = sketch_delete_scalar;
+ }
+
+ ss->socket_id = params->socket_id;
+
+ if (ss->count_byte == 0)
+ num_col = 4.0 / params->error_rate / params->sample_rate;
+#ifdef RTE_ARCH_X86
+ else if (ss->use_avx512 == true)
+ num_col = rte_align32pow2(4.0 / params->error_rate);
+#endif
+ else
+ num_col = 4.0 / params->error_rate;
+
+ ss->table = rte_zmalloc_socket(NULL,
+ sizeof(uint64_t) * num_col * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->table == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Table memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->hash_seeds = rte_zmalloc_socket(NULL, sizeof(uint64_t) * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->hash_seeds == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Hashseeds memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->runtime_var = rte_zmalloc_socket(NULL, sizeof(struct sketch_runtime),
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->runtime_var == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime memory allocation failed\n");
+ rte_free(ss);
+ return -ENOMEM;
+ }
+ runtime = ss->runtime_var;
+
+ ss->num_col = num_col;
+ ss->sample_rate = params->sample_rate;
+ ss->prim_hash_seed = params->prim_hash_seed;
+ ss->sec_hash_seed = params->sec_hash_seed;
+ ss->error_rate = params->error_rate;
+ ss->topk = params->top_k;
+ ss->key_len = params->key_len;
+ runtime->heap.key_len = ss->key_len;
+
+ runtime->key_slots = rte_zmalloc_socket(NULL, ss->key_len * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->key_slots == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Key Slots allocation failed\n");
+ goto error;
+ }
+
+ runtime->free_key_slots = ring;
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime->free_key_slots,
+ &i, sizeof(uint32_t));
+
+ if (rte_member_minheap_init(&(runtime->heap), params->top_k,
+ ss->socket_id, params->prim_hash_seed) < 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap allocation failed\n");
+ goto error_runtime;
+ }
+
+ runtime->report_array = rte_zmalloc_socket(NULL, sizeof(struct node) * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->report_array == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime Report Array allocation failed\n");
+ goto error_runtime;
+ }
+
+ rte_srand(ss->prim_hash_seed);
+ for (i = 0; i < ss->num_row; i++)
+ ss->hash_seeds[i] = rte_rand();
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_ALWAYS_BOUNDED)
+ ss->always_bounded = 1;
+
+ if (ss->always_bounded) {
+ double delta = 1.0 / (pow(2, ss->num_row));
+
+ ss->converge_thresh = 10 * pow(ss->error_rate, -2.0) * sqrt(log(1 / delta));
+ }
+
+ RTE_MEMBER_LOG(DEBUG, "Sketch created, "
+ "the total memory required is %u Bytes\n", ss->num_col * ss->num_row * 8);
+
+ return 0;
+
+error_runtime:
+ rte_member_minheap_free(&runtime->heap);
+ rte_ring_free(runtime->free_key_slots);
+ rte_free(runtime->key_slots);
+error:
+ rte_free(runtime);
+ rte_free(ss);
+
+ return -ENOMEM;
+}
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col[ss->num_row];
+ uint64_t count_row[ss->num_row];
+ uint32_t cur_row;
+ uint64_t count;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ rte_prefetch0(&count_array[cur_row * ss->num_col + col[cur_row]]);
+ }
+
+ /* if sample rate is 1, it is a regular count-min, we report the min */
+ if (ss->sample_rate == 1 || ss->count_byte == 1)
+ return count_min(ss, col);
+
+ memset(count_row, 0, sizeof(uint64_t) * ss->num_row);
+
+ /* otherwise we report the median number */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_row[cur_row] = count_array[cur_row * ss->num_col + col[cur_row]];
+
+ if (ss->num_row == 5)
+ return medianof5(count_row[0], count_row[1],
+ count_row[2], count_row[3], count_row[4]);
+
+ isort(count_row, ss->num_row);
+
+ if (ss->num_row % 2 == 0) {
+ count = (count_row[ss->num_row / 2] + count_row[ss->num_row / 2 - 1]) / 2;
+ return count;
+ }
+ /* ss->num_row % 2 != 0 */
+ count = count_row[ss->num_row / 2];
+
+ return count;
+}
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* set corresponding counter to 0 */
+ count_array[cur_row * ss->num_col + col[cur_row]] = 0;
+ }
+}
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ uint64_t *output)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ *output = count;
+
+ return 0;
+}
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ uint64_t count = ss->sketch_lookup(ss, runtime_var->heap.elem[i].key);
+
+ runtime_var->heap.elem[i].count = count;
+ }
+}
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key,
+ uint64_t *count)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = setsum->runtime_var;
+
+ rte_member_update_heap(setsum);
+ rte_member_heapsort(&(runtime_var->heap), runtime_var->report_array);
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ key[i] = runtime_var->report_array[i].key;
+ count[i] = runtime_var->report_array[i].count;
+ }
+
+ return runtime_var->heap.size;
+}
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *ss,
+ const void *key, member_set_t *set_id)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ if (runtime_var->heap.size > 0 && count >= runtime_var->heap.elem[0].count)
+ *set_id = 1;
+ else
+ *set_id = 0;
+
+ if (count == 0)
+ return 0;
+ else
+ return 1;
+}
+
+static void
+should_converge(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ /* For count min sketch - L1 norm */
+ if (runtime_var->pkt_cnt > ss->converge_thresh) {
+ runtime_var->converged = 1;
+ RTE_MEMBER_LOG(DEBUG, "Sketch converged, begin sampling "
+ "from key count %"PRIu64"\n",
+ runtime_var->pkt_cnt);
+ }
+}
+
+static void
+sketch_update_row(const struct rte_member_setsum *ss, const void *key,
+ uint32_t count, uint32_t cur_row)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* sketch counter update */
+ count_array[cur_row * ss->num_col + col] +=
+ ceil(count / (ss->sample_rate));
+}
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+ count_array[cur_row * ss->num_col + col] += count;
+ }
+}
+
+static void
+heap_update(const struct rte_member_setsum *ss, const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t key_cnt = 0;
+ int found;
+
+ /* We also update the heap for this key */
+ key_cnt = ss->sketch_lookup(ss, key);
+ if (key_cnt > runtime_var->heap.elem[0].count) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ /* the key is found in the top-k heap */
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else if (runtime_var->heap.size < ss->topk) {
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ } else {
+ rte_member_minheap_replace_node(&runtime_var->heap, key, key_cnt);
+ }
+ } else if (runtime_var->heap.size < ss->topk) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ }
+}
+
+/*
+ * Add a single packet into the sketch.
+ * Sketch value is meatured by packet numbers in this mode.
+ */
+int
+rte_member_add_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ __rte_unused member_set_t set_id)
+{
+ uint32_t cur_row;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /*
+ * If sketch is measured by byte count,
+ * the rte_member_add_sketch_byte_count routine should be used.
+ */
+ if (ss->count_byte == 1) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Byte Mode, "
+ "should use rte_member_add_byte_count()!\n");
+ return -EINVAL;
+ }
+
+ if (ss->sample_rate == 1) {
+ ss->sketch_update(ss, key, 1);
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* convergence stage if it's needed */
+ if (ss->always_bounded && !runtime_var->converged) {
+ ss->sketch_update(ss, key, 1);
+
+ if (!((++runtime_var->pkt_cnt) & (INTERVAL - 1)))
+ should_converge(ss);
+
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* should we skip this packet */
+ if (*until_next >= ss->num_row) {
+ *until_next -= ss->num_row;
+ return 0;
+ }
+ cur_row = *until_next;
+ do {
+ sketch_update_row(ss, key, 1, cur_row);
+ *until_next = draw_geometric(ss);
+ if (cur_row + *until_next >= ss->num_row)
+ break;
+ cur_row += *until_next;
+ } while (1);
+
+ *until_next -= (ss->num_row - cur_row);
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+/*
+ * Add the byte count of the packet into the sketch.
+ * Sketch value is meatured by byte count numbers in this mode.
+ */
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t byte_count)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /* should not call this API if not in count byte mode */
+ if (ss->count_byte == 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Pkt Mode, "
+ "should use rte_member_add()!\n");
+ return -EINVAL;
+ }
+
+ /* there's specific optimization for the sketch update */
+ ss->sketch_update(ss, key, byte_count);
+
+ if (*until_next != 0) {
+ *until_next = *until_next - 1;
+ return 0;
+ }
+
+ *until_next = draw_geometric(ss) - 1;
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *ss,
+ const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ int found;
+
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found < 0)
+ return -1;
+
+ ss->sketch_delete(ss, key);
+
+ return rte_member_minheap_delete_node
+ (&runtime_var->heap, key, runtime_var->key_slots, runtime_var->free_key_slots);
+}
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ rte_free(ss->table);
+ rte_member_minheap_free(&runtime_var->heap);
+ rte_free(runtime_var->key_slots);
+ rte_ring_free(runtime_var->free_key_slots);
+ rte_free(runtime_var);
+}
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t *sketch = ss->table;
+ uint32_t i;
+
+ memset(sketch, 0, sizeof(uint64_t) * ss->num_col * ss->num_row);
+ rte_member_minheap_reset(&runtime_var->heap);
+ rte_ring_reset(runtime_var->free_key_slots);
+
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime_var->free_key_slots, &i, sizeof(uint32_t));
+}
diff --git a/lib/member/rte_member_sketch.h b/lib/member/rte_member_sketch.h
new file mode 100644
index 0000000000..219323008b
--- /dev/null
+++ b/lib/member/rte_member_sketch.h
@@ -0,0 +1,97 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_H_
+#define _RTE_MEMBER_SKETCH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_vect.h>
+#include <rte_ring_elem.h>
+
+#define NUM_ROW_SCALAR 5
+#define INTERVAL (1 << 15)
+
+#if !RTE_IS_POWER_OF_2(INTERVAL)
+#error sketch INTERVAL macro must be a power of 2
+#endif
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *r);
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *setsum,
+ const void *key, member_set_t *set_id);
+
+int
+rte_member_add_sketch(const struct rte_member_setsum *setsum,
+ const void *key,
+ member_set_t set_id);
+
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key, uint32_t byte_count);
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *setsum,
+ const void *key);
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output);
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss);
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *setsum);
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count);
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss);
+
+static __rte_always_inline uint64_t
+count_min(const struct rte_member_setsum *ss, const uint32_t *hash_results)
+{
+ uint64_t *count_array = ss->table;
+ uint64_t count;
+ uint32_t cur_row;
+ uint64_t min = UINT64_MAX;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ uint64_t cnt = count_array[cur_row * ss->num_col + hash_results[cur_row]];
+
+ if (cnt < min)
+ min = cnt;
+ }
+ count = min;
+
+ return count;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_H_ */
diff --git a/lib/member/rte_member_sketch_avx512.c b/lib/member/rte_member_sketch_avx512.c
new file mode 100644
index 0000000000..c83f4b6fd1
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.c
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include "rte_member_sketch_avx512.h"
+
+__rte_always_inline void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t num_col = ss->num_col;
+ uint32_t key_len = ss->key_len;
+ __m256i v_row_base;
+ __m256i v_hash_result;
+ __m512i current_sketch;
+ __m512i updated_sketch;
+ __m512i v_count;
+
+ const __m256i v_idx = _mm256_set_epi32(7, 6, 5, 4, 3, 2, 1, 0);
+ const __m256i v_col = _mm256_set1_epi32(num_col);
+
+ /* compute the hash result parallelly */
+ v_hash_result = rte_xxh64_sketch_avx512
+ (key, key_len, *(__m512i *)ss->hash_seeds, num_col);
+ v_row_base = _mm256_mullo_epi32(v_idx, v_col);
+ v_hash_result = _mm256_add_epi32(v_row_base, v_hash_result);
+
+ current_sketch =
+ _mm512_i32gather_epi64(v_hash_result, count_array, 8);
+ v_count = _mm512_set1_epi64(count);
+ updated_sketch = _mm512_add_epi64(current_sketch, v_count);
+ _mm512_i32scatter_epi64
+ ((void *)count_array, v_hash_result, updated_sketch, 8);
+}
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+
+ /* currently only for sketch byte count mode */
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds, ss->num_col);
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ return count_min(ss, col);
+}
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint64_t min = UINT64_MAX;
+ uint32_t cur_row;
+
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds,
+ RTE_ALIGN_FLOOR(ss->num_col, 32));
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ min = count_min(ss, col);
+
+ /* subtract the min value from all the counters */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_array[cur_row * ss->num_col + col[cur_row]] -= min;
+}
diff --git a/lib/member/rte_member_sketch_avx512.h b/lib/member/rte_member_sketch_avx512.h
new file mode 100644
index 0000000000..e7c25da643
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_AVX512_H_
+#define _RTE_MEMBER_SKETCH_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_vect.h>
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+#include "rte_xxh64_avx512.h"
+
+#define NUM_ROW_VEC 8
+
+void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_AVX512_H_ */
diff --git a/lib/member/rte_xxh64_avx512.h b/lib/member/rte_xxh64_avx512.h
new file mode 100644
index 0000000000..50ca1b52c7
--- /dev/null
+++ b/lib/member/rte_xxh64_avx512.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_XXH64_AVX512_H_
+#define _RTE_XXH64_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <immintrin.h>
+
+/* 0b1001111000110111011110011011000110000101111010111100101010000111 */
+static const uint64_t PRIME64_1 = 0x9E3779B185EBCA87ULL;
+/* 0b1100001010110010101011100011110100100111110101001110101101001111 */
+static const uint64_t PRIME64_2 = 0xC2B2AE3D27D4EB4FULL;
+/* 0b0001011001010110011001111011000110011110001101110111100111111001 */
+static const uint64_t PRIME64_3 = 0x165667B19E3779F9ULL;
+/* 0b1000010111101011110010100111011111000010101100101010111001100011 */
+static const uint64_t PRIME64_4 = 0x85EBCA77C2B2AE63ULL;
+/* 0b0010011111010100111010110010111100010110010101100110011111000101 */
+static const uint64_t PRIME64_5 = 0x27D4EB2F165667C5ULL;
+
+static __rte_always_inline __m512i
+xxh64_round_avx512(__m512i hash, __m512i input)
+{
+ hash = _mm512_madd52lo_epu64(hash,
+ input,
+ _mm512_set1_epi64(PRIME64_2));
+
+ hash = _mm512_rol_epi64(hash, 31);
+
+ return hash;
+}
+
+static __rte_always_inline __m512i
+xxh64_fmix_avx512(__m512i hash)
+{
+ hash = _mm512_xor_si512(hash, _mm512_srli_epi64(hash, 33));
+
+ return hash;
+}
+
+static __rte_always_inline __m256i
+rte_xxh64_sketch_avx512(const void *key, uint32_t key_len,
+ __m512i v_seed, uint32_t modulo)
+{
+ __m512i v_prime64_5, v_hash;
+ size_t remaining = key_len;
+ size_t offset = 0;
+ __m512i input;
+
+ v_prime64_5 = _mm512_set1_epi64(PRIME64_5);
+ v_hash = _mm512_add_epi64
+ (_mm512_add_epi64(v_seed, v_prime64_5),
+ _mm512_set1_epi64(key_len));
+
+ while (remaining >= 8) {
+ input = _mm512_set1_epi64(*(uint64_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ xxh64_round_avx512(_mm512_setzero_si512(), input));
+ v_hash = _mm512_madd52lo_epu64(_mm512_set1_epi64(PRIME64_4),
+ v_hash,
+ _mm512_set1_epi64(PRIME64_1));
+
+ remaining -= 8;
+ offset += 8;
+ }
+
+ if (remaining >= 4) {
+ input = _mm512_set1_epi64
+ (*(uint32_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_1)));
+ v_hash = _mm512_madd52lo_epu64
+ (_mm512_set1_epi64(PRIME64_3),
+ _mm512_rol_epi64(v_hash, 23),
+ _mm512_set1_epi64(PRIME64_2));
+
+ offset += 4;
+ remaining -= 4;
+ }
+
+ while (remaining != 0) {
+ input = _mm512_set1_epi64
+ (*(uint8_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_5)));
+ v_hash = _mm512_mullo_epi64
+ (_mm512_rol_epi64(v_hash, 11),
+ _mm512_set1_epi64(PRIME64_1));
+ offset++;
+ remaining--;
+ }
+
+ v_hash = xxh64_fmix_avx512(v_hash);
+
+ /*
+ * theoritically, such modular operations can be replaced by
+ * _mm512_rem_epi64(), but seems it depends on the compiler's
+ * implementation. so here is the limitation that the modulo
+ * value should be power of 2.
+ */
+ __m512i v_hash_remainder = _mm512_set1_epi64((modulo - 1));
+
+ return _mm512_cvtepi64_epi32(_mm512_and_si512(v_hash, v_hash_remainder));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_XXH64_AVX512_H_ */
diff --git a/lib/member/version.map b/lib/member/version.map
index 19469c6aba..35a811d0ac 100644
--- a/lib/member/version.map
+++ b/lib/member/version.map
@@ -2,6 +2,9 @@ DPDK_23 {
global:
rte_member_add;
+ rte_member_add_byte_count;
+ rte_member_query_count;
+ rte_member_report_heavyhitter;
rte_member_create;
rte_member_delete;
rte_member_find_existing;
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 2/2] test/member: add functional and perf tests for sketch
2022-08-31 6:46 ` [PATCH v2 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-08-31 6:46 ` [PATCH v2 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-08-31 6:46 ` Leyi Rong
1 sibling, 0 replies; 34+ messages in thread
From: Leyi Rong @ 2022-08-31 6:46 UTC (permalink / raw)
To: ferruh.yigit, suanmingm, yipeng1.wang, zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
This patch adds functional and performance tests for sketch mode of
membership library.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
app/test/test_member.c | 272 ++++++++++++++++++++++++++++++++++++
app/test/test_member_perf.c | 153 +++++++++++++++++++-
2 files changed, 421 insertions(+), 4 deletions(-)
diff --git a/app/test/test_member.c b/app/test/test_member.c
index 26a712439f..8266e6437b 100644
--- a/app/test/test_member.c
+++ b/app/test/test_member.c
@@ -4,6 +4,7 @@
/* This test is for membership library's simple feature test */
+#include <math.h>
#include "test.h"
#include <rte_memcpy.h>
@@ -28,6 +29,7 @@ test_member(void)
struct rte_member_setsum *setsum_ht;
struct rte_member_setsum *setsum_cache;
struct rte_member_setsum *setsum_vbf;
+struct rte_member_setsum *setsum_sketch;
/* 5-tuple key type */
struct flow_key {
@@ -108,6 +110,21 @@ static struct rte_member_parameters params = {
.socket_id = 0 /* NUMA Socket ID for memory. */
};
+/* for sketch definitions */
+#define TOP_K 10
+#define HH_PKT_SIZE 16
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define PRINT_OUT_COUNT 20
+
+#define SKETCH_LARGEST_KEY_SIZE 1000000
+#define SKETCH_TOTAL_KEY 500
+#define NUM_OF_KEY(key) {\
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (key + 1)) \
+}
+
+void *heavy_hitters[TOP_K];
+
/*
* Sequence of operations for find existing setsummary
*
@@ -684,6 +701,257 @@ perform_free(void)
rte_member_free(setsum_vbf);
}
+static void
+print_out_sketch_results(uint64_t *count_result, member_set_t *heavy_set,
+ uint32_t print_num, bool count_byte)
+{
+ uint32_t i;
+
+ for (i = 0; i < print_num; i++) {
+ if (count_byte)
+ printf("key %2u, count %8"PRIu64", real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)) *
+ HH_PKT_SIZE,
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i) * HH_PKT_SIZE) /
+ ((float)NUM_OF_KEY(i) * HH_PKT_SIZE));
+ else
+ printf("key %2u, count %8"PRIu64", real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)),
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i)) /
+ (float)NUM_OF_KEY(i));
+ }
+}
+
+static int
+sketch_test(uint32_t *keys, uint32_t total_pkt, int count_byte, int reset_test)
+{
+ uint32_t i;
+ uint64_t result_count[SKETCH_TOTAL_KEY];
+ member_set_t heavy_set[SKETCH_TOTAL_KEY];
+ uint64_t count[TOP_K];
+ int ret;
+ int hh_cnt;
+
+ setsum_sketch = rte_member_create(¶ms);
+ if (setsum_sketch == NULL) {
+ printf("Creation of setsums fail\n");
+ return -1;
+ }
+
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count(setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ if (reset_test) {
+ printf("\nEntering Sketch Reset Test Process!\n");
+ rte_member_reset(setsum_sketch);
+
+ /* after reset, check some key's count */
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ printf("\nReinsert keys after Sketch Reset!\n");
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count
+ (setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ printf("\nDelete some keys!\n");
+ uint32_t tmp_key = 0;
+
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+ tmp_key = 1;
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+ }
+
+ rte_member_free(setsum_sketch);
+ return 0;
+}
+
+static int
+test_member_sketch(void)
+{
+ unsigned int i, j, index;
+ uint32_t total_pkt = 0;
+ uint32_t *keys;
+ int count_byte = 0;
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++)
+ total_pkt += ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1));
+
+ printf("\nTotal key count [%u] in Sketch Autotest\n", total_pkt);
+
+ keys = rte_zmalloc(NULL, sizeof(uint32_t) * total_pkt, 0);
+
+ if (keys == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ return -1;
+ }
+
+ index = 0;
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ for (j = 0; j < ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)); j++)
+ keys[index++] = i;
+ }
+
+ /* shuffle the keys */
+ for (i = index - 1; i > 0; i--) {
+ uint32_t swap_idx = rte_rand() % i;
+ uint32_t tmp_key = keys[i];
+
+ keys[i] = keys[swap_idx];
+ keys[swap_idx] = tmp_key;
+ }
+
+ params.key_len = 4;
+ params.name = "test_member_sketch";
+ params.type = RTE_MEMBER_TYPE_SKETCH;
+ params.error_rate = SKETCH_ERROR_RATE;
+ params.sample_rate = SKETCH_SAMPLE_RATE;
+ params.extra_flag = 0;
+ params.top_k = TOP_K;
+ params.prim_hash_seed = rte_rdtsc();
+ int reset_test = 0;
+
+ printf("Default sketching params: Error Rate: [%f]\tSample Rate: [%f]\tTopK: [%d]\n",
+ SKETCH_ERROR_RATE, SKETCH_SAMPLE_RATE, TOP_K);
+
+ printf("\n[Sketch with Fixed Sampling Rate Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ printf("\n[Sketch with Always Bounded Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ count_byte = 1;
+ params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ printf("\n[Sketch with Packet Size Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ count_byte = 0;
+ params.extra_flag = 0;
+ reset_test = 1;
+ printf("\nreset sketch test\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ rte_free(keys);
+ return 0;
+}
+
static int
test_member(void)
{
@@ -719,6 +987,10 @@ test_member(void)
return -1;
}
+ if (test_member_sketch() < 0) {
+ perform_free();
+ return -1;
+ }
perform_free();
return 0;
}
diff --git a/app/test/test_member_perf.c b/app/test/test_member_perf.c
index 978db0020a..b7eb3e4c66 100644
--- a/app/test/test_member_perf.c
+++ b/app/test/test_member_perf.c
@@ -13,6 +13,7 @@
#include <rte_random.h>
#include <rte_memcpy.h>
#include <rte_thash.h>
+#include <math.h>
#ifdef RTE_EXEC_ENV_WINDOWS
static int
@@ -26,7 +27,7 @@ test_member_perf(void)
#include <rte_member.h>
-#define NUM_KEYSIZES 10
+#define NUM_KEYSIZES RTE_DIM(hashtest_key_lens)
#define NUM_SHUFFLES 10
#define MAX_KEYSIZE 64
#define MAX_ENTRIES (1 << 19)
@@ -36,12 +37,23 @@ test_member_perf(void)
#define BURST_SIZE 64
#define VBF_FALSE_RATE 0.03
+/* for the heavy hitter detection */
+#define SKETCH_LARGEST_KEY_SIZE (1<<15)
+#define SKETCH_PKT_SIZE 16
+#define TOP_K 100
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define NUM_ADDS (KEYS_TO_ADD * 20)
+
static unsigned int test_socket_id;
enum sstype {
HT = 0,
CACHE,
VBF,
+ SKETCH,
+ SKETCH_BOUNDED,
+ SKETCH_BYTE,
NUM_TYPE
};
@@ -88,6 +100,7 @@ static member_set_t data[NUM_TYPE][/* Array to store the data */KEYS_TO_ADD];
/* Array to store all input keys */
static uint8_t keys[KEYS_TO_ADD][MAX_KEYSIZE];
+static uint8_t hh_keys[KEYS_TO_ADD][MAX_KEYSIZE];
/* Shuffle the keys that have been added, so lookups will be totally random */
static void
@@ -136,6 +149,10 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
{
unsigned int i, j;
int num_duplicates;
+ int distinct_key = 0;
+ int count_down = SKETCH_LARGEST_KEY_SIZE;
+ uint32_t swap_idx;
+ uint8_t temp_key[MAX_KEYSIZE];
params->key_size = hashtest_key_lens[cycle];
params->cycle = cycle;
@@ -176,6 +193,22 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
/* Shuffle the random values again */
shuffle_input_keys(params);
+ for (i = 0; i < KEYS_TO_ADD; i++) {
+ if (count_down == 0) {
+ distinct_key++;
+ count_down = ceil(SKETCH_LARGEST_KEY_SIZE / (distinct_key + 1));
+ }
+ memcpy(hh_keys[i], keys[distinct_key], params->key_size);
+ count_down--;
+ }
+
+ for (i = KEYS_TO_ADD - 1; i > 0; i--) {
+ swap_idx = rte_rand() % i;
+ memcpy(temp_key, hh_keys[i], params->key_size);
+ memcpy(hh_keys[i], hh_keys[swap_idx], params->key_size);
+ memcpy(hh_keys[swap_idx], temp_key, params->key_size);
+ }
+
/* For testing miss lookup, we insert half and lookup the other half */
unsigned int entry_cnt, bf_key_cnt;
if (!miss) {
@@ -208,6 +241,44 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
params->setsum[VBF] = rte_member_create(&member_params);
if (params->setsum[VBF] == NULL)
fprintf(stderr, "VBF create fail\n");
+
+ member_params.name = "test_member_sketch";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag = 0;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_bounded";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BOUNDED] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BOUNDED] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_byte";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BYTE] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BYTE] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+
for (i = 0; i < NUM_TYPE; i++) {
if (params->setsum[i] == NULL)
return -1;
@@ -243,6 +314,39 @@ timed_adds(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_adds_sketch(struct member_perf_params *params, int type)
+{
+ const uint64_t start_tsc = rte_rdtsc();
+ unsigned int i, j, a;
+ int32_t ret;
+
+ for (i = 0; i < NUM_ADDS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ if (type == SKETCH_BYTE)
+ ret = rte_member_add_byte_count(params->setsum[type],
+ &hh_keys[j], SKETCH_PKT_SIZE);
+ else
+ ret = rte_member_add(params->setsum[type], &hh_keys[j], 1);
+ if (ret < 0) {
+ printf("Error %d in rte_member_add - key=0x", ret);
+ for (a = 0; a < params->key_size; a++)
+ printf("%02x", hh_keys[j][a]);
+ printf("type: %d\n", type);
+
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][ADD] = time_taken / NUM_ADDS;
+
+ return 0;
+}
+
static int
timed_lookups(struct member_perf_params *params, int type)
{
@@ -279,6 +383,36 @@ timed_lookups(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_lookups_sketch(struct member_perf_params *params, int type)
+{
+ unsigned int i, j;
+
+ false_data[type][params->cycle] = 0;
+
+ const uint64_t start_tsc = rte_rdtsc();
+ member_set_t result;
+ int ret;
+
+ for (i = 0; i < NUM_LOOKUPS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ ret = rte_member_lookup(params->setsum[type], &hh_keys[j],
+ &result);
+ if (ret < 0) {
+ printf("lookup wrong internally");
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][LOOKUP] = time_taken / NUM_LOOKUPS;
+
+ return 0;
+}
+
static int
timed_lookups_bulk(struct member_perf_params *params, int type)
{
@@ -531,7 +665,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_adds(¶ms, j) < 0)
return exit_with_fail("timed_adds", ¶ms,
@@ -562,6 +696,17 @@ run_all_tbl_perf_tests(void)
/* Print a dot to show progress on operations */
}
+
+ for (j = SKETCH; j < NUM_TYPE; j++) {
+ if (timed_adds_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_adds_sketch", ¶ms, i, j);
+
+ if (timed_lookups_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_lookups_sketch", ¶ms, i, j);
+ }
+
printf(".");
fflush(stdout);
@@ -574,7 +719,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_miss_lookup(¶ms, j) < 0)
return exit_with_fail("timed_miss_lookup",
¶ms, i, j);
@@ -605,7 +750,7 @@ run_all_tbl_perf_tests(void)
"fr_multi_bulk", "false_positive_rate");
/* Key size not influence False rate so just print out one key size */
for (i = 0; i < 1; i++) {
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
printf("%-18d", hashtest_key_lens[i]);
printf("%-18d", j);
printf("%-18f", (float)false_data[j][i] / NUM_LOOKUPS);
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] member: implement NitroSketch mode
2022-08-31 6:46 ` [PATCH v2 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-09-13 14:56 ` David Marchand
2022-09-14 9:42 ` Rong, Leyi
0 siblings, 1 reply; 34+ messages in thread
From: David Marchand @ 2022-09-13 14:56 UTC (permalink / raw)
To: Leyi Rong
Cc: ferruh.yigit, suanmingm, yipeng1.wang, zaoxingliu, sameh.gobriel,
dev, Bruce Richardson, Aaron Conole, Michael Santana,
Lincoln Lavoie
On Wed, Aug 31, 2022 at 8:47 AM Leyi Rong <leyi.rong@intel.com> wrote:
>
> Sketching algorithm provide high-fidelity approximate measurements and
> appears as a promising alternative to traditional approaches such as
> packet sampling.
>
> NitroSketch [1] is a software sketching framework that optimizes
> performance, provides accuracy guarantees, and supports a variety of
> sketches.
>
> This commit adds a new data structure called sketch into
> membership library. This new data structure is an efficient
> way to profile the traffic for heavy hitters. Also use min-heap
> structure to maintain the top-k flow keys.
http://mails.dpdk.org/archives/test-report/2022-August/304026.html
This patch adds new symbols in the API without going through the
experimental phase.
What is the rationale for skipping it?
>
> [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
> Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
> Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
> https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
>
> Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> ---
> lib/member/meson.build | 38 +-
> lib/member/rte_member.c | 75 ++++
> lib/member/rte_member.h | 151 ++++++-
> lib/member/rte_member_heap.h | 424 ++++++++++++++++++
> lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
> lib/member/rte_member_sketch.h | 97 +++++
> lib/member/rte_member_sketch_avx512.c | 69 +++
> lib/member/rte_member_sketch_avx512.h | 36 ++
> lib/member/rte_xxh64_avx512.h | 117 +++++
> lib/member/version.map | 3 +
> 10 files changed, 1600 insertions(+), 4 deletions(-)
> create mode 100644 lib/member/rte_member_heap.h
> create mode 100644 lib/member/rte_member_sketch.c
> create mode 100644 lib/member/rte_member_sketch.h
> create mode 100644 lib/member/rte_member_sketch_avx512.c
> create mode 100644 lib/member/rte_member_sketch_avx512.h
> create mode 100644 lib/member/rte_xxh64_avx512.h
>
> diff --git a/lib/member/meson.build b/lib/member/meson.build
> index e06fddc240..9b3418c25c 100644
> --- a/lib/member/meson.build
> +++ b/lib/member/meson.build
> @@ -7,6 +7,42 @@ if is_windows
> subdir_done()
> endif
>
> -sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c')
> +sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c', 'rte_member_sketch.c')
> headers = files('rte_member.h')
> deps += ['hash']
> +includes += include_directories('../hash', '../ring')
> +
> +# compile AVX512 version if:
> +# we are building 64-bit binary AND binutils can generate proper code
> +if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
> + # compile AVX512 version if either:
> + # a. we have AVX512 supported in minimum instruction set
> + # baseline
> + # b. it's not minimum instruction set, but supported by
> + # compiler
> + #
> + # in former case, just add avx512 C file to files list
> + # in latter case, compile c file to static lib, using correct
> + # compiler flags, and then have the .o file from static lib
> + # linked into main lib.
> + sketch_avx512_cpu_support = (
> + cc.get_define('__AVX512F__', args: machine_args) != ''
> + )
> +
> + if sketch_avx512_cpu_support == true
> + cflags += ['-DCC_AVX512_SUPPORT']
> + if cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
> + cflags += ['-mavx512f', '-mavx512dq', '-mavx512ifma']
Pushing those flags in the cflags is probably wrong, as the rest of
the library objects will be compiled with those AVX512 flags.
If later this library code is run on a non supporting AVX512 system,
it will trigger a runtime error.
Please look at how other libraries integrated AVX512.
Thanks.
--
David Marchand
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v2 1/2] member: implement NitroSketch mode
2022-09-13 14:56 ` David Marchand
@ 2022-09-14 9:42 ` Rong, Leyi
0 siblings, 0 replies; 34+ messages in thread
From: Rong, Leyi @ 2022-09-14 9:42 UTC (permalink / raw)
To: David Marchand
Cc: ferruh.yigit, suanmingm, Wang, Yipeng1, zaoxingliu, Gobriel,
Sameh, dev, Richardson, Bruce, Aaron Conole, Michael Santana,
Lincoln Lavoie
Hi David,
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, September 13, 2022 10:56 PM
> To: Rong, Leyi <leyi.rong@intel.com>
> Cc: ferruh.yigit@xilinx.com; suanmingm@nvidia.com; Wang, Yipeng1
> <yipeng1.wang@intel.com>; zaoxingliu@gmail.com; Gobriel, Sameh
> <sameh.gobriel@intel.com>; dev@dpdk.org; Richardson, Bruce
> <bruce.richardson@intel.com>; Aaron Conole <aconole@redhat.com>; Michael
> Santana <maicolgabriel@hotmail.com>; Lincoln Lavoie <lylavoie@iol.unh.edu>
> Subject: Re: [PATCH v2 1/2] member: implement NitroSketch mode
>
> On Wed, Aug 31, 2022 at 8:47 AM Leyi Rong <leyi.rong@intel.com> wrote:
> >
> > Sketching algorithm provide high-fidelity approximate measurements and
> > appears as a promising alternative to traditional approaches such as
> > packet sampling.
> >
> > NitroSketch [1] is a software sketching framework that optimizes
> > performance, provides accuracy guarantees, and supports a variety of
> > sketches.
> >
> > This commit adds a new data structure called sketch into membership
> > library. This new data structure is an efficient way to profile the
> > traffic for heavy hitters. Also use min-heap structure to maintain the
> > top-k flow keys.
>
> http://mails.dpdk.org/archives/test-report/2022-August/304026.html
>
> This patch adds new symbols in the API without going through the experimental
> phase.
> What is the rationale for skipping it?
>
Yes, these new APIs should be marked as "experimental", will fix it.
> >
> > [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
> > Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
> > Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
> > https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
> >
> > Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
> > Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> > Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> > ---
> > lib/member/meson.build | 38 +-
> > lib/member/rte_member.c | 75 ++++
> > lib/member/rte_member.h | 151 ++++++-
> > lib/member/rte_member_heap.h | 424 ++++++++++++++++++
> > lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
> > lib/member/rte_member_sketch.h | 97 +++++
> > lib/member/rte_member_sketch_avx512.c | 69 +++
> > lib/member/rte_member_sketch_avx512.h | 36 ++
> > lib/member/rte_xxh64_avx512.h | 117 +++++
> > lib/member/version.map | 3 +
> > 10 files changed, 1600 insertions(+), 4 deletions(-) create mode
> > 100644 lib/member/rte_member_heap.h create mode 100644
> > lib/member/rte_member_sketch.c create mode 100644
> > lib/member/rte_member_sketch.h create mode 100644
> > lib/member/rte_member_sketch_avx512.c
> > create mode 100644 lib/member/rte_member_sketch_avx512.h
> > create mode 100644 lib/member/rte_xxh64_avx512.h
> >
> > diff --git a/lib/member/meson.build b/lib/member/meson.build index
> > e06fddc240..9b3418c25c 100644
> > --- a/lib/member/meson.build
> > +++ b/lib/member/meson.build
> > @@ -7,6 +7,42 @@ if is_windows
> > subdir_done()
> > endif
> >
> > -sources = files('rte_member.c', 'rte_member_ht.c',
> > 'rte_member_vbf.c')
> > +sources = files('rte_member.c', 'rte_member_ht.c',
> > +'rte_member_vbf.c', 'rte_member_sketch.c')
> > headers = files('rte_member.h')
> > deps += ['hash']
> > +includes += include_directories('../hash', '../ring')
> > +
> > +# compile AVX512 version if:
> > +# we are building 64-bit binary AND binutils can generate proper code
> > +if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
> > + # compile AVX512 version if either:
> > + # a. we have AVX512 supported in minimum instruction set
> > + # baseline
> > + # b. it's not minimum instruction set, but supported by
> > + # compiler
> > + #
> > + # in former case, just add avx512 C file to files list
> > + # in latter case, compile c file to static lib, using correct
> > + # compiler flags, and then have the .o file from static lib
> > + # linked into main lib.
> > + sketch_avx512_cpu_support = (
> > + cc.get_define('__AVX512F__', args: machine_args) != ''
> > + )
> > +
> > + if sketch_avx512_cpu_support == true
> > + cflags += ['-DCC_AVX512_SUPPORT']
> > + if cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
> > + cflags += ['-mavx512f', '-mavx512dq', '-mavx512ifma']
>
> Pushing those flags in the cflags is probably wrong, as the rest of the library
> objects will be compiled with those AVX512 flags.
> If later this library code is run on a non supporting AVX512 system, it will trigger
> a runtime error.
>
Yes, thanks for the reminder, will fix it.
>
> Please look at how other libraries integrated AVX512.
> Thanks.
>
> --
> David Marchand
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v3 0/2] introduce NitroSketch Mode into membership library
2022-08-10 7:45 [PATCH 0/2] introduce NitroSketch Mode into membership library Leyi Rong
` (2 preceding siblings ...)
2022-08-31 6:46 ` [PATCH v2 0/2] introduce NitroSketch Mode into membership library Leyi Rong
@ 2022-09-15 2:14 ` Leyi Rong
2022-09-15 2:14 ` [PATCH v3 1/2] member: implement NitroSketch mode Leyi Rong
2022-09-15 2:14 ` [PATCH v3 2/2] test/member: add functional and perf tests for sketch Leyi Rong
2022-09-15 13:03 ` [PATCH v4 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-09-16 3:03 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Leyi Rong
5 siblings, 2 replies; 34+ messages in thread
From: Leyi Rong @ 2022-09-15 2:14 UTC (permalink / raw)
To: ferruh.yigit, david.marchand, suanmingm, yipeng1.wang,
zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
This patchset introduce a brand new NitroSketch Mode into membership
library. This algorithm provides high-fidelity approximate measurements
and appears as a promissing alternative to triditional approches such as
packet sampling.
---
v3:
- mark below new added APIs as "experimental":
rte_member_add_byte_count()
rte_member_query_count()
rte_member_report_heavyhitter()
- meson build file rework for potential runtime issue.
v2:
- attach paper link to commit log.
- fix potential memory leaks in test_member.c.
- build error fix according to CI build fail log.
Leyi Rong (2):
member: implement NitroSketch mode
test/member: add functional and perf tests for sketch
app/test/test_member.c | 272 ++++++++++++
app/test/test_member_perf.c | 153 ++++++-
lib/member/meson.build | 42 +-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 154 ++++++-
lib/member/rte_member_heap.h | 424 ++++++++++++++++++
lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 97 +++++
lib/member/rte_member_sketch_avx512.c | 69 +++
lib/member/rte_member_sketch_avx512.h | 36 ++
lib/member/rte_xxh64_avx512.h | 117 +++++
lib/member/version.map | 9 +
12 files changed, 2034 insertions(+), 8 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v3 1/2] member: implement NitroSketch mode
2022-09-15 2:14 ` [PATCH v3 0/2] introduce NitroSketch Mode into membership library Leyi Rong
@ 2022-09-15 2:14 ` Leyi Rong
2022-09-15 7:47 ` David Marchand
2022-09-15 2:14 ` [PATCH v3 2/2] test/member: add functional and perf tests for sketch Leyi Rong
1 sibling, 1 reply; 34+ messages in thread
From: Leyi Rong @ 2022-09-15 2:14 UTC (permalink / raw)
To: ferruh.yigit, david.marchand, suanmingm, yipeng1.wang,
zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
Sketching algorithm provide high-fidelity approximate measurements and
appears as a promising alternative to traditional approaches such as
packet sampling.
NitroSketch [1] is a software sketching framework that optimizes
performance, provides accuracy guarantees, and supports a variety of
sketches.
This commit adds a new data structure called sketch into
membership library. This new data structure is an efficient
way to profile the traffic for heavy hitters. Also use min-heap
structure to maintain the top-k flow keys.
[1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
lib/member/meson.build | 42 +-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 154 ++++++-
lib/member/rte_member_heap.h | 424 ++++++++++++++++++
lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 97 +++++
lib/member/rte_member_sketch_avx512.c | 69 +++
lib/member/rte_member_sketch_avx512.h | 36 ++
lib/member/rte_xxh64_avx512.h | 117 +++++
lib/member/version.map | 9 +
10 files changed, 1613 insertions(+), 4 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
diff --git a/lib/member/meson.build b/lib/member/meson.build
index e06fddc240..8de0c09a6a 100644
--- a/lib/member/meson.build
+++ b/lib/member/meson.build
@@ -7,6 +7,46 @@ if is_windows
subdir_done()
endif
-sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c')
+sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c', 'rte_member_sketch.c')
headers = files('rte_member.h')
deps += ['hash']
+includes += include_directories('../hash', '../ring')
+
+# compile AVX512 version if:
+# we are building 64-bit binary AND binutils can generate proper code
+if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
+ # compile AVX512 version if either:
+ # a. we have AVX512 supported in minimum instruction set
+ # baseline
+ # b. it's not minimum instruction set, but supported by
+ # compiler
+ #
+ # in former case, just add avx512 C file to files list
+ # in latter case, compile c file to static lib, using correct
+ # compiler flags, and then have the .o file from static lib
+ # linked into main lib.
+
+ #check if all required flags already enabled
+ sketch_avx512_flags = ['__AVX512F__', '__AVX512DQ__', '__AVX512IFMA__']
+
+ sketch_avx512_on = true
+ foreach f:sketch_avx512_flags
+ if cc.get_define(f, args: machine_args) == ''
+ sketch_avx512_on = false
+ endif
+ endforeach
+
+ if sketch_avx512_on == true
+ cflags += ['-DCC_AVX512_SUPPORT']
+ sources += files('rte_member_sketch_avx512.c')
+ elif cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
+ cflags += ['-DCC_AVX512_SUPPORT']
+ cflags += ['-mavx512f', '-mavx512dq', '-mavx512ifma']
+ sketch_avx512_tmp = static_library('sketch_avx512_tmp',
+ 'rte_member_sketch_avx512.c',
+ include_directories: includes,
+ dependencies: static_rte_eal,
+ c_args: cflags)
+ objs += sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
+ endif
+endif
diff --git a/lib/member/rte_member.c b/lib/member/rte_member.c
index 7e1632e6b5..8f859f7fbd 100644
--- a/lib/member/rte_member.c
+++ b/lib/member/rte_member.c
@@ -9,10 +9,12 @@
#include <rte_malloc.h>
#include <rte_errno.h>
#include <rte_tailq.h>
+#include <rte_ring_elem.h>
#include "rte_member.h"
#include "rte_member_ht.h"
#include "rte_member_vbf.h"
+#include "rte_member_sketch.h"
TAILQ_HEAD(rte_member_list, rte_tailq_entry);
static struct rte_tailq_elem rte_member_tailq = {
@@ -72,6 +74,9 @@ rte_member_free(struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_free_vbf(setsum);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_free_sketch(setsum);
+ break;
default:
break;
}
@@ -86,6 +91,8 @@ rte_member_create(const struct rte_member_parameters *params)
struct rte_member_list *member_list;
struct rte_member_setsum *setsum;
int ret;
+ char ring_name[RTE_RING_NAMESIZE];
+ struct rte_ring *sketch_key_ring = NULL;
if (params == NULL) {
rte_errno = EINVAL;
@@ -100,6 +107,16 @@ rte_member_create(const struct rte_member_parameters *params)
return NULL;
}
+ if (params->type == RTE_MEMBER_TYPE_SKETCH) {
+ snprintf(ring_name, sizeof(ring_name), "SK_%s", params->name);
+ sketch_key_ring = rte_ring_create_elem(ring_name, sizeof(uint32_t),
+ rte_align32pow2(params->top_k), params->socket_id, 0);
+ if (sketch_key_ring == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Ring Memory allocation failed\n");
+ return NULL;
+ }
+ }
+
member_list = RTE_TAILQ_CAST(rte_member_tailq.head, rte_member_list);
rte_mcfg_tailq_write_lock();
@@ -145,6 +162,9 @@ rte_member_create(const struct rte_member_parameters *params)
case RTE_MEMBER_TYPE_VBF:
ret = rte_member_create_vbf(setsum, params);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ ret = rte_member_create_sketch(setsum, params, sketch_key_ring);
+ break;
default:
goto error_unlock_exit;
}
@@ -162,6 +182,7 @@ rte_member_create(const struct rte_member_parameters *params)
error_unlock_exit:
rte_free(te);
rte_free(setsum);
+ rte_ring_free(sketch_key_ring);
rte_mcfg_tailq_write_unlock();
return NULL;
}
@@ -178,6 +199,23 @@ rte_member_add(const struct rte_member_setsum *setsum, const void *key,
return rte_member_add_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_add_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch(setsum, key, set_id);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count)
+{
+ if (setsum == NULL || key == NULL || byte_count == 0)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch_byte_count(setsum, key, byte_count);
default:
return -EINVAL;
}
@@ -195,6 +233,8 @@ rte_member_lookup(const struct rte_member_setsum *setsum, const void *key,
return rte_member_lookup_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_lookup_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_lookup_sketch(setsum, key, set_id);
default:
return -EINVAL;
}
@@ -261,6 +301,36 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
}
}
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output)
+{
+ if (setsum == NULL || key == NULL || output == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_query_sketch(setsum, key, output);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count)
+{
+ if (setsum == NULL || key == NULL || count == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_report_heavyhitter_sketch(setsum, key, count);
+ default:
+ return -EINVAL;
+ }
+}
+
int
rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id)
@@ -272,6 +342,8 @@ rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
case RTE_MEMBER_TYPE_HT:
return rte_member_delete_ht(setsum, key, set_id);
/* current vBF implementation does not support delete function */
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_delete_sketch(setsum, key);
case RTE_MEMBER_TYPE_VBF:
default:
return -EINVAL;
@@ -290,6 +362,9 @@ rte_member_reset(const struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_reset_vbf(setsum);
return;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_reset_sketch(setsum);
+ return;
default:
return;
}
diff --git a/lib/member/rte_member.h b/lib/member/rte_member.h
index 2611015771..178efeed33 100644
--- a/lib/member/rte_member.h
+++ b/lib/member/rte_member.h
@@ -39,6 +39,18 @@
* | | | not overwrite | |
* | | | existing key. | |
* +----------+---------------------+----------------+-------------------------+
+ * +==========+=============================+
+ * | type | sketch |
+ * +==========+=============================+
+ * |structure | counting bloom filter array |
+ * +----------+-----------------------------+
+ * |set id | 1: heavy set, 0: light set |
+ * | | |
+ * +----------+-----------------------------+
+ * |usages & | count size of a flow, |
+ * |properties| used for heavy hitter |
+ * | | detection. |
+ * +----------+-----------------------------+
* -->
*/
@@ -50,6 +62,8 @@ extern "C" {
#endif
#include <stdint.h>
+#include <stdbool.h>
+#include <inttypes.h>
#include <rte_common.h>
@@ -65,6 +79,20 @@ typedef uint16_t member_set_t;
#define RTE_MEMBER_BUCKET_ENTRIES 16
/** Maximum number of characters in setsum name. */
#define RTE_MEMBER_NAMESIZE 32
+/** Max value of the random number */
+#define RTE_RAND_MAX ~0LLU
+/**
+ * As packets skipped in the sampling-based algorithm, the accounting
+ * results accuracy is not guaranteed in the start stage. There should
+ * be a "convergence time" to achieve the accuracy after receiving enough
+ * packets.
+ * For sketch, use the flag if prefer always bounded mode, which only
+ * starts sampling after receiving enough packets to keep the results
+ * accuracy always bounded.
+ */
+#define RTE_MEMBER_SKETCH_ALWAYS_BOUNDED 0x01
+/** For sketch, use the flag if to count packet size instead of packet count */
+#define RTE_MEMBER_SKETCH_COUNT_BYTE 0x02
/** @internal Hash function used by membership library. */
#if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32)
@@ -104,6 +132,7 @@ struct rte_member_parameters;
enum rte_member_setsum_type {
RTE_MEMBER_TYPE_HT = 0, /**< Hash table based set summary. */
RTE_MEMBER_TYPE_VBF, /**< Vector of bloom filters. */
+ RTE_MEMBER_TYPE_SKETCH,
RTE_MEMBER_NUM_TYPE
};
@@ -114,6 +143,19 @@ enum rte_member_sig_compare_function {
RTE_MEMBER_COMPARE_NUM
};
+/* sketch update function with different implementations. */
+typedef void (*sketch_update_fn_t)(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+/* sketch lookup function with different implementations. */
+typedef uint64_t (*sketch_lookup_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
+/* sketch delete function with different implementations. */
+typedef void (*sketch_delete_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
/** @internal setsummary structure. */
struct rte_member_setsum {
enum rte_member_setsum_type type; /* Type of the set summary. */
@@ -134,6 +176,21 @@ struct rte_member_setsum {
uint32_t bit_mask; /* Bit mask to get bit location in bf. */
uint32_t num_hashes; /* Number of hash values to index bf. */
+ /* Parameters for sketch */
+ float error_rate;
+ float sample_rate;
+ uint32_t num_col;
+ uint32_t num_row;
+ int always_bounded;
+ double converge_thresh;
+ uint32_t topk;
+ uint32_t count_byte;
+ uint64_t *hash_seeds;
+ sketch_update_fn_t sketch_update; /* Pointer to the sketch update function */
+ sketch_lookup_fn_t sketch_lookup; /* Pointer to the sketch lookup function */
+ sketch_delete_fn_t sketch_delete; /* Pointer to the sketch delete function */
+
+ void *runtime_var;
uint32_t mul_shift; /* vbf internal variable used during bit test. */
uint32_t div_shift; /* vbf internal variable used during bit test. */
@@ -143,6 +200,9 @@ struct rte_member_setsum {
/* Second cache line should start here. */
uint32_t socket_id; /* NUMA Socket ID for memory. */
char name[RTE_MEMBER_NAMESIZE]; /* Name of this set summary. */
+#ifdef RTE_ARCH_X86
+ bool use_avx512;
+#endif
} __rte_cache_aligned;
/**
@@ -261,8 +321,33 @@ struct rte_member_parameters {
*/
uint32_t sec_hash_seed;
+ /**
+ * For count(min) sketch data structure, error rate defines the accuracy
+ * required by the user. Higher accuracy leads to more memory usage, but
+ * the flow size is estimated more accurately.
+ */
+ float error_rate;
+
+ /**
+ * Sampling rate means the internal sample rate of the rows of the count
+ * min sketches. Lower sampling rate can reduce CPU overhead, but the
+ * data structure will require more time to converge statistically.
+ */
+ float sample_rate;
+
+ /**
+ * How many top heavy hitter to be reported. The library will internally
+ * keep the keys of heavy hitters for final report.
+ */
+ uint32_t top_k;
+
+ /**
+ * Extra flags that may passed in by user
+ */
+ uint32_t extra_flag;
+
int socket_id; /**< NUMA Socket ID for memory. */
-};
+} __rte_cache_aligned;
/**
* @warning
@@ -418,7 +503,7 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* RTE_MEMBER_NO_MATCH by default is set as 0.
* For HT mode, the set_id has range as [1, 0x7FFF], MSB is reserved.
* For vBF mode the set id is limited by the num_set parameter when create
- * the set-summary.
+ * the set-summary. For sketch mode, this id is ignored.
* @return
* HT (cache mode) and vBF should never fail unless the set_id is not in the
* valid range. In such case -EINVAL is returned.
@@ -429,12 +514,75 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* Return 0 for HT (cache mode) if the add does not cause
* eviction, return 1 otherwise. Return 0 for non-cache mode if success,
* -ENOSPC for full, and 1 if cuckoo eviction happens.
- * Always returns 0 for vBF mode.
+ * Always returns 0 for vBF mode and sketch.
*/
int
rte_member_add(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Add the packet byte size into the sketch.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param byte_count
+ * Add the byte count of the packet into the sketch.
+ * @return
+ * Return -EINVAL for invalid parameters, otherwise return 0.
+ */
+__rte_experimental
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Query packet count for a certain flow-key.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param count
+ * The output packet count or byte count.
+ * @return
+ * Return -EINVAL for invalid parameters.
+ */
+__rte_experimental
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *count);
+
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Report heavyhitter flow-keys into set-summary (SS).
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param keys
+ * Pointer of the output top-k key array.
+ * @param counts
+ * Pointer of the output packet count or byte count array of the top-k keys.
+ * @return
+ * Return -EINVAL for invalid parameters. Return a positive integer indicate
+ * how many heavy hitters are reported.
+ */
+__rte_experimental
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **keys, uint64_t *counts);
+
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice
diff --git a/lib/member/rte_member_heap.h b/lib/member/rte_member_heap.h
new file mode 100644
index 0000000000..3ced34160a
--- /dev/null
+++ b/lib/member/rte_member_heap.h
@@ -0,0 +1,424 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#ifndef _RTE_MEMBER_HEAP_H_
+#define _RTE_MEMBER_HEAP_H_
+
+#include <rte_ring_elem.h>
+#include "rte_member.h"
+
+#define LCHILD(x) (2 * x + 1)
+#define RCHILD(x) (2 * x + 2)
+#define PARENT(x) ((x - 1) / 2)
+
+#define HASH_BKT_SIZE 16
+#define HASH_HP_MULTI 4
+#define HASH_RESIZE_MULTI 2
+
+struct hash_bkt {
+ uint16_t sig[HASH_BKT_SIZE];
+ uint16_t idx[HASH_BKT_SIZE];
+};
+
+struct hash {
+ uint16_t bkt_cnt;
+ uint16_t num_item;
+ uint32_t seed;
+ struct hash_bkt buckets[0];
+};
+
+struct node {
+ void *key;
+ uint64_t count;
+};
+
+struct minheap {
+ uint32_t key_len;
+ uint32_t size;
+ uint32_t socket;
+ struct hash *hashtable;
+ struct node *elem;
+};
+
+static int
+hash_table_insert(const void *key, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].idx[i] == 0) {
+ table->buckets[idx].idx[i] = value;
+ table->buckets[idx].sig[i] = sig;
+ table->num_item++;
+ return 0;
+ }
+ }
+
+ return -ENOMEM;
+}
+
+static int
+hash_table_update(const void *key, int old_value, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == old_value) {
+ table->buckets[idx].idx[i] = value;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_del(const void *key, uint16_t value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == value) {
+ table->buckets[idx].idx[i] = 0;
+ table->num_item--;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_lookup(const void *key, int key_len, struct minheap *hp)
+{
+ struct hash *table = hp->hashtable;
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] != 0) {
+ uint32_t hp_idx = table->buckets[idx].idx[i] - 1;
+
+ if (memcmp(hp->elem[hp_idx].key, key, hp->key_len) == 0)
+ return hp_idx;
+ }
+ }
+
+ return -ENOENT; /* key doesn't exist */
+}
+
+static int
+resize_hash_table(struct minheap *hp)
+{
+ uint32_t i;
+ uint32_t new_bkt_cnt;
+
+ while (1) {
+ new_bkt_cnt = hp->hashtable->bkt_cnt * HASH_RESIZE_MULTI;
+
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT load factor is [%f]\n",
+ hp->hashtable->num_item / ((float)hp->hashtable->bkt_cnt * HASH_BKT_SIZE));
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT resize happen!\n");
+ rte_free(hp->hashtable);
+ hp->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ new_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, hp->socket);
+
+ if (hp->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ return -ENOMEM;
+ }
+
+ hp->hashtable->bkt_cnt = new_bkt_cnt;
+
+ for (i = 0; i < hp->size; ++i) {
+ if (hash_table_insert(hp->elem[i].key,
+ i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR,
+ "Sketch Minheap HT resize insert fail!\n");
+ break;
+ }
+ }
+ if (i == hp->size)
+ break;
+ }
+
+ return 0;
+}
+
+/* find the item in the given minheap */
+static int
+rte_member_minheap_find(struct minheap *hp, const void *key)
+{
+ int idx = hash_table_lookup(key, hp->key_len, hp);
+ return idx;
+}
+
+static int
+rte_member_minheap_init(struct minheap *heap, int size,
+ uint32_t socket, uint32_t seed)
+{
+ heap->elem = rte_zmalloc_socket(NULL, sizeof(struct node) * size,
+ RTE_CACHE_LINE_SIZE, socket);
+ if (heap->elem == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap elem allocation failed\n");
+ return -ENOMEM;
+ }
+
+ uint32_t hash_bkt_cnt = rte_align32pow2(size * HASH_HP_MULTI) / HASH_BKT_SIZE;
+
+ if (hash_bkt_cnt == 0)
+ hash_bkt_cnt = 1;
+
+ heap->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ hash_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, socket);
+
+ if (heap->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ rte_free(heap->elem);
+ return -ENOMEM;
+ }
+
+ heap->hashtable->seed = seed;
+ heap->hashtable->bkt_cnt = hash_bkt_cnt;
+ heap->socket = socket;
+
+ return 0;
+}
+
+/* swap the minheap nodes */
+static __rte_always_inline void
+rte_member_heap_swap(struct node *n1, struct node *n2)
+{
+ struct node temp = *n1;
+ *n1 = *n2;
+ *n2 = temp;
+}
+
+/* heapify function */
+static void
+rte_member_heapify(struct minheap *hp, uint32_t idx, bool update_hash)
+{
+ uint32_t smallest;
+
+ if (LCHILD(idx) < hp->size &&
+ hp->elem[LCHILD(idx)].count < hp->elem[idx].count)
+ smallest = LCHILD(idx);
+ else
+ smallest = idx;
+
+ if (RCHILD(idx) < hp->size &&
+ hp->elem[RCHILD(idx)].count < hp->elem[smallest].count)
+ smallest = RCHILD(idx);
+
+ if (smallest != idx) {
+ rte_member_heap_swap(&(hp->elem[idx]), &(hp->elem[smallest]));
+
+ if (update_hash) {
+ if (hash_table_update(hp->elem[smallest].key, idx + 1, smallest + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+
+ if (hash_table_update(hp->elem[idx].key, smallest + 1, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+ }
+ rte_member_heapify(hp, smallest, update_hash);
+ }
+}
+
+/* insert a node into the minheap */
+static int
+rte_member_minheap_insert_node(struct minheap *hp, const void *key,
+ int counter, void *key_slot,
+ struct rte_ring *free_key_slot)
+{
+ struct node nd;
+ uint32_t slot_id;
+
+ if (rte_ring_sc_dequeue_elem(free_key_slot, &slot_id, sizeof(uint32_t)) != 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap get empty keyslot failed\n");
+ return -1;
+ }
+
+ nd.count = counter;
+ nd.key = RTE_PTR_ADD(key_slot, slot_id * hp->key_len);
+
+ memcpy(nd.key, key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* delete a key from the minheap */
+static int
+rte_member_minheap_delete_node(struct minheap *hp, const void *key,
+ void *key_slot, struct rte_ring *free_key_slot)
+{
+ int idx = rte_member_minheap_find(hp, key);
+ uint32_t offset = RTE_PTR_DIFF(hp->elem[idx].key, key_slot) / hp->key_len;
+
+ if (hash_table_del(key, idx + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ rte_ring_sp_enqueue_elem(free_key_slot, &offset, sizeof(uint32_t));
+
+ if (idx == (int)(hp->size - 1)) {
+ hp->size--;
+ return 0;
+ }
+
+ hp->elem[idx] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[idx].key, hp->size, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+ rte_member_heapify(hp, idx, true);
+
+ return 0;
+}
+
+/* replace a min node with a new key. */
+static int
+rte_member_minheap_replace_node(struct minheap *hp,
+ const void *new_key,
+ int new_counter)
+{
+ struct node nd;
+ void *recycle_key = NULL;
+
+ recycle_key = hp->elem[0].key;
+
+ if (hash_table_del(recycle_key, 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ hp->elem[0] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[0].key, hp->size, 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+
+ rte_member_heapify(hp, 0, true);
+
+ nd.count = new_counter;
+ nd.key = recycle_key;
+
+ memcpy(nd.key, new_key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(new_key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace insert failed\n");
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* sort the heap into a descending array */
+static void
+rte_member_heapsort(struct minheap *hp, struct node *result_array)
+{
+ struct minheap new_hp;
+
+ /* build a new heap for using the given array */
+ new_hp.size = hp->size;
+ new_hp.key_len = hp->key_len;
+ new_hp.elem = result_array;
+ memcpy(result_array, hp->elem, hp->size * sizeof(struct node));
+
+ /* sort the new heap */
+ while (new_hp.size > 1) {
+ rte_member_heap_swap(&(new_hp.elem[0]), &(new_hp.elem[new_hp.size - 1]));
+ new_hp.size--;
+ rte_member_heapify(&new_hp, 0, false);
+ }
+}
+
+static void
+rte_member_minheap_free(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ rte_free(hp->elem);
+ rte_free(hp->hashtable);
+}
+
+static void
+rte_member_minheap_reset(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ memset(hp->elem, 0, sizeof(struct node) * hp->size);
+ hp->size = 0;
+
+ memset((char *)hp->hashtable + sizeof(struct hash), 0,
+ hp->hashtable->bkt_cnt * sizeof(struct hash_bkt));
+ hp->hashtable->num_item = 0;
+}
+
+#endif /* _RTE_MEMBER_HEAP_H_ */
diff --git a/lib/member/rte_member_sketch.c b/lib/member/rte_member_sketch.c
new file mode 100644
index 0000000000..524ba77620
--- /dev/null
+++ b/lib/member/rte_member_sketch.c
@@ -0,0 +1,594 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#include <math.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_errno.h>
+#include <rte_log.h>
+#include <rte_random.h>
+#include <rte_prefetch.h>
+#include <rte_ring_elem.h>
+
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+#include "rte_member_heap.h"
+
+#ifdef CC_AVX512_SUPPORT
+#include "rte_member_sketch_avx512.h"
+#endif /* CC_AVX512_SUPPORT */
+
+struct sketch_runtime {
+ uint64_t pkt_cnt;
+ uint32_t until_next;
+ int converged;
+ struct minheap heap;
+ struct node *report_array;
+ void *key_slots;
+ struct rte_ring *free_key_slots;
+} __rte_cache_aligned;
+
+/*
+ * Geometric sampling to calculate how many packets needs to be
+ * skipped until next update. This method can mitigate the CPU
+ * overheads compared with coin-toss sampling.
+ */
+static uint32_t
+draw_geometric(const struct rte_member_setsum *ss)
+{
+ double rand = 1;
+
+ if (ss->sample_rate == 1)
+ return 1;
+
+ while (rand == 1 || rand == 0)
+ rand = (double) rte_rand() / (double)(RTE_RAND_MAX);
+
+ return (uint32_t)ceil(log(1 - rand) / log(1 - ss->sample_rate));
+}
+
+static void
+isort(uint64_t *array, int n)
+{
+ int i;
+
+ for (i = 1; i < n; i++) {
+ uint64_t t = array[i];
+ int j;
+
+ for (j = i - 1; j >= 0; j--) {
+ if (t < array[j])
+ array[j + 1] = array[j];
+ else
+ break;
+ }
+ array[j + 1] = t;
+ }
+}
+
+static __rte_always_inline void
+swap(uint64_t *a, uint64_t *b)
+{
+ uint64_t tmp = *a;
+ *a = *b;
+ *b = tmp;
+}
+
+static uint64_t
+medianof5(uint64_t a, uint64_t b, uint64_t c, uint64_t d, uint64_t e)
+{
+ if (a > b)
+ swap(&a, &b);
+ if (c > d)
+ swap(&c, &d);
+ if (a > c) {
+ if (d > e)
+ swap(&c, &e);
+ else {
+ swap(&c, &d);
+ swap(&d, &e);
+ }
+ } else {
+ if (b > e)
+ swap(&a, &e);
+ else {
+ swap(&a, &b);
+ swap(&b, &e);
+ }
+ }
+
+ if (a > c)
+ return a > d ? d : a;
+ else
+ return b > c ? c : b;
+}
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *ring)
+{
+ struct sketch_runtime *runtime;
+ uint32_t num_col;
+ uint32_t i;
+
+ if (params->sample_rate == 0 || params->sample_rate > 1) {
+ rte_errno = EINVAL;
+ RTE_MEMBER_LOG(ERR,
+ "Membership Sketch created with invalid parameters\n");
+ return -EINVAL;
+ }
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_COUNT_BYTE)
+ ss->count_byte = 1;
+
+#ifdef RTE_ARCH_X86
+ if (ss->count_byte == 1 &&
+ rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512IFMA) == 1) {
+#ifdef CC_AVX512_SUPPORT
+ ss->use_avx512 = true;
+#else
+ ss->use_avx512 = false;
+#endif
+ }
+
+ if (ss->use_avx512 == true) {
+#ifdef CC_AVX512_SUPPORT
+ ss->num_row = NUM_ROW_VEC;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch AVX512 update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_avx512;
+ ss->sketch_lookup = sketch_lookup_avx512;
+ ss->sketch_delete = sketch_delete_avx512;
+#endif
+ } else
+#endif
+ {
+ ss->num_row = NUM_ROW_SCALAR;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch SCALAR update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_scalar;
+ ss->sketch_lookup = sketch_lookup_scalar;
+ ss->sketch_delete = sketch_delete_scalar;
+ }
+
+ ss->socket_id = params->socket_id;
+
+ if (ss->count_byte == 0)
+ num_col = 4.0 / params->error_rate / params->sample_rate;
+#ifdef RTE_ARCH_X86
+ else if (ss->use_avx512 == true)
+ num_col = rte_align32pow2(4.0 / params->error_rate);
+#endif
+ else
+ num_col = 4.0 / params->error_rate;
+
+ ss->table = rte_zmalloc_socket(NULL,
+ sizeof(uint64_t) * num_col * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->table == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Table memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->hash_seeds = rte_zmalloc_socket(NULL, sizeof(uint64_t) * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->hash_seeds == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Hashseeds memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->runtime_var = rte_zmalloc_socket(NULL, sizeof(struct sketch_runtime),
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->runtime_var == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime memory allocation failed\n");
+ rte_free(ss);
+ return -ENOMEM;
+ }
+ runtime = ss->runtime_var;
+
+ ss->num_col = num_col;
+ ss->sample_rate = params->sample_rate;
+ ss->prim_hash_seed = params->prim_hash_seed;
+ ss->sec_hash_seed = params->sec_hash_seed;
+ ss->error_rate = params->error_rate;
+ ss->topk = params->top_k;
+ ss->key_len = params->key_len;
+ runtime->heap.key_len = ss->key_len;
+
+ runtime->key_slots = rte_zmalloc_socket(NULL, ss->key_len * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->key_slots == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Key Slots allocation failed\n");
+ goto error;
+ }
+
+ runtime->free_key_slots = ring;
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime->free_key_slots,
+ &i, sizeof(uint32_t));
+
+ if (rte_member_minheap_init(&(runtime->heap), params->top_k,
+ ss->socket_id, params->prim_hash_seed) < 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap allocation failed\n");
+ goto error_runtime;
+ }
+
+ runtime->report_array = rte_zmalloc_socket(NULL, sizeof(struct node) * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->report_array == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime Report Array allocation failed\n");
+ goto error_runtime;
+ }
+
+ rte_srand(ss->prim_hash_seed);
+ for (i = 0; i < ss->num_row; i++)
+ ss->hash_seeds[i] = rte_rand();
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_ALWAYS_BOUNDED)
+ ss->always_bounded = 1;
+
+ if (ss->always_bounded) {
+ double delta = 1.0 / (pow(2, ss->num_row));
+
+ ss->converge_thresh = 10 * pow(ss->error_rate, -2.0) * sqrt(log(1 / delta));
+ }
+
+ RTE_MEMBER_LOG(DEBUG, "Sketch created, "
+ "the total memory required is %u Bytes\n", ss->num_col * ss->num_row * 8);
+
+ return 0;
+
+error_runtime:
+ rte_member_minheap_free(&runtime->heap);
+ rte_ring_free(runtime->free_key_slots);
+ rte_free(runtime->key_slots);
+error:
+ rte_free(runtime);
+ rte_free(ss);
+
+ return -ENOMEM;
+}
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col[ss->num_row];
+ uint64_t count_row[ss->num_row];
+ uint32_t cur_row;
+ uint64_t count;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ rte_prefetch0(&count_array[cur_row * ss->num_col + col[cur_row]]);
+ }
+
+ /* if sample rate is 1, it is a regular count-min, we report the min */
+ if (ss->sample_rate == 1 || ss->count_byte == 1)
+ return count_min(ss, col);
+
+ memset(count_row, 0, sizeof(uint64_t) * ss->num_row);
+
+ /* otherwise we report the median number */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_row[cur_row] = count_array[cur_row * ss->num_col + col[cur_row]];
+
+ if (ss->num_row == 5)
+ return medianof5(count_row[0], count_row[1],
+ count_row[2], count_row[3], count_row[4]);
+
+ isort(count_row, ss->num_row);
+
+ if (ss->num_row % 2 == 0) {
+ count = (count_row[ss->num_row / 2] + count_row[ss->num_row / 2 - 1]) / 2;
+ return count;
+ }
+ /* ss->num_row % 2 != 0 */
+ count = count_row[ss->num_row / 2];
+
+ return count;
+}
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* set corresponding counter to 0 */
+ count_array[cur_row * ss->num_col + col[cur_row]] = 0;
+ }
+}
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ uint64_t *output)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ *output = count;
+
+ return 0;
+}
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ uint64_t count = ss->sketch_lookup(ss, runtime_var->heap.elem[i].key);
+
+ runtime_var->heap.elem[i].count = count;
+ }
+}
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key,
+ uint64_t *count)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = setsum->runtime_var;
+
+ rte_member_update_heap(setsum);
+ rte_member_heapsort(&(runtime_var->heap), runtime_var->report_array);
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ key[i] = runtime_var->report_array[i].key;
+ count[i] = runtime_var->report_array[i].count;
+ }
+
+ return runtime_var->heap.size;
+}
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *ss,
+ const void *key, member_set_t *set_id)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ if (runtime_var->heap.size > 0 && count >= runtime_var->heap.elem[0].count)
+ *set_id = 1;
+ else
+ *set_id = 0;
+
+ if (count == 0)
+ return 0;
+ else
+ return 1;
+}
+
+static void
+should_converge(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ /* For count min sketch - L1 norm */
+ if (runtime_var->pkt_cnt > ss->converge_thresh) {
+ runtime_var->converged = 1;
+ RTE_MEMBER_LOG(DEBUG, "Sketch converged, begin sampling "
+ "from key count %"PRIu64"\n",
+ runtime_var->pkt_cnt);
+ }
+}
+
+static void
+sketch_update_row(const struct rte_member_setsum *ss, const void *key,
+ uint32_t count, uint32_t cur_row)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* sketch counter update */
+ count_array[cur_row * ss->num_col + col] +=
+ ceil(count / (ss->sample_rate));
+}
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+ count_array[cur_row * ss->num_col + col] += count;
+ }
+}
+
+static void
+heap_update(const struct rte_member_setsum *ss, const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t key_cnt = 0;
+ int found;
+
+ /* We also update the heap for this key */
+ key_cnt = ss->sketch_lookup(ss, key);
+ if (key_cnt > runtime_var->heap.elem[0].count) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ /* the key is found in the top-k heap */
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else if (runtime_var->heap.size < ss->topk) {
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ } else {
+ rte_member_minheap_replace_node(&runtime_var->heap, key, key_cnt);
+ }
+ } else if (runtime_var->heap.size < ss->topk) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ }
+}
+
+/*
+ * Add a single packet into the sketch.
+ * Sketch value is meatured by packet numbers in this mode.
+ */
+int
+rte_member_add_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ __rte_unused member_set_t set_id)
+{
+ uint32_t cur_row;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /*
+ * If sketch is measured by byte count,
+ * the rte_member_add_sketch_byte_count routine should be used.
+ */
+ if (ss->count_byte == 1) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Byte Mode, "
+ "should use rte_member_add_byte_count()!\n");
+ return -EINVAL;
+ }
+
+ if (ss->sample_rate == 1) {
+ ss->sketch_update(ss, key, 1);
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* convergence stage if it's needed */
+ if (ss->always_bounded && !runtime_var->converged) {
+ ss->sketch_update(ss, key, 1);
+
+ if (!((++runtime_var->pkt_cnt) & (INTERVAL - 1)))
+ should_converge(ss);
+
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* should we skip this packet */
+ if (*until_next >= ss->num_row) {
+ *until_next -= ss->num_row;
+ return 0;
+ }
+ cur_row = *until_next;
+ do {
+ sketch_update_row(ss, key, 1, cur_row);
+ *until_next = draw_geometric(ss);
+ if (cur_row + *until_next >= ss->num_row)
+ break;
+ cur_row += *until_next;
+ } while (1);
+
+ *until_next -= (ss->num_row - cur_row);
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+/*
+ * Add the byte count of the packet into the sketch.
+ * Sketch value is meatured by byte count numbers in this mode.
+ */
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t byte_count)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /* should not call this API if not in count byte mode */
+ if (ss->count_byte == 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Pkt Mode, "
+ "should use rte_member_add()!\n");
+ return -EINVAL;
+ }
+
+ /* there's specific optimization for the sketch update */
+ ss->sketch_update(ss, key, byte_count);
+
+ if (*until_next != 0) {
+ *until_next = *until_next - 1;
+ return 0;
+ }
+
+ *until_next = draw_geometric(ss) - 1;
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *ss,
+ const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ int found;
+
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found < 0)
+ return -1;
+
+ ss->sketch_delete(ss, key);
+
+ return rte_member_minheap_delete_node
+ (&runtime_var->heap, key, runtime_var->key_slots, runtime_var->free_key_slots);
+}
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ rte_free(ss->table);
+ rte_member_minheap_free(&runtime_var->heap);
+ rte_free(runtime_var->key_slots);
+ rte_ring_free(runtime_var->free_key_slots);
+ rte_free(runtime_var);
+}
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t *sketch = ss->table;
+ uint32_t i;
+
+ memset(sketch, 0, sizeof(uint64_t) * ss->num_col * ss->num_row);
+ rte_member_minheap_reset(&runtime_var->heap);
+ rte_ring_reset(runtime_var->free_key_slots);
+
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime_var->free_key_slots, &i, sizeof(uint32_t));
+}
diff --git a/lib/member/rte_member_sketch.h b/lib/member/rte_member_sketch.h
new file mode 100644
index 0000000000..219323008b
--- /dev/null
+++ b/lib/member/rte_member_sketch.h
@@ -0,0 +1,97 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_H_
+#define _RTE_MEMBER_SKETCH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_vect.h>
+#include <rte_ring_elem.h>
+
+#define NUM_ROW_SCALAR 5
+#define INTERVAL (1 << 15)
+
+#if !RTE_IS_POWER_OF_2(INTERVAL)
+#error sketch INTERVAL macro must be a power of 2
+#endif
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *r);
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *setsum,
+ const void *key, member_set_t *set_id);
+
+int
+rte_member_add_sketch(const struct rte_member_setsum *setsum,
+ const void *key,
+ member_set_t set_id);
+
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key, uint32_t byte_count);
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *setsum,
+ const void *key);
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output);
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss);
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *setsum);
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count);
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss);
+
+static __rte_always_inline uint64_t
+count_min(const struct rte_member_setsum *ss, const uint32_t *hash_results)
+{
+ uint64_t *count_array = ss->table;
+ uint64_t count;
+ uint32_t cur_row;
+ uint64_t min = UINT64_MAX;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ uint64_t cnt = count_array[cur_row * ss->num_col + hash_results[cur_row]];
+
+ if (cnt < min)
+ min = cnt;
+ }
+ count = min;
+
+ return count;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_H_ */
diff --git a/lib/member/rte_member_sketch_avx512.c b/lib/member/rte_member_sketch_avx512.c
new file mode 100644
index 0000000000..c83f4b6fd1
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.c
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include "rte_member_sketch_avx512.h"
+
+__rte_always_inline void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t num_col = ss->num_col;
+ uint32_t key_len = ss->key_len;
+ __m256i v_row_base;
+ __m256i v_hash_result;
+ __m512i current_sketch;
+ __m512i updated_sketch;
+ __m512i v_count;
+
+ const __m256i v_idx = _mm256_set_epi32(7, 6, 5, 4, 3, 2, 1, 0);
+ const __m256i v_col = _mm256_set1_epi32(num_col);
+
+ /* compute the hash result parallelly */
+ v_hash_result = rte_xxh64_sketch_avx512
+ (key, key_len, *(__m512i *)ss->hash_seeds, num_col);
+ v_row_base = _mm256_mullo_epi32(v_idx, v_col);
+ v_hash_result = _mm256_add_epi32(v_row_base, v_hash_result);
+
+ current_sketch =
+ _mm512_i32gather_epi64(v_hash_result, count_array, 8);
+ v_count = _mm512_set1_epi64(count);
+ updated_sketch = _mm512_add_epi64(current_sketch, v_count);
+ _mm512_i32scatter_epi64
+ ((void *)count_array, v_hash_result, updated_sketch, 8);
+}
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+
+ /* currently only for sketch byte count mode */
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds, ss->num_col);
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ return count_min(ss, col);
+}
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint64_t min = UINT64_MAX;
+ uint32_t cur_row;
+
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds,
+ RTE_ALIGN_FLOOR(ss->num_col, 32));
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ min = count_min(ss, col);
+
+ /* subtract the min value from all the counters */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_array[cur_row * ss->num_col + col[cur_row]] -= min;
+}
diff --git a/lib/member/rte_member_sketch_avx512.h b/lib/member/rte_member_sketch_avx512.h
new file mode 100644
index 0000000000..e7c25da643
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_AVX512_H_
+#define _RTE_MEMBER_SKETCH_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_vect.h>
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+#include "rte_xxh64_avx512.h"
+
+#define NUM_ROW_VEC 8
+
+void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_AVX512_H_ */
diff --git a/lib/member/rte_xxh64_avx512.h b/lib/member/rte_xxh64_avx512.h
new file mode 100644
index 0000000000..50ca1b52c7
--- /dev/null
+++ b/lib/member/rte_xxh64_avx512.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_XXH64_AVX512_H_
+#define _RTE_XXH64_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <immintrin.h>
+
+/* 0b1001111000110111011110011011000110000101111010111100101010000111 */
+static const uint64_t PRIME64_1 = 0x9E3779B185EBCA87ULL;
+/* 0b1100001010110010101011100011110100100111110101001110101101001111 */
+static const uint64_t PRIME64_2 = 0xC2B2AE3D27D4EB4FULL;
+/* 0b0001011001010110011001111011000110011110001101110111100111111001 */
+static const uint64_t PRIME64_3 = 0x165667B19E3779F9ULL;
+/* 0b1000010111101011110010100111011111000010101100101010111001100011 */
+static const uint64_t PRIME64_4 = 0x85EBCA77C2B2AE63ULL;
+/* 0b0010011111010100111010110010111100010110010101100110011111000101 */
+static const uint64_t PRIME64_5 = 0x27D4EB2F165667C5ULL;
+
+static __rte_always_inline __m512i
+xxh64_round_avx512(__m512i hash, __m512i input)
+{
+ hash = _mm512_madd52lo_epu64(hash,
+ input,
+ _mm512_set1_epi64(PRIME64_2));
+
+ hash = _mm512_rol_epi64(hash, 31);
+
+ return hash;
+}
+
+static __rte_always_inline __m512i
+xxh64_fmix_avx512(__m512i hash)
+{
+ hash = _mm512_xor_si512(hash, _mm512_srli_epi64(hash, 33));
+
+ return hash;
+}
+
+static __rte_always_inline __m256i
+rte_xxh64_sketch_avx512(const void *key, uint32_t key_len,
+ __m512i v_seed, uint32_t modulo)
+{
+ __m512i v_prime64_5, v_hash;
+ size_t remaining = key_len;
+ size_t offset = 0;
+ __m512i input;
+
+ v_prime64_5 = _mm512_set1_epi64(PRIME64_5);
+ v_hash = _mm512_add_epi64
+ (_mm512_add_epi64(v_seed, v_prime64_5),
+ _mm512_set1_epi64(key_len));
+
+ while (remaining >= 8) {
+ input = _mm512_set1_epi64(*(uint64_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ xxh64_round_avx512(_mm512_setzero_si512(), input));
+ v_hash = _mm512_madd52lo_epu64(_mm512_set1_epi64(PRIME64_4),
+ v_hash,
+ _mm512_set1_epi64(PRIME64_1));
+
+ remaining -= 8;
+ offset += 8;
+ }
+
+ if (remaining >= 4) {
+ input = _mm512_set1_epi64
+ (*(uint32_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_1)));
+ v_hash = _mm512_madd52lo_epu64
+ (_mm512_set1_epi64(PRIME64_3),
+ _mm512_rol_epi64(v_hash, 23),
+ _mm512_set1_epi64(PRIME64_2));
+
+ offset += 4;
+ remaining -= 4;
+ }
+
+ while (remaining != 0) {
+ input = _mm512_set1_epi64
+ (*(uint8_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_5)));
+ v_hash = _mm512_mullo_epi64
+ (_mm512_rol_epi64(v_hash, 11),
+ _mm512_set1_epi64(PRIME64_1));
+ offset++;
+ remaining--;
+ }
+
+ v_hash = xxh64_fmix_avx512(v_hash);
+
+ /*
+ * theoritically, such modular operations can be replaced by
+ * _mm512_rem_epi64(), but seems it depends on the compiler's
+ * implementation. so here is the limitation that the modulo
+ * value should be power of 2.
+ */
+ __m512i v_hash_remainder = _mm512_set1_epi64((modulo - 1));
+
+ return _mm512_cvtepi64_epi32(_mm512_and_si512(v_hash, v_hash_remainder));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_XXH64_AVX512_H_ */
diff --git a/lib/member/version.map b/lib/member/version.map
index 19469c6aba..35199270ff 100644
--- a/lib/member/version.map
+++ b/lib/member/version.map
@@ -14,3 +14,12 @@ DPDK_23 {
local: *;
};
+
+EXPERIMENTAL {
+ global:
+
+ # added in 22.11
+ rte_member_add_byte_count;
+ rte_member_query_count;
+ rte_member_report_heavyhitter;
+};
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v3 2/2] test/member: add functional and perf tests for sketch
2022-09-15 2:14 ` [PATCH v3 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-09-15 2:14 ` [PATCH v3 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-09-15 2:14 ` Leyi Rong
2022-09-15 7:30 ` Jiang, YuX
1 sibling, 1 reply; 34+ messages in thread
From: Leyi Rong @ 2022-09-15 2:14 UTC (permalink / raw)
To: ferruh.yigit, david.marchand, suanmingm, yipeng1.wang,
zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
This patch adds functional and performance tests for sketch mode of
membership library.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
app/test/test_member.c | 272 ++++++++++++++++++++++++++++++++++++
app/test/test_member_perf.c | 153 +++++++++++++++++++-
2 files changed, 421 insertions(+), 4 deletions(-)
diff --git a/app/test/test_member.c b/app/test/test_member.c
index 26a712439f..8266e6437b 100644
--- a/app/test/test_member.c
+++ b/app/test/test_member.c
@@ -4,6 +4,7 @@
/* This test is for membership library's simple feature test */
+#include <math.h>
#include "test.h"
#include <rte_memcpy.h>
@@ -28,6 +29,7 @@ test_member(void)
struct rte_member_setsum *setsum_ht;
struct rte_member_setsum *setsum_cache;
struct rte_member_setsum *setsum_vbf;
+struct rte_member_setsum *setsum_sketch;
/* 5-tuple key type */
struct flow_key {
@@ -108,6 +110,21 @@ static struct rte_member_parameters params = {
.socket_id = 0 /* NUMA Socket ID for memory. */
};
+/* for sketch definitions */
+#define TOP_K 10
+#define HH_PKT_SIZE 16
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define PRINT_OUT_COUNT 20
+
+#define SKETCH_LARGEST_KEY_SIZE 1000000
+#define SKETCH_TOTAL_KEY 500
+#define NUM_OF_KEY(key) {\
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (key + 1)) \
+}
+
+void *heavy_hitters[TOP_K];
+
/*
* Sequence of operations for find existing setsummary
*
@@ -684,6 +701,257 @@ perform_free(void)
rte_member_free(setsum_vbf);
}
+static void
+print_out_sketch_results(uint64_t *count_result, member_set_t *heavy_set,
+ uint32_t print_num, bool count_byte)
+{
+ uint32_t i;
+
+ for (i = 0; i < print_num; i++) {
+ if (count_byte)
+ printf("key %2u, count %8"PRIu64", real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)) *
+ HH_PKT_SIZE,
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i) * HH_PKT_SIZE) /
+ ((float)NUM_OF_KEY(i) * HH_PKT_SIZE));
+ else
+ printf("key %2u, count %8"PRIu64", real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)),
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i)) /
+ (float)NUM_OF_KEY(i));
+ }
+}
+
+static int
+sketch_test(uint32_t *keys, uint32_t total_pkt, int count_byte, int reset_test)
+{
+ uint32_t i;
+ uint64_t result_count[SKETCH_TOTAL_KEY];
+ member_set_t heavy_set[SKETCH_TOTAL_KEY];
+ uint64_t count[TOP_K];
+ int ret;
+ int hh_cnt;
+
+ setsum_sketch = rte_member_create(¶ms);
+ if (setsum_sketch == NULL) {
+ printf("Creation of setsums fail\n");
+ return -1;
+ }
+
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count(setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ if (reset_test) {
+ printf("\nEntering Sketch Reset Test Process!\n");
+ rte_member_reset(setsum_sketch);
+
+ /* after reset, check some key's count */
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ printf("\nReinsert keys after Sketch Reset!\n");
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count
+ (setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ printf("\nDelete some keys!\n");
+ uint32_t tmp_key = 0;
+
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+ tmp_key = 1;
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+ }
+
+ rte_member_free(setsum_sketch);
+ return 0;
+}
+
+static int
+test_member_sketch(void)
+{
+ unsigned int i, j, index;
+ uint32_t total_pkt = 0;
+ uint32_t *keys;
+ int count_byte = 0;
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++)
+ total_pkt += ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1));
+
+ printf("\nTotal key count [%u] in Sketch Autotest\n", total_pkt);
+
+ keys = rte_zmalloc(NULL, sizeof(uint32_t) * total_pkt, 0);
+
+ if (keys == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ return -1;
+ }
+
+ index = 0;
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ for (j = 0; j < ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)); j++)
+ keys[index++] = i;
+ }
+
+ /* shuffle the keys */
+ for (i = index - 1; i > 0; i--) {
+ uint32_t swap_idx = rte_rand() % i;
+ uint32_t tmp_key = keys[i];
+
+ keys[i] = keys[swap_idx];
+ keys[swap_idx] = tmp_key;
+ }
+
+ params.key_len = 4;
+ params.name = "test_member_sketch";
+ params.type = RTE_MEMBER_TYPE_SKETCH;
+ params.error_rate = SKETCH_ERROR_RATE;
+ params.sample_rate = SKETCH_SAMPLE_RATE;
+ params.extra_flag = 0;
+ params.top_k = TOP_K;
+ params.prim_hash_seed = rte_rdtsc();
+ int reset_test = 0;
+
+ printf("Default sketching params: Error Rate: [%f]\tSample Rate: [%f]\tTopK: [%d]\n",
+ SKETCH_ERROR_RATE, SKETCH_SAMPLE_RATE, TOP_K);
+
+ printf("\n[Sketch with Fixed Sampling Rate Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ printf("\n[Sketch with Always Bounded Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ count_byte = 1;
+ params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ printf("\n[Sketch with Packet Size Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ count_byte = 0;
+ params.extra_flag = 0;
+ reset_test = 1;
+ printf("\nreset sketch test\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ rte_free(keys);
+ return 0;
+}
+
static int
test_member(void)
{
@@ -719,6 +987,10 @@ test_member(void)
return -1;
}
+ if (test_member_sketch() < 0) {
+ perform_free();
+ return -1;
+ }
perform_free();
return 0;
}
diff --git a/app/test/test_member_perf.c b/app/test/test_member_perf.c
index 978db0020a..b7eb3e4c66 100644
--- a/app/test/test_member_perf.c
+++ b/app/test/test_member_perf.c
@@ -13,6 +13,7 @@
#include <rte_random.h>
#include <rte_memcpy.h>
#include <rte_thash.h>
+#include <math.h>
#ifdef RTE_EXEC_ENV_WINDOWS
static int
@@ -26,7 +27,7 @@ test_member_perf(void)
#include <rte_member.h>
-#define NUM_KEYSIZES 10
+#define NUM_KEYSIZES RTE_DIM(hashtest_key_lens)
#define NUM_SHUFFLES 10
#define MAX_KEYSIZE 64
#define MAX_ENTRIES (1 << 19)
@@ -36,12 +37,23 @@ test_member_perf(void)
#define BURST_SIZE 64
#define VBF_FALSE_RATE 0.03
+/* for the heavy hitter detection */
+#define SKETCH_LARGEST_KEY_SIZE (1<<15)
+#define SKETCH_PKT_SIZE 16
+#define TOP_K 100
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define NUM_ADDS (KEYS_TO_ADD * 20)
+
static unsigned int test_socket_id;
enum sstype {
HT = 0,
CACHE,
VBF,
+ SKETCH,
+ SKETCH_BOUNDED,
+ SKETCH_BYTE,
NUM_TYPE
};
@@ -88,6 +100,7 @@ static member_set_t data[NUM_TYPE][/* Array to store the data */KEYS_TO_ADD];
/* Array to store all input keys */
static uint8_t keys[KEYS_TO_ADD][MAX_KEYSIZE];
+static uint8_t hh_keys[KEYS_TO_ADD][MAX_KEYSIZE];
/* Shuffle the keys that have been added, so lookups will be totally random */
static void
@@ -136,6 +149,10 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
{
unsigned int i, j;
int num_duplicates;
+ int distinct_key = 0;
+ int count_down = SKETCH_LARGEST_KEY_SIZE;
+ uint32_t swap_idx;
+ uint8_t temp_key[MAX_KEYSIZE];
params->key_size = hashtest_key_lens[cycle];
params->cycle = cycle;
@@ -176,6 +193,22 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
/* Shuffle the random values again */
shuffle_input_keys(params);
+ for (i = 0; i < KEYS_TO_ADD; i++) {
+ if (count_down == 0) {
+ distinct_key++;
+ count_down = ceil(SKETCH_LARGEST_KEY_SIZE / (distinct_key + 1));
+ }
+ memcpy(hh_keys[i], keys[distinct_key], params->key_size);
+ count_down--;
+ }
+
+ for (i = KEYS_TO_ADD - 1; i > 0; i--) {
+ swap_idx = rte_rand() % i;
+ memcpy(temp_key, hh_keys[i], params->key_size);
+ memcpy(hh_keys[i], hh_keys[swap_idx], params->key_size);
+ memcpy(hh_keys[swap_idx], temp_key, params->key_size);
+ }
+
/* For testing miss lookup, we insert half and lookup the other half */
unsigned int entry_cnt, bf_key_cnt;
if (!miss) {
@@ -208,6 +241,44 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
params->setsum[VBF] = rte_member_create(&member_params);
if (params->setsum[VBF] == NULL)
fprintf(stderr, "VBF create fail\n");
+
+ member_params.name = "test_member_sketch";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag = 0;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_bounded";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BOUNDED] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BOUNDED] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_byte";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BYTE] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BYTE] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+
for (i = 0; i < NUM_TYPE; i++) {
if (params->setsum[i] == NULL)
return -1;
@@ -243,6 +314,39 @@ timed_adds(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_adds_sketch(struct member_perf_params *params, int type)
+{
+ const uint64_t start_tsc = rte_rdtsc();
+ unsigned int i, j, a;
+ int32_t ret;
+
+ for (i = 0; i < NUM_ADDS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ if (type == SKETCH_BYTE)
+ ret = rte_member_add_byte_count(params->setsum[type],
+ &hh_keys[j], SKETCH_PKT_SIZE);
+ else
+ ret = rte_member_add(params->setsum[type], &hh_keys[j], 1);
+ if (ret < 0) {
+ printf("Error %d in rte_member_add - key=0x", ret);
+ for (a = 0; a < params->key_size; a++)
+ printf("%02x", hh_keys[j][a]);
+ printf("type: %d\n", type);
+
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][ADD] = time_taken / NUM_ADDS;
+
+ return 0;
+}
+
static int
timed_lookups(struct member_perf_params *params, int type)
{
@@ -279,6 +383,36 @@ timed_lookups(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_lookups_sketch(struct member_perf_params *params, int type)
+{
+ unsigned int i, j;
+
+ false_data[type][params->cycle] = 0;
+
+ const uint64_t start_tsc = rte_rdtsc();
+ member_set_t result;
+ int ret;
+
+ for (i = 0; i < NUM_LOOKUPS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ ret = rte_member_lookup(params->setsum[type], &hh_keys[j],
+ &result);
+ if (ret < 0) {
+ printf("lookup wrong internally");
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][LOOKUP] = time_taken / NUM_LOOKUPS;
+
+ return 0;
+}
+
static int
timed_lookups_bulk(struct member_perf_params *params, int type)
{
@@ -531,7 +665,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_adds(¶ms, j) < 0)
return exit_with_fail("timed_adds", ¶ms,
@@ -562,6 +696,17 @@ run_all_tbl_perf_tests(void)
/* Print a dot to show progress on operations */
}
+
+ for (j = SKETCH; j < NUM_TYPE; j++) {
+ if (timed_adds_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_adds_sketch", ¶ms, i, j);
+
+ if (timed_lookups_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_lookups_sketch", ¶ms, i, j);
+ }
+
printf(".");
fflush(stdout);
@@ -574,7 +719,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_miss_lookup(¶ms, j) < 0)
return exit_with_fail("timed_miss_lookup",
¶ms, i, j);
@@ -605,7 +750,7 @@ run_all_tbl_perf_tests(void)
"fr_multi_bulk", "false_positive_rate");
/* Key size not influence False rate so just print out one key size */
for (i = 0; i < 1; i++) {
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
printf("%-18d", hashtest_key_lens[i]);
printf("%-18d", j);
printf("%-18f", (float)false_data[j][i] / NUM_LOOKUPS);
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v3 2/2] test/member: add functional and perf tests for sketch
2022-09-15 2:14 ` [PATCH v3 2/2] test/member: add functional and perf tests for sketch Leyi Rong
@ 2022-09-15 7:30 ` Jiang, YuX
2022-09-15 7:41 ` David Marchand
0 siblings, 1 reply; 34+ messages in thread
From: Jiang, YuX @ 2022-09-15 7:30 UTC (permalink / raw)
To: Rong, Leyi, ferruh.yigit, david.marchand, suanmingm, Wang,
Yipeng1, zaoxingliu, Gobriel, Sameh
Cc: dev, Rong, Leyi
> -----Original Message-----
> From: Leyi Rong <leyi.rong@intel.com>
> Sent: Thursday, September 15, 2022 10:15 AM
> To: ferruh.yigit@xilinx.com; david.marchand@redhat.com;
> suanmingm@nvidia.com; Wang, Yipeng1 <yipeng1.wang@intel.com>;
> zaoxingliu@gmail.com; Gobriel, Sameh <sameh.gobriel@intel.com>
> Cc: dev@dpdk.org; Rong, Leyi <leyi.rong@intel.com>
> Subject: [PATCH v3 2/2] test/member: add functional and perf tests for sketch
>
> This patch adds functional and performance tests for sketch mode of
> membership library.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> ---
Tested-by: Yu Jiang <yux.jiang@intel.com>
Verify v3 dpdk patch https://patchwork.dpdk.org/project/dpdk/list/?series=24667 based on latest main dpdk.
Test unit test passed on CentOS7.9.
meson test -C x86_64-native-linuxapp-gcc/ member_autotest member_perf_autotest
Best regards,
Yu Jiang
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v3 2/2] test/member: add functional and perf tests for sketch
2022-09-15 7:30 ` Jiang, YuX
@ 2022-09-15 7:41 ` David Marchand
0 siblings, 0 replies; 34+ messages in thread
From: David Marchand @ 2022-09-15 7:41 UTC (permalink / raw)
To: Jiang, YuX
Cc: Rong, Leyi, ferruh.yigit, suanmingm, Wang, Yipeng1, zaoxingliu,
Gobriel, Sameh, dev
On Thu, Sep 15, 2022 at 9:30 AM Jiang, YuX <yux.jiang@intel.com> wrote:
>
> > -----Original Message-----
> > From: Leyi Rong <leyi.rong@intel.com>
> > Sent: Thursday, September 15, 2022 10:15 AM
> > To: ferruh.yigit@xilinx.com; david.marchand@redhat.com;
> > suanmingm@nvidia.com; Wang, Yipeng1 <yipeng1.wang@intel.com>;
> > zaoxingliu@gmail.com; Gobriel, Sameh <sameh.gobriel@intel.com>
> > Cc: dev@dpdk.org; Rong, Leyi <leyi.rong@intel.com>
> > Subject: [PATCH v3 2/2] test/member: add functional and perf tests for sketch
> >
> > This patch adds functional and performance tests for sketch mode of
> > membership library.
> >
> > Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> > Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> > ---
> Tested-by: Yu Jiang <yux.jiang@intel.com>
>
> Verify v3 dpdk patch https://patchwork.dpdk.org/project/dpdk/list/?series=24667 based on latest main dpdk.
> Test unit test passed on CentOS7.9.
> meson test -C x86_64-native-linuxapp-gcc/ member_autotest member_perf_autotest
This series has AVX512 stuff in it.
Testing on Centos 7 can be used for non regression, but we need tests
with toolchains that have AVX512 support.
--
David Marchand
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v3 1/2] member: implement NitroSketch mode
2022-09-15 2:14 ` [PATCH v3 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-09-15 7:47 ` David Marchand
2022-09-15 13:02 ` Rong, Leyi
0 siblings, 1 reply; 34+ messages in thread
From: David Marchand @ 2022-09-15 7:47 UTC (permalink / raw)
To: Leyi Rong
Cc: ferruh.yigit, suanmingm, yipeng1.wang, zaoxingliu, sameh.gobriel, dev
On Thu, Sep 15, 2022 at 4:15 AM Leyi Rong <leyi.rong@intel.com> wrote:
>
> Sketching algorithm provide high-fidelity approximate measurements and
> appears as a promising alternative to traditional approaches such as
> packet sampling.
>
> NitroSketch [1] is a software sketching framework that optimizes
> performance, provides accuracy guarantees, and supports a variety of
> sketches.
>
> This commit adds a new data structure called sketch into
> membership library. This new data structure is an efficient
> way to profile the traffic for heavy hitters. Also use min-heap
> structure to maintain the top-k flow keys.
>
> [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
> Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
> Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
> https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
>
> Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> ---
> lib/member/meson.build | 42 +-
> lib/member/rte_member.c | 75 ++++
> lib/member/rte_member.h | 154 ++++++-
> lib/member/rte_member_heap.h | 424 ++++++++++++++++++
> lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
> lib/member/rte_member_sketch.h | 97 +++++
> lib/member/rte_member_sketch_avx512.c | 69 +++
> lib/member/rte_member_sketch_avx512.h | 36 ++
> lib/member/rte_xxh64_avx512.h | 117 +++++
> lib/member/version.map | 9 +
> 10 files changed, 1613 insertions(+), 4 deletions(-)
> create mode 100644 lib/member/rte_member_heap.h
> create mode 100644 lib/member/rte_member_sketch.c
> create mode 100644 lib/member/rte_member_sketch.h
> create mode 100644 lib/member/rte_member_sketch_avx512.c
> create mode 100644 lib/member/rte_member_sketch_avx512.h
> create mode 100644 lib/member/rte_xxh64_avx512.h
>
> diff --git a/lib/member/meson.build b/lib/member/meson.build
> index e06fddc240..8de0c09a6a 100644
> --- a/lib/member/meson.build
> +++ b/lib/member/meson.build
> @@ -7,6 +7,46 @@ if is_windows
> subdir_done()
> endif
>
> -sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c')
> +sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c', 'rte_member_sketch.c')
> headers = files('rte_member.h')
> deps += ['hash']
> +includes += include_directories('../hash', '../ring')
> +
> +# compile AVX512 version if:
> +# we are building 64-bit binary AND binutils can generate proper code
> +if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
> + # compile AVX512 version if either:
> + # a. we have AVX512 supported in minimum instruction set
> + # baseline
> + # b. it's not minimum instruction set, but supported by
> + # compiler
> + #
> + # in former case, just add avx512 C file to files list
> + # in latter case, compile c file to static lib, using correct
> + # compiler flags, and then have the .o file from static lib
> + # linked into main lib.
> +
> + #check if all required flags already enabled
> + sketch_avx512_flags = ['__AVX512F__', '__AVX512DQ__', '__AVX512IFMA__']
> +
> + sketch_avx512_on = true
> + foreach f:sketch_avx512_flags
> + if cc.get_define(f, args: machine_args) == ''
> + sketch_avx512_on = false
> + endif
> + endforeach
> +
> + if sketch_avx512_on == true
> + cflags += ['-DCC_AVX512_SUPPORT']
> + sources += files('rte_member_sketch_avx512.c')
> + elif cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
> + cflags += ['-DCC_AVX512_SUPPORT']
> + cflags += ['-mavx512f', '-mavx512dq', '-mavx512ifma']
No.
Again, you can't push AVX512 flags to the variable cflags.
On my laptop, running with Fedora 36:
$ DPDK_TEST=member_autotest ./build-gcc/app/test/dpdk-test --no-huge -m 2048
EAL: Detected CPU lcores: 12
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
APP: HPET is not enabled, using TSC as default timer
RTE>>member_autotest
Expected error section begin...
rte_member_create_vbf(): Membership vBF create with invalid parameters
rte_member_create_vbf(): Membership vBF create with invalid parameters
rte_member_create_vbf(): Membership vBF create with invalid parameters
rte_member_create_ht(): Membership HT create with invalid parameters
rte_member_create_ht(): Membership HT create with invalid parameters
rte_member_create_ht(): Membership HT create with invalid parameters
Expected error section end...
rte_member_create_ht(): Hash table based filter created, the table has
65536 entries, 4096 buckets
rte_member_create(): Creating a setsummary table with mode 0
rte_member_create_ht(): Hash table based filter created, the table has
65536 entries, 4096 buckets
rte_member_create(): Creating a setsummary table with mode 0
rte_member_create_ht(): Hash table based filter created, the table has
65536 entries, 4096 buckets
rte_member_create(): Creating a setsummary table with mode 0
Illegal instruction (core dumped)
The reason is the same as I reported earlier, and this can be
reproduced like this:
$ touch lib/member/rte_member_sketch*
$ ninja -C build-gcc -vv
ninja: Entering directory `build-gcc'
[1/9] ccache gcc -Ilib/member/libsketch_avx512_tmp.a.p -Ilib/member
-I../lib/member -Ilib/hash -I../lib/hash -Ilib/ring -I../lib/ring -I.
-I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include
-Ilib/eal/linux/include -I../lib/eal/linux/include
-Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common
-I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs
-I../lib/kvargs -Ilib/metrics -I../lib/metrics -Ilib/telemetry
-I../lib/telemetry -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64
-Wall -Winvalid-pch -Wextra -Werror -O3 -g -include rte_config.h
-Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral
-Wformat-security -Wmissing-declarations -Wmissing-prototypes
-Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare
-Wstrict-prototypes -Wundef -Wwrite-strings
-Wno-address-of-packed-member -Wno-packed-not-aligned
-Wno-missing-field-initializers -Wno-zero-length-bounds -D_GNU_SOURCE
-fPIC -march=native -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API
-Wno-format-truncation -DCC_AVX512_SUPPORT -mavx512f -mavx512dq
-mavx512ifma -MD -MQ
lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o -MF
lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o.d -o
lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o -c
../lib/member/rte_member_sketch_avx512.c
[2/9] ccache gcc -Ilib/librte_member.a.p -Ilib -I../lib -Ilib/hash
-I../lib/hash -Ilib/ring -I../lib/ring -Ilib/member -I../lib/member
-I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include
-Ilib/eal/linux/include -I../lib/eal/linux/include
-Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common
-I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs
-I../lib/kvargs -Ilib/metrics -I../lib/metrics -Ilib/telemetry
-I../lib/telemetry -Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf
-Ilib/mempool -I../lib/mempool -Ilib/rcu -I../lib/rcu
-fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch
-Wextra -Werror -O3 -g -include rte_config.h -Wcast-qual -Wdeprecated
-Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations
-Wmissing-prototypes -Wnested-externs -Wold-style-definition
-Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef
-Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned
-Wno-missing-field-initializers -Wno-zero-length-bounds -D_GNU_SOURCE
-fPIC -march=native -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API
-Wno-format-truncation -DCC_AVX512_SUPPORT -mavx512f -mavx512dq
-mavx512ifma -DRTE_LOG_DEFAULT_LOGTYPE=lib.member -MD -MQ
lib/librte_member.a.p/member_rte_member.c.o -MF
lib/librte_member.a.p/member_rte_member.c.o.d -o
lib/librte_member.a.p/member_rte_member.c.o -c
../lib/member/rte_member.c
Here, the rte_member.c file is compiled with "-mavx512f -mavx512dq
-mavx512ifma".
The compiler might insert AVX512 instruction in this generic code.
[3/9] ccache gcc -Ilib/librte_member.a.p -Ilib -I../lib -Ilib/hash
-I../lib/hash -Ilib/ring -I../lib/ring -Ilib/member -I../lib/member
-I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include
-Ilib/eal/linux/include -I../lib/eal/linux/include
-Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common
-I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs
-I../lib/kvargs -Ilib/metrics -I../lib/metrics -Ilib/telemetry
-I../lib/telemetry -Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf
-Ilib/mempool -I../lib/mempool -Ilib/rcu -I../lib/rcu
-fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch
-Wextra -Werror -O3 -g -include rte_config.h -Wcast-qual -Wdeprecated
-Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations
-Wmissing-prototypes -Wnested-externs -Wold-style-definition
-Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef
-Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned
-Wno-missing-field-initializers -Wno-zero-length-bounds -D_GNU_SOURCE
-fPIC -march=native -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API
-Wno-format-truncation -DCC_AVX512_SUPPORT -mavx512f -mavx512dq
-mavx512ifma -DRTE_LOG_DEFAULT_LOGTYPE=lib.member -MD -MQ
lib/librte_member.a.p/member_rte_member_sketch.c.o -MF
lib/librte_member.a.p/member_rte_member_sketch.c.o.d -o
lib/librte_member.a.p/member_rte_member_sketch.c.o -c
../lib/member/rte_member_sketch.c
Idem.
[4/9] rm -f lib/member/libsketch_avx512_tmp.a && gcc-ar csrDT
lib/member/libsketch_avx512_tmp.a
lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o
[5/9] rm -f lib/librte_member.a && gcc-ar csrD lib/librte_member.a
lib/librte_member.a.p/member_rte_member.c.o
lib/librte_member.a.p/member_rte_member_ht.c.o
lib/librte_member.a.p/member_rte_member_vbf.c.o
lib/librte_member.a.p/member_rte_member_sketch.c.o
lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o
[6/9] /usr/bin/meson --internal exe --capture lib/member.sym_chk --
/home/dmarchan/git/pub/dpdk.org/buildtools/check-symbols.sh
/home/dmarchan/git/pub/dpdk.org/lib/member/version.map
lib/librte_member.a
[7/9] gcc -o lib/librte_member.so.23.0
lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o
lib/librte_member.a.p/member_rte_member.c.o
lib/librte_member.a.p/member_rte_member_ht.c.o
lib/librte_member.a.p/member_rte_member_vbf.c.o
lib/librte_member.a.p/member_rte_member_sketch.c.o -Wl,--as-needed
-Wl,--no-undefined -Wl,-O1 -shared -fPIC -Wl,--start-group
-Wl,-soname,librte_member.so.23 -Wl,--no-as-needed -pthread -lm -ldl
-lnuma -lfdt -larchive '-Wl,-rpath,$ORIGIN/'
-Wl,-rpath-link,/home/dmarchan/git/pub/dpdk.org/build-gcc/lib
lib/librte_eal.so.23.0 lib/librte_kvargs.so.23.0
lib/librte_telemetry.so.23.0 lib/librte_hash.so.23.0
lib/librte_net.so.23.0 lib/librte_mbuf.so.23.0
lib/librte_mempool.so.23.0 lib/librte_ring.so.23.0
lib/librte_rcu.so.23.0
-Wl,--version-script=/home/dmarchan/git/pub/dpdk.org/lib/member/version.map
/usr/lib64/libbsd.so -Wl,--end-group
[8/9] /usr/bin/meson --internal symbolextractor
/home/dmarchan/git/pub/dpdk.org/build-gcc lib/librte_member.so.23.0
lib/librte_member.so.23.0
lib/librte_member.so.23.0.p/librte_member.so.23.0.symbols
> + sketch_avx512_tmp = static_library('sketch_avx512_tmp',
> + 'rte_member_sketch_avx512.c',
> + include_directories: includes,
> + dependencies: static_rte_eal,
> + c_args: cflags)
> + objs += sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
> + endif
> +endif
--
David Marchand
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v3 1/2] member: implement NitroSketch mode
2022-09-15 7:47 ` David Marchand
@ 2022-09-15 13:02 ` Rong, Leyi
0 siblings, 0 replies; 34+ messages in thread
From: Rong, Leyi @ 2022-09-15 13:02 UTC (permalink / raw)
To: David Marchand
Cc: ferruh.yigit, suanmingm, Wang, Yipeng1, zaoxingliu, Gobriel, Sameh, dev
Hi David,
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, September 15, 2022 3:47 PM
> To: Rong, Leyi <leyi.rong@intel.com>
> Cc: ferruh.yigit@xilinx.com; suanmingm@nvidia.com; Wang, Yipeng1
> <yipeng1.wang@intel.com>; zaoxingliu@gmail.com; Gobriel, Sameh
> <sameh.gobriel@intel.com>; dev@dpdk.org
> Subject: Re: [PATCH v3 1/2] member: implement NitroSketch mode
>
> On Thu, Sep 15, 2022 at 4:15 AM Leyi Rong <leyi.rong@intel.com> wrote:
> >
> > Sketching algorithm provide high-fidelity approximate measurements and
> > appears as a promising alternative to traditional approaches such as
> > packet sampling.
> >
> > NitroSketch [1] is a software sketching framework that optimizes
> > performance, provides accuracy guarantees, and supports a variety of
> > sketches.
> >
> > This commit adds a new data structure called sketch into membership
> > library. This new data structure is an efficient way to profile the
> > traffic for heavy hitters. Also use min-heap structure to maintain the
> > top-k flow keys.
> >
> > [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
> > Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
> > Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
> > https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
> >
> > Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
> > Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> > Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> > ---
> > lib/member/meson.build | 42 +-
> > lib/member/rte_member.c | 75 ++++
> > lib/member/rte_member.h | 154 ++++++-
> > lib/member/rte_member_heap.h | 424 ++++++++++++++++++
> > lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
> > lib/member/rte_member_sketch.h | 97 +++++
> > lib/member/rte_member_sketch_avx512.c | 69 +++
> > lib/member/rte_member_sketch_avx512.h | 36 ++
> > lib/member/rte_xxh64_avx512.h | 117 +++++
> > lib/member/version.map | 9 +
> > 10 files changed, 1613 insertions(+), 4 deletions(-) create mode
> > 100644 lib/member/rte_member_heap.h create mode 100644
> > lib/member/rte_member_sketch.c create mode 100644
> > lib/member/rte_member_sketch.h create mode 100644
> > lib/member/rte_member_sketch_avx512.c
> > create mode 100644 lib/member/rte_member_sketch_avx512.h
> > create mode 100644 lib/member/rte_xxh64_avx512.h
> >
> > diff --git a/lib/member/meson.build b/lib/member/meson.build index
> > e06fddc240..8de0c09a6a 100644
> > --- a/lib/member/meson.build
> > +++ b/lib/member/meson.build
> > @@ -7,6 +7,46 @@ if is_windows
> > subdir_done()
> > endif
> >
> > -sources = files('rte_member.c', 'rte_member_ht.c',
> > 'rte_member_vbf.c')
> > +sources = files('rte_member.c', 'rte_member_ht.c',
> > +'rte_member_vbf.c', 'rte_member_sketch.c')
> > headers = files('rte_member.h')
> > deps += ['hash']
> > +includes += include_directories('../hash', '../ring')
> > +
> > +# compile AVX512 version if:
> > +# we are building 64-bit binary AND binutils can generate proper code
> > +if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
> > + # compile AVX512 version if either:
> > + # a. we have AVX512 supported in minimum instruction set
> > + # baseline
> > + # b. it's not minimum instruction set, but supported by
> > + # compiler
> > + #
> > + # in former case, just add avx512 C file to files list
> > + # in latter case, compile c file to static lib, using correct
> > + # compiler flags, and then have the .o file from static lib
> > + # linked into main lib.
> > +
> > + #check if all required flags already enabled
> > + sketch_avx512_flags = ['__AVX512F__', '__AVX512DQ__',
> > + '__AVX512IFMA__']
> > +
> > + sketch_avx512_on = true
> > + foreach f:sketch_avx512_flags
> > + if cc.get_define(f, args: machine_args) == ''
> > + sketch_avx512_on = false
> > + endif
> > + endforeach
> > +
> > + if sketch_avx512_on == true
> > + cflags += ['-DCC_AVX512_SUPPORT']
> > + sources += files('rte_member_sketch_avx512.c')
> > + elif cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
> > + cflags += ['-DCC_AVX512_SUPPORT']
> > + cflags += ['-mavx512f', '-mavx512dq', '-mavx512ifma']
>
> No.
> Again, you can't push AVX512 flags to the variable cflags.
>
> On my laptop, running with Fedora 36:
>
> $ DPDK_TEST=member_autotest ./build-gcc/app/test/dpdk-test --no-huge -m
> 2048
> EAL: Detected CPU lcores: 12
> EAL: Detected NUMA nodes: 1
> EAL: Detected shared linkage of DPDK
> EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'VA'
> APP: HPET is not enabled, using TSC as default timer
> RTE>>member_autotest
> Expected error section begin...
> rte_member_create_vbf(): Membership vBF create with invalid parameters
> rte_member_create_vbf(): Membership vBF create with invalid parameters
> rte_member_create_vbf(): Membership vBF create with invalid parameters
> rte_member_create_ht(): Membership HT create with invalid parameters
> rte_member_create_ht(): Membership HT create with invalid parameters
> rte_member_create_ht(): Membership HT create with invalid parameters
> Expected error section end...
> rte_member_create_ht(): Hash table based filter created, the table has
> 65536 entries, 4096 buckets
> rte_member_create(): Creating a setsummary table with mode 0
> rte_member_create_ht(): Hash table based filter created, the table has
> 65536 entries, 4096 buckets
> rte_member_create(): Creating a setsummary table with mode 0
> rte_member_create_ht(): Hash table based filter created, the table has
> 65536 entries, 4096 buckets
> rte_member_create(): Creating a setsummary table with mode 0 Illegal
> instruction (core dumped)
>
> The reason is the same as I reported earlier, and this can be reproduced like this:
>
> $ touch lib/member/rte_member_sketch*
> $ ninja -C build-gcc -vv
> ninja: Entering directory `build-gcc'
> [1/9] ccache gcc -Ilib/member/libsketch_avx512_tmp.a.p -Ilib/member -
> I../lib/member -Ilib/hash -I../lib/hash -Ilib/ring -I../lib/ring -I.
> -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -
> I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -
> Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -
> I../lib/kvargs -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -
> fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -
> Wextra -Werror -O3 -g -include rte_config.h -Wcast-qual -Wdeprecated -
> Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -
> Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -
> Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-
> packed-member -Wno-packed-not-aligned -Wno-missing-field-initializers -Wno-
> zero-length-bounds -D_GNU_SOURCE -fPIC -march=native -
> DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-
> truncation -DCC_AVX512_SUPPORT -mavx512f -mavx512dq -mavx512ifma -MD
> -MQ lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o -
> MF lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o.d -o
> lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o -
> c ../lib/member/rte_member_sketch_avx512.c
> [2/9] ccache gcc -Ilib/librte_member.a.p -Ilib -I../lib -Ilib/hash -I../lib/hash -
> Ilib/ring -I../lib/ring -Ilib/member -I../lib/member -I. -I.. -Iconfig -I../config -
> Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include
> -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common -
> I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/metrics -
> I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/net -I../lib/net -Ilib/mbuf -
> I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/rcu -I../lib/rcu -fdiagnostics-
> color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -
> O3 -g -include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-
> nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -
> Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -
> Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member
> -Wno-packed-not-aligned -Wno-missing-field-initializers -Wno-zero-length-
> bounds -D_GNU_SOURCE -fPIC -march=native -DALLOW_EXPERIMENTAL_API -
> DALLOW_INTERNAL_API -Wno-format-truncation -DCC_AVX512_SUPPORT -
> mavx512f -mavx512dq -mavx512ifma -
> DRTE_LOG_DEFAULT_LOGTYPE=lib.member -MD -MQ
> lib/librte_member.a.p/member_rte_member.c.o -MF
> lib/librte_member.a.p/member_rte_member.c.o.d -o
> lib/librte_member.a.p/member_rte_member.c.o -
> c ../lib/member/rte_member.c
>
> Here, the rte_member.c file is compiled with "-mavx512f -mavx512dq -
> mavx512ifma".
> The compiler might insert AVX512 instruction in this generic code.
>
> [3/9] ccache gcc -Ilib/librte_member.a.p -Ilib -I../lib -Ilib/hash -I../lib/hash -
> Ilib/ring -I../lib/ring -Ilib/member -I../lib/member -I. -I.. -Iconfig -I../config -
> Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include
> -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common -
> I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/metrics -
> I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/net -I../lib/net -Ilib/mbuf -
> I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/rcu -I../lib/rcu -fdiagnostics-
> color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -
> O3 -g -include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-
> nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -
> Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -
> Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member
> -Wno-packed-not-aligned -Wno-missing-field-initializers -Wno-zero-length-
> bounds -D_GNU_SOURCE -fPIC -march=native -DALLOW_EXPERIMENTAL_API -
> DALLOW_INTERNAL_API -Wno-format-truncation -DCC_AVX512_SUPPORT -
> mavx512f -mavx512dq -mavx512ifma -
> DRTE_LOG_DEFAULT_LOGTYPE=lib.member -MD -MQ
> lib/librte_member.a.p/member_rte_member_sketch.c.o -MF
> lib/librte_member.a.p/member_rte_member_sketch.c.o.d -o
> lib/librte_member.a.p/member_rte_member_sketch.c.o -
> c ../lib/member/rte_member_sketch.c
>
> Idem.
>
> [4/9] rm -f lib/member/libsketch_avx512_tmp.a && gcc-ar csrDT
> lib/member/libsketch_avx512_tmp.a
> lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o
> [5/9] rm -f lib/librte_member.a && gcc-ar csrD lib/librte_member.a
> lib/librte_member.a.p/member_rte_member.c.o
> lib/librte_member.a.p/member_rte_member_ht.c.o
> lib/librte_member.a.p/member_rte_member_vbf.c.o
> lib/librte_member.a.p/member_rte_member_sketch.c.o
> lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o
> [6/9] /usr/bin/meson --internal exe --capture lib/member.sym_chk --
> /home/dmarchan/git/pub/dpdk.org/buildtools/check-symbols.sh
> /home/dmarchan/git/pub/dpdk.org/lib/member/version.map
> lib/librte_member.a
> [7/9] gcc -o lib/librte_member.so.23.0
> lib/member/libsketch_avx512_tmp.a.p/rte_member_sketch_avx512.c.o
> lib/librte_member.a.p/member_rte_member.c.o
> lib/librte_member.a.p/member_rte_member_ht.c.o
> lib/librte_member.a.p/member_rte_member_vbf.c.o
> lib/librte_member.a.p/member_rte_member_sketch.c.o -Wl,--as-needed -Wl,--
> no-undefined -Wl,-O1 -shared -fPIC -Wl,--start-group
> -Wl,-soname,librte_member.so.23 -Wl,--no-as-needed -pthread -lm -ldl -lnuma -
> lfdt -larchive '-Wl,-rpath,$ORIGIN/'
> -Wl,-rpath-link,/home/dmarchan/git/pub/dpdk.org/build-gcc/lib
> lib/librte_eal.so.23.0 lib/librte_kvargs.so.23.0
> lib/librte_telemetry.so.23.0 lib/librte_hash.so.23.0
> lib/librte_net.so.23.0 lib/librte_mbuf.so.23.0
> lib/librte_mempool.so.23.0 lib/librte_ring.so.23.0
> lib/librte_rcu.so.23.0
> -Wl,--version-
> script=/home/dmarchan/git/pub/dpdk.org/lib/member/version.map
> /usr/lib64/libbsd.so -Wl,--end-group
> [8/9] /usr/bin/meson --internal symbolextractor
> /home/dmarchan/git/pub/dpdk.org/build-gcc lib/librte_member.so.23.0
> lib/librte_member.so.23.0
> lib/librte_member.so.23.0.p/librte_member.so.23.0.symbols
>
Catch it, will fix in v4.
>
>
> > + sketch_avx512_tmp = static_library('sketch_avx512_tmp',
> > + 'rte_member_sketch_avx512.c',
> > + include_directories: includes,
> > + dependencies: static_rte_eal,
> > + c_args: cflags)
> > + objs +=
> sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
> > + endif
> > +endif
>
>
> --
> David Marchand
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v4 0/2] introduce NitroSketch Mode into membership library
2022-08-10 7:45 [PATCH 0/2] introduce NitroSketch Mode into membership library Leyi Rong
` (3 preceding siblings ...)
2022-09-15 2:14 ` [PATCH v3 0/2] introduce NitroSketch Mode into membership library Leyi Rong
@ 2022-09-15 13:03 ` Leyi Rong
2022-09-15 13:03 ` [PATCH v4 1/2] member: implement NitroSketch mode Leyi Rong
2022-09-15 13:03 ` [PATCH v4 2/2] test/member: add functional and perf tests for sketch Leyi Rong
2022-09-16 3:03 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Leyi Rong
5 siblings, 2 replies; 34+ messages in thread
From: Leyi Rong @ 2022-09-15 13:03 UTC (permalink / raw)
To: ferruh.yigit, david.marchand, suanmingm, yipeng1.wang,
zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
This patchset introduce a brand new NitroSketch Mode into membership
library. This algorithm provides high-fidelity approximate measurements
and appears as a promissing alternative to triditional approches such as
packet sampling.
---
v4:
- remove extra avx512 complier flags from the cflags.
v3:
- mark below new added APIs as "experimental":
rte_member_add_byte_count()
rte_member_query_count()
rte_member_report_heavyhitter()
- meson build file rework for potential runtime issue.
v2:
- attach paper link to commit log.
- fix potential memory leaks in test_member.c.
- build error fix according to CI build fail log.
Leyi Rong (2):
member: implement NitroSketch mode
test/member: add functional and perf tests for sketch
app/test/test_member.c | 272 ++++++++++++
app/test/test_member_perf.c | 153 ++++++-
lib/member/meson.build | 42 +-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 154 ++++++-
lib/member/rte_member_heap.h | 424 ++++++++++++++++++
lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 97 +++++
lib/member/rte_member_sketch_avx512.c | 70 +++
lib/member/rte_member_sketch_avx512.h | 35 ++
lib/member/rte_xxh64_avx512.h | 117 +++++
lib/member/version.map | 9 +
12 files changed, 2034 insertions(+), 8 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v4 1/2] member: implement NitroSketch mode
2022-09-15 13:03 ` [PATCH v4 0/2] introduce NitroSketch Mode into membership library Leyi Rong
@ 2022-09-15 13:03 ` Leyi Rong
2022-09-15 14:12 ` Bruce Richardson
2022-09-15 13:03 ` [PATCH v4 2/2] test/member: add functional and perf tests for sketch Leyi Rong
1 sibling, 1 reply; 34+ messages in thread
From: Leyi Rong @ 2022-09-15 13:03 UTC (permalink / raw)
To: ferruh.yigit, david.marchand, suanmingm, yipeng1.wang,
zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
Sketching algorithm provide high-fidelity approximate measurements and
appears as a promising alternative to traditional approaches such as
packet sampling.
NitroSketch [1] is a software sketching framework that optimizes
performance, provides accuracy guarantees, and supports a variety of
sketches.
This commit adds a new data structure called sketch into
membership library. This new data structure is an efficient
way to profile the traffic for heavy hitters. Also use min-heap
structure to maintain the top-k flow keys.
[1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
lib/member/meson.build | 42 +-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 154 ++++++-
lib/member/rte_member_heap.h | 424 ++++++++++++++++++
lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 97 +++++
lib/member/rte_member_sketch_avx512.c | 70 +++
lib/member/rte_member_sketch_avx512.h | 35 ++
lib/member/rte_xxh64_avx512.h | 117 +++++
lib/member/version.map | 9 +
10 files changed, 1613 insertions(+), 4 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
diff --git a/lib/member/meson.build b/lib/member/meson.build
index e06fddc240..3df0fbc7a6 100644
--- a/lib/member/meson.build
+++ b/lib/member/meson.build
@@ -7,6 +7,46 @@ if is_windows
subdir_done()
endif
-sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c')
+sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c', 'rte_member_sketch.c')
headers = files('rte_member.h')
deps += ['hash']
+includes += include_directories('../hash', '../ring')
+
+# compile AVX512 version if:
+# we are building 64-bit binary AND binutils can generate proper code
+if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
+ # compile AVX512 version if either:
+ # a. we have AVX512 supported in minimum instruction set
+ # baseline
+ # b. it's not minimum instruction set, but supported by
+ # compiler
+ #
+ # in former case, just add avx512 C file to files list
+ # in latter case, compile c file to static lib, using correct
+ # compiler flags, and then have the .o file from static lib
+ # linked into main lib.
+
+ #check if all required flags already enabled
+ sketch_avx512_flags = ['__AVX512F__', '__AVX512DQ__', '__AVX512IFMA__']
+
+ sketch_avx512_on = true
+ foreach f:sketch_avx512_flags
+ if cc.get_define(f, args: machine_args) == ''
+ sketch_avx512_on = false
+ endif
+ endforeach
+
+ if sketch_avx512_on == true
+ cflags += ['-DCC_AVX512_SUPPORT']
+ sources += files('rte_member_sketch_avx512.c')
+ elif cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
+ sketch_avx512_tmp = static_library('sketch_avx512_tmp',
+ 'rte_member_sketch_avx512.c',
+ include_directories: includes,
+ dependencies: static_rte_eal,
+ c_args: cflags +
+ ['-mavx512f', '-mavx512dq', '-mavx512ifma'])
+ objs += sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
+ cflags += ['-DCC_AVX512_SUPPORT']
+ endif
+endif
diff --git a/lib/member/rte_member.c b/lib/member/rte_member.c
index 7e1632e6b5..8f859f7fbd 100644
--- a/lib/member/rte_member.c
+++ b/lib/member/rte_member.c
@@ -9,10 +9,12 @@
#include <rte_malloc.h>
#include <rte_errno.h>
#include <rte_tailq.h>
+#include <rte_ring_elem.h>
#include "rte_member.h"
#include "rte_member_ht.h"
#include "rte_member_vbf.h"
+#include "rte_member_sketch.h"
TAILQ_HEAD(rte_member_list, rte_tailq_entry);
static struct rte_tailq_elem rte_member_tailq = {
@@ -72,6 +74,9 @@ rte_member_free(struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_free_vbf(setsum);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_free_sketch(setsum);
+ break;
default:
break;
}
@@ -86,6 +91,8 @@ rte_member_create(const struct rte_member_parameters *params)
struct rte_member_list *member_list;
struct rte_member_setsum *setsum;
int ret;
+ char ring_name[RTE_RING_NAMESIZE];
+ struct rte_ring *sketch_key_ring = NULL;
if (params == NULL) {
rte_errno = EINVAL;
@@ -100,6 +107,16 @@ rte_member_create(const struct rte_member_parameters *params)
return NULL;
}
+ if (params->type == RTE_MEMBER_TYPE_SKETCH) {
+ snprintf(ring_name, sizeof(ring_name), "SK_%s", params->name);
+ sketch_key_ring = rte_ring_create_elem(ring_name, sizeof(uint32_t),
+ rte_align32pow2(params->top_k), params->socket_id, 0);
+ if (sketch_key_ring == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Ring Memory allocation failed\n");
+ return NULL;
+ }
+ }
+
member_list = RTE_TAILQ_CAST(rte_member_tailq.head, rte_member_list);
rte_mcfg_tailq_write_lock();
@@ -145,6 +162,9 @@ rte_member_create(const struct rte_member_parameters *params)
case RTE_MEMBER_TYPE_VBF:
ret = rte_member_create_vbf(setsum, params);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ ret = rte_member_create_sketch(setsum, params, sketch_key_ring);
+ break;
default:
goto error_unlock_exit;
}
@@ -162,6 +182,7 @@ rte_member_create(const struct rte_member_parameters *params)
error_unlock_exit:
rte_free(te);
rte_free(setsum);
+ rte_ring_free(sketch_key_ring);
rte_mcfg_tailq_write_unlock();
return NULL;
}
@@ -178,6 +199,23 @@ rte_member_add(const struct rte_member_setsum *setsum, const void *key,
return rte_member_add_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_add_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch(setsum, key, set_id);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count)
+{
+ if (setsum == NULL || key == NULL || byte_count == 0)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch_byte_count(setsum, key, byte_count);
default:
return -EINVAL;
}
@@ -195,6 +233,8 @@ rte_member_lookup(const struct rte_member_setsum *setsum, const void *key,
return rte_member_lookup_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_lookup_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_lookup_sketch(setsum, key, set_id);
default:
return -EINVAL;
}
@@ -261,6 +301,36 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
}
}
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output)
+{
+ if (setsum == NULL || key == NULL || output == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_query_sketch(setsum, key, output);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count)
+{
+ if (setsum == NULL || key == NULL || count == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_report_heavyhitter_sketch(setsum, key, count);
+ default:
+ return -EINVAL;
+ }
+}
+
int
rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id)
@@ -272,6 +342,8 @@ rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
case RTE_MEMBER_TYPE_HT:
return rte_member_delete_ht(setsum, key, set_id);
/* current vBF implementation does not support delete function */
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_delete_sketch(setsum, key);
case RTE_MEMBER_TYPE_VBF:
default:
return -EINVAL;
@@ -290,6 +362,9 @@ rte_member_reset(const struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_reset_vbf(setsum);
return;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_reset_sketch(setsum);
+ return;
default:
return;
}
diff --git a/lib/member/rte_member.h b/lib/member/rte_member.h
index 2611015771..178efeed33 100644
--- a/lib/member/rte_member.h
+++ b/lib/member/rte_member.h
@@ -39,6 +39,18 @@
* | | | not overwrite | |
* | | | existing key. | |
* +----------+---------------------+----------------+-------------------------+
+ * +==========+=============================+
+ * | type | sketch |
+ * +==========+=============================+
+ * |structure | counting bloom filter array |
+ * +----------+-----------------------------+
+ * |set id | 1: heavy set, 0: light set |
+ * | | |
+ * +----------+-----------------------------+
+ * |usages & | count size of a flow, |
+ * |properties| used for heavy hitter |
+ * | | detection. |
+ * +----------+-----------------------------+
* -->
*/
@@ -50,6 +62,8 @@ extern "C" {
#endif
#include <stdint.h>
+#include <stdbool.h>
+#include <inttypes.h>
#include <rte_common.h>
@@ -65,6 +79,20 @@ typedef uint16_t member_set_t;
#define RTE_MEMBER_BUCKET_ENTRIES 16
/** Maximum number of characters in setsum name. */
#define RTE_MEMBER_NAMESIZE 32
+/** Max value of the random number */
+#define RTE_RAND_MAX ~0LLU
+/**
+ * As packets skipped in the sampling-based algorithm, the accounting
+ * results accuracy is not guaranteed in the start stage. There should
+ * be a "convergence time" to achieve the accuracy after receiving enough
+ * packets.
+ * For sketch, use the flag if prefer always bounded mode, which only
+ * starts sampling after receiving enough packets to keep the results
+ * accuracy always bounded.
+ */
+#define RTE_MEMBER_SKETCH_ALWAYS_BOUNDED 0x01
+/** For sketch, use the flag if to count packet size instead of packet count */
+#define RTE_MEMBER_SKETCH_COUNT_BYTE 0x02
/** @internal Hash function used by membership library. */
#if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32)
@@ -104,6 +132,7 @@ struct rte_member_parameters;
enum rte_member_setsum_type {
RTE_MEMBER_TYPE_HT = 0, /**< Hash table based set summary. */
RTE_MEMBER_TYPE_VBF, /**< Vector of bloom filters. */
+ RTE_MEMBER_TYPE_SKETCH,
RTE_MEMBER_NUM_TYPE
};
@@ -114,6 +143,19 @@ enum rte_member_sig_compare_function {
RTE_MEMBER_COMPARE_NUM
};
+/* sketch update function with different implementations. */
+typedef void (*sketch_update_fn_t)(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+/* sketch lookup function with different implementations. */
+typedef uint64_t (*sketch_lookup_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
+/* sketch delete function with different implementations. */
+typedef void (*sketch_delete_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
/** @internal setsummary structure. */
struct rte_member_setsum {
enum rte_member_setsum_type type; /* Type of the set summary. */
@@ -134,6 +176,21 @@ struct rte_member_setsum {
uint32_t bit_mask; /* Bit mask to get bit location in bf. */
uint32_t num_hashes; /* Number of hash values to index bf. */
+ /* Parameters for sketch */
+ float error_rate;
+ float sample_rate;
+ uint32_t num_col;
+ uint32_t num_row;
+ int always_bounded;
+ double converge_thresh;
+ uint32_t topk;
+ uint32_t count_byte;
+ uint64_t *hash_seeds;
+ sketch_update_fn_t sketch_update; /* Pointer to the sketch update function */
+ sketch_lookup_fn_t sketch_lookup; /* Pointer to the sketch lookup function */
+ sketch_delete_fn_t sketch_delete; /* Pointer to the sketch delete function */
+
+ void *runtime_var;
uint32_t mul_shift; /* vbf internal variable used during bit test. */
uint32_t div_shift; /* vbf internal variable used during bit test. */
@@ -143,6 +200,9 @@ struct rte_member_setsum {
/* Second cache line should start here. */
uint32_t socket_id; /* NUMA Socket ID for memory. */
char name[RTE_MEMBER_NAMESIZE]; /* Name of this set summary. */
+#ifdef RTE_ARCH_X86
+ bool use_avx512;
+#endif
} __rte_cache_aligned;
/**
@@ -261,8 +321,33 @@ struct rte_member_parameters {
*/
uint32_t sec_hash_seed;
+ /**
+ * For count(min) sketch data structure, error rate defines the accuracy
+ * required by the user. Higher accuracy leads to more memory usage, but
+ * the flow size is estimated more accurately.
+ */
+ float error_rate;
+
+ /**
+ * Sampling rate means the internal sample rate of the rows of the count
+ * min sketches. Lower sampling rate can reduce CPU overhead, but the
+ * data structure will require more time to converge statistically.
+ */
+ float sample_rate;
+
+ /**
+ * How many top heavy hitter to be reported. The library will internally
+ * keep the keys of heavy hitters for final report.
+ */
+ uint32_t top_k;
+
+ /**
+ * Extra flags that may passed in by user
+ */
+ uint32_t extra_flag;
+
int socket_id; /**< NUMA Socket ID for memory. */
-};
+} __rte_cache_aligned;
/**
* @warning
@@ -418,7 +503,7 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* RTE_MEMBER_NO_MATCH by default is set as 0.
* For HT mode, the set_id has range as [1, 0x7FFF], MSB is reserved.
* For vBF mode the set id is limited by the num_set parameter when create
- * the set-summary.
+ * the set-summary. For sketch mode, this id is ignored.
* @return
* HT (cache mode) and vBF should never fail unless the set_id is not in the
* valid range. In such case -EINVAL is returned.
@@ -429,12 +514,75 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* Return 0 for HT (cache mode) if the add does not cause
* eviction, return 1 otherwise. Return 0 for non-cache mode if success,
* -ENOSPC for full, and 1 if cuckoo eviction happens.
- * Always returns 0 for vBF mode.
+ * Always returns 0 for vBF mode and sketch.
*/
int
rte_member_add(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Add the packet byte size into the sketch.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param byte_count
+ * Add the byte count of the packet into the sketch.
+ * @return
+ * Return -EINVAL for invalid parameters, otherwise return 0.
+ */
+__rte_experimental
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Query packet count for a certain flow-key.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param count
+ * The output packet count or byte count.
+ * @return
+ * Return -EINVAL for invalid parameters.
+ */
+__rte_experimental
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *count);
+
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Report heavyhitter flow-keys into set-summary (SS).
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param keys
+ * Pointer of the output top-k key array.
+ * @param counts
+ * Pointer of the output packet count or byte count array of the top-k keys.
+ * @return
+ * Return -EINVAL for invalid parameters. Return a positive integer indicate
+ * how many heavy hitters are reported.
+ */
+__rte_experimental
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **keys, uint64_t *counts);
+
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice
diff --git a/lib/member/rte_member_heap.h b/lib/member/rte_member_heap.h
new file mode 100644
index 0000000000..3ced34160a
--- /dev/null
+++ b/lib/member/rte_member_heap.h
@@ -0,0 +1,424 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#ifndef _RTE_MEMBER_HEAP_H_
+#define _RTE_MEMBER_HEAP_H_
+
+#include <rte_ring_elem.h>
+#include "rte_member.h"
+
+#define LCHILD(x) (2 * x + 1)
+#define RCHILD(x) (2 * x + 2)
+#define PARENT(x) ((x - 1) / 2)
+
+#define HASH_BKT_SIZE 16
+#define HASH_HP_MULTI 4
+#define HASH_RESIZE_MULTI 2
+
+struct hash_bkt {
+ uint16_t sig[HASH_BKT_SIZE];
+ uint16_t idx[HASH_BKT_SIZE];
+};
+
+struct hash {
+ uint16_t bkt_cnt;
+ uint16_t num_item;
+ uint32_t seed;
+ struct hash_bkt buckets[0];
+};
+
+struct node {
+ void *key;
+ uint64_t count;
+};
+
+struct minheap {
+ uint32_t key_len;
+ uint32_t size;
+ uint32_t socket;
+ struct hash *hashtable;
+ struct node *elem;
+};
+
+static int
+hash_table_insert(const void *key, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].idx[i] == 0) {
+ table->buckets[idx].idx[i] = value;
+ table->buckets[idx].sig[i] = sig;
+ table->num_item++;
+ return 0;
+ }
+ }
+
+ return -ENOMEM;
+}
+
+static int
+hash_table_update(const void *key, int old_value, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == old_value) {
+ table->buckets[idx].idx[i] = value;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_del(const void *key, uint16_t value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == value) {
+ table->buckets[idx].idx[i] = 0;
+ table->num_item--;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_lookup(const void *key, int key_len, struct minheap *hp)
+{
+ struct hash *table = hp->hashtable;
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] != 0) {
+ uint32_t hp_idx = table->buckets[idx].idx[i] - 1;
+
+ if (memcmp(hp->elem[hp_idx].key, key, hp->key_len) == 0)
+ return hp_idx;
+ }
+ }
+
+ return -ENOENT; /* key doesn't exist */
+}
+
+static int
+resize_hash_table(struct minheap *hp)
+{
+ uint32_t i;
+ uint32_t new_bkt_cnt;
+
+ while (1) {
+ new_bkt_cnt = hp->hashtable->bkt_cnt * HASH_RESIZE_MULTI;
+
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT load factor is [%f]\n",
+ hp->hashtable->num_item / ((float)hp->hashtable->bkt_cnt * HASH_BKT_SIZE));
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT resize happen!\n");
+ rte_free(hp->hashtable);
+ hp->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ new_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, hp->socket);
+
+ if (hp->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ return -ENOMEM;
+ }
+
+ hp->hashtable->bkt_cnt = new_bkt_cnt;
+
+ for (i = 0; i < hp->size; ++i) {
+ if (hash_table_insert(hp->elem[i].key,
+ i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR,
+ "Sketch Minheap HT resize insert fail!\n");
+ break;
+ }
+ }
+ if (i == hp->size)
+ break;
+ }
+
+ return 0;
+}
+
+/* find the item in the given minheap */
+static int
+rte_member_minheap_find(struct minheap *hp, const void *key)
+{
+ int idx = hash_table_lookup(key, hp->key_len, hp);
+ return idx;
+}
+
+static int
+rte_member_minheap_init(struct minheap *heap, int size,
+ uint32_t socket, uint32_t seed)
+{
+ heap->elem = rte_zmalloc_socket(NULL, sizeof(struct node) * size,
+ RTE_CACHE_LINE_SIZE, socket);
+ if (heap->elem == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap elem allocation failed\n");
+ return -ENOMEM;
+ }
+
+ uint32_t hash_bkt_cnt = rte_align32pow2(size * HASH_HP_MULTI) / HASH_BKT_SIZE;
+
+ if (hash_bkt_cnt == 0)
+ hash_bkt_cnt = 1;
+
+ heap->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ hash_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, socket);
+
+ if (heap->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ rte_free(heap->elem);
+ return -ENOMEM;
+ }
+
+ heap->hashtable->seed = seed;
+ heap->hashtable->bkt_cnt = hash_bkt_cnt;
+ heap->socket = socket;
+
+ return 0;
+}
+
+/* swap the minheap nodes */
+static __rte_always_inline void
+rte_member_heap_swap(struct node *n1, struct node *n2)
+{
+ struct node temp = *n1;
+ *n1 = *n2;
+ *n2 = temp;
+}
+
+/* heapify function */
+static void
+rte_member_heapify(struct minheap *hp, uint32_t idx, bool update_hash)
+{
+ uint32_t smallest;
+
+ if (LCHILD(idx) < hp->size &&
+ hp->elem[LCHILD(idx)].count < hp->elem[idx].count)
+ smallest = LCHILD(idx);
+ else
+ smallest = idx;
+
+ if (RCHILD(idx) < hp->size &&
+ hp->elem[RCHILD(idx)].count < hp->elem[smallest].count)
+ smallest = RCHILD(idx);
+
+ if (smallest != idx) {
+ rte_member_heap_swap(&(hp->elem[idx]), &(hp->elem[smallest]));
+
+ if (update_hash) {
+ if (hash_table_update(hp->elem[smallest].key, idx + 1, smallest + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+
+ if (hash_table_update(hp->elem[idx].key, smallest + 1, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+ }
+ rte_member_heapify(hp, smallest, update_hash);
+ }
+}
+
+/* insert a node into the minheap */
+static int
+rte_member_minheap_insert_node(struct minheap *hp, const void *key,
+ int counter, void *key_slot,
+ struct rte_ring *free_key_slot)
+{
+ struct node nd;
+ uint32_t slot_id;
+
+ if (rte_ring_sc_dequeue_elem(free_key_slot, &slot_id, sizeof(uint32_t)) != 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap get empty keyslot failed\n");
+ return -1;
+ }
+
+ nd.count = counter;
+ nd.key = RTE_PTR_ADD(key_slot, slot_id * hp->key_len);
+
+ memcpy(nd.key, key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* delete a key from the minheap */
+static int
+rte_member_minheap_delete_node(struct minheap *hp, const void *key,
+ void *key_slot, struct rte_ring *free_key_slot)
+{
+ int idx = rte_member_minheap_find(hp, key);
+ uint32_t offset = RTE_PTR_DIFF(hp->elem[idx].key, key_slot) / hp->key_len;
+
+ if (hash_table_del(key, idx + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ rte_ring_sp_enqueue_elem(free_key_slot, &offset, sizeof(uint32_t));
+
+ if (idx == (int)(hp->size - 1)) {
+ hp->size--;
+ return 0;
+ }
+
+ hp->elem[idx] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[idx].key, hp->size, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+ rte_member_heapify(hp, idx, true);
+
+ return 0;
+}
+
+/* replace a min node with a new key. */
+static int
+rte_member_minheap_replace_node(struct minheap *hp,
+ const void *new_key,
+ int new_counter)
+{
+ struct node nd;
+ void *recycle_key = NULL;
+
+ recycle_key = hp->elem[0].key;
+
+ if (hash_table_del(recycle_key, 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ hp->elem[0] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[0].key, hp->size, 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+
+ rte_member_heapify(hp, 0, true);
+
+ nd.count = new_counter;
+ nd.key = recycle_key;
+
+ memcpy(nd.key, new_key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(new_key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace insert failed\n");
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* sort the heap into a descending array */
+static void
+rte_member_heapsort(struct minheap *hp, struct node *result_array)
+{
+ struct minheap new_hp;
+
+ /* build a new heap for using the given array */
+ new_hp.size = hp->size;
+ new_hp.key_len = hp->key_len;
+ new_hp.elem = result_array;
+ memcpy(result_array, hp->elem, hp->size * sizeof(struct node));
+
+ /* sort the new heap */
+ while (new_hp.size > 1) {
+ rte_member_heap_swap(&(new_hp.elem[0]), &(new_hp.elem[new_hp.size - 1]));
+ new_hp.size--;
+ rte_member_heapify(&new_hp, 0, false);
+ }
+}
+
+static void
+rte_member_minheap_free(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ rte_free(hp->elem);
+ rte_free(hp->hashtable);
+}
+
+static void
+rte_member_minheap_reset(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ memset(hp->elem, 0, sizeof(struct node) * hp->size);
+ hp->size = 0;
+
+ memset((char *)hp->hashtable + sizeof(struct hash), 0,
+ hp->hashtable->bkt_cnt * sizeof(struct hash_bkt));
+ hp->hashtable->num_item = 0;
+}
+
+#endif /* _RTE_MEMBER_HEAP_H_ */
diff --git a/lib/member/rte_member_sketch.c b/lib/member/rte_member_sketch.c
new file mode 100644
index 0000000000..524ba77620
--- /dev/null
+++ b/lib/member/rte_member_sketch.c
@@ -0,0 +1,594 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#include <math.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_errno.h>
+#include <rte_log.h>
+#include <rte_random.h>
+#include <rte_prefetch.h>
+#include <rte_ring_elem.h>
+
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+#include "rte_member_heap.h"
+
+#ifdef CC_AVX512_SUPPORT
+#include "rte_member_sketch_avx512.h"
+#endif /* CC_AVX512_SUPPORT */
+
+struct sketch_runtime {
+ uint64_t pkt_cnt;
+ uint32_t until_next;
+ int converged;
+ struct minheap heap;
+ struct node *report_array;
+ void *key_slots;
+ struct rte_ring *free_key_slots;
+} __rte_cache_aligned;
+
+/*
+ * Geometric sampling to calculate how many packets needs to be
+ * skipped until next update. This method can mitigate the CPU
+ * overheads compared with coin-toss sampling.
+ */
+static uint32_t
+draw_geometric(const struct rte_member_setsum *ss)
+{
+ double rand = 1;
+
+ if (ss->sample_rate == 1)
+ return 1;
+
+ while (rand == 1 || rand == 0)
+ rand = (double) rte_rand() / (double)(RTE_RAND_MAX);
+
+ return (uint32_t)ceil(log(1 - rand) / log(1 - ss->sample_rate));
+}
+
+static void
+isort(uint64_t *array, int n)
+{
+ int i;
+
+ for (i = 1; i < n; i++) {
+ uint64_t t = array[i];
+ int j;
+
+ for (j = i - 1; j >= 0; j--) {
+ if (t < array[j])
+ array[j + 1] = array[j];
+ else
+ break;
+ }
+ array[j + 1] = t;
+ }
+}
+
+static __rte_always_inline void
+swap(uint64_t *a, uint64_t *b)
+{
+ uint64_t tmp = *a;
+ *a = *b;
+ *b = tmp;
+}
+
+static uint64_t
+medianof5(uint64_t a, uint64_t b, uint64_t c, uint64_t d, uint64_t e)
+{
+ if (a > b)
+ swap(&a, &b);
+ if (c > d)
+ swap(&c, &d);
+ if (a > c) {
+ if (d > e)
+ swap(&c, &e);
+ else {
+ swap(&c, &d);
+ swap(&d, &e);
+ }
+ } else {
+ if (b > e)
+ swap(&a, &e);
+ else {
+ swap(&a, &b);
+ swap(&b, &e);
+ }
+ }
+
+ if (a > c)
+ return a > d ? d : a;
+ else
+ return b > c ? c : b;
+}
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *ring)
+{
+ struct sketch_runtime *runtime;
+ uint32_t num_col;
+ uint32_t i;
+
+ if (params->sample_rate == 0 || params->sample_rate > 1) {
+ rte_errno = EINVAL;
+ RTE_MEMBER_LOG(ERR,
+ "Membership Sketch created with invalid parameters\n");
+ return -EINVAL;
+ }
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_COUNT_BYTE)
+ ss->count_byte = 1;
+
+#ifdef RTE_ARCH_X86
+ if (ss->count_byte == 1 &&
+ rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512IFMA) == 1) {
+#ifdef CC_AVX512_SUPPORT
+ ss->use_avx512 = true;
+#else
+ ss->use_avx512 = false;
+#endif
+ }
+
+ if (ss->use_avx512 == true) {
+#ifdef CC_AVX512_SUPPORT
+ ss->num_row = NUM_ROW_VEC;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch AVX512 update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_avx512;
+ ss->sketch_lookup = sketch_lookup_avx512;
+ ss->sketch_delete = sketch_delete_avx512;
+#endif
+ } else
+#endif
+ {
+ ss->num_row = NUM_ROW_SCALAR;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch SCALAR update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_scalar;
+ ss->sketch_lookup = sketch_lookup_scalar;
+ ss->sketch_delete = sketch_delete_scalar;
+ }
+
+ ss->socket_id = params->socket_id;
+
+ if (ss->count_byte == 0)
+ num_col = 4.0 / params->error_rate / params->sample_rate;
+#ifdef RTE_ARCH_X86
+ else if (ss->use_avx512 == true)
+ num_col = rte_align32pow2(4.0 / params->error_rate);
+#endif
+ else
+ num_col = 4.0 / params->error_rate;
+
+ ss->table = rte_zmalloc_socket(NULL,
+ sizeof(uint64_t) * num_col * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->table == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Table memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->hash_seeds = rte_zmalloc_socket(NULL, sizeof(uint64_t) * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->hash_seeds == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Hashseeds memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->runtime_var = rte_zmalloc_socket(NULL, sizeof(struct sketch_runtime),
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->runtime_var == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime memory allocation failed\n");
+ rte_free(ss);
+ return -ENOMEM;
+ }
+ runtime = ss->runtime_var;
+
+ ss->num_col = num_col;
+ ss->sample_rate = params->sample_rate;
+ ss->prim_hash_seed = params->prim_hash_seed;
+ ss->sec_hash_seed = params->sec_hash_seed;
+ ss->error_rate = params->error_rate;
+ ss->topk = params->top_k;
+ ss->key_len = params->key_len;
+ runtime->heap.key_len = ss->key_len;
+
+ runtime->key_slots = rte_zmalloc_socket(NULL, ss->key_len * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->key_slots == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Key Slots allocation failed\n");
+ goto error;
+ }
+
+ runtime->free_key_slots = ring;
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime->free_key_slots,
+ &i, sizeof(uint32_t));
+
+ if (rte_member_minheap_init(&(runtime->heap), params->top_k,
+ ss->socket_id, params->prim_hash_seed) < 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap allocation failed\n");
+ goto error_runtime;
+ }
+
+ runtime->report_array = rte_zmalloc_socket(NULL, sizeof(struct node) * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->report_array == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime Report Array allocation failed\n");
+ goto error_runtime;
+ }
+
+ rte_srand(ss->prim_hash_seed);
+ for (i = 0; i < ss->num_row; i++)
+ ss->hash_seeds[i] = rte_rand();
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_ALWAYS_BOUNDED)
+ ss->always_bounded = 1;
+
+ if (ss->always_bounded) {
+ double delta = 1.0 / (pow(2, ss->num_row));
+
+ ss->converge_thresh = 10 * pow(ss->error_rate, -2.0) * sqrt(log(1 / delta));
+ }
+
+ RTE_MEMBER_LOG(DEBUG, "Sketch created, "
+ "the total memory required is %u Bytes\n", ss->num_col * ss->num_row * 8);
+
+ return 0;
+
+error_runtime:
+ rte_member_minheap_free(&runtime->heap);
+ rte_ring_free(runtime->free_key_slots);
+ rte_free(runtime->key_slots);
+error:
+ rte_free(runtime);
+ rte_free(ss);
+
+ return -ENOMEM;
+}
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col[ss->num_row];
+ uint64_t count_row[ss->num_row];
+ uint32_t cur_row;
+ uint64_t count;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ rte_prefetch0(&count_array[cur_row * ss->num_col + col[cur_row]]);
+ }
+
+ /* if sample rate is 1, it is a regular count-min, we report the min */
+ if (ss->sample_rate == 1 || ss->count_byte == 1)
+ return count_min(ss, col);
+
+ memset(count_row, 0, sizeof(uint64_t) * ss->num_row);
+
+ /* otherwise we report the median number */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_row[cur_row] = count_array[cur_row * ss->num_col + col[cur_row]];
+
+ if (ss->num_row == 5)
+ return medianof5(count_row[0], count_row[1],
+ count_row[2], count_row[3], count_row[4]);
+
+ isort(count_row, ss->num_row);
+
+ if (ss->num_row % 2 == 0) {
+ count = (count_row[ss->num_row / 2] + count_row[ss->num_row / 2 - 1]) / 2;
+ return count;
+ }
+ /* ss->num_row % 2 != 0 */
+ count = count_row[ss->num_row / 2];
+
+ return count;
+}
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* set corresponding counter to 0 */
+ count_array[cur_row * ss->num_col + col[cur_row]] = 0;
+ }
+}
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ uint64_t *output)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ *output = count;
+
+ return 0;
+}
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ uint64_t count = ss->sketch_lookup(ss, runtime_var->heap.elem[i].key);
+
+ runtime_var->heap.elem[i].count = count;
+ }
+}
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key,
+ uint64_t *count)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = setsum->runtime_var;
+
+ rte_member_update_heap(setsum);
+ rte_member_heapsort(&(runtime_var->heap), runtime_var->report_array);
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ key[i] = runtime_var->report_array[i].key;
+ count[i] = runtime_var->report_array[i].count;
+ }
+
+ return runtime_var->heap.size;
+}
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *ss,
+ const void *key, member_set_t *set_id)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ if (runtime_var->heap.size > 0 && count >= runtime_var->heap.elem[0].count)
+ *set_id = 1;
+ else
+ *set_id = 0;
+
+ if (count == 0)
+ return 0;
+ else
+ return 1;
+}
+
+static void
+should_converge(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ /* For count min sketch - L1 norm */
+ if (runtime_var->pkt_cnt > ss->converge_thresh) {
+ runtime_var->converged = 1;
+ RTE_MEMBER_LOG(DEBUG, "Sketch converged, begin sampling "
+ "from key count %"PRIu64"\n",
+ runtime_var->pkt_cnt);
+ }
+}
+
+static void
+sketch_update_row(const struct rte_member_setsum *ss, const void *key,
+ uint32_t count, uint32_t cur_row)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* sketch counter update */
+ count_array[cur_row * ss->num_col + col] +=
+ ceil(count / (ss->sample_rate));
+}
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+ count_array[cur_row * ss->num_col + col] += count;
+ }
+}
+
+static void
+heap_update(const struct rte_member_setsum *ss, const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t key_cnt = 0;
+ int found;
+
+ /* We also update the heap for this key */
+ key_cnt = ss->sketch_lookup(ss, key);
+ if (key_cnt > runtime_var->heap.elem[0].count) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ /* the key is found in the top-k heap */
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else if (runtime_var->heap.size < ss->topk) {
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ } else {
+ rte_member_minheap_replace_node(&runtime_var->heap, key, key_cnt);
+ }
+ } else if (runtime_var->heap.size < ss->topk) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ }
+}
+
+/*
+ * Add a single packet into the sketch.
+ * Sketch value is meatured by packet numbers in this mode.
+ */
+int
+rte_member_add_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ __rte_unused member_set_t set_id)
+{
+ uint32_t cur_row;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /*
+ * If sketch is measured by byte count,
+ * the rte_member_add_sketch_byte_count routine should be used.
+ */
+ if (ss->count_byte == 1) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Byte Mode, "
+ "should use rte_member_add_byte_count()!\n");
+ return -EINVAL;
+ }
+
+ if (ss->sample_rate == 1) {
+ ss->sketch_update(ss, key, 1);
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* convergence stage if it's needed */
+ if (ss->always_bounded && !runtime_var->converged) {
+ ss->sketch_update(ss, key, 1);
+
+ if (!((++runtime_var->pkt_cnt) & (INTERVAL - 1)))
+ should_converge(ss);
+
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* should we skip this packet */
+ if (*until_next >= ss->num_row) {
+ *until_next -= ss->num_row;
+ return 0;
+ }
+ cur_row = *until_next;
+ do {
+ sketch_update_row(ss, key, 1, cur_row);
+ *until_next = draw_geometric(ss);
+ if (cur_row + *until_next >= ss->num_row)
+ break;
+ cur_row += *until_next;
+ } while (1);
+
+ *until_next -= (ss->num_row - cur_row);
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+/*
+ * Add the byte count of the packet into the sketch.
+ * Sketch value is meatured by byte count numbers in this mode.
+ */
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t byte_count)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /* should not call this API if not in count byte mode */
+ if (ss->count_byte == 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Pkt Mode, "
+ "should use rte_member_add()!\n");
+ return -EINVAL;
+ }
+
+ /* there's specific optimization for the sketch update */
+ ss->sketch_update(ss, key, byte_count);
+
+ if (*until_next != 0) {
+ *until_next = *until_next - 1;
+ return 0;
+ }
+
+ *until_next = draw_geometric(ss) - 1;
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *ss,
+ const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ int found;
+
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found < 0)
+ return -1;
+
+ ss->sketch_delete(ss, key);
+
+ return rte_member_minheap_delete_node
+ (&runtime_var->heap, key, runtime_var->key_slots, runtime_var->free_key_slots);
+}
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ rte_free(ss->table);
+ rte_member_minheap_free(&runtime_var->heap);
+ rte_free(runtime_var->key_slots);
+ rte_ring_free(runtime_var->free_key_slots);
+ rte_free(runtime_var);
+}
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t *sketch = ss->table;
+ uint32_t i;
+
+ memset(sketch, 0, sizeof(uint64_t) * ss->num_col * ss->num_row);
+ rte_member_minheap_reset(&runtime_var->heap);
+ rte_ring_reset(runtime_var->free_key_slots);
+
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime_var->free_key_slots, &i, sizeof(uint32_t));
+}
diff --git a/lib/member/rte_member_sketch.h b/lib/member/rte_member_sketch.h
new file mode 100644
index 0000000000..219323008b
--- /dev/null
+++ b/lib/member/rte_member_sketch.h
@@ -0,0 +1,97 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_H_
+#define _RTE_MEMBER_SKETCH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_vect.h>
+#include <rte_ring_elem.h>
+
+#define NUM_ROW_SCALAR 5
+#define INTERVAL (1 << 15)
+
+#if !RTE_IS_POWER_OF_2(INTERVAL)
+#error sketch INTERVAL macro must be a power of 2
+#endif
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *r);
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *setsum,
+ const void *key, member_set_t *set_id);
+
+int
+rte_member_add_sketch(const struct rte_member_setsum *setsum,
+ const void *key,
+ member_set_t set_id);
+
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key, uint32_t byte_count);
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *setsum,
+ const void *key);
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output);
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss);
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *setsum);
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count);
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss);
+
+static __rte_always_inline uint64_t
+count_min(const struct rte_member_setsum *ss, const uint32_t *hash_results)
+{
+ uint64_t *count_array = ss->table;
+ uint64_t count;
+ uint32_t cur_row;
+ uint64_t min = UINT64_MAX;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ uint64_t cnt = count_array[cur_row * ss->num_col + hash_results[cur_row]];
+
+ if (cnt < min)
+ min = cnt;
+ }
+ count = min;
+
+ return count;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_H_ */
diff --git a/lib/member/rte_member_sketch_avx512.c b/lib/member/rte_member_sketch_avx512.c
new file mode 100644
index 0000000000..288e37a446
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.c
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include "rte_xxh64_avx512.h"
+#include "rte_member_sketch_avx512.h"
+
+__rte_always_inline void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t num_col = ss->num_col;
+ uint32_t key_len = ss->key_len;
+ __m256i v_row_base;
+ __m256i v_hash_result;
+ __m512i current_sketch;
+ __m512i updated_sketch;
+ __m512i v_count;
+
+ const __m256i v_idx = _mm256_set_epi32(7, 6, 5, 4, 3, 2, 1, 0);
+ const __m256i v_col = _mm256_set1_epi32(num_col);
+
+ /* compute the hash result parallelly */
+ v_hash_result = rte_xxh64_sketch_avx512
+ (key, key_len, *(__m512i *)ss->hash_seeds, num_col);
+ v_row_base = _mm256_mullo_epi32(v_idx, v_col);
+ v_hash_result = _mm256_add_epi32(v_row_base, v_hash_result);
+
+ current_sketch =
+ _mm512_i32gather_epi64(v_hash_result, count_array, 8);
+ v_count = _mm512_set1_epi64(count);
+ updated_sketch = _mm512_add_epi64(current_sketch, v_count);
+ _mm512_i32scatter_epi64
+ ((void *)count_array, v_hash_result, updated_sketch, 8);
+}
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+
+ /* currently only for sketch byte count mode */
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds, ss->num_col);
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ return count_min(ss, col);
+}
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint64_t min = UINT64_MAX;
+ uint32_t cur_row;
+
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds,
+ RTE_ALIGN_FLOOR(ss->num_col, 32));
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ min = count_min(ss, col);
+
+ /* subtract the min value from all the counters */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_array[cur_row * ss->num_col + col[cur_row]] -= min;
+}
diff --git a/lib/member/rte_member_sketch_avx512.h b/lib/member/rte_member_sketch_avx512.h
new file mode 100644
index 0000000000..68105cd895
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_AVX512_H_
+#define _RTE_MEMBER_SKETCH_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_vect.h>
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+
+#define NUM_ROW_VEC 8
+
+void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_AVX512_H_ */
diff --git a/lib/member/rte_xxh64_avx512.h b/lib/member/rte_xxh64_avx512.h
new file mode 100644
index 0000000000..50ca1b52c7
--- /dev/null
+++ b/lib/member/rte_xxh64_avx512.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_XXH64_AVX512_H_
+#define _RTE_XXH64_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <immintrin.h>
+
+/* 0b1001111000110111011110011011000110000101111010111100101010000111 */
+static const uint64_t PRIME64_1 = 0x9E3779B185EBCA87ULL;
+/* 0b1100001010110010101011100011110100100111110101001110101101001111 */
+static const uint64_t PRIME64_2 = 0xC2B2AE3D27D4EB4FULL;
+/* 0b0001011001010110011001111011000110011110001101110111100111111001 */
+static const uint64_t PRIME64_3 = 0x165667B19E3779F9ULL;
+/* 0b1000010111101011110010100111011111000010101100101010111001100011 */
+static const uint64_t PRIME64_4 = 0x85EBCA77C2B2AE63ULL;
+/* 0b0010011111010100111010110010111100010110010101100110011111000101 */
+static const uint64_t PRIME64_5 = 0x27D4EB2F165667C5ULL;
+
+static __rte_always_inline __m512i
+xxh64_round_avx512(__m512i hash, __m512i input)
+{
+ hash = _mm512_madd52lo_epu64(hash,
+ input,
+ _mm512_set1_epi64(PRIME64_2));
+
+ hash = _mm512_rol_epi64(hash, 31);
+
+ return hash;
+}
+
+static __rte_always_inline __m512i
+xxh64_fmix_avx512(__m512i hash)
+{
+ hash = _mm512_xor_si512(hash, _mm512_srli_epi64(hash, 33));
+
+ return hash;
+}
+
+static __rte_always_inline __m256i
+rte_xxh64_sketch_avx512(const void *key, uint32_t key_len,
+ __m512i v_seed, uint32_t modulo)
+{
+ __m512i v_prime64_5, v_hash;
+ size_t remaining = key_len;
+ size_t offset = 0;
+ __m512i input;
+
+ v_prime64_5 = _mm512_set1_epi64(PRIME64_5);
+ v_hash = _mm512_add_epi64
+ (_mm512_add_epi64(v_seed, v_prime64_5),
+ _mm512_set1_epi64(key_len));
+
+ while (remaining >= 8) {
+ input = _mm512_set1_epi64(*(uint64_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ xxh64_round_avx512(_mm512_setzero_si512(), input));
+ v_hash = _mm512_madd52lo_epu64(_mm512_set1_epi64(PRIME64_4),
+ v_hash,
+ _mm512_set1_epi64(PRIME64_1));
+
+ remaining -= 8;
+ offset += 8;
+ }
+
+ if (remaining >= 4) {
+ input = _mm512_set1_epi64
+ (*(uint32_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_1)));
+ v_hash = _mm512_madd52lo_epu64
+ (_mm512_set1_epi64(PRIME64_3),
+ _mm512_rol_epi64(v_hash, 23),
+ _mm512_set1_epi64(PRIME64_2));
+
+ offset += 4;
+ remaining -= 4;
+ }
+
+ while (remaining != 0) {
+ input = _mm512_set1_epi64
+ (*(uint8_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_5)));
+ v_hash = _mm512_mullo_epi64
+ (_mm512_rol_epi64(v_hash, 11),
+ _mm512_set1_epi64(PRIME64_1));
+ offset++;
+ remaining--;
+ }
+
+ v_hash = xxh64_fmix_avx512(v_hash);
+
+ /*
+ * theoritically, such modular operations can be replaced by
+ * _mm512_rem_epi64(), but seems it depends on the compiler's
+ * implementation. so here is the limitation that the modulo
+ * value should be power of 2.
+ */
+ __m512i v_hash_remainder = _mm512_set1_epi64((modulo - 1));
+
+ return _mm512_cvtepi64_epi32(_mm512_and_si512(v_hash, v_hash_remainder));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_XXH64_AVX512_H_ */
diff --git a/lib/member/version.map b/lib/member/version.map
index 19469c6aba..35199270ff 100644
--- a/lib/member/version.map
+++ b/lib/member/version.map
@@ -14,3 +14,12 @@ DPDK_23 {
local: *;
};
+
+EXPERIMENTAL {
+ global:
+
+ # added in 22.11
+ rte_member_add_byte_count;
+ rte_member_query_count;
+ rte_member_report_heavyhitter;
+};
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v4 2/2] test/member: add functional and perf tests for sketch
2022-09-15 13:03 ` [PATCH v4 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-09-15 13:03 ` [PATCH v4 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-09-15 13:03 ` Leyi Rong
1 sibling, 0 replies; 34+ messages in thread
From: Leyi Rong @ 2022-09-15 13:03 UTC (permalink / raw)
To: ferruh.yigit, david.marchand, suanmingm, yipeng1.wang,
zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
This patch adds functional and performance tests for sketch mode of
membership library.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
app/test/test_member.c | 272 ++++++++++++++++++++++++++++++++++++
app/test/test_member_perf.c | 153 +++++++++++++++++++-
2 files changed, 421 insertions(+), 4 deletions(-)
diff --git a/app/test/test_member.c b/app/test/test_member.c
index 26a712439f..8266e6437b 100644
--- a/app/test/test_member.c
+++ b/app/test/test_member.c
@@ -4,6 +4,7 @@
/* This test is for membership library's simple feature test */
+#include <math.h>
#include "test.h"
#include <rte_memcpy.h>
@@ -28,6 +29,7 @@ test_member(void)
struct rte_member_setsum *setsum_ht;
struct rte_member_setsum *setsum_cache;
struct rte_member_setsum *setsum_vbf;
+struct rte_member_setsum *setsum_sketch;
/* 5-tuple key type */
struct flow_key {
@@ -108,6 +110,21 @@ static struct rte_member_parameters params = {
.socket_id = 0 /* NUMA Socket ID for memory. */
};
+/* for sketch definitions */
+#define TOP_K 10
+#define HH_PKT_SIZE 16
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define PRINT_OUT_COUNT 20
+
+#define SKETCH_LARGEST_KEY_SIZE 1000000
+#define SKETCH_TOTAL_KEY 500
+#define NUM_OF_KEY(key) {\
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (key + 1)) \
+}
+
+void *heavy_hitters[TOP_K];
+
/*
* Sequence of operations for find existing setsummary
*
@@ -684,6 +701,257 @@ perform_free(void)
rte_member_free(setsum_vbf);
}
+static void
+print_out_sketch_results(uint64_t *count_result, member_set_t *heavy_set,
+ uint32_t print_num, bool count_byte)
+{
+ uint32_t i;
+
+ for (i = 0; i < print_num; i++) {
+ if (count_byte)
+ printf("key %2u, count %8"PRIu64", real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)) *
+ HH_PKT_SIZE,
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i) * HH_PKT_SIZE) /
+ ((float)NUM_OF_KEY(i) * HH_PKT_SIZE));
+ else
+ printf("key %2u, count %8"PRIu64", real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)),
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i)) /
+ (float)NUM_OF_KEY(i));
+ }
+}
+
+static int
+sketch_test(uint32_t *keys, uint32_t total_pkt, int count_byte, int reset_test)
+{
+ uint32_t i;
+ uint64_t result_count[SKETCH_TOTAL_KEY];
+ member_set_t heavy_set[SKETCH_TOTAL_KEY];
+ uint64_t count[TOP_K];
+ int ret;
+ int hh_cnt;
+
+ setsum_sketch = rte_member_create(¶ms);
+ if (setsum_sketch == NULL) {
+ printf("Creation of setsums fail\n");
+ return -1;
+ }
+
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count(setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ if (reset_test) {
+ printf("\nEntering Sketch Reset Test Process!\n");
+ rte_member_reset(setsum_sketch);
+
+ /* after reset, check some key's count */
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ printf("\nReinsert keys after Sketch Reset!\n");
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count
+ (setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ printf("\nDelete some keys!\n");
+ uint32_t tmp_key = 0;
+
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+ tmp_key = 1;
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+ }
+
+ rte_member_free(setsum_sketch);
+ return 0;
+}
+
+static int
+test_member_sketch(void)
+{
+ unsigned int i, j, index;
+ uint32_t total_pkt = 0;
+ uint32_t *keys;
+ int count_byte = 0;
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++)
+ total_pkt += ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1));
+
+ printf("\nTotal key count [%u] in Sketch Autotest\n", total_pkt);
+
+ keys = rte_zmalloc(NULL, sizeof(uint32_t) * total_pkt, 0);
+
+ if (keys == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ return -1;
+ }
+
+ index = 0;
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ for (j = 0; j < ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)); j++)
+ keys[index++] = i;
+ }
+
+ /* shuffle the keys */
+ for (i = index - 1; i > 0; i--) {
+ uint32_t swap_idx = rte_rand() % i;
+ uint32_t tmp_key = keys[i];
+
+ keys[i] = keys[swap_idx];
+ keys[swap_idx] = tmp_key;
+ }
+
+ params.key_len = 4;
+ params.name = "test_member_sketch";
+ params.type = RTE_MEMBER_TYPE_SKETCH;
+ params.error_rate = SKETCH_ERROR_RATE;
+ params.sample_rate = SKETCH_SAMPLE_RATE;
+ params.extra_flag = 0;
+ params.top_k = TOP_K;
+ params.prim_hash_seed = rte_rdtsc();
+ int reset_test = 0;
+
+ printf("Default sketching params: Error Rate: [%f]\tSample Rate: [%f]\tTopK: [%d]\n",
+ SKETCH_ERROR_RATE, SKETCH_SAMPLE_RATE, TOP_K);
+
+ printf("\n[Sketch with Fixed Sampling Rate Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ printf("\n[Sketch with Always Bounded Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ count_byte = 1;
+ params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ printf("\n[Sketch with Packet Size Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ count_byte = 0;
+ params.extra_flag = 0;
+ reset_test = 1;
+ printf("\nreset sketch test\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ rte_free(keys);
+ return 0;
+}
+
static int
test_member(void)
{
@@ -719,6 +987,10 @@ test_member(void)
return -1;
}
+ if (test_member_sketch() < 0) {
+ perform_free();
+ return -1;
+ }
perform_free();
return 0;
}
diff --git a/app/test/test_member_perf.c b/app/test/test_member_perf.c
index 978db0020a..b7eb3e4c66 100644
--- a/app/test/test_member_perf.c
+++ b/app/test/test_member_perf.c
@@ -13,6 +13,7 @@
#include <rte_random.h>
#include <rte_memcpy.h>
#include <rte_thash.h>
+#include <math.h>
#ifdef RTE_EXEC_ENV_WINDOWS
static int
@@ -26,7 +27,7 @@ test_member_perf(void)
#include <rte_member.h>
-#define NUM_KEYSIZES 10
+#define NUM_KEYSIZES RTE_DIM(hashtest_key_lens)
#define NUM_SHUFFLES 10
#define MAX_KEYSIZE 64
#define MAX_ENTRIES (1 << 19)
@@ -36,12 +37,23 @@ test_member_perf(void)
#define BURST_SIZE 64
#define VBF_FALSE_RATE 0.03
+/* for the heavy hitter detection */
+#define SKETCH_LARGEST_KEY_SIZE (1<<15)
+#define SKETCH_PKT_SIZE 16
+#define TOP_K 100
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define NUM_ADDS (KEYS_TO_ADD * 20)
+
static unsigned int test_socket_id;
enum sstype {
HT = 0,
CACHE,
VBF,
+ SKETCH,
+ SKETCH_BOUNDED,
+ SKETCH_BYTE,
NUM_TYPE
};
@@ -88,6 +100,7 @@ static member_set_t data[NUM_TYPE][/* Array to store the data */KEYS_TO_ADD];
/* Array to store all input keys */
static uint8_t keys[KEYS_TO_ADD][MAX_KEYSIZE];
+static uint8_t hh_keys[KEYS_TO_ADD][MAX_KEYSIZE];
/* Shuffle the keys that have been added, so lookups will be totally random */
static void
@@ -136,6 +149,10 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
{
unsigned int i, j;
int num_duplicates;
+ int distinct_key = 0;
+ int count_down = SKETCH_LARGEST_KEY_SIZE;
+ uint32_t swap_idx;
+ uint8_t temp_key[MAX_KEYSIZE];
params->key_size = hashtest_key_lens[cycle];
params->cycle = cycle;
@@ -176,6 +193,22 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
/* Shuffle the random values again */
shuffle_input_keys(params);
+ for (i = 0; i < KEYS_TO_ADD; i++) {
+ if (count_down == 0) {
+ distinct_key++;
+ count_down = ceil(SKETCH_LARGEST_KEY_SIZE / (distinct_key + 1));
+ }
+ memcpy(hh_keys[i], keys[distinct_key], params->key_size);
+ count_down--;
+ }
+
+ for (i = KEYS_TO_ADD - 1; i > 0; i--) {
+ swap_idx = rte_rand() % i;
+ memcpy(temp_key, hh_keys[i], params->key_size);
+ memcpy(hh_keys[i], hh_keys[swap_idx], params->key_size);
+ memcpy(hh_keys[swap_idx], temp_key, params->key_size);
+ }
+
/* For testing miss lookup, we insert half and lookup the other half */
unsigned int entry_cnt, bf_key_cnt;
if (!miss) {
@@ -208,6 +241,44 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
params->setsum[VBF] = rte_member_create(&member_params);
if (params->setsum[VBF] == NULL)
fprintf(stderr, "VBF create fail\n");
+
+ member_params.name = "test_member_sketch";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag = 0;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_bounded";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BOUNDED] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BOUNDED] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_byte";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BYTE] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BYTE] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+
for (i = 0; i < NUM_TYPE; i++) {
if (params->setsum[i] == NULL)
return -1;
@@ -243,6 +314,39 @@ timed_adds(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_adds_sketch(struct member_perf_params *params, int type)
+{
+ const uint64_t start_tsc = rte_rdtsc();
+ unsigned int i, j, a;
+ int32_t ret;
+
+ for (i = 0; i < NUM_ADDS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ if (type == SKETCH_BYTE)
+ ret = rte_member_add_byte_count(params->setsum[type],
+ &hh_keys[j], SKETCH_PKT_SIZE);
+ else
+ ret = rte_member_add(params->setsum[type], &hh_keys[j], 1);
+ if (ret < 0) {
+ printf("Error %d in rte_member_add - key=0x", ret);
+ for (a = 0; a < params->key_size; a++)
+ printf("%02x", hh_keys[j][a]);
+ printf("type: %d\n", type);
+
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][ADD] = time_taken / NUM_ADDS;
+
+ return 0;
+}
+
static int
timed_lookups(struct member_perf_params *params, int type)
{
@@ -279,6 +383,36 @@ timed_lookups(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_lookups_sketch(struct member_perf_params *params, int type)
+{
+ unsigned int i, j;
+
+ false_data[type][params->cycle] = 0;
+
+ const uint64_t start_tsc = rte_rdtsc();
+ member_set_t result;
+ int ret;
+
+ for (i = 0; i < NUM_LOOKUPS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ ret = rte_member_lookup(params->setsum[type], &hh_keys[j],
+ &result);
+ if (ret < 0) {
+ printf("lookup wrong internally");
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][LOOKUP] = time_taken / NUM_LOOKUPS;
+
+ return 0;
+}
+
static int
timed_lookups_bulk(struct member_perf_params *params, int type)
{
@@ -531,7 +665,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_adds(¶ms, j) < 0)
return exit_with_fail("timed_adds", ¶ms,
@@ -562,6 +696,17 @@ run_all_tbl_perf_tests(void)
/* Print a dot to show progress on operations */
}
+
+ for (j = SKETCH; j < NUM_TYPE; j++) {
+ if (timed_adds_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_adds_sketch", ¶ms, i, j);
+
+ if (timed_lookups_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_lookups_sketch", ¶ms, i, j);
+ }
+
printf(".");
fflush(stdout);
@@ -574,7 +719,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_miss_lookup(¶ms, j) < 0)
return exit_with_fail("timed_miss_lookup",
¶ms, i, j);
@@ -605,7 +750,7 @@ run_all_tbl_perf_tests(void)
"fr_multi_bulk", "false_positive_rate");
/* Key size not influence False rate so just print out one key size */
for (i = 0; i < 1; i++) {
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
printf("%-18d", hashtest_key_lens[i]);
printf("%-18d", j);
printf("%-18f", (float)false_data[j][i] / NUM_LOOKUPS);
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 1/2] member: implement NitroSketch mode
2022-09-15 13:03 ` [PATCH v4 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-09-15 14:12 ` Bruce Richardson
2022-09-16 2:30 ` Rong, Leyi
0 siblings, 1 reply; 34+ messages in thread
From: Bruce Richardson @ 2022-09-15 14:12 UTC (permalink / raw)
To: Leyi Rong
Cc: ferruh.yigit, david.marchand, suanmingm, yipeng1.wang,
zaoxingliu, sameh.gobriel, dev
On Thu, Sep 15, 2022 at 09:03:41PM +0800, Leyi Rong wrote:
> Sketching algorithm provide high-fidelity approximate measurements and
> appears as a promising alternative to traditional approaches such as
> packet sampling.
>
> NitroSketch [1] is a software sketching framework that optimizes
> performance, provides accuracy guarantees, and supports a variety of
> sketches.
>
> This commit adds a new data structure called sketch into
> membership library. This new data structure is an efficient
> way to profile the traffic for heavy hitters. Also use min-heap
> structure to maintain the top-k flow keys.
>
> [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
> Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
> Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
> https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
>
> Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> ---
Couple of comments below on the meson.build file.
/Bruce
> lib/member/meson.build | 42 +-
> lib/member/rte_member.c | 75 ++++
> lib/member/rte_member.h | 154 ++++++-
> lib/member/rte_member_heap.h | 424 ++++++++++++++++++
> lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
> lib/member/rte_member_sketch.h | 97 +++++
> lib/member/rte_member_sketch_avx512.c | 70 +++
> lib/member/rte_member_sketch_avx512.h | 35 ++
> lib/member/rte_xxh64_avx512.h | 117 +++++
> lib/member/version.map | 9 +
> 10 files changed, 1613 insertions(+), 4 deletions(-)
> create mode 100644 lib/member/rte_member_heap.h
> create mode 100644 lib/member/rte_member_sketch.c
> create mode 100644 lib/member/rte_member_sketch.h
> create mode 100644 lib/member/rte_member_sketch_avx512.c
> create mode 100644 lib/member/rte_member_sketch_avx512.h
> create mode 100644 lib/member/rte_xxh64_avx512.h
>
> diff --git a/lib/member/meson.build b/lib/member/meson.build
> index e06fddc240..3df0fbc7a6 100644
> --- a/lib/member/meson.build
> +++ b/lib/member/meson.build
> @@ -7,6 +7,46 @@ if is_windows
> subdir_done()
> endif
>
> -sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c')
> +sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c', 'rte_member_sketch.c')
At this point, with 4 entries I'd suggest reworking this to a vertical
list.
sources = files(
'rte_member.c',
....
)
> headers = files('rte_member.h')
> deps += ['hash']
> +includes += include_directories('../hash', '../ring')
> +
Use dependencies rather than include-paths. Having a dependency on hash
means that your code will be compiled with the hash include paths already
present. Similarly you can add "ring" to the deps variable and drop this
line.
> +# compile AVX512 version if:
> +# we are building 64-bit binary AND binutils can generate proper code
I think this comment is no longer necessary. The condition itself is clear.
The following comment can be kept though as it explains what is being done
at a higher level than the code.
> +if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
> + # compile AVX512 version if either:
> + # a. we have AVX512 supported in minimum instruction set
> + # baseline
> + # b. it's not minimum instruction set, but supported by
> + # compiler
> + #
> + # in former case, just add avx512 C file to files list
> + # in latter case, compile c file to static lib, using correct
> + # compiler flags, and then have the .o file from static lib
> + # linked into main lib.
> +
> + #check if all required flags already enabled
> + sketch_avx512_flags = ['__AVX512F__', '__AVX512DQ__', '__AVX512IFMA__']
> +
> + sketch_avx512_on = true
> + foreach f:sketch_avx512_flags
> + if cc.get_define(f, args: machine_args) == ''
> + sketch_avx512_on = false
> + endif
> + endforeach
> +
> + if sketch_avx512_on == true
> + cflags += ['-DCC_AVX512_SUPPORT']
> + sources += files('rte_member_sketch_avx512.c')
> + elif cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
> + sketch_avx512_tmp = static_library('sketch_avx512_tmp',
> + 'rte_member_sketch_avx512.c',
> + include_directories: includes,
> + dependencies: static_rte_eal,
> + c_args: cflags +
> + ['-mavx512f', '-mavx512dq', '-mavx512ifma'])
> + objs += sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
> + cflags += ['-DCC_AVX512_SUPPORT']
Logic looks correct here now, but beware of your indentation. You are
mixing tabs and spaces. Meson files should have spaces only, no tabs.
> + endif
> +endif
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v4 1/2] member: implement NitroSketch mode
2022-09-15 14:12 ` Bruce Richardson
@ 2022-09-16 2:30 ` Rong, Leyi
0 siblings, 0 replies; 34+ messages in thread
From: Rong, Leyi @ 2022-09-16 2:30 UTC (permalink / raw)
To: Richardson, Bruce
Cc: ferruh.yigit, david.marchand, suanmingm, Wang, Yipeng1,
zaoxingliu, Gobriel, Sameh, dev
Hi Bruce,
The comments will be addressed in v5, thank you!
> -----Original Message-----
> From: Richardson, Bruce <bruce.richardson@intel.com>
> Sent: Thursday, September 15, 2022 10:13 PM
> To: Rong, Leyi <leyi.rong@intel.com>
> Cc: ferruh.yigit@xilinx.com; david.marchand@redhat.com;
> suanmingm@nvidia.com; Wang, Yipeng1 <yipeng1.wang@intel.com>;
> zaoxingliu@gmail.com; Gobriel, Sameh <sameh.gobriel@intel.com>;
> dev@dpdk.org
> Subject: Re: [PATCH v4 1/2] member: implement NitroSketch mode
>
> On Thu, Sep 15, 2022 at 09:03:41PM +0800, Leyi Rong wrote:
> > Sketching algorithm provide high-fidelity approximate measurements and
> > appears as a promising alternative to traditional approaches such as
> > packet sampling.
> >
> > NitroSketch [1] is a software sketching framework that optimizes
> > performance, provides accuracy guarantees, and supports a variety of
> > sketches.
> >
> > This commit adds a new data structure called sketch into membership
> > library. This new data structure is an efficient way to profile the
> > traffic for heavy hitters. Also use min-heap structure to maintain the
> > top-k flow keys.
> >
> > [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
> > Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
> > Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
> > https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
> >
> > Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
> > Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> > Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> > ---
>
> Couple of comments below on the meson.build file.
>
> /Bruce
>
> > lib/member/meson.build | 42 +-
> > lib/member/rte_member.c | 75 ++++
> > lib/member/rte_member.h | 154 ++++++-
> > lib/member/rte_member_heap.h | 424 ++++++++++++++++++
> > lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
> > lib/member/rte_member_sketch.h | 97 +++++
> > lib/member/rte_member_sketch_avx512.c | 70 +++
> > lib/member/rte_member_sketch_avx512.h | 35 ++
> > lib/member/rte_xxh64_avx512.h | 117 +++++
> > lib/member/version.map | 9 +
> > 10 files changed, 1613 insertions(+), 4 deletions(-) create mode
> > 100644 lib/member/rte_member_heap.h create mode 100644
> > lib/member/rte_member_sketch.c create mode 100644
> > lib/member/rte_member_sketch.h create mode 100644
> > lib/member/rte_member_sketch_avx512.c
> > create mode 100644 lib/member/rte_member_sketch_avx512.h
> > create mode 100644 lib/member/rte_xxh64_avx512.h
> >
> > diff --git a/lib/member/meson.build b/lib/member/meson.build index
> > e06fddc240..3df0fbc7a6 100644
> > --- a/lib/member/meson.build
> > +++ b/lib/member/meson.build
> > @@ -7,6 +7,46 @@ if is_windows
> > subdir_done()
> > endif
> >
> > -sources = files('rte_member.c', 'rte_member_ht.c',
> > 'rte_member_vbf.c')
> > +sources = files('rte_member.c', 'rte_member_ht.c',
> > +'rte_member_vbf.c', 'rte_member_sketch.c')
>
> At this point, with 4 entries I'd suggest reworking this to a vertical list.
>
> sources = files(
> 'rte_member.c',
> ....
> )
>
> > headers = files('rte_member.h')
> > deps += ['hash']
> > +includes += include_directories('../hash', '../ring')
> > +
>
> Use dependencies rather than include-paths. Having a dependency on hash
> means that your code will be compiled with the hash include paths already
> present. Similarly you can add "ring" to the deps variable and drop this line.
>
> > +# compile AVX512 version if:
> > +# we are building 64-bit binary AND binutils can generate proper code
>
> I think this comment is no longer necessary. The condition itself is clear.
> The following comment can be kept though as it explains what is being done at a
> higher level than the code.
>
> > +if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
> > + # compile AVX512 version if either:
> > + # a. we have AVX512 supported in minimum instruction set
> > + # baseline
> > + # b. it's not minimum instruction set, but supported by
> > + # compiler
> > + #
> > + # in former case, just add avx512 C file to files list
> > + # in latter case, compile c file to static lib, using correct
> > + # compiler flags, and then have the .o file from static lib
> > + # linked into main lib.
> > +
> > + #check if all required flags already enabled
> > + sketch_avx512_flags = ['__AVX512F__', '__AVX512DQ__',
> > + '__AVX512IFMA__']
> > +
> > + sketch_avx512_on = true
> > + foreach f:sketch_avx512_flags
> > + if cc.get_define(f, args: machine_args) == ''
> > + sketch_avx512_on = false
> > + endif
> > + endforeach
> > +
> > + if sketch_avx512_on == true
> > + cflags += ['-DCC_AVX512_SUPPORT']
> > + sources += files('rte_member_sketch_avx512.c')
> > + elif cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
> > + sketch_avx512_tmp = static_library('sketch_avx512_tmp',
> > + 'rte_member_sketch_avx512.c',
> > + include_directories: includes,
> > + dependencies: static_rte_eal,
> > + c_args: cflags +
> > + ['-mavx512f', '-mavx512dq', '-mavx512ifma'])
> > + objs +=
> sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
> > + cflags += ['-DCC_AVX512_SUPPORT']
>
> Logic looks correct here now, but beware of your indentation. You are mixing
> tabs and spaces. Meson files should have spaces only, no tabs.
>
> > + endif
> > +endif
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v5 0/2] introduce NitroSketch Mode into membership library
2022-08-10 7:45 [PATCH 0/2] introduce NitroSketch Mode into membership library Leyi Rong
` (4 preceding siblings ...)
2022-09-15 13:03 ` [PATCH v4 0/2] introduce NitroSketch Mode into membership library Leyi Rong
@ 2022-09-16 3:03 ` Leyi Rong
2022-09-16 3:03 ` [PATCH v5 1/2] member: implement NitroSketch mode Leyi Rong
` (2 more replies)
5 siblings, 3 replies; 34+ messages in thread
From: Leyi Rong @ 2022-09-16 3:03 UTC (permalink / raw)
To: ferruh.yigit, bruce.richardson, david.marchand, suanmingm,
yipeng1.wang, zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
This patchset introduce a brand new NitroSketch Mode into membership
library. This algorithm provides high-fidelity approximate measurements
and appears as a promissing alternative to triditional approches such as
packet sampling.
---
v5:
- meson.build rework.
v4:
- remove extra avx512 complier flags from the cflags.
v3:
- mark below new added APIs as "experimental":
rte_member_add_byte_count()
rte_member_query_count()
rte_member_report_heavyhitter()
- meson build file rework for potential runtime issue.
v2:
- attach paper link to commit log.
- fix potential memory leaks in test_member.c.
- build error fix according to CI build fail log.
Leyi Rong (2):
member: implement NitroSketch mode
test/member: add functional and perf tests for sketch
app/test/test_member.c | 272 ++++++++++++
app/test/test_member_perf.c | 153 ++++++-
lib/member/meson.build | 49 ++-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 154 ++++++-
lib/member/rte_member_heap.h | 424 ++++++++++++++++++
lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 97 +++++
lib/member/rte_member_sketch_avx512.c | 70 +++
lib/member/rte_member_sketch_avx512.h | 35 ++
lib/member/rte_xxh64_avx512.h | 117 +++++
lib/member/version.map | 9 +
12 files changed, 2040 insertions(+), 9 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v5 1/2] member: implement NitroSketch mode
2022-09-16 3:03 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Leyi Rong
@ 2022-09-16 3:03 ` Leyi Rong
2022-09-22 3:00 ` Jiang, YuX
2022-10-03 12:37 ` Thomas Monjalon
2022-09-16 3:03 ` [PATCH v5 2/2] test/member: add functional and perf tests for sketch Leyi Rong
2022-10-09 23:13 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Thomas Monjalon
2 siblings, 2 replies; 34+ messages in thread
From: Leyi Rong @ 2022-09-16 3:03 UTC (permalink / raw)
To: ferruh.yigit, bruce.richardson, david.marchand, suanmingm,
yipeng1.wang, zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
Sketching algorithm provide high-fidelity approximate measurements and
appears as a promising alternative to traditional approaches such as
packet sampling.
NitroSketch [1] is a software sketching framework that optimizes
performance, provides accuracy guarantees, and supports a variety of
sketches.
This commit adds a new data structure called sketch into
membership library. This new data structure is an efficient
way to profile the traffic for heavy hitters. Also use min-heap
structure to maintain the top-k flow keys.
[1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
lib/member/meson.build | 49 ++-
lib/member/rte_member.c | 75 ++++
lib/member/rte_member.h | 154 ++++++-
lib/member/rte_member_heap.h | 424 ++++++++++++++++++
lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
lib/member/rte_member_sketch.h | 97 +++++
lib/member/rte_member_sketch_avx512.c | 70 +++
lib/member/rte_member_sketch_avx512.h | 35 ++
lib/member/rte_xxh64_avx512.h | 117 +++++
lib/member/version.map | 9 +
10 files changed, 1619 insertions(+), 5 deletions(-)
create mode 100644 lib/member/rte_member_heap.h
create mode 100644 lib/member/rte_member_sketch.c
create mode 100644 lib/member/rte_member_sketch.h
create mode 100644 lib/member/rte_member_sketch_avx512.c
create mode 100644 lib/member/rte_member_sketch_avx512.h
create mode 100644 lib/member/rte_xxh64_avx512.h
diff --git a/lib/member/meson.build b/lib/member/meson.build
index e06fddc240..8bd0af492c 100644
--- a/lib/member/meson.build
+++ b/lib/member/meson.build
@@ -7,6 +7,51 @@ if is_windows
subdir_done()
endif
-sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c')
headers = files('rte_member.h')
-deps += ['hash']
+
+sources = files(
+ 'rte_member.c',
+ 'rte_member_ht.c',
+ 'rte_member_vbf.c',
+ 'rte_member_sketch.c'
+)
+
+deps += ['hash', 'ring']
+
+# compile AVX512 version if:
+if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
+ # compile AVX512 version if either:
+ # a. we have AVX512 supported in minimum instruction set
+ # baseline
+ # b. it's not minimum instruction set, but supported by
+ # compiler
+ #
+ # in former case, just add avx512 C file to files list
+ # in latter case, compile c file to static lib, using correct
+ # compiler flags, and then have the .o file from static lib
+ # linked into main lib.
+
+ #check if all required flags already enabled
+ sketch_avx512_flags = ['__AVX512F__', '__AVX512DQ__', '__AVX512IFMA__']
+
+ sketch_avx512_on = true
+ foreach f:sketch_avx512_flags
+ if cc.get_define(f, args: machine_args) == ''
+ sketch_avx512_on = false
+ endif
+ endforeach
+
+ if sketch_avx512_on == true
+ cflags += ['-DCC_AVX512_SUPPORT']
+ sources += files('rte_member_sketch_avx512.c')
+ elif cc.has_multi_arguments('-mavx512f', '-mavx512dq', '-mavx512ifma')
+ sketch_avx512_tmp = static_library('sketch_avx512_tmp',
+ 'rte_member_sketch_avx512.c',
+ include_directories: includes,
+ dependencies: [static_rte_eal, static_rte_hash],
+ c_args: cflags +
+ ['-mavx512f', '-mavx512dq', '-mavx512ifma'])
+ objs += sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
+ cflags += ['-DCC_AVX512_SUPPORT']
+ endif
+endif
diff --git a/lib/member/rte_member.c b/lib/member/rte_member.c
index 7e1632e6b5..8f859f7fbd 100644
--- a/lib/member/rte_member.c
+++ b/lib/member/rte_member.c
@@ -9,10 +9,12 @@
#include <rte_malloc.h>
#include <rte_errno.h>
#include <rte_tailq.h>
+#include <rte_ring_elem.h>
#include "rte_member.h"
#include "rte_member_ht.h"
#include "rte_member_vbf.h"
+#include "rte_member_sketch.h"
TAILQ_HEAD(rte_member_list, rte_tailq_entry);
static struct rte_tailq_elem rte_member_tailq = {
@@ -72,6 +74,9 @@ rte_member_free(struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_free_vbf(setsum);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_free_sketch(setsum);
+ break;
default:
break;
}
@@ -86,6 +91,8 @@ rte_member_create(const struct rte_member_parameters *params)
struct rte_member_list *member_list;
struct rte_member_setsum *setsum;
int ret;
+ char ring_name[RTE_RING_NAMESIZE];
+ struct rte_ring *sketch_key_ring = NULL;
if (params == NULL) {
rte_errno = EINVAL;
@@ -100,6 +107,16 @@ rte_member_create(const struct rte_member_parameters *params)
return NULL;
}
+ if (params->type == RTE_MEMBER_TYPE_SKETCH) {
+ snprintf(ring_name, sizeof(ring_name), "SK_%s", params->name);
+ sketch_key_ring = rte_ring_create_elem(ring_name, sizeof(uint32_t),
+ rte_align32pow2(params->top_k), params->socket_id, 0);
+ if (sketch_key_ring == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Ring Memory allocation failed\n");
+ return NULL;
+ }
+ }
+
member_list = RTE_TAILQ_CAST(rte_member_tailq.head, rte_member_list);
rte_mcfg_tailq_write_lock();
@@ -145,6 +162,9 @@ rte_member_create(const struct rte_member_parameters *params)
case RTE_MEMBER_TYPE_VBF:
ret = rte_member_create_vbf(setsum, params);
break;
+ case RTE_MEMBER_TYPE_SKETCH:
+ ret = rte_member_create_sketch(setsum, params, sketch_key_ring);
+ break;
default:
goto error_unlock_exit;
}
@@ -162,6 +182,7 @@ rte_member_create(const struct rte_member_parameters *params)
error_unlock_exit:
rte_free(te);
rte_free(setsum);
+ rte_ring_free(sketch_key_ring);
rte_mcfg_tailq_write_unlock();
return NULL;
}
@@ -178,6 +199,23 @@ rte_member_add(const struct rte_member_setsum *setsum, const void *key,
return rte_member_add_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_add_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch(setsum, key, set_id);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count)
+{
+ if (setsum == NULL || key == NULL || byte_count == 0)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_add_sketch_byte_count(setsum, key, byte_count);
default:
return -EINVAL;
}
@@ -195,6 +233,8 @@ rte_member_lookup(const struct rte_member_setsum *setsum, const void *key,
return rte_member_lookup_ht(setsum, key, set_id);
case RTE_MEMBER_TYPE_VBF:
return rte_member_lookup_vbf(setsum, key, set_id);
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_lookup_sketch(setsum, key, set_id);
default:
return -EINVAL;
}
@@ -261,6 +301,36 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
}
}
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output)
+{
+ if (setsum == NULL || key == NULL || output == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_query_sketch(setsum, key, output);
+ default:
+ return -EINVAL;
+ }
+}
+
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count)
+{
+ if (setsum == NULL || key == NULL || count == NULL)
+ return -EINVAL;
+
+ switch (setsum->type) {
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_report_heavyhitter_sketch(setsum, key, count);
+ default:
+ return -EINVAL;
+ }
+}
+
int
rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id)
@@ -272,6 +342,8 @@ rte_member_delete(const struct rte_member_setsum *setsum, const void *key,
case RTE_MEMBER_TYPE_HT:
return rte_member_delete_ht(setsum, key, set_id);
/* current vBF implementation does not support delete function */
+ case RTE_MEMBER_TYPE_SKETCH:
+ return rte_member_delete_sketch(setsum, key);
case RTE_MEMBER_TYPE_VBF:
default:
return -EINVAL;
@@ -290,6 +362,9 @@ rte_member_reset(const struct rte_member_setsum *setsum)
case RTE_MEMBER_TYPE_VBF:
rte_member_reset_vbf(setsum);
return;
+ case RTE_MEMBER_TYPE_SKETCH:
+ rte_member_reset_sketch(setsum);
+ return;
default:
return;
}
diff --git a/lib/member/rte_member.h b/lib/member/rte_member.h
index 2611015771..178efeed33 100644
--- a/lib/member/rte_member.h
+++ b/lib/member/rte_member.h
@@ -39,6 +39,18 @@
* | | | not overwrite | |
* | | | existing key. | |
* +----------+---------------------+----------------+-------------------------+
+ * +==========+=============================+
+ * | type | sketch |
+ * +==========+=============================+
+ * |structure | counting bloom filter array |
+ * +----------+-----------------------------+
+ * |set id | 1: heavy set, 0: light set |
+ * | | |
+ * +----------+-----------------------------+
+ * |usages & | count size of a flow, |
+ * |properties| used for heavy hitter |
+ * | | detection. |
+ * +----------+-----------------------------+
* -->
*/
@@ -50,6 +62,8 @@ extern "C" {
#endif
#include <stdint.h>
+#include <stdbool.h>
+#include <inttypes.h>
#include <rte_common.h>
@@ -65,6 +79,20 @@ typedef uint16_t member_set_t;
#define RTE_MEMBER_BUCKET_ENTRIES 16
/** Maximum number of characters in setsum name. */
#define RTE_MEMBER_NAMESIZE 32
+/** Max value of the random number */
+#define RTE_RAND_MAX ~0LLU
+/**
+ * As packets skipped in the sampling-based algorithm, the accounting
+ * results accuracy is not guaranteed in the start stage. There should
+ * be a "convergence time" to achieve the accuracy after receiving enough
+ * packets.
+ * For sketch, use the flag if prefer always bounded mode, which only
+ * starts sampling after receiving enough packets to keep the results
+ * accuracy always bounded.
+ */
+#define RTE_MEMBER_SKETCH_ALWAYS_BOUNDED 0x01
+/** For sketch, use the flag if to count packet size instead of packet count */
+#define RTE_MEMBER_SKETCH_COUNT_BYTE 0x02
/** @internal Hash function used by membership library. */
#if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32)
@@ -104,6 +132,7 @@ struct rte_member_parameters;
enum rte_member_setsum_type {
RTE_MEMBER_TYPE_HT = 0, /**< Hash table based set summary. */
RTE_MEMBER_TYPE_VBF, /**< Vector of bloom filters. */
+ RTE_MEMBER_TYPE_SKETCH,
RTE_MEMBER_NUM_TYPE
};
@@ -114,6 +143,19 @@ enum rte_member_sig_compare_function {
RTE_MEMBER_COMPARE_NUM
};
+/* sketch update function with different implementations. */
+typedef void (*sketch_update_fn_t)(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+/* sketch lookup function with different implementations. */
+typedef uint64_t (*sketch_lookup_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
+/* sketch delete function with different implementations. */
+typedef void (*sketch_delete_fn_t)(const struct rte_member_setsum *ss,
+ const void *key);
+
/** @internal setsummary structure. */
struct rte_member_setsum {
enum rte_member_setsum_type type; /* Type of the set summary. */
@@ -134,6 +176,21 @@ struct rte_member_setsum {
uint32_t bit_mask; /* Bit mask to get bit location in bf. */
uint32_t num_hashes; /* Number of hash values to index bf. */
+ /* Parameters for sketch */
+ float error_rate;
+ float sample_rate;
+ uint32_t num_col;
+ uint32_t num_row;
+ int always_bounded;
+ double converge_thresh;
+ uint32_t topk;
+ uint32_t count_byte;
+ uint64_t *hash_seeds;
+ sketch_update_fn_t sketch_update; /* Pointer to the sketch update function */
+ sketch_lookup_fn_t sketch_lookup; /* Pointer to the sketch lookup function */
+ sketch_delete_fn_t sketch_delete; /* Pointer to the sketch delete function */
+
+ void *runtime_var;
uint32_t mul_shift; /* vbf internal variable used during bit test. */
uint32_t div_shift; /* vbf internal variable used during bit test. */
@@ -143,6 +200,9 @@ struct rte_member_setsum {
/* Second cache line should start here. */
uint32_t socket_id; /* NUMA Socket ID for memory. */
char name[RTE_MEMBER_NAMESIZE]; /* Name of this set summary. */
+#ifdef RTE_ARCH_X86
+ bool use_avx512;
+#endif
} __rte_cache_aligned;
/**
@@ -261,8 +321,33 @@ struct rte_member_parameters {
*/
uint32_t sec_hash_seed;
+ /**
+ * For count(min) sketch data structure, error rate defines the accuracy
+ * required by the user. Higher accuracy leads to more memory usage, but
+ * the flow size is estimated more accurately.
+ */
+ float error_rate;
+
+ /**
+ * Sampling rate means the internal sample rate of the rows of the count
+ * min sketches. Lower sampling rate can reduce CPU overhead, but the
+ * data structure will require more time to converge statistically.
+ */
+ float sample_rate;
+
+ /**
+ * How many top heavy hitter to be reported. The library will internally
+ * keep the keys of heavy hitters for final report.
+ */
+ uint32_t top_k;
+
+ /**
+ * Extra flags that may passed in by user
+ */
+ uint32_t extra_flag;
+
int socket_id; /**< NUMA Socket ID for memory. */
-};
+} __rte_cache_aligned;
/**
* @warning
@@ -418,7 +503,7 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* RTE_MEMBER_NO_MATCH by default is set as 0.
* For HT mode, the set_id has range as [1, 0x7FFF], MSB is reserved.
* For vBF mode the set id is limited by the num_set parameter when create
- * the set-summary.
+ * the set-summary. For sketch mode, this id is ignored.
* @return
* HT (cache mode) and vBF should never fail unless the set_id is not in the
* valid range. In such case -EINVAL is returned.
@@ -429,12 +514,75 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum,
* Return 0 for HT (cache mode) if the add does not cause
* eviction, return 1 otherwise. Return 0 for non-cache mode if success,
* -ENOSPC for full, and 1 if cuckoo eviction happens.
- * Always returns 0 for vBF mode.
+ * Always returns 0 for vBF mode and sketch.
*/
int
rte_member_add(const struct rte_member_setsum *setsum, const void *key,
member_set_t set_id);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Add the packet byte size into the sketch.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param byte_count
+ * Add the byte count of the packet into the sketch.
+ * @return
+ * Return -EINVAL for invalid parameters, otherwise return 0.
+ */
+__rte_experimental
+int
+rte_member_add_byte_count(const struct rte_member_setsum *setsum,
+ const void *key, uint32_t byte_count);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Query packet count for a certain flow-key.
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param key
+ * Pointer of the key to be added.
+ * @param count
+ * The output packet count or byte count.
+ * @return
+ * Return -EINVAL for invalid parameters.
+ */
+__rte_experimental
+int
+rte_member_query_count(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *count);
+
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Report heavyhitter flow-keys into set-summary (SS).
+ *
+ * @param setsum
+ * Pointer of a set-summary.
+ * @param keys
+ * Pointer of the output top-k key array.
+ * @param counts
+ * Pointer of the output packet count or byte count array of the top-k keys.
+ * @return
+ * Return -EINVAL for invalid parameters. Return a positive integer indicate
+ * how many heavy hitters are reported.
+ */
+__rte_experimental
+int
+rte_member_report_heavyhitter(const struct rte_member_setsum *setsum,
+ void **keys, uint64_t *counts);
+
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice
diff --git a/lib/member/rte_member_heap.h b/lib/member/rte_member_heap.h
new file mode 100644
index 0000000000..3ced34160a
--- /dev/null
+++ b/lib/member/rte_member_heap.h
@@ -0,0 +1,424 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#ifndef _RTE_MEMBER_HEAP_H_
+#define _RTE_MEMBER_HEAP_H_
+
+#include <rte_ring_elem.h>
+#include "rte_member.h"
+
+#define LCHILD(x) (2 * x + 1)
+#define RCHILD(x) (2 * x + 2)
+#define PARENT(x) ((x - 1) / 2)
+
+#define HASH_BKT_SIZE 16
+#define HASH_HP_MULTI 4
+#define HASH_RESIZE_MULTI 2
+
+struct hash_bkt {
+ uint16_t sig[HASH_BKT_SIZE];
+ uint16_t idx[HASH_BKT_SIZE];
+};
+
+struct hash {
+ uint16_t bkt_cnt;
+ uint16_t num_item;
+ uint32_t seed;
+ struct hash_bkt buckets[0];
+};
+
+struct node {
+ void *key;
+ uint64_t count;
+};
+
+struct minheap {
+ uint32_t key_len;
+ uint32_t size;
+ uint32_t socket;
+ struct hash *hashtable;
+ struct node *elem;
+};
+
+static int
+hash_table_insert(const void *key, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].idx[i] == 0) {
+ table->buckets[idx].idx[i] = value;
+ table->buckets[idx].sig[i] = sig;
+ table->num_item++;
+ return 0;
+ }
+ }
+
+ return -ENOMEM;
+}
+
+static int
+hash_table_update(const void *key, int old_value, int value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == old_value) {
+ table->buckets[idx].idx[i] = value;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_del(const void *key, uint16_t value, int key_len, struct hash *table)
+{
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == value) {
+ table->buckets[idx].idx[i] = 0;
+ table->num_item--;
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int
+hash_table_lookup(const void *key, int key_len, struct minheap *hp)
+{
+ struct hash *table = hp->hashtable;
+ uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed);
+ uint16_t idx = hash % table->bkt_cnt;
+ uint16_t sig = hash >> 16;
+ int i;
+
+ for (i = 0; i < HASH_BKT_SIZE; i++) {
+ if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] != 0) {
+ uint32_t hp_idx = table->buckets[idx].idx[i] - 1;
+
+ if (memcmp(hp->elem[hp_idx].key, key, hp->key_len) == 0)
+ return hp_idx;
+ }
+ }
+
+ return -ENOENT; /* key doesn't exist */
+}
+
+static int
+resize_hash_table(struct minheap *hp)
+{
+ uint32_t i;
+ uint32_t new_bkt_cnt;
+
+ while (1) {
+ new_bkt_cnt = hp->hashtable->bkt_cnt * HASH_RESIZE_MULTI;
+
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT load factor is [%f]\n",
+ hp->hashtable->num_item / ((float)hp->hashtable->bkt_cnt * HASH_BKT_SIZE));
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT resize happen!\n");
+ rte_free(hp->hashtable);
+ hp->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ new_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, hp->socket);
+
+ if (hp->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ return -ENOMEM;
+ }
+
+ hp->hashtable->bkt_cnt = new_bkt_cnt;
+
+ for (i = 0; i < hp->size; ++i) {
+ if (hash_table_insert(hp->elem[i].key,
+ i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR,
+ "Sketch Minheap HT resize insert fail!\n");
+ break;
+ }
+ }
+ if (i == hp->size)
+ break;
+ }
+
+ return 0;
+}
+
+/* find the item in the given minheap */
+static int
+rte_member_minheap_find(struct minheap *hp, const void *key)
+{
+ int idx = hash_table_lookup(key, hp->key_len, hp);
+ return idx;
+}
+
+static int
+rte_member_minheap_init(struct minheap *heap, int size,
+ uint32_t socket, uint32_t seed)
+{
+ heap->elem = rte_zmalloc_socket(NULL, sizeof(struct node) * size,
+ RTE_CACHE_LINE_SIZE, socket);
+ if (heap->elem == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap elem allocation failed\n");
+ return -ENOMEM;
+ }
+
+ uint32_t hash_bkt_cnt = rte_align32pow2(size * HASH_HP_MULTI) / HASH_BKT_SIZE;
+
+ if (hash_bkt_cnt == 0)
+ hash_bkt_cnt = 1;
+
+ heap->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) +
+ hash_bkt_cnt * sizeof(struct hash_bkt),
+ RTE_CACHE_LINE_SIZE, socket);
+
+ if (heap->hashtable == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n");
+ rte_free(heap->elem);
+ return -ENOMEM;
+ }
+
+ heap->hashtable->seed = seed;
+ heap->hashtable->bkt_cnt = hash_bkt_cnt;
+ heap->socket = socket;
+
+ return 0;
+}
+
+/* swap the minheap nodes */
+static __rte_always_inline void
+rte_member_heap_swap(struct node *n1, struct node *n2)
+{
+ struct node temp = *n1;
+ *n1 = *n2;
+ *n2 = temp;
+}
+
+/* heapify function */
+static void
+rte_member_heapify(struct minheap *hp, uint32_t idx, bool update_hash)
+{
+ uint32_t smallest;
+
+ if (LCHILD(idx) < hp->size &&
+ hp->elem[LCHILD(idx)].count < hp->elem[idx].count)
+ smallest = LCHILD(idx);
+ else
+ smallest = idx;
+
+ if (RCHILD(idx) < hp->size &&
+ hp->elem[RCHILD(idx)].count < hp->elem[smallest].count)
+ smallest = RCHILD(idx);
+
+ if (smallest != idx) {
+ rte_member_heap_swap(&(hp->elem[idx]), &(hp->elem[smallest]));
+
+ if (update_hash) {
+ if (hash_table_update(hp->elem[smallest].key, idx + 1, smallest + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+
+ if (hash_table_update(hp->elem[idx].key, smallest + 1, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return;
+ }
+ }
+ rte_member_heapify(hp, smallest, update_hash);
+ }
+}
+
+/* insert a node into the minheap */
+static int
+rte_member_minheap_insert_node(struct minheap *hp, const void *key,
+ int counter, void *key_slot,
+ struct rte_ring *free_key_slot)
+{
+ struct node nd;
+ uint32_t slot_id;
+
+ if (rte_ring_sc_dequeue_elem(free_key_slot, &slot_id, sizeof(uint32_t)) != 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap get empty keyslot failed\n");
+ return -1;
+ }
+
+ nd.count = counter;
+ nd.key = RTE_PTR_ADD(key_slot, slot_id * hp->key_len);
+
+ memcpy(nd.key, key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* delete a key from the minheap */
+static int
+rte_member_minheap_delete_node(struct minheap *hp, const void *key,
+ void *key_slot, struct rte_ring *free_key_slot)
+{
+ int idx = rte_member_minheap_find(hp, key);
+ uint32_t offset = RTE_PTR_DIFF(hp->elem[idx].key, key_slot) / hp->key_len;
+
+ if (hash_table_del(key, idx + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ rte_ring_sp_enqueue_elem(free_key_slot, &offset, sizeof(uint32_t));
+
+ if (idx == (int)(hp->size - 1)) {
+ hp->size--;
+ return 0;
+ }
+
+ hp->elem[idx] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[idx].key, hp->size, idx + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+ rte_member_heapify(hp, idx, true);
+
+ return 0;
+}
+
+/* replace a min node with a new key. */
+static int
+rte_member_minheap_replace_node(struct minheap *hp,
+ const void *new_key,
+ int new_counter)
+{
+ struct node nd;
+ void *recycle_key = NULL;
+
+ recycle_key = hp->elem[0].key;
+
+ if (hash_table_del(recycle_key, 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n");
+ return -1;
+ }
+
+ hp->elem[0] = hp->elem[hp->size - 1];
+
+ if (hash_table_update(hp->elem[0].key, hp->size, 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ hp->size--;
+
+ rte_member_heapify(hp, 0, true);
+
+ nd.count = new_counter;
+ nd.key = recycle_key;
+
+ memcpy(nd.key, new_key, hp->key_len);
+
+ uint32_t i = (hp->size)++;
+
+ while (i && nd.count < hp->elem[PARENT(i)].count) {
+ hp->elem[i] = hp->elem[PARENT(i)];
+ if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1,
+ hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n");
+ return -1;
+ }
+ i = PARENT(i);
+ }
+
+ hp->elem[i] = nd;
+
+ if (hash_table_insert(new_key, i + 1, hp->key_len, hp->hashtable) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace insert failed\n");
+ if (resize_hash_table(hp) < 0) {
+ RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace resize failed\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* sort the heap into a descending array */
+static void
+rte_member_heapsort(struct minheap *hp, struct node *result_array)
+{
+ struct minheap new_hp;
+
+ /* build a new heap for using the given array */
+ new_hp.size = hp->size;
+ new_hp.key_len = hp->key_len;
+ new_hp.elem = result_array;
+ memcpy(result_array, hp->elem, hp->size * sizeof(struct node));
+
+ /* sort the new heap */
+ while (new_hp.size > 1) {
+ rte_member_heap_swap(&(new_hp.elem[0]), &(new_hp.elem[new_hp.size - 1]));
+ new_hp.size--;
+ rte_member_heapify(&new_hp, 0, false);
+ }
+}
+
+static void
+rte_member_minheap_free(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ rte_free(hp->elem);
+ rte_free(hp->hashtable);
+}
+
+static void
+rte_member_minheap_reset(struct minheap *hp)
+{
+ if (hp == NULL)
+ return;
+
+ memset(hp->elem, 0, sizeof(struct node) * hp->size);
+ hp->size = 0;
+
+ memset((char *)hp->hashtable + sizeof(struct hash), 0,
+ hp->hashtable->bkt_cnt * sizeof(struct hash_bkt));
+ hp->hashtable->num_item = 0;
+}
+
+#endif /* _RTE_MEMBER_HEAP_H_ */
diff --git a/lib/member/rte_member_sketch.c b/lib/member/rte_member_sketch.c
new file mode 100644
index 0000000000..524ba77620
--- /dev/null
+++ b/lib/member/rte_member_sketch.c
@@ -0,0 +1,594 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020, Alan Liu <zaoxingliu@gmail.com>
+ */
+
+#include <math.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_errno.h>
+#include <rte_log.h>
+#include <rte_random.h>
+#include <rte_prefetch.h>
+#include <rte_ring_elem.h>
+
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+#include "rte_member_heap.h"
+
+#ifdef CC_AVX512_SUPPORT
+#include "rte_member_sketch_avx512.h"
+#endif /* CC_AVX512_SUPPORT */
+
+struct sketch_runtime {
+ uint64_t pkt_cnt;
+ uint32_t until_next;
+ int converged;
+ struct minheap heap;
+ struct node *report_array;
+ void *key_slots;
+ struct rte_ring *free_key_slots;
+} __rte_cache_aligned;
+
+/*
+ * Geometric sampling to calculate how many packets needs to be
+ * skipped until next update. This method can mitigate the CPU
+ * overheads compared with coin-toss sampling.
+ */
+static uint32_t
+draw_geometric(const struct rte_member_setsum *ss)
+{
+ double rand = 1;
+
+ if (ss->sample_rate == 1)
+ return 1;
+
+ while (rand == 1 || rand == 0)
+ rand = (double) rte_rand() / (double)(RTE_RAND_MAX);
+
+ return (uint32_t)ceil(log(1 - rand) / log(1 - ss->sample_rate));
+}
+
+static void
+isort(uint64_t *array, int n)
+{
+ int i;
+
+ for (i = 1; i < n; i++) {
+ uint64_t t = array[i];
+ int j;
+
+ for (j = i - 1; j >= 0; j--) {
+ if (t < array[j])
+ array[j + 1] = array[j];
+ else
+ break;
+ }
+ array[j + 1] = t;
+ }
+}
+
+static __rte_always_inline void
+swap(uint64_t *a, uint64_t *b)
+{
+ uint64_t tmp = *a;
+ *a = *b;
+ *b = tmp;
+}
+
+static uint64_t
+medianof5(uint64_t a, uint64_t b, uint64_t c, uint64_t d, uint64_t e)
+{
+ if (a > b)
+ swap(&a, &b);
+ if (c > d)
+ swap(&c, &d);
+ if (a > c) {
+ if (d > e)
+ swap(&c, &e);
+ else {
+ swap(&c, &d);
+ swap(&d, &e);
+ }
+ } else {
+ if (b > e)
+ swap(&a, &e);
+ else {
+ swap(&a, &b);
+ swap(&b, &e);
+ }
+ }
+
+ if (a > c)
+ return a > d ? d : a;
+ else
+ return b > c ? c : b;
+}
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *ring)
+{
+ struct sketch_runtime *runtime;
+ uint32_t num_col;
+ uint32_t i;
+
+ if (params->sample_rate == 0 || params->sample_rate > 1) {
+ rte_errno = EINVAL;
+ RTE_MEMBER_LOG(ERR,
+ "Membership Sketch created with invalid parameters\n");
+ return -EINVAL;
+ }
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_COUNT_BYTE)
+ ss->count_byte = 1;
+
+#ifdef RTE_ARCH_X86
+ if (ss->count_byte == 1 &&
+ rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 &&
+ rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512IFMA) == 1) {
+#ifdef CC_AVX512_SUPPORT
+ ss->use_avx512 = true;
+#else
+ ss->use_avx512 = false;
+#endif
+ }
+
+ if (ss->use_avx512 == true) {
+#ifdef CC_AVX512_SUPPORT
+ ss->num_row = NUM_ROW_VEC;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch AVX512 update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_avx512;
+ ss->sketch_lookup = sketch_lookup_avx512;
+ ss->sketch_delete = sketch_delete_avx512;
+#endif
+ } else
+#endif
+ {
+ ss->num_row = NUM_ROW_SCALAR;
+ RTE_MEMBER_LOG(NOTICE,
+ "Membership Sketch SCALAR update/lookup/delete ops is selected\n");
+ ss->sketch_update = sketch_update_scalar;
+ ss->sketch_lookup = sketch_lookup_scalar;
+ ss->sketch_delete = sketch_delete_scalar;
+ }
+
+ ss->socket_id = params->socket_id;
+
+ if (ss->count_byte == 0)
+ num_col = 4.0 / params->error_rate / params->sample_rate;
+#ifdef RTE_ARCH_X86
+ else if (ss->use_avx512 == true)
+ num_col = rte_align32pow2(4.0 / params->error_rate);
+#endif
+ else
+ num_col = 4.0 / params->error_rate;
+
+ ss->table = rte_zmalloc_socket(NULL,
+ sizeof(uint64_t) * num_col * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->table == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Table memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->hash_seeds = rte_zmalloc_socket(NULL, sizeof(uint64_t) * ss->num_row,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->hash_seeds == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Hashseeds memory allocation failed\n");
+ return -ENOMEM;
+ }
+
+ ss->runtime_var = rte_zmalloc_socket(NULL, sizeof(struct sketch_runtime),
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (ss->runtime_var == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime memory allocation failed\n");
+ rte_free(ss);
+ return -ENOMEM;
+ }
+ runtime = ss->runtime_var;
+
+ ss->num_col = num_col;
+ ss->sample_rate = params->sample_rate;
+ ss->prim_hash_seed = params->prim_hash_seed;
+ ss->sec_hash_seed = params->sec_hash_seed;
+ ss->error_rate = params->error_rate;
+ ss->topk = params->top_k;
+ ss->key_len = params->key_len;
+ runtime->heap.key_len = ss->key_len;
+
+ runtime->key_slots = rte_zmalloc_socket(NULL, ss->key_len * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->key_slots == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Key Slots allocation failed\n");
+ goto error;
+ }
+
+ runtime->free_key_slots = ring;
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime->free_key_slots,
+ &i, sizeof(uint32_t));
+
+ if (rte_member_minheap_init(&(runtime->heap), params->top_k,
+ ss->socket_id, params->prim_hash_seed) < 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch Minheap allocation failed\n");
+ goto error_runtime;
+ }
+
+ runtime->report_array = rte_zmalloc_socket(NULL, sizeof(struct node) * ss->topk,
+ RTE_CACHE_LINE_SIZE, ss->socket_id);
+ if (runtime->report_array == NULL) {
+ RTE_MEMBER_LOG(ERR, "Sketch Runtime Report Array allocation failed\n");
+ goto error_runtime;
+ }
+
+ rte_srand(ss->prim_hash_seed);
+ for (i = 0; i < ss->num_row; i++)
+ ss->hash_seeds[i] = rte_rand();
+
+ if (params->extra_flag & RTE_MEMBER_SKETCH_ALWAYS_BOUNDED)
+ ss->always_bounded = 1;
+
+ if (ss->always_bounded) {
+ double delta = 1.0 / (pow(2, ss->num_row));
+
+ ss->converge_thresh = 10 * pow(ss->error_rate, -2.0) * sqrt(log(1 / delta));
+ }
+
+ RTE_MEMBER_LOG(DEBUG, "Sketch created, "
+ "the total memory required is %u Bytes\n", ss->num_col * ss->num_row * 8);
+
+ return 0;
+
+error_runtime:
+ rte_member_minheap_free(&runtime->heap);
+ rte_ring_free(runtime->free_key_slots);
+ rte_free(runtime->key_slots);
+error:
+ rte_free(runtime);
+ rte_free(ss);
+
+ return -ENOMEM;
+}
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col[ss->num_row];
+ uint64_t count_row[ss->num_row];
+ uint32_t cur_row;
+ uint64_t count;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ rte_prefetch0(&count_array[cur_row * ss->num_col + col[cur_row]]);
+ }
+
+ /* if sample rate is 1, it is a regular count-min, we report the min */
+ if (ss->sample_rate == 1 || ss->count_byte == 1)
+ return count_min(ss, col);
+
+ memset(count_row, 0, sizeof(uint64_t) * ss->num_row);
+
+ /* otherwise we report the median number */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_row[cur_row] = count_array[cur_row * ss->num_col + col[cur_row]];
+
+ if (ss->num_row == 5)
+ return medianof5(count_row[0], count_row[1],
+ count_row[2], count_row[3], count_row[4]);
+
+ isort(count_row, ss->num_row);
+
+ if (ss->num_row % 2 == 0) {
+ count = (count_row[ss->num_row / 2] + count_row[ss->num_row / 2 - 1]) / 2;
+ return count;
+ }
+ /* ss->num_row % 2 != 0 */
+ count = count_row[ss->num_row / 2];
+
+ return count;
+}
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* set corresponding counter to 0 */
+ count_array[cur_row * ss->num_col + col[cur_row]] = 0;
+ }
+}
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ uint64_t *output)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ *output = count;
+
+ return 0;
+}
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ uint64_t count = ss->sketch_lookup(ss, runtime_var->heap.elem[i].key);
+
+ runtime_var->heap.elem[i].count = count;
+ }
+}
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key,
+ uint64_t *count)
+{
+ uint32_t i;
+ struct sketch_runtime *runtime_var = setsum->runtime_var;
+
+ rte_member_update_heap(setsum);
+ rte_member_heapsort(&(runtime_var->heap), runtime_var->report_array);
+
+ for (i = 0; i < runtime_var->heap.size; i++) {
+ key[i] = runtime_var->report_array[i].key;
+ count[i] = runtime_var->report_array[i].count;
+ }
+
+ return runtime_var->heap.size;
+}
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *ss,
+ const void *key, member_set_t *set_id)
+{
+ uint64_t count = ss->sketch_lookup(ss, key);
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ if (runtime_var->heap.size > 0 && count >= runtime_var->heap.elem[0].count)
+ *set_id = 1;
+ else
+ *set_id = 0;
+
+ if (count == 0)
+ return 0;
+ else
+ return 1;
+}
+
+static void
+should_converge(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ /* For count min sketch - L1 norm */
+ if (runtime_var->pkt_cnt > ss->converge_thresh) {
+ runtime_var->converged = 1;
+ RTE_MEMBER_LOG(DEBUG, "Sketch converged, begin sampling "
+ "from key count %"PRIu64"\n",
+ runtime_var->pkt_cnt);
+ }
+}
+
+static void
+sketch_update_row(const struct rte_member_setsum *ss, const void *key,
+ uint32_t count, uint32_t cur_row)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+
+ /* sketch counter update */
+ count_array[cur_row * ss->num_col + col] +=
+ ceil(count / (ss->sample_rate));
+}
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t col;
+ uint32_t cur_row;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ col = MEMBER_HASH_FUNC(key, ss->key_len,
+ ss->hash_seeds[cur_row]) % ss->num_col;
+ count_array[cur_row * ss->num_col + col] += count;
+ }
+}
+
+static void
+heap_update(const struct rte_member_setsum *ss, const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t key_cnt = 0;
+ int found;
+
+ /* We also update the heap for this key */
+ key_cnt = ss->sketch_lookup(ss, key);
+ if (key_cnt > runtime_var->heap.elem[0].count) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ /* the key is found in the top-k heap */
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else if (runtime_var->heap.size < ss->topk) {
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ } else {
+ rte_member_minheap_replace_node(&runtime_var->heap, key, key_cnt);
+ }
+ } else if (runtime_var->heap.size < ss->topk) {
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found >= 0) {
+ if (runtime_var->heap.elem[found].count < key_cnt)
+ rte_member_heapify(&runtime_var->heap, found, true);
+
+ runtime_var->heap.elem[found].count = key_cnt;
+ } else
+ rte_member_minheap_insert_node(&runtime_var->heap, key,
+ key_cnt, runtime_var->key_slots, runtime_var->free_key_slots);
+ }
+}
+
+/*
+ * Add a single packet into the sketch.
+ * Sketch value is meatured by packet numbers in this mode.
+ */
+int
+rte_member_add_sketch(const struct rte_member_setsum *ss,
+ const void *key,
+ __rte_unused member_set_t set_id)
+{
+ uint32_t cur_row;
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /*
+ * If sketch is measured by byte count,
+ * the rte_member_add_sketch_byte_count routine should be used.
+ */
+ if (ss->count_byte == 1) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Byte Mode, "
+ "should use rte_member_add_byte_count()!\n");
+ return -EINVAL;
+ }
+
+ if (ss->sample_rate == 1) {
+ ss->sketch_update(ss, key, 1);
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* convergence stage if it's needed */
+ if (ss->always_bounded && !runtime_var->converged) {
+ ss->sketch_update(ss, key, 1);
+
+ if (!((++runtime_var->pkt_cnt) & (INTERVAL - 1)))
+ should_converge(ss);
+
+ heap_update(ss, key);
+ return 0;
+ }
+
+ /* should we skip this packet */
+ if (*until_next >= ss->num_row) {
+ *until_next -= ss->num_row;
+ return 0;
+ }
+ cur_row = *until_next;
+ do {
+ sketch_update_row(ss, key, 1, cur_row);
+ *until_next = draw_geometric(ss);
+ if (cur_row + *until_next >= ss->num_row)
+ break;
+ cur_row += *until_next;
+ } while (1);
+
+ *until_next -= (ss->num_row - cur_row);
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+/*
+ * Add the byte count of the packet into the sketch.
+ * Sketch value is meatured by byte count numbers in this mode.
+ */
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t byte_count)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint32_t *until_next = &(runtime_var->until_next);
+
+ /* should not call this API if not in count byte mode */
+ if (ss->count_byte == 0) {
+ RTE_MEMBER_LOG(ERR, "Sketch is Pkt Mode, "
+ "should use rte_member_add()!\n");
+ return -EINVAL;
+ }
+
+ /* there's specific optimization for the sketch update */
+ ss->sketch_update(ss, key, byte_count);
+
+ if (*until_next != 0) {
+ *until_next = *until_next - 1;
+ return 0;
+ }
+
+ *until_next = draw_geometric(ss) - 1;
+
+ heap_update(ss, key);
+
+ return 0;
+}
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *ss,
+ const void *key)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ int found;
+
+ found = rte_member_minheap_find(&runtime_var->heap, key);
+ if (found < 0)
+ return -1;
+
+ ss->sketch_delete(ss, key);
+
+ return rte_member_minheap_delete_node
+ (&runtime_var->heap, key, runtime_var->key_slots, runtime_var->free_key_slots);
+}
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+
+ rte_free(ss->table);
+ rte_member_minheap_free(&runtime_var->heap);
+ rte_free(runtime_var->key_slots);
+ rte_ring_free(runtime_var->free_key_slots);
+ rte_free(runtime_var);
+}
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *ss)
+{
+ struct sketch_runtime *runtime_var = ss->runtime_var;
+ uint64_t *sketch = ss->table;
+ uint32_t i;
+
+ memset(sketch, 0, sizeof(uint64_t) * ss->num_col * ss->num_row);
+ rte_member_minheap_reset(&runtime_var->heap);
+ rte_ring_reset(runtime_var->free_key_slots);
+
+ for (i = 0; i < ss->topk; i++)
+ rte_ring_sp_enqueue_elem(runtime_var->free_key_slots, &i, sizeof(uint32_t));
+}
diff --git a/lib/member/rte_member_sketch.h b/lib/member/rte_member_sketch.h
new file mode 100644
index 0000000000..219323008b
--- /dev/null
+++ b/lib/member/rte_member_sketch.h
@@ -0,0 +1,97 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_H_
+#define _RTE_MEMBER_SKETCH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_vect.h>
+#include <rte_ring_elem.h>
+
+#define NUM_ROW_SCALAR 5
+#define INTERVAL (1 << 15)
+
+#if !RTE_IS_POWER_OF_2(INTERVAL)
+#error sketch INTERVAL macro must be a power of 2
+#endif
+
+int
+rte_member_create_sketch(struct rte_member_setsum *ss,
+ const struct rte_member_parameters *params,
+ struct rte_ring *r);
+
+int
+rte_member_lookup_sketch(const struct rte_member_setsum *setsum,
+ const void *key, member_set_t *set_id);
+
+int
+rte_member_add_sketch(const struct rte_member_setsum *setsum,
+ const void *key,
+ member_set_t set_id);
+
+int
+rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss,
+ const void *key, uint32_t byte_count);
+
+void
+sketch_update_scalar(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_scalar(const struct rte_member_setsum *ss,
+ const void *key);
+
+int
+rte_member_delete_sketch(const struct rte_member_setsum *setsum,
+ const void *key);
+
+int
+rte_member_query_sketch(const struct rte_member_setsum *setsum,
+ const void *key, uint64_t *output);
+
+void
+rte_member_free_sketch(struct rte_member_setsum *ss);
+
+void
+rte_member_reset_sketch(const struct rte_member_setsum *setsum);
+
+int
+rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum,
+ void **key, uint64_t *count);
+
+void
+rte_member_update_heap(const struct rte_member_setsum *ss);
+
+static __rte_always_inline uint64_t
+count_min(const struct rte_member_setsum *ss, const uint32_t *hash_results)
+{
+ uint64_t *count_array = ss->table;
+ uint64_t count;
+ uint32_t cur_row;
+ uint64_t min = UINT64_MAX;
+
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++) {
+ uint64_t cnt = count_array[cur_row * ss->num_col + hash_results[cur_row]];
+
+ if (cnt < min)
+ min = cnt;
+ }
+ count = min;
+
+ return count;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_H_ */
diff --git a/lib/member/rte_member_sketch_avx512.c b/lib/member/rte_member_sketch_avx512.c
new file mode 100644
index 0000000000..288e37a446
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.c
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include "rte_xxh64_avx512.h"
+#include "rte_member_sketch_avx512.h"
+
+__rte_always_inline void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count)
+{
+ uint64_t *count_array = ss->table;
+ uint32_t num_col = ss->num_col;
+ uint32_t key_len = ss->key_len;
+ __m256i v_row_base;
+ __m256i v_hash_result;
+ __m512i current_sketch;
+ __m512i updated_sketch;
+ __m512i v_count;
+
+ const __m256i v_idx = _mm256_set_epi32(7, 6, 5, 4, 3, 2, 1, 0);
+ const __m256i v_col = _mm256_set1_epi32(num_col);
+
+ /* compute the hash result parallelly */
+ v_hash_result = rte_xxh64_sketch_avx512
+ (key, key_len, *(__m512i *)ss->hash_seeds, num_col);
+ v_row_base = _mm256_mullo_epi32(v_idx, v_col);
+ v_hash_result = _mm256_add_epi32(v_row_base, v_hash_result);
+
+ current_sketch =
+ _mm512_i32gather_epi64(v_hash_result, count_array, 8);
+ v_count = _mm512_set1_epi64(count);
+ updated_sketch = _mm512_add_epi64(current_sketch, v_count);
+ _mm512_i32scatter_epi64
+ ((void *)count_array, v_hash_result, updated_sketch, 8);
+}
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+
+ /* currently only for sketch byte count mode */
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds, ss->num_col);
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ return count_min(ss, col);
+}
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss, const void *key)
+{
+ uint32_t col[ss->num_row];
+ uint64_t *count_array = ss->table;
+ uint64_t min = UINT64_MAX;
+ uint32_t cur_row;
+
+ __m256i v_hash_result = rte_xxh64_sketch_avx512
+ (key, ss->key_len, *(__m512i *)ss->hash_seeds,
+ RTE_ALIGN_FLOOR(ss->num_col, 32));
+ _mm256_storeu_si256((__m256i *)col, v_hash_result);
+
+ min = count_min(ss, col);
+
+ /* subtract the min value from all the counters */
+ for (cur_row = 0; cur_row < ss->num_row; cur_row++)
+ count_array[cur_row * ss->num_col + col[cur_row]] -= min;
+}
diff --git a/lib/member/rte_member_sketch_avx512.h b/lib/member/rte_member_sketch_avx512.h
new file mode 100644
index 0000000000..68105cd895
--- /dev/null
+++ b/lib/member/rte_member_sketch_avx512.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_MEMBER_SKETCH_AVX512_H_
+#define _RTE_MEMBER_SKETCH_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_vect.h>
+#include "rte_member.h"
+#include "rte_member_sketch.h"
+
+#define NUM_ROW_VEC 8
+
+void
+sketch_update_avx512(const struct rte_member_setsum *ss,
+ const void *key,
+ uint32_t count);
+
+uint64_t
+sketch_lookup_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+void
+sketch_delete_avx512(const struct rte_member_setsum *ss,
+ const void *key);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMBER_SKETCH_AVX512_H_ */
diff --git a/lib/member/rte_xxh64_avx512.h b/lib/member/rte_xxh64_avx512.h
new file mode 100644
index 0000000000..50ca1b52c7
--- /dev/null
+++ b/lib/member/rte_xxh64_avx512.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_XXH64_AVX512_H_
+#define _RTE_XXH64_AVX512_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <immintrin.h>
+
+/* 0b1001111000110111011110011011000110000101111010111100101010000111 */
+static const uint64_t PRIME64_1 = 0x9E3779B185EBCA87ULL;
+/* 0b1100001010110010101011100011110100100111110101001110101101001111 */
+static const uint64_t PRIME64_2 = 0xC2B2AE3D27D4EB4FULL;
+/* 0b0001011001010110011001111011000110011110001101110111100111111001 */
+static const uint64_t PRIME64_3 = 0x165667B19E3779F9ULL;
+/* 0b1000010111101011110010100111011111000010101100101010111001100011 */
+static const uint64_t PRIME64_4 = 0x85EBCA77C2B2AE63ULL;
+/* 0b0010011111010100111010110010111100010110010101100110011111000101 */
+static const uint64_t PRIME64_5 = 0x27D4EB2F165667C5ULL;
+
+static __rte_always_inline __m512i
+xxh64_round_avx512(__m512i hash, __m512i input)
+{
+ hash = _mm512_madd52lo_epu64(hash,
+ input,
+ _mm512_set1_epi64(PRIME64_2));
+
+ hash = _mm512_rol_epi64(hash, 31);
+
+ return hash;
+}
+
+static __rte_always_inline __m512i
+xxh64_fmix_avx512(__m512i hash)
+{
+ hash = _mm512_xor_si512(hash, _mm512_srli_epi64(hash, 33));
+
+ return hash;
+}
+
+static __rte_always_inline __m256i
+rte_xxh64_sketch_avx512(const void *key, uint32_t key_len,
+ __m512i v_seed, uint32_t modulo)
+{
+ __m512i v_prime64_5, v_hash;
+ size_t remaining = key_len;
+ size_t offset = 0;
+ __m512i input;
+
+ v_prime64_5 = _mm512_set1_epi64(PRIME64_5);
+ v_hash = _mm512_add_epi64
+ (_mm512_add_epi64(v_seed, v_prime64_5),
+ _mm512_set1_epi64(key_len));
+
+ while (remaining >= 8) {
+ input = _mm512_set1_epi64(*(uint64_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ xxh64_round_avx512(_mm512_setzero_si512(), input));
+ v_hash = _mm512_madd52lo_epu64(_mm512_set1_epi64(PRIME64_4),
+ v_hash,
+ _mm512_set1_epi64(PRIME64_1));
+
+ remaining -= 8;
+ offset += 8;
+ }
+
+ if (remaining >= 4) {
+ input = _mm512_set1_epi64
+ (*(uint32_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_1)));
+ v_hash = _mm512_madd52lo_epu64
+ (_mm512_set1_epi64(PRIME64_3),
+ _mm512_rol_epi64(v_hash, 23),
+ _mm512_set1_epi64(PRIME64_2));
+
+ offset += 4;
+ remaining -= 4;
+ }
+
+ while (remaining != 0) {
+ input = _mm512_set1_epi64
+ (*(uint8_t *)RTE_PTR_ADD(key, offset));
+ v_hash = _mm512_xor_epi64(v_hash,
+ _mm512_mullo_epi64(input,
+ _mm512_set1_epi64(PRIME64_5)));
+ v_hash = _mm512_mullo_epi64
+ (_mm512_rol_epi64(v_hash, 11),
+ _mm512_set1_epi64(PRIME64_1));
+ offset++;
+ remaining--;
+ }
+
+ v_hash = xxh64_fmix_avx512(v_hash);
+
+ /*
+ * theoritically, such modular operations can be replaced by
+ * _mm512_rem_epi64(), but seems it depends on the compiler's
+ * implementation. so here is the limitation that the modulo
+ * value should be power of 2.
+ */
+ __m512i v_hash_remainder = _mm512_set1_epi64((modulo - 1));
+
+ return _mm512_cvtepi64_epi32(_mm512_and_si512(v_hash, v_hash_remainder));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_XXH64_AVX512_H_ */
diff --git a/lib/member/version.map b/lib/member/version.map
index 19469c6aba..35199270ff 100644
--- a/lib/member/version.map
+++ b/lib/member/version.map
@@ -14,3 +14,12 @@ DPDK_23 {
local: *;
};
+
+EXPERIMENTAL {
+ global:
+
+ # added in 22.11
+ rte_member_add_byte_count;
+ rte_member_query_count;
+ rte_member_report_heavyhitter;
+};
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v5 2/2] test/member: add functional and perf tests for sketch
2022-09-16 3:03 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-09-16 3:03 ` [PATCH v5 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-09-16 3:03 ` Leyi Rong
2022-10-09 23:13 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Thomas Monjalon
2 siblings, 0 replies; 34+ messages in thread
From: Leyi Rong @ 2022-09-16 3:03 UTC (permalink / raw)
To: ferruh.yigit, bruce.richardson, david.marchand, suanmingm,
yipeng1.wang, zaoxingliu, sameh.gobriel
Cc: dev, Leyi Rong
This patch adds functional and performance tests for sketch mode of
membership library.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
app/test/test_member.c | 272 ++++++++++++++++++++++++++++++++++++
app/test/test_member_perf.c | 153 +++++++++++++++++++-
2 files changed, 421 insertions(+), 4 deletions(-)
diff --git a/app/test/test_member.c b/app/test/test_member.c
index 26a712439f..8266e6437b 100644
--- a/app/test/test_member.c
+++ b/app/test/test_member.c
@@ -4,6 +4,7 @@
/* This test is for membership library's simple feature test */
+#include <math.h>
#include "test.h"
#include <rte_memcpy.h>
@@ -28,6 +29,7 @@ test_member(void)
struct rte_member_setsum *setsum_ht;
struct rte_member_setsum *setsum_cache;
struct rte_member_setsum *setsum_vbf;
+struct rte_member_setsum *setsum_sketch;
/* 5-tuple key type */
struct flow_key {
@@ -108,6 +110,21 @@ static struct rte_member_parameters params = {
.socket_id = 0 /* NUMA Socket ID for memory. */
};
+/* for sketch definitions */
+#define TOP_K 10
+#define HH_PKT_SIZE 16
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define PRINT_OUT_COUNT 20
+
+#define SKETCH_LARGEST_KEY_SIZE 1000000
+#define SKETCH_TOTAL_KEY 500
+#define NUM_OF_KEY(key) {\
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (key + 1)) \
+}
+
+void *heavy_hitters[TOP_K];
+
/*
* Sequence of operations for find existing setsummary
*
@@ -684,6 +701,257 @@ perform_free(void)
rte_member_free(setsum_vbf);
}
+static void
+print_out_sketch_results(uint64_t *count_result, member_set_t *heavy_set,
+ uint32_t print_num, bool count_byte)
+{
+ uint32_t i;
+
+ for (i = 0; i < print_num; i++) {
+ if (count_byte)
+ printf("key %2u, count %8"PRIu64", real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)) *
+ HH_PKT_SIZE,
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i) * HH_PKT_SIZE) /
+ ((float)NUM_OF_KEY(i) * HH_PKT_SIZE));
+ else
+ printf("key %2u, count %8"PRIu64", real count %8u, "
+ "heavy_set %u, deviation rate [%.04f]\n",
+ i, count_result[i],
+ (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)),
+ heavy_set[i],
+ fabs((float)count_result[i] - (float)NUM_OF_KEY(i)) /
+ (float)NUM_OF_KEY(i));
+ }
+}
+
+static int
+sketch_test(uint32_t *keys, uint32_t total_pkt, int count_byte, int reset_test)
+{
+ uint32_t i;
+ uint64_t result_count[SKETCH_TOTAL_KEY];
+ member_set_t heavy_set[SKETCH_TOTAL_KEY];
+ uint64_t count[TOP_K];
+ int ret;
+ int hh_cnt;
+
+ setsum_sketch = rte_member_create(¶ms);
+ if (setsum_sketch == NULL) {
+ printf("Creation of setsums fail\n");
+ return -1;
+ }
+
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count(setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ if (reset_test) {
+ printf("\nEntering Sketch Reset Test Process!\n");
+ rte_member_reset(setsum_sketch);
+
+ /* after reset, check some key's count */
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ printf("\nReinsert keys after Sketch Reset!\n");
+ for (i = 0; i < total_pkt; i++) {
+ if (count_byte)
+ ret = rte_member_add_byte_count
+ (setsum_sketch, &keys[i], HH_PKT_SIZE);
+ else
+ ret = rte_member_add(setsum_sketch, &keys[i], 1);
+
+ if (ret < 0) {
+ printf("rte_member_add Failed! Error [%d]\n", ret);
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ }
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+
+ printf("\nDelete some keys!\n");
+ uint32_t tmp_key = 0;
+
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+ tmp_key = 1;
+ rte_member_delete(setsum_sketch, (void *)&tmp_key, 0);
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ uint32_t tmp_key = i;
+
+ rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]);
+ rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]);
+ }
+
+ print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte);
+
+ hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count);
+ if (hh_cnt < 0) {
+ printf("sketch report heavy hitter error!");
+ rte_member_free(setsum_sketch);
+
+ return -1;
+ }
+ printf("Report heavy hitters:");
+ for (i = 0; i < (unsigned int)hh_cnt; i++) {
+ printf("%u: %"PRIu64"\t",
+ *((uint32_t *)heavy_hitters[i]), count[i]);
+ }
+ printf("\n");
+ }
+
+ rte_member_free(setsum_sketch);
+ return 0;
+}
+
+static int
+test_member_sketch(void)
+{
+ unsigned int i, j, index;
+ uint32_t total_pkt = 0;
+ uint32_t *keys;
+ int count_byte = 0;
+
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++)
+ total_pkt += ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1));
+
+ printf("\nTotal key count [%u] in Sketch Autotest\n", total_pkt);
+
+ keys = rte_zmalloc(NULL, sizeof(uint32_t) * total_pkt, 0);
+
+ if (keys == NULL) {
+ printf("RTE_ZMALLOC failed\n");
+ return -1;
+ }
+
+ index = 0;
+ for (i = 0; i < SKETCH_TOTAL_KEY; i++) {
+ for (j = 0; j < ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)); j++)
+ keys[index++] = i;
+ }
+
+ /* shuffle the keys */
+ for (i = index - 1; i > 0; i--) {
+ uint32_t swap_idx = rte_rand() % i;
+ uint32_t tmp_key = keys[i];
+
+ keys[i] = keys[swap_idx];
+ keys[swap_idx] = tmp_key;
+ }
+
+ params.key_len = 4;
+ params.name = "test_member_sketch";
+ params.type = RTE_MEMBER_TYPE_SKETCH;
+ params.error_rate = SKETCH_ERROR_RATE;
+ params.sample_rate = SKETCH_SAMPLE_RATE;
+ params.extra_flag = 0;
+ params.top_k = TOP_K;
+ params.prim_hash_seed = rte_rdtsc();
+ int reset_test = 0;
+
+ printf("Default sketching params: Error Rate: [%f]\tSample Rate: [%f]\tTopK: [%d]\n",
+ SKETCH_ERROR_RATE, SKETCH_SAMPLE_RATE, TOP_K);
+
+ printf("\n[Sketch with Fixed Sampling Rate Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ printf("\n[Sketch with Always Bounded Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ count_byte = 1;
+ params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ printf("\n[Sketch with Packet Size Mode]\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ count_byte = 0;
+ params.extra_flag = 0;
+ reset_test = 1;
+ printf("\nreset sketch test\n");
+ if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) {
+ rte_free(keys);
+ return -1;
+ }
+
+ rte_free(keys);
+ return 0;
+}
+
static int
test_member(void)
{
@@ -719,6 +987,10 @@ test_member(void)
return -1;
}
+ if (test_member_sketch() < 0) {
+ perform_free();
+ return -1;
+ }
perform_free();
return 0;
}
diff --git a/app/test/test_member_perf.c b/app/test/test_member_perf.c
index 978db0020a..b7eb3e4c66 100644
--- a/app/test/test_member_perf.c
+++ b/app/test/test_member_perf.c
@@ -13,6 +13,7 @@
#include <rte_random.h>
#include <rte_memcpy.h>
#include <rte_thash.h>
+#include <math.h>
#ifdef RTE_EXEC_ENV_WINDOWS
static int
@@ -26,7 +27,7 @@ test_member_perf(void)
#include <rte_member.h>
-#define NUM_KEYSIZES 10
+#define NUM_KEYSIZES RTE_DIM(hashtest_key_lens)
#define NUM_SHUFFLES 10
#define MAX_KEYSIZE 64
#define MAX_ENTRIES (1 << 19)
@@ -36,12 +37,23 @@ test_member_perf(void)
#define BURST_SIZE 64
#define VBF_FALSE_RATE 0.03
+/* for the heavy hitter detection */
+#define SKETCH_LARGEST_KEY_SIZE (1<<15)
+#define SKETCH_PKT_SIZE 16
+#define TOP_K 100
+#define SKETCH_ERROR_RATE 0.05
+#define SKETCH_SAMPLE_RATE 0.001
+#define NUM_ADDS (KEYS_TO_ADD * 20)
+
static unsigned int test_socket_id;
enum sstype {
HT = 0,
CACHE,
VBF,
+ SKETCH,
+ SKETCH_BOUNDED,
+ SKETCH_BYTE,
NUM_TYPE
};
@@ -88,6 +100,7 @@ static member_set_t data[NUM_TYPE][/* Array to store the data */KEYS_TO_ADD];
/* Array to store all input keys */
static uint8_t keys[KEYS_TO_ADD][MAX_KEYSIZE];
+static uint8_t hh_keys[KEYS_TO_ADD][MAX_KEYSIZE];
/* Shuffle the keys that have been added, so lookups will be totally random */
static void
@@ -136,6 +149,10 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
{
unsigned int i, j;
int num_duplicates;
+ int distinct_key = 0;
+ int count_down = SKETCH_LARGEST_KEY_SIZE;
+ uint32_t swap_idx;
+ uint8_t temp_key[MAX_KEYSIZE];
params->key_size = hashtest_key_lens[cycle];
params->cycle = cycle;
@@ -176,6 +193,22 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
/* Shuffle the random values again */
shuffle_input_keys(params);
+ for (i = 0; i < KEYS_TO_ADD; i++) {
+ if (count_down == 0) {
+ distinct_key++;
+ count_down = ceil(SKETCH_LARGEST_KEY_SIZE / (distinct_key + 1));
+ }
+ memcpy(hh_keys[i], keys[distinct_key], params->key_size);
+ count_down--;
+ }
+
+ for (i = KEYS_TO_ADD - 1; i > 0; i--) {
+ swap_idx = rte_rand() % i;
+ memcpy(temp_key, hh_keys[i], params->key_size);
+ memcpy(hh_keys[i], hh_keys[swap_idx], params->key_size);
+ memcpy(hh_keys[swap_idx], temp_key, params->key_size);
+ }
+
/* For testing miss lookup, we insert half and lookup the other half */
unsigned int entry_cnt, bf_key_cnt;
if (!miss) {
@@ -208,6 +241,44 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle,
params->setsum[VBF] = rte_member_create(&member_params);
if (params->setsum[VBF] == NULL)
fprintf(stderr, "VBF create fail\n");
+
+ member_params.name = "test_member_sketch";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag = 0;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_bounded";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BOUNDED] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BOUNDED] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+ member_params.name = "test_member_sketch_byte";
+ member_params.key_len = params->key_size;
+ member_params.type = RTE_MEMBER_TYPE_SKETCH;
+ member_params.error_rate = SKETCH_ERROR_RATE;
+ member_params.sample_rate = SKETCH_SAMPLE_RATE;
+ member_params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE;
+ member_params.top_k = TOP_K;
+ member_params.prim_hash_seed = rte_rdtsc();
+ params->setsum[SKETCH_BYTE] = rte_member_create(&member_params);
+ if (params->setsum[SKETCH_BYTE] == NULL)
+ fprintf(stderr, "sketch create fail\n");
+
+
for (i = 0; i < NUM_TYPE; i++) {
if (params->setsum[i] == NULL)
return -1;
@@ -243,6 +314,39 @@ timed_adds(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_adds_sketch(struct member_perf_params *params, int type)
+{
+ const uint64_t start_tsc = rte_rdtsc();
+ unsigned int i, j, a;
+ int32_t ret;
+
+ for (i = 0; i < NUM_ADDS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ if (type == SKETCH_BYTE)
+ ret = rte_member_add_byte_count(params->setsum[type],
+ &hh_keys[j], SKETCH_PKT_SIZE);
+ else
+ ret = rte_member_add(params->setsum[type], &hh_keys[j], 1);
+ if (ret < 0) {
+ printf("Error %d in rte_member_add - key=0x", ret);
+ for (a = 0; a < params->key_size; a++)
+ printf("%02x", hh_keys[j][a]);
+ printf("type: %d\n", type);
+
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][ADD] = time_taken / NUM_ADDS;
+
+ return 0;
+}
+
static int
timed_lookups(struct member_perf_params *params, int type)
{
@@ -279,6 +383,36 @@ timed_lookups(struct member_perf_params *params, int type)
return 0;
}
+static int
+timed_lookups_sketch(struct member_perf_params *params, int type)
+{
+ unsigned int i, j;
+
+ false_data[type][params->cycle] = 0;
+
+ const uint64_t start_tsc = rte_rdtsc();
+ member_set_t result;
+ int ret;
+
+ for (i = 0; i < NUM_LOOKUPS / KEYS_TO_ADD; i++) {
+ for (j = 0; j < KEYS_TO_ADD; j++) {
+ ret = rte_member_lookup(params->setsum[type], &hh_keys[j],
+ &result);
+ if (ret < 0) {
+ printf("lookup wrong internally");
+ return -1;
+ }
+ }
+ }
+
+ const uint64_t end_tsc = rte_rdtsc();
+ const uint64_t time_taken = end_tsc - start_tsc;
+
+ cycles[type][params->cycle][LOOKUP] = time_taken / NUM_LOOKUPS;
+
+ return 0;
+}
+
static int
timed_lookups_bulk(struct member_perf_params *params, int type)
{
@@ -531,7 +665,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_adds(¶ms, j) < 0)
return exit_with_fail("timed_adds", ¶ms,
@@ -562,6 +696,17 @@ run_all_tbl_perf_tests(void)
/* Print a dot to show progress on operations */
}
+
+ for (j = SKETCH; j < NUM_TYPE; j++) {
+ if (timed_adds_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_adds_sketch", ¶ms, i, j);
+
+ if (timed_lookups_sketch(¶ms, j) < 0)
+ return exit_with_fail
+ ("timed_lookups_sketch", ¶ms, i, j);
+ }
+
printf(".");
fflush(stdout);
@@ -574,7 +719,7 @@ run_all_tbl_perf_tests(void)
printf("Could not create keys/data/table\n");
return -1;
}
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
if (timed_miss_lookup(¶ms, j) < 0)
return exit_with_fail("timed_miss_lookup",
¶ms, i, j);
@@ -605,7 +750,7 @@ run_all_tbl_perf_tests(void)
"fr_multi_bulk", "false_positive_rate");
/* Key size not influence False rate so just print out one key size */
for (i = 0; i < 1; i++) {
- for (j = 0; j < NUM_TYPE; j++) {
+ for (j = 0; j < SKETCH; j++) {
printf("%-18d", hashtest_key_lens[i]);
printf("%-18d", j);
printf("%-18f", (float)false_data[j][i] / NUM_LOOKUPS);
--
2.25.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v5 1/2] member: implement NitroSketch mode
2022-09-16 3:03 ` [PATCH v5 1/2] member: implement NitroSketch mode Leyi Rong
@ 2022-09-22 3:00 ` Jiang, YuX
2022-10-03 12:37 ` Thomas Monjalon
1 sibling, 0 replies; 34+ messages in thread
From: Jiang, YuX @ 2022-09-22 3:00 UTC (permalink / raw)
To: Rong, Leyi, ferruh.yigit, Richardson, Bruce, david.marchand,
suanmingm, Wang, Yipeng1, zaoxingliu, Gobriel, Sameh
Cc: dev, Rong, Leyi
> -----Original Message-----
> From: Leyi Rong <leyi.rong@intel.com>
> Sent: Friday, September 16, 2022 11:03 AM
> To: ferruh.yigit@xilinx.com; Richardson, Bruce <bruce.richardson@intel.com>;
> david.marchand@redhat.com; suanmingm@nvidia.com; Wang, Yipeng1
> <yipeng1.wang@intel.com>; zaoxingliu@gmail.com; Gobriel, Sameh
> <sameh.gobriel@intel.com>
> Cc: dev@dpdk.org; Rong, Leyi <leyi.rong@intel.com>
> Subject: [PATCH v5 1/2] member: implement NitroSketch mode
>
> Sketching algorithm provide high-fidelity approximate measurements and
> appears as a promising alternative to traditional approaches such as
> packet sampling.
>
> NitroSketch [1] is a software sketching framework that optimizes
> performance, provides accuracy guarantees, and supports a variety of
> sketches.
>
> This commit adds a new data structure called sketch into
> membership library. This new data structure is an efficient
> way to profile the traffic for heavy hitters. Also use min-heap
> structure to maintain the top-k flow keys.
>
> [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir
> Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General
> Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019.
> https://dl.acm.org/doi/pdf/10.1145/3341302.3342076
>
> Signed-off-by: Alan Liu <zaoxingliu@gmail.com>
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> ---
Tested-by: Yu Jiang <yux.jiang@intel.com>
Best regards,
Yu Jiang
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v5 1/2] member: implement NitroSketch mode
2022-09-16 3:03 ` [PATCH v5 1/2] member: implement NitroSketch mode Leyi Rong
2022-09-22 3:00 ` Jiang, YuX
@ 2022-10-03 12:37 ` Thomas Monjalon
2022-10-08 3:36 ` Rong, Leyi
1 sibling, 1 reply; 34+ messages in thread
From: Thomas Monjalon @ 2022-10-03 12:37 UTC (permalink / raw)
To: yipeng1.wang, Leyi Rong
Cc: ferruh.yigit, bruce.richardson, david.marchand, suanmingm,
zaoxingliu, sameh.gobriel, dev
16/09/2022 05:03, Leyi Rong:
> lib/member/meson.build | 49 ++-
> lib/member/rte_member.c | 75 ++++
> lib/member/rte_member.h | 154 ++++++-
> lib/member/rte_member_heap.h | 424 ++++++++++++++++++
> lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
> lib/member/rte_member_sketch.h | 97 +++++
> lib/member/rte_member_sketch_avx512.c | 70 +++
> lib/member/rte_member_sketch_avx512.h | 35 ++
> lib/member/rte_xxh64_avx512.h | 117 +++++
What means xxh64? Should it be x86_64?
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v5 1/2] member: implement NitroSketch mode
2022-10-03 12:37 ` Thomas Monjalon
@ 2022-10-08 3:36 ` Rong, Leyi
0 siblings, 0 replies; 34+ messages in thread
From: Rong, Leyi @ 2022-10-08 3:36 UTC (permalink / raw)
To: Thomas Monjalon, Wang, Yipeng1
Cc: ferruh.yigit, Richardson, Bruce, david.marchand, suanmingm,
zaoxingliu, Gobriel, Sameh, dev
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Monday, October 3, 2022 8:38 PM
> To: Wang, Yipeng1 <yipeng1.wang@intel.com>; Rong, Leyi
> <leyi.rong@intel.com>
> Cc: ferruh.yigit@xilinx.com; Richardson, Bruce <bruce.richardson@intel.com>;
> david.marchand@redhat.com; suanmingm@nvidia.com; zaoxingliu@gmail.com;
> Gobriel, Sameh <sameh.gobriel@intel.com>; dev@dpdk.org
> Subject: Re: [PATCH v5 1/2] member: implement NitroSketch mode
>
> 16/09/2022 05:03, Leyi Rong:
> > lib/member/meson.build | 49 ++-
> > lib/member/rte_member.c | 75 ++++
> > lib/member/rte_member.h | 154 ++++++-
> > lib/member/rte_member_heap.h | 424 ++++++++++++++++++
> > lib/member/rte_member_sketch.c | 594 ++++++++++++++++++++++++++
> > lib/member/rte_member_sketch.h | 97 +++++
> > lib/member/rte_member_sketch_avx512.c | 70 +++
> > lib/member/rte_member_sketch_avx512.h | 35 ++
> > lib/member/rte_xxh64_avx512.h | 117 +++++
>
> What means xxh64? Should it be x86_64?
Hi Thomas,
xxh64 stands for xxHash generates 64-bit hashes, using 64-bit arithmetic.
>
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v5 0/2] introduce NitroSketch Mode into membership library
2022-09-16 3:03 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-09-16 3:03 ` [PATCH v5 1/2] member: implement NitroSketch mode Leyi Rong
2022-09-16 3:03 ` [PATCH v5 2/2] test/member: add functional and perf tests for sketch Leyi Rong
@ 2022-10-09 23:13 ` Thomas Monjalon
2022-10-10 2:36 ` Rong, Leyi
2 siblings, 1 reply; 34+ messages in thread
From: Thomas Monjalon @ 2022-10-09 23:13 UTC (permalink / raw)
To: Leyi Rong
Cc: ferruh.yigit, bruce.richardson, david.marchand, suanmingm,
yipeng1.wang, zaoxingliu, sameh.gobriel, dev, dev
16/09/2022 05:03, Leyi Rong:
> This patchset introduce a brand new NitroSketch Mode into membership
> library. This algorithm provides high-fidelity approximate measurements
> and appears as a promissing alternative to triditional approches such as
> packet sampling.
[...]
> Leyi Rong (2):
> member: implement NitroSketch mode
> test/member: add functional and perf tests for sketch
Fixed few code style warnings, added a release note,
and applied, thanks.
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v5 0/2] introduce NitroSketch Mode into membership library
2022-10-09 23:13 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Thomas Monjalon
@ 2022-10-10 2:36 ` Rong, Leyi
2022-10-10 7:30 ` Thomas Monjalon
0 siblings, 1 reply; 34+ messages in thread
From: Rong, Leyi @ 2022-10-10 2:36 UTC (permalink / raw)
To: Thomas Monjalon
Cc: ferruh.yigit, Richardson, Bruce, david.marchand, suanmingm, Wang,
Yipeng1, zaoxingliu, Gobriel, Sameh, dev, dev
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Monday, October 10, 2022 7:13 AM
> To: Rong, Leyi <leyi.rong@intel.com>
> Cc: ferruh.yigit@xilinx.com; Richardson, Bruce <bruce.richardson@intel.com>;
> david.marchand@redhat.com; suanmingm@nvidia.com; Wang, Yipeng1
> <yipeng1.wang@intel.com>; zaoxingliu@gmail.com; Gobriel, Sameh
> <sameh.gobriel@intel.com>; dev@dpdk.org; dev@dpdk.org
> Subject: Re: [PATCH v5 0/2] introduce NitroSketch Mode into membership
> library
>
> 16/09/2022 05:03, Leyi Rong:
> > This patchset introduce a brand new NitroSketch Mode into membership
> > library. This algorithm provides high-fidelity approximate
> > measurements and appears as a promissing alternative to triditional
> > approches such as packet sampling.
> [...]
> > Leyi Rong (2):
> > member: implement NitroSketch mode
> > test/member: add functional and perf tests for sketch
>
> Fixed few code style warnings, added a release note, and applied, thanks.
>
Thank you, Thomas!
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v5 0/2] introduce NitroSketch Mode into membership library
2022-10-10 2:36 ` Rong, Leyi
@ 2022-10-10 7:30 ` Thomas Monjalon
2022-10-10 8:28 ` Rong, Leyi
0 siblings, 1 reply; 34+ messages in thread
From: Thomas Monjalon @ 2022-10-10 7:30 UTC (permalink / raw)
To: Rong, Leyi
Cc: ferruh.yigit, Richardson, Bruce, david.marchand, suanmingm, Wang,
Yipeng1, zaoxingliu, Gobriel, Sameh, dev
10/10/2022 04:36, Rong, Leyi:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 16/09/2022 05:03, Leyi Rong:
> > > This patchset introduce a brand new NitroSketch Mode into membership
> > > library. This algorithm provides high-fidelity approximate
> > > measurements and appears as a promissing alternative to triditional
> > > approches such as packet sampling.
> > [...]
> > > Leyi Rong (2):
> > > member: implement NitroSketch mode
> > > test/member: add functional and perf tests for sketch
> >
> > Fixed few code style warnings, added a release note, and applied, thanks.
> >
>
> Thank you, Thomas!
Please, could you check this compilation issue?
https://bugs.dpdk.org/show_bug.cgi?id=1096
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v5 0/2] introduce NitroSketch Mode into membership library
2022-10-10 7:30 ` Thomas Monjalon
@ 2022-10-10 8:28 ` Rong, Leyi
0 siblings, 0 replies; 34+ messages in thread
From: Rong, Leyi @ 2022-10-10 8:28 UTC (permalink / raw)
To: Thomas Monjalon
Cc: ferruh.yigit, Richardson, Bruce, david.marchand, suanmingm, Wang,
Yipeng1, zaoxingliu, Gobriel, Sameh, dev
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Monday, October 10, 2022 3:30 PM
> To: Rong, Leyi <leyi.rong@intel.com>
> Cc: ferruh.yigit@xilinx.com; Richardson, Bruce <bruce.richardson@intel.com>;
> david.marchand@redhat.com; suanmingm@nvidia.com; Wang, Yipeng1
> <yipeng1.wang@intel.com>; zaoxingliu@gmail.com; Gobriel, Sameh
> <sameh.gobriel@intel.com>; dev@dpdk.org
> Subject: Re: [PATCH v5 0/2] introduce NitroSketch Mode into membership
> library
>
> 10/10/2022 04:36, Rong, Leyi:
> > From: Thomas Monjalon <thomas@monjalon.net>
> > > 16/09/2022 05:03, Leyi Rong:
> > > > This patchset introduce a brand new NitroSketch Mode into
> > > > membership library. This algorithm provides high-fidelity
> > > > approximate measurements and appears as a promissing alternative
> > > > to triditional approches such as packet sampling.
> > > [...]
> > > > Leyi Rong (2):
> > > > member: implement NitroSketch mode
> > > > test/member: add functional and perf tests for sketch
> > >
> > > Fixed few code style warnings, added a release note, and applied, thanks.
> > >
> >
> > Thank you, Thomas!
>
> Please, could you check this compilation issue?
> https://bugs.dpdk.org/show_bug.cgi?id=1096
>
Yes, fix patch is sent with https://patchwork.dpdk.org/project/dpdk/patch/20221010082245.2868071-1-leyi.rong@intel.com/
>
>
>
^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2022-10-10 8:29 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-10 7:45 [PATCH 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-08-10 7:45 ` [PATCH 1/2] member: implement NitroSketch mode Leyi Rong
2022-08-16 8:59 ` Ferruh Yigit
2022-08-26 2:32 ` Rong, Leyi
2022-08-10 7:45 ` [PATCH 2/2] test/member: add functional and perf tests for sketch Leyi Rong
2022-08-11 2:33 ` Suanming Mou
2022-08-26 2:26 ` Rong, Leyi
2022-08-31 6:46 ` [PATCH v2 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-08-31 6:46 ` [PATCH v2 1/2] member: implement NitroSketch mode Leyi Rong
2022-09-13 14:56 ` David Marchand
2022-09-14 9:42 ` Rong, Leyi
2022-08-31 6:46 ` [PATCH v2 2/2] test/member: add functional and perf tests for sketch Leyi Rong
2022-09-15 2:14 ` [PATCH v3 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-09-15 2:14 ` [PATCH v3 1/2] member: implement NitroSketch mode Leyi Rong
2022-09-15 7:47 ` David Marchand
2022-09-15 13:02 ` Rong, Leyi
2022-09-15 2:14 ` [PATCH v3 2/2] test/member: add functional and perf tests for sketch Leyi Rong
2022-09-15 7:30 ` Jiang, YuX
2022-09-15 7:41 ` David Marchand
2022-09-15 13:03 ` [PATCH v4 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-09-15 13:03 ` [PATCH v4 1/2] member: implement NitroSketch mode Leyi Rong
2022-09-15 14:12 ` Bruce Richardson
2022-09-16 2:30 ` Rong, Leyi
2022-09-15 13:03 ` [PATCH v4 2/2] test/member: add functional and perf tests for sketch Leyi Rong
2022-09-16 3:03 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Leyi Rong
2022-09-16 3:03 ` [PATCH v5 1/2] member: implement NitroSketch mode Leyi Rong
2022-09-22 3:00 ` Jiang, YuX
2022-10-03 12:37 ` Thomas Monjalon
2022-10-08 3:36 ` Rong, Leyi
2022-09-16 3:03 ` [PATCH v5 2/2] test/member: add functional and perf tests for sketch Leyi Rong
2022-10-09 23:13 ` [PATCH v5 0/2] introduce NitroSketch Mode into membership library Thomas Monjalon
2022-10-10 2:36 ` Rong, Leyi
2022-10-10 7:30 ` Thomas Monjalon
2022-10-10 8:28 ` Rong, Leyi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).