From: "Ananyev, Konstantin"
To: "Shen, Wei1", "dev@dpdk.org"
CC: "De Lara Guarch, Pablo", "stephen@networkplumber.org", "Tai, Charlie", "Maciocco, Christian", "Gobriel, Sameh", "Shen, Wei1"
Date: Thu, 16 Jun 2016 12:14:18 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725836B72290@irsmsx105.ger.corp.intel.com>
References: <1462565102-15312-1-git-send-email-wei1.shen@intel.com> <1466052753-69632-1-git-send-email-wei1.shen@intel.com>
In-Reply-To: <1466052753-69632-1-git-send-email-wei1.shen@intel.com>
Subject: Re: [dpdk-dev] [PATCH v2] rte_hash: add scalable multi-writer insertion w/ Intel TSX
List-Id: patches and discussions about DPDK

Hi Wei,

> -----Original Message-----
> From: dev
[mailto:dev-bounces@dpdk.org] On Behalf Of Wei Shen
> Sent: Thursday, June 16, 2016 5:53 AM
> To: dev@dpdk.org
> Cc: De Lara Guarch, Pablo; stephen@networkplumber.org; Tai, Charlie; Maciocco, Christian; Gobriel, Sameh; Shen, Wei1
> Subject: [dpdk-dev] [PATCH v2] rte_hash: add scalable multi-writer insertion w/ Intel TSX
>
> This patch introduced scalable multi-writer Cuckoo Hash insertion
> based on a split Cuckoo Search and Move operation using Intel
> TSX. It can do scalable hash insertion with 22 cores with little
> performance loss and negligible TSX abortion rate.
>
> * Added an extra rte_hash flag definition to switch default
>   single writer Cuckoo Hash behavior to multiwriter.
>
> * Added a make_space_insert_bfs_mw() function to do split Cuckoo
>   search in BFS order.
>
> * Added tsx_cuckoo_move_insert() to do Cuckoo move in Intel TSX
>   protected manner.
>
> * Added test_hash_multiwriter() as test case for multi-writer
>   Cuckoo Hash.
>
> Signed-off-by: Shen Wei
> Signed-off-by: Sameh Gobriel
> ---
>  app/test/Makefile                      |   1 +
>  app/test/test_hash_multiwriter.c       | 272 +++++++++++++++++++++++++++++++++
>  doc/guides/rel_notes/release_16_07.rst |  12 ++
>  lib/librte_hash/rte_cuckoo_hash.c      | 231 +++++++++++++++++++++++++---
>  lib/librte_hash/rte_hash.h             |   3 +
>  5 files changed, 494 insertions(+), 25 deletions(-)
>  create mode 100644 app/test/test_hash_multiwriter.c
>
> diff --git a/app/test/Makefile b/app/test/Makefile
> index 053f3a2..5476300 100644
> --- a/app/test/Makefile
> +++ b/app/test/Makefile
> @@ -120,6 +120,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_thash.c
>  SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
>  SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
>  SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
>
>  SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
>  SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
> diff --git
a/app/test/test_hash_multiwriter.c b/app/test/test_hash_multiwriter.c
> new file mode 100644
> index 0000000..54a0d2c
> --- /dev/null
> +++ b/app/test/test_hash_multiwriter.c
> @@ -0,0 +1,272 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright(c) 2016 Intel Corporation. All rights reserved.
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include "test.h"
> +
> +/*
> + * Check condition and return an error if true. Assumes that "handle" is the
> + * name of the hash structure pointer to be freed.
> + */
> +#define RETURN_IF_ERROR(cond, str, ...) do { \
> +	if (cond) { \
> +		printf("ERROR line %d: " str "\n", __LINE__, \
> +			##__VA_ARGS__); \
> +		if (handle) \
> +			rte_hash_free(handle); \
> +		return -1; \
> +	} \
> +} while (0)
> +
> +#define RTE_APP_TEST_HASH_MULTIWRITER_FAILED 0
> +
> +struct {
> +	uint32_t *keys;
> +	uint32_t *found;
> +	uint32_t nb_tsx_insertion;
> +	struct rte_hash *h;
> +} tbl_multiwriter_test_params;
> +
> +const uint32_t nb_entries = 16*1024*1024;
> +const uint32_t nb_total_tsx_insertion = 15*1024*1024;
> +uint32_t rounded_nb_total_tsx_insertion;
> +
> +static rte_atomic64_t gcycles;
> +static rte_atomic64_t ginsertions;
> +
> +static int
> +test_hash_multiwriter_worker(__attribute__((unused)) void *arg)
> +{
> +	uint64_t i, offset;
> +	uint32_t lcore_id = rte_lcore_id();
> +	uint64_t begin, cycles;
> +
> +	offset = (lcore_id - rte_get_master_lcore())
> +		* tbl_multiwriter_test_params.nb_tsx_insertion;
> +
> +	printf("Core #%d inserting %d: %'"PRId64" - %'"PRId64"\n",
> +		lcore_id, tbl_multiwriter_test_params.nb_tsx_insertion,
> +		offset, offset + tbl_multiwriter_test_params.nb_tsx_insertion);
> +
> +	begin = rte_rdtsc_precise();
> +
> +	for (i = offset;
> +	     i < offset + tbl_multiwriter_test_params.nb_tsx_insertion;
> +	     i++) {
> +		if (rte_hash_add_key(tbl_multiwriter_test_params.h,
> +				     tbl_multiwriter_test_params.keys + i) < 0)
> +			break;
> +	}
> +
> +	cycles = rte_rdtsc_precise() - begin;
> +	rte_atomic64_add(&gcycles, cycles);
> +	rte_atomic64_add(&ginsertions, i - offset);
> +
> +	for (; i < offset + tbl_multiwriter_test_params.nb_tsx_insertion; i++)
> +		tbl_multiwriter_test_params.keys[i]
> +			=
RTE_APP_TEST_HASH_MULTIWRITER_FAILED;
> +
> +	return 0;
> +}
> +
> +
> +static int
> +test_hash_multiwriter(void)
> +{
> +	unsigned int i, rounded_nb_total_tsx_insertion;
> +	static unsigned calledCount = 1;
> +
> +	uint32_t *keys;
> +	uint32_t *found;
> +
> +	struct rte_hash_parameters hash_params = {
> +		.entries = nb_entries,
> +		.key_len = sizeof(uint32_t),
> +		.hash_func = rte_hash_crc,
> +		.hash_func_init_val = 0,
> +		.socket_id = rte_socket_id(),
> +		.extra_flag = RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT
> +			| RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD,
> +	};
> +
> +	struct rte_hash *handle;
> +	char name[RTE_HASH_NAMESIZE];
> +
> +	const void *next_key;
> +	void *next_data;
> +	uint32_t iter = 0;
> +
> +	uint32_t duplicated_keys = 0;
> +	uint32_t lost_keys = 0;
> +
> +	snprintf(name, 32, "test%u", calledCount++);
> +	hash_params.name = name;
> +
> +	handle = rte_hash_create(&hash_params);
> +	RETURN_IF_ERROR(handle == NULL, "hash creation failed");
> +
> +	tbl_multiwriter_test_params.h = handle;
> +	tbl_multiwriter_test_params.nb_tsx_insertion =
> +		nb_total_tsx_insertion / rte_lcore_count();
> +
> +	rounded_nb_total_tsx_insertion = (nb_total_tsx_insertion /
> +		tbl_multiwriter_test_params.nb_tsx_insertion)
> +		* tbl_multiwriter_test_params.nb_tsx_insertion;
> +
> +	rte_srand(rte_rdtsc());
> +
> +	keys = rte_malloc(NULL, sizeof(uint32_t) * nb_entries, 0);
> +
> +	if (keys == NULL) {
> +		printf("RTE_MALLOC failed\n");
> +		goto err1;
> +	}
> +
> +	found = rte_zmalloc(NULL, sizeof(uint32_t) * nb_entries, 0);
> +	if (found == NULL) {
> +		printf("RTE_ZMALLOC failed\n");
> +		goto err2;
> +	}
> +
> +	for (i = 0; i < nb_entries; i++)
> +		keys[i] = i;
> +
> +	tbl_multiwriter_test_params.keys = keys;
> +	tbl_multiwriter_test_params.found = found;
> +
> +	rte_atomic64_init(&gcycles);
> +	rte_atomic64_clear(&gcycles);
> +
> +	rte_atomic64_init(&ginsertions);
> +	rte_atomic64_clear(&ginsertions);
> +
> +	/* Fire all
threads. */
> +	rte_eal_mp_remote_launch(test_hash_multiwriter_worker,
> +				 NULL, CALL_MASTER);
> +	rte_eal_mp_wait_lcore();
> +
> +	while (rte_hash_iterate(handle, &next_key, &next_data, &iter) >= 0) {
> +		/* Search for the key in the list of keys added .*/
> +		i = *(const uint32_t *)next_key;
> +		tbl_multiwriter_test_params.found[i]++;
> +	}
> +
> +	for (i = 0; i < rounded_nb_total_tsx_insertion; i++) {
> +		if (tbl_multiwriter_test_params.keys[i]
> +		    != RTE_APP_TEST_HASH_MULTIWRITER_FAILED) {
> +			if (tbl_multiwriter_test_params.found[i] > 1) {
> +				duplicated_keys++;
> +				break;
> +			}
> +			if (tbl_multiwriter_test_params.found[i] == 0) {
> +				lost_keys++;
> +				printf("key %d is lost\n", i);
> +				break;
> +			}
> +		}
> +	}
> +
> +	if (duplicated_keys > 0) {
> +		printf("%d key duplicated\n", duplicated_keys);
> +		goto err3;
> +	}
> +
> +	if (lost_keys > 0) {
> +		printf("%d key lost\n", lost_keys);
> +		goto err3;
> +	}
> +
> +	printf("No key corrupted during multiwriter insertion.\n");
> +
> +	unsigned long long int cycles_per_insertion =
> +		rte_atomic64_read(&gcycles)/
> +		rte_atomic64_read(&ginsertions);
> +
> +	printf(" cycles per insertion: %llu\n", cycles_per_insertion);
> +
> +	rte_free(tbl_multiwriter_test_params.found);
> +	rte_free(tbl_multiwriter_test_params.keys);
> +	rte_hash_free(handle);
> +	return 0;
> +
> +err3:
> +	rte_free(tbl_multiwriter_test_params.found);
> +err2:
> +	rte_free(tbl_multiwriter_test_params.keys);
> +err1:
> +	rte_hash_free(handle);
> +	return -1;
> +}
> +
> +static int
> +test_hash_multiwriter_main(void)
> +{
> +	int r = -1;
> +
> +	if (rte_lcore_count() == 1) {
> +		printf(
> +			"More than one lcore is required to do multiwriter test\n");
> +		return 0;
> +	}
> +
> +	if (!rte_tm_supported()) {
> +		printf(
> +			"Hardware transactional memory (lock elision) is NOT supported\n");
> +		return 0;
> +	}
> +
> +	printf("Hardware transactional memory (lock elision) is supported\n");
> +
> +	setlocale(LC_NUMERIC, "");
> +
> +	r =
test_hash_multiwriter();
> +
> +	return r;
> +}
> +
> +
> +static struct test_command hash_scaling_cmd = {
> +	.command = "hash_multiwriter_autotest",
> +	.callback = test_hash_multiwriter_main,
> +};
> +
> +REGISTER_TEST_COMMAND(hash_scaling_cmd);
> diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst
> index 131723c..f8264fb 100644
> --- a/doc/guides/rel_notes/release_16_07.rst
> +++ b/doc/guides/rel_notes/release_16_07.rst
> @@ -70,6 +70,18 @@ New Features
>    * Enable RSS per network interface through the configuration file.
>    * Streamline the CLI code.
>
> +* **Added multi-writer support for RTE Hash with Intel TSX.**
> +
> +  The following features/modifications have been added to rte_hash library:
> +
> +  * Enabled application developers to use an extra flag for rte_hash creation
> +    to specify default behavior (multi-thread safe/unsafe) with rte_hash_add_key
> +    function.
> +  * Changed Cuckoo search algorithm to breadth first search for multi-writer
> +    routine and split Cuckoo Search and Move operations in order to reduce
> +    transactional code region and improve TSX performance.
> +  * Added a hash multi-writer test case for test app.
> +
>
>  Resolved Issues
>  ---------------
> diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
> index 7b7d1f8..3cb6770 100644
> --- a/lib/librte_hash/rte_cuckoo_hash.c
> +++ b/lib/librte_hash/rte_cuckoo_hash.c
> @@ -1,7 +1,7 @@
>  /*-
>   * BSD LICENSE
>   *
> - * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
>   * All rights reserved.
>   *
>   * Redistribution and use in source and binary forms, with or without
> @@ -100,7 +100,13 @@ EAL_REGISTER_TAILQ(rte_hash_tailq)
>
>  #define KEY_ALIGNMENT 16
>
> -#define LCORE_CACHE_SIZE 8
> +#define LCORE_CACHE_SIZE 64
> +
> +#define RTE_HASH_BFS_QUEUE_MAX_LEN 1000
> +
> +#define RTE_XABORT_CUCKOO_PATH_INVALIDED 0x4
> +
> +#define RTE_HASH_TSX_MAX_RETRY 10
>
>  #if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64)
>  /*
> @@ -190,6 +196,7 @@ struct rte_hash {
>  							memory support */
>  	struct lcore_cache *local_free_slots;
>  	/**< Local cache per lcore, storing some indexes of the free slots */
> +	uint8_t multiwriter_add; /**< Multi-write safe hash add behavior */
>  } __rte_cache_aligned;
>
>  /* Structure storing both primary and secondary hashes */
> @@ -372,7 +379,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
>
>  	/*
>  	 * If x86 architecture is used, select appropriate compare function,
> -	 * which may use x86 instrinsics, otherwise use memcmp
> +	 * which may use x86 intrinsics, otherwise use memcmp
>  	 */
>  #if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64)
>  	/* Select function to compare keys */
> @@ -431,7 +438,16 @@ rte_hash_create(const struct rte_hash_parameters *params)
>  	h->free_slots = r;
>  	h->hw_trans_mem_support = hw_trans_mem_support;
>
> -	/* populate the free slots ring. Entry zero is reserved for key misses */
> +	/* Turn on multi-writer only with explicit flat from user and TM
> +	 * support.
> +	 */
> +	if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD
> +	    && h->hw_trans_mem_support)
> +		h->multiwriter_add = 1;
> +	else
> +		h->multiwriter_add = 0;

I wonder why multi-writer support has to be implemented only for machines with TSX support.
From the initial discussion, my understanding was that it would work on platforms both with and without TSX:
on non-TSX platforms the spin-lock approach would be used.
Am I missing something here?

> +
> +	/* Populate free slots ring.
Entry zero is reserved for key misses. */
>  	for (i = 1; i < params->entries + 1; i++)
>  		rte_ring_sp_enqueue(r, (void *)((uintptr_t) i));
>
> @@ -599,6 +615,123 @@ make_space_bucket(const struct rte_hash *h, struct rte_hash_bucket *bkt)
>
>  }
>
> +struct queue_node {
> +	struct rte_hash_bucket *bkt; /* Current bucket on the bfs search */
> +
> +	struct queue_node *prev; /* Parent(bucket) in search path */
> +	int prev_slot; /* Parent(slot) in search path */
> +};
> +
> +/* Shift buckets along cuckoo_path and fill the path head with new entry */
> +static inline int
> +tsx_cuckoo_move_insert(const struct rte_hash *h, struct queue_node *leaf,
> +		uint32_t leaf_slot, hash_sig_t sig,
> +		hash_sig_t alt_hash, uint32_t new_idx)
> +{
> +	unsigned try = 0;
> +	unsigned status;
> +	uint32_t prev_alt_bkt_idx;
> +
> +	struct queue_node *prev_node, *curr_node = leaf;
> +	struct rte_hash_bucket *prev_bkt, *curr_bkt = leaf->bkt;
> +	uint32_t prev_slot, curr_slot = leaf_slot;
> +
> +	while (try < RTE_HASH_TSX_MAX_RETRY) {
> +		status = rte_xbegin();

Hmm, would this compile on a non-IA platform?
As I remember, we have rte_xbegin/xend/... defined only for the x86 arch,
and I don't see any mechanism here to exclude that code from compilation on non-x86 archs
(#ifdef RTE_ARCH_X86 or so).
Konstantin

> +		if (likely(status == RTE_XBEGIN_STARTED)) {
> +			while (likely(curr_node->prev != NULL)) {
> +				prev_node = curr_node->prev;
> +				prev_bkt = prev_node->bkt;
> +				prev_slot = curr_node->prev_slot;
> +
> +				prev_alt_bkt_idx
> +					= prev_bkt->signatures[prev_slot].alt
> +					& h->bucket_bitmask;
> +
> +				if (unlikely(&h->buckets[prev_alt_bkt_idx]
> +					     != curr_bkt)) {
> +					rte_xabort(RTE_XABORT_CUCKOO_PATH_INVALIDED);
> +				}
> +
> +				/* Need to swap current/alt sig to allow later Cuckoo insert to
> +				 * move elements back to its primary bucket if available
> +				 */
> +				curr_bkt->signatures[curr_slot].alt =
> +					prev_bkt->signatures[prev_slot].current;
> +				curr_bkt->signatures[curr_slot].current =
> +					prev_bkt->signatures[prev_slot].alt;
> +				curr_bkt->key_idx[curr_slot]
> +					= prev_bkt->key_idx[prev_slot];
> +
> +				curr_slot = prev_slot;
> +				curr_node = prev_node;
> +				curr_bkt = curr_node->bkt;
> +			}
> +
> +			curr_bkt->signatures[curr_slot].current = sig;
> +			curr_bkt->signatures[curr_slot].alt = alt_hash;
> +			curr_bkt->key_idx[curr_slot] = new_idx;
> +
> +			rte_xend();
> +
> +			return 0;
> +		}
> +
> +		/* If we abort we give up this cuckoo path, since most likely it's
> +		 * no longer valid as TSX detected data conflict
> +		 */
> +		try++;
> +		rte_pause();
> +	}
> +
> +	return -1;
> +}
> +
> +/*
> + * Make space for new key, using bfs Cuckoo Search and Multi-Writer safe
> + * Cuckoo
> + */
> +static inline int
> +make_space_insert_bfs_mw(const struct rte_hash *h, struct rte_hash_bucket *bkt,
> +			hash_sig_t sig, hash_sig_t alt_hash,
> +			uint32_t new_idx)
> +{
> +	unsigned i;
> +	struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN];
> +	struct queue_node *tail, *head;
> +	struct rte_hash_bucket *curr_bkt, *alt_bkt;
> +
> +	tail = queue;
> +	head = queue + 1;
> +	tail->bkt = bkt;
> +	tail->prev = NULL;
> +	tail->prev_slot = -1;
> +
> +	/* Cuckoo bfs Search */
> +	while (likely(tail != head && head <
> +		      queue + RTE_HASH_BFS_QUEUE_MAX_LEN
- 4)) {
> +		curr_bkt = tail->bkt;
> +		for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
> +			if (curr_bkt->signatures[i].sig == NULL_SIGNATURE) {
> +				if (likely(tsx_cuckoo_move_insert(h, tail, i,
> +						sig, alt_hash, new_idx) == 0))
> +					return 0;
> +			}
> +
> +			/* Enqueue new node and keep prev node info */
> +			alt_bkt = &(h->buckets[curr_bkt->signatures[i].alt
> +					& h->bucket_bitmask]);
> +			head->bkt = alt_bkt;
> +			head->prev = tail;
> +			head->prev_slot = i;
> +			head++;
> +		}
> +		tail++;
> +	}
> +
> +	return -ENOSPC;
> +}
> +
>  /*
>   * Function called to enqueue back an index in the cache/ring,
>   * as slot has not being used and it can be used in the
> @@ -712,30 +845,78 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
>  	rte_memcpy(new_k->key, key, h->key_len);
>  	new_k->pdata = data;
>
> -	/* Insert new entry is there is room in the primary bucket */
> -	for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
> -		/* Check if slot is available */
> -		if (likely(prim_bkt->signatures[i].sig == NULL_SIGNATURE)) {
> -			prim_bkt->signatures[i].current = sig;
> -			prim_bkt->signatures[i].alt = alt_hash;
> -			prim_bkt->key_idx[i] = new_idx;
> +	if (h->multiwriter_add) {
> +		unsigned status;
> +		unsigned try = 0;
> +
> +		while (try < RTE_HASH_TSX_MAX_RETRY) {
> +			status = rte_xbegin();
> +			if (likely(status == RTE_XBEGIN_STARTED)) {
> +				/* Insert new entry if there is room in the primary
> +				 * bucket.
> +				 */
> +				for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
> +					/* Check if slot is available */
> +					if (likely(prim_bkt->signatures[i].sig == NULL_SIGNATURE)) {
> +						prim_bkt->signatures[i].current = sig;
> +						prim_bkt->signatures[i].alt = alt_hash;
> +						prim_bkt->key_idx[i] = new_idx;
> +						break;
> +					}
> +				}
> +				rte_xend();
> +
> +				if (i != RTE_HASH_BUCKET_ENTRIES)
> +					return new_idx - 1;
> +
> +				break; /* break off try loop if transaction commits */
> +			} else {
> +				/* If we abort we give up this cuckoo path.
*/
> +				try++;
> +				rte_pause();
> +			}
> +		}
> +
> +		/* Primary bucket full, need to make space for new entry */
> +		ret = make_space_insert_bfs_mw(h, prim_bkt, sig, alt_hash,
> +					       new_idx);
> +
> +		if (ret >= 0)
>  			return new_idx - 1;
> +
> +		/* Also search secondary bucket to get better occupancy */
> +		ret = make_space_insert_bfs_mw(h, sec_bkt, sig, alt_hash,
> +					       new_idx);
> +
> +		if (ret >= 0)
> +			return new_idx - 1;
> +	} else {
> +		for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
> +			/* Check if slot is available */
> +			if (likely(prim_bkt->signatures[i].sig == NULL_SIGNATURE)) {
> +				prim_bkt->signatures[i].current = sig;
> +				prim_bkt->signatures[i].alt = alt_hash;
> +				prim_bkt->key_idx[i] = new_idx;
> +				break;
> +			}
>  		}
> -	}
>
> -	/* Primary bucket is full, so we need to make space for new entry */
> -	ret = make_space_bucket(h, prim_bkt);
> -	/*
> -	 * After recursive function.
> -	 * Insert the new entry in the position of the pushed entry
> -	 * if successful or return error and
> -	 * store the new slot back in the ring
> -	 */
> -	if (ret >= 0) {
> -		prim_bkt->signatures[ret].current = sig;
> -		prim_bkt->signatures[ret].alt = alt_hash;
> -		prim_bkt->key_idx[ret] = new_idx;
> -		return new_idx - 1;
> +		if (i != RTE_HASH_BUCKET_ENTRIES)
> +			return new_idx - 1;
> +
> +		/* Primary bucket full, need to make space for new entry
> +		 * After recursive function.
> +		 * Insert the new entry in the position of the pushed entry
> +		 * if successful or return error and
> +		 * store the new slot back in the ring
> +		 */
> +		ret = make_space_bucket(h, prim_bkt);
> +		if (ret >= 0) {
> +			prim_bkt->signatures[ret].current = sig;
> +			prim_bkt->signatures[ret].alt = alt_hash;
> +			prim_bkt->key_idx[ret] = new_idx;
> +			return new_idx - 1;
> +		}
>  	}
>
>  	/* Error in addition, store new slot back in the ring and return error */
> diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
> index 724315a..c9612fb 100644
> --- a/lib/librte_hash/rte_hash.h
> +++ b/lib/librte_hash/rte_hash.h
> @@ -60,6 +60,9 @@ extern "C" {
>  /** Enable Hardware transactional memory support. */
>  #define RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT 0x01
>
> +/** Default behavior of insertion, single writer/multi writer */
> +#define RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD 0x02
> +
>  /** Signature of key that is stored internally. */
>  typedef uint32_t hash_sig_t;
>
> --
> 2.5.5
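
To make the two review questions above concrete (a spin-lock path for non-TSX platforms, and keeping rte_xbegin()/rte_xend() out of non-x86 builds), here is a minimal sketch of the usual "try a transaction a few times, then fall back to a lock" shape. It is not taken from the patch. The names mw_lock, mw_tx_begin(), mw_tx_end(), mw_insert() and toy_insert() are hypothetical, and the transaction hooks are portable stubs standing in for whatever would sit behind an #ifdef RTE_ARCH_X86 guard.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MW_TX_MAX_RETRY 10 /* same retry budget as RTE_HASH_TSX_MAX_RETRY */

/* Hypothetical fallback lock, one per hash table. */
struct mw_lock {
	atomic_flag flag;
};

/*
 * Hypothetical hooks. On x86 they could wrap rte_xbegin()/rte_xend()
 * behind #ifdef RTE_ARCH_X86; here they are portable stubs that always
 * report "no transaction", so every insert takes the spin-lock path.
 */
static inline bool mw_tx_begin(void) { return false; }
static inline void mw_tx_end(void) { }

typedef int (*mw_insert_fn)(void *table, unsigned int key);

/* Run fn() under a transaction if one starts, else under the spin lock. */
static int
mw_insert(struct mw_lock *l, mw_insert_fn fn, void *table, unsigned int key)
{
	unsigned int attempt;
	int ret;

	assert(l != NULL && fn != NULL);

	for (attempt = 0; attempt < MW_TX_MAX_RETRY; attempt++) {
		if (mw_tx_begin()) {
			/*
			 * A real elided path must also read the lock inside
			 * the transaction and abort if it is held.
			 */
			ret = fn(table, key);
			mw_tx_end();
			return ret;
		}
	}

	/* Fallback: plain test-and-set spin lock. */
	while (atomic_flag_test_and_set_explicit(&l->flag,
						 memory_order_acquire))
		;
	ret = fn(table, key);
	atomic_flag_clear_explicit(&l->flag, memory_order_release);
	return ret;
}

/* Toy insert used only to exercise the wrapper: counts calls. */
static int
toy_insert(void *table, unsigned int key)
{
	(void)key;
	(*(unsigned int *)table)++;
	return 0;
}
```

With something of this shape, the RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD flag would not need to depend on hw_trans_mem_support: the flag would select multi-writer behaviour, and TM availability would only decide which of the two paths is taken.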