From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR04-VI1-obe.outbound.protection.outlook.com (mail-eopbgr80054.outbound.protection.outlook.com [40.107.8.54]) by dpdk.org (Postfix) with ESMTP id B25475F24 for ; Thu, 27 Sep 2018 06:24:14 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector1-arm-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=hBAMJ2nBST/Ljaq25SLF8zYUd5sQeooqkaIWer060v4=; b=glOAfS0nHxlPmYPwNQQTG6SEViTC3iKrj4aNi1rDvmifhEgCAmtLxr00u3Hf0uiGb0up6oIX6bIzRrtHPM4wAoA77J8W7eHMVRa1eZ0evsret2r/6RDp0rpsmX6rSxYpOSrFuYrxJnHk+NqmMcNA4zadrCAg8k0mTLIH47QWrLs= Received: from AM6PR08MB3672.eurprd08.prod.outlook.com (20.177.115.29) by AM6PR08MB2950.eurprd08.prod.outlook.com (52.135.163.15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1164.22; Thu, 27 Sep 2018 04:24:13 +0000 Received: from AM6PR08MB3672.eurprd08.prod.outlook.com ([fe80::589e:d3cf:9777:5ff9]) by AM6PR08MB3672.eurprd08.prod.outlook.com ([fe80::589e:d3cf:9777:5ff9%2]) with mapi id 15.20.1164.024; Thu, 27 Sep 2018 04:24:13 +0000 From: Honnappa Nagarahalli To: Yipeng Wang , "bruce.richardson@intel.com" CC: "dev@dpdk.org" , "michel@digirati.com.br" Thread-Topic: [PATCH v2 7/7] hash: use partial-key hashing Thread-Index: AQHUUgpSAxEcwk/VE0m37zNHBAy4GKT85kYQ Date: Thu, 27 Sep 2018 04:24:13 +0000 Message-ID: References: <1536253745-133104-1-git-send-email-yipeng1.wang@intel.com> <1537550255-252066-1-git-send-email-yipeng1.wang@intel.com> <1537550255-252066-8-git-send-email-yipeng1.wang@intel.com> In-Reply-To: <1537550255-252066-8-git-send-email-yipeng1.wang@intel.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Honnappa.Nagarahalli@arm.com; x-originating-ip: [217.140.111.135] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; AM6PR08MB2950; 6:kjgVP9xrhm/b1EuyOPM0D8SzA5zTjfNS4n5sEKr5/z2Ts5nor6+zxHMnh96xLc9q7cejwMoURzSJ38/UvkxXjZRWgqjnggmBPD2QMuV5CGQU0AI6BfwVYWWoHVlBJHZbHOHXNChrzjqWfrVLBArj5Am92RrDjxzHIxz0004h2NNvbQW64a7GywwTFQoH0sMYHLpUyz9DBunYyf+1eJcRxDGWHIw8oD09bNCckgZsbxGzd3xf4hDmDXWtDR9ulYp9BZRQZPfS/H+peqkSPb5Bpicc+KW2G1KTBsK/E8tbI53Tk7cMAnelaX3IBq1oKrEjenTSiMrLl7Jo8JpQcZu1QF7NCadjR4yfOtNUK3X6QsM/lJi5nM6X3FNPgVvooFakiMl0Vm6MZWinX3AQAeu7Qz8IlBJ49d9WK3gfReTy/g0cjhk0KfLi3PLsPm+2wR+dDALma2EwGhfhyyqUhsiKVg==; 5:4MBXcvJ/afe3qcSL7OJ2Q7HomIp84WA3xsX3i9Fao1OGM3Unvk0Fl04wE1vOm8vqx94wLOm54BWw+S0eJH+QSN8+qDojKVIlsswX3ZSscD/UfaCZDucMikgIzf5Tro4QUEfIkdKtCWG96RDq7aESNqxElrfKexxW5U4y9OlV4gE=; 7:ekEzVPk/DgV4EUPvNvd4A4aBw4mhR32T+dKZuYBvL237abLDViDvJYr/3k/TkMHGvCAJkd/kOLcEW8qXtxweSzYH4wF98qhuuMqARPtN+TsnXh3ZHTmd9jlfmTs7cgnx+anjqn3OkieFXYK0mx2eoxB4TR3hv0Y8o5qvx87cfexqstWFc8PijVvIkFZBxUbukgpUDa4/BwV1W6RGzrWP2lDCrqlc01F1RbajKmPO0sLiqXUYHsrOJD/eGnNDh1av x-ms-exchange-antispam-srfa-diagnostics: SOS; x-ms-office365-filtering-correlation-id: 3b0c7eb2-9797-41df-cb82-08d62431144d x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: BCL:0; PCL:0; RULEID:(7020095)(4652040)(8989299)(4534165)(4627221)(201703031133081)(201702281549075)(8990200)(5600074)(711020)(4618075)(2017052603328)(7153060)(7193020); SRVR:AM6PR08MB2950; x-ms-traffictypediagnostic: AM6PR08MB2950: x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(228905959029699)(166494164430575)(131327999870524)(180628864354917); x-ms-exchange-senderadcheck: 1 x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(8211001083)(6040522)(2401047)(8121501046)(5005006)(93006095)(93001095)(3002001)(3231355)(944501410)(52105095)(10201501046)(6055026)(149066)(150057)(6041310)(20161123558120)(20161123560045)(20161123564045)(20161123562045)(201703131423095)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(201708071742011)(7699051); SRVR:AM6PR08MB2950; BCL:0; PCL:0; RULEID:; SRVR:AM6PR08MB2950; x-forefront-prvs: 0808323E97 x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(136003)(396003)(366004)(39860400002)(376002)(346002)(199004)(40434004)(13464003)(189003)(110136005)(186003)(54906003)(305945005)(66066001)(97736004)(2906002)(68736007)(316002)(8936002)(5250100002)(81156014)(81166006)(74316002)(6246003)(33656002)(8676002)(4326008)(14444005)(5024004)(6436002)(105586002)(7696005)(7736002)(2501003)(53546011)(72206003)(229853002)(86362001)(106356001)(486006)(446003)(102836004)(478600001)(256004)(9686003)(99286004)(53946003)(14454004)(55016002)(53936002)(6506007)(25786009)(34290500001)(5660300001)(11346002)(6116002)(3846002)(476003)(76176011)(71200400001)(2900100001)(71190400001)(26005)(579004)(559001); DIR:OUT; SFP:1101; SCL:1; SRVR:AM6PR08MB2950; H:AM6PR08MB3672.eurprd08.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; received-spf: None (protection.outlook.com: arm.com does not designate permitted sender hosts) x-microsoft-antispam-message-info: Si+qUs03ZAUYmrDZZnNHJzOFfOn05aAWzwlWB4Qws56ShBcC+KUUn3R2eUap5nOAUpsFqv7W12vdK2gHfhocyo3dGg3YSgiM8iw6iXp1tNxVp5S8jHPKtc2v435YWu9OFIWqcS8wZUZerwUte8mzytKPRj1mrWHNp8SOD1FhCnXBihqrM3jWHpfsXarPauS4e0vvOc5zsmp0qWSQOVRBQMQ5z0nnB4IslWGip/87NQC54VKw0fX7vCGL8eSHJQUow9rHj4cWPPwc5WMXIcycka0RYqgSmncE1rloWVRwHjrbtRRzs1Zknap32mnYpYtmqzVR1yDYGyzKrWyKHROZeCSxGEoKZxVVbTPPmdlcWaw= spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-Network-Message-Id: 3b0c7eb2-9797-41df-cb82-08d62431144d X-MS-Exchange-CrossTenant-originalarrivaltime: 27 Sep 2018 04:24:13.0562 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM6PR08MB2950 Subject: Re: [dpdk-dev] [PATCH v2 7/7] hash: use partial-key hashing X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Sep 2018 04:24:15 -0000 > -----Original Message----- > From: Yipeng Wang > Sent: Friday, September 21, 2018 12:18 PM > To: bruce.richardson@intel.com > Cc: dev@dpdk.org; yipeng1.wang@intel.com; michel@digirati.com.br; > Honnappa Nagarahalli > Subject: [PATCH v2 7/7] hash: use partial-key hashing > > This commit changes the hashing mechanism to "partial-key hashing" to > calculate bucket index and signature of key. > > This is proposed in Bin Fan, et al's paper > "MemC3: Compact and Concurrent MemCache with Dumber Caching and > Smarter Hashing". Bascially the idea is to use "xor" to derive alternativ= e > bucket from current bucket index and signature. > > With "partial-key hashing", it reduces the bucket memory requirement from > two cache lines to one cache line, which improves the memory efficiency a= nd > thus the lookup speed. > > Signed-off-by: Yipeng Wang > --- > lib/librte_hash/rte_cuckoo_hash.c | 228 ++++++++++++++++++--------------= ----- > - > lib/librte_hash/rte_cuckoo_hash.h | 6 +- > lib/librte_hash/rte_hash.h | 5 +- > 3 files changed, 114 insertions(+), 125 deletions(-) > > diff --git a/lib/librte_hash/rte_cuckoo_hash.c > b/lib/librte_hash/rte_cuckoo_hash.c > index 616900b..5108ff0 100644 > --- a/lib/librte_hash/rte_cuckoo_hash.c > +++ b/lib/librte_hash/rte_cuckoo_hash.c > @@ -90,6 +90,27 @@ rte_hash_cmp_eq(const void *key1, const void *key2, > const struct rte_hash *h) > return cmp_jump_table[h->cmp_jump_table_idx](key1, key2, > h->key_len); } > > +static inline void > +get_buckets_index(const struct rte_hash *h, const hash_sig_t hash, > +uint32_t *prim_bkt, uint32_t *sec_bkt, uint16_t *sig) { > +/* > + * We use higher 16 bits of hash as the signature value stored in table. > + * We use the lower bits for the primary bucket > + * location. Then we XOR primary bucket location and the signature > + * to get the secondary bucket location. This is same as > + * proposed in Bin Fan, et al's paper > + * "MemC3: Compact and Concurrent MemCache with Dumber > Caching and > + * Smarter Hashing". The benefit to use > + * XOR is that one could derive the alternative bucket location > + * by only using the current bucket location and the signature. > + */ > +*sig =3D hash >> 16; > + > +*prim_bkt =3D hash & h->bucket_bitmask; > +*sec_bkt =3D (*prim_bkt ^ *sig) & h->bucket_bitmask; } > + IMO, this function can be split into 2 - one for primary bucket index and a= nother for secondary bucket index. The secondary bucket index calculation f= unction can be used in functions ' rte_hash_cuckoo_move_insert_mw' and ' rt= e_hash_cuckoo_make_space_mw'. > struct rte_hash * > rte_hash_create(const struct rte_hash_parameters *params) { @@ -327,9 > +348,7 @@ rte_hash_create(const struct rte_hash_parameters *params) > h->ext_table_support =3D ext_table_support; > > #if defined(RTE_ARCH_X86) > -if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2)) > -h->sig_cmp_fn =3D RTE_HASH_COMPARE_AVX2; > -else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2)) > +if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2)) > h->sig_cmp_fn =3D RTE_HASH_COMPARE_SSE; > else > #endif > @@ -416,18 +435,6 @@ rte_hash_hash(const struct rte_hash *h, const void > *key) > return h->hash_func(key, h->key_len, h->hash_func_init_val); } > > -/* Calc the secondary hash value from the primary hash value of a given = key > */ -static inline hash_sig_t -rte_hash_secondary_hash(const hash_sig_t > primary_hash) -{ > -static const unsigned all_bits_shift =3D 12; > -static const unsigned alt_bits_xor =3D 0x5bd1e995; > - > -uint32_t tag =3D primary_hash >> all_bits_shift; > - > -return primary_hash ^ ((tag + 1) * alt_bits_xor); > -} > - > int32_t > rte_hash_count(const struct rte_hash *h) { @@ -558,14 +565,13 @@ > enqueue_slot_back(const struct rte_hash *h, > /* Search a key from bucket and update its data */ static inline int32_= t > search_and_update(const struct rte_hash *h, void *data, const void *key, > -struct rte_hash_bucket *bkt, hash_sig_t sig, hash_sig_t alt_hash) > +struct rte_hash_bucket *bkt, uint16_t sig) > { > int i; > struct rte_hash_key *k, *keys =3D h->key_store; > > for (i =3D 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { > -if (bkt->sig_current[i] =3D=3D sig && > -bkt->sig_alt[i] =3D=3D alt_hash) { > +if (bkt->sig_current[i] =3D=3D sig) { > k =3D (struct rte_hash_key *) ((char *)keys + > bkt->key_idx[i] * h->key_entry_size); > if (rte_hash_cmp_eq(key, k->key, h) =3D=3D 0) { @@ - > 592,7 +598,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > struct rte_hash_bucket *prim_bkt, > struct rte_hash_bucket *sec_bkt, > const struct rte_hash_key *key, void *data, > -hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx, > +uint16_t sig, uint32_t new_idx, > int32_t *ret_val) > { > unsigned int i; > @@ -603,7 +609,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > /* Check if key was inserted after last check but before this > * protected region in case of inserting duplicated keys. > */ > -ret =3D search_and_update(h, data, key, prim_bkt, sig, alt_hash); > +ret =3D search_and_update(h, data, key, prim_bkt, sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > *ret_val =3D ret; > @@ -611,7 +617,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > } > > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > -ret =3D search_and_update(h, data, key, cur_bkt, alt_hash, sig); > +ret =3D search_and_update(h, data, key, cur_bkt, sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > *ret_val =3D ret; > @@ -626,7 +632,6 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > /* Check if slot is available */ > if (likely(prim_bkt->key_idx[i] =3D=3D EMPTY_SLOT)) { > prim_bkt->sig_current[i] =3D sig; > -prim_bkt->sig_alt[i] =3D alt_hash; > prim_bkt->key_idx[i] =3D new_idx; > break; > } > @@ -651,7 +656,7 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > struct rte_hash_bucket *alt_bkt, > const struct rte_hash_key *key, void *data, > struct queue_node *leaf, uint32_t leaf_slot, > -hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx, > +uint16_t sig, uint32_t new_idx, > int32_t *ret_val) > { > uint32_t prev_alt_bkt_idx; > @@ -672,7 +677,7 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > /* Check if key was inserted after last check but before this > * protected region. > */ > -ret =3D search_and_update(h, data, key, bkt, sig, alt_hash); > +ret =3D search_and_update(h, data, key, bkt, sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > *ret_val =3D ret; > @@ -680,7 +685,7 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > } > > FOR_EACH_BUCKET(cur_bkt, alt_bkt) { > -ret =3D search_and_update(h, data, key, cur_bkt, alt_hash, sig); > +ret =3D search_and_update(h, data, key, cur_bkt, sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > *ret_val =3D ret; > @@ -693,8 +698,9 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > prev_bkt =3D prev_node->bkt; > prev_slot =3D curr_node->prev_slot; > > -prev_alt_bkt_idx =3D > -prev_bkt->sig_alt[prev_slot] & h->bucket_bitmask; > +prev_alt_bkt_idx =3D (prev_node->cur_bkt_idx ^ > +prev_bkt->sig_current[prev_slot]) & > +h->bucket_bitmask; > > if (unlikely(&h->buckets[prev_alt_bkt_idx] > !=3D curr_bkt)) { > @@ -708,10 +714,8 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > * Cuckoo insert to move elements back to its > * primary bucket if available > */ > -curr_bkt->sig_alt[curr_slot] =3D > - prev_bkt->sig_current[prev_slot]; > curr_bkt->sig_current[curr_slot] =3D > -prev_bkt->sig_alt[prev_slot]; > +prev_bkt->sig_current[prev_slot]; > curr_bkt->key_idx[curr_slot] =3D > prev_bkt->key_idx[prev_slot]; > > @@ -721,7 +725,6 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > } > > curr_bkt->sig_current[curr_slot] =3D sig; > -curr_bkt->sig_alt[curr_slot] =3D alt_hash; > curr_bkt->key_idx[curr_slot] =3D new_idx; > > __hash_rw_writer_unlock(h); > @@ -739,39 +742,44 @@ rte_hash_cuckoo_make_space_mw(const struct > rte_hash *h, > struct rte_hash_bucket *bkt, > struct rte_hash_bucket *sec_bkt, > const struct rte_hash_key *key, void *data, > -hash_sig_t sig, hash_sig_t alt_hash, > +uint16_t sig, uint32_t bucket_idx, > uint32_t new_idx, int32_t *ret_val) > { > unsigned int i; > struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN]; > struct queue_node *tail, *head; > struct rte_hash_bucket *curr_bkt, *alt_bkt; > +uint32_t cur_idx, alt_idx; > > tail =3D queue; > head =3D queue + 1; > tail->bkt =3D bkt; > tail->prev =3D NULL; > tail->prev_slot =3D -1; > +tail->cur_bkt_idx =3D bucket_idx; > > /* Cuckoo bfs Search */ > while (likely(tail !=3D head && head < > queue + > RTE_HASH_BFS_QUEUE_MAX_LEN - > RTE_HASH_BUCKET_ENTRIES)) { > curr_bkt =3D tail->bkt; > +cur_idx =3D tail->cur_bkt_idx; > for (i =3D 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { > if (curr_bkt->key_idx[i] =3D=3D EMPTY_SLOT) { > int32_t ret =3D > rte_hash_cuckoo_move_insert_mw(h, > bkt, sec_bkt, key, data, > -tail, i, sig, alt_hash, > +tail, i, sig, > new_idx, ret_val); > if (likely(ret !=3D -1)) > return ret; > } > > /* Enqueue new node and keep prev node info */ > -alt_bkt =3D &(h->buckets[curr_bkt->sig_alt[i] > - & h->bucket_bitmask]); > +alt_idx =3D (curr_bkt->sig_current[i] ^ cur_idx) & > +h->bucket_bitmask; > +alt_bkt =3D &(h->buckets[alt_idx]); > head->bkt =3D alt_bkt; > +head->cur_bkt_idx =3D alt_idx; > head->prev =3D tail; > head->prev_slot =3D i; > head++; > @@ -786,7 +794,7 @@ static inline int32_t > __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, > hash_sig_t sig, void *data) > { > -hash_sig_t alt_hash; > +uint16_t short_sig; > uint32_t prim_bucket_idx, sec_bucket_idx; > struct rte_hash_bucket *prim_bkt, *sec_bkt, *cur_bkt; > struct rte_hash_key *new_k, *keys =3D h->key_store; @@ -801,18 > +809,15 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const > void *key, > int32_t ret_val; > struct rte_hash_bucket *last; > > -prim_bucket_idx =3D sig & h->bucket_bitmask; > +get_buckets_index(h, sig, &prim_bucket_idx, &sec_bucket_idx, > +&short_sig); > prim_bkt =3D &h->buckets[prim_bucket_idx]; > -rte_prefetch0(prim_bkt); > - > -alt_hash =3D rte_hash_secondary_hash(sig); > -sec_bucket_idx =3D alt_hash & h->bucket_bitmask; > sec_bkt =3D &h->buckets[sec_bucket_idx]; > +rte_prefetch0(prim_bkt); > rte_prefetch0(sec_bkt); > > /* Check if key is already inserted in primary location */ > __hash_rw_writer_lock(h); > -ret =3D search_and_update(h, data, key, prim_bkt, sig, alt_hash); > +ret =3D search_and_update(h, data, key, prim_bkt, short_sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > return ret; > @@ -820,12 +825,13 @@ __rte_hash_add_key_with_hash(const struct > rte_hash *h, const void *key, > > /* Check if key is already inserted in secondary location */ > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > -ret =3D search_and_update(h, data, key, cur_bkt, alt_hash, sig); > +ret =3D search_and_update(h, data, key, cur_bkt, short_sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > return ret; > } > } > + > __hash_rw_writer_unlock(h); > > /* Did not find a match, so get a new slot for storing the new key */ > @@ -863,7 +869,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > > /* Find an empty slot and insert */ > ret =3D rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data, > -sig, alt_hash, new_idx, &ret_val); > +short_sig, new_idx, &ret_val); > if (ret =3D=3D 0) > return new_idx - 1; > else if (ret =3D=3D 1) { > @@ -873,7 +879,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > > /* Primary bucket full, need to make space for new entry */ > ret =3D rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, > data, > -sig, alt_hash, new_idx, &ret_val); > +short_sig, prim_bucket_idx, new_idx, > &ret_val); > if (ret =3D=3D 0) > return new_idx - 1; > else if (ret =3D=3D 1) { > @@ -883,7 +889,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > > /* Also search secondary bucket to get better occupancy */ > ret =3D rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, > data, > -alt_hash, sig, new_idx, &ret_val); > +short_sig, sec_bucket_idx, new_idx, &ret_val); > > if (ret =3D=3D 0) > return new_idx - 1; > @@ -903,14 +909,14 @@ __rte_hash_add_key_with_hash(const struct > rte_hash *h, const void *key, > */ > __hash_rw_writer_lock(h); > /* We check for duplicates again since could be inserted before the > lock */ > -ret =3D search_and_update(h, data, key, prim_bkt, sig, alt_hash); > +ret =3D search_and_update(h, data, key, prim_bkt, short_sig); > if (ret !=3D -1) { > enqueue_slot_back(h, cached_free_slots, slot_id); > goto failure; > } > > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > -ret =3D search_and_update(h, data, key, cur_bkt, alt_hash, sig); > +ret =3D search_and_update(h, data, key, cur_bkt, short_sig); > if (ret !=3D -1) { > enqueue_slot_back(h, cached_free_slots, slot_id); > goto failure; > @@ -923,8 +929,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > for (i =3D 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { > /* Check if slot is available */ > if (likely(cur_bkt->key_idx[i] =3D=3D EMPTY_SLOT)) { > -cur_bkt->sig_current[i] =3D alt_hash; > -cur_bkt->sig_alt[i] =3D sig; > +cur_bkt->sig_current[i] =3D short_sig; > cur_bkt->key_idx[i] =3D new_idx; > __hash_rw_writer_unlock(h); > return new_idx - 1; > @@ -942,8 +947,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > > bkt_id =3D (uint32_t)((uintptr_t)ext_bkt_id) - 1; > /* Use the first location of the new bucket */ > -(h->buckets_ext[bkt_id]).sig_current[0] =3D alt_hash; > -(h->buckets_ext[bkt_id]).sig_alt[0] =3D sig; > +(h->buckets_ext[bkt_id]).sig_current[0] =3D short_sig; > (h->buckets_ext[bkt_id]).key_idx[0] =3D new_idx; > /* Link the new bucket to sec bucket linked list */ > last =3D rte_hash_get_last_bkt(sec_bkt); @@ -1002,7 +1006,7 @@ > rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *da= ta) > > /* Search one bucket to find the match key */ static inline int32_t - > search_one_bucket(const struct rte_hash *h, const void *key, hash_sig_t s= ig, > +search_one_bucket(const struct rte_hash *h, const void *key, uint16_t > +sig, > void **data, const struct rte_hash_bucket *bkt) { > int i; > @@ -1031,30 +1035,28 @@ static inline int32_t > __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key, > hash_sig_t sig, void **data) > { > -uint32_t bucket_idx; > -hash_sig_t alt_hash; > +uint32_t prim_bucket_idx, sec_bucket_idx; > struct rte_hash_bucket *bkt, *cur_bkt; > int ret; > +uint16_t short_sig; > > -bucket_idx =3D sig & h->bucket_bitmask; > -bkt =3D &h->buckets[bucket_idx]; > +get_buckets_index(h, sig, &prim_bucket_idx, &sec_bucket_idx, > &short_sig); > +bkt =3D &h->buckets[prim_bucket_idx]; > > __hash_rw_reader_lock(h); > > /* Check if key is in primary location */ > -ret =3D search_one_bucket(h, key, sig, data, bkt); > +ret =3D search_one_bucket(h, key, short_sig, data, bkt); > if (ret !=3D -1) { > __hash_rw_reader_unlock(h); > return ret; > } > /* Calculate secondary hash */ > -alt_hash =3D rte_hash_secondary_hash(sig); > -bucket_idx =3D alt_hash & h->bucket_bitmask; > -bkt =3D &h->buckets[bucket_idx]; > +bkt =3D &h->buckets[sec_bucket_idx]; > > /* Check if key is in secondary location */ > FOR_EACH_BUCKET(cur_bkt, bkt) { > -ret =3D search_one_bucket(h, key, alt_hash, data, cur_bkt); > +ret =3D search_one_bucket(h, key, short_sig, data, cur_bkt); > if (ret !=3D -1) { > __hash_rw_reader_unlock(h); > return ret; > @@ -1101,7 +1103,6 @@ remove_entry(const struct rte_hash *h, struct > rte_hash_bucket *bkt, unsigned i) > struct lcore_cache *cached_free_slots; > > bkt->sig_current[i] =3D NULL_SIGNATURE; > -bkt->sig_alt[i] =3D NULL_SIGNATURE; > if (h->multi_writer_support) { > lcore_id =3D rte_lcore_id(); > cached_free_slots =3D &h->local_free_slots[lcore_id]; @@ - > 1126,7 +1127,7 @@ remove_entry(const struct rte_hash *h, struct > rte_hash_bucket *bkt, unsigned i) > /* Search one bucket and remove the matched key */ static inline int32_= t > search_and_remove(const struct rte_hash *h, const void *key, > -struct rte_hash_bucket *bkt, hash_sig_t sig) > +struct rte_hash_bucket *bkt, uint16_t sig) > { > struct rte_hash_key *k, *keys =3D h->key_store; > unsigned int i; > @@ -1158,31 +1159,29 @@ static inline int32_t > __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key, > hash_sig_t sig) > { > -uint32_t bucket_idx; > -hash_sig_t alt_hash; > +uint32_t prim_bucket_idx, sec_bucket_idx; > struct rte_hash_bucket *prim_bkt, *sec_bkt; > struct rte_hash_bucket *cur_bkt, *prev_bkt, *next_bkt; > int32_t ret, i; > struct rte_hash_bucket *tobe_removed_bkt =3D NULL; > +uint16_t short_sig; > > -bucket_idx =3D sig & h->bucket_bitmask; > -prim_bkt =3D &h->buckets[bucket_idx]; > +get_buckets_index(h, sig, &prim_bucket_idx, &sec_bucket_idx, > &short_sig); > +prim_bkt =3D &h->buckets[prim_bucket_idx]; > > __hash_rw_writer_lock(h); > /* look for key in primary bucket */ > -ret =3D search_and_remove(h, key, prim_bkt, sig); > +ret =3D search_and_remove(h, key, prim_bkt, short_sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > return ret; > } > > /* Calculate secondary hash */ > -alt_hash =3D rte_hash_secondary_hash(sig); > -bucket_idx =3D alt_hash & h->bucket_bitmask; > -sec_bkt =3D &h->buckets[bucket_idx]; > +sec_bkt =3D &h->buckets[sec_bucket_idx]; > > /* look for key in secondary bucket */ > -ret =3D search_and_remove(h, key, sec_bkt, alt_hash); > +ret =3D search_and_remove(h, key, sec_bkt, short_sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > return ret; > @@ -1192,7 +1191,7 @@ __rte_hash_del_key_with_hash(const struct > rte_hash *h, const void *key, > if (h->ext_table_support) { > next_bkt =3D sec_bkt->next; > FOR_EACH_BUCKET(cur_bkt, next_bkt) { > -ret =3D search_and_remove(h, key, cur_bkt, alt_hash); > +ret =3D search_and_remove(h, key, cur_bkt, short_sig); > if (ret !=3D -1) > goto return_bkt; > } > @@ -1265,55 +1264,35 @@ static inline void compare_signatures(uint32_t > *prim_hash_matches, uint32_t *sec_hash_matches, > const struct rte_hash_bucket *prim_bkt, > const struct rte_hash_bucket *sec_bkt, > -hash_sig_t prim_hash, hash_sig_t sec_hash, > +uint16_t sig, > enum rte_hash_sig_compare_function sig_cmp_fn) { > unsigned int i; > > +/* For match mask the first bit of every two bits indicates the match > +*/ > switch (sig_cmp_fn) { > -#ifdef RTE_MACHINE_CPUFLAG_AVX2 > -case RTE_HASH_COMPARE_AVX2: > -*prim_hash_matches =3D > _mm256_movemask_ps((__m256)_mm256_cmpeq_epi32( > -_mm256_load_si256( > -(__m256i const *)prim_bkt- > >sig_current), > -_mm256_set1_epi32(prim_hash))); > -*sec_hash_matches =3D > _mm256_movemask_ps((__m256)_mm256_cmpeq_epi32( > -_mm256_load_si256( > -(__m256i const *)sec_bkt- > >sig_current), > -_mm256_set1_epi32(sec_hash))); > -break; > -#endif > #ifdef RTE_MACHINE_CPUFLAG_SSE2 > case RTE_HASH_COMPARE_SSE: > -/* Compare the first 4 signatures in the bucket */ > -*prim_hash_matches =3D > _mm_movemask_ps((__m128)_mm_cmpeq_epi16( > +/* Compare all signatures in the bucket */ > +*prim_hash_matches =3D > _mm_movemask_epi8(_mm_cmpeq_epi16( > _mm_load_si128( > (__m128i const *)prim_bkt- > >sig_current), > -_mm_set1_epi32(prim_hash))); > -*prim_hash_matches |=3D > (_mm_movemask_ps((__m128)_mm_cmpeq_epi16( > -_mm_load_si128( > -(__m128i const *)&prim_bkt- > >sig_current[4]), > -_mm_set1_epi32(prim_hash)))) << 4; > -/* Compare the first 4 signatures in the bucket */ > -*sec_hash_matches =3D > _mm_movemask_ps((__m128)_mm_cmpeq_epi16( > +_mm_set1_epi16(sig))); > +/* Compare all signatures in the bucket */ > +*sec_hash_matches =3D > _mm_movemask_epi8(_mm_cmpeq_epi16( > _mm_load_si128( > (__m128i const *)sec_bkt- > >sig_current), > -_mm_set1_epi32(sec_hash))); > -*sec_hash_matches |=3D > (_mm_movemask_ps((__m128)_mm_cmpeq_epi16( > -_mm_load_si128( > -(__m128i const *)&sec_bkt- > >sig_current[4]), > -_mm_set1_epi32(sec_hash)))) << 4; > +_mm_set1_epi16(sig))); > break; > #endif > default: > for (i =3D 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { > *prim_hash_matches |=3D > -((prim_hash =3D=3D prim_bkt->sig_current[i]) << i); > +((sig =3D=3D prim_bkt->sig_current[i]) << (i << 1)); > *sec_hash_matches |=3D > -((sec_hash =3D=3D sec_bkt->sig_current[i]) << i); > +((sig =3D=3D sec_bkt->sig_current[i]) << (i << 1)); > } > } > - > } > > #define PREFETCH_OFFSET 4 > @@ -1326,7 +1305,9 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, > const void **keys, > int32_t i; > int32_t ret; > uint32_t prim_hash[RTE_HASH_LOOKUP_BULK_MAX]; > -uint32_t sec_hash[RTE_HASH_LOOKUP_BULK_MAX]; > +uint32_t prim_index[RTE_HASH_LOOKUP_BULK_MAX]; > +uint32_t sec_index[RTE_HASH_LOOKUP_BULK_MAX]; > +uint16_t sig[RTE_HASH_LOOKUP_BULK_MAX]; > const struct rte_hash_bucket > *primary_bkt[RTE_HASH_LOOKUP_BULK_MAX]; > const struct rte_hash_bucket > *secondary_bkt[RTE_HASH_LOOKUP_BULK_MAX]; > uint32_t prim_hitmask[RTE_HASH_LOOKUP_BULK_MAX] =3D {0}; @@ - > 1345,10 +1326,11 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, > const void **keys, > rte_prefetch0(keys[i + PREFETCH_OFFSET]); > > prim_hash[i] =3D rte_hash_hash(h, keys[i]); > -sec_hash[i] =3D rte_hash_secondary_hash(prim_hash[i]); > +get_buckets_index(h, prim_hash[i], > +&prim_index[i], &sec_index[i], &sig[i]); > > -primary_bkt[i] =3D &h->buckets[prim_hash[i] & h- > >bucket_bitmask]; > -secondary_bkt[i] =3D &h->buckets[sec_hash[i] & h- > >bucket_bitmask]; > +primary_bkt[i] =3D &h->buckets[prim_index[i]]; > +secondary_bkt[i] =3D &h->buckets[sec_index[i]]; > > rte_prefetch0(primary_bkt[i]); > rte_prefetch0(secondary_bkt[i]); > @@ -1357,10 +1339,12 @@ __rte_hash_lookup_bulk(const struct rte_hash > *h, const void **keys, > /* Calculate and prefetch rest of the buckets */ > for (; i < num_keys; i++) { > prim_hash[i] =3D rte_hash_hash(h, keys[i]); > -sec_hash[i] =3D rte_hash_secondary_hash(prim_hash[i]); > > -primary_bkt[i] =3D &h->buckets[prim_hash[i] & h- > >bucket_bitmask]; > -secondary_bkt[i] =3D &h->buckets[sec_hash[i] & h- > >bucket_bitmask]; > +get_buckets_index(h, prim_hash[i], > +&prim_index[i], &sec_index[i], &sig[i]); > + > +primary_bkt[i] =3D &h->buckets[prim_index[i]]; > +secondary_bkt[i] =3D &h->buckets[sec_index[i]]; > > rte_prefetch0(primary_bkt[i]); > rte_prefetch0(secondary_bkt[i]); > @@ -1371,10 +1355,11 @@ __rte_hash_lookup_bulk(const struct rte_hash > *h, const void **keys, > for (i =3D 0; i < num_keys; i++) { > compare_signatures(&prim_hitmask[i], &sec_hitmask[i], > primary_bkt[i], secondary_bkt[i], > -prim_hash[i], sec_hash[i], h->sig_cmp_fn); > +sig[i], h->sig_cmp_fn); > > if (prim_hitmask[i]) { > -uint32_t first_hit =3D __builtin_ctzl(prim_hitmask[i]); > +uint32_t first_hit =3D > +__builtin_ctzl(prim_hitmask[i]) >> 1; > uint32_t key_idx =3D primary_bkt[i]->key_idx[first_hit]; > const struct rte_hash_key *key_slot =3D > (const struct rte_hash_key *)( > @@ -1385,7 +1370,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, > const void **keys, > } > > if (sec_hitmask[i]) { > -uint32_t first_hit =3D __builtin_ctzl(sec_hitmask[i]); > +uint32_t first_hit =3D > +__builtin_ctzl(sec_hitmask[i]) >> 1; > uint32_t key_idx =3D secondary_bkt[i]- > >key_idx[first_hit]; > const struct rte_hash_key *key_slot =3D > (const struct rte_hash_key *)( > @@ -1399,7 +1385,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, > const void **keys, > for (i =3D 0; i < num_keys; i++) { > positions[i] =3D -ENOENT; > while (prim_hitmask[i]) { > -uint32_t hit_index =3D __builtin_ctzl(prim_hitmask[i]); > +uint32_t hit_index =3D > +__builtin_ctzl(prim_hitmask[i]) >> 1; > > uint32_t key_idx =3D primary_bkt[i]->key_idx[hit_index]; > const struct rte_hash_key *key_slot =3D @@ -1418,11 > +1405,12 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void > **keys, > positions[i] =3D key_idx - 1; > goto next_key; > } > -prim_hitmask[i] &=3D ~(1 << (hit_index)); > +prim_hitmask[i] &=3D ~(3ULL << (hit_index << 1)); > } > > while (sec_hitmask[i]) { > -uint32_t hit_index =3D __builtin_ctzl(sec_hitmask[i]); > +uint32_t hit_index =3D > +__builtin_ctzl(sec_hitmask[i]) >> 1; > > uint32_t key_idx =3D secondary_bkt[i]- > >key_idx[hit_index]; > const struct rte_hash_key *key_slot =3D @@ -1442,7 > +1430,7 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void > **keys, > positions[i] =3D key_idx - 1; > goto next_key; > } > -sec_hitmask[i] &=3D ~(1 << (hit_index)); > +sec_hitmask[i] &=3D ~(3ULL << (hit_index << 1)); > } > > next_key: > @@ -1465,10 +1453,10 @@ __rte_hash_lookup_bulk(const struct rte_hash > *h, const void **keys, > FOR_EACH_BUCKET(cur_bkt, next_bkt) { > if (data !=3D NULL) > ret =3D search_one_bucket(h, keys[i], > -sec_hash[i], &data[i], > cur_bkt); > +sig[i], &data[i], cur_bkt); > else > ret =3D search_one_bucket(h, keys[i], > -sec_hash[i], NULL, cur_bkt); > +sig[i], NULL, cur_bkt); > if (ret !=3D -1) { > positions[i] =3D ret; > hits |=3D 1ULL << i; > diff --git a/lib/librte_hash/rte_cuckoo_hash.h > b/lib/librte_hash/rte_cuckoo_hash.h > index e601520..7753cd8 100644 > --- a/lib/librte_hash/rte_cuckoo_hash.h > +++ b/lib/librte_hash/rte_cuckoo_hash.h > @@ -129,18 +129,15 @@ struct rte_hash_key { enum > rte_hash_sig_compare_function { > RTE_HASH_COMPARE_SCALAR =3D 0, > RTE_HASH_COMPARE_SSE, > -RTE_HASH_COMPARE_AVX2, > RTE_HASH_COMPARE_NUM > }; > > /** Bucket structure */ > struct rte_hash_bucket { > -hash_sig_t sig_current[RTE_HASH_BUCKET_ENTRIES]; > +uint16_t sig_current[RTE_HASH_BUCKET_ENTRIES]; > > uint32_t key_idx[RTE_HASH_BUCKET_ENTRIES]; > > -hash_sig_t sig_alt[RTE_HASH_BUCKET_ENTRIES]; > - > uint8_t flag[RTE_HASH_BUCKET_ENTRIES]; > > void *next; > @@ -193,6 +190,7 @@ struct rte_hash { > > struct queue_node { > struct rte_hash_bucket *bkt; /* Current bucket on the bfs search */ > +uint32_t cur_bkt_idx; > > struct queue_node *prev; /* Parent(bucket) in search path */ > int prev_slot; /* Parent(slot) in search path */ > diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h inde= x > 11d8e28..0bd7696 100644 > --- a/lib/librte_hash/rte_hash.h > +++ b/lib/librte_hash/rte_hash.h > @@ -40,7 +40,10 @@ extern "C" { > /** Flag to indicate the extendabe bucket table feature should be used *= / > #define RTE_HASH_EXTRA_FLAGS_EXT_TABLE 0x08 > > -/** Signature of key that is stored internally. */ > +/** > + * A hash value that is used to generate signature stored in table and > +the > + * location the signature is stored. > + */ This is an external file. This documentation goes into the API guide. IMO, = we should change the comment to help the user. How about changing this to '= hash value of the key'? > typedef uint32_t hash_sig_t; > > /** Type of function that can be used for calculating the hash value. */ > -- > 2.7.4 IMPORTANT NOTICE: The contents of this email and any attachments are confid= ential and may also be privileged. If you are not the intended recipient, p= lease notify the sender immediately and do not disclose the contents to any= other person, use it for any purpose, or store or copy the information in = any medium. Thank you.