From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR01-HE1-obe.outbound.protection.outlook.com (mail-he1eur01on0085.outbound.protection.outlook.com [104.47.0.85]) by dpdk.org (Postfix) with ESMTP id 53B415699 for ; Mon, 1 Oct 2018 22:09:24 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector1-arm-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=0dgNh97+0wmxS1TIMOIjrBrrv7urYmcLMaZedXTYFrs=; b=K2KNd6A4JCTRpw7vrXBhoGOx24BabYhLeedi8oaT5zk+VOeGMvjsMk8h0PNxyTjwzxZNUS8Dqiwp/HEA7V8OuYpe9VZldqKFKljH6Es4fFkIWyiP7EbqqqDWcuaZF/HDeuV4LdK4FwbvZ4AP1oAsnnA7TpCMpf/89q0X0+tps8Q= Received: from AM6PR08MB3672.eurprd08.prod.outlook.com (20.177.115.29) by AM6PR08MB3671.eurprd08.prod.outlook.com (20.177.115.28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1185.20; Mon, 1 Oct 2018 20:09:22 +0000 Received: from AM6PR08MB3672.eurprd08.prod.outlook.com ([fe80::f423:e46a:a03c:e928]) by AM6PR08MB3672.eurprd08.prod.outlook.com ([fe80::f423:e46a:a03c:e928%2]) with mapi id 15.20.1185.024; Mon, 1 Oct 2018 20:09:22 +0000 From: Honnappa Nagarahalli To: Yipeng Wang , "bruce.richardson@intel.com" CC: "konstantin.ananyev@intel.com" , "dev@dpdk.org" , "sameh.gobriel@intel.com" , Honnappa Nagarahalli , nd Thread-Topic: [PATCH v4 4/4] hash: use partial-key hashing Thread-Index: AQHUV4tnzvHsniaDIk6u1kgIBUJ0FqUK1efg Date: Mon, 1 Oct 2018 20:09:22 +0000 Message-ID: References: <1537993618-92630-1-git-send-email-yipeng1.wang@intel.com> <1538155426-145177-1-git-send-email-yipeng1.wang@intel.com> <1538155426-145177-5-git-send-email-yipeng1.wang@intel.com> In-Reply-To: <1538155426-145177-5-git-send-email-yipeng1.wang@intel.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Honnappa.Nagarahalli@arm.com; x-originating-ip: [217.140.111.135] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; AM6PR08MB3671; 6:xbbn2XnqAQhqzed7Z/5H45M6MmDrVvJSbjPa4lDhixdhX/+S66NSyGVIYRcYcF0VsCg0RSTA9Wf0UC690PDiWPrISmJlvaAW/zBNHjMTjsYDEyROGqv4Nx25AP5kxNlnThZER0sU/TbSwHkBQKreaqhQvHtzMBwOhVUxSk3R+v11d7h5Tm0t8cZ31B1+DBUO7ItbrS7NEfudVylHH4GrMgiwujEwaFu78GF5K4W5ea69S22sCgFE89qNlJoRxixlp7m/rxALvALfsfx5Dq4XuP3c01MkN6T3IywFtbpDsvKcXVsUZsdzduCfYT50xPGOH0A7ftCj6HpQ+uwMUGDgKYlKuQF+1OVLkpOPcACfYdZt9d41D1WFIWKpUldmLdPyzC90ehrOfGtvvFbZeWZlfQlZGcxmHK85idWlu6bpudxr2z2su4NiX7oAUf4/LlaJ1I4BxFSpcihdP1IdaM8TEg==; 5:SLQhR1PnZCNRY+K4i/4jeMxSomX1n/nx4eSCI4+DL+GIUtXVcCMzFczwhKu9scIts3f9Ohk7BWW3NDyqVESxMkI+YXHs8p33sNw3+/q739xvo87j1jdf3hsXx2ZJtVOyaZDjSclgkJQhSeKrYgfVyWLDal3sqPKBL/wJHa5UjMs=; 7:T5IZyGDHiwo9kLa6dOJ22EdIZoLdrEkCyZbIWXHIPsJaQ5piyOKkSuK3Mvb4G+Y4UQ+iJROSgGV9v6YlHwtIezjSXIw90DZJHk4DFCD60RWXNC9AxH2oaS0yWtSr8EA43jJIQAqtPl2zJt3sL1LzqNa8S4qIMA1ylYwspzGpZZKov8vaDY26au88wfnW1OM26jcM+2lyhU69r7bsnvJ3PrdQPrdJYHnGI37Xy+iccPPXsTY8Jyn5X8DjyqTh0ffw x-ms-exchange-antispam-srfa-diagnostics: SOS;SOR; x-ms-office365-filtering-correlation-id: 0e4b72b6-9451-4fc6-f46f-08d627d9c73a x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: BCL:0; PCL:0; RULEID:(7020095)(4652040)(8989299)(4534165)(4627221)(201703031133081)(201702281549075)(8990200)(5600074)(711020)(4618075)(2017052603328)(7153060)(7193020); SRVR:AM6PR08MB3671; x-ms-traffictypediagnostic: AM6PR08MB3671: nodisclaimer: True x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(131327999870524)(228905959029699)(180628864354917); x-ms-exchange-senderadcheck: 1 x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(8211001083)(6040522)(2401047)(5005006)(8121501046)(10201501046)(93006095)(93001095)(3231355)(944501410)(52105095)(3002001)(6055026)(149066)(150057)(6041310)(20161123564045)(20161123558120)(201703131423095)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123562045)(20161123560045)(201708071742011)(7699051); SRVR:AM6PR08MB3671; BCL:0; PCL:0; RULEID:; SRVR:AM6PR08MB3671; x-forefront-prvs: 0812095267 x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(396003)(366004)(39860400002)(346002)(136003)(376002)(199004)(189003)(55016002)(486006)(6436002)(229853002)(11346002)(446003)(305945005)(2900100001)(476003)(316002)(5660300001)(4744004)(33656002)(66066001)(6506007)(7696005)(76176011)(71190400001)(71200400001)(8676002)(81166006)(81156014)(68736007)(4326008)(106356001)(14454004)(8936002)(54906003)(105586002)(110136005)(97736004)(72206003)(256004)(14444005)(478600001)(99286004)(2501003)(5250100002)(26005)(7736002)(9686003)(186003)(6116002)(86362001)(3846002)(6246003)(25786009)(53946003)(74316002)(53936002)(2906002)(102836004)(559001)(579004); DIR:OUT; SFP:1101; SCL:1; SRVR:AM6PR08MB3671; H:AM6PR08MB3672.eurprd08.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; received-spf: None (protection.outlook.com: arm.com does not designate permitted sender hosts) x-microsoft-antispam-message-info: LSZgfx6YCi20SGFFPdvAkz/wDl3dItzzzG8BqRRE05yIW7b517lp6fF3v/lqX6JbVRBD4zDLgqkPS6M1vgwCIiAeCRW6olTiwGVXTLTO6Qbt6JX+9/apbAwWB6W/xJ6cGgGar4l64qgbCtktzoX7xBqpH5UwkGHonBF4DZtRiVnDWrPQPz79f9X8BV7nm4bIYpx55FI7jA+0BFPxaZTKxw4M5xkUQd9fABJ1ashpWzgtGSVLkmuwxFJlWiii+qc9cc9uOt9IAS262zP+JFM/+m7Up5c6CjOuwYGZKIb1n7ITGogQik1gvWoejVC3K3RTAnU1Z5XkyeHpqRxcoPCo5HHRigfOSw3IPU0lFMPGtmM= spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-Network-Message-Id: 0e4b72b6-9451-4fc6-f46f-08d627d9c73a X-MS-Exchange-CrossTenant-originalarrivaltime: 01 Oct 2018 20:09:22.2326 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM6PR08MB3671 Subject: Re: [dpdk-dev] [PATCH v4 4/4] hash: use partial-key hashing X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 20:09:24 -0000 >=20 > This commit changes the hashing mechanism to "partial-key hashing" to > calculate bucket index and signature of key. >=20 > This is proposed in Bin Fan, et al's paper > "MemC3: Compact and Concurrent MemCache with Dumber Caching and > Smarter Hashing". Bascially the idea is to use "xor" to derive alternativ= e > bucket from current bucket index and signature. >=20 > With "partial-key hashing", it reduces the bucket memory requirement from > two cache lines to one cache line, which improves the memory efficiency a= nd > thus the lookup speed. >=20 > Signed-off-by: Yipeng Wang > --- > lib/librte_hash/rte_cuckoo_hash.c | 246 +++++++++++++++++++-------------= ----- > - > lib/librte_hash/rte_cuckoo_hash.h | 6 +- > lib/librte_hash/rte_hash.h | 5 +- > 3 files changed, 131 insertions(+), 126 deletions(-) >=20 > diff --git a/lib/librte_hash/rte_cuckoo_hash.c > b/lib/librte_hash/rte_cuckoo_hash.c > index 02650b9..e101708 100644 > --- a/lib/librte_hash/rte_cuckoo_hash.c > +++ b/lib/librte_hash/rte_cuckoo_hash.c > @@ -90,6 +90,36 @@ rte_hash_cmp_eq(const void *key1, const void *key2, > const struct rte_hash *h) > return cmp_jump_table[h->cmp_jump_table_idx](key1, key2, > h->key_len); } >=20 > +/* > + * We use higher 16 bits of hash as the signature value stored in table. > + * We use the lower bits for the primary bucket > + * location. Then we XOR primary bucket location and the signature > + * to get the secondary bucket location. This is same as > + * proposed in Bin Fan, et al's paper > + * "MemC3: Compact and Concurrent MemCache with Dumber Caching and > + * Smarter Hashing". The benefit to use > + * XOR is that one could derive the alternative bucket location > + * by only using the current bucket location and the signature. > + */ > +static inline uint16_t > +get_short_sig(const hash_sig_t hash) > +{ > + return hash >> 16; > +} > + > +static inline uint32_t > +get_prim_bucket_index(const struct rte_hash *h, const hash_sig_t hash) > +{ > + return hash & h->bucket_bitmask; > +} > + > +static inline uint32_t > +get_alt_bucket_index(const struct rte_hash *h, > + uint32_t cur_bkt_idx, uint16_t sig) > +{ > + return (cur_bkt_idx ^ sig) & h->bucket_bitmask; } > + > struct rte_hash * > rte_hash_create(const struct rte_hash_parameters *params) { @@ -327,9 > +357,7 @@ rte_hash_create(const struct rte_hash_parameters *params) > h->ext_table_support =3D ext_table_support; >=20 > #if defined(RTE_ARCH_X86) > - if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2)) > - h->sig_cmp_fn =3D RTE_HASH_COMPARE_AVX2; > - else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2)) > + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2)) > h->sig_cmp_fn =3D RTE_HASH_COMPARE_SSE; > else > #endif > @@ -417,18 +445,6 @@ rte_hash_hash(const struct rte_hash *h, const void > *key) > return h->hash_func(key, h->key_len, h->hash_func_init_val); } >=20 > -/* Calc the secondary hash value from the primary hash value of a given = key > */ -static inline hash_sig_t -rte_hash_secondary_hash(const hash_sig_t > primary_hash) -{ > - static const unsigned all_bits_shift =3D 12; > - static const unsigned alt_bits_xor =3D 0x5bd1e995; > - > - uint32_t tag =3D primary_hash >> all_bits_shift; > - > - return primary_hash ^ ((tag + 1) * alt_bits_xor); > -} > - > int32_t > rte_hash_count(const struct rte_hash *h) { @@ -560,14 +576,13 @@ > enqueue_slot_back(const struct rte_hash *h, > /* Search a key from bucket and update its data */ static inline int32_= t > search_and_update(const struct rte_hash *h, void *data, const void *key, > - struct rte_hash_bucket *bkt, hash_sig_t sig, hash_sig_t alt_hash) > + struct rte_hash_bucket *bkt, uint16_t sig) > { > int i; > struct rte_hash_key *k, *keys =3D h->key_store; >=20 > for (i =3D 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { > - if (bkt->sig_current[i] =3D=3D sig && > - bkt->sig_alt[i] =3D=3D alt_hash) { > + if (bkt->sig_current[i] =3D=3D sig) { > k =3D (struct rte_hash_key *) ((char *)keys + > bkt->key_idx[i] * h->key_entry_size); > if (rte_hash_cmp_eq(key, k->key, h) =3D=3D 0) { @@ - > 594,7 +609,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > struct rte_hash_bucket *prim_bkt, > struct rte_hash_bucket *sec_bkt, > const struct rte_hash_key *key, void *data, > - hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx, > + uint16_t sig, uint32_t new_idx, > int32_t *ret_val) > { > unsigned int i; > @@ -605,7 +620,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > /* Check if key was inserted after last check but before this > * protected region in case of inserting duplicated keys. > */ > - ret =3D search_and_update(h, data, key, prim_bkt, sig, alt_hash); > + ret =3D search_and_update(h, data, key, prim_bkt, sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > *ret_val =3D ret; > @@ -613,7 +628,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > } >=20 > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > - ret =3D search_and_update(h, data, key, cur_bkt, alt_hash, sig); > + ret =3D search_and_update(h, data, key, cur_bkt, sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > *ret_val =3D ret; > @@ -628,7 +643,6 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > /* Check if slot is available */ > if (likely(prim_bkt->key_idx[i] =3D=3D EMPTY_SLOT)) { > prim_bkt->sig_current[i] =3D sig; > - prim_bkt->sig_alt[i] =3D alt_hash; > prim_bkt->key_idx[i] =3D new_idx; > break; > } > @@ -653,7 +667,7 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > struct rte_hash_bucket *alt_bkt, > const struct rte_hash_key *key, void *data, > struct queue_node *leaf, uint32_t leaf_slot, > - hash_sig_t sig, hash_sig_t alt_hash, uint32_t new_idx, > + uint16_t sig, uint32_t new_idx, > int32_t *ret_val) > { > uint32_t prev_alt_bkt_idx; > @@ -674,7 +688,7 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > /* Check if key was inserted after last check but before this > * protected region. > */ > - ret =3D search_and_update(h, data, key, bkt, sig, alt_hash); > + ret =3D search_and_update(h, data, key, bkt, sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > *ret_val =3D ret; > @@ -682,7 +696,7 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > } >=20 > FOR_EACH_BUCKET(cur_bkt, alt_bkt) { > - ret =3D search_and_update(h, data, key, cur_bkt, alt_hash, sig); > + ret =3D search_and_update(h, data, key, cur_bkt, sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > *ret_val =3D ret; > @@ -695,8 +709,9 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > prev_bkt =3D prev_node->bkt; > prev_slot =3D curr_node->prev_slot; >=20 > - prev_alt_bkt_idx =3D > - prev_bkt->sig_alt[prev_slot] & h->bucket_bitmask; > + prev_alt_bkt_idx =3D get_alt_bucket_index(h, > + prev_node->cur_bkt_idx, > + prev_bkt->sig_current[prev_slot]); >=20 > if (unlikely(&h->buckets[prev_alt_bkt_idx] > !=3D curr_bkt)) { > @@ -710,10 +725,8 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > * Cuckoo insert to move elements back to its > * primary bucket if available > */ > - curr_bkt->sig_alt[curr_slot] =3D > - prev_bkt->sig_current[prev_slot]; > curr_bkt->sig_current[curr_slot] =3D > - prev_bkt->sig_alt[prev_slot]; > + prev_bkt->sig_current[prev_slot]; > curr_bkt->key_idx[curr_slot] =3D > prev_bkt->key_idx[prev_slot]; >=20 > @@ -723,7 +736,6 @@ rte_hash_cuckoo_move_insert_mw(const struct > rte_hash *h, > } >=20 > curr_bkt->sig_current[curr_slot] =3D sig; > - curr_bkt->sig_alt[curr_slot] =3D alt_hash; > curr_bkt->key_idx[curr_slot] =3D new_idx; >=20 > __hash_rw_writer_unlock(h); > @@ -741,39 +753,44 @@ rte_hash_cuckoo_make_space_mw(const struct > rte_hash *h, > struct rte_hash_bucket *bkt, > struct rte_hash_bucket *sec_bkt, > const struct rte_hash_key *key, void *data, > - hash_sig_t sig, hash_sig_t alt_hash, > + uint16_t sig, uint32_t bucket_idx, > uint32_t new_idx, int32_t *ret_val) > { > unsigned int i; > struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN]; > struct queue_node *tail, *head; > struct rte_hash_bucket *curr_bkt, *alt_bkt; > + uint32_t cur_idx, alt_idx; >=20 > tail =3D queue; > head =3D queue + 1; > tail->bkt =3D bkt; > tail->prev =3D NULL; > tail->prev_slot =3D -1; > + tail->cur_bkt_idx =3D bucket_idx; >=20 > /* Cuckoo bfs Search */ > while (likely(tail !=3D head && head < > queue + > RTE_HASH_BFS_QUEUE_MAX_LEN - > RTE_HASH_BUCKET_ENTRIES)) { > curr_bkt =3D tail->bkt; > + cur_idx =3D tail->cur_bkt_idx; > for (i =3D 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { > if (curr_bkt->key_idx[i] =3D=3D EMPTY_SLOT) { > int32_t ret =3D > rte_hash_cuckoo_move_insert_mw(h, > bkt, sec_bkt, key, data, > - tail, i, sig, alt_hash, > + tail, i, sig, > new_idx, ret_val); > if (likely(ret !=3D -1)) > return ret; > } >=20 > /* Enqueue new node and keep prev node info */ > - alt_bkt =3D &(h->buckets[curr_bkt->sig_alt[i] > - & h->bucket_bitmask]); > + alt_idx =3D get_alt_bucket_index(h, cur_idx, > + curr_bkt->sig_current[i]); > + alt_bkt =3D &(h->buckets[alt_idx]); > head->bkt =3D alt_bkt; > + head->cur_bkt_idx =3D alt_idx; > head->prev =3D tail; > head->prev_slot =3D i; > head++; > @@ -788,7 +805,7 @@ static inline int32_t > __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, > hash_sig_t sig, void *data) > { > - hash_sig_t alt_hash; > + uint16_t short_sig; > uint32_t prim_bucket_idx, sec_bucket_idx; > struct rte_hash_bucket *prim_bkt, *sec_bkt, *cur_bkt; > struct rte_hash_key *new_k, *keys =3D h->key_store; @@ -803,18 > +820,17 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const > void *key, > int32_t ret_val; > struct rte_hash_bucket *last; >=20 > - prim_bucket_idx =3D sig & h->bucket_bitmask; > + short_sig =3D get_short_sig(sig); > + prim_bucket_idx =3D get_prim_bucket_index(h, sig); > + sec_bucket_idx =3D get_alt_bucket_index(h, prim_bucket_idx, short_sig); > prim_bkt =3D &h->buckets[prim_bucket_idx]; > - rte_prefetch0(prim_bkt); > - > - alt_hash =3D rte_hash_secondary_hash(sig); > - sec_bucket_idx =3D alt_hash & h->bucket_bitmask; > sec_bkt =3D &h->buckets[sec_bucket_idx]; > + rte_prefetch0(prim_bkt); > rte_prefetch0(sec_bkt); >=20 > /* Check if key is already inserted in primary location */ > __hash_rw_writer_lock(h); > - ret =3D search_and_update(h, data, key, prim_bkt, sig, alt_hash); > + ret =3D search_and_update(h, data, key, prim_bkt, short_sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > return ret; > @@ -822,12 +838,13 @@ __rte_hash_add_key_with_hash(const struct > rte_hash *h, const void *key, >=20 > /* Check if key is already inserted in secondary location */ > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > - ret =3D search_and_update(h, data, key, cur_bkt, alt_hash, sig); > + ret =3D search_and_update(h, data, key, cur_bkt, short_sig); > if (ret !=3D -1) { > __hash_rw_writer_unlock(h); > return ret; > } > } > + > __hash_rw_writer_unlock(h); >=20 > /* Did not find a match, so get a new slot for storing the new key */ > @@ -865,7 +882,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, >=20 > /* Find an empty slot and insert */ > ret =3D rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data, > - sig, alt_hash, new_idx, &ret_val); > + short_sig, new_idx, &ret_val); > if (ret =3D=3D 0) > return new_idx - 1; > else if (ret =3D=3D 1) { > @@ -875,7 +892,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, >=20 > /* Primary bucket full, need to make space for new entry */ > ret =3D rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, > data, > - sig, alt_hash, new_idx, &ret_val); > + short_sig, prim_bucket_idx, new_idx, > &ret_val); > if (ret =3D=3D 0) > return new_idx - 1; > else if (ret =3D=3D 1) { > @@ -885,7 +902,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, >=20 > /* Also search secondary bucket to get better occupancy */ > ret =3D rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, > data, > - alt_hash, sig, new_idx, &ret_val); > + short_sig, sec_bucket_idx, new_idx, &ret_val); >=20 > if (ret =3D=3D 0) > return new_idx - 1; > @@ -905,14 +922,14 @@ __rte_hash_add_key_with_hash(const struct > rte_hash *h, const void *key, > */ > __hash_rw_writer_lock(h); > /* We check for duplicates again since could be inserted before the > lock */ > - ret =3D search_and_update(h, data, key, prim_bkt, sig, alt_hash); > + ret =3D search_and_update(h, data, key, prim_bkt, short_sig); > if (ret !=3D -1) { > enqueue_slot_back(h, cached_free_slots, slot_id); > goto failure; > } >=20 > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > - ret =3D search_and_update(h, data, key, cur_bkt, alt_hash, sig); > + ret =3D search_and_update(h, data, key, cur_bkt, short_sig); > if (ret !=3D -1) { > enqueue_slot_back(h, cached_free_slots, slot_id); > goto failure; > @@ -924,8 +941,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > for (i =3D 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { > /* Check if slot is available */ > if (likely(cur_bkt->key_idx[i] =3D=3D EMPTY_SLOT)) { > - cur_bkt->sig_current[i] =3D alt_hash; > - cur_bkt->sig_alt[i] =3D sig; > + cur_bkt->sig_current[i] =3D short_sig; > cur_bkt->key_idx[i] =3D new_idx; > __hash_rw_writer_unlock(h); > return new_idx - 1; > @@ -943,8 +959,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, >=20 > bkt_id =3D (uint32_t)((uintptr_t)ext_bkt_id) - 1; > /* Use the first location of the new bucket */ > - (h->buckets_ext[bkt_id]).sig_current[0] =3D alt_hash; > - (h->buckets_ext[bkt_id]).sig_alt[0] =3D sig; > + (h->buckets_ext[bkt_id]).sig_current[0] =3D short_sig; > (h->buckets_ext[bkt_id]).key_idx[0] =3D new_idx; > /* Link the new bucket to sec bucket linked list */ > last =3D rte_hash_get_last_bkt(sec_bkt); @@ -1003,7 +1018,7 @@ > rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *da= ta) >=20 > /* Search one bucket to find the match key */ static inline int32_t - > search_one_bucket(const struct rte_hash *h, const void *key, hash_sig_t s= ig, > +search_one_bucket(const struct rte_hash *h, const void *key, uint16_t > +sig, > void **data, const struct rte_hash_bucket *bkt) { > int i; > @@ -1032,30 +1047,30 @@ static inline int32_t > __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key, > hash_sig_t sig, void **data) > { > - uint32_t bucket_idx; > - hash_sig_t alt_hash; > + uint32_t prim_bucket_idx, sec_bucket_idx; > struct rte_hash_bucket *bkt, *cur_bkt; > int ret; > + uint16_t short_sig; >=20 > - bucket_idx =3D sig & h->bucket_bitmask; > - bkt =3D &h->buckets[bucket_idx]; > + short_sig =3D get_short_sig(sig); > + prim_bucket_idx =3D get_prim_bucket_index(h, sig); > + sec_bucket_idx =3D get_alt_bucket_index(h, prim_bucket_idx, short_sig); > + bkt =3D &h->buckets[prim_bucket_idx]; >=20 > __hash_rw_reader_lock(h); >=20 > /* Check if key is in primary location */ > - ret =3D search_one_bucket(h, key, sig, data, bkt); > + ret =3D search_one_bucket(h, key, short_sig, data, bkt); > if (ret !=3D -1) { > __hash_rw_reader_unlock(h); > return ret; > } > /* Calculate secondary hash */ > - alt_hash =3D rte_hash_secondary_hash(sig); > - bucket_idx =3D alt_hash & h->bucket_bitmask; > - bkt =3D &h->buckets[bucket_idx]; > + bkt =3D &h->buckets[sec_bucket_idx]; >=20 > /* Check if key is in secondary location */ > FOR_EACH_BUCKET(cur_bkt, bkt) { > - ret =3D search_one_bucket(h, key, alt_hash, data, cur_bkt); > + ret =3D search_one_bucket(h, key, short_sig, data, cur_bkt); > if (ret !=3D -1) { > __hash_rw_reader_unlock(h); > return ret; > @@ -1102,7 +1117,6 @@ remove_entry(const struct rte_hash *h, struct > rte_hash_bucket *bkt, unsigned i) > struct lcore_cache *cached_free_slots; >=20 > bkt->sig_current[i] =3D NULL_SIGNATURE; > - bkt->sig_alt[i] =3D NULL_SIGNATURE; > if (h->multi_writer_support) { > lcore_id =3D rte_lcore_id(); > cached_free_slots =3D &h->local_free_slots[lcore_id]; @@ - > 1141,9 +1155,7 @@ __rte_hash_compact_ll(struct rte_hash_bucket *cur_bkt, > int pos) { > if (last_bkt->key_idx[i] !=3D EMPTY_SLOT) { > cur_bkt->key_idx[pos] =3D last_bkt->key_idx[i]; > cur_bkt->sig_current[pos] =3D last_bkt->sig_current[i]; > - cur_bkt->sig_alt[pos] =3D last_bkt->sig_alt[i]; > last_bkt->sig_current[i] =3D NULL_SIGNATURE; > - last_bkt->sig_alt[i] =3D NULL_SIGNATURE; > last_bkt->key_idx[i] =3D EMPTY_SLOT; > return; > } > @@ -1153,7 +1165,7 @@ __rte_hash_compact_ll(struct rte_hash_bucket > *cur_bkt, int pos) { > /* Search one bucket and remove the matched key */ static inline int32_= t > search_and_remove(const struct rte_hash *h, const void *key, > - struct rte_hash_bucket *bkt, hash_sig_t sig, int *pos) > + struct rte_hash_bucket *bkt, uint16_t sig, int *pos) > { > struct rte_hash_key *k, *keys =3D h->key_store; > unsigned int i; > @@ -1185,19 +1197,21 @@ static inline int32_t > __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key, > hash_sig_t sig) > { > - uint32_t bucket_idx; > - hash_sig_t alt_hash; > + uint32_t prim_bucket_idx, sec_bucket_idx; > struct rte_hash_bucket *prim_bkt, *sec_bkt, *prev_bkt, *last_bkt; > struct rte_hash_bucket *cur_bkt; > int pos; > int32_t ret, i; > + uint16_t short_sig; >=20 > - bucket_idx =3D sig & h->bucket_bitmask; > - prim_bkt =3D &h->buckets[bucket_idx]; > + short_sig =3D get_short_sig(sig); > + prim_bucket_idx =3D get_prim_bucket_index(h, sig); > + sec_bucket_idx =3D get_alt_bucket_index(h, prim_bucket_idx, short_sig); > + prim_bkt =3D &h->buckets[prim_bucket_idx]; >=20 > __hash_rw_writer_lock(h); > /* look for key in primary bucket */ > - ret =3D search_and_remove(h, key, prim_bkt, sig, &pos); > + ret =3D search_and_remove(h, key, prim_bkt, short_sig, &pos); > if (ret !=3D -1) { > __rte_hash_compact_ll(prim_bkt, pos); > last_bkt =3D prim_bkt->next; > @@ -1206,12 +1220,10 @@ __rte_hash_del_key_with_hash(const struct > rte_hash *h, const void *key, > } >=20 > /* Calculate secondary hash */ > - alt_hash =3D rte_hash_secondary_hash(sig); > - bucket_idx =3D alt_hash & h->bucket_bitmask; > - sec_bkt =3D &h->buckets[bucket_idx]; > + sec_bkt =3D &h->buckets[sec_bucket_idx]; >=20 > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > - ret =3D search_and_remove(h, key, cur_bkt, alt_hash, &pos); > + ret =3D search_and_remove(h, key, cur_bkt, short_sig, &pos); > if (ret !=3D -1) { > __rte_hash_compact_ll(cur_bkt, pos); > last_bkt =3D sec_bkt->next; > @@ -1288,55 +1300,35 @@ static inline void compare_signatures(uint32_t > *prim_hash_matches, uint32_t *sec_hash_matches, > const struct rte_hash_bucket *prim_bkt, > const struct rte_hash_bucket *sec_bkt, > - hash_sig_t prim_hash, hash_sig_t sec_hash, > + uint16_t sig, > enum rte_hash_sig_compare_function sig_cmp_fn) { > unsigned int i; >=20 > + /* For match mask the first bit of every two bits indicates the match > +*/ > switch (sig_cmp_fn) { > -#ifdef RTE_MACHINE_CPUFLAG_AVX2 > - case RTE_HASH_COMPARE_AVX2: > - *prim_hash_matches =3D > _mm256_movemask_ps((__m256)_mm256_cmpeq_epi32( > - _mm256_load_si256( > - (__m256i const *)prim_bkt- > >sig_current), > - _mm256_set1_epi32(prim_hash))); > - *sec_hash_matches =3D > _mm256_movemask_ps((__m256)_mm256_cmpeq_epi32( > - _mm256_load_si256( > - (__m256i const *)sec_bkt- > >sig_current), > - _mm256_set1_epi32(sec_hash))); > - break; > -#endif > #ifdef RTE_MACHINE_CPUFLAG_SSE2 > case RTE_HASH_COMPARE_SSE: > - /* Compare the first 4 signatures in the bucket */ > - *prim_hash_matches =3D > _mm_movemask_ps((__m128)_mm_cmpeq_epi16( > + /* Compare all signatures in the bucket */ > + *prim_hash_matches =3D > _mm_movemask_epi8(_mm_cmpeq_epi16( > _mm_load_si128( > (__m128i const *)prim_bkt- > >sig_current), > - _mm_set1_epi32(prim_hash))); > - *prim_hash_matches |=3D > (_mm_movemask_ps((__m128)_mm_cmpeq_epi16( > - _mm_load_si128( > - (__m128i const *)&prim_bkt- > >sig_current[4]), > - _mm_set1_epi32(prim_hash)))) << 4; > - /* Compare the first 4 signatures in the bucket */ > - *sec_hash_matches =3D > _mm_movemask_ps((__m128)_mm_cmpeq_epi16( > + _mm_set1_epi16(sig))); > + /* Compare all signatures in the bucket */ > + *sec_hash_matches =3D > _mm_movemask_epi8(_mm_cmpeq_epi16( > _mm_load_si128( > (__m128i const *)sec_bkt- > >sig_current), > - _mm_set1_epi32(sec_hash))); > - *sec_hash_matches |=3D > (_mm_movemask_ps((__m128)_mm_cmpeq_epi16( > - _mm_load_si128( > - (__m128i const *)&sec_bkt- > >sig_current[4]), > - _mm_set1_epi32(sec_hash)))) << 4; > + _mm_set1_epi16(sig))); > break; > #endif > default: > for (i =3D 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { > *prim_hash_matches |=3D > - ((prim_hash =3D=3D prim_bkt->sig_current[i]) << i); > + ((sig =3D=3D prim_bkt->sig_current[i]) << (i << 1)); > *sec_hash_matches |=3D > - ((sec_hash =3D=3D sec_bkt->sig_current[i]) << i); > + ((sig =3D=3D sec_bkt->sig_current[i]) << (i << 1)); > } > } > - > } >=20 > #define PREFETCH_OFFSET 4 > @@ -1349,7 +1341,9 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, > const void **keys, > int32_t i; > int32_t ret; > uint32_t prim_hash[RTE_HASH_LOOKUP_BULK_MAX]; > - uint32_t sec_hash[RTE_HASH_LOOKUP_BULK_MAX]; > + uint32_t prim_index[RTE_HASH_LOOKUP_BULK_MAX]; > + uint32_t sec_index[RTE_HASH_LOOKUP_BULK_MAX]; > + uint16_t sig[RTE_HASH_LOOKUP_BULK_MAX]; > const struct rte_hash_bucket > *primary_bkt[RTE_HASH_LOOKUP_BULK_MAX]; > const struct rte_hash_bucket > *secondary_bkt[RTE_HASH_LOOKUP_BULK_MAX]; > uint32_t prim_hitmask[RTE_HASH_LOOKUP_BULK_MAX] =3D {0}; @@ - > 1368,10 +1362,13 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, > const void **keys, > rte_prefetch0(keys[i + PREFETCH_OFFSET]); >=20 > prim_hash[i] =3D rte_hash_hash(h, keys[i]); > - sec_hash[i] =3D rte_hash_secondary_hash(prim_hash[i]); >=20 > - primary_bkt[i] =3D &h->buckets[prim_hash[i] & h- > >bucket_bitmask]; > - secondary_bkt[i] =3D &h->buckets[sec_hash[i] & h- > >bucket_bitmask]; > + sig[i] =3D get_short_sig(prim_hash[i]); > + prim_index[i] =3D get_prim_bucket_index(h, prim_hash[i]); > + sec_index[i] =3D get_alt_bucket_index(h, prim_index[i], sig[i]); > + > + primary_bkt[i] =3D &h->buckets[prim_index[i]]; > + secondary_bkt[i] =3D &h->buckets[sec_index[i]]; >=20 > rte_prefetch0(primary_bkt[i]); > rte_prefetch0(secondary_bkt[i]); > @@ -1380,10 +1377,13 @@ __rte_hash_lookup_bulk(const struct rte_hash > *h, const void **keys, > /* Calculate and prefetch rest of the buckets */ > for (; i < num_keys; i++) { > prim_hash[i] =3D rte_hash_hash(h, keys[i]); > - sec_hash[i] =3D rte_hash_secondary_hash(prim_hash[i]); >=20 > - primary_bkt[i] =3D &h->buckets[prim_hash[i] & h- > >bucket_bitmask]; > - secondary_bkt[i] =3D &h->buckets[sec_hash[i] & h- > >bucket_bitmask]; > + sig[i] =3D get_short_sig(prim_hash[i]); > + prim_index[i] =3D get_prim_bucket_index(h, prim_hash[i]); > + sec_index[i] =3D get_alt_bucket_index(h, prim_index[i], sig[i]); > + > + primary_bkt[i] =3D &h->buckets[prim_index[i]]; > + secondary_bkt[i] =3D &h->buckets[sec_index[i]]; >=20 > rte_prefetch0(primary_bkt[i]); > rte_prefetch0(secondary_bkt[i]); > @@ -1394,10 +1394,11 @@ __rte_hash_lookup_bulk(const struct rte_hash > *h, const void **keys, > for (i =3D 0; i < num_keys; i++) { > compare_signatures(&prim_hitmask[i], &sec_hitmask[i], > primary_bkt[i], secondary_bkt[i], > - prim_hash[i], sec_hash[i], h->sig_cmp_fn); > + sig[i], h->sig_cmp_fn); >=20 > if (prim_hitmask[i]) { > - uint32_t first_hit =3D __builtin_ctzl(prim_hitmask[i]); > + uint32_t first_hit =3D > + __builtin_ctzl(prim_hitmask[i]) >> 1; > uint32_t key_idx =3D primary_bkt[i]->key_idx[first_hit]; > const struct rte_hash_key *key_slot =3D > (const struct rte_hash_key *)( > @@ -1408,7 +1409,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, > const void **keys, > } >=20 > if (sec_hitmask[i]) { > - uint32_t first_hit =3D __builtin_ctzl(sec_hitmask[i]); > + uint32_t first_hit =3D > + __builtin_ctzl(sec_hitmask[i]) >> 1; > uint32_t key_idx =3D secondary_bkt[i]- > >key_idx[first_hit]; > const struct rte_hash_key *key_slot =3D > (const struct rte_hash_key *)( > @@ -1422,7 +1424,8 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, > const void **keys, > for (i =3D 0; i < num_keys; i++) { > positions[i] =3D -ENOENT; > while (prim_hitmask[i]) { > - uint32_t hit_index =3D __builtin_ctzl(prim_hitmask[i]); > + uint32_t hit_index =3D > + __builtin_ctzl(prim_hitmask[i]) >> 1; >=20 > uint32_t key_idx =3D primary_bkt[i]->key_idx[hit_index]; > const struct rte_hash_key *key_slot =3D @@ -1441,11 > +1444,12 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void > **keys, > positions[i] =3D key_idx - 1; > goto next_key; > } > - prim_hitmask[i] &=3D ~(1 << (hit_index)); > + prim_hitmask[i] &=3D ~(3ULL << (hit_index << 1)); > } >=20 > while (sec_hitmask[i]) { > - uint32_t hit_index =3D __builtin_ctzl(sec_hitmask[i]); > + uint32_t hit_index =3D > + __builtin_ctzl(sec_hitmask[i]) >> 1; >=20 > uint32_t key_idx =3D secondary_bkt[i]- > >key_idx[hit_index]; > const struct rte_hash_key *key_slot =3D @@ -1465,7 > +1469,7 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void > **keys, > positions[i] =3D key_idx - 1; > goto next_key; > } > - sec_hitmask[i] &=3D ~(1 << (hit_index)); > + sec_hitmask[i] &=3D ~(3ULL << (hit_index << 1)); > } >=20 > next_key: > @@ -1488,10 +1492,10 @@ __rte_hash_lookup_bulk(const struct rte_hash > *h, const void **keys, > FOR_EACH_BUCKET(cur_bkt, next_bkt) { > if (data !=3D NULL) > ret =3D search_one_bucket(h, keys[i], > - sec_hash[i], &data[i], > cur_bkt); > + sig[i], &data[i], cur_bkt); > else > ret =3D search_one_bucket(h, keys[i], > - sec_hash[i], NULL, cur_bkt); > + sig[i], NULL, cur_bkt); > if (ret !=3D -1) { > positions[i] =3D ret; > hits |=3D 1ULL << i; > diff --git a/lib/librte_hash/rte_cuckoo_hash.h > b/lib/librte_hash/rte_cuckoo_hash.h > index e601520..7753cd8 100644 > --- a/lib/librte_hash/rte_cuckoo_hash.h > +++ b/lib/librte_hash/rte_cuckoo_hash.h > @@ -129,18 +129,15 @@ struct rte_hash_key { enum > rte_hash_sig_compare_function { > RTE_HASH_COMPARE_SCALAR =3D 0, > RTE_HASH_COMPARE_SSE, > - RTE_HASH_COMPARE_AVX2, > RTE_HASH_COMPARE_NUM > }; >=20 > /** Bucket structure */ > struct rte_hash_bucket { > - hash_sig_t sig_current[RTE_HASH_BUCKET_ENTRIES]; > + uint16_t sig_current[RTE_HASH_BUCKET_ENTRIES]; >=20 > uint32_t key_idx[RTE_HASH_BUCKET_ENTRIES]; >=20 > - hash_sig_t sig_alt[RTE_HASH_BUCKET_ENTRIES]; > - > uint8_t flag[RTE_HASH_BUCKET_ENTRIES]; >=20 > void *next; > @@ -193,6 +190,7 @@ struct rte_hash { >=20 > struct queue_node { > struct rte_hash_bucket *bkt; /* Current bucket on the bfs search */ > + uint32_t cur_bkt_idx; >=20 > struct queue_node *prev; /* Parent(bucket) in search path */ > int prev_slot; /* Parent(slot) in search path */ > diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h inde= x > 11d8e28..6ace64e 100644 > --- a/lib/librte_hash/rte_hash.h > +++ b/lib/librte_hash/rte_hash.h > @@ -40,7 +40,10 @@ extern "C" { > /** Flag to indicate the extendabe bucket table feature should be used *= / > #define RTE_HASH_EXTRA_FLAGS_EXT_TABLE 0x08 >=20 > -/** Signature of key that is stored internally. */ > +/** > + * The type of hash value of a key. > + * It should be a value of at least 32bit with fully random pattern. > + */ > typedef uint32_t hash_sig_t; >=20 > /** Type of function that can be used for calculating the hash value. */ > -- > 2.7.4 Reviewed-by: Honnappa Nagarahalli