From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 4BCB0A0487 for ; Fri, 5 Jul 2019 07:33:29 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 2B8DA2E83; Fri, 5 Jul 2019 07:33:28 +0200 (CEST) Received: from EUR03-AM5-obe.outbound.protection.outlook.com (mail-eopbgr30082.outbound.protection.outlook.com [40.107.3.82]) by dpdk.org (Postfix) with ESMTP id 06F852C5E; Fri, 5 Jul 2019 07:33:26 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dQ4wYFSKYY0ec9UtEWAjuQxS6sIqDNFpEkPnmr3v8ao=; b=5Vul15JAgbKriJz8yMdarJMaghe0wtRqpeLs0xsa895GIBoV+z8Zes57KvpwC7po2QwGIwAVjA3falvuboCk/yN9qb9ga33bdz7XYkNfM3LVSbjIegwIxKfzDba9GnmQd2z0cbRODqB30m2RRlpl6BpUctsvtZIVLv0AJ1kEND4= Received: from VE1PR08MB5149.eurprd08.prod.outlook.com (20.179.30.152) by VE1PR08MB4927.eurprd08.prod.outlook.com (10.255.114.23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2052.18; Fri, 5 Jul 2019 05:33:24 +0000 Received: from VE1PR08MB5149.eurprd08.prod.outlook.com ([fe80::a89e:33:fbda:ed35]) by VE1PR08MB5149.eurprd08.prod.outlook.com ([fe80::a89e:33:fbda:ed35%4]) with mapi id 15.20.2052.010; Fri, 5 Jul 2019 05:33:24 +0000 From: Honnappa Nagarahalli To: "Wang, Yipeng1" , "Gobriel, Sameh" , "Richardson, Bruce" , "De Lara Guarch, Pablo" CC: "Gavin Hu (Arm Technology China)" , "Ruifeng Wang (Arm Technology China)" , "dev@dpdk.org" , nd , "stable@dpdk.org" , nd , nd Thread-Topic: [PATCH 2/3] lib/hash: load pData after full key compare Thread-Index: AQHVLeD2jHh9aL8rKU6eVhpBi8s9w6a2wHwAgALljgCAAdsIoA== Date: Fri, 5 Jul 2019 05:33:24 +0000 Message-ID: References: <20190625211520.43181-1-honnappa.nagarahalli@arm.com> <20190625211520.43181-3-honnappa.nagarahalli@arm.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ts-tracking-id: 9a01fd9e-5de6-41a1-8b19-9f32ef0499b2.0 x-checkrecipientchecked: true authentication-results: spf=none (sender IP is ) smtp.mailfrom=Honnappa.Nagarahalli@arm.com; x-originating-ip: [217.140.111.135] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 1f9ad268-b5f9-4018-03fb-08d7010a4cf9 x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(7020095)(4652040)(8989299)(4534185)(4627221)(201703031133081)(201702281549075)(8990200)(5600148)(711020)(4605104)(1401327)(4618075)(2017052603328)(7193020); SRVR:VE1PR08MB4927; x-ms-traffictypediagnostic: VE1PR08MB4927: x-microsoft-antispam-prvs: nodisclaimer: True x-ms-oob-tlc-oobclassifiers: OLM:10000; x-forefront-prvs: 008960E8EC x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(4636009)(346002)(376002)(136003)(366004)(39860400002)(396003)(51914003)(199004)(189003)(7696005)(229853002)(86362001)(55016002)(26005)(3846002)(66066001)(6436002)(99286004)(52536014)(186003)(6246003)(6506007)(54906003)(6116002)(2906002)(5660300002)(102836004)(53936002)(33656002)(9686003)(110136005)(316002)(68736007)(14444005)(256004)(72206003)(66476007)(4326008)(66556008)(8936002)(476003)(73956011)(25786009)(66446008)(446003)(305945005)(486006)(71190400001)(71200400001)(76116006)(8676002)(14454004)(478600001)(81166006)(66946007)(81156014)(64756008)(7736002)(74316002)(11346002)(76176011); DIR:OUT; SFP:1101; SCL:1; SRVR:VE1PR08MB4927; H:VE1PR08MB5149.eurprd08.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; received-spf: None (protection.outlook.com: arm.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 x-microsoft-antispam-message-info: DcOhRBTUan3oQQjJECO146OAnI7m/JII+QawlDaSGoo9eDrdXr8Qgu3G4SzDBfJrjEd6iLZ2lWoG6jL5bIC1HrULwa4WWs319ajw6QVBm+MsdwHu5UOS7SH8RCM1R/eyIwOXIepvB+XG0DX9jZgmRPmPLIA6ZzOP6fji0lNaub7j47fCb2Lxl2f0uBMGyoGh011UZ9zA9Zzj/ioCADrLWB6iEun5PGZsZ5cU1MUu+OHNF7nxzA63IbobohG10BPRaNfN2eKxTn8hqRxu8wwOJC7JzXRX40Ua4NAz5l0VveS+Ha1XIIMcItB4cPFy6TvomL9GmI4sx6C57ohVjchBuVYFB16NK2Dqwy7AvO2cIjiBV4uwbVQS77bgEHnmZaENi08E63s7prsu9pynIxyJg0hpGdvosx/bazjee8sdaW8= Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-Network-Message-Id: 1f9ad268-b5f9-4018-03fb-08d7010a4cf9 X-MS-Exchange-CrossTenant-originalarrivaltime: 05 Jul 2019 05:33:24.7573 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: Honnappa.Nagarahalli@arm.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB4927 Subject: Re: [dpdk-dev] [PATCH 2/3] lib/hash: load pData after full key compare X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > >Subject: RE: [PATCH 2/3] lib/hash: load pData after full key compare > > > >Thank you Yipeng for your comments. > > > >> > > >> >When a hash entry is added, there are 2 sets of stores. > >> > > >> >1) The application writes its data to memory (whose address is > >> >provided in rte_hash_add_key_with_hash_data API (or NULL)) > >> >2) The rte_hash library writes to its own internal data structures; > >> >key store entry and the hash table. > >> > > >> >The only ordering requirement between these 2 is that - the store to > >> >the application data must complete before the store to key_index. > >> >There are no ordering requirements between the stores to the > >> >key/signature and store to application data. The synchronization > >> >point for application data can be any point between the 'store to > >> >application data' and 'store to the key_index'. So, pData should not > >> >be a guard variable for the data in hash table. It should be a guard > >> >variable only for the application data written to the memory > >> >location pointed by pData. Hence, pData can be loaded after full key > comparison. > >> > > >> >Fixes: e605a1d36 ("hash: add lock-free r/w concurrency") > >> >Cc: stable@dpdk.org > >> > > >> >Signed-off-by: Honnappa Nagarahalli > >> >Reviewed-by: Gavin Hu > >> >Tested-by: Ruifeng Wang > >> >--- > >> > lib/librte_hash/rte_cuckoo_hash.c | 67 > >> >+++++++++++++++---------------- > >> > 1 file changed, 32 insertions(+), 35 deletions(-) > >> > > >> >diff --git a/lib/librte_hash/rte_cuckoo_hash.c > >> >b/lib/librte_hash/rte_cuckoo_hash.c > >> >index f37f6957d..077328fed 100644 > >> >--- a/lib/librte_hash/rte_cuckoo_hash.c > >> >+++ b/lib/librte_hash/rte_cuckoo_hash.c > >> >@@ -649,9 +649,11 @@ search_and_update(const struct rte_hash *h, > >> >void > >> *data, const void *key, > >> > k =3D (struct rte_hash_key *) ((char *)keys + > >> > bkt->key_idx[i] * h->key_entry_size); > >> > if (rte_hash_cmp_eq(key, k->key, h) =3D=3D 0) { > >> >- /* 'pdata' acts as the synchronization point > >> >- * when an existing hash entry is updated. > >> >- * Key is not updated in this case. > >> >+ /* The store to application data at *data > >> >+ * should not leak after the store to pdata > >> >+ * in the key store. i.e. pdata is the guard > >> >+ * variable. Release the application data > >> >+ * to the readers. > >> > */ > >> > __atomic_store_n(&k->pdata, > >> > data, > >> >@@ -711,11 +713,10 @@ rte_hash_cuckoo_insert_mw(const struct > >> rte_hash *h, > >> > /* Check if slot is available */ > >> > if (likely(prim_bkt->key_idx[i] =3D=3D EMPTY_SLOT)) { > >> > prim_bkt->sig_current[i] =3D sig; > >> >- /* Key can be of arbitrary length, so it is > >> >- * not possible to store it atomically. > >> >- * Hence the new key element's memory stores > >> >- * (key as well as data) should be complete > >> >- * before it is referenced. > >> >+ /* Store to signature and key should not > >> >+ * leak after the store to key_idx. i.e. > >> >+ * key_idx is the guard variable for signature > >> >+ * and key. > >> > */ > >> > __atomic_store_n(&prim_bkt->key_idx[i], > >> > new_idx, > >> >@@ -990,17 +991,15 @@ __rte_hash_add_key_with_hash(const struct > >> >rte_hash *h, const void *key, > >> > > >> > new_k =3D RTE_PTR_ADD(keys, (uintptr_t)slot_id * h->key_entry_size)= ; > >> > new_idx =3D (uint32_t)((uintptr_t) slot_id); > >> >- /* Copy key */ > >> >- memcpy(new_k->key, key, h->key_len); > >> >- /* Key can be of arbitrary length, so it is not possible to store > >> >- * it atomically. Hence the new key element's memory stores > >> >- * (key as well as data) should be complete before it is referenced= . > >> >- * 'pdata' acts as the synchronization point when an existing hash > >> >- * entry is updated. > >> >+ /* The store to application data (by the application) at *data shou= ld > >> >+ * not leak after the store of pdata in the key store. i.e. pdata i= s > >> >+ * the guard variable. Release the application data to the readers. > >> > */ > >> > __atomic_store_n(&new_k->pdata, > >> > data, > >> > __ATOMIC_RELEASE); > >> [Wang, Yipeng] Actually do we need to guard pdata for the newly > >> inserted key? I thought the guard of key_idx already can make sure > >> The order for the application to read data. > >Yes, you are correct. In the hash_add case, the store-release on > >key_idx would be sufficient. However, hash_update case requires > >store-release on pData. This was the reason to keep store-release for pD= ata > in hash_add when the lock-free algorithm was introduced. >=20 > [Wang, Yipeng] Sorry that I am still a bit confused, we already have stor= e > release in search_and_update function right? Isn't that enough for the > hash_update case? No problem, looks like I did not use the correct terms. We are talking abou= t 2 paths in the code: 1) When a new key is getting inserted, store-release of key_idx acts as the= guard variable for the store to application data as well as the stores to = internal hash table structures (signature, key, pdata). 2) But when an existing key is updated, there is no store to key_idx. In th= is case 'pdata' acts as the guard variable for the store to application dat= a. Hence we need a store-release on 'pdata'. Due to this we need a load-acq= uire on 'pdata' in the lookup function. Then, why do we need store-release on 'pdata' in code path 1? - store-relea= se on 'pdata' in code path 1 is done for consistency with code path 2 i.e. = we want to use store-release on 'pdata' consistently in both the code paths= (unless we see performance degradation in path 1). IMO, it is much easier = to understand the code this way. > > > >> >+ /* Copy key */ > >> >+ memcpy(new_k->key, key, h->key_len); > >> [Wang, Yipeng] You don't need to do the order change just to show the > >> point of unnecessary ordering I think. > >> I am afraid it may cause further confusion for future people who read > >> this change, especially it is not in the commit Message (and it is a b= ug fix). > >I made this change to keep it inline with the corresponding change in > >the lookup function. I can add this explanation to the commit message. > Please let me know if this is ok for you. >=20 > [Wang, Yipeng] Thanks for the change. > To me it still looks unnecessary but If you think this cosmetic change wo= uld > help others to understand the code better, I am OK with it. I agree this is unnecessary. When the change was being reviewed internally = at Arm, I had not made this change initially. It resulted in questions as t= he new key insert's memory ordering steps did not match that of the lookup = function. IMO, if we look at the algorithm as a whole (instead of looking a= t this commit alone), this code will be easier to understand.