From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Wang, Yipeng1"
To: Honnappa Nagarahalli, "Richardson, Bruce", "De Lara Guarch, Pablo"
CC: "dev@dpdk.org", "gavin.hu@arm.com", "steve.capper@arm.com", "ola.liljedahl@arm.com", "nd@arm.com"
Date: Fri, 28 Sep 2018 00:43:15 +0000
Subject: Re: [dpdk-dev] [PATCH 2/4] hash: add memory ordering to avoid race conditions
In-Reply-To: <1536253938-192391-3-git-send-email-honnappa.nagarahalli@arm.com>
References: <1536253938-192391-1-git-send-email-honnappa.nagarahalli@arm.com> <1536253938-192391-3-git-send-email-honnappa.nagarahalli@arm.com>
List-Id: DPDK patches and discussions

Some general comments on the various __atomic_store/__atomic_load calls added:

1. Although this passes the compiler check, I want to confirm whether we should use the GCC/clang builtins directly, or whether there are higher-level APIs in DPDK for atomic operations.

2. We believe the compiler will translate the atomic store/load into a regular MOV instruction on a Total Store Order architecture (e.g. x86_64). However, we ran the perf test on x86, and below is the relative slowdown of lookup compared to master head. I am not sure whether the performance drop comes from the atomic builtins.

Keysize | single lookup | bulk lookup
   4    |     0.93      |    0.95
   8    |     0.95      |    0.96
  16    |     0.97      |    0.96
  32    |     0.97      |    1.00
  48    |     1.03      |    0.99
  64    |     1.04      |    0.98
   9    |     0.91      |    0.96
  13    |     0.97      |    0.98
  37    |     1.04      |    1.03
  40    |     1.02      |    0.98

Other comments inlined.

>-----Original Message-----
>Only race condition that can occur is - using the key store element
>before the key write is completed. Hence, while inserting the element
>the release memory order is used. Any other race condition is caught
>by the key comparison. Memory orderings are added only where needed.
>For ex: reads in the writer's context do not need memory ordering
>as there is a single writer.
>
[Wang, Yipeng] I assume this commit works together with the next one to enable read-write concurrency on machines with relaxed memory ordering. This commit by itself does not do that, right? If so, should we merge the two commits, or at least explain it a little more in the commit message?

>key_idx in the bucket entry and pdata in the key store element are
>used for synchronisation. key_idx is used to release an inserted
>entry in the bucket to the reader. Use of pdata for synchronisation
>is required due to updation of an existing entry where-in only
>the pdata is updated without updating key_idx.
>
>
>-/* Search a key from bucket and update its data */
>+/* Search a key from bucket and update its data.
>+ * Writer holds the lock before calling this.
>+ */
> static inline int32_t
> search_and_update(const struct rte_hash *h, void *data, const void *key,
> 		struct rte_hash_bucket *bkt, hash_sig_t sig, hash_sig_t alt_hash)
>@@ -499,8 +501,13 @@ search_and_update(const struct rte_hash *h, void *data, const void *key,
> 		k = (struct rte_hash_key *) ((char *)keys +
> 				bkt->key_idx[i] * h->key_entry_size);
> 		if (rte_hash_cmp_eq(key, k->key, h) == 0) {
>-			/* Update data */
>-			k->pdata = data;
>+			/* 'pdata' acts as the synchronization point
>+			 * when an existing hash entry is updated.
>+			 * Key is not updated in this case.
>+			 */
[Wang, Yipeng] I did not quite understand why we need synchronization for the hash data update. Since the pdata write is already atomic, a lookup will read either the stale data or the new data, which should be fine without synchronization. Is it to ensure the ordering of multiple reads in the lookup threads?
>@@ -1243,11 +1289,19 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
> 		while (sec_hitmask[i]) {
> 			uint32_t hit_index = __builtin_ctzl(sec_hitmask[i]);
>
>-			uint32_t key_idx = secondary_bkt[i]->key_idx[hit_index];
>+			uint32_t key_idx =
>+			__atomic_load_n(
>+				&secondary_bkt[i]->key_idx[hit_index],
>+				__ATOMIC_ACQUIRE);
> 			const struct rte_hash_key *key_slot =
> 				(const struct rte_hash_key *)(
> 				(const char *)h->key_store +
> 				key_idx * h->key_entry_size);
>+
>+			if (key_idx != EMPTY_SLOT)
[Wang, Yipeng] I think even the current code needs this EMPTY_SLOT check. Could you submit it as a separate bug-fix commit?