From: "Eads, Gage"
To: Honnappa Nagarahalli, "dev@dpdk.org"
CC: "olivier.matz@6wind.com", "arybchenko@solarflare.com", "Richardson, Bruce", "Ananyev, Konstantin", nd, nd
Subject: Re: [dpdk-dev] [PATCH v3 1/2] eal: add 128-bit cmpset (x86-64 only)
Date: Thu, 17 Jan 2019 23:03:00 +0000
Message-ID: <9184057F7FC11744A2107296B6B8EB1E541C8BCA@FMSMSX108.amr.corp.intel.com>
References: <20190115223232.31866-1-gage.eads@intel.com> <20190116151835.22424-1-gage.eads@intel.com> <20190116151835.22424-2-gage.eads@intel.com>
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Thursday, January 17, 2019 9:45 AM
> To: Eads, Gage; dev@dpdk.org
> Cc: olivier.matz@6wind.com; arybchenko@solarflare.com; Richardson, Bruce;
> Ananyev, Konstantin; nd; Honnappa Nagarahalli; nd
> Subject: RE: [dpdk-dev] [PATCH v3 1/2] eal: add 128-bit cmpset (x86-64 only)
>
> > Subject: [dpdk-dev] [PATCH v3 1/2] eal: add 128-bit cmpset (x86-64
> > only)
> >
> > This operation can be used for non-blocking algorithms, such as a
> > non-blocking stack or ring.
> >
> > Signed-off-by: Gage Eads
> > ---
> >  .../common/include/arch/x86/rte_atomic_64.h | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> > b/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> > index fd2ec9c53..34c2addf8 100644
> > --- a/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> > +++ b/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> Since this is a 128b operation should there be a new file created with the
> name rte_atomic_128.h?
>
> > @@ -34,6 +34,7 @@
> >  /*
> >   * Inspired from FreeBSD src/sys/amd64/include/atomic.h
> >   * Copyright (c) 1998 Doug Rabson
> > + * Copyright (c) 2019 Intel Corporation
> >   * All rights reserved.
> >   */
> >
> > @@ -208,4 +209,25 @@ static inline void rte_atomic64_clear(rte_atomic64_t *v)
> >  }
> >  #endif
> >
> > +static inline int
> > +rte_atomic128_cmpset(volatile uint64_t *dst, uint64_t *exp, uint64_t *src)
> > +{
> The API name suggests it is a 128b operation. 'dst', 'exp' and 'src' should
> be pointers to 128b (__int128)? Or we could define our own data type.

I agree, I'm not a big fan of the 64b pointers here. I avoided __int128
originally because it fails to compile with -pedantic, but on second thought
(and with your suggestion of a separate data type), we can resolve that with
this typedef:

typedef struct {
	RTE_STD_C11 __int128 val;
} rte_int128_t;

> Since it is a new API, can we define it with memory orderings which will be
> more conducive to relaxed memory ordering based architectures? You can refer
> to [1] and [2] for guidance.

I certainly see the value in controlling the operation's memory ordering, like
in the __atomic intrinsics, but I'm not sure this patchset is the right place
to address that. I see that work going one of two ways:

1. Expand the existing rte_atomicN_* interfaces with additional arguments. In
that case, I'd prefer this be done in a separate patchset that addresses all
the atomic operations, not just cmpset, so the interface changes are chosen
according to the needs of the full set of atomic operations. If this approach
is taken then there's no need to solve this while rte_atomic128_cmpset is
experimental, since all the other functions are non-experimental anyway.

- Or -

2. Don't modify the existing rte_atomicN_* interfaces (or their strongly
ordered behavior), and instead create new versions of them that take
additional arguments. In this case, we can implement rte_atomic128_cmpset() as
is and create a more flexible version in a later patchset.

Either way, I think the current interface (w.r.t. memory ordering options) can
work and still leaves us in a good position for future changes/improvements.

> If this is an external API, it requires 'experimental' tag.

Good catch -- will fix.

>
> 1. https://github.com/ARM-software/progress64/blob/master/src/lockfree/aarch64.h#L63

I didn't know about aarch64's CASP instruction -- very cool!

> 2. https://github.com/ARM-software/progress64/blob/master/src/lockfree/x86-64.h#L34
>
> > +	uint8_t res;
> > +
> > +	asm volatile (
> > +		      MPLOCKED
> > +		      "cmpxchg16b %[dst];"
> > +		      " sete %[res]"
> > +		      : [dst] "=m" (*dst),
> > +			[res] "=r" (res)
> > +		      : "c" (src[1]),
> > +			"b" (src[0]),
> > +			"m" (*dst),
> > +			"d" (exp[1]),
> > +			"a" (exp[0])
> > +		      : "memory");
> > +
> > +	return res;
> > +}
> > +
> >  #endif /* _RTE_ATOMIC_X86_64_H_ */
> > --
> > 2.13.6
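
To make the data-type discussion above concrete, here is a minimal sketch of
what a cmpset over a dedicated 128-bit type might look like on x86-64. It is
not the patch. sketch_int128_t and sketch_atomic128_cmpset are hypothetical
names, and the type holds two uint64_t halves instead of __int128 (an
assumption, made only to sidestep the -pedantic question). RDX:RAX are
declared as read-write operands, so *exp receives the observed value on
failure; that is a choice made for this sketch, not something settled in this
thread.

```c
#include <assert.h>
#include <stdint.h>

#if !defined(__x86_64__)
#error "this sketch relies on the x86-64 cmpxchg16b instruction"
#endif

/* Hypothetical 128-bit type; cmpxchg16b needs a 16-byte-aligned operand. */
typedef struct {
	uint64_t val[2]; /* val[0] = low 64 bits, val[1] = high 64 bits */
} __attribute__((aligned(16))) sketch_int128_t;

/*
 * Hypothetical 128-bit compare-and-set (not a DPDK API).
 * Returns 1 and stores *src into *dst if *dst == *exp.
 * Returns 0 otherwise and copies the value found in *dst into *exp.
 */
static inline int
sketch_atomic128_cmpset(volatile sketch_int128_t *dst,
			sketch_int128_t *exp,
			const sketch_int128_t *src)
{
	uint8_t res;

	/* A misaligned operand would raise #GP, so check it up front. */
	assert(((uintptr_t)dst & 15) == 0);

	__asm__ volatile (
		"lock; cmpxchg16b %[dst];"
		" sete %[res]"
		: [dst] "+m" (*dst),       /* read and possibly written */
		  [res] "=r" (res),        /* ZF: 1 on success */
		  "+a" (exp->val[0]),      /* RAX: expected low, updated on failure */
		  "+d" (exp->val[1])       /* RDX: expected high, updated on failure */
		: "b" (src->val[0]),       /* RBX: new low */
		  "c" (src->val[1])        /* RCX: new high */
		: "memory", "cc");

	return res;
}
```

A locked cmpxchg16b is a full barrier on x86, so this sketch takes no
memory-ordering argument; a more flexible variant along the lines of option 2
would need one.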