From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 75858A0471 for ; Fri, 19 Jul 2019 15:56:32 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 93E862C17; Fri, 19 Jul 2019 15:56:31 +0200 (CEST) Received: from EUR01-VE1-obe.outbound.protection.outlook.com (mail-eopbgr140081.outbound.protection.outlook.com [40.107.14.81]) by dpdk.org (Postfix) with ESMTP id 613452BC7 for ; Fri, 19 Jul 2019 15:56:30 +0200 (CEST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=bRGVyijyYvfmUquttkFiiYg1/wQn/izhTh6dZsqFTY50Xwhiz2vOKOqlr7IWf5h+b5903pSskVcU6lpWveqpmxxVdN3LhxQIcq663NZF5MuxUkhZD7hCVyQLq2THIL1Tj0Z+S+CVXp2PLBGvwvtPEJmONVRSL3TLbqgLq0u/QGGIg7WipWOSyFCuhF0dpKuvGyTEI6dgJ45d+bCk6bAo4NgikF1xzkHt1yXbvV1O9hCvfvS/of51xuKJ0gGvfmzdvSftxSDpRQzDiEkfgTGBBDmYhSSjdp7u/O10Vh4Kh5E9ntzs++TdK2sdlQM8vtIVumU+LW0qu/Lco1LPmHZ2Pw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dtK1Y1FnuOL9FVCA2GiMJBXrKBxsFkau9gjHWucAUVI=; b=E9sE7Yi96Ekl0MHYOij2QNS5rHPaIqXcpQZZcwJOSvSPlOTv9etwSqznSEgAx8jm68JaRjlHrq/ZqHcgsRgpgJQxqTCBUMzRL639ECTPg7Cb3kFVL8fvU0ifpSqc3UXVWqaR84Ebc+2w6CVLoAhrbI6VFRIIKyMoqOzCFoi7wvu3EDpgdnzPb6UAgNt7rCyTx47Fh+FCkYfTiR8WDcMVQ6T3DbnWa3H5527nd+HwrOTK99bfRjduyjXv1G4IAaOqefK4iY4iQFjzFPhNSXUFaQnzMK/+1tas4VfP/wEt6f9SQ3TtiyrixYn9Fi/tyhRDjJCKDe97wNhO5YnIHvXUEQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1;spf=pass smtp.mailfrom=arm.com;dmarc=pass action=none header.from=arm.com;dkim=pass header.d=arm.com;arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dtK1Y1FnuOL9FVCA2GiMJBXrKBxsFkau9gjHWucAUVI=; b=7hGFwOj+EDoCovccQp0uiCPXN9aS+nOD6YBvpdZrgU2kMomSNoTIOLVmOvi7KU6klLoNyKRiJlt7D3PvVxWDoHw/5KuhbyOuz5pKHe29qkbVEfW3hAkORx5Ab+uHPaxLKE0rMZArbagtOFU+QbeKwPfaOdTiJNUO+wjCXkHqAsw= Received: from VE1PR08MB4640.eurprd08.prod.outlook.com (10.255.27.75) by VE1PR08MB4877.eurprd08.prod.outlook.com (10.255.113.147) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2073.14; Fri, 19 Jul 2019 13:56:27 +0000 Received: from VE1PR08MB4640.eurprd08.prod.outlook.com ([fe80::f4e4:378b:49d3:d876]) by VE1PR08MB4640.eurprd08.prod.outlook.com ([fe80::f4e4:378b:49d3:d876%5]) with mapi id 15.20.2094.013; Fri, 19 Jul 2019 13:56:27 +0000 From: "Phil Yang (Arm Technology China)" To: "jerinj@marvell.com" , "gage.eads@intel.com" , "dev@dpdk.org" CC: "thomas@monjalon.net" , "hemant.agrawal@nxp.com" , Honnappa Nagarahalli , "Gavin Hu (Arm Technology China)" , nd , nd Thread-Topic: [EXT] [PATCH v3 1/3] eal/arm64: add 128-bit atomic compare exchange Thread-Index: AQHVPfq0Xw54Ur+dYUGfqkk+8/l5GqbRl9uggABJBwCAAAFicA== Date: Fri, 19 Jul 2019 13:56:27 +0000 Message-ID: References: <1561257671-10316-1-git-send-email-phil.yang@arm.com> <1561709503-11665-1-git-send-email-phil.yang@arm.com> In-Reply-To: Accept-Language: zh-CN, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ts-tracking-id: 88dd1dd2-1712-4c13-b629-06c63d4cee9a.0 x-checkrecipientchecked: true authentication-results: spf=none (sender IP is ) smtp.mailfrom=Phil.Yang@arm.com; x-originating-ip: [113.29.88.7] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 58608f70-1968-48dd-00d2-08d70c50e528 x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(7020095)(4652040)(8989299)(4534185)(7168020)(4627221)(201703031133081)(201702281549075)(8990200)(5600148)(711020)(4605104)(1401327)(4618075)(2017052603328)(7193020); SRVR:VE1PR08MB4877; x-ms-traffictypediagnostic: VE1PR08MB4877: x-ms-exchange-purlcount: 2 x-ld-processed: f34e5979-57d9-4aaa-ad4d-b122a662184d,ExtAddr x-microsoft-antispam-prvs: nodisclaimer: True x-ms-oob-tlc-oobclassifiers: OLM:10000; x-forefront-prvs: 01039C93E4 x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(4636009)(39860400002)(366004)(136003)(376002)(346002)(396003)(13464003)(199004)(189003)(966005)(5660300002)(305945005)(81166006)(68736007)(7736002)(33656002)(110136005)(64756008)(8936002)(7696005)(256004)(229853002)(3846002)(486006)(81156014)(66946007)(76116006)(6506007)(55016002)(478600001)(102836004)(53546011)(76176011)(86362001)(66066001)(4326008)(2501003)(14454004)(71190400001)(9686003)(71200400001)(6306002)(55236004)(14444005)(446003)(8676002)(2201001)(186003)(476003)(6116002)(53936002)(99286004)(52536014)(66446008)(74316002)(66556008)(66476007)(6436002)(6246003)(25786009)(2906002)(316002)(11346002)(26005)(54906003); DIR:OUT; SFP:1101; SCL:1; SRVR:VE1PR08MB4877; H:VE1PR08MB4640.eurprd08.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; received-spf: None (protection.outlook.com: arm.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 x-microsoft-antispam-message-info: 5Ss7Dpsrhk1dX5lT+PuypwICTLz81XnU6rsdbisqspmGwGMgt1EsYNT5OBH++TCFp03h+3O00S2y59UM8pzwSzTt+p8Cobw8smXFyoEBDYzZ7tAr8GcK3Q4Fas6cwMG62vaL3g9UlshsY0Sxx3v3i7Jfk3beAJfEr8BrJmarQ6hsNAGw6anOwUgO/HJM0kkiDPKqHm977vdl7k34AXTrsxWyI7rU+x553//kdbfUw7cr0vgbUFqr6qFIAI7+GNsda2YsAX0dKNf+MCfiF3WrvJeSC7ykFncF4NtkDZT0glFHLAEzz/UpZi8IOKqunfXUEfF3p+rJyBx/UL1FCQ34swz/1RYI61w0Vx7Oa2cPM3nP80GMNF+x6qin0L+fam2iocQKQOnfq2gsVXlIRqsGCUo9HtICWwOYPkTfIqu3W5Q= Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-Network-Message-Id: 58608f70-1968-48dd-00d2-08d70c50e528 X-MS-Exchange-CrossTenant-originalarrivaltime: 19 Jul 2019 13:56:27.5916 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: Phil.Yang@arm.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB4877 Subject: Re: [dpdk-dev] [EXT] [PATCH v3 1/3] eal/arm64: add 128-bit atomic compare exchange X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > -----Original Message----- > From: Jerin Jacob Kollanukkaran > Sent: Friday, July 19, 2019 8:35 PM > To: Phil Yang (Arm Technology China) ; dev@dpdk.org > Cc: thomas@monjalon.net; hemant.agrawal@nxp.com; Honnappa > Nagarahalli ; Gavin Hu (Arm Technology > China) ; nd ; gage.eads@intel.com; nd > > Subject: RE: [EXT] [PATCH v3 1/3] eal/arm64: add 128-bit atomic compare > exchange >=20 > > > > +#define RTE_HAS_ACQ(mo) ((mo) !=3D __ATOMIC_RELAXED && (mo) !=3D > > > > +__ATOMIC_RELEASE) #define RTE_HAS_RLS(mo) ((mo) =3D=3D > > > > __ATOMIC_RELEASE || \ > > > > + (mo) =3D=3D __ATOMIC_ACQ_REL || \ > > > > + (mo) =3D=3D __ATOMIC_SEQ_CST) > > > > + > > > > +#define RTE_MO_LOAD(mo) (RTE_HAS_ACQ((mo)) \ > > > > + ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED) #define > > > > RTE_MO_STORE(mo) > > > > +(RTE_HAS_RLS((mo)) \ > > > > + ? __ATOMIC_RELEASE : __ATOMIC_RELAXED) > > > > + > > > > > > The one starts with RTE_ are public symbols, If it is generic enough, > > > Move to common layer so that every architecturse can use. > > > If you think, otherwise make it internal > > > > Let's keep it internal. I will remove the 'RTE_' tag. >=20 > Probably change to __HAS_ACQ to avoid collision(just in case) OK. >=20 > > > > > > > > > > > > > +#ifdef __ARM_FEATURE_ATOMICS > > > > > > This define is added in gcc 9.1 and I believe for clang it is not sup= ported > yet. > > > So old gcc and clang this will be undefined. > > > I think, With meson + native build, we can find the presence of > > > ATOMIC support by running a.out. Not sure about make and cross build > case. > > > I don't want block this feature because of this, IMO, We can add this > > > code with existing __ARM_FEATURE_ATOMICS scheme and later find a > > > method to enhance it. But please check how to fix it. > > > > OK. >=20 > After thinking on this a bit, I think, in order to support old gcc(< gcc= 9.1) and > clang, > We can introduce a config option, where, by default it is disabled and en= able > In specific config(where we know, lse is supported) and meson config. >=20 > i.e > #if defined(__ARM_FEATURE_ATOMICS) || > defined(RTE_ARM_FEATURE_ATOMICS) Cool >=20 >=20 > > > > > > > > > +#define __ATOMIC128_CAS_OP(cas_op_name, op_string) > \ > > > > +static inline rte_int128_t = \ > > > > +cas_op_name(rte_int128_t *dst, rte_int128_t old, = \ > > > > + rte_int128_t updated) = \ > > > > +{ = \ > > > > + /* caspX instructions register pair must start from even-numbered > > > > + * register at operand 1. > > > > + * So, specify registers for local variables here. > > > > + */ = \ > > > > + register uint64_t x0 __asm("x0") =3D (uint64_t)old.val[0]; = \ > > > > > > Since direct x0 register used in the code and > > > cas_op_name() and rte_atomic128_cmp_exchange() is inline function, > > > Based on parent function load, we may corrupt x0 register aka > > > > Since x0/x1 and x2/x3 are used a lot and often contain live values. > > Maybe to change them to some relatively less frequently used registers > like > > x14/x15 and x16/x17 might help for this case? > > According to the PCS (Procedure Call Standard), x14-x17 are also tempor= ary > > registers. >=20 > X14-x17 are temporary registers but since > cas_op_name() and rte_atomic128_cmp_exchange() are inline functions, > Based on the parent function register usage, it _may_ corrupt. Just checked how Linux Kernel does similar things: https://github.com/torvalds/linux/blob/master/arch/arm64/include/asm/atomic= _lse.h#L19=20 Same methods. I will finish the benchmarking for the no_inline approach. If it has no sig= nificant performance loss, I think we can make it as no_inline. =20 >=20 >=20 > > > > > Break arm64 ABI. Not sure clobber list will help here or not? > > > > In my understanding, for the register variable, if it contains a live v= alue in > the > > specified register, the compiler will move the live value into a free r= egister. > > Since x0~x3 are present in the input/output operands and x0/x1's value > needs to > > be restored to the variable 'old' as a return value. > > So I didn't add them into the clobber list. >=20 > OK >=20 > > > > > Making it as no_inline will help but not sure about the performance > impact. > > > May be you can check with compiler team. > > > > > > We burned our hands with this scheme, see > > > 5b40ec6b966260e0ff66a8a2c689664f75d6a0e6 ("mempool/octeontx2: fix > > > possible arm64 ABI break") > > > > > > Probably we can choose a scheme for rc2 and adjust as when we have > > > complete clarity. > > > > > > > > > > > +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_ARM64) > > > > > > There is nothing specific to x86 and arm64 here, Can we remove this > #ifdef ? > > > > Without this constraint, it will break 32-bit x86 builds. > > http://mails.dpdk.org/archives/test-report/2019-June/086586.html >=20 > OK . #ifdef RTE_ARCH_64 would help then. OK. >=20 > > > > > > > > > +/** > > > > + * 128-bit integer structure. > > > > + */ > > > > +RTE_STD_C11 > > > > +typedef struct { > > > > + RTE_STD_C11 > > > > + union { > > > > + uint64_t val[2]; > > > > + __extension__ __int128 int128; >=20 > Instead of guarding RTE_ARCH_64 on this complete structure, > How about it only under > #ifdef RTE_ARCH_64 > __extension__ __int128 int128; > #endif > So that it rte_int128_t will be available for 32bit as well. Agree, it should be work. But I am not sure.=20 Hi Gage, How do you think about this?=20 >=20 >=20 > > > > + }; > > > > +} __rte_aligned(16) rte_int128_t; > > > > +#endif > > > > + > > > > #ifdef __DOXYGEN__