From: "Phil Yang (Arm Technology China)"
To: "jerinj@marvell.com", "dev@dpdk.org"
CC: "thomas@monjalon.net", "hemant.agrawal@nxp.com", Honnappa Nagarahalli, "Gavin Hu (Arm Technology China)", nd, "gage.eads@intel.com"
Date: Fri, 19 Jul 2019 11:01:22 +0000
References: <1561257671-10316-1-git-send-email-phil.yang@arm.com> <1561709503-11665-1-git-send-email-phil.yang@arm.com>
Subject: Re: [dpdk-dev] [EXT] [PATCH v3 1/3] eal/arm64: add 128-bit atomic compare exchange
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran
> Sent: Friday, July 19, 2019 2:25 PM
> To: Phil Yang (Arm Technology China); dev@dpdk.org
> Cc: thomas@monjalon.net; hemant.agrawal@nxp.com; Honnappa Nagarahalli;
> Gavin Hu (Arm Technology China); nd; gage.eads@intel.com
> Subject: RE: [EXT] [PATCH v3 1/3] eal/arm64: add 128-bit atomic compare
> exchange
>
> > -----Original Message-----
> > From: Phil Yang
> > Sent: Friday, June 28, 2019 1:42 PM
> > To: dev@dpdk.org
> > Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran;
> > hemant.agrawal@nxp.com; Honnappa.Nagarahalli@arm.com;
> > gavin.hu@arm.com; nd@arm.com; gage.eads@intel.com
> > Subject: [EXT] [PATCH v3 1/3] eal/arm64: add 128-bit atomic compare
> > exchange
> >
> > External Email
> >
> > ----------------------------------------------------------------------
> > Add 128-bit atomic compare exchange on aarch64.
> >
> > Signed-off-by: Phil Yang
> > Tested-by: Honnappa Nagarahalli
> > Reviewed-by: Honnappa Nagarahalli
> > ---
> > +#define RTE_HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE)
> > +#define RTE_HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || \
> > +			 (mo) == __ATOMIC_ACQ_REL || \
> > +			 (mo) == __ATOMIC_SEQ_CST)
> > +
> > +#define RTE_MO_LOAD(mo)  (RTE_HAS_ACQ((mo)) \
> > +		? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED)
> > +#define RTE_MO_STORE(mo) (RTE_HAS_RLS((mo)) \
> > +		? __ATOMIC_RELEASE : __ATOMIC_RELAXED)
> > +
>
> The one starts with RTE_ are public symbols, If it is generic enough,
> Move to common layer so that every architectures can use.
> If you think, otherwise make it internal

Let's keep it internal. I will remove the 'RTE_' prefix.

>
> > +#ifdef __ARM_FEATURE_ATOMICS
>
> This define is added in gcc 9.1 and I believe for clang it is not supported yet.
> So old gcc and clang this will be undefined.
> I think, With meson + native build, we can find the presence of
> ATOMIC support by running a.out. Not sure about make and cross build case.
> I don't want block this feature because of this, IMO, We can add this code
> with existing __ARM_FEATURE_ATOMICS scheme and later find a method
> to enhance it. But please check how to fix it.

OK.

>
> > +#define __ATOMIC128_CAS_OP(cas_op_name, op_string)                  \
> > +static inline rte_int128_t                                          \
> > +cas_op_name(rte_int128_t *dst, rte_int128_t old,                    \
> > +		rte_int128_t updated)                               \
> > +{                                                                   \
> > +	/* caspX instructions register pair must start from even-numbered
> > +	 * register at operand 1.
> > +	 * So, specify registers for local variables here.
> > +	 */                                                         \
> > +	register uint64_t x0 __asm("x0") = (uint64_t)old.val[0];    \
>
> Since direct x0 register used in the code and
> cas_op_name() and rte_atomic128_cmp_exchange() is inline function,
> Based on parent function load, we may corrupt x0 register aka

Agreed that x0/x1 and x2/x3 are used a lot and often contain live values.
Would changing them to less frequently used registers, such as x14/x15
and x16/x17, help in this case?
According to the PCS (Procedure Call Standard), x14-x17 are also temporary
registers.

> Break arm64 ABI. Not sure clobber list will help here or not?

In my understanding, if the register assigned to a register variable already
holds a live value, the compiler will move that live value into a free
register.
x0~x3 already appear in the input/output operands, and the values in x0/x1
have to be written back to the variable 'old' as the return value, so I
didn't add them to the clobber list.

> Making it as no_inline will help but not sure about the performance impact.
> May be you can check with compiler team.
>
> We burned our hands with this scheme, see
> 5b40ec6b966260e0ff66a8a2c689664f75d6a0e6 ("mempool/octeontx2: fix
> possible arm64 ABI break")
>
> Probably we can choose a scheme for rc2 and adjust as when we have
> complete clarity.
>
> > +	register uint64_t x1 __asm("x1") = (uint64_t)old.val[1];     \
> > +	register uint64_t x2 __asm("x2") = (uint64_t)updated.val[0]; \
> > +	register uint64_t x3 __asm("x3") = (uint64_t)updated.val[1]; \
> > +	asm volatile(                                                \
> > +		op_string " %[old0], %[old1], %[upd0], %[upd1], [%[dst]]" \
> > +		: [old0] "+r" (x0),                                  \
> > +		  [old1] "+r" (x1)                                   \
> > +		: [upd0] "r" (x2),                                   \
> > +		  [upd1] "r" (x3),                                   \
> > +		  [dst] "r" (dst)                                    \
> > +		: "memory");                                         \
>
> Shouldn't we add x0,x1, x2, x3 in clobber list?

Same as above.

>
> > static inline int __rte_experimental
> > rte_atomic128_cmp_exchange(rte_int128_t *dst,
> >				rte_int128_t *exp,
> > diff --git a/lib/librte_eal/common/include/generic/rte_atomic.h
> > b/lib/librte_eal/common/include/generic/rte_atomic.h
> > index 9958543..2355e50 100644
> > --- a/lib/librte_eal/common/include/generic/rte_atomic.h
> > +++ b/lib/librte_eal/common/include/generic/rte_atomic.h
> > @@ -1081,6 +1081,20 @@ static inline void
> > rte_atomic64_clear(rte_atomic64_t *v)
> >
> > /*------------------------ 128 bit atomic operations -------------------------*/
> >
> > +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_ARM64)
>
> There is nothing specific to x86 and arm64 here, Can we remove this #ifdef ?

Without this constraint, it will break 32-bit x86 builds.
http://mails.dpdk.org/archives/test-report/2019-June/086586.html

>
> > +/**
> > + * 128-bit integer structure.
> > + */
> > +RTE_STD_C11
> > +typedef struct {
> > +	RTE_STD_C11
> > +	union {
> > +		uint64_t val[2];
> > +		__extension__ __int128 int128;
> > +	};
> > +} __rte_aligned(16) rte_int128_t;
> > +#endif
> > +
> > #ifdef __DOXYGEN__
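
For illustration only, here is a standalone sketch of why a guard around the
128-bit type is needed. GCC and Clang provide __int128 only on targets with a
wide enough integer mode, so an unguarded definition would not compile on
32-bit x86; that is a plausible reading of the build failure linked above, not
something the report is quoted as saying. The sketch is an approximation, not
the patch itself: it assumes the compiler-predefined __SIZEOF_INT128__ as the
feature test (the patch keys on RTE_ARCH_X86_64 / RTE_ARCH_ARM64 instead), and
demo_int128_t, DEMO_HAS_INT128 and demo_check_int128() are hypothetical names.

```c
#include <stdint.h>

/*
 * Assumption: __SIZEOF_INT128__ serves as the feature test here. GCC and
 * Clang predefine it only on targets that provide __int128. The patch under
 * review uses RTE_ARCH_X86_64 / RTE_ARCH_ARM64 for the same purpose.
 */
#if defined(__SIZEOF_INT128__)
#define DEMO_HAS_INT128 1

/* Hypothetical stand-in for rte_int128_t, written without DPDK macros. */
typedef struct {
	__extension__ union {
		uint64_t val[2];
		__extension__ __int128 int128;
	};
} __attribute__((aligned(16))) demo_int128_t;

#else
#define DEMO_HAS_INT128 0
#endif

/*
 * Returns 1 if the layout checks pass, 0 if one of them fails,
 * and -1 if the target has no 128-bit integer type at all.
 */
static int demo_check_int128(void)
{
#if DEMO_HAS_INT128
	const uint64_t pattern = 0x1122334455667788ULL;
	demo_int128_t v;

	/* Same value in both halves keeps the check endian-neutral. */
	v.val[0] = pattern;
	v.val[1] = pattern;

	if (sizeof(demo_int128_t) != 16)
		return 0;
	if (__alignof__(demo_int128_t) != 16)
		return 0;
	/* Both 64-bit halves must be visible through the 128-bit member. */
	if ((uint64_t)v.int128 != pattern)
		return 0;
	if ((uint64_t)(v.int128 >> 64) != pattern)
		return 0;
	return 1;
#else
	return -1;
#endif
}
```

On a 64-bit GCC or Clang target demo_check_int128() is expected to take the
first branch, while a 32-bit x86 build compiles only the fallback and never
sees the __int128 member.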