From: "Phil Yang (Arm Technology China)"
To: "jerinj@marvell.com", "thomas@monjalon.net", "gage.eads@intel.com", "dev@dpdk.org"
CC: "hemant.agrawal@nxp.com", Honnappa Nagarahalli, "Gavin Hu (Arm Technology China)", nd
Date: Wed, 14 Aug 2019 10:24:50 +0000
Subject: Re: [dpdk-dev] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare exchange
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran
> Sent: Wednesday, August 14, 2019 4:46 PM
> To: Phil Yang (Arm Technology China); thomas@monjalon.net;
> gage.eads@intel.com; dev@dpdk.org
> Cc: hemant.agrawal@nxp.com; Honnappa Nagarahalli;
> Gavin Hu (Arm Technology China); nd
> Subject: RE: [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare exchange
>
> > -----Original Message-----
> > From: Phil Yang
> > Sent: Wednesday, August 14, 2019 1:58 PM
> > To: thomas@monjalon.net; Jerin Jacob Kollanukkaran;
> > gage.eads@intel.com; dev@dpdk.org
> > Cc: hemant.agrawal@nxp.com; Honnappa.Nagarahalli@arm.com;
> > gavin.hu@arm.com; nd@arm.com
> > Subject: [EXT] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare
> > exchange

> > +#define __HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE)
> > +#define __HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \
> > +		(mo) == __ATOMIC_SEQ_CST)
> > +
> > +#define __MO_LOAD(mo)  (__HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED)
> > +#define __MO_STORE(mo) (__HAS_RLS((mo)) ? __ATOMIC_RELEASE : __ATOMIC_RELAXED)
> > +
> > +#if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
> > +#define __ATOMIC128_CAS_OP(cas_op_name, op_string) \
> > +static __rte_noinline rte_int128_t \
>
>
> Could you check the cost of making it as __rte_noinline?
> If it is costly, How about having two versions, one with __rte_noinline
> to make compliance with arm64 procedure call standard for
> old gcc and clang.
> Other one without explicit register hardcoding + inline for latest
> gcc

Hi Jerin,

According to stack_lf_perf_autotest, marking the function __rte_noinline adds
no measurable overhead on ThunderX2 with GCC 8.3.
The 'Average cycles per object push/pop' numbers for the __rte_noinline and
__rte_always_inline versions are nearly the same.

Test results:

###### Two NUMA Node ######

#### __rte_noinline ####
RTE>>stack_lf_perf_autotest
### Testing using two NUMA nodes ###
Average cycles per object push/pop (bulk size: 8): 24.10
Average cycles per object push/pop (bulk size: 32): 6.85

### Testing on all 18 lcores ###
Average cycles per object push/pop (bulk size: 8): 680.39
Average cycles per object push/pop (bulk size: 32): 146.38
Test OK

#### __rte_always_inline ####
RTE>>stack_lf_perf_autotest
### Testing using two NUMA nodes ###
Average cycles per object push/pop (bulk size: 8): 24.29
Average cycles per object push/pop (bulk size: 32): 6.92

### Testing on all 18 lcores ###
Average cycles per object push/pop (bulk size: 8): 683.92
Average cycles per object push/pop (bulk size: 32): 145.11
Test OK

###### Single NUMA ######

#### __rte_always_inline ####
RTE>>stack_lf_perf_autotest
### Testing on all 18 lcores ###
Average cycles per object push/pop (bulk size: 8): 582.92
Average cycles per object push/pop (bulk size: 32): 125.57
Test OK

#### __rte_noinline ####
RTE>>stack_lf_perf_autotest
### Testing on all 18 lcores ###
Average cycles per object push/pop (bulk size: 8): 537.56
Average cycles per object push/pop (bulk size: 32): 122.98
Test OK

Thanks,
Phil Yang

>
>
> > +cas_op_name(rte_int128_t *dst, rte_int128_t old, \
> > +		rte_int128_t updated) \
> > +{ \
> > +	/* caspX instructions register pair must start from even-numbered
> > +	 * register at operand 1.
> > +	 * So, specify registers for local variables here.
> > +	 */ \
> > +	register uint64_t x0 __asm("x0") = (uint64_t)old.val[0]; \
> > +	register uint64_t x1 __asm("x1") = (uint64_t)old.val[1]; \
> > +	register uint64_t x2 __asm("x2") = (uint64_t)updated.val[0]; \
> > +	register uint64_t x3 __asm("x3") = (uint64_t)updated.val[1]; \
> > +	asm volatile( \
> > +		op_string " %[old0], %[old1], %[upd0], %[upd1], [%[dst]]" \
> > +		: [old0] "+r" (x0), \
> > +		  [old1] "+r" (x1) \
> > +		: [upd0] "r" (x2), \
> > +		  [upd1] "r" (x3), \
> > +		  [dst] "r" (dst) \
> > +		: "memory"); \
> > +	old.val[0] = x0; \
> > +	old.val[1] = x1; \
> > +	return old; \
> > +}
> > +
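
For readers following the quoted __HAS_ACQ/__HAS_RLS/__MO_LOAD/__MO_STORE helpers, the sketch below shows how they split one memory order into a load-side and a store-side order. It is only an illustration, not code from the patch: the unprefixed macro names and the functions load_order(), store_order() and check_split() are assumptions made here to avoid reserved identifiers, and it assumes a GCC or Clang compiler that predefines the __ATOMIC_* constants. Where the patch uses these helpers is not visible in the quoted excerpt; presumably it is a path that issues the load and the store separately.

```c
#include <assert.h>

/*
 * Illustrative copies of the quoted helpers under non-reserved names.
 * HAS_ACQ, HAS_RLS, MO_LOAD and MO_STORE are hypothetical names for this
 * sketch; the logic is the same as in the quoted macros.
 */
#define HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE)
#define HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \
		     (mo) == __ATOMIC_SEQ_CST)

#define MO_LOAD(mo)  (HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED)
#define MO_STORE(mo) (HAS_RLS((mo)) ? __ATOMIC_RELEASE : __ATOMIC_RELAXED)

/* Ordering applied to the load half (hypothetical helper). */
static inline int load_order(int mo)
{
	return MO_LOAD(mo);
}

/* Ordering applied to the store half (hypothetical helper). */
static inline int store_order(int mo)
{
	return MO_STORE(mo);
}

/* Walk through the split for the common memory orders. */
static inline void check_split(void)
{
	/* Relaxed stays relaxed on both halves. */
	assert(load_order(__ATOMIC_RELAXED) == __ATOMIC_RELAXED);
	assert(store_order(__ATOMIC_RELAXED) == __ATOMIC_RELAXED);

	/* Acquire only affects the load half. */
	assert(load_order(__ATOMIC_ACQUIRE) == __ATOMIC_ACQUIRE);
	assert(store_order(__ATOMIC_ACQUIRE) == __ATOMIC_RELAXED);

	/* Release only affects the store half. */
	assert(load_order(__ATOMIC_RELEASE) == __ATOMIC_RELAXED);
	assert(store_order(__ATOMIC_RELEASE) == __ATOMIC_RELEASE);

	/* Acq_rel and seq_cst both become acquire load + release store. */
	assert(load_order(__ATOMIC_ACQ_REL) == __ATOMIC_ACQUIRE);
	assert(store_order(__ATOMIC_ACQ_REL) == __ATOMIC_RELEASE);
	assert(load_order(__ATOMIC_SEQ_CST) == __ATOMIC_ACQUIRE);
	assert(store_order(__ATOMIC_SEQ_CST) == __ATOMIC_RELEASE);
}
```

Calling check_split() from a test program exercises every case. Note that, as written, the macros map __ATOMIC_SEQ_CST to the same acquire/release pair as __ATOMIC_ACQ_REL.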