From: Jerin Jacob Kollanukkaran
To: Aaron Conole, Pavan Nikhilesh Bhagavatula
CC: "dev@dpdk.org", Nithin Kumar Dabilpuram, Vamsi Krishna Attunuru, Olivier Matz
Date: Sat, 22 Jun 2019 13:21:14 +0000
Subject: Re: [dpdk-dev] [PATCH v3 25/27] mempool/octeontx2: add optimized dequeue operation for arm64

> -----Original Message-----
> From: Aaron Conole
> Sent: Saturday, June 22, 2019 12:57 AM
> To: Pavan Nikhilesh Bhagavatula
> Cc: Jerin Jacob Kollanukkaran; dev@dpdk.org; Nithin Kumar Dabilpuram;
> Vamsi Krishna Attunuru; Olivier Matz
> Subject: Re: [EXT] Re: [dpdk-dev] [PATCH v3 25/27] mempool/octeontx2:
> add optimized dequeue operation for arm64
>
> Pavan Nikhilesh Bhagavatula writes:
>
> > Hi Aaron,
> >
> >>-----Original Message-----
> >>From: Aaron Conole
> >>Sent: Tuesday, June 18, 2019 2:55 AM
> >>To: Jerin Jacob Kollanukkaran
> >>Cc: dev@dpdk.org; Nithin Kumar Dabilpuram; Vamsi Krishna Attunuru;
> >>Pavan Nikhilesh Bhagavatula; Olivier Matz
> >>Subject: [EXT] Re: [dpdk-dev] [PATCH v3 25/27] mempool/octeontx2:
> >>add optimized dequeue operation for arm64
> >>
> >>> From: Pavan Nikhilesh
> >>>
> >>> This patch adds an optimized arm64 instruction based routine to leverage
> >>> CPU pipeline characteristics of octeontx2. The theme is to fill the
> >>> pipeline with CASP operations as much as HW can do so that HW can do
> >>> alloc() HW ops in full throttle.
> >>>
> >>> Cc: Olivier Matz
> >>> Cc: Aaron Conole
> >>>
> >>> Signed-off-by: Pavan Nikhilesh
> >>> Signed-off-by: Jerin Jacob
> >>> Signed-off-by: Vamsi Attunuru
> >>> ---
> >>>  drivers/mempool/octeontx2/otx2_mempool_ops.c | 291 +++++++++++++++++++
> >>>  1 file changed, 291 insertions(+)
> >>>
> >>> diff --git a/drivers/mempool/octeontx2/otx2_mempool_ops.c b/drivers/mempool/octeontx2/otx2_mempool_ops.c
> >>> index c59bd73c0..e6737abda 100644
> >>> --- a/drivers/mempool/octeontx2/otx2_mempool_ops.c
> >>> +++ b/drivers/mempool/octeontx2/otx2_mempool_ops.c
> >>> @@ -37,6 +37,293 @@ npa_lf_aura_op_alloc_one(const int64_t wdata, int64_t * const addr,
> >>>  	return -ENOENT;
> >>>  }
> >>>
> >>> +#if defined(RTE_ARCH_ARM64)
> >>> +static __rte_noinline int
> >>> +npa_lf_aura_op_search_alloc(const int64_t wdata, int64_t * const addr,
> >>> +		void **obj_table, unsigned int n)
> >>> +{
> >>> +	uint8_t i;
> >>> +
> >>> +	for (i = 0; i < n; i++) {
> >>> +		if (obj_table[i] != NULL)
> >>> +			continue;
> >>> +		if (npa_lf_aura_op_alloc_one(wdata, addr, obj_table, i))
> >>> +			return -ENOENT;
> >>> +	}
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static __attribute__((optimize("-O3"))) __rte_noinline int __hot
> >>
> >>Sorry if I missed this before.
> >>
> >>Is there a good reason to hard-code this optimization, rather than let
> >>the build system provide it?
> >
> > Some versions of compiler don't have support for __int128_t for CASP
> > inline-asm.
> > i.e. if the optimization level is reduced to -O0 the CASP restrictions
> > aren't followed and compiler might end up violating the CASP rules,
> > for example:
> >
> > /tmp/ccSPMGzq.s:1648: Error: reg pair must start from even reg at
> > operand 1 - `casp x21,x22,x0,x1,[x19]'
> > /tmp/ccSPMGzq.s:1706: Error: reg pair must start from even reg at
> > operand 1 - `casp x13,x14,x0,x1,[x11]'
> > /tmp/ccSPMGzq.s:1745: Error: reg pair must start from even reg at
> > operand 1 - `casp x9,x10,x0,x1,[x7]'
> > /tmp/ccSPMGzq.s:1775: Error: reg pair must start from even reg at
> > operand 1 - `casp x7,x8,x0,x1,[x5]'
> >
> > Forcing to -O3 with __rte_noinline in place fixes it as the alignment
> > fits in.
>
> It makes sense to document this - it isn't apparent that it is needed.
> It would be good to put a comment just before that explains it, preferably
> with the compilers that aren't behaving. This would help in the future to
> determine when it would be safe to drop the flag.

Yes. Will add the comment.
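
For reference, a minimal sketch of how such a comment could sit next to the attribute. This is not the code from the patch: the SKETCH_* macros, the stub allocator and the static pool are hypothetical stand-ins so that the snippet builds without DPDK headers or arm64 hardware, and the list of misbehaving compilers is left as a placeholder because this thread does not name them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for DPDK's __rte_noinline. */
#define SKETCH_NOINLINE __attribute__((noinline))

/*
 * Per-function -O3 override (GCC only; compiled out elsewhere, since
 * clang does not implement the optimize attribute).
 *
 * Why it is here: the arm64 dequeue path uses CASP inline asm, and CASP
 * needs each register pair to start at an even register. With some
 * compiler versions the pair is not allocated that way at -O0 and the
 * assembler stops with
 *   "Error: reg pair must start from even reg at operand 1".
 * Forcing -O3 together with noinline avoids that.
 *
 * Affected compilers: placeholder, not named in this thread.
 * Drop the override once those compilers are no longer supported.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define SKETCH_FORCE_O3 __attribute__((optimize("-O3")))
#else
#define SKETCH_FORCE_O3
#endif

/* Hypothetical stub for the HW allocator: a four-object static pool. */
#define SKETCH_POOL_SIZE 4
static int64_t sketch_pool[SKETCH_POOL_SIZE];
static unsigned int sketch_pool_used;

/* Place one object in obj_table[i]; -ENOENT when the pool is empty. */
int
sketch_alloc_one(void **obj_table, unsigned int i)
{
	if (sketch_pool_used >= SKETCH_POOL_SIZE)
		return -ENOENT;
	obj_table[i] = &sketch_pool[sketch_pool_used++];
	return 0;
}

/*
 * Plain-C analogue of the search-and-fill fallback quoted above, with
 * the documented override applied: fill every NULL slot one by one.
 */
SKETCH_FORCE_O3 SKETCH_NOINLINE int
sketch_search_alloc(void **obj_table, unsigned int n)
{
	unsigned int i;

	assert(obj_table != NULL);
	for (i = 0; i < n; i++) {
		if (obj_table[i] != NULL)
			continue;
		if (sketch_alloc_one(obj_table, i))
			return -ENOENT;
	}

	return 0;
}
```

Calling sketch_search_alloc() on a table with some NULL slots fills only those slots and returns 0, or -ENOENT once the stub pool runs dry. Keeping the reason and the "when to drop" note in the same comment as the macro is only one possible layout; the point is that the override and its justification stay together.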