From: Honnappa Nagarahalli
To: "jerinj@marvell.com", "dev@dpdk.org"
CC: "thomas@monjalon.net", "Gavin Hu (Arm Technology China)",
 "msantana@redhat.com", "aconole@redhat.com", "stable@dpdk.org",
 Honnappa Nagarahalli, nd, nd
Date: Mon, 10 Jun 2019 05:29:57 +0000
Subject: Re: [dpdk-dev] [PATCH] acl: fix build issue with some arm64 compiler
References: <20190606145054.39995-1-jerinj@marvell.com>
List-Id: DPDK patches and discussions

> > > > ----------------------------------------------------------------------
> > > Subject: [dpdk-dev] [PATCH] acl: fix build issue with some arm64
> > > compiler
> > >
> > > From: Jerin Jacob
> > >
> > > Some compilers reporting the following error, though the existing
> > > code doesn't have any uninitialized variable case.
> > > Just to make compiler happy, initialize the int32x4_t variable one
> > > shot in C language.
> > >
> > > ../lib/librte_acl/acl_run_neon.h: In function 'search_neon_4'
> > > ../lib/librte_acl/acl_run_neon.h:230:12: error: 'input' may be used
> > > uninitialized in this function [-Werror=maybe-uninitialized]
> > >   int32x4_t input;
> > >
> > > Fixes: 34fa6c27c156 ("acl: add NEON optimization for ARMv8")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Jerin Jacob
> > > ---
> > >  lib/librte_acl/acl_run_neon.h | 29 ++++++++++++-----------------
> > >  1 file changed, 12 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
> > > index 01b9766d8..dc9e9efe9 100644
> > > --- a/lib/librte_acl/acl_run_neon.h
> > > +++ b/lib/librte_acl/acl_run_neon.h
> > > @@ -165,7 +165,6 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > >  	uint64_t index_array[8];
> > >  	struct completion cmplt[8];
> > >  	struct parms parms[8];
> > > -	int32x4_t input0, input1;
> > >
> > >  	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
> > >  		total_packets, categories, ctx->trans_table);
> > > @@ -181,17 +180,14 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > >
> > >  	while (flows.started > 0) {
> > >  		/* Gather 4 bytes of input data for each stream. */
> > > -		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
> > > -		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
> > > -
> > > -		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
> > > -		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
> > > -
> > > -		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input0, 2);
> > > -		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 6), input1, 2);
> > > -
> > > -		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input0, 3);
> > > -		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 7), input1, 3);
> > > +		int32x4_t input0 = {GET_NEXT_4BYTES(parms, 0),
> > > +			GET_NEXT_4BYTES(parms, 1),
> > > +			GET_NEXT_4BYTES(parms, 2),
> > > +			GET_NEXT_4BYTES(parms, 3)};
> > > +		int32x4_t input1 = {GET_NEXT_4BYTES(parms, 4),
> > > +			GET_NEXT_4BYTES(parms, 5),
> > > +			GET_NEXT_4BYTES(parms, 6),
> > > +			GET_NEXT_4BYTES(parms, 7)};
> > >
> > This mixes the use of NEON intrinsics with GCC vector extensions. ACLE
> > (Arm C Language Extensions) specifically recommends not to mix the two
> > methods in section 12.2.6. IMO, Aaron's suggestion of using a temp vector
> > should be good.
>
> We are using this pattern across DPDK and SSE for x86 as well.
> https://git.dpdk.org/dpdk/tree/drivers/net/i40e/i40e_rxtx_vec_neon.c#n91
I am not sure about x86; I have not looked at a document similar to ACLE
for x86. IMO, it is not relevant here as this is Arm-specific code.

>
> Since it used in fastpath, a temp variable would be additional cost for no
> reason.
Then I would suggest we go with 'vdupq_n_s32'.

> If GCC supports it then I think it is fine, I think, above usage matters with C++
> portability.
I did not understand the C++ portability part. Can you elaborate?

>
>
> >
> > >  		/* Process the 4 bytes of input on each stream. */
> > >
> > > @@ -227,7 +223,6 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > >  	uint64_t index_array[4];
> > >  	struct completion cmplt[4];
> > >  	struct parms parms[4];
> > > -	int32x4_t input;
> > >
> > >  	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
> > >  		total_packets, categories, ctx->trans_table);
> > > @@ -242,10 +237,10 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > >
> > >  	while (flows.started > 0) {
> > >  		/* Gather 4 bytes of input data for each stream. */
> > > -		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
> > > -		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
> > > -		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
> > > -		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
> > > +		int32x4_t input = {GET_NEXT_4BYTES(parms, 0),
> > > +			GET_NEXT_4BYTES(parms, 1),
> > > +			GET_NEXT_4BYTES(parms, 2),
> > > +			GET_NEXT_4BYTES(parms, 3)};
> > >
> > >  		/* Process the 4 bytes of input on each stream. */
> > >  		input = transition4(input, flows.trans, index_array);
> > > --
> > > 2.21.0
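
One possible reading of the 'vdupq_n_s32' suggestion, as a rough sketch
only: seed the whole vector from the first stream's value, then overwrite
the remaining lanes with vsetq_lane_s32. This stays within NEON intrinsics
and never reads an uninitialized vector. The helper name gather4 and its
array arguments are hypothetical and stand in for the GET_NEXT_4BYTES()
calls; the non-Arm branch is a stand-in shim so the sketch builds anywhere,
not real NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__aarch64__)
#include <arm_neon.h>
#else
/* Hypothetical stand-ins so the sketch also builds off Arm; not real NEON. */
typedef struct { int32_t lane[4]; } int32x4_t;

static inline int32x4_t vdupq_n_s32(int32_t v)
{
	int32x4_t r = { { v, v, v, v } };
	return r;
}

static inline int32x4_t set_lane_fallback(int32_t v, int32x4_t vec, int lane)
{
	vec.lane[lane] = v;
	return vec;
}

static inline void store_fallback(int32_t *p, int32x4_t vec)
{
	for (int i = 0; i < 4; i++)
		p[i] = vec.lane[i];
}

#define vsetq_lane_s32(v, vec, lane) set_lane_fallback((v), (vec), (lane))
#define vst1q_s32(ptr, vec) store_fallback((ptr), (vec))
#endif

/* gather4 is a hypothetical helper, not part of acl_run_neon.h. */
static inline void gather4(const int32_t in[4], int32_t out[4])
{
	assert(in != NULL && out != NULL);

	/* vdupq_n_s32 writes every lane, so 'input' is fully defined here. */
	int32x4_t input = vdupq_n_s32(in[0]);

	/* Overwrite lanes 1..3; lane 0 already holds in[0]. */
	input = vsetq_lane_s32(in[1], input, 1);
	input = vsetq_lane_s32(in[2], input, 2);
	input = vsetq_lane_s32(in[3], input, 3);

	/* Store the lanes back so the result can be inspected. */
	vst1q_s32(out, input);
}
```

In search_neon_4 the same shape would presumably replace the first
vsetq_lane_s32 call only; whether the extra dup costs anything on the
fast path would need to be measured.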