From: Jerin Jacob Kollanukkaran
To: Honnappa Nagarahalli, "dev@dpdk.org"
CC: "thomas@monjalon.net", "Gavin Hu (Arm Technology China)", "msantana@redhat.com", "aconole@redhat.com", "stable@dpdk.org", nd, nd
Date: Mon, 10 Jun 2019 09:39:30 +0000
References: <20190606145054.39995-1-jerinj@marvell.com>
Subject: Re: [dpdk-dev] [PATCH] acl: fix build issue with some arm64 compiler

> -----Original Message-----
> From: Honnappa Nagarahalli
> Sent: Monday, June 10, 2019 11:00 AM
> To: Jerin Jacob Kollanukkaran; dev@dpdk.org
> Cc: thomas@monjalon.net; Gavin Hu (Arm Technology China);
> msantana@redhat.com; aconole@redhat.com; stable@dpdk.org;
> Honnappa Nagarahalli; nd; nd
> Subject: [EXT] RE: [dpdk-dev] [PATCH] acl: fix build issue with some arm64
> compiler
>
> > > --
> > > > Subject: [dpdk-dev] [PATCH] acl: fix build issue with some arm64
> > > > compiler
> > > >
> > > > From: Jerin Jacob
> > > >
> > > > Some compilers reporting the following error, though the existing
> > > > code doesn't have any uninitialized variable case.
> > > > Just to make compiler happy, initialize the int32x4_t variable one
> > > > shot in C language.
> > > >
> > > > ../lib/librte_acl/acl_run_neon.h: In function 'search_neon_4'
> > > > ../lib/librte_acl/acl_run_neon.h:230:12: error: 'input' may be
> > > > used uninitialized in this function [-Werror=maybe-uninitialized]
> > > >   int32x4_t input;
> > > >
> > > > Fixes: 34fa6c27c156 ("acl: add NEON optimization for ARMv8")
> > > > Cc: stable@dpdk.org
> > > >
> > > > Signed-off-by: Jerin Jacob
> > > > ---
> > > >  lib/librte_acl/acl_run_neon.h | 29 ++++++++++++-----------------
> > > >  1 file changed, 12 insertions(+), 17 deletions(-)
> > > >
> > > > diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
> > > > index 01b9766d8..dc9e9efe9 100644
> > > > --- a/lib/librte_acl/acl_run_neon.h
> > > > +++ b/lib/librte_acl/acl_run_neon.h
> > > > @@ -165,7 +165,6 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > > >  	uint64_t index_array[8];
> > > >  	struct completion cmplt[8];
> > > >  	struct parms parms[8];
> > > > -	int32x4_t input0, input1;
> > > >
> > > >  	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
> > > >  		total_packets, categories, ctx->trans_table);
> > > > @@ -181,17 +180,14 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > > >
> > > >  	while (flows.started > 0) {
> > > >  		/* Gather 4 bytes of input data for each stream. */
> > > > -		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
> > > > -		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
> > > > -
> > > > -		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
> > > > -		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
> > > > -
> > > > -		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input0, 2);
> > > > -		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 6), input1, 2);
> > > > -
> > > > -		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input0, 3);
> > > > -		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 7), input1, 3);
> > > > +		int32x4_t input0 = {GET_NEXT_4BYTES(parms, 0),
> > > > +				GET_NEXT_4BYTES(parms, 1),
> > > > +				GET_NEXT_4BYTES(parms, 2),
> > > > +				GET_NEXT_4BYTES(parms, 3)};
> > > > +		int32x4_t input1 = {GET_NEXT_4BYTES(parms, 4),
> > > > +				GET_NEXT_4BYTES(parms, 5),
> > > > +				GET_NEXT_4BYTES(parms, 6),
> > > > +				GET_NEXT_4BYTES(parms, 7)};
> > > >
> > > This mixes the use of NEON intrinsics with GCC vector extensions.
> > > ACLE (Arm C Language Extensions) specifically recommends not to mix
> > > the two methods in section 12.2.6. IMO, Aaron's suggestion of using
> > > a temp vector should be good.
> >
> > We are using this pattern across DPDK and SSE for x86 as well.
> > https://git.dpdk.org/dpdk/tree/drivers/net/i40e/i40e_rxtx_vec_neon.c#n91
> I am not sure about x86, I have not looked at a document similar to ACLE for
> x86. IMO, it is not relevant here as this is Arm specific code.

What I meant was that this pattern is already used in DPDK for arm64:
https://git.dpdk.org/dpdk/tree/drivers/net/i40e/i40e_rxtx_vec_neon.c#n91

Please see the official GCC documentation on vector extensions. The examples
there use this scheme:
https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html

This is just to create the 'input' variable. I am fine with any other scheme
that does not cost additional instructions.

>
> >
> > Since it used in fastpath, a temp variable would be additional cost
> > for no reason.
> Then, I would suggest we can go with using 'vdupq_n_s32'.

We have to form an int32x4_t from four 32-bit values. How does 'vdupq_n_s32'
help here? Can you share a code snippet that does not need a temp variable?

>
> > If GCC supports it then I think it is fine, I think, above usage
> > matters with C++ portability.
> I did not understand the C++ portability part. Can you elaborate more?
>
> > >
> > > >
> > > >  		/* Process the 4 bytes of input on each stream. */
> > > >
> > > > @@ -227,7 +223,6 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > > >  	uint64_t index_array[4];
> > > >  	struct completion cmplt[4];
> > > >  	struct parms parms[4];
> > > > -	int32x4_t input;
> > > >
> > > >  	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
> > > >  		total_packets, categories, ctx->trans_table);
> > > > @@ -242,10 +237,10 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > > >
> > > >  	while (flows.started > 0) {
> > > >  		/* Gather 4 bytes of input data for each stream. */
> > > > -		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
> > > > -		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
> > > > -		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
> > > > -		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
> > > > +		int32x4_t input = {GET_NEXT_4BYTES(parms, 0),
> > > > +				GET_NEXT_4BYTES(parms, 1),
> > > > +				GET_NEXT_4BYTES(parms, 2),
> > > > +				GET_NEXT_4BYTES(parms, 3)};
> > > >
> > > >  		/* Process the 4 bytes of input on each stream. */
> > > >  		input = transition4(input, flows.trans, index_array);
> > > > --
> > > > 2.21.0
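
For reference, here is a rough sketch of the two initialization styles discussed in this thread, written so that it builds with GCC or Clang on any target and not only arm64. It is not DPDK code. The `v4si` typedef is a hypothetical stand-in for `int32x4_t`, `next_4bytes()` is a hypothetical stand-in for `GET_NEXT_4BYTES(parms, idx)`, and the other function names are made up for illustration. Whether either variant costs extra instructions depends on the compiler and would need to be checked in the generated code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for int32x4_t (GCC/Clang vector extension), so the
 * sketch builds without <arm_neon.h>. */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Hypothetical stand-in for GET_NEXT_4BYTES(parms, idx). */
static int32_t next_4bytes(const int32_t *streams, int idx)
{
	return streams[idx];
}

/* One-shot brace initialization, in the style the patch uses. Every lane
 * gets a value at the point of declaration. */
static v4si gather_init(const int32_t *streams)
{
	v4si input = {next_4bytes(streams, 0), next_4bytes(streams, 1),
		      next_4bytes(streams, 2), next_4bytes(streams, 3)};
	return input;
}

/* Temporary-array alternative: fill a scalar array, then copy it into the
 * vector (with NEON this copy would typically be a vld1q_s32 load). */
static v4si gather_temp(const int32_t *streams)
{
	int32_t tmp[4];
	v4si input;
	int i;

	for (i = 0; i < 4; i++)
		tmp[i] = next_4bytes(streams, i);
	memcpy(&input, tmp, sizeof(input));
	return input;
}

/* Returns 1 when both variants produce identical lanes. */
static int gather_variants_agree(const int32_t *streams)
{
	v4si a = gather_init(streams);
	v4si b = gather_temp(streams);

	return memcmp(&a, &b, sizeof(a)) == 0;
}
```

Calling `gather_variants_agree()` on an array of four `int32_t` values should return 1, since both functions fill the same four lanes in the same order.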