From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
"dev@dpdk.org" <dev@dpdk.org>
Cc: "thomas@monjalon.net" <thomas@monjalon.net>,
"Gavin Hu (Arm Technology China)" <Gavin.Hu@arm.com>,
"msantana@redhat.com" <msantana@redhat.com>,
"aconole@redhat.com" <aconole@redhat.com>,
"stable@dpdk.org" <stable@dpdk.org>, nd <nd@arm.com>,
nd <nd@arm.com>
Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH] acl: fix build issue with some arm64 compiler
Date: Mon, 10 Jun 2019 09:39:30 +0000
Message-ID: <BYAPR18MB2424BC6B08BCDFCC0A36B1B5C8130@BYAPR18MB2424.namprd18.prod.outlook.com>
In-Reply-To: <VE1PR08MB5149E78DC15C57B9D897FA1098130@VE1PR08MB5149.eurprd08.prod.outlook.com>
> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Monday, June 10, 2019 11:00 AM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; dev@dpdk.org
> Cc: thomas@monjalon.net; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>; msantana@redhat.com; aconole@redhat.com;
> stable@dpdk.org; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> nd <nd@arm.com>; nd <nd@arm.com>
> Subject: [EXT] RE: [dpdk-dev] [PATCH] acl: fix build issue with some arm64
> compiler
>
> > > --
> > > > Subject: [dpdk-dev] [PATCH] acl: fix build issue with some arm64
> > > > compiler
> > > >
> > > > From: Jerin Jacob <jerinj@marvell.com>
> > > >
> > > > Some compilers report the following error, though the existing
> > > > code doesn't have any uninitialized variable case.
> > > > Just to make the compiler happy, initialize the int32x4_t variable
> > > > in one shot in C language.
> > > >
> > > > ../lib/librte_acl/acl_run_neon.h: In function 'search_neon_4'
> > > > ../lib/librte_acl/acl_run_neon.h:230:12: error: 'input' may be
> > > > used uninitialized in this function [-Werror=maybe-uninitialized]
> > > > int32x4_t input;
> > > >
> > > > Fixes: 34fa6c27c156 ("acl: add NEON optimization for ARMv8")
> > > > Cc: stable@dpdk.org
> > > >
> > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > > ---
> > > > lib/librte_acl/acl_run_neon.h | 29 ++++++++++++-----------------
> > > > 1 file changed, 12 insertions(+), 17 deletions(-)
> > > >
> > > > diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
> > > > index 01b9766d8..dc9e9efe9 100644
> > > > --- a/lib/librte_acl/acl_run_neon.h
> > > > +++ b/lib/librte_acl/acl_run_neon.h
> > > > @@ -165,7 +165,6 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > > > uint64_t index_array[8];
> > > > struct completion cmplt[8];
> > > > struct parms parms[8];
> > > > - int32x4_t input0, input1;
> > > >
> > > > acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
> > > > total_packets, categories, ctx->trans_table);
> > > > @@ -181,17 +180,14 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > > >
> > > > while (flows.started > 0) {
> > > > /* Gather 4 bytes of input data for each stream. */
> > > > - input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
> > > > - input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
> > > > -
> > > > - input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
> > > > - input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
> > > > -
> > > > - input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input0, 2);
> > > > - input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 6), input1, 2);
> > > > -
> > > > - input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input0, 3);
> > > > - input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 7), input1, 3);
> > > > + int32x4_t input0 = {GET_NEXT_4BYTES(parms, 0),
> > > > + GET_NEXT_4BYTES(parms, 1),
> > > > + GET_NEXT_4BYTES(parms, 2),
> > > > + GET_NEXT_4BYTES(parms, 3)};
> > > > + int32x4_t input1 = {GET_NEXT_4BYTES(parms, 4),
> > > > + GET_NEXT_4BYTES(parms, 5),
> > > > + GET_NEXT_4BYTES(parms, 6),
> > > > + GET_NEXT_4BYTES(parms, 7)};
> > > >
> > > This mixes the use of NEON intrinsics with GCC vector extensions.
> > > ACLE (Arm C Language Extensions) specifically recommends not to mix
> > > the two methods in section 12.2.6. IMO, Aaron's suggestion of using
> > > a temp vector should be good.
> >
> > We are using this pattern across DPDK and SSE for x86 as well.
> > https://git.dpdk.org/dpdk/tree/drivers/net/i40e/i40e_rxtx_vec_neon.c#n91
> I am not sure about x86, I have not looked at a document similar to ACLE for
> x86. IMO, it is not relevant here as this is Arm specific code.
What I meant was that it has already been used in DPDK for arm64:
https://git.dpdk.org/dpdk/tree/drivers/net/i40e/i40e_rxtx_vec_neon.c#n91
Please see the official GCC documentation on vector extensions. The examples there use this scheme:
https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html
This is just to create the 'input' variable. I am fine with using any other scheme that does not add
instruction cost.
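
To illustrate the scheme from the GCC documentation, here is a minimal sketch that uses
only the generic vector extension, so it also builds on non-Arm targets. The names
v4si, load4 and gather4 are hypothetical: v4si stands in for int32x4_t and load4 for
GET_NEXT_4BYTES; neither is taken from acl_run_neon.h.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: generic 4 x int32 vector, standing in for NEON's int32x4_t. */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Hypothetical stand-in for GET_NEXT_4BYTES: read 4 bytes from a stream. */
static inline int32_t load4(const uint8_t *p)
{
	int32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/*
 * Build the whole vector with one brace initializer. Every lane is
 * written at the point of definition, so no lane of 'input' is read
 * before it has a value.
 */
static inline v4si gather4(const uint8_t *streams[4])
{
	v4si input = { load4(streams[0]), load4(streams[1]),
		       load4(streams[2]), load4(streams[3]) };

	return input;
}
```

Because the variable is defined and fully initialized in one statement, there is no
earlier value of 'input' for the compiler to reason about, which is the property the
patch relies on.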
>
> >
> > Since it is used in the fast path, a temp variable would be additional
> > cost for no reason.
> Then, I would suggest we can go with using 'vdupq_n_s32'.
We have to form an int32x4_t from four 32-bit values. How does 'vdupq_n_s32' help here?
Can you share a code snippet without any temp variable?
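
For reference, one possible reading of the 'vdupq_n_s32' suggestion is sketched below.
This is an assumption about what was meant, not something proposed in the thread: the
first value is broadcast to all lanes so the vector is fully defined, and the remaining
three lanes are then overwritten. gather4_dup is a hypothetical name, and the #else
branch holds portable stand-ins so the sketch also builds where arm_neon.h is not
available.

```c
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#else
/* Assumption: portable stand-ins for the NEON type and intrinsics. */
typedef int32_t int32x4_t __attribute__((vector_size(16)));

static inline int32x4_t vdupq_n_s32(int32_t v)
{
	return (int32x4_t){ v, v, v, v };
}

static inline int32x4_t vsetq_lane_s32(int32_t v, int32x4_t vec, int lane)
{
	vec[lane] = v;
	return vec;
}

static inline int32_t vgetq_lane_s32(int32x4_t vec, int lane)
{
	return vec[lane];
}
#endif

/* Broadcast lane 0 first, then overwrite lanes 1..3. */
static inline int32x4_t gather4_dup(int32_t a, int32_t b, int32_t c, int32_t d)
{
	int32x4_t input = vdupq_n_s32(a); /* all four lanes now defined */

	input = vsetq_lane_s32(b, input, 1);
	input = vsetq_lane_s32(c, input, 2);
	input = vsetq_lane_s32(d, input, 3);
	return input;
}
```

Whether this costs more instructions than the brace initializer depends on the compiler
and would need to be checked in the generated code.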
>
> > If GCC supports it, then I think it is fine. I think the above usage
> > matters for C++ portability.
> I did not understand the C++ portability part. Can you elaborate more?
>
> >
> >
> > >
> > > > /* Process the 4 bytes of input on each stream. */
> > > >
> > > > @@ -227,7 +223,6 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > > > uint64_t index_array[4];
> > > > struct completion cmplt[4];
> > > > struct parms parms[4];
> > > > - int32x4_t input;
> > > >
> > > > acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
> > > > total_packets, categories, ctx->trans_table);
> > > > @@ -242,10 +237,10 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > > >
> > > > while (flows.started > 0) {
> > > > /* Gather 4 bytes of input data for each stream. */
> > > > - input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
> > > > - input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
> > > > - input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
> > > > - input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
> > > > + int32x4_t input = {GET_NEXT_4BYTES(parms, 0),
> > > > + GET_NEXT_4BYTES(parms, 1),
> > > > + GET_NEXT_4BYTES(parms, 2),
> > > > + GET_NEXT_4BYTES(parms, 3)};
> > > >
> > > > /* Process the 4 bytes of input on each stream. */
> > > > input = transition4(input, flows.trans, index_array);
> > > > --
> > > > 2.21.0