From: Aaron Conole <aconole@redhat.com>
To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
Cc: "dev\@dpdk.org" <dev@dpdk.org>,
"gavin.hu\@arm.com" <gavin.hu@arm.com>,
"konstantin.ananyev\@intel.com" <konstantin.ananyev@intel.com>
Subject: Re: [dpdk-dev] [EXT] [PATCH 1/3] acl: fix arm argument types
Date: Wed, 05 Jun 2019 13:09:34 -0400 [thread overview]
Message-ID: <f7tblzcnelt.fsf@dhcp-25.97.bos.redhat.com> (raw)
In-Reply-To: <BYAPR18MB2424F700ACC1114208BC2619C8160@BYAPR18MB2424.namprd18.prod.outlook.com> (Jerin Jacob Kollanukkaran's message of "Wed, 5 Jun 2019 15:16:11 +0000")
Jerin Jacob Kollanukkaran <jerinj@marvell.com> writes:
>> -----Original Message-----
>> From: Jerin Jacob Kollanukkaran
>> Sent: Wednesday, April 10, 2019 8:10 PM
>> To: dev@dpdk.org; aconole@redhat.com
>> Cc: gavin.hu@arm.com; konstantin.ananyev@intel.com
>> Subject: Re: [EXT] [PATCH 1/3] acl: fix arm argument types
>>
>> On Mon, 2019-04-08 at 14:24 -0400, Aaron Conole wrote:
>> > -------------------------------------------------------------------
>> > ---
>> > Compiler complains of argument type mismatch, like:
>>
>> Can you share more details on how to reproduce this issue?
>>
>> We already have
>> CFLAGS_acl_run_neon.o += -flax-vector-conversions in the Makefile.
>>
>> If you are taking out -flax-vector-conversions the correct way to fix will be
>> use vreinterpret*.
>>
>> For me the code looks clean, If unnecessary casting is avoided.
>
>
> Considering the following patch is part of dpdk.org now. I think, We may not need this
> patch in benefit to avoid a lot of typecasting.
>
> https://git.dpdk.org/dpdk/commit/?id=e53ce4e4137974f46743e74bd9ab912e0166c8b1
Correct, the lax conversions aren't needed.
>
>
>
>>
>>
>> >
>> > ../lib/librte_acl/acl_run_neon.h: In function ‘transition4’:
>> > ../lib/librte_acl/acl_run_neon.h:115:2: note: use -flax-vector-
>> > conversions
>> > to permit conversions between vectors with differing element
>> > types
>> > or numbers of subparts
>> > node_type = vbicq_s32(tr_hi_lo.val[0], index_msk);
>> > ^
>> > ../lib/librte_acl/acl_run_neon.h:115:41: error: incompatible type
>> > for
>> > argument 2 of ‘vbicq_s32’
>> >
>> > Signed-off-by: Aaron Conole <aconole@redhat.com>
>> > ---
>> > lib/librte_acl/acl_run_neon.h | 46 ++++++++++++++++++++-------------
>> > --
>> > 1 file changed, 27 insertions(+), 19 deletions(-)
>> >
>> >
>> >
>> > /*
>> > @@ -179,6 +183,9 @@ search_neon_8(const struct rte_acl_ctx *ctx, const
>> > uint8_t **data,
>> > acl_match_check_x4(0, ctx, parms, &flows, &index_array[0]);
>> > acl_match_check_x4(4, ctx, parms, &flows, &index_array[4]);
>> >
>> > + memset(&input0, 0, sizeof(input0));
>> > + memset(&input1, 0, sizeof(input1));
>>
>> Why this memset only required for arm64? If it real issue, Shouldn't it
>> required for x86 and ppc ?
>>
Something for this part is still needed (see for example:
https://travis-ci.com/DPDK/dpdk/jobs/205675369).
I have two alternate approaches, butneither have even been compile tested
(and the obvious '-Wno-maybe-uninitialized' - but I dislike that
approach because it will afflict all routines):
1. Something like this:
@@ -181,8 +181,8 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
while (flows.started > 0) {
/* Gather 4 bytes of input data for each stream. */
- input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
- input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
+ input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), vdup_n_s32(0), 0);
+ input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), vdup_n_s32(0), 0);
input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
@@ -242,7 +242,7 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
while (flows.started > 0) {
/* Gather 4 bytes of input data for each stream. */
- input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
+ input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), vdup_n_s32(0), 0);
input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
---------
2: something like this
diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
index a055a8240..0eb42865a 100644
--- a/lib/librte_acl/acl_run_neon.h
+++ b/lib/librte_acl/acl_run_neon.h
@@ -165,7 +165,8 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
uint64_t index_array[8];
struct completion cmplt[8];
struct parms parms[8];
- int32x4_t input0, input1;
+ static int32x4_t ZERO_VAL;
+ int32x4_t input0 = ZERO_VAL, input1 = ZERO_VAL;
acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
total_packets, categories, ctx->trans_table);
@@ -181,8 +182,8 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
while (flows.started > 0) {
/* Gather 4 bytes of input data for each stream. */
- input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), vdup_n_s32(0), 0);
- input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), vdup_n_s32(0), 0);
+ input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
+ input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
@@ -227,7 +228,8 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
uint64_t index_array[4];
struct completion cmplt[4];
struct parms parms[4];
- int32x4_t input;
+ static int32x4_t ZERO_VAL;
+ int32x4_t input = ZERO_VAL;
acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
total_packets, categories, ctx->trans_table);
@@ -242,7 +244,7 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
while (flows.started > 0) {
/* Gather 4 bytes of input data for each stream. */
- input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), vdup_n_s32(0), 0);
+ input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
---
WDYT?
next prev parent reply other threads:[~2019-06-05 17:09 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-08 18:24 [dpdk-dev] [PATCH 0/3] librte_acl: fixes related to testing with the meson build Aaron Conole
2019-04-08 18:24 ` Aaron Conole
2019-04-08 18:24 ` [dpdk-dev] [PATCH 1/3] acl: fix arm argument types Aaron Conole
2019-04-08 18:24 ` Aaron Conole
2019-04-10 14:39 ` [dpdk-dev] [EXT] " Jerin Jacob Kollanukkaran
2019-04-10 14:39 ` Jerin Jacob Kollanukkaran
2019-04-10 15:52 ` Aaron Conole
2019-04-10 15:52 ` Aaron Conole
2019-04-10 16:07 ` Jerin Jacob Kollanukkaran
2019-04-10 16:07 ` Jerin Jacob Kollanukkaran
2019-04-10 17:20 ` Aaron Conole
2019-04-10 17:20 ` Aaron Conole
2019-04-30 12:57 ` Aaron Conole
2019-04-30 12:57 ` Aaron Conole
2019-06-05 15:16 ` Jerin Jacob Kollanukkaran
2019-06-05 17:09 ` Aaron Conole [this message]
2019-04-08 18:24 ` [dpdk-dev] [PATCH 2/3] acl: update the build for multi-arch Aaron Conole
2019-04-08 18:24 ` Aaron Conole
2019-04-08 18:24 ` [dpdk-dev] [PATCH 3/3] acl: adjust the tests Aaron Conole
2019-04-08 18:24 ` Aaron Conole
2019-04-09 8:41 ` Ananyev, Konstantin
2019-04-09 8:41 ` Ananyev, Konstantin
2019-04-09 13:01 ` Aaron Conole
2019-04-09 13:01 ` Aaron Conole
2019-04-09 16:03 ` Ananyev, Konstantin
2019-04-09 16:03 ` Ananyev, Konstantin
2019-04-09 17:04 ` Ananyev, Konstantin
2019-04-09 17:04 ` Ananyev, Konstantin
2019-04-10 8:13 ` Richardson, Bruce
2019-04-10 8:13 ` Richardson, Bruce
2019-04-10 13:10 ` Aaron Conole
2019-04-10 13:10 ` Aaron Conole
2019-04-10 13:24 ` Bruce Richardson
2019-04-10 13:24 ` Bruce Richardson
2019-04-10 13:46 ` Bruce Richardson
2019-04-10 13:46 ` Bruce Richardson
2019-04-09 17:05 ` Richardson, Bruce
2019-04-09 17:05 ` Richardson, Bruce
2019-04-09 18:29 ` Ananyev, Konstantin
2019-04-09 18:29 ` Ananyev, Konstantin
2019-04-10 9:06 ` Bruce Richardson
2019-04-10 9:06 ` Bruce Richardson
2019-04-08 20:40 ` [dpdk-dev] [PATCH 0/3] librte_acl: fixes related to testing with the meson build Aaron Conole
2019-04-08 20:40 ` Aaron Conole
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f7tblzcnelt.fsf@dhcp-25.97.bos.redhat.com \
--to=aconole@redhat.com \
--cc=dev@dpdk.org \
--cc=gavin.hu@arm.com \
--cc=jerinj@marvell.com \
--cc=konstantin.ananyev@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).