DPDK patches and discussions
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: Aaron Conole <aconole@redhat.com>
Cc: "msantana@redhat.com" <msantana@redhat.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"Ruifeng Wang (Arm Technology China)" <Ruifeng.Wang@arm.com>,
	"Gavin Hu (Arm Technology China)" <Gavin.Hu@arm.com>,
	 Dharmik Thakkar <Dharmik.Thakkar@arm.com>,
	"jerin.jacob@caviumnetworks.com" <jerin.jacob@caviumnetworks.com>,
	"yskoh@mellanox.com" <yskoh@mellanox.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"bruce.richardson@intel.com" <bruce.richardson@intel.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	nd <nd@arm.com>, nd <nd@arm.com>
Subject: Re: [dpdk-dev] DPDK compilation on arm is failing in Travis
Date: Fri, 7 Jun 2019 13:53:55 +0000
Message-ID: <VE1PR08MB5149E1F99E9DEEC25C63756498100@VE1PR08MB5149.eurprd08.prod.outlook.com>
In-Reply-To: <f7th891o7ec.fsf@dhcp-25.97.bos.redhat.com>

> >> > Thomas Monjalon <thomas@monjalon.net> writes:
> >> >
> >> > The compilation of the master branch is failing for aarch64:
> >> > https://travis-ci.com/DPDK/dpdk
> >> >
> >> > The log is so verbose that I am not able to understand what
> >> > is really wrong.
> >> >
> >> > Please help to diagnose and fix, thanks.
> >> >
> >> > A discussion about this:
> >> > http://mails.dpdk.org/archives/dev/2019-June/134012.html
> >> >
> >> > I see the error now.
> >> > It is printing the full log after the error, so I missed the error
> >> > at the top.
> >> >
> >> > I've read your comment about a possible error with the patch
> >> > removing weak functions but neither Bruce nor I were able to
> >> > reproduce it.
> >> >
> >> > What is the condition to see this compiler warning?
> >> >
> >> > It is only on ARM, and only when the NEON intrinsics are in use.
> >> >
> >> > I am not able to reproduce it from the tip of master.
> >> >
> >> > I am using:
> >> > gcc (Ubuntu 8.3.0-6ubuntu1~18.04) 8.3.0
> >> >
> >> > From the log on Travis, it looks like the compiler is:
> >> > gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
> >> >
> >> > Is this the issue?
> >> >
> >> > Why are we seeing the error now?
> >> >
> >> > I tested with gcc-5 (Ubuntu/Linaro 5.5.0-12ubuntu1) 5.5.0 20171010,
> >> > it works fine. I cannot get hold of 5.4.0. Not sure if it needs to
> >> > be supported.
> >> >
> >> > Are there any issues in upgrading to 7 or 8?
> >> >
> >> > I have tested it on my ubuntu 16.04 vm on commit
> >> > 8cb511bb94ad92a76990f175cac76bb13d51daba
> >> > (head of master seems to be failing for other reasons on my vm).
> >> > I tested the following gcc versions:
> >> >
> >> > gcc 5.5.0 "cc (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010"
> >> > gcc 7.4.0 "cc (Ubuntu 7.4.0-1ubuntu1~16.04~ppa1) 7.4.0"
> >> > gcc 8.1.0 "cc (Ubuntu 8.1.0-5ubuntu1~16.04) 8.1.0"
> >> >
> >> > All tested versions failed with the exact same error shown in Travis.
> >> > I don't know if the compiler is at fault here. Maybe Aaron's patch
> >> > is a viable option?
> >> >
> >> > The issue is that the vector lane-setting code looks like:
> >> >
> >> >    lval = lane_set(scalar, rval, lane id)
> >> >
> >> > In this case, 'rval' is being used before it is ever set, but it
> >> > really could be just 0 for the first lane setting code. Thereafter,
> >> > we use the old value of input as the rval, but each time a
> >> > different lane is set.
> >> >
> >> > It would be nice if there were an intrinsic that formatted correctly
> >> > from the start (something we could call like lval =
> >> > lane_set_from_array(scalar_array)).
> >> >
> >> > [Honnappa] This exists already. 'vdupq_n_s32' can be used. Can you
> >> > try the following?
> >>
> >> Well, it isn't exactly that.  You are setting all lanes from a scalar.
> > Yes, you are correct, it sets all the lanes. I am not sure how this
> > will affect performance.
> >
> >> I'd rather be able to say:
> >>
> >>    input0 = vdupq_nn_s32(&parms[0]);
> >>    input1 = vdupq_nn_s32(&parms[4]);
> >>
> >> Something like that, which lets us delete all the rest of the
> >> lane-set code.  But it seems it doesn't exist.
> >>
> >> Regardless, I think either patch should work (either using the 'all lanes'
> >> setting you have or the static variable).  I have no preference on it
> >> - it's up to you (or someone else) to say which is preferred.  I
> >> guess your version could be preferable since there's no static to
> >> need to "explain" :)
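
For reference, the closest existing intrinsic to the array-style call wished for above is probably vld1q_s32, which loads four contiguous int32_t values from memory into a vector. The sketch below assumes the four per-stream words are first staged into a local array. gather4_load and its parameters are hypothetical names for illustration, not DPDK code, and the stand-in definitions in the #else branch only exist so the example builds on non-Arm hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#else
/* Hypothetical stand-ins so the sketch builds off Arm; NOT the real intrinsics. */
#include <string.h>
typedef struct { int32_t v[4]; } int32x4_t;
static inline int32x4_t vld1q_s32(const int32_t *p)
{
	int32x4_t r;
	memcpy(r.v, p, sizeof(r.v));
	return r;
}
static inline void vst1q_s32(int32_t *p, int32x4_t val)
{
	memcpy(p, val.v, sizeof(val.v));
}
#endif

/*
 * Read one 32-bit word from each of four independent streams and build
 * a vector from them with a single load instead of four lane inserts.
 * The result is stored to out[] so the caller can inspect it.
 */
void gather4_load(const int32_t *const streams[4], int32_t out[4])
{
	int32_t staged[4];
	int32x4_t input;
	int i;

	/* Stage the scattered words into contiguous memory first. */
	for (i = 0; i < 4; i++) {
		assert(streams[i] != NULL);
		staged[i] = *streams[i];
	}

	/* Every lane is written by the load, so nothing is read unset. */
	input = vld1q_s32(staged);
	vst1q_s32(out, input);
}
```

Whether the extra staging stores are cheaper than three lane inserts is not something this thread measures, so this is only an option to benchmark, not a recommendation.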
> > I think we can go ahead with your patch, which uses a temporary vector
> > for the first set, as it does not introduce any change to the code and
> > hence performance should not be affected.
> >
> > But I do not understand why you have added 'static'. Also, changing
> > 'ZEROVAL' to 'tmp' or something similar would be better.
> 
> The static is there to guarantee '0' value.  Otherwise we create a temp
> variable that has to be initialized explicitly.
Ok, I am fine with this. I guess this is the explanation you wanted to avoid 😊.
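
A minimal sketch of the point about 'static', assuming only standard C rules: an object with static storage duration and no initializer starts out as all zeros, so it can serve as the rval of the first lane set without any explicit assignment. The function names (first_lane_only, gather4_zero) are hypothetical and this is not the acl_run_neon.h code. The #else branch holds stand-ins so the example also builds on non-Arm hosts.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#else
/* Hypothetical stand-ins so the sketch builds off Arm; NOT the real intrinsics. */
#include <string.h>
typedef struct { int32_t v[4]; } int32x4_t;
static inline int32x4_t vsetq_lane_s32(int32_t s, int32x4_t vec, int lane)
{
	assert(lane >= 0 && lane < 4);
	vec.v[lane] = s;
	return vec;
}
static inline void vst1q_s32(int32_t *p, int32x4_t val)
{
	memcpy(p, val.v, sizeof(val.v));
}
#endif

/* Set only lane 0; the remaining lanes come from the zeroed static. */
void first_lane_only(int32_t s, int32_t out[4])
{
	static int32x4_t zero;	/* static storage: zero-initialized by C */
	int32x4_t input;

	input = vsetq_lane_s32(s, zero, 0);
	vst1q_s32(out, input);
}

/* Full gather: the first set reads 'zero', later sets read 'input'. */
void gather4_zero(const int32_t src[4], int32_t out[4])
{
	static int32x4_t zero;
	int32x4_t input;

	input = vsetq_lane_s32(src[0], zero, 0);
	input = vsetq_lane_s32(src[1], input, 1);
	input = vsetq_lane_s32(src[2], input, 2);
	input = vsetq_lane_s32(src[3], input, 3);
	vst1q_s32(out, input);
}
```

Because 'zero' is never written, 'input' is always assigned before it is read, which is the property the compiler warning is checking for.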

> 
> >>
> >> > honnag01@qc2400f-1:~/dpdk$ git diff
> >> > diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
> >> > index 01b9766d8..b3196cd12 100644
> >> > --- a/lib/librte_acl/acl_run_neon.h
> >> > +++ b/lib/librte_acl/acl_run_neon.h
> >> > @@ -181,8 +181,8 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
> >> >
> >> >         while (flows.started > 0) {
> >> >                 /* Gather 4 bytes of input data for each stream. */
> >> > -               input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
> >> > -               input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
> >> > +               input0 = vdupq_n_s32(GET_NEXT_4BYTES(parms, 0));
> >> > +               input1 = vdupq_n_s32(GET_NEXT_4BYTES(parms, 4));
> >> >
> >> >                 input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
> >> >                 input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
> >> > @@ -242,7 +242,7 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
> >> >
> >> >         while (flows.started > 0) {
> >> >                 /* Gather 4 bytes of input data for each stream. */
> >> > -               input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
> >> > +               input = vdupq_n_s32(GET_NEXT_4BYTES(parms, 0));
> >> >                 input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
> >> >                 input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
> >> >                 input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
> >> >
> >> >
> >> >
> >> >  Then 'input' would never appear as an rval before it was set.
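
A small sketch of why the vdupq_n_s32 variant ends up with the same vector, under the assumption that lanes 1 to 3 are always overwritten afterwards, as in the quoted loop: the broadcast fills all four lanes with the first word, and the later lane sets replace everything except lane 0. gather4_dup is a hypothetical helper for illustration, not the DPDK function, and the #else branch holds stand-ins so the example also builds on non-Arm hosts.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#else
/* Hypothetical stand-ins so the sketch builds off Arm; NOT the real intrinsics. */
#include <string.h>
typedef struct { int32_t v[4]; } int32x4_t;
static inline int32x4_t vdupq_n_s32(int32_t s)
{
	int32x4_t r = { { s, s, s, s } };
	return r;
}
static inline int32x4_t vsetq_lane_s32(int32_t s, int32x4_t vec, int lane)
{
	assert(lane >= 0 && lane < 4);
	vec.v[lane] = s;
	return vec;
}
static inline void vst1q_s32(int32_t *p, int32x4_t val)
{
	memcpy(p, val.v, sizeof(val.v));
}
#endif

/* Broadcast the first word, then overwrite lanes 1..3 one at a time. */
void gather4_dup(const int32_t src[4], int32_t out[4])
{
	int32x4_t input;

	input = vdupq_n_s32(src[0]);	/* all four lanes = src[0] */
	input = vsetq_lane_s32(src[1], input, 1);
	input = vsetq_lane_s32(src[2], input, 2);
	input = vsetq_lane_s32(src[3], input, 3);
	vst1q_s32(out, input);
}
```

The functional result matches the lane-by-lane version. Any performance difference between a broadcast and a single lane insert is not measured in this thread.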
> >> >
> >> >
> >> >
> >> > I thought Jerin Jacob (CC'd) would have some opinion on the right fix.
> >> >
> >> > There are three 'fixes' I know exist - one is to squelch the warning
> >> > (but I don't like it because it could hide future code that introduces
> >> > this), one is to create a static and use assignment, one is to replace
> >> > the first call and pass in a 0'd lane for the first one.
> >> >
> >> > Actually, I think I have a patch that could work to not introduce an
> >> > assignment, but squelch the warning.  Something like the following
> >> > (not tested).
> >> >
> >> >
> >> >
> >> > ---
> >> >
> >> > diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
> >> > index 01b9766d8..37c984fef 100644
> >> > --- a/lib/librte_acl/acl_run_neon.h
> >> > +++ b/lib/librte_acl/acl_run_neon.h
> >> > @@ -165,6 +165,7 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
> >> >     uint64_t index_array[8];
> >> >     struct completion cmplt[8];
> >> >     struct parms parms[8];
> >> > +   static int32x4_t ZEROVAL;
> >> >     int32x4_t input0, input1;
> >> >
> >> >     acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
> >> > @@ -181,8 +182,8 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
> >> >
> >> >     while (flows.started > 0) {
> >> >             /* Gather 4 bytes of input data for each stream. */
> >> > -           input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
> >> > -           input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
> >> > +           input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), ZEROVAL, 0);
> >> > +           input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), ZEROVAL, 0);
> >> >
> >> >             input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
> >> >             input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
> >> > @@ -227,6 +228,7 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
> >> >     uint64_t index_array[4];
> >> >     struct completion cmplt[4];
> >> >     struct parms parms[4];
> >> > +   static int32x4_t ZEROVAL;
> >> >     int32x4_t input;
> >> >
> >> >     acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
> >> > @@ -242,7 +244,7 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
> >> >
> >> >     while (flows.started > 0) {
> >> >             /* Gather 4 bytes of input data for each stream. */
> >> > -           input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
> >> > +           input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), ZEROVAL, 0);
> >> >             input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
> >> >             input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
> >> >             input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
> >> > --
> >> > 2.21.0


Thread overview: 16+ messages
2019-06-05 18:26 Thomas Monjalon
2019-06-05 19:40 ` Aaron Conole
2019-06-05 20:04   ` Thomas Monjalon
2019-06-05 20:58     ` Aaron Conole
2019-06-05 21:05       ` Honnappa Nagarahalli
2019-06-05 21:36         ` Honnappa Nagarahalli
2019-06-05 22:38           ` Michael Santana Francisco
2019-06-06  4:42             ` Honnappa Nagarahalli
2019-06-06 14:50               ` Aaron Conole
2019-06-07  5:10                 ` Honnappa Nagarahalli
2019-06-07 13:24                   ` Aaron Conole
2019-06-07 13:53                     ` Honnappa Nagarahalli [this message]
2019-06-08  8:38                       ` Jerin Jacob Kollanukkaran
2019-06-08  8:41                         ` Jerin Jacob Kollanukkaran
2019-06-06 14:57             ` Jerin Jacob Kollanukkaran
2019-06-06 17:06               ` Michael Santana Francisco
