From: Ruifeng Wang <Ruifeng.Wang@arm.com>
To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"vladimir.medvedkin@intel.com" <vladimir.medvedkin@intel.com>,
	"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	nd <nd@arm.com>, "stable@dpdk.org" <stable@dpdk.org>,
	nd <nd@arm.com>
Subject: Re: [dpdk-stable] [EXT] [PATCH v2 4/5] common/octeontx2: fix build with sve enabled
Date: Mon, 11 Jan 2021 09:51:54 +0000
Message-ID: <VI1PR0802MB2351ED35DC0890FB627D45369EAB0@VI1PR0802MB2351.eurprd08.prod.outlook.com>
In-Reply-To: <CO6PR18MB382874FDB8C371FF727AAB63DEAE9@CO6PR18MB3828.namprd18.prod.outlook.com>

> -----Original Message-----
> From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Sent: Friday, January 8, 2021 6:29 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; jerinj@marvell.com; Nithin
> Kumar Dabilpuram <ndabilpuram@marvell.com>
> Cc: dev@dpdk.org; vladimir.medvedkin@intel.com;
> hemant.agrawal@nxp.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>; stable@dpdk.org
> Subject: RE: [EXT] [PATCH v2 4/5] common/octeontx2: fix build with sve
> enabled
>
> Hi Ruifeng,
>
> >Building with gcc 10.2 with SVE extension enabled got error:
> >
> >{standard input}: Assembler messages:
> >{standard input}:4002: Error: selected processor does not support `mov
> >z3.b,#0'
> >{standard input}:4003: Error: selected processor does not support
> >`whilelo p1.b,xzr,x7'
> >{standard input}:4005: Error: selected processor does not support `ld1b
> >z0.b,p1/z,[x8]'
> >{standard input}:4006: Error: selected processor does not support
> >`whilelo p4.s,wzr,w7'
> >
> >This is because inline assembly code explicitly resets cpu model to not
> >have SVE support. Thus SVE instructions generated by compiler auto
> >vectorization got rejected by assembler.
> >
> >Fixed the issue by replacing inline assembly with equivalent atomic
> >built-ins. Compiler will generate LSE instructions for cpu that has the
> >extension.
> >
> >Fixes: 8a4f835971f5 ("common/octeontx2: add IO handling APIs")
> >Cc: jerinj@marvell.com
> >Cc: stable@dpdk.org
> >
> >Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> >---
> > drivers/common/octeontx2/otx2_io_arm64.h | 37 +++--------------------
> >-
> > 1 file changed, 4 insertions(+), 33 deletions(-)
> >
> >diff --git a/drivers/common/octeontx2/otx2_io_arm64.h
> >b/drivers/common/octeontx2/otx2_io_arm64.h
> >index b5c85d9a6..8843a79b5 100644
> >--- a/drivers/common/octeontx2/otx2_io_arm64.h
> >+++ b/drivers/common/octeontx2/otx2_io_arm64.h
> >@@ -24,55 +24,26 @@
> > static __rte_always_inline uint64_t
> > otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr) {
> >-	uint64_t result;
> >-
> > 	/* Atomic add with no ordering */
> >-	asm volatile (
> >-		".cpu generic+lse\n"
> >-		"ldadd %x[i], %x[r], [%[b]]"
> >-		: [r] "=r" (result), "+m" (*ptr)
> >-		: [i] "r" (incr), [b] "r" (ptr)
> >-		: "memory");
> >-	return result;
> >+	return (uint64_t)__atomic_fetch_add(ptr, incr,
> >__ATOMIC_RELAXED);
> > }
> >
>
> Here LDADD acts as a way to interface to co-processors i.e.
> LDADD instruction opcode + specific io address are recognized by HW
> interceptor and dispatched to the specific coprocessor.

OK. Now I understand the background.

>
> Leaving it to the compiler to use the correct instruction is a bad idea.
> This breaks the arm64_armv8_linux_gcc build as it doesn't have the
> +lse enabled.
> __atomic_fetch_add will generate a different instruction with SVE enabled.
>
> Instead can we add +sve to the first line to prevent outer loop from
> optimizing out the trap?

Since the inline assembly needs to be preserved, we have to tune the enabled extensions.
I will change in next version.

Thanks,
Ruifeng

>
> I tested with 10.2 and n2 config below change works fine.
> -" .cpu generic+lse\n"
> +" .cpu generic+lse+sve\n"
>
> Regards,
> Pavan.
>
> > static __rte_always_inline uint64_t
> > otx2_atomic64_add_sync(int64_t incr, int64_t *ptr) {
> >-	uint64_t result;
> >-
> >-	/* Atomic add with ordering */
> >-	asm volatile (
> >-		".cpu generic+lse\n"
> >-		"ldadda %x[i], %x[r], [%[b]]"
> >-		: [r] "=r" (result), "+m" (*ptr)
> >-		: [i] "r" (incr), [b] "r" (ptr)
> >-		: "memory");
> >-	return result;
> >+	return (uint64_t)__atomic_fetch_add(ptr, incr,
> >__ATOMIC_ACQUIRE);
> > }
> >
> > static __rte_always_inline uint64_t
> > otx2_lmt_submit(rte_iova_t io_address) {
> >-	uint64_t result;
> >-
> >-	asm volatile (
> >-		".cpu generic+lse\n"
> >-		"ldeor xzr,%x[rf],[%[rs]]" :
> >-		[rf] "=r"(result): [rs] "r"(io_address));
> >-	return result;
> >+	return __atomic_fetch_xor((uint64_t *)io_address, 0,
> >__ATOMIC_RELAXED);
> > }
> >
> > static __rte_always_inline uint64_t
> > otx2_lmt_submit_release(rte_iova_t io_address) {
> >-	uint64_t result;
> >-
> >-	asm volatile (
> >-		".cpu generic+lse\n"
> >-		"ldeorl xzr,%x[rf],[%[rs]]" :
> >-		[rf] "=r"(result) : [rs] "r"(io_address));
> >-	return result;
> >+	return __atomic_fetch_xor((uint64_t *)io_address, 0,
> >__ATOMIC_RELEASE);
> > }
> >
> > static __rte_always_inline void
> >--
> >2.25.1
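
For reference, the direction agreed on in this message (keep the inline assembly, widen the .cpu line) might look roughly like the sketch below. It is an illustration under stated assumptions, not the patch that was later posted: the names sketch_atomic64_add_nosync, sketch_selfcheck and the SKETCH_USE_LSE_ASM switch are hypothetical, and the switch exists only so that the snippet also builds on hosts without LSE. With the switch defined on an arm64 target, the LDADD opcode stays explicit and the .cpu directive additionally names +sve, so SVE instructions emitted by the compiler elsewhere in the same translation unit are still accepted by the assembler. Without the switch, the snippet falls back to the __atomic built-in, which, as discussed in the quoted review, does not guarantee the LDADD encoding that the hardware interceptor relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for otx2_atomic64_add_nosync(); not DPDK code. */
static inline uint64_t
sketch_atomic64_add_nosync(int64_t incr, int64_t *ptr)
{
#if defined(__aarch64__) && defined(SKETCH_USE_LSE_ASM)
	uint64_t result;

	/*
	 * Atomic add with no ordering. The opcode is spelled out so that
	 * the compiler cannot substitute another instruction sequence.
	 * "+sve" keeps the assembler willing to accept SVE instructions
	 * that auto-vectorization may emit after this directive.
	 * "=&r" (early clobber) is an addition of this sketch: it keeps
	 * the result register from overlapping an input register.
	 */
	asm volatile (
		".cpu generic+lse+sve\n"
		"ldadd %x[i], %x[r], [%[b]]"
		: [r] "=&r" (result), "+m" (*ptr)
		: [i] "r" (incr), [b] "r" (ptr)
		: "memory");
	return result;
#else
	/* Portable fallback for illustration only; no opcode guarantee. */
	return (uint64_t)__atomic_fetch_add(ptr, incr, __ATOMIC_RELAXED);
#endif
}

/* Self-check: the call returns the old value and updates memory. */
static inline void
sketch_selfcheck(void)
{
	int64_t counter = 5;
	uint64_t old = sketch_atomic64_add_nosync(3, &counter);

	assert(old == 5);
	assert(counter == 8);
}
```

Defining SKETCH_USE_LSE_ASM only makes sense when both the toolchain and the target CPU support LSE. The same .cpu change would presumably apply to the ldadda, ldeor and ldeorl variants quoted above.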