patches for DPDK stable branches
 help / color / mirror / Atom feed
From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
To: Ruifeng Wang <ruifeng.wang@arm.com>,
	Jerin Jacob Kollanukkaran <jerinj@marvell.com>,
	Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"vladimir.medvedkin@intel.com" <vladimir.medvedkin@intel.com>,
	"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
	"honnappa.nagarahalli@arm.com" <honnappa.nagarahalli@arm.com>,
	"nd@arm.com" <nd@arm.com>, "stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-stable] [EXT] [PATCH v2 4/5] common/octeontx2: fix build with sve enabled
Date: Fri, 8 Jan 2021 10:29:20 +0000
Message-ID: <CO6PR18MB382874FDB8C371FF727AAB63DEAE9@CO6PR18MB3828.namprd18.prod.outlook.com> (raw)
In-Reply-To: <20210108082523.1062058-5-ruifeng.wang@arm.com>

Hi Ruifeng,

>Building with gcc 10.2 with SVE extension enabled got error:
>
>{standard input}: Assembler messages:
>{standard input}:4002: Error: selected processor does not support `mov
>z3.b,#0'
>{standard input}:4003: Error: selected processor does not support
>`whilelo p1.b,xzr,x7'
>{standard input}:4005: Error: selected processor does not support `ld1b
>z0.b,p1/z,[x8]'
>{standard input}:4006: Error: selected processor does not support
>`whilelo p4.s,wzr,w7'
>
>This is because inline assembly code explicitly resets cpu model to
>not have SVE support. Thus SVE instructions generated by compiler
>auto vectorization got rejected by assembler.
>
>Fixed the issue by replacing inline assembly with equivalent atomic
>built-ins. Compiler will generate LSE instructions for cpu that has
>the extension.
>
>Fixes: 8a4f835971f5 ("common/octeontx2: add IO handling APIs")
>Cc: jerinj@marvell.com
>Cc: stable@dpdk.org
>
>Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
>---
> drivers/common/octeontx2/otx2_io_arm64.h | 37 +++--------------------
>-
> 1 file changed, 4 insertions(+), 33 deletions(-)
>
>diff --git a/drivers/common/octeontx2/otx2_io_arm64.h
>b/drivers/common/octeontx2/otx2_io_arm64.h
>index b5c85d9a6..8843a79b5 100644
>--- a/drivers/common/octeontx2/otx2_io_arm64.h
>+++ b/drivers/common/octeontx2/otx2_io_arm64.h
>@@ -24,55 +24,26 @@
> static __rte_always_inline uint64_t
> otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr)
> {
>-	uint64_t result;
>-
> 	/* Atomic add with no ordering */
>-	asm volatile (
>-		".cpu  generic+lse\n"
>-		"ldadd %x[i], %x[r], [%[b]]"
>-		: [r] "=r" (result), "+m" (*ptr)
>-		: [i] "r" (incr), [b] "r" (ptr)
>-		: "memory");
>-	return result;
>+	return (uint64_t)__atomic_fetch_add(ptr, incr,
>__ATOMIC_RELAXED);
> }
>

Here LDADD acts as a way to interface to co-processors i.e. 
LDADD instruction opcode + specific io address are recognized by 
HW interceptor and dispatched to the specific coprocessor.

Leaving it to the compiler to use the correct instruction is a bad idea.
This breaks the arm64_armv8_linux_gcc build as it doesn't have the
+lse enabled.
__atomic_fetch_add will generate a different instruction with SVE 
enabled.

Instead can we add +sve to the first line to prevent outer loop from optimizing out 
the trap?

I tested with 10.2 and n2 config below change works fine.
-" .cpu          generic+lse\n"
+" .cpu		generic+lse+sve\n"

Regards,
Pavan.

> static __rte_always_inline uint64_t
> otx2_atomic64_add_sync(int64_t incr, int64_t *ptr)
> {
>-	uint64_t result;
>-
>-	/* Atomic add with ordering */
>-	asm volatile (
>-		".cpu  generic+lse\n"
>-		"ldadda %x[i], %x[r], [%[b]]"
>-		: [r] "=r" (result), "+m" (*ptr)
>-		: [i] "r" (incr), [b] "r" (ptr)
>-		: "memory");
>-	return result;
>+	return (uint64_t)__atomic_fetch_add(ptr, incr,
>__ATOMIC_ACQUIRE);
> }
>
> static __rte_always_inline uint64_t
> otx2_lmt_submit(rte_iova_t io_address)
> {
>-	uint64_t result;
>-
>-	asm volatile (
>-		".cpu  generic+lse\n"
>-		"ldeor xzr,%x[rf],[%[rs]]" :
>-		 [rf] "=r"(result): [rs] "r"(io_address));
>-	return result;
>+	return __atomic_fetch_xor((uint64_t *)io_address, 0,
>__ATOMIC_RELAXED);
> }
>
> static __rte_always_inline uint64_t
> otx2_lmt_submit_release(rte_iova_t io_address)
> {
>-	uint64_t result;
>-
>-	asm volatile (
>-		".cpu  generic+lse\n"
>-		"ldeorl xzr,%x[rf],[%[rs]]" :
>-		 [rf] "=r"(result) : [rs] "r"(io_address));
>-	return result;
>+	return __atomic_fetch_xor((uint64_t *)io_address, 0,
>__ATOMIC_RELEASE);
> }
>
> static __rte_always_inline void
>--
>2.25.1


  reply	other threads:[~2021-01-08 10:29 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20201218101210.356836-1-ruifeng.wang@arm.com>
     [not found] ` <20210108082523.1062058-1-ruifeng.wang@arm.com>
2021-01-08  8:25   ` [dpdk-stable] [PATCH v2 2/5] net/hns3: " Ruifeng Wang
2021-01-09  0:06     ` Honnappa Nagarahalli
2021-01-09  2:11       ` oulijun
2021-01-11  2:39         ` Ruifeng Wang
2021-01-11 13:38           ` Honnappa Nagarahalli
2021-01-09  2:15     ` oulijun
2021-01-11  2:27       ` Ruifeng Wang
2021-01-08  8:25   ` [dpdk-stable] [PATCH v2 3/5] net/octeontx: " Ruifeng Wang
2021-01-08  8:25   ` [dpdk-stable] [PATCH v2 4/5] common/octeontx2: " Ruifeng Wang
2021-01-08 10:29     ` Pavan Nikhilesh Bhagavatula [this message]
2021-01-11  9:51       ` [dpdk-stable] [EXT] " Ruifeng Wang
     [not found] ` <20210112025709.1121523-1-ruifeng.wang@arm.com>
2021-01-12  2:57   ` [dpdk-stable] [PATCH v3 2/5] net/hns3: " Ruifeng Wang
2021-01-13  2:16     ` Honnappa Nagarahalli
2021-01-12  2:57   ` [dpdk-stable] [PATCH v3 3/5] net/octeontx: " Ruifeng Wang
2021-01-12  4:39     ` [dpdk-stable] [dpdk-dev] " Jerin Jacob
2021-01-12  2:57   ` [dpdk-stable] [PATCH v3 4/5] common/octeontx2: " Ruifeng Wang
2021-01-12  4:38     ` [dpdk-stable] [dpdk-dev] " Jerin Jacob

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CO6PR18MB382874FDB8C371FF727AAB63DEAE9@CO6PR18MB3828.namprd18.prod.outlook.com \
    --to=pbhagavatula@marvell.com \
    --cc=dev@dpdk.org \
    --cc=hemant.agrawal@nxp.com \
    --cc=honnappa.nagarahalli@arm.com \
    --cc=jerinj@marvell.com \
    --cc=nd@arm.com \
    --cc=ndabilpuram@marvell.com \
    --cc=ruifeng.wang@arm.com \
    --cc=stable@dpdk.org \
    --cc=vladimir.medvedkin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

patches for DPDK stable branches

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/stable/0 stable/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 stable stable/ https://inbox.dpdk.org/stable \
		stable@dpdk.org
	public-inbox-index stable

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.stable


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git