DPDK patches and discussions
* [dpdk-dev] [RFC] eal/arm: remove CASP constraints for GCC
@ 2021-10-04 10:03 pbhagavatula
  2021-10-18  6:39 ` Ruifeng Wang
  2021-11-05  8:57 ` [dpdk-dev] [PATCH v2] " pbhagavatula
  0 siblings, 2 replies; 7+ messages in thread
From: pbhagavatula @ 2021-10-04 10:03 UTC (permalink / raw)
  To: jerinj, Ruifeng Wang; +Cc: dev, Pavan Nikhilesh

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

GCC now assigns even-numbered register pairs for CASP; the fix has also
been backported to all stable releases of older GCC versions.
Removing the manual register allocation allows GCC to inline the
functions and pick optimal registers for performing CASP.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 lib/eal/arm/include/rte_atomic_64.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index fa6f334c0d..f6f31ae777 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -52,6 +52,7 @@ rte_atomic_thread_fence(int memorder)
 #define __LSE_PREAMBLE	""
 #endif
 
+#if defined(__clang__)
 #define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
 static __rte_noinline void                                                  \
 cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
@@ -76,6 +77,19 @@ cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
 	old->val[0] = x0;                                                   \
 	old->val[1] = x1;                                                   \
 }
+#else
+#define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
+static __rte_always_inline void                                             \
+cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
+{                                                                           \
+	asm volatile(                                                       \
+		__LSE_PREAMBLE                                              \
+		op_string " %[old], %H[old], %[upd], %H[upd], [%[dst]]"     \
+		: [old] "+r"(old->int128)                                   \
+		: [upd] "r"(updated.int128), [dst] "r"(dst)                 \
+		: "memory");                                                \
+}
+#endif
 
 __ATOMIC128_CAS_OP(__cas_128_relaxed, "casp")
 __ATOMIC128_CAS_OP(__cas_128_acquire, "caspa")
-- 
2.17.1



* Re: [dpdk-dev] [RFC] eal/arm: remove CASP constraints for GCC
  2021-10-04 10:03 [dpdk-dev] [RFC] eal/arm: remove CASP constraints for GCC pbhagavatula
@ 2021-10-18  6:39 ` Ruifeng Wang
  2021-11-05  8:57 ` [dpdk-dev] [PATCH v2] " pbhagavatula
  1 sibling, 0 replies; 7+ messages in thread
From: Ruifeng Wang @ 2021-10-18  6:39 UTC (permalink / raw)
  To: pbhagavatula, jerinj; +Cc: dev, nd

> -----Original Message-----
> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> Sent: Monday, October 4, 2021 6:03 PM
> To: jerinj@marvell.com; Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> Subject: [dpdk-dev] [RFC] eal/arm: remove CASP constraints for GCC
> 
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> 
> GCC now assigns even register pairs for CASP, the fix has also been
> backported to all stable releases of older GCC versions.
> Removing the manual register allocation allows GCC to inline the functions
> and pick optimal registers for performing CASP.
> 
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
>  lib/eal/arm/include/rte_atomic_64.h | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
> index fa6f334c0d..f6f31ae777 100644
> --- a/lib/eal/arm/include/rte_atomic_64.h
> +++ b/lib/eal/arm/include/rte_atomic_64.h
> @@ -52,6 +52,7 @@ rte_atomic_thread_fence(int memorder)
>  #define __LSE_PREAMBLE	""
>  #endif
> 
> +#if defined(__clang__)
>  #define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
>  static __rte_noinline void                                                  \
>  cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
> @@ -76,6 +77,19 @@ cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
>  	old->val[0] = x0;                                                   \
>  	old->val[1] = x1;                                                   \
>  }
> +#else
> +#define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
> +static __rte_always_inline void                                             \
> +cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
> +{                                                                           \
> +	asm volatile(                                                       \
> +		__LSE_PREAMBLE                                              \
Change looks good.

One minor comment: GCC doesn't need the __LSE_PREAMBLE here.

Thanks,
Ruifeng
> +		op_string " %[old], %H[old], %[upd], %H[upd], [%[dst]]"     \
> +		: [old] "+r"(old->int128)                                   \
> +		: [upd] "r"(updated.int128), [dst] "r"(dst)                 \
> +		: "memory");                                                \
> +}
> +#endif
> 
>  __ATOMIC128_CAS_OP(__cas_128_relaxed, "casp")
> __ATOMIC128_CAS_OP(__cas_128_acquire, "caspa")
> --
> 2.17.1



* [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
  2021-10-04 10:03 [dpdk-dev] [RFC] eal/arm: remove CASP constraints for GCC pbhagavatula
  2021-10-18  6:39 ` Ruifeng Wang
@ 2021-11-05  8:57 ` pbhagavatula
  2021-11-08  7:15   ` Ruifeng Wang
  1 sibling, 1 reply; 7+ messages in thread
From: pbhagavatula @ 2021-11-05  8:57 UTC (permalink / raw)
  To: ruifeng.wang, david.marchand, jerinj; +Cc: dev, Pavan Nikhilesh

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

GCC now assigns even-numbered register pairs for CASP; the fix has also
been backported to all stable releases of older GCC versions.
Removing the manual register allocation allows GCC to inline the
functions and pick optimal registers for performing CASP.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 v2 Changes:
 - Remove unnecessary LSE_PREAMBLE for GCC (Ruifeng).

 lib/eal/arm/include/rte_atomic_64.h | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index fa6f334c0d..6047911507 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -46,12 +46,8 @@ rte_atomic_thread_fence(int memorder)
 /*------------------------ 128 bit atomic operations -------------------------*/

 #if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
-#if defined(RTE_CC_CLANG)
-#define __LSE_PREAMBLE	".arch armv8-a+lse\n"
-#else
-#define __LSE_PREAMBLE	""
-#endif

+#if defined(RTE_CC_CLANG)
 #define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
 static __rte_noinline void                                                  \
 cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
@@ -65,7 +61,7 @@ cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
 	register uint64_t x2 __asm("x2") = (uint64_t)updated.val[0];        \
 	register uint64_t x3 __asm("x3") = (uint64_t)updated.val[1];        \
 	asm volatile(                                                       \
-		__LSE_PREAMBLE						    \
+		".arch armv8-a+lse\n"                                       \
 		op_string " %[old0], %[old1], %[upd0], %[upd1], [%[dst]]"   \
 		: [old0] "+r" (x0),                                         \
 		[old1] "+r" (x1)                                            \
@@ -76,13 +72,24 @@ cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
 	old->val[0] = x0;                                                   \
 	old->val[1] = x1;                                                   \
 }
+#else
+#define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
+static __rte_always_inline void                                             \
+cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
+{                                                                           \
+	asm volatile(                                                       \
+		op_string " %[old], %H[old], %[upd], %H[upd], [%[dst]]"     \
+		: [old] "+r"(old->int128)                                   \
+		: [upd] "r"(updated.int128), [dst] "r"(dst)                 \
+		: "memory");                                                \
+}
+#endif

 __ATOMIC128_CAS_OP(__cas_128_relaxed, "casp")
 __ATOMIC128_CAS_OP(__cas_128_acquire, "caspa")
 __ATOMIC128_CAS_OP(__cas_128_release, "caspl")
 __ATOMIC128_CAS_OP(__cas_128_acq_rel, "caspal")

-#undef __LSE_PREAMBLE
 #undef __ATOMIC128_CAS_OP

 #endif
--
2.17.1
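
For readers following the thread, the contract of the cas_op_name(dst, old, updated) helpers in the patch can be modelled in plain C. This is a non-atomic sketch for illustration only, not the DPDK implementation: model_int128_t, model_cas_128, model_cmp_exchange_128 and model_demo are hypothetical names, and the caller-side success check is an assumption about how a compare-exchange wrapper would consume the helper's result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_int128_t: two 64-bit halves. */
typedef struct {
	uint64_t val[2];
} model_int128_t;

static bool
model_equal(model_int128_t a, model_int128_t b)
{
	return a.val[0] == b.val[0] && a.val[1] == b.val[1];
}

/*
 * Non-atomic model of a CASP-style helper: on entry *old holds the
 * expected value. *dst is overwritten with 'updated' only if it matches
 * that expected value, and *old always receives what was seen in *dst.
 */
static void
model_cas_128(model_int128_t *dst, model_int128_t *old, model_int128_t updated)
{
	model_int128_t seen = *dst;

	if (model_equal(seen, *old))
		*dst = updated;
	*old = seen;
}

/*
 * Assumed caller pattern: the swap succeeded if the value returned
 * through 'old' equals the value the caller expected.
 */
static bool
model_cmp_exchange_128(model_int128_t *dst, model_int128_t *exp,
		       const model_int128_t *src)
{
	model_int128_t expected = *exp;
	model_int128_t old = expected;

	model_cas_128(dst, &old, *src);
	*exp = old;
	return model_equal(old, expected);
}

/* Small self-check: one matching and one stale expectation. */
static void
model_demo(void)
{
	model_int128_t dst = { { 1, 2 } };
	model_int128_t exp = { { 1, 2 } };
	model_int128_t src = { { 3, 4 } };
	model_int128_t stale = { { 9, 9 } };

	assert(model_cmp_exchange_128(&dst, &exp, &src));    /* swapped */
	assert(!model_cmp_exchange_128(&dst, &stale, &src)); /* not swapped */
	assert(stale.val[0] == 3 && stale.val[1] == 4);      /* observed value */
}
```

In the GCC variant of the patch, the single "+r"(old->int128) operand plays the role of both the expected input and the observed output, which is why no separate copy-back into old->val[] is needed there.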



* Re: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
  2021-11-05  8:57 ` [dpdk-dev] [PATCH v2] " pbhagavatula
@ 2021-11-08  7:15   ` Ruifeng Wang
  2021-11-16 14:56     ` David Marchand
  0 siblings, 1 reply; 7+ messages in thread
From: Ruifeng Wang @ 2021-11-08  7:15 UTC (permalink / raw)
  To: pbhagavatula, david.marchand, jerinj; +Cc: dev, nd

> -----Original Message-----
> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> Sent: Friday, November 5, 2021 4:57 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; david.marchand@redhat.com;
> jerinj@marvell.com
> Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
> 
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> 
> GCC now assigns even register pairs for CASP, the fix has also been
> backported to all stable releases of older GCC versions.
> Removing the manual register allocation allows GCC to inline the functions
> and pick optimal registers for performing CASP.
> 
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
>  v2 Changes:
>  - Remove unnecessary LSE_PREAMBLE for GCC (Ruifeng).
> 
>  lib/eal/arm/include/rte_atomic_64.h | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>


* Re: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
  2021-11-08  7:15   ` Ruifeng Wang
@ 2021-11-16 14:56     ` David Marchand
  2022-01-20 15:32       ` [EXT] " Pavan Nikhilesh Bhagavatula
  2022-02-11  7:53       ` David Marchand
  0 siblings, 2 replies; 7+ messages in thread
From: David Marchand @ 2021-11-16 14:56 UTC (permalink / raw)
  To: pbhagavatula, Ruifeng Wang, jerinj
  Cc: dev, nd, Honnappa Nagarahalli, Thomas Monjalon

On Mon, Nov 8, 2021 at 8:15 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
>
> > -----Original Message-----
> > From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> > Sent: Friday, November 5, 2021 4:57 PM
> > To: Ruifeng Wang <Ruifeng.Wang@arm.com>; david.marchand@redhat.com;
> > jerinj@marvell.com
> > Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> > Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
> >
> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >
> > GCC now assigns even register pairs for CASP, the fix has also been
> > backported to all stable releases of older GCC versions.
> > Removing the manual register allocation allows GCC to inline the functions
> > and pick optimal registers for performing CASP.
> >
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>

Patch LGTM, but it is too late to merge it in 21.11.

It is in EAL, and is an optimisation of the 128-bit CAS operation on ARM.
This is used by the stack library and mempool.
There might be other impacts I did not think of.


Do you have links to bugs or commits for the mentioned fix on the GCC side?
This will help when we get reports from users with compilers without the fix.


Thanks.

-- 
David Marchand



* RE: [EXT] Re: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
  2021-11-16 14:56     ` David Marchand
@ 2022-01-20 15:32       ` Pavan Nikhilesh Bhagavatula
  2022-02-11  7:53       ` David Marchand
  1 sibling, 0 replies; 7+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2022-01-20 15:32 UTC (permalink / raw)
  To: David Marchand, Ruifeng Wang, Jerin Jacob Kollanukkaran
  Cc: dev, nd, Honnappa Nagarahalli, Thomas Monjalon

>On Mon, Nov 8, 2021 at 8:15 AM Ruifeng Wang
><Ruifeng.Wang@arm.com> wrote:
>>
>> > -----Original Message-----
>> > From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
>> > Sent: Friday, November 5, 2021 4:57 PM
>> > To: Ruifeng Wang <Ruifeng.Wang@arm.com>;
>david.marchand@redhat.com;
>> > jerinj@marvell.com
>> > Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
>> > Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints
>for GCC
>> >
>> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> >
>> > GCC now assigns even register pairs for CASP, the fix has also been
>> > backported to all stable releases of older GCC versions.
>> > Removing the manual register allocation allows GCC to inline the
>functions
>> > and pick optimal registers for performing CASP.
>> >
>> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>
>Patch lgtm but it is late for merging in 21.11.
>
>It is in EAL, and is an optimisation of the 128 bits cas operation on ARM.
>This is used by the stack library and mempool.
>There might be other impacts I did not think of.
>
>
>Do you have links to bugs or commits for the mentioned fix on gcc
>side?

Here is the gcc git commit that fixes this.

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=563cc649beaf11d707c422e5f4e9e5cdacb818c3

>This will help when we get reports from users with compilers without
>the fix.
>
>
>Thanks.
>
>--
>David Marchand



* Re: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
  2021-11-16 14:56     ` David Marchand
  2022-01-20 15:32       ` [EXT] " Pavan Nikhilesh Bhagavatula
@ 2022-02-11  7:53       ` David Marchand
  1 sibling, 0 replies; 7+ messages in thread
From: David Marchand @ 2022-02-11  7:53 UTC (permalink / raw)
  To: pbhagavatula, Ruifeng Wang, jerinj
  Cc: dev, nd, Honnappa Nagarahalli, Thomas Monjalon

On Tue, Nov 16, 2021 at 3:56 PM David Marchand
<david.marchand@redhat.com> wrote:
> > > GCC now assigns even register pairs for CASP, the fix has also been
> > > backported to all stable releases of older GCC versions.
> > > Removing the manual register allocation allows GCC to inline the functions
> > > and pick optimal registers for performing CASP.
> > >
> > > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>

I added a reference to the GCC commit and applied, thanks.


-- 
David Marchand

