patches for DPDK stable branches
From: Kevin Traynor <ktraynor@redhat.com>
To: Gavin Hu <gavin.hu@arm.com>
Cc: Phil Yang <phil.yang@arm.com>,
	Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>,
	Ola Liljedahl <ola.liljedahl@arm.com>,
	Steve Capper <steve.capper@arm.com>,
	Jerin Jacob <jerinj@marvell.com>,
	Nipun Gupta <nipun.gupta@nxp.com>,
	Konstantin Ananyev <konstantin.ananyev@intel.com>,
	dpdk stable <stable@dpdk.org>
Subject: [dpdk-stable] patch 'spinlock: reimplement with atomic one-way barrier' has been queued to LTS release 18.11.2
Date: Tue, 16 Apr 2019 15:37:16 +0100
Message-ID: <20190416143719.21601-58-ktraynor@redhat.com>
In-Reply-To: <20190416143719.21601-1-ktraynor@redhat.com>

Hi,

FYI, your patch has been queued to LTS release 18.11.2

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 04/24/19. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e. not only metadata diffs), please double-check that the rebase was
correctly done.

Thanks.

Kevin Traynor

---
From 548edf47ab0869243541b1c855d1e9330b4718a7 Mon Sep 17 00:00:00 2001
From: Gavin Hu <gavin.hu@arm.com>
Date: Fri, 8 Mar 2019 15:56:37 +0800
Subject: [PATCH] spinlock: reimplement with atomic one-way barrier

[ upstream commit 453d8f736676255696bce3c5e01b43b543ff896c ]

The __sync builtin based implementation generates full memory barriers
('dmb ish') on Arm platforms. Using C11 atomic builtins to generate one way
barriers.

Here is the assembly code generated for the __sync_bool_compare_and_swap
builtin:
__sync_bool_compare_and_swap(dst, exp, src);
   0x000000000090f1b0 <+16>:    e0 07 40 f9 ldr x0, [sp, #8]
   0x000000000090f1b4 <+20>:    e1 0f 40 79 ldrh    w1, [sp, #6]
   0x000000000090f1b8 <+24>:    e2 0b 40 79 ldrh    w2, [sp, #4]
   0x000000000090f1bc <+28>:    21 3c 00 12 and w1, w1, #0xffff
   0x000000000090f1c0 <+32>:    03 7c 5f 48 ldxrh   w3, [x0]
   0x000000000090f1c4 <+36>:    7f 00 01 6b cmp w3, w1
   0x000000000090f1c8 <+40>:    61 00 00 54 b.ne    0x90f1d4 <rte_atomic16_cmpset+52>  // b.any
   0x000000000090f1cc <+44>:    02 fc 04 48 stlxrh  w4, w2, [x0]
   0x000000000090f1d0 <+48>:    84 ff ff 35 cbnz    w4, 0x90f1c0 <rte_atomic16_cmpset+32>
   0x000000000090f1d4 <+52>:    bf 3b 03 d5 dmb ish
   0x000000000090f1d8 <+56>:    e0 17 9f 1a cset    w0, eq  // eq = none
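
To compare the two builtin families outside the DPDK tree, a minimal sketch
such as the following can be compiled with an AArch64 toolchain and its
assembly inspected. The demo_* names are hypothetical helpers written for this
illustration and are not part of DPDK. The acquire variant is expected to drop
the trailing 'dmb ish' seen in the listing above, but the exact output depends
on the compiler version and flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compare-and-swap through the legacy __sync builtin,
 * which is documented as a full barrier. */
int
demo_cas16_sync(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
	return __sync_bool_compare_and_swap(dst, exp, src);
}

/* Hypothetical helper: the same compare-and-swap with acquire ordering on
 * success and relaxed ordering on failure (strong variant, weak == 0). */
int
demo_cas16_acquire(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
	return __atomic_compare_exchange_n(dst, &exp, src, 0,
			__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Single-threaded check that both helpers agree on the result. */
void
demo_cas16_selfcheck(void)
{
	uint16_t v = 0;

	assert(demo_cas16_sync(&v, 0, 1) == 1 && v == 1);    /* match: swapped */
	assert(demo_cas16_sync(&v, 0, 2) == 0 && v == 1);    /* mismatch: kept */
	assert(demo_cas16_acquire(&v, 1, 3) == 1 && v == 3); /* match: swapped */
	assert(demo_cas16_acquire(&v, 1, 4) == 0 && v == 3); /* mismatch: kept */
}
```

Both helpers have the same functional behaviour; only the memory ordering
they request differs.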

The benchmarking results showed consistent improvements on all available
platforms:
1. Cavium ThunderX2: 126%;
2. Hisilicon 1616: 30%;
3. Qualcomm Falkor: 13%;
4. Marvell ARMADA 8040 with A72 cores on macchiatobin: 3.7%.

Here is the example test result on TX2:
$ sudo ./build/app/test -l 16-27 -- -i
RTE>>spinlock_autotest

*** spinlock_autotest without this patch ***
Test with lock on 12 cores...
Core [16] Cost Time = 53886 us
Core [17] Cost Time = 53605 us
Core [18] Cost Time = 53163 us
Core [19] Cost Time = 49419 us
Core [20] Cost Time = 34317 us
Core [21] Cost Time = 53408 us
Core [22] Cost Time = 53970 us
Core [23] Cost Time = 53930 us
Core [24] Cost Time = 53283 us
Core [25] Cost Time = 51504 us
Core [26] Cost Time = 50718 us
Core [27] Cost Time = 51730 us
Total Cost Time = 612933 us

*** spinlock_autotest with this patch ***
Test with lock on 12 cores...
Core [16] Cost Time = 18808 us
Core [17] Cost Time = 29497 us
Core [18] Cost Time = 29132 us
Core [19] Cost Time = 26150 us
Core [20] Cost Time = 21892 us
Core [21] Cost Time = 24377 us
Core [22] Cost Time = 27211 us
Core [23] Cost Time = 11070 us
Core [24] Cost Time = 29802 us
Core [25] Cost Time = 15793 us
Core [26] Cost Time = 7474 us
Core [27] Cost Time = 29550 us
Total Cost Time = 270756 us

In the tests on ThunderX2, with more cores contending, the performance gain
was even higher, indicating the __atomic implementation scales up better
than __sync.
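
The acquire/release scheme described in this message can be tried standalone
with a sketch like the one below. It mirrors the shape of the change but is
not the DPDK code: demo_spinlock_t and the demo_spinlock_* functions are
hypothetical names, and the inner wait loop is left empty where DPDK calls
rte_pause().

```c
#include <assert.h>

/* Hypothetical stand-in for rte_spinlock_t; not the DPDK type. */
typedef struct {
	volatile int locked; /* 0 = free, 1 = held */
} demo_spinlock_t;

static inline void
demo_spinlock_lock(demo_spinlock_t *sl)
{
	int exp = 0;

	/* Acquire ordering on success, relaxed ordering on failure. */
	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
			__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
		/* Wait with relaxed loads until the lock looks free. */
		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
			;
		exp = 0; /* a failed exchange stored the observed value in exp */
	}
}

static inline void
demo_spinlock_unlock(demo_spinlock_t *sl)
{
	/* Release ordering publishes the critical section's writes. */
	__atomic_store_n(&sl->locked, 0, __ATOMIC_RELEASE);
}

static inline int
demo_spinlock_trylock(demo_spinlock_t *sl)
{
	int exp = 0;

	/* Strong exchange (weak == 0), so no spurious failure. */
	return __atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
			__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Single-threaded check of the state transitions only; it does not
 * exercise contention or measure performance. */
static inline void
demo_spinlock_selfcheck(void)
{
	demo_spinlock_t sl = { 0 };

	assert(demo_spinlock_trylock(&sl) == 1); /* free lock is taken */
	assert(demo_spinlock_trylock(&sl) == 0); /* held lock is refused */
	demo_spinlock_unlock(&sl);
	demo_spinlock_lock(&sl);                 /* free again, so no spin */
	assert(sl.locked == 1);
	demo_spinlock_unlock(&sl);
	assert(sl.locked == 0);
}
```

Resetting exp after a failed exchange matters because the builtin overwrites
it with the value it observed; without the reset the next attempt would
compare against 1 instead of 0.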

Fixes: af75078fece3 ("first public release")

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 .../common/include/generic/rte_spinlock.h      | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h b/lib/librte_eal/common/include/generic/rte_spinlock.h
index c4c3fc31e..87ae7a4f1 100644
--- a/lib/librte_eal/common/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -62,7 +62,12 @@ static inline void
 rte_spinlock_lock(rte_spinlock_t *sl)
 {
-	while (__sync_lock_test_and_set(&sl->locked, 1))
-		while(sl->locked)
+	int exp = 0;
+
+	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
+				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
+		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
 			rte_pause();
+		exp = 0;
+	}
 }
 #endif
@@ -81,5 +86,5 @@ static inline void
 rte_spinlock_unlock (rte_spinlock_t *sl)
 {
-	__sync_lock_release(&sl->locked);
+	__atomic_store_n(&sl->locked, 0, __ATOMIC_RELEASE);
 }
 #endif
@@ -100,5 +105,8 @@ static inline int
 rte_spinlock_trylock (rte_spinlock_t *sl)
 {
-	return __sync_lock_test_and_set(&sl->locked,1) == 0;
+	int exp = 0;
+	return __atomic_compare_exchange_n(&sl->locked, &exp, 1,
+				0, /* disallow spurious failure */
+				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
 }
 #endif
@@ -114,5 +122,5 @@ rte_spinlock_trylock (rte_spinlock_t *sl)
 static inline int rte_spinlock_is_locked (rte_spinlock_t *sl)
 {
-	return sl->locked;
+	return __atomic_load_n(&sl->locked, __ATOMIC_ACQUIRE);
 }
 
-- 
2.20.1

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2019-04-16 15:34:27.777024964 +0100
+++ 0058-spinlock-reimplement-with-atomic-one-way-barrier.patch	2019-04-16 15:34:25.236178707 +0100
@@ -1,8 +1,10 @@
-From 453d8f736676255696bce3c5e01b43b543ff896c Mon Sep 17 00:00:00 2001
+From 548edf47ab0869243541b1c855d1e9330b4718a7 Mon Sep 17 00:00:00 2001
 From: Gavin Hu <gavin.hu@arm.com>
 Date: Fri, 8 Mar 2019 15:56:37 +0800
 Subject: [PATCH] spinlock: reimplement with atomic one-way barrier
 
+[ upstream commit 453d8f736676255696bce3c5e01b43b543ff896c ]
+
 The __sync builtin based implementation generates full memory barriers
 ('dmb ish') on Arm platforms. Using C11 atomic builtins to generate one way
 barriers.
@@ -71,7 +73,6 @@
 than __sync.
 
 Fixes: af75078fece3 ("first public release")
-Cc: stable@dpdk.org
 
 Signed-off-by: Gavin Hu <gavin.hu@arm.com>
 Reviewed-by: Phil Yang <phil.yang@arm.com>
