patches for DPDK stable branches
From: luca.boccassi@gmail.com
To: Konstantin Ananyev <konstantin.ananyev@intel.com>
Cc: Bruce Richardson <bruce.richardson@intel.com>,
	dpdk stable <stable@dpdk.org>
Subject: [dpdk-stable] patch 'eal/x86: use lock-prefixed instructions for SMP barrier' has been queued to LTS release 16.11.5
Date: Wed,  7 Feb 2018 16:46:33 +0000
Message-ID: <20180207164705.29052-2-luca.boccassi@gmail.com>
In-Reply-To: <20180207164705.29052-1-luca.boccassi@gmail.com>

Hi,

FYI, your patch has been queued to LTS release 16.11.5

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 02/09/18. So please
shout if anyone has objections.

Thanks.

Luca Boccassi

---
From 4b93724891bb8e1ee7c5c2e5e269728416595320 Mon Sep 17 00:00:00 2001
From: Konstantin Ananyev <konstantin.ananyev@intel.com>
Date: Mon, 15 Jan 2018 15:09:31 +0000
Subject: [PATCH] eal/x86: use lock-prefixed instructions for SMP barrier

[ upstream commit 096ffd811fe21d652e51f07a7859967ffaabc72c ]

On x86 it is possible to use lock-prefixed instructions to get
an effect similar to mfence.
As pointed out by the Java guys, on most modern HW that gives better
performance than using mfence:
https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
This patch adopts that technique for the rte_smp_mb() implementation.
On BDW 2.2, mb_autotest on a single lcore reports a 2X cycle reduction,
i.e. from ~110 to ~55 cycles per operation.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 .../common/include/arch/x86/rte_atomic.h           | 44 +++++++++++++++++++++-
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
index 00b1cdf5d..d12b679a3 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
@@ -55,12 +55,52 @@ extern "C" {
 
 #define	rte_rmb() _mm_lfence()
 
-#define rte_smp_mb() rte_mb()
-
 #define rte_smp_wmb() rte_compiler_barrier()
 
 #define rte_smp_rmb() rte_compiler_barrier()
 
+/*
+ * From Intel Software Development Manual; Vol 3;
+ * 8.2.2 Memory Ordering in P6 and More Recent Processor Families:
+ * ...
+ * . Reads are not reordered with other reads.
+ * . Writes are not reordered with older reads.
+ * . Writes to memory are not reordered with other writes,
+ *   with the following exceptions:
+ *   . streaming stores (writes) executed with the non-temporal move
+ *     instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD); and
+ *   . string operations (see Section 8.2.4.1).
+ *  ...
+ * . Reads may be reordered with older writes to different locations but not
+ * with older writes to the same location.
+ * . Reads or writes cannot be reordered with I/O instructions,
+ * locked instructions, or serializing instructions.
+ * . Reads cannot pass earlier LFENCE and MFENCE instructions.
+ * . Writes ... cannot pass earlier LFENCE, SFENCE, and MFENCE instructions.
+ * . LFENCE instructions cannot pass earlier reads.
+ * . SFENCE instructions cannot pass earlier writes ...
+ * . MFENCE instructions cannot pass earlier reads, writes ...
+ *
+ * As pointed out by Java guys, that makes it possible to use lock-prefixed
+ * instructions to get the same effect as mfence and on most modern HW
+ * that gives better performance than using mfence:
+ * https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
+ * Basic idea is to use a lock-prefixed add with some dummy memory location
+ * as the destination. From their experiments 128B (2 cache lines) below
+ * the current stack pointer looks like a good candidate.
+ * So below we use that technique for the rte_smp_mb() implementation.
+ */
+
+static inline void __attribute__((always_inline))
+rte_smp_mb(void)
+{
+#ifdef RTE_ARCH_I686
+	asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
+#else
+	asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
+#endif
+}
+
 /*------------------------- 16 bit atomic operations -------------------------*/
 
 #ifndef RTE_FORCE_INTRINSICS
-- 
2.14.2
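
For readers who want to see where such a barrier matters, here is a minimal
standalone sketch, not part of the patch. It pairs the lock-prefixed add with
a store-then-load handshake, the one case where x86 may reorder ("Reads may
be reordered with older writes to different locations"). The names
sketch_smp_mb() and sketch_try_enter() are hypothetical and do not exist in
DPDK. The non-x86 fallback to a compiler builtin is an assumption added only
so the sketch builds elsewhere, and GCC or Clang is assumed for the inline
asm syntax.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the patched rte_smp_mb(): a full SMP barrier
 * built from a lock-prefixed add of 0 to a dummy stack location.
 * Adding 0 leaves the memory contents unchanged.
 */
static inline void
sketch_smp_mb(void)
{
#if defined(__x86_64__)
	__asm__ volatile("lock addl $0, -128(%%rsp)" ::: "memory", "cc");
#elif defined(__i386__)
	__asm__ volatile("lock addl $0, -128(%%esp)" ::: "memory", "cc");
#else
	/* Assumption: generic fallback so the sketch compiles off x86. */
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/*
 * Store-then-load handshake: announce interest by writing our flag, then
 * check the peer's flag. Without a full barrier between the two, the load
 * could be satisfied before the store becomes visible to the other core.
 * Returns 1 if the peer flag was clear, 0 (after backing off) otherwise.
 */
static inline int
sketch_try_enter(volatile int *my_flag, const volatile int *other_flag)
{
	*my_flag = 1;		/* older write */
	sketch_smp_mb();	/* keep the following read behind the write */
	if (*other_flag != 0) {
		*my_flag = 0;	/* peer got there first: back off */
		return 0;
	}
	return 1;
}
```

With two lcores each calling sketch_try_enter() on swapped flag pointers, the
barrier is what should prevent both from returning 1 at once. A
single-threaded call only exercises the control flow, not the ordering.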
