Dekel,
To control the memory order for device memory, I think you should use rte_io_mb() instead of rte_mb(). This will generate the correct result. rte_wmb() is used for system memory.

> -----Original Message-----
> From: Dekel Peled
> Sent: Monday, March 18, 2019 8:58 PM
> To: chaozhu@linux.vnet.ibm.com
> Cc: yskoh@mellanox.com; shahafs@mellanox.com; dev@dpdk.org;
> orika@mellanox.com; thomas@monjalon.net; dekelp@mellanox.com;
> stable@dpdk.org
> Subject: [PATCH] eal/ppc: remove fix of memory barrier for IBM POWER
>
> From previous patch description: "to improve performance on PPC64, use light
> weight sync instruction instead of sync instruction."
>
> Excerpt from IBM doc [1], section "Memory barrier instructions":
> "The second form of the sync instruction is light-weight sync, or lwsync.
> This form is used to control ordering for storage accesses to system memory
> only. It does not create a memory barrier for accesses to device memory."
>
> This patch removes the use of lwsync, so calls to rte_wmb() and
> rte_rmb() will provide correct memory barrier to ensure order of accesses to
> system memory and device memory.
>
> [1] https://www.ibm.com/developerworks/systems/articles/powerpc.html
>
> Fixes: d23a6bd04d72 ("eal/ppc: fix memory barrier for IBM POWER")
> Cc: stable@dpdk.org
>
> Signed-off-by: Dekel Peled
> ---
>  lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h | 8 --------
>  1 file changed, 8 deletions(-)
>
> diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
> b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
> index ce38350..797381c 100644
> --- a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
> @@ -63,11 +63,7 @@
>   * Guarantees that the STORE operations generated before the barrier
>   * occur before the STORE operations generated after.
>   */
> -#ifdef RTE_ARCH_64
> -#define rte_wmb() asm volatile("lwsync" : : : "memory")
> -#else
>  #define rte_wmb() asm volatile("sync" : : : "memory")
> -#endif
>
>  /**
>   * Read memory barrier.
>   *
> @@ -75,11 +71,7 @@
>   * Guarantees that the LOAD operations generated before the barrier
>   * occur before the LOAD operations generated after.
>   */
> -#ifdef RTE_ARCH_64
> -#define rte_rmb() asm volatile("lwsync" : : : "memory")
> -#else
>  #define rte_rmb() asm volatile("sync" : : : "memory")
> -#endif
>
>  #define rte_smp_mb() rte_mb()
>
> --
> 1.8.3.1
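
For anyone following the thread, the distinction being discussed is between ordering stores to host (system) memory and ordering stores to device (MMIO) memory such as a doorbell register. Below is a minimal sketch of the usual descriptor-then-doorbell pattern; the txq structure, its field names and the txq_post_one() helper are hypothetical and are not part of this patch or of any existing driver.

/*
 * Illustrative sketch only (hypothetical structure and helper): a descriptor
 * is stored to system memory, then a doorbell register in device memory is
 * written.  The barrier in between must also order the store to device
 * memory, which is why lwsync (system memory only on POWER) is not enough
 * and why the rte_io_*() barriers are suggested above.
 */
#include <stdint.h>
#include <rte_atomic.h>
#include <rte_io.h>

struct txq {
	volatile uint64_t *descs;  /* descriptor ring in system memory */
	volatile uint32_t *db_reg; /* doorbell register mapped from device memory */
	uint32_t tail;             /* next free descriptor index */
};

static inline void
txq_post_one(struct txq *q, uint64_t desc)
{
	/* Store the descriptor to system memory (no wrap handling, sketch only). */
	q->descs[q->tail++] = desc;

	/*
	 * Make the descriptor store visible before the doorbell store.
	 * rte_io_wmb() orders stores to both system and device memory,
	 * which is the property needed here.
	 */
	rte_io_wmb();

	/* Ring the doorbell (store to device memory); endianness conversion omitted. */
	rte_write32_relaxed(q->tail, q->db_reg);
}

If I read the ppc_64 header correctly, rte_io_wmb() maps to rte_wmb() there, so with this patch applied both expand to sync on POWER; using the rte_io_*() variant mainly documents that device memory is involved and keeps the other architectures correct.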