From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dekelp@mellanox.com>
Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129])
 by dpdk.org (Postfix) with ESMTP id 2A01A1DBD
 for <dev@dpdk.org>; Mon, 18 Mar 2019 07:43:26 +0100 (CET)
Received: from Internal Mail-Server by MTLPINE1 (envelope-from
 dekelp@mellanox.com)
 with ESMTPS (AES256-SHA encrypted); 18 Mar 2019 08:43:25 +0200
Received: from mtl-vdi-280.wap.labs.mlnx. (mtl-vdi-280.wap.labs.mlnx
 [10.128.130.87])
 by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id x2I6hPe1019876;
 Mon, 18 Mar 2019 08:43:25 +0200
From: Dekel Peled <dekelp@mellanox.com>
To: yskoh@mellanox.com, shahafs@mellanox.com
Cc: dev@dpdk.org, orika@mellanox.com, dekelp@mellanox.com, stable@dpdk.org
Date: Mon, 18 Mar 2019 08:42:35 +0200
Message-Id: <1552891355-32094-1-git-send-email-dekelp@mellanox.com>
X-Mailer: git-send-email 1.7.1
Subject: [dpdk-dev] [PATCH] net/mlx5: fix sync on Tx doorbell for PPC64
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Mon, 18 Mar 2019 06:43:26 -0000

In lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h,
rte_mb() is defined as the "sync" instruction and
rte_wmb() is defined as the "lwsync" instruction.

mlx5_tx_dbrec_cond_wmb() uses rte_wmb() to ensure ordering between
the doorbell (DB) record update and the BlueFlame (BF) copy.
On POWER9 (P9), which does not have a strongly-ordered memory model,
this barrier is not strict enough, so rte_mb() has to be used.
On x86, which has a strongly-ordered memory model, using rte_mb()
instead of rte_wmb() costs up to ~10% in performance.

This patch adds mlx5_arch_specific_mb(), defined as rte_mb() on PPC64
and as rte_wmb() on all other architectures.
mlx5_tx_dbrec_cond_wmb() now calls mlx5_arch_specific_mb(), so the
required ordering is guaranteed on every processor architecture
without penalizing x86.

Original work by Yongseok Koh.

Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
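Note for reviewers: the barrier selection described in the commit
message can be tried outside of DPDK with the stand-alone sketch below.
It is only an illustration under stated assumptions, not the driver
code. All sketch_* names are hypothetical, the non-PPC64 fences are
compiler builtins standing in for rte_mb()/rte_wmb(), and the DB record
and BF register are modelled as ordinary memory rather than device
memory (the byte swap done by rte_cpu_to_be_32() is also left out).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_mb()/rte_wmb(); not the DPDK macros. */
#if defined(__PPC64__)
#define sketch_mb()  __asm__ volatile ("sync" ::: "memory")
#define sketch_wmb() __asm__ volatile ("lwsync" ::: "memory")
#else
/* Assumption: compiler fences are close enough for illustration. */
#define sketch_mb()  __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define sketch_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

/* Same selection as the patch: full barrier on PPC64 only. */
#if defined(__PPC64__)
#define sketch_arch_specific_mb() sketch_mb()
#else
#define sketch_arch_specific_mb() sketch_wmb()
#endif

/* Hypothetical Tx queue; both "registers" are plain memory here. */
struct sketch_txq {
	volatile uint32_t db_record;   /* models *txq->qp_db */
	volatile uint64_t bf_register; /* models the BF copy target */
	uint32_t wqe_ci;               /* WQE consumer index to publish */
};

/* Update the DB record, order it, then perform the BF copy. */
static inline void
sketch_ring_doorbell(struct sketch_txq *txq, uint64_t wqe)
{
	assert(txq != NULL);
	txq->db_record = txq->wqe_ci;
	/* Ensure ordering between DB record and BF copy. */
	sketch_arch_specific_mb();
	txq->bf_register = wqe;
}
```

Keeping the choice in a single macro means only the PPC64 build pays
for the heavier "sync"; other architectures keep the write barrier.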
 drivers/net/mlx5/mlx5_rxtx.h  | 2 +-
 drivers/net/mlx5/mlx5_utils.h | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 53115dd..df51589 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -707,7 +707,7 @@ uint32_t mlx5_tx_update_ext_mp(struct mlx5_txq_data *txq, uintptr_t addr,
 	rte_cio_wmb();
 	*txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
 	/* Ensure ordering between DB record and BF copy. */
-	rte_wmb();
+	mlx5_arch_specific_mb();
 	mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
 	if (cond)
 		rte_wmb();
diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
index 97092c7..6742271 100644
--- a/drivers/net/mlx5/mlx5_utils.h
+++ b/drivers/net/mlx5/mlx5_utils.h
@@ -25,6 +25,15 @@
 #define bool _Bool
 #endif
 
+/*
+ * Define strict memory-barrier for PPC64.
+ */
+#if defined(__PPC64__)
+#define mlx5_arch_specific_mb() rte_mb()
+#else
+#define mlx5_arch_specific_mb() rte_wmb()
+#endif
+
 /* Bit-field manipulation. */
 #define BITFIELD_DECLARE(bf, type, size) \
 	type bf[(((size_t)(size) / (sizeof(type) * CHAR_BIT)) + \
-- 
1.8.3.1
