* [dpdk-dev] [PATCH] net/mlx5: fix sync on Tx doorbell for PPC64
From: Dekel Peled @ 2019-03-18 6:42 UTC
To: yskoh, shahafs; +Cc: dev, orika, dekelp, stable
In lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h, rte_mb() is
defined as asm "sync" and rte_wmb() is defined as asm "lwsync".

mlx5_tx_dbrec_cond_wmb() uses rte_wmb() to ensure ordering between the
doorbell (DB) record update and the BlueFlame (BF) copy.

POWER9 (P9) processors do not have a strongly ordered memory model, so
this memory barrier is not strict enough there and rte_mb() has to be
used. x86 processors do have a strongly ordered memory model, and using
rte_mb() instead of rte_wmb() there causes up to a ~10% performance hit.

This patch adds mlx5_arch_specific_mb(), defined as rte_mb() for PPC64
and as rte_wmb() for all other processors. mlx5_tx_dbrec_cond_wmb() now
uses mlx5_arch_specific_mb() to guarantee that the data is valid on any
processor architecture.

Original work by Yongseok Koh.
Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
drivers/net/mlx5/mlx5_rxtx.h | 2 +-
drivers/net/mlx5/mlx5_utils.h | 9 +++++++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 53115dd..df51589 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -707,7 +707,7 @@ uint32_t mlx5_tx_update_ext_mp(struct mlx5_txq_data *txq, uintptr_t addr,
 	rte_cio_wmb();
 	*txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
 	/* Ensure ordering between DB record and BF copy. */
-	rte_wmb();
+	mlx5_arch_specific_mb();
 	mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
 	if (cond)
 		rte_wmb();
diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
index 97092c7..6742271 100644
--- a/drivers/net/mlx5/mlx5_utils.h
+++ b/drivers/net/mlx5/mlx5_utils.h
@@ -25,6 +25,15 @@
 #define bool _Bool
 #endif
 
+/*
+ * Define strict memory-barrier for PPC64.
+ */
+#if defined(__PPC64__)
+#define mlx5_arch_specific_mb() rte_mb()
+#else
+#define mlx5_arch_specific_mb() rte_wmb()
+#endif
+
 /* Bit-field manipulation. */
 #define BITFIELD_DECLARE(bf, type, size) \
 	type bf[(((size_t)(size) / (sizeof(type) * CHAR_BIT)) + \
--
1.8.3.1
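
For readers without a DPDK tree at hand, the barrier selection in this patch
can be tried out with the standalone sketch below. It is only an illustration
under stated assumptions: every sketch_* name is hypothetical, the non-PPC64
barriers are compiler fence builtins standing in for rte_mb()/rte_wmb() rather
than DPDK's real definitions, the byte-order conversion and the preceding
rte_cio_wmb() are omitted, and plain memory stands in for the BlueFlame
register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-ins for rte_mb()/rte_wmb(). These are assumptions for the sketch,
 * not DPDK's definitions, except that the PPC64 branch uses the "sync" and
 * "lwsync" instructions named in the commit message.
 */
#if defined(__PPC64__)
#define sketch_mb()  __asm__ volatile("sync" : : : "memory")
#define sketch_wmb() __asm__ volatile("lwsync" : : : "memory")
#else
#define sketch_mb()  __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define sketch_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

/* Same selection rule as mlx5_arch_specific_mb() in the patch. */
#if defined(__PPC64__)
#define sketch_arch_specific_mb() sketch_mb()
#else
#define sketch_arch_specific_mb() sketch_wmb()
#endif

/* Hypothetical, reduced Tx queue: only the fields the sketch needs. */
struct sketch_txq {
	volatile uint32_t *db_rec; /* doorbell record in host memory */
	volatile uint64_t *bf_reg; /* stand-in for the BlueFlame register */
	uint32_t wqe_ci;           /* producer index to publish */
};

static inline void
sketch_ring_doorbell(struct sketch_txq *txq, uint64_t wqe, int cond)
{
	assert(txq != NULL && txq->db_rec != NULL && txq->bf_reg != NULL);
	/* Publish the producer index (byte-order conversion omitted here). */
	*txq->db_rec = txq->wqe_ci;
	/* Ensure ordering between DB record and BF copy. */
	sketch_arch_specific_mb();
	/* Stand-in for the BF copy done by mlx5_uar_write64_relaxed(). */
	*txq->bf_reg = wqe;
	if (cond)
		sketch_wmb();
}
```

On a PPC64 build the middle barrier expands to "sync"; on any other target it
expands to the lighter write barrier, which is the trade-off the commit message
describes. The sketch only shows where the barrier sits in the sequence; it
cannot demonstrate the reordering itself, which needs real device memory.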