DPDK patches and discussions
 help / color / mirror / Atom feed
From: Yongseok Koh <yskoh@mellanox.com>
To: adrien.mazarguil@6wind.com, nelio.laranjeiro@6wind.com
Cc: dev@dpdk.org, Yongseok Koh <yskoh@mellanox.com>,
	stable@dpdk.org, Sagi Grimberg <sagi@grimberg.me>,
	Alexander Solganik <solganik@gmail.com>
Subject: [dpdk-dev] [PATCH v2] net/mlx5: fix Tx doorbell memory barrier
Date: Tue, 24 Oct 2017 17:27:25 -0700	[thread overview]
Message-ID: <49c39b44917c35ecaabf06f5f920d0f7e0ed0b6b.1508891141.git.yskoh@mellanox.com> (raw)
In-Reply-To: <20171022080022.13528-1-yskoh@mellanox.com>

Configuring UAR as IO-mapped makes maximum throughput decline by noticeable
amount. If UAR is configured as write-combining register, a write memory
barrier is needed on ringing a doorbell. rte_wmb() is mostly effective when
the size of a burst is comparatively small. Revert the register back to
write-combining and enforce a write memory barrier instead, except for
vectorized Tx burst routines. Application can change it by setting
MLX5_SHUT_UP_BF under its own necessity.

Fixes: 9f9bebae5530 ("net/mlx5: don't map doorbell register to write combining")
Cc: stable@dpdk.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Alexander Solganik <solganik@gmail.com>

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
---

v2:
* Add documentation.
* Rename functions to minimize changes.

 doc/guides/nics/mlx5.rst              | 17 +++++++++++++++++
 drivers/net/mlx5/mlx5.c               |  2 --
 drivers/net/mlx5/mlx5_rxtx.h          | 23 +++++++++++++++++++++--
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  2 +-
 5 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index d24941a22..085d3940c 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -171,6 +171,23 @@ Environment variables
   This is disabled by default since this can also decrease performance for
   unaligned packet sizes.
 
+- ``MLX5_SHUT_UP_BF``
+
+  Configures HW Tx doorbell register as IO-mapped.
+
+  By default, the HW Tx doorbell is configured as a write-combining register.
+  The register would be flushed to HW usually when the write-combining buffer
+  becomes full, but it depends on CPU design.
+
+  Except for vectorized Tx burst routines, a write memory barrier is enforced
+  after updating the register so that the update can be immediately visible to
+  HW.
+
+  When vectorized Tx burst is called, the barrier is set only if the burst size
+  is not aligned to MLX5_VPMD_TX_MAX_BURST. However, setting this environmental
+  variable will bring better latency even though the maximum throughput can
+  slightly decline.
+
 Run-time configuration
 ~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 89fdc134f..fcdcbc367 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1037,8 +1037,6 @@ rte_mlx5_pmd_init(void)
 	 * using this PMD, which is not supported in forked processes.
 	 */
 	setenv("RDMAV_HUGEPAGES_SAFE", "1", 1);
-	/* Don't map UAR to WC if BlueFlame is not used.*/
-	setenv("MLX5_SHUT_UP_BF", "1", 1);
 	/* Match the size of Rx completion entry to the size of a cacheline. */
 	if (RTE_CACHE_LINE_SIZE == 128)
 		setenv("MLX5_CQE_SIZE", "128", 0);
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index ea037427b..d34f3cc04 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -578,15 +578,18 @@ mlx5_tx_mb2mr(struct mlx5_txq_data *txq, struct rte_mbuf *mb)
 }
 
 /**
- * Ring TX queue doorbell.
+ * Ring TX queue doorbell and flush the update if requested.
  *
  * @param txq
  *   Pointer to TX queue structure.
  * @param wqe
  *   Pointer to the last WQE posted in the NIC.
+ * @param cond
+ *   Request for write memory barrier after BlueFlame update.
  */
 static __rte_always_inline void
-mlx5_tx_dbrec(struct mlx5_txq_data *txq, volatile struct mlx5_wqe *wqe)
+mlx5_tx_dbrec_cond_wmb(struct mlx5_txq_data *txq, volatile struct mlx5_wqe *wqe,
+		       int cond)
 {
 	uint64_t *dst = (uint64_t *)((uintptr_t)txq->bf_reg);
 	volatile uint64_t *src = ((volatile uint64_t *)wqe);
@@ -596,6 +599,22 @@ mlx5_tx_dbrec(struct mlx5_txq_data *txq, volatile struct mlx5_wqe *wqe)
 	/* Ensure ordering between DB record and BF copy. */
 	rte_wmb();
 	*dst = *src;
+	if (cond)
+		rte_wmb();
+}
+
+/**
+ * Ring TX queue doorbell and flush the update by write memory barrier.
+ *
+ * @param txq
+ *   Pointer to TX queue structure.
+ * @param wqe
+ *   Pointer to the last WQE posted in the NIC.
+ */
+static __rte_always_inline void
+mlx5_tx_dbrec(struct mlx5_txq_data *txq, volatile struct mlx5_wqe *wqe)
+{
+	mlx5_tx_dbrec_cond_wmb(txq, wqe, 1);
 }
 
 #endif /* RTE_PMD_MLX5_RXTX_H_ */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 4cb7f2889..61f5bc45b 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -345,7 +345,7 @@ txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
 	txq->wqe_ci += (nb_dword_in_hdr + pkts_n + (nb_dword_per_wqebb - 1)) /
 		       nb_dword_per_wqebb;
 	/* Ring QP doorbell. */
-	mlx5_tx_dbrec(txq, wqe);
+	mlx5_tx_dbrec_cond_wmb(txq, wqe, pkts_n < MLX5_VPMD_TX_MAX_BURST);
 	return pkts_n;
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index e9819b762..a53027d84 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -344,7 +344,7 @@ txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
 	txq->wqe_ci += (nb_dword_in_hdr + pkts_n + (nb_dword_per_wqebb - 1)) /
 		       nb_dword_per_wqebb;
 	/* Ring QP doorbell. */
-	mlx5_tx_dbrec(txq, wqe);
+	mlx5_tx_dbrec_cond_wmb(txq, wqe, pkts_n < MLX5_VPMD_TX_MAX_BURST);
 	return pkts_n;
 }
 
-- 
2.11.0

  parent reply	other threads:[~2017-10-25  0:27 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-22  8:00 [dpdk-dev] [PATCH] " Yongseok Koh
2017-10-22  9:46 ` Sagi Grimberg
2017-10-22 22:01   ` Yongseok Koh
2017-10-23  7:50     ` Nélio Laranjeiro
2017-10-23 17:24       ` Yongseok Koh
2017-10-24  6:49         ` Nélio Laranjeiro
2017-10-23  7:00 ` Nélio Laranjeiro
2017-10-25  0:27 ` Yongseok Koh [this message]
2017-10-25  9:19   ` [dpdk-dev] [PATCH v2] " Nélio Laranjeiro
2017-10-25 21:34     ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49c39b44917c35ecaabf06f5f920d0f7e0ed0b6b.1508891141.git.yskoh@mellanox.com \
    --to=yskoh@mellanox.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=dev@dpdk.org \
    --cc=nelio.laranjeiro@6wind.com \
    --cc=sagi@grimberg.me \
    --cc=solganik@gmail.com \
    --cc=stable@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).