patches for DPDK stable branches
 help / color / Atom feed
* [dpdk-stable] [PATCH RFC v1 1/6] net/mlx5: relax the barrier for UAR write
       [not found] <20200213123854.203566-1-gavin.hu@arm.com>
@ 2020-02-13 12:38 ` Gavin Hu
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 2/6] net/mlx5: use cio barrier before the BF WQE Gavin Hu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 5+ messages in thread
From: Gavin Hu @ 2020-02-13 12:38 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, rasland, drc, bruce.richardson,
	konstantin.ananyev, matan, shahafs, viacheslavo, jerinj,
	Honnappa.Nagarahalli, ruifeng.wang, phil.yang, joyce.kong,
	steve.capper, stable

The UAR is part of PCI address space that is mapped for direct access to
the HCA from the CPU. Read-Write accesses to this space are strongly
ordered thus a compiler barrier is sufficient for all arches.

This patch set is based on the following aarch64 architecural facts:
1. The PCI BAR space is mapped as nGnRE device memory, not cachable nor
write-combine.
2. io accesses to a single device is total ordered.

Fixes: 6bf10ab69be0 ("net/mlx5: support 32-bit systems")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 drivers/net/mlx5/mlx5_rxtx.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 939778aa5..50b3cc3c9 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -546,7 +546,7 @@ __mlx5_uar_write64_relaxed(uint64_t val, void *addr,
 static __rte_always_inline void
 __mlx5_uar_write64(uint64_t val, void *addr, rte_spinlock_t *lock)
 {
-	rte_io_wmb();
+	rte_compiler_barrier();
 	__mlx5_uar_write64_relaxed(val, addr, lock);
 }
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-stable] [PATCH RFC v1 2/6] net/mlx5: use cio barrier before the BF WQE
       [not found] <20200213123854.203566-1-gavin.hu@arm.com>
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 1/6] net/mlx5: relax the barrier for UAR write Gavin Hu
@ 2020-02-13 12:38 ` Gavin Hu
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 3/6] net/mlx5: add missing barrier Gavin Hu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 5+ messages in thread
From: Gavin Hu @ 2020-02-13 12:38 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, rasland, drc, bruce.richardson,
	konstantin.ananyev, matan, shahafs, viacheslavo, jerinj,
	Honnappa.Nagarahalli, ruifeng.wang, phil.yang, joyce.kong,
	steve.capper, stable

To ensure the WQE and doorbell record, which reside in the host memory,
are visible to HW before the blue frame, a CIO barrier is sufficient, a
rte_wmb is overkill.

Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 drivers/net/mlx5/mlx5_rxtx.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 50b3cc3c9..c672af4c4 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -658,7 +658,7 @@ mlx5_tx_dbrec_cond_wmb(struct mlx5_txq_data *txq, volatile struct mlx5_wqe *wqe,
 	rte_cio_wmb();
 	*txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
 	/* Ensure ordering between DB record and BF copy. */
-	rte_wmb();
+	rte_cio_wmb();
 	mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
 	if (cond)
 		rte_wmb();
-- 
2.17.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-stable] [PATCH RFC v1 3/6] net/mlx5: add missing barrier
       [not found] <20200213123854.203566-1-gavin.hu@arm.com>
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 1/6] net/mlx5: relax the barrier for UAR write Gavin Hu
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 2/6] net/mlx5: use cio barrier before the BF WQE Gavin Hu
@ 2020-02-13 12:38 ` Gavin Hu
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 5/6] net/mlx5: non-cacheable mapping defaulted for aarch64 Gavin Hu
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 6/6] net/mlx5: relaxed ordering for multi-packet RQ buffer refcnt Gavin Hu
  4 siblings, 0 replies; 5+ messages in thread
From: Gavin Hu @ 2020-02-13 12:38 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, rasland, drc, bruce.richardson,
	konstantin.ananyev, matan, shahafs, viacheslavo, jerinj,
	Honnappa.Nagarahalli, ruifeng.wang, phil.yang, joyce.kong,
	steve.capper, stable

To keep order of the modification of RX queue descriptor(rxq->cq_db) and
the CQ doorbell register, a rte_cio_wmb barrier is required.

The situation was rescued by the stronger than required barrier in the
mlx5_uar_write64, it becomes a must when the barrier is relaxed.

Fixes: 6bf10ab69be0 ("net/mlx5: support 32-bit systems")
Cc: stable@dpdk.org

Suggested-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 drivers/net/mlx5/mlx5_rxq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index dc0fd8211..2d1b153a3 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -856,7 +856,8 @@ mlx5_arm_cq(struct mlx5_rxq_data *rxq, int sq_n_rxq)
 	doorbell = (uint64_t)doorbell_hi << 32;
 	doorbell |=  rxq->cqn;
 	rxq->cq_db[MLX5_CQ_ARM_DB] = rte_cpu_to_be_32(doorbell_hi);
-	mlx5_uar_write64(rte_cpu_to_be_64(doorbell),
+	rte_cio_wmb();
+	mlx5_uar_write64_relaxed(rte_cpu_to_be_64(doorbell),
 			 cq_db_reg, rxq->uar_lock_cq);
 }
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-stable] [PATCH RFC v1 5/6] net/mlx5: non-cacheable mapping defaulted for aarch64
       [not found] <20200213123854.203566-1-gavin.hu@arm.com>
                   ` (2 preceding siblings ...)
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 3/6] net/mlx5: add missing barrier Gavin Hu
@ 2020-02-13 12:38 ` Gavin Hu
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 6/6] net/mlx5: relaxed ordering for multi-packet RQ buffer refcnt Gavin Hu
  4 siblings, 0 replies; 5+ messages in thread
From: Gavin Hu @ 2020-02-13 12:38 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, rasland, drc, bruce.richardson,
	konstantin.ananyev, matan, shahafs, viacheslavo, jerinj,
	Honnappa.Nagarahalli, ruifeng.wang, phil.yang, joyce.kong,
	steve.capper, stable

aarch64 does not map pci resources to 'write-combine' nor
cacheable. In Linux Kernel arch_can_pci_mmap_wc() equals to 0 on
aarch64[1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
tree/drivers/pci/pci-sysfs.c?h=v5.4#n1204

Fixes: f078ceb6ae93 ("net/mlx5: fix Tx doorbell write memory barrier")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 drivers/net/mlx5/mlx5_txq.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index bc13abfe6..144bab4a6 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -319,7 +319,11 @@ txq_uar_ncattr_init(struct mlx5_txq_ctrl *txq_ctrl, size_t page_size)
 	off_t cmd;
 
 	txq_ctrl->txq.db_heu = priv->config.dbnc == MLX5_TXDB_HEURISTIC;
+#ifdef RTE_ARCH_ARM64
+	txq_ctrl->txq.db_nc = 1;
+#else
 	txq_ctrl->txq.db_nc = 0;
+#endif
 	/* Check the doorbell register mapping type. */
 	cmd = txq_ctrl->uar_mmap_offset / page_size;
 	cmd >>= MLX5_UAR_MMAP_CMD_SHIFT;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-stable] [PATCH RFC v1 6/6] net/mlx5: relaxed ordering for multi-packet RQ buffer refcnt
       [not found] <20200213123854.203566-1-gavin.hu@arm.com>
                   ` (3 preceding siblings ...)
  2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 5/6] net/mlx5: non-cacheable mapping defaulted for aarch64 Gavin Hu
@ 2020-02-13 12:38 ` Gavin Hu
  4 siblings, 0 replies; 5+ messages in thread
From: Gavin Hu @ 2020-02-13 12:38 UTC (permalink / raw)
  To: dev
  Cc: nd, Phil Yang, david.marchand, thomas, rasland, drc,
	bruce.richardson, konstantin.ananyev, matan, shahafs,
	viacheslavo, jerinj, Honnappa.Nagarahalli, ruifeng.wang,
	joyce.kong, steve.capper, stable

From: Phil Yang <phil.yang@arm.com>

PMD Rx queue descriptor contains two mlx5_mprq_buf fields, which
are the multi-packet RQ buffer header pointers. It uses the common
rte_atomic_XXX functions to make sure the refcnt access is atomic.

The common rte_atomic_XXX functions are full barriers on aarch64.
Optimized it with one-way barrier to improve performance.

Fixes: 7d6bf6b866b8 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org

Suggested-by: Gavin Hu <gavin.hu@arm.com>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
 drivers/net/mlx5/mlx5_rxq.c  |  2 +-
 drivers/net/mlx5/mlx5_rxtx.c | 16 +++++++++-------
 drivers/net/mlx5/mlx5_rxtx.h |  2 +-
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 2d1b153a3..765bb1af5 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1535,7 +1535,7 @@ mlx5_mprq_buf_init(struct rte_mempool *mp, void *opaque_arg,
 
 	memset(_m, 0, sizeof(*buf));
 	buf->mp = mp;
-	rte_atomic16_set(&buf->refcnt, 1);
+	__atomic_store_n(&buf->refcnt, 1, __ATOMIC_RELAXED);
 	for (j = 0; j != strd_n; ++j) {
 		shinfo = &buf->shinfos[j];
 		shinfo->free_cb = mlx5_mprq_buf_free_cb;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 5eea932d4..0e7519c56 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1592,10 +1592,11 @@ mlx5_mprq_buf_free_cb(void *addr __rte_unused, void *opaque)
 {
 	struct mlx5_mprq_buf *buf = opaque;
 
-	if (rte_atomic16_read(&buf->refcnt) == 1) {
+	if (__atomic_load_n(&buf->refcnt, __ATOMIC_RELAXED) == 1) {
 		rte_mempool_put(buf->mp, buf);
-	} else if (rte_atomic16_add_return(&buf->refcnt, -1) == 0) {
-		rte_atomic16_set(&buf->refcnt, 1);
+	} else if (unlikely(__atomic_sub_fetch(&buf->refcnt, 1,
+						__ATOMIC_RELAXED) == 0)) {
+		__atomic_store_n(&buf->refcnt, 1, __ATOMIC_RELAXED);
 		rte_mempool_put(buf->mp, buf);
 	}
 }
@@ -1676,7 +1677,8 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 
 		if (consumed_strd == strd_n) {
 			/* Replace WQE only if the buffer is still in use. */
-			if (rte_atomic16_read(&buf->refcnt) > 1) {
+			if (__atomic_load_n(&buf->refcnt,
+					    __ATOMIC_RELAXED) > 1) {
 				mprq_buf_replace(rxq, rq_ci & wq_mask, strd_n);
 				/* Release the old buffer. */
 				mlx5_mprq_buf_free(buf);
@@ -1766,9 +1768,9 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			void *buf_addr;
 
 			/* Increment the refcnt of the whole chunk. */
-			rte_atomic16_add_return(&buf->refcnt, 1);
-			MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf->refcnt) <=
-				    strd_n + 1);
+			__atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_ACQUIRE);
+			MLX5_ASSERT(__atomic_load_n(&buf->refcnt,
+					__ATOMIC_RELAXED) <= strd_n + 1);
 			buf_addr = RTE_PTR_SUB(addr, headroom_sz);
 			/*
 			 * MLX5 device doesn't use iova but it is necessary in a
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index d32c4e430..1f453fe09 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -78,7 +78,7 @@ struct rxq_zip {
 /* Multi-Packet RQ buffer header. */
 struct mlx5_mprq_buf {
 	struct rte_mempool *mp;
-	rte_atomic16_t refcnt; /* Atomically accessed refcnt. */
+	uint16_t refcnt; /* Atomically accessed refcnt. */
 	uint8_t pad[RTE_PKTMBUF_HEADROOM]; /* Headroom for the first packet. */
 	struct rte_mbuf_ext_shared_info shinfos[];
 	/*
-- 
2.17.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, back to index

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20200213123854.203566-1-gavin.hu@arm.com>
2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 1/6] net/mlx5: relax the barrier for UAR write Gavin Hu
2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 2/6] net/mlx5: use cio barrier before the BF WQE Gavin Hu
2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 3/6] net/mlx5: add missing barrier Gavin Hu
2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 5/6] net/mlx5: non-cacheable mapping defaulted for aarch64 Gavin Hu
2020-02-13 12:38 ` [dpdk-stable] [PATCH RFC v1 6/6] net/mlx5: relaxed ordering for multi-packet RQ buffer refcnt Gavin Hu

patches for DPDK stable branches

Archives are clonable:
	git clone --mirror http://inbox.dpdk.org/stable/0 stable/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 stable stable/ http://inbox.dpdk.org/stable \
		stable@dpdk.org
	public-inbox-index stable


Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.stable


AGPL code for this site: git clone https://public-inbox.org/ public-inbox