patches for DPDK stable branches
* [PATCH] net/mlx5: fix data access race condition for shared Rx queue
@ 2024-07-05 13:05 Jiawei Wang
  2024-07-18  7:22 ` Raslan Darawsheh
  0 siblings, 1 reply; 2+ messages in thread
From: Jiawei Wang @ 2024-07-05 13:05 UTC (permalink / raw)
  To: bingz, viacheslavo, Dariusz Sosnowski, Ori Kam, Suanming Mou,
	Matan Azrad, Alexander Kozyrev
  Cc: dev, rasland, stable

The rxq_data resources are shared between Rx queues that have the same
shared group and queue ID.
The 24-bit cq_ci field of rxq_data is packed into the same 32-bit
storage unit as other bitfields, such as dynf_meta and delay_drop:

  32bit:  xxxx xxxI IIII IIII IIII IIII IIII IIIx
                  ^ .... .... .... .... ...^
                  |          cq_ci         |

The issue is that while the control thread updates the dynf_meta:1 or
delay_drop:1 bits during port start, a data thread may update cq_ci at
the same time. Both updates are read-modify-write accesses to the same
32-bit word, so the threads race with each other, and the cq_ci value
may be overwritten, pushing an abnormal value to the HW CQ doorbell.
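
Below is a rough, self-contained sketch of the race (simplified field
subset and hypothetical function names, not the actual driver code).
Both accesses are compiled as a load, modify and store of the same
32-bit word:

  /* Illustrative sketch only; simplified subset of mlx5_rxq_data. */
  struct rxq_word_sketch {
          unsigned int dynf_meta:1;  /* control thread, at port start */
          unsigned int delay_drop:1; /* control thread, at port start */
          unsigned int cq_ci:24;     /* data thread, per Rx burst     */
  };

  /* Control thread: read the word, set one bit, write it back. */
  void control_path(struct rxq_word_sketch *rxq)
  {
          rxq->dynf_meta = 1;
  }

  /* Data thread: read the word, bump the 24-bit counter, write it back. */
  void data_path(struct rxq_word_sketch *rxq)
  {
          rxq->cq_ci++;
  }

If control_path() and data_path() run concurrently on the same queue,
whichever store lands last silently discards the other thread's change.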

This patch moves cq_ci out of the word that holds the configuration
bitfields, and skips the delay_drop and dynf_meta updates when the
shared Rx queue is already started.
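
After the change, the 24-bit counter gets its own 32-bit storage unit,
separated from the configuration bits by non-bitfield members, so the
data-path increment no longer touches the word holding the flags.
Roughly (a simplified sketch; the real layout is in the mlx5_rx.h hunk
below):

  #include <stdint.h>

  /* Sketch of the layout after the fix; simplified. */
  struct rxq_word_sketch_fixed {
          unsigned int dynf_meta:1;  /* configuration word            */
          unsigned int delay_drop:1; /* configuration word            */
          uint16_t port_id;          /* non-bitfield members between  */
          uint32_t rq_ci;            /* the flags and the counter     */
          uint32_t cq_ci:24;         /* starts a new storage unit     */
  };

In addition, the mlx5_devx.c and mlx5_flow.c hunks below only write
delay_drop, dynf_meta and mark_flag when the queue is not shared or the
shared Rx queue has not been started yet, so a queue attaching to an
already running shared group leaves the live configuration word alone.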

Fixes: 02a6195cbe ("net/mlx5: support enhanced CQE compression in Rx burst")
Cc: stable@dpdk.org

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_devx.c |  3 ++-
 drivers/net/mlx5/mlx5_flow.c | 24 +++++++++++++-----------
 drivers/net/mlx5/mlx5_rx.h   |  4 ++--
 3 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index 7db271acb4..8ebe784000 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -684,7 +684,8 @@ mlx5_rxq_devx_obj_new(struct mlx5_rxq_priv *rxq)
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
-	rxq_data->delay_drop = priv->config.std_delay_drop;
+	if (!rxq_data->shared || !rxq_ctrl->started)
+		rxq_data->delay_drop = priv->config.std_delay_drop;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret) {
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 5b3f2b9119..72fb3a55ba 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -1853,18 +1853,20 @@ mlx5_flow_rxq_dynf_set(struct rte_eth_dev *dev)
 		if (rxq == NULL || rxq->ctrl == NULL)
 			continue;
 		data = &rxq->ctrl->rxq;
-		if (!rte_flow_dynf_metadata_avail()) {
-			data->dynf_meta = 0;
-			data->flow_meta_mask = 0;
-			data->flow_meta_offset = -1;
-			data->flow_meta_port_mask = 0;
-		} else {
-			data->dynf_meta = 1;
-			data->flow_meta_mask = rte_flow_dynf_metadata_mask;
-			data->flow_meta_offset = rte_flow_dynf_metadata_offs;
-			data->flow_meta_port_mask = priv->sh->dv_meta_mask;
+		if (!data->shared || !rxq->ctrl->started) {
+			if (!rte_flow_dynf_metadata_avail()) {
+				data->dynf_meta = 0;
+				data->flow_meta_mask = 0;
+				data->flow_meta_offset = -1;
+				data->flow_meta_port_mask = 0;
+			} else {
+				data->dynf_meta = 1;
+				data->flow_meta_mask = rte_flow_dynf_metadata_mask;
+				data->flow_meta_offset = rte_flow_dynf_metadata_offs;
+				data->flow_meta_port_mask = priv->sh->dv_meta_mask;
+			}
+			data->mark_flag = mark_flag;
 		}
-		data->mark_flag = mark_flag;
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index 1485556d89..7d144921ab 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -101,14 +101,14 @@ struct __rte_cache_aligned mlx5_rxq_data {
 	unsigned int shared:1; /* Shared RXQ. */
 	unsigned int delay_drop:1; /* Enable delay drop. */
 	unsigned int cqe_comp_layout:1; /* CQE Compression Layout*/
-	unsigned int cq_ci:24;
+	uint16_t port_id;
 	volatile uint32_t *rq_db;
 	volatile uint32_t *cq_db;
-	uint16_t port_id;
 	uint32_t elts_ci;
 	uint32_t rq_ci;
 	uint16_t consumed_strd; /* Number of consumed strides in WQE. */
 	uint32_t rq_pi;
+	uint32_t cq_ci:24;
 	uint16_t rq_repl_thresh; /* Threshold for buffer replenishment. */
 	uint32_t byte_mask;
 	union {
-- 
2.18.1


* Re: [PATCH] net/mlx5: fix data access race condition for shared Rx queue
  2024-07-05 13:05 [PATCH] net/mlx5: fix data access race condition for shared Rx queue Jiawei Wang
@ 2024-07-18  7:22 ` Raslan Darawsheh
  0 siblings, 0 replies; 2+ messages in thread
From: Raslan Darawsheh @ 2024-07-18  7:22 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Bing Zhao, Slava Ovsiienko,
	Dariusz Sosnowski, Ori Kam, Suanming Mou, Matan Azrad,
	Alexander Kozyrev
  Cc: dev, stable

Hi,

From: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>
Sent: Friday, July 5, 2024 4:05 PM
To: Bing Zhao; Slava Ovsiienko; Dariusz Sosnowski; Ori Kam; Suanming Mou; Matan Azrad; Alexander Kozyrev
Cc: dev@dpdk.org; Raslan Darawsheh; stable@dpdk.org
Subject: [PATCH] net/mlx5: fix data access race condition for shared Rx queue

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh

