DPDK patches and discussions
* [PATCH] net/mlx5: fix decreasing the reference count of a Tx queue
@ 2025-07-03 15:02 Bing Zhao
  2025-07-06 14:17 ` Raslan Darawsheh
  0 siblings, 1 reply; 2+ messages in thread
From: Bing Zhao @ 2025-07-03 15:02 UTC
  To: viacheslavo, dev, rasland; +Cc: orika, dsosnowski, suanmingm, matan, thomas

When the startup order of the Tx queues is changed, the depth of each
queue is compared with the current log2 value. If the two are not
equal, the current queue is skipped, left for a later iteration, and
the next queue is checked.

mlx5_txq_get() increases the reference count. A size mismatch is not
an error, so the startup continues and does not jump to the error
roll-back label. The reference count must therefore be decreased by 1
to drop the reference taken by mlx5_txq_get(); otherwise, at the
device close stage, the queue cannot be released in the FW, and the
TIS and PD are leaked as well.

Calling mlx5_txq_release() before the continue restores the reference
count to its initial state and fixes the leak.

Fixes: 6f356d3840e6 ("net/mlx5: pass DevX object info in Tx queue start")

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_trigger.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 90287a1b75..6c6f228afd 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -61,8 +61,12 @@ mlx5_txq_start(struct rte_eth_dev *dev)
 			struct mlx5_txq_ctrl *txq_ctrl = mlx5_txq_get(dev, i);
 			struct mlx5_txq_data *txq_data = &txq_ctrl->txq;
 
-			if (!txq_ctrl || txq_data->elts_n != cnt)
+			if (!txq_ctrl)
+				continue;
+			if (txq_data->elts_n != cnt) {
+				mlx5_txq_release(dev, i);
 				continue;
+			}
 			if (!txq_ctrl->is_hairpin)
 				txq_alloc_elts(txq_ctrl);
 			MLX5_ASSERT(!txq_ctrl->obj);
-- 
2.34.1
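
The reference-count imbalance described in the commit message can be
illustrated with a small stand-alone model. This is only a sketch and not
the mlx5 driver code: struct toy_txq, toy_txq_get(), toy_txq_release(),
toy_txq_start(), TOY_TXQ_N and the release_on_skip flag are all
hypothetical names, and the outer loop over decreasing log2 depths is an
assumption about the shape of the startup loop.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TXQ_N 4 /* hypothetical number of queue slots */

/* Hypothetical stand-in for a Tx queue control structure. */
struct toy_txq {
	int refcnt;          /* references currently held */
	unsigned int elts_n; /* log2 of the queue depth */
	int started;         /* set once the queue has been started */
};

/* Take a reference on slot i; returns NULL when the slot is unused. */
static struct toy_txq *
toy_txq_get(struct toy_txq *qs[], unsigned int i)
{
	if (qs[i] == NULL)
		return NULL;
	qs[i]->refcnt++;
	return qs[i];
}

/* Drop one reference previously taken with toy_txq_get(). */
static void
toy_txq_release(struct toy_txq *qs[], unsigned int i)
{
	assert(qs[i] != NULL && qs[i]->refcnt > 0);
	qs[i]->refcnt--;
}

/*
 * Start queues one depth at a time (assumed loop shape).
 * release_on_skip == 0 models the behavior before the fix,
 * release_on_skip != 0 models the behavior after it.
 */
static void
toy_txq_start(struct toy_txq *qs[], unsigned int max_log2, int release_on_skip)
{
	unsigned int cnt, i;

	for (cnt = max_log2; cnt > 0; cnt--) {
		for (i = 0; i < TOY_TXQ_N; i++) {
			struct toy_txq *q = toy_txq_get(qs, i);

			if (q == NULL)
				continue; /* no reference was taken */
			if (q->elts_n != cnt) {
				/* Not an error: give the reference back and move on. */
				if (release_on_skip)
					toy_txq_release(qs, i);
				continue;
			}
			q->started = 1; /* a started queue keeps its reference */
		}
	}
}
```

In this model, a queue that is skipped in N passes ends up with N extra
references when release_on_skip is 0, so a later release at stop or close
time would never bring the count back to its initial value.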



* Re: [PATCH] net/mlx5: fix decreasing the reference count of a Tx queue
  2025-07-03 15:02 [PATCH] net/mlx5: fix decreasing the reference count of a Tx queue Bing Zhao
@ 2025-07-06 14:17 ` Raslan Darawsheh
  0 siblings, 0 replies; 2+ messages in thread
From: Raslan Darawsheh @ 2025-07-06 14:17 UTC
  To: Bing Zhao, viacheslavo, dev; +Cc: orika, dsosnowski, suanmingm, matan, thomas

Hi,


On 03/07/2025 6:02 PM, Bing Zhao wrote:
> When the startup order of the Tx queues is changed, the depth of each
> queue is compared with the current log2 value. If the two are not
> equal, the current queue is skipped, left for a later iteration, and
> the next queue is checked.
> 
> mlx5_txq_get() increases the reference count. A size mismatch is not
> an error, so the startup continues and does not jump to the error
> roll-back label. The reference count must therefore be decreased by 1
> to drop the reference taken by mlx5_txq_get(); otherwise, at the
> device close stage, the queue cannot be released in the FW, and the
> TIS and PD are leaked as well.
> 
> Calling mlx5_txq_release() before the continue restores the reference
> count to its initial state and fixes the leak.
> 
> Fixes: 6f356d3840e6 ("net/mlx5: pass DevX object info in Tx queue start")
> 
> Signed-off-by: Bing Zhao <bingz@nvidia.com>
> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Patch applied to next-net-mlx.

Kindest regards
Raslan Darawsheh



