* [PATCH] net/mlx5/hws: fix send queue drain on FW WQE destroy
@ 2025-04-27 11:28 Maayan Kashani
From: Maayan Kashani @ 2025-04-27 11:28 UTC
  To: dev
  Cc: mkashani, dsosnowski, rasland, stable, Alex Vesker,
	Viacheslav Ovsiienko, Bing Zhao, Ori Kam, Suanming Mou,
	Matan Azrad
The queue sync operation was skipped on FW WQE rule destroy.
On FW WQE rule create, both fence and notify_hw are set to true,
but on destroy fence was set to false. The send queue was therefore
not drained, leaving previously queued operations stuck in the
queue forever. Fix this by draining the queue when either fence or
notify_hw is set.
Example:
   rule_a - HW rule, rule_b - FW WQE rule.
Sequence:
   rule_a destroy, burst=1 (HW rule put to queue but no doorbell)
   rule_b destroy, burst=0 (FW WQE rule cmd but no queue sync)
Outcome:
   rule_a is stuck forever in the queue - no completion.
Fixes: 338aaf911665 ("net/mlx5/hws: add send FW match STE using gen WQE")
Cc: stable@dpdk.org
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Maayan Kashani <mkashani@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr_send.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
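
The sequence described in the commit message can be sketched with a toy
model. This is only an illustration, not the driver code: every toy_*
name is hypothetical, the drain is modelled as simply ringing the
doorbell for everything posted so far, and it is assumed that a destroy
with burst=0 arrives with fence=false and notify_hw=true.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these are NOT the real mlx5dr structures. */
struct toy_queue {
	int posted;     /* WQEs written to the send queue */
	int doorbelled; /* WQEs the HW has been told about */
};

struct toy_send_attr {
	bool fence;
	bool notify_hw;
};

/* HW rule operation: post a WQE, ring the doorbell only if burst is 0. */
static void toy_hw_rule_op(struct toy_queue *q, bool burst)
{
	q->posted++;
	if (!burst)
		q->doorbelled = q->posted;
}

/* Stand-in for the queue drain: flush everything posted so far. */
static void toy_drain(struct toy_queue *q)
{
	q->doorbelled = q->posted;
}

/*
 * FW WQE rule operation: the command bypasses the send queue, so the
 * only effect on the queue is the optional drain before the command.
 * use_fix selects the old or the new drain condition.
 */
static void toy_fw_rule_op(struct toy_queue *q,
			   const struct toy_send_attr *attr, bool use_fix)
{
	bool drain = use_fix ? (attr->fence || attr->notify_hw) : attr->fence;

	if (drain)
		toy_drain(q);
}

/* Replays the sequence and returns how many WQEs are left stuck. */
int toy_run(bool use_fix)
{
	struct toy_queue q = { 0, 0 };
	/* Assumed attributes of a FW WQE destroy with burst=0. */
	struct toy_send_attr destroy_attr = { .fence = false, .notify_hw = true };

	toy_hw_rule_op(&q, true);                   /* rule_a destroy, burst=1 */
	assert(q.posted == 1 && q.doorbelled == 0); /* queued, no doorbell yet */
	toy_fw_rule_op(&q, &destroy_attr, use_fix); /* rule_b destroy, burst=0 */
	return q.posted - q.doorbelled;
}
```

In this model toy_run(false) leaves rule_a's WQE without a doorbell,
while toy_run(true) flushes it, which mirrors the one-line change to
the condition in mlx5dr_send_stes_fw().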
diff --git a/drivers/net/mlx5/hws/mlx5dr_send.c b/drivers/net/mlx5/hws/mlx5dr_send.c
index e121c7f7ed5..d01fc7ef2ca 100644
--- a/drivers/net/mlx5/hws/mlx5dr_send.c
+++ b/drivers/net/mlx5/hws/mlx5dr_send.c
@@ -339,7 +339,7 @@ void mlx5dr_send_stes_fw(struct mlx5dr_send_engine *queue,
 	pdn = ctx->pd_num;
 
 	/* Writing through FW can't HW fence, therefore we drain the queue */
-	if (send_attr->fence)
+	if (send_attr->fence || send_attr->notify_hw)
 		mlx5dr_send_queue_action(ctx,
 					 queue_id,
 					 MLX5DR_SEND_QUEUE_ACTION_DRAIN_SYNC);
-- 
2.21.0
* Re: [PATCH] net/mlx5/hws: fix send queue drain on FW WQE destroy
From: Patrick Robb @ 2025-04-28 23:58 UTC
  To: Maayan Kashani; +Cc: dev
Recheck-request: iol-intel-Performance
Looks like this patch was affected by an infra failure - putting in a
retest request for the CI testing system.
* RE: [PATCH] net/mlx5/hws: fix send queue drain on FW WQE destroy
From: Bing Zhao @ 2025-05-05 16:02 UTC
  To: Maayan Kashani, dev
  Cc: Dariusz Sosnowski, Raslan Darawsheh, stable, Alex Vesker,
	Slava Ovsiienko, Ori Kam, Suanming Mou, Matan Azrad
Hi,
Acked-by: Bing Zhao <bingz@nvidia.com>
Thanks
* Re: [PATCH] net/mlx5/hws: fix send queue drain on FW WQE destroy
From: Raslan Darawsheh @ 2025-05-13  6:26 UTC
  To: Maayan Kashani, dev
  Cc: dsosnowski, stable, Alex Vesker, Viacheslav Ovsiienko, Bing Zhao,
	Ori Kam, Suanming Mou, Matan Azrad
Hi,
Patch applied to next-net-mlx.
-- 
Kindest regards
Raslan Darawsheh