* [PATCH] net/mlx5: fix error packets drop in the regular Rx
@ 2024-03-11 18:14 Viacheslav Ovsiienko
2024-04-18 12:16 ` Kevin Traynor
0 siblings, 1 reply; 5+ messages in thread
From: Viacheslav Ovsiienko @ 2024-03-11 18:14 UTC (permalink / raw)
To: stable; +Cc: bluca, ktraynor, christian.ehrhardt, xuemingl
[ upstream commit ef296e8f6140ea469b50c7bfe73501b1c9ef86e1 ]
When a packet is received with an error, the error is reported in
the CQE structure. The PMD analyzes the error syndrome and takes one
of two actions: it either resets the entire queue on critical
errors, or just ignores the packet.
The non-vectorized rx_burst did not ignore packets with non-critical
errors. When the packet length exceeded the mbuf data buffer length,
it took the next element in the queue WQE ring, and the CQE and WQE
consumer indices lost synchronization.
Fixes: aa67ed308458 ("net/mlx5: ignore non-critical syndromes for Rx queue")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
drivers/net/mlx5/mlx5_rx.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 5bf1a679b2..cc087348a4 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -613,7 +613,8 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t vec,
* @param mprq
* Indication if it is called from MPRQ.
* @return
- * 0 in case of empty CQE, MLX5_REGULAR_ERROR_CQE_RET in case of error CQE,
+ * 0 in case of empty CQE,
+ * MLX5_REGULAR_ERROR_CQE_RET in case of error CQE,
* MLX5_CRITICAL_ERROR_CQE_RET in case of error CQE lead to Rx queue reset,
* otherwise the packet size in regular RxQ,
* and striding byte count format in mprq case.
@@ -697,6 +698,11 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
if (ret == MLX5_RECOVERY_ERROR_RET ||
ret == MLX5_RECOVERY_COMPLETED_RET)
return MLX5_CRITICAL_ERROR_CQE_RET;
+ if (!mprq && ret == MLX5_RECOVERY_IGNORE_RET) {
+ *skip_cnt = 1;
+ ++rxq->cq_ci;
+ return MLX5_ERROR_CQE_MASK;
+ }
} else {
return 0;
}
@@ -971,19 +977,18 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
len = mlx5_rx_poll_len(rxq, cqe, cqe_n, cqe_mask, &mcqe, &skip_cnt, false);
if (unlikely(len & MLX5_ERROR_CQE_MASK)) {
+ /* We drop packets with non-critical errors */
+ rte_mbuf_raw_free(rep);
if (len == MLX5_CRITICAL_ERROR_CQE_RET) {
- rte_mbuf_raw_free(rep);
rq_ci = rxq->rq_ci << sges_n;
break;
}
+ /* Skip specified amount of error CQEs packets */
rq_ci >>= sges_n;
rq_ci += skip_cnt;
rq_ci <<= sges_n;
- idx = rq_ci & wqe_mask;
- wqe = &((volatile struct mlx5_wqe_data_seg *)rxq->wqes)[idx];
- seg = (*rxq->elts)[idx];
- cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
- len = len & ~MLX5_ERROR_CQE_MASK;
+ MLX5_ASSERT(!pkt);
+ continue;
}
if (len == 0) {
rte_mbuf_raw_free(rep);
--
2.34.1
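
The index adjustment in the mlx5_rx_burst() hunk can be looked at in isolation. The following is a minimal sketch, assuming that rq_ci is counted in segments inside the burst loop and that sges_n is the log2 of the number of segments per packet, as the shifts in the hunk suggest. demo_skip_rq_ci() is a hypothetical helper written for illustration only; it is not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a function from the mlx5 PMD.
 * Assumption: rq_ci counts segments and each packet owns
 * (1 << sges_n) of them.
 */
uint32_t
demo_skip_rq_ci(uint32_t rq_ci, unsigned int sges_n, uint16_t skip_cnt)
{
	assert(sges_n < 32);	/* keep the shifts well defined */
	rq_ci >>= sges_n;	/* drop the segment offset inside the packet */
	rq_ci += skip_cnt;	/* step over the dropped packet(s) */
	rq_ci <<= sges_n;	/* back to segment units, first segment */
	return rq_ci;
}
```

With sges_n = 2 and rq_ci = 10 (third segment of packet 2), skipping one packet gives 12, the first segment of packet 3. With sges_n = 0 the helper reduces to rq_ci + skip_cnt.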
* Re: [PATCH] net/mlx5: fix error packets drop in the regular Rx
2024-03-11 18:14 [PATCH] net/mlx5: fix error packets drop in the regular Rx Viacheslav Ovsiienko
@ 2024-04-18 12:16 ` Kevin Traynor
0 siblings, 0 replies; 5+ messages in thread
From: Kevin Traynor @ 2024-04-18 12:16 UTC (permalink / raw)
To: Viacheslav Ovsiienko, stable; +Cc: bluca, christian.ehrhardt, xuemingl
On 11/03/2024 18:14, Viacheslav Ovsiienko wrote:
> [ upstream commit ef296e8f6140ea469b50c7bfe73501b1c9ef86e1 ]
>
> When packet gets received with error it is reported in CQE
> structure and PMD analyzes the error syndrome and provides
> two options - either reset the entire queue for the critical
> errors, or just ignore the packet.
>
> The non-vectorized rx_burst did not ignore the non-critical
> error packets, and in case of packet length exceeding the
> mbuf data buffer length it took the next element in the queue
> WQE ring, resulting in CQE/WQE consume indices synchronization
> lost.
>
> Fixes: aa67ed308458 ("net/mlx5: ignore non-critical syndromes for Rx queue")
> Cc: stable@dpdk.org
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
> drivers/net/mlx5/mlx5_rx.c | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)
>
fyi - for 21.11 branch, I had already rebased and applied this. It seems
to be on 22.11 and 23.11 branches (or queued) also.
https://git.dpdk.org/dpdk-stable/commit/?h=21.11&id=c52e6e0ecda72ad163fc7757abe825105d7a16c8
* [PATCH] net/mlx5: fix error packets drop in the regular Rx
@ 2024-02-20 11:45 Viacheslav Ovsiienko
2024-02-20 14:04 ` Dariusz Sosnowski
2024-02-27 16:16 ` Raslan Darawsheh
0 siblings, 2 replies; 5+ messages in thread
From: Viacheslav Ovsiienko @ 2024-02-20 11:45 UTC (permalink / raw)
To: dev; +Cc: matan, rasland, orika, dsosnowski, stable
When a packet is received with an error, the error is reported in
the CQE structure. The PMD analyzes the error syndrome and takes one
of two actions: it either resets the entire queue on critical
errors, or just ignores the packet.
The non-vectorized rx_burst did not ignore packets with non-critical
errors. When the packet length exceeded the mbuf data buffer length,
it took the next element in the queue WQE ring, and the CQE and WQE
consumer indices lost synchronization.
Fixes: aa67ed308458 ("net/mlx5: ignore non-critical syndromes for Rx queue")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
drivers/net/mlx5/mlx5_rx.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 5bf1a679b2..cc087348a4 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -613,7 +613,8 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t vec,
* @param mprq
* Indication if it is called from MPRQ.
* @return
- * 0 in case of empty CQE, MLX5_REGULAR_ERROR_CQE_RET in case of error CQE,
+ * 0 in case of empty CQE,
+ * MLX5_REGULAR_ERROR_CQE_RET in case of error CQE,
* MLX5_CRITICAL_ERROR_CQE_RET in case of error CQE lead to Rx queue reset,
* otherwise the packet size in regular RxQ,
* and striding byte count format in mprq case.
@@ -697,6 +698,11 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
if (ret == MLX5_RECOVERY_ERROR_RET ||
ret == MLX5_RECOVERY_COMPLETED_RET)
return MLX5_CRITICAL_ERROR_CQE_RET;
+ if (!mprq && ret == MLX5_RECOVERY_IGNORE_RET) {
+ *skip_cnt = 1;
+ ++rxq->cq_ci;
+ return MLX5_ERROR_CQE_MASK;
+ }
} else {
return 0;
}
@@ -971,19 +977,18 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
len = mlx5_rx_poll_len(rxq, cqe, cqe_n, cqe_mask, &mcqe, &skip_cnt, false);
if (unlikely(len & MLX5_ERROR_CQE_MASK)) {
+ /* We drop packets with non-critical errors */
+ rte_mbuf_raw_free(rep);
if (len == MLX5_CRITICAL_ERROR_CQE_RET) {
- rte_mbuf_raw_free(rep);
rq_ci = rxq->rq_ci << sges_n;
break;
}
+ /* Skip specified amount of error CQEs packets */
rq_ci >>= sges_n;
rq_ci += skip_cnt;
rq_ci <<= sges_n;
- idx = rq_ci & wqe_mask;
- wqe = &((volatile struct mlx5_wqe_data_seg *)rxq->wqes)[idx];
- seg = (*rxq->elts)[idx];
- cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
- len = len & ~MLX5_ERROR_CQE_MASK;
+ MLX5_ASSERT(!pkt);
+ continue;
}
if (len == 0) {
rte_mbuf_raw_free(rep);
--
2.18.1
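
The drop-and-continue behaviour that the patch introduces can be modelled with a small stand-alone program. This is only a sketch under stated assumptions: one WQE per packet (sges_n = 0), a completion ring reduced to an array of results, and made-up constants. DEMO_ERROR_MASK, DEMO_CRITICAL_RET, struct demo_rxq, demo_poll_len() and demo_rx_burst() are hypothetical names and values, not the driver's definitions.

```c
#include <stdint.h>

#define DEMO_ERROR_MASK   0x40000000u			/* hypothetical flag bit */
#define DEMO_CRITICAL_RET (DEMO_ERROR_MASK | 0x1u)	/* hypothetical code */

/* Simplified queue model: one WQE per packet. */
struct demo_rxq {
	uint32_t cq_ci;		/* CQE consumer index */
	uint32_t rq_ci;		/* WQE consumer index */
	const uint32_t *cqe;	/* per completion: length or error code */
	uint32_t cqe_n;		/* number of completions available */
};

/* Returns 0 on empty ring, a packet length, or an error code. */
uint32_t
demo_poll_len(struct demo_rxq *q, uint16_t *skip_cnt)
{
	uint32_t len;

	if (q->cq_ci == q->cqe_n)
		return 0;
	len = q->cqe[q->cq_ci];
	if (len == DEMO_CRITICAL_RET)
		return len;		/* queue reset path, nothing consumed */
	if (len == DEMO_ERROR_MASK)
		*skip_cnt = 1;		/* non-critical: one packet to skip */
	++q->cq_ci;			/* the CQE is consumed either way */
	return len;
}

/* Stores lengths of good packets in lens[], returns how many were stored. */
uint16_t
demo_rx_burst(struct demo_rxq *q, uint32_t *lens, uint16_t n)
{
	uint16_t got = 0;

	while (got < n) {
		uint16_t skip_cnt = 0;
		uint32_t len = demo_poll_len(q, &skip_cnt);

		if (len & DEMO_ERROR_MASK) {
			if (len == DEMO_CRITICAL_RET)
				break;
			q->rq_ci += skip_cnt;	/* keep the WQE index in step */
			continue;		/* drop the packet */
		}
		if (len == 0)
			break;
		lens[got++] = len;
		++q->rq_ci;
	}
	return got;
}
```

For a ring holding a 64-byte packet, a non-critical error and a 128-byte packet, the model delivers the two good packets and leaves both consumer indices at 3, which is the synchronization the commit message describes as lost before the fix.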
* RE: [PATCH] net/mlx5: fix error packets drop in the regular Rx
2024-02-20 11:45 Viacheslav Ovsiienko
@ 2024-02-20 14:04 ` Dariusz Sosnowski
2024-02-27 16:16 ` Raslan Darawsheh
1 sibling, 0 replies; 5+ messages in thread
From: Dariusz Sosnowski @ 2024-02-20 14:04 UTC (permalink / raw)
To: Slava Ovsiienko, dev; +Cc: Matan Azrad, Raslan Darawsheh, Ori Kam, stable
> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo@nvidia.com>
> Sent: Tuesday, February 20, 2024 12:45
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Raslan Darawsheh
> <rasland@nvidia.com>; Ori Kam <orika@nvidia.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; stable@dpdk.org
> Subject: [PATCH] net/mlx5: fix error packets drop in the regular Rx
>
> When packet gets received with error it is reported in CQE structure and PMD
> analyzes the error syndrome and provides two options - either reset the entire
> queue for the critical errors, or just ignore the packet.
>
> The non-vectorized rx_burst did not ignore the non-critical error packets, and
> in case of packet length exceeding the mbuf data buffer length it took the next
> element in the queue WQE ring, resulting in CQE/WQE consume indices
> synchronization lost.
>
> Fixes: aa67ed308458 ("net/mlx5: ignore non-critical syndromes for Rx
> queue")
> Cc: stable@dpdk.org
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Best regards,
Dariusz Sosnowski
* RE: [PATCH] net/mlx5: fix error packets drop in the regular Rx
2024-02-20 11:45 Viacheslav Ovsiienko
2024-02-20 14:04 ` Dariusz Sosnowski
@ 2024-02-27 16:16 ` Raslan Darawsheh
1 sibling, 0 replies; 5+ messages in thread
From: Raslan Darawsheh @ 2024-02-27 16:16 UTC (permalink / raw)
To: Slava Ovsiienko, dev; +Cc: Matan Azrad, Ori Kam, Dariusz Sosnowski, stable
Hi,
> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo@nvidia.com>
> Sent: Tuesday, February 20, 2024 1:45 PM
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Raslan Darawsheh
> <rasland@nvidia.com>; Ori Kam <orika@nvidia.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; stable@dpdk.org
> Subject: [PATCH] net/mlx5: fix error packets drop in the regular Rx
>
> When packet gets received with error it is reported in CQE structure and PMD
> analyzes the error syndrome and provides two options - either reset the entire
> queue for the critical errors, or just ignore the packet.
>
> The non-vectorized rx_burst did not ignore the non-critical error packets, and
> in case of packet length exceeding the mbuf data buffer length it took the next
> element in the queue WQE ring, resulting in CQE/WQE consume indices
> synchronization lost.
>
> Fixes: aa67ed308458 ("net/mlx5: ignore non-critical syndromes for Rx
> queue")
> Cc: stable@dpdk.org
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Patch applied to next-net-mlx,
Kindest regards,
Raslan Darawsheh