DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH 0/4] net/mlx5: fixes for rx queue count calculation
@ 2020-11-10 14:09 Maxime Leroy
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 1/4] Revert "net/mlx5: fix Rx queue count calculation" Maxime Leroy
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Maxime Leroy @ 2020-11-10 14:09 UTC (permalink / raw)
  Cc: dev

This patchset provides several bug fixes for rx queue count calculation for mlx5 driver.

Didier Pallard (1):
  net/mlx5: fix Rx descriptor status returned value

Maxime Leroy (3):
  Revert "net/mlx5: fix Rx queue count calculation"
  net/mlx5: fixed used initialization in rx_queue_count
  mlx5: re-add support of rx_queue_count for mlx5_rx_burst_mprq

 drivers/net/mlx5/mlx5_rxtx.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [dpdk-dev] [PATCH 1/4] Revert "net/mlx5: fix Rx queue count calculation"
  2020-11-10 14:09 [dpdk-dev] [PATCH 0/4] net/mlx5: fixes for rx queue count calculation Maxime Leroy
@ 2020-11-10 14:09 ` Maxime Leroy
  2020-11-11 19:51   ` Slava Ovsiienko
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 2/4] net/mlx5: fixed used initialization in rx_queue_count Maxime Leroy
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Maxime Leroy @ 2020-11-10 14:09 UTC (permalink / raw)
  To: Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko; +Cc: dev, Nelio Laranjeiro

This reverts commit d2d57605522d4a43be17e22e649e54033f6d8835.

This fix is uncorrect for at least two reasons.

First issue, when there are more than 8 CQEs to uncompress, the
computation done in this commit cannot work. Because the zip-ai
variable describes the current index inside the CQE8 array and thus is
limited from 0 to 7 included. So if we are decompressed the 9 packets,
ai is 0. So in this case, n is equals to cqe_cnt - 0.

Example with 11 packets we will have:
C | a | e0 | e1 | e2 | e3 | e4 | e5 | C | a | e0

c <-- CQE compressed
a <-- Array of minicqe
ex <-- emptry entry to store uncompressed CQE.

If the 9th packet is decompressed by the soft, n is equals to 9.
But with this commit, n is equals to 11 (i.e. 11 - 0).

Second issue is to count the next packet.

Example:

     packet 1                       |    packet 2
C | a | e0 | e1 | e2 | e3 | e4 | e5 | C | a | e0

There are 2 packets compressed in the first queue. For the first packet,
n is computed correctly.

But for the second, n is not computed properly. Because the zip context
is for the first packet. The  second packet is not yet decompressed, so
there are no context.

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 844a1c63..4c566486 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -462,11 +462,19 @@ rx_queue_count(struct mlx5_rxq_data *rxq)
 {
 	struct rxq_zip *zip = &rxq->zip;
 	volatile struct mlx5_cqe *cqe;
-	unsigned int cq_ci = rxq->cq_ci;
 	const unsigned int cqe_n = (1 << rxq->cqe_n);
 	const unsigned int cqe_cnt = cqe_n - 1;
-	unsigned int used = 0;
+	unsigned int cq_ci;
+	unsigned int used;
 
+	/* if we are processing a compressed cqe */
+	if (zip->ai) {
+		used = zip->cqe_cnt - zip->ca;
+		cq_ci = zip->cq_ci;
+	} else {
+		used = 0;
+		cq_ci = rxq->cq_ci;
+	}
 	cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
 	while (check_cqe(cqe, cqe_n, cq_ci) != MLX5_CQE_STATUS_HW_OWN) {
 		int8_t op_own;
@@ -474,17 +482,14 @@ rx_queue_count(struct mlx5_rxq_data *rxq)
 
 		op_own = cqe->op_own;
 		if (MLX5_CQE_FORMAT(op_own) == MLX5_COMPRESSED)
-			if (unlikely(zip->ai))
-				n = zip->cqe_cnt - zip->ai;
-			else
-				n = rte_be_to_cpu_32(cqe->byte_cnt);
+			n = rte_be_to_cpu_32(cqe->byte_cnt);
 		else
 			n = 1;
 		cq_ci += n;
 		used += n;
 		cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
 	}
-	used = RTE_MIN(used, cqe_n);
+	used = RTE_MIN(used, (1U << rxq->elts_n) - 1);
 	return used;
 }
 
@@ -507,12 +512,11 @@ mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset)
 			container_of(rxq, struct mlx5_rxq_ctrl, rxq);
 	struct rte_eth_dev *dev = ETH_DEV(rxq_ctrl->priv);
 
-	if (dev->rx_pkt_burst == NULL ||
-	    dev->rx_pkt_burst == removed_rx_burst) {
+	if (dev->rx_pkt_burst != mlx5_rx_burst) {
 		rte_errno = ENOTSUP;
 		return -rte_errno;
 	}
-	if (offset >= (1 << rxq->cqe_n)) {
+	if (offset >= (1 << rxq->elts_n)) {
 		rte_errno = EINVAL;
 		return -rte_errno;
 	}
@@ -642,8 +646,7 @@ mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_rxq_data *rxq;
 
-	if (dev->rx_pkt_burst == NULL ||
-	    dev->rx_pkt_burst == removed_rx_burst) {
+	if (dev->rx_pkt_burst != mlx5_rx_burst) {
 		rte_errno = ENOTSUP;
 		return -rte_errno;
 	}
-- 
2.27.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [dpdk-dev] [PATCH 2/4] net/mlx5: fixed used initialization in rx_queue_count
  2020-11-10 14:09 [dpdk-dev] [PATCH 0/4] net/mlx5: fixes for rx queue count calculation Maxime Leroy
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 1/4] Revert "net/mlx5: fix Rx queue count calculation" Maxime Leroy
@ 2020-11-10 14:09 ` Maxime Leroy
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 3/4] net/mlx5: fix Rx descriptor status returned value Maxime Leroy
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 4/4] mlx5: re-add support of rx_queue_count for mlx5_rx_burst_mprq Maxime Leroy
  3 siblings, 0 replies; 7+ messages in thread
From: Maxime Leroy @ 2020-11-10 14:09 UTC (permalink / raw)
  To: Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko, Olivier Matz
  Cc: dev, Nelio Laranjeiro

Mini (compressed) completion queue entries (CQEs) are returned by the
NIC when PCI back pressure is detected, in which case the first CQE64
contains common packet information followed by a number of CQE8
providing the rest, followed by a matching number of empty CQE64
entries to be used by software for decompression.

CQE already decompressed in software are not used/holded anymore by the
nic.

In the rx_queue_count function, if rx poll function has already started
to decompress some CQes. We need to count the number of CQEs not yet
decompressed, this CQEs are still holded by the hardware.

The context of the CQEs decompression is stored in the zip structure.
To get the number of cpe not decompressed yet (i.e. still holded by
the hadware), the following formula is used: zip->cqe_cnt - zip->ca in
rx_queue_count function.

The zip->cqe_cnt is the number of CQes compressed and zip->ca is the
current index of the cqe to decompress.

Thus, we can easily have cqe_cnt < ca. So this method to compute the
number of cqes still holded by the hardware is wrong.

The proper way to get the number of cqes not yet decrompressed is:

 - First, we need to know the current packet index to decompress:
   zip->ca + zip->ai. In the example below, the current packet index is
   2.

 - Then the index of the last packet index to decompress:
   zip->ci + zip->cqe_cnt. In the example below, the last packet index
   is 3.

- Thus the number of packets used by the hardware (i.e. not decompress
  yet) is: (zip->ci + zip->cqe_cnt) - (zip->ca + zip->ai). In the
  example below, the number of packets used by the hardware for the current
  cqe in decompression is 1.

::

       zip->cq_ci = 0  /* Current CQE */
       zip->ca = 1 /* Current array index for decompression */
       zip->ai = 1 /* array index in the mini cqe table in CQE1 below */
       zip->cqe_cnt = 3 /* number of CQEs set in the first CQE */

          0              1          2           6            7
      +---------+  +---------+ +-------+   +---------+   +-------+
      | CQE64   |  |  CQE64  | | CQE64 |   | CQE64   |   | CQE64 |
      |---------|  |---------| |-------|   |-------  |   |-------|
      |cqe_cnt=3|  | cqe8[0] | |       | . |cqe_cnt=X|   |cqe8[0]|
      | .....   |  | cqe8[1] | |       | . |         |   |  ...  | ...
      | .....   |  | cqe8[2] | |       | . |         |   |       |
      | .....   |  |         | |       |   |         |   |       |
      +---------+  +---------+ +-------+   +-------+++   +-------+

Fixes: 8788fec1f269 ("net/mlx5: implement descriptor status API")
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 4c566486..511003d1 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -464,13 +464,15 @@ rx_queue_count(struct mlx5_rxq_data *rxq)
 	volatile struct mlx5_cqe *cqe;
 	const unsigned int cqe_n = (1 << rxq->cqe_n);
 	const unsigned int cqe_cnt = cqe_n - 1;
-	unsigned int cq_ci;
+	unsigned int cq_ci, cq_end, cq_cur;
 	unsigned int used;
 
 	/* if we are processing a compressed cqe */
 	if (zip->ai) {
-		used = zip->cqe_cnt - zip->ca;
 		cq_ci = zip->cq_ci;
+		cq_end = cq_ci + zip->cqe_cnt;
+		cq_cur = zip->ca + zip->ai;
+		used = cq_end - cq_cur;
 	} else {
 		used = 0;
 		cq_ci = rxq->cq_ci;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [dpdk-dev] [PATCH 3/4] net/mlx5: fix Rx descriptor status returned value
  2020-11-10 14:09 [dpdk-dev] [PATCH 0/4] net/mlx5: fixes for rx queue count calculation Maxime Leroy
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 1/4] Revert "net/mlx5: fix Rx queue count calculation" Maxime Leroy
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 2/4] net/mlx5: fixed used initialization in rx_queue_count Maxime Leroy
@ 2020-11-10 14:09 ` Maxime Leroy
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 4/4] mlx5: re-add support of rx_queue_count for mlx5_rx_burst_mprq Maxime Leroy
  3 siblings, 0 replies; 7+ messages in thread
From: Maxime Leroy @ 2020-11-10 14:09 UTC (permalink / raw)
  To: Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko, Olivier Matz
  Cc: dev, Didier Pallard

From: Didier Pallard <didier.pallard@6wind.com>

Two bugs in rx_queue_count function:
- One entry may contain several segments, so 'used' must be multiplied
  by number of segments per entry to properly reflect the queue usage.
- rx_queue_count returns the number of entries used in queue, so it ranges
  from 0 to max number of entries in queue, not this number minus
  one.

Fixes: 8788fec1f269 ("net/mlx5: implement descriptor status API")
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 511003d1..2cabd650 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -463,6 +463,7 @@ rx_queue_count(struct mlx5_rxq_data *rxq)
 	struct rxq_zip *zip = &rxq->zip;
 	volatile struct mlx5_cqe *cqe;
 	const unsigned int cqe_n = (1 << rxq->cqe_n);
+	const unsigned int sges_n = (1 << rxq->sges_n);
 	const unsigned int cqe_cnt = cqe_n - 1;
 	unsigned int cq_ci, cq_end, cq_cur;
 	unsigned int used;
@@ -491,7 +492,7 @@ rx_queue_count(struct mlx5_rxq_data *rxq)
 		used += n;
 		cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
 	}
-	used = RTE_MIN(used, (1U << rxq->elts_n) - 1);
+	used = RTE_MIN(used * sges_n, 1U << rxq->elts_n);
 	return used;
 }
 
-- 
2.27.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [dpdk-dev] [PATCH 4/4] mlx5: re-add support of rx_queue_count for mlx5_rx_burst_mprq
  2020-11-10 14:09 [dpdk-dev] [PATCH 0/4] net/mlx5: fixes for rx queue count calculation Maxime Leroy
                   ` (2 preceding siblings ...)
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 3/4] net/mlx5: fix Rx descriptor status returned value Maxime Leroy
@ 2020-11-10 14:09 ` Maxime Leroy
  3 siblings, 0 replies; 7+ messages in thread
From: Maxime Leroy @ 2020-11-10 14:09 UTC (permalink / raw)
  To: Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko; +Cc: dev

The commit 014a6936008b ("Revert "net/mlx5: fix Rx queue count
calculation"")' has removed the support of rx_queue_count for
mlx5_rx_burst_mprq.

This commit has been revert because the fixes done on rx_queue_count
computation was wrong. Anyway, it's still true that the Rx queue count
calculation is same for any rx burts method since CQ handling is the
same for regular, vectorized, and multi-packet Rx queues.

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 2cabd650..db3c6100 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -492,7 +492,7 @@ rx_queue_count(struct mlx5_rxq_data *rxq)
 		used += n;
 		cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
 	}
-	used = RTE_MIN(used * sges_n, 1U << rxq->elts_n);
+	used = RTE_MIN(used * sges_n, cqe_cnt);
 	return used;
 }
 
@@ -515,7 +515,8 @@ mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset)
 			container_of(rxq, struct mlx5_rxq_ctrl, rxq);
 	struct rte_eth_dev *dev = ETH_DEV(rxq_ctrl->priv);
 
-	if (dev->rx_pkt_burst != mlx5_rx_burst) {
+	if (dev->rx_pkt_burst == NULL ||
+	    dev->rx_pkt_burst == removed_rx_burst) {
 		rte_errno = ENOTSUP;
 		return -rte_errno;
 	}
-- 
2.27.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dpdk-dev] [PATCH 1/4] Revert "net/mlx5: fix Rx queue count calculation"
  2020-11-10 14:09 ` [dpdk-dev] [PATCH 1/4] Revert "net/mlx5: fix Rx queue count calculation" Maxime Leroy
@ 2020-11-11 19:51   ` Slava Ovsiienko
  2020-11-12 15:43     ` Maxime Leroy
  0 siblings, 1 reply; 7+ messages in thread
From: Slava Ovsiienko @ 2020-11-11 19:51 UTC (permalink / raw)
  To: Maxime Leroy, Matan Azrad, Shahaf Shuler
  Cc: dev, NBU-Contact-N?lio Laranjeiro

Hi, Maxime

Thanks a lot for the patch. There is the comment for the entire series.

[1]_____
> 
> First issue, when there are more than 8 CQEs to uncompress, the computation
> done in this commit cannot work. Because the zip-ai variable describes the
> current index inside the CQE8 array and thus is limited from 0 to 7 included. So
> if we are decompressed the 9 packets, ai is 0. So in this case, n is equals to
> cqe_cnt - 0.
> 
> Example with 11 packets we will have:
> C | a | e0 | e1 | e2 | e3 | e4 | e5 | C | a | e0
> 
1. ai is not index in the array (just tree lsbs of ai). It is an index of the miniCQE being processed
in the compressed session and is in the range [0 . .zip.cqe_cnt-1]. In your example there will be
two compressed sessions. The bug was we corrected each compressed session for the ai of the
first one (in processing that we were).

[2]_____
>	/* if we are processing a compressed cqe */
> 	if (zip->ai) {
>-		used = zip->cqe_cnt - zip->ca;
> 		cq_ci = zip->cq_ci;
>+		cq_end = cq_ci + zip->cqe_cnt;
>+		cq_cur = zip->ca + zip->ai;
>+		used = cq_end - cq_cur;
>	} else {
> 		used = 0;
> 		cq_ci = rxq->cq_ci;

Sorry, it seems to be incorrect.
zip->cq_ci is the index of the NEXT CQE, following the compressed session being processed.
zip->ai is index of miniCQE being processed. "used" should be calculated much simple:

    used = zip->cqe_cnt - zip->ai

[3]_____
-       if (dev->rx_pkt_burst == NULL ||
-           dev->rx_pkt_burst == removed_rx_burst) {
+       if (dev->rx_pkt_burst != mlx5_rx_burst) {

In this way, we cut the support for other rx_burst routines, we should restore.

[4]______
I'am OK with Didier patch "net/mlx5: fix Rx descriptor status returned value"

I see you wrote the luxury commit messages, and I'm crying with bloody tears about what I'm going to ask you for -
could we squash the series in to single commit? Or at least two - Didier and yours? 

With best regards, Slava
 



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dpdk-dev] [PATCH 1/4] Revert "net/mlx5: fix Rx queue count calculation"
  2020-11-11 19:51   ` Slava Ovsiienko
@ 2020-11-12 15:43     ` Maxime Leroy
  0 siblings, 0 replies; 7+ messages in thread
From: Maxime Leroy @ 2020-11-12 15:43 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: Matan Azrad, Shahaf Shuler, dev, NBU-Contact-N?lio Laranjeiro

Hi Slava,

On Wed, Nov 11, 2020 at 8:51 PM Slava Ovsiienko <viacheslavo@nvidia.com> wrote:
>
> Hi, Maxime
>
> Thanks a lot for the patch. There is the comment for the entire series.
>
> [1]_____
> >
> > First issue, when there are more than 8 CQEs to uncompress, the computation
> > done in this commit cannot work. Because the zip-ai variable describes the
> > current index inside the CQE8 array and thus is limited from 0 to 7 included. So
> > if we are decompressed the 9 packets, ai is 0. So in this case, n is equals to
> > cqe_cnt - 0.
> >
> > Example with 11 packets we will have:
> > C | a | e0 | e1 | e2 | e3 | e4 | e5 | C | a | e0
> >
> 1. ai is not index in the array (just tree lsbs of ai). It is an index of the miniCQE being processed
> in the compressed session and is in the range [0 . .zip.cqe_cnt-1]. In your example there will be
> two compressed sessions. The bug was we corrected each compressed session for the ai of the
> first one (in processing that we were).

The name of the variable (i.e. array index) has confused me. But you are right.
>
> [2]_____
> >       /* if we are processing a compressed cqe */
> >       if (zip->ai) {
> >-              used = zip->cqe_cnt - zip->ca;
> >               cq_ci = zip->cq_ci;
> >+              cq_end = cq_ci + zip->cqe_cnt;
> >+              cq_cur = zip->ca + zip->ai;
> >+              used = cq_end - cq_cur;
> >       } else {
> >               used = 0;
> >               cq_ci = rxq->cq_ci;
>
> Sorry, it seems to be incorrect.
> zip->cq_ci is the index of the NEXT CQE, following the compressed session being processed.
> zip->ai is index of miniCQE being processed. "used" should be calculated much simple:
>
>     used = zip->cqe_cnt - zip->ai

You are right.

>
> [3]_____
> -       if (dev->rx_pkt_burst == NULL ||
> -           dev->rx_pkt_burst == removed_rx_burst) {
> +       if (dev->rx_pkt_burst != mlx5_rx_burst) {
>
> In this way, we cut the support for other rx_burst routines, we should restore.
>
> [4]______
> I'am OK with Didier patch "net/mlx5: fix Rx descriptor status returned value"
>
> I see you wrote the luxury commit messages, and I'm crying with bloody tears about what I'm going to ask you for -
> could we squash the series in to single commit? Or at least two - Didier and yours?
>
> With best regards, Slava
>

I have just sent a V2 version fixing all these points.

Best regards,

Maxime Leroy

>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-11-12 15:44 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-10 14:09 [dpdk-dev] [PATCH 0/4] net/mlx5: fixes for rx queue count calculation Maxime Leroy
2020-11-10 14:09 ` [dpdk-dev] [PATCH 1/4] Revert "net/mlx5: fix Rx queue count calculation" Maxime Leroy
2020-11-11 19:51   ` Slava Ovsiienko
2020-11-12 15:43     ` Maxime Leroy
2020-11-10 14:09 ` [dpdk-dev] [PATCH 2/4] net/mlx5: fixed used initialization in rx_queue_count Maxime Leroy
2020-11-10 14:09 ` [dpdk-dev] [PATCH 3/4] net/mlx5: fix Rx descriptor status returned value Maxime Leroy
2020-11-10 14:09 ` [dpdk-dev] [PATCH 4/4] mlx5: re-add support of rx_queue_count for mlx5_rx_burst_mprq Maxime Leroy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).