DPDK patches and discussions
* [PATCH 0/2] net/bnxt: minor performance fixes for AVX2 Rx
@ 2021-11-15 18:24 Lance Richardson
  2021-11-15 18:24 ` [PATCH 1/2] net/bnxt: avoid unnecessary work in AVX2 Rx path Lance Richardson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Lance Richardson @ 2021-11-15 18:24 UTC (permalink / raw)
  Cc: dev

This series contains two minor performance fixes for the
bnxt AVX2 vectorized burst receive function.

Lance Richardson (2):
  net/bnxt: avoid unnecessary work in AVX2 Rx path
  net/bnxt: remove software prefetches from AVX2 Rx path

 drivers/net/bnxt/bnxt_rxtx_vec_avx2.c | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

-- 
2.25.1



* [PATCH 1/2] net/bnxt: avoid unnecessary work in AVX2 Rx path
  2021-11-15 18:24 [PATCH 0/2] net/bnxt: minor performance fixes for AVX2 Rx Lance Richardson
@ 2021-11-15 18:24 ` Lance Richardson
  2021-11-15 18:24 ` [PATCH 2/2] net/bnxt: remove software prefetches from " Lance Richardson
  2021-11-17  3:59 ` [PATCH 0/2] net/bnxt: minor performance fixes for AVX2 Rx Ajit Khaparde
  2 siblings, 0 replies; 4+ messages in thread
From: Lance Richardson @ 2021-11-15 18:24 UTC (permalink / raw)
  To: Bruce Richardson, Konstantin Ananyev, Ajit Khaparde, Somnath Kotur
  Cc: dev, stable

Each call to the AVX2 vector burst receive function makes at
least one pass through the function's inner loop, loading
256 bytes of completion descriptors and copying 8 rte_mbuf
pointers regardless of whether there are any packets to be
received.

Unidirectional forwarding performance is improved by about
3-4% if we ensure that at least one packet can be received
before entering the inner loop.

Fixes: c4e4c18963b0 ("net/bnxt: add AVX2 RX/Tx")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
---
 drivers/net/bnxt/bnxt_rxtx_vec_avx2.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c b/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c
index e4905b4fd1..54e3af22ac 100644
--- a/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c
+++ b/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c
@@ -98,6 +98,10 @@ recv_burst_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	rte_prefetch0(&cp_desc_ring[cons + 8]);
 	rte_prefetch0(&cp_desc_ring[cons + 12]);
 
+	/* Return immediately if there is not at least one completed packet. */
+	if (!bnxt_cpr_cmp_valid(&cp_desc_ring[cons], raw_cons, cp_ring_size))
+		return 0;
+
 	/* Ensure that we do not go past the ends of the rings. */
 	nb_pkts = RTE_MIN(nb_pkts, RTE_MIN(rx_ring_size - mbcons,
 					   (cp_ring_size - cons) / 2));
-- 
2.25.1
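
The early-exit check added by this patch can be illustrated with a small
standalone model. This is only a sketch under stated assumptions, not the
driver's code: `struct toy_cmpl`, `TOY_CMPL_V`, `toy_cmp_valid()` and
`toy_recv_burst()` are hypothetical names, the ring uses one descriptor per
packet (the real path uses descriptor pairs), and the valid bit is modeled
as a phase bit whose expected value flips each time the unmasked consumer
index wraps the ring.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a completion descriptor (illustration only). */
struct toy_cmpl {
	uint32_t info3_v; /* bit 0 models the valid (phase) bit */
};

#define TOY_CMPL_V 0x1u

/*
 * ring_size must be a power of two. raw_cons is the unmasked consumer
 * index, so the bit equal to ring_size toggles on every wrap and tells
 * us which polarity of the valid bit marks a freshly written entry.
 */
static bool
toy_cmp_valid(const struct toy_cmpl *c, uint32_t raw_cons, uint32_t ring_size)
{
	bool expected = !(raw_cons & ring_size);
	bool valid = (c->info3_v & TOY_CMPL_V) != 0;

	return valid == expected;
}

/* Count completed entries, up to nb_pkts, without crossing the ring end. */
static uint16_t
toy_recv_burst(const struct toy_cmpl *ring, uint32_t raw_cons,
	       uint32_t ring_size, uint16_t nb_pkts)
{
	uint32_t cons = raw_cons & (ring_size - 1);
	uint16_t n = 0;

	/* Return immediately if nothing has completed at the consumer index. */
	if (!toy_cmp_valid(&ring[cons], raw_cons, ring_size))
		return 0;

	/* Stand-in for the more expensive per-burst processing loop. */
	while (n < nb_pkts && cons + n < ring_size &&
	       toy_cmp_valid(&ring[cons + n], raw_cons + n, ring_size))
		n++;

	return n;
}
```

In this model an idle ring costs one descriptor read and one compare per
call. The patch's gain comes from the same idea: the check skips the
256-byte descriptor load and the mbuf pointer copies when no packet is
ready.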



* [PATCH 2/2] net/bnxt: remove software prefetches from AVX2 Rx path
  2021-11-15 18:24 [PATCH 0/2] net/bnxt: minor performance fixes for AVX2 Rx Lance Richardson
  2021-11-15 18:24 ` [PATCH 1/2] net/bnxt: avoid unnecessary work in AVX2 Rx path Lance Richardson
@ 2021-11-15 18:24 ` Lance Richardson
  2021-11-17  3:59 ` [PATCH 0/2] net/bnxt: minor performance fixes for AVX2 Rx Ajit Khaparde
  2 siblings, 0 replies; 4+ messages in thread
From: Lance Richardson @ 2021-11-15 18:24 UTC (permalink / raw)
  To: Bruce Richardson, Konstantin Ananyev, Ajit Khaparde, Somnath Kotur
  Cc: dev, stable

Testing has shown no performance benefit from software prefetching
of receive completion descriptors in the AVX2 burst receive path,
and slightly better performance without them on some CPU families,
so this patch removes them.

Fixes: c4e4c18963b0 ("net/bnxt: add AVX2 RX/Tx")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
---
 drivers/net/bnxt/bnxt_rxtx_vec_avx2.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c b/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c
index 54e3af22ac..34bd22edf0 100644
--- a/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c
+++ b/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c
@@ -92,12 +92,6 @@ recv_burst_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	cons = raw_cons & (cp_ring_size - 1);
 	mbcons = (raw_cons / 2) & (rx_ring_size - 1);
 
-	/* Prefetch first four descriptor pairs. */
-	rte_prefetch0(&cp_desc_ring[cons + 0]);
-	rte_prefetch0(&cp_desc_ring[cons + 4]);
-	rte_prefetch0(&cp_desc_ring[cons + 8]);
-	rte_prefetch0(&cp_desc_ring[cons + 12]);
-
 	/* Return immediately if there is not at least one completed packet. */
 	if (!bnxt_cpr_cmp_valid(&cp_desc_ring[cons], raw_cons, cp_ring_size))
 		return 0;
@@ -136,14 +130,6 @@ recv_burst_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		_mm256_storeu_si256((void *)&rx_pkts[i + 4], t0);
 #endif
 
-		/* Prefetch eight descriptor pairs for next iteration. */
-		if (i + BNXT_RX_DESCS_PER_LOOP_VEC256 < nb_pkts) {
-			rte_prefetch0(&cp_desc_ring[cons + 16]);
-			rte_prefetch0(&cp_desc_ring[cons + 20]);
-			rte_prefetch0(&cp_desc_ring[cons + 24]);
-			rte_prefetch0(&cp_desc_ring[cons + 28]);
-		}
-
 		/*
 		 * Load eight receive completion descriptors into 256-bit
 		 * registers. Loads are issued in reverse order in order to
-- 
2.25.1
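
For readers unfamiliar with software prefetching, the sketch below shows
why removing the hints is safe from a correctness standpoint: a prefetch
is only a hint to the cache hierarchy and never changes what the loop
computes. Everything here is an assumption made for illustration:
`struct toy_desc`, `TOY_PREFETCH` and the two sum functions are
hypothetical, the 16-byte descriptor size is inferred from the "256 bytes"
for eight descriptor pairs mentioned in patch 1/2, and
`__builtin_prefetch` (GCC/Clang) stands in for `rte_prefetch0`.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* Read prefetch with high temporal locality (all cache levels). */
#define TOY_PREFETCH(p) __builtin_prefetch((p), 0, 3)
#else
#define TOY_PREFETCH(p) ((void)(p))
#endif

/* Hypothetical 16-byte descriptor: four of them fill a 64-byte line. */
struct toy_desc {
	uint32_t words[4];
};

/* Walk the ring, hinting one cache line (four descriptors) ahead. */
static uint64_t
toy_sum_prefetch(const struct toy_desc *ring, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		if (i % 4 == 0 && i + 4 < n)
			TOY_PREFETCH(&ring[i + 4]);
		sum += ring[i].words[0];
	}
	return sum;
}

/* Same walk with no hints; sequential access is left to the hardware. */
static uint64_t
toy_sum_plain(const struct toy_desc *ring, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += ring[i].words[0];
	return sum;
}
```

Both functions return the same value for any input, so the only question
is speed, and that can be settled only by measuring on the target CPUs,
as the commit message describes.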



* Re: [PATCH 0/2] net/bnxt: minor performance fixes for AVX2 Rx
  2021-11-15 18:24 [PATCH 0/2] net/bnxt: minor performance fixes for AVX2 Rx Lance Richardson
  2021-11-15 18:24 ` [PATCH 1/2] net/bnxt: avoid unnecessary work in AVX2 Rx path Lance Richardson
  2021-11-15 18:24 ` [PATCH 2/2] net/bnxt: remove software prefetches from " Lance Richardson
@ 2021-11-17  3:59 ` Ajit Khaparde
  2 siblings, 0 replies; 4+ messages in thread
From: Ajit Khaparde @ 2021-11-17  3:59 UTC (permalink / raw)
  To: Lance Richardson; +Cc: dpdk-dev, Ferruh Yigit

On Mon, Nov 15, 2021 at 10:24 AM Lance Richardson
<lance.richardson@broadcom.com> wrote:
>
> This series contains two minor performance fixes for the
> bnxt AVX2 vectorized burst receive function.
>
> Lance Richardson (2):
>   net/bnxt: avoid unnecessary work in AVX2 Rx path
>   net/bnxt: remove software prefetches from AVX2 Rx path
Patchset applied to dpdk-next-net-brcm. Thanks

>
>  drivers/net/bnxt/bnxt_rxtx_vec_avx2.c | 16 +++-------------
>  1 file changed, 3 insertions(+), 13 deletions(-)
>
> --
> 2.25.1
>
