patches for DPDK stable branches
 help / color / mirror / Atom feed
* [dpdk-stable] [PATCH 1/2] net/i40e: fix generic build on FreeBSD
       [not found] <bug-788-3@http.bugs.dpdk.org/>
@ 2021-08-18 16:38 ` Bruce Richardson
  2021-08-18 16:38   ` [dpdk-stable] [PATCH 2/2] net/ice: " Bruce Richardson
  2021-09-01  6:23   ` [dpdk-stable] [dpdk-dev] [PATCH 1/2] net/i40e: " Zhang, Qi Z
  0 siblings, 2 replies; 6+ messages in thread
From: Bruce Richardson @ 2021-08-18 16:38 UTC (permalink / raw)
  To: dev; +Cc: brian90013, Bruce Richardson, wenzhuo.lu, stable, Beilei Xing

The common header file for vectorization is included in multiple files,
and so must use macros for the current compilation unit, rather than the
compiler-capability flag set for the whole driver. With the current,
incorrect, macro, the AVX512 or AVX2 flags may be set when compiling up
SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
to the compiler-defined "__AVX*__" macros fixes this issue.

Bugzilla ID: 788
Fixes: 0604b1f2208f ("net/i40e: fix crash in AVX512")
Cc: wenzhuo.lu@intel.com
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/net/i40e/i40e_rxtx_vec_common.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h b/drivers/net/i40e/i40e_rxtx_vec_common.h
index f52ed98d62..65715ed1ce 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_common.h
+++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
@@ -268,7 +268,7 @@ i40e_rx_vec_dev_conf_condition_check_default(struct rte_eth_dev *dev)
 #endif
 }
 
-#ifdef CC_AVX2_SUPPORT
+#ifdef __AVX2__
 static __rte_always_inline void
 i40e_rxq_rearm_common(struct i40e_rx_queue *rxq, __rte_unused bool avx512)
 {
@@ -329,7 +329,7 @@ i40e_rxq_rearm_common(struct i40e_rx_queue *rxq, __rte_unused bool avx512)
 		_mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
 	}
 #else
-#ifdef CC_AVX512_SUPPORT
+#ifdef __AVX512VL__
 	if (avx512) {
 		struct rte_mbuf *mb0, *mb1, *mb2, *mb3;
 		struct rte_mbuf *mb4, *mb5, *mb6, *mb7;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [dpdk-stable] [PATCH 2/2] net/ice: fix generic build on FreeBSD
  2021-08-18 16:38 ` [dpdk-stable] [PATCH 1/2] net/i40e: fix generic build on FreeBSD Bruce Richardson
@ 2021-08-18 16:38   ` Bruce Richardson
  2021-08-30  8:18     ` Rong, Leyi
  2021-09-01  6:23   ` [dpdk-stable] [dpdk-dev] [PATCH 1/2] net/i40e: " Zhang, Qi Z
  1 sibling, 1 reply; 6+ messages in thread
From: Bruce Richardson @ 2021-08-18 16:38 UTC (permalink / raw)
  To: dev
  Cc: brian90013, Bruce Richardson, wenzhuo.lu, leyi.rong, stable,
	Qiming Yang, Qi Zhang

The common header file for vectorization is included in multiple files,
and so must use macros for the current compilation unit, rather than the
compiler-capability flag set for the whole driver. With the current,
incorrect, macro, the AVX512 or AVX2 flags may be set when compiling up
SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
to the compiler-defined "__AVX*__" macros fixes this issue.

Bugzilla ID: 788
Fixes: a4e480de268e ("net/ice: optimize Tx by using AVX512")
Fixes: 20daa1c978b7 ("net/ice: fix crash in AVX512")
Cc: wenzhuo.lu@intel.com
Cc: leyi.rong@intel.com
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/net/ice/ice_rxtx_vec_common.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ice/ice_rxtx_vec_common.h b/drivers/net/ice/ice_rxtx_vec_common.h
index 2d8ef7dc8a..e609a75fc6 100644
--- a/drivers/net/ice/ice_rxtx_vec_common.h
+++ b/drivers/net/ice/ice_rxtx_vec_common.h
@@ -194,7 +194,7 @@ _ice_tx_queue_release_mbufs_vec(struct ice_tx_queue *txq)
 	 */
 	i = txq->tx_next_dd - txq->tx_rs_thresh + 1;
 
-#ifdef CC_AVX512_SUPPORT
+#ifdef __AVX512VL__
 	struct rte_eth_dev *dev = &rte_eth_devices[txq->vsi->adapter->pf.dev_data->port_id];
 
 	if (dev->tx_pkt_burst == ice_xmit_pkts_vec_avx512 ||
@@ -352,7 +352,7 @@ ice_tx_vec_dev_check_default(struct rte_eth_dev *dev)
 	return result;
 }
 
-#ifdef CC_AVX2_SUPPORT
+#ifdef __AVX2__
 static __rte_always_inline void
 ice_rxq_rearm_common(struct ice_rx_queue *rxq, __rte_unused bool avx512)
 {
@@ -414,7 +414,7 @@ ice_rxq_rearm_common(struct ice_rx_queue *rxq, __rte_unused bool avx512)
 		_mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
 	}
 #else
-#ifdef CC_AVX512_SUPPORT
+#ifdef __AVX512VL__
 	if (avx512) {
 		struct rte_mbuf *mb0, *mb1, *mb2, *mb3;
 		struct rte_mbuf *mb4, *mb5, *mb6, *mb7;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-stable] [PATCH 2/2] net/ice: fix generic build on FreeBSD
  2021-08-18 16:38   ` [dpdk-stable] [PATCH 2/2] net/ice: " Bruce Richardson
@ 2021-08-30  8:18     ` Rong, Leyi
  2021-09-01  6:24       ` Zhang, Qi Z
  0 siblings, 1 reply; 6+ messages in thread
From: Rong, Leyi @ 2021-08-30  8:18 UTC (permalink / raw)
  To: Richardson, Bruce, dev
  Cc: brian90013, Lu, Wenzhuo, stable, Yang, Qiming, Zhang, Qi Z


> -----Original Message-----
> From: Richardson, Bruce <bruce.richardson@intel.com>
> Sent: Thursday, August 19, 2021 12:38 AM
> To: dev@dpdk.org
> Cc: brian90013@gmail.com; Richardson, Bruce <bruce.richardson@intel.com>;
> Lu, Wenzhuo <wenzhuo.lu@intel.com>; Rong, Leyi <leyi.rong@intel.com>;
> stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>
> Subject: [PATCH 2/2] net/ice: fix generic build on FreeBSD
> 
> The common header file for vectorization is included in multiple files, and so
> must use macros for the current compilation unit, rather than the compiler-
> capability flag set for the whole driver. With the current, incorrect, macro, the
> AVX512 or AVX2 flags may be set when compiling up SSE code, leading to
> compilation errors. Changing from "CC_AVX*_SUPPORT"
> to the compiler-defined "__AVX*__" macros fixes this issue.
> 
> Bugzilla ID: 788
> Fixes: a4e480de268e ("net/ice: optimize Tx by using AVX512")
> Fixes: 20daa1c978b7 ("net/ice: fix crash in AVX512")
> Cc: wenzhuo.lu@intel.com
> Cc: leyi.rong@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  drivers/net/ice/ice_rxtx_vec_common.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_rxtx_vec_common.h
> b/drivers/net/ice/ice_rxtx_vec_common.h
> index 2d8ef7dc8a..e609a75fc6 100644
> --- a/drivers/net/ice/ice_rxtx_vec_common.h
> +++ b/drivers/net/ice/ice_rxtx_vec_common.h
> @@ -194,7 +194,7 @@ _ice_tx_queue_release_mbufs_vec(struct ice_tx_queue
> *txq)
>  	 */
>  	i = txq->tx_next_dd - txq->tx_rs_thresh + 1;
> 
> -#ifdef CC_AVX512_SUPPORT
> +#ifdef __AVX512VL__
>  	struct rte_eth_dev *dev = &rte_eth_devices[txq->vsi->adapter-
> >pf.dev_data->port_id];
> 
>  	if (dev->tx_pkt_burst == ice_xmit_pkts_vec_avx512 || @@ -352,7
> +352,7 @@ ice_tx_vec_dev_check_default(struct rte_eth_dev *dev)
>  	return result;
>  }
> 
> -#ifdef CC_AVX2_SUPPORT
> +#ifdef __AVX2__
>  static __rte_always_inline void
>  ice_rxq_rearm_common(struct ice_rx_queue *rxq, __rte_unused bool avx512)
> { @@ -414,7 +414,7 @@ ice_rxq_rearm_common(struct ice_rx_queue *rxq,
> __rte_unused bool avx512)
>  		_mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
>  	}
>  #else
> -#ifdef CC_AVX512_SUPPORT
> +#ifdef __AVX512VL__
>  	if (avx512) {
>  		struct rte_mbuf *mb0, *mb1, *mb2, *mb3;
>  		struct rte_mbuf *mb4, *mb5, *mb6, *mb7;
> --
> 2.30.2

Acked-by: Leyi Rong <leyi.rong@intel.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH 1/2] net/i40e: fix generic build on FreeBSD
  2021-08-18 16:38 ` [dpdk-stable] [PATCH 1/2] net/i40e: fix generic build on FreeBSD Bruce Richardson
  2021-08-18 16:38   ` [dpdk-stable] [PATCH 2/2] net/ice: " Bruce Richardson
@ 2021-09-01  6:23   ` Zhang, Qi Z
  1 sibling, 0 replies; 6+ messages in thread
From: Zhang, Qi Z @ 2021-09-01  6:23 UTC (permalink / raw)
  To: Richardson, Bruce, dev
  Cc: brian90013, Richardson, Bruce, Lu, Wenzhuo, stable, Xing, Beilei



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Bruce Richardson
> Sent: Thursday, August 19, 2021 12:38 AM
> To: dev@dpdk.org
> Cc: brian90013@gmail.com; Richardson, Bruce <bruce.richardson@intel.com>;
> Lu, Wenzhuo <wenzhuo.lu@intel.com>; stable@dpdk.org; Xing, Beilei
> <beilei.xing@intel.com>
> Subject: [dpdk-dev] [PATCH 1/2] net/i40e: fix generic build on FreeBSD
> 
> The common header file for vectorization is included in multiple files, and so
> must use macros for the current compilation unit, rather than the
> compiler-capability flag set for the whole driver. With the current, incorrect,
> macro, the AVX512 or AVX2 flags may be set when compiling up SSE code,
> leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
> to the compiler-defined "__AVX*__" macros fixes this issue.
> 
> Bugzilla ID: 788
> Fixes: 0604b1f2208f ("net/i40e: fix crash in AVX512")
> Cc: wenzhuo.lu@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  drivers/net/i40e/i40e_rxtx_vec_common.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h
> b/drivers/net/i40e/i40e_rxtx_vec_common.h
> index f52ed98d62..65715ed1ce 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec_common.h
> +++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
> @@ -268,7 +268,7 @@
> i40e_rx_vec_dev_conf_condition_check_default(struct rte_eth_dev *dev)
> #endif  }
> 
> -#ifdef CC_AVX2_SUPPORT
> +#ifdef __AVX2__
>  static __rte_always_inline void
>  i40e_rxq_rearm_common(struct i40e_rx_queue *rxq, __rte_unused bool
> avx512)  { @@ -329,7 +329,7 @@ i40e_rxq_rearm_common(struct
> i40e_rx_queue *rxq, __rte_unused bool avx512)
>  		_mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
>  	}
>  #else
> -#ifdef CC_AVX512_SUPPORT
> +#ifdef __AVX512VL__
>  	if (avx512) {
>  		struct rte_mbuf *mb0, *mb1, *mb2, *mb3;
>  		struct rte_mbuf *mb4, *mb5, *mb6, *mb7;
> --
> 2.30.2

Applied to dpdk-next-net-intel.

Thanks
Qi


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-stable] [PATCH 2/2] net/ice: fix generic build on FreeBSD
  2021-08-30  8:18     ` Rong, Leyi
@ 2021-09-01  6:24       ` Zhang, Qi Z
  0 siblings, 0 replies; 6+ messages in thread
From: Zhang, Qi Z @ 2021-09-01  6:24 UTC (permalink / raw)
  To: Rong, Leyi, Richardson, Bruce, dev
  Cc: brian90013, Lu, Wenzhuo, stable, Yang, Qiming



> -----Original Message-----
> From: Rong, Leyi <leyi.rong@intel.com>
> Sent: Monday, August 30, 2021 4:18 PM
> To: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org
> Cc: brian90013@gmail.com; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>
> Subject: RE: [PATCH 2/2] net/ice: fix generic build on FreeBSD
> 
> 
> > -----Original Message-----
> > From: Richardson, Bruce <bruce.richardson@intel.com>
> > Sent: Thursday, August 19, 2021 12:38 AM
> > To: dev@dpdk.org
> > Cc: brian90013@gmail.com; Richardson, Bruce
> > <bruce.richardson@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> > Rong, Leyi <leyi.rong@intel.com>; stable@dpdk.org; Yang, Qiming
> > <qiming.yang@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
> > Subject: [PATCH 2/2] net/ice: fix generic build on FreeBSD
> >
> > The common header file for vectorization is included in multiple
> > files, and so must use macros for the current compilation unit, rather
> > than the compiler- capability flag set for the whole driver. With the
> > current, incorrect, macro, the
> > AVX512 or AVX2 flags may be set when compiling up SSE code, leading to
> > compilation errors. Changing from "CC_AVX*_SUPPORT"
> > to the compiler-defined "__AVX*__" macros fixes this issue.
> >
> > Bugzilla ID: 788
> > Fixes: a4e480de268e ("net/ice: optimize Tx by using AVX512")
> > Fixes: 20daa1c978b7 ("net/ice: fix crash in AVX512")
> > Cc: wenzhuo.lu@intel.com
> > Cc: leyi.rong@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> >  drivers/net/ice/ice_rxtx_vec_common.h | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ice/ice_rxtx_vec_common.h
> > b/drivers/net/ice/ice_rxtx_vec_common.h
> > index 2d8ef7dc8a..e609a75fc6 100644
> > --- a/drivers/net/ice/ice_rxtx_vec_common.h
> > +++ b/drivers/net/ice/ice_rxtx_vec_common.h
> > @@ -194,7 +194,7 @@ _ice_tx_queue_release_mbufs_vec(struct
> > ice_tx_queue
> > *txq)
> >  	 */
> >  	i = txq->tx_next_dd - txq->tx_rs_thresh + 1;
> >
> > -#ifdef CC_AVX512_SUPPORT
> > +#ifdef __AVX512VL__
> >  	struct rte_eth_dev *dev = &rte_eth_devices[txq->vsi->adapter-
> > >pf.dev_data->port_id];
> >
> >  	if (dev->tx_pkt_burst == ice_xmit_pkts_vec_avx512 || @@ -352,7
> > +352,7 @@ ice_tx_vec_dev_check_default(struct rte_eth_dev *dev)
> >  	return result;
> >  }
> >
> > -#ifdef CC_AVX2_SUPPORT
> > +#ifdef __AVX2__
> >  static __rte_always_inline void
> >  ice_rxq_rearm_common(struct ice_rx_queue *rxq, __rte_unused bool
> > avx512) { @@ -414,7 +414,7 @@ ice_rxq_rearm_common(struct
> ice_rx_queue
> > *rxq, __rte_unused bool avx512)
> >  		_mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
> >  	}
> >  #else
> > -#ifdef CC_AVX512_SUPPORT
> > +#ifdef __AVX512VL__
> >  	if (avx512) {
> >  		struct rte_mbuf *mb0, *mb1, *mb2, *mb3;
> >  		struct rte_mbuf *mb4, *mb5, *mb6, *mb7;
> > --
> > 2.30.2
> 
> Acked-by: Leyi Rong <leyi.rong@intel.com>

Applied to dpdk-next-net-intel.

Thanks
Qi


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [dpdk-stable] [PATCH 2/2] net/ice: fix generic build on FreeBSD
  2021-09-29 12:13 [dpdk-stable] " Leyi Rong
@ 2021-09-29 12:13 ` Leyi Rong
  0 siblings, 0 replies; 6+ messages in thread
From: Leyi Rong @ 2021-09-29 12:13 UTC (permalink / raw)
  To: ferruh.yigit, bruce.richardson, qi.z.zhang
  Cc: dev, Leyi Rong, wenzhuo.lu, stable

The common header file for vectorization is included in multiple files,
and so must use macros for the current compilation unit, rather than the
compiler-capability flag set for the whole driver. With the current,
incorrect, macro, the AVX512 or AVX2 flags may be set when compiling up
SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
to the compiler-defined "__AVX*__" macros fixes this issue. In addition,
splitting AVX-specific code into the new ice_rxtx_common_avx.h header
file to avoid such bugs.

Bugzilla ID: 788
Fixes: a4e480de268e ("net/ice: optimize Tx by using AVX512")
Fixes: 20daa1c978b7 ("net/ice: fix crash in AVX512")
Cc: wenzhuo.lu@intel.com
Cc: leyi.rong@intel.com
Cc: stable@dpdk.org

Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/net/ice/ice_rxtx_common_avx.h | 213 ++++++++++++++++++++++++++
 drivers/net/ice/ice_rxtx_vec_common.h | 205 +------------------------
 2 files changed, 218 insertions(+), 200 deletions(-)
 create mode 100644 drivers/net/ice/ice_rxtx_common_avx.h

diff --git a/drivers/net/ice/ice_rxtx_common_avx.h b/drivers/net/ice/ice_rxtx_common_avx.h
new file mode 100644
index 0000000000..81e0db5dd3
--- /dev/null
+++ b/drivers/net/ice/ice_rxtx_common_avx.h
@@ -0,0 +1,213 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation
+ */
+
+#ifndef _ICE_RXTX_COMMON_AVX_H_
+#define _ICE_RXTX_COMMON_AVX_H_
+
+#include "ice_rxtx.h"
+
+#ifndef __INTEL_COMPILER
+#pragma GCC diagnostic ignored "-Wcast-qual"
+#endif
+
+#ifdef __AVX2__
+static __rte_always_inline void
+ice_rxq_rearm_common(struct ice_rx_queue *rxq, __rte_unused bool avx512)
+{
+	int i;
+	uint16_t rx_id;
+	volatile union ice_rx_flex_desc *rxdp;
+	struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start];
+
+	rxdp = rxq->rx_ring + rxq->rxrearm_start;
+
+	/* Pull 'n' more MBUFs into the software ring */
+	if (rte_mempool_get_bulk(rxq->mp,
+				 (void *)rxep,
+				 ICE_RXQ_REARM_THRESH) < 0) {
+		if (rxq->rxrearm_nb + ICE_RXQ_REARM_THRESH >=
+		    rxq->nb_rx_desc) {
+			__m128i dma_addr0;
+
+			dma_addr0 = _mm_setzero_si128();
+			for (i = 0; i < ICE_DESCS_PER_LOOP; i++) {
+				rxep[i].mbuf = &rxq->fake_mbuf;
+				_mm_store_si128((__m128i *)&rxdp[i].read,
+						dma_addr0);
+			}
+		}
+		rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed +=
+			ICE_RXQ_REARM_THRESH;
+		return;
+	}
+
+#ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
+	struct rte_mbuf *mb0, *mb1;
+	__m128i dma_addr0, dma_addr1;
+	__m128i hdr_room = _mm_set_epi64x(RTE_PKTMBUF_HEADROOM,
+			RTE_PKTMBUF_HEADROOM);
+	/* Initialize the mbufs in vector, process 2 mbufs in one loop */
+	for (i = 0; i < ICE_RXQ_REARM_THRESH; i += 2, rxep += 2) {
+		__m128i vaddr0, vaddr1;
+
+		mb0 = rxep[0].mbuf;
+		mb1 = rxep[1].mbuf;
+
+		/* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */
+		RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) !=
+				offsetof(struct rte_mbuf, buf_addr) + 8);
+		vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr);
+		vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr);
+
+		/* convert pa to dma_addr hdr/data */
+		dma_addr0 = _mm_unpackhi_epi64(vaddr0, vaddr0);
+		dma_addr1 = _mm_unpackhi_epi64(vaddr1, vaddr1);
+
+		/* add headroom to pa values */
+		dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
+		dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
+
+		/* flush desc with pa dma_addr */
+		_mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
+		_mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
+	}
+#else
+#ifdef __AVX512VL__
+	if (avx512) {
+		struct rte_mbuf *mb0, *mb1, *mb2, *mb3;
+		struct rte_mbuf *mb4, *mb5, *mb6, *mb7;
+		__m512i dma_addr0_3, dma_addr4_7;
+		__m512i hdr_room = _mm512_set1_epi64(RTE_PKTMBUF_HEADROOM);
+		/* Initialize the mbufs in vector, process 8 mbufs in one loop */
+		for (i = 0; i < ICE_RXQ_REARM_THRESH;
+				i += 8, rxep += 8, rxdp += 8) {
+			__m128i vaddr0, vaddr1, vaddr2, vaddr3;
+			__m128i vaddr4, vaddr5, vaddr6, vaddr7;
+			__m256i vaddr0_1, vaddr2_3;
+			__m256i vaddr4_5, vaddr6_7;
+			__m512i vaddr0_3, vaddr4_7;
+
+			mb0 = rxep[0].mbuf;
+			mb1 = rxep[1].mbuf;
+			mb2 = rxep[2].mbuf;
+			mb3 = rxep[3].mbuf;
+			mb4 = rxep[4].mbuf;
+			mb5 = rxep[5].mbuf;
+			mb6 = rxep[6].mbuf;
+			mb7 = rxep[7].mbuf;
+
+			/* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */
+			RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) !=
+					offsetof(struct rte_mbuf, buf_addr) + 8);
+			vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr);
+			vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr);
+			vaddr2 = _mm_loadu_si128((__m128i *)&mb2->buf_addr);
+			vaddr3 = _mm_loadu_si128((__m128i *)&mb3->buf_addr);
+			vaddr4 = _mm_loadu_si128((__m128i *)&mb4->buf_addr);
+			vaddr5 = _mm_loadu_si128((__m128i *)&mb5->buf_addr);
+			vaddr6 = _mm_loadu_si128((__m128i *)&mb6->buf_addr);
+			vaddr7 = _mm_loadu_si128((__m128i *)&mb7->buf_addr);
+
+			/**
+			 * merge 0 & 1, by casting 0 to 256-bit and inserting 1
+			 * into the high lanes. Similarly for 2 & 3, and so on.
+			 */
+			vaddr0_1 =
+				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0),
+							vaddr1, 1);
+			vaddr2_3 =
+				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr2),
+							vaddr3, 1);
+			vaddr4_5 =
+				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr4),
+							vaddr5, 1);
+			vaddr6_7 =
+				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr6),
+							vaddr7, 1);
+			vaddr0_3 =
+				_mm512_inserti64x4(_mm512_castsi256_si512(vaddr0_1),
+						   vaddr2_3, 1);
+			vaddr4_7 =
+				_mm512_inserti64x4(_mm512_castsi256_si512(vaddr4_5),
+						   vaddr6_7, 1);
+
+			/* convert pa to dma_addr hdr/data */
+			dma_addr0_3 = _mm512_unpackhi_epi64(vaddr0_3, vaddr0_3);
+			dma_addr4_7 = _mm512_unpackhi_epi64(vaddr4_7, vaddr4_7);
+
+			/* add headroom to pa values */
+			dma_addr0_3 = _mm512_add_epi64(dma_addr0_3, hdr_room);
+			dma_addr4_7 = _mm512_add_epi64(dma_addr4_7, hdr_room);
+
+			/* flush desc with pa dma_addr */
+			_mm512_store_si512((__m512i *)&rxdp->read, dma_addr0_3);
+			_mm512_store_si512((__m512i *)&(rxdp + 4)->read, dma_addr4_7);
+		}
+	} else
+#endif /* __AVX512VL__ */
+	{
+		struct rte_mbuf *mb0, *mb1, *mb2, *mb3;
+		__m256i dma_addr0_1, dma_addr2_3;
+		__m256i hdr_room = _mm256_set1_epi64x(RTE_PKTMBUF_HEADROOM);
+		/* Initialize the mbufs in vector, process 4 mbufs in one loop */
+		for (i = 0; i < ICE_RXQ_REARM_THRESH;
+				i += 4, rxep += 4, rxdp += 4) {
+			__m128i vaddr0, vaddr1, vaddr2, vaddr3;
+			__m256i vaddr0_1, vaddr2_3;
+
+			mb0 = rxep[0].mbuf;
+			mb1 = rxep[1].mbuf;
+			mb2 = rxep[2].mbuf;
+			mb3 = rxep[3].mbuf;
+
+			/* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */
+			RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) !=
+					offsetof(struct rte_mbuf, buf_addr) + 8);
+			vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr);
+			vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr);
+			vaddr2 = _mm_loadu_si128((__m128i *)&mb2->buf_addr);
+			vaddr3 = _mm_loadu_si128((__m128i *)&mb3->buf_addr);
+
+			/**
+			 * merge 0 & 1, by casting 0 to 256-bit and inserting 1
+			 * into the high lanes. Similarly for 2 & 3
+			 */
+			vaddr0_1 =
+				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0),
+							vaddr1, 1);
+			vaddr2_3 =
+				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr2),
+							vaddr3, 1);
+
+			/* convert pa to dma_addr hdr/data */
+			dma_addr0_1 = _mm256_unpackhi_epi64(vaddr0_1, vaddr0_1);
+			dma_addr2_3 = _mm256_unpackhi_epi64(vaddr2_3, vaddr2_3);
+
+			/* add headroom to pa values */
+			dma_addr0_1 = _mm256_add_epi64(dma_addr0_1, hdr_room);
+			dma_addr2_3 = _mm256_add_epi64(dma_addr2_3, hdr_room);
+
+			/* flush desc with pa dma_addr */
+			_mm256_store_si256((__m256i *)&rxdp->read, dma_addr0_1);
+			_mm256_store_si256((__m256i *)&(rxdp + 2)->read, dma_addr2_3);
+		}
+	}
+
+#endif
+
+	rxq->rxrearm_start += ICE_RXQ_REARM_THRESH;
+	if (rxq->rxrearm_start >= rxq->nb_rx_desc)
+		rxq->rxrearm_start = 0;
+
+	rxq->rxrearm_nb -= ICE_RXQ_REARM_THRESH;
+
+	rx_id = (uint16_t)((rxq->rxrearm_start == 0) ?
+			     (rxq->nb_rx_desc - 1) : (rxq->rxrearm_start - 1));
+
+	/* Update the tail pointer on the NIC */
+	ICE_PCI_REG_WC_WRITE(rxq->qrx_tail, rx_id);
+}
+#endif /* __AVX2__ */
+
+#endif /* _ICE_RXTX_COMMON_AVX_H_ */
diff --git a/drivers/net/ice/ice_rxtx_vec_common.h b/drivers/net/ice/ice_rxtx_vec_common.h
index 5b5250565e..94ba87cbd9 100644
--- a/drivers/net/ice/ice_rxtx_vec_common.h
+++ b/drivers/net/ice/ice_rxtx_vec_common.h
@@ -11,6 +11,10 @@
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
 
+#ifdef __AVX2__
+#include "ice_rxtx_common_avx.h"
+#endif
+
 static inline uint16_t
 ice_rx_reassemble_packets(struct ice_rx_queue *rxq, struct rte_mbuf **rx_bufs,
 			  uint16_t nb_bufs, uint8_t *split_flags)
@@ -194,7 +198,7 @@ _ice_tx_queue_release_mbufs_vec(struct ice_tx_queue *txq)
 	 */
 	i = txq->tx_next_dd - txq->tx_rs_thresh + 1;
 
-#ifdef CC_AVX512_SUPPORT
+#ifdef __AVX512VL__
 	struct rte_eth_dev *dev = &rte_eth_devices[txq->vsi->adapter->pf.dev_data->port_id];
 
 	if (dev->tx_pkt_burst == ice_xmit_pkts_vec_avx512 ||
@@ -355,205 +359,6 @@ ice_tx_vec_dev_check_default(struct rte_eth_dev *dev)
 	return result;
 }
 
-#ifdef CC_AVX2_SUPPORT
-static __rte_always_inline void
-ice_rxq_rearm_common(struct ice_rx_queue *rxq, __rte_unused bool avx512)
-{
-	int i;
-	uint16_t rx_id;
-	volatile union ice_rx_flex_desc *rxdp;
-	struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start];
-
-	rxdp = rxq->rx_ring + rxq->rxrearm_start;
-
-	/* Pull 'n' more MBUFs into the software ring */
-	if (rte_mempool_get_bulk(rxq->mp,
-				 (void *)rxep,
-				 ICE_RXQ_REARM_THRESH) < 0) {
-		if (rxq->rxrearm_nb + ICE_RXQ_REARM_THRESH >=
-		    rxq->nb_rx_desc) {
-			__m128i dma_addr0;
-
-			dma_addr0 = _mm_setzero_si128();
-			for (i = 0; i < ICE_DESCS_PER_LOOP; i++) {
-				rxep[i].mbuf = &rxq->fake_mbuf;
-				_mm_store_si128((__m128i *)&rxdp[i].read,
-						dma_addr0);
-			}
-		}
-		rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed +=
-			ICE_RXQ_REARM_THRESH;
-		return;
-	}
-
-#ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
-	struct rte_mbuf *mb0, *mb1;
-	__m128i dma_addr0, dma_addr1;
-	__m128i hdr_room = _mm_set_epi64x(RTE_PKTMBUF_HEADROOM,
-			RTE_PKTMBUF_HEADROOM);
-	/* Initialize the mbufs in vector, process 2 mbufs in one loop */
-	for (i = 0; i < ICE_RXQ_REARM_THRESH; i += 2, rxep += 2) {
-		__m128i vaddr0, vaddr1;
-
-		mb0 = rxep[0].mbuf;
-		mb1 = rxep[1].mbuf;
-
-		/* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */
-		RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) !=
-				offsetof(struct rte_mbuf, buf_addr) + 8);
-		vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr);
-		vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr);
-
-		/* convert pa to dma_addr hdr/data */
-		dma_addr0 = _mm_unpackhi_epi64(vaddr0, vaddr0);
-		dma_addr1 = _mm_unpackhi_epi64(vaddr1, vaddr1);
-
-		/* add headroom to pa values */
-		dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
-		dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
-
-		/* flush desc with pa dma_addr */
-		_mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
-		_mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
-	}
-#else
-#ifdef CC_AVX512_SUPPORT
-	if (avx512) {
-		struct rte_mbuf *mb0, *mb1, *mb2, *mb3;
-		struct rte_mbuf *mb4, *mb5, *mb6, *mb7;
-		__m512i dma_addr0_3, dma_addr4_7;
-		__m512i hdr_room = _mm512_set1_epi64(RTE_PKTMBUF_HEADROOM);
-		/* Initialize the mbufs in vector, process 8 mbufs in one loop */
-		for (i = 0; i < ICE_RXQ_REARM_THRESH;
-				i += 8, rxep += 8, rxdp += 8) {
-			__m128i vaddr0, vaddr1, vaddr2, vaddr3;
-			__m128i vaddr4, vaddr5, vaddr6, vaddr7;
-			__m256i vaddr0_1, vaddr2_3;
-			__m256i vaddr4_5, vaddr6_7;
-			__m512i vaddr0_3, vaddr4_7;
-
-			mb0 = rxep[0].mbuf;
-			mb1 = rxep[1].mbuf;
-			mb2 = rxep[2].mbuf;
-			mb3 = rxep[3].mbuf;
-			mb4 = rxep[4].mbuf;
-			mb5 = rxep[5].mbuf;
-			mb6 = rxep[6].mbuf;
-			mb7 = rxep[7].mbuf;
-
-			/* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */
-			RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) !=
-					offsetof(struct rte_mbuf, buf_addr) + 8);
-			vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr);
-			vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr);
-			vaddr2 = _mm_loadu_si128((__m128i *)&mb2->buf_addr);
-			vaddr3 = _mm_loadu_si128((__m128i *)&mb3->buf_addr);
-			vaddr4 = _mm_loadu_si128((__m128i *)&mb4->buf_addr);
-			vaddr5 = _mm_loadu_si128((__m128i *)&mb5->buf_addr);
-			vaddr6 = _mm_loadu_si128((__m128i *)&mb6->buf_addr);
-			vaddr7 = _mm_loadu_si128((__m128i *)&mb7->buf_addr);
-
-			/**
-			 * merge 0 & 1, by casting 0 to 256-bit and inserting 1
-			 * into the high lanes. Similarly for 2 & 3, and so on.
-			 */
-			vaddr0_1 =
-				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0),
-							vaddr1, 1);
-			vaddr2_3 =
-				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr2),
-							vaddr3, 1);
-			vaddr4_5 =
-				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr4),
-							vaddr5, 1);
-			vaddr6_7 =
-				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr6),
-							vaddr7, 1);
-			vaddr0_3 =
-				_mm512_inserti64x4(_mm512_castsi256_si512(vaddr0_1),
-							vaddr2_3, 1);
-			vaddr4_7 =
-				_mm512_inserti64x4(_mm512_castsi256_si512(vaddr4_5),
-							vaddr6_7, 1);
-
-			/* convert pa to dma_addr hdr/data */
-			dma_addr0_3 = _mm512_unpackhi_epi64(vaddr0_3, vaddr0_3);
-			dma_addr4_7 = _mm512_unpackhi_epi64(vaddr4_7, vaddr4_7);
-
-			/* add headroom to pa values */
-			dma_addr0_3 = _mm512_add_epi64(dma_addr0_3, hdr_room);
-			dma_addr4_7 = _mm512_add_epi64(dma_addr4_7, hdr_room);
-
-			/* flush desc with pa dma_addr */
-			_mm512_store_si512((__m512i *)&rxdp->read, dma_addr0_3);
-			_mm512_store_si512((__m512i *)&(rxdp + 4)->read, dma_addr4_7);
-		}
-	} else
-#endif
-	{
-		struct rte_mbuf *mb0, *mb1, *mb2, *mb3;
-		__m256i dma_addr0_1, dma_addr2_3;
-		__m256i hdr_room = _mm256_set1_epi64x(RTE_PKTMBUF_HEADROOM);
-		/* Initialize the mbufs in vector, process 4 mbufs in one loop */
-		for (i = 0; i < ICE_RXQ_REARM_THRESH;
-				i += 4, rxep += 4, rxdp += 4) {
-			__m128i vaddr0, vaddr1, vaddr2, vaddr3;
-			__m256i vaddr0_1, vaddr2_3;
-
-			mb0 = rxep[0].mbuf;
-			mb1 = rxep[1].mbuf;
-			mb2 = rxep[2].mbuf;
-			mb3 = rxep[3].mbuf;
-
-			/* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */
-			RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) !=
-					offsetof(struct rte_mbuf, buf_addr) + 8);
-			vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr);
-			vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr);
-			vaddr2 = _mm_loadu_si128((__m128i *)&mb2->buf_addr);
-			vaddr3 = _mm_loadu_si128((__m128i *)&mb3->buf_addr);
-
-			/**
-			 * merge 0 & 1, by casting 0 to 256-bit and inserting 1
-			 * into the high lanes. Similarly for 2 & 3
-			 */
-			vaddr0_1 =
-				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0),
-							vaddr1, 1);
-			vaddr2_3 =
-				_mm256_inserti128_si256(_mm256_castsi128_si256(vaddr2),
-							vaddr3, 1);
-
-			/* convert pa to dma_addr hdr/data */
-			dma_addr0_1 = _mm256_unpackhi_epi64(vaddr0_1, vaddr0_1);
-			dma_addr2_3 = _mm256_unpackhi_epi64(vaddr2_3, vaddr2_3);
-
-			/* add headroom to pa values */
-			dma_addr0_1 = _mm256_add_epi64(dma_addr0_1, hdr_room);
-			dma_addr2_3 = _mm256_add_epi64(dma_addr2_3, hdr_room);
-
-			/* flush desc with pa dma_addr */
-			_mm256_store_si256((__m256i *)&rxdp->read, dma_addr0_1);
-			_mm256_store_si256((__m256i *)&(rxdp + 2)->read, dma_addr2_3);
-		}
-	}
-
-#endif
-
-	rxq->rxrearm_start += ICE_RXQ_REARM_THRESH;
-	if (rxq->rxrearm_start >= rxq->nb_rx_desc)
-		rxq->rxrearm_start = 0;
-
-	rxq->rxrearm_nb -= ICE_RXQ_REARM_THRESH;
-
-	rx_id = (uint16_t)((rxq->rxrearm_start == 0) ?
-			     (rxq->nb_rx_desc - 1) : (rxq->rxrearm_start - 1));
-
-	/* Update the tail pointer on the NIC */
-	ICE_PCI_REG_WC_WRITE(rxq->qrx_tail, rx_id);
-}
-#endif
-
 static inline void
 ice_txd_enable_offload(struct rte_mbuf *tx_pkt,
 		       uint64_t *txd_hi)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-09-29 12:43 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-788-3@http.bugs.dpdk.org/>
2021-08-18 16:38 ` [dpdk-stable] [PATCH 1/2] net/i40e: fix generic build on FreeBSD Bruce Richardson
2021-08-18 16:38   ` [dpdk-stable] [PATCH 2/2] net/ice: " Bruce Richardson
2021-08-30  8:18     ` Rong, Leyi
2021-09-01  6:24       ` Zhang, Qi Z
2021-09-01  6:23   ` [dpdk-stable] [dpdk-dev] [PATCH 1/2] net/i40e: " Zhang, Qi Z
2021-09-29 12:13 [dpdk-stable] " Leyi Rong
2021-09-29 12:13 ` [dpdk-stable] [PATCH 2/2] net/ice: " Leyi Rong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).