DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit
@ 2024-09-06 14:11 Bruce Richardson
  2024-09-06 14:11 ` [PATCH 1/4] net/i40e: fix AVX-512 pointer copy " Bruce Richardson
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Bruce Richardson @ 2024-09-06 14:11 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson

The AVX-512 copy code in multiple drivers was incorrect for 32-bit as it
assumed that each pointer was always 8B in size.

Bruce Richardson (4):
  net/i40e: fix AVX-512 pointer copy on 32-bit
  net/ice: fix AVX-512 pointer copy on 32-bit
  net/iavf: fix AVX-512 pointer copy on 32-bit
  common/idpf: fix AVX-512 pointer copy on 32-bit

 drivers/common/idpf/idpf_common_rxtx_avx512.c | 7 +++++++
 drivers/net/i40e/i40e_rxtx_vec_avx512.c       | 7 +++++++
 drivers/net/iavf/iavf_rxtx_vec_avx512.c       | 7 +++++++
 drivers/net/ice/ice_rxtx_vec_avx512.c         | 7 +++++++
 4 files changed, 28 insertions(+)

--
2.43.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] net/i40e: fix AVX-512 pointer copy on 32-bit
  2024-09-06 14:11 [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit Bruce Richardson
@ 2024-09-06 14:11 ` Bruce Richardson
  2024-09-30 13:27   ` Stokes, Ian
  2024-09-06 14:11 ` [PATCH 2/4] net/ice: " Bruce Richardson
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Bruce Richardson @ 2024-09-06 14:11 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, stable

The size of a pointer on 32-bit is only 4 rather than 8 bytes, so
copying 32 pointers only requires half the number of AVX-512 load store
operations.

Fixes: 5171b4ee6b6b ("net/i40e: optimize Tx by using AVX512")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/net/i40e/i40e_rxtx_vec_avx512.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx512.c b/drivers/net/i40e/i40e_rxtx_vec_avx512.c
index 0238b03f8a..3b2750221b 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_avx512.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_avx512.c
@@ -799,6 +799,7 @@ i40e_tx_free_bufs_avx512(struct i40e_tx_queue *txq)
 		uint32_t copied = 0;
 		/* n is multiple of 32 */
 		while (copied < n) {
+#ifdef RTE_ARCH_64
 			const __m512i a = _mm512_load_si512(&txep[copied]);
 			const __m512i b = _mm512_load_si512(&txep[copied + 8]);
 			const __m512i c = _mm512_load_si512(&txep[copied + 16]);
@@ -808,6 +809,12 @@ i40e_tx_free_bufs_avx512(struct i40e_tx_queue *txq)
 			_mm512_storeu_si512(&cache_objs[copied + 8], b);
 			_mm512_storeu_si512(&cache_objs[copied + 16], c);
 			_mm512_storeu_si512(&cache_objs[copied + 24], d);
+#else
+			const __m512i a = _mm512_load_si512(&txep[copied]);
+			const __m512i b = _mm512_load_si512(&txep[copied + 16]);
+			_mm512_storeu_si512(&cache_objs[copied], a);
+			_mm512_storeu_si512(&cache_objs[copied + 16], b);
+#endif
 			copied += 32;
 		}
 		cache->len += n;
-- 
2.43.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/4] net/ice: fix AVX-512 pointer copy on 32-bit
  2024-09-06 14:11 [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit Bruce Richardson
  2024-09-06 14:11 ` [PATCH 1/4] net/i40e: fix AVX-512 pointer copy " Bruce Richardson
@ 2024-09-06 14:11 ` Bruce Richardson
  2024-09-30 13:29   ` Stokes, Ian
  2024-09-06 14:11 ` [PATCH 3/4] net/iavf: " Bruce Richardson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Bruce Richardson @ 2024-09-06 14:11 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, stable

The size of a pointer on 32-bit is only 4 rather than 8 bytes, so
copying 32 pointers only requires half the number of AVX-512 load store
operations.

Fixes: a4e480de268e ("net/ice: optimize Tx by using AVX512")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/net/ice/ice_rxtx_vec_avx512.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ice/ice_rxtx_vec_avx512.c b/drivers/net/ice/ice_rxtx_vec_avx512.c
index 04148e8ea2..add095ef06 100644
--- a/drivers/net/ice/ice_rxtx_vec_avx512.c
+++ b/drivers/net/ice/ice_rxtx_vec_avx512.c
@@ -907,6 +907,7 @@ ice_tx_free_bufs_avx512(struct ice_tx_queue *txq)
 		uint32_t copied = 0;
 		/* n is multiple of 32 */
 		while (copied < n) {
+#ifdef RTE_ARCH_64
 			const __m512i a = _mm512_loadu_si512(&txep[copied]);
 			const __m512i b = _mm512_loadu_si512(&txep[copied + 8]);
 			const __m512i c = _mm512_loadu_si512(&txep[copied + 16]);
@@ -916,6 +917,12 @@ ice_tx_free_bufs_avx512(struct ice_tx_queue *txq)
 			_mm512_storeu_si512(&cache_objs[copied + 8], b);
 			_mm512_storeu_si512(&cache_objs[copied + 16], c);
 			_mm512_storeu_si512(&cache_objs[copied + 24], d);
+#else
+			const __m512i a = _mm512_loadu_si512(&txep[copied]);
+			const __m512i b = _mm512_loadu_si512(&txep[copied + 16]);
+			_mm512_storeu_si512(&cache_objs[copied], a);
+			_mm512_storeu_si512(&cache_objs[copied + 16], b);
+#endif
 			copied += 32;
 		}
 		cache->len += n;
-- 
2.43.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/4] net/iavf: fix AVX-512 pointer copy on 32-bit
  2024-09-06 14:11 [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit Bruce Richardson
  2024-09-06 14:11 ` [PATCH 1/4] net/i40e: fix AVX-512 pointer copy " Bruce Richardson
  2024-09-06 14:11 ` [PATCH 2/4] net/ice: " Bruce Richardson
@ 2024-09-06 14:11 ` Bruce Richardson
  2024-09-06 14:11 ` [PATCH 4/4] common/idpf: " Bruce Richardson
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Bruce Richardson @ 2024-09-06 14:11 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, stable

The size of a pointer on 32-bit is only 4 rather than 8 bytes, so
copying 32 pointers only requires half the number of AVX-512 load store
operations.

Fixes: 9ab9514c150e ("net/iavf: enable AVX512 for Tx")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/net/iavf/iavf_rxtx_vec_avx512.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/iavf/iavf_rxtx_vec_avx512.c b/drivers/net/iavf/iavf_rxtx_vec_avx512.c
index 3bb6f305df..d6a861bf80 100644
--- a/drivers/net/iavf/iavf_rxtx_vec_avx512.c
+++ b/drivers/net/iavf/iavf_rxtx_vec_avx512.c
@@ -1892,6 +1892,7 @@ iavf_tx_free_bufs_avx512(struct iavf_tx_queue *txq)
 		uint32_t copied = 0;
 		/* n is multiple of 32 */
 		while (copied < n) {
+#ifdef RTE_ARCH_64
 			const __m512i a = _mm512_loadu_si512(&txep[copied]);
 			const __m512i b = _mm512_loadu_si512(&txep[copied + 8]);
 			const __m512i c = _mm512_loadu_si512(&txep[copied + 16]);
@@ -1901,6 +1902,12 @@ iavf_tx_free_bufs_avx512(struct iavf_tx_queue *txq)
 			_mm512_storeu_si512(&cache_objs[copied + 8], b);
 			_mm512_storeu_si512(&cache_objs[copied + 16], c);
 			_mm512_storeu_si512(&cache_objs[copied + 24], d);
+#else
+			const __m512i a = _mm512_loadu_si512(&txep[copied]);
+			const __m512i b = _mm512_loadu_si512(&txep[copied + 16]);
+			_mm512_storeu_si512(&cache_objs[copied], a);
+			_mm512_storeu_si512(&cache_objs[copied + 16], b);
+#endif
 			copied += 32;
 		}
 		cache->len += n;
-- 
2.43.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/4] common/idpf: fix AVX-512 pointer copy on 32-bit
  2024-09-06 14:11 [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit Bruce Richardson
                   ` (2 preceding siblings ...)
  2024-09-06 14:11 ` [PATCH 3/4] net/iavf: " Bruce Richardson
@ 2024-09-06 14:11 ` Bruce Richardson
  2024-09-30 13:31 ` [PATCH 0/4] fix issues with using AVX-512 drivers " Stokes, Ian
  2024-09-30 15:38 ` David Marchand
  5 siblings, 0 replies; 14+ messages in thread
From: Bruce Richardson @ 2024-09-06 14:11 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, stable

The size of a pointer on 32-bit is only 4 rather than 8 bytes, so
copying 32 pointers only requires half the number of AVX-512 load store
operations.

Fixes: 5bf87b45b2c8 ("net/idpf: add AVX512 data path for single queue model")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/common/idpf/idpf_common_rxtx_avx512.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/common/idpf/idpf_common_rxtx_avx512.c b/drivers/common/idpf/idpf_common_rxtx_avx512.c
index 3b5e124ec8..b8450b03ae 100644
--- a/drivers/common/idpf/idpf_common_rxtx_avx512.c
+++ b/drivers/common/idpf/idpf_common_rxtx_avx512.c
@@ -1043,6 +1043,7 @@ idpf_tx_singleq_free_bufs_avx512(struct idpf_tx_queue *txq)
 		uint32_t copied = 0;
 		/* n is multiple of 32 */
 		while (copied < n) {
+#ifdef RTE_ARCH_64
 			const __m512i a = _mm512_loadu_si512(&txep[copied]);
 			const __m512i b = _mm512_loadu_si512(&txep[copied + 8]);
 			const __m512i c = _mm512_loadu_si512(&txep[copied + 16]);
@@ -1052,6 +1053,12 @@ idpf_tx_singleq_free_bufs_avx512(struct idpf_tx_queue *txq)
 			_mm512_storeu_si512(&cache_objs[copied + 8], b);
 			_mm512_storeu_si512(&cache_objs[copied + 16], c);
 			_mm512_storeu_si512(&cache_objs[copied + 24], d);
+#else
+			const __m512i a = _mm512_loadu_si512(&txep[copied]);
+			const __m512i b = _mm512_loadu_si512(&txep[copied + 16]);
+			_mm512_storeu_si512(&cache_objs[copied], a);
+			_mm512_storeu_si512(&cache_objs[copied + 16], b);
+#endif
 			copied += 32;
 		}
 		cache->len += n;
-- 
2.43.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 1/4] net/i40e: fix AVX-512 pointer copy on 32-bit
  2024-09-06 14:11 ` [PATCH 1/4] net/i40e: fix AVX-512 pointer copy " Bruce Richardson
@ 2024-09-30 13:27   ` Stokes, Ian
  0 siblings, 0 replies; 14+ messages in thread
From: Stokes, Ian @ 2024-09-30 13:27 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: Richardson, Bruce, stable

> The size of a pointer on 32-bit is only 4 rather than 8 bytes, so
> copying 32 pointers only requires half the number of AVX-512 load store
> operations.
> 
> Fixes: 5171b4ee6b6b ("net/i40e: optimize Tx by using AVX512")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  drivers/net/i40e/i40e_rxtx_vec_avx512.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx512.c
> b/drivers/net/i40e/i40e_rxtx_vec_avx512.c
> index 0238b03f8a..3b2750221b 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec_avx512.c
> +++ b/drivers/net/i40e/i40e_rxtx_vec_avx512.c
> @@ -799,6 +799,7 @@ i40e_tx_free_bufs_avx512(struct i40e_tx_queue *txq)
>  		uint32_t copied = 0;
>  		/* n is multiple of 32 */
>  		while (copied < n) {
> +#ifdef RTE_ARCH_64
>  			const __m512i a = _mm512_load_si512(&txep[copied]);
>  			const __m512i b = _mm512_load_si512(&txep[copied +
> 8]);
>  			const __m512i c = _mm512_load_si512(&txep[copied +
> 16]);
> @@ -808,6 +809,12 @@ i40e_tx_free_bufs_avx512(struct i40e_tx_queue *txq)
>  			_mm512_storeu_si512(&cache_objs[copied + 8], b);
>  			_mm512_storeu_si512(&cache_objs[copied + 16], c);
>  			_mm512_storeu_si512(&cache_objs[copied + 24], d);
> +#else
> +			const __m512i a = _mm512_load_si512(&txep[copied]);
> +			const __m512i b = _mm512_load_si512(&txep[copied +
> 16]);
> +			_mm512_storeu_si512(&cache_objs[copied], a);
> +			_mm512_storeu_si512(&cache_objs[copied + 16], b);
> +#endif
>  			copied += 32;
>  		}
>  		cache->len += n;
> --
> 2.43.0

Looks good to me, ACKED.

Thanks
Ian


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 2/4] net/ice: fix AVX-512 pointer copy on 32-bit
  2024-09-06 14:11 ` [PATCH 2/4] net/ice: " Bruce Richardson
@ 2024-09-30 13:29   ` Stokes, Ian
  0 siblings, 0 replies; 14+ messages in thread
From: Stokes, Ian @ 2024-09-30 13:29 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: Richardson, Bruce, stable

> The size of a pointer on 32-bit is only 4 rather than 8 bytes, so
> copying 32 pointers only requires half the number of AVX-512 load store
> operations.
> 
> Fixes: a4e480de268e ("net/ice: optimize Tx by using AVX512")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  drivers/net/ice/ice_rxtx_vec_avx512.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/net/ice/ice_rxtx_vec_avx512.c
> b/drivers/net/ice/ice_rxtx_vec_avx512.c
> index 04148e8ea2..add095ef06 100644
> --- a/drivers/net/ice/ice_rxtx_vec_avx512.c
> +++ b/drivers/net/ice/ice_rxtx_vec_avx512.c
> @@ -907,6 +907,7 @@ ice_tx_free_bufs_avx512(struct ice_tx_queue *txq)
>  		uint32_t copied = 0;
>  		/* n is multiple of 32 */
>  		while (copied < n) {
> +#ifdef RTE_ARCH_64
>  			const __m512i a =
> _mm512_loadu_si512(&txep[copied]);
>  			const __m512i b = _mm512_loadu_si512(&txep[copied
> + 8]);
>  			const __m512i c = _mm512_loadu_si512(&txep[copied +
> 16]);
> @@ -916,6 +917,12 @@ ice_tx_free_bufs_avx512(struct ice_tx_queue *txq)
>  			_mm512_storeu_si512(&cache_objs[copied + 8], b);
>  			_mm512_storeu_si512(&cache_objs[copied + 16], c);
>  			_mm512_storeu_si512(&cache_objs[copied + 24], d);
> +#else
> +			const __m512i a =
> _mm512_loadu_si512(&txep[copied]);
> +			const __m512i b = _mm512_loadu_si512(&txep[copied
> + 16]);
> +			_mm512_storeu_si512(&cache_objs[copied], a);
> +			_mm512_storeu_si512(&cache_objs[copied + 16], b);
> +#endif
>  			copied += 32;
>  		}
>  		cache->len += n;
> --
 LGTM, Acked.

Thanks
Ian


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit
  2024-09-06 14:11 [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit Bruce Richardson
                   ` (3 preceding siblings ...)
  2024-09-06 14:11 ` [PATCH 4/4] common/idpf: " Bruce Richardson
@ 2024-09-30 13:31 ` Stokes, Ian
  2024-10-01 13:36   ` Bruce Richardson
  2024-09-30 15:38 ` David Marchand
  5 siblings, 1 reply; 14+ messages in thread
From: Stokes, Ian @ 2024-09-30 13:31 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: Richardson, Bruce

> The AVX-512 copy code in multiple drivers was incorrect for 32-bit as it
> assumed that each pointer was always 8B in size.
> 
> Bruce Richardson (4):
>   net/i40e: fix AVX-512 pointer copy on 32-bit
>   net/ice: fix AVX-512 pointer copy on 32-bit
>   net/iavf: fix AVX-512 pointer copy on 32-bit
>   common/idpf: fix AVX-512 pointer copy on 32-bit
> 
>  drivers/common/idpf/idpf_common_rxtx_avx512.c | 7 +++++++
>  drivers/net/i40e/i40e_rxtx_vec_avx512.c       | 7 +++++++
>  drivers/net/iavf/iavf_rxtx_vec_avx512.c       | 7 +++++++
>  drivers/net/ice/ice_rxtx_vec_avx512.c         | 7 +++++++

Series looks good to me.

Series-acked-by: Ian Stokes <ian.stokes@intel.com>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit
  2024-09-06 14:11 [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit Bruce Richardson
                   ` (4 preceding siblings ...)
  2024-09-30 13:31 ` [PATCH 0/4] fix issues with using AVX-512 drivers " Stokes, Ian
@ 2024-09-30 15:38 ` David Marchand
  2024-09-30 15:57   ` Bruce Richardson
  5 siblings, 1 reply; 14+ messages in thread
From: David Marchand @ 2024-09-30 15:38 UTC (permalink / raw)
  To: Bruce Richardson, Ian Stokes; +Cc: dev, Robin Jarry

On Fri, Sep 6, 2024 at 4:11 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> The AVX-512 copy code in multiple drivers was incorrect for 32-bit as it
> assumed that each pointer was always 8B in size.
>
> Bruce Richardson (4):
>   net/i40e: fix AVX-512 pointer copy on 32-bit
>   net/ice: fix AVX-512 pointer copy on 32-bit
>   net/iavf: fix AVX-512 pointer copy on 32-bit
>   common/idpf: fix AVX-512 pointer copy on 32-bit
>
>  drivers/common/idpf/idpf_common_rxtx_avx512.c | 7 +++++++
>  drivers/net/i40e/i40e_rxtx_vec_avx512.c       | 7 +++++++
>  drivers/net/iavf/iavf_rxtx_vec_avx512.c       | 7 +++++++
>  drivers/net/ice/ice_rxtx_vec_avx512.c         | 7 +++++++
>  4 files changed, 28 insertions(+)

Sorry, not directly related to this series, but as I was checking some
AVX512 patch, I suspect some drivers are missing runtime checks for
availability of some AVX512 instructions:

$ for meson in $(git grep -l __AVX512[^_]*__
'drivers/**/meson.build'); do dir=$(dirname $meson); for flag in $(git
grep -ho __AVX512[^_]*__ $dir | sort -u); do flag=${flag%%__};
flag=${flag##__}; git grep -ql
rte_cpu_get_flag_enabled.RTE_CPUFLAG_$flag $dir || echo
RTE_CPUFLAG_$flag check missing in $dir; done; done

RTE_CPUFLAG_AVX512BW check missing in drivers/common/idpf
RTE_CPUFLAG_AVX512DQ check missing in drivers/common/idpf
RTE_CPUFLAG_AVX512F check missing in drivers/common/idpf
RTE_CPUFLAG_AVX512VL check missing in drivers/net/i40e
RTE_CPUFLAG_AVX512VL check missing in drivers/net/ice

Maybe some flags are implictly available... worth a confirmation from
Intel in any case from my pov.

Thanks.

-- 
David Marchand


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit
  2024-09-30 15:38 ` David Marchand
@ 2024-09-30 15:57   ` Bruce Richardson
  2024-09-30 17:52     ` Bruce Richardson
  2024-10-01  7:14     ` David Marchand
  0 siblings, 2 replies; 14+ messages in thread
From: Bruce Richardson @ 2024-09-30 15:57 UTC (permalink / raw)
  To: David Marchand; +Cc: Ian Stokes, dev, Robin Jarry

On Mon, Sep 30, 2024 at 05:38:44PM +0200, David Marchand wrote:
> On Fri, Sep 6, 2024 at 4:11 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > The AVX-512 copy code in multiple drivers was incorrect for 32-bit as it
> > assumed that each pointer was always 8B in size.
> >
> > Bruce Richardson (4):
> >   net/i40e: fix AVX-512 pointer copy on 32-bit
> >   net/ice: fix AVX-512 pointer copy on 32-bit
> >   net/iavf: fix AVX-512 pointer copy on 32-bit
> >   common/idpf: fix AVX-512 pointer copy on 32-bit
> >
> >  drivers/common/idpf/idpf_common_rxtx_avx512.c | 7 +++++++
> >  drivers/net/i40e/i40e_rxtx_vec_avx512.c       | 7 +++++++
> >  drivers/net/iavf/iavf_rxtx_vec_avx512.c       | 7 +++++++
> >  drivers/net/ice/ice_rxtx_vec_avx512.c         | 7 +++++++
> >  4 files changed, 28 insertions(+)
> 
> Sorry, not directly related to this series, but as I was checking some
> AVX512 patch, I suspect some drivers are missing runtime checks for
> availability of some AVX512 instructions:
> 
> $ for meson in $(git grep -l __AVX512[^_]*__
> 'drivers/**/meson.build'); do dir=$(dirname $meson); for flag in $(git
> grep -ho __AVX512[^_]*__ $dir | sort -u); do flag=${flag%%__};
> flag=${flag##__}; git grep -ql
> rte_cpu_get_flag_enabled.RTE_CPUFLAG_$flag $dir || echo
> RTE_CPUFLAG_$flag check missing in $dir; done; done
> 
> RTE_CPUFLAG_AVX512BW check missing in drivers/common/idpf
> RTE_CPUFLAG_AVX512DQ check missing in drivers/common/idpf
> RTE_CPUFLAG_AVX512F check missing in drivers/common/idpf
> RTE_CPUFLAG_AVX512VL check missing in drivers/net/i40e
> RTE_CPUFLAG_AVX512VL check missing in drivers/net/ice
> 
> Maybe some flags are implictly available... worth a confirmation from
> Intel in any case from my pov.
> 

I think it would be good practice to explicitly check for all the AVX-512
extensions actually used. Ideally, as a cleanup, we should probably check
for those listed (f, bw, dq and vl) once early in the config and reuse that
value throughout the build, rather than having each and every PMD
continually check them.

/Bruce

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit
  2024-09-30 15:57   ` Bruce Richardson
@ 2024-09-30 17:52     ` Bruce Richardson
  2024-10-01  7:14     ` David Marchand
  1 sibling, 0 replies; 14+ messages in thread
From: Bruce Richardson @ 2024-09-30 17:52 UTC (permalink / raw)
  To: David Marchand; +Cc: Ian Stokes, dev, Robin Jarry

On Mon, Sep 30, 2024 at 04:57:21PM +0100, Bruce Richardson wrote:
> On Mon, Sep 30, 2024 at 05:38:44PM +0200, David Marchand wrote:
> > On Fri, Sep 6, 2024 at 4:11 PM Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > >
> > > The AVX-512 copy code in multiple drivers was incorrect for 32-bit as it
> > > assumed that each pointer was always 8B in size.
> > >
> > > Bruce Richardson (4):
> > >   net/i40e: fix AVX-512 pointer copy on 32-bit
> > >   net/ice: fix AVX-512 pointer copy on 32-bit
> > >   net/iavf: fix AVX-512 pointer copy on 32-bit
> > >   common/idpf: fix AVX-512 pointer copy on 32-bit
> > >
> > >  drivers/common/idpf/idpf_common_rxtx_avx512.c | 7 +++++++
> > >  drivers/net/i40e/i40e_rxtx_vec_avx512.c       | 7 +++++++
> > >  drivers/net/iavf/iavf_rxtx_vec_avx512.c       | 7 +++++++
> > >  drivers/net/ice/ice_rxtx_vec_avx512.c         | 7 +++++++
> > >  4 files changed, 28 insertions(+)
> > 
> > Sorry, not directly related to this series, but as I was checking some
> > AVX512 patch, I suspect some drivers are missing runtime checks for
> > availability of some AVX512 instructions:
> > 
> > $ for meson in $(git grep -l __AVX512[^_]*__
> > 'drivers/**/meson.build'); do dir=$(dirname $meson); for flag in $(git
> > grep -ho __AVX512[^_]*__ $dir | sort -u); do flag=${flag%%__};
> > flag=${flag##__}; git grep -ql
> > rte_cpu_get_flag_enabled.RTE_CPUFLAG_$flag $dir || echo
> > RTE_CPUFLAG_$flag check missing in $dir; done; done
> > 
> > RTE_CPUFLAG_AVX512BW check missing in drivers/common/idpf
> > RTE_CPUFLAG_AVX512DQ check missing in drivers/common/idpf
> > RTE_CPUFLAG_AVX512F check missing in drivers/common/idpf
> > RTE_CPUFLAG_AVX512VL check missing in drivers/net/i40e
> > RTE_CPUFLAG_AVX512VL check missing in drivers/net/ice
> > 
> > Maybe some flags are implictly available... worth a confirmation from
> > Intel in any case from my pov.
> > 
> 
> I think it would be good practice to explicitly check for all the AVX-512
> extensions actually used. Ideally, as a cleanup, we should probably check
> for those listed (f, bw, dq and vl) once early in the config and reuse that
> value throughout the build, rather than having each and every PMD
> continually check them.
> 
Draft patchset for this now submitted to dev list [1]. Reviews and feedback
welcome, as always!

/Bruce

[1] https://patches.dpdk.org/project/dpdk/list/?series=33188

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit
  2024-09-30 15:57   ` Bruce Richardson
  2024-09-30 17:52     ` Bruce Richardson
@ 2024-10-01  7:14     ` David Marchand
  2024-10-01  7:42       ` Bruce Richardson
  1 sibling, 1 reply; 14+ messages in thread
From: David Marchand @ 2024-10-01  7:14 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: Ian Stokes, dev, Robin Jarry

On Mon, Sep 30, 2024 at 5:57 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Mon, Sep 30, 2024 at 05:38:44PM +0200, David Marchand wrote:
> > On Fri, Sep 6, 2024 at 4:11 PM Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > >
> > > The AVX-512 copy code in multiple drivers was incorrect for 32-bit as it
> > > assumed that each pointer was always 8B in size.
> > >
> > > Bruce Richardson (4):
> > >   net/i40e: fix AVX-512 pointer copy on 32-bit
> > >   net/ice: fix AVX-512 pointer copy on 32-bit
> > >   net/iavf: fix AVX-512 pointer copy on 32-bit
> > >   common/idpf: fix AVX-512 pointer copy on 32-bit
> > >
> > >  drivers/common/idpf/idpf_common_rxtx_avx512.c | 7 +++++++
> > >  drivers/net/i40e/i40e_rxtx_vec_avx512.c       | 7 +++++++
> > >  drivers/net/iavf/iavf_rxtx_vec_avx512.c       | 7 +++++++
> > >  drivers/net/ice/ice_rxtx_vec_avx512.c         | 7 +++++++
> > >  4 files changed, 28 insertions(+)
> >
> > Sorry, not directly related to this series, but as I was checking some
> > AVX512 patch, I suspect some drivers are missing runtime checks for
> > availability of some AVX512 instructions:
> >
> > $ for meson in $(git grep -l __AVX512[^_]*__
> > 'drivers/**/meson.build'); do dir=$(dirname $meson); for flag in $(git
> > grep -ho __AVX512[^_]*__ $dir | sort -u); do flag=${flag%%__};
> > flag=${flag##__}; git grep -ql
> > rte_cpu_get_flag_enabled.RTE_CPUFLAG_$flag $dir || echo
> > RTE_CPUFLAG_$flag check missing in $dir; done; done
> >
> > RTE_CPUFLAG_AVX512BW check missing in drivers/common/idpf
> > RTE_CPUFLAG_AVX512DQ check missing in drivers/common/idpf
> > RTE_CPUFLAG_AVX512F check missing in drivers/common/idpf
> > RTE_CPUFLAG_AVX512VL check missing in drivers/net/i40e
> > RTE_CPUFLAG_AVX512VL check missing in drivers/net/ice
> >
> > Maybe some flags are implictly available... worth a confirmation from
> > Intel in any case from my pov.
> >
>
> I think it would be good practice to explicitly check for all the AVX-512
> extensions actually used. Ideally, as a cleanup, we should probably check
> for those listed (f, bw, dq and vl) once early in the config and reuse that
> value throughout the build, rather than having each and every PMD
> continually check them.

This simplification on the build side looks good.

On the other hand, vectorized handlers in libraries and drivers are
selected based on some AVX512 instructions availability at runtime.
Don't we need to validate *runtime* availability of each of those
instructions in each library/driver?


-- 
David Marchand


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit
  2024-10-01  7:14     ` David Marchand
@ 2024-10-01  7:42       ` Bruce Richardson
  0 siblings, 0 replies; 14+ messages in thread
From: Bruce Richardson @ 2024-10-01  7:42 UTC (permalink / raw)
  To: David Marchand; +Cc: Ian Stokes, dev, Robin Jarry

On Tue, Oct 01, 2024 at 09:14:53AM +0200, David Marchand wrote:
> On Mon, Sep 30, 2024 at 5:57 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > On Mon, Sep 30, 2024 at 05:38:44PM +0200, David Marchand wrote:
> > > On Fri, Sep 6, 2024 at 4:11 PM Bruce Richardson
> > > <bruce.richardson@intel.com> wrote:
> > > >
> > > > The AVX-512 copy code in multiple drivers was incorrect for 32-bit as it
> > > > assumed that each pointer was always 8B in size.
> > > >
> > > > Bruce Richardson (4):
> > > >   net/i40e: fix AVX-512 pointer copy on 32-bit
> > > >   net/ice: fix AVX-512 pointer copy on 32-bit
> > > >   net/iavf: fix AVX-512 pointer copy on 32-bit
> > > >   common/idpf: fix AVX-512 pointer copy on 32-bit
> > > >
> > > >  drivers/common/idpf/idpf_common_rxtx_avx512.c | 7 +++++++
> > > >  drivers/net/i40e/i40e_rxtx_vec_avx512.c       | 7 +++++++
> > > >  drivers/net/iavf/iavf_rxtx_vec_avx512.c       | 7 +++++++
> > > >  drivers/net/ice/ice_rxtx_vec_avx512.c         | 7 +++++++
> > > >  4 files changed, 28 insertions(+)
> > >
> > > Sorry, not directly related to this series, but as I was checking some
> > > AVX512 patch, I suspect some drivers are missing runtime checks for
> > > availability of some AVX512 instructions:
> > >
> > > $ for meson in $(git grep -l __AVX512[^_]*__
> > > 'drivers/**/meson.build'); do dir=$(dirname $meson); for flag in $(git
> > > grep -ho __AVX512[^_]*__ $dir | sort -u); do flag=${flag%%__};
> > > flag=${flag##__}; git grep -ql
> > > rte_cpu_get_flag_enabled.RTE_CPUFLAG_$flag $dir || echo
> > > RTE_CPUFLAG_$flag check missing in $dir; done; done
> > >
> > > RTE_CPUFLAG_AVX512BW check missing in drivers/common/idpf
> > > RTE_CPUFLAG_AVX512DQ check missing in drivers/common/idpf
> > > RTE_CPUFLAG_AVX512F check missing in drivers/common/idpf
> > > RTE_CPUFLAG_AVX512VL check missing in drivers/net/i40e
> > > RTE_CPUFLAG_AVX512VL check missing in drivers/net/ice
> > >
> > > Maybe some flags are implictly available... worth a confirmation from
> > > Intel in any case from my pov.
> > >
> >
> > I think it would be good practice to explicitly check for all the AVX-512
> > extensions actually used. Ideally, as a cleanup, we should probably check
> > for those listed (f, bw, dq and vl) once early in the config and reuse that
> > value throughout the build, rather than having each and every PMD
> > continually check them.
> 
> This simplification on the build side looks good.
> 
> On the other hand, vectorized handlers in libraries and drivers are
> selected based on some AVX512 instructions availability at runtime.
> Don't we need to validate *runtime* availability of each of those
> instructions in each library/driver?
> 
Yes, each lib and driver should be also checking these at runtime.
Simplification of such checks may be possible, and may be something I look
at in future, time permitting. For now, an example of the checks done can
be seen in [1].

/Bruce

[1] https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3486

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit
  2024-09-30 13:31 ` [PATCH 0/4] fix issues with using AVX-512 drivers " Stokes, Ian
@ 2024-10-01 13:36   ` Bruce Richardson
  0 siblings, 0 replies; 14+ messages in thread
From: Bruce Richardson @ 2024-10-01 13:36 UTC (permalink / raw)
  To: Stokes, Ian; +Cc: dev

On Mon, Sep 30, 2024 at 02:31:56PM +0100, Stokes, Ian wrote:
> > The AVX-512 copy code in multiple drivers was incorrect for 32-bit as it
> > assumed that each pointer was always 8B in size.
> > 
> > Bruce Richardson (4):
> >   net/i40e: fix AVX-512 pointer copy on 32-bit
> >   net/ice: fix AVX-512 pointer copy on 32-bit
> >   net/iavf: fix AVX-512 pointer copy on 32-bit
> >   common/idpf: fix AVX-512 pointer copy on 32-bit
> > 
> >  drivers/common/idpf/idpf_common_rxtx_avx512.c | 7 +++++++
> >  drivers/net/i40e/i40e_rxtx_vec_avx512.c       | 7 +++++++
> >  drivers/net/iavf/iavf_rxtx_vec_avx512.c       | 7 +++++++
> >  drivers/net/ice/ice_rxtx_vec_avx512.c         | 7 +++++++
> 
> Series looks good to me.
> 
> Series-acked-by: Ian Stokes <ian.stokes@intel.com>
> 
Applied to dpdk-next-net-intel.

/Bruce

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-10-01 13:36 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-06 14:11 [PATCH 0/4] fix issues with using AVX-512 drivers on 32-bit Bruce Richardson
2024-09-06 14:11 ` [PATCH 1/4] net/i40e: fix AVX-512 pointer copy " Bruce Richardson
2024-09-30 13:27   ` Stokes, Ian
2024-09-06 14:11 ` [PATCH 2/4] net/ice: " Bruce Richardson
2024-09-30 13:29   ` Stokes, Ian
2024-09-06 14:11 ` [PATCH 3/4] net/iavf: " Bruce Richardson
2024-09-06 14:11 ` [PATCH 4/4] common/idpf: " Bruce Richardson
2024-09-30 13:31 ` [PATCH 0/4] fix issues with using AVX-512 drivers " Stokes, Ian
2024-10-01 13:36   ` Bruce Richardson
2024-09-30 15:38 ` David Marchand
2024-09-30 15:57   ` Bruce Richardson
2024-09-30 17:52     ` Bruce Richardson
2024-10-01  7:14     ` David Marchand
2024-10-01  7:42       ` Bruce Richardson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).