* [PATCH 1/5] examples/l3fwd: fix port group mask generation
From: pbhagavatula @ 2022-08-29 9:44 UTC
To: jerinj, David Christensen; +Cc: dev, Pavan Nikhilesh, stable

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Fix port group mask generation in altivec, vec_any_eq returns
0 or 1 while port_groupx4 expects comparison mask result.

Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
Cc: stable@dpdk.org

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 examples/common/altivec/port_group.h | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
index 5e209b02fa..7a6ef390ff 100644
--- a/examples/common/altivec/port_group.h
+++ b/examples/common/altivec/port_group.h
@@ -26,12 +26,19 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
 		uint16_t u16[FWDSTEP + 1];
 		uint64_t u64;
 	} *pnum = (void *)pn;
+	union u_vec {
+		__vector unsigned short v_us;
+		unsigned short s[8];
+	};
 
+	union u_vec res;
 	int32_t v;
 
-	v = vec_any_eq(dp1, dp2);
-
+	dp1 = vec_cmpeq(dp1, dp2);
+	res.v_us = dp1;
 
+	v = (res.s[0] & 0x1) | (res.s[1] & 0x2) | (res.s[2] & 0x4) |
+	    (res.s[3] & 0x8);
 	/* update last port counter. */
 	lp[0] += gptbl[v].lpv;
 
--
2.25.1
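The distinction drawn in the commit message can be illustrated with a portable scalar sketch. This is only an illustration under stated assumptions, not the Altivec code from the patch: the helper names `any_eq4` and `lane_mask4` are hypothetical. The first models a predicate that yields 0 or 1, as the commit message says vec_any_eq does; the second builds a per-lane comparison mask of the kind port_groupx4 is said to expect as the index `v`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of an "any lane equal" predicate.
 * The result is only ever 0 or 1, whichever lanes match. */
static inline int
any_eq4(const uint16_t a[4], const uint16_t b[4])
{
	assert(a != NULL && b != NULL);
	for (int i = 0; i < 4; i++)
		if (a[i] == b[i])
			return 1;
	return 0;
}

/* Hypothetical scalar model of a comparison mask.
 * Bit i of the result is set when lane i of a equals lane i of b. */
static inline int
lane_mask4(const uint16_t a[4], const uint16_t b[4])
{
	int mask = 0;

	assert(a != NULL && b != NULL);
	for (int i = 0; i < 4; i++)
		mask |= (a[i] == b[i]) << i;
	return mask;
}
```

For lanes {1, 2, 3, 4} compared with {1, 9, 3, 9}, the predicate collapses to 1 while the mask is 0x5, so a table indexed by the predicate could only ever be read at entries 0 and 1.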
* [PATCH v2 1/5] examples/l3fwd: fix port group mask generation
From: pbhagavatula @ 2022-09-02 9:18 UTC
To: jerinj, David Christensen; +Cc: dev, Pavan Nikhilesh, stable

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Fix port group mask generation in altivec, vec_any_eq returns
0 or 1 while port_groupx4 expects comparison mask result.

Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
Cc: stable@dpdk.org

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 v2 Changes:
 - Fix PPC, RISC-V, aarch32 compilation.

 examples/common/altivec/port_group.h | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
index 5e209b02fa..592ef80b7f 100644
--- a/examples/common/altivec/port_group.h
+++ b/examples/common/altivec/port_group.h
@@ -26,12 +26,19 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
 		uint16_t u16[FWDSTEP + 1];
 		uint64_t u64;
 	} *pnum = (void *)pn;
+	union u_vec {
+		__vector unsigned short v_us;
+		unsigned short s[8];
+	};
 
+	union u_vec res;
 	int32_t v;
 
-	v = vec_any_eq(dp1, dp2);
-
+	dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);
+	res.v_us = dp1;
 
+	v = (res.s[0] & 0x1) | (res.s[1] & 0x2) | (res.s[2] & 0x4) |
+	    (res.s[3] & 0x8);
 	/* update last port counter. */
 	lp[0] += gptbl[v].lpv;
 
--
2.25.1
* Re: [PATCH v2 1/5] examples/l3fwd: fix port group mask generation
From: David Christensen @ 2022-09-08 18:33 UTC
To: pbhagavatula, jerinj; +Cc: dev, stable

On 9/2/22 2:18 AM, pbhagavatula@marvell.com wrote:
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>
> Fix port group mask generation in altivec, vec_any_eq returns
> 0 or 1 while port_groupx4 expects comparison mask result.
>
> Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
> Cc: stable@dpdk.org
>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
>  v2 Changes:
>  - Fix PPC, RISC-V, aarch32 compilation.
>
>  examples/common/altivec/port_group.h | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
> index 5e209b02fa..592ef80b7f 100644
> --- a/examples/common/altivec/port_group.h
> +++ b/examples/common/altivec/port_group.h
> @@ -26,12 +26,19 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
>  		uint16_t u16[FWDSTEP + 1];
>  		uint64_t u64;
>  	} *pnum = (void *)pn;
> +	union u_vec {
> +		__vector unsigned short v_us;
> +		unsigned short s[8];
> +	};
>
> +	union u_vec res;
>  	int32_t v;
>
> -	v = vec_any_eq(dp1, dp2);
> -
> +	dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);

Altivec vec_cmpeq() is similar to Intel _mm_cmpeq_*(), so this looks
right to me.

> +	res.v_us = dp1;
>
> +	v = (res.s[0] & 0x1) | (res.s[1] & 0x2) | (res.s[2] & 0x4) |
> +	    (res.s[3] & 0x8);

This can be vectorized too. The Intel _mm_unpacklo_epi16() intrinsic
can be replaced with the following Altivec code:

extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_unpacklo_epi16 (__m128i __A, __m128i __B)
{
  return (__m128i) vec_mergeh ((__v8hi)__A, (__v8hi)__B);
}

The Intel _mm_movemask_ps() intrinsic can be replaced with the following
Altivec implementation:

/* Creates a 4-bit mask from the most significant bits of the SPFP values. */
extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_movemask_ps (__m128 __A)
{
  __vector unsigned long long result;
  static const __vector unsigned int perm_mask =
    {
#ifdef __LITTLE_ENDIAN__
      0x00204060, 0x80808080, 0x80808080, 0x80808080
#else
      0x80808080, 0x80808080, 0x80808080, 0x00204060
#endif
    };

  result = ((__vector unsigned long long)
            vec_vbpermq ((__vector unsigned char) __A,
                         (__vector unsigned char) perm_mask));

#ifdef __LITTLE_ENDIAN__
  return result[1];
#else
  return result[0];
#endif
}

Dave
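The two-step idea in the review above (widen the 16-bit comparison results, then collect one bit per 32-bit lane) can be sketched in portable scalar C. This is a simplified model and an assumption about intent, not the Altivec implementation: the names `widen_low4` and `movemask4` are hypothetical, and the sketch ignores the endianness handling that the real vec_vbpermq permute mask has to deal with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of merging a vector with itself: each of the
 * first four 16-bit lanes is duplicated into one 32-bit lane, so a
 * lane of 0xffff becomes 0xffffffff and a lane of 0 stays 0. */
static inline void
widen_low4(const uint16_t in[8], uint32_t out[4])
{
	assert(in != NULL && out != NULL);
	for (int i = 0; i < 4; i++)
		out[i] = ((uint32_t)in[i] << 16) | in[i];
}

/* Hypothetical model of a movemask over four 32-bit lanes: bit i of
 * the result is the most significant bit of lane i. */
static inline int
movemask4(const uint32_t lanes[4])
{
	int mask = 0;

	assert(lanes != NULL);
	for (int i = 0; i < 4; i++)
		mask |= (int)(lanes[i] >> 31) << i;
	return mask;
}
```

Feeding a compare result whose first four lanes are 0xffff, 0, 0xffff, 0xffff through both helpers gives the mask 0xd, which matches what the scalar expression in the v2 patch computes from `res.s[0]` to `res.s[3]`.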
* RE: [EXT] Re: [PATCH v2 1/5] examples/l3fwd: fix port group mask generation
From: Pavan Nikhilesh Bhagavatula @ 2022-09-09 5:56 UTC
To: David Christensen, Jerin Jacob Kollanukkaran; +Cc: dev, stable

> On 9/2/22 2:18 AM, pbhagavatula@marvell.com wrote:
> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >
> > Fix port group mask generation in altivec, vec_any_eq returns
> > 0 or 1 while port_groupx4 expects comparison mask result.
> >
> > Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > ---
> >  v2 Changes:
> >  - Fix PPC, RISC-V, aarch32 compilation.
> >
> >  examples/common/altivec/port_group.h | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
> > index 5e209b02fa..592ef80b7f 100644
> > --- a/examples/common/altivec/port_group.h
> > +++ b/examples/common/altivec/port_group.h
> > @@ -26,12 +26,19 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
> >  		uint16_t u16[FWDSTEP + 1];
> >  		uint64_t u64;
> >  	} *pnum = (void *)pn;
> > +	union u_vec {
> > +		__vector unsigned short v_us;
> > +		unsigned short s[8];
> > +	};
> >
> > +	union u_vec res;
> >  	int32_t v;
> >
> > -	v = vec_any_eq(dp1, dp2);
> > -
> > +	dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);
>
> Altivec vec_cmpeq() is similar to Intel _mm_cmpeq_*(), so this looks
> right to me.
>
> > +	res.v_us = dp1;
> >
> > +	v = (res.s[0] & 0x1) | (res.s[1] & 0x2) | (res.s[2] & 0x4) |
> > +	    (res.s[3] & 0x8);
>
> This can be vectorized too. The Intel _mm_unpacklo_epi16() intrinsic
> can be replaced with the following Altivec code:
>
> extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> _mm_unpacklo_epi16 (__m128i __A, __m128i __B)
> {
>   return (__m128i) vec_mergeh ((__v8hi)__A, (__v8hi)__B);
> }
>
> The Intel _mm_movemask_ps() intrinsic can be replaced with the following
> Altivec implementation:
>
> /* Creates a 4-bit mask from the most significant bits of the SPFP values. */
> extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> _mm_movemask_ps (__m128 __A)
> {
>   __vector unsigned long long result;
>   static const __vector unsigned int perm_mask =
>     {
> #ifdef __LITTLE_ENDIAN__
>       0x00204060, 0x80808080, 0x80808080, 0x80808080
> #else
>       0x80808080, 0x80808080, 0x80808080, 0x00204060
> #endif
>     };
>
>   result = ((__vector unsigned long long)
>             vec_vbpermq ((__vector unsigned char) __A,
>                          (__vector unsigned char) perm_mask));
>
> #ifdef __LITTLE_ENDIAN__
>   return result[1];
> #else
>   return result[0];
> #endif
> }
>

Sure I will add this to the next version.

> Dave

Thanks,
Pavan.
* [PATCH v3 1/5] examples/l3fwd: fix port group mask generation
From: pbhagavatula @ 2022-09-11 18:12 UTC
To: jerinj, David Christensen; +Cc: dev, Pavan Nikhilesh, stable

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Fix port group mask generation in altivec, vec_any_eq returns
0 or 1 while port_groupx4 expects comparison mask result.

Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
Cc: stable@dpdk.org

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 v3 Changes:
 - PPC optimize port mask generation.
 - Fix aarch32 compilation.

 v2 Changes:
 - Fix PPC, RISC-V, aarch32 compilation.

 examples/common/altivec/port_group.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
index 5e209b02fa..1c05bc025a 100644
--- a/examples/common/altivec/port_group.h
+++ b/examples/common/altivec/port_group.h
@@ -26,12 +26,17 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
 		uint16_t u16[FWDSTEP + 1];
 		uint64_t u64;
 	} *pnum = (void *)pn;
-
+	__vector unsigned long long result;
+	const __vector unsigned int perm_mask = {0x00204060, 0x80808080,
+						 0x80808080, 0x80808080};
 	int32_t v;
 
-	v = vec_any_eq(dp1, dp2);
-
+	dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);
+	dp1 = vec_mergeh(dp1, dp1);
+	result = (__vector unsigned long long)vec_vbpermq(
+		(__vector unsigned char)dp1, (__vector unsigned char)perm_mask);
 
+	v = result[1];
 	/* update last port counter. */
 	lp[0] += gptbl[v].lpv;
 
--
2.25.1
* [PATCH v4 1/5] examples/l3fwd: fix port group mask generation
From: pbhagavatula @ 2022-10-11 9:08 UTC
To: jerinj, David Christensen; +Cc: dev, Pavan Nikhilesh, stable

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Fix port group mask generation in altivec, vec_any_eq returns
0 or 1 while port_groupx4 expects comparison mask result.

Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
Cc: stable@dpdk.org

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 v4 Changes:
 - Fix missing `rte_free`.

 v3 Changes:
 - PPC optimize port mask generation.
 - Fix aarch32 compilation.

 v2 Changes:
 - Fix PPC, RISC-V, aarch32 compilation.

 examples/common/altivec/port_group.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
index 5e209b02fa..1c05bc025a 100644
--- a/examples/common/altivec/port_group.h
+++ b/examples/common/altivec/port_group.h
@@ -26,12 +26,17 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
 		uint16_t u16[FWDSTEP + 1];
 		uint64_t u64;
 	} *pnum = (void *)pn;
-
+	__vector unsigned long long result;
+	const __vector unsigned int perm_mask = {0x00204060, 0x80808080,
+						 0x80808080, 0x80808080};
 	int32_t v;
 
-	v = vec_any_eq(dp1, dp2);
-
+	dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);
+	dp1 = vec_mergeh(dp1, dp1);
+	result = (__vector unsigned long long)vec_vbpermq(
+		(__vector unsigned char)dp1, (__vector unsigned char)perm_mask);
 
+	v = result[1];
 	/* update last port counter. */
 	lp[0] += gptbl[v].lpv;
 
--
2.25.1
* [PATCH v5 1/5] examples/l3fwd: fix port group mask generation
From: pbhagavatula @ 2022-10-11 10:12 UTC
To: jerinj, David Christensen; +Cc: dev, Pavan Nikhilesh, stable

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Fix port group mask generation in altivec, vec_any_eq returns
0 or 1 while port_groupx4 expects comparison mask result.

Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
Cc: stable@dpdk.org

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 v5 Changes:
 - Fix compilation errors.

 v4 Changes:
 - Fix missing `rte_free`.

 v3 Changes:
 - PPC optimize port mask generation.
 - Fix aarch32 compilation.

 v2 Changes:
 - Fix PPC, RISC-V, aarch32 compilation.

 examples/common/altivec/port_group.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
index 5e209b02fa..1c05bc025a 100644
--- a/examples/common/altivec/port_group.h
+++ b/examples/common/altivec/port_group.h
@@ -26,12 +26,17 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
 		uint16_t u16[FWDSTEP + 1];
 		uint64_t u64;
 	} *pnum = (void *)pn;
-
+	__vector unsigned long long result;
+	const __vector unsigned int perm_mask = {0x00204060, 0x80808080,
+						 0x80808080, 0x80808080};
 	int32_t v;
 
-	v = vec_any_eq(dp1, dp2);
-
+	dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);
+	dp1 = vec_mergeh(dp1, dp1);
+	result = (__vector unsigned long long)vec_vbpermq(
+		(__vector unsigned char)dp1, (__vector unsigned char)perm_mask);
 
+	v = result[1];
 	/* update last port counter. */
 	lp[0] += gptbl[v].lpv;
 
--
2.25.1
* RE: [EXT] [PATCH v5 1/5] examples/l3fwd: fix port group mask generation
From: Shijith Thotton @ 2022-10-17 12:05 UTC
To: Pavan Nikhilesh Bhagavatula, Jerin Jacob Kollanukkaran, David Christensen
Cc: dev, Pavan Nikhilesh Bhagavatula, stable

>
>Fix port group mask generation in altivec, vec_any_eq returns
>0 or 1 while port_groupx4 expects comparison mask result.
>
>Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
>Cc: stable@dpdk.org
>
>Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

Acked-by: Shijith Thotton <sthotton@marvell.com>

>---
> v5 Changes:
> - Fix compilation errors.
>
> v4 Changes:
> - Fix missing `rte_free`.
>
> v3 Changes:
> - PPC optimize port mask generation.
> - Fix aarch32 compilation.
>
> v2 Changes:
> - Fix PPC, RISC-V, aarch32 compilation.
>
> examples/common/altivec/port_group.h | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
>diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
>index 5e209b02fa..1c05bc025a 100644
>--- a/examples/common/altivec/port_group.h
>+++ b/examples/common/altivec/port_group.h
>@@ -26,12 +26,17 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
> 		uint16_t u16[FWDSTEP + 1];
> 		uint64_t u64;
> 	} *pnum = (void *)pn;
>-
>+	__vector unsigned long long result;
>+	const __vector unsigned int perm_mask = {0x00204060, 0x80808080,
>+						 0x80808080, 0x80808080};
> 	int32_t v;
>
>-	v = vec_any_eq(dp1, dp2);
>-
>+	dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);
>+	dp1 = vec_mergeh(dp1, dp1);
>+	result = (__vector unsigned long long)vec_vbpermq(
>+		(__vector unsigned char)dp1, (__vector unsigned char)perm_mask);
>
>+	v = result[1];
> 	/* update last port counter. */
> 	lp[0] += gptbl[v].lpv;
>
>--
>2.25.1
* RE: [EXT] [PATCH v5 1/5] examples/l3fwd: fix port group mask generation
From: Pavan Nikhilesh Bhagavatula @ 2022-10-20 16:15 UTC
To: Shijith Thotton, Jerin Jacob Kollanukkaran, David Christensen; +Cc: dev, stable

> -----Original Message-----
> From: Shijith Thotton <sthotton@marvell.com>
> Sent: Monday, October 17, 2022 5:36 PM
> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Jerin Jacob
> Kollanukkaran <jerinj@marvell.com>; David Christensen
> <drc@linux.vnet.ibm.com>
> Cc: dev@dpdk.org; Pavan Nikhilesh Bhagavatula
> <pbhagavatula@marvell.com>; stable@dpdk.org
> Subject: RE: [EXT] [PATCH v5 1/5] examples/l3fwd: fix port group mask generation
>
> >
> >Fix port group mask generation in altivec, vec_any_eq returns
> >0 or 1 while port_groupx4 expects comparison mask result.
> >
> >Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
> >Cc: stable@dpdk.org
> >
> >Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>
> Acked-by: Shijith Thotton <sthotton@marvell.com>
>

Thomas,

Will this series make it into 22.11 release?

> >---
> > v5 Changes:
> > - Fix compilation errors.
> >
> > v4 Changes:
> > - Fix missing `rte_free`.
> >
> > v3 Changes:
> > - PPC optimize port mask generation.
> > - Fix aarch32 compilation.
> >
> > v2 Changes:
> > - Fix PPC, RISC-V, aarch32 compilation.
> >
> > examples/common/altivec/port_group.h | 11 ++++++++---
> > 1 file changed, 8 insertions(+), 3 deletions(-)
> >
> >diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
> >index 5e209b02fa..1c05bc025a 100644
> >--- a/examples/common/altivec/port_group.h
> >+++ b/examples/common/altivec/port_group.h
> >@@ -26,12 +26,17 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
> > 		uint16_t u16[FWDSTEP + 1];
> > 		uint64_t u64;
> > 	} *pnum = (void *)pn;
> >-
> >+	__vector unsigned long long result;
> >+	const __vector unsigned int perm_mask = {0x00204060, 0x80808080,
> >+						 0x80808080, 0x80808080};
> > 	int32_t v;
> >
> >-	v = vec_any_eq(dp1, dp2);
> >-
> >+	dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);
> >+	dp1 = vec_mergeh(dp1, dp1);
> >+	result = (__vector unsigned long long)vec_vbpermq(
> >+		(__vector unsigned char)dp1, (__vector unsigned char)perm_mask);
> >
> >+	v = result[1];
> > 	/* update last port counter. */
> > 	lp[0] += gptbl[v].lpv;
> >
> >--
> >2.25.1
* [PATCH v6 1/5] examples/l3fwd: fix port group mask generation
From: pbhagavatula @ 2022-10-25 16:05 UTC
To: jerinj, thomas, David Christensen
Cc: dev, Pavan Nikhilesh, stable, Shijith Thotton

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Fix port group mask generation in altivec, vec_any_eq returns
0 or 1 while port_groupx4 expects comparison mask result.

Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
Cc: stable@dpdk.org

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Shijith Thotton <sthotton@marvell.com>
---
 v6 Changes:
 - Minor optimization to process_dst_port NEON.

 v5 Changes:
 - Fix compilation errors.

 v4 Changes:
 - Fix missing `rte_free`.

 v3 Changes:
 - PPC optimize port mask generation.
 - Fix aarch32 compilation.

 v2 Changes:
 - Fix PPC, RISC-V, aarch32 compilation.

 examples/common/altivec/port_group.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
index 5e209b02fa..1c05bc025a 100644
--- a/examples/common/altivec/port_group.h
+++ b/examples/common/altivec/port_group.h
@@ -26,12 +26,17 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
 		uint16_t u16[FWDSTEP + 1];
 		uint64_t u64;
 	} *pnum = (void *)pn;
-
+	__vector unsigned long long result;
+	const __vector unsigned int perm_mask = {0x00204060, 0x80808080,
+						 0x80808080, 0x80808080};
 	int32_t v;
 
-	v = vec_any_eq(dp1, dp2);
-
+	dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);
+	dp1 = vec_mergeh(dp1, dp1);
+	result = (__vector unsigned long long)vec_vbpermq(
+		(__vector unsigned char)dp1, (__vector unsigned char)perm_mask);
 
+	v = result[1];
 	/* update last port counter. */
 	lp[0] += gptbl[v].lpv;
 
--
2.25.1
* Re: [PATCH v6 1/5] examples/l3fwd: fix port group mask generation
From: Thomas Monjalon @ 2022-10-31 14:52 UTC
To: Pavan Nikhilesh
Cc: jerinj, David Christensen, stable, dev, stable, Shijith Thotton

25/10/2022 18:05, pbhagavatula@marvell.com:
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>
> Fix port group mask generation in altivec, vec_any_eq returns
> 0 or 1 while port_groupx4 expects comparison mask result.
>
> Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
> Cc: stable@dpdk.org
>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Acked-by: Shijith Thotton <sthotton@marvell.com>

Series applied, thanks.