DPDK patches and discussions
* [dpdk-dev] [PATCH] Make the thash library arch-independent
@ 2015-07-28 13:06 Vladimir Medvedkin
  2015-07-28 13:47 ` Thomas Monjalon
  2015-07-29 13:56 ` [dpdk-dev] [PATCH v2] " Vladimir Medvedkin
  0 siblings, 2 replies; 7+ messages in thread
From: Vladimir Medvedkin @ 2015-07-28 13:06 UTC (permalink / raw)
  To: dev

Signed-off-by: Vladimir Medvedkin <medvedkinv@gmail.com>
---
 lib/librte_hash/rte_thash.h | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/lib/librte_hash/rte_thash.h b/lib/librte_hash/rte_thash.h
index 6156e8a..ddb650a 100644
--- a/lib/librte_hash/rte_thash.h
+++ b/lib/librte_hash/rte_thash.h
@@ -53,14 +53,23 @@ extern "C" {
 
 #include <stdint.h>
 #include <rte_byteorder.h>
-#include <rte_vect.h>
 #include <rte_ip.h>
 
+#ifdef __SSE3__
+#include <rte_vect.h>
+#endif /* __SSE3__ */
+
+#ifndef XMM_SIZE
+#define XMM_SIZE	16
+#endif /* XMM_SIZE */
+
+#ifdef __SSE3__
 /* Byte swap mask used for converting IPv6 address
  * 4-byte chunks to CPU byte order
  */
 static const __m128i rte_thash_ipv6_bswap_mask = {
 		0x0405060700010203, 0x0C0D0E0F08090A0B};
+#endif /* __SSE3__ */
 
 /**
  * length in dwords of input tuple to
@@ -157,12 +166,22 @@ rte_convert_rss_key(const uint32_t *orig, uint32_t *targ, int len)
 static inline void
 rte_thash_load_v6_addrs(const struct ipv6_hdr *orig, union rte_thash_tuple *targ)
 {
+#ifdef __SSE__
 	__m128i ipv6 = _mm_loadu_si128((const __m128i *)orig->src_addr);
 	*(__m128i *)targ->v6.src_addr =
 			_mm_shuffle_epi8(ipv6, rte_thash_ipv6_bswap_mask);
 	ipv6 = _mm_loadu_si128((const __m128i *)orig->dst_addr);
 	*(__m128i *)targ->v6.dst_addr =
 			_mm_shuffle_epi8(ipv6, rte_thash_ipv6_bswap_mask);
+#else
+	int i;
+	for (i = 0; i < 4; i++) {
+		*((uint32_t *)targ->v6.src_addr + i) =
+			rte_be_to_cpu_32(*((const uint32_t *)orig->src_addr + i));
+		*((uint32_t *)targ->v6.dst_addr + i) =
+			rte_be_to_cpu_32(*((const uint32_t *)orig->dst_addr + i));
+	}
+#endif /* __SSE3__ */
 }
 
 /**
-- 
1.8.3.2
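
For readers unfamiliar with the shuffle mask in the patch above, here is a minimal, portable sketch of what the byte shuffle does with rte_thash_ipv6_bswap_mask. The helper shuffle_bytes_emul() and the array bswap_mask_bytes are hypothetical names used only for illustration and are not part of DPDK. The array is the pair of 64-bit mask constants written out in little-endian (x86) memory order. Note also that _mm_shuffle_epi8 is listed in Intel's documentation as an SSSE3 intrinsic, which is a separate extension from SSE3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The two 64-bit mask constants from the patch, written byte by byte
 * in little-endian (x86) memory order. */
static const uint8_t bswap_mask_bytes[16] = {
	0x03, 0x02, 0x01, 0x00, 0x07, 0x06, 0x05, 0x04,
	0x0B, 0x0A, 0x09, 0x08, 0x0F, 0x0E, 0x0D, 0x0C
};

/* Hypothetical helper: scalar emulation of a 16-byte PSHUFB.
 * Output byte i is input byte (mask[i] & 0x0F), or 0 when bit 7 of
 * mask[i] is set. src and dst must not overlap. */
static void
shuffle_bytes_emul(const uint8_t src[16], const uint8_t mask[16],
		uint8_t dst[16])
{
	size_t i;

	assert(src != NULL && mask != NULL && dst != NULL);
	for (i = 0; i < 16; i++)
		dst[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
}
```

With this mask every 4-byte chunk is reversed in place, which on a little-endian CPU is the same as converting each big-endian 32-bit word of the IPv6 address to CPU byte order.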


* Re: [dpdk-dev] [PATCH] Make the thash library arch-independent
  2015-07-28 13:06 [dpdk-dev] [PATCH] Make the thash library arch-independent Vladimir Medvedkin
@ 2015-07-28 13:47 ` Thomas Monjalon
  2015-07-28 15:33   ` Vladimir Medvedkin
  2015-07-29 13:56 ` [dpdk-dev] [PATCH v2] " Vladimir Medvedkin
  1 sibling, 1 reply; 7+ messages in thread
From: Thomas Monjalon @ 2015-07-28 13:47 UTC (permalink / raw)
  To: Vladimir Medvedkin; +Cc: dev

Hi Vladimir,
Thanks for fixing.
Comments below.

2015-07-28 09:06, Vladimir Medvedkin:
> Signed-off-by: Vladimir Medvedkin <medvedkinv@gmail.com>

Please explain how it was broken and how you fixed it.
It would be interesting to know which part is __SSE3__ and __SSE__.

> +#ifdef __SSE3__
> +#include <rte_vect.h>
> +#endif /* __SSE3__ */

Comments after short ifdef block are not needed.

> +#ifndef XMM_SIZE
> +#define XMM_SIZE	16

Why is it needed?


* Re: [dpdk-dev] [PATCH] Make the thash library arch-independent
  2015-07-28 13:47 ` Thomas Monjalon
@ 2015-07-28 15:33   ` Vladimir Medvedkin
  2015-07-28 16:05     ` Thomas Monjalon
  0 siblings, 1 reply; 7+ messages in thread
From: Vladimir Medvedkin @ 2015-07-28 15:33 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Hi Thomas,


2015-07-28 16:47 GMT+03:00 Thomas Monjalon <thomas.monjalon@6wind.com>:

> Hi Vladimir,
> Thanks for fixing.
> Comments below.
>
> 2015-07-28 09:06, Vladimir Medvedkin:
> > Signed-off-by: Vladimir Medvedkin <medvedkinv@gmail.com>
>
> Please explain how it was broken and how you fixed it.
> It would be interesting to know which part is __SSE3__ and __SSE__.
>
 As mentioned in http://dpdk.org/ml/archives/dev/2015-July/022020.html,
compilation fails on non-x86 architectures (in that case it was tile). So I
added a non-optimized generic version alongside the optimized code, which
uses SSE3 intrinsics.

>
> > +#ifdef __SSE3__
> > +#include <rte_vect.h>
> > +#endif /* __SSE3__ */
>
> Comments after short ifdef block are not needed.
>
Should I delete it?

>
> > +#ifndef XMM_SIZE
> > +#define XMM_SIZE     16
>
> Why is it needed?
>
 Because XMM_SIZE is not defined on non-x86 architectures.

Regards,
Vladimir
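
The generic (non-SSE) path described in this message can be sketched without any DPDK headers. This is only an illustration under stated assumptions: load_be32() is a hypothetical stand-in for rte_be_to_cpu_32() applied to a loaded word, and load_v6_addr_scalar() is a hypothetical name, not the function from the patch. Unlike the patch, the sketch stores through memcpy() rather than through a cast uint32_t pointer, so it makes no assumption about the alignment of the byte arrays.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for rte_be_to_cpu_32(): build a CPU-order
 * 32-bit value from 4 big-endian bytes. Works on any host endianness. */
static inline uint32_t
load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Convert one 16-byte IPv6 address, as found in the packet header,
 * into four CPU-order 32-bit words stored in dst. */
static inline void
load_v6_addr_scalar(const uint8_t src[16], uint8_t dst[16])
{
	int i;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < 4; i++) {
		uint32_t w = load_be32(src + 4 * i);

		memcpy(dst + 4 * i, &w, sizeof(w));
	}
}
```

Calling it once for the source address and once for the destination address mirrors the two assignments in the loop body of the patch.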


* Re: [dpdk-dev] [PATCH] Make the thash library arch-independent
  2015-07-28 15:33   ` Vladimir Medvedkin
@ 2015-07-28 16:05     ` Thomas Monjalon
  2015-07-28 19:08       ` Vladimir Medvedkin
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Monjalon @ 2015-07-28 16:05 UTC (permalink / raw)
  To: Vladimir Medvedkin; +Cc: dev

2015-07-28 18:33, Vladimir Medvedkin:
> 2015-07-28 16:47 GMT+03:00 Thomas Monjalon <thomas.monjalon@6wind.com>:
> > 2015-07-28 09:06, Vladimir Medvedkin:
> > Please explain how it was broken and how you fixed it.
> > It would be interesting to know which part is __SSE3__ and __SSE__.
> >
>  As mentioned in http://dpdk.org/ml/archives/dev/2015-July/022020.html,
> compilation fails on non-x86 architectures (in that case it was tile). So I
> added a non-optimized generic version alongside the optimized code, which
> uses SSE3 intrinsics.

I know. I was requesting an updated commit with explanations:
	build is broken because...
	x86 version uses SSE3...
	Some code is enclosed with __SSE__, not __SSE3__ because...

What happens if it is built with SSE3 support but run on
a CPU without such support?
Please check how it is done for ACL.

> > > +#ifdef __SSE3__
> > > +#include <rte_vect.h>
> > > +#endif /* __SSE3__ */
> >
> > Comments after short ifdef block are not needed.
> >
> Should I delete it?

Yes please.

> > > +#ifndef XMM_SIZE
> > > +#define XMM_SIZE     16
> >
> > Why is it needed?
> >
>  Because XMM_SIZE is not defined on non-x86 architectures.

Why is XMM_SIZE needed on non-x86 architectures?
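
On the question of a binary built with vector support but run on a CPU without it: an #ifdef only decides what gets compiled, so a run-time check is needed to cover that case. Below is a minimal sketch of run-time dispatch, assuming GCC or Clang on x86. It uses the compiler builtin __builtin_cpu_supports() as a stand-in for DPDK's own CPU flag API (rte_cpu_get_flag_enabled()), and all function names here are hypothetical. It does not reproduce how the ACL library is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef void (*load_v6_fn)(const uint8_t *src, uint8_t *dst);

/* Generic path: big-endian words to CPU order, one word at a time. */
static void
load_v6_scalar(const uint8_t *src, uint8_t *dst)
{
	int i;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < 4; i++) {
		uint32_t w = ((uint32_t)src[4 * i] << 24) |
			     ((uint32_t)src[4 * i + 1] << 16) |
			     ((uint32_t)src[4 * i + 2] << 8) |
			     (uint32_t)src[4 * i + 3];

		memcpy(dst + 4 * i, &w, sizeof(w));
	}
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <tmmintrin.h>

/* Vector path, compiled for SSSE3 regardless of the global -m flags. */
__attribute__((target("ssse3")))
static void
load_v6_ssse3(const uint8_t *src, uint8_t *dst)
{
	/* Same byte order as the mask in the patch: reverse each dword. */
	const __m128i mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11,
					  4, 5, 6, 7, 0, 1, 2, 3);
	__m128i v = _mm_loadu_si128((const __m128i *)src);

	_mm_storeu_si128((__m128i *)dst, _mm_shuffle_epi8(v, mask));
}
#endif

/* Pick an implementation once, based on what the running CPU reports. */
static load_v6_fn
select_load_v6(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	__builtin_cpu_init();
	if (__builtin_cpu_supports("ssse3"))
		return load_v6_ssse3;
#endif
	return load_v6_scalar;
}
```

The caller would store the returned pointer once at initialization and use it on the data path. The trade-off is an indirect call instead of an inlined intrinsic sequence.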


* Re: [dpdk-dev] [PATCH] Make the thash library arch-independent
  2015-07-28 16:05     ` Thomas Monjalon
@ 2015-07-28 19:08       ` Vladimir Medvedkin
  0 siblings, 0 replies; 7+ messages in thread
From: Vladimir Medvedkin @ 2015-07-28 19:08 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

2015-07-28 19:05 GMT+03:00 Thomas Monjalon <thomas.monjalon@6wind.com>:

> 2015-07-28 18:33, Vladimir Medvedkin:
> > 2015-07-28 16:47 GMT+03:00 Thomas Monjalon <thomas.monjalon@6wind.com>:
> > > 2015-07-28 09:06, Vladimir Medvedkin:
> > > Please explain how it was broken and how you fixed it.
> > > It would be interesting to know which part is __SSE3__ and __SSE__.
> > >
> >  As mentioned in http://dpdk.org/ml/archives/dev/2015-July/022020.html,
> > compilation fails on non-x86 architectures (in that case it was tile).
> > So I added a non-optimized generic version alongside the optimized code,
> > which uses SSE3 intrinsics.
>
> I know. I was requesting an updated commit with explanations:
>         build is broken because...
>         x86 version uses SSE3...
>         Some code is enclosed with __SSE__, not __SSE3__ because...
>
Oh, that's my mistake. Will fix this typo in the next patch.

>
> What happens if it is built with SSE3 support but run on
> a CPU without such support?
> Please check how it is done for ACL.
>
> > > > +#ifdef __SSE3__
> > > > +#include <rte_vect.h>
> > > > +#endif /* __SSE3__ */
> > >
> > > Comments after short ifdef block are not needed.
> > >
> > Should I delete it?
>
> Yes please.
>
> > > > +#ifndef XMM_SIZE
> > > > +#define XMM_SIZE     16
> > >
> > > Why is it needed?
> > >
> >  Because XMM_SIZE is not defined on non-x86 architectures.
>
> Why is XMM_SIZE needed on non-x86 architectures?
>
OK, I will leave union rte_thash_tuple unaligned on non-x86 architectures.


* [dpdk-dev] [PATCH v2] Make the thash library arch-independent
  2015-07-28 13:06 [dpdk-dev] [PATCH] Make the thash library arch-independent Vladimir Medvedkin
  2015-07-28 13:47 ` Thomas Monjalon
@ 2015-07-29 13:56 ` Vladimir Medvedkin
  2015-07-29 23:40   ` Thomas Monjalon
  1 sibling, 1 reply; 7+ messages in thread
From: Vladimir Medvedkin @ 2015-07-29 13:56 UTC (permalink / raw)
  To: dev

v2 changes
- Fix SSE to SSE3 typo
- Remove unnecessary comments
- Leave union rte_thash_tuple unaligned if there is no SSE3 support
- Make 32-bit compilers happy by adding the ULL suffix

Signed-off-by: Vladimir Medvedkin <medvedkinv@gmail.com>
---
 lib/librte_hash/rte_thash.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/lib/librte_hash/rte_thash.h b/lib/librte_hash/rte_thash.h
index 6156e8a..d98e98e 100644
--- a/lib/librte_hash/rte_thash.h
+++ b/lib/librte_hash/rte_thash.h
@@ -53,14 +53,19 @@ extern "C" {
 
 #include <stdint.h>
 #include <rte_byteorder.h>
-#include <rte_vect.h>
 #include <rte_ip.h>
 
+#ifdef __SSE3__
+#include <rte_vect.h>
+#endif
+
+#ifdef __SSE3__
 /* Byte swap mask used for converting IPv6 address
  * 4-byte chunks to CPU byte order
  */
 static const __m128i rte_thash_ipv6_bswap_mask = {
-		0x0405060700010203, 0x0C0D0E0F08090A0B};
+		0x0405060700010203ULL, 0x0C0D0E0F08090A0BULL};
+#endif
 
 /**
  * length in dwords of input tuple to
@@ -126,7 +131,11 @@ struct rte_ipv6_tuple {
 union rte_thash_tuple {
 	struct rte_ipv4_tuple	v4;
 	struct rte_ipv6_tuple	v6;
+#ifdef __SSE3__
 } __attribute__((aligned(XMM_SIZE)));
+#else
+};
+#endif
 
 /**
  * Prepare special converted key to use with rte_softrss_be()
@@ -157,12 +166,22 @@ rte_convert_rss_key(const uint32_t *orig, uint32_t *targ, int len)
 static inline void
 rte_thash_load_v6_addrs(const struct ipv6_hdr *orig, union rte_thash_tuple *targ)
 {
+#ifdef __SSE3__
 	__m128i ipv6 = _mm_loadu_si128((const __m128i *)orig->src_addr);
 	*(__m128i *)targ->v6.src_addr =
 			_mm_shuffle_epi8(ipv6, rte_thash_ipv6_bswap_mask);
 	ipv6 = _mm_loadu_si128((const __m128i *)orig->dst_addr);
 	*(__m128i *)targ->v6.dst_addr =
 			_mm_shuffle_epi8(ipv6, rte_thash_ipv6_bswap_mask);
+#else
+	int i;
+	for (i = 0; i < 4; i++) {
+		*((uint32_t *)targ->v6.src_addr + i) =
+			rte_be_to_cpu_32(*((const uint32_t *)orig->src_addr + i));
+		*((uint32_t *)targ->v6.dst_addr + i) =
+			rte_be_to_cpu_32(*((const uint32_t *)orig->dst_addr + i));
+	}
+#endif
 }
 
 /**
-- 
1.8.3.2


* Re: [dpdk-dev] [PATCH v2] Make the thash library arch-independent
  2015-07-29 13:56 ` [dpdk-dev] [PATCH v2] " Vladimir Medvedkin
@ 2015-07-29 23:40   ` Thomas Monjalon
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Monjalon @ 2015-07-29 23:40 UTC (permalink / raw)
  To: Vladimir Medvedkin; +Cc: dev

2015-07-29 09:56, Vladimir Medvedkin:
> v2 changes
> - Fix SSE to SSE3 typo
> - Remove unnecessary comments
> - Leave union rte_thash_tuple unaligned if there is no SSE3 support
> - Make 32-bit compilers happy by adding the ULL suffix
> 
> Signed-off-by: Vladimir Medvedkin <medvedkinv@gmail.com>

Applied, thanks


