DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v2 0/1] net: fix aliasing issue in checksum computation
@ 2021-10-17 20:37 Georg Sauthoff
  2021-10-17 20:37 ` [dpdk-dev] [PATCH v2 1/1] " Georg Sauthoff
  0 siblings, 1 reply; 5+ messages in thread
From: Georg Sauthoff @ 2021-10-17 20:37 UTC (permalink / raw)
  To: Morten Brørup, Olivier Matz, Thomas Monjalon, David Marchand
  Cc: dev, Georg Sauthoff

The current __rte_raw_cksum() function violates the C strict-aliasing
rules since it uses a uint8_t pointer to access a trailing byte.

This patch also fixes a superfluous cast, i.e.:

	uintptr_t ptr = (uintptr_t)buf;
	typedef uint16_t __attribute__((__may_alias__)) u16_p;
	const u16_p *u16_buf = (const u16_p *)ptr;

Transitive casting involving uintptr_t doesn't solve anything here.
It also doesn't help with fixing a strict-aliasing issue here.

The patch also simplifies the main loop, i.e. it eliminates the manually
unrolled loop

	while (len >= (sizeof(*u16_buf) * 4)) {
		sum += u16_buf[0];
		sum += u16_buf[1];
		sum += u16_buf[2];
		sum += u16_buf[3];
		len -= sizeof(*u16_buf) * 4;
		u16_buf += 4;
	}

since modern C compilers are in a better position to decide which level of
unrolling is optimal for the target architecture.

See also https://godbolt.org/z/6rYbYGnj7 which shows how GCC auto-vectorizes
the simplified loop using AVX instructions, when compiling for Haswell. When
looking at the number of instructions in the compiled code, the new version
is half as big as the existing one.

Signed-off-by: Georg Sauthoff <mail@gms.tf>
---
v2:
* Reword commit message (detail aliasing implications of uint8_t)
* Add unlikely()

Georg Sauthoff (1):
  net: fix aliasing issue in checksum computation

 lib/net/rte_ip.h | 27 ++++++++-------------------
 1 file changed, 8 insertions(+), 19 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation
  2021-10-17 20:37 [dpdk-dev] [PATCH v2 0/1] net: fix aliasing issue in checksum computation Georg Sauthoff
@ 2021-10-17 20:37 ` Georg Sauthoff
  2021-10-18  7:35   ` Morten Brørup
  0 siblings, 1 reply; 5+ messages in thread
From: Georg Sauthoff @ 2021-10-17 20:37 UTC (permalink / raw)
  To: Morten Brørup, Olivier Matz, Thomas Monjalon, David Marchand
  Cc: dev, Georg Sauthoff

That means a superfluous cast is removed and aliasing through a uint8_t
pointer is eliminated. NB: The C standard specifies that a unsigned char
pointer may alias while the C standard doesn't include such requirement
for uint8_t pointers.

Also simplified the loop since a modern C compiler can speed up (i.e.
auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it
for Haswell using AVX registers while halving the number of instructions
in the generated code.

Signed-off-by: Georg Sauthoff <mail@gms.tf>
---
 lib/net/rte_ip.h | 27 ++++++++-------------------
 1 file changed, 8 insertions(+), 19 deletions(-)

diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
index 05948b69b7..1b8c6519a9 100644
--- a/lib/net/rte_ip.h
+++ b/lib/net/rte_ip.h
@@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
 static inline uint32_t
 __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
 {
-	/* workaround gcc strict-aliasing warning */
-	uintptr_t ptr = (uintptr_t)buf;
+	/* extend strict-aliasing rules */
 	typedef uint16_t __attribute__((__may_alias__)) u16_p;
-	const u16_p *u16_buf = (const u16_p *)ptr;
-
-	while (len >= (sizeof(*u16_buf) * 4)) {
-		sum += u16_buf[0];
-		sum += u16_buf[1];
-		sum += u16_buf[2];
-		sum += u16_buf[3];
-		len -= sizeof(*u16_buf) * 4;
-		u16_buf += 4;
-	}
-	while (len >= sizeof(*u16_buf)) {
+	const u16_p *u16_buf = (const u16_p *)buf;
+	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
+
+	for (; u16_buf != end; ++u16_buf)
 		sum += *u16_buf;
-		len -= sizeof(*u16_buf);
-		u16_buf += 1;
-	}
 
-	/* if length is in odd bytes */
-	if (len == 1) {
+	/* if length is odd, keeping it byte order independent */
+	if (unlikely(len % 2)) {
 		uint16_t left = 0;
-		*(uint8_t *)&left = *(const uint8_t *)u16_buf;
+		*(unsigned char*)&left = *(const unsigned char *)end;
 		sum += left;
 	}
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation
  2021-10-17 20:37 ` [dpdk-dev] [PATCH v2 1/1] " Georg Sauthoff
@ 2021-10-18  7:35   ` Morten Brørup
  2021-10-18  7:58     ` Olivier Matz
  0 siblings, 1 reply; 5+ messages in thread
From: Morten Brørup @ 2021-10-18  7:35 UTC (permalink / raw)
  To: Georg Sauthoff, Olivier Matz, Thomas Monjalon, David Marchand,
	Ferruh Yigit
  Cc: dev

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
> Sent: Sunday, 17 October 2021 22.37

+Ferruh, as delegate to v1 in Patchwork.

> 
> That means a superfluous cast is removed and aliasing through a uint8_t
> pointer is eliminated. NB: The C standard specifies that a unsigned
> char
> pointer may alias while the C standard doesn't include such requirement
> for uint8_t pointers.
> 
> Also simplified the loop since a modern C compiler can speed up (i.e.
> auto-vectorize) it in a similar way. For example, GCC auto-vectorizes
> it
> for Haswell using AVX registers while halving the number of
> instructions
> in the generated code.
> 
> Signed-off-by: Georg Sauthoff <mail@gms.tf>
> ---
>  lib/net/rte_ip.h | 27 ++++++++-------------------
>  1 file changed, 8 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> index 05948b69b7..1b8c6519a9 100644
> --- a/lib/net/rte_ip.h
> +++ b/lib/net/rte_ip.h
> @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr
> *ipv4_hdr)
>  static inline uint32_t
>  __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>  {
> -	/* workaround gcc strict-aliasing warning */
> -	uintptr_t ptr = (uintptr_t)buf;
> +	/* extend strict-aliasing rules */
>  	typedef uint16_t __attribute__((__may_alias__)) u16_p;
> -	const u16_p *u16_buf = (const u16_p *)ptr;
> -
> -	while (len >= (sizeof(*u16_buf) * 4)) {
> -		sum += u16_buf[0];
> -		sum += u16_buf[1];
> -		sum += u16_buf[2];
> -		sum += u16_buf[3];
> -		len -= sizeof(*u16_buf) * 4;
> -		u16_buf += 4;
> -	}
> -	while (len >= sizeof(*u16_buf)) {
> +	const u16_p *u16_buf = (const u16_p *)buf;
> +	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> +
> +	for (; u16_buf != end; ++u16_buf)
>  		sum += *u16_buf;
> -		len -= sizeof(*u16_buf);
> -		u16_buf += 1;
> -	}
> 
> -	/* if length is in odd bytes */
> -	if (len == 1) {
> +	/* if length is odd, keeping it byte order independent */
> +	if (unlikely(len % 2)) {
>  		uint16_t left = 0;
> -		*(uint8_t *)&left = *(const uint8_t *)u16_buf;
> +		*(unsigned char*)&left = *(const unsigned char *)end;
>  		sum += left;
>  	}
> 
> --
> 2.31.1
> 

Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)

After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.

Reviewed-by: Morten Brørup <mb@smartsharesystems.com>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation
  2021-10-18  7:35   ` Morten Brørup
@ 2021-10-18  7:58     ` Olivier Matz
  2021-10-18 16:14       ` Ferruh Yigit
  0 siblings, 1 reply; 5+ messages in thread
From: Olivier Matz @ 2021-10-18  7:58 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Georg Sauthoff, Thomas Monjalon, David Marchand, Ferruh Yigit, dev

On Mon, Oct 18, 2021 at 09:35:41AM +0200, Morten Brørup wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
> > Sent: Sunday, 17 October 2021 22.37
> 
> +Ferruh, as delegate to v1 in Patchwork.
> 
> > 
> > That means a superfluous cast is removed and aliasing through a uint8_t
> > pointer is eliminated. NB: The C standard specifies that a unsigned
> > char
> > pointer may alias while the C standard doesn't include such requirement
> > for uint8_t pointers.
> > 
> > Also simplified the loop since a modern C compiler can speed up (i.e.
> > auto-vectorize) it in a similar way. For example, GCC auto-vectorizes
> > it
> > for Haswell using AVX registers while halving the number of
> > instructions
> > in the generated code.
> > 
> > Signed-off-by: Georg Sauthoff <mail@gms.tf>
> > ---
> >  lib/net/rte_ip.h | 27 ++++++++-------------------
> >  1 file changed, 8 insertions(+), 19 deletions(-)
> > 
> > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > index 05948b69b7..1b8c6519a9 100644
> > --- a/lib/net/rte_ip.h
> > +++ b/lib/net/rte_ip.h
> > @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr
> > *ipv4_hdr)
> >  static inline uint32_t
> >  __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
> >  {
> > -	/* workaround gcc strict-aliasing warning */
> > -	uintptr_t ptr = (uintptr_t)buf;
> > +	/* extend strict-aliasing rules */
> >  	typedef uint16_t __attribute__((__may_alias__)) u16_p;
> > -	const u16_p *u16_buf = (const u16_p *)ptr;
> > -
> > -	while (len >= (sizeof(*u16_buf) * 4)) {
> > -		sum += u16_buf[0];
> > -		sum += u16_buf[1];
> > -		sum += u16_buf[2];
> > -		sum += u16_buf[3];
> > -		len -= sizeof(*u16_buf) * 4;
> > -		u16_buf += 4;
> > -	}
> > -	while (len >= sizeof(*u16_buf)) {
> > +	const u16_p *u16_buf = (const u16_p *)buf;
> > +	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> > +
> > +	for (; u16_buf != end; ++u16_buf)
> >  		sum += *u16_buf;
> > -		len -= sizeof(*u16_buf);
> > -		u16_buf += 1;
> > -	}
> > 
> > -	/* if length is in odd bytes */
> > -	if (len == 1) {
> > +	/* if length is odd, keeping it byte order independent */
> > +	if (unlikely(len % 2)) {
> >  		uint16_t left = 0;
> > -		*(uint8_t *)&left = *(const uint8_t *)u16_buf;
> > +		*(unsigned char*)&left = *(const unsigned char *)end;
> >  		sum += left;
> >  	}
> > 
> > --
> > 2.31.1
> > 
> 
> Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)
> 
> After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.

+1, thanks for the good documentation

> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation
  2021-10-18  7:58     ` Olivier Matz
@ 2021-10-18 16:14       ` Ferruh Yigit
  0 siblings, 0 replies; 5+ messages in thread
From: Ferruh Yigit @ 2021-10-18 16:14 UTC (permalink / raw)
  To: Olivier Matz, Morten Brørup, Georg Sauthoff
  Cc: Thomas Monjalon, David Marchand, dev

On 10/18/2021 8:58 AM, Olivier Matz wrote:
> On Mon, Oct 18, 2021 at 09:35:41AM +0200, Morten Brørup wrote:
>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
>>> Sent: Sunday, 17 October 2021 22.37
>>
>> +Ferruh, as delegate to v1 in Patchwork.
>>
>>>
>>> That means a superfluous cast is removed and aliasing through a uint8_t
>>> pointer is eliminated. NB: The C standard specifies that a unsigned
>>> char
>>> pointer may alias while the C standard doesn't include such requirement
>>> for uint8_t pointers.
>>>
>>> Also simplified the loop since a modern C compiler can speed up (i.e.
>>> auto-vectorize) it in a similar way. For example, GCC auto-vectorizes
>>> it
>>> for Haswell using AVX registers while halving the number of
>>> instructions
>>> in the generated code.
>>>
>>> Signed-off-by: Georg Sauthoff <mail@gms.tf>
>>> ---
>>>   lib/net/rte_ip.h | 27 ++++++++-------------------
>>>   1 file changed, 8 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
>>> index 05948b69b7..1b8c6519a9 100644
>>> --- a/lib/net/rte_ip.h
>>> +++ b/lib/net/rte_ip.h
>>> @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr
>>> *ipv4_hdr)
>>>   static inline uint32_t
>>>   __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>>>   {
>>> -	/* workaround gcc strict-aliasing warning */
>>> -	uintptr_t ptr = (uintptr_t)buf;
>>> +	/* extend strict-aliasing rules */
>>>   	typedef uint16_t __attribute__((__may_alias__)) u16_p;
>>> -	const u16_p *u16_buf = (const u16_p *)ptr;
>>> -
>>> -	while (len >= (sizeof(*u16_buf) * 4)) {
>>> -		sum += u16_buf[0];
>>> -		sum += u16_buf[1];
>>> -		sum += u16_buf[2];
>>> -		sum += u16_buf[3];
>>> -		len -= sizeof(*u16_buf) * 4;
>>> -		u16_buf += 4;
>>> -	}
>>> -	while (len >= sizeof(*u16_buf)) {
>>> +	const u16_p *u16_buf = (const u16_p *)buf;
>>> +	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
>>> +
>>> +	for (; u16_buf != end; ++u16_buf)
>>>   		sum += *u16_buf;
>>> -		len -= sizeof(*u16_buf);
>>> -		u16_buf += 1;
>>> -	}
>>>
>>> -	/* if length is in odd bytes */
>>> -	if (len == 1) {
>>> +	/* if length is odd, keeping it byte order independent */
>>> +	if (unlikely(len % 2)) {
>>>   		uint16_t left = 0;
>>> -		*(uint8_t *)&left = *(const uint8_t *)u16_buf;
>>> +		*(unsigned char*)&left = *(const unsigned char *)end;
>>>   		sum += left;
>>>   	}
>>>
>>> --
>>> 2.31.1
>>>
>>
>> Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)
>>
>> After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.
> 
> +1, thanks for the good documentation
> 
>> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> 

Updated patch title as: "net: fix aliasing in checksum computation"

Added fixes tags:
     Fixes: 6006818cfb26 ("net: new checksum functions")
     Fixes: e079655c41fb ("net: fix build with gcc 4.4.7 and strict aliasing")
     Cc: stable@dpdk.org

Following warning fixed in next-net:
   ERROR:POINTER_LOCATION: "(foo*)" should be "(foo *)"
   #65: FILE: lib/net/rte_ip.h:168:
   +               *(unsigned char*)&left = *(const unsigned char *)end;


Applied to dpdk-next-net/main, thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-10-18 16:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-17 20:37 [dpdk-dev] [PATCH v2 0/1] net: fix aliasing issue in checksum computation Georg Sauthoff
2021-10-17 20:37 ` [dpdk-dev] [PATCH v2 1/1] " Georg Sauthoff
2021-10-18  7:35   ` Morten Brørup
2021-10-18  7:58     ` Olivier Matz
2021-10-18 16:14       ` Ferruh Yigit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).