* [dpdk-dev] [PATCH v2 0/1] net: fix aliasing issue in checksum computation
@ 2021-10-17 20:37 Georg Sauthoff
2021-10-17 20:37 ` [dpdk-dev] [PATCH v2 1/1] " Georg Sauthoff
0 siblings, 1 reply; 5+ messages in thread
From: Georg Sauthoff @ 2021-10-17 20:37 UTC (permalink / raw)
To: Morten Brørup, Olivier Matz, Thomas Monjalon, David Marchand
Cc: dev, Georg Sauthoff
The current __rte_raw_cksum() function violates the C strict-aliasing
rules since it uses a uint8_t pointer to access a trailing byte.
This patch also fixes a superfluous cast, i.e.:
uintptr_t ptr = (uintptr_t)buf;
typedef uint16_t __attribute__((__may_alias__)) u16_p;
const u16_p *u16_buf = (const u16_p *)ptr;
Transitive casting involving uintptr_t doesn't solve anything here.
It also doesn't help with fixing a strict-aliasing issue here.
The patch also simplifies the main loop, i.e. it eliminates the manually
unrolled loop
while (len >= (sizeof(*u16_buf) * 4)) {
sum += u16_buf[0];
sum += u16_buf[1];
sum += u16_buf[2];
sum += u16_buf[3];
len -= sizeof(*u16_buf) * 4;
u16_buf += 4;
}
since modern C compilers are in a better position to decide which level of
unrolling is optimal for the target architecture.
See also https://godbolt.org/z/6rYbYGnj7 which shows how GCC auto-vectorizes
the simplified loop using AVX instructions, when compiling for Haswell. When
looking at the number of instructions in the compiled code, the new version
is half as big as the existing one.
Signed-off-by: Georg Sauthoff <mail@gms.tf>
---
v2:
* Reword commit message (detail aliasing implications of uint8_t)
* Add unlikely()
Georg Sauthoff (1):
net: fix aliasing issue in checksum computation
lib/net/rte_ip.h | 27 ++++++++-------------------
1 file changed, 8 insertions(+), 19 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation
2021-10-17 20:37 [dpdk-dev] [PATCH v2 0/1] net: fix aliasing issue in checksum computation Georg Sauthoff
@ 2021-10-17 20:37 ` Georg Sauthoff
2021-10-18 7:35 ` Morten Brørup
0 siblings, 1 reply; 5+ messages in thread
From: Georg Sauthoff @ 2021-10-17 20:37 UTC (permalink / raw)
To: Morten Brørup, Olivier Matz, Thomas Monjalon, David Marchand
Cc: dev, Georg Sauthoff
That means a superfluous cast is removed and aliasing through a uint8_t
pointer is eliminated. NB: The C standard specifies that a unsigned char
pointer may alias while the C standard doesn't include such requirement
for uint8_t pointers.
Also simplified the loop since a modern C compiler can speed up (i.e.
auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it
for Haswell using AVX registers while halving the number of instructions
in the generated code.
Signed-off-by: Georg Sauthoff <mail@gms.tf>
---
lib/net/rte_ip.h | 27 ++++++++-------------------
1 file changed, 8 insertions(+), 19 deletions(-)
diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
index 05948b69b7..1b8c6519a9 100644
--- a/lib/net/rte_ip.h
+++ b/lib/net/rte_ip.h
@@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
static inline uint32_t
__rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
{
- /* workaround gcc strict-aliasing warning */
- uintptr_t ptr = (uintptr_t)buf;
+ /* extend strict-aliasing rules */
typedef uint16_t __attribute__((__may_alias__)) u16_p;
- const u16_p *u16_buf = (const u16_p *)ptr;
-
- while (len >= (sizeof(*u16_buf) * 4)) {
- sum += u16_buf[0];
- sum += u16_buf[1];
- sum += u16_buf[2];
- sum += u16_buf[3];
- len -= sizeof(*u16_buf) * 4;
- u16_buf += 4;
- }
- while (len >= sizeof(*u16_buf)) {
+ const u16_p *u16_buf = (const u16_p *)buf;
+ const u16_p *end = u16_buf + len / sizeof(*u16_buf);
+
+ for (; u16_buf != end; ++u16_buf)
sum += *u16_buf;
- len -= sizeof(*u16_buf);
- u16_buf += 1;
- }
- /* if length is in odd bytes */
- if (len == 1) {
+ /* if length is odd, keeping it byte order independent */
+ if (unlikely(len % 2)) {
uint16_t left = 0;
- *(uint8_t *)&left = *(const uint8_t *)u16_buf;
+ *(unsigned char*)&left = *(const unsigned char *)end;
sum += left;
}
--
2.31.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation
2021-10-17 20:37 ` [dpdk-dev] [PATCH v2 1/1] " Georg Sauthoff
@ 2021-10-18 7:35 ` Morten Brørup
2021-10-18 7:58 ` Olivier Matz
0 siblings, 1 reply; 5+ messages in thread
From: Morten Brørup @ 2021-10-18 7:35 UTC (permalink / raw)
To: Georg Sauthoff, Olivier Matz, Thomas Monjalon, David Marchand,
Ferruh Yigit
Cc: dev
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
> Sent: Sunday, 17 October 2021 22.37
+Ferruh, as delegate to v1 in Patchwork.
>
> That means a superfluous cast is removed and aliasing through a uint8_t
> pointer is eliminated. NB: The C standard specifies that a unsigned
> char
> pointer may alias while the C standard doesn't include such requirement
> for uint8_t pointers.
>
> Also simplified the loop since a modern C compiler can speed up (i.e.
> auto-vectorize) it in a similar way. For example, GCC auto-vectorizes
> it
> for Haswell using AVX registers while halving the number of
> instructions
> in the generated code.
>
> Signed-off-by: Georg Sauthoff <mail@gms.tf>
> ---
> lib/net/rte_ip.h | 27 ++++++++-------------------
> 1 file changed, 8 insertions(+), 19 deletions(-)
>
> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> index 05948b69b7..1b8c6519a9 100644
> --- a/lib/net/rte_ip.h
> +++ b/lib/net/rte_ip.h
> @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr
> *ipv4_hdr)
> static inline uint32_t
> __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
> {
> - /* workaround gcc strict-aliasing warning */
> - uintptr_t ptr = (uintptr_t)buf;
> + /* extend strict-aliasing rules */
> typedef uint16_t __attribute__((__may_alias__)) u16_p;
> - const u16_p *u16_buf = (const u16_p *)ptr;
> -
> - while (len >= (sizeof(*u16_buf) * 4)) {
> - sum += u16_buf[0];
> - sum += u16_buf[1];
> - sum += u16_buf[2];
> - sum += u16_buf[3];
> - len -= sizeof(*u16_buf) * 4;
> - u16_buf += 4;
> - }
> - while (len >= sizeof(*u16_buf)) {
> + const u16_p *u16_buf = (const u16_p *)buf;
> + const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> +
> + for (; u16_buf != end; ++u16_buf)
> sum += *u16_buf;
> - len -= sizeof(*u16_buf);
> - u16_buf += 1;
> - }
>
> - /* if length is in odd bytes */
> - if (len == 1) {
> + /* if length is odd, keeping it byte order independent */
> + if (unlikely(len % 2)) {
> uint16_t left = 0;
> - *(uint8_t *)&left = *(const uint8_t *)u16_buf;
> + *(unsigned char*)&left = *(const unsigned char *)end;
> sum += left;
> }
>
> --
> 2.31.1
>
Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)
After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation
2021-10-18 7:35 ` Morten Brørup
@ 2021-10-18 7:58 ` Olivier Matz
2021-10-18 16:14 ` Ferruh Yigit
0 siblings, 1 reply; 5+ messages in thread
From: Olivier Matz @ 2021-10-18 7:58 UTC (permalink / raw)
To: Morten Brørup
Cc: Georg Sauthoff, Thomas Monjalon, David Marchand, Ferruh Yigit, dev
On Mon, Oct 18, 2021 at 09:35:41AM +0200, Morten Brørup wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
> > Sent: Sunday, 17 October 2021 22.37
>
> +Ferruh, as delegate to v1 in Patchwork.
>
> >
> > That means a superfluous cast is removed and aliasing through a uint8_t
> > pointer is eliminated. NB: The C standard specifies that a unsigned
> > char
> > pointer may alias while the C standard doesn't include such requirement
> > for uint8_t pointers.
> >
> > Also simplified the loop since a modern C compiler can speed up (i.e.
> > auto-vectorize) it in a similar way. For example, GCC auto-vectorizes
> > it
> > for Haswell using AVX registers while halving the number of
> > instructions
> > in the generated code.
> >
> > Signed-off-by: Georg Sauthoff <mail@gms.tf>
> > ---
> > lib/net/rte_ip.h | 27 ++++++++-------------------
> > 1 file changed, 8 insertions(+), 19 deletions(-)
> >
> > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > index 05948b69b7..1b8c6519a9 100644
> > --- a/lib/net/rte_ip.h
> > +++ b/lib/net/rte_ip.h
> > @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr
> > *ipv4_hdr)
> > static inline uint32_t
> > __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
> > {
> > - /* workaround gcc strict-aliasing warning */
> > - uintptr_t ptr = (uintptr_t)buf;
> > + /* extend strict-aliasing rules */
> > typedef uint16_t __attribute__((__may_alias__)) u16_p;
> > - const u16_p *u16_buf = (const u16_p *)ptr;
> > -
> > - while (len >= (sizeof(*u16_buf) * 4)) {
> > - sum += u16_buf[0];
> > - sum += u16_buf[1];
> > - sum += u16_buf[2];
> > - sum += u16_buf[3];
> > - len -= sizeof(*u16_buf) * 4;
> > - u16_buf += 4;
> > - }
> > - while (len >= sizeof(*u16_buf)) {
> > + const u16_p *u16_buf = (const u16_p *)buf;
> > + const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> > +
> > + for (; u16_buf != end; ++u16_buf)
> > sum += *u16_buf;
> > - len -= sizeof(*u16_buf);
> > - u16_buf += 1;
> > - }
> >
> > - /* if length is in odd bytes */
> > - if (len == 1) {
> > + /* if length is odd, keeping it byte order independent */
> > + if (unlikely(len % 2)) {
> > uint16_t left = 0;
> > - *(uint8_t *)&left = *(const uint8_t *)u16_buf;
> > + *(unsigned char*)&left = *(const unsigned char *)end;
> > sum += left;
> > }
> >
> > --
> > 2.31.1
> >
>
> Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)
>
> After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.
+1, thanks for the good documentation
> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation
2021-10-18 7:58 ` Olivier Matz
@ 2021-10-18 16:14 ` Ferruh Yigit
0 siblings, 0 replies; 5+ messages in thread
From: Ferruh Yigit @ 2021-10-18 16:14 UTC (permalink / raw)
To: Olivier Matz, Morten Brørup, Georg Sauthoff
Cc: Thomas Monjalon, David Marchand, dev
On 10/18/2021 8:58 AM, Olivier Matz wrote:
> On Mon, Oct 18, 2021 at 09:35:41AM +0200, Morten Brørup wrote:
>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
>>> Sent: Sunday, 17 October 2021 22.37
>>
>> +Ferruh, as delegate to v1 in Patchwork.
>>
>>>
>>> That means a superfluous cast is removed and aliasing through a uint8_t
>>> pointer is eliminated. NB: The C standard specifies that a unsigned
>>> char
>>> pointer may alias while the C standard doesn't include such requirement
>>> for uint8_t pointers.
>>>
>>> Also simplified the loop since a modern C compiler can speed up (i.e.
>>> auto-vectorize) it in a similar way. For example, GCC auto-vectorizes
>>> it
>>> for Haswell using AVX registers while halving the number of
>>> instructions
>>> in the generated code.
>>>
>>> Signed-off-by: Georg Sauthoff <mail@gms.tf>
>>> ---
>>> lib/net/rte_ip.h | 27 ++++++++-------------------
>>> 1 file changed, 8 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
>>> index 05948b69b7..1b8c6519a9 100644
>>> --- a/lib/net/rte_ip.h
>>> +++ b/lib/net/rte_ip.h
>>> @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr
>>> *ipv4_hdr)
>>> static inline uint32_t
>>> __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>>> {
>>> - /* workaround gcc strict-aliasing warning */
>>> - uintptr_t ptr = (uintptr_t)buf;
>>> + /* extend strict-aliasing rules */
>>> typedef uint16_t __attribute__((__may_alias__)) u16_p;
>>> - const u16_p *u16_buf = (const u16_p *)ptr;
>>> -
>>> - while (len >= (sizeof(*u16_buf) * 4)) {
>>> - sum += u16_buf[0];
>>> - sum += u16_buf[1];
>>> - sum += u16_buf[2];
>>> - sum += u16_buf[3];
>>> - len -= sizeof(*u16_buf) * 4;
>>> - u16_buf += 4;
>>> - }
>>> - while (len >= sizeof(*u16_buf)) {
>>> + const u16_p *u16_buf = (const u16_p *)buf;
>>> + const u16_p *end = u16_buf + len / sizeof(*u16_buf);
>>> +
>>> + for (; u16_buf != end; ++u16_buf)
>>> sum += *u16_buf;
>>> - len -= sizeof(*u16_buf);
>>> - u16_buf += 1;
>>> - }
>>>
>>> - /* if length is in odd bytes */
>>> - if (len == 1) {
>>> + /* if length is odd, keeping it byte order independent */
>>> + if (unlikely(len % 2)) {
>>> uint16_t left = 0;
>>> - *(uint8_t *)&left = *(const uint8_t *)u16_buf;
>>> + *(unsigned char*)&left = *(const unsigned char *)end;
>>> sum += left;
>>> }
>>>
>>> --
>>> 2.31.1
>>>
>>
>> Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)
>>
>> After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.
>
> +1, thanks for the good documentation
>
>> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
>
Updated patch title as: "net: fix aliasing in checksum computation"
Added fixes tags:
Fixes: 6006818cfb26 ("net: new checksum functions")
Fixes: e079655c41fb ("net: fix build with gcc 4.4.7 and strict aliasing")
Cc: stable@dpdk.org
Following warning fixed in next-net:
ERROR:POINTER_LOCATION: "(foo*)" should be "(foo *)"
#65: FILE: lib/net/rte_ip.h:168:
+ *(unsigned char*)&left = *(const unsigned char *)end;
Applied to dpdk-next-net/main, thanks.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-10-18 16:15 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-17 20:37 [dpdk-dev] [PATCH v2 0/1] net: fix aliasing issue in checksum computation Georg Sauthoff
2021-10-17 20:37 ` [dpdk-dev] [PATCH v2 1/1] " Georg Sauthoff
2021-10-18 7:35 ` Morten Brørup
2021-10-18 7:58 ` Olivier Matz
2021-10-18 16:14 ` Ferruh Yigit
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).