* [dpdk-dev] [PATCH v2 0/1] net: fix aliasing issue in checksum computation @ 2021-10-17 20:37 Georg Sauthoff 2021-10-17 20:37 ` [dpdk-dev] [PATCH v2 1/1] " Georg Sauthoff 0 siblings, 1 reply; 5+ messages in thread From: Georg Sauthoff @ 2021-10-17 20:37 UTC (permalink / raw) To: Morten Brørup, Olivier Matz, Thomas Monjalon, David Marchand Cc: dev, Georg Sauthoff The current __rte_raw_cksum() function violates the C strict-aliasing rules since it uses a uint8_t pointer to access a trailing byte. This patch also fixes a superfluous cast, i.e.: uintptr_t ptr = (uintptr_t)buf; typedef uint16_t __attribute__((__may_alias__)) u16_p; const u16_p *u16_buf = (const u16_p *)ptr; Transitive casting involving uintptr_t doesn't solve anything here. It also doesn't help with fixing a strict-aliasing issue here. The patch also simplifies the main loop, i.e. it eliminates the manually unrolled loop while (len >= (sizeof(*u16_buf) * 4)) { sum += u16_buf[0]; sum += u16_buf[1]; sum += u16_buf[2]; sum += u16_buf[3]; len -= sizeof(*u16_buf) * 4; u16_buf += 4; } since modern C compilers are in a better position to decide which level of unrolling is optimal for the target architecture. See also https://godbolt.org/z/6rYbYGnj7 which shows how GCC auto-vectorizes the simplified loop using AVX instructions, when compiling for Haswell. When looking at the number of instructions in the compiled code, the new version is half as big as the existing one. Signed-off-by: Georg Sauthoff <mail@gms.tf> --- v2: * Reword commit message (detail aliasing implications of uint8_t) * Add unlikely() Georg Sauthoff (1): net: fix aliasing issue in checksum computation lib/net/rte_ip.h | 27 ++++++++------------------- 1 file changed, 8 insertions(+), 19 deletions(-) -- 2.31.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation 2021-10-17 20:37 [dpdk-dev] [PATCH v2 0/1] net: fix aliasing issue in checksum computation Georg Sauthoff @ 2021-10-17 20:37 ` Georg Sauthoff 2021-10-18 7:35 ` Morten Brørup 0 siblings, 1 reply; 5+ messages in thread From: Georg Sauthoff @ 2021-10-17 20:37 UTC (permalink / raw) To: Morten Brørup, Olivier Matz, Thomas Monjalon, David Marchand Cc: dev, Georg Sauthoff That means a superfluous cast is removed and aliasing through a uint8_t pointer is eliminated. NB: The C standard specifies that a unsigned char pointer may alias while the C standard doesn't include such requirement for uint8_t pointers. Also simplified the loop since a modern C compiler can speed up (i.e. auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it for Haswell using AVX registers while halving the number of instructions in the generated code. Signed-off-by: Georg Sauthoff <mail@gms.tf> --- lib/net/rte_ip.h | 27 ++++++++------------------- 1 file changed, 8 insertions(+), 19 deletions(-) diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h index 05948b69b7..1b8c6519a9 100644 --- a/lib/net/rte_ip.h +++ b/lib/net/rte_ip.h @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr) static inline uint32_t __rte_raw_cksum(const void *buf, size_t len, uint32_t sum) { - /* workaround gcc strict-aliasing warning */ - uintptr_t ptr = (uintptr_t)buf; + /* extend strict-aliasing rules */ typedef uint16_t __attribute__((__may_alias__)) u16_p; - const u16_p *u16_buf = (const u16_p *)ptr; - - while (len >= (sizeof(*u16_buf) * 4)) { - sum += u16_buf[0]; - sum += u16_buf[1]; - sum += u16_buf[2]; - sum += u16_buf[3]; - len -= sizeof(*u16_buf) * 4; - u16_buf += 4; - } - while (len >= sizeof(*u16_buf)) { + const u16_p *u16_buf = (const u16_p *)buf; + const u16_p *end = u16_buf + len / sizeof(*u16_buf); + + for (; u16_buf != end; ++u16_buf) sum += *u16_buf; - len -= sizeof(*u16_buf); - u16_buf += 1; - } - /* if length is in odd bytes */ - if (len == 1) { + /* if length is odd, keeping it byte order independent */ + if (unlikely(len % 2)) { uint16_t left = 0; - *(uint8_t *)&left = *(const uint8_t *)u16_buf; + *(unsigned char*)&left = *(const unsigned char *)end; sum += left; } -- 2.31.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation 2021-10-17 20:37 ` [dpdk-dev] [PATCH v2 1/1] " Georg Sauthoff @ 2021-10-18 7:35 ` Morten Brørup 2021-10-18 7:58 ` Olivier Matz 0 siblings, 1 reply; 5+ messages in thread From: Morten Brørup @ 2021-10-18 7:35 UTC (permalink / raw) To: Georg Sauthoff, Olivier Matz, Thomas Monjalon, David Marchand, Ferruh Yigit Cc: dev > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff > Sent: Sunday, 17 October 2021 22.37 +Ferruh, as delegate to v1 in Patchwork. > > That means a superfluous cast is removed and aliasing through a uint8_t > pointer is eliminated. NB: The C standard specifies that a unsigned > char > pointer may alias while the C standard doesn't include such requirement > for uint8_t pointers. > > Also simplified the loop since a modern C compiler can speed up (i.e. > auto-vectorize) it in a similar way. For example, GCC auto-vectorizes > it > for Haswell using AVX registers while halving the number of > instructions > in the generated code. > > Signed-off-by: Georg Sauthoff <mail@gms.tf> > --- > lib/net/rte_ip.h | 27 ++++++++------------------- > 1 file changed, 8 insertions(+), 19 deletions(-) > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h > index 05948b69b7..1b8c6519a9 100644 > --- a/lib/net/rte_ip.h > +++ b/lib/net/rte_ip.h > @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr > *ipv4_hdr) > static inline uint32_t > __rte_raw_cksum(const void *buf, size_t len, uint32_t sum) > { > - /* workaround gcc strict-aliasing warning */ > - uintptr_t ptr = (uintptr_t)buf; > + /* extend strict-aliasing rules */ > typedef uint16_t __attribute__((__may_alias__)) u16_p; > - const u16_p *u16_buf = (const u16_p *)ptr; > - > - while (len >= (sizeof(*u16_buf) * 4)) { > - sum += u16_buf[0]; > - sum += u16_buf[1]; > - sum += u16_buf[2]; > - sum += u16_buf[3]; > - len -= sizeof(*u16_buf) * 4; > - u16_buf += 4; > - } > - while (len >= sizeof(*u16_buf)) { > + const u16_p *u16_buf = (const u16_p *)buf; > + const u16_p *end = u16_buf + len / sizeof(*u16_buf); > + > + for (; u16_buf != end; ++u16_buf) > sum += *u16_buf; > - len -= sizeof(*u16_buf); > - u16_buf += 1; > - } > > - /* if length is in odd bytes */ > - if (len == 1) { > + /* if length is odd, keeping it byte order independent */ > + if (unlikely(len % 2)) { > uint16_t left = 0; > - *(uint8_t *)&left = *(const uint8_t *)u16_buf; > + *(unsigned char*)&left = *(const unsigned char *)end; > sum += left; > } > > -- > 2.31.1 > Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-) After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch. Reviewed-by: Morten Brørup <mb@smartsharesystems.com> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation 2021-10-18 7:35 ` Morten Brørup @ 2021-10-18 7:58 ` Olivier Matz 2021-10-18 16:14 ` Ferruh Yigit 0 siblings, 1 reply; 5+ messages in thread From: Olivier Matz @ 2021-10-18 7:58 UTC (permalink / raw) To: Morten Brørup Cc: Georg Sauthoff, Thomas Monjalon, David Marchand, Ferruh Yigit, dev On Mon, Oct 18, 2021 at 09:35:41AM +0200, Morten Brørup wrote: > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff > > Sent: Sunday, 17 October 2021 22.37 > > +Ferruh, as delegate to v1 in Patchwork. > > > > > That means a superfluous cast is removed and aliasing through a uint8_t > > pointer is eliminated. NB: The C standard specifies that a unsigned > > char > > pointer may alias while the C standard doesn't include such requirement > > for uint8_t pointers. > > > > Also simplified the loop since a modern C compiler can speed up (i.e. > > auto-vectorize) it in a similar way. For example, GCC auto-vectorizes > > it > > for Haswell using AVX registers while halving the number of > > instructions > > in the generated code. > > > > Signed-off-by: Georg Sauthoff <mail@gms.tf> > > --- > > lib/net/rte_ip.h | 27 ++++++++------------------- > > 1 file changed, 8 insertions(+), 19 deletions(-) > > > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h > > index 05948b69b7..1b8c6519a9 100644 > > --- a/lib/net/rte_ip.h > > +++ b/lib/net/rte_ip.h > > @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr > > *ipv4_hdr) > > static inline uint32_t > > __rte_raw_cksum(const void *buf, size_t len, uint32_t sum) > > { > > - /* workaround gcc strict-aliasing warning */ > > - uintptr_t ptr = (uintptr_t)buf; > > + /* extend strict-aliasing rules */ > > typedef uint16_t __attribute__((__may_alias__)) u16_p; > > - const u16_p *u16_buf = (const u16_p *)ptr; > > - > > - while (len >= (sizeof(*u16_buf) * 4)) { > > - sum += u16_buf[0]; > > - sum += u16_buf[1]; > > - sum += u16_buf[2]; > > - sum += u16_buf[3]; > > - len -= sizeof(*u16_buf) * 4; > > - u16_buf += 4; > > - } > > - while (len >= sizeof(*u16_buf)) { > > + const u16_p *u16_buf = (const u16_p *)buf; > > + const u16_p *end = u16_buf + len / sizeof(*u16_buf); > > + > > + for (; u16_buf != end; ++u16_buf) > > sum += *u16_buf; > > - len -= sizeof(*u16_buf); > > - u16_buf += 1; > > - } > > > > - /* if length is in odd bytes */ > > - if (len == 1) { > > + /* if length is odd, keeping it byte order independent */ > > + if (unlikely(len % 2)) { > > uint16_t left = 0; > > - *(uint8_t *)&left = *(const uint8_t *)u16_buf; > > + *(unsigned char*)&left = *(const unsigned char *)end; > > sum += left; > > } > > > > -- > > 2.31.1 > > > > Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-) > > After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch. +1, thanks for the good documentation > Reviewed-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/1] net: fix aliasing issue in checksum computation 2021-10-18 7:58 ` Olivier Matz @ 2021-10-18 16:14 ` Ferruh Yigit 0 siblings, 0 replies; 5+ messages in thread From: Ferruh Yigit @ 2021-10-18 16:14 UTC (permalink / raw) To: Olivier Matz, Morten Brørup, Georg Sauthoff Cc: Thomas Monjalon, David Marchand, dev On 10/18/2021 8:58 AM, Olivier Matz wrote: > On Mon, Oct 18, 2021 at 09:35:41AM +0200, Morten Brørup wrote: >>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff >>> Sent: Sunday, 17 October 2021 22.37 >> >> +Ferruh, as delegate to v1 in Patchwork. >> >>> >>> That means a superfluous cast is removed and aliasing through a uint8_t >>> pointer is eliminated. NB: The C standard specifies that a unsigned >>> char >>> pointer may alias while the C standard doesn't include such requirement >>> for uint8_t pointers. >>> >>> Also simplified the loop since a modern C compiler can speed up (i.e. >>> auto-vectorize) it in a similar way. For example, GCC auto-vectorizes >>> it >>> for Haswell using AVX registers while halving the number of >>> instructions >>> in the generated code. >>> >>> Signed-off-by: Georg Sauthoff <mail@gms.tf> >>> --- >>> lib/net/rte_ip.h | 27 ++++++++------------------- >>> 1 file changed, 8 insertions(+), 19 deletions(-) >>> >>> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h >>> index 05948b69b7..1b8c6519a9 100644 >>> --- a/lib/net/rte_ip.h >>> +++ b/lib/net/rte_ip.h >>> @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr >>> *ipv4_hdr) >>> static inline uint32_t >>> __rte_raw_cksum(const void *buf, size_t len, uint32_t sum) >>> { >>> - /* workaround gcc strict-aliasing warning */ >>> - uintptr_t ptr = (uintptr_t)buf; >>> + /* extend strict-aliasing rules */ >>> typedef uint16_t __attribute__((__may_alias__)) u16_p; >>> - const u16_p *u16_buf = (const u16_p *)ptr; >>> - >>> - while (len >= (sizeof(*u16_buf) * 4)) { >>> - sum += u16_buf[0]; >>> - sum += u16_buf[1]; >>> - sum += u16_buf[2]; >>> - sum += u16_buf[3]; >>> - len -= sizeof(*u16_buf) * 4; >>> - u16_buf += 4; >>> - } >>> - while (len >= sizeof(*u16_buf)) { >>> + const u16_p *u16_buf = (const u16_p *)buf; >>> + const u16_p *end = u16_buf + len / sizeof(*u16_buf); >>> + >>> + for (; u16_buf != end; ++u16_buf) >>> sum += *u16_buf; >>> - len -= sizeof(*u16_buf); >>> - u16_buf += 1; >>> - } >>> >>> - /* if length is in odd bytes */ >>> - if (len == 1) { >>> + /* if length is odd, keeping it byte order independent */ >>> + if (unlikely(len % 2)) { >>> uint16_t left = 0; >>> - *(uint8_t *)&left = *(const uint8_t *)u16_buf; >>> + *(unsigned char*)&left = *(const unsigned char *)end; >>> sum += left; >>> } >>> >>> -- >>> 2.31.1 >>> >> >> Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-) >> >> After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch. > > +1, thanks for the good documentation > >> Reviewed-by: Morten Brørup <mb@smartsharesystems.com> > > Acked-by: Olivier Matz <olivier.matz@6wind.com> > Updated patch title as: "net: fix aliasing in checksum computation" Added fixes tags: Fixes: 6006818cfb26 ("net: new checksum functions") Fixes: e079655c41fb ("net: fix build with gcc 4.4.7 and strict aliasing") Cc: stable@dpdk.org Following warning fixed in next-net: ERROR:POINTER_LOCATION: "(foo*)" should be "(foo *)" #65: FILE: lib/net/rte_ip.h:168: + *(unsigned char*)&left = *(const unsigned char *)end; Applied to dpdk-next-net/main, thanks. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-10-18 16:15 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-10-17 20:37 [dpdk-dev] [PATCH v2 0/1] net: fix aliasing issue in checksum computation Georg Sauthoff 2021-10-17 20:37 ` [dpdk-dev] [PATCH v2 1/1] " Georg Sauthoff 2021-10-18 7:35 ` Morten Brørup 2021-10-18 7:58 ` Olivier Matz 2021-10-18 16:14 ` Ferruh Yigit
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).