From: Ferruh Yigit <ferruh.yigit@xilinx.com>
To: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>,
olivier.matz@6wind.com
Cc: "Emil Berg" <emil.berg@ericsson.com>,
bruce.richardson@intel.com, stephen@networkplumber.org,
stable@dpdk.org, bugzilla@dpdk.org, dev@dpdk.org,
onar.olsen@ericsson.com,
"Morten Brørup" <mb@smartsharesystems.com>
Subject: Re: [PATCH v2 2/2] net: have checksum routines accept unaligned data
Date: Fri, 8 Jul 2022 15:44:14 +0100
Message-ID: <58432e09-11c1-5ce0-3e8c-9b3df7266e6a@xilinx.com>
In-Reply-To: <20220708125608.24532-2-mattias.ronnblom@ericsson.com>
On 7/8/2022 1:56 PM, Mattias Rönnblom wrote:
> __rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its
> data through a uint16_t pointer, which allowed the compiler to assume
> the data was 16-bit aligned. This in turn would, with certain
> architectures and compiler flag combinations, result in code with SIMD
> load or store instructions with restrictions on data alignment.
>
> This patch keeps the old algorithm, but data is read using memcpy()
> instead of direct pointer access, forcing the compiler to always
> generate code that handles unaligned input. The __may_alias__ GCC
> attribute is no longer needed.
>
> The data on which the Internet checksum functions operate is almost
> always 16-bit aligned, but there are exceptions. In particular, the
> PDCP protocol header may (literally) have an odd size.
>
> Performance impact seems to range from none to a very slight
> regression.
>
> Bugzilla ID: 1035
> Cc: stable@dpdk.org
>
> ---
>
> v2:
> * Simplified the odd-length conditional (Morten Brørup).
>
> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> ---
> lib/net/rte_ip.h | 17 ++++++++++-------
> 1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> index b502481670..a0334d931e 100644
> --- a/lib/net/rte_ip.h
> +++ b/lib/net/rte_ip.h
> @@ -160,18 +160,21 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
>  static inline uint32_t
>  __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>  {
> -	/* extend strict-aliasing rules */
> -	typedef uint16_t __attribute__((__may_alias__)) u16_p;
> -	const u16_p *u16_buf = (const u16_p *)buf;
> -	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> +	const void *end;
>
> -	for (; u16_buf != end; ++u16_buf)
> -		sum += *u16_buf;
> +	for (end = RTE_PTR_ADD(buf, (len/sizeof(uint16_t)) * sizeof(uint16_t));
> +	     buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
> +		uint16_t v;
> +
> +		memcpy(&v, buf, sizeof(uint16_t));
> +		sum += v;
> +	}
>
>  	/* if length is odd, keeping it byte order independent */
>  	if (unlikely(len % 2)) {
>  		uint16_t left = 0;
> -		*(unsigned char *)&left = *(const unsigned char *)end;
> +
> +		memcpy(&left, end, 1);
>  		sum += left;
>  	}
>

Hi Mattias,

I got the following results [1] with the patches applied, running on [2].

Can you shed some light on a few questions I have:

1) For 1500 bytes, why does 'Unaligned' access perform better than
   'Aligned' access?
2) Why do 21/101 bytes take almost twice as many cycles as 20/100 bytes?
3) Why is the 1501-byte result better than the 1500-byte result?

Btw, I don't see any noticeable performance difference with and without
the patch.

[1]
RTE>>cksum_perf_autotest
### rte_raw_cksum() performance ###
Alignment   Block size   TSC cycles/block   TSC cycles/byte
Aligned             20               25.1              1.25
Unaligned           20               25.1              1.25
Aligned             21               51.5              2.45
Unaligned           21               51.5              2.45
Aligned            100               28.2              0.28
Unaligned          100               28.2              0.28
Aligned            101               54.5              0.54
Unaligned          101               54.5              0.54
Aligned           1500              188.9              0.13
Unaligned         1500              138.7              0.09
Aligned           1501              114.1              0.08
Unaligned         1501              110.1              0.07
Test OK
RTE>>
[2]
AMD EPYC 7543P
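
For reference, the memcpy()-based read that the quoted patch introduces can be sketched as a standalone program fragment. This is only an illustration under stated assumptions, not the DPDK source: raw_cksum_sketch() and cksum_matches_when_unaligned() are hypothetical names, plain pointer arithmetic stands in for RTE_PTR_ADD(), and the unlikely() hint is dropped. The helper checksums the same bytes once from an even address and once from an odd address, which is the case the patch is meant to make safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __rte_raw_cksum(); not the DPDK implementation. */
static uint32_t
raw_cksum_sketch(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	/* End of the last complete 16-bit word. */
	const unsigned char *end = p + (len / sizeof(uint16_t)) * sizeof(uint16_t);

	/* memcpy() makes no alignment assumption about the source pointer. */
	for (; p != end; p += sizeof(uint16_t)) {
		uint16_t v;

		memcpy(&v, p, sizeof(v));
		sum += v;
	}

	/* Odd length: copy the trailing byte into the first byte of a zeroed
	 * word, as the patch does, so byte order is handled the same way.
	 */
	if (len % 2) {
		uint16_t left = 0;

		memcpy(&left, end, 1);
		sum += left;
	}

	return sum;
}

/* Hypothetical helper: checksum the same bytes at an even and at an odd
 * address and report whether the two sums agree.
 */
static int
cksum_matches_when_unaligned(const unsigned char *data, size_t len)
{
	uint16_t words[34]; /* uint16_t storage gives an even base address */
	unsigned char *storage = (unsigned char *)words;
	uint32_t aligned, unaligned;

	assert(len <= 64);

	memcpy(storage, data, len);
	aligned = raw_cksum_sketch(storage, len, 0);

	memcpy(storage + 1, data, len); /* same bytes, odd address */
	unaligned = raw_cksum_sketch(storage + 1, len, 0);

	return aligned == unaligned;
}
```

Calling cksum_matches_when_unaligned() on a short buffer with both an even and an odd length exercises the word loop and the trailing-byte branch. It checks only that the result does not depend on alignment; it says nothing about the cycle counts reported above.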