DPDK patches and discussions
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Emil Berg" <emil.berg@ericsson.com>
Cc: <stable@dpdk.org>, <bugzilla@dpdk.org>, <hofors@lysator.liu.se>,
	<olivier.matz@6wind.com>, <dev@dpdk.org>
Subject: RE: [PATCH] net: fix checksum with unaligned buffer
Date: Mon, 20 Jun 2022 12:57:33 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D87141@smartserver.smartshare.dk>
In-Reply-To: <AM8PR07MB766628919D85FADCE736DD1E98B09@AM8PR07MB7666.eurprd07.prod.outlook.com>

> From: Emil Berg [mailto:emil.berg@ericsson.com]
> Sent: Monday, 20 June 2022 12.38
> 
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: 17 June 2022 11:07
> >
> > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > Sent: Friday, 17 June 2022 10.45
> > >
> > > With this patch, the checksum can be calculated on an unaligned part
> > > of a packet buffer.
> > > I.e. the buf parameter is no longer required to be 16 bit aligned.
> > >
> > > The DPDK invariant that packet buffers must be 16 bit aligned remains
> > > unchanged.
> > > This invariant also defines how to calculate the 16 bit checksum on an
> > > unaligned part of a packet buffer.
> > >
> > > Bugzilla ID: 1035
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > > ---
> > >  lib/net/rte_ip.h | 17 +++++++++++++++--
> > >  1 file changed, 15 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > > index b502481670..8e301d9c26 100644
> > > --- a/lib/net/rte_ip.h
> > > +++ b/lib/net/rte_ip.h
> > > @@ -162,9 +162,22 @@ __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
> > >  {
> > >  	/* extend strict-aliasing rules */
> > >  	typedef uint16_t __attribute__((__may_alias__)) u16_p;
> > > -	const u16_p *u16_buf = (const u16_p *)buf;
> > > -	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> > > +	const u16_p *u16_buf;
> > > +	const u16_p *end;
> > > +
> > > +	/* if buffer is unaligned, keeping it byte order independent */
> > > +	if (unlikely((uintptr_t)buf & 1)) {
> > > +		uint16_t first = 0;
> > > +		if (unlikely(len == 0))
> > > +			return 0;
> > > +		((unsigned char *)&first)[1] = *(const unsigned char *)buf;
> > > +		sum += first;
> > > +		buf = (const void *)((uintptr_t)buf + 1);
> > > +		len--;
> > > +	}
> > >
> > > +	u16_buf = (const u16_p *)buf;
> > > +	end = u16_buf + len / sizeof(*u16_buf);
> > >  	for (; u16_buf != end; ++u16_buf)
> > >  		sum += *u16_buf;
> > >
> > > --
> > > 2.17.1
> >
> > @Emil, can you please test this patch with an unaligned buffer on your
> > application to confirm that it produces the expected result?
> 
> Hi!
> 
> I tested the patch. It doesn't seem to produce the same results. I
> think the problem is that it always starts summing from an even
> address; the sum should always start from the first byte according to
> the checksum specification. Can I instead propose something Mattias
> Rönnblom sent me?

I assume that it produces the same result when the "buf" parameter is aligned?

And when the "buf" parameter is unaligned, I don't expect it to produce the same results as the simple algorithm!

This was the whole point of the patch: I expect the overall packet buffer to be 16 bit aligned, and the checksum to be a partial checksum of such a 16 bit aligned packet buffer. When calling this function, I assume that the "buf" and "len" parameters point to a part of such a packet buffer. If these expectations are correct, the simple algorithm will produce incorrect results when "buf" is unaligned.
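
To illustrate the expectation above, here is a minimal sketch, not the patch itself. It assumes the packet's checksum is defined over 16 bit words counted from the start of a 16 bit aligned packet buffer. All names (simple_sum, fold, swap16, split_sum, odd_offset_demo) and the 8 byte packet are hypothetical. The byte swap is the well-known property of the Internet checksum described in RFC 1071, used here only to show that the simple algorithm's partial sum over an odd-offset part does not fit the packet's sum as is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simple algorithm: sum 16 bit words starting at the first byte of buf. */
static uint32_t
simple_sum(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		uint16_t v;

		memcpy(&v, buf + i, sizeof(v)); /* alignment-safe load */
		sum += v;
	}
	/* if length is odd, keeping it byte order independent */
	if (len & 1) {
		uint16_t left = 0;

		memcpy(&left, buf + i, 1);
		sum += left;
	}
	return sum;
}

/* Fold a 32 bit sum into 16 bits with end-around carry. */
static uint16_t
fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

static uint16_t
swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Sum the packet as two parts split at byte offset off, and combine.
 * If swap_odd is set and off is odd, the second part's folded sum is
 * byte swapped before it is added.
 */
static uint16_t
split_sum(const uint8_t *pkt, size_t len, size_t off, int swap_odd)
{
	uint32_t head = simple_sum(pkt, off, 0);
	uint16_t tail = fold(simple_sum(pkt + off, len - off, 0));

	if (swap_odd && (off & 1))
		tail = swap16(tail);
	return fold(head + tail);
}

/* Usage sketch on a hypothetical 8 byte packet, split at odd offset 3. */
static void
odd_offset_demo(void)
{
	static const uint8_t pkt[8] = {1, 2, 3, 4, 5, 6, 7, 8};
	uint16_t whole = fold(simple_sum(pkt, sizeof(pkt), 0));

	/* The plain partial sum of the odd-offset part does not fit... */
	assert(split_sum(pkt, sizeof(pkt), 3, 0) != whole);
	/* ...but its byte swapped value does. */
	assert(split_sum(pkt, sizeof(pkt), 3, 1) == whole);
}
```

With a split at an even offset, no swap is needed and the two partial sums simply add up.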

I was asking you to test if the checksum on the packet is correct when your application modifies an unaligned part of the packet and uses this function to update the checksum.


> ----------------------------------------------------------------------
> const void *end = RTE_PTR_ADD(buf, (len / sizeof(uint16_t)) *
> sizeof(uint16_t));
> 
> for (; buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
>     uint16_t v;
>     memcpy(&v, buf, sizeof(uint16_t));
>     sum += v;
> }
> 
> /* if length is odd, keeping it byte order independent */
> if (unlikely(len % 2)) {
>     uint16_t left = 0;
>     *(unsigned char *)&left = *(const unsigned char *)end;
>     sum += left;
> }
> ----------------------------------------------------------------------
> Note that the last block is the same as before. Amazingly, I see no
> measurable performance hit from this compared to the previous one (-O3,
> -march=native). Looking at the previous version, the loop body may
> compile to (x86):
> ----------------------------------------------------------------------
> vmovdqa (%rdx),%xmm1
> vpmovzxwd %xmm1,%xmm0
> vpsrldq $0x8,%xmm1,%xmm1
> vpmovzxwd %xmm1,%xmm1
> vpaddd %xmm1,%xmm0,%xmm0
> cmp    $0xf,%rax
> jbe    0x7ff7a0dfb1a9
> ----------------------------------------------------------------------
> while Mattias' memcpy solution may compile to:
> ----------------------------------------------------------------------
> vmovdqu (%rcx),%ymm0
> add    $0x20,%rcx
> vpmovzxwd %xmm0,%ymm1
> vextracti128 $0x1,%ymm0,%xmm0
> vpmovzxwd %xmm0,%ymm0
> vpaddd %ymm0,%ymm1,%ymm0
> vpaddd %ymm0,%ymm2,%ymm2
> cmp    %r9,%rcx
> jne    0x555555556380
> ----------------------------------------------------------------------
> Thus two extra instructions in the loop, but I suspect it may be memory
> bound, leading to no measurable performance difference.
> 
> Any comments?
> 
> /Emil
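
For reference, the memcpy-based loop quoted above can be put into a self-contained function along these lines. This is only a sketch under assumptions: the names raw_cksum_memcpy and unaligned_demo are made up, plain pointer arithmetic stands in for RTE_PTR_ADD, and it is not the code in lib/net/rte_ip.h. It sums from the first byte of buf, so the same bytes give the same result whether buf is at an even or an odd address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum 16 bit words from the first byte of buf, at any alignment. */
static uint32_t
raw_cksum_memcpy(const void *buf, size_t len, uint32_t sum)
{
	const uint8_t *p = buf;
	const uint8_t *end = p + (len / sizeof(uint16_t)) * sizeof(uint16_t);

	for (; p != end; p += sizeof(uint16_t)) {
		uint16_t v;

		/* memcpy avoids the unaligned uint16_t dereference */
		memcpy(&v, p, sizeof(v));
		sum += v;
	}

	/* if length is odd, keeping it byte order independent */
	if (len % 2) {
		uint16_t left = 0;

		memcpy(&left, end, 1);
		sum += left;
	}
	return sum;
}

/* Usage sketch: same five bytes at an even and at an odd address. */
static void
unaligned_demo(void)
{
	static const uint8_t data[5] = {1, 2, 3, 4, 5};
	uint32_t backing[4] = {0}; /* gives an aligned byte buffer */
	uint8_t *b = (uint8_t *)backing;
	uint32_t aligned, unaligned;

	memcpy(b, data, sizeof(data));
	aligned = raw_cksum_memcpy(b, sizeof(data), 0);
	memcpy(b + 1, data, sizeof(data));
	unaligned = raw_cksum_memcpy(b + 1, sizeof(data), 0);
	assert(aligned == unaligned);
}
```

Whether that is the result wanted for a part at an odd offset of a packet is the question discussed above.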
