From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 11 Jul 2022 11:53:06 +0200
From: Olivier Matz <olivier.matz@6wind.com>
To: Mattias =?iso-8859-1?Q?R=F6nnblom?= <mattias.ronnblom@ericsson.com>
Cc: Emil Berg <emil.berg@ericsson.com>, bruce.richardson@intel.com,
 stephen@networkplumber.org, stable@dpdk.org, bugzilla@dpdk.org,
 dev@dpdk.org, onar.olsen@ericsson.com,
 Morten =?iso-8859-1?Q?Br=F8rup?= <mb@smartsharesystems.com>
Subject: Re: [PATCH v2 2/2] net: have checksum routines accept unaligned data
Message-ID: <YsvzAsy8gG4ArPq7@platinum>
References: <6839721a-8050-0e11-0c66-0f735ec8c56d@ericsson.com>
 <20220708125608.24532-1-mattias.ronnblom@ericsson.com>
 <20220708125608.24532-2-mattias.ronnblom@ericsson.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20220708125608.24532-2-mattias.ronnblom@ericsson.com>

Hi,

On Fri, Jul 08, 2022 at 02:56:08PM +0200, Mattias Rönnblom wrote:
> __rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its
> data through an uint16_t pointer, which allowed the compiler to assume
> the data was 16-bit aligned. This in turn would, with certain
> architectures and compiler flag combinations, result in code with SIMD
> load or store instructions with restrictions on data alignment.
> 
> This patch keeps the old algorithm, but data is read using memcpy()
> instead of direct pointer access, forcing the compiler to always
> generate code that handles unaligned input. The __may_alias__ GCC
> attribute is no longer needed.
> 
> The data on which the Internet checksum functions operates are almost
> always 16-bit aligned, but there are exceptions. In particular, the
> PDCP protocol header may (literally) have an odd size.
> 
> Performance impact seems to range from none to a very slight
> regression.
> 
> Bugzilla ID: 1035
> Cc: stable@dpdk.org
> 
> ---

Using memcpy() looks like a good way to fix the issue, while avoiding both
a branch and the __may_alias__ attribute.
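
To illustrate why memcpy() is enough here, a minimal sketch of an
unaligned 16-bit load. The helper name load_u16_unaligned() is
hypothetical and not part of DPDK; it only shows the idiom the patch
relies on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a DPDK API): read 16 bits from any address. */
static inline uint16_t
load_u16_unaligned(const void *p)
{
	uint16_t v;

	assert(p != NULL);
	/*
	 * memcpy() makes no alignment or aliasing assumptions about p, so
	 * the compiler must emit code that is safe for odd addresses.
	 * Fixed-size copies like this are typically lowered to a plain load
	 * on targets that allow unaligned access.
	 */
	memcpy(&v, p, sizeof(v));
	return v;
}
```

The value comes back in host byte order, exactly as a direct uint16_t
dereference would give on an aligned address.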

I just have one minor comment below.

> 
> v2:
>   * Simplified the odd-length conditional (Morten Brørup).
> 
> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
> 
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> ---
>  lib/net/rte_ip.h | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> index b502481670..a0334d931e 100644
> --- a/lib/net/rte_ip.h
> +++ b/lib/net/rte_ip.h
> @@ -160,18 +160,21 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
>  static inline uint32_t
>  __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>  {
> -	/* extend strict-aliasing rules */
> -	typedef uint16_t __attribute__((__may_alias__)) u16_p;
> -	const u16_p *u16_buf = (const u16_p *)buf;
> -	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> +	const void *end;
>  
> -	for (; u16_buf != end; ++u16_buf)
> -		sum += *u16_buf;
> +	for (end = RTE_PTR_ADD(buf, (len/sizeof(uint16_t)) * sizeof(uint16_t));

What do you think about this form:

	for (end = RTE_PTR_ADD(buf, RTE_ALIGN_FLOOR(len, sizeof(uint16_t)));

This also has the nice property of settling the debate about the
spaces around the '/' :)
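
For reference, a standalone sketch of the whole loop in that form. It
assumes a power-of-two alignment; ALIGN_FLOOR_POW2() is a local stand-in
for RTE_ALIGN_FLOOR() and raw_cksum_sketch() is a hypothetical name, so
treat this as an illustration rather than the final code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for RTE_ALIGN_FLOOR(); valid for power-of-two align only. */
#define ALIGN_FLOOR_POW2(val, align) ((val) & ~((size_t)(align) - 1))

/* Hypothetical sketch of the patched loop, outside of DPDK. */
static inline uint32_t
raw_cksum_sketch(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	const unsigned char *end;

	assert(buf != NULL);
	/* Same value as (len / 2) * 2: the length rounded down to even. */
	end = p + ALIGN_FLOOR_POW2(len, sizeof(uint16_t));

	for (; p != end; p += sizeof(uint16_t)) {
		uint16_t v;

		/* Alignment-agnostic 16-bit read. */
		memcpy(&v, p, sizeof(v));
		sum += v;
	}

	/* Odd length: the last byte fills the first byte of a zeroed word. */
	if (len % 2) {
		uint16_t left = 0;

		memcpy(&left, end, 1);
		sum += left;
	}

	return sum;
}
```

For sizeof(uint16_t), the mask simply clears the low bit of len, which
is what the division and multiplication in the patch compute.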


> +	     buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
> +		uint16_t v;
> +
> +		memcpy(&v, buf, sizeof(uint16_t));
> +		sum += v;
> +	}
>  
>  	/* if length is odd, keeping it byte order independent */
>  	if (unlikely(len % 2)) {
>  		uint16_t left = 0;
> -		*(unsigned char *)&left = *(const unsigned char *)end;
> +
> +		memcpy(&left, end, 1);
>  		sum += left;
>  	}
>  
> -- 
> 2.25.1
>