From mboxrd@z Thu Jan 1 00:00:00 1970
From: Georg Sauthoff
To: Olivier Matz, Thomas Monjalon, David Marchand
Cc: dev@dpdk.org, Georg Sauthoff
Date: Sat, 18 Sep 2021 13:49:29 +0200
Message-Id: <20210918114930.245387-1-mail@gms.tf>
Subject: [dpdk-dev] [PATCH 0/1] net: fix aliasing issue in checksum computation

The current __rte_raw_cksum() function violates the C strict-aliasing
rules since it uses a uint8_t pointer to access a trailing byte.
Unlike unsigned char, uint8_t is not guaranteed to be exempt from
those rules.

This patch also removes a superfluous cast, i.e.:

    uintptr_t ptr = (uintptr_t)buf;
    typedef uint16_t __attribute__((__may_alias__)) u16_p;
    const u16_p *u16_buf = (const u16_p *)ptr;

The transitive cast through uintptr_t doesn't solve anything here, and
it doesn't help with fixing a strict-aliasing issue either.

The patch also simplifies the main loop, i.e. it eliminates the
manually unrolled loop

    while (len >= (sizeof(*u16_buf) * 4)) {
        sum += u16_buf[0];
        sum += u16_buf[1];
        sum += u16_buf[2];
        sum += u16_buf[3];
        len -= sizeof(*u16_buf) * 4;
        u16_buf += 4;
    }

since modern C compilers are in a better position to decide which
level of unrolling is optimal for the target architecture.

See also https://godbolt.org/z/6rYbYGnj7 which shows how GCC
auto-vectorizes the simplified loop using AVX instructions when
compiling for Haswell. Measured by the number of instructions in the
compiled code, the new version is half the size of the existing one.

Georg Sauthoff (1):
  net: fix aliasing issue in checksum computation

 lib/net/rte_ip.h | 27 ++++++++-------------------
 1 file changed, 8 insertions(+), 19 deletions(-)

-- 
2.31.1
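
For illustration, the simplification described in the cover letter
above could look roughly like the following. This is only a sketch
under assumptions, not the code from the patch: the function name
sketch_raw_cksum() is hypothetical, the use of memcpy() for the
trailing byte is one possible aliasing-safe choice, and the buffer is
assumed to be suitably aligned for 16-bit loads.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; this is not the function in lib/net/rte_ip.h. */
static inline uint32_t
sketch_raw_cksum(const void *buf, size_t len, uint32_t sum)
{
    /* may_alias (GCC/Clang extension) allows 16-bit loads from any object.
     * The void pointer converts directly; no uintptr_t round trip needed. */
    typedef uint16_t __attribute__((__may_alias__)) u16_p;
    const u16_p *u16_buf = (const u16_p *)buf;
    const u16_p *end = u16_buf + len / sizeof(*u16_buf);

    /* Plain loop: unrolling and vectorization are left to the compiler. */
    for (; u16_buf != end; ++u16_buf)
        sum += *u16_buf;

    /* Odd length: copy the trailing byte into the first byte of a zeroed
     * 16-bit value. memcpy() sidesteps the uint8_t aliasing question
     * (assumption: the patch itself may handle this differently). */
    if (len % 2) {
        uint16_t left = 0;
        memcpy(&left, end, 1);
        sum += left;
    }

    return sum;
}
```

As in the existing function, the result is an unfolded 32-bit sum of
native-endian 16-bit words, so the caller still has to fold and
complement it.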