From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (xvm-189-124.dc0.ghst.net [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id DF38CA0524; Thu, 7 Jan 2021 17:00:08 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6496E140FF1; Thu, 7 Jan 2021 17:00:08 +0100 (CET) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by mails.dpdk.org (Postfix) with ESMTP id 78FC0140FDF for ; Thu, 7 Jan 2021 17:00:06 +0100 (CET) IronPort-SDR: 1PrGNqjSIaTSYtvARCvwOy1VQMfLLyosRosaMRqwZC2vLZS9ZTZqUZRnIzmsg+0hNjA7Z6nTUW H4s53TEZPhNQ== X-IronPort-AV: E=McAfee;i="6000,8403,9857"; a="177543524" X-IronPort-AV: E=Sophos;i="5.79,329,1602572400"; d="scan'208";a="177543524" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Jan 2021 08:00:04 -0800 IronPort-SDR: ayQAOmQ5Ze3VeANjlqf0cz7QFr1b5bn6kiqk/4EWMLvb/YlpDBNn9yFCss+nrUmf1uHicbxI6/ /yP1CV3tjKHg== X-IronPort-AV: E=Sophos;i="5.79,329,1602572400"; d="scan'208";a="362010697" Received: from fyigit-mobl1.ger.corp.intel.com (HELO [10.252.29.165]) ([10.252.29.165]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Jan 2021 08:00:03 -0800 To: Stephen Hemminger , George Prekas Cc: Wenzhuo Lu , Beilei Xing , Bernard Iremonger , dev@dpdk.org References: <20201203135954.1127-1-prekageo@amazon.com> <20201205054238.12469-1-prekageo@amazon.com> <8c1fd6ad-302d-f45e-f48a-e81fd5f8ba85@intel.com> <6539f363-97b1-82b2-1b09-036aa75c9dc9@amazon.com> <20210107075008.77704541@hermes.local> From: Ferruh Yigit Message-ID: <6157960e-0ed9-5bd0-a150-f1c59c6f95a4@intel.com> Date: Thu, 7 Jan 2021 15:59:59 +0000 MIME-Version: 1.0 In-Reply-To: <20210107075008.77704541@hermes.local> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [PATCH v2] app/testpmd: fix IP checksum calculation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 1/7/2021 3:50 PM, Stephen Hemminger wrote: > On Wed, 6 Jan 2021 23:39:39 -0600 > George Prekas wrote: > >> On 1/6/2021 12:02 PM, Ferruh Yigit wrote: >>> On 12/5/2020 5:42 AM, George Prekas wrote: >>>> Strict-aliasing rules are violated by cast to uint16_t* in flowgen.c >>>> and the calculated IP checksum is wrong on GCC 9 and GCC 10. >>>> >>>> Signed-off-by: George Prekas >>>> --- >>>> v2: >>>> * Instead of a compiler barrier, use a compiler flag. >>>> --- >>>>   app/test-pmd/meson.build | 1 + >>>>   1 file changed, 1 insertion(+) >>>> >>>> diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build >>>> index 7e9c7bdd6..5d24e807f 100644 >>>> --- a/app/test-pmd/meson.build >>>> +++ b/app/test-pmd/meson.build >>>> @@ -4,6 +4,7 @@ >>>>   # override default name to drop the hyphen >>>>   name = 'testpmd' >>>>   cflags += '-Wno-deprecated-declarations' >>>> +cflags += '-fno-strict-aliasing' >>>>   sources = files('5tswap.c', >>>>       'cmdline.c', >>>>       'cmdline_flow.c', >>>> >>> >>> Hi George, >>> >>> I am trying to understand this, the relevant code is as below: >>> ip_hdr->hdr_checksum = ip_sum((unaligned_uint16_t *)ip_hdr, sizeof(*ip_hdr)); >>> >>> You are suspicious of strict aliasing rule violation, with more details: >>> The concern is the "struct rte_ipv4_hdr *ip_hdr;" aliased to "const >>> unaligned_uint16_t *hdr", and compiler can optimize out the calculations using >>> data pointed by 'hdr' pointer, since the 'hdr' pointer is not used to alter the >>> data and compiler may think data is not changed at all. >>> >>> 1) But the pointer "hdr" is assigned in the loop, from another pointer whose >>> content is changing, why this is not helping to figure out that the data 'hdr' >>> pointing is changed. >>> >>> 2) I tried to debug this, but I am not able to reproduce the issue, 'ip_sum()' >>> called each time and checksum calculated correctly. Using gcc 10.2.1-9. Can you >>> able to confirm the case with debug, or from the assembly/object file? >>> >>> >>> And if the issue is strict aliasing rule violation as you said, compiler flag is >>> an option but not sure how much it reduces the compiler optimization benefit, I >>> guess other options also not so good, memcpy brings too much work on runtime and >>> union requires bigger change and makes code complex. >>> I wonder if making 'ip_sum()' a non inline function can help, can you please >>> give a try since you can reproduce it? >> >> Hi Ferruh, >> >> Thanks for looking into it. >> >> I am copy-pasting at the end of this email a minimal reproduction. It calculates a checksum and prints it. The correct value is f8d9. If you compile it with -O0 or -O3 -fno-strict-aliasing, you will get the correct value. If you compile it with gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 and -O3, you will get f8e8. You can also try it on https://godbolt.org/ and see how different versions behave. >> >> My understanding is that the code violates the C standard (https://stackoverflow.com/a/99010). >> >> --- cut here --- >> >> #include >> #include >> #include >> #include >> >> struct rte_ipv4_hdr { >> uint8_t version_ihl; >> uint8_t type_of_service; >> uint16_t total_length; >> uint16_t packet_id; >> uint16_t fragment_offset; >> uint8_t time_to_live; >> uint8_t next_proto_id; >> uint16_t hdr_checksum; >> uint32_t src_addr; >> uint32_t dst_addr; >> }; >> >> static inline uint16_t ip_sum(const uint16_t *hdr, int hdr_len) >> { >> uint32_t sum = 0; >> >> while (hdr_len > 1) >> { >> sum += *hdr++; >> if (sum & 0x80000000) >> sum = (sum & 0xFFFF) + (sum >> 16); >> hdr_len -= 2; >> } >> >> while (sum >> 16) >> sum = (sum & 0xFFFF) + (sum >> 16); >> >> return ~sum; >> } >> >> static void pkt_burst_flow_gen(void) >> { >> struct rte_ipv4_hdr *ip_hdr = (struct rte_ipv4_hdr *) malloc(4096); >> memset(ip_hdr, 0, sizeof(*ip_hdr)); >> ip_hdr->version_ihl = 1; >> ip_hdr->type_of_service = 2; >> ip_hdr->fragment_offset = 3; >> ip_hdr->time_to_live = 4; >> ip_hdr->next_proto_id = 5; >> ip_hdr->packet_id = 6; >> ip_hdr->src_addr = 7; >> ip_hdr->dst_addr = 8; >> ip_hdr->total_length = 9; >> ip_hdr->hdr_checksum = ip_sum((uint16_t *)ip_hdr, sizeof(*ip_hdr)); >> printf("%x\n", ip_hdr->hdr_checksum); >> } >> >> int main(void) >> { >> pkt_burst_flow_gen(); >> return 0; >> } > > If I change your code like this to use union, Gcc 10 is still broken. This worked fine for me: https://godbolt.org/z/vdsxh9 > It is a compiler bug. It maybe because optimizer is not smart enough > to know that memset has cleared the header. > > > #include > #include > #include > #include > > struct rte_ipv4_hdr { > uint8_t version_ihl; > uint8_t type_of_service; > uint16_t total_length; > uint16_t packet_id; > uint16_t fragment_offset; > uint8_t time_to_live; > uint8_t next_proto_id; > uint16_t hdr_checksum; > uint32_t src_addr; > uint32_t dst_addr; > }; > > static inline uint16_t ip_sum(const uint16_t *hdr, int hdr_len) > { > uint32_t sum = 0; > > while (hdr_len > 1) > { > sum += *hdr++; > if (sum & 0x80000000) > sum = (sum & 0xFFFF) + (sum >> 16); > hdr_len -= 2; > } > > while (sum >> 16) > sum = (sum & 0xFFFF) + (sum >> 16); > > return ~sum; > } > > static void pkt_burst_flow_gen(void) > { > union { > struct rte_ipv4_hdr ip; > uint16_t data[10]; > } *hdr; > > hdr = malloc(sizeof(*hdr)); > > memset(hdr, 0, sizeof(*hdr)); > hdr->ip.version_ihl = 1; > hdr->ip.type_of_service = 2; > hdr->ip.fragment_offset = 3; > hdr->ip.time_to_live = 4; > hdr->ip.next_proto_id = 5; > hdr->ip.packet_id = 6; > hdr->ip.src_addr = 7; > hdr->ip.dst_addr = 8; > hdr->ip.total_length = 9; > hdr->ip.hdr_checksum = ip_sum(hdr->data, sizeof(*hdr)); > printf("%x\n", hdr->ip.hdr_checksum); > } > > int main(void) > { > pkt_burst_flow_gen(); > return 0; > } >