From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 73C5942A65; Fri, 5 May 2023 03:54:11 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id E9E0040ED7; Fri, 5 May 2023 03:54:10 +0200 (CEST) Received: from loongson.cn (mail.loongson.cn [114.242.206.163]) by mails.dpdk.org (Postfix) with ESMTP id BC7E5406BA for ; Fri, 5 May 2023 03:54:08 +0200 (CEST) Received: from loongson.cn (unknown [10.20.42.90]) by gateway (Coremail) with SMTP id _____8CxOuq+YVRkCvQEAA--.8044S3; Fri, 05 May 2023 09:54:06 +0800 (CST) Received: from [10.20.42.90] (unknown [10.20.42.90]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxsOS9YVRkuQFLAA--.7193S3; Fri, 05 May 2023 09:54:05 +0800 (CST) Subject: Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions To: =?UTF-8?Q?Morten_Br=c3=b8rup?= , Konstantin Ananyev Cc: dev@dpdk.org, maobibo@loongson.cn, qiming.yang@intel.com, wenjun1.wu@intel.com, ruifeng.wang@arm.com, drc@linux.vnet.ibm.com, Tyler Retzlaff References: <20230424090532.367194-1-zhoumin@loongson.cn> <20671e5e-8e86-4bc3-2d95-4cec014b0539@yandex.ru> <43fb4e62-63b5-7076-327e-dc30cd561fff@loongson.cn> <98CBD80474FA8B44BF855DF32C47DC35D878E4@smartserver.smartshare.dk> From: zhoumin Message-ID: Date: Fri, 5 May 2023 09:54:05 +0800 User-Agent: Mozilla/5.0 (X11; Linux loongarch64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0 MIME-Version: 1.0 In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D878E4@smartserver.smartshare.dk> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-CM-TRANSID: AQAAf8CxsOS9YVRkuQFLAA--.7193S3 X-CM-SenderInfo: 52kr3ztlq6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjvJXoWxZF1kXr17AFWrXrW7Wr47XFb_yoWrAFWrpF WkJan0kryUGrW8Jw12qw18Wr9IyrWrJ3WUXryrJ3WkKr4qq3WY9F4jqr1j9F1DXr48G3Wj vr1UJ39xuFs8AaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUj1kv1TuYvTs0mT0YCTnIWj qI5I8CrVACY4xI64kE6c02F40Ex7xfYxn0WfASr-VFAUDa7-sFnT9fnUUIcSsGvfJTRUUU baxYFVCjjxCrM7AC8VAFwI0_Jr0_Gr1l1xkIjI8I6I8E6xAIw20EY4v20xvaj40_Wr0E3s 1l1IIY67AEw4v_Jrv_JF1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28EF7xv wVC0I7IYx2IY67AKxVW8JVW5JwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8JVWxJwA2z4 x0Y4vEx4A2jsIE14v26F4j6r4UJwA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJwAS 0I0E0xvYzxvE52x082IY62kv0487Mc804VCY07AIYIkI8VC2zVCFFI0UMc02F40EFcxC0V AKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUAVWUtwAv7VC2z280aVAFwI0_Jr0_Gr1l Ox8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcVAKI48JMxk0xIA0c2IEe2xFo4CEbIxvr21l42 xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1l4IxYO2xFxVAFwI0_JF0_Jw1l x2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14 v26r1q6r43MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IY x2IY6xkF7I0E14v26r1j6r4UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87 Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r1j6r4UYxBIdaVFxhVjvjDU0xZF pf9x07j83kZUUUUU= X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi Morten, On Thur, May 4, 2023 at 9:21PM, Morten Brørup wrote: >> From: zhoumin [mailto:zhoumin@loongson.cn] >> Sent: Thursday, 4 May 2023 15.17 >> >> Hi Konstantin, >> >> Thanks for your  comments. >> >> On 2023/5/1 下午9:29, Konstantin Ananyev wrote: >>>> Segmentation fault has been observed while running the >>>> ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 >>>> processor which has 64 cores and 4 NUMA nodes. >>>> >>>> From the ixgbe_recv_pkts_lro() function, we found that as long as the >>>> first >>>> packet has the EOP bit set, and the length of this packet is less >>>> than or >>>> equal to rxq->crc_len, the segmentation fault will definitely happen >>>> even >>>> though on the other platforms, such as X86. >>>> >>>> Because when processd the first packet the first_seg->next will be >>>> NULL, if >>>> at the same time this packet has the EOP bit set and its length is less >>>> than or equal to rxq->crc_len, the following loop will be excecuted: >>>> >>>>     for (lp = first_seg; lp->next != rxm; lp = lp->next) >>>>         ; >>>> >>>> We know that the first_seg->next will be NULL under this condition. >>>> So the >>>> expression of lp->next->next will cause the segmentation fault. >>>> >>>> Normally, the length of the first packet with EOP bit set will be >>>> greater >>>> than rxq->crc_len. However, the out-of-order execution of CPU may >>>> make the >>>> read ordering of the status and the rest of the descriptor fields in >>>> this >>>> function not be correct. The related codes are as following: >>>> >>>>         rxdp = &rx_ring[rx_id]; >>>>  #1     staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error); >>>> >>>>         if (!(staterr & IXGBE_RXDADV_STAT_DD)) >>>>             break; >>>> >>>>  #2     rxd = *rxdp; >>>> >>>> The sentence #2 may be executed before sentence #1. This action is >>>> likely >>>> to make the ready packet zero length. If the packet is the first >>>> packet and >>>> has the EOP bit set, the above segmentation fault will happen. >>>> >>>> So, we should add rte_rmb() to ensure the read ordering be correct. >>>> We also >>>> did the same thing in the ixgbe_recv_pkts() function to make the rxd >>>> data >>>> be valid even thougth we did not find segmentation fault in this >>>> function. >>>> >>>> Signed-off-by: Min Zhou >>>> --- >>>> v2: >>>> - Make the calling of rte_rmb() for all platforms >>>> --- >>>>  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++ >>>>  1 file changed, 3 insertions(+) >>>> >>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c >>>> b/drivers/net/ixgbe/ixgbe_rxtx.c >>>> index c9d6ca9efe..302a5ab7ff 100644 >>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c >>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c >>>> @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf >>>> **rx_pkts, >>>>          staterr = rxdp->wb.upper.status_error; >>>>          if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) >>>>              break; >>>> + >>>> +        rte_rmb(); >>>>          rxd = *rxdp; >>> >>> >>> Indeed, looks like a problem to me on systems with relaxed MO. >>> Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers. >> The LoongArch architecture uses the Weak Consistency model which can >> cause the problem, especially in scenario with many cores, such as >> Loongson 3C5000 with four NUMA node, which has 64 cores. I cannot >> reproduce it on Loongson 3C5000 with one NUMA node, which just has 16 cores. >>> About a fix - looks right, but a bit excessive to me - >>> as I understand all we need here is to prevent re-ordering by CPU itself. >> Yes, thanks for cc-ing. >>> So rte_smp_rmb() seems enough here. >>> Or might be just: >>> staterr = __atomic_load_n(&rxdp->wb.upper.status_error, >>> __ATOMIC_ACQUIRE); >>> >> Does __atomic_load_n() work on Windows if we use it to solve this problem ? > Yes, __atomic_load_n() works on Windows too. > Thank you, Morten. I got it. I will compare those barriers and choose a proper one for this problem. Best regards, Min