From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id BE4DA43E50; Fri, 12 Apr 2024 10:16:58 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8EDC840299; Fri, 12 Apr 2024 10:16:58 +0200 (CEST) Received: from szxga05-in.huawei.com (szxga05-in.huawei.com [45.249.212.191]) by mails.dpdk.org (Postfix) with ESMTP id CCC8740295 for ; Fri, 12 Apr 2024 10:16:56 +0200 (CEST) Received: from mail.maildlp.com (unknown [172.19.162.112]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4VG8V13k5zz2NW75; Fri, 12 Apr 2024 16:14:01 +0800 (CST) Received: from dggpeml500024.china.huawei.com (unknown [7.185.36.10]) by mail.maildlp.com (Postfix) with ESMTPS id E144D140555; Fri, 12 Apr 2024 16:16:54 +0800 (CST) Received: from [10.67.121.161] (10.67.121.161) by dggpeml500024.china.huawei.com (7.185.36.10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.35; Fri, 12 Apr 2024 16:16:54 +0800 Subject: Re: [PATCH] ethdev: fix strict aliasing lead to link cannot be up To: Stephen Hemminger CC: , , , , , References: <20240411030749.41874-1-fengchengwen@huawei.com> <20240411080530.1f5030c6@hermes.local> From: fengchengwen Message-ID: <6dfc109e-8b02-e8d4-c14f-03febf33839e@huawei.com> Date: Fri, 12 Apr 2024 16:16:54 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0 MIME-Version: 1.0 In-Reply-To: <20240411080530.1f5030c6@hermes.local> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.67.121.161] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggpeml500024.china.huawei.com (7.185.36.10) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi Stephen, On 2024/4/11 23:05, Stephen Hemminger wrote: > On Thu, 11 Apr 2024 03:07:49 +0000 > Chengwen Feng wrote: > >> Fix a problem introduced by a compiler upgrade (from gcc10 to gcc12.3), >> which will lead the hns3 NIC can't link up. The root cause is strict >> aliasing violation in rte_eth_linkstatus_set() with hns3 driver, see >> [1] for more details. >> >> This commit use union to avoid such aliasing violation. >> >> [1] Strict aliasing problem with rte_eth_linkstatus_set() >> https://marc.info/?l=dpdk-dev&m=171274148514777&w=3 >> >> Cc: stable@dpdk.org >> >> Signed-off-by: Chengwen Feng >> Signed-off-by: Dengdui Huang >> --- > > The patch to use union is correct. > Examining the link status fuller raises a couple of pre-existing issues. > > 1. Why is this an inline function, there is no way this is in the > fast path of any driver or application? Nice catch. The rte_eth_linkstatus_set/get was first introduced by commit "b77d21cc23 ethdev: add link status get/set helper functions". which were inline function. But I check this patchset [1], and found in v1~v3 is was impl without static inline, and the v4 just with static inline. and I can't get the change reason. @Stephen could you recall something about this? Below is what I try to find the reason from source code: 1, Why rte_eth_linkstatus_get is inline, I think maybe due: a, memif driver invoke rte_eth_link_get() in Rx/Tx path (eth_memif_rx/eth_memif_rx_zc/eth_memif_tx/eth_memif_tx_zc) b, before the above commit, rte_eth_link_get() invoke rte_eth_dev_atomic_read_link_status() which was static inline. from above, I think it mainly due to performance reason, so keep the original impl. 2, Why rte_eth_linkstatus_set is inline, I check the following patch: 5042dde07d net/enic: use link status helper functions 2b4ab4223d net/octeontx: use link status helper functions cc92eb9a97 net/szedata2: use link status helper functions 8e14dc285a net/thunderx: use link status helper functions e324523c6c net/liquidio: use link status helper functions e66b0fd123 net/i40e: use link status helper functions 4abe903e50 net/sfc: use link status helper functions faadebad81 net/ixgbe: use link status helper functions 80ba61115e net/e1000: use link status helper functions aab28ea2bc net/nfp: use link status helper functions 7e2eb5f0d2 net/dpaa2: use link status helper functions 13086a8f50 net/vmxnet3: use link status helper functions 717b2e8eae net/virtio: use link status helper functions almost all pmd's set link function is static inline, I also check, and found only sfc driver will invoke it in Tx path: sfc_efx_recv_pkts->sfc_ev_qpoll->efx_ev_qpoll->eevo_qpoll->siena_ef10_ev_qpoll->rhead_ev_mcdi->ef10_ev_mcdi->sfc_ev_link_change->rte_eth_linkstatus_set [1] https://inbox.dpdk.org/dev/20180108174514.14688-1-stephen@networkplumber.org/ https://inbox.dpdk.org/dev/20180116182558.6254-1-stephen@networkplumber.org/ > > 2. Why is it marked sequential consistent and not relaxed? How could there be a visible > relationship between link status and other variables. Drivers would > not be using the link status state as internal variable. The original was impl by rte_atomic64_cmpset(): 1, In powerpc platform, it was impl as: static inline int rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src) { return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire, rte_memory_order_acquire) ? 1 : 0; } 2, In ARM64 platform, it was impl as: static inline int rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src) { return __sync_bool_compare_and_swap(dst, exp, src); } The __sync_bool_compare_and_swap will compiled with a dmb. So there's no problem from the point of view of replacement (just with the memory barrier). I also think there are no need with sequential consistent, because this atomic mainly to solve the concurrent problem of rte_eth_link_get and driver lsc interrupt setting. BTW: the above two problem is OK, but I think we could tackle with another commit (NOT this). Thanks > > . >