From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1DBFEA0C41; Thu, 30 Sep 2021 12:54:11 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 910AA40DDA; Thu, 30 Sep 2021 12:54:10 +0200 (CEST) Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by mails.dpdk.org (Postfix) with ESMTP id E14554067E for ; Thu, 30 Sep 2021 12:54:07 +0200 (CEST) Received: from dggemv703-chm.china.huawei.com (unknown [172.30.72.54]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4HKqkg4KSRzbmlW; Thu, 30 Sep 2021 18:49:47 +0800 (CST) Received: from dggema767-chm.china.huawei.com (10.1.198.209) by dggemv703-chm.china.huawei.com (10.3.19.46) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2308.8; Thu, 30 Sep 2021 18:54:05 +0800 Received: from [10.66.74.184] (10.66.74.184) by dggema767-chm.china.huawei.com (10.1.198.209) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2308.8; Thu, 30 Sep 2021 18:54:04 +0800 To: "Singh, Aman Deep" , Ferruh Yigit , Thomas Monjalon CC: Andrew Rybchenko , "dev@dpdk.org" , Anatoly Burakov , David Marchand References: <1627908397-51565-1-git-send-email-lihuisong@huawei.com> <1627957839-38279-1-git-send-email-lihuisong@huawei.com> <9670389d-ebbc-9d9c-0cac-c7e8826ecb6f@huawei.com> <21383486.34YfpWhNxb@thomas> <858dc20f-5711-f770-2c08-0a432b6ea733@huawei.com> <4e2b5720-6da0-155b-845d-76c42b800abc@intel.com> From: Huisong Li Message-ID: Date: Thu, 30 Sep 2021 18:54:04 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.8.0 MIME-Version: 1.0 In-Reply-To: <4e2b5720-6da0-155b-845d-76c42b800abc@intel.com> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [10.66.74.184] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggema767-chm.china.huawei.com (10.1.198.209) X-CFilter-Loop: Reflected Subject: Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" 在 2021/9/28 15:19, Singh, Aman Deep 写道: > > On 9/22/2021 9:01 AM, Huisong Li wrote: >> >> 在 2021/9/20 22:07, Ferruh Yigit 写道: >>> On 8/25/2021 10:53 AM, Huisong Li wrote: >>>> 在 2021/8/24 22:42, Ferruh Yigit 写道: >>>>> On 8/19/2021 4:45 AM, Huisong Li wrote: >>>>>> 在 2021/8/18 19:24, Ferruh Yigit 写道: >>>>>>> On 8/13/2021 9:16 AM, Huisong Li wrote: >>>>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道: >>>>>>>>> 13/08/2021 04:11, Huisong Li: >>>>>>>>>> Hi, all >>>>>>>>>> >>>>>>>>>> This patch can enhance the security of device uninstallation to >>>>>>>>>> eliminate dependency on user usage methods. >>>>>>>>>> >>>>>>>>>> Can you check this patch? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道: >>>>>>>>>>> Ethernet devices in DPDK can be released by >>>>>>>>>>> rte_eth_dev_close() and >>>>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in >>>>>>>>>>> PMD layer >>>>>>>>>>> to uninstall hardware. However, the two APIs do not have >>>>>>>>>>> explicit >>>>>>>>>>> invocation restrictions. In other words, at the ethdev >>>>>>>>>>> layer, it is >>>>>>>>>>> possible to call rte_eth_dev_close() before calling >>>>>>>>>>> rte_dev_remove() >>>>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario, >>>>>>>>> It is not a bad scenario. >>>>>>>>> If there is no more port for the device after calling close, >>>>>>>>> the device should be removed automatically. >>>>>>>>> Keep in mind "close" is for one port, "remove" is for the >>>>>>>>> entire device >>>>>>>>> which can have more than one port. >>>>>>>> I know. >>>>>>>> >>>>>>>> dev_close() is for removing an eth device. And rte_dev_remove() >>>>>>>> can be used >>>>>>>> >>>>>>>> for removing the rte device and all its eth devices belonging >>>>>>>> to the rte >>>>>>>> device. >>>>>>>> >>>>>>>> In rte_dev_remove(), "remove" is executed in primary or one of >>>>>>>> secondary, >>>>>>>> >>>>>>>> all eth devices having same pci address will be closed and >>>>>>>> removed. >>>>>>>> >>>>>>>>>>> the primary >>>>>>>>>>> process may be fine, but it may cause that xxx_dev_close() >>>>>>>>>>> in the PMD >>>>>>>>>>> layer will be called twice in the secondary process. So this >>>>>>>>>>> patch >>>>>>>>>>> fixes it. >>>>>>>>> If a port is closed in primary, it should be the same in >>>>>>>>> secondary. >>>>>>>>> >>>>>>>>> >>>>>>>>>>> +    /* >>>>>>>>>>> +     * The eth_dev->data->name doesn't be cleared by the >>>>>>>>>>> secondary >>>>>>>>>>> process, >>>>>>>>>>> +     * so above "eth_dev" isn't NULL after >>>>>>>>>>> rte_eth_dev_close() called. >>>>>>>>> This assumption is not clear. All should be closed together. >>>>>>>> However, dev_close() does not have the feature similar to >>>>>>>> rte_dev_remove(). >>>>>>>> >>>>>>>> Namely, it is not guaranteed that all eth devices are closed >>>>>>>> together in >>>>>>>> ethdev >>>>>>>> layer. It depends on app or user. >>>>>>>> >>>>>>>> If the app does not close together, the operation of repeatedly >>>>>>>> uninstalling an >>>>>>>> eth device in the secondary process >>>>>>>> >>>>>>>> will be triggered when dev_close() is first called by one >>>>>>>> secondary >>>>>>>> process, and >>>>>>>> then rte_dev_remove() is called. >>>>>>>> >>>>>>>> So I think it should be avoided. >>>>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or >>>>>>> 'rte_dev_remove()' from the secondary process. >>>>>>> There are explicit checks in various locations to prevent >>>>>>> clearing resources >>>>>>> completely from secondary process. >>>>>> There's no denying that. >>>>>> >>>>>> Generally, hardware resources of eth device and shared data of >>>>>> the primary and >>>>>> secondary process >>>>>> >>>>>> are cleared by primary, which are controled by ethdev layer or >>>>>> PMD layer. >>>>>> >>>>>> But there may be some private data or resources of each process >>>>>> (primary or >>>>>> secondary ), such as mp action >>>>>> >>>>>> registered by rte_mp_action_register() or others.  For these >>>>>> resources, the >>>>>> secondary process still needs to clear. >>>>>> >>>>>> Namely, both primary and secondary processes need to prevent >>>>>> repeated offloading >>>>>> of resources. >>>>>> >>>>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary >>>>>>> is technically >>>>>>> can be done but application needs to be extra cautious and >>>>>>> should take extra >>>>>>> measures and synchronization to make it work. >>>>>>> Regular use-case is secondary processes do the packet processing >>>>>>> and all >>>>>>> control >>>>>>> commands run by primary. >>>>>> You are right. We have a consensus that 'rte_eth_dev_close()' or >>>>>> 'rte_dev_remove()' >>>>>> >>>>>> can be called by primary and secondary processes. >>>>>> >>>>>> But DPDK framework cannot assume user behavior.😁 >>>>>> >>>>>> We need to make it more secure and reliable for both primary and >>>>>> secondary >>>>>> processes. >>>>>> >>>>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all >>>>>>> ethdev >>>>>>> resources >>>>>>> and further 'rte_dev_remove()' call will detect missing ethdev >>>>>>> resources and >>>>>>> won't try to clear them again. >>>>>>> >>>>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear >>>>>>> all resources >>>>>>> and further 'rte_dev_remove()' call (either from primary or >>>>>>> secondary) will try >>>>>>> to clean ethdev resources again. You are trying to prevent this >>>>>>> retry in remove >>>>>>> happening for secondary process. >>>>>> Right. However, if secondary process in PMD layer has its own >>>>>> private resources >>>>>> to be >>>>>> >>>>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or >>>>>> 'rte_dev_remove()'. >>>>>> >>>>>>> In secondary it won't free ethdev resources anyway if you let it >>>>>>> continue, >>>>>>> but I >>>>>>> guess here you are trying to prevent the PMD dev_close() called >>>>>>> again. Why? Is >>>>>>> it just for optimization or does it cause unexpected behavior in >>>>>>> the PMD? >>>>>>> >>>>>>> >>>>>>> Overall, to free resources you need to do the >>>>>>> 'rte_eth_dev_close()' or >>>>>>> 'rte_dev_remove()' in the primary anyway. So instead of this >>>>>>> workaround, I >>>>>>> would >>>>>>> suggest making PMD dev_close() safe to be called multiple times >>>>>>> (if this is the >>>>>>> problem.) >>>>>> In conclusion,  primary and secondary processes in PMD layer may >>>>>> have their own >>>>>> >>>>>> private data and resources, which need to be processed and released. >>>>>> >>>>>> Currently,  these for PMD are either handled and cleaned up in >>>>>> dev_close() or >>>>>> remove(). >>>>>> >>>>>> However, code streams in rte_dev_remove() cannot ensure that the >>>>>> uninstallation >>>>>> >>>>>> from secondary process will not be repeated if >>>>>> rte_eth_dev_close() is first >>>>>> called by >>>>>> >>>>>> secondary(primary is ok, plese review this patch). >>>>>> >>>>>> I think, this is the same for each PMD and is better suited to >>>>>> doing it in >>>>>> ethdev layer. >>>>>> >>>>> This patch prevents to call dev_close() twice in the secondary >>>>> process, is this >>>>> fixing a theoretical problem or an actual problem? >>>>> >>>>> If it is an actual problem can you please provide details, >>>>> callstack of the >>>>> problematic case? >>>> We analyzed the code when modifying the bug and found that the >>>> problem did exist. >>>> >>>> The ethdev layer did not guarantee the security. >>>> >>> I was wondering if there is a crash for an unexpected path, for >>> below case the >>> primary process check in the 'hns3_dev_uninit()' should already >>> prevent anything >>> unexpected. So I assume this is a fix for a theoretical issue. >> >> Yes. For primary process, hns3 can prevent multiple device >> uninstallation through >> >> "adapter_state" controled by primary process. >> >> The problem of multiple device uninstallation has been prevented at >> rte_eth_dev_pci_generic_remove() >> >> in the ethdev layer, as described in the patch. >> >> In primary process, rte_eth_dev_allocated() in >> rte_eth_dev_pci_generic_remove() will return NULL >> >> when first calling dev_close(), and then calling rte_dev_remove(). So >> it is ok. But the logic can not >> >> prevent the same case in secondary because secondary does not clear >> dev->data. >> >>> >>> In secondary process, these init/uninit device is already not very >>> safe. For >>> example, as far as I can see in secondary if you hot remove a device >>> device and >>> hot plug a new one, new device will use wrong device data (since hot >>> remove >>> won't clear device data, new one will continue to use it). >> >> No. If we hot remove a device in secondary, the secondary will >> request its primary >> >> send "remove device" message to all secondaries. After all >> secondaries are removed, >> >> the primary also removes the device and the device data will be cleared. >> >> This is the logic of rte_dev_remove(). >> >>> So I am not sure about adding secondary process related checks in >>> that area and >>> causing a false sense of security, and polluting the logic with >>> secondary >>> specific checks. >>> Also the check you add may hit by primary process and I am worried >>> on an >>> unexpected side affect it cause. >>> >>> >>> As said above I am not sure about this new check, but even we >>> continue with it, >>> what about wrapping the check with secondary process check, at least >>> to be sure >>> there won't be any side affect for primary process. >>> >> If app hot remove device in primary/secondary, the original logic is >> still used in this interface, because of >> >> bing RTE_ETH_DEV_ATTACHED state for eth_dev. If app first calls >> dev_close(), eth device is >> >> RTE_ETH_DEV_UNUSED state in primary/secondary. In this issue case, >> this interface is still return from the >> >> original place instead of the new check. >> >> So I don't think the new check will affect the primary process. >> Conversely, it's safer for secondary processes. >> >> Please check it again, thanks. > > As this issue is specific to secondary process, can we have these > change under a check like "if (rte_eal_process_type() != > RTE_PROC_PRIMARY)" > > By this we can avoid any side effect of these changes for primary > process and also it makes code readablity easier for designers who are > not checking secondary process changes. yes. @Ferruh, what do you think? > >> >>>> The general function of the two interfaces is as follows: >>>> >>>> rte_eth_dev_close() --> release eth device >>>> >>>> rte_dev_remove() -->   release eth device + remove and free >>>> rte_pci_device(primary andsecondary). >>>> >>>> According to the OVS application scenario, first call dev_close() >>>> and then call >>>> >>>> remove(), which is possible. >>>> >>>> We constructed this scenario using testpmd to start the secondary >>>> process(the >>>> multi-process >>>> >>>> patch of testpmd is being uploaded.). It is proved that the ethdev >>>> layer cannot >>>> guarantee >>>> >>>> this security. The callstack is as follows: >>>> >>>> ************************************************************************************************************************************************** >>>> >>>> >>>> >>>> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee >>>> --proc-type=auto -- >>>> -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1 >>>> (gdb) b hns3_dev_uninit >>>> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c, >>>> line 7806. >>>> (gdb) b hns3_dev_close >>>> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c, >>>> line 6189. >>>> (gdb) r >>>> Starting program: >>>> /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a >>>> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i >>>> --rxq 4 --txq 4 >>>> --num-procs=2 --proc-id=1 >>>> Missing separate debuginfo for /root/lib/libnuma.so.1 >>>> Try: yum --enablerepo='*debug*' install >>>> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug >>>> >>>> [Thread debugging using libthread_db enabled] >>>> Using host libthread_db library "/lib64/libthread_db.so.1". >>>> EAL: Detected 128 lcore(s) >>>> EAL: Detected 4 NUMA nodes >>>> EAL: Auto-detected process type: SECONDARY >>>> EAL: Detected static linkage of DPDK >>>> [New Thread 0xfffff7a0ad10 (LWP 17717)] >>>> EAL: Multi-process socket >>>> /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9 >>>> [New Thread 0xfffff7209d10 (LWP 17718)] >>>> EAL: Selected IOVA mode 'VA' >>>> EAL: VFIO support initialized >>>> [New Thread 0xfffff69f8d10 (LWP 17719)] >>>> EAL: Using IOMMU type 1 (Type 1) >>>> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0 >>>> (socket 0) >>>> [New Thread 0xfffff61f7d10 (LWP 17720)] >>>> TELEMETRY: No legacy callbacks, legacy socket not created >>>> Interactive-mode selected >>>> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be >>>> stopped >>>> before configuration >>>> Failed to set MTU to 1500 for port 0 >>>> testpmd: create a new mbuf pool : n=155456, size=2176, >>>> socket=0 >>>> testpmd: preferred mempool ops selected: ring_mp_mc >>>> >>>> Warning! port-topology=paired and odd forward ports number, the >>>> last port will >>>> pair with itself. >>>> >>>> Configuring Port 0 (socket 0) >>>> Port 0: 00:18:2D:00:00:9E >>>> Checking link statuses... >>>> Done >>>> testpmd> port close 0 >>>> Closing ports... >>>> >>>> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 ) >>>>      at ../drivers/net/hns3/hns3_ethdev.c:6189 >>>> 6189        struct hns3_adapter *hns = eth_dev->data->dev_private; >>>> Missing separate debuginfos, use: debuginfo-install >>>> glibc-2.17-260.el7.aarch64 >>>> libpcap-1.5.3-11.el7.aarch64 openssl-libs-1.0.2k-16.el7.aarch64 >>>> zlib-1.2.7-18.el7.aarch64 >>>> (gdb) bt >>>> #0  hns3_dev_close (eth_dev=0x21cdb80 ) at >>>> ../drivers/net/hns3/hns3_ethdev.c:6189 >>>> #1  0x0000000000742eac in rte_eth_dev_close (port_id=0) at >>>> ../lib/ethdev/rte_ethdev.c:1870 >>>> #2  0x0000000000542a8c in close_port (pid=0) at >>>> ../app/test-pmd/testpmd.c:2895 >>>> #3  0x00000000004df82c in cmd_operate_specific_port_parsed >>>> (parsed_result=0xffffffffb0f0, >>>>      cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272 >>>> #4  0x0000000000738ce4 in cmdline_parse (cl=0x2475080, >>>> buf=0x24750c8 "port close >>>> 0\n") >>>>      at ../lib/cmdline/cmdline_parse.c:290 >>>> #5  0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090, >>>> buf=0x24750c8 >>>> "port close 0\n", >>>>      size=14) at ../lib/cmdline/cmdline.c:26 >>>> #6  0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n') >>>>      at ../lib/cmdline/cmdline_rdline.c:421 >>>> #7  0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f >>>> "\n\200\362\377\377\377\377", >>>>      size=1) at ../lib/cmdline/cmdline.c:149 >>>> #8  0x0000000000737270 in cmdline_interact (cl=0x2475080) at >>>> ../lib/cmdline/cmdline.c:223 >>>> #9  0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882 >>>> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at >>>> ../app/test-pmd/testpmd.c:3998 >>>> (gdb) c >>>> Continuing. >>>> Port 0 is closed >>>> Done >>>> testpmd> device detach 0000:7d:00.0 >>>> Removing a device... >>>> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)] >>>> >>>> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 ) >>>>      at ../drivers/net/hns3/hns3_ethdev.c:7806 >>>> 7806        struct hns3_adapter *hns = eth_dev->data->dev_private; >>>> (gdb) bt >>>> #0  hns3_dev_uninit (eth_dev=0x21cdb80 ) at >>>> ../drivers/net/hns3/hns3_ethdev.c:7806 >>>> #1  0x0000000000bb8668 in rte_eth_dev_pci_generic_remove >>>> (pci_dev=0x247a600, >>>>      dev_uninit=0xbca7c4 ) at >>>> ../lib/ethdev/ethdev_pci.h:155 >>>> #2  0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600) >>>>      at ../drivers/net/hns3/hns3_ethdev.c:7833 >>>> #3  0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at >>>> ../drivers/bus/pci/pci_common.c:287 >>>> #4  0x00000000007e8f14 in pci_unplug (dev=0x247a610) at >>>> ../drivers/bus/pci/pci_common.c:570 >>>> #5  0x0000000000775678 in local_dev_remove (dev=0x247a610) at >>>> ../lib/eal/common/eal_common_dev.c:319 >>>> #6  0x000000000078f114 in __handle_primary_request >>>> (param=0xfffff00008c0) >>>>      at ../lib/eal/common/hotplug_mp.c:284 >>>> #7  0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at >>>> ../lib/eal/linux/eal_alarm.c:92 >>>> #8  0x00000000007a3f30 in eal_intr_process_interrupts >>>> (events=0xfffff7a0a3e0, >>>> nfds=1) >>>>      at ../lib/eal/linux/eal_interrupts.c:998 >>>> #9  0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10, >>>> totalfds=2) >>>>      at ../lib/eal/linux/eal_interrupts.c:1071 >>>> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at >>>> ../lib/eal/linux/eal_interrupts.c:1142 >>>> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40) >>>>      at ../lib/eal/common/eal_common_thread.c:202 >>>> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0 >>>> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6 >>>> >>>> ************************************************************************************************************************************************** >>>> >>>> >>>> >>>> Note: above log is from the secondary process. >>>> >>>>>>> And again, please re-consider calling 'rte_eth_dev_close()' or >>>>>>> 'rte_dev_remove()' from the secondary process. >>>>>>> >>>>>>>>>>> +     * Namely, whether "eth_dev" is NULL cannot be used to >>>>>>>>>>> determine >>>>>>>>>>> whether >>>>>>>>>>> +     * an ethdev port has been released. >>>>>>>>>>> +     * For both primary process and secondary process, >>>>>>>>>>> eth_dev->state is >>>>>>>>>>> +     * RTE_ETH_DEV_UNUSED, which means the ethdev port has >>>>>>>>>>> been released. >>>>>>>>>>> +     */ >>>>>>>>>>> +    if (eth_dev->state == RTE_ETH_DEV_UNUSED) { >>>>>>>>>>> +        RTE_ETHDEV_LOG(INFO, "The ethdev port has been >>>>>>>>>>> released."); >>>>>>>>>>> +        return 0; >>>>>>>>>>> +    } >>>>>>>>> . >>>>>>> . >>>>> . >>> . > .