From: Huisong Li
To: Ferruh Yigit, Thomas Monjalon
CC: Andrew Rybchenko, "dev@dpdk.org", Anatoly Burakov, David Marchand
Date: Thu, 19 Aug 2021 11:45:52 +0800
Subject: Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice

On 2021/8/18 19:24, Ferruh Yigit wrote:
> On 8/13/2021 9:16 AM, Huisong Li wrote:
>> On 2021/8/13 14:12, Thomas Monjalon wrote:
>>> 13/08/2021 04:11, Huisong Li:
>>>> Hi, all
>>>>
>>>> This patch can enhance the security of device uninstallation to
>>>> eliminate dependency on user usage methods.
>>>>
>>>> Can you check this patch?
>>>>
>>>>
>>>> On 2021/8/3 10:30, Huisong Li wrote:
>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>> It is not a bad scenario.
>>> If there is no more port for the device after calling close,
>>> the device should be removed automatically.
>>> Keep in mind "close" is for one port, "remove" is for the entire device
>>> which can have more than one port.
>> I know.
>>
>> dev_close() is for removing an eth device. And rte_dev_remove() can be used
>> for removing the rte device and all its eth devices belonging to the rte device.
>>
>> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>> all eth devices having same pci address will be closed and removed.
>>
>>>>> the primary
>>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>>> layer will be called twice in the secondary process. So this patch
>>>>> fixes it.
>>> If a port is closed in primary, it should be the same in secondary.
>>>
>>>
>>>>> +    /*
>>>>> +     * The eth_dev->data->name doesn't be cleared by the secondary process,
>>>>> +     * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>>> This assumption is not clear. All should be closed together.
>> However, dev_close() does not have the feature similar to rte_dev_remove().
>>
>> Namely, it is not guaranteed that all eth devices are closed together in ethdev
>> layer. It depends on app or user.
>>
>> If the app does not close together, the operation of repeatedly uninstalling an
>> eth device in the secondary process
>> will be triggered when dev_close() is first called by one secondary process, and
>> then rte_dev_remove() is called.
>>
>> So I think it should be avoided.
> First of all, I am not sure about calling 'rte_eth_dev_close()' or
> 'rte_dev_remove()' from the secondary process.
> There are explicit checks in various locations to prevent clearing resources
> completely from secondary process.

There's no denying that. Generally, the hardware resources of an eth device
and the data shared between the primary and secondary processes are cleared
by the primary, under the control of the ethdev layer or the PMD layer.
But each process (primary or secondary) may also hold private data or
resources of its own, such as an MP action registered with
rte_mp_action_register(). The secondary process still needs to clear those
itself. Namely, both primary and secondary processes need to guard against
releasing the same resources twice.

>
> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is technically
> can be done but application needs to be extra cautious and should take extra
> measures and synchronization to make it work.
> Regular use-case is secondary processes do the packet processing and all control
> commands run by primary.

You are right. We agree that 'rte_eth_dev_close()' or 'rte_dev_remove()'
can be called by both primary and secondary processes.
But the DPDK framework cannot assume user behavior. 😁 We need to make it
safer and more reliable for both primary and secondary processes.

>
> In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev resources
> and further 'rte_dev_remove()' call will detect missing ethdev resources and
> won't try to clear them again.
>
> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all resources
> and further 'rte_dev_remove()' call (either from primary or secondary) will try
> to clean ethdev resources again. You are trying to prevent this retry in remove
> happening for secondary process.

Right. However, if the secondary process has its own private resources in
the PMD layer, it still needs to clear them by calling
'rte_eth_dev_close()' or 'rte_dev_remove()'.

> In secondary it won't free ethdev resources anyway if you let it continue, but I
> guess here you are trying to prevent the PMD dev_close() called again. Why? Is
> it just for optimization or does it cause unexpected behavior in the PMD?
>
>
> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
> 'rte_dev_remove()' in the primary anyway. So instead of this workaround, I would
> suggest making PMD dev_close() safe to be called multiple times (if this is the
> problem.)

In conclusion, the primary and secondary processes may each have their own
private data and resources in the PMD layer, which need to be processed and
released. Currently, PMDs handle and clean these up in either dev_close()
or remove(). However, the code path in rte_dev_remove() cannot ensure that
the uninstallation in the secondary process is not repeated if
rte_eth_dev_close() is called first by the secondary (the primary is fine,
please review this patch). I think this is the same for every PMD, so it is
better handled in the ethdev layer.

>
> And again, please re-consider calling 'rte_eth_dev_close()' or
> 'rte_dev_remove()' from the secondary process.
>
>>>>> +     * Namely, whether "eth_dev" is NULL cannot be used to determine whether
>>>>> +     * an ethdev port has been released.
>>>>> +     * For both primary process and secondary process, eth_dev->state is
>>>>> +     * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>>>>> +     */
>>>>> +    if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>> +        RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>>> +        return 0;
>>>>> +    }
>>>
>>> .
> .
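
To make the close-then-remove sequence discussed in this thread concrete,
here is a minimal, self-contained C sketch. It is only an illustration and
does not use the real DPDK structures or APIs: mock_eth_dev,
mock_pmd_dev_close(), mock_eth_dev_close(), mock_dev_remove() and
close_then_remove() are all hypothetical stand-ins. The model assumes, as
described above, that in a secondary process the port lookup still succeeds
after close while the port state is already unused.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; NOT the DPDK definitions. */
enum mock_dev_state { MOCK_DEV_UNUSED = 0, MOCK_DEV_ATTACHED };

struct mock_eth_dev {
    enum mock_dev_state state; /* per-process port state */
    int pmd_close_calls;       /* how many times the PMD close hook ran */
};

/* Stand-in for a PMD's xxx_dev_close(): releases per-process resources. */
static int mock_pmd_dev_close(struct mock_eth_dev *dev)
{
    dev->pmd_close_calls++;
    return 0;
}

/* Stand-in for the close path: run the PMD hook, then mark the port unused.
 * In this model nothing else is cleared, mirroring the secondary process. */
static int mock_eth_dev_close(struct mock_eth_dev *dev)
{
    int ret = mock_pmd_dev_close(dev);

    dev->state = MOCK_DEV_UNUSED;
    return ret;
}

/* Stand-in for the remove path in a secondary process. The lookup is assumed
 * to have succeeded, so dev is non-NULL even after a previous close. */
static int mock_dev_remove(struct mock_eth_dev *dev, int use_state_guard)
{
    if (use_state_guard && dev->state == MOCK_DEV_UNUSED) {
        printf("The ethdev port has been released.\n");
        return 0; /* skip the second PMD close */
    }
    mock_pmd_dev_close(dev);
    dev->state = MOCK_DEV_UNUSED;
    return 0;
}

/* Close first, then remove; returns how many times the PMD hook ran. */
int close_then_remove(int use_state_guard)
{
    struct mock_eth_dev dev = { MOCK_DEV_ATTACHED, 0 };

    mock_eth_dev_close(&dev);
    mock_dev_remove(&dev, use_state_guard);
    assert(dev.state == MOCK_DEV_UNUSED);
    return dev.pmd_close_calls;
}
```

In this model, close_then_remove(0) runs the PMD close hook twice, while
close_then_remove(1) runs it once because the state check returns early.
Making the PMD hook itself safe to call repeatedly, as suggested in the
quoted review, would be the alternative place to put the same kind of guard.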