From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
From: fengchengwen
To: Honnappa Nagarahalli, Konstantin Ananyev, dev@dpdk.org, thomas@monjalon.net,
 Ferruh Yigit, Andrew Rybchenko, Kalesh AP,
 "Ajit Khaparde (ajit.khaparde@broadcom.com)"
Cc: nd
Date: Thu, 9 Mar 2023 08:59:52 +0800
Message-ID: <90919d02-08ec-dcd1-db56-7104e7aeb299@huawei.com>
References: <20230301030610.49468-1-fengchengwen@huawei.com>
 <20230301030610.49468-2-fengchengwen@huawei.com>
 <95edd6ca-fe1f-fd7c-719f-0a9e6d7c45b5@huawei.com>
List-Id: DPDK patches and discussions
On 2023/3/8 9:09, Honnappa Nagarahalli wrote:
>>>>> Is there any reason not to design this in the same way as
>>>>> 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
>>>>
>>>> I suppose it is a question for the authors of the original patch...
>>>
>>> I would appreciate it if the authors could comment on this.
>>
>> The main cause is a hardware implementation limit; I will try to
>> explain it from the hns3 PMD's point of view.
>> For a global reset, every function must respond within a certain
>> period of time, otherwise the reset fails. The reset also requires
>> several steps, which together may take a long time.
>>
>> With multiple functions in one DPDK process, when a global reset is
>> triggered, rte_eth_dev_reset does not cover this scenario:
>> 1. Each port reports RTE_ETH_EVENT_INTR_RESET in the interrupt thread.
>> 2. The application callbacks are then invoked, but because they all
>>    run in that same thread and each port's recovery takes a long time,
>>    the later ports' resets fail.
> If the design were to introduce RTE_ETH_EVENT_INTR_RECOVER and rte_eth_dev_recover, what problems do you see?

I don't see how an 'RTE_ETH_EVENT_INTR_RECOVER and rte_eth_dev_recover'
pair would differ from the RTE_ETH_EVENT_INTR_RESET mechanism. Could you
give more detail?

>
>>
>>>
>>>>
>>>>> We could have a similar API 'rte_eth_dev_recover' to do the
>>>>> recovery functionality.
>>>>
>>>> I suppose such an approach is also possible.
>>>> Personally, I am fine with either way: the existing one or what you
>>>> propose, as long as we fix the existing race condition.
>>>> What is good about your suggestion is that this way we probably
>>>> don't need to worry about how to let the user enable/disable
>>>> auto-recovery inside the PMD.
>>>>
>>>> Konstantin
>>>>
>>>