From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2E8C641DFD; Tue, 7 Mar 2023 09:25:24 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 09E3D40E03; Tue, 7 Mar 2023 09:25:24 +0100 (CET) Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by mails.dpdk.org (Postfix) with ESMTP id DE5564067E for ; Tue, 7 Mar 2023 09:25:21 +0100 (CET) Received: from dggpeml500024.china.huawei.com (unknown [172.30.72.54]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4PW7k91TmyzKpw6; Tue, 7 Mar 2023 16:23:13 +0800 (CST) Received: from [10.67.100.224] (10.67.100.224) by dggpeml500024.china.huawei.com (7.185.36.10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.21; Tue, 7 Mar 2023 16:25:17 +0800 Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode To: Konstantin Ananyev , Ajit Khaparde , Ferruh Yigit CC: Konstantin Ananyev , Thomas Monjalon , Andrew Rybchenko , "dev@dpdk.org" References: <20230301030610.49468-1-fengchengwen@huawei.com> <20230301030610.49468-2-fengchengwen@huawei.com> <0f387ca1eee34a7f92745de7b59a71a1@huawei.com> <18c5b676-ae72-e646-89af-d6cd636d923f@yandex.ru> <5b40049f-6f22-fc3f-b13e-da1793c2fd1a@amd.com> <9092fdae9c1d4c53a00a8f23eb1129ec@huawei.com> <0effdaa9-5045-0635-775c-6e1eda0d5dba@amd.com> From: fengchengwen Message-ID: <9dc5d714-07a4-c32e-e557-efe8e7fe2d16@huawei.com> Date: Tue, 7 Mar 2023 16:25:17 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.67.100.224] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To dggpeml500024.china.huawei.com (7.185.36.10) X-CFilter-Loop: Reflected X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 2023/3/6 19:13, Konstantin Ananyev wrote: > > >>>>>>>>> In the proactive error handling mode, the PMD will set the data path >>>>>>>>> pointers to dummy functions and then try recovery, in this period the >>>>>>>>> application may still invoking data path API. This will introduce a >>>>>>>>> race-condition with data path which may lead to crash [1]. >>>>>>>>> >>>>>>>>> Although the PMD added delay after setting data path pointers to cover >>>>>>>>> the above race-condition, it reduces the probability, but it doesn't >>>>>>>>> solve the problem. >>>>>>>>> >>>>>>>>> To solve the race-condition problem fundamentally, the following >>>>>>>>> requirements are added: >>>>>>>>> 1. The PMD should set the data path pointers to dummy functions after >>>>>>>>> report RTE_ETH_EVENT_ERR_RECOVERING event. >>>>>>>>> 2. The application should stop data path API invocation when process >>>>>>>>> the RTE_ETH_EVENT_ERR_RECOVERING event. >>>>>>>>> 3. The PMD should set the data path pointers to valid functions before >>>>>>>>> report RTE_ETH_EVENT_RECOVERY_SUCCESS event. >>>>>>>>> 4. The application should enable data path API invocation when process >>>>>>>>> the RTE_ETH_EVENT_RECOVERY_SUCCESS event. >>>>>>>>> >>>>>>> >>>>>>> How this is solving the race-condition, by pushing responsibility to >>>>>>> stop data path to application? >>>>>> >>>>>> Exactly, it becomes application responsibility to make sure data-path is >>>>>> stopped/suspended before recovery will continue. >>>>>> >>>>> >>>>> From documentation of the feature: >>>>> >>>>> `` >>>>> Because the PMD recovers automatically, >>>>> the application can only sense that the data flow is disconnected for a >>>>> while and the control API returns an error in this period. >>>>> >>>>> In order to sense the error happening/recovering, as well as to restore >>>>> some additional configuration, three events are available: >>>>> `` >>>>> >>>>> It looks like initial design is to use events mainly inform application >>>>> about what happened and mainly for re-configuration. >>>>> >>>>> Although I am don't disagree to involve the application, I am not sure >>>>> that is part of current design. >>>> >>>> I thought we all agreed that initial design contain some fallacies that >>>> need to fixed, no? >>>> Statement that with current rte_ethdev design error recovery can be done >>>> without interaction with the app (to stop/suspend data/control path) >>>> is the main one I think. >>>> It needs some interaction with app layer, one way or another. >>>> >>>>>>> >>>>>>> What if application is not interested in recovery modes at all and not >>>>>>> registered any callback for the recovery? >>>>>> >>>>>> >>>>>> Are you saying there is no way for application to disable >>>>>> automatic recovery in PMD if it is not interested >>>>>> (or can't full-fill per-requesties for it)? >>>>>> If so, then yes it is a problem and we need to fix it. >>>>>> I assumed that such mechanism to disable unwanted events already exists, >>>>>> but I can't find anything. >>>>>> Wonder what would be the easiest way here - can PMD make a decision >>>>>> based on callback return value, or do we need a new API to >>>>>> enable/disable callbacks, or ...? >>>>>> >>>>>> >>>>> >>>>> As far as I can see automatic recovery is not configurable by app. >>>>> >>>>> But that is not all, PMD sends events to application but PMD can't know >>>>> if application is handling them or not, so with current design PMD can't >>>>> rely on to app. >>>> >>>> Well, PMD invokes user provided callback. >>>> One way to fix that problem - if there is no callback provided, >>>> or callback returns an error code - PMD can assume that recovery >>>> should not be done. >>>> That is probably not the best design choice, but at least it will allow >>>> to fix the problem without too many changes and introducing new API. >>>> That could be sort of a 'quick fix'. >>>> In a meanwhile we can think about new/better approach for that. >>>> >>> >>> -rc2 for 23.03 is a few days away. >>> >>> What do you think to have 'quick fix' as modifying how driver updates >>> burst ops to prevent the race condition, for this release? The 'quick fix', do you mean only update function pointer (without rxq setting) ? Currently the PMDs which announced support "proactive error handling mode" already do this. >>> >>> And plan a design update for the next release? >> +1 on the overall approach. > > Yep, agree. Hope for better solution. And also, I notice only the openvswitch (from all open-source software which based-on DPDK) registers RTE_ETH_EVENT_INTR_RESET callback . Therefore, hope we build a recovery framework at the DPDK SDK level and be compatible with RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanism. > >> >>> >>> >>>>> >>>>>>> I think driver should not rely on application for this, unless >>>>>>> application explicitly says (to driver) that it is handling recovery, >>>>>>> right now there is no way for driver to know this. >>>>>> >>>>>> I think it is visa-versa: >>>>>> application should not enable auto-recovery if it can't meet >>>>>> per-requeststies for it (provide appropriate callback). >>>>>> >>>>> >>>>> I agree on above, we are saying similar thing in different perspective. >>>> >>>> Ok, that's good we are on the same page. >>>> >>>> >>>>> >>>>>> >>>>>>> >>>>>>>>> Also, this patch introduce a driver internal function >>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD. >>>>>>>>> >>>>>>>>> [1] >>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/ >>>>>>>>> >>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode") >>>>>>>>> Cc: stable@dpdk.org >>>>>>>>> >>>>>>>>> Signed-off-by: Chengwen Feng >>>>>>>>> --- >>>>>>>>> doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++--------- >>>>>>>>> lib/ethdev/ethdev_driver.c | 8 +++++++ >>>>>>>>> lib/ethdev/ethdev_driver.h | 10 ++++++++ >>>>>>>>> lib/ethdev/rte_ethdev.h | 32 >>>>>>>>> +++++++++++++++---------- >>>>>>>>> lib/ethdev/version.map | 1 + >>>>>>>>> 5 files changed, 46 insertions(+), 25 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst >>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst >>>>>>>>> index c145a9066c..e380ff135a 100644 >>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst >>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst >>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery >>>>>>>>> in PASSIVE mode, >>>>>>>>> the PMD automatically recovers from error in PROACTIVE mode, >>>>>>>>> and only a small amount of work is required for the application. >>>>>>>>> >>>>>>>>> -During error detection and automatic recovery, >>>>>>>>> -the PMD sets the data path pointers to dummy functions >>>>>>>>> -(which will prevent the crash), >>>>>>>>> -and also make sure the control path operations fail with a return >>>>>>>>> code ``-EBUSY``. >>>>>>>>> - >>>>>>>>> -Because the PMD recovers automatically, >>>>>>>>> -the application can only sense that the data flow is disconnected >>>>>>>>> for a while >>>>>>>>> -and the control API returns an error in this period. >>>>>>>>> +During error detection and automatic recovery, the PMD sets the >>>>>>>>> data path >>>>>>>>> +pointers to dummy functions and also make sure the control path >>>>>>>>> operations >>>>>>>>> +failed with a return code ``-EBUSY``. >>>>>>>>> >>>>>>>>> In order to sense the error happening/recovering, >>>>>>>>> as well as to restore some additional configuration, >>>>>>>>> @@ -653,9 +648,9 @@ three events are available: >>>>>>>>> >>>>>>>>> ``RTE_ETH_EVENT_ERR_RECOVERING`` >>>>>>>>> Notify the application that an error is detected >>>>>>>>> - and the recovery is being started. >>>>>>>>> + and the recovery is about to start. >>>>>>>>> Upon receiving the event, the application should not invoke >>>>>>>>> - any control path function until receiving >>>>>>>>> + any control and data path API until receiving >>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or >>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event. >>>>>>>>> >>>>>>>>> .. note:: >>>>>>>>> @@ -666,8 +661,9 @@ three events are available: >>>>>>>>> >>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` >>>>>>>>> Notify the application that the recovery from error is successful, >>>>>>>>> - the PMD already re-configures the port, >>>>>>>>> - and the effect is the same as a restart operation. >>>>>>>>> + the PMD already re-configures the port. >>>>>>>>> + The application should restore some additional configuration, >>>>>>>>> and then >>>>>>>>> + enable data path API invocation. >>>>>>>>> >>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` >>>>>>>>> Notify the application that the recovery from error failed, >>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c >>>>>>>>> index 0be1e8ca04..f994653fe9 100644 >>>>>>>>> --- a/lib/ethdev/ethdev_driver.c >>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c >>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev >>>>>>>>> *dev, const char *ring_name, >>>>>>>>> return rc; >>>>>>>>> } >>>>>>>>> >>>>>>>>> +void >>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev) >>>>>>>>> +{ >>>>>>>>> + if (dev == NULL) >>>>>>>>> + return; >>>>>>>>> + eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev); >>>>>>>>> +} >>>>>>>>> + >>>>>>>>> const struct rte_memzone * >>>>>>>>> rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char >>>>>>>>> *ring_name, >>>>>>>>> uint16_t queue_id, size_t size, unsigned int align, >>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h >>>>>>>>> index 2c9d615fb5..0d964d1f67 100644 >>>>>>>>> --- a/lib/ethdev/ethdev_driver.h >>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h >>>>>>>>> @@ -1621,6 +1621,16 @@ int >>>>>>>>> rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const >>>>>>>>> char *name, >>>>>>>>> uint16_t queue_id); >>>>>>>>> >>>>>>>>> +/** >>>>>>>>> + * @internal >>>>>>>>> + * Setup eth fast-path API to ethdev values. >>>>>>>>> + * >>>>>>>>> + * @param dev >>>>>>>>> + * Pointer to struct rte_eth_dev. >>>>>>>>> + */ >>>>>>>>> +__rte_internal >>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev); >>>>>>>>> + >>>>>>>>> /** >>>>>>>>> * @internal >>>>>>>>> * Atomically set the link status for the specific device. >>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h >>>>>>>>> index 049641d57c..44ee7229c1 100644 >>>>>>>>> --- a/lib/ethdev/rte_ethdev.h >>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h >>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type { >>>>>>>>> */ >>>>>>>>> RTE_ETH_EVENT_RX_AVAIL_THRESH, >>>>>>>>> /** Port recovering from a hardware or firmware error. >>>>>>>>> - * If PMD supports proactive error recovery, >>>>>>>>> - * it should trigger this event to notify application >>>>>>>>> - * that it detected an error and the recovery is being started. >>>>>>>>> - * Upon receiving the event, the application should not invoke >>>>>>>>> any control path API >>>>>>>>> - * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until >>>>>>>>> receiving >>>>>>>>> - * RTE_ETH_EVENT_RECOVERY_SUCCESS or >>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event. >>>>>>>>> - * The PMD will set the data path pointers to dummy functions, >>>>>>>>> - * and re-set the data path pointers to non-dummy functions >>>>>>>>> - * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event. >>>>>>>>> - * It means that the application cannot send or receive any >>>>>>>>> packets >>>>>>>>> - * during this period. >>>>>>>>> + * >>>>>>>>> + * If PMD supports proactive error recovery, it should trigger >>>>>>>>> this >>>>>>>>> + * event to notify application that it detected an error and the >>>>>>>>> + * recovery is about to start. >>>>>>>>> + * >>>>>>>>> + * Upon receiving the event, the application should not invoke any >>>>>>>>> + * control and data path API until receiving >>>>>>>>> + * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED >>>>>>>>> + * event. >>>>>>>>> + * >>>>>>>>> + * Once this event is reported, the PMD will set the data path >>>>>>>>> pointers >>>>>>>>> + * to dummy functions, and re-set the data path pointers to valid >>>>>>>>> + * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS >>>>>>>>> event. >>>>>>>>> + * >>>>>>>>> * @note Before the PMD reports the recovery result, >>>>>>>>> * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event >>>>>>>>> again, >>>>>>>>> * because a larger error may occur during the recovery. >>>>>>>>> */ >>>>>>>>> RTE_ETH_EVENT_ERR_RECOVERING, >>>>>>>>> /** Port recovers successfully from the error. >>>>>>>>> - * The PMD already re-configured the port, >>>>>>>>> - * and the effect is the same as a restart operation. >>>>>>>>> + * >>>>>>>>> + * The PMD already re-configured the port: >>>>>>>>> * a) The following operation will be retained: (alphabetically) >>>>>>>>> * - DCB configuration >>>>>>>>> * - FEC configuration >>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type { >>>>>>>>> * (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP) >>>>>>>>> * c) Any other configuration will not be stored >>>>>>>>> * and will need to be re-configured. >>>>>>>>> + * >>>>>>>>> + * The application should restore some additional configuration >>>>>>>>> + * (see above case b/c), and then enable data path API invocation. >>>>>>>>> */ >>>>>>>>> RTE_ETH_EVENT_RECOVERY_SUCCESS, >>>>>>>>> /** Port recovery failed. >>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map >>>>>>>>> index 357d1a88c0..c273e0bdae 100644 >>>>>>>>> --- a/lib/ethdev/version.map >>>>>>>>> +++ b/lib/ethdev/version.map >>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL { >>>>>>>>> rte_eth_devices; >>>>>>>>> rte_eth_dma_zone_free; >>>>>>>>> rte_eth_dma_zone_reserve; >>>>>>>>> + rte_eth_fp_ops_setup; >>>>>>>>> rte_eth_hairpin_queue_peer_bind; >>>>>>>>> rte_eth_hairpin_queue_peer_unbind; >>>>>>>>> rte_eth_hairpin_queue_peer_update; >>>>>>>>> -- >>>>>>>> Acked-by: Konstantin Ananyev >>>>>>>> >>>>>>>>> 2.17.1 >>>>>>>> >>>>>>> >>>>>> >>>> >>>