From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0087041DEA; Sun, 5 Mar 2023 16:23:50 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D9C8C40EF0; Sun, 5 Mar 2023 16:23:50 +0100 (CET) Received: from forward502b.mail.yandex.net (forward502b.mail.yandex.net [178.154.239.146]) by mails.dpdk.org (Postfix) with ESMTP id 8039D40EE5 for ; Sun, 5 Mar 2023 16:23:49 +0100 (CET) Received: from sas2-8b071049b117.qloud-c.yandex.net (sas2-8b071049b117.qloud-c.yandex.net [IPv6:2a02:6b8:c08:f480:0:640:8b07:1049]) by forward502b.mail.yandex.net (Yandex) with ESMTP id F2F665E9FA; Sun, 5 Mar 2023 18:23:48 +0300 (MSK) Received: by sas2-8b071049b117.qloud-c.yandex.net (smtp/Yandex) with ESMTPSA id jNbIvgSbo0U1-lIhYJHrF; Sun, 05 Mar 2023 18:23:48 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1678029828; bh=mIujp/cUfSEiyxaR/+uM229KXzsEqUgdDxMTONe7wRo=; h=From:In-Reply-To:Cc:Date:References:To:Subject:Message-ID; b=aoyE+pqE0SZYcMs/yPo1FKZeOHuF9DU71yW3V8plVNiUPtiCjCqymy5G5kAoXrGYi 7IeqD+6MCfqn64qQzMLvKNrbalD3P4tmXku+7b9eMyDmU63UelKsPH9m8j2AxQjWDH qSUGMDjh7nFryI7h7iCuGGFWHYNbei4G5h5lWV+U= Authentication-Results: sas2-8b071049b117.qloud-c.yandex.net; dkim=pass header.i=@yandex.ru Message-ID: Date: Sun, 5 Mar 2023 15:23:44 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1 Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode Content-Language: en-US To: Honnappa Nagarahalli , "dev@dpdk.org" , Chengwen Feng , "thomas@monjalon.net" , Ferruh Yigit , Andrew Rybchenko , Kalesh AP , "Ajit Khaparde (ajit.khaparde@broadcom.com)" Cc: nd References: <20230301030610.49468-1-fengchengwen@huawei.com> <20230301030610.49468-2-fengchengwen@huawei.com> From: Konstantin Ananyev In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org >>>> >>>> In the proactive error handling mode, the PMD will set the data path >>>> pointers to dummy functions and then try recovery, in this period the >>>> application may still invoking data path API. This will introduce a >>>> race-condition with data path which may lead to crash [1]. >>>> >>>> Although the PMD added delay after setting data path pointers to >>>> cover the above race-condition, it reduces the probability, but it >>>> doesn't solve the problem. >>>> >>>> To solve the race-condition problem fundamentally, the following >>>> requirements are added: >>>> 1. The PMD should set the data path pointers to dummy functions after >>>> report RTE_ETH_EVENT_ERR_RECOVERING event. >>> Do you mean to say, PMD should set the data path pointers after calling the >> call back function? >>> The PMD is running in the context of multiple EAL threads. How do these >> threads synchronize such that only one thread sets these data pointers? >> >> As I understand this event callback supposed to be called in the context of EAL >> interrupt thread (whoever is more familiar with original idea, feel free to correct >> me if I missed something). > I could not figure this out. It looks to be called from the data plane thread context. > I also have a thought on alternate design at the end, appreciate if you can take a look. > >> How it is going to signal data-path threads that they need to stop/suspend >> calling data-path API - that's I suppose is left to application to decide... >> Same as right now it is application responsibility to stop data-path threads >> before doing dev_stop()/dev/_config()/etc. > Ok, good, this expectation is not new. The application must have a mechanism already. > >> >> >>> >>>> 2. The application should stop data path API invocation when process >>>> the RTE_ETH_EVENT_ERR_RECOVERING event. >>> Any thoughts on how an application can do this? > We can ignore this question as there is already similar expectation set for earlier functionalities. > >>> >>>> 3. The PMD should set the data path pointers to valid functions before >>>> report RTE_ETH_EVENT_RECOVERY_SUCCESS event. >>>> 4. The application should enable data path API invocation when process >>>> the RTE_ETH_EVENT_RECOVERY_SUCCESS event. >>> Do you mean to say that the application should not call the datapath APIs >> while the PMD is running the recovery process? >> >> Yes, I believe that's the intention. > Ok, this is good and makes sense. > >> >>>> >>>> Also, this patch introduce a driver internal function >>>> rte_eth_fp_ops_setup which used as an help function for PMD. >>>> >>>> [1] >>>> >> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2 >>>> - >>>> ashok.k.kaladi@intel.com/ >>>> >>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode") >>>> Cc: stable@dpdk.org >>>> >>>> Signed-off-by: Chengwen Feng >>>> --- >>>> doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++--------- >>>> lib/ethdev/ethdev_driver.c | 8 +++++++ >>>> lib/ethdev/ethdev_driver.h | 10 ++++++++ >>>> lib/ethdev/rte_ethdev.h | 32 +++++++++++++++---------- >>>> lib/ethdev/version.map | 1 + >>>> 5 files changed, 46 insertions(+), 25 deletions(-) >>>> >>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst >>>> b/doc/guides/prog_guide/poll_mode_drv.rst >>>> index c145a9066c..e380ff135a 100644 >>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst >>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst >>>> @@ -638,14 +638,9 @@ different from the application invokes recovery >>>> in PASSIVE mode, the PMD automatically recovers from error in >>>> PROACTIVE mode, and only a small amount of work is required for the >> application. >>>> >>>> -During error detection and automatic recovery, -the PMD sets the >>>> data path pointers to dummy functions -(which will prevent the >>>> crash), -and also make sure the control path operations fail with a return >> code ``-EBUSY``. >>>> - >>>> -Because the PMD recovers automatically, -the application can only >>>> sense that the data flow is disconnected for a while -and the control >>>> API returns an error in this period. >>>> +During error detection and automatic recovery, the PMD sets the data >>>> +path pointers to dummy functions and also make sure the control path >>>> +operations failed with a return code ``-EBUSY``. >>>> >>>> In order to sense the error happening/recovering, as well as to >>>> restore some additional configuration, @@ -653,9 +648,9 @@ three events >> are available: >>>> >>>> ``RTE_ETH_EVENT_ERR_RECOVERING`` >>>> Notify the application that an error is detected >>>> - and the recovery is being started. >>>> + and the recovery is about to start. >>>> Upon receiving the event, the application should not invoke >>>> - any control path function until receiving >>>> + any control and data path API until receiving >>>> ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or >>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event. >>>> >>>> .. note:: >>>> @@ -666,8 +661,9 @@ three events are available: >>>> >>>> ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` >>>> Notify the application that the recovery from error is successful, >>>> - the PMD already re-configures the port, >>>> - and the effect is the same as a restart operation. >>>> + the PMD already re-configures the port. >>>> + The application should restore some additional configuration, and >>>> + then >>> What is the additional configuration? Is this specific to each NIC/PMD? >>> I thought, this is an auto recovery process and the application does not require >> to reconfigure anything. If the application has to restore the configuration, how >> does auto recovery differ from typical recovery process? >>> >>>> + enable data path API invocation. >>>> >>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` >>>> Notify the application that the recovery from error failed, diff >>>> --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c index >>>> 0be1e8ca04..f994653fe9 100644 >>>> --- a/lib/ethdev/ethdev_driver.c >>>> +++ b/lib/ethdev/ethdev_driver.c >>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev >>>> *dev, const char *ring_name, >>>> return rc; >>>> } >>>> >>>> +void >>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev) { >>>> + if (dev == NULL) >>>> + return; >>>> + eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev); } >>>> + >>>> const struct rte_memzone * >>>> rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char >>>> *ring_name, >>>> uint16_t queue_id, size_t size, unsigned int align, diff - >> -git >>>> a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index >>>> 2c9d615fb5..0d964d1f67 100644 >>>> --- a/lib/ethdev/ethdev_driver.h >>>> +++ b/lib/ethdev/ethdev_driver.h >>>> @@ -1621,6 +1621,16 @@ int >>>> rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char >>>> *name, >>>> uint16_t queue_id); >>>> >>>> +/** >>>> + * @internal >>>> + * Setup eth fast-path API to ethdev values. >>>> + * >>>> + * @param dev >>>> + * Pointer to struct rte_eth_dev. >>>> + */ >>>> +__rte_internal >>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev); >>>> + >>>> /** >>>> * @internal >>>> * Atomically set the link status for the specific device. >>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index >>>> 049641d57c..44ee7229c1 100644 >>>> --- a/lib/ethdev/rte_ethdev.h >>>> +++ b/lib/ethdev/rte_ethdev.h >>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type { >>>> */ >>>> RTE_ETH_EVENT_RX_AVAIL_THRESH, >>>> /** Port recovering from a hardware or firmware error. >>>> - * If PMD supports proactive error recovery, >>>> - * it should trigger this event to notify application >>>> - * that it detected an error and the recovery is being started. >>>> - * Upon receiving the event, the application should not invoke any >>>> control path API >>>> - * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving >>>> - * RTE_ETH_EVENT_RECOVERY_SUCCESS or >>>> RTE_ETH_EVENT_RECOVERY_FAILED event. >>>> - * The PMD will set the data path pointers to dummy functions, >>>> - * and re-set the data path pointers to non-dummy functions >>>> - * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event. >>>> - * It means that the application cannot send or receive any packets >>>> - * during this period. >>>> + * >>>> + * If PMD supports proactive error recovery, it should trigger this >>>> + * event to notify application that it detected an error and the >>>> + * recovery is about to start. >>>> + * >>>> + * Upon receiving the event, the application should not invoke any >>>> + * control and data path API until receiving >>>> + * RTE_ETH_EVENT_RECOVERY_SUCCESS or >>>> RTE_ETH_EVENT_RECOVERY_FAILED >>>> + * event. >>>> + * >>>> + * Once this event is reported, the PMD will set the data path pointers >>>> + * to dummy functions, and re-set the data path pointers to valid >>>> + * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS >>>> event. >>> Why do we need to set the data path pointers to dummy functions if the >> application is restricted from invoking any control and data path APIs till the >> recovery process is completed? >> >> You are right, in theory it is not mandatory. >> Though it helps to flag a problem if user will still try to call them while recovery is >> in progress. > Ok, may be in debug mode. > I mean, we have already set an expectation to the application that it should not call and the application has implemented a method to do the same. Why do we need to complicate this? > If the application calls the APIs, it is a programming error. My preference would be to keep it this way for both debug and non-debug mode. It doesn't cost anything to us in terms of perfomance, but helps to catch problems with wrong behaving app. > >> Again, same as we doing in dev_stop(). > >> >>> >>>> + * >>>> * @note Before the PMD reports the recovery result, >>>> * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event >>>> again, >>>> * because a larger error may occur during the recovery. >>>> */ >>>> RTE_ETH_EVENT_ERR_RECOVERING, >>> I understand this is not a change in this patch. But, just wondering, what is the >> purpose of this? How is the application supposed to use this? >>> >>>> /** Port recovers successfully from the error. >>>> - * The PMD already re-configured the port, >>>> - * and the effect is the same as a restart operation. >>>> + * >>>> + * The PMD already re-configured the port: >>>> * a) The following operation will be retained: (alphabetically) >>>> * - DCB configuration >>>> * - FEC configuration >>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type { >>>> * (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP) >>>> * c) Any other configuration will not be stored >>>> * and will need to be re-configured. >>>> + * >>>> + * The application should restore some additional configuration >>>> + * (see above case b/c), and then enable data path API invocation. >>>> */ >>>> RTE_ETH_EVENT_RECOVERY_SUCCESS, >>>> /** Port recovery failed. >>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index >>>> 357d1a88c0..c273e0bdae 100644 >>>> --- a/lib/ethdev/version.map >>>> +++ b/lib/ethdev/version.map >>>> @@ -320,6 +320,7 @@ INTERNAL { >>>> rte_eth_devices; >>>> rte_eth_dma_zone_free; >>>> rte_eth_dma_zone_reserve; >>>> + rte_eth_fp_ops_setup; >>>> rte_eth_hairpin_queue_peer_bind; >>>> rte_eth_hairpin_queue_peer_unbind; >>>> rte_eth_hairpin_queue_peer_update; >>>> -- >>>> 2.17.1 >>> > > Is there any reason not to design this in the same way as 'rte_eth_dev_reset'? Why does the PMD have to recover by itself? I suppose it is a question for the authors of original patch... > We could have a similar API 'rte_eth_dev_recover' to do the recovery functionality. I suppose such approach is also possible. Personally I am fine with both ways: either existing one or what you propose, as long as we'll fix existing race-condition. What is good with what you suggest - that way we probably don't need to worry how to allow user to enable/disable auto-recovery inside PMD. Konstantin