References: <20201009034832.10302-1-kalesh-anakkur.purayil@broadcom.com>
 <20220128124830.427-1-kalesh-anakkur.purayil@broadcom.com>
 <20220128124830.427-2-kalesh-anakkur.purayil@broadcom.com>
From: Ray Kinsella
To: Ferruh Yigit
Cc:
 Kalesh A P, dev@dpdk.org, ajit.khaparde@broadcom.com, asafp@nvidia.com,
 David Marchand, Thomas Monjalon, Andrew Rybchenko
Subject: Re: [dpdk-dev] [PATCH v7 1/4] ethdev: support device reset and recovery events
Date: Wed, 02 Feb 2022 06:44:28 -0500
Message-ID: <87iltx1oir.fsf@mdr78.vserver.site>

Ferruh Yigit writes:

> On 1/28/2022 12:48 PM, Kalesh A P wrote:
>> From: Kalesh AP
>>
>> Adding support for the device reset and recovery events in the
>> rte_eth_event framework. FW error and FW reset conditions would be
>> managed internally by the PMD without needing application intervention.
>> In such cases, PMD would need reset/recovery events to notify application
>> that PMD is undergoing a reset.
>>
>> While most of the recovery process is transparent to the application since
>> most of the driver ensures recovery from FW reset or FW error conditions,
>> the application will have to reprogram any flows which were offloaded to
>> the underlying hardware.
>>
>> Signed-off-by: Kalesh AP
>> Signed-off-by: Somnath Kotur
>> Reviewed-by: Ajit Khaparde
>> ---
>>  doc/guides/prog_guide/poll_mode_drv.rst | 24 ++++++++++++++++++++++++
>>  lib/ethdev/rte_ethdev.h                 | 18 ++++++++++++++++++
>>  2 files changed, 42 insertions(+)
>>
>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>> b/doc/guides/prog_guide/poll_mode_drv.rst
>> index 6831289..9ecc0e4 100644
>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>> @@ -623,3 +623,27 @@ by application.
>>  The PMD itself should not call rte_eth_dev_reset(). The PMD can trigger
>>  the application to handle reset event. It is duty of application to
>>  handle all synchronization before it calls rte_eth_dev_reset().
>> +
>> +Error recovery support
>> +~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +When the PMD detects a FW reset or error condition, it may try to recover
>> +from the error without needing the application intervention. In such cases,
>> +PMD would need events to notify the application that it is undergoing
>> +an error recovery.
>> +
>> +The PMD should trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the
>> +application that PMD detected a FW reset or FW error condition. PMD may
>> +try to recover from the error by itself. Data path may be quiesced and
>> +control path operations may fail during the recovery period. The application
>> +should stop polling till it receives RTE_ETH_EVENT_RECOVERED event from the PMD.
>> +
>> +The PMD should trigger RTE_ETH_EVENT_RECOVERED event to notify the application
>> +that the it has recovered from the error condition. PMD re-configures the port
>> +to the state prior to the error condition. Control path and data path are up now.
>> +Since the device has undergone a reset, flow rules offloaded prior to reset
>> +may be lost and the application should recreate the rules again.
>> +
>> +The PMD should trigger RTE_ETH_EVENT_INTR_RMV event to notify the application
>> +that it has failed to recover from the error condition. The device may not be
>> +usable anymore.
>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>> index 147cc1c..a46819f 100644
>> --- a/lib/ethdev/rte_ethdev.h
>> +++ b/lib/ethdev/rte_ethdev.h
>> @@ -3818,6 +3818,24 @@ enum rte_eth_event_type {
>>      RTE_ETH_EVENT_DESTROY,  /**< port is released */
>>      RTE_ETH_EVENT_IPSEC,    /**< IPsec offload related event */
>>      RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */
>> +    RTE_ETH_EVENT_ERR_RECOVERING,
>> +        /**< port recovering from an error
>> +         *
>> +         * PMD detected a FW reset or error condition.
>> +         * PMD will try to recover from the error.
>> +         * Data path may be quiesced and Control path operations
>> +         * may fail at this time.
>> +         */
>> +    RTE_ETH_EVENT_RECOVERED,
>> +        /**< port recovered from an error
>> +         *
>> +         * PMD has recovered from the error condition.
>> +         * Control path and Data path are up now.
>> +         * PMD re-configures the port to the state prior to the error.
>> +         * Since the device has undergone a reset, flow rules
>> +         * offloaded prior to reset may be lost and
>> +         * the application should recreate the rules again.
>> +         */
>>      RTE_ETH_EVENT_MAX       /**< max value of this enum */
>
>
> Also ABI check complains about 'RTE_ETH_EVENT_MAX' value check, cc'ed more people
> to evaluate if it is a false positive:
>
>
> 1 function with some indirect sub-type change:
>   [C] 'function int rte_eth_dev_callback_register(uint16_t, rte_eth_event_type, rte_eth_dev_cb_fn, void*)' at rte_ethdev.c:4637:1 has some indirect sub-type changes:
>     parameter 3 of type 'typedef rte_eth_dev_cb_fn' has sub-type changes:
>       underlying type 'int (typedef uint16_t, enum rte_eth_event_type, void*, void*)*' changed:
>         in pointed to type 'function type int (typedef uint16_t, enum rte_eth_event_type, void*, void*)':
>           parameter 2 of type 'enum rte_eth_event_type' has sub-type changes:
>             type size hasn't changed
>             2 enumerator insertions:
>               'rte_eth_event_type::RTE_ETH_EVENT_ERR_RECOVERING' value '11'
>               'rte_eth_event_type::RTE_ETH_EVENT_RECOVERED' value '12'
>             1 enumerator change:
>               'rte_eth_event_type::RTE_ETH_EVENT_MAX' from value '11' to '13' at rte_ethdev.h:3807:1

I don't immediately see the problem that this would cause.
There are no array sizes etc dependent on the value of MAX for instance.

Looks safe?

-- 
Regards, Ray K
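
For anyone following the thread, here is a minimal sketch of the kind of
dependency on MAX that would make such a change unsafe, and that I don't see
in our case. It is purely illustrative: the constant names, the table and
count_event() are hypothetical stand-ins, not ethdev code. Only the numeric
values come from the abidiff report quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the abidiff report; the names are hypothetical
 * stand-ins and are not the declarations in rte_ethdev.h. */
enum {
    OLD_EVENT_MAX            = 11, /* RTE_ETH_EVENT_MAX before the patch */
    NEW_EVENT_ERR_RECOVERING = 11, /* enumerator inserted by the patch */
    NEW_EVENT_RECOVERED      = 12, /* enumerator inserted by the patch */
    NEW_EVENT_MAX            = 13  /* RTE_ETH_EVENT_MAX after the patch */
};

/* The new events take values at or above the old MAX. */
static_assert(NEW_EVENT_ERR_RECOVERING >= OLD_EVENT_MAX,
              "new enumerators start at the old MAX");

/* Hypothetical per-event counter table in a binary that was built
 * against the old header, so its size is frozen at the old MAX. */
static uint64_t event_count[OLD_EVENT_MAX];

/* Count one event. Returns false, rather than writing past the end of
 * the table, when a newer library reports an event this binary does
 * not know about. */
static bool count_event(int event)
{
    if (event < 0 || event >= OLD_EVENT_MAX)
        return false;
    event_count[event]++;
    return true;
}
```

Without the range check in count_event(), events 11 and 12 from an updated
library would index past the end of a table sized by the old MAX. That is the
pattern the ABI checker is guarding against.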