From: Ray Kinsella
To: fengchengwen
Cc: Thomas Monjalon, Ferruh Yigit, Kalesh A P, dev@dpdk.org, ajit.khaparde@broadcom.com, asafp@nvidia.com, David Marchand, Andrew Rybchenko, lizhenyi1@huawei.com, shuliubin 00419723
Subject: Re: [dpdk-dev] [PATCH v7 1/4] ethdev: support device reset and recovery events
Date: Tue, 24 May 2022 16:11:14 +0100
Message-ID: <87k0abynhp.fsf@mdr78.vserver.site>

fengchengwen writes:

> Hi all,
>
> This patch has been pending for a long time. Are we waiting for 22.11 to deal with it?

That was my read, as we can't reliably change the value of _MAX at this stage
without it having an impact elsewhere.

> We have the same requirements for the reset or recovery mechanism, but there are differences:
>
>      APP                                  PMD
>       |                                    |
>       |                               detect error
>       | <---report error event---          |
>       |                                    |
>  do error stats                            |
>    and report                              |
>       | ---start recover-->                |
>       |                                do recover
>       | <---report recover result          |
>       |                                    |
>  if succ just log
>  else may migrate
>    service
>
> Can we generalize these processes (meaning that the implementation is at the framework layer)? Or only at the PMD API?
>
>
> On 2022/2/15 0:06, Ray Kinsella wrote:
>>
>> Thomas Monjalon writes:
>>
>>> 14/02/2022 11:16, Ray Kinsella:
>>>> Ray Kinsella writes:
>>>>> Thomas Monjalon writes:
>>>>>> 02/02/2022 12:44, Ray Kinsella:
>>>>>>> Ferruh Yigit writes:
>>>>>>>> On 1/28/2022 12:48 PM, Kalesh A P wrote:
>>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>>>>> @@ -3818,6 +3818,24 @@ enum rte_eth_event_type {
>>>>>>>>>      RTE_ETH_EVENT_DESTROY,  /**< port is released */
>>>>>>>>>      RTE_ETH_EVENT_IPSEC,    /**< IPsec offload related event */
>>>>>>>>>      RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */
>>>>>>>>> +    RTE_ETH_EVENT_ERR_RECOVERING,
>>>>>>>>> +    /**< port recovering from an error
>>>>>>>>> +     *
>>>>>>>>> +     * PMD detected a FW reset or error condition.
>>>>>>>>> +     * PMD will try to recover from the error.
>>>>>>>>> +     * Data path may be quiesced and Control path operations
>>>>>>>>> +     * may fail at this time.
>>>>>>>>> +     */
>>>>>>>>> +    RTE_ETH_EVENT_RECOVERED,
>>>>>>>>> +    /**< port recovered from an error
>>>>>>>>> +     *
>>>>>>>>> +     * PMD has recovered from the error condition.
>>>>>>>>> +     * Control path and Data path are up now.
>>>>>>>>> +     * PMD re-configures the port to the state prior to the error.
>>>>>>>>> +     * Since the device has undergone a reset, flow rules
>>>>>>>>> +     * offloaded prior to reset may be lost and
>>>>>>>>> +     * the application should recreate the rules again.
>>>>>>>>> +     */
>>>>>>>>>      RTE_ETH_EVENT_MAX       /**< max value of this enum */
>>>>>>>>
>>>>>>>>
>>>>>>>> Also ABI check complains about 'RTE_ETH_EVENT_MAX' value check, cc'ed more people
>>>>>>>> to evaluate if it is a false positive:
>>>>>>>>
>>>>>>>>
>>>>>>>> 1 function with some indirect sub-type change:
>>>>>>>>   [C] 'function int rte_eth_dev_callback_register(uint16_t, rte_eth_event_type, rte_eth_dev_cb_fn, void*)' at rte_ethdev.c:4637:1 has some indirect sub-type changes:
>>>>>>>>     parameter 3 of type 'typedef rte_eth_dev_cb_fn' has sub-type changes:
>>>>>>>>       underlying type 'int (typedef uint16_t, enum rte_eth_event_type, void*, void*)*' changed:
>>>>>>>>         in pointed to type 'function type int (typedef uint16_t, enum rte_eth_event_type, void*, void*)':
>>>>>>>>           parameter 2 of type 'enum rte_eth_event_type' has sub-type changes:
>>>>>>>>             type size hasn't changed
>>>>>>>>             2 enumerator insertions:
>>>>>>>>               'rte_eth_event_type::RTE_ETH_EVENT_ERR_RECOVERING' value '11'
>>>>>>>>               'rte_eth_event_type::RTE_ETH_EVENT_RECOVERED' value '12'
>>>>>>>>             1 enumerator change:
>>>>>>>>               'rte_eth_event_type::RTE_ETH_EVENT_MAX' from value '11' to '13' at rte_ethdev.h:3807:1
>>>>>>>
>>>>>>> I don't immediately see the problem that this would cause.
>>>>>>> There are no array sizes etc dependent on the value of MAX for instance.
>>>>>>>
>>>>>>> Looks safe?
>>>>>>
>>>>>> We never know how this enum will be used by the application.
>>>>>> The max value may be used for the size of an event array.
>>>>>> It looks a real ABI issue unfortunately.
>>>>>
>>>>> Right - but we only really care about it when an array size based on MAX
>>>>> is likely to be passed to DPDK, which doesn't apply in this case.
>>>
>>> I don't completely agree.
>>> A developer may assume an event will never exceed MAX value.
>>> However, after an upgrade of DPDK without app rebuild,
>>> a higher event value may be received in the app,
>>> breaking the assumption.
>>> Should we consider this case as an ABI breakage?
>>
>> Nope - I think we should explicitly exclude MAX values from any
>> ABI guarantee, as being able to change them is key to our being able to
>> evolve DPDK while maintaining ABI stability.
>>
>> Consider what it means applying the ABI policy to a MAX value, you are
>> in effect saying that no value can be added to this enumeration
>> until the next ABI version, for me this is very restrictive without a
>> solid reason.
>>
>>>
>>>>> I noted that some Linux folks explicitly mark similar MAX values as not
>>>>> part of the ABI.
>>>>>
>>>>> /usr/include/linux/perf_event.h
>>>>> 37:    PERF_TYPE_MAX,                          /* non-ABI */
>>>>> 60:    PERF_COUNT_HW_MAX,                      /* non-ABI */
>>>>> 79:    PERF_COUNT_HW_CACHE_MAX,                /* non-ABI */
>>>>> 87:    PERF_COUNT_HW_CACHE_OP_MAX,             /* non-ABI */
>>>>> 94:    PERF_COUNT_HW_CACHE_RESULT_MAX,         /* non-ABI */
>>>>> 116:   PERF_COUNT_SW_MAX,                      /* non-ABI */
>>>>> 149:   PERF_SAMPLE_MAX = 1U << 24,             /* non-ABI */
>>>>> 151:   __PERF_SAMPLE_CALLCHAIN_EARLY = 1ULL << 63, /* non-ABI; internal use */
>>>>> 189:   PERF_SAMPLE_BRANCH_MAX_SHIFT            /* non-ABI */
>>>>> 267:   PERF_TXN_MAX = (1 << 8),                /* non-ABI */
>>>>> 301:   PERF_FORMAT_MAX = 1U << 4,              /* non-ABI */
>>>>> 1067:  PERF_RECORD_MAX,                        /* non-ABI */
>>>>> 1078:  PERF_RECORD_KSYMBOL_TYPE_MAX            /* non-ABI */
>>>>> 1087:  PERF_BPF_EVENT_MAX,                     /* non-ABI */
>>>>
>>>> Any thoughts on similarly annotating all our _MAX enums in the same way?
>>>> We could also add a section in the ABI Policy to make it explicit _MAX
>>>> enum values are not part of the ABI - what do folks think?
>>>
>>> Interesting. I am not sure it is always ABI-safe though.
>>

-- 
Regards, Ray K
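
For illustration only, the scenario Thomas describes in the quoted thread
(an application built against the old headers, then run against a newer
library that can deliver event values at or above the MAX it was compiled
with) might be sketched as below. This is a minimal sketch, not DPDK code:
app_event_type, APP_EVENT_MAX_AT_BUILD, app_event_stats and app_count_event
are hypothetical names, and the value 11 is taken from the pre-patch
RTE_ETH_EVENT_MAX shown in the ABI report above. It assumes the application
sizes a per-event array by the MAX value it saw at build time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the event enum as seen by an application
 * compiled against the older headers, where the MAX sentinel was 11. */
enum app_event_type {
	APP_EVENT_LAST_KNOWN = 10,   /* last real event known at build time */
	APP_EVENT_MAX_AT_BUILD = 11  /* sentinel value baked into the binary */
};

/* Per-event counters sized by the compile-time MAX. */
struct app_event_stats {
	uint64_t count[APP_EVENT_MAX_AT_BUILD];
	uint64_t unknown; /* events outside the range known at build time */
};

/* Count one event. Returns 0 if the event was within the range known at
 * build time, or -1 if it was out of range (counted as unknown instead of
 * indexing past the end of the array). */
static int
app_count_event(struct app_event_stats *stats, int event)
{
	if (event < 0 || event >= APP_EVENT_MAX_AT_BUILD) {
		stats->unknown++;
		return -1;
	}
	stats->count[event]++;
	return 0;
}
```

With a guard like this, values 11 and 12 (the two enumerators inserted in
the report) are recorded as unknown. Without the range check, the same
array would be indexed out of bounds, which is the assumption being
debated in the thread.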