From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 6AA1AA04B7; Wed, 14 Oct 2020 15:10:34 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D617C1DC42; Wed, 14 Oct 2020 15:10:32 +0200 (CEST) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id BF4D81DC36 for ; Wed, 14 Oct 2020 15:10:30 +0200 (CEST) IronPort-SDR: 2bqCM1+WKQseU/RduWDbRV1Cu4QW9BxXpn90GsrsoBJnHamsr+jdKhjgj7MSk2b9gqkD7gFHMc cnddsD9GnZSA== X-IronPort-AV: E=McAfee;i="6000,8403,9773"; a="166221404" X-IronPort-AV: E=Sophos;i="5.77,374,1596524400"; d="scan'208";a="166221404" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2020 06:10:24 -0700 IronPort-SDR: EC4A+2gEQywZpBy31iOAav2iGGDV55pCLN47wljPihBfk5agkyfIXXFg198Mv++5rhe3u6hH/A 4v1lNdiiYJqA== X-IronPort-AV: E=Sophos;i="5.77,374,1596524400"; d="scan'208";a="463881310" Received: from vmedvedk-mobl.ger.corp.intel.com (HELO [10.214.212.248]) ([10.214.212.248]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2020 06:10:19 -0700 To: Michel Machado , Kevin Traynor , Ruifeng Wang , Bruce Richardson , Cody Doucette , Andre Nathan , Qiaobin Fu Cc: "dev@dpdk.org" , Honnappa Nagarahalli , nd References: <20200907081518.46350-1-ruifeng.wang@arm.com> <20200907081518.46350-3-ruifeng.wang@arm.com> <20200915160224.GA825@bricha3-MOBL.ger.corp.intel.com> <81eb8fde-cd4f-1df9-0ebb-05c902b30fe3@intel.com> <48834549-00e9-b762-4915-9a2dd0e5fe1d@redhat.com> <6497770e-9d1c-97c3-3834-84bd96186836@digirati.com.br> <18c44f31-abc0-c0b5-c2fb-76d6166d5237@digirati.com.br> <9197371c-5e03-4852-d62a-6456f0b762f0@intel.com> From: "Medvedkin, Vladimir" Message-ID: <9647f80d-53c3-33aa-b6d0-301aef34ca0a@intel.com> Date: Wed, 14 Oct 2020 14:10:15 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [PATCH 2/2] lpm: hide internal data X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 13/10/2020 20:48, Michel Machado wrote: > On 10/13/20 3:06 PM, Medvedkin, Vladimir wrote: >> >> >> On 13/10/2020 18:46, Michel Machado wrote: >>> On 10/13/20 11:41 AM, Medvedkin, Vladimir wrote: >>>> Hi Michel, >>>> >>>> Could you please describe a condition when LPM gets inconsistent? As >>>> I can see if there is no free tbl8 it will return -ENOSPC. >>> >>>     Consider this simple example, we need to add the following two >>> prefixes with different next hops: 10.99.0.0/16, 18.99.99.128/25. If >>> the LPM table is out of tbl8s, the second prefix is not added and >>> Gatekeeper will make decisions in violation of the policy. The data >>> structure of the LPM table is consistent, but its content >>> inconsistent with the policy. >> >> Aha, thanks. So do I understand correctly that you need to add a set >> of routes atomically (either the entire set is installed or nothing)? > >    Yes. > >> If so, then I would suggest having 2 lpm and switching them atomically >> after a successful addition. As for now, even if you have enough >> tbl8's, routes are installed non atomically, i.e. there will be a time >> gap between adding two routes, so in this time interval the table will >> be inconsistent with the policy. >> Also, if new lpm algorithms are added to the DPDK, they won't have >> such a thing as tbl8. > >    Our code already deals with synchronization. OK, so my suggestion here would be to add new routes to the shadow copy of the lpm, and if it returns -ENOSPC, than create a new LPM with double amount of tbl8's and add all the routes to it. Then switch the active-shadow LPM pointers. In this case you'll always add a bulk of routes atomically. > >>>     We minimize the need of replacing a LPM table by allocating LPM >>> tables with the double of what we need (see example here >>> https://github.com/AltraMayor/gatekeeper/blob/95d1d6e8201861a0d0c698bfd06ad606674f1e07/lua/examples/policy.lua#L172-L183), >>> but the code must be ready for unexpected needs that may arise in >>> production. >>> >> >> Usually, the table is initialized with a large enough number of >> entries, enough to add a possible number of routes. One tbl8 group >> takes up 1Kb of memory which is nothing comparing to the size of tbl24 >> which is 64Mb. > >    When the prefixes come from BGP, initializing a large enough table > is fine. But when prefixes come from threat intelligence, the number of > prefixes can vary wildly and the number of prefixes above 24 bits are > way more common. > >> P.S. consider using rte_fib library, it has a number of advantages >> over LPM. You can replace the loop in __lookup_fib_bulk() with a bulk >> lookup call and this will probably increase the speed. > >    I'm not aware of the rte_fib library. The only documentation that I > found on Google was https://doc.dpdk.org/api/rte__fib_8h.html and it > just says "FIB (Forwarding information base) implementation for IPv4 > Longest Prefix Match". That's true, I'm going to add programmer's guide soon. Although the fib API is very similar to existing LPM. > >>>> >>>> On 13/10/2020 15:58, Michel Machado wrote: >>>>> Hi Kevin, >>>>> >>>>>     We do need fields max_rules and number_tbl8s of struct rte_lpm, >>>>> so the removal would force us to have another patch to our local >>>>> copy of DPDK. We'd rather avoid this new local patch because we >>>>> wish to eventually be in sync with the stock DPDK. >>>>> >>>>>     Those fields are needed in Gatekeeper because we found a >>>>> condition in an ongoing deployment in which the entries of some LPM >>>>> tables may suddenly change a lot to reflect policy changes. To >>>>> avoid getting into a state in which the LPM table is inconsistent >>>>> because it cannot fit all the new entries, we compute the needed >>>>> parameters to support the new entries, and compare with the current >>>>> parameters. If the current table doesn't fit everything, we have to >>>>> replace it with a new LPM table. >>>>> >>>>>     If there were a way to obtain the struct rte_lpm_config of a >>>>> given LPM table, it would cleanly address our need. We have the >>>>> same need in IPv6 and have a local patch to work around it (see >>>>> https://github.com/cjdoucette/dpdk/commit/3eaf124a781349b8ec8cd880db26a78115cb8c8f). >>>>> Thus, an IPv4 and IPv6 solution would be best. >>>>> >>>>>     PS: I've added Qiaobin Fu, another Gatekeeper maintainer, to >>>>> this disscussion. >>>>> >>>>> [ ]'s >>>>> Michel Machado >>>>> >>>>> On 10/13/20 9:53 AM, Kevin Traynor wrote: >>>>>> Hi Gatekeeper maintainers (I think), >>>>>> >>>>>> fyi - there is a proposal to remove some members of a struct in >>>>>> DPDK LPM >>>>>> API that Gatekeeper is using [1]. It would be only from DPDK 20.11 >>>>>> but >>>>>> as it's an LTS I guess it would probably hit Debian in a few months. >>>>>> >>>>>> The full thread is here: >>>>>> http://inbox.dpdk.org/dev/20200907081518.46350-1-ruifeng.wang@arm.com/ >>>>>> >>>>>> >>>>>> Maybe you can take a look and tell us if they are needed in >>>>>> Gatekeeper >>>>>> or you can workaround it? >>>>>> >>>>>> thanks, >>>>>> Kevin. >>>>>> >>>>>> [1] >>>>>> https://github.com/AltraMayor/gatekeeper/blob/master/gt/lua_lpm.c#L235-L248 >>>>>> >>>>>> >>>>>> On 09/10/2020 07:54, Ruifeng Wang wrote: >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Kevin Traynor >>>>>>>> Sent: Wednesday, September 30, 2020 4:46 PM >>>>>>>> To: Ruifeng Wang ; Medvedkin, Vladimir >>>>>>>> ; Bruce Richardson >>>>>>>> >>>>>>>> Cc: dev@dpdk.org; Honnappa Nagarahalli >>>>>>>> ; nd >>>>>>>> Subject: Re: [dpdk-dev] [PATCH 2/2] lpm: hide internal data >>>>>>>> >>>>>>>> On 16/09/2020 04:17, Ruifeng Wang wrote: >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: Medvedkin, Vladimir >>>>>>>>>> Sent: Wednesday, September 16, 2020 12:28 AM >>>>>>>>>> To: Bruce Richardson ; Ruifeng Wang >>>>>>>>>> >>>>>>>>>> Cc: dev@dpdk.org; Honnappa Nagarahalli >>>>>>>>>> ; nd >>>>>>>>>> Subject: Re: [PATCH 2/2] lpm: hide internal data >>>>>>>>>> >>>>>>>>>> Hi Ruifeng, >>>>>>>>>> >>>>>>>>>> On 15/09/2020 17:02, Bruce Richardson wrote: >>>>>>>>>>> On Mon, Sep 07, 2020 at 04:15:17PM +0800, Ruifeng Wang wrote: >>>>>>>>>>>> Fields except tbl24 and tbl8 in rte_lpm structure have no >>>>>>>>>>>> need to >>>>>>>>>>>> be exposed to the user. >>>>>>>>>>>> Hide the unneeded exposure of structure fields for better ABI >>>>>>>>>>>> maintainability. >>>>>>>>>>>> >>>>>>>>>>>> Suggested-by: David Marchand >>>>>>>>>>>> Signed-off-by: Ruifeng Wang >>>>>>>>>>>> Reviewed-by: Phil Yang >>>>>>>>>>>> --- >>>>>>>>>>>>    lib/librte_lpm/rte_lpm.c | 152 >>>>>>>>>>>> +++++++++++++++++++++++--------------- >>>>>>>>>> - >>>>>>>>>>>>    lib/librte_lpm/rte_lpm.h |   7 -- >>>>>>>>>>>>    2 files changed, 91 insertions(+), 68 deletions(-) >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> diff --git a/lib/librte_lpm/rte_lpm.h >>>>>>>>>>>> b/lib/librte_lpm/rte_lpm.h >>>>>>>>>>>> index 03da2d37e..112d96f37 100644 >>>>>>>>>>>> --- a/lib/librte_lpm/rte_lpm.h >>>>>>>>>>>> +++ b/lib/librte_lpm/rte_lpm.h >>>>>>>>>>>> @@ -132,17 +132,10 @@ struct rte_lpm_rule_info { >>>>>>>>>>>> >>>>>>>>>>>>    /** @internal LPM structure. */ >>>>>>>>>>>>    struct rte_lpm { >>>>>>>>>>>> -    /* LPM metadata. */ >>>>>>>>>>>> -    char name[RTE_LPM_NAMESIZE];        /**< Name of the >>>>>>>>>>>> lpm. */ >>>>>>>>>>>> -    uint32_t max_rules; /**< Max. balanced rules per lpm. */ >>>>>>>>>>>> -    uint32_t number_tbl8s; /**< Number of tbl8s. */ >>>>>>>>>>>> -    struct rte_lpm_rule_info rule_info[RTE_LPM_MAX_DEPTH]; >>>>>>>>>>>> /**< >>>>>>>>>> Rule info table. */ >>>>>>>>>>>> - >>>>>>>>>>>>        /* LPM Tables. */ >>>>>>>>>>>>        struct rte_lpm_tbl_entry >>>>>>>>>>>> tbl24[RTE_LPM_TBL24_NUM_ENTRIES] >>>>>>>>>>>>                __rte_cache_aligned; /**< LPM tbl24 table. */ >>>>>>>>>>>>        struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */ >>>>>>>>>>>> -    struct rte_lpm_rule *rules_tbl; /**< LPM rules. */ >>>>>>>>>>>>    }; >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Since this changes the ABI, does it not need advance notice? >>>>>>>>>>> >>>>>>>>>>> [Basically the return value point from rte_lpm_create() will be >>>>>>>>>>> different, and that return value could be used by >>>>>>>>>>> rte_lpm_lookup() >>>>>>>>>>> which as a static inline function will be in the binary and >>>>>>>>>>> using >>>>>>>>>>> the old structure offsets.] >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Agree with Bruce, this patch breaks ABI, so it can't be accepted >>>>>>>>>> without prior notice. >>>>>>>>>> >>>>>>>>> So if the change wants to happen in 20.11, a deprecation notice >>>>>>>>> should >>>>>>>>> have been added in 20.08. >>>>>>>>> I should have added a deprecation notice. This change will have >>>>>>>>> to wait for >>>>>>>> next ABI update window. >>>>>>>>> >>>>>>>> >>>>>>>> Do you plan to extend? or is this just speculative? >>>>>>> It is speculative. >>>>>>> >>>>>>>> >>>>>>>> A quick scan and there seems to be several projects using some >>>>>>>> of these >>>>>>>> members that you are proposing to hide. e.g. BESS, NFF-Go, DPVS, >>>>>>>> gatekeeper. I didn't look at the details to see if they are >>>>>>>> really needed. >>>>>>>> >>>>>>>> Not sure how much notice they'd need or if they update DPDK >>>>>>>> much, but I >>>>>>>> think it's worth having a closer look as to how they use lpm and >>>>>>>> what the >>>>>>>> impact to them is. >>>>>>> Checked the projects listed above. BESS, NFF-Go and DPVS don't >>>>>>> access the members to be hided. >>>>>>> They will not be impacted by this patch. >>>>>>> But Gatekeeper accesses the rte_lpm internal members that to be >>>>>>> hided. Its compilation will be broken with this patch. >>>>>>> >>>>>>>> >>>>>>>>> Thanks. >>>>>>>>> Ruifeng >>>>>>>>>>>>    /** LPM RCU QSBR configuration structure. */ >>>>>>>>>>>> -- >>>>>>>>>>>> 2.17.1 >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Regards, >>>>>>>>>> Vladimir >>>>>>> >>>>>> >>>> >> -- Regards, Vladimir