DPDK patches and discussions
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com>,
	Michel Machado <michel@digirati.com.br>,
	Kevin Traynor <ktraynor@redhat.com>,
	Ruifeng Wang <Ruifeng.Wang@arm.com>,
	Bruce Richardson <bruce.richardson@intel.com>,
	Cody Doucette <doucette@bu.edu>,
	Andre Nathan <andre@digirati.com.br>,
	Qiaobin Fu <mengxiang0811@gmail.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	nd <nd@arm.com>
Subject: Re: [dpdk-dev] [PATCH 2/2] lpm: hide internal data
Date: Wed, 14 Oct 2020 23:57:54 +0000
Message-ID: <DBAPR08MB5814117F38D45F942352B44F98050@DBAPR08MB5814.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <9647f80d-53c3-33aa-b6d0-301aef34ca0a@intel.com>

<snip>

> >>
> >>
> >> On 13/10/2020 18:46, Michel Machado wrote:
> >>> On 10/13/20 11:41 AM, Medvedkin, Vladimir wrote:
> >>>> Hi Michel,
> >>>>
> >>>> Could you please describe a condition when LPM gets inconsistent?
> >>>> As I can see if there is no free tbl8 it will return -ENOSPC.
> >>>
> >>>     Consider this simple example, we need to add the following two
> >>> prefixes with different next hops: 10.99.0.0/16, 18.99.99.128/25. If
> >>> the LPM table is out of tbl8s, the second prefix is not added and
> >>> Gatekeeper will make decisions in violation of the policy. The data
> >>> structure of the LPM table is consistent, but its content is
> >>> inconsistent with the policy.
The max_rules and number_tbl8s fields in 'struct rte_lpm' hold configuration information only. These two fields do not change as routes are added and do not indicate how much space is left, so you cannot use them to decide whether there is enough room to add more routes.
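
To illustrate why the configured values alone do not answer the question: the space a route set consumes depends on the prefixes themselves. Below is a rough sketch, assuming the DIR-24-8 layout in which each distinct /24 that contains at least one prefix longer than 24 bits occupies one tbl8 group. 'struct app_route' and app_count_tbl8_groups() are hypothetical application-side names, not DPDK API, and the result is only an estimate of what a route set needs, not of what a live table has left.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical application-side route record (not a DPDK type). */
struct app_route {
	uint32_t ip;    /* IPv4 address in host byte order */
	uint8_t depth;  /* prefix length, 1..32 */
};

/* qsort() comparator for uint32_t keys. */
static int cmp_u32(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/*
 * Estimate the number of tbl8 groups a route set needs: one group per
 * distinct /24 that contains at least one prefix longer than 24 bits.
 * Returns -1 if the scratch buffer cannot be allocated.
 */
static long app_count_tbl8_groups(const struct app_route *routes, size_t n)
{
	uint32_t *keys;
	size_t i, nkeys = 0;
	long groups = 0;

	if (n == 0)
		return 0;
	keys = malloc(n * sizeof(*keys));
	if (keys == NULL)
		return -1;

	/* Collect the top 24 bits of every prefix deeper than /24. */
	for (i = 0; i < n; i++)
		if (routes[i].depth > 24)
			keys[nkeys++] = routes[i].ip >> 8;

	/* Sort so that duplicates are adjacent, then count distinct keys. */
	qsort(keys, nkeys, sizeof(*keys), cmp_u32);
	for (i = 0; i < nkeys; i++)
		if (i == 0 || keys[i] != keys[i - 1])
			groups++;

	free(keys);
	return groups;
}
```

For the two prefixes in the example above, only 18.99.99.128/25 is deeper than /24, so the estimate is a single tbl8 group. The application would compare that estimate with the number_tbl8s value it used at creation time.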

> >>
> >> Aha, thanks. So do I understand correctly that you need to add a set
> >> of routes atomically (either the entire set is installed or nothing)?
> >
> >     Yes.
> >
> >> If so, then I would suggest having 2 lpm and switching them
> >> atomically after a successful addition. As for now, even if you have
> >> enough tbl8's, routes are installed non atomically, i.e. there will
> >> be a time gap between adding two routes, so in this time interval the
> >> table will be inconsistent with the policy.
> >> Also, if new lpm algorithms are added to the DPDK, they won't have
> >> such a thing as tbl8.
> >
> >     Our code already deals with synchronization.
If the application code already deals with synchronization, is it possible to roll back (i.e. delete the routes added so far) when adding the route set fails?
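
What I have in mind is roughly the following sketch. All names here are hypothetical: 'struct app_table_ops' stands in for whatever wrappers the application has around the table's add and delete calls, and the toy table exists only to exercise the logic by failing with -ENOSPC once it is full. It also assumes that none of the prefixes in the set was present before the call, since deleting a route does not restore a next hop that the add overwrote.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application-side route record (not a DPDK type). */
struct app_route {
	uint32_t ip;        /* IPv4 address in host byte order */
	uint8_t depth;      /* prefix length */
	uint32_t next_hop;
};

/* Hypothetical wrappers around the table's add and delete calls. */
struct app_table_ops {
	int (*add)(void *table, const struct app_route *r); /* 0 or -errno */
	int (*del)(void *table, const struct app_route *r);
};

/*
 * Add all n routes, or none of them: on the first failure, delete the
 * routes added so far (in reverse order) and return the error.
 */
static int app_add_route_set(void *table, const struct app_table_ops *ops,
		const struct app_route *routes, size_t n)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = ops->add(table, &routes[i]);
		if (ret < 0) {
			while (i-- > 0)
				ops->del(table, &routes[i]);
			return ret;
		}
	}
	return 0;
}

/* Toy table for illustration only: counts entries, rejects when full. */
struct toy_table {
	size_t used;
	size_t capacity;
};

static int toy_add(void *table, const struct app_route *r)
{
	struct toy_table *t = table;

	(void)r;
	if (t->used == t->capacity)
		return -ENOSPC;
	t->used++;
	return 0;
}

static int toy_del(void *table, const struct app_route *r)
{
	struct toy_table *t = table;

	(void)r;
	t->used--;
	return 0;
}
```

Note that readers can still observe a partially installed set while the loop runs, so this only helps if the application's own synchronization already covers that window.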

> 
> OK, so my suggestion here would be to add new routes to the shadow copy
> of the lpm, and if it returns -ENOSPC, then create a new LPM with double
> the amount of tbl8's and add all the routes to it. Then switch the active-shadow
> LPM pointers. In this case you'll always add a bulk of routes atomically.
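
The pointer switch described above could look something like this minimal sketch using C11 atomics. 'struct app_table', app_table_get() and app_table_publish() are hypothetical names, and the table type is only a stand-in for an LPM instance. The sketch covers the swap alone; populating the shadow table and growing it on -ENOSPC are left to the caller.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical table type standing in for an LPM instance. */
struct app_table {
	unsigned int generation;
};

/* Pointer that data-path readers dereference for lookups. */
static _Atomic(struct app_table *) app_active;

/* Reader side: take a snapshot of the currently active table. */
static struct app_table *app_table_get(void)
{
	return atomic_load_explicit(&app_active, memory_order_acquire);
}

/*
 * Writer side: publish a fully populated shadow table and return the
 * previously active one. The caller must wait until no reader is still
 * using the old table (for example with an RCU-style grace period)
 * before freeing or reusing it.
 */
static struct app_table *app_table_publish(struct app_table *shadow)
{
	return atomic_exchange_explicit(&app_active, shadow,
			memory_order_acq_rel);
}
```

Readers see either the old table or the new one, never a mix, which is what makes the bulk update appear atomic to the data path.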
> 
> >
> >>>     We minimize the need of replacing a LPM table by allocating LPM
> >>> tables with the double of what we need (see example here
> >>>
> >>> https://github.com/AltraMayor/gatekeeper/blob/95d1d6e8201861a0d0c698bfd06ad606674f1e07/lua/examples/policy.lua#L172-L183),
> >>> but the code must be ready for unexpected needs that may arise in
> >>> production.
> >>>
> >>
> >> Usually, the table is initialized with a large enough number of
> >> entries, enough to add a possible number of routes. One tbl8 group
> >> takes up 1 KB of memory, which is nothing compared to the size of
> >> tbl24, which is 64 MB.
> >
> >     When the prefixes come from BGP, initializing a large enough table
> > is fine. But when prefixes come from threat intelligence, the number
> > of prefixes can vary wildly and prefixes longer than 24 bits are far
> > more common.
> >
> >> P.S. consider using rte_fib library, it has a number of advantages
> >> over LPM. You can replace the loop in __lookup_fib_bulk() with a bulk
> >> lookup call and this will probably increase the speed.
> >
> >     I'm not aware of the rte_fib library. The only documentation that
> > I found on Google was https://doc.dpdk.org/api/rte__fib_8h.html and it
> > just says "FIB (Forwarding information base) implementation for IPv4
> > Longest Prefix Match".
> 
> That's true, I'm going to add programmer's guide soon.
> Although the fib API is very similar to existing LPM.
> 
> >
> >>>>
> >>>> On 13/10/2020 15:58, Michel Machado wrote:
> >>>>> Hi Kevin,
> >>>>>
> >>>>>     We do need fields max_rules and number_tbl8s of struct
> >>>>> rte_lpm, so the removal would force us to have another patch to
> >>>>> our local copy of DPDK. We'd rather avoid this new local patch
> >>>>> because we wish to eventually be in sync with the stock DPDK.
> >>>>>
> >>>>>     Those fields are needed in Gatekeeper because we found a
> >>>>> condition in an ongoing deployment in which the entries of some
> >>>>> LPM tables may suddenly change a lot to reflect policy changes. To
> >>>>> avoid getting into a state in which the LPM table is inconsistent
> >>>>> because it cannot fit all the new entries, we compute the needed
> >>>>> parameters to support the new entries, and compare with the
> >>>>> current parameters. If the current table doesn't fit everything,
> >>>>> we have to replace it with a new LPM table.
> >>>>>
> >>>>>     If there were a way to obtain the struct rte_lpm_config of a
> >>>>> given LPM table, it would cleanly address our need. We have the
> >>>>> same need in IPv6 and have a local patch to work around it (see
> >>>>>
> >>>>> https://github.com/cjdoucette/dpdk/commit/3eaf124a781349b8ec8cd880db26a78115cb8c8f).
I do not see why such an API would not be possible; we could add one that returns max_rules and number_tbl8s (essentially, the config that was passed to rte_lpm_create()).
But is it possible to store that information in the application, since the application passed it to rte_lpm in the first place?
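
As a sketch of what storing it in the application could look like: keep the creation-time values next to the table handle. The 'app_lpm' names are hypothetical; 'struct app_lpm_config' simply mirrors the max_rules and number_tbl8s members of struct rte_lpm_config so that the example compiles without DPDK headers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Local copy of the values the application chose at creation time.
 * In a real build these would come from the struct rte_lpm_config
 * passed to rte_lpm_create().
 */
struct app_lpm_config {
	uint32_t max_rules;
	uint32_t number_tbl8s;
};

/* Hypothetical wrapper: the table handle plus the config used for it. */
struct app_lpm {
	void *lpm;                  /* opaque handle, e.g. struct rte_lpm * */
	struct app_lpm_config cfg;  /* remembered by the application */
};

/*
 * Return non-zero if a route set needing need_rules rules and
 * need_tbl8s tbl8 groups fits within the configured limits.
 */
static int app_lpm_fits(const struct app_lpm *t, uint32_t need_rules,
		uint32_t need_tbl8s)
{
	return need_rules <= t->cfg.max_rules &&
		need_tbl8s <= t->cfg.number_tbl8s;
}
```

With this, the check that decides whether the table must be replaced no longer needs to read any field of struct rte_lpm, and the same wrapper shape would work for IPv6.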

> >>>>> Thus, an IPv4 and IPv6 solution would be best.
> >>>>>
> >>>>>     PS: I've added Qiaobin Fu, another Gatekeeper maintainer, to
> >>>>> this discussion.
> >>>>>
> >>>>> [ ]'s
> >>>>> Michel Machado
> >>>>>
> >>>>> On 10/13/20 9:53 AM, Kevin Traynor wrote:
> >>>>>> Hi Gatekeeper maintainers (I think),
> >>>>>>
> >>>>>> fyi - there is a proposal to remove some members of a struct in
> >>>>>> DPDK LPM API that Gatekeeper is using [1]. It would be only from
> >>>>>> DPDK 20.11 but as it's an LTS I guess it would probably hit
> >>>>>> Debian in a few months.
> >>>>>>
> >>>>>> The full thread is here:
> >>>>>> http://inbox.dpdk.org/dev/20200907081518.46350-1-ruifeng.wang@arm.com/
> >>>>>>
> >>>>>>
> >>>>>> Maybe you can take a look and tell us if they are needed in
> >>>>>> Gatekeeper or you can workaround it?
> >>>>>>
> >>>>>> thanks,
> >>>>>> Kevin.
> >>>>>>
> >>>>>> [1]
> >>>>>>
> >>>>>> https://github.com/AltraMayor/gatekeeper/blob/master/gt/lua_lpm.c#L235-L248
> >>>>>>
> >>>>>>
> >>>>>> On 09/10/2020 07:54, Ruifeng Wang wrote:
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Kevin Traynor <ktraynor@redhat.com>
> >>>>>>>> Sent: Wednesday, September 30, 2020 4:46 PM
> >>>>>>>> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; Medvedkin, Vladimir
> >>>>>>>> <vladimir.medvedkin@intel.com>; Bruce Richardson
> >>>>>>>> <bruce.richardson@intel.com>
> >>>>>>>> Cc: dev@dpdk.org; Honnappa Nagarahalli
> >>>>>>>> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> >>>>>>>> Subject: Re: [dpdk-dev] [PATCH 2/2] lpm: hide internal data
> >>>>>>>>
> >>>>>>>> On 16/09/2020 04:17, Ruifeng Wang wrote:
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
> >>>>>>>>>> Sent: Wednesday, September 16, 2020 12:28 AM
> >>>>>>>>>> To: Bruce Richardson <bruce.richardson@intel.com>; Ruifeng
> >>>>>>>>>> Wang <Ruifeng.Wang@arm.com>
> >>>>>>>>>> Cc: dev@dpdk.org; Honnappa Nagarahalli
> >>>>>>>>>> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> >>>>>>>>>> Subject: Re: [PATCH 2/2] lpm: hide internal data
> >>>>>>>>>>
> >>>>>>>>>> Hi Ruifeng,
> >>>>>>>>>>
> >>>>>>>>>> On 15/09/2020 17:02, Bruce Richardson wrote:
> >>>>>>>>>>> On Mon, Sep 07, 2020 at 04:15:17PM +0800, Ruifeng Wang wrote:
> >>>>>>>>>>>> Fields except tbl24 and tbl8 in rte_lpm structure have no
> >>>>>>>>>>>> need to be exposed to the user.
> >>>>>>>>>>>> Hide the unneeded exposure of structure fields for better
> >>>>>>>>>>>> ABI maintainability.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Suggested-by: David Marchand <david.marchand@redhat.com>
> >>>>>>>>>>>> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> >>>>>>>>>>>> Reviewed-by: Phil Yang <phil.yang@arm.com>
> >>>>>>>>>>>> ---
> >>>>>>>>>>>>    lib/librte_lpm/rte_lpm.c | 152 +++++++++++++++++++++++----------------
> >>>>>>>>>>>>    lib/librte_lpm/rte_lpm.h |   7 --
> >>>>>>>>>>>>    2 files changed, 91 insertions(+), 68 deletions(-)
> >>>>>>>>>>>>
> >>>>>>>>>>> <snip>
> >>>>>>>>>>>> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> >>>>>>>>>>>> index 03da2d37e..112d96f37 100644
> >>>>>>>>>>>> --- a/lib/librte_lpm/rte_lpm.h
> >>>>>>>>>>>> +++ b/lib/librte_lpm/rte_lpm.h
> >>>>>>>>>>>> @@ -132,17 +132,10 @@ struct rte_lpm_rule_info {
> >>>>>>>>>>>>
> >>>>>>>>>>>>    /** @internal LPM structure. */
> >>>>>>>>>>>>    struct rte_lpm {
> >>>>>>>>>>>> -    /* LPM metadata. */
> >>>>>>>>>>>> -    char name[RTE_LPM_NAMESIZE];        /**< Name of the lpm. */
> >>>>>>>>>>>> -    uint32_t max_rules; /**< Max. balanced rules per lpm. */
> >>>>>>>>>>>> -    uint32_t number_tbl8s; /**< Number of tbl8s. */
> >>>>>>>>>>>> -    struct rte_lpm_rule_info rule_info[RTE_LPM_MAX_DEPTH]; /**< Rule info table. */
> >>>>>>>>>>>> -
> >>>>>>>>>>>>        /* LPM Tables. */
> >>>>>>>>>>>>        struct rte_lpm_tbl_entry tbl24[RTE_LPM_TBL24_NUM_ENTRIES]
> >>>>>>>>>>>>                __rte_cache_aligned; /**< LPM tbl24 table. */
> >>>>>>>>>>>>        struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */
> >>>>>>>>>>>> -    struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
> >>>>>>>>>>>>    };
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Since this changes the ABI, does it not need advance notice?
> >>>>>>>>>>>
> >>>>>>>>>>> [Basically the return value point from rte_lpm_create() will
> >>>>>>>>>>> be different, and that return value could be used by
> >>>>>>>>>>> rte_lpm_lookup()
> >>>>>>>>>>> which as a static inline function will be in the binary and
> >>>>>>>>>>> using the old structure offsets.]
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Agree with Bruce, this patch breaks ABI, so it can't be
> >>>>>>>>>> accepted without prior notice.
> >>>>>>>>>>
> >>>>>>>>> So if the change wants to happen in 20.11, a deprecation
> >>>>>>>>> notice should have been added in 20.08.
> >>>>>>>>> I should have added a deprecation notice. This change will
> >>>>>>>>> have to wait for next ABI update window.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Do you plan to extend? or is this just speculative?
> >>>>>>> It is speculative.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> A quick scan and there seems to be several projects using some
> >>>>>>>> of these
> >>>>>>>> members that you are proposing to hide. e.g. BESS, NFF-Go, DPVS,
> >>>>>>>> gatekeeper. I didn't look at the details to see if they are
> >>>>>>>> really needed.
> >>>>>>>>
> >>>>>>>> Not sure how much notice they'd need or if they update DPDK
> >>>>>>>> much, but I
> >>>>>>>> think it's worth having a closer look as to how they use lpm and
> >>>>>>>> what the
> >>>>>>>> impact to them is.
> >>>>>>> Checked the projects listed above. BESS, NFF-Go and DPVS don't
> >>>>>>> access the members to be hidden.
> >>>>>>> They will not be impacted by this patch.
> >>>>>>> But Gatekeeper accesses the rte_lpm internal members that are to be
> >>>>>>> hidden. Its compilation will be broken with this patch.
> >>>>>>>
> >>>>>>>>
> >>>>>>>>> Thanks.
> >>>>>>>>> Ruifeng
> >>>>>>>>>>>>    /** LPM RCU QSBR configuration structure. */
> >>>>>>>>>>>> --
> >>>>>>>>>>>> 2.17.1
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Regards,
> >>>>>>>>>> Vladimir
> >>>>>>>
> >>>>>>
> >>>>
> >>
> 
> --
> Regards,
> Vladimir

Thread overview: 42+ messages
2020-09-07  8:15 [dpdk-dev] [PATCH 0/2] LPM changes Ruifeng Wang
2020-09-07  8:15 ` [dpdk-dev] [PATCH 1/2] lpm: fix free of data structure Ruifeng Wang
2020-09-15 15:55   ` Bruce Richardson
2020-09-15 16:25   ` Medvedkin, Vladimir
2020-09-07  8:15 ` [dpdk-dev] [PATCH 2/2] lpm: hide internal data Ruifeng Wang
2020-09-15 16:02   ` Bruce Richardson
2020-09-15 16:28     ` Medvedkin, Vladimir
2020-09-16  3:17       ` Ruifeng Wang
2020-09-30  8:45         ` Kevin Traynor
2020-10-09  6:54           ` Ruifeng Wang
2020-10-13 13:53             ` Kevin Traynor
2020-10-13 14:58               ` Michel Machado
2020-10-13 15:41                 ` Medvedkin, Vladimir
2020-10-13 17:46                   ` Michel Machado
2020-10-13 19:06                     ` Medvedkin, Vladimir
2020-10-13 19:48                       ` Michel Machado
2020-10-14 13:10                         ` Medvedkin, Vladimir
2020-10-14 23:57                           ` Honnappa Nagarahalli [this message]
2020-10-15 13:39                             ` Michel Machado
2020-10-15 17:38                               ` Honnappa Nagarahalli
2020-10-15 19:30                                 ` Medvedkin, Vladimir
2020-10-15 22:54                                   ` Honnappa Nagarahalli
2020-10-16 11:39                                     ` Kevin Traynor
2020-10-16 13:55                                       ` Michel Machado
2020-10-19 14:53                                     ` David Marchand
2020-10-20 14:22                                       ` Thomas Monjalon
2020-10-20 14:32                                         ` Medvedkin, Vladimir
2020-10-19 17:53             ` Honnappa Nagarahalli
2020-09-15 14:41 ` [dpdk-dev] [PATCH 0/2] LPM changes David Marchand
2020-10-19 13:37   ` Kevin Traynor
2020-10-21  3:02 ` [dpdk-dev] [PATCH v2 " Ruifeng Wang
2020-10-21  3:02   ` [dpdk-dev] [PATCH v2 1/2] lpm: fix free of data structure Ruifeng Wang
2020-10-21  3:02   ` [dpdk-dev] [PATCH v2 2/2] lpm: hide internal data Ruifeng Wang
2020-10-21  7:58     ` Thomas Monjalon
2020-10-21  8:15       ` Ruifeng Wang
2020-10-22 15:14     ` David Marchand
2020-10-23  6:13       ` Ruifeng Wang
2020-10-23 16:08         ` Medvedkin, Vladimir
2020-10-23  9:38 ` [dpdk-dev] [PATCH v3 0/2] LPM changes David Marchand
2020-10-23  9:38   ` [dpdk-dev] [PATCH v3 1/2] lpm: fix free of data structure David Marchand
2020-10-23  9:38   ` [dpdk-dev] [PATCH v3 2/2] lpm: hide internal data David Marchand
2020-10-26  8:26   ` [dpdk-dev] [PATCH v3 0/2] LPM changes David Marchand
