DPDK patches and discussions
From: Vladimir Medvedkin <medvedkinv@gmail.com>
To: Bruce Richardson <bruce.richardson@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [RFC] Add RIB library
Date: Tue, 15 Aug 2017 14:01:55 +0300
Message-ID: <CANDrEHky5dZgxHPWj3eOF700DR1duA_RyeWxe=4=X+cNi+QesQ@mail.gmail.com>
In-Reply-To: <CANDrEH=qFLqJS093pLb2hZB-uT9qRvmg-n7TtkR9L_TCqyihcw@mail.gmail.com>

Moreover, rte_rib_v4_node can carry an application-specific extension (the
.ext field). For example, you can implement PIC (prefix independent
convergence) by linking together the rte_rib_v4_node entries that share a
next hop and precalculating a feasible next hop for each. Something like:
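struct rte_rib_pic_nh {
    struct rte_rib_v4_node *next; /* next node with the same next hop */
    uint64_t nh;                  /* active next hop */
    uint64_t feasible_nh;         /* precalculated backup next hop */
};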
and keep the head of that linked list in the next hop structure.
When a next hop fails, you just walk from one rte_rib_v4_node to the next
and change the next hop of each, which is very fast.
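
To make the idea concrete, here is a rough sketch of the failover walk. It
does not use the real rte_rib_v4_node from the patch: struct pic_route,
struct pic_nexthop, pic_link() and pic_failover() are hypothetical names
made up for illustration, and the link pointer lives directly in the route
entry instead of the .ext field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a route entry; not the RFC's rte_rib_v4_node. */
struct pic_route {
    uint64_t nh;                    /* active next hop */
    uint64_t feasible_nh;           /* precalculated backup next hop */
    struct pic_route *next_same_nh; /* next route using the same active nh */
};

/* Hypothetical next hop record that owns the head of the list. */
struct pic_nexthop {
    uint64_t id;
    struct pic_route *head;
};

/* Attach a route to a next hop by pushing it onto that next hop's list. */
static void pic_link(struct pic_nexthop *nh, struct pic_route *r)
{
    assert(nh != NULL && r != NULL);
    r->nh = nh->id;
    r->next_same_nh = nh->head;
    nh->head = r;
}

/*
 * When nh fails, switch every route on its list to its backup next hop.
 * Returns the number of routes switched. The routes are unlinked here;
 * a real implementation would also relink each one into the list of its
 * new next hop and push the change to the data plane table.
 */
static unsigned int pic_failover(struct pic_nexthop *nh)
{
    unsigned int switched = 0;
    struct pic_route *r;

    assert(nh != NULL);
    r = nh->head;
    while (r != NULL) {
        struct pic_route *next = r->next_same_nh;

        r->nh = r->feasible_nh;
        r->next_same_nh = NULL;
        switched++;
        r = next;
    }
    nh->head = NULL;
    return switched;
}
```

The cost of a failover is proportional to the number of routes that used the
failed next hop, not to the size of the whole table, which is the point of
keeping the list head in the next hop structure.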

2017-08-15 13:49 GMT+03:00 Vladimir Medvedkin <medvedkinv@gmail.com>:

> The advantage is faster control plane operations. I tested
> with my fullview + internal routes. It had 650030 prefixes (shuffled) with
> 1000 specific (longer than /24) prefixes within 73 /24 networks. Every
> prefix had a unique next hop. In this test, inserting new routes was
> 70 times faster than with the current LPM. This was achieved by
> 1. keeping routes in a trie structure instead of an array (no need to make
> free room for a rule)
> 2. avoiding unnecessary reads in tbl24 (checking for .depth). Thanks to
> rte_rib_v4_get_next_child() (a reverse order traversal without
> recursion) you can get all more specific prefixes inside your target prefix
> (the one you want to add/del), so you can get all ranges between those
> more specifics and write the next hop unconditionally to tbl24 and tbl8.
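
A rough sketch of the "ranges between more specifics" step in point 2,
under several assumptions: the more specific prefixes have already been
collected (for example by the trie traversal) into an array sorted by
address with nested ones collapsed, only tbl24 is handled (depth <= 24),
and all names (struct slot_range, prefix_to_slots(), fill_gaps()) are
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of tbl24 slots; a slot index is (ip >> 8). */
struct slot_range {
    uint32_t first;
    uint32_t last;
};

/* Map a prefix with depth <= 24 to the tbl24 slots it covers. */
static struct slot_range prefix_to_slots(uint32_t ip, uint8_t depth)
{
    struct slot_range r;
    uint32_t count;

    assert(depth <= 24);
    count = UINT32_C(1) << (24 - depth); /* slots covered by the prefix */
    r.first = (ip >> 8) & ~(count - 1);  /* drop host bits of the prefix */
    r.last = r.first + count - 1;
    return r;
}

/*
 * Write nh into every slot of target that is not owned by a more specific
 * prefix. subs must be sorted, non-overlapping and inside target. No entry
 * is read back, so there is no per-slot depth check. Slot indices stay
 * below 2^24, so "last + 1" cannot overflow. Returns the slots written.
 */
static size_t fill_gaps(uint64_t *tbl, struct slot_range target,
                        const struct slot_range *subs, size_t n_subs,
                        uint64_t nh)
{
    size_t written = 0;
    uint32_t cur = target.first;
    uint32_t j;
    size_t i;

    for (i = 0; i < n_subs; i++) {
        for (j = cur; j < subs[i].first; j++) {
            tbl[j] = nh;
            written++;
        }
        cur = subs[i].last + 1; /* skip the more specific prefix */
    }
    for (j = cur; j <= target.last; j++) {
        tbl[j] = nh;
        written++;
    }
    return written;
}
```

For example, adding 0.0.0.0/20 while 0.0.4.0/22 and 0.0.9.0/24 already
exist touches only slots 0-3, 8 and 10-15 and leaves the slots of the two
more specific routes alone.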
>
> 2017-08-15 11:23 GMT+03:00 Bruce Richardson <bruce.richardson@intel.com>:
>
>> On Tue, Aug 15, 2017 at 01:28:26AM +0300, Vladimir Medvedkin wrote:
>> > 2017-08-14 13:51 GMT+03:00 Bruce Richardson <bruce.richardson@intel.com>:
>> >
>> > > On Tue, Jul 11, 2017 at 07:33:04PM +0000, Medvedkin Vladimir wrote:
>> > > > Hi,
>> > > >
>> > > > I want to introduce a new library for IP routing lookup that has some
>> > > > advantages over the current LPM library. In short:
>> > > >      - Increases the speed of control plane operations against lpm,
>> > > >        such as adding/deleting routes
>> > > >      - Adds abstraction from dataplane algorithms, so it is possible
>> > > >        to add different IP route lookup algorithms such as
>> > > >        DXR/poptrie/lpc-trie/etc in addition to current dir24_8
>> > > >      - It is possible to keep user defined application specific
>> > > >        additional information in struct rte_rib_v4_node which
>> > > >        represents a route entry. It can be next hop/set of next hops
>> > > >        (e.g. active and feasible), pointers to link rte_rib_v4_node
>> > > >        based on some criteria (e.g. next_hop), plenty of additional
>> > > >        control plane information.
>> > > >      - For dir24_8 implementation it is possible to remove the
>> > > >        rte_lpm_tbl_entry.depth field, which saves 6 bits.
>> > > >      - Also new dir24_8 implementation supports different next_hop
>> > > >        sizes (1/2/4/8 bytes per next hop)
>> > > >
>> > > > It would be nice to hear your opinion. The draft is below.
>> > > >
>> > > > Medvedkin Vladimir (1):
>> > > >   lib/rib: Add Routing Information Base library
>> > > >
>> > >
>> > > On reading this patch and then having discussion with you offline, it
>> > > appears there are two major new elements in this patchset:
>> > >
>> > > 1. a re-implementation of LPM, with the major advantage of having a
>> > > flexible data-size
>> > > 2. a separate control plane structure that is designed to fit on top
>> > > of possibly multiple lookup structures for the data plane
>> > >
>> > > Is this correct?
>> > >
>> > Correct
>> >
>> > >
>> > > For the first part, I don't think we should carry about two separate
>> > > LPM implementations, but rather look to take the improvements in your
>> > > version back into the existing lib. [Or else replace the existing one,
>> > > but I prefer pulling the new stuff into it, so as to keep backward
>> > > compatibility]
>> > >
>> >
>> > > For the second part, perhaps you could expand a bit more on the
>> > > thought here, and explain what all different data plane
>> > > implementations would fit under it. Would, for instance, a hash-lookup
>> > > work? In that case, what would the data plane APIs be, and the
>> > > control plane ones?
>> > >
>> >
>> >  I'm not sure for _all_ data plane implementations, but from my point of
>> > view compressed prefix trie (rte_rib structure) could be useful at least
>> > for dir24_8, dxr, bitmap handling. Concerning hash lookup, it depends
>> > on the algorithm (array of hash tables indexed by mask length, unrolling
>> > prefix to number of /32).
>> > Perhaps it is better to waive the abstraction and make LPM the primary
>> > struct that keeps rte_rib inside (instead of rules_tbl[ ]).
>> > In that case rte_rib becomes a side structure and its API is only for
>> > working with a trie. LPM's API remains the same (except next_hop size
>> > and LPM creation).
>> >
>> >
>> What is the advantage of using the rte_rib for control plane access over
>> the existing rules table structure? Are not the basic operations needed
>> for adding/removing/looking-up rules supported by both?
>>
>> /Bruce
>>
>
>
>
> --
> Regards,
> Vladimir
>



-- 
Regards,
Vladimir

Thread overview: 12+ messages
2017-07-11 19:33 Medvedkin Vladimir
2017-07-11 19:33 ` [dpdk-dev] [PATCH] lib/rib: Add Routing Information Base library Medvedkin Vladimir
2017-07-11 20:28   ` Stephen Hemminger
2017-07-11 23:17     ` Vladimir Medvedkin
2017-07-11 20:31 ` [dpdk-dev] [RFC] Add RIB library Stephen Hemminger
2017-07-11 23:13   ` Vladimir Medvedkin
2017-08-14 10:51 ` Bruce Richardson
2017-08-14 22:28   ` Vladimir Medvedkin
2017-08-15  8:23     ` Bruce Richardson
2017-08-15 10:49       ` Vladimir Medvedkin
2017-08-15 11:01         ` Vladimir Medvedkin [this message]
2018-01-16  0:41           ` Thomas Monjalon
