From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk0-f171.google.com (mail-qk0-f171.google.com [209.85.220.171]) by dpdk.org (Postfix) with ESMTP id 71D779103 for ; Tue, 15 Aug 2017 13:01:56 +0200 (CEST) Received: by mail-qk0-f171.google.com with SMTP id u139so2376767qka.1 for ; Tue, 15 Aug 2017 04:01:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=IQ2lzyLqIrU2qonQLKi68bt2Oh/3mywoXgl7hATkMvk=; b=RUuf/azURrXXkmlnA3dTmjm3d0oL42y8farzSbx9bK/uBIoXZSXZHpMYycE3Q6362T 9S1ENLipYW1Y4uzzY8mvqpDS3ejHNWwNDsfJmNPQGMs/BEAeOfdu0bcXGb+gtSmVATe/ RBM4v/3aAurx7vDOYYgNx71SPCOOUOA9H0m1YHrlomHhTLLT5k8TPkVih3/DFzj1lpAA XfUKUl0N8QitM/MwEXiaFhEYIJO4cKugi7s3VwTsBtDbvBCj4KCB3aOiSMNNUA1PK5IS aVzV9WRaie0cHIJchsCy9US7oM2IOqtgY2azam4xMDGym4C7J/Nk1WZOV4MUWx99uOyV O/NQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=IQ2lzyLqIrU2qonQLKi68bt2Oh/3mywoXgl7hATkMvk=; b=EdXWBXLTDWAUI6szQ9rinwcoeEwONrC4949Dw24h37UhVHUuL9YiG+ELRg09dekE6Z qXFoVi2nIKyHnRtU5jP7/xYNUqQvFzHal3V92L5VkLZPqYhW1bWibIiWw54bVkOkDllU k6yKsQnstOSqug+Qro2f9X5uDnYJEekcECf5aPtS/wtQ2XHwql29JRoKsLdLtElcshl2 120paMIknh2QoebJ6jG0PbOjw9Vy0qQyamZYxtjSnbIN1m5chEE9jkn8uzxQ6AIPpS2U tB3V/cMblGwCXY/yTPMrCUZyczDI7q6puS92U9NJi9Ucn6MeUhLanWcF6K+X2Ay9NWe0 +VgA== X-Gm-Message-State: AHYfb5gdEL9o4jQKk0IHRFDpqW+yqD1O5RIuxs1T+S1CDmRzQnViDXON 4C1pIig7BDlbFf/cIeeoZ2aAIqcIUA== X-Received: by 10.55.94.69 with SMTP id s66mr32434181qkb.99.1502794915873; Tue, 15 Aug 2017 04:01:55 -0700 (PDT) MIME-Version: 1.0 Received: by 10.200.49.137 with HTTP; Tue, 15 Aug 2017 04:01:55 -0700 (PDT) In-Reply-To: References: <1499801585-10031-1-git-send-email-medvedkinv@gmail.com> <20170814105156.GA8112@bricha3-MOBL3.ger.corp.intel.com> <20170815082322.GA568@bricha3-MOBL3.ger.corp.intel.com> From: Vladimir Medvedkin Date: Tue, 15 Aug 2017 14:01:55 +0300 Message-ID: To: Bruce Richardson Cc: "dev@dpdk.org" Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.15 Subject: Re: [dpdk-dev] [RFC] Add RIB library X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Aug 2017 11:01:57 -0000 Moreover rte_rib_v4_node could contain app specific extension (.ext field). For example you can implement PIC (prefix independent convergence) by linking rte_rib_v4_node with similar next hop together and precalculate feasible next hop for each. Something like: struct rte_rib_pic_nh { struct *rte_rib_v4_node; uint64_t nh; uint64_t feasible_nh; } and keep that linked list's head in next hop structure. When next hop fails you just jump from rte_rib_v4_node rte_rib_v4_node and change next hop very fast. 2017-08-15 13:49 GMT+03:00 Vladimir Medvedkin : > The advantage is in increasing control plane operations speed. I tested > with my fullview + internal routes. It had 650030 prefixes(shuffled) with > 1000 specific (longer /24) prefixes within 73 /24 networks. Every prefix > had unique next hop. In this test the speed of inserting new routes was > increased 70 times against current LPM. This was achieved due to > 1. keeping routes in a trie structure instead of array (no need to get > free room for rule) > 2. avoid unnecessary reads in tbl24 (checking for .depth). Thanks to > rte_rib_v4_get_next_child() (that is reverse order traversal without > recursion) you can get all more specific prefixes inside your target prefix > (you want to add/del), so you can get all ranges between that more > specifics and write next hop unconditionally to tbl24 and tbl8. > > 2017-08-15 11:23 GMT+03:00 Bruce Richardson : > >> On Tue, Aug 15, 2017 at 01:28:26AM +0300, Vladimir Medvedkin wrote: >> > 2017-08-14 13:51 GMT+03:00 Bruce Richardson > >: >> > >> > > On Tue, Jul 11, 2017 at 07:33:04PM +0000, Medvedkin Vladimir wrote: >> > > > Hi, >> > > > >> > > > I want to introduce new library for ip routing lookup that have some >> > > advantages >> > > > over current LPM library. In short: >> > > > - Increases the speed of control plane operations against lpm >> such >> > > as >> > > > adding/deleting routes >> > > > - Adds abstraction from dataplane algorythms, so it is >> possible to >> > > add >> > > > different ip route lookup algorythms such as >> > > DXR/poptrie/lpc-trie/etc >> > > > in addition to current dir24_8 >> > > > - It is possible to keep user defined application specific >> > > additional >> > > > information in struct rte_rib_v4_node which represents route >> > > entry. >> > > > It can be next hop/set of next hops (i.e. active and >> feasible), >> > > > pointers to link rte_rib_v4_node based on some criteria (i.e. >> > > next_hop), >> > > > plenty of additional control plane information. >> > > > - For dir24_8 implementation it is possible to remove >> > > rte_lpm_tbl_entry.depth >> > > > field that helps to save 6 bits. >> > > > - Also new dir24_8 implementation supports different next_hop >> sizes >> > > > (1/2/4/8 bytes per next hop) >> > > > >> > > > It would be nice to hear your opinion. The draft is below. >> > > > >> > > > Medvedkin Vladimir (1): >> > > > lib/rib: Add Routing Information Base library >> > > > >> > > >> > > On reading this patch and then having discussion with you offline, it >> > > appears there are two major new elements in this patchset: >> > > >> > > 1. a re-implementation of LPM, with the major advantage of having a >> > > flexible data-size >> > > 2. a separate control plane structure that is designed to fit on top >> off >> > > possibly multiple lookup structures for the data plane >> > > >> > > Is this correct? >> > > >> > Correct >> > >> > > >> > > For the first part, I don't think we should carry about two separate >> LPM >> > > implementations, but rather look to take the improvements in your >> > > version back into the existing lib. [Or else replace the existing one, >> > > but I prefer pulling the new stuff into it, so as to keep backward >> > > compatibility] >> > > >> > >> > > For the second part, perhaps you could expand a bit more on the >> thought >> > > here, and explain what all different data plane implementations would >> > > fit under it. Would, for instance a hash-lookup work? In that case, >> what >> > > would the data plane APIs be, and the control plane ones. >> > > >> > >> > I'm not sure for _all_ data plane implementations, but from my point of >> > view compressed prefix trie (rte_rib structure) could be useful at least >> > for dir24_8, dxr, bitmap handling. Concerning to hash lookup, it >> depends on >> > algorithm (array of hash tables indexed by mask length, unrolling >> prefix to >> > number of /32). >> > Perhaps it is better to waive the abstraction and make LPM as primary >> > struct that keeps rte_rib inside (instead of rules_tbl[ ]). >> > In that case rte_rib becomes side structure and it's API only for >> working >> > with a trie. LPM's API remains the same (except next_hop size and LPM >> > creation). >> > >> > >> What is the advantage of using the rte_rib for control plane access over >> the existing rules table structure. Are not the basic operations needed >> for adding/removing/looking-up rules supported by both? >> >> /Bruce >> > > > > -- > Regards, > Vladimir > -- Regards, Vladimir