From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qt0-f171.google.com (mail-qt0-f171.google.com [209.85.216.171]) by dpdk.org (Postfix) with ESMTP id 1CF5F7CB5 for ; Fri, 27 Apr 2018 10:27:06 +0200 (CEST) Received: by mail-qt0-f171.google.com with SMTP id z23-v6so1249390qti.5 for ; Fri, 27 Apr 2018 01:27:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=n6F+WMR3BAdfMBqjVdMf+dCb13t11V44BzK9JFlpZZI=; b=ejt47VUAWVjbBWyk3R8sSH1+9EtYsjredgAzuZFnlRbqyrC92u0TctR8y0wKVTiN2Q vMygCJRHBAfc7WuFm07/1pVpwOWcL+kwV1fcANbyCWDNcMf5DkAK0HgnYSsoqdWFK979 HiTwDDPLziUVDBUHE1mZ6uKIerkR/jCe6SrZAgQ1wel1d/wuK4JweA7pOcYdfSVvmhKD T2faZwe5P8OCuDv4wYpStNq8G4bP882UgmXjyOI4Zk36eQ1LgCd9NtuKUEF1FAGVuJpG /pjHpR/d7BqJ8pzprlCaXVZZPlUgxu5hYZkjFknf4M3Gyv+WuWckrKkLNQTYzFzQi2O5 OXtA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=n6F+WMR3BAdfMBqjVdMf+dCb13t11V44BzK9JFlpZZI=; b=IMHZn8stscyCU9sXzySoMopoxa3/6HYgetSJ+z+FiYc4qVAMc/xubOP0b+sp6lwgG/ PJ9gj2S++pKyKnj7dpwmPEcIfM4eTbSHnmiTMRxJEXAJTdv1IA41SDLV+0nEKGNhk/8M GUmRPSfLYL6EqUn1LEo0LHXHKOgLDGmjHclBKSZX69mZxxMgIOD3BTc4Q76tL8CGdaZK 0LKi0oqZEz3Fr3+OKge85D7DyEQ+Rmd5hrkkzGyknYqOeErsAfdDf1ed9EK+uYJRcOVl 3yO02hJCH3sWaVeLPHulqp2Fohj+MzLV5dKUKHxQ4qJncz1GH37WQDhBNL0YAZNRv3O4 jtjA== X-Gm-Message-State: ALQs6tA8OvzTN+Y/ErLGfiIN9naa1teJUrDRPq8x6BlgkkitWgj85Zi6 fZmzuJi2WoUVoGifRtnXFT8y+9PV/LIml5IvWnV9ukXu X-Google-Smtp-Source: AB8JxZp2pedznj08co+vRcmmQgUXaAPqFTJVVqZ2GxOmdaFgaHmwpeMMfizOm63fu3Ho3LYWCAUKrxXoR6r+9ILizdY= X-Received: by 2002:a0c:afb2:: with SMTP id s47-v6mr1083995qvc.163.1524817625365; Fri, 27 Apr 2018 01:27:05 -0700 (PDT) MIME-Version: 1.0 Received: by 10.200.37.27 with HTTP; Fri, 27 Apr 2018 01:27:04 -0700 (PDT) In-Reply-To: <20180426152448.0c21dc13@xeon-e3> References: <1524780214-23196-1-git-send-email-medvedkinv@gmail.com> <20180426152448.0c21dc13@xeon-e3> From: Vladimir Medvedkin Date: Fri, 27 Apr 2018 11:27:04 +0300 Message-ID: To: Stephen Hemminger Cc: dev@dpdk.org, Bruce Richardson , thomas@monjalon.net, cristian.dumitrescu@intel.com Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.15 Subject: Re: [dpdk-dev] [PATCH v4 0/4] lib/rib: Add Routing Information Base library X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Apr 2018 08:27:06 -0000 Hi Stephen, 2018-04-27 1:24 GMT+03:00 Stephen Hemminger : > On Fri, 27 Apr 2018 01:03:30 +0300 > Medvedkin Vladimir wrote: > > > This patch series introduces new library librte_rib which potentially > could > > replace librte_lpm. > > > > RIB is an alternative to current LPM library. > > It solves the following problems > > - Increases the speed of control plane operations against lpm such as > > adding/deleting routes > > - Adds abstraction from dataplane algorithms, so it is possible to add > > different ip route lookup algorythms such as DXR/poptrie/lpc-trie/etc > > in addition to current dir24_8 > > - It is possible to keep user defined application specific additional > > information in struct rte_rib_node which represents route entry. > > It can be next hop/set of next hops (i.e. active and feasible), > > pointers to link rte_rib_node based on some criteria (i.e. next_hop), > > plenty of additional control plane information. > > > > v4: > > fix various bugs > > make struct rte_rib opaque > > make inline functions instead of macro > > remove RTE_RIB_MALLOC node allocation type > > add new lookup functions > > remove rte_dir24_8_lookup() > > add rte_dir24_8_get_lookup() > > add meson support > > add fib configuration > > > > > > Medvedkin Vladimir (4): > > Add RIB library > > Add dir24_8 implementation for rib library > > Add autotests for RIB library > > The existing DPDK LPM code does need more work it does trade space for > time. > > It does have some advantages though: > * LPM lookup table is independent of number of routes. > To be honest it depends on number of prefixes longer 24. But despite this my implementation is independent of number of routes too. > * LPM lookup table can be lock free reader safe using RCU. > As I see you are using rcu only for tbl8's growing. This feature was out of scope and could be added if needed. But I'm afraid about license constraints. > > The existing slowness of add and delete was fixed at Vyatta/Brocade by > replacing list with red-black tree. Patches were submitted but never > merged. > Could you please share lpm_perf_autotest results for your version? Here my results for lpm_perf_autotest and rib_perf_autotest (sizeof key == sizeof(struct rte_lpm_tbl_entry) so memory footprint is the same) LPM: Unique added entries = 1037026 Used table 24 entries = 14680064 (87.5%) 64 byte Cache entries used = 458753 (29360192 bytes) Average LPM Add: 306566 cycles Average LPM Lookup: 29.7 cycles (fails = 0.0%) BULK LPM Lookup: 29.4 cycles (fails = 0.0%) LPM LookupX4: 27.1 cycles (fails = 0.0%) Average LPM Delete: 201360 cycles Test OK RIB: Unique added entries = 1076816 Average RIB Add: 3149.62 cycles BULK RIB Lookup: 23.8 cycles (fails = 0.0%) Average RIB Delete: 2949.62 cycles Test OK As you can see for 1M+ routes I have 100 times faster add and 70 times faster delete. Lookup speed is faster too. It was achieved due to: 1. routes are kept in compressed binary tree so rule insertion/deletion are cheap comparing to flat array in LPM. And yes, rb_tree implementation fixes that problem too. 2. routes are kept in compressed binary tree so you can get less specific prefix (parent node) for a given prefix (you want add/del to) and compare it's next hop to a given next hop. There is no need to touch fib table if they are equal. 3. routes are kept in compressed binary tree so you can traverse on some part of that tree to get subprefixes for a given prefix (you want add/del to). Hence you can get gaps between retrieved more specific routes and based on this gaps unconditionally rewrite part of fib table. Comparing to LPM where every write to tbl24 or tbl8 is accompanied by a comparison of depth. In addition there is no need to keep depth in fib anymore. 4. LPM tbl8_alloc is expensive due to memory scan through tbl8 to find a free tbl8 group. More effective to keep free tbl8 indexes in bitmap. -- Regards, Vladimir