From: Vladimir Medvedkin <medvedkinv@gmail.com>
To: Bruce Richardson <bruce.richardson@intel.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2 1/2] Add RIB library
Date: Thu, 29 Mar 2018 23:11:20 +0300 [thread overview]
Message-ID: <CANDrEHkzgUcYzaS1pdSLRuJ4WSrEF91nqOK14+_Ny4G3eSJX_g@mail.gmail.com> (raw)
In-Reply-To: <20180329102746.GA4004@bricha3-MOBL.ger.corp.intel.com>
2018-03-29 13:27 GMT+03:00 Bruce Richardson <bruce.richardson@intel.com>:
> On Wed, Feb 21, 2018 at 09:44:54PM +0000, Medvedkin Vladimir wrote:
> > RIB is an alternative to the current LPM library.
> > It solves the following problems:
> > - Increases the speed of control plane operations compared to LPM, such as
> > adding/deleting routes
> > - Adds abstraction from dataplane algorithms, so it is possible to add
> > different IP route lookup algorithms such as DXR/poptrie/lpc-trie/etc
> > in addition to current dir24_8
> > - It is possible to keep user-defined, application-specific additional
> > information in struct rte_rib_node, which represents a route entry.
> > It can be a next hop/set of next hops (i.e. active and feasible),
> > pointers to link rte_rib_nodes based on some criteria (i.e. next_hop),
> > or plenty of additional control plane information.
> > - For the dir24_8 implementation it is possible to remove the
> > rte_lpm_tbl_entry.depth field, which saves 6 bits.
> > - The new dir24_8 implementation also supports different next_hop sizes
> > (1/2/4/8 bytes per next hop)
> > - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eliminate the
> > ternary operator. Instead, it returns a special default value if there
> > is no route.
> >
> > Signed-off-by: Medvedkin Vladimir <medvedkinv@gmail.com>
> > ---
>
> Hi again,
>
> some initial comments on the dir24_8 files below.
>
> /Bruce
>
> > config/common_base | 6 +
> > doc/api/doxy-api.conf | 1 +
> > lib/Makefile | 2 +
> > lib/librte_rib/Makefile | 22 ++
> > lib/librte_rib/rte_dir24_8.c | 482 +++++++++++++++++++++++++++++++++
> > lib/librte_rib/rte_dir24_8.h | 116 ++++++++
> > lib/librte_rib/rte_rib.c | 526 +++++++++++++++++++++++++++++++++++++
> > lib/librte_rib/rte_rib.h | 322 +++++++++++++++++++++++
> > lib/librte_rib/rte_rib_version.map | 18 ++
> > mk/rte.app.mk | 1 +
> > 10 files changed, 1496 insertions(+)
> > create mode 100644 lib/librte_rib/Makefile
> > create mode 100644 lib/librte_rib/rte_dir24_8.c
> > create mode 100644 lib/librte_rib/rte_dir24_8.h
> > create mode 100644 lib/librte_rib/rte_rib.c
> > create mode 100644 lib/librte_rib/rte_rib.h
> > create mode 100644 lib/librte_rib/rte_rib_version.map
> >
>
> <snip>
>
> > diff --git a/lib/librte_rib/rte_dir24_8.c b/lib/librte_rib/rte_dir24_8.c
> > new file mode 100644
> > index 0000000..a12f882
> > --- /dev/null
> > +++ b/lib/librte_rib/rte_dir24_8.c
> > @@ -0,0 +1,482 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
> > + */
> > +
> > +#include <stdint.h>
> > +#include <stdlib.h>
> > +#include <stdio.h>
> > +#include <string.h>
> > +#include <rte_debug.h>
> > +#include <rte_malloc.h>
> > +#include <rte_prefetch.h>
> > +#include <rte_errno.h>
> > +
> > +#include <inttypes.h>
> > +
> > +#include <rte_memory.h>
> > +#include <rte_branch_prediction.h>
> > +
> > +#include <rte_rib.h>
> > +#include <rte_dir24_8.h>
> > +
> > +#define BITMAP_SLAB_BIT_SIZE_LOG2 6
> > +#define BITMAP_SLAB_BIT_SIZE (1 << BITMAP_SLAB_BIT_SIZE_LOG2)
> > +#define BITMAP_SLAB_BITMASK (BITMAP_SLAB_BIT_SIZE - 1)
> > +
> > +#define ROUNDUP(x, y) RTE_ALIGN_CEIL(x, (1 << (32 - y)))
> > +
> > +static __rte_always_inline __attribute__((pure)) void *
> > +get_tbl24_p(struct rte_dir24_8_tbl *fib, uint32_t ip)
> > +{
> > + return (void *)&((uint8_t *)fib->tbl24)[(ip &
> > + RTE_DIR24_8_TBL24_MASK) >> (8 - fib->nh_sz)];
> > +}
> > +
> > +#define LOOKUP_FUNC(suffix, type, bulk_prefetch) \
> > +int rte_dir24_8_lookup_bulk_##suffix(void *fib_p, const uint32_t *ips, \
> > + uint64_t *next_hops, const unsigned n) \
> > +{ \
> > + struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p; \
> > + uint64_t tmp; \
> > + uint32_t i; \
> > + uint32_t prefetch_offset = RTE_MIN((unsigned)bulk_prefetch, n); \
> > + \
> > + RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ips == NULL) || \
> > + (next_hops == NULL)), -EINVAL); \
> > + \
> > + for (i = 0; i < prefetch_offset; i++) \
> > + rte_prefetch0(get_tbl24_p(fib, ips[i])); \
> > + for (i = 0; i < (n - prefetch_offset); i++) { \
> > + rte_prefetch0(get_tbl24_p(fib, ips[i + prefetch_offset])); \
> > + tmp = ((type *)fib->tbl24)[ips[i] >> 8]; \
> > + if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) == \
> > + RTE_DIR24_8_VALID_EXT_ENT)) { \
> > + tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] + \
> > + ((tmp >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> > + } \
> > + next_hops[i] = tmp >> 1; \
> > + } \
> > + for (; i < n; i++) { \
> > + tmp = ((type *)fib->tbl24)[ips[i] >> 8]; \
> > + if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) == \
> > + RTE_DIR24_8_VALID_EXT_ENT)) { \
> > + tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] + \
> > + ((tmp >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> > + } \
> > + next_hops[i] = tmp >> 1; \
> > + } \
> > + return 0; \
> > +} \
>
> What is the advantage of doing this as a macro? Unless I'm missing
> something "suffix" is never actually used in the function at all, and you
> reference the size of the data from fib->nh_sz. Therefore there can be no
> performance benefit from having such a lookup function, that I can see.
>
The suffix is there to create 4 different function names; look at the end of
rte_dir24_8.c, where they are instantiated:
LOOKUP_FUNC(1b, uint8_t, 5)
LOOKUP_FUNC(2b, uint16_t, 6)
LOOKUP_FUNC(4b, uint32_t, 15)
LOOKUP_FUNC(8b, uint64_t, 12)
I have now made a single lookup function that reads the entry size from
fib->nh_sz instead of casting to the type passed into the macro.
Before:
BULK RIB Lookup: 24.2 cycles (fails = 0.0%)
After:
BULK RIB Lookup: 26.1 cycles (fails = 0.0%)
(measured on an E3-1230v1 @ 3.6GHz)
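For readers unfamiliar with the token-pasting trick the macro relies on, here is a minimal sketch; the function body is a placeholder that just reads a typed table entry, not the real dir24_8 lookup logic:

```c
#include <stdint.h>
#include <assert.h>

/* Simplified illustration of how the suffix token builds distinct
 * function names via ## pasting. The body is a stand-in: it reads a
 * table entry of the given type, nothing more. */
#define MAKE_LOOKUP(suffix, type)                                   \
static uint64_t lookup_##suffix(const void *tbl, uint32_t idx)      \
{                                                                   \
	return ((const type *)tbl)[idx];                            \
}

MAKE_LOOKUP(1b, uint8_t)   /* expands to lookup_1b() */
MAKE_LOOKUP(4b, uint32_t)  /* expands to lookup_4b() */
```

Each expansion casts the table pointer to its own element type, which is why the real macro can specialize the inner load per entry size.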
>
> Therefore, if performance is ok, I suggest just making a single lookup_bulk
> function that works with all sizes - as the inlined lookup function does in
> the header.
>
> Alternatively, if you do want specific functions for each
> entry size, you still don't need macros. Write a single function that takes
> as a final parameter the entry-size and use that in calculations rather
> than nh_sz. Then wrap that function in the set of public ones, passing in
> the final size parameter explicitly as "1", "2", "4" or "8". The compiler
> will then know that as a compile-time constant and generate the correct
> code for each size. However, for this path I suggest you check for any
> resulting performance improvement, e.g. with l3fwd, as I think it's not
> likely to be significant.
>
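The wrapper pattern described above could look roughly like the following sketch. The names and the stripped-down tbl24-only layout are placeholders for illustration, not the actual patch code:

```c
#include <stdint.h>
#include <assert.h>

/* One static routine takes the entry size as its final parameter. */
static inline uint64_t
read_entry(const void *tbl, uint32_t idx, int entry_sz)
{
	switch (entry_sz) {
	case 1: return ((const uint8_t *)tbl)[idx];
	case 2: return ((const uint16_t *)tbl)[idx];
	case 4: return ((const uint32_t *)tbl)[idx];
	default: return ((const uint64_t *)tbl)[idx];
	}
}

static int
lookup_bulk(const void *tbl24, const uint32_t *ips, uint64_t *next_hops,
	unsigned int n, int entry_sz)
{
	for (unsigned int i = 0; i < n; i++)
		/* drop the valid bit, as the patch does with tmp >> 1 */
		next_hops[i] = read_entry(tbl24, ips[i] >> 8, entry_sz) >> 1;
	return 0;
}

/* Public wrappers pass the size as a compile-time constant, so the
 * compiler can specialize each body with no macro involved. */
int lookup_bulk_2b(const void *t, const uint32_t *ips, uint64_t *nhs,
	unsigned int n)
{
	return lookup_bulk(t, ips, nhs, n, 2);
}

int lookup_bulk_4b(const void *t, const uint32_t *ips, uint64_t *nhs,
	unsigned int n)
{
	return lookup_bulk(t, ips, nhs, n, 4);
}
```

Because `entry_sz` is a literal constant in each wrapper, constant propagation should fold the switch away, giving per-size code paths equivalent to the macro expansion.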
> > +
> > +static void
> > +write_to_fib(void *ptr, uint64_t val, enum rte_dir24_8_nh_sz size, int n)
> > +{
> > + int i;
> > + uint8_t *ptr8 = (uint8_t *)ptr;
> > + uint16_t *ptr16 = (uint16_t *)ptr;
> > + uint32_t *ptr32 = (uint32_t *)ptr;
> > + uint64_t *ptr64 = (uint64_t *)ptr;
> > +
> > + switch (size) {
> > + case RTE_DIR24_8_1B:
> > + for (i = 0; i < n; i++)
> > + ptr8[i] = (uint8_t)val;
> > + break;
> > + case RTE_DIR24_8_2B:
> > + for (i = 0; i < n; i++)
> > + ptr16[i] = (uint16_t)val;
> > + break;
> > + case RTE_DIR24_8_4B:
> > + for (i = 0; i < n; i++)
> > + ptr32[i] = (uint32_t)val;
> > + break;
> > + case RTE_DIR24_8_8B:
> > + for (i = 0; i < n; i++)
> > + ptr64[i] = (uint64_t)val;
> > + break;
> > + }
> > +}
> > +
> > +static int
> > +tbl8_get_idx(struct rte_dir24_8_tbl *fib)
> > +{
> > + uint32_t i;
> > + int bit_idx;
> > +
> > + for (i = 0; (i < (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2)) &&
> > + (fib->tbl8_idxes[i] == UINT64_MAX); i++)
> > + ;
> > + if (i <= (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2)) {
> > + bit_idx = __builtin_ctzll(~fib->tbl8_idxes[i]);
> > + fib->tbl8_idxes[i] |= (1ULL << bit_idx);
> > + return (i << BITMAP_SLAB_BIT_SIZE_LOG2) + bit_idx;
> > + }
> > + return -ENOSPC;
> > +}
> > +
> > +static inline void
> > +tbl8_free_idx(struct rte_dir24_8_tbl *fib, int idx)
> > +{
> > + fib->tbl8_idxes[idx >> BITMAP_SLAB_BIT_SIZE_LOG2] &=
> > + ~(1ULL << (idx & BITMAP_SLAB_BITMASK));
> > +}
> > +
> > +static int
> > +tbl8_alloc(struct rte_dir24_8_tbl *fib, uint64_t nh)
> > +{
> > + int tbl8_idx;
> > + uint8_t *tbl8_ptr;
> > +
> > + tbl8_idx = tbl8_get_idx(fib);
> > + if (tbl8_idx < 0)
> > + return tbl8_idx;
> > + tbl8_ptr = (uint8_t *)fib->tbl8 +
> > + ((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) <<
> > + fib->nh_sz);
> > + /*Init tbl8 entries with nexthop from tbl24*/
> > + write_to_fib((void *)tbl8_ptr, nh|
> > + RTE_DIR24_8_VALID_EXT_ENT, fib->nh_sz,
> > + RTE_DIR24_8_TBL8_GRP_NUM_ENT);
> > + return tbl8_idx;
> > +}
> > +
> > +static void
> > +tbl8_recycle(struct rte_dir24_8_tbl *fib, uint32_t ip, uint64_t tbl8_idx)
> > +{
> > + int i;
> > + uint64_t nh;
> > + uint8_t *ptr8;
> > + uint16_t *ptr16;
> > + uint32_t *ptr32;
> > + uint64_t *ptr64;
> > +
> > + switch (fib->nh_sz) {
> > + case RTE_DIR24_8_1B:
> > + ptr8 = &((uint8_t *)fib->tbl8)[tbl8_idx *
> > + RTE_DIR24_8_TBL8_GRP_NUM_ENT];
> > + nh = *ptr8;
> > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
> > + if (nh != ptr8[i])
> > + return;
> > + }
> > + ((uint8_t *)fib->tbl24)[ip >> 8] =
> > + nh & ~RTE_DIR24_8_VALID_EXT_ENT;
> > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
> > + ptr8[i] = 0;
> > + break;
> > + case RTE_DIR24_8_2B:
> > + ptr16 = &((uint16_t *)fib->tbl8)[tbl8_idx *
> > + RTE_DIR24_8_TBL8_GRP_NUM_ENT];
> > + nh = *ptr16;
> > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
> > + if (nh != ptr16[i])
> > + return;
> > + }
> > + ((uint16_t *)fib->tbl24)[ip >> 8] =
> > + nh & ~RTE_DIR24_8_VALID_EXT_ENT;
> > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
> > + ptr16[i] = 0;
> > + break;
> > + case RTE_DIR24_8_4B:
> > + ptr32 = &((uint32_t *)fib->tbl8)[tbl8_idx *
> > + RTE_DIR24_8_TBL8_GRP_NUM_ENT];
> > + nh = *ptr32;
> > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
> > + if (nh != ptr32[i])
> > + return;
> > + }
> > + ((uint32_t *)fib->tbl24)[ip >> 8] =
> > + nh & ~RTE_DIR24_8_VALID_EXT_ENT;
> > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
> > + ptr32[i] = 0;
> > + break;
> > + case RTE_DIR24_8_8B:
> > + ptr64 = &((uint64_t *)fib->tbl8)[tbl8_idx *
> > + RTE_DIR24_8_TBL8_GRP_NUM_ENT];
> > + nh = *ptr64;
> > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
> > + if (nh != ptr64[i])
> > + return;
> > + }
> > + ((uint64_t *)fib->tbl24)[ip >> 8] =
> > + nh & ~RTE_DIR24_8_VALID_EXT_ENT;
> > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
> > + ptr64[i] = 0;
> > + break;
> > + }
> > + tbl8_free_idx(fib, tbl8_idx);
> > +}
> > +
> > +static int
> > +install_to_fib(struct rte_dir24_8_tbl *fib, uint32_t ledge, uint32_t redge,
> > + uint64_t next_hop)
> > +{
> > + uint64_t tbl24_tmp;
> > + int tbl8_idx;
> > + int tmp_tbl8_idx;
> > + uint8_t *tbl8_ptr;
> > +
> > + /*case for 0.0.0.0/0*/
> > + if (unlikely((ledge == 0) && (redge == 0))) {
> > + write_to_fib(fib->tbl24, next_hop << 1, fib->nh_sz, 1 << 24);
> > + return 0;
> > + }
> > + if (ROUNDUP(ledge, 24) <= redge) {
> > + if (ledge < ROUNDUP(ledge, 24)) {
> > + tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge);
> > + if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
> > + RTE_DIR24_8_VALID_EXT_ENT) {
> > + tbl8_idx = tbl8_alloc(fib, tbl24_tmp);
> > + tmp_tbl8_idx = tbl8_get_idx(fib);
> > + if ((tbl8_idx < 0) || (tmp_tbl8_idx < 0))
> > + return -ENOSPC;
> > + tbl8_free_idx(fib, tmp_tbl8_idx);
> > + /*update dir24 entry with tbl8 index*/
> > + write_to_fib(get_tbl24_p(fib, ledge),
> > + (tbl8_idx << 1)|
> > + RTE_DIR24_8_VALID_EXT_ENT,
> > + fib->nh_sz, 1);
> > + } else
> > + tbl8_idx = tbl24_tmp >> 1;
> > + tbl8_ptr = (uint8_t *)fib->tbl8 +
> > + (((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) +
> > + (ledge & ~RTE_DIR24_8_TBL24_MASK)) <<
> > + fib->nh_sz);
> > + /*update tbl8 with new next hop*/
> > + write_to_fib((void *)tbl8_ptr, (next_hop << 1)|
> > + RTE_DIR24_8_VALID_EXT_ENT,
> > + fib->nh_sz, ROUNDUP(ledge, 24) - ledge);
> > + tbl8_recycle(fib, ledge, tbl8_idx);
> > + }
> > + if (ROUNDUP(ledge, 24) < (redge & RTE_DIR24_8_TBL24_MASK)) {
> > + write_to_fib(get_tbl24_p(fib, ROUNDUP(ledge, 24)),
> > + next_hop << 1, fib->nh_sz,
> > + ((redge & RTE_DIR24_8_TBL24_MASK) -
> > + ROUNDUP(ledge, 24)) >> 8);
> > + }
> > + if (redge & ~RTE_DIR24_8_TBL24_MASK) {
> > + tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, redge);
> > + if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
> > + RTE_DIR24_8_VALID_EXT_ENT) {
> > + tbl8_idx = tbl8_alloc(fib, tbl24_tmp);
> > + if (tbl8_idx < 0)
> > + return -ENOSPC;
> > + /*update dir24 entry with tbl8 index*/
> > + write_to_fib(get_tbl24_p(fib, redge),
> > + (tbl8_idx << 1)|
> > + RTE_DIR24_8_VALID_EXT_ENT,
> > + fib->nh_sz, 1);
> > + } else
> > + tbl8_idx = tbl24_tmp >> 1;
> > + tbl8_ptr = (uint8_t *)fib->tbl8 +
> > + ((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) <<
> > + fib->nh_sz);
> > + /*update tbl8 with new next hop*/
> > + write_to_fib((void *)tbl8_ptr, (next_hop << 1)|
> > + RTE_DIR24_8_VALID_EXT_ENT,
> > + fib->nh_sz, redge & ~RTE_DIR24_8_TBL24_MASK);
> > + tbl8_recycle(fib, redge, tbl8_idx);
> > + }
> > + } else {
> > + tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge);
> > + if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
> > + RTE_DIR24_8_VALID_EXT_ENT) {
> > + tbl8_idx = tbl8_alloc(fib, tbl24_tmp);
> > + if (tbl8_idx < 0)
> > + return -ENOSPC;
> > + /*update dir24 entry with tbl8 index*/
> > + write_to_fib(get_tbl24_p(fib, ledge),
> > + (tbl8_idx << 1)|
> > + RTE_DIR24_8_VALID_EXT_ENT,
> > + fib->nh_sz, 1);
> > + } else
> > + tbl8_idx = tbl24_tmp >> 1;
> > + tbl8_ptr = (uint8_t *)fib->tbl8 +
> > + (((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) +
> > + (ledge & ~RTE_DIR24_8_TBL24_MASK)) <<
> > + fib->nh_sz);
> > + /*update tbl8 with new next hop*/
> > + write_to_fib((void *)tbl8_ptr, (next_hop << 1)|
> > + RTE_DIR24_8_VALID_EXT_ENT,
> > + fib->nh_sz, redge - ledge);
> > + tbl8_recycle(fib, ledge, tbl8_idx);
> > + }
> > + return 0;
> > +}
> > +
> > +static int
> > +modify_fib(struct rte_rib *rib, uint32_t ip, uint8_t depth,
> > + uint64_t next_hop)
> > +{
> > + struct rte_rib_node *tmp = NULL;
> > + struct rte_dir24_8_tbl *fib;
> > + uint32_t ledge, redge;
> > + int ret;
> > +
> > + fib = rib->fib;
> > +
> > + if (next_hop > DIR24_8_MAX_NH(fib))
> > + return -EINVAL;
> > +
> > + ledge = ip;
> > + do {
> > + tmp = rte_rib_tree_get_nxt(rib, ip, depth, tmp,
> > + RTE_RIB_GET_NXT_COVER);
> > + if (tmp != NULL) {
> > + if (tmp->depth == depth)
> > + continue;
> > + redge = tmp->key;
> > + if (ledge == redge) {
> > + ledge = redge +
> > + (uint32_t)(1ULL << (32 - tmp->depth));
> > + continue;
> > + }
> > + ret = install_to_fib(fib, ledge, redge,
> > + next_hop);
> > + if (ret != 0)
> > + return ret;
> > + ledge = redge +
> > + (uint32_t)(1ULL << (32 - tmp->depth));
> > + } else {
> > + redge = ip + (uint32_t)(1ULL << (32 - depth));
> > + ret = install_to_fib(fib, ledge, redge,
> > + next_hop);
> > + if (ret != 0)
> > + return ret;
> > + }
> > + } while (tmp);
> > +
> > + return 0;
> > +}
> > +
> > +int
> > +rte_dir24_8_modify(struct rte_rib *rib, uint32_t ip, uint8_t depth,
> > + uint64_t next_hop, enum rte_rib_op op)
> > +{
> > + struct rte_dir24_8_tbl *fib;
> > + struct rte_rib_node *tmp = NULL;
> > + struct rte_rib_node *node;
> > + struct rte_rib_node *parent;
> > + int ret = 0;
> > +
> > + if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH))
> > + return -EINVAL;
> > +
> > + fib = rib->fib;
> > + RTE_ASSERT(fib);
> > +
> > + ip &= (uint32_t)(UINT64_MAX << (32 - depth));
> > +
> > + node = rte_rib_tree_lookup_exact(rib, ip, depth);
> > + switch (op) {
> > + case RTE_RIB_ADD:
> > + if (node != NULL) {
> > + if (node->nh == next_hop)
> > + return 0;
> > + ret = modify_fib(rib, ip, depth, next_hop);
> > + if (ret == 0)
> > + node->nh = next_hop;
> > + return 0;
> > + }
> > + if (depth > 24) {
> > + tmp = rte_rib_tree_get_nxt(rib, ip, 24, NULL,
> > + RTE_RIB_GET_NXT_COVER);
> > + if ((tmp == NULL) &&
> > + (fib->cur_tbl8s >= fib->number_tbl8s))
> > + return -ENOSPC;
> > +
> > + }
> > + node = rte_rib_tree_insert(rib, ip, depth);
> > + if (node == NULL)
> > + return -rte_errno;
> > + node->nh = next_hop;
> > + parent = rte_rib_tree_lookup_parent(node);
> > + if ((parent != NULL) && (parent->nh == next_hop))
> > + return 0;
> > + ret = modify_fib(rib, ip, depth, next_hop);
> > + if (ret) {
> > + rte_rib_tree_remove(rib, ip, depth);
> > + return ret;
> > + }
> > + if ((depth > 24) && (tmp == NULL))
> > + fib->cur_tbl8s++;
> > + return 0;
> > + case RTE_RIB_DEL:
> > + if (node == NULL)
> > + return -ENOENT;
> > +
> > + parent = rte_rib_tree_lookup_parent(node);
> > + if (parent != NULL) {
> > + if (parent->nh != node->nh)
> > + ret = modify_fib(rib, ip, depth, parent->nh);
> > + } else
> > + ret = modify_fib(rib, ip, depth, fib->def_nh);
> > + if (ret == 0) {
> > + rte_rib_tree_remove(rib, ip, depth);
> > + if (depth > 24) {
> > + tmp = rte_rib_tree_get_nxt(rib, ip, 24, NULL,
> > + RTE_RIB_GET_NXT_COVER);
> > + if (tmp == NULL)
> > + fib->cur_tbl8s--;
> > + }
> > + }
> > + return ret;
> > + default:
> > + break;
> > + }
> > + return -EINVAL;
> > +}
> > +
> > +struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int socket_id,
> > + enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh)
> > +{
> > + char mem_name[RTE_RIB_NAMESIZE];
> > + struct rte_dir24_8_tbl *fib;
> > +
> > + snprintf(mem_name, sizeof(mem_name), "FIB_%s", name);
> > + fib = rte_zmalloc_socket(name, sizeof(struct rte_dir24_8_tbl) +
> > + RTE_DIR24_8_TBL24_NUM_ENT * (1 << nh_sz), RTE_CACHE_LINE_SIZE,
> > + socket_id);
> > + if (fib == NULL)
> > + return fib;
> > +
> > + snprintf(mem_name, sizeof(mem_name), "TBL8_%s", name);
> > + fib->tbl8 = rte_zmalloc_socket(mem_name, RTE_DIR24_8_TBL8_GRP_NUM_ENT *
> > + (1 << nh_sz) * RTE_DIR24_8_TBL8_NUM_GROUPS,
> > + RTE_CACHE_LINE_SIZE, socket_id);
> > + if (fib->tbl8 == NULL) {
> > + rte_free(fib);
> > + return NULL;
> > + }
> > + fib->def_nh = def_nh;
> > + fib->nh_sz = nh_sz;
> > + fib->number_tbl8s = RTE_MIN((uint32_t)RTE_DIR24_8_TBL8_NUM_GROUPS,
> > + DIR24_8_MAX_NH(fib));
> > +
> > + snprintf(mem_name, sizeof(mem_name), "TBL8_idxes_%s", name);
> > + fib->tbl8_idxes = rte_zmalloc_socket(mem_name,
> > + RTE_ALIGN_CEIL(fib->number_tbl8s, 64) >> 3,
> > + RTE_CACHE_LINE_SIZE, socket_id);
> > + if (fib->tbl8_idxes == NULL) {
> > + rte_free(fib->tbl8);
> > + rte_free(fib);
> > + return NULL;
> > + }
> > +
> > + return fib;
> > +}
> > +
> > +void
> > +rte_dir24_8_free(void *fib_p)
> > +{
> > + struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
> > +
> > + rte_free(fib->tbl8_idxes);
> > + rte_free(fib->tbl8);
> > + rte_free(fib);
> > +}
> > +
> > +LOOKUP_FUNC(1b, uint8_t, 5)
> > +LOOKUP_FUNC(2b, uint16_t, 6)
> > +LOOKUP_FUNC(4b, uint32_t, 15)
> > +LOOKUP_FUNC(8b, uint64_t, 12)
> > diff --git a/lib/librte_rib/rte_dir24_8.h b/lib/librte_rib/rte_dir24_8.h
> > new file mode 100644
> > index 0000000..f779409
> > --- /dev/null
> > +++ b/lib/librte_rib/rte_dir24_8.h
> > @@ -0,0 +1,116 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
> > + */
> > +
> > +#ifndef _RTE_DIR24_8_H_
> > +#define _RTE_DIR24_8_H_
> > +
> > +/**
> > + * @file
> > + * RTE Longest Prefix Match (LPM)
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +/** @internal Total number of tbl24 entries. */
> > +#define RTE_DIR24_8_TBL24_NUM_ENT (1 << 24)
> > +
> > +/** Maximum depth value possible for IPv4 LPM. */
> > +#define RTE_DIR24_8_MAX_DEPTH 32
> > +
> > +/** @internal Number of entries in a tbl8 group. */
> > +#define RTE_DIR24_8_TBL8_GRP_NUM_ENT 256
> > +
> > +/** @internal Total number of tbl8 groups in the tbl8. */
> > +#define RTE_DIR24_8_TBL8_NUM_GROUPS 65536
> > +
> > +/** @internal bitmask with valid and valid_group fields set */
> > +#define RTE_DIR24_8_VALID_EXT_ENT 0x01
> > +
> > +#define RTE_DIR24_8_TBL24_MASK 0xffffff00
> > +
> > +/** Size of nexthop (1 << nh_sz) bits */
> > +enum rte_dir24_8_nh_sz {
> > + RTE_DIR24_8_1B,
> > + RTE_DIR24_8_2B,
> > + RTE_DIR24_8_4B,
> > + RTE_DIR24_8_8B
> > +};
> > +
> > +
> > +#define DIR24_8_BITS_IN_NH(fib) (8 * (1 << fib->nh_sz))
> > +#define DIR24_8_MAX_NH(fib) ((1ULL << (DIR24_8_BITS_IN_NH(fib) - 1)) - 1)
> > +
> > +#define DIR24_8_TBL_IDX(a, fib) ((a) >> (3 - fib->nh_sz))
> > +#define DIR24_8_PSD_IDX(a, fib) ((a) & ((1 << (3 - fib->nh_sz)) - 1))
> > +
> > +#define DIR24_8_TBL24_VAL(ip) (ip >> 8)
> > +#define DIR24_8_TBL8_VAL(res, ip) \
> > + ((res >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT + (uint8_t)ip) \
> > +
> > +#define DIR24_8_LOOKUP_MSK \
> > + (((1ULL << ((1 << (fib->nh_sz + 3)) - 1)) << 1) - 1) \
> > +
> > +#define RTE_DIR24_8_GET_TBL24(fib, ip) \
> > + ((fib->tbl24[DIR24_8_TBL_IDX(DIR24_8_TBL24_VAL(ip), fib)] >> \
> > + (DIR24_8_PSD_IDX(DIR24_8_TBL24_VAL(ip), fib) * \
> > + DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK) \
> > +
> > +#define RTE_DIR24_8_GET_TBL8(fib, res, ip) \
> > + ((fib->tbl8[DIR24_8_TBL_IDX(DIR24_8_TBL8_VAL(res, ip), fib)] >> \
> > + (DIR24_8_PSD_IDX(DIR24_8_TBL8_VAL(res, ip), fib) * \
> > + DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK) \
> >
> I would strongly suggest making each of the above macros into inline
> functions instead. It would allow easier readability since you have
> parameter types and can split things across lines easier.
> Also, some comments might be good too.
>
> +
> > +
> > +struct rte_dir24_8_tbl {
> > + uint32_t number_tbl8s; /**< Total number of tbl8s. */
> > + uint32_t cur_tbl8s; /**< Current number of tbl8s. */
> > + uint64_t def_nh;
> > + enum rte_dir24_8_nh_sz nh_sz; /**< Size of nexthop entry */
> > + uint64_t *tbl8; /**< LPM tbl8 table. */
> > + uint64_t *tbl8_idxes;
> > + uint64_t tbl24[0] __rte_cache_aligned; /**< LPM tbl24 table. */
> > +};
> > +
> > +struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int socket_id,
> > + enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh);
> > +void rte_dir24_8_free(void *fib_p);
> > +int rte_dir24_8_modify(struct rte_rib *rib, uint32_t key,
> > + uint8_t depth, uint64_t next_hop, enum rte_rib_op op);
> > +int rte_dir24_8_lookup_bulk_1b(void *fib_p, const uint32_t *ips,
> > + uint64_t *next_hops, const unsigned n);
> > +int rte_dir24_8_lookup_bulk_2b(void *fib_p, const uint32_t *ips,
> > + uint64_t *next_hops, const unsigned n);
> > +int rte_dir24_8_lookup_bulk_4b(void *fib_p, const uint32_t *ips,
> > + uint64_t *next_hops, const unsigned n);
> > +int rte_dir24_8_lookup_bulk_8b(void *fib_p, const uint32_t *ips,
> > + uint64_t *next_hops, const unsigned n);
> > +
> > +
> > +static inline int
> > +rte_dir24_8_lookup(void *fib_p, uint32_t ip, uint64_t *next_hop)
>
> Why use void * as parameter, since the proper type is defined just above?
Agree, I will change it to struct rte_dir24_8_tbl *.
>
> > +{
> > + uint64_t res;
> > + struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
> > +
> > + RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ip == NULL) ||
> > + (next_hop == NULL)), -EINVAL);
> > +
> > + res = RTE_DIR24_8_GET_TBL24(fib, ip);
> > + if (unlikely((res & RTE_DIR24_8_VALID_EXT_ENT) ==
> > + RTE_DIR24_8_VALID_EXT_ENT)) {
> > + res = RTE_DIR24_8_GET_TBL8(fib, res, ip);
> > + }
> > + *next_hop = res >> 1;
> > + return 0;
> > +}
>
> Do we need this static inline function? Can the bulk functions do on their
> own? If we can remove this, we can move the most of the header file
> contents, especially the structures, out of the public header. That would
> greatly improve the ease with which ABI can be maintained.
>
It was done in a similar manner to LPM, which has separate single and bulk
lookup versions.
Of course it is possible to remove this function entirely and look up single
packets through the bulk version, but I thought somebody might find it useful.
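If the inline single lookup were dropped, callers could go through the bulk path with n = 1 via a thin wrapper. A hypothetical sketch, using a stand-in bulk function since the real one lives in the library:

```c
#include <stdint.h>
#include <assert.h>

/* Stand-in bulk lookup used only for this demonstration: it "resolves"
 * each IP to its /24 prefix value. The real callee would be the
 * dir24_8 bulk lookup. */
static int
demo_lookup_bulk(void *fib, const uint32_t *ips, uint64_t *next_hops,
	unsigned int n)
{
	(void)fib;
	for (unsigned int i = 0; i < n; i++)
		next_hops[i] = ips[i] >> 8;
	return 0;
}

/* Single lookup as a wrapper over the bulk path: parameter checks and
 * the table walk then live in exactly one place, and the tbl24/tbl8
 * structures no longer need to be visible in the public header. */
static inline int
lookup_single(void *fib, uint32_t ip, uint64_t *next_hop)
{
	return demo_lookup_bulk(fib, &ip, next_hop, 1);
}
```

The trade-off is one extra call and loop setup per packet against the ABI benefit of moving the table layout out of the public header.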
>
>
> +
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_DIR24_8_H_ */
> > +
>
--
Regards,
Vladimir