From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 29 Mar 2018 11:27:46 +0100
From: Bruce Richardson
To: Medvedkin Vladimir
Cc: dev@dpdk.org
Message-ID: <20180329102746.GA4004@bricha3-MOBL.ger.corp.intel.com>
References: <1519249495-16594-1-git-send-email-medvedkinv@gmail.com> <1519249495-16594-2-git-send-email-medvedkinv@gmail.com>
In-Reply-To: <1519249495-16594-2-git-send-email-medvedkinv@gmail.com>
Organization: Intel Research and Development Ireland Ltd.
Subject: Re: [dpdk-dev] [PATCH v2 1/2] Add RIB library
List-Id: DPDK patches and discussions

On Wed, Feb 21, 2018 at 09:44:54PM +0000, Medvedkin Vladimir wrote:
> RIB is an alternative to the current LPM library.
> It solves the following problems:
> - Increases the speed of control plane operations against LPM, such as
>   adding/deleting routes
> - Adds abstraction from dataplane algorithms, so it is possible to add
>   different IP route lookup algorithms such as DXR/poptrie/lpc-trie/etc.
>   in addition to the current dir24_8
> - It is possible to keep user-defined, application-specific additional
>   information in struct rte_rib_node, which represents a route entry.
>   It can be a next hop/set of next hops (i.e. active and feasible),
>   pointers linking rte_rib_nodes based on some criteria (i.e. next_hop),
>   or plenty of additional control plane information.
> - For the dir24_8 implementation it is possible to remove the
>   rte_lpm_tbl_entry.depth field, which helps to save 6 bits.
> - The new dir24_8 implementation also supports different next hop sizes
>   (1/2/4/8 bytes per next hop)
> - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eliminate a
>   ternary operator. Instead it returns a special default value if there
>   is no route.
>
> Signed-off-by: Medvedkin Vladimir
> ---

Hi again, some initial comments on the dir24_8 files below.
/Bruce > config/common_base | 6 + > doc/api/doxy-api.conf | 1 + > lib/Makefile | 2 + > lib/librte_rib/Makefile | 22 ++ > lib/librte_rib/rte_dir24_8.c | 482 +++++++++++++++++++++++++++++++++ > lib/librte_rib/rte_dir24_8.h | 116 ++++++++ > lib/librte_rib/rte_rib.c | 526 +++++++++++++++++++++++++++++++++++++ > lib/librte_rib/rte_rib.h | 322 +++++++++++++++++++++++ > lib/librte_rib/rte_rib_version.map | 18 ++ > mk/rte.app.mk | 1 + > 10 files changed, 1496 insertions(+) > create mode 100644 lib/librte_rib/Makefile > create mode 100644 lib/librte_rib/rte_dir24_8.c > create mode 100644 lib/librte_rib/rte_dir24_8.h > create mode 100644 lib/librte_rib/rte_rib.c > create mode 100644 lib/librte_rib/rte_rib.h > create mode 100644 lib/librte_rib/rte_rib_version.map > > diff --git a/lib/librte_rib/rte_dir24_8.c b/lib/librte_rib/rte_dir24_8.c > new file mode 100644 > index 0000000..a12f882 > --- /dev/null > +++ b/lib/librte_rib/rte_dir24_8.c > @@ -0,0 +1,482 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2018 Vladimir Medvedkin > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +#include > +#include > + > +#include > +#include > + > +#define BITMAP_SLAB_BIT_SIZE_LOG2 6 > +#define BITMAP_SLAB_BIT_SIZE (1 << BITMAP_SLAB_BIT_SIZE_LOG2) > +#define BITMAP_SLAB_BITMASK (BITMAP_SLAB_BIT_SIZE - 1) > + > +#define ROUNDUP(x, y) RTE_ALIGN_CEIL(x, (1 << (32 - y))) > + > +static __rte_always_inline __attribute__((pure)) void * > +get_tbl24_p(struct rte_dir24_8_tbl *fib, uint32_t ip) > +{ > + return (void *)&((uint8_t *)fib->tbl24)[(ip & > + RTE_DIR24_8_TBL24_MASK) >> (8 - fib->nh_sz)]; > +} > + > +#define LOOKUP_FUNC(suffix, type, bulk_prefetch) \ > +int rte_dir24_8_lookup_bulk_##suffix(void *fib_p, const uint32_t *ips, \ > + uint64_t *next_hops, const unsigned n) \ > +{ \ > + struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p; \ > + uint64_t tmp; \ > + uint32_t i; \ > + uint32_t 
prefetch_offset = RTE_MIN((unsigned)bulk_prefetch, n); \
> +	\
> +	RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ips == NULL) || \
> +		(next_hops == NULL)), -EINVAL); \
> +	\
> +	for (i = 0; i < prefetch_offset; i++) \
> +		rte_prefetch0(get_tbl24_p(fib, ips[i])); \
> +	for (i = 0; i < (n - prefetch_offset); i++) { \
> +		rte_prefetch0(get_tbl24_p(fib, ips[i + prefetch_offset])); \
> +		tmp = ((type *)fib->tbl24)[ips[i] >> 8]; \
> +		if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) == \
> +			RTE_DIR24_8_VALID_EXT_ENT)) { \
> +			tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] + \
> +				((tmp >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> +		} \
> +		next_hops[i] = tmp >> 1; \
> +	} \
> +	for (; i < n; i++) { \
> +		tmp = ((type *)fib->tbl24)[ips[i] >> 8]; \
> +		if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) == \
> +			RTE_DIR24_8_VALID_EXT_ENT)) { \
> +			tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] + \
> +				((tmp >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> +		} \
> +		next_hops[i] = tmp >> 1; \
> +	} \
> +	return 0; \
> +} \

What is the advantage of doing this as a macro? Unless I'm missing
something, "suffix" is never actually used in the function at all, and
you reference the size of the data from fib->nh_sz. Therefore there can
be no performance benefit from having such a lookup function, that I can
see.

Therefore, if performance is ok, I suggest just making a single
lookup_bulk function that works with all sizes - as the inlined lookup
function in the header does.

Alternatively, if you do want specific functions for each entry size,
you still don't need macros. Write a single function that takes the
entry size as a final parameter and use that in calculations rather than
nh_sz. Then wrap that function in the set of public ones, passing in the
final size parameter explicitly as "1", "2", "4" or "8". The compiler
will then know that as a compile-time constant and generate the correct
code for each size. However, for this path I suggest you check for any
resulting performance improvement, e.g.
with l3fwd, as I think it's not likely to be significant. > + > +static void > +write_to_fib(void *ptr, uint64_t val, enum rte_dir24_8_nh_sz size, int n) > +{ > + int i; > + uint8_t *ptr8 = (uint8_t *)ptr; > + uint16_t *ptr16 = (uint16_t *)ptr; > + uint32_t *ptr32 = (uint32_t *)ptr; > + uint64_t *ptr64 = (uint64_t *)ptr; > + > + switch (size) { > + case RTE_DIR24_8_1B: > + for (i = 0; i < n; i++) > + ptr8[i] = (uint8_t)val; > + break; > + case RTE_DIR24_8_2B: > + for (i = 0; i < n; i++) > + ptr16[i] = (uint16_t)val; > + break; > + case RTE_DIR24_8_4B: > + for (i = 0; i < n; i++) > + ptr32[i] = (uint32_t)val; > + break; > + case RTE_DIR24_8_8B: > + for (i = 0; i < n; i++) > + ptr64[i] = (uint64_t)val; > + break; > + } > +} > + > +static int > +tbl8_get_idx(struct rte_dir24_8_tbl *fib) > +{ > + uint32_t i; > + int bit_idx; > + > + for (i = 0; (i < (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2)) && > + (fib->tbl8_idxes[i] == UINT64_MAX); i++) > + ; > + if (i <= (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2)) { > + bit_idx = __builtin_ctzll(~fib->tbl8_idxes[i]); > + fib->tbl8_idxes[i] |= (1ULL << bit_idx); > + return (i << BITMAP_SLAB_BIT_SIZE_LOG2) + bit_idx; > + } > + return -ENOSPC; > +} > + > +static inline void > +tbl8_free_idx(struct rte_dir24_8_tbl *fib, int idx) > +{ > + fib->tbl8_idxes[idx >> BITMAP_SLAB_BIT_SIZE_LOG2] &= > + ~(1ULL << (idx & BITMAP_SLAB_BITMASK)); > +} > + > +static int > +tbl8_alloc(struct rte_dir24_8_tbl *fib, uint64_t nh) > +{ > + int tbl8_idx; > + uint8_t *tbl8_ptr; > + > + tbl8_idx = tbl8_get_idx(fib); > + if (tbl8_idx < 0) > + return tbl8_idx; > + tbl8_ptr = (uint8_t *)fib->tbl8 + > + ((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) << > + fib->nh_sz); > + /*Init tbl8 entries with nexthop from tbl24*/ > + write_to_fib((void *)tbl8_ptr, nh| > + RTE_DIR24_8_VALID_EXT_ENT, fib->nh_sz, > + RTE_DIR24_8_TBL8_GRP_NUM_ENT); > + return tbl8_idx; > +} > + > +static void > +tbl8_recycle(struct rte_dir24_8_tbl *fib, uint32_t ip, uint64_t tbl8_idx) > 
+{ > + int i; > + uint64_t nh; > + uint8_t *ptr8; > + uint16_t *ptr16; > + uint32_t *ptr32; > + uint64_t *ptr64; > + > + switch (fib->nh_sz) { > + case RTE_DIR24_8_1B: > + ptr8 = &((uint8_t *)fib->tbl8)[tbl8_idx * > + RTE_DIR24_8_TBL8_GRP_NUM_ENT]; > + nh = *ptr8; > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) { > + if (nh != ptr8[i]) > + return; > + } > + ((uint8_t *)fib->tbl24)[ip >> 8] = > + nh & ~RTE_DIR24_8_VALID_EXT_ENT; > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) > + ptr8[i] = 0; > + break; > + case RTE_DIR24_8_2B: > + ptr16 = &((uint16_t *)fib->tbl8)[tbl8_idx * > + RTE_DIR24_8_TBL8_GRP_NUM_ENT]; > + nh = *ptr16; > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) { > + if (nh != ptr16[i]) > + return; > + } > + ((uint16_t *)fib->tbl24)[ip >> 8] = > + nh & ~RTE_DIR24_8_VALID_EXT_ENT; > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) > + ptr16[i] = 0; > + break; > + case RTE_DIR24_8_4B: > + ptr32 = &((uint32_t *)fib->tbl8)[tbl8_idx * > + RTE_DIR24_8_TBL8_GRP_NUM_ENT]; > + nh = *ptr32; > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) { > + if (nh != ptr32[i]) > + return; > + } > + ((uint32_t *)fib->tbl24)[ip >> 8] = > + nh & ~RTE_DIR24_8_VALID_EXT_ENT; > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) > + ptr32[i] = 0; > + break; > + case RTE_DIR24_8_8B: > + ptr64 = &((uint64_t *)fib->tbl8)[tbl8_idx * > + RTE_DIR24_8_TBL8_GRP_NUM_ENT]; > + nh = *ptr64; > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) { > + if (nh != ptr64[i]) > + return; > + } > + ((uint64_t *)fib->tbl24)[ip >> 8] = > + nh & ~RTE_DIR24_8_VALID_EXT_ENT; > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) > + ptr64[i] = 0; > + break; > + } > + tbl8_free_idx(fib, tbl8_idx); > +} > + > +static int > +install_to_fib(struct rte_dir24_8_tbl *fib, uint32_t ledge, uint32_t redge, > + uint64_t next_hop) > +{ > + uint64_t tbl24_tmp; > + int tbl8_idx; > + int tmp_tbl8_idx; > + uint8_t *tbl8_ptr; > + > + /*case for 0.0.0.0/0*/ > + if (unlikely((ledge == 0) && 
(redge == 0))) { > + write_to_fib(fib->tbl24, next_hop << 1, fib->nh_sz, 1 << 24); > + return 0; > + } > + if (ROUNDUP(ledge, 24) <= redge) { > + if (ledge < ROUNDUP(ledge, 24)) { > + tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge); > + if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) != > + RTE_DIR24_8_VALID_EXT_ENT) { > + tbl8_idx = tbl8_alloc(fib, tbl24_tmp); > + tmp_tbl8_idx = tbl8_get_idx(fib); > + if ((tbl8_idx < 0) || (tmp_tbl8_idx < 0)) > + return -ENOSPC; > + tbl8_free_idx(fib, tmp_tbl8_idx); > + /*update dir24 entry with tbl8 index*/ > + write_to_fib(get_tbl24_p(fib, ledge), > + (tbl8_idx << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, 1); > + } else > + tbl8_idx = tbl24_tmp >> 1; > + tbl8_ptr = (uint8_t *)fib->tbl8 + > + (((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) + > + (ledge & ~RTE_DIR24_8_TBL24_MASK)) << > + fib->nh_sz); > + /*update tbl8 with new next hop*/ > + write_to_fib((void *)tbl8_ptr, (next_hop << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, ROUNDUP(ledge, 24) - ledge); > + tbl8_recycle(fib, ledge, tbl8_idx); > + } > + if (ROUNDUP(ledge, 24) < (redge & RTE_DIR24_8_TBL24_MASK)) { > + write_to_fib(get_tbl24_p(fib, ROUNDUP(ledge, 24)), > + next_hop << 1, fib->nh_sz, > + ((redge & RTE_DIR24_8_TBL24_MASK) - > + ROUNDUP(ledge, 24)) >> 8); > + } > + if (redge & ~RTE_DIR24_8_TBL24_MASK) { > + tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, redge); > + if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) != > + RTE_DIR24_8_VALID_EXT_ENT) { > + tbl8_idx = tbl8_alloc(fib, tbl24_tmp); > + if (tbl8_idx < 0) > + return -ENOSPC; > + /*update dir24 entry with tbl8 index*/ > + write_to_fib(get_tbl24_p(fib, redge), > + (tbl8_idx << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, 1); > + } else > + tbl8_idx = tbl24_tmp >> 1; > + tbl8_ptr = (uint8_t *)fib->tbl8 + > + ((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) << > + fib->nh_sz); > + /*update tbl8 with new next hop*/ > + write_to_fib((void *)tbl8_ptr, (next_hop << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, redge & 
~RTE_DIR24_8_TBL24_MASK); > + tbl8_recycle(fib, redge, tbl8_idx); > + } > + } else { > + tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge); > + if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) != > + RTE_DIR24_8_VALID_EXT_ENT) { > + tbl8_idx = tbl8_alloc(fib, tbl24_tmp); > + if (tbl8_idx < 0) > + return -ENOSPC; > + /*update dir24 entry with tbl8 index*/ > + write_to_fib(get_tbl24_p(fib, ledge), > + (tbl8_idx << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, 1); > + } else > + tbl8_idx = tbl24_tmp >> 1; > + tbl8_ptr = (uint8_t *)fib->tbl8 + > + (((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) + > + (ledge & ~RTE_DIR24_8_TBL24_MASK)) << > + fib->nh_sz); > + /*update tbl8 with new next hop*/ > + write_to_fib((void *)tbl8_ptr, (next_hop << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, redge - ledge); > + tbl8_recycle(fib, ledge, tbl8_idx); > + } > + return 0; > +} > + > +static int > +modify_fib(struct rte_rib *rib, uint32_t ip, uint8_t depth, > + uint64_t next_hop) > +{ > + struct rte_rib_node *tmp = NULL; > + struct rte_dir24_8_tbl *fib; > + uint32_t ledge, redge; > + int ret; > + > + fib = rib->fib; > + > + if (next_hop > DIR24_8_MAX_NH(fib)) > + return -EINVAL; > + > + ledge = ip; > + do { > + tmp = rte_rib_tree_get_nxt(rib, ip, depth, tmp, > + RTE_RIB_GET_NXT_COVER); > + if (tmp != NULL) { > + if (tmp->depth == depth) > + continue; > + redge = tmp->key; > + if (ledge == redge) { > + ledge = redge + > + (uint32_t)(1ULL << (32 - tmp->depth)); > + continue; > + } > + ret = install_to_fib(fib, ledge, redge, > + next_hop); > + if (ret != 0) > + return ret; > + ledge = redge + > + (uint32_t)(1ULL << (32 - tmp->depth)); > + } else { > + redge = ip + (uint32_t)(1ULL << (32 - depth)); > + ret = install_to_fib(fib, ledge, redge, > + next_hop); > + if (ret != 0) > + return ret; > + } > + } while (tmp); > + > + return 0; > +} > + > +int > +rte_dir24_8_modify(struct rte_rib *rib, uint32_t ip, uint8_t depth, > + uint64_t next_hop, enum rte_rib_op op) > +{ > + struct rte_dir24_8_tbl 
*fib; > + struct rte_rib_node *tmp = NULL; > + struct rte_rib_node *node; > + struct rte_rib_node *parent; > + int ret = 0; > + > + if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH)) > + return -EINVAL; > + > + fib = rib->fib; > + RTE_ASSERT(fib); > + > + ip &= (uint32_t)(UINT64_MAX << (32 - depth)); > + > + node = rte_rib_tree_lookup_exact(rib, ip, depth); > + switch (op) { > + case RTE_RIB_ADD: > + if (node != NULL) { > + if (node->nh == next_hop) > + return 0; > + ret = modify_fib(rib, ip, depth, next_hop); > + if (ret == 0) > + node->nh = next_hop; > + return 0; > + } > + if (depth > 24) { > + tmp = rte_rib_tree_get_nxt(rib, ip, 24, NULL, > + RTE_RIB_GET_NXT_COVER); > + if ((tmp == NULL) && > + (fib->cur_tbl8s >= fib->number_tbl8s)) > + return -ENOSPC; > + > + } > + node = rte_rib_tree_insert(rib, ip, depth); > + if (node == NULL) > + return -rte_errno; > + node->nh = next_hop; > + parent = rte_rib_tree_lookup_parent(node); > + if ((parent != NULL) && (parent->nh == next_hop)) > + return 0; > + ret = modify_fib(rib, ip, depth, next_hop); > + if (ret) { > + rte_rib_tree_remove(rib, ip, depth); > + return ret; > + } > + if ((depth > 24) && (tmp == NULL)) > + fib->cur_tbl8s++; > + return 0; > + case RTE_RIB_DEL: > + if (node == NULL) > + return -ENOENT; > + > + parent = rte_rib_tree_lookup_parent(node); > + if (parent != NULL) { > + if (parent->nh != node->nh) > + ret = modify_fib(rib, ip, depth, parent->nh); > + } else > + ret = modify_fib(rib, ip, depth, fib->def_nh); > + if (ret == 0) { > + rte_rib_tree_remove(rib, ip, depth); > + if (depth > 24) { > + tmp = rte_rib_tree_get_nxt(rib, ip, 24, NULL, > + RTE_RIB_GET_NXT_COVER); > + if (tmp == NULL) > + fib->cur_tbl8s--; > + } > + } > + return ret; > + default: > + break; > + } > + return -EINVAL; > +} > + > +struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int socket_id, > + enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh) > +{ > + char mem_name[RTE_RIB_NAMESIZE]; > + struct rte_dir24_8_tbl *fib; > + > + 
snprintf(mem_name, sizeof(mem_name), "FIB_%s", name); > + fib = rte_zmalloc_socket(name, sizeof(struct rte_dir24_8_tbl) + > + RTE_DIR24_8_TBL24_NUM_ENT * (1 << nh_sz), RTE_CACHE_LINE_SIZE, > + socket_id); > + if (fib == NULL) > + return fib; > + > + snprintf(mem_name, sizeof(mem_name), "TBL8_%s", name); > + fib->tbl8 = rte_zmalloc_socket(mem_name, RTE_DIR24_8_TBL8_GRP_NUM_ENT * > + (1 << nh_sz) * RTE_DIR24_8_TBL8_NUM_GROUPS, > + RTE_CACHE_LINE_SIZE, socket_id); > + if (fib->tbl8 == NULL) { > + rte_free(fib); > + return NULL; > + } > + fib->def_nh = def_nh; > + fib->nh_sz = nh_sz; > + fib->number_tbl8s = RTE_MIN((uint32_t)RTE_DIR24_8_TBL8_NUM_GROUPS, > + DIR24_8_MAX_NH(fib)); > + > + snprintf(mem_name, sizeof(mem_name), "TBL8_idxes_%s", name); > + fib->tbl8_idxes = rte_zmalloc_socket(mem_name, > + RTE_ALIGN_CEIL(fib->number_tbl8s, 64) >> 3, > + RTE_CACHE_LINE_SIZE, socket_id); > + if (fib->tbl8_idxes == NULL) { > + rte_free(fib->tbl8); > + rte_free(fib); > + return NULL; > + } > + > + return fib; > +} > + > +void > +rte_dir24_8_free(void *fib_p) > +{ > + struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p; > + > + rte_free(fib->tbl8_idxes); > + rte_free(fib->tbl8); > + rte_free(fib); > +} > + > +LOOKUP_FUNC(1b, uint8_t, 5) > +LOOKUP_FUNC(2b, uint16_t, 6) > +LOOKUP_FUNC(4b, uint32_t, 15) > +LOOKUP_FUNC(8b, uint64_t, 12) > diff --git a/lib/librte_rib/rte_dir24_8.h b/lib/librte_rib/rte_dir24_8.h > new file mode 100644 > index 0000000..f779409 > --- /dev/null > +++ b/lib/librte_rib/rte_dir24_8.h > @@ -0,0 +1,116 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2018 Vladimir Medvedkin > + */ > + > +#ifndef _RTE_DIR24_8_H_ > +#define _RTE_DIR24_8_H_ > + > +/** > + * @file > + * RTE Longest Prefix Match (LPM) > + */ > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +/** @internal Total number of tbl24 entries. */ > +#define RTE_DIR24_8_TBL24_NUM_ENT (1 << 24) > + > +/** Maximum depth value possible for IPv4 LPM. 
*/
> +#define RTE_DIR24_8_MAX_DEPTH 32
> +
> +/** @internal Number of entries in a tbl8 group. */
> +#define RTE_DIR24_8_TBL8_GRP_NUM_ENT 256
> +
> +/** @internal Total number of tbl8 groups in the tbl8. */
> +#define RTE_DIR24_8_TBL8_NUM_GROUPS 65536
> +
> +/** @internal bitmask with valid and valid_group fields set */
> +#define RTE_DIR24_8_VALID_EXT_ENT 0x01
> +
> +#define RTE_DIR24_8_TBL24_MASK 0xffffff00
> +
> +/** Size of nexthop (1 << nh_sz) bits */
> +enum rte_dir24_8_nh_sz {
> +	RTE_DIR24_8_1B,
> +	RTE_DIR24_8_2B,
> +	RTE_DIR24_8_4B,
> +	RTE_DIR24_8_8B
> +};
> +
> +
> +#define DIR24_8_BITS_IN_NH(fib) (8 * (1 << fib->nh_sz))
> +#define DIR24_8_MAX_NH(fib) ((1ULL << (DIR24_8_BITS_IN_NH(fib) - 1)) - 1)
> +
> +#define DIR24_8_TBL_IDX(a, fib) ((a) >> (3 - fib->nh_sz))
> +#define DIR24_8_PSD_IDX(a, fib) ((a) & ((1 << (3 - fib->nh_sz)) - 1))
> +
> +#define DIR24_8_TBL24_VAL(ip) (ip >> 8)
> +#define DIR24_8_TBL8_VAL(res, ip) \
> +	((res >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT + (uint8_t)ip) \
> +
> +#define DIR24_8_LOOKUP_MSK \
> +	(((1ULL << ((1 << (fib->nh_sz + 3)) - 1)) << 1) - 1) \
> +
> +#define RTE_DIR24_8_GET_TBL24(fib, ip) \
> +	((fib->tbl24[DIR24_8_TBL_IDX(DIR24_8_TBL24_VAL(ip), fib)] >> \
> +	(DIR24_8_PSD_IDX(DIR24_8_TBL24_VAL(ip), fib) * \
> +	DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK) \
> +
> +#define RTE_DIR24_8_GET_TBL8(fib, res, ip) \
> +	((fib->tbl8[DIR24_8_TBL_IDX(DIR24_8_TBL8_VAL(res, ip), fib)] >> \
> +	(DIR24_8_PSD_IDX(DIR24_8_TBL8_VAL(res, ip), fib) * \
> +	DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK) \
>

I would strongly suggest making each of the above macros into inline
functions instead. That would improve readability, since you would have
parameter types and could split things across lines more easily. Also,
some comments might be good too.

> +
> +
> +struct rte_dir24_8_tbl {
> +	uint32_t number_tbl8s; /**< Total number of tbl8s. */
> +	uint32_t cur_tbl8s; /**< Current number of tbl8s.
*/
> +	uint64_t def_nh;
> +	enum rte_dir24_8_nh_sz nh_sz; /**< Size of nexthop entry */
> +	uint64_t *tbl8; /**< LPM tbl8 table. */
> +	uint64_t *tbl8_idxes;
> +	uint64_t tbl24[0] __rte_cache_aligned; /**< LPM tbl24 table. */
> +};
> +
> +struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int socket_id,
> +	enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh);
> +void rte_dir24_8_free(void *fib_p);
> +int rte_dir24_8_modify(struct rte_rib *rib, uint32_t key,
> +	uint8_t depth, uint64_t next_hop, enum rte_rib_op op);
> +int rte_dir24_8_lookup_bulk_1b(void *fib_p, const uint32_t *ips,
> +	uint64_t *next_hops, const unsigned n);
> +int rte_dir24_8_lookup_bulk_2b(void *fib_p, const uint32_t *ips,
> +	uint64_t *next_hops, const unsigned n);
> +int rte_dir24_8_lookup_bulk_4b(void *fib_p, const uint32_t *ips,
> +	uint64_t *next_hops, const unsigned n);
> +int rte_dir24_8_lookup_bulk_8b(void *fib_p, const uint32_t *ips,
> +	uint64_t *next_hops, const unsigned n);
> +
> +
> +static inline int
> +rte_dir24_8_lookup(void *fib_p, uint32_t ip, uint64_t *next_hop)

Why use void * as the parameter type, since the proper type is defined
just above?

> +{
> +	uint64_t res;
> +	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
> +
> +	RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ip == NULL) ||
> +		(next_hop == NULL)), -EINVAL);
> +
> +	res = RTE_DIR24_8_GET_TBL24(fib, ip);
> +	if (unlikely((res & RTE_DIR24_8_VALID_EXT_ENT) ==
> +		RTE_DIR24_8_VALID_EXT_ENT)) {
> +		res = RTE_DIR24_8_GET_TBL8(fib, res, ip);
> +	}
> +	*next_hop = res >> 1;
> +	return 0;
> +}

Do we need this static inline function? Can the bulk functions do the
job on their own? If we can remove this, we can move most of the header
file contents, especially the structures, out of the public header. That
would greatly improve the ease with which the ABI can be maintained.

> +
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_DIR24_8_H_ */
> +
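
To make the entry-size wrapper suggestion above concrete, here is a rough
standalone sketch. The struct, constants and function names are simplified
stand-ins, not the patch's real types, and it skips the prefetching and
error checks - the point is only that the generic function takes the entry
size as an explicit parameter, so each thin wrapper hands the compiler a
compile-time constant:

```c
#include <stdint.h>
#include <assert.h>

/* Simplified stand-in for the patch's rte_dir24_8_tbl; illustrative only. */
struct dir24_8_tbl {
	void *tbl24;
	void *tbl8;
};

#define VALID_EXT_ENT 0x01
#define TBL8_GRP_NUM_ENT 256

/* One generic implementation: entry_sz is a parameter rather than being
 * read from the table at run time. When called with a literal size, the
 * switches below fold away and each wrapper specialises to direct loads
 * of the right width - no macros needed. */
static inline __attribute__((always_inline)) int
lookup_bulk(struct dir24_8_tbl *fib, const uint32_t *ips,
	uint64_t *next_hops, unsigned int n, unsigned int entry_sz)
{
	unsigned int i;
	uint64_t tmp;

	for (i = 0; i < n; i++) {
		uint32_t idx24 = ips[i] >> 8;

		switch (entry_sz) {
		case 1: tmp = ((const uint8_t *)fib->tbl24)[idx24]; break;
		case 2: tmp = ((const uint16_t *)fib->tbl24)[idx24]; break;
		case 4: tmp = ((const uint32_t *)fib->tbl24)[idx24]; break;
		default: tmp = ((const uint64_t *)fib->tbl24)[idx24]; break;
		}
		if (tmp & VALID_EXT_ENT) {
			/* Entry holds a tbl8 group index, not a next hop. */
			uint64_t idx8 = (uint8_t)ips[i] +
				(tmp >> 1) * TBL8_GRP_NUM_ENT;
			switch (entry_sz) {
			case 1: tmp = ((const uint8_t *)fib->tbl8)[idx8]; break;
			case 2: tmp = ((const uint16_t *)fib->tbl8)[idx8]; break;
			case 4: tmp = ((const uint32_t *)fib->tbl8)[idx8]; break;
			default: tmp = ((const uint64_t *)fib->tbl8)[idx8]; break;
			}
		}
		next_hops[i] = tmp >> 1;
	}
	return 0;
}

/* Thin public wrappers: the size argument is a literal constant. */
int lookup_bulk_1b(struct dir24_8_tbl *f, const uint32_t *ips,
	uint64_t *nh, unsigned int n)
{ return lookup_bulk(f, ips, nh, n, 1); }

int lookup_bulk_2b(struct dir24_8_tbl *f, const uint32_t *ips,
	uint64_t *nh, unsigned int n)
{ return lookup_bulk(f, ips, nh, n, 2); }
```

Whether the specialisation actually buys anything over a single run-time
nh_sz check is exactly what would need measuring, e.g. with l3fwd.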
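
Likewise, to illustrate the inline-function suggestion for the lookup
macros, a minimal sketch of what the RTE_DIR24_8_GET_TBL24 logic might
look like as a function - again with simplified stand-in types, and named
differently from the patch's macros to make clear it is only a sketch:

```c
#include <stdint.h>
#include <assert.h>

/* Simplified stand-in for the patch's table; nh_sz is log2 of the
 * next-hop entry size in bytes (0..3), as in the quoted header. */
struct dir24_8_tbl {
	unsigned int nh_sz;
	uint64_t *tbl24;
};

/* Same computation as the RTE_DIR24_8_GET_TBL24 macro, but with typed
 * parameters, named intermediates and room for comments. */
static inline uint64_t
get_tbl24(const struct dir24_8_tbl *fib, uint32_t ip)
{
	uint32_t entry = ip >> 8;                        /* tbl24 index */
	uint32_t word = entry >> (3 - fib->nh_sz);       /* uint64_t slot */
	uint32_t pos = entry & ((1u << (3 - fib->nh_sz)) - 1);
	uint32_t bits = 8u * (1u << fib->nh_sz);         /* entry width */
	uint64_t mask = (bits == 64) ? UINT64_MAX : (1ULL << bits) - 1;

	return (fib->tbl24[word] >> (pos * bits)) & mask;
}
```

The generated code should be identical to the macro version; the gain is
purely in readability and type checking.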