From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 29 Mar 2018 11:27:46 +0100
From: Bruce Richardson
To: Medvedkin Vladimir
Cc: dev@dpdk.org
Message-ID: <20180329102746.GA4004@bricha3-MOBL.ger.corp.intel.com>
References: <1519249495-16594-1-git-send-email-medvedkinv@gmail.com> <1519249495-16594-2-git-send-email-medvedkinv@gmail.com>
In-Reply-To: <1519249495-16594-2-git-send-email-medvedkinv@gmail.com>
Organization: Intel Research and Development Ireland Ltd.
Subject: Re: [dpdk-dev] [PATCH v2 1/2] Add RIB library
List-Id: DPDK patches and discussions

On Wed, Feb 21, 2018 at 09:44:54PM +0000, Medvedkin Vladimir wrote:
> RIB is an alternative to the current LPM library.
> It solves the following problems:
> - Increases the speed of control plane operations against LPM, such as
>   adding/deleting routes
> - Adds abstraction from dataplane algorithms, so it is possible to add
>   different IP route lookup algorithms such as DXR/poptrie/lpc-trie/etc.
>   in addition to the current dir24_8
> - It is possible to keep user-defined, application-specific additional
>   information in struct rte_rib_node, which represents a route entry.
>   It can be a next hop/set of next hops (i.e. active and feasible),
>   pointers linking rte_rib_nodes based on some criteria (i.e. next_hop),
>   or plenty of additional control plane information.
> - For the dir24_8 implementation it is possible to remove the
>   rte_lpm_tbl_entry.depth field, which helps to save 6 bits.
> - The new dir24_8 implementation also supports different next hop sizes
>   (1/2/4/8 bytes per next hop)
> - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eliminate a
>   ternary operator. Instead it returns a special default value if there
>   is no route.
>
> Signed-off-by: Medvedkin Vladimir
> ---

Hi again, some initial comments on the dir24_8 files below.
/Bruce > config/common_base | 6 + > doc/api/doxy-api.conf | 1 + > lib/Makefile | 2 + > lib/librte_rib/Makefile | 22 ++ > lib/librte_rib/rte_dir24_8.c | 482 +++++++++++++++++++++++++++++++++ > lib/librte_rib/rte_dir24_8.h | 116 ++++++++ > lib/librte_rib/rte_rib.c | 526 +++++++++++++++++++++++++++++++++++++ > lib/librte_rib/rte_rib.h | 322 +++++++++++++++++++++++ > lib/librte_rib/rte_rib_version.map | 18 ++ > mk/rte.app.mk | 1 + > 10 files changed, 1496 insertions(+) > create mode 100644 lib/librte_rib/Makefile > create mode 100644 lib/librte_rib/rte_dir24_8.c > create mode 100644 lib/librte_rib/rte_dir24_8.h > create mode 100644 lib/librte_rib/rte_rib.c > create mode 100644 lib/librte_rib/rte_rib.h > create mode 100644 lib/librte_rib/rte_rib_version.map > > diff --git a/lib/librte_rib/rte_dir24_8.c b/lib/librte_rib/rte_dir24_8.c > new file mode 100644 > index 0000000..a12f882 > --- /dev/null > +++ b/lib/librte_rib/rte_dir24_8.c > @@ -0,0 +1,482 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2018 Vladimir Medvedkin > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +#include > +#include > + > +#include > +#include > + > +#define BITMAP_SLAB_BIT_SIZE_LOG2 6 > +#define BITMAP_SLAB_BIT_SIZE (1 << BITMAP_SLAB_BIT_SIZE_LOG2) > +#define BITMAP_SLAB_BITMASK (BITMAP_SLAB_BIT_SIZE - 1) > + > +#define ROUNDUP(x, y) RTE_ALIGN_CEIL(x, (1 << (32 - y))) > + > +static __rte_always_inline __attribute__((pure)) void * > +get_tbl24_p(struct rte_dir24_8_tbl *fib, uint32_t ip) > +{ > + return (void *)&((uint8_t *)fib->tbl24)[(ip & > + RTE_DIR24_8_TBL24_MASK) >> (8 - fib->nh_sz)]; > +} > + > +#define LOOKUP_FUNC(suffix, type, bulk_prefetch) \ > +int rte_dir24_8_lookup_bulk_##suffix(void *fib_p, const uint32_t *ips, \ > + uint64_t *next_hops, const unsigned n) \ > +{ \ > + struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p; \ > + uint64_t tmp; \ > + uint32_t i; \ > + uint32_t 
prefetch_offset = RTE_MIN((unsigned)bulk_prefetch, n); \
> +	\
> +	RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ips == NULL) || \
> +		(next_hops == NULL)), -EINVAL); \
> +	\
> +	for (i = 0; i < prefetch_offset; i++) \
> +		rte_prefetch0(get_tbl24_p(fib, ips[i])); \
> +	for (i = 0; i < (n - prefetch_offset); i++) { \
> +		rte_prefetch0(get_tbl24_p(fib, ips[i + prefetch_offset])); \
> +		tmp = ((type *)fib->tbl24)[ips[i] >> 8]; \
> +		if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) == \
> +			RTE_DIR24_8_VALID_EXT_ENT)) { \
> +			tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] + \
> +				((tmp >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> +		} \
> +		next_hops[i] = tmp >> 1; \
> +	} \
> +	for (; i < n; i++) { \
> +		tmp = ((type *)fib->tbl24)[ips[i] >> 8]; \
> +		if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) == \
> +			RTE_DIR24_8_VALID_EXT_ENT)) { \
> +			tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] + \
> +				((tmp >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> +		} \
> +		next_hops[i] = tmp >> 1; \
> +	} \
> +	return 0; \
> +} \

What is the advantage of doing this as a macro? Unless I'm missing
something, "suffix" is never actually used in the function at all, and
you reference the size of the data from fib->nh_sz. Therefore there can
be no performance benefit from having such a lookup function, that I can
see.

Therefore, if performance is ok, I suggest just making a single
lookup_bulk function that works with all sizes - as the inlined lookup
function in the header does.

Alternatively, if you do want specific functions for each entry size,
you still don't need macros. Write a single function that takes the
entry size as a final parameter and use that in calculations rather than
nh_sz. Then wrap that function in the set of public ones, passing in the
final size parameter explicitly as "1", "2", "4" or "8". The compiler
will then know that as a compile-time constant and generate the correct
code for each size. However, for this path I suggest you check for any
resulting performance improvement, e.g.
with l3fwd, as I think it's not likely to be significant. > + > +static void > +write_to_fib(void *ptr, uint64_t val, enum rte_dir24_8_nh_sz size, int n) > +{ > + int i; > + uint8_t *ptr8 = (uint8_t *)ptr; > + uint16_t *ptr16 = (uint16_t *)ptr; > + uint32_t *ptr32 = (uint32_t *)ptr; > + uint64_t *ptr64 = (uint64_t *)ptr; > + > + switch (size) { > + case RTE_DIR24_8_1B: > + for (i = 0; i < n; i++) > + ptr8[i] = (uint8_t)val; > + break; > + case RTE_DIR24_8_2B: > + for (i = 0; i < n; i++) > + ptr16[i] = (uint16_t)val; > + break; > + case RTE_DIR24_8_4B: > + for (i = 0; i < n; i++) > + ptr32[i] = (uint32_t)val; > + break; > + case RTE_DIR24_8_8B: > + for (i = 0; i < n; i++) > + ptr64[i] = (uint64_t)val; > + break; > + } > +} > + > +static int > +tbl8_get_idx(struct rte_dir24_8_tbl *fib) > +{ > + uint32_t i; > + int bit_idx; > + > + for (i = 0; (i < (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2)) && > + (fib->tbl8_idxes[i] == UINT64_MAX); i++) > + ; > + if (i <= (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2)) { > + bit_idx = __builtin_ctzll(~fib->tbl8_idxes[i]); > + fib->tbl8_idxes[i] |= (1ULL << bit_idx); > + return (i << BITMAP_SLAB_BIT_SIZE_LOG2) + bit_idx; > + } > + return -ENOSPC; > +} > + > +static inline void > +tbl8_free_idx(struct rte_dir24_8_tbl *fib, int idx) > +{ > + fib->tbl8_idxes[idx >> BITMAP_SLAB_BIT_SIZE_LOG2] &= > + ~(1ULL << (idx & BITMAP_SLAB_BITMASK)); > +} > + > +static int > +tbl8_alloc(struct rte_dir24_8_tbl *fib, uint64_t nh) > +{ > + int tbl8_idx; > + uint8_t *tbl8_ptr; > + > + tbl8_idx = tbl8_get_idx(fib); > + if (tbl8_idx < 0) > + return tbl8_idx; > + tbl8_ptr = (uint8_t *)fib->tbl8 + > + ((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) << > + fib->nh_sz); > + /*Init tbl8 entries with nexthop from tbl24*/ > + write_to_fib((void *)tbl8_ptr, nh| > + RTE_DIR24_8_VALID_EXT_ENT, fib->nh_sz, > + RTE_DIR24_8_TBL8_GRP_NUM_ENT); > + return tbl8_idx; > +} > + > +static void > +tbl8_recycle(struct rte_dir24_8_tbl *fib, uint32_t ip, uint64_t tbl8_idx) > 
+{ > + int i; > + uint64_t nh; > + uint8_t *ptr8; > + uint16_t *ptr16; > + uint32_t *ptr32; > + uint64_t *ptr64; > + > + switch (fib->nh_sz) { > + case RTE_DIR24_8_1B: > + ptr8 = &((uint8_t *)fib->tbl8)[tbl8_idx * > + RTE_DIR24_8_TBL8_GRP_NUM_ENT]; > + nh = *ptr8; > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) { > + if (nh != ptr8[i]) > + return; > + } > + ((uint8_t *)fib->tbl24)[ip >> 8] = > + nh & ~RTE_DIR24_8_VALID_EXT_ENT; > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) > + ptr8[i] = 0; > + break; > + case RTE_DIR24_8_2B: > + ptr16 = &((uint16_t *)fib->tbl8)[tbl8_idx * > + RTE_DIR24_8_TBL8_GRP_NUM_ENT]; > + nh = *ptr16; > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) { > + if (nh != ptr16[i]) > + return; > + } > + ((uint16_t *)fib->tbl24)[ip >> 8] = > + nh & ~RTE_DIR24_8_VALID_EXT_ENT; > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) > + ptr16[i] = 0; > + break; > + case RTE_DIR24_8_4B: > + ptr32 = &((uint32_t *)fib->tbl8)[tbl8_idx * > + RTE_DIR24_8_TBL8_GRP_NUM_ENT]; > + nh = *ptr32; > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) { > + if (nh != ptr32[i]) > + return; > + } > + ((uint32_t *)fib->tbl24)[ip >> 8] = > + nh & ~RTE_DIR24_8_VALID_EXT_ENT; > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) > + ptr32[i] = 0; > + break; > + case RTE_DIR24_8_8B: > + ptr64 = &((uint64_t *)fib->tbl8)[tbl8_idx * > + RTE_DIR24_8_TBL8_GRP_NUM_ENT]; > + nh = *ptr64; > + for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) { > + if (nh != ptr64[i]) > + return; > + } > + ((uint64_t *)fib->tbl24)[ip >> 8] = > + nh & ~RTE_DIR24_8_VALID_EXT_ENT; > + for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) > + ptr64[i] = 0; > + break; > + } > + tbl8_free_idx(fib, tbl8_idx); > +} > + > +static int > +install_to_fib(struct rte_dir24_8_tbl *fib, uint32_t ledge, uint32_t redge, > + uint64_t next_hop) > +{ > + uint64_t tbl24_tmp; > + int tbl8_idx; > + int tmp_tbl8_idx; > + uint8_t *tbl8_ptr; > + > + /*case for 0.0.0.0/0*/ > + if (unlikely((ledge == 0) && 
(redge == 0))) { > + write_to_fib(fib->tbl24, next_hop << 1, fib->nh_sz, 1 << 24); > + return 0; > + } > + if (ROUNDUP(ledge, 24) <= redge) { > + if (ledge < ROUNDUP(ledge, 24)) { > + tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge); > + if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) != > + RTE_DIR24_8_VALID_EXT_ENT) { > + tbl8_idx = tbl8_alloc(fib, tbl24_tmp); > + tmp_tbl8_idx = tbl8_get_idx(fib); > + if ((tbl8_idx < 0) || (tmp_tbl8_idx < 0)) > + return -ENOSPC; > + tbl8_free_idx(fib, tmp_tbl8_idx); > + /*update dir24 entry with tbl8 index*/ > + write_to_fib(get_tbl24_p(fib, ledge), > + (tbl8_idx << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, 1); > + } else > + tbl8_idx = tbl24_tmp >> 1; > + tbl8_ptr = (uint8_t *)fib->tbl8 + > + (((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) + > + (ledge & ~RTE_DIR24_8_TBL24_MASK)) << > + fib->nh_sz); > + /*update tbl8 with new next hop*/ > + write_to_fib((void *)tbl8_ptr, (next_hop << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, ROUNDUP(ledge, 24) - ledge); > + tbl8_recycle(fib, ledge, tbl8_idx); > + } > + if (ROUNDUP(ledge, 24) < (redge & RTE_DIR24_8_TBL24_MASK)) { > + write_to_fib(get_tbl24_p(fib, ROUNDUP(ledge, 24)), > + next_hop << 1, fib->nh_sz, > + ((redge & RTE_DIR24_8_TBL24_MASK) - > + ROUNDUP(ledge, 24)) >> 8); > + } > + if (redge & ~RTE_DIR24_8_TBL24_MASK) { > + tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, redge); > + if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) != > + RTE_DIR24_8_VALID_EXT_ENT) { > + tbl8_idx = tbl8_alloc(fib, tbl24_tmp); > + if (tbl8_idx < 0) > + return -ENOSPC; > + /*update dir24 entry with tbl8 index*/ > + write_to_fib(get_tbl24_p(fib, redge), > + (tbl8_idx << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, 1); > + } else > + tbl8_idx = tbl24_tmp >> 1; > + tbl8_ptr = (uint8_t *)fib->tbl8 + > + ((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) << > + fib->nh_sz); > + /*update tbl8 with new next hop*/ > + write_to_fib((void *)tbl8_ptr, (next_hop << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, redge & 
~RTE_DIR24_8_TBL24_MASK); > + tbl8_recycle(fib, redge, tbl8_idx); > + } > + } else { > + tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge); > + if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) != > + RTE_DIR24_8_VALID_EXT_ENT) { > + tbl8_idx = tbl8_alloc(fib, tbl24_tmp); > + if (tbl8_idx < 0) > + return -ENOSPC; > + /*update dir24 entry with tbl8 index*/ > + write_to_fib(get_tbl24_p(fib, ledge), > + (tbl8_idx << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, 1); > + } else > + tbl8_idx = tbl24_tmp >> 1; > + tbl8_ptr = (uint8_t *)fib->tbl8 + > + (((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) + > + (ledge & ~RTE_DIR24_8_TBL24_MASK)) << > + fib->nh_sz); > + /*update tbl8 with new next hop*/ > + write_to_fib((void *)tbl8_ptr, (next_hop << 1)| > + RTE_DIR24_8_VALID_EXT_ENT, > + fib->nh_sz, redge - ledge); > + tbl8_recycle(fib, ledge, tbl8_idx); > + } > + return 0; > +} > + > +static int > +modify_fib(struct rte_rib *rib, uint32_t ip, uint8_t depth, > + uint64_t next_hop) > +{ > + struct rte_rib_node *tmp = NULL; > + struct rte_dir24_8_tbl *fib; > + uint32_t ledge, redge; > + int ret; > + > + fib = rib->fib; > + > + if (next_hop > DIR24_8_MAX_NH(fib)) > + return -EINVAL; > + > + ledge = ip; > + do { > + tmp = rte_rib_tree_get_nxt(rib, ip, depth, tmp, > + RTE_RIB_GET_NXT_COVER); > + if (tmp != NULL) { > + if (tmp->depth == depth) > + continue; > + redge = tmp->key; > + if (ledge == redge) { > + ledge = redge + > + (uint32_t)(1ULL << (32 - tmp->depth)); > + continue; > + } > + ret = install_to_fib(fib, ledge, redge, > + next_hop); > + if (ret != 0) > + return ret; > + ledge = redge + > + (uint32_t)(1ULL << (32 - tmp->depth)); > + } else { > + redge = ip + (uint32_t)(1ULL << (32 - depth)); > + ret = install_to_fib(fib, ledge, redge, > + next_hop); > + if (ret != 0) > + return ret; > + } > + } while (tmp); > + > + return 0; > +} > + > +int > +rte_dir24_8_modify(struct rte_rib *rib, uint32_t ip, uint8_t depth, > + uint64_t next_hop, enum rte_rib_op op) > +{ > + struct rte_dir24_8_tbl 
*fib; > + struct rte_rib_node *tmp = NULL; > + struct rte_rib_node *node; > + struct rte_rib_node *parent; > + int ret = 0; > + > + if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH)) > + return -EINVAL; > + > + fib = rib->fib; > + RTE_ASSERT(fib); > + > + ip &= (uint32_t)(UINT64_MAX << (32 - depth)); > + > + node = rte_rib_tree_lookup_exact(rib, ip, depth); > + switch (op) { > + case RTE_RIB_ADD: > + if (node != NULL) { > + if (node->nh == next_hop) > + return 0; > + ret = modify_fib(rib, ip, depth, next_hop); > + if (ret == 0) > + node->nh = next_hop; > + return 0; > + } > + if (depth > 24) { > + tmp = rte_rib_tree_get_nxt(rib, ip, 24, NULL, > + RTE_RIB_GET_NXT_COVER); > + if ((tmp == NULL) && > + (fib->cur_tbl8s >= fib->number_tbl8s)) > + return -ENOSPC; > + > + } > + node = rte_rib_tree_insert(rib, ip, depth); > + if (node == NULL) > + return -rte_errno; > + node->nh = next_hop; > + parent = rte_rib_tree_lookup_parent(node); > + if ((parent != NULL) && (parent->nh == next_hop)) > + return 0; > + ret = modify_fib(rib, ip, depth, next_hop); > + if (ret) { > + rte_rib_tree_remove(rib, ip, depth); > + return ret; > + } > + if ((depth > 24) && (tmp == NULL)) > + fib->cur_tbl8s++; > + return 0; > + case RTE_RIB_DEL: > + if (node == NULL) > + return -ENOENT; > + > + parent = rte_rib_tree_lookup_parent(node); > + if (parent != NULL) { > + if (parent->nh != node->nh) > + ret = modify_fib(rib, ip, depth, parent->nh); > + } else > + ret = modify_fib(rib, ip, depth, fib->def_nh); > + if (ret == 0) { > + rte_rib_tree_remove(rib, ip, depth); > + if (depth > 24) { > + tmp = rte_rib_tree_get_nxt(rib, ip, 24, NULL, > + RTE_RIB_GET_NXT_COVER); > + if (tmp == NULL) > + fib->cur_tbl8s--; > + } > + } > + return ret; > + default: > + break; > + } > + return -EINVAL; > +} > + > +struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int socket_id, > + enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh) > +{ > + char mem_name[RTE_RIB_NAMESIZE]; > + struct rte_dir24_8_tbl *fib; > + > + 
snprintf(mem_name, sizeof(mem_name), "FIB_%s", name); > + fib = rte_zmalloc_socket(name, sizeof(struct rte_dir24_8_tbl) + > + RTE_DIR24_8_TBL24_NUM_ENT * (1 << nh_sz), RTE_CACHE_LINE_SIZE, > + socket_id); > + if (fib == NULL) > + return fib; > + > + snprintf(mem_name, sizeof(mem_name), "TBL8_%s", name); > + fib->tbl8 = rte_zmalloc_socket(mem_name, RTE_DIR24_8_TBL8_GRP_NUM_ENT * > + (1 << nh_sz) * RTE_DIR24_8_TBL8_NUM_GROUPS, > + RTE_CACHE_LINE_SIZE, socket_id); > + if (fib->tbl8 == NULL) { > + rte_free(fib); > + return NULL; > + } > + fib->def_nh = def_nh; > + fib->nh_sz = nh_sz; > + fib->number_tbl8s = RTE_MIN((uint32_t)RTE_DIR24_8_TBL8_NUM_GROUPS, > + DIR24_8_MAX_NH(fib)); > + > + snprintf(mem_name, sizeof(mem_name), "TBL8_idxes_%s", name); > + fib->tbl8_idxes = rte_zmalloc_socket(mem_name, > + RTE_ALIGN_CEIL(fib->number_tbl8s, 64) >> 3, > + RTE_CACHE_LINE_SIZE, socket_id); > + if (fib->tbl8_idxes == NULL) { > + rte_free(fib->tbl8); > + rte_free(fib); > + return NULL; > + } > + > + return fib; > +} > + > +void > +rte_dir24_8_free(void *fib_p) > +{ > + struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p; > + > + rte_free(fib->tbl8_idxes); > + rte_free(fib->tbl8); > + rte_free(fib); > +} > + > +LOOKUP_FUNC(1b, uint8_t, 5) > +LOOKUP_FUNC(2b, uint16_t, 6) > +LOOKUP_FUNC(4b, uint32_t, 15) > +LOOKUP_FUNC(8b, uint64_t, 12) > diff --git a/lib/librte_rib/rte_dir24_8.h b/lib/librte_rib/rte_dir24_8.h > new file mode 100644 > index 0000000..f779409 > --- /dev/null > +++ b/lib/librte_rib/rte_dir24_8.h > @@ -0,0 +1,116 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2018 Vladimir Medvedkin > + */ > + > +#ifndef _RTE_DIR24_8_H_ > +#define _RTE_DIR24_8_H_ > + > +/** > + * @file > + * RTE Longest Prefix Match (LPM) > + */ > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +/** @internal Total number of tbl24 entries. */ > +#define RTE_DIR24_8_TBL24_NUM_ENT (1 << 24) > + > +/** Maximum depth value possible for IPv4 LPM. 
*/
> +#define RTE_DIR24_8_MAX_DEPTH 32
> +
> +/** @internal Number of entries in a tbl8 group. */
> +#define RTE_DIR24_8_TBL8_GRP_NUM_ENT 256
> +
> +/** @internal Total number of tbl8 groups in the tbl8. */
> +#define RTE_DIR24_8_TBL8_NUM_GROUPS 65536
> +
> +/** @internal bitmask with valid and valid_group fields set */
> +#define RTE_DIR24_8_VALID_EXT_ENT 0x01
> +
> +#define RTE_DIR24_8_TBL24_MASK 0xffffff00
> +
> +/** Size of nexthop (1 << nh_sz) bits */
> +enum rte_dir24_8_nh_sz {
> +	RTE_DIR24_8_1B,
> +	RTE_DIR24_8_2B,
> +	RTE_DIR24_8_4B,
> +	RTE_DIR24_8_8B
> +};
> +
> +
> +#define DIR24_8_BITS_IN_NH(fib) (8 * (1 << fib->nh_sz))
> +#define DIR24_8_MAX_NH(fib) ((1ULL << (DIR24_8_BITS_IN_NH(fib) - 1)) - 1)
> +
> +#define DIR24_8_TBL_IDX(a, fib) ((a) >> (3 - fib->nh_sz))
> +#define DIR24_8_PSD_IDX(a, fib) ((a) & ((1 << (3 - fib->nh_sz)) - 1))
> +
> +#define DIR24_8_TBL24_VAL(ip) (ip >> 8)
> +#define DIR24_8_TBL8_VAL(res, ip) \
> +	((res >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT + (uint8_t)ip) \
> +
> +#define DIR24_8_LOOKUP_MSK \
> +	(((1ULL << ((1 << (fib->nh_sz + 3)) - 1)) << 1) - 1) \
> +
> +#define RTE_DIR24_8_GET_TBL24(fib, ip) \
> +	((fib->tbl24[DIR24_8_TBL_IDX(DIR24_8_TBL24_VAL(ip), fib)] >> \
> +	(DIR24_8_PSD_IDX(DIR24_8_TBL24_VAL(ip), fib) * \
> +	DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK) \
> +
> +#define RTE_DIR24_8_GET_TBL8(fib, res, ip) \
> +	((fib->tbl8[DIR24_8_TBL_IDX(DIR24_8_TBL8_VAL(res, ip), fib)] >> \
> +	(DIR24_8_PSD_IDX(DIR24_8_TBL8_VAL(res, ip), fib) * \
> +	DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK) \
>

I would strongly suggest making each of the above macros into inline
functions instead. That would improve readability, since you would have
parameter types and could split things across lines more easily. Also,
some comments might be good too.

> +
> +
> +struct rte_dir24_8_tbl {
> +	uint32_t number_tbl8s; /**< Total number of tbl8s. */
> +	uint32_t cur_tbl8s; /**< Current number of tbl8s.
*/
> +	uint64_t def_nh;
> +	enum rte_dir24_8_nh_sz nh_sz; /**< Size of nexthop entry */
> +	uint64_t *tbl8; /**< LPM tbl8 table. */
> +	uint64_t *tbl8_idxes;
> +	uint64_t tbl24[0] __rte_cache_aligned; /**< LPM tbl24 table. */
> +};
> +
> +struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int socket_id,
> +	enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh);
> +void rte_dir24_8_free(void *fib_p);
> +int rte_dir24_8_modify(struct rte_rib *rib, uint32_t key,
> +	uint8_t depth, uint64_t next_hop, enum rte_rib_op op);
> +int rte_dir24_8_lookup_bulk_1b(void *fib_p, const uint32_t *ips,
> +	uint64_t *next_hops, const unsigned n);
> +int rte_dir24_8_lookup_bulk_2b(void *fib_p, const uint32_t *ips,
> +	uint64_t *next_hops, const unsigned n);
> +int rte_dir24_8_lookup_bulk_4b(void *fib_p, const uint32_t *ips,
> +	uint64_t *next_hops, const unsigned n);
> +int rte_dir24_8_lookup_bulk_8b(void *fib_p, const uint32_t *ips,
> +	uint64_t *next_hops, const unsigned n);
> +
> +
> +static inline int
> +rte_dir24_8_lookup(void *fib_p, uint32_t ip, uint64_t *next_hop)

Why use void * as the parameter type, since the proper type is defined
just above?

> +{
> +	uint64_t res;
> +	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
> +
> +	RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ip == NULL) ||
> +		(next_hop == NULL)), -EINVAL);
> +
> +	res = RTE_DIR24_8_GET_TBL24(fib, ip);
> +	if (unlikely((res & RTE_DIR24_8_VALID_EXT_ENT) ==
> +		RTE_DIR24_8_VALID_EXT_ENT)) {
> +		res = RTE_DIR24_8_GET_TBL8(fib, res, ip);
> +	}
> +	*next_hop = res >> 1;
> +	return 0;
> +}

Do we need this static inline function? Can the bulk functions do the
job on their own? If we can remove this, we can move most of the header
file contents, especially the structures, out of the public header. That
would greatly improve the ease with which the ABI can be maintained.

> +
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_DIR24_8_H_ */
> +
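
To make the entry-size wrapper suggestion above concrete, here is a rough
standalone sketch. The struct, constants and function names are simplified
stand-ins, not the patch's real types, and it skips the prefetching and
error checks - the point is only that the generic function takes the entry
size as an explicit parameter, so each thin wrapper hands the compiler a
compile-time constant:

```c
#include <stdint.h>
#include <assert.h>

/* Simplified stand-in for the patch's rte_dir24_8_tbl; illustrative only. */
struct dir24_8_tbl {
	void *tbl24;
	void *tbl8;
};

#define VALID_EXT_ENT 0x01
#define TBL8_GRP_NUM_ENT 256

/* One generic implementation: entry_sz is a parameter rather than being
 * read from the table at run time. When called with a literal size, the
 * switches below fold away and each wrapper specialises to direct loads
 * of the right width - no macros needed. */
static inline __attribute__((always_inline)) int
lookup_bulk(struct dir24_8_tbl *fib, const uint32_t *ips,
	uint64_t *next_hops, unsigned int n, unsigned int entry_sz)
{
	unsigned int i;
	uint64_t tmp;

	for (i = 0; i < n; i++) {
		uint32_t idx24 = ips[i] >> 8;

		switch (entry_sz) {
		case 1: tmp = ((const uint8_t *)fib->tbl24)[idx24]; break;
		case 2: tmp = ((const uint16_t *)fib->tbl24)[idx24]; break;
		case 4: tmp = ((const uint32_t *)fib->tbl24)[idx24]; break;
		default: tmp = ((const uint64_t *)fib->tbl24)[idx24]; break;
		}
		if (tmp & VALID_EXT_ENT) {
			/* Entry holds a tbl8 group index, not a next hop. */
			uint64_t idx8 = (uint8_t)ips[i] +
				(tmp >> 1) * TBL8_GRP_NUM_ENT;
			switch (entry_sz) {
			case 1: tmp = ((const uint8_t *)fib->tbl8)[idx8]; break;
			case 2: tmp = ((const uint16_t *)fib->tbl8)[idx8]; break;
			case 4: tmp = ((const uint32_t *)fib->tbl8)[idx8]; break;
			default: tmp = ((const uint64_t *)fib->tbl8)[idx8]; break;
			}
		}
		next_hops[i] = tmp >> 1;
	}
	return 0;
}

/* Thin public wrappers: the size argument is a literal constant. */
int lookup_bulk_1b(struct dir24_8_tbl *f, const uint32_t *ips,
	uint64_t *nh, unsigned int n)
{ return lookup_bulk(f, ips, nh, n, 1); }

int lookup_bulk_2b(struct dir24_8_tbl *f, const uint32_t *ips,
	uint64_t *nh, unsigned int n)
{ return lookup_bulk(f, ips, nh, n, 2); }
```

Whether the specialisation actually buys anything over a single run-time
nh_sz check is exactly what would need measuring, e.g. with l3fwd.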
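
Likewise, to illustrate the inline-function suggestion for the lookup
macros, a minimal sketch of what the RTE_DIR24_8_GET_TBL24 logic might
look like as a function - again with simplified stand-in types, and named
differently from the patch's macros to make clear it is only a sketch:

```c
#include <stdint.h>
#include <assert.h>

/* Simplified stand-in for the patch's table; nh_sz is log2 of the
 * next-hop entry size in bytes (0..3), as in the quoted header. */
struct dir24_8_tbl {
	unsigned int nh_sz;
	uint64_t *tbl24;
};

/* Same computation as the RTE_DIR24_8_GET_TBL24 macro, but with typed
 * parameters, named intermediates and room for comments. */
static inline uint64_t
get_tbl24(const struct dir24_8_tbl *fib, uint32_t ip)
{
	uint32_t entry = ip >> 8;                        /* tbl24 index */
	uint32_t word = entry >> (3 - fib->nh_sz);       /* uint64_t slot */
	uint32_t pos = entry & ((1u << (3 - fib->nh_sz)) - 1);
	uint32_t bits = 8u * (1u << fib->nh_sz);         /* entry width */
	uint64_t mask = (bits == 64) ? UINT64_MAX : (1ULL << bits) - 1;

	return (fib->tbl24[word] >> (pos * bits)) & mask;
}
```

The generated code should be identical to the macro version; the gain is
purely in readability and type checking.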