DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com>
To: Robin Jarry <rjarry@redhat.com>, <dev@dpdk.org>
Subject: Re: [RFC PATCH dpdk] fib6: implement RCU rule reclamation
Date: Tue, 14 Oct 2025 19:16:30 +0100	[thread overview]
Message-ID: <a06b6bf4-ea23-47f9-9d97-dc72eb90b0e9@intel.com> (raw)
In-Reply-To: <20250610145341.38271-2-rjarry@redhat.com>

Hi Robin,

Thanks for the patch, see comments inline

On 6/10/2025 3:53 PM, Robin Jarry wrote:
> Currently, for the TRIE algorithm (actually, it should be called
> DIR-24-8-8-8-8-8-8-8-8-8-8-8-8),

Technically yes, but we can look at this structure as an n-ary tree 
(tree with associativity equal to n), where the first layer of the tree 
has associativity of 2^24 and subsequent layers (up to max 13) have 
associativity of 2^8. It has properties to keep prefixes, so it can also 
be considered a Trie.

>   the tbl8 group is freed even though the
> readers might be using the tbl8 group entries. The freed tbl8 group can
> be reallocated quickly. As a result, lookup may be performed
> incorrectly.
>
> To address that, RCU QSBR is integrated for safe tbl8 group reclamation.
>
> Cc: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> Signed-off-by: Robin Jarry <rjarry@redhat.com>
> ---
>
> Notes:
>      This is a semi-copy-paste of the FIB4 implementation.
>      
>      I couldn't understand the implementation of trie_modify with regard to
>      depth_diff handling.
>      
>      The unit tests fail because depth_diff is always 0 when deleting a route
>      which causes any subsequent add to fail with a -ENOSPC error.
>      
>      Vladimir, could you give some more insights on the matter?

I'll be happy to explain.

depth_diff represents a number of layers(tbl8 entries) that should be 
built towards the target new prefix(prefix we add) from existing least 
significant prefix that covers prefix we are going to add, or number of 
destroyed layers from the prefix we are going to delete to the 
next least significant prefix that covers prefix we are going to delete.

Here, we determine whether a prebuilt path already exists for the 
prefix’s location or if it needs to be created (by allocating depth_diff 
tbl8 entries). Or find the number of tbl8 we can free.

There is a bug in the current implementation, where the number of tbl8 
entries to be freed is always zero when a prefix is deleted. I’ll send a 
fix for this issue, please refer to it for details.

>
>   app/test/test_fib6.c | 214 +++++++++++++++++++++++++++++++++++++++++++
>   lib/fib/rte_fib6.c   |  15 +++
>   lib/fib/rte_fib6.h   |  54 +++++++++++
>   lib/fib/trie.c       |  89 ++++++++++++++++--
>   lib/fib/trie.h       |   8 ++
>   5 files changed, 373 insertions(+), 7 deletions(-)
>
<snip>
> diff --git a/lib/fib/rte_fib6.h b/lib/fib/rte_fib6.h
> index 42e613870886..a4e1d34fb1c4 100644
> --- a/lib/fib/rte_fib6.h
> +++ b/lib/fib/rte_fib6.h
> @@ -19,6 +19,7 @@
>   
>   #include <rte_common.h>
>   #include <rte_ip6.h>
> +#include <rte_rcu_qsbr.h>
>   
>   #ifdef __cplusplus
>   extern "C" {
> @@ -31,6 +32,19 @@ extern "C" {
>   struct rte_fib6;
>   struct rte_rib6;
>   
> +/** @internal Default RCU defer queue entries to reclaim in one go. */
> +#define RTE_FIB6_RCU_DQ_RECLAIM_MAX     16
> +/** @internal Default RCU defer queue size. */
> +#define RTE_FIB6_RCU_DQ_RECLAIM_SZ      128
> +
> +/** RCU reclamation modes */
> +enum rte_fib6_qsbr_mode {
> +	/** Create defer queue for reclaim. */
> +	RTE_FIB6_QSBR_MODE_DQ = 0,
> +	/** Use blocking mode reclaim. No defer queue created. */
> +	RTE_FIB6_QSBR_MODE_SYNC
> +};
> +
>   /** Type of FIB struct */
>   enum rte_fib6_type {
>   	RTE_FIB6_DUMMY,		/**< RIB6 tree based FIB */
> @@ -82,6 +96,26 @@ struct rte_fib6_conf {
>   	};
>   };
>   
> +/** FIB RCU QSBR configuration structure. */
> +struct rte_fib6_rcu_config {
> +	/** RCU QSBR variable. */
> +	struct rte_rcu_qsbr *v;
> +	/** Mode of RCU QSBR. See RTE_FIB_QSBR_MODE_xxx.
typo, did you mean RTE_FIB6_QSBR_MODE_xxx ?
> +	 * Default: RTE_FIB6_QSBR_MODE_DQ, create defer queue for reclaim.
> +	 */
> +	enum rte_fib6_qsbr_mode mode;
> +	/** RCU defer queue size.
> +	 * Default: RTE_FIB6_RCU_DQ_RECLAIM_SZ.
> +	 */
> +	uint32_t dq_size;
> +	/** Threshold to trigger auto reclaim. */
> +	uint32_t reclaim_thd;
> +	/** Max entries to reclaim in one go.
> +	 * Default: RTE_FIB6_RCU_DQ_RECLAIM_MAX.
> +	 */
> +	uint32_t reclaim_max;
> +};
> +
>   /**
>    * Free an FIB object.
>    *
> @@ -217,6 +251,26 @@ rte_fib6_get_rib(struct rte_fib6 *fib);
>   int
>   rte_fib6_select_lookup(struct rte_fib6 *fib, enum rte_fib6_lookup_type type);
>   
> +/**
> + * Associate RCU QSBR variable with a FIB object.
> + *
> + * @param fib
> + *   FIB object handle
> + * @param cfg
> + *   RCU QSBR configuration
> + * @return
> + *   On success - 0
> + *   On error - 1 with error code set in rte_errno.
Please fix this doxygen documentation to reflect actual behavior of the 
function with respect to return value
> + *   Possible rte_errno codes are:
> + *   - EINVAL - invalid pointer
> + *   - EEXIST - already added QSBR
> + *   - ENOMEM - memory allocation failure
> + *   - ENOTSUP - not supported by configured dataplane algorithm
> + */
> +__rte_experimental
> +int
> +rte_fib6_rcu_qsbr_add(struct rte_fib6 *fib, struct rte_fib6_rcu_config *cfg);
> +
<snip>

-- 
Regards,
Vladimir


      parent reply	other threads:[~2025-10-14 18:16 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-10 14:53 Robin Jarry
2025-10-10  8:55 ` Robin Jarry
2025-10-14 18:07   ` Medvedkin, Vladimir
2025-10-14 18:16 ` Medvedkin, Vladimir [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a06b6bf4-ea23-47f9-9d97-dc72eb90b0e9@intel.com \
    --to=vladimir.medvedkin@intel.com \
    --cc=dev@dpdk.org \
    --cc=rjarry@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).