DPDK patches and discussions
 help / color / mirror / Atom feed
From: Nitin Saxena <nsaxena16@gmail.com>
To: Ankur Dwivedi <adwivedi@marvell.com>
Cc: dev@dpdk.org, jerinj@marvell.com, vladimir.medvedkin@intel.com,
	 ndabilpuram@marvell.com, pbhagavatula@marvell.com,
	skori@marvell.com,  rkudurumalla@marvell.com
Subject: Re: [PATCH v1 04/12] node: add process callback for IP4 FIB
Date: Wed, 16 Apr 2025 13:24:50 +0530	[thread overview]
Message-ID: <CAG6-93wOdPnGS9kLoRmJTS41x=W7kY0bMu5_bNbosDXg1KjdDQ@mail.gmail.com> (raw)
In-Reply-To: <20250415121052.1497155-5-adwivedi@marvell.com>

Hi Ankur,

Same comments apply to IPv6 nodes as well. See for ip4 lookup comments

Thanks,
Nitin

On Tue, Apr 15, 2025 at 6:20 PM Ankur Dwivedi <adwivedi@marvell.com> wrote:
>
> Adds the process callback function for ip4_lookup_fib node.
>
> Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com>
> ---
>  lib/node/ip4_lookup_fib.c | 164 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 164 insertions(+)
>
> diff --git a/lib/node/ip4_lookup_fib.c b/lib/node/ip4_lookup_fib.c
> index e87864e672..c535b191f8 100644
> --- a/lib/node/ip4_lookup_fib.c
> +++ b/lib/node/ip4_lookup_fib.c
> @@ -40,6 +40,169 @@ static struct ip4_lookup_fib_node_main ip4_lookup_fib_nm;
>  #define IP4_LOOKUP_NODE_PRIV1_OFF(ctx) \
>         (((struct ip4_lookup_fib_node_ctx *)ctx)->mbuf_priv1_off)
>
> +static uint16_t
> +ip4_lookup_fib_node_process(struct rte_graph *graph, struct rte_node *node, void **objs,
> +                           uint16_t nb_objs)
> +{
> +       struct rte_mbuf *mbuf0, *mbuf1, *mbuf2, *mbuf3, **pkts;
> +       struct rte_fib *fib = IP4_LOOKUP_NODE_FIB(node->ctx);
> +       const int dyn = IP4_LOOKUP_NODE_PRIV1_OFF(node->ctx);
> +       struct rte_ipv4_hdr *ipv4_hdr;
> +       uint64_t next_hop[nb_objs];
> +       uint16_t lookup_err = 0;
> +       void **to_next, **from;
> +       uint16_t last_spec = 0;
> +       rte_edge_t next_index;
> +       uint16_t n_left_from;
> +       uint32_t ip[nb_objs];
> +       uint16_t held = 0;
> +       uint32_t drop_nh;
> +       uint16_t next;
> +       int i, rc;
> +
> +       /* Speculative next */
> +       next_index = RTE_NODE_IP4_LOOKUP_NEXT_REWRITE;

Is it possible if we add next_edge in node->ctx? Save next_index in
node->init() function and at the end of process() function for
better speculative performance

Also in general, this function is assuming packets are being forwarded
to rewrite node but there be can other paths as well like
- LOCAL
- PUNT etc.

So let next_hop returned from rte_fib_lookup_bulk() determine the
next_edge (even pkt_drop node). Control plane feeds next_edge in 8B
next_hop which I defined in other patch and also below
(rte_ip4_lookup_fib_next_hop_t)


> +       /* Drop node */
> +       drop_nh = ((uint32_t)RTE_NODE_IP4_LOOKUP_NEXT_PKT_DROP) << 16;

Let drop be determined from next_hop returned from rte_ip4_lookup_fib_next_hop_t
Control plane feeds default next_hop as part of setup_fib() as follows
struct {
   struct {
     uint32_t next_hop_od;
    uint16_t next_edge;
    uint16_t reserved;
   };
   uint64_t u4;
} rte_ip4_lookpu_fib_next_hop_t;

default_nh = {,next = RTE_NODE_IP4_LOOKUP_NEXT_PKT_DROP}; which is
programmed in setup_fib
> +
> +       pkts = (struct rte_mbuf **)objs;
> +       from = objs;
> +       n_left_from = nb_objs;
> +
> +       /* Get stream for the speculated next node */
> +       to_next = rte_node_next_stream_get(graph, node, next_index, nb_objs);
> +
> +       for (i = OBJS_PER_CLINE; i < RTE_GRAPH_BURST_SIZE; i += OBJS_PER_CLINE)
> +               rte_prefetch0(&objs[i]);
> +
> +#if RTE_GRAPH_BURST_SIZE > 64
> +       for (i = 0; i < 4 && i < n_left_from; i++) {
> +               rte_prefetch0(pkts[i]);
> +               rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[i], void *,
> +                                       sizeof(struct rte_ether_hdr)));
> +       }
> +#endif
> +
> +       i = 0;
> +       while (n_left_from >= 4) {
> +#if RTE_GRAPH_BURST_SIZE > 64
> +               if (likely(n_left_from > 7)) {
> +                       rte_prefetch0(pkts[4]);
> +                       rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[4], void *,
> +                                       sizeof(struct rte_ether_hdr)));
> +                       rte_prefetch0(pkts[5]);
> +                       rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[5], void *,
> +                                       sizeof(struct rte_ether_hdr)));
> +                       rte_prefetch0(pkts[6]);
> +                       rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[6], void *,
> +                                       sizeof(struct rte_ether_hdr)));
> +                       rte_prefetch0(pkts[7]);
> +                       rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[7], void *,
> +                                       sizeof(struct rte_ether_hdr)));
> +               }
> +#endif
> +
> +               mbuf0 = pkts[0];
> +               mbuf1 = pkts[1];
> +               mbuf2 = pkts[2];
> +               mbuf3 = pkts[3];
> +               pkts += 4;
> +               n_left_from -= 4;
> +               /* Extract DIP of mbuf0 */
> +               ipv4_hdr = rte_pktmbuf_mtod_offset(mbuf0, struct rte_ipv4_hdr *,
> +                               sizeof(struct rte_ether_hdr));
> +               /* Extract cksum, ttl as ipv4 hdr is in cache */
> +               node_mbuf_priv1(mbuf0, dyn)->cksum = ipv4_hdr->hdr_checksum;
> +               node_mbuf_priv1(mbuf0, dyn)->ttl = ipv4_hdr->time_to_live;
> +
> +               ip[i++] = rte_be_to_cpu_32(ipv4_hdr->dst_addr);
> +
> +               /* Extract DIP of mbuf1 */
> +               ipv4_hdr = rte_pktmbuf_mtod_offset(mbuf1, struct rte_ipv4_hdr *,
> +                               sizeof(struct rte_ether_hdr));
> +               /* Extract cksum, ttl as ipv4 hdr is in cache */
> +               node_mbuf_priv1(mbuf1, dyn)->cksum = ipv4_hdr->hdr_checksum;
> +               node_mbuf_priv1(mbuf1, dyn)->ttl = ipv4_hdr->time_to_live;
> +
> +               ip[i++] = rte_be_to_cpu_32(ipv4_hdr->dst_addr);
> +
> +               /* Extract DIP of mbuf2 */
> +               ipv4_hdr = rte_pktmbuf_mtod_offset(mbuf2, struct rte_ipv4_hdr *,
> +                               sizeof(struct rte_ether_hdr));
> +               /* Extract cksum, ttl as ipv4 hdr is in cache */
> +               node_mbuf_priv1(mbuf2, dyn)->cksum = ipv4_hdr->hdr_checksum;
> +               node_mbuf_priv1(mbuf2, dyn)->ttl = ipv4_hdr->time_to_live;
> +
> +               ip[i++] = rte_be_to_cpu_32(ipv4_hdr->dst_addr);
> +
> +               /* Extract DIP of mbuf3 */
> +               ipv4_hdr = rte_pktmbuf_mtod_offset(mbuf3, struct rte_ipv4_hdr *,
> +                               sizeof(struct rte_ether_hdr));
> +
> +               /* Extract cksum, ttl as ipv4 hdr is in cache */
> +               node_mbuf_priv1(mbuf3, dyn)->cksum = ipv4_hdr->hdr_checksum;
> +               node_mbuf_priv1(mbuf3, dyn)->ttl = ipv4_hdr->time_to_live;
> +
> +               ip[i++] = rte_be_to_cpu_32(ipv4_hdr->dst_addr);
> +       }
> +       while (n_left_from > 0) {
> +               mbuf0 = pkts[0];
> +               pkts += 1;
> +               n_left_from -= 1;
> +
> +               /* Extract DIP of mbuf0 */
> +               ipv4_hdr = rte_pktmbuf_mtod_offset(mbuf0, struct rte_ipv4_hdr *,
> +                               sizeof(struct rte_ether_hdr));
> +               /* Extract cksum, ttl as ipv4 hdr is in cache */
> +               node_mbuf_priv1(mbuf0, dyn)->cksum = ipv4_hdr->hdr_checksum;
> +               node_mbuf_priv1(mbuf0, dyn)->ttl = ipv4_hdr->time_to_live;
> +
> +               ip[i++] = rte_be_to_cpu_32(ipv4_hdr->dst_addr);
> +       }
> +
> +       rc = rte_fib_lookup_bulk(fib, ip, next_hop, nb_objs);
> +       if (unlikely(rc != 0))
> +               return 0;
> +
> +       for (i = 0; i < nb_objs; i++) {
> +               if (unlikely(next_hop[i] == FIB_DEFAULT_NH)) {

This check can be removed since we have programmed drop node in default_nh.

> +                       next_hop[i] = drop_nh;
> +                       lookup_err += 1;
> +               }
> +
> +               mbuf0 = (struct rte_mbuf *)objs[i];
> +               node_mbuf_priv1(mbuf0, dyn)->nh = (uint16_t)next_hop[i];
> +               next = (uint16_t)(next_hop[i] >> 16);

Please get next and next_hop from next_hop itself

      node_mbuf_priv1(mbuf0, dyn)->nh =
((rte_ip4_lookup_fib_next_hop_t *)next_hop[i])->next_edge
      next = ((rte_ip4_lookup_fib_next_hop_t *)next_hop[i])->next_hop_id;
> +
> +               if (unlikely(next_index ^ next)) {
> +                       /* Copy things successfully speculated till now */
> +                       rte_memcpy(to_next, from, last_spec * sizeof(from[0]));
> +                       from += last_spec;
> +                       to_next += last_spec;
> +                       held += last_spec;
> +                       last_spec = 0;
> +
> +                       rte_node_enqueue_x1(graph, node, next, from[0]);
> +                       from += 1;
> +               } else {
> +                       last_spec += 1;
> +               }
> +       }
> +
> +       /* !!! Home run !!! */
> +       if (likely(last_spec == nb_objs)) {
> +               rte_node_next_stream_move(graph, node, next_index);
> +               return nb_objs;
> +       }
> +
> +       NODE_INCREMENT_XSTAT_ID(node, 0, lookup_err != 0, lookup_err);
> +       held += last_spec;
> +       rte_memcpy(to_next, from, last_spec * sizeof(from[0]));

Save current next_index in node->ctx to make speculation work in next
iteration.
> +       rte_node_next_stream_put(graph, node, next_index, held);
> +
> +       return nb_objs;
> +}
> +
>  RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_node_ip4_fib_route_add, 25.07)
>  int
>  rte_node_ip4_fib_route_add(uint32_t ip, uint8_t depth, uint16_t next_hop,
> @@ -147,6 +310,7 @@ static struct rte_node_xstats ip4_lookup_fib_xstats = {
>  };
>
>  static struct rte_node_register ip4_lookup_fib_node = {
> +       .process = ip4_lookup_fib_node_process,
>         .name = "ip4_lookup_fib",
>
>         .init = ip4_lookup_fib_node_init,
> --
> 2.25.1
>

  reply	other threads:[~2025-04-16  7:55 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-15 12:10 [PATCH v1 00/12] add lookup fib nodes in graph library Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 01/12] fib: move macro to header file Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 02/12] node: add IP4 lookup FIB node Ankur Dwivedi
2025-04-16  7:32   ` Nitin Saxena
2025-04-16 10:26     ` [EXTERNAL] " Ankur Dwivedi
2025-04-16  9:34   ` Medvedkin, Vladimir
2025-04-16 10:07     ` [EXTERNAL] " Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 03/12] node: add IP4 FIB route add Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 04/12] node: add process callback for IP4 FIB Ankur Dwivedi
2025-04-16  7:54   ` Nitin Saxena [this message]
2025-04-16 12:54   ` Medvedkin, Vladimir
2025-04-18  7:38     ` [EXTERNAL] " Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 05/12] node: add next node in packet classification Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 06/12] app/graph: add IP4 lookup mode command Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 07/12] fib: move macro to header file Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 08/12] node: add IP6 lookup FIB node Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 09/12] node: add IP6 FIB route add Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 10/12] node: add process callback for IP6 FIB Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 11/12] node: add next node in packet classification Ankur Dwivedi
2025-04-15 12:10 ` [PATCH v1 12/12] app/graph: add IP6 lookup mode command Ankur Dwivedi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAG6-93wOdPnGS9kLoRmJTS41x=W7kY0bMu5_bNbosDXg1KjdDQ@mail.gmail.com' \
    --to=nsaxena16@gmail.com \
    --cc=adwivedi@marvell.com \
    --cc=dev@dpdk.org \
    --cc=jerinj@marvell.com \
    --cc=ndabilpuram@marvell.com \
    --cc=pbhagavatula@marvell.com \
    --cc=rkudurumalla@marvell.com \
    --cc=skori@marvell.com \
    --cc=vladimir.medvedkin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).