[RFC] graph/node: feedback and future improvements

DPDK patches and discussions
 help / color / mirror / Atom feed

* [RFC] graph/node: feedback and future improvements
@ 2024-03-25 14:17 Robin Jarry
  2024-04-05 23:11 ` Jerin Jacob
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Jarry @ 2024-03-25 14:17 UTC (permalink / raw)
  To: dev, Jerin Jacob
  Cc: Kiran Kumar K, Nithin Dabilpuram, Pavan Nikhilesh, Hongbo Zheng,
	Zhirun Yan

Hi Jerin, all,

I apologize in advance for the long email :)

I am working on a project [1] that uses rte_graph extensively. In the 
course of action, I have stumbled upon a few issues. I managed to work 
around them for now, but I'd like to get more insights about long term 
solutions.

Per Rx/Tx queue nodes
=====================

In the in-built nodes and in the examples, there is one ethdev_rx and 
ethdev_tx node per rx/tx queue [2].

Is there a technical reason for this design? Does it make sense to have 
only one of each ethdev_rx and ethdev_tx nodes per graph? For simplicity 
and to make dynamic rxq changes possible, I chose to have a single rx 
& tx node per graph. Do you think we could change the in-built nodes to 
support both modes ?

Having multiple instances of the same node in a graph complicates 
instantiation as it requires cloning the nodes with unique names. Also, 
it makes dynamic configuration of ports even more complicated without 
shutting down everything first since some nodes will be part of an 
active graph and there may be conflicts.

Speaking of graph reloading, I found that the in-built ethdev_tx TX 
queue id is initialized to graph_id [3]. This seems odd. If there are 
multiple rounds of graph create/destroy, the id will become invalid.

Dynamic graph and nodes construction/destruction
================================================

I need to deal with reconfiguration of the graphs at runtime. This can 
happen on multiple occasions: port addition/suppression, number of rxq 
change, rxq size change, etc.

I could not manage to reuse the in-built nodes because of the issues 
raised in the previous point.

Could we change the in-built nodes to better support dynamic reloading? 
Maybe this only applies to nodes that may appear multiple times in the 
same graph (rx/tx).

Node context data
=================

There is no way to prepare node data context when calling 
rte_graph_create(). The current implementation uses global variables [4] 
but this makes it very "static".

It would be nice to pass prepared context data for every node on graph 
creation, either via a config parameter (better) or via another 
mechanism. I currently do this via a hash map but it requires a global 
hash as well which may not be the best solution.

I tried patching the graph library code myself but after struggling, 
I thought it would be best to discuss things first.

Pluggable nodes
===============

Currently, the declaration of next nodes is static. In order to support 
plugins (e.g. via dlopen() or similar), could we introduce a way to 
dynamically insert a node in the graph?

I have done this using a post-init callback system but we could think of 
a different way.

Also, could we allow overriding nodes with RTE_NODE_REGISTER()? So users 
can replace the default implementation with their own if they need to.

Move data pointer of packets?
=============================

This final point is more a global design question. In traditional 
networking stacks, each layer of the stack receives pbuffers that are 
"adjusted" to their respective network header in the packet (i.e. the IP 
lookup function will receive a buffer that points to the beginning of 
the IP header, regardless of what was the transport protocol, plain 
ethernet, Ethernet + VLAN, IP in IP, etc.).

Would it make sense to have a similar mechanism when designing an 
application with rte_graph?

Thanks in advance for your replies.

Cheers!

-- 
Robin

[1] https://github.com/rjarry/brouter
[2] https://github.com/DPDK/dpdk/blob/v23.11/lib/node/ethdev_rx.c#L28
[3] https://github.com/DPDK/dpdk/blob/v23.11/lib/node/ethdev_tx.c#L56
[4] https://github.com/DPDK/dpdk/blob/v23.11/lib/node/ethdev_ctrl.c

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] graph/node: feedback and future improvements
  2024-03-25 14:17 [RFC] graph/node: feedback and future improvements Robin Jarry
@ 2024-04-05 23:11 ` Jerin Jacob
  2024-04-14 10:35   ` Robin Jarry
  0 siblings, 1 reply; 7+ messages in thread
From: Jerin Jacob @ 2024-04-05 23:11 UTC (permalink / raw)
  To: Robin Jarry
  Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram,
	Pavan Nikhilesh, Hongbo Zheng, Zhirun Yan, Thomas Monjalon

On Mon, Mar 25, 2024 at 7:56 PM Robin Jarry <rjarry@redhat.com> wrote:
>
> Hi Jerin, all,
>
> I apologize in advance for the long email :)
> I am working on a project [1] that uses rte_graph extensively. In the

Great.

You may consider improving and/or adding inbuilt nodes for generic
protocol processing.
Furthermore, consider contributing on app/graph. I think, most likely,
you should be able to
leverage app/graph.

> course of action, I have stumbled upon a few issues. I managed to work
> around them for now, but I'd like to get more insights about long term
> solutions.
>
> Per Rx/Tx queue nodes
> =====================
>
> In the in-built nodes and in the examples, there is one ethdev_rx and
> ethdev_tx node per rx/tx queue [2].
>
> Is there a technical reason for this design? Does it make sense to have

The default Rx/Tx nodes added to support l3fwd-graph example application.

> only one of each ethdev_rx and ethdev_tx nodes per graph? For simplicity
> and to make dynamic rxq changes possible, I chose to have a single rx
> & tx node per graph. Do you think we could change the in-built nodes to
> support both modes ?

In terms of performance, the current scheme will be more performant.

I would suggest, we can add another inbuilt node for this. This is to
avoid additional checks in fast path
to enable dynamic behavior. Probably need to use rcu as control thread
updates port/queue
configuration changes and fast path needs to adapt to it.

>
> Having multiple instances of the same node in a graph complicates
> instantiation as it requires cloning the nodes with unique names. Also,
> it makes dynamic configuration of ports even more complicated without
> shutting down everything first since some nodes will be part of an
> active graph and there may be conflicts.
>
> Speaking of graph reloading, I found that the in-built ethdev_tx TX
> queue id is initialized to graph_id [3]. This seems odd. If there are
> multiple rounds of graph create/destroy, the id will become invalid.
>
> Dynamic graph and nodes construction/destruction
> ================================================
>
> I need to deal with reconfiguration of the graphs at runtime. This can
> happen on multiple occasions: port addition/suppression, number of rxq
> change, rxq size change, etc.
>
> I could not manage to reuse the in-built nodes because of the issues
> raised in the previous point.

Having a new inbuilt node as mentioned above will fix this issue.

>
> Could we change the in-built nodes to better support dynamic reloading?
> Maybe this only applies to nodes that may appear multiple times in the
> same graph (rx/tx).
>
> Node context data
> =================
>
> There is no way to prepare node data context when calling
> rte_graph_create(). The current implementation uses global variables [4]
> but this makes it very "static".

Since the those are node and it is private. I think it is OK.

Also using, rte_graph_node_get_*() one can get the node and its ctx at anytime.


>
> It would be nice to pass prepared context data for every node on graph
> creation, either via a config parameter (better) or via another

I think, rte_graph_create() will be complicated, e.s.p it supports loading nodes
with regex pattern. I think, we can weigh in pros and cons if you have patch.


> mechanism. I currently do this via a hash map but it requires a global
> hash as well which may not be the best solution.
>
> I tried patching the graph library code myself but after struggling,
> I thought it would be best to discuss things first.
>
> Pluggable nodes
> ===============
>
> Currently, the declaration of next nodes is static. In order to support
> plugins (e.g. via dlopen() or similar), could we introduce a way to
> dynamically insert a node in the graph?
>
> I have done this using a post-init callback system but we could think of
> a different way.
>
> Also, could we allow overriding nodes with RTE_NODE_REGISTER()? So users
> can replace the default implementation with their own if they need to.

I think, if we document the inbuilt node's downstream node ID. After
rte_graph_create(),
one can use rte_node_edge_update() to dynamically add custom/user defined nodes
in between.

I thought of adding more helper functions (leveraging existing
rte_node_edge_update() for this)
It will be more like "feature arch" in VPP. Provided, node cannot add
dynamically after rte_graph_create(),
but one changes the downstream node connection dynamically.


>
> Move data pointer of packets?
> =============================
>
> This final point is more a global design question. In traditional
> networking stacks, each layer of the stack receives pbuffers that are
> "adjusted" to their respective network header in the packet (i.e. the IP
> lookup function will receive a buffer that points to the beginning of
> the IP header, regardless of what was the transport protocol, plain
> ethernet, Ethernet + VLAN, IP in IP, etc.).
>
> Would it make sense to have a similar mechanism when designing an
> application with rte_graph?

Yes. I think, we can use mbuf data offset or new dynamic mbuf filed for this.

>
> Thanks in advance for your replies.
>
> Cheers!
>
> --
> Robin
>
> [1] https://github.com/rjarry/brouter
> [2] https://github.com/DPDK/dpdk/blob/v23.11/lib/node/ethdev_rx.c#L28
> [3] https://github.com/DPDK/dpdk/blob/v23.11/lib/node/ethdev_tx.c#L56
> [4] https://github.com/DPDK/dpdk/blob/v23.11/lib/node/ethdev_ctrl.c
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] graph/node: feedback and future improvements
  2024-04-05 23:11 ` Jerin Jacob
@ 2024-04-14 10:35   ` Robin Jarry
  2024-04-24 13:24     ` Robin Jarry
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Jarry @ 2024-04-14 10:35 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram,
	Pavan Nikhilesh, Hongbo Zheng, Zhirun Yan, Thomas Monjalon

Hi Jerin,

Sorry for the delayed reply. Thanks a lot for your comments! By the way, 
I completely forgot to say that the rte_graph design you created is 
awesome. It makes it a breeze to write a clean data path implementation. 
Kudos!

Please see my remarks inline.

Jerin Jacob, Apr 06, 2024 at 01:11:
> Great.
>
> You may consider improving and/or adding inbuilt nodes for generic 
> protocol processing. Furthermore, consider contributing on app/graph. 
> I think, most likely, you should be able to leverage app/graph.

Thanks! I am definitely planning to upstream nodes into DPDK as much as 
possible. I still need to determine what is the correct level of data 
path node public API so that the nodes can be agnostic of the control 
plane implementation.

> > only one of each ethdev_rx and ethdev_tx nodes per graph? For 
> > simplicity and to make dynamic rxq changes possible, I chose to have 
> > a single rx & tx node per graph. Do you think we could change the 
> > in-built nodes to support both modes ?
>
> In terms of performance, the current scheme will be more performant.
>
> I would suggest, we can add another inbuilt node for this. This is to 
> avoid additional checks in fast path to enable dynamic behavior. 
> Probably need to use rcu as control thread updates port/queue 
> configuration changes and fast path needs to adapt to it.

Ack, I'll think about adding a variant of the ethdev_rx/tx nodes.

> > There is no way to prepare node data context when calling 
> > rte_graph_create(). The current implementation uses global variables 
> > [4] but this makes it very "static".
>
> Since the those are node and it is private. I think it is OK.
>
> Also using, rte_graph_node_get_*() one can get the node and its ctx at 
> anytime.

I had not thought about accessing the nodes private data outside of fast 
path. This would simplify nodes data update by a huge margin.

> I think, rte_graph_create() will be complicated, e.s.p it supports 
> loading nodes with regex pattern. I think, we can weigh in pros and 
> cons if you have patch.

I agree that it would make that function very complex. Probably this is 
not required as you said before.

> > Pluggable nodes
> > ===============
> >
> > Currently, the declaration of next nodes is static. In order to 
> > support plugins (e.g. via dlopen() or similar), could we introduce 
> > a way to dynamically insert a node in the graph?
> >
> > I have done this using a post-init callback system but we could 
> > think of a different way.
> >
> > Also, could we allow overriding nodes with RTE_NODE_REGISTER()? So 
> > users can replace the default implementation with their own if they 
> > need to.
>
> I think, if we document the inbuilt node's downstream node ID. After 
> rte_graph_create(), one can use rte_node_edge_update() to dynamically 
> add custom/user defined nodes in between.
>
> I thought of adding more helper functions (leveraging existing 
> rte_node_edge_update() for this) It will be more like "feature arch" 
> in VPP. Provided, node cannot add dynamically after 
> rte_graph_create(), but one changes the downstream node connection 
> dynamically.

After I get something satisfactory, I will send an RFC patch with some 
helper functions to allow flexibility when registering nodes.

> Yes. I think, we can use mbuf data offset or new dynamic mbuf filed 
> for this.

Ack, I'll send a patch.

Cheers.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] graph/node: feedback and future improvements
  2024-04-14 10:35   ` Robin Jarry
@ 2024-04-24 13:24     ` Robin Jarry
  2024-05-04 10:03       ` Jerin Jacob
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Jarry @ 2024-04-24 13:24 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram,
	Pavan Nikhilesh, Thomas Monjalon

Robin Jarry, Apr 14, 2024 at 12:35:
> Jerin Jacob, Apr 06, 2024 at 01:11:
> > Great.
> >
> > You may consider improving and/or adding inbuilt nodes for generic 
> > protocol processing. Furthermore, consider contributing on app/graph. 
> > I think, most likely, you should be able to leverage app/graph.
>
> Thanks! I am definitely planning to upstream nodes into DPDK as much as 
> possible. I still need to determine what is the correct level of data 
> path node public API so that the nodes can be agnostic of the control 
> plane implementation.

Hey Jerin,

while working on ARP support [1], I noticed that we need to have 
configurability of nodes (or set of nodes to be more precise) from the 
outside. But the nodes in the set also need to specify metadata that 
they expect from other nodes to store in mbufs.

For now, I have used a single dynamic field which I repurpose for every 
node set depending on the use case [2]. However this raises a question:
how can we make it generic (and agnostic to the application) before 
inclusion in lib/node?

I would be glad to have your opinion on the subject.

Cheers!

[1] https://github.com/rjarry/brouter/commit/e05246f51b736821b6689d40
[2] https://github.com/rjarry/brouter/blob/main/modules/infra/datapath/br_mbuf.h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] graph/node: feedback and future improvements
  2024-04-24 13:24     ` Robin Jarry
@ 2024-05-04 10:03       ` Jerin Jacob
  2024-05-08 22:04         ` Robin Jarry
  0 siblings, 1 reply; 7+ messages in thread
From: Jerin Jacob @ 2024-05-04 10:03 UTC (permalink / raw)
  To: Robin Jarry
  Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram,
	Pavan Nikhilesh, Thomas Monjalon

On Wed, Apr 24, 2024 at 6:54 PM Robin Jarry <rjarry@redhat.com> wrote:
>
> Robin Jarry, Apr 14, 2024 at 12:35:
> > Jerin Jacob, Apr 06, 2024 at 01:11:
> > > Great.
> > >
> > > You may consider improving and/or adding inbuilt nodes for generic
> > > protocol processing. Furthermore, consider contributing on app/graph.
> > > I think, most likely, you should be able to leverage app/graph.
> >
> > Thanks! I am definitely planning to upstream nodes into DPDK as much as
> > possible. I still need to determine what is the correct level of data
> > path node public API so that the nodes can be agnostic of the control
> > plane implementation.
>
> Hey Jerin,
>
> while working on ARP support [1], I noticed that we need to have
> configurability of nodes (or set of nodes to be more precise) from the
> outside. But the nodes in the set also need to specify metadata that
> they expect from other nodes to store in mbufs.
>
> For now, I have used a single dynamic field which I repurpose for every
> node set depending on the use case [2]. However this raises a question:
> how can we make it generic (and agnostic to the application) before
> inclusion in lib/node?
>
> I would be glad to have your opinion on the subject.

Looking at the brouter[1] project, Based on my understanding it has following

1)Data plane code:  IMO, Since it is generic and any consumers can
reuse this code. IMO, We can make as generic and upstream to lib/node.
2)Control plane code: IMO, if you are willing, I will be glad to see
it is hosted at https://www.dpdk.org/hosted-projects/ like pktgen.
This may attract more developers for control plane code  for brouter
and show good reference of using DPDK graph library.
I think, TB needs to approve for this. If you are OK, We can add this
to one of TB meeting agenda.

[1] https://github.com/rjarry/brouter/


> Cheers!
>
> [1] https://github.com/rjarry/brouter/commit/e05246f51b736821b6689d40
> [2] https://github.com/rjarry/brouter/blob/main/modules/infra/datapath/br_mbuf.h
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] graph/node: feedback and future improvements
  2024-05-04 10:03       ` Jerin Jacob
@ 2024-05-08 22:04         ` Robin Jarry
  2024-05-09  8:24           ` Jerin Jacob
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Jarry @ 2024-05-08 22:04 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram,
	Pavan Nikhilesh, Thomas Monjalon

Hi Jerin,

Jerin Jacob, May 04, 2024 at 12:03:
> 2) Control plane code: IMO, if you are willing, I will be glad to see 
>    it is hosted at https://www.dpdk.org/hosted-projects/ like pktgen. 
>    This may attract more developers for control plane code  for 
>    brouter and show good reference of using DPDK graph library. 
>    I think, TB needs to approve for this. If you are OK, We can add 
>    this to one of TB meeting agenda.

Wow, thanks for suggesting this! Of course, I would be thrilled to have 
the project hosted in the DPDK namespace. Do I need to make a formal 
request to techboard@dpdk.org or is it premature?

Thanks again!


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] graph/node: feedback and future improvements
  2024-05-08 22:04         ` Robin Jarry
@ 2024-05-09  8:24           ` Jerin Jacob
  0 siblings, 0 replies; 7+ messages in thread
From: Jerin Jacob @ 2024-05-09  8:24 UTC (permalink / raw)
  To: Robin Jarry
  Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram,
	Pavan Nikhilesh, Thomas Monjalon

On Thu, May 9, 2024 at 3:34 AM Robin Jarry <rjarry@redhat.com> wrote:
>
> Hi Jerin,
>
> Jerin Jacob, May 04, 2024 at 12:03:
> > 2) Control plane code: IMO, if you are willing, I will be glad to see
> >    it is hosted at https://www.dpdk.org/hosted-projects/ like pktgen.
> >    This may attract more developers for control plane code  for
> >    brouter and show good reference of using DPDK graph library.
> >    I think, TB needs to approve for this. If you are OK, We can add
> >    this to one of TB meeting agenda.
>
> Wow, thanks for suggesting this! Of course, I would be thrilled to have
> the project hosted in the DPDK namespace. Do I need to make a formal
> request to techboard@dpdk.org or is it premature?

IMO, If basic use case is running then you may request TB for approval
to host it.

>
> Thanks again!
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-05-09  8:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-25 14:17 [RFC] graph/node: feedback and future improvements Robin Jarry
2024-04-05 23:11 ` Jerin Jacob
2024-04-14 10:35   ` Robin Jarry
2024-04-24 13:24     ` Robin Jarry
2024-05-04 10:03       ` Jerin Jacob
2024-05-08 22:04         ` Robin Jarry
2024-05-09  8:24           ` Jerin Jacob

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).