* [PATCH dpdk 0/2] Flexible graph nodes stats collection
@ 2025-12-09 8:50 Robin Jarry
2025-12-09 8:50 ` [PATCH dpdk 1/2] graph: always count objects and calls Robin Jarry
2025-12-09 8:50 ` [PATCH dpdk 2/2] graph: allow enabling/disabling node visit cycles collection Robin Jarry
0 siblings, 2 replies; 11+ messages in thread
From: Robin Jarry @ 2025-12-09 8:50 UTC (permalink / raw)
To: dev
Currently, the collection of statistics is fixed at compile time and
encompasses the number of cycles and number of node visits/calls and
processed objects.
While having fine grained statistics is a useful tool, calling accessing
rte_rdtsc() twice per node can be costly.
This series brings more flexibility and allows to always collect the
number of objects and node visits regardless of the compile time
constants. Node cycles collection can be enabled at runtime per-graph.
Robin Jarry (2):
graph: always count objects and calls
graph: allow enabling/disabling node visit cycles collection
lib/graph/graph_populate.c | 1 +
lib/graph/rte_graph_worker_common.h | 24 +++++++++++++++++++-----
2 files changed, 20 insertions(+), 5 deletions(-)
--
2.52.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH dpdk 1/2] graph: always count objects and calls
2025-12-09 8:50 [PATCH dpdk 0/2] Flexible graph nodes stats collection Robin Jarry
@ 2025-12-09 8:50 ` Robin Jarry
2025-12-09 9:09 ` Jerin Jacob
2025-12-09 8:50 ` [PATCH dpdk 2/2] graph: allow enabling/disabling node visit cycles collection Robin Jarry
1 sibling, 1 reply; 11+ messages in thread
From: Robin Jarry @ 2025-12-09 8:50 UTC (permalink / raw)
To: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
Accumulate the number of processed objects and the number of times each
node is visited regardless of the compile time value of
RTE_LIBRTE_GRAPH_STATS.
Accumulating these numbers do not bring much overhead when rte_rdtsc()
isn't called.
Signed-off-by: Robin Jarry <rjarry@redhat.com>
---
lib/graph/rte_graph_worker_common.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index 4ab53a533e4c..c87a6796a96e 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -209,11 +209,11 @@ __rte_node_process(struct rte_graph *graph, struct rte_node *node)
start = rte_rdtsc();
rc = node->process(graph, node, objs, node->idx);
node->total_cycles += rte_rdtsc() - start;
- node->total_calls++;
- node->total_objs += rc;
} else {
- node->process(graph, node, objs, node->idx);
+ rc = node->process(graph, node, objs, node->idx);
}
+ node->total_calls++;
+ node->total_objs += rc;
node->idx = 0;
}
--
2.52.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH dpdk 2/2] graph: allow enabling/disabling node visit cycles collection
2025-12-09 8:50 [PATCH dpdk 0/2] Flexible graph nodes stats collection Robin Jarry
2025-12-09 8:50 ` [PATCH dpdk 1/2] graph: always count objects and calls Robin Jarry
@ 2025-12-09 8:50 ` Robin Jarry
1 sibling, 0 replies; 11+ messages in thread
From: Robin Jarry @ 2025-12-09 8:50 UTC (permalink / raw)
To: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
Add a new rte_graph_cycle_stats_enable() function to control whether
rte_rdtsc() is called twice for each node visit in order to collect the
number of cycles spent in each node process callback.
The default value remains the same as before and is fixed by the compile
time value RTE_LIBRTE_GRAPH_STATS.
Signed-off-by: Robin Jarry <rjarry@redhat.com>
---
lib/graph/graph_populate.c | 1 +
lib/graph/rte_graph_worker_common.h | 18 ++++++++++++++++--
2 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/lib/graph/graph_populate.c b/lib/graph/graph_populate.c
index 026daecb2122..e51e3b39e24a 100644
--- a/lib/graph/graph_populate.c
+++ b/lib/graph/graph_populate.c
@@ -64,6 +64,7 @@ graph_header_popluate(struct graph *_graph)
graph->nb_nodes = _graph->node_count;
graph->cir_start = RTE_PTR_ADD(graph, _graph->cir_start);
graph->nodes_start = _graph->nodes_start;
+ graph->cycles_stats = rte_graph_has_stats_feature();
graph->socket = _graph->socket;
graph->id = _graph->id;
memcpy(graph->name, _graph->name, RTE_GRAPH_NAMESIZE);
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index c87a6796a96e..4161fc4b02e7 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -56,7 +56,7 @@ struct __rte_cache_aligned rte_graph {
rte_graph_off_t *cir_start; /**< Pointer to circular buffer. */
rte_graph_off_t nodes_start; /**< Offset at which node memory starts. */
uint8_t model; /**< graph model */
- uint8_t reserved1; /**< Reserved for future use. */
+ uint8_t cycles_stats; /**< Reserved for future use. */
uint16_t reserved2; /**< Reserved for future use. */
union {
/* Fast schedule area for mcore dispatch model */
@@ -205,7 +205,7 @@ __rte_node_process(struct rte_graph *graph, struct rte_node *node)
objs = node->objs;
rte_prefetch0(objs);
- if (rte_graph_has_stats_feature()) {
+ if (graph->cycles_stats) {
start = rte_rdtsc();
rc = node->process(graph, node, objs, node->idx);
node->total_cycles += rte_rdtsc() - start;
@@ -560,6 +560,20 @@ rte_graph_model_is_valid(uint8_t model);
*/
int rte_graph_worker_model_set(uint8_t model);
+/**
+ * Enable/disable collecting of TSC cycles for each node visit. The default
+ * value for all new or cloned graphs is equal to rte_graph_has_stats_feature()
+ * which is set to the RTE_LIBRTE_GRAPH_STATS compile time constant.
+ *
+ * @param enabled
+ * Set to 1 to enable, 0 to disable.
+ */
+static inline void
+rte_graph_cycle_stats_enable(struct rte_graph *graph, uint8_t enabled)
+{
+ graph->cycles_stats = enabled;
+}
+
/**
* Get the graph worker model
*
--
2.52.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH dpdk 1/2] graph: always count objects and calls
2025-12-09 8:50 ` [PATCH dpdk 1/2] graph: always count objects and calls Robin Jarry
@ 2025-12-09 9:09 ` Jerin Jacob
2025-12-09 9:12 ` Robin Jarry
0 siblings, 1 reply; 11+ messages in thread
From: Jerin Jacob @ 2025-12-09 9:09 UTC (permalink / raw)
To: Robin Jarry
Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
On Tue, Dec 9, 2025 at 2:27 PM Robin Jarry <rjarry@redhat.com> wrote:
>
> Accumulate the number of processed objects and the number of times each
> node is visited regardless of the compile time value of
> RTE_LIBRTE_GRAPH_STATS.
By default, RTE_LIBRTE_GRAPH_STATS is enabled in distro build. So why
need such changes?
>
> Accumulating these numbers do not bring much overhead when rte_rdtsc()
> isn't called.
>
> Signed-off-by: Robin Jarry <rjarry@redhat.com>
> ---
> lib/graph/rte_graph_worker_common.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
> index 4ab53a533e4c..c87a6796a96e 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -209,11 +209,11 @@ __rte_node_process(struct rte_graph *graph, struct rte_node *node)
> start = rte_rdtsc();
> rc = node->process(graph, node, objs, node->idx);
> node->total_cycles += rte_rdtsc() - start;
> - node->total_calls++;
> - node->total_objs += rc;
> } else {
> - node->process(graph, node, objs, node->idx);
> + rc = node->process(graph, node, objs, node->idx);
> }
> + node->total_calls++;
> + node->total_objs += rc;
> node->idx = 0;
> }
>
> --
> 2.52.0
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH dpdk 1/2] graph: always count objects and calls
2025-12-09 9:09 ` Jerin Jacob
@ 2025-12-09 9:12 ` Robin Jarry
2025-12-09 11:13 ` Morten Brørup
0 siblings, 1 reply; 11+ messages in thread
From: Robin Jarry @ 2025-12-09 9:12 UTC (permalink / raw)
To: Jerin Jacob
Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
Jerin Jacob, Dec 09, 2025 at 10:09:
> By default, RTE_LIBRTE_GRAPH_STATS is enabled in distro build. So why
> need such changes?
This makes sense with the next patch in fact.
The user (e.g. grout) may need to collect the number of processed
objects in order to make runtime decisions. But the cost of rdtsc() may
be too high and it may be a good thing to disable it unless trying to
benchmark/debug specific issues.
--
Robin
> Offer limited to residents of the contiguous United States.
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH dpdk 1/2] graph: always count objects and calls
2025-12-09 9:12 ` Robin Jarry
@ 2025-12-09 11:13 ` Morten Brørup
2025-12-09 11:47 ` Robin Jarry
0 siblings, 1 reply; 11+ messages in thread
From: Morten Brørup @ 2025-12-09 11:13 UTC (permalink / raw)
To: Robin Jarry, Jerin Jacob
Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
> From: Robin Jarry [mailto:rjarry@redhat.com]
> Sent: Tuesday, 9 December 2025 10.13
>
> Jerin Jacob, Dec 09, 2025 at 10:09:
> > By default, RTE_LIBRTE_GRAPH_STATS is enabled in distro build. So why
> > need such changes?
>
> This makes sense with the next patch in fact.
>
> The user (e.g. grout) may need to collect the number of processed
> objects in order to make runtime decisions. But the cost of rdtsc() may
> be too high and it may be a good thing to disable it unless trying to
> benchmark/debug specific issues.
Looking at patch 2/2, I disagree with the approach.
RTE_LIBRTE_GRAPH_STATS should control all stats, incl. total_calls and total_objs.
Then, if enabled, the total_cycles stats can be controlled by rte_graph_cycle_stats_enable().
Your v1 series introduces unnecessary overhead for applications not caring about total_calls/total_objs stats and thus built without RTE_LIBRTE_GRAPH_STATS.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH dpdk 1/2] graph: always count objects and calls
2025-12-09 11:13 ` Morten Brørup
@ 2025-12-09 11:47 ` Robin Jarry
2025-12-09 12:11 ` Morten Brørup
0 siblings, 1 reply; 11+ messages in thread
From: Robin Jarry @ 2025-12-09 11:47 UTC (permalink / raw)
To: Morten Brørup, Jerin Jacob
Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
Morten Brørup, Dec 09, 2025 at 12:13:
> Looking at patch 2/2, I disagree with the approach.
>
> RTE_LIBRTE_GRAPH_STATS should control all stats, incl. total_calls and
> total_objs. Then, if enabled, the total_cycles stats can be controlled
> by rte_graph_cycle_stats_enable().
>
> Your v1 series introduces unnecessary overhead for applications not
> caring about total_calls/total_objs stats and thus built without
> RTE_LIBRTE_GRAPH_STATS.
My issue is that I would like the total_objs stat but without the
overhead of rte_rdtsc() being called twice for every node visit.
And also, I would like to be able to enable/disable these stats *at
runtime*. Having it behind a compile time constant makes it very not
flexible.
I could have two booleans to control whether total_calls/total_objs are
updated *and* whether total_cycles are computed. But that seems a bit
overkill and it would mean two fields to check (two branches) instead of
one per node.
Is it really that bad to update two uint64_t counters?
--
Robin
> Your canceled check is your receipt.
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH dpdk 1/2] graph: always count objects and calls
2025-12-09 11:47 ` Robin Jarry
@ 2025-12-09 12:11 ` Morten Brørup
2025-12-09 12:45 ` Robin Jarry
0 siblings, 1 reply; 11+ messages in thread
From: Morten Brørup @ 2025-12-09 12:11 UTC (permalink / raw)
To: Robin Jarry, Jerin Jacob
Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
> From: Robin Jarry [mailto:rjarry@redhat.com]
> Sent: Tuesday, 9 December 2025 12.47
>
> Morten Brørup, Dec 09, 2025 at 12:13:
> > Looking at patch 2/2, I disagree with the approach.
> >
> > RTE_LIBRTE_GRAPH_STATS should control all stats, incl. total_calls
> and
> > total_objs. Then, if enabled, the total_cycles stats can be
> controlled
> > by rte_graph_cycle_stats_enable().
> >
> > Your v1 series introduces unnecessary overhead for applications not
> > caring about total_calls/total_objs stats and thus built without
> > RTE_LIBRTE_GRAPH_STATS.
>
> My issue is that I would like the total_objs stat but without the
> overhead of rte_rdtsc() being called twice for every node visit.
I get that. My proposal provides that.
>
> And also, I would like to be able to enable/disable these stats *at
> runtime*.
Why would you enable/disable total_calls/total_objs at runtime?
> Having it behind a compile time constant makes it very not
> flexible.
Agree, but instrumentation has a performance cost, and should be possible to disable at compile time.
>
> I could have two booleans to control whether total_calls/total_objs are
> updated *and* whether total_cycles are computed. But that seems a bit
> overkill and it would mean two fields to check (two branches) instead
> of
> one per node.
>
> Is it really that bad to update two uint64_t counters?
The cost is unnecessary. It's instrumentation.
>
> --
> Robin
>
> > Your canceled check is your receipt.
Please change your cookie generator to not make it look a quote from the person you are replying to.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH dpdk 1/2] graph: always count objects and calls
2025-12-09 12:11 ` Morten Brørup
@ 2025-12-09 12:45 ` Robin Jarry
2025-12-09 13:45 ` Morten Brørup
0 siblings, 1 reply; 11+ messages in thread
From: Robin Jarry @ 2025-12-09 12:45 UTC (permalink / raw)
To: Morten Brørup, Jerin Jacob
Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
Morten Brørup, Dec 09, 2025 at 13:11:
>> My issue is that I would like the total_objs stat but without the
>> overhead of rte_rdtsc() being called twice for every node visit.
>
> I get that. My proposal provides that.
I didn't get that from your previous message:
Morten Brørup, Dec 09, 2025 at 12:13:
>>> Looking at patch 2/2, I disagree with the approach.
>>>
>>> RTE_LIBRTE_GRAPH_STATS should control all stats, incl. total_calls and total_objs.
>>> Then, if enabled, the total_cycles stats can be controlled by rte_graph_cycle_stats_enable().
>>>
>>> Your v1 series introduces unnecessary overhead for applications not caring about total_calls/total_objs stats and thus built without RTE_LIBRTE_GRAPH_STATS.
Where do you propose a new flag to determine whether total_calls and
total_objs are updated and total_cycles are not?
>> And also, I would like to be able to enable/disable these stats *at
>> runtime*.
>
> Why would you enable/disable total_calls/total_objs at runtime?
I don't want to disable/enable total_calls/total_objs. I want to
disable/enable total_cycles at runtime.
>> Having it behind a compile time constant makes it very not flexible.
>
> Agree, but instrumentation has a performance cost, and should be
> possible to disable at compile time.
>
>> I could have two booleans to control whether total_calls/total_objs are
>> updated *and* whether total_cycles are computed. But that seems a bit
>> overkill and it would mean two fields to check (two branches) instead
>> of
>> one per node.
>>
>> Is it really that bad to update two uint64_t counters?
>
> The cost is unnecessary. It's instrumentation.
In that instance, I consider total_objs not to be "instrumentation" but
more like regular counters (e.g. ethdev stats, xstats, etc.). Unless
I am mistaken, there is no way to disable statistics for interfaces for
example.
For example, in grout, we use total_objs in the main loop to determine
if we can save power by willingly putting the CPU to sleep for very
short periods of time. If total_objs is no longer updated, grout cannot
save power.
total_cycles however, *is* instrumentation. It is only useful to
identify hotspots and debug issues. But it can be interesting to enable
it *at runtime* without recompiling.
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH dpdk 1/2] graph: always count objects and calls
2025-12-09 12:45 ` Robin Jarry
@ 2025-12-09 13:45 ` Morten Brørup
2025-12-09 14:47 ` Robin Jarry
0 siblings, 1 reply; 11+ messages in thread
From: Morten Brørup @ 2025-12-09 13:45 UTC (permalink / raw)
To: Robin Jarry, Jerin Jacob
Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
> From: Robin Jarry [mailto:rjarry@redhat.com]
> Sent: Tuesday, 9 December 2025 13.46
>
> Morten Brørup, Dec 09, 2025 at 13:11:
> >> My issue is that I would like the total_objs stat but without the
> >> overhead of rte_rdtsc() being called twice for every node visit.
> >
> > I get that. My proposal provides that.
>
> I didn't get that from your previous message:
>
> Morten Brørup, Dec 09, 2025 at 12:13:
> >>> Looking at patch 2/2, I disagree with the approach.
> >>>
> >>> RTE_LIBRTE_GRAPH_STATS should control all stats, incl. total_calls
> and total_objs.
> >>> Then, if enabled, the total_cycles stats can be controlled by
> rte_graph_cycle_stats_enable().
> >>>
> >>> Your v1 series introduces unnecessary overhead for applications not
> caring about total_calls/total_objs stats and thus built without
> RTE_LIBRTE_GRAPH_STATS.
>
> Where do you propose a new flag to determine whether total_calls and
> total_objs are updated and total_cycles are not?
Sorry about not being more clear. I meant:
"Then, if [RTE_LIBRTE_GRAPH_STATS is] enabled [at build time], the total_cycles stats can be controlled by rte_graph_cycle_stats_enable()."
I proposed that you add your rte_graph_cycle_stats_enable() to control the total_cycles stats, and leave RTE_LIBRTE_GRAPH_STATS as it is now (i.e. to be able to disable all stats at build time).
>
> >> And also, I would like to be able to enable/disable these stats *at
> >> runtime*.
> >
> > Why would you enable/disable total_calls/total_objs at runtime?
>
> I don't want to disable/enable total_calls/total_objs. I want to
> disable/enable total_cycles at runtime.
OK. That makes sense.
>
> >> Having it behind a compile time constant makes it very not flexible.
> >
> > Agree, but instrumentation has a performance cost, and should be
> > possible to disable at compile time.
> >
> >> I could have two booleans to control whether total_calls/total_objs
> are
> >> updated *and* whether total_cycles are computed. But that seems a
> bit
> >> overkill and it would mean two fields to check (two branches)
> instead
> >> of
> >> one per node.
> >>
> >> Is it really that bad to update two uint64_t counters?
> >
> > The cost is unnecessary. It's instrumentation.
>
> In that instance, I consider total_objs not to be "instrumentation" but
> more like regular counters (e.g. ethdev stats, xstats, etc.). Unless
> I am mistaken, there is no way to disable statistics for interfaces for
> example.
Ethdev counters are updated by the NIC silicon.
They have zero software performance cost in the data plane.
>
> For example, in grout, we use total_objs in the main loop to determine
> if we can save power by willingly putting the CPU to sleep for very
> short periods of time. If total_objs is no longer updated, grout cannot
> save power.
Makes sense. We do something similar in our application. Look at some instrumentation counters to estimate the workload and determine power saving actions.
If Grout is built with RTE_LIBRTE_GRAPH_STATS, it can use the graph stats; otherwise, it will have to look at something else.
We want to keep DPDK performant, so I'm opposed to making optional instrumentation mandatory.
>
> total_cycles however, *is* instrumentation. It is only useful to
> identify hotspots and debug issues. But it can be interesting to enable
> it *at runtime* without recompiling.
The word "instrumentation" might not be 100 % clearly defined, and we seem to have slightly different interpretations of it.
But that is not important. My point is: These library counters are irrelevant for the data plane, so they should remain optional.
They are also irrelevant for the control plane.
They might be relevant for the management plane.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH dpdk 1/2] graph: always count objects and calls
2025-12-09 13:45 ` Morten Brørup
@ 2025-12-09 14:47 ` Robin Jarry
0 siblings, 0 replies; 11+ messages in thread
From: Robin Jarry @ 2025-12-09 14:47 UTC (permalink / raw)
To: Morten Brørup, Jerin Jacob
Cc: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
Morten Brørup, Dec 09, 2025 at 14:45:
> Sorry about not being more clear. I meant: "Then, if
> [RTE_LIBRTE_GRAPH_STATS is] enabled [at build time], the total_cycles
> stats can be controlled by rte_graph_cycle_stats_enable()."
>
> I proposed that you add your rte_graph_cycle_stats_enable() to control
> the total_cycles stats, and leave RTE_LIBRTE_GRAPH_STATS as it is now
> (i.e. to be able to disable all stats at build time).
OK I see what you mean. I will include these changes in a v2.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-12-09 14:47 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-09 8:50 [PATCH dpdk 0/2] Flexible graph nodes stats collection Robin Jarry
2025-12-09 8:50 ` [PATCH dpdk 1/2] graph: always count objects and calls Robin Jarry
2025-12-09 9:09 ` Jerin Jacob
2025-12-09 9:12 ` Robin Jarry
2025-12-09 11:13 ` Morten Brørup
2025-12-09 11:47 ` Robin Jarry
2025-12-09 12:11 ` Morten Brørup
2025-12-09 12:45 ` Robin Jarry
2025-12-09 13:45 ` Morten Brørup
2025-12-09 14:47 ` Robin Jarry
2025-12-09 8:50 ` [PATCH dpdk 2/2] graph: allow enabling/disabling node visit cycles collection Robin Jarry
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).