DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH] graph: optimize graph search when scheduling nodes
@ 2024-11-07  8:04 Huichao cai
  2024-11-07  9:37 ` [EXTERNAL] " Jerin Jacob
  0 siblings, 1 reply; 5+ messages in thread
From: Huichao cai @ 2024-11-07  8:04 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

In the function __rte_graph_ccore_ispatch_stched_node_dequeue,
use a slower loop to search for the graph, modify the search logic
to record the result of the first search, and use this record for
subsequent searches to improve search speed.

Signed-off-by: Huichao cai <chcchc88@163.com>
---
 lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
 lib/graph/rte_graph_worker_common.h        |  1 +
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9..a81d338 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@
 					      struct rte_graph_rq_head *rq)
 {
 	const unsigned int lcore_id = node->dispatch.lcore_id;
-	struct rte_graph *graph;
+	struct rte_graph *graph = node->dispatch.graph;
 
-	SLIST_FOREACH(graph, rq, next)
-		if (graph->dispatch.lcore_id == lcore_id)
-			break;
+	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+		SLIST_FOREACH(graph, rq, next)
+			if (graph->dispatch.lcore_id == lcore_id)
+				break;
+		node->dispatch.graph = graph;
+	}
 
 	return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
 }
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2..4c2432b 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
+			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
 	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-07  8:04 [PATCH] graph: optimize graph search when scheduling nodes Huichao cai
@ 2024-11-07  9:37 ` Jerin Jacob
  2024-11-08  1:39   ` Huichao Cai
  0 siblings, 1 reply; 5+ messages in thread
From: Jerin Jacob @ 2024-11-07  9:37 UTC (permalink / raw)
  To: Huichao cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163
  Cc: dev


> -----Original Message-----
> From: Huichao cai <chcchc88@163.com>
> Sent: Thursday, November 7, 2024 1:35 PM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org
> Subject: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling
> nodes
> 
> In the function __rte_graph_ccore_ispatch_stched_node_dequeue, use a slower
> loop to search for the graph, modify the search logic to record the result of the
> first search, and use this record for subsequent searches to improve search
> speed
> In the function __rte_graph_ccore_ispatch_stched_node_dequeue,
> use a slower loop to search for the graph, modify the search logic to record the
> result of the first search, and use this record for subsequent searches to
> improve search speed.
> 
> Signed-off-by: Huichao cai <chcchc88@163.com>
> ---
>  	return graph != NULL ? __graph_sched_node_enqueue(node, graph) :
> false;  } diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index a518af2..4c2432b 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
>  			unsigned int lcore_id;  /**< Node running lcore. */
>  			uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
>  			uint64_t total_sched_fail; /**< Number of scheduled
> failure. */
> +			struct rte_graph *graph;  /**< Graph corresponding to
> lcore_id. */

Is n't breaking the ABI?

Also, please change commit as following for mcore specific changes 

graph: mcore: ...

>  		} dispatch;
>  	};
>  	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-07  9:37 ` [EXTERNAL] " Jerin Jacob
@ 2024-11-08  1:39   ` Huichao Cai
  2024-11-08 12:22     ` Jerin Jacob
  0 siblings, 1 reply; 5+ messages in thread
From: Huichao Cai @ 2024-11-08  1:39 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163, dev

[-- Attachment #1: Type: text/plain, Size: 117 bytes --]

> Is n't breaking the ABI?

So can't we modify the ABI, or is there any special operation required to modify the ABI?

[-- Attachment #2: Type: text/html, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-08  1:39   ` Huichao Cai
@ 2024-11-08 12:22     ` Jerin Jacob
  2024-11-08 13:38       ` David Marchand
  0 siblings, 1 reply; 5+ messages in thread
From: Jerin Jacob @ 2024-11-08 12:22 UTC (permalink / raw)
  To: Huichao Cai
  Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
	dev, Thomas Monjalon, david.marchand, Robin Jarry



> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Friday, November 8, 2024 7:10 AM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; yanzhirun_163@163.com;
> dev@dpdk.org
> Subject: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
> 
> > Is n't breaking the ABI? So can't we modify the ABI, or is there any
> > special operation required to modify the ABI? ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍
> > ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍
> > ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍
> > ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍
> 
> ZjQcmQRYFpfptBannerEnd
> 
> > Is n't breaking the ABI?
> 
> So can't we modify the ABI, or is there any special operation required to modify
> the ABI?

Only LTS release (xx.11) can change the ABI after sending deprecation notice.
Looking at the pahole output, one option will be making dispatch and new semi fastpath
Additions like  xstat_off can be min cache aligned to make room for future expansion and 
to make sure have better performance.

For xstat_off addition, there was deprecation notice to update rte_node.
If there are no objection, may be we can try following in this release to not wait
Huichao for one more year.


[main] [dpdk.org] $ git diff
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..ec9a82186d 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
        /** Original process function when pcap is enabled. */
        rte_node_process_t original_process;

+       alignas(RTE_CACHE_LINE_MIN_SIZE)
        union {
                /* Fast schedule area for mcore dispatch model */
                struct {
@@ -112,6 +113,7 @@ struct __rte_cache_aligned rte_node {
                        uint64_t total_sched_fail; /**< Number of scheduled failure. */
                } dispatch;
        };
+       alignas(RTE_CACHE_LINE_MIN_SIZE)
        rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
        /* Fast path area  */
        __extension__ struct __rte_cache_aligned {

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-08 12:22     ` Jerin Jacob
@ 2024-11-08 13:38       ` David Marchand
  0 siblings, 0 replies; 5+ messages in thread
From: David Marchand @ 2024-11-08 13:38 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry

Hello Jerin,

On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > Is n't breaking the ABI?
> >
> > So can't we modify the ABI, or is there any special operation required to modify
> > the ABI?
>
> Only LTS release (xx.11) can change the ABI after sending deprecation notice.
> Looking at the pahole output, one option will be making dispatch and new semi fastpath
> Additions like  xstat_off can be min cache aligned to make room for future expansion and
> to make sure have better performance.

Adding holes may be a short term solution, but in my opinion, the slow
path part should be entirely hidden and we only expose the fp part.
Reminder, those holes must be in a "known state" as we release v24.11
so that the presence of future additions can be safely detected.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-11-08 13:38 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-07  8:04 [PATCH] graph: optimize graph search when scheduling nodes Huichao cai
2024-11-07  9:37 ` [EXTERNAL] " Jerin Jacob
2024-11-08  1:39   ` Huichao Cai
2024-11-08 12:22     ` Jerin Jacob
2024-11-08 13:38       ` David Marchand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).