* [PATCH] graph: optimize graph search when scheduling nodes
@ 2024-11-07 8:04 Huichao cai
2024-11-07 9:37 ` [EXTERNAL] " Jerin Jacob
2024-11-11 4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai
0 siblings, 2 replies; 22+ messages in thread
From: Huichao cai @ 2024-11-07 8:04 UTC (permalink / raw)
To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev
In the function __rte_graph_ccore_ispatch_stched_node_dequeue,
use a slower loop to search for the graph, modify the search logic
to record the result of the first search, and use this record for
subsequent searches to improve search speed.
Signed-off-by: Huichao cai <chcchc88@163.com>
---
lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
lib/graph/rte_graph_worker_common.h | 1 +
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9..a81d338 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@
struct rte_graph_rq_head *rq)
{
const unsigned int lcore_id = node->dispatch.lcore_id;
- struct rte_graph *graph;
+ struct rte_graph *graph = node->dispatch.graph;
- SLIST_FOREACH(graph, rq, next)
- if (graph->dispatch.lcore_id == lcore_id)
- break;
+ if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+ SLIST_FOREACH(graph, rq, next)
+ if (graph->dispatch.lcore_id == lcore_id)
+ break;
+ node->dispatch.graph = graph;
+ }
return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
}
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2..4c2432b 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
unsigned int lcore_id; /**< Node running lcore. */
uint64_t total_sched_objs; /**< Number of objects scheduled. */
uint64_t total_sched_fail; /**< Number of scheduled failure. */
+ struct rte_graph *graph; /**< Graph corresponding to lcore_id. */
} dispatch;
};
rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
--
1.8.3.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-07 8:04 [PATCH] graph: optimize graph search when scheduling nodes Huichao cai
@ 2024-11-07 9:37 ` Jerin Jacob
2024-11-08 1:39 ` Huichao Cai
2024-11-11 4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai
1 sibling, 1 reply; 22+ messages in thread
From: Jerin Jacob @ 2024-11-07 9:37 UTC (permalink / raw)
To: Huichao cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
yanzhirun_163
Cc: dev
> -----Original Message-----
> From: Huichao cai <chcchc88@163.com>
> Sent: Thursday, November 7, 2024 1:35 PM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org
> Subject: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling
> nodes
>
> In the function __rte_graph_ccore_ispatch_stched_node_dequeue, use a slower
> loop to search for the graph, modify the search logic to record the result of the
> first search, and use this record for subsequent searches to improve search
> speed
> In the function __rte_graph_ccore_ispatch_stched_node_dequeue,
> use a slower loop to search for the graph, modify the search logic to record the
> result of the first search, and use this record for subsequent searches to
> improve search speed.
>
> Signed-off-by: Huichao cai <chcchc88@163.com>
> ---
> return graph != NULL ? __graph_sched_node_enqueue(node, graph) :
> false; } diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index a518af2..4c2432b 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
> unsigned int lcore_id; /**< Node running lcore. */
> uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
> uint64_t total_sched_fail; /**< Number of scheduled
> failure. */
> + struct rte_graph *graph; /**< Graph corresponding to
> lcore_id. */
Is n't breaking the ABI?
Also, please change commit as following for mcore specific changes
graph: mcore: ...
> } dispatch;
> };
> rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-07 9:37 ` [EXTERNAL] " Jerin Jacob
@ 2024-11-08 1:39 ` Huichao Cai
2024-11-08 12:22 ` Jerin Jacob
0 siblings, 1 reply; 22+ messages in thread
From: Huichao Cai @ 2024-11-08 1:39 UTC (permalink / raw)
To: Jerin Jacob
Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163, dev
[-- Attachment #1: Type: text/plain, Size: 117 bytes --]
> Is n't breaking the ABI?
So can't we modify the ABI, or is there any special operation required to modify the ABI?
[-- Attachment #2: Type: text/html, Size: 490 bytes --]
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-08 1:39 ` Huichao Cai
@ 2024-11-08 12:22 ` Jerin Jacob
2024-11-08 13:38 ` David Marchand
0 siblings, 1 reply; 22+ messages in thread
From: Jerin Jacob @ 2024-11-08 12:22 UTC (permalink / raw)
To: Huichao Cai
Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
dev, Thomas Monjalon, david.marchand, Robin Jarry
> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Friday, November 8, 2024 7:10 AM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; yanzhirun_163@163.com;
> dev@dpdk.org
> Subject: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
>
> > Is n't breaking the ABI? So can't we modify the ABI, or is there any
> > special operation required to modify the ABI?
> >
> >
> >
>
> ZjQcmQRYFpfptBannerEnd
>
> > Is n't breaking the ABI?
>
> So can't we modify the ABI, or is there any special operation required to modify
> the ABI?
Only LTS release (xx.11) can change the ABI after sending deprecation notice.
Looking at the pahole output, one option will be making dispatch and new semi fastpath
Additions like xstat_off can be min cache aligned to make room for future expansion and
to make sure have better performance.
For xstat_off addition, there was deprecation notice to update rte_node.
If there are no objection, may be we can try following in this release to not wait
Huichao for one more year.
[main] [dpdk.org] $ git diff
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..ec9a82186d 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
/** Original process function when pcap is enabled. */
rte_node_process_t original_process;
+ alignas(RTE_CACHE_LINE_MIN_SIZE)
union {
/* Fast schedule area for mcore dispatch model */
struct {
@@ -112,6 +113,7 @@ struct __rte_cache_aligned rte_node {
uint64_t total_sched_fail; /**< Number of scheduled failure. */
} dispatch;
};
+ alignas(RTE_CACHE_LINE_MIN_SIZE)
rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
/* Fast path area */
__extension__ struct __rte_cache_aligned {
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-08 12:22 ` Jerin Jacob
@ 2024-11-08 13:38 ` David Marchand
2024-11-11 5:38 ` Jerin Jacob
0 siblings, 1 reply; 22+ messages in thread
From: David Marchand @ 2024-11-08 13:38 UTC (permalink / raw)
To: Jerin Jacob
Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
yanzhirun_163, dev, Thomas Monjalon, Robin Jarry
Hello Jerin,
On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > Is n't breaking the ABI?
> >
> > So can't we modify the ABI, or is there any special operation required to modify
> > the ABI?
>
> Only LTS release (xx.11) can change the ABI after sending deprecation notice.
> Looking at the pahole output, one option will be making dispatch and new semi fastpath
> Additions like xstat_off can be min cache aligned to make room for future expansion and
> to make sure have better performance.
Adding holes may be a short term solution, but in my opinion, the slow
path part should be entirely hidden and we only expose the fp part.
Reminder, those holes must be in a "known state" as we release v24.11
so that the presence of future additions can be safely detected.
--
David Marchand
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v2] graph: mcore: optimize graph search
2024-11-07 8:04 [PATCH] graph: optimize graph search when scheduling nodes Huichao cai
2024-11-07 9:37 ` [EXTERNAL] " Jerin Jacob
@ 2024-11-11 4:03 ` Huichao Cai
2024-11-11 5:46 ` [EXTERNAL] " Jerin Jacob
2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai
1 sibling, 2 replies; 22+ messages in thread
From: Huichao Cai @ 2024-11-11 4:03 UTC (permalink / raw)
To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev, Huichao cai
From: Huichao cai <chcchc88@163.com>
In the function __rte_graph_mcore_dispatch_sched_node_enqueue,
use a slower loop to search for the graph, modify the search logic
to record the result of the first search, and use this record for
subsequent searches to improve search speed.
Signed-off-by: Huichao cai <chcchc88@163.com>
---
lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
lib/graph/rte_graph_worker_common.h | 1 +
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9..a81d338 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@
struct rte_graph_rq_head *rq)
{
const unsigned int lcore_id = node->dispatch.lcore_id;
- struct rte_graph *graph;
+ struct rte_graph *graph = node->dispatch.graph;
- SLIST_FOREACH(graph, rq, next)
- if (graph->dispatch.lcore_id == lcore_id)
- break;
+ if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+ SLIST_FOREACH(graph, rq, next)
+ if (graph->dispatch.lcore_id == lcore_id)
+ break;
+ node->dispatch.graph = graph;
+ }
return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
}
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2..4c2432b 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
unsigned int lcore_id; /**< Node running lcore. */
uint64_t total_sched_objs; /**< Number of objects scheduled. */
uint64_t total_sched_fail; /**< Number of scheduled failure. */
+ struct rte_graph *graph; /**< Graph corresponding to lcore_id. */
} dispatch;
};
rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
--
1.8.3.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-08 13:38 ` David Marchand
@ 2024-11-11 5:38 ` Jerin Jacob
2024-11-12 8:51 ` David Marchand
0 siblings, 1 reply; 22+ messages in thread
From: Jerin Jacob @ 2024-11-11 5:38 UTC (permalink / raw)
To: David Marchand
Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
yanzhirun_163, dev, Thomas Monjalon, Robin Jarry
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, November 8, 2024 7:08 PM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
>
> Hello Jerin, On Fri, Nov 8, 2024 at 1: 22 PM Jerin Jacob <jerinj@ marvell. com>
> wrote: > > > Is n't breaking the ABI? > > > > So can't we modify the ABI, or is
> there any special operation required to modify > >
> Hello Jerin,
Hello David,
>
> On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > > Is n't breaking the ABI?
> > >
> > > So can't we modify the ABI, or is there any special operation
> > > required to modify the ABI?
> >
> > Only LTS release (xx.11) can change the ABI after sending deprecation notice.
> > Looking at the pahole output, one option will be making dispatch and
> > new semi fastpath Additions like xstat_off can be min cache aligned
> > to make room for future expansion and to make sure have better
> performance.
>
> Adding holes may be a short term solution, but in my opinion, the slow path
> part should be entirely hidden and we only expose the fp part.
The new cache line alignment items are proposed are fastpath items only.
> Reminder, those holes must be in a "known state" as we release v24.11 so that
> the presence of future additions can be safely detected.
>
>
> --
> David Marchand
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [EXTERNAL] [PATCH v2] graph: mcore: optimize graph search
2024-11-11 4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai
@ 2024-11-11 5:46 ` Jerin Jacob
2024-11-13 9:19 ` Huichao Cai
2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai
1 sibling, 1 reply; 22+ messages in thread
From: Jerin Jacob @ 2024-11-11 5:46 UTC (permalink / raw)
To: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
yanzhirun_163, david.marchand, Thomas Monjalon
Cc: dev
> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Monday, November 11, 2024 9:33 AM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org; Huichao cai <chcchc88@163.com>
> Subject: [EXTERNAL] [PATCH v2] graph: mcore: optimize graph search
>
> From: Huichao cai <chcchc88@ 163. com> In the function
> __rte_graph_mcore_dispatch_sched_node_enqueue, use a slower loop to
> search for the graph, modify the search logic to record the result of the first
> search, and use this record for subsequent
> From: Huichao cai <chcchc88@163.com>
>
> In the function __rte_graph_mcore_dispatch_sched_node_enqueue,
> use a slower loop to search for the graph, modify the search logic to record the
> result of the first search, and use this record for subsequent searches to
> improve search speed.
>
> Signed-off-by: Huichao cai <chcchc88@163.com>
> ---
> lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
> lib/graph/rte_graph_worker_common.h | 1 +
> 2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c
> b/lib/graph/rte_graph_model_mcore_dispatch.c
> index a590fc9..a81d338 100644
> --- a/lib/graph/rte_graph_model_mcore_dispatch.c
> +++ b/lib/graph/rte_graph_model_mcore_dispatch.c
> @@ -118,11 +118,14 @@
> struct rte_graph_rq_head *rq) {
> const unsigned int lcore_id = node->dispatch.lcore_id;
> - struct rte_graph *graph;
> + struct rte_graph *graph = node->dispatch.graph;
>
> - SLIST_FOREACH(graph, rq, next)
> - if (graph->dispatch.lcore_id == lcore_id)
> - break;
> + if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
> + SLIST_FOREACH(graph, rq, next)
> + if (graph->dispatch.lcore_id == lcore_id)
> + break;
> + node->dispatch.graph = graph;
> + }
>
> return graph != NULL ? __graph_sched_node_enqueue(node, graph) :
> false; } diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index a518af2..4c2432b 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
> unsigned int lcore_id; /**< Node running lcore. */
> uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
> uint64_t total_sched_fail; /**< Number of scheduled
> failure. */
> + struct rte_graph *graph; /**< Graph corresponding to
> lcore_id. */
Need to conclude the ABI related discussion here before making change
https://patches.dpdk.org/project/dpdk/patch/1730966682-2632-1-git-send-email-chcchc88@163.com/
> } dispatch;
> };
> rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-11 5:38 ` Jerin Jacob
@ 2024-11-12 8:51 ` David Marchand
2024-11-12 9:35 ` Jerin Jacob
0 siblings, 1 reply; 22+ messages in thread
From: David Marchand @ 2024-11-12 8:51 UTC (permalink / raw)
To: Jerin Jacob
Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
yanzhirun_163, dev, Thomas Monjalon, Robin Jarry
On Mon, Nov 11, 2024 at 6:39 AM Jerin Jacob <jerinj@marvell.com> wrote:
>
>
>
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Friday, November 8, 2024 7:08 PM
> > To: Jerin Jacob <jerinj@marvell.com>
> > Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> > <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> > <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> > Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> > Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> > scheduling nodes
> >
> > Hello Jerin, On Fri, Nov 8, 2024 at 1: 22 PM Jerin Jacob <jerinj@ marvell. com>
> > wrote: > > > Is n't breaking the ABI? > > > > So can't we modify the ABI, or is
> > there any special operation required to modify > >
> > Hello Jerin,
>
> Hello David,
>
> >
> > On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > > > Is n't breaking the ABI?
> > > >
> > > > So can't we modify the ABI, or is there any special operation
> > > > required to modify the ABI?
> > >
> > > Only LTS release (xx.11) can change the ABI after sending deprecation notice.
> > > Looking at the pahole output, one option will be making dispatch and
> > > new semi fastpath Additions like xstat_off can be min cache aligned
> > > to make room for future expansion and to make sure have better
> > performance.
> >
> > Adding holes may be a short term solution, but in my opinion, the slow path
> > part should be entirely hidden and we only expose the fp part.
>
> The new cache line alignment items are proposed are fastpath items only.
I had only noticed the second comment:
+ alignas(RTE_CACHE_LINE_MIN_SIZE)
rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
/* Fast path area */
^^^^^^^^^^^^
And I assumed the part in the struct before was slow path.
(it may be worth enhancing these comments, with a single limit of
slow/fast path areas)
>
> > Reminder, those holes must be in a "known state" as we release v24.11 so that
> > the presence of future additions can be safely detected.
If the rte_node objects are allocated by the graph library and zero'd,
then we are good.
It seems to be the case in graph_nodes_populate(), and the rte_node
objects are embedded in the rte_graph object.
Is there another location in the graph library where a rte_node object
is allocated?
If not, and an application can not create a rte_node object, your
proposal looks good to me.
--
David Marchand
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-12 8:51 ` David Marchand
@ 2024-11-12 9:35 ` Jerin Jacob
2024-11-12 12:57 ` Huichao Cai
2024-11-13 9:22 ` Huichao Cai
0 siblings, 2 replies; 22+ messages in thread
From: Jerin Jacob @ 2024-11-12 9:35 UTC (permalink / raw)
To: David Marchand, Huichao Cai
Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
dev, Thomas Monjalon, Robin Jarry
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, November 12, 2024 2:21 PM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
>
> On Mon, Nov 11, 2024 at 6: 39 AM Jerin Jacob <jerinj@ marvell. com> wrote: >
> > > > > -----Original Message----- > > From: David Marchand
> <david. marchand@ redhat. com> > > Sent: Friday, November 8, 2024 7: 08
>
> On Mon, Nov 11, 2024 at 6:39 AM Jerin Jacob <jerinj@marvell.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: David Marchand <david.marchand@redhat.com>
> > > Sent: Friday, November 8, 2024 7:08 PM
> > > To: Jerin Jacob <jerinj@marvell.com>
> > > Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> > > <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> > > <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> > > Thomas Monjalon <thomas@monjalon.net>; Robin Jarry
> > > <rjarry@redhat.com>
> > > Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search
> > > when scheduling nodes
> > >
> > > Hello Jerin, On Fri, Nov 8, 2024 at 1: 22 PM Jerin Jacob <jerinj@
> > > marvell. com>
> > > wrote: > > > Is n't breaking the ABI? > > > > So can't we modify the
> > > ABI, or is there any special operation required to modify > > Hello
> > > Jerin,
> >
> > Hello David,
> >
> > >
> > > On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > > > > Is n't breaking the ABI?
> > > > >
> > > > > So can't we modify the ABI, or is there any special operation
> > > > > required to modify the ABI?
> > > >
> > > > Only LTS release (xx.11) can change the ABI after sending deprecation
> notice.
> > > > Looking at the pahole output, one option will be making dispatch
> > > > and new semi fastpath Additions like xstat_off can be min cache
> > > > aligned to make room for future expansion and to make sure have
> > > > better
> > > performance.
> > >
> > > Adding holes may be a short term solution, but in my opinion, the
> > > slow path part should be entirely hidden and we only expose the fp part.
> >
> > The new cache line alignment items are proposed are fastpath items only.
>
> I had only noticed the second comment:
>
> + alignas(RTE_CACHE_LINE_MIN_SIZE)
> rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> /* Fast path area */
> ^^^^^^^^^^^^
>
> And I assumed the part in the struct before was slow path.
> (it may be worth enhancing these comments, with a single limit of slow/fast
> path areas)
Yes. Xstat_off was new addition as a fastpath item in this release and there was no space
in original Fastpath area. And, Yes, the comment needs to be updated.
>
>
> >
> > > Reminder, those holes must be in a "known state" as we release
> > > v24.11 so that the presence of future additions can be safely detected.
>
> If the rte_node objects are allocated by the graph library and zero'd, then we
> are good.
> It seems to be the case in graph_nodes_populate(), and the rte_node objects
> are embedded in the rte_graph object.
>
> Is there another location in the graph library where a rte_node object is
> allocated?
No
>
> If not, and an application can not create a rte_node object, your proposal looks
> good to me.
OK. @Huichao Cai Please send two patches (a) new proposal and (b) your improvement as series.
Update ABI Changes section in doc/guides/rel_notes/release_24_11.rst
>
>
> --
> David Marchand
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re:RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-12 9:35 ` Jerin Jacob
@ 2024-11-12 12:57 ` Huichao Cai
2024-11-13 9:22 ` Huichao Cai
1 sibling, 0 replies; 22+ messages in thread
From: Huichao Cai @ 2024-11-12 12:57 UTC (permalink / raw)
To: Jerin Jacob
Cc: David Marchand, Kiran Kumar Kokkilagadda,
Nithin Kumar Dabilpuram, yanzhirun_163, dev, Thomas Monjalon,
Robin Jarry
[-- Attachment #1: Type: text/plain, Size: 205 bytes --]
>OK. @Huichao Cai Please send two patches (a) new proposal and (b) your improvement as series.
>Update ABI Changes section in doc/guides/rel_notes/release_24_11.rst Ok.I will send these two patches soon.
[-- Attachment #2: Type: text/html, Size: 380 bytes --]
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v3 1/2] graph: mcore: optimize graph search
2024-11-11 4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai
2024-11-11 5:46 ` [EXTERNAL] " Jerin Jacob
@ 2024-11-13 7:35 ` Huichao Cai
2024-11-13 7:35 ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai
2024-11-14 8:45 ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai
1 sibling, 2 replies; 22+ messages in thread
From: Huichao Cai @ 2024-11-13 7:35 UTC (permalink / raw)
To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev
In the function __rte_graph_mcore_dispatch_sched_node_enqueue,
use a slower loop to search for the graph, modify the search logic
to record the result of the first search, and use this record for
subsequent searches to improve search speed.
Due to the addition of a "graph" field in the "rte_node" structure,
update file release_24_11.rst.
Signed-off-by: Huichao Cai <chcchc88@163.com>
---
doc/guides/rel_notes/release_24_11.rst | 1 +
lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
lib/graph/rte_graph_worker_common.h | 1 +
3 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 9dc739c4cb..592116b979 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -423,6 +423,7 @@ ABI Changes
added new structure ``rte_node_xstats`` to ``rte_node_register`` and
added ``xstat_off`` to ``rte_node``.
+* graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
Known Issues
------------
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9497..a81d338227 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node,
struct rte_graph_rq_head *rq)
{
const unsigned int lcore_id = node->dispatch.lcore_id;
- struct rte_graph *graph;
+ struct rte_graph *graph = node->dispatch.graph;
- SLIST_FOREACH(graph, rq, next)
- if (graph->dispatch.lcore_id == lcore_id)
- break;
+ if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+ SLIST_FOREACH(graph, rq, next)
+ if (graph->dispatch.lcore_id == lcore_id)
+ break;
+ node->dispatch.graph = graph;
+ }
return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
}
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..4c2432b47f 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
unsigned int lcore_id; /**< Node running lcore. */
uint64_t total_sched_objs; /**< Number of objects scheduled. */
uint64_t total_sched_fail; /**< Number of scheduled failure. */
+ struct rte_graph *graph; /**< Graph corresponding to lcore_id. */
} dispatch;
};
rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
--
2.27.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v3 2/2] graph: add alignment to the member of rte_node
2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai
@ 2024-11-13 7:35 ` Huichao Cai
2024-11-14 7:14 ` [EXTERNAL] " Jerin Jacob
2024-11-14 8:45 ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai
1 sibling, 1 reply; 22+ messages in thread
From: Huichao Cai @ 2024-11-13 7:35 UTC (permalink / raw)
To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev
The members "dispatch" and "xstat_off" of the structure "rte_node"
can be min cache aligned to make room for future expansion and to
make sure have better performance.
Due to the modification of the alignment of some members of the
"rte_node" structure, update file release_24_11.rst.
Signed-off-by: Huichao Cai <chcchc88@163.com>
---
doc/guides/rel_notes/release_24_11.rst | 3 +++
lib/graph/rte_graph_worker_common.h | 5 ++++-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 592116b979..6903b1d0f0 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -425,6 +425,9 @@ ABI Changes
* graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
+* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been
+ marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
+
Known Issues
------------
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index 4c2432b47f..9e99278a0a 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
/** Original process function when pcap is enabled. */
rte_node_process_t original_process;
+ alignas(RTE_CACHE_LINE_MIN_SIZE)
union {
/* Fast schedule area for mcore dispatch model */
struct {
@@ -113,8 +114,10 @@ struct __rte_cache_aligned rte_node {
struct rte_graph *graph; /**< Graph corresponding to lcore_id. */
} dispatch;
};
- rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
+
/* Fast path area */
+ alignas(RTE_CACHE_LINE_MIN_SIZE)
+ rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
__extension__ struct __rte_cache_aligned {
#define RTE_NODE_CTX_SZ 16
union {
--
2.27.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re:RE: [EXTERNAL] [PATCH v2] graph: mcore: optimize graph search
2024-11-11 5:46 ` [EXTERNAL] " Jerin Jacob
@ 2024-11-13 9:19 ` Huichao Cai
0 siblings, 0 replies; 22+ messages in thread
From: Huichao Cai @ 2024-11-13 9:19 UTC (permalink / raw)
To: Jerin Jacob
Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
david.marchand, Thomas Monjalon, dev
[-- Attachment #1: Type: text/plain, Size: 5102 bytes --]
> [main] [dpdk.org] $ git diff
> diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
> index a518af2b2a..ec9a82186d 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
> /** Original process function when pcap is enabled. */
> rte_node_process_t original_process;
> + alignas(RTE_CACHE_LINE_MIN_SIZE)
> union {
Hi, Jerin
The C++standard cannot align anonymous unions. Do we need to fill in reserved fields in order to maintain union alignment with RTE-CAHE_LINE_LIN_SIZE bytes?
> /* Fast schedule area for mcore dispatch model */
> struct {
> @@ -112,6 +113,7 @@ struct __rte_cache_aligned rte_node {
> uint64_t total_sched_fail; /**< Number of scheduled failure. */
> } dispatch;
> };
> + alignas(RTE_CACHE_LINE_MIN_SIZE)
> rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> /* Fast path area */
> __extension__ struct __rte_cache_aligned {
FAILED: buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o
ccache c++ -Ibuildtools/chkincs/chkincs-cpp.p -Ibuildtools/chkincs -I../buildtools/chkincs -Iexamples/l3fwd -I../examples/l3fwd -I../examples/common -Idrivers/bus/vdev -I../drivers/bus/vdev -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vmbus -I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -Ilib/argparse -I../lib/argparse -Ilib/ptr_compress -I../lib/ptr_compress -Ilib/ring -I../lib/ring -Ilib/rcu -I../lib/rcu -Ilib/mempool -I../lib/mempool -Ilib/mbuf -I../lib/mbuf -Ilib/net -I../lib/net -Ilib/meter -I../lib/meter -Ilib/ethdev -I../lib/ethdev -Ilib/cmdline -I../lib/cmdline -Ilib/hash -I../lib/hash -Ilib/timer -I../lib/timer -Ilib/acl -I../lib/acl -Ilib/bbdev -I../lib/bbdev -Ilib/bitratestats -I../lib/bitratestats -Ilib/bpf -I../lib/bpf -Ilib/cfgfile -I../lib/cfgfile -Ilib/compressdev -I../lib/compressdev -Ilib/cryptodev -I../lib/cryptodev -Ilib/distributor -I../lib/distributor -Ilib/dmadev -I../lib/dmadev -Ilib/efd -I../lib/efd -Ilib/eventdev -I../lib/eventdev -Ilib/dispatcher -I../lib/dispatcher -Ilib/gpudev -I../lib/gpudev -Ilib/gro -I../lib/gro -Ilib/gso -I../lib/gso -Ilib/ip_frag -I../lib/ip_frag -Ilib/jobstats -I../lib/jobstats -Ilib/latencystats -I../lib/latencystats -Ilib/lpm -I../lib/lpm -Ilib/member -I../lib/member -Ilib/pcapng -I../lib/pcapng -Ilib/power -I../lib/power -Ilib/rawdev -I../lib/rawdev -Ilib/regexdev -I../lib/regexdev -Ilib/mldev -I../lib/mldev -Ilib/rib -I../lib/rib -Ilib/reorder -I../lib/reorder -Ilib/sched -I../lib/sched -Ilib/security -I../lib/security -Ilib/stack -I../lib/stack -Ilib/vhost -I../lib/vhost -Ilib/ipsec -I../lib/ipsec -Ilib/pdcp -I../lib/pdcp -Ilib/fib -I../lib/fib -Ilib/port -I../lib/port -Ilib/pdump -I../lib/pdump -Ilib/table -I../lib/table -Ilib/pipeline -I../lib/pipeline -Ilib/graph -I../lib/graph -Ilib/node -I../lib/node -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wnon-virtual-dtor -Wextra -Werror -g -include rte_config.h -march=corei7 -mrtm -MD -MQ buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o -MF buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o.d -o buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o -c buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp
In file included from /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_model_rtc.h:6,
from /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker.h:9,
from buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp:1:
/home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:15: error: attribute ignored in declaration of ‘union rte_node::<unnamed>’ [-Werror=attributes]
108 | union {
| ^
/home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:15: note: attribute for ‘union rte_node::<unnamed>’ must follow the ‘union’ keyword
cc1plus: all warnings being treated as errors
[5410/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_table_lpm.cpp.o
[5411/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_port_in_action.cpp.o
[5412/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_pipeline.cpp.o
[5413/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_table_action.cpp.o
[5414/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_swx_ipsec.cpp.o
ninja: build stopped: subcommand failed.
[-- Attachment #2: Type: text/html, Size: 7692 bytes --]
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re:RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-12 9:35 ` Jerin Jacob
2024-11-12 12:57 ` Huichao Cai
@ 2024-11-13 9:22 ` Huichao Cai
2024-11-14 7:09 ` Jerin Jacob
1 sibling, 1 reply; 22+ messages in thread
From: Huichao Cai @ 2024-11-13 9:22 UTC (permalink / raw)
To: Jerin Jacob
Cc: David Marchand, Kiran Kumar Kokkilagadda,
Nithin Kumar Dabilpuram, yanzhirun_163, dev, Thomas Monjalon,
Robin Jarry
[-- Attachment #1: Type: text/plain, Size: 5084 bytes --]
> [main] [dpdk.org] $ git diff> diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h> index a518af2b2a..ec9a82186d 100644> --- a/lib/graph/rte_graph_worker_common.h> +++ b/lib/graph/rte_graph_worker_common.h> @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {> /** Original process function when pcap is enabled. */> rte_node_process_t original_process;
> + alignas(RTE_CACHE_LINE_MIN_SIZE)> union {
Hi, Jerin
The C++standard cannot align anonymous unions. Do we need to fill in reserved fields in order to maintain union alignment with RTE-CAHE_LINE_LIN_SIZE bytes?
> /* Fast schedule area for mcore dispatch model */> struct {> @@ -112,6 +113,7 @@ struct __rte_cache_aligned rte_node {> uint64_t total_sched_fail; /**< Number of scheduled failure. */> } dispatch;> };> + alignas(RTE_CACHE_LINE_MIN_SIZE)> rte_graph_off_t xstat_off; /**< Offset to xstat counters. */> /* Fast path area */> __extension__ struct __rte_cache_aligned {
FAILED: buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o
ccache c++ -Ibuildtools/chkincs/chkincs-cpp.p -Ibuildtools/chkincs -I../buildtools/chkincs -Iexamples/l3fwd -I../examples/l3fwd -I../examples/common -Idrivers/bus/vdev -I../drivers/bus/vdev -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vmbus -I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -Ilib/argparse -I../lib/argparse -Ilib/ptr_compress -I../lib/ptr_compress -Ilib/ring -I../lib/ring -Ilib/rcu -I../lib/rcu -Ilib/mempool -I../lib/mempool -Ilib/mbuf -I../lib/mbuf -Ilib/net -I../lib/net -Ilib/meter -I../lib/meter -Ilib/ethdev -I../lib/ethdev -Ilib/cmdline -I../lib/cmdline -Ilib/hash -I../lib/hash -Ilib/timer -I../lib/timer -Ilib/acl -I../lib/acl -Ilib/bbdev -I../lib/bbdev -Ilib/bitratestats -I../lib/bitratestats -Ilib/bpf -I../lib/bpf -Ilib/cfgfile -I../lib/cfgfile -Ilib/compressdev -I../lib/compressdev -Ilib/cryptodev -I../lib/cryptodev -Ilib/distributor -I../lib/distributor -Ilib/dmadev -I../lib/dmadev -Ilib/efd -I../lib/efd -Ilib/eventdev -I../lib/eventdev -Ilib/dispatcher -I../lib/dispatcher -Ilib/gpudev -I../lib/gpudev -Ilib/gro -I../lib/gro -Ilib/gso -I../lib/gso -Ilib/ip_frag -I../lib/ip_frag -Ilib/jobstats -I../lib/jobstats -Ilib/latencystats -I../lib/latencystats -Ilib/lpm -I../lib/lpm -Ilib/member -I../lib/member -Ilib/pcapng -I../lib/pcapng -Ilib/power -I../lib/power -Ilib/rawdev -I../lib/rawdev -Ilib/regexdev -I../lib/regexdev -Ilib/mldev -I../lib/mldev -Ilib/rib -I../lib/rib -Ilib/reorder -I../lib/reorder -Ilib/sched -I../lib/sched -Ilib/security -I../lib/security -Ilib/stack -I../lib/stack -Ilib/vhost -I../lib/vhost -Ilib/ipsec -I../lib/ipsec -Ilib/pdcp -I../lib/pdcp -Ilib/fib -I../lib/fib -Ilib/port -I../lib/port -Ilib/pdump -I../lib/pdump -Ilib/table -I../lib/table -Ilib/pipeline -I../lib/pipeline -Ilib/graph -I../lib/graph -Ilib/node -I../lib/node -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wnon-virtual-dtor -Wextra -Werror -g -include rte_config.h -march=corei7 -mrtm -MD -MQ buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o -MF buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o.d -o buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o -c buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp
In file included from /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_model_rtc.h:6,
from /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker.h:9,
from buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp:1:
/home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:15: error: attribute ignored in declaration of ‘union rte_node::<unnamed>’ [-Werror=attributes]
108 | union {
| ^
/home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:15: note: attribute for ‘union rte_node::<unnamed>’ must follow the ‘union’ keyword
cc1plus: all warnings being treated as errors
[5410/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_table_lpm.cpp.o
[5411/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_port_in_action.cpp.o
[5412/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_pipeline.cpp.o
[5413/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_table_action.cpp.o
[5414/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_swx_ipsec.cpp.o
ninja: build stopped: subcommand failed.
[-- Attachment #2: Type: text/html, Size: 8148 bytes --]
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Re:RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
2024-11-13 9:22 ` Huichao Cai
@ 2024-11-14 7:09 ` Jerin Jacob
0 siblings, 0 replies; 22+ messages in thread
From: Jerin Jacob @ 2024-11-14 7:09 UTC (permalink / raw)
To: Huichao Cai
Cc: David Marchand, Kiran Kumar Kokkilagadda,
Nithin Kumar Dabilpuram, yanzhirun_163, dev, Thomas Monjalon,
Robin Jarry
> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Wednesday, November 13, 2024 2:52 PM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: David Marchand <david.marchand@redhat.com>; Kiran Kumar
> Kokkilagadda <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> Subject: Re:RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
>
> > [main] [dpdk. org] $ git diff > diff --git
> > a/lib/graph/rte_graph_worker_common. h
> > b/lib/graph/rte_graph_worker_common. h > index a518af2b2a. .
> > ec9a82186d 100644 > --- a/lib/graph/rte_graph_worker_common. h > +++
> > b/lib/graph/rte_graph_worker_common. h
>
> > [main] [dpdk.org] $ git diff
> > diff --git a/lib/graph/rte_graph_worker_common.h
> > b/lib/graph/rte_graph_worker_common.h
> > index a518af2b2a..ec9a82186d 100644
> > --- a/lib/graph/rte_graph_worker_common.h
> > +++ b/lib/graph/rte_graph_worker_common.h
> > @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
> > /** Original process function when pcap is enabled. */
> > rte_node_process_t original_process;
>
> > + alignas(RTE_CACHE_LINE_MIN_SIZE)
> > union {
>
> Hi, Jerin
> The C++standard cannot align anonymous unions. Do we need to fill in reserved
> fields in order to maintain union alignment with RTE-CAHE_LINE_LIN_SIZE
> bytes?
You can bring it inside the structure.
>
> > /* Fast schedule area for mcore dispatch model */
> > struct {
> > @@ -112,6 +113,7 @@ struct __rte_cache_aligned rte_node {
> > uint64_t total_sched_fail; /**< Number of scheduled failure. */
> > } dispatch;
> > };
> > + alignas(RTE_CACHE_LINE_MIN_SIZE)
> > rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> > /* Fast path area */
> > __extension__ struct __rte_cache_aligned {
>
>
> FAILED: buildtools/chkincs/chkincs-cpp.p/meson-
> generated_rte_graph_worker.cpp.o
> ccache c++ -Ibuildtools/chkincs/chkincs-cpp.p -Ibuildtools/chkincs -
> I../buildtools/chkincs -Iexamples/l3fwd -I../examples/l3fwd -
> I../examples/common -Idrivers/bus/vdev -I../drivers/bus/vdev -I. -I.. -Iconfig -
> I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -
> I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -
> I../kernel/linux -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -
> Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -
> Ilib/telemetry -I../lib/telemetry -Idrivers/bus/pci -I../drivers/bus/pci -
> I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vmbus -
> I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -Ilib/argparse -
> I../lib/argparse -Ilib/ptr_compress -I../lib/ptr_compress -Ilib/ring -I../lib/ring -
> Ilib/rcu -I../lib/rcu -Ilib/mempool -I../lib/mempool -Ilib/mbuf -I../lib/mbuf -
> Ilib/net -I../lib/net -Ilib/meter -I../lib/meter -Ilib/ethdev -I../lib/ethdev -
> Ilib/cmdline -I../lib/cmdline -Ilib/hash -I../lib/hash -Ilib/timer -I../lib/timer -
> Ilib/acl -I../lib/acl -Ilib/bbdev -I../lib/bbdev -Ilib/bitratestats -I../lib/bitratestats -
> Ilib/bpf -I../lib/bpf -Ilib/cfgfile -I../lib/cfgfile -Ilib/compressdev -
> I../lib/compressdev -Ilib/cryptodev -I../lib/cryptodev -Ilib/distributor -
> I../lib/distributor -Ilib/dmadev -I../lib/dmadev -Ilib/efd -I../lib/efd -Ilib/eventdev
> -I../lib/eventdev -Ilib/dispatcher -I../lib/dispatcher -Ilib/gpudev -I../lib/gpudev -
> Ilib/gro -I../lib/gro -Ilib/gso -I../lib/gso -Ilib/ip_frag -I../lib/ip_frag -Ilib/jobstats -
> I../lib/jobstats -Ilib/latencystats -I../lib/latencystats -Ilib/lpm -I../lib/lpm -
> Ilib/member -I../lib/member -Ilib/pcapng -I../lib/pcapng -Ilib/power -
> I../lib/power -Ilib/rawdev -I../lib/rawdev -Ilib/regexdev -I../lib/regexdev -
> Ilib/mldev -I../lib/mldev -Ilib/rib -I../lib/rib -Ilib/reorder -I../lib/reorder -
> Ilib/sched -I../lib/sched -Ilib/security -I../lib/security -Ilib/stack -I../lib/stack -
> Ilib/vhost -I../lib/vhost -Ilib/ipsec -I../lib/ipsec -Ilib/pdcp -I../lib/pdcp -Ilib/fib -
> I../lib/fib -Ilib/port -I../lib/port -Ilib/pdump -I../lib/pdump -Ilib/table -I../lib/table
> -Ilib/pipeline -I../lib/pipeline -Ilib/graph -I../lib/graph -Ilib/node -I../lib/node -
> fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -
> Wnon-virtual-dtor -Wextra -Werror -g -include rte_config.h -march=corei7 -
> mrtm -MD -MQ buildtools/chkincs/chkincs-cpp.p/meson-
> generated_rte_graph_worker.cpp.o -MF buildtools/chkincs/chkincs-
> cpp.p/meson-generated_rte_graph_worker.cpp.o.d -o
> buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o -c
> buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp
> In file included from
> /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_model_rtc.h:6,
> from
> /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker.h:9,
> from buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp:1:
> /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:1
> 5: error: attribute ignored in declaration of ‘union rte_node::<unnamed>’ [-
> Werror=attributes]
> 108 | union {
> | ^
> /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:1
> 5: note: attribute for ‘union rte_node::<unnamed>’ must follow the ‘union’
> keyword
> cc1plus: all warnings being treated as errors [5410/6569] Compiling C++ object
> buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_table_lpm.cpp.o
> [5411/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-
> generated_rte_port_in_action.cpp.o
> [5412/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-
> generated_rte_pipeline.cpp.o
> [5413/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-
> generated_rte_table_action.cpp.o
> [5414/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-
> generated_rte_swx_ipsec.cpp.o
> ninja: build stopped: subcommand failed.
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [EXTERNAL] [PATCH v3 2/2] graph: add alignment to the member of rte_node
2024-11-13 7:35 ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai
@ 2024-11-14 7:14 ` Jerin Jacob
0 siblings, 0 replies; 22+ messages in thread
From: Jerin Jacob @ 2024-11-14 7:14 UTC (permalink / raw)
To: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
yanzhirun_163
Cc: dev
> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Wednesday, November 13, 2024 1:06 PM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org
> Subject: [EXTERNAL] [PATCH v3 2/2] graph: add alignment to the member of
> rte_node
>
> The members "dispatch" and "xstat_off" of the structure "rte_node" can be min
> cache aligned to make room for future expansion and to make sure have better
> performance. Due to the modification of the alignment of some members of the
> "rte_node"
>
> The members "dispatch" and "xstat_off" of the structure "rte_node"
> can be min cache aligned to make room for future expansion and to make sure
> have better performance.
>
> Due to the modification of the alignment of some members of the "rte_node"
> structure, update file release_24_11.rst.
>
> Signed-off-by: Huichao Cai <chcchc88@163.com>
> ---
> doc/guides/rel_notes/release_24_11.rst | 3 +++
> lib/graph/rte_graph_worker_common.h | 5 ++++-
> 2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/doc/guides/rel_notes/release_24_11.rst
> b/doc/guides/rel_notes/release_24_11.rst
> index 592116b979..6903b1d0f0 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -425,6 +425,9 @@ ABI Changes
>
> * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node``
> structure.
>
> +* graph: The members ``dispatch`` and ``xstat_off`` of the structure
> +``rte_node`` have been
> + marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
> +
> Known Issues
> ------------
>
> diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index 4c2432b47f..9e99278a0a 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
> /** Original process function when pcap is enabled. */
> rte_node_process_t original_process;
>
> + alignas(RTE_CACHE_LINE_MIN_SIZE)
> union {
> /* Fast schedule area for mcore dispatch model */
> struct {
> @@ -113,8 +114,10 @@ struct __rte_cache_aligned rte_node {
> struct rte_graph *graph; /**< Graph corresponding to
> lcore_id. */
> } dispatch;
> };
> - rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> +
> /* Fast path area */
Make it as two separate comment, Fast path area cache line 1 and Fastpath area cache line 2.
> + alignas(RTE_CACHE_LINE_MIN_SIZE)
> + rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> __extension__ struct __rte_cache_aligned { #define RTE_NODE_CTX_SZ
> 16
> union {
> --
> 2.27.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v4 1/2] graph: mcore: optimize graph search
2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai
2024-11-13 7:35 ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai
@ 2024-11-14 8:45 ` Huichao Cai
2024-11-14 8:45 ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai
1 sibling, 1 reply; 22+ messages in thread
From: Huichao Cai @ 2024-11-14 8:45 UTC (permalink / raw)
To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev
In the function __rte_graph_mcore_dispatch_sched_node_enqueue,
use a slower loop to search for the graph, modify the search logic
to record the result of the first search, and use this record for
subsequent searches to improve search speed.
Due to the addition of a "graph" field in the "rte_node" structure,
update file release_24_11.rst.
Signed-off-by: Huichao Cai <chcchc88@163.com>
---
doc/guides/rel_notes/release_24_11.rst | 1 +
lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
lib/graph/rte_graph_worker_common.h | 1 +
3 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 9dc739c4cb..592116b979 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -423,6 +423,7 @@ ABI Changes
added new structure ``rte_node_xstats`` to ``rte_node_register`` and
added ``xstat_off`` to ``rte_node``.
+* graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
Known Issues
------------
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9497..a81d338227 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node,
struct rte_graph_rq_head *rq)
{
const unsigned int lcore_id = node->dispatch.lcore_id;
- struct rte_graph *graph;
+ struct rte_graph *graph = node->dispatch.graph;
- SLIST_FOREACH(graph, rq, next)
- if (graph->dispatch.lcore_id == lcore_id)
- break;
+ if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+ SLIST_FOREACH(graph, rq, next)
+ if (graph->dispatch.lcore_id == lcore_id)
+ break;
+ node->dispatch.graph = graph;
+ }
return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
}
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..4c2432b47f 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
unsigned int lcore_id; /**< Node running lcore. */
uint64_t total_sched_objs; /**< Number of objects scheduled. */
uint64_t total_sched_fail; /**< Number of scheduled failure. */
+ struct rte_graph *graph; /**< Graph corresponding to lcore_id. */
} dispatch;
};
rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
--
2.27.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v4 2/2] graph: add alignment to the member of rte_node
2024-11-14 8:45 ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai
@ 2024-11-14 8:45 ` Huichao Cai
2024-11-14 10:05 ` [EXTERNAL] " Jerin Jacob
0 siblings, 1 reply; 22+ messages in thread
From: Huichao Cai @ 2024-11-14 8:45 UTC (permalink / raw)
To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev
The members dispatch and xstat_off of the structure rte_node
can be min cache aligned to make room for future expansion and to
make sure have better performance. Add corresponding comments.
Due to the modification of the alignment of some members of the
rte_node structure, update file release_24_11.rst.
Signed-off-by: Huichao Cai <chcchc88@163.com>
---
doc/guides/rel_notes/release_24_11.rst | 3 +++
lib/graph/rte_graph_worker_common.h | 7 ++++++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 592116b979..6903b1d0f0 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -425,6 +425,9 @@ ABI Changes
* graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
+* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been
+ marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
+
Known Issues
------------
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index 4c2432b47f..d36abec08b 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,16 +104,21 @@ struct __rte_cache_aligned rte_node {
/** Original process function when pcap is enabled. */
rte_node_process_t original_process;
+ /** Fast path area cache line 1. */
union {
/* Fast schedule area for mcore dispatch model */
- struct {
+ alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
unsigned int lcore_id; /**< Node running lcore. */
uint64_t total_sched_objs; /**< Number of objects scheduled. */
uint64_t total_sched_fail; /**< Number of scheduled failure. */
struct rte_graph *graph; /**< Graph corresponding to lcore_id. */
} dispatch;
};
+
+ /** Fast path area cache line 2. */
+ alignas(RTE_CACHE_LINE_MIN_SIZE)
rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
+
/* Fast path area */
__extension__ struct __rte_cache_aligned {
#define RTE_NODE_CTX_SZ 16
--
2.27.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of rte_node
2024-11-14 8:45 ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai
@ 2024-11-14 10:05 ` Jerin Jacob
2024-11-14 12:06 ` Huichao Cai
0 siblings, 1 reply; 22+ messages in thread
From: Jerin Jacob @ 2024-11-14 10:05 UTC (permalink / raw)
To: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
yanzhirun_163, david.marchand
Cc: dev
> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Thursday, November 14, 2024 2:15 PM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org
> Subject: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of
> rte_node
>
> The members dispatch and xstat_off of the structure rte_node can be min cache
> aligned to make room for future expansion and to make sure have better
> performance. Add corresponding comments. Due to the modification of the
> alignment of some members
> The members dispatch and xstat_off of the structure rte_node can be min cache
> aligned to make room for future expansion and to make sure have better
> performance. Add corresponding comments.
>
Please change subject to graph: improve node layout
> Due to the modification of the alignment of some members of the rte_node
> structure, update file release_24_11.rst.
The above section is not needed.
>
> Signed-off-by: Huichao Cai <chcchc88@163.com>
> ---
> doc/guides/rel_notes/release_24_11.rst | 3 +++
> lib/graph/rte_graph_worker_common.h | 7 ++++++-
> 2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/doc/guides/rel_notes/release_24_11.rst
> b/doc/guides/rel_notes/release_24_11.rst
> index 592116b979..6903b1d0f0 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -425,6 +425,9 @@ ABI Changes
>
> * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node``
> structure.
>
> +* graph: The members ``dispatch`` and ``xstat_off`` of the structure
> +``rte_node`` have been
> + marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
> +
> Known Issues
> ------------
>
> diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index 4c2432b47f..d36abec08b 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -104,16 +104,21 @@ struct __rte_cache_aligned rte_node {
> /** Original process function when pcap is enabled. */
> rte_node_process_t original_process;
>
> + /** Fast path area cache line 1. */
Fast schedule area for mcore dispatch model
> union {
> /* Fast schedule area for mcore dispatch model */
Above comment you can remove it
> - struct {
> + alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
> unsigned int lcore_id; /**< Node running lcore. */
> uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
> uint64_t total_sched_fail; /**< Number of scheduled
> failure. */
> struct rte_graph *graph; /**< Graph corresponding to
> lcore_id. */
> } dispatch;
> };
> +
> + /** Fast path area cache line 2. */
> + alignas(RTE_CACHE_LINE_MIN_SIZE)
> rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> +
> /* Fast path area */
Fast path area cache line 1
> __extension__ struct __rte_cache_aligned { #define RTE_NODE_CTX_SZ
> 16
With above: Acked-by: Jerin Jacob <jerinj@marvell.com>
Looks loke we cannot merge new feature in rc3. I would suggest skip 1/2 and send only this patch so that 1/2 can merged in next release.
Please add @david.marchand@redhat.com in Cc.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re:RE: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of rte_node
2024-11-14 10:05 ` [EXTERNAL] " Jerin Jacob
@ 2024-11-14 12:06 ` Huichao Cai
2024-11-14 13:04 ` Jerin Jacob
0 siblings, 1 reply; 22+ messages in thread
From: Huichao Cai @ 2024-11-14 12:06 UTC (permalink / raw)
To: Jerin Jacob
Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
david.marchand, dev
[-- Attachment #1: Type: text/plain, Size: 1213 bytes --]
Hi, Jerin. Like this?
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..f9ff7dd8c9 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,15 +104,19 @@ struct __rte_cache_aligned rte_node {
/** Original process function when pcap is enabled. */
rte_node_process_t original_process;
+ /** Fast schedule area for mcore dispatch model. */
union {
- /* Fast schedule area for mcore dispatch model */
- struct {
+ alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
unsigned int lcore_id; /**< Node running lcore. */
uint64_t total_sched_objs; /**< Number of objects scheduled. */
uint64_t total_sched_fail; /**< Number of scheduled failure. */
} dispatch;
};
+
+ /** Fast path area cache line 1. */
+ alignas(RTE_CACHE_LINE_MIN_SIZE)
rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
+
/* Fast path area */
__extension__ struct __rte_cache_aligned {
#define RTE_NODE_CTX_SZ 16
[-- Attachment #2: Type: text/html, Size: 2621 bytes --]
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Re:RE: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of rte_node
2024-11-14 12:06 ` Huichao Cai
@ 2024-11-14 13:04 ` Jerin Jacob
0 siblings, 0 replies; 22+ messages in thread
From: Jerin Jacob @ 2024-11-14 13:04 UTC (permalink / raw)
To: Huichao Cai
Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
david.marchand, dev
> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Thursday, November 14, 2024 5:37 PM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; yanzhirun_163@163.com;
> david.marchand@redhat.com; dev@dpdk.org
> Subject: Re:RE: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the
> member of rte_node
>
> Hi, Jerin. Like this? diff --git a/lib/graph/rte_graph_worker_common. h
> b/lib/graph/rte_graph_worker_common. h index a518af2b2a. . f9ff7dd8c9
> 100644 --- a/lib/graph/rte_graph_worker_common. h +++
> b/lib/graph/rte_graph_worker_common. h @@ -104,15 +104,19
>
>
> Hi, Jerin. Like this?
>
>
>
>
> diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
>
> index a518af2b2a..f9ff7dd8c9 100644
>
> --- a/lib/graph/rte_graph_worker_common.h
>
> +++ b/lib/graph/rte_graph_worker_common.h
>
> @@ -104,15 +104,19 @@ struct __rte_cache_aligned rte_node {
>
> /** Original process function when pcap is enabled. */
>
> rte_node_process_t original_process;
>
>
>
> + /** Fast schedule area for mcore dispatch model. */
>
> union {
>
> - /* Fast schedule area for mcore dispatch model */
>
> - struct {
>
> + alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
>
> unsigned int lcore_id; /**< Node running lcore. */
>
> uint64_t total_sched_objs; /**< Number of objects scheduled. */
>
> uint64_t total_sched_fail; /**< Number of scheduled failure. */
>
> } dispatch;
>
> };
>
> +
>
> + /** Fast path area cache line 1. */
>
> + alignas(RTE_CACHE_LINE_MIN_SIZE)
>
> rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
>
> +
>
> /* Fast path area */
Fast path area cache line 2
Rest looks good to me.
>
> __extension__ struct __rte_cache_aligned {
>
> #define RTE_NODE_CTX_SZ 16
>
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2024-11-14 13:04 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-07 8:04 [PATCH] graph: optimize graph search when scheduling nodes Huichao cai
2024-11-07 9:37 ` [EXTERNAL] " Jerin Jacob
2024-11-08 1:39 ` Huichao Cai
2024-11-08 12:22 ` Jerin Jacob
2024-11-08 13:38 ` David Marchand
2024-11-11 5:38 ` Jerin Jacob
2024-11-12 8:51 ` David Marchand
2024-11-12 9:35 ` Jerin Jacob
2024-11-12 12:57 ` Huichao Cai
2024-11-13 9:22 ` Huichao Cai
2024-11-14 7:09 ` Jerin Jacob
2024-11-11 4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai
2024-11-11 5:46 ` [EXTERNAL] " Jerin Jacob
2024-11-13 9:19 ` Huichao Cai
2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai
2024-11-13 7:35 ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai
2024-11-14 7:14 ` [EXTERNAL] " Jerin Jacob
2024-11-14 8:45 ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai
2024-11-14 8:45 ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai
2024-11-14 10:05 ` [EXTERNAL] " Jerin Jacob
2024-11-14 12:06 ` Huichao Cai
2024-11-14 13:04 ` Jerin Jacob
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).