* [PATCH] graph: optimize graph search when scheduling nodes
@ 2024-11-07  8:04 Huichao cai
  2024-11-07  9:37 ` [EXTERNAL] " Jerin Jacob
  2024-11-11  4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai
  0 siblings, 2 replies; 26+ messages in thread
From: Huichao cai @ 2024-11-07  8:04 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

The function __rte_graph_mcore_dispatch_sched_node_enqueue uses a slow
loop to search for the graph. Modify the search logic to record the
result of the first search and reuse it for subsequent searches, which
improves search speed.

Signed-off-by: Huichao cai <chcchc88@163.com>
---
 lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
 lib/graph/rte_graph_worker_common.h        |  1 +
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9..a81d338 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@
 		struct rte_graph_rq_head *rq)
 {
 	const unsigned int lcore_id = node->dispatch.lcore_id;
-	struct rte_graph *graph;
+	struct rte_graph *graph = node->dispatch.graph;
 
-	SLIST_FOREACH(graph, rq, next)
-		if (graph->dispatch.lcore_id == lcore_id)
-			break;
+	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+		SLIST_FOREACH(graph, rq, next)
+			if (graph->dispatch.lcore_id == lcore_id)
+				break;
+		node->dispatch.graph = graph;
+	}
 
 	return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
 }
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2..4c2432b 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
+			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
 	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
-- 
1.8.3.1

^ permalink raw reply [flat|nested] 26+ messages in thread
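The caching idea in the patch can be sketched outside DPDK. This is a minimal, self-contained illustration and not the library code: demo_graph, demo_node, demo_find_graph and the searches counter are hypothetical stand-ins for rte_graph, rte_node and the enqueue helper. Only the SLIST_FOREACH walk and the "use the cached pointer if it still matches, otherwise search and store" check mirror the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for rte_graph: one graph clone per lcore. */
struct demo_graph {
	unsigned int lcore_id;
	SLIST_ENTRY(demo_graph) next;
};
SLIST_HEAD(demo_rq_head, demo_graph);

/* Hypothetical stand-in for the dispatch area of rte_node. */
struct demo_node {
	unsigned int lcore_id;    /* lcore the node is bound to */
	struct demo_graph *graph; /* cached result of the last search */
	unsigned int searches;    /* demo-only: number of list walks */
};

/* Return the graph running on the node's lcore, or NULL if none exists. */
static struct demo_graph *
demo_find_graph(struct demo_node *node, struct demo_rq_head *rq)
{
	struct demo_graph *graph = node->graph;

	/* Walk the list only if nothing is cached or the cache is stale. */
	if (graph == NULL || graph->lcore_id != node->lcore_id) {
		node->searches++;
		SLIST_FOREACH(graph, rq, next)
			if (graph->lcore_id == node->lcore_id)
				break;
		/* graph is NULL here when the walk found no match. */
		node->graph = graph;
	}
	return graph;
}
```

Note that a failed search caches NULL, so a node whose lcore has no graph still walks the list on every call, which matches the behaviour of the condition in the diff.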
* RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-07  8:04 [PATCH] graph: optimize graph search when scheduling nodes Huichao cai
@ 2024-11-07  9:37 ` Jerin Jacob
  2024-11-08  1:39   ` Huichao Cai
  2024-11-11  4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai
  1 sibling, 1 reply; 26+ messages in thread
From: Jerin Jacob @ 2024-11-07  9:37 UTC (permalink / raw)
  To: Huichao cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163
  Cc: dev

> -----Original Message-----
> From: Huichao cai <chcchc88@163.com>
> Sent: Thursday, November 7, 2024 1:35 PM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org
> Subject: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling
> nodes
>
> The function __rte_graph_mcore_dispatch_sched_node_enqueue uses a slow
> loop to search for the graph. Modify the search logic to record the
> result of the first search and reuse it for subsequent searches, which
> improves search speed.
>
> Signed-off-by: Huichao cai <chcchc88@163.com>
> ---
>  	return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
>  }
> diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index a518af2..4c2432b 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
>  			unsigned int lcore_id;  /**< Node running lcore. */
>  			uint64_t total_sched_objs; /**< Number of objects scheduled. */
>  			uint64_t total_sched_fail; /**< Number of scheduled failure. */
> +			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */

Isn't this breaking the ABI?

Also, please change the commit title as follows for mcore-specific changes:
graph: mcore: ...

>  		} dispatch;
>  	};
>  	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> --
> 1.8.3.1

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-07  9:37 ` [EXTERNAL] " Jerin Jacob
@ 2024-11-08  1:39   ` Huichao Cai
  2024-11-08 12:22     ` Jerin Jacob
  0 siblings, 1 reply; 26+ messages in thread
From: Huichao Cai @ 2024-11-08  1:39 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163, dev

> Isn't this breaking the ABI?

So can't we modify the ABI, or is there any special operation required to
modify the ABI?

^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-08  1:39   ` Huichao Cai
@ 2024-11-08 12:22     ` Jerin Jacob
  2024-11-08 13:38       ` David Marchand
  0 siblings, 1 reply; 26+ messages in thread
From: Jerin Jacob @ 2024-11-08 12:22 UTC (permalink / raw)
  To: Huichao Cai
  Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
	dev, Thomas Monjalon, david.marchand, Robin Jarry

> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Friday, November 8, 2024 7:10 AM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; yanzhirun_163@163.com;
> dev@dpdk.org
> Subject: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
>
> > Isn't this breaking the ABI?
>
> So can't we modify the ABI, or is there any special operation required to
> modify the ABI?

Only an LTS release (xx.11) can change the ABI, after sending a deprecation
notice. Looking at the pahole output, one option is to make dispatch and new
semi-fastpath additions like xstat_off min cache line aligned, to make room
for future expansion and to ensure better performance.
For the xstat_off addition, there was a deprecation notice to update rte_node.
If there are no objections, maybe we can try the following in this release,
so that Huichao does not have to wait one more year.

[main] [dpdk.org] $ git diff
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..ec9a82186d 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
 	/** Original process function when pcap is enabled. */
 	rte_node_process_t original_process;
 
+	alignas(RTE_CACHE_LINE_MIN_SIZE)
 	union {
 		/* Fast schedule area for mcore dispatch model */
 		struct {
@@ -112,6 +113,7 @@ struct __rte_cache_aligned rte_node {
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
 		} dispatch;
 	};
+	alignas(RTE_CACHE_LINE_MIN_SIZE)
 	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
 	/* Fast path area */
 	__extension__ struct __rte_cache_aligned {

^ permalink raw reply [flat|nested] 26+ messages in thread
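Why cache-line alignment "makes room for future expansion" can be checked with a small layout sketch. It is an illustration under assumptions, not the DPDK headers: the demo_* types are hypothetical, DEMO_CACHE_LINE_MIN_SIZE is assumed to be 64, and the dispatch area is a named struct rather than an anonymous union to keep the example simple. Because the next member starts on its own cache line, appending a field to the dispatch area consumes padding and leaves later offsets and the total size unchanged.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value; stands in for RTE_CACHE_LINE_MIN_SIZE. */
#define DEMO_CACHE_LINE_MIN_SIZE 64

struct demo_graph; /* opaque, hypothetical */

/* Dispatch area as first released: three fields. */
struct demo_dispatch_v1 {
	unsigned int lcore_id;
	uint64_t total_sched_objs;
	uint64_t total_sched_fail;
};

/* Dispatch area after a later change appends a cached graph pointer. */
struct demo_dispatch_v2 {
	unsigned int lcore_id;
	uint64_t total_sched_objs;
	uint64_t total_sched_fail;
	struct demo_graph *graph;
};

struct demo_node_v1 {
	void *original_process;
	alignas(DEMO_CACHE_LINE_MIN_SIZE) struct demo_dispatch_v1 dispatch;
	alignas(DEMO_CACHE_LINE_MIN_SIZE) uint32_t xstat_off;
};

struct demo_node_v2 {
	void *original_process;
	alignas(DEMO_CACHE_LINE_MIN_SIZE) struct demo_dispatch_v2 dispatch;
	alignas(DEMO_CACHE_LINE_MIN_SIZE) uint32_t xstat_off;
};

/* The new field lands in padding: nothing after it moves. */
static_assert(offsetof(struct demo_node_v1, xstat_off) ==
	      offsetof(struct demo_node_v2, xstat_off),
	      "xstat_off must keep its offset");
static_assert(sizeof(struct demo_node_v1) == sizeof(struct demo_node_v2),
	      "node size must not change");
```

The sketch only holds while the dispatch area still fits in the reserved cache line; once it outgrows the padding, the following members shift again.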
* Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-08 12:22     ` Jerin Jacob
@ 2024-11-08 13:38       ` David Marchand
  2024-11-11  5:38         ` Jerin Jacob
  0 siblings, 1 reply; 26+ messages in thread
From: David Marchand @ 2024-11-08 13:38 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry

Hello Jerin,

On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > Isn't this breaking the ABI?
> >
> > So can't we modify the ABI, or is there any special operation required
> > to modify the ABI?
>
> Only an LTS release (xx.11) can change the ABI, after sending a deprecation
> notice. Looking at the pahole output, one option is to make dispatch and new
> semi-fastpath additions like xstat_off min cache line aligned, to make room
> for future expansion and to ensure better performance.

Adding holes may be a short term solution, but in my opinion, the slow
path part should be entirely hidden and we only expose the fp part.

Reminder, those holes must be in a "known state" as we release v24.11
so that the presence of future additions can be safely detected.

-- 
David Marchand

^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-08 13:38       ` David Marchand
@ 2024-11-11  5:38         ` Jerin Jacob
  2024-11-12  8:51           ` David Marchand
  0 siblings, 1 reply; 26+ messages in thread
From: Jerin Jacob @ 2024-11-11  5:38 UTC (permalink / raw)
  To: David Marchand
  Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, November 8, 2024 7:08 PM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
>
> Hello Jerin,

Hello David,

> On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > > Isn't this breaking the ABI?
> > >
> > > So can't we modify the ABI, or is there any special operation
> > > required to modify the ABI?
> >
> > Only an LTS release (xx.11) can change the ABI, after sending a
> > deprecation notice. Looking at the pahole output, one option is to make
> > dispatch and new semi-fastpath additions like xstat_off min cache line
> > aligned, to make room for future expansion and to ensure better
> > performance.
>
> Adding holes may be a short term solution, but in my opinion, the slow path
> part should be entirely hidden and we only expose the fp part.

The newly proposed cache-line-aligned items are fastpath items only.

> Reminder, those holes must be in a "known state" as we release v24.11 so that
> the presence of future additions can be safely detected.
>
>
> --
> David Marchand

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-11  5:38         ` Jerin Jacob
@ 2024-11-12  8:51           ` David Marchand
  2024-11-12  9:35             ` Jerin Jacob
  0 siblings, 1 reply; 26+ messages in thread
From: David Marchand @ 2024-11-12  8:51 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry

On Mon, Nov 11, 2024 at 6:39 AM Jerin Jacob <jerinj@marvell.com> wrote:
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Friday, November 8, 2024 7:08 PM
> > To: Jerin Jacob <jerinj@marvell.com>
> > Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> > <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> > <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> > Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> > Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> > scheduling nodes
> >
> > Hello Jerin,
>
> Hello David,
>
> > On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > > > Isn't this breaking the ABI?
> > > >
> > > > So can't we modify the ABI, or is there any special operation
> > > > required to modify the ABI?
> > >
> > > Only an LTS release (xx.11) can change the ABI, after sending a
> > > deprecation notice. Looking at the pahole output, one option is to
> > > make dispatch and new semi-fastpath additions like xstat_off min cache
> > > line aligned, to make room for future expansion and to ensure better
> > > performance.
> >
> > Adding holes may be a short term solution, but in my opinion, the slow path
> > part should be entirely hidden and we only expose the fp part.
>
> The newly proposed cache-line-aligned items are fastpath items only.

I had only noticed the second comment:

+       alignas(RTE_CACHE_LINE_MIN_SIZE)
        rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
        /* Fast path area */
        ^^^^^^^^^^^^

And I assumed the part in the struct before was slow path.
(it may be worth enhancing these comments, with a single limit of
slow/fast path areas)

> > Reminder, those holes must be in a "known state" as we release v24.11 so that
> > the presence of future additions can be safely detected.

If the rte_node objects are allocated by the graph library and zero'd,
then we are good.
It seems to be the case in graph_nodes_populate(), and the rte_node
objects are embedded in the rte_graph object.

Is there another location in the graph library where a rte_node object
is allocated?

If not, and an application can not create a rte_node object, your
proposal looks good to me.

-- 
David Marchand

^ permalink raw reply [flat|nested] 26+ messages in thread
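The "known state" requirement can be illustrated with a rough sketch. Everything here is hypothetical (demo_old, demo_new, the 64-byte slot size) and it is not how DPDK allocates nodes: it only shows that if an object laid out by the older definition comes from zeroed memory, a field that a newer definition places in the former hole reads as NULL, so newer code can tell "not set yet" from garbage. The sketch assumes that a null pointer is represented by all-zero bytes and that storing to a member leaves padding bytes untouched, both true on mainstream platforms but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_SLOT_SIZE 64 /* assumed cache-line-sized slot */

/* Older layout: the first slot has a hole after total_sched_fail. */
struct demo_old {
	alignas(DEMO_SLOT_SIZE) uint64_t total_sched_fail;
	alignas(DEMO_SLOT_SIZE) uint32_t xstat_off;
};

/* Newer layout: a pointer is added inside the former hole. */
struct demo_new {
	alignas(DEMO_SLOT_SIZE) uint64_t total_sched_fail;
	void *graph;
	alignas(DEMO_SLOT_SIZE) uint32_t xstat_off;
};

static_assert(sizeof(struct demo_old) == sizeof(struct demo_new),
	      "both layouts must have the same size");

/*
 * Returns 1 if an object created with the old layout from zeroed memory
 * shows the new field as unset when read through the new layout,
 * 0 otherwise, and -1 on allocation failure.
 */
static int
demo_new_field_is_unset(void)
{
	struct demo_old *old = calloc(1, sizeof(*old));
	struct demo_new view;
	int unset;

	if (old == NULL)
		return -1;
	old->total_sched_fail = 3;
	old->xstat_off = 42;

	/* Copy the bytes instead of casting, to avoid aliasing problems. */
	memcpy(&view, old, sizeof(view));
	unset = view.graph == NULL &&
		view.total_sched_fail == 3 && view.xstat_off == 42;

	free(old);
	return unset;
}
```

With malloc() in place of calloc() the hole would hold indeterminate bytes and the check on graph would be meaningless, which is the situation the reminder warns about.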
* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-12  8:51           ` David Marchand
@ 2024-11-12  9:35             ` Jerin Jacob
  2024-11-12 12:57               ` Huichao Cai
  2024-11-13  9:22               ` Huichao Cai
  0 siblings, 2 replies; 26+ messages in thread
From: Jerin Jacob @ 2024-11-12  9:35 UTC (permalink / raw)
  To: David Marchand, Huichao Cai
  Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
	dev, Thomas Monjalon, Robin Jarry

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, November 12, 2024 2:21 PM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
>
> On Mon, Nov 11, 2024 at 6:39 AM Jerin Jacob <jerinj@marvell.com> wrote:
> > > -----Original Message-----
> > > From: David Marchand <david.marchand@redhat.com>
> > > Sent: Friday, November 8, 2024 7:08 PM
> > > To: Jerin Jacob <jerinj@marvell.com>
> > > Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> > > <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> > > <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> > > Thomas Monjalon <thomas@monjalon.net>; Robin Jarry
> > > <rjarry@redhat.com>
> > > Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search
> > > when scheduling nodes
> > >
> > > Hello Jerin,
> >
> > Hello David,
> >
> > > On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > > > > Isn't this breaking the ABI?
> > > > >
> > > > > So can't we modify the ABI, or is there any special operation
> > > > > required to modify the ABI?
> > > >
> > > > Only an LTS release (xx.11) can change the ABI, after sending a
> > > > deprecation notice. Looking at the pahole output, one option is to
> > > > make dispatch and new semi-fastpath additions like xstat_off min
> > > > cache line aligned, to make room for future expansion and to ensure
> > > > better performance.
> > >
> > > Adding holes may be a short term solution, but in my opinion, the
> > > slow path part should be entirely hidden and we only expose the fp part.
> >
> > The newly proposed cache-line-aligned items are fastpath items only.
>
> I had only noticed the second comment:
>
> +       alignas(RTE_CACHE_LINE_MIN_SIZE)
>         rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
>         /* Fast path area */
>         ^^^^^^^^^^^^
>
> And I assumed the part in the struct before was slow path.
> (it may be worth enhancing these comments, with a single limit of slow/fast
> path areas)

Yes. xstat_off was a new addition as a fastpath item in this release, and
there was no space in the original fastpath area.
And yes, the comment needs to be updated.

> > > Reminder, those holes must be in a "known state" as we release
> > > v24.11 so that the presence of future additions can be safely detected.
>
> If the rte_node objects are allocated by the graph library and zero'd, then we
> are good.
> It seems to be the case in graph_nodes_populate(), and the rte_node objects
> are embedded in the rte_graph object.
>
> Is there another location in the graph library where a rte_node object is
> allocated?

No

> If not, and an application can not create a rte_node object, your proposal looks
> good to me.

OK.

@Huichao Cai Please send two patches, (a) the new proposal and (b) your
improvement, as a series.
Update the ABI Changes section in doc/guides/rel_notes/release_24_11.rst.

> --
> David Marchand

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re:RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-12  9:35             ` Jerin Jacob
@ 2024-11-12 12:57               ` Huichao Cai
  2024-11-13  9:22               ` Huichao Cai
  1 sibling, 0 replies; 26+ messages in thread
From: Huichao Cai @ 2024-11-12 12:57 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: David Marchand, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry

> OK. @Huichao Cai Please send two patches, (a) the new proposal and (b) your
> improvement, as a series.
> Update the ABI Changes section in doc/guides/rel_notes/release_24_11.rst.

OK. I will send these two patches soon.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re:RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-12  9:35             ` Jerin Jacob
  2024-11-12 12:57               ` Huichao Cai
@ 2024-11-13  9:22               ` Huichao Cai
  2024-11-14  7:09                 ` Jerin Jacob
  1 sibling, 1 reply; 26+ messages in thread
From: Huichao Cai @ 2024-11-13  9:22 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: David Marchand, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry

> [main] [dpdk.org] $ git diff
> diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
> index a518af2b2a..ec9a82186d 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
>  	/** Original process function when pcap is enabled. */
>  	rte_node_process_t original_process;
>
> +	alignas(RTE_CACHE_LINE_MIN_SIZE)
>  	union {

Hi, Jerin

C++ does not allow alignas in front of an anonymous union (see the build
failure below). Do we need to fill in reserved fields in order to keep the
union aligned to RTE_CACHE_LINE_MIN_SIZE bytes?

>  		/* Fast schedule area for mcore dispatch model */
>  		struct {
> @@ -112,6 +113,7 @@ struct __rte_cache_aligned rte_node {
>  			uint64_t total_sched_fail; /**< Number of scheduled failure. */
>  		} dispatch;
>  	};
> +	alignas(RTE_CACHE_LINE_MIN_SIZE)
>  	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
>  	/* Fast path area */
>  	__extension__ struct __rte_cache_aligned {

FAILED: buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o
ccache c++ -Ibuildtools/chkincs/chkincs-cpp.p -Ibuildtools/chkincs -I../buildtools/chkincs -Iexamples/l3fwd -I../examples/l3fwd -I../examples/common -Idrivers/bus/vdev -I../drivers/bus/vdev -I. -I..
-Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vmbus -I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -Ilib/argparse -I../lib/argparse -Ilib/ptr_compress -I../lib/ptr_compress -Ilib/ring -I../lib/ring -Ilib/rcu -I../lib/rcu -Ilib/mempool -I../lib/mempool -Ilib/mbuf -I../lib/mbuf -Ilib/net -I../lib/net -Ilib/meter -I../lib/meter -Ilib/ethdev -I../lib/ethdev -Ilib/cmdline -I../lib/cmdline -Ilib/hash -I../lib/hash -Ilib/timer -I../lib/timer -Ilib/acl -I../lib/acl -Ilib/bbdev -I../lib/bbdev -Ilib/bitratestats -I../lib/bitratestats -Ilib/bpf -I../lib/bpf -Ilib/cfgfile -I../lib/cfgfile -Ilib/compressdev -I../lib/compressdev -Ilib/cryptodev -I../lib/cryptodev -Ilib/distributor -I../lib/distributor -Ilib/dmadev -I../lib/dmadev -Ilib/efd -I../lib/efd -Ilib/eventdev -I../lib/eventdev -Ilib/dispatcher -I../lib/dispatcher -Ilib/gpudev -I../lib/gpudev -Ilib/gro -I../lib/gro -Ilib/gso -I../lib/gso -Ilib/ip_frag -I../lib/ip_frag -Ilib/jobstats -I../lib/jobstats -Ilib/latencystats -I../lib/latencystats -Ilib/lpm -I../lib/lpm -Ilib/member -I../lib/member -Ilib/pcapng -I../lib/pcapng -Ilib/power -I../lib/power -Ilib/rawdev -I../lib/rawdev -Ilib/regexdev -I../lib/regexdev -Ilib/mldev -I../lib/mldev -Ilib/rib -I../lib/rib -Ilib/reorder -I../lib/reorder -Ilib/sched -I../lib/sched -Ilib/security -I../lib/security -Ilib/stack -I../lib/stack -Ilib/vhost -I../lib/vhost -Ilib/ipsec -I../lib/ipsec -Ilib/pdcp -I../lib/pdcp -Ilib/fib -I../lib/fib -Ilib/port -I../lib/port -Ilib/pdump -I../lib/pdump -Ilib/table -I../lib/table -Ilib/pipeline -I../lib/pipeline -Ilib/graph -I../lib/graph 
-Ilib/node -I../lib/node -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wnon-virtual-dtor -Wextra -Werror -g -include rte_config.h -march=corei7 -mrtm -MD -MQ buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o -MF buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o.d -o buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o -c buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp
In file included from /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_model_rtc.h:6,
                 from /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker.h:9,
                 from buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp:1:
/home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:15: error: attribute ignored in declaration of ‘union rte_node::<unnamed>’ [-Werror=attributes]
  108 |         union {
      |               ^
/home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:15: note: attribute for ‘union rte_node::<unnamed>’ must follow the ‘union’ keyword
cc1plus: all warnings being treated as errors
[5410/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_table_lpm.cpp.o
[5411/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_port_in_action.cpp.o
[5412/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_pipeline.cpp.o
[5413/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_table_action.cpp.o
[5414/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_swx_ipsec.cpp.o
ninja: build stopped: subcommand failed.

^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: Re:RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-13  9:22               ` Huichao Cai
@ 2024-11-14  7:09                 ` Jerin Jacob
  0 siblings, 0 replies; 26+ messages in thread
From: Jerin Jacob @ 2024-11-14  7:09 UTC (permalink / raw)
  To: Huichao Cai
  Cc: David Marchand, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry

> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Wednesday, November 13, 2024 2:52 PM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: David Marchand <david.marchand@redhat.com>; Kiran Kumar
> Kokkilagadda <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> Subject: Re:RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
>
> > [main] [dpdk.org] $ git diff
> > diff --git a/lib/graph/rte_graph_worker_common.h
> > b/lib/graph/rte_graph_worker_common.h
> > index a518af2b2a..ec9a82186d 100644
> > --- a/lib/graph/rte_graph_worker_common.h
> > +++ b/lib/graph/rte_graph_worker_common.h
> > @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
> >  	/** Original process function when pcap is enabled. */
> >  	rte_node_process_t original_process;
> >
> > +	alignas(RTE_CACHE_LINE_MIN_SIZE)
> >  	union {
>
> Hi, Jerin
>
> C++ does not allow alignas in front of an anonymous union (see the build
> failure below). Do we need to fill in reserved fields in order to keep the
> union aligned to RTE_CACHE_LINE_MIN_SIZE bytes?

You can bring it inside the structure.

^ permalink raw reply [flat|nested] 26+ messages in thread
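One possible reading of "bring it inside the structure" is to move the alignment specifier off the anonymous union and onto something inside it. The sketch below is an assumption about what that could look like, not the patch that was eventually sent: demo_node is a hypothetical reduction of rte_node and DEMO_CACHE_LINE_MIN_SIZE is assumed to be 64. Placing alignas on the first member of the dispatch struct raises the alignment of the struct, and with it the alignment of the enclosing anonymous union, without putting any attribute in front of the union keyword.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value; stands in for RTE_CACHE_LINE_MIN_SIZE. */
#define DEMO_CACHE_LINE_MIN_SIZE 64

/* Hypothetical reduction of struct rte_node. */
struct demo_node {
	void *original_process;

	union {
		/* Fast schedule area for the mcore dispatch model. */
		struct {
			/* Alignment moved from the union to the first member. */
			alignas(DEMO_CACHE_LINE_MIN_SIZE) unsigned int lcore_id;
			uint64_t total_sched_objs;
			uint64_t total_sched_fail;
		} dispatch;
	};

	alignas(DEMO_CACHE_LINE_MIN_SIZE) uint32_t xstat_off;
};

/* The dispatch area still starts on its own cache line. */
static_assert(offsetof(struct demo_node, dispatch) %
	      DEMO_CACHE_LINE_MIN_SIZE == 0,
	      "dispatch must be cache-line aligned");
static_assert(alignof(struct demo_node) >= DEMO_CACHE_LINE_MIN_SIZE,
	      "member alignment propagates to the node");
```

Since an alignment specifier on a named non-static data member is accepted by both C11 and C++11, this form should avoid the "attribute ignored" error from the header check, although that has not been verified against the DPDK build here.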
* [PATCH v2] graph: mcore: optimize graph search 2024-11-07 8:04 [PATCH] graph: optimize graph search when scheduling nodes Huichao cai 2024-11-07 9:37 ` [EXTERNAL] " Jerin Jacob @ 2024-11-11 4:03 ` Huichao Cai 2024-11-11 5:46 ` [EXTERNAL] " Jerin Jacob 2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai 1 sibling, 2 replies; 26+ messages in thread From: Huichao Cai @ 2024-11-11 4:03 UTC (permalink / raw) To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev, Huichao cai From: Huichao cai <chcchc88@163.com> In the function __rte_graph_mcore_dispatch_sched_node_enqueue, use a slower loop to search for the graph, modify the search logic to record the result of the first search, and use this record for subsequent searches to improve search speed. Signed-off-by: Huichao cai <chcchc88@163.com> --- lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++---- lib/graph/rte_graph_worker_common.h | 1 + 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c index a590fc9..a81d338 100644 --- a/lib/graph/rte_graph_model_mcore_dispatch.c +++ b/lib/graph/rte_graph_model_mcore_dispatch.c @@ -118,11 +118,14 @@ struct rte_graph_rq_head *rq) { const unsigned int lcore_id = node->dispatch.lcore_id; - struct rte_graph *graph; + struct rte_graph *graph = node->dispatch.graph; - SLIST_FOREACH(graph, rq, next) - if (graph->dispatch.lcore_id == lcore_id) - break; + if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) { + SLIST_FOREACH(graph, rq, next) + if (graph->dispatch.lcore_id == lcore_id) + break; + node->dispatch.graph = graph; + } return graph != NULL ? 
__graph_sched_node_enqueue(node, graph) : false; } diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index a518af2..4c2432b 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node { unsigned int lcore_id; /**< Node running lcore. */ uint64_t total_sched_objs; /**< Number of objects scheduled. */ uint64_t total_sched_fail; /**< Number of scheduled failure. */ + struct rte_graph *graph; /**< Graph corresponding to lcore_id. */ } dispatch; }; rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ -- 1.8.3.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
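The caching idea in the patch above can be sketched in isolation. This is a minimal illustration only, not the DPDK code: `toy_graph`, `toy_node`, `toy_find_graph` and the `slow_searches` counter are hypothetical names, and a plain `next` pointer stands in for the `SLIST` run queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for rte_graph and rte_node (not the DPDK types). */
struct toy_graph {
	unsigned int lcore_id;   /* lcore this graph runs on */
	struct toy_graph *next;  /* next graph in the run queue */
};

struct toy_node {
	unsigned int lcore_id;      /* lcore the node is bound to */
	struct toy_graph *graph;    /* cached result of the last search */
	unsigned int slow_searches; /* instrumentation for this sketch only */
};

/*
 * Return the graph matching the node's lcore. The list is walked only when
 * the cache is empty or stale (the node's lcore no longer matches).
 */
static struct toy_graph *
toy_find_graph(struct toy_node *node, struct toy_graph *rq_head)
{
	struct toy_graph *graph = node->graph;

	if (graph == NULL || graph->lcore_id != node->lcore_id) {
		node->slow_searches++;
		for (graph = rq_head; graph != NULL; graph = graph->next)
			if (graph->lcore_id == node->lcore_id)
				break;
		/* graph is NULL here when no entry matched. */
		node->graph = graph;
	}
	return graph;
}
```

In this sketch a second lookup with an unchanged `lcore_id` returns the cached pointer without touching the list, while changing `lcore_id` invalidates the cache on the next call. A failed search stores NULL, so it is retried on every call.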
* RE: [EXTERNAL] [PATCH v2] graph: mcore: optimize graph search 2024-11-11 4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai @ 2024-11-11 5:46 ` Jerin Jacob 2024-11-13 9:19 ` Huichao Cai 2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai 1 sibling, 1 reply; 26+ messages in thread From: Jerin Jacob @ 2024-11-11 5:46 UTC (permalink / raw) To: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163, david.marchand, Thomas Monjalon Cc: dev > -----Original Message----- > From: Huichao Cai <chcchc88@163.com> > Sent: Monday, November 11, 2024 9:33 AM > To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda > <kirankumark@marvell.com>; Nithin Kumar Dabilpuram > <ndabilpuram@marvell.com>; yanzhirun_163@163.com > Cc: dev@dpdk.org; Huichao cai <chcchc88@163.com> > Subject: [EXTERNAL] [PATCH v2] graph: mcore: optimize graph search > > From: Huichao cai <chcchc88@163.com> > > In the function __rte_graph_mcore_dispatch_sched_node_enqueue, > use a slower loop to search for the graph, modify the search logic to record the > result of the first search, and use this record for subsequent searches to > improve search speed. 
> > Signed-off-by: Huichao cai <chcchc88@163.com> > --- > lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++---- > lib/graph/rte_graph_worker_common.h | 1 + > 2 files changed, 8 insertions(+), 4 deletions(-) > > diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c > b/lib/graph/rte_graph_model_mcore_dispatch.c > index a590fc9..a81d338 100644 > --- a/lib/graph/rte_graph_model_mcore_dispatch.c > +++ b/lib/graph/rte_graph_model_mcore_dispatch.c > @@ -118,11 +118,14 @@ > struct rte_graph_rq_head *rq) { > const unsigned int lcore_id = node->dispatch.lcore_id; > - struct rte_graph *graph; > + struct rte_graph *graph = node->dispatch.graph; > > - SLIST_FOREACH(graph, rq, next) > - if (graph->dispatch.lcore_id == lcore_id) > - break; > + if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) { > + SLIST_FOREACH(graph, rq, next) > + if (graph->dispatch.lcore_id == lcore_id) > + break; > + node->dispatch.graph = graph; > + } > > return graph != NULL ? __graph_sched_node_enqueue(node, graph) : > false; } diff --git a/lib/graph/rte_graph_worker_common.h > b/lib/graph/rte_graph_worker_common.h > index a518af2..4c2432b 100644 > --- a/lib/graph/rte_graph_worker_common.h > +++ b/lib/graph/rte_graph_worker_common.h > @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node { > unsigned int lcore_id; /**< Node running lcore. */ > uint64_t total_sched_objs; /**< Number of objects > scheduled. */ > uint64_t total_sched_fail; /**< Number of scheduled > failure. */ > + struct rte_graph *graph; /**< Graph corresponding to > lcore_id. */ Need to conclude the ABI related discussion here before making change https://patches.dpdk.org/project/dpdk/patch/1730966682-2632-1-git-send-email-chcchc88@163.com/ > } dispatch; > }; > rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re:RE: [EXTERNAL] [PATCH v2] graph: mcore: optimize graph search 2024-11-11 5:46 ` [EXTERNAL] " Jerin Jacob @ 2024-11-13 9:19 ` Huichao Cai 0 siblings, 0 replies; 26+ messages in thread From: Huichao Cai @ 2024-11-13 9:19 UTC (permalink / raw) To: Jerin Jacob Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163, david.marchand, Thomas Monjalon, dev [-- Attachment #1: Type: text/plain, Size: 5102 bytes --] > [main] [dpdk.org] $ git diff > diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h > index a518af2b2a..ec9a82186d 100644 > --- a/lib/graph/rte_graph_worker_common.h > +++ b/lib/graph/rte_graph_worker_common.h > @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node { > /** Original process function when pcap is enabled. */ > rte_node_process_t original_process; > + alignas(RTE_CACHE_LINE_MIN_SIZE) > union { Hi Jerin, in C++ an anonymous union cannot be aligned this way (see the build failure below). Do we need to add reserved fields to keep the union aligned to RTE_CACHE_LINE_MIN_SIZE bytes? > /* Fast schedule area for mcore dispatch model */ > struct { > @@ -112,6 +113,7 @@ struct __rte_cache_aligned rte_node { > uint64_t total_sched_fail; /**< Number of scheduled failure. */ > } dispatch; > }; > + alignas(RTE_CACHE_LINE_MIN_SIZE) > rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ > /* Fast path area */ > __extension__ struct __rte_cache_aligned { FAILED: buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o ccache c++ -Ibuildtools/chkincs/chkincs-cpp.p -Ibuildtools/chkincs -I../buildtools/chkincs -Iexamples/l3fwd -I../examples/l3fwd -I../examples/common -Idrivers/bus/vdev -I../drivers/bus/vdev -I. -I.. 
-Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vmbus -I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -Ilib/argparse -I../lib/argparse -Ilib/ptr_compress -I../lib/ptr_compress -Ilib/ring -I../lib/ring -Ilib/rcu -I../lib/rcu -Ilib/mempool -I../lib/mempool -Ilib/mbuf -I../lib/mbuf -Ilib/net -I../lib/net -Ilib/meter -I../lib/meter -Ilib/ethdev -I../lib/ethdev -Ilib/cmdline -I../lib/cmdline -Ilib/hash -I../lib/hash -Ilib/timer -I../lib/timer -Ilib/acl -I../lib/acl -Ilib/bbdev -I../lib/bbdev -Ilib/bitratestats -I../lib/bitratestats -Ilib/bpf -I../lib/bpf -Ilib/cfgfile -I../lib/cfgfile -Ilib/compressdev -I../lib/compressdev -Ilib/cryptodev -I../lib/cryptodev -Ilib/distributor -I../lib/distributor -Ilib/dmadev -I../lib/dmadev -Ilib/efd -I../lib/efd -Ilib/eventdev -I../lib/eventdev -Ilib/dispatcher -I../lib/dispatcher -Ilib/gpudev -I../lib/gpudev -Ilib/gro -I../lib/gro -Ilib/gso -I../lib/gso -Ilib/ip_frag -I../lib/ip_frag -Ilib/jobstats -I../lib/jobstats -Ilib/latencystats -I../lib/latencystats -Ilib/lpm -I../lib/lpm -Ilib/member -I../lib/member -Ilib/pcapng -I../lib/pcapng -Ilib/power -I../lib/power -Ilib/rawdev -I../lib/rawdev -Ilib/regexdev -I../lib/regexdev -Ilib/mldev -I../lib/mldev -Ilib/rib -I../lib/rib -Ilib/reorder -I../lib/reorder -Ilib/sched -I../lib/sched -Ilib/security -I../lib/security -Ilib/stack -I../lib/stack -Ilib/vhost -I../lib/vhost -Ilib/ipsec -I../lib/ipsec -Ilib/pdcp -I../lib/pdcp -Ilib/fib -I../lib/fib -Ilib/port -I../lib/port -Ilib/pdump -I../lib/pdump -Ilib/table -I../lib/table -Ilib/pipeline -I../lib/pipeline -Ilib/graph -I../lib/graph 
-Ilib/node -I../lib/node -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wnon-virtual-dtor -Wextra -Werror -g -include rte_config.h -march=corei7 -mrtm -MD -MQ buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o -MF buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o.d -o buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_graph_worker.cpp.o -c buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp In file included from /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_model_rtc.h:6, from /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker.h:9, from buildtools/chkincs/chkincs-cpp.p/rte_graph_worker.cpp:1: /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:15: error: attribute ignored in declaration of ‘union rte_node::<unnamed>’ [-Werror=attributes] 108 | union { | ^ /home/runner/work/dpdk/dpdk/lib/graph/rte_graph_worker_common.h:108:15: note: attribute for ‘union rte_node::<unnamed>’ must follow the ‘union’ keyword cc1plus: all warnings being treated as errors [5410/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_table_lpm.cpp.o [5411/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_port_in_action.cpp.o [5412/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_pipeline.cpp.o [5413/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_table_action.cpp.o [5414/6569] Compiling C++ object buildtools/chkincs/chkincs-cpp.p/meson-generated_rte_swx_ipsec.cpp.o ninja: build stopped: subcommand failed. [-- Attachment #2: Type: text/html, Size: 7692 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
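One way around the error reported above, and the placement the later revisions in this thread adopt, is to put `alignas` on the named member inside the union rather than in front of the anonymous union itself. The following is a minimal sketch under stated assumptions: `layout_node` and `TOY_CACHE_LINE_MIN_SIZE` (taken as 64) are hypothetical stand-ins, and the field types only approximate those of `rte_node`.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE_MIN_SIZE 64 /* assumed value for this sketch */

/* Hypothetical, reduced stand-in for struct rte_node. */
struct layout_node {
	void *original_process;

	union {
		/* alignas applies to the named member, not to the anonymous union. */
		alignas(TOY_CACHE_LINE_MIN_SIZE) struct {
			unsigned int lcore_id;
			uint64_t total_sched_objs;
			uint64_t total_sched_fail;
		} dispatch;
	};

	alignas(TOY_CACHE_LINE_MIN_SIZE) uint32_t xstat_off;
};

/* The union inherits the alignment of its strictest member. */
static_assert(offsetof(struct layout_node, dispatch) % TOY_CACHE_LINE_MIN_SIZE == 0,
	      "dispatch starts on a cache-line boundary");
static_assert(offsetof(struct layout_node, xstat_off) % TOY_CACHE_LINE_MIN_SIZE == 0,
	      "xstat_off starts on a cache-line boundary");
```

Because a union takes the strictest alignment of its members, aligning `dispatch` aligns the whole union without reserved padding fields.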
* [PATCH v3 1/2] graph: mcore: optimize graph search 2024-11-11 4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai 2024-11-11 5:46 ` [EXTERNAL] " Jerin Jacob @ 2024-11-13 7:35 ` Huichao Cai 2024-11-13 7:35 ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai 2024-11-14 8:45 ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai 1 sibling, 2 replies; 26+ messages in thread From: Huichao Cai @ 2024-11-13 7:35 UTC (permalink / raw) To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev The function __rte_graph_mcore_dispatch_sched_node_enqueue walks the run queue with a slow loop to find the graph on every call. Modify the search logic to record the result of the first search and reuse that record for subsequent searches to improve search speed. Because this adds a "graph" field to the "rte_node" structure, update release_24_11.rst. Signed-off-by: Huichao Cai <chcchc88@163.com> --- doc/guides/rel_notes/release_24_11.rst | 1 + lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++---- lib/graph/rte_graph_worker_common.h | 1 + 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst index 9dc739c4cb..592116b979 100644 --- a/doc/guides/rel_notes/release_24_11.rst +++ b/doc/guides/rel_notes/release_24_11.rst @@ -423,6 +423,7 @@ ABI Changes added new structure ``rte_node_xstats`` to ``rte_node_register`` and added ``xstat_off`` to ``rte_node``. +* graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure. 
Known Issues ------------ diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c index a590fc9497..a81d338227 100644 --- a/lib/graph/rte_graph_model_mcore_dispatch.c +++ b/lib/graph/rte_graph_model_mcore_dispatch.c @@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node, struct rte_graph_rq_head *rq) { const unsigned int lcore_id = node->dispatch.lcore_id; - struct rte_graph *graph; + struct rte_graph *graph = node->dispatch.graph; - SLIST_FOREACH(graph, rq, next) - if (graph->dispatch.lcore_id == lcore_id) - break; + if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) { + SLIST_FOREACH(graph, rq, next) + if (graph->dispatch.lcore_id == lcore_id) + break; + node->dispatch.graph = graph; + } return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false; } diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index a518af2b2a..4c2432b47f 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node { unsigned int lcore_id; /**< Node running lcore. */ uint64_t total_sched_objs; /**< Number of objects scheduled. */ uint64_t total_sched_fail; /**< Number of scheduled failure. */ + struct rte_graph *graph; /**< Graph corresponding to lcore_id. */ } dispatch; }; rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ -- 2.27.0 ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v3 2/2] graph: add alignment to the member of rte_node 2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai @ 2024-11-13 7:35 ` Huichao Cai 2024-11-14 7:14 ` [EXTERNAL] " Jerin Jacob 2024-11-14 8:45 ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai 1 sibling, 1 reply; 26+ messages in thread From: Huichao Cai @ 2024-11-13 7:35 UTC (permalink / raw) To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev Align the members "dispatch" and "xstat_off" of the structure "rte_node" to the minimum cache line size, to make room for future expansion and to ensure better performance. Because this changes the alignment of some members of the "rte_node" structure, update release_24_11.rst. Signed-off-by: Huichao Cai <chcchc88@163.com> --- doc/guides/rel_notes/release_24_11.rst | 3 +++ lib/graph/rte_graph_worker_common.h | 5 ++++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst index 592116b979..6903b1d0f0 100644 --- a/doc/guides/rel_notes/release_24_11.rst +++ b/doc/guides/rel_notes/release_24_11.rst @@ -425,6 +425,9 @@ ABI Changes * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure. +* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been + marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned. + Known Issues ------------ diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index 4c2432b47f..9e99278a0a 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node { /** Original process function when pcap is enabled. 
*/ rte_node_process_t original_process; + alignas(RTE_CACHE_LINE_MIN_SIZE) union { /* Fast schedule area for mcore dispatch model */ struct { @@ -113,8 +114,10 @@ struct __rte_cache_aligned rte_node { struct rte_graph *graph; /**< Graph corresponding to lcore_id. */ } dispatch; }; - rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ + /* Fast path area */ + alignas(RTE_CACHE_LINE_MIN_SIZE) + rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ __extension__ struct __rte_cache_aligned { #define RTE_NODE_CTX_SZ 16 union { -- 2.27.0 ^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: [EXTERNAL] [PATCH v3 2/2] graph: add alignment to the member of rte_node 2024-11-13 7:35 ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai @ 2024-11-14 7:14 ` Jerin Jacob 0 siblings, 0 replies; 26+ messages in thread From: Jerin Jacob @ 2024-11-14 7:14 UTC (permalink / raw) To: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163 Cc: dev > -----Original Message----- > From: Huichao Cai <chcchc88@163.com> > Sent: Wednesday, November 13, 2024 1:06 PM > To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda > <kirankumark@marvell.com>; Nithin Kumar Dabilpuram > <ndabilpuram@marvell.com>; yanzhirun_163@163.com > Cc: dev@dpdk.org > Subject: [EXTERNAL] [PATCH v3 2/2] graph: add alignment to the member of > rte_node > > The members "dispatch" and "xstat_off" of the structure "rte_node" > can be min cache aligned to make room for future expansion and to make sure > have better performance. > > Due to the modification of the alignment of some members of the "rte_node" > structure, update file release_24_11.rst. > > Signed-off-by: Huichao Cai <chcchc88@163.com> > --- > doc/guides/rel_notes/release_24_11.rst | 3 +++ > lib/graph/rte_graph_worker_common.h | 5 ++++- > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/doc/guides/rel_notes/release_24_11.rst > b/doc/guides/rel_notes/release_24_11.rst > index 592116b979..6903b1d0f0 100644 > --- a/doc/guides/rel_notes/release_24_11.rst > +++ b/doc/guides/rel_notes/release_24_11.rst > @@ -425,6 +425,9 @@ ABI Changes > > * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` > structure. 
> > +* graph: The members ``dispatch`` and ``xstat_off`` of the structure > +``rte_node`` have been > + marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned. > + > Known Issues > ------------ > > diff --git a/lib/graph/rte_graph_worker_common.h > b/lib/graph/rte_graph_worker_common.h > index 4c2432b47f..9e99278a0a 100644 > --- a/lib/graph/rte_graph_worker_common.h > +++ b/lib/graph/rte_graph_worker_common.h > @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node { > /** Original process function when pcap is enabled. */ > rte_node_process_t original_process; > > + alignas(RTE_CACHE_LINE_MIN_SIZE) > union { > /* Fast schedule area for mcore dispatch model */ > struct { > @@ -113,8 +114,10 @@ struct __rte_cache_aligned rte_node { > struct rte_graph *graph; /**< Graph corresponding to > lcore_id. */ > } dispatch; > }; > - rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ > + > /* Fast path area */ Make it as two separate comment, Fast path area cache line 1 and Fastpath area cache line 2. > + alignas(RTE_CACHE_LINE_MIN_SIZE) > + rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ > __extension__ struct __rte_cache_aligned { #define RTE_NODE_CTX_SZ > 16 > union { > -- > 2.27.0 ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v4 1/2] graph: mcore: optimize graph search 2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai 2024-11-13 7:35 ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai @ 2024-11-14 8:45 ` Huichao Cai 2024-11-14 8:45 ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai 1 sibling, 1 reply; 26+ messages in thread From: Huichao Cai @ 2024-11-14 8:45 UTC (permalink / raw) To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev The function __rte_graph_mcore_dispatch_sched_node_enqueue walks the run queue with a slow loop to find the graph on every call. Modify the search logic to record the result of the first search and reuse that record for subsequent searches to improve search speed. Because this adds a "graph" field to the "rte_node" structure, update release_24_11.rst. Signed-off-by: Huichao Cai <chcchc88@163.com> --- doc/guides/rel_notes/release_24_11.rst | 1 + lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++---- lib/graph/rte_graph_worker_common.h | 1 + 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst index 9dc739c4cb..592116b979 100644 --- a/doc/guides/rel_notes/release_24_11.rst +++ b/doc/guides/rel_notes/release_24_11.rst @@ -423,6 +423,7 @@ ABI Changes added new structure ``rte_node_xstats`` to ``rte_node_register`` and added ``xstat_off`` to ``rte_node``. +* graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure. 
Known Issues ------------ diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c index a590fc9497..a81d338227 100644 --- a/lib/graph/rte_graph_model_mcore_dispatch.c +++ b/lib/graph/rte_graph_model_mcore_dispatch.c @@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node, struct rte_graph_rq_head *rq) { const unsigned int lcore_id = node->dispatch.lcore_id; - struct rte_graph *graph; + struct rte_graph *graph = node->dispatch.graph; - SLIST_FOREACH(graph, rq, next) - if (graph->dispatch.lcore_id == lcore_id) - break; + if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) { + SLIST_FOREACH(graph, rq, next) + if (graph->dispatch.lcore_id == lcore_id) + break; + node->dispatch.graph = graph; + } return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false; } diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index a518af2b2a..4c2432b47f 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node { unsigned int lcore_id; /**< Node running lcore. */ uint64_t total_sched_objs; /**< Number of objects scheduled. */ uint64_t total_sched_fail; /**< Number of scheduled failure. */ + struct rte_graph *graph; /**< Graph corresponding to lcore_id. */ } dispatch; }; rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ -- 2.27.0 ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v4 2/2] graph: add alignment to the member of rte_node 2024-11-14 8:45 ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai @ 2024-11-14 8:45 ` Huichao Cai 2024-11-14 10:05 ` [EXTERNAL] " Jerin Jacob 2024-11-15 1:55 ` [PATCH v5 1/1] graph: improve node layout Huichao Cai 0 siblings, 2 replies; 26+ messages in thread From: Huichao Cai @ 2024-11-14 8:45 UTC (permalink / raw) To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev Align the members dispatch and xstat_off of the structure rte_node to the minimum cache line size, to make room for future expansion and to ensure better performance. Add corresponding comments. Because this changes the alignment of some members of the rte_node structure, update release_24_11.rst. Signed-off-by: Huichao Cai <chcchc88@163.com> --- doc/guides/rel_notes/release_24_11.rst | 3 +++ lib/graph/rte_graph_worker_common.h | 7 ++++++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst index 592116b979..6903b1d0f0 100644 --- a/doc/guides/rel_notes/release_24_11.rst +++ b/doc/guides/rel_notes/release_24_11.rst @@ -425,6 +425,9 @@ ABI Changes * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure. +* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been + marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned. + Known Issues ------------ diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index 4c2432b47f..d36abec08b 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -104,16 +104,21 @@ struct __rte_cache_aligned rte_node { /** Original process function when pcap is enabled. */ rte_node_process_t original_process; + /** Fast path area cache line 1. 
*/ union { /* Fast schedule area for mcore dispatch model */ - struct { + alignas(RTE_CACHE_LINE_MIN_SIZE) struct { unsigned int lcore_id; /**< Node running lcore. */ uint64_t total_sched_objs; /**< Number of objects scheduled. */ uint64_t total_sched_fail; /**< Number of scheduled failure. */ struct rte_graph *graph; /**< Graph corresponding to lcore_id. */ } dispatch; }; + + /** Fast path area cache line 2. */ + alignas(RTE_CACHE_LINE_MIN_SIZE) rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ + /* Fast path area */ __extension__ struct __rte_cache_aligned { #define RTE_NODE_CTX_SZ 16 -- 2.27.0 ^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of rte_node 2024-11-14 8:45 ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai @ 2024-11-14 10:05 ` Jerin Jacob 2024-11-14 12:06 ` Huichao Cai 2024-11-15 1:55 ` [PATCH v5 1/1] graph: improve node layout Huichao Cai 1 sibling, 1 reply; 26+ messages in thread From: Jerin Jacob @ 2024-11-14 10:05 UTC (permalink / raw) To: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163, david.marchand Cc: dev > -----Original Message----- > From: Huichao Cai <chcchc88@163.com> > Sent: Thursday, November 14, 2024 2:15 PM > To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda > <kirankumark@marvell.com>; Nithin Kumar Dabilpuram > <ndabilpuram@marvell.com>; yanzhirun_163@163.com > Cc: dev@dpdk.org > Subject: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of > rte_node > > The members dispatch and xstat_off of the structure rte_node can be min cache > aligned to make room for future expansion and to make sure have better > performance. Add corresponding comments. > Please change subject to graph: improve node layout > Due to the modification of the alignment of some members of the rte_node > structure, update file release_24_11.rst. The above section is not needed. 
> > Signed-off-by: Huichao Cai <chcchc88@163.com> > --- > doc/guides/rel_notes/release_24_11.rst | 3 +++ > lib/graph/rte_graph_worker_common.h | 7 ++++++- > 2 files changed, 9 insertions(+), 1 deletion(-) > > diff --git a/doc/guides/rel_notes/release_24_11.rst > b/doc/guides/rel_notes/release_24_11.rst > index 592116b979..6903b1d0f0 100644 > --- a/doc/guides/rel_notes/release_24_11.rst > +++ b/doc/guides/rel_notes/release_24_11.rst > @@ -425,6 +425,9 @@ ABI Changes > > * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` > structure. > > +* graph: The members ``dispatch`` and ``xstat_off`` of the structure > +``rte_node`` have been > + marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned. > + > Known Issues > ------------ > > diff --git a/lib/graph/rte_graph_worker_common.h > b/lib/graph/rte_graph_worker_common.h > index 4c2432b47f..d36abec08b 100644 > --- a/lib/graph/rte_graph_worker_common.h > +++ b/lib/graph/rte_graph_worker_common.h > @@ -104,16 +104,21 @@ struct __rte_cache_aligned rte_node { > /** Original process function when pcap is enabled. */ > rte_node_process_t original_process; > > + /** Fast path area cache line 1. */ Fast schedule area for mcore dispatch model > union { > /* Fast schedule area for mcore dispatch model */ Above comment you can remove it > - struct { > + alignas(RTE_CACHE_LINE_MIN_SIZE) struct { > unsigned int lcore_id; /**< Node running lcore. */ > uint64_t total_sched_objs; /**< Number of objects > scheduled. */ > uint64_t total_sched_fail; /**< Number of scheduled > failure. */ > struct rte_graph *graph; /**< Graph corresponding to > lcore_id. */ > } dispatch; > }; > + > + /** Fast path area cache line 2. */ > + alignas(RTE_CACHE_LINE_MIN_SIZE) > rte_graph_off_t xstat_off; /**< Offset to xstat counters. 
*/ > + > /* Fast path area */ Fast path area cache line 1 > __extension__ struct __rte_cache_aligned { #define RTE_NODE_CTX_SZ > 16 With above: Acked-by: Jerin Jacob <jerinj@marvell.com> Looks like we cannot merge a new feature in rc3. I would suggest skipping 1/2 and sending only this patch so that 1/2 can be merged in the next release. Please add @david.marchand@redhat.com in Cc. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re:RE: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of rte_node 2024-11-14 10:05 ` [EXTERNAL] " Jerin Jacob @ 2024-11-14 12:06 ` Huichao Cai 2024-11-14 13:04 ` Jerin Jacob 0 siblings, 1 reply; 26+ messages in thread From: Huichao Cai @ 2024-11-14 12:06 UTC (permalink / raw) To: Jerin Jacob Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163, david.marchand, dev [-- Attachment #1: Type: text/plain, Size: 1213 bytes --] Hi, Jerin. Like this? diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index a518af2b2a..f9ff7dd8c9 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -104,15 +104,19 @@ struct __rte_cache_aligned rte_node { /** Original process function when pcap is enabled. */ rte_node_process_t original_process; + /** Fast schedule area for mcore dispatch model. */ union { - /* Fast schedule area for mcore dispatch model */ - struct { + alignas(RTE_CACHE_LINE_MIN_SIZE) struct { unsigned int lcore_id; /**< Node running lcore. */ uint64_t total_sched_objs; /**< Number of objects scheduled. */ uint64_t total_sched_fail; /**< Number of scheduled failure. */ } dispatch; }; + + /** Fast path area cache line 1. */ + alignas(RTE_CACHE_LINE_MIN_SIZE) rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ + /* Fast path area */ __extension__ struct __rte_cache_aligned { #define RTE_NODE_CTX_SZ 16 [-- Attachment #2: Type: text/html, Size: 2621 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: Re:RE: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of rte_node 2024-11-14 12:06 ` Huichao Cai @ 2024-11-14 13:04 ` Jerin Jacob 0 siblings, 0 replies; 26+ messages in thread From: Jerin Jacob @ 2024-11-14 13:04 UTC (permalink / raw) To: Huichao Cai Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163, david.marchand, dev > -----Original Message----- > From: Huichao Cai <chcchc88@163.com> > Sent: Thursday, November 14, 2024 5:37 PM > To: Jerin Jacob <jerinj@marvell.com> > Cc: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>; Nithin Kumar > Dabilpuram <ndabilpuram@marvell.com>; yanzhirun_163@163.com; > david.marchand@redhat.com; dev@dpdk.org > Subject: Re:RE: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the > member of rte_node > > Hi, Jerin. Like this? > > > > > diff --git a/lib/graph/rte_graph_worker_common.h > b/lib/graph/rte_graph_worker_common.h > > index a518af2b2a..f9ff7dd8c9 100644 > > --- a/lib/graph/rte_graph_worker_common.h > > +++ b/lib/graph/rte_graph_worker_common.h > > @@ -104,15 +104,19 @@ struct __rte_cache_aligned rte_node { > > /** Original process function when pcap is enabled. */ > > rte_node_process_t original_process; > > > > + /** Fast schedule area for mcore dispatch model. */ > > union { > > - /* Fast schedule area for mcore dispatch model */ > > - struct { > > + alignas(RTE_CACHE_LINE_MIN_SIZE) struct { > > unsigned int lcore_id; /**< Node running lcore. */ > > uint64_t total_sched_objs; /**< Number of objects scheduled. */ > > uint64_t total_sched_fail; /**< Number of scheduled failure. */ > > } dispatch; > > }; > > + > > + /** Fast path area cache line 1. 
*/ > > + alignas(RTE_CACHE_LINE_MIN_SIZE) > > rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ > > + > > /* Fast path area */ Fast path area cache line 2 Rest looks good to me. > > __extension__ struct __rte_cache_aligned { > > #define RTE_NODE_CTX_SZ 16 > ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v5 1/1] graph: improve node layout 2024-11-14 8:45 ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai 2024-11-14 10:05 ` [EXTERNAL] " Jerin Jacob @ 2024-11-15 1:55 ` Huichao Cai 2024-11-15 14:23 ` Thomas Monjalon 1 sibling, 1 reply; 26+ messages in thread From: Huichao Cai @ 2024-11-15 1:55 UTC (permalink / raw) To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev Align the members "dispatch" and "xstat_off" of the structure "rte_node" to the minimum cache line size, to make room for future expansion and to ensure better performance. Add corresponding comments. Signed-off-by: Huichao Cai <chcchc88@163.com> --- doc/guides/rel_notes/release_24_11.rst | 2 ++ lib/graph/rte_graph_worker_common.h | 10 +++++++--- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst index 5063badf39..32800e8cb0 100644 --- a/doc/guides/rel_notes/release_24_11.rst +++ b/doc/guides/rel_notes/release_24_11.rst @@ -491,6 +491,8 @@ ABI Changes added new structure ``rte_node_xstats`` to ``rte_node_register`` and added ``xstat_off`` to ``rte_node``. +* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been + marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned. Known Issues ------------ diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index a518af2b2a..d3ec88519d 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -104,16 +104,20 @@ struct __rte_cache_aligned rte_node { /** Original process function when pcap is enabled. */ rte_node_process_t original_process; + /** Fast schedule area for mcore dispatch model. */ union { - /* Fast schedule area for mcore dispatch model */ - struct { + alignas(RTE_CACHE_LINE_MIN_SIZE) struct { unsigned int lcore_id; /**< Node running lcore. */ uint64_t total_sched_objs; /**< Number of objects scheduled. 
*/ uint64_t total_sched_fail; /**< Number of scheduled failure. */ } dispatch; }; + + /** Fast path area cache line 1. */ + alignas(RTE_CACHE_LINE_MIN_SIZE) rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ - /* Fast path area */ + + /** Fast path area cache line 2. */ __extension__ struct __rte_cache_aligned { #define RTE_NODE_CTX_SZ 16 union { -- 2.27.0 ^ permalink raw reply [flat|nested] 26+ messages in thread
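A minimal sketch of the layout idea in the v5 patch above, assuming a C11 compiler: each area starts on its own cache-line boundary, so fields added later to one area do not shift the next. `demo_node`, `DEMO_CACHE_LINE_MIN_SIZE` (assumed to be 64 here) and `demo_node_layout_ok()` are hypothetical stand-ins for illustration, not DPDK definitions, and the fast path area is simplified to a plain `alignas` member.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignas */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical stand-in for RTE_CACHE_LINE_MIN_SIZE; the value 64 is an assumption. */
#define DEMO_CACHE_LINE_MIN_SIZE 64

/* Simplified, hypothetical model of the patched layout (not the real rte_node). */
struct demo_node {
	void *original_process; /* slow path data precedes the aligned areas */

	/* Fast schedule area: starts on a cache-line boundary. */
	union {
		alignas(DEMO_CACHE_LINE_MIN_SIZE) struct {
			unsigned int lcore_id;
			uint64_t total_sched_objs;
			uint64_t total_sched_fail;
		} dispatch;
	};

	/* Fast path area, cache line 1. */
	alignas(DEMO_CACHE_LINE_MIN_SIZE) uint32_t xstat_off;

	/* Fast path area, cache line 2 (reduced to a context array here). */
	alignas(DEMO_CACHE_LINE_MIN_SIZE) struct {
		uint8_t ctx[16];
	} fast;
};

/* Compile-time checks: every area begins at a multiple of the line size. */
static_assert(offsetof(struct demo_node, dispatch) % DEMO_CACHE_LINE_MIN_SIZE == 0,
	      "dispatch must start on a cache-line boundary");
static_assert(offsetof(struct demo_node, xstat_off) % DEMO_CACHE_LINE_MIN_SIZE == 0,
	      "xstat_off must start on a cache-line boundary");
static_assert(offsetof(struct demo_node, fast) % DEMO_CACHE_LINE_MIN_SIZE == 0,
	      "fast path area must start on a cache-line boundary");

/* Runtime variant of the same checks; returns 1 when all areas are aligned
 * and occupy distinct, increasing offsets. */
int demo_node_layout_ok(void)
{
	const size_t line = DEMO_CACHE_LINE_MIN_SIZE;
	const size_t d = offsetof(struct demo_node, dispatch);
	const size_t x = offsetof(struct demo_node, xstat_off);
	const size_t f = offsetof(struct demo_node, fast);

	return d % line == 0 && x % line == 0 && f % line == 0 && d < x && x < f;
}
```

With this kind of layout, appending a member to `dispatch` (as the earlier graph-search patch in this thread does) leaves the offsets of `xstat_off` and the fast path area unchanged, as long as the area still fits within its cache line.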
* Re: [PATCH v5 1/1] graph: improve node layout 2024-11-15 1:55 ` [PATCH v5 1/1] graph: improve node layout Huichao Cai @ 2024-11-15 14:23 ` Thomas Monjalon 2024-11-15 15:57 ` [EXTERNAL] " Jerin Jacob 0 siblings, 1 reply; 26+ messages in thread From: Thomas Monjalon @ 2024-11-15 14:23 UTC (permalink / raw) To: jerinj, ndabilpuram; +Cc: kirankumark, yanzhirun_163, dev, Huichao Cai Is it good to go? 15/11/2024 02:55, Huichao Cai: > The members "dispatch" and "xstat_off" of the structure "rte_node" > can be min cache aligned to make room for future expansion and to > make sure have better performance. Add corresponding comments. > > Signed-off-by: Huichao Cai <chcchc88@163.com> > --- > doc/guides/rel_notes/release_24_11.rst | 2 ++ > lib/graph/rte_graph_worker_common.h | 10 +++++++--- > 2 files changed, 9 insertions(+), 3 deletions(-) > > diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst > index 5063badf39..32800e8cb0 100644 > --- a/doc/guides/rel_notes/release_24_11.rst > +++ b/doc/guides/rel_notes/release_24_11.rst > @@ -491,6 +491,8 @@ ABI Changes > added new structure ``rte_node_xstats`` to ``rte_node_register`` and > added ``xstat_off`` to ``rte_node``. > > +* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been > + marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned. > > Known Issues > ------------ > diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h > index a518af2b2a..d3ec88519d 100644 > --- a/lib/graph/rte_graph_worker_common.h > +++ b/lib/graph/rte_graph_worker_common.h > @@ -104,16 +104,20 @@ struct __rte_cache_aligned rte_node { > /** Original process function when pcap is enabled. */ > rte_node_process_t original_process; > > + /** Fast schedule area for mcore dispatch model. 
*/ > union { > - /* Fast schedule area for mcore dispatch model */ > - struct { > + alignas(RTE_CACHE_LINE_MIN_SIZE) struct { > unsigned int lcore_id; /**< Node running lcore. */ > uint64_t total_sched_objs; /**< Number of objects scheduled. */ > uint64_t total_sched_fail; /**< Number of scheduled failure. */ > } dispatch; > }; > + > + /** Fast path area cache line 1. */ > + alignas(RTE_CACHE_LINE_MIN_SIZE) > rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ > - /* Fast path area */ > + > + /** Fast path area cache line 2. */ > __extension__ struct __rte_cache_aligned { > #define RTE_NODE_CTX_SZ 16 > union { > ^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: [EXTERNAL] Re: [PATCH v5 1/1] graph: improve node layout 2024-11-15 14:23 ` Thomas Monjalon @ 2024-11-15 15:57 ` Jerin Jacob 2024-11-19 10:31 ` Thomas Monjalon 0 siblings, 1 reply; 26+ messages in thread From: Jerin Jacob @ 2024-11-15 15:57 UTC (permalink / raw) To: Thomas Monjalon, Nithin Kumar Dabilpuram Cc: Kiran Kumar Kokkilagadda, yanzhirun_163, dev, Huichao Cai > -----Original Message----- > From: Thomas Monjalon <thomas@monjalon.net> > Sent: Friday, November 15, 2024 7:54 PM > To: Jerin Jacob <jerinj@marvell.com>; Nithin Kumar Dabilpuram > <ndabilpuram@marvell.com> > Cc: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>; > yanzhirun_163@163.com; dev@dpdk.org; Huichao Cai <chcchc88@163.com> > Subject: [EXTERNAL] Re: [PATCH v5 1/1] graph: improve node layout > > Is it good to go? > > > 15/11/2024 02:55, Huichao Cai: > > The members "dispatch" and "xstat_off" of the structure "rte_node" > > can be min cache aligned to make room for future expansion and to make > > sure have better performance. Add corresponding comments. > > > > Signed-off-by: Huichao Cai <chcchc88@163.com>] Acked-by: Jerin Jacob <jerinj@marvell.com> > > --- > > doc/guides/rel_notes/release_24_11.rst | 2 ++ > > lib/graph/rte_graph_worker_common.h | 10 +++++++--- > > 2 files changed, 9 insertions(+), 3 deletions(-) > > > > diff --git a/doc/guides/rel_notes/release_24_11.rst > > b/doc/guides/rel_notes/release_24_11.rst > > index 5063badf39..32800e8cb0 100644 > > --- a/doc/guides/rel_notes/release_24_11.rst > > +++ b/doc/guides/rel_notes/release_24_11.rst > > @@ -491,6 +491,8 @@ ABI Changes > > added new structure ``rte_node_xstats`` to ``rte_node_register`` and > > added ``xstat_off`` to ``rte_node``. 
> > > > +* graph: The members ``dispatch`` and ``xstat_off`` of the structure > > +``rte_node`` have been > > + marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned. > > > > Known Issues > > ------------ > > diff --git a/lib/graph/rte_graph_worker_common.h > > b/lib/graph/rte_graph_worker_common.h > > index a518af2b2a..d3ec88519d 100644 > > --- a/lib/graph/rte_graph_worker_common.h > > +++ b/lib/graph/rte_graph_worker_common.h > > @@ -104,16 +104,20 @@ struct __rte_cache_aligned rte_node { > > /** Original process function when pcap is enabled. */ > > rte_node_process_t original_process; > > > > + /** Fast schedule area for mcore dispatch model. */ > > union { > > - /* Fast schedule area for mcore dispatch model */ > > - struct { > > + alignas(RTE_CACHE_LINE_MIN_SIZE) struct { > > unsigned int lcore_id; /**< Node running lcore. */ > > uint64_t total_sched_objs; /**< Number of objects > scheduled. */ > > uint64_t total_sched_fail; /**< Number of scheduled > failure. */ > > } dispatch; > > }; > > + > > + /** Fast path area cache line 1. */ > > + alignas(RTE_CACHE_LINE_MIN_SIZE) > > rte_graph_off_t xstat_off; /**< Offset to xstat counters. */ > > - /* Fast path area */ > > + > > + /** Fast path area cache line 2. */ > > __extension__ struct __rte_cache_aligned { #define RTE_NODE_CTX_SZ > > 16 > > union { > > > > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [EXTERNAL] Re: [PATCH v5 1/1] graph: improve node layout 2024-11-15 15:57 ` [EXTERNAL] " Jerin Jacob @ 2024-11-19 10:31 ` Thomas Monjalon 0 siblings, 0 replies; 26+ messages in thread From: Thomas Monjalon @ 2024-11-19 10:31 UTC (permalink / raw) To: Huichao Cai Cc: Nithin Kumar Dabilpuram, dev, Kiran Kumar Kokkilagadda, yanzhirun_163, dev, Jerin Jacob > > 15/11/2024 02:55, Huichao Cai: > > > The members "dispatch" and "xstat_off" of the structure "rte_node" > > > can be min cache aligned to make room for future expansion and to make > > > sure have better performance. Add corresponding comments. > > > > > > Signed-off-by: Huichao Cai <chcchc88@163.com>] > > Acked-by: Jerin Jacob <jerinj@marvell.com> Applied, thanks. ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread (2024-11-19 10:31 UTC) Thread overview: 26+ messages 2024-11-07 8:04 [PATCH] graph: optimize graph search when scheduling nodes Huichao cai 2024-11-07 9:37 ` [EXTERNAL] " Jerin Jacob 2024-11-08 1:39 ` Huichao Cai 2024-11-08 12:22 ` Jerin Jacob 2024-11-08 13:38 ` David Marchand 2024-11-11 5:38 ` Jerin Jacob 2024-11-12 8:51 ` David Marchand 2024-11-12 9:35 ` Jerin Jacob 2024-11-12 12:57 ` Huichao Cai 2024-11-13 9:22 ` Huichao Cai 2024-11-14 7:09 ` Jerin Jacob 2024-11-11 4:03 ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai 2024-11-11 5:46 ` [EXTERNAL] " Jerin Jacob 2024-11-13 9:19 ` Huichao Cai 2024-11-13 7:35 ` [PATCH v3 1/2] " Huichao Cai 2024-11-13 7:35 ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai 2024-11-14 7:14 ` [EXTERNAL] " Jerin Jacob 2024-11-14 8:45 ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai 2024-11-14 8:45 ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai 2024-11-14 10:05 ` [EXTERNAL] " Jerin Jacob 2024-11-14 12:06 ` Huichao Cai 2024-11-14 13:04 ` Jerin Jacob 2024-11-15 1:55 ` [PATCH v5 1/1] graph: improve node layout Huichao Cai 2024-11-15 14:23 ` Thomas Monjalon 2024-11-15 15:57 ` [EXTERNAL] " Jerin Jacob 2024-11-19 10:31 ` Thomas Monjalon