DPDK patches and discussions
* Re: [dpdk-dev] [PATCH] service: improve service run performance
@ 2019-10-07 14:52 Eads, Gage
  0 siblings, 0 replies; 5+ messages in thread
From: Eads, Gage @ 2019-10-07 14:52 UTC
  To: dev; +Cc: Rao, Nikhil

> For a valid service, the core mask of the service
> is checked against the current core and the corresponding
> entry in the active_on_lcore array is set or reset.

> Up to 8 cores share the same cache line for their
> service active_on_lcore array entries since each entry is a uint8_t.
> Some number of these entries also share the cache line with
> the internal_flags member of struct rte_service_spec_impl,
> hence this false sharing also makes the service_valid() check
> expensive.

> Eliminate false sharing by moving the active_on_lcore array to
> a per-core data structure. The array is now indexed by service id.

> Signed-off-by: Nikhil Rao <nikhil.rao at intel.com>

Acked-by: Gage Eads <gage.eads@intel.com>

Thanks,
Gage


* Re: [dpdk-dev] [PATCH] service: improve service run performance
  2019-10-07 15:37   ` Van Haaren, Harry
@ 2019-10-18  4:10     ` David Marchand
  0 siblings, 0 replies; 5+ messages in thread
From: David Marchand @ 2019-10-18  4:10 UTC
  To: Van Haaren, Harry; +Cc: dev, Rao, Nikhil

On Mon, Oct 7, 2019 at 5:37 PM Van Haaren, Harry
<harry.van.haaren@intel.com> wrote:
>
> > -----Original Message-----
> > From: David Marchand [mailto:david.marchand@redhat.com]
> > Sent: Monday, October 7, 2019 3:53 PM
> > To: Van Haaren, Harry <harry.van.haaren@intel.com>
> > Cc: dev <dev@dpdk.org>; Rao, Nikhil <nikhil.rao@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH] service: improve service run performance
> >
> > On Mon, Sep 16, 2019 at 12:01 PM Nikhil Rao <nikhil.rao@intel.com> wrote:
> > >
> > > For a valid service, the core mask of the service
> > > is checked against the current core and the corresponding
> > > entry in the active_on_lcore array is set or reset.
> > >
> > > Up to 8 cores share the same cache line for their
> > > service active_on_lcore array entries since each entry is a uint8_t.
> > > Some number of these entries also share the cache line with
> > > the internal_flags member of struct rte_service_spec_impl,
> > > hence this false sharing also makes the service_valid() check
> > > expensive.
> > >
> > > Eliminate false sharing by moving the active_on_lcore array to
> > > a per-core data structure. The array is now indexed by service id.

Acked-by: Gage Eads <gage.eads@intel.com>

> >
> > Harry, any comments on this patch?
>
>
> Looks good to me, thanks Nikhil & David for the ping;
>
> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>

Applied, thanks.



--
David Marchand



* Re: [dpdk-dev] [PATCH] service: improve service run performance
  2019-10-07 14:52 ` David Marchand
@ 2019-10-07 15:37   ` Van Haaren, Harry
  2019-10-18  4:10     ` David Marchand
  0 siblings, 1 reply; 5+ messages in thread
From: Van Haaren, Harry @ 2019-10-07 15:37 UTC
  To: David Marchand; +Cc: dev, Rao, Nikhil

> -----Original Message-----
> From: David Marchand [mailto:david.marchand@redhat.com]
> Sent: Monday, October 7, 2019 3:53 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: dev <dev@dpdk.org>; Rao, Nikhil <nikhil.rao@intel.com>
> Subject: Re: [dpdk-dev] [PATCH] service: improve service run performance
> 
> On Mon, Sep 16, 2019 at 12:01 PM Nikhil Rao <nikhil.rao@intel.com> wrote:
> >
> > For a valid service, the core mask of the service
> > is checked against the current core and the corresponding
> > entry in the active_on_lcore array is set or reset.
> >
> > Up to 8 cores share the same cache line for their
> > service active_on_lcore array entries since each entry is a uint8_t.
> > Some number of these entries also share the cache line with
> > the internal_flags member of struct rte_service_spec_impl,
> > hence this false sharing also makes the service_valid() check
> > expensive.
> >
> > Eliminate false sharing by moving the active_on_lcore array to
> > a per-core data structure. The array is now indexed by service id.
> 
> Harry, any comments on this patch?


Looks good to me, thanks Nikhil & David for the ping;

Acked-by: Harry van Haaren <harry.van.haaren@intel.com>


* Re: [dpdk-dev] [PATCH] service: improve service run performance
  2019-09-16 10:01 Nikhil Rao
@ 2019-10-07 14:52 ` David Marchand
  2019-10-07 15:37   ` Van Haaren, Harry
  0 siblings, 1 reply; 5+ messages in thread
From: David Marchand @ 2019-10-07 14:52 UTC
  To: Van Haaren Harry; +Cc: dev, Nikhil Rao

On Mon, Sep 16, 2019 at 12:01 PM Nikhil Rao <nikhil.rao@intel.com> wrote:
>
> For a valid service, the core mask of the service
> is checked against the current core and the corresponding
> entry in the active_on_lcore array is set or reset.
>
> Up to 8 cores share the same cache line for their
> service active_on_lcore array entries since each entry is a uint8_t.
> Some number of these entries also share the cache line with
> the internal_flags member of struct rte_service_spec_impl,
> hence this false sharing also makes the service_valid() check
> expensive.
>
> Eliminate false sharing by moving the active_on_lcore array to
> a per-core data structure. The array is now indexed by service id.

Harry, any comments on this patch?
Thanks.


-- 
David Marchand



* [dpdk-dev] [PATCH] service: improve service run performance
@ 2019-09-16 10:01 Nikhil Rao
  2019-10-07 14:52 ` David Marchand
  0 siblings, 1 reply; 5+ messages in thread
From: Nikhil Rao @ 2019-09-16 10:01 UTC
  To: harry.van.haaren; +Cc: dev, Nikhil Rao

For a valid service, the core mask of the service
is checked against the current core and the corresponding
entry in the active_on_lcore array is set or reset.

Up to 8 cores share the same cache line for their
service active_on_lcore array entries since each entry is a uint8_t.
Some number of these entries also share the cache line with
the internal_flags member of struct rte_service_spec_impl,
hence this false sharing also makes the service_valid() check
expensive.

Eliminate false sharing by moving the active_on_lcore array to
a per-core data structure. The array is now indexed by service id.

Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
---
 lib/librte_eal/common/rte_service.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index c3653eb..5d52a81 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -51,7 +51,6 @@ struct rte_service_spec_impl {
 	rte_atomic32_t num_mapped_cores;
 	uint64_t calls;
 	uint64_t cycles_spent;
-	uint8_t active_on_lcore[RTE_MAX_LCORE];
 } __rte_cache_aligned;
 
 /* the internal values of a service core */
@@ -60,7 +59,7 @@ struct core_state {
 	uint64_t service_mask;
 	uint8_t runstate; /* running or stopped */
 	uint8_t is_service_core; /* set if core is currently a service core */
-
+	uint8_t service_active_on_lcore[RTE_SERVICE_NUM_MAX];
 	uint64_t loops;
 	uint64_t calls_per_service[RTE_SERVICE_NUM_MAX];
 } __rte_cache_aligned;
@@ -347,7 +346,7 @@ int32_t rte_service_init(void)
 
 
 static inline int32_t
-service_run(uint32_t i, int lcore, struct core_state *cs, uint64_t service_mask)
+service_run(uint32_t i, struct core_state *cs, uint64_t service_mask)
 {
 	if (!service_valid(i))
 		return -EINVAL;
@@ -355,11 +354,11 @@ int32_t rte_service_init(void)
 	if (s->comp_runstate != RUNSTATE_RUNNING ||
 			s->app_runstate != RUNSTATE_RUNNING ||
 			!(service_mask & (UINT64_C(1) << i))) {
-		s->active_on_lcore[lcore] = 0;
+		cs->service_active_on_lcore[i] = 0;
 		return -ENOEXEC;
 	}
 
-	s->active_on_lcore[lcore] = 1;
+	cs->service_active_on_lcore[i] = 1;
 
 	/* check do we need cmpset, if MT safe or <= 1 core
 	 * mapped, atomic ops are not required.
@@ -382,7 +381,6 @@ int32_t rte_service_init(void)
 rte_service_may_be_active(uint32_t id)
 {
 	uint32_t ids[RTE_MAX_LCORE] = {0};
-	struct rte_service_spec_impl *s = &rte_services[id];
 	int32_t lcore_count = rte_service_lcore_list(ids, RTE_MAX_LCORE);
 	int i;
 
@@ -390,7 +388,7 @@ int32_t rte_service_init(void)
 		return -EINVAL;
 
 	for (i = 0; i < lcore_count; i++) {
-		if (s->active_on_lcore[ids[i]])
+		if (lcore_states[i].service_active_on_lcore[id])
 			return 1;
 	}
 
@@ -421,7 +419,7 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
 		return -EBUSY;
 	}
 
-	int ret = service_run(id, rte_lcore_id(), cs, UINT64_MAX);
+	int ret = service_run(id, cs, UINT64_MAX);
 
 	if (serialize_mt_unsafe)
 		rte_atomic32_dec(&s->num_mapped_cores);
@@ -442,7 +440,7 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
 
 		for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
 			/* return value ignored as no change to code flow */
-			service_run(i, lcore, cs, service_mask);
+			service_run(i, cs, service_mask);
 		}
 
 		cs->loops++;
-- 
1.8.3.1
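
The layout change described in the commit message can be illustrated with
a small standalone sketch. This is not the DPDK code: the struct names,
the 64-byte CACHE_LINE, the MAX_LCORE and NUM_SERVICES constants, and the
helper functions are all hypothetical stand-ins chosen for illustration.
It compares which cache line the "active" flags land on before and after
the move, and shows a lookup that, assuming the per-core state table is
indexed by lcore id, walks a list of lcore ids.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64    /* assumed cache line size in bytes */
#define MAX_LCORE 128    /* illustrative stand-in for RTE_MAX_LCORE */
#define NUM_SERVICES 64  /* illustrative stand-in for RTE_SERVICE_NUM_MAX */

/* Old layout: every service carries one flag per lcore. */
struct old_service {
	_Alignas(CACHE_LINE) uint32_t internal_flags;
	uint8_t active_on_lcore[MAX_LCORE];
};

/* New layout: every lcore carries one flag per service. */
struct new_core_state {
	_Alignas(CACHE_LINE) uint64_t service_mask;
	uint8_t service_active_on_lcore[NUM_SERVICES];
};

/* Each per-core struct occupies whole cache lines. */
static_assert(sizeof(struct new_core_state) % CACHE_LINE == 0,
	      "per-core state must be padded to a cache line multiple");

struct old_service services[NUM_SERVICES];
struct new_core_state core_states[MAX_LCORE];

/* Index of the cache line that holds address p. */
uintptr_t line_of(const void *p)
{
	return (uintptr_t)p / CACHE_LINE;
}

/* Old layout: do two lcores write flags on the same cache line? */
int old_layout_shares_line(uint32_t svc, uint32_t lcore_a, uint32_t lcore_b)
{
	const struct old_service *s = &services[svc];

	return line_of(&s->active_on_lcore[lcore_a]) ==
	       line_of(&s->active_on_lcore[lcore_b]);
}

/* Old layout: does an lcore's flag share a line with internal_flags? */
int old_flag_shares_line_with_flags(uint32_t svc, uint32_t lcore)
{
	const struct old_service *s = &services[svc];

	return line_of(&s->active_on_lcore[lcore]) ==
	       line_of(&s->internal_flags);
}

/* New layout: do two lcores write flags on the same cache line? */
int new_layout_shares_line(uint32_t svc, uint32_t lcore_a, uint32_t lcore_b)
{
	return line_of(&core_states[lcore_a].service_active_on_lcore[svc]) ==
	       line_of(&core_states[lcore_b].service_active_on_lcore[svc]);
}

/*
 * Returns 1 if service id is flagged active on any listed lcore.
 * ids[] holds lcore ids, so the state table is indexed by ids[i]
 * (assumption: the table is indexed by lcore id, not by list position).
 */
int may_be_active(uint32_t id, const uint32_t *ids, int count)
{
	for (int i = 0; i < count; i++) {
		if (core_states[ids[i]].service_active_on_lcore[id])
			return 1;
	}
	return 0;
}
```

In this sketch, the flags written by lcores 0 and 1 for one service sit on
the same cache line as each other and as internal_flags in the old layout,
while in the new layout each lcore writes only within its own cache-aligned
struct.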


