From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
To: mattias.ronnblom <mattias.ronnblom@ericsson.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
"Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
"Morten Brørup" <mb@smartsharesystems.com>, nd <nd@arm.com>,
"mattias.ronnblom" <mattias.ronnblom@ericsson.com>
Subject: RE: [PATCH 3/6] service: reduce average case service core overhead
Date: Mon, 3 Oct 2022 13:33:52 +0000
Message-ID: <BN0PR11MB5712ED8900F790275B92399BD75B9@BN0PR11MB5712.namprd11.prod.outlook.com>
In-Reply-To: <20220906161352.296110-3-mattias.ronnblom@ericsson.com>
> -----Original Message-----
> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Sent: Tuesday, September 6, 2022 5:14 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: dev@dpdk.org; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> Morten Brørup <mb@smartsharesystems.com>; nd <nd@arm.com>;
> mattias.ronnblom <mattias.ronnblom@ericsson.com>
> Subject: [PATCH 3/6] service: reduce average case service core overhead
>
> Optimize service loop so that the starting point is the lowest-indexed
> service mapped to the lcore in question, and terminate the loop at the
> highest-indexed service.
>
> While the worst case latency remains the same, this patch
> significantly reduces the service framework overhead for the average
> case. In particular, scenarios where an lcore only runs a single
> service, or multiple services whose id values are close (e.g., three
> services with ids 17, 18 and 22), show significant improvements.
>
> The worst case is where an lcore has two services mapped to it: one
> with service id 0 and the other with id 63.
I like the optimization - nice work. There is one caveat: with the
__builtin_ctz() call, RTE_SERVICE_NUM_MAX *must* be 64 or lower.
Today it is defined as 64, but we must ensure that this value cannot
be changed "by accident" without an explicit compilation failure and a
comment explaining that fact.
There are likely options for making it runtime-dynamic, but I don't
think the complexity is justified: suggest we add a compile-time
BUILD_BUG_ON() check and error out if it's > 64?
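As a minimal sketch of such a guard (using C11 _Static_assert as a
stand-in for DPDK's build-bug-on macro; the helper name below is
hypothetical, for illustration only):

```c
#include <stdint.h>

/* The service mask is a uint64_t, and __builtin_ctzl()/__builtin_clzl()
 * index into it, so more than 64 services cannot be represented.
 * Fail the build, not the runtime, if the limit is ever raised by
 * accident. In DPDK proper this would be RTE_BUILD_BUG_ON(); C11's
 * _Static_assert gives the same compile-time failure. */
#define RTE_SERVICE_NUM_MAX 64

_Static_assert(RTE_SERVICE_NUM_MAX <= 64,
	"RTE_SERVICE_NUM_MAX must fit in the 64-bit service_mask");

/* Hypothetical helper, only to show the guard compiles and the
 * value fits within the mask width. */
static inline int service_mask_bits_ok(void)
{
	return RTE_SERVICE_NUM_MAX <= 8 * (int)sizeof(uint64_t);
}
```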
Note that in rte_service_component_register() we *re-use* IDs when they
become available, so we can have up to 64 active services at a time, but
one can register/unregister more times than that. Continually
registering and unregistering services is a very unlikely usage of the
services API, however.
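The re-use behaviour is essentially a lowest-free-slot allocator over a
64-bit bitmap. A rough illustration (not the actual rte_service.c code;
function names here are made up):

```c
#include <stdint.h>

/* Illustrative only: bit i set in *in_use means service ID i is taken.
 * Allocation returns the lowest free ID, so an unregistered ID is
 * handed out again on the next register, rather than exhausting the
 * 64-entry ID space. */
static int alloc_service_id(uint64_t *in_use)
{
	if (*in_use == UINT64_MAX)
		return -1; /* all 64 IDs taken */
	int id = __builtin_ctzll(~*in_use); /* lowest clear bit */
	*in_use |= UINT64_C(1) << id;
	return id;
}

static void free_service_id(uint64_t *in_use, int id)
{
	*in_use &= ~(UINT64_C(1) << id);
}
```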
With a BUILD_BUG_ON() guarding the 64 MAX value, plus a comment:
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
> On a service lcore serving a single service, the service loop overhead
> is reduced from ~190 core clock cycles to ~46. (On an Intel Cascade
> Lake generation Xeon.) On weakly ordered CPUs, the gain is larger,
> since the loop included load-acquire atomic operations.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> ---
> lib/eal/common/rte_service.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
> index 87df04e3ac..4cac866792 100644
> --- a/lib/eal/common/rte_service.c
> +++ b/lib/eal/common/rte_service.c
> @@ -464,7 +464,6 @@ static int32_t
> service_runner_func(void *arg)
> {
> RTE_SET_USED(arg);
> - uint32_t i;
> const int lcore = rte_lcore_id();
> struct core_state *cs = &lcore_states[lcore];
>
> @@ -478,10 +477,17 @@ service_runner_func(void *arg)
> RUNSTATE_RUNNING) {
>
> const uint64_t service_mask = cs->service_mask;
> + uint8_t start_id;
> + uint8_t end_id;
> + uint8_t i;
>
> - for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
> - if (!service_registered(i))
> - continue;
> + if (service_mask == 0)
> + continue;
> +
> + start_id = __builtin_ctzl(service_mask);
> + end_id = 64 - __builtin_clzl(service_mask);
> +
> + for (i = start_id; i < end_id; i++) {
> /* return value ignored as no change to code flow */
> service_run(i, cs, service_mask, service_get(i), 1);
> }
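The effect of the new bounds can be checked in isolation. For the
example from the commit message (services 17, 18 and 22) the loop
shrinks from 64 iterations to 6. A small standalone sketch, using the
`ll` builtin variants for an explicitly 64-bit argument:

```c
#include <stdint.h>

/* Compute the [start, end) iteration bounds for a service mask,
 * mirroring the logic added to service_runner_func(). The caller
 * must ensure mask != 0: __builtin_ctzll(0) and __builtin_clzll(0)
 * are undefined, which is why the patch continues on an empty mask. */
static void loop_bounds(uint64_t mask, uint8_t *start, uint8_t *end)
{
	*start = __builtin_ctzll(mask);    /* index of lowest set bit */
	*end = 64 - __builtin_clzll(mask); /* index of highest set bit, plus one */
}
```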