From: Bruce Richardson <bruce.richardson@intel.com>
To: Jerin Jacob <jerinjacobk@gmail.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>,
dpdk-dev <dev@dpdk.org>,
Nicolas Chautru <nicolas.chautru@intel.com>,
Fan Zhang <roy.fan.zhang@intel.com>,
Ashish Gupta <ashish.gupta@marvell.com>,
Akhil Goyal <gakhil@marvell.com>,
David Hunt <david.hunt@intel.com>,
Chengwen Feng <fengchengwen@huawei.com>,
Kevin Laatz <kevin.laatz@intel.com>, Ray Kinsella <mdr@ashroe.eu>,
Thomas Monjalon <thomas@monjalon.net>,
Ferruh Yigit <ferruh.yigit@xilinx.com>,
Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
Jerin Jacob <jerinj@marvell.com>,
Sachin Saxena <sachin.saxena@oss.nxp.com>,
Hemant Agrawal <hemant.agrawal@nxp.com>,
Ori Kam <orika@nvidia.com>,
Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>,
Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>,
Conor Walsh <conor.walsh@intel.com>
Subject: Re: [PATCH v1 1/2] eal: add lcore busyness telemetry
Date: Fri, 15 Jul 2022 15:11:46 +0100 [thread overview]
Message-ID: <YtF1oq8dXD/tu/GC@bricha3-MOBL.ger.corp.intel.com> (raw)
In-Reply-To: <CALBAE1Pto954vH8DvgYk=KAPTfNhc1ni2mg-xiHbzrsrdyeiLQ@mail.gmail.com>
On Fri, Jul 15, 2022 at 07:16:17PM +0530, Jerin Jacob wrote:
> On Fri, Jul 15, 2022 at 6:42 PM Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
> >
> > Currently, there is no way to measure lcore busyness in a passive way,
> > without any modifications to the application. This patch adds a new EAL
> > API that will be able to passively track core busyness.
> >
> > The busyness is calculated by relying on the fact that most DPDK API's
> > will poll for packets. Empty polls can be counted as "idle", while
> > non-empty polls can be counted as busy. To measure lcore busyness, we
> > simply call the telemetry timestamping function with the number of polls
> > a particular code section has processed, and count the number of cycles
> > we've spent processing empty bursts. The more empty bursts we encounter,
> > the less cycles we spend in "busy" state, and the less core busyness
> > will be reported.
> >
> > In order for all of the above to work without modifications to the
> > application, the library code needs to be instrumented with calls to
> > the lcore telemetry busyness timestamping function. The following parts
> > of DPDK are instrumented with lcore telemetry calls:
> >
> > - All major driver API's:
> > - ethdev
> > - cryptodev
> > - compressdev
> > - regexdev
> > - bbdev
> > - rawdev
> > - eventdev
> > - dmadev
> > - Some additional libraries:
> > - ring
> > - distributor
> >
> > To avoid performance impact from having lcore telemetry support, a
> > global variable is exported by EAL, and a call to timestamping function
> > is wrapped into a macro, so that whenever telemetry is disabled, it only
> > takes one additional branch and no function calls are performed. It is
> > also possible to disable it at compile time by commenting out
> > RTE_LCORE_BUSYNESS from build config.
> >
> > This patch also adds a telemetry endpoint to report lcore busyness, as
> > well as telemetry endpoints to enable/disable lcore telemetry.
> >
> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> > Signed-off-by: Conor Walsh <conor.walsh@intel.com>
> > Signed-off-by: David Hunt <david.hunt@intel.com>
> > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
>
> It is a good feature. Thanks for this work.
>
+1
Some follow-up comments inline below.
/Bruce
>
> > ---
> >
> > Notes:
> > We did a couple of quick smoke tests to see if this patch causes any performance
> > degradation, and it seemed to have none that we could measure. Telemetry can be
> > disabled at compile time via a config option, while at runtime it can be
> > disabled, seemingly at a cost of one additional branch.
> >
> > That said, our benchmarking efforts were admittedly not very rigorous, so
> > comments welcome!
>
> >
> > diff --git a/config/rte_config.h b/config/rte_config.h
> > index 46549cb062..583cb6f7a5 100644
> > --- a/config/rte_config.h
> > +++ b/config/rte_config.h
> > @@ -39,6 +39,8 @@
> > #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
> > #define RTE_BACKTRACE 1
> > #define RTE_MAX_VFIO_CONTAINERS 64
> > +#define RTE_LCORE_BUSYNESS 1
>
> Please don't enable debug features in fastpath as default.
>
I would disagree that this is a debug feature. The number of times I have
heard from DPDK users that they wish to get more visibility into what the
app is doing rather than seeing 100% cpu busy. Therefore, I'd see this as
enabled by default rather than disabled - unless it's shown to have a
performance regression.
That said, since this impacts multiple components and it's something that
end users might want to disable at build-time, I'd suggest moving it to
meson_options.txt file and have it as an official DPDK build-time option.
> > +#define RTE_LCORE_BUSYNESS_PERIOD 4000000ULL
> >
> > +
> > +#include <unistd.h>
> > +#include <limits.h>
> > +#include <string.h>
> > +
> > +#include <rte_common.h>
> > +#include <rte_cycles.h>
> > +#include <rte_errno.h>
> > +#include <rte_lcore.h>
> > +
> > +#ifdef RTE_LCORE_BUSYNESS
>
> This clutter may not be required. Let it compile all cases.
>
> > +#include <rte_telemetry.h>
> > +#endif
> > +
> > +int __rte_lcore_telemetry_enabled;
> > +
> > +#ifdef RTE_LCORE_BUSYNESS
> > +
> > +struct lcore_telemetry {
> > + int busyness;
> > + /**< Calculated busyness (gets set/returned by the API) */
> > + int raw_busyness;
> > + /**< Calculated busyness times 100. */
> > + uint64_t interval_ts;
> > + /**< when previous telemetry interval started */
> > + uint64_t empty_cycles;
> > + /**< empty cycle count since last interval */
> > + uint64_t last_poll_ts;
> > + /**< last poll timestamp */
> > + bool last_empty;
> > + /**< if last poll was empty */
> > + unsigned int contig_poll_cnt;
> > + /**< contiguous (always empty/non empty) poll counter */
> > +} __rte_cache_aligned;
> > +static struct lcore_telemetry telemetry_data[RTE_MAX_LCORE];
>
> Allocate this from hugepage.
>
Yes, whether or not it's allocated from hugepages, dynamic allocation would
be better than having static vars.
> > +
> > +void __rte_lcore_telemetry_timestamp(uint16_t nb_rx)
> > +{
> > + const unsigned int lcore_id = rte_lcore_id();
> > + uint64_t interval_ts, empty_cycles, cur_tsc, last_poll_ts;
> > + struct lcore_telemetry *tdata = &telemetry_data[lcore_id];
> > + const bool empty = nb_rx == 0;
> > + uint64_t diff_int, diff_last;
> > + bool last_empty;
> > +
> > + last_empty = tdata->last_empty;
> > +
> > + /* optimization: don't do anything if status hasn't changed */
> > + if (last_empty == empty && tdata->contig_poll_cnt++ < 32)
> > + return;
> > + /* status changed or we're waiting for too long, reset counter */
> > + tdata->contig_poll_cnt = 0;
> > +
> > + cur_tsc = rte_rdtsc();
> > +
> > + interval_ts = tdata->interval_ts;
> > + empty_cycles = tdata->empty_cycles;
> > + last_poll_ts = tdata->last_poll_ts;
> > +
> > + diff_int = cur_tsc - interval_ts;
> > + diff_last = cur_tsc - last_poll_ts;
> > +
> > + /* is this the first time we're here? */
> > + if (interval_ts == 0) {
> > + tdata->busyness = LCORE_BUSYNESS_MIN;
> > + tdata->raw_busyness = 0;
> > + tdata->interval_ts = cur_tsc;
> > + tdata->empty_cycles = 0;
> > + tdata->contig_poll_cnt = 0;
> > + goto end;
> > + }
> > +
> > + /* update the empty counter if we got an empty poll earlier */
> > + if (last_empty)
> > + empty_cycles += diff_last;
> > +
> > + /* have we passed the interval? */
> > + if (diff_int > RTE_LCORE_BUSYNESS_PERIOD) {
>
>
> I think, this function logic can be limited to just updating the
> timestamp in the ring buffer,
> and another control function of telemetry which runs in control core to do
> heavy lifting to reduce the performance impact on fast path,
>
> > + int raw_busyness;
> > +
> > + /* get updated busyness value */
> > + raw_busyness = calc_raw_busyness(tdata, empty_cycles, diff_int);
> > +
> > + /* set a new interval, reset empty counter */
> > + tdata->interval_ts = cur_tsc;
> > + tdata->empty_cycles = 0;
> > + tdata->raw_busyness = raw_busyness;
> > + /* bring busyness back to 0..100 range, biased to round up */
> > + tdata->busyness = (raw_busyness + 50) / 100;
> > + } else
> > + /* we may have updated empty counter */
> > + tdata->empty_cycles = empty_cycles;
> > +
> > +end:
> > + /* update status for next poll */
> > + tdata->last_poll_ts = cur_tsc;
> > + tdata->last_empty = empty;
> > +}
> > +
> > +#ifdef RTE_LCORE_BUSYNESS
> > +#define RTE_LCORE_TELEMETRY_TIMESTAMP(nb_rx) \
> > + do { \
> > + if (__rte_lcore_telemetry_enabled) \
>
> I think, rather than reading memory, Like Linux perf infrastructure,
> we can patch up the instruction steam as NOP vs Timestamp capture
> function.
>
Surely that requires much more complicated tooling? How would that work in
this situation?
>
> Also instead of changing all libraries, Maybe we can use
> "-finstrument-functions".
> and just mark the function with attribute.
>
> Just 2c.
>
> > + __rte_lcore_telemetry_timestamp(nb_rx); \
> > + } while (0)
> > +#else
> > +#define RTE_LCORE_TELEMETRY_TIMESTAMP(nb_rx) \
> > + while (0)
> > +#endif
> > +
next prev parent reply other threads:[~2022-07-15 14:11 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-15 13:12 Anatoly Burakov
2022-07-15 13:12 ` [PATCH v1 2/2] eal: add cpuset lcore telemetry entries Anatoly Burakov
2022-07-15 13:35 ` [PATCH v1 1/2] eal: add lcore busyness telemetry Burakov, Anatoly
2022-07-15 13:46 ` Jerin Jacob
2022-07-15 14:11 ` Bruce Richardson [this message]
2022-07-15 14:18 ` Burakov, Anatoly
2022-07-15 22:13 ` Morten Brørup
2022-07-16 14:38 ` Thomas Monjalon
2022-07-17 3:10 ` Honnappa Nagarahalli
2022-07-17 9:56 ` Morten Brørup
2022-07-18 9:43 ` Burakov, Anatoly
2022-07-18 10:59 ` Morten Brørup
2022-07-19 12:20 ` Thomas Monjalon
2022-07-18 15:46 ` Stephen Hemminger
2022-08-24 16:24 ` [PATCH v2 0/3] Add lcore poll " Kevin Laatz
2022-08-24 16:24 ` [PATCH v2 1/3] eal: add " Kevin Laatz
2022-08-24 16:24 ` [PATCH v2 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-08-24 16:24 ` [PATCH v2 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-08-25 7:47 ` [PATCH v2 0/3] Add lcore poll busyness telemetry Morten Brørup
2022-08-25 10:53 ` Kevin Laatz
2022-08-25 15:28 ` [PATCH v3 " Kevin Laatz
2022-08-25 15:28 ` [PATCH v3 1/3] eal: add " Kevin Laatz
2022-08-26 7:05 ` Jerin Jacob
2022-08-26 8:07 ` Bruce Richardson
2022-08-26 8:16 ` Jerin Jacob
2022-08-26 8:29 ` Morten Brørup
2022-08-26 15:27 ` Kevin Laatz
2022-08-26 15:46 ` Morten Brørup
2022-08-29 10:41 ` Bruce Richardson
2022-08-29 10:53 ` Thomas Monjalon
2022-08-29 12:36 ` Kevin Laatz
2022-08-29 12:49 ` Morten Brørup
2022-08-29 13:37 ` Kevin Laatz
2022-08-29 13:44 ` Morten Brørup
2022-08-29 14:21 ` Kevin Laatz
2022-08-29 11:22 ` Morten Brørup
2022-08-26 22:06 ` Mattias Rönnblom
2022-08-29 8:23 ` Bruce Richardson
2022-08-29 13:16 ` Kevin Laatz
2022-08-30 10:26 ` Kevin Laatz
2022-08-25 15:28 ` [PATCH v3 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-08-25 15:28 ` [PATCH v3 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-09-01 14:39 ` [PATCH v4 0/3] Add lcore poll busyness telemetry Kevin Laatz
2022-09-01 14:39 ` [PATCH v4 1/3] eal: add " Kevin Laatz
2022-09-01 14:39 ` [PATCH v4 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-01 14:39 ` [PATCH v4 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-09-02 15:58 ` [PATCH v5 0/3] Add lcore poll busyness telemetry Kevin Laatz
2022-09-02 15:58 ` [PATCH v5 1/3] eal: add " Kevin Laatz
2022-09-03 13:33 ` Jerin Jacob
2022-09-06 9:37 ` Kevin Laatz
2022-09-02 15:58 ` [PATCH v5 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-02 15:58 ` [PATCH v5 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-09-13 13:19 ` [PATCH v6 0/4] Add lcore poll busyness telemetry Kevin Laatz
2022-09-13 13:19 ` [PATCH v6 1/4] eal: add " Kevin Laatz
2022-09-13 13:48 ` Morten Brørup
2022-09-13 13:19 ` [PATCH v6 2/4] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-13 13:19 ` [PATCH v6 3/4] app/test: add unit tests for lcore poll busyness Kevin Laatz
2022-09-13 13:19 ` [PATCH v6 4/4] doc: add howto guide " Kevin Laatz
2022-09-14 9:29 ` [PATCH v7 0/4] Add lcore poll busyness telemetry Kevin Laatz
2022-09-14 9:29 ` [PATCH v7 1/4] eal: add " Kevin Laatz
2022-09-14 14:30 ` Stephen Hemminger
2022-09-16 12:35 ` Kevin Laatz
2022-09-19 10:19 ` Konstantin Ananyev
2022-09-22 17:14 ` Kevin Laatz
2022-09-26 9:37 ` Konstantin Ananyev
2022-09-29 12:41 ` Kevin Laatz
2022-09-30 12:32 ` Jerin Jacob
2022-10-01 14:17 ` Konstantin Ananyev
2022-10-03 20:02 ` Mattias Rönnblom
2022-10-04 9:15 ` Morten Brørup
2022-10-04 11:57 ` Bruce Richardson
2022-10-04 14:26 ` Mattias Rönnblom
2022-10-04 23:30 ` Konstantin Ananyev
2022-09-30 22:13 ` Mattias Rönnblom
2022-09-14 9:29 ` [PATCH v7 2/4] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-14 9:29 ` [PATCH v7 3/4] app/test: add unit tests for lcore poll busyness Kevin Laatz
2022-09-30 22:20 ` Mattias Rönnblom
2022-09-14 9:29 ` [PATCH v7 4/4] doc: add howto guide " Kevin Laatz
2022-09-14 14:33 ` [PATCH v7 0/4] Add lcore poll busyness telemetry Stephen Hemminger
2022-09-16 12:35 ` Kevin Laatz
2022-09-16 14:10 ` Kevin Laatz
2022-10-05 13:44 ` Kevin Laatz
2022-10-06 13:25 ` Morten Brørup
2022-10-06 15:26 ` Mattias Rönnblom
2022-10-10 15:22 ` Morten Brørup
2022-10-10 17:38 ` Mattias Rönnblom
2022-10-12 12:25 ` Morten Brørup
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YtF1oq8dXD/tu/GC@bricha3-MOBL.ger.corp.intel.com \
--to=bruce.richardson@intel.com \
--cc=anatoly.burakov@intel.com \
--cc=andrew.rybchenko@oktetlabs.ru \
--cc=ashish.gupta@marvell.com \
--cc=conor.walsh@intel.com \
--cc=david.hunt@intel.com \
--cc=dev@dpdk.org \
--cc=fengchengwen@huawei.com \
--cc=ferruh.yigit@xilinx.com \
--cc=gakhil@marvell.com \
--cc=hemant.agrawal@nxp.com \
--cc=honnappa.nagarahalli@arm.com \
--cc=jerinj@marvell.com \
--cc=jerinjacobk@gmail.com \
--cc=kevin.laatz@intel.com \
--cc=konstantin.v.ananyev@yandex.ru \
--cc=mdr@ashroe.eu \
--cc=nicolas.chautru@intel.com \
--cc=orika@nvidia.com \
--cc=roy.fan.zhang@intel.com \
--cc=sachin.saxena@oss.nxp.com \
--cc=thomas@monjalon.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).