DPDK patches and discussions
 help / color / mirror / Atom feed
From: Bruce Richardson <bruce.richardson@intel.com>
To: Jerin Jacob <jerinjacobk@gmail.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>,
	dpdk-dev <dev@dpdk.org>,
	Nicolas Chautru <nicolas.chautru@intel.com>,
	Fan Zhang <roy.fan.zhang@intel.com>,
	Ashish Gupta <ashish.gupta@marvell.com>,
	Akhil Goyal <gakhil@marvell.com>,
	David Hunt <david.hunt@intel.com>,
	Chengwen Feng <fengchengwen@huawei.com>,
	Kevin Laatz <kevin.laatz@intel.com>, Ray Kinsella <mdr@ashroe.eu>,
	Thomas Monjalon <thomas@monjalon.net>,
	Ferruh Yigit <ferruh.yigit@xilinx.com>,
	Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	Jerin Jacob <jerinj@marvell.com>,
	Sachin Saxena <sachin.saxena@oss.nxp.com>,
	Hemant Agrawal <hemant.agrawal@nxp.com>,
	Ori Kam <orika@nvidia.com>,
	Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>,
	Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>,
	Conor Walsh <conor.walsh@intel.com>
Subject: Re: [PATCH v1 1/2] eal: add lcore busyness telemetry
Date: Fri, 15 Jul 2022 15:11:46 +0100	[thread overview]
Message-ID: <YtF1oq8dXD/tu/GC@bricha3-MOBL.ger.corp.intel.com> (raw)
In-Reply-To: <CALBAE1Pto954vH8DvgYk=KAPTfNhc1ni2mg-xiHbzrsrdyeiLQ@mail.gmail.com>

On Fri, Jul 15, 2022 at 07:16:17PM +0530, Jerin Jacob wrote:
> On Fri, Jul 15, 2022 at 6:42 PM Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
> >
> > Currently, there is no way to measure lcore busyness in a passive way,
> > without any modifications to the application. This patch adds a new EAL
> > API that will be able to passively track core busyness.
> >
> > The busyness is calculated by relying on the fact that most DPDK API's
> > will poll for packets. Empty polls can be counted as "idle", while
> > non-empty polls can be counted as busy. To measure lcore busyness, we
> > simply call the telemetry timestamping function with the number of polls
> > a particular code section has processed, and count the number of cycles
> > we've spent processing empty bursts. The more empty bursts we encounter,
> > the less cycles we spend in "busy" state, and the less core busyness
> > will be reported.
> >
> > In order for all of the above to work without modifications to the
> > application, the library code needs to be instrumented with calls to
> > the lcore telemetry busyness timestamping function. The following parts
> > of DPDK are instrumented with lcore telemetry calls:
> >
> > - All major driver API's:
> >   - ethdev
> >   - cryptodev
> >   - compressdev
> >   - regexdev
> >   - bbdev
> >   - rawdev
> >   - eventdev
> >   - dmadev
> > - Some additional libraries:
> >   - ring
> >   - distributor
> >
> > To avoid performance impact from having lcore telemetry support, a
> > global variable is exported by EAL, and a call to timestamping function
> > is wrapped into a macro, so that whenever telemetry is disabled, it only
> > takes one additional branch and no function calls are performed. It is
> > also possible to disable it at compile time by commenting out
> > RTE_LCORE_BUSYNESS from build config.
> >
> > This patch also adds a telemetry endpoint to report lcore busyness, as
> > well as telemetry endpoints to enable/disable lcore telemetry.
> >
> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> > Signed-off-by: Conor Walsh <conor.walsh@intel.com>
> > Signed-off-by: David Hunt <david.hunt@intel.com>
> > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> 
> It is a good feature. Thanks for this work.
> 

+1
Some follow-up comments inline below.

/Bruce

> 
> > ---
> >
> > Notes:
> >     We did a couple of quick smoke tests to see if this patch causes any performance
> >     degradation, and it seemed to have none that we could measure. Telemetry can be
> >     disabled at compile time via a config option, while at runtime it can be
> >     disabled, seemingly at a cost of one additional branch.
> >
> >     That said, our benchmarking efforts were admittedly not very rigorous, so
> >     comments welcome!
> 
> >
> > diff --git a/config/rte_config.h b/config/rte_config.h
> > index 46549cb062..583cb6f7a5 100644
> > --- a/config/rte_config.h
> > +++ b/config/rte_config.h
> > @@ -39,6 +39,8 @@
> >  #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
> >  #define RTE_BACKTRACE 1
> >  #define RTE_MAX_VFIO_CONTAINERS 64
> > +#define RTE_LCORE_BUSYNESS 1
> 
> Please don't enable debug features in fastpath as default.
> 

I would disagree that this is a debug feature. The number of times I have
heard from DPDK users that they wish to get more visibility into what the
app is doing rather than seeing 100% cpu busy. Therefore, I'd see this as
enabled by default rather than disabled - unless it's shown to have a
performance regression.

That said, since this impacts multiple components and it's something that
end users might want to disable at build-time, I'd suggest moving it to
meson_options.txt file and have it as an official DPDK build-time option.

> > +#define RTE_LCORE_BUSYNESS_PERIOD 4000000ULL
> >
> > +
> > +#include <unistd.h>
> > +#include <limits.h>
> > +#include <string.h>
> > +
> > +#include <rte_common.h>
> > +#include <rte_cycles.h>
> > +#include <rte_errno.h>
> > +#include <rte_lcore.h>
> > +
> > +#ifdef RTE_LCORE_BUSYNESS
> 
> This clutter may not be required. Let it compile all cases.
> 
> > +#include <rte_telemetry.h>
> > +#endif
> > +
> > +int __rte_lcore_telemetry_enabled;
> > +
> > +#ifdef RTE_LCORE_BUSYNESS
> > +
> > +struct lcore_telemetry {
> > +       int busyness;
> > +       /**< Calculated busyness (gets set/returned by the API) */
> > +       int raw_busyness;
> > +       /**< Calculated busyness times 100. */
> > +       uint64_t interval_ts;
> > +       /**< when previous telemetry interval started */
> > +       uint64_t empty_cycles;
> > +       /**< empty cycle count since last interval */
> > +       uint64_t last_poll_ts;
> > +       /**< last poll timestamp */
> > +       bool last_empty;
> > +       /**< if last poll was empty */
> > +       unsigned int contig_poll_cnt;
> > +       /**< contiguous (always empty/non empty) poll counter */
> > +} __rte_cache_aligned;
> > +static struct lcore_telemetry telemetry_data[RTE_MAX_LCORE];
> 
> Allocate this from hugepage.
> 

Yes, whether or not it's allocated from hugepages, dynamic allocation would
be better than having static vars.

> > +
> > +void __rte_lcore_telemetry_timestamp(uint16_t nb_rx)
> > +{
> > +       const unsigned int lcore_id = rte_lcore_id();
> > +       uint64_t interval_ts, empty_cycles, cur_tsc, last_poll_ts;
> > +       struct lcore_telemetry *tdata = &telemetry_data[lcore_id];
> > +       const bool empty = nb_rx == 0;
> > +       uint64_t diff_int, diff_last;
> > +       bool last_empty;
> > +
> > +       last_empty = tdata->last_empty;
> > +
> > +       /* optimization: don't do anything if status hasn't changed */
> > +       if (last_empty == empty && tdata->contig_poll_cnt++ < 32)
> > +               return;
> > +       /* status changed or we're waiting for too long, reset counter */
> > +       tdata->contig_poll_cnt = 0;
> > +
> > +       cur_tsc = rte_rdtsc();
> > +
> > +       interval_ts = tdata->interval_ts;
> > +       empty_cycles = tdata->empty_cycles;
> > +       last_poll_ts = tdata->last_poll_ts;
> > +
> > +       diff_int = cur_tsc - interval_ts;
> > +       diff_last = cur_tsc - last_poll_ts;
> > +
> > +       /* is this the first time we're here? */
> > +       if (interval_ts == 0) {
> > +               tdata->busyness = LCORE_BUSYNESS_MIN;
> > +               tdata->raw_busyness = 0;
> > +               tdata->interval_ts = cur_tsc;
> > +               tdata->empty_cycles = 0;
> > +               tdata->contig_poll_cnt = 0;
> > +               goto end;
> > +       }
> > +
> > +       /* update the empty counter if we got an empty poll earlier */
> > +       if (last_empty)
> > +               empty_cycles += diff_last;
> > +
> > +       /* have we passed the interval? */
> > +       if (diff_int > RTE_LCORE_BUSYNESS_PERIOD) {
> 
> 
> I think, this function logic can be limited to just updating the
> timestamp in the ring buffer,
> and another control function of telemetry which runs in control core to do
> heavy lifting to reduce the performance impact on fast path,
> 
> > +               int raw_busyness;
> > +
> > +               /* get updated busyness value */
> > +               raw_busyness = calc_raw_busyness(tdata, empty_cycles, diff_int);
> > +
> > +               /* set a new interval, reset empty counter */
> > +               tdata->interval_ts = cur_tsc;
> > +               tdata->empty_cycles = 0;
> > +               tdata->raw_busyness = raw_busyness;
> > +               /* bring busyness back to 0..100 range, biased to round up */
> > +               tdata->busyness = (raw_busyness + 50) / 100;
> > +       } else
> > +               /* we may have updated empty counter */
> > +               tdata->empty_cycles = empty_cycles;
> > +
> > +end:
> > +       /* update status for next poll */
> > +       tdata->last_poll_ts = cur_tsc;
> > +       tdata->last_empty = empty;
> > +}
> > +
> > +#ifdef RTE_LCORE_BUSYNESS
> > +#define RTE_LCORE_TELEMETRY_TIMESTAMP(nb_rx)                    \
> > +       do {                                                    \
> > +               if (__rte_lcore_telemetry_enabled)              \
> 
> I think, rather than reading memory, Like Linux perf infrastructure,
> we can patch up the instruction steam as NOP vs Timestamp capture
> function.
> 
Surely that requires much more complicated tooling? How would that work in
this situation?

> 
> Also instead of changing all libraries, Maybe we can use
> "-finstrument-functions".
> and just mark the function with attribute.
> 
> Just 2c.
> 
> > +                       __rte_lcore_telemetry_timestamp(nb_rx); \
> > +       } while (0)
> > +#else
> > +#define RTE_LCORE_TELEMETRY_TIMESTAMP(nb_rx) \
> > +       while (0)
> > +#endif
> > +

  reply	other threads:[~2022-07-15 14:11 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-15 13:12 Anatoly Burakov
2022-07-15 13:12 ` [PATCH v1 2/2] eal: add cpuset lcore telemetry entries Anatoly Burakov
2022-07-15 13:35 ` [PATCH v1 1/2] eal: add lcore busyness telemetry Burakov, Anatoly
2022-07-15 13:46 ` Jerin Jacob
2022-07-15 14:11   ` Bruce Richardson [this message]
2022-07-15 14:18   ` Burakov, Anatoly
2022-07-15 22:13 ` Morten Brørup
2022-07-16 14:38   ` Thomas Monjalon
2022-07-17  3:10   ` Honnappa Nagarahalli
2022-07-17  9:56     ` Morten Brørup
2022-07-18  9:43       ` Burakov, Anatoly
2022-07-18 10:59         ` Morten Brørup
2022-07-19 12:20           ` Thomas Monjalon
2022-07-18 15:46         ` Stephen Hemminger
2022-08-24 16:24 ` [PATCH v2 0/3] Add lcore poll " Kevin Laatz
2022-08-24 16:24   ` [PATCH v2 1/3] eal: add " Kevin Laatz
2022-08-24 16:24   ` [PATCH v2 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-08-24 16:24   ` [PATCH v2 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-08-25  7:47   ` [PATCH v2 0/3] Add lcore poll busyness telemetry Morten Brørup
2022-08-25 10:53     ` Kevin Laatz
2022-08-25 15:28 ` [PATCH v3 " Kevin Laatz
2022-08-25 15:28   ` [PATCH v3 1/3] eal: add " Kevin Laatz
2022-08-26  7:05     ` Jerin Jacob
2022-08-26  8:07       ` Bruce Richardson
2022-08-26  8:16         ` Jerin Jacob
2022-08-26  8:29           ` Morten Brørup
2022-08-26 15:27             ` Kevin Laatz
2022-08-26 15:46               ` Morten Brørup
2022-08-29 10:41                 ` Bruce Richardson
2022-08-29 10:53                   ` Thomas Monjalon
2022-08-29 12:36                     ` Kevin Laatz
2022-08-29 12:49                       ` Morten Brørup
2022-08-29 13:37                         ` Kevin Laatz
2022-08-29 13:44                           ` Morten Brørup
2022-08-29 14:21                             ` Kevin Laatz
2022-08-29 11:22                   ` Morten Brørup
2022-08-26 22:06     ` Mattias Rönnblom
2022-08-29  8:23       ` Bruce Richardson
2022-08-29 13:16       ` Kevin Laatz
2022-08-30 10:26       ` Kevin Laatz
2022-08-25 15:28   ` [PATCH v3 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-08-25 15:28   ` [PATCH v3 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-09-01 14:39 ` [PATCH v4 0/3] Add lcore poll busyness telemetry Kevin Laatz
2022-09-01 14:39   ` [PATCH v4 1/3] eal: add " Kevin Laatz
2022-09-01 14:39   ` [PATCH v4 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-01 14:39   ` [PATCH v4 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-09-02 15:58 ` [PATCH v5 0/3] Add lcore poll busyness telemetry Kevin Laatz
2022-09-02 15:58   ` [PATCH v5 1/3] eal: add " Kevin Laatz
2022-09-03 13:33     ` Jerin Jacob
2022-09-06  9:37       ` Kevin Laatz
2022-09-02 15:58   ` [PATCH v5 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-02 15:58   ` [PATCH v5 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-09-13 13:19 ` [PATCH v6 0/4] Add lcore poll busyness telemetry Kevin Laatz
2022-09-13 13:19   ` [PATCH v6 1/4] eal: add " Kevin Laatz
2022-09-13 13:48     ` Morten Brørup
2022-09-13 13:19   ` [PATCH v6 2/4] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-13 13:19   ` [PATCH v6 3/4] app/test: add unit tests for lcore poll busyness Kevin Laatz
2022-09-13 13:19   ` [PATCH v6 4/4] doc: add howto guide " Kevin Laatz
2022-09-14  9:29 ` [PATCH v7 0/4] Add lcore poll busyness telemetry Kevin Laatz
2022-09-14  9:29   ` [PATCH v7 1/4] eal: add " Kevin Laatz
2022-09-14 14:30     ` Stephen Hemminger
2022-09-16 12:35       ` Kevin Laatz
2022-09-19 10:19     ` Konstantin Ananyev
2022-09-22 17:14       ` Kevin Laatz
2022-09-26  9:37         ` Konstantin Ananyev
2022-09-29 12:41           ` Kevin Laatz
2022-09-30 12:32             ` Jerin Jacob
2022-10-01 14:17             ` Konstantin Ananyev
2022-10-03 20:02               ` Mattias Rönnblom
2022-10-04  9:15                 ` Morten Brørup
2022-10-04 11:57                   ` Bruce Richardson
2022-10-04 14:26                     ` Mattias Rönnblom
2022-10-04 23:30                     ` Konstantin Ananyev
2022-09-30 22:13     ` Mattias Rönnblom
2022-09-14  9:29   ` [PATCH v7 2/4] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-14  9:29   ` [PATCH v7 3/4] app/test: add unit tests for lcore poll busyness Kevin Laatz
2022-09-30 22:20     ` Mattias Rönnblom
2022-09-14  9:29   ` [PATCH v7 4/4] doc: add howto guide " Kevin Laatz
2022-09-14 14:33   ` [PATCH v7 0/4] Add lcore poll busyness telemetry Stephen Hemminger
2022-09-16 12:35     ` Kevin Laatz
2022-09-16 14:10       ` Kevin Laatz
2022-10-05 13:44   ` Kevin Laatz
2022-10-06 13:25     ` Morten Brørup
2022-10-06 15:26       ` Mattias Rönnblom
2022-10-10 15:22         ` Morten Brørup
2022-10-10 17:38           ` Mattias Rönnblom
2022-10-12 12:25             ` Morten Brørup

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YtF1oq8dXD/tu/GC@bricha3-MOBL.ger.corp.intel.com \
    --to=bruce.richardson@intel.com \
    --cc=anatoly.burakov@intel.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=ashish.gupta@marvell.com \
    --cc=conor.walsh@intel.com \
    --cc=david.hunt@intel.com \
    --cc=dev@dpdk.org \
    --cc=fengchengwen@huawei.com \
    --cc=ferruh.yigit@xilinx.com \
    --cc=gakhil@marvell.com \
    --cc=hemant.agrawal@nxp.com \
    --cc=honnappa.nagarahalli@arm.com \
    --cc=jerinj@marvell.com \
    --cc=jerinjacobk@gmail.com \
    --cc=kevin.laatz@intel.com \
    --cc=konstantin.v.ananyev@yandex.ru \
    --cc=mdr@ashroe.eu \
    --cc=nicolas.chautru@intel.com \
    --cc=orika@nvidia.com \
    --cc=roy.fan.zhang@intel.com \
    --cc=sachin.saxena@oss.nxp.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).