DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Tomasz Duszynski" <tduszynski@marvell.com>, <dev@dpdk.org>
Cc: <thomas@monjalon.net>,
	"Jerin Jacob Kollanukkaran" <jerinj@marvell.com>,
	<zhoumin@loongson.cn>, <mattias.ronnblom@ericsson.com>
Subject: RE: [PATCH v4 1/4] eal: add generic support for reading PMU events
Date: Wed, 14 Dec 2022 11:41:24 +0100	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8759B@smartserver.smartshare.dk> (raw)
In-Reply-To: <DM4PR18MB4368461EC42603F77A7DC1BCD2E09@DM4PR18MB4368.namprd18.prod.outlook.com>

+CC: Mattias, see my comment below about per-thread constructor for this

> From: Tomasz Duszynski [mailto:tduszynski@marvell.com]
> Sent: Wednesday, 14 December 2022 10.39
> 
> Hello Morten,
> 
> Thanks for review. Answers inline.
> 
> [...]
> 
> > > +/**
> > > + * @file
> > > + *
> > > + * PMU event tracing operations
> > > + *
> > > + * This file defines generic API and types necessary to setup PMU
> and
> > > + * read selected counters in runtime.
> > > + */
> > > +
> > > +/**
> > > + * A structure describing a group of events.
> > > + */
> > > +struct rte_pmu_event_group {
> > > +	int *fds; /**< array of event descriptors */
> > > +	void **mmap_pages; /**< array of pointers to mmapped
> > > perf_event_attr structures */
> >
> > There seems to be a lot of indirection involved here. Why are these
> arrays not statically sized,
> > instead of dynamically allocated?
> >
> 
> Different architectures/pmus impose limits on number of simultaneously
> enabled counters. So in order
> relief the pain of thinking about it and adding macros for each and
> every arch I decided to allocate
> the number user wants dynamically. Also assumption holds that user
> knows about tradeoffs of using
> too many counters hence will not enable too many events at once.

The DPDK convention is to use fixed size arrays (with a maximum size, e.g. RTE_MAX_ETHPORTS) in the fast path, for performance reasons.

Please use fixed size arrays instead of dynamically allocated arrays.

> 
> > Also, what is the reason for hiding the type struct
> perf_event_mmap_page **mmap_pages opaque by
> > using void **mmap_pages instead?
> 
> I think, that part doing mmap/munmap was written first hence void **
> was chosen in the first place.

Please update it, so the actual type is reflected here.

> 
> >
> > > +	bool enabled; /**< true if group was enabled on particular lcore
> > > */
> > > +};
> > > +
> > > +/**
> > > + * A structure describing an event.
> > > + */
> > > +struct rte_pmu_event {
> > > +	char *name; /** name of an event */
> > > +	int index; /** event index into fds/mmap_pages */
> > > +	TAILQ_ENTRY(rte_pmu_event) next; /** list entry */ };
> > > +
> > > +/**
> > > + * A PMU state container.
> > > + */
> > > +struct rte_pmu {
> > > +	char *name; /** name of core PMU listed under
> > > /sys/bus/event_source/devices */
> > > +	struct rte_pmu_event_group group[RTE_MAX_LCORE]; /**< per lcore
> > > event group data */
> > > +	int num_group_events; /**< number of events in a group */
> > > +	TAILQ_HEAD(, rte_pmu_event) event_list; /**< list of matching
> > > events */

The event_list is used in slow path only, so it can remain a list - i.e. no change requested here. :-)

> > > +};
> > > +
> > > +/** Pointer to the PMU state container */ extern struct rte_pmu
> > > +*rte_pmu;
> >
> > Again, why not just extern struct rte_pmu, instead of dynamic
> allocation?
> >
> 
> No strong opinions here since this is a matter of personal preference.
> Can be removed
> in the next version.

Yes, please.

> 
> > > +
> > > +/** Each architecture supporting PMU needs to provide its own
> version
> > > */
> > > +#ifndef rte_pmu_pmc_read
> > > +#define rte_pmu_pmc_read(index) ({ 0; }) #endif
> > > +
> > > +/**
> > > + * @internal
> > > + *
> > > + * Read PMU counter.
> > > + *
> > > + * @param pc
> > > + *   Pointer to the mmapped user page.
> > > + * @return
> > > + *   Counter value read from hardware.
> > > + */
> > > +__rte_internal
> > > +static __rte_always_inline uint64_t
> > > +rte_pmu_read_userpage(struct perf_event_mmap_page *pc) {
> > > +	uint64_t offset, width, pmc = 0;
> > > +	uint32_t seq, index;
> > > +	int tries = 100;
> > > +
> > > +	for (;;) {

As a matter of personal preference, I would write this loop differently:

+ for (tries = 100; tries != 0; tries--) {

> > > +		seq = pc->lock;
> > > +		rte_compiler_barrier();
> > > +		index = pc->index;
> > > +		offset = pc->offset;
> > > +		width = pc->pmc_width;
> > > +
> > > +		if (likely(pc->cap_user_rdpmc && index)) {

Why "&& index"? The way I read [man perf_event_open], index 0 is perfectly valid.

[man perf_event_open]: https://man7.org/linux/man-pages/man2/perf_event_open.2.html

> > > +			pmc = rte_pmu_pmc_read(index - 1);
> > > +			pmc <<= 64 - width;
> > > +			pmc >>= 64 - width;
> > > +		}
> > > +
> > > +		rte_compiler_barrier();
> > > +
> > > +		if (likely(pc->lock == seq))
> > > +			return pmc + offset;
> > > +
> > > +		if (--tries == 0) {
> > > +			RTE_LOG(DEBUG, EAL, "failed to get
> > > perf_event_mmap_page lock\n");
> > > +			break;
> > > +		}

- Remove the 4 above lines of code, and move the debug log message to the end of the function instead.

> > > +	}

+ RTE_LOG(DEBUG, EAL, "failed to get perf_event_mmap_page lock\n");

> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +/**
> > > + * @internal
> > > + *
> > > + * Enable group of events for a given lcore.
> > > + *
> > > + * @param lcore_id
> > > + *   The identifier of the lcore.
> > > + * @return
> > > + *   0 in case of success, negative value otherwise.
> > > + */
> > > +__rte_internal
> > > +int
> > > +rte_pmu_enable_group(int lcore_id);
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + *
> > > + * Add event to the group of enabled events.
> > > + *
> > > + * @param name
> > > + *   Name of an event listed under
> > > /sys/bus/event_source/devices/pmu/events.
> > > + * @return
> > > + *   Event index in case of success, negative value otherwise.
> > > + */
> > > +__rte_experimental
> > > +int
> > > +rte_pmu_add_event(const char *name);
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + *
> > > + * Read hardware counter configured to count occurrences of an
> event.
> > > + *
> > > + * @param index
> > > + *   Index of an event to be read.
> > > + * @return
> > > + *   Event value read from register. In case of errors or lack of
> > > support
> > > + *   0 is returned. In other words, stream of zeros in a trace
> file
> > > + *   indicates problem with reading particular PMU event register.
> > > + */
> > > +__rte_experimental
> > > +static __rte_always_inline uint64_t
> > > +rte_pmu_read(int index)

The index type can be changed from int to uint32_t. This also eliminates the "(index < 0" part of the comparison further below in this function.

> > > +{
> > > +	int lcore_id = rte_lcore_id();
> > > +	struct rte_pmu_event_group *group;
> > > +	int ret;
> > > +
> > > +	if (!rte_pmu)
> > > +		return 0;
> > > +
> > > +	group = &rte_pmu->group[lcore_id];
> > > +	if (!group->enabled) {

Optimized: if (unlikely(!group->enabled)) {

> > > +		ret = rte_pmu_enable_group(lcore_id);
> > > +		if (ret)
> > > +			return 0;
> > > +
> > > +		group->enabled = true;
> > > +	}
> >
> > Why is the group not enabled in the setup function,
> rte_pmu_add_event(), instead of here, in the
> > hot path?
> >
> 
> When this is executed for the very first time then cpu will have
> obviously more work to do
> but afterwards setup path is not taken hence much less cpu cycles are
> required.
> 
> Setup is executed by main lcore solely, before lcores are executed
> hence some info passed to
> SYS_perf_event_open ioctl() is missing, pid (via rte_gettid()) being an
> example here.

OK. Thank you for the explanation. Since impossible at setup, it has to be done at runtime.

@Mattias: Another good example of something that would belong in per-thread constructors, as my suggested feature creep in [1].

[1]: http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87553@smartserver.smartshare.dk/

> 
> > > +
> > > +	if (index < 0 || index >= rte_pmu->num_group_events)

Optimized: if (unlikely(index >= rte_pmu.num_group_events))

> > > +		return 0;
> > > +
> > > +	return rte_pmu_read_userpage((struct perf_event_mmap_page
> > > *)group->mmap_pages[index]);
> >
> > Using fixed size arrays instead of multiple indirections via pointers
> is faster. It could be:
> >
> > return rte_pmu_read_userpage((struct perf_event_mmap_page
> > *)rte_pmu.group[lcore_id].mmap_pages[index]);
> >
> > With our without suggested performance improvements...
> >
> > Series-acked-by: Morten Brørup <mb@smartsharesystems.com>
> 


  reply	other threads:[~2022-12-14 10:41 UTC|newest]

Thread overview: 139+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-11  9:43 [PATCH 0/4] add support for self monitoring Tomasz Duszynski
2022-11-11  9:43 ` [PATCH 1/4] eal: add generic support for reading PMU events Tomasz Duszynski
2022-12-15  8:33   ` Mattias Rönnblom
2022-11-11  9:43 ` [PATCH 2/4] eal/arm: support reading ARM PMU events in runtime Tomasz Duszynski
2022-11-11  9:43 ` [PATCH 3/4] eal/x86: support reading Intel " Tomasz Duszynski
2022-11-11  9:43 ` [PATCH 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2022-11-21 12:11 ` [PATCH v2 0/4] add support for self monitoring Tomasz Duszynski
2022-11-21 12:11   ` [PATCH v2 1/4] eal: add generic support for reading PMU events Tomasz Duszynski
2022-11-21 12:11   ` [PATCH v2 2/4] eal/arm: support reading ARM PMU events in runtime Tomasz Duszynski
2022-11-21 12:11   ` [PATCH v2 3/4] eal/x86: support reading Intel " Tomasz Duszynski
2022-11-21 12:11   ` [PATCH v2 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2022-11-29  9:28   ` [PATCH v3 0/4] add support for self monitoring Tomasz Duszynski
2022-11-29  9:28     ` [PATCH v3 1/4] eal: add generic support for reading PMU events Tomasz Duszynski
2022-11-30  8:32       ` zhoumin
2022-12-13  8:05         ` [EXT] " Tomasz Duszynski
2022-11-29  9:28     ` [PATCH v3 2/4] eal/arm: support reading ARM PMU events in runtime Tomasz Duszynski
2022-11-29  9:28     ` [PATCH v3 3/4] eal/x86: support reading Intel " Tomasz Duszynski
2022-11-29  9:28     ` [PATCH v3 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2022-11-29 10:42     ` [PATCH v3 0/4] add support for self monitoring Morten Brørup
2022-12-13  8:23       ` Tomasz Duszynski
2022-12-13 10:43     ` [PATCH v4 " Tomasz Duszynski
2022-12-13 10:43       ` [PATCH v4 1/4] eal: add generic support for reading PMU events Tomasz Duszynski
2022-12-13 11:52         ` Morten Brørup
2022-12-14  9:38           ` Tomasz Duszynski
2022-12-14 10:41             ` Morten Brørup [this message]
2022-12-15  8:22               ` Morten Brørup
2022-12-16  7:33                 ` Morten Brørup
2023-01-05 21:14               ` Tomasz Duszynski
2023-01-05 22:07                 ` Morten Brørup
2023-01-08 15:41                   ` Tomasz Duszynski
2023-01-08 16:30                     ` Morten Brørup
2022-12-15  8:46         ` Mattias Rönnblom
2023-01-04 15:47           ` Tomasz Duszynski
2023-01-09  7:37         ` Ruifeng Wang
2023-01-09 15:40           ` Tomasz Duszynski
2022-12-13 10:43       ` [PATCH v4 2/4] eal/arm: support reading ARM PMU events in runtime Tomasz Duszynski
2022-12-13 10:43       ` [PATCH v4 3/4] eal/x86: support reading Intel " Tomasz Duszynski
2022-12-13 10:43       ` [PATCH v4 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2023-01-10 23:46       ` [PATCH v5 0/4] add support for self monitoring Tomasz Duszynski
2023-01-10 23:46         ` [PATCH v5 1/4] eal: add generic support for reading PMU events Tomasz Duszynski
2023-01-11  9:05           ` Morten Brørup
2023-01-11 16:20             ` Tomasz Duszynski
2023-01-11 16:54               ` Morten Brørup
2023-01-10 23:46         ` [PATCH v5 2/4] eal/arm: support reading ARM PMU events in runtime Tomasz Duszynski
2023-01-10 23:46         ` [PATCH v5 3/4] eal/x86: support reading Intel " Tomasz Duszynski
2023-01-10 23:46         ` [PATCH v5 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2023-01-11  0:32         ` [PATCH v5 0/4] add support for self monitoring Tyler Retzlaff
2023-01-11  9:31           ` Morten Brørup
2023-01-11 14:24             ` Tomasz Duszynski
2023-01-11 14:32               ` Bruce Richardson
2023-01-11  9:39           ` [EXT] " Tomasz Duszynski
2023-01-11 21:05             ` Tyler Retzlaff
2023-01-13  7:44               ` Tomasz Duszynski
2023-01-13 19:22                 ` Tyler Retzlaff
2023-01-14  9:53                   ` Morten Brørup
2023-01-19 23:39         ` [PATCH v6 " Tomasz Duszynski
2023-01-19 23:39           ` [PATCH v6 1/4] lib: add generic support for reading PMU events Tomasz Duszynski
2023-01-20  9:46             ` Morten Brørup
2023-01-26  9:40               ` Tomasz Duszynski
2023-01-26 12:29                 ` Morten Brørup
2023-01-26 12:59                   ` Bruce Richardson
2023-01-26 15:28                     ` [EXT] " Tomasz Duszynski
2023-02-02 14:27                       ` Morten Brørup
2023-01-26 15:17                   ` Tomasz Duszynski
2023-01-20 18:29             ` Tyler Retzlaff
2023-01-26  9:05               ` [EXT] " Tomasz Duszynski
2023-01-19 23:39           ` [PATCH v6 2/4] pmu: support reading ARM PMU events in runtime Tomasz Duszynski
2023-01-19 23:39           ` [PATCH v6 3/4] pmu: support reading Intel x86_64 " Tomasz Duszynski
2023-01-19 23:39           ` [PATCH v6 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2023-02-01 13:17           ` [PATCH v7 0/4] add support for self monitoring Tomasz Duszynski
2023-02-01 13:17             ` [PATCH v7 1/4] lib: add generic support for reading PMU events Tomasz Duszynski
2023-02-01 13:17             ` [PATCH v7 2/4] pmu: support reading ARM PMU events in runtime Tomasz Duszynski
2023-02-01 13:17             ` [PATCH v7 3/4] pmu: support reading Intel x86_64 " Tomasz Duszynski
2023-02-01 13:17             ` [PATCH v7 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2023-02-01 13:51             ` [PATCH v7 0/4] add support for self monitoring Morten Brørup
2023-02-02  7:54               ` Tomasz Duszynski
2023-02-02  9:43             ` [PATCH v8 " Tomasz Duszynski
2023-02-02  9:43               ` [PATCH v8 1/4] lib: add generic support for reading PMU events Tomasz Duszynski
2023-02-02 10:32                 ` Ruifeng Wang
2023-02-02  9:43               ` [PATCH v8 2/4] pmu: support reading ARM PMU events in runtime Tomasz Duszynski
2023-02-02  9:43               ` [PATCH v8 3/4] pmu: support reading Intel x86_64 " Tomasz Duszynski
2023-02-02  9:43               ` [PATCH v8 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2023-02-02 12:49               ` [PATCH v9 0/4] add support for self monitoring Tomasz Duszynski
2023-02-02 12:49                 ` [PATCH v9 1/4] lib: add generic support for reading PMU events Tomasz Duszynski
2023-02-06 11:02                   ` David Marchand
2023-02-09 11:09                     ` [EXT] " Tomasz Duszynski
2023-02-02 12:49                 ` [PATCH v9 2/4] pmu: support reading ARM PMU events in runtime Tomasz Duszynski
2023-02-02 12:49                 ` [PATCH v9 3/4] pmu: support reading Intel x86_64 " Tomasz Duszynski
2023-02-02 12:49                 ` [PATCH v9 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2023-02-13 11:31                 ` [PATCH v10 0/4] add support for self monitoring Tomasz Duszynski
2023-02-13 11:31                   ` [PATCH v10 1/4] lib: add generic support for reading PMU events Tomasz Duszynski
2023-02-16  7:39                     ` Ruifeng Wang
2023-02-16 14:44                       ` Tomasz Duszynski
2023-02-13 11:31                   ` [PATCH v10 2/4] pmu: support reading ARM PMU events in runtime Tomasz Duszynski
2023-02-16  7:41                     ` Ruifeng Wang
2023-02-13 11:31                   ` [PATCH v10 3/4] pmu: support reading Intel x86_64 " Tomasz Duszynski
2023-02-13 11:31                   ` [PATCH v10 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2023-02-16 17:54                   ` [PATCH v11 0/4] add support for self monitoring Tomasz Duszynski
2023-02-16 17:54                     ` [PATCH v11 1/4] lib: add generic support for reading PMU events Tomasz Duszynski
2023-02-16 23:50                       ` Konstantin Ananyev
2023-02-17  8:49                         ` [EXT] " Tomasz Duszynski
2023-02-17 10:14                           ` Konstantin Ananyev
2023-02-19 14:23                             ` Tomasz Duszynski
2023-02-20 14:31                               ` Konstantin Ananyev
2023-02-20 16:59                                 ` Tomasz Duszynski
2023-02-20 17:21                                   ` Konstantin Ananyev
2023-02-20 20:42                                     ` Tomasz Duszynski
2023-02-21  0:48                                       ` Konstantin Ananyev
2023-02-27  8:12                                         ` Tomasz Duszynski
2023-02-28 11:35                                           ` Konstantin Ananyev
2023-02-21 12:15                           ` Konstantin Ananyev
2023-02-21  2:17                       ` Konstantin Ananyev
2023-02-27  9:19                         ` [EXT] " Tomasz Duszynski
2023-02-27 20:53                           ` Konstantin Ananyev
2023-02-28  8:25                             ` Morten Brørup
2023-02-28 12:04                               ` Konstantin Ananyev
2023-02-28 13:15                                 ` Morten Brørup
2023-02-28 16:22                                 ` Morten Brørup
2023-03-05 16:30                                   ` Konstantin Ananyev
2023-02-28  9:57                             ` Tomasz Duszynski
2023-02-28 11:58                               ` Konstantin Ananyev
2023-02-16 17:55                     ` [PATCH v11 2/4] pmu: support reading ARM PMU events in runtime Tomasz Duszynski
2023-02-16 17:55                     ` [PATCH v11 3/4] pmu: support reading Intel x86_64 " Tomasz Duszynski
2023-02-16 17:55                     ` [PATCH v11 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2023-02-16 18:03                     ` [PATCH v11 0/4] add support for self monitoring Ruifeng Wang
2023-05-04  8:02                     ` David Marchand
2023-07-31 12:33                       ` Thomas Monjalon
2023-08-07  8:11                         ` [EXT] " Tomasz Duszynski
2023-09-21  8:26                           ` David Marchand
2023-01-25 10:33         ` [PATCH 0/2] add platform bus Tomasz Duszynski
2023-01-25 10:33           ` [PATCH 1/2] lib: add helper to read strings from sysfs files Tomasz Duszynski
2023-01-25 10:39             ` Thomas Monjalon
2023-01-25 16:16               ` Tyler Retzlaff
2023-01-26  8:30                 ` [EXT] " Tomasz Duszynski
2023-01-26 17:21                   ` Tyler Retzlaff
2023-01-26  8:35               ` Tomasz Duszynski
2023-01-25 10:33           ` [PATCH 2/2] bus: add platform bus Tomasz Duszynski
2023-01-25 10:41           ` [PATCH 0/2] " Tomasz Duszynski
2023-02-16 20:56         ` [PATCH v5 0/4] add support for self monitoring Liang Ma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35D8759B@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=dev@dpdk.org \
    --cc=jerinj@marvell.com \
    --cc=mattias.ronnblom@ericsson.com \
    --cc=tduszynski@marvell.com \
    --cc=thomas@monjalon.net \
    --cc=zhoumin@loongson.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).