From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>,
"Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
<dev@dpdk.org>, "Bruce Richardson" <bruce.richardson@intel.com>,
"Nicolas Chautru" <nicolas.chautru@intel.com>,
"Fan Zhang" <roy.fan.zhang@intel.com>,
"Ashish Gupta" <ashish.gupta@marvell.com>,
"Akhil Goyal" <gakhil@marvell.com>,
"David Hunt" <david.hunt@intel.com>,
"Chengwen Feng" <fengchengwen@huawei.com>,
"Kevin Laatz" <kevin.laatz@intel.com>,
"Ray Kinsella" <mdr@ashroe.eu>, <thomas@monjalon.net>,
"Ferruh Yigit" <ferruh.yigit@xilinx.com>,
"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>,
<jerinj@marvell.com>, "Sachin Saxena" <sachin.saxena@oss.nxp.com>,
<hemant.agrawal@nxp.com>, "Ori Kam" <orika@nvidia.com>,
"Konstantin Ananyev" <konstantin.v.ananyev@yandex.ru>
Cc: "Conor Walsh" <conor.walsh@intel.com>, "nd" <nd@arm.com>
Subject: RE: [PATCH v1 1/2] eal: add lcore busyness telemetry
Date: Mon, 18 Jul 2022 12:59:59 +0200 [thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D871CE@smartserver.smartshare.dk> (raw)
In-Reply-To: <9a27efbd-31d1-7525-51b0-ead0d969e98a@intel.com>
> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
> Sent: Monday, 18 July 2022 11.44
>
> On 17-Jul-22 10:56 AM, Morten Brørup wrote:
> >> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> >> Sent: Sunday, 17 July 2022 05.10
> >>
> >> <snip>
> >>
> >>> Subject: RE: [PATCH v1 1/2] eal: add lcore busyness telemetry
> >>>
> >>>> From: Anatoly Burakov [mailto:anatoly.burakov@intel.com]
> >>>> Sent: Friday, 15 July 2022 15.13
> >>>>
> >>>> Currently, there is no way to measure lcore busyness in a passive
> >> way,
> >>>> without any modifications to the application. This patch adds a
> new
> >>>> EAL API that will be able to passively track core busyness.
> >>>>
> >>>> The busyness is calculated by relying on the fact that most DPDK
> >> API's
> >>>> will poll for packets.
> >>>
> >>> This is an "alternative fact"! Only run-to-completion applications
> >> polls for RX.
> >>> Pipelined applications do not poll for packets in every pipeline
> >> stage.
> >> I guess you meant, poll for packets from NIC. They still need to
> >> receive packets from queues. We could do a similar thing for
> rte_ring
> >> APIs.
>
> Ring API is already instrumented to report telemetry in the same way,
> so
> any rte_ring-based pipeline will be able to track it. Obviously,
> non-DPDK API's will have to be instrumented too, we really can't do
> anything about that from inside DPDK.
>
> >
> > But it would mix apples, pears and bananas.
> >
> > Let's say you have a pipeline with three ingress preprocessing
> threads, two advanced packet processing threads in the next pipeline
> stage and one egress thread as the third pipeline stage.
> >
> > Now, the metrics reflects busyness for six threads, but three of them
> are apples, two of them are pears, and one is bananas.
> >
> > I just realized another example, where this patch might give
> misleading results on a run-to-completion application:
> >
> > One thread handles a specific type of packets received on an Ethdev
> ingress queue set up by the rte_flow APIs, and another thread handles
> ingress packets from another Ethdev ingress queue. E.g. the first queue
> may contain packets for well known flows, where packets can be
> processed quickly, and the other queue for other packets requiring more
> scrutiny. Both threads are run-to-completion and handle Ethdev ingress
> packets.
> >
> > *So: Only applications where the threads perform the exact same task
> can use this patch.*
>
> I do not see how that follows. I think you're falling for a "it's not
> 100% useful, therefore it's 0% useful" fallacy here. Some use cases
> would obviously make telemetry more informative than others, that's
> true, however I do not see how it's a mandatory requirement for lcore
> busyness to report the same thing. We can document the limitations and
> assumptions made, can we not?
I did use strong wording in my email to get my message through. However, I do consider the scope "applications where the threads perform the exact same task" more than 0 % of all deployed applications, and thus the patch is more than 0 % useful. But I certainly don't consider the scope for this patch 100 % of all deployed applications, and perhaps not even 80 %.
I didn't reject the patch or oppose to it, but requested it to be updated, so the names reflect the information provided by it. I strongly oppose to using "CPU Busyness" as the telemetry name for something that only reflects ingress activity, and is zero for a thread that only performs egress or other non-ingress tasks. That would be strongly misleading.
If you by "document the limitations and assumptions" also mean rename telemetry names and variables/functions in the patch to reflect what it actually does, then yes, documenting the limitations and assumptions suffices. However, adding a notice in some documentation that "CPU Business" telemetry only is correct/relevant for specific applications doesn't suffice.
>
> It is true that this patchset is mostly written from the standpoint of
> a
> run-to-completion application, but can we improve it? What would be
> your
> suggestions to make it better suit use cases you are familiar with?
Our application uses our own run-time profiler library to measure time spent in the application's various threads and pipeline stages. And the application needs to call the profiler library functions to feed it the information it needs. We still haven't found a good way to transform the profiler data to a generic summary CPU Utilization percentage, which should reflect how much of the system's CPU capacity is being used (preferably on a linear scale). (Our profiler library is designed specifically for our own purposes, and would require a complete rewrite to meet even basic DPDK library standards, so I won't even try to contribute it.)
I don't think it is possible to measure and report detailed CPU Busyness without involving the application. Only the application has knowledge about what the individual lcores are doing. Even for my example above (with two run-to-completion threads serving rte_flow configured Ethdev ingress queues), this patch would not provide information about which of the two types of traffic is causing the higher busyness. The telemetry might expose which specific thread is busy, but it doesn't tell which of the two tasks is being performed by that thread, and thus which kind of traffic is causing the busyness.
>
> >
> > Also, rings may be used for other purposes than queueing packets
> between pipeline stages. E.g. our application uses rings for fast bulk
> allocation and freeing of other resources.
> >
>
> Well, this is the tradeoff for simplicity. Of course we could add all
> sorts of stuff like dynamic enable/disable of this and that and the
> other... but the end goal was something easy and automatic and that
> doesn't require any work to implement, not something that suits 100% of
> the cases 100% of the time. Having such flexibility as you described
> comes at a cost that this patch was not meant to pay!
I do see the benefit of adding instrumentation like this to the DPDK libraries, so information becomes available at zero application development effort. The alternative would be a profiler/busyness library requiring application modifications.
I only request that:
1. The patch clearly reflects what is does, and
2. The instrumentation can be omitted at build time, so it has zero performance impact on applications where it is useless.
>
> --
> Thanks,
> Anatoly
PS: The busyness counters in the DPDK Service Cores library are also being updated [1].
[1] http://inbox.dpdk.org/dev/20220711131825.3373195-2-harry.van.haaren@intel.com/T/#u
next prev parent reply other threads:[~2022-07-18 11:00 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-15 13:12 Anatoly Burakov
2022-07-15 13:12 ` [PATCH v1 2/2] eal: add cpuset lcore telemetry entries Anatoly Burakov
2022-07-15 13:35 ` [PATCH v1 1/2] eal: add lcore busyness telemetry Burakov, Anatoly
2022-07-15 13:46 ` Jerin Jacob
2022-07-15 14:11 ` Bruce Richardson
2022-07-15 14:18 ` Burakov, Anatoly
2022-07-15 22:13 ` Morten Brørup
2022-07-16 14:38 ` Thomas Monjalon
2022-07-17 3:10 ` Honnappa Nagarahalli
2022-07-17 9:56 ` Morten Brørup
2022-07-18 9:43 ` Burakov, Anatoly
2022-07-18 10:59 ` Morten Brørup [this message]
2022-07-19 12:20 ` Thomas Monjalon
2022-07-18 15:46 ` Stephen Hemminger
2022-08-24 16:24 ` [PATCH v2 0/3] Add lcore poll " Kevin Laatz
2022-08-24 16:24 ` [PATCH v2 1/3] eal: add " Kevin Laatz
2022-08-24 16:24 ` [PATCH v2 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-08-24 16:24 ` [PATCH v2 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-08-25 7:47 ` [PATCH v2 0/3] Add lcore poll busyness telemetry Morten Brørup
2022-08-25 10:53 ` Kevin Laatz
2022-08-25 15:28 ` [PATCH v3 " Kevin Laatz
2022-08-25 15:28 ` [PATCH v3 1/3] eal: add " Kevin Laatz
2022-08-26 7:05 ` Jerin Jacob
2022-08-26 8:07 ` Bruce Richardson
2022-08-26 8:16 ` Jerin Jacob
2022-08-26 8:29 ` Morten Brørup
2022-08-26 15:27 ` Kevin Laatz
2022-08-26 15:46 ` Morten Brørup
2022-08-29 10:41 ` Bruce Richardson
2022-08-29 10:53 ` Thomas Monjalon
2022-08-29 12:36 ` Kevin Laatz
2022-08-29 12:49 ` Morten Brørup
2022-08-29 13:37 ` Kevin Laatz
2022-08-29 13:44 ` Morten Brørup
2022-08-29 14:21 ` Kevin Laatz
2022-08-29 11:22 ` Morten Brørup
2022-08-26 22:06 ` Mattias Rönnblom
2022-08-29 8:23 ` Bruce Richardson
2022-08-29 13:16 ` Kevin Laatz
2022-08-30 10:26 ` Kevin Laatz
2022-08-25 15:28 ` [PATCH v3 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-08-25 15:28 ` [PATCH v3 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-09-01 14:39 ` [PATCH v4 0/3] Add lcore poll busyness telemetry Kevin Laatz
2022-09-01 14:39 ` [PATCH v4 1/3] eal: add " Kevin Laatz
2022-09-01 14:39 ` [PATCH v4 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-01 14:39 ` [PATCH v4 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-09-02 15:58 ` [PATCH v5 0/3] Add lcore poll busyness telemetry Kevin Laatz
2022-09-02 15:58 ` [PATCH v5 1/3] eal: add " Kevin Laatz
2022-09-03 13:33 ` Jerin Jacob
2022-09-06 9:37 ` Kevin Laatz
2022-09-02 15:58 ` [PATCH v5 2/3] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-02 15:58 ` [PATCH v5 3/3] doc: add howto guide for lcore poll busyness Kevin Laatz
2022-09-13 13:19 ` [PATCH v6 0/4] Add lcore poll busyness telemetry Kevin Laatz
2022-09-13 13:19 ` [PATCH v6 1/4] eal: add " Kevin Laatz
2022-09-13 13:48 ` Morten Brørup
2022-09-13 13:19 ` [PATCH v6 2/4] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-13 13:19 ` [PATCH v6 3/4] app/test: add unit tests for lcore poll busyness Kevin Laatz
2022-09-13 13:19 ` [PATCH v6 4/4] doc: add howto guide " Kevin Laatz
2022-09-14 9:29 ` [PATCH v7 0/4] Add lcore poll busyness telemetry Kevin Laatz
2022-09-14 9:29 ` [PATCH v7 1/4] eal: add " Kevin Laatz
2022-09-14 14:30 ` Stephen Hemminger
2022-09-16 12:35 ` Kevin Laatz
2022-09-19 10:19 ` Konstantin Ananyev
2022-09-22 17:14 ` Kevin Laatz
2022-09-26 9:37 ` Konstantin Ananyev
2022-09-29 12:41 ` Kevin Laatz
2022-09-30 12:32 ` Jerin Jacob
2022-10-01 14:17 ` Konstantin Ananyev
2022-10-03 20:02 ` Mattias Rönnblom
2022-10-04 9:15 ` Morten Brørup
2022-10-04 11:57 ` Bruce Richardson
2022-10-04 14:26 ` Mattias Rönnblom
2022-10-04 23:30 ` Konstantin Ananyev
2022-09-30 22:13 ` Mattias Rönnblom
2022-09-14 9:29 ` [PATCH v7 2/4] eal: add cpuset lcore telemetry entries Kevin Laatz
2022-09-14 9:29 ` [PATCH v7 3/4] app/test: add unit tests for lcore poll busyness Kevin Laatz
2022-09-30 22:20 ` Mattias Rönnblom
2022-09-14 9:29 ` [PATCH v7 4/4] doc: add howto guide " Kevin Laatz
2022-09-14 14:33 ` [PATCH v7 0/4] Add lcore poll busyness telemetry Stephen Hemminger
2022-09-16 12:35 ` Kevin Laatz
2022-09-16 14:10 ` Kevin Laatz
2022-10-05 13:44 ` Kevin Laatz
2022-10-06 13:25 ` Morten Brørup
2022-10-06 15:26 ` Mattias Rönnblom
2022-10-10 15:22 ` Morten Brørup
2022-10-10 17:38 ` Mattias Rönnblom
2022-10-12 12:25 ` Morten Brørup
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=98CBD80474FA8B44BF855DF32C47DC35D871CE@smartserver.smartshare.dk \
--to=mb@smartsharesystems.com \
--cc=Honnappa.Nagarahalli@arm.com \
--cc=anatoly.burakov@intel.com \
--cc=andrew.rybchenko@oktetlabs.ru \
--cc=ashish.gupta@marvell.com \
--cc=bruce.richardson@intel.com \
--cc=conor.walsh@intel.com \
--cc=david.hunt@intel.com \
--cc=dev@dpdk.org \
--cc=fengchengwen@huawei.com \
--cc=ferruh.yigit@xilinx.com \
--cc=gakhil@marvell.com \
--cc=hemant.agrawal@nxp.com \
--cc=jerinj@marvell.com \
--cc=kevin.laatz@intel.com \
--cc=konstantin.v.ananyev@yandex.ru \
--cc=mdr@ashroe.eu \
--cc=nd@arm.com \
--cc=nicolas.chautru@intel.com \
--cc=orika@nvidia.com \
--cc=roy.fan.zhang@intel.com \
--cc=sachin.saxena@oss.nxp.com \
--cc=thomas@monjalon.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).