From: "Zhang, Helin" <helin.zhang@intel.com>
To: Matthew Hall <mhall@mhcomputing.net>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] librte_power w/ intel_pstate cpufreq governor
Date: Tue, 12 Jan 2016 15:17:21 +0000
Message-ID: <F35DEAC7BCE34641BA9FAC6BCA4A12E70A97A800@SHSMSX104.ccr.corp.intel.com>
In-Reply-To: <5688D2EE.5010700@mhcomputing.net>
Hi Matthew,
Yes, you have pointed out the key issue: the kernel's power management module has changed (been upgraded).
Could you try the legacy driver to see if it still works, as described in your link?
Taking control of the governor away from the kernel and into user space might need one more check before doing so.
But it is not actually a big issue, as the user can switch the governor back to anything via 'echo'.
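To make the "switch it back" point concrete, here is a minimal sketch in Python (not DPDK code) that saves the current governor, overrides it, and restores it afterwards. The class name GovernorOverride and the helper names are hypothetical. The sysfs file scaling_governor is the standard cpufreq attribute, and writing to it on a real system normally requires root.

```python
import os

# Standard Linux cpufreq sysfs root. It is a parameter everywhere below so
# the sketch can be exercised against a scratch directory instead.
CPU_SYSFS = "/sys/devices/system/cpu"


def governor_file(cpu, root=CPU_SYSFS):
    """Path of the per-CPU scaling_governor attribute."""
    return os.path.join(root, f"cpu{cpu}", "cpufreq", "scaling_governor")


def get_governor(cpu, root=CPU_SYSFS):
    """Read the governor currently selected for this CPU."""
    with open(governor_file(cpu, root)) as f:
        return f.read().strip()


def set_governor(cpu, governor, root=CPU_SYSFS):
    """Equivalent of: echo <governor> > .../cpuN/cpufreq/scaling_governor"""
    with open(governor_file(cpu, root), "w") as f:
        f.write(governor)


class GovernorOverride:
    """Hypothetical helper: remember the old governor and put it back on exit."""

    def __init__(self, cpu, governor, root=CPU_SYSFS):
        self.cpu, self.governor, self.root = cpu, governor, root
        self.saved = None

    def __enter__(self):
        self.saved = get_governor(self.cpu, self.root)
        set_governor(self.cpu, self.governor, self.root)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Restore even if the body raised, so the CPU is never left
        # under a governor that nobody is driving.
        set_governor(self.cpu, self.saved, self.root)
        return False
```

Used as "with GovernorOverride(0, 'userspace'): ...", the previous governor comes back even when the code inside fails, which addresses the concern about leaving the clock rate unmanaged.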
Yes, it seems that librte_power has been out of date for a while. It is not easy to track all the kernel versions.
Now that you have reported these issues, we have a good chance to do that. Let's look at the new power management mechanism and then see if we can do something.
Thanks a lot for your questions!
Regards,
Helin
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matthew Hall
> Sent: Sunday, January 3, 2016 3:51 PM
> To: dev@dpdk.org
> Subject: Re: [dpdk-dev] librte_power w/ intel_pstate cpufreq governor
>
> Hello,
>
> In about one month, I never received any response about all these major
> issues I was finding with librte_power and the intel_pstate based CPU
> clockrate control driver used in all the new Linux kernels.
>
> From what I can tell, none of this librte_power code ever worked right in the
> first place on Sandy Bridge and newer, because the chip secretly ignores
> clockrate adjustments from outside.
>
> Can anyone who is more expert about Intel Power Management please help
> me check this and point me to some documentation which explains how this
> is supposed to work?
>
> I am kind of blocked on doing performance / production quality
> improvements on my code, without some kind of basic help understanding
> how this librte_power stuff should work.
>
> Thanks,
> Matthew.
>
> On 12/5/15 4:08 PM, Matthew Hall wrote:
> > Hello all,
> >
> > I wanted to ask some questions about librte_power and the great
> > adaptive polling / IRQ mode example in l3fwd-power.
> >
> > I am very interested in getting this to work in my project because it
> > will make it much friendlier to attract new community developers if I
> > am as cooperative as possible with system resources.
> >
> > Let's discuss the init process for a moment. It has some problems on
> > my system, and I need some help to figure out how to handle this right.
> >
> > 1. Begins with the call to rte_power_init.
> >
> > 2. Attempts to init ACPI cpufreq mode.
> >
> > 2.1. Sets lcore cpufreq governor to userspace mode.
> >
> > 2.2. Function power_get_available_freqs checks lcore CPU frequencies from:
> >
> > /sys/devices/system/cpu/cpuX/cpufreq/scaling_available_frequencies
> >
> > 2.3. This fails with (cryptic) error "POWER: ERR: File not openned". I
> > am planning to write a patch for this error a bit later.
> >
> > My kernel is using the intel_pstate driver, so
> > scaling_available_frequencies does not exist:
> >
> > http://askubuntu.com/questions/544266/why-are-missing-the-frequency-options-on-cpufreq-utils-indicator
> >
> > 3. When power_get_available_freqs fails, rte_power_acpi_cpufreq_init fails.
> >
> > 4. rte_power_init will try rte_power_kvm_vm_init. That will fail
> > because it's a physical Skylake system not some kind of VM.
> >
> > 5. Now rte_power_init totally fails, with error "POWER: ERR: Unable to
> > set Power Management Environment for lcore 0".
> >
> > So, I have a couple of questions to figure out from here:
> >
> > 1. It seems bad to switch the governor into userspace before verifying
> > the frequencies available in scaling_available_frequencies. If there
> > are no frequencies available, it seems like it should not be trying to
> > take over control of an effectively uncontrollable value.
> >
> > 2. If the governor is switched to userspace, and then no governing is
> > done, it seems like the clockrate will necessarily always be wrong
> > also because nothing will be configuring it anymore, neither kernel,
> > nor failed DPDK userspace code, since rte_power_freq_up / down
> > function pointers will always be NULL. Is this true? This seems bad if so.
> >
> > It seems that the librte_power code is basically out of date, as
> > pstate has been present since Sandy Bridge, which is quite old by now
> > for network processing. I am not sure how to make this work right now.
> > So far I see a couple options but I really don't know much about this stuff:
> >
> > 1) skip rte_power_init completely, and let intel_pstate handle it
> > using HWP mode
> >
> > 2) disable intel_pstate, switch to the legacy ACPI cpufreq (but people
> > warned this old driver is mostly a no-op and the CPU ignores its
> > frequency requests).
> >
> > The Internet advice says it's possible, but not a very good idea, to
> > switch from the modern intel_pstate driver to the legacy ACPI mode.
> > The kernel docs (below) state that it's better to use HWP
> > (Hardware P-State) mode:
> >
> > https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt
> >
> > If none of this rte_power_init stuff works, are the other CPU
> > conservation measures inside the l3fwd-power example enough to work
> > right with HWP all by themselves with nothing additional?
> >
> > Thanks,
> > Matthew.
> >
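The init sequence and question 1 quoted above suggest probing sysfs before touching the governor. Here is a minimal read-only sketch in Python (not DPDK code), assuming the standard cpufreq sysfs layout. The function name plan_power_init and its three return labels are hypothetical, chosen only for illustration.

```python
import os


def plan_power_init(cpu, root="/sys/devices/system/cpu"):
    """Decide, without writing anything, whether userspace frequency
    control looks feasible for this CPU.

    Returns one of three illustrative labels:
      "no_cpufreq"        - no cpufreq driver is bound to this CPU
      "leave_to_kernel"   - no frequency table or no userspace governor
                            (the intel_pstate case from this thread)
      "userspace_control" - frequency list and userspace governor present
    """
    base = os.path.join(root, f"cpu{cpu}", "cpufreq")

    def read(name):
        # Missing attributes are an expected outcome, not an error.
        try:
            with open(os.path.join(base, name)) as f:
                return f.read().strip()
        except OSError:
            return None

    if read("scaling_driver") is None:
        return "no_cpufreq"

    # Check the frequency table BEFORE any governor change, so a failure
    # here leaves the kernel's own governor in charge.
    if not read("scaling_available_frequencies"):
        return "leave_to_kernel"

    governors = (read("scaling_available_governors") or "").split()
    if "userspace" not in governors:
        return "leave_to_kernel"

    return "userspace_control"
```

Only a "userspace_control" result would justify switching the governor. In the other two cases the caller would skip the switch entirely and leave frequency scaling to the kernel driver.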