From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <lma25@ecsmtp.ir.intel.com>
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by dpdk.org (Postfix) with ESMTP id C99671B1E6
 for <dev@dpdk.org>; Thu, 13 Dec 2018 14:46:19 +0100 (CET)
X-Amp-Result: UNKNOWN
X-Amp-Original-Verdict: FILE UNKNOWN
X-Amp-File-Uploaded: False
Received: from fmsmga005.fm.intel.com ([10.253.24.32])
 by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 13 Dec 2018 05:46:18 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.56,349,1539673200"; d="scan'208";a="303535653"
Received: from irvmail001.ir.intel.com ([163.33.26.43])
 by fmsmga005.fm.intel.com with ESMTP; 13 Dec 2018 05:46:17 -0800
Received: from sivswdev09.ir.intel.com (sivswdev09.ir.intel.com
 [10.237.217.48])
 by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id
 wBDDkGoF032258; Thu, 13 Dec 2018 13:46:16 GMT
Received: from sivswdev09.ir.intel.com (localhost [127.0.0.1])
 by sivswdev09.ir.intel.com with ESMTP id wBDDkGxY020905;
 Thu, 13 Dec 2018 13:46:16 GMT
Received: (from lma25@localhost)
 by sivswdev09.ir.intel.com with LOCAL id wBDDkG7c020901;
 Thu, 13 Dec 2018 13:46:16 GMT
Date: Thu, 13 Dec 2018 13:46:16 +0000
From: "Liang, Ma" <liang.j.ma@intel.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Cc: david.hunt@intel.com, dev@dpdk.org, lei.a.yao@intel.com,
 ktraynor@redhat.com
Message-ID: <20181213134616.GB26517@sivswdev09.ir.intel.com>
References: <1542972781-6149-1-git-send-email-liang.j.ma@intel.com>
 <d6d32360-fc1d-eef5-6cff-c80ee2b43567@intel.com>
 <20181213105848.GA26517@sivswdev09.ir.intel.com>
 <79d8a78c-0dad-e750-5c71-7eacbdce9528@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <79d8a78c-0dad-e750-5c71-7eacbdce9528@intel.com>
User-Agent: Mutt/1.9.1 (2017-09-22)
Subject: Re: [dpdk-dev] [PATCH] libs/power: add p-state driver compatibility
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Dec 2018 13:46:20 -0000

 13 Dec 11:16, Burakov, Anatoly wrote:
> On 13-Dec-18 10:58 AM, Liang, Ma wrote:
> > On 10 Dec 16:08, Burakov, Anatoly wrote:
> > <snip>
> > > Hi Liang,
> > > 
> > > General comment: please do not break up log strings onto multiple lines. It
> > > makes it harder to grep the codebase. Checkpatch will not warn about log
> > > strings going over 80 characters.
> > agree, I will update in next version
> > > 
> > > > ---
> > > >    lib/librte_power/Makefile               |   2 +
> > > >    lib/librte_power/meson.build            |   4 +-
> > > >    lib/librte_power/power_pstate_cpufreq.c | 778 ++++++++++++++++++++++++++++++++
> > > >    lib/librte_power/power_pstate_cpufreq.h | 218 +++++++++
> > > >    lib/librte_power/rte_power.c            |  43 +-
> > > >    lib/librte_power/rte_power.h            |   3 +-
> > > >    6 files changed, 1039 insertions(+), 9 deletions(-)
> > > >    create mode 100644 lib/librte_power/power_pstate_cpufreq.c
> > > >    create mode 100644 lib/librte_power/power_pstate_cpufreq.h
> > > 
> > > <snip>
> > > 
> > > >    sources = files('rte_power.c', 'power_acpi_cpufreq.c',
> > > >    		'power_kvm_vm.c', 'guest_channel.c',
> > > > -		'rte_power_empty_poll.c')
> > > > +		'rte_power_empty_poll.c',
> > > > +		'power_pstate_cpufreq.c')
> > > >    headers = files('rte_power.h','rte_power_empty_poll.h')
> > > > -deps += ['timer']
> > > > diff --git a/lib/librte_power/power_pstate_cpufreq.c b/lib/librte_power/power_pstate_cpufreq.c
> > > > new file mode 100644
> > > > index 0000000..f1dada8
> > > > --- /dev/null
> > > > +++ b/lib/librte_power/power_pstate_cpufreq.c
> > > > @@ -0,0 +1,778 @@
> > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > + * Copyright(c) 2010-2018 Intel Corporation
> > > 
> > > I believe it should be 2018.
> > I didn't see anything wrong here.
> 
> The copyright on the file should start when it was first written. You're
> created this file in 2018, not 2010 :)
> 
> > > 
> > > > + */
> > > > +
> > > > +#include <stdio.h>
> > > > +#include <sys/types.h>
> > > > +#include <sys/stat.h>
> > > > +#include <fcntl.h>
> > > > +#include <stdlib.h>
> > > > +#include <string.h>
> > > > +#include <unistd.h>
> > > > +#include <signal.h>
> > > > +#include <limits.h>
> > > > +#include <errno.h>
> > > > +
> > > > +#include <rte_memcpy.h>
> > > > +#include <rte_atomic.h>
> > > > +
> > > > +#include "power_pstate_cpufreq.h"
> > > > +#include "power_common.h"
> > > > +
> > > <snip>
> > > 
> > > > +
> 
> <snip>
> 
> > > > +uint32_t
> > > > +power_pstate_cpufreq_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t num)
> > > > +{
> > > > +	struct pstate_power_info *pi;
> > > > +
> > > > +	if (lcore_id >= RTE_MAX_LCORE) {
> > > > +		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
> > > > +		return -1;
> > > > +	}
> > > > +
> > > > +	pi = &lcore_power_info[lcore_id];
> > > > +	if (num < pi->nb_freqs) {
> > > > +		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
> > > > +		return 0;
> > > > +	}
> > > > +	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
> > > 
> > > Why rte_memcpy?
> > the table can be large
> 
> So? :) This isn't a hotpath, so regular memcpy should be fine. I mean, it's
> OK to use rte_memcpy as well, just seems weird.
> 
> > > > +
> > > > +	return pi->nb_freqs;
> > > > +}
> > > > +
> > > > +uint32_t
> > > > +power_pstate_cpufreq_get_freq(unsigned int lcore_id)
> > > > +{
> > > > +	if (lcore_id >= RTE_MAX_LCORE) {
> > > > +		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
> > > > +		return RTE_POWER_INVALID_FREQ_INDEX;
> > > > +	}
> > > > +
> > > > +	return lcore_power_info[lcore_id].curr_idx;
> > > > +}
> > > > +
> > > 
> > > <snip>
> > > 
> > > > +	caps->turbo = !!(pi->turbo_available);
> > > > +
> > > > +	return 0;
> > > > +}
> > > > diff --git a/lib/librte_power/power_pstate_cpufreq.h b/lib/librte_power/power_pstate_cpufreq.h
> > > > new file mode 100644
> > > > index 0000000..0fc917a
> > > > --- /dev/null
> > > > +++ b/lib/librte_power/power_pstate_cpufreq.h
> > > > @@ -0,0 +1,218 @@
> > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > + * Copyright(c) 2010-2018 Intel Corporation
> > > 
> > > I believe it should be just 2018.
> > > 
> > > > + */
> > > > +
> > > > +#ifndef _POWER_PSTATE_CPUFREQ_H
> > > > +#define _POWER_PSTATE_CPUFREQ_H
> > > > +
> > > > +/**
> > > > + * @file
> > > > + * RTE Power Management via Intel Pstate driver
> > > > + */
> > > > +
> > > > +#include <rte_common.h>
> > > > +#include <rte_byteorder.h>
> > > > +#include <rte_log.h>
> > > > +#include <rte_string_fns.h>
> > > > +#include "rte_power.h"
> > > > +
> > > > +#ifdef __cplusplus
> > > > +extern "C" {
> > > > +#endif
> > > 
> > > I don't think this is necessary. These extern declarations are only for
> > > public API headers, so that C++ compilers could parse them. Since this is
> > > not a public header, there's no need for extern C declaration.
> > just keep as same as other headers
> 
> Well, other headers have it wrong then, if they too specify this extern
> declaration. This is only needed for public-facing headers. Don't copy
> bad/unnecessary code from other files - consistency is important, but not
> *that* important :)
> 
> > > 
> > > > +
> > > > +/**
> > > > + * Initialize power management for a specific lcore. It will check and set the
> > > > + * governor to performance for the lcore, get the available frequencies, and
> > > > + * prepare to set new lcore frequency.
> > > > + *
> > > > + * @param lcore_id
> > > > + *  lcore id.
> > > > + *
> > > > + * @return
> > > > + *  - 0 on success.
> > > > + *  - Negative on error.
> > > > + */
> > > 
> 
> <snip>
> 
> > > 
> > > >    	RTE_LOG(ERR, POWER, "Environment has not been set, unable to exit "
> > > >    				"gracefully\n");
> > > > diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
> > > > index d70bc0b..c5e8f6b 100644
> > > > --- a/lib/librte_power/rte_power.h
> > > > +++ b/lib/librte_power/rte_power.h
> > > > @@ -20,7 +20,8 @@ extern "C" {
> > > >    #endif
> > > >    /* Power Management Environment State */
> > > > -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM};
> > > > +enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> > > > +		PM_ENV_PSTATE_CPUFREQ};
> > > 
> > > I don't like this approach. While it can be argued that application needs to
> > > be explicitly aware of whether it's using native (ACPI or pstate) vs. KVM
> > > power management, there's no real reason for an application to differentiate
> > > between ACPI and pstate modes.
> > > 
> > > Changing this would of course require a deprecation notice, so for now, can
> > > we hide all of this behind ACPI mode, and auto-detect whether we want ACPI
> > > or pstate mode? IMO it would be better for the user application to use a
> > > somewhat misnamed ACPI option for both ACPI and pstate modes, than to
> > > needlessly care about whether one or the other is in use.
> > > 
> > > What do you think?
> > acpi has significant difference with hwp. sysfs/xxxx/setspeed  sysfs/xxxx/scaling_driver is different
> > I would like to make  application aware the driver type.
> 
> Yes, however please correct me if i'm wrong - aren't all of these changes
> abstracted away by the power library API? However different they are
> internally, does the *application* really need to differentiate between the
> two? Are there differences *to the user* when using power API with pstate
> vs. ACPI? If not, then why do we need another env type when we can handle
> the differences internally?
>
Currently,  rte_power_init will try all three subsystem one by one to figure out the right driver.
however, the rte_power_set_env as  a interface is keep here for flexibility I thought.
also that's the original design for the library. if we find better way, we can implementa that 
in the future. but that's other patch then.;-)
> > > 
> > > >    /**
> > > >     * Set the default power management implementation. If this is not called prior
> > > > 
> > > 
> > > -- 
> > > Thanks,
> > > Anatoly
> > 
> 
> 
> -- 
> Thanks,
> Anatoly