From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id D10C3A04DB;
	Fri,  4 Sep 2020 20:34:13 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 3B995E07;
	Fri,  4 Sep 2020 20:34:13 +0200 (CEST)
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
 by dpdk.org (Postfix) with ESMTP id 49409255
 for <dev@dpdk.org>; Fri,  4 Sep 2020 20:34:11 +0200 (CEST)
IronPort-SDR: kN33W15CutSHFW4nCbuCD1CnRGNAJ134tu71ykpmirDkJyRlJAywE0mpynXJOjmYyeNty9Oq9o
 zvk2GKI0+Omg==
X-IronPort-AV: E=McAfee;i="6000,8403,9734"; a="145534872"
X-IronPort-AV: E=Sophos;i="5.76,391,1592895600"; d="scan'208";a="145534872"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga008.jf.intel.com ([10.7.209.65])
 by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 04 Sep 2020 11:34:10 -0700
IronPort-SDR: Xp03pA8NkXaR+dv4mn1lsSSp46NoW4xPMTjB/6fI+wcKpILSJxg81/fNzNm1HgrEEeeC3ndP5v
 0reH5HICVPIg==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.76,391,1592895600"; d="scan'208";a="332252782"
Received: from orsmsx603.amr.corp.intel.com ([10.22.229.16])
 by orsmga008.jf.intel.com with ESMTP; 04 Sep 2020 11:34:10 -0700
Received: from orsmsx601.amr.corp.intel.com (10.22.229.14) by
 ORSMSX603.amr.corp.intel.com (10.22.229.16) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1713.5; Fri, 4 Sep 2020 11:34:09 -0700
Received: from ORSEDG601.ED.cps.intel.com (10.7.248.6) by
 orsmsx601.amr.corp.intel.com (10.22.229.14) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5
 via Frontend Transport; Fri, 4 Sep 2020 11:34:09 -0700
Received: from NAM11-DM6-obe.outbound.protection.outlook.com (104.47.57.169)
 by edgegateway.intel.com (134.134.137.102) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.1.1713.5; Fri, 4 Sep 2020 11:33:54 -0700
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=EZPOH3D7vlqZbgHD+VTCpdx6Tt4C0QYlvOsZGebJCHCYRmtQqMmexMW00bUYJskCJSCXABxRDpI1IakcbW0q8Vfe4A7lhfyLiG5jOxL1MjctxmIu3tvpdxagp2rllFCAryH/H3LUChvLBie5PSA4NYr4k6MF/i3UqojjWoGbMauROOE5rRx2tQjTRrP5JWMnTM2xhyE20PauHy5BK8QPg1NzlgPHqdOfk9dxZCWhvIGg4iTrW0/9HNXpjGasA/VxmWt9yLVbckGtbONaqcmSQLQIG+1kKOiBgn/3tSBatsq2pU0zCJy4xhMQvP6sZ835ncqEXigqzxQE8PqCTGrEpg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=hvFsEeJAI83HBqZ+L1gKF+sFMsB+3zETFbbWDYpuhT4=;
 b=UEzHv7iUhktGCABiik2IaSEjInP8h/cL/1BYlqLRnQ879trx5tUTEg0B/948+yDrBWU1/n/NbIJOcN/PZyoLH0ozskBf1pL+Q6lXC4wUtz9OiCnTa/bciEmXZePgsco8n+/VZGoj5Ie8aQQWU3uQMPnle98YTwcSukjm6Y1ogjN6nU456r+PNj25wXwA28Kj146MZQrGuDERKbsoq7U+xXLZdjq3Ee8sVgpFWj8PcB2KqW0AD0eC7Dkx5nZWE4fWkwFtDk9y7hNznCrpHEN7gJyxQ92YTt00YKVstxkUyKc5AculC5LD7Y1cUlQfxf4RdawlSHupjANeQ8yMcPngMw==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass
 smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com;
 dkim=pass header.d=intel.com; arc=none
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; 
 s=selector2-intel-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=hvFsEeJAI83HBqZ+L1gKF+sFMsB+3zETFbbWDYpuhT4=;
 b=GKFzJX3c2Z5t4aMyU8fTyrl/f599fsVEu1CFj5SBisKokJwyXHTs2VbZncOZfJu0+QABfI4QeWgMcD67wea2VJ/G4hD5BkdAbMniQ+Ik5XntoMWDrTX6hmsCMW0hGF27CqMDOdoRRnmRnQXpytBGMclffKOxAYuvFZW3rGEne5c=
Received: from BYAPR11MB3301.namprd11.prod.outlook.com (2603:10b6:a03:7f::26)
 by BYAPR11MB3670.namprd11.prod.outlook.com (2603:10b6:a03:f8::11)
 with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3326.23; Fri, 4 Sep
 2020 18:33:02 +0000
Received: from BYAPR11MB3301.namprd11.prod.outlook.com
 ([fe80::f43b:a137:dab8:8b0b]) by BYAPR11MB3301.namprd11.prod.outlook.com
 ([fe80::f43b:a137:dab8:8b0b%6]) with mapi id 15.20.3348.015; Fri, 4 Sep 2020
 18:33:02 +0000
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Ma, Liang J" <liang.j.ma@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
CC: "Hunt, David" <david.hunt@intel.com>, "Burakov, Anatoly"
 <anatoly.burakov@intel.com>, "Ma, Liang J" <liang.j.ma@intel.com>
Thread-Topic: [dpdk-dev] [PATCH v3 3/6] power: add simple power management API
 and callback
Thread-Index: AQHWgqTvSMUaGmA5RkSVjaZPl0KEqqlYyA7A
Date: Fri, 4 Sep 2020 18:33:01 +0000
Message-ID: <BYAPR11MB3301AE40DE185AFAF89F4FFE9A2D0@BYAPR11MB3301.namprd11.prod.outlook.com>
References: <1597141666-20621-1-git-send-email-liang.j.ma@intel.com>
 <1599214740-3927-1-git-send-email-liang.j.ma@intel.com>
 <1599214740-3927-3-git-send-email-liang.j.ma@intel.com>
In-Reply-To: <1599214740-3927-3-git-send-email-liang.j.ma@intel.com>
Accept-Language: en-GB, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
dlp-product: dlpe-windows
dlp-reaction: no-action
dlp-version: 11.5.1.3
authentication-results: intel.com; dkim=none (message not signed)
 header.d=none;intel.com; dmarc=none action=none header.from=intel.com;
x-originating-ip: [46.7.39.127]
x-ms-publictraffictype: Email
x-ms-office365-filtering-correlation-id: 8e0e5802-1e7f-4681-5e41-08d85100f4cb
x-ms-traffictypediagnostic: BYAPR11MB3670:
x-ld-processed: 46c98d88-e344-4ed4-8496-4ed7712e255d,ExtAddr
x-ms-exchange-transport-forked: True
x-microsoft-antispam-prvs: <BYAPR11MB36709D5EE5D1C7B47CF28F009A2D0@BYAPR11MB3670.namprd11.prod.outlook.com>
x-ms-oob-tlc-oobclassifiers: OLM:3631;
x-ms-exchange-senderadcheck: 1
x-microsoft-antispam: BCL:0;
x-microsoft-antispam-message-info: BtCChhL9/SQNvTXTdXjgdN2+awyZ1ae54DAff6d9vpf1lbVYAXqZfdcgzEkqOqymjvP12xtUSyaAv0BW4sQ5qquzKrO9kcsktDRILqFJzw7QQFxlkjwD+bsTKP3s0OvsRGtDXpnxZJwePe3k+P3q+ypbDj50nS6Fm3WjGB+/nIOdOo9Q8SynljDOkPfED3o4VNFKruxGUXej947aOgORy6g7LBJWMoJSodHawtOy+fXDKWV9xYQSFojpqj1k6obIB2f5mNbx2gAXOQvPhF9pbwdOGkMdqj3xUdIuYU0CPolG11+4IKN2W7WRE0JXIOtxZLYu6q+PfiptDqPvqvScyA==
x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:;
 IPV:NLI; SFV:NSPM; H:BYAPR11MB3301.namprd11.prod.outlook.com; PTR:; CAT:NONE;
 SFS:(4636009)(136003)(366004)(346002)(39860400002)(396003)(376002)(6506007)(5660300002)(107886003)(54906003)(86362001)(26005)(9686003)(71200400001)(2906002)(8936002)(52536014)(66446008)(66556008)(66476007)(66946007)(316002)(4326008)(186003)(33656002)(110136005)(55016002)(478600001)(64756008)(83380400001)(76116006)(7696005);
 DIR:OUT; SFP:1102; 
x-ms-exchange-antispam-messagedata: 6Gk9wFhHjSM6srYb+9MG5uFYYt5RM7kMgNmi7k4gKc5rcU5Vdg93aUVL0V3SpxSYDvVsagdf5VVGMgNQ5XZJ5diXkLtWehnA7p1QB7U4d6RuBGpsb4qE488jV2BlSeBtgeGDUVoMXASPZNGp1IGqhlJQJyzAVVvUVrNlhkzsMGVjb2uguhkt/lkFEZqc+9l/DONrDCIwLq9w1SWcoTA/RmXwx33GalM2/bZWVdrm/YD3W6HLxsvtBrTXj8ObBB8h8IWZC/hiHjW6GbolHhrVYaMppmmPJ3xDHGiJVl5P3FZyMJ/WxxKbNClsxk2aUUsMWQWpYT1JapKREhfbgFDReEARqstYBqeTQn6Mfo7YV6PfVzXytgncYRJUd6AUQTiXfk9rNrsb2xviQrf2dwwRdt6ha1YlXs3quYiRe2+y//ZzVdM03h9bh+Yl4IhrrKHQ7rt94VUlT9KAHFb1ImawhWns/b2lsKD2sDOXsmHC0Qm6CTp+aoGUXC1ZpJQjQWuPepsLYhgTJte9PZ/000oKCe1gFhVQsWkCF4Jy+j5obvgyusPVFS+aepZOef5d8mIRSXRuGOuzVM+n4dulAL3BwTj/xzLZBoA1Wsk846KeA06uTzVbhzQmrVoWrGp6jgespSJ1ggFCSrFXeq94953AdA==
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-AuthSource: BYAPR11MB3301.namprd11.prod.outlook.com
X-MS-Exchange-CrossTenant-Network-Message-Id: 8e0e5802-1e7f-4681-5e41-08d85100f4cb
X-MS-Exchange-CrossTenant-originalarrivaltime: 04 Sep 2020 18:33:02.0166 (UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d
X-MS-Exchange-CrossTenant-mailboxtype: HOSTED
X-MS-Exchange-CrossTenant-userprincipalname: vZI/r8rr0NNO8oqdnJ3KnkdqyUtLT0ubfoORjv0gn/LcNrQUDHmHw0vCmhyaTGWJib1NP+XxDjDYsOXZatzUM+Q1TO9UIN11rCVCjFuYQjI=
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BYAPR11MB3670
X-OriginatorOrg: intel.com
Subject: Re: [dpdk-dev] [PATCH v3 3/6] power: add simple power management
	API	and callback
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

>=20
> Add a simple on/off switch that will enable saving power when no
> packets are arriving. It is based on counting the number of empty
> polls and, when the number reaches a certain threshold, entering an
> architecture-defined optimized power state that will wait either
> until a TSC timestamp expires or until packets arrive.
>=20
> This API is limited to the 1 core, 1 port, 1 queue use case, as there
> is no coordination between queues/cores in ethdev. Mapping 1 port to
> multiple cores will be supported in the next version.
>=20
> This design leverages the RX callback mechanism, which allows three
> different power management methodologies to coexist.
>=20
> 1. umwait/umonitor:
>=20
>    The TSC timestamp is automatically calculated using current
>    link speed and RX descriptor ring size, such that the sleep
>    time is not longer than it would take for a NIC to fill its
>    entire RX descriptor ring.
>=20
> 2. Pause instruction
>=20
>    Instead of moving the core into a deeper C-state, this lightweight
>    method uses the PAUSE instruction to relieve the processor from
>    busy polling.
>=20
> 3. Frequency Scaling
>    Reuse the existing rte_power library to scale the core frequency
>    up/down depending on traffic volume.
>=20
> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  lib/librte_power/meson.build           |   3 +-
>  lib/librte_power/rte_power.h           |  38 +++++
>  lib/librte_power/rte_power_pmd_mgmt.c  | 184 +++++++++++++++++++++++++
>  lib/librte_power/rte_power_version.map |   4 +
>  4 files changed, 228 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_power/rte_power_pmd_mgmt.c
>=20
> diff --git a/lib/librte_power/meson.build b/lib/librte_power/meson.build
> index 78c031c943..44b01afce2 100644
> --- a/lib/librte_power/meson.build
> +++ b/lib/librte_power/meson.build
> @@ -9,6 +9,7 @@ sources =3D files('rte_power.c', 'power_acpi_cpufreq.c',
>  		'power_kvm_vm.c', 'guest_channel.c',
>  		'rte_power_empty_poll.c',
>  		'power_pstate_cpufreq.c',
> +		'rte_power_pmd_mgmt.c',
>  		'power_common.c')
>  headers =3D files('rte_power.h','rte_power_empty_poll.h')
> -deps +=3D ['timer']
> +deps +=3D ['timer' ,'ethdev']
> diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
> index bbbde4dfb4..06d5a9984f 100644
> --- a/lib/librte_power/rte_power.h
> +++ b/lib/librte_power/rte_power.h
> @@ -14,6 +14,7 @@
>  #include <rte_byteorder.h>
>  #include <rte_log.h>
>  #include <rte_string_fns.h>
> +#include <rte_ethdev.h>
>=20
>  #ifdef __cplusplus
>  extern "C" {
> @@ -97,6 +98,43 @@ int rte_power_init(unsigned int lcore_id);
>   */
>  int rte_power_exit(unsigned int lcore_id);
>=20
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior no=
tice
> + *
> + * Enable device power management.
> + * @param lcore_id
> + *   lcore id.
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param mode
> + *   The power management callback function mode.
> + * @return
> + *   0 on success
> + *   <0 on error
> + */
> +__rte_experimental
> +int rte_power_pmd_mgmt_enable(unsigned int lcore_id,
> +				  uint16_t port_id,
> +				  enum rte_eth_dev_power_mgmt_cb_mode mode);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior no=
tice
> + *
> + * Disable device power management.
> + * @param lcore_id
> + *   lcore id.
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + *
> + * @return
> + *   0 on success
> + *   <0 on error
> + */
> +__rte_experimental
> +int rte_power_pmd_mgmt_disable(unsigned int lcore_id, uint16_t port_id);
> +
>  /**
>   * Get the available frequencies of a specific lcore.
>   * Function pointer definition. Review each environments
> diff --git a/lib/librte_power/rte_power_pmd_mgmt.c b/lib/librte_power/rte=
_power_pmd_mgmt.c
> new file mode 100644
> index 0000000000..a445153ede
> --- /dev/null
> +++ b/lib/librte_power/rte_power_pmd_mgmt.c
> @@ -0,0 +1,184 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2010-2020 Intel Corporation
> + */
> +
> +#include <rte_lcore.h>
> +#include <rte_cycles.h>
> +#include <rte_atomic.h>
> +#include <rte_malloc.h>
> +#include <rte_ethdev.h>
> +
> +#include "rte_power.h"
> +
> +
> +
> +static uint16_t
> +rte_power_mgmt_umwait(uint16_t port_id, uint16_t qidx,
> +		struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx,
> +		uint16_t max_pkts __rte_unused, void *_  __rte_unused)
> +{
> +
> +	struct rte_eth_dev *dev =3D &rte_eth_devices[port_id];
> +
> +	if (unlikely(nb_rx =3D=3D 0)) {
> +		dev->empty_poll_stats[qidx].num++;

I believe there are two fundamental issues with this approach:
1. You put metadata specific to the power library's callbacks into the
   rte_eth_dev struct.
2. These callbacks access rte_eth_devices[] directly.
That doesn't look right to me - the rte_eth_dev structure is supposed to
be treated as internal to librte_ethdev and the underlying drivers, and
should not be accessed directly by outer code.
If these callbacks need some extra metadata, then it is the responsibility
of the power library to allocate/manage that metadata.
You can pass a pointer to this metadata via the last parameter of
rte_eth_add_rx_callback().
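
Roughly what I have in mind, as a self-contained sketch only. The names
below (pmd_queue_state, EMPTYPOLL_MAX, power_cb) are made up for
illustration and merely mimic the shape of an rx callback; the real
callback type and its registration are the ones in rte_ethdev.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold, stands in for ETH_EMPTYPOLL_MAX. */
#define EMPTYPOLL_MAX 512

/* Hypothetical per-queue state, owned by the power library. */
struct pmd_queue_state {
	uint64_t empty_polls;    /* consecutive empty polls so far */
	uint64_t sleep_requests; /* times the threshold was exceeded */
};

struct rte_mbuf; /* opaque here, the sketch never touches packets */

/*
 * Same parameter list as an ethdev rx callback. All state arrives
 * through user_param, so nothing here looks at rte_eth_devices[].
 */
static uint16_t
power_cb(uint16_t port_id, uint16_t qidx, struct rte_mbuf **pkts,
	 uint16_t nb_rx, uint16_t max_pkts, void *user_param)
{
	struct pmd_queue_state *st = user_param;

	(void)port_id;
	(void)qidx;
	(void)pkts;
	(void)max_pkts;

	if (nb_rx == 0) {
		/* real code would pause/wait once over the threshold */
		if (++st->empty_polls > EMPTYPOLL_MAX)
			st->sleep_requests++;
	} else {
		st->empty_polls = 0;
	}
	return nb_rx;
}
```

The state pointer would then go in as the last argument of
rte_eth_add_rx_callback() instead of NULL.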

> +		if (unlikely(dev->empty_poll_stats[qidx].num >
> +			     ETH_EMPTYPOLL_MAX)) {
> +			volatile void *target_addr;
> +			uint64_t expected, mask;
> +			uint16_t ret;
> +
> +			/*
> +			 * get address of next descriptor in the RX
> +			 * ring for this queue, as well as expected
> +			 * value and a mask.
> +			 */
> +			ret =3D (*dev->dev_ops->next_rx_desc)
> +				(dev->data->rx_queues[qidx],
> +				 &target_addr, &expected, &mask);
> +			if (ret =3D=3D 0)
> +				/* -1ULL is maximum value for TSC */
> +				rte_power_monitor(target_addr,
> +						  expected, mask,
> +						  0, -1ULL);
> +		}
> +	} else
> +		dev->empty_poll_stats[qidx].num =3D 0;
> +
> +	return nb_rx;
> +}
> +
> +static uint16_t
> +rte_power_mgmt_pause(uint16_t port_id, uint16_t qidx,
> +		struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx,
> +		uint16_t max_pkts __rte_unused, void *_  __rte_unused)
> +{
> +	struct rte_eth_dev *dev =3D &rte_eth_devices[port_id];
> +
> +	int i;
> +
> +	if (unlikely(nb_rx =3D=3D 0)) {
> +
> +		dev->empty_poll_stats[qidx].num++;
> +
> +		if (unlikely(dev->empty_poll_stats[qidx].num >
> +			     ETH_EMPTYPOLL_MAX)) {
> +
> +			for (i =3D 0; i < RTE_ETH_PAUSE_NUM; i++)
> +				rte_pause();
> +
> +		}
> +	} else
> +		dev->empty_poll_stats[qidx].num =3D 0;
> +
> +	return nb_rx;
> +}
> +
> +static uint16_t
> +rte_power_mgmt_scalefreq(uint16_t port_id, uint16_t qidx,
> +		struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx,
> +		uint16_t max_pkts __rte_unused, void *_  __rte_unused)
> +{
> +	struct rte_eth_dev *dev =3D &rte_eth_devices[port_id];
> +
> +	if (unlikely(nb_rx =3D=3D 0)) {
> +		dev->empty_poll_stats[qidx].num++;
> +		if (unlikely(dev->empty_poll_stats[qidx].num >
> +			     ETH_EMPTYPOLL_MAX)) {
> +
> +			/*scale down freq */
> +			rte_power_freq_min(rte_lcore_id());
> +
> +		}
> +	} else {
> +		dev->empty_poll_stats[qidx].num =3D 0;
> +		/* scal up freq */
> +		rte_power_freq_max(rte_lcore_id());
> +	}
> +
> +	return nb_rx;
> +}
> +
> +int
> +rte_power_pmd_mgmt_enable(unsigned int lcore_id,
> +			uint16_t port_id,
> +			enum rte_eth_dev_power_mgmt_cb_mode mode)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> +	dev =3D &rte_eth_devices[port_id];
> +
> +	if (dev->pwr_mgmt_state =3D=3D RTE_ETH_DEV_POWER_MGMT_ENABLED)
> +		return -EINVAL;
> +	/* allocate memory for empty poll stats */
> +	dev->empty_poll_stats =3D rte_malloc_socket(NULL,
> +						  sizeof(struct rte_eth_ep_stat)
> +						  * RTE_MAX_QUEUES_PER_PORT,
> +						  0, dev->data->numa_node);
> +	if (dev->empty_poll_stats =3D=3D NULL)
> +		return -ENOMEM;
> +
> +	switch (mode) {
> +	case RTE_ETH_DEV_POWER_MGMT_CB_WAIT:
> +		if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_WAITPKG))
> +			return -ENOTSUP;

Here and in other places: on an error return you don't free your
empty_poll_stats.
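
One way to structure it so that every failure path releases the buffer.
This is a sketch with plain calloc()/free() and a made-up setup_mode()
helper in place of the real rte_malloc/ethdev calls; port_state and
MAX_QUEUES are likewise hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, stands in for RTE_MAX_QUEUES_PER_PORT. */
#define MAX_QUEUES 8

struct port_state {
	uint64_t *empty_poll_stats;
	int enabled;
};

/* Hypothetical stand-in for the per-mode setup; fails for mode < 0. */
static int
setup_mode(int mode)
{
	return mode < 0 ? -ENOTSUP : 0;
}

static int
pmd_mgmt_enable(struct port_state *ps, int mode)
{
	int ret;

	if (ps->enabled)
		return -EINVAL;

	ps->empty_poll_stats = calloc(MAX_QUEUES, sizeof(uint64_t));
	if (ps->empty_poll_stats == NULL)
		return -ENOMEM;

	ret = setup_mode(mode);
	if (ret != 0)
		goto fail;

	ps->enabled = 1;
	return 0;

fail:
	/* single exit point for errors: undo the allocation */
	free(ps->empty_poll_stats);
	ps->empty_poll_stats = NULL;
	return ret;
}
```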

> +		dev->cur_pwr_cb =3D rte_eth_add_rx_callback(port_id, 0,

Why zero for the queue number? Why not pass queue_id as a parameter to
that function?

> +						rte_power_mgmt_umwait, NULL);

As I said above, instead of NULL this could be a pointer to a metadata
struct.

> +		break;
> +	case RTE_ETH_DEV_POWER_MGMT_CB_SCALE:
> +		/* init scale freq */
> +		if (rte_power_init(lcore_id))
> +			return -EINVAL;
> +		dev->cur_pwr_cb =3D rte_eth_add_rx_callback(port_id, 0,
> +					rte_power_mgmt_scalefreq, NULL);
> +		break;
> +	case RTE_ETH_DEV_POWER_MGMT_CB_PAUSE:
> +		dev->cur_pwr_cb =3D rte_eth_add_rx_callback(port_id, 0,
> +						rte_power_mgmt_pause, NULL);
> +		break;
> +	}
> +
> +	dev->cb_mode =3D mode;
> +	dev->pwr_mgmt_state =3D RTE_ETH_DEV_POWER_MGMT_ENABLED;
> +	return 0;
> +}
> +
> +int
> +rte_power_pmd_mgmt_disable(unsigned int lcore_id,
> +				uint16_t port_id)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> +	dev =3D &rte_eth_devices[port_id];
> +
> +	/*add flag check */
> +
> +	if (dev->pwr_mgmt_state =3D=3D RTE_ETH_DEV_POWER_MGMT_DISABLED)
> +		return -EINVAL;
> +
> +	/* rte_free ignores NULL so safe to call without checks */
> +	rte_free(dev->empty_poll_stats);

You can't free callback metadata before removing the callback itself.
In fact, with the current rx callback code it is not safe to free it
even afterwards (we discussed it offline).

> +
> +	switch (dev->cb_mode) {
> +	case RTE_ETH_DEV_POWER_MGMT_CB_WAIT:
> +	case RTE_ETH_DEV_POWER_MGMT_CB_PAUSE:
> +		rte_eth_remove_rx_callback(port_id, 0,
> +					   dev->cur_pwr_cb);
> +		break;
> +	case RTE_ETH_DEV_POWER_MGMT_CB_SCALE:
> +		rte_power_freq_max(lcore_id);

Stupid q: what makes you think that the lcore frequency was at max
*before* you set up the callback?
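
What I'd expect instead is to remember the frequency at enable time and
put it back at disable time. A sketch with a fake frequency table:
get_freq()/set_freq(), NB_LCORES and the two arrays are stand-ins I made
up here; in the real code that would presumably be rte_power_get_freq()
and rte_power_set_freq().

```c
#include <stdint.h>

#define NB_LCORES 4 /* hypothetical */

static uint32_t cur_freq_idx[NB_LCORES];   /* fake current index */
static uint32_t saved_freq_idx[NB_LCORES]; /* index seen at enable */

static uint32_t
get_freq(unsigned int lcore)
{
	return cur_freq_idx[lcore];
}

static void
set_freq(unsigned int lcore, uint32_t idx)
{
	cur_freq_idx[lcore] = idx;
}

/* Remember whatever the lcore was running at before we touch it. */
static void
scale_enable(unsigned int lcore)
{
	saved_freq_idx[lcore] = get_freq(lcore);
}

/* Restore the remembered index rather than assuming it was max. */
static void
scale_disable(unsigned int lcore)
{
	set_freq(lcore, saved_freq_idx[lcore]);
}
```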

> +		rte_eth_remove_rx_callback(port_id, 0,
> +					   dev->cur_pwr_cb);
> +		if (rte_power_exit(lcore_id))
> +			return -EINVAL;
> +		break;
> +	}
> +
> +	dev->pwr_mgmt_state =3D RTE_ETH_DEV_POWER_MGMT_DISABLED;
> +	dev->cur_pwr_cb =3D NULL;
> +	dev->cb_mode =3D 0;
> +
> +	return 0;
> +}
> diff --git a/lib/librte_power/rte_power_version.map b/lib/librte_power/rt=
e_power_version.map
> index 00ee5753e2..ade83cfd4f 100644
> --- a/lib/librte_power/rte_power_version.map
> +++ b/lib/librte_power/rte_power_version.map
> @@ -34,4 +34,8 @@ EXPERIMENTAL {
>  	rte_power_guest_channel_receive_msg;
>  	rte_power_poll_stat_fetch;
>  	rte_power_poll_stat_update;
> +	# added in 20.08
> +	rte_power_pmd_mgmt_disable;
> +	rte_power_pmd_mgmt_enable;
> +
>  };
> --
> 2.17.1