From: "Ananyev, Konstantin"
To: "Burakov, Anatoly", "dev@dpdk.org"
CC: "Ma, Liang J", "Hunt, David", Ray Kinsella, Neil Horman, "thomas@monjalon.net", "McDaniel, Timothy", "Richardson, Bruce", "Macnamara, Chris"
Date: Wed, 13 Jan 2021 12:58:56 +0000
References: <7618b351d6f3be9eba0f6cc862440fb126f934d2.1610473000.git.anatoly.burakov@intel.com>
In-Reply-To: <7618b351d6f3be9eba0f6cc862440fb126f934d2.1610473000.git.anatoly.burakov@intel.com>
Subject: Re: [dpdk-dev] [PATCH v16 07/11] power: add PMD power management API and callback
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Tuesday, January 12, 2021 5:37 PM
> To: dev@dpdk.org
> Cc: Ma, Liang J; Hunt, David; Ray Kinsella; Neil Horman; thomas@monjalon.net; Ananyev, Konstantin; McDaniel, Timothy; Richardson, Bruce; Macnamara, Chris
> Subject: [PATCH v16 07/11] power: add PMD power management API and callback
>
> From: Liang Ma
>
> Add a simple on/off switch that will enable saving power when no
> packets are arriving. It is based on counting the number of empty
> polls and, when the number reaches a certain threshold, entering an
> architecture-defined optimized power state that will either wait
> until a TSC timestamp expires, or when packets arrive.
>
> This API mandates a core-to-single-queue mapping (that is, multiple
> queues per device are supported, but they have to be polled on different
> cores).
>
> This design is using PMD RX callbacks.
>
> 1. UMWAIT/UMONITOR:
>
>    When a certain threshold of empty polls is reached, the core will go
>    into a power optimized sleep while waiting on an address of next RX
>    descriptor to be written to.
>
> 2. TPAUSE/Pause instruction
>
>    This method uses the pause (or TPAUSE, if available) instruction to
>    avoid busy polling.
>
> 3. Frequency scaling
>    Reuse existing DPDK power library to scale up/down core frequency
>    depending on traffic volume.
>
> Signed-off-by: Liang Ma
> Signed-off-by: Anatoly Burakov
> ---
>
> Notes:
>     v15:
>     - Fix check in UMWAIT callback
>
>     v13:
>     - Rework the synchronization mechanism to not require locking
>     - Add more parameter checking
>     - Rework n_rx_queues access to not go through internal PMD structures and use
>       public API instead
>
>  doc/guides/prog_guide/power_man.rst    |  44 +++
>  doc/guides/rel_notes/release_21_02.rst |  10 +
>  lib/librte_power/meson.build           |   5 +-
>  lib/librte_power/rte_power_pmd_mgmt.c  | 359 +++++++++++++++++++++++++
>  lib/librte_power/rte_power_pmd_mgmt.h  |  90 +++++++
>  lib/librte_power/version.map           |   5 +
>  6 files changed, 511 insertions(+), 2 deletions(-)
>  create mode 100644 lib/librte_power/rte_power_pmd_mgmt.c
>  create mode 100644 lib/librte_power/rte_power_pmd_mgmt.h
>

...

> +
> +static uint16_t
> +clb_umwait(uint16_t port_id, uint16_t qidx, struct rte_mbuf **pkts __rte_unused,
> +		uint16_t nb_rx, uint16_t max_pkts __rte_unused,
> +		void *addr __rte_unused)
> +{
> +
> +	struct pmd_queue_cfg *q_conf;
> +
> +	q_conf = &port_cfg[port_id][qidx];
> +
> +	if (unlikely(nb_rx == 0)) {
> +		q_conf->empty_poll_stats++;
> +		if (unlikely(q_conf->empty_poll_stats > EMPTYPOLL_MAX)) {
> +			struct rte_power_monitor_cond pmc;
> +			uint16_t ret;
> +
> +			/*
> +			 * we might get a cancellation request while being
> +			 * inside the callback, in which case the wakeup
> +			 * wouldn't work because it would've arrived too early.
> +			 *
> +			 * to get around this, we notify the other thread that
> +			 * we're sleeping, so that it can spin until we're done.
> +			 * unsolicited wakeups are perfectly safe.
> +			 */
> +			q_conf->umwait_in_progress = true;

This write and subsequent read can be reordered by the cpu.
I think you need rte_atomic_thread_fence(__ATOMIC_SEQ_CST) here and
in disable() code-path below.

> +
> +			/* check if we need to cancel sleep */
> +			if (q_conf->pwr_mgmt_state == PMD_MGMT_ENABLED) {
> +				/* use monitoring condition to sleep */
> +				ret = rte_eth_get_monitor_addr(port_id, qidx,
> +						&pmc);
> +				if (ret == 0)
> +					rte_power_monitor(&pmc, -1ULL);
> +			}
> +			q_conf->umwait_in_progress = false;
> +		}
> +	} else
> +		q_conf->empty_poll_stats = 0;
> +
> +	return nb_rx;
> +}
> +

...

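To make the ordering concern above concrete, here is a minimal sketch of the
store-then-load handshake with a full fence on each side. It uses plain C11
atomics rather than the DPDK wrappers, and the struct, enum and function names
are hypothetical stand-ins, not the code in the patch. The idea is only that
each side publishes its own flag, issues a sequentially consistent fence, and
then reads the other side's flag, so at least one of the two threads is
guaranteed to observe the other's write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-queue state in the patch. */
enum mgmt_state { MGMT_DISABLED, MGMT_BUSY, MGMT_ENABLED };

struct queue_state {
	_Atomic int pwr_mgmt_state;
	atomic_bool umwait_in_progress;
};

/*
 * RX callback side: announce that we may sleep, fence, then check
 * whether a cancellation was requested.
 * Returns true if the caller would go on to sleep.
 */
static bool
callback_may_sleep(struct queue_state *q)
{
	atomic_store_explicit(&q->umwait_in_progress, true,
			memory_order_relaxed);
	/* full fence: the store above cannot pass the load below */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&q->pwr_mgmt_state,
			memory_order_relaxed) == MGMT_ENABLED;
}

/* RX callback side: sleep attempt is over. */
static void
callback_done(struct queue_state *q)
{
	atomic_store_explicit(&q->umwait_in_progress, false,
			memory_order_release);
}

/*
 * Control side: request cancellation, fence, then check whether the
 * other thread might be sleeping.
 * Returns true if a wakeup has to be sent.
 */
static bool
disable_needs_wakeup(struct queue_state *q)
{
	atomic_store_explicit(&q->pwr_mgmt_state, MGMT_BUSY,
			memory_order_relaxed);
	/* full fence: the store above cannot pass the load below */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&q->umwait_in_progress,
			memory_order_relaxed);
}
```

Without both fences, the callback could read ENABLED while the disabling
thread reads umwait_in_progress == false, and the wakeup would be missed.
In DPDK the fence would presumably be spelled
rte_atomic_thread_fence(__ATOMIC_SEQ_CST), as suggested above.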
> +
> +int
> +rte_power_pmd_mgmt_queue_disable(unsigned int lcore_id,
> +		uint16_t port_id, uint16_t queue_id)
> +{
> +	struct pmd_queue_cfg *queue_cfg;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> +
> +	if (lcore_id >= RTE_MAX_LCORE || queue_id >= RTE_MAX_QUEUES_PER_PORT)
> +		return -EINVAL;
> +
> +	/* no need to check queue id as wrong queue id would not be enabled */
> +	queue_cfg = &port_cfg[port_id][queue_id];
> +
> +	if (queue_cfg->pwr_mgmt_state != PMD_MGMT_ENABLED)
> +		return -EINVAL;
> +
> +	/* let the callback know we're shutting down */
> +	queue_cfg->pwr_mgmt_state = PMD_MGMT_BUSY;

Same as above - write to pwr_mgmt_state and read from umwait_in_progress
could be reordered by cpu.
Need to insert rte_atomic_thread_fence(__ATOMIC_SEQ_CST) between them.

BTW, out of curiosity - why do you need this intermediate
state (PMD_MGMT_BUSY) at all?
Why not directly:
queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
?

> +
> +	switch (queue_cfg->cb_mode) {
> +	case RTE_POWER_MGMT_TYPE_MONITOR:
> +	{
> +		bool exit = false;
> +		do {
> +			/*
> +			 * we may request cancellation while the other thread
> +			 * has just entered the callback but hasn't started
> +			 * sleeping yet, so keep waking it up until we know it's
> +			 * done sleeping.
> +			 */
> +			if (queue_cfg->umwait_in_progress)
> +				rte_power_monitor_wakeup(lcore_id);
> +			else
> +				exit = true;
> +		} while (!exit);
> +	}
> +	/* fall-through */
> +	case RTE_POWER_MGMT_TYPE_PAUSE:
> +		rte_eth_remove_rx_callback(port_id, queue_id,
> +				queue_cfg->cur_cb);
> +		break;
> +	case RTE_POWER_MGMT_TYPE_SCALE:
> +		rte_power_freq_max(lcore_id);
> +		rte_eth_remove_rx_callback(port_id, queue_id,
> +				queue_cfg->cur_cb);
> +		rte_power_exit(lcore_id);
> +		break;
> +	}
> +	/*
> +	 * we don't free the RX callback here because it is unsafe to do so
> +	 * unless we know for a fact that all data plane threads have stopped.
> +	 */
> +	queue_cfg->cur_cb = NULL;
> +	queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
> +
> +	return 0;
> +}
> diff --git a/lib/librte_power/rte_power_pmd_mgmt.h b/lib/librte_power/rte_power_pmd_mgmt.h
> new file mode 100644
> index 0000000000..0bfbc6ba69
> --- /dev/null
> +++ b/lib/librte_power/rte_power_pmd_mgmt.h
> @@ -0,0 +1,90 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2010-2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_POWER_PMD_MGMT_H
> +#define _RTE_POWER_PMD_MGMT_H
> +
> +/**
> + * @file
> + * RTE PMD Power Management
> + */
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * PMD Power Management Type
> + */
> +enum rte_power_pmd_mgmt_type {
> +	/** Use power-optimized monitoring to wait for incoming traffic */
> +	RTE_POWER_MGMT_TYPE_MONITOR = 1,
> +	/** Use power-optimized sleep to avoid busy polling */
> +	RTE_POWER_MGMT_TYPE_PAUSE,
> +	/** Use frequency scaling when traffic is low */
> +	RTE_POWER_MGMT_TYPE_SCALE,
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Enable power management on a specified RX queue and lcore.
> + *
> + * @note This function is not thread-safe.
> + *
> + * @param lcore_id
> + *   lcore_id.
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The queue identifier of the Ethernet device.
> + * @param mode
> + *   The power management callback function type.
> +
> + * @return
> + *   0 on success
> + *   <0 on error
> + */
> +__rte_experimental
> +int
> +rte_power_pmd_mgmt_queue_enable(unsigned int lcore_id,
> +		uint16_t port_id, uint16_t queue_id,
> +		enum rte_power_pmd_mgmt_type mode);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Disable power management on a specified RX queue and lcore.
> + *
> + * @note This function is not thread-safe.
> + *
> + * @param lcore_id
> + *   lcore_id.
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The queue identifier of the Ethernet device.
> + * @return
> + *   0 on success
> + *   <0 on error
> + */
> +__rte_experimental
> +int
> +rte_power_pmd_mgmt_queue_disable(unsigned int lcore_id,
> +		uint16_t port_id, uint16_t queue_id);
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif
> diff --git a/lib/librte_power/version.map b/lib/librte_power/version.map
> index 69ca9af616..61996b4d11 100644
> --- a/lib/librte_power/version.map
> +++ b/lib/librte_power/version.map
> @@ -34,4 +34,9 @@ EXPERIMENTAL {
>  	rte_power_guest_channel_receive_msg;
>  	rte_power_poll_stat_fetch;
>  	rte_power_poll_stat_update;
> +
> +	# added in 21.02
> +	rte_power_pmd_mgmt_queue_enable;
> +	rte_power_pmd_mgmt_queue_disable;
> +
>  };
> --
> 2.25.1