From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (xvm-189-124.dc0.ghst.net [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 5DDE8A052A;
	Fri,  8 Jan 2021 18:42:29 +0100 (CET)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 0D860140EDF;
	Fri,  8 Jan 2021 18:42:19 +0100 (CET)
Received: from mga04.intel.com (mga04.intel.com [192.55.52.120])
 by mails.dpdk.org (Postfix) with ESMTP id 168DD140E92
 for <dev@dpdk.org>; Fri,  8 Jan 2021 18:42:17 +0100 (CET)
IronPort-SDR: TbbKq7ia8VThUqFfKETd/S2M3ZS3Z4XS2LNaBgF+yj85mFpfRQ6wfy71EIXxeBTkodWAN1jdkL
 sv7U60trnvvg==
X-IronPort-AV: E=McAfee;i="6000,8403,9858"; a="175049923"
X-IronPort-AV: E=Sophos;i="5.79,332,1602572400"; d="scan'208";a="175049923"
Received: from fmsmga004.fm.intel.com ([10.253.24.48])
 by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 08 Jan 2021 09:42:17 -0800
IronPort-SDR: fhXYDpgWWYEDyCZe8HvHl61cey1laQQKq3o9cBilKRJrdUJJdUNogKojHinGTynsnjwK1wkLlO
 Dm5PxejVxb9A==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.79,332,1602572400"; d="scan'208";a="396367987"
Received: from silpixa00399498.ir.intel.com (HELO
 silpixa00399498.ger.corp.intel.com) ([10.237.222.179])
 by fmsmga004.fm.intel.com with ESMTP; 08 Jan 2021 09:42:15 -0800
From: Anatoly Burakov <anatoly.burakov@intel.com>
To: dev@dpdk.org
Cc: thomas@monjalon.net, konstantin.ananyev@intel.com, gage.eads@intel.com,
 timothy.mcdaniel@intel.com, david.hunt@intel.com,
 bruce.richardson@intel.com, chris.macnamara@intel.com
Date: Fri,  8 Jan 2021 17:42:03 +0000
Message-Id: <cover.1610127361.git.anatoly.burakov@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <cover.1608213657.git.anatoly.burakov@intel.com>
References: <cover.1608213657.git.anatoly.burakov@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [dpdk-dev] [PATCH v13 00/11] Add PMD power management
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

This patchset proposes a simple API for Ethernet drivers to cause the
CPU to enter a power-optimized state while waiting for packets to
arrive. There are multiple proposed mechanisms to achieve said power
savings: simple frequency scaling, idle loop, and monitoring the Rx
queue for incoming packets. The last of these is achieved through
cooperation with the NIC driver, which tells us the address of the
wake-up event so that we can wait for writes to that address.

On IA, this is achieved through using UMONITOR/UMWAIT instructions. They 
are used in their raw opcode form because there is no widespread 
compiler support for them yet. Still, the API is made generic enough to
hopefully support other architectures, if they happen to implement 
similar instructions.

To achieve power savings, a very simple mechanism is used: we count
empty polls, and if a certain threshold is reached, we employ one of
the suggested power management schemes automatically, from within an
Rx callback inside the PMD. Once there's traffic again, the empty poll
counter is reset.
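
The empty-poll mechanism described above can be sketched roughly as
follows. This is only an illustrative model, not the code from the
patches: the struct, the function names and the threshold value are all
hypothetical, and enter_power_state() stands in for whichever scheme
(monitor, pause or frequency scaling) is selected.

```c
#include <stdint.h>

/* Hypothetical threshold; the value used by the patchset is not stated here. */
#define EMPTY_POLL_THRESHOLD 4

/* Hypothetical per-queue bookkeeping for this sketch. */
struct queue_state {
	uint32_t empty_polls; /* consecutive polls that returned no packets */
	uint32_t sleeps;      /* times a power management scheme was invoked */
};

/* Stand-in for monitor/pause/frequency scaling; here it only counts calls. */
static void
enter_power_state(struct queue_state *q)
{
	q->sleeps++;
}

/*
 * Modelled on an Rx callback: invoked after every Rx burst with the
 * number of packets received. Counts empty polls, invokes the power
 * scheme once the threshold is exceeded, and resets on traffic.
 */
static uint16_t
rx_callback(struct queue_state *q, uint16_t nb_rx)
{
	if (nb_rx == 0) {
		q->empty_polls++;
		if (q->empty_polls > EMPTY_POLL_THRESHOLD)
			enter_power_state(q);
	} else {
		/* traffic is back: reset the empty poll counter */
		q->empty_polls = 0;
	}
	return nb_rx;
}
```

In this model, the power scheme keeps being invoked on every further
empty poll past the threshold, and a single non-empty burst brings the
queue back to normal polling.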

This patchset also introduces a few changes into existing power
management-related intrinsics, namely to provide a native way of waking
up a sleeping core without the application being responsible for it, as
well as general robustness improvements. There's quite a bit of locking
going on, but these locks are per-thread and very little (if any)
contention is expected, so the performance impact shouldn't be that bad
(and in any case the locking happens when we're about to sleep anyway).

Why are we putting it into ethdev as opposed to leaving this up to the 
application? Our customers specifically requested a way to do it with
minimal changes to the application code. The current approach allows the
user to just flip a switch and automatically have power savings.

Things of note:

- Only 1:1 core to queue mapping is supported, meaning that each lcore
  must handle Rx on at most one queue
- Three policy types are supported: monitor, pause, and frequency scaling
- Power management is enabled per-queue
- The API doesn't extend to other device types

v13:
- Rework the librte_power code to require less locking and handle invalid
  parameters better
- Fix numerous rebase errors present in v12

v12:
- Rebase on top of 21.02
- Rework of power intrinsics code

Anatoly Burakov (5):
  eal: uninline power intrinsics
  eal: avoid invalid API usage in power intrinsics
  eal: change API of power intrinsics
  eal: remove sync version of power monitor
  eal: add monitor wakeup function

Liang Ma (6):
  ethdev: add simple power management API
  power: add PMD power management API and callback
  net/ixgbe: implement power management API
  net/i40e: implement power management API
  net/ice: implement power management API
  examples/l3fwd-power: enable PMD power mgmt

 doc/guides/prog_guide/power_man.rst           |  44 +++
 doc/guides/rel_notes/release_21_02.rst        |  15 +
 .../sample_app_ug/l3_forward_power_man.rst    |  35 ++
 drivers/event/dlb/dlb.c                       |  10 +-
 drivers/event/dlb2/dlb2.c                     |  10 +-
 drivers/net/i40e/i40e_ethdev.c                |   1 +
 drivers/net/i40e/i40e_rxtx.c                  |  25 ++
 drivers/net/i40e/i40e_rxtx.h                  |   1 +
 drivers/net/ice/ice_ethdev.c                  |   1 +
 drivers/net/ice/ice_rxtx.c                    |  26 ++
 drivers/net/ice/ice_rxtx.h                    |   1 +
 drivers/net/ixgbe/ixgbe_ethdev.c              |   1 +
 drivers/net/ixgbe/ixgbe_rxtx.c                |  25 ++
 drivers/net/ixgbe/ixgbe_rxtx.h                |   1 +
 examples/l3fwd-power/main.c                   |  89 ++++-
 .../arm/include/rte_power_intrinsics.h        |  39 +-
 .../include/generic/rte_power_intrinsics.h    |  78 ++--
 .../ppc/include/rte_power_intrinsics.h        |  39 +-
 lib/librte_eal/version.map                    |   5 +
 .../x86/include/rte_power_intrinsics.h        | 115 ------
 lib/librte_eal/x86/meson.build                |   1 +
 lib/librte_eal/x86/rte_power_intrinsics.c     | 184 +++++++++
 lib/librte_ethdev/rte_ethdev.c                |  28 ++
 lib/librte_ethdev/rte_ethdev.h                |  25 ++
 lib/librte_ethdev/rte_ethdev_driver.h         |  22 ++
 lib/librte_ethdev/version.map                 |   3 +
 lib/librte_power/meson.build                  |   5 +-
 lib/librte_power/rte_power_pmd_mgmt.c         | 360 ++++++++++++++++++
 lib/librte_power/rte_power_pmd_mgmt.h         |  90 +++++
 lib/librte_power/version.map                  |   5 +
 30 files changed, 1055 insertions(+), 229 deletions(-)
 create mode 100644 lib/librte_eal/x86/rte_power_intrinsics.c
 create mode 100644 lib/librte_power/rte_power_pmd_mgmt.c
 create mode 100644 lib/librte_power/rte_power_pmd_mgmt.h

-- 
2.25.1