From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ajit Khaparde
Date: Tue, 27 Oct 2020 13:53:49 -0700
Subject: Re: [dpdk-dev] [PATCH v10 0/9] Add PMD power mgmt
To: Liang Ma
Cc: dpdk-dev, Ruifeng Wang, Haiyue Wang, Bruce Richardson,
 "Ananyev, Konstantin", david.hunt@intel.com, Jerin Jacob, Neil Horman,
 Thomas Monjalon, timothy.mcdaniel@intel.com, gage.eads@intel.com,
 Marcin Wojtas, Guy Tzalik, Harman Kalra, John Daley, "Wei Hu (Xavier",
 Ziyang Xuan, Matan Azrad, Yong Wang
References: <1603494392-7181-1-git-send-email-liang.j.ma@intel.com>
 <1603810749-22285-1-git-send-email-liang.j.ma@intel.com>
In-Reply-To: <1603810749-22285-1-git-send-email-liang.j.ma@intel.com>
List-Id: DPDK patches and discussions
Content-Type: text/plain; charset="UTF-8"

On Tue, Oct 27, 2020 at 7:59 AM Liang Ma wrote:
>
> This patchset proposes a simple API for Ethernet drivers
> to cause the CPU to enter a power-optimized state while
> waiting for packets to arrive, along with a set of
> generic intrinsics that facilitate that. This is achieved
> through cooperation with the NIC driver that will allow
> us to know address of wake up event, and wait for writes
> on it.
Is the wake event the same as ring status or interrupt status register?
So in a way the PMD is passing the address of the next ring descriptor?
So that instead of the PMD polling it, the application peeks at it and
when ready asks the PMD to actually process the packet(s)?

>
> On IA, this is achieved through using UMONITOR/UMWAIT
> instructions. They are used in their raw opcode form
> because there is no widespread compiler support for
> them yet. Still, the API is made generic enough to
> hopefully support other architectures, if they happen
> to implement similar instructions.
>
> To achieve power savings, there is a very simple mechanism
> used: we're counting empty polls, and if a certain threshold
> is reached, we get the address of next RX ring descriptor
> from the NIC driver, arm the monitoring hardware, and
> enter a power-optimized state. We will then wake up when
> either a timeout happens, or a write happens (or generally
> whenever CPU feels like waking up - this is platform-specific),
> and proceed as normal. The empty poll counter is
> reset whenever we actually get packets, so we only go to
> sleep when we know nothing is going on. The mechanism is
> generic which can be used for any write back descriptor.
>
> Why are we putting it into ethdev as opposed to leaving
> this up to the application? Our customers specifically
> requested a way to do it with minimal changes to the
> application code. The current approach allows to just
> flip a switch and automatically have power savings.
The application still has to know the address of the wake-up event, right?
And then it will need the logic to count empty polls and the threshold?
Will this be done by the application or by something else?

>
> - Only 1:1 core to queue mapping is supported,
>   meaning that each lcore must at most handle RX on a
>   single queue
> - Support 3 types of policies: UMWAIT/PAUSE/Frequency_Scale
> - Power management is enabled per-queue
> - The API doesn't extend to other device types
>
> Liang Ma (9):
>   eal: add new x86 cpuid support for WAITPKG
>   eal: add power management intrinsics
>   eal: add intrinsics support check infrastructure
>   ethdev: add simple power management API
>   power: add PMD power management API and callback
>   net/ixgbe: implement power management API
>   net/i40e: implement power management API
>   net/ice: implement power management API
>   examples/l3fwd-power: enable PMD power mgmt
>
>  doc/guides/prog_guide/power_man.rst           |  48 +++
>  doc/guides/rel_notes/release_20_11.rst        |  15 +
>  .../sample_app_ug/l3_forward_power_man.rst    |  13 +
>  drivers/net/i40e/i40e_ethdev.c                |   1 +
>  drivers/net/i40e/i40e_rxtx.c                  |  26 ++
>  drivers/net/i40e/i40e_rxtx.h                  |   2 +
>  drivers/net/ice/ice_ethdev.c                  |   1 +
>  drivers/net/ice/ice_rxtx.c                    |  26 ++
>  drivers/net/ice/ice_rxtx.h                    |   2 +
>  drivers/net/ixgbe/ixgbe_ethdev.c              |   1 +
>  drivers/net/ixgbe/ixgbe_rxtx.c                |  25 ++
>  drivers/net/ixgbe/ixgbe_rxtx.h                |   2 +
>  examples/l3fwd-power/main.c                   |  46 ++-
>  lib/librte_eal/arm/include/meson.build        |   1 +
>  .../arm/include/rte_power_intrinsics.h        |  60 ++++
>  lib/librte_eal/arm/rte_cpuflags.c             |   6 +
>  lib/librte_eal/include/generic/rte_cpuflags.h |  26 ++
>  .../include/generic/rte_power_intrinsics.h    | 123 +++++++
>  lib/librte_eal/include/meson.build            |   1 +
>  lib/librte_eal/ppc/include/meson.build        |   1 +
>  .../ppc/include/rte_power_intrinsics.h        |  60 ++++
>  lib/librte_eal/ppc/rte_cpuflags.c             |   7 +
>  lib/librte_eal/version.map                    |   1 +
>  lib/librte_eal/x86/include/meson.build        |   1 +
>  lib/librte_eal/x86/include/rte_cpuflags.h     |   1 +
>  .../x86/include/rte_power_intrinsics.h        | 135 ++++++++
>  lib/librte_eal/x86/rte_cpuflags.c             |  14 +
>  lib/librte_ethdev/rte_ethdev.c                |  23 ++
>  lib/librte_ethdev/rte_ethdev.h                |  33 ++
>  lib/librte_ethdev/rte_ethdev_driver.h         |  28 ++
>  lib/librte_ethdev/version.map                 |   1 +
>  lib/librte_power/meson.build                  |   5 +-
>  lib/librte_power/rte_power_pmd_mgmt.c         | 320 ++++++++++++++++++
>  lib/librte_power/rte_power_pmd_mgmt.h         |  92 +++++
>  lib/librte_power/version.map                  |   4 +
>  35 files changed, 1148 insertions(+), 3 deletions(-)
>  create mode 100644 lib/librte_eal/arm/include/rte_power_intrinsics.h
>  create mode 100644 lib/librte_eal/include/generic/rte_power_intrinsics.h
>  create mode 100644 lib/librte_eal/ppc/include/rte_power_intrinsics.h
>  create mode 100644 lib/librte_eal/x86/include/rte_power_intrinsics.h
>  create mode 100644 lib/librte_power/rte_power_pmd_mgmt.c
>  create mode 100644 lib/librte_power/rte_power_pmd_mgmt.h
>
> --
> 2.17.1
>
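
For reference, the empty-poll mechanism described in the cover letter can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the patchset's code: the names pmd_pm_state, pmd_pm_init,
pmd_pm_should_sleep and wait_for_write are hypothetical, and the bounded spin
loop is a portable stand-in for UMONITOR/UMWAIT, which a real implementation
would use to wait on the descriptor address supplied by the NIC driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue state; not the structure used in the patchset. */
struct pmd_pm_state {
	uint32_t empty_polls; /* consecutive RX polls that returned no packets */
	uint32_t threshold;   /* empty polls tolerated before going to sleep */
};

void pmd_pm_init(struct pmd_pm_state *st, uint32_t threshold)
{
	assert(threshold > 0);
	st->empty_polls = 0;
	st->threshold = threshold;
}

/*
 * Call after every RX burst with the number of packets received.
 * Returns true once the empty-poll threshold is reached, i.e. when the
 * caller should arm the monitor and enter the power-optimized state.
 * Any non-empty burst resets the counter.
 */
bool pmd_pm_should_sleep(struct pmd_pm_state *st, uint16_t nb_rx)
{
	if (nb_rx > 0) {
		st->empty_polls = 0;
		return false;
	}
	if (st->empty_polls < st->threshold)
		st->empty_polls++;
	return st->empty_polls >= st->threshold;
}

/*
 * Portable stand-in for monitor/wait (assumption: real code would use
 * UMONITOR/UMWAIT here). Spins until *addr no longer equals `expected`
 * or the spin budget runs out. Returns true if a write was observed,
 * false on "timeout".
 */
bool wait_for_write(const volatile uint64_t *addr, uint64_t expected,
		    uint64_t spin_budget)
{
	while (spin_budget-- > 0) {
		if (*addr != expected)
			return true;
	}
	return false;
}
```

In this sketch the counting lives in one place that is invoked once per RX
burst on a queue, so whichever layer owns that call (application or library
callback) also owns the threshold; the caller only needs the descriptor
address at the moment pmd_pm_should_sleep() returns true.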