From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 4A97DA0A02; Thu, 14 Jan 2021 11:25:17 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 35DA11410C7; Thu, 14 Jan 2021 11:25:17 +0100 (CET) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by mails.dpdk.org (Postfix) with ESMTP id 2CF0D1410B4 for ; Thu, 14 Jan 2021 11:25:15 +0100 (CET) IronPort-SDR: IBSWC+xjvBjLFNfTP6lJaNli13W19WJUQMP+yyz0C+kBpeXIYHOmhg7wpu29K0MOCG2OsurRxU tUEnfF1UTZmQ== X-IronPort-AV: E=McAfee;i="6000,8403,9863"; a="174834978" X-IronPort-AV: E=Sophos;i="5.79,347,1602572400"; d="scan'208";a="174834978" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jan 2021 02:25:14 -0800 IronPort-SDR: yCtNabAtpB+P1REyuL9KMUyyEiN3Denp6woyjhDVX2MM0hX3Q9svMTCPhsQ+OKwUl86R71+zrF l4aPQtbK5Kzw== X-IronPort-AV: E=Sophos;i="5.79,347,1602572400"; d="scan'208";a="465209539" Received: from spuligad-mobl2.ger.corp.intel.com (HELO [10.213.255.126]) ([10.213.255.126]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jan 2021 02:25:12 -0800 To: David Marchand , Ray Kinsella Cc: dev , Thomas Monjalon , "Ananyev, Konstantin" , Timothy McDaniel , David Hunt , Bruce Richardson , chris.macnamara@intel.com, Kevin Traynor References: From: "Burakov, Anatoly" Message-ID: <9c3e5dd0-d63f-2cf1-d4ed-7fe75b1bc49e@intel.com> Date: Thu, 14 Jan 2021 10:25:11 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [PATCH v16 00/11] Add PMD power management X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 14-Jan-21 9:36 AM, David Marchand wrote: > On Tue, Jan 12, 2021 at 6:37 PM Anatoly Burakov > wrote: >> >> This patchset proposes a simple API for Ethernet drivers to cause the >> CPU to enter a power-optimized state while waiting for packets to >> arrive. There are multiple proposed mechanisms to achieve said power >> savings: simple frequency scaling, idle loop, and monitoring the Rx >> queue for incoming packages. The latter is achieved through cooperation >> with the NIC driver that will allow us to know address of wake up event, >> and wait for writes on that address. >> >> On IA, this is achieved through using UMONITOR/UMWAIT instructions. They >> are used in their raw opcode form because there is no widespread >> compiler support for them yet. Still, the API is made generic enough to >> hopefully support other architectures, if they happen to implement >> similar instructions. >> >> To achieve power savings, there is a very simple mechanism used: we're >> counting empty polls, and if a certain threshold is reached, we employ >> one of the suggested power management schemes automatically, from within >> a Rx callback inside the PMD. Once there's traffic again, the empty poll >> counter is reset. >> >> This patchset also introduces a few changes into existing power >> management-related intrinsics, namely to provide a native way of waking >> up a sleeping core without application being responsible for it, as well >> as general robustness improvements. There's quite a bit of locking going >> on, but these locks are per-thread and very little (if any) contention >> is expected, so the performance impact shouldn't be that bad (and in any >> case the locking happens when we're about to sleep anyway). >> >> Why are we putting it into ethdev as opposed to leaving this up to the >> application? Our customers specifically requested a way to do it with >> minimal changes to the application code. The current approach allows to >> just flip a switch and automatically have power savings. >> >> Things of note: >> >> - Only 1:1 core to queue mapping is supported, meaning that each lcore >> must at most handle RX on a single queue > > If we want to save power, it is likely we would poll more rxqs on a thread. We are investigating possibilities to make that happen, but for this patchset, this is the limitation. > > >> - Support 3 type policies. Monitor/Pause/Frequency Scaling >> - Power management is enabled per-queue >> - The API doesn't extend to other device types >> >> v16: >> - Implemented Konstantin's suggestions and comments >> - Added return values to the API > > - This revision breaks SPDK build (reported by UNH): > http://mails.dpdk.org/archives/test-report/2021-January/174069.html > > > - Build is broken for ARM and PPC at patch: > 86491d5bd4 - (HEAD) eal: add monitor wakeup function (25 minutes ago) > > > Only pasting the ARM failure: > ninja: Entering directory `/home/dmarchan/builds/build-arm64-host-clang' > [1/297] Compiling C object > 'lib/76b5a35@@rte_eal@sta/librte_eal_arm_rte_power_intrinsics.c.o'. > FAILED: lib/76b5a35@@rte_eal@sta/librte_eal_arm_rte_power_intrinsics.c.o > aarch64-linux-gnu-gcc -Ilib/76b5a35@@rte_eal@sta -Ilib > -I../../dpdk/lib -I. -I../../dpdk/ -Iconfig -I../../dpdk/config > -Ilib/librte_eal/include -I../../dpdk/lib/librte_eal/include > -Ilib/librte_eal/linux/include > -I../../dpdk/lib/librte_eal/linux/include -Ilib/librte_eal/arm/include > -I../../dpdk/lib/librte_eal/arm/include -Ilib/librte_eal/common > -I../../dpdk/lib/librte_eal/common -Ilib/librte_eal > -I../../dpdk/lib/librte_eal -Ilib/librte_kvargs > -I../../dpdk/lib/librte_kvargs > -Ilib/librte_telemetry/../librte_metrics > -I../../dpdk/lib/librte_telemetry/../librte_metrics > -Ilib/librte_telemetry -I../../dpdk/lib/librte_telemetry > -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall > -Winvalid-pch -Werror -O2 -g -include rte_config.h -Wextra -Wcast-qual > -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security > -Wmissing-declarations -Wmissing-prototypes -Wnested-externs > -Wold-style-definition -Wpointer-arith -Wsign-compare > -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-packed-not-aligned > -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC -march=armv8-a+crc > -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation > '-DABI_VERSION="21.1"' -DRTE_LIBEAL_USE_GETENTROPY -MD -MQ > 'lib/76b5a35@@rte_eal@sta/librte_eal_arm_rte_power_intrinsics.c.o' -MF > 'lib/76b5a35@@rte_eal@sta/librte_eal_arm_rte_power_intrinsics.c.o.d' > -o 'lib/76b5a35@@rte_eal@sta/librte_eal_arm_rte_power_intrinsics.c.o' > -c ../../dpdk/lib/librte_eal/arm/rte_power_intrinsics.c > ../../dpdk/lib/librte_eal/arm/rte_power_intrinsics.c:35:1: error: > conflicting types for ‘rte_power_monitor_wakeup’ > rte_power_monitor_wakeup(const unsigned int lcore_id) > ^~~~~~~~~~~~~~~~~~~~~~~~ > In file included from > ../../dpdk/lib/librte_eal/arm/include/rte_power_intrinsics.h:14, > from ../../dpdk/lib/librte_eal/arm/rte_power_intrinsics.c:5: > ../../dpdk/lib/librte_eal/include/generic/rte_power_intrinsics.h:79:5: > note: previous declaration of ‘rte_power_monitor_wakeup’ was here > int rte_power_monitor_wakeup(const unsigned int lcore_id); > ^~~~~~~~~~~~~~~~~~~~~~~~ > ninja: build stopped: subcommand failed. Woops, wrong return value in the .c files. Will fix! > > > > - The ABI check is still not happy as I reported earlier. > Reproduced on v16 (GHA had a hiccup on this revision, but previous > ones had the failure too): > > 1 Changed variable: > > [C] 'rte_eth_dev rte_eth_devices[]' was changed at rte_ethdev_core.h:196:1: > type of variable changed: > array element type 'struct rte_eth_dev' changed: > type size hasn't changed > 1 data member change: > type of 'const eth_dev_ops* rte_eth_dev::dev_ops' changed: > in pointed to type 'const eth_dev_ops': > in unqualified underlying type 'struct eth_dev_ops' at > rte_ethdev_driver.h:789:1: > type size changed from 6208 to 6272 (in bits) > 1 data member insertion: > 'eth_get_monitor_addr_t > eth_dev_ops::get_monitor_addr', at offset 6208 (in bits) at > rte_ethdev_driver.h:940:1 > no data member changes (94 filtered); > type size hasn't changed > > Error: ABI issue reported for 'abidiff --suppr > /home/dmarchan/dpdk/devtools/../devtools/libabigail.abignore > --no-added-syms --headers-dir1 > /home/dmarchan/abi/v20.11/build-gcc-static/usr/local/include > --headers-dir2 /home/dmarchan/builds/build-gcc-static/install/usr/local/include > /home/dmarchan/abi/v20.11/build-gcc-static/dump/librte_ethdev.dump > /home/dmarchan/builds/build-gcc-static/install/dump/librte_ethdev.dump' > > ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged > this as a potential issue). > > One solution is to add an exception on the eth_dev_ops structure. > > --- a/devtools/libabigail.abignore > +++ b/devtools/libabigail.abignore > @@ -7,3 +7,7 @@ > symbol_version = INTERNAL > [suppress_variable] > symbol_version = INTERNAL > + > +; Explicit ignore for driver-only ABI > +[suppress_type] > + name = eth_dev_ops > > Right, OK. I didn't realize an "exception" is something you actually do in code, not an ad-hoc community process type thing :) I'll add this in the next revision. -- Thanks, Anatoly