From: David Hunt
To: dev@dpdk.org
Cc: john.mcnamara@intel.com, marko.kovacevic@intel.com, David Hunt
Date: Tue, 7 Jan 2020 09:13:43 +0000
Message-Id: <20200107091343.22965-1-david.hunt@intel.com>
Subject: [dpdk-dev] [PATCH v1] doc: rework vm power manager user guide

Review and re-work of vm_power_manager documentation.
Hopefully this is clearer, easier to follow.
Signed-off-by: David Hunt
---
 .../img/vm_power_mgr_highlevel.svg            | 2189 +++++++++++------
 .../img/vm_power_mgr_vm_request_seq.svg       | 1455 +++++------
 .../sample_app_ug/vm_power_management.rst     | 1194 +++++----
 3 files changed, 2775 insertions(+), 2063 deletions(-)

diff --git a/doc/guides/sample_app_ug/img/vm_power_mgr_highlevel.svg b/doc/guides/sample_app_ug/img/vm_power_mgr_highlevel.svg
index 92f882674..c251bcda6 100644
--- a/doc/guides/sample_app_ug/img/vm_power_mgr_highlevel.svg
+++ b/doc/guides/sample_app_ug/img/vm_power_mgr_highlevel.svg
@@ -1,710 +1,1491 @@
[SVG markup omitted. Figure: high-level solution diagram.
 Boxes: Host (Core 0 to Core 7); VM 0 (Virtual Core 0 to 3, lcore channel 0 to 3, librte_power(vm), DPDK Application); VM 1 (Virtual Core 0 and 1, lcore channel 0 and 1, librte_power(vm), DPDK Application); OS/Hypervisor (Linux "userspace" power governor, /sys/devices/system/cpu/cpuN/cpufreq/); VM Power Monitor Application (Endpoint Monitor (lcore channels), Channel Manager, librte_power (Host), VM Power CLI); QEMU; libvirt; connector "Map vCPU to pCPU".
 Annotation "DPDK VM Application": reuses the librte_power interface, but provides a new implementation that forwards frequency set requests to the host using a Virtio-Serial channel; each lcore has exclusive access to a single channel; the sample application reuses l3fwd_power; a CLI for changing frequency from within a VM is also included.
 Annotation "VM Power Monitor": accepts VM commands over Virtio-Serial endpoints, monitored using epoll; commands include the virtual core to be modified, using libvirt to get the physical core mapping; uses librte_power to effect frequency changes using the Linux userspace power governor (ACPI cpufreq); the CLI adds VM channels to monitor, inspects and changes channel state, manually alters CPU frequency, and changes vCPU to pCPU pinning.]
diff --git a/doc/guides/sample_app_ug/img/vm_power_mgr_vm_request_seq.svg b/doc/guides/sample_app_ug/img/vm_power_mgr_vm_request_seq.svg
index 1487cda9a..0c6d49f0c 100644
--- a/doc/guides/sample_app_ug/img/vm_power_mgr_vm_request_seq.svg
+++ b/doc/guides/sample_app_ug/img/vm_power_mgr_vm_request_seq.svg
@@ -1,4 +1,5 @@
@@ -12,884 +13,688 @@
[SVG markup omitted. Figure: sequence diagram of a VM request to scale frequency.
 Participants: librte_power(VM), guest_channel(VM), channel_monitor(Host), channel_manager(Host), power_manager(Host), librte_power(Host).
 Messages: rte_power_freq_up(), guest_channel_send_msg(), process_request (inside "Loop: for each epoll event"), get_pcpu_mask() returning pcpu_mask, scale_freq_up(pcpu_mask), rte_power_freq_up(), each answered with a status.]
diff --git a/doc/guides/sample_app_ug/vm_power_management.rst b/doc/guides/sample_app_ug/vm_power_management.rst
index bb2aa4faf..d43ba9cbe 100644
--- a/doc/guides/sample_app_ug/vm_power_management.rst
+++ b/doc/guides/sample_app_ug/vm_power_management.rst
@@ -1,119 +1,128 @@
 ..  SPDX-License-Identifier: BSD-3-Clause
     Copyright(c) 2010-2014 Intel Corporation.
-VM Power Management Application
-===============================
-
-Introduction
-------------
-
-Applications running in Virtual Environments have an abstract view of
-the underlying hardware on the Host, in particular applications cannot see
-the binding of virtual to physical hardware.
-When looking at CPU resourcing, the pinning of Virtual CPUs(vCPUs) to
-Host Physical CPUs(pCPUS) is not apparent to an application
-and this pinning may change over time.
-Furthermore, Operating Systems on virtual machines do not have the ability
-to govern their own power policy; the Machine Specific Registers (MSRs)
-for enabling P-State transitions are not exposed to Operating Systems
-running on Virtual Machines(VMs).
-
-The Virtual Machine Power Management solution shows an example of
-how a DPDK application can indicate its processing requirements using VM local
-only information(vCPU/lcore, etc.) to a Host based Monitor which is responsible
-for accepting requests for frequency changes for a vCPU, translating the vCPU
-to a pCPU via libvirt and affecting the change in frequency.
-
-The solution is comprised of two high-level components:
-
-#. Example Host Application
-
-   Using a Command Line Interface(CLI) for VM->Host communication channel management
-   allows adding channels to the Monitor, setting and querying the vCPU to pCPU pinning,
-   inspecting and manually changing the frequency for each CPU.
-   The CLI runs on a single lcore while the thread responsible for managing
-   VM requests runs on a second lcore.
-
-   VM requests arriving on a channel for frequency changes are passed
-   to the librte_power ACPI cpufreq sysfs based library.
-   The Host Application relies on both qemu-kvm and libvirt to function.
-
-   This monitoring application is responsible for:
-
-   - Accepting requests from client applications: Client applications can
-     request frequency changes for a vCPU, translating
-     the vCPU to a pCPU via libvirt and affecting the change in frequency.
-
-   - Accepting policies from client applications: Client application can
-     send a policy to the host application. The
-     host application will then apply the rules of the policy independent
-     of the application. For example, the policy can contain time-of-day
-     information for busy/quiet periods, and the host application can scale
-     up/down the relevant cores when required. See the details of the guest
-     application below for more information on setting the policy values.
-
-   - Out-of-band monitoring of workloads via cores hardware event counters:
-     The host application can manage power for an application in a virtualised
-     OR non-virtualised environment by looking at the event counters of the
-     cores and taking action based on the branch hit/miss ratio. See the host
-     application '--core-list' command line parameter below.
-
-#. librte_power for Virtual Machines
-
-   Using an alternate implementation for the librte_power API, requests for
-   frequency changes are forwarded to the host monitor rather than
-   the APCI cpufreq sysfs interface used on the host.
-
-   The l3fwd-power application will use this implementation when deployed on a VM
-   (see :doc:`l3_forward_power_man`).
+Virtual Machine Power Management Application
+============================================
+
+Applications running in virtual environments have an abstract view of
+the underlying hardware on the host. Specifically, applications cannot
+see the binding of virtual components to physical hardware. When looking
+at CPU resourcing, the pinning of Virtual CPUs (vCPUs) to Physical CPUs
+(pCPUs) on the host is not apparent to an application and this pinning
+may change over time. In addition, operating systems on Virtual Machines
+(VMs) do not have the ability to govern their own power policy. The
+Model Specific Registers (MSRs) for enabling P-state transitions are
+not exposed to the operating systems running on the VMs.
+
+The solution demonstrated in this sample application shows an example of
+how a DPDK application can indicate its processing requirements using
+VM-local only information (vCPU/lcore, and so on) to a host-resident VM
+Power Manager. The VM Power Manager is responsible for:
+
+- **Accepting requests for frequency changes for a vCPU**
+- **Translating the vCPU to a pCPU using libvirt**
+- **Performing the change in frequency**
+
+This application demonstrates the following features:
+
+- **The handling of VM application requests to change frequency.**
+  VM applications can request frequency changes for a vCPU. The VM
+  Power Management Application uses libvirt to translate that
+  virtual CPU (vCPU) request to a physical CPU (pCPU) request and
+  performs the frequency change.
+
+- **The acceptance of power management policies from VM applications.**
+  A VM application can send a policy to the host application. The
+  policy contains rules that define the power management behaviour
+  of the VM. The host application then applies the rules of the
+  policy independent of the VM application. For example, the
+  policy can contain time-of-day information for busy/quiet
+  periods, and the host application can scale up/down the relevant
+  cores when required. See :ref:`sending_policy` for information on
+  setting policy values.
+
+- **Out-of-band monitoring of workloads using core hardware event counters.**
+  The host application can manage power for an application by looking
+  at the event counters of the cores and taking action based on the
+  branch miss/hit ratio. See :ref:`enabling_out_of_band`.
+
+  **Note**: This functionality also applies in non-virtualised environments.
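As an illustration of the first feature listed above, the translation of a vCPU request into per-pCPU actions can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the sample application's code: the ``PINNING`` table, the command names and the ``translate_request`` helper are hypothetical, and the real application obtains the pinning through libvirt.

```python
# Hypothetical pinning table: vCPU index -> set of pCPUs it may run on.
# In the sample application this information comes from libvirt.
PINNING = {0: {2}, 1: {3}, 2: {4, 5}}

# Hypothetical command names mirroring "scale up/down/min/max".
VALID_COMMANDS = {"scale_up", "scale_down", "scale_min", "scale_max"}


def translate_request(vcpu, command, pinning=PINNING):
    """Map a guest request (vCPU, command) to the pCPUs it affects."""
    if command not in VALID_COMMANDS:
        raise ValueError(f"unknown power command: {command}")
    if vcpu not in pinning:
        raise KeyError(f"vCPU {vcpu} is not pinned to any pCPU")
    # Every pCPU the vCPU can run on receives the same command.
    return [(pcpu, command) for pcpu in sorted(pinning[vcpu])]


if __name__ == "__main__":
    print(translate_request(2, "scale_up"))
```

A vCPU pinned to several pCPUs yields one action per pCPU, which is one plausible way to handle a pinning mask; the guest itself only ever names the vCPU.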
+
+In addition to the ``librte_power`` library used on the host, the
+application uses a special version of ``librte_power`` on each VM, which
+directs frequency changes and policies to the host monitor rather than
+the ACPI ``cpufreq`` ``sysfs`` interface used on the host in non-virtualised
+environments.
 
 .. _figure_vm_power_mgr_highlevel:
 
 .. figure:: img/vm_power_mgr_highlevel.*
 
-   Highlevel Solution
+Sample Application Architecture Overview
+----------------------------------------
+
+The VM power management solution employs ``qemu-kvm`` to provide
+communications channels between the host and VMs in the form of a
+``virtio-serial`` connection that appears as a para-virtualised serial
+device on a VM and can be configured to use various backends on the
+host. For this example, each ``virtio-serial`` endpoint on the host is
+configured as an ``AF_UNIX`` file socket, supporting poll/select and
+``epoll`` for event notification. In this example, each channel endpoint on
+the host is monitored for ``EPOLLIN`` events using ``epoll``. Each channel
+is specified as ``qemu-kvm`` arguments or as ``libvirt`` XML for each VM,
+where each VM can have several channels up to a maximum of 64 per VM. In this
+example, each DPDK lcore on a VM has exclusive access to a channel.
+
+To enable frequency changes from within a VM, the VM forwards a
+``librte_power`` request over the ``virtio-serial`` channel to the host. Each
+request contains the vCPU and power command (scale up/down/min/max). The
+API for the host ``librte_power`` and guest ``librte_power`` is consistent
+across environments, with the selection of VM or host implementation
+determined automatically at runtime based on the environment. On
+receiving a request, the host translates the vCPU to a pCPU using the
+libvirt API before forwarding it to the host ``librte_power``.
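The host-side monitoring described above, where each channel endpoint is an ``AF_UNIX`` socket watched for ``EPOLLIN`` events, could look roughly like the following Python sketch. It is not the ``vm_power_manager`` code: ``socket.socketpair()`` stands in for a ``virtio-serial`` channel, and the endpoint name ``vm0.1`` and the message format are made up for illustration. ``select.epoll`` is available on Linux only.

```python
import select
import socket


def wait_for_requests(endpoints, timeout=1.0):
    """Return {name: bytes} for every endpoint that became readable."""
    epoll = select.epoll()
    by_fd = {}
    try:
        for name, sock in endpoints.items():
            # Only readability is of interest, as in the text above.
            epoll.register(sock.fileno(), select.EPOLLIN)
            by_fd[sock.fileno()] = (name, sock)
        received = {}
        for fd, events in epoll.poll(timeout):
            if events & select.EPOLLIN:
                name, sock = by_fd[fd]
                received[name] = sock.recv(64)
        return received
    finally:
        epoll.close()


def demo():
    # socketpair() stands in for the host and guest ends of one channel.
    host_end, guest_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Hypothetical wire format; the real request is a binary packet.
        guest_end.sendall(b"vcpu=1 cmd=scale_up")
        return wait_for_requests({"vm0.1": host_end})
    finally:
        host_end.close()
        guest_end.close()


if __name__ == "__main__":
    print(demo())
```

Because every lcore owns its channel exclusively, the monitor can attribute each readable endpoint to exactly one guest lcore without any extra bookkeeping in the message.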
-Overview
---------
-
-VM Power Management employs qemu-kvm to provide communications channels
-between the host and VMs in the form of Virtio-Serial which appears as
-a paravirtualized serial device on a VM and can be configured to use
-various backends on the host. For this example each Virtio-Serial endpoint
-on the host is configured as AF_UNIX file socket, supporting poll/select
-and epoll for event notification.
-In this example each channel endpoint on the host is monitored via
-epoll for EPOLLIN events.
-Each channel is specified as qemu-kvm arguments or as libvirt XML for each VM,
-where each VM can have a number of channels up to a maximum of 64 per VM,
-in this example each DPDK lcore on a VM has exclusive access to a channel.
-
-To enable frequency changes from within a VM, a request via the librte_power interface
-is forwarded via Virtio-Serial to the host, each request contains the vCPU
-and power command(scale up/down/min/max).
-The API for host and guest librte_power is consistent across environments,
-with the selection of VM or Host Implementation determined at automatically
-at runtime based on the environment.
-
-Upon receiving a request, the host translates the vCPU to a pCPU via
-the libvirt API before forwarding to the host librte_power.
-
 .. _figure_vm_power_mgr_vm_request_seq:
 
 .. figure:: img/vm_power_mgr_vm_request_seq.*
 
-   VM request to scale frequency
-
+In addition to the ability to send power management requests to the
+host, a VM can send a power management policy to the host. In some
+cases, using a power management policy is a preferred option because it
+can eliminate possible latency issues that can occur when sending power
+management requests. Once the VM sends the policy to the host, the VM no
+longer needs to worry about power management, because the host now
+manages the power for the VM based on the policy. The policy can specify
+power behavior that is based on incoming traffic rates or time-of-day
+power adjustment (busy/quiet hour power adjustment for example). See
+:ref:`sending_policy` for more information.
+
+One method of power management is to sense how busy a core is when
+processing packets and adjust power accordingly. One technique for
+doing this is to monitor the ratio of the branch miss to branch hit
+counters and scale the core power accordingly. This technique is based
+on the premise that when a core is not processing packets, the ratio of
+branch misses to branch hits is very low, but when the core is
+processing packets, it is measurably higher. This capability is
+implemented as a policy of type ``BRANCH_RATIO``.
+See :ref:`sending_policy` for more information on using the
+``BRANCH_RATIO`` policy option.
+
+A JSON interface enables the specification of power management requests
+and policies in JSON format. The JSON interface provides a more
+convenient and more easily interpreted interface for the specification
+of requests and policies. See :ref:`power_man_requests` for more information.
 
 Performance Considerations
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-While Haswell Microarchitecture allows for independent power control for each core,
-earlier Microarchtectures do not offer such fine grained control.
-When deployed on pre-Haswell platforms greater care must be taken in selecting
-which cores are assigned to a VM, for instance a core will not scale down
-until its sibling is similarly scaled.
+While the Haswell microarchitecture allows for independent power control
+for each core, earlier microarchitectures do not offer such fine-grained
+control. When deploying on pre-Haswell platforms, greater care must be
+taken when selecting which cores are assigned to a VM. For example, a
+core does not scale down in frequency until all of its siblings are
+similarly scaled down.
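The ``BRANCH_RATIO`` premise described above (a low miss-to-hit ratio when a core is idle-polling, a measurably higher one when it processes packets) can be illustrated with a short Python sketch. The threshold value of ``0.01``, the action names and both helper functions are assumptions made for illustration only; the sample application reads hardware event counters and its actual threshold and scaling decisions may differ.

```python
def branch_ratio(misses, hits):
    """Ratio of branch misses to branch hits; 0.0 when there were no hits."""
    return misses / hits if hits else 0.0


def choose_frequency(misses, hits, threshold=0.01):
    """Pick a scaling action from one pair of counter deltas.

    The threshold is a hypothetical value chosen for this sketch.
    """
    if branch_ratio(misses, hits) > threshold:
        return "scale_max"  # core appears to be processing packets
    return "scale_min"      # core appears to be polling an empty queue


if __name__ == "__main__":
    print(choose_frequency(5, 100_000))      # mostly idle polling
    print(choose_frequency(5_000, 100_000))  # busy with packets
```

Feeding the function counter deltas taken over a fixed sampling interval, rather than absolute counter values, keeps the ratio responsive to recent load.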
 Configuration
 -------------
@@ -121,636 +130,541 @@ Configuration
 BIOS
 ~~~~
 
-Enhanced Intel SpeedStep® Technology must be enabled in the platform BIOS
-if the power management feature of DPDK is to be used.
-Otherwise, the sys file folder /sys/devices/system/cpu/cpu0/cpufreq will not exist,
-and the CPU frequency-based power management cannot be used.
-Consult the relevant BIOS documentation to determine how these settings
-can be accessed.
+To use the power management features of the DPDK, you must enable
+Enhanced Intel SpeedStep® Technology in the platform BIOS. Otherwise,
+the ``sysfs`` directory ``/sys/devices/system/cpu/cpu0/cpufreq`` does not
+exist, and you cannot use CPU frequency-based power management. Refer to the
+relevant BIOS documentation to determine how to access these settings.
 
 Host Operating System
 ~~~~~~~~~~~~~~~~~~~~~
 
-The DPDK Power Library can use either the *acpi_cpufreq* or *intel_pstate*
-kernel driver for the management of core frequencies. In many cases
-the *intel_pstate* driver is the default Power Management environment.
+The DPDK Power Management library can use either the ``acpi_cpufreq`` or
+the ``intel_pstate`` kernel driver for the management of core frequencies. In
+many cases, the ``intel_pstate`` driver is the default power management
+environment.
 
-Should the *acpi-cpufreq* driver be required, the *intel_pstate* module must
-be disabled, and *apci_cpufreq* module loaded in its place.
+If the ``acpi_cpufreq`` driver is required, the ``intel_pstate``
+module must be disabled, and the ``acpi_cpufreq`` module loaded in its place.
 
-To disable *intel_pstate* driver, add the following to the grub Linux
-command line:
+To disable the ``intel_pstate`` driver, add the following to the ``grub``
+Linux command line:
 
-.. code-block:: console
+   ``intel_pstate=disable``
 
-   intel_pstate=disable
+On reboot, load the ``acpi_cpufreq`` module:
 
-Upon rebooting, load the *acpi_cpufreq* module:
-
-.. code-block:: console
-
-   modprobe acpi_cpufreq
+   ``modprobe acpi_cpufreq``
 
 Hypervisor Channel Configuration
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Virtio-Serial channels are configured via libvirt XML:
-
+Configure ``virtio-serial`` channels using ``libvirt`` XML.
+The XML structure is as follows:
 
-.. code-block:: xml
+.. code-block:: XML
 
-   <name>{vm_name}</name>
-   [eight further XML lines; element markup lost in extraction]
+    <name>{vm_name}</name>
+    [eight further XML lines; element markup lost in extraction]
+
+Here, a single controller of type ``virtio-serial`` is created. Up to 32
+channels can be associated with a single controller, and multiple
+controllers can be specified. The convention is to use the name of the
+VM in the host path ``{vm_name}`` and to increment ``{channel_num}`` for each
+channel. Likewise, the port value ``{N}`` must be incremented for each
+channel.
 
-Where a single controller of type *virtio-serial* is created and up to 32 channels
-can be associated with a single controller and multiple controllers can be specified.
-The convention is to use the name of the VM in the host path *{vm_name}* and
-to increment *{channel_num}* for each channel, likewise the port value *{N}*
-must be incremented for each channel.
-
-Each channel on the host will appear in *path*, the directory */tmp/powermonitor/*
-must first be created and given qemu permissions
+Each channel appears on the host at the configured path, so first create
+the ``/tmp/powermonitor/`` directory and assign ``qemu`` ownership to
+it:
 
 .. code-block:: console
 
-  mkdir /tmp/powermonitor/
-  chown qemu:qemu /tmp/powermonitor
+    mkdir /tmp/powermonitor/
+    chown qemu:qemu /tmp/powermonitor
+
+Note that files and directories in ``/tmp`` are generally removed when
+rebooting the host and you may need to perform the previous steps after
+each reboot.
 
-Note that files and directories within /tmp are generally removed upon
-rebooting the host and the above steps may need to be carried out after each reboot.
+The serial device as it appears on a VM is configured with the ``target``
+element attribute ``name`` and must be in the form:
+``virtio.serial.port.poweragent.{vm_channel_num}``, where
+``vm_channel_num`` is typically the lcore channel to be used in
+DPDK VM applications.
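The naming conventions described above (VM name and channel number in the host path, an incrementing port value, and a guest-visible name of the form ``virtio.serial.port.poweragent.{vm_channel_num}``) can be illustrated by generating one channel definition per lcore. The following Python sketch is an assumption-laden illustration rather than the XML from the guide: the element and attribute layout follows common libvirt domain XML usage, and the host file name pattern ``{vm_name}.{channel_num}`` under ``/tmp/powermonitor/`` and the ``channel_xml`` helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def channel_xml(vm_name, channel_num, port, controller=0):
    """Build one <channel> element for a virtio-serial power channel."""
    channel = ET.Element("channel", type="unix")
    # Host side: one AF_UNIX socket per channel (file name is an assumption).
    ET.SubElement(channel, "source", mode="bind",
                  path=f"/tmp/powermonitor/{vm_name}.{channel_num}")
    # Guest side: the name that shows up under /dev/virtio-ports/.
    ET.SubElement(channel, "target", type="virtio",
                  name=f"virtio.serial.port.poweragent.{channel_num}")
    # The port value must be unique per channel on a controller.
    ET.SubElement(channel, "address", {
        "type": "virtio-serial",
        "controller": str(controller),
        "bus": "0",
        "port": str(port),
    })
    return ET.tostring(channel, encoding="unicode")


if __name__ == "__main__":
    # Two channels for a VM called "vm0", using ports 1 and 2.
    for num in range(2):
        print(channel_xml("vm0", num, port=num + 1))
```

Generating the fragments this way makes it harder to reuse a port value or a socket path by accident when a VM has many lcore channels.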
-The serial device as it appears on a VM is configured with the *target* element attribute *name* -and must be in the form of *virtio.serial.port.poweragent.{vm_channel_num}*, -where *vm_channel_num* is typically the lcore channel to be used in DPDK VM applications. +Each channel on a VM is present at: -Each channel on a VM will be present at */dev/virtio-ports/virtio.serial.port.poweragent.{vm_channel_num}* +``/dev/virtio-ports/virtio.serial.port.poweragent.{vm_channel_num}`` Compiling and Running the Host Application ------------------------------------------ -Compiling -~~~~~~~~~ +Compiling the Host Application +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -For information on compiling DPDK and the sample applications +For information on compiling the DPDK and sample applications, see :doc:`compiling`. -The application is located in the ``vm_power_manager`` sub-directory. +The application is located in the ``vm_power_manager`` subdirectory. To build just the ``vm_power_manager`` application using ``make``: .. code-block:: console - export RTE_SDK=/path/to/rte_sdk - export RTE_TARGET=build - cd ${RTE_SDK}/examples/vm_power_manager/ - make + export RTE_SDK=/path/to/rte_sdk + export RTE_TARGET=build + cd ${RTE_SDK}/examples/vm_power_manager/ + make -The resulting binary will be ${RTE_SDK}/build/examples/vm_power_manager +The resulting binary is ``${RTE_SDK}/build/examples/vm_power_manager``. -To build just the ``vm_power_manager`` application using ``meson/ninja``: +To build just the ``vm_power_manager`` application using ``meson``/``ninja``: .. 
code-block:: console - export RTE_SDK=/path/to/rte_sdk - cd ${RTE_SDK} - meson build - cd build - ninja - meson configure -Dexamples=vm_power_manager - ninja + export RTE_SDK=/path/to/rte_sdk + cd ${RTE_SDK} + meson build + cd build + ninja + meson configure -Dexamples=vm_power_manager + ninja -The resulting binary will be ${RTE_SDK}/build/examples/dpdk-vm_power_manager +The resulting binary is ``${RTE_SDK}/build/examples/dpdk-vm_power_manager``. -Running -~~~~~~~ +Running the Host Application +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The application does not have any specific command line options other than *EAL*: +The application does not have any specific command line options other +than the EAL options: .. code-block:: console - ./build/vm_power_mgr [EAL options] + ./build/vm_power_mgr [EAL options] -The application requires exactly two cores to run, one core is dedicated to the CLI, -while the other is dedicated to the channel endpoint monitor, for example to run -on cores 0 & 1 on a system with 4 memory channels: +The application requires exactly two cores to run. One core for the CLI +and the other for the channel endpoint monitor. For example, to run on +cores 0 and 1 on a system with four memory channels, issue the command: .. code-block:: console - ./build/vm_power_mgr -l 0-1 -n 4 + ./build/vm_power_mgr -l 0-1 -n 4 -After successful initialization the user is presented with VM Power Manager CLI: +After successful initialization, the VM Power Manager CLI prompt appears: .. code-block:: console - vm_power> + vm_power> -Virtual Machines can now be added to the VM Power Manager: +Now, it is possible to add virtual machines to the VM Power Manager: .. code-block:: console - vm_power> add_vm {vm_name} - -When a {vm_name} is specified with the *add_vm* command a lookup is performed -with libvirt to ensure that the VM exists, {vm_name} is used as an unique identifier -to associate channels with a particular VM and for executing operations on a VM within the CLI. 
-VMs do not have to be running in order to add them. + vm_power> add_vm {vm_name} -A number of commands can be issued via the CLI in relation to VMs: +When a ``{vm_name}`` is specified with the ``add_vm`` command, a lookup is +performed with ``libvirt`` to ensure that the VM exists. ``{vm_name}`` is a +unique identifier to associate channels with a particular VM and for +executing operations on a VM within the CLI. VMs do not have to be +running to add them. - Remove a Virtual Machine identified by {vm_name} from the VM Power Manager. +It is possible to issue several commands from the CLI to manage VMs. - .. code-block:: console - - rm_vm {vm_name} - - Add communication channels for the specified VM, the virtio channels must be enabled - in the VM configuration(qemu/libvirt) and the associated VM must be active. - {list} is a comma-separated list of channel numbers to add, using the keyword 'all' - will attempt to add all channels for the VM: - - .. code-block:: console - - add_channels {vm_name} {list}|all - - Enable or disable the communication channels in {list}(comma-separated) - for the specified VM, alternatively list can be replaced with keyword 'all'. - Disabled channels will still receive packets on the host, however the commands - they specify will be ignored. Set status to 'enabled' to begin processing requests again: - - .. code-block:: console - - set_channel_status {vm_name} {list}|all enabled|disabled - - Print to the CLI the information on the specified VM, the information - lists the number of vCPUS, the pinning to pCPU(s) as a bit mask, along with - any communication channels associated with each VM, along with the status of each channel: - - .. code-block:: console +Remove the virtual machine identified by ``{vm_name}`` from the VM Power +Manager using the command: - show_vm {vm_name} - - Set the binding of Virtual CPU on VM with name {vm_name} to the Physical CPU mask: - - .. code-block:: console +.. 
code-block:: console - set_pcpu_mask {vm_name} {vcpu} {pcpu} + rm_vm {vm_name} - Set the binding of Virtual CPU on VM to the Physical CPU: +Add communication channels for the specified VM using the following +command. The ``virtio`` channels must be enabled in the VM configuration +(``qemu/libvirt``) and the associated VM must be active. ``{list}`` is a +comma-separated list of channel numbers to add. Specifying the keyword +``all`` attempts to add all channels for the VM: - .. code-block:: console +.. code-block:: console - set_pcpu {vm_name} {vcpu} {pcpu} + set_pcpu {vm_name} {vcpu} {pcpu} Enable query of physical core information from a VM: - .. code-block:: console +.. code-block:: console - set_query {vm_name} enable|disable + set_query {vm_name} enable|disable Manual control and inspection can also be carried in relation CPU frequency scaling: Get the current frequency for each core specified in the mask: - .. code-block:: console +.. code-block:: console - show_cpu_freq_mask {mask} + show_cpu_freq_mask {mask} Set the current frequency for the cores specified in {core_mask} by scaling each up/down/min/max: - .. code-block:: console +.. code-block:: console - set_cpu_freq {core_mask} up|down|min|max + add_channels {vm_name} {list}|all - Get the current frequency for the specified core: +Enable or disable the communication channels in ``{list}`` (comma-separated) +for the specified VM. Alternatively, replace ``list`` with the keyword +``all``. Disabled channels receive packets on the host. However, the commands +they specify are ignored. Set the status to enabled to begin processing +requests again: - .. code-block:: console - - show_cpu_freq {core_num} +.. code-block:: console - Set the current frequency for the specified core by scaling up/down/min/max: + set_channel_status {vm_name} {list}|all enabled|disabled - .. code-block:: console +Print to the CLI information on the specified VM. 
The information lists +the number of vCPUs, the pinning to pCPU(s) as a bit mask, along with +any communication channels associated with each VM, and the status of +each channel: - set_cpu_freq {core_num} up|down|min|max +.. code-block:: console -There are also some command line parameters for enabling the out-of-band -monitoring of branch ratio on cores doing busy polling via PMDs. + show_vm {vm_name} - .. code-block:: console +Set the binding of a virtual CPU on a VM with name ``{vm_name}`` to the +physical CPU mask: - --core-list {list of cores} +.. code-block:: console - When this parameter is used, the list of cores specified will monitor the ratio - between branch hits and branch misses. A tightly polling PMD thread will have a - very low branch ratio, so the core frequency will be scaled down to the minimum - allowed value. When packets are received, the code path will alter, causing the - branch ratio to increase. When the ratio goes above the ratio threshold, the - core frequency will be scaled up to the maximum allowed value. + set_pcpu_mask {vm_name} {vcpu} {pcpu} +Set the binding of the virtual CPU on the VM to the physical CPU: +  .. code-block:: console - --branch-ratio {ratio} - - The branch ratio is a floating point number that specifies the threshold at which - to scale up or down for the given workload. The default branch ratio is 0.01, - and will need to be adjusted for different workloads. - - - -JSON API -~~~~~~~~ - -In addition to the command line interface for host command and a virtio-serial -interface for VM power policies, there is also a JSON interface through which -power commands and policies can be sent. This functionality adds a dependency -on the Jansson library, and the Jansson development package must be installed -on the system before the JSON parsing functionality is included in the app. -This is achieved by: - - .. 
code-block:: javascript + set_pcpu {vm_name} {vcpu} {pcpu} - apt-get install libjansson-dev +It is also possible to perform manual control and inspection in relation +to CPU frequency scaling. -The command and package name may be different depending on your operating -system. It's worth noting that the app will successfully build without this -package present, but a warning is shown during compilation, and the JSON -parsing functionality will not be present in the app. +Get the current frequency for each core specified in the mask: -Sending a command or policy to the power manager application is achieved by -simply opening a fifo file, writing a JSON string to that fifo, and closing -the file. In actual implementation every core has own dedicated fifo[0..n], -where n is number of the last available core. -Having a dedicated fifo file per core allows using standard filesystem permissions -to ensure a given container can only write JSON commands into fifos it is allowed -to use. - -The fifo is at /tmp/powermonitor/fifo[0..n] - -For example all cmds put to the /tmp/powermonitor/fifo7, will have -effect only on CPU[7]. - -The JSON string can be a policy or instruction, and takes the following -format: - - .. code-block:: javascript - - {"packet_type": { - "pair_1": value, - "pair_2": value - }} - -The 'packet_type' header can contain one of two values, depending on -whether a policy or power command is being sent. The two possible values are -"policy" and "instruction", and the expected name-value pairs is different -depending on which type is being sent. - -The pairs are the format of standard JSON name-value pairs. The value type -varies between the different name/value pairs, and may be integers, strings, -arrays, etc. Examples of policies follow later in this document. The allowed -names and value types are as follows: - - -:Pair Name: "command" -:Description: The type of packet we're sending to the power manager. 
We can be - creating or destroying a policy, or sending a direct command to adjust - the frequency of a core, similar to the command line interface. -:Type: string -:Values: - - :CREATE: used when creating a new policy, - :DESTROY: used when removing a policy, - :POWER: used when sending an immediate command, max, min, etc. -:Required: yes -:Example: +.. code-block:: console - .. code-block:: javascript + show_cpu_freq_mask {mask} - "command", "CREATE" +Set the current frequency for the cores specified in ``{core_mask}`` by +scaling each up/down/min/max: +.. code-block:: console -:Pair Name: "policy_type" -:Description: Type of policy to apply. Please see vm_power_manager documentation - for more information on the types of policies that may be used. -:Type: string -:Values: + set_cpu_freq {core_mask} up|down|min|max - :TIME: Time-of-day policy. Frequencies of the relevant cores are - scaled up/down depending on busy and quiet hours. - :TRAFFIC: This policy takes statistics from the NIC and scales up - and down accordingly. - :WORKLOAD: This policy looks at how heavily loaded the cores are, - and scales up and down accordingly. - :BRANCH_RATIO: This out-of-band policy can look at the ratio between - branch hits and misses on a core, and is useful for detecting - how much packet processing a core is doing. -:Required: only for CREATE/DESTROY command -:Example: +Get the current frequency for the specified core: - .. code-block:: javascript +.. code-block:: console - "policy_type", "TIME" + show_cpu_freq {core_num} -:Pair Name: "busy_hours" -:Description: The hours of the day in which we scale up the cores for busy - times. -:Type: array of integers -:Values: array with list of hour numbers, (0-23) -:Required: only for TIME policy -:Example: - - .. code-block:: javascript - - "busy_hours":[ 17, 18, 19, 20, 21, 22, 23 ] - -:Pair Name: "quiet_hours" -:Description: The hours of the day in which we scale down the cores for quiet - times. 
-:Type: array of integers -:Values: array with list of hour numbers, (0-23) -:Required: only for TIME policy -:Example: +Set the current frequency for the specified core by scaling up/down/min/max: - .. code-block:: javascript +.. code-block:: console - "quiet_hours":[ 2, 3, 4, 5, 6 ] + set_cpu_freq {core_num} up|down|min|max -:Pair Name: "avg_packet_thresh" -:Description: Threshold below which the frequency will be set to min for - the TRAFFIC policy. If the traffic rate is above this and below max, the - frequency will be set to medium. -:Type: integer -:Values: The number of packets below which the TRAFFIC policy applies the - minimum frequency, or medium frequency if between avg and max thresholds. -:Required: only for TRAFFIC policy -:Example: +.. _enabling_out_of_band: - .. code-block:: javascript +Command Line Options for Enabling Out-of-band Branch Ratio Monitoring +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - "avg_packet_thresh": 100000 +There are a couple of command line parameters for enabling the out-of-band +monitoring of branch ratios on cores doing busy polling using PMDs as +described in the following table. -:Pair Name: "max_packet_thresh" -:Description: Threshold above which the frequency will be set to max for - the TRAFFIC policy -:Type: integer -:Values: The number of packets per interval above which the TRAFFIC policy - applies the maximum frequency -:Required: only for TRAFFIC policy -:Example: +Table 1 – Command Line Options for Enabling Out-of-band Monitoring of +Branch Ratios - .. code-block:: javascript +=============================== ============================================== +**Command Line Option** **Description** +=============================== ============================================== +``--core-list {list of cores}`` | Specify the list of cores to monitor the ratio of branch misses + | to branch hits. 
A tightly-polling PMD thread has a very low + | branch ratio, therefore the core frequency scales down to the + | minimum allowed value. On receiving packets, the code path changes, + | causing the branch ratio to increase. When the ratio goes above + | the ratio threshold, the core frequency scales up to the maximum + | allowed value. +``--branch-ratio {ratio}`` | Specify a floating-point number that identifies the threshold at which + | to scale up or down for the given workload. The default branch ratio + | is 0.01 and needs adjustment for different workloads. +=============================== ============================================== - "max_packet_thresh": 500000 -:Pair Name: "workload" -:Description: When our policy is of type WORKLOAD, we need to specify how - heavy our workload is. -:Type: string -:Values: - :HIGH: For cores running workloads that require high frequencies - :MEDIUM: For cores running workloads that require medium frequencies - :LOW: For cores running workloads that require low frequencies -:Required: only for WORKLOAD policy types -:Example: +Compiling and Running the Guest Applications +-------------------------------------------- - .. code-block:: javascript +It is possible to use the ``l3fwd-power`` application (for example) with the +``vm_power_manager``. - "workload", "MEDIUM" +The distribution also provides a guest CLI for validating the setup. -:Pair Name: "mac_list" -:Description: When our policy is of type TRAFFIC, we need to specify the - MAC addresses that the host needs to monitor -:Type: string -:Values: array with a list of mac address strings. -:Required: only for TRAFFIC policy types -:Example: +For both ``l3fwd-power`` and the guest CLI, the host application must use +the ``add_channels`` command to monitor the channels for the VM. To do this, +issue the following commands in the host application: - .. code-block:: javascript +.. 
code-block:: console - "mac_list":[ "de:ad:be:ef:01:01", "de:ad:be:ef:01:02" ] + vm_power> add_vm vmname + vm_power> add_channels vmname all + vm_power> set_channel_status vmname all enabled + vm_power> show_vm vmname -:Pair Name: "unit" -:Description: the type of power operation to apply in the command -:Type: string -:Values: +Compiling the Guest Application +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - :SCALE_MAX: Scale frequency of this core to maximum - :SCALE_MIN: Scale frequency of this core to minimum - :SCALE_UP: Scale up frequency of this core - :SCALE_DOWN: Scale down frequency of this core - :ENABLE_TURBO: Enable Turbo Boost for this core - :DISABLE_TURBO: Disable Turbo Boost for this core -:Required: only for POWER instruction -:Example: +For information on compiling DPDK and the sample applications in general, +see :doc:`compiling`. - .. code-block:: javascript +For compiling and running the ``l3fwd-power`` sample application, see +:doc:`l3_forward_power_man`. - "unit", "SCALE_MAX" +The application is in the ``guest_cli`` subdirectory under ``vm_power_manager``. -JSON API Examples -~~~~~~~~~~~~~~~~~ +To build just the ``guest_vm_power_manager`` application using ``make``, issue +the following commands: -Profile create example: +.. code-block:: console - .. code-block:: javascript + export RTE_SDK=/path/to/rte_sdk + export RTE_TARGET=build + cd ${RTE_SDK}/examples/vm_power_manager/guest_cli/ + make - {"policy": { - "command": "create", - "policy_type": "TIME", - "busy_hours":[ 17, 18, 19, 20, 21, 22, 23 ], - "quiet_hours":[ 2, 3, 4, 5, 6 ] - }} +The resulting binary is ``${RTE_SDK}/build/examples/guest_cli``. -Profile destroy example: +**Note**: This sample application conditionally links in the Jansson JSON +library. Consequently, if you are using a multilib or cross-compile +environment, you may need to set the ``PKG_CONFIG_LIBDIR`` environmental +variable to point to the relevant ``pkgconfig`` folder so that the correct +library is linked in. - .. 
code-block:: javascript +For example, if you are building for a 32-bit target, you could find the +correct directory using the following find command: - {"policy": { - "command": "destroy" - }} +.. code-block:: console -Power command example: + # find /usr -type d -name pkgconfig + /usr/lib/i386-linux-gnu/pkgconfig + /usr/lib/x86_64-linux-gnu/pkgconfig - .. code-block:: javascript +Then use: - {"instruction": { - "command": "power", - "unit": "SCALE_MAX" - }} +.. code-block:: console -To send a JSON string to the Power Manager application, simply paste the -example JSON string into a text file and cat it into the proper fifo: + export PKG_CONFIG_LIBDIR=/usr/lib/i386-linux-gnu/pkgconfig - .. code-block:: console +You then use the ``make`` command as normal, which should find the 32-bit +version of the library, if it is installed. If not, the application builds +without the JSON interface functionality. - cat file.json >/tmp/powermonitor/fifo[0..n] +To build just the ``guest_vm_power_manager`` application using ``meson``/``ninja``: -The console of the Power Manager application should indicate the command that -was just received via the fifo. +.. code-block:: console -Compiling and Running the Guest Applications --------------------------------------------- + export RTE_SDK=/path/to/rte_sdk + cd ${RTE_SDK} + meson build + cd build + ninja + meson configure -Dexamples=vm_power_manager/guest_cli + ninja -l3fwd-power is one sample application that can be used with vm_power_manager. +The resulting binary is ``${RTE_SDK}/build/examples/guest_cli``. -A guest CLI is also provided for validating the setup. +Running the Guest Application +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -For both l3fwd-power and guest CLI, the channels for the VM must be monitored by the -host application using the *add_channels* command on the host. This typically uses -the following commands in the host application: +The standard EAL command line parameters are necessary: .. 
code-block:: console - vm_power> add_vm vmname - vm_power> add_channels vmname all - vm_power> set_channel_status vmname all enabled - vm_power> show_vm vmname - + ./build/guest_vm_power_mgr [EAL options] -- [guest options] -Compiling -~~~~~~~~~ +The guest example uses a channel for each lcore enabled. For example, to +run on cores 0, 1, 2 and 3: -For information on compiling DPDK and the sample applications -see :doc:`compiling`. +.. code-block:: console -For compiling and running l3fwd-power, see :doc:`l3_forward_power_man`. + ./build/guest_vm_power_mgr -l 0-3 -The application is located in the ``guest_cli`` sub-directory under ``vm_power_manager``. +.. _sending_policy: -To build just the ``guest_vm_power_manager`` application using ``make``: +Command Line Options Available When Sending a Policy to the Host +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -.. code-block:: console +Optionally, there are several command line options for a user who needs +to send a power policy to the host application. The following table +describes these options. - export RTE_SDK=/path/to/rte_sdk - export RTE_TARGET=build - cd ${RTE_SDK}/examples/vm_power_manager/guest_cli/ - make +Table 2 – Command Line Options Available When Sending a Policy to the Host -The resulting binary will be ${RTE_SDK}/build/examples/guest_cli +======================================= ====================================== +**Command Line Option** **Description** +======================================= ====================================== +``--vm-name {name of guest vm}`` | Allows the user to change the virtual machine name passed + | down to the host application using the power policy. The + | default is ubuntu2. +``--vcpu-list {list vm cores}`` | A comma-separated list of cores in the VM that the user + | wants the host application to monitor. 
The list of cores + | in any VM starts at zero, and the host application maps + | these to the physical cores once the policy passes down + | to the host. Valid syntax includes individual cores + | 2,3,4, a range of cores 2-4, or a combination of both + | 1,3,5-7. +``--busy-hours {list of busy hours}`` | A comma-separated list of hours in which to set the core + | frequency to the maximum. Valid syntax includes + | individual hours 2,3,4, a range of hours 2-4, or a + | combination of both 1,3,5-7. Valid hour values are 0 to 23. +``--quiet-hours {list of quiet hours}`` | A comma-separated list of hours in which to set the core + | frequency to the minimum. Valid syntax includes individual + | hours 2,3,4, a range of hours 2-4, or a combination of + | both 1,3,5-7. Valid hour values are 0 to 23. +``--policy {policy type}`` | The type of policy. This can be one of the following values: -.. Note:: - This sample application conditionally links in the Jansson JSON - library, so if you are using a multilib or cross compile environment you - may need to set the ``PKG_CONFIG_LIBDIR`` environmental variable to point to - the relevant pkgconfig folder so that the correct library is linked in. + - | TRAFFIC - Based on incoming traffic rates on the NIC. - For example, if you are building for a 32-bit target, you could find the - correct directory using the following ``find`` command: + - | TIME - Uses a busy/quiet hours policy. - .. code-block:: console + - | BRANCH_RATIO - Uses branch ratio counters to determine + | core busyness. - # find /usr -type d -name pkgconfig - /usr/lib/i386-linux-gnu/pkgconfig - /usr/lib/x86_64-linux-gnu/pkgconfig + - | WORKLOAD - Sets the frequency to low, medium or high + | based on the received policy setting. - Then use: + | **Note**: Not all policy types need all parameters. For + | example, BRANCH_RATIO only needs the vcpu-list + | parameter. +======================================= ====================================== - .. 
code-block:: console +After successful initialization, the VM Power Manager Guest CLI prompt +appears: - export PKG_CONFIG_LIBDIR=/usr/lib/i386-linux-gnu/pkgconfig +.. code-block:: console - You then use the make command as normal, which should find the 32-bit - version of the library, if it installed. If not, the application will - be built without the JSON interface functionality. + vm_power(guest)> -To build just the ``vm_power_manager`` application using ``meson/ninja``: +To change the frequency of an lcore, use a ``set_cpu_freq`` command similar +to the following: .. code-block:: console - export RTE_SDK=/path/to/rte_sdk - cd ${RTE_SDK} - meson build - cd build - ninja - meson configure -Dexamples=vm_power_manager/guest_cli - ninja - -The resulting binary will be ${RTE_SDK}/build/examples/guest_cli + set_cpu_freq {core_num} up|down|min|max -Running -~~~~~~~ +where, ``{core_num}`` is the lcore and channel to change frequency by +scaling up/down/min/max. -The standard *EAL* command line parameters are required: +To start an application, configure the power policy, and send it to the +host, use a command like the following: .. code-block:: console - ./build/guest_vm_power_mgr [EAL options] -- [guest options] + ./build/guest_vm_power_mgr -l 0-3 -n 4 -- --vm-name=ubuntu --policy=BRANCH_RATIO --vcpu-list=2-4 -The guest example uses a channel for each lcore enabled. For example, -to run on cores 0,1,2,3: +Once the VM Power Manager Guest CLI appears, issuing the 'send_policy now' command +will send the policy to the host: .. code-block:: console - ./build/guest_vm_power_mgr -l 0-3 - -Optionally, there is a list of command line parameter should the user wish to send a power -policy down to the host application. These parameters are as follows: + send_policy now - .. code-block:: console +Once the policy is sent to the host, the host application takes over the power monitoring +of the specified cores in the policy. - --vm-name {name of guest vm} +.. 
_power_man_requests: - This parameter allows the user to change the Virtual Machine name passed down to the - host application via the power policy. The default is "ubuntu2" +JSON Interface for Power Management Requests and Policies +--------------------------------------------------------- - .. code-block:: console +In addition to the command line interface for the host command, and a +``virtio-serial`` interface for VM power policies, there is also a JSON +interface through which power commands and policies can be sent. - --vcpu-list {list vm cores} +**Note**: This functionality adds a dependency on the Jansson library. +Install the Jansson development package on the system to avail of the +JSON parsing functionality in the app. Issue the ``apt-get install +libjansson-dev`` command to install the development package. The command +and package name may be different depending on your operating system. It +is worth noting that the app builds successfully if this package is not +present, but a warning displays during compilation, and the JSON parsing +functionality is not present in the app. - A comma-separated list of cores in the VM that the user wants the host application to - monitor. The list of cores in any vm starts at zero, and these are mapped to the - physical cores by the host application once the policy is passed down. - Valid syntax includes individual cores '2,3,4', or a range of cores '2-4', or a - combination of both '1,3,5-7' +Send a request or policy to the VM Power Manager by simply opening a +fifo file at ``/tmp/powermonitor/fifo``, writing a JSON string to that file, +and closing the file. - .. code-block:: console +The JSON string can be a power management request or a policy, and takes +the following format: - --busy-hours {list of busy hours} +.. code-block:: javascript - A comma-separated list of hours within which to set the core frequency to maximum. 
- Valid syntax includes individual hours '2,3,4', or a range of hours '2-4', or a - combination of both '1,3,5-7'. Valid hours are 0 to 23. + {"packet_type": { + "pair_1": value, + "pair_2": value + }} - .. code-block:: console +The ``packet_type`` header can contain one of two values, depending on +whether a power management request or policy is being sent. The two +possible values are ``instruction`` and ``policy`` and the expected name-value +pairs are different depending on which type is sent. - --quiet-hours {list of quiet hours} +The pairs are in the format of standard JSON name-value pairs. The value +type varies between the different name-value pairs, and may be integers, +strings, arrays, and so on. See :ref:`json_interface_ex` +for examples of policies and instructions and +:ref:`json_name_value_pair` for the supported names and value types. - A comma-separated list of hours within which to set the core frequency to minimum. - Valid syntax includes individual hours '2,3,4', or a range of hours '2-4', or a - combination of both '1,3,5-7'. Valid hours are 0 to 23. +.. _json_interface_ex: - .. code-block:: console +JSON Interface Examples +~~~~~~~~~~~~~~~~~~~~~~~ - --policy {policy type} +The following is an example JSON string that creates a time-profile +policy. - The type of policy. This can be one of the following values: - TRAFFIC - based on incoming traffic rates on the NIC. - TIME - busy/quiet hours policy. - BRANCH_RATIO - uses branch ratio counters to determine core busyness. - Not all parameters are needed for all policy types. For example, BRANCH_RATIO - only needs the vcpu-list parameter, not any of the hours. +.. 
code-block:: JSON + {"policy": { + "name": "ubuntu", + "command": "create", + "policy_type": "TIME", + "busy_hours":[ 17, 18, 19, 20, 21, 22, 23 ], + "quiet_hours":[ 2, 3, 4, 5, 6 ], + "core_list":[ 11 ] + }} -After successful initialization the user is presented with VM Power Manager Guest CLI: +The following is an example JSON string that removes the named policy. -.. code-block:: console +.. code-block:: JSON - vm_power(guest)> + {"policy": { + "name": "ubuntu", + "command": "destroy" + }} -To change the frequency of a lcore, use the set_cpu_freq command. -Where {core_num} is the lcore and channel to change frequency by scaling up/down/min/max. +The following is an example JSON string for a power management request. -.. code-block:: console +.. code-block:: JSON - set_cpu_freq {core_num} up|down|min|max + {"instruction": { + "name": "ubuntu", + "command": "power", + "unit": "SCALE_MAX", + "resource_id": 10 + }} To query the available frequences of an lcore, use the query_cpu_freq command. Where {core_num} is the lcore to query. @@ -783,3 +697,215 @@ will send the policy to the host: Once the policy is sent to the host, the host application takes over the power monitoring of the specified cores in the policy. + +.. _json_name_value_pair: + +JSON Name-value Pairs +~~~~~~~~~~~~~~~~~~~~~ + +The following are the name-value pairs supported by the JSON interface: + +- `avg_packet_thresh`_ +- `busy_hours`_ +- `command`_ +- `core_list`_ +- `mac_list`_ +- `max_packet_thresh`_ +- `name`_ +- `policy_type`_ +- `quiet_hours`_ +- `resource_id`_ +- `unit`_ +- `workload`_ + +avg_packet_thresh +^^^^^^^^^^^^^^^^^ + +================== =========================================================== + **Pair Name:** "avg_packet_thresh" +================== =========================================================== + **Description:** | The threshold below which the frequency is set to the minimum value for the + | TRAFFIC policy. 
If the traffic rate is above this value and below the + | maximum value, the frequency is set to medium. + **Type:** integer + **Values:** | The number of packets below which the TRAFFIC policy applies the minimum + | frequency, or the medium frequency if between the average and maximum + | thresholds. + **Required:** Yes + **Example:** ``"avg_packet_thresh": 100000`` +================== =========================================================== + +busy_hours +^^^^^^^^^^ + +================== =========================================================== + **Pair Name:** "busy_hours" +================== =========================================================== + **Description:** The hours of the day in which we scale up the cores for busy times. + **Type:** array of integers + **Values:** An array with a list of hour values (0-23). + **Required:** For the TIME policy only. + **Example:** ``"busy_hours":[ 17, 18, 19, 20, 21, 22, 23 ]`` +================== =========================================================== + +command +^^^^^^^ + +================== =========================================================== + **Pair Name:** "command" +================== =========================================================== + **Description:** | The type of packet to send to the VM Power Manager. It is possible to create + | or destroy a policy or send a direct command to adjust the frequency of a core, + | as is possible on the command line interface. + **Type:** | string + **Values:** Possible values are: + + - CREATE: Create a new policy. + - DESTROY: Remove an existing policy. + - POWER: Send an immediate command, max, min, and so on. 
+ + **Required:** Yes + **Example:** ``"command": "CREATE"`` +================== =========================================================== + +core_list +^^^^^^^^^ + +================== =========================================================== + **Pair Name:** "core_list" +================== =========================================================== + **Description:** The cores to which to apply a policy. + **Type:** array of integers + **Values:** An array with a list of virtual CPUs. + **Required:** For CREATE/DESTROY policy requests only. + **Example:** ``"core_list":[ 10, 11 ]`` +================== =========================================================== + +mac_list +^^^^^^^^ + +================== =========================================================== + **Pair Name:** "mac_list" +================== =========================================================== + **Description:** | When the policy is of type TRAFFIC, it is necessary to specify the MAC addresses + | that the host must monitor. + **Type:** | array of strings + **Values:** An array with a list of MAC address strings. + **Required:** For TRAFFIC policy types only. + **Example:** ``"mac_list":[ "de:ad:be:ef:01:01","de:ad:be:ef:01:02" ]`` +================== =========================================================== + + +max_packet_thresh +^^^^^^^^^^^^^^^^^ + +================== =========================================================== + **Pair Name:** "max_packet_thresh" +================== =========================================================== + **Description:** | In a policy of type TRAFFIC, the threshold value above which the frequency is set + | to the maximum. + **Type:** | integer + **Values:** | The number of packets per interval above which the TRAFFIC + | policy applies the maximum frequency. + **Required:** For the TRAFFIC policy only.
+ **Example:** ``"max_packet_thresh": 500000`` +================== =========================================================== + +name +^^^^ + +================== =========================================================== + **Pair Name:** "name" +================== =========================================================== + **Description:** | The name of the VM or host. Allows the parser to associate the policy with the + | relevant VM or host OS. + **Type:** | string + **Values:** Any valid string. + **Required:** Yes + **Example:** ``"name": "ubuntu2"`` +================== =========================================================== + +policy_type +^^^^^^^^^^^ + +================== =========================================================== + **Pair Name:** "policy_type" +================== =========================================================== + **Description:** | The type of policy to apply. See the ``--policy`` option description for more + | information. + **Type:** string + **Values:** Possible values are: + + - | TIME: Time-of-day policy. Scale the frequencies of the relevant cores up/down + | depending on busy and quiet hours. + - | TRAFFIC: Use statistics from the NIC and scale up and down accordingly. + - | WORKLOAD: Determine how heavily loaded the cores are and scale up and down + | accordingly. + - | BRANCH_RATIO: An out-of-band policy that looks at the ratio between branch + | hits and misses on a core and uses that information to determine how much + | packet processing a core is doing. + + **Required:** For ``CREATE`` and ``DESTROY`` policy requests only. 
+ **Example:** ``"policy_type": "TIME"`` +================== =========================================================== + +quiet_hours +^^^^^^^^^^^ + +================== =========================================================== + **Pair Name:** "quiet_hours" +================== =========================================================== + **Description:** | The hours of the day to scale down the cores for quiet times. + **Type:** array of integers + **Values:** | An array with a list of hour numbers with values in the range 0 to 23. + **Required:** For the TIME policy only. + **Example:** ``"quiet_hours":[ 2, 3, 4, 5, 6 ]`` +================== =========================================================== + +resource_id +^^^^^^^^^^^ + +================== =========================================================== + **Pair Name:** "resource_id" +================== =========================================================== + **Description:** The core to which to apply a power command. + **Type:** integer + **Values:** A valid core ID for the VM or host OS. + **Required:** For the ``POWER`` instruction only. + **Example:** ``"resource_id": 10`` +================== =========================================================== + +unit +^^^^ + +================== =========================================================== + **Pair Name:** "unit" +================== =========================================================== + **Description:** The type of power operation to apply in the command. + **Type:** string + **Values:** - SCALE_MAX: Scale the frequency of this core to the maximum. + - SCALE_MIN: Scale the frequency of this core to the minimum. + - SCALE_UP: Scale up the frequency of this core. + - SCALE_DOWN: Scale down the frequency of this core. + - ENABLE_TURBO: Enable Intel® Turbo Boost Technology for this core. + - DISABLE_TURBO: Disable Intel® Turbo Boost Technology for this core. + **Required:** For the ``POWER`` instruction only. 
+ **Example:** ``"unit": "SCALE_MAX"`` +================== =========================================================== + +workload +^^^^^^^^ + +================== =========================================================== + **Pair Name:** "workload" +================== =========================================================== + **Description:** In a policy of type WORKLOAD, it is necessary to specify + how heavy the workload is. + **Type:** string + **Values:** - HIGH: For cores running workloads that require high frequencies. + - MEDIUM: For cores running workloads that require medium frequencies. + - LOW: For cores running workloads that require low frequencies. + **Required:** For the ``WORKLOAD`` policy only. + **Example:** ``"workload": "MEDIUM"`` +================== =========================================================== + -- 2.17.1