From: "Carew, Alan"
To: "dev@dpdk.org"
Date: Thu, 11 Sep 2014 15:53:52 +0000
Message-ID: <0E29434AEE0C3A4180987AB476A6F6306D272B96@IRSMSX109.ger.corp.intel.com>
Subject: [dpdk-dev] [RFC] Virtual Machine Power Management

Hi folks,

I am currently working on a Power Management example application for a Virtual Machine environment running on qemu/KVM and would appreciate any feedback (with code to share shortly).

The basic idea is to provide librte_power functionality from within a VM to address the lack (for good reason) of MSRs to facilitate frequency changes from within a VM.

For those unfamiliar, librte_power effects frequency changes via the userspace power governor of the "acpi-cpufreq" driver, accessed via sysfs.

The VM implementation allows DPDK applications to request frequency changes via the librte_power API. However, requests are forwarded over a message bus to a host monitor daemon, which manages frequency changes for any number of VMs; the daemon itself then uses librte_power to honour the VM requests.

VM:   rte_power_freq_max ----> guest_channel_send_msg(pkt) ----> HOST
HOST: epoll_wait() ----> read(pkt) ----> validate_and_process_request() ----> get_pcpus_mask(vCPU) ----> power_manager_scale_core_max(pCPU_mask);

The architecture requires a number of components to achieve this:

Message Bus:
A means of forwarding frequency change requests to the host. I am using Virtio-Serial; it gives us a secure channel that can be configured in a number of ways. Each lcore in the VM has exclusive access to a channel. Each channel is configured as a serial device on the VM and as an AF_UNIX socket on the host. Both endpoints support poll/select/epoll. More information on Virtio-Serial is here: http://fedoraproject.org/wiki/Features/VirtioSerial

VM Application:
For each lcore, a channel is opened in non-blocking mode and frequency changes are just packets sent via "write" to the channel. The existing l3fwd-power application can be reused. Each packet has the format: command (Power), resource (core) and amount (min/max/up/down).

Host Monitor:
An epoll-based monitor to manage channel requests: frequency changes (after conversion of vCPU to pCPU), VM shutdown and error events.

Management CLI:
For channel management: adding channels to the host monitor, disabling/re-enabling VM requests to allow for manual core frequency management (via CLI), and inspecting vCPU to physical CPU pinning.

Power Management:
A wrapper around librte_power to enable frequency changes for a mask of cores. Running a virtual CPU on multiple physical CPUs is not ideal, but is supported. Sharing a physical CPU between multiple VMs is not supported: while it can be attempted, there is no coordination of requests from different VMs.

Thanks,
Alan
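
For illustration only, the request packet described under "VM Application" (command, resource, amount) and its "write" to a channel might look roughly like the sketch below. This is not the code to be shared: every name in it (struct channel_packet, the PKT_* constants, make_power_request, send_request, packet_is_valid) is hypothetical, as are the field widths and the assumption that the host checks the vCPU index against the number of vCPUs in the VM.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* All names below are hypothetical; the RFC only names the three fields. */
enum pkt_command { PKT_CMD_POWER = 1 };
enum pkt_amount {
    PKT_SCALE_MIN = 1,
    PKT_SCALE_MAX,
    PKT_SCALE_UP,
    PKT_SCALE_DOWN
};

/* Assumed layout: three fixed-width fields, so there is no padding. */
struct channel_packet {
    uint32_t command;   /* what is being requested (Power) */
    uint32_t resource;  /* which core (vCPU) the request applies to */
    uint32_t amount;    /* min/max/up/down */
};

/* Guest side: build a power request for one vCPU. */
static struct channel_packet
make_power_request(uint32_t vcpu, enum pkt_amount amount)
{
    struct channel_packet pkt = {
        .command = PKT_CMD_POWER,
        .resource = vcpu,
        .amount = (uint32_t)amount,
    };
    return pkt;
}

/*
 * Guest side: send one packet on a channel file descriptor.
 * Returns 0 if the whole packet was written, -1 otherwise. On a
 * non-blocking channel, EAGAIN is reported as -1 and the caller decides
 * whether to retry. A short write is also treated as failure here; real
 * code would need a policy for that case.
 */
static int
send_request(int fd, const struct channel_packet *pkt)
{
    ssize_t n;

    do {
        n = write(fd, pkt, sizeof(*pkt));
    } while (n < 0 && errno == EINTR);

    return (n == (ssize_t)sizeof(*pkt)) ? 0 : -1;
}

/* Host side: basic sanity checks before acting on a request. */
static int
packet_is_valid(const struct channel_packet *pkt, uint32_t num_vcpus)
{
    if (pkt->command != PKT_CMD_POWER)
        return 0;
    if (pkt->resource >= num_vcpus)
        return 0;
    return pkt->amount >= (uint32_t)PKT_SCALE_MIN &&
           pkt->amount <= (uint32_t)PKT_SCALE_DOWN;
}
```

In this sketch a guest lcore would call send_request() on its own channel, and the host monitor would run packet_is_valid() on each packet it reads before converting the vCPU to a pCPU mask.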