From: "Carew, Alan"
To: "dev@dpdk.org"
Date: Tue, 14 Oct 2014 12:37:51 +0000
Message-ID: <0E29434AEE0C3A4180987AB476A6F6306D28093B@IRSMSX109.ger.corp.intel.com>
References: <1412003903-9061-1-git-send-email-alan.carew@intel.com> <1413142571-23069-1-git-send-email-alan.carew@intel.com> <3264386.kAdiTFhMft@xps13>
In-Reply-To: <3264386.kAdiTFhMft@xps13>
Subject: Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management

Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, October 13, 2014 9:26 PM
> To: Carew, Alan
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
>
> Hi Alan,
>
> 2014-10-12 20:36, Alan Carew:
> > The following patches add two DPDK sample applications and an alternate
> > implementation of librte_power for use in virtualized environments.
> > The idea is to provide librte_power functionality from within a VM to address
> > the lack of MSRs to facilitate frequency changes from within a VM.
> > It is ideally suited for Haswell which provides per core frequency scaling.
> >
> > The current librte_power affects frequency changes via the acpi-cpufreq
> > 'userspace' power governor, accessed via sysfs.
>
> Something was preventing me from looking deeper in this big codebase,
> but I didn't know what sounds weird.
> Now I realize: the real problem is that virtualization transparency is
> broken for power management. So the right thing to do is to fix it in
> KVM. I think all this patchset is a huge workaround.
>
> Did you try to fix it with Qemu/KVM?
>
> --
> Thomas

When looking at the libvirt API it would seem to be a natural fit to have
power management sitting there, so in essence I would agree.

However, with a DPDK solution it would be possible to re-use the message
bus to pass information like device stats, application state, D-state
requests etc. to the host and allow a management layer (e.g. OpenStack)
to make informed decisions.

Also, the scope of adding power management to Qemu/KVM would be huge.
While the easier path is not always the best, the problem of power
management in VMs is both a DPDK problem (given that librte_power only
worked on the host) and a general virtualization problem that would be
better solved by those with direct knowledge of Qemu/KVM architecture and
influence on the direction of the Qemu project.

As it stands, the host backend is simply an example application that can
be replaced by a VMM or orchestration layer. By using Virtio-Serial it has
obvious leanings to Qemu, but even this could be easily swapped out for
XenBus, IVSHMEM, IP etc.

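To make the "swappable transport" point concrete, here is a minimal
sketch, assuming a guest-side request that is handed to whatever channel
is plugged in. It is not the code from the patchset: the struct layout,
the pm_* names and the in-memory stand-in transport are all hypothetical,
and a real backend would write to a Virtio-Serial port (or XenBus,
IVSHMEM, a socket) instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request packet; field names are illustrative only. */
struct pm_request {
    uint32_t lcore;   /* guest core the request refers to */
    uint32_t command; /* what the guest asks the host to do */
};

enum { PM_CMD_SCALE_UP = 1, PM_CMD_SCALE_DOWN = 2 };

/* A transport is reduced to "deliver these bytes to the host". */
struct pm_transport {
    const char *name;
    int (*send)(void *ctx, const void *buf, size_t len);
    void *ctx;
};

/* Build one request and pass it to whichever transport is in use. */
static int pm_send_request(const struct pm_transport *t,
                           uint32_t lcore, uint32_t command)
{
    struct pm_request req = { .lcore = lcore, .command = command };

    if (t == NULL || t->send == NULL)
        return -1;
    return t->send(t->ctx, &req, sizeof(req));
}

/* Stand-in transport for the sketch: copies into a memory buffer. */
struct mem_sink {
    unsigned char data[64];
    size_t len;
};

static int mem_send(void *ctx, const void *buf, size_t len)
{
    struct mem_sink *sink = ctx;

    if (len > sizeof(sink->data))
        return -1;
    memcpy(sink->data, buf, len);
    sink->len = len;
    return 0;
}
```

Swapping the channel then only means providing a different send callback;
the request-building code does not change.
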
If power management is eventually supported by hypervisors directly, then
we could also enable the option to switch to that environment. Currently
the librte_power implementations (VM or Host) can be selected dynamically
(environment auto-detection) or explicitly via rte_power_set_env(); adding
an arbitrary number of environments is relatively easy.

I hope this helps to clarify the approach.

Thanks,
Alan.
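
The environment selection mentioned above (explicit choice, falling back
to auto-detection) could look roughly like the sketch below. It is
self-contained and does not use the DPDK headers: rte_power_set_env() is
the real entry point named in this mail, but every pm_* name, the enum
values and the "guest channel present" detection flag here are
hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical environment identifiers for this sketch only. */
enum pm_env { PM_ENV_UNSET = 0, PM_ENV_HOST, PM_ENV_VM, PM_ENV_MAX };

/* Per-environment operations; a new environment adds one table entry. */
struct pm_ops {
    const char *name;
    int (*freq_up)(unsigned int lcore);
};

/* Placeholder backends: real ones would touch sysfs or a guest channel. */
static int host_freq_up(unsigned int lcore) { (void)lcore; return 0; }
static int vm_freq_up(unsigned int lcore)   { (void)lcore; return 0; }

static const struct pm_ops pm_backends[PM_ENV_MAX] = {
    [PM_ENV_HOST] = { "host", host_freq_up },
    [PM_ENV_VM]   = { "vm",   vm_freq_up },
};

static enum pm_env pm_current_env = PM_ENV_UNSET;

/* Explicit selection; rejects the unset marker and out-of-range values. */
static int pm_set_env(enum pm_env env)
{
    if (env == PM_ENV_UNSET || (unsigned int)env >= PM_ENV_MAX)
        return -1;
    pm_current_env = env;
    return 0;
}

/*
 * Initialisation: if no environment was chosen explicitly, fall back to
 * auto-detection. The caller supplies the detection result as a flag.
 */
static const struct pm_ops *pm_init(int guest_channel_present)
{
    if (pm_current_env == PM_ENV_UNSET)
        pm_current_env = guest_channel_present ? PM_ENV_VM : PM_ENV_HOST;
    return &pm_backends[pm_current_env];
}
```

A hypervisor-backed environment would then be one more enum value and one
more entry in the table, leaving callers unchanged.
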