From mboxrd@z Thu Jan 1 00:00:00 1970
To: David Hunt, dev@dpdk.org
Cc: konstantin.ananyev@intel.com, jingjing.wu@intel.com, santosh.shukla@caviumnetworks.com
References: <1505299459-24135-2-git-send-email-david.hunt@intel.com> <1507738735-24879-1-git-send-email-david.hunt@intel.com>
From: Ferruh Yigit
Date: Thu, 12 Oct 2017 01:23:48 +0100
In-Reply-To: <1507738735-24879-1-git-send-email-david.hunt@intel.com>
Subject: Re: [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest
List-Id: DPDK patches and discussions

On 10/11/2017 5:18 PM, David Hunt wrote:
> Policy Based Power Control for Guest
>
> This patchset adds the facility for a guest VM to send a policy down to the
> host that will allow the host to scale up/down CPU frequencies
> depending on the policy criteria, independently of the DPDK app running in
> the guest. This differs from the previous vm_power implementation, where
> individual scale up/down requests were sent from the guest to the host via
> virtio-serial.
>
> V9 patchset changes:
> * Rebased on top of the tip of the master branch.
> * Changed port_id from uint8 to uint16 due to changes elsewhere.
>
> V8 patchset changes:
> * Added Acks and Reviewed-bys to individual patches in the set so as to
> keep patchwork A/R/T flags properly in sync.
>
> V7 patchset changes:
> * Changed return code of rte_pmd_i40e_query_vfid_by_mac() from an
> int64_t to int.
>
> V6 patchset changes:
> * Fixed comments in header for rte_pmd_i40e_query_vfid_by_mac.
> * Changed rte_pmd_i40e_query_vfid_by_mac return code from uint to int
> as it can return negative error codes.
> * Removed bool enum from channel_commands.h, including stdbool.h instead.
> * Added #define VM_MAX_NAME_SZ 32 to channel_commands.h
> * Renamed a few variables to be more readable.
> * Added returns in a few places if failed to get info on domain.
> * Fixed power_manager_init to keep track of num_freqs for each core.
> * In power_manager_scale_core_med(), changed a hardcoded '5' to instead
> be calculated from the centre of the frequency list
> (global_core_freq_info[core_num].num_freqs / 2)
>
> V5 patchset changes:
> * Removed most of the #ifdef I40_PMD as it will be applicable to
> other PMDs in the future.
> * Changed the parameter of rte_pmd_i40e_query_vfid_by_mac from a uint64
> to a const struct ether_addr *, rather than casting it later in the
> function.
>
> V4 patchset changes:
> * None, re-post to mailing list under the correct email thread.
>
> V3 patchset changes:
> * Changed to using is_same_ether_addr() instead of looping through
> the MAC address bytes to compare them.
> * Tweaked some comments and wording in the i40e patch after review.
> * Added a patch to the set to add the new i40e function to the map file, so
> as to allow shared library builds. The power library API needs a cleanup
> in the next release, so will add an API/ABI warning for this cleanup in a
> separate patch.
>
> V2 patchset changes:
> * Removed APIs in ethdev layer.
> * Now just a single new API in the i40e driver for mapping VF MAC to
> VF index.
> * Moved new function from rte_rxtx.c to rte_pmd_i40e.c
> * Removed function for reading i40e register, moved to using the
> standard stats API.
> * Renamed i40e function to rte_pmd_i40e_query_vfid_by_mac
> * Cleaned up policy generation code.
>
> It's a modification of the vm_power_manager app that runs in the host, and
> the guest_vm_power_app example app that runs in the guest. This allows the
> guest to send down a policy to the host via virtio-serial, which then allows
> the host to scale up/down based on the criteria in the policy, resulting in
> quicker scale up/down than individual requests coming from the guest.
> It also means that the DPDK application running in the guest does not need
> to be modified in any way; it is unaware that its cores are being scaled
> up/down, reducing the effort in implementing a power-aware infrastructure.
>
> The usage model is as follows:
> 1. Set up the VFs and assign them to the guest in the usual way.
> 2. Run vm_power_manager on the host, creating a channel to the guest.
> 3. Start the guest_vm_power_mgr app on the guest, which establishes
> a virtio-serial channel to the host.
> 4. Send down the profile for the guest using the "send_profile now" command.
> There is an example profile hard-coded into guest_vm_power_mgr.
> 5. Stop the guest_vm_power_mgr and run your normal power-unaware application.
> 6. Send traffic into the VFs at varying traffic rates.
> Observe the frequency change on the host (turbostat -i 1)
>
> The sequence of code changes is as follows:
>
> A new function has been added to the i40e driver to allow mapping of
> a VF MAC to VF index.
>
> Next we make an addition to librte_power that adds an extra command to allow
> the passing of a policy structure from the guest to the host. This struct
> contains information like busy/quiet hour, packet throughput thresholds, etc.
>
> The next addition adds functionality to convert the virtual CPU (vcpu) IDs to
> physical CPU (pcpu) IDs so that the host can scale up/down the cores used
> in the guest.
>
> The remaining patches add functionality to process the policy and take action
> when the relevant trigger occurs to cause a frequency change.

Applied to dpdk/master, thanks.
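
For readers following the thread, the host-side policy evaluation described in the cover letter can be sketched roughly as below. This is an illustration only, not the code from the patchset: the struct layout, the names traffic_policy, policy_decide and freq_index_for, and the rule that busy/quiet hours take precedence over traffic thresholds are all assumptions made for the example. It also assumes the per-core frequency list is ordered from highest to lowest. The only detail taken from the cover letter is that the medium setting is picked from the centre of the list (num_freqs / 2).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical action the host takes for a core; not the patchset's enum. */
enum freq_action { FREQ_MIN, FREQ_MED, FREQ_MAX };

/* Hypothetical policy sent once from guest to host over virtio-serial.
 * The real structure is defined by the patchset and may differ. */
struct traffic_policy {
	uint32_t avg_pps_threshold; /* at or above this rate: medium frequency */
	uint32_t max_pps_threshold; /* at or above this rate: maximum frequency */
	bool busy_hours[24];        /* hours in which to force maximum */
	bool quiet_hours[24];       /* hours in which to force minimum */
};

/* Decide an action from the policy, the measured packet rate and the hour.
 * Assumption: time-of-day rules win over traffic thresholds. */
static enum freq_action
policy_decide(const struct traffic_policy *pol, uint32_t pps, unsigned int hour)
{
	hour %= 24;
	if (pol->busy_hours[hour])
		return FREQ_MAX;
	if (pol->quiet_hours[hour])
		return FREQ_MIN;
	if (pps >= pol->max_pps_threshold)
		return FREQ_MAX;
	if (pps >= pol->avg_pps_threshold)
		return FREQ_MED;
	return FREQ_MIN;
}

/* Map an action to an index into a core's frequency list.
 * Assumption: index 0 is the highest frequency, the last index the lowest.
 * The medium case uses the centre of the list, as in the V6 change note. */
static unsigned int
freq_index_for(enum freq_action act, unsigned int num_freqs)
{
	if (num_freqs == 0)
		return 0;
	switch (act) {
	case FREQ_MAX:
		return 0;
	case FREQ_MED:
		return num_freqs / 2;
	default:
		return num_freqs - 1;
	}
}
```

In this sketch the host would call policy_decide() periodically with the packet rate observed on the VF, then apply the index returned by freq_index_for() to each physical core backing the guest's vcpus. The guest application itself is not involved in either step.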