[dpdk-dev] [PATCH 00/10] VM Power Management

DPDK patches and discussions

* [dpdk-dev] [PATCH 00/10] VM Power Management
@ 2014-09-22 18:34 Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
                   ` (10 more replies)
  0 siblings, 11 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

The following patches add two DPDK sample applications and an alternate
implementation of librte_power for use in virtualized environments.
The idea is to provide librte_power functionality from within a VM to address
the lack of MSRs to facilitate frequency changes from within a VM.
It is ideally suited for Haswell which provides per core frequency scaling.

The current librte_power effects frequency changes via the acpi-cpufreq
'userspace' power governor, accessed via sysfs.
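
As a rough illustration of the sysfs mechanism mentioned above, the sketch
below formats the standard Linux cpufreq path for a core and writes a target
frequency (in kHz) to it. The helper names are hypothetical and are not taken
from librte_power; the write is only expected to succeed when the 'userspace'
governor is selected for that core and the caller has sufficient privileges.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the scaling_setspeed path for a core.
 * Returns 0 on success, -1 if the buffer is too small. */
int
demo_setspeed_path(char *buf, size_t len, unsigned core)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len,
			"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed", core);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: request freq_khz on a core via sysfs.
 * Returns 0 on success, -1 on any error. */
int
demo_set_freq_khz(unsigned core, unsigned long freq_khz)
{
	char path[128];
	FILE *f;
	int ok;

	if (demo_setspeed_path(path, sizeof(path), core) < 0)
		return -1;
	f = fopen(path, "w");
	if (f == NULL) {
		fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
		return -1;
	}
	ok = fprintf(f, "%lu", freq_khz) > 0;
	/* The kernel may reject the value when the buffered write is flushed */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Inside a VM this sysfs interface is typically not available, which is the gap
the channel-based approach in this series is meant to fill.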

General Overview (more information in each patch that follows):
The VM Power Management solution provides two components:

 1) VM: Allows a DPDK application in a VM to reuse the librte_power
 interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
 where the re-implementation of librte_power simply forwards the requests for
 frequency change to a host-based monitor. The host monitor itself uses
 librte_power.
 Each lcore channel corresponds to a
 serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
 which is opened in non-blocking mode.
 While each virtual CPU can be mapped to multiple physical CPUs, it is
 recommended that each vCPU be mapped to a single core only.

 2) Host: The host monitor is managed by a CLI. It allows for adding qemu/KVM
 virtual machines and associated channels to the monitor, manually changing
 CPU frequency, inspecting the state of VMs and vCPU to pCPU pinning, and
 managing channels.
 Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
 sockets which follow a specific naming convention,
 i.e. /tmp/powermonitor/<vm_name>.<channel_number>.
 Each channel has a 1:1 mapping to a VM endpoint,
 i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>.
 Host channel endpoints are opened in non-blocking mode and are monitored via
 epoll.
 Requests over each channel to change frequency are forwarded to the original
 librte_power.
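
The naming convention for the two ends of a channel can be sketched as
follows. This is only an illustration: the demo_* helpers are hypothetical and
are not part of the patches, and the sketch assumes that channel number and
lcore number coincide, as the 1:1 mapping described above suggests.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Host side: /tmp/powermonitor/<vm_name>.<channel_number> */
int
demo_host_path(char *buf, size_t len, const char *vm_name,
		unsigned channel_num)
{
	int n;

	assert(buf != NULL && vm_name != NULL);
	n = snprintf(buf, len, "/tmp/powermonitor/%s.%u", vm_name, channel_num);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Guest side: /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num> */
int
demo_guest_path(char *buf, size_t len, unsigned lcore_num)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len,
			"/dev/virtio-ports/virtio.serial.port.poweragent.%u", lcore_num);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Open the guest endpoint for an lcore in non-blocking mode.
 * Returns the file descriptor, or -1 on error. */
int
demo_open_guest_channel(unsigned lcore_num)
{
	char path[128];
	int fd;

	if (demo_guest_path(path, sizeof(path), lcore_num) < 0)
		return -1;
	fd = open(path, O_RDWR | O_NONBLOCK);
	if (fd < 0)
		fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
	return fd;
}
```

For a VM named "vm0", channel 2 would then pair /tmp/powermonitor/vm0.2 on the
host with /dev/virtio-ports/virtio.serial.port.poweragent.2 in the guest.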

Channels must be manually configured as qemu-kvm command line arguments or
in the libvirt domain definition (XML), e.g.
<controller type='virtio-serial' index='0'>
 <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<channel type='unix'>
  <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
  <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>'/>
  <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
</channel>

Multiple channels can be configured by specifying multiple <channel>
elements, replacing <vm_name> and <channel_num> in each.
<N> (port number) should be incremented by 1 for each new channel element.
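
When many channels are needed, the <channel> elements can be generated rather
than typed by hand. The sketch below is one possible way to do that; the
function name is hypothetical, and the starting port number is left to the
caller because the text above only says that it increases by 1 per element.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format one <channel> element following the template above.
 * Returns the number of characters written, or -1 on error. */
int
demo_channel_xml(char *buf, size_t len, const char *vm_name,
		unsigned channel_num, unsigned port)
{
	int n;

	assert(buf != NULL && vm_name != NULL);
	if (strlen(vm_name) == 0)
		return -1;
	n = snprintf(buf, len,
			"<channel type='unix'>\n"
			"  <source mode='bind' path='/tmp/powermonitor/%s.%u'/>\n"
			"  <target type='virtio' "
			"name='virtio.serial.port.poweragent.%u'/>\n"
			"  <address type='virtio-serial' controller='0' bus='0' "
			"port='%u'/>\n"
			"</channel>\n",
			vm_name, channel_num, channel_num, port);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Calling it in a loop with channel_num = i and port = first_port + i yields one
element per channel, ready to paste into the domain definition.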
More information on Virtio-Serial can be found here:
http://fedoraproject.org/wiki/Features/VirtioSerial
To enable the Hypervisor creation of channels, the host endpoint directory
must be created with qemu permissions:
mkdir /tmp/powermonitor
chown qemu:qemu /tmp/powermonitor

The host application runs on two separate lcores:
Core N) CLI: For management of Virtual Machines, adding channels to the Monitor
 thread, inspecting state and manually setting CPU frequency [PATCH 02/10]
Core N+1) Monitor Thread: An epoll-based infinite loop that waits on channel
 events from VMs and calls the corresponding librte_power functions.

A sample application is also provided to run on Virtual Machines. This
application provides a CLI to manually set the frequency of a
vCPU [PATCH 10/10]

The current l3fwd-power sample application can also be run on a VM.

Alan Carew (10):
  Channel Manager and Monitor for VM Power Management(Host).
  VM Power Management CLI(Host).
  CPU Frequency Power Management(Host).
  CPU Frequency Power Management(Host).
  VM communication channels for VM Power Management(Guest).
  Alternate implementation of librte_power for VM Power
    Management(Guest).
  Packet format for VM Power Management(Host and Guest).
  Build system integration for VM Power Management(Guest and Host)
  VM Power Management Unit Tests(Guest)
  VM Power Management CLI(Guest).

 app/test/Makefile                                  |   1 +
 app/test/autotest_data.py                          |  13 +
 app/test/test_power_vm.c                           | 215 +++++++
 config/common_linuxapp                             |   6 +
 examples/vm_power_manager/Makefile                 |  57 ++
 examples/vm_power_manager/channel_manager.c        | 643 +++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h        | 273 +++++++++
 examples/vm_power_manager/channel_monitor.c        | 228 ++++++++
 examples/vm_power_manager/channel_monitor.h        | 102 ++++
 examples/vm_power_manager/guest_cli/Makefile       |  56 ++
 examples/vm_power_manager/guest_cli/main.c         |  87 +++
 examples/vm_power_manager/guest_cli/main.h         |  52 ++
 .../guest_cli/vm_power_cli_guest.c                 | 155 +++++
 .../guest_cli/vm_power_cli_guest.h                 |  55 ++
 examples/vm_power_manager/main.c                   | 113 ++++
 examples/vm_power_manager/main.h                   |  52 ++
 examples/vm_power_manager/power_manager.c          | 234 ++++++++
 examples/vm_power_manager/power_manager.h          | 191 ++++++
 examples/vm_power_manager/vm_power_cli.c           | 567 ++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h           |  47 ++
 lib/Makefile                                       |   1 +
 lib/librte_power_vm/Makefile                       |  49 ++
 lib/librte_power_vm/channel_commands.h             |  68 +++
 lib/librte_power_vm/guest_channel.c                | 150 +++++
 lib/librte_power_vm/guest_channel.h                |  89 +++
 lib/librte_power_vm/rte_power.c                    | 146 +++++
 mk/rte.app.mk                                      |   4 +
 27 files changed, 3654 insertions(+)
 create mode 100644 app/test/test_power_vm.c
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h
 create mode 100644 lib/librte_power_vm/Makefile
 create mode 100644 lib/librte_power_vm/channel_commands.h
 create mode 100644 lib/librte_power_vm/guest_channel.c
 create mode 100644 lib/librte_power_vm/guest_channel.h
 create mode 100644 lib/librte_power_vm/rte_power.c

-- 
1.9.3


* [dpdk-dev] [PATCH 01/10] Channel Manager and Monitor for VM Power Management(Host).
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 02/10] VM Power Management CLI(Host) Alan Carew
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

The manager is responsible for adding communication channels to the Monitor
thread and for tracking and reporting VM state. It employs the libvirt API for
synchronization with the KVM Hypervisor. The manager interacts with the
Hypervisor to discover the mapping of virtual CPUs (vCPUs) to the host
physical CPUs (pCPUs) and to inspect the VM running state.

The manager provides the following functionality to the CLI:
1) Connect to a libvirtd instance, default: qemu:///system
2) Add a VM to an internal list, each VM is identified by a "name" which must
   correspond to a valid libvirt Domain Name.
3) Add communication channels associated with a VM to the epoll based Monitor
   thread.
   The channels must exist and be in the form of:
   /tmp/powermonitor/<vm_name>.<channel_number>. Each channel is a
   Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
   non-blocking mode.
   Each VM can have a maximum of 64 channels associated with it.
4) Disable or re-enable VM communication channels. Channels, once added to the
   Monitor thread, remain under that thread's control; however, acting on
   channel requests can be disabled and re-enabled via the CLI.

The monitor is an epoll-based infinite loop running in a separate thread that
waits on channel events from VMs and calls the corresponding functions. Channel
definitions from the manager are registered via the epoll event opaque pointer
when calling epoll_ctl(EPOLL_CTL_ADD). This allows for obtaining the channel's
file descriptor for reading EPOLLIN events and mapping the vCPU to pCPU(s)
associated with a request from a particular VM.
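
The opaque-pointer registration described above can be sketched with plain
epoll calls. In this illustration struct demo_channel is a hypothetical,
reduced stand-in for the channel definition, and a socketpair plays the role
of a VM channel so that the example does not need a running guest.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for the channel definition, two fields only */
struct demo_channel {
	int fd;
	unsigned channel_num;
};

/* Register a channel; its pointer travels in the epoll opaque field. */
int
demo_monitor_add(int epfd, struct demo_channel *chan)
{
	struct epoll_event ev;

	assert(chan != NULL);
	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.ptr = chan;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, chan->fd, &ev);
}

/* Wait for one readable channel; NULL on timeout or error. */
struct demo_channel *
demo_monitor_wait(int epfd, int timeout_ms)
{
	struct epoll_event ev;

	if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
		return NULL;
	if (!(ev.events & EPOLLIN))
		return NULL;
	return ev.data.ptr;
}

/* Round trip: returns the channel number recovered from the event, or -1. */
int
demo_roundtrip(void)
{
	int sv[2], epfd, ret = -1;
	struct demo_channel chan, *ready;
	char byte;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	epfd = epoll_create1(0);
	if (epfd < 0)
		goto out_sock;
	chan.fd = sv[0];
	chan.channel_num = 7;
	if (demo_monitor_add(epfd, &chan) < 0)
		goto out_ep;
	/* The other end of the pair acts as the guest sending a request */
	if (write(sv[1], "x", 1) != 1)
		goto out_ep;
	ready = demo_monitor_wait(epfd, 1000);
	if (ready != NULL && read(ready->fd, &byte, 1) == 1)
		ret = (int)ready->channel_num;
out_ep:
	close(epfd);
out_sock:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Because the event carries the pointer rather than a bare descriptor, the
handler reaches both the fd and the per-channel state without a lookup table.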

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/channel_manager.c | 643 ++++++++++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h | 273 ++++++++++++
 examples/vm_power_manager/channel_monitor.c | 228 ++++++++++
 examples/vm_power_manager/channel_monitor.h | 102 +++++
 4 files changed, 1246 insertions(+)
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h

diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
new file mode 100644
index 0000000..7ec798f
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.c
@@ -0,0 +1,643 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/select.h>
+
+#include <rte_config.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_log.h>
+#include <rte_atomic.h>
+#include <rte_spinlock.h>
+
+#include <libvirt/libvirt.h>
+
+#include "channel_manager.h"
+#include "channel_commands.h"
+#include "channel_monitor.h"
+
+
+#define SOCKET_PATH "/tmp/powermonitor/"
+
+#define RTE_LOGTYPE_CHANNEL_MANAGER RTE_LOGTYPE_USER1
+
+#define ITERATIVE_BITMASK_CHECK_64(mask_u64b, i) \
+		for (i = 0; mask_u64b; mask_u64b &= ~(1ULL << i++)) \
+			if ((mask_u64b >> i) & 1) \
+
+/* Global pointer to libvirt connection */
+static virConnectPtr global_vir_conn_ptr;
+
+/*
+ * Represents a single Virtual Machine
+ */
+struct virtual_machine_info {
+	char name[MAX_NAME_LEN];
+	uint64_t pcpu_mask[MAX_VCPU];
+	struct channel_info *channels[MAX_VM_CHANNELS];
+	uint64_t channel_mask;
+	uint8_t num_channels;
+	enum vm_status status;
+	virDomainPtr domainPtr;
+	virDomainInfo info;
+	rte_spinlock_t config_spinlock;
+	LIST_ENTRY(virtual_machine_info) vms_info;
+};
+
+LIST_HEAD(, virtual_machine_info) vm_list_head;
+
+static struct virtual_machine_info *
+find_domain_by_name(const char *name)
+{
+	struct virtual_machine_info *info;
+	LIST_FOREACH(info, &vm_list_head, vms_info) {
+		if (!strncmp(info->name, name, MAX_NAME_LEN-1))
+			return info;
+	}
+	return NULL;
+}
+
+static void
+disconnect_hypervisor(void)
+{
+	if (global_vir_conn_ptr != NULL) {
+		virConnectClose(global_vir_conn_ptr);
+		global_vir_conn_ptr = NULL;
+	}
+}
+
+static int
+connect_hypervisor(const char *path)
+{
+	if (global_vir_conn_ptr != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection"
+				" already established\n", path);
+		return -1;
+	}
+	global_vir_conn_ptr = virConnectOpen(path);
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error failed to open connection to "
+				"Hypervisor '%s'\n", path);
+		return -1;
+	}
+	return 0;
+}
+
+static int
+update_pcpus_mask(struct virtual_machine_info *vm_info)
+{
+	size_t maplen = VIR_CPU_MAPLEN(RTE_MAX_LCORE);
+	unsigned char cpumap[RTE_MAX_LCORE*maplen];
+	unsigned int i, j;
+
+	if (virDomainGetVcpuPinInfo(vm_info->domainPtr,
+			vm_info->info.nrVirtCpu,
+			cpumap,
+			maplen,
+			VIR_DOMAIN_DEVICE_MODIFY_CURRENT) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting Vcpu Pin Info\n");
+		return -1;
+	}
+	for (i = 0; i < vm_info->info.nrVirtCpu; i++) {
+		vm_info->pcpu_mask[i] = 0;
+		for (j = 0; j < maplen; j++)
+			vm_info->pcpu_mask[i] += VIR_CPU_USABLE(cpumap, maplen, i, j);
+	}
+	return 0;
+}
+
+uint64_t
+get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu)
+{
+	struct virtual_machine_info *vm_info =
+			(struct virtual_machine_info *)chan_info->priv_info;
+	return vm_info->pcpu_mask[vcpu];
+}
+
+static inline int
+channel_exists(struct virtual_machine_info *vm_info, unsigned int channel_num)
+{
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	if (vm_info->channel_mask & (1ULL << channel_num)) {
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+		return 1;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+int
+add_vm(const char *vm_name)
+{
+	struct virtual_machine_info *new_domain;
+	virDomainPtr dom_ptr;
+
+	if (find_domain_by_name(vm_name) != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add VM: VM '%s' "
+						"already exists\n", vm_name);
+		return -1;
+	}
+
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "No connection to hypervisor exists\n");
+		return -1;
+	}
+	dom_ptr = virDomainLookupByName(global_vir_conn_ptr, vm_name);
+	if (dom_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error on VM lookup with libvirt: "
+				"VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	new_domain = rte_malloc("virtual_machine_info", sizeof(*new_domain),
+			CACHE_LINE_SIZE);
+	if (new_domain == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to allocate memory for VM "
+				"info\n");
+		return -1;
+	}
+	new_domain->domainPtr = dom_ptr;
+	if (virDomainGetInfo(new_domain->domainPtr, &new_domain->info) != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get libvirt VM info\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	if (new_domain->info.nrVirtCpu > MAX_VCPU) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error the number of virtual CPUs(%u) is "
+				"greater than allowable(%d)\n", new_domain->info.nrVirtCpu,
+				MAX_VCPU);
+		rte_free(new_domain);
+		return -1;
+	}
+	if (update_pcpus_mask(new_domain) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting physical CPU pinning\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	strncpy(new_domain->name, vm_name, sizeof(new_domain->name));
+	new_domain->channel_mask = 0;
+	new_domain->num_channels = 0;
+
+	if (!virDomainIsActive(dom_ptr))
+		new_domain->status = VM_INACTIVE;
+	else
+		new_domain->status = VM_ACTIVE;
+
+	rte_spinlock_init(&(new_domain->config_spinlock));
+	LIST_INSERT_HEAD(&vm_list_head, new_domain, vms_info);
+	return 0;
+}
+
+static int
+open_non_blocking_channel(struct channel_info *info)
+{
+	int ret, flags;
+	struct sockaddr_un sock_addr;
+	fd_set soc_fd_set;
+	struct timeval tv;
+
+	info->fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (info->fd == -1) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error(%s) creating socket for '%s'\n",
+				strerror(errno),
+				info->channel_path);
+		return -1;
+	}
+	sock_addr.sun_family = AF_UNIX;
+	memcpy(&sock_addr.sun_path, info->channel_path,
+			strlen(info->channel_path)+1);
+
+	/* Get current flags */
+	flags = fcntl(info->fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) fcntl get flags socket for"
+				" '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	/* Set to Non Blocking */
+	flags |= O_NONBLOCK;
+	if (fcntl(info->fd, F_SETFL, flags) < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) setting non-blocking "
+				"socket for '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	ret = connect(info->fd, (struct sockaddr *)&sock_addr,
+			sizeof(sock_addr));
+	if (ret < 0) {
+		/* ECONNREFUSED error is given when VM is not active */
+		if (errno == ECONNREFUSED) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "VM is not active or has not "
+					"activated its endpoint to channel %s\n",
+					info->channel_path);
+			return -1;
+		}
+		/* Wait for tv_sec if in progress */
+		else if (errno == EINPROGRESS) {
+			tv.tv_sec = 2;
+			tv.tv_usec = 0;
+			FD_ZERO(&soc_fd_set);
+			FD_SET(info->fd, &soc_fd_set);
+			if (select(info->fd+1, NULL, &soc_fd_set, NULL, &tv) <= 0) {
+				RTE_LOG(WARNING, CHANNEL_MANAGER, "Timeout or error on channel "
+						"'%s'\n", info->channel_path);
+				return -1;
+			}
+		} else {
+			/* Any other error */
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) connecting socket"
+					" for '%s'\n", strerror(errno), info->channel_path);
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static int
+setup_channel_info(struct virtual_machine_info **vm_info_dptr,
+		struct channel_info **chan_info_dptr, unsigned channel_num)
+{
+	struct channel_info *chan_info = *chan_info_dptr;
+	struct virtual_machine_info *vm_info = *vm_info_dptr;
+
+	chan_info->channel_num = channel_num;
+	chan_info->priv_info = (void *)vm_info;
+	chan_info->status = CHANNEL_DISCONNECTED;
+	if (open_non_blocking_channel(chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not open channel: "
+				"'%s' for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+	}
+	if (add_channel_to_monitor(&chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not add channel: "
+				"'%s' to epoll ctl for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+
+	}
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->num_channels++;
+	vm_info->channel_mask |= 1ULL << channel_num;
+	vm_info->channels[channel_num] = chan_info;
+	chan_info->status = CHANNEL_CONNECTED;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+int
+add_all_channels(const char *vm_name)
+{
+	DIR *d;
+	struct dirent *dir;
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char *token, *remaining, *tail_ptr;
+	char socket_name[PATH_MAX];
+	unsigned channel_num;
+	int num_channels_enabled = 0;
+
+	/* verify VM exists */
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' not found"
+				" during channel discovery\n", vm_name);
+		return 0;
+	}
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = VM_INACTIVE;
+		return 0;
+	}
+	d = opendir(SOCKET_PATH);
+	if (d == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error opening directory '%s': %s\n",
+				SOCKET_PATH, strerror(errno));
+		return -1;
+	}
+	while ((dir = readdir(d)) != NULL) {
+		if (!strncmp(dir->d_name, ".", 1) ||
+				!strncmp(dir->d_name, "..", 2))
+			continue;
+
+		snprintf(socket_name, sizeof(socket_name), "%s", dir->d_name);
+		remaining = socket_name;
+		/* Extract vm_name from "<vm_name>.<channel_num>" */
+		token = strsep(&remaining, ".");
+		if (remaining == NULL)
+			continue;
+		if (strncmp(vm_name, token, MAX_NAME_LEN))
+			continue;
+
+		/* remaining should contain only <channel_num> */
+		errno = 0;
+		channel_num = (unsigned int)strtol(remaining, &tail_ptr, 0);
+		if ((errno != 0) || (remaining[0] == '\0') ||
+				(*tail_ptr != '\0') || tail_ptr == NULL) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Malformed channel name"
+					" '%s' found, it should be in the form of "
+					"'<guest_name>.<channel_num>(decimal)'\n",
+					dir->d_name);
+			continue;
+		}
+		if (channel_num >= MAX_VM_CHANNELS) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Channel number(%u) is "
+					"greater than max allowable: %d, skipping '%s%s'\n",
+					channel_num, MAX_VM_CHANNELS-1, SOCKET_PATH,
+					dir->d_name);
+			continue;
+		}
+		/* if channel has not been added previously */
+		if (channel_exists(vm_info, channel_num))
+			continue;
+
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s%s'\n", SOCKET_PATH, dir->d_name);
+			continue;
+		}
+
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s", SOCKET_PATH,
+				dir->d_name);
+
+		if (setup_channel_info(&vm_info, &chan_info, channel_num) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+
+		num_channels_enabled++;
+	}
+	closedir(d);
+	return num_channels_enabled;
+}
+
+int
+add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char socket_path[PATH_MAX];
+	unsigned i;
+	int num_channels_enabled = 0;
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add channels: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = VM_INACTIVE;
+		return 0;
+	}
+	for (i = 0; i < len_channel_list; i++) {
+		if (channel_list[i] >= MAX_VM_CHANNELS)
+			continue;
+		if (channel_exists(vm_info, channel_list[i])) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel already exists, skipping "
+					"'%s.%u'\n", vm_name, channel_list[i]);
+			continue;
+		}
+
+		snprintf(socket_path, sizeof(socket_path), "%s%s.%u", SOCKET_PATH,
+				vm_name, channel_list[i]);
+		errno = 0;
+		if (access(socket_path, F_OK) < 0) {
+				RTE_LOG(ERR, CHANNEL_MANAGER, "Channel path '%s' error: "
+						"%s\n", socket_path, strerror(errno));
+			continue;
+		}
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s'\n", socket_path);
+			continue;
+		}
+		snprintf(chan_info->channel_path,
+							sizeof(chan_info->channel_path), "%s%s.%u",
+							SOCKET_PATH, vm_name, channel_list[i]);
+		if (setup_channel_info(&vm_info, &chan_info, channel_list[i]) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+		num_channels_enabled++;
+
+	}
+	return num_channels_enabled;
+}
+
+int
+remove_channel(struct channel_info *chan_info)
+{
+	struct virtual_machine_info *vm_info;
+	close(chan_info->fd);
+	vm_info = (struct virtual_machine_info *)chan_info->priv_info;
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->channel_mask &= ~(1ULL << chan_info->channel_num);
+	vm_info->num_channels--;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	rte_free(chan_info);
+	return 0;
+}
+
+int
+remove_vm(const char *vm_name)
+{
+	struct virtual_machine_info *vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM: VM '%s' "
+						"not found\n", vm_name);
+		return -1;
+	}
+	rte_spinlock_lock(&vm_info->config_spinlock);
+	if (vm_info->num_channels != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM '%s', there are "
+				"%"PRIu8" channels still active\n",
+				vm_name, vm_info->num_channels);
+		rte_spinlock_unlock(&vm_info->config_spinlock);
+		return -1;
+	}
+	LIST_REMOVE(vm_info, vms_info);
+	rte_spinlock_unlock(&vm_info->config_spinlock);
+	rte_free(vm_info);
+	return 0;
+}
+
+int
+set_channel_status_all(const char *vm_name, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned int i;
+	uint64_t mask;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_CONNECTED || status == CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to disable channels: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+			vm_info->channels[i]->status = status;
+			num_channels_changed++;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return num_channels_changed;
+
+}
+
+int
+set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_CONNECTED || status == CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set status: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+	for (i = 0; i < len_channel_list; i++) {
+		if (channel_exists(vm_info, channel_list[i])) {
+			rte_spinlock_lock(&(vm_info->config_spinlock));
+			vm_info->channels[channel_list[i]]->status = status;
+			rte_spinlock_unlock(&(vm_info->config_spinlock));
+			num_channels_changed++;
+		}
+	}
+	return num_channels_changed;
+}
+
+int
+get_info_vm(const char *vm_name, struct vm_info *info)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i, channel_num = 0;
+	uint64_t mask;
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+	info->status = VM_ACTIVE;
+	if (!virDomainIsActive(vm_info->domainPtr))
+		info->status = VM_INACTIVE;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+			info->channels[channel_num].channel_num = i;
+			memcpy(info->channels[channel_num].channel_path,
+					vm_info->channels[i]->channel_path, PATH_MAX);
+			info->channels[channel_num].status = vm_info->channels[i]->status;
+			info->channels[channel_num].fd = vm_info->channels[i]->fd;
+			channel_num++;
+	}
+	info->num_channels = channel_num;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	info->num_vcpus = vm_info->info.nrVirtCpu;
+
+	memcpy(info->name, vm_info->name, sizeof(vm_info->name));
+
+	if (update_pcpus_mask(vm_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' unable to update vCPU Pinning\n",
+				vm_name);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_manager_init(const char *path)
+{
+	LIST_INIT(&vm_list_head);
+	if (connect_hypervisor(path) < 0)
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to initialize channel manager\n");
+	return 0;
+}
+
+void
+channel_manager_exit(void)
+{
+	unsigned int i;
+	uint64_t mask;
+	struct virtual_machine_info *vm_info;
+
+	LIST_FOREACH(vm_info, &vm_list_head, vms_info) {
+
+		rte_spinlock_lock(&(vm_info->config_spinlock));
+		mask = vm_info->channel_mask;
+		ITERATIVE_BITMASK_CHECK_64(mask, i) {
+				remove_channel_from_monitor(vm_info->channels[i]);
+				close(vm_info->channels[i]->fd);
+				rte_free(vm_info->channels[i]);
+		}
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+		LIST_REMOVE(vm_info, vms_info);
+		rte_free(vm_info);
+	}
+	disconnect_hypervisor();
+}
+
diff --git a/examples/vm_power_manager/channel_manager.h b/examples/vm_power_manager/channel_manager.h
new file mode 100644
index 0000000..d114201
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.h
@@ -0,0 +1,273 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MANAGER_H_
+#define CHANNEL_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <linux/limits.h>
+#include <rte_atomic.h>
+
+/* Maximum number of channels per VM */
+#define MAX_VM_CHANNELS 64
+
+/* Maximum name length including '\0' terminator */
+#define MAX_NAME_LEN    64
+
+/* Maximum number of virtual CPUs that can be assigned to a VM */
+#define MAX_VCPU        64
+
+/* Default Hypervisor Path for libvirt(qemu/KVM) */
+#define DEFAULT_HV_PATH "qemu:///system"
+
+/* Communication Channel Status */
+enum channel_status { CHANNEL_DISCONNECTED = 0, CHANNEL_CONNECTED,
+	CHANNEL_DISABLED, CHANNEL_PROCESSING};
+
+/* VM libvirt(qemu/KVM) connection status */
+enum vm_status { VM_INACTIVE = 0, VM_ACTIVE};
+
+/*
+ *  Represents a single and exclusive VM channel that exists between a guest and
+ *  the host.
+ */
+struct channel_info {
+	char channel_path[PATH_MAX]; /**< Path to host socket */
+	volatile uint32_t status;    /**< Connection status(enum channel_status) */
+	int fd;                      /**< AF_UNIX socket fd */
+	unsigned int channel_num;    /**< /tmp/powermonitor/<vm_name>.channel_num */
+	void *priv_info;             /**< Pointer to private info, do not modify */
+};
+
+/* Represents a single VM instance used to return internal information about
+ * a VM */
+struct vm_info {
+	char name[MAX_NAME_LEN];                      /**< VM name */
+	enum vm_status status;                        /**< libvirt status */
+	uint64_t pcpu_mask[MAX_VCPU];                 /**< pCPU mask for each vCPU */
+	unsigned num_vcpus;                           /**< Number of vCPUs */
+	struct channel_info channels[MAX_VM_CHANNELS];/**< Array of channel_info */
+	unsigned num_channels;                        /**< Number of channels */
+};
+
+/**
+ * Initialize the Channel Manager resources and connect to the Hypervisor
+ * specified in path.
+ * This must be successfully called first before calling any other functions.
+ * It must only be called once.
+ *
+ * @param path
+ *  Must be a local path, e.g. qemu:///system.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_manager_init(const char *path);
+
+/**
+ * Free resources associated with the Channel Manager.
+ *
+ * This function takes no parameters; the hypervisor path is only needed by
+ * channel_manager_init().
+ *
+ * @return
+ *  None
+ */
+void channel_manager_exit(void);
+
+/**
+ * Get the Physical CPU mask for VM lcore channel(vcpu); the mask is the
+ * return value.
+ * It is not thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info
+ *
+ * @param vcpu
+ *  The virtual CPU to query.
+ *
+ *
+ * @return
+ *  - 0 on error.
+ *  - >0 on success.
+ */
+uint64_t get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu);
+
+/**
+ * Add a VM as specified by name to the Channel Manager. The name must
+ * correspond to a valid libvirt domain name.
+ * This is required prior to adding channels.
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_vm(const char *name);
+
+/**
+ * Remove a previously added Virtual Machine from the Channel Manager
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_vm(const char *name);
+
+/**
+ * Add all available channels to the VM as specified by name.
+ * Only channels whose paths have the form
+ * /tmp/powermonitor/<vm_name>.<channel_number> will be parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ */
+int add_all_channels(const char *vm_name);
+
+/**
+ * Add the channel numbers in channel_list to the domain specified by name.
+ * Only channels whose paths have the form
+ * /tmp/powermonitor/<vm_name>.<channel_number> will be parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to add channels.
+ *
+ * @param channel_list
+ *  Pointer to list of unsigned integers, representing the channel number to add
+ *  It must be allocated outside of this function.
+ *
+ * @param num_channels
+ *  The number of entries in channel_list
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ *  - 0 for error
+ */
+int add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned num_channels);
+
+/**
+ * Remove a channel definition from the channel manager. This must only be
+ * called from the channel monitor thread.
+ *
+ * @param chan_info
+ *  Pointer to a valid struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel(struct channel_info *chan_info);
+
+/**
+ * For all channels associated with a Virtual Machine name, update the
+ * connection status. Valid states are CHANNEL_CONNECTED or CHANNEL_DISABLED
+ * only.
+ *
+ * Unlike set_channel_status(), no channel list is taken: every channel
+ * associated with the VM is updated.
+ *
+ * @param name
+ *  Name of the Virtual Machine whose channels are to be modified.
+ *
+ * @param status
+ *  The status to set on each channel, either CHANNEL_CONNECTED or
+ *  CHANNEL_DISABLED.
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status_all(const char *name, enum channel_status status);
+
+/**
+ * For all channels in channel_list associated with a Virtual Machine name
+ * update the connection status of each.
+ * Valid states are CHANNEL_CONNECTED or CHANNEL_DISABLED only.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Name of the Virtual Machine whose channels are to be modified.
+ *
+ * @param channel_list
+ *  Caller-allocated list of the channel numbers to modify.
+ * @param len_channel_list
+ *  The number of entries in channel_list.
+ *
+ * @param status
+ *  The status to set on each listed channel.
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status);
+
+/**
+ * Populates the caller-supplied struct vm_info with details of the VM vm_name.
+ *
+ * @param vm_name
+ *  The name of the virtual machine to look up.
+ *
+ * @param info
+ *  Pointer to a struct vm_info; this must be allocated prior to calling this
+ *  function.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int get_info_vm(const char *vm_name, struct vm_info *info);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_MANAGER_H_ */
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
new file mode 100644
index 0000000..7207a2e
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -0,0 +1,228 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <signal.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/epoll.h>
+#include <sys/queue.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+
+
+#include "channel_monitor.h"
+#include "channel_commands.h"
+#include "channel_manager.h"
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+
+static volatile unsigned run_loop = 1;
+static int global_event_fd;
+static struct epoll_event *global_events_list;
+
+void channel_monitor_exit(void)
+{
+	run_loop = 0;
+	rte_free(global_events_list);
+}
+
+static int
+process_request(struct channel_packet *pkt, struct channel_info *chan_info)
+{
+	uint64_t core_mask;
+
+	if (chan_info == NULL)
+		return -1;
+
+	if (rte_atomic32_cmpset(&(chan_info->status), CHANNEL_CONNECTED,
+			CHANNEL_PROCESSING) == 0)
+		return -1;
+
+	if (pkt->command == CPU_POWER) {
+		core_mask = get_pcpus_mask(chan_info, pkt->resource_id);
+		if (core_mask == 0) {
+			RTE_LOG(ERR, CHANNEL_MONITOR, "Error getting physical CPU mask "
+					"for channel '%s' using vCPU(%u)\n",
+					chan_info->channel_path, (unsigned)pkt->resource_id);
+			return -1;
+		}
+		if (__builtin_popcountll(core_mask) == 1) {
+
+			unsigned core_num = __builtin_ffsll(core_mask) - 1;
+			switch (pkt->unit) {
+			case(CPU_SCALE_MIN):
+				power_manager_scale_core_min(core_num);
+				break;
+			case(CPU_SCALE_MAX):
+				power_manager_scale_core_max(core_num);
+				break;
+			case(CPU_SCALE_DOWN):
+				power_manager_scale_core_down(core_num);
+				break;
+			case(CPU_SCALE_UP):
+				power_manager_scale_core_up(core_num);
+				break;
+			default:
+				break;
+			}
+		} else {
+			switch (pkt->unit) {
+			case(CPU_SCALE_MIN):
+				power_manager_scale_mask_min(core_mask);
+				break;
+			case(CPU_SCALE_MAX):
+				power_manager_scale_mask_max(core_mask);
+				break;
+			case(CPU_SCALE_DOWN):
+				power_manager_scale_mask_down(core_mask);
+				break;
+			case(CPU_SCALE_UP):
+				power_manager_scale_mask_up(core_mask);
+				break;
+			default:
+				break;
+			}
+
+		}
+	}
+	/* Return is not checked as channel status may have been set to DISABLED
+	 * from management thread
+	 */
+	rte_atomic32_cmpset(&(chan_info->status), CHANNEL_PROCESSING,
+				CHANNEL_CONNECTED);
+	return 0;
+
+}
+
+int
+add_channel_to_monitor(struct channel_info **chan_info)
+{
+	struct channel_info *info = *chan_info;
+	struct epoll_event event;
+	event.events = EPOLLIN;
+	event.data.ptr = info;
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_ADD, info->fd, &event) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to add channel '%s' "
+				"to epoll\n", info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+remove_channel_from_monitor(struct channel_info *chan_info)
+{
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_DEL, chan_info->fd, NULL) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to remove channel '%s' "
+				"from epoll\n", chan_info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_monitor_init(void)
+{
+	global_event_fd = epoll_create1(0);
+	if (global_event_fd < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Error creating epoll context with "
+				"error %s\n", strerror(errno));
+		return -1;
+	}
+	global_events_list = rte_malloc("epoll_events", sizeof(*global_events_list)
+			* MAX_EVENTS, CACHE_LINE_SIZE);
+	if (global_events_list == NULL) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for "
+				"epoll events\n");
+		return -1;
+	}
+	return 0;
+}
+
+void
+run_channel_monitor(void)
+{
+	while (run_loop) {
+		int n_events, i;
+		n_events = epoll_wait(global_event_fd, global_events_list,
+				MAX_EVENTS, 1);
+		if (!run_loop)
+			break;
+		for (i = 0; i < n_events; i++) {
+			struct channel_info *chan_info = (struct channel_info *)
+								global_events_list[i].data.ptr;
+			if (global_events_list[i].events & (EPOLLERR | EPOLLHUP)) {
+				remove_channel(chan_info);
+				continue;
+			}
+			if (global_events_list[i].events & EPOLLIN) {
+
+				int n_bytes, err = 0;
+				struct channel_packet pkt;
+				void *buffer = &pkt;
+				int buffer_len = sizeof(pkt);
+				while (buffer_len > 0) {
+					n_bytes = read(chan_info->fd, buffer, buffer_len);
+					if (n_bytes == buffer_len)
+						break;
+					if (n_bytes <= 0) { /* 0: peer closed the socket */
+						err = (n_bytes == 0) ? ECONNRESET : errno;
+						RTE_LOG(DEBUG, CHANNEL_MONITOR, "Received error on "
+								"channel '%s' read: %s\n",
+								chan_info->channel_path, strerror(err));
+						remove_channel(chan_info);
+						break;
+					}
+					buffer = (char *)buffer + n_bytes;
+					buffer_len -= n_bytes;
+
+				}
+				if (!err)
+					process_request(&pkt, chan_info);
+			}
+		}
+	}
+}
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
new file mode 100644
index 0000000..c138607
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -0,0 +1,102 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MONITOR_H_
+#define CHANNEL_MONITOR_H_
+
+#include "channel_manager.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Channel Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_monitor_init(void);
+
+/**
+ * Run the channel monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_channel_monitor(void);
+
+/**
+ * Exit the Channel Monitor, exiting the epoll_wait loop and events processing.
+ * The epoll events list allocated by channel_monitor_init() is freed.
+ *
+ * @return
+ *  None
+ */
+void channel_monitor_exit(void);
+
+/**
+ * Add an open channel to monitor via epoll. A pointer to struct channel_info
+ * will be registered with epoll for event processing.
+ * It is thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info pointer.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_channel_to_monitor(struct channel_info **chan_info);
+
+/**
+ * Remove a previously added channel from epoll control.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel_from_monitor(struct channel_info *chan_info);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* CHANNEL_MONITOR_H_ */
-- 
1.9.3

* [dpdk-dev] [PATCH 02/10] VM Power Management CLI(Host).
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 03/10] CPU Frequency Power Management(Host) Alan Carew
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

The CLI is used for administering the channel monitor and manager and
manually setting the CPU frequency on the host.

Supports the following commands:
 add_vm|rm_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for
  subsequent operations with the CLI or remove a previously added VM from the
  VM Power Manager

 add_channels [Fixed STRING]: add_channels <vm_name> <list>|all, add
  communication channels for the specified VM, the virtio channels must be
  enabled in the VM configuration(qemu/libvirt) and the associated VM must be
  active. <list> is a comma-separated list of channel numbers to add, using the
  keyword 'all' will attempt to add all channels for the VM

 set_channel_status [Fixed STRING]:
  set_channel_status <vm_name> <list>|all enabled|disabled, enable or disable
  the communication channels in list(comma-separated) for the specified VM,
  alternatively list can be replaced with keyword 'all'. Disabled channels will
  still receive packets on the host, however the commands they specify will be
  ignored. Set status to 'enabled' to begin processing requests again.

 show_vm [Fixed STRING]: show_vm <vm_name>, prints the information on the
  specified VM(s), the information lists the number of vCPUs, the pinning to
  pCPU(s) as a bit mask, along with any communication channels associated with
  each VM

 show_cpu_freq_mask [Fixed STRING]: show_cpu_freq_mask <mask>, Get the current
  frequency for each core specified in the mask

 set_cpu_freq_mask [Fixed STRING]:
  set_cpu_freq_mask <core_mask> <up|down|min|max>, Set the current frequency
  for the cores specified in <core_mask> by scaling each up/down/min/max.

 show_cpu_freq [Fixed STRING]: Get the current frequency for the specified core

 set_cpu_freq [Fixed STRING]: set_cpu_freq <core_num> <up|down|min|max>,
  Set the current frequency for the specified core by scaling up/down/min/max

 quit [Fixed STRING]: close the application
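
A minimal sketch of how the comma-separated <list> accepted by add_channels
and set_channel_status could be validated. This is not the code in
vm_power_cli.c: parse_channel_list() is a hypothetical helper, and rejecting
empty fields, signs and channel numbers of MAX_VM_CHANNELS or above is an
assumption about the desired behaviour.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_VM_CHANNELS 64 /* same value as in channel_manager.h */

/*
 * Hypothetical helper, not part of this patch: parse a list such as "0,3,5"
 * into out[]. Returns the number of channel numbers stored, or -1 if the
 * list is empty or malformed, a number is out of range, or more than
 * max_out entries are given.
 */
static int
parse_channel_list(const char *list, unsigned *out, unsigned max_out)
{
	const char *p = list;
	unsigned n = 0;

	if (p == NULL)
		return -1;

	for (;;) {
		char *end;
		long val;

		/* Each field must start with a digit: no sign, space or gap */
		if (*p < '0' || *p > '9')
			return -1;

		errno = 0;
		val = strtol(p, &end, 10);
		if (errno != 0 || val >= MAX_VM_CHANNELS)
			return -1;
		if (n == max_out)
			return -1;
		out[n++] = (unsigned)val;

		if (*end == '\0')
			return (int)n;	/* end of list reached */
		if (*end != ',')
			return -1;	/* trailing garbage after the number */
		p = end + 1;
	}
}
```

Under these assumptions "0,3,5" yields three channel numbers, while "1,,2",
"-1" and "64" are rejected as a whole instead of being partially applied.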

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/vm_power_cli.c | 567 +++++++++++++++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h |  47 +++
 2 files changed, 614 insertions(+)
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h
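
A small sketch of how a 64-bit core mask, as taken by show_cpu_freq_mask and
set_cpu_freq_mask, can be expanded into individual core numbers.
mask_to_cores() is a hypothetical helper and not part of this patch; it relies
on the GCC/Clang builtin __builtin_ctzll, in the same spirit as the
__builtin_popcountll and __builtin_ffsll calls in channel_monitor.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of this patch: store the index of every set
 * bit in mask into cores[] (lowest first) and return how many were stored.
 * cores[] must have room for 64 entries.
 */
static unsigned
mask_to_cores(uint64_t mask, unsigned *cores)
{
	unsigned n = 0;

	while (mask != 0) {
		/* Index of the lowest set bit; mask is non-zero here */
		cores[n++] = (unsigned)__builtin_ctzll(mask);
		/* Clear the lowest set bit */
		mask &= mask - 1;
	}
	return n;
}
```

With this helper a mask of 0x5 expands to cores 0 and 2. Only set bits are
visited, so the loop runs once per selected core, not once per bit position.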

diff --git a/examples/vm_power_manager/vm_power_cli.c b/examples/vm_power_manager/vm_power_cli.c
new file mode 100644
index 0000000..f5e3759
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.c
@@ -0,0 +1,567 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <string.h>
+#include <termios.h>
+#include <errno.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+
+#include "vm_power_cli.h"
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "channel_commands.h"
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+			    struct cmdline *cl,
+			    __attribute__((unused)) void *data)
+{
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+struct cmd_show_vm_result {
+	cmdline_fixed_string_t show_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_show_vm_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_show_vm_result *res = parsed_result;
+	struct vm_info info;
+	unsigned i;
+
+	if (get_info_vm(res->vm_name, &info) != 0)
+		return;
+	cmdline_printf(cl, "VM: '%s', status = ", info.name);
+	if (info.status == VM_ACTIVE)
+		cmdline_printf(cl, "ACTIVE\n");
+	else
+		cmdline_printf(cl, "INACTIVE\n");
+	cmdline_printf(cl, "Channels %u\n", info.num_channels);
+	for (i = 0; i < info.num_channels; i++) {
+		cmdline_printf(cl, "  [%u]: %s, status = ", i,
+				info.channels[i].channel_path);
+		switch (info.channels[i].status) {
+		case CHANNEL_CONNECTED:
+			cmdline_printf(cl, "CONNECTED\n");
+			break;
+		case CHANNEL_DISCONNECTED:
+			cmdline_printf(cl, "DISCONNECTED\n");
+			break;
+		case CHANNEL_DISABLED:
+			cmdline_printf(cl, "DISABLED\n");
+			break;
+		case CHANNEL_PROCESSING:
+			cmdline_printf(cl, "PROCESSING\n");
+			break;
+		default:
+			cmdline_printf(cl, "UNKNOWN\n");
+			break;
+		}
+	}
+	cmdline_printf(cl, "Virtual CPU(s): %u\n", info.num_vcpus);
+	for (i = 0; i < info.num_vcpus; i++) {
+		cmdline_printf(cl, "  [%u]: Physical CPU Mask 0x%"PRIx64"\n", i,
+				info.pcpu_mask[i]);
+	}
+}
+
+cmdline_parse_token_string_t cmd_vm_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+				show_vm, "show_vm");
+cmdline_parse_token_string_t cmd_show_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_show_vm_set = {
+	.f = cmd_show_vm_parsed,
+	.data = NULL,
+	.help_str = "show_vm <vm_name>, prints the information on the "
+			"specified VM(s), the information lists the number of vCPUs, the "
+			"pinning to pCPU(s) as a bit mask, along with any communication "
+			"channels associated with each VM",
+	.tokens = {
+		(void *)&cmd_vm_show,
+		(void *)&cmd_show_vm_name,
+		NULL,
+	},
+};
+
+struct cmd_vm_op_result {
+	cmdline_fixed_string_t op_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_vm_op_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_vm_op_result *res = parsed_result;
+
+	if (!strcmp(res->op_vm, "add_vm")) {
+		if (add_vm(res->vm_name) < 0)
+			cmdline_printf(cl, "Unable to add VM '%s'\n", res->vm_name);
+	} else if (remove_vm(res->vm_name) < 0)
+		cmdline_printf(cl, "Unable to remove VM '%s'\n", res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_vm_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			op_vm, "add_vm#rm_vm");
+cmdline_parse_token_string_t cmd_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_vm_op_set = {
+	.f = cmd_vm_op_parsed,
+	.data = NULL,
+	.help_str = "add_vm|rm_vm <name>, add a VM for "
+			"subsequent operations with the CLI or remove a previously added "
+			"VM from the VM Power Manager",
+	.tokens = {
+		(void *)&cmd_vm_op,
+		(void *)&cmd_vm_name,
+		NULL,
+	},
+};
+
+/* *** VM channel operations *** */
+struct cmd_channels_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+};
+static void
+cmd_channels_op_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num, i;
+	int channels_added;
+	unsigned channel_list[MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_op_result *res = parsed_result;
+
+	if (!strcmp(res->channel_list, "all")) {
+		channels_added = add_all_channels(res->vm_name);
+		cmdline_printf(cl, "Added %d channels for VM '%s'\n",
+				channels_added, res->vm_name);
+		return;
+	}
+
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "Channel number '%u' exceeds the maximum number "
+					"of allowable channels(%u) for VM '%s'\n", channel_num,
+					MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	for (i = 0; i < num_channels; i++)
+		cmdline_printf(cl, "[%u]: Adding channel %u\n", i, channel_list[i]);
+
+	channels_added = add_channels(res->vm_name, channel_list,
+			num_channels);
+	cmdline_printf(cl, "Enabled %d channels for '%s'\n", channels_added,
+			res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+				op, "add_channels");
+cmdline_parse_token_string_t cmd_channels_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			channel_list, NULL);
+
+cmdline_parse_inst_t cmd_channels_op_set = {
+	.f = cmd_channels_op_parsed,
+	.data = NULL,
+	.help_str = "add_channels <vm_name> <list>|all, add "
+			"communication channels for the specified VM, the "
+			"virtio channels must be enabled in the VM "
+			"configuration(qemu/libvirt) and the associated VM must be active. "
+			"<list> is a comma-separated list of channel numbers to add, using "
+			"the keyword 'all' will attempt to add all channels for the VM",
+	.tokens = {
+		(void *)&cmd_channels_op,
+		(void *)&cmd_channels_vm_name,
+		(void *)&cmd_channels_list,
+		NULL,
+	},
+};
+
+struct cmd_channels_status_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+	cmdline_fixed_string_t status;
+};
+
+static void
+cmd_channels_status_op_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num;
+	int changed;
+	unsigned channel_list[MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_status_op_result *res = parsed_result;
+	enum channel_status status;
+
+	if (!strcmp(res->status, "enabled"))
+		status = CHANNEL_CONNECTED;
+	else
+		status = CHANNEL_DISABLED;
+
+	if (!strcmp(res->channel_list, "all")) {
+		changed = set_channel_status_all(res->vm_name, status);
+		cmdline_printf(cl, "Updated status of %d channels "
+				"for VM '%s'\n", changed, res->vm_name);
+		return;
+	}
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "%u exceeds the maximum number of allowable "
+					"channels(%u) for VM '%s'\n", channel_num, MAX_VM_CHANNELS,
+					res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	changed = set_channel_status(res->vm_name, channel_list, num_channels,
+			status);
+	cmdline_printf(cl, "Updated status of %d channels "
+					"for VM '%s'\n", changed, res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_status_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+				op, "set_channel_status");
+cmdline_parse_token_string_t cmd_channels_status_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_status_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			channel_list, NULL);
+cmdline_parse_token_string_t cmd_channels_status =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			status, "enabled#disabled");
+
+cmdline_parse_inst_t cmd_channels_status_op_set = {
+	.f = cmd_channels_status_op_parsed,
+	.data = NULL,
+	.help_str = "set_channel_status <vm_name> <list>|all enabled|disabled, "
+			"enable or disable the communication channels in "
+			"list(comma-separated) for the specified VM, alternatively list can"
+			" be replaced with keyword 'all'. Disabled channels will still "
+			"receive packets on the host, however the commands they specify "
+			"will be ignored. Set status to 'enabled' to begin processing "
+			"requests again.",
+	.tokens = {
+		(void *)&cmd_channels_status_op,
+		(void *)&cmd_channels_status_vm_name,
+		(void *)&cmd_channels_status_list,
+		(void *)&cmd_channels_status,
+		NULL,
+	},
+};
+
+/* *** CPU Frequency operations *** */
+struct cmd_show_cpu_freq_mask_result {
+	cmdline_fixed_string_t show_cpu_freq_mask;
+	uint64_t core_mask;
+};
+
+static void
+cmd_show_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_mask_result *res = parsed_result;
+	unsigned i;
+	uint64_t mask = res->core_mask;
+	uint32_t freq;
+	for (i = 0; mask; mask &= ~(1ULL << i++)) {
+		if ((mask >> i) & 1) {
+			freq = power_manager_get_current_frequency(i);
+			if (freq > 0)
+				cmdline_printf(cl, "Core %u: %"PRIu32"\n", i, freq);
+		}
+	}
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			show_cpu_freq_mask, "show_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_show_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			core_mask, UINT64);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_mask_set = {
+	.f = cmd_show_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "show_cpu_freq_mask <mask>, Get the current frequency for each "
+			"core specified in the mask",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq_mask,
+		(void *)&cmd_show_cpu_freq_mask_core_mask,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_mask_result {
+	cmdline_fixed_string_t set_cpu_freq_mask;
+	uint64_t core_mask;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	struct cmd_set_cpu_freq_mask_result *res = parsed_result;
+	int ret = -1;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_mask_up(res->core_mask);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_mask_down(res->core_mask);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_mask_min(res->core_mask);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_mask_max(res->core_mask);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core_mask(0x%"PRIx64") '%s'\n",
+				res->core_mask, res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			set_cpu_freq_mask, "set_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_set_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			core_mask, UINT64);
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask_result =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_mask_set = {
+	.f = cmd_set_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq_mask <core_mask> <up|down|min|max>, Set the current "
+			"frequency for the cores specified in <core_mask> by scaling "
+			"each up/down/min/max.",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq_mask,
+		(void *)&cmd_set_cpu_freq_mask_core_mask,
+		(void *)&cmd_set_cpu_freq_mask_result,
+		NULL,
+	},
+};
+
+
+
+struct cmd_show_cpu_freq_result {
+	cmdline_fixed_string_t show_cpu_freq;
+	uint8_t core_num;
+};
+
+static void
+cmd_show_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_result *res = parsed_result;
+	uint32_t curr_freq = power_manager_get_current_frequency(res->core_num);
+	if (curr_freq == 0) {
+		cmdline_printf(cl, "Unable to get frequency for core %u\n",
+				res->core_num);
+		return;
+	}
+	cmdline_printf(cl, "Core %u frequency: %"PRIu32"\n", res->core_num,
+			curr_freq);
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_result,
+			show_cpu_freq, "show_cpu_freq");
+
+cmdline_parse_token_num_t cmd_show_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_result,
+			core_num, UINT8);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_set = {
+	.f = cmd_show_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "Get the current frequency for the specified core",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq,
+		(void *)&cmd_show_cpu_freq_core_num,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t core_num;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_core_up(res->core_num);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_core_down(res->core_num);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_core_min(res->core_num);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_core_max(res->core_num);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core(%u) '%s'\n", res->core_num,
+				res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			core_num, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_vm_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_status_op_set,
+		(cmdline_parse_inst_t *)&cmd_show_vm_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+	power_manager_init();
+	cl = cmdline_stdin_new(main_ctx, "vmpower> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/vm_power_cli.h b/examples/vm_power_manager/vm_power_cli.h
new file mode 100644
index 0000000..deccd51
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.h
@@ -0,0 +1,47 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.9.3

* [dpdk-dev] [PATCH 03/10] CPU Frequency Power Management(Host).
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 02/10] VM Power Management CLI(Host) Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 04/10] " Alan Carew
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

A wrapper around librte_power, providing locking around the non-threadsafe
library, allowing for frequency changes based on core masks and core numbers
from both the CLI thread and epoll monitor thread.
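
As a rough illustration of the locking scheme described above, the sketch
below guards each core with its own lock and visits only the cores whose bit
is set in the mask. It is not the code from this patch: every sketch_* name is
hypothetical, a C11 atomic_flag stands in for rte_spinlock_t, and
sketch_backend_scale_up() is a stub standing in for a librte_power call such
as rte_power_freq_up(). The mask variant returning 0 when every requested core
succeeded and -1 otherwise is also an assumption made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAX_CORES 64	/* assumption: one bit per core in a uint64_t */

/* Hypothetical per-core state; stands in for the patch's struct freq_info. */
struct sketch_core {
	atomic_flag lock;	/* stands in for rte_spinlock_t */
	unsigned steps_up;	/* scale-up requests seen by the stub */
};

static struct sketch_core sketch_cores[SKETCH_MAX_CORES];
static uint64_t sketch_enabled_mask;

/* Stub for a non-thread-safe backend call such as rte_power_freq_up(). */
static int
sketch_backend_scale_up(unsigned core)
{
	sketch_cores[core].steps_up++;
	return 1;	/* 1 = frequency changed */
}

/* Reset all locks and counters and record which cores may be scaled. */
void
sketch_init(uint64_t enabled_mask)
{
	unsigned i;

	for (i = 0; i < SKETCH_MAX_CORES; i++) {
		atomic_flag_clear(&sketch_cores[i].lock);
		sketch_cores[i].steps_up = 0;
	}
	sketch_enabled_mask = enabled_mask;
}

/* Scale one core up while holding that core's lock; -1 if not enabled. */
int
sketch_scale_core_up(unsigned core)
{
	int ret;

	if (core >= SKETCH_MAX_CORES || !(sketch_enabled_mask & (1ULL << core)))
		return -1;
	while (atomic_flag_test_and_set_explicit(&sketch_cores[core].lock,
			memory_order_acquire))
		;	/* spin until the lock is free */
	ret = sketch_backend_scale_up(core);
	atomic_flag_clear_explicit(&sketch_cores[core].lock,
			memory_order_release);
	return ret;
}

/* Scale up every core whose bit is set; 0 if all succeeded, -1 otherwise. */
int
sketch_scale_mask_up(uint64_t core_mask)
{
	unsigned i;
	int ret = 0;

	for (i = 0; i < SKETCH_MAX_CORES && core_mask != 0; i++) {
		uint64_t bit = 1ULL << i;

		if (!(core_mask & bit))
			continue;	/* core not requested, leave it alone */
		core_mask &= ~bit;
		if (sketch_scale_core_up(i) < 0)
			ret = -1;
	}
	return ret;
}

/* Read back the stub's counter for one core. */
unsigned
sketch_steps_up(unsigned core)
{
	assert(core < SKETCH_MAX_CORES);
	return sketch_cores[core].steps_up;
}
```

With cores 0-2 enabled, sketch_scale_mask_up(0x5) would touch cores 0 and 2
only. Taking one lock per core, rather than a single global lock, lets the CLI
thread and the monitor thread work on different cores without contending.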

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/power_manager.c | 234 ++++++++++++++++++++++++++++++
 examples/vm_power_manager/power_manager.h | 191 ++++++++++++++++++++++++
 2 files changed, 425 insertions(+)
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
new file mode 100644
index 0000000..ceca532
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.c
@@ -0,0 +1,234 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/types.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_power.h>
+#include <rte_spinlock.h>
+
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+
+#define POWER_SCALE_CORE(DIRECTION, core_num, ret) do { \
+	if (!(global_enabled_cpus & (1ULL << core_num))) \
+		return -1; \
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
+	ret = rte_power_freq_##DIRECTION(core_num); \
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl); \
+} while (0)
+
+#define POWER_SCALE_MASK(DIRECTION, core_mask, ret) do { \
+	int i; \
+	for (i = 0; core_mask; core_mask &= ~(1ULL << i++)) { \
+		if (!(global_enabled_cpus & (1ULL << i))) \
+			return -1; \
+		rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
+		ret = rte_power_freq_##DIRECTION(i); \
+		rte_spinlock_unlock(&global_core_freq_info[i].power_sl); \
+	} \
+} while (0)
+
+struct freq_info {
+	rte_spinlock_t power_sl;
+	uint32_t freqs[RTE_MAX_LCORE_FREQS];
+	unsigned num_freqs;
+} __rte_cache_aligned;
+
+static struct freq_info global_core_freq_info[RTE_MAX_LCORE];
+
+static uint64_t global_enabled_cpus;
+
+#define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
+
+static unsigned
+set_host_cpus_mask(void)
+{
+	char path[PATH_MAX];
+	unsigned i;
+	unsigned num_cpus = 0;
+	for (i = 0; i < RTE_MAX_LCORE; i++) {
+		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
+		if (access(path, F_OK) == 0) {
+			global_enabled_cpus |= 1ULL << i;
+			num_cpus++;
+		} else
+			return num_cpus;
+	}
+	return num_cpus;
+}
+
+int
+power_manager_init(void)
+{
+	unsigned i, num_cpus;
+	uint64_t cpu_mask;
+	int ret = 0;
+
+	num_cpus = set_host_cpus_mask();
+	if (num_cpus == 0) {
+		RTE_LOG(ERR, POWER_MANAGER, "Unable to detect host CPUs, please "
+				"ensure that sufficient privileges exist to inspect sysfs\n");
+		return -1;
+	}
+
+	cpu_mask = global_enabled_cpus;
+	for (i = 0; cpu_mask; cpu_mask &= ~(1ULL << i++)) {
+		if (rte_power_init(i) < 0 || rte_power_freqs(i,
+				global_core_freq_info[i].freqs,
+				RTE_MAX_LCORE_FREQS) == 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to initialize power manager "
+					"for core %u\n", i);
+			global_enabled_cpus &= ~(1ULL << i);
+			num_cpus--;
+			ret = -1;
+		}
+		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+	}
+	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs, enabled core mask:"
+					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	return ret;
+
+}
+
+uint32_t
+power_manager_get_current_frequency(unsigned core_num)
+{
+	uint32_t freq, index;
+
+	if (!(global_enabled_cpus & (1ULL << core_num)))
+		return 0;
+
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
+	index = rte_power_get_freq(core_num);
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl);
+	if (index >= RTE_MAX_LCORE_FREQS)
+		freq = 0;
+	else
+		freq = global_core_freq_info[core_num].freqs[index];
+
+	return freq;
+}
+
+int
+power_manager_exit(void)
+{
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1ULL << i++)) {
+		if (rte_power_exit(i) < 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+					"for core %u\n", i);
+			ret = -1;
+		}
+	}
+	global_enabled_cpus = 0;
+	return ret;
+}
+
+int
+power_manager_scale_mask_up(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(up, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_down(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(down, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_min(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(min, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_max(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(max, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_up(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(up, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_down(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(down, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_min(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(min, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_max(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(max, core_num, ret);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
new file mode 100644
index 0000000..0b2f2a0
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.h
@@ -0,0 +1,191 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef POWER_MANAGER_H_
+#define POWER_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management.
+ * Initializes resources and verifies the number of CPUs on the system.
+ * Wraps librte_power int rte_power_init(unsigned lcore_id);
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_init(void);
+
+/**
+ * Exit power management. Must be called prior to exiting the application.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_exit(void);
+
+/**
+ * Scale up the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_up(uint64_t core_mask);
+
+/**
+ * Scale down the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_down(uint64_t core_mask);
+
+/**
+ * Scale to the minimum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_min(uint64_t core_mask);
+
+/**
+ * Scale to the maximum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_max(uint64_t core_mask);
+
+/**
+ * Scale up frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_up(unsigned core_num);
+
+/**
+ * Scale down frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_down(unsigned core_num);
+
+/**
+ * Scale to minimum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_min(unsigned core_num);
+
+/**
+ * Scale to maximum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_max(unsigned core_num);
+
+/**
+ * Get the current frequency of the core specified by core_num
+ *
+ * @param core_num
+ *  The core number to get the current frequency
+ *
+ * @return
+ *  - 0  on error
+ *  - >0 for current frequency.
+ */
+uint32_t power_manager_get_current_frequency(unsigned core_num);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* POWER_MANAGER_H_ */
-- 
1.9.3

* [dpdk-dev] [PATCH 04/10] CPU Frequency Power Management(Host).
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
                   ` (2 preceding siblings ...)
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 03/10] CPU Frequency Power Management(Host) Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 05/10] VM communication channels for VM Power Management(Guest) Alan Carew
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

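Adds the Makefile and the main entry point for the host application. main()
initializes the EAL, launches the channel monitor on a second lcore and runs
the CLI on the master lcore, so a minimum of two cores is required. Both
threads change core frequencies through the locking wrapper around
librte_power.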

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/Makefile |  57 +++++++++++++++++++
 examples/vm_power_manager/main.c   | 113 +++++++++++++++++++++++++++++++++++++
 examples/vm_power_manager/main.h   |  52 +++++++++++++++++
 3 files changed, 222 insertions(+)
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
new file mode 100644
index 0000000..a2f00ea
--- /dev/null
+++ b/examples/vm_power_manager/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = vm_power_mgr
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
+SRCS-y += channel_monitor.c
+
+CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power_vm/ $(WERROR_FLAGS)
+LDLIBS += -lvirt
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
new file mode 100644
index 0000000..fdb9e73
--- /dev/null
+++ b/examples/vm_power_manager/main.c
@@ -0,0 +1,113 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+#include <rte_launch.h>
+#include <rte_log.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "vm_power_cli.h"
+#include "main.h"
+
+static int
+run_monitor(__attribute__((unused)) void *arg)
+{
+	if (channel_manager_init(DEFAULT_HV_PATH) < 0) {
+		printf("Unable to initialize channel manager\n");
+		return -1;
+	}
+	if (channel_monitor_init() < 0) {
+		printf("Unable to initialize channel monitor\n");
+		return -1;
+	}
+	run_channel_monitor();
+	return 0;
+}
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	lcore_id = rte_get_next_lcore(-1, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
+				"application\n");
+		return 0;
+	}
+	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
+
+	run_cli(NULL);
+
+	rte_eal_mp_wait_lcore();
+	return 0;
+}
diff --git a/examples/vm_power_manager/main.h b/examples/vm_power_manager/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
-- 
1.9.3

* [dpdk-dev] [PATCH 05/10] VM communication channels for VM Power Management(Guest).
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
                   ` (3 preceding siblings ...)
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 04/10] " Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 06/10] Alternate implementation of librte_power " Alan Carew
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

Allows for the opening of Virtio-Serial devices on a VM, where a DPDK
application can send packets to the host-based monitor. The packet format is
specified in channel_commands.h.
Each device appears as a serial device in path
/dev/virtio-ports/virtio.serial.port.<agent_type>.<lcore_num>, where each lcore
in a DPDK application has exclusive access to a device/channel.
Each channel is opened in non-blocking mode; after a successful open, a test
packet is sent to the host to ensure the host side is monitoring.
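
For illustration only, the open-and-send sequence described above might look
roughly like the sketch below. It is not the guest_channel.c from this patch:
the sketch_* names are hypothetical, the payload is treated as opaque bytes
because the real packet layout lives in channel_commands.h, and retrying in a
tight loop on EAGAIN is a simplification that a real application would likely
replace with a bounded retry or poll().

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Path layout taken from the description above. */
#define SKETCH_PORT_FMT "/dev/virtio-ports/virtio.serial.port.%s.%u"

/* Build the device path for one lcore; -1 if buf is too small. */
int
sketch_channel_path(char *buf, size_t len, const char *agent, unsigned lcore)
{
	int n = snprintf(buf, len, SKETCH_PORT_FMT, agent, lcore);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Open the lcore's channel in non-blocking mode; returns an fd or -1. */
int
sketch_channel_open(const char *agent, unsigned lcore)
{
	char path[256];

	if (sketch_channel_path(path, sizeof(path), agent, lcore) < 0)
		return -1;
	return open(path, O_RDWR | O_NONBLOCK);
}

/* Write the whole buffer, retrying when the fd is not ready yet. */
int
sketch_channel_send(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EAGAIN || errno == EINTR)
				continue;	/* not ready or interrupted */
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

A caller on lcore 3 using the "poweragent" agent type would open
/dev/virtio-ports/virtio.serial.port.poweragent.3 and then pass its test
packet to sketch_channel_send() before sending real requests.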

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power_vm/guest_channel.c | 150 ++++++++++++++++++++++++++++++++++++
 lib/librte_power_vm/guest_channel.h |  89 +++++++++++++++++++++
 2 files changed, 239 insertions(+)
 create mode 100644 lib/librte_power_vm/guest_channel.c
 create mode 100644 lib/librte_power_vm/guest_channel.h

diff --git a/lib/librte_power_vm/guest_channel.c b/lib/librte_power_vm/guest_channel.c
new file mode 100644
index 0000000..8baa20a
--- /dev/null
+++ b/lib/librte_power_vm/guest_channel.c
@@ -0,0 +1,150 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+static int global_fds[RTE_MAX_LCORE];
+
+int
+guest_channel_host_connect(const char *path, unsigned lcore_id)
+{
+	int flags, ret;
+	struct channel_packet pkt;
+	char fd_path[PATH_MAX];
+	int fd = -1;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	/* check if path is already open */
+	if (global_fds[lcore_id] != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is already open\n", lcore_id);
+		return -1;
+	}
+
+	snprintf(fd_path, PATH_MAX, "%s.%u", path, lcore_id);
+	RTE_LOG(INFO, GUEST_CHANNEL, "Opening channel '%s' for lcore %u\n",
+			fd_path, lcore_id);
+	fd = open(fd_path, O_RDWR);
+	if (fd < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Unable to connect to '%s' with error "
+				"%s\n", fd_path, strerror(errno));
+		return -1;
+	}
+
+	flags = fcntl(fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on fcntl get flags for file %s\n",
+				fd_path);
+		return -1;
+	}
+
+	flags |= O_NONBLOCK;
+	if (fcntl(fd, F_SETFL, flags) < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on setting non-blocking mode for "
+				"file %s\n", fd_path);
+		return -1;
+	}
+	/* QEMU needs a delay after connection */
+	sleep(1);
+
+	/* Send a test packet, this command is ignored by the host, but a successful
+	 *  send indicates that the host endpoint is monitoring.
+	 */
+	pkt.command = CPU_POWER_CONNECT;
+	global_fds[lcore_id] = fd;
+	ret = guest_channel_send_msg(&pkt, lcore_id);
+	if (ret != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Error on channel '%s' communications "
+				"test: %s\n", fd_path, strerror(ret));
+		close(fd);
+		global_fds[lcore_id] = 0;
+		return -1;
+	}
+	RTE_LOG(INFO, GUEST_CHANNEL, "Channel '%s' is now connected\n", fd_path);
+	return 0;
+}
+
+int
+guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id)
+{
+	int ret, buffer_len = sizeof(*pkt);
+	void *buffer = pkt;
+
+	if (global_fds[lcore_id] == 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel is not connected\n");
+		return -1;
+	}
+	while (buffer_len > 0) {
+		ret = write(global_fds[lcore_id], buffer, buffer_len);
+		if (ret == buffer_len)
+			return 0;
+		if (ret == -1) {
+			if (errno == EINTR)
+				continue;
+			return errno;
+		}
+		buffer = (char *)buffer + ret;
+		buffer_len -= ret;
+	}
+	return 0;
+}
+
+void
+guest_channel_host_disconnect(unsigned lcore_id)
+{
+	if (global_fds[lcore_id] == 0)
+		return;
+	close(global_fds[lcore_id]);
+	global_fds[lcore_id] = 0;
+}
+
+
diff --git a/lib/librte_power_vm/guest_channel.h b/lib/librte_power_vm/guest_channel.h
new file mode 100644
index 0000000..9e18af5
--- /dev/null
+++ b/lib/librte_power_vm/guest_channel.h
@@ -0,0 +1,89 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#ifndef _GUEST_CHANNEL_H
+#define _GUEST_CHANNEL_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <channel_commands.h>
+
+/**
+ * Connect to the Virtio-Serial VM end-point located in path. It is
+ * thread safe for unique lcore_ids. This function must only be called once
+ * from each lcore.
+ *
+ * @param path
+ *  The path to the serial device on the filesystem
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int guest_channel_host_connect(const char *path, unsigned lcore_id);
+
+/**
+ * Disconnect from an already connected Virtio-Serial Endpoint.
+ *
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ */
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+/**
+ * Send a message contained in pkt over the Virtio-Serial to the host endpoint.
+ *
+ * @param pkt
+ *  Pointer to a populated struct channel_packet
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on channel not connected.
+ *  - errno on write to channel error.
+ */
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH 06/10] Alternate implementation of librte_power for VM Power Management(Guest).
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
                   ` (4 preceding siblings ...)
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 05/10] VM communication channels for VM Power Management(Guest) Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-22 19:17   ` Neil Horman
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 07/10] Packet format for VM Power Management(Host and Guest) Alan Carew
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

Re-using the host based librte_power API, the alternate implementation uses
the guest channel API to forward requests for frequency changes to the host
monitor.
A subset of the librte_power API is supported:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

The other unsupported APIs from librte_power return -ENOTSUP.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power_vm/Makefile    |  49 ++++++++++++++
 lib/librte_power_vm/rte_power.c | 146 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 195 insertions(+)
 create mode 100644 lib/librte_power_vm/Makefile
 create mode 100644 lib/librte_power_vm/rte_power.c

diff --git a/lib/librte_power_vm/Makefile b/lib/librte_power_vm/Makefile
new file mode 100644
index 0000000..284ec2c
--- /dev/null
+++ b/lib/librte_power_vm/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_power.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -fno-strict-aliasing
+CFLAGS += -I$(RTE_SDK)/lib/librte_power/
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_POWER_VM) := guest_channel.c rte_power.c
+
+# install this header file
+SYMLINK-y-include := ../librte_power/rte_power.h
+
+# this lib needs eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_POWER_VM) += lib/librte_eal
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_power_vm/rte_power.c b/lib/librte_power_vm/rte_power.c
new file mode 100644
index 0000000..1ce3fb0
--- /dev/null
+++ b/lib/librte_power_vm/rte_power.c
@@ -0,0 +1,146 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <errno.h>
+#include <string.h>
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+#include "rte_power.h"
+
+#define RTE_LOGTYPE_POWER_VM RTE_LOGTYPE_USER1
+
+#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+static struct channel_packet pkt[RTE_MAX_LCORE];
+
+
+int
+rte_power_init(unsigned lcore_id)
+{
+	pkt[lcore_id].command = CPU_POWER;
+	pkt[lcore_id].resource_id = lcore_id;
+	return guest_channel_host_connect(FD_PATH, lcore_id);
+}
+
+int
+rte_power_exit(unsigned lcore_id)
+{
+	guest_channel_host_disconnect(lcore_id);
+	return 0;
+}
+
+uint32_t
+rte_power_freqs(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t *freqs,
+		__attribute__((unused)) uint32_t num)
+{
+	RTE_LOG(ERR, POWER_VM, "rte_power_freqs is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+uint32_t
+rte_power_get_freq(__attribute__((unused)) unsigned lcore_id)
+{
+	RTE_LOG(ERR, POWER_VM, "rte_power_get_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_set_freq(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t index)
+{
+	RTE_LOG(ERR, POWER_VM, "rte_power_set_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_freq_up(unsigned lcore_id)
+{
+	int ret;
+	pkt[lcore_id].unit = CPU_SCALE_UP;
+
+	ret = guest_channel_send_msg(&pkt[lcore_id], lcore_id);
+	if (ret <= 0)
+		return ret;
+	if (ret > 0)
+		RTE_LOG(DEBUG, POWER_VM, "Error sending message: %s\n", strerror(ret));
+	return -1;
+}
+
+int
+rte_power_freq_down(unsigned lcore_id)
+{
+	int ret;
+	pkt[lcore_id].unit = CPU_SCALE_DOWN;
+
+	ret = guest_channel_send_msg(&pkt[lcore_id], lcore_id);
+	if (ret <= 0)
+		return ret;
+	if (ret > 0)
+		RTE_LOG(DEBUG, POWER_VM, "Error sending message: %s\n", strerror(ret));
+	return -1;
+}
+
+int
+rte_power_freq_max(unsigned lcore_id)
+{
+	int ret;
+	pkt[lcore_id].unit = CPU_SCALE_MAX;
+
+	ret = guest_channel_send_msg(&pkt[lcore_id], lcore_id);
+	if (ret <= 0)
+		return ret;
+	if (ret > 0)
+		RTE_LOG(DEBUG, POWER_VM, "Error sending message: %s\n", strerror(ret));
+	return -1;
+}
+
+int
+rte_power_freq_min(unsigned lcore_id)
+{
+	int ret;
+	pkt[lcore_id].unit = CPU_SCALE_MIN;
+
+	ret = guest_channel_send_msg(&pkt[lcore_id], lcore_id);
+	if (ret <= 0)
+		return ret;
+	if (ret > 0)
+		RTE_LOG(DEBUG, POWER_VM, "Error sending message: %s\n", strerror(ret));
+	return -1;
+}
-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH 06/10] Alternate implementation of librte_power for VM Power Management(Guest).
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 06/10] Alternate implementation of librte_power " Alan Carew
@ 2014-09-22 19:17   ` Neil Horman
  2014-09-23  7:48     ` Carew, Alan
  0 siblings, 1 reply; 97+ messages in thread
From: Neil Horman @ 2014-09-22 19:17 UTC (permalink / raw)
  To: Alan Carew; +Cc: dev

On Mon, Sep 22, 2014 at 07:34:35PM +0100, Alan Carew wrote:
> Re-using the host based librte_power API, the alternate implementation uses
> the guest channel API to forward requests for frequency changes to the host
> monitor.
> A subset of the librte_power API is supported:
>  rte_power_init(unsigned lcore_id)
>  rte_power_exit(unsigned lcore_id)
>  rte_power_freq_up(unsigned lcore_id)
>  rte_power_freq_down(unsigned lcore_id)
>  rte_power_freq_min(unsigned lcore_id)
>  rte_power_freq_max(unsigned lcore_id)
> 
> The other unsupported APIs from librte_power return -ENOTSUP.
> 
> Signed-off-by: Alan Carew <alan.carew@intel.com>
> ---
>  lib/librte_power_vm/Makefile    |  49 ++++++++++++++
>  lib/librte_power_vm/rte_power.c | 146 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 195 insertions(+)
>  create mode 100644 lib/librte_power_vm/Makefile
>  create mode 100644 lib/librte_power_vm/rte_power.c
> 
NAK.
This is a bad design choice.  Creating an alternate library with all the same
symbols in place prevents an application from compiling in support for both host
and guest power management in parallel (i.e. if an app wants to be able to do
power management in either environment, and only gets built once, it won't
work).

In fact, linking a statically built library with both CONFIG_RTE_LIBRTE_POWER=y
and CONFIG_RTE_LIBRTE_POWER_VM=y yields the following link-time build break:

LD test
/home/nhorman/git/dpdk/build/lib/librte_power.a(guest_channel.o): In function
`guest_channel_host_connect':
guest_channel.c:(.text+0x0): multiple definition of `guest_channel_host_connect'
/home/nhorman/git/dpdk/build/lib/librte_power.a(guest_channel.o):guest_channel.c:(.text+0x0):
first defined here
/home/nhorman/git/dpdk/build/lib/librte_power.a(guest_channel.o): In function
`guest_channel_send_msg':
guest_channel.c:(.text+0x370): multiple definition of `guest_channel_send_msg'
....
Ad nauseam.

What you should do is merge this functionality in with the existing librte_power
library, and make the choice of implementation a run time decision, so there's
only a single public facing API symbol set, and both implementations can
coexist, getting chosen at run time (via initialization config option,
environment detection, etc).  Konstantin and I had a similar discussion
regarding the ACL library and the use of the match function.  I think we came up
with some reasonably performant solutions.
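
One possible shape for such a run time choice is a table of function pointers
selected once at initialisation. The sketch below is only an illustration under
stated assumptions: every name in it (power_env, power_ops, power_set_env() and
the stub back ends) is hypothetical and none of it is taken from librte_power
or from the ACL discussion mentioned above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these exist in librte_power. */
enum power_env { POWER_ENV_HOST, POWER_ENV_GUEST };
enum backend { BACKEND_NONE, BACKEND_HOST, BACKEND_GUEST };

/* Records which back end handled the most recent request. */
static enum backend last_backend = BACKEND_NONE;

/* One table of entry points per implementation. */
struct power_ops {
	int (*freq_up)(unsigned lcore_id);
	int (*freq_down)(unsigned lcore_id);
};

/* Stand-in back ends. A real host version would drive sysfs and a real
 * guest version would send a channel packet; these only record the call. */
static int
host_freq(unsigned lcore_id)
{
	(void)lcore_id;
	last_backend = BACKEND_HOST;
	return 0;
}

static int
guest_freq(unsigned lcore_id)
{
	(void)lcore_id;
	last_backend = BACKEND_GUEST;
	return 0;
}

static const struct power_ops host_ops = { host_freq, host_freq };
static const struct power_ops guest_ops = { guest_freq, guest_freq };

/* Selected implementation; NULL until power_set_env() succeeds. */
static const struct power_ops *active_ops;

/* Choose the implementation at run time. Returns 0 or -EINVAL. */
int
power_set_env(enum power_env env)
{
	switch (env) {
	case POWER_ENV_HOST:
		active_ops = &host_ops;
		return 0;
	case POWER_ENV_GUEST:
		active_ops = &guest_ops;
		return 0;
	}
	return -EINVAL;
}

/* Single public symbol set; each call dispatches through the table. */
int
power_freq_up(unsigned lcore_id)
{
	return active_ops ? active_ops->freq_up(lcore_id) : -ENOTSUP;
}

int
power_freq_down(unsigned lcore_id)
{
	return active_ops ? active_ops->freq_down(lcore_id) : -ENOTSUP;
}
```

With this layout both implementations link into one library without symbol
clashes, and the per-call cost is one indirect call.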

Neil

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH 06/10] Alternate implementation of librte_power for VM Power Management(Guest).
  2014-09-22 19:17   ` Neil Horman
@ 2014-09-23  7:48     ` Carew, Alan
  0 siblings, 0 replies; 97+ messages in thread
From: Carew, Alan @ 2014-09-23  7:48 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

Hi Neil,


> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Monday, September 22, 2014 8:18 PM
> To: Carew, Alan
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 06/10] Alternate implementation of
> librte_power for VM Power Management(Guest).
> 
> On Mon, Sep 22, 2014 at 07:34:35PM +0100, Alan Carew wrote:
> > Re-using the host based librte_power API, the alternate implementation uses
> > the guest channel API to forward requests for frequency changes to the host
> > monitor.
> > A subset of the librte_power API is supported:
> >  rte_power_init(unsigned lcore_id)
> >  rte_power_exit(unsigned lcore_id)
> >  rte_power_freq_up(unsigned lcore_id)
> >  rte_power_freq_down(unsigned lcore_id)
> >  rte_power_freq_min(unsigned lcore_id)
> >  rte_power_freq_max(unsigned lcore_id)
> >
> > The other unsupported APIs from librte_power return -ENOTSUP.
> >
> > Signed-off-by: Alan Carew <alan.carew@intel.com>
> > ---
> >  lib/librte_power_vm/Makefile    |  49 ++++++++++++++
> >  lib/librte_power_vm/rte_power.c | 146 ++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 195 insertions(+)
> >  create mode 100644 lib/librte_power_vm/Makefile
> >  create mode 100644 lib/librte_power_vm/rte_power.c
> >
> NAK.
> This is a bad design choice.  Creating an alternate library with all the same
> symbols in place prevents an application from compiling in support for both host
> and guest power management in parallel (i.e. if an app wants to be able to do
> power management in either environment, and only gets built once, it won't
> work).
> 
> In fact, linking a statically built library with both CONFIG_RTE_LIBRTE_POWER=y
> and CONFIG_RTE_LIBRTE_POWER_VM=y yields the following link-time build break:
> 
> LD test
> /home/nhorman/git/dpdk/build/lib/librte_power.a(guest_channel.o): In function
> `guest_channel_host_connect':
> guest_channel.c:(.text+0x0): multiple definition of `guest_channel_host_connect'
> /home/nhorman/git/dpdk/build/lib/librte_power.a(guest_channel.o):guest_channel.c:(.text+0x0):
> first defined here
> /home/nhorman/git/dpdk/build/lib/librte_power.a(guest_channel.o): In function
> `guest_channel_send_msg':
> guest_channel.c:(.text+0x370): multiple definition of `guest_channel_send_msg'
> ....
> Ad nauseam.
> 
> What you should do is merge this functionality in with the existing librte_power
> library, and make the choice of implementation a run time decision, so there's
> only a single public facing API symbol set, and both implementations can
> coexist, getting chosen at run time (via initialization config option,
> environment detection, etc).  Konstantin and I had a similar discussion
> regarding the ACL library and the use of the match function.  I think we came up
> with some reasonably performant solutions.
> 
> Neil

Makes sense, I'll take a look at runtime configuration options and post a V2.

Thanks,
Alan 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH 07/10] Packet format for VM Power Management(Host and Guest).
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
                   ` (5 preceding siblings ...)
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 06/10] Alternate implementation of librte_power " Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 08/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

Provides a command packet format for host and guest.
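
As an illustration of how a request might be populated, here is a minimal
sketch. The struct layout and the CPU_* values are copied from
channel_commands.h in this patch; make_power_request() is a hypothetical
helper and not part of the series. It zeroes the whole packet before filling
it in, whereas the connect test packet in patch 05 sets only the command
field before sending.

```c
#include <stdint.h>
#include <string.h>

/* Values and layout copied from channel_commands.h in this patch. */
#define CPU_POWER         1
#define CPU_POWER_CONNECT 2

#define CPU_SCALE_UP      1
#define CPU_SCALE_DOWN    2
#define CPU_SCALE_MAX     3
#define CPU_SCALE_MIN     4

struct channel_packet {
	uint64_t resource_id; /* core_num, device */
	uint32_t unit;        /* scale down/up/min/max */
	uint32_t command;     /* Power, IO, etc */
};

/* Hypothetical helper: build a fully initialised frequency request for
 * one lcore so that no indeterminate bytes are sent to the host. */
static struct channel_packet
make_power_request(unsigned lcore_id, uint32_t unit)
{
	struct channel_packet pkt;

	memset(&pkt, 0, sizeof(pkt));
	pkt.resource_id = lcore_id;
	pkt.unit = unit;
	pkt.command = CPU_POWER;
	return pkt;
}
```

On common ABIs the three fields pack into 16 bytes with no padding, so host
and guest builds for the same architecture see the same wire layout.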

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power_vm/channel_commands.h | 68 ++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100644 lib/librte_power_vm/channel_commands.h

diff --git a/lib/librte_power_vm/channel_commands.h b/lib/librte_power_vm/channel_commands.h
new file mode 100644
index 0000000..4ad65cf
--- /dev/null
+++ b/lib/librte_power_vm/channel_commands.h
@@ -0,0 +1,68 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_COMMANDS_H_
+#define CHANNEL_COMMANDS_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_config.h>
+
+#if RTE_MAX_LCORE > 64
+#error Maximum number of cores and channels is 64, overflow is guaranteed to \
+	cause problems.
+#endif
+
+#define CPU_POWER         1
+#define CPU_POWER_CONNECT 2
+
+#define CPU_SCALE_UP      1
+#define CPU_SCALE_DOWN    2
+#define CPU_SCALE_MAX     3
+#define CPU_SCALE_MIN     4
+
+struct channel_packet {
+	uint64_t resource_id; /* core_num, device */
+	uint32_t unit; /* scale down/up/min/max */
+	uint32_t command; /* Power, IO, etc */
+};
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_COMMANDS_H_ */
-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH 08/10] Build system integration for VM Power Management(Guest and Host)
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
                   ` (6 preceding siblings ...)
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 07/10] Packet format for VM Power Management(Host and Guest) Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 09/10] VM Power Management Unit Tests(Guest) Alan Carew
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

Add CONFIG_RTE_LIBRTE_POWER_VM to config/common_linuxapp, default=n.
As both host and guest side rely on the same API (librte_power) but different
implementations, the following configurations are required:
Host: CONFIG_RTE_LIBRTE_POWER_VM=n and CONFIG_RTE_LIBRTE_POWER=y
Guest: CONFIG_RTE_LIBRTE_POWER_VM=y and CONFIG_RTE_LIBRTE_POWER=n

In either case the resulting library is called rte_power.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 config/common_linuxapp | 6 ++++++
 lib/Makefile           | 1 +
 mk/rte.app.mk          | 4 ++++
 3 files changed, 11 insertions(+)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 5bee910..fbecad3 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -332,6 +332,12 @@ CONFIG_RTE_LIBRTE_POWER_DEBUG=n
 CONFIG_RTE_MAX_LCORE_FREQS=64
 
 #
+# Compile librte_power_vm
+#
+CONFIG_RTE_LIBRTE_POWER_VM=n
+CONFIG_RTE_LIBRTE_POWER_VM_DEBUG=n
+
+#
 # Compile librte_net
 #
 CONFIG_RTE_LIBRTE_NET=y
diff --git a/lib/Makefile b/lib/Makefile
index 10c5bb3..d291459 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -56,6 +56,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
 DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
 DIRS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += librte_ip_frag
 DIRS-$(CONFIG_RTE_LIBRTE_POWER) += librte_power
+DIRS-$(CONFIG_RTE_LIBRTE_POWER_VM) += librte_power_vm
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += librte_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += librte_sched
 DIRS-$(CONFIG_RTE_LIBRTE_KVARGS) += librte_kvargs
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 34dff2a..ce8c684 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -105,6 +105,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_POWER),y)
 LDLIBS += -lrte_power
 endif
 
+ifeq ($(CONFIG_RTE_LIBRTE_POWER_VM),y)
+LDLIBS += -lrte_power
+endif
+
 ifeq ($(CONFIG_RTE_LIBRTE_ACL),y)
 LDLIBS += -lrte_acl
 endif
-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH 09/10] VM Power Management Unit Tests(Guest)
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
                   ` (7 preceding siblings ...)
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 08/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 10/10] VM Power Management CLI(Guest) Alan Carew
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
  10 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 app/test/Makefile         |   1 +
 app/test/autotest_data.py |  13 +++
 app/test/test_power_vm.c  | 215 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 229 insertions(+)
 create mode 100644 app/test/test_power_vm.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 37a3772..39dd08e 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -120,6 +120,7 @@ endif
 SRCS-$(CONFIG_RTE_LIBRTE_METER) += test_meter.c
 SRCS-$(CONFIG_RTE_LIBRTE_KNI) += test_kni.c
 SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER_VM) += test_power_vm.c
 SRCS-y += test_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 878c72e..5c6b60b 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -425,6 +425,19 @@ non_parallel_test_group_list = [
 	]
 },
 {
+	"Prefix" :      "power_vm",
+	"Memory" :      "512",
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power VM  autotest",
+		 "Command" :    "power_vm_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
 	"Prefix" :	"lpm6",
 	"Memory" :	"512",
 	"Tests" :
diff --git a/app/test/test_power_vm.c b/app/test/test_power_vm.c
new file mode 100644
index 0000000..176fbec
--- /dev/null
+++ b/app/test/test_power_vm.c
@@ -0,0 +1,215 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <errno.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+
+#define TEST_POWER_VM_LCORE_ID      0U
+#define TEST_POWER_VM_LCORE_INVALID 64U
+
+static int
+test_power_vm(void)
+{
+	int ret;
+
+	/* Test initialisation of invalid lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_INVALID);
+	if (ret != -1) {
+		printf("rte_power_init unexpectedly succeeded on an invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore, then of the same lcore again */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_init unexpectedly failed on valid lcore %u, "
+				"please ensure that the environment has been configured "
+				"correctly and test application is running on a VM\n",
+				TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_init unexpectedly succeeded on calling init twice on "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency up of invalid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 0) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Test frequency down of invalid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 0) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Test frequency min of invalid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 0) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Test frequency max of invalid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 0) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Test frequency up of valid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_freq_up unexpectedly failed on valid lcore %u\n",
+				TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency down of valid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_freq_down unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency min of valid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_freq_min unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency max of valid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_freq_max unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test unsupported rte_power_freqs */
+	ret = rte_power_freqs(TEST_POWER_VM_LCORE_ID, NULL, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_freqs did not return the expected -ENOTSUP(%d) but "
+				"returned %d\n", -ENOTSUP, ret);
+		return -1;
+	}
+
+	/* Test unsupported rte_power_get_freq */
+	ret = rte_power_get_freq(TEST_POWER_VM_LCORE_ID);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_get_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test unsupported rte_power_set_freq */
+	ret = rte_power_set_freq(TEST_POWER_VM_LCORE_ID, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_set_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test removing of an lcore */
+	ret = rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_exit unexpectedly failed on valid lcore %u, "
+				"please ensure that the environment has been configured "
+				"correctly\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency up of previously removed lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_up unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency down of previously removed lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_down unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency min of previously removed lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_min unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency max of previously removed lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_max unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	return 0;
+}
+
+static struct test_command power_vm_cmd = {
+    .command = "power_vm_autotest",
+    .callback = test_power_vm,
+};
+REGISTER_TEST_COMMAND(power_vm_cmd);
+
-- 
1.9.3

* [dpdk-dev] [PATCH 10/10] VM Power Management CLI(Guest).
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
                   ` (8 preceding siblings ...)
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 09/10] VM Power Management Unit Tests(Guest) Alan Carew
@ 2014-09-22 18:34 ` Alan Carew
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
  10 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-22 18:34 UTC (permalink / raw)
  To: dev

Provides a small sample application (guest_vm_power_mgr) to run on a VM.
The application is run by providing a core mask (-c) and the number of memory
channels (-n). Each lcore enabled in the core mask corresponds to one lcore
channel that the application attempts to open. A maximum of 64 channels per VM
is allowed. The channels must be monitored by the host.
After successful initialisation, a CPU frequency command can be sent to the
host using:
set_cpu_freq <lcore_num> <up|down|min|max>
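
The keyword argument of set_cpu_freq selects one of four librte_power calls.
A minimal sketch of that mapping in plain C follows. It is not the code from
this patch: the enum freq_action and the function parse_freq_action are
hypothetical names, and the enum values merely stand in for
rte_power_freq_up/down/min/max so that the sketch builds without the SDK.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the four rte_power_freq_* calls. */
enum freq_action {
	FREQ_ACTION_INVALID = -1,
	FREQ_ACTION_UP,
	FREQ_ACTION_DOWN,
	FREQ_ACTION_MIN,
	FREQ_ACTION_MAX,
};

/* Map a CLI keyword to an action; unknown or NULL keywords are invalid. */
static enum freq_action
parse_freq_action(const char *keyword)
{
	static const struct {
		const char *name;
		enum freq_action action;
	} table[] = {
		{ "up",   FREQ_ACTION_UP },
		{ "down", FREQ_ACTION_DOWN },
		{ "min",  FREQ_ACTION_MIN },
		{ "max",  FREQ_ACTION_MAX },
	};
	size_t i;

	if (keyword == NULL)
		return FREQ_ACTION_INVALID;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (strcmp(keyword, table[i].name) == 0)
			return table[i].action;
	}
	return FREQ_ACTION_INVALID;
}
```

A caller would switch on the returned action and report an error for
FREQ_ACTION_INVALID rather than passing an unset return code on to the user.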

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/guest_cli/Makefile       |  56 ++++++++
 examples/vm_power_manager/guest_cli/main.c         |  87 ++++++++++++
 examples/vm_power_manager/guest_cli/main.h         |  52 +++++++
 .../guest_cli/vm_power_cli_guest.c                 | 155 +++++++++++++++++++++
 .../guest_cli/vm_power_cli_guest.h                 |  55 ++++++++
 5 files changed, 405 insertions(+)
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
new file mode 100644
index 0000000..c380c77
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = guest_vm_power_mgr
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c vm_power_cli_guest.c
+
+CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power_vm/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
new file mode 100644
index 0000000..2715778
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -0,0 +1,87 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+*/
+#include <signal.h>
+
+#include <rte_lcore.h>
+#include <rte_power.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "vm_power_cli_guest.h"
+#include "main.h"
+
+static void
+sig_handler(int signo)
+{
+	unsigned lcore_id;
+
+	printf("Received signal %d, exiting...\n", signo);
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+
+}
+
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_init(lcore_id);
+	}
+	run_cli(NULL);
+
+	return 0;
+}
diff --git a/examples/vm_power_manager/guest_cli/main.h b/examples/vm_power_manager/guest_cli/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
new file mode 100644
index 0000000..e374af6
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -0,0 +1,155 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <stdint.h>
+#include <string.h>
+#include <stdio.h>
+#include <termios.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_lcore.h>
+
+#include <rte_power.h>
+
+#include "vm_power_cli_guest.h"
+
+
+#define CHANNEL_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+			    struct cmdline *cl,
+			    __attribute__((unused)) void *data)
+{
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t lcore_id;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd, "up"))
+		ret = rte_power_freq_up(res->lcore_id);
+	else if (!strcmp(res->cmd, "down"))
+		ret = rte_power_freq_down(res->lcore_id);
+	else if (!strcmp(res->cmd, "min"))
+		ret = rte_power_freq_min(res->lcore_id);
+	else if (!strcmp(res->cmd, "max"))
+		ret = rte_power_freq_max(res->lcore_id);
+	if (ret != 0)
+		cmdline_printf(cl, "Error sending message: %s\n", strerror(ret));
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			lcore_id, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+	cl = cmdline_stdin_new(main_ctx, "vmpower(guest)> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
new file mode 100644
index 0000000..0c4bdd5
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -0,0 +1,55 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "channel_commands.h"
+
+int guest_channel_host_connect(unsigned lcore_id);
+
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.9.3

* [dpdk-dev] [PATCH v2 00/10] VM Power Management
  2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
                   ` (9 preceding siblings ...)
  2014-09-22 18:34 ` [dpdk-dev] [PATCH 10/10] VM Power Management CLI(Guest) Alan Carew
@ 2014-09-24 17:26 ` Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
                     ` (11 more replies)
  10 siblings, 12 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

Virtual Machine Power Management.

The following patches add two DPDK sample applications and an alternate
implementation of librte_power for use in virtualized environments.
The idea is to provide librte_power functionality from within a VM to address
the lack of MSRs to facilitate frequency changes from within a VM.
It is ideally suited for Haswell which provides per core frequency scaling.

The current librte_power effects frequency changes via the acpi-cpufreq
'userspace' power governor, accessed via sysfs.

General Overview:(more information in each patch that follows).
The VM Power Management solution provides two components:

 1)VM: Allows a DPDK application in a VM to reuse the librte_power
 interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
 where the re-implementation of librte_power simply forwards the requests for
 frequency change to a host based monitor. The host monitor itself uses
 librte_power.
 Each lcore channel corresponds to a
 serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
 which is opened in non-blocking mode.
 While each Virtual CPU can be mapped to multiple physical CPUs it is
 recommended that each vCPU should be mapped to a single core only.

 2)Host: The host monitor is managed by a CLI. It allows for adding qemu/KVM
 virtual machines and associated channels to the monitor, manually changing
 CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
 channels.
 Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
 sockets which follow a specific naming convention,
 i.e. /tmp/powermonitor/<vm_name>.<channel_number>.
 Each channel has a 1:1 mapping to a VM endpoint,
 i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>
 Host channel endpoints are opened in non-blocking mode and are monitored via epoll.
 Requests over each channel to change frequency are forwarded to the original
 librte_power.
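
The two naming conventions above can be sketched as a pair of small helpers.
This is only an illustration in plain C and is not taken from the patches: the
function names guest_channel_path and host_channel_path and the two macros are
hypothetical, and the sketch assumes the 1:1 mapping means the host channel
number equals the guest lcore number.

```c
#include <stdio.h>

/* Path prefixes as given in the cover letter. */
#define GUEST_CHANNEL_PREFIX "/dev/virtio-ports/virtio.serial.port.poweragent"
#define HOST_CHANNEL_DIR     "/tmp/powermonitor"

/* Build the guest-side device path for an lcore.
 * Returns 0 on success, -1 if the buffer is too small. */
static int
guest_channel_path(char *buf, size_t len, unsigned lcore_id)
{
	int n = snprintf(buf, len, "%s.%u", GUEST_CHANNEL_PREFIX, lcore_id);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Build the host-side AF_UNIX socket path for a VM channel.
 * Returns 0 on success, -1 if the buffer is too small. */
static int
host_channel_path(char *buf, size_t len, const char *vm_name,
		unsigned channel_num)
{
	int n = snprintf(buf, len, "%s/%s.%u", HOST_CHANNEL_DIR, vm_name,
			channel_num);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Checking the snprintf return value against the buffer length avoids silently
opening a truncated path when a VM name is unexpectedly long.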

Channels must be manually configured as qemu-kvm command line arguments or
in the libvirt domain definition (XML), e.g.
<controller type='virtio-serial' index='0'>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<channel type='unix'>
  <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
  <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>'/>
  <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
</channel>

Multiple channels can be configured by specifying multiple <channel>
elements, replacing <vm_name> and <channel_num> in each.
<N> (the port number) should be incremented by 1 for each new channel element.
More information on Virtio-Serial can be found here:
http://fedoraproject.org/wiki/Features/VirtioSerial
To enable the Hypervisor creation of channels, the host endpoint directory
must be created with qemu permissions:
mkdir /tmp/powermonitor
chown qemu:qemu /tmp/powermonitor
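
When many channels are needed, the <channel> elements described above can be
generated rather than written by hand. The following is a rough sketch in
plain C, not part of the patch set: format_channel_xml is a hypothetical
helper, and the caller is assumed to pick the port number for each element
(incrementing it by 1 per channel), since the starting port is not specified
here.

```c
#include <stdio.h>

/* Format one libvirt <channel> element into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
static int
format_channel_xml(char *buf, size_t len, const char *vm_name,
		unsigned channel_num, unsigned port)
{
	int n = snprintf(buf, len,
		"<channel type='unix'>\n"
		"  <source mode='bind' path='/tmp/powermonitor/%s.%u'/>\n"
		"  <target type='virtio' "
		"name='virtio.serial.port.poweragent.%u'/>\n"
		"  <address type='virtio-serial' controller='0' bus='0' "
		"port='%u'/>\n"
		"</channel>\n",
		vm_name, channel_num, channel_num, port);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Calling the helper in a loop over channel numbers, with the port advanced by 1
on each iteration, yields one element per lcore channel.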

The host application runs on two separate lcores:
Core N) CLI: For management of Virtual Machines, adding channels to the Monitor
 thread, inspecting state and manually setting CPU frequency [PATCH 02/10]
Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel events
 from VMs and calls the corresponding librte_power functions.

A sample application is also provided to run on Virtual Machines. This
application provides a CLI to manually set the frequency of a
vCPU [PATCH 05/10]

The current l3fwd-power sample application can also be run on a VM.

Changes in V2:
 Runtime selection of librte_power implementations.
 Updated Unit tests to cover librte_power changes.
 PATCH[0/3] was sent twice, again as PATCH[0/4]
 Miscellaneous fixes.

Alan Carew (10):
  Channel Manager and Monitor for VM Power Management(Host).
  VM Power Management CLI(Host).
  CPU Frequency Power Management(Host).
  VM Power Management application and Makefile.
  VM Power Management CLI(Guest).
  VM communication channels for VM Power Management(Guest).
  librte_power common interface for Guest and Host
  Packet format for VM Power Management(Host and Guest).
  Build system integration for VM Power Management(Guest and Host)
  VM Power Management Unit Tests

 app/test/Makefile                                  |   3 +-
 app/test/autotest_data.py                          |  26 +
 app/test/test_power.c                              | 445 ++------------
 app/test/test_power_acpi_cpufreq.c                 | 544 +++++++++++++++++
 app/test/test_power_kvm_vm.c                       | 308 ++++++++++
 examples/vm_power_manager/Makefile                 |  57 ++
 examples/vm_power_manager/channel_manager.c        | 645 +++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h        | 273 +++++++++
 examples/vm_power_manager/channel_monitor.c        | 228 ++++++++
 examples/vm_power_manager/channel_monitor.h        | 102 ++++
 examples/vm_power_manager/guest_cli/Makefile       |  56 ++
 examples/vm_power_manager/guest_cli/main.c         |  86 +++
 examples/vm_power_manager/guest_cli/main.h         |  52 ++
 .../guest_cli/vm_power_cli_guest.c                 | 155 +++++
 .../guest_cli/vm_power_cli_guest.h                 |  55 ++
 examples/vm_power_manager/main.c                   | 113 ++++
 examples/vm_power_manager/main.h                   |  52 ++
 examples/vm_power_manager/power_manager.c          | 244 ++++++++
 examples/vm_power_manager/power_manager.h          | 186 ++++++
 examples/vm_power_manager/vm_power_cli.c           | 568 ++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h           |  47 ++
 lib/librte_power/Makefile                          |   3 +-
 lib/librte_power/channel_commands.h                |  68 +++
 lib/librte_power/guest_channel.c                   | 162 ++++++
 lib/librte_power/guest_channel.h                   |  89 +++
 lib/librte_power/rte_power.c                       | 540 +++--------------
 lib/librte_power/rte_power.h                       | 120 +++-
 lib/librte_power/rte_power_acpi_cpufreq.c          | 545 +++++++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h          | 192 ++++++
 lib/librte_power/rte_power_common.h                |  39 ++
 lib/librte_power/rte_power_kvm_vm.c                | 160 +++++
 lib/librte_power/rte_power_kvm_vm.h                | 179 ++++++
 32 files changed, 5430 insertions(+), 912 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h
 create mode 100644 lib/librte_power/channel_commands.h
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

-- 
1.9.3

* [dpdk-dev] [PATCH v2 01/10] Channel Manager and Monitor for VM Power Management(Host).
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 02/10] VM Power Management CLI(Host) Alan Carew
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

The manager is responsible for adding communication channels to the Monitor
thread and for tracking and reporting VM state. It employs the libvirt API for
synchronization with the KVM Hypervisor. The manager interacts with the
Hypervisor to discover the mapping of virtual CPUs (vCPUs) to the host
physical CPUs (pCPUs) and to inspect the VM running state.

The manager provides the following functionality to the CLI:
1) Connect to a libvirtd instance, default: qemu:///system
2) Add a VM to an internal list. Each VM is identified by a "name", which must
   correspond to a valid libvirt Domain Name.
3) Add communication channels associated with a VM to the epoll based Monitor
   thread.
   The channels must exist and be in the form of:
   /tmp/powermonitor/<vm_name>.<channel_number>. Each channel is a
   Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
   non-blocking mode.
   Each VM can have a maximum of 64 channels associated with it.
4) Disable or re-enable VM communication channels. Once added to the Monitor
   thread, channels remain under that thread's control; however, acting on
   channel requests can be disabled and re-enabled via the CLI.
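
The channel naming convention in item 3 can be checked with a small parser.
The following is a minimal sketch under stated assumptions, not the code from
this patch: parse_channel_name() and its parameters are hypothetical names
introduced for illustration, and the limit of 64 channels per VM is taken from
the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_VM_CHANNELS 64 /* per-VM channel limit stated above */

/*
 * Hypothetical helper: parse "<vm_name>.<channel_number>", i.e. the file name
 * part of /tmp/powermonitor/<vm_name>.<channel_number>.
 * Returns 0 and stores the channel number on success, -1 otherwise.
 */
static int
parse_channel_name(const char *file_name, const char *vm_name,
		unsigned *channel_num)
{
	size_t name_len = strlen(vm_name);
	const char *digits;
	char *end;
	unsigned long val;

	assert(channel_num != NULL);

	/* The file name must start with "<vm_name>." */
	if (strncmp(file_name, vm_name, name_len) != 0 ||
			file_name[name_len] != '.')
		return -1;

	digits = file_name + name_len + 1;
	/* Reject empty, signed or whitespace-prefixed numbers */
	if (*digits < '0' || *digits > '9')
		return -1;

	errno = 0;
	val = strtoul(digits, &end, 10);
	/* Reject overflow, trailing characters and out-of-range channels */
	if (errno != 0 || *end != '\0' || val >= MAX_VM_CHANNELS)
		return -1;

	*channel_num = (unsigned)val;
	return 0;
}
```

Unlike the strsep() based parsing in the patch, this sketch matches the VM name
as a prefix and accepts decimal channel numbers only.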

The monitor is an epoll-based infinite loop running in a separate thread that
waits on channel events from VMs and calls the corresponding functions. Channel
definitions from the manager are registered via the epoll event opaque pointer
when calling epoll_ctl(EPOLL_CTL_ADD). This allows the monitor to obtain the
channel's file descriptor for reading on EPOLLIN events and to map the vCPU
associated with a request from a particular VM to its pCPU(s).
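
The opaque-pointer registration described above can be sketched as follows.
This is an illustration only, assuming Linux epoll: struct demo_channel,
demo_add_channel() and demo_wait_channel() are hypothetical names and are
deliberately simpler than the struct channel_info and monitor functions added
by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical, trimmed-down channel definition for this sketch */
struct demo_channel {
	int fd;               /* descriptor to read requests from */
	unsigned channel_num; /* which channel this is */
};

/* Register chan with the epoll instance; the pointer travels in data.ptr */
static int
demo_add_channel(int epoll_fd, struct demo_channel *chan)
{
	struct epoll_event event;

	assert(chan != NULL);
	event.events = EPOLLIN;
	event.data.ptr = chan;
	return epoll_ctl(epoll_fd, EPOLL_CTL_ADD, chan->fd, &event);
}

/*
 * Wait up to timeout_ms for one event and return the channel it belongs to,
 * or NULL on timeout, error or an event without EPOLLIN.
 */
static struct demo_channel *
demo_wait_channel(int epoll_fd, int timeout_ms)
{
	struct epoll_event event;

	if (epoll_wait(epoll_fd, &event, 1, timeout_ms) != 1)
		return NULL;
	if (!(event.events & EPOLLIN))
		return NULL;
	/* The kernel hands back exactly the pointer stored at registration */
	return event.data.ptr;
}
```

Because the kernel returns the registered pointer unchanged, the event loop
needs no fd-to-channel lookup table: the channel's descriptor and its owning VM
are reachable directly from the event.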

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/channel_manager.c | 645 ++++++++++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h | 273 ++++++++++++
 examples/vm_power_manager/channel_monitor.c | 228 ++++++++++
 examples/vm_power_manager/channel_monitor.h | 102 +++++
 4 files changed, 1248 insertions(+)
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h

diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
new file mode 100644
index 0000000..fdb0ea5
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.c
@@ -0,0 +1,645 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/select.h>
+
+#include <rte_config.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_log.h>
+#include <rte_atomic.h>
+#include <rte_spinlock.h>
+
+#include <libvirt/libvirt.h>
+
+#include "channel_manager.h"
+#include "channel_commands.h"
+#include "channel_monitor.h"
+
+
+#define SOCKET_PATH "/tmp/powermonitor/"
+
+#define RTE_LOGTYPE_CHANNEL_MANAGER RTE_LOGTYPE_USER1
+
+#define ITERATIVE_BITMASK_CHECK_64(mask_u64b, i) \
+		for (i = 0; mask_u64b; mask_u64b &= ~(1ULL << i++)) \
+		if ((mask_u64b >> i) & 1) \
+
+/* Global pointer to libvirt connection */
+static virConnectPtr global_vir_conn_ptr;
+
+/*
+ * Represents a single Virtual Machine
+ */
+struct virtual_machine_info {
+	char name[MAX_NAME_LEN];
+	uint64_t pcpu_mask[MAX_VCPU];
+	struct channel_info *channels[MAX_VM_CHANNELS];
+	uint64_t channel_mask;
+	uint8_t num_channels;
+	enum vm_status status;
+	virDomainPtr domainPtr;
+	virDomainInfo info;
+	rte_spinlock_t config_spinlock;
+	LIST_ENTRY(virtual_machine_info) vms_info;
+};
+
+LIST_HEAD(, virtual_machine_info) vm_list_head;
+
+static struct virtual_machine_info *
+find_domain_by_name(const char *name)
+{
+	struct virtual_machine_info *info;
+	LIST_FOREACH(info, &vm_list_head, vms_info) {
+		if (!strncmp(info->name, name, MAX_NAME_LEN-1))
+			return info;
+	}
+	return NULL;
+}
+
+static void
+disconnect_hypervisor(void)
+{
+	if (global_vir_conn_ptr != NULL) {
+		virConnectClose(global_vir_conn_ptr);
+		global_vir_conn_ptr = NULL;
+	}
+}
+
+static int
+connect_hypervisor(const char *path)
+{
+	if (global_vir_conn_ptr != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection "
+				"already established\n", path);
+		return -1;
+	}
+	global_vir_conn_ptr = virConnectOpen(path);
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error failed to open connection to "
+				"Hypervisor '%s'\n", path);
+		return -1;
+	}
+	return 0;
+}
+
+static int
+update_pcpus_mask(struct virtual_machine_info *vm_info)
+{
+	size_t maplen = VIR_CPU_MAPLEN(RTE_MAX_LCORE);
+	unsigned char cpumap[RTE_MAX_LCORE*maplen];
+	unsigned int i, j;
+
+	if (virDomainGetVcpuPinInfo(vm_info->domainPtr,
+			vm_info->info.nrVirtCpu,
+			cpumap,
+			maplen,
+			VIR_DOMAIN_DEVICE_MODIFY_CURRENT) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting Vcpu Pin Info\n");
+		return -1;
+	}
+	for (i = 0; i < vm_info->info.nrVirtCpu; i++) {
+		vm_info->pcpu_mask[i] = 0;
+		for (j = 0; j < 64 && j < RTE_MAX_LCORE; j++) /* one bit per pCPU */
+			vm_info->pcpu_mask[i] |= (uint64_t)!!VIR_CPU_USABLE(cpumap, maplen, i, j) << j;
+	}
+	return 0;
+}
+
+uint64_t
+get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu)
+{
+	struct virtual_machine_info *vm_info =
+			(struct virtual_machine_info *)chan_info->priv_info;
+	return vm_info->pcpu_mask[vcpu];
+}
+
+static inline int
+channel_exists(struct virtual_machine_info *vm_info, unsigned int channel_num)
+{
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	if (vm_info->channel_mask & (1ULL << channel_num)) {
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+		return 1;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+int
+add_vm(const char *vm_name)
+{
+	struct virtual_machine_info *new_domain;
+	virDomainPtr dom_ptr;
+
+	if (find_domain_by_name(vm_name) != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add VM: VM '%s' "
+				"already exists\n", vm_name);
+		return -1;
+	}
+
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "No connection to hypervisor exists\n");
+		return -1;
+	}
+	dom_ptr = virDomainLookupByName(global_vir_conn_ptr, vm_name);
+	if (dom_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error on VM lookup with libvirt: "
+				"VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	new_domain = rte_malloc("virtual_machine_info", sizeof(*new_domain),
+			CACHE_LINE_SIZE);
+	if (new_domain == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to allocate memory for VM "
+				"info\n");
+		return -1;
+	}
+	new_domain->domainPtr = dom_ptr;
+	if (virDomainGetInfo(new_domain->domainPtr, &new_domain->info) != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get libvirt VM info\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	if (new_domain->info.nrVirtCpu > MAX_VCPU) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error the number of virtual CPUs(%u) is "
+				"greater than allowable(%d)\n", new_domain->info.nrVirtCpu,
+				MAX_VCPU);
+		rte_free(new_domain);
+		return -1;
+	}
+	if (update_pcpus_mask(new_domain) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting physical CPU pinning\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	snprintf(new_domain->name, sizeof(new_domain->name), "%s", vm_name);
+	new_domain->channel_mask = 0;
+	new_domain->num_channels = 0;
+
+	if (!virDomainIsActive(dom_ptr))
+		new_domain->status = VM_INACTIVE;
+	else
+		new_domain->status = VM_ACTIVE;
+
+	rte_spinlock_init(&(new_domain->config_spinlock));
+	LIST_INSERT_HEAD(&vm_list_head, new_domain, vms_info);
+	return 0;
+}
+
+static int
+open_non_blocking_channel(struct channel_info *info)
+{
+	int ret, flags;
+	struct sockaddr_un sock_addr;
+	fd_set soc_fd_set;
+	struct timeval tv;
+
+	info->fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (info->fd == -1) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error(%s) creating socket for '%s'\n",
+				strerror(errno),
+				info->channel_path);
+		return -1;
+	}
+	sock_addr.sun_family = AF_UNIX;
+	memcpy(&sock_addr.sun_path, info->channel_path,
+			strlen(info->channel_path)+1);
+
+	/* Get current flags */
+	flags = fcntl(info->fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) fcntl get flags socket for "
+				"'%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	/* Set to Non Blocking */
+	flags |= O_NONBLOCK;
+	if (fcntl(info->fd, F_SETFL, flags) < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) setting non-blocking "
+				"socket for '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	ret = connect(info->fd, (struct sockaddr *)&sock_addr,
+			sizeof(sock_addr));
+	if (ret < 0) {
+		/* ECONNREFUSED error is given when VM is not active */
+		if (errno == ECONNREFUSED) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "VM is not active or has not "
+					"activated its endpoint to channel %s\n",
+					info->channel_path);
+			return -1;
+		}
+		/* Wait for tv_sec if in progress */
+		else if (errno == EINPROGRESS) {
+			tv.tv_sec = 2;
+			tv.tv_usec = 0;
+			FD_ZERO(&soc_fd_set);
+			FD_SET(info->fd, &soc_fd_set);
+			if (select(info->fd+1, NULL, &soc_fd_set, NULL, &tv) <= 0) {
+				RTE_LOG(WARNING, CHANNEL_MANAGER, "Timeout or error on channel "
+						"'%s'\n", info->channel_path);
+				return -1;
+			}
+		} else {
+			/* Any other error */
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) connecting socket"
+					" for '%s'\n", strerror(errno), info->channel_path);
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static int
+setup_channel_info(struct virtual_machine_info **vm_info_dptr,
+		struct channel_info **chan_info_dptr, unsigned channel_num)
+{
+	struct channel_info *chan_info = *chan_info_dptr;
+	struct virtual_machine_info *vm_info = *vm_info_dptr;
+
+	chan_info->channel_num = channel_num;
+	chan_info->priv_info = (void *)vm_info;
+	chan_info->status = CHANNEL_DISCONNECTED;
+	if (open_non_blocking_channel(chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not open channel: "
+				"'%s' for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+	}
+	if (add_channel_to_monitor(&chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not add channel: "
+				"'%s' to epoll ctl for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+
+	}
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->num_channels++;
+	vm_info->channel_mask |= 1ULL << channel_num;
+	vm_info->channels[channel_num] = chan_info;
+	chan_info->status = CHANNEL_CONNECTED;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+int
+add_all_channels(const char *vm_name)
+{
+	DIR *d;
+	struct dirent *dir;
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char *token, *remaining, *tail_ptr;
+	char socket_name[PATH_MAX];
+	unsigned channel_num;
+	int num_channels_enabled = 0;
+
+	/* verify VM exists */
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' not found"
+				" during channel discovery\n", vm_name);
+		return 0;
+	}
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = VM_INACTIVE;
+		return 0;
+	}
+	d = opendir(SOCKET_PATH);
+	if (d == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error opening directory '%s': %s\n",
+				SOCKET_PATH, strerror(errno));
+		return -1;
+	}
+	while ((dir = readdir(d)) != NULL) {
+		if (!strncmp(dir->d_name, ".", 1) ||
+				!strncmp(dir->d_name, "..", 2))
+			continue;
+
+		snprintf(socket_name, sizeof(socket_name), "%s", dir->d_name);
+		remaining = socket_name;
+		/* Extract vm_name from "<vm_name>.<channel_num>" */
+		token = strsep(&remaining, ".");
+		if (remaining == NULL)
+			continue;
+		if (strncmp(vm_name, token, MAX_NAME_LEN))
+			continue;
+
+		/* remaining should contain only <channel_num> */
+		errno = 0;
+		channel_num = (unsigned int)strtol(remaining, &tail_ptr, 0);
+		if ((errno != 0) || (remaining[0] == '\0') ||
+				(*tail_ptr != '\0') || tail_ptr == NULL) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Malformed channel name "
+					"'%s' found it should be in the form of "
+					"'<guest_name>.<channel_num>(decimal)'\n",
+					dir->d_name);
+			continue;
+		}
+		if (channel_num >= MAX_VM_CHANNELS) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Channel number(%u) is "
+					"greater than max allowable: %d, skipping '%s%s'\n",
+					channel_num, MAX_VM_CHANNELS-1, SOCKET_PATH,
+					dir->d_name);
+			continue;
+		}
+		/* if channel has not been added previously */
+		if (channel_exists(vm_info, channel_num))
+			continue;
+
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s%s'\n", SOCKET_PATH, dir->d_name);
+			continue;
+		}
+
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s", SOCKET_PATH,
+				dir->d_name);
+
+		if (setup_channel_info(&vm_info, &chan_info, channel_num) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+
+		num_channels_enabled++;
+	}
+	closedir(d);
+	return num_channels_enabled;
+}
+
+int
+add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char socket_path[PATH_MAX];
+	unsigned i;
+	int num_channels_enabled = 0;
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add channels: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = VM_INACTIVE;
+		return 0;
+	}
+	for (i = 0; i < len_channel_list; i++) {
+		if (channel_list[i] >= MAX_VM_CHANNELS)
+			continue;
+		if (channel_exists(vm_info, channel_list[i])) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel already exists, skipping "
+					"'%s.%u'\n", vm_name, channel_list[i]);
+			continue;
+		}
+
+		snprintf(socket_path, sizeof(socket_path), "%s%s.%u", SOCKET_PATH,
+				vm_name, channel_list[i]);
+		errno = 0;
+		if (access(socket_path, F_OK) < 0) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Channel path '%s' error: "
+					"%s\n", socket_path, strerror(errno));
+			continue;
+		}
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s'\n", socket_path);
+			continue;
+		}
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s.%u",
+				SOCKET_PATH, vm_name, channel_list[i]);
+		if (setup_channel_info(&vm_info, &chan_info, channel_list[i]) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+		num_channels_enabled++;
+
+	}
+	return num_channels_enabled;
+}
+
+int
+remove_channel(struct channel_info *chan_info)
+{
+	struct virtual_machine_info *vm_info;
+	close(chan_info->fd);
+	vm_info = (struct virtual_machine_info *)chan_info->priv_info;
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->channel_mask &= ~(1ULL << chan_info->channel_num);
+	vm_info->num_channels--;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	rte_free(chan_info);
+	return 0;
+}
+
+int
+remove_vm(const char *vm_name)
+{
+	struct virtual_machine_info *vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM: VM '%s' "
+				"not found\n", vm_name);
+		return -1;
+	}
+	rte_spinlock_lock(&vm_info->config_spinlock);
+	if (vm_info->num_channels != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM '%s', there are "
+				"%"PRIu8" channels still active\n",
+				vm_name, vm_info->num_channels);
+		rte_spinlock_unlock(&vm_info->config_spinlock);
+		return -1;
+	}
+	LIST_REMOVE(vm_info, vms_info);
+	rte_spinlock_unlock(&vm_info->config_spinlock);
+	rte_free(vm_info);
+	return 0;
+}
+
+int
+set_channel_status_all(const char *vm_name, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned int i;
+	uint64_t mask;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_CONNECTED || status == CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to disable channels: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		vm_info->channels[i]->status = status;
+		num_channels_changed++;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return num_channels_changed;
+
+}
+
+int
+set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_CONNECTED || status == CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set channel status: VM "
+				"'%s' not found\n", vm_name);
+		return 0;
+	}
+	for (i = 0; i < len_channel_list; i++) {
+		if (channel_exists(vm_info, channel_list[i])) {
+			rte_spinlock_lock(&(vm_info->config_spinlock));
+			vm_info->channels[channel_list[i]]->status = status;
+			rte_spinlock_unlock(&(vm_info->config_spinlock));
+			num_channels_changed++;
+		}
+	}
+	return num_channels_changed;
+}
+
+int
+get_info_vm(const char *vm_name, struct vm_info *info)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i, channel_num = 0;
+	uint64_t mask;
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+	info->status = VM_ACTIVE;
+	if (!virDomainIsActive(vm_info->domainPtr))
+		info->status = VM_INACTIVE;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		info->channels[channel_num].channel_num = i;
+		memcpy(info->channels[channel_num].channel_path,
+				vm_info->channels[i]->channel_path, PATH_MAX);
+		info->channels[channel_num].status = vm_info->channels[i]->status;
+		info->channels[channel_num].fd = vm_info->channels[i]->fd;
+		channel_num++;
+	}
+	info->num_channels = channel_num;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	info->num_vcpus = vm_info->info.nrVirtCpu;
+
+	memcpy(info->name, vm_info->name, sizeof(vm_info->name));
+
+	if (update_pcpus_mask(vm_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' unable to update vCPU Pinning\n",
+				vm_name);
+		return -1;
+	}
+	for (i = 0; i < info->num_vcpus; i++) {
+		info->pcpu_mask[i] = vm_info->pcpu_mask[i];
+	}
+	return 0;
+}
+
+int
+channel_manager_init(const char *path)
+{
+	LIST_INIT(&vm_list_head);
+	if (connect_hypervisor(path) < 0)
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to initialize channel manager\n");
+	return 0;
+}
+
+void
+channel_manager_exit(void)
+{
+	unsigned int i;
+	uint64_t mask;
+	struct virtual_machine_info *vm_info;
+
+	while ((vm_info = LIST_FIRST(&vm_list_head)) != NULL) {
+
+		rte_spinlock_lock(&(vm_info->config_spinlock));
+		mask = vm_info->channel_mask;
+		ITERATIVE_BITMASK_CHECK_64(mask, i) {
+			remove_channel_from_monitor(vm_info->channels[i]);
+			close(vm_info->channels[i]->fd);
+			rte_free(vm_info->channels[i]);
+		}
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+		LIST_REMOVE(vm_info, vms_info);
+		rte_free(vm_info);
+	}
+	disconnect_hypervisor();
+}
diff --git a/examples/vm_power_manager/channel_manager.h b/examples/vm_power_manager/channel_manager.h
new file mode 100644
index 0000000..d114201
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.h
@@ -0,0 +1,273 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MANAGER_H_
+#define CHANNEL_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <linux/limits.h>
+#include <rte_atomic.h>
+
+/* Maximum number of channels per VM */
+#define MAX_VM_CHANNELS 64
+
+/* Maximum name length including '\0' terminator */
+#define MAX_NAME_LEN    64
+
+/* Maximum number of virtual CPUs that can be assigned to a VM */
+#define MAX_VCPU        64
+
+/* Default Hypervisor Path for libvirt(qemu/KVM) */
+#define DEFAULT_HV_PATH "qemu:///system"
+
+/* Communication Channel Status */
+enum channel_status { CHANNEL_DISCONNECTED = 0, CHANNEL_CONNECTED,
+	CHANNEL_DISABLED, CHANNEL_PROCESSING};
+
+/* VM libvirt(qemu/KVM) connection status */
+enum vm_status { VM_INACTIVE = 0, VM_ACTIVE};
+
+/*
+ *  Represents a single and exclusive VM channel that exists between a guest and
+ *  the host.
+ */
+struct channel_info {
+	char channel_path[PATH_MAX]; /**< Path to host socket */
+	volatile uint32_t status;    /**< Connection status(enum channel_status) */
+	int fd;                      /**< AF_UNIX socket fd */
+	unsigned int channel_num;    /**< /tmp/powermonitor/<vm_name>.channel_num */
+	void *priv_info;             /**< Pointer to private info, do not modify */
+};
+
+/* Represents a single VM instance used to return internal information about
+ * a VM */
+struct vm_info {
+	char name[MAX_NAME_LEN];                      /**< VM name */
+	enum vm_status status;                        /**< libvirt status */
+	uint64_t pcpu_mask[MAX_VCPU];                 /**< pCPU mask for each vCPU */
+	unsigned num_vcpus;                           /**< number of vCPUS */
+	struct channel_info channels[MAX_VM_CHANNELS];/**< Array of channel_info */
+	unsigned num_channels;                        /**< Number of channels */
+};
+
+/**
+ * Initialize the Channel Manager resources and connect to the Hypervisor
+ * specified in path.
+ * This must be successfully called first before calling any other functions.
+ * It must only be called once.
+ *
+ * @param path
+ *  Must be a local path, e.g. qemu:///system.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_manager_init(const char *path);
+
+/**
+ * Free resources associated with the Channel Manager:
+ *  - every channel is removed from the monitor and its socket is closed,
+ *  - every VM definition is released,
+ *  - the connection to the Hypervisor is closed.
+ *
+ * @return
+ *  None
+ */
+void channel_manager_exit(void);
+
+/**
+ * Get the physical CPU mask for the VM lcore channel (vcpu); the mask is
+ * returned to the caller.
+ * It is not thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info
+ *
+ * @param vcpu
+ *  The virtual CPU to query.
+ *
+ *
+ * @return
+ *  - 0 on error.
+ *  - >0 on success.
+ */
+uint64_t get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu);
+
+/**
+ * Add a VM as specified by name to the Channel Manager. The name must
+ * correspond to a valid libvirt domain name.
+ * This is required prior to adding channels.
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_vm(const char *name);
+
+/**
+ * Remove a previously added Virtual Machine from the Channel Manager
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_vm(const char *name);
+
+/**
+ * Add all available channels to the VM as specified by name.
+ * Only channels whose paths have the form
+ * /tmp/powermonitor/<vm_name>.<channel_number> are parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ */
+int add_all_channels(const char *vm_name);
+
+/**
+ * Add the channel numbers in channel_list to the domain specified by name.
+ * Only channels whose paths have the form
+ * /tmp/powermonitor/<vm_name>.<channel_number> are parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to add channels to.
+ *
+ * @param channel_list
+ *  Pointer to a list of unsigned integers, the channel numbers to add.
+ *  It must be allocated outside of this function.
+ *
+ * @param num_channels
+ *  The number of channel numbers in channel_list
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ *  - 0 for error
+ */
+int add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned num_channels);
+
+/**
+ * Remove a channel definition from the channel manager. This must only be
+ * called from the channel monitor thread.
+ *
+ * @param chan_info
+ *  Pointer to a valid struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel(struct channel_info *chan_info);
+
+/**
+ * For all channels associated with a Virtual Machine name, update the
+ * connection status. Valid states are CHANNEL_CONNECTED or CHANNEL_DISABLED
+ * only.
+ *
+ *
+ * @param name
+ *  Virtual Machine name to modify all channels.
+ *
+ * @param status
+ *  The status to set on each channel.
+ *  Must be one of:
+ *   - CHANNEL_CONNECTED
+ *   - CHANNEL_DISABLED
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status_all(const char *name, enum channel_status status);
+
+/**
+ * For all channels in channel_list associated with a Virtual Machine name
+ * update the connection status of each.
+ * Valid states are CHANNEL_CONNECTED or CHANNEL_DISABLED only.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name whose channels are to be modified.
+ *
+ * @param channel_list
+ *  Pointer to a list of unsigned integers, representing the channel numbers to
+ *  modify.
+ *  It must be allocated outside of this function.
+ *
+ * @param len_channel_list
+ *  The number of channel numbers in channel_list
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status);
+
+/**
+ * Populates the struct vm_info pointed to by info with details of vm_name.
+ *
+ * @param vm_name
+ *  The name of the virtual machine to lookup.
+ *
+ * @param info
+ *  Pointer to a struct vm_info; it must be allocated prior to calling this
+ *  function.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int get_info_vm(const char *vm_name, struct vm_info *info);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_MANAGER_H_ */
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
new file mode 100644
index 0000000..d30acbf
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -0,0 +1,228 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <signal.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/epoll.h>
+#include <sys/queue.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+
+
+#include "channel_monitor.h"
+#include "channel_commands.h"
+#include "channel_manager.h"
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+
+static volatile unsigned run_loop = 1;
+static int global_event_fd;
+static struct epoll_event *global_events_list;
+
+void channel_monitor_exit(void)
+{
+	run_loop = 0;
+	rte_free(global_events_list);
+}
+
+static int
+process_request(struct channel_packet *pkt, struct channel_info *chan_info)
+{
+	uint64_t core_mask;
+
+	if (chan_info == NULL)
+		return -1;
+
+	if (rte_atomic32_cmpset(&(chan_info->status), CHANNEL_CONNECTED,
+			CHANNEL_PROCESSING) == 0)
+		return -1;
+
+	if (pkt->command == CPU_POWER) {
+		core_mask = get_pcpus_mask(chan_info, pkt->resource_id);
+		if (core_mask == 0) {
+			RTE_LOG(ERR, CHANNEL_MONITOR, "Error getting physical CPU "
+					"mask for channel '%s' using vCPU(%u)\n",
+					chan_info->channel_path, (unsigned)pkt->resource_id);
+			return -1;
+		}
+		if (__builtin_popcountll(core_mask) == 1) {
+
+			unsigned core_num = __builtin_ffsll(core_mask) - 1;
+
+			switch (pkt->unit) {
+			case CPU_SCALE_MIN:
+				power_manager_scale_core_min(core_num);
+				break;
+			case CPU_SCALE_MAX:
+				power_manager_scale_core_max(core_num);
+				break;
+			case CPU_SCALE_DOWN:
+				power_manager_scale_core_down(core_num);
+				break;
+			case CPU_SCALE_UP:
+				power_manager_scale_core_up(core_num);
+				break;
+			default:
+				break;
+			}
+		} else {
+			switch (pkt->unit) {
+			case CPU_SCALE_MIN:
+				power_manager_scale_mask_min(core_mask);
+				break;
+			case CPU_SCALE_MAX:
+				power_manager_scale_mask_max(core_mask);
+				break;
+			case CPU_SCALE_DOWN:
+				power_manager_scale_mask_down(core_mask);
+				break;
+			case CPU_SCALE_UP:
+				power_manager_scale_mask_up(core_mask);
+				break;
+			default:
+				break;
+			}
+
+		}
+	}
+	/* Return is not checked as channel status may have been set to DISABLED
+	 * from management thread
+	 */
+	rte_atomic32_cmpset(&(chan_info->status), CHANNEL_PROCESSING,
+			CHANNEL_CONNECTED);
+	return 0;
+
+}
+
+int
+add_channel_to_monitor(struct channel_info **chan_info)
+{
+	struct channel_info *info = *chan_info;
+	struct epoll_event event;
+	event.events = EPOLLIN;
+	event.data.ptr = info;
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_ADD, info->fd, &event) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to add channel '%s' "
+				"to epoll\n", info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+remove_channel_from_monitor(struct channel_info *chan_info)
+{
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_DEL, chan_info->fd, NULL) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to remove channel '%s' "
+				"from epoll\n", chan_info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_monitor_init(void)
+{
+	global_event_fd = epoll_create1(0);
+	if (global_event_fd < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Error creating epoll context with "
+				"error %s\n", strerror(errno));
+		return -1;
+	}
+	global_events_list = rte_malloc("epoll_events", sizeof(*global_events_list)
+			* MAX_EVENTS, CACHE_LINE_SIZE);
+	if (global_events_list == NULL) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for "
+				"epoll events\n");
+		return -1;
+	}
+	return 0;
+}
+
+void
+run_channel_monitor(void)
+{
+	while (run_loop) {
+		int n_events, i;
+		n_events = epoll_wait(global_event_fd, global_events_list,
+				MAX_EVENTS, 1);
+		if (!run_loop)
+			break;
+		for (i = 0; i < n_events; i++) {
+			struct channel_info *chan_info = (struct channel_info *)
+					global_events_list[i].data.ptr;
+			if ((global_events_list[i].events & EPOLLERR) ||
+					(global_events_list[i].events & EPOLLHUP)) {
+				remove_channel(chan_info);
+				continue;
+			}
+			if (global_events_list[i].events & EPOLLIN) {
+				int n_bytes, err = 0;
+				struct channel_packet pkt;
+				void *buffer = &pkt;
+				int buffer_len = sizeof(pkt);
+				while (buffer_len > 0) {
+					n_bytes = read(chan_info->fd, buffer, buffer_len);
+					if (n_bytes == buffer_len)
+						break;
+					if (n_bytes == -1) {
+						err = errno;
+						RTE_LOG(DEBUG, CHANNEL_MONITOR, "Received error on "
+								"channel '%s' read: %s\n",
+								chan_info->channel_path, strerror(err));
+						remove_channel(chan_info);
+						break;
+					}
+					buffer = (char *)buffer + n_bytes;
+					buffer_len -= n_bytes;
+				}
+				if (!err)
+					process_request(&pkt, chan_info);
+			}
+		}
+	}
+}
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
new file mode 100644
index 0000000..c138607
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -0,0 +1,102 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MONITOR_H_
+#define CHANNEL_MONITOR_H_
+
+#include "channel_manager.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Channel Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_monitor_init(void);
+
+/**
+ * Run the channel monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_channel_monitor(void);
+
+/**
+ * Exit the Channel Monitor, exiting the epoll_wait loop and events processing.
+ *
+ *
+ * @return
+ *  None
+ */
+void channel_monitor_exit(void);
+
+/**
+ * Add an open channel to monitor via epoll. A pointer to struct channel_info
+ * will be registered with epoll for event processing.
+ * It is thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info pointer.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_channel_to_monitor(struct channel_info **chan_info);
+
+/**
+ * Remove a previously added channel from epoll control.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel_from_monitor(struct channel_info *chan_info);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* CHANNEL_MONITOR_H_ */
-- 
1.9.3

* [dpdk-dev] [PATCH v2 02/10] VM Power Management CLI(Host).
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 03/10] CPU Frequency Power Management(Host) Alan Carew
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

The CLI is used for administering the channel monitor and manager and for
manually setting the CPU frequency on the host.

Supports the following commands:
 add_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 rm_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 add_channels [Fixed STRING]: add_channels <vm_name> <list>|all, add
  communication channels for the specified VM, the virtio channels must be
  enabled in the VM configuration(qemu/libvirt) and the associated VM must be
  active. <list> is a comma-separated list of channel numbers to add, using the
  keyword 'all' will attempt to add all channels for the VM

 set_channel_status [Fixed STRING]:
  set_channel_status <vm_name> <list>|all enabled|disabled, enable or disable
  the communication channels in list(comma-separated) for the specified VM,
  alternatively list can be replaced with keyword 'all'. Disabled channels will
  still receive packets on the host, however the commands they specify will be
  ignored. Set status to 'enabled' to begin processing requests again.

 show_vm [Fixed STRING]: show_vm <vm_name>, prints the information on the
  specified VM(s), the information lists the number of vCPUs, the pinning to
  pCPU(s) as a bit mask, along with any communication channels associated with
  each VM

 show_cpu_freq_mask [Fixed STRING]: show_cpu_freq_mask <mask>, Get the current
  frequency for each core specified in the mask

 set_cpu_freq_mask [Fixed STRING]: set_cpu_freq_mask <core_mask>
  <up|down|min|max>, Set the current frequency for the cores specified in
  <core_mask> by scaling each up/down/min/max.

 show_cpu_freq [Fixed STRING]: Get the current frequency for the specified core

 set_cpu_freq [Fixed STRING]: set_cpu_freq <core_num> <up|down|min|max>,
  Set the current frequency for the specified core by scaling up/down/min/max

 quit [Fixed STRING]: close the application
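
As an illustration of the <list> argument accepted by add_channels and
set_channel_status, the following is a minimal sketch of a bounds-checked
parser for a comma-separated channel list. It is not the code in this patch:
parse_channel_list() and SKETCH_MAX_CHANNELS are hypothetical names (the patch
takes its limit, MAX_VM_CHANNELS, from channel_manager.h), and the sketch walks
the string with strtol()'s end pointer instead of strsep() so that it stays
within ISO C.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical limit for this sketch only; stands in for MAX_VM_CHANNELS. */
#define SKETCH_MAX_CHANNELS 64

/*
 * Parse a comma-separated list such as "0,2,5" into out[].
 * Returns the number of entries stored, or -1 if a token is empty or
 * non-numeric, a channel number is outside [0, SKETCH_MAX_CHANNELS), or
 * the list holds more than max_out entries.
 * A single trailing comma is tolerated.
 */
static int
parse_channel_list(const char *list, unsigned *out, unsigned max_out)
{
	unsigned count = 0;
	const char *cursor = list;
	char *tail;
	long value;

	if (list == NULL || out == NULL)
		return -1;

	while (*cursor != '\0') {
		errno = 0;
		value = strtol(cursor, &tail, 10);
		/* No digits consumed, or the value overflowed a long. */
		if (errno != 0 || tail == cursor)
			return -1;
		/* A number must be followed by ',' or the end of the string. */
		if (*tail != ',' && *tail != '\0')
			return -1;
		/* Range check before narrowing to unsigned. */
		if (value < 0 || value >= SKETCH_MAX_CHANNELS)
			return -1;
		/* Never write past the caller's array. */
		if (count == max_out)
			return -1;
		out[count++] = (unsigned)value;
		cursor = (*tail == ',') ? tail + 1 : tail;
	}
	return (int)count;
}
```

A caller would pass the <list> token and an array of SKETCH_MAX_CHANNELS
entries, and treat a negative return as a rejected command rather than adding
a partial list.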

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/vm_power_cli.c | 568 +++++++++++++++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h |  47 +++
 2 files changed, 615 insertions(+)
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h

diff --git a/examples/vm_power_manager/vm_power_cli.c b/examples/vm_power_manager/vm_power_cli.c
new file mode 100644
index 0000000..33a4bcf
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.c
@@ -0,0 +1,568 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <string.h>
+#include <termios.h>
+#include <errno.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+
+#include "vm_power_cli.h"
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "channel_commands.h"
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+		struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+struct cmd_show_vm_result {
+	cmdline_fixed_string_t show_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_show_vm_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_show_vm_result *res = parsed_result;
+	struct vm_info info;
+	unsigned i;
+
+	if (get_info_vm(res->vm_name, &info) != 0)
+		return;
+	cmdline_printf(cl, "VM: '%s', status = ", info.name);
+	if (info.status == VM_ACTIVE)
+		cmdline_printf(cl, "ACTIVE\n");
+	else
+		cmdline_printf(cl, "INACTIVE\n");
+	cmdline_printf(cl, "Channels %u\n", info.num_channels);
+	for (i = 0; i < info.num_channels; i++) {
+		cmdline_printf(cl, "  [%u]: %s, status = ", i,
+				info.channels[i].channel_path);
+		switch (info.channels[i].status) {
+		case CHANNEL_CONNECTED:
+			cmdline_printf(cl, "CONNECTED\n");
+			break;
+		case CHANNEL_DISCONNECTED:
+			cmdline_printf(cl, "DISCONNECTED\n");
+			break;
+		case CHANNEL_DISABLED:
+			cmdline_printf(cl, "DISABLED\n");
+			break;
+		case CHANNEL_PROCESSING:
+			cmdline_printf(cl, "PROCESSING\n");
+			break;
+		default:
+			cmdline_printf(cl, "UNKNOWN\n");
+			break;
+		}
+	}
+	cmdline_printf(cl, "Virtual CPU(s): %u\n", info.num_vcpus);
+	for (i = 0; i < info.num_vcpus; i++) {
+		cmdline_printf(cl, "  [%u]: Physical CPU Mask 0x%"PRIx64"\n", i,
+				info.pcpu_mask[i]);
+	}
+}
+
+cmdline_parse_token_string_t cmd_vm_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+				show_vm, "show_vm");
+cmdline_parse_token_string_t cmd_show_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_show_vm_set = {
+	.f = cmd_show_vm_parsed,
+	.data = NULL,
+	.help_str = "show_vm <vm_name>, prints the information on the "
+			"specified VM(s), the information lists the number of vCPUs, the "
+			"pinning to pCPU(s) as a bit mask, along with any communication "
+			"channels associated with each VM",
+	.tokens = {
+		(void *)&cmd_vm_show,
+		(void *)&cmd_show_vm_name,
+		NULL,
+	},
+};
+
+struct cmd_vm_op_result {
+	cmdline_fixed_string_t op_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_vm_op_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_vm_op_result *res = parsed_result;
+
+	if (!strcmp(res->op_vm, "add_vm")) {
+		if (add_vm(res->vm_name) < 0)
+			cmdline_printf(cl, "Unable to add VM '%s'\n", res->vm_name);
+	} else if (remove_vm(res->vm_name) < 0)
+		cmdline_printf(cl, "Unable to remove VM '%s'\n", res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_vm_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			op_vm, "add_vm#rm_vm");
+cmdline_parse_token_string_t cmd_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_vm_op_set = {
+	.f = cmd_vm_op_parsed,
+	.data = NULL,
+	.help_str = "add_vm|rm_vm <name>, add a VM for "
+			"subsequent operations with the CLI or remove a previously added "
+			"VM from the VM Power Manager",
+	.tokens = {
+		(void *)&cmd_vm_op,
+		(void *)&cmd_vm_name,
+		NULL,
+	},
+};
+
+/* *** VM channel operations *** */
+struct cmd_channels_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+};
+static void
+cmd_channels_op_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num, i;
+	int channels_added;
+	unsigned channel_list[MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_op_result *res = parsed_result;
+
+	if (!strcmp(res->channel_list, "all")) {
+		channels_added = add_all_channels(res->vm_name);
+		cmdline_printf(cl, "Added %d channels for VM '%s'\n",
+				channels_added, res->vm_name);
+		return;
+	}
+
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (tail_ptr == token) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "Channel number '%u' exceeds the maximum number "
+					"of allowable channels(%u) for VM '%s'\n", channel_num,
+					MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	for (i = 0; i < num_channels; i++)
+		cmdline_printf(cl, "[%u]: Adding channel %u\n", i, channel_list[i]);
+
+	channels_added = add_channels(res->vm_name, channel_list,
+			num_channels);
+	cmdline_printf(cl, "Enabled %d channels for '%s'\n", channels_added,
+			res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+				op, "add_channels");
+cmdline_parse_token_string_t cmd_channels_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			channel_list, NULL);
+
+cmdline_parse_inst_t cmd_channels_op_set = {
+	.f = cmd_channels_op_parsed,
+	.data = NULL,
+	.help_str = "add_channels <vm_name> <list>|all, add "
+			"communication channels for the specified VM, the "
+			"virtio channels must be enabled in the VM "
+			"configuration(qemu/libvirt) and the associated VM must be active. "
+			"<list> is a comma-separated list of channel numbers to add, using "
+			"the keyword 'all' will attempt to add all channels for the VM",
+	.tokens = {
+		(void *)&cmd_channels_op,
+		(void *)&cmd_channels_vm_name,
+		(void *)&cmd_channels_list,
+		NULL,
+	},
+};
+
+struct cmd_channels_status_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+	cmdline_fixed_string_t status;
+};
+
+static void
+cmd_channels_status_op_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num;
+	int changed;
+	unsigned channel_list[MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_status_op_result *res = parsed_result;
+	enum channel_status status;
+
+	if (!strcmp(res->status, "enabled"))
+		status = CHANNEL_CONNECTED;
+	else
+		status = CHANNEL_DISABLED;
+
+	if (!strcmp(res->channel_list, "all")) {
+		changed = set_channel_status_all(res->vm_name, status);
+		cmdline_printf(cl, "Updated status of %d channels "
+				"for VM '%s'\n", changed, res->vm_name);
+		return;
+	}
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (tail_ptr == token) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "%u exceeds the maximum number of allowable "
+					"channels(%u) for VM '%s'\n", channel_num, MAX_VM_CHANNELS,
+					res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	changed = set_channel_status(res->vm_name, channel_list, num_channels,
+			status);
+	cmdline_printf(cl, "Updated status of %d channels "
+					"for VM '%s'\n", changed, res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_status_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+				op, "set_channel_status");
+cmdline_parse_token_string_t cmd_channels_status_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_status_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			channel_list, NULL);
+cmdline_parse_token_string_t cmd_channels_status =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			status, "enabled#disabled");
+
+cmdline_parse_inst_t cmd_channels_status_op_set = {
+	.f = cmd_channels_status_op_parsed,
+	.data = NULL,
+	.help_str = "set_channel_status <vm_name> <list>|all enabled|disabled, "
+			"enable or disable the communication channels in "
+			"list(comma-separated) for the specified VM, alternatively list can"
+			" be replaced with keyword 'all'. Disabled channels will still "
+			"receive packets on the host, however the commands they specify "
+			"will be ignored. Set status to 'enabled' to begin processing "
+			"requests again.",
+	.tokens = {
+		(void *)&cmd_channels_status_op,
+		(void *)&cmd_channels_status_vm_name,
+		(void *)&cmd_channels_status_list,
+		(void *)&cmd_channels_status,
+		NULL,
+	},
+};
+
+/* *** CPU Frequency operations *** */
+struct cmd_show_cpu_freq_mask_result {
+	cmdline_fixed_string_t show_cpu_freq_mask;
+	uint64_t core_mask;
+};
+
+static void
+cmd_show_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_mask_result *res = parsed_result;
+	unsigned i;
+	uint64_t mask = res->core_mask;
+	uint32_t freq;
+	for (i = 0; mask; mask &= ~(1ULL << i++)) {
+		if ((mask >> i) & 1) {
+			freq = power_manager_get_current_frequency(i);
+			if (freq > 0)
+				cmdline_printf(cl, "Core %u: %"PRIu32"\n", i, freq);
+		}
+	}
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			show_cpu_freq_mask, "show_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_show_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			core_mask, UINT64);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_mask_set = {
+	.f = cmd_show_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "show_cpu_freq_mask <mask>, Get the current frequency for each "
+			"core specified in the mask",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq_mask,
+		(void *)&cmd_show_cpu_freq_mask_core_mask,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_mask_result {
+	cmdline_fixed_string_t set_cpu_freq_mask;
+	uint64_t core_mask;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	struct cmd_set_cpu_freq_mask_result *res = parsed_result;
+	int ret = -1;
+
+	if (!strcmp(res->cmd, "up"))
+		ret = power_manager_scale_mask_up(res->core_mask);
+	else if (!strcmp(res->cmd, "down"))
+		ret = power_manager_scale_mask_down(res->core_mask);
+	else if (!strcmp(res->cmd, "min"))
+		ret = power_manager_scale_mask_min(res->core_mask);
+	else if (!strcmp(res->cmd, "max"))
+		ret = power_manager_scale_mask_max(res->core_mask);
+	if (ret != 0) {
+		cmdline_printf(cl, "Error scaling core_mask(0x%"PRIx64") '%s' , not "
+				"all cores specified have been scaled\n",
+				res->core_mask, res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			set_cpu_freq_mask, "set_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_set_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			core_mask, UINT64);
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask_result =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_mask_set = {
+	.f = cmd_set_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq_mask <core_mask> <up|down|min|max>, Set the "
+			"current frequency for the cores specified in <core_mask> by "
+			"scaling each up/down/min/max.",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq_mask,
+		(void *)&cmd_set_cpu_freq_mask_core_mask,
+		(void *)&cmd_set_cpu_freq_mask_result,
+		NULL,
+	},
+};
+
+
+
+struct cmd_show_cpu_freq_result {
+	cmdline_fixed_string_t show_cpu_freq;
+	uint8_t core_num;
+};
+
+static void
+cmd_show_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_result *res = parsed_result;
+	uint32_t curr_freq = power_manager_get_current_frequency(res->core_num);
+	if (curr_freq == 0) {
+		cmdline_printf(cl, "Unable to get frequency for core %u\n",
+				res->core_num);
+		return;
+	}
+	cmdline_printf(cl, "Core %u frequency: %"PRIu32"\n", res->core_num,
+			curr_freq);
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_result,
+			show_cpu_freq, "show_cpu_freq");
+
+cmdline_parse_token_num_t cmd_show_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_result,
+			core_num, UINT8);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_set = {
+	.f = cmd_show_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "Get the current frequency for the specified core",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq,
+		(void *)&cmd_show_cpu_freq_core_num,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t core_num;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd, "up"))
+		ret = power_manager_scale_core_up(res->core_num);
+	else if (!strcmp(res->cmd, "down"))
+		ret = power_manager_scale_core_down(res->core_num);
+	else if (!strcmp(res->cmd, "min"))
+		ret = power_manager_scale_core_min(res->core_num);
+	else if (!strcmp(res->cmd, "max"))
+		ret = power_manager_scale_core_max(res->core_num);
+	if (ret != 0) {
+		cmdline_printf(cl, "Error scaling core(%u) '%s'\n", res->core_num,
+				res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			core_num, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_vm_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_status_op_set,
+		(cmdline_parse_inst_t *)&cmd_show_vm_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+	power_manager_init();
+	cl = cmdline_stdin_new(main_ctx, "vmpower> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/vm_power_cli.h b/examples/vm_power_manager/vm_power_cli.h
new file mode 100644
index 0000000..deccd51
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.h
@@ -0,0 +1,47 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.9.3

* [dpdk-dev] [PATCH v2 03/10] CPU Frequency Power Management(Host).
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 02/10] VM Power Management CLI(Host) Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 04/10] VM Power Management application and Makefile Alan Carew
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

A wrapper around librte_power (using ACPI cpufreq) that adds locking around the
non-thread-safe library, allowing frequency changes by core mask or core
number from both the CLI thread and the epoll monitor thread.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
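The mask-based scaling described above walks a 64-bit core mask one bit at a
time. A minimal sketch of that walk follows, assuming a generic callback in
place of the rte_power_freq_*() calls and leaving out the per-core spinlock.
The names apply_to_mask, core_op_fn and record_core are hypothetical and are
not part of this patch. The sketch uses 1ULL for every shift so that cores 31
and above are handled correctly.

```c
#include <stdint.h>

/* Hypothetical callback type standing in for rte_power_freq_up() and friends. */
typedef int (*core_op_fn)(unsigned core_num, void *ctx);

/*
 * Call op() for every core that is set in core_mask and in enabled_mask.
 * Returns 0 if every call returned 1, -1 otherwise.
 */
static int
apply_to_mask(uint64_t core_mask, uint64_t enabled_mask, core_op_fn op,
		void *ctx)
{
	int ret = 0;
	unsigned i;

	/* Clear bit i after each pass; the loop ends once no bits remain. */
	for (i = 0; core_mask != 0; core_mask &= ~(1ULL << i++)) {
		if (!((core_mask >> i) & 1))
			continue;
		if (!(enabled_mask & (1ULL << i)))
			continue;
		if (op(i, ctx) != 1)
			ret = -1;
	}
	return ret;
}

/* Example callback: records each visited core in a bitmap and reports success. */
static int
record_core(unsigned core_num, void *ctx)
{
	*(uint64_t *)ctx |= 1ULL << core_num;
	return 1;
}
```

Calling apply_to_mask() with a mask that has bit 40 set visits core 40, which
a plain int shift would not reliably do.
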
 examples/vm_power_manager/power_manager.c | 244 ++++++++++++++++++++++++++++++
 examples/vm_power_manager/power_manager.h | 186 +++++++++++++++++++++++
 2 files changed, 430 insertions(+)
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
new file mode 100644
index 0000000..c736cd0
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.c
@@ -0,0 +1,244 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/types.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_power.h>
+#include <rte_spinlock.h>
+
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+
+#define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \
+	if (core_num >= RTE_MAX_LCORE) \
+		return -1; \
+	if (!(global_enabled_cpus & (1ULL << core_num))) \
+		return -1; \
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
+	ret = rte_power_freq_##DIRECTION(core_num); \
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl); \
+} while (0)
+
+#define POWER_SCALE_MASK(DIRECTION, core_mask, ret) do { \
+	int i; \
+	for (i = 0; core_mask; core_mask &= ~(1ULL << i++)) { \
+		if ((core_mask >> i) & 1) { \
+			if (!(global_enabled_cpus & (1ULL << i))) \
+				continue; \
+			rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
+			if (rte_power_freq_##DIRECTION(i) != 1) \
+				ret = -1; \
+			rte_spinlock_unlock(&global_core_freq_info[i].power_sl); \
+		} \
+	} \
+} while (0)
+
+struct freq_info {
+	rte_spinlock_t power_sl;
+	uint32_t freqs[RTE_MAX_LCORE_FREQS];
+	unsigned num_freqs;
+} __rte_cache_aligned;
+
+static struct freq_info global_core_freq_info[RTE_MAX_LCORE];
+
+static uint64_t global_enabled_cpus;
+
+#define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
+
+static unsigned
+set_host_cpus_mask(void)
+{
+	char path[PATH_MAX];
+	unsigned i;
+	unsigned num_cpus = 0;
+	for (i = 0; i < RTE_MAX_LCORE; i++) {
+		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
+		if (access(path, F_OK) == 0) {
+			global_enabled_cpus |= 1ULL << i;
+			num_cpus++;
+		} else
+			return num_cpus;
+	}
+	return num_cpus;
+}
+
+int
+power_manager_init(void)
+{
+	unsigned i, num_cpus;
+	uint64_t cpu_mask;
+	int ret = 0;
+
+	num_cpus = set_host_cpus_mask();
+	if (num_cpus == 0) {
+		RTE_LOG(ERR, POWER_MANAGER, "Unable to detect host CPUs, please "
+				"ensure that sufficient privileges exist to inspect sysfs\n");
+		return -1;
+	}
+
+	cpu_mask = global_enabled_cpus;
+	for (i = 0; cpu_mask; cpu_mask &= ~(1ULL << i++)) {
+		if (rte_power_init(i) < 0 || rte_power_freqs(i,
+				global_core_freq_info[i].freqs,
+				RTE_MAX_LCORE_FREQS) == 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to initialize power manager "
+					"for core %u\n", i);
+			global_enabled_cpus &= ~(1ULL << i);
+			num_cpus--;
+			ret = -1;
+		}
+		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+	}
+	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs, enabled core mask:"
+					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	return ret;
+
+}
+
+uint32_t
+power_manager_get_current_frequency(unsigned core_num)
+{
+	uint32_t freq, index;
+
+	if (core_num >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER_MANAGER, "Core(%u) is out of range 0...%d\n",
+				core_num, RTE_MAX_LCORE-1);
+		return 0;
+	}
+	if (!(global_enabled_cpus & (1ULL << core_num)))
+		return 0;
+
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
+	index = rte_power_get_freq(core_num);
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl);
+	if (index >= RTE_MAX_LCORE_FREQS)
+		freq = 0;
+	else
+		freq = global_core_freq_info[core_num].freqs[index];
+
+	return freq;
+}
+
+int
+power_manager_exit(void)
+{
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1ULL << i++)) {
+		if (rte_power_exit(i) < 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+					"for core %u\n", i);
+			ret = -1;
+		}
+	}
+	global_enabled_cpus = 0;
+	return ret;
+}
+
+int
+power_manager_scale_mask_up(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(up, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_down(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(down, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_min(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(min, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_max(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(max, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_up(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(up, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_down(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(down, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_min(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(min, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_max(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(max, core_num, ret);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
new file mode 100644
index 0000000..d1d5c2c
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.h
@@ -0,0 +1,186 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef POWER_MANAGER_H_
+#define POWER_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management.
+ * Initializes resources and verifies the number of CPUs on the system.
+ * Wraps librte_power int rte_power_init(unsigned lcore_id);
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_init(void);
+
+/**
+ * Exit power management. Must be called prior to exiting the application.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_exit(void);
+
+/**
+ * Scale up the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_up(uint64_t core_mask);
+
+/**
+ * Scale down the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_down(uint64_t core_mask);
+
+/**
+ * Scale to the minimum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_min(uint64_t core_mask);
+
+/**
+ * Scale to the maximum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_max(uint64_t core_mask);
+
+/**
+ * Scale up frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_up(unsigned core_num);
+
+/**
+ * Scale down frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_down(unsigned core_num);
+
+/**
+ * Scale to minimum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_min(unsigned core_num);
+
+/**
+ * Scale to maximum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_max(unsigned core_num);
+
+/**
+ * Get the current frequency of the core specified by core_num.
+ *
+ * @param core_num
+ *  The core number to get the current frequency
+ *
+ * @return
+ *  - 0  on error
+ *  - >0 for current frequency.
+ */
+uint32_t power_manager_get_current_frequency(unsigned core_num);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* POWER_MANAGER_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v2 04/10] VM Power Management application and Makefile.
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
                     ` (2 preceding siblings ...)
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 03/10] CPU Frequency Power Management(Host) Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 05/10] VM Power Management CLI(Guest) Alan Carew
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

Launches the CLI thread and the monitor thread and initialises resources.
Requires a minimum of two lcores to run; additional cores specified by the EAL
core mask are not used.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
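The application installs SIGINT and SIGTERM handlers for tearing down the
resources it initialises. One common alternative, sketched below under the
assumption that the main loop can poll a flag, is to have the handler only
record the signal and to run the teardown calls outside signal context. The
names exit_requested, on_signal and install_exit_handlers are hypothetical and
are not part of this patch.

```c
#include <signal.h>

/* Set by the handler; polled by the main loop (hypothetical name). */
static volatile sig_atomic_t exit_requested;

/* Only records the signal number; no I/O or teardown in signal context. */
static void
on_signal(int signo)
{
	exit_requested = signo;
}

/* Install the handler for SIGINT and SIGTERM. Returns 0 on success, -1 on error. */
static int
install_exit_handlers(void)
{
	if (signal(SIGINT, on_signal) == SIG_ERR)
		return -1;
	if (signal(SIGTERM, on_signal) == SIG_ERR)
		return -1;
	return 0;
}
```

A loop that checks exit_requested could then call channel_monitor_exit(),
channel_manager_exit() and power_manager_exit() from normal thread context.
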
 examples/vm_power_manager/Makefile |  57 +++++++++++++++++++
 examples/vm_power_manager/main.c   | 113 +++++++++++++++++++++++++++++++++++++
 examples/vm_power_manager/main.h   |  52 +++++++++++++++++
 3 files changed, 222 insertions(+)
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
new file mode 100644
index 0000000..7d6f943
--- /dev/null
+++ b/examples/vm_power_manager/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = vm_power_mgr
+
+# all source are stored in SRCS-y
+SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
+SRCS-y += channel_monitor.c
+
+CFLAGS += -O3 -lvirt -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
new file mode 100644
index 0000000..fdb9e73
--- /dev/null
+++ b/examples/vm_power_manager/main.c
@@ -0,0 +1,113 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+#include <rte_launch.h>
+#include <rte_log.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "vm_power_cli.h"
+#include "main.h"
+
+static int
+run_monitor(__attribute__((unused)) void *arg)
+{
+	if (channel_manager_init(DEFAULT_HV_PATH) < 0) {
+		printf("Unable to initialize channel manager\n");
+		return -1;
+	}
+	if (channel_monitor_init() < 0) {
+		printf("Unable to initialize channel monitor\n");
+		return -1;
+	}
+	run_channel_monitor();
+	return 0;
+}
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	lcore_id = rte_get_next_lcore(-1, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
+				"application\n");
+		return -1;
+	}
+	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
+
+	run_cli(NULL);
+
+	rte_eal_mp_wait_lcore();
+	return 0;
+}
diff --git a/examples/vm_power_manager/main.h b/examples/vm_power_manager/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v2 05/10] VM Power Management CLI(Guest).
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
                     ` (3 preceding siblings ...)
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 04/10] VM Power Management application and Makefile Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 06/10] VM communication channels for VM Power Management(Guest) Alan Carew
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

Provides a small sample application (guest_vm_power_mgr) to run on a VM.
The application is run by providing a core mask (-c) and the number of memory
channels (-n). Each lcore in the core mask corresponds to one lcore channel that
the application attempts to open. A maximum of 64 channels per VM is allowed.
The channels must be monitored by the host.
After successful initialisation, a CPU frequency command can be sent to the host
using:
set_cpu_freq <lcore_num> <up|down|min|max>

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
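To illustrate the command syntax above, here is a libc-only sketch that
validates a set_cpu_freq line. It is not how the sample application parses
its commands; parse_set_cpu_freq and enum freq_dir are hypothetical names, and
the limit of 64 is taken from the per-VM channel maximum stated above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical direction codes for the four accepted keywords. */
enum freq_dir { FREQ_UP, FREQ_DOWN, FREQ_MIN, FREQ_MAX };

/*
 * Parse "set_cpu_freq <lcore_num> <up|down|min|max>".
 * Returns 0 on success, -1 on malformed input or lcore_num >= 64.
 */
static int
parse_set_cpu_freq(const char *line, unsigned *lcore_num, enum freq_dir *dir)
{
	static const char *const names[] = { "up", "down", "min", "max" };
	char word[8];
	unsigned core, i;
	int consumed = 0;

	/* %n records how much input was consumed so trailing text is rejected. */
	if (sscanf(line, "set_cpu_freq %u %7s %n", &core, word, &consumed) != 2)
		return -1;
	if (line[consumed] != '\0' || core >= 64)
		return -1;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (strcmp(word, names[i]) == 0) {
			*lcore_num = core;
			*dir = (enum freq_dir)i;
			return 0;
		}
	}
	return -1;
}
```

The order of names[] matches the order of enum freq_dir, so the array index
can be used directly as the direction code.
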
 examples/vm_power_manager/guest_cli/Makefile       |  56 ++++++++
 examples/vm_power_manager/guest_cli/main.c         |  86 ++++++++++++
 examples/vm_power_manager/guest_cli/main.h         |  52 +++++++
 .../guest_cli/vm_power_cli_guest.c                 | 155 +++++++++++++++++++++
 .../guest_cli/vm_power_cli_guest.h                 |  55 ++++++++
 5 files changed, 404 insertions(+)
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
new file mode 100644
index 0000000..167a7ed
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = guest_vm_power_mgr
+
+# all source are stored in SRCS-y
+SRCS-y := main.c vm_power_cli_guest.c
+
+CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
new file mode 100644
index 0000000..b8f86d0
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -0,0 +1,86 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+*/
+#include <signal.h>
+
+#include <rte_lcore.h>
+#include <rte_power.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "vm_power_cli_guest.h"
+#include "main.h"
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_init(lcore_id);
+	}
+	run_cli(NULL);
+
+	return 0;
+}
diff --git a/examples/vm_power_manager/guest_cli/main.h b/examples/vm_power_manager/guest_cli/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
new file mode 100644
index 0000000..7c4af4a
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -0,0 +1,155 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <stdint.h>
+#include <string.h>
+#include <stdio.h>
+#include <termios.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_lcore.h>
+
+#include <rte_power.h>
+
+#include "vm_power_cli_guest.h"
+
+
+#define CHANNEL_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+			    struct cmdline *cl,
+			    __attribute__((unused)) void *data)
+{
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t lcore_id;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = rte_power_freq_up(res->lcore_id);
+	else if (!strcmp(res->cmd , "down"))
+		ret = rte_power_freq_down(res->lcore_id);
+	else if (!strcmp(res->cmd , "min"))
+		ret = rte_power_freq_min(res->lcore_id);
+	else if (!strcmp(res->cmd , "max"))
+		ret = rte_power_freq_max(res->lcore_id);
+	if (ret != 1)
+		cmdline_printf(cl, "Error sending message: %s\n", strerror(ret));
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			lcore_id, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+	cl = cmdline_stdin_new(main_ctx, "vmpower(guest)> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
new file mode 100644
index 0000000..0c4bdd5
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -0,0 +1,55 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "channel_commands.h"
+
+int guest_channel_host_connect(unsigned lcore_id);
+
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v2 06/10] VM communication channels for VM Power Management(Guest).
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
                     ` (4 preceding siblings ...)
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 05/10] VM Power Management CLI(Guest) Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 07/10] librte_power common interface for Guest and Host Alan Carew
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

Allows for the opening of Virtio-Serial devices on a VM, where a DPDK
application can send packets to the host based monitor. The packet format is
specified in channel_commands.h.
Each device appears as a serial device in path
/dev/virtio-ports/virtio.serial.port.<agent_type>.<lcore_num> where each lcore
in a DPDK application has exclusive access to a device/channel.
Each channel is opened in non-blocking mode; after a successful open a test
packet is sent to the host to ensure the host side is monitoring.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
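
The channel naming and non-blocking open described in the commit message can
be sketched as below. This is a minimal illustration using only POSIX calls,
not the code from this patch: demo_channel_path() and demo_channel_open() are
hypothetical names, and the one second QEMU delay and the test packet are
left out.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: builds "<base>.<lcore_id>" into buf.
 * Returns 0 on success, -1 if the result would not fit. */
int
demo_channel_path(char *buf, size_t len, const char *base, unsigned lcore_id)
{
	int n = snprintf(buf, len, "%s.%u", base, lcore_id);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: opens the per-lcore device and switches it to
 * non-blocking mode. Returns the file descriptor, or -1 on any failure. */
int
demo_channel_open(const char *base, unsigned lcore_id)
{
	char path[4096];
	int fd, flags;

	if (demo_channel_path(path, sizeof(path), base, lcore_id) < 0)
		return -1;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	/* Preserve the existing file status flags and add O_NONBLOCK. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With base "/dev/virtio-ports/virtio.serial.port.poweragent" and lcore 3 the
helper produces a path ending in ".poweragent.3", which matches the
per-lcore naming convention above.
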
 lib/librte_power/guest_channel.c | 162 +++++++++++++++++++++++++++++++++++++++
 lib/librte_power/guest_channel.h |  89 +++++++++++++++++++++
 2 files changed, 251 insertions(+)
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h
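
Sending a packet over a non-blocking descriptor may complete in several
partial writes. The loop below is a hedged sketch of that pattern, assuming
plain POSIX write(); demo_send_all() is a hypothetical name, and returning
errno for EAGAIN rather than retrying is one possible policy, not necessarily
the one a caller wants.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: writes exactly len bytes from buf to fd.
 * Returns 0 on success, or the errno value of the failed write(). */
int
demo_send_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry the write */
			return errno;	/* includes EAGAIN on a full channel */
		}
		/* Advance past the bytes that were accepted. */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

The return convention follows the one documented for
guest_channel_send_msg() in guest_channel.h: 0 on success, errno on a write
error.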

diff --git a/lib/librte_power/guest_channel.c b/lib/librte_power/guest_channel.c
new file mode 100644
index 0000000..2295665
--- /dev/null
+++ b/lib/librte_power/guest_channel.c
@@ -0,0 +1,162 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+static int global_fds[RTE_MAX_LCORE];
+
+int
+guest_channel_host_connect(const char *path, unsigned lcore_id)
+{
+	int flags, ret;
+	struct channel_packet pkt;
+	char fd_path[PATH_MAX];
+	int fd = -1;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	/* check if path is already open */
+	if (global_fds[lcore_id] != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is already open with fd %d\n",
+				lcore_id, global_fds[lcore_id]);
+		return -1;
+	}
+
+	snprintf(fd_path, PATH_MAX, "%s.%u", path, lcore_id);
+	RTE_LOG(INFO, GUEST_CHANNEL, "Opening channel '%s' for lcore %u\n",
+			fd_path, lcore_id);
+	fd = open(fd_path, O_RDWR);
+	if (fd < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Unable to connect to '%s' with error "
+				"%s\n", fd_path, strerror(errno));
+		return -1;
+	}
+
+	flags = fcntl(fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on fcntl get flags for file %s\n",
+				fd_path);
+		goto error;
+	}
+
+	flags |= O_NONBLOCK;
+	if (fcntl(fd, F_SETFL, flags) < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on setting non-blocking mode for "
+				"file %s\n", fd_path);
+		goto error;
+	}
+	/* QEMU needs a delay after connection */
+	sleep(1);
+
+	/* Send a test packet, this command is ignored by the host, but a successful
+	 * send indicates that the host endpoint is monitoring.
+	 */
+	pkt.command = CPU_POWER_CONNECT;
+	global_fds[lcore_id] = fd;
+	ret = guest_channel_send_msg(&pkt, lcore_id);
+	if (ret != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Error on channel '%s' communications "
+				"test: %s\n", fd_path, strerror(ret));
+		goto error;
+	}
+	RTE_LOG(INFO, GUEST_CHANNEL, "Channel '%s' is now connected\n", fd_path);
+	return 0;
+error:
+	close(fd);
+	global_fds[lcore_id] = 0;
+	return -1;
+}
+
+int
+guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id)
+{
+	int ret, buffer_len = sizeof(*pkt);
+	void *buffer = pkt;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+
+	if (global_fds[lcore_id] == 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel is not connected\n");
+		return -1;
+	}
+	while (buffer_len > 0) {
+		ret = write(global_fds[lcore_id], buffer, buffer_len);
+		if (ret == buffer_len)
+			return 0;
+		if (ret == -1) {
+			if (errno == EINTR)
+				continue;
+			return errno;
+		}
+		buffer = (char *)buffer + ret;
+		buffer_len -= ret;
+	}
+	return 0;
+}
+
+void
+guest_channel_host_disconnect(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return;
+	}
+	if (global_fds[lcore_id] == 0)
+		return;
+	close(global_fds[lcore_id]);
+	global_fds[lcore_id] = 0;
+}
diff --git a/lib/librte_power/guest_channel.h b/lib/librte_power/guest_channel.h
new file mode 100644
index 0000000..9e18af5
--- /dev/null
+++ b/lib/librte_power/guest_channel.h
@@ -0,0 +1,89 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#ifndef _GUEST_CHANNEL_H
+#define _GUEST_CHANNEL_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <channel_commands.h>
+
+/**
+ * Connect to the Virtio-Serial VM end-point located in path. It is
+ * thread safe for unique lcore_ids. This function must only be called once
+ * from each lcore.
+ *
+ * @param path
+ *  The path to the serial device on the filesystem
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int guest_channel_host_connect(const char *path, unsigned lcore_id);
+
+/**
+ * Disconnect from an already connected Virtio-Serial Endpoint.
+ *
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ */
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+/**
+ * Send a message contained in pkt over the Virtio-Serial to the host endpoint.
+ *
+ * @param pkt
+ *  Pointer to a populated struct channel_packet
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on channel not connected.
+ *  - errno on write to channel error.
+ */
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.9.3


* [dpdk-dev] [PATCH v2 07/10] librte_power common interface for Guest and Host
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
                     ` (5 preceding siblings ...)
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 06/10] VM communication channels for VM Power Management(Guest) Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-25 10:10     ` Neil Horman
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 08/10] Packet format for VM Power Management(Host and Guest) Alan Carew
                     ` (4 subsequent siblings)
  11 siblings, 1 reply; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
renaming of functions only.
Added rte_power_kvm_vm implementation to support Power Management from a VM.

librte_power now hides the implementation based on the environment used.
A new call rte_power_set_env() can explicitly set the environment; if it is
not called then auto-detection takes place.

rte_power_kvm_vm is a subset of the librte_power APIs; the following are
supported:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

The other, unsupported APIs return -ENOTSUP.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
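
The environment selection described above (explicit set, otherwise
auto-detection that tries ACPI cpufreq first and then the KVM VM back end)
can be sketched with function pointers as below. This is an illustrative
stand-alone model, not the patch code: every demo_* name is hypothetical, the
back ends are stubs, and the stub ACPI init is hard-wired to fail so that the
fallback path is exercised.

```c
#include <stddef.h>

enum demo_env { DEMO_ENV_NOT_SET, DEMO_ENV_ACPI_CPUFREQ, DEMO_ENV_KVM_VM };

typedef int (*demo_freq_change_t)(unsigned lcore_id);

/* Stub back ends: ACPI init fails (as it would inside a VM without
 * cpufreq sysfs entries), the VM back end succeeds. */
int demo_acpi_init(unsigned lcore_id) { (void)lcore_id; return -1; }
int demo_acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
int demo_vm_init(unsigned lcore_id) { (void)lcore_id; return 0; }
int demo_vm_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }

enum demo_env demo_current_env = DEMO_ENV_NOT_SET;
demo_freq_change_t demo_freq_up = NULL;

/* Bind the public function pointer to the chosen back end. */
int
demo_set_env(enum demo_env env)
{
	if (env == DEMO_ENV_ACPI_CPUFREQ)
		demo_freq_up = demo_acpi_freq_up;
	else if (env == DEMO_ENV_KVM_VM)
		demo_freq_up = demo_vm_freq_up;
	else
		return -1;
	demo_current_env = env;
	return 0;
}

/* Use the configured back end, or probe ACPI first and the VM second. */
int
demo_init(unsigned lcore_id)
{
	if (demo_current_env == DEMO_ENV_ACPI_CPUFREQ)
		return demo_acpi_init(lcore_id);
	if (demo_current_env == DEMO_ENV_KVM_VM)
		return demo_vm_init(lcore_id);

	if (demo_acpi_init(lcore_id) == 0)
		return demo_set_env(DEMO_ENV_ACPI_CPUFREQ);
	if (demo_vm_init(lcore_id) == 0)
		return demo_set_env(DEMO_ENV_KVM_VM);
	return -1;
}
```

Callers keep using a single entry point (here demo_freq_up) regardless of
which back end was selected. Unlike the patch, the sketch does no atomic
guarding of the configuration state.
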
 lib/librte_power/rte_power.c              | 540 ++++-------------------------
 lib/librte_power/rte_power.h              | 120 +++++--
 lib/librte_power/rte_power_acpi_cpufreq.c | 545 ++++++++++++++++++++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h | 192 +++++++++++
 lib/librte_power/rte_power_common.h       |  39 +++
 lib/librte_power/rte_power_kvm_vm.c       | 160 +++++++++
 lib/librte_power/rte_power_kvm_vm.h       | 179 ++++++++++
 7 files changed, 1273 insertions(+), 502 deletions(-)
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 856da9a..998ed1c 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -31,515 +31,113 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
-#include <stdio.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-#include <signal.h>
-#include <limits.h>
-
-#include <rte_memcpy.h>
 #include <rte_atomic.h>
 
 #include "rte_power.h"
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
 
-#ifdef RTE_LIBRTE_POWER_DEBUG
-#define POWER_DEBUG_TRACE(fmt, args...) do { \
-		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
-	} while (0)
-#else
-#define POWER_DEBUG_TRACE(fmt, args...)
-#endif
-
-#define FOPEN_OR_ERR_RET(f, retval) do { \
-	if ((f) == NULL) { \
-		RTE_LOG(ERR, POWER, "File not openned\n"); \
-		return (retval); \
-	} \
-} while(0)
-
-#define FOPS_OR_NULL_GOTO(ret, label) do { \
-	if ((ret) == NULL) { \
-		RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define FOPS_OR_ERR_GOTO(ret, label) do { \
-	if ((ret) < 0) { \
-		RTE_LOG(ERR, POWER, "File operations failed\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define STR_SIZE     1024
-#define POWER_CONVERT_TO_DECIMAL 10
+enum power_management_env global_default_env = PM_ENV_NOT_SET;
 
-#define POWER_GOVERNOR_USERSPACE "userspace"
-#define POWER_SYSFILE_GOVERNOR   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
-#define POWER_SYSFILE_AVAIL_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
-#define POWER_SYSFILE_SETSPEED   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+volatile uint32_t global_env_cfg_status = 0;
 
-enum power_state {
-	POWER_IDLE = 0,
-	POWER_ONGOING,
-	POWER_USED,
-	POWER_UNKNOWN
-};
+/* function pointers */
+rte_power_freqs_t rte_power_freqs  = NULL;
+rte_power_get_freq_t rte_power_get_freq = NULL;
+rte_power_set_freq_t rte_power_set_freq = NULL;
+rte_power_freq_change_t rte_power_freq_up = NULL;
+rte_power_freq_change_t rte_power_freq_down = NULL;
+rte_power_freq_change_t rte_power_freq_max = NULL;
+rte_power_freq_change_t rte_power_freq_min = NULL;
 
-/**
- * Power info per lcore.
- */
-struct rte_power_info {
-	unsigned lcore_id;                   /**< Logical core id */
-	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
-	uint32_t nb_freqs;                   /**< number of available freqs */
-	FILE *f;                             /**< FD of scaling_setspeed */
-	char governor_ori[32];               /**< Original governor name */
-	uint32_t curr_idx;                   /**< Freq index in freqs array */
-	volatile uint32_t state;             /**< Power in use state */
-} __rte_cache_aligned;
-
-static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
-
-/**
- * It is to set specific freq for specific logical core, according to the index
- * of supported frequencies.
- */
-static int
-set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+int
+rte_power_set_env(enum power_management_env env)
 {
-	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
-			"should be less than %u\n", idx, pi->nb_freqs);
-		return -1;
-	}
-
-	/* Check if it is the same as current */
-	if (idx == pi->curr_idx)
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 0, 1) == 0) {
 		return 0;
-
-	POWER_DEBUG_TRACE("Freqency[%u] %u to be set for lcore %u\n",
-				idx, pi->freqs[idx], pi->lcore_id);
-	if (fseek(pi->f, 0, SEEK_SET) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to set file position indicator to 0 "
-			"for setting frequency for lcore %u\n", pi->lcore_id);
-		return -1;
 	}
-	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to write new frequency for "
-					"lcore %u\n", pi->lcore_id);
+	if (env == PM_ENV_ACPI_CPUFREQ) {
+		rte_power_freqs = rte_power_acpi_cpufreq_freqs;
+		rte_power_get_freq = rte_power_acpi_cpufreq_get_freq;
+		rte_power_set_freq = rte_power_acpi_cpufreq_set_freq;
+		rte_power_freq_up = rte_power_acpi_cpufreq_freq_up;
+		rte_power_freq_down = rte_power_acpi_cpufreq_freq_down;
+		rte_power_freq_min = rte_power_acpi_cpufreq_freq_min;
+		rte_power_freq_max = rte_power_acpi_cpufreq_freq_max;
+	} else if (env == PM_ENV_KVM_VM) {
+		rte_power_freqs = rte_power_kvm_vm_freqs;
+		rte_power_get_freq = rte_power_kvm_vm_get_freq;
+		rte_power_set_freq = rte_power_kvm_vm_set_freq;
+		rte_power_freq_up = rte_power_kvm_vm_freq_up;
+		rte_power_freq_down = rte_power_kvm_vm_freq_down;
+		rte_power_freq_min = rte_power_kvm_vm_freq_min;
+		rte_power_freq_max = rte_power_kvm_vm_freq_max;
+	} else {
+		RTE_LOG(ERR, POWER, "Invalid Power Management Environment(%d) set\n",
+				env);
+		rte_power_unset_env();
 		return -1;
 	}
-	fflush(pi->f);
-	pi->curr_idx = idx;
-
-	return 1;
-}
-
-/**
- * It is to check the current scaling governor by reading sys file, and then
- * set it into 'userspace' if it is not by writing the sys file. The original
- * governor will be saved for rolling back.
- */
-static int
-power_set_governor_userspace(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if current governor is userspace */
-	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
-		sizeof(POWER_GOVERNOR_USERSPACE)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
-					"already userspace\n", pi->lcore_id);
-		goto out;
-	}
-	/* Save the original governor */
-	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
-
-	/* Write 'userspace' to the governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(POWER_GOVERNOR_USERSPACE, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
-			"set to user space successfully\n", pi->lcore_id);
-out:
-	fclose(f);
+	global_default_env = env;
+	return 0;
 
-	return ret;
 }
 
-/**
- * It is to get the available frequencies of the specific lcore by reading the
- * sys file.
- */
-static int
-power_get_available_freqs(struct rte_power_info *pi)
+void
+rte_power_unset_env(void)
 {
-	FILE *f;
-	int ret = -1, i, count;
-	char *p;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *freqs[RTE_MAX_LCORE_FREQS];
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
-								pi->lcore_id);
-	f = fopen(fullpath, "r");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Strip the line break if there is */
-	p = strchr(buf, '\n');
-	if (p != NULL)
-		*p = 0;
-
-	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
-	count = rte_strsplit(buf, sizeof(buf), freqs,
-				RTE_MAX_LCORE_FREQS, ' ');
-	if (count <= 0) {
-		RTE_LOG(ERR, POWER, "No available frequency in "
-			""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
-		goto out;
-	}
-	if (count >= RTE_MAX_LCORE_FREQS) {
-		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
-								count);
-		goto out;
-	}
-
-	/* Store the available frequncies into power context */
-	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
-		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
-								i, freqs[i]);
-		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
-					POWER_CONVERT_TO_DECIMAL);
-	}
-
-	ret = 0;
-	POWER_DEBUG_TRACE("%d frequencie(s) of lcore %u are available\n",
-						count, pi->lcore_id);
-out:
-	fclose(f);
-
-	return ret;
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 1, 0) != 0)
+		global_default_env = PM_ENV_NOT_SET;
 }
 
-/**
- * It is to fopen the sys file for the future setting the lcore frequency.
- */
-static int
-power_init_for_setting_freq(struct rte_power_info *pi)
-{
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t i, freq;
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, -1);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
-	for (i = 0; i < pi->nb_freqs; i++) {
-		if (freq == pi->freqs[i]) {
-			pi->curr_idx = i;
-			pi->f = f;
-			return 0;
-		}
-	}
-
-out:
-	fclose(f);
-
-	return -1;
+enum power_management_env
+rte_power_get_env(void) {
+	return global_default_env;
 }
 
 int
 rte_power_init(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"in use\n", lcore_id);
-		return -1;
-	}
-
-	pi->lcore_id = lcore_id;
-	/* Check and set the governor */
-	if (power_set_governor_userspace(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
-						"userspace\n", lcore_id);
-		goto fail;
-	}
+	int ret = -1;
 
-	/* Get the available frequencies */
-	if (power_get_available_freqs(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ) {
+		return rte_power_acpi_cpufreq_init(lcore_id);
 	}
-
-	/* Init for setting lcore frequency */
-	if (power_init_for_setting_freq(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_KVM_VM) {
+		return rte_power_kvm_vm_init(lcore_id);
 	}
-
-	/* Set freq to max by default */
-	if (rte_power_freq_max(lcore_id) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
-						"to max\n", lcore_id);
-		goto fail;
+	/* Auto detect Environment */
+	RTE_LOG(INFO, POWER, "Attempting to initialise ACPI cpufreq power "
+			"management...\n");
+	ret = rte_power_acpi_cpufreq_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+		goto out;
 	}
 
-	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
-					"power manamgement\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
-
-	return -1;
-}
-
-/**
- * It is to check the governor and then set the original governor back if
- * needed by writing the the sys file.
- */
-static int
-power_set_governor_original(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if the governor to be set is the same as current */
-	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u "
-					"has already been set to %s\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(INFO, POWER, "Attempting to initialise VM power management...\n");
+	ret = rte_power_kvm_vm_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_KVM_VM);
 		goto out;
 	}
-
-	/* Write back the original governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(pi->governor_ori, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power manamgement governor of lcore %u "
-				"has been set back to %s successfully\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(ERR, POWER, "Unable to set Power Management Environment for lcore "
+			"%u\n", lcore_id);
 out:
-	fclose(f);
-
 	return ret;
 }
 
 int
 rte_power_exit(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"not used\n", lcore_id);
-		return -1;
-	}
-
-	/* Close FD of setting freq */
-	fclose(pi->f);
-	pi->f = NULL;
-
-	/* Set the governor back to the original */
-	if (power_set_governor_original(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
-					"to the original\n", lcore_id);
-		goto fail;
-	}
-
-	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
-				"'userspace' mode and been set back to the "
-						"original\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ)
+		return rte_power_acpi_cpufreq_exit(lcore_id);
+	if (global_default_env == PM_ENV_KVM_VM)
+		return rte_power_kvm_vm_exit(lcore_id);
 
+	RTE_LOG(ERR, POWER, "Environment has not been set, unable to exit "
+				"gracefully\n");
 	return -1;
-}
-
-uint32_t
-rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
-		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
-		return 0;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (num < pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
-		return 0;
-	}
-	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
-
-	return pi->nb_freqs;
-}
-
-uint32_t
-rte_power_get_freq(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return RTE_POWER_INVALID_FREQ_INDEX;
-	}
-
-	return lcore_power_info[lcore_id].curr_idx;
-}
-
-int
-rte_power_set_freq(unsigned lcore_id, uint32_t index)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
-}
-
-int
-rte_power_freq_down(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
 
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx + 1 == pi->nb_freqs)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx + 1);
 }
-
-int
-rte_power_freq_up(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx == 0)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx - 1);
-}
-
-int
-rte_power_freq_max(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(&lcore_power_info[lcore_id], 0);
-}
-
-int
-rte_power_freq_min(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->nb_freqs - 1);
-}
-
diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
index 9c1419e..9338069 100644
--- a/lib/librte_power/rte_power.h
+++ b/lib/librte_power/rte_power.h
@@ -48,12 +48,48 @@
 extern "C" {
 #endif
 
-#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+/* Power Management Environment State */
+enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM};
 
 /**
- * Initialize power management for a specific lcore. It will check and set the
- * governor to userspace for the lcore, get the available frequencies, and
- * prepare to set new lcore frequency.
+ * Set the default power management implementation. If this is not called prior
+ * to rte_power_init(), then auto-detect of the environment will take place.
+ * It is not thread safe.
+ *
+ * @param env
+ *  The environment in which to initialise Power Management.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_set_env(enum power_management_env env);
+
+/**
+ * Unset the global environment configuration.
+ * This can only be called after all threads have completed.
+ *
+ * @param None.
+ *
+ * @return
+ *  None.
+ */
+void rte_power_unset_env(void);
+
+/**
+ * Get the default power management implementation.
+ *
+ * @param None.
+ *
+ * @return
+ *  power_management_env The configured environment.
+ */
+enum power_management_env rte_power_get_env(void);
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
  *
  * @param lcore_id
  *  lcore id.
@@ -65,8 +101,9 @@ extern "C" {
 int rte_power_init(unsigned lcore_id);
 
 /**
- * Exit power management on a specific lcore. It will set the governor to which
- * is before initialized.
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
  *
  * @param lcore_id
  *  lcore id.
@@ -78,11 +115,9 @@ int rte_power_init(unsigned lcore_id);
 int rte_power_exit(unsigned lcore_id);
 
 /**
- * Get the available frequencies of a specific lcore. The return value will be
- * the minimal one of the total number of available frequencies and the number
- * of buffer. The index of available frequencies used in other interfaces
- * should be in the range of 0 to this return value.
- * It should be protected outside of this function for threadsafe.
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -94,12 +129,15 @@ int rte_power_exit(unsigned lcore_id);
  * @return
  *  The number of available frequencies.
  */
-uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
+typedef uint32_t (*rte_power_freqs_t)(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+extern rte_power_freqs_t rte_power_freqs;
 
 /**
- * Return the current index of available frequencies of a specific lcore. It
- * will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)' if error.
- * It should be protected outside of this function for threadsafe.
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -107,12 +145,15 @@ uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
  * @return
  *  The current index of available frequencies.
  */
-uint32_t rte_power_get_freq(unsigned lcore_id);
+typedef uint32_t (*rte_power_get_freq_t)(unsigned lcore_id);
+
+extern rte_power_get_freq_t rte_power_get_freq;
 
 /**
  * Set the new frequency for a specific lcore by indicating the index of
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -121,70 +162,87 @@ uint32_t rte_power_get_freq(unsigned lcore_id);
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_set_freq(unsigned lcore_id, uint32_t index);
+typedef int (*rte_power_set_freq_t)(unsigned lcore_id, uint32_t index);
+
+extern rte_power_set_freq_t rte_power_set_freq;
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned lcore_id);
 
 /**
  * Scale up the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_up(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_up;
 
 /**
  * Scale down the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_down(unsigned lcore_id);
+
+extern rte_power_freq_change_t rte_power_freq_down;
 
 /**
  * Scale up the frequency of a specific lcore to the highest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_max(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_max;
 
 /**
  * Scale down the frequency of a specific lcore to the lowest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_min(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_min;
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.c b/lib/librte_power/rte_power_acpi_cpufreq.c
new file mode 100644
index 0000000..09085c3
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.c
@@ -0,0 +1,545 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+
+#include <rte_memcpy.h>
+#include <rte_atomic.h>
+
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_common.h"
+
+#ifdef RTE_LIBRTE_POWER_DEBUG
+#define POWER_DEBUG_TRACE(fmt, args...) do { \
+		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
+} while (0)
+#else
+#define POWER_DEBUG_TRACE(fmt, args...)
+#endif
+
+#define FOPEN_OR_ERR_RET(f, retval) do { \
+		if ((f) == NULL) { \
+			RTE_LOG(ERR, POWER, "File not opened\n"); \
+			return retval; \
+		} \
+} while (0)
+
+#define FOPS_OR_NULL_GOTO(ret, label) do { \
+		if ((ret) == NULL) { \
+			RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define FOPS_OR_ERR_GOTO(ret, label) do { \
+		if ((ret) < 0) { \
+			RTE_LOG(ERR, POWER, "File operations failed\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define STR_SIZE     1024
+#define POWER_CONVERT_TO_DECIMAL 10
+
+#define POWER_GOVERNOR_USERSPACE "userspace"
+#define POWER_SYSFILE_GOVERNOR   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
+#define POWER_SYSFILE_AVAIL_FREQ \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
+#define POWER_SYSFILE_SETSPEED   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+
+enum power_state {
+	POWER_IDLE = 0,
+	POWER_ONGOING,
+	POWER_USED,
+	POWER_UNKNOWN
+};
+
+/**
+ * Power info per lcore.
+ */
+struct rte_power_info {
+	unsigned lcore_id;                   /**< Logical core id */
+	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
+	uint32_t nb_freqs;                   /**< number of available freqs */
+	FILE *f;                             /**< FD of scaling_setspeed */
+	char governor_ori[32];               /**< Original governor name */
+	uint32_t curr_idx;                   /**< Freq index in freqs array */
+	volatile uint32_t state;             /**< Power in use state */
+} __rte_cache_aligned;
+
+static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
+
+/**
+ * It is to set specific freq for specific logical core, according to the index
+ * of supported frequencies.
+ */
+static int
+set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+{
+	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
+				"should be less than %u\n", idx, pi->nb_freqs);
+		return -1;
+	}
+
+	/* Check if it is the same as current */
+	if (idx == pi->curr_idx)
+		return 0;
+
+	POWER_DEBUG_TRACE("Frequency[%u] %u to be set for lcore %u\n",
+			idx, pi->freqs[idx], pi->lcore_id);
+	if (fseek(pi->f, 0, SEEK_SET) < 0) {
+		RTE_LOG(ERR, POWER, "Fail to set file position indicator to 0 "
+				"for setting frequency for lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
+		RTE_LOG(ERR, POWER, "Fail to write new frequency for "
+				"lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	fflush(pi->f);
+	pi->curr_idx = idx;
+
+	return 1;
+}
+
+/**
+ * It is to check the current scaling governor by reading sys file, and then
+ * set it into 'userspace' if it is not by writing the sys file. The original
+ * governor will be saved for rolling back.
+ */
+static int
+power_set_governor_userspace(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Check if current governor is userspace */
+	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
+			sizeof(POWER_GOVERNOR_USERSPACE)) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
+				"already userspace\n", pi->lcore_id);
+		goto out;
+	}
+	/* Save the original governor */
+	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
+
+	/* Write 'userspace' to the governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(POWER_GOVERNOR_USERSPACE, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
+			"set to user space successfully\n", pi->lcore_id);
+	out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * It is to get the available frequencies of the specific lcore by reading the
+ * sys file.
+ */
+static int
+power_get_available_freqs(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1, i, count;
+	char *p;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *freqs[RTE_MAX_LCORE_FREQS];
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
+			pi->lcore_id);
+	f = fopen(fullpath, "r");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Strip the line break if there is */
+	p = strchr(buf, '\n');
+	if (p != NULL)
+		*p = 0;
+
+	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
+	count = rte_strsplit(buf, sizeof(buf), freqs,
+			RTE_MAX_LCORE_FREQS, ' ');
+	if (count <= 0) {
+		RTE_LOG(ERR, POWER, "No available frequency in "
+				""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
+		goto out;
+	}
+	if (count >= RTE_MAX_LCORE_FREQS) {
+		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
+				count);
+		goto out;
+	}
+
+	/* Store the available frequencies into power context */
+	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
+		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
+				i, freqs[i]);
+		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
+				POWER_CONVERT_TO_DECIMAL);
+	}
+
+	ret = 0;
+	POWER_DEBUG_TRACE("%d frequency(ies) of lcore %u are available\n",
+			count, pi->lcore_id);
+	out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * Open the sys file that is later used to set the lcore frequency.
+ */
+static int
+power_init_for_setting_freq(struct rte_power_info *pi)
+{
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t i, freq;
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, -1);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
+	for (i = 0; i < pi->nb_freqs; i++) {
+		if (freq == pi->freqs[i]) {
+			pi->curr_idx = i;
+			pi->f = f;
+			return 0;
+		}
+	}
+
+	out:
+	fclose(f);
+
+	return -1;
+}
+
+int
+rte_power_acpi_cpufreq_init(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u can not exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"in use\n", lcore_id);
+		return -1;
+	}
+
+	pi->lcore_id = lcore_id;
+	/* Check and set the governor */
+	if (power_set_governor_userspace(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
+				"userspace\n", lcore_id);
+		goto fail;
+	}
+
+	/* Get the available frequencies */
+	if (power_get_available_freqs(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Init for setting lcore frequency */
+	if (power_init_for_setting_freq(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Set freq to max by default */
+	if (rte_power_acpi_cpufreq_freq_max(lcore_id) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
+				"to max\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
+			"power management\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
+
+	return 0;
+
+	fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+/**
+ * It is to check the governor and then set the original governor back if
+ * needed by writing the sys file.
+ */
+static int
+power_set_governor_original(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Check if the governor to be set is the same as current */
+	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u "
+				"has already been set to %s\n",
+				pi->lcore_id, pi->governor_ori);
+		goto out;
+	}
+
+	/* Write back the original governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(pi->governor_ori, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u "
+			"has been set back to %s successfully\n",
+			pi->lcore_id, pi->governor_ori);
+	out:
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_acpi_cpufreq_exit(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u can not exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"not used\n", lcore_id);
+		return -1;
+	}
+
+	/* Close FD of setting freq */
+	fclose(pi->f);
+	pi->f = NULL;
+
+	/* Set the governor back to the original */
+	if (power_set_governor_original(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
+				"to the original\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
+			"'userspace' mode and been set back to the "
+			"original\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
+
+	return 0;
+
+	fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
+		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
+		return 0;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (num < pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
+		return 0;
+	}
+	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
+
+	return pi->nb_freqs;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_get_freq(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return RTE_POWER_INVALID_FREQ_INDEX;
+	}
+
+	return lcore_power_info[lcore_id].curr_idx;
+}
+
+int
+rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
+}
+
+int
+rte_power_acpi_cpufreq_freq_down(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx + 1 == pi->nb_freqs)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx + 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_up(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx == 0)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx - 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_max(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(&lcore_power_info[lcore_id], 0);
+}
+
+int
+rte_power_acpi_cpufreq_freq_min(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->nb_freqs - 1);
+}
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.h b/lib/librte_power/rte_power_acpi_cpufreq.h
new file mode 100644
index 0000000..68578e9
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.h
@@ -0,0 +1,192 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_ACPI_CPUFREQ_H
+#define _RTE_POWER_ACPI_CPUFREQ_H
+
+/**
+ * @file
+ * RTE Power Management via userspace ACPI cpufreq
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore. It will check and set the
+ * governor to userspace for the lcore, get the available frequencies, and
+ * prepare to set new lcore frequency.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore. It will restore the governor
+ * that was in place before initialization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore. On success the return
+ * value is the total number of available frequencies; 0 is returned if the
+ * buffer is too small or a parameter is invalid. Frequency indexes used in other
+ * interfaces should be in the range of 0 to this return value minus one.
+ * It is not thread safe; callers must provide their own synchronization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of frequencies to get.
+ *
+ * @return
+ *  The number of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore. It
+ * will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)' on error.
+ * It is not thread safe; callers must provide their own synchronization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  The current index of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * It is not thread safe; callers must provide their own synchronization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * It is not thread safe; callers must provide their own synchronization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * It is not thread safe; callers must provide their own synchronization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * It is not thread safe; callers must provide their own synchronization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * It is not thread safe; callers must provide their own synchronization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/librte_power/rte_power_common.h b/lib/librte_power/rte_power_common.h
new file mode 100644
index 0000000..64bd168
--- /dev/null
+++ b/lib/librte_power/rte_power_common.h
@@ -0,0 +1,39 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_POWER_COMMON_H_
+#define RTE_POWER_COMMON_H_
+
+#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+
+#endif /* RTE_POWER_COMMON_H_ */
diff --git a/lib/librte_power/rte_power_kvm_vm.c b/lib/librte_power/rte_power_kvm_vm.c
new file mode 100644
index 0000000..d8cef98
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.c
@@ -0,0 +1,160 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <errno.h>
+#include <string.h>
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
+
+#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+#define SEND_MSG_AND_RETURN(pkt, lcore_id, ret) do { \
+		ret = guest_channel_send_msg(&pkt[lcore_id], lcore_id); \
+		if ((ret) == 0) \
+			return 1; \
+		if ((ret) > 0) \
+			RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", \
+				strerror(ret)); \
+		return -1; \
+} while (0)
+
+static struct channel_packet pkt[RTE_MAX_LCORE];
+
+
+int
+rte_power_kvm_vm_init(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	pkt[lcore_id].command = CPU_POWER;
+	pkt[lcore_id].resource_id = lcore_id;
+	return guest_channel_host_connect(FD_PATH, lcore_id);
+}
+
+int
+rte_power_kvm_vm_exit(unsigned lcore_id)
+{
+	guest_channel_host_disconnect(lcore_id);
+	return 0;
+}
+
+uint32_t
+rte_power_kvm_vm_freqs(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t *freqs,
+		__attribute__((unused)) uint32_t num)
+{
+	RTE_LOG(ERR, POWER, "rte_power_freqs is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+uint32_t
+rte_power_kvm_vm_get_freq(__attribute__((unused)) unsigned lcore_id)
+{
+	RTE_LOG(ERR, POWER, "rte_power_get_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_kvm_vm_set_freq(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t index)
+{
+	RTE_LOG(ERR, POWER, "rte_power_set_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_kvm_vm_freq_up(unsigned lcore_id)
+{
+	int ret;
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	pkt[lcore_id].unit = CPU_SCALE_UP;
+
+	SEND_MSG_AND_RETURN(pkt, lcore_id, ret);
+}
+
+int
+rte_power_kvm_vm_freq_down(unsigned lcore_id)
+{
+	int ret;
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	pkt[lcore_id].unit = CPU_SCALE_DOWN;
+
+	SEND_MSG_AND_RETURN(pkt, lcore_id, ret);
+}
+
+int
+rte_power_kvm_vm_freq_max(unsigned lcore_id)
+{
+	int ret;
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	pkt[lcore_id].unit = CPU_SCALE_MAX;
+
+	SEND_MSG_AND_RETURN(pkt, lcore_id, ret);
+}
+
+int
+rte_power_kvm_vm_freq_min(unsigned lcore_id)
+{
+	int ret;
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	pkt[lcore_id].unit = CPU_SCALE_MIN;
+
+	SEND_MSG_AND_RETURN(pkt, lcore_id, ret);
+}
diff --git a/lib/librte_power/rte_power_kvm_vm.h b/lib/librte_power/rte_power_kvm_vm.h
new file mode 100644
index 0000000..dcbc878
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.h
@@ -0,0 +1,179 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_KVM_VM_H
+#define _RTE_POWER_KVM_VM_H
+
+/**
+ * @file
+ * RTE Power Management KVM VM
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of frequencies to get.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+int rte_power_kvm_vm_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore. This request is forwarded to the
+ * host monitor.
+ * For thread safety, calls must be protected outside of this function.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * For thread safety, calls must be protected outside of this function.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * For thread safety, calls must be protected outside of this function.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * For thread safety, calls must be protected outside of this function.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.9.3
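
For readers following the guest-side scaling calls in rte_power_kvm_vm.c, the
return convention applied by SEND_MSG_AND_RETURN can be sketched in isolation.
This is a minimal sketch under stated assumptions, not the library code:
fake_send_msg() is a hypothetical stand-in for guest_channel_send_msg(), and
map_send_result() is an illustrative name for the mapping the macro performs
(0 becomes 1, a positive value is logged via strerror() and becomes -1, any
other value becomes -1).

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for guest_channel_send_msg(). It simply hands back
 * the simulated result: 0 for success, a positive errno-style value for a
 * send failure, or a negative value for any other error.
 */
int fake_send_msg(int simulated_result)
{
	return simulated_result;
}

/*
 * Illustrative helper mirroring the SEND_MSG_AND_RETURN mapping:
 *   ret == 0 -> 1  (request handed to the channel)
 *   ret >  0 -> -1 (error text logged via strerror())
 *   ret <  0 -> -1
 */
int map_send_result(int ret)
{
	if (ret == 0)
		return 1;
	if (ret > 0)
		fprintf(stderr, "Error sending message: %s\n", strerror(ret));
	return -1;
}
```

Note that a return of 1 here only means the request was sent on the channel;
nothing in the guest-side code shown above reports whether the host actually
changed the frequency.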


* Re: [dpdk-dev] [PATCH v2 07/10] librte_power common interface for Guest and Host
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 07/10] librte_power common interface for Guest and Host Alan Carew
@ 2014-09-25 10:10     ` Neil Horman
  2014-09-25 17:06       ` Carew, Alan
  0 siblings, 1 reply; 97+ messages in thread
From: Neil Horman @ 2014-09-25 10:10 UTC (permalink / raw)
  To: Alan Carew; +Cc: dev

On Wed, Sep 24, 2014 at 06:26:13PM +0100, Alan Carew wrote:
> Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
> renaming of functions only.
> Added rte_power_kvm_vm implementation to support Power Management from a VM.
> 
> librte_power now hides the implementation based on the environment used.
> A new call, rte_power_set_env(), can explicitly set the environment; if it is
> not called, auto-detection takes place.
> 
> rte_power_kvm_vm is a subset of the librte_power APIs; the following are supported:
>  rte_power_init(unsigned lcore_id)
>  rte_power_exit(unsigned lcore_id)
>  rte_power_freq_up(unsigned lcore_id)
>  rte_power_freq_down(unsigned lcore_id)
>  rte_power_freq_min(unsigned lcore_id)
>  rte_power_freq_max(unsigned lcore_id)
> 
> The other unsupported APIs return -ENOTSUP
> 
> Signed-off-by: Alan Carew <alan.carew@intel.com>
> ---
>  lib/librte_power/rte_power.c              | 540 ++++-------------------------
>  lib/librte_power/rte_power.h              | 120 +++++--
>  lib/librte_power/rte_power_acpi_cpufreq.c | 545 ++++++++++++++++++++++++++++++
>  lib/librte_power/rte_power_acpi_cpufreq.h | 192 +++++++++++
>  lib/librte_power/rte_power_common.h       |  39 +++
>  lib/librte_power/rte_power_kvm_vm.c       | 160 +++++++++
>  lib/librte_power/rte_power_kvm_vm.h       | 179 ++++++++++
>  7 files changed, 1273 insertions(+), 502 deletions(-)
>  create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
>  create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
>  create mode 100644 lib/librte_power/rte_power_common.h
>  create mode 100644 lib/librte_power/rte_power_kvm_vm.c
>  create mode 100644 lib/librte_power/rte_power_kvm_vm.h
> 
> diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
> index 856da9a..998ed1c 100644
> --- a/lib/librte_power/rte_power.c
> +++ b/lib/librte_power/rte_power.c
> @@ -31,515 +31,113 @@
>   *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>   */
>  
> -#include <stdio.h>
> -#include <sys/types.h>
> -#include <sys/stat.h>
> -#include <fcntl.h>
> -#include <stdlib.h>
> -#include <string.h>
> -#include <unistd.h>
> -#include <signal.h>
> -#include <limits.h>
> -
> -#include <rte_memcpy.h>
>  #include <rte_atomic.h>
>  
>  #include "rte_power.h"
> +#include "rte_power_acpi_cpufreq.h"
> +#include "rte_power_kvm_vm.h"
> +#include "rte_power_common.h"
>  
> -#ifdef RTE_LIBRTE_POWER_DEBUG
> -#define POWER_DEBUG_TRACE(fmt, args...) do { \
> -		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
> -	} while (0)
> -#else
> -#define POWER_DEBUG_TRACE(fmt, args...)
> -#endif
> -
> -#define FOPEN_OR_ERR_RET(f, retval) do { \
> -	if ((f) == NULL) { \
> -		RTE_LOG(ERR, POWER, "File not openned\n"); \
> -		return (retval); \
> -	} \
> -} while(0)
> -
> -#define FOPS_OR_NULL_GOTO(ret, label) do { \
> -	if ((ret) == NULL) { \
> -		RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
> -		goto label; \
> -	} \
> -} while(0)
> -
> -#define FOPS_OR_ERR_GOTO(ret, label) do { \
> -	if ((ret) < 0) { \
> -		RTE_LOG(ERR, POWER, "File operations failed\n"); \
> -		goto label; \
> -	} \
> -} while(0)
> -
> -#define STR_SIZE     1024
> -#define POWER_CONVERT_TO_DECIMAL 10
> +enum power_management_env global_default_env = PM_ENV_NOT_SET;
>  
> -#define POWER_GOVERNOR_USERSPACE "userspace"
> -#define POWER_SYSFILE_GOVERNOR   \
> -	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
> -#define POWER_SYSFILE_AVAIL_FREQ \
> -	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
> -#define POWER_SYSFILE_SETSPEED   \
> -	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
> +volatile uint32_t global_env_cfg_status = 0;
>  
> -enum power_state {
> -	POWER_IDLE = 0,
> -	POWER_ONGOING,
> -	POWER_USED,
> -	POWER_UNKNOWN
> -};
> +/* function pointers */
> +rte_power_freqs_t rte_power_freqs  = NULL;
> +rte_power_get_freq_t rte_power_get_freq = NULL;
> +rte_power_set_freq_t rte_power_set_freq = NULL;
> +rte_power_freq_change_t rte_power_freq_up = NULL;
> +rte_power_freq_change_t rte_power_freq_down = NULL;
> +rte_power_freq_change_t rte_power_freq_max = NULL;
> +rte_power_freq_change_t rte_power_freq_min = NULL;
>  
> -/**
> - * Power info per lcore.
> - */
> -struct rte_power_info {
> -	unsigned lcore_id;                   /**< Logical core id */
> -	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
> -	uint32_t nb_freqs;                   /**< number of available freqs */
> -	FILE *f;                             /**< FD of scaling_setspeed */
> -	char governor_ori[32];               /**< Original governor name */
> -	uint32_t curr_idx;                   /**< Freq index in freqs array */
> -	volatile uint32_t state;             /**< Power in use state */
> -} __rte_cache_aligned;
> -
> -static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
> -
> -/**
> - * It is to set specific freq for specific logical core, according to the index
> - * of supported frequencies.
> - */
> -static int
> -set_freq_internal(struct rte_power_info *pi, uint32_t idx)
> +int
> +rte_power_set_env(enum power_management_env env)
>  {
> -	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
> -		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
> -			"should be less than %u\n", idx, pi->nb_freqs);
> -		return -1;
> -	}
> -
> -	/* Check if it is the same as current */
> -	if (idx == pi->curr_idx)
> +	if (rte_atomic32_cmpset(&global_env_cfg_status, 0, 1) == 0) {
>  		return 0;
> -
One nit here.  If an invalid environment value is passed in on the first config
attempt, you won't ever be able to set it.  Maybe add some logic to return
us to an initial state if a valid env isn't selected?

Neil


* Re: [dpdk-dev] [PATCH v2 07/10] librte_power common interface for Guest and Host
  2014-09-25 10:10     ` Neil Horman
@ 2014-09-25 17:06       ` Carew, Alan
  2014-09-25 17:49         ` Neil Horman
  0 siblings, 1 reply; 97+ messages in thread
From: Carew, Alan @ 2014-09-25 17:06 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Thursday, September 25, 2014 11:10 AM
> To: Carew, Alan
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v2 07/10] librte_power common interface for Guest and
> Host
> 
> On Wed, Sep 24, 2014 at 06:26:13PM +0100, Alan Carew wrote:
> > +int
> > +rte_power_set_env(enum power_management_env env)
> >  {
> > -	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
> > -		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
> > -			"should be less than %u\n", idx, pi->nb_freqs);
> > -		return -1;
> > -	}
> > -
> > -	/* Check if it is the same as current */
> > -	if (idx == pi->curr_idx)
> > +	if (rte_atomic32_cmpset(&global_env_cfg_status, 0, 1) == 0) {
> >  		return 0;
> > -
> One nit here.  If an invalid environment value is passed in on the first config
> attempt, you won't ever be able to set it.  Maybe add some logic to return
> us to an initial state if a valid env isn't selected?
> 
> Neil

Hi Neil,

I should have called it out in the commit message, but there is also an
rte_power_unset_env() function that resets the environment, which allows
retrying with a different environment.
rte_power_unset_env() is also called when an invalid configuration is set.
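
A minimal sketch of that set/unset flow, for illustration only. It uses C11
atomics in place of rte_atomic32_cmpset(), and every name in it (the
sketch_* functions, the SKETCH_ENV_* values, the static state variables) is
hypothetical rather than the actual librte_power code:

```c
#include <stdatomic.h>

/* Hypothetical environment values; only the "not set" idea is from the patch. */
enum sketch_env {
	SKETCH_ENV_NOT_SET = 0,
	SKETCH_ENV_ACPI_CPUFREQ,
	SKETCH_ENV_KVM_VM
};

static atomic_uint sketch_cfg_status;	/* 0 = unconfigured, 1 = configured */
static enum sketch_env sketch_default_env = SKETCH_ENV_NOT_SET;

/* Return to the initial state so that a different environment can be tried. */
void sketch_unset_env(void)
{
	unsigned int expected = 1;

	if (atomic_compare_exchange_strong(&sketch_cfg_status, &expected, 0))
		sketch_default_env = SKETCH_ENV_NOT_SET;
}

/* Accessor so callers can observe the currently selected environment. */
enum sketch_env sketch_get_env(void)
{
	return sketch_default_env;
}

/*
 * Only the first caller gets past the compare-and-set guard. An invalid
 * value rolls the guard back, so a later call is free to retry.
 */
int sketch_set_env(enum sketch_env env)
{
	unsigned int expected = 0;

	if (!atomic_compare_exchange_strong(&sketch_cfg_status, &expected, 1))
		return 0;	/* already configured: keep the existing choice */

	if (env != SKETCH_ENV_ACPI_CPUFREQ && env != SKETCH_ENV_KVM_VM) {
		sketch_unset_env();	/* invalid value: reset for a retry */
		return -1;
	}

	sketch_default_env = env;
	return 0;
}
```

With this shape, a first call with an invalid value fails but leaves the
state unconfigured, so a second call with a valid value still succeeds.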

Thanks,
Alan.


* Re: [dpdk-dev] [PATCH v2 07/10] librte_power common interface for Guest and Host
  2014-09-25 17:06       ` Carew, Alan
@ 2014-09-25 17:49         ` Neil Horman
  0 siblings, 0 replies; 97+ messages in thread
From: Neil Horman @ 2014-09-25 17:49 UTC (permalink / raw)
  To: Carew, Alan; +Cc: dev

On Thu, Sep 25, 2014 at 05:06:11PM +0000, Carew, Alan wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Thursday, September 25, 2014 11:10 AM
> > To: Carew, Alan
> > Cc: dev@dpdk.org
> > Subject: Re: [PATCH v2 07/10] librte_power common interface for Guest and
> > Host
> > 
> > On Wed, Sep 24, 2014 at 06:26:13PM +0100, Alan Carew wrote:
> > > +int
> > > +rte_power_set_env(enum power_management_env env)
> > >  {
> > > -	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
> > > -		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
> > > -			"should be less than %u\n", idx, pi->nb_freqs);
> > > -		return -1;
> > > -	}
> > > -
> > > -	/* Check if it is the same as current */
> > > -	if (idx == pi->curr_idx)
> > > +	if (rte_atomic32_cmpset(&global_env_cfg_status, 0, 1) == 0) {
> > >  		return 0;
> > > -
> > One nit here.  If an invalid environment value is passed in on the first config
> > attempt, you won't ever be able to set it.  Maybe add some logic to return
> > us to an initial state if a valid env isn't selected?
> > 
> > Neil
> 
> Hi Neil,
> 
> I should have called it out in the commit message, but there is also an
> rte_power_unset_env() function that resets the environment, which allows
> retrying with a different environment.
> rte_power_unset_env() is also called when an invalid configuration is set.
> 
> Thanks,
> Alan.
> 
Ok, that seems like an odd interface to me, but it works as well as anything
else.

Thanks!
Neil
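
For readers following the exchange above, here is a minimal sketch of the
set/unset behaviour being discussed. It is not the code from the patch: every
sketch_* name is hypothetical, and C11 atomics stand in for
rte_atomic32_cmpset(). Returning 0 when the environment is already configured
mirrors the quoted hunk; resetting on an invalid value follows Alan's
description.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for enum power_management_env. */
enum sketch_env {
	SKETCH_ENV_NOT_SET = 0,
	SKETCH_ENV_ACPI_CPUFREQ,
	SKETCH_ENV_KVM_VM
};

/* 0 = not configured, 1 = configured (plays the role of global_env_cfg_status). */
static atomic_int sketch_cfg_status;
static enum sketch_env sketch_current = SKETCH_ENV_NOT_SET;

/* Return to the initial state so that a later set can succeed. */
static void
sketch_unset_env(void)
{
	sketch_current = SKETCH_ENV_NOT_SET;
	atomic_store(&sketch_cfg_status, 0);
}

static enum sketch_env
sketch_get_env(void)
{
	return sketch_current;
}

static int
sketch_set_env(enum sketch_env env)
{
	int expected = 0;

	/* Only the first caller wins the 0 -> 1 transition. */
	if (!atomic_compare_exchange_strong(&sketch_cfg_status, &expected, 1))
		return 0; /* already configured: silently keep the old value */

	if (env != SKETCH_ENV_ACPI_CPUFREQ && env != SKETCH_ENV_KVM_VM) {
		/* Invalid value: undo the transition so a retry is possible. */
		sketch_unset_env();
		return -1;
	}
	sketch_current = env;
	return 0;
}

/* Walks through the sequence raised in the review. */
static int
sketch_demo(void)
{
	assert(sketch_set_env(SKETCH_ENV_NOT_SET) == -1);     /* rejected */
	assert(sketch_set_env(SKETCH_ENV_ACPI_CPUFREQ) == 0); /* retry works */
	assert(sketch_set_env(SKETCH_ENV_KVM_VM) == 0);       /* no error... */
	assert(sketch_get_env() == SKETCH_ENV_ACPI_CPUFREQ);  /* ...but unchanged */
	sketch_unset_env();
	assert(sketch_set_env(SKETCH_ENV_KVM_VM) == 0);
	assert(sketch_get_env() == SKETCH_ENV_KVM_VM);
	sketch_unset_env();
	return 0;
}
```

The third call in sketch_demo() is probably the "odd" part of the interface: a
second set returns success without changing the environment, so a caller has to
check the get function or call the unset function first.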

> 


* [dpdk-dev] [PATCH v2 08/10] Packet format for VM Power Management(Host and Guest).
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
                     ` (6 preceding siblings ...)
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 07/10] librte_power common interface for Guest and Host Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 09/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

Provides a command packet format for host and guest.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power/channel_commands.h | 68 +++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100644 lib/librte_power/channel_commands.h

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
new file mode 100644
index 0000000..e33e85b
--- /dev/null
+++ b/lib/librte_power/channel_commands.h
@@ -0,0 +1,68 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_COMMANDS_H_
+#define CHANNEL_COMMANDS_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_config.h>
+
+#if RTE_MAX_LCORE > 64
+#error Maximum number of cores and channels is 64, overflow is guaranteed to \
+	cause problems with VM Power Management
+#endif
+
+#define CPU_POWER         1
+#define CPU_POWER_CONNECT 2
+
+#define CPU_SCALE_UP      1
+#define CPU_SCALE_DOWN    2
+#define CPU_SCALE_MAX     3
+#define CPU_SCALE_MIN     4
+
+struct channel_packet {
+	uint64_t resource_id; /* core_num, device */
+	uint32_t unit; /* scale down/up/min/max */
+	uint32_t command; /* Power, IO, etc */
+};
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_COMMANDS_H_ */
-- 
1.9.3
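
As an illustration of how the packet format above might be filled in, here is
a small sketch. The struct layout and the CPU_* values are copied from
channel_commands.h in this patch; the helper sketch_build_request() and the
SKETCH_MAX_CHANNELS constant are hypothetical and only reflect one reading of
the field comments (resource_id carries the core number, unit the scale
direction, command the request class). The 64 limit is taken from the #error
check in the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values and layout copied from channel_commands.h in the patch. */
#define CPU_POWER         1
#define CPU_POWER_CONNECT 2

#define CPU_SCALE_UP      1
#define CPU_SCALE_DOWN    2
#define CPU_SCALE_MAX     3
#define CPU_SCALE_MIN     4

struct channel_packet {
	uint64_t resource_id; /* core_num, device */
	uint32_t unit; /* scale down/up/min/max */
	uint32_t command; /* Power, IO, etc */
};

/* One 64-bit field plus two 32-bit fields: no padding is expected. */
static_assert(sizeof(struct channel_packet) == 16,
	"channel_packet is expected to be 16 bytes");

/* Hypothetical limit, matching the RTE_MAX_LCORE > 64 check in the header. */
#define SKETCH_MAX_CHANNELS 64

/*
 * Hypothetical helper: build a frequency scaling request for one lcore.
 * Returns 0 on success, -1 on invalid arguments.
 */
static int
sketch_build_request(struct channel_packet *pkt, unsigned lcore_id,
		uint32_t scale)
{
	if (pkt == NULL || lcore_id >= SKETCH_MAX_CHANNELS)
		return -1;
	if (scale < CPU_SCALE_UP || scale > CPU_SCALE_MIN)
		return -1;

	memset(pkt, 0, sizeof(*pkt));
	pkt->resource_id = lcore_id;
	pkt->unit = scale;
	pkt->command = CPU_POWER;
	return 0;
}
```

A guest-side caller would then write the packet to the per-lcore channel;
that transport belongs to the guest channel code and is not shown here.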


* [dpdk-dev] [PATCH v2 09/10] Build system integration for VM Power Management(Guest and Host)
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
                     ` (7 preceding siblings ...)
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 08/10] Packet format for VM Power Management(Host and Guest) Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 10/10] VM Power Management Unit Tests Alan Carew
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

librte_power now contains both rte_power_acpi_cpufreq and rte_power_kvm_vm
implementations.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_power/Makefile b/lib/librte_power/Makefile
index 6185812..d672a5a 100644
--- a/lib/librte_power/Makefile
+++ b/lib/librte_power/Makefile
@@ -37,7 +37,8 @@ LIB = librte_power.a
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -fno-strict-aliasing
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c rte_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += rte_power_kvm_vm.c guest_channel.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_POWER)-include := rte_power.h
-- 
1.9.3


* [dpdk-dev] [PATCH v2 10/10] VM Power Management Unit Tests
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
                     ` (8 preceding siblings ...)
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 09/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
@ 2014-09-24 17:26   ` Alan Carew
  2014-09-25  2:56   ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Liu, Yong
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
  11 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-24 17:26 UTC (permalink / raw)
  To: dev

Updated the unit tests to cover both librte_power implementations as well as
the external API.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 app/test/Makefile                  |   3 +-
 app/test/autotest_data.py          |  26 ++
 app/test/test_power.c              | 445 +++---------------------------
 app/test/test_power_acpi_cpufreq.c | 544 +++++++++++++++++++++++++++++++++++++
 app/test/test_power_kvm_vm.c       | 308 +++++++++++++++++++++
 5 files changed, 917 insertions(+), 409 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 37a3772..03ade39 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -119,7 +119,8 @@ endif
 
 SRCS-$(CONFIG_RTE_LIBRTE_METER) += test_meter.c
 SRCS-$(CONFIG_RTE_LIBRTE_KNI) += test_kni.c
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c test_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power_kvm_vm.c
 SRCS-y += test_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 878c72e..618a946 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -425,6 +425,32 @@ non_parallel_test_group_list = [
 	]
 },
 {
+	"Prefix" :      "power_acpi_cpufreq",
+	"Memory" :      all_sockets(512),
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power ACPI cpufreq autotest",
+		 "Command" :    "power_acpi_cpufreq_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
+	"Prefix" :      "power_kvm_vm",
+	"Memory" :      "512",
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power KVM VM autotest",
+		 "Command" :    "power_kvm_vm_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
 	"Prefix" :	"lpm6",
 	"Memory" :	"512",
 	"Tests" :
diff --git a/app/test/test_power.c b/app/test/test_power.c
index d9eb420..64a2305 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -41,437 +41,66 @@
 
 #include <rte_power.h>
 
-#define TEST_POWER_LCORE_ID      2U
-#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
-#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
-
-#define TEST_POWER_SYSFILE_CUR_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
-
-static uint32_t total_freq_num;
-static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
-
-static int
-check_cur_freq(unsigned lcore_id, uint32_t idx)
-{
-#define TEST_POWER_CONVERT_TO_DECIMAL 10
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t cur_freq;
-	int ret = -1;
-
-	if (snprintf(fullpath, sizeof(fullpath),
-		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
-		return 0;
-	}
-	f = fopen(fullpath, "r");
-	if (f == NULL) {
-		return 0;
-	}
-	if (fgets(buf, sizeof(buf), f) == NULL) {
-		goto fail_get_cur_freq;
-	}
-	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
-	ret = (freqs[idx] == cur_freq ? 0 : -1);
-
-fail_get_cur_freq:
-	fclose(f);
-
-	return ret;
-}
-
-/* Check rte_power_freqs() */
-static int
-check_power_freqs(void)
-{
-	uint32_t ret;
-
-	total_freq_num = 0;
-	memset(freqs, 0, sizeof(freqs));
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
-					TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with NULL buffer to save available freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test of getting zero number of freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test with all valid input parameters */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get available freqs on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Save the total number of available freqs */
-	total_freq_num = ret;
-
-	return 0;
-}
-
-/* Check rte_power_get_freq() */
-static int
-check_power_get_freq(void)
-{
-	int ret;
-	uint32_t count;
-
-	/* test with an invalid lcore id */
-	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
-	if (count < TEST_POWER_FREQS_NUM_MAX) {
-		printf("Unexpectedly get freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
-	if (count >= TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get the freq index on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_set_freq() */
-static int
-check_power_set_freq(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
-	if (ret >= 0) {
-		printf("Unexpectedly set freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with an invalid freq index */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/**
-	 * test with an invalid freq index which is right one bigger than
-	 * total number of freqs
-	 */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", total_freq_num,
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0) {
-		printf("Fail to set freq index on lcore %u\n",
-					TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_down() */
-static int
-check_power_freq_down(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale down one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale down one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf ("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_up() */
-static int
-check_power_freq_up(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq on %u\n",
-						TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale up one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale up one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_max() */
-static int
-check_power_freq_max(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq to max on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_min() */
-static int
-check_power_freq_min(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq to min "
-				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
 static int
 test_power(void)
 {
 	int ret = -1;
+	enum power_management_env env;
 
-	/* test of init power management for an invalid lcore */
-	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	/* Test setting an invalid environment */
+	ret = rte_power_set_env(PM_ENV_NOT_SET);
 	if (ret == 0) {
-		printf("Unexpectedly initialise power management successfully "
-				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot initialise power management for lcore %u\n",
-							TEST_POWER_LCORE_ID);
+		printf("Unexpectedly succeeded on setting an invalid environment\n");
 		return -1;
 	}
 
-	/**
-	 * test of initialising power management for the lcore which has
-	 * been initialised
-	 */
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly init successfully power twice on "
-					"lcore %u\n", TEST_POWER_LCORE_ID);
+	/* Test that the environment has not been set */
+	env = rte_power_get_env();
+	if (env != PM_ENV_NOT_SET) {
+		printf("Unexpectedly got a valid environment configuration\n");
 		return -1;
 	}
 
-	ret = check_power_freqs();
-	if (ret < 0)
+	/* verify that function pointers are NULL */
+	if (rte_power_freqs != NULL) {
+		printf("rte_power_freqs should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	if (total_freq_num < 2) {
-		rte_power_exit(TEST_POWER_LCORE_ID);
-		printf("Frequency can not be changed due to CPU itself\n");
-		return 0;
 	}
-
-	ret = check_power_get_freq();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_set_freq();
-	if (ret < 0)
+	if (rte_power_get_freq != NULL) {
+		printf("rte_power_get_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_down();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_freq_up();
-	if (ret < 0)
+	}
+	if (rte_power_set_freq != NULL) {
+		printf("rte_power_set_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_max();
-	if (ret < 0)
+	}
+	if (rte_power_freq_up != NULL) {
+		printf("rte_power_freq_up should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_min();
-	if (ret < 0)
+	}
+	if (rte_power_freq_down != NULL) {
+		printf("rte_power_freq_down should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot exit power management for lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
 	}
-
-	/**
-	 * test of exiting power management for the lcore which has been exited
-	 */
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly exit successfully power management twice "
-					"on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
+	if (rte_power_freq_max != NULL) {
+		printf("rte_power_freq_max should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
-	/* test of exit power management for an invalid lcore */
-	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
-	if (ret == 0) {
-		printf("Unpectedly exit power management successfully for "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
+	if (rte_power_freq_min != NULL) {
+		printf("rte_power_freq_min should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
+	rte_power_unset_env();
 	return 0;
-
 fail_all:
-	rte_power_exit(TEST_POWER_LCORE_ID);
-
+	rte_power_unset_env();
 	return -1;
 }
 
diff --git a/app/test/test_power_acpi_cpufreq.c b/app/test/test_power_acpi_cpufreq.c
new file mode 100644
index 0000000..8848d75
--- /dev/null
+++ b/app/test/test_power_acpi_cpufreq.c
@@ -0,0 +1,544 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <limits.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+
+#define TEST_POWER_LCORE_ID      2U
+#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
+#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
+
+#define TEST_POWER_SYSFILE_CUR_FREQ \
+	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
+
+static uint32_t total_freq_num;
+static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
+
+static int
+check_cur_freq(unsigned lcore_id, uint32_t idx)
+{
+#define TEST_POWER_CONVERT_TO_DECIMAL 10
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t cur_freq;
+	int ret = -1;
+
+	if (snprintf(fullpath, sizeof(fullpath),
+		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
+		return 0;
+	}
+	f = fopen(fullpath, "r");
+	if (f == NULL) {
+		return 0;
+	}
+	if (fgets(buf, sizeof(buf), f) == NULL) {
+		goto fail_get_cur_freq;
+	}
+	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
+	ret = (freqs[idx] == cur_freq ? 0 : -1);
+
+fail_get_cur_freq:
+	fclose(f);
+
+	return ret;
+}
+
+/* Check rte_power_freqs() */
+static int
+check_power_freqs(void)
+{
+	uint32_t ret;
+
+	total_freq_num = 0;
+	memset(freqs, 0, sizeof(freqs));
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
+					TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with NULL buffer to save available freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test of getting zero number of freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test with all valid input parameters */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get available freqs on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Save the total number of available freqs */
+	total_freq_num = ret;
+
+	return 0;
+}
+
+/* Check rte_power_get_freq() */
+static int
+check_power_get_freq(void)
+{
+	int ret;
+	uint32_t count;
+
+	/* test with an invalid lcore id */
+	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
+	if (count < TEST_POWER_FREQS_NUM_MAX) {
+		printf("Unexpectedly get freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
+	if (count >= TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get the freq index on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_set_freq() */
+static int
+check_power_set_freq(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
+	if (ret >= 0) {
+		printf("Unexpectedly set freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with an invalid freq index */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u)"
+			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/**
+	 * test with an invalid freq index which is right one bigger than
+	 * total number of freqs
+	 */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u)"
+			"successfully on lcore %u\n", total_freq_num,
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0) {
+		printf("Fail to set freq index on lcore %u\n",
+					TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_down() */
+static int
+check_power_freq_down(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale down one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale down one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf ("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_up() */
+static int
+check_power_freq_up(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq on %u\n",
+						TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale up one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale up one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_max() */
+static int
+check_power_freq_max(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq to max on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_min() */
+static int
+check_power_freq_min(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq to min "
+				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+static int
+test_power_acpi_cpufreq(void)
+{
+	int ret = -1;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_ACPI_CPUFREQ, this "
+				"may occur if environment is not configured correctly or "
+				"operating in another valid Power management environment\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_ACPI_CPUFREQ) {
+		printf("Unexpectedly got an environment other than ACPI cpufreq\n");
+		goto fail_all;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+
+	/* test of init power management for an invalid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly initialise power management successfully "
+				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly (ACPI cpufreq) or operating in another valid "
+				"Power management environment\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of initialising power management for the lcore which has
+	 * been initialised
+	 */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly init successfully power twice on "
+					"lcore %u\n", TEST_POWER_LCORE_ID);
+		goto fail_all;
+	}
+
+	ret = check_power_freqs();
+	if (ret < 0)
+		goto fail_all;
+
+	if (total_freq_num < 2) {
+		rte_power_exit(TEST_POWER_LCORE_ID);
+		printf("Frequency can not be changed due to CPU itself\n");
+		rte_power_unset_env();
+		return 0;
+	}
+
+	ret = check_power_get_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_set_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_down();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_up();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_max();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_min();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot exit power management for lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of exiting power management for the lcore which has been exited
+	 */
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly exit successfully power management twice "
+					"on lcore %u\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* test of exit power management for an invalid lcore */
+	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly exit power management successfully for "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+	rte_power_unset_env();
+	return 0;
+
+fail_all:
+	rte_power_exit(TEST_POWER_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_acpi_cpufreq_cmd = {
+	.command = "power_acpi_cpufreq_autotest",
+	.callback = test_power_acpi_cpufreq,
+};
+REGISTER_TEST_COMMAND(power_acpi_cpufreq_cmd);
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
new file mode 100644
index 0000000..ac0fcb6
--- /dev/null
+++ b/app/test/test_power_kvm_vm.c
@@ -0,0 +1,308 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <limits.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+#include <rte_config.h>
+
+#define TEST_POWER_VM_LCORE_ID            0U
+#define TEST_POWER_VM_LCORE_OUT_OF_BOUNDS (RTE_MAX_LCORE+1)
+#define TEST_POWER_VM_LCORE_INVALID       1U
+
+static int
+test_power_kvm_vm(void)
+{
+	int ret;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_KVM_VM);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_KVM_VM\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_KVM_VM) {
+		printf("Unexpectedly got a Power Management environment other than "
+				"KVM VM\n");
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		return -1;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	/* Test initialisation of an out of bounds lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret != -1) {
+		printf("rte_power_init unexpectedly succeeded on an invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly(KVM VM) or operating in another valid "
+				"Power management environment\n", TEST_POWER_VM_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of previously initialised lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_init unexpectedly succeeded on calling init twice on "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of invalid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency down of invalid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency min of invalid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency max of invalid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid but uninitialised lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid but uninitialised lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid but uninitialised lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid but uninitialised lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_up unexpectedly failed on valid lcore %u\n",
+				TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_down unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_min unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_max unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_freqs */
+	ret = rte_power_freqs(TEST_POWER_VM_LCORE_ID, NULL, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_freqs did not return the expected -ENOTSUP(%d) but "
+				"returned %d\n", -ENOTSUP, ret);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_get_freq */
+	ret = rte_power_get_freq(TEST_POWER_VM_LCORE_ID);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_get_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_set_freq */
+	ret = rte_power_set_freq(TEST_POWER_VM_LCORE_ID, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_set_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test removing of an lcore */
+	ret = rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_exit unexpectedly failed on valid lcore %u, "
+				"please ensure that the environment has been configured "
+				"correctly\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of previously removed lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_up unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency down of previously removed lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_down unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency min of previously removed lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_min unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency max of previously removed lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_max unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+	rte_power_unset_env();
+	return 0;
+fail_all:
+	rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_kvm_vm_cmd = {
+    .command = "power_kvm_vm_autotest",
+    .callback = test_power_kvm_vm,
+};
+REGISTER_TEST_COMMAND(power_kvm_vm_cmd);
-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v2 00/10] VM Power Management
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
                     ` (9 preceding siblings ...)
  2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 10/10] VM Power Management Unit Tests Alan Carew
@ 2014-09-25  2:56   ` Liu, Yong
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
  11 siblings, 0 replies; 97+ messages in thread
From: Liu, Yong @ 2014-09-25  2:56 UTC (permalink / raw)
  To: dev

Tested-by: Liu Yong <yong.liu at intel.com>

This patch set has been tested by Intel.
The test environment was as follows:

Host:
	OS      : Fedora 20 x86_64
	Kernel  : 3.11.10-301
	GCC     : 4.8.3
	CPU     : Intel Xeon CPU E5-2680 v2 @ 2.80GHz
	NIC     : Intel Niantic 82599
	Qemu    : 1.6.2
	Libvirt : 1.1.3
Guest:
	OS      : Fedora 20 x86_64
	Kernel  : 3.11.10-301
	GCC     : 4.8.3

We verified VM power management with unit tests and functional tests.
The detailed results are listed below.

vm power unit test              Passed
vm power channel connected      Passed
vm power frequency max          Passed
vm power frequency min          Passed
vm power frequency up           Passed
vm power frequency down         Passed
vm power l3fwd-power            Passed

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v3 00/10] VM Power Management
  2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
                     ` (10 preceding siblings ...)
  2014-09-25  2:56   ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Liu, Yong
@ 2014-09-29 15:18   ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
                       ` (12 more replies)
  11 siblings, 13 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

Virtual Machine Power Management.

The following patches add two DPDK sample applications and an alternate
implementation of librte_power for use in virtualized environments.
The idea is to provide librte_power functionality from within a VM to address
the lack of MSRs to facilitate frequency changes from within a VM.
It is ideally suited for Haswell which provides per core frequency scaling.

The current librte_power effects frequency changes via the acpi-cpufreq
'userspace' power governor, accessed via sysfs.

General Overview (more information in each patch that follows):
The VM Power Management solution provides two components:

 1) VM: Allows a DPDK application in a VM to reuse the librte_power
 interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
 where the re-implementation of librte_power simply forwards the requests for
 frequency change to a host based monitor. The host monitor itself uses
 librte_power.
 Each lcore channel corresponds to a
 serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
 which is opened in non-blocking mode.
 While each Virtual CPU can be mapped to multiple physical CPUs, it is
 recommended that each vCPU be mapped to a single core only.

 2) Host: The host monitor is managed by a CLI, which allows for adding
 qemu/KVM virtual machines and associated channels to the monitor, manually
 changing CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning
 and managing channels.
 Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
 sockets which follow a specific naming convention,
 i.e. /tmp/powermonitor/<vm_name>.<channel_number>.
 Each channel has a 1:1 mapping to a VM endpoint,
 i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>
 Host channel endpoints are opened in non-blocking mode and are monitored via
 epoll.
 Requests over each channel to change frequency are forwarded to the original
 librte_power.
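
To illustrate the guest side described above, here is a minimal sketch (not
the code from this series) of how an lcore's channel could be opened in
non-blocking mode. Only the device path comes from the text above; the helper
names guest_port_path() and guest_port_open() are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Device path prefix as given in the cover letter. */
#define GUEST_PORT_PREFIX "/dev/virtio-ports/virtio.serial.port.poweragent"

/*
 * Hypothetical helper: format the per-lcore device path into buf.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
guest_port_path(char *buf, size_t len, unsigned lcore_id)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "%s.%u", GUEST_PORT_PREFIX, lcore_id);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: open the channel of one lcore in non-blocking mode.
 * Returns the file descriptor, or -1 if the port cannot be opened (for
 * example when no channel has been configured for this lcore).
 */
static int
guest_port_open(unsigned lcore_id)
{
	char path[128];

	if (guest_port_path(path, sizeof(path), lcore_id) < 0)
		return -1;
	return open(path, O_RDWR | O_NONBLOCK);
}
```

A caller would keep one descriptor per lcore and treat a return value of -1
as "power management not available for this lcore".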

Channels must be manually configured as qemu-kvm command line arguments or
in the libvirt domain definition (XML), e.g.
<controller type='virtio-serial' index='0'>
 <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<channel type='unix'>
  <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
  <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>'/>
  <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
</channel>

Multiple channels can be configured by specifying multiple <channel>
elements, replacing <vm_name> and <channel_num> in each.
<N> (the port number) should be incremented by 1 for each new channel element.
More information on Virtio-Serial can be found here:
http://fedoraproject.org/wiki/Features/VirtioSerial
To enable the Hypervisor creation of channels, the host endpoint directory
must be created with qemu permissions:
mkdir /tmp/powermonitor
chown qemu:qemu /tmp/powermonitor
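
For the host end of such a channel, a rough sketch of connecting to the
AF_UNIX socket and switching it to non-blocking mode could look as follows.
This is an illustration under assumptions, not the code from this series:
the helper names host_channel_addr() and host_channel_connect() are
hypothetical, and it assumes that with source mode='bind' the hypervisor
creates the listening socket so that the host application is the connecting
side.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Endpoint directory as given in the cover letter. */
#define HOST_CHANNEL_DIR "/tmp/powermonitor"

/*
 * Hypothetical helper: fill addr with
 * /tmp/powermonitor/<vm_name>.<channel_num>.
 * Returns 0 on success, -1 if the path does not fit in sun_path.
 */
static int
host_channel_addr(struct sockaddr_un *addr, const char *vm_name,
		unsigned channel_num)
{
	int n;

	assert(addr != NULL && vm_name != NULL);
	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	n = snprintf(addr->sun_path, sizeof(addr->sun_path), "%s/%s.%u",
			HOST_CHANNEL_DIR, vm_name, channel_num);
	return (n < 0 || (size_t)n >= sizeof(addr->sun_path)) ? -1 : 0;
}

/*
 * Hypothetical helper: connect to one channel endpoint and make the
 * descriptor non-blocking. Returns the descriptor, or -1 on any failure.
 */
static int
host_channel_connect(const char *vm_name, unsigned channel_num)
{
	struct sockaddr_un addr;
	int fd, flags;

	if (host_channel_addr(&addr, vm_name, channel_num) < 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	/* Switch to non-blocking mode so reads never stall the monitor. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The connect fails with -1 if the socket file does not exist, which is what
happens when the VM is not running or the channel was not defined for it.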

The host application runs on two separate lcores:
Core N) CLI: For management of Virtual Machines, adding channels to the
 Monitor thread, inspecting state and manually setting CPU frequency
 [PATCH 02/10]
Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel
 events from VMs and calls the corresponding librte_power functions.

A sample application is also provided to run on Virtual Machines. This
application provides a CLI to manually set the frequency of a
vCPU [PATCH 05/10]

The current l3fwd-power sample application can also be run on a VM.

Changes in V3:
 Fixed crash in Guest CLI when host application is not running.
 Renamed #defines to be more specific to the module they belong to
 Added vCPU pinning via CLI
 Testing feedback

Changes in V2:
 Runtime selection of librte_power implementations.
 Updated Unit tests to cover librte_power changes.
 PATCH[0/3] was sent twice, again as PATCH[0/4]
 Miscellaneous fixes.

Alan Carew (10):
  Channel Manager and Monitor for VM Power Management(Host).
  VM Power Management CLI(Host).
  CPU Frequency Power Management(Host).
  VM Power Management application and Makefile.
  VM Power Management CLI(Guest).
  VM communication channels for VM Power Management(Guest).
  librte_power common interface for Guest and Host
  Packet format for VM Power Management(Host and Guest).
  Build system integration for VM Power Management(Guest and Host)
  VM Power Management Unit Tests

 app/test/Makefile                                  |   3 +-
 app/test/autotest_data.py                          |  26 +
 app/test/test_power.c                              | 445 +-----------
 app/test/test_power_acpi_cpufreq.c                 | 544 ++++++++++++++
 app/test/test_power_kvm_vm.c                       | 308 ++++++++
 examples/vm_power_manager/Makefile                 |  57 ++
 examples/vm_power_manager/channel_manager.c        | 804 +++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h        | 314 ++++++++
 examples/vm_power_manager/channel_monitor.c        | 228 ++++++
 examples/vm_power_manager/channel_monitor.h        | 102 +++
 examples/vm_power_manager/guest_cli/Makefile       |  56 ++
 examples/vm_power_manager/guest_cli/main.c         |  87 +++
 examples/vm_power_manager/guest_cli/main.h         |  52 ++
 .../guest_cli/vm_power_cli_guest.c                 | 155 ++++
 .../guest_cli/vm_power_cli_guest.h                 |  55 ++
 examples/vm_power_manager/main.c                   | 113 +++
 examples/vm_power_manager/main.h                   |  52 ++
 examples/vm_power_manager/power_manager.c          | 244 +++++++
 examples/vm_power_manager/power_manager.h          | 188 +++++
 examples/vm_power_manager/vm_power_cli.c           | 669 +++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h           |  47 ++
 lib/librte_power/Makefile                          |   3 +-
 lib/librte_power/channel_commands.h                |  77 ++
 lib/librte_power/guest_channel.c                   | 162 +++++
 lib/librte_power/guest_channel.h                   |  89 +++
 lib/librte_power/rte_power.c                       | 540 ++------------
 lib/librte_power/rte_power.h                       | 120 ++-
 lib/librte_power/rte_power_acpi_cpufreq.c          | 545 ++++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h          | 192 +++++
 lib/librte_power/rte_power_common.h                |  39 +
 lib/librte_power/rte_power_kvm_vm.c                | 135 ++++
 lib/librte_power/rte_power_kvm_vm.h                | 179 +++++
 32 files changed, 5718 insertions(+), 912 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h
 create mode 100644 lib/librte_power/channel_commands.h
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v3 01/10] Channel Manager and Monitor for VM Power Management(Host).
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 02/10] VM Power Management CLI(Host) Alan Carew
                       ` (11 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

The manager is responsible for adding communication channels to the Monitor
thread and for tracking and reporting VM state. It employs the libvirt API for
synchronization with the KVM Hypervisor, interacting with the Hypervisor to
discover the mapping of virtual CPUs (vCPUs) to the host physical CPUs (pCPUs)
and to inspect the VM running state.
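
As a rough illustration of the vCPU to pCPU mapping step, the sketch below
folds one vCPU's row of a libvirt-style cpumap (bit cpu % 8 of byte cpu / 8
marks a usable pCPU) into a 64-bit mask. The function name and the 64-CPU limit
are assumptions made for this example only; it does not call libvirt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert the cpumap row of one vCPU into a 64-bit
 * pCPU mask. 'cpumaps' holds one row of 'maplen' bytes per vCPU.
 */
static uint64_t
sketch_cpumap_to_mask(const unsigned char *cpumaps, size_t maplen,
		unsigned vcpu, unsigned n_host_cpus)
{
	const unsigned char *row;
	uint64_t mask = 0;
	unsigned cpu;

	assert(cpumaps != NULL);
	/* A 64-bit mask can describe at most 64 host CPUs */
	assert(n_host_cpus <= 64 && n_host_cpus <= maplen * 8);

	row = cpumaps + (size_t)vcpu * maplen;
	for (cpu = 0; cpu < n_host_cpus; cpu++) {
		/* Bit (cpu % 8) of byte (cpu / 8) is set when the pCPU is usable */
		if (row[cpu / 8] & (1u << (cpu % 8)))
			mask |= UINT64_C(1) << cpu;
	}
	return mask;
}
```

A mask with a single bit set corresponds to the recommended case of one vCPU
pinned to exactly one physical core.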

The manager provides the following functionality to the CLI:
1) Connect to a libvirtd instance, default: qemu:///system
2) Add a VM to an internal list. Each VM is identified by a "name" which must
   correspond to a valid libvirt Domain Name.
3) Add communication channels associated with a VM to the epoll based Monitor
   thread.
   The channels must exist and be in the form of:
   /tmp/powermonitor/<vm_name>.<channel_number>. Each channel is a
   Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
   non-blocking mode.
   Each VM can have a maximum of 64 channels associated with it.
4) Disable or re-enable VM communication channels. Once added to the Monitor
   thread, channels remain under that thread's control; however, acting on
   channel requests can be disabled and re-enabled via the CLI.
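
The channel naming convention in item 3 can be checked with a small parser.
The following is a minimal sketch, not the code in this patch: the function
name and the SKETCH_MAX_CHANNELS constant are hypothetical, and it matches the
VM name as a prefix rather than splitting on the first '.'.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limit mirroring the 64-channel maximum described above */
#define SKETCH_MAX_CHANNELS 64

/*
 * Parse "<vm_name>.<channel_number>" and store the channel number in *out.
 * Returns 0 on success, -1 if the name does not belong to vm_name, is
 * malformed, or the channel number is out of range.
 */
static int
sketch_parse_channel(const char *file_name, const char *vm_name,
		unsigned *out)
{
	size_t vm_len;
	const char *num;
	char *tail = NULL;
	unsigned long val;

	assert(file_name != NULL && vm_name != NULL && out != NULL);

	vm_len = strlen(vm_name);
	/* The file name must start with "<vm_name>." */
	if (strncmp(file_name, vm_name, vm_len) != 0 ||
			file_name[vm_len] != '.')
		return -1;

	num = file_name + vm_len + 1;
	/* Require a leading digit: rejects an empty suffix, signs and spaces */
	if (*num < '0' || *num > '9')
		return -1;

	errno = 0;
	val = strtoul(num, &tail, 10);
	if (errno != 0 || *tail != '\0' || val >= SKETCH_MAX_CHANNELS)
		return -1;

	*out = (unsigned)val;
	return 0;
}
```

Matching the full "<vm_name>." prefix keeps a VM called "vm0" from claiming
the channels of a VM called "vm01".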

The monitor is an epoll based infinite loop running in a separate thread that
waits on channel events from VMs and calls the corresponding functions. Channel
definitions from the manager are registered via the epoll event opaque pointer
when calling epoll_ctl(EPOLL_CTL_ADD). This allows the monitor to obtain the
channel's file descriptor for reading EPOLLIN events and to map the vCPU to the
pCPU(s) associated with a request from a particular VM.
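
The opaque-pointer registration described above can be sketched as follows.
This is an illustrative, Linux-only example under stated assumptions: struct
sketch_channel and both helper names are hypothetical stand-ins, not the
structures or functions added by this patch.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-channel state kept by the manager */
struct sketch_channel {
	int fd;
	unsigned channel_num;
};

/* Register a channel; the struct pointer travels in the epoll event */
static int
sketch_monitor_add(int epfd, struct sketch_channel *chan)
{
	struct epoll_event ev;

	assert(chan != NULL);
	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.ptr = chan;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, chan->fd, &ev);
}

/*
 * Wait for one readable channel and return its state, or NULL on timeout
 * or error. The caller reads the request from the returned chan->fd.
 */
static struct sketch_channel *
sketch_monitor_wait(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	int n = epoll_wait(epfd, &ev, 1, timeout_ms);

	if (n != 1 || !(ev.events & EPOLLIN))
		return NULL;
	return ev.data.ptr;
}
```

Because the event carries the pointer, the monitor needs no separate lookup
table from file descriptor to channel.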

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/channel_manager.c | 804 ++++++++++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h | 314 +++++++++++
 examples/vm_power_manager/channel_monitor.c | 228 ++++++++
 examples/vm_power_manager/channel_monitor.h | 102 ++++
 4 files changed, 1448 insertions(+)
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h

diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
new file mode 100644
index 0000000..a14f191
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.c
@@ -0,0 +1,804 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/select.h>
+
+#include <rte_config.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_mempool.h>
+#include <rte_log.h>
+#include <rte_atomic.h>
+#include <rte_spinlock.h>
+
+#include <libvirt/libvirt.h>
+
+#include "channel_manager.h"
+#include "channel_commands.h"
+#include "channel_monitor.h"
+
+
+#define RTE_LOGTYPE_CHANNEL_MANAGER RTE_LOGTYPE_USER1
+
+#define ITERATIVE_BITMASK_CHECK_64(mask_u64b, i) \
+		for (i = 0; mask_u64b; mask_u64b &= ~(1ULL << i++)) \
+		if ((mask_u64b >> i) & 1)
+
+/* Global pointer to libvirt connection */
+static virConnectPtr global_vir_conn_ptr;
+
+static unsigned char *global_cpumaps;
+static virVcpuInfo *global_vircpuinfo;
+static size_t global_maplen;
+
+static unsigned global_n_host_cpus;
+
+/*
+ * Represents a single Virtual Machine
+ */
+struct virtual_machine_info {
+	char name[CHANNEL_MGR_MAX_NAME_LEN];
+	rte_atomic64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];
+	struct channel_info *channels[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	uint64_t channel_mask;
+	uint8_t num_channels;
+	enum vm_status status;
+	virDomainPtr domainPtr;
+	virDomainInfo info;
+	rte_spinlock_t config_spinlock;
+	LIST_ENTRY(virtual_machine_info) vms_info;
+};
+
+LIST_HEAD(, virtual_machine_info) vm_list_head;
+
+static struct virtual_machine_info *
+find_domain_by_name(const char *name)
+{
+	struct virtual_machine_info *info;
+	LIST_FOREACH(info, &vm_list_head, vms_info) {
+		if (!strncmp(info->name, name, CHANNEL_MGR_MAX_NAME_LEN-1))
+			return info;
+	}
+	return NULL;
+}
+
+static int
+update_pcpus_mask(struct virtual_machine_info *vm_info)
+{
+	virVcpuInfoPtr cpuinfo;
+	unsigned i, j;
+	int n_vcpus;
+	uint64_t mask;
+
+	memset(global_cpumaps, 0, CHANNEL_CMDS_MAX_CPUS*global_maplen);
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		n_vcpus = virDomainGetVcpuPinInfo(vm_info->domainPtr,
+				vm_info->info.nrVirtCpu, global_cpumaps, global_maplen,
+				VIR_DOMAIN_AFFECT_CONFIG);
+		if (n_vcpus < 0) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
+					"in-active VM '%s'\n", vm_info->name);
+			return -1;
+		}
+		goto update_pcpus;
+	}
+
+	memset(global_vircpuinfo, 0, sizeof(*global_vircpuinfo)*
+			CHANNEL_CMDS_MAX_CPUS);
+
+	cpuinfo = global_vircpuinfo;
+
+	n_vcpus = virDomainGetVcpus(vm_info->domainPtr, cpuinfo,
+			CHANNEL_CMDS_MAX_CPUS, global_cpumaps, global_maplen);
+	if (n_vcpus < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
+							"active VM '%s'\n", vm_info->name);
+		return -1;
+	}
+update_pcpus:
+	if (n_vcpus >= CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Number of vCPUS(%d) is out of range "
+				"0...%d\n", n_vcpus, CHANNEL_CMDS_MAX_CPUS-1);
+		return -1;
+	}
+	if (n_vcpus != vm_info->info.nrVirtCpu) {
+		RTE_LOG(INFO, CHANNEL_MANAGER, "Updating the number of vCPUs for VM '%s'"
+				" from %d -> %d\n", vm_info->name, vm_info->info.nrVirtCpu,
+				n_vcpus);
+		vm_info->info.nrVirtCpu = n_vcpus;
+	}
+	for (i = 0; i < vm_info->info.nrVirtCpu; i++) {
+		mask = 0;
+		for (j = 0; j < global_n_host_cpus; j++) {
+			if (VIR_CPU_USABLE(global_cpumaps, global_maplen, i, j) > 0) {
+				mask |= 1ULL << j;
+			}
+		}
+		rte_atomic64_set(&vm_info->pcpu_mask[i], mask);
+	}
+	return 0;
+}
+
+int
+set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask)
+{
+	unsigned i = 0;
+	int flags = VIR_DOMAIN_AFFECT_LIVE|VIR_DOMAIN_AFFECT_CONFIG;
+	struct virtual_machine_info *vm_info;
+	uint64_t mask = core_mask;
+
+	if (vcpu >= CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds max allowable(%d)\n",
+				vcpu, CHANNEL_CMDS_MAX_CPUS-1);
+		return -1;
+	}
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
+				"mask(0x%"PRIx64") for VM '%s', VM is not active\n",
+				vcpu, core_mask, vm_info->name);
+		return -1;
+	}
+
+	if (vcpu >= vm_info->info.nrVirtCpu) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds the assigned number of "
+				"vCPUs(%u)\n", vcpu, vm_info->info.nrVirtCpu);
+		return -1;
+	}
+	memset(global_cpumaps, 0 , CHANNEL_CMDS_MAX_CPUS * global_maplen);
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		if (i >= global_n_host_cpus) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "CPU(%u) exceeds the available "
+					"number of CPUs(%u)\n", i, global_n_host_cpus);
+			return -1;
+		}
+		VIR_USE_CPU(global_cpumaps, i);
+	}
+	if (virDomainPinVcpuFlags(vm_info->domainPtr, vcpu, global_cpumaps,
+			global_maplen, flags) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
+				"mask(0x%"PRIx64") for VM '%s'\n", vcpu, core_mask,
+				vm_info->name);
+		return -1;
+	}
+	rte_atomic64_set(&vm_info->pcpu_mask[vcpu], core_mask);
+	return 0;
+
+}
+
+int
+set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num)
+{
+	uint64_t mask = 1ULL << core_num;
+	return set_pcpus_mask(vm_name, vcpu, mask);
+}
+
+uint64_t
+get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu)
+{
+	struct virtual_machine_info *vm_info =
+			(struct virtual_machine_info *)chan_info->priv_info;
+	return rte_atomic64_read(&vm_info->pcpu_mask[vcpu]);
+}
+
+static inline int
+channel_exists(struct virtual_machine_info *vm_info, unsigned channel_num)
+{
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	if (vm_info->channel_mask & (1ULL << channel_num)) {
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+		return 1;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+
+
+static int
+open_non_blocking_channel(struct channel_info *info)
+{
+	int ret, flags;
+	struct sockaddr_un sock_addr;
+	fd_set soc_fd_set;
+	struct timeval tv;
+
+	info->fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (info->fd == -1) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error(%s) creating socket for '%s'\n",
+				strerror(errno),
+				info->channel_path);
+		return -1;
+	}
+	sock_addr.sun_family = AF_UNIX;
+	memcpy(&sock_addr.sun_path, info->channel_path,
+			strlen(info->channel_path)+1);
+
+	/* Get current flags */
+	flags = fcntl(info->fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) fcntl get flags socket for "
+				"'%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	/* Set to Non Blocking */
+	flags |= O_NONBLOCK;
+	if (fcntl(info->fd, F_SETFL, flags) < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) setting non-blocking "
+				"socket for '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	ret = connect(info->fd, (struct sockaddr *)&sock_addr,
+			sizeof(sock_addr));
+	if (ret < 0) {
+		/* ECONNREFUSED error is given when VM is not active */
+		if (errno == ECONNREFUSED) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "VM is not active or has not "
+					"activated its endpoint to channel %s\n",
+					info->channel_path);
+			return -1;
+		}
+		/* Wait for tv_sec if in progress */
+		else if (errno == EINPROGRESS) {
+			tv.tv_sec = 2;
+			tv.tv_usec = 0;
+			FD_ZERO(&soc_fd_set);
+			FD_SET(info->fd, &soc_fd_set);
+			if (select(info->fd+1, NULL, &soc_fd_set, NULL, &tv) <= 0) {
+				RTE_LOG(WARNING, CHANNEL_MANAGER, "Timeout or error on channel "
+						"'%s'\n", info->channel_path);
+				return -1;
+			}
+		} else {
+			/* Any other error */
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) connecting socket"
+					" for '%s'\n", strerror(errno), info->channel_path);
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static int
+setup_channel_info(struct virtual_machine_info **vm_info_dptr,
+		struct channel_info **chan_info_dptr, unsigned channel_num)
+{
+	struct channel_info *chan_info = *chan_info_dptr;
+	struct virtual_machine_info *vm_info = *vm_info_dptr;
+
+	chan_info->channel_num = channel_num;
+	chan_info->priv_info = (void *)vm_info;
+	chan_info->status = CHANNEL_MGR_CHANNEL_DISCONNECTED;
+	if (open_non_blocking_channel(chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not open channel: "
+				"'%s' for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+	}
+	if (add_channel_to_monitor(&chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not add channel: "
+				"'%s' to epoll ctl for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+
+	}
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->num_channels++;
+	vm_info->channel_mask |= 1ULL << channel_num;
+	vm_info->channels[channel_num] = chan_info;
+	chan_info->status = CHANNEL_MGR_CHANNEL_CONNECTED;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+int
+add_all_channels(const char *vm_name)
+{
+	DIR *d;
+	struct dirent *dir;
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char *token, *remaining, *tail_ptr;
+	char socket_name[PATH_MAX];
+	unsigned channel_num;
+	int num_channels_enabled = 0;
+
+	/* verify VM exists */
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' not found"
+				" during channel discovery\n", vm_name);
+		return 0;
+	}
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
+		return 0;
+	}
+	d = opendir(CHANNEL_MGR_SOCKET_PATH);
+	if (d == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error opening directory '%s': %s\n",
+				CHANNEL_MGR_SOCKET_PATH, strerror(errno));
+		return -1;
+	}
+	while ((dir = readdir(d)) != NULL) {
+		if (!strncmp(dir->d_name, ".", 1) ||
+				!strncmp(dir->d_name, "..", 2))
+			continue;
+
+		snprintf(socket_name, sizeof(socket_name), "%s", dir->d_name);
+		remaining = socket_name;
+		/* Extract vm_name from "<vm_name>.<channel_num>" */
+		token = strsep(&remaining, ".");
+		if (remaining == NULL)
+			continue;
+		if (strncmp(vm_name, token, CHANNEL_MGR_MAX_NAME_LEN))
+			continue;
+
+		/* remaining should contain only <channel_num> */
+		errno = 0;
+		channel_num = (unsigned)strtol(remaining, &tail_ptr, 0);
+		if ((errno != 0) || (remaining[0] == '\0') ||
+				(tail_ptr == NULL) || (*tail_ptr != '\0')) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Malformed channel name "
+					"'%s' found it should be in the form of "
+					"'<guest_name>.<channel_num>(decimal)'\n",
+					dir->d_name);
+			continue;
+		}
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Channel number(%u) is "
+					"greater than max allowable: %d, skipping '%s%s'\n",
+					channel_num, CHANNEL_CMDS_MAX_VM_CHANNELS-1,
+					CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+			continue;
+		}
+		/* if channel has not been added previously */
+		if (channel_exists(vm_info, channel_num))
+			continue;
+
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s%s'\n", CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+			continue;
+		}
+
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s",
+				CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+
+		if (setup_channel_info(&vm_info, &chan_info, channel_num) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+
+		num_channels_enabled++;
+	}
+	closedir(d);
+	return num_channels_enabled;
+}
+
+int
+add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char socket_path[PATH_MAX];
+	unsigned i;
+	int num_channels_enabled = 0;
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add channels: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
+		return 0;
+	}
+
+	for (i = 0; i < len_channel_list; i++) {
+
+		if (channel_list[i] >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel(%u) is out of range "
+							"0...%d\n", channel_list[i],
+							CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+			continue;
+		}
+		if (channel_exists(vm_info, channel_list[i])) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel already exists, skipping "
+					"'%s.%u'\n", vm_name, channel_list[i]);
+			continue;
+		}
+
+		snprintf(socket_path, sizeof(socket_path), "%s%s.%u",
+				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
+		errno = 0;
+		if (access(socket_path, F_OK) < 0) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Channel path '%s' error: "
+					"%s\n", socket_path, strerror(errno));
+			continue;
+		}
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s'\n", socket_path);
+			continue;
+		}
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s.%u",
+				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
+		if (setup_channel_info(&vm_info, &chan_info, channel_list[i]) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+		num_channels_enabled++;
+
+	}
+	return num_channels_enabled;
+}
+
+int
+remove_channel(struct channel_info **chan_info_dptr)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info = *chan_info_dptr;
+
+	close(chan_info->fd);
+
+	vm_info = (struct virtual_machine_info *)chan_info->priv_info;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->channel_mask &= ~(1ULL << chan_info->channel_num);
+	vm_info->num_channels--;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+	rte_free(chan_info);
+	return 0;
+}
+
+int
+set_channel_status_all(const char *vm_name, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	uint64_t mask;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
+			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to change channel status: "
+				"VM '%s' not found\n", vm_name);
+		return 0;
+	}
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		vm_info->channels[i]->status = status;
+		num_channels_changed++;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return num_channels_changed;
+
+}
+
+int
+set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
+			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to change channel status: "
+				"VM '%s' not found\n", vm_name);
+		return 0;
+	}
+	for (i = 0; i < len_channel_list; i++) {
+		if (channel_exists(vm_info, channel_list[i])) {
+			rte_spinlock_lock(&(vm_info->config_spinlock));
+			vm_info->channels[channel_list[i]]->status = status;
+			rte_spinlock_unlock(&(vm_info->config_spinlock));
+			num_channels_changed++;
+		}
+	}
+	return num_channels_changed;
+}
+
+int
+get_info_vm(const char *vm_name, struct vm_info *info)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i, channel_num = 0;
+	uint64_t mask;
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+	info->status = CHANNEL_MGR_VM_ACTIVE;
+	if (!virDomainIsActive(vm_info->domainPtr))
+		info->status = CHANNEL_MGR_VM_INACTIVE;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		info->channels[channel_num].channel_num = i;
+		memcpy(info->channels[channel_num].channel_path,
+				vm_info->channels[i]->channel_path, PATH_MAX);
+		info->channels[channel_num].status = vm_info->channels[i]->status;
+		info->channels[channel_num].fd = vm_info->channels[i]->fd;
+		channel_num++;
+	}
+
+	info->num_channels = channel_num;
+	info->num_vcpus = vm_info->info.nrVirtCpu;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+	memcpy(info->name, vm_info->name, sizeof(vm_info->name));
+	for (i = 0; i < info->num_vcpus; i++) {
+		info->pcpu_mask[i] = rte_atomic64_read(&vm_info->pcpu_mask[i]);
+	}
+	return 0;
+}
+
+int
+add_vm(const char *vm_name)
+{
+	struct virtual_machine_info *new_domain;
+	virDomainPtr dom_ptr;
+	int i;
+
+	if (find_domain_by_name(vm_name) != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add VM: VM '%s' "
+				"already exists\n", vm_name);
+		return -1;
+	}
+
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "No connection to hypervisor exists\n");
+		return -1;
+	}
+	dom_ptr = virDomainLookupByName(global_vir_conn_ptr, vm_name);
+	if (dom_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error on VM lookup with libvirt: "
+				"VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	new_domain = rte_malloc("virtual_machine_info", sizeof(*new_domain),
+			CACHE_LINE_SIZE);
+	if (new_domain == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to allocate memory for VM "
+				"info\n");
+		return -1;
+	}
+	new_domain->domainPtr = dom_ptr;
+	if (virDomainGetInfo(new_domain->domainPtr, &new_domain->info) != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get libvirt VM info\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	if (new_domain->info.nrVirtCpu > CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error the number of virtual CPUs(%u) is "
+				"greater than allowable(%d)\n", new_domain->info.nrVirtCpu,
+				CHANNEL_CMDS_MAX_CPUS);
+		rte_free(new_domain);
+		return -1;
+	}
+
+	for (i = 0; i < CHANNEL_CMDS_MAX_CPUS; i++) {
+		rte_atomic64_init(&new_domain->pcpu_mask[i]);
+	}
+	if (update_pcpus_mask(new_domain) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting physical CPU pinning\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	snprintf(new_domain->name, sizeof(new_domain->name), "%s", vm_name);
+	new_domain->channel_mask = 0;
+	new_domain->num_channels = 0;
+
+	if (!virDomainIsActive(dom_ptr))
+		new_domain->status = CHANNEL_MGR_VM_INACTIVE;
+	else
+		new_domain->status = CHANNEL_MGR_VM_ACTIVE;
+
+	rte_spinlock_init(&(new_domain->config_spinlock));
+	LIST_INSERT_HEAD(&vm_list_head, new_domain, vms_info);
+	return 0;
+}
+
+int
+remove_vm(const char *vm_name)
+{
+	struct virtual_machine_info *vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM: VM '%s' "
+				"not found\n", vm_name);
+		return -1;
+	}
+	rte_spinlock_lock(&vm_info->config_spinlock);
+	if (vm_info->num_channels != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM '%s', there are "
+				"%"PRIu8" channels still active\n",
+				vm_name, vm_info->num_channels);
+		rte_spinlock_unlock(&vm_info->config_spinlock);
+		return -1;
+	}
+	LIST_REMOVE(vm_info, vms_info);
+	rte_spinlock_unlock(&vm_info->config_spinlock);
+	rte_free(vm_info);
+	return 0;
+}
+
+static void
+disconnect_hypervisor(void)
+{
+	if (global_vir_conn_ptr != NULL) {
+		virConnectClose(global_vir_conn_ptr);
+		global_vir_conn_ptr = NULL;
+	}
+}
+
+static int
+connect_hypervisor(const char *path)
+{
+	if (global_vir_conn_ptr != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection"
+				"already established\n", path);
+		return -1;
+	}
+	global_vir_conn_ptr = virConnectOpen(path);
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error failed to open connection to "
+				"Hypervisor '%s'\n", path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_manager_init(const char *path)
+{
+	int n_cpus;
+	LIST_INIT(&vm_list_head);
+	if (connect_hypervisor(path) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to initialize channel manager\n");
+		return -1;
+	}
+
+	global_maplen = VIR_CPU_MAPLEN(CHANNEL_CMDS_MAX_CPUS);
+
+	global_vircpuinfo = rte_zmalloc(NULL, sizeof(*global_vircpuinfo) *
+			CHANNEL_CMDS_MAX_CPUS, CACHE_LINE_SIZE);
+	if (global_vircpuinfo == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for CPU Info\n");
+		goto error;
+	}
+	global_cpumaps = rte_zmalloc(NULL, CHANNEL_CMDS_MAX_CPUS * global_maplen,
+			CACHE_LINE_SIZE);
+	if (global_cpumaps == NULL) {
+		goto error;
+	}
+
+	n_cpus = virNodeGetCPUMap(global_vir_conn_ptr, NULL, NULL, 0);
+	if (n_cpus <= 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get the number of Host "
+				"CPUs\n");
+		goto error;
+	}
+	global_n_host_cpus = (unsigned)n_cpus;
+
+	if (global_n_host_cpus > CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "The number of host CPUs(%u) exceeds the "
+				"maximum of %u\n", global_n_host_cpus, CHANNEL_CMDS_MAX_CPUS);
+		goto error;
+
+	}
+
+	return 0;
+error:
+	disconnect_hypervisor();
+	return -1;
+}
+
+void
+channel_manager_exit(void)
+{
+	unsigned i;
+	uint64_t mask;
+	struct virtual_machine_info *vm_info;
+
+	while ((vm_info = LIST_FIRST(&vm_list_head)) != NULL) {
+
+		rte_spinlock_lock(&(vm_info->config_spinlock));
+
+		mask = vm_info->channel_mask;
+		ITERATIVE_BITMASK_CHECK_64(mask, i) {
+			remove_channel_from_monitor(vm_info->channels[i]);
+			close(vm_info->channels[i]->fd);
+			rte_free(vm_info->channels[i]);
+		}
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+		LIST_REMOVE(vm_info, vms_info);
+		rte_free(vm_info);
+	}
+
+	if (global_cpumaps != NULL)
+		rte_free(global_cpumaps);
+	if (global_vircpuinfo != NULL)
+		rte_free(global_vircpuinfo);
+	disconnect_hypervisor();
+}
diff --git a/examples/vm_power_manager/channel_manager.h b/examples/vm_power_manager/channel_manager.h
new file mode 100644
index 0000000..12c29c3
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.h
@@ -0,0 +1,314 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MANAGER_H_
+#define CHANNEL_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <linux/limits.h>
+#include <rte_atomic.h>
+#include "channel_commands.h"
+
+/* Maximum name length including '\0' terminator */
+#define CHANNEL_MGR_MAX_NAME_LEN    64
+
+/* Maximum number of channels to each Virtual Machine */
+#define CHANNEL_MGR_MAX_CHANNELS    64
+
+/* Hypervisor Path for libvirt(qemu/KVM) */
+#define CHANNEL_MGR_DEFAULT_HV_PATH "qemu:///system"
+
+/* File socket directory */
+#define CHANNEL_MGR_SOCKET_PATH     "/tmp/powermonitor/"
+
+/* Communication Channel Status */
+enum channel_status { CHANNEL_MGR_CHANNEL_DISCONNECTED = 0,
+	CHANNEL_MGR_CHANNEL_CONNECTED,
+	CHANNEL_MGR_CHANNEL_DISABLED,
+	CHANNEL_MGR_CHANNEL_PROCESSING};
+
+/* VM libvirt(qemu/KVM) connection status */
+enum vm_status { CHANNEL_MGR_VM_INACTIVE = 0, CHANNEL_MGR_VM_ACTIVE};
+
+/*
+ *  Represents a single and exclusive VM channel that exists between a guest and
+ *  the host.
+ */
+struct channel_info {
+	char channel_path[PATH_MAX]; /**< Path to host socket */
+	volatile uint32_t status;    /**< Connection status(enum channel_status) */
+	int fd;                      /**< AF_UNIX socket fd */
+	unsigned channel_num;        /**< CHANNEL_MGR_SOCKET_PATH/<vm_name>.channel_num */
+	void *priv_info;             /**< Pointer to private info, do not modify */
+};
+
+/* Represents a single VM instance used to return internal information about
+ * a VM */
+struct vm_info {
+	char name[CHANNEL_MGR_MAX_NAME_LEN];          /**< VM name */
+	enum vm_status status;                        /**< libvirt status */
+	uint64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];    /**< pCPU mask for each vCPU */
+	unsigned num_vcpus;                           /**< number of vCPUS */
+	struct channel_info channels[CHANNEL_MGR_MAX_CHANNELS]; /**< Array of channel_info */
+	unsigned num_channels;                        /**< Number of channels */
+};
+
+/**
+ * Initialize the Channel Manager resources and connect to the Hypervisor
+ * specified in path.
+ * This must be successfully called first before calling any other functions.
+ * It must only be called once.
+ *
+ * @param path
+ *  Must be a local path, e.g. qemu:///system.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_manager_init(const char *path);
+
+/**
+ * Free resources associated with the Channel Manager.
+ *
+ * This closes all channels, frees all VM entries and disconnects from the
+ * Hypervisor.
+ *
+ * @return
+ *  None
+ */
+void channel_manager_exit(void);
+
+/**
+ * Get the Physical CPU mask for the VM lcore channel(vcpu). The mask is
+ * returned directly.
+ * It is not thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info
+ *
+ * @param vcpu
+ *  The virtual CPU to query.
+ *
+ *
+ * @return
+ *  - 0 on error.
+ *  - >0 on success.
+ */
+uint64_t get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu);
+
+/**
+ * Set the Physical CPU mask for the specified vCPU.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup
+ *
+ * @param vcpu
+ *  The virtual CPU to set.
+ *
+ * @param core_mask
+ *  The core mask of the physical CPU(s) to bind the vCPU to
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask);
+
+/**
+ * Set the Physical CPU for the specified vCPU.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup
+ *
+ * @param vcpu
+ *  The virtual CPU to set.
+ *
+ * @param core_num
+ *  The core number of the physical CPU(s) to bind the vCPU
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num);
+/**
+ * Add a VM as specified by name to the Channel Manager. The name must
+ * correspond to a valid libvirt domain name.
+ * This is required prior to adding channels.
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_vm(const char *name);
+
+/**
+ * Remove a previously added Virtual Machine from the Channel Manager
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_vm(const char *name);
+
+/**
+ * Add all available channels to the VM as specified by vm_name.
+ * Only channels whose paths take the form
+ * (CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number>) will be parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ */
+int add_all_channels(const char *vm_name);
+
+/**
+ * Add the channel numbers in channel_list to the domain specified by vm_name.
+ * Only channels whose paths take the form
+ * (CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number>) will be parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to add channels.
+ *
+ * @param channel_list
+ *  Pointer to list of unsigned integers, representing the channel number to add
+ *  It must be allocated outside of this function.
+ *
+ * @param num_channels
+ *  The amount of channel numbers in channel_list
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ *  - 0 for error
+ */
+int add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned num_channels);
+
+/**
+ * Remove a channel definition from the channel manager. This must only be
+ * called from the channel monitor thread.
+ *
+ * @param chan_info_dptr
+ *  Pointer to a pointer to a valid struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel(struct channel_info **chan_info_dptr);
+
+/**
+ * For all channels associated with a Virtual Machine name, update the
+ * connection status. Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
+ * CHANNEL_MGR_CHANNEL_DISABLED only.
+ *
+ *
+ * @param name
+ *  Virtual Machine name to modify all channels.
+ *
+ * @param status
+ *  The status to set each channel
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status_all(const char *name, enum channel_status status);
+
+/**
+ * For all channels in channel_list associated with a Virtual Machine name
+ * update the connection status of each.
+ * Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
+ * CHANNEL_MGR_CHANNEL_DISABLED only.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name whose channels are to be modified.
+ *
+ * @param channel_list
+ *  Pointer to list of unsigned integers, representing the channel numbers to
+ *  modify.
+ *  It must be allocated outside of this function.
+ *
+ * @param len_channel_list
+ *  The number of channel numbers in channel_list
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status);
+
+/**
+ * Populates a pointer to struct vm_info associated with vm_name.
+ *
+ * @param vm_name
+ *  The name of the virtual machine to lookup.
+ *
+ * @param info
+ *  Pointer to a struct vm_info, this must be allocated prior to calling this
+ *  function.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int get_info_vm(const char *vm_name, struct vm_info *info);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_MANAGER_H_ */
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
new file mode 100644
index 0000000..5dc3e41
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -0,0 +1,228 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <signal.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/epoll.h>
+#include <sys/queue.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+
+
+#include "channel_monitor.h"
+#include "channel_commands.h"
+#include "channel_manager.h"
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+
+static volatile unsigned run_loop = 1;
+static int global_event_fd;
+static struct epoll_event *global_events_list;
+
+void channel_monitor_exit(void)
+{
+	run_loop = 0;
+	rte_free(global_events_list);
+}
+
+static int
+process_request(struct channel_packet *pkt, struct channel_info *chan_info)
+{
+	uint64_t core_mask;
+
+	if (chan_info == NULL)
+		return -1;
+
+	if (rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_CONNECTED,
+			CHANNEL_MGR_CHANNEL_PROCESSING) == 0)
+		return -1;
+
+	if (pkt->command == CPU_POWER) {
+		core_mask = get_pcpus_mask(chan_info, pkt->resource_id);
+		if (core_mask == 0) {
+			RTE_LOG(ERR, CHANNEL_MONITOR, "Error getting physical CPU mask for "
+					"channel '%s' using vCPU(%u)\n", chan_info->channel_path,
+					(unsigned)pkt->resource_id);
+			return -1;
+		}
+		if (__builtin_popcountll(core_mask) == 1) {
+
+			unsigned core_num = __builtin_ffsll(core_mask) - 1;
+
+			switch (pkt->unit) {
+			case(CPU_POWER_SCALE_MIN):
+					power_manager_scale_core_min(core_num);
+			break;
+			case(CPU_POWER_SCALE_MAX):
+					power_manager_scale_core_max(core_num);
+			break;
+			case(CPU_POWER_SCALE_DOWN):
+					power_manager_scale_core_down(core_num);
+			break;
+			case(CPU_POWER_SCALE_UP):
+					power_manager_scale_core_up(core_num);
+			break;
+			default:
+				break;
+			}
+		} else {
+			switch (pkt->unit) {
+			case(CPU_POWER_SCALE_MIN):
+					power_manager_scale_mask_min(core_mask);
+			break;
+			case(CPU_POWER_SCALE_MAX):
+					power_manager_scale_mask_max(core_mask);
+			break;
+			case(CPU_POWER_SCALE_DOWN):
+					power_manager_scale_mask_down(core_mask);
+			break;
+			case(CPU_POWER_SCALE_UP):
+					power_manager_scale_mask_up(core_mask);
+			break;
+			default:
+				break;
+			}
+
+		}
+	}
+	/* Return is not checked as channel status may have been set to DISABLED
+	 * from management thread
+	 */
+	rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_PROCESSING,
+			CHANNEL_MGR_CHANNEL_CONNECTED);
+	return 0;
+
+}
+
+int
+add_channel_to_monitor(struct channel_info **chan_info)
+{
+	struct channel_info *info = *chan_info;
+	struct epoll_event event;
+	event.events = EPOLLIN;
+	event.data.ptr = info;
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_ADD, info->fd, &event) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to add channel '%s' "
+				"to epoll\n", info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+remove_channel_from_monitor(struct channel_info *chan_info)
+{
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_DEL, chan_info->fd, NULL) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to remove channel '%s' "
+				"from epoll\n", chan_info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_monitor_init(void)
+{
+	global_event_fd = epoll_create1(0);
+	if (global_event_fd < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Error creating epoll context with "
+				"error %s\n", strerror(errno));
+		return -1;
+	}
+	global_events_list = rte_malloc("epoll_events", sizeof(*global_events_list)
+			* MAX_EVENTS, CACHE_LINE_SIZE);
+	if (global_events_list == NULL) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for "
+				"epoll events\n");
+		return -1;
+	}
+	return 0;
+}
+
+void
+run_channel_monitor(void)
+{
+	while (run_loop) {
+		int n_events, i;
+		n_events = epoll_wait(global_event_fd, global_events_list,
+				MAX_EVENTS, 1);
+		if (!run_loop)
+			break;
+		for (i = 0; i < n_events; i++) {
+			struct channel_info *chan_info = (struct channel_info *)
+					global_events_list[i].data.ptr;
+			if ((global_events_list[i].events & EPOLLERR) ||
+					(global_events_list[i].events & EPOLLHUP)) {
+				remove_channel(&chan_info);
+			}
+			if (global_events_list[i].events & EPOLLIN) {
+
+				int n_bytes, err = 0;
+				struct channel_packet pkt;
+				void *buffer = &pkt;
+				int buffer_len = sizeof(pkt);
+				while (buffer_len > 0) {
+					n_bytes = read(chan_info->fd, buffer, buffer_len);
+					if (n_bytes == buffer_len)
+						break;
+					if (n_bytes == -1) {
+						err = errno;
+						RTE_LOG(DEBUG, CHANNEL_MONITOR, "Received error on "
+								"channel '%s' read: %s\n",
+								chan_info->channel_path, strerror(err));
+						remove_channel(&chan_info);
+						break;
+					}
+					buffer = (char *)buffer + n_bytes;
+					buffer_len -= n_bytes;
+				}
+				if (!err)
+					process_request(&pkt, chan_info);
+			}
+		}
+	}
+}
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
new file mode 100644
index 0000000..c138607
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -0,0 +1,102 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MONITOR_H_
+#define CHANNEL_MONITOR_H_
+
+#include "channel_manager.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Channel Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_monitor_init(void);
+
+/**
+ * Run the channel monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_channel_monitor(void);
+
+/**
+ * Exit the Channel Monitor, stopping the epoll_wait loop and event
+ * processing.
+ *
+ * @return
+ *  None
+ */
+void channel_monitor_exit(void);
+
+/**
+ * Add an open channel to monitor via epoll. A pointer to struct channel_info
+ * will be registered with epoll for event processing.
+ * It is thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info pointer.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_channel_to_monitor(struct channel_info **chan_info);
+
+/**
+ * Remove a previously added channel from epoll control.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel_from_monitor(struct channel_info *chan_info);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* CHANNEL_MONITOR_H_ */
-- 
1.9.3
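
A note for readers of process_request() in channel_monitor.c above: the
handler treats a pCPU mask with exactly one bit set as a single-core request
and takes the core number from the lowest set bit. The sketch below isolates
that decoding step. It assumes the GCC/Clang builtins the patch already uses;
single_core_from_mask() is a hypothetical helper name and is not part of the
patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): return the core number when
 * exactly one bit is set in core_mask, or -1 when the mask is empty or
 * names several cores and must be handled as a mask.
 */
int
single_core_from_mask(uint64_t core_mask)
{
	if (__builtin_popcountll(core_mask) != 1)
		return -1;
	/* ffsll() is 1-based, so subtract 1 to get the bit index. */
	return __builtin_ffsll((long long)core_mask) - 1;
}
```

With this helper, a mask of 1ULL << 5 yields core 5, while 0x30 (two cores)
and 0 both yield -1, which corresponds to the patch falling back to the
power_manager_scale_mask_*() calls or rejecting the request.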


* [dpdk-dev] [PATCH v3 02/10] VM Power Management CLI(Host).
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 03/10] CPU Frequency Power Management(Host) Alan Carew
                       ` (10 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

The CLI is used for administering the channel monitor and manager and for
manually setting the CPU frequency on the host.

Supports the following commands:
 add_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 rm_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 add_channels [Fixed STRING]: add_channels <vm_name> <list>|all, add
  communication channels for the specified VM, the virtio channels must be
  enabled in the VM configuration(qemu/libvirt) and the associated VM must be
  active. <list> is a comma-separated list of channel numbers to add, using the
  keyword 'all' will attempt to add all channels for the VM

 set_channel_status [Fixed STRING]:
  set_channel_status <vm_name> <list>|all enabled|disabled, enable or disable
  the communication channels in list(comma-separated) for the specified VM,
  alternatively list can be replaced with keyword 'all'. Disabled channels will
  still receive packets on the host, however the commands they specify will be
  ignored. Set status to 'enabled' to begin processing requests again.

 show_vm [Fixed STRING]: show_vm <vm_name>, prints the information on the
  specified VM(s), the information lists the number of vCPUS, the pinning to
  pCPU(s) as a bit mask, along with any communication channels associated with
  each VM

 show_cpu_freq_mask [Fixed STRING]: show_cpu_freq_mask <mask>, Get the current
  frequency for each core specified in the mask

 set_cpu_freq_mask [Fixed STRING]: set_cpu_freq_mask <core_mask> <up|down|min|max>,
  Set the current frequency for the cores specified in <core_mask> by scaling
  each up/down/min/max.

 show_cpu_freq [Fixed STRING]: Get the current frequency for the specified core

 set_cpu_freq [Fixed STRING]: set_cpu_freq <core_num> <up|down|min|max>,
  Set the current frequency for the specified core by scaling up/down/min/max

 quit [Fixed STRING]: close the application
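
As an illustration of the <list> argument accepted by add_channels and
set_channel_status, here is a minimal sketch of parsing a comma-separated
channel list with bounds checking. It is not the code from the patch, which
uses strsep() and strtol(); this version uses strtoul() and leaves the input
string unmodified. parse_channel_list() and its max parameter are
hypothetical; max stands in for CHANNEL_CMDS_MAX_VM_CHANNELS.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in the patch): parse a list such as "0,1,3"
 * into out[]. 'max' bounds both the channel numbers (valid: 0..max-1) and
 * the number of entries stored in out[].
 * Returns the number of channels parsed, or -1 on malformed input.
 */
int
parse_channel_list(const char *list, unsigned *out, unsigned max)
{
	unsigned n = 0;
	const char *p = list;

	if (list == NULL)
		return -1;

	for (;;) {
		char *end;
		unsigned long val;

		/* Require a digit so "", "1,,2", "1," and "-1" are rejected. */
		if (!isdigit((unsigned char)*p))
			return -1;
		errno = 0;
		val = strtoul(p, &end, 10);
		if (errno != 0 || val >= max || n == max)
			return -1;
		out[n++] = (unsigned)val;
		if (*end == '\0')
			return (int)n;
		if (*end != ',')
			return -1;
		p = end + 1;
	}
}
```

Rejecting the whole list on the first bad entry keeps the caller from acting
on a partially parsed request. Checking n against max before each store
prevents writing past the end of out[] when the list has more entries than
the array can hold.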

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/vm_power_cli.c | 669 +++++++++++++++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h |  47 +++
 2 files changed, 716 insertions(+)
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h

diff --git a/examples/vm_power_manager/vm_power_cli.c b/examples/vm_power_manager/vm_power_cli.c
new file mode 100644
index 0000000..a8cfb3a
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.c
@@ -0,0 +1,669 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <string.h>
+#include <termios.h>
+#include <errno.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+
+#include "vm_power_cli.h"
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "channel_commands.h"
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+		struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+struct cmd_show_vm_result {
+	cmdline_fixed_string_t show_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_show_vm_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_show_vm_result *res = parsed_result;
+	struct vm_info info;
+	unsigned i;
+
+	if (get_info_vm(res->vm_name, &info) != 0)
+		return;
+	cmdline_printf(cl, "VM: '%s', status = ", info.name);
+	if (info.status == CHANNEL_MGR_VM_ACTIVE)
+		cmdline_printf(cl, "ACTIVE\n");
+	else
+		cmdline_printf(cl, "INACTIVE\n");
+	cmdline_printf(cl, "Channels %u\n", info.num_channels);
+	for (i = 0; i < info.num_channels; i++) {
+		cmdline_printf(cl, "  [%u]: %s, status = ", i,
+				info.channels[i].channel_path);
+		switch (info.channels[i].status) {
+		case CHANNEL_MGR_CHANNEL_CONNECTED:
+			cmdline_printf(cl, "CONNECTED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_DISCONNECTED:
+			cmdline_printf(cl, "DISCONNECTED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_DISABLED:
+			cmdline_printf(cl, "DISABLED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_PROCESSING:
+			cmdline_printf(cl, "PROCESSING\n");
+			break;
+		default:
+			cmdline_printf(cl, "UNKNOWN\n");
+			break;
+		}
+	}
+	cmdline_printf(cl, "Virtual CPU(s): %u\n", info.num_vcpus);
+	for (i = 0; i < info.num_vcpus; i++) {
+		cmdline_printf(cl, "  [%u]: Physical CPU Mask 0x%"PRIx64"\n", i,
+				info.pcpu_mask[i]);
+	}
+}
+
+
+
+cmdline_parse_token_string_t cmd_vm_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+				show_vm, "show_vm");
+cmdline_parse_token_string_t cmd_show_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_show_vm_set = {
+	.f = cmd_show_vm_parsed,
+	.data = NULL,
+	.help_str = "show_vm <vm_name>, prints the information on the "
+			"specified VM(s), the information lists the number of vCPUS, the "
+			"pinning to pCPU(s) as a bit mask, along with any communication "
+			"channels associated with each VM",
+	.tokens = {
+		(void *)&cmd_vm_show,
+		(void *)&cmd_show_vm_name,
+		NULL,
+	},
+};
+
+/* *** vCPU to pCPU mapping operations *** */
+struct cmd_set_pcpu_mask_result {
+    cmdline_fixed_string_t set_pcpu_mask;
+    cmdline_fixed_string_t vm_name;
+    uint8_t vcpu;
+    uint64_t core_mask;
+};
+
+static void
+cmd_set_pcpu_mask_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_pcpu_mask_result *res = parsed_result;
+	if (set_pcpus_mask(res->vm_name, res->vcpu, res->core_mask) == 0)
+		cmdline_printf(cl, "Pinned vCPU(%"PRIu8") to pCPU core "
+				"mask(0x%"PRIx64")\n", res->vcpu, res->core_mask);
+	else
+		cmdline_printf(cl, "Unable to pin vCPU(%"PRIu8") to pCPU core "
+				"mask(0x%"PRIx64")\n", res->vcpu, res->core_mask);
+}
+
+cmdline_parse_token_string_t cmd_set_pcpu_mask =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				set_pcpu_mask, "set_pcpu_mask");
+cmdline_parse_token_string_t cmd_set_pcpu_mask_vm_name =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				vm_name, NULL);
+cmdline_parse_token_num_t set_pcpu_mask_vcpu =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				vcpu, UINT8);
+cmdline_parse_token_num_t set_pcpu_mask_core_mask =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				core_mask, UINT64);
+
+
+cmdline_parse_inst_t cmd_set_pcpu_mask_set = {
+	.f = cmd_set_pcpu_mask_parsed,
+	.data = NULL,
+	.help_str = "set_pcpu_mask <vm_name> <vcpu> <core_mask>, Set the binding "
+			"of Virtual CPU on VM to the Physical CPU mask.",
+	.tokens = {
+		(void *)&cmd_set_pcpu_mask,
+		(void *)&cmd_set_pcpu_mask_vm_name,
+		(void *)&set_pcpu_mask_vcpu,
+		(void *)&set_pcpu_mask_core_mask,
+		NULL,
+	},
+};
+
+struct cmd_set_pcpu_result {
+    cmdline_fixed_string_t set_pcpu;
+    cmdline_fixed_string_t vm_name;
+    uint8_t vcpu;
+    uint8_t core;
+};
+
+static void
+cmd_set_pcpu_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_pcpu_result *res = parsed_result;
+	if (set_pcpu(res->vm_name, res->vcpu, res->core) == 0)
+		cmdline_printf(cl, "Pinned vCPU(%"PRIu8") to pCPU "
+				"core(%"PRIu8")\n", res->vcpu, res->core);
+	else
+		cmdline_printf(cl, "Unable to pin vCPU(%"PRIu8") to pCPU "
+				"core(%"PRIu8")\n", res->vcpu, res->core);
+}
+
+cmdline_parse_token_string_t cmd_set_pcpu =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_result,
+				set_pcpu, "set_pcpu");
+cmdline_parse_token_string_t cmd_set_pcpu_vm_name =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_result,
+				vm_name, NULL);
+cmdline_parse_token_num_t set_pcpu_vcpu =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_result,
+				vcpu, UINT8);
+cmdline_parse_token_num_t set_pcpu_core =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_result,
+				core, UINT8);
+
+
+cmdline_parse_inst_t cmd_set_pcpu_set = {
+	.f = cmd_set_pcpu_parsed,
+	.data = NULL,
+	.help_str = "set_pcpu <vm_name> <vcpu> <pcpu>, Set the binding "
+			"of Virtual CPU on VM to the Physical CPU.",
+	.tokens = {
+		(void *)&cmd_set_pcpu,
+		(void *)&cmd_set_pcpu_vm_name,
+		(void *)&set_pcpu_vcpu,
+		(void *)&set_pcpu_core,
+		NULL,
+	},
+};
+
+struct cmd_vm_op_result {
+	cmdline_fixed_string_t op_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_vm_op_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_vm_op_result *res = parsed_result;
+
+	if (!strcmp(res->op_vm, "add_vm")) {
+		if (add_vm(res->vm_name) < 0)
+			cmdline_printf(cl, "Unable to add VM '%s'\n", res->vm_name);
+	} else if (remove_vm(res->vm_name) < 0)
+		cmdline_printf(cl, "Unable to remove VM '%s'\n", res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_vm_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			op_vm, "add_vm#rm_vm");
+cmdline_parse_token_string_t cmd_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_vm_op_set = {
+	.f = cmd_vm_op_parsed,
+	.data = NULL,
+	.help_str = "add_vm|rm_vm <name>, add a VM for "
+			"subsequent operations with the CLI or remove a previously added "
+			"VM from the VM Power Manager",
+	.tokens = {
+		(void *)&cmd_vm_op,
+		(void *)&cmd_vm_name,
+		NULL,
+	},
+};
+
+/* *** VM channel operations *** */
+struct cmd_channels_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+};
+static void
+cmd_channels_op_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num, i;
+	int channels_added;
+	unsigned channel_list[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_op_result *res = parsed_result;
+
+	if (!strcmp(res->channel_list, "all")) {
+		channels_added = add_all_channels(res->vm_name);
+		cmdline_printf(cl, "Added %d channels for VM '%s'\n",
+				channels_added, res->vm_name);
+		return;
+	}
+
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || tail_ptr == token || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "Channel number '%u' exceeds the maximum number "
+					"of allowable channels(%u) for VM '%s'\n", channel_num,
+					CHANNEL_CMDS_MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	for (i = 0; i < num_channels; i++)
+		cmdline_printf(cl, "[%u]: Adding channel %u\n", i, channel_list[i]);
+
+	channels_added = add_channels(res->vm_name, channel_list,
+			num_channels);
+	cmdline_printf(cl, "Enabled %d channels for '%s'\n", channels_added,
+			res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+				op, "add_channels");
+cmdline_parse_token_string_t cmd_channels_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			channel_list, NULL);
+
+cmdline_parse_inst_t cmd_channels_op_set = {
+	.f = cmd_channels_op_parsed,
+	.data = NULL,
+	.help_str = "add_channels <vm_name> <list>|all, add "
+			"communication channels for the specified VM, the "
+			"virtio channels must be enabled in the VM "
+			"configuration(qemu/libvirt) and the associated VM must be active. "
+			"<list> is a comma-separated list of channel numbers to add, using "
+			"the keyword 'all' will attempt to add all channels for the VM",
+	.tokens = {
+		(void *)&cmd_channels_op,
+		(void *)&cmd_channels_vm_name,
+		(void *)&cmd_channels_list,
+		NULL,
+	},
+};
+
+struct cmd_channels_status_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+	cmdline_fixed_string_t status;
+};
+
+static void
+cmd_channels_status_op_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num;
+	int changed;
+	unsigned channel_list[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_status_op_result *res = parsed_result;
+	enum channel_status status;
+
+	if (!strcmp(res->status, "enabled"))
+		status = CHANNEL_MGR_CHANNEL_CONNECTED;
+	else
+		status = CHANNEL_MGR_CHANNEL_DISABLED;
+
+	if (!strcmp(res->channel_list, "all")) {
+		changed = set_channel_status_all(res->vm_name, status);
+		cmdline_printf(cl, "Updated status of %d channels "
+				"for VM '%s'\n", changed, res->vm_name);
+		return;
+	}
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || tail_ptr == token || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "%u exceeds the maximum number of allowable "
+					"channels(%u) for VM '%s'\n", channel_num,
+					CHANNEL_CMDS_MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	changed = set_channel_status(res->vm_name, channel_list, num_channels,
+			status);
+	cmdline_printf(cl, "Updated status of %d channels "
+					"for VM '%s'\n", changed, res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_status_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+				op, "set_channel_status");
+cmdline_parse_token_string_t cmd_channels_status_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_status_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			channel_list, NULL);
+cmdline_parse_token_string_t cmd_channels_status =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			status, "enabled#disabled");
+
+cmdline_parse_inst_t cmd_channels_status_op_set = {
+	.f = cmd_channels_status_op_parsed,
+	.data = NULL,
+	.help_str = "set_channel_status <vm_name> <list>|all enabled|disabled, "
+			" enable or disable the communication channels in "
+			"list(comma-separated) for the specified VM, alternatively list can"
+			" be replaced with keyword 'all'. Disabled channels will still "
+			"receive packets on the host, however the commands they specify "
+			"will be ignored. Set status to 'enabled' to begin processing "
+			"requests again.",
+	.tokens = {
+		(void *)&cmd_channels_status_op,
+		(void *)&cmd_channels_status_vm_name,
+		(void *)&cmd_channels_status_list,
+		(void *)&cmd_channels_status,
+		NULL,
+	},
+};
+
+/* *** CPU Frequency operations *** */
+struct cmd_show_cpu_freq_mask_result {
+	cmdline_fixed_string_t show_cpu_freq_mask;
+	uint64_t core_mask;
+};
+
+static void
+cmd_show_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_mask_result *res = parsed_result;
+	unsigned i;
+	uint64_t mask = res->core_mask;
+	uint32_t freq;
+	for (i = 0; mask; mask &= ~(1ULL << i++)) {
+		if ((mask >> i) & 1) {
+			freq = power_manager_get_current_frequency(i);
+			if (freq > 0)
+				cmdline_printf(cl, "Core %u: %"PRIu32"\n", i, freq);
+		}
+	}
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			show_cpu_freq_mask, "show_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_show_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			core_mask, UINT64);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_mask_set = {
+	.f = cmd_show_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "show_cpu_freq_mask <mask>, Get the current frequency for each "
+			"core specified in the mask",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq_mask,
+		(void *)&cmd_show_cpu_freq_mask_core_mask,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_mask_result {
+	cmdline_fixed_string_t set_cpu_freq_mask;
+	uint64_t core_mask;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	struct cmd_set_cpu_freq_mask_result *res = parsed_result;
+	int ret = -1;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_mask_up(res->core_mask);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_mask_down(res->core_mask);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_mask_min(res->core_mask);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_mask_max(res->core_mask);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core_mask(0x%"PRIx64") '%s' , not "
+				"all cores specified have been scaled\n",
+				res->core_mask, res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			set_cpu_freq_mask, "set_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_set_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			core_mask, UINT64);
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask_result =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_mask_set = {
+	.f = cmd_set_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq_mask <core_mask> <up|down|min|max>, Set the current "
+			"frequency for the cores specified in <core_mask> by scaling "
+			"each up/down/min/max.",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq_mask,
+		(void *)&cmd_set_cpu_freq_mask_core_mask,
+		(void *)&cmd_set_cpu_freq_mask_result,
+		NULL,
+	},
+};
+
+
+
+struct cmd_show_cpu_freq_result {
+	cmdline_fixed_string_t show_cpu_freq;
+	uint8_t core_num;
+};
+
+static void
+cmd_show_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_result *res = parsed_result;
+	uint32_t curr_freq = power_manager_get_current_frequency(res->core_num);
+	if (curr_freq == 0) {
+		cmdline_printf(cl, "Unable to get frequency for core %u\n",
+				res->core_num);
+		return;
+	}
+	cmdline_printf(cl, "Core %u frequency: %"PRIu32"\n", res->core_num,
+			curr_freq);
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_result,
+			show_cpu_freq, "show_cpu_freq");
+
+cmdline_parse_token_num_t cmd_show_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_result,
+			core_num, UINT8);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_set = {
+	.f = cmd_show_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "Get the current frequency for the specified core",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq,
+		(void *)&cmd_show_cpu_freq_core_num,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t core_num;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_core_up(res->core_num);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_core_down(res->core_num);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_core_min(res->core_num);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_core_max(res->core_num);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core(%u) '%s'\n", res->core_num,
+				res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			core_num, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_vm_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_status_op_set,
+		(cmdline_parse_inst_t *)&cmd_show_vm_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_pcpu_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_pcpu_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+	power_manager_init();
+	cl = cmdline_stdin_new(main_ctx, "vmpower> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/vm_power_cli.h b/examples/vm_power_manager/vm_power_cli.h
new file mode 100644
index 0000000..deccd51
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.h
@@ -0,0 +1,47 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v3 03/10] CPU Frequency Power Management(Host).
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 02/10] VM Power Management CLI(Host) Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 04/10] VM Power Management application and Makefile Alan Carew
                       ` (9 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

A wrapper around librte_power (using ACPI cpufreq) that adds locking around the
non-thread-safe library, allowing frequency changes by core mask or by core
number from both the CLI thread and the epoll monitor thread.
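
As an illustration of the locking scheme described above, here is a minimal,
self-contained sketch. It is not the code in this patch: it assumes a C11
atomic_flag in place of rte_spinlock_t and a stand-in sketch_freq_up() in place
of rte_power_freq_up(), and every sketch_* name is hypothetical. It walks a
64-bit core mask using 64-bit shifts and holds a per-core lock around each
scaling call.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 64	/* same limit as POWER_MGR_MAX_CPUS */

/* Hypothetical per-core state: one lock and a stand-in frequency level. */
struct sketch_core {
	atomic_flag lock;
	unsigned level;
};

static struct sketch_core sketch_cores[SKETCH_MAX_CPUS];

/* Put every lock in the cleared (unlocked) state. */
static void sketch_init(void)
{
	for (unsigned i = 0; i < SKETCH_MAX_CPUS; i++) {
		atomic_flag_clear(&sketch_cores[i].lock);
		sketch_cores[i].level = 0;
	}
}

/* Stand-in for rte_power_freq_up(): returns 1 when the level changed. */
static int sketch_freq_up(unsigned core)
{
	sketch_cores[core].level++;
	return 1;
}

/*
 * Call op() for every core set in both core_mask and enabled_mask.
 * Returns 0 if every call returned 1, otherwise -1.
 */
static int sketch_scale_mask(uint64_t core_mask, uint64_t enabled_mask,
		int (*op)(unsigned))
{
	int ret = 0;

	for (unsigned i = 0; i < SKETCH_MAX_CPUS && core_mask != 0; i++) {
		uint64_t bit = UINT64_C(1) << i;	/* 64-bit shift */

		if (!(core_mask & bit))
			continue;
		core_mask &= ~bit;
		if (!(enabled_mask & bit))
			continue;	/* cores that are not enabled are skipped */

		/* Spin until this core's lock is acquired. */
		while (atomic_flag_test_and_set_explicit(&sketch_cores[i].lock,
				memory_order_acquire))
			;
		if (op(i) != 1)
			ret = -1;
		atomic_flag_clear_explicit(&sketch_cores[i].lock,
				memory_order_release);
	}
	return ret;
}
```

Building the bit with UINT64_C(1) rather than a plain int 1 keeps the shift well
defined for core numbers of 31 and above.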

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/power_manager.c | 244 ++++++++++++++++++++++++++++++
 examples/vm_power_manager/power_manager.h | 188 +++++++++++++++++++++++
 2 files changed, 432 insertions(+)
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
new file mode 100644
index 0000000..b7b1fca
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.c
@@ -0,0 +1,244 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/types.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_power.h>
+#include <rte_spinlock.h>
+
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+
+#define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \
+	if (core_num >= POWER_MGR_MAX_CPUS) \
+		return -1; \
+	if (!(global_enabled_cpus & (1ULL << core_num))) \
+		return -1; \
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
+	ret = rte_power_freq_##DIRECTION(core_num); \
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl); \
+} while (0)
+
+#define POWER_SCALE_MASK(DIRECTION, core_mask, ret) do { \
+	int i; \
+	for (i = 0; core_mask; core_mask &= ~(1ULL << i++)) { \
+		if ((core_mask >> i) & 1) { \
+			if (!(global_enabled_cpus & (1ULL << i))) \
+				continue; \
+			rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
+			if (rte_power_freq_##DIRECTION(i) != 1) \
+				ret = -1; \
+			rte_spinlock_unlock(&global_core_freq_info[i].power_sl); \
+		} \
+	} \
+} while (0)
+
+struct freq_info {
+	rte_spinlock_t power_sl;
+	uint32_t freqs[RTE_MAX_LCORE_FREQS];
+	unsigned num_freqs;
+} __rte_cache_aligned;
+
+static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
+
+static uint64_t global_enabled_cpus;
+
+#define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
+
+static unsigned
+set_host_cpus_mask(void)
+{
+	char path[PATH_MAX];
+	unsigned i;
+	unsigned num_cpus = 0;
+	for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
+		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
+		if (access(path, F_OK) == 0) {
+			global_enabled_cpus |= 1ULL << i;
+			num_cpus++;
+		} else
+			return num_cpus;
+	}
+	return num_cpus;
+}
+
+int
+power_manager_init(void)
+{
+	unsigned i, num_cpus;
+	uint64_t cpu_mask;
+	int ret = 0;
+
+	num_cpus = set_host_cpus_mask();
+	if (num_cpus == 0) {
+		RTE_LOG(ERR, POWER_MANAGER, "Unable to detect host CPUs, please "
+				"ensure that sufficient privileges exist to inspect sysfs\n");
+		return -1;
+	}
+	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+	cpu_mask = global_enabled_cpus;
+	for (i = 0; cpu_mask; cpu_mask &= ~(1ULL << i++)) {
+		if (rte_power_init(i) < 0 || rte_power_freqs(i,
+				global_core_freq_info[i].freqs,
+				RTE_MAX_LCORE_FREQS) == 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to initialize power manager "
+					"for core %u\n", i);
+			global_enabled_cpus &= ~(1ULL << i);
+			num_cpus--;
+			ret = -1;
+		}
+		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+	}
+	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs , enabled core mask:"
+					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	return ret;
+
+}
+
+uint32_t
+power_manager_get_current_frequency(unsigned core_num)
+{
+	uint32_t freq, index;
+
+	if (core_num >= POWER_MGR_MAX_CPUS) {
+		RTE_LOG(ERR, POWER_MANAGER, "Core(%u) is out of range 0...%d\n",
+				core_num, POWER_MGR_MAX_CPUS-1);
+		return 0;
+	}
+	if (!(global_enabled_cpus & (1ULL << core_num)))
+		return 0;
+
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
+	index = rte_power_get_freq(core_num);
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl);
+	if (index >= RTE_MAX_LCORE_FREQS)
+		freq = 0;
+	else
+		freq = global_core_freq_info[core_num].freqs[index];
+
+	return freq;
+}
+
+int
+power_manager_exit(void)
+{
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1ULL << i++)) {
+		if (rte_power_exit(i) < 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+					"for core %u\n", i);
+			ret = -1;
+		}
+	}
+	global_enabled_cpus = 0;
+	return ret;
+}
+
+int
+power_manager_scale_mask_up(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(up, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_down(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(down, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_min(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(min, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_max(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(max, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_up(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(up, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_down(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(down, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_min(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(min, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_max(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(max, core_num, ret);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
new file mode 100644
index 0000000..1b45bab
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.h
@@ -0,0 +1,188 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef POWER_MANAGER_H_
+#define POWER_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Maximum number of CPUS to manage */
+#define POWER_MGR_MAX_CPUS 64
+/**
+ * Initialize power management.
+ * Initializes resources and verifies the number of CPUs on the system.
+ * Wraps librte_power int rte_power_init(unsigned lcore_id);
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_init(void);
+
+/**
+ * Exit power management. Must be called prior to exiting the application.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_exit(void);
+
+/**
+ * Scale up the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_up(uint64_t core_mask);
+
+/**
+ * Scale down the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_down(uint64_t core_mask);
+
+/**
+ * Scale to the minimum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_min(uint64_t core_mask);
+
+/**
+ * Scale to the maximum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_max(uint64_t core_mask);
+
+/**
+ * Scale up frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success, 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_up(unsigned core_num);
+
+/**
+ * Scale down frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_down(unsigned core_num);
+
+/**
+ * Scale to minimum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_min(unsigned core_num);
+
+/**
+ * Scale to maximum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_max(unsigned core_num);
+
+/**
+ * Get the current frequency of the core specified by core_num
+ *
+ * @param core_num
+ *  The core number to get the current frequency
+ *
+ * @return
+ *  - 0  on error
+ *  - >0 for current frequency.
+ */
+uint32_t power_manager_get_current_frequency(unsigned core_num);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* POWER_MANAGER_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v3 04/10] VM Power Management application and Makefile.
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (2 preceding siblings ...)
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 03/10] CPU Frequency Power Management(Host) Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 05/10] VM Power Management CLI(Guest) Alan Carew
                       ` (8 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

Launches the CLI thread and the Monitor thread and initialises resources.
Requires a minimum of two lcores to run; any additional cores specified by the
EAL core mask are not used.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/Makefile |  57 +++++++++++++++++++
 examples/vm_power_manager/main.c   | 113 +++++++++++++++++++++++++++++++++++++
 examples/vm_power_manager/main.h   |  52 +++++++++++++++++
 3 files changed, 222 insertions(+)
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
new file mode 100644
index 0000000..7d6f943
--- /dev/null
+++ b/examples/vm_power_manager/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = vm_power_mgr
+
+# all source are stored in SRCS-y
+SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
+SRCS-y += channel_monitor.c
+
+CFLAGS += -O3 -lvirt -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
new file mode 100644
index 0000000..e819e6f
--- /dev/null
+++ b/examples/vm_power_manager/main.c
@@ -0,0 +1,113 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+#include <rte_launch.h>
+#include <rte_log.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "vm_power_cli.h"
+#include "main.h"
+
+static int
+run_monitor(__attribute__((unused)) void *arg)
+{
+	if (channel_manager_init(CHANNEL_MGR_DEFAULT_HV_PATH) < 0) {
+		printf("Unable to initialize channel manager\n");
+		return -1;
+	}
+	if (channel_monitor_init() < 0) {
+		printf("Unable to initialize channel monitor\n");
+		return -1;
+	}
+	run_channel_monitor();
+	return 0;
+}
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	lcore_id = rte_get_next_lcore(-1, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
+				"application\n");
+		return 0;
+	}
+	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
+
+	run_cli(NULL);
+
+	rte_eal_mp_wait_lcore();
+	return 0;
+}
diff --git a/examples/vm_power_manager/main.h b/examples/vm_power_manager/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v3 05/10] VM Power Management CLI(Guest).
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (3 preceding siblings ...)
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 04/10] VM Power Management application and Makefile Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 06/10] VM communication channels for VM Power Management(Guest) Alan Carew
                       ` (7 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

Provides a small sample application (guest_vm_power_mgr) to run on a VM.
The application is run by providing a core mask (-c) and the number of memory
channels (-n). The core mask determines which lcore channels the application
attempts to open, one channel per lcore. A maximum of 64 channels per VM is
allowed. The channels must be monitored by the host.
After successful initialisation, a CPU frequency command can be sent to the
host using:
set_cpu_freq <lcore_num> <up|down|min|max>
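
As a rough illustration of the command format above (a sketch only, not code
from this patch; the enum, the MAX_CHANNELS macro and both function names are
hypothetical), the <up|down|min|max> token could be mapped to an action and
the lcore number checked against the 64-channel limit like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; none of these exist in the patch. */
#define MAX_CHANNELS 64	/* per-VM channel limit quoted in the description */

enum freq_action {
	FREQ_ACTION_INVALID = -1,
	FREQ_ACTION_UP,
	FREQ_ACTION_DOWN,
	FREQ_ACTION_MIN,
	FREQ_ACTION_MAX
};

/* Map the <up|down|min|max> token to an action, or FREQ_ACTION_INVALID. */
enum freq_action
parse_freq_action(const char *token)
{
	/* Order must match the enum values above. */
	static const char *const names[] = { "up", "down", "min", "max" };
	size_t i;

	assert(token != NULL);
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (strcmp(token, names[i]) == 0)
			return (enum freq_action)i;
	}
	return FREQ_ACTION_INVALID;
}

/* Return 1 if lcore_num can have a channel, 0 otherwise. */
int
lcore_in_range(unsigned lcore_num)
{
	return lcore_num < MAX_CHANNELS;
}
```

The sample application itself relies on the cmdline library's token parser
for this step; the sketch only shows the accepted values in isolation.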

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/guest_cli/Makefile       |  56 ++++++++
 examples/vm_power_manager/guest_cli/main.c         |  87 ++++++++++++
 examples/vm_power_manager/guest_cli/main.h         |  52 +++++++
 .../guest_cli/vm_power_cli_guest.c                 | 155 +++++++++++++++++++++
 .../guest_cli/vm_power_cli_guest.h                 |  55 ++++++++
 5 files changed, 405 insertions(+)
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
new file mode 100644
index 0000000..167a7ed
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = guest_vm_power_mgr
+
+# all source are stored in SRCS-y
+SRCS-y := main.c vm_power_cli_guest.c
+
+CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
new file mode 100644
index 0000000..1e4767a
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -0,0 +1,87 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+*/
+#include <signal.h>
+
+#include <rte_lcore.h>
+#include <rte_power.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "vm_power_cli_guest.h"
+#include "main.h"
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	rte_power_set_env(PM_ENV_KVM_VM);
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_init(lcore_id);
+	}
+	run_cli(NULL);
+
+	return 0;
+}
diff --git a/examples/vm_power_manager/guest_cli/main.h b/examples/vm_power_manager/guest_cli/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
new file mode 100644
index 0000000..7c4af4a
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -0,0 +1,155 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <stdint.h>
+#include <string.h>
+#include <stdio.h>
+#include <termios.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_lcore.h>
+
+#include <rte_power.h>
+
+#include "vm_power_cli_guest.h"
+
+
+#define CHANNEL_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+				__attribute__((unused)) struct cmdline *cl,
+			    __attribute__((unused)) void *data)
+{
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t lcore_id;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = rte_power_freq_up(res->lcore_id);
+	else if (!strcmp(res->cmd , "down"))
+		ret = rte_power_freq_down(res->lcore_id);
+	else if (!strcmp(res->cmd , "min"))
+		ret = rte_power_freq_min(res->lcore_id);
+	else if (!strcmp(res->cmd , "max"))
+		ret = rte_power_freq_max(res->lcore_id);
+	if (ret != 1)
+		cmdline_printf(cl, "Error sending message: %s\n", strerror(ret));
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_string_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			lcore_id, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+	cl = cmdline_stdin_new(main_ctx, "vmpower(guest)> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
new file mode 100644
index 0000000..0c4bdd5
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -0,0 +1,55 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "channel_commands.h"
+
+int guest_channel_host_connect(unsigned lcore_id);
+
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v3 06/10] VM communication channels for VM Power Management(Guest).
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (4 preceding siblings ...)
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 05/10] VM Power Management CLI(Guest) Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 07/10] librte_power common interface for Guest and Host Alan Carew
                       ` (6 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

Allows for the opening of Virtio-Serial devices on a VM, where a DPDK
application can send packets to the host based monitor. The packet format is
specified in channel_commands.h.
Each device appears as a serial device at the path
/dev/virtio-ports/virtio.serial.port.<agent_type>.<lcore_num>, where each lcore
in a DPDK application has exclusive access to a device/channel.
Each channel is opened in non-blocking mode. After a successful open, a test
packet is sent to the host to ensure the host side is monitoring.
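
The naming and open sequence described above can be sketched as follows. This
is a simplified, standalone illustration using only POSIX calls; the helper
names channel_path() and open_nonblocking() are hypothetical, and the actual
implementation in this patch is guest_channel_host_connect().

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Build "<base>.<lcore_id>" into buf.
 * Returns 0 on success, -1 if the result did not fit in len bytes.
 */
int
channel_path(char *buf, size_t len, const char *base, unsigned lcore_id)
{
	int n;

	assert(buf != NULL && base != NULL);
	n = snprintf(buf, len, "%s.%u", base, lcore_id);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Open path for reading and writing, then switch the descriptor to
 * non-blocking mode. Returns the descriptor, or -1 on any failure.
 */
int
open_nonblocking(const char *path)
{
	int fd, flags;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would build the path for its own lcore, open it, and only then send
the test packet. The sketch leaves out the test packet and the per-lcore
descriptor table.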

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power/guest_channel.c | 162 +++++++++++++++++++++++++++++++++++++++
 lib/librte_power/guest_channel.h |  89 +++++++++++++++++++++
 2 files changed, 251 insertions(+)
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h

diff --git a/lib/librte_power/guest_channel.c b/lib/librte_power/guest_channel.c
new file mode 100644
index 0000000..2295665
--- /dev/null
+++ b/lib/librte_power/guest_channel.c
@@ -0,0 +1,162 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+static int global_fds[RTE_MAX_LCORE];
+
+int
+guest_channel_host_connect(const char *path, unsigned lcore_id)
+{
+	int flags, ret;
+	struct channel_packet pkt;
+	char fd_path[PATH_MAX];
+	int fd = -1;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	/* check if path is already open */
+	if (global_fds[lcore_id] != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is already open with fd %d\n",
+				lcore_id, global_fds[lcore_id]);
+		return -1;
+	}
+
+	snprintf(fd_path, PATH_MAX, "%s.%u", path, lcore_id);
+	RTE_LOG(INFO, GUEST_CHANNEL, "Opening channel '%s' for lcore %u\n",
+			fd_path, lcore_id);
+	fd = open(fd_path, O_RDWR);
+	if (fd < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Unable to connect to '%s' with error "
+				"%s\n", fd_path, strerror(errno));
+		return -1;
+	}
+
+	flags = fcntl(fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on fcntl get flags for file %s\n",
+				fd_path);
+		goto error;
+	}
+
+	flags |= O_NONBLOCK;
+	if (fcntl(fd, F_SETFL, flags) < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on setting non-blocking mode for "
+				"file %s\n", fd_path);
+		goto error;
+	}
+	/* QEMU needs a delay after connection */
+	sleep(1);
+
+	/* Send a test packet, this command is ignored by the host, but a successful
+	 * send indicates that the host endpoint is monitoring.
+	 */
+	pkt.command = CPU_POWER_CONNECT;
+	global_fds[lcore_id] = fd;
+	ret = guest_channel_send_msg(&pkt, lcore_id);
+	if (ret != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Error on channel '%s' communications "
+				"test: %s\n", fd_path, strerror(ret));
+		goto error;
+	}
+	RTE_LOG(INFO, GUEST_CHANNEL, "Channel '%s' is now connected\n", fd_path);
+	return 0;
+error:
+	close(fd);
+	global_fds[lcore_id] = 0;
+	return -1;
+}
+
+int
+guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id)
+{
+	int ret, buffer_len = sizeof(*pkt);
+	void *buffer = pkt;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+
+	if (global_fds[lcore_id] == 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel is not connected\n");
+		return -1;
+	}
+	while (buffer_len > 0) {
+		ret = write(global_fds[lcore_id], buffer, buffer_len);
+		if (ret == buffer_len)
+			return 0;
+		if (ret == -1) {
+			if (errno == EINTR)
+				continue;
+			return errno;
+		}
+		buffer = (char *)buffer + ret;
+		buffer_len -= ret;
+	}
+	return 0;
+}
+
+void
+guest_channel_host_disconnect(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return;
+	}
+	if (global_fds[lcore_id] == 0)
+		return;
+	close(global_fds[lcore_id]);
+	global_fds[lcore_id] = 0;
+}
diff --git a/lib/librte_power/guest_channel.h b/lib/librte_power/guest_channel.h
new file mode 100644
index 0000000..9e18af5
--- /dev/null
+++ b/lib/librte_power/guest_channel.h
@@ -0,0 +1,89 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#ifndef _GUEST_CHANNEL_H
+#define _GUEST_CHANNEL_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <channel_commands.h>
+
+/**
+ * Connect to the Virtio-Serial VM end-point located in path. It is
+ * thread safe for unique lcore_ids. This function must be called only once
+ * from each lcore.
+ *
+ * @param path
+ *  The path to the serial device on the filesystem
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int guest_channel_host_connect(const char *path, unsigned lcore_id);
+
+/**
+ * Disconnect from an already connected Virtio-Serial Endpoint.
+ *
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ */
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+/**
+ * Send a message contained in pkt over the Virtio-Serial to the host endpoint.
+ *
+ * @param pkt
+ *  Pointer to a populated struct channel_packet
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on channel not connected.
+ *  - errno on write to channel error.
+ */
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.9.3


* [dpdk-dev] [PATCH v3 07/10] librte_power common interface for Guest and Host
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (5 preceding siblings ...)
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 06/10] VM communication channels for VM Power Management(Guest) Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 08/10] Packet format for VM Power Management(Host and Guest) Alan Carew
                       ` (5 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
renaming of functions only.
Added rte_power_kvm_vm implementation to support Power Management from a VM.

librte_power now hides the implementation based on the environment used.
A new call, rte_power_set_env(), can explicitly set the environment. If it is
not called, auto-detection takes place.

rte_power_kvm_vm is a subset of the librte_power APIs. The following are supported:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

The remaining, unsupported APIs return -ENOTSUP.
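
The dispatch idea can be sketched in isolation as below. This is a simplified
illustration and not the code in this patch: all names are hypothetical
stand-ins, the back ends are stubs, and the return conventions of select_env()
are an assumption. Unlike the patch, which exposes the function pointers
directly, the sketch hides them behind small wrappers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the librte_power types. */
enum pm_env { ENV_NOT_SET, ENV_ACPI_CPUFREQ, ENV_KVM_VM };

typedef int (*lcore_op_fn)(unsigned lcore_id);

/* Stub back ends: a real one would write to sysfs or to a channel. */
int acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
int acpi_get_freq(unsigned lcore_id) { (void)lcore_id; return 0; }
int kvm_vm_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
/* Not available from a VM, so report "not supported". */
int kvm_vm_get_freq(unsigned lcore_id) { (void)lcore_id; return -ENOTSUP; }

static enum pm_env current_env = ENV_NOT_SET;
static lcore_op_fn freq_up_op;	/* NULL until an environment is selected */
static lcore_op_fn get_freq_op;

/* Select the back end once; returns 0 on success, -1 otherwise. */
int
select_env(enum pm_env env)
{
	if (current_env != ENV_NOT_SET)
		return -1;	/* already configured */
	if (env == ENV_ACPI_CPUFREQ) {
		freq_up_op = acpi_freq_up;
		get_freq_op = acpi_get_freq;
	} else if (env == ENV_KVM_VM) {
		freq_up_op = kvm_vm_freq_up;
		get_freq_op = kvm_vm_get_freq;
	} else {
		return -1;	/* unknown environment */
	}
	assert(freq_up_op != NULL && get_freq_op != NULL);
	current_env = env;
	return 0;
}

/* Callers use these wrappers and never see which back end is active. */
int
power_freq_up(unsigned lcore_id)
{
	return freq_up_op != NULL ? freq_up_op(lcore_id) : -1;
}

int
power_get_freq(unsigned lcore_id)
{
	return get_freq_op != NULL ? get_freq_op(lcore_id) : -1;
}
```

With the VM back end selected, power_freq_up() succeeds while power_get_freq()
reports -ENOTSUP, which mirrors the supported/unsupported split listed above.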

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power/rte_power.c              | 540 ++++-------------------------
 lib/librte_power/rte_power.h              | 120 +++++--
 lib/librte_power/rte_power_acpi_cpufreq.c | 545 ++++++++++++++++++++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h | 192 +++++++++++
 lib/librte_power/rte_power_common.h       |  39 +++
 lib/librte_power/rte_power_kvm_vm.c       | 135 ++++++++
 lib/librte_power/rte_power_kvm_vm.h       | 179 ++++++++++
 7 files changed, 1248 insertions(+), 502 deletions(-)
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 856da9a..998ed1c 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -31,515 +31,113 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
-#include <stdio.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-#include <signal.h>
-#include <limits.h>
-
-#include <rte_memcpy.h>
 #include <rte_atomic.h>
 
 #include "rte_power.h"
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
 
-#ifdef RTE_LIBRTE_POWER_DEBUG
-#define POWER_DEBUG_TRACE(fmt, args...) do { \
-		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
-	} while (0)
-#else
-#define POWER_DEBUG_TRACE(fmt, args...)
-#endif
-
-#define FOPEN_OR_ERR_RET(f, retval) do { \
-	if ((f) == NULL) { \
-		RTE_LOG(ERR, POWER, "File not openned\n"); \
-		return (retval); \
-	} \
-} while(0)
-
-#define FOPS_OR_NULL_GOTO(ret, label) do { \
-	if ((ret) == NULL) { \
-		RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define FOPS_OR_ERR_GOTO(ret, label) do { \
-	if ((ret) < 0) { \
-		RTE_LOG(ERR, POWER, "File operations failed\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define STR_SIZE     1024
-#define POWER_CONVERT_TO_DECIMAL 10
+enum power_management_env global_default_env = PM_ENV_NOT_SET;
 
-#define POWER_GOVERNOR_USERSPACE "userspace"
-#define POWER_SYSFILE_GOVERNOR   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
-#define POWER_SYSFILE_AVAIL_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
-#define POWER_SYSFILE_SETSPEED   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+volatile uint32_t global_env_cfg_status = 0;
 
-enum power_state {
-	POWER_IDLE = 0,
-	POWER_ONGOING,
-	POWER_USED,
-	POWER_UNKNOWN
-};
+/* function pointers */
+rte_power_freqs_t rte_power_freqs  = NULL;
+rte_power_get_freq_t rte_power_get_freq = NULL;
+rte_power_set_freq_t rte_power_set_freq = NULL;
+rte_power_freq_change_t rte_power_freq_up = NULL;
+rte_power_freq_change_t rte_power_freq_down = NULL;
+rte_power_freq_change_t rte_power_freq_max = NULL;
+rte_power_freq_change_t rte_power_freq_min = NULL;
 
-/**
- * Power info per lcore.
- */
-struct rte_power_info {
-	unsigned lcore_id;                   /**< Logical core id */
-	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
-	uint32_t nb_freqs;                   /**< number of available freqs */
-	FILE *f;                             /**< FD of scaling_setspeed */
-	char governor_ori[32];               /**< Original governor name */
-	uint32_t curr_idx;                   /**< Freq index in freqs array */
-	volatile uint32_t state;             /**< Power in use state */
-} __rte_cache_aligned;
-
-static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
-
-/**
- * It is to set specific freq for specific logical core, according to the index
- * of supported frequencies.
- */
-static int
-set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+int
+rte_power_set_env(enum power_management_env env)
 {
-	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
-			"should be less than %u\n", idx, pi->nb_freqs);
-		return -1;
-	}
-
-	/* Check if it is the same as current */
-	if (idx == pi->curr_idx)
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 0, 1) == 0) {
 		return 0;
-
-	POWER_DEBUG_TRACE("Freqency[%u] %u to be set for lcore %u\n",
-				idx, pi->freqs[idx], pi->lcore_id);
-	if (fseek(pi->f, 0, SEEK_SET) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to set file position indicator to 0 "
-			"for setting frequency for lcore %u\n", pi->lcore_id);
-		return -1;
 	}
-	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to write new frequency for "
-					"lcore %u\n", pi->lcore_id);
+	if (env == PM_ENV_ACPI_CPUFREQ) {
+		rte_power_freqs = rte_power_acpi_cpufreq_freqs;
+		rte_power_get_freq = rte_power_acpi_cpufreq_get_freq;
+		rte_power_set_freq = rte_power_acpi_cpufreq_set_freq;
+		rte_power_freq_up = rte_power_acpi_cpufreq_freq_up;
+		rte_power_freq_down = rte_power_acpi_cpufreq_freq_down;
+		rte_power_freq_min = rte_power_acpi_cpufreq_freq_min;
+		rte_power_freq_max = rte_power_acpi_cpufreq_freq_max;
+	} else if (env == PM_ENV_KVM_VM) {
+		rte_power_freqs = rte_power_kvm_vm_freqs;
+		rte_power_get_freq = rte_power_kvm_vm_get_freq;
+		rte_power_set_freq = rte_power_kvm_vm_set_freq;
+		rte_power_freq_up = rte_power_kvm_vm_freq_up;
+		rte_power_freq_down = rte_power_kvm_vm_freq_down;
+		rte_power_freq_min = rte_power_kvm_vm_freq_min;
+		rte_power_freq_max = rte_power_kvm_vm_freq_max;
+	} else {
+		RTE_LOG(ERR, POWER, "Invalid Power Management Environment(%d) set\n",
+				env);
+		rte_power_unset_env();
 		return -1;
 	}
-	fflush(pi->f);
-	pi->curr_idx = idx;
-
-	return 1;
-}
-
-/**
- * It is to check the current scaling governor by reading sys file, and then
- * set it into 'userspace' if it is not by writing the sys file. The original
- * governor will be saved for rolling back.
- */
-static int
-power_set_governor_userspace(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if current governor is userspace */
-	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
-		sizeof(POWER_GOVERNOR_USERSPACE)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
-					"already userspace\n", pi->lcore_id);
-		goto out;
-	}
-	/* Save the original governor */
-	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
-
-	/* Write 'userspace' to the governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(POWER_GOVERNOR_USERSPACE, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
-			"set to user space successfully\n", pi->lcore_id);
-out:
-	fclose(f);
+	global_default_env = env;
+	return 0;
 
-	return ret;
 }
 
-/**
- * It is to get the available frequencies of the specific lcore by reading the
- * sys file.
- */
-static int
-power_get_available_freqs(struct rte_power_info *pi)
+void
+rte_power_unset_env(void)
 {
-	FILE *f;
-	int ret = -1, i, count;
-	char *p;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *freqs[RTE_MAX_LCORE_FREQS];
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
-								pi->lcore_id);
-	f = fopen(fullpath, "r");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Strip the line break if there is */
-	p = strchr(buf, '\n');
-	if (p != NULL)
-		*p = 0;
-
-	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
-	count = rte_strsplit(buf, sizeof(buf), freqs,
-				RTE_MAX_LCORE_FREQS, ' ');
-	if (count <= 0) {
-		RTE_LOG(ERR, POWER, "No available frequency in "
-			""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
-		goto out;
-	}
-	if (count >= RTE_MAX_LCORE_FREQS) {
-		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
-								count);
-		goto out;
-	}
-
-	/* Store the available frequncies into power context */
-	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
-		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
-								i, freqs[i]);
-		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
-					POWER_CONVERT_TO_DECIMAL);
-	}
-
-	ret = 0;
-	POWER_DEBUG_TRACE("%d frequencie(s) of lcore %u are available\n",
-						count, pi->lcore_id);
-out:
-	fclose(f);
-
-	return ret;
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 1, 0) != 0)
+		global_default_env = PM_ENV_NOT_SET;
 }
 
-/**
- * It is to fopen the sys file for the future setting the lcore frequency.
- */
-static int
-power_init_for_setting_freq(struct rte_power_info *pi)
-{
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t i, freq;
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, -1);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
-	for (i = 0; i < pi->nb_freqs; i++) {
-		if (freq == pi->freqs[i]) {
-			pi->curr_idx = i;
-			pi->f = f;
-			return 0;
-		}
-	}
-
-out:
-	fclose(f);
-
-	return -1;
+enum power_management_env
+rte_power_get_env(void) {
+	return global_default_env;
 }
 
 int
 rte_power_init(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"in use\n", lcore_id);
-		return -1;
-	}
-
-	pi->lcore_id = lcore_id;
-	/* Check and set the governor */
-	if (power_set_governor_userspace(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
-						"userspace\n", lcore_id);
-		goto fail;
-	}
+	int ret = -1;
 
-	/* Get the available frequencies */
-	if (power_get_available_freqs(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ) {
+		return rte_power_acpi_cpufreq_init(lcore_id);
 	}
-
-	/* Init for setting lcore frequency */
-	if (power_init_for_setting_freq(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_KVM_VM) {
+		return rte_power_kvm_vm_init(lcore_id);
 	}
-
-	/* Set freq to max by default */
-	if (rte_power_freq_max(lcore_id) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
-						"to max\n", lcore_id);
-		goto fail;
+	/* Auto detect Environment */
+	RTE_LOG(INFO, POWER, "Attempting to initialise ACPI cpufreq power "
+			"management...\n");
+	ret = rte_power_acpi_cpufreq_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+		goto out;
 	}
 
-	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
-					"power manamgement\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
-
-	return -1;
-}
-
-/**
- * It is to check the governor and then set the original governor back if
- * needed by writing the the sys file.
- */
-static int
-power_set_governor_original(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if the governor to be set is the same as current */
-	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u "
-					"has already been set to %s\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(INFO, POWER, "Attempting to initialise VM power management...\n");
+	ret = rte_power_kvm_vm_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_KVM_VM);
 		goto out;
 	}
-
-	/* Write back the original governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(pi->governor_ori, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power manamgement governor of lcore %u "
-				"has been set back to %s successfully\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(ERR, POWER, "Unable to set Power Management Environment for lcore "
+			"%u\n", lcore_id);
 out:
-	fclose(f);
-
 	return ret;
 }
 
 int
 rte_power_exit(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"not used\n", lcore_id);
-		return -1;
-	}
-
-	/* Close FD of setting freq */
-	fclose(pi->f);
-	pi->f = NULL;
-
-	/* Set the governor back to the original */
-	if (power_set_governor_original(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
-					"to the original\n", lcore_id);
-		goto fail;
-	}
-
-	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
-				"'userspace' mode and been set back to the "
-						"original\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ)
+		return rte_power_acpi_cpufreq_exit(lcore_id);
+	if (global_default_env == PM_ENV_KVM_VM)
+		return rte_power_kvm_vm_exit(lcore_id);
 
+	RTE_LOG(ERR, POWER, "Environment has not been set, unable to exit "
+				"gracefully\n");
 	return -1;
-}
-
-uint32_t
-rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
-		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
-		return 0;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (num < pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
-		return 0;
-	}
-	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
-
-	return pi->nb_freqs;
-}
-
-uint32_t
-rte_power_get_freq(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return RTE_POWER_INVALID_FREQ_INDEX;
-	}
-
-	return lcore_power_info[lcore_id].curr_idx;
-}
-
-int
-rte_power_set_freq(unsigned lcore_id, uint32_t index)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
-}
-
-int
-rte_power_freq_down(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
 
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx + 1 == pi->nb_freqs)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx + 1);
 }
-
-int
-rte_power_freq_up(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx == 0)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx - 1);
-}
-
-int
-rte_power_freq_max(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(&lcore_power_info[lcore_id], 0);
-}
-
-int
-rte_power_freq_min(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->nb_freqs - 1);
-}
-
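
Note on the rte_power.c change above: the file now only selects an
environment and binds the public function pointers to one backend. The
sketch below illustrates that pattern in isolation. It is not the library
code: the enum, the two backends and set_env() are hypothetical stand-ins,
and a GCC/Clang atomic builtin is assumed in place of rte_atomic32_cmpset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in environment enum mirroring power_management_env in the patch. */
enum pm_env { ENV_NOT_SET, ENV_ACPI, ENV_VM };

/* Same shape as rte_power_freq_change_t in the patch. */
typedef int (*freq_change_t)(unsigned lcore_id);

/* Hypothetical backends: each just returns a distinct value. */
static int acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
static int vm_freq_up(unsigned lcore_id) { (void)lcore_id; return 2; }

static enum pm_env current_env = ENV_NOT_SET;
static uint32_t env_configured; /* 0 = free, 1 = claimed */
static freq_change_t freq_up;   /* NULL until an environment is selected */

/* Claim the configuration slot once, then bind the pointer for env. */
static int
set_env(enum pm_env env)
{
	uint32_t expected = 0;

	/* Assumed GCC/Clang builtin standing in for rte_atomic32_cmpset(). */
	if (!__atomic_compare_exchange_n(&env_configured, &expected, 1, 0,
			__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
		return 0; /* already configured: keep the first choice */

	if (env == ENV_ACPI) {
		freq_up = acpi_freq_up;
	} else if (env == ENV_VM) {
		freq_up = vm_freq_up;
	} else {
		env_configured = 0; /* release the slot on invalid input */
		return -1;
	}
	current_env = env;
	return 0;
}
```

As in the patch, a second call with a different environment returns 0 but
leaves the first binding in place, so callers that care should compare the
configured environment afterwards.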
diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
index 9c1419e..9338069 100644
--- a/lib/librte_power/rte_power.h
+++ b/lib/librte_power/rte_power.h
@@ -48,12 +48,48 @@
 extern "C" {
 #endif
 
-#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+/* Power Management Environment State */
+enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM};
 
 /**
- * Initialize power management for a specific lcore. It will check and set the
- * governor to userspace for the lcore, get the available frequencies, and
- * prepare to set new lcore frequency.
+ * Set the default power management implementation. If this is not called prior
+ * to rte_power_init(), then auto-detect of the environment will take place.
+ * It is not thread safe.
+ *
+ * @param env
+ *  env. The environment in which to initialise Power Management.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_set_env(enum power_management_env env);
+
+/**
+ * Unset the global environment configuration.
+ * This can only be called after all threads have completed.
+ *
+ * @param None.
+ *
+ * @return
+ *  None.
+ */
+void rte_power_unset_env(void);
+
+/**
+ * Get the default power management implementation.
+ *
+ * @param None.
+ *
+ * @return
+ *  power_management_env The configured environment.
+ */
+enum power_management_env rte_power_get_env(void);
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
  *
  * @param lcore_id
  *  lcore id.
@@ -65,8 +101,9 @@ extern "C" {
 int rte_power_init(unsigned lcore_id);
 
 /**
- * Exit power management on a specific lcore. It will set the governor to which
- * is before initialized.
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
  *
  * @param lcore_id
  *  lcore id.
@@ -78,11 +115,9 @@ int rte_power_init(unsigned lcore_id);
 int rte_power_exit(unsigned lcore_id);
 
 /**
- * Get the available frequencies of a specific lcore. The return value will be
- * the minimal one of the total number of available frequencies and the number
- * of buffer. The index of available frequencies used in other interfaces
- * should be in the range of 0 to this return value.
- * It should be protected outside of this function for threadsafe.
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -94,12 +129,15 @@ int rte_power_exit(unsigned lcore_id);
  * @return
  *  The number of available frequencies.
  */
-uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
+typedef uint32_t (*rte_power_freqs_t)(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+extern rte_power_freqs_t rte_power_freqs;
 
 /**
- * Return the current index of available frequencies of a specific lcore. It
- * will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)' if error.
- * It should be protected outside of this function for threadsafe.
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -107,12 +145,15 @@ uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
  * @return
  *  The current index of available frequencies.
  */
-uint32_t rte_power_get_freq(unsigned lcore_id);
+typedef uint32_t (*rte_power_get_freq_t)(unsigned lcore_id);
+
+extern rte_power_get_freq_t rte_power_get_freq;
 
 /**
  * Set the new frequency for a specific lcore by indicating the index of
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -121,70 +162,87 @@ uint32_t rte_power_get_freq(unsigned lcore_id);
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_set_freq(unsigned lcore_id, uint32_t index);
+typedef int (*rte_power_set_freq_t)(unsigned lcore_id, uint32_t index);
+
+extern rte_power_set_freq_t rte_power_set_freq;
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned lcore_id);
 
 /**
  * Scale up the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_up(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_up;
 
 /**
  * Scale down the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_down(unsigned lcore_id);
+
+extern rte_power_freq_change_t rte_power_freq_down;
 
 /**
  * Scale up the frequency of a specific lcore to the highest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_max(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_max;
 
 /**
  * Scale down the frequency of a specific lcore to the lowest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_min(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_min;
 
 #ifdef __cplusplus
 }
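
Note on the rte_power.h change above: the frequency calls are now exported
function pointers, and rte_power.c initialises them to NULL until an
environment is selected. A caller that may run before that point can guard
the call. The sketch below shows one possible guard; freq_up_ptr,
fake_freq_up() and safe_freq_up() are hypothetical names used only for
illustration and are not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as rte_power_freq_change_t in the header above. */
typedef int (*freq_change_t)(unsigned lcore_id);

/* Stand-in for one of the exported pointers; NULL until bound. */
static freq_change_t freq_up_ptr = NULL;

/* Hypothetical backend that reports "frequency changed". */
static int fake_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }

/* Hypothetical helper: call through the pointer only if it is bound. */
static int
safe_freq_up(unsigned lcore_id)
{
	if (freq_up_ptr == NULL)
		return -1; /* environment not selected yet */
	return freq_up_ptr(lcore_id);
}
```

Returning a negative value for the unbound case matches the "Negative on
error" convention documented for the frequency change functions.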
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.c b/lib/librte_power/rte_power_acpi_cpufreq.c
new file mode 100644
index 0000000..09085c3
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.c
@@ -0,0 +1,545 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+
+#include <rte_memcpy.h>
+#include <rte_atomic.h>
+
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_common.h"
+
+#ifdef RTE_LIBRTE_POWER_DEBUG
+#define POWER_DEBUG_TRACE(fmt, args...) do { \
+		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
+} while (0)
+#else
+#define POWER_DEBUG_TRACE(fmt, args...)
+#endif
+
+#define FOPEN_OR_ERR_RET(f, retval) do { \
+		if ((f) == NULL) { \
+			RTE_LOG(ERR, POWER, "File not opened\n"); \
+			return retval; \
+		} \
+} while (0)
+
+#define FOPS_OR_NULL_GOTO(ret, label) do { \
+		if ((ret) == NULL) { \
+			RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define FOPS_OR_ERR_GOTO(ret, label) do { \
+		if ((ret) < 0) { \
+			RTE_LOG(ERR, POWER, "File operations failed\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define STR_SIZE     1024
+#define POWER_CONVERT_TO_DECIMAL 10
+
+#define POWER_GOVERNOR_USERSPACE "userspace"
+#define POWER_SYSFILE_GOVERNOR   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
+#define POWER_SYSFILE_AVAIL_FREQ \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
+#define POWER_SYSFILE_SETSPEED   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+
+enum power_state {
+	POWER_IDLE = 0,
+	POWER_ONGOING,
+	POWER_USED,
+	POWER_UNKNOWN
+};
+
+/**
+ * Power info per lcore.
+ */
+struct rte_power_info {
+	unsigned lcore_id;                   /**< Logical core id */
+	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
+	uint32_t nb_freqs;                   /**< number of available freqs */
+	FILE *f;                             /**< FD of scaling_setspeed */
+	char governor_ori[32];               /**< Original governor name */
+	uint32_t curr_idx;                   /**< Freq index in freqs array */
+	volatile uint32_t state;             /**< Power in use state */
+} __rte_cache_aligned;
+
+static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
+
+/**
+ * Set the frequency of a specific logical core to the entry at the given
+ * index of its supported frequencies.
+ */
+static int
+set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+{
+	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
+				"should be less than %u\n", idx, pi->nb_freqs);
+		return -1;
+	}
+
+	/* Check if it is the same as current */
+	if (idx == pi->curr_idx)
+		return 0;
+
+	POWER_DEBUG_TRACE("Frequency[%u] %u to be set for lcore %u\n",
+			idx, pi->freqs[idx], pi->lcore_id);
+	if (fseek(pi->f, 0, SEEK_SET) < 0) {
+		RTE_LOG(ERR, POWER, "Fail to set file position indicator to 0 "
+				"for setting frequency for lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
+		RTE_LOG(ERR, POWER, "Fail to write new frequency for "
+				"lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	fflush(pi->f);
+	pi->curr_idx = idx;
+
+	return 1;
+}
+
+/**
+ * Check the current scaling governor by reading the sysfs file and, if it is
+ * not already 'userspace', switch to it by writing the sysfs file. The
+ * original governor is saved so that it can be restored later.
+ */
+static int
+power_set_governor_userspace(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Check if current governor is userspace */
+	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
+			sizeof(POWER_GOVERNOR_USERSPACE)) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
+				"already userspace\n", pi->lcore_id);
+		goto out;
+	}
+	/* Save the original governor */
+	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
+
+	/* Write 'userspace' to the governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(POWER_GOVERNOR_USERSPACE, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
+			"set to user space successfully\n", pi->lcore_id);
+out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * Get the available frequencies of the specific lcore by reading the
+ * sysfs file.
+ */
+static int
+power_get_available_freqs(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1, i, count;
+	char *p;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *freqs[RTE_MAX_LCORE_FREQS];
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
+			pi->lcore_id);
+	f = fopen(fullpath, "r");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Strip the line break if there is one */
+	p = strchr(buf, '\n');
+	if (p != NULL)
+		*p = 0;
+
+	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
+	count = rte_strsplit(buf, sizeof(buf), freqs,
+			RTE_MAX_LCORE_FREQS, ' ');
+	if (count <= 0) {
+		RTE_LOG(ERR, POWER, "No available frequency in "
+				""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
+		goto out;
+	}
+	if (count >= RTE_MAX_LCORE_FREQS) {
+		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
+				count);
+		goto out;
+	}
+
+	/* Store the available frequencies into power context */
+	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
+		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
+				i, freqs[i]);
+		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
+				POWER_CONVERT_TO_DECIMAL);
+	}
+
+	ret = 0;
+	POWER_DEBUG_TRACE("%d frequencie(s) of lcore %u are available\n",
+			count, pi->lcore_id);
+out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * Open the sysfs file used later to set the lcore frequency.
+ */
+static int
+power_init_for_setting_freq(struct rte_power_info *pi)
+{
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t i, freq;
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, -1);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
+	for (i = 0; i < pi->nb_freqs; i++) {
+		if (freq == pi->freqs[i]) {
+			pi->curr_idx = i;
+			pi->f = f;
+			return 0;
+		}
+	}
+
+out:
+	fclose(f);
+
+	return -1;
+}
+
+int
+rte_power_acpi_cpufreq_init(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u cannot exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"in use\n", lcore_id);
+		return -1;
+	}
+
+	pi->lcore_id = lcore_id;
+	/* Check and set the governor */
+	if (power_set_governor_userspace(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
+				"userspace\n", lcore_id);
+		goto fail;
+	}
+
+	/* Get the available frequencies */
+	if (power_get_available_freqs(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Init for setting lcore frequency */
+	if (power_init_for_setting_freq(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Set freq to max by default */
+	if (rte_power_acpi_cpufreq_freq_max(lcore_id) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
+				"to max\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
+			"power management\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
+
+	return 0;
+
+fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+/**
+ * Check the governor and restore the original governor if needed by
+ * writing the sysfs file.
+ */
+static int
+power_set_governor_original(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Check if the governor to be set is the same as current */
+	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u "
+				"has already been set to %s\n",
+				pi->lcore_id, pi->governor_ori);
+		goto out;
+	}
+
+	/* Write back the original governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(pi->governor_ori, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u "
+			"has been set back to %s successfully\n",
+			pi->lcore_id, pi->governor_ori);
+out:
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_acpi_cpufreq_exit(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u cannot exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"not used\n", lcore_id);
+		return -1;
+	}
+
+	/* Close FD of setting freq */
+	fclose(pi->f);
+	pi->f = NULL;
+
+	/* Set the governor back to the original */
+	if (power_set_governor_original(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
+				"to the original\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
+			"'userspace' mode and been set back to the "
+			"original\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
+
+	return 0;
+
+	fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
+		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
+		return 0;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (num < pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
+		return 0;
+	}
+	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
+
+	return pi->nb_freqs;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_get_freq(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return RTE_POWER_INVALID_FREQ_INDEX;
+	}
+
+	return lcore_power_info[lcore_id].curr_idx;
+}
+
+int
+rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
+}
+
+int
+rte_power_acpi_cpufreq_freq_down(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx + 1 == pi->nb_freqs)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx + 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_up(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx == 0)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx - 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_max(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(&lcore_power_info[lcore_id], 0);
+}
+
+int
+rte_power_acpi_cpufreq_freq_min(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->nb_freqs - 1);
+}
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.h b/lib/librte_power/rte_power_acpi_cpufreq.h
new file mode 100644
index 0000000..68578e9
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.h
@@ -0,0 +1,192 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_ACPI_CPUFREQ_H
+#define _RTE_POWER_ACPI_CPUFREQ_H
+
+/**
+ * @file
+ * RTE Power Management via userspace ACPI cpufreq
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore. It will check and set the
+ * governor to userspace for the lcore, get the available frequencies, and
+ * prepare to set new lcore frequency.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore. It will restore the governor
+ * that was in use before initialization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore. The frequencies are
+ * copied into the buffer only if it can hold all of them; otherwise nothing
+ * is copied and 0 is returned. Frequency indexes used in other interfaces
+ * should be in the range of 0 to this return value minus one.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of entries the buffer can hold.
+ *
+ * @return
+ *  The number of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore. It
+ * will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)' on error.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  The current index of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/librte_power/rte_power_common.h b/lib/librte_power/rte_power_common.h
new file mode 100644
index 0000000..64bd168
--- /dev/null
+++ b/lib/librte_power/rte_power_common.h
@@ -0,0 +1,39 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_POWER_COMMON_H_
+#define RTE_POWER_COMMON_H_
+
+#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+
+#endif /* RTE_POWER_COMMON_H_ */
diff --git a/lib/librte_power/rte_power_kvm_vm.c b/lib/librte_power/rte_power_kvm_vm.c
new file mode 100644
index 0000000..3ccd92b
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.c
@@ -0,0 +1,135 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <errno.h>
+#include <string.h>
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
+
+#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+static struct channel_packet pkt[CHANNEL_CMDS_MAX_VM_CHANNELS];
+
+
+int
+rte_power_kvm_vm_init(unsigned lcore_id)
+{
+	if (lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+		return -1;
+	}
+	pkt[lcore_id].command = CPU_POWER;
+	pkt[lcore_id].resource_id = lcore_id;
+	return guest_channel_host_connect(FD_PATH, lcore_id);
+}
+
+int
+rte_power_kvm_vm_exit(unsigned lcore_id)
+{
+	guest_channel_host_disconnect(lcore_id);
+	return 0;
+}
+
+uint32_t
+rte_power_kvm_vm_freqs(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t *freqs,
+		__attribute__((unused)) uint32_t num)
+{
+	RTE_LOG(ERR, POWER, "rte_power_freqs is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+uint32_t
+rte_power_kvm_vm_get_freq(__attribute__((unused)) unsigned lcore_id)
+{
+	RTE_LOG(ERR, POWER, "rte_power_get_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_kvm_vm_set_freq(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t index)
+{
+	RTE_LOG(ERR, POWER, "rte_power_set_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+static inline int
+send_msg(unsigned lcore_id, uint32_t scale_direction)
+{
+	int ret;
+	if (lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+		return -1;
+	}
+	pkt[lcore_id].unit = scale_direction;
+	ret = guest_channel_send_msg(&pkt[lcore_id], lcore_id);
+	if (ret == 0)
+		return 1;
+	RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", strerror(ret));
+	return -1;
+}
+
+int
+rte_power_kvm_vm_freq_up(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_UP);
+}
+
+int
+rte_power_kvm_vm_freq_down(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_DOWN);
+}
+
+int
+rte_power_kvm_vm_freq_max(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_MAX);
+}
+
+int
+rte_power_kvm_vm_freq_min(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_MIN);
+}
diff --git a/lib/librte_power/rte_power_kvm_vm.h b/lib/librte_power/rte_power_kvm_vm.h
new file mode 100644
index 0000000..dcbc878
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.h
@@ -0,0 +1,179 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_KVM_VM_H
+#define _RTE_POWER_KVM_VM_H
+
+/**
+ * @file
+ * RTE Power Management KVM VM
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of frequencies to get.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+int rte_power_kvm_vm_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore. This request is forwarded to the
+ * host monitor.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * This function is not thread-safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.9.3
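
Reader's note on the patch above: the up/down/min/max helpers in
rte_power_acpi_cpufreq.c all rely on the frequency table being ordered from
high to low, so scaling "up" means moving to a smaller index. The following is
only a minimal sketch of that index arithmetic. The struct freq_state and the
model_* helpers are hypothetical names, not part of the patch; the real code
goes through set_freq_internal() and writes to sysfs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-lcore state kept by the ACPI backend. */
struct freq_state {
	uint32_t nb_freqs; /* number of entries in the frequency table */
	uint32_t curr_idx; /* current index; 0 is the highest frequency */
};

/* Step one entry towards the highest frequency.
 * Returns 1 if the index changed, 0 if it was already at the maximum. */
static int
model_freq_up(struct freq_state *st)
{
	assert(st->nb_freqs > 0 && st->curr_idx < st->nb_freqs);
	if (st->curr_idx == 0)
		return 0;
	st->curr_idx--;
	return 1;
}

/* Step one entry towards the lowest frequency.
 * Returns 1 if the index changed, 0 if it was already at the minimum. */
static int
model_freq_down(struct freq_state *st)
{
	assert(st->nb_freqs > 0 && st->curr_idx < st->nb_freqs);
	if (st->curr_idx + 1 == st->nb_freqs)
		return 0;
	st->curr_idx++;
	return 1;
}
```

With a three-entry table, starting at index 0, model_freq_up() is a no-op and
two calls to model_freq_down() reach index 2, after which further calls leave
the index unchanged. This mirrors the early "return 0" paths in the patch.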

* [dpdk-dev] [PATCH v3 08/10] Packet format for VM Power Management(Host and Guest).
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (6 preceding siblings ...)
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 07/10] librte_power common interface for Guest and Host Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 09/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
                       ` (4 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

Provides a command packet format for host and guest.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
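For illustration only, here is a minimal sketch of how a guest might fill in
one of these packets before handing it to the channel. The struct and the
CPU_POWER* values are copied from channel_commands.h in this patch;
make_scale_request() is a hypothetical helper that does not exist in the
series (rte_power_kvm_vm.c keeps a static packet per lcore instead).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from channel_commands.h in this patch. */
#define CPU_POWER            1
#define CPU_POWER_SCALE_UP   1
#define CPU_POWER_SCALE_DOWN 2
#define CPU_POWER_SCALE_MAX  3
#define CPU_POWER_SCALE_MIN  4

struct channel_packet {
	uint64_t resource_id; /* core_num, device */
	uint32_t unit;        /* scale down/up/min/max */
	uint32_t command;     /* Power, IO, etc */
};

/* Hypothetical helper: build a CPU_POWER request for one lcore. */
static struct channel_packet
make_scale_request(unsigned lcore_id, uint32_t direction)
{
	struct channel_packet pkt;

	assert(direction >= CPU_POWER_SCALE_UP &&
			direction <= CPU_POWER_SCALE_MIN);
	memset(&pkt, 0, sizeof(pkt));
	pkt.command = CPU_POWER;
	pkt.resource_id = lcore_id;
	pkt.unit = direction;
	return pkt;
}
```

For example, make_scale_request(3, CPU_POWER_SCALE_DOWN) yields a packet with
command CPU_POWER, resource_id 3 and unit CPU_POWER_SCALE_DOWN. The fields
occupy 8 + 4 + 4 bytes, so the struct has no internal padding.
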
 lib/librte_power/channel_commands.h | 77 +++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)
 create mode 100644 lib/librte_power/channel_commands.h

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
new file mode 100644
index 0000000..7e78a8b
--- /dev/null
+++ b/lib/librte_power/channel_commands.h
@@ -0,0 +1,77 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_COMMANDS_H_
+#define CHANNEL_COMMANDS_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+/* Maximum number of CPUs */
+#define CHANNEL_CMDS_MAX_CPUS        64
+#if CHANNEL_CMDS_MAX_CPUS > 64
+#error Maximum number of cores is 64, overflow is guaranteed to \
+	cause problems with VM Power Management
+#endif
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Valid Commands */
+#define CPU_POWER               1
+#define CPU_POWER_CONNECT       2
+
+/* CPU Power Command Scaling */
+#define CPU_POWER_SCALE_UP      1
+#define CPU_POWER_SCALE_DOWN    2
+#define CPU_POWER_SCALE_MAX     3
+#define CPU_POWER_SCALE_MIN     4
+
+struct channel_packet {
+	uint64_t resource_id; /**< core_num, device */
+	uint32_t unit;        /**< scale down/up/min/max */
+	uint32_t command;     /**< Power, IO, etc */
+};
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_COMMANDS_H_ */
-- 
1.9.3

* [dpdk-dev] [PATCH v3 09/10] Build system integration for VM Power Management(Guest and Host)
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (7 preceding siblings ...)
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 08/10] Packet format for VM Power Management(Host and Guest) Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 10/10] VM Power Management Unit Tests Alan Carew
                       ` (3 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

librte_power now contains both rte_power_acpi_cpufreq and rte_power_kvm_vm
implementations.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_power/Makefile b/lib/librte_power/Makefile
index 6185812..d672a5a 100644
--- a/lib/librte_power/Makefile
+++ b/lib/librte_power/Makefile
@@ -37,7 +37,8 @@ LIB = librte_power.a
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -fno-strict-aliasing
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c rte_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += rte_power_kvm_vm.c guest_channel.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_POWER)-include := rte_power.h
-- 
1.9.3

* [dpdk-dev] [PATCH v3 10/10] VM Power Management Unit Tests
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (8 preceding siblings ...)
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 09/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
@ 2014-09-29 15:18     ` Alan Carew
  2014-09-29 17:29     ` [dpdk-dev] [PATCH v3 00/10] VM Power Management Neil Horman
                       ` (2 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-09-29 15:18 UTC (permalink / raw)
  To: dev

Updated the unit tests to cover both librte_power implementations as well as
the external API.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
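One detail worth keeping in mind when testing the KVM VM implementation:
rte_power_kvm_vm_freqs() and rte_power_kvm_vm_get_freq() are declared to
return uint32_t but return -ENOTSUP, so the caller sees a large unsigned value
rather than a negative one. The sketch below shows one way a check could
account for that. stub_get_freq() and is_not_supported() are hypothetical
names used only for this illustration; they are not taken from the test files
in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stub that mirrors the return convention of
 * rte_power_kvm_vm_get_freq(): a negative errno in a uint32_t. */
static uint32_t
stub_get_freq(unsigned lcore_id)
{
	(void)lcore_id;
	return -ENOTSUP; /* converted modulo 2^32 to a large unsigned value */
}

/* Compare in the unsigned domain so the check is well defined. */
static int
is_not_supported(uint32_t ret)
{
	return ret == (uint32_t)-ENOTSUP;
}
```

A check written as "ret < 0" on the uint32_t result would never fire, whereas
is_not_supported(stub_get_freq(0)) evaluates to true.
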
 app/test/Makefile                  |   3 +-
 app/test/autotest_data.py          |  26 ++
 app/test/test_power.c              | 445 +++---------------------------
 app/test/test_power_acpi_cpufreq.c | 544 +++++++++++++++++++++++++++++++++++++
 app/test/test_power_kvm_vm.c       | 308 +++++++++++++++++++++
 5 files changed, 917 insertions(+), 409 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 37a3772..03ade39 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -119,7 +119,8 @@ endif
 
 SRCS-$(CONFIG_RTE_LIBRTE_METER) += test_meter.c
 SRCS-$(CONFIG_RTE_LIBRTE_KNI) += test_kni.c
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c test_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power_kvm_vm.c
 SRCS-y += test_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 878c72e..618a946 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -425,6 +425,32 @@ non_parallel_test_group_list = [
 	]
 },
 {
+	"Prefix" :      "power_acpi_cpufreq",
+	"Memory" :      all_sockets(512),
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power ACPI cpufreq autotest",
+		 "Command" :    "power_acpi_cpufreq_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
+	"Prefix" :      "power_kvm_vm",
+	"Memory" :      "512",
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power KVM VM autotest",
+		 "Command" :    "power_kvm_vm_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
 	"Prefix" :	"lpm6",
 	"Memory" :	"512",
 	"Tests" :
diff --git a/app/test/test_power.c b/app/test/test_power.c
index d9eb420..64a2305 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -41,437 +41,66 @@
 
 #include <rte_power.h>
 
-#define TEST_POWER_LCORE_ID      2U
-#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
-#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
-
-#define TEST_POWER_SYSFILE_CUR_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
-
-static uint32_t total_freq_num;
-static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
-
-static int
-check_cur_freq(unsigned lcore_id, uint32_t idx)
-{
-#define TEST_POWER_CONVERT_TO_DECIMAL 10
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t cur_freq;
-	int ret = -1;
-
-	if (snprintf(fullpath, sizeof(fullpath),
-		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
-		return 0;
-	}
-	f = fopen(fullpath, "r");
-	if (f == NULL) {
-		return 0;
-	}
-	if (fgets(buf, sizeof(buf), f) == NULL) {
-		goto fail_get_cur_freq;
-	}
-	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
-	ret = (freqs[idx] == cur_freq ? 0 : -1);
-
-fail_get_cur_freq:
-	fclose(f);
-
-	return ret;
-}
-
-/* Check rte_power_freqs() */
-static int
-check_power_freqs(void)
-{
-	uint32_t ret;
-
-	total_freq_num = 0;
-	memset(freqs, 0, sizeof(freqs));
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
-					TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with NULL buffer to save available freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test of getting zero number of freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test with all valid input parameters */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get available freqs on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Save the total number of available freqs */
-	total_freq_num = ret;
-
-	return 0;
-}
-
-/* Check rte_power_get_freq() */
-static int
-check_power_get_freq(void)
-{
-	int ret;
-	uint32_t count;
-
-	/* test with an invalid lcore id */
-	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
-	if (count < TEST_POWER_FREQS_NUM_MAX) {
-		printf("Unexpectedly get freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
-	if (count >= TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get the freq index on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_set_freq() */
-static int
-check_power_set_freq(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
-	if (ret >= 0) {
-		printf("Unexpectedly set freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with an invalid freq index */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/**
-	 * test with an invalid freq index which is right one bigger than
-	 * total number of freqs
-	 */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", total_freq_num,
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0) {
-		printf("Fail to set freq index on lcore %u\n",
-					TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_down() */
-static int
-check_power_freq_down(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale down one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale down one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf ("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_up() */
-static int
-check_power_freq_up(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq on %u\n",
-						TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale up one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale up one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_max() */
-static int
-check_power_freq_max(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq to max on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_min() */
-static int
-check_power_freq_min(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq to min "
-				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
 static int
 test_power(void)
 {
 	int ret = -1;
+	enum power_management_env env;
 
-	/* test of init power management for an invalid lcore */
-	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	/* Test setting an invalid environment */
+	ret = rte_power_set_env(PM_ENV_NOT_SET);
 	if (ret == 0) {
-		printf("Unexpectedly initialise power management successfully "
-				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot initialise power management for lcore %u\n",
-							TEST_POWER_LCORE_ID);
+		printf("Unexpectedly succeeded on setting an invalid environment\n");
 		return -1;
 	}
 
-	/**
-	 * test of initialising power management for the lcore which has
-	 * been initialised
-	 */
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly init successfully power twice on "
-					"lcore %u\n", TEST_POWER_LCORE_ID);
+	/* Test that the environment has not been set */
+	env = rte_power_get_env();
+	if (env != PM_ENV_NOT_SET) {
+		printf("Unexpectedly got a valid environment configuration\n");
 		return -1;
 	}
 
-	ret = check_power_freqs();
-	if (ret < 0)
+	/* verify that function pointers are NULL */
+	if (rte_power_freqs != NULL) {
+		printf("rte_power_freqs should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	if (total_freq_num < 2) {
-		rte_power_exit(TEST_POWER_LCORE_ID);
-		printf("Frequency can not be changed due to CPU itself\n");
-		return 0;
 	}
-
-	ret = check_power_get_freq();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_set_freq();
-	if (ret < 0)
+	if (rte_power_get_freq != NULL) {
+		printf("rte_power_get_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_down();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_freq_up();
-	if (ret < 0)
+	}
+	if (rte_power_set_freq != NULL) {
+		printf("rte_power_set_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_max();
-	if (ret < 0)
+	}
+	if (rte_power_freq_up != NULL) {
+		printf("rte_power_freq_up should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_min();
-	if (ret < 0)
+	}
+	if (rte_power_freq_down != NULL) {
+		printf("rte_power_freq_down should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot exit power management for lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
 	}
-
-	/**
-	 * test of exiting power management for the lcore which has been exited
-	 */
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly exit successfully power management twice "
-					"on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
+	if (rte_power_freq_max != NULL) {
+		printf("rte_power_freq_max should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
-	/* test of exit power management for an invalid lcore */
-	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
-	if (ret == 0) {
-		printf("Unpectedly exit power management successfully for "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
+	if (rte_power_freq_min != NULL) {
+		printf("rte_power_freq_min should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
+	rte_power_unset_env();
 	return 0;
-
 fail_all:
-	rte_power_exit(TEST_POWER_LCORE_ID);
-
+	rte_power_unset_env();
 	return -1;
 }
 
diff --git a/app/test/test_power_acpi_cpufreq.c b/app/test/test_power_acpi_cpufreq.c
new file mode 100644
index 0000000..8848d75
--- /dev/null
+++ b/app/test/test_power_acpi_cpufreq.c
@@ -0,0 +1,544 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <limits.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+
+#define TEST_POWER_LCORE_ID      2U
+#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
+#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
+
+#define TEST_POWER_SYSFILE_CUR_FREQ \
+	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
+
+static uint32_t total_freq_num;
+static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
+
+static int
+check_cur_freq(unsigned lcore_id, uint32_t idx)
+{
+#define TEST_POWER_CONVERT_TO_DECIMAL 10
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t cur_freq;
+	int ret = -1;
+
+	if (snprintf(fullpath, sizeof(fullpath),
+		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
+		return 0;
+	}
+	f = fopen(fullpath, "r");
+	if (f == NULL) {
+		return 0;
+	}
+	if (fgets(buf, sizeof(buf), f) == NULL) {
+		goto fail_get_cur_freq;
+	}
+	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
+	ret = (freqs[idx] == cur_freq ? 0 : -1);
+
+fail_get_cur_freq:
+	fclose(f);
+
+	return ret;
+}
+
+/* Check rte_power_freqs() */
+static int
+check_power_freqs(void)
+{
+	uint32_t ret;
+
+	total_freq_num = 0;
+	memset(freqs, 0, sizeof(freqs));
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
+					TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with NULL buffer to save available freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test of getting zero number of freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test with all valid input parameters */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get available freqs on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Save the total number of available freqs */
+	total_freq_num = ret;
+
+	return 0;
+}
+
+/* Check rte_power_get_freq() */
+static int
+check_power_get_freq(void)
+{
+	int ret;
+	uint32_t count;
+
+	/* test with an invalid lcore id */
+	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
+	if (count < TEST_POWER_FREQS_NUM_MAX) {
+		printf("Unexpectedly get freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
+	if (count >= TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get the freq index on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_set_freq() */
+static int
+check_power_set_freq(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
+	if (ret >= 0) {
+		printf("Unexpectedly set freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with an invalid freq index */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u) "
+			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/**
+	 * test with an invalid freq index which is exactly one larger than
+	 * the total number of freqs
+	 */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u) "
+			"successfully on lcore %u\n", total_freq_num,
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0) {
+		printf("Fail to set freq index on lcore %u\n",
+					TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_down() */
+static int
+check_power_freq_down(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale down one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale down one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_up() */
+static int
+check_power_freq_up(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq on %u\n",
+						TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale up one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale up one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_max() */
+static int
+check_power_freq_max(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq to max on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_min() */
+static int
+check_power_freq_min(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq to min "
+				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+static int
+test_power_acpi_cpufreq(void)
+{
+	int ret = -1;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_ACPI_CPUFREQ, this "
+				"may occur if environment is not configured correctly or "
+				"operating in another valid Power management environment\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_ACPI_CPUFREQ) {
+		printf("Unexpectedly got an environment other than ACPI cpufreq\n");
+		goto fail_all;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+
+	/* test of init power management for an invalid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly initialise power management successfully "
+				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly (ACPI cpufreq) or operating in another valid "
+				"Power management environment\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of initialising power management for the lcore which has
+	 * been initialised
+	 */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly init successfully power twice on "
+					"lcore %u\n", TEST_POWER_LCORE_ID);
+		goto fail_all;
+	}
+
+	ret = check_power_freqs();
+	if (ret < 0)
+		goto fail_all;
+
+	if (total_freq_num < 2) {
+		rte_power_exit(TEST_POWER_LCORE_ID);
+		printf("Frequency can not be changed due to CPU itself\n");
+		rte_power_unset_env();
+		return 0;
+	}
+
+	ret = check_power_get_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_set_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_down();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_up();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_max();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_min();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot exit power management for lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of exiting power management for the lcore which has been exited
+	 */
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly exit successfully power management twice "
+					"on lcore %u\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* test of exit power management for an invalid lcore */
+	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly exit power management successfully for "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+	rte_power_unset_env();
+	return 0;
+
+fail_all:
+	rte_power_exit(TEST_POWER_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_acpi_cpufreq_cmd = {
+	.command = "power_acpi_cpufreq_autotest",
+	.callback = test_power_acpi_cpufreq,
+};
+REGISTER_TEST_COMMAND(power_acpi_cpufreq_cmd);
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
new file mode 100644
index 0000000..ac0fcb6
--- /dev/null
+++ b/app/test/test_power_kvm_vm.c
@@ -0,0 +1,308 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <limits.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+#include <rte_config.h>
+
+#define TEST_POWER_VM_LCORE_ID            0U
+#define TEST_POWER_VM_LCORE_OUT_OF_BOUNDS (RTE_MAX_LCORE+1)
+#define TEST_POWER_VM_LCORE_INVALID       1U
+
+static int
+test_power_kvm_vm(void)
+{
+	int ret;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_KVM_VM);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_KVM_VM\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_KVM_VM) {
+		printf("Unexpectedly got a Power Management environment other than "
+				"KVM VM\n");
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		return -1;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	/* Test initialisation of an out of bounds lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret != -1) {
+		printf("rte_power_init unexpectedly succeeded on an invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly (KVM VM) or operating in another valid "
+				"Power management environment\n", TEST_POWER_VM_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of previously initialised lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_init unexpectedly succeeded on calling init twice on "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of invalid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency down of invalid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency min of invalid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency max of invalid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid but uninitialised lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid but uninitialised lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid but uninitialised lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid but uninitialised lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_up unexpectedly failed on valid lcore %u\n",
+				TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_down unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_min unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_max unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_freqs */
+	ret = rte_power_freqs(TEST_POWER_VM_LCORE_ID, NULL, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_freqs did not return the expected -ENOTSUP(%d) but "
+				"returned %d\n", -ENOTSUP, ret);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_get_freq */
+	ret = rte_power_get_freq(TEST_POWER_VM_LCORE_ID);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_get_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_set_freq */
+	ret = rte_power_set_freq(TEST_POWER_VM_LCORE_ID, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_set_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test removing of an lcore */
+	ret = rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_exit unexpectedly failed on valid lcore %u, "
+				"please ensure that the environment has been configured "
+				"correctly\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of previously removed lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency down of previously removed lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency min of previously removed lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency max of previously removed lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+	rte_power_unset_env();
+	return 0;
+fail_all:
+	rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_kvm_vm_cmd = {
+    .command = "power_kvm_vm_autotest",
+    .callback = test_power_kvm_vm,
+};
+REGISTER_TEST_COMMAND(power_kvm_vm_cmd);
-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/10] VM Power Management
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (9 preceding siblings ...)
  2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 10/10] VM Power Management Unit Tests Alan Carew
@ 2014-09-29 17:29     ` Neil Horman
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
  2014-11-10  9:19     ` [dpdk-dev] [PATCH v2] librte_cmdline: FreeBSD Fix oveflow when size of command result structure is greater than BUFSIZ Alan Carew
  12 siblings, 0 replies; 97+ messages in thread
From: Neil Horman @ 2014-09-29 17:29 UTC (permalink / raw)
  To: Alan Carew; +Cc: dev

On Mon, Sep 29, 2014 at 04:18:13PM +0100, Alan Carew wrote:
> Virtual Machine Power Management.
> 
> The following patches add two DPDK sample applications and an alternate
> implementation of librte_power for use in virtualized environments.
> The idea is to provide librte_power functionality from within a VM to address
> the lack of MSRs to facilitate frequency changes from within a VM.
> It is ideally suited for Haswell which provides per core frequency scaling.
> 
> The current librte_power affects frequency changes via the acpi-cpufreq
> 'userspace' power governor, accessed via sysfs.
> 
> General Overview:(more information in each patch that follows).
> The VM Power Management solution provides two components:
> 
>  1)VM: Allows for the a DPDK application in a VM to reuse the librte_power
>  interface. Each lcore opens a Virto-Serial endpoint channel to the host,
>  where the re-implementation of librte_power simply forwards the requests for
>  frequency change to a host based monitor. The host monitor itself uses
>  librte_power.
>  Each lcore channel corresponds to a
>  serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
>  which is opened in non-blocking mode.
>  While each Virtual CPU can be mapped to multiple physical CPUs it is
>  recommended that each vCPU should be mapped to a single core only.
> 
>  2)Host: The host monitor is managed by a CLI, it allows for adding qemu/KVM
>  virtual machines and associated channels to the monitor, manually changing
>  CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
>  channels.
>  Host channel endpoints are Virto-Serial endpoints configured as AF_UNIX file
>  sockets which follow a specific naming convention
>  i.e /tmp/powermonitor/<vm_name>.<channel_number>,
>  each channel has an 1:1 mapping to a VM endpoint
>  i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>
>  Host channel endpoints are opened in non-blocking mode and are monitored via epoll.
>  Requests over each channel to change frequency are forwarded to the original
>  librte_power.
>  
> Channels must be manually configured as qemu-kvm command line arguments or
> libvirt domain definition(xml) e.g.
> <controller type='virtio-serial' index='0'>
>  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
> </controller>
> <channel type='unix'>
>   <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
>   <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>/>
>   <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
> </channel>
> 
> Where multiple channels can be configured by specifying multiple <channel>
> elements, by replacing <vm_name>, <channel_num>.
> <N>(port number) should be incremented by 1 for each new channel element.
> More information on Virtio-Serial can be found here:
> http://fedoraproject.org/wiki/Features/VirtioSerial
> To enable the Hypervisor creation of channels, the host endpoint directory
> must be created with qemu permissions:
> mkdir /tmp/powermonitor
> chown qemu:qemu /tmp/powermonitor
> 
> The host application runs on two separate lcores:
> Core N) CLI: For management of Virtual Machines adding channels to Monitor thread,
>  inspecting state and manually setting CPU frequency [PATCH 02/09]
> Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel events
>  from VMs and calls the corresponding librte_power functions.
> 
> A sample application is also provided to run on Virtual Machines, this
> application provides a CLI to manually set the frequency of a 
> vCPU[PATCH 08/09]
> 
> The current l3fwd-power sample application can also be run on a VM.
> 
> Changes in V3:
>  Fixed crash in Guest CLI when host application is not running.
>  Renamed #defines to be more specific to the module they belong
>  Added vCPU pinning via CLI
>  Testing feedback
> 
> Changes in V2:
>  Runtime selection of librte_power implementations.
>  Updated Unit tests to cover librte_power changes.
>  PATCH[0/3] was sent twice, again as PATCH[0/4]
>  Miscellaneous fixes.
> 
> Alan Carew (10):
>   Channel Manager and Monitor for VM Power Management(Host).
>   VM Power Management CLI(Host).
>   CPU Frequency Power Management(Host).
>   VM Power Management application and Makefile.
>   VM Power Management CLI(Guest).
>   VM communication channels for VM Power Management(Guest).
>   librte_power common interface for Guest and Host
>   Packet format for VM Power Management(Host and Guest).
>   Build system integration for VM Power Management(Guest and Host)
>   VM Power Management Unit Tests
> 
>  app/test/Makefile                                  |   3 +-
>  app/test/autotest_data.py                          |  26 +
>  app/test/test_power.c                              | 445 +-----------
>  app/test/test_power_acpi_cpufreq.c                 | 544 ++++++++++++++
>  app/test/test_power_kvm_vm.c                       | 308 ++++++++
>  examples/vm_power_manager/Makefile                 |  57 ++
>  examples/vm_power_manager/channel_manager.c        | 804 +++++++++++++++++++++
>  examples/vm_power_manager/channel_manager.h        | 314 ++++++++
>  examples/vm_power_manager/channel_monitor.c        | 228 ++++++
>  examples/vm_power_manager/channel_monitor.h        | 102 +++
>  examples/vm_power_manager/guest_cli/Makefile       |  56 ++
>  examples/vm_power_manager/guest_cli/main.c         |  87 +++
>  examples/vm_power_manager/guest_cli/main.h         |  52 ++
>  .../guest_cli/vm_power_cli_guest.c                 | 155 ++++
>  .../guest_cli/vm_power_cli_guest.h                 |  55 ++
>  examples/vm_power_manager/main.c                   | 113 +++
>  examples/vm_power_manager/main.h                   |  52 ++
>  examples/vm_power_manager/power_manager.c          | 244 +++++++
>  examples/vm_power_manager/power_manager.h          | 188 +++++
>  examples/vm_power_manager/vm_power_cli.c           | 669 +++++++++++++++++
>  examples/vm_power_manager/vm_power_cli.h           |  47 ++
>  lib/librte_power/Makefile                          |   3 +-
>  lib/librte_power/channel_commands.h                |  77 ++
>  lib/librte_power/guest_channel.c                   | 162 +++++
>  lib/librte_power/guest_channel.h                   |  89 +++
>  lib/librte_power/rte_power.c                       | 540 ++------------
>  lib/librte_power/rte_power.h                       | 120 ++-
>  lib/librte_power/rte_power_acpi_cpufreq.c          | 545 ++++++++++++++
>  lib/librte_power/rte_power_acpi_cpufreq.h          | 192 +++++
>  lib/librte_power/rte_power_common.h                |  39 +
>  lib/librte_power/rte_power_kvm_vm.c                | 135 ++++
>  lib/librte_power/rte_power_kvm_vm.h                | 179 +++++
>  32 files changed, 5718 insertions(+), 912 deletions(-)
>  create mode 100644 app/test/test_power_acpi_cpufreq.c
>  create mode 100644 app/test/test_power_kvm_vm.c
>  create mode 100644 examples/vm_power_manager/Makefile
>  create mode 100644 examples/vm_power_manager/channel_manager.c
>  create mode 100644 examples/vm_power_manager/channel_manager.h
>  create mode 100644 examples/vm_power_manager/channel_monitor.c
>  create mode 100644 examples/vm_power_manager/channel_monitor.h
>  create mode 100644 examples/vm_power_manager/guest_cli/Makefile
>  create mode 100644 examples/vm_power_manager/guest_cli/main.c
>  create mode 100644 examples/vm_power_manager/guest_cli/main.h
>  create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
>  create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
>  create mode 100644 examples/vm_power_manager/main.c
>  create mode 100644 examples/vm_power_manager/main.h
>  create mode 100644 examples/vm_power_manager/power_manager.c
>  create mode 100644 examples/vm_power_manager/power_manager.h
>  create mode 100644 examples/vm_power_manager/vm_power_cli.c
>  create mode 100644 examples/vm_power_manager/vm_power_cli.h
>  create mode 100644 lib/librte_power/channel_commands.h
>  create mode 100644 lib/librte_power/guest_channel.c
>  create mode 100644 lib/librte_power/guest_channel.h
>  create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
>  create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
>  create mode 100644 lib/librte_power/rte_power_common.h
>  create mode 100644 lib/librte_power/rte_power_kvm_vm.c
>  create mode 100644 lib/librte_power/rte_power_kvm_vm.h
> 
> -- 
> 1.9.3
> 
> 
This all seems to be reasonable.  Thanks
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (10 preceding siblings ...)
  2014-09-29 17:29     ` [dpdk-dev] [PATCH v3 00/10] VM Power Management Neil Horman
@ 2014-10-12 19:36     ` Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
                         ` (12 more replies)
  2014-11-10  9:19     ` [dpdk-dev] [PATCH v2] librte_cmdline: FreeBSD Fix oveflow when size of command result structure is greater than BUFSIZ Alan Carew
  12 siblings, 13 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

Virtual Machine Power Management.

The following patches add two DPDK sample applications and an alternate
implementation of librte_power for use in virtualized environments.
The idea is to provide librte_power functionality from within a VM to address
the lack of MSRs to facilitate frequency changes from within a VM.
It is ideally suited for Haswell which provides per core frequency scaling.

The current librte_power effects frequency changes via the acpi-cpufreq
'userspace' power governor, accessed via sysfs.

General Overview:(more information in each patch that follows).
The VM Power Management solution provides two components:

 1)VM: Allows a DPDK application in a VM to reuse the librte_power
 interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
 where the re-implementation of librte_power simply forwards the requests for
 frequency change to a host based monitor. The host monitor itself uses
 librte_power.
 Each lcore channel corresponds to a
 serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
 which is opened in non-blocking mode.
 While each Virtual CPU can be mapped to multiple physical CPUs it is
 recommended that each vCPU should be mapped to a single core only.

 2)Host: The host monitor is managed by a CLI, which allows for adding qemu/KVM
 virtual machines and associated channels to the monitor, manually changing
 CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
 channels.
 Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
 sockets which follow a specific naming convention,
 i.e. /tmp/powermonitor/<vm_name>.<channel_number>;
 each channel has a 1:1 mapping to a VM endpoint,
 i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>.
 Host channel endpoints are opened in non-blocking mode and are monitored via epoll.
 Requests over each channel to change frequency are forwarded to the original
 librte_power.
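
As an illustration only (this is not code from the series), the naming
convention above could be sketched in C as follows. The demo_* helper names
are hypothetical, and the sketch assumes the guest lcore number and the host
channel number are the same value, as the 1:1 mapping suggests.

```c
#include <stdio.h>

/* Prefixes taken from the naming convention described above. */
#define DEMO_HOST_DIR "/tmp/powermonitor/"
#define DEMO_GUEST_DEV "/dev/virtio-ports/virtio.serial.port.poweragent."

/*
 * Hypothetical helper: build the host-side AF_UNIX socket path for one
 * channel of a VM. Returns 0 on success, -1 if buf is too small.
 */
static int
demo_host_channel_path(char *buf, size_t len, const char *vm_name,
		unsigned channel_num)
{
	int n = snprintf(buf, len, DEMO_HOST_DIR "%s.%u", vm_name, channel_num);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: build the guest-side virtio-serial device path for
 * one lcore. Returns 0 on success, -1 if buf is too small.
 */
static int
demo_guest_channel_path(char *buf, size_t len, unsigned lcore_num)
{
	int n = snprintf(buf, len, DEMO_GUEST_DEV "%u", lcore_num);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Checking the snprintf() return value against the buffer length means a
truncated path is reported as an error rather than silently used.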
 
Channels must be manually configured as qemu-kvm command line arguments or
libvirt domain definition(xml) e.g.
<controller type='virtio-serial' index='0'>
 <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<channel type='unix'>
  <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
  <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>'/>
  <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
</channel>

Multiple channels can be configured by specifying multiple <channel>
elements, replacing <vm_name> and <channel_num> in each.
<N>(port number) should be incremented by 1 for each new channel element.
More information on Virtio-Serial can be found here:
http://fedoraproject.org/wiki/Features/VirtioSerial
To enable the Hypervisor creation of channels, the host endpoint directory
must be created with qemu permissions:
mkdir /tmp/powermonitor
chown qemu:qemu /tmp/powermonitor

The host application runs on two separate lcores:
Core N) CLI: For management of Virtual Machines, adding channels to the Monitor
 thread, inspecting state and manually setting CPU frequency [PATCH 02/10]
Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel events
 from VMs and calls the corresponding librte_power functions.

A sample application is also provided to run on Virtual Machines; this
application provides a CLI to manually set the frequency of a
vCPU [PATCH 05/10]

The current l3fwd-power sample application can also be run on a VM.

Changes in V4:
 Fixed double free of channel during VM shutdown.

Changes in V3:
 Fixed crash in Guest CLI when host application is not running.
 Renamed #defines to be more specific to the module they belong to
 Added vCPU pinning via CLI

Changes in V2:
 Runtime selection of librte_power implementations.
 Updated Unit tests to cover librte_power changes.
 PATCH[0/3] was sent twice, again as PATCH[0/4]
 Miscellaneous fixes.

Alan Carew (10):
  Channel Manager and Monitor for VM Power Management(Host).
  VM Power Management CLI(Host).
  CPU Frequency Power Management(Host).
  VM Power Management application and Makefile.
  VM Power Management CLI(Guest).
  VM communication channels for VM Power Management(Guest).
  librte_power common interface for Guest and Host
  Packet format for VM Power Management(Host and Guest).
  Build system integration for VM Power Management(Guest and Host)
  VM Power Management Unit Tests

 app/test/Makefile                                  |   3 +-
 app/test/autotest_data.py                          |  26 +
 app/test/test_power.c                              | 445 +-----------
 app/test/test_power_acpi_cpufreq.c                 | 544 ++++++++++++++
 app/test/test_power_kvm_vm.c                       | 308 ++++++++
 examples/vm_power_manager/Makefile                 |  57 ++
 examples/vm_power_manager/channel_manager.c        | 804 +++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h        | 314 ++++++++
 examples/vm_power_manager/channel_monitor.c        | 231 ++++++
 examples/vm_power_manager/channel_monitor.h        | 102 +++
 examples/vm_power_manager/guest_cli/Makefile       |  56 ++
 examples/vm_power_manager/guest_cli/main.c         |  87 +++
 examples/vm_power_manager/guest_cli/main.h         |  52 ++
 .../guest_cli/vm_power_cli_guest.c                 | 155 ++++
 .../guest_cli/vm_power_cli_guest.h                 |  55 ++
 examples/vm_power_manager/main.c                   | 117 +++
 examples/vm_power_manager/main.h                   |  52 ++
 examples/vm_power_manager/power_manager.c          | 244 +++++++
 examples/vm_power_manager/power_manager.h          | 188 +++++
 examples/vm_power_manager/vm_power_cli.c           | 669 +++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h           |  47 ++
 lib/librte_power/Makefile                          |   3 +-
 lib/librte_power/channel_commands.h                |  77 ++
 lib/librte_power/guest_channel.c                   | 162 +++++
 lib/librte_power/guest_channel.h                   |  89 +++
 lib/librte_power/rte_power.c                       | 540 ++------------
 lib/librte_power/rte_power.h                       | 120 ++-
 lib/librte_power/rte_power_acpi_cpufreq.c          | 545 ++++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h          | 192 +++++
 lib/librte_power/rte_power_common.h                |  39 +
 lib/librte_power/rte_power_kvm_vm.c                | 135 ++++
 lib/librte_power/rte_power_kvm_vm.h                | 179 +++++
 32 files changed, 5725 insertions(+), 912 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h
 create mode 100644 lib/librte_power/channel_commands.h
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v4 01/10] Channel Manager and Monitor for VM Power Management(Host).
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 02/10] VM Power Management CLI(Host) Alan Carew
                         ` (11 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

The manager is responsible for adding communication channels to the Monitor
thread and for tracking and reporting VM state; it employs the libvirt API for
synchronization with the KVM Hypervisor. The manager interacts with the
Hypervisor to discover the mapping of virtual CPUs (vCPUs) to the host
physical CPUs (pCPUs) and to inspect the VM running state.

The manager provides the following functionality to the CLI:
1) Connect to a libvirtd instance, default: qemu:///system
2) Add a VM to an internal list, each VM is identified by a "name" which must
   correspond to a valid libvirt Domain Name.
3) Add communication channels associated with a VM to the epoll based Monitor
   thread.
   The channels must exist and be in the form of:
   /tmp/powermonitor/<vm_name>.<channel_number>. Each channel is a
   Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
   non-blocking mode.
   Each VM can have a maximum of 64 channels associated with it.
4) Disable or re-enable VM communication channels. Channels, once added to the
   Monitor thread, remain in that thread's control; however, acting on channel
   requests can be disabled and re-enabled via the CLI.
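
The limit of 64 channels in 3) matches tracking channels in a single 64-bit
mask (the channel_mask field in channel_manager.c). A minimal sketch of that
kind of bookkeeping follows; the demo_* names and DEMO_MAX_VM_CHANNELS are
hypothetical and are not part of this patch.

```c
#include <stdint.h>

/* Assumed to mirror the documented limit of 64 channels per VM. */
#define DEMO_MAX_VM_CHANNELS 64

/* Mark a channel as present; returns -1 if channel_num is out of range. */
static int
demo_channel_set(uint64_t *mask, unsigned channel_num)
{
	if (channel_num >= DEMO_MAX_VM_CHANNELS)
		return -1;
	*mask |= 1ULL << channel_num;
	return 0;
}

/* Return 1 if the channel is present in the mask, 0 otherwise. */
static int
demo_channel_exists(uint64_t mask, unsigned channel_num)
{
	if (channel_num >= DEMO_MAX_VM_CHANNELS)
		return 0;
	return (mask >> channel_num) & 1;
}

/* Count channels by clearing the lowest set bit until the mask is empty. */
static unsigned
demo_channel_count(uint64_t mask)
{
	unsigned n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}
```

The range check before each shift matters: shifting a 64-bit value by 64 or
more is undefined behaviour in C.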

The monitor is an epoll based infinite loop running in a separate thread that
waits on channel events from VMs and calls the corresponding functions. Channel
definitions from the manager are registered via the epoll event opaque pointer
when calling epoll_ctl(EPOLL_CTL_ADD); this allows for obtaining the channel's
file descriptor for reading EPOLLIN events and mapping the vCPU to pCPU(s)
associated with a request from a particular VM.
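
The opaque pointer registration described above can be sketched roughly as
follows. This is an illustration under assumptions, not the code in
channel_monitor.c: struct demo_channel is a hypothetical stand-in for the
patch's struct channel_info, and the demo_* function names are invented for
the example.

```c
#include <stddef.h>
#include <sys/epoll.h>

/* Hypothetical stand-in for the patch's struct channel_info. */
struct demo_channel {
	int fd;
	unsigned channel_num;
};

/* Register the channel for EPOLLIN, storing its pointer in the event. */
static int
demo_monitor_add(int epoll_fd, struct demo_channel *chan)
{
	struct epoll_event event = { .events = EPOLLIN };

	event.data.ptr = chan;
	return epoll_ctl(epoll_fd, EPOLL_CTL_ADD, chan->fd, &event);
}

/*
 * Wait for one readable channel and recover its definition from the
 * opaque pointer. Returns NULL on timeout or error.
 */
static struct demo_channel *
demo_monitor_wait(int epoll_fd, int timeout_ms)
{
	struct epoll_event event;

	if (epoll_wait(epoll_fd, &event, 1, timeout_ms) != 1)
		return NULL;
	if (!(event.events & EPOLLIN))
		return NULL;
	return event.data.ptr;
}
```

Because the kernel hands back the same pointer that was registered, the
monitor needs no separate fd-to-channel lookup table.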

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/channel_manager.c | 804 ++++++++++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h | 314 +++++++++++
 examples/vm_power_manager/channel_monitor.c | 231 ++++++++
 examples/vm_power_manager/channel_monitor.h | 102 ++++
 4 files changed, 1451 insertions(+)
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h

diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
new file mode 100644
index 0000000..a14f191
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.c
@@ -0,0 +1,804 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/select.h>
+
+#include <rte_config.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_mempool.h>
+#include <rte_log.h>
+#include <rte_atomic.h>
+#include <rte_spinlock.h>
+
+#include <libvirt/libvirt.h>
+
+#include "channel_manager.h"
+#include "channel_commands.h"
+#include "channel_monitor.h"
+
+
+#define RTE_LOGTYPE_CHANNEL_MANAGER RTE_LOGTYPE_USER1
+
+#define ITERATIVE_BITMASK_CHECK_64(mask_u64b, i) \
+		for (i = 0; mask_u64b; mask_u64b &= ~(1ULL << i++)) \
+		if ((mask_u64b >> i) & 1) \
+
+/* Global pointer to libvirt connection */
+static virConnectPtr global_vir_conn_ptr;
+
+static unsigned char *global_cpumaps;
+static virVcpuInfo *global_vircpuinfo;
+static size_t global_maplen;
+
+static unsigned global_n_host_cpus;
+
+/*
+ * Represents a single Virtual Machine
+ */
+struct virtual_machine_info {
+	char name[CHANNEL_MGR_MAX_NAME_LEN];
+	rte_atomic64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];
+	struct channel_info *channels[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	uint64_t channel_mask;
+	uint8_t num_channels;
+	enum vm_status status;
+	virDomainPtr domainPtr;
+	virDomainInfo info;
+	rte_spinlock_t config_spinlock;
+	LIST_ENTRY(virtual_machine_info) vms_info;
+};
+
+LIST_HEAD(, virtual_machine_info) vm_list_head;
+
+static struct virtual_machine_info *
+find_domain_by_name(const char *name)
+{
+	struct virtual_machine_info *info;
+	LIST_FOREACH(info, &vm_list_head, vms_info) {
+		if (!strncmp(info->name, name, CHANNEL_MGR_MAX_NAME_LEN-1))
+			return info;
+	}
+	return NULL;
+}
+
+static int
+update_pcpus_mask(struct virtual_machine_info *vm_info)
+{
+	virVcpuInfoPtr cpuinfo;
+	unsigned i, j;
+	int n_vcpus;
+	uint64_t mask;
+
+	memset(global_cpumaps, 0, CHANNEL_CMDS_MAX_CPUS*global_maplen);
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		n_vcpus = virDomainGetVcpuPinInfo(vm_info->domainPtr,
+				vm_info->info.nrVirtCpu, global_cpumaps, global_maplen,
+				VIR_DOMAIN_AFFECT_CONFIG);
+		if (n_vcpus < 0) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
+					"in-active VM '%s'\n", vm_info->name);
+			return -1;
+		}
+		goto update_pcpus;
+	}
+
+	memset(global_vircpuinfo, 0, sizeof(*global_vircpuinfo)*
+			CHANNEL_CMDS_MAX_CPUS);
+
+	cpuinfo = global_vircpuinfo;
+
+	n_vcpus = virDomainGetVcpus(vm_info->domainPtr, cpuinfo,
+			CHANNEL_CMDS_MAX_CPUS, global_cpumaps, global_maplen);
+	if (n_vcpus < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
+							"active VM '%s'\n", vm_info->name);
+		return -1;
+	}
+update_pcpus:
+	if (n_vcpus >= CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Number of vCPUs(%d) is out of range "
+				"0...%d\n", n_vcpus, CHANNEL_CMDS_MAX_CPUS-1);
+		return -1;
+	}
+	if (n_vcpus != vm_info->info.nrVirtCpu) {
+		RTE_LOG(INFO, CHANNEL_MANAGER, "Updating the number of vCPUs for VM '%s"
+				"' from %d -> %d\n", vm_info->name, vm_info->info.nrVirtCpu,
+				n_vcpus);
+		vm_info->info.nrVirtCpu = n_vcpus;
+	}
+	for (i = 0; i < vm_info->info.nrVirtCpu; i++) {
+		mask = 0;
+		for (j = 0; j < global_n_host_cpus; j++) {
+			if (VIR_CPU_USABLE(global_cpumaps, global_maplen, i, j) > 0) {
+				mask |= 1ULL << j;
+			}
+		}
+		rte_atomic64_set(&vm_info->pcpu_mask[i], mask);
+	}
+	return 0;
+}
+
+int
+set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask)
+{
+	unsigned i = 0;
+	int flags = VIR_DOMAIN_AFFECT_LIVE|VIR_DOMAIN_AFFECT_CONFIG;
+	struct virtual_machine_info *vm_info;
+	uint64_t mask = core_mask;
+
+	if (vcpu >= CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds max allowable(%d)\n",
+				vcpu, CHANNEL_CMDS_MAX_CPUS-1);
+		return -1;
+	}
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
+				"mask(0x%"PRIx64") for VM '%s', VM is not active\n",
+				vcpu, core_mask, vm_info->name);
+		return -1;
+	}
+
+	if (vcpu >= vm_info->info.nrVirtCpu) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds the assigned number of "
+				"vCPUs(%u)\n", vcpu, vm_info->info.nrVirtCpu);
+		return -1;
+	}
+	memset(global_cpumaps, 0 , CHANNEL_CMDS_MAX_CPUS * global_maplen);
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		if (i >= global_n_host_cpus) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "CPU(%u) exceeds the available "
+					"number of CPUs(%u)\n", i, global_n_host_cpus);
+			return -1;
+		}
+		VIR_USE_CPU(global_cpumaps, i);
+	}
+	if (virDomainPinVcpuFlags(vm_info->domainPtr, vcpu, global_cpumaps,
+			global_maplen, flags) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
+				"mask(0x%"PRIx64") for VM '%s'\n", vcpu, core_mask,
+				vm_info->name);
+		return -1;
+	}
+	rte_atomic64_set(&vm_info->pcpu_mask[vcpu], core_mask);
+	return 0;
+
+}
+
+int
+set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num)
+{
+	uint64_t mask = 1ULL << core_num;
+	return set_pcpus_mask(vm_name, vcpu, mask);
+}
+
+uint64_t
+get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu)
+{
+	struct virtual_machine_info *vm_info =
+			(struct virtual_machine_info *)chan_info->priv_info;
+	return rte_atomic64_read(&vm_info->pcpu_mask[vcpu]);
+}
+
+static inline int
+channel_exists(struct virtual_machine_info *vm_info, unsigned channel_num)
+{
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	if (vm_info->channel_mask & (1ULL << channel_num)) {
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+		return 1;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+
+
+static int
+open_non_blocking_channel(struct channel_info *info)
+{
+	int ret, flags;
+	struct sockaddr_un sock_addr;
+	fd_set soc_fd_set;
+	struct timeval tv;
+
+	info->fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (info->fd == -1) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error(%s) creating socket for '%s'\n",
+				strerror(errno),
+				info->channel_path);
+		return -1;
+	}
+	sock_addr.sun_family = AF_UNIX;
+	memcpy(&sock_addr.sun_path, info->channel_path,
+			strlen(info->channel_path)+1);
+
+	/* Get current flags */
+	flags = fcntl(info->fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) fcntl get flags socket for"
+				" '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	/* Set to Non Blocking */
+	flags |= O_NONBLOCK;
+	if (fcntl(info->fd, F_SETFL, flags) < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) setting non-blocking "
+				"socket for '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	ret = connect(info->fd, (struct sockaddr *)&sock_addr,
+			sizeof(sock_addr));
+	if (ret < 0) {
+		/* ECONNREFUSED error is given when VM is not active */
+		if (errno == ECONNREFUSED) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "VM is not active or has not "
+					"activated its endpoint to channel %s\n",
+					info->channel_path);
+			return -1;
+		}
+		/* Wait for tv_sec if in progress */
+		else if (errno == EINPROGRESS) {
+			tv.tv_sec = 2;
+			tv.tv_usec = 0;
+			FD_ZERO(&soc_fd_set);
+			FD_SET(info->fd, &soc_fd_set);
+			if (select(info->fd+1, NULL, &soc_fd_set, NULL, &tv) <= 0) {
+				RTE_LOG(WARNING, CHANNEL_MANAGER, "Timeout or error on channel "
+						"'%s'\n", info->channel_path);
+				return -1;
+			}
+		} else {
+			/* Any other error */
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) connecting socket"
+					" for '%s'\n", strerror(errno), info->channel_path);
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static int
+setup_channel_info(struct virtual_machine_info **vm_info_dptr,
+		struct channel_info **chan_info_dptr, unsigned channel_num)
+{
+	struct channel_info *chan_info = *chan_info_dptr;
+	struct virtual_machine_info *vm_info = *vm_info_dptr;
+
+	chan_info->channel_num = channel_num;
+	chan_info->priv_info = (void *)vm_info;
+	chan_info->status = CHANNEL_MGR_CHANNEL_DISCONNECTED;
+	if (open_non_blocking_channel(chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not open channel: "
+				"'%s' for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+	}
+	if (add_channel_to_monitor(&chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not add channel: "
+				"'%s' to epoll ctl for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+
+	}
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->num_channels++;
+	vm_info->channel_mask |= 1ULL << channel_num;
+	vm_info->channels[channel_num] = chan_info;
+	chan_info->status = CHANNEL_MGR_CHANNEL_CONNECTED;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+int
+add_all_channels(const char *vm_name)
+{
+	DIR *d;
+	struct dirent *dir;
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char *token, *remaining, *tail_ptr;
+	char socket_name[PATH_MAX];
+	unsigned channel_num;
+	int num_channels_enabled = 0;
+
+	/* verify VM exists */
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' not found"
+				" during channel discovery\n", vm_name);
+		return 0;
+	}
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
+		return 0;
+	}
+	d = opendir(CHANNEL_MGR_SOCKET_PATH);
+	if (d == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error opening directory '%s': %s\n",
+				CHANNEL_MGR_SOCKET_PATH, strerror(errno));
+		return -1;
+	}
+	while ((dir = readdir(d)) != NULL) {
+		if (!strncmp(dir->d_name, ".", 1) ||
+				!strncmp(dir->d_name, "..", 2))
+			continue;
+
+		snprintf(socket_name, sizeof(socket_name), "%s", dir->d_name);
+		remaining = socket_name;
+		/* Extract vm_name from "<vm_name>.<channel_num>" */
+		token = strsep(&remaining, ".");
+		if (remaining == NULL)
+			continue;
+		if (strncmp(vm_name, token, CHANNEL_MGR_MAX_NAME_LEN))
+			continue;
+
+		/* remaining should contain only <channel_num> */
+		errno = 0;
+		channel_num = (unsigned)strtol(remaining, &tail_ptr, 10);
+		if ((errno != 0) || (remaining[0] == '\0') ||
+				tail_ptr == NULL || (*tail_ptr != '\0')) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Malformed channel name "
+					"'%s' found, it should be in the form of "
+					"'<guest_name>.<channel_num>(decimal)'\n",
+					dir->d_name);
+			continue;
+		}
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Channel number(%u) is "
+					"greater than max allowable: %d, skipping '%s%s'\n",
+					channel_num, CHANNEL_CMDS_MAX_VM_CHANNELS-1,
+					CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+			continue;
+		}
+		/* if channel has not been added previously */
+		if (channel_exists(vm_info, channel_num))
+			continue;
+
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s%s'\n", CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+			continue;
+		}
+
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s",
+				CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+
+		if (setup_channel_info(&vm_info, &chan_info, channel_num) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+
+		num_channels_enabled++;
+	}
+	closedir(d);
+	return num_channels_enabled;
+}
+
+int
+add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char socket_path[PATH_MAX];
+	unsigned i;
+	int num_channels_enabled = 0;
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add channels: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
+		return 0;
+	}
+
+	for (i = 0; i < len_channel_list; i++) {
+
+		if (channel_list[i] >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel(%u) is out of range "
+							"0...%d\n", channel_list[i],
+							CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+			continue;
+		}
+		if (channel_exists(vm_info, channel_list[i])) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel already exists, skipping "
+					"'%s.%u'\n", vm_name, channel_list[i]);
+			continue;
+		}
+
+		snprintf(socket_path, sizeof(socket_path), "%s%s.%u",
+				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
+		errno = 0;
+		if (access(socket_path, F_OK) < 0) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Channel path '%s' error: "
+					"%s\n", socket_path, strerror(errno));
+			continue;
+		}
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s'\n", socket_path);
+			continue;
+		}
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s.%u",
+				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
+		if (setup_channel_info(&vm_info, &chan_info, channel_list[i]) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+		num_channels_enabled++;
+
+	}
+	return num_channels_enabled;
+}
+
+int
+remove_channel(struct channel_info **chan_info_dptr)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info = *chan_info_dptr;
+
+	close(chan_info->fd);
+
+	vm_info = (struct virtual_machine_info *)chan_info->priv_info;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->channel_mask &= ~(1ULL << chan_info->channel_num);
+	vm_info->num_channels--;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+	rte_free(chan_info);
+	return 0;
+}
+
+int
+set_channel_status_all(const char *vm_name, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	uint64_t mask;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
+			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+		return 0;
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return 0;
+	}
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		vm_info->channels[i]->status = status;
+		num_channels_changed++;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return num_channels_changed;
+
+}
+
+int
+set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
+			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+		return 0;
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return 0;
+	}
+	for (i = 0; i < len_channel_list; i++) {
+		if (channel_exists(vm_info, channel_list[i])) {
+			rte_spinlock_lock(&(vm_info->config_spinlock));
+			vm_info->channels[channel_list[i]]->status = status;
+			rte_spinlock_unlock(&(vm_info->config_spinlock));
+			num_channels_changed++;
+		}
+	}
+	return num_channels_changed;
+}
+
+int
+get_info_vm(const char *vm_name, struct vm_info *info)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i, channel_num = 0;
+	uint64_t mask;
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+	info->status = CHANNEL_MGR_VM_ACTIVE;
+	if (!virDomainIsActive(vm_info->domainPtr))
+		info->status = CHANNEL_MGR_VM_INACTIVE;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		info->channels[channel_num].channel_num = i;
+		memcpy(info->channels[channel_num].channel_path,
+				vm_info->channels[i]->channel_path, PATH_MAX);
+		info->channels[channel_num].status = vm_info->channels[i]->status;
+		info->channels[channel_num].fd = vm_info->channels[i]->fd;
+		channel_num++;
+	}
+
+	info->num_channels = channel_num;
+	info->num_vcpus = vm_info->info.nrVirtCpu;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+	memcpy(info->name, vm_info->name, sizeof(vm_info->name));
+	for (i = 0; i < info->num_vcpus; i++) {
+		info->pcpu_mask[i] = rte_atomic64_read(&vm_info->pcpu_mask[i]);
+	}
+	return 0;
+}
+
+int
+add_vm(const char *vm_name)
+{
+	struct virtual_machine_info *new_domain;
+	virDomainPtr dom_ptr;
+	int i;
+
+	if (find_domain_by_name(vm_name) != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add VM: VM '%s' "
+				"already exists\n", vm_name);
+		return -1;
+	}
+
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "No connection to hypervisor exists\n");
+		return -1;
+	}
+	dom_ptr = virDomainLookupByName(global_vir_conn_ptr, vm_name);
+	if (dom_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error on VM lookup with libvirt: "
+				"VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	new_domain = rte_malloc("virtual_machine_info", sizeof(*new_domain),
+			CACHE_LINE_SIZE);
+	if (new_domain == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to allocate memory for VM "
+				"info\n");
+		return -1;
+	}
+	new_domain->domainPtr = dom_ptr;
+	if (virDomainGetInfo(new_domain->domainPtr, &new_domain->info) != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get libvirt VM info\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	if (new_domain->info.nrVirtCpu > CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error the number of virtual CPUs(%u) is "
+				"greater than allowable(%d)\n", new_domain->info.nrVirtCpu,
+				CHANNEL_CMDS_MAX_CPUS);
+		rte_free(new_domain);
+		return -1;
+	}
+
+	for (i = 0; i < CHANNEL_CMDS_MAX_CPUS; i++) {
+		rte_atomic64_init(&new_domain->pcpu_mask[i]);
+	}
+	if (update_pcpus_mask(new_domain) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting physical CPU pinning\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	snprintf(new_domain->name, sizeof(new_domain->name), "%s", vm_name);
+	new_domain->channel_mask = 0;
+	new_domain->num_channels = 0;
+
+	if (!virDomainIsActive(dom_ptr))
+		new_domain->status = CHANNEL_MGR_VM_INACTIVE;
+	else
+		new_domain->status = CHANNEL_MGR_VM_ACTIVE;
+
+	rte_spinlock_init(&(new_domain->config_spinlock));
+	LIST_INSERT_HEAD(&vm_list_head, new_domain, vms_info);
+	return 0;
+}
+
+int
+remove_vm(const char *vm_name)
+{
+	struct virtual_machine_info *vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM: VM '%s' "
+				"not found\n", vm_name);
+		return -1;
+	}
+	rte_spinlock_lock(&vm_info->config_spinlock);
+	if (vm_info->num_channels != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM '%s', there are "
+				"%"PRId8" channels still active\n",
+				vm_name, vm_info->num_channels);
+		rte_spinlock_unlock(&vm_info->config_spinlock);
+		return -1;
+	}
+	LIST_REMOVE(vm_info, vms_info);
+	rte_spinlock_unlock(&vm_info->config_spinlock);
+	rte_free(vm_info);
+	return 0;
+}
+
+static void
+disconnect_hypervisor(void)
+{
+	if (global_vir_conn_ptr != NULL) {
+		virConnectClose(global_vir_conn_ptr);
+		global_vir_conn_ptr = NULL;
+	}
+}
+
+static int
+connect_hypervisor(const char *path)
+{
+	if (global_vir_conn_ptr != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection"
+				" already established\n", path);
+		return -1;
+	}
+	global_vir_conn_ptr = virConnectOpen(path);
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error failed to open connection to "
+				"Hypervisor '%s'\n", path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_manager_init(const char *path)
+{
+	int n_cpus;
+	LIST_INIT(&vm_list_head);
+	if (connect_hypervisor(path) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to initialize channel manager\n");
+		return -1;
+	}
+
+	global_maplen = VIR_CPU_MAPLEN(CHANNEL_CMDS_MAX_CPUS);
+
+	global_vircpuinfo = rte_zmalloc(NULL, sizeof(*global_vircpuinfo) *
+			CHANNEL_CMDS_MAX_CPUS, CACHE_LINE_SIZE);
+	if (global_vircpuinfo == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for CPU Info\n");
+		goto error;
+	}
+	global_cpumaps = rte_zmalloc(NULL, CHANNEL_CMDS_MAX_CPUS * global_maplen,
+			CACHE_LINE_SIZE);
+	if (global_cpumaps == NULL) {
+		goto error;
+	}
+
+	n_cpus = virNodeGetCPUMap(global_vir_conn_ptr, NULL, NULL, 0);
+	if (n_cpus <= 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get the number of Host "
+				"CPUs\n");
+		goto error;
+	}
+	global_n_host_cpus = (unsigned)n_cpus;
+
+	if (global_n_host_cpus > CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "The number of host CPUs(%u) exceeds the "
+				"maximum of %u\n", global_n_host_cpus, CHANNEL_CMDS_MAX_CPUS);
+		goto error;
+
+	}
+
+	return 0;
+error:
+	disconnect_hypervisor();
+	return -1;
+}
+
+void
+channel_manager_exit(void)
+{
+	unsigned i;
+	uint64_t mask;
+	struct virtual_machine_info *vm_info;
+
+	while (!LIST_EMPTY(&vm_list_head)) {
+		vm_info = LIST_FIRST(&vm_list_head);
+		rte_spinlock_lock(&(vm_info->config_spinlock));
+
+		mask = vm_info->channel_mask;
+		ITERATIVE_BITMASK_CHECK_64(mask, i) {
+			remove_channel_from_monitor(vm_info->channels[i]);
+			close(vm_info->channels[i]->fd);
+			rte_free(vm_info->channels[i]);
+		}
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+		LIST_REMOVE(vm_info, vms_info);
+		rte_free(vm_info);
+	}
+
+	if (global_cpumaps != NULL)
+		rte_free(global_cpumaps);
+	if (global_vircpuinfo != NULL)
+		rte_free(global_vircpuinfo);
+	disconnect_hypervisor();
+}
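
A note on the '<vm_name>.<channel_num>' parsing in add_all_channels() above:
the checks could be collected into one helper, which makes them easy to
exercise outside the application. The following is only a sketch under stated
assumptions, not part of the patch: parse_channel_name() and its max_channels
parameter are hypothetical names, and it uses strtoul() with base 10 instead
of the strtol() call in the patch. Like strsep() in the patch, it splits the
file name at the first '.'.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch.
 * Parses "<vm_name>.<channel_num>" and stores the channel number.
 * Returns 0 on success, -1 if the name does not match or is malformed.
 */
static int
parse_channel_name(const char *file_name, const char *vm_name,
		unsigned max_channels, unsigned *channel_num)
{
	const char *dot;
	char *tail;
	unsigned long val;
	size_t name_len;

	assert(file_name != NULL && vm_name != NULL && channel_num != NULL);

	/* Split at the first '.', as strsep() does in add_all_channels() */
	dot = strchr(file_name, '.');
	if (dot == NULL)
		return -1;
	name_len = (size_t)(dot - file_name);
	if (name_len != strlen(vm_name) ||
			strncmp(file_name, vm_name, name_len) != 0)
		return -1;

	/* Require a leading digit: strtoul() alone accepts signs and spaces */
	if (!isdigit((unsigned char)dot[1]))
		return -1;
	errno = 0;
	val = strtoul(dot + 1, &tail, 10);
	if (errno != 0 || *tail != '\0' || val >= max_channels)
		return -1;

	*channel_num = (unsigned)val;
	return 0;
}
```

In the patch, max_channels would presumably be CHANNEL_CMDS_MAX_VM_CHANNELS,
and a return of -1 would map to the existing 'continue' in the readdir() loop.
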
diff --git a/examples/vm_power_manager/channel_manager.h b/examples/vm_power_manager/channel_manager.h
new file mode 100644
index 0000000..12c29c3
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.h
@@ -0,0 +1,314 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MANAGER_H_
+#define CHANNEL_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <linux/limits.h>
+#include <rte_atomic.h>
+#include "channel_commands.h"
+
+/* Maximum name length including '\0' terminator */
+#define CHANNEL_MGR_MAX_NAME_LEN    64
+
+/* Maximum number of channels to each Virtual Machine */
+#define CHANNEL_MGR_MAX_CHANNELS    64
+
+/* Hypervisor Path for libvirt(qemu/KVM) */
+#define CHANNEL_MGR_DEFAULT_HV_PATH "qemu:///system"
+
+/* File socket directory */
+#define CHANNEL_MGR_SOCKET_PATH     "/tmp/powermonitor/"
+
+/* Communication Channel Status */
+enum channel_status { CHANNEL_MGR_CHANNEL_DISCONNECTED = 0,
+	CHANNEL_MGR_CHANNEL_CONNECTED,
+	CHANNEL_MGR_CHANNEL_DISABLED,
+	CHANNEL_MGR_CHANNEL_PROCESSING};
+
+/* VM libvirt(qemu/KVM) connection status */
+enum vm_status { CHANNEL_MGR_VM_INACTIVE = 0, CHANNEL_MGR_VM_ACTIVE};
+
+/*
+ *  Represents a single and exclusive VM channel that exists between a guest and
+ *  the host.
+ */
+struct channel_info {
+	char channel_path[PATH_MAX]; /**< Path to host socket */
+	volatile uint32_t status;    /**< Connection status(enum channel_status) */
+	int fd;                      /**< AF_UNIX socket fd */
+	unsigned channel_num;        /**< CHANNEL_MGR_SOCKET_PATH/<vm_name>.channel_num */
+	void *priv_info;             /**< Pointer to private info, do not modify */
+};
+
+/* Represents a single VM instance used to return internal information about
+ * a VM */
+struct vm_info {
+	char name[CHANNEL_MGR_MAX_NAME_LEN];          /**< VM name */
+	enum vm_status status;                        /**< libvirt status */
+	uint64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];    /**< pCPU mask for each vCPU */
+	unsigned num_vcpus;                           /**< number of vCPUS */
+	struct channel_info channels[CHANNEL_MGR_MAX_CHANNELS]; /**< Array of channel_info */
+	unsigned num_channels;                        /**< Number of channels */
+};
+
+/**
+ * Initialize the Channel Manager resources and connect to the Hypervisor
+ * specified in path.
+ * This must be successfully called first before calling any other functions.
+ * It must only be called once.
+ *
+ * @param path
+ *  Must be a local path, e.g. qemu:///system.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_manager_init(const char *path);
+
+/**
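+ * Free resources associated with the Channel Manager.
+ * All channels are removed from the monitor and closed, all Virtual Machine
+ * entries are freed and the connection to the Hypervisor is closed.
+ * This function takes no parameters.
+ *
+ * @return
+ *  None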
+ */
+void channel_manager_exit(void);
+
+/**
+ * Get the Physical CPU mask for the vCPU of a VM lcore channel. The mask is
+ * the return value of the function.
+ * It is not thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info
+ *
+ * @param vcpu
+ *  The virtual CPU to query.
+ *
+ *
+ * @return
+ *  - 0 on error.
+ *  - >0 on success.
+ */
+uint64_t get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu);
+
+/**
+ * Set the Physical CPU mask for the specified vCPU.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup
+ *
+ * @param vcpu
+ *  The virtual CPU to set.
+ *
+ * @param core_mask
+ *  The core mask of the physical CPU(s) to bind the vCPU
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask);
+
+/**
+ * Set the Physical CPU for the specified vCPU.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup
+ *
+ * @param vcpu
+ *  The virtual CPU to set.
+ *
+ * @param core_num
+ *  The core number of the physical CPU to bind the vCPU
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num);
+/**
+ * Add a VM as specified by name to the Channel Manager. The name must
+ * correspond to a valid libvirt domain name.
+ * This is required prior to adding channels.
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_vm(const char *name);
+
+/**
+ * Remove a previously added Virtual Machine from the Channel Manager
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_vm(const char *name);
+
+/**
+ * Add all available channels to the VM as specified by name.
+ * Only channels in the form of paths
+ * (CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number>) will be parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ */
+int add_all_channels(const char *vm_name);
+
+/**
+ * Add the channel numbers in channel_list to the domain specified by name.
+ * Only channels in the form of paths
+ * (CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number>) will be parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to add channels.
+ *
+ * @param channel_list
+ *  Pointer to list of unsigned integers, representing the channel number to add
+ *  It must be allocated outside of this function.
+ *
+ * @param num_channels
+ *  The number of channel numbers in channel_list
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ *  - 0 for error
+ */
+int add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned num_channels);
+
+/**
+ * Remove a channel definition from the channel manager. This must only be
+ * called from the channel monitor thread.
+ *
+ * @param chan_info
+ *  Pointer to a valid struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel(struct channel_info **chan_info_dptr);
+
+/**
+ * For all channels associated with a Virtual Machine name, update the
+ * connection status. Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
+ * CHANNEL_MGR_CHANNEL_DISABLED only.
+ *
+ *
+ * @param name
+ *  Virtual Machine name to modify all channels.
+ *
+ * @param status
+ *  The status to set each channel
+ *
+ * The status is applied to every channel currently added to the VM, no
+ * channel list is required.
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status_all(const char *name, enum channel_status status);
+
+/**
+ * Update the connection status of each channel in channel_list for the VM.
+ * Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
+ * CHANNEL_MGR_CHANNEL_DISABLED only.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name whose channels are to be modified.
+ *
+ * @param channel_list
+ *  Pointer to a caller-allocated list of the channel numbers to modify.
+ *
+ * @param len_channel_list
+ *  The number of channel numbers in channel_list
+ *
+ * @param status
+ *  The status to set each channel
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status);
+
+/**
+ * Populates the struct vm_info pointed to by info for the VM named vm_name.
+ *
+ * @param vm_name
+ *  The name of the virtual machine to lookup.
+ *
+ * @param info
+ *  Pointer to a struct vm_info, this must be allocated prior to calling this
+ *  function.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int get_info_vm(const char *vm_name, struct vm_info *info);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_MANAGER_H_ */
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
new file mode 100644
index 0000000..3674c7c
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -0,0 +1,231 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <signal.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/epoll.h>
+#include <sys/queue.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+
+
+#include "channel_monitor.h"
+#include "channel_commands.h"
+#include "channel_manager.h"
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+
+static volatile unsigned run_loop = 1;
+static int global_event_fd;
+static struct epoll_event *global_events_list;
+
+void channel_monitor_exit(void)
+{
+	run_loop = 0;
+	rte_free(global_events_list);
+}
+
+static int
+process_request(struct channel_packet *pkt, struct channel_info *chan_info)
+{
+	uint64_t core_mask;
+
+	if (chan_info == NULL)
+		return -1;
+
+	if (rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_CONNECTED,
+			CHANNEL_MGR_CHANNEL_PROCESSING) == 0)
+		return -1;
+	if (pkt->command == CPU_POWER) {
+		core_mask = get_pcpus_mask(chan_info, pkt->resource_id);
+		if (core_mask == 0) {
+			RTE_LOG(ERR, CHANNEL_MONITOR, "Error getting physical CPU mask for "
+					"channel '%s' using vCPU(%u)\n", chan_info->channel_path,
+					(unsigned)pkt->resource_id);
+			rte_atomic32_cmpset(&(chan_info->status),
+					CHANNEL_MGR_CHANNEL_PROCESSING,
+					CHANNEL_MGR_CHANNEL_CONNECTED);
+			return -1;
+		}
+		if (__builtin_popcountll(core_mask) == 1) {
+			unsigned core_num = __builtin_ffsll(core_mask) - 1;
+			switch (pkt->unit) {
+			case(CPU_POWER_SCALE_MIN):
+					power_manager_scale_core_min(core_num);
+			break;
+			case(CPU_POWER_SCALE_MAX):
+					power_manager_scale_core_max(core_num);
+			break;
+			case(CPU_POWER_SCALE_DOWN):
+					power_manager_scale_core_down(core_num);
+			break;
+			case(CPU_POWER_SCALE_UP):
+					power_manager_scale_core_up(core_num);
+			break;
+			default:
+				break;
+			}
+		} else {
+			switch (pkt->unit) {
+			case(CPU_POWER_SCALE_MIN):
+					power_manager_scale_mask_min(core_mask);
+			break;
+			case(CPU_POWER_SCALE_MAX):
+					power_manager_scale_mask_max(core_mask);
+			break;
+			case(CPU_POWER_SCALE_DOWN):
+					power_manager_scale_mask_down(core_mask);
+			break;
+			case(CPU_POWER_SCALE_UP):
+					power_manager_scale_mask_up(core_mask);
+			break;
+			default:
+				break;
+			}
+
+		}
+	}
+	/* Return is not checked as channel status may have been set to DISABLED
+	 * from management thread
+	 */
+	rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_PROCESSING,
+			CHANNEL_MGR_CHANNEL_CONNECTED);
+	return 0;
+
+}
+
+int
+add_channel_to_monitor(struct channel_info **chan_info)
+{
+	struct channel_info *info = *chan_info;
+	struct epoll_event event;
+	event.events = EPOLLIN;
+	event.data.ptr = info;
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_ADD, info->fd, &event) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to add channel '%s' "
+				"to epoll\n", info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+remove_channel_from_monitor(struct channel_info *chan_info)
+{
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_DEL, chan_info->fd, NULL) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to remove channel '%s' "
+				"from epoll\n", chan_info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_monitor_init(void)
+{
+	global_event_fd = epoll_create1(0);
+	if (global_event_fd < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Error creating epoll context with "
+				"error %s\n", strerror(errno));
+		return -1;
+	}
+	global_events_list = rte_malloc("epoll_events", sizeof(*global_events_list)
+			* MAX_EVENTS, CACHE_LINE_SIZE);
+	if (global_events_list == NULL) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for "
+				"epoll events\n");
+		return -1;
+	}
+	return 0;
+}
+
+void
+run_channel_monitor(void)
+{
+	while (run_loop) {
+		int n_events, i;
+		n_events = epoll_wait(global_event_fd, global_events_list,
+				MAX_EVENTS, 1);
+		if (!run_loop)
+			break;
+		for (i = 0; i < n_events; i++) {
+			struct channel_info *chan_info = (struct channel_info *)
+					global_events_list[i].data.ptr;
+			if ((global_events_list[i].events & EPOLLERR) ||
+					(global_events_list[i].events & EPOLLHUP)) {
+				RTE_LOG(DEBUG, CHANNEL_MONITOR, "Remote closed connection for "
+						"channel '%s'\n", chan_info->channel_path);
+				remove_channel(&chan_info);
+				continue;
+			}
+			if (global_events_list[i].events & EPOLLIN) {
+
+				int n_bytes, err = 0;
+				struct channel_packet pkt;
+				void *buffer = &pkt;
+				int buffer_len = sizeof(pkt);
+				while (buffer_len > 0) {
+					n_bytes = read(chan_info->fd, buffer, buffer_len);
+					if (n_bytes == buffer_len)
+						break;
+					if (n_bytes <= 0) {
+						err = (n_bytes == 0) ? ECONNRESET : errno;
+						RTE_LOG(DEBUG, CHANNEL_MONITOR, "Received error on "
+								"channel '%s' read: %s\n",
+								chan_info->channel_path, strerror(err));
+						remove_channel(&chan_info);
+						break;
+					}
+					buffer = (char *)buffer + n_bytes;
+					buffer_len -= n_bytes;
+				}
+				if (!err)
+					process_request(&pkt, chan_info);
+			}
+		}
+	}
+}
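
The read loop in run_channel_monitor() has to assemble one fixed-size
struct channel_packet from a stream socket, where read() may return fewer
bytes than requested, 0 when the peer has closed, or -1 on error. One possible
shape for such a loop is sketched below. This is an illustration only:
read_full() is a hypothetical helper that is not part of the patch, and
retrying on EINTR is an assumption about the desired behaviour. On a
non-blocking descriptor EAGAIN is handed back to the caller like any other
error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly len bytes from fd into buf.
 * Returns 1 when the buffer is full, 0 if the peer closed before that,
 * and -1 on error with errno set by read().
 */
static int
read_full(int fd, void *buf, size_t len)
{
	char *p = buf;

	assert(buf != NULL);
	while (len > 0) {
		ssize_t n = read(fd, p, len);

		if (n == 0)
			return 0;	/* end of stream, packet incomplete */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 1;
}
```

A caller would process the packet only on a return of 1 and would remove the
channel on 0 or -1.
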
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
new file mode 100644
index 0000000..c138607
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -0,0 +1,102 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MONITOR_H_
+#define CHANNEL_MONITOR_H_
+
+#include "channel_manager.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Channel Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_monitor_init(void);
+
+/**
+ * Run the channel monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_channel_monitor(void);
+
+/**
+ * Exit the Channel Monitor, exiting the epoll_wait loop and events processing.
+ * Declared void, so no status is reported to the caller.
+ *
+ * @return
+ *  None
+ */
+void channel_monitor_exit(void);
+
+/**
+ * Add an open channel to monitor via epoll. A pointer to struct channel_info
+ * will be registered with epoll for event processing.
+ * It is thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info pointer.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_channel_to_monitor(struct channel_info **chan_info);
+
+/**
+ * Remove a previously added channel from epoll control.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel_from_monitor(struct channel_info *chan_info);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* CHANNEL_MONITOR_H_ */
-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v4 02/10] VM Power Management CLI(Host).
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 03/10] CPU Frequency Power Management(Host) Alan Carew
                         ` (10 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

The CLI is used for administering the channel monitor and manager and
for manually setting the CPU frequency on the host.

Supports the following commands:
 add_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 rm_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 add_channels [Fixed STRING]: add_channels <vm_name> <list>|all, add
  communication channels for the specified VM, the virtio channels must be
  enabled in the VM configuration(qemu/libvirt) and the associated VM must be
  active. <list> is a comma-separated list of channel numbers to add, using the
  keyword 'all' will attempt to add all channels for the VM

 set_channel_status [Fixed STRING]:
  set_channel_status <vm_name> <list>|all enabled|disabled, enable or disable
  the communication channels in list(comma-separated) for the specified VM,
  alternatively list can be replaced with keyword 'all'. Disabled channels will
  still receive packets on the host, however the commands they specify will be
  ignored. Set status to 'enabled' to begin processing requests again.

 show_vm [Fixed STRING]: show_vm <vm_name>, prints the information on the
  specified VM(s), the information lists the number of vCPUs, the pinning to
  pCPU(s) as a bit mask, along with any communication channels associated with
  each VM

 show_cpu_freq_mask [Fixed STRING]: show_cpu_freq_mask <mask>, Get the current
  frequency for each core specified in the mask

 set_cpu_freq_mask [Fixed STRING]: set_cpu_freq_mask <core_mask> <up|down|min|max>,
  Set the current frequency for the cores specified in <core_mask> by scaling
  each up/down/min/max.

 show_cpu_freq [Fixed STRING]: Get the current frequency for the specified core

 set_cpu_freq [Fixed STRING]: set_cpu_freq <core_num> <up|down|min|max>,
  Set the current frequency for the specified core by scaling up/down/min/max

 quit [Fixed STRING]: close the application

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
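Note on the <list> argument of add_channels and set_channel_status: the
sketch below shows one way to parse a comma-separated channel list with bounds
checking. It is a standalone illustration, not the code in this patch.
MAX_VM_CHANNELS and parse_channel_list() are hypothetical names, and the limit
of 64 is an assumed stand-in for CHANNEL_CMDS_MAX_VM_CHANNELS.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for CHANNEL_CMDS_MAX_VM_CHANNELS (value assumed). */
#define MAX_VM_CHANNELS 64

/*
 * Parse a comma-separated list such as "0,2,5" into out[].
 * Returns the number of entries parsed, or -1 if the input is malformed,
 * a channel number is out of range, or the list has more than max_entries.
 */
static int
parse_channel_list(const char *list, unsigned *out, unsigned max_entries)
{
	unsigned count = 0;
	const char *p = list;

	assert(list != NULL && out != NULL);

	for (;;) {
		char *end;
		long val;

		errno = 0;
		val = strtol(p, &end, 10);
		/* end == p means no digits were consumed (empty token). */
		if (end == p || errno != 0)
			return -1;
		/* Valid channel numbers are 0 .. MAX_VM_CHANNELS - 1. */
		if (val < 0 || val >= MAX_VM_CHANNELS)
			return -1;
		/* Never write past the caller's array. */
		if (count >= max_entries)
			return -1;
		out[count++] = (unsigned)val;

		if (*end == '\0')
			break;
		if (*end != ',')
			return -1;
		p = end + 1;
	}
	return (int)count;
}
```

Rejecting the whole list on the first bad entry keeps the caller simple: it
either gets a fully validated array or an error, never a partial result.
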
 examples/vm_power_manager/vm_power_cli.c | 669 +++++++++++++++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h |  47 +++
 2 files changed, 716 insertions(+)
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h

diff --git a/examples/vm_power_manager/vm_power_cli.c b/examples/vm_power_manager/vm_power_cli.c
new file mode 100644
index 0000000..e162e88
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.c
@@ -0,0 +1,669 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <string.h>
+#include <termios.h>
+#include <errno.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+
+#include "vm_power_cli.h"
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "channel_commands.h"
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+		struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+struct cmd_show_vm_result {
+	cmdline_fixed_string_t show_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_show_vm_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_show_vm_result *res = parsed_result;
+	struct vm_info info;
+	unsigned i;
+
+	if (get_info_vm(res->vm_name, &info) != 0)
+		return;
+	cmdline_printf(cl, "VM: '%s', status = ", info.name);
+	if (info.status == CHANNEL_MGR_VM_ACTIVE)
+		cmdline_printf(cl, "ACTIVE\n");
+	else
+		cmdline_printf(cl, "INACTIVE\n");
+	cmdline_printf(cl, "Channels %u\n", info.num_channels);
+	for (i = 0; i < info.num_channels; i++) {
+		cmdline_printf(cl, "  [%u]: %s, status = ", i,
+				info.channels[i].channel_path);
+		switch (info.channels[i].status) {
+		case CHANNEL_MGR_CHANNEL_CONNECTED:
+			cmdline_printf(cl, "CONNECTED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_DISCONNECTED:
+			cmdline_printf(cl, "DISCONNECTED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_DISABLED:
+			cmdline_printf(cl, "DISABLED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_PROCESSING:
+			cmdline_printf(cl, "PROCESSING\n");
+			break;
+		default:
+			cmdline_printf(cl, "UNKNOWN\n");
+			break;
+		}
+	}
+	cmdline_printf(cl, "Virtual CPU(s): %u\n", info.num_vcpus);
+	for (i = 0; i < info.num_vcpus; i++) {
+		cmdline_printf(cl, "  [%u]: Physical CPU Mask 0x%"PRIx64"\n", i,
+				info.pcpu_mask[i]);
+	}
+}
+
+
+
+cmdline_parse_token_string_t cmd_vm_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+				show_vm, "show_vm");
+cmdline_parse_token_string_t cmd_show_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_show_vm_set = {
+	.f = cmd_show_vm_parsed,
+	.data = NULL,
+	.help_str = "show_vm <vm_name>, prints the information on the "
+			"specified VM(s), the information lists the number of vCPUs, the "
+			"pinning to pCPU(s) as a bit mask, along with any communication "
+			"channels associated with each VM",
+	.tokens = {
+		(void *)&cmd_vm_show,
+		(void *)&cmd_show_vm_name,
+		NULL,
+	},
+};
+
+/* *** vCPU to pCPU mapping operations *** */
+struct cmd_set_pcpu_mask_result {
+    cmdline_fixed_string_t set_pcpu_mask;
+    cmdline_fixed_string_t vm_name;
+    uint8_t vcpu;
+    uint64_t core_mask;
+};
+
+static void
+cmd_set_pcpu_mask_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_pcpu_mask_result *res = parsed_result;
+	if (set_pcpus_mask(res->vm_name, res->vcpu, res->core_mask) == 0)
+		cmdline_printf(cl, "Pinned vCPU(%"PRId8") to pCPU core "
+				"mask(0x%"PRIx64")\n", res->vcpu, res->core_mask);
+	else
+		cmdline_printf(cl, "Unable to pin vCPU(%"PRId8") to pCPU core "
+				"mask(0x%"PRIx64")\n", res->vcpu, res->core_mask);
+}
+
+cmdline_parse_token_string_t cmd_set_pcpu_mask =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				set_pcpu_mask, "set_pcpu_mask");
+cmdline_parse_token_string_t cmd_set_pcpu_mask_vm_name =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				vm_name, NULL);
+cmdline_parse_token_num_t set_pcpu_mask_vcpu =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				vcpu, UINT8);
+cmdline_parse_token_num_t set_pcpu_mask_core_mask =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				core_mask, UINT64);
+
+
+cmdline_parse_inst_t cmd_set_pcpu_mask_set = {
+	.f = cmd_set_pcpu_mask_parsed,
+	.data = NULL,
+	.help_str = "set_pcpu_mask <vm_name> <vcpu> <mask>, Set the binding "
+			"of Virtual CPU on VM to the Physical CPU mask.",
+	.tokens = {
+		(void *)&cmd_set_pcpu_mask,
+		(void *)&cmd_set_pcpu_mask_vm_name,
+		(void *)&set_pcpu_mask_vcpu,
+		(void *)&set_pcpu_mask_core_mask,
+		NULL,
+	},
+};
+
+struct cmd_set_pcpu_result {
+    cmdline_fixed_string_t set_pcpu;
+    cmdline_fixed_string_t vm_name;
+    uint8_t vcpu;
+    uint8_t core;
+};
+
+static void
+cmd_set_pcpu_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_pcpu_result *res = parsed_result;
+	if (set_pcpu(res->vm_name, res->vcpu, res->core) == 0)
+		cmdline_printf(cl, "Pinned vCPU(%"PRId8") to pCPU "
+				"core(%"PRId8")\n", res->vcpu, res->core);
+	else
+		cmdline_printf(cl, "Unable to pin vCPU(%"PRId8") to pCPU "
+				"core(%"PRId8")\n", res->vcpu, res->core);
+}
+
+cmdline_parse_token_string_t cmd_set_pcpu =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_result,
+				set_pcpu, "set_pcpu");
+cmdline_parse_token_string_t cmd_set_pcpu_vm_name =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_result,
+				vm_name, NULL);
+cmdline_parse_token_num_t set_pcpu_vcpu =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_result,
+				vcpu, UINT8);
+cmdline_parse_token_num_t set_pcpu_core =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_result,
+				core, UINT8);
+
+
+cmdline_parse_inst_t cmd_set_pcpu_set = {
+	.f = cmd_set_pcpu_parsed,
+	.data = NULL,
+	.help_str = "set_pcpu <vm_name> <vcpu> <pcpu>, Set the binding "
+			"of Virtual CPU on VM to the Physical CPU.",
+	.tokens = {
+		(void *)&cmd_set_pcpu,
+		(void *)&cmd_set_pcpu_vm_name,
+		(void *)&set_pcpu_vcpu,
+		(void *)&set_pcpu_core,
+		NULL,
+	},
+};
+
+struct cmd_vm_op_result {
+	cmdline_fixed_string_t op_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_vm_op_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_vm_op_result *res = parsed_result;
+
+	if (!strcmp(res->op_vm, "add_vm")) {
+		if (add_vm(res->vm_name) < 0)
+			cmdline_printf(cl, "Unable to add VM '%s'\n", res->vm_name);
+	} else if (remove_vm(res->vm_name) < 0)
+		cmdline_printf(cl, "Unable to remove VM '%s'\n", res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_vm_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			op_vm, "add_vm#rm_vm");
+cmdline_parse_token_string_t cmd_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_vm_op_set = {
+	.f = cmd_vm_op_parsed,
+	.data = NULL,
+	.help_str = "add_vm|rm_vm <name>, add a VM for "
+			"subsequent operations with the CLI or remove a previously added "
+			"VM from the VM Power Manager",
+	.tokens = {
+		(void *)&cmd_vm_op,
+		(void *)&cmd_vm_name,
+		NULL,
+	},
+};
+
+/* *** VM channel operations *** */
+struct cmd_channels_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+};
+static void
+cmd_channels_op_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num, i;
+	int channels_added;
+	unsigned channel_list[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_op_result *res = parsed_result;
+
+	if (!strcmp(res->channel_list, "all")) {
+		channels_added = add_all_channels(res->vm_name);
+		cmdline_printf(cl, "Added %d channels for VM '%s'\n",
+				channels_added, res->vm_name);
+		return;
+	}
+
+	remaining = res->channel_list;
+	while (num_channels < CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (tail_ptr == token) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "Channel number '%u' exceeds the maximum number "
+					"of allowable channels(%u) for VM '%s'\n", channel_num,
+					CHANNEL_CMDS_MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	for (i = 0; i < num_channels; i++)
+		cmdline_printf(cl, "[%u]: Adding channel %u\n", i, channel_list[i]);
+
+	channels_added = add_channels(res->vm_name, channel_list,
+			num_channels);
+	cmdline_printf(cl, "Enabled %d channels for '%s'\n", channels_added,
+			res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+				op, "add_channels");
+cmdline_parse_token_string_t cmd_channels_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			channel_list, NULL);
+
+cmdline_parse_inst_t cmd_channels_op_set = {
+	.f = cmd_channels_op_parsed,
+	.data = NULL,
+	.help_str = "add_channels <vm_name> <list>|all, add "
+			"communication channels for the specified VM, the "
+			"virtio channels must be enabled in the VM "
+			"configuration(qemu/libvirt) and the associated VM must be active. "
+			"<list> is a comma-separated list of channel numbers to add, using "
+			"the keyword 'all' will attempt to add all channels for the VM",
+	.tokens = {
+		(void *)&cmd_channels_op,
+		(void *)&cmd_channels_vm_name,
+		(void *)&cmd_channels_list,
+		NULL,
+	},
+};
+
+struct cmd_channels_status_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+	cmdline_fixed_string_t status;
+};
+
+static void
+cmd_channels_status_op_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num;
+	int changed;
+	unsigned channel_list[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_status_op_result *res = parsed_result;
+	enum channel_status status;
+
+	if (!strcmp(res->status, "enabled"))
+		status = CHANNEL_MGR_CHANNEL_CONNECTED;
+	else
+		status = CHANNEL_MGR_CHANNEL_DISABLED;
+
+	if (!strcmp(res->channel_list, "all")) {
+		changed = set_channel_status_all(res->vm_name, status);
+		cmdline_printf(cl, "Updated status of %d channels "
+				"for VM '%s'\n", changed, res->vm_name);
+		return;
+	}
+	remaining = res->channel_list;
+	while (num_channels < CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (tail_ptr == token) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "%u exceeds the maximum number of allowable "
+					"channels(%u) for VM '%s'\n", channel_num,
+					CHANNEL_CMDS_MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	changed = set_channel_status(res->vm_name, channel_list, num_channels,
+			status);
+	cmdline_printf(cl, "Updated status of %d channels "
+					"for VM '%s'\n", changed, res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_status_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+				op, "set_channel_status");
+cmdline_parse_token_string_t cmd_channels_status_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_status_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			channel_list, NULL);
+cmdline_parse_token_string_t cmd_channels_status =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			status, "enabled#disabled");
+
+cmdline_parse_inst_t cmd_channels_status_op_set = {
+	.f = cmd_channels_status_op_parsed,
+	.data = NULL,
+	.help_str = "set_channel_status <vm_name> <list>|all enabled|disabled, "
+			" enable or disable the communication channels in "
+			"list(comma-separated) for the specified VM, alternatively list can"
+			" be replaced with keyword 'all'. Disabled channels will still "
+			"receive packets on the host, however the commands they specify "
+			"will be ignored. Set status to 'enabled' to begin processing "
+			"requests again.",
+	.tokens = {
+		(void *)&cmd_channels_status_op,
+		(void *)&cmd_channels_status_vm_name,
+		(void *)&cmd_channels_status_list,
+		(void *)&cmd_channels_status,
+		NULL,
+	},
+};
+
+/* *** CPU Frequency operations *** */
+struct cmd_show_cpu_freq_mask_result {
+	cmdline_fixed_string_t show_cpu_freq_mask;
+	uint64_t core_mask;
+};
+
+static void
+cmd_show_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_mask_result *res = parsed_result;
+	unsigned i;
+	uint64_t mask = res->core_mask;
+	uint32_t freq;
+	for (i = 0; mask; mask &= ~(1ULL << i++)) {
+		if ((mask >> i) & 1) {
+			freq = power_manager_get_current_frequency(i);
+			if (freq > 0)
+				cmdline_printf(cl, "Core %u: %"PRIu32"\n", i, freq);
+		}
+	}
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			show_cpu_freq_mask, "show_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_show_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			core_mask, UINT64);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_mask_set = {
+	.f = cmd_show_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "show_cpu_freq_mask <mask>, Get the current frequency for each "
+			"core specified in the mask",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq_mask,
+		(void *)&cmd_show_cpu_freq_mask_core_mask,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_mask_result {
+	cmdline_fixed_string_t set_cpu_freq_mask;
+	uint64_t core_mask;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	struct cmd_set_cpu_freq_mask_result *res = parsed_result;
+	int ret = -1;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_mask_up(res->core_mask);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_mask_down(res->core_mask);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_mask_min(res->core_mask);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_mask_max(res->core_mask);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core_mask(0x%"PRIx64") '%s' , not "
+				"all cores specified have been scaled\n",
+				res->core_mask, res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			set_cpu_freq_mask, "set_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_set_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			core_mask, UINT64);
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask_result =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_mask_set = {
+	.f = cmd_set_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq_mask <core_mask> <up|down|min|max>, Set the current "
+			"frequency for the cores specified in <core_mask> by scaling "
+			"each up/down/min/max.",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq_mask,
+		(void *)&cmd_set_cpu_freq_mask_core_mask,
+		(void *)&cmd_set_cpu_freq_mask_result,
+		NULL,
+	},
+};
+
+
+
+struct cmd_show_cpu_freq_result {
+	cmdline_fixed_string_t show_cpu_freq;
+	uint8_t core_num;
+};
+
+static void
+cmd_show_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_result *res = parsed_result;
+	uint32_t curr_freq = power_manager_get_current_frequency(res->core_num);
+	if (curr_freq == 0) {
+		cmdline_printf(cl, "Unable to get frequency for core %u\n",
+				res->core_num);
+		return;
+	}
+	cmdline_printf(cl, "Core %u frequency: %"PRIu32"\n", res->core_num,
+			curr_freq);
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_result,
+			show_cpu_freq, "show_cpu_freq");
+
+cmdline_parse_token_num_t cmd_show_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_result,
+			core_num, UINT8);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_set = {
+	.f = cmd_show_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "Get the current frequency for the specified core",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq,
+		(void *)&cmd_show_cpu_freq_core_num,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t core_num;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_core_up(res->core_num);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_core_down(res->core_num);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_core_min(res->core_num);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_core_max(res->core_num);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core(%u) '%s'\n", res->core_num,
+				res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			core_num, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_vm_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_status_op_set,
+		(cmdline_parse_inst_t *)&cmd_show_vm_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_pcpu_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_pcpu_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+
+	cl = cmdline_stdin_new(main_ctx, "vmpower> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/vm_power_cli.h b/examples/vm_power_manager/vm_power_cli.h
new file mode 100644
index 0000000..deccd51
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.h
@@ -0,0 +1,47 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.9.3

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v4 03/10] CPU Frequency Power Management(Host).
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 02/10] VM Power Management CLI(Host) Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 04/10] VM Power Management application and Makefile Alan Carew
                         ` (9 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

A wrapper around librte_power (using ACPI cpufreq) that adds locking around the
non-thread-safe library, allowing frequency changes based on core masks and
core numbers from both the CLI thread and the epoll monitor thread.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
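Note on the locking scheme described above: the sketch below illustrates a
per-core lock around a non-thread-safe scaling call, applied to each bit of a
core mask. It is a standalone approximation under stated assumptions, not the
code in this patch. A C11 atomic_flag stands in for rte_spinlock_t, a counter
stands in for the librte_power call, and MAX_CPUS, cores, enabled_cpus and all
function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for POWER_MGR_MAX_CPUS (value assumed). */
#define MAX_CPUS 64

struct core_freq_info {
	atomic_flag lock;   /* stand-in for rte_spinlock_t */
	unsigned freq_idx;  /* stand-in for state owned by librte_power */
};

static struct core_freq_info cores[MAX_CPUS];
static uint64_t enabled_cpus;

/* Reset all per-core state and record which cores may be scaled. */
static void
power_sketch_init(uint64_t mask)
{
	unsigned i;

	for (i = 0; i < MAX_CPUS; i++) {
		atomic_flag_clear(&cores[i].lock);
		cores[i].freq_idx = 0;
	}
	enabled_cpus = mask;
}

static void
core_lock(struct core_freq_info *c)
{
	assert(c != NULL);
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock,
			memory_order_acquire))
		;
}

static void
core_unlock(struct core_freq_info *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Scale one core up; returns 0 on success, -1 if the core is invalid. */
static int
scale_core_up(unsigned core)
{
	if (core >= MAX_CPUS)
		return -1;
	if (!(enabled_cpus & (1ULL << core)))
		return -1;
	core_lock(&cores[core]);
	cores[core].freq_idx++;  /* a real wrapper would call librte_power here */
	core_unlock(&cores[core]);
	return 0;
}

/* Scale every core set in mask; returns -1 if any core could not be scaled. */
static int
scale_mask_up(uint64_t mask)
{
	int ret = 0;
	unsigned i;

	for (i = 0; mask != 0 && i < MAX_CPUS; i++, mask >>= 1) {
		if ((mask & 1) && scale_core_up(i) < 0)
			ret = -1;
	}
	return ret;
}
```

Holding one lock per core rather than a single global lock lets the CLI thread
and the monitor thread scale different cores without contending, while calls
that target the same core are still serialized.
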
 examples/vm_power_manager/power_manager.c | 244 ++++++++++++++++++++++++++++++
 examples/vm_power_manager/power_manager.h | 188 +++++++++++++++++++++++
 2 files changed, 432 insertions(+)
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
new file mode 100644
index 0000000..b7b1fca
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.c
@@ -0,0 +1,244 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/types.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_power.h>
+#include <rte_spinlock.h>
+
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+
+#define POWER_SCALE_CORE(DIRECTION, core_num, ret) do { \
+	if (core_num >= POWER_MGR_MAX_CPUS) \
+		return -1; \
+	if (!(global_enabled_cpus & (1ULL << core_num))) \
+		return -1; \
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
+	ret = rte_power_freq_##DIRECTION(core_num); \
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl); \
+} while (0)
+
+#define POWER_SCALE_MASK(DIRECTION, core_mask, ret) do { \
+	int i; \
+	for (i = 0; core_mask; core_mask &= ~(1ULL << i++)) { \
+		if ((core_mask >> i) & 1) { \
+			if (!(global_enabled_cpus & (1ULL << i))) \
+				continue; \
+			rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
+			if (rte_power_freq_##DIRECTION(i) != 1) \
+				ret = -1; \
+			rte_spinlock_unlock(&global_core_freq_info[i].power_sl); \
+		} \
+	} \
+} while (0)
+
+struct freq_info {
+	rte_spinlock_t power_sl;
+	uint32_t freqs[RTE_MAX_LCORE_FREQS];
+	unsigned num_freqs;
+} __rte_cache_aligned;
+
+static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
+
+static uint64_t global_enabled_cpus;
+
+#define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
+
+static unsigned
+set_host_cpus_mask(void)
+{
+	char path[PATH_MAX];
+	unsigned i;
+	unsigned num_cpus = 0;
+	for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
+		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
+		if (access(path, F_OK) == 0) {
+			global_enabled_cpus |= 1ULL << i;
+			num_cpus++;
+		} else
+			return num_cpus;
+	}
+	return num_cpus;
+}
+
+int
+power_manager_init(void)
+{
+	unsigned i, num_cpus;
+	uint64_t cpu_mask;
+	int ret = 0;
+
+	num_cpus = set_host_cpus_mask();
+	if (num_cpus == 0) {
+		RTE_LOG(ERR, POWER_MANAGER, "Unable to detect host CPUs, please "
+				"ensure that sufficient privileges exist to inspect sysfs\n");
+		return -1;
+	}
+	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+	cpu_mask = global_enabled_cpus;
+	for (i = 0; cpu_mask; cpu_mask &= ~(1ULL << i++)) {
+		if (rte_power_init(i) < 0 || rte_power_freqs(i,
+				global_core_freq_info[i].freqs,
+				RTE_MAX_LCORE_FREQS) == 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to initialize power manager "
+					"for core %u\n", i);
+			global_enabled_cpus &= ~(1ULL << i);
+			num_cpus--;
+			ret = -1;
+		}
+		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+	}
+	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs, enabled core mask:"
+					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	return ret;
+
+}
+
+uint32_t
+power_manager_get_current_frequency(unsigned core_num)
+{
+	uint32_t freq, index;
+
+	if (core_num >= POWER_MGR_MAX_CPUS) {
+		RTE_LOG(ERR, POWER_MANAGER, "Core(%u) is out of range 0...%d\n",
+				core_num, POWER_MGR_MAX_CPUS-1);
+		return 0;
+	}
+	if (!(global_enabled_cpus & (1ULL << core_num)))
+		return 0;
+
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
+	index = rte_power_get_freq(core_num);
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl);
+	if (index >= RTE_MAX_LCORE_FREQS)
+		freq = 0;
+	else
+		freq = global_core_freq_info[core_num].freqs[index];
+
+	return freq;
+}
+
+int
+power_manager_exit(void)
+{
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1ULL << i++)) {
+		if (rte_power_exit(i) < 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+					"for core %u\n", i);
+			ret = -1;
+		}
+	}
+	global_enabled_cpus = 0;
+	return ret;
+}
+
+int
+power_manager_scale_mask_up(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(up, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_down(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(down, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_min(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(min, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_max(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(max, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_up(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(up, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_down(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(down, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_min(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(min, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_max(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(max, core_num, ret);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
new file mode 100644
index 0000000..1b45bab
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.h
@@ -0,0 +1,188 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef POWER_MANAGER_H_
+#define POWER_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Maximum number of CPUs to manage */
+#define POWER_MGR_MAX_CPUS 64
+/**
+ * Initialize power management.
+ * Initializes resources and verifies the number of CPUs on the system.
+ * Wraps librte_power int rte_power_init(unsigned lcore_id);
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_init(void);
+
+/**
+ * Exit power management. Must be called prior to exiting the application.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_exit(void);
+
+/**
+ * Scale up the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_up(uint64_t core_mask);
+
+/**
+ * Scale down the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_down(uint64_t core_mask);
+
+/**
+ * Scale to the minimum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_min(uint64_t core_mask);
+
+/**
+ * Scale to the maximum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_max(uint64_t core_mask);
+
+/**
+ * Scale up frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_up(unsigned core_num);
+
+/**
+ * Scale down frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_down(unsigned core_num);
+
+/**
+ * Scale to minimum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_min(unsigned core_num);
+
+/**
+ * Scale to maximum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_max(unsigned core_num);
+
+/**
+ * Get the current frequency of the core specified by core_num
+ *
+ * @param core_num
+ *  The core number to get the current frequency
+ *
+ * @return
+ *  - 0  on error
+ *  - >0 for current frequency.
+ */
+uint32_t power_manager_get_current_frequency(unsigned core_num);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* POWER_MANAGER_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v4 04/10] VM Power Management application and Makefile.
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (2 preceding siblings ...)
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 03/10] CPU Frequency Power Management(Host) Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-16 18:28         ` De Lara Guarch, Pablo
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 05/10] VM Power Management CLI(Guest) Alan Carew
                         ` (8 subsequent siblings)
  12 siblings, 1 reply; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

Launches the CLI thread and the Monitor thread and initialises resources.
Requires a minimum of two lcores to run; additional cores specified by the
EAL core mask are not used.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/Makefile |  57 ++++++++++++++++++
 examples/vm_power_manager/main.c   | 117 +++++++++++++++++++++++++++++++++++++
 examples/vm_power_manager/main.h   |  52 +++++++++++++++++
 3 files changed, 226 insertions(+)
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h
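
The application installs one handler for SIGINT and SIGTERM, and that handler
tears down the monitor, channel manager and power manager directly. A common
alternative is sketched below, assuming the CLI and monitor loops could poll a
flag: the handler only records the request and cleanup runs from normal
context. The names exit_requested, request_exit() and install_exit_handlers()
are hypothetical and do not appear in this patch.

```c
#include <signal.h>

/* Set by the handler and polled by the main loops; hypothetical name. */
static volatile sig_atomic_t exit_requested;

static void
request_exit(int signo)
{
	(void)signo;
	exit_requested = 1;	/* only async-signal-safe work is done here */
}

/* Install the handler for SIGINT and SIGTERM; 0 on success, -1 on failure. */
static int
install_exit_handlers(void)
{
	if (signal(SIGINT, request_exit) == SIG_ERR)
		return -1;
	if (signal(SIGTERM, request_exit) == SIG_ERR)
		return -1;
	return 0;
}
```

With this layout, a loop would test exit_requested on each iteration and call
the *_exit() functions itself once the flag is set.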

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
new file mode 100644
index 0000000..7d6f943
--- /dev/null
+++ b/examples/vm_power_manager/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = vm_power_mgr
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
+SRCS-y += channel_monitor.c
+
+CFLAGS += -O3 -lvirt -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
new file mode 100644
index 0000000..875274e
--- /dev/null
+++ b/examples/vm_power_manager/main.c
@@ -0,0 +1,117 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+#include <rte_launch.h>
+#include <rte_log.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "vm_power_cli.h"
+#include "main.h"
+
+static int
+run_monitor(__attribute__((unused)) void *arg)
+{
+	if (channel_monitor_init() < 0) {
+		printf("Unable to initialize channel monitor\n");
+		return -1;
+	}
+	run_channel_monitor();
+	return 0;
+}
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	lcore_id = rte_get_next_lcore(-1, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of two cores is required to run "
+				"the application\n");
+		return -1;
+	}
+	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
+
+	if (power_manager_init() < 0) {
+		printf("Unable to initialize power manager\n");
+		return -1;
+	}
+	if (channel_manager_init(CHANNEL_MGR_DEFAULT_HV_PATH) < 0) {
+		printf("Unable to initialize channel manager\n");
+		return -1;
+	}
+	run_cli(NULL);
+
+	rte_eal_mp_wait_lcore();
+	return 0;
+}
diff --git a/examples/vm_power_manager/main.h b/examples/vm_power_manager/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
-- 
1.9.3


* Re: [dpdk-dev] [PATCH v4 04/10] VM Power Management application and Makefile.
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 04/10] VM Power Management application and Makefile Alan Carew
@ 2014-10-16 18:28         ` De Lara Guarch, Pablo
  0 siblings, 0 replies; 97+ messages in thread
From: De Lara Guarch, Pablo @ 2014-10-16 18:28 UTC (permalink / raw)
  To: Carew, Alan, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alan Carew
> Sent: Sunday, October 12, 2014 8:36 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v4 04/10] VM Power Management application
> and Makefile.
> 
> For launching CLI thread and Monitor thread and initialising
> resources.
> Requires a minimum of two lcores to run, additional cores specified by eal
> core
> mask are not used.
> 
> Signed-off-by: Alan Carew <alan.carew@intel.com>
> ---
>  examples/vm_power_manager/Makefile |  57 ++++++++++++++++++
>  examples/vm_power_manager/main.c   | 117
> +++++++++++++++++++++++++++++++++++++
>  examples/vm_power_manager/main.h   |  52 +++++++++++++++++
>  3 files changed, 226 insertions(+)
>  create mode 100644 examples/vm_power_manager/Makefile
>  create mode 100644 examples/vm_power_manager/main.c
>  create mode 100644 examples/vm_power_manager/main.h
[...]
> +# Default target, can be overriden by command line or environment
> +RTE_TARGET ?= x86_64-default-linuxapp-gcc

Tiny comment here. Target should be x86_64-native-linuxapp-gcc

> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# binary name
> +APP = vm_power_mgr
> +
> +# all source are stored in SRCS-y
> +SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
> +SRCS-y += channel_monitor.c
> +
> +CFLAGS += -O3 -lvirt -I$(RTE_SDK)/lib/librte_power/
> +CFLAGS += $(WERROR_FLAGS)
> +


* [dpdk-dev] [PATCH v4 05/10] VM Power Management CLI(Guest).
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (3 preceding siblings ...)
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 04/10] VM Power Management application and Makefile Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 06/10] VM communication channels for VM Power Management(Guest) Alan Carew
                         ` (7 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

Provides a small sample application (guest_vm_power_mgr) to run on a VM.
The application is run by providing a core mask (-c) and the number of memory
channels (-n). Each bit set in the core mask corresponds to an lcore channel
that the application attempts to open. A maximum of 64 channels per VM is
allowed. The channels must be monitored by the host.
After successful initialisation, a CPU frequency command can be sent to the
host using:
set_cpu_freq <lcore_num> <up|down|min|max>

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 examples/vm_power_manager/guest_cli/Makefile       |  56 ++++++++
 examples/vm_power_manager/guest_cli/main.c         |  87 ++++++++++++
 examples/vm_power_manager/guest_cli/main.h         |  52 +++++++
 .../guest_cli/vm_power_cli_guest.c                 | 155 +++++++++++++++++++++
 .../guest_cli/vm_power_cli_guest.h                 |  55 ++++++++
 5 files changed, 405 insertions(+)
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
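
For illustration only, the sketch below shows one way the
set_cpu_freq <lcore_num> <up|down|min|max> command could be validated in
plain C. The sample application itself may parse the command differently;
parse_set_cpu_freq() and enum scale_action are hypothetical names that are
not part of this patch. The limit of 64 mirrors the maximum number of
channels per VM stated above.

```c
#include <stdio.h>
#include <string.h>

enum scale_action { SCALE_UP, SCALE_DOWN, SCALE_MIN, SCALE_MAX };

/*
 * Parse "set_cpu_freq <lcore_num> <up|down|min|max>".
 * Returns 0 and fills lcore/action on success, -1 on malformed input.
 * The output parameters are only meaningful when 0 is returned.
 */
static int
parse_set_cpu_freq(const char *line, unsigned *lcore,
		enum scale_action *action)
{
	/* Order must match enum scale_action. */
	static const char *const names[] = { "up", "down", "min", "max" };
	char verb[8];
	char extra;
	unsigned i;

	/* Exactly two fields must convert; a third means trailing input. */
	if (sscanf(line, "set_cpu_freq %u %7s %c", lcore, verb, &extra) != 2)
		return -1;
	if (*lcore >= 64)	/* at most 64 channels per VM */
		return -1;
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (strcmp(verb, names[i]) == 0) {
			*action = (enum scale_action)i;
			return 0;
		}
	}
	return -1;
}
```

A caller would then map the returned action onto the matching
rte_power_freq_up/down/min/max() call for that lcore.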

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
new file mode 100644
index 0000000..167a7ed
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = guest_vm_power_mgr
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c vm_power_cli_guest.c
+
+CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
new file mode 100644
index 0000000..1e4767a
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -0,0 +1,87 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+/*
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+*/
+#include <signal.h>
+
+#include <rte_lcore.h>
+#include <rte_power.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "vm_power_cli_guest.h"
+#include "main.h"
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	rte_power_set_env(PM_ENV_KVM_VM);
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_init(lcore_id);
+	}
+	run_cli(NULL);
+
+	return 0;
+}
diff --git a/examples/vm_power_manager/guest_cli/main.h b/examples/vm_power_manager/guest_cli/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
new file mode 100644
index 0000000..7c4af4a
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -0,0 +1,155 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <stdint.h>
+#include <string.h>
+#include <stdio.h>
+#include <termios.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_lcore.h>
+
+#include <rte_power.h>
+
+#include "vm_power_cli_guest.h"
+
+
+#define CHANNEL_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+			    struct cmdline *cl,
+			    __attribute__((unused)) void *data)
+{
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t lcore_id;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = rte_power_freq_up(res->lcore_id);
+	else if (!strcmp(res->cmd , "down"))
+		ret = rte_power_freq_down(res->lcore_id);
+	else if (!strcmp(res->cmd , "min"))
+		ret = rte_power_freq_min(res->lcore_id);
+	else if (!strcmp(res->cmd , "max"))
+		ret = rte_power_freq_max(res->lcore_id);
+	if (ret != 1)
+		cmdline_printf(cl, "Error sending message: %s\n", strerror(ret));
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			lcore_id, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+	cl = cmdline_stdin_new(main_ctx, "vmpower(guest)> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
new file mode 100644
index 0000000..0c4bdd5
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -0,0 +1,55 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "channel_commands.h"
+
+int guest_channel_host_connect(unsigned lcore_id);
+
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v4 06/10] VM communication channels for VM Power Management(Guest).
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (4 preceding siblings ...)
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 05/10] VM Power Management CLI(Guest) Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 07/10] librte_power common interface for Guest and Host Alan Carew
                         ` (6 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

Allows for the opening of Virtio-Serial devices on a VM, where a DPDK
application can send packets to the host based monitor. The packet format is
specified in channel_commands.h.
Each device appears as a serial device in path
/dev/virtio-ports/virtio.serial.port.<agent_type>.<lcore_num> where each lcore
in a DPDK application has exclusive access to a device/channel.
Each channel is opened in non-blocking mode; after a successful open a test
packet is sent to the host to ensure the host side is monitoring.
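
As a rough illustration of the open-then-send sequence described above, the
following is a minimal standalone sketch and not the code in this patch. The
helper names open_channel() and write_all() are hypothetical; only standard
POSIX calls are used. It builds the per-lcore path, switches the descriptor to
non-blocking mode, and writes a buffer while retrying on EINTR and on short
writes.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open "<base>.<lcore_id>" and make it non-blocking.
 * Returns the file descriptor, or -1 on any failure. */
static int
open_channel(const char *base, unsigned lcore_id)
{
	char path[4096];
	int fd, flags;

	snprintf(path, sizeof(path), "%s.%u", base, lcore_id);
	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: write the whole buffer.
 * Returns 0 on success, or the errno value of the failing write(). */
static int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, try again */
			return errno;
		}
		p += n;            /* short write: advance and retry */
		len -= (size_t)n;
	}
	return 0;
}
```

Because the descriptor is non-blocking, a full channel surfaces as EAGAIN from
write_all() rather than a stall, so the caller decides whether to retry.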

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power/guest_channel.c | 162 +++++++++++++++++++++++++++++++++++++++
 lib/librte_power/guest_channel.h |  89 +++++++++++++++++++++
 2 files changed, 251 insertions(+)
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h

diff --git a/lib/librte_power/guest_channel.c b/lib/librte_power/guest_channel.c
new file mode 100644
index 0000000..2295665
--- /dev/null
+++ b/lib/librte_power/guest_channel.c
@@ -0,0 +1,162 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+static int global_fds[RTE_MAX_LCORE];
+
+int
+guest_channel_host_connect(const char *path, unsigned lcore_id)
+{
+	int flags, ret;
+	struct channel_packet pkt;
+	char fd_path[PATH_MAX];
+	int fd = -1;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	/* check if path is already open */
+	if (global_fds[lcore_id] != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is already open with fd %d\n",
+				lcore_id, global_fds[lcore_id]);
+		return -1;
+	}
+
+	snprintf(fd_path, PATH_MAX, "%s.%u", path, lcore_id);
+	RTE_LOG(INFO, GUEST_CHANNEL, "Opening channel '%s' for lcore %u\n",
+			fd_path, lcore_id);
+	fd = open(fd_path, O_RDWR);
+	if (fd < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Unable to connect to '%s' with error "
+				"%s\n", fd_path, strerror(errno));
+		return -1;
+	}
+
+	flags = fcntl(fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on fcntl get flags for file %s\n",
+				fd_path);
+		goto error;
+	}
+
+	flags |= O_NONBLOCK;
+	if (fcntl(fd, F_SETFL, flags) < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on setting non-blocking mode for "
+				"file %s\n", fd_path);
+		goto error;
+	}
+	/* QEMU needs a delay after connection */
+	sleep(1);
+
+	/* Send a test packet, this command is ignored by the host, but a successful
+	 * send indicates that the host endpoint is monitoring.
+	 */
+	pkt.command = CPU_POWER_CONNECT;
+	global_fds[lcore_id] = fd;
+	ret = guest_channel_send_msg(&pkt, lcore_id);
+	if (ret != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Error on channel '%s' communications "
+				"test: %s\n", fd_path, strerror(ret));
+		goto error;
+	}
+	RTE_LOG(INFO, GUEST_CHANNEL, "Channel '%s' is now connected\n", fd_path);
+	return 0;
+error:
+	close(fd);
+	global_fds[lcore_id] = 0;
+	return -1;
+}
+
+int
+guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id)
+{
+	int ret, buffer_len = sizeof(*pkt);
+	void *buffer = pkt;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+
+	if (global_fds[lcore_id] == 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel is not connected\n");
+		return -1;
+	}
+	while (buffer_len > 0) {
+		ret = write(global_fds[lcore_id], buffer, buffer_len);
+		if (ret == buffer_len)
+			return 0;
+		if (ret == -1) {
+			if (errno == EINTR)
+				continue;
+			return errno;
+		}
+		buffer = (char *)buffer + ret;
+		buffer_len -= ret;
+	}
+	return 0;
+}
+
+void
+guest_channel_host_disconnect(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return;
+	}
+	if (global_fds[lcore_id] == 0)
+		return;
+	close(global_fds[lcore_id]);
+	global_fds[lcore_id] = 0;
+}
diff --git a/lib/librte_power/guest_channel.h b/lib/librte_power/guest_channel.h
new file mode 100644
index 0000000..9e18af5
--- /dev/null
+++ b/lib/librte_power/guest_channel.h
@@ -0,0 +1,89 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#ifndef _GUEST_CHANNEL_H
+#define _GUEST_CHANNEL_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <channel_commands.h>
+
+/**
+ * Connect to the Virtio-Serial VM end-point located in path. It is
+ * thread safe for unique lcore_ids. This function must be called only once
+ * from each lcore.
+ *
+ * @param path
+ *  The path to the serial device on the filesystem
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int guest_channel_host_connect(const char *path, unsigned lcore_id);
+
+/**
+ * Disconnect from an already connected Virtio-Serial Endpoint.
+ *
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ */
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+/**
+ * Send a message contained in pkt over the Virtio-Serial to the host endpoint.
+ *
+ * @param pkt
+ *  Pointer to a populated struct channel_packet
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on channel not connected.
+ *  - errno on write to channel error.
+ */
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.9.3


* [dpdk-dev] [PATCH v4 07/10] librte_power common interface for Guest and Host
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (5 preceding siblings ...)
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 06/10] VM communication channels for VM Power Management(Guest) Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 08/10] Packet format for VM Power Management(Host and Guest) Alan Carew
                         ` (5 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
renaming of functions only.
Added rte_power_kvm_vm implementation to support Power Management from a VM.

librte_power now hides the implementation based on the environment used.
A new call rte_power_set_env() can explicitly set the environment; if it is not
called then auto-detection takes place.

rte_power_kvm_vm is a subset of the librte_power APIs; the following are supported:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

The other unsupported APIs return -ENOTSUP
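
The environment-based dispatch described above can be pictured with the
following standalone sketch. It is an illustration under stated assumptions,
not the code in this patch: the enum values, the select_env() function and the
placeholder backends are all hypothetical stand-ins. It shows public entry
points held as function pointers that are bound to one backend when the
environment is chosen, with an unsupported guest operation returning -ENOTSUP.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical environment names, standing in for the PM_ENV_* values. */
enum pm_env { ENV_NOT_SET, ENV_ACPI_CPUFREQ, ENV_KVM_VM };

typedef int (*freq_change_t)(unsigned lcore_id);
typedef int (*set_freq_t)(unsigned lcore_id, unsigned index);

/* Placeholder host backend: a real one would drive cpufreq via sysfs. */
static int acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
static int acpi_set_freq(unsigned lcore_id, unsigned index)
{ (void)lcore_id; (void)index; return 1; }

/* Placeholder guest backend: a real one would send a request to the host. */
static int vm_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
/* Setting an absolute frequency index is not offered to the guest. */
static int vm_set_freq(unsigned lcore_id, unsigned index)
{ (void)lcore_id; (void)index; return -ENOTSUP; }

/* Entry points that callers use; unbound until an environment is chosen. */
static freq_change_t freq_up = NULL;
static set_freq_t set_freq = NULL;

/* Bind the entry points to one backend. Returns 0 on success, -1 if the
 * environment is not recognised (the pointers are then left untouched). */
static int
select_env(enum pm_env env)
{
	switch (env) {
	case ENV_ACPI_CPUFREQ:
		freq_up = acpi_freq_up;
		set_freq = acpi_set_freq;
		return 0;
	case ENV_KVM_VM:
		freq_up = vm_freq_up;
		set_freq = vm_set_freq;
		return 0;
	default:
		return -1;
	}
}
```

With this shape an application calls the same entry point on host and guest,
and only has to handle -ENOTSUP for the operations the guest backend omits.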

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power/rte_power.c              | 540 ++++-------------------------
 lib/librte_power/rte_power.h              | 120 +++++--
 lib/librte_power/rte_power_acpi_cpufreq.c | 545 ++++++++++++++++++++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h | 192 +++++++++++
 lib/librte_power/rte_power_common.h       |  39 +++
 lib/librte_power/rte_power_kvm_vm.c       | 135 ++++++++
 lib/librte_power/rte_power_kvm_vm.h       | 179 ++++++++++
 7 files changed, 1248 insertions(+), 502 deletions(-)
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 856da9a..998ed1c 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -31,515 +31,113 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
-#include <stdio.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-#include <signal.h>
-#include <limits.h>
-
-#include <rte_memcpy.h>
 #include <rte_atomic.h>
 
 #include "rte_power.h"
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
 
-#ifdef RTE_LIBRTE_POWER_DEBUG
-#define POWER_DEBUG_TRACE(fmt, args...) do { \
-		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
-	} while (0)
-#else
-#define POWER_DEBUG_TRACE(fmt, args...)
-#endif
-
-#define FOPEN_OR_ERR_RET(f, retval) do { \
-	if ((f) == NULL) { \
-		RTE_LOG(ERR, POWER, "File not openned\n"); \
-		return (retval); \
-	} \
-} while(0)
-
-#define FOPS_OR_NULL_GOTO(ret, label) do { \
-	if ((ret) == NULL) { \
-		RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define FOPS_OR_ERR_GOTO(ret, label) do { \
-	if ((ret) < 0) { \
-		RTE_LOG(ERR, POWER, "File operations failed\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define STR_SIZE     1024
-#define POWER_CONVERT_TO_DECIMAL 10
+enum power_management_env global_default_env = PM_ENV_NOT_SET;
 
-#define POWER_GOVERNOR_USERSPACE "userspace"
-#define POWER_SYSFILE_GOVERNOR   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
-#define POWER_SYSFILE_AVAIL_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
-#define POWER_SYSFILE_SETSPEED   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+volatile uint32_t global_env_cfg_status = 0;
 
-enum power_state {
-	POWER_IDLE = 0,
-	POWER_ONGOING,
-	POWER_USED,
-	POWER_UNKNOWN
-};
+/* function pointers */
+rte_power_freqs_t rte_power_freqs  = NULL;
+rte_power_get_freq_t rte_power_get_freq = NULL;
+rte_power_set_freq_t rte_power_set_freq = NULL;
+rte_power_freq_change_t rte_power_freq_up = NULL;
+rte_power_freq_change_t rte_power_freq_down = NULL;
+rte_power_freq_change_t rte_power_freq_max = NULL;
+rte_power_freq_change_t rte_power_freq_min = NULL;
 
-/**
- * Power info per lcore.
- */
-struct rte_power_info {
-	unsigned lcore_id;                   /**< Logical core id */
-	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
-	uint32_t nb_freqs;                   /**< number of available freqs */
-	FILE *f;                             /**< FD of scaling_setspeed */
-	char governor_ori[32];               /**< Original governor name */
-	uint32_t curr_idx;                   /**< Freq index in freqs array */
-	volatile uint32_t state;             /**< Power in use state */
-} __rte_cache_aligned;
-
-static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
-
-/**
- * It is to set specific freq for specific logical core, according to the index
- * of supported frequencies.
- */
-static int
-set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+int
+rte_power_set_env(enum power_management_env env)
 {
-	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
-			"should be less than %u\n", idx, pi->nb_freqs);
-		return -1;
-	}
-
-	/* Check if it is the same as current */
-	if (idx == pi->curr_idx)
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 0, 1) == 0) {
 		return 0;
-
-	POWER_DEBUG_TRACE("Freqency[%u] %u to be set for lcore %u\n",
-				idx, pi->freqs[idx], pi->lcore_id);
-	if (fseek(pi->f, 0, SEEK_SET) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to set file position indicator to 0 "
-			"for setting frequency for lcore %u\n", pi->lcore_id);
-		return -1;
 	}
-	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to write new frequency for "
-					"lcore %u\n", pi->lcore_id);
+	if (env == PM_ENV_ACPI_CPUFREQ) {
+		rte_power_freqs = rte_power_acpi_cpufreq_freqs;
+		rte_power_get_freq = rte_power_acpi_cpufreq_get_freq;
+		rte_power_set_freq = rte_power_acpi_cpufreq_set_freq;
+		rte_power_freq_up = rte_power_acpi_cpufreq_freq_up;
+		rte_power_freq_down = rte_power_acpi_cpufreq_freq_down;
+		rte_power_freq_min = rte_power_acpi_cpufreq_freq_min;
+		rte_power_freq_max = rte_power_acpi_cpufreq_freq_max;
+	} else if (env == PM_ENV_KVM_VM) {
+		rte_power_freqs = rte_power_kvm_vm_freqs;
+		rte_power_get_freq = rte_power_kvm_vm_get_freq;
+		rte_power_set_freq = rte_power_kvm_vm_set_freq;
+		rte_power_freq_up = rte_power_kvm_vm_freq_up;
+		rte_power_freq_down = rte_power_kvm_vm_freq_down;
+		rte_power_freq_min = rte_power_kvm_vm_freq_min;
+		rte_power_freq_max = rte_power_kvm_vm_freq_max;
+	} else {
+		RTE_LOG(ERR, POWER, "Invalid Power Management Environment(%d) set\n",
+				env);
+		rte_power_unset_env();
 		return -1;
 	}
-	fflush(pi->f);
-	pi->curr_idx = idx;
-
-	return 1;
-}
-
-/**
- * It is to check the current scaling governor by reading sys file, and then
- * set it into 'userspace' if it is not by writing the sys file. The original
- * governor will be saved for rolling back.
- */
-static int
-power_set_governor_userspace(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if current governor is userspace */
-	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
-		sizeof(POWER_GOVERNOR_USERSPACE)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
-					"already userspace\n", pi->lcore_id);
-		goto out;
-	}
-	/* Save the original governor */
-	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
-
-	/* Write 'userspace' to the governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(POWER_GOVERNOR_USERSPACE, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
-			"set to user space successfully\n", pi->lcore_id);
-out:
-	fclose(f);
+	global_default_env = env;
+	return 0;
 
-	return ret;
 }
 
-/**
- * It is to get the available frequencies of the specific lcore by reading the
- * sys file.
- */
-static int
-power_get_available_freqs(struct rte_power_info *pi)
+void
+rte_power_unset_env(void)
 {
-	FILE *f;
-	int ret = -1, i, count;
-	char *p;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *freqs[RTE_MAX_LCORE_FREQS];
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
-								pi->lcore_id);
-	f = fopen(fullpath, "r");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Strip the line break if there is */
-	p = strchr(buf, '\n');
-	if (p != NULL)
-		*p = 0;
-
-	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
-	count = rte_strsplit(buf, sizeof(buf), freqs,
-				RTE_MAX_LCORE_FREQS, ' ');
-	if (count <= 0) {
-		RTE_LOG(ERR, POWER, "No available frequency in "
-			""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
-		goto out;
-	}
-	if (count >= RTE_MAX_LCORE_FREQS) {
-		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
-								count);
-		goto out;
-	}
-
-	/* Store the available frequncies into power context */
-	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
-		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
-								i, freqs[i]);
-		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
-					POWER_CONVERT_TO_DECIMAL);
-	}
-
-	ret = 0;
-	POWER_DEBUG_TRACE("%d frequencie(s) of lcore %u are available\n",
-						count, pi->lcore_id);
-out:
-	fclose(f);
-
-	return ret;
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 1, 0) != 0)
+		global_default_env = PM_ENV_NOT_SET;
 }
 
-/**
- * It is to fopen the sys file for the future setting the lcore frequency.
- */
-static int
-power_init_for_setting_freq(struct rte_power_info *pi)
-{
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t i, freq;
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, -1);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
-	for (i = 0; i < pi->nb_freqs; i++) {
-		if (freq == pi->freqs[i]) {
-			pi->curr_idx = i;
-			pi->f = f;
-			return 0;
-		}
-	}
-
-out:
-	fclose(f);
-
-	return -1;
+enum power_management_env
+rte_power_get_env(void) {
+	return global_default_env;
 }
 
 int
 rte_power_init(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"in use\n", lcore_id);
-		return -1;
-	}
-
-	pi->lcore_id = lcore_id;
-	/* Check and set the governor */
-	if (power_set_governor_userspace(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
-						"userspace\n", lcore_id);
-		goto fail;
-	}
+	int ret = -1;
 
-	/* Get the available frequencies */
-	if (power_get_available_freqs(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ) {
+		return rte_power_acpi_cpufreq_init(lcore_id);
 	}
-
-	/* Init for setting lcore frequency */
-	if (power_init_for_setting_freq(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_KVM_VM) {
+		return rte_power_kvm_vm_init(lcore_id);
 	}
-
-	/* Set freq to max by default */
-	if (rte_power_freq_max(lcore_id) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
-						"to max\n", lcore_id);
-		goto fail;
+	/* Auto detect Environment */
+	RTE_LOG(INFO, POWER, "Attempting to initialise ACPI cpufreq power "
+			"management...\n");
+	ret = rte_power_acpi_cpufreq_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+		goto out;
 	}
 
-	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
-					"power manamgement\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
-
-	return -1;
-}
-
-/**
- * It is to check the governor and then set the original governor back if
- * needed by writing the the sys file.
- */
-static int
-power_set_governor_original(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if the governor to be set is the same as current */
-	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u "
-					"has already been set to %s\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(INFO, POWER, "Attempting to initialise VM power management...\n");
+	ret = rte_power_kvm_vm_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_KVM_VM);
 		goto out;
 	}
-
-	/* Write back the original governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(pi->governor_ori, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power manamgement governor of lcore %u "
-				"has been set back to %s successfully\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(ERR, POWER, "Unable to set Power Management Environment for lcore "
+			"%u\n", lcore_id);
 out:
-	fclose(f);
-
 	return ret;
 }
 
 int
 rte_power_exit(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"not used\n", lcore_id);
-		return -1;
-	}
-
-	/* Close FD of setting freq */
-	fclose(pi->f);
-	pi->f = NULL;
-
-	/* Set the governor back to the original */
-	if (power_set_governor_original(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
-					"to the original\n", lcore_id);
-		goto fail;
-	}
-
-	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
-				"'userspace' mode and been set back to the "
-						"original\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ)
+		return rte_power_acpi_cpufreq_exit(lcore_id);
+	if (global_default_env == PM_ENV_KVM_VM)
+		return rte_power_kvm_vm_exit(lcore_id);
 
+	RTE_LOG(ERR, POWER, "Environment has not been set, unable to exit "
+				"gracefully\n");
 	return -1;
-}
-
-uint32_t
-rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
-		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
-		return 0;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (num < pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
-		return 0;
-	}
-	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
-
-	return pi->nb_freqs;
-}
-
-uint32_t
-rte_power_get_freq(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return RTE_POWER_INVALID_FREQ_INDEX;
-	}
-
-	return lcore_power_info[lcore_id].curr_idx;
-}
-
-int
-rte_power_set_freq(unsigned lcore_id, uint32_t index)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
-}
-
-int
-rte_power_freq_down(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
 
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx + 1 == pi->nb_freqs)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx + 1);
 }
-
-int
-rte_power_freq_up(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx == 0)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx - 1);
-}
-
-int
-rte_power_freq_max(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(&lcore_power_info[lcore_id], 0);
-}
-
-int
-rte_power_freq_min(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->nb_freqs - 1);
-}
-
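The selection order that the new rte_power_init() follows can be sketched in isolation. This is a minimal illustration, not the library code: stub_acpi_init(), stub_vm_init(), the demo_* flags and current_env are hypothetical stand-ins for the real back-end init routines and global_default_env. An explicitly configured environment is used directly; otherwise ACPI cpufreq is tried first, then the KVM VM back end, and the first one that succeeds is latched.

```c
#include <stdio.h>

/* Mirrors the enum this patch adds to rte_power.h. */
enum power_management_env {
	PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM
};

/* Hypothetical stand-in for global_default_env. */
static enum power_management_env current_env = PM_ENV_NOT_SET;

/* Hypothetical switches that let the stubs succeed or fail. */
static int demo_acpi_available;
static int demo_vm_available;

/* Stub back ends: return 0 on success, -1 on failure. */
static int
stub_acpi_init(unsigned lcore_id)
{
	(void)lcore_id;
	return demo_acpi_available ? 0 : -1;
}

static int
stub_vm_init(unsigned lcore_id)
{
	(void)lcore_id;
	return demo_vm_available ? 0 : -1;
}

static int
sketch_power_init(unsigned lcore_id)
{
	/* An environment that is already set is used without probing. */
	if (current_env == PM_ENV_ACPI_CPUFREQ)
		return stub_acpi_init(lcore_id);
	if (current_env == PM_ENV_KVM_VM)
		return stub_vm_init(lcore_id);

	/* Auto-detect: the first back end that initialises wins. */
	if (stub_acpi_init(lcore_id) == 0) {
		current_env = PM_ENV_ACPI_CPUFREQ;
		return 0;
	}
	if (stub_vm_init(lcore_id) == 0) {
		current_env = PM_ENV_KVM_VM;
		return 0;
	}
	fprintf(stderr, "no power management environment for lcore %u\n",
			lcore_id);
	return -1;
}
```

One consequence of latching: once the first lcore has selected an environment, later lcores no longer probe, so a failure on a later lcore is reported by that back end rather than triggering a fallback.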
diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
index 9c1419e..9338069 100644
--- a/lib/librte_power/rte_power.h
+++ b/lib/librte_power/rte_power.h
@@ -48,12 +48,48 @@
 extern "C" {
 #endif
 
-#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+/* Power Management Environment State */
+enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM};
 
 /**
- * Initialize power management for a specific lcore. It will check and set the
- * governor to userspace for the lcore, get the available frequencies, and
- * prepare to set new lcore frequency.
+ * Set the default power management implementation. If this is not called prior
+ * to rte_power_init(), then auto-detect of the environment will take place.
+ * It is not thread safe.
+ *
+ * @param env
+ *  The environment in which to initialise Power Management.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_set_env(enum power_management_env env);
+
+/**
+ * Unset the global environment configuration.
+ * This can only be called after all threads have completed.
+ *
+ * @param None.
+ *
+ * @return
+ *  None.
+ */
+void rte_power_unset_env(void);
+
+/**
+ * Get the default power management implementation.
+ *
+ * @param None.
+ *
+ * @return
+ *  power_management_env The configured environment.
+ */
+enum power_management_env rte_power_get_env(void);
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
  *
  * @param lcore_id
  *  lcore id.
@@ -65,8 +101,9 @@ extern "C" {
 int rte_power_init(unsigned lcore_id);
 
 /**
- * Exit power management on a specific lcore. It will set the governor to which
- * is before initialized.
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
  *
  * @param lcore_id
  *  lcore id.
@@ -78,11 +115,9 @@ int rte_power_init(unsigned lcore_id);
 int rte_power_exit(unsigned lcore_id);
 
 /**
- * Get the available frequencies of a specific lcore. The return value will be
- * the minimal one of the total number of available frequencies and the number
- * of buffer. The index of available frequencies used in other interfaces
- * should be in the range of 0 to this return value.
- * It should be protected outside of this function for threadsafe.
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -94,12 +129,15 @@ int rte_power_exit(unsigned lcore_id);
  * @return
  *  The number of available frequencies.
  */
-uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
+typedef uint32_t (*rte_power_freqs_t)(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+extern rte_power_freqs_t rte_power_freqs;
 
 /**
- * Return the current index of available frequencies of a specific lcore. It
- * will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)' if error.
- * It should be protected outside of this function for threadsafe.
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -107,12 +145,15 @@ uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
  * @return
  *  The current index of available frequencies.
  */
-uint32_t rte_power_get_freq(unsigned lcore_id);
+typedef uint32_t (*rte_power_get_freq_t)(unsigned lcore_id);
+
+extern rte_power_get_freq_t rte_power_get_freq;
 
 /**
  * Set the new frequency for a specific lcore by indicating the index of
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -121,70 +162,87 @@ uint32_t rte_power_get_freq(unsigned lcore_id);
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_set_freq(unsigned lcore_id, uint32_t index);
+typedef int (*rte_power_set_freq_t)(unsigned lcore_id, uint32_t index);
+
+extern rte_power_set_freq_t rte_power_set_freq;
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned lcore_id);
 
 /**
  * Scale up the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_up(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_up;
 
 /**
  * Scale down the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_down(unsigned lcore_id);
+
+extern rte_power_freq_change_t rte_power_freq_down;
 
 /**
  * Scale up the frequency of a specific lcore to the highest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_max(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_max;
 
 /**
  * Scale down the frequency of a specific lcore to the lowest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_min(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_min;
 
 #ifdef __cplusplus
 }
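Turning the frequency calls into function pointers means each one has to be bound to a back end before it is used. A minimal sketch of that binding follows, assuming the pointers are assigned when the environment is selected. The names sketch_freq_up_ptr, sketch_bind_env() and the two stub back ends are hypothetical and only illustrate the pattern; they are not the library's symbols.

```c
#include <stddef.h>

/* Same shape as rte_power_freq_change_t in the header above. */
typedef int (*sketch_freq_change_t)(unsigned lcore_id);

enum sketch_env { SKETCH_ENV_NOT_SET, SKETCH_ENV_ACPI, SKETCH_ENV_VM };

/* Hypothetical stand-in for the exported rte_power_freq_up pointer. */
static sketch_freq_change_t sketch_freq_up_ptr = NULL;

/* Stub back ends; 1 means "frequency changed", 0 means "unchanged". */
static int
acpi_stub_freq_up(unsigned lcore_id)
{
	(void)lcore_id;
	return 1;
}

static int
vm_stub_freq_up(unsigned lcore_id)
{
	(void)lcore_id;
	return 0;
}

/* Bind the pointer for the chosen environment; -1 if unknown. */
static int
sketch_bind_env(enum sketch_env env)
{
	switch (env) {
	case SKETCH_ENV_ACPI:
		sketch_freq_up_ptr = acpi_stub_freq_up;
		return 0;
	case SKETCH_ENV_VM:
		sketch_freq_up_ptr = vm_stub_freq_up;
		return 0;
	default:
		sketch_freq_up_ptr = NULL;
		return -1;
	}
}

/* Guarded call: an unbound pointer is reported as an error. */
static int
sketch_call_freq_up(unsigned lcore_id)
{
	if (sketch_freq_up_ptr == NULL)
		return -1;
	return sketch_freq_up_ptr(lcore_id);
}
```

The guard in sketch_call_freq_up() is the point of the example: with plain function declarations a call before initialisation fails inside the function, whereas a call through an unbound pointer would dereference NULL.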
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.c b/lib/librte_power/rte_power_acpi_cpufreq.c
new file mode 100644
index 0000000..09085c3
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.c
@@ -0,0 +1,545 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+
+#include <rte_memcpy.h>
+#include <rte_atomic.h>
+
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_common.h"
+
+#ifdef RTE_LIBRTE_POWER_DEBUG
+#define POWER_DEBUG_TRACE(fmt, args...) do { \
+		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
+} while (0)
+#else
+#define POWER_DEBUG_TRACE(fmt, args...)
+#endif
+
+#define FOPEN_OR_ERR_RET(f, retval) do { \
+		if ((f) == NULL) { \
+			RTE_LOG(ERR, POWER, "File not opened\n"); \
+			return retval; \
+		} \
+} while (0)
+
+#define FOPS_OR_NULL_GOTO(ret, label) do { \
+		if ((ret) == NULL) { \
+			RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define FOPS_OR_ERR_GOTO(ret, label) do { \
+		if ((ret) < 0) { \
+			RTE_LOG(ERR, POWER, "File operations failed\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define STR_SIZE     1024
+#define POWER_CONVERT_TO_DECIMAL 10
+
+#define POWER_GOVERNOR_USERSPACE "userspace"
+#define POWER_SYSFILE_GOVERNOR   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
+#define POWER_SYSFILE_AVAIL_FREQ \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
+#define POWER_SYSFILE_SETSPEED   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+
+enum power_state {
+	POWER_IDLE = 0,
+	POWER_ONGOING,
+	POWER_USED,
+	POWER_UNKNOWN
+};
+
+/**
+ * Power info per lcore.
+ */
+struct rte_power_info {
+	unsigned lcore_id;                   /**< Logical core id */
+	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
+	uint32_t nb_freqs;                   /**< number of available freqs */
+	FILE *f;                             /**< FD of scaling_setspeed */
+	char governor_ori[32];               /**< Original governor name */
+	uint32_t curr_idx;                   /**< Freq index in freqs array */
+	volatile uint32_t state;             /**< Power in use state */
+} __rte_cache_aligned;
+
+static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
+
+/**
+ * Set a specific frequency for a specific logical core, according to the
+ * index of supported frequencies.
+ */
+static int
+set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+{
+	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
+				"should be less than %u\n", idx, pi->nb_freqs);
+		return -1;
+	}
+
+	/* Check if it is the same as current */
+	if (idx == pi->curr_idx)
+		return 0;
+
+	POWER_DEBUG_TRACE("Frequency[%u] %u to be set for lcore %u\n",
+			idx, pi->freqs[idx], pi->lcore_id);
+	if (fseek(pi->f, 0, SEEK_SET) < 0) {
+		RTE_LOG(ERR, POWER, "Fail to set file position indicator to 0 "
+				"for setting frequency for lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
+		RTE_LOG(ERR, POWER, "Fail to write new frequency for "
+				"lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	fflush(pi->f);
+	pi->curr_idx = idx;
+
+	return 1;
+}
+
+/**
+ * Check the current scaling governor by reading the sys file and, if it is
+ * not already 'userspace', set it to 'userspace' by writing the sys file. The
+ * original governor is saved so that it can be restored on exit.
+ */
+static int
+power_set_governor_userspace(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Current governor is userspace? Skip the NUL: buf ends in '\n' */
+	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
+			sizeof(POWER_GOVERNOR_USERSPACE) - 1) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
+				"already userspace\n", pi->lcore_id);
+		goto out;
+	}
+	/* Save the original governor */
+	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
+
+	/* Write 'userspace' to the governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(POWER_GOVERNOR_USERSPACE, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
+			"set to user space successfully\n", pi->lcore_id);
+	out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * Get the available frequencies of the specific lcore by reading the
+ * sys file.
+ */
+static int
+power_get_available_freqs(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1, i, count;
+	char *p;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *freqs[RTE_MAX_LCORE_FREQS];
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
+			pi->lcore_id);
+	f = fopen(fullpath, "r");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Strip the line break if there is */
+	p = strchr(buf, '\n');
+	if (p != NULL)
+		*p = 0;
+
+	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
+	count = rte_strsplit(buf, sizeof(buf), freqs,
+			RTE_MAX_LCORE_FREQS, ' ');
+	if (count <= 0) {
+		RTE_LOG(ERR, POWER, "No available frequency in "
+				""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
+		goto out;
+	}
+	if (count >= RTE_MAX_LCORE_FREQS) {
+		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
+				count);
+		goto out;
+	}
+
+	/* Store the available frequencies into power context */
+	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
+		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
+				i, freqs[i]);
+		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
+				POWER_CONVERT_TO_DECIMAL);
+	}
+
+	ret = 0;
+	POWER_DEBUG_TRACE("%d frequencie(s) of lcore %u are available\n",
+			count, pi->lcore_id);
+	out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * Open the sys file that is used later to set the lcore frequency.
+ */
+static int
+power_init_for_setting_freq(struct rte_power_info *pi)
+{
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t i, freq;
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, -1);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
+	for (i = 0; i < pi->nb_freqs; i++) {
+		if (freq == pi->freqs[i]) {
+			pi->curr_idx = i;
+			pi->f = f;
+			return 0;
+		}
+	}
+
+	out:
+	fclose(f);
+
+	return -1;
+}
+
+int
+rte_power_acpi_cpufreq_init(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u cannot exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"in use\n", lcore_id);
+		return -1;
+	}
+
+	pi->lcore_id = lcore_id;
+	/* Check and set the governor */
+	if (power_set_governor_userspace(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
+				"userspace\n", lcore_id);
+		goto fail;
+	}
+
+	/* Get the available frequencies */
+	if (power_get_available_freqs(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Init for setting lcore frequency */
+	if (power_init_for_setting_freq(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Set freq to max by default */
+	if (rte_power_acpi_cpufreq_freq_max(lcore_id) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
+				"to max\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
+			"power management\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
+
+	return 0;
+
+	fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+/**
+ * Check the governor and then set the original governor back if
+ * needed by writing the sys file.
+ */
+static int
+power_set_governor_original(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Check if the governor to be set is the same as current */
+	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u "
+				"has already been set to %s\n",
+				pi->lcore_id, pi->governor_ori);
+		goto out;
+	}
+
+	/* Write back the original governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(pi->governor_ori, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u "
+			"has been set back to %s successfully\n",
+			pi->lcore_id, pi->governor_ori);
+	out:
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_acpi_cpufreq_exit(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u cannot exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"not used\n", lcore_id);
+		return -1;
+	}
+
+	/* Close FD of setting freq */
+	fclose(pi->f);
+	pi->f = NULL;
+
+	/* Set the governor back to the original */
+	if (power_set_governor_original(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
+				"to the original\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
+			"'userspace' mode and been set back to the "
+			"original\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
+
+	return 0;
+
+	fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
+		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
+		return 0;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (num < pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
+		return 0;
+	}
+	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
+
+	return pi->nb_freqs;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_get_freq(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return RTE_POWER_INVALID_FREQ_INDEX;
+	}
+
+	return lcore_power_info[lcore_id].curr_idx;
+}
+
+int
+rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
+}
+
+int
+rte_power_acpi_cpufreq_freq_down(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx + 1 == pi->nb_freqs)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx + 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_up(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx == 0)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx - 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_max(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(&lcore_power_info[lcore_id], 0);
+}
+
+int
+rte_power_acpi_cpufreq_freq_min(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->nb_freqs - 1);
+}
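The frequency table that the ACPI back end builds is ordered from high to low, so "up" moves toward index 0 and "down" moves toward the last index. The sketch below reproduces that convention without sysfs or DPDK. struct sketch_freq_table, sketch_parse_freqs(), table_step_up() and table_step_down() are hypothetical names, and the parser uses plain strtoul() rather than rte_strsplit(), so it is an illustration of the idea and not the code in this patch.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_FREQS 64

struct sketch_freq_table {
	uint32_t freqs[SKETCH_MAX_FREQS]; /* kHz, highest first */
	uint32_t nb_freqs;
	uint32_t curr_idx;
};

/*
 * Parse a whitespace separated list such as the content of
 * scaling_available_frequencies. Returns the number of entries,
 * or -1 if the list is empty or does not fit in the table.
 */
static int
sketch_parse_freqs(const char *line, struct sketch_freq_table *t)
{
	const char *p = line;
	char *end;

	t->nb_freqs = 0;
	for (;;) {
		unsigned long v = strtoul(p, &end, 10);

		if (end == p)
			break; /* nothing left to convert */
		if (t->nb_freqs == SKETCH_MAX_FREQS)
			return -1;
		t->freqs[t->nb_freqs++] = (uint32_t)v;
		p = end;
	}
	return t->nb_freqs > 0 ? (int)t->nb_freqs : -1;
}

/* 1 if the index moved, 0 if already at the highest frequency. */
static int
table_step_up(struct sketch_freq_table *t)
{
	if (t->curr_idx == 0)
		return 0;
	t->curr_idx--;
	return 1;
}

/* 1 if the index moved, 0 if already at the lowest frequency. */
static int
table_step_down(struct sketch_freq_table *t)
{
	if (t->curr_idx + 1 >= t->nb_freqs)
		return 0;
	t->curr_idx++;
	return 1;
}
```

The 1/0 return values follow the convention documented for the frequency change calls: 1 when the frequency changed, 0 when the request was valid but nothing changed.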
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.h b/lib/librte_power/rte_power_acpi_cpufreq.h
new file mode 100644
index 0000000..68578e9
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.h
@@ -0,0 +1,192 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_ACPI_CPUFREQ_H
+#define _RTE_POWER_ACPI_CPUFREQ_H
+
+/**
+ * @file
+ * RTE Power Management via userspace ACPI cpufreq
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore. It will check and set the
+ * governor to userspace for the lcore, get the available frequencies, and
+ * prepare to set new lcore frequency.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore. It will restore the governor
+ * that was in use before initialization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore. The return value is the
+ * total number of available frequencies, or 0 if the buffer is too small to
+ * hold them all. The index of available frequencies used in other interfaces
+ * should be in the range of 0 to this return value minus one.
+ * This function is not thread safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of frequencies to get.
+ *
+ * @return
+ *  The number of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore. It
+ * will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)' on error.
+ * This function is not thread safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  The current index of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * This function is not thread safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * This function is not thread safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * This function is not thread safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * This function is not thread safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * This function is not thread safe; callers must serialize access.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/librte_power/rte_power_common.h b/lib/librte_power/rte_power_common.h
new file mode 100644
index 0000000..64bd168
--- /dev/null
+++ b/lib/librte_power/rte_power_common.h
@@ -0,0 +1,39 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_POWER_COMMON_H_
+#define RTE_POWER_COMMON_H_
+
+#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+
+#endif /* RTE_POWER_COMMON_H_ */
diff --git a/lib/librte_power/rte_power_kvm_vm.c b/lib/librte_power/rte_power_kvm_vm.c
new file mode 100644
index 0000000..3ccd92b
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.c
@@ -0,0 +1,135 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <errno.h>
+#include <string.h>
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
+
+#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+static struct channel_packet pkt[CHANNEL_CMDS_MAX_VM_CHANNELS];
+
+
+int
+rte_power_kvm_vm_init(unsigned lcore_id)
+{
+	if (lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+		return -1;
+	}
+	pkt[lcore_id].command = CPU_POWER;
+	pkt[lcore_id].resource_id = lcore_id;
+	return guest_channel_host_connect(FD_PATH, lcore_id);
+}
+
+int
+rte_power_kvm_vm_exit(unsigned lcore_id)
+{
+	guest_channel_host_disconnect(lcore_id);
+	return 0;
+}
+
+uint32_t
+rte_power_kvm_vm_freqs(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t *freqs,
+		__attribute__((unused)) uint32_t num)
+{
+	RTE_LOG(ERR, POWER, "rte_power_freqs is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+uint32_t
+rte_power_kvm_vm_get_freq(__attribute__((unused)) unsigned lcore_id)
+{
+	RTE_LOG(ERR, POWER, "rte_power_get_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_kvm_vm_set_freq(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t index)
+{
+	RTE_LOG(ERR, POWER, "rte_power_set_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+static inline int
+send_msg(unsigned lcore_id, uint32_t scale_direction)
+{
+	int ret;
+	if (lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+		return -1;
+	}
+	pkt[lcore_id].unit = scale_direction;
+	ret = guest_channel_send_msg(&pkt[lcore_id], lcore_id);
+	if (ret == 0)
+		return 1;
+	RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", strerror(ret));
+	return -1;
+}
+
+int
+rte_power_kvm_vm_freq_up(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_UP);
+}
+
+int
+rte_power_kvm_vm_freq_down(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_DOWN);
+}
+
+int
+rte_power_kvm_vm_freq_max(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_MAX);
+}
+
+int
+rte_power_kvm_vm_freq_min(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_MIN);
+}
diff --git a/lib/librte_power/rte_power_kvm_vm.h b/lib/librte_power/rte_power_kvm_vm.h
new file mode 100644
index 0000000..dcbc878
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.h
@@ -0,0 +1,179 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_KVM_VM_H
+#define _RTE_POWER_KVM_VM_H
+
+/**
+ * @file
+ * RTE Power Management KVM VM
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of frequencies to get.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+int rte_power_kvm_vm_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore. This request is forwarded to the
+ * host monitor.
+ * Not thread-safe; the caller must provide any required locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * Not thread-safe; the caller must provide any required locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * Not thread-safe; the caller must provide any required locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * Not thread-safe; the caller must provide any required locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.9.3


* [dpdk-dev] [PATCH v4 08/10] Packet format for VM Power Management(Host and Guest).
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (6 preceding siblings ...)
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 07/10] librte_power common interface for Guest and Host Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 09/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
                         ` (4 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

Provides a command packet format for host and guest.
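
As a rough illustration of the format (not part of the patch): the guest
keeps one packet per lcore, sets `command` to `CPU_POWER` and `resource_id`
to the lcore number, and writes the scaling direction into `unit` before
each send. The struct and constants below are copied from
channel_commands.h so that the sketch compiles on its own.
`demo_make_scale_request` is a hypothetical helper, and the 16-byte size
check assumes a common ABI with no padding between the fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copied from channel_commands.h in this patch. */
#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
#define CPU_POWER               1
#define CPU_POWER_SCALE_UP      1
#define CPU_POWER_SCALE_DOWN    2
#define CPU_POWER_SCALE_MAX     3
#define CPU_POWER_SCALE_MIN     4

struct channel_packet {
	uint64_t resource_id; /* core_num, device */
	uint32_t unit;        /* scale down/up/min/max */
	uint32_t command;     /* Power, IO, etc */
};

/* Assumption: 8 + 4 + 4 bytes with no padding on common ABIs. */
static_assert(sizeof(struct channel_packet) == 16,
		"unexpected channel_packet layout");

/*
 * Hypothetical helper (not in the patch): build a CPU_POWER request for
 * one lcore. Returns 0 on success, -1 if an argument is out of range.
 */
static int
demo_make_scale_request(struct channel_packet *pkt, unsigned lcore_id,
		uint32_t direction)
{
	if (pkt == NULL || lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS)
		return -1;
	if (direction < CPU_POWER_SCALE_UP || direction > CPU_POWER_SCALE_MIN)
		return -1;
	memset(pkt, 0, sizeof(*pkt));
	pkt->command = CPU_POWER;
	pkt->resource_id = lcore_id;
	pkt->unit = direction;
	return 0;
}
```

Because both ends share the same header, the host monitor can read the
same three fields back without any further framing.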

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power/channel_commands.h | 77 +++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)
 create mode 100644 lib/librte_power/channel_commands.h

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
new file mode 100644
index 0000000..7e78a8b
--- /dev/null
+++ b/lib/librte_power/channel_commands.h
@@ -0,0 +1,77 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_COMMANDS_H_
+#define CHANNEL_COMMANDS_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+/* Maximum number of CPUs */
+#define CHANNEL_CMDS_MAX_CPUS        64
+#if CHANNEL_CMDS_MAX_CPUS > 64
+#error Maximum number of cores is 64, overflow is guaranteed to \
+	cause problems with VM Power Management
+#endif
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Valid Commands */
+#define CPU_POWER               1
+#define CPU_POWER_CONNECT       2
+
+/* CPU Power Command Scaling */
+#define CPU_POWER_SCALE_UP      1
+#define CPU_POWER_SCALE_DOWN    2
+#define CPU_POWER_SCALE_MAX     3
+#define CPU_POWER_SCALE_MIN     4
+
+struct channel_packet {
+	uint64_t resource_id; /**< core_num, device */
+	uint32_t unit;        /**< scale down/up/min/max */
+	uint32_t command;     /**< Power, IO, etc */
+};
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_COMMANDS_H_ */
-- 
1.9.3


* [dpdk-dev] [PATCH v4 09/10] Build system integration for VM Power Management(Guest and Host)
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (7 preceding siblings ...)
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 08/10] Packet format for VM Power Management(Host and Guest) Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 10/10] VM Power Management Unit Tests Alan Carew
                         ` (3 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

librte_power now contains both rte_power_acpi_cpufreq and rte_power_kvm_vm
implementations.
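
With both back ends built into one library, the common interface has to
select one at run time. The unit tests in this series expect the
`rte_power_*` entry points to be NULL function pointers until an
environment is set, which suggests dispatch along the following lines.
This is a self-contained sketch, not the code in rte_power.c: every
`demo_` name is hypothetical, the two back ends are stubs, and rejecting a
second "set" call is an assumption made for the example.

```c
#include <stddef.h>

enum demo_env {
	DEMO_ENV_NOT_SET,
	DEMO_ENV_ACPI_CPUFREQ,
	DEMO_ENV_KVM_VM
};

typedef int (*demo_freq_change_t)(unsigned lcore_id);

/* Stub back ends standing in for the two real implementations. */
static int demo_acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 0; }
static int demo_kvm_vm_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }

static enum demo_env demo_current_env = DEMO_ENV_NOT_SET;

/* Public entry point: NULL until an environment has been selected. */
demo_freq_change_t demo_power_freq_up = NULL;

/* Select a back end. Returns 0 on success, -1 on an invalid request. */
int
demo_set_env(enum demo_env env)
{
	if (demo_current_env != DEMO_ENV_NOT_SET)
		return -1; /* assumption: an environment can be set only once */
	switch (env) {
	case DEMO_ENV_ACPI_CPUFREQ:
		demo_power_freq_up = demo_acpi_freq_up;
		break;
	case DEMO_ENV_KVM_VM:
		demo_power_freq_up = demo_kvm_vm_freq_up;
		break;
	default:
		return -1; /* DEMO_ENV_NOT_SET is not a valid selection */
	}
	demo_current_env = env;
	return 0;
}

/* Drop the current selection and reset the entry point to NULL. */
void
demo_unset_env(void)
{
	demo_current_env = DEMO_ENV_NOT_SET;
	demo_power_freq_up = NULL;
}
```

A caller would select the environment once at start-up and then call
through `demo_power_freq_up` without knowing which back end is active.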

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_power/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_power/Makefile b/lib/librte_power/Makefile
index 6185812..d672a5a 100644
--- a/lib/librte_power/Makefile
+++ b/lib/librte_power/Makefile
@@ -37,7 +37,8 @@ LIB = librte_power.a
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -fno-strict-aliasing
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c rte_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += rte_power_kvm_vm.c guest_channel.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_POWER)-include := rte_power.h
-- 
1.9.3


* [dpdk-dev] [PATCH v4 10/10] VM Power Management Unit Tests
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (8 preceding siblings ...)
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 09/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
@ 2014-10-12 19:36       ` Alan Carew
  2014-10-13  6:17       ` [dpdk-dev] [PATCH v4 00/10] VM Power Management Liu, Yong
                         ` (2 subsequent siblings)
  12 siblings, 0 replies; 97+ messages in thread
From: Alan Carew @ 2014-10-12 19:36 UTC (permalink / raw)
  To: dev

Updated the unit tests to cover both librte_power implementations as well as
the external API.

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 app/test/Makefile                  |   3 +-
 app/test/autotest_data.py          |  26 ++
 app/test/test_power.c              | 445 +++---------------------------
 app/test/test_power_acpi_cpufreq.c | 544 +++++++++++++++++++++++++++++++++++++
 app/test/test_power_kvm_vm.c       | 308 +++++++++++++++++++++
 5 files changed, 917 insertions(+), 409 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 6af6d76..9417eda 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -119,7 +119,8 @@ endif
 
 SRCS-$(CONFIG_RTE_LIBRTE_METER) += test_meter.c
 SRCS-$(CONFIG_RTE_LIBRTE_KNI) += test_kni.c
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c test_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power_kvm_vm.c
 SRCS-y += test_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 878c72e..618a946 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -425,6 +425,32 @@ non_parallel_test_group_list = [
 	]
 },
 {
+	"Prefix" :      "power_acpi_cpufreq",
+	"Memory" :      all_sockets(512),
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power ACPI cpufreq autotest",
+		 "Command" :    "power_acpi_cpufreq_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
+	"Prefix" :      "power_kvm_vm",
+	"Memory" :      "512",
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power KVM VM autotest",
+		 "Command" :    "power_kvm_vm_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
 	"Prefix" :	"lpm6",
 	"Memory" :	"512",
 	"Tests" :
diff --git a/app/test/test_power.c b/app/test/test_power.c
index d9eb420..64a2305 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -41,437 +41,66 @@
 
 #include <rte_power.h>
 
-#define TEST_POWER_LCORE_ID      2U
-#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
-#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
-
-#define TEST_POWER_SYSFILE_CUR_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
-
-static uint32_t total_freq_num;
-static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
-
-static int
-check_cur_freq(unsigned lcore_id, uint32_t idx)
-{
-#define TEST_POWER_CONVERT_TO_DECIMAL 10
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t cur_freq;
-	int ret = -1;
-
-	if (snprintf(fullpath, sizeof(fullpath),
-		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
-		return 0;
-	}
-	f = fopen(fullpath, "r");
-	if (f == NULL) {
-		return 0;
-	}
-	if (fgets(buf, sizeof(buf), f) == NULL) {
-		goto fail_get_cur_freq;
-	}
-	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
-	ret = (freqs[idx] == cur_freq ? 0 : -1);
-
-fail_get_cur_freq:
-	fclose(f);
-
-	return ret;
-}
-
-/* Check rte_power_freqs() */
-static int
-check_power_freqs(void)
-{
-	uint32_t ret;
-
-	total_freq_num = 0;
-	memset(freqs, 0, sizeof(freqs));
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
-					TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with NULL buffer to save available freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test of getting zero number of freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test with all valid input parameters */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get available freqs on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Save the total number of available freqs */
-	total_freq_num = ret;
-
-	return 0;
-}
-
-/* Check rte_power_get_freq() */
-static int
-check_power_get_freq(void)
-{
-	int ret;
-	uint32_t count;
-
-	/* test with an invalid lcore id */
-	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
-	if (count < TEST_POWER_FREQS_NUM_MAX) {
-		printf("Unexpectedly get freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
-	if (count >= TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get the freq index on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_set_freq() */
-static int
-check_power_set_freq(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
-	if (ret >= 0) {
-		printf("Unexpectedly set freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with an invalid freq index */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/**
-	 * test with an invalid freq index which is right one bigger than
-	 * total number of freqs
-	 */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", total_freq_num,
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0) {
-		printf("Fail to set freq index on lcore %u\n",
-					TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_down() */
-static int
-check_power_freq_down(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale down one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale down one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf ("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_up() */
-static int
-check_power_freq_up(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq on %u\n",
-						TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale up one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale up one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_max() */
-static int
-check_power_freq_max(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq to max on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_min() */
-static int
-check_power_freq_min(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq to min "
-				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
 static int
 test_power(void)
 {
 	int ret = -1;
+	enum power_management_env env;
 
-	/* test of init power management for an invalid lcore */
-	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	/* Test setting an invalid environment */
+	ret = rte_power_set_env(PM_ENV_NOT_SET);
 	if (ret == 0) {
-		printf("Unexpectedly initialise power management successfully "
-				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot initialise power management for lcore %u\n",
-							TEST_POWER_LCORE_ID);
+		printf("Unexpectedly succeeded on setting an invalid environment\n");
 		return -1;
 	}
 
-	/**
-	 * test of initialising power management for the lcore which has
-	 * been initialised
-	 */
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly init successfully power twice on "
-					"lcore %u\n", TEST_POWER_LCORE_ID);
+	/* Test that the environment has not been set */
+	env = rte_power_get_env();
+	if (env != PM_ENV_NOT_SET) {
+		printf("Unexpectedly got a valid environment configuration\n");
 		return -1;
 	}
 
-	ret = check_power_freqs();
-	if (ret < 0)
+	/* verify that function pointers are NULL */
+	if (rte_power_freqs != NULL) {
+		printf("rte_power_freqs should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	if (total_freq_num < 2) {
-		rte_power_exit(TEST_POWER_LCORE_ID);
-		printf("Frequency can not be changed due to CPU itself\n");
-		return 0;
 	}
-
-	ret = check_power_get_freq();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_set_freq();
-	if (ret < 0)
+	if (rte_power_get_freq != NULL) {
+		printf("rte_power_get_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_down();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_freq_up();
-	if (ret < 0)
+	}
+	if (rte_power_set_freq != NULL) {
+		printf("rte_power_set_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_max();
-	if (ret < 0)
+	}
+	if (rte_power_freq_up != NULL) {
+		printf("rte_power_freq_up should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_min();
-	if (ret < 0)
+	}
+	if (rte_power_freq_down != NULL) {
+		printf("rte_power_freq_down should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot exit power management for lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
 	}
-
-	/**
-	 * test of exiting power management for the lcore which has been exited
-	 */
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly exit successfully power management twice "
-					"on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
+	if (rte_power_freq_max != NULL) {
+		printf("rte_power_freq_max should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
-	/* test of exit power management for an invalid lcore */
-	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
-	if (ret == 0) {
-		printf("Unpectedly exit power management successfully for "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
+	if (rte_power_freq_min != NULL) {
+		printf("rte_power_freq_min should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
+	rte_power_unset_env();
 	return 0;
-
 fail_all:
-	rte_power_exit(TEST_POWER_LCORE_ID);
-
+	rte_power_unset_env();
 	return -1;
 }
 
diff --git a/app/test/test_power_acpi_cpufreq.c b/app/test/test_power_acpi_cpufreq.c
new file mode 100644
index 0000000..8848d75
--- /dev/null
+++ b/app/test/test_power_acpi_cpufreq.c
@@ -0,0 +1,544 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <limits.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+
+#define TEST_POWER_LCORE_ID      2U
+#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
+#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
+
+#define TEST_POWER_SYSFILE_CUR_FREQ \
+	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
+
+static uint32_t total_freq_num;
+static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
+
+static int
+check_cur_freq(unsigned lcore_id, uint32_t idx)
+{
+#define TEST_POWER_CONVERT_TO_DECIMAL 10
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t cur_freq;
+	int ret = -1;
+
+	if (snprintf(fullpath, sizeof(fullpath),
+		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
+		return 0;
+	}
+	f = fopen(fullpath, "r");
+	if (f == NULL) {
+		return 0;
+	}
+	if (fgets(buf, sizeof(buf), f) == NULL) {
+		goto fail_get_cur_freq;
+	}
+	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
+	ret = (freqs[idx] == cur_freq ? 0 : -1);
+
+fail_get_cur_freq:
+	fclose(f);
+
+	return ret;
+}
+
+/* Check rte_power_freqs() */
+static int
+check_power_freqs(void)
+{
+	uint32_t ret;
+
+	total_freq_num = 0;
+	memset(freqs, 0, sizeof(freqs));
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
+					TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with NULL buffer to save available freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test of getting zero number of freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test with all valid input parameters */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get available freqs on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Save the total number of available freqs */
+	total_freq_num = ret;
+
+	return 0;
+}
+
+/* Check rte_power_get_freq() */
+static int
+check_power_get_freq(void)
+{
+	int ret;
+	uint32_t count;
+
+	/* test with an invalid lcore id */
+	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
+	if (count < TEST_POWER_FREQS_NUM_MAX) {
+		printf("Unexpectedly get freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
+	if (count >= TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get the freq index on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_set_freq() */
+static int
+check_power_set_freq(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
+	if (ret >= 0) {
+		printf("Unexpectedly set freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with an invalid freq index */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u) "
+			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/**
+	 * test with an invalid freq index which is right one bigger than
+	 * total number of freqs
+	 */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u) "
+			"successfully on lcore %u\n", total_freq_num,
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0) {
+		printf("Fail to set freq index on lcore %u\n",
+					TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_down() */
+static int
+check_power_freq_down(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale down one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale down one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_up() */
+static int
+check_power_freq_up(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq on %u\n",
+						TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale up one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale up one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_max() */
+static int
+check_power_freq_max(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq to max on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_min() */
+static int
+check_power_freq_min(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq to min "
+				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+static int
+test_power_acpi_cpufreq(void)
+{
+	int ret = -1;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_ACPI_CPUFREQ, this "
+				"may occur if environment is not configured correctly or "
+				"operating in another valid Power management environment\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_ACPI_CPUFREQ) {
+		printf("Unexpectedly got an environment other than ACPI cpufreq\n");
+		goto fail_all;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+
+	/* test of init power management for an invalid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly initialise power management successfully "
+				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly (ACPI cpufreq) or operating in another valid "
+				"Power management environment\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of initialising power management for the lcore which has
+	 * been initialised
+	 */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly init successfully power twice on "
+					"lcore %u\n", TEST_POWER_LCORE_ID);
+		goto fail_all;
+	}
+
+	ret = check_power_freqs();
+	if (ret < 0)
+		goto fail_all;
+
+	if (total_freq_num < 2) {
+		rte_power_exit(TEST_POWER_LCORE_ID);
+		printf("Frequency can not be changed due to CPU itself\n");
+		rte_power_unset_env();
+		return 0;
+	}
+
+	ret = check_power_get_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_set_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_down();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_up();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_max();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_min();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot exit power management for lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of exiting power management for the lcore which has been exited
+	 */
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly exit successfully power management twice "
+					"on lcore %u\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* test of exit power management for an invalid lcore */
+	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly exit power management successfully for "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+	rte_power_unset_env();
+	return 0;
+
+fail_all:
+	rte_power_exit(TEST_POWER_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_acpi_cpufreq_cmd = {
+	.command = "power_acpi_cpufreq_autotest",
+	.callback = test_power_acpi_cpufreq,
+};
+REGISTER_TEST_COMMAND(power_acpi_cpufreq_cmd);
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
new file mode 100644
index 0000000..ac0fcb6
--- /dev/null
+++ b/app/test/test_power_kvm_vm.c
@@ -0,0 +1,308 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <limits.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+#include <rte_config.h>
+
+#define TEST_POWER_VM_LCORE_ID            0U
+#define TEST_POWER_VM_LCORE_OUT_OF_BOUNDS (RTE_MAX_LCORE+1)
+#define TEST_POWER_VM_LCORE_INVALID       1U
+
+static int
+test_power_kvm_vm(void)
+{
+	int ret;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_KVM_VM);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_KVM_VM\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_KVM_VM) {
+		printf("Unexpectedly got a Power Management environment other than "
+				"KVM VM\n");
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		return -1;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		return -1;
+	}
+	/* Test initialisation of an out of bounds lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret != -1) {
+		printf("rte_power_init unexpectedly succeeded on an invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly (KVM VM) or operating in another valid "
+				"Power management environment\n", TEST_POWER_VM_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of previously initialised lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_init unexpectedly succeeded when called twice on "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of invalid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency down of invalid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency min of invalid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency max of invalid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid but uninitialised lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid but uninitialised lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid but uninitialised lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid but uninitialised lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_up unexpectedly failed on valid lcore %u\n",
+				TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_down unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_min unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_max unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_freqs */
+	ret = rte_power_freqs(TEST_POWER_VM_LCORE_ID, NULL, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_freqs did not return the expected -ENOTSUP(%d) but "
+				"returned %d\n", -ENOTSUP, ret);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_get_freq */
+	ret = rte_power_get_freq(TEST_POWER_VM_LCORE_ID);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_get_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_set_freq */
+	ret = rte_power_set_freq(TEST_POWER_VM_LCORE_ID, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_set_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test removing of an lcore */
+	ret = rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_exit unexpectedly failed on valid lcore %u, "
+				"please ensure that the environment has been configured "
+				"correctly\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of previously removed lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency down of previously removed lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency min of previously removed lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency max of previously removed lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+	rte_power_unset_env();
+	return 0;
+fail_all:
+	rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_kvm_vm_cmd = {
+    .command = "power_kvm_vm_autotest",
+    .callback = test_power_kvm_vm,
+};
+REGISTER_TEST_COMMAND(power_kvm_vm_cmd);
-- 
1.9.3


* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (9 preceding siblings ...)
  2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 10/10] VM Power Management Unit Tests Alan Carew
@ 2014-10-13  6:17       ` Liu, Yong
  2014-10-13 20:26       ` Thomas Monjalon
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
  12 siblings, 0 replies; 97+ messages in thread
From: Liu, Yong @ 2014-10-13  6:17 UTC (permalink / raw)
  To: dev

Patch name:		VM Power Management
Brief description:	Verify VM power management in virtualized environments
Test Flag:		Tested-by 
Tester name:		yong.liu@intel.com
Test environment:
			OS: Fedora20 3.11.10-301.fc20.x86_64
			GCC: gcc version 4.8.3 20140911
			CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
			NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb]
Test Tool Chain information:	
			Qemu: 1.6.1
			libvirt: 1.1.3
			Guest OS: Fedora20 3.11.10-301.fc20.x86_64
			Guest GCC: gcc version 4.8.3 20140624
			
Commit ID:		72d3e7ad3183f42f8b9fb3bb1c12b3e1b39eef39

Detailed Testing information	
DPDK SW Configuration:
			Default x86_64-native-linuxapp-gcc configuration
Test Result Summary: 	Total 7 cases, 7 passed, 0 failed

Test Case - name:
			VM Power Management Channel
Test Case - Description:
			Check that VM power management communication channels can be connected successfully
Test Case - command / instruction:
			Create a folder in the system temporary filesystem for the power monitor sockets
				mkdir -p /tmp/powermonitor
					
			Configure VM XML and pin VCPUs to specified CPUs
				<vcpu placement='static'>5</vcpu>
				<cputune>
					<vcpupin vcpu='0' cpuset='1'/>
					<vcpupin vcpu='1' cpuset='2'/>
					<vcpupin vcpu='2' cpuset='3'/>
					<vcpupin vcpu='3' cpuset='4'/>
					<vcpupin vcpu='4' cpuset='5'/>
				</cputune>
						
			Configure VM XML to set up virtio serial ports
				<channel type='unix'>
					<source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
					<target type='virtio' name='virtio.serial.port.poweragent.<channel_num>'/>
					<address type='virtio-serial' controller='0' bus='0' port='4'/>
				</channel>

			Run the power manager monitor on the host
				./build/vm_power_mgr -c 0x3 -n 4
			
			Start the VM and run guest_vm_power_mgr
				guest_vm_power_mgr -c 0x1f -n 4 -- -i
			
			Add the VM on the host and check that vm_power_mgr can read the CPU frequency

				vmpower> add_vm <vm_name>
				vmpower> add_channels <vm_name> all
				vmpower> get_cpu_freq <core_num>
				
			Check that the vCPU to physical CPU mapping is detected correctly
				vmpower> show_vm <vm_name>
		
Test Case - expected test result:
			VM power management communication channels are connected successfully and the host can read the VM core information
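
The channel naming used in this test case can be sketched in C as below. This is only an illustration of the convention shown in the VM XML above (host socket '/tmp/powermonitor/<vm_name>.<channel_num>', guest device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'). The helper names host_channel_path() and guest_channel_path() and the VM name "vm0" are assumptions made for the example and are not taken from the patch set.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Host socket directory and guest device prefix, as used in this report. */
#define HOST_CHANNEL_DIR     "/tmp/powermonitor"
#define GUEST_CHANNEL_PREFIX "/dev/virtio-ports/virtio.serial.port.poweragent"

/*
 * Hypothetical helper (not part of the patch set): build the host socket
 * path "<dir>/<vm_name>.<channel_num>". Returns 0 on success, -1 on bad
 * arguments or if the path does not fit in buf.
 */
static int
host_channel_path(char *buf, size_t len, const char *vm_name,
		unsigned channel_num)
{
	int n;

	if (buf == NULL || len == 0 || vm_name == NULL)
		return -1;
	n = snprintf(buf, len, "%s/%s.%u", HOST_CHANNEL_DIR, vm_name,
			channel_num);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: build the guest device path for one lcore. */
static int
guest_channel_path(char *buf, size_t len, unsigned lcore_num)
{
	int n;

	if (buf == NULL || len == 0)
		return -1;
	n = snprintf(buf, len, "%s.%u", GUEST_CHANNEL_PREFIX, lcore_num);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Usage: channel 2 of a VM named "vm0" (the VM name is made up). */
static void
channel_path_example(void)
{
	char path[128];

	assert(host_channel_path(path, sizeof(path), "vm0", 2) == 0);
	assert(strcmp(path, "/tmp/powermonitor/vm0.2") == 0);
	assert(guest_channel_path(path, sizeof(path), 2) == 0);
	assert(strcmp(path, GUEST_CHANNEL_PREFIX ".2") == 0);
}
```

Both helpers reject a result that would be truncated, so a long VM name produces an error rather than a path that points at the wrong socket.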

Test Case - name:
			VM Power Management NUMA
Test Case - Description:
			Check that VM power management can manage cores on different sockets
Test Case - command / instruction:
			Get core and socket information using cpu_layout
   
				./tools/cpu_layout.py

			Configure VM XML to pin VCPUs on Socket1:

			Repeat the steps of the VM Power Management Channel test case
		
			Check that the vCPU to physical CPU mapping is detected correctly
				vmpower> show_vm <vm_name>
			
Test Case - expected test result:
			VM power management communication channels are connected successfully and the correct VM core information is shown
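
The vCPU to physical CPU mapping matters for the later test cases: a request from guest lcore 0 is checked on host core 1 because vCPU 0 is pinned to CPU 1 in the <cputune> element of the first test case. A minimal sketch of that lookup follows. The struct vcpu_pin type, the example_pins table and pcpu_for_vcpu() are hypothetical names for illustration only, and the table simply restates the pinning from this report.

```c
#include <assert.h>
#include <stddef.h>

/* One <vcpupin vcpu='..' cpuset='..'/> entry (hypothetical type). */
struct vcpu_pin {
	unsigned vcpu;	/* virtual CPU number inside the guest */
	unsigned pcpu;	/* physical CPU the vCPU is pinned to */
};

/* Pinning copied from the <cputune> element in the channel test case. */
static const struct vcpu_pin example_pins[] = {
	{ 0, 1 }, { 1, 2 }, { 2, 3 }, { 3, 4 }, { 4, 5 },
};

/* Return the physical CPU for a vCPU, or -1 if the vCPU is not pinned. */
static int
pcpu_for_vcpu(const struct vcpu_pin *pins, size_t num_pins, unsigned vcpu)
{
	size_t i;

	if (pins == NULL)
		return -1;
	for (i = 0; i < num_pins; i++) {
		if (pins[i].vcpu == vcpu)
			return (int)pins[i].pcpu;
	}
	return -1;
}

/* Usage: guest lcore 0 maps to host core 1 with the pinning above. */
static void
pinning_example(void)
{
	size_t n = sizeof(example_pins) / sizeof(example_pins[0]);

	assert(pcpu_for_vcpu(example_pins, n, 0) == 1);
}
```

For the NUMA case only the table contents would change, with each cpuset value taken from the cores that cpu_layout.py reports for the chosen socket.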

Test Case - name:
			VM Scale CPU Frequency Down
Test Case - Description:
			Check that VM power management lets a VM scale down the frequency of its own cores
Test Case - command / instruction:
			Set up the VM power management environment
			
			Send a CPU frequency down hint to the host
			
				vmpower(guest)> set_cpu_freq 0 down
			
			Verify that the frequency of the physical CPU has been scaled down correctly
				vmpower> get_cpu_freq 1
				Core 1 frequency: 2700000

			Check that the frequencies of the other CPUs are not affected by the actions above
			
			Check that the other VMs still work correctly (if they use different CPUs)
			
			Repeat the above actions several times
			
Test Case - expected test result:
			The frequency of the VM's core can be scaled down correctly
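
When this check is automated, the monitor output can be parsed rather than read by eye. The sketch below assumes the output line has exactly the form shown above ("Core 1 frequency: 2700000") and that the value is in kHz, as is usual for cpufreq sysfs values. parse_core_freq() is a hypothetical helper and not part of vm_power_mgr.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse a line of the form
 * "Core <n> frequency: <value>" as printed by get_cpu_freq in this report.
 * Returns 0 and fills *core and *freq_khz on success, -1 otherwise.
 * The kHz unit is an assumption based on the cpufreq sysfs convention.
 */
static int
parse_core_freq(const char *line, unsigned *core, unsigned long *freq_khz)
{
	if (line == NULL || core == NULL || freq_khz == NULL)
		return -1;
	if (sscanf(line, " Core %u frequency: %lu", core, freq_khz) != 2)
		return -1;
	return 0;
}

/* Usage: compare the value read after a "down" hint with an earlier one. */
static void
parse_example(void)
{
	unsigned core;
	unsigned long before = 2800000UL;	/* value noted before the hint */
	unsigned long after;

	assert(parse_core_freq("Core 1 frequency: 2700000", &core, &after) == 0);
	assert(core == 1);
	assert(after < before);
}
```

The same helper can be reused for the up, min and max cases by changing only the comparison.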
			
Test Case - name:
			VM Scale CPU Frequency Up
Test Case - Description:
			Check that VM power management lets a VM scale up the frequency of its own cores
Test Case - command / instruction:
			Set up the VM power management environment
			
			Send a CPU frequency up hint to the host

				vmpower(guest)> set_cpu_freq 0 up
			
			Verify that the frequency of the physical CPU has been scaled up correctly
				vmpower> get_cpu_freq 1

			Check that the frequencies of the other CPUs are not affected by the actions above
			
			Check that the other VMs still work correctly (if they use different CPUs)
			
			Repeat the above actions several times
			
Test Case - expected test result:
			The frequency of the VM's core can be scaled up correctly
			
Test Case - name:
			VM Scale CPU Frequency to Min 
Test Case - Description:
			Check that VM power management lets a VM scale the frequency of its own cores to the minimum
Test Case - command / instruction:
			Set up the VM power management environment
			Send a hint to scale the CPU frequency to the minimum
			
				vmpower(guest)> set_cpu_freq 0 min

			Verify that the frequency of the physical CPU has been scaled to the minimum correctly

				vmpower> get_cpu_freq 1
				Core 1 frequency: 1200000

			Check that the frequencies of the other CPUs are not affected by the actions above
			Check that the other VMs still work correctly (if they use different CPUs)
			
Test Case - expected test result:
			The frequency of the VM's core can be scaled to the minimum correctly
			
Test Case - name:
			VM Scale CPU Frequency to Max
Test Case - Description:
			Check that VM power management lets a VM scale the frequency of its own cores to the maximum
Test Case - command / instruction:
			Set up the VM power management environment
			Send a hint to scale the CPU frequency to the maximum
			
				vmpower(guest)> set_cpu_freq 0 max

			Verify that the frequency of the physical CPU has been scaled to the maximum correctly

				vmpower> get_cpu_freq 1
				Core 1 frequency: 2800000
	
	
			Check that the frequencies of the other CPUs are not affected by the actions above
			Check that the other VMs still work correctly (if they use different CPUs)
			
Test Case - expected test result:
			The frequency of the VM's core can be scaled to the maximum correctly
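
The four scaling operations can be summarised with a small model. The ACPI cpufreq unit test in patch 10/10 expects frequency index 0 after rte_power_freq_max() and index total_freq_num - 1 after rte_power_freq_min(), with up and down moving one step and saturating at the ends. The sketch below models only that index arithmetic. struct freq_state and the model_* functions are hypothetical names, and the assumption that the host monitor applies guest hints with the same semantics comes from the cover letter stating that the monitor itself uses librte_power.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index 0 is the highest frequency, total - 1 the lowest (hypothetical). */
struct freq_state {
	uint32_t total;	/* number of available frequencies, must be > 0 */
	uint32_t idx;	/* current frequency index */
};

static int
model_freq_max(struct freq_state *s)
{
	if (s == NULL || s->total == 0)
		return -1;
	s->idx = 0;
	return 0;
}

static int
model_freq_min(struct freq_state *s)
{
	if (s == NULL || s->total == 0)
		return -1;
	s->idx = s->total - 1;
	return 0;
}

/* One step towards a higher frequency; stays at index 0 when at max. */
static int
model_freq_up(struct freq_state *s)
{
	if (s == NULL || s->total == 0)
		return -1;
	if (s->idx > 0)
		s->idx--;
	return 0;
}

/* One step towards a lower frequency; stays at total - 1 when at min. */
static int
model_freq_down(struct freq_state *s)
{
	if (s == NULL || s->total == 0)
		return -1;
	if (s->idx + 1 < s->total)
		s->idx++;
	return 0;
}

/* Usage: the same sequences the unit test walks through. */
static void
model_example(void)
{
	struct freq_state s = { 16, 5 };

	model_freq_min(&s);
	model_freq_up(&s);
	assert(s.idx == s.total - 2);	/* min, then one step up */
	model_freq_max(&s);
	model_freq_down(&s);
	assert(s.idx == 1);		/* max, then one step down */
}
```

With the values in this report, 2800000 would sit at index 0 and 1200000 at the last index, but the number of steps in between depends on the CPU and is not stated here.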
			
Test Case - name:
			VM Power Management Multi VMs
Test Case - Description:
			Check that VM power management supports multiple VMs
Test Case - command / instruction:
			Set up the VM power management environment for VM1

			Set up the VM power management environment for VM2

			Run the power manager on the host

				./build/vm_power_mgr -c 0x3 -n 4

			Start VM1 and VM2

			Add VM1 on the host and check that vm_power_mgr can read the CPU frequency

				vmpower> add_vm <vm1_name>
				vmpower> add_channels <vm1_name> all
				vmpower> get_cpu_freq <core_num>

			Add VM2 on the host and check that vm_power_mgr can read the CPU frequency

				vmpower> add_vm <vm2_name>
				vmpower> add_channels <vm2_name> all
				vmpower> get_cpu_freq <core_num>		

			Check that the CPU frequency of VM1 and VM2 can be modified by guest_cli

			Power off VM2 and remove VM2 from the host vm_power_mgr

				vmpower> rm_vm <vm2_name>

Test Case - expected test result:
			VM power management supports adding and removing multiple VMs
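
The host-side command sequence in the multi-VM case is the same for every VM, so it could be generated rather than typed. A minimal Python sketch, using only the vmpower> commands shown above (add_vm, add_channels, get_cpu_freq, rm_vm); the function names and the (vm_name, core_num) input format are assumptions made for illustration.

```python
def registration_commands(vms):
    """Build the vmpower> commands that register each VM on the host.

    vms is a list of (vm_name, core_num) pairs, where core_num is the
    physical core queried after the VM's channels have been added.
    """
    commands = []
    for vm_name, core_num in vms:
        commands.append(f"add_vm {vm_name}")
        commands.append(f"add_channels {vm_name} all")
        commands.append(f"get_cpu_freq {core_num}")
    return commands


def removal_commands(vm_name):
    """Build the vmpower> command that removes a powered-off VM."""
    return [f"rm_vm {vm_name}"]
```

For two VMs the first function returns six commands in the order used by the test case; how they are fed to the vm_power_mgr prompt is left to the test harness.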

> -----Original Message-----
> From: Carew, Alan
> Sent: Monday, October 13, 2014 3:36 AM
> To: dev@dpdk.org
> Cc: Liu, Yong
> Subject: [PATCH v4 00/10] VM Power Management
> 
> Virtual Machine Power Management.
> 
> The following patches add two DPDK sample applications and an alternate
> implementation of librte_power for use in virtualized environments.
> The idea is to provide librte_power functionality from within a VM to address
> the lack of MSRs to facilitate frequency changes from within a VM.
> It is ideally suited for Haswell which provides per core frequency scaling.
> 
> The current librte_power affects frequency changes via the acpi-cpufreq
> 'userspace' power governor, accessed via sysfs.
> 
> General Overview:(more information in each patch that follows).
> The VM Power Management solution provides two components:
> 
>  1)VM: Allows a DPDK application in a VM to reuse the librte_power
>  interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
>  where the re-implementation of librte_power simply forwards the requests
>  for frequency change to a host-based monitor. The host monitor itself uses
>  librte_power.
>  Each lcore channel corresponds to a
>  serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
>  which is opened in non-blocking mode.
>  While each virtual CPU can be mapped to multiple physical CPUs, it is
>  recommended that each vCPU be mapped to a single core only.
> 
>  2)Host: The host monitor is managed by a CLI. It allows for adding qemu/KVM
>  virtual machines and associated channels to the monitor, manually changing
>  CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and
>  managing channels.
>  Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX
>  file sockets which follow a specific naming convention,
>  i.e. /tmp/powermonitor/<vm_name>.<channel_number>;
>  each channel has a 1:1 mapping to a VM endpoint,
>  i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>
>  Host channel endpoints are opened in non-blocking mode and are
>  monitored via epoll.
>  Requests over each channel to change frequency are forwarded to the
>  original librte_power.
> 
> Channels must be manually configured as qemu-kvm command line
> arguments or in the libvirt domain definition (XML), e.g.
> <controller type='virtio-serial' index='0'>
>  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
> </controller>
> <channel type='unix'>
>   <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
>   <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>'/>
>   <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
> </channel>
> 
> Multiple channels can be configured by specifying multiple <channel>
> elements, replacing <vm_name> and <channel_num> in each.
> <N> (port number) should be incremented by 1 for each new channel
> element.
> More information on Virtio-Serial can be found here:
> http://fedoraproject.org/wiki/Features/VirtioSerial
> To enable the hypervisor to create channels, the host endpoint directory
> must be created with qemu permissions:
> mkdir /tmp/powermonitor
> chown qemu:qemu /tmp/powermonitor
> 
> The host application runs on two separate lcores:
> Core N) CLI: For management of virtual machines, adding channels to the
>  monitor thread, inspecting state and manually setting CPU frequency
>  [PATCH 02/09]
> Core N+1) Monitor Thread: An epoll-based infinite loop that waits on channel
>  events from VMs and calls the corresponding librte_power functions.
> 
> A sample application is also provided to run on virtual machines. This
> application provides a CLI to manually set the frequency of a
> vCPU [PATCH 08/09]
> 
> The current l3fwd-power sample application can also be run on a VM.
> 
> Changes in V4:
>  Fixed double free of channel during VM shutdown.
> 
> Changes in V3:
>  Fixed crash in Guest CLI when host application is not running.
>  Renamed #defines to be more specific to the module they belong
>  Added vCPU pinning via CLI
> 
> Changes in V2:
>  Runtime selection of librte_power implementations.
>  Updated Unit tests to cover librte_power changes.
>  PATCH[0/3] was sent twice, again as PATCH[0/4]
>  Miscellaneous fixes.
> 
> Alan Carew (10):
>   Channel Manager and Monitor for VM Power Management(Host).
>   VM Power Management CLI(Host).
>   CPU Frequency Power Management(Host).
>   VM Power Management application and Makefile.
>   VM Power Management CLI(Guest).
>   VM communication channels for VM Power Management(Guest).
>   librte_power common interface for Guest and Host
>   Packet format for VM Power Management(Host and Guest).
>   Build system integration for VM Power Management(Guest and Host)
>   VM Power Management Unit Tests
> 
>  app/test/Makefile                                  |   3 +-
>  app/test/autotest_data.py                          |  26 +
>  app/test/test_power.c                              | 445 +-----------
>  app/test/test_power_acpi_cpufreq.c                 | 544 ++++++++++++++
>  app/test/test_power_kvm_vm.c                       | 308 ++++++++
>  examples/vm_power_manager/Makefile                 |  57 ++
>  examples/vm_power_manager/channel_manager.c        | 804 +++++++++++++++++++++
>  examples/vm_power_manager/channel_manager.h        | 314 ++++++++
>  examples/vm_power_manager/channel_monitor.c        | 231 ++++++
>  examples/vm_power_manager/channel_monitor.h        | 102 +++
>  examples/vm_power_manager/guest_cli/Makefile       |  56 ++
>  examples/vm_power_manager/guest_cli/main.c         |  87 +++
>  examples/vm_power_manager/guest_cli/main.h         |  52 ++
>  .../guest_cli/vm_power_cli_guest.c                 | 155 ++++
>  .../guest_cli/vm_power_cli_guest.h                 |  55 ++
>  examples/vm_power_manager/main.c                   | 117 +++
>  examples/vm_power_manager/main.h                   |  52 ++
>  examples/vm_power_manager/power_manager.c          | 244 +++++++
>  examples/vm_power_manager/power_manager.h          | 188 +++++
>  examples/vm_power_manager/vm_power_cli.c           | 669 +++++++++++++++++
>  examples/vm_power_manager/vm_power_cli.h           |  47 ++
>  lib/librte_power/Makefile                          |   3 +-
>  lib/librte_power/channel_commands.h                |  77 ++
>  lib/librte_power/guest_channel.c                   | 162 +++++
>  lib/librte_power/guest_channel.h                   |  89 +++
>  lib/librte_power/rte_power.c                       | 540 ++------------
>  lib/librte_power/rte_power.h                       | 120 ++-
>  lib/librte_power/rte_power_acpi_cpufreq.c          | 545 ++++++++++++++
>  lib/librte_power/rte_power_acpi_cpufreq.h          | 192 +++++
>  lib/librte_power/rte_power_common.h                |  39 +
>  lib/librte_power/rte_power_kvm_vm.c                | 135 ++++
>  lib/librte_power/rte_power_kvm_vm.h                | 179 +++++
>  32 files changed, 5725 insertions(+), 912 deletions(-)
>  create mode 100644 app/test/test_power_acpi_cpufreq.c
>  create mode 100644 app/test/test_power_kvm_vm.c
>  create mode 100644 examples/vm_power_manager/Makefile
>  create mode 100644 examples/vm_power_manager/channel_manager.c
>  create mode 100644 examples/vm_power_manager/channel_manager.h
>  create mode 100644 examples/vm_power_manager/channel_monitor.c
>  create mode 100644 examples/vm_power_manager/channel_monitor.h
>  create mode 100644 examples/vm_power_manager/guest_cli/Makefile
>  create mode 100644 examples/vm_power_manager/guest_cli/main.c
>  create mode 100644 examples/vm_power_manager/guest_cli/main.h
>  create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
>  create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
>  create mode 100644 examples/vm_power_manager/main.c
>  create mode 100644 examples/vm_power_manager/main.h
>  create mode 100644 examples/vm_power_manager/power_manager.c
>  create mode 100644 examples/vm_power_manager/power_manager.h
>  create mode 100644 examples/vm_power_manager/vm_power_cli.c
>  create mode 100644 examples/vm_power_manager/vm_power_cli.h
>  create mode 100644 lib/librte_power/channel_commands.h
>  create mode 100644 lib/librte_power/guest_channel.c
>  create mode 100644 lib/librte_power/guest_channel.h
>  create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
>  create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
>  create mode 100644 lib/librte_power/rte_power_common.h
>  create mode 100644 lib/librte_power/rte_power_kvm_vm.c
>  create mode 100644 lib/librte_power/rte_power_kvm_vm.h
> 
> --
> 1.9.3
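
The naming convention and the libvirt channel definition in the cover letter quoted above can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, and starting the port numbers at 1 is an assumption, since the cover letter only says that <N> is incremented by 1 for each new channel element.

```python
import xml.etree.ElementTree as ET

HOST_DIR = "/tmp/powermonitor"
GUEST_PORT_PREFIX = "virtio.serial.port.poweragent"


def host_endpoint(vm_name, channel_num):
    """Path of the AF_UNIX socket the host monitor opens for one channel."""
    return f"{HOST_DIR}/{vm_name}.{channel_num}"


def guest_endpoint(lcore_num):
    """Path of the serial device the guest lcore opens for its channel."""
    return f"/dev/virtio-ports/{GUEST_PORT_PREFIX}.{lcore_num}"


def channel_element(vm_name, channel_num, port):
    """Build one <channel> element in the form shown in the cover letter."""
    channel = ET.Element("channel", {"type": "unix"})
    ET.SubElement(channel, "source",
                  {"mode": "bind", "path": host_endpoint(vm_name, channel_num)})
    ET.SubElement(channel, "target",
                  {"type": "virtio", "name": f"{GUEST_PORT_PREFIX}.{channel_num}"})
    ET.SubElement(channel, "address",
                  {"type": "virtio-serial", "controller": "0", "bus": "0",
                   "port": str(port)})
    return channel


def channel_elements(vm_name, num_channels, first_port=1):
    """Build one <channel> per lcore; first_port=1 is an assumed default."""
    return [channel_element(vm_name, n, first_port + n)
            for n in range(num_channels)]
```

Each element can be serialized with ET.tostring(element, encoding="unicode") and pasted into the domain definition alongside the virtio-serial controller.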

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (10 preceding siblings ...)
  2014-10-13  6:17       ` [dpdk-dev] [PATCH v4 00/10] VM Power Management Liu, Yong
@ 2014-10-13 20:26       ` Thomas Monjalon
  2014-10-14 12:37         ` Carew, Alan
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
  12 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2014-10-13 20:26 UTC (permalink / raw)
  To: Alan Carew; +Cc: dev

Hi Alan,

2014-10-12 20:36, Alan Carew:
> The following patches add two DPDK sample applications and an alternate
> implementation of librte_power for use in virtualized environments.
> The idea is to provide librte_power functionality from within a VM to address
> the lack of MSRs to facilitate frequency changes from within a VM.
> It is ideally suited for Haswell which provides per core frequency scaling.
> 
> The current librte_power affects frequency changes via the acpi-cpufreq
> 'userspace' power governor, accessed via sysfs.

Something kept me from looking deeper into this big codebase,
but I could not pin down what felt wrong.
Now I realize: the real problem is that virtualization transparency is
broken for power management. So the right thing to do is to fix it in
KVM. I think this whole patchset is a huge workaround.

Did you try to fix it with Qemu/KVM?

-- 
Thomas

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-10-13 20:26       ` Thomas Monjalon
@ 2014-10-14 12:37         ` Carew, Alan
  2014-10-14 15:03           ` Thomas Monjalon
  0 siblings, 1 reply; 97+ messages in thread
From: Carew, Alan @ 2014-10-14 12:37 UTC (permalink / raw)
  To: dev

Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, October 13, 2014 9:26 PM
> To: Carew, Alan
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
> 
> Hi Alan,
> 
> 2014-10-12 20:36, Alan Carew:
> > The following patches add two DPDK sample applications and an alternate
> > implementation of librte_power for use in virtualized environments.
> > The idea is to provide librte_power functionality from within a VM to address
> > the lack of MSRs to facilitate frequency changes from within a VM.
> > It is ideally suited for Haswell which provides per core frequency scaling.
> >
> > The current librte_power affects frequency changes via the acpi-cpufreq
> > 'userspace' power governor, accessed via sysfs.
> 
> Something was preventing me from looking deeper in this big codebase,
> but I didn't know what sounds weird.
> Now I realize: the real problem is that virtualization transparency is
> broken for power management. So the right thing to do is to fix it in
> KVM. I think all this patchset is a huge workaround.
> 
> Did you try to fix it with Qemu/KVM?
> 
> --
> Thomas

When looking at the libvirt API it would seem to be a natural fit to have power management sitting there, so in essence I would agree.

However, with a DPDK solution it would be possible to re-use the message bus to pass information like device stats, application state, D-state requests etc. to the host and allow a management layer (e.g. OpenStack) to make informed decisions.

Also, the scope of adding power management to qemu/KVM would be huge. The easier path is not always the best one, though, and power management in VMs is both a DPDK problem (given that librte_power only worked on the host) and a general virtualization problem, one that would be better solved by those with direct knowledge of the Qemu/KVM architecture and influence on the direction of the Qemu project.

As it stands, the host backend is simply an example application that can be replaced by a VMM or orchestration layer. By using Virtio-Serial it has obvious leanings to Qemu, but even this could easily be swapped out for XenBus, IVSHMEM, IP etc.

If power management is eventually supported by hypervisors directly, then we could also enable the option to switch to that environment. Currently the librte_power implementations (VM or host) can be selected dynamically (environment auto-detection) or explicitly via rte_power_set_env(); adding an arbitrary number of environments is relatively easy.
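
To illustrate the selection mechanism, here is a minimal Python sketch of the idea rather than the librte_power code itself. Every name in it (HostBackend, VmBackend, BACKENDS, select_backend, running_in_vm) is hypothetical; the real library is written in C and exposes rte_power_set_env() for the explicit case.

```python
class HostBackend:
    """Stand-in for the host implementation that drives cpufreq directly."""

    def set_freq(self, lcore, index):
        return f"host: set lcore {lcore} to frequency index {index}"


class VmBackend:
    """Stand-in for the guest implementation that forwards to the host."""

    def set_freq(self, lcore, index):
        return f"vm: forward request for lcore {lcore}, index {index}"


# Registry of environments; supporting a new one means adding an entry.
BACKENDS = {"host": HostBackend, "vm": VmBackend}


def select_backend(env=None, running_in_vm=False):
    """Pick a backend explicitly by name, or auto-detect when env is None.

    running_in_vm stands in for real environment detection, which this
    sketch does not attempt.
    """
    if env is None:
        env = "vm" if running_in_vm else "host"
    if env not in BACKENDS:
        raise ValueError(f"unknown power environment: {env!r}")
    return BACKENDS[env]()
```

The application calls the same set_freq() interface in either case; a future hypervisor-native environment would be one more entry in the registry.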

I hope this helps to clarify the approach.

Thanks,
Alan.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-10-14 12:37         ` Carew, Alan
@ 2014-10-14 15:03           ` Thomas Monjalon
  2014-10-16 15:21             ` Carew, Alan
  0 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2014-10-14 15:03 UTC (permalink / raw)
  To: Carew, Alan; +Cc: dev

2014-10-14 12:37, Carew, Alan:
> > > The following patches add two DPDK sample applications and an alternate
> > > implementation of librte_power for use in virtualized environments.
> > > The idea is to provide librte_power functionality from within a VM to address
> > > the lack of MSRs to facilitate frequency changes from within a VM.
> > > It is ideally suited for Haswell which provides per core frequency scaling.
> > >
> > > The current librte_power affects frequency changes via the acpi-cpufreq
> > > 'userspace' power governor, accessed via sysfs.
> > 
> > Something was preventing me from looking deeper in this big codebase,
> > but I didn't know what sounds weird.
> > Now I realize: the real problem is that virtualization transparency is
> > broken for power management. So the right thing to do is to fix it in
> > KVM. I think all this patchset is a huge workaround.
> > 
> > Did you try to fix it with Qemu/KVM?
> 
> When looking at the libvirt API it would seem to be a natural fit to have
> power management sitting there, so in essence I would agree.
> 
> However with a DPDK solution it would be possible to re-use the message bus
> to pass information like device stats, application state, D-state requests
> etc. to the host and allow for management layer(e.g. OpenStack) to make
> informed decisions.

I think that management information should be transmitted over a management
channel. Such a solution should exist in OpenStack.

> Also, the scope of adding power management to qemu/KVM would be huge;
> while the easier path is not always the best and the problem of power
> management in VMs is both a DPDK problem (given that librte_power only
> worked on the host) and a general virtualization problem that would be
> better solved by those with direct knowledge of Qemu/KVM architecture
> and influence on the direction of the Qemu project.

Being a huge effort is not an argument.
Please check with Qemu community, they'll welcome it.
 
> As it stands, the host backend is simply an example application that can
> be replaced by a VMM or Orchestration layer, by using Virtio-Serial it has
> obvious leanings to Qemu, but even this could be easily swapped out for
> XenBus, IVSHMEM, IP etc.
> 
> If power management is to be eventually supported by Hypervisors directly
> then we could also enable to option to switch to that environment, currently
> the librte_power implementations (VM or Host) can be selected dynamically
> (environment auto-detection) or explicitly via rte_power_set_env(), adding
> an arbitrary number of environments is relatively easy.

Yes, you are adding a new layer to work around what the hypervisor lacks. And this
layer will handle native support when it exists. But if you implement native
support now, we don't need this extra layer.

> I hope this helps to clarify the approach.

Thanks for your explanation.

-- 
Thomas

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-10-14 15:03           ` Thomas Monjalon
@ 2014-10-16 15:21             ` Carew, Alan
  2014-10-28 15:21               ` Thomas Monjalon
  0 siblings, 1 reply; 97+ messages in thread
From: Carew, Alan @ 2014-10-16 15:21 UTC (permalink / raw)
  To: dev

Hi Thomas,

> > However with a DPDK solution it would be possible to re-use the message bus
> > to pass information like device stats, application state, D-state requests
> > etc. to the host and allow for management layer(e.g. OpenStack) to make
> > informed decisions.
> 
> I think that management informations should be transmitted in a management
> channel. Such solution should exist in OpenStack.

Perhaps it does, but this solution is not exclusive to OpenStack and just a potential use case.

> 
> > Also, the scope of adding power management to qemu/KVM would be huge;
> > while the easier path is not always the best and the problem of power
> > management in VMs is both a DPDK problem (given that librte_power only
> > worked on the host) and a general virtualization problem that would be
> > better solved by those with direct knowledge of Qemu/KVM architecture
> > and influence on the direction of the Qemu project.
> 
> Being a huge effort is not an argument.

I agree completely; that was implied by what followed the conjunction.

> Please check with Qemu community, they'll welcome it.
> 
> > As it stands, the host backend is simply an example application that can
> > be replaced by a VMM or Orchestration layer, by using Virtio-Serial it has
> > obvious leanings to Qemu, but even this could be easily swapped out for
> > XenBus, IVSHMEM, IP etc.
> >
> > If power management is to be eventually supported by Hypervisors directly
> > then we could also enable to option to switch to that environment, currently
> > the librte_power implementations (VM or Host) can be selected dynamically
> > (environment auto-detection) or explicitly via rte_power_set_env(), adding
> > an arbitrary number of environments is relatively easy.
> 
> Yes, you are adding a new layer to workaround hypervisor lacks. And this layer
> will handle native support when it will exist. But if you implement native
> support now, we don't need this extra layer.

Indeed, but we have a solution implemented now. Yes, it is a workaround, at least until hypervisors support such functionality. It is possible that whatever solutions for power management present themselves in the future may also require workarounds; us-vhost is an example of such a workaround introduced to DPDK.

> 
> > I hope this helps to clarify the approach.
> 
> Thanks for your explanation.

Thanks for the feedback.

> 
> --
> Thomas

Alan.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-10-16 15:21             ` Carew, Alan
@ 2014-10-28 15:21               ` Thomas Monjalon
  2014-11-10  9:05                 ` Carew, Alan
  0 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2014-10-28 15:21 UTC (permalink / raw)
  To: Carew, Alan; +Cc: dev

Hi Alan,

Did you make any progress with the Qemu/KVM community?
We need to be sync'ed up with them to be sure we share the same goal.
I also want to avoid using a solution which doesn't fit with their plan.
Remember that we already had this problem with ivshmem which was planned
to be dropped.

Thanks
-- 
Thomas


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-10-28 15:21               ` Thomas Monjalon
@ 2014-11-10  9:05                 ` Carew, Alan
  2014-11-10 17:54                   ` O'driscoll, Tim
  0 siblings, 1 reply; 97+ messages in thread
From: Carew, Alan @ 2014-11-10  9:05 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Hi Thomas,

> Hi Alan,
> 
> Did you make any progress in Qemu/KVM community?
> We need to be sync'ed up with them to be sure we share the same goal.
> I want also to avoid using a solution which doesn't fit with their plan.
> Remember that we already had this problem with ivshmem which was
> planned to be dropped.
> 
> Thanks
> --
> Thomas
> 
> 

Unfortunately, I have not yet received any feedback:
http://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg01103.html

Alan.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-11-10  9:05                 ` Carew, Alan
@ 2014-11-10 17:54                   ` O'driscoll, Tim
  2014-11-21 23:51                     ` Zhu, Heqing
  2014-11-22 17:17                     ` Vincent JARDIN
  0 siblings, 2 replies; 97+ messages in thread
From: O'driscoll, Tim @ 2014-11-10 17:54 UTC (permalink / raw)
  To: Carew, Alan, Thomas Monjalon; +Cc: dev

> From: Carew, Alan
> 
> > Did you make any progress in Qemu/KVM community?
> > We need to be sync'ed up with them to be sure we share the same goal.
> > I want also to avoid using a solution which doesn't fit with their plan.
> > Remember that we already had this problem with ivshmem which was
> > planned to be dropped.
> >
. . .
> 
> Unfortunately, I have not yet received any feedback:
> http://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg01103.html

Just to add to what Alan said above, this capability does not exist in qemu at the moment, and based on there having been no feedback on the qemu mailing list so far, I think it's reasonable to assume that it will not be implemented in the immediate future. The VM Power Management feature has also been designed to allow easy migration to a qemu-based solution when this is supported in future. Therefore, I'd be in favour of accepting this feature into DPDK now.

It's true that the implementation is a work-around, but there have been similar cases in DPDK in the past. One recent example that comes to mind is userspace vhost. The original implementation could also be considered a work-around, but it met the needs of many in the community. Now, with support for vhost-user in qemu 2.1, that implementation is being improved. I'd see VM Power Management following a similar path when this capability is supported in qemu.

Tim

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-11-10 17:54                   ` O'driscoll, Tim
@ 2014-11-21 23:51                     ` Zhu, Heqing
  2014-11-22 17:17                     ` Vincent JARDIN
  1 sibling, 0 replies; 97+ messages in thread
From: Zhu, Heqing @ 2014-11-21 23:51 UTC (permalink / raw)
  To: O'driscoll, Tim, Carew, Alan, Thomas Monjalon; +Cc: dev

Pablo just sent a new patch set. This is a significant effort and it addresses a valid technical problem statement.
I support merging this feature into the DPDK mainline.

IMHO, the previous *rejection* reasons are not solid. It is important to encourage real contributions like this.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-11-10 17:54                   ` O'driscoll, Tim
  2014-11-21 23:51                     ` Zhu, Heqing
@ 2014-11-22 17:17                     ` Vincent JARDIN
  2014-12-09 17:35                       ` Paolo Bonzini
  1 sibling, 1 reply; 97+ messages in thread
From: Vincent JARDIN @ 2014-11-22 17:17 UTC (permalink / raw)
  To: dev, qemu-devel@nongnu.org Developers; +Cc: Paolo Bonzini

Tim,

cc-ing Paolo and qemu-devel@ again in order to get their take on it.

>>> Did you make any progress in Qemu/KVM community?
>>> We need to be sync'ed up with them to be sure we share the same goal.
>>> I want also to avoid using a solution which doesn't fit with their plan.
>>> Remember that we already had this problem with ivshmem which was
>>> planned to be dropped.
>>>

>> Unfortunately, I have not yet received any feedback:
>> http://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg01103.html
>
> Just to add to what Alan said above, this capability does not exist in qemu at the moment, and based on there having been no feedback on the qemu mailing list so far, I think it's reasonable to assume that it will not be implemented in the immediate future. The VM Power Management feature has also been designed to allow easy migration to a qemu-based solution when this is supported in future. Therefore, I'd be in favour of accepting this feature into DPDK now.
>
> It's true that the implementation is a work-around, but there have been similar cases in DPDK in the past. One recent example that comes to mind is userspace vhost. The original implementation could also be considered a work-around, but it met the needs of many in the community. Now, with support for vhost-user in qemu 2.1, that implementation is being improved. I'd see VM Power Management following a similar path when this capability is supported in qemu.

Best regards,
   Vincent

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-11-22 17:17                     ` Vincent JARDIN
@ 2014-12-09 17:35                       ` Paolo Bonzini
  2014-12-11 23:18                         ` Thomas Monjalon
  0 siblings, 1 reply; 97+ messages in thread
From: Paolo Bonzini @ 2014-12-09 17:35 UTC (permalink / raw)
  To: Vincent JARDIN, dev, qemu-devel

I had replied to this message, but my reply never got to the list.
Let's try again.

I wonder if this might be papering over a bug in the host cpufreq
driver.  If the guest is not doing much and leaving a lot of idle CPU
time, the host should scale down the frequency of that CPU.  In the case
of pinned VCPUs this should really "just work".  What is the problem
that is being solved?

Paolo

On 22/11/2014 18:17, Vincent JARDIN wrote:
> Tim,
> 
> cc-ing Paolo and qemu-devel@ again in order to get their take on it.
> 
>>>> Did you make any progress in Qemu/KVM community?
>>>> We need to be sync'ed up with them to be sure we share the same goal.
>>>> I want also to avoid using a solution which doesn't fit with their
>>>> plan.
>>>> Remember that we already had this problem with ivshmem which was
>>>> planned to be dropped.
>>>>
> 
>>> Unfortunately, I have not yet received any feedback:
>>> http://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg01103.html
>>
>> Just to add to what Alan said above, this capability does not exist in
>> qemu at the moment, and based on there having been no feedback on the
>> qemu mailing list so far, I think it's reasonable to assume that it
>> will not be implemented in the immediate future. The VM Power
>> Management feature has also been designed to allow easy migration to a
>> qemu-based solution when this is supported in future. Therefore, I'd
>> be in favour of accepting this feature into DPDK now.
>>
>> It's true that the implementation is a work-around, but there have
>> been similar cases in DPDK in the past. One recent example that comes
>> to mind is userspace vhost. The original implementation could also be
>> considered a work-around, but it met the needs of many in the
>> community. Now, with support for vhost-user in qemu 2.1, that
>> implementation is being improved. I'd see VM Power Management
>> following a similar path when this capability is supported in qemu.
> 
> Best regards,
>   Vincent
> 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-12-09 17:35                       ` Paolo Bonzini
@ 2014-12-11 23:18                         ` Thomas Monjalon
  2014-12-12 13:00                           ` Carew, Alan
  0 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2014-12-11 23:18 UTC (permalink / raw)
  To: Alan Carew, Pablo de Lara; +Cc: dev, Paolo Bonzini, qemu-devel

2014-12-09 18:35, Paolo Bonzini:
> >>>> Did you make any progress in Qemu/KVM community?
> >>>> We need to be sync'ed up with them to be sure we share the same goal.
> >>>> I want also to avoid using a solution which doesn't fit with their
> >>>> plan.
> >>>> Remember that we already had this problem with ivshmem which was
> >>>> planned to be dropped.
> >>> 
> >>> Unfortunately, I have not yet received any feedback:
> >>> http://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg01103.html
> >>
> >> Just to add to what Alan said above, this capability does not exist in
> >> qemu at the moment, and based on there having been no feedback on the
> >> qemu mailing list so far, I think it's reasonable to assume that it
> >> will not be implemented in the immediate future. The VM Power
> >> Management feature has also been designed to allow easy migration to a
> >> qemu-based solution when this is supported in future. Therefore, I'd
> >> be in favour of accepting this feature into DPDK now.
> >>
> >> It's true that the implementation is a work-around, but there have
> >> been similar cases in DPDK in the past. One recent example that comes
> >> to mind is userspace vhost. The original implementation could also be
> >> considered a work-around, but it met the needs of many in the
> >> community. Now, with support for vhost-user in qemu 2.1, that
> >> implementation is being improved. I'd see VM Power Management
> >> following a similar path when this capability is supported in qemu.
> 
> I wonder if this might be papering over a bug in the host cpufreq
> driver.  If the guest is not doing much and leaving a lot of idle CPU
> time, the host should scale down the frequency of that CPU.  In the case
> of pinned VCPUs this should really "just work".  What is the problem
> that is being solved?
> 
> Paolo

Alan, Pablo, please could you explain your logic with VM power management?

-- 
Thomas

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-12-11 23:18                         ` Thomas Monjalon
@ 2014-12-12 13:00                           ` Carew, Alan
  2014-12-12 14:50                             ` Paolo Bonzini
  0 siblings, 1 reply; 97+ messages in thread
From: Carew, Alan @ 2014-12-12 13:00 UTC (permalink / raw)
  To: Thomas Monjalon, De Lara Guarch, Pablo; +Cc: dev, Paolo Bonzini, qemu-devel

Hi Paolo,

> 2014-12-09 18:35, Paolo Bonzini:
> > >>>> Did you make any progress in Qemu/KVM community?
> > >>>> We need to be sync'ed up with them to be sure we share the same goal.
> > >>>> I want also to avoid using a solution which doesn't fit with
> > >>>> their plan.
> > >>>> Remember that we already had this problem with ivshmem which was
> > >>>> planned to be dropped.
> > >>>
> > >>> Unfortunately, I have not yet received any feedback:
> > >>> http://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg01103.html
> > >>
> > >> Just to add to what Alan said above, this capability does not exist
> > >> in qemu at the moment, and based on there having been no feedback
> > >> on the qemu mailing list so far, I think it's reasonable to assume
> > >> that it will not be implemented in the immediate future. The VM
> > >> Power Management feature has also been designed to allow easy
> > >> migration to a qemu-based solution when this is supported in
> > >> future. Therefore, I'd be in favour of accepting this feature into
> > >> DPDK now.
> > >>
> > >> It's true that the implementation is a work-around, but there have
> > >> been similar cases in DPDK in the past. One recent example that
> > >> comes to mind is userspace vhost. The original implementation could
> > >> also be considered a work-around, but it met the needs of many in
> > >> the community. Now, with support for vhost-user in qemu 2.1, that
> > >> implementation is being improved. I'd see VM Power Management
> > >> following a similar path when this capability is supported in qemu.
> >
> > I wonder if this might be papering over a bug in the host cpufreq
> > driver.  If the guest is not doing much and leaving a lot of idle CPU
> > time, the host should scale down the frequency of that CPU.  In the
> > case of pinned VCPUs this should really "just work".  What is the
> > problem that is being solved?
> >
> > Paolo
> 
> Alan, Pablo, please could you explain your logic with VM power
> management?
> 
> --
> Thomas

The problem is deterministic control of host CPU frequency and the DPDK usage
model.
A hands-off power governor will scale based on workload, whether this is a host
application or VM, so no problems or bug there.

Where this solution fits is where an application wants to control its own
power policy. For example, l3fwd_power uses the librte_power library to change
frequency via acpi_cpufreq based on application heuristics rather than
relying on an inbuilt policy such as ondemand or performance.

This ability has existed in DPDK for host usage for some time and VM power
management allows this use case to be extended to cater for virtual machines
by re-using the librte_power interface to encapsulate the VM->Host
comms and provide an example means of managing such communications.

 I hope this clears it up a bit.

Thanks,
Alan.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-12-12 13:00                           ` Carew, Alan
@ 2014-12-12 14:50                             ` Paolo Bonzini
  2014-12-12 16:10                               ` Thomas Monjalon
  0 siblings, 1 reply; 97+ messages in thread
From: Paolo Bonzini @ 2014-12-12 14:50 UTC (permalink / raw)
  To: Carew, Alan, Thomas Monjalon, De Lara Guarch, Pablo; +Cc: dev, qemu-devel



On 12/12/2014 14:00, Carew, Alan wrote:
> The problem is deterministic control of host CPU frequency and the DPDK usage
> model.
> A hands-off power governor will scale based on workload, whether this is a host
> application or VM, so no problems or bug there.
> 
> Where this solution fits is where an application wants to control its own
> power policy, for example l3fwd_power uses librte_power library to change
> frequency via acpi_cpufreq based on application heuristics rather than
> relying on an inbuilt policy for example ondemand or performance.
> 
> This ability has existed in DPDK for host usage for some time and VM power
> management allows this use case to be extended to cater for virtual machines
> by re-using the librte_power interface to encapsulate the VM->Host
> comms and provide an example means of managing such communications.
> 
>  I hope this clears it up a bit.

Ok, this looks specific enough that an out-of-band solution within DPDK
sounds like the best approach.  It seems unnecessary to involve the
hypervisor (neither KVM nor QEMU).

Paolo

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-12-12 14:50                             ` Paolo Bonzini
@ 2014-12-12 16:10                               ` Thomas Monjalon
  2014-12-12 16:13                                 ` Paolo Bonzini
  0 siblings, 1 reply; 97+ messages in thread
From: Thomas Monjalon @ 2014-12-12 16:10 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: dev, qemu-devel

2014-12-12 15:50, Paolo Bonzini:
> On 12/12/2014 14:00, Carew, Alan wrote:
> > The problem is deterministic control of host CPU frequency and the DPDK usage
> > model.
> > A hands-off power governor will scale based on workload, whether this is a host
> > application or VM, so no problems or bug there.
> > 
> > Where this solution fits is where an application wants to control its own
> > power policy, for example l3fwd_power uses librte_power library to change
> > frequency via acpi_cpufreq based on application heuristics rather than
> > relying on an inbuilt policy for example ondemand or performance.
> > 
> > This ability has existed in DPDK for host usage for some time and VM power
> > management allows this use case to be extended to cater for virtual machines
> > by re-using the librte_power interface to encapsulate the VM->Host
> > comms and provide an example means of managing such communications.
> > 
> >  I hope this clears it up a bit.
> 
> Ok, this looks specific enough that an out-of-band solution within DPDK
> sounds like the best approach.  It seems unnecessary to involve the
> hypervisor (neither KVM nor QEMU).

Paolo, I don't understand why you don't imagine controlling frequency scaling
of a pinned vCPU transparently?
In my understanding, we currently cannot control frequency scaling without
knowing whether we are in a VM or not.

-- 
Thomas

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/10] VM Power Management
  2014-12-12 16:10                               ` Thomas Monjalon
@ 2014-12-12 16:13                                 ` Paolo Bonzini
  0 siblings, 0 replies; 97+ messages in thread
From: Paolo Bonzini @ 2014-12-12 16:13 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, qemu-devel



On 12/12/2014 17:10, Thomas Monjalon wrote:
> > Ok, this looks specific enough that an out-of-band solution within DPDK
> > sounds like the best approach.  It seems unnecessary to involve the
> > hypervisor (neither KVM nor QEMU).
>
> Paolo, I don't understand why you don't imagine controlling frequency scaling
> of a pinned vCPU transparently?

Probably because I don't imagine controlling frequency scaling from the
application on bare metal, either. :)  It seems to me that this is just
working around limitations of the kernel.

Paolo

> In my understanding, we currently cannot control frequency scaling without
> knowing whether we are in a VM or not.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 00/10] Virtual Machine Power Management
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
                         ` (11 preceding siblings ...)
  2014-10-13 20:26       ` Thomas Monjalon
@ 2014-11-21 17:42       ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 01/10] Channel Manager and Monitor for VM Power Management(Host) Pablo de Lara
                           ` (10 more replies)
  12 siblings, 11 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

Virtual Machine Power Management.

The following patches add two DPDK sample applications and an alternate
implementation of librte_power for use in virtualized environments.
The idea is to provide librte_power functionality from within a VM to address
the lack of MSRs to facilitate frequency changes from within a VM.
It is ideally suited for Haswell which provides per core frequency scaling.

The current librte_power effects frequency changes via the acpi-cpufreq
'userspace' power governor, accessed via sysfs.

General Overview:(more information in each patch that follows).
The VM Power Management solution provides two components:

 1)VM: Allows a DPDK application in a VM to reuse the librte_power
 interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
 where the re-implementation of librte_power simply forwards the requests for
 frequency change to a host based monitor. The host monitor itself uses
 librte_power.
 Each lcore channel corresponds to a
 serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
 which is opened in non-blocking mode.
 While each Virtual CPU can be mapped to multiple physical CPUs it is
 recommended that each vCPU should be mapped to a single core only.

 2)Host: The host monitor is managed by a CLI, which allows for adding qemu/KVM
 virtual machines and associated channels to the monitor, manually changing
 CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
 channels.
 Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
 sockets which follow a specific naming convention,
 i.e. /tmp/powermonitor/<vm_name>.<channel_number>.
 Each channel has a 1:1 mapping to a VM endpoint,
 i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>
 Host channel endpoints are opened in non-blocking mode and are monitored via epoll.
 Requests over each channel to change frequency are forwarded to the original
 librte_power.
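
The naming convention above can be sketched roughly as follows. This is only
an illustration: the helper names (host_channel_path, guest_channel_path,
open_guest_channel) are hypothetical and are not taken from the patches; only
the two path formats and the non-blocking open come from the description.

```c
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: host-side AF_UNIX socket path for one VM channel. */
static int
host_channel_path(char *buf, size_t len, const char *vm_name, unsigned chan)
{
	int n = snprintf(buf, len, "/tmp/powermonitor/%s.%u", vm_name, chan);

	/* Treat truncation as an error so a partial path is never used. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: guest-side virtio-serial device path for one lcore. */
static int
guest_channel_path(char *buf, size_t len, unsigned lcore)
{
	int n = snprintf(buf, len,
			"/dev/virtio-ports/virtio.serial.port.poweragent.%u", lcore);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Open the guest endpoint in non-blocking mode; returns an fd or -1. */
static int
open_guest_channel(unsigned lcore)
{
	char path[128];

	if (guest_channel_path(path, sizeof(path), lcore) < 0)
		return -1;
	return open(path, O_RDWR | O_NONBLOCK);
}
```

With a 1:1 mapping, channel number 2 of a VM named "vm0" would pair
/tmp/powermonitor/vm0.2 on the host with the poweragent.2 device in the guest.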

Channels must be manually configured as qemu-kvm command line arguments or
libvirt domain definition(xml) e.g.
<controller type='virtio-serial' index='0'>
 <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<channel type='unix'>
  <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
  <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>'/>
  <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
</channel>

Multiple channels can be configured by specifying multiple <channel>
elements, replacing <vm_name> and <channel_num> in each.
<N> (the port number) should be incremented by 1 for each new channel element.
More information on Virtio-Serial can be found here:
http://fedoraproject.org/wiki/Features/VirtioSerial
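
Since each additional channel only differs in <channel_num> and <N>, the
elements could be generated rather than written by hand. The sketch below is
an assumption-laden illustration, not part of the patch set: the function
names are hypothetical, and the starting port number is left as a parameter
because the description does not fix it.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format one libvirt <channel> element for channel
 * 'chan' of 'vm_name' on virtio-serial port 'port'.
 * Returns the number of characters written, or -1 on truncation/error.
 */
static int
format_channel_xml(char *buf, size_t len, const char *vm_name,
		unsigned chan, unsigned port)
{
	int n = snprintf(buf, len,
		"<channel type='unix'>\n"
		"  <source mode='bind' path='/tmp/powermonitor/%s.%u'/>\n"
		"  <target type='virtio' name='virtio.serial.port.poweragent.%u'/>\n"
		"  <address type='virtio-serial' controller='0' bus='0' port='%u'/>\n"
		"</channel>\n", vm_name, chan, chan, port);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Emit 'n' channel elements, incrementing the port by 1 per channel. */
static void
print_channels(FILE *out, const char *vm_name, unsigned n, unsigned first_port)
{
	char buf[512];
	unsigned i;

	for (i = 0; i < n; i++) {
		if (format_channel_xml(buf, sizeof(buf), vm_name, i,
				first_port + i) > 0)
			fputs(buf, out);
	}
}
```

The output is meant to be pasted into the <devices> section of the domain
definition alongside the virtio-serial controller shown above.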
To enable the Hypervisor creation of channels, the host endpoint directory
must be created with qemu permissions:
mkdir /tmp/powermonitor
chown qemu:qemu /tmp/powermonitor

The host application runs on two separate lcores:
Core N) CLI: For management of Virtual Machines adding channels to Monitor thread,
 inspecting state and manually setting CPU frequency [PATCH 02/09]
Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel events
 from VMs and calls the corresponding librte_power functions.

A sample application is also provided to run on Virtual Machines. This
application provides a CLI to manually set the frequency of a
vCPU [PATCH 08/09]

The current l3fwd-power sample application can also be run on a VM.

Changes in V5:
 Fixed default target in sample app Makefiles

Changes in V4:
 Fixed double free of channel during VM shutdown.

Changes in V3:
 Fixed crash in Guest CLI when host application is not running.
 Renamed #defines to be more specific to the module they belong
 Added vCPU pinning via CLI

Changes in V2:
 Runtime selection of librte_power implementations.
 Updated Unit tests to cover librte_power changes.
 PATCH[0/3] was sent twice, again as PATCH[0/4]
 Miscellaneous fixes.

Alan Carew (10):
  Channel Manager and Monitor for VM Power Management(Host).
  VM Power Management CLI(Host).
  CPU Frequency Power Management(Host).
  VM Power Management application and Makefile.
  VM Power Management CLI(Guest).
  VM communication channels for VM Power Management(Guest).
  librte_power common interface for Guest and Host
  Packet format for VM Power Management(Host and Guest).
  Build system integration for VM Power Management(Guest and Host)
  VM Power Management Unit Tests

 app/test/Makefile                                  |    3 +-
 app/test/autotest_data.py                          |   26 +
 app/test/test_power.c                              |  445 +----------
 app/test/test_power_acpi_cpufreq.c                 |  544 +++++++++++++
 app/test/test_power_kvm_vm.c                       |  308 ++++++++
 examples/vm_power_manager/Makefile                 |   57 ++
 examples/vm_power_manager/channel_manager.c        |  804 ++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h        |  314 ++++++++
 examples/vm_power_manager/channel_monitor.c        |  231 ++++++
 examples/vm_power_manager/channel_monitor.h        |  102 +++
 examples/vm_power_manager/guest_cli/Makefile       |   56 ++
 examples/vm_power_manager/guest_cli/main.c         |   87 +++
 examples/vm_power_manager/guest_cli/main.h         |   52 ++
 .../guest_cli/vm_power_cli_guest.c                 |  155 ++++
 .../guest_cli/vm_power_cli_guest.h                 |   55 ++
 examples/vm_power_manager/main.c                   |  117 +++
 examples/vm_power_manager/main.h                   |   52 ++
 examples/vm_power_manager/power_manager.c          |  244 ++++++
 examples/vm_power_manager/power_manager.h          |  188 +++++
 examples/vm_power_manager/vm_power_cli.c           |  669 ++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h           |   47 ++
 lib/librte_power/Makefile                          |    3 +-
 lib/librte_power/channel_commands.h                |   77 ++
 lib/librte_power/guest_channel.c                   |  162 ++++
 lib/librte_power/guest_channel.h                   |   89 +++
 lib/librte_power/rte_power.c                       |  540 ++------------
 lib/librte_power/rte_power.h                       |  120 +++-
 lib/librte_power/rte_power_acpi_cpufreq.c          |  545 +++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h          |  192 +++++
 lib/librte_power/rte_power_common.h                |   39 +
 lib/librte_power/rte_power_kvm_vm.c                |  135 ++++
 lib/librte_power/rte_power_kvm_vm.h                |  179 +++++
 32 files changed, 5725 insertions(+), 912 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h
 create mode 100644 lib/librte_power/channel_commands.h
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 01/10] Channel Manager and Monitor for VM Power Management(Host).
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 02/10] VM Power Management CLI(Host) Pablo de Lara
                           ` (9 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

The manager is responsible for adding communications channels to the Monitor
thread, tracking and reporting VM state and employs the libvirt API for
synchronization with the KVM Hypervisor. The manager interacts with the
Hypervisor to discover the mapping of virtual CPUS(vCPUs) to the host
physical CPUS(pCPUs) and to inspect the VM running state.

The manager provides the following functionality to the CLI:
1) Connect to a libvirtd instance, default: qemu:///system
2) Add a VM to an internal list. Each VM is identified by a "name" which must
   correspond to a valid libvirt Domain Name.
3) Add communication channels associated with a VM to the epoll based Monitor
   thread.
   The channels must exist and be in the form of:
   /tmp/powermonitor/<vm_name>.<channel_number>. Each channel is a
   Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
   non-blocking mode.
   Each VM can have a maximum of 64 channels associated with it.
4) Disable or re-enable VM communication channels. Channels, once added to the
   Monitor thread, remain in that thread's control; however, acting on channel
   requests can be disabled and re-enabled via the CLI.
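
A limit of 64 channels per VM fits naturally into a single 64-bit mask with
one bit per channel number. The following is a minimal sketch of that idea
only; the names are hypothetical, and it omits the locking that a real
multi-threaded manager would need around the mask.

```c
#include <stdint.h>

#define MAX_VM_CHANNELS 64 /* assumed to match the 64-channel limit above */

/* Mark a channel as present; fails if out of range or already present. */
static int
channel_mask_add(uint64_t *mask, unsigned chan)
{
	if (chan >= MAX_VM_CHANNELS || (*mask & (1ULL << chan)))
		return -1;
	*mask |= 1ULL << chan;
	return 0;
}

/* Return 1 if the channel is present, 0 otherwise. */
static int
channel_mask_test(uint64_t mask, unsigned chan)
{
	return chan < MAX_VM_CHANNELS && ((mask >> chan) & 1);
}

/* Clear a channel bit; out-of-range channel numbers are ignored. */
static void
channel_mask_remove(uint64_t *mask, unsigned chan)
{
	if (chan < MAX_VM_CHANNELS)
		*mask &= ~(1ULL << chan);
}
```

The range check comes before every shift because shifting a 64-bit value by
64 or more is undefined behaviour in C.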

The monitor is an epoll based infinite loop running in a separate thread that
waits on channel events from VMs and calls the corresponding functions. Channel
definitions from the manager are registered via the epoll event opaque pointer
when calling epoll_ctl(EPOLL_CTL_ADD). This allows for obtaining the channel's
file descriptor for reading EPOLLIN events and mapping the vCPU to pCPU(s)
associated with a request from a particular VM.
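
The opaque-pointer registration described above can be illustrated with a
short, Linux-only sketch. The struct and function names here (chan_ctx,
register_channel, wait_for_channel) are hypothetical stand-ins and not the
definitions from channel_monitor.c; only the use of epoll_ctl(EPOLL_CTL_ADD)
with the event's data pointer reflects the description.

```c
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical stand-in for a channel definition owned by the manager. */
struct chan_ctx {
	int fd;               /* channel file descriptor */
	unsigned channel_num; /* lets the monitor map a request to a vCPU */
};

/* Register the channel for EPOLLIN, storing its context in data.ptr. */
static int
register_channel(int epfd, struct chan_ctx *ctx)
{
	struct epoll_event ev;

	ev.events = EPOLLIN;
	ev.data.ptr = ctx;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, ctx->fd, &ev);
}

/* Wait for one readable channel and hand back its registered context. */
static struct chan_ctx *
wait_for_channel(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	int n = epoll_wait(epfd, &ev, 1, timeout_ms);

	if (n != 1 || !(ev.events & EPOLLIN))
		return NULL;
	return ev.data.ptr;
}
```

Because the kernel returns the same pointer that was registered, the monitor
needs no lookup table from file descriptor to channel.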

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/channel_manager.c |  804 +++++++++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h |  314 +++++++++++
 examples/vm_power_manager/channel_monitor.c |  231 ++++++++
 examples/vm_power_manager/channel_monitor.h |  102 ++++
 4 files changed, 1451 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h

diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
new file mode 100644
index 0000000..a14f191
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.c
@@ -0,0 +1,804 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/select.h>
+
+#include <rte_config.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_mempool.h>
+#include <rte_log.h>
+#include <rte_atomic.h>
+#include <rte_spinlock.h>
+
+#include <libvirt/libvirt.h>
+
+#include "channel_manager.h"
+#include "channel_commands.h"
+#include "channel_monitor.h"
+
+
+#define RTE_LOGTYPE_CHANNEL_MANAGER RTE_LOGTYPE_USER1
+
+#define ITERATIVE_BITMASK_CHECK_64(mask_u64b, i) \
+		for (i = 0; mask_u64b; mask_u64b &= ~(1ULL << i++)) \
+		if ((mask_u64b >> i) & 1) \
+
+/* Global pointer to libvirt connection */
+static virConnectPtr global_vir_conn_ptr;
+
+static unsigned char *global_cpumaps;
+static virVcpuInfo *global_vircpuinfo;
+static size_t global_maplen;
+
+static unsigned global_n_host_cpus;
+
+/*
+ * Represents a single Virtual Machine
+ */
+struct virtual_machine_info {
+	char name[CHANNEL_MGR_MAX_NAME_LEN];
+	rte_atomic64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];
+	struct channel_info *channels[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	uint64_t channel_mask;
+	uint8_t num_channels;
+	enum vm_status status;
+	virDomainPtr domainPtr;
+	virDomainInfo info;
+	rte_spinlock_t config_spinlock;
+	LIST_ENTRY(virtual_machine_info) vms_info;
+};
+
+LIST_HEAD(, virtual_machine_info) vm_list_head;
+
+static struct virtual_machine_info *
+find_domain_by_name(const char *name)
+{
+	struct virtual_machine_info *info;
+	LIST_FOREACH(info, &vm_list_head, vms_info) {
+		if (!strncmp(info->name, name, CHANNEL_MGR_MAX_NAME_LEN-1))
+			return info;
+	}
+	return NULL;
+}
+
+static int
+update_pcpus_mask(struct virtual_machine_info *vm_info)
+{
+	virVcpuInfoPtr cpuinfo;
+	unsigned i, j;
+	int n_vcpus;
+	uint64_t mask;
+
+	memset(global_cpumaps, 0, CHANNEL_CMDS_MAX_CPUS*global_maplen);
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		n_vcpus = virDomainGetVcpuPinInfo(vm_info->domainPtr,
+				vm_info->info.nrVirtCpu, global_cpumaps, global_maplen,
+				VIR_DOMAIN_AFFECT_CONFIG);
+		if (n_vcpus < 0) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
+					"in-active VM '%s'\n", vm_info->name);
+			return -1;
+		}
+		goto update_pcpus;
+	}
+
+	memset(global_vircpuinfo, 0, sizeof(*global_vircpuinfo)*
+			CHANNEL_CMDS_MAX_CPUS);
+
+	cpuinfo = global_vircpuinfo;
+
+	n_vcpus = virDomainGetVcpus(vm_info->domainPtr, cpuinfo,
+			CHANNEL_CMDS_MAX_CPUS, global_cpumaps, global_maplen);
+	if (n_vcpus < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
+							"active VM '%s'\n", vm_info->name);
+		return -1;
+	}
+update_pcpus:
+	if (n_vcpus >= CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Number of vCPUS(%u) is out of range "
+				"0...%d\n", n_vcpus, CHANNEL_CMDS_MAX_CPUS-1);
+		return -1;
+	}
+	if (n_vcpus != vm_info->info.nrVirtCpu) {
+		RTE_LOG(INFO, CHANNEL_MANAGER, "Updating the number of vCPUs for VM '%s"
+				"' from %d -> %d\n", vm_info->name, vm_info->info.nrVirtCpu,
+				n_vcpus);
+		vm_info->info.nrVirtCpu = n_vcpus;
+	}
+	for (i = 0; i < vm_info->info.nrVirtCpu; i++) {
+		mask = 0;
+		for (j = 0; j < global_n_host_cpus; j++) {
+			if (VIR_CPU_USABLE(global_cpumaps, global_maplen, i, j) > 0) {
+				mask |= 1ULL << j;
+			}
+		}
+		rte_atomic64_set(&vm_info->pcpu_mask[i], mask);
+	}
+	return 0;
+}
+
+int
+set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask)
+{
+	unsigned i = 0;
+	int flags = VIR_DOMAIN_AFFECT_LIVE|VIR_DOMAIN_AFFECT_CONFIG;
+	struct virtual_machine_info *vm_info;
+	uint64_t mask = core_mask;
+
+	if (vcpu >= CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds max allowable(%d)\n",
+				vcpu, CHANNEL_CMDS_MAX_CPUS-1);
+		return -1;
+	}
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
+				"mask(0x%"PRIx64") for VM '%s', VM is not active\n",
+				vcpu, core_mask, vm_info->name);
+		return -1;
+	}
+
+	if (vcpu >= vm_info->info.nrVirtCpu) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds the assigned number of "
+				"vCPUs(%u)\n", vcpu, vm_info->info.nrVirtCpu);
+		return -1;
+	}
+	memset(global_cpumaps, 0 , CHANNEL_CMDS_MAX_CPUS * global_maplen);
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		if (i >= global_n_host_cpus) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "CPU(%u) exceeds the available "
+					"number of CPUs(%u)\n", i, global_n_host_cpus);
+			return -1;
+		}
+		VIR_USE_CPU(global_cpumaps, i);
+	}
+	if (virDomainPinVcpuFlags(vm_info->domainPtr, vcpu, global_cpumaps,
+			global_maplen, flags) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
+				"mask(0x%"PRIx64") for VM '%s'\n", vcpu, core_mask,
+				vm_info->name);
+		return -1;
+	}
+	rte_atomic64_set(&vm_info->pcpu_mask[vcpu], core_mask);
+	return 0;
+
+}
+
+int
+set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num)
+{
+	uint64_t mask = 1ULL << core_num;
+	return set_pcpus_mask(vm_name, vcpu, mask);
+}
+
+uint64_t
+get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu)
+{
+	struct virtual_machine_info *vm_info =
+			(struct virtual_machine_info *)chan_info->priv_info;
+	return rte_atomic64_read(&vm_info->pcpu_mask[vcpu]);
+}
+
+static inline int
+channel_exists(struct virtual_machine_info *vm_info, unsigned channel_num)
+{
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	if (vm_info->channel_mask & (1ULL << channel_num)) {
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+		return 1;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+
+
+static int
+open_non_blocking_channel(struct channel_info *info)
+{
+	int ret, flags;
+	struct sockaddr_un sock_addr;
+	fd_set soc_fd_set;
+	struct timeval tv;
+
+	info->fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (info->fd == -1) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error(%s) creating socket for '%s'\n",
+				strerror(errno),
+				info->channel_path);
+		return -1;
+	}
+	sock_addr.sun_family = AF_UNIX;
+	snprintf(sock_addr.sun_path, sizeof(sock_addr.sun_path), "%s",
+			info->channel_path);
+
+	/* Get current flags */
+	flags = fcntl(info->fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) fcntl get flags socket for"
+				" '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	/* Set to Non Blocking */
+	flags |= O_NONBLOCK;
+	if (fcntl(info->fd, F_SETFL, flags) < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) setting non-blocking "
+				"socket for '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	ret = connect(info->fd, (struct sockaddr *)&sock_addr,
+			sizeof(sock_addr));
+	if (ret < 0) {
+		/* ECONNREFUSED error is given when VM is not active */
+		if (errno == ECONNREFUSED) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "VM is not active or has not "
+					"activated its endpoint to channel %s\n",
+					info->channel_path);
+			return -1;
+		}
+		/* Wait for tv_sec if in progress */
+		else if (errno == EINPROGRESS) {
+			tv.tv_sec = 2;
+			tv.tv_usec = 0;
+			FD_ZERO(&soc_fd_set);
+			FD_SET(info->fd, &soc_fd_set);
+			if (select(info->fd+1, NULL, &soc_fd_set, NULL, &tv) <= 0) {
+				RTE_LOG(WARNING, CHANNEL_MANAGER, "Timeout or error on channel "
+						"'%s'\n", info->channel_path);
+				return -1;
+			}
+		} else {
+			/* Any other error */
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) connecting socket"
+					" for '%s'\n", strerror(errno), info->channel_path);
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static int
+setup_channel_info(struct virtual_machine_info **vm_info_dptr,
+		struct channel_info **chan_info_dptr, unsigned channel_num)
+{
+	struct channel_info *chan_info = *chan_info_dptr;
+	struct virtual_machine_info *vm_info = *vm_info_dptr;
+
+	chan_info->channel_num = channel_num;
+	chan_info->priv_info = (void *)vm_info;
+	chan_info->status = CHANNEL_MGR_CHANNEL_DISCONNECTED;
+	if (open_non_blocking_channel(chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not open channel: "
+				"'%s' for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+	}
+	if (add_channel_to_monitor(&chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not add channel: "
+				"'%s' to epoll ctl for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		close(chan_info->fd);
+		return -1;
+	}
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->num_channels++;
+	vm_info->channel_mask |= 1ULL << channel_num;
+	vm_info->channels[channel_num] = chan_info;
+	chan_info->status = CHANNEL_MGR_CHANNEL_CONNECTED;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+int
+add_all_channels(const char *vm_name)
+{
+	DIR *d;
+	struct dirent *dir;
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char *token, *remaining, *tail_ptr;
+	char socket_name[PATH_MAX];
+	unsigned channel_num;
+	int num_channels_enabled = 0;
+
+	/* verify VM exists */
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' not found"
+				" during channel discovery\n", vm_name);
+		return 0;
+	}
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
+		return 0;
+	}
+	d = opendir(CHANNEL_MGR_SOCKET_PATH);
+	if (d == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error opening directory '%s': %s\n",
+				CHANNEL_MGR_SOCKET_PATH, strerror(errno));
+		return -1;
+	}
+	while ((dir = readdir(d)) != NULL) {
+		if (!strcmp(dir->d_name, ".") ||
+				!strcmp(dir->d_name, ".."))
+			continue;
+
+		snprintf(socket_name, sizeof(socket_name), "%s", dir->d_name);
+		remaining = socket_name;
+		/* Extract vm_name from "<vm_name>.<channel_num>" */
+		token = strsep(&remaining, ".");
+		if (remaining == NULL)
+			continue;
+		if (strncmp(vm_name, token, CHANNEL_MGR_MAX_NAME_LEN))
+			continue;
+
+		/* remaining should contain only <channel_num> */
+		errno = 0;
+		channel_num = (unsigned)strtol(remaining, &tail_ptr, 10);
+		if ((errno != 0) || (remaining[0] == '\0') ||
+				tail_ptr == NULL || (*tail_ptr != '\0')) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Malformed channel name"
+					" '%s' found, it should be in the form of "
+					"'<guest_name>.<channel_num>(decimal)'\n",
+					dir->d_name);
+			continue;
+		}
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Channel number(%u) is "
+					"greater than max allowable: %d, skipping '%s%s'\n",
+					channel_num, CHANNEL_CMDS_MAX_VM_CHANNELS-1,
+					CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+			continue;
+		}
+		/* skip channels that have already been added */
+		if (channel_exists(vm_info, channel_num))
+			continue;
+
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s%s'\n", CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+			continue;
+		}
+
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s",
+				CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+
+		if (setup_channel_info(&vm_info, &chan_info, channel_num) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+
+		num_channels_enabled++;
+	}
+	closedir(d);
+	return num_channels_enabled;
+}
+
+int
+add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char socket_path[PATH_MAX];
+	unsigned i;
+	int num_channels_enabled = 0;
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add channels: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
+		return 0;
+	}
+
+	for (i = 0; i < len_channel_list; i++) {
+
+		if (channel_list[i] >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel(%u) is out of range "
+							"0...%d\n", channel_list[i],
+							CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+			continue;
+		}
+		if (channel_exists(vm_info, channel_list[i])) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel already exists, skipping "
+					"'%s.%u'\n", vm_name, channel_list[i]);
+			continue;
+		}
+
+		snprintf(socket_path, sizeof(socket_path), "%s%s.%u",
+				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
+		errno = 0;
+		if (access(socket_path, F_OK) < 0) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Channel path '%s' error: "
+					"%s\n", socket_path, strerror(errno));
+			continue;
+		}
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s'\n", socket_path);
+			continue;
+		}
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s.%u",
+				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
+		if (setup_channel_info(&vm_info, &chan_info, channel_list[i]) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+		num_channels_enabled++;
+
+	}
+	return num_channels_enabled;
+}
+
+int
+remove_channel(struct channel_info **chan_info_dptr)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info = *chan_info_dptr;
+
+	close(chan_info->fd);
+
+	vm_info = (struct virtual_machine_info *)chan_info->priv_info;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->channel_mask &= ~(1ULL << chan_info->channel_num);
+	vm_info->num_channels--;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+	rte_free(chan_info);
+	return 0;
+}
+
+int
+set_channel_status_all(const char *vm_name, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	uint64_t mask;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
+			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+		return 0;
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return 0;
+	}
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		vm_info->channels[i]->status = status;
+		num_channels_changed++;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return num_channels_changed;
+
+}
+
+int
+set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
+			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+		return 0;
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return 0;
+	}
+	for (i = 0; i < len_channel_list; i++) {
+		if (channel_exists(vm_info, channel_list[i])) {
+			rte_spinlock_lock(&(vm_info->config_spinlock));
+			vm_info->channels[channel_list[i]]->status = status;
+			rte_spinlock_unlock(&(vm_info->config_spinlock));
+			num_channels_changed++;
+		}
+	}
+	return num_channels_changed;
+}
+
+int
+get_info_vm(const char *vm_name, struct vm_info *info)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i, channel_num = 0;
+	uint64_t mask;
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+	info->status = CHANNEL_MGR_VM_ACTIVE;
+	if (!virDomainIsActive(vm_info->domainPtr))
+		info->status = CHANNEL_MGR_VM_INACTIVE;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		info->channels[channel_num].channel_num = i;
+		memcpy(info->channels[channel_num].channel_path,
+				vm_info->channels[i]->channel_path, PATH_MAX);
+		info->channels[channel_num].status = vm_info->channels[i]->status;
+		info->channels[channel_num].fd = vm_info->channels[i]->fd;
+		channel_num++;
+	}
+
+	info->num_channels = channel_num;
+	info->num_vcpus = vm_info->info.nrVirtCpu;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+	memcpy(info->name, vm_info->name, sizeof(vm_info->name));
+	for (i = 0; i < info->num_vcpus; i++) {
+		info->pcpu_mask[i] = rte_atomic64_read(&vm_info->pcpu_mask[i]);
+	}
+	return 0;
+}
+
+int
+add_vm(const char *vm_name)
+{
+	struct virtual_machine_info *new_domain;
+	virDomainPtr dom_ptr;
+	int i;
+
+	if (find_domain_by_name(vm_name) != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add VM: VM '%s' "
+				"already exists\n", vm_name);
+		return -1;
+	}
+
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "No connection to hypervisor exists\n");
+		return -1;
+	}
+	dom_ptr = virDomainLookupByName(global_vir_conn_ptr, vm_name);
+	if (dom_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error on VM lookup with libvirt: "
+				"VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	new_domain = rte_malloc("virtual_machine_info", sizeof(*new_domain),
+			CACHE_LINE_SIZE);
+	if (new_domain == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to allocate memory for VM "
+				"info\n");
+		return -1;
+	}
+	new_domain->domainPtr = dom_ptr;
+	if (virDomainGetInfo(new_domain->domainPtr, &new_domain->info) != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get libvirt VM info\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	if (new_domain->info.nrVirtCpu > CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error the number of virtual CPUs(%u) is "
+				"greater than allowable(%d)\n", new_domain->info.nrVirtCpu,
+				CHANNEL_CMDS_MAX_CPUS);
+		rte_free(new_domain);
+		return -1;
+	}
+
+	for (i = 0; i < CHANNEL_CMDS_MAX_CPUS; i++) {
+		rte_atomic64_init(&new_domain->pcpu_mask[i]);
+	}
+	if (update_pcpus_mask(new_domain) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting physical CPU pinning\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	snprintf(new_domain->name, sizeof(new_domain->name), "%s", vm_name);
+	new_domain->channel_mask = 0;
+	new_domain->num_channels = 0;
+
+	if (!virDomainIsActive(dom_ptr))
+		new_domain->status = CHANNEL_MGR_VM_INACTIVE;
+	else
+		new_domain->status = CHANNEL_MGR_VM_ACTIVE;
+
+	rte_spinlock_init(&(new_domain->config_spinlock));
+	LIST_INSERT_HEAD(&vm_list_head, new_domain, vms_info);
+	return 0;
+}
+
+int
+remove_vm(const char *vm_name)
+{
+	struct virtual_machine_info *vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM: VM '%s' "
+				"not found\n", vm_name);
+		return -1;
+	}
+	rte_spinlock_lock(&vm_info->config_spinlock);
+	if (vm_info->num_channels != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM '%s', there are "
+				"%"PRId8" channels still active\n",
+				vm_name, vm_info->num_channels);
+		rte_spinlock_unlock(&vm_info->config_spinlock);
+		return -1;
+	}
+	LIST_REMOVE(vm_info, vms_info);
+	rte_spinlock_unlock(&vm_info->config_spinlock);
+	rte_free(vm_info);
+	return 0;
+}
+
+static void
+disconnect_hypervisor(void)
+{
+	if (global_vir_conn_ptr != NULL) {
+		virConnectClose(global_vir_conn_ptr);
+		global_vir_conn_ptr = NULL;
+	}
+}
+
+static int
+connect_hypervisor(const char *path)
+{
+	if (global_vir_conn_ptr != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection"
+				" already established\n", path);
+		return -1;
+	}
+	global_vir_conn_ptr = virConnectOpen(path);
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error failed to open connection to "
+				"Hypervisor '%s'\n", path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_manager_init(const char *path)
+{
+	int n_cpus;
+	LIST_INIT(&vm_list_head);
+	if (connect_hypervisor(path) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to initialize channel manager\n");
+		return -1;
+	}
+
+	global_maplen = VIR_CPU_MAPLEN(CHANNEL_CMDS_MAX_CPUS);
+
+	global_vircpuinfo = rte_zmalloc(NULL, sizeof(*global_vircpuinfo) *
+			CHANNEL_CMDS_MAX_CPUS, CACHE_LINE_SIZE);
+	if (global_vircpuinfo == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for CPU Info\n");
+		goto error;
+	}
+	global_cpumaps = rte_zmalloc(NULL, CHANNEL_CMDS_MAX_CPUS * global_maplen,
+			CACHE_LINE_SIZE);
+	if (global_cpumaps == NULL) {
+		goto error;
+	}
+
+	n_cpus = virNodeGetCPUMap(global_vir_conn_ptr, NULL, NULL, 0);
+	if (n_cpus <= 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get the number of Host "
+				"CPUs\n");
+		goto error;
+	}
+	global_n_host_cpus = (unsigned)n_cpus;
+
+	if (global_n_host_cpus > CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "The number of host CPUs(%u) exceeds the "
+				"maximum of %u\n", global_n_host_cpus, CHANNEL_CMDS_MAX_CPUS);
+		goto error;
+
+	}
+
+	return 0;
+error:
+	disconnect_hypervisor();
+	return -1;
+}
+
+void
+channel_manager_exit(void)
+{
+	unsigned i;
+	uint64_t mask;
+	struct virtual_machine_info *vm_info;
+
+	while ((vm_info = LIST_FIRST(&vm_list_head)) != NULL) {
+
+		rte_spinlock_lock(&(vm_info->config_spinlock));
+
+		mask = vm_info->channel_mask;
+		ITERATIVE_BITMASK_CHECK_64(mask, i) {
+			remove_channel_from_monitor(vm_info->channels[i]);
+			close(vm_info->channels[i]->fd);
+			rte_free(vm_info->channels[i]);
+		}
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+		LIST_REMOVE(vm_info, vms_info);
+		rte_free(vm_info);
+	}
+
+	if (global_cpumaps != NULL)
+		rte_free(global_cpumaps);
+	if (global_vircpuinfo != NULL)
+		rte_free(global_vircpuinfo);
+	disconnect_hypervisor();
+}
diff --git a/examples/vm_power_manager/channel_manager.h b/examples/vm_power_manager/channel_manager.h
new file mode 100644
index 0000000..12c29c3
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.h
@@ -0,0 +1,314 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MANAGER_H_
+#define CHANNEL_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <linux/limits.h>
+#include <rte_atomic.h>
+#include "channel_commands.h"
+
+/* Maximum name length including '\0' terminator */
+#define CHANNEL_MGR_MAX_NAME_LEN    64
+
+/* Maximum number of channels to each Virtual Machine */
+#define CHANNEL_MGR_MAX_CHANNELS    64
+
+/* Hypervisor Path for libvirt(qemu/KVM) */
+#define CHANNEL_MGR_DEFAULT_HV_PATH "qemu:///system"
+
+/* File socket directory */
+#define CHANNEL_MGR_SOCKET_PATH     "/tmp/powermonitor/"
+
+/* Communication Channel Status */
+enum channel_status { CHANNEL_MGR_CHANNEL_DISCONNECTED = 0,
+	CHANNEL_MGR_CHANNEL_CONNECTED,
+	CHANNEL_MGR_CHANNEL_DISABLED,
+	CHANNEL_MGR_CHANNEL_PROCESSING};
+
+/* VM libvirt(qemu/KVM) connection status */
+enum vm_status { CHANNEL_MGR_VM_INACTIVE = 0, CHANNEL_MGR_VM_ACTIVE};
+
+/*
+ *  Represents a single and exclusive VM channel that exists between a guest and
+ *  the host.
+ */
+struct channel_info {
+	char channel_path[PATH_MAX]; /**< Path to host socket */
+	volatile uint32_t status;    /**< Connection status(enum channel_status) */
+	int fd;                      /**< AF_UNIX socket fd */
+	unsigned channel_num;        /**< CHANNEL_MGR_SOCKET_PATH/<vm_name>.channel_num */
+	void *priv_info;             /**< Pointer to private info, do not modify */
+};
+
+/* Represents a single VM instance used to return internal information about
+ * a VM */
+struct vm_info {
+	char name[CHANNEL_MGR_MAX_NAME_LEN];          /**< VM name */
+	enum vm_status status;                        /**< libvirt status */
+	uint64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];    /**< pCPU mask for each vCPU */
+	unsigned num_vcpus;                           /**< number of vCPUS */
+	struct channel_info channels[CHANNEL_MGR_MAX_CHANNELS]; /**< Array of channel_info */
+	unsigned num_channels;                        /**< Number of channels */
+};
+
+/**
+ * Initialize the Channel Manager resources and connect to the Hypervisor
+ * specified in path.
+ * This must be successfully called first before calling any other functions.
+ * It must only be called once.
+ *
+ * @param path
+ *  Must be a local path, e.g. qemu:///system.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_manager_init(const char *path);
+
+/**
+ * Free resources associated with the Channel Manager.
+ *
+ * All channels and VMs are released and the connection to the Hypervisor
+ * is closed.
+ *
+ * @return
+ *  None
+ */
+void channel_manager_exit(void);
+
+/**
+ * Get the Physical CPU mask for VM lcore channel(vcpu); the mask is returned
+ * to the caller.
+ * It is not thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info
+ *
+ * @param vcpu
+ *  The virtual CPU to query.
+ *
+ *
+ * @return
+ *  - 0 on error.
+ *  - >0 on success.
+ */
+uint64_t get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu);
+
+/**
+ * Set the Physical CPU mask for the specified vCPU.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup
+ *
+ * @param vcpu
+ *  The virtual CPU to set.
+ *
+ * @param core_mask
+ *  The core mask of the physical CPU(s) to bind the vCPU
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask);
+
+/**
+ * Set the Physical CPU for the specified vCPU.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup
+ *
+ * @param vcpu
+ *  The virtual CPU to set.
+ *
+ * @param core_num
+ *  The core number of the physical CPU to bind the vCPU
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num);
+/**
+ * Add a VM as specified by name to the Channel Manager. The name must
+ * correspond to a valid libvirt domain name.
+ * This is required prior to adding channels.
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_vm(const char *name);
+
+/**
+ * Remove a previously added Virtual Machine from the Channel Manager
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_vm(const char *name);
+
+/**
+ * Add all available channels to the VM as specified by name.
+ * Only channels in the form of paths
+ * (CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number>) will be parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ */
+int add_all_channels(const char *vm_name);
+
+/**
+ * Add the channel numbers in channel_list to the domain specified by name.
+ * Only channels in the form of paths
+ * (CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number>) will be parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to add channels.
+ *
+ * @param channel_list
+ *  Pointer to list of unsigned integers, representing the channel number to add
+ *  It must be allocated outside of this function.
+ *
+ * @param num_channels
+ *  The number of entries in channel_list
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ *  - 0 for error
+ */
+int add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned num_channels);
+
+/**
+ * Remove a channel definition from the channel manager. This must only be
+ * called from the channel monitor thread.
+ *
+ * @param chan_info_dptr
+ *  Pointer to a pointer to a valid struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel(struct channel_info **chan_info_dptr);
+
+/**
+ * For all channels associated with a Virtual Machine name, update the
+ * connection status. Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
+ * CHANNEL_MGR_CHANNEL_DISABLED only.
+ *
+ * The change is applied to every channel currently registered for the VM
+ * while holding the VM's configuration lock.
+ *
+ * @param name
+ *  Virtual Machine name to modify all channels.
+ *
+ * @param status
+ *  The status to set for each channel, CHANNEL_MGR_CHANNEL_CONNECTED or
+ *  CHANNEL_MGR_CHANNEL_DISABLED.
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status_all(const char *name, enum channel_status status);
+
+/**
+ * For all channels in channel_list associated with a Virtual Machine name
+ * update the connection status of each.
+ * Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
+ * CHANNEL_MGR_CHANNEL_DISABLED only.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name whose channels are modified.
+ *
+ * @param channel_list
+ *  Pointer to list of unsigned integers, representing the channel numbers to
+ *  modify.
+ *  It must be allocated outside of this function.
+ *
+ * @param len_channel_list
+ *  The number of entries in channel_list
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status);
+
+/**
+ * Populates the struct vm_info pointed to by info for the VM named vm_name.
+ *
+ * @param vm_name
+ *  The name of the virtual machine to lookup.
+ *
+ * @param info
+ *  Pointer to a struct vm_info, this must be allocated prior to calling this
+ *  function.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int get_info_vm(const char *vm_name, struct vm_info *info);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_MANAGER_H_ */
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
new file mode 100644
index 0000000..3674c7c
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -0,0 +1,231 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <signal.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/epoll.h>
+#include <sys/queue.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+
+
+#include "channel_monitor.h"
+#include "channel_commands.h"
+#include "channel_manager.h"
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+
+static volatile unsigned run_loop = 1;
+static int global_event_fd;
+static struct epoll_event *global_events_list;
+
+void channel_monitor_exit(void)
+{
+	run_loop = 0;
+	rte_free(global_events_list);
+}
+
+static int
+process_request(struct channel_packet *pkt, struct channel_info *chan_info)
+{
+	uint64_t core_mask;
+
+	if (chan_info == NULL)
+		return -1;
+
+	if (rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_CONNECTED,
+			CHANNEL_MGR_CHANNEL_PROCESSING) == 0)
+		return -1;
+
+	if (pkt->command == CPU_POWER) {
+		core_mask = get_pcpus_mask(chan_info, pkt->resource_id);
+		if (core_mask == 0) {
+			RTE_LOG(ERR, CHANNEL_MONITOR, "Error get physical CPU mask for "
+					"channel '%s' using vCPU(%u)\n", chan_info->channel_path,
+					(unsigned)pkt->resource_id);
+			return -1;
+		}
+		if (__builtin_popcountll(core_mask) == 1) {
+
+			unsigned core_num = __builtin_ffsll(core_mask) - 1;
+
+			switch (pkt->unit) {
+			case CPU_POWER_SCALE_MIN:
+				power_manager_scale_core_min(core_num);
+				break;
+			case CPU_POWER_SCALE_MAX:
+				power_manager_scale_core_max(core_num);
+				break;
+			case CPU_POWER_SCALE_DOWN:
+				power_manager_scale_core_down(core_num);
+				break;
+			case CPU_POWER_SCALE_UP:
+				power_manager_scale_core_up(core_num);
+				break;
+			default:
+				break;
+			}
+		} else {
+			switch (pkt->unit) {
+			case CPU_POWER_SCALE_MIN:
+				power_manager_scale_mask_min(core_mask);
+				break;
+			case CPU_POWER_SCALE_MAX:
+				power_manager_scale_mask_max(core_mask);
+				break;
+			case CPU_POWER_SCALE_DOWN:
+				power_manager_scale_mask_down(core_mask);
+				break;
+			case CPU_POWER_SCALE_UP:
+				power_manager_scale_mask_up(core_mask);
+				break;
+			default:
+				break;
+			}
+
+		}
+	}
+	/* Return is not checked as channel status may have been set to DISABLED
+	 * from management thread
+	 */
+	rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_PROCESSING,
+			CHANNEL_MGR_CHANNEL_CONNECTED);
+	return 0;
+
+}
+
+int
+add_channel_to_monitor(struct channel_info **chan_info)
+{
+	struct channel_info *info = *chan_info;
+	struct epoll_event event;
+	event.events = EPOLLIN;
+	event.data.ptr = info;
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_ADD, info->fd, &event) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to add channel '%s' "
+				"to epoll\n", info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+remove_channel_from_monitor(struct channel_info *chan_info)
+{
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_DEL, chan_info->fd, NULL) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to remove channel '%s' "
+				"from epoll\n", chan_info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_monitor_init(void)
+{
+	global_event_fd = epoll_create1(0);
+	if (global_event_fd < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Error creating epoll context with "
+				"error %s\n", strerror(errno));
+		return -1;
+	}
+	global_events_list = rte_malloc("epoll_events", sizeof(*global_events_list)
+			* MAX_EVENTS, CACHE_LINE_SIZE);
+	if (global_events_list == NULL) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for "
+				"epoll events\n");
+		return -1;
+	}
+	return 0;
+}
+
+void
+run_channel_monitor(void)
+{
+	while (run_loop) {
+		int n_events, i;
+		n_events = epoll_wait(global_event_fd, global_events_list,
+				MAX_EVENTS, 1);
+		if (!run_loop)
+			break;
+		for (i = 0; i < n_events; i++) {
+			struct channel_info *chan_info = (struct channel_info *)
+					global_events_list[i].data.ptr;
+			if ((global_events_list[i].events & EPOLLERR) ||
+					(global_events_list[i].events & EPOLLHUP)) {
+				RTE_LOG(DEBUG, CHANNEL_MONITOR, "Remote closed connection for "
+						"channel '%s'\n", chan_info->channel_path);
+				remove_channel(&chan_info);
+				continue;
+			}
+			if (global_events_list[i].events & EPOLLIN) {
+
+				int n_bytes, err = 0;
+				struct channel_packet pkt;
+				void *buffer = &pkt;
+				int buffer_len = sizeof(pkt);
+				while (buffer_len > 0) {
+					n_bytes = read(chan_info->fd, buffer, buffer_len);
+					if (n_bytes == buffer_len)
+						break;
+					if (n_bytes <= 0) {
+						err = (n_bytes == 0) ? ECONNRESET : errno;
+						RTE_LOG(DEBUG, CHANNEL_MONITOR, "Received error on "
+								"channel '%s' read: %s\n",
+								chan_info->channel_path, strerror(err));
+						remove_channel(&chan_info);
+						break;
+					}
+					buffer = (char *)buffer + n_bytes;
+					buffer_len -= n_bytes;
+				}
+				if (!err)
+					process_request(&pkt, chan_info);
+			}
+		}
+	}
+}
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
new file mode 100644
index 0000000..c138607
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -0,0 +1,102 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MONITOR_H_
+#define CHANNEL_MONITOR_H_
+
+#include "channel_manager.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Channel Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_monitor_init(void);
+
+/**
+ * Run the channel monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_channel_monitor(void);
+
+/**
+ * Exit the Channel Monitor, exiting the epoll_wait loop and events processing.
+ *
+ *
+ * @return
+ *  None
+ */
+void channel_monitor_exit(void);
+
+/**
+ * Add an open channel to monitor via epoll. A pointer to struct channel_info
+ * will be registered with epoll for event processing.
+ * It is thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info pointer.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_channel_to_monitor(struct channel_info **chan_info);
+
+/**
+ * Remove a previously added channel from epoll control.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel_from_monitor(struct channel_info *chan_info);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* CHANNEL_MONITOR_H_ */
-- 
1.7.4.1


* [dpdk-dev] [PATCH v5 02/10] VM Power Management CLI(Host).
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 01/10] Channel Manager and Monitor for VM Power Management(Host) Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 03/10] CPU Frequency Power Management(Host) Pablo de Lara
                           ` (8 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

The CLI is used for administering the channel monitor and manager and
manually setting the CPU frequency on the host.

Supports the following commands:
 add_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 rm_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 add_channels [Fixed STRING]: add_channels <vm_name> <list>|all, add
  communication channels for the specified VM, the virtio channels must be
  enabled in the VM configuration(qemu/libvirt) and the associated VM must be
  active. <list> is a comma-separated list of channel numbers to add, using the
  keyword 'all' will attempt to add all channels for the VM

 set_channel_status [Fixed STRING]:
  set_channel_status <vm_name> <list>|all enabled|disabled, enable or disable
  the communication channels in list(comma-separated) for the specified VM,
  alternatively list can be replaced with keyword 'all'. Disabled channels will
  still receive packets on the host, however the commands they specify will be
  ignored. Set status to 'enabled' to begin processing requests again.

 show_vm [Fixed STRING]: show_vm <vm_name>, prints the information on the
  specified VM(s), the information lists the number of vCPUs, the pinning to
  pCPU(s) as a bit mask, along with any communication channels associated with
  each VM

 show_cpu_freq_mask [Fixed STRING]: show_cpu_freq_mask <mask>, Get the current
  frequency for each core specified in the mask

 set_cpu_freq_mask [Fixed STRING]: set_cpu_freq_mask <core_mask> <up|down|min|max>,
  Set the current frequency for the cores specified in <core_mask> by scaling
  each up/down/min/max.

 show_cpu_freq [Fixed STRING]: Get the current frequency for the specified core

 set_cpu_freq [Fixed STRING]: set_cpu_freq <core_num> <up|down|min|max>,
  Set the current frequency for the specified core by scaling up/down/min/max

 quit [Fixed STRING]: close the application
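
As an illustration of the <list> argument taken by add_channels and
set_channel_status, the following is a minimal sketch of parsing a
comma-separated channel list with bounds checking. It is not the code from
this patch: parse_channel_list() and MAX_CHANNELS are hypothetical names
(MAX_CHANNELS stands in for CHANNEL_CMDS_MAX_VM_CHANNELS, and its value of 64
is an assumption), and rejecting the whole list on the first bad entry is a
design choice of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for CHANNEL_CMDS_MAX_VM_CHANNELS; value assumed. */
#define MAX_CHANNELS 64

/*
 * Parse a comma-separated list such as "0,2,5" into out[].
 * Returns the number of entries stored, or -1 if an entry is empty,
 * not a decimal number, out of range, or if out[] would overflow.
 * The input string is modified in place (commas become '\0').
 */
static int
parse_channel_list(char *list, unsigned *out, unsigned max_entries)
{
	unsigned count = 0;
	char *token = list;

	assert(list != NULL && out != NULL);
	while (token != NULL && *token != '\0') {
		char *comma = strchr(token, ',');
		char *end = NULL;
		long value;

		if (comma != NULL)
			*comma = '\0';
		errno = 0;
		value = strtol(token, &end, 10);
		/* Reject empty tokens and trailing garbage. */
		if (errno != 0 || end == token || *end != '\0')
			return -1;
		/* Valid channel numbers are 0 .. MAX_CHANNELS - 1. */
		if (value < 0 || value >= MAX_CHANNELS)
			return -1;
		if (count >= max_entries)
			return -1;
		out[count++] = (unsigned)value;
		token = (comma != NULL) ? comma + 1 : NULL;
	}
	return (int)count;
}
```

With the input "0,2,5" the helper stores 0, 2 and 5 in out[] and returns 3;
inputs such as "1,x" or "64" are rejected with -1.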

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/vm_power_cli.c |  669 ++++++++++++++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h |   47 ++
 2 files changed, 716 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h

diff --git a/examples/vm_power_manager/vm_power_cli.c b/examples/vm_power_manager/vm_power_cli.c
new file mode 100644
index 0000000..e162e88
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.c
@@ -0,0 +1,669 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <string.h>
+#include <termios.h>
+#include <errno.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+
+#include "vm_power_cli.h"
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "channel_commands.h"
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+		struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+struct cmd_show_vm_result {
+	cmdline_fixed_string_t show_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_show_vm_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_show_vm_result *res = parsed_result;
+	struct vm_info info;
+	unsigned i;
+
+	if (get_info_vm(res->vm_name, &info) != 0)
+		return;
+	cmdline_printf(cl, "VM: '%s', status = ", info.name);
+	if (info.status == CHANNEL_MGR_VM_ACTIVE)
+		cmdline_printf(cl, "ACTIVE\n");
+	else
+		cmdline_printf(cl, "INACTIVE\n");
+	cmdline_printf(cl, "Channels %u\n", info.num_channels);
+	for (i = 0; i < info.num_channels; i++) {
+		cmdline_printf(cl, "  [%u]: %s, status = ", i,
+				info.channels[i].channel_path);
+		switch (info.channels[i].status) {
+		case CHANNEL_MGR_CHANNEL_CONNECTED:
+			cmdline_printf(cl, "CONNECTED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_DISCONNECTED:
+			cmdline_printf(cl, "DISCONNECTED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_DISABLED:
+			cmdline_printf(cl, "DISABLED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_PROCESSING:
+			cmdline_printf(cl, "PROCESSING\n");
+			break;
+		default:
+			cmdline_printf(cl, "UNKNOWN\n");
+			break;
+		}
+	}
+	cmdline_printf(cl, "Virtual CPU(s): %u\n", info.num_vcpus);
+	for (i = 0; i < info.num_vcpus; i++) {
+		cmdline_printf(cl, "  [%u]: Physical CPU Mask 0x%"PRIx64"\n", i,
+				info.pcpu_mask[i]);
+	}
+}
+
+
+
+cmdline_parse_token_string_t cmd_vm_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+				show_vm, "show_vm");
+cmdline_parse_token_string_t cmd_show_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_show_vm_set = {
+	.f = cmd_show_vm_parsed,
+	.data = NULL,
+	.help_str = "show_vm <vm_name>, prints the information on the "
+			"specified VM(s), the information lists the number of vCPUs, the "
+			"pinning to pCPU(s) as a bit mask, along with any communication "
+			"channels associated with each VM",
+	.tokens = {
+		(void *)&cmd_vm_show,
+		(void *)&cmd_show_vm_name,
+		NULL,
+	},
+};
+
+/* *** vCPU to pCPU mapping operations *** */
+struct cmd_set_pcpu_mask_result {
+	cmdline_fixed_string_t set_pcpu_mask;
+	cmdline_fixed_string_t vm_name;
+	uint8_t vcpu;
+	uint64_t core_mask;
+};
+
+static void
+cmd_set_pcpu_mask_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_pcpu_mask_result *res = parsed_result;
+	if (set_pcpus_mask(res->vm_name, res->vcpu, res->core_mask) == 0)
+		cmdline_printf(cl, "Pinned vCPU(%"PRIu8") to pCPU core "
+				"mask(0x%"PRIx64")\n", res->vcpu, res->core_mask);
+	else
+		cmdline_printf(cl, "Unable to pin vCPU(%"PRIu8") to pCPU core "
+				"mask(0x%"PRIx64")\n", res->vcpu, res->core_mask);
+}
+
+cmdline_parse_token_string_t cmd_set_pcpu_mask =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				set_pcpu_mask, "set_pcpu_mask");
+cmdline_parse_token_string_t cmd_set_pcpu_mask_vm_name =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				vm_name, NULL);
+cmdline_parse_token_num_t set_pcpu_mask_vcpu =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				vcpu, UINT8);
+cmdline_parse_token_num_t set_pcpu_mask_core_mask =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				core_mask, UINT64);
+
+
+cmdline_parse_inst_t cmd_set_pcpu_mask_set = {
+		.f = cmd_set_pcpu_mask_parsed,
+		.data = NULL,
+		.help_str = "set_pcpu_mask <vm_name> <vcpu> <pcpu>, Set the binding "
+				"of Virtual CPU on VM to the Physical CPU mask.",
+		.tokens = {
+			(void *)&cmd_set_pcpu_mask,
+			(void *)&cmd_set_pcpu_mask_vm_name,
+			(void *)&set_pcpu_mask_vcpu,
+			(void *)&set_pcpu_mask_core_mask,
+			NULL,
+		},
+};
+
+struct cmd_set_pcpu_result {
+	cmdline_fixed_string_t set_pcpu;
+	cmdline_fixed_string_t vm_name;
+	uint8_t vcpu;
+	uint8_t core;
+};
+
+static void
+cmd_set_pcpu_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_pcpu_result *res = parsed_result;
+	if (set_pcpu(res->vm_name, res->vcpu, res->core) == 0)
+		cmdline_printf(cl, "Pinned vCPU(%"PRIu8") to pCPU core "
+				"%"PRIu8"\n", res->vcpu, res->core);
+	else
+		cmdline_printf(cl, "Unable to pin vCPU(%"PRIu8") to pCPU core "
+				"%"PRIu8"\n", res->vcpu, res->core);
+}
+
+cmdline_parse_token_string_t cmd_set_pcpu =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_result,
+				set_pcpu, "set_pcpu");
+cmdline_parse_token_string_t cmd_set_pcpu_vm_name =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_result,
+				vm_name, NULL);
+cmdline_parse_token_num_t set_pcpu_vcpu =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_result,
+				vcpu, UINT8);
+cmdline_parse_token_num_t set_pcpu_core =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_result,
+				core, UINT8);
+
+
+cmdline_parse_inst_t cmd_set_pcpu_set = {
+		.f = cmd_set_pcpu_parsed,
+		.data = NULL,
+		.help_str = "set_pcpu <vm_name> <vcpu> <pcpu>, Set the binding "
+				"of Virtual CPU on VM to the Physical CPU.",
+		.tokens = {
+			(void *)&cmd_set_pcpu,
+			(void *)&cmd_set_pcpu_vm_name,
+			(void *)&set_pcpu_vcpu,
+			(void *)&set_pcpu_core,
+			NULL,
+		},
+};
+
+struct cmd_vm_op_result {
+	cmdline_fixed_string_t op_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_vm_op_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_vm_op_result *res = parsed_result;
+
+	if (!strcmp(res->op_vm, "add_vm")) {
+		if (add_vm(res->vm_name) < 0)
+			cmdline_printf(cl, "Unable to add VM '%s'\n", res->vm_name);
+	} else if (remove_vm(res->vm_name) < 0)
+		cmdline_printf(cl, "Unable to remove VM '%s'\n", res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_vm_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			op_vm, "add_vm#rm_vm");
+cmdline_parse_token_string_t cmd_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_vm_op_set = {
+	.f = cmd_vm_op_parsed,
+	.data = NULL,
+	.help_str = "add_vm|rm_vm <name>, add a VM for "
+			"subsequent operations with the CLI or remove a previously added "
+			"VM from the VM Power Manager",
+	.tokens = {
+		(void *)&cmd_vm_op,
+		(void *)&cmd_vm_name,
+		NULL,
+	},
+};
+
+/* *** VM channel operations *** */
+struct cmd_channels_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+};
+static void
+cmd_channels_op_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num, i;
+	int channels_added;
+	unsigned channel_list[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_op_result *res = parsed_result;
+
+	if (!strcmp(res->channel_list, "all")) {
+		channels_added = add_all_channels(res->vm_name);
+		cmdline_printf(cl, "Added %d channels for VM '%s'\n",
+				channels_added, res->vm_name);
+		return;
+	}
+
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (tail_ptr == NULL) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "Channel number '%u' exceeds the maximum number "
+					"of allowable channels(%u) for VM '%s'\n", channel_num,
+					CHANNEL_CMDS_MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	for (i = 0; i < num_channels; i++)
+		cmdline_printf(cl, "[%u]: Adding channel %u\n", i, channel_list[i]);
+
+	channels_added = add_channels(res->vm_name, channel_list,
+			num_channels);
+	cmdline_printf(cl, "Enabled %d channels for '%s'\n", channels_added,
+			res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+				op, "add_channels");
+cmdline_parse_token_string_t cmd_channels_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			channel_list, NULL);
+
+cmdline_parse_inst_t cmd_channels_op_set = {
+	.f = cmd_channels_op_parsed,
+	.data = NULL,
+	.help_str = "add_channels <vm_name> <list>|all, add "
+			"communication channels for the specified VM, the "
+			"virtio channels must be enabled in the VM "
+			"configuration(qemu/libvirt) and the associated VM must be active. "
+			"<list> is a comma-separated list of channel numbers to add, using "
+			"the keyword 'all' will attempt to add all channels for the VM",
+	.tokens = {
+		(void *)&cmd_channels_op,
+		(void *)&cmd_channels_vm_name,
+		(void *)&cmd_channels_list,
+		NULL,
+	},
+};
+
+struct cmd_channels_status_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+	cmdline_fixed_string_t status;
+};
+
+static void
+cmd_channels_status_op_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num;
+	int changed;
+	unsigned channel_list[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_status_op_result *res = parsed_result;
+	enum channel_status status;
+
+	if (!strcmp(res->status, "enabled"))
+		status = CHANNEL_MGR_CHANNEL_CONNECTED;
+	else
+		status = CHANNEL_MGR_CHANNEL_DISABLED;
+
+	if (!strcmp(res->channel_list, "all")) {
+		changed = set_channel_status_all(res->vm_name, status);
+		cmdline_printf(cl, "Updated status of %d channels "
+				"for VM '%s'\n", changed, res->vm_name);
+		return;
+	}
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (tail_ptr == NULL) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "%u exceeds the maximum number of allowable "
+					"channels(%u) for VM '%s'\n", channel_num,
+					CHANNEL_CMDS_MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	changed = set_channel_status(res->vm_name, channel_list, num_channels,
+			status);
+	cmdline_printf(cl, "Updated status of %d channels "
+					"for VM '%s'\n", changed, res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_status_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+				op, "set_channel_status");
+cmdline_parse_token_string_t cmd_channels_status_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_status_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			channel_list, NULL);
+cmdline_parse_token_string_t cmd_channels_status =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			status, "enabled#disabled");
+
+cmdline_parse_inst_t cmd_channels_status_op_set = {
+	.f = cmd_channels_status_op_parsed,
+	.data = NULL,
+	.help_str = "set_channel_status <vm_name> <list>|all enabled|disabled, "
+			"enable or disable the communication channels in "
+			"list(comma-separated) for the specified VM, alternatively list can"
+			" be replaced with keyword 'all'. Disabled channels will still "
+			"receive packets on the host, however the commands they specify "
+			"will be ignored. Set status to 'enabled' to begin processing "
+			"requests again.",
+	.tokens = {
+		(void *)&cmd_channels_status_op,
+		(void *)&cmd_channels_status_vm_name,
+		(void *)&cmd_channels_status_list,
+		(void *)&cmd_channels_status,
+		NULL,
+	},
+};
+
+/* *** CPU Frequency operations *** */
+struct cmd_show_cpu_freq_mask_result {
+	cmdline_fixed_string_t show_cpu_freq_mask;
+	uint64_t core_mask;
+};
+
+static void
+cmd_show_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_mask_result *res = parsed_result;
+	unsigned i;
+	uint64_t mask = res->core_mask;
+	uint32_t freq;
+	for (i = 0; mask; mask &= ~(1ULL << i++)) {
+		if ((mask >> i) & 1) {
+			freq = power_manager_get_current_frequency(i);
+			if (freq > 0)
+				cmdline_printf(cl, "Core %u: %"PRIu32"\n", i, freq);
+		}
+	}
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			show_cpu_freq_mask, "show_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_show_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			core_mask, UINT64);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_mask_set = {
+	.f = cmd_show_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "show_cpu_freq_mask <mask>, Get the current frequency for each "
+			"core specified in the mask",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq_mask,
+		(void *)&cmd_show_cpu_freq_mask_core_mask,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_mask_result {
+	cmdline_fixed_string_t set_cpu_freq_mask;
+	uint64_t core_mask;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	struct cmd_set_cpu_freq_mask_result *res = parsed_result;
+	int ret = -1;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_mask_up(res->core_mask);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_mask_down(res->core_mask);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_mask_min(res->core_mask);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_mask_max(res->core_mask);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core_mask(0x%"PRIx64") '%s' , not "
+				"all cores specified have been scaled\n",
+				res->core_mask, res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			set_cpu_freq_mask, "set_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_set_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			core_mask, UINT64);
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask_result =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_mask_set = {
+	.f = cmd_set_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq_mask <core_mask> <up|down|min|max>, Set the current "
+			"frequency for the cores specified in <core_mask> by scaling "
+			"each up/down/min/max.",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq_mask,
+		(void *)&cmd_set_cpu_freq_mask_core_mask,
+		(void *)&cmd_set_cpu_freq_mask_result,
+		NULL,
+	},
+};
+
+
+
+struct cmd_show_cpu_freq_result {
+	cmdline_fixed_string_t show_cpu_freq;
+	uint8_t core_num;
+};
+
+static void
+cmd_show_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_result *res = parsed_result;
+	uint32_t curr_freq = power_manager_get_current_frequency(res->core_num);
+	if (curr_freq == 0) {
+		cmdline_printf(cl, "Unable to get frequency for core %u\n",
+				res->core_num);
+		return;
+	}
+	cmdline_printf(cl, "Core %u frequency: %"PRIu32"\n", res->core_num,
+			curr_freq);
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_result,
+			show_cpu_freq, "show_cpu_freq");
+
+cmdline_parse_token_num_t cmd_show_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_result,
+			core_num, UINT8);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_set = {
+	.f = cmd_show_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "Get the current frequency for the specified core",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq,
+		(void *)&cmd_show_cpu_freq_core_num,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t core_num;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_core_up(res->core_num);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_core_down(res->core_num);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_core_min(res->core_num);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_core_max(res->core_num);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core(%u) '%s'\n", res->core_num,
+				res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			core_num, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_vm_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_status_op_set,
+		(cmdline_parse_inst_t *)&cmd_show_vm_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_pcpu_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_pcpu_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+
+	cl = cmdline_stdin_new(main_ctx, "vmpower> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/vm_power_cli.h b/examples/vm_power_manager/vm_power_cli.h
new file mode 100644
index 0000000..deccd51
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.h
@@ -0,0 +1,47 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.7.4.1


* [dpdk-dev] [PATCH v5 03/10] CPU Frequency Power Management(Host).
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 01/10] Channel Manager and Monitor for VM Power Management(Host) Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 02/10] VM Power Management CLI(Host) Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 04/10] VM Power Management application and Makefile Pablo de Lara
                           ` (7 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

A wrapper around librte_power (using ACPI cpufreq) that adds locking around the
non-thread-safe library, allowing frequency changes by core mask or by core
number from both the CLI thread and the epoll monitor thread.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/power_manager.c |  244 +++++++++++++++++++++++++++++
 examples/vm_power_manager/power_manager.h |  188 ++++++++++++++++++++++
 2 files changed, 432 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h
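
The per-core locking and mask walk described in the commit message can be
sketched in isolation as below. This is a minimal standalone approximation, not
the code from the patch: it assumes a C11 atomic_flag spinlock in place of
rte_spinlock_t, and sketch_freq_up() is a hypothetical stand-in for a
non-thread-safe call such as rte_power_freq_up(). All sketch_* names are
invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 64 /* assumed to mirror POWER_MGR_MAX_CPUS */

static atomic_flag sketch_lock[SKETCH_MAX_CPUS]; /* one spinlock per core */
static unsigned sketch_calls[SKETCH_MAX_CPUS];   /* calls seen per core */
static uint64_t sketch_enabled;                  /* mask of usable cores */

/* Hypothetical stand-in for the non-thread-safe library call. */
static int
sketch_freq_up(unsigned core)
{
	assert(core < SKETCH_MAX_CPUS);
	sketch_calls[core]++;
	return 1; /* 1 is taken to mean "frequency changed" */
}

static void
sketch_init(uint64_t enabled_mask)
{
	unsigned i;

	for (i = 0; i < SKETCH_MAX_CPUS; i++) {
		atomic_flag_clear(&sketch_lock[i]);
		sketch_calls[i] = 0;
	}
	sketch_enabled = enabled_mask;
}

/* Returns 0 if every selected, enabled core was scaled, -1 otherwise. */
static int
sketch_scale_mask_up(uint64_t core_mask)
{
	unsigned i;
	int ret = 0;

	for (i = 0; i < SKETCH_MAX_CPUS && core_mask != 0; i++) {
		/* 64-bit constant so that bits 31..63 are handled correctly */
		const uint64_t bit = UINT64_C(1) << i;

		if (!(core_mask & bit))
			continue;
		core_mask &= ~bit;
		if (!(sketch_enabled & bit))
			continue; /* silently skip cores that are not managed */
		while (atomic_flag_test_and_set_explicit(&sketch_lock[i],
				memory_order_acquire))
			; /* spin until this core's lock is free */
		if (sketch_freq_up(i) != 1)
			ret = -1;
		atomic_flag_clear_explicit(&sketch_lock[i],
				memory_order_release);
	}
	return ret;
}
```

Taking one lock per core rather than a single global lock lets the CLI thread
and the monitor thread scale different cores concurrently while still
serialising calls that target the same core.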

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
new file mode 100644
index 0000000..b7b1fca
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.c
@@ -0,0 +1,244 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/types.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_power.h>
+#include <rte_spinlock.h>
+
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+
+#define POWER_SCALE_CORE(DIRECTION, core_num, ret) do { \
+	if (core_num >= POWER_MGR_MAX_CPUS) \
+		return -1; \
+	if (!(global_enabled_cpus & (1ULL << core_num))) \
+		return -1; \
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
+	ret = rte_power_freq_##DIRECTION(core_num); \
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl); \
+} while (0)
+
+#define POWER_SCALE_MASK(DIRECTION, core_mask, ret) do { \
+	int i; \
+	for (i = 0; core_mask; core_mask &= ~(1ULL << i++)) { \
+		if ((core_mask >> i) & 1) { \
+			if (!(global_enabled_cpus & (1ULL << i))) \
+				continue; \
+			rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
+			if (rte_power_freq_##DIRECTION(i) != 1) \
+				ret = -1; \
+			rte_spinlock_unlock(&global_core_freq_info[i].power_sl); \
+		} \
+	} \
+} while (0)
+
+struct freq_info {
+	rte_spinlock_t power_sl;
+	uint32_t freqs[RTE_MAX_LCORE_FREQS];
+	unsigned num_freqs;
+} __rte_cache_aligned;
+
+static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
+
+static uint64_t global_enabled_cpus;
+
+#define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
+
+static unsigned
+set_host_cpus_mask(void)
+{
+	char path[PATH_MAX];
+	unsigned i;
+	unsigned num_cpus = 0;
+	for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
+		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
+		if (access(path, F_OK) == 0) {
+			global_enabled_cpus |= 1ULL << i;
+			num_cpus++;
+		} else
+			return num_cpus;
+	}
+	return num_cpus;
+}
+
+int
+power_manager_init(void)
+{
+	unsigned i, num_cpus;
+	uint64_t cpu_mask;
+	int ret = 0;
+
+	num_cpus = set_host_cpus_mask();
+	if (num_cpus == 0) {
+		RTE_LOG(ERR, POWER_MANAGER, "Unable to detect host CPUs, please "
+				"ensure that sufficient privileges exist to inspect sysfs\n");
+		return -1;
+	}
+	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+	cpu_mask = global_enabled_cpus;
+	for (i = 0; cpu_mask; cpu_mask &= ~(1ULL << i++)) {
+		if (rte_power_init(i) < 0 || rte_power_freqs(i,
+				global_core_freq_info[i].freqs,
+				RTE_MAX_LCORE_FREQS) == 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to initialize power manager "
+					"for core %u\n", i);
+			global_enabled_cpus &= ~(1ULL << i);
+			num_cpus--;
+			ret = -1;
+		}
+		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+	}
+	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs, enabled core mask:"
+					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	return ret;
+
+}
+
+uint32_t
+power_manager_get_current_frequency(unsigned core_num)
+{
+	uint32_t freq, index;
+
+	if (core_num >= POWER_MGR_MAX_CPUS) {
+		RTE_LOG(ERR, POWER_MANAGER, "Core(%u) is out of range 0...%d\n",
+				core_num, POWER_MGR_MAX_CPUS-1);
+		return 0;
+	}
+	if (!(global_enabled_cpus & (1ULL << core_num)))
+		return 0;
+
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
+	index = rte_power_get_freq(core_num);
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl);
+	if (index >= RTE_MAX_LCORE_FREQS)
+		freq = 0;
+	else
+		freq = global_core_freq_info[core_num].freqs[index];
+
+	return freq;
+}
+
+int
+power_manager_exit(void)
+{
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1ULL << i++)) {
+		if (rte_power_exit(i) < 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+					"for core %u\n", i);
+			ret = -1;
+		}
+	}
+	global_enabled_cpus = 0;
+	return ret;
+}
+
+int
+power_manager_scale_mask_up(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(up, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_down(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(down, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_min(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(min, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_max(uint64_t core_mask)
+{
+	int ret = 0;
+	POWER_SCALE_MASK(max, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_up(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(up, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_down(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(down, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_min(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(min, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_max(unsigned core_num)
+{
+	int ret = 0;
+	POWER_SCALE_CORE(max, core_num, ret);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
new file mode 100644
index 0000000..1b45bab
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.h
@@ -0,0 +1,188 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef POWER_MANAGER_H_
+#define POWER_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Maximum number of CPUS to manage */
+#define POWER_MGR_MAX_CPUS 64
+/**
+ * Initialize power management.
+ * Initializes resources and verifies the number of CPUs on the system.
+ * Wraps librte_power int rte_power_init(unsigned lcore_id);
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_init(void);
+
+/**
+ * Exit power management. Must be called prior to exiting the application.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_exit(void);
+
+/**
+ * Scale up the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_up(uint64_t core_mask);
+
+/**
+ * Scale down the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_down(uint64_t core_mask);
+
+/**
+ * Scale to the minimum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_min(uint64_t core_mask);
+
+/**
+ * Scale to the maximum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_max(uint64_t core_mask);
+
+/**
+ * Scale up frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_up(unsigned core_num);
+
+/**
+ * Scale down frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_down(unsigned core_num);
+
+/**
+ * Scale to minimum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_min(unsigned core_num);
+
+/**
+ * Scale to maximum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_max(unsigned core_num);
+
+/**
+ * Get the current frequency of the core specified by core_num
+ *
+ * @param core_num
+ *  The core number to get the current frequency
+ *
+ * @return
+ *  - 0  on error
+ *  - >0 for current frequency.
+ */
+uint32_t power_manager_get_current_frequency(unsigned core_num);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* POWER_MANAGER_H_ */
-- 
1.7.4.1

* [dpdk-dev] [PATCH v5 04/10] VM Power Management application and Makefile.
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
                           ` (2 preceding siblings ...)
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 03/10] CPU Frequency Power Management(Host) Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 05/10] VM Power Management CLI(Guest) Pablo de Lara
                           ` (6 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Launches the CLI thread and the monitor thread and initialises resources.
Requires a minimum of two lcores to run; additional cores specified by the EAL
core mask are not used.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/Makefile |   57 +++++++++++++++++
 examples/vm_power_manager/main.c   |  117 ++++++++++++++++++++++++++++++++++++
 examples/vm_power_manager/main.h   |   52 ++++++++++++++++
 3 files changed, 226 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
new file mode 100644
index 0000000..0bf3cfc
--- /dev/null
+++ b/examples/vm_power_manager/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = vm_power_mgr
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
+SRCS-y += channel_monitor.c
+
+CFLAGS += -O3 -lvirt -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
new file mode 100644
index 0000000..875274e
--- /dev/null
+++ b/examples/vm_power_manager/main.c
@@ -0,0 +1,117 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+#include <rte_launch.h>
+#include <rte_log.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "vm_power_cli.h"
+#include "main.h"
+
+static int
+run_monitor(__attribute__((unused)) void *arg)
+{
+	if (channel_monitor_init() < 0) {
+		printf("Unable to initialize channel monitor\n");
+		return -1;
+	}
+	run_channel_monitor();
+	return 0;
+}
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	lcore_id = rte_get_next_lcore(-1, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of two cores is required to run "
+				"the application\n");
+		return 0;
+	}
+	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
+
+	if (power_manager_init() < 0) {
+		printf("Unable to initialize power manager\n");
+		return -1;
+	}
+	if (channel_manager_init(CHANNEL_MGR_DEFAULT_HV_PATH) < 0) {
+		printf("Unable to initialize channel manager\n");
+		return -1;
+	}
+	run_cli(NULL);
+
+	rte_eal_mp_wait_lcore();
+	return 0;
+}
diff --git a/examples/vm_power_manager/main.h b/examples/vm_power_manager/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
-- 
1.7.4.1

* [dpdk-dev] [PATCH v5 05/10] VM Power Management CLI(Guest).
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
                           ` (3 preceding siblings ...)
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 04/10] VM Power Management application and Makefile Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 06/10] VM communication channels for VM Power Management(Guest) Pablo de Lara
                           ` (5 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Provides a small sample application (guest_vm_power_mgr) to run on a VM.
The application is run by providing a core mask (-c) and the number of memory
channels (-n). The core mask determines which lcore channels the application
attempts to open. A maximum of 64 channels per VM is allowed. The channels must
be monitored by the host.
After successful initialisation, a CPU frequency command can be sent to the
host using:
set_cpu_freq <lcore_num> <up|down|min|max>

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/guest_cli/Makefile       |   56 +++++++
 examples/vm_power_manager/guest_cli/main.c         |   87 +++++++++++
 examples/vm_power_manager/guest_cli/main.h         |   52 +++++++
 .../guest_cli/vm_power_cli_guest.c                 |  155 ++++++++++++++++++++
 .../guest_cli/vm_power_cli_guest.h                 |   55 +++++++
 5 files changed, 405 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
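
The command syntax given in the commit message, set_cpu_freq <lcore_num>
<up|down|min|max>, can be illustrated with a small standalone parser. This is
only a sketch under stated assumptions: it uses sscanf() from the C standard
library and does not reproduce how the patch itself parses commands. The
sketch_* names and the enum are hypothetical, and the limit of 64 is taken from
the maximum channel count stated above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_CHANNELS 64 /* "maximum of 64 channels per VM" */

/* Order matches the keyword table in sketch_parse_cmd(). */
enum sketch_scale { SKETCH_UP, SKETCH_DOWN, SKETCH_MIN, SKETCH_MAX };

/*
 * Parse "set_cpu_freq <lcore_num> <up|down|min|max>".
 * Returns 0 and fills *lcore and *dir on success, -1 on malformed input.
 */
static int
sketch_parse_cmd(const char *line, unsigned *lcore, enum sketch_scale *dir)
{
	static const char *const words[] = { "up", "down", "min", "max" };
	char cmd[16], word[8], extra;
	unsigned n, i;

	assert(line != NULL && lcore != NULL && dir != NULL);

	/* Exactly three fields are expected; a fourth means trailing junk. */
	if (sscanf(line, "%15s %u %7s %c", cmd, &n, word, &extra) != 3)
		return -1;
	if (strcmp(cmd, "set_cpu_freq") != 0 || n >= SKETCH_MAX_CHANNELS)
		return -1;

	for (i = 0; i < sizeof(words) / sizeof(words[0]); i++) {
		if (strcmp(word, words[i]) == 0) {
			*lcore = n;
			*dir = (enum sketch_scale)i;
			return 0;
		}
	}
	return -1; /* unknown direction keyword */
}
```

A caller would map the parsed direction onto the matching librte_power request
for the given lcore; that step is omitted here because it depends on the guest
channel being open.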

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
new file mode 100644
index 0000000..0efb8b2
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = guest_vm_power_mgr
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c vm_power_cli_guest.c
+
+CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
new file mode 100644
index 0000000..1e4767a
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -0,0 +1,87 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+/*
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+*/
+#include <signal.h>
+
+#include <rte_lcore.h>
+#include <rte_power.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "vm_power_cli_guest.h"
+#include "main.h"
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	rte_power_set_env(PM_ENV_KVM_VM);
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_init(lcore_id);
+	}
+	run_cli(NULL);
+
+	return 0;
+}
diff --git a/examples/vm_power_manager/guest_cli/main.h b/examples/vm_power_manager/guest_cli/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
new file mode 100644
index 0000000..7c4af4a
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -0,0 +1,155 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <stdint.h>
+#include <string.h>
+#include <stdio.h>
+#include <termios.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_lcore.h>
+
+#include <rte_power.h>
+
+#include "vm_power_cli_guest.h"
+
+
+#define CHANNEL_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+				__attribute__((unused)) struct cmdline *cl,
+			    __attribute__((unused)) void *data)
+{
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t lcore_id;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = rte_power_freq_up(res->lcore_id);
+	else if (!strcmp(res->cmd , "down"))
+		ret = rte_power_freq_down(res->lcore_id);
+	else if (!strcmp(res->cmd , "min"))
+		ret = rte_power_freq_min(res->lcore_id);
+	else if (!strcmp(res->cmd , "max"))
+		ret = rte_power_freq_max(res->lcore_id);
+	if (ret != 1)
+		cmdline_printf(cl, "Error sending message: %s\n", strerror(ret));
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			lcore_id, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+	cl = cmdline_stdin_new(main_ctx, "vmpower(guest)> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
new file mode 100644
index 0000000..0c4bdd5
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -0,0 +1,55 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "channel_commands.h"
+
+int guest_channel_host_connect(unsigned lcore_id);
+
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.7.4.1


* [dpdk-dev] [PATCH v5 06/10] VM communication channels for VM Power Management(Guest).
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
                           ` (4 preceding siblings ...)
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 05/10] VM Power Management CLI(Guest) Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 07/10] librte_power common interface for Guest and Host Pablo de Lara
                           ` (4 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Allows Virtio-Serial devices to be opened on a VM, so that a DPDK
application can send packets to the host-based monitor. The packet format is
specified in channel_commands.h.
Each device appears as a serial device at the path
/dev/virtio-ports/virtio.serial.port.<agent_type>.<lcore_num>, where each lcore
in a DPDK application has exclusive access to one device/channel.
Each channel is opened in non-blocking mode. After a successful open, a test
packet is sent to the host to ensure that the host side is monitoring.
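
As a rough illustration of the open-then-test-send sequence described above,
the following stand-alone sketch switches a descriptor to non-blocking mode and
writes a whole buffer, retrying on EINTR and continuing after short writes.
The helper names set_nonblocking() and send_all() are hypothetical and are not
part of this patch; the patch itself does this work inside
guest_channel_host_connect() and guest_channel_send_msg().

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: put an already open descriptor into non-blocking
 * mode. Returns 0 on success, -1 on failure. */
static int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: write len bytes from buf to fd. A write interrupted
 * by a signal is retried; a short write continues from where it stopped.
 * Returns 0 on success, or the errno value of the failing write. */
static int
send_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return errno;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

In this sketch, as in the patch, any error other than EINTR (including EAGAIN
on a full non-blocking channel) is reported to the caller rather than retried.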

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 lib/librte_power/guest_channel.c |  162 ++++++++++++++++++++++++++++++++++++++
 lib/librte_power/guest_channel.h |   89 +++++++++++++++++++++
 2 files changed, 251 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h

diff --git a/lib/librte_power/guest_channel.c b/lib/librte_power/guest_channel.c
new file mode 100644
index 0000000..2295665
--- /dev/null
+++ b/lib/librte_power/guest_channel.c
@@ -0,0 +1,162 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+static int global_fds[RTE_MAX_LCORE];
+
+int
+guest_channel_host_connect(const char *path, unsigned lcore_id)
+{
+	int flags, ret;
+	struct channel_packet pkt;
+	char fd_path[PATH_MAX];
+	int fd = -1;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	/* check if path is already open */
+	if (global_fds[lcore_id] != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is already open with fd %d\n",
+				lcore_id, global_fds[lcore_id]);
+		return -1;
+	}
+
+	snprintf(fd_path, PATH_MAX, "%s.%u", path, lcore_id);
+	RTE_LOG(INFO, GUEST_CHANNEL, "Opening channel '%s' for lcore %u\n",
+			fd_path, lcore_id);
+	fd = open(fd_path, O_RDWR);
+	if (fd < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Unable to connect to '%s' with error "
+				"%s\n", fd_path, strerror(errno));
+		return -1;
+	}
+
+	flags = fcntl(fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on fcntl get flags for file %s\n",
+				fd_path);
+		goto error;
+	}
+
+	flags |= O_NONBLOCK;
+	if (fcntl(fd, F_SETFL, flags) < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on setting non-blocking mode for "
+				"file %s\n", fd_path);
+		goto error;
+	}
+	/* QEMU needs a delay after connection */
+	sleep(1);
+
+	/* Send a test packet, this command is ignored by the host, but a successful
+	 * send indicates that the host endpoint is monitoring.
+	 */
+	pkt.command = CPU_POWER_CONNECT;
+	global_fds[lcore_id] = fd;
+	ret = guest_channel_send_msg(&pkt, lcore_id);
+	if (ret != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Error on channel '%s' communications "
+				"test: %s\n", fd_path, strerror(ret));
+		goto error;
+	}
+	RTE_LOG(INFO, GUEST_CHANNEL, "Channel '%s' is now connected\n", fd_path);
+	return 0;
+error:
+	close(fd);
+	global_fds[lcore_id] = 0;
+	return -1;
+}
+
+int
+guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id)
+{
+	int ret, buffer_len = sizeof(*pkt);
+	void *buffer = pkt;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+
+	if (global_fds[lcore_id] == 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel is not connected\n");
+		return -1;
+	}
+	while (buffer_len > 0) {
+		ret = write(global_fds[lcore_id], buffer, buffer_len);
+		if (ret == buffer_len)
+			return 0;
+		if (ret == -1) {
+			if (errno == EINTR)
+				continue;
+			return errno;
+		}
+		buffer = (char *)buffer + ret;
+		buffer_len -= ret;
+	}
+	return 0;
+}
+
+void
+guest_channel_host_disconnect(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return;
+	}
+	if (global_fds[lcore_id] == 0)
+		return;
+	close(global_fds[lcore_id]);
+	global_fds[lcore_id] = 0;
+}
diff --git a/lib/librte_power/guest_channel.h b/lib/librte_power/guest_channel.h
new file mode 100644
index 0000000..9e18af5
--- /dev/null
+++ b/lib/librte_power/guest_channel.h
@@ -0,0 +1,89 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#ifndef _GUEST_CHANNEL_H
+#define _GUEST_CHANNEL_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <channel_commands.h>
+
+/**
+ * Connect to the Virtio-Serial VM end-point located in path. It is
+ * thread safe for unique lcore_ids. This function must only be called once from
+ * each lcore.
+ *
+ * @param path
+ *  The path to the serial device on the filesystem
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int guest_channel_host_connect(const char *path, unsigned lcore_id);
+
+/**
+ * Disconnect from an already connected Virtio-Serial Endpoint.
+ *
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ */
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+/**
+ * Send a message contained in pkt over the Virtio-Serial to the host endpoint.
+ *
+ * @param pkt
+ *  Pointer to a populated struct channel_packet
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on channel not connected.
+ *  - errno on write to channel error.
+ */
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.7.4.1


* [dpdk-dev] [PATCH v5 07/10] librte_power common interface for Guest and Host
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
                           ` (5 preceding siblings ...)
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 06/10] VM communication channels for VM Power Management(Guest) Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 08/10] Packet format for VM Power Management(Host and Guest) Pablo de Lara
                           ` (3 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
renaming of functions only.
Added the rte_power_kvm_vm implementation to support Power Management from a VM.

librte_power now hides the implementation based on the environment used.
A new call, rte_power_set_env(), can explicitly set the environment; if it is
not called, auto-detection takes place.

rte_power_kvm_vm supports a subset of the librte_power APIs:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

The other, unsupported APIs return -ENOTSUP.
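
The way the common interface hides the back end can be pictured with the
stand-alone sketch below. It is only an illustration: the enum pm_env, the
stub back ends and set_env() are hypothetical names, and a plain flag stands in
for the atomic compare-and-set that rte_power_set_env() uses. What it keeps
from the patch is that the public entry point is a function pointer bound once
per environment, and that a later call is ignored and returns 0.

```c
#include <stddef.h>

/* Hypothetical environment identifiers, modelled on PM_ENV_* in the patch */
enum pm_env { PM_NOT_SET = 0, PM_ACPI, PM_KVM_VM };

typedef int (*freq_change_fn)(unsigned lcore_id);

/* Stub back ends. A real ACPI back end would write to sysfs, and a real
 * VM back end would send a packet to the host monitor. */
static int acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
static int kvm_vm_freq_up(unsigned lcore_id) { (void)lcore_id; return 2; }

/* Public entry point: unbound until an environment is selected */
static freq_change_fn power_freq_up = NULL;
static enum pm_env current_env = PM_NOT_SET;

/* Bind the entry point to one back end. Once an environment is set, later
 * calls leave it unchanged and return 0. Returns -1 for an unknown
 * environment when nothing has been set yet. */
static int
set_env(enum pm_env env)
{
	if (current_env != PM_NOT_SET)
		return 0;

	switch (env) {
	case PM_ACPI:
		power_freq_up = acpi_freq_up;
		break;
	case PM_KVM_VM:
		power_freq_up = kvm_vm_freq_up;
		break;
	default:
		return -1;
	}
	current_env = env;
	return 0;
}
```

A caller would select the environment once, for example set_env(PM_KVM_VM),
and then call power_freq_up(lcore_id) without knowing which back end is bound.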

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 lib/librte_power/rte_power.c              |  540 ++++-------------------------
 lib/librte_power/rte_power.h              |  120 +++++--
 lib/librte_power/rte_power_acpi_cpufreq.c |  545 +++++++++++++++++++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h |  192 ++++++++++
 lib/librte_power/rte_power_common.h       |   39 ++
 lib/librte_power/rte_power_kvm_vm.c       |  135 +++++++
 lib/librte_power/rte_power_kvm_vm.h       |  179 ++++++++++
 7 files changed, 1248 insertions(+), 502 deletions(-)
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 856da9a..998ed1c 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -31,515 +31,113 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
-#include <stdio.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-#include <signal.h>
-#include <limits.h>
-
-#include <rte_memcpy.h>
 #include <rte_atomic.h>
 
 #include "rte_power.h"
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
 
-#ifdef RTE_LIBRTE_POWER_DEBUG
-#define POWER_DEBUG_TRACE(fmt, args...) do { \
-		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
-	} while (0)
-#else
-#define POWER_DEBUG_TRACE(fmt, args...)
-#endif
-
-#define FOPEN_OR_ERR_RET(f, retval) do { \
-	if ((f) == NULL) { \
-		RTE_LOG(ERR, POWER, "File not openned\n"); \
-		return (retval); \
-	} \
-} while(0)
-
-#define FOPS_OR_NULL_GOTO(ret, label) do { \
-	if ((ret) == NULL) { \
-		RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define FOPS_OR_ERR_GOTO(ret, label) do { \
-	if ((ret) < 0) { \
-		RTE_LOG(ERR, POWER, "File operations failed\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define STR_SIZE     1024
-#define POWER_CONVERT_TO_DECIMAL 10
+enum power_management_env global_default_env = PM_ENV_NOT_SET;
 
-#define POWER_GOVERNOR_USERSPACE "userspace"
-#define POWER_SYSFILE_GOVERNOR   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
-#define POWER_SYSFILE_AVAIL_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
-#define POWER_SYSFILE_SETSPEED   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+volatile uint32_t global_env_cfg_status = 0;
 
-enum power_state {
-	POWER_IDLE = 0,
-	POWER_ONGOING,
-	POWER_USED,
-	POWER_UNKNOWN
-};
+/* function pointers */
+rte_power_freqs_t rte_power_freqs  = NULL;
+rte_power_get_freq_t rte_power_get_freq = NULL;
+rte_power_set_freq_t rte_power_set_freq = NULL;
+rte_power_freq_change_t rte_power_freq_up = NULL;
+rte_power_freq_change_t rte_power_freq_down = NULL;
+rte_power_freq_change_t rte_power_freq_max = NULL;
+rte_power_freq_change_t rte_power_freq_min = NULL;
 
-/**
- * Power info per lcore.
- */
-struct rte_power_info {
-	unsigned lcore_id;                   /**< Logical core id */
-	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
-	uint32_t nb_freqs;                   /**< number of available freqs */
-	FILE *f;                             /**< FD of scaling_setspeed */
-	char governor_ori[32];               /**< Original governor name */
-	uint32_t curr_idx;                   /**< Freq index in freqs array */
-	volatile uint32_t state;             /**< Power in use state */
-} __rte_cache_aligned;
-
-static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
-
-/**
- * It is to set specific freq for specific logical core, according to the index
- * of supported frequencies.
- */
-static int
-set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+int
+rte_power_set_env(enum power_management_env env)
 {
-	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
-			"should be less than %u\n", idx, pi->nb_freqs);
-		return -1;
-	}
-
-	/* Check if it is the same as current */
-	if (idx == pi->curr_idx)
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 0, 1) == 0) {
 		return 0;
-
-	POWER_DEBUG_TRACE("Freqency[%u] %u to be set for lcore %u\n",
-				idx, pi->freqs[idx], pi->lcore_id);
-	if (fseek(pi->f, 0, SEEK_SET) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to set file position indicator to 0 "
-			"for setting frequency for lcore %u\n", pi->lcore_id);
-		return -1;
 	}
-	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to write new frequency for "
-					"lcore %u\n", pi->lcore_id);
+	if (env == PM_ENV_ACPI_CPUFREQ) {
+		rte_power_freqs = rte_power_acpi_cpufreq_freqs;
+		rte_power_get_freq = rte_power_acpi_cpufreq_get_freq;
+		rte_power_set_freq = rte_power_acpi_cpufreq_set_freq;
+		rte_power_freq_up = rte_power_acpi_cpufreq_freq_up;
+		rte_power_freq_down = rte_power_acpi_cpufreq_freq_down;
+		rte_power_freq_min = rte_power_acpi_cpufreq_freq_min;
+		rte_power_freq_max = rte_power_acpi_cpufreq_freq_max;
+	} else if (env == PM_ENV_KVM_VM) {
+		rte_power_freqs = rte_power_kvm_vm_freqs;
+		rte_power_get_freq = rte_power_kvm_vm_get_freq;
+		rte_power_set_freq = rte_power_kvm_vm_set_freq;
+		rte_power_freq_up = rte_power_kvm_vm_freq_up;
+		rte_power_freq_down = rte_power_kvm_vm_freq_down;
+		rte_power_freq_min = rte_power_kvm_vm_freq_min;
+		rte_power_freq_max = rte_power_kvm_vm_freq_max;
+	} else {
+		RTE_LOG(ERR, POWER, "Invalid Power Management Environment(%d) set\n",
+				env);
+		rte_power_unset_env();
 		return -1;
 	}
-	fflush(pi->f);
-	pi->curr_idx = idx;
-
-	return 1;
-}
-
-/**
- * It is to check the current scaling governor by reading sys file, and then
- * set it into 'userspace' if it is not by writing the sys file. The original
- * governor will be saved for rolling back.
- */
-static int
-power_set_governor_userspace(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if current governor is userspace */
-	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
-		sizeof(POWER_GOVERNOR_USERSPACE)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
-					"already userspace\n", pi->lcore_id);
-		goto out;
-	}
-	/* Save the original governor */
-	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
-
-	/* Write 'userspace' to the governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(POWER_GOVERNOR_USERSPACE, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
-			"set to user space successfully\n", pi->lcore_id);
-out:
-	fclose(f);
+	global_default_env = env;
+	return 0;
 
-	return ret;
 }
 
-/**
- * It is to get the available frequencies of the specific lcore by reading the
- * sys file.
- */
-static int
-power_get_available_freqs(struct rte_power_info *pi)
+void
+rte_power_unset_env(void)
 {
-	FILE *f;
-	int ret = -1, i, count;
-	char *p;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *freqs[RTE_MAX_LCORE_FREQS];
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
-								pi->lcore_id);
-	f = fopen(fullpath, "r");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Strip the line break if there is */
-	p = strchr(buf, '\n');
-	if (p != NULL)
-		*p = 0;
-
-	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
-	count = rte_strsplit(buf, sizeof(buf), freqs,
-				RTE_MAX_LCORE_FREQS, ' ');
-	if (count <= 0) {
-		RTE_LOG(ERR, POWER, "No available frequency in "
-			""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
-		goto out;
-	}
-	if (count >= RTE_MAX_LCORE_FREQS) {
-		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
-								count);
-		goto out;
-	}
-
-	/* Store the available frequncies into power context */
-	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
-		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
-								i, freqs[i]);
-		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
-					POWER_CONVERT_TO_DECIMAL);
-	}
-
-	ret = 0;
-	POWER_DEBUG_TRACE("%d frequencie(s) of lcore %u are available\n",
-						count, pi->lcore_id);
-out:
-	fclose(f);
-
-	return ret;
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 1, 0) != 0)
+		global_default_env = PM_ENV_NOT_SET;
 }
 
-/**
- * It is to fopen the sys file for the future setting the lcore frequency.
- */
-static int
-power_init_for_setting_freq(struct rte_power_info *pi)
-{
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t i, freq;
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, -1);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
-	for (i = 0; i < pi->nb_freqs; i++) {
-		if (freq == pi->freqs[i]) {
-			pi->curr_idx = i;
-			pi->f = f;
-			return 0;
-		}
-	}
-
-out:
-	fclose(f);
-
-	return -1;
+enum power_management_env
+rte_power_get_env(void) {
+	return global_default_env;
 }
 
 int
 rte_power_init(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"in use\n", lcore_id);
-		return -1;
-	}
-
-	pi->lcore_id = lcore_id;
-	/* Check and set the governor */
-	if (power_set_governor_userspace(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
-						"userspace\n", lcore_id);
-		goto fail;
-	}
+	int ret = -1;
 
-	/* Get the available frequencies */
-	if (power_get_available_freqs(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ) {
+		return rte_power_acpi_cpufreq_init(lcore_id);
 	}
-
-	/* Init for setting lcore frequency */
-	if (power_init_for_setting_freq(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_KVM_VM) {
+		return rte_power_kvm_vm_init(lcore_id);
 	}
-
-	/* Set freq to max by default */
-	if (rte_power_freq_max(lcore_id) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
-						"to max\n", lcore_id);
-		goto fail;
+	/* Auto detect Environment */
+	RTE_LOG(INFO, POWER, "Attempting to initialise ACPI cpufreq power "
+			"management...\n");
+	ret = rte_power_acpi_cpufreq_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+		goto out;
 	}
 
-	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
-					"power manamgement\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
-
-	return -1;
-}
-
-/**
- * It is to check the governor and then set the original governor back if
- * needed by writing the the sys file.
- */
-static int
-power_set_governor_original(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if the governor to be set is the same as current */
-	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u "
-					"has already been set to %s\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(INFO, POWER, "Attempting to initialise VM power management...\n");
+	ret = rte_power_kvm_vm_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_KVM_VM);
 		goto out;
 	}
-
-	/* Write back the original governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(pi->governor_ori, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power manamgement governor of lcore %u "
-				"has been set back to %s successfully\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(ERR, POWER, "Unable to set Power Management Environment for lcore "
+			"%u\n", lcore_id);
 out:
-	fclose(f);
-
 	return ret;
 }
 
 int
 rte_power_exit(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"not used\n", lcore_id);
-		return -1;
-	}
-
-	/* Close FD of setting freq */
-	fclose(pi->f);
-	pi->f = NULL;
-
-	/* Set the governor back to the original */
-	if (power_set_governor_original(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
-					"to the original\n", lcore_id);
-		goto fail;
-	}
-
-	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
-				"'userspace' mode and been set back to the "
-						"original\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ)
+		return rte_power_acpi_cpufreq_exit(lcore_id);
+	if (global_default_env == PM_ENV_KVM_VM)
+		return rte_power_kvm_vm_exit(lcore_id);
 
+	RTE_LOG(ERR, POWER, "Environment has not been set, unable to exit "
+				"gracefully\n");
 	return -1;
-}
-
-uint32_t
-rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
-		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
-		return 0;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (num < pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
-		return 0;
-	}
-	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
-
-	return pi->nb_freqs;
-}
-
-uint32_t
-rte_power_get_freq(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return RTE_POWER_INVALID_FREQ_INDEX;
-	}
-
-	return lcore_power_info[lcore_id].curr_idx;
-}
-
-int
-rte_power_set_freq(unsigned lcore_id, uint32_t index)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
-}
-
-int
-rte_power_freq_down(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
 
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx + 1 == pi->nb_freqs)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx + 1);
 }
-
-int
-rte_power_freq_up(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx == 0)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx - 1);
-}
-
-int
-rte_power_freq_max(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(&lcore_power_info[lcore_id], 0);
-}
-
-int
-rte_power_freq_min(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->nb_freqs - 1);
-}
-
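The reworked rte_power_init() above uses a configured environment directly and otherwise probes ACPI cpufreq first, then the KVM guest path, recording whichever succeeds. A minimal sketch of that control flow follows. The backend functions and the two availability flags are hypothetical stand-ins so that the flow can be exercised outside DPDK; they are not part of the patch.

```c
#include <stdio.h>

/* Mirrors the three states of enum power_management_env in the patch. */
enum sketch_pm_env { SKETCH_NOT_SET, SKETCH_ACPI_CPUFREQ, SKETCH_KVM_VM };

/* Hypothetical switches that let a caller simulate each backend. */
static int acpi_available;
static int kvm_available;

static enum sketch_pm_env sketch_env = SKETCH_NOT_SET;

/* Stand-in for rte_power_acpi_cpufreq_init(): 0 on success, -1 on error. */
static int sketch_acpi_init(unsigned lcore_id)
{
	(void)lcore_id;
	return acpi_available ? 0 : -1;
}

/* Stand-in for rte_power_kvm_vm_init(): 0 on success, -1 on error. */
static int sketch_kvm_init(unsigned lcore_id)
{
	(void)lcore_id;
	return kvm_available ? 0 : -1;
}

int sketch_power_init(unsigned lcore_id)
{
	/* A configured environment is used without any probing. */
	if (sketch_env == SKETCH_ACPI_CPUFREQ)
		return sketch_acpi_init(lcore_id);
	if (sketch_env == SKETCH_KVM_VM)
		return sketch_kvm_init(lcore_id);

	/* Auto-detect: first backend to initialise becomes the default. */
	if (sketch_acpi_init(lcore_id) == 0) {
		sketch_env = SKETCH_ACPI_CPUFREQ;
		return 0;
	}
	if (sketch_kvm_init(lcore_id) == 0) {
		sketch_env = SKETCH_KVM_VM;
		return 0;
	}

	fprintf(stderr, "no power management environment for lcore %u\n",
			lcore_id);
	return -1;
}
```

One consequence of this ordering is that the first lcore to initialise successfully decides the environment for every later lcore, which is why the header notes that setting the environment is not thread safe.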
diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
index 9c1419e..9338069 100644
--- a/lib/librte_power/rte_power.h
+++ b/lib/librte_power/rte_power.h
@@ -48,12 +48,48 @@
 extern "C" {
 #endif
 
-#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+/* Power Management Environment State */
+enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM};
 
 /**
- * Initialize power management for a specific lcore. It will check and set the
- * governor to userspace for the lcore, get the available frequencies, and
- * prepare to set new lcore frequency.
+ * Set the default power management implementation. If this is not called prior
+ * to rte_power_init(), then auto-detect of the environment will take place.
+ * It is not thread safe.
+ *
+ * @param env
+ *  The environment in which to initialise Power Management.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_set_env(enum power_management_env env);
+
+/**
+ * Unset the global environment configuration.
+ * This can only be called after all threads have completed.
+ *
+ * @param None.
+ *
+ * @return
+ *  None.
+ */
+void rte_power_unset_env(void);
+
+/**
+ * Get the default power management implementation.
+ *
+ * @param None.
+ *
+ * @return
+ *  power_management_env The configured environment.
+ */
+enum power_management_env rte_power_get_env(void);
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
  *
  * @param lcore_id
  *  lcore id.
@@ -65,8 +101,9 @@ extern "C" {
 int rte_power_init(unsigned lcore_id);
 
 /**
- * Exit power management on a specific lcore. It will set the governor to which
- * is before initialized.
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
  *
  * @param lcore_id
  *  lcore id.
@@ -78,11 +115,9 @@ int rte_power_init(unsigned lcore_id);
 int rte_power_exit(unsigned lcore_id);
 
 /**
- * Get the available frequencies of a specific lcore. The return value will be
- * the minimal one of the total number of available frequencies and the number
- * of buffer. The index of available frequencies used in other interfaces
- * should be in the range of 0 to this return value.
- * It should be protected outside of this function for threadsafe.
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -94,12 +129,15 @@ int rte_power_exit(unsigned lcore_id);
  * @return
  *  The number of available frequencies.
  */
-uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
+typedef uint32_t (*rte_power_freqs_t)(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+extern rte_power_freqs_t rte_power_freqs;
 
 /**
- * Return the current index of available frequencies of a specific lcore. It
- * will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)' if error.
- * It should be protected outside of this function for threadsafe.
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -107,12 +145,15 @@ uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
  * @return
  *  The current index of available frequencies.
  */
-uint32_t rte_power_get_freq(unsigned lcore_id);
+typedef uint32_t (*rte_power_get_freq_t)(unsigned lcore_id);
+
+extern rte_power_get_freq_t rte_power_get_freq;
 
 /**
  * Set the new frequency for a specific lcore by indicating the index of
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -121,70 +162,87 @@ uint32_t rte_power_get_freq(unsigned lcore_id);
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_set_freq(unsigned lcore_id, uint32_t index);
+typedef int (*rte_power_set_freq_t)(unsigned lcore_id, uint32_t index);
+
+extern rte_power_set_freq_t rte_power_set_freq;
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned lcore_id);
 
 /**
  * Scale up the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_up(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_up;
 
 /**
  * Scale down the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_down(unsigned lcore_id);
+
+extern rte_power_freq_change_t rte_power_freq_down;
 
 /**
  * Scale up the frequency of a specific lcore to the highest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_max(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_max;
 
 /**
  * Scale down the frequency of a specific lcore to the lowest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_min(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_min;
 
 #ifdef __cplusplus
 }
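The header now exposes the frequency calls as extern function pointers rather than functions, so one .c file has to define each pointer and something has to bind it to a backend. The binding code is not in this part of the patch; the sketch below assumes it happens when the environment is set. All names prefixed with sketch_ and both backend functions are hypothetical.

```c
#include <stddef.h>

/* Same shape as rte_power_freq_change_t in the patch. */
typedef int (*sketch_freq_change_t)(unsigned lcore_id);

/*
 * A header would carry `extern sketch_freq_change_t sketch_freq_max;`.
 * Exactly one .c file then provides the definition, as done here.
 */
sketch_freq_change_t sketch_freq_max = NULL;

enum sketch_env { SKETCH_ENV_NOT_SET, SKETCH_ENV_ACPI, SKETCH_ENV_KVM };

/* Records which hypothetical backend ran last, for illustration only. */
static enum sketch_env last_backend = SKETCH_ENV_NOT_SET;

static int acpi_freq_max(unsigned lcore_id)
{
	(void)lcore_id;
	last_backend = SKETCH_ENV_ACPI;
	return 1; /* frequency changed */
}

static int kvm_freq_max(unsigned lcore_id)
{
	(void)lcore_id;
	last_backend = SKETCH_ENV_KVM;
	return 1; /* request forwarded to the host */
}

/* Bind the public pointer to the backend that matches the environment. */
int sketch_set_env(enum sketch_env env)
{
	switch (env) {
	case SKETCH_ENV_ACPI:
		sketch_freq_max = acpi_freq_max;
		return 0;
	case SKETCH_ENV_KVM:
		sketch_freq_max = kvm_freq_max;
		return 0;
	default:
		sketch_freq_max = NULL;
		return -1;
	}
}
```

With this layout existing callers keep writing sketch_freq_max(lcore_id) unchanged, but the pointer is NULL until an environment has been bound, so calling it before initialisation would crash rather than return an error.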
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.c b/lib/librte_power/rte_power_acpi_cpufreq.c
new file mode 100644
index 0000000..09085c3
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.c
@@ -0,0 +1,545 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+
+#include <rte_memcpy.h>
+#include <rte_atomic.h>
+
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_common.h"
+
+#ifdef RTE_LIBRTE_POWER_DEBUG
+#define POWER_DEBUG_TRACE(fmt, args...) do { \
+		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
+} while (0)
+#else
+#define POWER_DEBUG_TRACE(fmt, args...)
+#endif
+
+#define FOPEN_OR_ERR_RET(f, retval) do { \
+		if ((f) == NULL) { \
+			RTE_LOG(ERR, POWER, "File not opened\n"); \
+			return retval; \
+		} \
+} while (0)
+
+#define FOPS_OR_NULL_GOTO(ret, label) do { \
+		if ((ret) == NULL) { \
+			RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define FOPS_OR_ERR_GOTO(ret, label) do { \
+		if ((ret) < 0) { \
+			RTE_LOG(ERR, POWER, "File operations failed\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define STR_SIZE     1024
+#define POWER_CONVERT_TO_DECIMAL 10
+
+#define POWER_GOVERNOR_USERSPACE "userspace"
+#define POWER_SYSFILE_GOVERNOR   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
+#define POWER_SYSFILE_AVAIL_FREQ \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
+#define POWER_SYSFILE_SETSPEED   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+
+enum power_state {
+	POWER_IDLE = 0,
+	POWER_ONGOING,
+	POWER_USED,
+	POWER_UNKNOWN
+};
+
+/**
+ * Power info per lcore.
+ */
+struct rte_power_info {
+	unsigned lcore_id;                   /**< Logical core id */
+	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
+	uint32_t nb_freqs;                   /**< number of available freqs */
+	FILE *f;                             /**< FD of scaling_setspeed */
+	char governor_ori[32];               /**< Original governor name */
+	uint32_t curr_idx;                   /**< Freq index in freqs array */
+	volatile uint32_t state;             /**< Power in use state */
+} __rte_cache_aligned;
+
+static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
+
+/**
+ * Set the frequency of a specific logical core, selected by index into its
+ * array of supported frequencies.
+ */
+static int
+set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+{
+	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
+				"should be less than %u\n", idx, pi->nb_freqs);
+		return -1;
+	}
+
+	/* Check if it is the same as current */
+	if (idx == pi->curr_idx)
+		return 0;
+
+	POWER_DEBUG_TRACE("Frequency[%u] %u to be set for lcore %u\n",
+			idx, pi->freqs[idx], pi->lcore_id);
+	if (fseek(pi->f, 0, SEEK_SET) < 0) {
+		RTE_LOG(ERR, POWER, "Failed to set file position indicator to 0 "
+				"for setting frequency for lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
+		RTE_LOG(ERR, POWER, "Failed to write new frequency for "
+				"lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	fflush(pi->f);
+	pi->curr_idx = idx;
+
+	return 1;
+}
+
+/**
+ * Check the current scaling governor by reading the sys file and, if it is
+ * not already 'userspace', set it to 'userspace' by writing the sys file. The
+ * original governor is saved so that it can be restored on exit.
+ */
+static int
+power_set_governor_userspace(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Check if current governor is userspace; buf keeps its trailing '\n' */
+	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
+			sizeof(POWER_GOVERNOR_USERSPACE) - 1) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
+				"already userspace\n", pi->lcore_id);
+		goto out;
+	}
+	/* Save the original governor */
+	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
+
+	/* Write 'userspace' to the governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(POWER_GOVERNOR_USERSPACE, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
+			"set to user space successfully\n", pi->lcore_id);
+out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * Get the available frequencies of the specific lcore by reading the sys
+ * file.
+ */
+static int
+power_get_available_freqs(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1, i, count;
+	char *p;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *freqs[RTE_MAX_LCORE_FREQS];
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
+			pi->lcore_id);
+	f = fopen(fullpath, "r"); /* read-only: this file is never written */
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Strip the line break if there is one */
+	p = strchr(buf, '\n');
+	if (p != NULL)
+		*p = 0;
+
+	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
+	count = rte_strsplit(buf, sizeof(buf), freqs,
+			RTE_MAX_LCORE_FREQS, ' ');
+	if (count <= 0) {
+		RTE_LOG(ERR, POWER, "No available frequency in "
+				""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
+		goto out;
+	}
+	if (count >= RTE_MAX_LCORE_FREQS) {
+		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
+				count);
+		goto out;
+	}
+
+	/* Store the available frequencies into power context */
+	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
+		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
+				i, freqs[i]);
+		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
+				POWER_CONVERT_TO_DECIMAL);
+	}
+
+	ret = 0;
+	POWER_DEBUG_TRACE("%d frequency(ies) of lcore %u are available\n",
+			count, pi->lcore_id);
+out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * Open the sys file that is later used to set the lcore frequency.
+ */
+static int
+power_init_for_setting_freq(struct rte_power_info *pi)
+{
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t i, freq;
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, -1);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
+	for (i = 0; i < pi->nb_freqs; i++) {
+		if (freq == pi->freqs[i]) {
+			pi->curr_idx = i;
+			pi->f = f;
+			return 0;
+		}
+	}
+
+out:
+	fclose(f);
+
+	return -1;
+}
+
+int
+rte_power_acpi_cpufreq_init(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u cannot exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"in use\n", lcore_id);
+		return -1;
+	}
+
+	pi->lcore_id = lcore_id;
+	/* Check and set the governor */
+	if (power_set_governor_userspace(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
+				"userspace\n", lcore_id);
+		goto fail;
+	}
+
+	/* Get the available frequencies */
+	if (power_get_available_freqs(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Init for setting lcore frequency */
+	if (power_init_for_setting_freq(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Set freq to max by default */
+	if (rte_power_acpi_cpufreq_freq_max(lcore_id) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
+				"to max\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
+			"power management\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
+
+	return 0;
+
+fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+/**
+ * Check the governor and then set the original governor back if needed by
+ * writing the sys file.
+ */
+static int
+power_set_governor_original(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Check if the governor to be set is the same as current */
+	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u "
+				"has already been set to %s\n",
+				pi->lcore_id, pi->governor_ori);
+		goto out;
+	}
+
+	/* Write back the original governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(pi->governor_ori, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u "
+			"has been set back to %s successfully\n",
+			pi->lcore_id, pi->governor_ori);
+out:
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_acpi_cpufreq_exit(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u cannot exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"not used\n", lcore_id);
+		return -1;
+	}
+
+	/* Close FD of setting freq */
+	fclose(pi->f);
+	pi->f = NULL;
+
+	/* Set the governor back to the original */
+	if (power_set_governor_original(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
+				"to the original\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
+			"'userspace' mode and been set back to the "
+			"original\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
+
+	return 0;
+
+fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
+		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
+		return 0;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (num < pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
+		return 0;
+	}
+	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
+
+	return pi->nb_freqs;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_get_freq(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return RTE_POWER_INVALID_FREQ_INDEX;
+	}
+
+	return lcore_power_info[lcore_id].curr_idx;
+}
+
+int
+rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
+}
+
+int
+rte_power_acpi_cpufreq_freq_down(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx + 1 == pi->nb_freqs)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx + 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_up(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx == 0)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx - 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_max(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(&lcore_power_info[lcore_id], 0);
+}
+
+int
+rte_power_acpi_cpufreq_freq_min(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->nb_freqs - 1);
+}
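The four scaling helpers above all reduce to index arithmetic on an array sorted from highest to lowest frequency: index 0 is the maximum, "up" moves toward index 0, and stepping past either end is a no-op that returns 0. A minimal sketch of just that arithmetic follows, with the sysfs write left out. The struct, function names, and array size are hypothetical, and the sample frequencies used with it are made up.

```c
#include <stdint.h>

#define SKETCH_MAX_FREQS 8

struct sketch_info {
	uint32_t freqs[SKETCH_MAX_FREQS]; /* kHz, sorted high to low */
	uint32_t nb_freqs;                /* number of valid entries */
	uint32_t curr_idx;                /* index of current frequency */
};

/* Returns 1 if the index changed, 0 if already there, -1 if out of range. */
static int sketch_set_idx(struct sketch_info *pi, uint32_t idx)
{
	if (idx >= pi->nb_freqs)
		return -1;
	if (idx == pi->curr_idx)
		return 0;
	/* The patch also writes freqs[idx] to scaling_setspeed here. */
	pi->curr_idx = idx;
	return 1;
}

/* One step faster: a lower index, because the array runs high to low. */
int sketch_freq_up(struct sketch_info *pi)
{
	if (pi->curr_idx == 0)
		return 0;
	return sketch_set_idx(pi, pi->curr_idx - 1);
}

/* One step slower: a higher index, stopping at the last entry. */
int sketch_freq_down(struct sketch_info *pi)
{
	if (pi->curr_idx + 1 == pi->nb_freqs)
		return 0;
	return sketch_set_idx(pi, pi->curr_idx + 1);
}

int sketch_freq_max(struct sketch_info *pi)
{
	return sketch_set_idx(pi, 0);
}

int sketch_freq_min(struct sketch_info *pi)
{
	/* With nb_freqs == 0 this wraps and is rejected as out of range. */
	return sketch_set_idx(pi, pi->nb_freqs - 1);
}
```

The 1, 0, and negative return values match the contract documented in the headers, so a caller can tell "already at the limit" apart from a failure.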
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.h b/lib/librte_power/rte_power_acpi_cpufreq.h
new file mode 100644
index 0000000..68578e9
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.h
@@ -0,0 +1,192 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_ACPI_CPUFREQ_H
+#define _RTE_POWER_ACPI_CPUFREQ_H
+
+/**
+ * @file
+ * RTE Power Management via userspace ACPI cpufreq
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore. It will check and set the
+ * governor to userspace for the lcore, get the available frequencies, and
+ * prepare to set new lcore frequency.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore. It will restore the governor
+ * that was in use before initialization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore. The return value is the
+ * number of available frequencies, or 0 on error, for example if the supplied
+ * buffer is too small to hold them all. The frequency index used in other
+ * interfaces should be in the range of 0 to this return value minus one.
+ * It is not thread safe; callers must serialise access externally.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of frequencies to get.
+ *
+ * @return
+ *  The number of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore. It
+ * returns RTE_POWER_INVALID_FREQ_INDEX, defined as (~0), on error.
+ * It is not thread safe; callers must serialise access externally.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  The current index of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * It is not thread safe; callers must serialise access externally.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * It is not thread safe; callers must serialise access externally.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * It is not thread safe; callers must serialise access externally.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * This function is not thread-safe; callers must provide their own locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * This function is not thread-safe; callers must provide their own locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/librte_power/rte_power_common.h b/lib/librte_power/rte_power_common.h
new file mode 100644
index 0000000..64bd168
--- /dev/null
+++ b/lib/librte_power/rte_power_common.h
@@ -0,0 +1,39 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_POWER_COMMON_H_
+#define RTE_POWER_COMMON_H_
+
+#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+
+#endif /* RTE_POWER_COMMON_H_ */
diff --git a/lib/librte_power/rte_power_kvm_vm.c b/lib/librte_power/rte_power_kvm_vm.c
new file mode 100644
index 0000000..3ccd92b
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.c
@@ -0,0 +1,135 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <errno.h>
+#include <string.h>
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
+
+#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+static struct channel_packet pkt[CHANNEL_CMDS_MAX_VM_CHANNELS];
+
+
+int
+rte_power_kvm_vm_init(unsigned lcore_id)
+{
+	if (lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+		return -1;
+	}
+	pkt[lcore_id].command = CPU_POWER;
+	pkt[lcore_id].resource_id = lcore_id;
+	return guest_channel_host_connect(FD_PATH, lcore_id);
+}
+
+int
+rte_power_kvm_vm_exit(unsigned lcore_id)
+{
+	guest_channel_host_disconnect(lcore_id);
+	return 0;
+}
+
+uint32_t
+rte_power_kvm_vm_freqs(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t *freqs,
+		__attribute__((unused)) uint32_t num)
+{
+	RTE_LOG(ERR, POWER, "rte_power_freqs is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+uint32_t
+rte_power_kvm_vm_get_freq(__attribute__((unused)) unsigned lcore_id)
+{
+	RTE_LOG(ERR, POWER, "rte_power_get_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_kvm_vm_set_freq(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t index)
+{
+	RTE_LOG(ERR, POWER, "rte_power_set_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+static inline int
+send_msg(unsigned lcore_id, uint32_t scale_direction)
+{
+	int ret;
+	if (lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+		return -1;
+	}
+	pkt[lcore_id].unit = scale_direction;
+	ret = guest_channel_send_msg(&pkt[lcore_id], lcore_id);
+	if (ret == 0)
+		return 1;
+	RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", strerror(ret));
+	return -1;
+}
+
+int
+rte_power_kvm_vm_freq_up(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_UP);
+}
+
+int
+rte_power_kvm_vm_freq_down(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_DOWN);
+}
+
+int
+rte_power_kvm_vm_freq_max(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_MAX);
+}
+
+int
+rte_power_kvm_vm_freq_min(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_MIN);
+}
diff --git a/lib/librte_power/rte_power_kvm_vm.h b/lib/librte_power/rte_power_kvm_vm.h
new file mode 100644
index 0000000..dcbc878
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.h
@@ -0,0 +1,179 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_KVM_VM_H
+#define _RTE_POWER_KVM_VM_H
+
+/**
+ * @file
+ * RTE Power Management KVM VM
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of frequencies to get.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+int rte_power_kvm_vm_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore. This request is forwarded to the
+ * host monitor.
+ * This function is not thread-safe; callers must provide their own locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * This function is not thread-safe; callers must provide their own locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * This function is not thread-safe; callers must provide their own locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * This function is not thread-safe; callers must provide their own locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 08/10] Packet format for VM Power Management(Host and Guest).
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
                           ` (6 preceding siblings ...)
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 07/10] librte_power common interface for Guest and Host Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 09/10] Build system integration for VM Power Management(Guest and Host) Pablo de Lara
                           ` (2 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Provides a command packet format for host and guest.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
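Note for reviewers (not part of the commit): the sketch below shows how a
guest might fill in the command packet defined in this patch before sending
it to the host. The struct and the CPU_POWER* constants are copied from
channel_commands.h; make_power_request() is a hypothetical helper that does
not exist in the patch, and its range check on the scale direction is an
addition of the sketch rather than something rte_power_kvm_vm.c does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copied from channel_commands.h in this patch. */
#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
#define CPU_POWER               1
#define CPU_POWER_SCALE_UP      1
#define CPU_POWER_SCALE_DOWN    2
#define CPU_POWER_SCALE_MAX     3
#define CPU_POWER_SCALE_MIN     4

struct channel_packet {
	uint64_t resource_id; /* core_num, device */
	uint32_t unit;        /* scale down/up/min/max */
	uint32_t command;     /* Power, IO, etc */
};

/*
 * Hypothetical helper (assumption, not in the patch): build a CPU_POWER
 * request for one lcore. Returns 0 on success, -1 on invalid input.
 */
static int
make_power_request(struct channel_packet *pkt, unsigned lcore_id,
		uint32_t scale_direction)
{
	if (pkt == NULL || lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS)
		return -1;
	/* Extra validation added by this sketch only. */
	if (scale_direction < CPU_POWER_SCALE_UP ||
			scale_direction > CPU_POWER_SCALE_MIN)
		return -1;

	memset(pkt, 0, sizeof(*pkt));
	pkt->command = CPU_POWER;
	pkt->resource_id = lcore_id;
	pkt->unit = scale_direction;
	return 0;
}
```

In the guest library the equivalent fields are set in two steps:
rte_power_kvm_vm_init() stores command and resource_id once per lcore, and
send_msg() only updates unit before each request.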
 lib/librte_power/channel_commands.h |   77 +++++++++++++++++++++++++++++++++++
 1 files changed, 77 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_power/channel_commands.h

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
new file mode 100644
index 0000000..7e78a8b
--- /dev/null
+++ b/lib/librte_power/channel_commands.h
@@ -0,0 +1,77 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_COMMANDS_H_
+#define CHANNEL_COMMANDS_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+/* Maximum number of CPUs */
+#define CHANNEL_CMDS_MAX_CPUS        64
+#if CHANNEL_CMDS_MAX_CPUS > 64
+#error Maximum number of cores is 64, overflow is guaranteed to \
+	cause problems with VM Power Management
+#endif
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Valid Commands */
+#define CPU_POWER               1
+#define CPU_POWER_CONNECT       2
+
+/* CPU Power Command Scaling */
+#define CPU_POWER_SCALE_UP      1
+#define CPU_POWER_SCALE_DOWN    2
+#define CPU_POWER_SCALE_MAX     3
+#define CPU_POWER_SCALE_MIN     4
+
+struct channel_packet {
+	uint64_t resource_id; /**< core_num, device */
+	uint32_t unit;        /**< scale down/up/min/max */
+	uint32_t command;     /**< Power, IO, etc */
+};
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_COMMANDS_H_ */
-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 09/10] Build system integration for VM Power Management(Guest and Host)
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
                           ` (7 preceding siblings ...)
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 08/10] Packet format for VM Power Management(Host and Guest) Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 10/10] VM Power Management Unit Tests Pablo de Lara
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

librte_power now contains both rte_power_acpi_cpufreq and rte_power_kvm_vm
implementations.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 lib/librte_power/Makefile |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/lib/librte_power/Makefile b/lib/librte_power/Makefile
index 6185812..d672a5a 100644
--- a/lib/librte_power/Makefile
+++ b/lib/librte_power/Makefile
@@ -37,7 +37,8 @@ LIB = librte_power.a
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -fno-strict-aliasing
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c rte_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += rte_power_kvm_vm.c guest_channel.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_POWER)-include := rte_power.h
-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v5 10/10] VM Power Management Unit Tests
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
                           ` (8 preceding siblings ...)
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 09/10] Build system integration for VM Power Management(Guest and Host) Pablo de Lara
@ 2014-11-21 17:42         ` Pablo de Lara
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-21 17:42 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Updated the unit tests to cover both librte_power implementations as well as
the external API.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
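Note for reviewers (not part of the commit): the reworked test_power.c checks
that the rte_power_* entry points are NULL until an environment is selected.
The standalone sketch below models that behaviour with a single function
pointer. Every demo_* name is invented for illustration; this is not the
rte_power.c implementation, and rejecting a second demo_set_env() call is an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the environment enum used by the tests. */
enum demo_env { DEMO_ENV_NOT_SET, DEMO_ENV_ACPI_CPUFREQ, DEMO_ENV_KVM_VM };

typedef int (*demo_freq_change_t)(unsigned lcore_id);

/* Stub backends: each reports "success with frequency changed". */
static int demo_acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
static int demo_kvm_vm_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }

static enum demo_env demo_current_env = DEMO_ENV_NOT_SET;

/* Public entry point: NULL until an environment has been selected. */
static demo_freq_change_t demo_freq_up = NULL;

/* Bind the entry point to one backend. Returns 0 on success, -1 on error. */
static int
demo_set_env(enum demo_env env)
{
	if (demo_current_env != DEMO_ENV_NOT_SET)
		return -1; /* assumption: an environment can be set only once */

	switch (env) {
	case DEMO_ENV_ACPI_CPUFREQ:
		demo_freq_up = demo_acpi_freq_up;
		break;
	case DEMO_ENV_KVM_VM:
		demo_freq_up = demo_kvm_vm_freq_up;
		break;
	default:
		return -1; /* DEMO_ENV_NOT_SET is not a valid selection */
	}
	demo_current_env = env;
	return 0;
}

/* Return to the unconfigured state, clearing the entry point. */
static void
demo_unset_env(void)
{
	demo_current_env = DEMO_ENV_NOT_SET;
	demo_freq_up = NULL;
}
```

With this shape, a caller that forgets to select an environment fails on a
NULL pointer check instead of silently using the wrong backend, which is the
condition test_power() verifies for each entry point.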
 app/test/Makefile                  |    3 +-
 app/test/autotest_data.py          |   26 ++
 app/test/test_power.c              |  445 +++---------------------------
 app/test/test_power_acpi_cpufreq.c |  544 ++++++++++++++++++++++++++++++++++++
 app/test/test_power_kvm_vm.c       |  308 ++++++++++++++++++++
 5 files changed, 917 insertions(+), 409 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c

diff --git a/app/test/Makefile b/app/test/Makefile
index ebfa0ba..4b03f00 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -120,7 +120,8 @@ endif
 
 SRCS-$(CONFIG_RTE_LIBRTE_METER) += test_meter.c
 SRCS-$(CONFIG_RTE_LIBRTE_KNI) += test_kni.c
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c test_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power_kvm_vm.c
 SRCS-y += test_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 878c72e..618a946 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -425,6 +425,32 @@ non_parallel_test_group_list = [
 	]
 },
 {
+	"Prefix" :      "power_acpi_cpufreq",
+	"Memory" :      all_sockets(512),
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power ACPI cpufreq autotest",
+		 "Command" :    "power_acpi_cpufreq_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
+	"Prefix" :      "power_kvm_vm",
+	"Memory" :      "512",
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power KVM VM autotest",
+		 "Command" :    "power_kvm_vm_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
 	"Prefix" :	"lpm6",
 	"Memory" :	"512",
 	"Tests" :
diff --git a/app/test/test_power.c b/app/test/test_power.c
index d9eb420..64a2305 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -41,437 +41,66 @@
 
 #include <rte_power.h>
 
-#define TEST_POWER_LCORE_ID      2U
-#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
-#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
-
-#define TEST_POWER_SYSFILE_CUR_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
-
-static uint32_t total_freq_num;
-static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
-
-static int
-check_cur_freq(unsigned lcore_id, uint32_t idx)
-{
-#define TEST_POWER_CONVERT_TO_DECIMAL 10
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t cur_freq;
-	int ret = -1;
-
-	if (snprintf(fullpath, sizeof(fullpath),
-		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
-		return 0;
-	}
-	f = fopen(fullpath, "r");
-	if (f == NULL) {
-		return 0;
-	}
-	if (fgets(buf, sizeof(buf), f) == NULL) {
-		goto fail_get_cur_freq;
-	}
-	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
-	ret = (freqs[idx] == cur_freq ? 0 : -1);
-
-fail_get_cur_freq:
-	fclose(f);
-
-	return ret;
-}
-
-/* Check rte_power_freqs() */
-static int
-check_power_freqs(void)
-{
-	uint32_t ret;
-
-	total_freq_num = 0;
-	memset(freqs, 0, sizeof(freqs));
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
-					TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with NULL buffer to save available freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test of getting zero number of freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test with all valid input parameters */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get available freqs on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Save the total number of available freqs */
-	total_freq_num = ret;
-
-	return 0;
-}
-
-/* Check rte_power_get_freq() */
-static int
-check_power_get_freq(void)
-{
-	int ret;
-	uint32_t count;
-
-	/* test with an invalid lcore id */
-	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
-	if (count < TEST_POWER_FREQS_NUM_MAX) {
-		printf("Unexpectedly get freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
-	if (count >= TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get the freq index on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_set_freq() */
-static int
-check_power_set_freq(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
-	if (ret >= 0) {
-		printf("Unexpectedly set freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with an invalid freq index */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/**
-	 * test with an invalid freq index which is right one bigger than
-	 * total number of freqs
-	 */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", total_freq_num,
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0) {
-		printf("Fail to set freq index on lcore %u\n",
-					TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_down() */
-static int
-check_power_freq_down(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale down one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale down one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf ("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_up() */
-static int
-check_power_freq_up(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq on %u\n",
-						TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale up one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale up one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_max() */
-static int
-check_power_freq_max(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq to max on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_min() */
-static int
-check_power_freq_min(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq to min "
-				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
 static int
 test_power(void)
 {
 	int ret = -1;
+	enum power_management_env env;
 
-	/* test of init power management for an invalid lcore */
-	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	/* Test setting an invalid environment */
+	ret = rte_power_set_env(PM_ENV_NOT_SET);
 	if (ret == 0) {
-		printf("Unexpectedly initialise power management successfully "
-				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot initialise power management for lcore %u\n",
-							TEST_POWER_LCORE_ID);
+		printf("Unexpectedly succeeded on setting an invalid environment\n");
 		return -1;
 	}
 
-	/**
-	 * test of initialising power management for the lcore which has
-	 * been initialised
-	 */
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly init successfully power twice on "
-					"lcore %u\n", TEST_POWER_LCORE_ID);
+	/* Test that the environment has not been set */
+	env = rte_power_get_env();
+	if (env != PM_ENV_NOT_SET) {
+		printf("Unexpectedly got a valid environment configuration\n");
 		return -1;
 	}
 
-	ret = check_power_freqs();
-	if (ret < 0)
+	/* verify that function pointers are NULL */
+	if (rte_power_freqs != NULL) {
+		printf("rte_power_freqs should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	if (total_freq_num < 2) {
-		rte_power_exit(TEST_POWER_LCORE_ID);
-		printf("Frequency can not be changed due to CPU itself\n");
-		return 0;
 	}
-
-	ret = check_power_get_freq();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_set_freq();
-	if (ret < 0)
+	if (rte_power_get_freq != NULL) {
+		printf("rte_power_get_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_down();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_freq_up();
-	if (ret < 0)
+	}
+	if (rte_power_set_freq != NULL) {
+		printf("rte_power_set_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_max();
-	if (ret < 0)
+	}
+	if (rte_power_freq_up != NULL) {
+		printf("rte_power_freq_up should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_min();
-	if (ret < 0)
+	}
+	if (rte_power_freq_down != NULL) {
+		printf("rte_power_freq_down should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot exit power management for lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
 	}
-
-	/**
-	 * test of exiting power management for the lcore which has been exited
-	 */
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly exit successfully power management twice "
-					"on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
+	if (rte_power_freq_max != NULL) {
+		printf("rte_power_freq_max should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
-	/* test of exit power management for an invalid lcore */
-	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
-	if (ret == 0) {
-		printf("Unpectedly exit power management successfully for "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
+	if (rte_power_freq_min != NULL) {
+		printf("rte_power_freq_min should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
+	rte_power_unset_env();
 	return 0;
-
 fail_all:
-	rte_power_exit(TEST_POWER_LCORE_ID);
-
+	rte_power_unset_env();
 	return -1;
 }
 
diff --git a/app/test/test_power_acpi_cpufreq.c b/app/test/test_power_acpi_cpufreq.c
new file mode 100644
index 0000000..8848d75
--- /dev/null
+++ b/app/test/test_power_acpi_cpufreq.c
@@ -0,0 +1,544 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <limits.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+
+#define TEST_POWER_LCORE_ID      2U
+#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
+#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
+
+#define TEST_POWER_SYSFILE_CUR_FREQ \
+	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
+
+static uint32_t total_freq_num;
+static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
+
+static int
+check_cur_freq(unsigned lcore_id, uint32_t idx)
+{
+#define TEST_POWER_CONVERT_TO_DECIMAL 10
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t cur_freq;
+	int ret = -1;
+
+	if (snprintf(fullpath, sizeof(fullpath),
+		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
+		return 0;
+	}
+	f = fopen(fullpath, "r");
+	if (f == NULL) {
+		return 0;
+	}
+	if (fgets(buf, sizeof(buf), f) == NULL) {
+		goto fail_get_cur_freq;
+	}
+	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
+	ret = (freqs[idx] == cur_freq ? 0 : -1);
+
+fail_get_cur_freq:
+	fclose(f);
+
+	return ret;
+}
+
+/* Check rte_power_freqs() */
+static int
+check_power_freqs(void)
+{
+	uint32_t ret;
+
+	total_freq_num = 0;
+	memset(freqs, 0, sizeof(freqs));
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
+					TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with NULL buffer to save available freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test of getting zero number of freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test with all valid input parameters */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get available freqs on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Save the total number of available freqs */
+	total_freq_num = ret;
+
+	return 0;
+}
+
+/* Check rte_power_get_freq() */
+static int
+check_power_get_freq(void)
+{
+	int ret;
+	uint32_t count;
+
+	/* test with an invalid lcore id */
+	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
+	if (count < TEST_POWER_FREQS_NUM_MAX) {
+		printf("Unexpectedly get freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
+	if (count >= TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get the freq index on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_set_freq() */
+static int
+check_power_set_freq(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
+	if (ret >= 0) {
+		printf("Unexpectedly set freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with an invalid freq index */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u) "
+			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/**
+	 * test with an invalid freq index which is right one bigger than
+	 * total number of freqs
+	 */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u) "
+			"successfully on lcore %u\n", total_freq_num,
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0) {
+		printf("Fail to set freq index on lcore %u\n",
+					TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_down() */
+static int
+check_power_freq_down(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale down one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale down one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_up() */
+static int
+check_power_freq_up(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq on %u\n",
+						TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale up one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale up one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_max() */
+static int
+check_power_freq_max(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq to max on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_min() */
+static int
+check_power_freq_min(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq to min "
+				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+static int
+test_power_acpi_cpufreq(void)
+{
+	int ret = -1;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_ACPI_CPUFREQ, this "
+				"may occur if environment is not configured correctly or "
+				"operating in another valid Power management environment\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_ACPI_CPUFREQ) {
+		printf("Unexpectedly got an environment other than ACPI cpufreq\n");
+		goto fail_all;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+
+	/* test of init power management for an invalid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly initialise power management successfully "
+				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly (ACPI cpufreq) or operating in another valid "
+				"Power management environment\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of initialising power management for the lcore which has
+	 * been initialised
+	 */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly init successfully power twice on "
+					"lcore %u\n", TEST_POWER_LCORE_ID);
+		goto fail_all;
+	}
+
+	ret = check_power_freqs();
+	if (ret < 0)
+		goto fail_all;
+
+	if (total_freq_num < 2) {
+		rte_power_exit(TEST_POWER_LCORE_ID);
+		printf("Frequency can not be changed due to CPU itself\n");
+		rte_power_unset_env();
+		return 0;
+	}
+
+	ret = check_power_get_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_set_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_down();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_up();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_max();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_min();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot exit power management for lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of exiting power management for the lcore which has been exited
+	 */
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly exit successfully power management twice "
+					"on lcore %u\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* test of exit power management for an invalid lcore */
+	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly exit power management successfully for "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+	rte_power_unset_env();
+	return 0;
+
+fail_all:
+	rte_power_exit(TEST_POWER_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_acpi_cpufreq_cmd = {
+	.command = "power_acpi_cpufreq_autotest",
+	.callback = test_power_acpi_cpufreq,
+};
+REGISTER_TEST_COMMAND(power_acpi_cpufreq_cmd);
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
new file mode 100644
index 0000000..ac0fcb6
--- /dev/null
+++ b/app/test/test_power_kvm_vm.c
@@ -0,0 +1,308 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <errno.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+#include <rte_config.h>
+
+#define TEST_POWER_VM_LCORE_ID            0U
+#define TEST_POWER_VM_LCORE_OUT_OF_BOUNDS (RTE_MAX_LCORE+1)
+#define TEST_POWER_VM_LCORE_INVALID       1U
+
+static int
+test_power_kvm_vm(void)
+{
+	int ret;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_KVM_VM);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_KVM_VM\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_KVM_VM) {
+		printf("Unexpectedly got a Power Management environment other than "
+				"KVM VM\n");
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	/* Test initialisation of an out of bounds lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret != -1) {
+		printf("rte_power_init unexpectedly succeeded on an invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly (KVM VM) or operating in another valid "
+				"Power management environment\n", TEST_POWER_VM_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of previously initialised lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_init unexpectedly succeeded on calling init twice on "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of invalid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency down of invalid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency min of invalid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency max of invalid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid but uninitialised lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid but uninitialised lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid but uninitialised lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid but uninitialised lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_up unexpectedly failed on valid lcore %u\n",
+				TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_down unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_min unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_max unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_freqs */
+	ret = rte_power_freqs(TEST_POWER_VM_LCORE_ID, NULL, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_freqs did not return the expected -ENOTSUP(%d) but "
+				"returned %d\n", -ENOTSUP, ret);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_get_freq */
+	ret = rte_power_get_freq(TEST_POWER_VM_LCORE_ID);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_get_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_set_freq */
+	ret = rte_power_set_freq(TEST_POWER_VM_LCORE_ID, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_set_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test removing of an lcore */
+	ret = rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_exit unexpectedly failed on valid lcore %u, "
+				"please ensure that the environment has been configured "
+				"correctly\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of previously removed lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_up unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency down of previously removed lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_down unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency min of previously removed lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_min unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+
+	/* Test frequency max of previously removed lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_freq_max unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		return -1;
+	}
+	rte_power_unset_env();
+	return 0;
+fail_all:
+	rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_kvm_vm_cmd = {
+    .command = "power_kvm_vm_autotest",
+    .callback = test_power_kvm_vm,
+};
+REGISTER_TEST_COMMAND(power_kvm_vm_cmd);
-- 
1.7.4.1

* [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management
  2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
                           ` (9 preceding siblings ...)
  2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 10/10] VM Power Management Unit Tests Pablo de Lara
@ 2014-11-25 16:18         ` Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 01/10] Channel Manager and Monitor for VM Power Management(Host) Pablo de Lara
                             ` (10 more replies)
  10 siblings, 11 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

Virtual Machine Power Management.

The following patches add two DPDK sample applications and an alternate
implementation of librte_power for use in virtualized environments.
The idea is to provide librte_power functionality from within a VM, where the
MSRs needed to change frequency are not available.
It is ideally suited to Haswell, which provides per-core frequency scaling.

The current librte_power effects frequency changes via the acpi-cpufreq
'userspace' power governor, accessed via sysfs.
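
As an illustration of the sysfs access mentioned above, the sketch below reads
a core's current frequency from the same scaling_cur_freq file that the unit
tests in this series check. It is a minimal example, not the librte_power code:
the helper names read_freq_file() and read_cur_freq() are hypothetical, and the
value is assumed to be a decimal number in kHz as reported by cpufreq.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Same sysfs file the unit tests read back after a frequency change. */
#define SYSFILE_CUR_FREQ \
	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"

/* Hypothetical helper: parse one decimal frequency value (kHz) from a file.
 * Returns 0 on success, -1 if the file cannot be opened, read or parsed. */
int
read_freq_file(const char *path, unsigned long *freq_khz)
{
	char buf[64];
	char *end;
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return -1;
	if (fgets(buf, sizeof(buf), f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	*freq_khz = strtoul(buf, &end, 10);
	return (errno != 0 || end == buf) ? -1 : 0;
}

/* Hypothetical helper: build the sysfs path for a CPU and read it. */
int
read_cur_freq(unsigned cpu, unsigned long *freq_khz)
{
	char path[128];
	int n = snprintf(path, sizeof(path), SYSFILE_CUR_FREQ, cpu);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return read_freq_file(path, freq_khz);
}
```

Calling read_cur_freq(2, &khz) would report the frequency of CPU 2 on a host
where the cpufreq sysfs files are present; inside a VM the file is typically
absent and the call returns -1.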

General Overview (more information in each patch that follows):
The VM Power Management solution provides two components:

 1)VM: Allows a DPDK application in a VM to reuse the librte_power
 interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
 where the re-implementation of librte_power simply forwards the requests for
 frequency change to a host based monitor. The host monitor itself uses
 librte_power.
 Each lcore channel corresponds to a
 serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
 which is opened in non-blocking mode.
 While each Virtual CPU can be mapped to multiple physical CPUs, it is
 recommended that each vCPU be mapped to a single core only.

 2)Host: The host monitor is managed by a CLI that allows adding qemu/KVM
 virtual machines and associated channels to the monitor, manually changing
 CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
 channels.
 Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
 sockets which follow a specific naming convention,
 i.e. /tmp/powermonitor/<vm_name>.<channel_number>;
 each channel has a 1:1 mapping to a VM endpoint,
 i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>.
 Host channel endpoints are opened in non-blocking mode and are monitored via epoll.
 Requests over each channel to change frequency are forwarded to the original
 librte_power.
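
The naming convention for the two ends of a channel can be sketched as follows.
This is only an illustration under stated assumptions: the helper names
(guest_channel_path, host_channel_path, guest_channel_open) and the buffer size
are hypothetical and are not taken from the patches; only the path formats and
the non-blocking open come from the description above.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Path formats taken from the naming convention described above. */
#define GUEST_CHANNEL_PREFIX "/dev/virtio-ports/virtio.serial.port.poweragent"
#define HOST_CHANNEL_DIR     "/tmp/powermonitor"

/* Hypothetical helper: guest endpoint path for one lcore.
 * Returns 0 on success, -1 if the buffer is too small. */
int
guest_channel_path(char *buf, size_t len, unsigned lcore_id)
{
	int n = snprintf(buf, len, "%s.%u", GUEST_CHANNEL_PREFIX, lcore_id);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: host AF_UNIX socket path for one VM channel. */
int
host_channel_path(char *buf, size_t len, const char *vm_name,
		unsigned channel_num)
{
	int n = snprintf(buf, len, "%s/%s.%u", HOST_CHANNEL_DIR, vm_name,
			channel_num);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: open the guest endpoint in non-blocking mode.
 * Returns a file descriptor, or -1 if the device cannot be opened. */
int
guest_channel_open(unsigned lcore_id)
{
	char path[256];

	if (guest_channel_path(path, sizeof(path), lcore_id) < 0)
		return -1;
	return open(path, O_RDWR | O_NONBLOCK);
}
```

With this sketch, lcore 3 in the guest maps to
/dev/virtio-ports/virtio.serial.port.poweragent.3, and channel 1 of a VM named
vm0 maps to /tmp/powermonitor/vm0.1 on the host.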

Channels must be manually configured as qemu-kvm command line arguments or
in the libvirt domain definition (XML), e.g.
<controller type='virtio-serial' index='0'>
 <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<channel type='unix'>
  <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
  <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>'/>
  <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
</channel>

Multiple channels can be configured by specifying multiple <channel>
elements, replacing <vm_name> and <channel_num> in each.
<N> (the port number) should be incremented by 1 for each new channel element.
More information on Virtio-Serial can be found here:
http://fedoraproject.org/wiki/Features/VirtioSerial
To enable the Hypervisor creation of channels, the host endpoint directory
must be created with qemu permissions:
mkdir /tmp/powermonitor
chown qemu:qemu /tmp/powermonitor

The host application runs on two separate lcores:
Core N) CLI: For management of Virtual Machines adding channels to Monitor thread,
 inspecting state and manually setting CPU frequency [PATCH 02/09]
Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel events
 from VMs and calls the corresponding librte_power functions.
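
One iteration of such an epoll based monitor can be sketched as below. This is
a simplified model rather than the channel_monitor.c code: the function names
monitor_add_channel() and monitor_wait_request() are hypothetical, any readable
file descriptor stands in for the AF_UNIX channel sockets, and the step that
would call librte_power is left to the caller.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: put a channel fd in non-blocking mode and register
 * it with the epoll instance. Returns 0 on success, -1 on error. */
int
monitor_add_channel(int epfd, int fd)
{
	struct epoll_event ev;
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -1;
	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.fd = fd;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: wait for one channel to become readable and read
 * the pending request into buf.
 * Returns the number of bytes read, 0 on timeout, -1 on error. */
int
monitor_wait_request(int epfd, int timeout_ms, int *fd_out,
		char *buf, size_t buflen)
{
	struct epoll_event ev;
	ssize_t len;
	int n = epoll_wait(epfd, &ev, 1, timeout_ms);

	if (n <= 0)
		return n;	/* 0: timeout, -1: epoll error */
	len = read(ev.data.fd, buf, buflen);
	if (len < 0)
		return -1;
	*fd_out = ev.data.fd;
	return (int)len;
}
```

A monitor thread would call monitor_wait_request() in an infinite loop and, for
each request received, invoke the matching frequency function for the physical
core backing the requesting vCPU.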

A sample application is also provided to run on Virtual Machines; this
application provides a CLI to manually set the frequency of a
vCPU [PATCH 08/09]

The current l3fwd-power sample application can also be run on a VM.
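
An unmodified application can run in either environment because the
librte_power entry points are selected at runtime (the unit tests in this
series check them against NULL before and after an environment is set). The
sketch below is a self-contained model of that dispatch, not the library code:
every demo_* name is hypothetical, and the rule that setting an environment
fails while another one is active is an assumption based on the unit test
messages.

```c
#include <stddef.h>

/* Model of the environment selector; mirrors PM_ENV_NOT_SET,
 * PM_ENV_ACPI_CPUFREQ and PM_ENV_KVM_VM from the unit tests. */
enum demo_env { DEMO_ENV_NOT_SET, DEMO_ENV_ACPI_CPUFREQ, DEMO_ENV_KVM_VM };

typedef int (*demo_freq_change_t)(unsigned lcore_id);

/* Public entry point: NULL until an environment has been selected. */
demo_freq_change_t demo_freq_up = NULL;
/* Records which backend handled the last call (for illustration only). */
const char *demo_last_backend = "none";

static enum demo_env demo_current_env = DEMO_ENV_NOT_SET;

/* Stand-in for the host backend that would write to sysfs. */
static int
demo_acpi_freq_up(unsigned lcore_id)
{
	(void)lcore_id;
	demo_last_backend = "acpi_cpufreq";
	return 1;
}

/* Stand-in for the guest backend that would forward to the host. */
static int
demo_kvm_vm_freq_up(unsigned lcore_id)
{
	(void)lcore_id;
	demo_last_backend = "kvm_vm";
	return 1;
}

/* Bind the entry point to one backend. Returns 0 on success, -1 if an
 * environment is already active or the requested one is invalid. */
int
demo_set_env(enum demo_env env)
{
	if (demo_current_env != DEMO_ENV_NOT_SET)
		return -1;
	switch (env) {
	case DEMO_ENV_ACPI_CPUFREQ:
		demo_freq_up = demo_acpi_freq_up;
		break;
	case DEMO_ENV_KVM_VM:
		demo_freq_up = demo_kvm_vm_freq_up;
		break;
	default:
		return -1;
	}
	demo_current_env = env;
	return 0;
}

/* Drop the binding so that another environment can be selected. */
void
demo_unset_env(void)
{
	demo_freq_up = NULL;
	demo_current_env = DEMO_ENV_NOT_SET;
}

enum demo_env
demo_get_env(void)
{
	return demo_current_env;
}
```

The application only ever calls demo_freq_up(lcore_id); whether the request is
applied locally or forwarded to a host monitor is decided once, when the
environment is set.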

Changes in V6:
 Fixed typos and some missing indentation and blank lines

Changes in V5:
 Fixed default target in sample app Makefiles

Changes in V4:
 Fixed double free of channel during VM shutdown.

Changes in V3:
 Fixed crash in Guest CLI when host application is not running.
 Renamed #defines to be more specific to the module they belong to
 Added vCPU pinning via CLI

Changes in V2:
 Runtime selection of librte_power implementations.
 Updated Unit tests to cover librte_power changes.
 PATCH[0/3] was sent twice, again as PATCH[0/4]
 Miscellaneous fixes.

Alan Carew (10):
  Channel Manager and Monitor for VM Power Management(Host).
  VM Power Management CLI(Host).
  CPU Frequency Power Management(Host).
  VM Power Management application and Makefile.
  VM Power Management CLI(Guest).
  VM communication channels for VM Power Management(Guest).
  librte_power common interface for Guest and Host
  Packet format for VM Power Management(Host and Guest).
  Build system integration for VM Power Management(Guest and Host)
  VM Power Management Unit Tests

 app/test/Makefile                                  |    3 +-
 app/test/autotest_data.py                          |   26 +
 app/test/test_power.c                              |  445 +----------
 app/test/test_power_acpi_cpufreq.c                 |  544 +++++++++++++
 app/test/test_power_kvm_vm.c                       |  308 ++++++++
 examples/vm_power_manager/Makefile                 |   57 ++
 examples/vm_power_manager/channel_manager.c        |  808 ++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h        |  314 ++++++++
 examples/vm_power_manager/channel_monitor.c        |  234 ++++++
 examples/vm_power_manager/channel_monitor.h        |  102 +++
 examples/vm_power_manager/guest_cli/Makefile       |   56 ++
 examples/vm_power_manager/guest_cli/main.c         |   88 +++
 examples/vm_power_manager/guest_cli/main.h         |   52 ++
 .../guest_cli/vm_power_cli_guest.c                 |  156 ++++
 .../guest_cli/vm_power_cli_guest.h                 |   55 ++
 examples/vm_power_manager/main.c                   |  117 +++
 examples/vm_power_manager/main.h                   |   52 ++
 examples/vm_power_manager/power_manager.c          |  253 ++++++
 examples/vm_power_manager/power_manager.h          |  188 +++++
 examples/vm_power_manager/vm_power_cli.c           |  673 ++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h           |   47 ++
 lib/librte_power/Makefile                          |    3 +-
 lib/librte_power/channel_commands.h                |   77 ++
 lib/librte_power/guest_channel.c                   |  162 ++++
 lib/librte_power/guest_channel.h                   |   89 +++
 lib/librte_power/rte_power.c                       |  540 ++------------
 lib/librte_power/rte_power.h                       |  120 +++-
 lib/librte_power/rte_power_acpi_cpufreq.c          |  545 +++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h          |  192 +++++
 lib/librte_power/rte_power_common.h                |   39 +
 lib/librte_power/rte_power_kvm_vm.c                |  136 ++++
 lib/librte_power/rte_power_kvm_vm.h                |  179 +++++
 32 files changed, 5748 insertions(+), 912 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h
 create mode 100644 lib/librte_power/channel_commands.h
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

-- 
1.7.4.1


* [dpdk-dev] [PATCH v6 01/10] Channel Manager and Monitor for VM Power Management(Host).
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-29 15:21             ` Neil Horman
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 02/10] VM Power Management CLI(Host) Pablo de Lara
                             ` (9 subsequent siblings)
  10 siblings, 1 reply; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

The manager adds communication channels to the Monitor thread, tracks and
reports VM state, and uses the libvirt API to synchronize with the KVM
hypervisor. It queries the hypervisor to discover the mapping of virtual
CPUs (vCPUs) to host physical CPUs (pCPUs) and to inspect the VM running
state.
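
The vCPU to pCPU mapping can be pictured as collapsing one row of a
libvirt-style cpumap into a 64-bit mask. The following is only a sketch, not
the code from this patch: vcpu_pcpu_mask() is a hypothetical helper, and it
assumes the usual libvirt layout of one bit per host CPU with the lowest CPU
in the least significant bit of each byte.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: collapse one vCPU's row of a libvirt-style cpumap
 * (one bit per host CPU, lowest CPU in the LSB of each byte) into a 64-bit
 * pCPU mask. Host CPUs at index 64 or above cannot be represented.
 */
static uint64_t
vcpu_pcpu_mask(const unsigned char *cpumaps, size_t maplen, unsigned vcpu,
		unsigned n_host_cpus)
{
	const unsigned char *row = cpumaps + (size_t)vcpu * maplen;
	uint64_t mask = 0;
	unsigned cpu;

	for (cpu = 0; cpu < n_host_cpus && cpu < 64; cpu++) {
		/* Stay inside the row and test the bit for this host CPU */
		if (cpu / 8 < maplen && (row[cpu / 8] & (1u << (cpu % 8))))
			mask |= 1ULL << cpu;
	}
	return mask;
}
```

A vCPU pinned to a single core, as recommended, ends up with exactly one bit
set in its mask.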

The manager provides the following functionality to the CLI:
1) Connect to a libvirtd instance, default: qemu:///system
2) Add a VM to an internal list. Each VM is identified by a "name" which must
   correspond to a valid libvirt Domain Name.
3) Add communication channels associated with a VM to the epoll based Monitor
   thread.
   The channels must exist and be in the form of:
   /tmp/powermonitor/<vm_name>.<channel_number>. Each channel is a
   Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
   non-blocking mode.
   Each VM can have a maximum of 64 channels associated with it.
4) Disable or re-enable VM communication channels. Channels once added to the
   Monitor thread remain under that thread's control, however acting on
   channel requests can be disabled and re-enabled via the CLI.
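
The channel naming convention from item 3 can be checked with a small parser.
This is a minimal sketch and not the code from this patch:
parse_channel_name() and MAX_CHANNELS are hypothetical names, with the limit
of 64 taken from the text above. It takes only the file name, without the
/tmp/powermonitor/ directory prefix.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHANNELS 64 /* assumed from the per-VM limit stated above */

/*
 * Hypothetical helper: split "<vm_name>.<channel_number>" at the first '.'
 * and validate the decimal channel number.
 * Returns 0 and stores the number on success, -1 otherwise.
 */
static int
parse_channel_name(const char *file_name, const char *vm_name,
		unsigned *channel_num)
{
	const char *dot = strchr(file_name, '.');
	char *tail = NULL;
	size_t name_len;
	unsigned long num;

	if (dot == NULL || dot == file_name)
		return -1;
	name_len = (size_t)(dot - file_name);
	/* The part before the '.' must match the VM name exactly */
	if (strlen(vm_name) != name_len ||
			strncmp(file_name, vm_name, name_len) != 0)
		return -1;
	/* Reject empty, signed and whitespace-prefixed numbers */
	if (dot[1] < '0' || dot[1] > '9')
		return -1;
	errno = 0;
	num = strtoul(dot + 1, &tail, 10);
	if (errno != 0 || *tail != '\0' || num >= MAX_CHANNELS)
		return -1;
	*channel_num = (unsigned)num;
	return 0;
}
```

Splitting at the first '.' means a VM name that itself contains a '.' is
rejected by this sketch.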

The monitor is an epoll-based infinite loop running in a separate thread that
waits on channel events from VMs and calls the corresponding functions. Channel
definitions from the manager are registered via the epoll event opaque pointer
when calling epoll_ctl(EPOLL_CTL_ADD). On an EPOLLIN event this pointer yields
the channel's file descriptor for reading and the vCPU to pCPU(s) mapping
associated with a request from a particular VM.
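
The opaque pointer registration described above can be sketched as follows.
This is an illustration under assumptions, not the monitor from this patch:
struct chan stands in for struct channel_info, and monitor_add() and
monitor_wait_one() are hypothetical helper names.

```c
#include <stddef.h>
#include <sys/epoll.h>

/* Hypothetical stand-in for the patch's struct channel_info */
struct chan {
	int fd;
	unsigned channel_num;
};

/* Register a channel; the struct pointer travels in the opaque data.ptr */
static int
monitor_add(int epfd, struct chan *c)
{
	struct epoll_event ev;

	ev.events = EPOLLIN;
	ev.data.ptr = c;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/*
 * Wait for a single event and hand back the channel that produced it,
 * or NULL on timeout, error or a non-EPOLLIN event.
 */
static struct chan *
monitor_wait_one(int epfd, int timeout_ms)
{
	struct epoll_event ev;

	if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
		return NULL;
	if (!(ev.events & EPOLLIN))
		return NULL;
	return ev.data.ptr;
}
```

Because the event carries the channel pointer rather than a bare fd, the
handler can read from c->fd and look up per-VM state without a separate
fd-to-channel table.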

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/channel_manager.c |  808 +++++++++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h |  314 +++++++++++
 examples/vm_power_manager/channel_monitor.c |  234 ++++++++
 examples/vm_power_manager/channel_monitor.h |  102 ++++
 4 files changed, 1458 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h

diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
new file mode 100644
index 0000000..7d744c0
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.c
@@ -0,0 +1,808 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/select.h>
+
+#include <rte_config.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_mempool.h>
+#include <rte_log.h>
+#include <rte_atomic.h>
+#include <rte_spinlock.h>
+
+#include <libvirt/libvirt.h>
+
+#include "channel_manager.h"
+#include "channel_commands.h"
+#include "channel_monitor.h"
+
+
+#define RTE_LOGTYPE_CHANNEL_MANAGER RTE_LOGTYPE_USER1
+
+#define ITERATIVE_BITMASK_CHECK_64(mask_u64b, i) \
+		for (i = 0; mask_u64b; mask_u64b &= ~(1ULL << i++)) \
+		if ((mask_u64b >> i) & 1) \
+
+/* Global pointer to libvirt connection */
+static virConnectPtr global_vir_conn_ptr;
+
+static unsigned char *global_cpumaps;
+static virVcpuInfo *global_vircpuinfo;
+static size_t global_maplen;
+
+static unsigned global_n_host_cpus;
+
+/*
+ * Represents a single Virtual Machine
+ */
+struct virtual_machine_info {
+	char name[CHANNEL_MGR_MAX_NAME_LEN];
+	rte_atomic64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];
+	struct channel_info *channels[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	uint64_t channel_mask;
+	uint8_t num_channels;
+	enum vm_status status;
+	virDomainPtr domainPtr;
+	virDomainInfo info;
+	rte_spinlock_t config_spinlock;
+	LIST_ENTRY(virtual_machine_info) vms_info;
+};
+
+LIST_HEAD(, virtual_machine_info) vm_list_head;
+
+static struct virtual_machine_info *
+find_domain_by_name(const char *name)
+{
+	struct virtual_machine_info *info;
+	LIST_FOREACH(info, &vm_list_head, vms_info) {
+		if (!strncmp(info->name, name, CHANNEL_MGR_MAX_NAME_LEN-1))
+			return info;
+	}
+	return NULL;
+}
+
+static int
+update_pcpus_mask(struct virtual_machine_info *vm_info)
+{
+	virVcpuInfoPtr cpuinfo;
+	unsigned i, j;
+	int n_vcpus;
+	uint64_t mask;
+
+	memset(global_cpumaps, 0, CHANNEL_CMDS_MAX_CPUS*global_maplen);
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		n_vcpus = virDomainGetVcpuPinInfo(vm_info->domainPtr,
+				vm_info->info.nrVirtCpu, global_cpumaps, global_maplen,
+				VIR_DOMAIN_AFFECT_CONFIG);
+		if (n_vcpus < 0) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
+					"in-active VM '%s'\n", vm_info->name);
+			return -1;
+		}
+		goto update_pcpus;
+	}
+
+	memset(global_vircpuinfo, 0, sizeof(*global_vircpuinfo)*
+			CHANNEL_CMDS_MAX_CPUS);
+
+	cpuinfo = global_vircpuinfo;
+
+	n_vcpus = virDomainGetVcpus(vm_info->domainPtr, cpuinfo,
+			CHANNEL_CMDS_MAX_CPUS, global_cpumaps, global_maplen);
+	if (n_vcpus < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
+				"active VM '%s'\n", vm_info->name);
+		return -1;
+	}
+update_pcpus:
+	if (n_vcpus >= CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Number of vCPUs(%d) is out of range "
+				"0...%d\n", n_vcpus, CHANNEL_CMDS_MAX_CPUS-1);
+		return -1;
+	}
+	if (n_vcpus != vm_info->info.nrVirtCpu) {
+		RTE_LOG(INFO, CHANNEL_MANAGER, "Updating the number of vCPUs for VM '%s'"
+				" from %d -> %d\n", vm_info->name, vm_info->info.nrVirtCpu,
+				n_vcpus);
+		vm_info->info.nrVirtCpu = n_vcpus;
+	}
+	for (i = 0; i < vm_info->info.nrVirtCpu; i++) {
+		mask = 0;
+		for (j = 0; j < global_n_host_cpus; j++) {
+			if (VIR_CPU_USABLE(global_cpumaps, global_maplen, i, j) > 0) {
+				mask |= 1ULL << j;
+			}
+		}
+		rte_atomic64_set(&vm_info->pcpu_mask[i], mask);
+	}
+	return 0;
+}
+
+int
+set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask)
+{
+	unsigned i = 0;
+	int flags = VIR_DOMAIN_AFFECT_LIVE|VIR_DOMAIN_AFFECT_CONFIG;
+	struct virtual_machine_info *vm_info;
+	uint64_t mask = core_mask;
+
+	if (vcpu >= CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds max allowable(%d)\n",
+				vcpu, CHANNEL_CMDS_MAX_CPUS-1);
+		return -1;
+	}
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
+				"mask(0x%"PRIx64") for VM '%s', VM is not active\n",
+				vcpu, core_mask, vm_info->name);
+		return -1;
+	}
+
+	if (vcpu >= vm_info->info.nrVirtCpu) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds the assigned number of "
+				"vCPUs(%u)\n", vcpu, vm_info->info.nrVirtCpu);
+		return -1;
+	}
+	memset(global_cpumaps, 0 , CHANNEL_CMDS_MAX_CPUS * global_maplen);
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		if (i >= global_n_host_cpus) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "CPU(%u) exceeds the available "
+					"number of CPUs(%u)\n", i, global_n_host_cpus);
+			return -1;
+		}
+		VIR_USE_CPU(global_cpumaps, i);
+	}
+	if (virDomainPinVcpuFlags(vm_info->domainPtr, vcpu, global_cpumaps,
+			global_maplen, flags) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
+				"mask(0x%"PRIx64") for VM '%s'\n", vcpu, core_mask,
+				vm_info->name);
+		return -1;
+	}
+	rte_atomic64_set(&vm_info->pcpu_mask[vcpu], core_mask);
+	return 0;
+
+}
+
+int
+set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num)
+{
+	uint64_t mask = 1ULL << core_num;
+
+	return set_pcpus_mask(vm_name, vcpu, mask);
+}
+
+uint64_t
+get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu)
+{
+	struct virtual_machine_info *vm_info =
+			(struct virtual_machine_info *)chan_info->priv_info;
+	return rte_atomic64_read(&vm_info->pcpu_mask[vcpu]);
+}
+
+static inline int
+channel_exists(struct virtual_machine_info *vm_info, unsigned channel_num)
+{
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	if (vm_info->channel_mask & (1ULL << channel_num)) {
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+		return 1;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+
+
+static int
+open_non_blocking_channel(struct channel_info *info)
+{
+	int ret, flags;
+	struct sockaddr_un sock_addr;
+	fd_set soc_fd_set;
+	struct timeval tv;
+
+	info->fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (info->fd == -1) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error(%s) creating socket for '%s'\n",
+				strerror(errno),
+				info->channel_path);
+		return -1;
+	}
+	sock_addr.sun_family = AF_UNIX;
+	snprintf(sock_addr.sun_path, sizeof(sock_addr.sun_path), "%s",
+			info->channel_path);
+
+	/* Get current flags */
+	flags = fcntl(info->fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) fcntl get flags socket for"
+				" '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	/* Set to Non Blocking */
+	flags |= O_NONBLOCK;
+	if (fcntl(info->fd, F_SETFL, flags) < 0) {
+		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) setting non-blocking "
+				"socket for '%s'\n", strerror(errno), info->channel_path);
+		return -1;
+	}
+	ret = connect(info->fd, (struct sockaddr *)&sock_addr,
+			sizeof(sock_addr));
+	if (ret < 0) {
+		/* ECONNREFUSED error is given when VM is not active */
+		if (errno == ECONNREFUSED) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "VM is not active or has not "
+					"activated its endpoint to channel %s\n",
+					info->channel_path);
+			return -1;
+		}
+		/* Wait for tv_sec if in progress */
+		else if (errno == EINPROGRESS) {
+			tv.tv_sec = 2;
+			tv.tv_usec = 0;
+			FD_ZERO(&soc_fd_set);
+			FD_SET(info->fd, &soc_fd_set);
+			if (select(info->fd+1, NULL, &soc_fd_set, NULL, &tv) <= 0) {
+				RTE_LOG(WARNING, CHANNEL_MANAGER, "Timeout or error on channel "
+						"'%s'\n", info->channel_path);
+				return -1;
+			}
+		} else {
+			/* Any other error */
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) connecting socket"
+					" for '%s'\n", strerror(errno), info->channel_path);
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static int
+setup_channel_info(struct virtual_machine_info **vm_info_dptr,
+		struct channel_info **chan_info_dptr, unsigned channel_num)
+{
+	struct channel_info *chan_info = *chan_info_dptr;
+	struct virtual_machine_info *vm_info = *vm_info_dptr;
+
+	chan_info->channel_num = channel_num;
+	chan_info->priv_info = (void *)vm_info;
+	chan_info->status = CHANNEL_MGR_CHANNEL_DISCONNECTED;
+	if (open_non_blocking_channel(chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not open channel: "
+				"'%s' for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+	}
+	if (add_channel_to_monitor(&chan_info) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not add channel: "
+				"'%s' to epoll ctl for VM '%s'\n",
+				chan_info->channel_path, vm_info->name);
+		return -1;
+
+	}
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->num_channels++;
+	vm_info->channel_mask |= 1ULL << channel_num;
+	vm_info->channels[channel_num] = chan_info;
+	chan_info->status = CHANNEL_MGR_CHANNEL_CONNECTED;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return 0;
+}
+
+int
+add_all_channels(const char *vm_name)
+{
+	DIR *d;
+	struct dirent *dir;
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char *token, *remaining, *tail_ptr;
+	char socket_name[PATH_MAX];
+	unsigned channel_num;
+	int num_channels_enabled = 0;
+
+	/* verify VM exists */
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' not found"
+				" during channel discovery\n", vm_name);
+		return 0;
+	}
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
+		return 0;
+	}
+	d = opendir(CHANNEL_MGR_SOCKET_PATH);
+	if (d == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error opening directory '%s': %s\n",
+				CHANNEL_MGR_SOCKET_PATH, strerror(errno));
+		return -1;
+	}
+	while ((dir = readdir(d)) != NULL) {
+		if (!strncmp(dir->d_name, ".", 1) ||
+				!strncmp(dir->d_name, "..", 2))
+			continue;
+
+		snprintf(socket_name, sizeof(socket_name), "%s", dir->d_name);
+		remaining = socket_name;
+		/* Extract vm_name from "<vm_name>.<channel_num>" */
+		token = strsep(&remaining, ".");
+		if (remaining == NULL)
+			continue;
+		if (strncmp(vm_name, token, CHANNEL_MGR_MAX_NAME_LEN))
+			continue;
+
+		/* remaining should contain only <channel_num> */
+		errno = 0;
+		channel_num = (unsigned)strtol(remaining, &tail_ptr, 0);
+		if ((errno != 0) || (remaining[0] == '\0') ||
+				(*tail_ptr != '\0') || tail_ptr == NULL) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Malformed channel name "
+					"'%s' found, it should be in the form of "
+					"'<guest_name>.<channel_num>(decimal)'\n",
+					dir->d_name);
+			continue;
+		}
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			RTE_LOG(WARNING, CHANNEL_MANAGER, "Channel number(%u) is "
+					"greater than max allowable: %d, skipping '%s%s'\n",
+					channel_num, CHANNEL_CMDS_MAX_VM_CHANNELS-1,
+					CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+			continue;
+		}
+		/* if channel has not been added previously */
+		if (channel_exists(vm_info, channel_num))
+			continue;
+
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+				"channel '%s%s'\n", CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+			continue;
+		}
+
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s",
+				CHANNEL_MGR_SOCKET_PATH, dir->d_name);
+
+		if (setup_channel_info(&vm_info, &chan_info, channel_num) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+
+		num_channels_enabled++;
+	}
+	closedir(d);
+	return num_channels_enabled;
+}
+
+int
+add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info;
+	char socket_path[PATH_MAX];
+	unsigned i;
+	int num_channels_enabled = 0;
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add channels: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+
+	if (!virDomainIsActive(vm_info->domainPtr)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
+		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
+		return 0;
+	}
+
+	for (i = 0; i < len_channel_list; i++) {
+
+		if (channel_list[i] >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel(%u) is out of range "
+							"0...%d\n", channel_list[i],
+							CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+			continue;
+		}
+		if (channel_exists(vm_info, channel_list[i])) {
+			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel already exists, skipping "
+					"'%s.%u'\n", vm_name, channel_list[i]);
+			continue;
+		}
+
+		snprintf(socket_path, sizeof(socket_path), "%s%s.%u",
+				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
+		errno = 0;
+		if (access(socket_path, F_OK) < 0) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Channel path '%s' error: "
+					"%s\n", socket_path, strerror(errno));
+			continue;
+		}
+		chan_info = rte_malloc(NULL, sizeof(*chan_info),
+				CACHE_LINE_SIZE);
+		if (chan_info == NULL) {
+			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
+					"channel '%s'\n", socket_path);
+			continue;
+		}
+		snprintf(chan_info->channel_path,
+				sizeof(chan_info->channel_path), "%s%s.%u",
+				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
+		if (setup_channel_info(&vm_info, &chan_info, channel_list[i]) < 0) {
+			rte_free(chan_info);
+			continue;
+		}
+		num_channels_enabled++;
+
+	}
+	return num_channels_enabled;
+}
+
+int
+remove_channel(struct channel_info **chan_info_dptr)
+{
+	struct virtual_machine_info *vm_info;
+	struct channel_info *chan_info = *chan_info_dptr;
+
+	close(chan_info->fd);
+
+	vm_info = (struct virtual_machine_info *)chan_info->priv_info;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	vm_info->channel_mask &= ~(1ULL << chan_info->channel_num);
+	vm_info->num_channels--;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+	rte_free(chan_info);
+	return 0;
+}
+
+int
+set_channel_status_all(const char *vm_name, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	uint64_t mask;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
+			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to change channel status: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		vm_info->channels[i]->status = status;
+		num_channels_changed++;
+	}
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+	return num_channels_changed;
+
+}
+
+int
+set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i;
+	int num_channels_changed = 0;
+
+	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
+			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
+				"disabled: Unable to change status for VM '%s'\n", vm_name);
+	}
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to change channel status: VM '%s' "
+				"not found\n", vm_name);
+		return 0;
+	}
+	for (i = 0; i < len_channel_list; i++) {
+		if (channel_exists(vm_info, channel_list[i])) {
+			rte_spinlock_lock(&(vm_info->config_spinlock));
+			vm_info->channels[channel_list[i]]->status = status;
+			rte_spinlock_unlock(&(vm_info->config_spinlock));
+			num_channels_changed++;
+		}
+	}
+	return num_channels_changed;
+}
+
+int
+get_info_vm(const char *vm_name, struct vm_info *info)
+{
+	struct virtual_machine_info *vm_info;
+	unsigned i, channel_num = 0;
+	uint64_t mask;
+
+	vm_info = find_domain_by_name(vm_name);
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
+		return -1;
+	}
+	info->status = CHANNEL_MGR_VM_ACTIVE;
+	if (!virDomainIsActive(vm_info->domainPtr))
+		info->status = CHANNEL_MGR_VM_INACTIVE;
+
+	rte_spinlock_lock(&(vm_info->config_spinlock));
+
+	mask = vm_info->channel_mask;
+	ITERATIVE_BITMASK_CHECK_64(mask, i) {
+		info->channels[channel_num].channel_num = i;
+		memcpy(info->channels[channel_num].channel_path,
+				vm_info->channels[i]->channel_path, PATH_MAX);
+		info->channels[channel_num].status = vm_info->channels[i]->status;
+		info->channels[channel_num].fd = vm_info->channels[i]->fd;
+		channel_num++;
+	}
+
+	info->num_channels = channel_num;
+	info->num_vcpus = vm_info->info.nrVirtCpu;
+	rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+	memcpy(info->name, vm_info->name, sizeof(vm_info->name));
+	for (i = 0; i < info->num_vcpus; i++) {
+		info->pcpu_mask[i] = rte_atomic64_read(&vm_info->pcpu_mask[i]);
+	}
+	return 0;
+}
+
+int
+add_vm(const char *vm_name)
+{
+	struct virtual_machine_info *new_domain;
+	virDomainPtr dom_ptr;
+	int i;
+
+	if (find_domain_by_name(vm_name) != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add VM: VM '%s' "
+				"already exists\n", vm_name);
+		return -1;
+	}
+
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "No connection to hypervisor exists\n");
+		return -1;
+	}
+	dom_ptr = virDomainLookupByName(global_vir_conn_ptr, vm_name);
+	if (dom_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error on VM lookup with libvirt: "
+				"VM '%s' not found\n", vm_name);
+		return -1;
+	}
+
+	new_domain = rte_malloc("virtual_machine_info", sizeof(*new_domain),
+			CACHE_LINE_SIZE);
+	if (new_domain == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to allocate memory for VM "
+				"info\n");
+		return -1;
+	}
+	new_domain->domainPtr = dom_ptr;
+	if (virDomainGetInfo(new_domain->domainPtr, &new_domain->info) != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get libvirt VM info\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	if (new_domain->info.nrVirtCpu > CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error the number of virtual CPUs(%u) is "
+				"greater than allowable(%d)\n", new_domain->info.nrVirtCpu,
+				CHANNEL_CMDS_MAX_CPUS);
+		rte_free(new_domain);
+		return -1;
+	}
+
+	for (i = 0; i < CHANNEL_CMDS_MAX_CPUS; i++) {
+		rte_atomic64_init(&new_domain->pcpu_mask[i]);
+	}
+	if (update_pcpus_mask(new_domain) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting physical CPU pinning\n");
+		rte_free(new_domain);
+		return -1;
+	}
+	snprintf(new_domain->name, sizeof(new_domain->name), "%s", vm_name);
+	new_domain->channel_mask = 0;
+	new_domain->num_channels = 0;
+
+	if (!virDomainIsActive(dom_ptr))
+		new_domain->status = CHANNEL_MGR_VM_INACTIVE;
+	else
+		new_domain->status = CHANNEL_MGR_VM_ACTIVE;
+
+	rte_spinlock_init(&(new_domain->config_spinlock));
+	LIST_INSERT_HEAD(&vm_list_head, new_domain, vms_info);
+	return 0;
+}
+
+int
+remove_vm(const char *vm_name)
+{
+	struct virtual_machine_info *vm_info = find_domain_by_name(vm_name);
+
+	if (vm_info == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM: VM '%s' "
+				"not found\n", vm_name);
+		return -1;
+	}
+	rte_spinlock_lock(&vm_info->config_spinlock);
+	if (vm_info->num_channels != 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM '%s', there are "
+				"%"PRIu8" channels still active\n",
+				vm_name, vm_info->num_channels);
+		rte_spinlock_unlock(&vm_info->config_spinlock);
+		return -1;
+	}
+	LIST_REMOVE(vm_info, vms_info);
+	rte_spinlock_unlock(&vm_info->config_spinlock);
+	rte_free(vm_info);
+	return 0;
+}
+
+static void
+disconnect_hypervisor(void)
+{
+	if (global_vir_conn_ptr != NULL) {
+		virConnectClose(global_vir_conn_ptr);
+		global_vir_conn_ptr = NULL;
+	}
+}
+
+static int
+connect_hypervisor(const char *path)
+{
+	if (global_vir_conn_ptr != NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection "
+				"already established\n", path);
+		return -1;
+	}
+	global_vir_conn_ptr = virConnectOpen(path);
+	if (global_vir_conn_ptr == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error failed to open connection to "
+				"Hypervisor '%s'\n", path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_manager_init(const char *path)
+{
+	int n_cpus;
+
+	LIST_INIT(&vm_list_head);
+	if (connect_hypervisor(path) < 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to initialize channel manager\n");
+		return -1;
+	}
+
+	global_maplen = VIR_CPU_MAPLEN(CHANNEL_CMDS_MAX_CPUS);
+
+	global_vircpuinfo = rte_zmalloc(NULL, sizeof(*global_vircpuinfo) *
+			CHANNEL_CMDS_MAX_CPUS, CACHE_LINE_SIZE);
+	if (global_vircpuinfo == NULL) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for CPU Info\n");
+		goto error;
+	}
+	global_cpumaps = rte_zmalloc(NULL, CHANNEL_CMDS_MAX_CPUS * global_maplen,
+			CACHE_LINE_SIZE);
+	if (global_cpumaps == NULL) {
+		goto error;
+	}
+
+	n_cpus = virNodeGetCPUMap(global_vir_conn_ptr, NULL, NULL, 0);
+	if (n_cpus <= 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get the number of Host "
+				"CPUs\n");
+		goto error;
+	}
+	global_n_host_cpus = (unsigned)n_cpus;
+
+	if (global_n_host_cpus > CHANNEL_CMDS_MAX_CPUS) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "The number of host CPUs(%u) exceeds the "
+				"maximum of %u\n", global_n_host_cpus, CHANNEL_CMDS_MAX_CPUS);
+		goto error;
+
+	}
+
+	return 0;
+error:
+	disconnect_hypervisor();
+	return -1;
+}
+
+void
+channel_manager_exit(void)
+{
+	unsigned i;
+	uint64_t mask;
+	struct virtual_machine_info *vm_info;
+
+	while ((vm_info = LIST_FIRST(&vm_list_head)) != NULL) {
+
+		rte_spinlock_lock(&(vm_info->config_spinlock));
+
+		mask = vm_info->channel_mask;
+		ITERATIVE_BITMASK_CHECK_64(mask, i) {
+			remove_channel_from_monitor(vm_info->channels[i]);
+			close(vm_info->channels[i]->fd);
+			rte_free(vm_info->channels[i]);
+		}
+		rte_spinlock_unlock(&(vm_info->config_spinlock));
+
+		LIST_REMOVE(vm_info, vms_info);
+		rte_free(vm_info);
+	}
+
+	if (global_cpumaps != NULL)
+		rte_free(global_cpumaps);
+	if (global_vircpuinfo != NULL)
+		rte_free(global_vircpuinfo);
+	disconnect_hypervisor();
+}
diff --git a/examples/vm_power_manager/channel_manager.h b/examples/vm_power_manager/channel_manager.h
new file mode 100644
index 0000000..12c29c3
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.h
@@ -0,0 +1,314 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MANAGER_H_
+#define CHANNEL_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <linux/limits.h>
+#include <rte_atomic.h>
+#include "channel_commands.h"
+
+/* Maximum name length including '\0' terminator */
+#define CHANNEL_MGR_MAX_NAME_LEN    64
+
+/* Maximum number of channels to each Virtual Machine */
+#define CHANNEL_MGR_MAX_CHANNELS    64
+
+/* Hypervisor Path for libvirt(qemu/KVM) */
+#define CHANNEL_MGR_DEFAULT_HV_PATH "qemu:///system"
+
+/* File socket directory */
+#define CHANNEL_MGR_SOCKET_PATH     "/tmp/powermonitor/"
+
+/* Communication Channel Status */
+enum channel_status { CHANNEL_MGR_CHANNEL_DISCONNECTED = 0,
+	CHANNEL_MGR_CHANNEL_CONNECTED,
+	CHANNEL_MGR_CHANNEL_DISABLED,
+	CHANNEL_MGR_CHANNEL_PROCESSING};
+
+/* VM libvirt(qemu/KVM) connection status */
+enum vm_status { CHANNEL_MGR_VM_INACTIVE = 0, CHANNEL_MGR_VM_ACTIVE};
+
+/*
+ *  Represents a single and exclusive VM channel that exists between a guest and
+ *  the host.
+ */
+struct channel_info {
+	char channel_path[PATH_MAX]; /**< Path to host socket */
+	volatile uint32_t status;    /**< Connection status(enum channel_status) */
+	int fd;                      /**< AF_UNIX socket fd */
+	unsigned channel_num;        /**< CHANNEL_MGR_SOCKET_PATH/<vm_name>.channel_num */
+	void *priv_info;             /**< Pointer to private info, do not modify */
+};
+
+/* Represents a single VM instance used to return internal information about
+ * a VM */
+struct vm_info {
+	char name[CHANNEL_MGR_MAX_NAME_LEN];          /**< VM name */
+	enum vm_status status;                        /**< libvirt status */
+	uint64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];    /**< pCPU mask for each vCPU */
+	unsigned num_vcpus;                           /**< number of vCPUS */
+	struct channel_info channels[CHANNEL_MGR_MAX_CHANNELS]; /**< Array of channel_info */
+	unsigned num_channels;                        /**< Number of channels */
+};
+
+/**
+ * Initialize the Channel Manager resources and connect to the Hypervisor
+ * specified in path.
+ * This must be successfully called first before calling any other functions.
+ * It must only be called once.
+ *
+ * @param path
+ *  Must be a local path, e.g. qemu:///system.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_manager_init(const char *path);
+
+/**
+ * Free resources associated with the Channel Manager.
+ *
+ * This function takes no parameters; the hypervisor path is only needed by
+ * channel_manager_init().
+ *
+ * @return
+ *  None
+ */
+void channel_manager_exit(void);
+
+/**
+ * Get the Physical CPU mask for the vCPU served by a VM lcore channel; the
+ * mask is the return value.
+ * It is not thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info
+ *
+ * @param vcpu
+ *  The virtual CPU to query.
+ *
+ *
+ * @return
+ *  - 0 on error.
+ *  - >0 on success.
+ */
+uint64_t get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu);
+
+/**
+ * Set the Physical CPU mask for the specified vCPU.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup
+ *
+ * @param vcpu
+ *  The virtual CPU to set.
+ *
+ * @param core_mask
+ *  The core mask of the physical CPU(s) to bind the vCPU
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask);
+
+/**
+ * Set the Physical CPU for the specified vCPU.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup
+ *
+ * @param vcpu
+ *  The virtual CPU to set.
+ *
+ * @param core_num
+ *  The core number of the physical CPU to bind the vCPU
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num);
+/**
+ * Add a VM as specified by name to the Channel Manager. The name must
+ * correspond to a valid libvirt domain name.
+ * This is required prior to adding channels.
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_vm(const char *name);
+
+/**
+ * Remove a previously added Virtual Machine from the Channel Manager
+ * It is not thread-safe.
+ *
+ * @param name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_vm(const char *name);
+
+/**
+ * Add all available channels to the VM as specified by name.
+ * Only channels whose paths have the form
+ * CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number> are parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to lookup.
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ */
+int add_all_channels(const char *vm_name);
+
+/**
+ * Add the channel numbers in channel_list to the domain specified by name.
+ * Only channels whose paths have the form
+ * CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number> are parsed.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name to add channels.
+ *
+ * @param channel_list
+ *  Pointer to list of unsigned integers, representing the channel number to add
+ *  It must be allocated outside of this function.
+ *
+ * @param num_channels
+ *  The amount of channel numbers in channel_list
+ *
+ * @return
+ *  - N the number of channels added for the VM
+ *  - 0 for error
+ */
+int add_channels(const char *vm_name, unsigned *channel_list,
+		unsigned num_channels);
+
+/**
+ * Remove a channel definition from the channel manager. This must only be
+ * called from the channel monitor thread.
+ *
+ * @param chan_info_dptr
+ *  Pointer to a pointer to a valid struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel(struct channel_info **chan_info_dptr);
+
+/**
+ * For all channels associated with a Virtual Machine name, update the
+ * connection status. Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
+ * CHANNEL_MGR_CHANNEL_DISABLED only.
+ *
+ *
+ * @param name
+ *  Virtual Machine name to modify all channels.
+ *
+ * @param status
+ *  The status to set on each channel. Only
+ *  CHANNEL_MGR_CHANNEL_CONNECTED and
+ *  CHANNEL_MGR_CHANNEL_DISABLED
+ *  are accepted.
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status_all(const char *name, enum channel_status status);
+
+/**
+ * For all channels in channel_list associated with a Virtual Machine name
+ * update the connection status of each.
+ * Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
+ * CHANNEL_MGR_CHANNEL_DISABLED only.
+ * It is not thread-safe.
+ *
+ * @param vm_name
+ *  Virtual Machine name whose channels are modified.
+ *
+ * @param channel_list
+ *  Pointer to list of unsigned integers, representing the channel numbers to
+ *  modify.
+ *  It must be allocated outside of this function.
+ *
+ * @param len_channel_list
+ *  The number of channel numbers in channel_list
+ *
+ * @return
+ *  - N the number of channels modified for the VM
+ *  - 0 for error
+ */
+int set_channel_status(const char *vm_name, unsigned *channel_list,
+		unsigned len_channel_list, enum channel_status status);
+
+/**
+ * Populates a pointer to struct vm_info associated with vm_name.
+ *
+ * @param vm_name
+ *  The name of the virtual machine to lookup.
+ *
+ * @param info
+ *  Pointer to a struct vm_info, this must be allocated prior to calling this
+ *  function.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int get_info_vm(const char *vm_name, struct vm_info *info);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_MANAGER_H_ */
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
new file mode 100644
index 0000000..e3c1b0c
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -0,0 +1,234 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <signal.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/epoll.h>
+#include <sys/queue.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+
+
+#include "channel_monitor.h"
+#include "channel_commands.h"
+#include "channel_manager.h"
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+
+static volatile unsigned run_loop = 1;
+static int global_event_fd;
+static struct epoll_event *global_events_list;
+
+void channel_monitor_exit(void)
+{
+	run_loop = 0;
+	rte_free(global_events_list);
+}
+
+static int
+process_request(struct channel_packet *pkt, struct channel_info *chan_info)
+{
+	uint64_t core_mask;
+
+	if (chan_info == NULL)
+		return -1;
+
+	if (rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_CONNECTED,
+			CHANNEL_MGR_CHANNEL_PROCESSING) == 0)
+		return -1;
+
+	if (pkt->command == CPU_POWER) {
+		core_mask = get_pcpus_mask(chan_info, pkt->resource_id);
+		if (core_mask == 0) {
+			RTE_LOG(ERR, CHANNEL_MONITOR, "Error getting physical CPU mask for "
+				"channel '%s' using vCPU(%u)\n", chan_info->channel_path,
+				(unsigned)pkt->resource_id);
+			return -1;
+		}
+		if (__builtin_popcountll(core_mask) == 1) {
+
+			unsigned core_num = __builtin_ffsll(core_mask) - 1;
+
+			switch (pkt->unit) {
+			case(CPU_POWER_SCALE_MIN):
+					power_manager_scale_core_min(core_num);
+			break;
+			case(CPU_POWER_SCALE_MAX):
+					power_manager_scale_core_max(core_num);
+			break;
+			case(CPU_POWER_SCALE_DOWN):
+					power_manager_scale_core_down(core_num);
+			break;
+			case(CPU_POWER_SCALE_UP):
+					power_manager_scale_core_up(core_num);
+			break;
+			default:
+				break;
+			}
+		} else {
+			switch (pkt->unit) {
+			case(CPU_POWER_SCALE_MIN):
+					power_manager_scale_mask_min(core_mask);
+			break;
+			case(CPU_POWER_SCALE_MAX):
+					power_manager_scale_mask_max(core_mask);
+			break;
+			case(CPU_POWER_SCALE_DOWN):
+					power_manager_scale_mask_down(core_mask);
+			break;
+			case(CPU_POWER_SCALE_UP):
+					power_manager_scale_mask_up(core_mask);
+			break;
+			default:
+				break;
+			}
+
+		}
+	}
+	/* Return is not checked as channel status may have been set to DISABLED
+	 * from management thread
+	 */
+	rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_PROCESSING,
+			CHANNEL_MGR_CHANNEL_CONNECTED);
+	return 0;
+
+}
+
+int
+add_channel_to_monitor(struct channel_info **chan_info)
+{
+	struct channel_info *info = *chan_info;
+	struct epoll_event event;
+
+	event.events = EPOLLIN;
+	event.data.ptr = info;
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_ADD, info->fd, &event) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to add channel '%s' "
+				"to epoll\n", info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+remove_channel_from_monitor(struct channel_info *chan_info)
+{
+	if (epoll_ctl(global_event_fd, EPOLL_CTL_DEL, chan_info->fd, NULL) < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to remove channel '%s' "
+				"from epoll\n", chan_info->channel_path);
+		return -1;
+	}
+	return 0;
+}
+
+int
+channel_monitor_init(void)
+{
+	global_event_fd = epoll_create1(0);
+	if (global_event_fd < 0) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Error creating epoll context with "
+				"error %s\n", strerror(errno));
+		return -1;
+	}
+	global_events_list = rte_malloc("epoll_events", sizeof(*global_events_list)
+			* MAX_EVENTS, CACHE_LINE_SIZE);
+	if (global_events_list == NULL) {
+		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for"
+				" epoll events\n");
+		return -1;
+	}
+	return 0;
+}
+
+void
+run_channel_monitor(void)
+{
+	while (run_loop) {
+		int n_events, i;
+
+		n_events = epoll_wait(global_event_fd, global_events_list,
+				MAX_EVENTS, 1);
+		if (!run_loop)
+			break;
+		for (i = 0; i < n_events; i++) {
+			struct channel_info *chan_info = (struct channel_info *)
+					global_events_list[i].data.ptr;
+			if ((global_events_list[i].events & EPOLLERR) ||
+					(global_events_list[i].events & EPOLLHUP)) {
+				RTE_LOG(DEBUG, CHANNEL_MONITOR, "Remote closed connection for "
+						"channel '%s'\n", chan_info->channel_path);
+				remove_channel(&chan_info);
+				continue;
+			}
+			if (global_events_list[i].events & EPOLLIN) {
+
+				int n_bytes, err = 0;
+				struct channel_packet pkt;
+				void *buffer = &pkt;
+				int buffer_len = sizeof(pkt);
+
+				while (buffer_len > 0) {
+					n_bytes = read(chan_info->fd, buffer, buffer_len);
+					if (n_bytes == buffer_len)
+						break;
+					if (n_bytes == -1) {
+						err = errno;
+						RTE_LOG(DEBUG, CHANNEL_MONITOR, "Received error on "
+								"channel '%s' read: %s\n",
+								chan_info->channel_path, strerror(err));
+						remove_channel(&chan_info);
+						break;
+					}
+					buffer = (char *)buffer + n_bytes;
+					buffer_len -= n_bytes;
+				}
+				if (!err)
+					process_request(&pkt, chan_info);
+			}
+		}
+	}
+}
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
new file mode 100644
index 0000000..c138607
--- /dev/null
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -0,0 +1,102 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_MONITOR_H_
+#define CHANNEL_MONITOR_H_
+
+#include "channel_manager.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Channel Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int channel_monitor_init(void);
+
+/**
+ * Run the channel monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_channel_monitor(void);
+
+/**
+ * Exit the Channel Monitor, exiting the epoll_wait loop and events processing.
+ * The epoll events list is freed.
+ *
+ * @return
+ *  None
+ */
+void channel_monitor_exit(void);
+
+/**
+ * Add an open channel to monitor via epoll. A pointer to struct channel_info
+ * will be registered with epoll for event processing.
+ * It is thread-safe.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info pointer.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_channel_to_monitor(struct channel_info **chan_info);
+
+/**
+ * Remove a previously added channel from epoll control.
+ *
+ * @param chan_info
+ *  Pointer to struct channel_info.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_channel_from_monitor(struct channel_info *chan_info);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* CHANNEL_MONITOR_H_ */
-- 
1.7.4.1
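
A note on the read loop in run_channel_monitor() above: when read() returns 0
(the peer closed the socket), neither buffer nor buffer_len changes, so the
loop as written would repeat. A minimal sketch of a full-length read that
stops on EOF follows. read_full() is a hypothetical helper and is not part of
this patch; treating EAGAIN as an error is an assumption made to keep the
sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: read exactly len bytes from fd.
 * Returns 0 once the buffer is full, -1 on error or if the peer closes the
 * connection (EOF) before len bytes arrive. EINTR is retried.
 */
static int
read_full(int fd, void *buf, size_t len)
{
	char *p = buf;

	assert(buf != NULL);
	while (len > 0) {
		ssize_t n = read(fd, p, len);

		if (n == 0)
			return -1; /* EOF: stop instead of spinning */
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, try again */
			return -1; /* includes EAGAIN on a non-blocking fd */
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

With a helper of this shape the caller could remove the channel whenever -1 is
returned and only call process_request() on 0.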


* Re: [dpdk-dev] [PATCH v6 01/10] Channel Manager and Monitor for VM Power Management(Host).
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 01/10] Channel Manager and Monitor for VM Power Management(Host) Pablo de Lara
@ 2014-11-29 15:21             ` Neil Horman
  0 siblings, 0 replies; 97+ messages in thread
From: Neil Horman @ 2014-11-29 15:21 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

On Tue, Nov 25, 2014 at 04:18:02PM +0000, Pablo de Lara wrote:
> From: Alan Carew <alan.carew@intel.com>
> 
> The manager is responsible for adding communication channels to the Monitor
> thread and for tracking and reporting VM state; it employs the libvirt API for
> synchronization with the KVM Hypervisor. The manager interacts with the
> Hypervisor to discover the mapping of virtual CPUs (vCPUs) to the host
> physical CPUs (pCPUs) and to inspect the VM running state.
> 
> The manager provides the following functionality to the CLI:
> 1) Connect to a libvirtd instance, default: qemu:///system
> 2) Add a VM to an internal list, each VM is identified by a "name" which must
>    correspond to a valid libvirt Domain Name.
> 3) Add communication channels associated with a VM to the epoll based Monitor
>    thread.
>    The channels must exist and be in the form of:
>    /tmp/powermonitor/<vm_name>.<channel_number>. Each channel is a
>    Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
>    non-blocking mode.
>    Each VM can have a maximum of 64 channels associated with it.
> 4) Disable or re-enable VM communication channels. Channels, once added to the
>    Monitor thread, remain in that thread's control; however, acting on channel
>    requests can be disabled and re-enabled via the CLI.
> 
> The monitor is an epoll based infinite loop running in a separate thread that
> waits on channel events from VMs and calls the corresponding functions. Channel
> definitions from the manager are registered via the epoll event opaque pointer
> when calling epoll_ctl(EPOLL_CTL_ADD). This allows for obtaining the channel's
> file descriptor for reading EPOLLIN events and mapping the vCPU to pCPU(s)
> associated with a request from a particular VM.
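
For readers unfamiliar with the opaque-pointer pattern described above, here
is a minimal sketch of it. struct demo_channel, demo_add() and
demo_wait_one() are hypothetical names used only for illustration; the patch
itself registers struct channel_info in add_channel_to_monitor().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical stand-in for struct channel_info. */
struct demo_channel {
	int fd;
	unsigned channel_num;
};

/* Register chan with epoll, storing the pointer as the event's opaque data. */
static int
demo_add(int epfd, struct demo_channel *chan)
{
	struct epoll_event ev = { 0 };

	assert(chan != NULL);
	ev.events = EPOLLIN;
	ev.data.ptr = chan;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, chan->fd, &ev);
}

/*
 * Wait up to timeout_ms for a single event. Returns the channel that became
 * readable, recovered from the opaque pointer, or NULL if nothing is ready.
 */
static struct demo_channel *
demo_wait_one(int epfd, int timeout_ms)
{
	struct epoll_event ev;

	if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
		return NULL;
	if (!(ev.events & EPOLLIN))
		return NULL;
	return ev.data.ptr;
}
```

Because the pointer comes back with each event, the handler needs no lookup
table from file descriptor to channel state.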
> 
> Signed-off-by: Alan Carew <alan.carew@intel.com>
> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> ---
>  examples/vm_power_manager/channel_manager.c |  808 +++++++++++++++++++++++++++
>  examples/vm_power_manager/channel_manager.h |  314 +++++++++++
>  examples/vm_power_manager/channel_monitor.c |  234 ++++++++
>  examples/vm_power_manager/channel_monitor.h |  102 ++++
>  4 files changed, 1458 insertions(+), 0 deletions(-)
>  create mode 100644 examples/vm_power_manager/channel_manager.c
>  create mode 100644 examples/vm_power_manager/channel_manager.h
>  create mode 100644 examples/vm_power_manager/channel_monitor.c
>  create mode 100644 examples/vm_power_manager/channel_monitor.h
> 
> diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
> new file mode 100644
> index 0000000..7d744c0
> --- /dev/null
> +++ b/examples/vm_power_manager/channel_manager.c
> @@ -0,0 +1,808 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <sys/un.h>
> +#include <fcntl.h>
> +#include <unistd.h>
> +#include <inttypes.h>
> +#include <dirent.h>
> +#include <errno.h>
> +
> +#include <sys/queue.h>
> +#include <sys/types.h>
> +#include <sys/socket.h>
> +#include <sys/select.h>
> +
> +#include <rte_config.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +#include <rte_mempool.h>
> +#include <rte_log.h>
> +#include <rte_atomic.h>
> +#include <rte_spinlock.h>
> +
> +#include <libvirt/libvirt.h>
> +
> +#include "channel_manager.h"
> +#include "channel_commands.h"
> +#include "channel_monitor.h"
> +
> +
> +#define RTE_LOGTYPE_CHANNEL_MANAGER RTE_LOGTYPE_USER1
> +
> +#define ITERATIVE_BITMASK_CHECK_64(mask_u64b, i) \
> +		for (i = 0; mask_u64b; mask_u64b &= ~(1ULL << i++)) \
> +		if ((mask_u64b >> i) & 1) \
> +
You specify masks as 64-bit entries throughout this code; is that sufficient?
IIRC someone was just undertaking some work to allow for systems that need
masks larger than 64 bits.  I know Linux (and I'm sure BSD) contains a bitmask
or cpumask that is of variable length so that an arbitrary number of CPUs can
be specified.
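
To make the suggestion concrete, a rough sketch of a variable-length mask
follows. All demo_cpumask names are hypothetical and exist neither in this
patch nor in DPDK; the point is only that the width is chosen at allocation
time instead of being fixed at 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical variable-length CPU mask. */
struct demo_cpumask {
	unsigned nbits;  /* number of CPUs the mask can represent */
	uint64_t *words; /* (nbits + 63) / 64 words, bit n is CPU n */
};

/* Allocate a zeroed mask for nbits CPUs; returns NULL on failure. */
static struct demo_cpumask *
demo_cpumask_alloc(unsigned nbits)
{
	struct demo_cpumask *m;

	if (nbits == 0)
		return NULL;
	m = malloc(sizeof(*m));
	if (m == NULL)
		return NULL;
	m->nbits = nbits;
	m->words = calloc((nbits + 63) / 64, sizeof(uint64_t));
	if (m->words == NULL) {
		free(m);
		return NULL;
	}
	return m;
}

/* Mark cpu as set; returns -1 if cpu is out of range. */
static int
demo_cpumask_set(struct demo_cpumask *m, unsigned cpu)
{
	assert(m != NULL);
	if (cpu >= m->nbits)
		return -1;
	m->words[cpu / 64] |= 1ULL << (cpu % 64);
	return 0;
}

/* Returns 1 if cpu is set, 0 otherwise (including out of range). */
static int
demo_cpumask_test(const struct demo_cpumask *m, unsigned cpu)
{
	assert(m != NULL);
	if (cpu >= m->nbits)
		return 0;
	return (int)((m->words[cpu / 64] >> (cpu % 64)) & 1);
}

static void
demo_cpumask_free(struct demo_cpumask *m)
{
	if (m != NULL) {
		free(m->words);
		free(m);
	}
}
```

A mask like this could not be stored in a single rte_atomic64_t, so the
per-vCPU update would need a lock or a similar scheme.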

> +/* Global pointer to libvirt connection */
> +static virConnectPtr global_vir_conn_ptr;
> +
> +static unsigned char *global_cpumaps;
> +static virVcpuInfo *global_vircpuinfo;
> +static size_t global_maplen;
> +
> +static unsigned global_n_host_cpus;
> +
> +/*
> + * Represents a single Virtual Machine
> + */
> +struct virtual_machine_info {
> +	char name[CHANNEL_MGR_MAX_NAME_LEN];
> +	rte_atomic64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];
> +	struct channel_info *channels[CHANNEL_CMDS_MAX_VM_CHANNELS];
> +	uint64_t channel_mask;
> +	uint8_t num_channels;
> +	enum vm_status status;
> +	virDomainPtr domainPtr;
> +	virDomainInfo info;
> +	rte_spinlock_t config_spinlock;
> +	LIST_ENTRY(virtual_machine_info) vms_info;
> +};
> +
> +LIST_HEAD(, virtual_machine_info) vm_list_head;
> +
> +static struct virtual_machine_info *
> +find_domain_by_name(const char *name)
> +{
> +	struct virtual_machine_info *info;
> +	LIST_FOREACH(info, &vm_list_head, vms_info) {
> +		if (!strncmp(info->name, name, CHANNEL_MGR_MAX_NAME_LEN-1))
> +			return info;
> +	}
> +	return NULL;
> +}
> +
> +static int
> +update_pcpus_mask(struct virtual_machine_info *vm_info)
> +{
> +	virVcpuInfoPtr cpuinfo;
> +	unsigned i, j;
> +	int n_vcpus;
> +	uint64_t mask;
> +
> +	memset(global_cpumaps, 0, CHANNEL_CMDS_MAX_CPUS*global_maplen);
> +
> +	if (!virDomainIsActive(vm_info->domainPtr)) {
> +		n_vcpus = virDomainGetVcpuPinInfo(vm_info->domainPtr,
> +				vm_info->info.nrVirtCpu, global_cpumaps, global_maplen,
> +				VIR_DOMAIN_AFFECT_CONFIG);
> +		if (n_vcpus < 0) {
> +			RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
> +					"in-active VM '%s'\n", vm_info->name);
> +			return -1;
> +		}
> +		goto update_pcpus;
> +	}
> +
> +	memset(global_vircpuinfo, 0, sizeof(*global_vircpuinfo)*
> +			CHANNEL_CMDS_MAX_CPUS);
> +
> +	cpuinfo = global_vircpuinfo;
> +
> +	n_vcpus = virDomainGetVcpus(vm_info->domainPtr, cpuinfo,
> +			CHANNEL_CMDS_MAX_CPUS, global_cpumaps, global_maplen);
> +	if (n_vcpus < 0) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting vCPU info for "
> +				"active VM '%s'\n", vm_info->name);
> +		return -1;
> +	}
> +update_pcpus:
> +	if (n_vcpus >= CHANNEL_CMDS_MAX_CPUS) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Number of vCPUS(%u) is out of range "
> +				"0...%d\n", n_vcpus, CHANNEL_CMDS_MAX_CPUS-1);
> +		return -1;
> +	}
> +	if (n_vcpus != vm_info->info.nrVirtCpu) {
> +		RTE_LOG(INFO, CHANNEL_MANAGER, "Updating the number of vCPUs for VM '%s"
> +				"' from %d -> %d\n", vm_info->name, vm_info->info.nrVirtCpu,
> +				n_vcpus);
> +		vm_info->info.nrVirtCpu = n_vcpus;
> +	}
> +	for (i = 0; i < vm_info->info.nrVirtCpu; i++) {
> +		mask = 0;
> +		for (j = 0; j < global_n_host_cpus; j++) {
> +			if (VIR_CPU_USABLE(global_cpumaps, global_maplen, i, j) > 0) {
> +				mask |= 1ULL << j;
> +			}
> +		}
> +		rte_atomic64_set(&vm_info->pcpu_mask[i], mask);
> +	}
> +	return 0;
> +}
> +
> +int
> +set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask)
> +{
> +	unsigned i = 0;
> +	int flags = VIR_DOMAIN_AFFECT_LIVE|VIR_DOMAIN_AFFECT_CONFIG;
> +	struct virtual_machine_info *vm_info;
> +	uint64_t mask = core_mask;
> +
> +	if (vcpu >= CHANNEL_CMDS_MAX_CPUS) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds max allowable(%d)\n",
> +				vcpu, CHANNEL_CMDS_MAX_CPUS-1);
> +		return -1;
> +	}
> +
> +	vm_info = find_domain_by_name(vm_name);
> +	if (vm_info == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
> +		return -1;
> +	}
> +
> +	if (!virDomainIsActive(vm_info->domainPtr)) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
> +				"mask(0x%"PRIx64") for VM '%s', VM is not active\n",
> +				vcpu, core_mask, vm_info->name);
> +		return -1;
> +	}
> +
> +	if (vcpu >= vm_info->info.nrVirtCpu) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "vCPU(%u) exceeds the assigned number of "
> +				"vCPUs(%u)\n", vcpu, vm_info->info.nrVirtCpu);
> +		return -1;
> +	}
> +	memset(global_cpumaps, 0 , CHANNEL_CMDS_MAX_CPUS * global_maplen);
> +	ITERATIVE_BITMASK_CHECK_64(mask, i) {
> +		if (i >= global_n_host_cpus) {
> +			RTE_LOG(ERR, CHANNEL_MANAGER, "CPU(%u) exceeds the available "
> +					"number of CPUs(%u)\n", i, global_n_host_cpus);
> +			return -1;
> +		}
> +		VIR_USE_CPU(global_cpumaps, i);
> +	}
> +	if (virDomainPinVcpuFlags(vm_info->domainPtr, vcpu, global_cpumaps,
> +			global_maplen, flags) < 0) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set vCPU(%u) to pCPU "
> +				"mask(0x%"PRIx64") for VM '%s'\n", vcpu, core_mask,
> +				vm_info->name);
> +		return -1;
> +	}
> +	rte_atomic64_set(&vm_info->pcpu_mask[vcpu], core_mask);
> +	return 0;
> +
> +}
> +
> +int
> +set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num)
> +{
> +	uint64_t mask = 1ULL << core_num;
> +
> +	return set_pcpus_mask(vm_name, vcpu, mask);
> +}
> +
> +uint64_t
> +get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu)
> +{
> +	struct virtual_machine_info *vm_info =
> +			(struct virtual_machine_info *)chan_info->priv_info;
> +	return rte_atomic64_read(&vm_info->pcpu_mask[vcpu]);
> +}
> +
> +static inline int
> +channel_exists(struct virtual_machine_info *vm_info, unsigned channel_num)
Is your intent for this to always be inlined?  If so, you likely meant to make
this __always_inline__ (or whatever the unilateral inline macro is)
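
For reference, a sketch of what I mean, using the GCC/Clang attribute
directly. DEMO_ALWAYS_INLINE and demo_bit_is_set() are hypothetical names, and
the spinlock taken by channel_exists() is left out to keep the example small.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Plain "inline" is only a hint to the compiler. With GCC and Clang the
 * always_inline attribute requests inlining regardless of optimization level.
 */
#define DEMO_ALWAYS_INLINE static inline __attribute__((always_inline))

/* Returns 1 if bit is set in mask, 0 otherwise. */
DEMO_ALWAYS_INLINE int
demo_bit_is_set(uint64_t mask, unsigned bit)
{
	assert(bit < 64);
	return (int)((mask >> bit) & 1);
}
```

If forced inlining is not the intent, dropping the keyword and leaving the
decision to the compiler is the other option.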

> +{
> +	rte_spinlock_lock(&(vm_info->config_spinlock));
> +	if (vm_info->channel_mask & (1ULL << channel_num)) {
> +		rte_spinlock_unlock(&(vm_info->config_spinlock));
> +		return 1;
> +	}
> +	rte_spinlock_unlock(&(vm_info->config_spinlock));
> +	return 0;
> +}
> +
> +
> +
> +static int
> +open_non_blocking_channel(struct channel_info *info)
> +{
> +	int ret, flags;
> +	struct sockaddr_un sock_addr;
> +	fd_set soc_fd_set;
> +	struct timeval tv;
> +
> +	info->fd = socket(AF_UNIX, SOCK_STREAM, 0);
> +	if (info->fd == -1) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Error(%s) creating socket for '%s'\n",
> +				strerror(errno),
> +				info->channel_path);
> +		return -1;
> +	}
> +	sock_addr.sun_family = AF_UNIX;
> +	memcpy(&sock_addr.sun_path, info->channel_path,
> +			strlen(info->channel_path)+1);
> +
> +	/* Get current flags */
> +	flags = fcntl(info->fd, F_GETFL, 0);
> +	if (flags < 0) {
> +		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) fcntl get flags socket for"
> +				" '%s'\n", strerror(errno), info->channel_path);
> +		return -1;
> +	}
> +	/* Set to Non Blocking */
> +	flags |= O_NONBLOCK;
> +	if (fcntl(info->fd, F_SETFL, flags) < 0) {
> +		RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) setting non-blocking "
> +				"socket for '%s'\n", strerror(errno), info->channel_path);
> +		return -1;
> +	}
> +	ret = connect(info->fd, (struct sockaddr *)&sock_addr,
> +			sizeof(sock_addr));
> +	if (ret < 0) {
> +		/* ECONNREFUSED error is given when VM is not active */
> +		if (errno == ECONNREFUSED) {
> +			RTE_LOG(WARNING, CHANNEL_MANAGER, "VM is not active or has not "
> +					"activated its endpoint to channel %s\n",
> +					info->channel_path);
> +			return -1;
> +		}
> +		/* Wait for tv_sec if in progress */
> +		else if (errno == EINPROGRESS) {
> +			tv.tv_sec = 2;
> +			tv.tv_usec = 0;
> +			FD_ZERO(&soc_fd_set);
> +			FD_SET(info->fd, &soc_fd_set);
> +			if (select(info->fd+1, NULL, &soc_fd_set, NULL, &tv) <= 0) {
> +				RTE_LOG(WARNING, CHANNEL_MANAGER, "Timeout or error on channel "
> +						"'%s'\n", info->channel_path);
> +				return -1;
> +			}
> +		} else {
> +			/* Any other error */
> +			RTE_LOG(WARNING, CHANNEL_MANAGER, "Error(%s) connecting socket"
> +					" for '%s'\n", strerror(errno), info->channel_path);
> +			return -1;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static int
> +setup_channel_info(struct virtual_machine_info **vm_info_dptr,
> +		struct channel_info **chan_info_dptr, unsigned channel_num)
> +{
> +	struct channel_info *chan_info = *chan_info_dptr;
> +	struct virtual_machine_info *vm_info = *vm_info_dptr;
> +
> +	chan_info->channel_num = channel_num;
> +	chan_info->priv_info = (void *)vm_info;
> +	chan_info->status = CHANNEL_MGR_CHANNEL_DISCONNECTED;
> +	if (open_non_blocking_channel(chan_info) < 0) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not open channel: "
> +				"'%s' for VM '%s'\n",
> +				chan_info->channel_path, vm_info->name);
> +		return -1;
> +	}
> +	if (add_channel_to_monitor(&chan_info) < 0) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Could not add channel: "
> +				"'%s' to epoll ctl for VM '%s'\n",
> +				chan_info->channel_path, vm_info->name);
> +		return -1;
> +
> +	}
> +	rte_spinlock_lock(&(vm_info->config_spinlock));
> +	vm_info->num_channels++;
> +	vm_info->channel_mask |= 1ULL << channel_num;
> +	vm_info->channels[channel_num] = chan_info;
> +	chan_info->status = CHANNEL_MGR_CHANNEL_CONNECTED;
> +	rte_spinlock_unlock(&(vm_info->config_spinlock));
> +	return 0;
> +}
> +
> +int
> +add_all_channels(const char *vm_name)
> +{
> +	DIR *d;
> +	struct dirent *dir;
> +	struct virtual_machine_info *vm_info;
> +	struct channel_info *chan_info;
> +	char *token, *remaining, *tail_ptr;
> +	char socket_name[PATH_MAX];
> +	unsigned channel_num;
> +	int num_channels_enabled = 0;
> +
> +	/* verify VM exists */
> +	vm_info = find_domain_by_name(vm_name);
> +	if (vm_info == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' not found"
> +				" during channel discovery\n", vm_name);
> +		return 0;
> +	}
> +	if (!virDomainIsActive(vm_info->domainPtr)) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
> +		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
> +		return 0;
> +	}
> +	d = opendir(CHANNEL_MGR_SOCKET_PATH);
> +	if (d == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Error opening directory '%s': %s\n",
> +				CHANNEL_MGR_SOCKET_PATH, strerror(errno));
> +		return -1;
> +	}
> +	while ((dir = readdir(d)) != NULL) {
> +		if (!strncmp(dir->d_name, ".", 1) ||
> +				!strncmp(dir->d_name, "..", 2))
> +			continue;
> +
> +		snprintf(socket_name, sizeof(socket_name), "%s", dir->d_name);
> +		remaining = socket_name;
> +		/* Extract vm_name from "<vm_name>.<channel_num>" */
> +		token = strsep(&remaining, ".");
> +		if (remaining == NULL)
> +			continue;
> +		if (strncmp(vm_name, token, CHANNEL_MGR_MAX_NAME_LEN))
> +			continue;
> +
> +		/* remaining should contain only <channel_num> */
> +		errno = 0;
> +		channel_num = (unsigned)strtol(remaining, &tail_ptr, 0);
> +		if ((errno != 0) || (remaining[0] == '\0') ||
> +				(*tail_ptr != '\0') || tail_ptr == NULL) {
> +			RTE_LOG(WARNING, CHANNEL_MANAGER, "Malformed channel name "
> +					"'%s' found it should be in the form of "
> +					"'<guest_name>.<channel_num>(decimal)'\n",
> +					dir->d_name);
> +			continue;
> +		}
> +		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
> +			RTE_LOG(WARNING, CHANNEL_MANAGER, "Channel number(%u) is "
> +					"greater than max allowable: %d, skipping '%s%s'\n",
> +					channel_num, CHANNEL_CMDS_MAX_VM_CHANNELS-1,
> +					CHANNEL_MGR_SOCKET_PATH, dir->d_name);
> +			continue;
> +		}
> +		/* if channel has not been added previously */
> +		if (channel_exists(vm_info, channel_num))
> +			continue;
> +
> +		chan_info = rte_malloc(NULL, sizeof(*chan_info),
> +				CACHE_LINE_SIZE);
> +		if (chan_info == NULL) {
> +			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
> +				"channel '%s%s'\n", CHANNEL_MGR_SOCKET_PATH, dir->d_name);
> +			continue;
> +		}
> +
> +		snprintf(chan_info->channel_path,
> +				sizeof(chan_info->channel_path), "%s%s",
> +				CHANNEL_MGR_SOCKET_PATH, dir->d_name);
> +
> +		if (setup_channel_info(&vm_info, &chan_info, channel_num) < 0) {
> +			rte_free(chan_info);
> +			continue;
> +		}
> +
> +		num_channels_enabled++;
> +	}
> +	closedir(d);
> +	return num_channels_enabled;
> +}
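
The "<vm_name>.<channel_num>" parsing above could also be factored into a
small helper, which makes it easy to test in isolation. The sketch below is
an illustration only: parse_channel_name() is a hypothetical name, and unlike
the patch it matches the VM name as a prefix instead of splitting at the
first '.', and it accepts decimal numbers only (base 10 instead of base 0).

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): split "<vm_name>.<channel_num>"
 * and return 0 with *channel_num set, or -1 if entry does not belong to
 * vm_name or the suffix is not a plain decimal number.
 */
int
parse_channel_name(const char *entry, const char *vm_name,
		unsigned *channel_num)
{
	size_t name_len = strlen(vm_name);
	const char *num_str;
	char *tail = NULL;
	unsigned long val;

	/* The entry must start with "<vm_name>." */
	if (strncmp(entry, vm_name, name_len) != 0 || entry[name_len] != '.')
		return -1;

	num_str = entry + name_len + 1;
	/* Reject empty suffixes, signs and whitespace that strtoul() accepts */
	if (num_str[0] < '0' || num_str[0] > '9')
		return -1;

	errno = 0;
	val = strtoul(num_str, &tail, 10);
	if (errno != 0 || *tail != '\0' || val > UINT_MAX)
		return -1;

	*channel_num = (unsigned)val;
	return 0;
}
```

The caller would still need to range-check the result against
CHANNEL_CMDS_MAX_VM_CHANNELS, as the loop above does.
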
> +
> +int
> +add_channels(const char *vm_name, unsigned *channel_list,
> +		unsigned len_channel_list)
> +{
> +	struct virtual_machine_info *vm_info;
> +	struct channel_info *chan_info;
> +	char socket_path[PATH_MAX];
> +	unsigned i;
> +	int num_channels_enabled = 0;
> +
> +	vm_info = find_domain_by_name(vm_name);
> +	if (vm_info == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add channels: VM '%s' "
> +				"not found\n", vm_name);
> +		return 0;
> +	}
> +
> +	if (!virDomainIsActive(vm_info->domainPtr)) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "VM: '%s' is not active\n", vm_name);
> +		vm_info->status = CHANNEL_MGR_VM_INACTIVE;
> +		return 0;
> +	}
> +
> +	for (i = 0; i < len_channel_list; i++) {
> +
> +		if (channel_list[i] >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
> +			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel(%u) is out of range "
> +							"0...%d\n", channel_list[i],
> +							CHANNEL_CMDS_MAX_VM_CHANNELS-1);
> +			continue;
> +		}
> +		if (channel_exists(vm_info, channel_list[i])) {
> +			RTE_LOG(INFO, CHANNEL_MANAGER, "Channel already exists, skipping "
> +					"'%s.%u'\n", vm_name, channel_list[i]);
> +			continue;
> +		}
> +
> +		snprintf(socket_path, sizeof(socket_path), "%s%s.%u",
> +				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
> +		errno = 0;
> +		if (access(socket_path, F_OK) < 0) {
> +			RTE_LOG(ERR, CHANNEL_MANAGER, "Channel path '%s' error: "
> +					"%s\n", socket_path, strerror(errno));
> +			continue;
> +		}
> +		chan_info = rte_malloc(NULL, sizeof(*chan_info),
> +				CACHE_LINE_SIZE);
> +		if (chan_info == NULL) {
> +			RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for "
> +					"channel '%s'\n", socket_path);
> +			continue;
> +		}
> +		snprintf(chan_info->channel_path,
> +				sizeof(chan_info->channel_path), "%s%s.%u",
> +				CHANNEL_MGR_SOCKET_PATH, vm_name, channel_list[i]);
> +		if (setup_channel_info(&vm_info, &chan_info, channel_list[i]) < 0) {
> +			rte_free(chan_info);
> +			continue;
> +		}
> +		num_channels_enabled++;
> +
> +	}
> +	return num_channels_enabled;
> +}
> +
> +int
> +remove_channel(struct channel_info **chan_info_dptr)
> +{
> +	struct virtual_machine_info *vm_info;
> +	struct channel_info *chan_info = *chan_info_dptr;
> +
> +	close(chan_info->fd);
> +
> +	vm_info = (struct virtual_machine_info *)chan_info->priv_info;
> +
> +	rte_spinlock_lock(&(vm_info->config_spinlock));
> +	vm_info->channel_mask &= ~(1ULL << chan_info->channel_num);
> +	vm_info->num_channels--;
> +	rte_spinlock_unlock(&(vm_info->config_spinlock));
> +
> +	rte_free(chan_info);
> +	return 0;
> +}
> +
> +int
> +set_channel_status_all(const char *vm_name, enum channel_status status)
> +{
> +	struct virtual_machine_info *vm_info;
> +	unsigned i;
> +	uint64_t mask;
> +	int num_channels_changed = 0;
> +
> +	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
> +			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
> +				"disabled: Unable to change status for VM '%s'\n", vm_name);
> +		return 0;
> +	}
> +	vm_info = find_domain_by_name(vm_name);
> +	if (vm_info == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to disable channels: VM '%s' "
> +				"not found\n", vm_name);
> +		return 0;
> +	}
> +
> +	rte_spinlock_lock(&(vm_info->config_spinlock));
> +	mask = vm_info->channel_mask;
> +	ITERATIVE_BITMASK_CHECK_64(mask, i) {
> +		vm_info->channels[i]->status = status;
> +		num_channels_changed++;
> +	}
> +	rte_spinlock_unlock(&(vm_info->config_spinlock));
> +	return num_channels_changed;
> +
> +}
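
ITERATIVE_BITMASK_CHECK_64 is not defined in this part of the patch, so the
following is only an assumption about its intent: visit the index of every
set bit in a 64-bit mask. A standalone sketch of that pattern, using a
hypothetical collect_set_bits() helper and the GCC/Clang builtin
__builtin_ctzll():

```c
#include <stdint.h>

/*
 * Hypothetical helper: write the index of every set bit in mask to out[]
 * (which must hold at least 64 entries) and return how many were written.
 */
unsigned
collect_set_bits(uint64_t mask, unsigned out[64])
{
	unsigned count = 0;

	while (mask != 0) {
		/* Index of the lowest set bit (GCC/Clang builtin) */
		unsigned bit = (unsigned)__builtin_ctzll(mask);

		out[count++] = bit;
		/* Clear the lowest set bit */
		mask &= mask - 1;
	}
	return count;
}
```

Working on a local copy of the mask, as the functions here do, keeps
vm_info->channel_mask itself unchanged during the walk.
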
> +
> +int
> +set_channel_status(const char *vm_name, unsigned *channel_list,
> +		unsigned len_channel_list, enum channel_status status)
> +{
> +	struct virtual_machine_info *vm_info;
> +	unsigned i;
> +	int num_channels_changed = 0;
> +
> +	if (!(status == CHANNEL_MGR_CHANNEL_CONNECTED ||
> +			status == CHANNEL_MGR_CHANNEL_DISABLED)) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Channels can only be enabled or "
> +				"disabled: Unable to change status for VM '%s'\n", vm_name);
> +		return 0;
> +	}
> +	vm_info = find_domain_by_name(vm_name);
> +	if (vm_info == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to set channel status: VM '%s' "
> +				"not found\n", vm_name);
> +		return 0;
> +	}
> +	for (i = 0; i < len_channel_list; i++) {
> +		if (channel_exists(vm_info, channel_list[i])) {
> +			rte_spinlock_lock(&(vm_info->config_spinlock));
> +			vm_info->channels[channel_list[i]]->status = status;
> +			rte_spinlock_unlock(&(vm_info->config_spinlock));
> +			num_channels_changed++;
> +		}
> +	}
> +	return num_channels_changed;
> +}
> +
> +int
> +get_info_vm(const char *vm_name, struct vm_info *info)
> +{
> +	struct virtual_machine_info *vm_info;
> +	unsigned i, channel_num = 0;
> +	uint64_t mask;
> +
> +	vm_info = find_domain_by_name(vm_name);
> +	if (vm_info == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "VM '%s' not found\n", vm_name);
> +		return -1;
> +	}
> +	info->status = CHANNEL_MGR_VM_ACTIVE;
> +	if (!virDomainIsActive(vm_info->domainPtr))
> +		info->status = CHANNEL_MGR_VM_INACTIVE;
> +
> +	rte_spinlock_lock(&(vm_info->config_spinlock));
> +
> +	mask = vm_info->channel_mask;
> +	ITERATIVE_BITMASK_CHECK_64(mask, i) {
> +		info->channels[channel_num].channel_num = i;
> +		memcpy(info->channels[channel_num].channel_path,
> +				vm_info->channels[i]->channel_path, PATH_MAX);
> +		info->channels[channel_num].status = vm_info->channels[i]->status;
> +		info->channels[channel_num].fd = vm_info->channels[i]->fd;
> +		channel_num++;
> +	}
> +
> +	info->num_channels = channel_num;
> +	info->num_vcpus = vm_info->info.nrVirtCpu;
> +	rte_spinlock_unlock(&(vm_info->config_spinlock));
> +
> +	memcpy(info->name, vm_info->name, sizeof(vm_info->name));
> +	for (i = 0; i < info->num_vcpus; i++) {
> +		info->pcpu_mask[i] = rte_atomic64_read(&vm_info->pcpu_mask[i]);
> +	}
> +	return 0;
> +}
> +
> +int
> +add_vm(const char *vm_name)
> +{
> +	struct virtual_machine_info *new_domain;
> +	virDomainPtr dom_ptr;
> +	int i;
> +
> +	if (find_domain_by_name(vm_name) != NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to add VM: VM '%s' "
> +				"already exists\n", vm_name);
> +		return -1;
> +	}
> +
> +	if (global_vir_conn_ptr == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "No connection to hypervisor exists\n");
> +		return -1;
> +	}
> +	dom_ptr = virDomainLookupByName(global_vir_conn_ptr, vm_name);
> +	if (dom_ptr == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Error on VM lookup with libvirt: "
> +				"VM '%s' not found\n", vm_name);
> +		return -1;
> +	}
> +
> +	new_domain = rte_malloc("virtual_machine_info", sizeof(*new_domain),
> +			CACHE_LINE_SIZE);
> +	if (new_domain == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to allocate memory for VM "
> +				"info\n");
> +		return -1;
> +	}
> +	new_domain->domainPtr = dom_ptr;
> +	if (virDomainGetInfo(new_domain->domainPtr, &new_domain->info) != 0) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get libvirt VM info\n");
> +		rte_free(new_domain);
> +		return -1;
> +	}
> +	if (new_domain->info.nrVirtCpu > CHANNEL_CMDS_MAX_CPUS) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Error the number of virtual CPUs(%u) is "
> +				"greater than allowable(%d)\n", new_domain->info.nrVirtCpu,
> +				CHANNEL_CMDS_MAX_CPUS);
> +		rte_free(new_domain);
> +		return -1;
> +	}
> +
> +	for (i = 0; i < CHANNEL_CMDS_MAX_CPUS; i++) {
> +		rte_atomic64_init(&new_domain->pcpu_mask[i]);
> +	}
> +	if (update_pcpus_mask(new_domain) < 0) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Error getting physical CPU pinning\n");
> +		rte_free(new_domain);
> +		return -1;
> +	}
> +	snprintf(new_domain->name, sizeof(new_domain->name), "%s", vm_name);
> +	new_domain->channel_mask = 0;
> +	new_domain->num_channels = 0;
> +
> +	if (!virDomainIsActive(dom_ptr))
> +		new_domain->status = CHANNEL_MGR_VM_INACTIVE;
> +	else
> +		new_domain->status = CHANNEL_MGR_VM_ACTIVE;
> +
> +	rte_spinlock_init(&(new_domain->config_spinlock));
> +	LIST_INSERT_HEAD(&vm_list_head, new_domain, vms_info);
> +	return 0;
> +}
> +
> +int
> +remove_vm(const char *vm_name)
> +{
> +	struct virtual_machine_info *vm_info = find_domain_by_name(vm_name);
> +
> +	if (vm_info == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM: VM '%s' "
> +				"not found\n", vm_name);
> +		return -1;
> +	}
> +	rte_spinlock_lock(&vm_info->config_spinlock);
> +	if (vm_info->num_channels != 0) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to remove VM '%s', there are "
> +				"%"PRId8" channels still active\n",
> +				vm_name, vm_info->num_channels);
> +		rte_spinlock_unlock(&vm_info->config_spinlock);
> +		return -1;
> +	}
> +	LIST_REMOVE(vm_info, vms_info);
> +	rte_spinlock_unlock(&vm_info->config_spinlock);
> +	rte_free(vm_info);
> +	return 0;
> +}
> +
> +static void
> +disconnect_hypervisor(void)
> +{
> +	if (global_vir_conn_ptr != NULL) {
> +		virConnectClose(global_vir_conn_ptr);
> +		global_vir_conn_ptr = NULL;
> +	}
> +}
> +
> +static int
> +connect_hypervisor(const char *path)
> +{
> +	if (global_vir_conn_ptr != NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection "
> +				"already established\n", path);
> +		return -1;
> +	}
> +	global_vir_conn_ptr = virConnectOpen(path);
> +	if (global_vir_conn_ptr == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Error failed to open connection to "
> +				"Hypervisor '%s'\n", path);
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +int
> +channel_manager_init(const char *path)
> +{
> +	int n_cpus;
> +
> +	LIST_INIT(&vm_list_head);
> +	if (connect_hypervisor(path) < 0) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to initialize channel manager\n");
> +		return -1;
> +	}
> +
> +	global_maplen = VIR_CPU_MAPLEN(CHANNEL_CMDS_MAX_CPUS);
> +
> +	global_vircpuinfo = rte_zmalloc(NULL, sizeof(*global_vircpuinfo) *
> +			CHANNEL_CMDS_MAX_CPUS, CACHE_LINE_SIZE);
> +	if (global_vircpuinfo == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Error allocating memory for CPU Info\n");
> +		goto error;
> +	}
> +	global_cpumaps = rte_zmalloc(NULL, CHANNEL_CMDS_MAX_CPUS * global_maplen,
> +			CACHE_LINE_SIZE);
> +	if (global_cpumaps == NULL) {
> +		goto error;
> +	}
> +
> +	n_cpus = virNodeGetCPUMap(global_vir_conn_ptr, NULL, NULL, 0);
> +	if (n_cpus <= 0) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to get the number of Host "
> +				"CPUs\n");
> +		goto error;
> +	}
> +	global_n_host_cpus = (unsigned)n_cpus;
> +
> +	if (global_n_host_cpus > CHANNEL_CMDS_MAX_CPUS) {
> +		RTE_LOG(ERR, CHANNEL_MANAGER, "The number of host CPUs(%u) exceeds the "
> +				"maximum of %u\n", global_n_host_cpus, CHANNEL_CMDS_MAX_CPUS);
> +		goto error;
> +
> +	}
> +
> +	return 0;
> +error:
> +	disconnect_hypervisor();
> +	return -1;
> +}
> +
> +void
> +channel_manager_exit(void)
> +{
> +	unsigned i;
> +	uint64_t mask;
> +	struct virtual_machine_info *vm_info;
> +
> +	while ((vm_info = LIST_FIRST(&vm_list_head)) != NULL) {
> +
> +		rte_spinlock_lock(&(vm_info->config_spinlock));
> +
> +		mask = vm_info->channel_mask;
> +		ITERATIVE_BITMASK_CHECK_64(mask, i) {
> +			remove_channel_from_monitor(vm_info->channels[i]);
> +			close(vm_info->channels[i]->fd);
> +			rte_free(vm_info->channels[i]);
> +		}
> +		rte_spinlock_unlock(&(vm_info->config_spinlock));
> +
> +		LIST_REMOVE(vm_info, vms_info);
> +		rte_free(vm_info);
> +	}
> +
> +	if (global_cpumaps != NULL)
> +		rte_free(global_cpumaps);
> +	if (global_vircpuinfo != NULL)
> +		rte_free(global_vircpuinfo);
> +	disconnect_hypervisor();
> +}
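
A note on list teardown: a loop that removes and frees elements must not
follow the next pointer of a node it has already freed. The usual pattern
with <sys/queue.h> is to re-read the head on every pass. The sketch below is
standalone; struct node and drain_list() are hypothetical names and plain
malloc()/free() stand in for rte_malloc()/rte_free().

```c
#include <stdlib.h>
#include <sys/queue.h>

struct node {
	int value;
	LIST_ENTRY(node) link;
};

LIST_HEAD(node_list, node);

/* Remove and free every element; returns the number of elements freed. */
unsigned
drain_list(struct node_list *head)
{
	struct node *n;
	unsigned freed = 0;

	/* Re-read the head each time instead of following a freed node */
	while ((n = LIST_FIRST(head)) != NULL) {
		LIST_REMOVE(n, link);
		free(n);
		freed++;
	}
	return freed;
}
```
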
> diff --git a/examples/vm_power_manager/channel_manager.h b/examples/vm_power_manager/channel_manager.h
> new file mode 100644
> index 0000000..12c29c3
> --- /dev/null
> +++ b/examples/vm_power_manager/channel_manager.h
> @@ -0,0 +1,314 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef CHANNEL_MANAGER_H_
> +#define CHANNEL_MANAGER_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <linux/limits.h>
> +#include <rte_atomic.h>
> +#include "channel_commands.h"
> +
> +/* Maximum name length including '\0' terminator */
> +#define CHANNEL_MGR_MAX_NAME_LEN    64
> +
> +/* Maximum number of channels to each Virtual Machine */
> +#define CHANNEL_MGR_MAX_CHANNELS    64
> +
> +/* Hypervisor Path for libvirt(qemu/KVM) */
> +#define CHANNEL_MGR_DEFAULT_HV_PATH "qemu:///system"
> +
> +/* File socket directory */
> +#define CHANNEL_MGR_SOCKET_PATH     "/tmp/powermonitor/"
> +
> +/* Communication Channel Status */
> +enum channel_status { CHANNEL_MGR_CHANNEL_DISCONNECTED = 0,
> +	CHANNEL_MGR_CHANNEL_CONNECTED,
> +	CHANNEL_MGR_CHANNEL_DISABLED,
> +	CHANNEL_MGR_CHANNEL_PROCESSING};
> +
> +/* VM libvirt(qemu/KVM) connection status */
> +enum vm_status { CHANNEL_MGR_VM_INACTIVE = 0, CHANNEL_MGR_VM_ACTIVE};
> +
> +/*
> + *  Represents a single and exclusive VM channel that exists between a guest and
> + *  the host.
> + */
> +struct channel_info {
> +	char channel_path[PATH_MAX]; /**< Path to host socket */
> +	volatile uint32_t status;    /**< Connection status(enum channel_status) */
> +	int fd;                      /**< AF_UNIX socket fd */
> +	unsigned channel_num;        /**< CHANNEL_MGR_SOCKET_PATH/<vm_name>.channel_num */
> +	void *priv_info;             /**< Pointer to private info, do not modify */
> +};
> +
> +/* Represents a single VM instance used to return internal information about
> + * a VM */
> +struct vm_info {
> +	char name[CHANNEL_MGR_MAX_NAME_LEN];          /**< VM name */
> +	enum vm_status status;                        /**< libvirt status */
> +	uint64_t pcpu_mask[CHANNEL_CMDS_MAX_CPUS];    /**< pCPU mask for each vCPU */
> +	unsigned num_vcpus;                           /**< number of vCPUS */
> +	struct channel_info channels[CHANNEL_MGR_MAX_CHANNELS]; /**< Array of channel_info */
> +	unsigned num_channels;                        /**< Number of channels */
> +};
> +
> +/**
> + * Initialize the Channel Manager resources and connect to the Hypervisor
> + * specified in path.
> + * This must be successfully called first before calling any other functions.
> + * It must only be called once.
> + *
> + * @param path
> + *  Must be a local path, e.g. qemu:///system.
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int channel_manager_init(const char *path);
> +
> +/**
> + * Free resources associated with the Channel Manager.
> + *
> + * @return
> + *  None
> + */
> +void channel_manager_exit(void);
> +
> +/**
> + * Get the Physical CPU mask for the VM lcore channel (vCPU). The mask is
> + * returned as the function result.
> + * It is not thread-safe.
> + *
> + * @param chan_info
> + *  Pointer to struct channel_info
> + *
> + * @param vcpu
> + *  The virtual CPU to query.
> + *
> + *
> + * @return
> + *  - 0 on error.
> + *  - >0 on success.
> + */
> +uint64_t get_pcpus_mask(struct channel_info *chan_info, unsigned vcpu);
> +
> +/**
> + * Set the Physical CPU mask for the specified vCPU.
> + * It is not thread-safe.
> + *
> + * @param vm_name
> + *  Virtual Machine name to lookup
> + *
> + * @param vcpu
> + *  The virtual CPU to set.
> + *
> + * @param core_mask
> + *  The core mask of the physical CPU(s) to bind the vCPU
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int set_pcpus_mask(char *vm_name, unsigned vcpu, uint64_t core_mask);
> +
> +/**
> + * Set the Physical CPU for the specified vCPU.
> + * It is not thread-safe.
> + *
> + * @param vm_name
> + *  Virtual Machine name to lookup
> + *
> + * @param vcpu
> + *  The virtual CPU to set.
> + *
> + * @param core_num
> + *  The core number of the physical CPU(s) to bind the vCPU
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int set_pcpu(char *vm_name, unsigned vcpu, unsigned core_num);
> +/**
> + * Add a VM as specified by name to the Channel Manager. The name must
> + * correspond to a valid libvirt domain name.
> + * This is required prior to adding channels.
> + * It is not thread-safe.
> + *
> + * @param name
> + *  Virtual Machine name to lookup.
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int add_vm(const char *name);
> +
> +/**
> + * Remove a previously added Virtual Machine from the Channel Manager
> + * It is not thread-safe.
> + *
> + * @param name
> + *  Virtual Machine name to lookup.
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int remove_vm(const char *name);
> +
> +/**
> + * Add all available channels to the VM as specified by name.
> + * Only channels in the form of paths
> + * (CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number>) will be parsed.
> + * It is not thread-safe.
> + *
> + * @param vm_name
> + *  Virtual Machine name to lookup.
> + *
> + * @return
> + *  - N the number of channels added for the VM
> + */
> +int add_all_channels(const char *vm_name);
> +
> +/**
> + * Add the channel numbers in channel_list to the domain specified by name.
> + * Only channels in the form of paths
> + * (CHANNEL_MGR_SOCKET_PATH/<vm_name>.<channel_number>) will be parsed.
> + * It is not thread-safe.
> + *
> + * @param vm_name
> + *  Virtual Machine name to add channels.
> + *
> + * @param channel_list
> + *  Pointer to list of unsigned integers, representing the channel number to add
> + *  It must be allocated outside of this function.
> + *
> + * @param num_channels
> + *  The amount of channel numbers in channel_list
> + *
> + * @return
> + *  - N the number of channels added for the VM
> + *  - 0 for error
> + */
> +int add_channels(const char *vm_name, unsigned *channel_list,
> +		unsigned num_channels);
> +
> +/**
> + * Remove a channel definition from the channel manager. This must only be
> + * called from the channel monitor thread.
> + *
> + * @param chan_info_dptr
> + *  Pointer to a pointer to a valid struct channel_info.
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int remove_channel(struct channel_info **chan_info_dptr);
> +
> +/**
> + * For all channels associated with a Virtual Machine name, update the
> + * connection status. Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
> + * CHANNEL_MGR_CHANNEL_DISABLED only.
> + *
> + *
> + * @param name
> + *  Virtual Machine name to modify all channels.
> + *
> + * @param status
> + *  The status to set each channel
> + *
> + * @return
> + *  - N the number of channels modified for the VM
> + *  - 0 for error
> + */
> +int set_channel_status_all(const char *name, enum channel_status status);
> +
> +/**
> + * For all channels in channel_list associated with a Virtual Machine name
> + * update the connection status of each.
> + * Valid states are CHANNEL_MGR_CHANNEL_CONNECTED or
> + * CHANNEL_MGR_CHANNEL_DISABLED only.
> + * It is not thread-safe.
> + *
> + * @param vm_name
> + *  Virtual Machine name to modify channels.
> + *
> + * @param channel_list
> + *  Pointer to list of unsigned integers, representing the channel numbers to
> + *  modify.
> + *  It must be allocated outside of this function.
> + *
> + * @param len_channel_list
> + *  The amount of channel numbers in channel_list
> + *
> + * @return
> + *  - N the number of channels modified for the VM
> + *  - 0 for error
> + */
> +int set_channel_status(const char *vm_name, unsigned *channel_list,
> +		unsigned len_channel_list, enum channel_status status);
> +
> +/**
> + * Populates a pointer to struct vm_info associated with vm_name.
> + *
> + * @param vm_name
> + *  The name of the virtual machine to lookup.
> + *
> + * @param info
> + *  Pointer to a struct vm_info, this must be allocated prior to calling this
> + *  function.
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int get_info_vm(const char *vm_name, struct vm_info *info);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* CHANNEL_MANAGER_H_ */
> diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
> new file mode 100644
> index 0000000..e3c1b0c
> --- /dev/null
> +++ b/examples/vm_power_manager/channel_monitor.c
> @@ -0,0 +1,234 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <unistd.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdint.h>
> +#include <signal.h>
> +#include <errno.h>
> +#include <string.h>
> +#include <sys/types.h>
> +#include <sys/epoll.h>
> +#include <sys/queue.h>
> +
> +#include <rte_config.h>
> +#include <rte_log.h>
> +#include <rte_memory.h>
> +#include <rte_malloc.h>
> +#include <rte_atomic.h>
> +
> +
> +#include "channel_monitor.h"
> +#include "channel_commands.h"
> +#include "channel_manager.h"
> +#include "power_manager.h"
> +
> +#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
> +
> +#define MAX_EVENTS 256
> +
> +
> +static volatile unsigned run_loop = 1;
> +static int global_event_fd;
> +static struct epoll_event *global_events_list;
> +
> +void channel_monitor_exit(void)
> +{
> +	run_loop = 0;
> +	rte_free(global_events_list);
> +}
> +
> +static int
> +process_request(struct channel_packet *pkt, struct channel_info *chan_info)
> +{
> +	uint64_t core_mask;
> +
> +	if (chan_info == NULL)
> +		return -1;
> +
> +	if (rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_CONNECTED,
> +			CHANNEL_MGR_CHANNEL_PROCESSING) == 0)
> +		return -1;
> +
> +	if (pkt->command == CPU_POWER) {
> +		core_mask = get_pcpus_mask(chan_info, pkt->resource_id);
> +		if (core_mask == 0) {
> +			RTE_LOG(ERR, CHANNEL_MONITOR, "Error getting physical CPU mask for "
> +				"channel '%s' using vCPU(%u)\n", chan_info->channel_path,
> +				(unsigned)pkt->resource_id);
> +			return -1;
> +		}
> +		if (__builtin_popcountll(core_mask) == 1) {
> +
> +			unsigned core_num = __builtin_ffsll(core_mask) - 1;
> +
> +			switch (pkt->unit) {
> +			case(CPU_POWER_SCALE_MIN):
> +					power_manager_scale_core_min(core_num);
> +			break;
> +			case(CPU_POWER_SCALE_MAX):
> +					power_manager_scale_core_max(core_num);
> +			break;
> +			case(CPU_POWER_SCALE_DOWN):
> +					power_manager_scale_core_down(core_num);
> +			break;
> +			case(CPU_POWER_SCALE_UP):
> +					power_manager_scale_core_up(core_num);
> +			break;
> +			default:
> +				break;
> +			}
> +		} else {
> +			switch (pkt->unit) {
> +			case(CPU_POWER_SCALE_MIN):
> +					power_manager_scale_mask_min(core_mask);
> +			break;
> +			case(CPU_POWER_SCALE_MAX):
> +					power_manager_scale_mask_max(core_mask);
> +			break;
> +			case(CPU_POWER_SCALE_DOWN):
> +					power_manager_scale_mask_down(core_mask);
> +			break;
> +			case(CPU_POWER_SCALE_UP):
> +					power_manager_scale_mask_up(core_mask);
> +			break;
> +			default:
> +				break;
> +			}
> +
> +		}
> +	}
> +	/* Return is not checked as channel status may have been set to DISABLED
> +	 * from management thread
> +	 */
> +	rte_atomic32_cmpset(&(chan_info->status), CHANNEL_MGR_CHANNEL_PROCESSING,
> +			CHANNEL_MGR_CHANNEL_CONNECTED);
> +	return 0;
> +
> +}
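
The single-core versus multi-core decision in process_request() relies on
two GCC/Clang builtins: __builtin_popcountll() to count set bits and
__builtin_ffsll() to find the lowest one (1-based, hence the "- 1"). A
standalone sketch of just that step, with a hypothetical helper name:

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the core index if exactly one bit is set in
 * core_mask, or -1 if the mask is empty or names more than one core.
 */
int
core_num_from_mask(uint64_t core_mask)
{
	if (__builtin_popcountll(core_mask) != 1)
		return -1;
	/* ffsll() is 1-based, so bit 0 yields 1 */
	return __builtin_ffsll((long long)core_mask) - 1;
}
```

A return value of -1 for a non-empty mask corresponds to the branch that
calls the power_manager_scale_mask_*() functions.
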
> +
> +int
> +add_channel_to_monitor(struct channel_info **chan_info)
> +{
> +	struct channel_info *info = *chan_info;
> +	struct epoll_event event;
> +
> +	event.events = EPOLLIN;
> +	event.data.ptr = info;
> +	if (epoll_ctl(global_event_fd, EPOLL_CTL_ADD, info->fd, &event) < 0) {
> +		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to add channel '%s' "
> +				"to epoll\n", info->channel_path);
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +int
> +remove_channel_from_monitor(struct channel_info *chan_info)
> +{
> +	if (epoll_ctl(global_event_fd, EPOLL_CTL_DEL, chan_info->fd, NULL) < 0) {
> +		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to remove channel '%s' "
> +				"from epoll\n", chan_info->channel_path);
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +int
> +channel_monitor_init(void)
> +{
> +	global_event_fd = epoll_create1(0);
> +	if (global_event_fd < 0) {
> +		RTE_LOG(ERR, CHANNEL_MONITOR, "Error creating epoll context with "
> +				"error %s\n", strerror(errno));
> +		return -1;
> +	}
> +	global_events_list = rte_malloc("epoll_events", sizeof(*global_events_list)
> +			* MAX_EVENTS, CACHE_LINE_SIZE);
> +	if (global_events_list == NULL) {
> +		RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for "
> +				"epoll events\n");
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +void
> +run_channel_monitor(void)
> +{
> +	while (run_loop) {
> +		int n_events, i;
> +
> +		n_events = epoll_wait(global_event_fd, global_events_list,
> +				MAX_EVENTS, 1);
> +		if (!run_loop)
> +			break;
> +		for (i = 0; i < n_events; i++) {
> +			struct channel_info *chan_info = (struct channel_info *)
> +					global_events_list[i].data.ptr;
> +			if ((global_events_list[i].events & EPOLLERR) ||
> +					(global_events_list[i].events & EPOLLHUP)) {
> +				RTE_LOG(DEBUG, CHANNEL_MONITOR, "Remote closed connection for "
> +						"channel '%s'\n", chan_info->channel_path);
> +				remove_channel(&chan_info);
> +				continue;
> +			}
> +			if (global_events_list[i].events & EPOLLIN) {
> +
> +				int n_bytes, err = 0;
> +				struct channel_packet pkt;
> +				void *buffer = &pkt;
> +				int buffer_len = sizeof(pkt);
> +
> +				while (buffer_len > 0) {
> +					n_bytes = read(chan_info->fd, buffer, buffer_len);
> +					if (n_bytes == buffer_len)
> +						break;
> +					if (n_bytes == -1) {
> +						err = errno;
> +						RTE_LOG(DEBUG, CHANNEL_MONITOR, "Received error on "
> +								"channel '%s' read: %s\n",
> +								chan_info->channel_path, strerror(err));
> +						remove_channel(&chan_info);
> +						break;
> +					}
> +					buffer = (char *)buffer + n_bytes;
> +					buffer_len -= n_bytes;
> +				}
> +				if (!err)
> +					process_request(&pkt, chan_info);
> +			}
> +		}
> +	}
> +}
> diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
> new file mode 100644
> index 0000000..c138607
> --- /dev/null
> +++ b/examples/vm_power_manager/channel_monitor.h
> @@ -0,0 +1,102 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef CHANNEL_MONITOR_H_
> +#define CHANNEL_MONITOR_H_
> +
> +#include "channel_manager.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Set up the Channel Monitor resources required to initialize epoll.
> + * Must be called first before calling other functions.
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int channel_monitor_init(void);
> +
> +/**
> + * Run the channel monitor, loops forever on epoll_wait.
> + *
> + *
> + * @return
> + *  None
> + */
> +void run_channel_monitor(void);
> +
> +/**
> + * Exit the Channel Monitor, exiting the epoll_wait loop and events processing.
> + *
> + *
> + * @return
> + *  None
> + */
> +void channel_monitor_exit(void);
> +
> +/**
> + * Add an open channel to monitor via epoll. A pointer to struct channel_info
> + * will be registered with epoll for event processing.
> + * It is thread-safe.
> + *
> + * @param chan_info
> + *  Pointer to struct channel_info pointer.
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int add_channel_to_monitor(struct channel_info **chan_info);
> +
> +/**
> + * Remove a previously added channel from epoll control.
> + *
> + * @param chan_info
> + *  Pointer to struct channel_info.
> + *
> + * @return
> + *  - 0 on success.
> + *  - Negative on error.
> + */
> +int remove_channel_from_monitor(struct channel_info *chan_info);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +
> +#endif /* CHANNEL_MONITOR_H_ */
> -- 
> 1.7.4.1
> 
> 

* [dpdk-dev] [PATCH v6 02/10] VM Power Management CLI(Host).
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 01/10] Channel Manager and Monitor for VM Power Management(Host) Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 03/10] CPU Frequency Power Management(Host) Pablo de Lara
                             ` (8 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

The CLI is used for administering the channel monitor and manager and for
manually setting the CPU frequency on the host.

Supports the following commands:
 add_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 rm_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 add_channels [Fixed STRING]: add_channels <vm_name> <list>|all, add
  communication channels for the specified VM, the virtio channels must be
  enabled in the VM configuration(qemu/libvirt) and the associated VM must be
  active. <list> is a comma-separated list of channel numbers to add, using the
  keyword 'all' will attempt to add all channels for the VM

 set_channel_status [Fixed STRING]:
  set_channel_status <vm_name> <list>|all enabled|disabled,  enable or disable
  the communication channels in list(comma-separated) for the specified VM,
  alternatively list can be replaced with keyword 'all'. Disabled channels will
  still receive packets on the host, however the commands they specify will be
  ignored. Set status to 'enabled' to begin processing requests again.

 show_vm [Fixed STRING]: show_vm <vm_name>, prints the information on the
  specified VM(s), the information lists the number of vCPUS, the pinning to
  pCPU(s) as a bit mask, along with any communication channels associated with
  each VM

 show_cpu_freq_mask [Fixed STRING]: show_cpu_freq_mask <mask>, Get the current
  frequency for each core specified in the mask

 set_cpu_freq_mask [Fixed STRING]:
  set_cpu_freq_mask <core_mask> <up|down|min|max>, Set the current frequency
  for the cores specified in <core_mask> by scaling each up/down/min/max.

 show_cpu_freq [Fixed STRING]: Get the current frequency for the specified core

 set_cpu_freq [Fixed STRING]: set_cpu_freq <core_num> <up|down|min|max>,
  Set the current frequency for the specified core by scaling up/down/min/max

 quit [Fixed STRING]: close the application
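
For reviewers, the <list> argument taken by add_channels and
set_channel_status can be validated along the following lines. This is only a
minimal sketch and not the code in this patch: parse_channel_list() and
MAX_VM_CHANNELS are hypothetical names, with MAX_VM_CHANNELS standing in for
CHANNEL_CMDS_MAX_VM_CHANNELS and its value of 64 being an assumption. The
sketch rejects empty fields, trailing garbage, out-of-range channel numbers
and lists longer than the output array.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for CHANNEL_CMDS_MAX_VM_CHANNELS; value assumed. */
#define MAX_VM_CHANNELS 64

/*
 * Parse a comma-separated list such as "0,2,5" into out[].
 * Returns the number of entries stored, or -1 if the list is malformed,
 * a channel number is out of range, or more than max_entries are given.
 */
static int
parse_channel_list(const char *list, unsigned *out, unsigned max_entries)
{
	unsigned n = 0;
	const char *p = list;

	for (;;) {
		char *tail;
		long val;

		errno = 0;
		val = strtol(p, &tail, 10);
		/* tail == p means no digits were consumed (empty field) */
		if (tail == p || errno != 0)
			return -1;
		/* a number must be followed by ',' or the end of the string */
		if (*tail != ',' && *tail != '\0')
			return -1;
		if (val < 0 || val >= MAX_VM_CHANNELS)
			return -1;
		if (n == max_entries)
			return -1;
		out[n++] = (unsigned)val;
		if (*tail == '\0')
			break;
		p = tail + 1;
	}
	return (int)n;
}
```

Checking the entry count as well as each channel number keeps a list with
repeated entries from writing past the end of the caller's array.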

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/vm_power_cli.c |  673 ++++++++++++++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.h |   47 ++
 2 files changed, 720 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h

diff --git a/examples/vm_power_manager/vm_power_cli.c b/examples/vm_power_manager/vm_power_cli.c
new file mode 100644
index 0000000..e7f4469
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.c
@@ -0,0 +1,673 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <string.h>
+#include <termios.h>
+#include <errno.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+
+#include "vm_power_cli.h"
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "channel_commands.h"
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+		struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+struct cmd_show_vm_result {
+	cmdline_fixed_string_t show_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_show_vm_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_show_vm_result *res = parsed_result;
+	struct vm_info info;
+	unsigned i;
+
+	if (get_info_vm(res->vm_name, &info) != 0)
+		return;
+	cmdline_printf(cl, "VM: '%s', status = ", info.name);
+	if (info.status == CHANNEL_MGR_VM_ACTIVE)
+		cmdline_printf(cl, "ACTIVE\n");
+	else
+		cmdline_printf(cl, "INACTIVE\n");
+	cmdline_printf(cl, "Channels %u\n", info.num_channels);
+	for (i = 0; i < info.num_channels; i++) {
+		cmdline_printf(cl, "  [%u]: %s, status = ", i,
+				info.channels[i].channel_path);
+		switch (info.channels[i].status) {
+		case CHANNEL_MGR_CHANNEL_CONNECTED:
+			cmdline_printf(cl, "CONNECTED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_DISCONNECTED:
+			cmdline_printf(cl, "DISCONNECTED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_DISABLED:
+			cmdline_printf(cl, "DISABLED\n");
+			break;
+		case CHANNEL_MGR_CHANNEL_PROCESSING:
+			cmdline_printf(cl, "PROCESSING\n");
+			break;
+		default:
+			cmdline_printf(cl, "UNKNOWN\n");
+			break;
+		}
+	}
+	cmdline_printf(cl, "Virtual CPU(s): %u\n", info.num_vcpus);
+	for (i = 0; i < info.num_vcpus; i++) {
+		cmdline_printf(cl, "  [%u]: Physical CPU Mask 0x%"PRIx64"\n", i,
+				info.pcpu_mask[i]);
+	}
+}
+
+
+
+cmdline_parse_token_string_t cmd_vm_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+				show_vm, "show_vm");
+cmdline_parse_token_string_t cmd_show_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_vm_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_show_vm_set = {
+	.f = cmd_show_vm_parsed,
+	.data = NULL,
+	.help_str = "show_vm <vm_name>, prints the information on the "
+			"specified VM(s), the information lists the number of vCPUS, the "
+			"pinning to pCPU(s) as a bit mask, along with any communication "
+			"channels associated with each VM",
+	.tokens = {
+		(void *)&cmd_vm_show,
+		(void *)&cmd_show_vm_name,
+		NULL,
+	},
+};
+
+/* *** vCPU to pCPU mapping operations *** */
+struct cmd_set_pcpu_mask_result {
+	cmdline_fixed_string_t set_pcpu_mask;
+	cmdline_fixed_string_t vm_name;
+	uint8_t vcpu;
+	uint64_t core_mask;
+};
+
+static void
+cmd_set_pcpu_mask_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_pcpu_mask_result *res = parsed_result;
+
+	if (set_pcpus_mask(res->vm_name, res->vcpu, res->core_mask) == 0)
+		cmdline_printf(cl, "Pinned vCPU(%"PRIu8") to pCPU core "
+				"mask(0x%"PRIx64")\n", res->vcpu, res->core_mask);
+	else
+		cmdline_printf(cl, "Unable to pin vCPU(%"PRIu8") to pCPU core "
+				"mask(0x%"PRIx64")\n", res->vcpu, res->core_mask);
+}
+
+cmdline_parse_token_string_t cmd_set_pcpu_mask =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				set_pcpu_mask, "set_pcpu_mask");
+cmdline_parse_token_string_t cmd_set_pcpu_mask_vm_name =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				vm_name, NULL);
+cmdline_parse_token_num_t set_pcpu_mask_vcpu =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				vcpu, UINT8);
+cmdline_parse_token_num_t set_pcpu_mask_core_mask =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_mask_result,
+				core_mask, UINT64);
+
+
+cmdline_parse_inst_t cmd_set_pcpu_mask_set = {
+		.f = cmd_set_pcpu_mask_parsed,
+		.data = NULL,
+		.help_str = "set_pcpu_mask <vm_name> <vcpu> <pcpu>, Set the binding "
+				"of Virtual CPU on VM to the Physical CPU mask.",
+				.tokens = {
+						(void *)&cmd_set_pcpu_mask,
+						(void *)&cmd_set_pcpu_mask_vm_name,
+						(void *)&set_pcpu_mask_vcpu,
+						(void *)&set_pcpu_mask_core_mask,
+						NULL,
+		},
+};
+
+struct cmd_set_pcpu_result {
+	cmdline_fixed_string_t set_pcpu;
+	cmdline_fixed_string_t vm_name;
+	uint8_t vcpu;
+	uint8_t core;
+};
+
+static void
+cmd_set_pcpu_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_pcpu_result *res = parsed_result;
+
+	if (set_pcpu(res->vm_name, res->vcpu, res->core) == 0)
+		cmdline_printf(cl, "Pinned vCPU(%"PRIu8") to pCPU core "
+				"%"PRIu8"\n", res->vcpu, res->core);
+	else
+		cmdline_printf(cl, "Unable to pin vCPU(%"PRIu8") to pCPU core "
+				"%"PRIu8"\n", res->vcpu, res->core);
+}
+
+cmdline_parse_token_string_t cmd_set_pcpu =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_result,
+				set_pcpu, "set_pcpu");
+cmdline_parse_token_string_t cmd_set_pcpu_vm_name =
+		TOKEN_STRING_INITIALIZER(struct cmd_set_pcpu_result,
+				vm_name, NULL);
+cmdline_parse_token_num_t set_pcpu_vcpu =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_result,
+				vcpu, UINT8);
+cmdline_parse_token_num_t set_pcpu_core =
+		TOKEN_NUM_INITIALIZER(struct cmd_set_pcpu_result,
+				core, UINT8);
+
+
+cmdline_parse_inst_t cmd_set_pcpu_set = {
+		.f = cmd_set_pcpu_parsed,
+		.data = NULL,
+		.help_str = "set_pcpu <vm_name> <vcpu> <pcpu>, Set the binding "
+				"of Virtual CPU on VM to the Physical CPU.",
+				.tokens = {
+						(void *)&cmd_set_pcpu,
+						(void *)&cmd_set_pcpu_vm_name,
+						(void *)&set_pcpu_vcpu,
+						(void *)&set_pcpu_core,
+						NULL,
+		},
+};
+
+struct cmd_vm_op_result {
+	cmdline_fixed_string_t op_vm;
+	cmdline_fixed_string_t vm_name;
+};
+
+static void
+cmd_vm_op_parsed(void *parsed_result, struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_vm_op_result *res = parsed_result;
+
+	if (!strcmp(res->op_vm, "add_vm")) {
+		if (add_vm(res->vm_name) < 0)
+			cmdline_printf(cl, "Unable to add VM '%s'\n", res->vm_name);
+	} else if (remove_vm(res->vm_name) < 0)
+		cmdline_printf(cl, "Unable to remove VM '%s'\n", res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_vm_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			op_vm, "add_vm#rm_vm");
+cmdline_parse_token_string_t cmd_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_vm_op_result,
+			vm_name, NULL);
+
+cmdline_parse_inst_t cmd_vm_op_set = {
+	.f = cmd_vm_op_parsed,
+	.data = NULL,
+	.help_str = "add_vm|rm_vm <name>, add a VM for "
+			"subsequent operations with the CLI or remove a previously added "
+			"VM from the VM Power Manager",
+	.tokens = {
+		(void *)&cmd_vm_op,
+		(void *)&cmd_vm_name,
+		NULL,
+	},
+};
+
+/* *** VM channel operations *** */
+struct cmd_channels_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+};
+static void
+cmd_channels_op_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num, i;
+	int channels_added;
+	unsigned channel_list[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_op_result *res = parsed_result;
+
+	if (!strcmp(res->channel_list, "all")) {
+		channels_added = add_all_channels(res->vm_name);
+		cmdline_printf(cl, "Added %d channels for VM '%s'\n",
+				channels_added, res->vm_name);
+		return;
+	}
+
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (tail_ptr == token) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "Channel number '%u' exceeds the maximum number "
+					"of allowable channels(%u) for VM '%s'\n", channel_num,
+					CHANNEL_CMDS_MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	for (i = 0; i < num_channels; i++)
+		cmdline_printf(cl, "[%u]: Adding channel %u\n", i, channel_list[i]);
+
+	channels_added = add_channels(res->vm_name, channel_list,
+			num_channels);
+	cmdline_printf(cl, "Enabled %d channels for '%s'\n", channels_added,
+			res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+				op, "add_channels");
+cmdline_parse_token_string_t cmd_channels_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_op_result,
+			channel_list, NULL);
+
+cmdline_parse_inst_t cmd_channels_op_set = {
+	.f = cmd_channels_op_parsed,
+	.data = NULL,
+	.help_str = "add_channels <vm_name> <list>|all, add "
+			"communication channels for the specified VM, the "
+			"virtio channels must be enabled in the VM "
+			"configuration(qemu/libvirt) and the associated VM must be active. "
+			"<list> is a comma-separated list of channel numbers to add, using "
+			"the keyword 'all' will attempt to add all channels for the VM",
+	.tokens = {
+		(void *)&cmd_channels_op,
+		(void *)&cmd_channels_vm_name,
+		(void *)&cmd_channels_list,
+		NULL,
+	},
+};
+
+struct cmd_channels_status_op_result {
+	cmdline_fixed_string_t op;
+	cmdline_fixed_string_t vm_name;
+	cmdline_fixed_string_t channel_list;
+	cmdline_fixed_string_t status;
+};
+
+static void
+cmd_channels_status_op_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	unsigned num_channels = 0, channel_num;
+	int changed;
+	unsigned channel_list[CHANNEL_CMDS_MAX_VM_CHANNELS];
+	char *token, *remaining, *tail_ptr;
+	struct cmd_channels_status_op_result *res = parsed_result;
+	enum channel_status status;
+
+	if (!strcmp(res->status, "enabled"))
+		status = CHANNEL_MGR_CHANNEL_CONNECTED;
+	else
+		status = CHANNEL_MGR_CHANNEL_DISABLED;
+
+	if (!strcmp(res->channel_list, "all")) {
+		changed = set_channel_status_all(res->vm_name, status);
+		cmdline_printf(cl, "Updated status of %d channels "
+				"for VM '%s'\n", changed, res->vm_name);
+		return;
+	}
+	remaining = res->channel_list;
+	while (1) {
+		if (remaining == NULL || remaining[0] == '\0')
+			break;
+		token = strsep(&remaining, ",");
+		if (token == NULL)
+			break;
+		errno = 0;
+		channel_num = (unsigned)strtol(token, &tail_ptr, 10);
+		if ((errno != 0) || (tail_ptr == token) || (*tail_ptr != '\0'))
+			break;
+
+		if (channel_num >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+			cmdline_printf(cl, "%u exceeds the maximum number of allowable "
+					"channels(%u) for VM '%s'\n", channel_num,
+					CHANNEL_CMDS_MAX_VM_CHANNELS, res->vm_name);
+			return;
+		}
+		channel_list[num_channels++] = channel_num;
+	}
+	changed = set_channel_status(res->vm_name, channel_list, num_channels,
+			status);
+	cmdline_printf(cl, "Updated status of %d channels "
+					"for VM '%s'\n", changed, res->vm_name);
+}
+
+cmdline_parse_token_string_t cmd_channels_status_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+				op, "set_channel_status");
+cmdline_parse_token_string_t cmd_channels_status_vm_name =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			vm_name, NULL);
+cmdline_parse_token_string_t cmd_channels_status_list =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			channel_list, NULL);
+cmdline_parse_token_string_t cmd_channels_status =
+	TOKEN_STRING_INITIALIZER(struct cmd_channels_status_op_result,
+			status, "enabled#disabled");
+
+cmdline_parse_inst_t cmd_channels_status_op_set = {
+	.f = cmd_channels_status_op_parsed,
+	.data = NULL,
+	.help_str = "set_channel_status <vm_name> <list>|all enabled|disabled, "
+			" enable or disable the communication channels in "
+			"list(comma-separated) for the specified VM, alternatively "
+			"list can be replaced with keyword 'all'. "
+			"Disabled channels will still receive packets on the host, "
+			"however the commands they specify will be ignored. "
+			"Set status to 'enabled' to begin processing requests again.",
+	.tokens = {
+		(void *)&cmd_channels_status_op,
+		(void *)&cmd_channels_status_vm_name,
+		(void *)&cmd_channels_status_list,
+		(void *)&cmd_channels_status,
+		NULL,
+	},
+};
+
+/* *** CPU Frequency operations *** */
+struct cmd_show_cpu_freq_mask_result {
+	cmdline_fixed_string_t show_cpu_freq_mask;
+	uint64_t core_mask;
+};
+
+static void
+cmd_show_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_mask_result *res = parsed_result;
+	unsigned i;
+	uint64_t mask = res->core_mask;
+	uint32_t freq;
+
+	for (i = 0; mask; mask &= ~(1ULL << i++)) {
+		if ((mask >> i) & 1) {
+			freq = power_manager_get_current_frequency(i);
+			if (freq > 0)
+				cmdline_printf(cl, "Core %u: %"PRIu32"\n", i, freq);
+		}
+	}
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			show_cpu_freq_mask, "show_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_show_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_mask_result,
+			core_mask, UINT64);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_mask_set = {
+	.f = cmd_show_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "show_cpu_freq_mask <mask>, Get the current frequency for each "
+			"core specified in the mask",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq_mask,
+		(void *)&cmd_show_cpu_freq_mask_core_mask,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_mask_result {
+	cmdline_fixed_string_t set_cpu_freq_mask;
+	uint64_t core_mask;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
+			__attribute__((unused)) void *data)
+{
+	struct cmd_set_cpu_freq_mask_result *res = parsed_result;
+	int ret = -1;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_mask_up(res->core_mask);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_mask_down(res->core_mask);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_mask_min(res->core_mask);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_mask_max(res->core_mask);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core_mask(0x%"PRIx64") '%s' , not "
+				"all cores specified have been scaled\n",
+				res->core_mask, res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			set_cpu_freq_mask, "set_cpu_freq_mask");
+cmdline_parse_token_num_t cmd_set_cpu_freq_mask_core_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			core_mask, UINT64);
+cmdline_parse_token_string_t cmd_set_cpu_freq_mask_result =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_mask_set = {
+	.f = cmd_set_cpu_freq_mask_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq_mask <core_mask> <up|down|min|max>, Set the "
+			"current frequency for the cores specified in <core_mask> by "
+			"scaling each up/down/min/max.",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq_mask,
+		(void *)&cmd_set_cpu_freq_mask_core_mask,
+		(void *)&cmd_set_cpu_freq_mask_result,
+		NULL,
+	},
+};
+
+
+
+struct cmd_show_cpu_freq_result {
+	cmdline_fixed_string_t show_cpu_freq;
+	uint8_t core_num;
+};
+
+static void
+cmd_show_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_show_cpu_freq_result *res = parsed_result;
+	uint32_t curr_freq = power_manager_get_current_frequency(res->core_num);
+
+	if (curr_freq == 0) {
+		cmdline_printf(cl, "Unable to get frequency for core %u\n",
+				res->core_num);
+		return;
+	}
+	cmdline_printf(cl, "Core %u frequency: %"PRIu32"\n", res->core_num,
+			curr_freq);
+}
+
+cmdline_parse_token_string_t cmd_show_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_cpu_freq_result,
+			show_cpu_freq, "show_cpu_freq");
+
+cmdline_parse_token_num_t cmd_show_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_cpu_freq_result,
+			core_num, UINT8);
+
+cmdline_parse_inst_t cmd_show_cpu_freq_set = {
+	.f = cmd_show_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "Get the current frequency for the specified core",
+	.tokens = {
+		(void *)&cmd_show_cpu_freq,
+		(void *)&cmd_show_cpu_freq_core_num,
+		NULL,
+	},
+};
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t core_num;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = power_manager_scale_core_up(res->core_num);
+	else if (!strcmp(res->cmd , "down"))
+		ret = power_manager_scale_core_down(res->core_num);
+	else if (!strcmp(res->cmd , "min"))
+		ret = power_manager_scale_core_min(res->core_num);
+	else if (!strcmp(res->cmd , "max"))
+		ret = power_manager_scale_core_max(res->core_num);
+	if (ret < 0) {
+		cmdline_printf(cl, "Error scaling core(%u) '%s'\n", res->core_num,
+				res->cmd);
+	}
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			core_num, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_vm_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_op_set,
+		(cmdline_parse_inst_t *)&cmd_channels_status_op_set,
+		(cmdline_parse_inst_t *)&cmd_show_vm_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_mask_set,
+		(cmdline_parse_inst_t *)&cmd_show_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		(cmdline_parse_inst_t *)&cmd_set_pcpu_mask_set,
+		(cmdline_parse_inst_t *)&cmd_set_pcpu_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+
+	cl = cmdline_stdin_new(main_ctx, "vmpower> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/vm_power_cli.h b/examples/vm_power_manager/vm_power_cli.h
new file mode 100644
index 0000000..deccd51
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.h
@@ -0,0 +1,47 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.7.4.1

* [dpdk-dev] [PATCH v6 03/10] CPU Frequency Power Management(Host).
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 01/10] Channel Manager and Monitor for VM Power Management(Host) Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 02/10] VM Power Management CLI(Host) Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 04/10] VM Power Management application and Makefile Pablo de Lara
                             ` (7 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

A wrapper around librte_power (using ACPI cpufreq) that adds locking around
the non-thread-safe library, so that frequency changes by core mask or core
number can be requested from both the CLI thread and the epoll monitor thread.
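
As an illustration of the core-mask handling described above, here is a
minimal, self-contained sketch (not part of this patch) of walking a 64-bit
core mask. MAX_CPUS and collect_cores() are hypothetical names chosen for the
example; MAX_CPUS only mirrors POWER_MGR_MAX_CPUS. The sketch assumes the bit
test and the bit clear must both use a 64-bit constant, since a plain
"1 << i" is an int shift and is not valid for cores 31 to 63.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 64 /* hypothetical; mirrors POWER_MGR_MAX_CPUS */

/*
 * Hypothetical helper: store the index of every core that is set in both
 * core_mask and enabled into out[], and return how many were stored.
 */
static unsigned
collect_cores(uint64_t core_mask, uint64_t enabled, unsigned out[MAX_CPUS])
{
	unsigned i, n = 0;

	/* Stop early once every requested bit has been consumed. */
	for (i = 0; i < MAX_CPUS && core_mask != 0; i++) {
		uint64_t bit = UINT64_C(1) << i; /* 64-bit wide shift */

		if (!(core_mask & bit))
			continue;
		core_mask &= ~bit;
		if (enabled & bit)
			out[n++] = i;
	}
	assert(n <= MAX_CPUS);
	return n;
}
```

In the wrapper itself, each selected core would then be scaled while holding
that core's spinlock.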

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/power_manager.c |  253 +++++++++++++++++++++++++++++
 examples/vm_power_manager/power_manager.h |  188 +++++++++++++++++++++
 2 files changed, 441 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
new file mode 100644
index 0000000..60da96c
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.c
@@ -0,0 +1,253 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <sys/un.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <dirent.h>
+#include <errno.h>
+
+#include <sys/types.h>
+
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_power.h>
+#include <rte_spinlock.h>
+
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+
+#define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \
+	if (core_num >= POWER_MGR_MAX_CPUS) \
+		return -1; \
+	if (!(global_enabled_cpus & (1ULL << core_num))) \
+		return -1; \
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
+	ret = rte_power_freq_##DIRECTION(core_num); \
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl); \
+} while (0)
+
+#define POWER_SCALE_MASK(DIRECTION, core_mask, ret) do { \
+	int i; \
+	for (i = 0; core_mask; core_mask &= ~(1ULL << i++)) { \
+		if ((core_mask >> i) & 1) { \
+			if (!(global_enabled_cpus & (1ULL << i))) \
+				continue; \
+			rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
+			if (rte_power_freq_##DIRECTION(i) != 1) \
+				ret = -1; \
+			rte_spinlock_unlock(&global_core_freq_info[i].power_sl); \
+		} \
+	} \
+} while (0)
+
+struct freq_info {
+	rte_spinlock_t power_sl;
+	uint32_t freqs[RTE_MAX_LCORE_FREQS];
+	unsigned num_freqs;
+} __rte_cache_aligned;
+
+static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
+
+static uint64_t global_enabled_cpus;
+
+#define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
+
+static unsigned
+set_host_cpus_mask(void)
+{
+	char path[PATH_MAX];
+	unsigned i;
+	unsigned num_cpus = 0;
+
+	for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
+		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
+		if (access(path, F_OK) == 0) {
+			global_enabled_cpus |= 1ULL << i;
+			num_cpus++;
+		} else
+			return num_cpus;
+	}
+	return num_cpus;
+}
+
+int
+power_manager_init(void)
+{
+	unsigned i, num_cpus;
+	uint64_t cpu_mask;
+	int ret = 0;
+
+	num_cpus = set_host_cpus_mask();
+	if (num_cpus == 0) {
+		RTE_LOG(ERR, POWER_MANAGER, "Unable to detect host CPUs, please "
+			"ensure that sufficient privileges exist to inspect sysfs\n");
+		return -1;
+	}
+	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+	cpu_mask = global_enabled_cpus;
+	for (i = 0; cpu_mask; cpu_mask &= ~(1ULL << i++)) {
+		if (rte_power_init(i) < 0 || rte_power_freqs(i,
+				global_core_freq_info[i].freqs,
+				RTE_MAX_LCORE_FREQS) == 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to initialize power manager "
+					"for core %u\n", i);
+			global_enabled_cpus &= ~(1ULL << i);
+			num_cpus--;
+			ret = -1;
+		}
+		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+	}
+	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs, enabled core mask:"
+					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	return ret;
+
+}
+
+uint32_t
+power_manager_get_current_frequency(unsigned core_num)
+{
+	uint32_t freq, index;
+
+	if (core_num >= POWER_MGR_MAX_CPUS) {
+		RTE_LOG(ERR, POWER_MANAGER, "Core(%u) is out of range 0...%d\n",
+				core_num, POWER_MGR_MAX_CPUS-1);
+		return 0;
+	}
+	if (!(global_enabled_cpus & (1ULL << core_num)))
+		return 0;
+
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
+	index = rte_power_get_freq(core_num);
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl);
+	if (index >= RTE_MAX_LCORE_FREQS)
+		freq = 0;
+	else
+		freq = global_core_freq_info[core_num].freqs[index];
+
+	return freq;
+}
+
+int
+power_manager_exit(void)
+{
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1ULL << i++)) {
+		if (rte_power_exit(i) < 0) {
+			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+					"for core %u\n", i);
+			ret = -1;
+		}
+	}
+	global_enabled_cpus = 0;
+	return ret;
+}
+
+int
+power_manager_scale_mask_up(uint64_t core_mask)
+{
+	int ret = 0;
+
+	POWER_SCALE_MASK(up, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_down(uint64_t core_mask)
+{
+	int ret = 0;
+
+	POWER_SCALE_MASK(down, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_min(uint64_t core_mask)
+{
+	int ret = 0;
+
+	POWER_SCALE_MASK(min, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_mask_max(uint64_t core_mask)
+{
+	int ret = 0;
+
+	POWER_SCALE_MASK(max, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_up(unsigned core_num)
+{
+	int ret = 0;
+
+	POWER_SCALE_CORE(up, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_down(unsigned core_num)
+{
+	int ret = 0;
+
+	POWER_SCALE_CORE(down, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_min(unsigned core_num)
+{
+	int ret = 0;
+
+	POWER_SCALE_CORE(min, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_scale_core_max(unsigned core_num)
+{
+	int ret = 0;
+
+	POWER_SCALE_CORE(max, core_num, ret);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
new file mode 100644
index 0000000..1b45bab
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.h
@@ -0,0 +1,188 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef POWER_MANAGER_H_
+#define POWER_MANAGER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Maximum number of CPUS to manage */
+#define POWER_MGR_MAX_CPUS 64
+/**
+ * Initialize power management.
+ * Initializes resources and verifies the number of CPUs on the system.
+ * Wraps librte_power int rte_power_init(unsigned lcore_id);
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_init(void);
+
+/**
+ * Exit power management. Must be called prior to exiting the application.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_exit(void);
+
+/**
+ * Scale up the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_up(uint64_t core_mask);
+
+/**
+ * Scale down the frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_down(uint64_t core_mask);
+
+/**
+ * Scale to the minimum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_min(uint64_t core_mask);
+
+/**
+ * Scale to the maximum frequency of the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_manager_scale_mask_max(uint64_t core_mask);
+
+/**
+ * Scale up frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success, 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_up(unsigned core_num);
+
+/**
+ * Scale down frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_down(unsigned core_num);
+
+/**
+ * Scale to minimum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_min(unsigned core_num);
+
+/**
+ * Scale to maximum frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_max(unsigned core_num);
+
+/**
+ * Get the current frequency of the core specified by core_num
+ *
+ * @param core_num
+ *  The core number to get the current frequency
+ *
+ * @return
+ *  - 0  on error
+ *  - >0 for current frequency.
+ */
+uint32_t power_manager_get_current_frequency(unsigned core_num);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* POWER_MANAGER_H_ */
-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v6 04/10] VM Power Management application and Makefile.
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
                             ` (2 preceding siblings ...)
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 03/10] CPU Frequency Power Management(Host) Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 05/10] VM Power Management CLI(Guest) Pablo de Lara
                             ` (6 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Launches the CLI thread and the monitor thread and initialises resources.
Requires a minimum of two lcores to run; any additional cores specified by the
EAL core mask are not used.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/Makefile |   57 +++++++++++++++++
 examples/vm_power_manager/main.c   |  117 ++++++++++++++++++++++++++++++++++++
 examples/vm_power_manager/main.h   |   52 ++++++++++++++++
 3 files changed, 226 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
new file mode 100644
index 0000000..b0a1037
--- /dev/null
+++ b/examples/vm_power_manager/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = vm_power_mgr
+
+# all source are stored in SRCS-y
+SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
+SRCS-y += channel_monitor.c
+
+CFLAGS += -O3 -lvirt -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
new file mode 100644
index 0000000..875274e
--- /dev/null
+++ b/examples/vm_power_manager/main.c
@@ -0,0 +1,117 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <errno.h>
+
+#include <sys/queue.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+#include <rte_launch.h>
+#include <rte_log.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "vm_power_cli.h"
+#include "main.h"
+
+static int
+run_monitor(__attribute__((unused)) void *arg)
+{
+	if (channel_monitor_init() < 0) {
+		printf("Unable to initialize channel monitor\n");
+		return -1;
+	}
+	run_channel_monitor();
+	return 0;
+}
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	channel_monitor_exit();
+	channel_manager_exit();
+	power_manager_exit();
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	lcore_id = rte_get_next_lcore(-1, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
+				"application\n");
+		return 0;
+	}
+	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
+
+	if (power_manager_init() < 0) {
+		printf("Unable to initialize power manager\n");
+		return -1;
+	}
+	if (channel_manager_init(CHANNEL_MGR_DEFAULT_HV_PATH) < 0) {
+		printf("Unable to initialize channel manager\n");
+		return -1;
+	}
+	run_cli(NULL);
+
+	rte_eal_mp_wait_lcore();
+	return 0;
+}
diff --git a/examples/vm_power_manager/main.h b/examples/vm_power_manager/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 97+ messages in thread

* [dpdk-dev] [PATCH v6 05/10] VM Power Management CLI(Guest).
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
                             ` (3 preceding siblings ...)
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 04/10] VM Power Management application and Makefile Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 06/10] VM communication channels for VM Power Management(Guest) Pablo de Lara
                             ` (5 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Provides a small sample application (guest_vm_power_mgr) to run on a VM.
The application is run by providing a core mask (-c) and the number of memory
channels (-n). Each lcore set in the core mask corresponds to one lcore channel
that the application attempts to open. A maximum of 64 channels per VM is
allowed. The channels must be monitored by the host.
After successful initialisation, a CPU frequency command can be sent to the
host using:
set_cpu_freq <lcore_num> <up|down|min|max>
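
For illustration only, the two arguments of the command above could be
validated along the following lines. This is a sketch under stated
assumptions, not the parser used in this patch: the names scale_dir,
parse_scale_dir(), parse_lcore() and MAX_CHANNELS are hypothetical, and the
limit of 64 is taken from the channel limit mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHANNELS 64 /* hypothetical; one channel per lcore, max 64 */

/* Hypothetical enumeration of the four supported scaling requests. */
enum scale_dir { SCALE_UP, SCALE_DOWN, SCALE_MIN, SCALE_MAX, SCALE_INVALID };

/* Map the <up|down|min|max> keyword to a scaling request. */
static enum scale_dir
parse_scale_dir(const char *s)
{
	if (s == NULL)
		return SCALE_INVALID;
	if (strcmp(s, "up") == 0)
		return SCALE_UP;
	if (strcmp(s, "down") == 0)
		return SCALE_DOWN;
	if (strcmp(s, "min") == 0)
		return SCALE_MIN;
	if (strcmp(s, "max") == 0)
		return SCALE_MAX;
	return SCALE_INVALID;
}

/* Parse <lcore_num>; returns 0 and sets *lcore on success, -1 on error. */
static int
parse_lcore(const char *s, unsigned *lcore)
{
	char *end;
	unsigned long v;

	assert(lcore != NULL);
	if (s == NULL || *s == '\0')
		return -1;
	errno = 0;
	v = strtoul(s, &end, 10);
	/* Reject overflow, trailing characters and out-of-range values. */
	if (errno != 0 || *end != '\0' || v >= MAX_CHANNELS)
		return -1;
	*lcore = (unsigned)v;
	return 0;
}
```

A request would only be forwarded to the host once both arguments have been
accepted.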

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 examples/vm_power_manager/guest_cli/Makefile       |   56 +++++++
 examples/vm_power_manager/guest_cli/main.c         |   88 +++++++++++
 examples/vm_power_manager/guest_cli/main.h         |   52 +++++++
 .../guest_cli/vm_power_cli_guest.c                 |  156 ++++++++++++++++++++
 .../guest_cli/vm_power_cli_guest.h                 |   55 +++++++
 5 files changed, 407 insertions(+), 0 deletions(-)
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
new file mode 100644
index 0000000..5507270
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = guest_vm_power_mgr
+
+# all source are stored in SRCS-y
+SRCS-y := main.c vm_power_cli_guest.c
+
+CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
new file mode 100644
index 0000000..f8e6a7d
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -0,0 +1,88 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/epoll.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+*/
+#include <signal.h>
+
+#include <rte_lcore.h>
+#include <rte_power.h>
+#include <rte_debug.h>
+#include <rte_config.h>
+
+#include "vm_power_cli_guest.h"
+#include "main.h"
+
+static void
+sig_handler(int signo)
+{
+	printf("Received signal %d, exiting...\n", signo);
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	unsigned lcore_id;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	rte_power_set_env(PM_ENV_KVM_VM);
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_init(lcore_id);
+	}
+	run_cli(NULL);
+
+	return 0;
+}
diff --git a/examples/vm_power_manager/guest_cli/main.h b/examples/vm_power_manager/guest_cli/main.h
new file mode 100644
index 0000000..7b4c3da
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
+
+#ifndef MAIN_H_
+#define MAIN_H_
+
+
+
+#endif /* MAIN_H_ */
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
new file mode 100644
index 0000000..75e544a
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -0,0 +1,156 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <stdint.h>
+#include <string.h>
+#include <stdio.h>
+#include <termios.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+#include <rte_config.h>
+#include <rte_log.h>
+#include <rte_lcore.h>
+
+#include <rte_power.h>
+
+#include "vm_power_cli_guest.h"
+
+
+#define CHANNEL_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+			    struct cmdline *cl,
+			    __attribute__((unused)) void *data)
+{
+	unsigned lcore_id;
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		rte_power_exit(lcore_id);
+	}
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "close the application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/* *** VM operations *** */
+
+struct cmd_set_cpu_freq_result {
+	cmdline_fixed_string_t set_cpu_freq;
+	uint8_t lcore_id;
+	cmdline_fixed_string_t cmd;
+};
+
+static void
+cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_set_cpu_freq_result *res = parsed_result;
+
+	if (!strcmp(res->cmd , "up"))
+		ret = rte_power_freq_up(res->lcore_id);
+	else if (!strcmp(res->cmd , "down"))
+		ret = rte_power_freq_down(res->lcore_id);
+	else if (!strcmp(res->cmd , "min"))
+		ret = rte_power_freq_min(res->lcore_id);
+	else if (!strcmp(res->cmd , "max"))
+		ret = rte_power_freq_max(res->lcore_id);
+	if (ret != 1)
+		cmdline_printf(cl, "Error sending message: %s\n", strerror(ret));
+}
+
+cmdline_parse_token_string_t cmd_set_cpu_freq =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			set_cpu_freq, "set_cpu_freq");
+cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
+	TOKEN_NUM_INITIALIZER(struct cmd_set_cpu_freq_result,
+			lcore_id, UINT8);
+cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
+			cmd, "up#down#min#max");
+
+cmdline_parse_inst_t cmd_set_cpu_freq_set = {
+	.f = cmd_set_cpu_freq_parsed,
+	.data = NULL,
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
+			"frequency for the specified core by scaling up/down/min/max",
+	.tokens = {
+		(void *)&cmd_set_cpu_freq,
+		(void *)&cmd_set_cpu_freq_core_num,
+		(void *)&cmd_set_cpu_freq_cmd_cmd,
+		NULL,
+	},
+};
+
+cmdline_parse_ctx_t main_ctx[] = {
+		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
+		NULL,
+};
+
+void
+run_cli(__attribute__((unused)) void *arg)
+{
+	struct cmdline *cl;
+
+	cl = cmdline_stdin_new(main_ctx, "vmpower(guest)> ");
+	if (cl == NULL)
+		return;
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+}
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
new file mode 100644
index 0000000..0c4bdd5
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -0,0 +1,55 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef VM_POWER_CLI_H_
+#define VM_POWER_CLI_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "channel_commands.h"
+
+int guest_channel_host_connect(unsigned lcore_id);
+
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+void run_cli(__attribute__((unused)) void *arg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* VM_POWER_CLI_H_ */
-- 
1.7.4.1


* [dpdk-dev] [PATCH v6 06/10] VM communication channels for VM Power Management(Guest).
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
                             ` (4 preceding siblings ...)
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 05/10] VM Power Management CLI(Guest) Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 07/10] librte_power common interface for Guest and Host Pablo de Lara
                             ` (4 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Allows for the opening of Virtio-Serial devices on a VM, through which a DPDK
application can send packets to the host-based monitor. The packet format is
specified in channel_commands.h.
Each device appears as a serial device at the path
/dev/virtio-ports/virtio.serial.port.<agent_type>.<lcore_num>, where each lcore
in a DPDK application has exclusive access to a device/channel.
Each channel is opened in non-blocking mode; after a successful open, a test
packet is sent to the host to ensure the host side is monitoring.
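
The open-then-send sequence described above can be sketched in isolation as
follows. This is a minimal sketch, not the code in this patch: the helper names
set_nonblocking() and write_all() are hypothetical, and any writable descriptor
stands in for the Virtio-Serial device. As in guest_channel_send_msg(), a
failed write is reported by returning its errno value.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: switch an already-open descriptor to non-blocking
 * mode. Returns 0 on success, -1 on failure (errno is set by fcntl). */
static int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: write the whole buffer, retrying on EINTR and
 * continuing after short writes. Returns 0 on success, or the errno value
 * of the write() that failed. */
static int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	assert(buf != NULL);
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return errno;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

Because the descriptor is non-blocking, a full channel surfaces as EAGAIN from
write_all() rather than stalling the lcore; the caller decides whether to retry.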

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 lib/librte_power/guest_channel.c |  162 ++++++++++++++++++++++++++++++++++++++
 lib/librte_power/guest_channel.h |   89 +++++++++++++++++++++
 2 files changed, 251 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h

diff --git a/lib/librte_power/guest_channel.c b/lib/librte_power/guest_channel.c
new file mode 100644
index 0000000..2295665
--- /dev/null
+++ b/lib/librte_power/guest_channel.c
@@ -0,0 +1,162 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+static int global_fds[RTE_MAX_LCORE];
+
+int
+guest_channel_host_connect(const char *path, unsigned lcore_id)
+{
+	int flags, ret;
+	struct channel_packet pkt;
+	char fd_path[PATH_MAX];
+	int fd = -1;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+	/* check if path is already open */
+	if (global_fds[lcore_id] != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is already open with fd %d\n",
+				lcore_id, global_fds[lcore_id]);
+		return -1;
+	}
+
+	snprintf(fd_path, PATH_MAX, "%s.%u", path, lcore_id);
+	RTE_LOG(INFO, GUEST_CHANNEL, "Opening channel '%s' for lcore %u\n",
+			fd_path, lcore_id);
+	fd = open(fd_path, O_RDWR);
+	if (fd < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Unable to connect to '%s' with error "
+				"%s\n", fd_path, strerror(errno));
+		return -1;
+	}
+
+	flags = fcntl(fd, F_GETFL, 0);
+	if (flags < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on fcntl get flags for file %s\n",
+				fd_path);
+		goto error;
+	}
+
+	flags |= O_NONBLOCK;
+	if (fcntl(fd, F_SETFL, flags) < 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Failed on setting non-blocking mode for "
+				"file %s\n", fd_path);
+		goto error;
+	}
+	/* QEMU needs a delay after connection */
+	sleep(1);
+
+	/* Send a test packet, this command is ignored by the host, but a successful
+	 * send indicates that the host endpoint is monitoring.
+	 */
+	pkt.command = CPU_POWER_CONNECT;
+	global_fds[lcore_id] = fd;
+	ret = guest_channel_send_msg(&pkt, lcore_id);
+	if (ret != 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Error on channel '%s' communications "
+				"test: %s\n", fd_path, strerror(ret));
+		goto error;
+	}
+	RTE_LOG(INFO, GUEST_CHANNEL, "Channel '%s' is now connected\n", fd_path);
+	return 0;
+error:
+	close(fd);
+	global_fds[lcore_id] = 0;
+	return -1;
+}
+
+int
+guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id)
+{
+	int ret, buffer_len = sizeof(*pkt);
+	void *buffer = pkt;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return -1;
+	}
+
+	if (global_fds[lcore_id] == 0) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel is not connected\n");
+		return -1;
+	}
+	while (buffer_len > 0) {
+		ret = write(global_fds[lcore_id], buffer, buffer_len);
+		if (ret == buffer_len)
+			return 0;
+		if (ret == -1) {
+			if (errno == EINTR)
+				continue;
+			return errno;
+		}
+		buffer = (char *)buffer + ret;
+		buffer_len -= ret;
+	}
+	return 0;
+}
+
+void
+guest_channel_host_disconnect(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+				lcore_id, RTE_MAX_LCORE-1);
+		return;
+	}
+	if (global_fds[lcore_id] == 0)
+		return;
+	close(global_fds[lcore_id]);
+	global_fds[lcore_id] = 0;
+}
diff --git a/lib/librte_power/guest_channel.h b/lib/librte_power/guest_channel.h
new file mode 100644
index 0000000..9e18af5
--- /dev/null
+++ b/lib/librte_power/guest_channel.h
@@ -0,0 +1,89 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#ifndef _GUEST_CHANNEL_H
+#define _GUEST_CHANNEL_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <channel_commands.h>
+
+/**
+ * Connect to the Virtio-Serial VM end-point located in path. It is
+ * thread safe for unique lcore_ids. This function must be called only once
+ * from each lcore.
+ *
+ * @param path
+ *  The path to the serial device on the filesystem
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int guest_channel_host_connect(const char *path, unsigned lcore_id);
+
+/**
+ * Disconnect from an already connected Virtio-Serial Endpoint.
+ *
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ */
+void guest_channel_host_disconnect(unsigned lcore_id);
+
+/**
+ * Send a message contained in pkt over the Virtio-Serial to the host endpoint.
+ *
+ * @param pkt
+ *  Pointer to a populated struct channel_packet
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on channel not connected.
+ *  - errno on write to channel error.
+ */
+int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.7.4.1


* [dpdk-dev] [PATCH v6 07/10] librte_power common interface for Guest and Host
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
                             ` (5 preceding siblings ...)
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 06/10] VM communication channels for VM Power Management(Guest) Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 08/10] Packet format for VM Power Management(Host and Guest) Pablo de Lara
                             ` (3 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
renaming of functions only.
Added rte_power_kvm_vm implementation to support Power Management from a VM.

librte_power now hides the implementation based on the environment used.
A new call, rte_power_set_env(), can explicitly set the environment; if it is
not called, auto-detection takes place.
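
The one-time environment selection can be sketched as a standalone model. This
is a sketch under stated assumptions, not the library code: C11 atomics stand
in for rte_atomic32_cmpset(), and the names set_env(), freq_up, acpi_freq_up()
and kvm_freq_up() are hypothetical. The stand-in back ends return 1 and 2 only
so that the selected one can be told apart.

```c
#include <stdatomic.h>
#include <stddef.h>

enum pm_env { ENV_NOT_SET, ENV_ACPI_CPUFREQ, ENV_KVM_VM };

typedef int (*freq_change_t)(unsigned lcore_id);

/* Stand-in back ends; the real ones write sysfs or send a channel packet. */
static int acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
static int kvm_freq_up(unsigned lcore_id) { (void)lcore_id; return 2; }

static freq_change_t freq_up;        /* public entry point, bound once */
static enum pm_env current_env = ENV_NOT_SET;
static atomic_uint env_configured;   /* 0 = not configured, 1 = configured */

static int
set_env(enum pm_env env)
{
	unsigned expected = 0;

	/* Only the first caller binds the back end; later calls are no-ops. */
	if (!atomic_compare_exchange_strong(&env_configured, &expected, 1))
		return 0;

	switch (env) {
	case ENV_ACPI_CPUFREQ:
		freq_up = acpi_freq_up;
		break;
	case ENV_KVM_VM:
		freq_up = kvm_freq_up;
		break;
	default:
		/* Roll back so that a later valid call can still succeed. */
		atomic_store(&env_configured, 0);
		return -1;
	}
	current_env = env;
	return 0;
}
```

Once an environment is bound, a second set_env() call with a different value
returns 0 without rebinding, which mirrors the compare-and-set guard in
rte_power_set_env().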

rte_power_kvm_vm is a subset of the librte_power APIs; the following is supported:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

All other APIs are unsupported and return -ENOTSUP.
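
For callers that must run on both host and guest, one way to cope with the
reduced API is to treat -ENOTSUP as a cue to fall back to a relative request.
The following is a minimal sketch with hypothetical stand-ins (guest_get_freq(),
guest_freq_up(), scale_up_portable()); it does not call librte_power, and the
return value 1 for a sent request is an assumption taken from the guest CLI,
which treats anything other than 1 as an error.

```c
#include <errno.h>
#include <stdio.h>

/* Stand-in for a query that the guest back end does not support. */
static int
guest_get_freq(unsigned lcore_id)
{
	(void)lcore_id;
	return -ENOTSUP;
}

/* Stand-in for a supported relative request; 1 models "request sent". */
static int
guest_freq_up(unsigned lcore_id)
{
	(void)lcore_id;
	return 1;
}

/* Hypothetical caller: returns 0 when the request was handled, -1 otherwise. */
static int
scale_up_portable(unsigned lcore_id)
{
	int idx = guest_get_freq(lcore_id);

	if (idx == -ENOTSUP) {
		/* No frequency table in the guest: issue a relative request. */
		return guest_freq_up(lcore_id) < 0 ? -1 : 0;
	}
	if (idx < 0)
		return -1;
	printf("lcore %u is at frequency index %d\n", lcore_id, idx);
	return 0;
}
```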

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 lib/librte_power/rte_power.c              |  540 ++++-------------------------
 lib/librte_power/rte_power.h              |  120 +++++--
 lib/librte_power/rte_power_acpi_cpufreq.c |  545 +++++++++++++++++++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h |  192 ++++++++++
 lib/librte_power/rte_power_common.h       |   39 ++
 lib/librte_power/rte_power_kvm_vm.c       |  136 +++++++
 lib/librte_power/rte_power_kvm_vm.h       |  179 ++++++++++
 7 files changed, 1249 insertions(+), 502 deletions(-)
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 856da9a..998ed1c 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -31,515 +31,113 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
-#include <stdio.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-#include <signal.h>
-#include <limits.h>
-
-#include <rte_memcpy.h>
 #include <rte_atomic.h>
 
 #include "rte_power.h"
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
 
-#ifdef RTE_LIBRTE_POWER_DEBUG
-#define POWER_DEBUG_TRACE(fmt, args...) do { \
-		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
-	} while (0)
-#else
-#define POWER_DEBUG_TRACE(fmt, args...)
-#endif
-
-#define FOPEN_OR_ERR_RET(f, retval) do { \
-	if ((f) == NULL) { \
-		RTE_LOG(ERR, POWER, "File not openned\n"); \
-		return (retval); \
-	} \
-} while(0)
-
-#define FOPS_OR_NULL_GOTO(ret, label) do { \
-	if ((ret) == NULL) { \
-		RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define FOPS_OR_ERR_GOTO(ret, label) do { \
-	if ((ret) < 0) { \
-		RTE_LOG(ERR, POWER, "File operations failed\n"); \
-		goto label; \
-	} \
-} while(0)
-
-#define STR_SIZE     1024
-#define POWER_CONVERT_TO_DECIMAL 10
+enum power_management_env global_default_env = PM_ENV_NOT_SET;
 
-#define POWER_GOVERNOR_USERSPACE "userspace"
-#define POWER_SYSFILE_GOVERNOR   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
-#define POWER_SYSFILE_AVAIL_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
-#define POWER_SYSFILE_SETSPEED   \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+volatile uint32_t global_env_cfg_status = 0;
 
-enum power_state {
-	POWER_IDLE = 0,
-	POWER_ONGOING,
-	POWER_USED,
-	POWER_UNKNOWN
-};
+/* function pointers */
+rte_power_freqs_t rte_power_freqs  = NULL;
+rte_power_get_freq_t rte_power_get_freq = NULL;
+rte_power_set_freq_t rte_power_set_freq = NULL;
+rte_power_freq_change_t rte_power_freq_up = NULL;
+rte_power_freq_change_t rte_power_freq_down = NULL;
+rte_power_freq_change_t rte_power_freq_max = NULL;
+rte_power_freq_change_t rte_power_freq_min = NULL;
 
-/**
- * Power info per lcore.
- */
-struct rte_power_info {
-	unsigned lcore_id;                   /**< Logical core id */
-	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
-	uint32_t nb_freqs;                   /**< number of available freqs */
-	FILE *f;                             /**< FD of scaling_setspeed */
-	char governor_ori[32];               /**< Original governor name */
-	uint32_t curr_idx;                   /**< Freq index in freqs array */
-	volatile uint32_t state;             /**< Power in use state */
-} __rte_cache_aligned;
-
-static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
-
-/**
- * It is to set specific freq for specific logical core, according to the index
- * of supported frequencies.
- */
-static int
-set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+int
+rte_power_set_env(enum power_management_env env)
 {
-	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
-			"should be less than %u\n", idx, pi->nb_freqs);
-		return -1;
-	}
-
-	/* Check if it is the same as current */
-	if (idx == pi->curr_idx)
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 0, 1) == 0) {
 		return 0;
-
-	POWER_DEBUG_TRACE("Freqency[%u] %u to be set for lcore %u\n",
-				idx, pi->freqs[idx], pi->lcore_id);
-	if (fseek(pi->f, 0, SEEK_SET) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to set file position indicator to 0 "
-			"for setting frequency for lcore %u\n", pi->lcore_id);
-		return -1;
 	}
-	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
-		RTE_LOG(ERR, POWER, "Fail to write new frequency for "
-					"lcore %u\n", pi->lcore_id);
+	if (env == PM_ENV_ACPI_CPUFREQ) {
+		rte_power_freqs = rte_power_acpi_cpufreq_freqs;
+		rte_power_get_freq = rte_power_acpi_cpufreq_get_freq;
+		rte_power_set_freq = rte_power_acpi_cpufreq_set_freq;
+		rte_power_freq_up = rte_power_acpi_cpufreq_freq_up;
+		rte_power_freq_down = rte_power_acpi_cpufreq_freq_down;
+		rte_power_freq_min = rte_power_acpi_cpufreq_freq_min;
+		rte_power_freq_max = rte_power_acpi_cpufreq_freq_max;
+	} else if (env == PM_ENV_KVM_VM) {
+		rte_power_freqs = rte_power_kvm_vm_freqs;
+		rte_power_get_freq = rte_power_kvm_vm_get_freq;
+		rte_power_set_freq = rte_power_kvm_vm_set_freq;
+		rte_power_freq_up = rte_power_kvm_vm_freq_up;
+		rte_power_freq_down = rte_power_kvm_vm_freq_down;
+		rte_power_freq_min = rte_power_kvm_vm_freq_min;
+		rte_power_freq_max = rte_power_kvm_vm_freq_max;
+	} else {
+		RTE_LOG(ERR, POWER, "Invalid Power Management Environment(%d) set\n",
+				env);
+		rte_power_unset_env();
 		return -1;
 	}
-	fflush(pi->f);
-	pi->curr_idx = idx;
-
-	return 1;
-}
-
-/**
- * It is to check the current scaling governor by reading sys file, and then
- * set it into 'userspace' if it is not by writing the sys file. The original
- * governor will be saved for rolling back.
- */
-static int
-power_set_governor_userspace(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if current governor is userspace */
-	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
-		sizeof(POWER_GOVERNOR_USERSPACE)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
-					"already userspace\n", pi->lcore_id);
-		goto out;
-	}
-	/* Save the original governor */
-	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
-
-	/* Write 'userspace' to the governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(POWER_GOVERNOR_USERSPACE, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
-			"set to user space successfully\n", pi->lcore_id);
-out:
-	fclose(f);
+	global_default_env = env;
+	return 0;
 
-	return ret;
 }
 
-/**
- * It is to get the available frequencies of the specific lcore by reading the
- * sys file.
- */
-static int
-power_get_available_freqs(struct rte_power_info *pi)
+void
+rte_power_unset_env(void)
 {
-	FILE *f;
-	int ret = -1, i, count;
-	char *p;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *freqs[RTE_MAX_LCORE_FREQS];
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
-								pi->lcore_id);
-	f = fopen(fullpath, "r");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Strip the line break if there is */
-	p = strchr(buf, '\n');
-	if (p != NULL)
-		*p = 0;
-
-	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
-	count = rte_strsplit(buf, sizeof(buf), freqs,
-				RTE_MAX_LCORE_FREQS, ' ');
-	if (count <= 0) {
-		RTE_LOG(ERR, POWER, "No available frequency in "
-			""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
-		goto out;
-	}
-	if (count >= RTE_MAX_LCORE_FREQS) {
-		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
-								count);
-		goto out;
-	}
-
-	/* Store the available frequncies into power context */
-	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
-		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
-								i, freqs[i]);
-		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
-					POWER_CONVERT_TO_DECIMAL);
-	}
-
-	ret = 0;
-	POWER_DEBUG_TRACE("%d frequencie(s) of lcore %u are available\n",
-						count, pi->lcore_id);
-out:
-	fclose(f);
-
-	return ret;
+	if (rte_atomic32_cmpset(&global_env_cfg_status, 1, 0) != 0)
+		global_default_env = PM_ENV_NOT_SET;
 }
 
-/**
- * It is to fopen the sys file for the future setting the lcore frequency.
- */
-static int
-power_init_for_setting_freq(struct rte_power_info *pi)
-{
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t i, freq;
-	char *s;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, -1);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
-	for (i = 0; i < pi->nb_freqs; i++) {
-		if (freq == pi->freqs[i]) {
-			pi->curr_idx = i;
-			pi->f = f;
-			return 0;
-		}
-	}
-
-out:
-	fclose(f);
-
-	return -1;
+enum power_management_env
+rte_power_get_env(void) {
+	return global_default_env;
 }
 
 int
 rte_power_init(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"in use\n", lcore_id);
-		return -1;
-	}
-
-	pi->lcore_id = lcore_id;
-	/* Check and set the governor */
-	if (power_set_governor_userspace(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
-						"userspace\n", lcore_id);
-		goto fail;
-	}
+	int ret = -1;
 
-	/* Get the available frequencies */
-	if (power_get_available_freqs(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ) {
+		return rte_power_acpi_cpufreq_init(lcore_id);
 	}
-
-	/* Init for setting lcore frequency */
-	if (power_init_for_setting_freq(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
-						"lcore %u\n", lcore_id);
-		goto fail;
+	if (global_default_env == PM_ENV_KVM_VM) {
+		return rte_power_kvm_vm_init(lcore_id);
 	}
-
-	/* Set freq to max by default */
-	if (rte_power_freq_max(lcore_id) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
-						"to max\n", lcore_id);
-		goto fail;
+	/* Auto detect Environment */
+	RTE_LOG(INFO, POWER, "Attempting to initialise ACPI cpufreq power "
+			"management...\n");
+	ret = rte_power_acpi_cpufreq_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+		goto out;
 	}
 
-	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
-					"power manamgement\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
-
-	return -1;
-}
-
-/**
- * It is to check the governor and then set the original governor back if
- * needed by writing the the sys file.
- */
-static int
-power_set_governor_original(struct rte_power_info *pi)
-{
-	FILE *f;
-	int ret = -1;
-	char buf[BUFSIZ];
-	char fullpath[PATH_MAX];
-	char *s;
-	int val;
-
-	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
-							pi->lcore_id);
-	f = fopen(fullpath, "rw+");
-	FOPEN_OR_ERR_RET(f, ret);
-
-	s = fgets(buf, sizeof(buf), f);
-	FOPS_OR_NULL_GOTO(s, out);
-
-	/* Check if the governor to be set is the same as current */
-	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
-		ret = 0;
-		POWER_DEBUG_TRACE("Power management governor of lcore %u "
-					"has already been set to %s\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(INFO, POWER, "Attempting to initialise VM power management...\n");
+	ret = rte_power_kvm_vm_init(lcore_id);
+	if (ret == 0) {
+		rte_power_set_env(PM_ENV_KVM_VM);
 		goto out;
 	}
-
-	/* Write back the original governor */
-	val = fseek(f, 0, SEEK_SET);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	val = fputs(pi->governor_ori, f);
-	FOPS_OR_ERR_GOTO(val, out);
-
-	ret = 0;
-	RTE_LOG(INFO, POWER, "Power manamgement governor of lcore %u "
-				"has been set back to %s successfully\n",
-					pi->lcore_id, pi->governor_ori);
+	RTE_LOG(ERR, POWER, "Unable to set Power Management Environment for lcore "
+			"%u\n", lcore_id);
 out:
-	fclose(f);
-
 	return ret;
 }
 
 int
 rte_power_exit(unsigned lcore_id)
 {
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Lcore id %u can not exceeds %u\n",
-					lcore_id, RTE_MAX_LCORE - 1U);
-		return -1;
-	}
-	pi = &lcore_power_info[lcore_id];
-	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
-								== 0) {
-		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
-						"not used\n", lcore_id);
-		return -1;
-	}
-
-	/* Close FD of setting freq */
-	fclose(pi->f);
-	pi->f = NULL;
-
-	/* Set the governor back to the original */
-	if (power_set_governor_original(pi) < 0) {
-		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
-					"to the original\n", lcore_id);
-		goto fail;
-	}
-
-	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
-				"'userspace' mode and been set back to the "
-						"original\n", lcore_id);
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
-
-	return 0;
-
-fail:
-	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+	if (global_default_env == PM_ENV_ACPI_CPUFREQ)
+		return rte_power_acpi_cpufreq_exit(lcore_id);
+	if (global_default_env == PM_ENV_KVM_VM)
+		return rte_power_kvm_vm_exit(lcore_id);
 
+	RTE_LOG(ERR, POWER, "Environment has not been set, unable to exit "
+				"gracefully\n");
 	return -1;
-}
-
-uint32_t
-rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
-		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
-		return 0;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (num < pi->nb_freqs) {
-		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
-		return 0;
-	}
-	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
-
-	return pi->nb_freqs;
-}
-
-uint32_t
-rte_power_get_freq(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return RTE_POWER_INVALID_FREQ_INDEX;
-	}
-
-	return lcore_power_info[lcore_id].curr_idx;
-}
-
-int
-rte_power_set_freq(unsigned lcore_id, uint32_t index)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
-}
-
-int
-rte_power_freq_down(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
 
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx + 1 == pi->nb_freqs)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx + 1);
 }
-
-int
-rte_power_freq_up(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-	if (pi->curr_idx == 0)
-		return 0;
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->curr_idx - 1);
-}
-
-int
-rte_power_freq_max(unsigned lcore_id)
-{
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(&lcore_power_info[lcore_id], 0);
-}
-
-int
-rte_power_freq_min(unsigned lcore_id)
-{
-	struct rte_power_info *pi;
-
-	if (lcore_id >= RTE_MAX_LCORE) {
-		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
-		return -1;
-	}
-
-	pi = &lcore_power_info[lcore_id];
-
-	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(pi, pi->nb_freqs - 1);
-}
-
diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
index 9c1419e..9338069 100644
--- a/lib/librte_power/rte_power.h
+++ b/lib/librte_power/rte_power.h
@@ -48,12 +48,48 @@
 extern "C" {
 #endif
 
-#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+/* Power Management Environment State */
+enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM};
 
 /**
- * Initialize power management for a specific lcore. It will check and set the
- * governor to userspace for the lcore, get the available frequencies, and
- * prepare to set new lcore frequency.
+ * Set the default power management implementation. If this is not called prior
+ * to rte_power_init(), then auto-detect of the environment will take place.
+ * It is not thread safe.
+ *
+ * @param env
+ *  The environment in which to initialise power management.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_set_env(enum power_management_env env);
+
+/**
+ * Unset the global environment configuration.
+ * This can only be called after all threads have completed.
+ *
+ * @param None.
+ *
+ * @return
+ *  None.
+ */
+void rte_power_unset_env(void);
+
+/**
+ * Get the default power management implementation.
+ *
+ * @param None.
+ *
+ * @return
+ *  power_management_env The configured environment.
+ */
+enum power_management_env rte_power_get_env(void);
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
  *
  * @param lcore_id
  *  lcore id.
@@ -65,8 +101,9 @@ extern "C" {
 int rte_power_init(unsigned lcore_id);
 
 /**
- * Exit power management on a specific lcore. It will set the governor to which
- * is before initialized.
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
  *
  * @param lcore_id
  *  lcore id.
@@ -78,11 +115,9 @@ int rte_power_init(unsigned lcore_id);
 int rte_power_exit(unsigned lcore_id);
 
 /**
- * Get the available frequencies of a specific lcore. The return value will be
- * the minimal one of the total number of available frequencies and the number
- * of buffer. The index of available frequencies used in other interfaces
- * should be in the range of 0 to this return value.
- * It should be protected outside of this function for threadsafe.
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -94,12 +129,15 @@ int rte_power_exit(unsigned lcore_id);
  * @return
  *  The number of available frequencies.
  */
-uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
+typedef uint32_t (*rte_power_freqs_t)(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+extern rte_power_freqs_t rte_power_freqs;
 
 /**
- * Return the current index of available frequencies of a specific lcore. It
- * will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)' if error.
- * It should be protected outside of this function for threadsafe.
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -107,12 +145,15 @@ uint32_t rte_power_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num);
  * @return
  *  The current index of available frequencies.
  */
-uint32_t rte_power_get_freq(unsigned lcore_id);
+typedef uint32_t (*rte_power_get_freq_t)(unsigned lcore_id);
+
+extern rte_power_get_freq_t rte_power_get_freq;
 
 /**
  * Set the new frequency for a specific lcore by indicating the index of
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
@@ -121,70 +162,87 @@ uint32_t rte_power_get_freq(unsigned lcore_id);
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_set_freq(unsigned lcore_id, uint32_t index);
+typedef int (*rte_power_set_freq_t)(unsigned lcore_id, uint32_t index);
+
+extern rte_power_set_freq_t rte_power_set_freq;
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned lcore_id);
 
 /**
  * Scale up the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_up(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_up;
 
 /**
  * Scale down the frequency of a specific lcore according to the available
  * frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_down(unsigned lcore_id);
+
+extern rte_power_freq_change_t rte_power_freq_down;
 
 /**
  * Scale up the frequency of a specific lcore to the highest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_max(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_max;
 
 /**
  * Scale down the frequency of a specific lcore to the lowest according to the
  * available frequencies.
- * It should be protected outside of this function for threadsafe.
+ * Review each environment's specific documentation for usage.
  *
  * @param lcore_id
  *  lcore id.
  *
  * @return
  *  - 1 on success with frequency changed.
- *  - 0 on success without frequency chnaged.
+ *  - 0 on success without frequency changed.
  *  - Negative on error.
  */
-int rte_power_freq_min(unsigned lcore_id);
+extern rte_power_freq_change_t rte_power_freq_min;
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.c b/lib/librte_power/rte_power_acpi_cpufreq.c
new file mode 100644
index 0000000..a56c9b5
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.c
@@ -0,0 +1,545 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <limits.h>
+
+#include <rte_memcpy.h>
+#include <rte_atomic.h>
+
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_common.h"
+
+#ifdef RTE_LIBRTE_POWER_DEBUG
+#define POWER_DEBUG_TRACE(fmt, args...) do { \
+		RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
+} while (0)
+#else
+#define POWER_DEBUG_TRACE(fmt, args...)
+#endif
+
+#define FOPEN_OR_ERR_RET(f, retval) do { \
+		if ((f) == NULL) { \
+			RTE_LOG(ERR, POWER, "File not opened\n"); \
+			return retval; \
+		} \
+} while (0)
+
+#define FOPS_OR_NULL_GOTO(ret, label) do { \
+		if ((ret) == NULL) { \
+			RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define FOPS_OR_ERR_GOTO(ret, label) do { \
+		if ((ret) < 0) { \
+			RTE_LOG(ERR, POWER, "File operations failed\n"); \
+			goto label; \
+		} \
+} while (0)
+
+#define STR_SIZE     1024
+#define POWER_CONVERT_TO_DECIMAL 10
+
+#define POWER_GOVERNOR_USERSPACE "userspace"
+#define POWER_SYSFILE_GOVERNOR   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
+#define POWER_SYSFILE_AVAIL_FREQ \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
+#define POWER_SYSFILE_SETSPEED   \
+		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+
+enum power_state {
+	POWER_IDLE = 0,
+	POWER_ONGOING,
+	POWER_USED,
+	POWER_UNKNOWN
+};
+
+/**
+ * Power info per lcore.
+ */
+struct rte_power_info {
+	unsigned lcore_id;                   /**< Logical core id */
+	uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
+	uint32_t nb_freqs;                   /**< number of available freqs */
+	FILE *f;                             /**< FD of scaling_setspeed */
+	char governor_ori[32];               /**< Original governor name */
+	uint32_t curr_idx;                   /**< Freq index in freqs array */
+	volatile uint32_t state;             /**< Power in use state */
+} __rte_cache_aligned;
+
+static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
+
+/**
+ * It is to set specific freq for specific logical core, according to the index
+ * of supported frequencies.
+ */
+static int
+set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+{
+	if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
+				"should be less than %u\n", idx, pi->nb_freqs);
+		return -1;
+	}
+
+	/* Check if it is the same as current */
+	if (idx == pi->curr_idx)
+		return 0;
+
+	POWER_DEBUG_TRACE("Frequency[%u] %u to be set for lcore %u\n",
+			idx, pi->freqs[idx], pi->lcore_id);
+	if (fseek(pi->f, 0, SEEK_SET) < 0) {
+		RTE_LOG(ERR, POWER, "Fail to set file position indicator to 0 "
+				"for setting frequency for lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	if (fprintf(pi->f, "%u", pi->freqs[idx]) < 0) {
+		RTE_LOG(ERR, POWER, "Fail to write new frequency for "
+				"lcore %u\n", pi->lcore_id);
+		return -1;
+	}
+	fflush(pi->f);
+	pi->curr_idx = idx;
+
+	return 1;
+}
+
+/**
+ * It is to check the current scaling governor by reading sys file, and then
+ * set it into 'userspace' if it is not by writing the sys file. The original
+ * governor will be saved for rolling back.
+ */
+static int
+power_set_governor_userspace(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Check if current governor is userspace */
+	if (strncmp(buf, POWER_GOVERNOR_USERSPACE,
+			sizeof(POWER_GOVERNOR_USERSPACE)) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u is "
+				"already userspace\n", pi->lcore_id);
+		goto out;
+	}
+	/* Save the original governor */
+	snprintf(pi->governor_ori, sizeof(pi->governor_ori), "%s", buf);
+
+	/* Write 'userspace' to the governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(POWER_GOVERNOR_USERSPACE, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u has been "
+			"set to user space successfully\n", pi->lcore_id);
+out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * It is to get the available frequencies of the specific lcore by reading the
+ * sys file.
+ */
+static int
+power_get_available_freqs(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1, i, count;
+	char *p;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *freqs[RTE_MAX_LCORE_FREQS];
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_AVAIL_FREQ,
+			pi->lcore_id);
+	f = fopen(fullpath, "r");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Strip the trailing line break, if any */
+	p = strchr(buf, '\n');
+	if (p != NULL)
+		*p = 0;
+
+	/* Split string into at most RTE_MAX_LCORE_FREQS frequencies */
+	count = rte_strsplit(buf, sizeof(buf), freqs,
+			RTE_MAX_LCORE_FREQS, ' ');
+	if (count <= 0) {
+		RTE_LOG(ERR, POWER, "No available frequency in "
+				""POWER_SYSFILE_AVAIL_FREQ"\n", pi->lcore_id);
+		goto out;
+	}
+	if (count >= RTE_MAX_LCORE_FREQS) {
+		RTE_LOG(ERR, POWER, "Too many available frequencies : %d\n",
+				count);
+		goto out;
+	}
+
+	/* Store the available frequencies into power context */
+	for (i = 0, pi->nb_freqs = 0; i < count; i++) {
+		POWER_DEBUG_TRACE("Lcore %u frequency[%d]: %s\n", pi->lcore_id,
+				i, freqs[i]);
+		pi->freqs[pi->nb_freqs++] = strtoul(freqs[i], &p,
+				POWER_CONVERT_TO_DECIMAL);
+	}
+
+	ret = 0;
+	POWER_DEBUG_TRACE("%d frequencies of lcore %u are available\n",
+			count, pi->lcore_id);
+out:
+	fclose(f);
+
+	return ret;
+}
+
+/**
+ * Open the sys file that is used later to set the lcore frequency.
+ */
+static int
+power_init_for_setting_freq(struct rte_power_info *pi)
+{
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t i, freq;
+	char *s;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_SETSPEED,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, -1);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	freq = strtoul(buf, NULL, POWER_CONVERT_TO_DECIMAL);
+	for (i = 0; i < pi->nb_freqs; i++) {
+		if (freq == pi->freqs[i]) {
+			pi->curr_idx = i;
+			pi->f = f;
+			return 0;
+		}
+	}
+
+out:
+	fclose(f);
+
+	return -1;
+}
+
+int
+rte_power_acpi_cpufreq_init(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u cannot exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_IDLE, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"in use\n", lcore_id);
+		return -1;
+	}
+
+	pi->lcore_id = lcore_id;
+	/* Check and set the governor */
+	if (power_set_governor_userspace(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set governor of lcore %u to "
+				"userspace\n", lcore_id);
+		goto fail;
+	}
+
+	/* Get the available frequencies */
+	if (power_get_available_freqs(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot get available frequencies of "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Init for setting lcore frequency */
+	if (power_init_for_setting_freq(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot init for setting frequency for "
+				"lcore %u\n", lcore_id);
+		goto fail;
+	}
+
+	/* Set freq to max by default */
+	if (rte_power_acpi_cpufreq_freq_max(lcore_id) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set frequency of lcore %u "
+				"to max\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Initialized successfully for lcore %u "
+			"power management\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_USED);
+
+	return 0;
+
+fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+/**
+ * It is to check the governor and then set the original governor back if
+ * needed by writing the sys file.
+ */
+static int
+power_set_governor_original(struct rte_power_info *pi)
+{
+	FILE *f;
+	int ret = -1;
+	char buf[BUFSIZ];
+	char fullpath[PATH_MAX];
+	char *s;
+	int val;
+
+	snprintf(fullpath, sizeof(fullpath), POWER_SYSFILE_GOVERNOR,
+			pi->lcore_id);
+	f = fopen(fullpath, "r+");
+	FOPEN_OR_ERR_RET(f, ret);
+
+	s = fgets(buf, sizeof(buf), f);
+	FOPS_OR_NULL_GOTO(s, out);
+
+	/* Check if the governor to be set is the same as current */
+	if (strncmp(buf, pi->governor_ori, sizeof(pi->governor_ori)) == 0) {
+		ret = 0;
+		POWER_DEBUG_TRACE("Power management governor of lcore %u "
+				"has already been set to %s\n",
+				pi->lcore_id, pi->governor_ori);
+		goto out;
+	}
+
+	/* Write back the original governor */
+	val = fseek(f, 0, SEEK_SET);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	val = fputs(pi->governor_ori, f);
+	FOPS_OR_ERR_GOTO(val, out);
+
+	ret = 0;
+	RTE_LOG(INFO, POWER, "Power management governor of lcore %u "
+			"has been set back to %s successfully\n",
+			pi->lcore_id, pi->governor_ori);
+out:
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_acpi_cpufreq_exit(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Lcore id %u cannot exceed %u\n",
+				lcore_id, RTE_MAX_LCORE - 1U);
+		return -1;
+	}
+	pi = &lcore_power_info[lcore_id];
+	if (rte_atomic32_cmpset(&(pi->state), POWER_USED, POWER_ONGOING)
+			== 0) {
+		RTE_LOG(INFO, POWER, "Power management of lcore %u is "
+				"not used\n", lcore_id);
+		return -1;
+	}
+
+	/* Close FD of setting freq */
+	fclose(pi->f);
+	pi->f = NULL;
+
+	/* Set the governor back to the original */
+	if (power_set_governor_original(pi) < 0) {
+		RTE_LOG(ERR, POWER, "Cannot set the governor of %u back "
+				"to the original\n", lcore_id);
+		goto fail;
+	}
+
+	RTE_LOG(INFO, POWER, "Power management of lcore %u has exited from "
+			"'userspace' mode and been set back to the "
+			"original\n", lcore_id);
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_IDLE);
+
+	return 0;
+
+fail:
+	rte_atomic32_cmpset(&(pi->state), POWER_ONGOING, POWER_UNKNOWN);
+
+	return -1;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs, uint32_t num)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE || !freqs) {
+		RTE_LOG(ERR, POWER, "Invalid input parameter\n");
+		return 0;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (num < pi->nb_freqs) {
+		RTE_LOG(ERR, POWER, "Buffer size is not enough\n");
+		return 0;
+	}
+	rte_memcpy(freqs, pi->freqs, pi->nb_freqs * sizeof(uint32_t));
+
+	return pi->nb_freqs;
+}
+
+uint32_t
+rte_power_acpi_cpufreq_get_freq(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return RTE_POWER_INVALID_FREQ_INDEX;
+	}
+
+	return lcore_power_info[lcore_id].curr_idx;
+}
+
+int
+rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	return set_freq_internal(&(lcore_power_info[lcore_id]), index);
+}
+
+int
+rte_power_acpi_cpufreq_freq_down(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx + 1 == pi->nb_freqs)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx + 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_up(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+	if (pi->curr_idx == 0)
+		return 0;
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->curr_idx - 1);
+}
+
+int
+rte_power_acpi_cpufreq_freq_max(unsigned lcore_id)
+{
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(&lcore_power_info[lcore_id], 0);
+}
+
+int
+rte_power_acpi_cpufreq_freq_min(unsigned lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+
+	/* Frequencies in the array are from high to low. */
+	return set_freq_internal(pi, pi->nb_freqs - 1);
+}
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.h b/lib/librte_power/rte_power_acpi_cpufreq.h
new file mode 100644
index 0000000..68578e9
--- /dev/null
+++ b/lib/librte_power/rte_power_acpi_cpufreq.h
@@ -0,0 +1,192 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_ACPI_CPUFREQ_H
+#define _RTE_POWER_ACPI_CPUFREQ_H
+
+/**
+ * @file
+ * RTE Power Management via userspace ACPI cpufreq
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore. It will check and set the
+ * governor to userspace for the lcore, get the available frequencies, and
+ * prepare to set new lcore frequency.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore. It will restore the governor
+ * that was in place before initialization.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore. The buffer must hold at
+ * least as many entries as there are available frequencies, otherwise 0 is
+ * returned. The index of available frequencies used in other interfaces
+ * should be in the range of 0 to this return value minus one.
+ * The caller must provide any locking needed for thread safety.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of frequencies to get.
+ *
+ * @return
+ *  The number of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore. It
+ * returns 'RTE_POWER_INVALID_FREQ_INDEX = (~0)' on error.
+ * The caller must provide any locking needed for thread safety.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  The current index of available frequencies.
+ */
+uint32_t rte_power_acpi_cpufreq_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * The caller must provide any locking needed for thread safety.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * The caller must provide any locking needed for thread safety.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * The caller must provide any locking needed for thread safety.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+int rte_power_acpi_cpufreq_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/librte_power/rte_power_common.h b/lib/librte_power/rte_power_common.h
new file mode 100644
index 0000000..64bd168
--- /dev/null
+++ b/lib/librte_power/rte_power_common.h
@@ -0,0 +1,39 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_POWER_COMMON_H_
+#define RTE_POWER_COMMON_H_
+
+#define RTE_POWER_INVALID_FREQ_INDEX (~0)
+
+#endif /* RTE_POWER_COMMON_H_ */
diff --git a/lib/librte_power/rte_power_kvm_vm.c b/lib/librte_power/rte_power_kvm_vm.c
new file mode 100644
index 0000000..11596c3
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.c
@@ -0,0 +1,136 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <errno.h>
+#include <string.h>
+
+#include <rte_log.h>
+#include <rte_config.h>
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"
+
+#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
+
+static struct channel_packet pkt[CHANNEL_CMDS_MAX_VM_CHANNELS];
+
+
+int
+rte_power_kvm_vm_init(unsigned lcore_id)
+{
+	if (lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+		return -1;
+	}
+	pkt[lcore_id].command = CPU_POWER;
+	pkt[lcore_id].resource_id = lcore_id;
+	return guest_channel_host_connect(FD_PATH, lcore_id);
+}
+
+int
+rte_power_kvm_vm_exit(unsigned lcore_id)
+{
+	guest_channel_host_disconnect(lcore_id);
+	return 0;
+}
+
+uint32_t
+rte_power_kvm_vm_freqs(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t *freqs,
+		__attribute__((unused)) uint32_t num)
+{
+	RTE_LOG(ERR, POWER, "rte_power_freqs is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+uint32_t
+rte_power_kvm_vm_get_freq(__attribute__((unused)) unsigned lcore_id)
+{
+	RTE_LOG(ERR, POWER, "rte_power_get_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_kvm_vm_set_freq(__attribute__((unused)) unsigned lcore_id,
+		__attribute__((unused)) uint32_t index)
+{
+	RTE_LOG(ERR, POWER, "rte_power_set_freq is not implemented "
+			"for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+static inline int
+send_msg(unsigned lcore_id, uint32_t scale_direction)
+{
+	int ret;
+
+	if (lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS) {
+		RTE_LOG(ERR, POWER, "Core(%u) is out of range 0...%d\n",
+				lcore_id, CHANNEL_CMDS_MAX_VM_CHANNELS-1);
+		return -1;
+	}
+	pkt[lcore_id].unit = scale_direction;
+	ret = guest_channel_send_msg(&pkt[lcore_id], lcore_id);
+	if (ret == 0)
+		return 1;
+	RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", strerror(ret));
+	return -1;
+}
+
+int
+rte_power_kvm_vm_freq_up(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_UP);
+}
+
+int
+rte_power_kvm_vm_freq_down(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_DOWN);
+}
+
+int
+rte_power_kvm_vm_freq_max(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_MAX);
+}
+
+int
+rte_power_kvm_vm_freq_min(unsigned lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_SCALE_MIN);
+}
diff --git a/lib/librte_power/rte_power_kvm_vm.h b/lib/librte_power/rte_power_kvm_vm.h
new file mode 100644
index 0000000..dcbc878
--- /dev/null
+++ b/lib/librte_power/rte_power_kvm_vm.h
@@ -0,0 +1,179 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_POWER_KVM_VM_H
+#define _RTE_POWER_KVM_VM_H
+
+/**
+ * @file
+ * RTE Power Management KVM VM
+ */
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize power management for a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_init(unsigned lcore_id);
+
+/**
+ * Exit power management on a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_exit(unsigned lcore_id);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param freqs
+ *  The buffer array to save the frequencies.
+ * @param num
+ *  The number of frequencies to get.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_freqs(unsigned lcore_id, uint32_t *freqs,
+		uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+uint32_t rte_power_kvm_vm_get_freq(unsigned lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * It is not currently supported for VM Power Management.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+int rte_power_kvm_vm_set_freq(unsigned lcore_id, uint32_t index);
+
+/**
+ * Scale up the frequency of a specific lcore. This request is forwarded to the
+ * host monitor.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_up(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore according to the available
+ * frequencies.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_down(unsigned lcore_id);
+
+/**
+ * Scale up the frequency of a specific lcore to the highest according to the
+ * available frequencies.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_max(unsigned lcore_id);
+
+/**
+ * Scale down the frequency of a specific lcore to the lowest according to the
+ * available frequencies.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_freq_min(unsigned lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
1.7.4.1
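
Note on channel naming: rte_power_kvm_vm_init() above passes FD_PATH and the
lcore id to guest_channel_host_connect(), and the cover letter describes one
serial device per lcore named
'/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'. The sketch
below shows how such a per-lcore path could be assembled. It is only an
illustration: build_channel_path() is a hypothetical helper, not a function
from this series, and the real path handling lives in guest_channel.c, which
is not shown here.

```c
#include <stdio.h>

/* Same prefix as FD_PATH in rte_power_kvm_vm.c. */
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"

/*
 * Hypothetical helper (an assumption, not part of the patch): write
 * "<FD_PATH>.<lcore_id>" into buf.
 * Returns 0 on success, -1 if buf is NULL or too small for the full path.
 */
static int
build_channel_path(char *buf, size_t len, unsigned lcore_id)
{
	int n;

	if (buf == NULL || len == 0)
		return -1;
	n = snprintf(buf, len, "%s.%u", FD_PATH, lcore_id);
	/* snprintf reports the length it wanted; reject truncated output. */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Rejecting a truncated path matters here because a shortened name could
silently refer to a different lcore's channel.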

* [dpdk-dev] [PATCH v6 08/10] Packet format for VM Power Management(Host and Guest).
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
                             ` (6 preceding siblings ...)
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 07/10] librte_power common interface for Guest and Host Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 09/10] Build system integration for VM Power Management(Guest and Host) Pablo de Lara
                             ` (2 subsequent siblings)
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Provides a command packet format for host and guest.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
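Illustrative note (not part of the commit): a minimal sketch of how a guest
might fill in the packet defined by this patch before handing it to the
channel. The struct layout and constants are copied from channel_commands.h
below; make_scale_request() is a hypothetical helper invented for this
example, and its validation rules are assumptions rather than behaviour of
the library.

```c
#include <stdint.h>
#include <string.h>

/* Values copied from channel_commands.h in this patch. */
#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
#define CPU_POWER               1
#define CPU_POWER_SCALE_UP      1
#define CPU_POWER_SCALE_DOWN    2
#define CPU_POWER_SCALE_MAX     3
#define CPU_POWER_SCALE_MIN     4

struct channel_packet {
	uint64_t resource_id; /* core_num, device */
	uint32_t unit;        /* scale down/up/min/max */
	uint32_t command;     /* Power, IO, etc */
};

/*
 * Hypothetical helper (an assumption, not part of the patch): build a
 * CPU_POWER request for one lcore.
 * Returns 0 on success, -1 on a NULL packet, an out-of-range lcore or an
 * unknown scale direction.
 */
static int
make_scale_request(struct channel_packet *pkt, unsigned lcore_id,
		uint32_t direction)
{
	if (pkt == NULL || lcore_id >= CHANNEL_CMDS_MAX_VM_CHANNELS)
		return -1;
	if (direction < CPU_POWER_SCALE_UP || direction > CPU_POWER_SCALE_MIN)
		return -1;
	memset(pkt, 0, sizeof(*pkt));
	pkt->command = CPU_POWER;
	pkt->resource_id = lcore_id;
	pkt->unit = direction;
	return 0;
}
```

The guest implementation in rte_power_kvm_vm.c follows the same shape: it
sets command and resource_id once at init time and only updates unit for
each request.
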
 lib/librte_power/channel_commands.h |   77 +++++++++++++++++++++++++++++++++++
 1 files changed, 77 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_power/channel_commands.h

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
new file mode 100644
index 0000000..7e78a8b
--- /dev/null
+++ b/lib/librte_power/channel_commands.h
@@ -0,0 +1,77 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_COMMANDS_H_
+#define CHANNEL_COMMANDS_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+/* Maximum number of CPUs */
+#define CHANNEL_CMDS_MAX_CPUS        64
+#if CHANNEL_CMDS_MAX_CPUS > 64
+#error Maximum number of cores is 64, overflow is guaranteed to \
+	cause problems with VM Power Management
+#endif
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Valid Commands */
+#define CPU_POWER               1
+#define CPU_POWER_CONNECT       2
+
+/* CPU Power Command Scaling */
+#define CPU_POWER_SCALE_UP      1
+#define CPU_POWER_SCALE_DOWN    2
+#define CPU_POWER_SCALE_MAX     3
+#define CPU_POWER_SCALE_MIN     4
+
+struct channel_packet {
+	uint64_t resource_id; /**< core_num, device */
+	uint32_t unit;        /**< scale down/up/min/max */
+	uint32_t command;     /**< Power, IO, etc */
+};
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_COMMANDS_H_ */
-- 
1.7.4.1

* [dpdk-dev] [PATCH v6 09/10] Build system integration for VM Power Management(Guest and Host)
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
                             ` (7 preceding siblings ...)
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 08/10] Packet format for VM Power Management(Host and Guest) Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 10/10] VM Power Management Unit Tests Pablo de Lara
  2014-11-26 16:41           ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Thomas Monjalon
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

librte_power now contains both rte_power_acpi_cpufreq and rte_power_kvm_vm
implementations.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 lib/librte_power/Makefile |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/lib/librte_power/Makefile b/lib/librte_power/Makefile
index 6185812..d672a5a 100644
--- a/lib/librte_power/Makefile
+++ b/lib/librte_power/Makefile
@@ -37,7 +37,8 @@ LIB = librte_power.a
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -fno-strict-aliasing
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c rte_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += rte_power_kvm_vm.c guest_channel.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_POWER)-include := rte_power.h
-- 
1.7.4.1

* [dpdk-dev] [PATCH v6 10/10] VM Power Management Unit Tests
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
                             ` (8 preceding siblings ...)
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 09/10] Build system integration for VM Power Management(Guest and Host) Pablo de Lara
@ 2014-11-25 16:18           ` Pablo de Lara
  2014-11-26 16:41           ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Thomas Monjalon
  10 siblings, 0 replies; 97+ messages in thread
From: Pablo de Lara @ 2014-11-25 16:18 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

Updated the unit tests to cover both librte_power implementations as well as
the external API.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
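Illustrative note (not part of the commit): the reworked test_power.c below
checks that the rte_power_* entry points are NULL function pointers until an
environment has been selected, and that setting PM_ENV_NOT_SET is rejected.
The following is a stand-alone sketch of that dispatch pattern. Every name in
it (pm_env, set_env, power_freq_up, the two stub back ends) is hypothetical
and only mimics the behaviour the test expects; it is not the rte_power.c
implementation.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the library's environment enum. */
enum pm_env { ENV_NOT_SET, ENV_ACPI_CPUFREQ, ENV_KVM_VM };

typedef int (*freq_change_t)(unsigned lcore_id);

/* Public entry point: NULL until an environment is chosen. */
static freq_change_t power_freq_up = NULL;
static enum pm_env current_env = ENV_NOT_SET;

/* Stub back ends standing in for the ACPI and KVM VM implementations. */
static int acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
static int kvm_vm_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }

/* Bind the entry point to one back end; reject an unset/unknown value. */
static int
set_env(enum pm_env env)
{
	switch (env) {
	case ENV_ACPI_CPUFREQ:
		power_freq_up = acpi_freq_up;
		break;
	case ENV_KVM_VM:
		power_freq_up = kvm_vm_freq_up;
		break;
	default:
		return -1;
	}
	current_env = env;
	return 0;
}

static enum pm_env
get_env(void)
{
	return current_env;
}

/* Return to the initial state: no environment, no bound functions. */
static void
unset_env(void)
{
	power_freq_up = NULL;
	current_env = ENV_NOT_SET;
}
```

With this shape, a caller that forgets to select an environment sees a NULL
pointer rather than silently using the wrong back end, which is what the
NULL checks in the test are guarding.
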
 app/test/Makefile                  |    3 +-
 app/test/autotest_data.py          |   26 ++
 app/test/test_power.c              |  445 +++---------------------------
 app/test/test_power_acpi_cpufreq.c |  544 ++++++++++++++++++++++++++++++++++++
 app/test/test_power_kvm_vm.c       |  308 ++++++++++++++++++++
 5 files changed, 917 insertions(+), 409 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c

diff --git a/app/test/Makefile b/app/test/Makefile
index ebfa0ba..4b03f00 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -120,7 +120,8 @@ endif
 
 SRCS-$(CONFIG_RTE_LIBRTE_METER) += test_meter.c
 SRCS-$(CONFIG_RTE_LIBRTE_KNI) += test_kni.c
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c test_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power_kvm_vm.c
 SRCS-y += test_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 878c72e..618a946 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -425,6 +425,32 @@ non_parallel_test_group_list = [
 	]
 },
 {
+	"Prefix" :      "power_acpi_cpufreq",
+	"Memory" :      all_sockets(512),
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power ACPI cpufreq autotest",
+		 "Command" :    "power_acpi_cpufreq_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
+	"Prefix" :      "power_kvm_vm",
+	"Memory" :      "512",
+	"Tests" :
+	[
+		{
+		 "Name" :       "Power KVM VM autotest",
+		 "Command" :    "power_kvm_vm_autotest",
+		 "Func" :       default_autotest,
+		 "Report" :     None,
+		},
+	]
+},
+{
 	"Prefix" :	"lpm6",
 	"Memory" :	"512",
 	"Tests" :
diff --git a/app/test/test_power.c b/app/test/test_power.c
index d9eb420..64a2305 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -41,437 +41,66 @@
 
 #include <rte_power.h>
 
-#define TEST_POWER_LCORE_ID      2U
-#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
-#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
-
-#define TEST_POWER_SYSFILE_CUR_FREQ \
-	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
-
-static uint32_t total_freq_num;
-static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
-
-static int
-check_cur_freq(unsigned lcore_id, uint32_t idx)
-{
-#define TEST_POWER_CONVERT_TO_DECIMAL 10
-	FILE *f;
-	char fullpath[PATH_MAX];
-	char buf[BUFSIZ];
-	uint32_t cur_freq;
-	int ret = -1;
-
-	if (snprintf(fullpath, sizeof(fullpath),
-		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
-		return 0;
-	}
-	f = fopen(fullpath, "r");
-	if (f == NULL) {
-		return 0;
-	}
-	if (fgets(buf, sizeof(buf), f) == NULL) {
-		goto fail_get_cur_freq;
-	}
-	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
-	ret = (freqs[idx] == cur_freq ? 0 : -1);
-
-fail_get_cur_freq:
-	fclose(f);
-
-	return ret;
-}
-
-/* Check rte_power_freqs() */
-static int
-check_power_freqs(void)
-{
-	uint32_t ret;
-
-	total_freq_num = 0;
-	memset(freqs, 0, sizeof(freqs));
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
-					TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with NULL buffer to save available freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test of getting zero number of freqs */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
-	if (ret > 0) {
-		printf("Unexpectedly get available freqs successfully with "
-			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* test with all valid input parameters */
-	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get available freqs on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Save the total number of available freqs */
-	total_freq_num = ret;
-
-	return 0;
-}
-
-/* Check rte_power_get_freq() */
-static int
-check_power_get_freq(void)
-{
-	int ret;
-	uint32_t count;
-
-	/* test with an invalid lcore id */
-	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
-	if (count < TEST_POWER_FREQS_NUM_MAX) {
-		printf("Unexpectedly get freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
-	if (count >= TEST_POWER_FREQS_NUM_MAX) {
-		printf("Fail to get the freq index on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_set_freq() */
-static int
-check_power_set_freq(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
-	if (ret >= 0) {
-		printf("Unexpectedly set freq index successfully on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* test with an invalid freq index */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
-				TEST_POWER_FREQS_NUM_MAX);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/**
-	 * test with an invalid freq index which is right one bigger than
-	 * total number of freqs
-	 */
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
-	if (ret >= 0) {
-		printf("Unexpectedly set an invalid freq index (%u)"
-			"successfully on lcore %u\n", total_freq_num,
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0) {
-		printf("Fail to set freq index on lcore %u\n",
-					TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_down() */
-static int
-check_power_freq_down(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale down one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale down one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf ("Fail to scale down the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_up() */
-static int
-check_power_freq_up(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq on %u\n",
-						TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	/* Scale down to min and then scale up one step */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
-	if (ret < 0)
-		return -1;
-
-	/* Scale up to max and then scale up one step */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_max() */
-static int
-check_power_freq_max(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale up successfully the freq to max on "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale up the freq to max on lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Check rte_power_freq_min() */
-static int
-check_power_freq_min(void)
-{
-	int ret;
-
-	/* test with an invalid lcore id */
-	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
-	if (ret >= 0) {
-		printf("Unexpectedly scale down successfully the freq to min "
-				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Fail to scale down the freq to min on lcore %u\n",
-							TEST_POWER_LCORE_ID);
-		return -1;
-	}
-
-	/* Check the current frequency */
-	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
-	if (ret < 0)
-		return -1;
-
-	return 0;
-}
-
 static int
 test_power(void)
 {
 	int ret = -1;
+	enum power_management_env env;
 
-	/* test of init power management for an invalid lcore */
-	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	/* Test setting an invalid environment */
+	ret = rte_power_set_env(PM_ENV_NOT_SET);
 	if (ret == 0) {
-		printf("Unexpectedly initialise power management successfully "
-				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
-	}
-
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot initialise power management for lcore %u\n",
-							TEST_POWER_LCORE_ID);
+		printf("Unexpectedly succeeded on setting an invalid environment\n");
 		return -1;
 	}
 
-	/**
-	 * test of initialising power management for the lcore which has
-	 * been initialised
-	 */
-	ret = rte_power_init(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly init successfully power twice on "
-					"lcore %u\n", TEST_POWER_LCORE_ID);
+	/* Test that the environment has not been set */
+	env = rte_power_get_env();
+	if (env != PM_ENV_NOT_SET) {
+		printf("Unexpectedly got a valid environment configuration\n");
 		return -1;
 	}
 
-	ret = check_power_freqs();
-	if (ret < 0)
+	/* verify that function pointers are NULL */
+	if (rte_power_freqs != NULL) {
+		printf("rte_power_freqs should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	if (total_freq_num < 2) {
-		rte_power_exit(TEST_POWER_LCORE_ID);
-		printf("Frequency can not be changed due to CPU itself\n");
-		return 0;
 	}
-
-	ret = check_power_get_freq();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_set_freq();
-	if (ret < 0)
+	if (rte_power_get_freq != NULL) {
+		printf("rte_power_get_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_down();
-	if (ret < 0)
-		goto fail_all;
-
-	ret = check_power_freq_up();
-	if (ret < 0)
+	}
+	if (rte_power_set_freq != NULL) {
+		printf("rte_power_set_freq should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_max();
-	if (ret < 0)
+	}
+	if (rte_power_freq_up != NULL) {
+		printf("rte_power_freq_up should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = check_power_freq_min();
-	if (ret < 0)
+	}
+	if (rte_power_freq_down != NULL) {
+		printf("rte_power_freq_down should be NULL, environment has not been "
+				"initialised\n");
 		goto fail_all;
-
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret < 0) {
-		printf("Cannot exit power management for lcore %u\n",
-						TEST_POWER_LCORE_ID);
-		return -1;
 	}
-
-	/**
-	 * test of exiting power management for the lcore which has been exited
-	 */
-	ret = rte_power_exit(TEST_POWER_LCORE_ID);
-	if (ret == 0) {
-		printf("Unexpectedly exit successfully power management twice "
-					"on lcore %u\n", TEST_POWER_LCORE_ID);
-		return -1;
+	if (rte_power_freq_max != NULL) {
+		printf("rte_power_freq_max should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
-	/* test of exit power management for an invalid lcore */
-	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
-	if (ret == 0) {
-		printf("Unpectedly exit power management successfully for "
-				"lcore %u\n", TEST_POWER_LCORE_INVALID);
-		return -1;
+	if (rte_power_freq_min != NULL) {
+		printf("rte_power_freq_min should be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
 	}
-
+	rte_power_unset_env();
 	return 0;
-
 fail_all:
-	rte_power_exit(TEST_POWER_LCORE_ID);
-
+	rte_power_unset_env();
 	return -1;
 }
 
diff --git a/app/test/test_power_acpi_cpufreq.c b/app/test/test_power_acpi_cpufreq.c
new file mode 100644
index 0000000..0fb1569
--- /dev/null
+++ b/app/test/test_power_acpi_cpufreq.c
@@ -0,0 +1,544 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <limits.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+
+#define TEST_POWER_LCORE_ID      2U
+#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
+#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
+
+#define TEST_POWER_SYSFILE_CUR_FREQ \
+	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
+
+static uint32_t total_freq_num;
+static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
+
+static int
+check_cur_freq(unsigned lcore_id, uint32_t idx)
+{
+#define TEST_POWER_CONVERT_TO_DECIMAL 10
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	uint32_t cur_freq;
+	int ret = -1;
+
+	if (snprintf(fullpath, sizeof(fullpath),
+		TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
+		return 0;
+	}
+	f = fopen(fullpath, "r");
+	if (f == NULL) {
+		return 0;
+	}
+	if (fgets(buf, sizeof(buf), f) == NULL) {
+		goto fail_get_cur_freq;
+	}
+	cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
+	ret = (freqs[idx] == cur_freq ? 0 : -1);
+
+fail_get_cur_freq:
+	fclose(f);
+
+	return ret;
+}
+
+/* Check rte_power_freqs() */
+static int
+check_power_freqs(void)
+{
+	uint32_t ret;
+
+	total_freq_num = 0;
+	memset(freqs, 0, sizeof(freqs));
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
+					TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with NULL buffer to save available freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test of getting zero number of freqs */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
+	if (ret > 0) {
+		printf("Unexpectedly get available freqs successfully with "
+			"zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* test with all valid input parameters */
+	ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get available freqs on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Save the total number of available freqs */
+	total_freq_num = ret;
+
+	return 0;
+}
+
+/* Check rte_power_get_freq() */
+static int
+check_power_get_freq(void)
+{
+	int ret;
+	uint32_t count;
+
+	/* test with an invalid lcore id */
+	count = rte_power_get_freq(TEST_POWER_LCORE_INVALID);
+	if (count < TEST_POWER_FREQS_NUM_MAX) {
+		printf("Unexpectedly get freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	count = rte_power_get_freq(TEST_POWER_LCORE_ID);
+	if (count >= TEST_POWER_FREQS_NUM_MAX) {
+		printf("Fail to get the freq index on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, count);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_set_freq() */
+static int
+check_power_set_freq(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_INVALID, 0);
+	if (ret >= 0) {
+		printf("Unexpectedly set freq index successfully on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* test with an invalid freq index */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID,
+				TEST_POWER_FREQS_NUM_MAX);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u) "
+			"successfully on lcore %u\n", TEST_POWER_FREQS_NUM_MAX,
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/**
+	 * test with an invalid freq index which is right one bigger than
+	 * total number of freqs
+	 */
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num);
+	if (ret >= 0) {
+		printf("Unexpectedly set an invalid freq index (%u) "
+			"successfully on lcore %u\n", total_freq_num,
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_set_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0) {
+		printf("Fail to set freq index on lcore %u\n",
+					TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_down() */
+static int
+check_power_freq_down(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_down(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale down one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale down one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_down(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_up() */
+static int
+check_power_freq_up(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_up(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq on %u\n",
+						TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+
+	/* Scale down to min and then scale up one step */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 2);
+	if (ret < 0)
+		return -1;
+
+	/* Scale up to max and then scale up one step */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+	ret = rte_power_freq_up(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_max() */
+static int
+check_power_freq_max(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_max(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale up successfully the freq to max on "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_max(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale up the freq to max on lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, 0);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+/* Check rte_power_freq_min() */
+static int
+check_power_freq_min(void)
+{
+	int ret;
+
+	/* test with an invalid lcore id */
+	ret = rte_power_freq_min(TEST_POWER_LCORE_INVALID);
+	if (ret >= 0) {
+		printf("Unexpectedly scale down successfully the freq to min "
+				"on lcore %u\n", TEST_POWER_LCORE_INVALID);
+		return -1;
+	}
+	ret = rte_power_freq_min(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Fail to scale down the freq to min on lcore %u\n",
+							TEST_POWER_LCORE_ID);
+		return -1;
+	}
+
+	/* Check the current frequency */
+	ret = check_cur_freq(TEST_POWER_LCORE_ID, total_freq_num - 1);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+static int
+test_power_acpi_cpufreq(void)
+{
+	int ret = -1;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_ACPI_CPUFREQ, this "
+				"may occur if environment is not configured correctly or "
+				"operating in another valid Power management environment\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_ACPI_CPUFREQ) {
+		printf("Unexpectedly got an environment other than ACPI cpufreq\n");
+		goto fail_all;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+
+	/* test of init power management for an invalid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly initialise power management successfully "
+				"for lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly (ACPI cpufreq) or operating in another valid "
+				"Power management environment\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of initialising power management for the lcore which has
+	 * been initialised
+	 */
+	ret = rte_power_init(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly init successfully power twice on "
+					"lcore %u\n", TEST_POWER_LCORE_ID);
+		goto fail_all;
+	}
+
+	ret = check_power_freqs();
+	if (ret < 0)
+		goto fail_all;
+
+	if (total_freq_num < 2) {
+		rte_power_exit(TEST_POWER_LCORE_ID);
+		printf("Frequency can not be changed due to CPU itself\n");
+		rte_power_unset_env();
+		return 0;
+	}
+
+	ret = check_power_get_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_set_freq();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_down();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_up();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_max();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = check_power_freq_min();
+	if (ret < 0)
+		goto fail_all;
+
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot exit power management for lcore %u\n",
+						TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/**
+	 * test of exiting power management for the lcore which has been exited
+	 */
+	ret = rte_power_exit(TEST_POWER_LCORE_ID);
+	if (ret == 0) {
+		printf("Unexpectedly exit successfully power management twice "
+					"on lcore %u\n", TEST_POWER_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* test of exit power management for an invalid lcore */
+	ret = rte_power_exit(TEST_POWER_LCORE_INVALID);
+	if (ret == 0) {
+		printf("Unexpectedly exit power management successfully for "
+				"lcore %u\n", TEST_POWER_LCORE_INVALID);
+		rte_power_unset_env();
+		return -1;
+	}
+	rte_power_unset_env();
+	return 0;
+
+fail_all:
+	rte_power_exit(TEST_POWER_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_acpi_cpufreq_cmd = {
+	.command = "power_acpi_cpufreq_autotest",
+	.callback = test_power_acpi_cpufreq,
+};
+REGISTER_TEST_COMMAND(power_acpi_cpufreq_cmd);
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
new file mode 100644
index 0000000..6fdb344
--- /dev/null
+++ b/app/test/test_power_kvm_vm.c
@@ -0,0 +1,308 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <limits.h>
+#include <string.h>
+
+#include "test.h"
+
+#include <rte_power.h>
+#include <rte_config.h>
+
+#define TEST_POWER_VM_LCORE_ID            0U
+#define TEST_POWER_VM_LCORE_OUT_OF_BOUNDS (RTE_MAX_LCORE+1)
+#define TEST_POWER_VM_LCORE_INVALID       1U
+
+static int
+test_power_kvm_vm(void)
+{
+	int ret;
+	enum power_management_env env;
+
+	ret = rte_power_set_env(PM_ENV_KVM_VM);
+	if (ret != 0) {
+		printf("Failed on setting environment to PM_ENV_KVM_VM\n");
+		return -1;
+	}
+
+	/* Test environment configuration */
+	env = rte_power_get_env();
+	if (env != PM_ENV_KVM_VM) {
+		printf("Unexpectedly got a Power Management environment other than "
+				"KVM VM\n");
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* verify that function pointers are not NULL */
+	if (rte_power_freqs == NULL) {
+		printf("rte_power_freqs should not be NULL, environment has not been "
+				"initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_get_freq == NULL) {
+		printf("rte_power_get_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_set_freq == NULL) {
+		printf("rte_power_set_freq should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_up == NULL) {
+		printf("rte_power_freq_up should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_down == NULL) {
+		printf("rte_power_freq_down should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_max == NULL) {
+		printf("rte_power_freq_max should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	if (rte_power_freq_min == NULL) {
+		printf("rte_power_freq_min should not be NULL, environment has not "
+				"been initialised\n");
+		goto fail_all;
+	}
+	/* Test initialisation of an out of bounds lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret != -1) {
+		printf("rte_power_init unexpectedly succeeded on an invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of a valid lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret < 0) {
+		printf("Cannot initialise power management for lcore %u, this "
+				"may occur if environment is not configured "
+				"correctly (KVM VM) or operating in another valid "
+				"Power management environment\n", TEST_POWER_VM_LCORE_ID);
+		rte_power_unset_env();
+		return -1;
+	}
+
+	/* Test initialisation of previously initialised lcore */
+	ret = rte_power_init(TEST_POWER_VM_LCORE_ID);
+	if (ret == 0) {
+		printf("rte_power_init unexpectedly succeeded on calling init twice on"
+				" lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of invalid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency down of invalid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency min of invalid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency max of invalid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid but uninitialised lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on invalid lcore %u\n",
+				TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid but uninitialised lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid but uninitialised lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid but uninitialised lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_INVALID);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on invalid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_INVALID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of valid lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_up unexpectedly failed on valid lcore %u\n",
+				TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of valid lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_down unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of valid lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_min unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of valid lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret != 1) {
+		printf("rte_power_freq_max unexpectedly failed on valid lcore "
+				"%u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_freqs */
+	ret = rte_power_freqs(TEST_POWER_VM_LCORE_ID, NULL, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_freqs did not return the expected -ENOTSUP(%d) but "
+				"returned %d\n", -ENOTSUP, ret);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_get_freq */
+	ret = rte_power_get_freq(TEST_POWER_VM_LCORE_ID);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_get_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test unsupported rte_power_set_freq */
+	ret = rte_power_set_freq(TEST_POWER_VM_LCORE_ID, 0);
+	if (ret != -ENOTSUP) {
+		printf("rte_power_set_freq did not return the expected -ENOTSUP(%d) but"
+				" returned %d for lcore %u\n",
+				-ENOTSUP, ret, TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test removing of an lcore */
+	ret = rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	if (ret != 0) {
+		printf("rte_power_exit unexpectedly failed on valid lcore %u, "
+				"please ensure that the environment has been configured "
+				"correctly\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency up of previously removed lcore */
+	ret = rte_power_freq_up(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_up unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency down of previously removed lcore */
+	ret = rte_power_freq_down(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_down unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency min of previously removed lcore */
+	ret = rte_power_freq_min(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_min unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+
+	/* Test frequency max of previously removed lcore */
+	ret = rte_power_freq_max(TEST_POWER_VM_LCORE_ID);
+	if (ret == 1) {
+		printf("rte_power_freq_max unexpectedly succeeded on a removed "
+				"lcore %u\n", TEST_POWER_VM_LCORE_ID);
+		goto fail_all;
+	}
+	rte_power_unset_env();
+	return 0;
+fail_all:
+	rte_power_exit(TEST_POWER_VM_LCORE_ID);
+	rte_power_unset_env();
+	return -1;
+}
+
+static struct test_command power_kvm_vm_cmd = {
+	.command = "power_kvm_vm_autotest",
+	.callback = test_power_kvm_vm,
+};
+REGISTER_TEST_COMMAND(power_kvm_vm_cmd);
-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management
  2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
                             ` (9 preceding siblings ...)
  2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 10/10] VM Power Management Unit Tests Pablo de Lara
@ 2014-11-26 16:41           ` Thomas Monjalon
  10 siblings, 0 replies; 97+ messages in thread
From: Thomas Monjalon @ 2014-11-26 16:41 UTC (permalink / raw)
  To: Pablo de Lara, Alan Carew; +Cc: dev

Hi Pablo and Alan,

2014-11-25 16:18, Pablo de Lara:
> Virtual Machine Power Management.
> 
> The following patches add two DPDK sample applications and an alternate
> implementation of librte_power for use in virtualized environments.
> The idea is to provide librte_power functionality from within a VM to address
> the lack of MSRs to facilitate frequency changes from within a VM.
> It is ideally suited for Haswell which provides per core frequency scaling.
> 
> The current librte_power effects frequency changes via the acpi-cpufreq
> 'userspace' power governor, accessed via sysfs.
> 
> General Overview:(more information in each patch that follows).
> The VM Power Management solution provides two components:
> 
>  1)VM: Allows a DPDK application in a VM to reuse the librte_power
>  interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
>  where the re-implementation of librte_power simply forwards the requests for
>  frequency change to a host based monitor. The host monitor itself uses
>  librte_power.
>  Each lcore channel corresponds to a
>  serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
>  which is opened in non-blocking mode.
>  While each Virtual CPU can be mapped to multiple physical CPUs it is
>  recommended that each vCPU should be mapped to a single core only.
> 
>  2)Host: The host monitor is managed by a CLI, which allows for adding qemu/KVM
>  virtual machines and associated channels to the monitor, manually changing
>  CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
>  channels.
>  Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
>  sockets which follow a specific naming convention
>  i.e. /tmp/powermonitor/<vm_name>.<channel_number>,
>  each channel has a 1:1 mapping to a VM endpoint
>  i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>
>  Host channel endpoints are opened in non-blocking mode and are monitored via epoll.
>  Requests over each channel to change frequency are forwarded to the original
>  librte_power.
>  
> Channels must be manually configured as qemu-kvm command line arguments or
> libvirt domain definition(xml) e.g.
> <controller type='virtio-serial' index='0'>
>  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
> </controller>
> <channel type='unix'>
>   <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
>   <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>'/>
>   <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
> </channel>
> 
> Where multiple channels can be configured by specifying multiple <channel>
> elements, by replacing <vm_name>, <channel_num>.
> <N>(port number) should be incremented by 1 for each new channel element.
> More information on Virtio-Serial can be found here:
> http://fedoraproject.org/wiki/Features/VirtioSerial
> To enable the Hypervisor creation of channels, the host endpoint directory
> must be created with qemu permissions:
> mkdir /tmp/powermonitor
> chown qemu:qemu /tmp/powermonitor
> 
> The host application runs on two separate lcores:
> Core N) CLI: For management of Virtual Machines adding channels to Monitor thread,
>  inspecting state and manually setting CPU frequency [PATCH 02/09]
> Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel events
>  from VMs and calls the corresponding librte_power functions.
> 
> A sample application is also provided to run on Virtual Machines, this
> application provides a CLI to manually set the frequency of a 
> vCPU[PATCH 08/09]
> 
> The current l3fwd-power sample application can also be run on a VM.
> 
> Changes in V6:
>  Fixed typos, missing indentation and blank lines
> 
> Changes in V5:
>  Fixed default target in sample app Makefiles
> 
> Changes in V4:
>  Fixed double free of channel during VM shutdown.
> 
> Changes in V3:
>  Fixed crash in Guest CLI when host application is not running.
>  Renamed #defines to be more specific to the module they belong
>  Added vCPU pinning via CLI
> 
> Changes in V2:
>  Runtime selection of librte_power implementations.
>  Updated Unit tests to cover librte_power changes.
>  PATCH[0/3] was sent twice, again as PATCH[0/4]
>  Miscellaneous fixes.
> 
> Alan Carew (10):
>   Channel Manager and Monitor for VM Power Management(Host).
>   VM Power Management CLI(Host).
>   CPU Frequency Power Management(Host).
>   VM Power Management application and Makefile.
>   VM Power Management CLI(Guest).
>   VM communication channels for VM Power Management(Guest).
>   librte_power common interface for Guest and Host
>   Packet format for VM Power Management(Host and Guest).
>   Build system integration for VM Power Management(Guest and Host)
>   VM Power Management Unit Tests

Thanks to my shiny updated checkpatch, I was able to fix these 2 typos:

WARNING:MISSING_SPACE: break quoted strings at a space character
#831: FILE: examples/vm_power_manager/channel_manager.c:722:
+               RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection"
+                               "already established\n", path);

WARNING:MISSING_SPACE: break quoted strings at a space character
#1424: FILE: examples/vm_power_manager/channel_monitor.c:181:
+               RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for"
+                               "epoll events\n");
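
For context on this warning: C joins adjacent string literals with nothing in between, so a message broken right after "connection" is logged as "connectionalready". A minimal sketch of the effect, assuming snprintf() in place of RTE_LOG and a hypothetical helper name (format_msg) that is not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Split with no space at the break: the literals join as "connectionalready". */
#define MSG_BROKEN "Error connecting to %s, connection" \
		   "already established\n"

/* Split at a space character, which is what checkpatch asks for. */
#define MSG_FIXED "Error connecting to %s, connection " \
		  "already established\n"

/*
 * Hypothetical helper: format the broken (fixed == 0) or fixed (fixed != 0)
 * message into dst and return snprintf()'s result.
 */
static int
format_msg(char *dst, size_t len, int fixed, const char *path)
{
	assert(dst != NULL && path != NULL);
	if (fixed)
		return snprintf(dst, len, MSG_FIXED, path);
	return snprintf(dst, len, MSG_BROKEN, path);
}
```

For the same path argument, the fixed variant is exactly one character longer than the broken one; the extra character is the space before "already".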

This codebase is really too big to be properly reviewed.

As discussed earlier, it's a workaround for a missing feature in Qemu/KVM.
It's now applied in DPDK, but it would be much more convenient for everyone if
it could be fixed upstream. I hope you'll be able to sustain this work for the
good of all the communities involved.

Thank you
-- 
Thomas


* [dpdk-dev] [PATCH v2] librte_cmdline: FreeBSD Fix overflow when size of command result structure is greater than BUFSIZ
  2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
                       ` (11 preceding siblings ...)
  2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
@ 2014-11-10  9:19     ` Alan Carew
  2014-12-05 14:16       ` Olivier MATZ
  12 siblings, 1 reply; 97+ messages in thread
From: Alan Carew @ 2014-11-10  9:19 UTC (permalink / raw)
  To: dev

When using test-pmd with flow director in FreeBSD, the application will
segfault/Bus error while parsing the command-line. This is due to how
each command's result structure is represented during parsing, where each
token's value is stored at its offset in a character array
(char result_buf[BUFSIZ]) in cmdline_parse()
(./lib/librte_cmdline/cmdline_parse.c).

The overflow occurs where BUFSIZ is less than the size of a command's result
structure. In this case "struct cmd_pkt_filter_result"
(app/test-pmd/cmdline.c) is 1088 bytes and BUFSIZ on FreeBSD is 1024 bytes, as
opposed to 8192 bytes on Linux.
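
To make the size mismatch concrete, here is a minimal sketch under stated assumptions: struct fake_filter_result is a hypothetical stand-in that only reproduces the 1088-byte size quoted above, and result_fits() is an illustrative helper, not a librte_cmdline function. The static_assert (C11) shows one way a fixed buffer size could be checked at build time.

```c
#include <assert.h>
#include <stddef.h>

#define FREEBSD_BUFSIZ 1024 /* BUFSIZ on FreeBSD, per the text above */
#define LINUX_BUFSIZ   8192 /* BUFSIZ on Linux, per the text above */

/* Hypothetical stand-in: only the 1088-byte total size matters here. */
struct fake_filter_result {
	char bytes[1088];
};

/* Compile-time guard: the result must fit in an 8192-byte buffer. */
static_assert(sizeof(struct fake_filter_result) <= LINUX_BUFSIZ,
	      "result structure larger than parse buffer");

/* Return 1 if a result of result_size bytes fits in buf_size bytes. */
static int
result_fits(size_t result_size, size_t buf_size)
{
	return result_size <= buf_size;
}
```

With these numbers the structure fits in the Linux-sized buffer but not in the FreeBSD-sized one, which is the case that crashes.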

The problem can be reproduced by running test-pmd on FreeBSD:
./testpmd -c 0x3 -n 4 -- -i --portmask=0x3 --pkt-filter-mode=perfect
And adding a filter:
add_perfect_filter 0 udp src 192.168.0.0 1024 dst 192.168.0.0 1024 flexbytes
0x800 vlan 0 queue 0 soft 0x17

This patch removes the OS dependency on BUFSIZ by defining and using a
library constant: #define CMDLINE_PARSE_RESULT_BUFSIZE 8192

Boundary checking is added to ensure this buffer cannot overflow; an error
message is printed when a token offset exceeds the buffer size.

Suggested-by: Olivier MATZ <olivier.matz@6wind.com>
http://git.droids-corp.org/?p=libcmdline.git;a=commitdiff;h=b1d5b169352e57df3fc14c51ffad4b83f3e5613f

Signed-off-by: Alan Carew <alan.carew@intel.com>
---
 lib/librte_cmdline/cmdline_parse.c | 22 +++++++++++++++-------
 lib/librte_cmdline/cmdline_parse.h |  3 +++
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index 940480d..f86f163 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -138,7 +138,7 @@ nb_common_chars(const char * s1, const char * s2)
  */
 static int
 match_inst(cmdline_parse_inst_t *inst, const char *buf,
-	   unsigned int nb_match_token, void * result_buf)
+	   unsigned int nb_match_token, void *result_buf, unsigned result_buf_size)
 {
 	unsigned int token_num=0;
 	cmdline_parse_token_hdr_t * token_p;
@@ -162,10 +162,18 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 		if ( isendofline(*buf) || iscomment(*buf) )
 			break;
 
-		if (result_buf)
+		if (result_buf) {
+			if (token_hdr.offset > result_buf_size) {
+				printf("Parse error(%s:%d): Token offset(%u) exceeds maximum "
+				"size(%u)\n", __FILE__, __LINE__, token_hdr.offset,
+				result_buf_size);
+				return -ENOBUFS;
+			}
+
 			n = token_hdr.ops->parse(token_p, buf,
 						 (char *)result_buf +
 						 token_hdr.offset);
+		}
 		else
 			n = token_hdr.ops->parse(token_p, buf, NULL);
 
@@ -219,7 +227,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	unsigned int inst_num=0;
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
-	char result_buf[BUFSIZ];
+	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
 	int comment = 0;
@@ -280,7 +288,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf);
+		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf));
 
 		if (tok > 0) /* we matched at least one token */
 			err = CMDLINE_PARSE_BAD_ARGS;
@@ -377,10 +385,10 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		inst = ctx[inst_num];
 		while (inst) {
 			/* parse the first tokens of the inst */
-			if (nb_token && match_inst(inst, buf, nb_token, NULL))
+			if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
 				goto next;
 
-			debug_printf("instruction match \n");
+			debug_printf("instruction match\n");
 			token_p = inst->tokens[nb_token];
 			if (token_p)
 				memcpy(&token_hdr, token_p, sizeof(token_hdr));
@@ -471,7 +479,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		/* we need to redo it */
 		inst = ctx[inst_num];
 
-		if (nb_token && match_inst(inst, buf, nb_token, NULL))
+		if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
 			goto next2;
 
 		token_p = inst->tokens[nb_token];
diff --git a/lib/librte_cmdline/cmdline_parse.h b/lib/librte_cmdline/cmdline_parse.h
index f18836d..dae53ba 100644
--- a/lib/librte_cmdline/cmdline_parse.h
+++ b/lib/librte_cmdline/cmdline_parse.h
@@ -80,6 +80,9 @@ extern "C" {
 #define CMDLINE_PARSE_COMPLETE_AGAIN    1
 #define CMDLINE_PARSE_COMPLETED_BUFFER  2
 
+/* maximum buffer size for parsed result */
+#define CMDLINE_PARSE_RESULT_BUFSIZE 8192
+
 /**
  * Stores a pointer to the ops struct, and the offset: the place to
  * write the parsed result in the destination structure.
-- 
1.9.3


* Re: [dpdk-dev] [PATCH v2] librte_cmdline: FreeBSD Fix overflow when size of command result structure is greater than BUFSIZ
  2014-11-10  9:19     ` [dpdk-dev] [PATCH v2] librte_cmdline: FreeBSD Fix overflow when size of command result structure is greater than BUFSIZ Alan Carew
@ 2014-12-05 14:16       ` Olivier MATZ
  2014-12-05 14:19         ` [dpdk-dev] [PATCH v3] " Olivier Matz
  0 siblings, 1 reply; 97+ messages in thread
From: Olivier MATZ @ 2014-12-05 14:16 UTC (permalink / raw)
  To: Alan Carew, dev

Hi Alan,
On 11/10/2014 10:19 AM, Alan Carew wrote:
> When using test-pmd with flow director in FreeBSD, the application will
> segfault/Bus error while parsing the command-line. This is due to how
> each command's result structure is represented during parsing, where each
> token's value is stored at its offset in a character array
> (char result_buf[BUFSIZ]) in cmdline_parse()
> (./lib/librte_cmdline/cmdline_parse.c).
>
> The overflow occurs where BUFSIZ is less than the size of a command's result
> structure. In this case "struct cmd_pkt_filter_result"
> (app/test-pmd/cmdline.c) is 1088 bytes and BUFSIZ on FreeBSD is 1024 bytes, as
> opposed to 8192 bytes on Linux.
> 
> The problem can be reproduced by running test-pmd on FreeBSD:
> ./testpmd -c 0x3 -n 4 -- -i --portmask=0x3 --pkt-filter-mode=perfect
> And adding a filter:
> add_perfect_filter 0 udp src 192.168.0.0 1024 dst 192.168.0.0 1024 flexbytes
> 0x800 vlan 0 queue 0 soft 0x17
> 
> This patch removes the OS dependency on BUFSIZ by defining and using a
> library constant: #define CMDLINE_PARSE_RESULT_BUFSIZE 8192
>
> Boundary checking is added to ensure this buffer cannot overflow; an error
> message is printed when a token offset exceeds the buffer size.
> 
> Suggested-by: Olivier MATZ <olivier.matz@6wind.com>
> http://git.droids-corp.org/?p=libcmdline.git;a=commitdiff;h=b1d5b169352e57df3fc14c51ffad4b83f3e5613f
> 
> Signed-off-by: Alan Carew <alan.carew@intel.com>

I think some checks are missing compared to the original patch. The
cmdline_parse_xxx() functions should be modified too. Please see a
v3 in my next email.
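
As a rough illustration of the kind of check meant here (a sketch only; space_left() and parse_u32_token() are hypothetical names, not librte_cmdline functions): the caller works out how much room remains after a token's offset, and the token parser refuses to write when its value would not fit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Room left after 'offset' in a result buffer of 'size' bytes,
 * or -ENOBUFS when the offset lies beyond the buffer.
 */
static int
space_left(unsigned int offset, unsigned int size)
{
	if (offset > size)
		return -ENOBUFS;
	return (int)(size - offset);
}

/*
 * Hypothetical token parser: stores a uint32_t in res only if ressize
 * allows it. res may be NULL, in which case the input is only validated.
 * Returns the number of characters consumed, or -1 on error.
 */
static int
parse_u32_token(const char *buf, void *res, unsigned int ressize)
{
	char *end;
	unsigned long v;

	if (buf == NULL || *buf == '\0')
		return -1;
	if (res != NULL && ressize < sizeof(uint32_t))
		return -1;

	errno = 0;
	v = strtoul(buf, &end, 0);
	if (errno != 0 || end == buf || v > UINT32_MAX)
		return -1;

	if (res != NULL) {
		uint32_t out = (uint32_t)v;

		/* Invariant established by the size check above. */
		assert(ressize >= sizeof(out));
		memcpy(res, &out, sizeof(out));
	}
	return (int)(end - buf);
}
```

A caller would pass the value from space_left() as ressize, so a token whose offset sits too close to the end of the buffer is rejected instead of written past it.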

Regards,
Olivier


* [dpdk-dev] [PATCH v3] librte_cmdline: FreeBSD Fix overflow when size of command result structure is greater than BUFSIZ
  2014-12-05 14:16       ` Olivier MATZ
@ 2014-12-05 14:19         ` Olivier Matz
  2014-12-05 15:51           ` Bruce Richardson
  0 siblings, 1 reply; 97+ messages in thread
From: Olivier Matz @ 2014-12-05 14:19 UTC (permalink / raw)
  To: dev

From: Alan Carew <alan.carew@intel.com>

When using test-pmd with flow director in FreeBSD, the application will
segfault/Bus error while parsing the command-line. This is due to how
each command's result structure is represented during parsing, where each
token's value is stored at its offset in a character array
(char result_buf[BUFSIZ]) in cmdline_parse()
(./lib/librte_cmdline/cmdline_parse.c).

The overflow occurs where BUFSIZ is less than the size of a command's result
structure. In this case "struct cmd_pkt_filter_result"
(app/test-pmd/cmdline.c) is 1088 bytes and BUFSIZ on FreeBSD is 1024 bytes, as
opposed to 8192 bytes on Linux.

The problem can be reproduced by running test-pmd on FreeBSD:
./testpmd -c 0x3 -n 4 -- -i --portmask=0x3 --pkt-filter-mode=perfect
And adding a filter:
add_perfect_filter 0 udp src 192.168.0.0 1024 dst 192.168.0.0 1024 flexbytes
0x800 vlan 0 queue 0 soft 0x17

This patch removes the OS dependency on BUFSIZ by defining and using a
library constant: #define CMDLINE_PARSE_RESULT_BUFSIZE 8192

Boundary checking is added to ensure this buffer cannot overflow; an error
message is printed when a token offset exceeds the buffer size.

Suggested-by: Olivier MATZ <olivier.matz@6wind.com>
http://git.droids-corp.org/?p=libcmdline.git;a=commitdiff;h=b1d5b169352e57df3fc14c51ffad4b83f3e5613f

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Olivier MATZ <olivier.matz@6wind.com>
---
 app/test-pmd/parameters.c                    |  6 +++--
 app/test/test_cmdline_etheraddr.c            | 13 +++++-----
 app/test/test_cmdline_ipaddr.c               | 27 ++++++++++++--------
 app/test/test_cmdline_num.c                  | 31 +++++++++++++----------
 app/test/test_cmdline_portlist.c             | 13 +++++-----
 app/test/test_cmdline_string.c               | 13 ++++++----
 examples/cmdline/parse_obj_list.c            |  6 ++++-
 examples/cmdline/parse_obj_list.h            |  3 ++-
 examples/vhost_xen/xenstore_parse.c          |  5 ++--
 lib/librte_cmdline/cmdline_parse.c           | 35 ++++++++++++++++---------
 lib/librte_cmdline/cmdline_parse.h           | 11 +++++---
 lib/librte_cmdline/cmdline_parse_etheraddr.c |  5 +++-
 lib/librte_cmdline/cmdline_parse_etheraddr.h |  4 +--
 lib/librte_cmdline/cmdline_parse_ipaddr.c    |  6 ++++-
 lib/librte_cmdline/cmdline_parse_ipaddr.h    |  4 +--
 lib/librte_cmdline/cmdline_parse_num.c       | 38 +++++++++++++++++++++++++++-
 lib/librte_cmdline/cmdline_parse_num.h       |  4 +--
 lib/librte_cmdline/cmdline_parse_portlist.c  |  5 +++-
 lib/librte_cmdline/cmdline_parse_portlist.h  |  4 +--
 lib/librte_cmdline/cmdline_parse_string.c    |  6 ++++-
 lib/librte_cmdline/cmdline_parse_string.h    |  2 +-
 lib/librte_pmd_bond/rte_eth_bond_args.c      |  3 ++-
 22 files changed, 168 insertions(+), 76 deletions(-)

diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 9573a43..8558985 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -223,7 +223,8 @@ init_peer_eth_addrs(char *config_filename)
 		if (fgets(buf, sizeof(buf), config_file) == NULL)
 			break;
 
-		if (cmdline_parse_etheraddr(NULL, buf, &peer_eth_addrs[i]) < 0 ){
+		if (cmdline_parse_etheraddr(NULL, buf, &peer_eth_addrs[i],
+				sizeof(peer_eth_addrs[i])) < 0 ){
 			printf("Bad MAC address format on line %d\n", i+1);
 			fclose(config_file);
 			return -1;
@@ -658,7 +659,8 @@ launch_args_parse(int argc, char** argv)
 						 "eth-peer: port %d >= RTE_MAX_ETHPORTS(%d)\n",
 						 n, RTE_MAX_ETHPORTS);
 
-				if (cmdline_parse_etheraddr(NULL, port_end, &peer_addr) < 0 )
+				if (cmdline_parse_etheraddr(NULL, port_end,
+						&peer_addr, sizeof(peer_addr)) < 0 )
 					rte_exit(EXIT_FAILURE,
 						 "Invalid ethernet address: %s\n",
 						 port_end);
diff --git a/app/test/test_cmdline_etheraddr.c b/app/test/test_cmdline_etheraddr.c
index 45c61ff..e4f4231 100644
--- a/app/test/test_cmdline_etheraddr.c
+++ b/app/test/test_cmdline_etheraddr.c
@@ -130,14 +130,15 @@ test_parse_etheraddr_invalid_param(void)
 	int ret = 0;
 
 	/* try all null */
-	ret = cmdline_parse_etheraddr(NULL, NULL, NULL);
+	ret = cmdline_parse_etheraddr(NULL, NULL, NULL, 0);
 	if (ret != -1) {
 		printf("Error: parser accepted null parameters!\n");
 		return -1;
 	}
 
 	/* try null buf */
-	ret = cmdline_parse_etheraddr(NULL, NULL, (void*)&result);
+	ret = cmdline_parse_etheraddr(NULL, NULL, (void*)&result,
+		sizeof(result));
 	if (ret != -1) {
 		printf("Error: parser accepted null string!\n");
 		return -1;
@@ -149,7 +150,7 @@ test_parse_etheraddr_invalid_param(void)
 	snprintf(buf, sizeof(buf), "%s",
 			ether_addr_valid_strs[0].str);
 
-	ret = cmdline_parse_etheraddr(NULL, buf, NULL);
+	ret = cmdline_parse_etheraddr(NULL, buf, NULL, 0);
 	if (ret == -1) {
 		printf("Error: parser rejected null result!\n");
 		return -1;
@@ -185,7 +186,7 @@ test_parse_etheraddr_invalid_data(void)
 		memset(&result, 0, sizeof(struct ether_addr));
 
 		ret = cmdline_parse_etheraddr(NULL, ether_addr_invalid_strs[i],
-				(void*)&result);
+			(void*)&result, sizeof(result));
 		if (ret != -1) {
 			printf("Error: parsing %s succeeded!\n",
 					ether_addr_invalid_strs[i]);
@@ -210,7 +211,7 @@ test_parse_etheraddr_valid(void)
 		memset(&result, 0, sizeof(struct ether_addr));
 
 		ret = cmdline_parse_etheraddr(NULL, ether_addr_valid_strs[i].str,
-				(void*)&result);
+			(void*)&result, sizeof(result));
 		if (ret < 0) {
 			printf("Error: parsing %s failed!\n",
 					ether_addr_valid_strs[i].str);
@@ -229,7 +230,7 @@ test_parse_etheraddr_valid(void)
 		memset(&result, 0, sizeof(struct ether_addr));
 
 		ret = cmdline_parse_etheraddr(NULL, ether_addr_garbage_strs[i],
-				(void*)&result);
+			(void*)&result, sizeof(result));
 		if (ret < 0) {
 			printf("Error: parsing %s failed!\n",
 					ether_addr_garbage_strs[i]);
diff --git a/app/test/test_cmdline_ipaddr.c b/app/test/test_cmdline_ipaddr.c
index 4ce928d..471d2ff 100644
--- a/app/test/test_cmdline_ipaddr.c
+++ b/app/test/test_cmdline_ipaddr.c
@@ -425,7 +425,8 @@ test_parse_ipaddr_valid(void)
 							buf, sizeof(buf));
 
 			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
-					ipaddr_valid_strs[i].str, (void*)&result);
+				ipaddr_valid_strs[i].str, (void*)&result,
+				sizeof(result));
 
 			/* if should have passed, or should have failed */
 			if ((ret < 0) ==
@@ -474,7 +475,8 @@ test_parse_ipaddr_valid(void)
 							buf, sizeof(buf));
 
 			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
-					ipaddr_garbage_addr4_strs[i], (void*)&result);
+				ipaddr_garbage_addr4_strs[i], (void*)&result,
+				sizeof(result));
 
 			/* if should have passed, or should have failed */
 			if ((ret < 0) ==
@@ -515,7 +517,8 @@ test_parse_ipaddr_valid(void)
 							buf, sizeof(buf));
 
 			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
-					ipaddr_garbage_addr6_strs[i], (void*)&result);
+				ipaddr_garbage_addr6_strs[i], (void*)&result,
+				sizeof(result));
 
 			/* if should have passed, or should have failed */
 			if ((ret < 0) ==
@@ -557,7 +560,8 @@ test_parse_ipaddr_valid(void)
 							buf, sizeof(buf));
 
 			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
-					ipaddr_garbage_network4_strs[i], (void*)&result);
+				ipaddr_garbage_network4_strs[i], (void*)&result,
+				sizeof(result));
 
 			/* if should have passed, or should have failed */
 			if ((ret < 0) ==
@@ -598,7 +602,8 @@ test_parse_ipaddr_valid(void)
 							buf, sizeof(buf));
 
 			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
-					ipaddr_garbage_network6_strs[i], (void*)&result);
+				ipaddr_garbage_network6_strs[i], (void*)&result,
+				sizeof(result));
 
 			/* if should have passed, or should have failed */
 			if ((ret < 0) ==
@@ -651,7 +656,8 @@ test_parse_ipaddr_invalid_data(void)
 					buf, sizeof(buf));
 
 			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
-					ipaddr_invalid_strs[i], (void*)&result);
+				ipaddr_invalid_strs[i], (void*)&result,
+				sizeof(result));
 
 			if (ret != -1) {
 				printf("Error: parsing %s as %s succeeded!\n",
@@ -677,25 +683,26 @@ test_parse_ipaddr_invalid_param(void)
 	token.ipaddr_data.flags = CMDLINE_IPADDR_V4;
 
 	/* null token */
-	if (cmdline_parse_ipaddr(NULL, buf, (void*)&result) != -1) {
+	if (cmdline_parse_ipaddr(NULL, buf, (void*)&result,
+			sizeof(result)) != -1) {
 		printf("Error: parser accepted invalid parameters!\n");
 		return -1;
 	}
 	/* null buffer */
 	if (cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
-			NULL, (void*)&result) != -1) {
+			NULL, (void*)&result, sizeof(result)) != -1) {
 		printf("Error: parser accepted invalid parameters!\n");
 		return -1;
 	}
 	/* empty buffer */
 	if (cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
-			"", (void*)&result) != -1) {
+			"", (void*)&result, sizeof(result)) != -1) {
 		printf("Error: parser accepted invalid parameters!\n");
 		return -1;
 	}
 	/* null result */
 	if (cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
-			buf, NULL) == -1) {
+			buf, NULL, 0) == -1) {
 		printf("Error: parser rejected null result!\n");
 		return -1;
 	}
diff --git a/app/test/test_cmdline_num.c b/app/test/test_cmdline_num.c
index 799d68c..04263d3 100644
--- a/app/test/test_cmdline_num.c
+++ b/app/test/test_cmdline_num.c
@@ -350,14 +350,14 @@ test_parse_num_invalid_param(void)
 			num_valid_positive_strs[0].str);
 
 	/* try all null */
-	ret = cmdline_parse_num(NULL, NULL, NULL);
+	ret = cmdline_parse_num(NULL, NULL, NULL, 0);
 	if (ret != -1) {
 		printf("Error: parser accepted null parameters!\n");
 		return -1;
 	}
 
 	/* try null token */
-	ret = cmdline_parse_num(NULL, buf, (void*)&result);
+	ret = cmdline_parse_num(NULL, buf, (void*)&result, sizeof(result));
 	if (ret != -1) {
 		printf("Error: parser accepted null token!\n");
 		return -1;
@@ -365,14 +365,15 @@ test_parse_num_invalid_param(void)
 
 	/* try null buf */
 	ret = cmdline_parse_num((cmdline_parse_token_hdr_t*)&token, NULL,
-			(void*)&result);
+		(void*)&result, sizeof(result));
 	if (ret != -1) {
 		printf("Error: parser accepted null string!\n");
 		return -1;
 	}
 
 	/* try null result */
-	ret = cmdline_parse_num((cmdline_parse_token_hdr_t*)&token, buf, NULL);
+	ret = cmdline_parse_num((cmdline_parse_token_hdr_t*)&token, buf,
+		NULL, 0);
 	if (ret == -1) {
 		printf("Error: parser rejected null result!\n");
 		return -1;
@@ -426,7 +427,7 @@ test_parse_num_invalid_data(void)
 			memset(&buf, 0, sizeof(buf));
 
 			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*)&token,
-					num_invalid_strs[i], (void*)&result);
+				num_invalid_strs[i], (void*)&result, sizeof(result));
 			if (ret != -1) {
 				/* get some info about what we are trying to parse */
 				cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
@@ -466,8 +467,9 @@ test_parse_num_valid(void)
 			cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
 					buf, sizeof(buf));
 
-			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token, num_valid_positive_strs[i].str,
-					(void*)&result);
+			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token,
+				num_valid_positive_strs[i].str,
+				(void*)&result, sizeof(result));
 
 			/* if it should have passed but didn't, or if it should have failed but didn't */
 			if ((ret < 0) == (can_parse_unsigned(num_valid_positive_strs[i].result, type) > 0)) {
@@ -493,8 +495,9 @@ test_parse_num_valid(void)
 			cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
 					buf, sizeof(buf));
 
-			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token, num_valid_negative_strs[i].str,
-					(void*)&result);
+			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token,
+				num_valid_negative_strs[i].str,
+				(void*)&result, sizeof(result));
 
 			/* if it should have passed but didn't, or if it should have failed but didn't */
 			if ((ret < 0) == (can_parse_signed(num_valid_negative_strs[i].result, type) > 0)) {
@@ -542,8 +545,9 @@ test_parse_num_valid(void)
 			cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
 					buf, sizeof(buf));
 
-			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token, num_garbage_positive_strs[i].str,
-					(void*)&result);
+			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token,
+				num_garbage_positive_strs[i].str,
+				(void*)&result, sizeof(result));
 
 			/* if it should have passed but didn't, or if it should have failed but didn't */
 			if ((ret < 0) == (can_parse_unsigned(num_garbage_positive_strs[i].result, type) > 0)) {
@@ -569,8 +573,9 @@ test_parse_num_valid(void)
 			cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
 					buf, sizeof(buf));
 
-			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token, num_garbage_negative_strs[i].str,
-					(void*)&result);
+			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token,
+				num_garbage_negative_strs[i].str,
+				(void*)&result, sizeof(result));
 
 			/* if it should have passed but didn't, or if it should have failed but didn't */
 			if ((ret < 0) == (can_parse_signed(num_garbage_negative_strs[i].result, type) > 0)) {
diff --git a/app/test/test_cmdline_portlist.c b/app/test/test_cmdline_portlist.c
index 9f9633c..b9664b0 100644
--- a/app/test/test_cmdline_portlist.c
+++ b/app/test/test_cmdline_portlist.c
@@ -139,21 +139,22 @@ test_parse_portlist_invalid_param(void)
 	memset(&result, 0, sizeof(cmdline_portlist_t));
 
 	/* try all null */
-	ret = cmdline_parse_portlist(NULL, NULL, NULL);
+	ret = cmdline_parse_portlist(NULL, NULL, NULL, 0);
 	if (ret != -1) {
 		printf("Error: parser accepted null parameters!\n");
 		return -1;
 	}
 
 	/* try null buf */
-	ret = cmdline_parse_portlist(NULL, NULL, (void*)&result);
+	ret = cmdline_parse_portlist(NULL, NULL, (void*)&result,
+		sizeof(result));
 	if (ret != -1) {
 		printf("Error: parser accepted null string!\n");
 		return -1;
 	}
 
 	/* try null result */
-	ret = cmdline_parse_portlist(NULL, portlist_valid_strs[0].str, NULL);
+	ret = cmdline_parse_portlist(NULL, portlist_valid_strs[0].str, NULL, 0);
 	if (ret == -1) {
 		printf("Error: parser rejected null result!\n");
 		return -1;
@@ -188,7 +189,7 @@ test_parse_portlist_invalid_data(void)
 		memset(&result, 0, sizeof(cmdline_portlist_t));
 
 		ret = cmdline_parse_portlist(NULL, portlist_invalid_strs[i],
-				(void*)&result);
+			(void*)&result, sizeof(result));
 		if (ret != -1) {
 			printf("Error: parsing %s succeeded!\n",
 					portlist_invalid_strs[i]);
@@ -213,7 +214,7 @@ test_parse_portlist_valid(void)
 		memset(&result, 0, sizeof(cmdline_portlist_t));
 
 		ret = cmdline_parse_portlist(NULL, portlist_valid_strs[i].str,
-				(void*)&result);
+			(void*)&result, sizeof(result));
 		if (ret < 0) {
 			printf("Error: parsing %s failed!\n",
 					portlist_valid_strs[i].str);
@@ -232,7 +233,7 @@ test_parse_portlist_valid(void)
 		memset(&result, 0, sizeof(cmdline_portlist_t));
 
 		ret = cmdline_parse_portlist(NULL, portlist_garbage_strs[i],
-				(void*)&result);
+			(void*)&result, sizeof(result));
 		if (ret < 0) {
 			printf("Error: parsing %s failed!\n",
 					portlist_garbage_strs[i]);
diff --git a/app/test/test_cmdline_string.c b/app/test/test_cmdline_string.c
index 3ec0ce1..915a7d7 100644
--- a/app/test/test_cmdline_string.c
+++ b/app/test/test_cmdline_string.c
@@ -178,7 +178,7 @@ test_parse_string_invalid_param(void)
 		printf("Error: function accepted null token!\n");
 		return -1;
 	}
-	if (cmdline_parse_string(NULL, buf, NULL) != -1) {
+	if (cmdline_parse_string(NULL, buf, NULL, 0) != -1) {
 		printf("Error: function accepted null token!\n");
 		return -1;
 	}
@@ -189,7 +189,8 @@ test_parse_string_invalid_param(void)
 		return -1;
 	}
 	if (cmdline_parse_string(
-			(cmdline_parse_token_hdr_t*)&token, NULL, (void*)&result) != -1) {
+			(cmdline_parse_token_hdr_t*)&token, NULL,
+			(void*)&result, sizeof(result)) != -1) {
 		printf("Error: function accepted null buffer!\n");
 		return -1;
 	}
@@ -200,7 +201,7 @@ test_parse_string_invalid_param(void)
 	}
 	/* test null result */
 	if (cmdline_parse_string(
-			(cmdline_parse_token_hdr_t*)&token, buf, NULL) == -1) {
+			(cmdline_parse_token_hdr_t*)&token, buf, NULL, 0) == -1) {
 		printf("Error: function rejected null result!\n");
 		return -1;
 	}
@@ -233,7 +234,8 @@ test_parse_string_invalid_data(void)
 		token.string_data.str = string_invalid_strs[i].fixed_str;
 
 		if (cmdline_parse_string((cmdline_parse_token_hdr_t*)&token,
-				string_invalid_strs[i].str, (void*)buf) != -1) {
+				string_invalid_strs[i].str, (void*)buf,
+				sizeof(buf)) != -1) {
 			memset(help_str, 0, sizeof(help_str));
 			memset(&help_token, 0, sizeof(help_token));
 
@@ -330,7 +332,8 @@ test_parse_string_valid(void)
 		token.string_data.str = string_parse_strs[i].fixed_str;
 
 		if (cmdline_parse_string((cmdline_parse_token_hdr_t*)&token,
-				string_parse_strs[i].str, (void*)buf) < 0) {
+				string_parse_strs[i].str, (void*)buf,
+				sizeof(buf)) < 0) {
 
 			/* clean help data */
 			memset(&help_token, 0, sizeof(help_token));
diff --git a/examples/cmdline/parse_obj_list.c b/examples/cmdline/parse_obj_list.c
index 2625ca3..cdbaf2f 100644
--- a/examples/cmdline/parse_obj_list.c
+++ b/examples/cmdline/parse_obj_list.c
@@ -84,7 +84,8 @@ struct cmdline_token_ops token_obj_list_ops = {
 };
 
 int
-parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
+parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *buf, void *res,
+	unsigned ressize)
 {
 	struct token_obj_list *tk2 = (struct token_obj_list *)tk;
 	struct token_obj_list_data *tkd = &tk2->obj_list_data;
@@ -94,6 +95,9 @@ parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
 	if (*buf == 0)
 		return -1;
 
+	if (res && ressize < sizeof(struct object *))
+		return -1;
+
 	while(!cmdline_isendoftoken(buf[token_len]))
 		token_len++;
 
diff --git a/examples/cmdline/parse_obj_list.h b/examples/cmdline/parse_obj_list.h
index 297fec4..871c53a 100644
--- a/examples/cmdline/parse_obj_list.h
+++ b/examples/cmdline/parse_obj_list.h
@@ -91,7 +91,8 @@ typedef struct token_obj_list parse_token_obj_list_t;
 
 extern struct cmdline_token_ops token_obj_list_ops;
 
-int parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res);
+int parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res,
+	unsigned ressize);
 int complete_get_nb_obj_list(cmdline_parse_token_hdr_t *tk);
 int complete_get_elt_obj_list(cmdline_parse_token_hdr_t *tk, int idx,
 			      char *dstbuf, unsigned int size);
diff --git a/examples/vhost_xen/xenstore_parse.c b/examples/vhost_xen/xenstore_parse.c
index fdd69b2..9441639 100644
--- a/examples/vhost_xen/xenstore_parse.c
+++ b/examples/vhost_xen/xenstore_parse.c
@@ -77,7 +77,7 @@ struct grant_node_item {
 } __attribute__((packed));
 
 int cmdline_parse_etheraddr(void *tk, const char *srcbuf,
-			    void *res);
+	void *res, unsigned ressize);
 
 /* Map grant ref refid at addr_ori*/
 static void *
@@ -676,7 +676,8 @@ xen_parse_etheraddr(struct xen_vring *vring)
 	if ((buf = xen_read_node(path, &len)) == NULL)
 		goto out;
 
-	if (cmdline_parse_etheraddr(NULL, buf, &vring->addr) < 0)
+	if (cmdline_parse_etheraddr(NULL, buf, &vring->addr,
+			sizeof(vring->addr)) < 0)
 		goto out;
 	ret = 0;
 out:
diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index 940480d..dfc885c 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -138,7 +138,7 @@ nb_common_chars(const char * s1, const char * s2)
  */
 static int
 match_inst(cmdline_parse_inst_t *inst, const char *buf,
-	   unsigned int nb_match_token, void * result_buf)
+	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size)
 {
 	unsigned int token_num=0;
 	cmdline_parse_token_hdr_t * token_p;
@@ -162,12 +162,23 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 		if ( isendofline(*buf) || iscomment(*buf) )
 			break;
 
-		if (result_buf)
-			n = token_hdr.ops->parse(token_p, buf,
-						 (char *)result_buf +
-						 token_hdr.offset);
-		else
-			n = token_hdr.ops->parse(token_p, buf, NULL);
+		if (resbuf == NULL) {
+			n = token_hdr.ops->parse(token_p, buf, NULL, 0);
+		} else {
+			unsigned rb_sz;
+
+			if (token_hdr.offset > resbuf_size) {
+				printf("Parse error(%s:%d): Token offset(%u) "
+					"exceeds maximum size(%u)\n",
+					__FILE__, __LINE__,
+					token_hdr.offset, resbuf_size);
+				return -ENOBUFS;
+			}
+			rb_sz = resbuf_size - token_hdr.offset;
+
+			n = token_hdr.ops->parse(token_p, buf, (char *)resbuf +
+				token_hdr.offset, rb_sz);
+		}
 
 		if (n < 0)
 			break;
@@ -219,7 +230,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	unsigned int inst_num=0;
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
-	char result_buf[BUFSIZ];
+	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
 	int comment = 0;
@@ -280,7 +291,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf);
+		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf));
 
 		if (tok > 0) /* we matched at least one token */
 			err = CMDLINE_PARSE_BAD_ARGS;
@@ -377,10 +388,10 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		inst = ctx[inst_num];
 		while (inst) {
 			/* parse the first tokens of the inst */
-			if (nb_token && match_inst(inst, buf, nb_token, NULL))
+			if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
 				goto next;
 
-			debug_printf("instruction match \n");
+			debug_printf("instruction match\n");
 			token_p = inst->tokens[nb_token];
 			if (token_p)
 				memcpy(&token_hdr, token_p, sizeof(token_hdr));
@@ -471,7 +482,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		/* we need to redo it */
 		inst = ctx[inst_num];
 
-		if (nb_token && match_inst(inst, buf, nb_token, NULL))
+		if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
 			goto next2;
 
 		token_p = inst->tokens[nb_token];
diff --git a/lib/librte_cmdline/cmdline_parse.h b/lib/librte_cmdline/cmdline_parse.h
index f18836d..4b25c45 100644
--- a/lib/librte_cmdline/cmdline_parse.h
+++ b/lib/librte_cmdline/cmdline_parse.h
@@ -80,6 +80,9 @@ extern "C" {
 #define CMDLINE_PARSE_COMPLETE_AGAIN    1
 #define CMDLINE_PARSE_COMPLETED_BUFFER  2
 
+/* maximum buffer size for parsed result */
+#define CMDLINE_PARSE_RESULT_BUFSIZE 8192
+
 /**
  * Stores a pointer to the ops struct, and the offset: the place to
  * write the parsed result in the destination structure.
@@ -110,12 +113,14 @@ typedef struct cmdline_token_hdr cmdline_parse_token_hdr_t;
  * -1 on error and 0 on success.
  */
 struct cmdline_token_ops {
-	/** parse(token ptr, buf, res pts) */
-	int (*parse)(cmdline_parse_token_hdr_t *, const char *, void *);
+	/** parse(token ptr, buf, res pts, buf len) */
+	int (*parse)(cmdline_parse_token_hdr_t *, const char *, void *,
+		unsigned int);
 	/** return the num of possible choices for this token */
 	int (*complete_get_nb)(cmdline_parse_token_hdr_t *);
 	/** return the elt x for this token (token, idx, dstbuf, size) */
-	int (*complete_get_elt)(cmdline_parse_token_hdr_t *, int, char *, unsigned int);
+	int (*complete_get_elt)(cmdline_parse_token_hdr_t *, int, char *,
+		unsigned int);
 	/** get help for this token (token, dstbuf, size) */
 	int (*get_help)(cmdline_parse_token_hdr_t *, char *, unsigned int);
 };
diff --git a/lib/librte_cmdline/cmdline_parse_etheraddr.c b/lib/librte_cmdline/cmdline_parse_etheraddr.c
index 5285c40..64ae86c 100644
--- a/lib/librte_cmdline/cmdline_parse_etheraddr.c
+++ b/lib/librte_cmdline/cmdline_parse_etheraddr.c
@@ -137,12 +137,15 @@ my_ether_aton(const char *a)
 
 int
 cmdline_parse_etheraddr(__attribute__((unused)) cmdline_parse_token_hdr_t *tk,
-			const char *buf, void *res)
+	const char *buf, void *res, unsigned ressize)
 {
 	unsigned int token_len = 0;
 	char ether_str[ETHER_ADDRSTRLENLONG+1];
 	struct ether_addr *tmp;
 
+	if (res && ressize < sizeof(struct ether_addr))
+		return -1;
+
 	if (!buf || ! *buf)
 		return -1;
 
diff --git a/lib/librte_cmdline/cmdline_parse_etheraddr.h b/lib/librte_cmdline/cmdline_parse_etheraddr.h
index 4427e40..0085bb3 100644
--- a/lib/librte_cmdline/cmdline_parse_etheraddr.h
+++ b/lib/librte_cmdline/cmdline_parse_etheraddr.h
@@ -73,9 +73,9 @@ typedef struct cmdline_token_etheraddr cmdline_parse_token_etheraddr_t;
 extern struct cmdline_token_ops cmdline_token_etheraddr_ops;
 
 int cmdline_parse_etheraddr(cmdline_parse_token_hdr_t *tk, const char *srcbuf,
-			    void *res);
+	void *res, unsigned ressize);
 int cmdline_get_help_etheraddr(cmdline_parse_token_hdr_t *tk, char *dstbuf,
-			       unsigned int size);
+	unsigned int size);
 
 #define TOKEN_ETHERADDR_INITIALIZER(structure, field)       \
 {                                                           \
diff --git a/lib/librte_cmdline/cmdline_parse_ipaddr.c b/lib/librte_cmdline/cmdline_parse_ipaddr.c
index ac83514..7f33599 100644
--- a/lib/librte_cmdline/cmdline_parse_ipaddr.c
+++ b/lib/librte_cmdline/cmdline_parse_ipaddr.c
@@ -306,7 +306,8 @@ inet_pton6(const char *src, unsigned char *dst)
 }
 
 int
-cmdline_parse_ipaddr(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
+cmdline_parse_ipaddr(cmdline_parse_token_hdr_t *tk, const char *buf, void *res,
+	unsigned ressize)
 {
 	struct cmdline_token_ipaddr *tk2;
 	unsigned int token_len = 0;
@@ -315,6 +316,9 @@ cmdline_parse_ipaddr(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
 	char *prefix, *prefix_end;
 	long prefixlen = 0;
 
+	if (res && ressize < sizeof(cmdline_ipaddr_t))
+		return -1;
+
 	if (!buf || !tk || ! *buf)
 		return -1;
 
diff --git a/lib/librte_cmdline/cmdline_parse_ipaddr.h b/lib/librte_cmdline/cmdline_parse_ipaddr.h
index 0e2f490..296c374 100644
--- a/lib/librte_cmdline/cmdline_parse_ipaddr.h
+++ b/lib/librte_cmdline/cmdline_parse_ipaddr.h
@@ -92,9 +92,9 @@ typedef struct cmdline_token_ipaddr cmdline_parse_token_ipaddr_t;
 extern struct cmdline_token_ops cmdline_token_ipaddr_ops;
 
 int cmdline_parse_ipaddr(cmdline_parse_token_hdr_t *tk, const char *srcbuf,
-			 void *res);
+	void *res, unsigned ressize);
 int cmdline_get_help_ipaddr(cmdline_parse_token_hdr_t *tk, char *dstbuf,
-			    unsigned int size);
+	unsigned int size);
 
 #define TOKEN_IPADDR_INITIALIZER(structure, field)      \
 {                                                       \
diff --git a/lib/librte_cmdline/cmdline_parse_num.c b/lib/librte_cmdline/cmdline_parse_num.c
index 0b9e4d0..1cf53d9 100644
--- a/lib/librte_cmdline/cmdline_parse_num.c
+++ b/lib/librte_cmdline/cmdline_parse_num.c
@@ -119,10 +119,40 @@ add_to_res(unsigned int c, uint64_t *res, unsigned int base)
 	return 0;
 }
 
+static int
+check_res_size(struct cmdline_token_num_data *nd, unsigned ressize)
+{
+	switch (nd->type) {
+		case INT8:
+		case UINT8:
+			if (ressize < sizeof(int8_t))
+				return -1;
+			break;
+		case INT16:
+		case UINT16:
+			if (ressize < sizeof(int16_t))
+				return -1;
+			break;
+		case INT32:
+		case UINT32:
+			if (ressize < sizeof(int32_t))
+				return -1;
+			break;
+		case INT64:
+		case UINT64:
+			if (ressize < sizeof(int64_t))
+				return -1;
+			break;
+		default:
+			return -1;
+	}
+	return 0;
+}
 
 /* parse an int */
 int
-cmdline_parse_num(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res)
+cmdline_parse_num(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res,
+	unsigned ressize)
 {
 	struct cmdline_token_num_data nd;
 	enum num_parse_state_t st = START;
@@ -141,6 +171,12 @@ cmdline_parse_num(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res)
 
 	memcpy(&nd, &((struct cmdline_token_num *)tk)->num_data, sizeof(nd));
 
+	/* check that we have enough room in res */
+	if (res) {
+		if (check_res_size(&nd, ressize) < 0)
+			return -1;
+	}
+
 	while ( st != ERROR && c && ! cmdline_isendoftoken(c) ) {
 		debug_printf("%c %x -> ", c, c);
 		switch (st) {
diff --git a/lib/librte_cmdline/cmdline_parse_num.h b/lib/librte_cmdline/cmdline_parse_num.h
index 77f2f9b..5376806 100644
--- a/lib/librte_cmdline/cmdline_parse_num.h
+++ b/lib/librte_cmdline/cmdline_parse_num.h
@@ -89,9 +89,9 @@ typedef struct cmdline_token_num cmdline_parse_token_num_t;
 extern struct cmdline_token_ops cmdline_token_num_ops;
 
 int cmdline_parse_num(cmdline_parse_token_hdr_t *tk,
-		      const char *srcbuf, void *res);
+	const char *srcbuf, void *res, unsigned ressize);
 int cmdline_get_help_num(cmdline_parse_token_hdr_t *tk,
-			 char *dstbuf, unsigned int size);
+	char *dstbuf, unsigned int size);
 
 #define TOKEN_NUM_INITIALIZER(structure, field, numtype)    \
 {                                                           \
diff --git a/lib/librte_cmdline/cmdline_parse_portlist.c b/lib/librte_cmdline/cmdline_parse_portlist.c
index 7eac05c..834f2e6 100644
--- a/lib/librte_cmdline/cmdline_parse_portlist.c
+++ b/lib/librte_cmdline/cmdline_parse_portlist.c
@@ -127,7 +127,7 @@ parse_ports(cmdline_portlist_t * pl, const char * str)
 
 int
 cmdline_parse_portlist(__attribute__((unused)) cmdline_parse_token_hdr_t *tk,
-		const char *buf, void *res)
+	const char *buf, void *res, unsigned ressize)
 {
 	unsigned int token_len = 0;
 	char portlist_str[PORTLIST_TOKEN_SIZE+1];
@@ -136,6 +136,9 @@ cmdline_parse_portlist(__attribute__((unused)) cmdline_parse_token_hdr_t *tk,
 	if (!buf || ! *buf)
 		return (-1);
 
+	if (res && ressize < PORTLIST_TOKEN_SIZE)
+		return -1;
+
 	pl = res;
 
 	while (!cmdline_isendoftoken(buf[token_len]) &&
diff --git a/lib/librte_cmdline/cmdline_parse_portlist.h b/lib/librte_cmdline/cmdline_parse_portlist.h
index 6fdc406..8505059 100644
--- a/lib/librte_cmdline/cmdline_parse_portlist.h
+++ b/lib/librte_cmdline/cmdline_parse_portlist.h
@@ -81,9 +81,9 @@ typedef struct cmdline_token_portlist cmdline_parse_token_portlist_t;
 extern struct cmdline_token_ops cmdline_token_portlist_ops;
 
 int cmdline_parse_portlist(cmdline_parse_token_hdr_t *tk,
-		      const char *srcbuf, void *res);
+	const char *srcbuf, void *res, unsigned ressize);
 int cmdline_get_help_portlist(cmdline_parse_token_hdr_t *tk,
-			 char *dstbuf, unsigned int size);
+	char *dstbuf, unsigned int size);
 
 #define TOKEN_PORTLIST_INITIALIZER(structure, field)        \
 {                                                           \
diff --git a/lib/librte_cmdline/cmdline_parse_string.c b/lib/librte_cmdline/cmdline_parse_string.c
index b1bfe91..45883b3 100644
--- a/lib/librte_cmdline/cmdline_parse_string.c
+++ b/lib/librte_cmdline/cmdline_parse_string.c
@@ -105,13 +105,17 @@ get_next_token(const char *s)
 }
 
 int
-cmdline_parse_string(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
+cmdline_parse_string(cmdline_parse_token_hdr_t *tk, const char *buf, void *res,
+	unsigned ressize)
 {
 	struct cmdline_token_string *tk2;
 	struct cmdline_token_string_data *sd;
 	unsigned int token_len;
 	const char *str;
 
+	if (res && ressize < STR_TOKEN_SIZE)
+		return -1;
+
 	if (!tk || !buf || ! *buf)
 		return -1;
 
diff --git a/lib/librte_cmdline/cmdline_parse_string.h b/lib/librte_cmdline/cmdline_parse_string.h
index 52c916c..c205622 100644
--- a/lib/librte_cmdline/cmdline_parse_string.h
+++ b/lib/librte_cmdline/cmdline_parse_string.h
@@ -83,7 +83,7 @@ typedef struct cmdline_token_string cmdline_parse_token_string_t;
 extern struct cmdline_token_ops cmdline_token_string_ops;
 
 int cmdline_parse_string(cmdline_parse_token_hdr_t *tk, const char *srcbuf,
-			 void *res);
+	void *res, unsigned ressize);
 int cmdline_complete_get_nb_string(cmdline_parse_token_hdr_t *tk);
 int cmdline_complete_get_elt_string(cmdline_parse_token_hdr_t *tk, int idx,
 				    char *dstbuf, unsigned int size);
diff --git a/lib/librte_pmd_bond/rte_eth_bond_args.c b/lib/librte_pmd_bond/rte_eth_bond_args.c
index 4114833..ca4de38 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_args.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_args.c
@@ -254,7 +254,8 @@ bond_ethdev_parse_bond_mac_addr_kvarg(const char *key __rte_unused,
 		return -1;
 
 	/* Parse MAC */
-	return cmdline_parse_etheraddr(NULL, value, extra_args);
+	return cmdline_parse_etheraddr(NULL, value, extra_args,
+		sizeof(struct ether_addr));
 }
 
 int
-- 
2.1.0
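
For out-of-tree token types, the extra size argument means each parse callback can refuse to write when the caller has not left enough room. A minimal sketch of such a callback follows. It assumes a made-up "percent" token; `struct percent_result` and `parse_percent` are hypothetical names, and the token argument is an opaque pointer so the example builds without the DPDK headers. The return convention (characters consumed, or -1) is this sketch's own choice.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for an illustrative "percent" token. */
struct percent_result {
	uint8_t value;
};

/*
 * Sketch of a parse callback in the four-argument form:
 * (token, input buffer, result pointer, bytes available at result).
 * Returns the number of characters consumed, or -1 on error.
 */
int
parse_percent(void *tk, const char *buf, void *res, unsigned ressize)
{
	unsigned int len = 0;
	unsigned int val = 0;

	(void)tk; /* unused in this sketch */

	/* Refuse to write if the caller did not leave enough room. */
	if (res && ressize < sizeof(struct percent_result))
		return -1;

	if (!buf || !*buf)
		return -1;

	while (buf[len] >= '0' && buf[len] <= '9') {
		val = val * 10 + (unsigned int)(buf[len] - '0');
		if (val > 100)
			return -1;
		len++;
	}
	if (len == 0)
		return -1;

	/* A NULL result means "validate only", as in the completion path. */
	if (res) {
		struct percent_result r = { .value = (uint8_t)val };
		memcpy(res, &r, sizeof(r));
	}
	return (int)len;
}
```

As in the parsers touched by this patch, the size check only applies when a result pointer is supplied, so callers that pass NULL and 0 to validate input keep working.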


* Re: [dpdk-dev] [PATCH v3] librte_cmdline: FreeBSD Fix overflow when size of command result structure is greater than BUFSIZ
  2014-12-05 14:19         ` [dpdk-dev] [PATCH v3] " Olivier Matz
@ 2014-12-05 15:51           ` Bruce Richardson
  2014-12-05 15:58             ` Thomas Monjalon
  0 siblings, 1 reply; 97+ messages in thread
From: Bruce Richardson @ 2014-12-05 15:51 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev

On Fri, Dec 05, 2014 at 03:19:07PM +0100, Olivier Matz wrote:
> From: Alan Carew <alan.carew@intel.com>
> 
> When using test-pmd with flow director on FreeBSD, the application will
> segfault or raise a bus error while parsing the command line. This is due to
> how each command's result structure is represented during parsing: the value
> of each token is stored at its offset in a character array
> (char result_buf[BUFSIZ]) in cmdline_parse()
> (./lib/librte_cmdline/cmdline_parse.c).
> 
> The overflow occurs when BUFSIZ is less than the size of a command's result
> structure. In this case "struct cmd_pkt_filter_result"
> (app/test-pmd/cmdline.c) is 1088 bytes, and BUFSIZ on FreeBSD is 1024 bytes as
> opposed to 8192 bytes on Linux.
> 
> The problem can be reproduced by running test-pmd on FreeBSD:
> ./testpmd -c 0x3 -n 4 -- -i --portmask=0x3 --pkt-filter-mode=perfect
> And adding a filter:
> add_perfect_filter 0 udp src 192.168.0.0 1024 dst 192.168.0.0 1024 flexbytes
> 0x800 vlan 0 queue 0 soft 0x17
> 
> This patch removes the dependency on the OS-defined BUFSIZ and instead
> defines and uses a library constant:
> #define CMDLINE_PARSE_RESULT_BUFSIZE 8192
> 
> It also adds boundary checking so that this buffer cannot overflow; an
> error message is printed when a token offset exceeds the buffer size.
> 
> Suggested-by: Olivier MATZ <olivier.matz@6wind.com>
> http://git.droids-corp.org/?p=libcmdline.git;a=commitdiff;h=b1d5b169352e57df3fc14c51ffad4b83f3e5613f
> 
> Signed-off-by: Alan Carew <alan.carew@intel.com>
> Signed-off-by: Olivier MATZ <olivier.matz@6wind.com>

Tested on FreeBSD 10 and this patch fixes the issue described.

Tested-by: Bruce Richardson <bruce.richardson@intel.com>
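
For anyone reading along, the offset check described in the quoted commit message can be sketched in isolation as below. This is only an illustration under stated assumptions: `struct big_result` is a made-up stand-in for a result structure larger than 1024 bytes, and `room_at_offset` is a hypothetical helper that mirrors the comparison the patch adds to match_inst(), not a function from the library.

```c
#include <stddef.h>

/* Hypothetical stand-in for a command result structure over 1024 bytes. */
struct big_result {
	char pad[1080];
	unsigned long last_field;
};

/*
 * Return the number of bytes available at `offset` inside a result buffer
 * of `bufsize` bytes, or -1 if the offset lies outside the buffer. The
 * remaining size is what a token parser would receive as its size argument.
 */
long
room_at_offset(size_t offset, size_t bufsize)
{
	if (offset > bufsize)
		return -1;
	return (long)(bufsize - offset);
}
```

With a 1024-byte buffer, room_at_offset(offsetof(struct big_result, last_field), 1024) reports -1, so the write is rejected rather than landing past the end of the buffer. With an 8192-byte buffer the same offset leaves room, and the remaining size is passed on to the token parser.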

> ---
>  app/test-pmd/parameters.c                    |  6 +++--
>  app/test/test_cmdline_etheraddr.c            | 13 +++++-----
>  app/test/test_cmdline_ipaddr.c               | 27 ++++++++++++--------
>  app/test/test_cmdline_num.c                  | 31 +++++++++++++----------
>  app/test/test_cmdline_portlist.c             | 13 +++++-----
>  app/test/test_cmdline_string.c               | 13 ++++++----
>  examples/cmdline/parse_obj_list.c            |  6 ++++-
>  examples/cmdline/parse_obj_list.h            |  3 ++-
>  examples/vhost_xen/xenstore_parse.c          |  5 ++--
>  lib/librte_cmdline/cmdline_parse.c           | 35 ++++++++++++++++---------
>  lib/librte_cmdline/cmdline_parse.h           | 11 +++++---
>  lib/librte_cmdline/cmdline_parse_etheraddr.c |  5 +++-
>  lib/librte_cmdline/cmdline_parse_etheraddr.h |  4 +--
>  lib/librte_cmdline/cmdline_parse_ipaddr.c    |  6 ++++-
>  lib/librte_cmdline/cmdline_parse_ipaddr.h    |  4 +--
>  lib/librte_cmdline/cmdline_parse_num.c       | 38 +++++++++++++++++++++++++++-
>  lib/librte_cmdline/cmdline_parse_num.h       |  4 +--
>  lib/librte_cmdline/cmdline_parse_portlist.c  |  5 +++-
>  lib/librte_cmdline/cmdline_parse_portlist.h  |  4 +--
>  lib/librte_cmdline/cmdline_parse_string.c    |  6 ++++-
>  lib/librte_cmdline/cmdline_parse_string.h    |  2 +-
>  lib/librte_pmd_bond/rte_eth_bond_args.c      |  3 ++-
>  22 files changed, 168 insertions(+), 76 deletions(-)
> 
> diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
> index 9573a43..8558985 100644
> --- a/app/test-pmd/parameters.c
> +++ b/app/test-pmd/parameters.c
> @@ -223,7 +223,8 @@ init_peer_eth_addrs(char *config_filename)
>  		if (fgets(buf, sizeof(buf), config_file) == NULL)
>  			break;
>  
> -		if (cmdline_parse_etheraddr(NULL, buf, &peer_eth_addrs[i]) < 0 ){
> +		if (cmdline_parse_etheraddr(NULL, buf, &peer_eth_addrs[i],
> +				sizeof(peer_eth_addrs[i])) < 0 ){
>  			printf("Bad MAC address format on line %d\n", i+1);
>  			fclose(config_file);
>  			return -1;
> @@ -658,7 +659,8 @@ launch_args_parse(int argc, char** argv)
>  						 "eth-peer: port %d >= RTE_MAX_ETHPORTS(%d)\n",
>  						 n, RTE_MAX_ETHPORTS);
>  
> -				if (cmdline_parse_etheraddr(NULL, port_end, &peer_addr) < 0 )
> +				if (cmdline_parse_etheraddr(NULL, port_end,
> +						&peer_addr, sizeof(peer_addr)) < 0 )
>  					rte_exit(EXIT_FAILURE,
>  						 "Invalid ethernet address: %s\n",
>  						 port_end);
> diff --git a/app/test/test_cmdline_etheraddr.c b/app/test/test_cmdline_etheraddr.c
> index 45c61ff..e4f4231 100644
> --- a/app/test/test_cmdline_etheraddr.c
> +++ b/app/test/test_cmdline_etheraddr.c
> @@ -130,14 +130,15 @@ test_parse_etheraddr_invalid_param(void)
>  	int ret = 0;
>  
>  	/* try all null */
> -	ret = cmdline_parse_etheraddr(NULL, NULL, NULL);
> +	ret = cmdline_parse_etheraddr(NULL, NULL, NULL, 0);
>  	if (ret != -1) {
>  		printf("Error: parser accepted null parameters!\n");
>  		return -1;
>  	}
>  
>  	/* try null buf */
> -	ret = cmdline_parse_etheraddr(NULL, NULL, (void*)&result);
> +	ret = cmdline_parse_etheraddr(NULL, NULL, (void*)&result,
> +		sizeof(result));
>  	if (ret != -1) {
>  		printf("Error: parser accepted null string!\n");
>  		return -1;
> @@ -149,7 +150,7 @@ test_parse_etheraddr_invalid_param(void)
>  	snprintf(buf, sizeof(buf), "%s",
>  			ether_addr_valid_strs[0].str);
>  
> -	ret = cmdline_parse_etheraddr(NULL, buf, NULL);
> +	ret = cmdline_parse_etheraddr(NULL, buf, NULL, 0);
>  	if (ret == -1) {
>  		printf("Error: parser rejected null result!\n");
>  		return -1;
> @@ -185,7 +186,7 @@ test_parse_etheraddr_invalid_data(void)
>  		memset(&result, 0, sizeof(struct ether_addr));
>  
>  		ret = cmdline_parse_etheraddr(NULL, ether_addr_invalid_strs[i],
> -				(void*)&result);
> +			(void*)&result, sizeof(result));
>  		if (ret != -1) {
>  			printf("Error: parsing %s succeeded!\n",
>  					ether_addr_invalid_strs[i]);
> @@ -210,7 +211,7 @@ test_parse_etheraddr_valid(void)
>  		memset(&result, 0, sizeof(struct ether_addr));
>  
>  		ret = cmdline_parse_etheraddr(NULL, ether_addr_valid_strs[i].str,
> -				(void*)&result);
> +			(void*)&result, sizeof(result));
>  		if (ret < 0) {
>  			printf("Error: parsing %s failed!\n",
>  					ether_addr_valid_strs[i].str);
> @@ -229,7 +230,7 @@ test_parse_etheraddr_valid(void)
>  		memset(&result, 0, sizeof(struct ether_addr));
>  
>  		ret = cmdline_parse_etheraddr(NULL, ether_addr_garbage_strs[i],
> -				(void*)&result);
> +			(void*)&result, sizeof(result));
>  		if (ret < 0) {
>  			printf("Error: parsing %s failed!\n",
>  					ether_addr_garbage_strs[i]);
> diff --git a/app/test/test_cmdline_ipaddr.c b/app/test/test_cmdline_ipaddr.c
> index 4ce928d..471d2ff 100644
> --- a/app/test/test_cmdline_ipaddr.c
> +++ b/app/test/test_cmdline_ipaddr.c
> @@ -425,7 +425,8 @@ test_parse_ipaddr_valid(void)
>  							buf, sizeof(buf));
>  
>  			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
> -					ipaddr_valid_strs[i].str, (void*)&result);
> +				ipaddr_valid_strs[i].str, (void*)&result,
> +				sizeof(result));
>  
>  			/* if should have passed, or should have failed */
>  			if ((ret < 0) ==
> @@ -474,7 +475,8 @@ test_parse_ipaddr_valid(void)
>  							buf, sizeof(buf));
>  
>  			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
> -					ipaddr_garbage_addr4_strs[i], (void*)&result);
> +				ipaddr_garbage_addr4_strs[i], (void*)&result,
> +				sizeof(result));
>  
>  			/* if should have passed, or should have failed */
>  			if ((ret < 0) ==
> @@ -515,7 +517,8 @@ test_parse_ipaddr_valid(void)
>  							buf, sizeof(buf));
>  
>  			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
> -					ipaddr_garbage_addr6_strs[i], (void*)&result);
> +				ipaddr_garbage_addr6_strs[i], (void*)&result,
> +				sizeof(result));
>  
>  			/* if should have passed, or should have failed */
>  			if ((ret < 0) ==
> @@ -557,7 +560,8 @@ test_parse_ipaddr_valid(void)
>  							buf, sizeof(buf));
>  
>  			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
> -					ipaddr_garbage_network4_strs[i], (void*)&result);
> +				ipaddr_garbage_network4_strs[i], (void*)&result,
> +				sizeof(result));
>  
>  			/* if should have passed, or should have failed */
>  			if ((ret < 0) ==
> @@ -598,7 +602,8 @@ test_parse_ipaddr_valid(void)
>  							buf, sizeof(buf));
>  
>  			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
> -					ipaddr_garbage_network6_strs[i], (void*)&result);
> +				ipaddr_garbage_network6_strs[i], (void*)&result,
> +				sizeof(result));
>  
>  			/* if should have passed, or should have failed */
>  			if ((ret < 0) ==
> @@ -651,7 +656,8 @@ test_parse_ipaddr_invalid_data(void)
>  					buf, sizeof(buf));
>  
>  			ret = cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
> -					ipaddr_invalid_strs[i], (void*)&result);
> +				ipaddr_invalid_strs[i], (void*)&result,
> +				sizeof(result));
>  
>  			if (ret != -1) {
>  				printf("Error: parsing %s as %s succeeded!\n",
> @@ -677,25 +683,26 @@ test_parse_ipaddr_invalid_param(void)
>  	token.ipaddr_data.flags = CMDLINE_IPADDR_V4;
>  
>  	/* null token */
> -	if (cmdline_parse_ipaddr(NULL, buf, (void*)&result) != -1) {
> +	if (cmdline_parse_ipaddr(NULL, buf, (void*)&result,
> +			sizeof(result)) != -1) {
>  		printf("Error: parser accepted invalid parameters!\n");
>  		return -1;
>  	}
>  	/* null buffer */
>  	if (cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
> -			NULL, (void*)&result) != -1) {
> +			NULL, (void*)&result, sizeof(result)) != -1) {
>  		printf("Error: parser accepted invalid parameters!\n");
>  		return -1;
>  	}
>  	/* empty buffer */
>  	if (cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
> -			"", (void*)&result) != -1) {
> +			"", (void*)&result, sizeof(result)) != -1) {
>  		printf("Error: parser accepted invalid parameters!\n");
>  		return -1;
>  	}
>  	/* null result */
>  	if (cmdline_parse_ipaddr((cmdline_parse_token_hdr_t*)&token,
> -			buf, NULL) == -1) {
> +			buf, NULL, 0) == -1) {
>  		printf("Error: parser rejected null result!\n");
>  		return -1;
>  	}
> diff --git a/app/test/test_cmdline_num.c b/app/test/test_cmdline_num.c
> index 799d68c..04263d3 100644
> --- a/app/test/test_cmdline_num.c
> +++ b/app/test/test_cmdline_num.c
> @@ -350,14 +350,14 @@ test_parse_num_invalid_param(void)
>  			num_valid_positive_strs[0].str);
>  
>  	/* try all null */
> -	ret = cmdline_parse_num(NULL, NULL, NULL);
> +	ret = cmdline_parse_num(NULL, NULL, NULL, 0);
>  	if (ret != -1) {
>  		printf("Error: parser accepted null parameters!\n");
>  		return -1;
>  	}
>  
>  	/* try null token */
> -	ret = cmdline_parse_num(NULL, buf, (void*)&result);
> +	ret = cmdline_parse_num(NULL, buf, (void*)&result, sizeof(result));
>  	if (ret != -1) {
>  		printf("Error: parser accepted null token!\n");
>  		return -1;
> @@ -365,14 +365,15 @@ test_parse_num_invalid_param(void)
>  
>  	/* try null buf */
>  	ret = cmdline_parse_num((cmdline_parse_token_hdr_t*)&token, NULL,
> -			(void*)&result);
> +		(void*)&result, sizeof(result));
>  	if (ret != -1) {
>  		printf("Error: parser accepted null string!\n");
>  		return -1;
>  	}
>  
>  	/* try null result */
> -	ret = cmdline_parse_num((cmdline_parse_token_hdr_t*)&token, buf, NULL);
> +	ret = cmdline_parse_num((cmdline_parse_token_hdr_t*)&token, buf,
> +		NULL, 0);
>  	if (ret == -1) {
>  		printf("Error: parser rejected null result!\n");
>  		return -1;
> @@ -426,7 +427,7 @@ test_parse_num_invalid_data(void)
>  			memset(&buf, 0, sizeof(buf));
>  
>  			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*)&token,
> -					num_invalid_strs[i], (void*)&result);
> +				num_invalid_strs[i], (void*)&result, sizeof(result));
>  			if (ret != -1) {
>  				/* get some info about what we are trying to parse */
>  				cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
> @@ -466,8 +467,9 @@ test_parse_num_valid(void)
>  			cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
>  					buf, sizeof(buf));
>  
> -			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token, num_valid_positive_strs[i].str,
> -					(void*)&result);
> +			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token,
> +				num_valid_positive_strs[i].str,
> +				(void*)&result, sizeof(result));
>  
>  			/* if it should have passed but didn't, or if it should have failed but didn't */
>  			if ((ret < 0) == (can_parse_unsigned(num_valid_positive_strs[i].result, type) > 0)) {
> @@ -493,8 +495,9 @@ test_parse_num_valid(void)
>  			cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
>  					buf, sizeof(buf));
>  
> -			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token, num_valid_negative_strs[i].str,
> -					(void*)&result);
> +			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token,
> +				num_valid_negative_strs[i].str,
> +				(void*)&result, sizeof(result));
>  
>  			/* if it should have passed but didn't, or if it should have failed but didn't */
>  			if ((ret < 0) == (can_parse_signed(num_valid_negative_strs[i].result, type) > 0)) {
> @@ -542,8 +545,9 @@ test_parse_num_valid(void)
>  			cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
>  					buf, sizeof(buf));
>  
> -			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token, num_garbage_positive_strs[i].str,
> -					(void*)&result);
> +			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token,
> +				num_garbage_positive_strs[i].str,
> +				(void*)&result, sizeof(result));
>  
>  			/* if it should have passed but didn't, or if it should have failed but didn't */
>  			if ((ret < 0) == (can_parse_unsigned(num_garbage_positive_strs[i].result, type) > 0)) {
> @@ -569,8 +573,9 @@ test_parse_num_valid(void)
>  			cmdline_get_help_num((cmdline_parse_token_hdr_t*)&token,
>  					buf, sizeof(buf));
>  
> -			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token, num_garbage_negative_strs[i].str,
> -					(void*)&result);
> +			ret = cmdline_parse_num((cmdline_parse_token_hdr_t*) &token,
> +				num_garbage_negative_strs[i].str,
> +				(void*)&result, sizeof(result));
>  
>  			/* if it should have passed but didn't, or if it should have failed but didn't */
>  			if ((ret < 0) == (can_parse_signed(num_garbage_negative_strs[i].result, type) > 0)) {
> diff --git a/app/test/test_cmdline_portlist.c b/app/test/test_cmdline_portlist.c
> index 9f9633c..b9664b0 100644
> --- a/app/test/test_cmdline_portlist.c
> +++ b/app/test/test_cmdline_portlist.c
> @@ -139,21 +139,22 @@ test_parse_portlist_invalid_param(void)
>  	memset(&result, 0, sizeof(cmdline_portlist_t));
>  
>  	/* try all null */
> -	ret = cmdline_parse_portlist(NULL, NULL, NULL);
> +	ret = cmdline_parse_portlist(NULL, NULL, NULL, 0);
>  	if (ret != -1) {
>  		printf("Error: parser accepted null parameters!\n");
>  		return -1;
>  	}
>  
>  	/* try null buf */
> -	ret = cmdline_parse_portlist(NULL, NULL, (void*)&result);
> +	ret = cmdline_parse_portlist(NULL, NULL, (void*)&result,
> +		sizeof(result));
>  	if (ret != -1) {
>  		printf("Error: parser accepted null string!\n");
>  		return -1;
>  	}
>  
>  	/* try null result */
> -	ret = cmdline_parse_portlist(NULL, portlist_valid_strs[0].str, NULL);
> +	ret = cmdline_parse_portlist(NULL, portlist_valid_strs[0].str, NULL, 0);
>  	if (ret == -1) {
>  		printf("Error: parser rejected null result!\n");
>  		return -1;
> @@ -188,7 +189,7 @@ test_parse_portlist_invalid_data(void)
>  		memset(&result, 0, sizeof(cmdline_portlist_t));
>  
>  		ret = cmdline_parse_portlist(NULL, portlist_invalid_strs[i],
> -				(void*)&result);
> +			(void*)&result, sizeof(result));
>  		if (ret != -1) {
>  			printf("Error: parsing %s succeeded!\n",
>  					portlist_invalid_strs[i]);
> @@ -213,7 +214,7 @@ test_parse_portlist_valid(void)
>  		memset(&result, 0, sizeof(cmdline_portlist_t));
>  
>  		ret = cmdline_parse_portlist(NULL, portlist_valid_strs[i].str,
> -				(void*)&result);
> +			(void*)&result, sizeof(result));
>  		if (ret < 0) {
>  			printf("Error: parsing %s failed!\n",
>  					portlist_valid_strs[i].str);
> @@ -232,7 +233,7 @@ test_parse_portlist_valid(void)
>  		memset(&result, 0, sizeof(cmdline_portlist_t));
>  
>  		ret = cmdline_parse_portlist(NULL, portlist_garbage_strs[i],
> -				(void*)&result);
> +			(void*)&result, sizeof(result));
>  		if (ret < 0) {
>  			printf("Error: parsing %s failed!\n",
>  					portlist_garbage_strs[i]);
> diff --git a/app/test/test_cmdline_string.c b/app/test/test_cmdline_string.c
> index 3ec0ce1..915a7d7 100644
> --- a/app/test/test_cmdline_string.c
> +++ b/app/test/test_cmdline_string.c
> @@ -178,7 +178,7 @@ test_parse_string_invalid_param(void)
>  		printf("Error: function accepted null token!\n");
>  		return -1;
>  	}
> -	if (cmdline_parse_string(NULL, buf, NULL) != -1) {
> +	if (cmdline_parse_string(NULL, buf, NULL, 0) != -1) {
>  		printf("Error: function accepted null token!\n");
>  		return -1;
>  	}
> @@ -189,7 +189,8 @@ test_parse_string_invalid_param(void)
>  		return -1;
>  	}
>  	if (cmdline_parse_string(
> -			(cmdline_parse_token_hdr_t*)&token, NULL, (void*)&result) != -1) {
> +			(cmdline_parse_token_hdr_t*)&token, NULL,
> +			(void*)&result, sizeof(result)) != -1) {
>  		printf("Error: function accepted null buffer!\n");
>  		return -1;
>  	}
> @@ -200,7 +201,7 @@ test_parse_string_invalid_param(void)
>  	}
>  	/* test null result */
>  	if (cmdline_parse_string(
> -			(cmdline_parse_token_hdr_t*)&token, buf, NULL) == -1) {
> +			(cmdline_parse_token_hdr_t*)&token, buf, NULL, 0) == -1) {
>  		printf("Error: function rejected null result!\n");
>  		return -1;
>  	}
> @@ -233,7 +234,8 @@ test_parse_string_invalid_data(void)
>  		token.string_data.str = string_invalid_strs[i].fixed_str;
>  
>  		if (cmdline_parse_string((cmdline_parse_token_hdr_t*)&token,
> -				string_invalid_strs[i].str, (void*)buf) != -1) {
> +				string_invalid_strs[i].str, (void*)buf,
> +				sizeof(buf)) != -1) {
>  			memset(help_str, 0, sizeof(help_str));
>  			memset(&help_token, 0, sizeof(help_token));
>  
> @@ -330,7 +332,8 @@ test_parse_string_valid(void)
>  		token.string_data.str = string_parse_strs[i].fixed_str;
>  
>  		if (cmdline_parse_string((cmdline_parse_token_hdr_t*)&token,
> -				string_parse_strs[i].str, (void*)buf) < 0) {
> +				string_parse_strs[i].str, (void*)buf,
> +				sizeof(buf)) < 0) {
>  
>  			/* clean help data */
>  			memset(&help_token, 0, sizeof(help_token));
> diff --git a/examples/cmdline/parse_obj_list.c b/examples/cmdline/parse_obj_list.c
> index 2625ca3..cdbaf2f 100644
> --- a/examples/cmdline/parse_obj_list.c
> +++ b/examples/cmdline/parse_obj_list.c
> @@ -84,7 +84,8 @@ struct cmdline_token_ops token_obj_list_ops = {
>  };
>  
>  int
> -parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
> +parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *buf, void *res,
> +	unsigned ressize)
>  {
>  	struct token_obj_list *tk2 = (struct token_obj_list *)tk;
>  	struct token_obj_list_data *tkd = &tk2->obj_list_data;
> @@ -94,6 +95,9 @@ parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
>  	if (*buf == 0)
>  		return -1;
>  
> +	if (res && ressize < sizeof(struct object *))
> +		return -1;
> +
>  	while(!cmdline_isendoftoken(buf[token_len]))
>  		token_len++;
>  
> diff --git a/examples/cmdline/parse_obj_list.h b/examples/cmdline/parse_obj_list.h
> index 297fec4..871c53a 100644
> --- a/examples/cmdline/parse_obj_list.h
> +++ b/examples/cmdline/parse_obj_list.h
> @@ -91,7 +91,8 @@ typedef struct token_obj_list parse_token_obj_list_t;
>  
>  extern struct cmdline_token_ops token_obj_list_ops;
>  
> -int parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res);
> +int parse_obj_list(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res,
> +	unsigned ressize);
>  int complete_get_nb_obj_list(cmdline_parse_token_hdr_t *tk);
>  int complete_get_elt_obj_list(cmdline_parse_token_hdr_t *tk, int idx,
>  			      char *dstbuf, unsigned int size);
> diff --git a/examples/vhost_xen/xenstore_parse.c b/examples/vhost_xen/xenstore_parse.c
> index fdd69b2..9441639 100644
> --- a/examples/vhost_xen/xenstore_parse.c
> +++ b/examples/vhost_xen/xenstore_parse.c
> @@ -77,7 +77,7 @@ struct grant_node_item {
>  } __attribute__((packed));
>  
>  int cmdline_parse_etheraddr(void *tk, const char *srcbuf,
> -			    void *res);
> +	void *res, unsigned ressize);
>  
>  /* Map grant ref refid at addr_ori*/
>  static void *
> @@ -676,7 +676,8 @@ xen_parse_etheraddr(struct xen_vring *vring)
>  	if ((buf = xen_read_node(path, &len)) == NULL)
>  		goto out;
>  
> -	if (cmdline_parse_etheraddr(NULL, buf, &vring->addr) < 0)
> +	if (cmdline_parse_etheraddr(NULL, buf, &vring->addr,
> +			sizeof(vring->addr)) < 0)
>  		goto out;
>  	ret = 0;
>  out:
> diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
> index 940480d..dfc885c 100644
> --- a/lib/librte_cmdline/cmdline_parse.c
> +++ b/lib/librte_cmdline/cmdline_parse.c
> @@ -138,7 +138,7 @@ nb_common_chars(const char * s1, const char * s2)
>   */
>  static int
>  match_inst(cmdline_parse_inst_t *inst, const char *buf,
> -	   unsigned int nb_match_token, void * result_buf)
> +	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size)
>  {
>  	unsigned int token_num=0;
>  	cmdline_parse_token_hdr_t * token_p;
> @@ -162,12 +162,23 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
>  		if ( isendofline(*buf) || iscomment(*buf) )
>  			break;
>  
> -		if (result_buf)
> -			n = token_hdr.ops->parse(token_p, buf,
> -						 (char *)result_buf +
> -						 token_hdr.offset);
> -		else
> -			n = token_hdr.ops->parse(token_p, buf, NULL);
> +		if (resbuf == NULL) {
> +			n = token_hdr.ops->parse(token_p, buf, NULL, 0);
> +		} else {
> +			unsigned rb_sz;
> +
> +			if (token_hdr.offset > resbuf_size) {
> +				printf("Parse error(%s:%d): Token offset(%u) "
> +					"exceeds maximum size(%u)\n",
> +					__FILE__, __LINE__,
> +					token_hdr.offset, resbuf_size);
> +				return -ENOBUFS;
> +			}
> +			rb_sz = resbuf_size - token_hdr.offset;
> +
> +			n = token_hdr.ops->parse(token_p, buf, (char *)resbuf +
> +				token_hdr.offset, rb_sz);
> +		}
>  
>  		if (n < 0)
>  			break;
> @@ -219,7 +230,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
>  	unsigned int inst_num=0;
>  	cmdline_parse_inst_t *inst;
>  	const char *curbuf;
> -	char result_buf[BUFSIZ];
> +	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
>  	void (*f)(void *, struct cmdline *, void *) = NULL;
>  	void *data = NULL;
>  	int comment = 0;
> @@ -280,7 +291,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
>  		debug_printf("INST %d\n", inst_num);
>  
>  		/* fully parsed */
> -		tok = match_inst(inst, buf, 0, result_buf);
> +		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf));
>  
>  		if (tok > 0) /* we matched at least one token */
>  			err = CMDLINE_PARSE_BAD_ARGS;
> @@ -377,10 +388,10 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
>  		inst = ctx[inst_num];
>  		while (inst) {
>  			/* parse the first tokens of the inst */
> -			if (nb_token && match_inst(inst, buf, nb_token, NULL))
> +			if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
>  				goto next;
>  
> -			debug_printf("instruction match \n");
> +			debug_printf("instruction match\n");
>  			token_p = inst->tokens[nb_token];
>  			if (token_p)
>  				memcpy(&token_hdr, token_p, sizeof(token_hdr));
> @@ -471,7 +482,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
>  		/* we need to redo it */
>  		inst = ctx[inst_num];
>  
> -		if (nb_token && match_inst(inst, buf, nb_token, NULL))
> +		if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
>  			goto next2;
>  
>  		token_p = inst->tokens[nb_token];
> diff --git a/lib/librte_cmdline/cmdline_parse.h b/lib/librte_cmdline/cmdline_parse.h
> index f18836d..4b25c45 100644
> --- a/lib/librte_cmdline/cmdline_parse.h
> +++ b/lib/librte_cmdline/cmdline_parse.h
> @@ -80,6 +80,9 @@ extern "C" {
>  #define CMDLINE_PARSE_COMPLETE_AGAIN    1
>  #define CMDLINE_PARSE_COMPLETED_BUFFER  2
>  
> +/* maximum buffer size for parsed result */
> +#define CMDLINE_PARSE_RESULT_BUFSIZE 8192
> +
>  /**
>   * Stores a pointer to the ops struct, and the offset: the place to
>   * write the parsed result in the destination structure.
> @@ -110,12 +113,14 @@ typedef struct cmdline_token_hdr cmdline_parse_token_hdr_t;
>   * -1 on error and 0 on success.
>   */
>  struct cmdline_token_ops {
> -	/** parse(token ptr, buf, res pts) */
> -	int (*parse)(cmdline_parse_token_hdr_t *, const char *, void *);
> +	/** parse(token ptr, buf, res pts, buf len) */
> +	int (*parse)(cmdline_parse_token_hdr_t *, const char *, void *,
> +		unsigned int);
>  	/** return the num of possible choices for this token */
>  	int (*complete_get_nb)(cmdline_parse_token_hdr_t *);
>  	/** return the elt x for this token (token, idx, dstbuf, size) */
> -	int (*complete_get_elt)(cmdline_parse_token_hdr_t *, int, char *, unsigned int);
> +	int (*complete_get_elt)(cmdline_parse_token_hdr_t *, int, char *,
> +		unsigned int);
>  	/** get help for this token (token, dstbuf, size) */
>  	int (*get_help)(cmdline_parse_token_hdr_t *, char *, unsigned int);
>  };
> diff --git a/lib/librte_cmdline/cmdline_parse_etheraddr.c b/lib/librte_cmdline/cmdline_parse_etheraddr.c
> index 5285c40..64ae86c 100644
> --- a/lib/librte_cmdline/cmdline_parse_etheraddr.c
> +++ b/lib/librte_cmdline/cmdline_parse_etheraddr.c
> @@ -137,12 +137,15 @@ my_ether_aton(const char *a)
>  
>  int
>  cmdline_parse_etheraddr(__attribute__((unused)) cmdline_parse_token_hdr_t *tk,
> -			const char *buf, void *res)
> +	const char *buf, void *res, unsigned ressize)
>  {
>  	unsigned int token_len = 0;
>  	char ether_str[ETHER_ADDRSTRLENLONG+1];
>  	struct ether_addr *tmp;
>  
> +	if (res && ressize < sizeof(struct ether_addr))
> +		return -1;
> +
>  	if (!buf || ! *buf)
>  		return -1;
>  
> diff --git a/lib/librte_cmdline/cmdline_parse_etheraddr.h b/lib/librte_cmdline/cmdline_parse_etheraddr.h
> index 4427e40..0085bb3 100644
> --- a/lib/librte_cmdline/cmdline_parse_etheraddr.h
> +++ b/lib/librte_cmdline/cmdline_parse_etheraddr.h
> @@ -73,9 +73,9 @@ typedef struct cmdline_token_etheraddr cmdline_parse_token_etheraddr_t;
>  extern struct cmdline_token_ops cmdline_token_etheraddr_ops;
>  
>  int cmdline_parse_etheraddr(cmdline_parse_token_hdr_t *tk, const char *srcbuf,
> -			    void *res);
> +	void *res, unsigned ressize);
>  int cmdline_get_help_etheraddr(cmdline_parse_token_hdr_t *tk, char *dstbuf,
> -			       unsigned int size);
> +	unsigned int size);
>  
>  #define TOKEN_ETHERADDR_INITIALIZER(structure, field)       \
>  {                                                           \
> diff --git a/lib/librte_cmdline/cmdline_parse_ipaddr.c b/lib/librte_cmdline/cmdline_parse_ipaddr.c
> index ac83514..7f33599 100644
> --- a/lib/librte_cmdline/cmdline_parse_ipaddr.c
> +++ b/lib/librte_cmdline/cmdline_parse_ipaddr.c
> @@ -306,7 +306,8 @@ inet_pton6(const char *src, unsigned char *dst)
>  }
>  
>  int
> -cmdline_parse_ipaddr(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
> +cmdline_parse_ipaddr(cmdline_parse_token_hdr_t *tk, const char *buf, void *res,
> +	unsigned ressize)
>  {
>  	struct cmdline_token_ipaddr *tk2;
>  	unsigned int token_len = 0;
> @@ -315,6 +316,9 @@ cmdline_parse_ipaddr(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
>  	char *prefix, *prefix_end;
>  	long prefixlen = 0;
>  
> +	if (res && ressize < sizeof(cmdline_ipaddr_t))
> +		return -1;
> +
>  	if (!buf || !tk || ! *buf)
>  		return -1;
>  
> diff --git a/lib/librte_cmdline/cmdline_parse_ipaddr.h b/lib/librte_cmdline/cmdline_parse_ipaddr.h
> index 0e2f490..296c374 100644
> --- a/lib/librte_cmdline/cmdline_parse_ipaddr.h
> +++ b/lib/librte_cmdline/cmdline_parse_ipaddr.h
> @@ -92,9 +92,9 @@ typedef struct cmdline_token_ipaddr cmdline_parse_token_ipaddr_t;
>  extern struct cmdline_token_ops cmdline_token_ipaddr_ops;
>  
>  int cmdline_parse_ipaddr(cmdline_parse_token_hdr_t *tk, const char *srcbuf,
> -			 void *res);
> +	void *res, unsigned ressize);
>  int cmdline_get_help_ipaddr(cmdline_parse_token_hdr_t *tk, char *dstbuf,
> -			    unsigned int size);
> +	unsigned int size);
>  
>  #define TOKEN_IPADDR_INITIALIZER(structure, field)      \
>  {                                                       \
> diff --git a/lib/librte_cmdline/cmdline_parse_num.c b/lib/librte_cmdline/cmdline_parse_num.c
> index 0b9e4d0..1cf53d9 100644
> --- a/lib/librte_cmdline/cmdline_parse_num.c
> +++ b/lib/librte_cmdline/cmdline_parse_num.c
> @@ -119,10 +119,40 @@ add_to_res(unsigned int c, uint64_t *res, unsigned int base)
>  	return 0;
>  }
>  
> +static int
> +check_res_size(struct cmdline_token_num_data *nd, unsigned ressize)
> +{
> +	switch (nd->type) {
> +		case INT8:
> +		case UINT8:
> +			if (ressize < sizeof(int8_t))
> +				return -1;
> +			break;
> +		case INT16:
> +		case UINT16:
> +			if (ressize < sizeof(int16_t))
> +				return -1;
> +			break;
> +		case INT32:
> +		case UINT32:
> +			if (ressize < sizeof(int32_t))
> +				return -1;
> +			break;
> +		case INT64:
> +		case UINT64:
> +			if (ressize < sizeof(int64_t))
> +				return -1;
> +			break;
> +		default:
> +			return -1;
> +	}
> +	return 0;
> +}
>  
>  /* parse an int */
>  int
> -cmdline_parse_num(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res)
> +cmdline_parse_num(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res,
> +	unsigned ressize)
>  {
>  	struct cmdline_token_num_data nd;
>  	enum num_parse_state_t st = START;
> @@ -141,6 +171,12 @@ cmdline_parse_num(cmdline_parse_token_hdr_t *tk, const char *srcbuf, void *res)
>  
>  	memcpy(&nd, &((struct cmdline_token_num *)tk)->num_data, sizeof(nd));
>  
> +	/* check that we have enough room in res */
> +	if (res) {
> +		if (check_res_size(&nd, ressize) < 0)
> +			return -1;
> +	}
> +
>  	while ( st != ERROR && c && ! cmdline_isendoftoken(c) ) {
>  		debug_printf("%c %x -> ", c, c);
>  		switch (st) {
> diff --git a/lib/librte_cmdline/cmdline_parse_num.h b/lib/librte_cmdline/cmdline_parse_num.h
> index 77f2f9b..5376806 100644
> --- a/lib/librte_cmdline/cmdline_parse_num.h
> +++ b/lib/librte_cmdline/cmdline_parse_num.h
> @@ -89,9 +89,9 @@ typedef struct cmdline_token_num cmdline_parse_token_num_t;
>  extern struct cmdline_token_ops cmdline_token_num_ops;
>  
>  int cmdline_parse_num(cmdline_parse_token_hdr_t *tk,
> -		      const char *srcbuf, void *res);
> +	const char *srcbuf, void *res, unsigned ressize);
>  int cmdline_get_help_num(cmdline_parse_token_hdr_t *tk,
> -			 char *dstbuf, unsigned int size);
> +	char *dstbuf, unsigned int size);
>  
>  #define TOKEN_NUM_INITIALIZER(structure, field, numtype)    \
>  {                                                           \
> diff --git a/lib/librte_cmdline/cmdline_parse_portlist.c b/lib/librte_cmdline/cmdline_parse_portlist.c
> index 7eac05c..834f2e6 100644
> --- a/lib/librte_cmdline/cmdline_parse_portlist.c
> +++ b/lib/librte_cmdline/cmdline_parse_portlist.c
> @@ -127,7 +127,7 @@ parse_ports(cmdline_portlist_t * pl, const char * str)
>  
>  int
>  cmdline_parse_portlist(__attribute__((unused)) cmdline_parse_token_hdr_t *tk,
> -		const char *buf, void *res)
> +	const char *buf, void *res, unsigned ressize)
>  {
>  	unsigned int token_len = 0;
>  	char portlist_str[PORTLIST_TOKEN_SIZE+1];
> @@ -136,6 +136,9 @@ cmdline_parse_portlist(__attribute__((unused)) cmdline_parse_token_hdr_t *tk,
>  	if (!buf || ! *buf)
>  		return (-1);
>  
> +	if (res && ressize < PORTLIST_TOKEN_SIZE)
> +		return -1;
> +
>  	pl = res;
>  
>  	while (!cmdline_isendoftoken(buf[token_len]) &&
> diff --git a/lib/librte_cmdline/cmdline_parse_portlist.h b/lib/librte_cmdline/cmdline_parse_portlist.h
> index 6fdc406..8505059 100644
> --- a/lib/librte_cmdline/cmdline_parse_portlist.h
> +++ b/lib/librte_cmdline/cmdline_parse_portlist.h
> @@ -81,9 +81,9 @@ typedef struct cmdline_token_portlist cmdline_parse_token_portlist_t;
>  extern struct cmdline_token_ops cmdline_token_portlist_ops;
>  
>  int cmdline_parse_portlist(cmdline_parse_token_hdr_t *tk,
> -		      const char *srcbuf, void *res);
> +	const char *srcbuf, void *res, unsigned ressize);
>  int cmdline_get_help_portlist(cmdline_parse_token_hdr_t *tk,
> -			 char *dstbuf, unsigned int size);
> +	char *dstbuf, unsigned int size);
>  
>  #define TOKEN_PORTLIST_INITIALIZER(structure, field)        \
>  {                                                           \
> diff --git a/lib/librte_cmdline/cmdline_parse_string.c b/lib/librte_cmdline/cmdline_parse_string.c
> index b1bfe91..45883b3 100644
> --- a/lib/librte_cmdline/cmdline_parse_string.c
> +++ b/lib/librte_cmdline/cmdline_parse_string.c
> @@ -105,13 +105,17 @@ get_next_token(const char *s)
>  }
>  
>  int
> -cmdline_parse_string(cmdline_parse_token_hdr_t *tk, const char *buf, void *res)
> +cmdline_parse_string(cmdline_parse_token_hdr_t *tk, const char *buf, void *res,
> +	unsigned ressize)
>  {
>  	struct cmdline_token_string *tk2;
>  	struct cmdline_token_string_data *sd;
>  	unsigned int token_len;
>  	const char *str;
>  
> +	if (res && ressize < STR_TOKEN_SIZE)
> +		return -1;
> +
>  	if (!tk || !buf || ! *buf)
>  		return -1;
>  
> diff --git a/lib/librte_cmdline/cmdline_parse_string.h b/lib/librte_cmdline/cmdline_parse_string.h
> index 52c916c..c205622 100644
> --- a/lib/librte_cmdline/cmdline_parse_string.h
> +++ b/lib/librte_cmdline/cmdline_parse_string.h
> @@ -83,7 +83,7 @@ typedef struct cmdline_token_string cmdline_parse_token_string_t;
>  extern struct cmdline_token_ops cmdline_token_string_ops;
>  
>  int cmdline_parse_string(cmdline_parse_token_hdr_t *tk, const char *srcbuf,
> -			 void *res);
> +	void *res, unsigned ressize);
>  int cmdline_complete_get_nb_string(cmdline_parse_token_hdr_t *tk);
>  int cmdline_complete_get_elt_string(cmdline_parse_token_hdr_t *tk, int idx,
>  				    char *dstbuf, unsigned int size);
> diff --git a/lib/librte_pmd_bond/rte_eth_bond_args.c b/lib/librte_pmd_bond/rte_eth_bond_args.c
> index 4114833..ca4de38 100644
> --- a/lib/librte_pmd_bond/rte_eth_bond_args.c
> +++ b/lib/librte_pmd_bond/rte_eth_bond_args.c
> @@ -254,7 +254,8 @@ bond_ethdev_parse_bond_mac_addr_kvarg(const char *key __rte_unused,
>  		return -1;
>  
>  	/* Parse MAC */
> -	return cmdline_parse_etheraddr(NULL, value, extra_args);
> +	return cmdline_parse_etheraddr(NULL, value, extra_args,
> +		sizeof(struct ether_addr));
>  }
>  
>  int
> -- 
> 2.1.0
> 


* Re: [dpdk-dev] [PATCH v3] librte_cmdline: FreeBSD Fix oveflow when size of command result structure is greater than BUFSIZ
  2014-12-05 15:51           ` Bruce Richardson
@ 2014-12-05 15:58             ` Thomas Monjalon
  0 siblings, 0 replies; 97+ messages in thread
From: Thomas Monjalon @ 2014-12-05 15:58 UTC (permalink / raw)
  To: Alan Carew; +Cc: dev

> > When using test-pmd with flow director on FreeBSD, the application will
> > segfault/Bus error while parsing the command line. This is due to how
> > each command's result structure is represented during parsing, where each
> > token's value is stored at its offset within a character array
> > (char result_buf[BUFSIZ]) in cmdline_parse() (./lib/librte_cmdline/cmdline_parse.c).
> > 
> > The overflow occurs when BUFSIZ is less than the size of a command's result
> > structure. In this case "struct cmd_pkt_filter_result"
> > (app/test-pmd/cmdline.c) is 1088 bytes and BUFSIZ on FreeBSD is 1024 bytes, as
> > opposed to 8192 bytes on Linux.
> > 
> > The problem can be reproduced by running test-pmd on FreeBSD:
> > ./testpmd -c 0x3 -n 4 -- -i --portmask=0x3 --pkt-filter-mode=perfect
> > And adding a filter:
> > add_perfect_filter 0 udp src 192.168.0.0 1024 dst 192.168.0.0 1024 flexbytes
> > 0x800 vlan 0 queue 0 soft 0x17
> > 
> > This patch removes the OS dependency on BUFSIZ by defining and using a
> > library constant: #define CMDLINE_PARSE_RESULT_BUFSIZE 8192
> > 
> > It also adds boundary checking to ensure this buffer cannot overflow, with
> > an error message being produced.
> > 
> > Suggested-by: Olivier MATZ <olivier.matz@6wind.com>
> > http://git.droids-corp.org/?p=libcmdline.git;a=commitdiff;h=b1d5b169352e57df3fc14c51ffad4b83f3e5613f
> > 
> > Signed-off-by: Alan Carew <alan.carew@intel.com>
> > Signed-off-by: Olivier MATZ <olivier.matz@6wind.com>
> 
> Tested on FreeBSD 10 and this patch fixes the issue described.
> 
> Tested-by: Bruce Richardson <bruce.richardson@intel.com>

Applied

Thank you all
-- 
Thomas
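
For readers following the thread, the boundary check described in the quoted
commit message can be sketched roughly as below. This is a minimal
illustration, not the library code: RESULT_BUFSIZE, parse_u32() and
store_token() are hypothetical names standing in for
CMDLINE_PARSE_RESULT_BUFSIZE, a token parse op and the logic added to
match_inst(). The offsets in the usage note are made up for the example.

```c
#include <assert.h>   /* only needed for the usage checks described below */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CMDLINE_PARSE_RESULT_BUFSIZE from the patch. */
#define RESULT_BUFSIZE 8192u

/*
 * Hypothetical token parser: refuses to write when the room left in the
 * result buffer is smaller than the value it would store.
 */
static int
parse_u32(uint32_t value, void *res, unsigned ressize)
{
	if (res != NULL && ressize < sizeof(uint32_t))
		return -1;
	if (res != NULL)
		memcpy(res, &value, sizeof(value));
	return 0;
}

/*
 * Sketch of the caller-side check: reject an offset beyond the buffer,
 * otherwise pass the remaining room down to the token parser.
 */
static int
store_token(void *resbuf, unsigned resbuf_size, unsigned offset,
	uint32_t value)
{
	if (resbuf == NULL)
		return parse_u32(value, NULL, 0);
	if (offset > resbuf_size)
		return -ENOBUFS;
	return parse_u32(value, (char *)resbuf + offset,
		resbuf_size - offset);
}
```

With a 1024-byte buffer (the FreeBSD BUFSIZ mentioned above), a field at
offset 1084 is rejected with -ENOBUFS instead of being written past the end
of the buffer. With an 8192-byte buffer the same call succeeds.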


end of thread, other threads:[~2014-12-12 16:14 UTC | newest]

Thread overview: 97+ messages
2014-09-22 18:34 [dpdk-dev] [PATCH 00/10] VM Power Management Alan Carew
2014-09-22 18:34 ` [dpdk-dev] [PATCH 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
2014-09-22 18:34 ` [dpdk-dev] [PATCH 02/10] VM Power Management CLI(Host) Alan Carew
2014-09-22 18:34 ` [dpdk-dev] [PATCH 03/10] CPU Frequency Power Management(Host) Alan Carew
2014-09-22 18:34 ` [dpdk-dev] [PATCH 04/10] " Alan Carew
2014-09-22 18:34 ` [dpdk-dev] [PATCH 05/10] VM communication channels for VM Power Management(Guest) Alan Carew
2014-09-22 18:34 ` [dpdk-dev] [PATCH 06/10] Alternate implementation of librte_power " Alan Carew
2014-09-22 19:17   ` Neil Horman
2014-09-23  7:48     ` Carew, Alan
2014-09-22 18:34 ` [dpdk-dev] [PATCH 07/10] Packet format for VM Power Management(Host and Guest) Alan Carew
2014-09-22 18:34 ` [dpdk-dev] [PATCH 08/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
2014-09-22 18:34 ` [dpdk-dev] [PATCH 09/10] VM Power Management Unit Tests(Guest) Alan Carew
2014-09-22 18:34 ` [dpdk-dev] [PATCH 10/10] VM Power Management CLI(Guest) Alan Carew
2014-09-24 17:26 ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Alan Carew
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 02/10] VM Power Management CLI(Host) Alan Carew
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 03/10] CPU Frequency Power Management(Host) Alan Carew
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 04/10] VM Power Management application and Makefile Alan Carew
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 05/10] VM Power Management CLI(Guest) Alan Carew
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 06/10] VM communication channels for VM Power Management(Guest) Alan Carew
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 07/10] librte_power common interface for Guest and Host Alan Carew
2014-09-25 10:10     ` Neil Horman
2014-09-25 17:06       ` Carew, Alan
2014-09-25 17:49         ` Neil Horman
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 08/10] Packet format for VM Power Management(Host and Guest) Alan Carew
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 09/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
2014-09-24 17:26   ` [dpdk-dev] [PATCH v2 10/10] VM Power Management Unit Tests Alan Carew
2014-09-25  2:56   ` [dpdk-dev] [PATCH v2 00/10] VM Power Management Liu, Yong
2014-09-29 15:18   ` [dpdk-dev] [PATCH v3 " Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 02/10] VM Power Management CLI(Host) Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 03/10] CPU Frequency Power Management(Host) Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 04/10] VM Power Management application and Makefile Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 05/10] VM Power Management CLI(Guest) Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 06/10] VM communication channels for VM Power Management(Guest) Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 07/10] librte_power common interface for Guest and Host Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 08/10] Packet format for VM Power Management(Host and Guest) Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 09/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
2014-09-29 15:18     ` [dpdk-dev] [PATCH v3 10/10] VM Power Management Unit Tests Alan Carew
2014-09-29 17:29     ` [dpdk-dev] [PATCH v3 00/10] VM Power Management Neil Horman
2014-10-12 19:36     ` [dpdk-dev] [PATCH v4 " Alan Carew
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 01/10] Channel Manager and Monitor for VM Power Management(Host) Alan Carew
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 02/10] VM Power Management CLI(Host) Alan Carew
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 03/10] CPU Frequency Power Management(Host) Alan Carew
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 04/10] VM Power Management application and Makefile Alan Carew
2014-10-16 18:28         ` De Lara Guarch, Pablo
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 05/10] VM Power Management CLI(Guest) Alan Carew
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 06/10] VM communication channels for VM Power Management(Guest) Alan Carew
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 07/10] librte_power common interface for Guest and Host Alan Carew
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 08/10] Packet format for VM Power Management(Host and Guest) Alan Carew
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 09/10] Build system integration for VM Power Management(Guest and Host) Alan Carew
2014-10-12 19:36       ` [dpdk-dev] [PATCH v4 10/10] VM Power Management Unit Tests Alan Carew
2014-10-13  6:17       ` [dpdk-dev] [PATCH v4 00/10] VM Power Management Liu, Yong
2014-10-13 20:26       ` Thomas Monjalon
2014-10-14 12:37         ` Carew, Alan
2014-10-14 15:03           ` Thomas Monjalon
2014-10-16 15:21             ` Carew, Alan
2014-10-28 15:21               ` Thomas Monjalon
2014-11-10  9:05                 ` Carew, Alan
2014-11-10 17:54                   ` O'driscoll, Tim
2014-11-21 23:51                     ` Zhu, Heqing
2014-11-22 17:17                     ` Vincent JARDIN
2014-12-09 17:35                       ` Paolo Bonzini
2014-12-11 23:18                         ` Thomas Monjalon
2014-12-12 13:00                           ` Carew, Alan
2014-12-12 14:50                             ` Paolo Bonzini
2014-12-12 16:10                               ` Thomas Monjalon
2014-12-12 16:13                                 ` Paolo Bonzini
2014-11-21 17:42       ` [dpdk-dev] [PATCH v5 00/10] Virtual Machine " Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 01/10] Channel Manager and Monitor for VM Power Management(Host) Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 02/10] VM Power Management CLI(Host) Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 03/10] CPU Frequency Power Management(Host) Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 04/10] VM Power Management application and Makefile Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 05/10] VM Power Management CLI(Guest) Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 06/10] VM communication channels for VM Power Management(Guest) Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 07/10] librte_power common interface for Guest and Host Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 08/10] Packet format for VM Power Management(Host and Guest) Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 09/10] Build system integration for VM Power Management(Guest and Host) Pablo de Lara
2014-11-21 17:42         ` [dpdk-dev] [PATCH v5 10/10] VM Power Management Unit Tests Pablo de Lara
2014-11-25 16:18         ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Pablo de Lara
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 01/10] Channel Manager and Monitor for VM Power Management(Host) Pablo de Lara
2014-11-29 15:21             ` Neil Horman
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 02/10] VM Power Management CLI(Host) Pablo de Lara
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 03/10] CPU Frequency Power Management(Host) Pablo de Lara
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 04/10] VM Power Management application and Makefile Pablo de Lara
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 05/10] VM Power Management CLI(Guest) Pablo de Lara
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 06/10] VM communication channels for VM Power Management(Guest) Pablo de Lara
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 07/10] librte_power common interface for Guest and Host Pablo de Lara
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 08/10] Packet format for VM Power Management(Host and Guest) Pablo de Lara
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 09/10] Build system integration for VM Power Management(Guest and Host) Pablo de Lara
2014-11-25 16:18           ` [dpdk-dev] [PATCH v6 10/10] VM Power Management Unit Tests Pablo de Lara
2014-11-26 16:41           ` [dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management Thomas Monjalon
2014-11-10  9:19     ` [dpdk-dev] [PATCH v2] librte_cmdline: FreeBSD Fix oveflow when size of command result structure is greater than BUFSIZ Alan Carew
2014-12-05 14:16       ` Olivier MATZ
2014-12-05 14:19         ` [dpdk-dev] [PATCH v3] " Olivier Matz
2014-12-05 15:51           ` Bruce Richardson
2014-12-05 15:58             ` Thomas Monjalon
