DPDK patches and discussions
* [dpdk-dev] [PATCH v1 0/4] add per-core Turbo Boost capability
@ 2017-08-22 16:11 David Hunt
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 1/4] lib/librte_power: add per-core turbo capability David Hunt
                   ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: David Hunt @ 2017-08-22 16:11 UTC
  To: dev; +Cc: david.hunt

Recent generations of the Intel® Xeon® family processors allow Turbo Boost
to be enabled/disabled on a per-core basis.

This patch set introduces additional API calls to the librte_power library
to allow users to enable/disable Turbo Boost on particular cores.

Additionally, the use of the library is demonstrated by additions to the
vm_power_manager example application, where the new commands have been
added to allow the turbo status of cores to be changed dynamically.

Extra message types have been added to the virtio-serial channels between the
guest_vm_power_manager app and the vm_power_manager app to demonstrate
turbo change requests from a virtual machine. In this case, the guest sends
a request to the physical host, which in turn changes the turbo state of the
requested core.


Usage Example:
--------------

A VM has been created using 8 CPU cores, and 8 virtio-serial channels have
been created as per-core communications channels between the host and the VM.

See: http://www.dpdk.org/doc/guides/sample_app_ug/vm_power_management.html
for more information on setting up the vm_power applications.

In the vm_power_manager app on the host, we can query these channels:
vmpower> show_vm ubuntu2

VM: 'ubuntu2', status = ACTIVE
Channels 8
  [0]: /tmp/powermonitor/ubuntu2.0, status = CONNECTED
  [1]: /tmp/powermonitor/ubuntu2.1, status = CONNECTED
  [2]: /tmp/powermonitor/ubuntu2.2, status = CONNECTED
  [3]: /tmp/powermonitor/ubuntu2.3, status = CONNECTED
  [4]: /tmp/powermonitor/ubuntu2.4, status = CONNECTED
  [5]: /tmp/powermonitor/ubuntu2.5, status = CONNECTED
  [6]: /tmp/powermonitor/ubuntu2.6, status = CONNECTED
  [7]: /tmp/powermonitor/ubuntu2.7, status = CONNECTED
Virtual CPU(s): 8
  [0]: Physical CPU Mask 0x100000
  [1]: Physical CPU Mask 0x200000
  [2]: Physical CPU Mask 0x400000
  [3]: Physical CPU Mask 0x800000
  [4]: Physical CPU Mask 0x1000000
  [5]: Physical CPU Mask 0x2000000
  [6]: Physical CPU Mask 0x4000000
  [7]: Physical CPU Mask 0x8000000
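
Each mask above has a single bit set, and the bit position is the physical
core number. A minimal sketch of that conversion follows; mask_to_core() is a
hypothetical helper written for illustration and is not part of
vm_power_manager:

```c
#include <stdint.h>

/*
 * Return the index of the single bit set in a physical CPU mask,
 * or -1 if the mask is empty or has more than one bit set.
 * Hypothetical helper for illustration only.
 */
static int
mask_to_core(uint64_t mask)
{
	int core = 0;

	/* Reject empty masks and masks with more than one bit set. */
	if (mask == 0 || (mask & (mask - 1)) != 0)
		return -1;

	/* Count how far the set bit is from bit 0. */
	while ((mask & 1) == 0) {
		mask >>= 1;
		core++;
	}
	return core;
}
```

With the masks listed above, vCPU 0 (0x100000) maps to physical core 20 and
vCPU 4 (0x1000000) maps to physical core 24.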

Once the VM is up and running, if we exercise all the cores on the guest, we
can use turbostat on the host to see the frequencies of the guest cores. In
this example, it's cores 20-27:

      19       0    0.01    2500    2500
      20    2498  100.00    2500    2498
      21    2498  100.00    2500    2498
      22    2498  100.00    2500    2498
      23    2498  100.00    2500    2498
      24   *2498  100.00    2500    2498
      25    2498  100.00    2500    2498
      26    2498  100.00    2500    2498
      27    2498  100.00    2500    2498
      28       0    0.01    2032    2498

We can then issue a command in the vmpower app on the guest:

vmpower(guest)> set_cpu_freq 4 enable_turbo

This command passes a message through virtio-serial to the host, which
enables turbo on core 24, the physical core underlying the guest's
lcore_id 4. We can then see the change by running turbostat on the host:

      19       0    0.01    2500    2496
      20    2498  100.00    2500    2498
      21    2498  100.00    2500    2498
      22    2498  100.00    2500    2498
      23    2498  100.00    2500    2498
      24   *3297  100.00    3300    2498
      25    2498  100.00    2500    2498
      26    2498  100.00    2500    2498
      27    2498  100.00    2500    2498
      28       0    0.01    1016    2498

Core 24 is now running at 3300MHz, whereas the remainder are still running
at 2500MHz.
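
The register values behind these two frequencies can be sketched from the
bit manipulation in patch 1/4 of this series. The helpers below are
illustrative only: the function names are hypothetical, the 100 MHz bus clock
is an assumption, and the raw MSR inputs in the note after the block were
chosen to match the turbostat output above rather than read from hardware.

```c
#include <stdint.h>

/* Bit 32 of IA32_PERF_CTL, as defined in patch 1/4. */
#define CORE_TURBO_DISABLE_BIT ((uint64_t)1 << 32)

/*
 * Build the IA32_PERF_CTL value used when enabling turbo: the low byte
 * of the turbo ratio limit MSR is moved into bits 8-15 and the turbo
 * disable bit is cleared.
 */
static uint64_t
perf_ctl_turbo_on(uint64_t turbo_ratio_limit)
{
	uint64_t val = (turbo_ratio_limit & 0xff) << 8;

	return val & ~CORE_TURBO_DISABLE_BIT;
}

/*
 * Build the IA32_PERF_CTL value used when disabling turbo: bits 8-15 of
 * the platform info MSR (max non-turbo ratio) are kept and the turbo
 * disable bit is set.
 */
static uint64_t
perf_ctl_turbo_off(uint64_t platform_info)
{
	uint64_t val = platform_info & 0xff00;

	return val | CORE_TURBO_DISABLE_BIT;
}

/* Requested frequency in MHz, assuming a 100 MHz bus clock. */
static unsigned int
perf_ctl_target_mhz(uint64_t perf_ctl)
{
	return (unsigned int)((perf_ctl >> 8) & 0xff) * 100;
}
```

Under these assumptions, a turbo ratio limit whose low byte is 0x21 (33)
gives a 3300 MHz request, and a platform info value with 0x19 (25) in
bits 8-15 gives a 2500 MHz request with turbo disabled.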

We can issue a similar command in the vm_power_manager running on the host
to disable turbo on that core, but this time we use the physical core id:

vmpower> set_cpu_freq 24 disable_turbo

and we see that turbo is now disabled on that core.

      19       0    0.00    2500    2495
      20    2499  100.00    2500    2499
      21    2499  100.00    2500    2499
      22    2499  100.00    2500    2499
      23    2499  100.00    2500    2499
      24   *2499  100.00    2500    2499
      25    2499  100.00    2500    2499
      26    2499  100.00    2500    2499
      27    2499  100.00    2500    2499
      28       0    0.01    1000    2499

[1/4] lib/librte_power: add per-core turbo capability
[2/4] examples/vm_power_manager: add per-core turbo
[3/4] examples/vm_power_cli_guest: add per-core turbo
[4/4] lib: limit turbo to particular models of CPU


* [dpdk-dev] [PATCH v1 1/4] lib/librte_power: add per-core turbo capability
  2017-08-22 16:11 [dpdk-dev] [PATCH v1 0/4] add per-core Turbo Boost capability David Hunt
@ 2017-08-22 16:11 ` David Hunt
  2017-09-13 10:44   ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability David Hunt
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 2/4] examples/vm_power_manager: add per-core turbo David Hunt
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: David Hunt @ 2017-08-22 16:11 UTC
  To: dev; +Cc: david.hunt

Add a new set of APIs to enable and disable Turbo Boost
on a per-core basis.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_power/channel_commands.h       |   2 +
 lib/librte_power/rte_power.c              |   9 ++
 lib/librte_power/rte_power.h              |  41 +++++++++
 lib/librte_power/rte_power_acpi_cpufreq.c | 143 ++++++++++++++++++++++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.h |  40 +++++++++
 lib/librte_power/rte_power_kvm_vm.c       |  19 ++++
 lib/librte_power/rte_power_kvm_vm.h       |  35 +++++++-
 7 files changed, 288 insertions(+), 1 deletion(-)

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
index 383897b..484085b 100644
--- a/lib/librte_power/channel_commands.h
+++ b/lib/librte_power/channel_commands.h
@@ -52,6 +52,8 @@ extern "C" {
 #define CPU_POWER_SCALE_DOWN    2
 #define CPU_POWER_SCALE_MAX     3
 #define CPU_POWER_SCALE_MIN     4
+#define CPU_POWER_ENABLE_TURBO  5
+#define CPU_POWER_DISABLE_TURBO 6
 
 struct channel_packet {
 	uint64_t resource_id; /**< core_num, device */
diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 998ed1c..b327a86 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -50,6 +50,9 @@ rte_power_freq_change_t rte_power_freq_up = NULL;
 rte_power_freq_change_t rte_power_freq_down = NULL;
 rte_power_freq_change_t rte_power_freq_max = NULL;
 rte_power_freq_change_t rte_power_freq_min = NULL;
+rte_power_freq_change_t rte_power_turbo_status;
+rte_power_freq_change_t rte_power_freq_enable_turbo;
+rte_power_freq_change_t rte_power_freq_disable_turbo;
 
 int
 rte_power_set_env(enum power_management_env env)
@@ -65,6 +68,9 @@ rte_power_set_env(enum power_management_env env)
 		rte_power_freq_down = rte_power_acpi_cpufreq_freq_down;
 		rte_power_freq_min = rte_power_acpi_cpufreq_freq_min;
 		rte_power_freq_max = rte_power_acpi_cpufreq_freq_max;
+		rte_power_turbo_status = rte_power_acpi_turbo_status;
+		rte_power_freq_enable_turbo = rte_power_acpi_enable_turbo;
+		rte_power_freq_disable_turbo = rte_power_acpi_disable_turbo;
 	} else if (env == PM_ENV_KVM_VM) {
 		rte_power_freqs = rte_power_kvm_vm_freqs;
 		rte_power_get_freq = rte_power_kvm_vm_get_freq;
@@ -73,6 +79,9 @@ rte_power_set_env(enum power_management_env env)
 		rte_power_freq_down = rte_power_kvm_vm_freq_down;
 		rte_power_freq_min = rte_power_kvm_vm_freq_min;
 		rte_power_freq_max = rte_power_kvm_vm_freq_max;
+		rte_power_turbo_status = rte_power_kvm_vm_turbo_status;
+		rte_power_freq_enable_turbo = rte_power_kvm_vm_enable_turbo;
+		rte_power_freq_disable_turbo = rte_power_kvm_vm_disable_turbo;
 	} else {
 		RTE_LOG(ERR, POWER, "Invalid Power Management Environment(%d) set\n",
 				env);
diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
index 67e0ec0..b17b7a5 100644
--- a/lib/librte_power/rte_power.h
+++ b/lib/librte_power/rte_power.h
@@ -236,6 +236,47 @@ extern rte_power_freq_change_t rte_power_freq_max;
  */
 extern rte_power_freq_change_t rte_power_freq_min;
 
+/**
+ * Query the Turbo Boost status of a specific lcore.
+ * Review each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 Turbo Boost is enabled for this lcore.
+ *  - 0 Turbo Boost is disabled for this lcore.
+ *  - Negative on error.
+ */
+extern rte_power_freq_change_t rte_power_turbo_status;
+
+/**
+ * Enable Turbo Boost for this lcore.
+ * Review each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+
+/**
+ * Disable Turbo Boost for this lcore.
+ * Review each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.c b/lib/librte_power/rte_power_acpi_cpufreq.c
index a56c9b5..6695f59 100644
--- a/lib/librte_power/rte_power_acpi_cpufreq.c
+++ b/lib/librte_power/rte_power_acpi_cpufreq.c
@@ -87,6 +87,14 @@
 #define POWER_SYSFILE_SETSPEED   \
 		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
 
+/*
+ * MSR related
+ */
+#define PLATFORM_INFO     0x0CE
+#define TURBO_RATIO_LIMIT 0x1AD
+#define IA32_PERF_CTL     0x199
+#define CORE_TURBO_DISABLE_BIT ((uint64_t)1<<32)
+
 enum power_state {
 	POWER_IDLE = 0,
 	POWER_ONGOING,
@@ -543,3 +551,138 @@ rte_power_acpi_cpufreq_freq_min(unsigned lcore_id)
 	/* Frequencies in the array are from high to low. */
 	return set_freq_internal(pi, pi->nb_freqs - 1);
 }
+
+
+static int
+rdmsr(int lcore, int msr, uint64_t *val)
+{
+	char filename[32];
+	int fd;
+	int retval;
+
+	sprintf(filename, "/dev/cpu/%d/msr", lcore);
+	fd = open(filename, O_RDONLY);
+	if (fd < 0)
+		return fd;
+
+	retval = pread(fd, val, sizeof(uint64_t), msr);
+	if (retval < 0) {
+		close(fd);
+		return retval;
+	}
+	close(fd);
+	return 0;
+}
+
+static int
+wrmsr(int lcore, int msr, uint64_t val)
+{
+	char filename[32];
+	int fd;
+	int retval;
+
+	sprintf(filename, "/dev/cpu/%d/msr", lcore);
+	fd = open(filename, O_WRONLY);
+	if (fd < 0)
+		return fd;
+
+	retval = pwrite(fd, (void *)&val, sizeof(uint64_t), msr);
+	if (retval < 0) {
+		close(fd);
+		return retval;
+	}
+	close(fd);
+	return 0;
+}
+
+int
+rte_power_acpi_turbo_status(unsigned int lcore_id)
+{
+	uint64_t val;
+	int retval;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+#if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
+	retval = rdmsr(lcore_id, IA32_PERF_CTL, &val);
+	if (retval)
+		return retval;
+	else
+		return(!(val & CORE_TURBO_DISABLE_BIT));
+#else
+	return 0;
+#endif
+}
+
+
+int
+rte_power_acpi_enable_turbo(unsigned int lcore_id)
+{
+	uint64_t val;
+	int retval;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+#if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
+	/*
+	 * The low byte of 1ADh MSR contains max recommended ratio when a small
+	 * number of cores are active. Use this ratio when turbo is enabled.
+	 */
+	retval = rdmsr(lcore_id, TURBO_RATIO_LIMIT, &val);
+	if (retval)
+		return retval;
+
+	val = (val & 0x00ff) << 8;       /* Move to second lowest byte     */
+	val &= ~CORE_TURBO_DISABLE_BIT;  /* Switch bit off to enable turbo */
+
+	retval = wrmsr(lcore_id, IA32_PERF_CTL, val);
+	if (retval)
+		return retval;
+	else
+		return 0;
+#else
+	return 0;
+#endif
+}
+
+int
+rte_power_acpi_disable_turbo(unsigned int lcore_id)
+{
+	uint64_t val;
+	int retval;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+#if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
+	/*
+	 * 0CEh MSR contains max non-turbo ratio in bits 8-15. Use this
+	 * for the freq when turbo is disabled for that core.
+	 */
+	retval = rdmsr(lcore_id, PLATFORM_INFO, &val);
+	if (retval)
+		return retval;
+
+	val = val & 0xff00;             /* Only need second lowest byte   */
+	val |= CORE_TURBO_DISABLE_BIT;  /* Switch bit on to disable turbo */
+
+	retval = wrmsr(lcore_id, IA32_PERF_CTL, val);
+	if (retval)
+		return retval;
+
+	/* Try to set freq to max by default coming out of turbo */
+	if (rte_power_acpi_cpufreq_freq_max(lcore_id) < 0) {
+		RTE_LOG(ERR, POWER, "Failed to set frequency of lcore %u to max\n",
+				lcore_id);
+	}
+#endif
+	return 0;
+}
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.h b/lib/librte_power/rte_power_acpi_cpufreq.h
index 68578e9..eee0ca0 100644
--- a/lib/librte_power/rte_power_acpi_cpufreq.h
+++ b/lib/librte_power/rte_power_acpi_cpufreq.h
@@ -185,6 +185,46 @@ int rte_power_acpi_cpufreq_freq_max(unsigned lcore_id);
  */
 int rte_power_acpi_cpufreq_freq_min(unsigned lcore_id);
 
+/**
+ * Get the turbo status of a specific lcore.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 Turbo Boost is enabled on this lcore.
+ *  - 0 Turbo Boost is disabled on this lcore.
+ *  - Negative on error.
+ */
+int rte_power_acpi_turbo_status(unsigned int lcore_id);
+
+/**
+ * Enable Turbo Boost on a specific lcore.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 Turbo Boost is enabled successfully on this lcore.
+ *  - Negative on error.
+ */
+int rte_power_acpi_enable_turbo(unsigned int lcore_id);
+
+/**
+ * Disable Turbo Boost on a specific lcore.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 Turbo Boost disabled successfully on this lcore.
+ *  - Negative on error.
+ */
+int rte_power_acpi_disable_turbo(unsigned int lcore_id);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_power/rte_power_kvm_vm.c b/lib/librte_power/rte_power_kvm_vm.c
index a1badf3..9906062 100644
--- a/lib/librte_power/rte_power_kvm_vm.c
+++ b/lib/librte_power/rte_power_kvm_vm.c
@@ -134,3 +134,22 @@ rte_power_kvm_vm_freq_min(unsigned lcore_id)
 {
 	return send_msg(lcore_id, CPU_POWER_SCALE_MIN);
 }
+
+int
+rte_power_kvm_vm_turbo_status(__attribute__((unused)) unsigned int lcore_id)
+{
+	RTE_LOG(ERR, POWER, "rte_power_turbo_status is not implemented for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_kvm_vm_enable_turbo(unsigned int lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_ENABLE_TURBO);
+}
+
+int
+rte_power_kvm_vm_disable_turbo(unsigned int lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_DISABLE_TURBO);
+}
diff --git a/lib/librte_power/rte_power_kvm_vm.h b/lib/librte_power/rte_power_kvm_vm.h
index dcbc878..9af41d6 100644
--- a/lib/librte_power/rte_power_kvm_vm.h
+++ b/lib/librte_power/rte_power_kvm_vm.h
@@ -172,8 +172,41 @@ int rte_power_kvm_vm_freq_max(unsigned lcore_id);
  */
 int rte_power_kvm_vm_freq_min(unsigned lcore_id);
 
+/**
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+int rte_power_kvm_vm_turbo_status(unsigned int lcore_id);
+
+/**
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_enable_turbo(unsigned int lcore_id);
+
+/**
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_disable_turbo(unsigned int lcore_id);
 #ifdef __cplusplus
 }
 #endif
-
 #endif
-- 
2.7.4


* [dpdk-dev] [PATCH v1 2/4] examples/vm_power_manager: add per-core turbo
  2017-08-22 16:11 [dpdk-dev] [PATCH v1 0/4] add per-core Turbo Boost capability David Hunt
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 1/4] lib/librte_power: add per-core turbo capability David Hunt
@ 2017-08-22 16:11 ` David Hunt
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 3/4] examples/vm_power_cli_guest: " David Hunt
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 4/4] lib: limit turbo to particular models of CPU David Hunt
  3 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-08-22 16:11 UTC
  To: dev; +Cc: david.hunt

Add extra commands to command line to allow enable/disable of
per-core turbo.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/channel_monitor.c | 12 +++++++
 examples/vm_power_manager/power_manager.c   | 36 ++++++++++++++++++++
 examples/vm_power_manager/power_manager.h   | 52 +++++++++++++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.c    | 21 ++++++++----
 4 files changed, 114 insertions(+), 7 deletions(-)

diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index e7f5cc4..ac40dac 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -105,6 +105,12 @@ process_request(struct channel_packet *pkt, struct channel_info *chan_info)
 			case(CPU_POWER_SCALE_UP):
 					power_manager_scale_core_up(core_num);
 			break;
+			case(CPU_POWER_ENABLE_TURBO):
+				power_manager_enable_turbo_core(core_num);
+			break;
+			case(CPU_POWER_DISABLE_TURBO):
+				power_manager_disable_turbo_core(core_num);
+			break;
 			default:
 				break;
 			}
@@ -122,6 +128,12 @@ process_request(struct channel_packet *pkt, struct channel_info *chan_info)
 			case(CPU_POWER_SCALE_UP):
 					power_manager_scale_mask_up(core_mask);
 			break;
+			case(CPU_POWER_ENABLE_TURBO):
+				power_manager_enable_turbo_mask(core_mask);
+			break;
+			case(CPU_POWER_DISABLE_TURBO):
+				power_manager_disable_turbo_mask(core_mask);
+			break;
 			default:
 				break;
 			}
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 2644fce..80705f9 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -216,6 +216,24 @@ power_manager_scale_mask_max(uint64_t core_mask)
 }
 
 int
+power_manager_enable_turbo_mask(uint64_t core_mask)
+{
+	int ret = 0;
+
+	POWER_SCALE_MASK(enable_turbo, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_disable_turbo_mask(uint64_t core_mask)
+{
+	int ret = 0;
+
+	POWER_SCALE_MASK(disable_turbo, core_mask, ret);
+	return ret;
+}
+
+int
 power_manager_scale_core_up(unsigned core_num)
 {
 	int ret = 0;
@@ -250,3 +268,21 @@ power_manager_scale_core_max(unsigned core_num)
 	POWER_SCALE_CORE(max, core_num, ret);
 	return ret;
 }
+
+int
+power_manager_enable_turbo_core(unsigned int core_num)
+{
+	int ret = 0;
+
+	POWER_SCALE_CORE(enable_turbo, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_disable_turbo_core(unsigned int core_num)
+{
+	int ret = 0;
+
+	POWER_SCALE_CORE(disable_turbo, core_num, ret);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index 1b45bab..b74d09b 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -113,6 +113,32 @@ int power_manager_scale_mask_min(uint64_t core_mask);
 int power_manager_scale_mask_max(uint64_t core_mask);
 
 /**
+ * Enable Turbo Boost on the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores on which to enable Turbo Boost.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_enable_turbo_mask(uint64_t core_mask);
+
+/**
+ * Disable Turbo Boost on the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores on which to disable Turbo Boost.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_disable_turbo_mask(uint64_t core_mask);
+
+/**
  * Scale up frequency for the core specified by core_num.
  * It is thread-safe.
  *
@@ -168,6 +194,32 @@ int power_manager_scale_core_min(unsigned core_num);
 int power_manager_scale_core_max(unsigned core_num);
 
 /**
+ * Enable Turbo Boost for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to boost
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_enable_turbo_core(unsigned int core_num);
+
+/**
+ * Disable Turbo Boost for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number on which to disable Turbo Boost
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_disable_turbo_core(unsigned int core_num);
+
+/**
  * Get the current freuency of the core specified by core_num
  *
  * @param core_num
diff --git a/examples/vm_power_manager/vm_power_cli.c b/examples/vm_power_manager/vm_power_cli.c
index c5e8d93..6f234fb 100644
--- a/examples/vm_power_manager/vm_power_cli.c
+++ b/examples/vm_power_manager/vm_power_cli.c
@@ -520,6 +520,10 @@ cmd_set_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
 		ret = power_manager_scale_mask_min(res->core_mask);
 	else if (!strcmp(res->cmd , "max"))
 		ret = power_manager_scale_mask_max(res->core_mask);
+	else if (!strcmp(res->cmd, "enable_turbo"))
+		ret = power_manager_enable_turbo_mask(res->core_mask);
+	else if (!strcmp(res->cmd, "disable_turbo"))
+		ret = power_manager_disable_turbo_mask(res->core_mask);
 	if (ret < 0) {
 		cmdline_printf(cl, "Error scaling core_mask(0x%"PRIx64") '%s' , not "
 				"all cores specified have been scaled\n",
@@ -535,14 +539,13 @@ cmdline_parse_token_num_t cmd_set_cpu_freq_mask_core_mask =
 			core_mask, UINT64);
 cmdline_parse_token_string_t cmd_set_cpu_freq_mask_result =
 	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
-			cmd, "up#down#min#max");
+			cmd, "up#down#min#max#enable_turbo#disable_turbo");
 
 cmdline_parse_inst_t cmd_set_cpu_freq_mask_set = {
 	.f = cmd_set_cpu_freq_mask_parsed,
 	.data = NULL,
-	.help_str = "set_cpu_freq <core_mask> <up|down|min|max>, Set the current "
-			"frequency for the cores specified in <core_mask> by scaling "
-			"each up/down/min/max.",
+	.help_str = "set_cpu_freq <core_mask> <up|down|min|max|enable_turbo|disable_turbo>, adjust the current "
+			"frequency for the cores specified in <core_mask>",
 	.tokens = {
 		(void *)&cmd_set_cpu_freq_mask,
 		(void *)&cmd_set_cpu_freq_mask_core_mask,
@@ -614,6 +617,10 @@ cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
 		ret = power_manager_scale_core_min(res->core_num);
 	else if (!strcmp(res->cmd , "max"))
 		ret = power_manager_scale_core_max(res->core_num);
+	else if (!strcmp(res->cmd, "enable_turbo"))
+		ret = power_manager_enable_turbo_core(res->core_num);
+	else if (!strcmp(res->cmd, "disable_turbo"))
+		ret = power_manager_disable_turbo_core(res->core_num);
 	if (ret < 0) {
 		cmdline_printf(cl, "Error scaling core(%u) '%s'\n", res->core_num,
 				res->cmd);
@@ -628,13 +635,13 @@ cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
 			core_num, UINT8);
 cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
 	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
-			cmd, "up#down#min#max");
+			cmd, "up#down#min#max#enable_turbo#disable_turbo");
 
 cmdline_parse_inst_t cmd_set_cpu_freq_set = {
 	.f = cmd_set_cpu_freq_parsed,
 	.data = NULL,
-	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
-			"frequency for the specified core by scaling up/down/min/max",
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max|enable_turbo|disable_turbo>, adjust the current "
+			"frequency for the specified core",
 	.tokens = {
 		(void *)&cmd_set_cpu_freq,
 		(void *)&cmd_set_cpu_freq_core_num,
-- 
2.7.4


* [dpdk-dev] [PATCH v1 3/4] examples/vm_power_cli_guest: add per-core turbo
  2017-08-22 16:11 [dpdk-dev] [PATCH v1 0/4] add per-core Turbo Boost capability David Hunt
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 1/4] lib/librte_power: add per-core turbo capability David Hunt
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 2/4] examples/vm_power_manager: add per-core turbo David Hunt
@ 2017-08-22 16:11 ` David Hunt
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 4/4] lib: limit turbo to particular models of CPU David Hunt
  3 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-08-22 16:11 UTC
  To: dev; +Cc: david.hunt

Add extra commands to guest cli to allow enable/disable of
per-core turbo. Includes messages to vm_power_mgr in host.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 7931135..4e982bd 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -108,6 +108,10 @@ cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
 		ret = rte_power_freq_min(res->lcore_id);
 	else if (!strcmp(res->cmd , "max"))
 		ret = rte_power_freq_max(res->lcore_id);
+	else if (!strcmp(res->cmd, "enable_turbo"))
+		ret = rte_power_freq_enable_turbo(res->lcore_id);
+	else if (!strcmp(res->cmd, "disable_turbo"))
+		ret = rte_power_freq_disable_turbo(res->lcore_id);
 	if (ret != 1)
 		cmdline_printf(cl, "Error sending message: %s\n", strerror(ret));
 }
@@ -120,7 +124,7 @@ cmdline_parse_token_string_t cmd_set_cpu_freq_core_num =
 			lcore_id, UINT8);
 cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
 	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
-			cmd, "up#down#min#max");
+			cmd, "up#down#min#max#enable_turbo#disable_turbo");
 
 cmdline_parse_inst_t cmd_set_cpu_freq_set = {
 	.f = cmd_set_cpu_freq_parsed,
-- 
2.7.4


* [dpdk-dev] [PATCH v1 4/4] lib: limit turbo to particular models of CPU
  2017-08-22 16:11 [dpdk-dev] [PATCH v1 0/4] add per-core Turbo Boost capability David Hunt
                   ` (2 preceding siblings ...)
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 3/4] examples/vm_power_cli_guest: " David Hunt
@ 2017-08-22 16:11 ` David Hunt
  3 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-08-22 16:11 UTC
  To: dev; +Cc: david.hunt

The per-core turbo functionality is only available on specific models
of CPU, so this patch limits it to those models.
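
The model check in the diff below can be read as a pure function of the
CPUID leaf 1 EAX value. The sketch below mirrors that decoding; the function
name is hypothetical, and the EAX values mentioned in the note after the
block are illustrative signatures chosen because they decode to the models
the patch accepts (63, 79 and 85).

```c
#include <stdint.h>

/*
 * Decode family and model from a CPUID leaf 1 EAX value and report
 * whether the model is one of those the patch accepts.
 * Hypothetical helper for illustration only.
 */
static int
model_supports_per_core_turbo(uint32_t eax)
{
	int family = (eax >> 8) & 0xf;
	int model = (eax >> 4) & 0xf;

	/* For family 6 and above, fold in the extended model bits. */
	if (family > 5)
		model |= (eax >> 12) & 0xf0;

	return family == 6 && (model == 63 || model == 79 || model == 85);
}
```

For example, 0x000306F2 decodes to family 6, model 63 and is accepted, while
0x000306A9 decodes to family 6, model 58 and is rejected.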

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_power/rte_power_acpi_cpufreq.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/lib/librte_power/rte_power_acpi_cpufreq.c b/lib/librte_power/rte_power_acpi_cpufreq.c
index 6695f59..ec8d304 100644
--- a/lib/librte_power/rte_power_acpi_cpufreq.c
+++ b/lib/librte_power/rte_power_acpi_cpufreq.c
@@ -40,6 +40,7 @@
 #include <unistd.h>
 #include <signal.h>
 #include <limits.h>
+#include <cpuid.h>
 
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
@@ -554,6 +555,27 @@ rte_power_acpi_cpufreq_freq_min(unsigned lcore_id)
 
 
 static int
+per_core_turbo_supported(void)
+{
+	uint32_t eax, ebx, ecx, edx;
+	int family, model;
+
+	__cpuid(1, eax, ebx, ecx, edx);
+
+	family = (eax >> 8) & 0xf;
+	if (family > 5)
+		model = ((eax >> 4) & 0xf) | ((eax >> 12) & 0xf0);
+	else
+		model = (eax >> 4) & 0xf;
+
+	if (family == 6)
+		if ((model == 63) || (model == 79) || (model == 85))
+			return 1;
+	return 0;
+}
+
+
+static int
 rdmsr(int lcore, int msr, uint64_t *val)
 {
 	char filename[32];
@@ -607,6 +629,8 @@ rte_power_acpi_turbo_status(unsigned int lcore_id)
 	}
 
 #if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
+	if (!per_core_turbo_supported())
+		return 0;
 	retval = rdmsr(lcore_id, IA32_PERF_CTL, &val);
 	if (retval)
 		return retval;
@@ -630,6 +654,8 @@ rte_power_acpi_enable_turbo(unsigned int lcore_id)
 	}
 
 #if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
+	if (!per_core_turbo_supported())
+		return 0;
 	/*
 	 * The low byte of 1ADh MSR contains max recommended ratio when a small
 	 * number of cores are active. Use this ratio when turbo is enabled.
@@ -663,6 +689,8 @@ rte_power_acpi_disable_turbo(unsigned int lcore_id)
 	}
 
 #if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
+	if (!per_core_turbo_supported())
+		return 0;
 	/*
 	 * 0CEh MSR contains max non-turbo ratio in bits 8-15. Use this
 	 * for the freq when turbo is disabled for that core.
-- 
2.7.4


* [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability
  2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 1/4] lib/librte_power: add per-core turbo capability David Hunt
@ 2017-09-13 10:44   ` David Hunt
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 1/4] lib/librte_power: add turbo boost API David Hunt
                       ` (4 more replies)
  0 siblings, 5 replies; 34+ messages in thread
From: David Hunt @ 2017-09-13 10:44 UTC
  To: dev

Recent generations of the Intel® Xeon® family processors allow Turbo Boost
to be enabled/disabled on a per-core basis.

This patch set introduces additional API calls to the librte_power library
to allow users to enable/disable Turbo Boost on particular cores.

Changes in patchset v2:
   * Removed wrmsr/rdmsr functions as they were very architecture specific.
     Now using scaling_setspeed in sysfs, as this is a more standard
     cross-platform method of changing frequencies (where available).
   * Removed the patch that checks for particular models of CPU, as it is no
     longer needed with the above change.
   * Added APIs to the docs.
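
As a rough illustration of the sysfs approach mentioned in the first item,
the sketch below builds the per-core scaling_setspeed path (the path format
is the one already defined in rte_power_acpi_cpufreq.c) and writes a
frequency in kHz to it. Both helper names are hypothetical, and how the v2
patches choose the frequency to write is not shown in this cover letter.
Writing scaling_setspeed normally requires root and the userspace cpufreq
governor.

```c
#include <stdio.h>

#define SETSPEED_FMT \
	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"

/* Format the scaling_setspeed path for a core; -1 if buf is too small. */
static int
setspeed_path(unsigned int lcore_id, char *buf, size_t len)
{
	int n = snprintf(buf, len, SETSPEED_FMT, lcore_id);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a frequency in kHz to the given file; 0 on success, -1 on error. */
static int
write_khz(const char *path, unsigned long khz)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (f == NULL)
		return -1;
	ok = fprintf(f, "%lu", khz) > 0;
	/* fclose() flushes the write, so its result matters too. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would format the path for the physical core (for example core 24)
and then pass it to write_khz() together with the desired frequency.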

Additionally, the use of the library is demonstrated by additions to the
vm_power_manager example application, where the new commands have been
added to allow the turbo status of cores to be changed dynamically.

Extra message types have been added to the virtio-serial channels between the
guest_vm_power_manager app and the vm_power_manager app to demonstrate
turbo change requests from a virtual machine. In this case, the guest sends
a request to the physical host, which in turn changes the turbo state of the
requested core.


Usage Example:
--------------

A VM has been created using 8 CPU cores, and 8 virtio-serial channels have
been created as per-core communications channels between the host and the VM.

See: http://www.dpdk.org/doc/guides/sample_app_ug/vm_power_management.html
for more information on setting up the vm_power applications.

In the vm_power_manager app on the host, we can query these channels:
vmpower> show_vm ubuntu2

VM: 'ubuntu2', status = ACTIVE
Channels 8
  [0]: /tmp/powermonitor/ubuntu2.0, status = CONNECTED
  [1]: /tmp/powermonitor/ubuntu2.1, status = CONNECTED
  [2]: /tmp/powermonitor/ubuntu2.2, status = CONNECTED
  [3]: /tmp/powermonitor/ubuntu2.3, status = CONNECTED
  [4]: /tmp/powermonitor/ubuntu2.4, status = CONNECTED
  [5]: /tmp/powermonitor/ubuntu2.5, status = CONNECTED
  [6]: /tmp/powermonitor/ubuntu2.6, status = CONNECTED
  [7]: /tmp/powermonitor/ubuntu2.7, status = CONNECTED
Virtual CPU(s): 8
  [0]: Physical CPU Mask 0x100000
  [1]: Physical CPU Mask 0x200000
  [2]: Physical CPU Mask 0x400000
  [3]: Physical CPU Mask 0x800000
  [4]: Physical CPU Mask 0x1000000
  [5]: Physical CPU Mask 0x2000000
  [6]: Physical CPU Mask 0x4000000
  [7]: Physical CPU Mask 0x8000000

Once the VM is up and running, if we exercise all the cores on the guest, we
can use turbostat on the host to see the frequencies of the guest cores. In
this example, it's cores 20-27:

      19       0    0.01    2500    2500
      20    2498  100.00    2500    2498
      21    2498  100.00    2500    2498
      22    2498  100.00    2500    2498
      23    2498  100.00    2500    2498
      24   *2498  100.00    2500    2498
      25    2498  100.00    2500    2498
      26    2498  100.00    2500    2498
      27    2498  100.00    2500    2498
      28       0    0.01    2032    2498

We can then issue a command in the vmpower app on the guest:

vmpower(guest)> set_cpu_freq 4 enable_turbo

This command will pass a message down through virtio-serial to the host, which
will enable turbo on core 24, the physical core underlying the guest's
lcore_id 4. We can then see the change by running turbostat on the host:

      19       0    0.01    2500    2496
      20    2498  100.00    2500    2498
      21    2498  100.00    2500    2498
      22    2498  100.00    2500    2498
      23    2498  100.00    2500    2498
      24   *3297  100.00    3300    2498
      25    2498  100.00    2500    2498
      26    2498  100.00    2500    2498
      27    2498  100.00    2500    2498
      28       0    0.01    1016    2498

Core 24 is now running at 3300MHz, whereas the remainder are still running
at 2500MHz.

We can issue a similar command in the vm_power_manager running on the host
to disable turbo on that core, but this time we use the physical core id:

vmpower> set_cpu_freq 24 disable_turbo

and we see that turbo is now disabled on that core:

      19       0    0.00    2500    2495
      20    2499  100.00    2500    2499
      21    2499  100.00    2500    2499
      22    2499  100.00    2500    2499
      23    2499  100.00    2500    2499
      24   *2499  100.00    2500    2499
      25    2499  100.00    2500    2499
      26    2499  100.00    2500    2499
      27    2499  100.00    2500    2499
      28       0    0.01    1000    2499

[1/4] lib/librte_power: add turbo boost API
[2/4] examples/vm_power_manager: add per-core turbo
[3/4] examples/vm_power_cli_guest: add per-core turbo
[4/4] doc/power: add information on per-core turbo APIs


* [dpdk-dev] [PATCH v2 1/4] lib/librte_power: add turbo boost API
  2017-09-13 10:44   ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability David Hunt
@ 2017-09-13 10:44     ` David Hunt
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 2/4] examples/vm_power_manager: add per-core turbo David Hunt
                       ` (3 subsequent siblings)
  4 siblings, 2 replies; 34+ messages in thread
From: David Hunt @ 2017-09-13 10:44 UTC (permalink / raw)
  To: dev; +Cc: David Hunt

Add a new set of APIs to allow Turbo Boost to be enabled or
disabled on a per-core basis.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_power/channel_commands.h       |   2 +
 lib/librte_power/rte_power.c              |   9 +++
 lib/librte_power/rte_power.h              |  41 +++++++++++
 lib/librte_power/rte_power_acpi_cpufreq.c | 111 +++++++++++++++++++++++++++++-
 lib/librte_power/rte_power_acpi_cpufreq.h |  40 +++++++++++
 lib/librte_power/rte_power_kvm_vm.c       |  19 +++++
 lib/librte_power/rte_power_kvm_vm.h       |  35 +++++++++-
 7 files changed, 255 insertions(+), 2 deletions(-)

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
index 383897b..484085b 100644
--- a/lib/librte_power/channel_commands.h
+++ b/lib/librte_power/channel_commands.h
@@ -52,6 +52,8 @@ extern "C" {
 #define CPU_POWER_SCALE_DOWN    2
 #define CPU_POWER_SCALE_MAX     3
 #define CPU_POWER_SCALE_MIN     4
+#define CPU_POWER_ENABLE_TURBO  5
+#define CPU_POWER_DISABLE_TURBO 6
 
 struct channel_packet {
 	uint64_t resource_id; /**< core_num, device */
diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 998ed1c..b327a86 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -50,6 +50,9 @@ rte_power_freq_change_t rte_power_freq_up = NULL;
 rte_power_freq_change_t rte_power_freq_down = NULL;
 rte_power_freq_change_t rte_power_freq_max = NULL;
 rte_power_freq_change_t rte_power_freq_min = NULL;
+rte_power_freq_change_t rte_power_turbo_status;
+rte_power_freq_change_t rte_power_freq_enable_turbo;
+rte_power_freq_change_t rte_power_freq_disable_turbo;
 
 int
 rte_power_set_env(enum power_management_env env)
@@ -65,6 +68,9 @@ rte_power_set_env(enum power_management_env env)
 		rte_power_freq_down = rte_power_acpi_cpufreq_freq_down;
 		rte_power_freq_min = rte_power_acpi_cpufreq_freq_min;
 		rte_power_freq_max = rte_power_acpi_cpufreq_freq_max;
+		rte_power_turbo_status = rte_power_acpi_turbo_status;
+		rte_power_freq_enable_turbo = rte_power_acpi_enable_turbo;
+		rte_power_freq_disable_turbo = rte_power_acpi_disable_turbo;
 	} else if (env == PM_ENV_KVM_VM) {
 		rte_power_freqs = rte_power_kvm_vm_freqs;
 		rte_power_get_freq = rte_power_kvm_vm_get_freq;
@@ -73,6 +79,9 @@ rte_power_set_env(enum power_management_env env)
 		rte_power_freq_down = rte_power_kvm_vm_freq_down;
 		rte_power_freq_min = rte_power_kvm_vm_freq_min;
 		rte_power_freq_max = rte_power_kvm_vm_freq_max;
+		rte_power_turbo_status = rte_power_kvm_vm_turbo_status;
+		rte_power_freq_enable_turbo = rte_power_kvm_vm_enable_turbo;
+		rte_power_freq_disable_turbo = rte_power_kvm_vm_disable_turbo;
 	} else {
 		RTE_LOG(ERR, POWER, "Invalid Power Management Environment(%d) set\n",
 				env);
diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
index 67e0ec0..b17b7a5 100644
--- a/lib/librte_power/rte_power.h
+++ b/lib/librte_power/rte_power.h
@@ -236,6 +236,47 @@ extern rte_power_freq_change_t rte_power_freq_max;
  */
 extern rte_power_freq_change_t rte_power_freq_min;
 
+/**
+ * Query the Turbo Boost status of a specific lcore.
+ * Review each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 Turbo Boost is enabled for this lcore.
+ *  - 0 Turbo Boost is disabled for this lcore.
+ *  - Negative on error.
+ */
+extern rte_power_freq_change_t rte_power_turbo_status;
+
+/**
+ * Enable Turbo Boost for this lcore.
+ * Review each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+
+/**
+ * Disable Turbo Boost for this lcore.
+ * Review each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.c b/lib/librte_power/rte_power_acpi_cpufreq.c
index a56c9b5..01ac5ac 100644
--- a/lib/librte_power/rte_power_acpi_cpufreq.c
+++ b/lib/librte_power/rte_power_acpi_cpufreq.c
@@ -87,6 +87,14 @@
 #define POWER_SYSFILE_SETSPEED   \
 		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
 
+/*
+ * MSR related
+ */
+#define PLATFORM_INFO     0x0CE
+#define TURBO_RATIO_LIMIT 0x1AD
+#define IA32_PERF_CTL     0x199
+#define CORE_TURBO_DISABLE_BIT ((uint64_t)1<<32)
+
 enum power_state {
 	POWER_IDLE = 0,
 	POWER_ONGOING,
@@ -105,6 +113,8 @@ struct rte_power_info {
 	char governor_ori[32];               /**< Original governor name */
 	uint32_t curr_idx;                   /**< Freq index in freqs array */
 	volatile uint32_t state;             /**< Power in use state */
+	uint16_t turbo_available;            /**< Turbo Boost available */
+	uint16_t turbo_enable;               /**< Turbo Boost enable/disable */
 } __rte_cache_aligned;
 
 static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
@@ -244,6 +254,18 @@ power_get_available_freqs(struct rte_power_info *pi)
 				POWER_CONVERT_TO_DECIMAL);
 	}
 
+	if ((pi->freqs[0]-1000) == pi->freqs[1]) {
+		pi->turbo_available = 1;
+		pi->turbo_enable = 1;
+		POWER_DEBUG_TRACE("Lcore %u Can do Turbo Boost\n",
+				pi->lcore_id);
+	} else {
+		pi->turbo_available = 0;
+		pi->turbo_enable = 0;
+		POWER_DEBUG_TRACE("Turbo Boost not available on Lcore %u\n",
+				pi->lcore_id);
+	}
+
 	ret = 0;
 	POWER_DEBUG_TRACE("%d frequencie(s) of lcore %u are available\n",
 			count, pi->lcore_id);
@@ -525,7 +547,17 @@ rte_power_acpi_cpufreq_freq_max(unsigned lcore_id)
 	}
 
 	/* Frequencies in the array are from high to low. */
-	return set_freq_internal(&lcore_power_info[lcore_id], 0);
+	if (lcore_power_info[lcore_id].turbo_available) {
+		if (lcore_power_info[lcore_id].turbo_enable)
+			/* Set to Turbo */
+			return set_freq_internal(
+					&lcore_power_info[lcore_id], 0);
+		else
+			/* Set to max non-turbo */
+			return set_freq_internal(
+					&lcore_power_info[lcore_id], 1);
+	} else
+		return set_freq_internal(&lcore_power_info[lcore_id], 0);
 }
 
 int
@@ -543,3 +575,80 @@ rte_power_acpi_cpufreq_freq_min(unsigned lcore_id)
 	/* Frequencies in the array are from high to low. */
 	return set_freq_internal(pi, pi->nb_freqs - 1);
 }
+
+
+int
+rte_power_acpi_turbo_status(unsigned int lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+
+	return pi->turbo_enable;
+}
+
+
+int
+rte_power_acpi_enable_turbo(unsigned int lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+
+	if (pi->turbo_available)
+		pi->turbo_enable = 1;
+	else {
+		pi->turbo_enable = 0;
+		RTE_LOG(ERR, POWER,
+			"Failed to enable turbo on lcore %u\n",
+			lcore_id);
+		return -1;
+	}
+
+	/* Max may have changed, so call to max function */
+	if (rte_power_acpi_cpufreq_freq_max(lcore_id) < 0) {
+		RTE_LOG(ERR, POWER,
+			"Failed to set frequency of lcore %u to max\n",
+			lcore_id);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_power_acpi_disable_turbo(unsigned int lcore_id)
+{
+	struct rte_power_info *pi;
+
+	if (lcore_id >= RTE_MAX_LCORE) {
+		RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+		return -1;
+	}
+
+	pi = &lcore_power_info[lcore_id];
+
+	pi->turbo_enable = 0;
+
+	if ((pi->turbo_available) && (pi->curr_idx <= 1)) {
+		/* Try to set freq to max by default coming out of turbo */
+		if (rte_power_acpi_cpufreq_freq_max(lcore_id) < 0) {
+			RTE_LOG(ERR, POWER,
+				"Failed to set frequency of lcore %u to max\n",
+				lcore_id);
+			return -1;
+		}
+	}
+
+	return 0;
+}
diff --git a/lib/librte_power/rte_power_acpi_cpufreq.h b/lib/librte_power/rte_power_acpi_cpufreq.h
index 68578e9..eee0ca0 100644
--- a/lib/librte_power/rte_power_acpi_cpufreq.h
+++ b/lib/librte_power/rte_power_acpi_cpufreq.h
@@ -185,6 +185,46 @@ int rte_power_acpi_cpufreq_freq_max(unsigned lcore_id);
  */
 int rte_power_acpi_cpufreq_freq_min(unsigned lcore_id);
 
+/**
+ * Get the turbo status of a specific lcore.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 Turbo Boost is enabled on this lcore.
+ *  - 0 Turbo Boost is disabled on this lcore.
+ *  - Negative on error.
+ */
+int rte_power_acpi_turbo_status(unsigned int lcore_id);
+
+/**
+ * Enable Turbo Boost on a specific lcore.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 Turbo Boost is enabled successfully on this lcore.
+ *  - Negative on error.
+ */
+int rte_power_acpi_enable_turbo(unsigned int lcore_id);
+
+/**
+ * Disable Turbo Boost on a specific lcore.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 0 Turbo Boost disabled successfully on this lcore.
+ *  - Negative on error.
+ */
+int rte_power_acpi_disable_turbo(unsigned int lcore_id);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_power/rte_power_kvm_vm.c b/lib/librte_power/rte_power_kvm_vm.c
index a1badf3..9906062 100644
--- a/lib/librte_power/rte_power_kvm_vm.c
+++ b/lib/librte_power/rte_power_kvm_vm.c
@@ -134,3 +134,22 @@ rte_power_kvm_vm_freq_min(unsigned lcore_id)
 {
 	return send_msg(lcore_id, CPU_POWER_SCALE_MIN);
 }
+
+int
+rte_power_kvm_vm_turbo_status(__attribute__((unused)) unsigned int lcore_id)
+{
+	RTE_LOG(ERR, POWER, "rte_power_turbo_status is not implemented for Virtual Machine Power Management\n");
+	return -ENOTSUP;
+}
+
+int
+rte_power_kvm_vm_enable_turbo(unsigned int lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_ENABLE_TURBO);
+}
+
+int
+rte_power_kvm_vm_disable_turbo(unsigned int lcore_id)
+{
+	return send_msg(lcore_id, CPU_POWER_DISABLE_TURBO);
+}
diff --git a/lib/librte_power/rte_power_kvm_vm.h b/lib/librte_power/rte_power_kvm_vm.h
index dcbc878..9af41d6 100644
--- a/lib/librte_power/rte_power_kvm_vm.h
+++ b/lib/librte_power/rte_power_kvm_vm.h
@@ -172,8 +172,41 @@ int rte_power_kvm_vm_freq_max(unsigned lcore_id);
  */
 int rte_power_kvm_vm_freq_min(unsigned lcore_id);
 
+/**
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  -ENOTSUP
+ */
+int rte_power_kvm_vm_turbo_status(unsigned int lcore_id);
+
+/**
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_enable_turbo(unsigned int lcore_id);
+
+/**
+ * It should be protected outside of this function for threadsafe.
+ *
+ * @param lcore_id
+ *  lcore id.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int rte_power_kvm_vm_disable_turbo(unsigned int lcore_id);
 #ifdef __cplusplus
 }
 #endif
-
 #endif
-- 
2.7.4


* [dpdk-dev] [PATCH v2 2/4] examples/vm_power_manager: add per-core turbo
  2017-09-13 10:44   ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability David Hunt
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 1/4] lib/librte_power: add turbo boost API David Hunt
@ 2017-09-13 10:44     ` David Hunt
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 3/4] examples/vm_power_cli_guest: " David Hunt
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-09-13 10:44 UTC (permalink / raw)
  To: dev; +Cc: David Hunt

Add extra commands to command line to allow enable/disable of
per-core turbo.

When a core has turbo enabled, calling for max frequency will allow it to
go to a turbo frequency (P0n).

When a core has turbo disabled, calling for max frequency will allow it to
go to the maximum non-turbo frequency (P1), but not beyond.
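
The rule described above can be modelled in a few lines. This is a sketch
only: struct sketch_core and both functions are hypothetical stand-ins for
the logic the librte_power patch adds around struct rte_power_info, and the
nb_freqs guard is an addition made for this example. It assumes, as the
patch does, that the frequency table is sorted high to low and that a first
entry exactly 1000 kHz above the second marks the turbo entry.

```c
#include <stdint.h>

/* Hypothetical per-core state, loosely modelled on struct rte_power_info. */
struct sketch_core {
	const uint32_t *freqs; /* available frequencies in kHz, high to low */
	uint32_t nb_freqs;     /* number of entries in freqs */
	int turbo_available;   /* set by sketch_detect_turbo() */
	int turbo_enable;      /* current per-core turbo setting */
};

/* Turbo is assumed present when freqs[0] is exactly 1000 kHz above freqs[1]. */
static void
sketch_detect_turbo(struct sketch_core *c)
{
	c->turbo_available = (c->nb_freqs >= 2 &&
			c->freqs[0] - 1000 == c->freqs[1]);
	c->turbo_enable = c->turbo_available;
}

/* Index into freqs[] that a "max" request selects. */
static uint32_t
sketch_max_index(const struct sketch_core *c)
{
	/* Turbo present but disabled: stop at the highest non-turbo entry (P1). */
	if (c->turbo_available && !c->turbo_enable)
		return 1;
	/* Turbo enabled, or no turbo entry at all: use the top entry. */
	return 0;
}
```

With turbo enabled a "max" request selects index 0 (the turbo entry); with
turbo disabled it selects index 1, the highest non-turbo frequency.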

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/channel_monitor.c | 12 +++++++
 examples/vm_power_manager/power_manager.c   | 36 ++++++++++++++++++++
 examples/vm_power_manager/power_manager.h   | 52 +++++++++++++++++++++++++++++
 examples/vm_power_manager/vm_power_cli.c    | 21 ++++++++----
 4 files changed, 114 insertions(+), 7 deletions(-)

diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index e7f5cc4..ac40dac 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -105,6 +105,12 @@ process_request(struct channel_packet *pkt, struct channel_info *chan_info)
 			case(CPU_POWER_SCALE_UP):
 					power_manager_scale_core_up(core_num);
 			break;
+			case(CPU_POWER_ENABLE_TURBO):
+				power_manager_enable_turbo_core(core_num);
+			break;
+			case(CPU_POWER_DISABLE_TURBO):
+				power_manager_disable_turbo_core(core_num);
+			break;
 			default:
 				break;
 			}
@@ -122,6 +128,12 @@ process_request(struct channel_packet *pkt, struct channel_info *chan_info)
 			case(CPU_POWER_SCALE_UP):
 					power_manager_scale_mask_up(core_mask);
 			break;
+			case(CPU_POWER_ENABLE_TURBO):
+				power_manager_enable_turbo_mask(core_mask);
+			break;
+			case(CPU_POWER_DISABLE_TURBO):
+				power_manager_disable_turbo_mask(core_mask);
+			break;
 			default:
 				break;
 			}
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 2644fce..80705f9 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -216,6 +216,24 @@ power_manager_scale_mask_max(uint64_t core_mask)
 }
 
 int
+power_manager_enable_turbo_mask(uint64_t core_mask)
+{
+	int ret = 0;
+
+	POWER_SCALE_MASK(enable_turbo, core_mask, ret);
+	return ret;
+}
+
+int
+power_manager_disable_turbo_mask(uint64_t core_mask)
+{
+	int ret = 0;
+
+	POWER_SCALE_MASK(disable_turbo, core_mask, ret);
+	return ret;
+}
+
+int
 power_manager_scale_core_up(unsigned core_num)
 {
 	int ret = 0;
@@ -250,3 +268,21 @@ power_manager_scale_core_max(unsigned core_num)
 	POWER_SCALE_CORE(max, core_num, ret);
 	return ret;
 }
+
+int
+power_manager_enable_turbo_core(unsigned int core_num)
+{
+	int ret = 0;
+
+	POWER_SCALE_CORE(enable_turbo, core_num, ret);
+	return ret;
+}
+
+int
+power_manager_disable_turbo_core(unsigned int core_num)
+{
+	int ret = 0;
+
+	POWER_SCALE_CORE(disable_turbo, core_num, ret);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index 1b45bab..b74d09b 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -113,6 +113,32 @@ int power_manager_scale_mask_min(uint64_t core_mask);
 int power_manager_scale_mask_max(uint64_t core_mask);
 
 /**
+ * Enable Turbo Boost on the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_enable_turbo_mask(uint64_t core_mask);
+
+/**
+ * Disable Turbo Boost on the cores specified in core_mask.
+ * It is thread-safe.
+ *
+ * @param core_mask
+ *  The uint64_t bit-mask of cores to change frequency.
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_disable_turbo_mask(uint64_t core_mask);
+
+/**
  * Scale up frequency for the core specified by core_num.
  * It is thread-safe.
  *
@@ -168,6 +194,32 @@ int power_manager_scale_core_min(unsigned core_num);
 int power_manager_scale_core_max(unsigned core_num);
 
 /**
+ * Enable Turbo Boost for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to boost
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_enable_turbo_core(unsigned int core_num);
+
+/**
+ * Disable Turbo Boost for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number on which to disable Turbo Boost
+ *
+ * @return
+ *  - 1 on success.
+ *  - Negative on error.
+ */
+int power_manager_disable_turbo_core(unsigned int core_num);
+
+/**
  * Get the current freuency of the core specified by core_num
  *
  * @param core_num
diff --git a/examples/vm_power_manager/vm_power_cli.c b/examples/vm_power_manager/vm_power_cli.c
index c5e8d93..6f234fb 100644
--- a/examples/vm_power_manager/vm_power_cli.c
+++ b/examples/vm_power_manager/vm_power_cli.c
@@ -520,6 +520,10 @@ cmd_set_cpu_freq_mask_parsed(void *parsed_result, struct cmdline *cl,
 		ret = power_manager_scale_mask_min(res->core_mask);
 	else if (!strcmp(res->cmd , "max"))
 		ret = power_manager_scale_mask_max(res->core_mask);
+	else if (!strcmp(res->cmd, "enable_turbo"))
+		ret = power_manager_enable_turbo_mask(res->core_mask);
+	else if (!strcmp(res->cmd, "disable_turbo"))
+		ret = power_manager_disable_turbo_mask(res->core_mask);
 	if (ret < 0) {
 		cmdline_printf(cl, "Error scaling core_mask(0x%"PRIx64") '%s' , not "
 				"all cores specified have been scaled\n",
@@ -535,14 +539,13 @@ cmdline_parse_token_num_t cmd_set_cpu_freq_mask_core_mask =
 			core_mask, UINT64);
 cmdline_parse_token_string_t cmd_set_cpu_freq_mask_result =
 	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_mask_result,
-			cmd, "up#down#min#max");
+			cmd, "up#down#min#max#enable_turbo#disable_turbo");
 
 cmdline_parse_inst_t cmd_set_cpu_freq_mask_set = {
 	.f = cmd_set_cpu_freq_mask_parsed,
 	.data = NULL,
-	.help_str = "set_cpu_freq <core_mask> <up|down|min|max>, Set the current "
-			"frequency for the cores specified in <core_mask> by scaling "
-			"each up/down/min/max.",
+	.help_str = "set_cpu_freq <core_mask> <up|down|min|max|enable_turbo|disable_turbo>, adjust the current "
+			"frequency for the cores specified in <core_mask>",
 	.tokens = {
 		(void *)&cmd_set_cpu_freq_mask,
 		(void *)&cmd_set_cpu_freq_mask_core_mask,
@@ -614,6 +617,10 @@ cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
 		ret = power_manager_scale_core_min(res->core_num);
 	else if (!strcmp(res->cmd , "max"))
 		ret = power_manager_scale_core_max(res->core_num);
+	else if (!strcmp(res->cmd, "enable_turbo"))
+		ret = power_manager_enable_turbo_core(res->core_num);
+	else if (!strcmp(res->cmd, "disable_turbo"))
+		ret = power_manager_disable_turbo_core(res->core_num);
 	if (ret < 0) {
 		cmdline_printf(cl, "Error scaling core(%u) '%s'\n", res->core_num,
 				res->cmd);
@@ -628,13 +635,13 @@ cmdline_parse_token_num_t cmd_set_cpu_freq_core_num =
 			core_num, UINT8);
 cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
 	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
-			cmd, "up#down#min#max");
+			cmd, "up#down#min#max#enable_turbo#disable_turbo");
 
 cmdline_parse_inst_t cmd_set_cpu_freq_set = {
 	.f = cmd_set_cpu_freq_parsed,
 	.data = NULL,
-	.help_str = "set_cpu_freq <core_num> <up|down|min|max>, Set the current "
-			"frequency for the specified core by scaling up/down/min/max",
+	.help_str = "set_cpu_freq <core_num> <up|down|min|max|enable_turbo|disable_turbo>, adjust the current "
+			"frequency for the specified core",
 	.tokens = {
 		(void *)&cmd_set_cpu_freq,
 		(void *)&cmd_set_cpu_freq_core_num,
-- 
2.7.4


* [dpdk-dev] [PATCH v2 3/4] examples/vm_power_cli_guest: add per-core turbo
  2017-09-13 10:44   ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability David Hunt
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 1/4] lib/librte_power: add turbo boost API David Hunt
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 2/4] examples/vm_power_manager: add per-core turbo David Hunt
@ 2017-09-13 10:44     ` David Hunt
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core turbo APIs David Hunt
  2017-09-22 14:36     ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability Thomas Monjalon
  4 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-09-13 10:44 UTC (permalink / raw)
  To: dev; +Cc: David Hunt

Add extra commands to guest cli to allow enable/disable of
per-core turbo. Includes messages to vm_power_mgr in host.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 7931135..4e982bd 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -108,6 +108,10 @@ cmd_set_cpu_freq_parsed(void *parsed_result, struct cmdline *cl,
 		ret = rte_power_freq_min(res->lcore_id);
 	else if (!strcmp(res->cmd , "max"))
 		ret = rte_power_freq_max(res->lcore_id);
+	else if (!strcmp(res->cmd, "enable_turbo"))
+		ret = rte_power_freq_enable_turbo(res->lcore_id);
+	else if (!strcmp(res->cmd, "disable_turbo"))
+		ret = rte_power_freq_disable_turbo(res->lcore_id);
 	if (ret != 1)
 		cmdline_printf(cl, "Error sending message: %s\n", strerror(ret));
 }
@@ -120,7 +124,7 @@ cmdline_parse_token_string_t cmd_set_cpu_freq_core_num =
 			lcore_id, UINT8);
 cmdline_parse_token_string_t cmd_set_cpu_freq_cmd_cmd =
 	TOKEN_STRING_INITIALIZER(struct cmd_set_cpu_freq_result,
-			cmd, "up#down#min#max");
+			cmd, "up#down#min#max#enable_turbo#disable_turbo");
 
 cmdline_parse_inst_t cmd_set_cpu_freq_set = {
 	.f = cmd_set_cpu_freq_parsed,
-- 
2.7.4


* [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core turbo APIs
  2017-09-13 10:44   ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability David Hunt
                       ` (2 preceding siblings ...)
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 3/4] examples/vm_power_cli_guest: " David Hunt
@ 2017-09-13 10:44     ` David Hunt
  2017-09-18 18:20       ` Mcnamara, John
  2018-02-06 12:29       ` Mcnamara, John
  2017-09-22 14:36     ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability Thomas Monjalon
  4 siblings, 2 replies; 34+ messages in thread
From: David Hunt @ 2017-09-13 10:44 UTC (permalink / raw)
  To: dev; +Cc: David Hunt

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 doc/guides/prog_guide/power_man.rst | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index 114d0b1..c5d62a3 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -89,6 +89,14 @@ Core state can be altered by speculative sleeps whenever the specified lcore has
 In the DPDK, if no packet is received after polling,
 speculative sleeps can be triggered according the strategies defined by the user space application.
 
+Per-core Turbo Boost
+--------------------
+
+Cores can be allowed to enter a Turbo Boost state on a per-core basis. This
+is achieved by enabling Turbo Boost Technology in the BIOS, then looping
+through the relevant cores and enabling or disabling Turbo Boost on each
+core.
+
 API Overview of the Power Library
 ---------------------------------
 
@@ -108,6 +116,10 @@ The main methods exported by power library are for CPU frequency scaling and inc
 
 *   **Freq set**: Prompt the kernel to set the frequency for the specific lcore.
 
+*   **Enable turbo**: Prompt the kernel to enable Turbo Boost for the specific lcore.
+
+*   **Disable turbo**: Prompt the kernel to disable Turbo Boost for the specific lcore.
+
 User Cases
 ----------
 
-- 
2.7.4


* Re: [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core turbo APIs
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core turbo APIs David Hunt
@ 2017-09-18 18:20       ` Mcnamara, John
  2018-02-06 12:29       ` Mcnamara, John
  1 sibling, 0 replies; 34+ messages in thread
From: Mcnamara, John @ 2017-09-18 18:20 UTC (permalink / raw)
  To: Hunt, David, dev; +Cc: Hunt, David



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of David Hunt
> Sent: Wednesday, September 13, 2017 11:44 AM
> To: dev@dpdk.org
> Cc: Hunt, David <david.hunt@intel.com>
> Subject: [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core
> turbo APIs
> 
> Signed-off-by: David Hunt <david.hunt@intel.com>

Acked-by: John McNamara <john.mcnamara@intel.com>


* Re: [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability
  2017-09-13 10:44   ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability David Hunt
                       ` (3 preceding siblings ...)
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core turbo APIs David Hunt
@ 2017-09-22 14:36     ` Thomas Monjalon
  4 siblings, 0 replies; 34+ messages in thread
From: Thomas Monjalon @ 2017-09-22 14:36 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

13/09/2017 12:44, David Hunt:
> Recent generations of the Intel® Xeon® family processors allow Turbo Boost
> to be enabled/disabled on a per-core basis.
> 
> This patch set introduces additional API calls to the librte_power library
> to allow users to enable/disable Turbo Boost on particular cores.

Applied, thanks


* [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 1/4] lib/librte_power: add turbo boost API David Hunt
@ 2017-10-03 14:08       ` David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 1/9] net/i40e: add API to convert VF MAC to VF id David Hunt
                           ` (8 more replies)
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
  1 sibling, 9 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, jingjing.wu

Policy Based Power Control for Guest

This patchset adds the facility for a guest VM to send a policy down to the
host that will allow the host to scale up/down cpu frequencies
depending on the policy criteria independently of the DPDK app running in
the guest.  This differs from the previous vm_power implementation where
individual scale up/down requests were sent from the guest to the host via
virtio-serial.

V3 patchset changes:
  * Changed to using is_same_ether_addr() instead of looping through
    the mac address bytes to compare them.
  * Tweaked some comments and wording in the i40e patch after review.
  * Added a patch to the set to add new i40e function to map file, so
    as to allow shared library builds. The power library API needs a cleanup
    in next release, so will add API/ABI warning for this cleanup in a
    separate patch.

V2 patchset changes:
  * Removed APIs in ethdev layer.
  * Now just a single new API in the i40e driver for mapping VF MAC to
    VF index.
  * Moved new function from rte_rxtx.c to rte_pmd_i40e.c
  * Removed function for reading i40e register, moved to using the
    standard stats API.
  * Renamed i40e function to rte_pmd_i40e_query_vfid_by_mac
  * Cleaned up policy generation code.

This patchset modifies the vm_power_manager app that runs in the host, and
the guest_vm_power_app example app that runs in the guest. This allows the
guest to send down a policy to the host via virtio-serial, which then allows
the host to scale up/down based on the criteria in the policy, resulting in
quicker scale up/down than individual requests coming from the guest.
It also means that the DPDK application running in the guest does not need
to be modified in any way: it is unaware that its cores are being scaled
up/down, reducing the effort in implementing a power-aware infrastructure.

The usage model is as follows:
1. Set up the VFs and assign to the guest in the usual way.
2. Run vm_power_manager on the host, creating a channel to the guest.
3. Start the guest_vm_power_mgr app on the guest, which establishes
   a virtio-serial channel to the host.
4. Send down the profile for the guest using the "send_profile now" command.
   There is an example profile hard-coded into guest_vm_power_mgr.
5. Stop the guest_vm_power_mgr and run your normal power-unaware application.
6. Send traffic into the VFs at varying traffic rates.
   Observe the frequency change on the host (turbostat -i 1)

The sequence of code changes is as follows:

A new function has been added to the i40e driver to allow mapping of
a VF MAC to VF index.

Next we make an addition to librte_power that adds an extra command to allow
the passing of a policy structure from the guest to the host. This struct
contains information like busy/quiet hour, packet throughput thresholds, etc.
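
To give a feel for how a throughput-based policy could drive scaling, here
is a small sketch. The struct layout, field names, enum and function are all
hypothetical and invented for this example; the actual policy structure is
defined in the librte_power patch of this set and is not reproduced here.
The three-level outcome is only an assumption suggested by the "scale to
medium freq" patch.

```c
#include <stdint.h>

/* Hypothetical scaling decisions (not the real librte_power names). */
enum sketch_action {
	SKETCH_SCALE_MIN,
	SKETCH_SCALE_MED,
	SKETCH_SCALE_MAX
};

/* Hypothetical policy: two packets-per-second thresholds. */
struct sketch_policy {
	uint32_t avg_pps_threshold; /* above this, scale to a medium frequency */
	uint32_t max_pps_threshold; /* above this, scale to the max frequency */
};

/* Pick a scaling action for the observed packet rate. */
static enum sketch_action
sketch_decide(const struct sketch_policy *p, uint32_t pps)
{
	if (pps > p->max_pps_threshold)
		return SKETCH_SCALE_MAX;
	if (pps > p->avg_pps_threshold)
		return SKETCH_SCALE_MED;
	return SKETCH_SCALE_MIN;
}
```

Because the host evaluates the thresholds itself, no per-request round trip
from the guest is needed once the policy has been sent.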

The next addition adds functionality to convert the virtual CPU (vcpu) IDs to
physical CPU (pcpu) IDs so that the host can scale up/down the cores used
in the guest.
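
The vcpu to pcpu conversion can be pictured with the sketch below. It
assumes each vcpu has a 64-bit affinity mask with one bit per pcpu, and it
picks the lowest set bit as the pcpu to control. The function names and
the array layout are hypothetical and only meant to show the idea; the
example app obtains the masks through libvirt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the lowest-numbered physical CPU set in a 64-bit affinity mask,
 * or -1 if the mask is empty. For a vcpu pinned to exactly one pcpu this
 * is simply that pcpu.
 */
static int
pcpu_from_mask(uint64_t mask)
{
	int pcpu;

	for (pcpu = 0; pcpu < 64; pcpu++) {
		if (mask & (1ULL << pcpu))
			return pcpu;
	}
	return -1;
}

/*
 * Translate a list of vcpu IDs into pcpu IDs using per-vcpu affinity masks.
 * vcpu_masks[v] is the affinity mask of vcpu v (hypothetical layout).
 * Unmappable vcpus get -1. Returns the number of vcpus that were mapped.
 */
static unsigned int
map_vcpus_to_pcpus(const uint64_t *vcpu_masks, size_t n_masks,
		const uint8_t *vcpus, size_t n_vcpus, int *pcpus_out)
{
	unsigned int mapped = 0;
	size_t i;

	assert(vcpu_masks != NULL && vcpus != NULL && pcpus_out != NULL);

	for (i = 0; i < n_vcpus; i++) {
		/* Out-of-range vcpu IDs are reported as unmapped. */
		pcpus_out[i] = vcpus[i] < n_masks ?
				pcpu_from_mask(vcpu_masks[vcpus[i]]) : -1;
		if (pcpus_out[i] >= 0)
			mapped++;
	}
	return mapped;
}
```

With the pcpu IDs in hand, the host can apply frequency changes to exactly
the cores that back the guest's vcpus.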

The remaining patches are functionality to process the policy, and take action
when the relevant trigger occurs to cause a frequency change.
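
For the traffic-based trigger, the decision can be sketched as follows:
estimate a packet rate from two counter samples and the elapsed TSC
cycles, then compare it with the two thresholds from the policy. This is
a simplified standalone illustration. The struct and function names are
hypothetical, and only the threshold field names are borrowed from the
policy structure in this series.

```c
#include <assert.h>
#include <stdint.h>

enum freq_level { FREQ_MIN, FREQ_MED, FREQ_MAX };

/* Hypothetical thresholds, in packets per second. */
struct traffic_thresholds {
	uint32_t avg_max_packet_thresh; /* at or above: medium frequency */
	uint32_t max_max_packet_thresh; /* at or above: maximum frequency */
};

/* Packets per second from two counter samples and elapsed TSC cycles. */
static uint64_t
packets_per_second(uint64_t pkts_now, uint64_t pkts_prev,
		uint64_t tsc_delta, uint64_t tsc_hz)
{
	assert(tsc_delta > 0);
	return (uint64_t)((double)(pkts_now - pkts_prev) *
			((double)tsc_hz / (double)tsc_delta));
}

/* Pick the frequency level that matches the measured packet rate. */
static enum freq_level
select_freq_level(uint64_t pps, const struct traffic_thresholds *t)
{
	if (pps >= t->max_max_packet_thresh)
		return FREQ_MAX;
	if (pps >= t->avg_max_packet_thresh)
		return FREQ_MED;
	return FREQ_MIN;
}
```

The host would run this check once per sampling period for each enabled
policy and scale the mapped cores accordingly.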

[1/9] net/i40e: add API to convert VF MAC to VF id
[2/9] lib/librte_power: add extra msg type for policies
[3/9] examples/vm_power_mgr: add vcpu to pcpu mapping
[4/9] examples/vm_power_mgr: add scale to medium freq fn
[5/9] examples/vm_power_mgr: add policy to channels
[6/9] examples/vm_power_mgr: add port initialisation
[7/9] power: add send channel msg function to map file
[8/9] examples/guest_cli: add send policy to host
[9/9] examples/vm_power_mgr: set MAC address of VF

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v3 1/9] net/i40e: add API to convert VF MAC to VF id
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
@ 2017-10-03 14:08         ` David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 2/9] lib/librte_power: add extra msg type for policies David Hunt
                           ` (7 subsequent siblings)
  8 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, Sexton, Rory,
	Nemanja Marjanovic, David Hunt

From: "Sexton, Rory" <rory.sexton@intel.com>

Add a way to map a VF MAC address to its VF index on the host so that the PF
can be queried for the statistics that drive the frequency changes in the
vm_power_manager app. This is used when profiles are passed down from the
guest to the host, allowing the host to map the VFs to PFs.

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 drivers/net/i40e/rte_pmd_i40e.c           | 31 +++++++++++++++++++++++++++++++
 drivers/net/i40e/rte_pmd_i40e.h           | 13 +++++++++++++
 drivers/net/i40e/rte_pmd_i40e_version.map |  7 +++++++
 3 files changed, 51 insertions(+)

diff --git a/drivers/net/i40e/rte_pmd_i40e.c b/drivers/net/i40e/rte_pmd_i40e.c
index f12b7f4..21efb2f 100644
--- a/drivers/net/i40e/rte_pmd_i40e.c
+++ b/drivers/net/i40e/rte_pmd_i40e.c
@@ -2115,3 +2115,34 @@ int rte_pmd_i40e_ptype_mapping_replace(uint8_t port,
 
 	return 0;
 }
+
+uint64_t
+rte_pmd_i40e_query_vfid_by_mac(uint8_t port, uint64_t vf_mac)
+{
+	struct rte_eth_dev *dev;
+	struct ether_addr *vf_mac_addr = (struct ether_addr *)&vf_mac;
+	struct ether_addr *mac;
+	struct i40e_pf *pf;
+	int vf_id;
+	struct i40e_pf_vf *vf;
+	uint16_t vf_num;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+	dev = &rte_eth_devices[port];
+
+	if (!is_i40e_supported(dev))
+		return -ENOTSUP;
+
+	pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+	vf_num = pf->vf_num;
+
+	for (vf_id = 0; vf_id < vf_num; vf_id++) {
+		vf = &pf->vfs[vf_id];
+		mac = &vf->mac_addr;
+
+		if (is_same_ether_addr(mac, vf_mac_addr))
+			return vf_id;
+	}
+
+	return -EINVAL;
+}
diff --git a/drivers/net/i40e/rte_pmd_i40e.h b/drivers/net/i40e/rte_pmd_i40e.h
index 356fa89..a7ae0f0 100644
--- a/drivers/net/i40e/rte_pmd_i40e.h
+++ b/drivers/net/i40e/rte_pmd_i40e.h
@@ -637,4 +637,17 @@ int rte_pmd_i40e_ptype_mapping_replace(uint8_t port,
 				       uint8_t mask,
 				       uint32_t pkt_type);
 
+/**
+ * On the PF, find VF index based on VF MAC address
+ *
+ * @param port
+ *    port identifier of the device
+ * @param vf_mac
+ *    the mac address of the vf to determine index of
+ * @return
+ *    - (-EINVAL) if the vf mac does not exist on this port
+ *    - otherwise, the index of the vf in pf->vfs
+ */
+uint64_t rte_pmd_i40e_query_vfid_by_mac(uint8_t port, uint64_t vf_mac);
+
 #endif /* _PMD_I40E_H_ */
diff --git a/drivers/net/i40e/rte_pmd_i40e_version.map b/drivers/net/i40e/rte_pmd_i40e_version.map
index 20cc980..d8b74bd 100644
--- a/drivers/net/i40e/rte_pmd_i40e_version.map
+++ b/drivers/net/i40e/rte_pmd_i40e_version.map
@@ -45,3 +45,10 @@ DPDK_17.08 {
 	rte_pmd_i40e_get_ddp_info;
 
 } DPDK_17.05;
+
+DPDK_17.11 {
+	global:
+
+	rte_pmd_i40e_query_vfid_by_mac;
+
+} DPDK_17.08;
-- 
2.7.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v3 2/9] lib/librte_power: add extra msg type for policies
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 1/9] net/i40e: add API to convert VF MAC to VF id David Hunt
@ 2017-10-03 14:08         ` David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 3/9] examples/vm_power_mgr: add vcpu to pcpu mapping David Hunt
                           ` (6 subsequent siblings)
  8 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, David Hunt, Nemanja Marjanovic,
	Rory Sexton

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_power/channel_commands.h | 52 +++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
index 484085b..1599706 100644
--- a/lib/librte_power/channel_commands.h
+++ b/lib/librte_power/channel_commands.h
@@ -46,6 +46,7 @@ extern "C" {
 /* Valid Commands */
 #define CPU_POWER               1
 #define CPU_POWER_CONNECT       2
+#define PKT_POLICY              3
 
 /* CPU Power Command Scaling */
 #define CPU_POWER_SCALE_UP      1
@@ -54,11 +55,62 @@ extern "C" {
 #define CPU_POWER_SCALE_MIN     4
 #define CPU_POWER_ENABLE_TURBO  5
 #define CPU_POWER_DISABLE_TURBO 6
+#define HOURS 24
+
+#ifdef RTE_LIBRTE_I40E_PMD
+#define MAX_VFS 10
+#endif
+
+#define MAX_VCPU_PER_VM         8
+
+typedef enum {false, true} bool;
+
+struct t_boost_status {
+	bool tbEnabled;
+};
+
+struct timer_profile {
+	int busy_hours[HOURS];
+	int quiet_hours[HOURS];
+#ifdef RTE_LIBRTE_I40E_PMD
+	int hours_to_use_traffic_profile[HOURS];
+#endif
+};
+
+enum workload {HIGH, MEDIUM, LOW};
+enum policy_to_use {
+#ifdef RTE_LIBRTE_I40E_PMD
+	TRAFFIC,
+#endif
+	TIME,
+	WORKLOAD
+};
+
+#ifdef RTE_LIBRTE_I40E_PMD
+struct traffic {
+	uint32_t min_packet_thresh;
+	uint32_t avg_max_packet_thresh;
+	uint32_t max_max_packet_thresh;
+};
+#endif
 
 struct channel_packet {
 	uint64_t resource_id; /**< core_num, device */
 	uint32_t unit;        /**< scale down/up/min/max */
 	uint32_t command;     /**< Power, IO, etc */
+	char vm_name[32];
+
+#ifdef RTE_LIBRTE_I40E_PMD
+	uint64_t vfid[MAX_VFS];
+	int nb_mac_to_monitor;
+	struct traffic traffic_policy;
+#endif
+	uint8_t vcpu_to_control[MAX_VCPU_PER_VM];
+	uint8_t num_vcpu;
+	struct timer_profile timer_policy;
+	enum workload workload;
+	enum policy_to_use policy_to_use;
+	struct t_boost_status t_boost_status;
 };
 
 
-- 
2.7.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v3 3/9] examples/vm_power_mgr: add vcpu to pcpu mapping
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 1/9] net/i40e: add API to convert VF MAC to VF id David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 2/9] lib/librte_power: add extra msg type for policies David Hunt
@ 2017-10-03 14:08         ` David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 4/9] examples/vm_power_mgr: add scale to medium freq fn David Hunt
                           ` (5 subsequent siblings)
  8 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, David Hunt, Nemanja Marjanovic,
	Rory Sexton

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/channel_manager.c | 62 +++++++++++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h | 25 ++++++++++++
 2 files changed, 87 insertions(+)

diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
index e068ae2..03fa626 100644
--- a/examples/vm_power_manager/channel_manager.c
+++ b/examples/vm_power_manager/channel_manager.c
@@ -574,6 +574,68 @@ set_channel_status(const char *vm_name, unsigned *channel_list,
 	return num_channels_changed;
 }
 
+void
+get_all_vm(int *num_vm, int *num_cpu)
+{
+
+	virNodeInfo node_info;
+	virDomainPtr *domptr;
+	uint64_t mask;
+	int i, ii, numVcpus[MAX_VCPUS], cpu, n_vcpus;
+	unsigned int jj;
+	const char *vm_name;
+	unsigned int flags = VIR_CONNECT_LIST_DOMAINS_RUNNING |
+				VIR_CONNECT_LIST_DOMAINS_PERSISTENT;
+	unsigned int flag = VIR_DOMAIN_VCPU_CONFIG;
+
+
+	memset(global_cpumaps, 0, CHANNEL_CMDS_MAX_CPUS*global_maplen);
+	if (virNodeGetInfo(global_vir_conn_ptr, &node_info))
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to retrieve node Info\n");
+
+	/* Returns number of pcpus */
+	global_n_host_cpus = (unsigned int)node_info.cpus;
+
+	/* Returns number of active domains */
+	*num_vm = virConnectListAllDomains(global_vir_conn_ptr, &domptr, flags);
+	if (*num_vm <= 0)
+		RTE_LOG(ERR, CHANNEL_MANAGER, "No Active Domains Running\n");
+
+	for (i = 0; i < *num_vm; i++) {
+
+		/* Get Domain Names */
+		vm_name = virDomainGetName(domptr[i]);
+		lvm_info[i].vm_name = vm_name;
+
+		/* Get Number of Vcpus */
+		numVcpus[i] = virDomainGetVcpusFlags(domptr[i], flag);
+
+		/* Get Number of VCpus & VcpuPinInfo */
+		n_vcpus = virDomainGetVcpuPinInfo(domptr[i],
+				numVcpus[i], global_cpumaps,
+				global_maplen, flag);
+
+		if ((int)n_vcpus > 0) {
+			*num_cpu = n_vcpus;
+			lvm_info[i].num_cpus = n_vcpus;
+		}
+
+		/* Save pcpu in use by libvirt VMs */
+		for (ii = 0; ii < n_vcpus; ii++) {
+			mask = 0;
+			for (jj = 0; jj < global_n_host_cpus; jj++) {
+				if (VIR_CPU_USABLE(global_cpumaps,
+						global_maplen, ii, jj) > 0) {
+					mask |= 1ULL << jj;
+				}
+			}
+			ITERATIVE_BITMASK_CHECK_64(mask, cpu) {
+				lvm_info[i].pcpus[ii] = cpu;
+			}
+		}
+	}
+}
+
 int
 get_info_vm(const char *vm_name, struct vm_info *info)
 {
diff --git a/examples/vm_power_manager/channel_manager.h b/examples/vm_power_manager/channel_manager.h
index 47c3b9c..788c1e6 100644
--- a/examples/vm_power_manager/channel_manager.h
+++ b/examples/vm_power_manager/channel_manager.h
@@ -66,6 +66,17 @@ struct sockaddr_un _sockaddr_un;
 #define UNIX_PATH_MAX sizeof(_sockaddr_un.sun_path)
 #endif
 
+#define MAX_VMS 4
+#define MAX_VCPUS 20
+
+
+struct libvirt_vm_info {
+	const char *vm_name;
+	unsigned int pcpus[MAX_VCPUS];
+	uint8_t num_cpus;
+};
+
+struct libvirt_vm_info lvm_info[MAX_VMS];
 /* Communication Channel Status */
 enum channel_status { CHANNEL_MGR_CHANNEL_DISCONNECTED = 0,
 	CHANNEL_MGR_CHANNEL_CONNECTED,
@@ -319,6 +330,20 @@ int set_channel_status(const char *vm_name, unsigned *channel_list,
  */
 int get_info_vm(const char *vm_name, struct vm_info *info);
 
+/**
+ * Populates a table with all domains running and their physical cpu.
+ * All information is gathered through libvirt api.
+ *
+ * @param noVms
+ *  modified to store number of active VMs
+ *
+ * @param noVcpus
+ *  modified to store number of vcpus active
+ *
+ * @return
+ *   void
+ */
+void get_all_vm(int *noVms, int *noVcpus);
 #ifdef __cplusplus
 }
 #endif
-- 
2.7.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v3 4/9] examples/vm_power_mgr: add scale to medium freq fn
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
                           ` (2 preceding siblings ...)
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 3/9] examples/vm_power_mgr: add vcpu to pcpu mapping David Hunt
@ 2017-10-03 14:08         ` David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 5/9] examples/vm_power_mgr: add policy to channels David Hunt
                           ` (4 subsequent siblings)
  8 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, David Hunt, Nemanja Marjanovic,
	Rory Sexton

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/power_manager.c | 15 +++++++++++++++
 examples/vm_power_manager/power_manager.h | 13 +++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 80705f9..c021c1d 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -286,3 +286,18 @@ power_manager_disable_turbo_core(unsigned int core_num)
 	POWER_SCALE_CORE(disable_turbo, core_num, ret);
 	return ret;
 }
+
+int
+power_manager_scale_core_med(unsigned int core_num)
+{
+	int ret = 0;
+
+	if (core_num >= POWER_MGR_MAX_CPUS)
+		return -1;
+	if (!(global_enabled_cpus & (1ULL << core_num)))
+		return -1;
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
+	ret = rte_power_set_freq(core_num, 5);
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index b74d09b..b52fb4c 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -231,6 +231,19 @@ int power_manager_disable_turbo_core(unsigned int core_num);
  */
 uint32_t power_manager_get_current_frequency(unsigned core_num);
 
+/**
+ * Scale to medium frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_med(unsigned int core_num);
 
 #ifdef __cplusplus
 }
-- 
2.7.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v3 5/9] examples/vm_power_mgr: add policy to channels
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
                           ` (3 preceding siblings ...)
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 4/9] examples/vm_power_mgr: add scale to medium freq fn David Hunt
@ 2017-10-03 14:08         ` David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 6/9] examples/vm_power_mgr: add port initialisation David Hunt
                           ` (3 subsequent siblings)
  8 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, Sexton, Rory,
	Nemanja Marjanovic, David Hunt

From: "Sexton, Rory" <rory.sexton@intel.com>

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/Makefile          |  16 ++
 examples/vm_power_manager/channel_monitor.c | 340 +++++++++++++++++++++++++++-
 examples/vm_power_manager/channel_monitor.h |  20 ++
 3 files changed, 370 insertions(+), 6 deletions(-)

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index 59a9641..9cf20a2 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -54,6 +54,22 @@ CFLAGS += $(WERROR_FLAGS)
 
 LDLIBS += -lvirt
 
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+
+ifeq ($(CONFIG_RTE_LIBRTE_IXGBE_PMD),y)
+LDLIBS += -lrte_pmd_ixgbe
+endif
+
+ifeq ($(CONFIG_RTE_LIBRTE_I40E_PMD),y)
+LDLIBS += -lrte_pmd_i40e
+endif
+
+ifeq ($(CONFIG_RTE_LIBRTE_BNXT_PMD),y)
+LDLIBS += -lrte_pmd_bnxt
+endif
+
+endif
+
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
 ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index ac40dac..7db98ad 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -41,13 +41,20 @@
 #include <sys/types.h>
 #include <sys/epoll.h>
 #include <sys/queue.h>
+#include <sys/time.h>
 
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_malloc.h>
 #include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_ethdev.h>
 
+#ifdef RTE_LIBRTE_I40E_PMD
+#include <rte_pmd_i40e.h>
+#endif
 
+#include <libvirt/libvirt.h>
 #include "channel_monitor.h"
 #include "channel_commands.h"
 #include "channel_manager.h"
@@ -57,10 +64,17 @@
 
 #define MAX_EVENTS 256
 
+#ifdef RTE_LIBRTE_I40E_PMD
+uint64_t vsi_pkt_count_prev[384];
+uint64_t rdtsc_prev[384];
+#endif
 
+double time_period_s = 1;
 static volatile unsigned run_loop = 1;
 static int global_event_fd;
+static unsigned int policy_is_set;
 static struct epoll_event *global_events_list;
+static struct policy policies[MAX_VMS];
 
 void channel_monitor_exit(void)
 {
@@ -68,6 +82,302 @@ void channel_monitor_exit(void)
 	rte_free(global_events_list);
 }
 
+static void
+core_share(int pNo, int z, int x, int t)
+{
+	if (policies[pNo].core_share[z].pcpu == lvm_info[x].pcpus[t]) {
+		if (strcmp(policies[pNo].pkt.vm_name,
+				lvm_info[x].vm_name) != 0) {
+			policies[pNo].core_share[z].status = 1;
+			power_manager_scale_core_max(
+					policies[pNo].core_share[z].pcpu);
+		}
+	}
+}
+
+static void
+core_share_status(int pNo)
+{
+
+	int noVms, noVcpus, z, x, t;
+
+	get_all_vm(&noVms, &noVcpus);
+
+	/* Reset Core Share Status. */
+	for (z = 0; z < noVcpus; z++)
+		policies[pNo].core_share[z].status = 0;
+
+	/* Foreach vcpu in a policy. */
+	for (z = 0; z < policies[pNo].pkt.num_vcpu; z++) {
+		/* Foreach VM on the platform. */
+		for (x = 0; x < noVms; x++) {
+			/* Foreach vcpu of VMs on platform. */
+			for (t = 0; t < lvm_info[x].num_cpus; t++)
+				core_share(pNo, z, x, t);
+		}
+	}
+}
+
+static void
+get_pcpu_to_control(struct policy *pol)
+{
+
+	/* Convert vcpu to pcpu. */
+	struct vm_info info;
+	int pcpu, count;
+	uint64_t mask_u64b;
+
+	RTE_LOG(INFO, CHANNEL_MONITOR, "Looking for pcpu for %s\n",
+			pol->pkt.vm_name);
+	get_info_vm(pol->pkt.vm_name, &info);
+
+	for (count = 0; count < pol->pkt.num_vcpu; count++) {
+		mask_u64b = info.pcpu_mask[pol->pkt.vcpu_to_control[count]];
+		for (pcpu = 0; mask_u64b; mask_u64b &= ~(1ULL << pcpu++)) {
+			if ((mask_u64b >> pcpu) & 1)
+				pol->core_share[count].pcpu = pcpu;
+		}
+	}
+}
+
+#ifdef RTE_LIBRTE_I40E_PMD
+static int
+get_pfid(struct policy *pol)
+{
+
+	int i, x, ret = 0, nb_ports;
+
+	nb_ports = rte_eth_dev_count();
+	for (i = 0; i < pol->pkt.nb_mac_to_monitor; i++) {
+
+		for (x = 0; x < nb_ports; x++) {
+			ret = rte_pmd_i40e_query_vfid_by_mac(x,
+					pol->pkt.vfid[i]);
+			if (ret != -EINVAL) {
+				pol->port[i] = x;
+				break;
+			}
+		}
+		if (ret == -EINVAL) {
+			RTE_LOG(INFO, CHANNEL_MONITOR,
+				"Error with Policy. MAC not found on "
+				"attached ports ");
+			pol->enabled = 0;
+			return ret;
+		}
+		pol->pfid[i] = ret;
+	}
+	return 1;
+}
+#endif
+
+static int
+update_policy(struct channel_packet *pkt)
+{
+
+	unsigned int updated = 0;
+
+	for (int i = 0; i < MAX_VMS; i++) {
+		if (strcmp(policies[i].pkt.vm_name, pkt->vm_name) == 0) {
+			policies[i].pkt = *pkt;
+			get_pcpu_to_control(&policies[i]);
+#ifdef RTE_LIBRTE_I40E_PMD
+			if (get_pfid(&policies[i]) == -1) {
+				updated = 1;
+				break;
+			}
+#endif
+			core_share_status(i);
+			policies[i].enabled = 1;
+			updated = 1;
+		}
+	}
+	if (!updated) {
+		for (int i = 0; i < MAX_VMS; i++) {
+			if (policies[i].enabled == 0) {
+				policies[i].pkt = *pkt;
+				get_pcpu_to_control(&policies[i]);
+#ifdef RTE_LIBRTE_I40E_PMD
+				if (get_pfid(&policies[i]) == -1)
+					break;
+#endif
+				core_share_status(i);
+				policies[i].enabled = 1;
+				break;
+			}
+		}
+	}
+	return 0;
+}
+
+#ifdef RTE_LIBRTE_I40E_PMD
+static uint64_t
+get_pkt_diff(struct policy *pol)
+{
+
+	uint64_t vsi_pkt_count,
+		vsi_pkt_total = 0,
+		vsi_pkt_count_prev_total = 0;
+	double rdtsc_curr, rdtsc_diff, diff;
+	int x;
+	struct rte_eth_stats vf_stats;
+
+	for (x = 0; x < pol->pkt.nb_mac_to_monitor; x++) {
+
+		/*Read vsi stats*/
+		if (rte_pmd_i40e_get_vf_stats(x, pol->pfid[x], &vf_stats) == 0)
+			vsi_pkt_count = vf_stats.ipackets;
+		else
+			vsi_pkt_count = -1;
+
+		vsi_pkt_total += vsi_pkt_count;
+
+		vsi_pkt_count_prev_total += vsi_pkt_count_prev[pol->pfid[x]];
+		vsi_pkt_count_prev[pol->pfid[x]] = vsi_pkt_count;
+	}
+
+	rdtsc_curr = rte_rdtsc_precise();
+	rdtsc_diff = rdtsc_curr - rdtsc_prev[pol->pfid[x-1]];
+	rdtsc_prev[pol->pfid[x-1]] = rdtsc_curr;
+
+	diff = (vsi_pkt_total - vsi_pkt_count_prev_total) *
+			((double)rte_get_tsc_hz() / rdtsc_diff);
+
+	return diff;
+}
+
+static void
+apply_traffic_profile(struct policy *pol)
+{
+
+	int count;
+	uint64_t diff = 0;
+
+	diff = get_pkt_diff(pol);
+
+	RTE_LOG(INFO, CHANNEL_MONITOR, "Applying traffic profile\n");
+
+	if (diff >= (pol->pkt.traffic_policy.max_max_packet_thresh)) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_max(
+						pol->core_share[count].pcpu);
+		}
+	} else if (diff >= (pol->pkt.traffic_policy.avg_max_packet_thresh)) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_med(
+						pol->core_share[count].pcpu);
+		}
+	} else if (diff < (pol->pkt.traffic_policy.avg_max_packet_thresh)) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_min(
+						pol->core_share[count].pcpu);
+		}
+	}
+}
+#endif
+
+static void
+apply_time_profile(struct policy *pol)
+{
+
+	int count, x;
+	struct timeval tv;
+	struct tm *ptm;
+	char time_string[40];
+
+	/* Obtain the time of day, and convert it to a tm struct. */
+	gettimeofday(&tv, NULL);
+	ptm = localtime(&tv.tv_sec);
+	/* Format the date and time, down to a single second. */
+	strftime(time_string, sizeof(time_string), "%Y-%m-%d %H:%M:%S", ptm);
+
+	for (x = 0; x < HOURS; x++) {
+
+		if (ptm->tm_hour == pol->pkt.timer_policy.busy_hours[x]) {
+			for (count = 0; count < pol->pkt.num_vcpu; count++) {
+				if (pol->core_share[count].status != 1) {
+					power_manager_scale_core_max(
+						pol->core_share[count].pcpu);
+				RTE_LOG(INFO, CHANNEL_MONITOR,
+					"Scaling up core %d to max\n",
+					pol->core_share[count].pcpu);
+				}
+			}
+			break;
+		} else if (ptm->tm_hour ==
+				pol->pkt.timer_policy.quiet_hours[x]) {
+			for (count = 0; count < pol->pkt.num_vcpu; count++) {
+				if (pol->core_share[count].status != 1) {
+					power_manager_scale_core_min(
+						pol->core_share[count].pcpu);
+				RTE_LOG(INFO, CHANNEL_MONITOR,
+					"Scaling down core %d to min\n",
+					pol->core_share[count].pcpu);
+			}
+		}
+			break;
+#ifdef RTE_LIBRTE_I40E_PMD
+		} else if (ptm->tm_hour ==
+			pol->pkt.timer_policy.hours_to_use_traffic_profile[x]) {
+			apply_traffic_profile(pol);
+			break;
+		}
+#else
+	}
+#endif
+	}
+}
+
+static void
+apply_workload_profile(struct policy *pol)
+{
+
+	int count;
+
+	if (pol->pkt.workload == HIGH) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_max(
+						pol->core_share[count].pcpu);
+		}
+	} else if (pol->pkt.workload == MEDIUM) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_med(
+						pol->core_share[count].pcpu);
+		}
+	} else if (pol->pkt.workload == LOW) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_min(
+						pol->core_share[count].pcpu);
+		}
+	}
+}
+
+static void
+apply_policy(struct policy *pol)
+{
+
+	struct channel_packet *pkt = &pol->pkt;
+
+	/*Check policy to use*/
+#ifdef RTE_LIBRTE_I40E_PMD
+	if (pkt->policy_to_use == TRAFFIC)
+		apply_traffic_profile(pol);
+	else if (pkt->policy_to_use == TIME)
+#else
+	if (pkt->policy_to_use == TIME)
+#endif
+		apply_time_profile(pol);
+	else if (pkt->policy_to_use == WORKLOAD)
+		apply_workload_profile(pol);
+}
+
+
 static int
 process_request(struct channel_packet *pkt, struct channel_info *chan_info)
 {
@@ -140,6 +450,13 @@ process_request(struct channel_packet *pkt, struct channel_info *chan_info)
 
 		}
 	}
+
+	if (pkt->command == PKT_POLICY) {
+		RTE_LOG(INFO, CHANNEL_MONITOR, "\nProcessing Policy request from Guest\n");
+		update_policy(pkt);
+		policy_is_set = 1;
+	}
+
 	/* Return is not checked as channel status may have been set to DISABLED
 	 * from management thread
 	 */
@@ -209,9 +526,10 @@ run_channel_monitor(void)
 			struct channel_info *chan_info = (struct channel_info *)
 					global_events_list[i].data.ptr;
 			if ((global_events_list[i].events & EPOLLERR) ||
-					(global_events_list[i].events & EPOLLHUP)) {
+				(global_events_list[i].events & EPOLLHUP)) {
 				RTE_LOG(DEBUG, CHANNEL_MONITOR, "Remote closed connection for "
-						"channel '%s'\n", chan_info->channel_path);
+						"channel '%s'\n",
+						chan_info->channel_path);
 				remove_channel(&chan_info);
 				continue;
 			}
@@ -223,14 +541,17 @@ run_channel_monitor(void)
 				int buffer_len = sizeof(pkt);
 
 				while (buffer_len > 0) {
-					n_bytes = read(chan_info->fd, buffer, buffer_len);
+					n_bytes = read(chan_info->fd,
+							buffer, buffer_len);
 					if (n_bytes == buffer_len)
 						break;
 					if (n_bytes == -1) {
 						err = errno;
-						RTE_LOG(DEBUG, CHANNEL_MONITOR, "Received error on "
-								"channel '%s' read: %s\n",
-								chan_info->channel_path, strerror(err));
+						RTE_LOG(DEBUG, CHANNEL_MONITOR,
+							"Received error on "
+							"channel '%s' read: %s\n",
+							chan_info->channel_path,
+							strerror(err));
 						remove_channel(&chan_info);
 						break;
 					}
@@ -241,5 +562,12 @@ run_channel_monitor(void)
 					process_request(&pkt, chan_info);
 			}
 		}
+		rte_delay_us(time_period_s*1000000);
+		if (policy_is_set) {
+			for (int j = 0; j < MAX_VMS; j++) {
+				if (policies[j].enabled == 1)
+					apply_policy(&policies[j]);
+			}
+		}
 	}
 }
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
index c138607..11f5f75 100644
--- a/examples/vm_power_manager/channel_monitor.h
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -35,6 +35,26 @@
 #define CHANNEL_MONITOR_H_
 
 #include "channel_manager.h"
+#include "channel_commands.h"
+
+struct core_share {
+	unsigned int pcpu;
+	/*
+	 * 1 CORE SHARE
+	 * 0 NOT SHARED
+	 */
+	int status;
+};
+
+struct policy {
+	struct channel_packet pkt;
+#ifdef RTE_LIBRTE_I40E_PMD
+	uint32_t pfid[MAX_VFS];
+	uint32_t port[MAX_VFS];
+#endif
+	unsigned int enabled;
+	struct core_share core_share[MAX_VCPU_PER_VM];
+};
 
 #ifdef __cplusplus
 extern "C" {
-- 
2.7.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v3 6/9] examples/vm_power_mgr: add port initialisation
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
                           ` (4 preceding siblings ...)
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 5/9] examples/vm_power_mgr: add policy to channels David Hunt
@ 2017-10-03 14:08         ` David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 7/9] power: add send channel msg function to map file David Hunt
                           ` (2 subsequent siblings)
  8 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, jingjing.wu, David Hunt, Nemanja Marjanovic

We need to initialise the ports we're monitoring to be able to see
the throughput.

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/main.c | 220 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 220 insertions(+)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index c33fcc9..698abca 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -49,6 +49,9 @@
 #include <rte_log.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
+#include <rte_ethdev.h>
+#include <getopt.h>
+#include <rte_cycles.h>
 #include <rte_debug.h>
 
 #include "channel_manager.h"
@@ -56,6 +59,192 @@
 #include "power_manager.h"
 #include "vm_power_cli.h"
 
+#define RX_RING_SIZE 512
+#define TX_RING_SIZE 512
+
+#define NUM_MBUFS 8191
+#define MBUF_CACHE_SIZE 250
+#define BURST_SIZE 32
+
+static uint32_t enabled_port_mask;
+static volatile bool force_quit;
+
+/****************/
+static const struct rte_eth_conf port_conf_default = {
+	.rxmode = { .max_rx_pkt_len = ETHER_MAX_LEN }
+};
+
+static inline int
+port_init(uint8_t port, struct rte_mempool *mbuf_pool)
+{
+	struct rte_eth_conf port_conf = port_conf_default;
+	const uint16_t rx_rings = 1, tx_rings = 1;
+	int retval;
+	uint16_t q;
+
+	if (port >= rte_eth_dev_count())
+		return -1;
+
+	/* Configure the Ethernet device. */
+	retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
+	if (retval != 0)
+		return retval;
+
+	/* Allocate and set up 1 RX queue per Ethernet port. */
+	for (q = 0; q < rx_rings; q++) {
+		retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
+				rte_eth_dev_socket_id(port), NULL, mbuf_pool);
+		if (retval < 0)
+			return retval;
+	}
+
+	/* Allocate and set up 1 TX queue per Ethernet port. */
+	for (q = 0; q < tx_rings; q++) {
+		retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
+				rte_eth_dev_socket_id(port), NULL);
+		if (retval < 0)
+			return retval;
+	}
+
+	/* Start the Ethernet port. */
+	retval = rte_eth_dev_start(port);
+	if (retval < 0)
+		return retval;
+
+	/* Display the port MAC address. */
+	struct ether_addr addr;
+	rte_eth_macaddr_get(port, &addr);
+	printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8
+			   " %02" PRIx8 " %02" PRIx8 " %02" PRIx8 "\n",
+			(unsigned int)port,
+			addr.addr_bytes[0], addr.addr_bytes[1],
+			addr.addr_bytes[2], addr.addr_bytes[3],
+			addr.addr_bytes[4], addr.addr_bytes[5]);
+
+	/* Enable RX in promiscuous mode for the Ethernet device. */
+	rte_eth_promiscuous_enable(port);
+
+
+	return 0;
+}
+
+static int
+parse_portmask(const char *portmask)
+{
+	char *end = NULL;
+	unsigned long pm;
+
+	/* parse hexadecimal string */
+	pm = strtoul(portmask, &end, 16);
+	if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return -1;
+
+	if (pm == 0)
+		return -1;
+
+	return pm;
+}
+/* Parse the argument given in the command line of the application */
+static int
+parse_args(int argc, char **argv)
+{
+	int opt, ret;
+	char **argvopt;
+	int option_index;
+	char *prgname = argv[0];
+	static struct option lgopts[] = {
+		{ "mac-updating", no_argument, 0, 1},
+		{ "no-mac-updating", no_argument, 0, 0},
+		{NULL, 0, 0, 0}
+	};
+	argvopt = argv;
+
+	while ((opt = getopt_long(argc, argvopt, "p:q:T:",
+				  lgopts, &option_index)) != EOF) {
+
+		switch (opt) {
+		/* portmask */
+		case 'p':
+			enabled_port_mask = parse_portmask(optarg);
+			if (enabled_port_mask == 0) {
+				printf("invalid portmask\n");
+				return -1;
+			}
+			break;
+		/* long options */
+		case 0:
+			break;
+
+		default:
+			return -1;
+		}
+	}
+
+	if (optind >= 0)
+		argv[optind-1] = prgname;
+
+	ret = optind-1;
+	optind = 0; /* reset getopt lib */
+	return ret;
+}
+
+static void
+check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+	uint8_t portid, count, all_ports_up, print_flag = 0;
+	struct rte_eth_link link;
+
+	printf("\nChecking link status");
+	fflush(stdout);
+	for (count = 0; count <= MAX_CHECK_TIME; count++) {
+		if (force_quit)
+			return;
+		all_ports_up = 1;
+		for (portid = 0; portid < port_num; portid++) {
+			if (force_quit)
+				return;
+			if ((port_mask & (1 << portid)) == 0)
+				continue;
+			memset(&link, 0, sizeof(link));
+			rte_eth_link_get_nowait(portid, &link);
+			/* print link status if flag set */
+			if (print_flag == 1) {
+				if (link.link_status)
+					printf("Port %d Link Up - speed %u "
+						"Mbps - %s\n", (uint8_t)portid,
+						(unsigned int)link.link_speed,
+				(link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+					("full-duplex") : ("half-duplex"));
+				else
+					printf("Port %d Link Down\n",
+						(uint8_t)portid);
+				continue;
+			}
+		       /* clear all_ports_up flag if any link down */
+			if (link.link_status == ETH_LINK_DOWN) {
+				all_ports_up = 0;
+				break;
+			}
+		}
+		/* after finally printing all link status, get out */
+		if (print_flag == 1)
+			break;
+
+		if (all_ports_up == 0) {
+			printf(".");
+			fflush(stdout);
+			rte_delay_ms(CHECK_INTERVAL);
+		}
+
+		/* set the print_flag if all ports up or timeout */
+		if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+			print_flag = 1;
+			printf("done\n");
+		}
+	}
+}
 static int
 run_monitor(__attribute__((unused)) void *arg)
 {
@@ -82,6 +271,10 @@ main(int argc, char **argv)
 {
 	int ret;
 	unsigned lcore_id;
+	unsigned int nb_ports;
+	struct rte_mempool *mbuf_pool;
+	uint8_t portid;
+
 
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
@@ -90,12 +283,39 @@ main(int argc, char **argv)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
+	argc -= ret;
+	argv += ret;
+
+	/* parse application arguments (after the EAL ones) */
+	ret = parse_args(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid arguments\n");
+
+	nb_ports = rte_eth_dev_count();
+
+	mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
+		MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+
+	if (mbuf_pool == NULL)
+		rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+
+	/* Initialize ports. */
+	for (portid = 0; portid < nb_ports; portid++) {
+		if ((enabled_port_mask & (1 << portid)) == 0)
+			continue;
+		if (port_init(portid, mbuf_pool) != 0)
+			rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
+					portid);
+	}
+
 	lcore_id = rte_get_next_lcore(-1, 1, 0);
 	if (lcore_id == RTE_MAX_LCORE) {
 		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
 				"application\n");
 		return 0;
 	}
+
+	check_all_ports_link_status(nb_ports, enabled_port_mask);
 	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
 
 	if (power_manager_init() < 0) {
-- 
2.7.4


* [dpdk-dev] [PATCH v3 7/9] power: add send channel msg function to map file
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
                           ` (5 preceding siblings ...)
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 6/9] examples/vm_power_mgr: add port initialisation David Hunt
@ 2017-10-03 14:08         ` David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 8/9] examples/guest_cli: add send policy to host David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 9/9] examples/vm_power_mgr: set MAC address of VF David Hunt
  8 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, jingjing.wu, David Hunt

Add a new wrapper function, with an rte_power_ prefix, around an existing
private function that has been unused until now.

The plan is to clean up all the header files in the next release so
that only the intended public functions are in the map file and only
the relevant headers have the rte_ prefix so that only they are
included in the documentation.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_power/guest_channel.c       |  7 +++++++
 lib/librte_power/guest_channel.h       | 15 +++++++++++++++
 lib/librte_power/rte_power_version.map |  1 +
 3 files changed, 23 insertions(+)

diff --git a/lib/librte_power/guest_channel.c b/lib/librte_power/guest_channel.c
index 85c92fa..fa5de0f 100644
--- a/lib/librte_power/guest_channel.c
+++ b/lib/librte_power/guest_channel.c
@@ -148,6 +148,13 @@ guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id)
 	return 0;
 }
 
+int rte_power_guest_channel_send_msg(struct channel_packet *pkt,
+			unsigned int lcore_id)
+{
+	return guest_channel_send_msg(pkt, lcore_id);
+}
+
+
 void
 guest_channel_host_disconnect(unsigned lcore_id)
 {
diff --git a/lib/librte_power/guest_channel.h b/lib/librte_power/guest_channel.h
index 9e18af5..741339c 100644
--- a/lib/librte_power/guest_channel.h
+++ b/lib/librte_power/guest_channel.h
@@ -81,6 +81,21 @@ void guest_channel_host_disconnect(unsigned lcore_id);
  */
 int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
 
+/**
+ * Send a message contained in pkt over the Virtio-Serial to the host endpoint.
+ *
+ * @param pkt
+ *  Pointer to a populated struct channel_packet
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_guest_channel_send_msg(struct channel_packet *pkt,
+			unsigned int lcore_id);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_power/rte_power_version.map b/lib/librte_power/rte_power_version.map
index 9ae0627..96dc42e 100644
--- a/lib/librte_power/rte_power_version.map
+++ b/lib/librte_power/rte_power_version.map
@@ -20,6 +20,7 @@ DPDK_2.0 {
 DPDK_17.11 {
 	global:
 
+	rte_power_guest_channel_send_msg;
 	rte_power_freq_disable_turbo;
 	rte_power_freq_enable_turbo;
 	rte_power_turbo_status;
-- 
2.7.4


* [dpdk-dev] [PATCH v3 8/9] examples/guest_cli: add send policy to host
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
                           ` (6 preceding siblings ...)
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 7/9] power: add send channel msg function to map file David Hunt
@ 2017-10-03 14:08         ` David Hunt
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 9/9] examples/vm_power_mgr: set MAC address of VF David Hunt
  8 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, Sexton, Rory,
	Nemanja Marjanovic, David Hunt

From: "Sexton, Rory" <rory.sexton@intel.com>

This adds an example of setting up a policy, and allows the
vm_cli_guest app to send it to the host using the CLI command
"send_policy now".

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../guest_cli/vm_power_cli_guest.c                 | 105 +++++++++++++++++++++
 .../guest_cli/vm_power_cli_guest.h                 |   6 --
 2 files changed, 105 insertions(+), 6 deletions(-)

diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 4e982bd..fe0d77a 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -45,8 +45,10 @@
 #include <cmdline.h>
 #include <rte_log.h>
 #include <rte_lcore.h>
+#include <rte_ethdev.h>
 
 #include <rte_power.h>
+#include <guest_channel.h>
 
 #include "vm_power_cli_guest.h"
 
@@ -139,8 +141,111 @@ cmdline_parse_inst_t cmd_set_cpu_freq_set = {
 	},
 };
 
+struct cmd_send_policy_result {
+	cmdline_fixed_string_t send_policy;
+	cmdline_fixed_string_t cmd;
+};
+
+#ifdef RTE_LIBRTE_I40E_PMD
+union PFID {
+	struct ether_addr addr;
+	uint64_t pfid;
+};
+#endif
+
+static inline int
+send_policy(void)
+{
+	struct channel_packet pkt;
+	int ret;
+
+#ifdef RTE_LIBRTE_I40E_PMD
+	union PFID pfid;
+	/* Use port MAC address as the vfid */
+	rte_eth_macaddr_get(0, &pfid.addr);
+	printf("Port %u MAC: %02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 ":"
+			"%02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 "\n",
+			1,
+			pfid.addr.addr_bytes[0], pfid.addr.addr_bytes[1],
+			pfid.addr.addr_bytes[2], pfid.addr.addr_bytes[3],
+			pfid.addr.addr_bytes[4], pfid.addr.addr_bytes[5]);
+	pkt.vfid[0] = pfid.pfid;
+
+	pkt.nb_mac_to_monitor = 1;
+#endif
+	pkt.t_boost_status.tbEnabled = false;
+
+	pkt.vcpu_to_control[0] = 0;
+	pkt.vcpu_to_control[1] = 1;
+	pkt.num_vcpu = 2;
+	/* Dummy Population. */
+#ifdef RTE_LIBRTE_I40E_PMD
+	pkt.traffic_policy.min_packet_thresh = 96000;
+	pkt.traffic_policy.avg_max_packet_thresh = 1800000;
+	pkt.traffic_policy.max_max_packet_thresh = 2000000;
+#endif
+
+	pkt.timer_policy.busy_hours[0] = 3;
+	pkt.timer_policy.busy_hours[1] = 4;
+	pkt.timer_policy.busy_hours[2] = 5;
+	pkt.timer_policy.quiet_hours[0] = 11;
+	pkt.timer_policy.quiet_hours[1] = 12;
+	pkt.timer_policy.quiet_hours[2] = 13;
+
+#ifdef RTE_LIBRTE_I40E_PMD
+	pkt.timer_policy.hours_to_use_traffic_profile[0] = 8;
+	pkt.timer_policy.hours_to_use_traffic_profile[1] = 10;
+#endif
+
+	pkt.workload = LOW;
+	pkt.policy_to_use = TIME;
+	pkt.command = PKT_POLICY;
+	strcpy(pkt.vm_name, "ubuntu2");
+	ret = rte_power_guest_channel_send_msg(&pkt, 1);
+	if (ret == 0)
+		return 1;
+	RTE_LOG(DEBUG, POWER, "Error sending message: %s\n",
+			ret > 0 ? strerror(ret) : "channel not connected");
+	return -1;
+}
+
+static void
+cmd_send_policy_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_send_policy_result *res = parsed_result;
+
+	if (!strcmp(res->cmd, "now")) {
+		printf("Sending Policy down now!\n");
+		ret = send_policy();
+	}
+	if (ret != 1)
+		cmdline_printf(cl, "Error sending message: %s\n",
+				strerror(ret));
+}
+
+cmdline_parse_token_string_t cmd_send_policy =
+	TOKEN_STRING_INITIALIZER(struct cmd_send_policy_result,
+			send_policy, "send_policy");
+cmdline_parse_token_string_t cmd_send_policy_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_send_policy_result,
+			cmd, "now");
+
+cmdline_parse_inst_t cmd_send_policy_set = {
+	.f = cmd_send_policy_parsed,
+	.data = NULL,
+	.help_str = "send_policy now",
+	.tokens = {
+		(void *)&cmd_send_policy,
+		(void *)&cmd_send_policy_cmd_cmd,
+		NULL,
+	},
+};
+
 cmdline_parse_ctx_t main_ctx[] = {
 		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_send_policy_set,
 		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
 		NULL,
 };
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
index 0c4bdd5..277eab3 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -40,12 +40,6 @@ extern "C" {
 
 #include "channel_commands.h"
 
-int guest_channel_host_connect(unsigned lcore_id);
-
-int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
-
-void guest_channel_host_disconnect(unsigned lcore_id);
-
 void run_cli(__attribute__((unused)) void *arg);
 
 #ifdef __cplusplus
-- 
2.7.4


* [dpdk-dev] [PATCH v3 9/9] examples/vm_power_mgr: set MAC address of VF
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
                           ` (7 preceding siblings ...)
  2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 8/9] examples/guest_cli: add send policy to host David Hunt
@ 2017-10-03 14:08         ` David Hunt
  8 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-03 14:08 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, jingjing.wu, David Hunt

We need to set the VF MAC addresses from the host so that they are in sync
on the guest and the host. Otherwise, we'll have a random MAC on the guest
and a 00:00:00:00:00:00 MAC on the host.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/main.c | 60 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)
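
For readers following the address scheme in the diff below, here is a
minimal standalone sketch of it. It is not part of the patch: the helper
names make_vf_mac() and format_mac() are hypothetical, and plain byte arrays
stand in for struct ether_addr. The first four bytes are fixed at 0xe0, and
the last two are 0xf0 plus the port and VF index respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAC_LEN 6

/*
 * Hypothetical helper: build e0:e0:e0:e0:(f0+port):(f0+vf).
 * The sums are truncated to one byte, so the scheme only yields
 * distinct addresses while port and vf are both below 16.
 */
static void
make_vf_mac(uint8_t mac[MAC_LEN], uint16_t port, uint16_t vf)
{
	int i;

	assert(mac != NULL);
	for (i = 0; i < 4; i++)
		mac[i] = 0xe0;
	mac[4] = (uint8_t)(0xf0 + port);
	mac[5] = (uint8_t)(0xf0 + vf);
}

/* Hypothetical helper: write the MAC as colon-separated hex into buf. */
static int
format_mac(char *buf, size_t len, const uint8_t mac[MAC_LEN])
{
	return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
			(unsigned int)mac[0], (unsigned int)mac[1],
			(unsigned int)mac[2], (unsigned int)mac[3],
			(unsigned int)mac[4], (unsigned int)mac[5]);
}
```

With port 1 and VF 2 this produces e0:e0:e0:e0:f1:f2, which is the value the
guest should then see on its VF.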

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 698abca..18f5e7f 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -58,6 +58,15 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#ifdef RTE_LIBRTE_IXGBE_PMD
+#include <rte_pmd_ixgbe.h>
+#endif
+#ifdef RTE_LIBRTE_I40E_PMD
+#include <rte_pmd_i40e.h>
+#endif
+#ifdef RTE_LIBRTE_BNXT_PMD
+#include <rte_pmd_bnxt.h>
+#endif
 
 #define RX_RING_SIZE 512
 #define TX_RING_SIZE 512
@@ -222,7 +231,7 @@ check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
 						(uint8_t)portid);
 				continue;
 			}
-		       /* clear all_ports_up flag if any link down */
+			/* clear all_ports_up flag if any link down */
 			if (link.link_status == ETH_LINK_DOWN) {
 				all_ports_up = 0;
 				break;
@@ -273,7 +282,9 @@ main(int argc, char **argv)
 	unsigned lcore_id;
 	unsigned int nb_ports;
 	struct rte_mempool *mbuf_pool;
+#ifdef RTE_LIBRTE_I40E_PMD
 	uint8_t portid;
+#endif
 
 
 	ret = rte_eal_init(argc, argv);
@@ -300,13 +311,60 @@ main(int argc, char **argv)
 		rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
 
 	/* Initialize ports. */
+#ifdef RTE_LIBRTE_I40E_PMD
 	for (portid = 0; portid < nb_ports; portid++) {
+		struct ether_addr eth;
+		int w, j;
+		int ret = -ENOTSUP;
+
 		if ((enabled_port_mask & (1 << portid)) == 0)
 			continue;
+
+		eth.addr_bytes[0] = 0xe0;
+		eth.addr_bytes[1] = 0xe0;
+		eth.addr_bytes[2] = 0xe0;
+		eth.addr_bytes[3] = 0xe0;
+		eth.addr_bytes[4] = portid + 0xf0;
+
 		if (port_init(portid, mbuf_pool) != 0)
 			rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
 					portid);
+
+		for (w = 0; w < MAX_VFS; w++) {
+			eth.addr_bytes[5] = w + 0xf0;
+
+#ifdef RTE_LIBRTE_IXGBE_PMD
+			if (ret == -ENOTSUP)
+				ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
+						w, &eth);
+#endif
+#ifdef RTE_LIBRTE_I40E_PMD
+			if (ret == -ENOTSUP)
+				ret = rte_pmd_i40e_set_vf_mac_addr(portid,
+						w, &eth);
+#endif
+#ifdef RTE_LIBRTE_BNXT_PMD
+			if (ret == -ENOTSUP)
+				ret = rte_pmd_bnxt_set_vf_mac_addr(portid,
+						w, &eth);
+#endif
+
+
+			switch (ret) {
+			case 0:
+				printf("Port %d VF %d MAC: ",
+						portid, w);
+				for (j = 0; j < 6; j++) {
+					printf("%02x", eth.addr_bytes[j]);
+					if (j < 5)
+						printf(":");
+				}
+				printf("\n");
+				break;
+			}
+		}
 	}
+#endif
 
 	lcore_id = rte_get_next_lcore(-1, 1, 0);
 	if (lcore_id == RTE_MAX_LCORE) {
-- 
2.7.4


* [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 1/4] lib/librte_power: add turbo boost API David Hunt
  2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
@ 2017-10-11 16:18       ` David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 1/9] net/i40e: add API to convert VF MAC to VF id David Hunt
                           ` (9 more replies)
  1 sibling, 10 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, jingjing.wu, santosh.shukla

Policy Based Power Control for Guest

This patchset adds the facility for a guest VM to send a policy down to the
host that will allow the host to scale up/down cpu frequencies
depending on the policy criteria independently of the DPDK app running in
the guest.  This differs from the previous vm_power implementation where
individual scale up/down requests were sent from the guest to the host via
virtio-serial.

V9 patchset changes:
  * Rebased on top of the tip of the master branch
  * changed port_id from uint8 to uint16 due to changes elsewhere

V8 patchset changes:
  * Added Ack's and Reviewed-by's to individual patches in the set so as to
    keep patchwork A/R/T flags properly in sync.

V7 patchset changes:
  * Changed return code of rte_pmd_i40e_query_vfid_by_mac() from an
    int64_t to int

V6 patchset changes:
  * Fixed comments in header for rte_pmd_i40e_query_vfid_by_mac.
  * changed rte_pmd_i40e_query_vfid_by_mac return code from uint to int
    as it can return negative error codes.
  * Removed bool enum from channel_commands.h, including stdbool.h instead.
  * Added #define VM_MAX_NAME_SZ 32 to channel_commands.h
  * Renamed a few variables to be more readable.
  * Added returns in a few places if failed to get info on domain.
  * Fixed power_manager_init to keep track of num_freqs for each core.
  * In power_manager_scale_core_med(), changed a hardcoded '5' to instead
    be calculated from the centre of the frequency list
    (global_core_freq_info[core_num].num_freqs / 2)

V5 patchset changes:
  * Removed most of the #ifdef I40_PMD as it will be applicable to
    other PMDs in the future.
  * Changed the parameter of rte_pmd_i40e_query_vfid_by_mac from a uint64
    to a const struct ether_addr *, rather than casting it later in the
    function.

V4 patchset changes:
  * None, re-post to mailing list under the correct email thread.

V3 patchset changes:
  * Changed to using is_same_ether_addr() instead of looping through
    the mac address bytes to compare them.
  * Tweaked some comments and wording in the i40e patch after review.
  * Added a patch to the set to add new i40e function to map file, so
    as to allow shared library builds. The power library API needs a cleanup
    in next release, so will add API/ABI warning for this cleanup in a
    separate patch.

V2 patchset changes:
  * Removed API's in ethdev layer.
  * Now just a single new API in the i40e driver for mapping VF MAC to
    VF index.
  * Moved new function from rte_rxtx.c to rte_pmd_i40e.c
  * Removed function for reading i40e register, moved to using the
    standard stats API.
  * Renamed i40e function to rte_pmd_i40e_query_vfid_by_mac
  * Cleaned up policy generation code.

It's a modification of the vm_power_manager app that runs in the host, and
the guest_vm_power_app example app that runs in the guest. This allows the
guest to send down a policy to the host via virtio-serial, which then allows
the host to scale up/down based on the criteria in the policy, resulting in
quicker scale up/down than individual requests coming from the guest.
It also means that the DPDK application running in the guest does not need
to be modified in any way; it is unaware that its cores are being scaled
up/down, reducing the effort in implementing a power-aware infrastructure.

The usage model is as follows:
1. Set up the VF's and assign to the guest in the usual way.
2. run vm_power_manager on the host, creating a channel to the guest.
3. Start the guest_vm_power_mgr app on the guest, which establishes
   a virtio-serial channel to the host.
4. Send down the policy for the guest using the "send_policy now" command.
   There is an example policy hard-coded into guest_vm_power_mgr.
5. Stop the guest_vm_power_mgr and run your normal power-unaware application.
6. Send traffic into the VFs at varying traffic rates.
   Observe the frequency change on the host (turbostat -i 1)

The sequence of code changes is as follows:

A new function has been added to the i40e driver to allow mapping of
a VF MAC to VF index.
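
As a rough illustration of that mapping, the lookup amounts to a linear scan
over the PF's VF table. The sketch below is standalone and does not use the
driver's types: struct vf_entry and find_vf_by_mac() are hypothetical names,
and memcmp() stands in for is_same_ether_addr().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for the per-VF state kept by the PF driver. */
struct vf_entry {
	uint8_t mac[MAC_LEN];
};

/*
 * Return the index of the VF whose MAC matches vf_mac, or -EINVAL
 * if no VF on this port has that address.
 */
static int
find_vf_by_mac(const struct vf_entry *vfs, int nb_vfs,
		const uint8_t vf_mac[MAC_LEN])
{
	int vf_id;

	assert(vfs != NULL && vf_mac != NULL);
	for (vf_id = 0; vf_id < nb_vfs; vf_id++) {
		if (memcmp(vfs[vf_id].mac, vf_mac, MAC_LEN) == 0)
			return vf_id;
	}
	return -EINVAL;
}
```

Because a negative value signals failure, callers should check the sign of
the result before using it as an index.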

Next we make an addition to librte_power that adds an extra command to allow
the passing of a policy structure from the guest to the host. This struct
contains information like busy/quiet hour, packet throughput thresholds, etc.
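
To give a feel for what the guest fills in, here is a reduced sketch of such
a structure and an initialiser. It is an assumption for illustration only:
struct policy_msg and policy_msg_init() are hypothetical and carry just a
few of the fields of the real struct channel_packet. The sketch zeroes the
message first and uses a bounded copy for the VM name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define VM_NAME_SZ 32
#define MAX_VCPU 8

/* Hypothetical, reduced stand-in for the policy part of the packet. */
struct policy_msg {
	char vm_name[VM_NAME_SZ];
	uint8_t vcpu_to_control[MAX_VCPU];
	uint8_t num_vcpu;
	uint32_t avg_max_packet_thresh;
	uint32_t max_max_packet_thresh;
};

/*
 * Fill in a policy message. Returns 0 on success, or -1 if the VM
 * name or the vcpu list would not fit in the fixed-size fields.
 */
static int
policy_msg_init(struct policy_msg *msg, const char *vm_name,
		const uint8_t *vcpus, uint8_t nb_vcpus)
{
	assert(msg != NULL && vm_name != NULL && vcpus != NULL);
	if (nb_vcpus > MAX_VCPU || strlen(vm_name) >= VM_NAME_SZ)
		return -1;

	/* Zero everything so unset fields never carry stack garbage. */
	memset(msg, 0, sizeof(*msg));
	snprintf(msg->vm_name, sizeof(msg->vm_name), "%s", vm_name);
	memcpy(msg->vcpu_to_control, vcpus, nb_vcpus);
	msg->num_vcpu = nb_vcpus;
	return 0;
}
```

Zeroing the whole message matters because it is sent as raw bytes, so any
field left unset would otherwise reach the host with indeterminate contents.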

The next addition adds functionality to convert the virtual CPU (vcpu) IDs to
physical CPU (pcpu) IDs so that the host can scale up/down the cores used
in the guest.

The remaining patches are functionality to process the policy, and take action
when the relevant trigger occurs to cause a frequency change.

[1/9] net/i40e: add API to convert VF MAC to VF id
[2/9] lib/librte_power: add extra msg type for policies
[3/9] examples/vm_power_mgr: add vcpu to pcpu mapping
[4/9] examples/vm_power_mgr: add scale to medium freq fn
[5/9] examples/vm_power_mgr: add policy to channels
[6/9] examples/vm_power_mgr: add port initialisation
[7/9] power: add send channel msg function to map file
[8/9] examples/guest_cli: add send policy to host
[9/9] examples/vm_power_mgr: set MAC address of VF


* [dpdk-dev] [PATCH v9 1/9] net/i40e: add API to convert VF MAC to VF id
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
@ 2017-10-11 16:18         ` David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 2/9] lib/librte_power: add extra msg type for policies David Hunt
                           ` (8 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, santosh.shukla, Sexton, Rory,
	Nemanja Marjanovic, David Hunt

From: "Sexton, Rory" <rory.sexton@intel.com>

Need a way to convert a vf id to a pf id on the host so as to query the pf
for relevant statistics which are used for the frequency changes in the
vm_power_manager app. Used when profiles are passed down from the guest
to the host, allowing the host to map the vfs to pfs.

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 drivers/net/i40e/rte_pmd_i40e.c           | 30 ++++++++++++++++++++++++++++++
 drivers/net/i40e/rte_pmd_i40e.h           | 15 +++++++++++++++
 drivers/net/i40e/rte_pmd_i40e_version.map |  1 +
 3 files changed, 46 insertions(+)

diff --git a/drivers/net/i40e/rte_pmd_i40e.c b/drivers/net/i40e/rte_pmd_i40e.c
index 0988023..103e161 100644
--- a/drivers/net/i40e/rte_pmd_i40e.c
+++ b/drivers/net/i40e/rte_pmd_i40e.c
@@ -2430,3 +2430,33 @@ rte_pmd_i40e_flow_type_mapping_update(
 
 	return 0;
 }
+
+int
+rte_pmd_i40e_query_vfid_by_mac(uint16_t port, const struct ether_addr *vf_mac)
+{
+	struct rte_eth_dev *dev;
+	struct ether_addr *mac;
+	struct i40e_pf *pf;
+	int vf_id;
+	struct i40e_pf_vf *vf;
+	uint16_t vf_num;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+	dev = &rte_eth_devices[port];
+
+	if (!is_i40e_supported(dev))
+		return -ENOTSUP;
+
+	pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+	vf_num = pf->vf_num;
+
+	for (vf_id = 0; vf_id < vf_num; vf_id++) {
+		vf = &pf->vfs[vf_id];
+		mac = &vf->mac_addr;
+
+		if (is_same_ether_addr(mac, vf_mac))
+			return vf_id;
+	}
+
+	return -EINVAL;
+}
diff --git a/drivers/net/i40e/rte_pmd_i40e.h b/drivers/net/i40e/rte_pmd_i40e.h
index 8fa5869..91f647e 100644
--- a/drivers/net/i40e/rte_pmd_i40e.h
+++ b/drivers/net/i40e/rte_pmd_i40e.h
@@ -737,4 +737,19 @@ int rte_pmd_i40e_flow_type_mapping_get(
  */
 int rte_pmd_i40e_flow_type_mapping_reset(uint8_t port);
 
+/**
+ * On the PF, find VF index based on VF MAC address
+ *
+ * @param port
+ *    port identifier of the device
+ * @param vf_mac
+ *    the mac address of the vf to determine index of
+ * @return
+ *    The VF index if successful.
+ *    -EINVAL: vf mac address does not exist for this port
+ *    -ENOTSUP: i40e not supported for this port.
+ */
+int rte_pmd_i40e_query_vfid_by_mac(uint16_t port,
+					const struct ether_addr *vf_mac);
+
 #endif /* _PMD_I40E_H_ */
diff --git a/drivers/net/i40e/rte_pmd_i40e_version.map b/drivers/net/i40e/rte_pmd_i40e_version.map
index 9292454..3f5871c 100644
--- a/drivers/net/i40e/rte_pmd_i40e_version.map
+++ b/drivers/net/i40e/rte_pmd_i40e_version.map
@@ -53,5 +53,6 @@ DPDK_17.11 {
 	rte_pmd_i40e_flow_type_mapping_update;
 	rte_pmd_i40e_flow_type_mapping_get;
 	rte_pmd_i40e_flow_type_mapping_reset;
+	rte_pmd_i40e_query_vfid_by_mac;
 
 } DPDK_17.08;
-- 
2.7.4


* [dpdk-dev] [PATCH v9 2/9] lib/librte_power: add extra msg type for policies
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 1/9] net/i40e: add API to convert VF MAC to VF id David Hunt
@ 2017-10-11 16:18         ` David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 3/9] examples/vm_power_mgr: add vcpu to pcpu mapping David Hunt
                           ` (7 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, santosh.shukla, David Hunt,
	Nemanja Marjanovic, Rory Sexton

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_power/channel_commands.h | 42 +++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
index 484085b..f0f5f0a 100644
--- a/lib/librte_power/channel_commands.h
+++ b/lib/librte_power/channel_commands.h
@@ -39,6 +39,7 @@ extern "C" {
 #endif
 
 #include <stdint.h>
+#include <stdbool.h>
 
 /* Maximum number of channels per VM */
 #define CHANNEL_CMDS_MAX_VM_CHANNELS 64
@@ -46,6 +47,7 @@ extern "C" {
 /* Valid Commands */
 #define CPU_POWER               1
 #define CPU_POWER_CONNECT       2
+#define PKT_POLICY              3
 
 /* CPU Power Command Scaling */
 #define CPU_POWER_SCALE_UP      1
@@ -54,11 +56,51 @@ extern "C" {
 #define CPU_POWER_SCALE_MIN     4
 #define CPU_POWER_ENABLE_TURBO  5
 #define CPU_POWER_DISABLE_TURBO 6
+#define HOURS 24
+
+#define MAX_VFS 10
+#define VM_MAX_NAME_SZ 32
+
+#define MAX_VCPU_PER_VM         8
+
+struct t_boost_status {
+	bool tbEnabled;
+};
+
+struct timer_profile {
+	int busy_hours[HOURS];
+	int quiet_hours[HOURS];
+	int hours_to_use_traffic_profile[HOURS];
+};
+
+enum workload {HIGH, MEDIUM, LOW};
+enum policy_to_use {
+	TRAFFIC,
+	TIME,
+	WORKLOAD
+};
+
+struct traffic {
+	uint32_t min_packet_thresh;
+	uint32_t avg_max_packet_thresh;
+	uint32_t max_max_packet_thresh;
+};
 
 struct channel_packet {
 	uint64_t resource_id; /**< core_num, device */
 	uint32_t unit;        /**< scale down/up/min/max */
 	uint32_t command;     /**< Power, IO, etc */
+	char vm_name[VM_MAX_NAME_SZ];
+
+	uint64_t vfid[MAX_VFS];
+	int nb_mac_to_monitor;
+	struct traffic traffic_policy;
+	uint8_t vcpu_to_control[MAX_VCPU_PER_VM];
+	uint8_t num_vcpu;
+	struct timer_profile timer_policy;
+	enum workload workload;
+	enum policy_to_use policy_to_use;
+	struct t_boost_status t_boost_status;
 };
 
 
-- 
2.7.4


* [dpdk-dev] [PATCH v9 3/9] examples/vm_power_mgr: add vcpu to pcpu mapping
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 1/9] net/i40e: add API to convert VF MAC to VF id David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 2/9] lib/librte_power: add extra msg type for policies David Hunt
@ 2017-10-11 16:18         ` David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 4/9] examples/vm_power_mgr: add scale to medium freq fn David Hunt
                           ` (6 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, santosh.shukla, David Hunt,
	Nemanja Marjanovic, Rory Sexton

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 examples/vm_power_manager/channel_manager.c | 67 +++++++++++++++++++++++++++++
 examples/vm_power_manager/channel_manager.h | 25 +++++++++++
 2 files changed, 92 insertions(+)

diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
index e068ae2..ab856bd 100644
--- a/examples/vm_power_manager/channel_manager.c
+++ b/examples/vm_power_manager/channel_manager.c
@@ -574,6 +574,73 @@ set_channel_status(const char *vm_name, unsigned *channel_list,
 	return num_channels_changed;
 }
 
+void
+get_all_vm(int *num_vm, int *num_vcpu)
+{
+
+	virNodeInfo node_info;
+	virDomainPtr *domptr;
+	uint64_t mask;
+	int i, ii, numVcpus[MAX_VCPUS], cpu, n_vcpus;
+	unsigned int jj;
+	const char *vm_name;
+	unsigned int domain_flags = VIR_CONNECT_LIST_DOMAINS_RUNNING |
+				VIR_CONNECT_LIST_DOMAINS_PERSISTENT;
+	unsigned int domain_flag = VIR_DOMAIN_VCPU_CONFIG;
+
+
+	memset(global_cpumaps, 0, CHANNEL_CMDS_MAX_CPUS*global_maplen);
+	if (virNodeGetInfo(global_vir_conn_ptr, &node_info)) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "Unable to retrieve node Info\n");
+		return;
+	}
+
+	/* Returns number of pcpus */
+	global_n_host_cpus = (unsigned int)node_info.cpus;
+
+	/* Returns number of active domains */
+	*num_vm = virConnectListAllDomains(global_vir_conn_ptr, &domptr,
+					domain_flags);
+	if (*num_vm <= 0) {
+		RTE_LOG(ERR, CHANNEL_MANAGER, "No Active Domains Running\n");
+		return;
+	}
+
+	for (i = 0; i < *num_vm; i++) {
+
+		/* Get Domain Names */
+		vm_name = virDomainGetName(domptr[i]);
+		lvm_info[i].vm_name = vm_name;
+
+		/* Get Number of Vcpus */
+		numVcpus[i] = virDomainGetVcpusFlags(domptr[i], domain_flag);
+
+		/* Get Number of VCpus & VcpuPinInfo */
+		n_vcpus = virDomainGetVcpuPinInfo(domptr[i],
+				numVcpus[i], global_cpumaps,
+				global_maplen, domain_flag);
+
+		if ((int)n_vcpus > 0) {
+			*num_vcpu = n_vcpus;
+			lvm_info[i].num_cpus = n_vcpus;
+		}
+
+		/* Save pcpu in use by libvirt VMs */
+		for (ii = 0; ii < n_vcpus; ii++) {
+			mask = 0;
+			for (jj = 0; jj < global_n_host_cpus; jj++) {
+				if (VIR_CPU_USABLE(global_cpumaps,
+						global_maplen, ii, jj) > 0) {
+					mask |= 1ULL << jj;
+				}
+			}
+			ITERATIVE_BITMASK_CHECK_64(mask, cpu) {
+				lvm_info[i].pcpus[ii] = cpu;
+			}
+		}
+	}
+}
+
 int
 get_info_vm(const char *vm_name, struct vm_info *info)
 {
diff --git a/examples/vm_power_manager/channel_manager.h b/examples/vm_power_manager/channel_manager.h
index 47c3b9c..358fb8f 100644
--- a/examples/vm_power_manager/channel_manager.h
+++ b/examples/vm_power_manager/channel_manager.h
@@ -66,6 +66,17 @@ struct sockaddr_un _sockaddr_un;
 #define UNIX_PATH_MAX sizeof(_sockaddr_un.sun_path)
 #endif
 
+#define MAX_VMS 4
+#define MAX_VCPUS 20
+
+
+struct libvirt_vm_info {
+	const char *vm_name;
+	unsigned int pcpus[MAX_VCPUS];
+	uint8_t num_cpus;
+};
+
+struct libvirt_vm_info lvm_info[MAX_VMS];
 /* Communication Channel Status */
 enum channel_status { CHANNEL_MGR_CHANNEL_DISCONNECTED = 0,
 	CHANNEL_MGR_CHANNEL_CONNECTED,
@@ -319,6 +330,20 @@ int set_channel_status(const char *vm_name, unsigned *channel_list,
  */
 int get_info_vm(const char *vm_name, struct vm_info *info);
 
+/**
+ * Populates a table with all domains running and their physical cpu.
+ * All information is gathered through libvirt api.
+ *
+ * @param num_vm
+ *  modified to store number of active VMs
+ *
+ * @param num_vcpu
+ *  modified to store number of vcpus active
+ *
+ * @return
+ *   void
+ */
+void get_all_vm(int *num_vm, int *num_vcpu);
 #ifdef __cplusplus
 }
 #endif
-- 
2.7.4


* [dpdk-dev] [PATCH v9 4/9] examples/vm_power_mgr: add scale to medium freq fn
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
                           ` (2 preceding siblings ...)
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 3/9] examples/vm_power_mgr: add vcpu to pcpu mapping David Hunt
@ 2017-10-11 16:18         ` David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 5/9] examples/vm_power_mgr: add policy to channels David Hunt
                           ` (5 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, santosh.shukla, David Hunt,
	Nemanja Marjanovic, Rory Sexton

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 examples/vm_power_manager/power_manager.c | 32 ++++++++++++++++++++++++++-----
 examples/vm_power_manager/power_manager.h | 13 +++++++++++++
 2 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 80705f9..1834a82 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -108,7 +108,7 @@ set_host_cpus_mask(void)
 int
 power_manager_init(void)
 {
-	unsigned i, num_cpus;
+	unsigned int i, num_cpus, num_freqs;
 	uint64_t cpu_mask;
 	int ret = 0;
 
@@ -121,15 +121,21 @@ power_manager_init(void)
 	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
 	cpu_mask = global_enabled_cpus;
 	for (i = 0; cpu_mask; cpu_mask &= ~(1 << i++)) {
-		if (rte_power_init(i) < 0 || rte_power_freqs(i,
-				global_core_freq_info[i].freqs,
-				RTE_MAX_LCORE_FREQS) == 0) {
-			RTE_LOG(ERR, POWER_MANAGER, "Unable to initialize power manager "
+		if (rte_power_init(i) < 0)
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Unable to initialize power manager "
 					"for core %u\n", i);
+		num_freqs = rte_power_freqs(i, global_core_freq_info[i].freqs,
+					RTE_MAX_LCORE_FREQS);
+		if (num_freqs == 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+				"Unable to get frequency list for core %u\n",
+				i);
 			global_enabled_cpus &= ~(1 << i);
 			num_cpus--;
 			ret = -1;
 		}
+		global_core_freq_info[i].num_freqs = num_freqs;
 		rte_spinlock_init(&global_core_freq_info[i].power_sl);
 	}
 	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs , enabled core mask:"
@@ -286,3 +292,19 @@ power_manager_disable_turbo_core(unsigned int core_num)
 	POWER_SCALE_CORE(disable_turbo, core_num, ret);
 	return ret;
 }
+
+int
+power_manager_scale_core_med(unsigned int core_num)
+{
+	int ret = 0;
+
+	if (core_num >= POWER_MGR_MAX_CPUS)
+		return -1;
+	if (!(global_enabled_cpus & (1ULL << core_num)))
+		return -1;
+	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
+	ret = rte_power_set_freq(core_num,
+				global_core_freq_info[core_num].num_freqs / 2);
+	rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl);
+	return ret;
+}
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index b74d09b..b52fb4c 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -231,6 +231,19 @@ int power_manager_disable_turbo_core(unsigned int core_num);
  */
 uint32_t power_manager_get_current_frequency(unsigned core_num);
 
+/**
+ * Scale to medium frequency for the core specified by core_num.
+ * It is thread-safe.
+ *
+ * @param core_num
+ *  The core number to change frequency
+ *
+ * @return
+ *  - 1 on success.
+ *  - 0 if frequency not changed.
+ *  - Negative on error.
+ */
+int power_manager_scale_core_med(unsigned int core_num);
 
 #ifdef __cplusplus
 }
-- 
2.7.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v9 5/9] examples/vm_power_mgr: add policy to channels
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
                           ` (3 preceding siblings ...)
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 4/9] examples/vm_power_mgr: add scale to medium freq fn David Hunt
@ 2017-10-11 16:18         ` David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 6/9] examples/vm_power_mgr: add port initialisation David Hunt
                           ` (4 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, santosh.shukla, Sexton, Rory,
	Nemanja Marjanovic, David Hunt

From: "Sexton, Rory" <rory.sexton@intel.com>

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 examples/vm_power_manager/Makefile          |  16 ++
 examples/vm_power_manager/channel_monitor.c | 321 +++++++++++++++++++++++++++-
 examples/vm_power_manager/channel_monitor.h |  18 ++
 3 files changed, 348 insertions(+), 7 deletions(-)

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index 59a9641..9cf20a2 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -54,6 +54,22 @@ CFLAGS += $(WERROR_FLAGS)
 
 LDLIBS += -lvirt
 
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+
+ifeq ($(CONFIG_RTE_LIBRTE_IXGBE_PMD),y)
+LDLIBS += -lrte_pmd_ixgbe
+endif
+
+ifeq ($(CONFIG_RTE_LIBRTE_I40E_PMD),y)
+LDLIBS += -lrte_pmd_i40e
+endif
+
+ifeq ($(CONFIG_RTE_LIBRTE_BNXT_PMD),y)
+LDLIBS += -lrte_pmd_bnxt
+endif
+
+endif
+
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
 ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index ac40dac..f16358d 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -41,13 +41,17 @@
 #include <sys/types.h>
 #include <sys/epoll.h>
 #include <sys/queue.h>
+#include <sys/time.h>
 
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_malloc.h>
 #include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_ethdev.h>
+#include <rte_pmd_i40e.h>
 
-
+#include <libvirt/libvirt.h>
 #include "channel_monitor.h"
 #include "channel_commands.h"
 #include "channel_manager.h"
@@ -57,10 +61,15 @@
 
 #define MAX_EVENTS 256
 
+uint64_t vsi_pkt_count_prev[384];
+uint64_t rdtsc_prev[384];
 
+double time_period_s = 1;
 static volatile unsigned run_loop = 1;
 static int global_event_fd;
+static unsigned int policy_is_set;
 static struct epoll_event *global_events_list;
+static struct policy policies[MAX_VMS];
 
 void channel_monitor_exit(void)
 {
@@ -68,6 +77,286 @@ void channel_monitor_exit(void)
 	rte_free(global_events_list);
 }
 
+static void
+core_share(int pNo, int z, int x, int t)
+{
+	if (policies[pNo].core_share[z].pcpu == lvm_info[x].pcpus[t]) {
+		if (strcmp(policies[pNo].pkt.vm_name,
+				lvm_info[x].vm_name) != 0) {
+			policies[pNo].core_share[z].status = 1;
+			power_manager_scale_core_max(
+					policies[pNo].core_share[z].pcpu);
+		}
+	}
+}
+
+static void
+core_share_status(int pNo)
+{
+
+	int noVms, noVcpus, z, x, t;
+
+	get_all_vm(&noVms, &noVcpus);
+
+	/* Reset Core Share Status. */
+	for (z = 0; z < noVcpus; z++)
+		policies[pNo].core_share[z].status = 0;
+
+	/* Foreach vcpu in a policy. */
+	for (z = 0; z < policies[pNo].pkt.num_vcpu; z++) {
+		/* Foreach VM on the platform. */
+		for (x = 0; x < noVms; x++) {
+			/* Foreach vcpu of VMs on platform. */
+			for (t = 0; t < lvm_info[x].num_cpus; t++)
+				core_share(pNo, z, x, t);
+		}
+	}
+}
+
+static void
+get_pcpu_to_control(struct policy *pol)
+{
+
+	/* Convert vcpu to pcpu. */
+	struct vm_info info;
+	int pcpu, count;
+	uint64_t mask_u64b;
+
+	RTE_LOG(INFO, CHANNEL_MONITOR, "Looking for pcpu for %s\n",
+			pol->pkt.vm_name);
+	get_info_vm(pol->pkt.vm_name, &info);
+
+	for (count = 0; count < pol->pkt.num_vcpu; count++) {
+		mask_u64b = info.pcpu_mask[pol->pkt.vcpu_to_control[count]];
+		for (pcpu = 0; mask_u64b; mask_u64b &= ~(1ULL << pcpu++)) {
+			if ((mask_u64b >> pcpu) & 1)
+				pol->core_share[count].pcpu = pcpu;
+		}
+	}
+}
+
+static int
+get_pfid(struct policy *pol)
+{
+
+	int i, x, ret = 0, nb_ports;
+
+	nb_ports = rte_eth_dev_count();
+	for (i = 0; i < pol->pkt.nb_mac_to_monitor; i++) {
+
+		for (x = 0; x < nb_ports; x++) {
+			ret = rte_pmd_i40e_query_vfid_by_mac(x,
+				(struct ether_addr *)&(pol->pkt.vfid[i]));
+			if (ret != -EINVAL) {
+				pol->port[i] = x;
+				break;
+			}
+		}
+		if (ret == -EINVAL || ret == -ENOTSUP || ret == -ENODEV) {
+			RTE_LOG(INFO, CHANNEL_MONITOR,
+				"Error with Policy. MAC not found on "
+				"attached ports ");
+			pol->enabled = 0;
+			return ret;
+		}
+		pol->pfid[i] = ret;
+	}
+	return 1;
+}
+
+static int
+update_policy(struct channel_packet *pkt)
+{
+
+	unsigned int updated = 0;
+
+	for (int i = 0; i < MAX_VMS; i++) {
+		if (strcmp(policies[i].pkt.vm_name, pkt->vm_name) == 0) {
+			policies[i].pkt = *pkt;
+			get_pcpu_to_control(&policies[i]);
+			if (get_pfid(&policies[i]) == -1) {
+				updated = 1;
+				break;
+			}
+			core_share_status(i);
+			policies[i].enabled = 1;
+			updated = 1;
+		}
+	}
+	if (!updated) {
+		for (int i = 0; i < MAX_VMS; i++) {
+			if (policies[i].enabled == 0) {
+				policies[i].pkt = *pkt;
+				get_pcpu_to_control(&policies[i]);
+				if (get_pfid(&policies[i]) == -1)
+					break;
+				core_share_status(i);
+				policies[i].enabled = 1;
+				break;
+			}
+		}
+	}
+	return 0;
+}
+
+static uint64_t
+get_pkt_diff(struct policy *pol)
+{
+
+	uint64_t vsi_pkt_count,
+		vsi_pkt_total = 0,
+		vsi_pkt_count_prev_total = 0;
+	double rdtsc_curr, rdtsc_diff, diff;
+	int x;
+	struct rte_eth_stats vf_stats;
+
+	for (x = 0; x < pol->pkt.nb_mac_to_monitor; x++) {
+
+		/*Read vsi stats*/
+		if (rte_pmd_i40e_get_vf_stats(x, pol->pfid[x], &vf_stats) == 0)
+			vsi_pkt_count = vf_stats.ipackets;
+		else
+			vsi_pkt_count = -1;
+
+		vsi_pkt_total += vsi_pkt_count;
+
+		vsi_pkt_count_prev_total += vsi_pkt_count_prev[pol->pfid[x]];
+		vsi_pkt_count_prev[pol->pfid[x]] = vsi_pkt_count;
+	}
+
+	rdtsc_curr = rte_rdtsc_precise();
+	rdtsc_diff = rdtsc_curr - rdtsc_prev[pol->pfid[x-1]];
+	rdtsc_prev[pol->pfid[x-1]] = rdtsc_curr;
+
+	diff = (vsi_pkt_total - vsi_pkt_count_prev_total) *
+			((double)rte_get_tsc_hz() / rdtsc_diff);
+
+	return diff;
+}
+
+static void
+apply_traffic_profile(struct policy *pol)
+{
+
+	int count;
+	uint64_t diff = 0;
+
+	diff = get_pkt_diff(pol);
+
+	RTE_LOG(INFO, CHANNEL_MONITOR, "Applying traffic profile\n");
+
+	if (diff >= (pol->pkt.traffic_policy.max_max_packet_thresh)) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_max(
+						pol->core_share[count].pcpu);
+		}
+	} else if (diff >= (pol->pkt.traffic_policy.avg_max_packet_thresh)) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_med(
+						pol->core_share[count].pcpu);
+		}
+	} else if (diff < (pol->pkt.traffic_policy.avg_max_packet_thresh)) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_min(
+						pol->core_share[count].pcpu);
+		}
+	}
+}
+
+static void
+apply_time_profile(struct policy *pol)
+{
+
+	int count, x;
+	struct timeval tv;
+	struct tm *ptm;
+	char time_string[40];
+
+	/* Obtain the time of day, and convert it to a tm struct. */
+	gettimeofday(&tv, NULL);
+	ptm = localtime(&tv.tv_sec);
+	/* Format the date and time, down to a single second. */
+	strftime(time_string, sizeof(time_string), "%Y-%m-%d %H:%M:%S", ptm);
+
+	for (x = 0; x < HOURS; x++) {
+
+		if (ptm->tm_hour == pol->pkt.timer_policy.busy_hours[x]) {
+			for (count = 0; count < pol->pkt.num_vcpu; count++) {
+				if (pol->core_share[count].status != 1) {
+					power_manager_scale_core_max(
+						pol->core_share[count].pcpu);
+					RTE_LOG(INFO, CHANNEL_MONITOR,
+						"Scaling up core %u to max\n",
+						pol->core_share[count].pcpu);
+				}
+			}
+			break;
+		} else if (ptm->tm_hour ==
+				pol->pkt.timer_policy.quiet_hours[x]) {
+			for (count = 0; count < pol->pkt.num_vcpu; count++) {
+				if (pol->core_share[count].status != 1) {
+					power_manager_scale_core_min(
+						pol->core_share[count].pcpu);
+					RTE_LOG(INFO, CHANNEL_MONITOR,
+						"Scaling down core %u to min\n",
+						pol->core_share[count].pcpu);
+				}
+			}
+			break;
+		} else if (ptm->tm_hour ==
+			pol->pkt.timer_policy.hours_to_use_traffic_profile[x]) {
+			apply_traffic_profile(pol);
+			break;
+		}
+	}
+}
+
+static void
+apply_workload_profile(struct policy *pol)
+{
+
+	int count;
+
+	if (pol->pkt.workload == HIGH) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_max(
+						pol->core_share[count].pcpu);
+		}
+	} else if (pol->pkt.workload == MEDIUM) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_med(
+						pol->core_share[count].pcpu);
+		}
+	} else if (pol->pkt.workload == LOW) {
+		for (count = 0; count < pol->pkt.num_vcpu; count++) {
+			if (pol->core_share[count].status != 1)
+				power_manager_scale_core_min(
+						pol->core_share[count].pcpu);
+		}
+	}
+}
+
+static void
+apply_policy(struct policy *pol)
+{
+
+	struct channel_packet *pkt = &pol->pkt;
+
+	/*Check policy to use*/
+	if (pkt->policy_to_use == TRAFFIC)
+		apply_traffic_profile(pol);
+	else if (pkt->policy_to_use == TIME)
+		apply_time_profile(pol);
+	else if (pkt->policy_to_use == WORKLOAD)
+		apply_workload_profile(pol);
+}
+
+
 static int
 process_request(struct channel_packet *pkt, struct channel_info *chan_info)
 {
@@ -140,6 +429,13 @@ process_request(struct channel_packet *pkt, struct channel_info *chan_info)
 
 		}
 	}
+
+	if (pkt->command == PKT_POLICY) {
+		RTE_LOG(INFO, CHANNEL_MONITOR, "\nProcessing Policy request from Guest\n");
+		update_policy(pkt);
+		policy_is_set = 1;
+	}
+
 	/* Return is not checked as channel status may have been set to DISABLED
 	 * from management thread
 	 */
@@ -209,9 +505,10 @@ run_channel_monitor(void)
 			struct channel_info *chan_info = (struct channel_info *)
 					global_events_list[i].data.ptr;
 			if ((global_events_list[i].events & EPOLLERR) ||
-					(global_events_list[i].events & EPOLLHUP)) {
+				(global_events_list[i].events & EPOLLHUP)) {
 				RTE_LOG(DEBUG, CHANNEL_MONITOR, "Remote closed connection for "
-						"channel '%s'\n", chan_info->channel_path);
+						"channel '%s'\n",
+						chan_info->channel_path);
 				remove_channel(&chan_info);
 				continue;
 			}
@@ -223,14 +520,17 @@ run_channel_monitor(void)
 				int buffer_len = sizeof(pkt);
 
 				while (buffer_len > 0) {
-					n_bytes = read(chan_info->fd, buffer, buffer_len);
+					n_bytes = read(chan_info->fd,
+							buffer, buffer_len);
 					if (n_bytes == buffer_len)
 						break;
 					if (n_bytes == -1) {
 						err = errno;
-						RTE_LOG(DEBUG, CHANNEL_MONITOR, "Received error on "
-								"channel '%s' read: %s\n",
-								chan_info->channel_path, strerror(err));
+						RTE_LOG(DEBUG, CHANNEL_MONITOR,
+							"Received error on "
+							"channel '%s' read: %s\n",
+							chan_info->channel_path,
+							strerror(err));
 						remove_channel(&chan_info);
 						break;
 					}
@@ -241,5 +541,12 @@ run_channel_monitor(void)
 					process_request(&pkt, chan_info);
 			}
 		}
+		rte_delay_us(time_period_s*1000000);
+		if (policy_is_set) {
+			for (int j = 0; j < MAX_VMS; j++) {
+				if (policies[j].enabled == 1)
+					apply_policy(&policies[j]);
+			}
+		}
 	}
 }
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
index c138607..b52c1fc 100644
--- a/examples/vm_power_manager/channel_monitor.h
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -35,6 +35,24 @@
 #define CHANNEL_MONITOR_H_
 
 #include "channel_manager.h"
+#include "channel_commands.h"
+
+struct core_share {
+	unsigned int pcpu;
+	/*
+	 * 1 CORE SHARE
+	 * 0 NOT SHARED
+	 */
+	int status;
+};
+
+struct policy {
+	struct channel_packet pkt;
+	uint32_t pfid[MAX_VFS];
+	uint32_t port[MAX_VFS];
+	unsigned int enabled;
+	struct core_share core_share[MAX_VCPU_PER_VM];
+};
 
 #ifdef __cplusplus
 extern "C" {
-- 
2.7.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v9 6/9] examples/vm_power_mgr: add port initialisation
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
                           ` (4 preceding siblings ...)
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 5/9] examples/vm_power_mgr: add policy to channels David Hunt
@ 2017-10-11 16:18         ` David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 7/9] power: add send channel msg function to map file David Hunt
                           ` (3 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, santosh.shukla, David Hunt,
	Nemanja Marjanovic

We need to initialise the ports we're monitoring to be able to see
the throughput.
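
For illustration only (this is not the code in the patch): the ports to
initialise are selected with a hexadecimal mask given via -p. A minimal
standalone sketch of that kind of mask handling follows; the names
sketch_parse_portmask and sketch_port_enabled are hypothetical, and the
range check against UINT32_MAX is an assumption about the desired
behaviour.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a hexadecimal port mask such as "3" or "0x5".
 * Returns 0 and stores the mask in *mask on success, -1 on bad input
 * (empty string, leading '-', trailing garbage, zero, or out of range).
 */
static int
sketch_parse_portmask(const char *arg, uint32_t *mask)
{
	char *end = NULL;
	unsigned long pm;

	if (arg == NULL || arg[0] == '\0' || arg[0] == '-')
		return -1;

	errno = 0;
	pm = strtoul(arg, &end, 16);
	if (errno != 0 || *end != '\0' || pm == 0 || pm > UINT32_MAX)
		return -1;

	*mask = (uint32_t)pm;
	return 0;
}

/* Hypothetical helper: return 1 if port_id is selected by the mask. */
static int
sketch_port_enabled(uint32_t mask, unsigned int port_id)
{
	return port_id < 32 && ((mask >> port_id) & 1U);
}
```

Reporting errors through the return value and passing the mask back via
a pointer keeps a valid mask with the top bit set distinguishable from a
failure.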

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 examples/vm_power_manager/main.c | 220 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 220 insertions(+)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index c33fcc9..4ffefce 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -49,6 +49,9 @@
 #include <rte_log.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
+#include <rte_ethdev.h>
+#include <getopt.h>
+#include <rte_cycles.h>
 #include <rte_debug.h>
 
 #include "channel_manager.h"
@@ -56,6 +59,192 @@
 #include "power_manager.h"
 #include "vm_power_cli.h"
 
+#define RX_RING_SIZE 512
+#define TX_RING_SIZE 512
+
+#define NUM_MBUFS 8191
+#define MBUF_CACHE_SIZE 250
+#define BURST_SIZE 32
+
+static uint32_t enabled_port_mask;
+static volatile bool force_quit;
+
+/****************/
+static const struct rte_eth_conf port_conf_default = {
+	.rxmode = { .max_rx_pkt_len = ETHER_MAX_LEN }
+};
+
+static inline int
+port_init(uint16_t port, struct rte_mempool *mbuf_pool)
+{
+	struct rte_eth_conf port_conf = port_conf_default;
+	const uint16_t rx_rings = 1, tx_rings = 1;
+	int retval;
+	uint16_t q;
+
+	if (port >= rte_eth_dev_count())
+		return -1;
+
+	/* Configure the Ethernet device. */
+	retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
+	if (retval != 0)
+		return retval;
+
+	/* Allocate and set up 1 RX queue per Ethernet port. */
+	for (q = 0; q < rx_rings; q++) {
+		retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
+				rte_eth_dev_socket_id(port), NULL, mbuf_pool);
+		if (retval < 0)
+			return retval;
+	}
+
+	/* Allocate and set up 1 TX queue per Ethernet port. */
+	for (q = 0; q < tx_rings; q++) {
+		retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
+				rte_eth_dev_socket_id(port), NULL);
+		if (retval < 0)
+			return retval;
+	}
+
+	/* Start the Ethernet port. */
+	retval = rte_eth_dev_start(port);
+	if (retval < 0)
+		return retval;
+
+	/* Display the port MAC address. */
+	struct ether_addr addr;
+	rte_eth_macaddr_get(port, &addr);
+	printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8
+			   " %02" PRIx8 " %02" PRIx8 " %02" PRIx8 "\n",
+			(unsigned int)port,
+			addr.addr_bytes[0], addr.addr_bytes[1],
+			addr.addr_bytes[2], addr.addr_bytes[3],
+			addr.addr_bytes[4], addr.addr_bytes[5]);
+
+	/* Enable RX in promiscuous mode for the Ethernet device. */
+	rte_eth_promiscuous_enable(port);
+
+
+	return 0;
+}
+
+static int
+parse_portmask(const char *portmask)
+{
+	char *end = NULL;
+	unsigned long pm;
+
+	/* parse hexadecimal string */
+	pm = strtoul(portmask, &end, 16);
+	if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return -1;
+
+	if (pm == 0)
+		return -1;
+
+	return pm;
+}
+/* Parse the argument given in the command line of the application */
+static int
+parse_args(int argc, char **argv)
+{
+	int opt, ret;
+	char **argvopt;
+	int option_index;
+	char *prgname = argv[0];
+	static struct option lgopts[] = {
+		{ "mac-updating", no_argument, 0, 1},
+		{ "no-mac-updating", no_argument, 0, 0},
+		{NULL, 0, 0, 0}
+	};
+	argvopt = argv;
+
+	while ((opt = getopt_long(argc, argvopt, "p:q:T:",
+				  lgopts, &option_index)) != EOF) {
+
+		switch (opt) {
+		/* portmask */
+		case 'p':
+			enabled_port_mask = parse_portmask(optarg);
+			if (enabled_port_mask == 0) {
+				printf("invalid portmask\n");
+				return -1;
+			}
+			break;
+		/* long options */
+		case 0:
+			break;
+
+		default:
+			return -1;
+		}
+	}
+
+	if (optind >= 0)
+		argv[optind-1] = prgname;
+
+	ret = optind-1;
+	optind = 0; /* reset getopt lib */
+	return ret;
+}
+
+static void
+check_all_ports_link_status(uint16_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+	uint16_t portid, count, all_ports_up, print_flag = 0;
+	struct rte_eth_link link;
+
+	printf("\nChecking link status");
+	fflush(stdout);
+	for (count = 0; count <= MAX_CHECK_TIME; count++) {
+		if (force_quit)
+			return;
+		all_ports_up = 1;
+		for (portid = 0; portid < port_num; portid++) {
+			if (force_quit)
+				return;
+			if ((port_mask & (1 << portid)) == 0)
+				continue;
+			memset(&link, 0, sizeof(link));
+			rte_eth_link_get_nowait(portid, &link);
+			/* print link status if flag set */
+			if (print_flag == 1) {
+				if (link.link_status)
+					printf("Port %d Link Up - speed %u "
+						"Mbps - %s\n", (uint16_t)portid,
+						(unsigned int)link.link_speed,
+				(link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+					("full-duplex") : ("half-duplex"));
+				else
+					printf("Port %d Link Down\n",
+						(uint16_t)portid);
+				continue;
+			}
+		       /* clear all_ports_up flag if any link down */
+			if (link.link_status == ETH_LINK_DOWN) {
+				all_ports_up = 0;
+				break;
+			}
+		}
+		/* after finally printing all link status, get out */
+		if (print_flag == 1)
+			break;
+
+		if (all_ports_up == 0) {
+			printf(".");
+			fflush(stdout);
+			rte_delay_ms(CHECK_INTERVAL);
+		}
+
+		/* set the print_flag if all ports up or timeout */
+		if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+			print_flag = 1;
+			printf("done\n");
+		}
+	}
+}
 static int
 run_monitor(__attribute__((unused)) void *arg)
 {
@@ -82,6 +271,10 @@ main(int argc, char **argv)
 {
 	int ret;
 	unsigned lcore_id;
+	unsigned int nb_ports;
+	struct rte_mempool *mbuf_pool;
+	uint16_t portid;
+
 
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
@@ -90,12 +283,39 @@ main(int argc, char **argv)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
+	argc -= ret;
+	argv += ret;
+
+	/* parse application arguments (after the EAL ones) */
+	ret = parse_args(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid arguments\n");
+
+	nb_ports = rte_eth_dev_count();
+
+	mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
+		MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+
+	if (mbuf_pool == NULL)
+		rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+
+	/* Initialize ports. */
+	for (portid = 0; portid < nb_ports; portid++) {
+		if ((enabled_port_mask & (1 << portid)) == 0)
+			continue;
+		if (port_init(portid, mbuf_pool) != 0)
+			rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu16 "\n",
+					portid);
+	}
+
 	lcore_id = rte_get_next_lcore(-1, 1, 0);
 	if (lcore_id == RTE_MAX_LCORE) {
 		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
 				"application\n");
 		return 0;
 	}
+
+	check_all_ports_link_status(nb_ports, enabled_port_mask);
 	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
 
 	if (power_manager_init() < 0) {
-- 
2.7.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v9 7/9] power: add send channel msg function to map file
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
                           ` (5 preceding siblings ...)
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 6/9] examples/vm_power_mgr: add port initialisation David Hunt
@ 2017-10-11 16:18         ` David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 8/9] examples/guest_cli: add send policy to host David Hunt
                           ` (2 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, jingjing.wu, santosh.shukla, David Hunt

Add a new wrapper function, with an rte_power_ prefix, around an
existing private (and until now unused) function.

The plan is to clean up all the header files in the next release so
that only the intended public functions are in the map file and only
the relevant headers have the rte_ prefix so that only they are
included in the documentation.

Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_power/guest_channel.c       |  7 +++++++
 lib/librte_power/guest_channel.h       | 15 +++++++++++++++
 lib/librte_power/rte_power_version.map |  1 +
 3 files changed, 23 insertions(+)

diff --git a/lib/librte_power/guest_channel.c b/lib/librte_power/guest_channel.c
index 85c92fa..fa5de0f 100644
--- a/lib/librte_power/guest_channel.c
+++ b/lib/librte_power/guest_channel.c
@@ -148,6 +148,13 @@ guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id)
 	return 0;
 }
 
+int rte_power_guest_channel_send_msg(struct channel_packet *pkt,
+			unsigned int lcore_id)
+{
+	return guest_channel_send_msg(pkt, lcore_id);
+}
+
+
 void
 guest_channel_host_disconnect(unsigned lcore_id)
 {
diff --git a/lib/librte_power/guest_channel.h b/lib/librte_power/guest_channel.h
index 9e18af5..741339c 100644
--- a/lib/librte_power/guest_channel.h
+++ b/lib/librte_power/guest_channel.h
@@ -81,6 +81,21 @@ void guest_channel_host_disconnect(unsigned lcore_id);
  */
 int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
 
+/**
+ * Send a message contained in pkt over the Virtio-Serial to the host endpoint.
+ *
+ * @param pkt
+ *  Pointer to a populated struct channel_packet
+ *
+ * @param lcore_id
+ *  lcore_id.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int rte_power_guest_channel_send_msg(struct channel_packet *pkt,
+			unsigned int lcore_id);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_power/rte_power_version.map b/lib/librte_power/rte_power_version.map
index 9ae0627..96dc42e 100644
--- a/lib/librte_power/rte_power_version.map
+++ b/lib/librte_power/rte_power_version.map
@@ -20,6 +20,7 @@ DPDK_2.0 {
 DPDK_17.11 {
 	global:
 
+	rte_power_guest_channel_send_msg;
 	rte_power_freq_disable_turbo;
 	rte_power_freq_enable_turbo;
 	rte_power_turbo_status;
-- 
2.7.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v9 8/9] examples/guest_cli: add send policy to host
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
                           ` (6 preceding siblings ...)
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 7/9] power: add send channel msg function to map file David Hunt
@ 2017-10-11 16:18         ` David Hunt
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 9/9] examples/vm_power_mgr: set MAC address of VF David Hunt
  2017-10-12  0:23         ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest Ferruh Yigit
  9 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, jingjing.wu, santosh.shukla, Sexton, Rory,
	Nemanja Marjanovic, David Hunt

From: "Sexton, Rory" <rory.sexton@intel.com>

Here we're adding an example of setting up a policy, and allowing the
vm_cli_guest app to send it to the host using the CLI command
"send_policy now".
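
The example policy identifies the VF to monitor by its MAC address,
carried in a 64-bit vfid field. As a rough standalone sketch of that
packing (not the code in this patch): struct sketch_mac is a
hypothetical stand-in for the DPDK Ethernet address type, and both
helper names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 6-byte Ethernet address. */
struct sketch_mac {
	uint8_t bytes[6];
};

/*
 * Copy the 6 MAC bytes into the first 6 bytes (in memory order) of a
 * zeroed 64-bit value, so the 2 unused bytes are always 0.
 */
static uint64_t
sketch_mac_to_id(const struct sketch_mac *mac)
{
	uint64_t id = 0;

	memcpy(&id, mac->bytes, sizeof(mac->bytes));
	return id;
}

/* Recover the MAC bytes from an id produced by sketch_mac_to_id(). */
static void
sketch_id_to_mac(uint64_t id, struct sketch_mac *mac)
{
	memcpy(mac->bytes, &id, sizeof(mac->bytes));
}
```

Starting from a zeroed value means two ids built from the same MAC
compare equal, which a union written through its 6-byte member alone
does not guarantee.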

Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 .../guest_cli/vm_power_cli_guest.c                 | 97 ++++++++++++++++++++++
 .../guest_cli/vm_power_cli_guest.h                 |  6 --
 2 files changed, 97 insertions(+), 6 deletions(-)

diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 4e982bd..dc9efc2 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -45,8 +45,10 @@
 #include <cmdline.h>
 #include <rte_log.h>
 #include <rte_lcore.h>
+#include <rte_ethdev.h>
 
 #include <rte_power.h>
+#include <guest_channel.h>
 
 #include "vm_power_cli_guest.h"
 
@@ -139,8 +141,103 @@ cmdline_parse_inst_t cmd_set_cpu_freq_set = {
 	},
 };
 
+struct cmd_send_policy_result {
+	cmdline_fixed_string_t send_policy;
+	cmdline_fixed_string_t cmd;
+};
+
+union PFID {
+	struct ether_addr addr;
+	uint64_t pfid;
+};
+
+static inline int
+send_policy(void)
+{
+	struct channel_packet pkt;
+	int ret;
+
+	union PFID pfid;
+	/* Use port MAC address as the vfid */
+	rte_eth_macaddr_get(0, &pfid.addr);
+	printf("Port %u MAC: %02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 ":"
+			"%02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 "\n",
+			0,
+			pfid.addr.addr_bytes[0], pfid.addr.addr_bytes[1],
+			pfid.addr.addr_bytes[2], pfid.addr.addr_bytes[3],
+			pfid.addr.addr_bytes[4], pfid.addr.addr_bytes[5]);
+	pkt.vfid[0] = pfid.pfid;
+
+	pkt.nb_mac_to_monitor = 1;
+	pkt.t_boost_status.tbEnabled = false;
+
+	pkt.vcpu_to_control[0] = 0;
+	pkt.vcpu_to_control[1] = 1;
+	pkt.num_vcpu = 2;
+	/* Dummy Population. */
+	pkt.traffic_policy.min_packet_thresh = 96000;
+	pkt.traffic_policy.avg_max_packet_thresh = 1800000;
+	pkt.traffic_policy.max_max_packet_thresh = 2000000;
+
+	pkt.timer_policy.busy_hours[0] = 3;
+	pkt.timer_policy.busy_hours[1] = 4;
+	pkt.timer_policy.busy_hours[2] = 5;
+	pkt.timer_policy.quiet_hours[0] = 11;
+	pkt.timer_policy.quiet_hours[1] = 12;
+	pkt.timer_policy.quiet_hours[2] = 13;
+
+	pkt.timer_policy.hours_to_use_traffic_profile[0] = 8;
+	pkt.timer_policy.hours_to_use_traffic_profile[1] = 10;
+
+	pkt.workload = LOW;
+	pkt.policy_to_use = TIME;
+	pkt.command = PKT_POLICY;
+	strcpy(pkt.vm_name, "ubuntu2");
+	ret = rte_power_guest_channel_send_msg(&pkt, 1);
+	if (ret == 0)
+		return 1;
+	RTE_LOG(DEBUG, POWER, "Error sending message: %s\n",
+			ret > 0 ? strerror(ret) : "channel not connected");
+	return -1;
+}
+
+static void
+cmd_send_policy_parsed(void *parsed_result, struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	int ret = -1;
+	struct cmd_send_policy_result *res = parsed_result;
+
+	if (!strcmp(res->cmd, "now")) {
+		printf("Sending Policy down now!\n");
+		ret = send_policy();
+	}
+	if (ret != 1)
+		cmdline_printf(cl, "Error sending message: %s\n",
+				strerror(ret));
+}
+
+cmdline_parse_token_string_t cmd_send_policy =
+	TOKEN_STRING_INITIALIZER(struct cmd_send_policy_result,
+			send_policy, "send_policy");
+cmdline_parse_token_string_t cmd_send_policy_cmd_cmd =
+	TOKEN_STRING_INITIALIZER(struct cmd_send_policy_result,
+			cmd, "now");
+
+cmdline_parse_inst_t cmd_send_policy_set = {
+	.f = cmd_send_policy_parsed,
+	.data = NULL,
+	.help_str = "send_policy now",
+	.tokens = {
+		(void *)&cmd_send_policy,
+		(void *)&cmd_send_policy_cmd_cmd,
+		NULL,
+	},
+};
+
 cmdline_parse_ctx_t main_ctx[] = {
 		(cmdline_parse_inst_t *)&cmd_quit,
+		(cmdline_parse_inst_t *)&cmd_send_policy_set,
 		(cmdline_parse_inst_t *)&cmd_set_cpu_freq_set,
 		NULL,
 };
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
index 0c4bdd5..277eab3 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -40,12 +40,6 @@ extern "C" {
 
 #include "channel_commands.h"
 
-int guest_channel_host_connect(unsigned lcore_id);
-
-int guest_channel_send_msg(struct channel_packet *pkt, unsigned lcore_id);
-
-void guest_channel_host_disconnect(unsigned lcore_id);
-
 void run_cli(__attribute__((unused)) void *arg);
 
 #ifdef __cplusplus
-- 
2.7.4


* [dpdk-dev] [PATCH v9 9/9] examples/vm_power_mgr: set MAC address of VF
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
                           ` (7 preceding siblings ...)
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 8/9] examples/guest_cli: add send policy to host David Hunt
@ 2017-10-11 16:18         ` David Hunt
  2017-10-12  0:23         ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest Ferruh Yigit
  9 siblings, 0 replies; 34+ messages in thread
From: David Hunt @ 2017-10-11 16:18 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, jingjing.wu, santosh.shukla, David Hunt

We need to set the VF MAC address from the host so that the guest and the
host stay in sync. Otherwise, the guest ends up with a random MAC and the
host sees a 00:00:00:00:00:00 MAC.
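
For reference, the address scheme used by the loop in main() below can be
sketched in isolation. This is a minimal, standalone illustration and not part
of the patch: sketch_vf_mac() and sketch_format_mac() are hypothetical helper
names, and the sketch only mirrors how the six bytes are filled in
(e0:e0:e0:e0, then port id + 0xf0, then VF index + 0xf0).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAC_LEN 6

/* Hypothetical helper: fill mac[] the same way the loop in main() does. */
static void
sketch_vf_mac(uint16_t portid, unsigned int vf, uint8_t mac[SKETCH_MAC_LEN])
{
	assert(mac != NULL);
	memset(mac, 0xe0, 4);              /* bytes 0-3 are fixed at 0xe0 */
	mac[4] = (uint8_t)(portid + 0xf0); /* byte 4 encodes the port id */
	mac[5] = (uint8_t)(vf + 0xf0);     /* byte 5 encodes the VF index */
	/* Both sums are truncated to 8 bits, so ids above 15 wrap around. */
}

/* Format mac[] as aa:bb:cc:dd:ee:ff; returns the snprintf() result. */
static int
sketch_format_mac(const uint8_t mac[SKETCH_MAC_LEN], char *buf, size_t len)
{
	return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
			mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

With this scheme, port 0 and VF 1 give e0:e0:e0:e0:f0:f1, so the address
alone identifies the port and VF it was assigned to.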

Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 examples/vm_power_manager/main.c | 41 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 4ffefce..399fbdd 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -58,6 +58,9 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include <rte_pmd_ixgbe.h>
+#include <rte_pmd_i40e.h>
+#include <rte_pmd_bnxt.h>
 
 #define RX_RING_SIZE 512
 #define TX_RING_SIZE 512
@@ -301,11 +304,49 @@ main(int argc, char **argv)
 
 	/* Initialize ports. */
 	for (portid = 0; portid < nb_ports; portid++) {
+		struct ether_addr eth;
+		int w, j;
+		int ret = -ENOTSUP;
+
 		if ((enabled_port_mask & (1 << portid)) == 0)
 			continue;
+
+		eth.addr_bytes[0] = 0xe0;
+		eth.addr_bytes[1] = 0xe0;
+		eth.addr_bytes[2] = 0xe0;
+		eth.addr_bytes[3] = 0xe0;
+		eth.addr_bytes[4] = portid + 0xf0;
+
 		if (port_init(portid, mbuf_pool) != 0)
 			rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
 					portid);
+
+		for (w = 0; w < MAX_VFS; w++) {
+			eth.addr_bytes[5] = w + 0xf0;
+			ret = -ENOTSUP;
+			if (ret == -ENOTSUP)
+				ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
+						w, &eth);
+			if (ret == -ENOTSUP)
+				ret = rte_pmd_i40e_set_vf_mac_addr(portid,
+						w, &eth);
+			if (ret == -ENOTSUP)
+				ret = rte_pmd_bnxt_set_vf_mac_addr(portid,
+						w, &eth);
+
+			switch (ret) {
+			case 0:
+				printf("Port %d VF %d MAC: ",
+						portid, w);
+				for (j = 0; j < 6; j++) {
+					printf("%02x", eth.addr_bytes[j]);
+					if (j < 5)
+						printf(":");
+				}
+				printf("\n");
+				break;
+			}
+		}
 	}
 
 	lcore_id = rte_get_next_lcore(-1, 1, 0);
-- 
2.7.4


* Re: [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest
  2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
                           ` (8 preceding siblings ...)
  2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 9/9] examples/vm_power_mgr: set MAC address of VF David Hunt
@ 2017-10-12  0:23         ` Ferruh Yigit
  9 siblings, 0 replies; 34+ messages in thread
From: Ferruh Yigit @ 2017-10-12  0:23 UTC (permalink / raw)
  To: David Hunt, dev; +Cc: konstantin.ananyev, jingjing.wu, santosh.shukla

On 10/11/2017 5:18 PM, David Hunt wrote:
> Policy Based Power Control for Guest
> 
> This patchset adds the facility for a guest VM to send a policy down to the
> host that will allow the host to scale up/down cpu frequencies
> depending on the policy criteria independently of the DPDK app running in
> the guest.  This differs from the previous vm_power implementation where
> individual scale up/down requests were sent from the guest to the host via
> virtio-serial.
> 
> V9 patchset changes:
>   * Rebased on top of the tip of the master branch
>   * changed port_id from uint8 to uint16 due to changes elsewhere
> 
> V8 patchset changes:
>   * Added Ack's and Reviewed-by's to individual patches in the set so as to
>     keep patchwork A/R/T flags properly in sync.
> 
> V7 patchset changes:
>   * Changed return code of rte_pmd_i40e_query_vfid_by_mac() from an
>     int64_t to int
> 
> V6 patchset changes:
>   * Fixed comments in header for rte_pmd_i40e_query_vfid_by_mac.
>   * changed rte_pmd_i40e_query_vfid_by_mac return code from uint to int
>     as it can return negative error codes.
>   * Removed bool enum from channel_commands.h, including stdbool.h instead.
>   * Added #define VM_MAX_NAME_SZ 32 to channel_commands.h
>   * Renamed a few variables to be more readable.
>   * Added returns in a few places if failed to get info on domain.
>   * Fixed power_manager_init to keep track of num_freqs for each core.
>   * In power_manager_scale_core_med(), changed a hardcoded '5' to instead
>     be calculated from the centre of the frequency list
>     (global_core_freq_info[core_num].num_freqs / 2)
> 
> V5 patchset changes:
>   * Removed most of the #ifdef I40_PMD as it will be applicable to
>     other PMDs in the future.
>   * Changed the parameter of rte_pmd_i40e_query_vfid_by_mac from a uint64
>     to a const struct ether_addr *, rather than casting it later in the
>     function.
> 
> V4 patchset changes:
>   * None, re-post to mailing list under the correct email thread.
> 
> V3 patchset changes:
>   * Changed to using is_same_ether_addr() instead of looping through
>     the mac address bytes to compare them.
>   * Tweaked some comments and wording in the i40e patch after review.
>   * Added a patch to the set to add new i40e function to map file, so
>     as to allow shared library builds. The power library API needs a cleanup
>     in next release, so will add API/ABI warning for this cleanup in a
>     separate patch.
> 
> V2 patchset changes:
>   * Removed API's in ethdev layer.
>   * Now just a single new API in the i40e driver for mapping VF MAC to
>     VF index.
>   * Moved new function from rte_rxtx.c to rte_pmd_i40e.c
>   * Removed function for reading i40e register, moved to using the
>     standard stats API.
>   * Renamed i40e function to rte_pmd_i40e_query_vfid_by_mac
>   * Cleaned up policy generation code.
> 
> It's a modification of the vm_power_manager app that runs in the host, and
> the guest_vm_power_app example app that runs in the guest. This allows the
> guest to send down a policy to the host via virtio-serial, which then allows
> the host to scale up/down based on the criteria in the policy, resulting in
> quicker scale up/down than individual requests coming from the guest.
> It also means that the DPDK application running in the guest does not need
> to be modified in any way; it is unaware that its cores are being scaled
> up/down, reducing the effort in implementing a power-aware infrastructure.
> 
> The usage model is as follows:
> 1. Set up the VF's and assign to the guest in the usual way.
> 2. run vm_power_manager on the host, creating a channel to the guest.
> 3. Start the guest_vm_power_mgr app on the guest, which establishes
>    a virtio-serial channel to the host.
> 4. Send down the policy for the guest using the "send_policy now" command.
>    There is an example policy hard-coded into guest_vm_power_mgr.
> 5. Stop the guest_vm_power_mgr and run your normal power-unaware application.
> 6. Send traffic into the VFs at varying traffic rates.
>    Observe the frequency change on the host (turbostat -i 1)
> 
> The sequence of code changes is as follows:
> 
> A new function has been added to the i40e driver to allow mapping of
> a VF MAC to VF index.
> 
> Next we make an addition to librte_power that adds an extra command to allow
> the passing of a policy structure from the guest to the host. This struct
> contains information like busy/quiet hour, packet throughput thresholds, etc.
> 
> The next addition adds functionality to convert the virtual CPU (vcpu) IDs to
> physical CPU (pcpu) IDs so that the host can scale up/down the cores used
> in the guest.
> 
> The remaining patches are functionality to process the policy, and take action
> when the relevant trigger occurs to cause a frequency change.

Applied to dpdk/master, thanks.


* Re: [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core turbo APIs
  2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core turbo APIs David Hunt
  2017-09-18 18:20       ` Mcnamara, John
@ 2018-02-06 12:29       ` Mcnamara, John
  1 sibling, 0 replies; 34+ messages in thread
From: Mcnamara, John @ 2018-02-06 12:29 UTC (permalink / raw)
  To: Hunt, David, dev; +Cc: Hunt, David



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of David Hunt
> Sent: Wednesday, September 13, 2017 11:44 AM
> To: dev@dpdk.org
> Cc: Hunt, David <david.hunt@intel.com>
> Subject: [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core
> turbo APIs
> 
> Signed-off-by: David Hunt <david.hunt@intel.com>

Acked-by: John McNamara <john.mcnamara@intel.com>


Thread overview: 34+ messages
2017-08-22 16:11 [dpdk-dev] [PATCH v1 0/4] add per-core Turbo Boost capability David Hunt
2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 1/4] lib/librte_power: add per-core turbo capability David Hunt
2017-09-13 10:44   ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability David Hunt
2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 1/4] lib/librte_power: add turbo boost API David Hunt
2017-10-03 14:08       ` [dpdk-dev] [PATCH v3 0/9] Policy Based Power Control for Guest David Hunt
2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 1/9] net/i40e: add API to convert VF MAC to VF id David Hunt
2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 2/9] lib/librte_power: add extra msg type for policies David Hunt
2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 3/9] examples/vm_power_mgr: add vcpu to pcpu mapping David Hunt
2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 4/9] examples/vm_power_mgr: add scale to medium freq fn David Hunt
2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 5/9] examples/vm_power_mgr: add policy to channels David Hunt
2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 6/9] examples/vm_power_mgr: add port initialisation David Hunt
2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 7/9] power: add send channel msg function to map file David Hunt
2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 8/9] examples/guest_cli: add send policy to host David Hunt
2017-10-03 14:08         ` [dpdk-dev] [PATCH v3 9/9] examples/vm_power_mgr: set MAC address of VF David Hunt
2017-10-11 16:18       ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest David Hunt
2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 1/9] net/i40e: add API to convert VF MAC to VF id David Hunt
2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 2/9] lib/librte_power: add extra msg type for policies David Hunt
2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 3/9] examples/vm_power_mgr: add vcpu to pcpu mapping David Hunt
2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 4/9] examples/vm_power_mgr: add scale to medium freq fn David Hunt
2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 5/9] examples/vm_power_mgr: add policy to channels David Hunt
2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 6/9] examples/vm_power_mgr: add port initialisation David Hunt
2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 7/9] power: add send channel msg function to map file David Hunt
2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 8/9] examples/guest_cli: add send policy to host David Hunt
2017-10-11 16:18         ` [dpdk-dev] [PATCH v9 9/9] examples/vm_power_mgr: set MAC address of VF David Hunt
2017-10-12  0:23         ` [dpdk-dev] [PATCH v9 0/9] Policy Based Power Control for Guest Ferruh Yigit
2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 2/4] examples/vm_power_manager: add per-core turbo David Hunt
2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 3/4] examples/vm_power_cli_guest: " David Hunt
2017-09-13 10:44     ` [dpdk-dev] [PATCH v2 4/4] doc/power: add information on per-core turbo APIs David Hunt
2017-09-18 18:20       ` Mcnamara, John
2018-02-06 12:29       ` Mcnamara, John
2017-09-22 14:36     ` [dpdk-dev] [PATCH v2 0/4] add per-core Turbo Boost capability Thomas Monjalon
2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 2/4] examples/vm_power_manager: add per-core turbo David Hunt
2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 3/4] examples/vm_power_cli_guest: " David Hunt
2017-08-22 16:11 ` [dpdk-dev] [PATCH v1 4/4] lib: limit turbo to particular models of CPU David Hunt
