DPDK patches and discussions
* [dpdk-dev] [PATCH v1 0/6] examples/vm_power: 100% Busy Polling
@ 2018-06-07  7:36 David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 1/6] examples/vm_power: add check for port count David Hunt
                   ` (5 more replies)
  0 siblings, 6 replies; 46+ messages in thread
From: David Hunt @ 2018-06-07  7:36 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

This patch set adds the capability to do out-of-band power
monitoring on a system by detecting when a core is doing 100%
busy polling, but not handling any packets.

It uses a thread to monitor the branch counters in the targeted
cores, and calculates the branch ratio of the running code.

If the branch ratio is low (<0.01), then
the code is most likely running in a tight poll loop and doing
nothing, i.e. receiving no packets. In this case we scale down
the frequency of that core.

If the branch ratio is higher (>0.01), then it is likely that
the code is receiving and processing packets. In this case, we
scale up the frequency of that core.
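
The decision described above can be sketched roughly as follows. This is
only an illustration, not the code from the series: the names
decide_scaling, scale_action and BUSY_POLL_RATIO_THRESHOLD are hypothetical.
The ratio is scaled as misses * 100 / hits, as patch 3/6 does, and the
threshold is the 0.01 quoted in this cover letter (patch 3/6 itself defines
RATIO_THRESHOLD as 0.03).

```c
#include <stdint.h>

/* Hypothetical constant: the 0.01 figure quoted in the cover letter. */
#define BUSY_POLL_RATIO_THRESHOLD 0.01f

/* Hypothetical result type for this sketch. */
enum scale_action { SCALE_SKIP, SCALE_DOWN, SCALE_UP };

/*
 * Decide what to do with a core, given the branch hit and branch miss
 * deltas observed over one sampling interval.
 */
static enum scale_action
decide_scaling(uint64_t branch_hits, uint64_t branch_misses)
{
	float ratio;

	/* No branches counted: nothing to base a decision on. */
	if (branch_hits == 0)
		return SCALE_SKIP;

	/* Same scaling as patch 3/6: misses * 100 / hits. */
	ratio = (float)branch_misses * 100.0f / (float)branch_hits;

	/* A low ratio suggests an empty poll loop, so scale down. */
	if (ratio < BUSY_POLL_RATIO_THRESHOLD)
		return SCALE_DOWN;

	/* Otherwise assume packets are being processed and scale up. */
	return SCALE_UP;
}
```

A caller would map SCALE_DOWN and SCALE_UP onto whatever frequency
scaling calls it has available, and ignore SCALE_SKIP.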

The cpu counters are read via /dev/cpu/x/msr, which requires the
msr kernel module to be loaded. Because this method is used,
the patch set is implemented with one file for x86 systems, and
another for non-x86 systems, with conditional compilation in
the Makefile. The non-x86 functions are stubs, and do not
currently implement any functionality.

The vm_power_manager app has been modified to take a new parameter
   --core-list or -l
which takes a list of cores in a comma-separated list format,
e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
These cores will then be enabled for oob monitoring. When the
OOB monitoring thread starts, it reads the branch hits/miss
counters of each monitored core, and scales up/down accordingly.
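
As an illustration of the list format, here is a minimal standalone sketch
of how such a list could be expanded. It is not the parse_set() function
added in patch 2/6; expand_core_list is a hypothetical name, and unlike
parse_set() this sketch rejects blanks and reversed ranges such as 7-5.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Expand a list such as "1,3,5-7,9" into set[], marking set[i] = 1 for
 * every listed core. Returns the number of distinct cores marked, or -1
 * on a malformed list or a core number >= num.
 */
static int
expand_core_list(const char *list, uint8_t set[], unsigned int num)
{
	const char *p = list;
	int count = 0;

	memset(set, 0, num * sizeof(set[0]));

	for (;;) {
		char *end;
		unsigned long lo, hi, i;

		/* Every element must start with a digit. */
		if (!isdigit((unsigned char)*p))
			return -1;
		errno = 0;
		lo = strtoul(p, &end, 10);
		if (errno != 0)
			return -1;
		hi = lo;
		p = end;

		/* Optional "-N" turns the element into a range. */
		if (*p == '-') {
			p++;
			if (!isdigit((unsigned char)*p))
				return -1;
			errno = 0;
			hi = strtoul(p, &end, 10);
			if (errno != 0)
				return -1;
			p = end;
		}
		if (lo > hi || hi >= num)
			return -1;

		for (i = lo; i <= hi; i++) {
			if (!set[i])
				count++;
			set[i] = 1;
		}

		/* An element ends at ',' (more follow) or end of string. */
		if (*p == '\0')
			return count;
		if (*p != ',')
			return -1;
		p++;
	}
}
```

With a 16-entry set, the list 1,3,5-7,9 marks entries 1, 3, 5, 6, 7 and 9
and leaves the rest at zero.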

[1/6] examples/vm_power: add check for port count
[2/6] examples/vm_power: add core list parameter
[3/6] examples/vm_power: add oob monitoring functions
[4/6] examples/vm_power: allow greater than 64 cores
[5/6] examples/vm_power: add thread for oob core monitor
[6/6] examples/vm_power: add port-list to command line


* [dpdk-dev] [PATCH v1 1/6] examples/vm_power: add check for port count
  2018-06-07  7:36 [dpdk-dev] [PATCH v1 0/6] examples/vm_power: 100% Busy Polling David Hunt
@ 2018-06-07  7:37 ` David Hunt
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 2/6] examples/vm_power: add core list parameter David Hunt
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 46+ messages in thread
From: David Hunt @ 2018-06-07  7:37 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

If we don't pass any ports to the app, we don't need to create
any mempools, and we don't need to init any ports.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/main.c | 81 +++++++++++++++++---------------
 1 file changed, 43 insertions(+), 38 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index c9805a461..043b374bc 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -280,51 +280,56 @@ main(int argc, char **argv)
 
 	nb_ports = rte_eth_dev_count_avail();
 
-	mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
-		MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+	if (nb_ports > 0) {
+		mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL",
+				NUM_MBUFS * nb_ports, MBUF_CACHE_SIZE, 0,
+				RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
 
-	if (mbuf_pool == NULL)
-		rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+		if (mbuf_pool == NULL)
+			rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
 
-	/* Initialize ports. */
-	RTE_ETH_FOREACH_DEV(portid) {
-		struct ether_addr eth;
-		int w, j;
-		int ret;
+		/* Initialize ports. */
+		RTE_ETH_FOREACH_DEV(portid) {
+			struct ether_addr eth;
+			int w, j;
+			int ret;
 
-		if ((enabled_port_mask & (1 << portid)) == 0)
-			continue;
+			if ((enabled_port_mask & (1 << portid)) == 0)
+				continue;
 
-		eth.addr_bytes[0] = 0xe0;
-		eth.addr_bytes[1] = 0xe0;
-		eth.addr_bytes[2] = 0xe0;
-		eth.addr_bytes[3] = 0xe0;
-		eth.addr_bytes[4] = portid + 0xf0;
+			eth.addr_bytes[0] = 0xe0;
+			eth.addr_bytes[1] = 0xe0;
+			eth.addr_bytes[2] = 0xe0;
+			eth.addr_bytes[3] = 0xe0;
+			eth.addr_bytes[4] = portid + 0xf0;
 
-		if (port_init(portid, mbuf_pool) != 0)
-			rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
+			if (port_init(portid, mbuf_pool) != 0)
+				rte_exit(EXIT_FAILURE,
+					"Cannot init port %"PRIu8 "\n",
 					portid);
 
-		for (w = 0; w < MAX_VFS; w++) {
-			eth.addr_bytes[5] = w + 0xf0;
-
-			ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
-						w, &eth);
-			if (ret == -ENOTSUP)
-				ret = rte_pmd_i40e_set_vf_mac_addr(portid,
-						w, &eth);
-			if (ret == -ENOTSUP)
-				ret = rte_pmd_bnxt_set_vf_mac_addr(portid,
-						w, &eth);
-
-			switch (ret) {
-			case 0:
-				printf("Port %d VF %d MAC: ",
-						portid, w);
-				for (j = 0; j < 6; j++) {
-					printf("%02x", eth.addr_bytes[j]);
-					if (j < 5)
-						printf(":");
+			for (w = 0; w < MAX_VFS; w++) {
+				eth.addr_bytes[5] = w + 0xf0;
+
+				ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
+							w, &eth);
+				if (ret == -ENOTSUP)
+					ret = rte_pmd_i40e_set_vf_mac_addr(
+							portid, w, &eth);
+				if (ret == -ENOTSUP)
+					ret = rte_pmd_bnxt_set_vf_mac_addr(
+							portid, w, &eth);
+
+				switch (ret) {
+				case 0:
+					printf("Port %d VF %d MAC: ",
+							portid, w);
+					for (j = 0; j < 5; j++) {
+						printf("%02x:",
+							eth.addr_bytes[j]);
+					}
+					printf("%02x\n", eth.addr_bytes[5]);
+					break;
 				}
 				printf("\n");
 				break;
-- 
2.17.0


* [dpdk-dev] [PATCH v1 2/6] examples/vm_power: add core list parameter
  2018-06-07  7:36 [dpdk-dev] [PATCH v1 0/6] examples/vm_power: 100% Busy Polling David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 1/6] examples/vm_power: add check for port count David Hunt
@ 2018-06-07  7:37 ` David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 3/6] examples/vm_power: add oob monitoring functions David Hunt
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-07  7:37 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add the '-l' command line parameter (also --core-list)
so the user can now pass --core-list=4,6,8-10 and it will
expand out to 4,6,8,9,10 using the parse function provided
in parse.c (parse_set).

This list of cores is then used to enable out-of-band monitoring
to scale up and down these cores based on the ratio of branch
hits versus branch misses. The ratio will be low when a poll
loop is spinning with no packets being received, so the frequency
will be scaled down.

Also, as part of this change, we introduce a core_info struct
which keeps information on each core in the system, and whether
we're doing out of band monitoring on them.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/Makefile        |  2 +-
 examples/vm_power_manager/main.c          | 34 ++++++++-
 examples/vm_power_manager/parse.c         | 93 +++++++++++++++++++++++
 examples/vm_power_manager/parse.h         | 20 +++++
 examples/vm_power_manager/power_manager.c | 31 ++++++++
 examples/vm_power_manager/power_manager.h | 20 +++++
 6 files changed, 197 insertions(+), 3 deletions(-)
 create mode 100644 examples/vm_power_manager/parse.c
 create mode 100644 examples/vm_power_manager/parse.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index ef2a9f959..0c925967c 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -19,7 +19,7 @@ APP = vm_power_mgr
 
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
-SRCS-y += channel_monitor.c
+SRCS-y += channel_monitor.c parse.c
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 043b374bc..cc2a1289c 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "parse.h"
 #include <rte_pmd_ixgbe.h>
 #include <rte_pmd_i40e.h>
 #include <rte_pmd_bnxt.h>
@@ -135,18 +136,22 @@ parse_portmask(const char *portmask)
 static int
 parse_args(int argc, char **argv)
 {
-	int opt, ret;
+	int opt, ret, cnt, i;
 	char **argvopt;
+	uint16_t *oob_enable;
 	int option_index;
 	char *prgname = argv[0];
+	struct core_info *ci;
 	static struct option lgopts[] = {
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
+		{ "core-list", optional_argument, 0, 'l'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
+	ci = get_core_info();
 
-	while ((opt = getopt_long(argc, argvopt, "p:q:T:",
+	while ((opt = getopt_long(argc, argvopt, "l:p:q:T:",
 				  lgopts, &option_index)) != EOF) {
 
 		switch (opt) {
@@ -158,6 +163,27 @@ parse_args(int argc, char **argv)
 				return -1;
 			}
 			break;
+		case 'l':
+			oob_enable = malloc(ci->core_count * sizeof(uint16_t));
+			if (oob_enable == NULL) {
+				printf("Error - Unable to allocate memory\n");
+				return -1;
+			}
+			cnt = parse_set(optarg, oob_enable, ci->core_count);
+			if (cnt < 0) {
+				printf("Invalid core-list - [%s]\n",
+						optarg);
+				break;
+			}
+			for (i = 0; i < ci->core_count; i++) {
+				if (oob_enable[i]) {
+					printf("***Using core %d\n", i);
+					ci->cd[i].oob_enabled = 1;
+					ci->cd[i].global_enabled_cpus = 1;
+				}
+			}
+			free(oob_enable);
+			break;
 		/* long options */
 		case 0:
 			break;
@@ -263,6 +289,10 @@ main(int argc, char **argv)
 	uint16_t portid;
 
 
+	ret = core_info_init();
+	if (ret < 0)
+		rte_panic("Cannot allocate core info\n");
+
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
 		rte_panic("Cannot init EAL\n");
diff --git a/examples/vm_power_manager/parse.c b/examples/vm_power_manager/parse.c
new file mode 100644
index 000000000..9de15c4a7
--- /dev/null
+++ b/examples/vm_power_manager/parse.c
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation.
+ * Copyright(c) 2014 6WIND S.A.
+ */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <syslog.h>
+#include <ctype.h>
+#include <limits.h>
+#include <errno.h>
+#include <getopt.h>
+#include <dlfcn.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <dirent.h>
+#include <rte_eal.h>
+#include <rte_log.h>
+#include "parse.h"
+
+/*
+ * Parse elem, the elem could be single number/range or group
+ * 1) A single number elem, it's just a simple digit. e.g. 9
+ * 2) A single range elem, two digits with a '-' between. e.g. 2-6
+ * 3) A group elem, combines multiple 1) or 2) e.g 0,2-4,6
+ *    Within group, '-' used for a range separator;
+ *                       ',' used for a single number.
+ */
+int
+parse_set(const char *input, uint16_t set[], unsigned int num)
+{
+	unsigned int idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned int min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only digit or left bracket is qualify for start point */
+	if (!isdigit(*str) || *str == '\0')
+		return -1;
+
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = num;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-' and ',' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == num)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == '\0')) {
+			max = idx;
+
+			if (min == num)
+				min = idx;
+
+			for (idx = RTE_MIN(min, max);
+					idx <= RTE_MAX(min, max); idx++) {
+				set[idx] = 1;
+			}
+			min = num;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0');
+
+	return str - input;
+}
diff --git a/examples/vm_power_manager/parse.h b/examples/vm_power_manager/parse.h
new file mode 100644
index 000000000..a5971e9a2
--- /dev/null
+++ b/examples/vm_power_manager/parse.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef PARSE_H_
+#define PARSE_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+int
+parse_set(const char *, uint16_t [], unsigned int);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* PARSE_H_ */
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 35db25591..a7849e48a 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -12,6 +12,7 @@
 #include <dirent.h>
 #include <errno.h>
 
+#include <sys/sysinfo.h>
 #include <sys/types.h>
 
 #include <rte_log.h>
@@ -54,6 +55,7 @@ struct freq_info {
 
 static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
 
+struct core_info ci;
 static uint64_t global_enabled_cpus;
 
 #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
@@ -76,6 +78,35 @@ set_host_cpus_mask(void)
 	return num_cpus;
 }
 
+struct core_info *
+get_core_info(void)
+{
+	return &ci;
+}
+
+int
+core_info_init(void)
+{
+	struct core_info *ci;
+	int i;
+
+	ci = get_core_info();
+
+	ci->core_count = get_nprocs_conf();
+	ci->cd = malloc(ci->core_count * sizeof(struct core_details));
+	if (!ci->cd) {
+		RTE_LOG(ERR, POWER_MANAGER, "Failed to allocate memory for core info.");
+		return -1;
+	}
+	for (i = 0; i < ci->core_count; i++) {
+		ci->cd[i].global_enabled_cpus = 1;
+		ci->cd[i].oob_enabled = 0;
+		ci->cd[i].msr_fd = 0;
+	}
+	printf("%d cores in system\n", ci->core_count);
+	return 0;
+}
+
 int
 power_manager_init(void)
 {
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index 8a8a84aa4..45385de37 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -8,6 +8,26 @@
 #ifdef __cplusplus
 extern "C" {
 #endif
+struct core_details {
+	uint64_t last_branches;
+	uint64_t last_branch_misses;
+	uint16_t global_enabled_cpus;
+	uint16_t oob_enabled;
+	int msr_fd;
+};
+
+struct core_info {
+	uint16_t core_count;
+	struct core_details *cd;
+};
+
+struct core_info *
+get_core_info(void);
+
+int
+core_info_init(void);
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
 
 /* Maximum number of CPUS to manage */
 #define POWER_MGR_MAX_CPUS 64
-- 
2.17.0


* [dpdk-dev] [PATCH v1 3/6] examples/vm_power: add oob monitoring functions
  2018-06-07  7:36 [dpdk-dev] [PATCH v1 0/6] examples/vm_power: 100% Busy Polling David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 1/6] examples/vm_power: add check for port count David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 2/6] examples/vm_power: add core list parameter David Hunt
@ 2018-06-07  7:37 ` David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 4/6] examples/vm_power: allow greater than 64 cores David Hunt
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-07  7:37 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

This patch introduces the out-of-band (oob) core monitoring
functions.

The functions are similar to the channel manager functions.
There are functions to add and remove cores from the
list of cores being monitored. There is a function to initialise
the monitor setup, run the monitor thread, and exit the monitor.

The monitor thread runs in its own lcore, and is separate
from the channel monitor, which is epoll based.
This thread is timer based. It loops through all monitored cores,
calculates the branch ratio, scales up or down the core, then
sleeps for an interval (~250 uS).
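
The shape of such a timer-based loop can be sketched as below. This is a
simplified illustration rather than the run_branch_monitor() code in this
patch: monitor_loop, monitor_stop, core_policy_fn and the demo_* names are
all hypothetical, and the per-core work is passed in as a callback so the
sketch stays self-contained.

```c
#include <signal.h>
#include <unistd.h>

/* Cleared by monitor_stop() to make the loop exit. */
static volatile sig_atomic_t monitor_running = 1;

/* Hypothetical per-core callback, invoked once per pass. */
typedef void (*core_policy_fn)(unsigned int core, void *arg);

static void
monitor_stop(void)
{
	monitor_running = 0;
}

/*
 * Sleep for interval_us, then visit every monitored core, until
 * monitor_stop() is called. Returns the number of passes completed.
 */
static unsigned long
monitor_loop(const unsigned char *monitored, unsigned int core_count,
		unsigned int interval_us, core_policy_fn policy, void *arg)
{
	unsigned long passes = 0;
	unsigned int core;

	while (monitor_running) {
		usleep(interval_us);
		for (core = 0; core < core_count; core++) {
			if (monitored[core])
				policy(core, arg);
		}
		passes++;
	}
	return passes;
}

/* Example state and policy: count calls, stop the loop at a limit. */
struct demo_state {
	unsigned int calls;
	unsigned int limit;
};

static void
demo_policy(unsigned int core, void *arg)
{
	struct demo_state *st = arg;

	(void)core;
	if (++st->calls >= st->limit)
		monitor_stop();
}
```

In a real monitor the callback would read the counters for the core and
apply the scaling policy; demo_policy only exists so the loop can be
exercised and stopped.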

The method it uses to read the branch counters is a pread on the
/dev/cpu/x/msr file, so the 'msr' kernel module needs to be loaded.
Also, since the msr.h file has been made unavailable in recent
kernels, we have #defines for the relevant MSRs included in the
code.
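
For reference, reading one MSR through that device file can be sketched
like this. The helper names msr_path and read_msr are hypothetical and not
part of this patch, which keeps the file descriptor open per core instead
of reopening it; the IA32_PERFCTR0 value is the one defined in this patch.
Opening the file normally needs root privileges.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Counter register number, as defined in this patch. */
#define IA32_PERFCTR0 0xc1

/* Build "/dev/cpu/<core>/msr"; returns 0 on success, -1 if truncated. */
static int
msr_path(char *buf, size_t len, unsigned int core)
{
	int n = snprintf(buf, len, "/dev/cpu/%u/msr", core);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read one 64-bit MSR from the given core. The register number is used
 * as the file offset. Returns 0 on success, -1 on any failure (for
 * example when the msr module is not loaded).
 */
static int
read_msr(unsigned int core, uint32_t reg, uint64_t *value)
{
	char path[64];
	ssize_t n;
	int fd;

	if (msr_path(path, sizeof(path), core) < 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, value, sizeof(*value), reg);
	close(fd);

	return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

A call such as read_msr(2, IA32_PERFCTR0, &count) would then return the
current value of the first counter on core 2, provided the counter has
been programmed beforehand.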

The makefile has a switch for x86 and non-x86 platforms,
and compiles stub functions for non-x86 platforms.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/Makefile          |   5 +
 examples/vm_power_manager/oob_monitor.h     |  68 +++++
 examples/vm_power_manager/oob_monitor_nop.c |  38 +++
 examples/vm_power_manager/oob_monitor_x86.c | 282 ++++++++++++++++++++
 4 files changed, 393 insertions(+)
 create mode 100644 examples/vm_power_manager/oob_monitor.h
 create mode 100644 examples/vm_power_manager/oob_monitor_nop.c
 create mode 100644 examples/vm_power_manager/oob_monitor_x86.c

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index 0c925967c..13a5205ba 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -20,6 +20,11 @@ APP = vm_power_mgr
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
 SRCS-y += channel_monitor.c parse.c
+ifeq ($(CONFIG_RTE_ARCH_X86_64),y)
+SRCS-y += oob_monitor_x86.c
+else
+SRCS-y += oob_monitor_nop.c
+endif
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/oob_monitor.h b/examples/vm_power_manager/oob_monitor.h
new file mode 100644
index 000000000..b96e08df7
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef OOB_MONITOR_H_
+#define OOB_MONITOR_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Branch Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int branch_monitor_init(void);
+
+/**
+ * Run the OOB branch monitor; loops until branch_monitor_exit() is called.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_branch_monitor(void);
+
+/**
+ * Exit the OOB Branch Monitor.
+ *
+ * @return
+ *  None
+ */
+void branch_monitor_exit(void);
+
+/**
+ * Add a core to the list of cores to monitor.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_core_to_monitor(int core);
+
+/**
+ * Remove a previously added core from core list.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_core_from_monitor(int core);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* OOB_MONITOR_H_ */
diff --git a/examples/vm_power_manager/oob_monitor_nop.c b/examples/vm_power_manager/oob_monitor_nop.c
new file mode 100644
index 000000000..7e7b8bc14
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_nop.c
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#include "oob_monitor.h"
+
+void branch_monitor_exit(void)
+{
+}
+
+__attribute__((unused)) static float
+apply_policy(__attribute__((unused)) int core)
+{
+	return 0.0;
+}
+
+int
+add_core_to_monitor(__attribute__((unused)) int core)
+{
+	return 0;
+}
+
+int
+remove_core_from_monitor(__attribute__((unused)) int core)
+{
+	return 0;
+}
+
+int
+branch_monitor_init(void)
+{
+	return 0;
+}
+
+void
+run_branch_monitor(void)
+{
+}
diff --git a/examples/vm_power_manager/oob_monitor_x86.c b/examples/vm_power_manager/oob_monitor_x86.c
new file mode 100644
index 000000000..485ec5e3f
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_x86.c
@@ -0,0 +1,282 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <signal.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/epoll.h>
+#include <sys/queue.h>
+#include <sys/time.h>
+#include <fcntl.h>
+
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_ethdev.h>
+#include <rte_pmd_i40e.h>
+
+#include <libvirt/libvirt.h>
+#include "oob_monitor.h"
+#include "power_manager.h"
+#include "channel_manager.h"
+
+#include <rte_log.h>
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+static volatile unsigned run_loop = 1;
+static uint64_t g_branches, g_branch_misses;
+static int g_active;
+
+void branch_monitor_exit(void)
+{
+	run_loop = 0;
+}
+
+/* Number of microseconds between each poll */
+#define INTERVAL 100
+#define PRINT_LOOP_COUNT (1000000/INTERVAL)
+#define RATIO_THRESHOLD 0.03
+#define IA32_PERFEVTSEL0 0x186
+#define IA32_PERFEVTSEL1 0x187
+#define IA32_PERFCTR0 0xc1
+#define IA32_PERFCTR1 0xc2
+#define IA32_PERFEVT_BRANCH_HITS 0x05300c4
+#define IA32_PERFEVT_BRANCH_MISS 0x05300c5
+
+static float
+apply_policy(int core)
+{
+	struct core_info *ci;
+	uint64_t counter;
+	uint64_t branches, branch_misses;
+	uint32_t last_branches, last_branch_misses;
+	int hits_diff, miss_diff;
+	float ratio;
+	int ret;
+
+	g_active = 0;
+	ci = get_core_info();
+
+	last_branches = ci->cd[core].last_branches;
+	last_branch_misses = ci->cd[core].last_branch_misses;
+
+	ret = pread(ci->cd[core].msr_fd, &counter,
+			sizeof(counter), IA32_PERFCTR0);
+	if (ret < 0)
+		RTE_LOG(ERR, POWER_MANAGER,
+				"unable to read counter for core %u\n",
+				core);
+	branches = counter;
+
+	ret = pread(ci->cd[core].msr_fd, &counter,
+			sizeof(counter), IA32_PERFCTR1);
+	if (ret < 0)
+		RTE_LOG(ERR, POWER_MANAGER,
+				"unable to read counter for core %u\n",
+				core);
+	branch_misses = counter;
+
+
+	ci->cd[core].last_branches = branches;
+	ci->cd[core].last_branch_misses = branch_misses;
+
+	hits_diff = (int)branches - (int)last_branches;
+	if (hits_diff <= 0) {
+		/* Likely a counter overflow condition, skip this round */
+		return -1.0;
+	}
+
+	miss_diff = (int)branch_misses - (int)last_branch_misses;
+	if (miss_diff <= 0) {
+		/* Likely a counter overflow condition, skip this round */
+		return -1.0;
+	}
+
+	g_branches = hits_diff;
+	g_branch_misses = miss_diff;
+
+	if (hits_diff < (INTERVAL*100)) {
+		/* Likely no workload running on this core. Skip. */
+		return -1.0;
+	}
+
+	ratio = (float)miss_diff * (float)100 / (float)hits_diff;
+
+	if (ratio < RATIO_THRESHOLD)
+		power_manager_scale_core_min(core);
+	else
+		power_manager_scale_core_max(core);
+
+	g_active = 1;
+	return ratio;
+}
+
+int
+add_core_to_monitor(int core)
+{
+	struct core_info *ci;
+	char proc_file[UNIX_PATH_MAX];
+	int ret;
+
+	ci = get_core_info();
+
+	if (core < ci->core_count) {
+		long setup;
+
+		snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core);
+		ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		/*
+		 * Set up branch counters
+		 */
+		setup = IA32_PERFEVT_BRANCH_HITS;
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL0);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		setup = IA32_PERFEVT_BRANCH_MISS;
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL1);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		/*
+		 * Close the file and re-open as read only so
+		 * as not to hog the resource
+		 */
+		close(ci->cd[core].msr_fd);
+		ci->cd[core].msr_fd = open(proc_file, O_RDONLY);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		ci->cd[core].oob_enabled = 1;
+	}
+	return 0;
+}
+
+int
+remove_core_from_monitor(int core)
+{
+	struct core_info *ci;
+	char proc_file[UNIX_PATH_MAX];
+	int ret;
+
+	ci = get_core_info();
+
+	if (ci->cd[core].oob_enabled) {
+		long setup;
+
+		/*
+		 * close the msr file, then reopen rw so we can
+		 * disable the counters
+		 */
+		if (ci->cd[core].msr_fd != 0)
+			close(ci->cd[core].msr_fd);
+		snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core);
+		ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		setup = 0x0; /* clear event */
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL0);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		setup = 0x0; /* clear event */
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL1);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+
+		close(ci->cd[core].msr_fd);
+		ci->cd[core].msr_fd = 0;
+		ci->cd[core].oob_enabled = 0;
+	}
+	return 0;
+}
+
+int
+branch_monitor_init(void)
+{
+	return 0;
+}
+
+void
+run_branch_monitor(void)
+{
+	struct core_info *ci;
+	int print = 0;
+	float ratio;
+	int printed;
+	int reads = 0;
+
+	ci = get_core_info();
+
+	while (run_loop) {
+
+		if (!run_loop)
+			break;
+		usleep(INTERVAL);
+		int j;
+		print++;
+		printed = 0;
+		for (j = 0; j < ci->core_count; j++) {
+			if (ci->cd[j].oob_enabled) {
+				ratio = apply_policy(j);
+				if ((print > PRINT_LOOP_COUNT) && (g_active)) {
+					printf("  %d: %.4f {%lu} {%d}", j,
+							ratio, g_branches,
+							reads);
+					printed = 1;
+					reads = 0;
+				} else {
+					reads++;
+				}
+			}
+		}
+		if (print > PRINT_LOOP_COUNT) {
+			if (printed)
+				printf("\n");
+			print = 0;
+		}
+	}
+}
-- 
2.17.0


* [dpdk-dev] [PATCH v1 4/6] examples/vm_power: allow greater than 64 cores
  2018-06-07  7:36 [dpdk-dev] [PATCH v1 0/6] examples/vm_power: 100% Busy Polling David Hunt
                   ` (2 preceding siblings ...)
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 3/6] examples/vm_power: add oob monitoring functions David Hunt
@ 2018-06-07  7:37 ` David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 5/6] examples/vm_power: add thread for oob core monitor David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 6/6] examples/vm_power: add port-list to command line David Hunt
  5 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-07  7:37 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

To facilitate more info per core, change the global_cpu_mask
from a uint64_t to an array. This also removes the limit of
64 cores, allocating the array at run-time based on the number of
cores found in the system.
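
The difference between the two representations can be sketched as follows.
The names core_flag, mask_core_enabled, array_core_enabled and
alloc_core_flags are hypothetical and only illustrate the idea; the patch
itself stores the flag in the existing core_details struct.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-core record; the patch keeps more fields than this. */
struct core_flag {
	uint16_t enabled;
};

/* Mask-based check: a uint64_t can only describe cores 0..63. */
static int
mask_core_enabled(uint64_t mask, unsigned int core)
{
	if (core >= 64)
		return 0;
	return (int)((mask >> core) & 1);
}

/* Array-based check: valid for any core below the run-time count. */
static int
array_core_enabled(const struct core_flag *flags, unsigned int count,
		unsigned int core)
{
	if (core >= count)
		return 0;
	return flags[core].enabled != 0;
}

/* Allocate one zeroed record per core found at run-time. */
static struct core_flag *
alloc_core_flags(unsigned int count)
{
	return calloc(count, sizeof(struct core_flag));
}
```

With the array form, a 128-core system can flag core 100 as enabled,
which the 64-bit mask has no way to express.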

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/power_manager.c | 115 +++++++++++-----------
 1 file changed, 58 insertions(+), 57 deletions(-)

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index a7849e48a..4bdde23da 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -19,14 +19,14 @@
 #include <rte_power.h>
 #include <rte_spinlock.h>
 
+#include "channel_manager.h"
 #include "power_manager.h"
-
-#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+#include "oob_monitor.h"
 
 #define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \
-	if (core_num >= POWER_MGR_MAX_CPUS) \
+	if (core_num >= ci.core_count) \
 		return -1; \
-	if (!(global_enabled_cpus & (1ULL << core_num))) \
+	if (!(ci.cd[core_num].global_enabled_cpus)) \
 		return -1; \
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
 	ret = rte_power_freq_##DIRECTION(core_num); \
@@ -37,7 +37,7 @@
 	int i; \
 	for (i = 0; core_mask; core_mask &= ~(1 << i++)) { \
 		if ((core_mask >> i) & 1) { \
-			if (!(global_enabled_cpus & (1ULL << i))) \
+			if (!(ci.cd[i].global_enabled_cpus)) \
 				continue; \
 			rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
 			if (rte_power_freq_##DIRECTION(i) != 1) \
@@ -56,28 +56,9 @@ struct freq_info {
 static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
 
 struct core_info ci;
-static uint64_t global_enabled_cpus;
 
 #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
 
-static unsigned
-set_host_cpus_mask(void)
-{
-	char path[PATH_MAX];
-	unsigned i;
-	unsigned num_cpus = 0;
-
-	for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
-		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
-		if (access(path, F_OK) == 0) {
-			global_enabled_cpus |= 1ULL << i;
-			num_cpus++;
-		} else
-			return num_cpus;
-	}
-	return num_cpus;
-}
-
 struct core_info *
 get_core_info(void)
 {
@@ -110,38 +91,45 @@ core_info_init(void)
 int
 power_manager_init(void)
 {
-	unsigned int i, num_cpus, num_freqs;
-	uint64_t cpu_mask;
+	unsigned int i, num_cpus = 0, num_freqs = 0;
 	int ret = 0;
+	struct core_info *ci;
+
+	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
 
-	num_cpus = set_host_cpus_mask();
-	if (num_cpus == 0) {
-		RTE_LOG(ERR, POWER_MANAGER, "Unable to detected host CPUs, please "
-			"ensure that sufficient privileges exist to inspect sysfs\n");
+	ci = get_core_info();
+	if (!ci) {
+		RTE_LOG(ERR, POWER_MANAGER,
+				"Failed to get core info!\n");
 		return -1;
 	}
-	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
-	cpu_mask = global_enabled_cpus;
-	for (i = 0; cpu_mask; cpu_mask &= ~(1 << i++)) {
-		if (rte_power_init(i) < 0)
-			RTE_LOG(ERR, POWER_MANAGER,
-					"Unable to initialize power manager "
-					"for core %u\n", i);
-		num_freqs = rte_power_freqs(i, global_core_freq_info[i].freqs,
+
+	for (i = 0; i < ci->core_count; i++) {
+		if (ci->cd[i].global_enabled_cpus) {
+			if (rte_power_init(i) < 0)
+				RTE_LOG(ERR, POWER_MANAGER,
+						"Unable to initialize power manager "
+						"for core %u\n", i);
+			num_cpus++;
+			num_freqs = rte_power_freqs(i,
+					global_core_freq_info[i].freqs,
 					RTE_MAX_LCORE_FREQS);
-		if (num_freqs == 0) {
-			RTE_LOG(ERR, POWER_MANAGER,
-				"Unable to get frequency list for core %u\n",
-				i);
-			global_enabled_cpus &= ~(1 << i);
-			num_cpus--;
-			ret = -1;
+			if (num_freqs == 0) {
+				RTE_LOG(ERR, POWER_MANAGER,
+					"Unable to get frequency list for core %u\n",
+					i);
+				ci->cd[i].oob_enabled = 0;
+				ret = -1;
+			}
+			global_core_freq_info[i].num_freqs = num_freqs;
+
+			rte_spinlock_init(&global_core_freq_info[i].power_sl);
 		}
-		global_core_freq_info[i].num_freqs = num_freqs;
-		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+		if (ci->cd[i].oob_enabled)
+			add_core_to_monitor(i);
 	}
-	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs , enabled core mask:"
-					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	RTE_LOG(INFO, POWER_MANAGER, "Managing %u cores out of %u available host cores\n",
+			num_cpus, ci->core_count);
 	return ret;
 
 }
@@ -156,7 +144,7 @@ power_manager_get_current_frequency(unsigned core_num)
 				core_num, POWER_MGR_MAX_CPUS-1);
 		return -1;
 	}
-	if (!(global_enabled_cpus & (1ULL << core_num)))
+	if (!(ci.cd[core_num].global_enabled_cpus))
 		return 0;
 
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
@@ -175,15 +163,26 @@ power_manager_exit(void)
 {
 	unsigned int i;
 	int ret = 0;
+	struct core_info *ci;
 
-	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1 << i++)) {
-		if (rte_power_exit(i) < 0) {
-			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
-					"for core %u\n", i);
-			ret = -1;
+	ci = get_core_info();
+	if (!ci) {
+		RTE_LOG(ERR, POWER_MANAGER,
+				"Failed to get core info!\n");
+		return -1;
+	}
+
+	for (i = 0; i < ci->core_count; i++) {
+		if (ci->cd[i].global_enabled_cpus) {
+			if (rte_power_exit(i) < 0) {
+				RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+						"for core %u\n", i);
+				ret = -1;
+			}
+			ci->cd[i].global_enabled_cpus = 0;
 		}
+		remove_core_from_monitor(i);
 	}
-	global_enabled_cpus = 0;
 	return ret;
 }
 
@@ -299,10 +298,12 @@ int
 power_manager_scale_core_med(unsigned int core_num)
 {
 	int ret = 0;
+	struct core_info *ci;
 
+	ci = get_core_info();
 	if (core_num >= POWER_MGR_MAX_CPUS)
 		return -1;
-	if (!(global_enabled_cpus & (1ULL << core_num)))
+	if (!(ci->cd[core_num].global_enabled_cpus))
 		return -1;
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
 	ret = rte_power_set_freq(core_num,
-- 
2.17.0

* [dpdk-dev] [PATCH v1 5/6] examples/vm_power: add thread for oob core monitor
  2018-06-07  7:36 [dpdk-dev] [PATCH v1 0/6] examples/vm_power: 100% Busy Polling David Hunt
                   ` (3 preceding siblings ...)
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 4/6] examples/vm_power: allow greater than 64 cores David Hunt
@ 2018-06-07  7:37 ` David Hunt
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 6/6] examples/vm_power: add port-list to command line David Hunt
  5 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-07  7:37 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Change the app to require three cores, as the third core
is used to run the oob monitoring thread.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/main.c | 37 +++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index cc2a1289c..4c6b5a990 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "oob_monitor.h"
 #include "parse.h"
 #include <rte_pmd_ixgbe.h>
 #include <rte_pmd_i40e.h>
@@ -269,6 +270,17 @@ run_monitor(__attribute__((unused)) void *arg)
 	return 0;
 }
 
+static int
+run_core_monitor(__attribute__((unused)) void *arg)
+{
+	if (branch_monitor_init() < 0) {
+		printf("Unable to initialize core monitor\n");
+		return -1;
+	}
+	run_branch_monitor();
+	return 0;
+}
+
 static void
 sig_handler(int signo)
 {
@@ -287,12 +299,15 @@ main(int argc, char **argv)
 	unsigned int nb_ports;
 	struct rte_mempool *mbuf_pool;
 	uint16_t portid;
+	struct core_info *ci;
 
 
 	ret = core_info_init();
 	if (ret < 0)
 		rte_panic("Cannot allocate core info\n");
 
+	ci = get_core_info();
+
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
 		rte_panic("Cannot init EAL\n");
@@ -367,16 +382,23 @@ main(int argc, char **argv)
 		}
 	}
 
+	check_all_ports_link_status(enabled_port_mask);
+
 	lcore_id = rte_get_next_lcore(-1, 1, 0);
 	if (lcore_id == RTE_MAX_LCORE) {
-		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
+		RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
 				"application\n");
 		return 0;
 	}
-
-	check_all_ports_link_status(enabled_port_mask);
+	printf("Running channel monitor on lcore id %d\n", lcore_id);
 	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
 
+	lcore_id = rte_get_next_lcore(lcore_id, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
+				"application\n");
+		return 0;
+	}
 	if (power_manager_init() < 0) {
 		printf("Unable to initialize power manager\n");
 		return -1;
@@ -385,8 +407,17 @@ main(int argc, char **argv)
 		printf("Unable to initialize channel manager\n");
 		return -1;
 	}
+
+	printf("Running core monitor on lcore id %d\n", lcore_id);
+	rte_eal_remote_launch(run_core_monitor, NULL, lcore_id);
+
 	run_cli(NULL);
 
+	branch_monitor_exit();
+
 	rte_eal_mp_wait_lcore();
+
+	free(ci->cd);
+
 	return 0;
 }
-- 
2.17.0

* [dpdk-dev] [PATCH v1 6/6] examples/vm_power: add port-list to command line
  2018-06-07  7:36 [dpdk-dev] [PATCH v1 0/6] examples/vm_power: 100% Busy Polling David Hunt
                   ` (4 preceding siblings ...)
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 5/6] examples/vm_power: add thread for oob core monitor David Hunt
@ 2018-06-07  7:37 ` David Hunt
  5 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-07  7:37 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add the long form of -p, which is --port-list.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 4c6b5a990..4088861f1 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -147,6 +147,7 @@ parse_args(int argc, char **argv)
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
 		{ "core-list", optional_argument, 0, 'l'},
+		{ "port-list", optional_argument, 0, 'p'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
-- 
2.17.0

* [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling
  2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 1/6] examples/vm_power: add check for port count David Hunt
@ 2018-06-21 13:24   ` David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 1/8] examples/vm_power: add check for port count David Hunt
                       ` (8 more replies)
  0 siblings, 9 replies; 46+ messages in thread
From: David Hunt @ 2018-06-21 13:24 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

This patch set adds the capability to do out-of-band power
monitoring on a system. It uses a thread to monitor the branch
counters in the targeted cores, and calculates the branch ratio
of the running code.

If the branch ratio is low (<0.01), then
the code is most likely running in a tight poll loop and doing
nothing, i.e. receiving no packets. In this case we scale down
the frequency of that core.

If the branch ratio is higher (>0.01), then it is likely that
the code is receiving and processing packets. In this case, we
scale up the frequency of that core.
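
The scaling rule described above can be sketched in C as follows. This
is only an illustration: the function and enum names are hypothetical,
the ratio is assumed to be branch misses divided by branch hits over one
sampling interval, and the 0.01 threshold is the figure quoted in this
cover letter rather than a value taken from the code.

```c
#include <stdint.h>

/* Hypothetical names; 0.01 is the threshold quoted in the cover letter. */
#define BRANCH_RATIO_THRESHOLD 0.01f

enum scale_action { SCALE_SKIP, SCALE_DOWN, SCALE_UP };

/*
 * Decide how to scale a core from two successive samples of its
 * branch-hit and branch-miss counters.
 */
static enum scale_action
decide_scaling(uint64_t prev_hits, uint64_t cur_hits,
		uint64_t prev_miss, uint64_t cur_miss)
{
	uint64_t hits_diff, miss_diff;
	float ratio;

	/* A counter that wrapped or did not advance gives no usable delta. */
	if (cur_hits <= prev_hits || cur_miss < prev_miss)
		return SCALE_SKIP;

	hits_diff = cur_hits - prev_hits;
	miss_diff = cur_miss - prev_miss;
	ratio = (float)miss_diff / (float)hits_diff;

	/* Low ratio: tight poll loop, no packets. High ratio: real work. */
	return ratio < BRANCH_RATIO_THRESHOLD ? SCALE_DOWN : SCALE_UP;
}
```

Returning a "skip" result keeps a wrapped counter from being
misread as a very low or very high ratio.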

The cpu counters are read via /dev/cpu/x/msr, which requires the
msr kernel module to be loaded. Because this method is used,
the patch set is implemented with one file for x86 systems, and
another for non-x86 systems, with conditional compilation in
the Makefile. The non-x86 functions are stubs, and do not
currently implement any functionality.
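
As a minimal sketch of the access method described above, a register is
read from the msr device by using the register address as the file
offset of a pread() call. The helper names below are hypothetical and
are not the functions added by this patch set; opening the device
normally needs root privileges and the msr module.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

/* Build the per-core MSR device path, e.g. /dev/cpu/3/msr. */
static int
msr_path(unsigned int core, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/dev/cpu/%u/msr", core);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read one 64-bit register from an open msr file descriptor.
 * The register address is used as the file offset.
 * Returns 0 on success, -1 on error or short read.
 */
static int
read_msr(int fd, uint32_t reg, uint64_t *value)
{
	ssize_t n = pread(fd, value, sizeof(*value), (off_t)reg);

	return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

A caller would open the path from msr_path() with open(path, O_RDONLY)
and pass the descriptor to read_msr() with a counter address such as
0xc1.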

The vm_power_manager app has been modified to take a new parameter
   --core-list or -l
which takes a list of cores in a comma-separated list format,
e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
These cores will then be enabled for oob monitoring. When the
OOB monitoring thread starts, it reads the branch hits/miss
counters of each monitored core, and scales up/down accordingly.
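
To illustrate the list format, here is a simplified, self-contained
expander. It is a sketch only: expand_core_list is a hypothetical name,
and it is deliberately stricter and smaller than the parse_set function
in parse.c (no whitespace handling, no reversed ranges).

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Expand a list such as "1,3,5-7,9" into set[], where set[i] == 1
 * means core i was selected. Returns the number of cores selected,
 * or -1 on malformed input or an index >= num.
 */
static int
expand_core_list(const char *list, uint8_t set[], unsigned int num)
{
	const char *p = list;
	int count = 0;

	memset(set, 0, num);
	if (*p == '\0')
		return -1;

	while (*p != '\0') {
		char *end;
		unsigned long lo, hi, i;

		if (!isdigit((unsigned char)*p))
			return -1;
		errno = 0;
		lo = hi = strtoul(p, &end, 10);
		if (errno != 0)
			return -1;

		/* "a-b" selects every core from a to b inclusive */
		if (*end == '-') {
			p = end + 1;
			if (!isdigit((unsigned char)*p))
				return -1;
			hi = strtoul(p, &end, 10);
			if (errno != 0)
				return -1;
		}
		if (lo > hi || hi >= num)
			return -1;

		for (i = lo; i <= hi; i++) {
			if (!set[i])
				count++;
			set[i] = 1;
		}

		if (*end == ',') {
			end++;
			if (*end == '\0')
				return -1; /* trailing comma */
		} else if (*end != '\0') {
			return -1;
		}
		p = end;
	}
	return count;
}
```

With a 16-entry set, the input "1,3,5-7,9" marks entries 1, 3, 5, 6, 7
and 9.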

The guest_cli app has also been modified to allow sending of a
policy of type BRANCH_RATIO where all of the cores included in
the policy will be monitored by the vm_power_manager oob thread.

v2 changes:
   * Add the guest_cli patch into this patch set, including the
     ability to set the policy to BRANCH_RATIO.
     http://patches.dpdk.org/patch/40742/
   * When vm_power_manager receives a policy with type BRANCH_RATIO,
     add the relevant cores to the monitoring thread.

[1/8] examples/vm_power: add check for port count
[2/8] examples/vm_power: add core list parameter
[3/8] examples/vm_power: add oob monitoring functions
[4/8] examples/vm_power: allow greater than 64 cores
[5/8] examples/vm_power: add thread for oob core monitor
[6/8] examples/vm_power: add port-list to command line
[7/8] examples/vm_power: add branch ratio policy type
[8/8] examples/vm_power: add cli args to guest app

* [dpdk-dev] [PATCH v2 1/8] examples/vm_power: add check for port count
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
@ 2018-06-21 13:24     ` David Hunt
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 2/8] examples/vm_power: add core list parameter David Hunt
                       ` (7 subsequent siblings)
  8 siblings, 1 reply; 46+ messages in thread
From: David Hunt @ 2018-06-21 13:24 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

If we don't pass any ports to the app, we don't need to create
any mempools, and we don't need to init any ports.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/main.c | 81 +++++++++++++++++---------------
 1 file changed, 43 insertions(+), 38 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index c9805a461..043b374bc 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -280,51 +280,56 @@ main(int argc, char **argv)
 
 	nb_ports = rte_eth_dev_count_avail();
 
-	mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
-		MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+	if (nb_ports > 0) {
+		mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL",
+				NUM_MBUFS * nb_ports, MBUF_CACHE_SIZE, 0,
+				RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
 
-	if (mbuf_pool == NULL)
-		rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+		if (mbuf_pool == NULL)
+			rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
 
-	/* Initialize ports. */
-	RTE_ETH_FOREACH_DEV(portid) {
-		struct ether_addr eth;
-		int w, j;
-		int ret;
+		/* Initialize ports. */
+		RTE_ETH_FOREACH_DEV(portid) {
+			struct ether_addr eth;
+			int w, j;
+			int ret;
 
-		if ((enabled_port_mask & (1 << portid)) == 0)
-			continue;
+			if ((enabled_port_mask & (1 << portid)) == 0)
+				continue;
 
-		eth.addr_bytes[0] = 0xe0;
-		eth.addr_bytes[1] = 0xe0;
-		eth.addr_bytes[2] = 0xe0;
-		eth.addr_bytes[3] = 0xe0;
-		eth.addr_bytes[4] = portid + 0xf0;
+			eth.addr_bytes[0] = 0xe0;
+			eth.addr_bytes[1] = 0xe0;
+			eth.addr_bytes[2] = 0xe0;
+			eth.addr_bytes[3] = 0xe0;
+			eth.addr_bytes[4] = portid + 0xf0;
 
-		if (port_init(portid, mbuf_pool) != 0)
-			rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
+			if (port_init(portid, mbuf_pool) != 0)
+				rte_exit(EXIT_FAILURE,
+					"Cannot init port %"PRIu8 "\n",
 					portid);
 
-		for (w = 0; w < MAX_VFS; w++) {
-			eth.addr_bytes[5] = w + 0xf0;
-
-			ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
-						w, &eth);
-			if (ret == -ENOTSUP)
-				ret = rte_pmd_i40e_set_vf_mac_addr(portid,
-						w, &eth);
-			if (ret == -ENOTSUP)
-				ret = rte_pmd_bnxt_set_vf_mac_addr(portid,
-						w, &eth);
-
-			switch (ret) {
-			case 0:
-				printf("Port %d VF %d MAC: ",
-						portid, w);
-				for (j = 0; j < 6; j++) {
-					printf("%02x", eth.addr_bytes[j]);
-					if (j < 5)
-						printf(":");
+			for (w = 0; w < MAX_VFS; w++) {
+				eth.addr_bytes[5] = w + 0xf0;
+
+				ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
+							w, &eth);
+				if (ret == -ENOTSUP)
+					ret = rte_pmd_i40e_set_vf_mac_addr(
+							portid, w, &eth);
+				if (ret == -ENOTSUP)
+					ret = rte_pmd_bnxt_set_vf_mac_addr(
+							portid, w, &eth);
+
+				switch (ret) {
+				case 0:
+					printf("Port %d VF %d MAC: ",
+							portid, w);
+					for (j = 0; j < 5; j++) {
+						printf("%02x:",
+							eth.addr_bytes[j]);
+					}
+					printf("%02x\n", eth.addr_bytes[5]);
+					break;
 				}
 				printf("\n");
 				break;
-- 
2.17.1

* [dpdk-dev] [PATCH v2 2/8] examples/vm_power: add core list parameter
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 1/8] examples/vm_power: add check for port count David Hunt
@ 2018-06-21 13:24     ` David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 3/8] examples/vm_power: add oob monitoring functions David Hunt
                       ` (6 subsequent siblings)
  8 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-21 13:24 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add the '-l' command line parameter (long form --core-list),
so the user can now pass --core-list=4,6,8-10 and it will
expand out to 4,6,8,9,10 using the parse function provided
in parse.c (parse_set).

This list of cores is then used to enable out-of-band monitoring
to scale up and down these cores based on the ratio of branch
hits versus branch misses. The ratio will be low when a poll
loop is spinning with no packets being received, so the frequency
will be scaled down.

Also, as part of this change, we introduce a core_info struct
which keeps information on each core in the system, and whether
we're doing out of band monitoring on them.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/Makefile        |  2 +-
 examples/vm_power_manager/main.c          | 34 ++++++++-
 examples/vm_power_manager/parse.c         | 93 +++++++++++++++++++++++
 examples/vm_power_manager/parse.h         | 20 +++++
 examples/vm_power_manager/power_manager.c | 31 ++++++++
 examples/vm_power_manager/power_manager.h | 20 +++++
 6 files changed, 197 insertions(+), 3 deletions(-)
 create mode 100644 examples/vm_power_manager/parse.c
 create mode 100644 examples/vm_power_manager/parse.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index ef2a9f959..0c925967c 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -19,7 +19,7 @@ APP = vm_power_mgr
 
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
-SRCS-y += channel_monitor.c
+SRCS-y += channel_monitor.c parse.c
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 043b374bc..cc2a1289c 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "parse.h"
 #include <rte_pmd_ixgbe.h>
 #include <rte_pmd_i40e.h>
 #include <rte_pmd_bnxt.h>
@@ -135,18 +136,22 @@ parse_portmask(const char *portmask)
 static int
 parse_args(int argc, char **argv)
 {
-	int opt, ret;
+	int opt, ret, cnt, i;
 	char **argvopt;
+	uint16_t *oob_enable;
 	int option_index;
 	char *prgname = argv[0];
+	struct core_info *ci;
 	static struct option lgopts[] = {
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
+		{ "core-list", optional_argument, 0, 'l'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
+	ci = get_core_info();
 
-	while ((opt = getopt_long(argc, argvopt, "p:q:T:",
+	while ((opt = getopt_long(argc, argvopt, "l:p:q:T:",
 				  lgopts, &option_index)) != EOF) {
 
 		switch (opt) {
@@ -158,6 +163,27 @@ parse_args(int argc, char **argv)
 				return -1;
 			}
 			break;
+		case 'l':
+			oob_enable = malloc(ci->core_count * sizeof(uint16_t));
+			if (oob_enable == NULL) {
+				printf("Error - Unable to allocate memory\n");
+				return -1;
+			}
+			cnt = parse_set(optarg, oob_enable, ci->core_count);
+			if (cnt < 0) {
+				printf("Invalid core-list - [%s]\n",
+						optarg);
+				break;
+			}
+			for (i = 0; i < ci->core_count; i++) {
+				if (oob_enable[i]) {
+					printf("***Using core %d\n", i);
+					ci->cd[i].oob_enabled = 1;
+					ci->cd[i].global_enabled_cpus = 1;
+				}
+			}
+			free(oob_enable);
+			break;
 		/* long options */
 		case 0:
 			break;
@@ -263,6 +289,10 @@ main(int argc, char **argv)
 	uint16_t portid;
 
 
+	ret = core_info_init();
+	if (ret < 0)
+		rte_panic("Cannot allocate core info\n");
+
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
 		rte_panic("Cannot init EAL\n");
diff --git a/examples/vm_power_manager/parse.c b/examples/vm_power_manager/parse.c
new file mode 100644
index 000000000..9de15c4a7
--- /dev/null
+++ b/examples/vm_power_manager/parse.c
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation.
+ * Copyright(c) 2014 6WIND S.A.
+ */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <syslog.h>
+#include <ctype.h>
+#include <limits.h>
+#include <errno.h>
+#include <getopt.h>
+#include <dlfcn.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <dirent.h>
+#include <rte_eal.h>
+#include <rte_log.h>
+#include "parse.h"
+
+/*
+ * Parse elem, the elem could be single number/range or group
+ * 1) A single number elem, it's just a simple digit. e.g. 9
+ * 2) A single range elem, two digits with a '-' between. e.g. 2-6
+ * 3) A group elem, combines multiple 1) or 2) e.g 0,2-4,6
+ *    Within group, '-' used for a range separator;
+ *                       ',' used for a single number.
+ */
+int
+parse_set(const char *input, uint16_t set[], unsigned int num)
+{
+	unsigned int idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned int min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only digit or left bracket is qualify for start point */
+	if (!isdigit(*str) || *str == '\0')
+		return -1;
+
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = num;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-' and ',' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == num)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == '\0')) {
+			max = idx;
+
+			if (min == num)
+				min = idx;
+
+			for (idx = RTE_MIN(min, max);
+					idx <= RTE_MAX(min, max); idx++) {
+				set[idx] = 1;
+			}
+			min = num;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0');
+
+	return str - input;
+}
diff --git a/examples/vm_power_manager/parse.h b/examples/vm_power_manager/parse.h
new file mode 100644
index 000000000..a5971e9a2
--- /dev/null
+++ b/examples/vm_power_manager/parse.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef PARSE_H_
+#define PARSE_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+int
+parse_set(const char *, uint16_t [], unsigned int);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* PARSE_H_ */
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 35db25591..a7849e48a 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -12,6 +12,7 @@
 #include <dirent.h>
 #include <errno.h>
 
+#include <sys/sysinfo.h>
 #include <sys/types.h>
 
 #include <rte_log.h>
@@ -54,6 +55,7 @@ struct freq_info {
 
 static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
 
+struct core_info ci;
 static uint64_t global_enabled_cpus;
 
 #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
@@ -76,6 +78,35 @@ set_host_cpus_mask(void)
 	return num_cpus;
 }
 
+struct core_info *
+get_core_info(void)
+{
+	return &ci;
+}
+
+int
+core_info_init(void)
+{
+	struct core_info *ci;
+	int i;
+
+	ci = get_core_info();
+
+	ci->core_count = get_nprocs_conf();
+	ci->cd = malloc(ci->core_count * sizeof(struct core_details));
+	if (!ci->cd) {
+		RTE_LOG(ERR, POWER_MANAGER, "Failed to allocate memory for core info.");
+		return -1;
+	}
+	for (i = 0; i < ci->core_count; i++) {
+		ci->cd[i].global_enabled_cpus = 1;
+		ci->cd[i].oob_enabled = 0;
+		ci->cd[i].msr_fd = 0;
+	}
+	printf("%d cores in system\n", ci->core_count);
+	return 0;
+}
+
 int
 power_manager_init(void)
 {
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index 8a8a84aa4..45385de37 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -8,6 +8,26 @@
 #ifdef __cplusplus
 extern "C" {
 #endif
+struct core_details {
+	uint64_t last_branches;
+	uint64_t last_branch_misses;
+	uint16_t global_enabled_cpus;
+	uint16_t oob_enabled;
+	int msr_fd;
+};
+
+struct core_info {
+	uint16_t core_count;
+	struct core_details *cd;
+};
+
+struct core_info *
+get_core_info(void);
+
+int
+core_info_init(void);
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
 
 /* Maximum number of CPUS to manage */
 #define POWER_MGR_MAX_CPUS 64
-- 
2.17.1

* [dpdk-dev] [PATCH v2 3/8] examples/vm_power: add oob monitoring functions
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 1/8] examples/vm_power: add check for port count David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 2/8] examples/vm_power: add core list parameter David Hunt
@ 2018-06-21 13:24     ` David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 4/8] examples/vm_power: allow greater than 64 cores David Hunt
                       ` (5 subsequent siblings)
  8 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-21 13:24 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

This patch introduces the out-of-band (oob) core monitoring
functions.

The functions are similar to the channel manager functions.
There are functions to add and remove cores from the
list of cores being monitored, and functions to initialise
the monitor setup, run the monitor thread, and exit the monitor.

The monitor thread runs on its own lcore, and is separate
functionality from the channel monitor, which is epoll based.
This thread is timer based. It loops through all monitored cores,
calculates the branch ratio, scales up or down the core, then
sleeps for an interval (~250 uS).
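
The shape of such a timer-based loop can be sketched as below. This is
an assumption-laden illustration, not the code in oob_monitor_x86.c:
the names are hypothetical, the per-core work is passed in as a
callback, nanosleep() stands in for the sleep call, and a max_rounds
argument is added so the loop can be bounded for demonstration.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical callback type: sample one core and apply the policy. */
typedef void (*core_policy_fn)(unsigned int core, void *ctx);

/* Cleared by another thread to stop the loop. */
static volatile int monitor_running = 1;

/* Example policy: count how many core visits were made. */
static void
count_visit(unsigned int core, void *ctx)
{
	(void)core;
	(*(unsigned int *)ctx)++;
}

/*
 * Visit every monitored core, then sleep for interval_us microseconds.
 * max_rounds == 0 means run until monitor_running is cleared.
 * Returns the number of completed rounds.
 */
static unsigned long
monitor_loop(const unsigned char monitored[], unsigned int num_cores,
		unsigned int interval_us, unsigned long max_rounds,
		core_policy_fn policy, void *ctx)
{
	struct timespec ts;
	unsigned long rounds = 0;
	unsigned int core;

	ts.tv_sec = interval_us / 1000000;
	ts.tv_nsec = (long)(interval_us % 1000000) * 1000;

	while (monitor_running && (max_rounds == 0 || rounds < max_rounds)) {
		for (core = 0; core < num_cores; core++) {
			if (monitored[core])
				policy(core, ctx);
		}
		nanosleep(&ts, NULL);
		rounds++;
	}
	return rounds;
}
```

Because the loop wakes on a timer rather than on file-descriptor
events, it needs no epoll set; the only shared state is the stop flag.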

The method it uses to read the branch counters is a pread on the
/dev/cpu/x/msr file, so the 'msr' kernel module needs to be loaded.
Also, since the msr.h file has been made unavailable in recent
kernels, we have #defines for the relevant MSRs included in the
code.

The makefile has a switch for x86 and non-x86 platforms,
and compiles stub functions for non-x86 platforms.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/Makefile          |   5 +
 examples/vm_power_manager/oob_monitor.h     |  68 +++++
 examples/vm_power_manager/oob_monitor_nop.c |  38 +++
 examples/vm_power_manager/oob_monitor_x86.c | 282 ++++++++++++++++++++
 4 files changed, 393 insertions(+)
 create mode 100644 examples/vm_power_manager/oob_monitor.h
 create mode 100644 examples/vm_power_manager/oob_monitor_nop.c
 create mode 100644 examples/vm_power_manager/oob_monitor_x86.c

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index 0c925967c..13a5205ba 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -20,6 +20,11 @@ APP = vm_power_mgr
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
 SRCS-y += channel_monitor.c parse.c
+ifeq ($(CONFIG_RTE_ARCH_X86_64),y)
+SRCS-y += oob_monitor_x86.c
+else
+SRCS-y += oob_monitor_nop.c
+endif
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/oob_monitor.h b/examples/vm_power_manager/oob_monitor.h
new file mode 100644
index 000000000..b96e08df7
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef OOB_MONITOR_H_
+#define OOB_MONITOR_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Branch Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int branch_monitor_init(void);
+
+/**
+ * Run the OOB branch monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_branch_monitor(void);
+
+/**
+ * Exit the OOB Branch Monitor.
+ *
+ * @return
+ *  None
+ */
+void branch_monitor_exit(void);
+
+/**
+ * Add a core to the list of cores to monitor.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_core_to_monitor(int core);
+
+/**
+ * Remove a previously added core from core list.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_core_from_monitor(int core);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* OOB_MONITOR_H_ */
diff --git a/examples/vm_power_manager/oob_monitor_nop.c b/examples/vm_power_manager/oob_monitor_nop.c
new file mode 100644
index 000000000..7e7b8bc14
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_nop.c
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#include "oob_monitor.h"
+
+void branch_monitor_exit(void)
+{
+}
+
+__attribute__((unused)) static float
+apply_policy(__attribute__((unused)) int core)
+{
+	return 0.0;
+}
+
+int
+add_core_to_monitor(__attribute__((unused)) int core)
+{
+	return 0;
+}
+
+int
+remove_core_from_monitor(__attribute__((unused)) int core)
+{
+	return 0;
+}
+
+int
+branch_monitor_init(void)
+{
+	return 0;
+}
+
+void
+run_branch_monitor(void)
+{
+}
diff --git a/examples/vm_power_manager/oob_monitor_x86.c b/examples/vm_power_manager/oob_monitor_x86.c
new file mode 100644
index 000000000..485ec5e3f
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_x86.c
@@ -0,0 +1,282 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <signal.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/epoll.h>
+#include <sys/queue.h>
+#include <sys/time.h>
+#include <fcntl.h>
+
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_ethdev.h>
+#include <rte_pmd_i40e.h>
+
+#include <libvirt/libvirt.h>
+#include "oob_monitor.h"
+#include "power_manager.h"
+#include "channel_manager.h"
+
+#include <rte_log.h>
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+static volatile unsigned run_loop = 1;
+static uint64_t g_branches, g_branch_misses;
+static int g_active;
+
+void branch_monitor_exit(void)
+{
+	run_loop = 0;
+}
+
+/* Number of microseconds between each poll */
+#define INTERVAL 100
+#define PRINT_LOOP_COUNT (1000000/INTERVAL)
+#define RATIO_THRESHOLD 0.03
+#define IA32_PERFEVTSEL0 0x186
+#define IA32_PERFEVTSEL1 0x187
+#define IA32_PERFCTR0 0xc1
+#define IA32_PERFCTR1 0xc2
+#define IA32_PERFEVT_BRANCH_HITS 0x05300c4
+#define IA32_PERFEVT_BRANCH_MISS 0x05300c5
+
+static float
+apply_policy(int core)
+{
+	struct core_info *ci;
+	uint64_t counter;
+	uint64_t branches, branch_misses;
+	uint32_t last_branches, last_branch_misses;
+	int hits_diff, miss_diff;
+	float ratio;
+	int ret;
+
+	g_active = 0;
+	ci = get_core_info();
+
+	last_branches = ci->cd[core].last_branches;
+	last_branch_misses = ci->cd[core].last_branch_misses;
+
+	ret = pread(ci->cd[core].msr_fd, &counter,
+			sizeof(counter), IA32_PERFCTR0);
+	if (ret < 0)
+		RTE_LOG(ERR, POWER_MANAGER,
+				"unable to read counter for core %u\n",
+				core);
+	branches = counter;
+
+	ret = pread(ci->cd[core].msr_fd, &counter,
+			sizeof(counter), IA32_PERFCTR1);
+	if (ret < 0)
+		RTE_LOG(ERR, POWER_MANAGER,
+				"unable to read counter for core %u\n",
+				core);
+	branch_misses = counter;
+
+
+	ci->cd[core].last_branches = branches;
+	ci->cd[core].last_branch_misses = branch_misses;
+
+	hits_diff = (int)branches - (int)last_branches;
+	if (hits_diff <= 0) {
+		/* Likely a counter overflow condition, skip this round */
+		return -1.0;
+	}
+
+	miss_diff = (int)branch_misses - (int)last_branch_misses;
+	if (miss_diff <= 0) {
+		/* Likely a counter overflow condition, skip this round */
+		return -1.0;
+	}
+
+	g_branches = hits_diff;
+	g_branch_misses = miss_diff;
+
+	if (hits_diff < (INTERVAL*100)) {
+		/* Likely no workload running on this core. Skip. */
+		return -1.0;
+	}
+
+	ratio = (float)miss_diff * (float)100 / (float)hits_diff;
+
+	if (ratio < RATIO_THRESHOLD)
+		power_manager_scale_core_min(core);
+	else
+		power_manager_scale_core_max(core);
+
+	g_active = 1;
+	return ratio;
+}
+
+int
+add_core_to_monitor(int core)
+{
+	struct core_info *ci;
+	char proc_file[UNIX_PATH_MAX];
+	int ret;
+
+	ci = get_core_info();
+
+	if (core < ci->core_count) {
+		long setup;
+
+		snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core);
+		ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		/*
+		 * Set up branch counters
+		 */
+		setup = IA32_PERFEVT_BRANCH_HITS;
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL0);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		setup = IA32_PERFEVT_BRANCH_MISS;
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL1);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		/*
+		 * Close the file and re-open as read only so
+		 * as not to hog the resource
+		 */
+		close(ci->cd[core].msr_fd);
+		ci->cd[core].msr_fd = open(proc_file, O_RDONLY);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		ci->cd[core].oob_enabled = 1;
+	}
+	return 0;
+}
+
+int
+remove_core_from_monitor(int core)
+{
+	struct core_info *ci;
+	char proc_file[UNIX_PATH_MAX];
+	int ret;
+
+	ci = get_core_info();
+
+	if (ci->cd[core].oob_enabled) {
+		long setup;
+
+		/*
+		 * close the msr file, then reopen rw so we can
+		 * disable the counters
+		 */
+		if (ci->cd[core].msr_fd != 0)
+			close(ci->cd[core].msr_fd);
+		snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core);
+		ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		setup = 0x0; /* clear event */
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL0);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		setup = 0x0; /* clear event */
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL1);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+
+		close(ci->cd[core].msr_fd);
+		ci->cd[core].msr_fd = 0;
+		ci->cd[core].oob_enabled = 0;
+	}
+	return 0;
+}
+
+int
+branch_monitor_init(void)
+{
+	return 0;
+}
+
+void
+run_branch_monitor(void)
+{
+	struct core_info *ci;
+	int print = 0;
+	float ratio;
+	int printed;
+	int reads = 0;
+
+	ci = get_core_info();
+
+	while (run_loop) {
+
+		if (!run_loop)
+			break;
+		usleep(INTERVAL);
+		int j;
+		print++;
+		printed = 0;
+		for (j = 0; j < ci->core_count; j++) {
+			if (ci->cd[j].oob_enabled) {
+				ratio = apply_policy(j);
+				if ((print > PRINT_LOOP_COUNT) && (g_active)) {
+					printf("  %d: %.4f {%lu} {%d}", j,
+							ratio, g_branches,
+							reads);
+					printed = 1;
+					reads = 0;
+				} else {
+					reads++;
+				}
+			}
+		}
+		if (print > PRINT_LOOP_COUNT) {
+			if (printed)
+				printf("\n");
+			print = 0;
+		}
+	}
+}
-- 
2.17.1

* [dpdk-dev] [PATCH v2 4/8] examples/vm_power: allow greater than 64 cores
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
                       ` (2 preceding siblings ...)
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 3/8] examples/vm_power: add oob monitoring functions David Hunt
@ 2018-06-21 13:24     ` David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 5/8] examples/vm_power: add thread for oob core monitor David Hunt
                       ` (4 subsequent siblings)
  8 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-21 13:24 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

To facilitate more info per core, change the global_cpu_mask
from a uint64_t to an array. This also removes the limit of
64 cores, allocating the array at run-time based on the number of
cores found in the system.
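
For illustration only, the move from a 64-bit mask to a run-time sized
per-core array could be sketched as below. The names struct core_entry,
struct core_table, core_table_alloc() and core_is_enabled() are
hypothetical and are not the definitions used by this patch; they only
show why an array lookup has no 64-core ceiling.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-core record; the patch's own struct may differ. */
struct core_entry {
	uint8_t enabled;     /* takes the place of one bit in the old mask */
	uint8_t oob_enabled; /* example of extra per-core information */
};

struct core_table {
	unsigned int core_count;
	struct core_entry *cd;
};

/* Allocate one zeroed entry per core; returns 0 on success, -1 on failure. */
static int
core_table_alloc(struct core_table *t, unsigned int core_count)
{
	t->cd = calloc(core_count, sizeof(*t->cd));
	if (t->cd == NULL)
		return -1;
	t->core_count = core_count;
	return 0;
}

/* Bounds-checked lookup; works for core numbers of 64 and above. */
static int
core_is_enabled(const struct core_table *t, unsigned int core)
{
	if (core >= t->core_count)
		return 0;
	return t->cd[core].enabled != 0;
}
```

With a mask, the test was (mask & (1ULL << core)), which is undefined
for core >= 64; the array form only needs the bounds check shown above.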

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/power_manager.c | 115 +++++++++++-----------
 1 file changed, 58 insertions(+), 57 deletions(-)

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index a7849e48a..4bdde23da 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -19,14 +19,14 @@
 #include <rte_power.h>
 #include <rte_spinlock.h>
 
+#include "channel_manager.h"
 #include "power_manager.h"
-
-#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+#include "oob_monitor.h"
 
 #define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \
-	if (core_num >= POWER_MGR_MAX_CPUS) \
+	if (core_num >= ci.core_count) \
 		return -1; \
-	if (!(global_enabled_cpus & (1ULL << core_num))) \
+	if (!(ci.cd[core_num].global_enabled_cpus)) \
 		return -1; \
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
 	ret = rte_power_freq_##DIRECTION(core_num); \
@@ -37,7 +37,7 @@
 	int i; \
 	for (i = 0; core_mask; core_mask &= ~(1 << i++)) { \
 		if ((core_mask >> i) & 1) { \
-			if (!(global_enabled_cpus & (1ULL << i))) \
+			if (!(ci.cd[i].global_enabled_cpus)) \
 				continue; \
 			rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
 			if (rte_power_freq_##DIRECTION(i) != 1) \
@@ -56,28 +56,9 @@ struct freq_info {
 static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
 
 struct core_info ci;
-static uint64_t global_enabled_cpus;
 
 #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
 
-static unsigned
-set_host_cpus_mask(void)
-{
-	char path[PATH_MAX];
-	unsigned i;
-	unsigned num_cpus = 0;
-
-	for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
-		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
-		if (access(path, F_OK) == 0) {
-			global_enabled_cpus |= 1ULL << i;
-			num_cpus++;
-		} else
-			return num_cpus;
-	}
-	return num_cpus;
-}
-
 struct core_info *
 get_core_info(void)
 {
@@ -110,38 +91,45 @@ core_info_init(void)
 int
 power_manager_init(void)
 {
-	unsigned int i, num_cpus, num_freqs;
-	uint64_t cpu_mask;
+	unsigned int i, num_cpus = 0, num_freqs = 0;
 	int ret = 0;
+	struct core_info *ci;
+
+	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
 
-	num_cpus = set_host_cpus_mask();
-	if (num_cpus == 0) {
-		RTE_LOG(ERR, POWER_MANAGER, "Unable to detected host CPUs, please "
-			"ensure that sufficient privileges exist to inspect sysfs\n");
+	ci = get_core_info();
+	if (!ci) {
+		RTE_LOG(ERR, POWER_MANAGER,
+				"Failed to get core info!\n");
 		return -1;
 	}
-	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
-	cpu_mask = global_enabled_cpus;
-	for (i = 0; cpu_mask; cpu_mask &= ~(1 << i++)) {
-		if (rte_power_init(i) < 0)
-			RTE_LOG(ERR, POWER_MANAGER,
-					"Unable to initialize power manager "
-					"for core %u\n", i);
-		num_freqs = rte_power_freqs(i, global_core_freq_info[i].freqs,
+
+	for (i = 0; i < ci->core_count; i++) {
+		if (ci->cd[i].global_enabled_cpus) {
+			if (rte_power_init(i) < 0)
+				RTE_LOG(ERR, POWER_MANAGER,
+						"Unable to initialize power manager "
+						"for core %u\n", i);
+			num_cpus++;
+			num_freqs = rte_power_freqs(i,
+					global_core_freq_info[i].freqs,
 					RTE_MAX_LCORE_FREQS);
-		if (num_freqs == 0) {
-			RTE_LOG(ERR, POWER_MANAGER,
-				"Unable to get frequency list for core %u\n",
-				i);
-			global_enabled_cpus &= ~(1 << i);
-			num_cpus--;
-			ret = -1;
+			if (num_freqs == 0) {
+				RTE_LOG(ERR, POWER_MANAGER,
+					"Unable to get frequency list for core %u\n",
+					i);
+				ci->cd[i].oob_enabled = 0;
+				ret = -1;
+			}
+			global_core_freq_info[i].num_freqs = num_freqs;
+
+			rte_spinlock_init(&global_core_freq_info[i].power_sl);
 		}
-		global_core_freq_info[i].num_freqs = num_freqs;
-		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+		if (ci->cd[i].oob_enabled)
+			add_core_to_monitor(i);
 	}
-	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs , enabled core mask:"
-					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	RTE_LOG(INFO, POWER_MANAGER, "Managing %u cores out of %u available host cores\n",
+			num_cpus, ci->core_count);
 	return ret;
 
 }
@@ -156,7 +144,7 @@ power_manager_get_current_frequency(unsigned core_num)
 				core_num, POWER_MGR_MAX_CPUS-1);
 		return -1;
 	}
-	if (!(global_enabled_cpus & (1ULL << core_num)))
+	if (!(ci.cd[core_num].global_enabled_cpus))
 		return 0;
 
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
@@ -175,15 +163,26 @@ power_manager_exit(void)
 {
 	unsigned int i;
 	int ret = 0;
+	struct core_info *ci;
 
-	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1 << i++)) {
-		if (rte_power_exit(i) < 0) {
-			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
-					"for core %u\n", i);
-			ret = -1;
+	ci = get_core_info();
+	if (!ci) {
+		RTE_LOG(ERR, POWER_MANAGER,
+				"Failed to get core info!\n");
+		return -1;
+	}
+
+	for (i = 0; i < ci->core_count; i++) {
+		if (ci->cd[i].global_enabled_cpus) {
+			if (rte_power_exit(i) < 0) {
+				RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+						"for core %u\n", i);
+				ret = -1;
+			}
+			ci->cd[i].global_enabled_cpus = 0;
 		}
+		remove_core_from_monitor(i);
 	}
-	global_enabled_cpus = 0;
 	return ret;
 }
 
@@ -299,10 +298,12 @@ int
 power_manager_scale_core_med(unsigned int core_num)
 {
 	int ret = 0;
+	struct core_info *ci;
 
+	ci = get_core_info();
 	if (core_num >= POWER_MGR_MAX_CPUS)
 		return -1;
-	if (!(global_enabled_cpus & (1ULL << core_num)))
+	if (!(ci->cd[core_num].global_enabled_cpus))
 		return -1;
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
 	ret = rte_power_set_freq(core_num,
-- 
2.17.1

* [dpdk-dev] [PATCH v2 5/8] examples/vm_power: add thread for oob core monitor
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
                       ` (3 preceding siblings ...)
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 4/8] examples/vm_power: allow greater than 64 cores David Hunt
@ 2018-06-21 13:24     ` David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 6/8] examples/vm_power: add port-list to command line David Hunt
                       ` (3 subsequent siblings)
  8 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-21 13:24 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Change the app to now require three cores, as the third core
will be used to run the oob monitoring thread.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/main.c | 37 +++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index cc2a1289c..4c6b5a990 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "oob_monitor.h"
 #include "parse.h"
 #include <rte_pmd_ixgbe.h>
 #include <rte_pmd_i40e.h>
@@ -269,6 +270,17 @@ run_monitor(__attribute__((unused)) void *arg)
 	return 0;
 }
 
+static int
+run_core_monitor(__attribute__((unused)) void *arg)
+{
+	if (branch_monitor_init() < 0) {
+		printf("Unable to initialize core monitor\n");
+		return -1;
+	}
+	run_branch_monitor();
+	return 0;
+}
+
 static void
 sig_handler(int signo)
 {
@@ -287,12 +299,15 @@ main(int argc, char **argv)
 	unsigned int nb_ports;
 	struct rte_mempool *mbuf_pool;
 	uint16_t portid;
+	struct core_info *ci;
 
 
 	ret = core_info_init();
 	if (ret < 0)
 		rte_panic("Cannot allocate core info\n");
 
+	ci = get_core_info();
+
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
 		rte_panic("Cannot init EAL\n");
@@ -367,16 +382,23 @@ main(int argc, char **argv)
 		}
 	}
 
+	check_all_ports_link_status(enabled_port_mask);
+
 	lcore_id = rte_get_next_lcore(-1, 1, 0);
 	if (lcore_id == RTE_MAX_LCORE) {
-		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
+		RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
 				"application\n");
 		return 0;
 	}
-
-	check_all_ports_link_status(enabled_port_mask);
+	printf("Running channel monitor on lcore id %d\n", lcore_id);
 	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
 
+	lcore_id = rte_get_next_lcore(lcore_id, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
+				"application\n");
+		return 0;
+	}
 	if (power_manager_init() < 0) {
 		printf("Unable to initialize power manager\n");
 		return -1;
@@ -385,8 +407,17 @@ main(int argc, char **argv)
 		printf("Unable to initialize channel manager\n");
 		return -1;
 	}
+
+	printf("Running core monitor on lcore id %d\n", lcore_id);
+	rte_eal_remote_launch(run_core_monitor, NULL, lcore_id);
+
 	run_cli(NULL);
 
+	branch_monitor_exit();
+
 	rte_eal_mp_wait_lcore();
+
+	free(ci->cd);
+
 	return 0;
 }
-- 
2.17.1

* [dpdk-dev] [PATCH v2 6/8] examples/vm_power: add port-list to command line
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
                       ` (4 preceding siblings ...)
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 5/8] examples/vm_power: add thread for oob core monitor David Hunt
@ 2018-06-21 13:24     ` David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 7/8] examples/vm_power: add branch ratio policy type David Hunt
                       ` (2 subsequent siblings)
  8 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-21 13:24 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add the long form of -p, which is --port-list.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 4c6b5a990..4088861f1 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -147,6 +147,7 @@ parse_args(int argc, char **argv)
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
 		{ "core-list", optional_argument, 0, 'l'},
+		{ "port-list", optional_argument, 0, 'p'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
-- 
2.17.1

* [dpdk-dev] [PATCH v2 7/8] examples/vm_power: add branch ratio policy type
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
                       ` (5 preceding siblings ...)
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 6/8] examples/vm_power: add port-list to command line David Hunt
@ 2018-06-21 13:24     ` David Hunt
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 8/8] examples/vm_power: add cli args to guest app David Hunt
  2018-06-21 14:28     ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling Radu Nicolau
  8 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-21 13:24 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add the capability for the vm_power_manager to receive
a policy of type BRANCH_RATIO. This will add any vcpus
in the policy to the oob monitoring thread.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/channel_monitor.c | 23 +++++++++++++++++++--
 lib/librte_power/channel_commands.h         |  3 ++-
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index 73bddd993..7fa47ba97 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -27,6 +27,7 @@
 #include "channel_commands.h"
 #include "channel_manager.h"
 #include "power_manager.h"
+#include "oob_monitor.h"
 
 #define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
 
@@ -92,6 +93,10 @@ get_pcpu_to_control(struct policy *pol)
 	struct vm_info info;
 	int pcpu, count;
 	uint64_t mask_u64b;
+	struct core_info *ci;
+	int ret;
+
+	ci = get_core_info();
 
 	RTE_LOG(INFO, CHANNEL_MONITOR, "Looking for pcpu for %s\n",
 			pol->pkt.vm_name);
@@ -100,8 +105,22 @@ get_pcpu_to_control(struct policy *pol)
 	for (count = 0; count < pol->pkt.num_vcpu; count++) {
 		mask_u64b = info.pcpu_mask[pol->pkt.vcpu_to_control[count]];
 		for (pcpu = 0; mask_u64b; mask_u64b &= ~(1ULL << pcpu++)) {
-			if ((mask_u64b >> pcpu) & 1)
-				pol->core_share[count].pcpu = pcpu;
+			if ((mask_u64b >> pcpu) & 1) {
+				if (pol->pkt.policy_to_use == BRANCH_RATIO) {
+					ci->cd[pcpu].oob_enabled = 1;
+					ret = add_core_to_monitor(pcpu);
+					if (ret == 0)
+						printf("Monitoring pcpu %d via Branch Ratio\n",
+								pcpu);
+					else
+						printf("Failed to start OOB Monitoring pcpu %d\n",
+								pcpu);
+
+				} else {
+					pol->core_share[count].pcpu = pcpu;
+					printf("Monitoring pcpu %d\n", pcpu);
+				}
+			}
 		}
 	}
 }
diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
index 5e8b4ab5d..ee638eefa 100644
--- a/lib/librte_power/channel_commands.h
+++ b/lib/librte_power/channel_commands.h
@@ -48,7 +48,8 @@ enum workload {HIGH, MEDIUM, LOW};
 enum policy_to_use {
 	TRAFFIC,
 	TIME,
-	WORKLOAD
+	WORKLOAD,
+	BRANCH_RATIO
 };
 
 struct traffic {
-- 
2.17.1

* [dpdk-dev] [PATCH v2 8/8] examples/vm_power: add cli args to guest app
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
                       ` (6 preceding siblings ...)
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 7/8] examples/vm_power: add branch ratio policy type David Hunt
@ 2018-06-21 13:24     ` David Hunt
  2018-06-21 14:28     ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling Radu Nicolau
  8 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-21 13:24 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add new command line arguments to the guest app to make
    testing and validation of the policy usage easier.
    These arguments are mainly around setting up the power
    management policy that is sent from the guest vm to
    the vm_power_manager in the host.

    New command line parameters:
    -n or --vm-name
       sets the name of the vm to be used by the host OS.
    -b or --busy-hours
       sets the list of hours that are predicted to be busy
    -q or --quiet-hours
       sets the list of hours that are predicted to be quiet
    -l or --vcpu-list
       sets the list of vcpus to monitor
    -p or --port-list
       sets the list of ports to monitor when using a
       workload policy.
    -o or --policy
       sets the default policy type
          TIME
          WORKLOAD
          TRAFFIC
          BRANCH_RATIO

    The format of the hours or list parameters is a comma-separated
    list of integers, which can take the form of
       a. x    e.g. --vcpu-list=1
       b. x,y  e.g. --quiet-hours=3,4
       c. x-y  e.g. --busy-hours=9-12
       d. combination of above (e.g. --busy-hours=4,5-7,9)
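
    As a worked example of the list format above, the following
    stand-alone sketch expands a string such as "4,5-7,9" into a set
    of marked indices. expand_list() is a hypothetical helper written
    for this illustration; it is stricter than the parse_set() added
    by this patch (no blanks, no reversed ranges) and is not a
    replacement for it.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Mark every value named in a list such as "4,5-7,9" in set[0..num-1].
 * Returns the number of distinct values marked, or -1 on malformed input.
 */
static int
expand_list(const char *s, unsigned char set[], unsigned long num)
{
	int count = 0;

	memset(set, 0, num);
	for (;;) {
		char *end;
		unsigned long lo, hi, v;

		/* every element must start with a digit */
		if (!isdigit((unsigned char)*s))
			return -1;
		lo = hi = strtoul(s, &end, 10);
		if (*end == '-') {
			/* range form x-y: parse the upper bound */
			s = end + 1;
			if (!isdigit((unsigned char)*s))
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (lo > hi || hi >= num)
			return -1;
		for (v = lo; v <= hi; v++) {
			if (!set[v])
				count++;
			set[v] = 1;
		}
		if (*end == '\0')
			return count;
		if (*end != ',')
			return -1;
		s = end + 1;
	}
}
```

    With a 24-entry set, "4,5-7,9" marks hours 4, 5, 6, 7 and 9, and
    an out-of-range value such as "25" is rejected.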

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 examples/vm_power_manager/guest_cli/Makefile  |   2 +-
 examples/vm_power_manager/guest_cli/main.c    | 151 +++++++++++++++++-
 examples/vm_power_manager/guest_cli/parse.c   |  93 +++++++++++
 examples/vm_power_manager/guest_cli/parse.h   |  19 +++
 .../guest_cli/vm_power_cli_guest.c            | 113 +++++++------
 .../guest_cli/vm_power_cli_guest.h            |   6 +
 6 files changed, 330 insertions(+), 54 deletions(-)
 create mode 100644 examples/vm_power_manager/guest_cli/parse.c
 create mode 100644 examples/vm_power_manager/guest_cli/parse.h

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
index d710e22d9..8b1db861e 100644
--- a/examples/vm_power_manager/guest_cli/Makefile
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -14,7 +14,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 APP = guest_vm_power_mgr
 
 # all source are stored in SRCS-y
-SRCS-y := main.c vm_power_cli_guest.c
+SRCS-y := main.c vm_power_cli_guest.c parse.c
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
index b17936d6b..36365b124 100644
--- a/examples/vm_power_manager/guest_cli/main.c
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -2,23 +2,20 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
-/*
 #include <stdio.h>
-#include <string.h>
-#include <stdint.h>
-#include <sys/epoll.h>
-#include <fcntl.h>
-#include <unistd.h>
 #include <stdlib.h>
-#include <errno.h>
-*/
 #include <signal.h>
+#include <getopt.h>
+#include <string.h>
 
 #include <rte_lcore.h>
 #include <rte_power.h>
 #include <rte_debug.h>
+#include <rte_eal.h>
+#include <rte_log.h>
 
 #include "vm_power_cli_guest.h"
+#include "parse.h"
 
 static void
 sig_handler(int signo)
@@ -32,6 +29,136 @@ sig_handler(int signo)
 
 }
 
+#define MAX_HOURS 24
+
+/* Parse the argument given in the command line of the application */
+static int
+parse_args(int argc, char **argv)
+{
+	int opt, ret;
+	char **argvopt;
+	int option_index;
+	char *prgname = argv[0];
+	const struct option lgopts[] = {
+		{ "vm-name", required_argument, 0, 'n'},
+		{ "busy-hours", required_argument, 0, 'b'},
+		{ "quiet-hours", required_argument, 0, 'q'},
+		{ "port-list", required_argument, 0, 'p'},
+		{ "vcpu-list", required_argument, 0, 'l'},
+		{ "policy", required_argument, 0, 'o'},
+		{NULL, 0, 0, 0}
+	};
+	struct channel_packet *policy;
+	unsigned short int hours[MAX_HOURS];
+	unsigned short int cores[MAX_VCPU_PER_VM];
+	unsigned short int ports[MAX_VCPU_PER_VM];
+	int i, cnt, idx;
+
+	policy = get_policy();
+	set_policy_defaults(policy);
+
+	argvopt = argv;
+
+	while ((opt = getopt_long(argc, argvopt, "n:b:q:p:l:o:",
+				  lgopts, &option_index)) != EOF) {
+
+		switch (opt) {
+		/* vm name */
+		case 'n':
+			strcpy(policy->vm_name, optarg);
+			printf("Setting VM Name to [%s]\n", policy->vm_name);
+			break;
+		case 'b':
+		case 'q':
+			//printf("***Processing set using [%s]\n", optarg);
+			cnt = parse_set(optarg, hours, MAX_HOURS);
+			if (cnt < 0) {
+				printf("Invalid value passed to quiet/busy hours - [%s]\n",
+						optarg);
+				break;
+			}
+			idx = 0;
+			for (i = 0; i < MAX_HOURS; i++) {
+				if (hours[i]) {
+					if (opt == 'b') {
+						printf("***Busy Hour %d\n", i);
+						policy->timer_policy.busy_hours
+							[idx++] = i;
+					} else {
+						printf("***Quiet Hour %d\n", i);
+						policy->timer_policy.quiet_hours
+							[idx++] = i;
+					}
+				}
+			}
+			break;
+		case 'l':
+			cnt = parse_set(optarg, cores, MAX_VCPU_PER_VM);
+			if (cnt < 0) {
+				printf("Invalid value passed to vcpu-list - [%s]\n",
+						optarg);
+				break;
+			}
+			idx = 0;
+			for (i = 0; i < MAX_VCPU_PER_VM; i++) {
+				if (cores[i]) {
+					printf("***Using core %d\n", i);
+					policy->vcpu_to_control[idx++] = i;
+				}
+			}
+			policy->num_vcpu = idx;
+			printf("Total cores: %d\n", idx);
+			break;
+		case 'p':
+			cnt = parse_set(optarg, ports, MAX_VCPU_PER_VM);
+			if (cnt < 0) {
+				printf("Invalid value passed to port-list - [%s]\n",
+						optarg);
+				break;
+			}
+			idx = 0;
+			for (i = 0; i < MAX_VCPU_PER_VM; i++) {
+				if (ports[i]) {
+					printf("***Using port %d\n", i);
+					set_policy_mac(i, idx++);
+				}
+			}
+			policy->nb_mac_to_monitor = idx;
+			printf("Total Ports: %d\n", idx);
+			break;
+		case 'o':
+			if (!strcmp(optarg, "TRAFFIC"))
+				policy->policy_to_use = TRAFFIC;
+			else if (!strcmp(optarg, "TIME"))
+				policy->policy_to_use = TIME;
+			else if (!strcmp(optarg, "WORKLOAD"))
+				policy->policy_to_use = WORKLOAD;
+			else if (!strcmp(optarg, "BRANCH_RATIO"))
+				policy->policy_to_use = BRANCH_RATIO;
+			else {
+				printf("Invalid policy specified: %s\n",
+						optarg);
+				return -1;
+			}
+			break;
+		/* long options */
+
+		case 0:
+			break;
+
+		default:
+			return -1;
+		}
+	}
+
+	if (optind >= 0)
+		argv[optind-1] = prgname;
+
+	ret = optind-1;
+	optind = 0; /* reset getopt lib */
+	return ret;
+}
+
 int
 main(int argc, char **argv)
 {
@@ -45,6 +172,14 @@ main(int argc, char **argv)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
+	argc -= ret;
+	argv += ret;
+
+	/* parse application arguments (after the EAL ones) */
+	ret = parse_args(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid arguments\n");
+
 	rte_power_set_env(PM_ENV_KVM_VM);
 	RTE_LCORE_FOREACH(lcore_id) {
 		rte_power_init(lcore_id);
diff --git a/examples/vm_power_manager/guest_cli/parse.c b/examples/vm_power_manager/guest_cli/parse.c
new file mode 100644
index 000000000..9de15c4a7
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/parse.c
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation.
+ * Copyright(c) 2014 6WIND S.A.
+ */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <syslog.h>
+#include <ctype.h>
+#include <limits.h>
+#include <errno.h>
+#include <getopt.h>
+#include <dlfcn.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <dirent.h>
+#include <rte_eal.h>
+#include <rte_log.h>
+#include "parse.h"
+
+/*
+ * Parse elem, the elem could be single number/range or group
+ * 1) A single number elem, it's just a simple digit. e.g. 9
+ * 2) A single range elem, two digits with a '-' between. e.g. 2-6
+ * 3) A group elem, combines multiple 1) or 2) e.g 0,2-4,6
+ *    Within group, '-' used for a range separator;
+ *                       ',' used for a single number.
+ */
+int
+parse_set(const char *input, uint16_t set[], unsigned int num)
+{
+	unsigned int idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned int min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only a digit qualifies as a start point */
+	if (!isdigit(*str) || *str == '\0')
+		return -1;
+
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = num;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-' and ',' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == num)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == '\0')) {
+			max = idx;
+
+			if (min == num)
+				min = idx;
+
+			for (idx = RTE_MIN(min, max);
+					idx <= RTE_MAX(min, max); idx++) {
+				set[idx] = 1;
+			}
+			min = num;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0');
+
+	return str - input;
+}
diff --git a/examples/vm_power_manager/guest_cli/parse.h b/examples/vm_power_manager/guest_cli/parse.h
new file mode 100644
index 000000000..c8aa0ea50
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/parse.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef PARSE_H_
+#define PARSE_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+int
+parse_set(const char *, uint16_t [], unsigned int);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* PARSE_H_ */
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 43bdeacef..0db1b804f 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -33,6 +33,71 @@ struct cmd_quit_result {
 	cmdline_fixed_string_t quit;
 };
 
+union PFID {
+	struct ether_addr addr;
+	uint64_t pfid;
+};
+
+static struct channel_packet policy;
+
+struct channel_packet *
+get_policy(void)
+{
+	return &policy;
+}
+
+int
+set_policy_mac(int port, int idx)
+{
+	struct channel_packet *policy;
+	union PFID pfid;
+
+	/* Use port MAC address as the vfid */
+	rte_eth_macaddr_get(port, &pfid.addr);
+
+	printf("Port %u MAC: %02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 ":"
+			"%02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 "\n",
+			port,
+			pfid.addr.addr_bytes[0], pfid.addr.addr_bytes[1],
+			pfid.addr.addr_bytes[2], pfid.addr.addr_bytes[3],
+			pfid.addr.addr_bytes[4], pfid.addr.addr_bytes[5]);
+	policy = get_policy();
+	policy->vfid[idx] = pfid.pfid;
+	return 0;
+}
+
+void
+set_policy_defaults(struct channel_packet *pkt)
+{
+	set_policy_mac(0, 0);
+	pkt->nb_mac_to_monitor = 1;
+
+	pkt->t_boost_status.tbEnabled = false;
+
+	pkt->vcpu_to_control[0] = 0;
+	pkt->vcpu_to_control[1] = 1;
+	pkt->num_vcpu = 2;
+	/* Dummy Population. */
+	pkt->traffic_policy.min_packet_thresh = 96000;
+	pkt->traffic_policy.avg_max_packet_thresh = 1800000;
+	pkt->traffic_policy.max_max_packet_thresh = 2000000;
+
+	pkt->timer_policy.busy_hours[0] = 3;
+	pkt->timer_policy.busy_hours[1] = 4;
+	pkt->timer_policy.busy_hours[2] = 5;
+	pkt->timer_policy.quiet_hours[0] = 11;
+	pkt->timer_policy.quiet_hours[1] = 12;
+	pkt->timer_policy.quiet_hours[2] = 13;
+
+	pkt->timer_policy.hours_to_use_traffic_profile[0] = 8;
+	pkt->timer_policy.hours_to_use_traffic_profile[1] = 10;
+
+	pkt->workload = LOW;
+	pkt->policy_to_use = TIME;
+	pkt->command = PKT_POLICY;
+	strcpy(pkt->vm_name, "ubuntu2");
+}
+
 static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
 				__attribute__((unused)) struct cmdline *cl,
 			    __attribute__((unused)) void *data)
@@ -118,54 +183,12 @@ struct cmd_send_policy_result {
 	cmdline_fixed_string_t cmd;
 };
 
-union PFID {
-	struct ether_addr addr;
-	uint64_t pfid;
-};
-
 static inline int
-send_policy(void)
+send_policy(struct channel_packet *pkt)
 {
-	struct channel_packet pkt;
 	int ret;
 
-	union PFID pfid;
-	/* Use port MAC address as the vfid */
-	rte_eth_macaddr_get(0, &pfid.addr);
-	printf("Port %u MAC: %02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 ":"
-			"%02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 "\n",
-			1,
-			pfid.addr.addr_bytes[0], pfid.addr.addr_bytes[1],
-			pfid.addr.addr_bytes[2], pfid.addr.addr_bytes[3],
-			pfid.addr.addr_bytes[4], pfid.addr.addr_bytes[5]);
-	pkt.vfid[0] = pfid.pfid;
-
-	pkt.nb_mac_to_monitor = 1;
-	pkt.t_boost_status.tbEnabled = false;
-
-	pkt.vcpu_to_control[0] = 0;
-	pkt.vcpu_to_control[1] = 1;
-	pkt.num_vcpu = 2;
-	/* Dummy Population. */
-	pkt.traffic_policy.min_packet_thresh = 96000;
-	pkt.traffic_policy.avg_max_packet_thresh = 1800000;
-	pkt.traffic_policy.max_max_packet_thresh = 2000000;
-
-	pkt.timer_policy.busy_hours[0] = 3;
-	pkt.timer_policy.busy_hours[1] = 4;
-	pkt.timer_policy.busy_hours[2] = 5;
-	pkt.timer_policy.quiet_hours[0] = 11;
-	pkt.timer_policy.quiet_hours[1] = 12;
-	pkt.timer_policy.quiet_hours[2] = 13;
-
-	pkt.timer_policy.hours_to_use_traffic_profile[0] = 8;
-	pkt.timer_policy.hours_to_use_traffic_profile[1] = 10;
-
-	pkt.workload = LOW;
-	pkt.policy_to_use = TIME;
-	pkt.command = PKT_POLICY;
-	strcpy(pkt.vm_name, "ubuntu2");
-	ret = rte_power_guest_channel_send_msg(&pkt, 1);
+	ret = rte_power_guest_channel_send_msg(pkt, 1);
 	if (ret == 0)
 		return 1;
 	RTE_LOG(DEBUG, POWER, "Error sending message: %s\n",
@@ -182,7 +205,7 @@ cmd_send_policy_parsed(void *parsed_result, struct cmdline *cl,
 
 	if (!strcmp(res->cmd, "now")) {
 		printf("Sending Policy down now!\n");
-		ret = send_policy();
+		ret = send_policy(&policy);
 	}
 	if (ret != 1)
 		cmdline_printf(cl, "Error sending message: %s\n",
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
index 75a262967..fd77f6a69 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -11,6 +11,12 @@ extern "C" {
 
 #include "channel_commands.h"
 
+struct channel_packet *get_policy(void);
+
+int set_policy_mac(int port, int idx);
+
+void set_policy_defaults(struct channel_packet *pkt);
+
 void run_cli(__attribute__((unused)) void *arg);
 
 #ifdef __cplusplus
-- 
2.17.1

* Re: [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling
  2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
                       ` (7 preceding siblings ...)
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 8/8] examples/vm_power: add cli args to guest app David Hunt
@ 2018-06-21 14:28     ` Radu Nicolau
  8 siblings, 0 replies; 46+ messages in thread
From: Radu Nicolau @ 2018-06-21 14:28 UTC (permalink / raw)
  To: David Hunt, dev


On 6/21/2018 2:24 PM, David Hunt wrote:
> This patch set adds the capability to do out-of-band power
> monitoring on a system. It uses a thread to monitor the branch
> counters in the targeted cores, and calculates the branch ratio
> of the running code.
>
> If the branch ratio is low (<0.01), then
> the code is most likely running in a tight poll loop and doing
> nothing, i.e. receiving no packets. In this case we scale down
> the frequency of that core.
>
> If the branch ratio is higher (>0.01), then it is likely that
> the code is receiving and processing packets. In this case, we
> scale up the frequency of that core.
>
> The cpu counters are read via /dev/cpu/x/msr, so requires the
> msr kernel module to be loaded. Because this method is used,
> the patch set is implemented with one file for x86 systems, and
> another for non-x86 systems, with conditional compilation in
> the Makefile. The non-x86 functions are stubs, and do not
> currently implement any functionality.
>
> The vm_power_manager app has been modified to take a new parameter
>     --core-list or -l
> which takes a list of cores in a comma-separated list format,
> e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
> These cores will then be enabled for oob monitoring. When the
> OOB monitoring thread starts, it reads the branch hits/miss
> counters of each monitored core, and scales up/down accordingly.
>
> The guest_cli app has also been modified to allow sending of a
> policy of type BRANCH_RATIO where all of the cores included in
> the policy will be monitored by the vm_power_manager oob thread.
>
> v2 changes:
>     * Add the guest_cli patch into this patch set, including the
>       ability to set the policy to BRANCH_RATIO.
>       http://patches.dpdk.org/patch/40742/
>     * When vm_power_manager receives a policy with type BRANCH_RATIO,
>       add the relevant cores to the monitoring thread.
>
> [1/8] examples/vm_power: add check for port count
> [2/8] examples/vm_power: add core list parameter
> [3/8] examples/vm_power: add oob monitoring functions
> [4/8] examples/vm_power: allow greater than 64 cores
> [5/8] examples/vm_power: add thread for oob core monitor
> [6/8] examples/vm_power: add port-list to command line
> [7/8] examples/vm_power: add branch ratio policy type
> [8/8] examples/vm_power: add cli args to guest app
>
Series Acked-by: Radu Nicolau <radu.nicolau@intel.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling
  2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 1/8] examples/vm_power: add check for port count David Hunt
@ 2018-06-26  9:23       ` David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 1/9] examples/vm_power: add check for port count David Hunt
                           ` (9 more replies)
  0 siblings, 10 replies; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

This patch set adds the capability to do out-of-band power
monitoring on a system. It uses a thread to monitor the branch
counters in the targeted cores, and calculates the branch ratio
of the running code.

If the branch ratio is low (<0.01), then
the code is most likely running in a tight poll loop and doing
nothing, i.e. receiving no packets. In this case we scale down
the frequency of that core.

If the branch ratio is higher (>0.01), then it is likely that
the code is receiving and processing packets. In this case, we
scale up the frequency of that core.
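
As a rough illustration of the decision described above (not code taken
from this series), the sketch below derives a ratio from two successive
samples of the branch counters and picks an action. The names
branch_ratio_action and enum freq_action are hypothetical, and the ratio
is assumed here to be branch misses divided by branch hits; the patches
may scale the value differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical action names, used only for this illustration. */
enum freq_action { FREQ_SKIP, FREQ_SCALE_DOWN, FREQ_SCALE_UP };

/*
 * Decide what to do with a core from two successive counter samples.
 * threshold is the miss/hit ratio below which the core is assumed to be
 * spinning in an empty poll loop.
 */
enum freq_action
branch_ratio_action(uint64_t hits_prev, uint64_t hits_now,
		uint64_t miss_prev, uint64_t miss_now, double threshold)
{
	assert(threshold > 0.0);

	/* A counter that did not advance (wrap or reset) gives no usable delta. */
	if (hits_now <= hits_prev || miss_now < miss_prev)
		return FREQ_SKIP;

	double ratio = (double)(miss_now - miss_prev) /
			(double)(hits_now - hits_prev);

	/* Low ratio: tight poll loop, no packets. Otherwise: real work. */
	return ratio < threshold ? FREQ_SCALE_DOWN : FREQ_SCALE_UP;
}
```

A caller would sample the counters once per interval, keep the previous
values per core, and map FREQ_SCALE_DOWN/FREQ_SCALE_UP onto whatever
frequency scaling calls it has available.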

The cpu counters are read via /dev/cpu/x/msr, so requires the
msr kernel module to be loaded. Because this method is used,
the patch set is implemented with one file for x86 systems, and
another for non-x86 systems, with conditional compilation in
the Makefile. The non-x86 functions are stubs, and do not
currently implement any functionality.

The vm_power_manager app has been modified to take a new parameter
   --core-list or -l
which takes a list of cores in a comma-separated list format,
e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
These cores will then be enabled for oob monitoring. When the
OOB monitoring thread starts, it reads the branch hits/miss
counters of each monitored core, and scales up/down accordingly.

The guest_cli app has also been modified to allow sending of a
policy of type BRANCH_RATIO where all of the cores included in
the policy will be monitored by the vm_power_manager oob thread.

v2 changes:
   * Add the guest_cli patch into this patch set, including the
     ability to set the policy to BRANCH_RATIO.
     http://patches.dpdk.org/patch/40742/
   * When vm_power_manager receives a policy with type BRANCH_RATIO,
     add the relevant cores to the monitoring thread.

v3 changes:
   * Added a command line parameter to allow changing of the
     default branch ratio threshold. Can now use -b 0.3 or
     --branch-ratio=0.3 to set the ratio for scaling up/down.

[1/9] examples/vm_power: add check for port count
[2/9] examples/vm_power: add core list parameter
[3/9] examples/vm_power: add oob monitoring functions
[4/9] examples/vm_power: allow greater than 64 cores
[5/9] examples/vm_power: add thread for oob core monitor
[6/9] examples/vm_power: add port-list to command line
[7/9] examples/vm_power: add branch ratio policy type
[8/9] examples/vm_power: add cli args to guest app
[9/9] examples/vm_power: make branch ratio configurable

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [dpdk-dev] [PATCH v3 1/9] examples/vm_power: add check for port count
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
@ 2018-06-26  9:23         ` David Hunt
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 2/9] examples/vm_power: add core list parameter David Hunt
                           ` (8 subsequent siblings)
  9 siblings, 1 reply; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

If we don't pass any ports to the app, we don't need to create
any mempools, and we don't need to init any ports.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/main.c | 81 +++++++++++++++++---------------
 1 file changed, 43 insertions(+), 38 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index c9805a461..043b374bc 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -280,51 +280,56 @@ main(int argc, char **argv)
 
 	nb_ports = rte_eth_dev_count_avail();
 
-	mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
-		MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+	if (nb_ports > 0) {
+		mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL",
+				NUM_MBUFS * nb_ports, MBUF_CACHE_SIZE, 0,
+				RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
 
-	if (mbuf_pool == NULL)
-		rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+		if (mbuf_pool == NULL)
+			rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
 
-	/* Initialize ports. */
-	RTE_ETH_FOREACH_DEV(portid) {
-		struct ether_addr eth;
-		int w, j;
-		int ret;
+		/* Initialize ports. */
+		RTE_ETH_FOREACH_DEV(portid) {
+			struct ether_addr eth;
+			int w, j;
+			int ret;
 
-		if ((enabled_port_mask & (1 << portid)) == 0)
-			continue;
+			if ((enabled_port_mask & (1 << portid)) == 0)
+				continue;
 
-		eth.addr_bytes[0] = 0xe0;
-		eth.addr_bytes[1] = 0xe0;
-		eth.addr_bytes[2] = 0xe0;
-		eth.addr_bytes[3] = 0xe0;
-		eth.addr_bytes[4] = portid + 0xf0;
+			eth.addr_bytes[0] = 0xe0;
+			eth.addr_bytes[1] = 0xe0;
+			eth.addr_bytes[2] = 0xe0;
+			eth.addr_bytes[3] = 0xe0;
+			eth.addr_bytes[4] = portid + 0xf0;
 
-		if (port_init(portid, mbuf_pool) != 0)
-			rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
+			if (port_init(portid, mbuf_pool) != 0)
+				rte_exit(EXIT_FAILURE,
+					"Cannot init port %"PRIu8 "\n",
 					portid);
 
-		for (w = 0; w < MAX_VFS; w++) {
-			eth.addr_bytes[5] = w + 0xf0;
-
-			ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
-						w, &eth);
-			if (ret == -ENOTSUP)
-				ret = rte_pmd_i40e_set_vf_mac_addr(portid,
-						w, &eth);
-			if (ret == -ENOTSUP)
-				ret = rte_pmd_bnxt_set_vf_mac_addr(portid,
-						w, &eth);
-
-			switch (ret) {
-			case 0:
-				printf("Port %d VF %d MAC: ",
-						portid, w);
-				for (j = 0; j < 6; j++) {
-					printf("%02x", eth.addr_bytes[j]);
-					if (j < 5)
-						printf(":");
+			for (w = 0; w < MAX_VFS; w++) {
+				eth.addr_bytes[5] = w + 0xf0;
+
+				ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
+							w, &eth);
+				if (ret == -ENOTSUP)
+					ret = rte_pmd_i40e_set_vf_mac_addr(
+							portid, w, &eth);
+				if (ret == -ENOTSUP)
+					ret = rte_pmd_bnxt_set_vf_mac_addr(
+							portid, w, &eth);
+
+				switch (ret) {
+				case 0:
+					printf("Port %d VF %d MAC: ",
+							portid, w);
+					for (j = 0; j < 5; j++) {
+						printf("%02x:",
+							eth.addr_bytes[j]);
+					}
+					printf("%02x\n", eth.addr_bytes[5]);
+					break;
 				}
 				printf("\n");
 				break;
-- 
2.17.1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [dpdk-dev] [PATCH v3 2/9] examples/vm_power: add core list parameter
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 1/9] examples/vm_power: add check for port count David Hunt
@ 2018-06-26  9:23         ` David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 3/9] examples/vm_power: add oob monitoring functions David Hunt
                           ` (7 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add the '-l' command line parameter (also --core-list),
so the user can now pass --core-list=4,6,8-10 and it will
expand out to 4,6,8,9,10 using the parse function provided
in parse.c (parse_set).
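
For readers who want to see the expansion in isolation, here is a
minimal sketch of turning "4,6,8-10" into per-core flags. It is not the
parse_set() function from parse.c: expand_core_list is a hypothetical
helper, it does not skip blanks, and it rejects reversed ranges such as
"10-8".

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Sets set[i] = 1 for every core named in list (e.g. "4,6,8-10").
 * Returns the number of distinct cores marked, or -1 on malformed input
 * or when a core number is >= num.
 */
int
expand_core_list(const char *list, uint16_t set[], unsigned int num)
{
	const char *p = list;
	int count = 0;

	assert(list != NULL && set != NULL);
	memset(set, 0, num * sizeof(set[0]));

	for (;;) {
		char *end;
		unsigned long lo, hi;

		/* Every element must start with a digit. */
		if (!isdigit((unsigned char)*p))
			return -1;
		errno = 0;
		lo = hi = strtoul(p, &end, 10);
		if (errno != 0)
			return -1;

		/* Optional "-<hi>" turns the element into a range. */
		if (*end == '-') {
			p = end + 1;
			if (!isdigit((unsigned char)*p))
				return -1;
			hi = strtoul(p, &end, 10);
			if (errno != 0)
				return -1;
		}
		if (lo > hi || hi >= num)
			return -1;

		for (unsigned long i = lo; i <= hi; i++) {
			if (!set[i])
				count++;
			set[i] = 1;
		}

		if (*end == '\0')
			return count;
		if (*end != ',')
			return -1;
		p = end + 1;
	}
}
```

With a 16-entry array, the input "4,6,8-10" marks entries 4, 6, 8, 9
and 10 and leaves the rest at zero.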

This list of cores is then used to enable out-of-band monitoring
to scale up and down these cores based on the ratio of branch
hits versus branch misses. The ratio will be low when a poll
loop is spinning with no packets being received, so the frequency
will be scaled down.

Also, as part of this change, we introduce a core_info struct
which keeps information on each core in the system, and whether
we're doing out of band monitoring on them.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/Makefile        |  2 +-
 examples/vm_power_manager/main.c          | 34 ++++++++-
 examples/vm_power_manager/parse.c         | 93 +++++++++++++++++++++++
 examples/vm_power_manager/parse.h         | 20 +++++
 examples/vm_power_manager/power_manager.c | 31 ++++++++
 examples/vm_power_manager/power_manager.h | 20 +++++
 6 files changed, 197 insertions(+), 3 deletions(-)
 create mode 100644 examples/vm_power_manager/parse.c
 create mode 100644 examples/vm_power_manager/parse.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index ef2a9f959..0c925967c 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -19,7 +19,7 @@ APP = vm_power_mgr
 
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
-SRCS-y += channel_monitor.c
+SRCS-y += channel_monitor.c parse.c
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 043b374bc..cc2a1289c 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "parse.h"
 #include <rte_pmd_ixgbe.h>
 #include <rte_pmd_i40e.h>
 #include <rte_pmd_bnxt.h>
@@ -135,18 +136,22 @@ parse_portmask(const char *portmask)
 static int
 parse_args(int argc, char **argv)
 {
-	int opt, ret;
+	int opt, ret, cnt, i;
 	char **argvopt;
+	uint16_t *oob_enable;
 	int option_index;
 	char *prgname = argv[0];
+	struct core_info *ci;
 	static struct option lgopts[] = {
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
+		{ "core-list", optional_argument, 0, 'l'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
+	ci = get_core_info();
 
-	while ((opt = getopt_long(argc, argvopt, "p:q:T:",
+	while ((opt = getopt_long(argc, argvopt, "l:p:q:T:",
 				  lgopts, &option_index)) != EOF) {
 
 		switch (opt) {
@@ -158,6 +163,27 @@ parse_args(int argc, char **argv)
 				return -1;
 			}
 			break;
+		case 'l':
+			oob_enable = malloc(ci->core_count * sizeof(uint16_t));
+			if (oob_enable == NULL) {
+				printf("Error - Unable to allocate memory\n");
+				return -1;
+			}
+			cnt = parse_set(optarg, oob_enable, ci->core_count);
+			if (cnt < 0) {
+				printf("Invalid core-list - [%s]\n",
+						optarg);
+				break;
+			}
+			for (i = 0; i < ci->core_count; i++) {
+				if (oob_enable[i]) {
+					printf("***Using core %d\n", i);
+					ci->cd[i].oob_enabled = 1;
+					ci->cd[i].global_enabled_cpus = 1;
+				}
+			}
+			free(oob_enable);
+			break;
 		/* long options */
 		case 0:
 			break;
@@ -263,6 +289,10 @@ main(int argc, char **argv)
 	uint16_t portid;
 
 
+	ret = core_info_init();
+	if (ret < 0)
+		rte_panic("Cannot allocate core info\n");
+
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
 		rte_panic("Cannot init EAL\n");
diff --git a/examples/vm_power_manager/parse.c b/examples/vm_power_manager/parse.c
new file mode 100644
index 000000000..9de15c4a7
--- /dev/null
+++ b/examples/vm_power_manager/parse.c
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation.
+ * Copyright(c) 2014 6WIND S.A.
+ */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <syslog.h>
+#include <ctype.h>
+#include <limits.h>
+#include <errno.h>
+#include <getopt.h>
+#include <dlfcn.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <dirent.h>
+#include <rte_eal.h>
+#include <rte_log.h>
+#include "parse.h"
+
+/*
+ * Parse elem, the elem could be single number/range or group
+ * 1) A single number elem, it's just a simple digit. e.g. 9
+ * 2) A single range elem, two digits with a '-' between. e.g. 2-6
+ * 3) A group elem, combines multiple 1) or 2) e.g 0,2-4,6
+ *    Within group, '-' used for a range separator;
+ *                       ',' used for a single number.
+ */
+int
+parse_set(const char *input, uint16_t set[], unsigned int num)
+{
+	unsigned int idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned int min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only digit or left bracket is qualify for start point */
+	if (!isdigit(*str) || *str == '\0')
+		return -1;
+
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = num;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-' and ',' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == num)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == '\0')) {
+			max = idx;
+
+			if (min == num)
+				min = idx;
+
+			for (idx = RTE_MIN(min, max);
+					idx <= RTE_MAX(min, max); idx++) {
+				set[idx] = 1;
+			}
+			min = num;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0');
+
+	return str - input;
+}
diff --git a/examples/vm_power_manager/parse.h b/examples/vm_power_manager/parse.h
new file mode 100644
index 000000000..a5971e9a2
--- /dev/null
+++ b/examples/vm_power_manager/parse.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef PARSE_H_
+#define PARSE_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+int
+parse_set(const char *, uint16_t [], unsigned int);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* PARSE_H_ */
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 35db25591..a7849e48a 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -12,6 +12,7 @@
 #include <dirent.h>
 #include <errno.h>
 
+#include <sys/sysinfo.h>
 #include <sys/types.h>
 
 #include <rte_log.h>
@@ -54,6 +55,7 @@ struct freq_info {
 
 static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
 
+struct core_info ci;
 static uint64_t global_enabled_cpus;
 
 #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
@@ -76,6 +78,35 @@ set_host_cpus_mask(void)
 	return num_cpus;
 }
 
+struct core_info *
+get_core_info(void)
+{
+	return &ci;
+}
+
+int
+core_info_init(void)
+{
+	struct core_info *ci;
+	int i;
+
+	ci = get_core_info();
+
+	ci->core_count = get_nprocs_conf();
+	ci->cd = malloc(ci->core_count * sizeof(struct core_details));
+	if (!ci->cd) {
+		RTE_LOG(ERR, POWER_MANAGER, "Failed to allocate memory for core info.");
+		return -1;
+	}
+	for (i = 0; i < ci->core_count; i++) {
+		ci->cd[i].global_enabled_cpus = 1;
+		ci->cd[i].oob_enabled = 0;
+		ci->cd[i].msr_fd = 0;
+	}
+	printf("%d cores in system\n", ci->core_count);
+	return 0;
+}
+
 int
 power_manager_init(void)
 {
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index 8a8a84aa4..45385de37 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -8,6 +8,26 @@
 #ifdef __cplusplus
 extern "C" {
 #endif
+struct core_details {
+	uint64_t last_branches;
+	uint64_t last_branch_misses;
+	uint16_t global_enabled_cpus;
+	uint16_t oob_enabled;
+	int msr_fd;
+};
+
+struct core_info {
+	uint16_t core_count;
+	struct core_details *cd;
+};
+
+struct core_info *
+get_core_info(void);
+
+int
+core_info_init(void);
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
 
 /* Maximum number of CPUS to manage */
 #define POWER_MGR_MAX_CPUS 64
-- 
2.17.1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [dpdk-dev] [PATCH v3 3/9] examples/vm_power: add oob monitoring functions
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 1/9] examples/vm_power: add check for port count David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 2/9] examples/vm_power: add core list parameter David Hunt
@ 2018-06-26  9:23         ` David Hunt
  2018-07-12 19:13           ` Thomas Monjalon
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 4/9] examples/vm_power: allow greater than 64 cores David Hunt
                           ` (6 subsequent siblings)
  9 siblings, 1 reply; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

This patch introduces the out-of-band (oob) core monitoring
functions.

The functions are similar to the channel manager functions.
There are functions to add and remove cores from the
list of cores being monitored. There is a function to initialise
the monitor setup, run the monitor thread, and exit the monitor.

The monitor thread runs in its own lcore, and is separate
functionality to the channel monitor, which is epoll based.
This thread is timer based. It loops through all monitored cores,
calculates the branch ratio, scales up or down the core, then
sleeps for an interval (~250 uS).

The method it uses to read the branch counters is a pread on the
/dev/cpu/x/msr file, so the 'msr' kernel module needs to be loaded.
Also, since the msr.h file has been made unavailable in recent
kernels, we have #defines for the relevant MSRs included in the
code.
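
The access pattern can be sketched as follows, assuming a Linux host
with the msr module loaded and sufficient privileges to open the device
node. The helper names msr_open and msr_read are hypothetical and only
illustrate the technique: the file offset passed to pread selects the
register address.

```c
#define _XOPEN_SOURCE 700	/* for pread() under strict -std=c99/c11 */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Same register address the patch uses for IA32_PERFCTR0. */
#define MSR_IA32_PERFCTR0 0xc1

/* Open /dev/cpu/<core>/msr read-only; returns an fd, or -1 on failure. */
int
msr_open(unsigned int core)
{
	char path[64];

	snprintf(path, sizeof(path), "/dev/cpu/%u/msr", core);
	return open(path, O_RDONLY);
}

/*
 * Read one 64-bit register from an open msr file descriptor.
 * Returns 0 on success, -1 if the full 8 bytes could not be read.
 */
int
msr_read(int fd, uint32_t reg, uint64_t *value)
{
	assert(value != NULL);

	ssize_t n = pread(fd, value, sizeof(*value), (off_t)reg);

	return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

Typical use would be fd = msr_open(core) once, then
msr_read(fd, MSR_IA32_PERFCTR0, &count) on every polling interval.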

The makefile has a switch for x86 and non-x86 platforms,
and compiles stub functions for non-x86 platforms.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/Makefile          |   5 +
 examples/vm_power_manager/oob_monitor.h     |  68 +++++
 examples/vm_power_manager/oob_monitor_nop.c |  38 +++
 examples/vm_power_manager/oob_monitor_x86.c | 282 ++++++++++++++++++++
 4 files changed, 393 insertions(+)
 create mode 100644 examples/vm_power_manager/oob_monitor.h
 create mode 100644 examples/vm_power_manager/oob_monitor_nop.c
 create mode 100644 examples/vm_power_manager/oob_monitor_x86.c

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index 0c925967c..13a5205ba 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -20,6 +20,11 @@ APP = vm_power_mgr
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
 SRCS-y += channel_monitor.c parse.c
+ifeq ($(CONFIG_RTE_ARCH_X86_64),y)
+SRCS-y += oob_monitor_x86.c
+else
+SRCS-y += oob_monitor_nop.c
+endif
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/oob_monitor.h b/examples/vm_power_manager/oob_monitor.h
new file mode 100644
index 000000000..b96e08df7
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef OOB_MONITOR_H_
+#define OOB_MONITOR_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Branch Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int branch_monitor_init(void);
+
+/**
+ * Run the OOB branch monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_branch_monitor(void);
+
+/**
+ * Exit the OOB Branch Monitor.
+ *
+ * @return
+ *  None
+ */
+void branch_monitor_exit(void);
+
+/**
+ * Add a core to the list of cores to monitor.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_core_to_monitor(int core);
+
+/**
+ * Remove a previously added core from core list.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_core_from_monitor(int core);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* OOB_MONITOR_H_ */
diff --git a/examples/vm_power_manager/oob_monitor_nop.c b/examples/vm_power_manager/oob_monitor_nop.c
new file mode 100644
index 000000000..7e7b8bc14
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_nop.c
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#include "oob_monitor.h"
+
+void branch_monitor_exit(void)
+{
+}
+
+__attribute__((unused)) static float
+apply_policy(__attribute__((unused)) int core)
+{
+	return 0.0;
+}
+
+int
+add_core_to_monitor(__attribute__((unused)) int core)
+{
+	return 0;
+}
+
+int
+remove_core_from_monitor(__attribute__((unused)) int core)
+{
+	return 0;
+}
+
+int
+branch_monitor_init(void)
+{
+	return 0;
+}
+
+void
+run_branch_monitor(void)
+{
+}
diff --git a/examples/vm_power_manager/oob_monitor_x86.c b/examples/vm_power_manager/oob_monitor_x86.c
new file mode 100644
index 000000000..485ec5e3f
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_x86.c
@@ -0,0 +1,282 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <signal.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/epoll.h>
+#include <sys/queue.h>
+#include <sys/time.h>
+#include <fcntl.h>
+
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_ethdev.h>
+#include <rte_pmd_i40e.h>
+
+#include <libvirt/libvirt.h>
+#include "oob_monitor.h"
+#include "power_manager.h"
+#include "channel_manager.h"
+
+#include <rte_log.h>
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+static volatile unsigned run_loop = 1;
+static uint64_t g_branches, g_branch_misses;
+static int g_active;
+
+void branch_monitor_exit(void)
+{
+	run_loop = 0;
+}
+
+/* Number of microseconds between each poll */
+#define INTERVAL 100
+#define PRINT_LOOP_COUNT (1000000/INTERVAL)
+#define RATIO_THRESHOLD 0.03
+#define IA32_PERFEVTSEL0 0x186
+#define IA32_PERFEVTSEL1 0x187
+#define IA32_PERFCTR0 0xc1
+#define IA32_PERFCTR1 0xc2
+#define IA32_PERFEVT_BRANCH_HITS 0x05300c4
+#define IA32_PERFEVT_BRANCH_MISS 0x05300c5
+
+static float
+apply_policy(int core)
+{
+	struct core_info *ci;
+	uint64_t counter;
+	uint64_t branches, branch_misses;
+	uint32_t last_branches, last_branch_misses;
+	int hits_diff, miss_diff;
+	float ratio;
+	int ret;
+
+	g_active = 0;
+	ci = get_core_info();
+
+	last_branches = ci->cd[core].last_branches;
+	last_branch_misses = ci->cd[core].last_branch_misses;
+
+	ret = pread(ci->cd[core].msr_fd, &counter,
+			sizeof(counter), IA32_PERFCTR0);
+	if (ret < 0)
+		RTE_LOG(ERR, POWER_MANAGER,
+				"unable to read counter for core %u\n",
+				core);
+	branches = counter;
+
+	ret = pread(ci->cd[core].msr_fd, &counter,
+			sizeof(counter), IA32_PERFCTR1);
+	if (ret < 0)
+		RTE_LOG(ERR, POWER_MANAGER,
+				"unable to read counter for core %u\n",
+				core);
+	branch_misses = counter;
+
+
+	ci->cd[core].last_branches = branches;
+	ci->cd[core].last_branch_misses = branch_misses;
+
+	hits_diff = (int)branches - (int)last_branches;
+	if (hits_diff <= 0) {
+		/* Likely a counter overflow condition, skip this round */
+		return -1.0;
+	}
+
+	miss_diff = (int)branch_misses - (int)last_branch_misses;
+	if (miss_diff <= 0) {
+		/* Likely a counter overflow condition, skip this round */
+		return -1.0;
+	}
+
+	g_branches = hits_diff;
+	g_branch_misses = miss_diff;
+
+	if (hits_diff < (INTERVAL*100)) {
+		/* Likely no workload running on this core. Skip. */
+		return -1.0;
+	}
+
+	ratio = (float)miss_diff * (float)100 / (float)hits_diff;
+
+	if (ratio < RATIO_THRESHOLD)
+		power_manager_scale_core_min(core);
+	else
+		power_manager_scale_core_max(core);
+
+	g_active = 1;
+	return ratio;
+}
+
+int
+add_core_to_monitor(int core)
+{
+	struct core_info *ci;
+	char proc_file[UNIX_PATH_MAX];
+	int ret;
+
+	ci = get_core_info();
+
+	if (core < ci->core_count) {
+		long setup;
+
+		snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core);
+		ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		/*
+		 * Set up branch counters
+		 */
+		setup = IA32_PERFEVT_BRANCH_HITS;
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL0);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		setup = IA32_PERFEVT_BRANCH_MISS;
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL1);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		/*
+		 * Close the file and re-open as read only so
+		 * as not to hog the resource
+		 */
+		close(ci->cd[core].msr_fd);
+		ci->cd[core].msr_fd = open(proc_file, O_RDONLY);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		ci->cd[core].oob_enabled = 1;
+	}
+	return 0;
+}
+
+int
+remove_core_from_monitor(int core)
+{
+	struct core_info *ci;
+	char proc_file[UNIX_PATH_MAX];
+	int ret;
+
+	ci = get_core_info();
+
+	if (ci->cd[core].oob_enabled) {
+		long setup;
+
+		/*
+		 * close the msr file, then reopen rw so we can
+		 * disable the counters
+		 */
+		if (ci->cd[core].msr_fd != 0)
+			close(ci->cd[core].msr_fd);
+		snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core);
+		ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		setup = 0x0; /* clear event */
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL0);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		setup = 0x0; /* clear event */
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL1);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+
+		close(ci->cd[core].msr_fd);
+		ci->cd[core].msr_fd = 0;
+		ci->cd[core].oob_enabled = 0;
+	}
+	return 0;
+}
+
+int
+branch_monitor_init(void)
+{
+	return 0;
+}
+
+void
+run_branch_monitor(void)
+{
+	struct core_info *ci;
+	int print = 0;
+	float ratio;
+	int printed;
+	int reads = 0;
+
+	ci = get_core_info();
+
+	while (run_loop) {
+
+		if (!run_loop)
+			break;
+		usleep(INTERVAL);
+		int j;
+		print++;
+		printed = 0;
+		for (j = 0; j < ci->core_count; j++) {
+			if (ci->cd[j].oob_enabled) {
+				ratio = apply_policy(j);
+				if ((print > PRINT_LOOP_COUNT) && (g_active)) {
+					printf("  %d: %.4f {%lu} {%d}", j,
+							ratio, g_branches,
+							reads);
+					printed = 1;
+					reads = 0;
+				} else {
+					reads++;
+				}
+			}
+		}
+		if (print > PRINT_LOOP_COUNT) {
+			if (printed)
+				printf("\n");
+			print = 0;
+		}
+	}
+}
-- 
2.17.1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [dpdk-dev] [PATCH v3 4/9] examples/vm_power: allow greater than 64 cores
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
                           ` (2 preceding siblings ...)
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 3/9] examples/vm_power: add oob monitoring functions David Hunt
@ 2018-06-26  9:23         ` David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 5/9] examples/vm_power: add thread for oob core monitor David Hunt
                           ` (5 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

To facilitate more info per core, change the global_cpu_mask
from a uint64_t to an array. This also removes the limit of
64 cores, allocating the array at run-time based on the number of
cores found in the system.
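
To make the motivation concrete, the following sketch contrasts the two
representations. It is illustrative only: struct core_flags and the
three helpers are hypothetical stand-ins, not the core_info/core_details
structures used by the series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-core state, standing in for the real per-core struct. */
struct core_flags {
	uint16_t enabled;
};

/* Old scheme: one bit per core, so cores above 63 cannot be represented. */
int
mask_is_enabled(uint64_t mask, unsigned int core)
{
	return core < 64 && ((mask >> core) & 1ULL);
}

/* New scheme: one zero-initialised array element per core, sized at run time. */
struct core_flags *
flags_alloc(unsigned int core_count)
{
	assert(core_count > 0);
	return calloc(core_count, sizeof(struct core_flags));
}

/* Bounds-checked lookup against the run-time core count. */
int
flags_is_enabled(const struct core_flags *f, unsigned int core_count,
		unsigned int core)
{
	return core < core_count && f[core].enabled;
}
```

The array form also leaves room for further per-core fields, which a
single bit per core cannot hold.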

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/power_manager.c | 115 +++++++++++-----------
 1 file changed, 58 insertions(+), 57 deletions(-)

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index a7849e48a..4bdde23da 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -19,14 +19,14 @@
 #include <rte_power.h>
 #include <rte_spinlock.h>
 
+#include "channel_manager.h"
 #include "power_manager.h"
-
-#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+#include "oob_monitor.h"
 
 #define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \
-	if (core_num >= POWER_MGR_MAX_CPUS) \
+	if (core_num >= ci.core_count) \
 		return -1; \
-	if (!(global_enabled_cpus & (1ULL << core_num))) \
+	if (!(ci.cd[core_num].global_enabled_cpus)) \
 		return -1; \
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
 	ret = rte_power_freq_##DIRECTION(core_num); \
@@ -37,7 +37,7 @@
 	int i; \
 	for (i = 0; core_mask; core_mask &= ~(1 << i++)) { \
 		if ((core_mask >> i) & 1) { \
-			if (!(global_enabled_cpus & (1ULL << i))) \
+			if (!(ci.cd[i].global_enabled_cpus)) \
 				continue; \
 			rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
 			if (rte_power_freq_##DIRECTION(i) != 1) \
@@ -56,28 +56,9 @@ struct freq_info {
 static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
 
 struct core_info ci;
-static uint64_t global_enabled_cpus;
 
 #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
 
-static unsigned
-set_host_cpus_mask(void)
-{
-	char path[PATH_MAX];
-	unsigned i;
-	unsigned num_cpus = 0;
-
-	for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
-		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
-		if (access(path, F_OK) == 0) {
-			global_enabled_cpus |= 1ULL << i;
-			num_cpus++;
-		} else
-			return num_cpus;
-	}
-	return num_cpus;
-}
-
 struct core_info *
 get_core_info(void)
 {
@@ -110,38 +91,45 @@ core_info_init(void)
 int
 power_manager_init(void)
 {
-	unsigned int i, num_cpus, num_freqs;
-	uint64_t cpu_mask;
+	unsigned int i, num_cpus = 0, num_freqs = 0;
 	int ret = 0;
+	struct core_info *ci;
+
+	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
 
-	num_cpus = set_host_cpus_mask();
-	if (num_cpus == 0) {
-		RTE_LOG(ERR, POWER_MANAGER, "Unable to detected host CPUs, please "
-			"ensure that sufficient privileges exist to inspect sysfs\n");
+	ci = get_core_info();
+	if (!ci) {
+		RTE_LOG(ERR, POWER_MANAGER,
+				"Failed to get core info!\n");
 		return -1;
 	}
-	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
-	cpu_mask = global_enabled_cpus;
-	for (i = 0; cpu_mask; cpu_mask &= ~(1 << i++)) {
-		if (rte_power_init(i) < 0)
-			RTE_LOG(ERR, POWER_MANAGER,
-					"Unable to initialize power manager "
-					"for core %u\n", i);
-		num_freqs = rte_power_freqs(i, global_core_freq_info[i].freqs,
+
+	for (i = 0; i < ci->core_count; i++) {
+		if (ci->cd[i].global_enabled_cpus) {
+			if (rte_power_init(i) < 0)
+				RTE_LOG(ERR, POWER_MANAGER,
+						"Unable to initialize power manager "
+						"for core %u\n", i);
+			num_cpus++;
+			num_freqs = rte_power_freqs(i,
+					global_core_freq_info[i].freqs,
 					RTE_MAX_LCORE_FREQS);
-		if (num_freqs == 0) {
-			RTE_LOG(ERR, POWER_MANAGER,
-				"Unable to get frequency list for core %u\n",
-				i);
-			global_enabled_cpus &= ~(1 << i);
-			num_cpus--;
-			ret = -1;
+			if (num_freqs == 0) {
+				RTE_LOG(ERR, POWER_MANAGER,
+					"Unable to get frequency list for core %u\n",
+					i);
+				ci->cd[i].oob_enabled = 0;
+				ret = -1;
+			}
+			global_core_freq_info[i].num_freqs = num_freqs;
+
+			rte_spinlock_init(&global_core_freq_info[i].power_sl);
 		}
-		global_core_freq_info[i].num_freqs = num_freqs;
-		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+		if (ci->cd[i].oob_enabled)
+			add_core_to_monitor(i);
 	}
-	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs , enabled core mask:"
-					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	RTE_LOG(INFO, POWER_MANAGER, "Managing %u cores out of %u available host cores\n",
+			num_cpus, ci->core_count);
 	return ret;
 
 }
@@ -156,7 +144,7 @@ power_manager_get_current_frequency(unsigned core_num)
 				core_num, POWER_MGR_MAX_CPUS-1);
 		return -1;
 	}
-	if (!(global_enabled_cpus & (1ULL << core_num)))
+	if (!(ci.cd[core_num].global_enabled_cpus))
 		return 0;
 
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
@@ -175,15 +163,26 @@ power_manager_exit(void)
 {
 	unsigned int i;
 	int ret = 0;
+	struct core_info *ci;
 
-	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1 << i++)) {
-		if (rte_power_exit(i) < 0) {
-			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
-					"for core %u\n", i);
-			ret = -1;
+	ci = get_core_info();
+	if (!ci) {
+		RTE_LOG(ERR, POWER_MANAGER,
+				"Failed to get core info!\n");
+		return -1;
+	}
+
+	for (i = 0; i < ci->core_count; i++) {
+		if (ci->cd[i].global_enabled_cpus) {
+			if (rte_power_exit(i) < 0) {
+				RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+						"for core %u\n", i);
+				ret = -1;
+			}
+			ci->cd[i].global_enabled_cpus = 0;
 		}
+		remove_core_from_monitor(i);
 	}
-	global_enabled_cpus = 0;
 	return ret;
 }
 
@@ -299,10 +298,12 @@ int
 power_manager_scale_core_med(unsigned int core_num)
 {
 	int ret = 0;
+	struct core_info *ci;
 
+	ci = get_core_info();
 	if (core_num >= POWER_MGR_MAX_CPUS)
 		return -1;
-	if (!(global_enabled_cpus & (1ULL << core_num)))
+	if (!(ci->cd[core_num].global_enabled_cpus))
 		return -1;
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
 	ret = rte_power_set_freq(core_num,
-- 
2.17.1


* [dpdk-dev] [PATCH v3 5/9] examples/vm_power: add thread for oob core monitor
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
                           ` (3 preceding siblings ...)
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 4/9] examples/vm_power: allow greater than 64 cores David Hunt
@ 2018-06-26  9:23         ` David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 6/9] examples/vm_power: add port-list to command line David Hunt
                           ` (4 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Change the app to now require three cores, as the third core
will be used to run the oob monitoring thread.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/main.c | 37 +++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index cc2a1289c..4c6b5a990 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "oob_monitor.h"
 #include "parse.h"
 #include <rte_pmd_ixgbe.h>
 #include <rte_pmd_i40e.h>
@@ -269,6 +270,17 @@ run_monitor(__attribute__((unused)) void *arg)
 	return 0;
 }
 
+static int
+run_core_monitor(__attribute__((unused)) void *arg)
+{
+	if (branch_monitor_init() < 0) {
+		printf("Unable to initialize core monitor\n");
+		return -1;
+	}
+	run_branch_monitor();
+	return 0;
+}
+
 static void
 sig_handler(int signo)
 {
@@ -287,12 +299,15 @@ main(int argc, char **argv)
 	unsigned int nb_ports;
 	struct rte_mempool *mbuf_pool;
 	uint16_t portid;
+	struct core_info *ci;
 
 
 	ret = core_info_init();
 	if (ret < 0)
 		rte_panic("Cannot allocate core info\n");
 
+	ci = get_core_info();
+
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
 		rte_panic("Cannot init EAL\n");
@@ -367,16 +382,23 @@ main(int argc, char **argv)
 		}
 	}
 
+	check_all_ports_link_status(enabled_port_mask);
+
 	lcore_id = rte_get_next_lcore(-1, 1, 0);
 	if (lcore_id == RTE_MAX_LCORE) {
-		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
+		RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
 				"application\n");
 		return 0;
 	}
-
-	check_all_ports_link_status(enabled_port_mask);
+	printf("Running channel monitor on lcore id %d\n", lcore_id);
 	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
 
+	lcore_id = rte_get_next_lcore(lcore_id, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
+				"application\n");
+		return 0;
+	}
 	if (power_manager_init() < 0) {
 		printf("Unable to initialize power manager\n");
 		return -1;
@@ -385,8 +407,17 @@ main(int argc, char **argv)
 		printf("Unable to initialize channel manager\n");
 		return -1;
 	}
+
+	printf("Running core monitor on lcore id %d\n", lcore_id);
+	rte_eal_remote_launch(run_core_monitor, NULL, lcore_id);
+
 	run_cli(NULL);
 
+	branch_monitor_exit();
+
 	rte_eal_mp_wait_lcore();
+
+	free(ci->cd);
+
 	return 0;
 }
-- 
2.17.1


* [dpdk-dev] [PATCH v3 6/9] examples/vm_power: add port-list to command line
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
                           ` (4 preceding siblings ...)
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 5/9] examples/vm_power: add thread for oob core monitor David Hunt
@ 2018-06-26  9:23         ` David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 7/9] examples/vm_power: add branch ratio policy type David Hunt
                           ` (3 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add the long form of -p, which is --port-list.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 4c6b5a990..4088861f1 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -147,6 +147,7 @@ parse_args(int argc, char **argv)
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
 		{ "core-list", optional_argument, 0, 'l'},
+		{ "port-list", optional_argument, 0, 'p'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
-- 
2.17.1


* [dpdk-dev] [PATCH v3 7/9] examples/vm_power: add branch ratio policy type
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
                           ` (5 preceding siblings ...)
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 6/9] examples/vm_power: add port-list to command line David Hunt
@ 2018-06-26  9:23         ` David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 8/9] examples/vm_power: add cli args to guest app David Hunt
                           ` (2 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add the capability for the vm_power_manager to receive
a policy of type BRANCH_RATIO. This will add any vcpus
in the policy to the oob monitoring thread.
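
As a rough illustration of the step described above, the sketch below walks a 64-bit physical-CPU mask and flags every set bit for out-of-band monitoring. The names `struct core_flag` and `enable_oob_for_mask` are hypothetical and exist only for this example; the patch itself sets `ci->cd[pcpu].oob_enabled` and calls `add_core_to_monitor()` inside `get_pcpu_to_control()`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-core state kept by the host app. */
struct core_flag {
	int oob_enabled;
};

/*
 * Flag every physical CPU whose bit is set in pcpu_mask.
 * Returns the number of cores flagged. Bits at or beyond num_cores
 * are ignored so the array is never indexed out of bounds.
 */
static int
enable_oob_for_mask(uint64_t pcpu_mask, struct core_flag cores[],
		unsigned int num_cores)
{
	unsigned int pcpu;
	int enabled = 0;

	for (pcpu = 0; pcpu < 64 && pcpu < num_cores; pcpu++) {
		if ((pcpu_mask >> pcpu) & 1) {
			cores[pcpu].oob_enabled = 1;
			enabled++;
		}
	}
	return enabled;
}
```

With a mask of 0x0A and eight cores, this would flag cores 1 and 3 only.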

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/channel_monitor.c | 23 +++++++++++++++++++--
 lib/librte_power/channel_commands.h         |  3 ++-
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index 73bddd993..7fa47ba97 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -27,6 +27,7 @@
 #include "channel_commands.h"
 #include "channel_manager.h"
 #include "power_manager.h"
+#include "oob_monitor.h"
 
 #define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
 
@@ -92,6 +93,10 @@ get_pcpu_to_control(struct policy *pol)
 	struct vm_info info;
 	int pcpu, count;
 	uint64_t mask_u64b;
+	struct core_info *ci;
+	int ret;
+
+	ci = get_core_info();
 
 	RTE_LOG(INFO, CHANNEL_MONITOR, "Looking for pcpu for %s\n",
 			pol->pkt.vm_name);
@@ -100,8 +105,22 @@ get_pcpu_to_control(struct policy *pol)
 	for (count = 0; count < pol->pkt.num_vcpu; count++) {
 		mask_u64b = info.pcpu_mask[pol->pkt.vcpu_to_control[count]];
 		for (pcpu = 0; mask_u64b; mask_u64b &= ~(1ULL << pcpu++)) {
-			if ((mask_u64b >> pcpu) & 1)
-				pol->core_share[count].pcpu = pcpu;
+			if ((mask_u64b >> pcpu) & 1) {
+				if (pol->pkt.policy_to_use == BRANCH_RATIO) {
+					ci->cd[pcpu].oob_enabled = 1;
+					ret = add_core_to_monitor(pcpu);
+					if (ret == 0)
+						printf("Monitoring pcpu %d via Branch Ratio\n",
+								pcpu);
+					else
+						printf("Failed to start OOB Monitoring pcpu %d\n",
+								pcpu);
+
+				} else {
+					pol->core_share[count].pcpu = pcpu;
+					printf("Monitoring pcpu %d\n", pcpu);
+				}
+			}
 		}
 	}
 }
diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
index 5e8b4ab5d..ee638eefa 100644
--- a/lib/librte_power/channel_commands.h
+++ b/lib/librte_power/channel_commands.h
@@ -48,7 +48,8 @@ enum workload {HIGH, MEDIUM, LOW};
 enum policy_to_use {
 	TRAFFIC,
 	TIME,
-	WORKLOAD
+	WORKLOAD,
+	BRANCH_RATIO
 };
 
 struct traffic {
-- 
2.17.1


* [dpdk-dev] [PATCH v3 8/9] examples/vm_power: add cli args to guest app
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
                           ` (6 preceding siblings ...)
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 7/9] examples/vm_power: add branch ratio policy type David Hunt
@ 2018-06-26  9:23         ` David Hunt
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 9/9] examples/vm_power: make branch ratio configurable David Hunt
  2018-07-12 19:09         ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling Thomas Monjalon
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

Add new command line arguments to the guest app to make
    testing and validation of the policy usage easier.
    These arguments are mainly around setting up the power
    management policy that is sent from the guest vm to
    the vm_power_manager in the host.

    New command line parameters:
    -n or --vm-name
       sets the name of the vm to be used by the host OS.
    -b or --busy-hours
       sets the list of hours that are predicted to be busy
    -q or --quiet-hours
       sets the list of hours that are predicted to be quiet
    -l or --vcpu-list
       sets the list of vcpus to monitor
    -p or --port-list
       sets the list of ports to monitor when using a
       workload policy.
    -o or --policy
       sets the default policy type
          TIME
          WORKLOAD
          TRAFFIC
          BRANCH_RATIO

    The format of the hours or list parameters is a comma-separated
    list of integers, which can take the form of
       a. x    e.g. --vcpu-list=1
       b. x,y  e.g. --quiet-hours=3,4
       c. x-y  e.g. --busy-hours=9-12
       d. combination of above (e.g. --busy-hours=4,5-7,9)
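
The list format above can be sketched with a small standalone parser. This is a simplified illustration, not the `parse_set()` added by this patch: the helper name `mark_list` is hypothetical, it does not skip blanks, and it rejects descending ranges, which is an assumption made here for brevity.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the parse_set() from this patch): mark each
 * value named in a list such as "4,5-7,9" by setting set[value] = 1.
 * Returns the number of marked entries, or -1 on malformed input.
 */
static int
mark_list(const char *input, uint16_t set[], unsigned int num)
{
	const char *str = input;
	char *end;
	unsigned long lo, hi, v;
	unsigned int i;
	int count = 0;

	memset(set, 0, num * sizeof(set[0]));

	for (;;) {
		/* every element must start with a digit */
		if (!isdigit((unsigned char)*str))
			return -1;
		errno = 0;
		lo = strtoul(str, &end, 10);
		if (errno != 0 || lo >= num)
			return -1;
		hi = lo;

		/* optional "-y" part of an "x-y" range */
		if (*end == '-') {
			str = end + 1;
			if (!isdigit((unsigned char)*str))
				return -1;
			errno = 0;
			hi = strtoul(str, &end, 10);
			if (errno != 0 || hi >= num || hi < lo)
				return -1;
		}

		for (v = lo; v <= hi; v++)
			set[v] = 1;

		if (*end == '\0')
			break;
		if (*end != ',')
			return -1;
		str = end + 1;
	}

	for (i = 0; i < num; i++)
		count += set[i];
	return count;
}
```

For example, passing "4,5-7,9" with a 24-entry array would mark hours 4, 5, 6, 7 and 9.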

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/guest_cli/Makefile  |   2 +-
 examples/vm_power_manager/guest_cli/main.c    | 151 +++++++++++++++++-
 examples/vm_power_manager/guest_cli/parse.c   |  93 +++++++++++
 examples/vm_power_manager/guest_cli/parse.h   |  19 +++
 .../guest_cli/vm_power_cli_guest.c            | 113 +++++++------
 .../guest_cli/vm_power_cli_guest.h            |   6 +
 6 files changed, 330 insertions(+), 54 deletions(-)
 create mode 100644 examples/vm_power_manager/guest_cli/parse.c
 create mode 100644 examples/vm_power_manager/guest_cli/parse.h

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
index d710e22d9..8b1db861e 100644
--- a/examples/vm_power_manager/guest_cli/Makefile
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -14,7 +14,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 APP = guest_vm_power_mgr
 
 # all source are stored in SRCS-y
-SRCS-y := main.c vm_power_cli_guest.c
+SRCS-y := main.c vm_power_cli_guest.c parse.c
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
index b17936d6b..36365b124 100644
--- a/examples/vm_power_manager/guest_cli/main.c
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -2,23 +2,20 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
-/*
 #include <stdio.h>
-#include <string.h>
-#include <stdint.h>
-#include <sys/epoll.h>
-#include <fcntl.h>
-#include <unistd.h>
 #include <stdlib.h>
-#include <errno.h>
-*/
 #include <signal.h>
+#include <getopt.h>
+#include <string.h>
 
 #include <rte_lcore.h>
 #include <rte_power.h>
 #include <rte_debug.h>
+#include <rte_eal.h>
+#include <rte_log.h>
 
 #include "vm_power_cli_guest.h"
+#include "parse.h"
 
 static void
 sig_handler(int signo)
@@ -32,6 +29,136 @@ sig_handler(int signo)
 
 }
 
+#define MAX_HOURS 24
+
+/* Parse the argument given in the command line of the application */
+static int
+parse_args(int argc, char **argv)
+{
+	int opt, ret;
+	char **argvopt;
+	int option_index;
+	char *prgname = argv[0];
+	const struct option lgopts[] = {
+		{ "vm-name", required_argument, 0, 'n'},
+		{ "busy-hours", required_argument, 0, 'b'},
+		{ "quiet-hours", required_argument, 0, 'q'},
+		{ "port-list", required_argument, 0, 'p'},
+		{ "vcpu-list", required_argument, 0, 'l'},
+		{ "policy", required_argument, 0, 'o'},
+		{NULL, 0, 0, 0}
+	};
+	struct channel_packet *policy;
+	unsigned short int hours[MAX_HOURS];
+	unsigned short int cores[MAX_VCPU_PER_VM];
+	unsigned short int ports[MAX_VCPU_PER_VM];
+	int i, cnt, idx;
+
+	policy = get_policy();
+	set_policy_defaults(policy);
+
+	argvopt = argv;
+
+	while ((opt = getopt_long(argc, argvopt, "n:b:q:p:",
+				  lgopts, &option_index)) != EOF) {
+
+		switch (opt) {
+		/* portmask */
+		case 'n':
+			strcpy(policy->vm_name, optarg);
+			printf("Setting VM Name to [%s]\n", policy->vm_name);
+			break;
+		case 'b':
+		case 'q':
+			//printf("***Processing set using [%s]\n", optarg);
+			cnt = parse_set(optarg, hours, MAX_HOURS);
+			if (cnt < 0) {
+				printf("Invalid value passed to quiet/busy hours - [%s]\n",
+						optarg);
+				break;
+			}
+			idx = 0;
+			for (i = 0; i < MAX_HOURS; i++) {
+				if (hours[i]) {
+					if (opt == 'b') {
+						printf("***Busy Hour %d\n", i);
+						policy->timer_policy.busy_hours
+							[idx++] = i;
+					} else {
+						printf("***Quiet Hour %d\n", i);
+						policy->timer_policy.quiet_hours
+							[idx++] = i;
+					}
+				}
+			}
+			break;
+		case 'l':
+			cnt = parse_set(optarg, cores, MAX_VCPU_PER_VM);
+			if (cnt < 0) {
+				printf("Invalid value passed to vcpu-list - [%s]\n",
+						optarg);
+				break;
+			}
+			idx = 0;
+			for (i = 0; i < MAX_VCPU_PER_VM; i++) {
+				if (cores[i]) {
+					printf("***Using core %d\n", i);
+					policy->vcpu_to_control[idx++] = i;
+				}
+			}
+			policy->num_vcpu = idx;
+			printf("Total cores: %d\n", idx);
+			break;
+		case 'p':
+			cnt = parse_set(optarg, ports, MAX_VCPU_PER_VM);
+			if (cnt < 0) {
+				printf("Invalid value passed to port-list - [%s]\n",
+						optarg);
+				break;
+			}
+			idx = 0;
+			for (i = 0; i < MAX_VCPU_PER_VM; i++) {
+				if (ports[i]) {
+					printf("***Using port %d\n", i);
+					set_policy_mac(i, idx++);
+				}
+			}
+			policy->nb_mac_to_monitor = idx;
+			printf("Total Ports: %d\n", idx);
+			break;
+		case 'o':
+			if (!strcmp(optarg, "TRAFFIC"))
+				policy->policy_to_use = TRAFFIC;
+			else if (!strcmp(optarg, "TIME"))
+				policy->policy_to_use = TIME;
+			else if (!strcmp(optarg, "WORKLOAD"))
+				policy->policy_to_use = WORKLOAD;
+			else if (!strcmp(optarg, "BRANCH_RATIO"))
+				policy->policy_to_use = BRANCH_RATIO;
+			else {
+				printf("Invalid policy specified: %s\n",
+						optarg);
+				return -1;
+			}
+			break;
+		/* long options */
+
+		case 0:
+			break;
+
+		default:
+			return -1;
+		}
+	}
+
+	if (optind >= 0)
+		argv[optind-1] = prgname;
+
+	ret = optind-1;
+	optind = 0; /* reset getopt lib */
+	return ret;
+}
+
 int
 main(int argc, char **argv)
 {
@@ -45,6 +172,14 @@ main(int argc, char **argv)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
+	argc -= ret;
+	argv += ret;
+
+	/* parse application arguments (after the EAL ones) */
+	ret = parse_args(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid arguments\n");
+
 	rte_power_set_env(PM_ENV_KVM_VM);
 	RTE_LCORE_FOREACH(lcore_id) {
 		rte_power_init(lcore_id);
diff --git a/examples/vm_power_manager/guest_cli/parse.c b/examples/vm_power_manager/guest_cli/parse.c
new file mode 100644
index 000000000..9de15c4a7
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/parse.c
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation.
+ * Copyright(c) 2014 6WIND S.A.
+ */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <syslog.h>
+#include <ctype.h>
+#include <limits.h>
+#include <errno.h>
+#include <getopt.h>
+#include <dlfcn.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <dirent.h>
+#include <rte_eal.h>
+#include <rte_log.h>
+#include "parse.h"
+
+/*
+ * Parse elem, the elem could be single number/range or group
+ * 1) A single number elem, it's just a simple digit. e.g. 9
+ * 2) A single range elem, two digits with a '-' between. e.g. 2-6
+ * 3) A group elem, combines multiple 1) or 2) e.g 0,2-4,6
+ *    Within group, '-' used for a range separator;
+ *                       ',' used for a single number.
+ */
+int
+parse_set(const char *input, uint16_t set[], unsigned int num)
+{
+	unsigned int idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned int min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only digit or left bracket is qualify for start point */
+	if (!isdigit(*str) || *str == '\0')
+		return -1;
+
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = num;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-' and ',' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == num)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == '\0')) {
+			max = idx;
+
+			if (min == num)
+				min = idx;
+
+			for (idx = RTE_MIN(min, max);
+					idx <= RTE_MAX(min, max); idx++) {
+				set[idx] = 1;
+			}
+			min = num;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0');
+
+	return str - input;
+}
diff --git a/examples/vm_power_manager/guest_cli/parse.h b/examples/vm_power_manager/guest_cli/parse.h
new file mode 100644
index 000000000..c8aa0ea50
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/parse.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef PARSE_H_
+#define PARSE_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+int
+parse_set(const char *, uint16_t [], unsigned int);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* PARSE_H_ */
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 43bdeacef..0db1b804f 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -33,6 +33,71 @@ struct cmd_quit_result {
 	cmdline_fixed_string_t quit;
 };
 
+union PFID {
+	struct ether_addr addr;
+	uint64_t pfid;
+};
+
+static struct channel_packet policy;
+
+struct channel_packet *
+get_policy(void)
+{
+	return &policy;
+}
+
+int
+set_policy_mac(int port, int idx)
+{
+	struct channel_packet *policy;
+	union PFID pfid;
+
+	/* Use port MAC address as the vfid */
+	rte_eth_macaddr_get(port, &pfid.addr);
+
+	printf("Port %u MAC: %02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 ":"
+			"%02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 "\n",
+			port,
+			pfid.addr.addr_bytes[0], pfid.addr.addr_bytes[1],
+			pfid.addr.addr_bytes[2], pfid.addr.addr_bytes[3],
+			pfid.addr.addr_bytes[4], pfid.addr.addr_bytes[5]);
+	policy = get_policy();
+	policy->vfid[idx] = pfid.pfid;
+	return 0;
+}
+
+void
+set_policy_defaults(struct channel_packet *pkt)
+{
+	set_policy_mac(0, 0);
+	pkt->nb_mac_to_monitor = 1;
+
+	pkt->t_boost_status.tbEnabled = false;
+
+	pkt->vcpu_to_control[0] = 0;
+	pkt->vcpu_to_control[1] = 1;
+	pkt->num_vcpu = 2;
+	/* Dummy Population. */
+	pkt->traffic_policy.min_packet_thresh = 96000;
+	pkt->traffic_policy.avg_max_packet_thresh = 1800000;
+	pkt->traffic_policy.max_max_packet_thresh = 2000000;
+
+	pkt->timer_policy.busy_hours[0] = 3;
+	pkt->timer_policy.busy_hours[1] = 4;
+	pkt->timer_policy.busy_hours[2] = 5;
+	pkt->timer_policy.quiet_hours[0] = 11;
+	pkt->timer_policy.quiet_hours[1] = 12;
+	pkt->timer_policy.quiet_hours[2] = 13;
+
+	pkt->timer_policy.hours_to_use_traffic_profile[0] = 8;
+	pkt->timer_policy.hours_to_use_traffic_profile[1] = 10;
+
+	pkt->workload = LOW;
+	pkt->policy_to_use = TIME;
+	pkt->command = PKT_POLICY;
+	strcpy(pkt->vm_name, "ubuntu2");
+}
+
 static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
 				__attribute__((unused)) struct cmdline *cl,
 			    __attribute__((unused)) void *data)
@@ -118,54 +183,12 @@ struct cmd_send_policy_result {
 	cmdline_fixed_string_t cmd;
 };
 
-union PFID {
-	struct ether_addr addr;
-	uint64_t pfid;
-};
-
 static inline int
-send_policy(void)
+send_policy(struct channel_packet *pkt)
 {
-	struct channel_packet pkt;
 	int ret;
 
-	union PFID pfid;
-	/* Use port MAC address as the vfid */
-	rte_eth_macaddr_get(0, &pfid.addr);
-	printf("Port %u MAC: %02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 ":"
-			"%02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 "\n",
-			1,
-			pfid.addr.addr_bytes[0], pfid.addr.addr_bytes[1],
-			pfid.addr.addr_bytes[2], pfid.addr.addr_bytes[3],
-			pfid.addr.addr_bytes[4], pfid.addr.addr_bytes[5]);
-	pkt.vfid[0] = pfid.pfid;
-
-	pkt.nb_mac_to_monitor = 1;
-	pkt.t_boost_status.tbEnabled = false;
-
-	pkt.vcpu_to_control[0] = 0;
-	pkt.vcpu_to_control[1] = 1;
-	pkt.num_vcpu = 2;
-	/* Dummy Population. */
-	pkt.traffic_policy.min_packet_thresh = 96000;
-	pkt.traffic_policy.avg_max_packet_thresh = 1800000;
-	pkt.traffic_policy.max_max_packet_thresh = 2000000;
-
-	pkt.timer_policy.busy_hours[0] = 3;
-	pkt.timer_policy.busy_hours[1] = 4;
-	pkt.timer_policy.busy_hours[2] = 5;
-	pkt.timer_policy.quiet_hours[0] = 11;
-	pkt.timer_policy.quiet_hours[1] = 12;
-	pkt.timer_policy.quiet_hours[2] = 13;
-
-	pkt.timer_policy.hours_to_use_traffic_profile[0] = 8;
-	pkt.timer_policy.hours_to_use_traffic_profile[1] = 10;
-
-	pkt.workload = LOW;
-	pkt.policy_to_use = TIME;
-	pkt.command = PKT_POLICY;
-	strcpy(pkt.vm_name, "ubuntu2");
-	ret = rte_power_guest_channel_send_msg(&pkt, 1);
+	ret = rte_power_guest_channel_send_msg(pkt, 1);
 	if (ret == 0)
 		return 1;
 	RTE_LOG(DEBUG, POWER, "Error sending message: %s\n",
@@ -182,7 +205,7 @@ cmd_send_policy_parsed(void *parsed_result, struct cmdline *cl,
 
 	if (!strcmp(res->cmd, "now")) {
 		printf("Sending Policy down now!\n");
-		ret = send_policy();
+		ret = send_policy(&policy);
 	}
 	if (ret != 1)
 		cmdline_printf(cl, "Error sending message: %s\n",
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
index 75a262967..fd77f6a69 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -11,6 +11,12 @@ extern "C" {
 
 #include "channel_commands.h"
 
+struct channel_packet *get_policy(void);
+
+int set_policy_mac(int port, int idx);
+
+void set_policy_defaults(struct channel_packet *pkt);
+
 void run_cli(__attribute__((unused)) void *arg);
 
 #ifdef __cplusplus
-- 
2.17.1


* [dpdk-dev] [PATCH v3 9/9] examples/vm_power: make branch ratio configurable
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
                           ` (7 preceding siblings ...)
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 8/9] examples/vm_power: add cli args to guest app David Hunt
@ 2018-06-26  9:23         ` David Hunt
  2018-07-12 19:09         ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling Thomas Monjalon
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-06-26  9:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt

For different workloads and poll loops, the threshold
at which you want to scale up and down may be different.

This patch allows changing of the default branch ratio
by using the -b command line argument (or --branch-ratio=)
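
A minimal sketch of how a configurable threshold feeds the scaling decision is shown below. The ratio formula mirrors the one in `apply_policy()` in this patch (misses as a percentage of hits); the enum, the function name `choose_scaling`, and the handling of a zero hit count are assumptions added for illustration only.

```c
#include <stdint.h>

/*
 * Assumed names for illustration; the patch calls
 * power_manager_scale_core_min()/max() directly.
 */
enum scale_action { SCALE_DOWN, SCALE_UP };

static enum scale_action
choose_scaling(uint64_t hits_diff, uint64_t miss_diff, float threshold)
{
	float ratio;

	/* No branches seen in the interval: treated as idle (assumption). */
	if (hits_diff == 0)
		return SCALE_DOWN;

	/* Same formula as apply_policy(): misses as a percentage of hits. */
	ratio = (float)miss_diff * 100.0f / (float)hits_diff;

	/* Below the threshold the core is likely spinning in a poll loop. */
	return ratio < threshold ? SCALE_DOWN : SCALE_UP;
}
```

Raising the threshold makes the monitor scale down more readily, so the same counter deltas can yield a different decision depending on the value passed with -b.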

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/main.c            | 16 +++++++++++++++-
 examples/vm_power_manager/oob_monitor_x86.c |  3 +--
 examples/vm_power_manager/power_manager.c   |  1 +
 examples/vm_power_manager/power_manager.h   |  3 +++
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 4088861f1..784d928bd 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -143,17 +143,19 @@ parse_args(int argc, char **argv)
 	int option_index;
 	char *prgname = argv[0];
 	struct core_info *ci;
+	float branch_ratio;
 	static struct option lgopts[] = {
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
 		{ "core-list", optional_argument, 0, 'l'},
 		{ "port-list", optional_argument, 0, 'p'},
+		{ "branch-ratio", optional_argument, 0, 'b'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
 	ci = get_core_info();
 
-	while ((opt = getopt_long(argc, argvopt, "l:p:q:T:",
+	while ((opt = getopt_long(argc, argvopt, "l:p:q:T:b:",
 				  lgopts, &option_index)) != EOF) {
 
 		switch (opt) {
@@ -186,6 +188,18 @@ parse_args(int argc, char **argv)
 			}
 			free(oob_enable);
 			break;
+		case 'b':
+			branch_ratio = 0.0;
+			if (strlen(optarg))
+				branch_ratio = atof(optarg);
+			if (branch_ratio <= 0.0) {
+				printf("invalid branch ratio specified\n");
+				return -1;
+			}
+			ci->branch_ratio_threshold = branch_ratio;
+			printf("***Setting branch ratio to %f\n",
+					branch_ratio);
+			break;
 		/* long options */
 		case 0:
 			break;
diff --git a/examples/vm_power_manager/oob_monitor_x86.c b/examples/vm_power_manager/oob_monitor_x86.c
index 485ec5e3f..ea327b819 100644
--- a/examples/vm_power_manager/oob_monitor_x86.c
+++ b/examples/vm_power_manager/oob_monitor_x86.c
@@ -45,7 +45,6 @@ void branch_monitor_exit(void)
 /* Number of microseconds between each poll */
 #define INTERVAL 100
 #define PRINT_LOOP_COUNT (1000000/INTERVAL)
-#define RATIO_THRESHOLD 0.03
 #define IA32_PERFEVTSEL0 0x186
 #define IA32_PERFEVTSEL1 0x187
 #define IA32_PERFCTR0 0xc1
@@ -112,7 +111,7 @@ apply_policy(int core)
 
 	ratio = (float)miss_diff * (float)100 / (float)hits_diff;
 
-	if (ratio < RATIO_THRESHOLD)
+	if (ratio < ci->branch_ratio_threshold)
 		power_manager_scale_core_min(core);
 	else
 		power_manager_scale_core_max(core);
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 4bdde23da..b7769c3c3 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -74,6 +74,7 @@ core_info_init(void)
 	ci = get_core_info();
 
 	ci->core_count = get_nprocs_conf();
+	ci->branch_ratio_threshold = BRANCH_RATIO_THRESHOLD;
 	ci->cd = malloc(ci->core_count * sizeof(struct core_details));
 	if (!ci->cd) {
 		RTE_LOG(ERR, POWER_MANAGER, "Failed to allocate memory for core info.");
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index 45385de37..605b3c8f6 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -19,8 +19,11 @@ struct core_details {
 struct core_info {
 	uint16_t core_count;
 	struct core_details *cd;
+	float branch_ratio_threshold;
 };
 
+#define BRANCH_RATIO_THRESHOLD 0.1
+
 struct core_info *
 get_core_info(void);
 
-- 
2.17.1


* Re: [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling
  2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
                           ` (8 preceding siblings ...)
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 9/9] examples/vm_power: make branch ratio configurable David Hunt
@ 2018-07-12 19:09         ` Thomas Monjalon
  2018-07-13  8:31           ` Hunt, David
  9 siblings, 1 reply; 46+ messages in thread
From: Thomas Monjalon @ 2018-07-12 19:09 UTC (permalink / raw)
  To: David Hunt
  Cc: dev, jerin.jacob, hemant.agrawal, arybchenko, ferruh.yigit,
	bruce.richardson

26/06/2018 11:23, David Hunt:
> This patch set adds the capability to do out-of-band power
> monitoring on a system. It uses a thread to monitor the branch
> counters in the targeted cores, and calculates the branch ratio
> of the running code.
> 
> If the branch ratio is low (<0.01), then
> the code is most likely running in a tight poll loop and doing
> nothing, i.e. receiving no packets. In this case we scale down
> the frequency of that core.
> 
> If the branch ratio is higher (>0.01), then it is likely that
> the code is receiving and processing packets. In this case, we
> scale up the frequency of that core.
> 
> The cpu counters are read via /dev/cpu/x/msr, so requires the
> msr kernel module to be loaded. Because this method is used,
> the patch set is implemented with one file for x86 systems, and
> another for non-x86 systems, with conditional compilation in
> the Makefile. The non-x86 functions are stubs, and do not
> currently implement any functionality.
> 
> The vm_power_manager app has been modified to take a new parameter
>    --core-list or -l
> which takes a list of cores in a comma-separated list format,
> e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
> These cores will then be enabled for oob monitoring. When the
> OOB monitoring thread starts, it reads the branch hits/miss
> counters of each monitored core, and scales up/down accordingly.

It looks to be a feature which could be integrated in DPDK libs.
Why choose to implement it fully in an example?


* Re: [dpdk-dev] [PATCH v3 3/9] examples/vm_power: add oob monitoring functions
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 3/9] examples/vm_power: add oob monitoring functions David Hunt
@ 2018-07-12 19:13           ` Thomas Monjalon
  2018-07-12 22:18             ` Stephen Hemminger
  2018-07-13  8:24             ` Hunt, David
  0 siblings, 2 replies; 46+ messages in thread
From: Thomas Monjalon @ 2018-07-12 19:13 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

26/06/2018 11:23, David Hunt:
> +#include <unistd.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdint.h>
> +#include <signal.h>
> +#include <errno.h>
> +#include <string.h>
> +#include <sys/types.h>
> +#include <sys/epoll.h>
> +#include <sys/queue.h>
> +#include <sys/time.h>
> +#include <fcntl.h>
> +
> +#include <rte_log.h>
> +#include <rte_memory.h>
> +#include <rte_malloc.h>
> +#include <rte_atomic.h>
> +#include <rte_cycles.h>
> +#include <rte_ethdev.h>
> +#include <rte_pmd_i40e.h>
> +
> +#include <libvirt/libvirt.h>
> +#include "oob_monitor.h"
> +#include "power_manager.h"
> +#include "channel_manager.h"
> +
> +#include <rte_log.h>
> +#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1

I'm sure you don't need all these headers.
rte_log.h is included twice.
rte_pmd_i40e is more than suspicious...

This is a hint that the whole file was probably written too fast :)


* Re: [dpdk-dev] [PATCH v3 3/9] examples/vm_power: add oob monitoring functions
  2018-07-12 19:13           ` Thomas Monjalon
@ 2018-07-12 22:18             ` Stephen Hemminger
  2018-07-13  8:24             ` Hunt, David
  1 sibling, 0 replies; 46+ messages in thread
From: Stephen Hemminger @ 2018-07-12 22:18 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: David Hunt, dev

On Thu, 12 Jul 2018 21:13:26 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:

> 26/06/2018 11:23, David Hunt:
> > +#include <unistd.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <stdint.h>
> > +#include <signal.h>
> > +#include <errno.h>
> > +#include <string.h>
> > +#include <sys/types.h>
> > +#include <sys/epoll.h>
> > +#include <sys/queue.h>
> > +#include <sys/time.h>
> > +#include <fcntl.h>
> > +
> > +#include <rte_log.h>
> > +#include <rte_memory.h>
> > +#include <rte_malloc.h>
> > +#include <rte_atomic.h>
> > +#include <rte_cycles.h>
> > +#include <rte_ethdev.h>
> > +#include <rte_pmd_i40e.h>
> > +
> > +#include <libvirt/libvirt.h>
> > +#include "oob_monitor.h"
> > +#include "power_manager.h"
> > +#include "channel_manager.h"
> > +
> > +#include <rte_log.h>
> > +#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1  
> 
> I'm sure you don't need all these headers.
> rte_log.h is included twice.
> rte_pmd_i40e is more than suspicious...
> 
> This is a hint that the whole file was probably written too fast :)
> 

This tool can help
  https://include-what-you-use.org/


* Re: [dpdk-dev] [PATCH v3 3/9] examples/vm_power: add oob monitoring functions
  2018-07-12 19:13           ` Thomas Monjalon
  2018-07-12 22:18             ` Stephen Hemminger
@ 2018-07-13  8:24             ` Hunt, David
  1 sibling, 0 replies; 46+ messages in thread
From: Hunt, David @ 2018-07-13  8:24 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Hi Thomas,


On 12/7/2018 8:13 PM, Thomas Monjalon wrote:
> 26/06/2018 11:23, David Hunt:
>> +#include <unistd.h>
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <stdint.h>
>> +#include <signal.h>
>> +#include <errno.h>
>> +#include <string.h>
>> +#include <sys/types.h>
>> +#include <sys/epoll.h>
>> +#include <sys/queue.h>
>> +#include <sys/time.h>
>> +#include <fcntl.h>
>> +
>> +#include <rte_log.h>
>> +#include <rte_memory.h>
>> +#include <rte_malloc.h>
>> +#include <rte_atomic.h>
>> +#include <rte_cycles.h>
>> +#include <rte_ethdev.h>
>> +#include <rte_pmd_i40e.h>
>> +
>> +#include <libvirt/libvirt.h>
>> +#include "oob_monitor.h"
>> +#include "power_manager.h"
>> +#include "channel_manager.h"
>> +
>> +#include <rte_log.h>
>> +#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
> I'm sure you don't need all these headers.
> rte_log.h is included twice.
> rte_pmd_i40e is more than suspicious...
>
> This is a hint that the whole file was probably written too fast :)

Apologies, it was a cut-and-paste from another file in that same 
directory. I can clean it up and re-spin.

Regards,
Dave.


* Re: [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling
  2018-07-12 19:09         ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling Thomas Monjalon
@ 2018-07-13  8:31           ` Hunt, David
  2018-07-13  8:33             ` Thomas Monjalon
  0 siblings, 1 reply; 46+ messages in thread
From: Hunt, David @ 2018-07-13  8:31 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, jerin.jacob, hemant.agrawal, arybchenko, ferruh.yigit,
	bruce.richardson

Hi Thomas,

On 12/7/2018 8:09 PM, Thomas Monjalon wrote:
> 26/06/2018 11:23, David Hunt:
>> This patch set adds the capability to do out-of-band power
>> monitoring on a system. It uses a thread to monitor the branch
>> counters in the targeted cores, and calculates the branch ratio
>> of the running code.
>>
>> If the branch ratio is low (<0.01), then
>> the code is most likely running in a tight poll loop and doing
>> nothing, i.e. receiving no packets. In this case we scale down
>> the frequency of that core.
>>
>> If the branch ratio is higher (>0.01), then it is likely that
>> the code is receiving and processing packets. In this case, we
>> scale up the frequency of that core.
>>
>> The cpu counters are read via /dev/cpu/x/msr, so requires the
>> msr kernel module to be loaded. Because this method is used,
>> the patch set is implemented with one file for x86 systems, and
>> another for non-x86 systems, with conditional compilation in
>> the Makefile. The non-x86 functions are stubs, and do not
>> currently implement any functionality.
>>
>> The vm_power_manager app has been modified to take a new parameter
>>     --core-list or -l
>> which takes a list of cores in a comma-separated list format,
>> e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
>> These cores will then be enabled for oob monitoring. When the
>> OOB monitoring thread starts, it reads the branch hits/miss
>> counters of each monitored core, and scales up/down accordingly.
> It looks to be a feature which could be integrated in DPDK libs.
> Why choose to implement it fully in an example?

I needed to set up a thread that looped tightly (~100 us interval) and
run it on its own core. From what I have seen in other cases, it is
usually the application that allocates cores and decides what to run on
them. I did think about putting some of it in a library, but for this
case I thought it made more sense to keep it purely as a sample app.

Regards,
Dave.


* Re: [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling
  2018-07-13  8:31           ` Hunt, David
@ 2018-07-13  8:33             ` Thomas Monjalon
  2018-07-13  8:43               ` Hunt, David
  0 siblings, 1 reply; 46+ messages in thread
From: Thomas Monjalon @ 2018-07-13  8:33 UTC (permalink / raw)
  To: Hunt, David
  Cc: dev, jerin.jacob, hemant.agrawal, arybchenko, ferruh.yigit,
	bruce.richardson

13/07/2018 10:31, Hunt, David:
> Hi Thomas,
> 
> On 12/7/2018 8:09 PM, Thomas Monjalon wrote:
> > 26/06/2018 11:23, David Hunt:
> >> This patch set adds the capability to do out-of-band power
> >> monitoring on a system. It uses a thread to monitor the branch
> >> counters in the targeted cores, and calculates the branch ratio
> >> of the running code.
> >>
> >> If the branch ratio is low (<0.01), then
> >> the code is most likely running in a tight poll loop and doing
> >> nothing, i.e. receiving no packets. In this case we scale down
> >> the frequency of that core.
> >>
> >> If the branch ratio is higher (>0.01), then it is likely that
> >> the code is receiving and processing packets. In this case, we
> >> scale up the frequency of that core.
> >>
> >> The cpu counters are read via /dev/cpu/x/msr, so requires the
> >> msr kernel module to be loaded. Because this method is used,
> >> the patch set is implemented with one file for x86 systems, and
> >> another for non-x86 systems, with conditional compilation in
> >> the Makefile. The non-x86 functions are stubs, and do not
> >> currently implement any functionality.
> >>
> >> The vm_power_manager app has been modified to take a new parameter
> >>     --core-list or -l
> >> which takes a list of cores in a comma-separated list format,
> >> e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
> >> These cores will then be enabled for oob monitoring. When the
> >> OOB monitoring thread starts, it reads the branch hits/miss
> >> counters of each monitored core, and scales up/down accordingly.
> > It looks to be a feature which could be integrated in DPDK libs.
> > Why choose to implement it fully in an example?
> 
> I needed to set up a thread that looped tightly (~100 us interval) and
> run it on its own core. From what I have seen in other cases, it is
> usually the application that allocates cores and decides what to run on
> them. I did think about putting some of it in a library, but for this
> case I thought it made more sense to keep it purely as a sample app.

I feel some code deserves to be in a library.
For instance, having different implementations per CPU is a good reason
to make a library.


* Re: [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling
  2018-07-13  8:33             ` Thomas Monjalon
@ 2018-07-13  8:43               ` Hunt, David
  2018-07-18 15:23                 ` Thomas Monjalon
  0 siblings, 1 reply; 46+ messages in thread
From: Hunt, David @ 2018-07-13  8:43 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, jerin.jacob, hemant.agrawal, arybchenko, ferruh.yigit,
	bruce.richardson


On 13/7/2018 9:33 AM, Thomas Monjalon wrote:
> 13/07/2018 10:31, Hunt, David:
>> Hi Thomas,
>>
>> On 12/7/2018 8:09 PM, Thomas Monjalon wrote:
>>> 26/06/2018 11:23, David Hunt:
>>>> This patch set adds the capability to do out-of-band power
>>>> monitoring on a system. It uses a thread to monitor the branch
>>>> counters in the targeted cores, and calculates the branch ratio
>>>> of the running code.
>>>>
>>>> If the branch ratio is low (<0.01), then
>>>> the code is most likely running in a tight poll loop and doing
>>>> nothing, i.e. receiving no packets. In this case we scale down
>>>> the frequency of that core.
>>>>
>>>> If the branch ratio is higher (>0.01), then it is likely that
>>>> the code is receiving and processing packets. In this case, we
>>>> scale up the frequency of that core.
>>>>
>>>> The cpu counters are read via /dev/cpu/x/msr, so requires the
>>>> msr kernel module to be loaded. Because this method is used,
>>>> the patch set is implemented with one file for x86 systems, and
>>>> another for non-x86 systems, with conditional compilation in
>>>> the Makefile. The non-x86 functions are stubs, and do not
>>>> currently implement any functionality.
>>>>
>>>> The vm_power_manager app has been modified to take a new parameter
>>>>      --core-list or -l
>>>> which takes a list of cores in a comma-separated list format,
>>>> e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
>>>> These cores will then be enabled for oob monitoring. When the
>>>> OOB monitoring thread starts, it reads the branch hits/miss
>>>> counters of each monitored core, and scales up/down accordingly.
>>> It looks to be a feature which could be integrated in DPDK libs.
>>> Why choose to implement it fully in an example?
>> I needed to set up a thread that looped tightly (~100 us interval) and
>> run it on its own core. From what I have seen in other cases, it is
>> usually the application that allocates cores and decides what to run on
>> them. I did think about putting some of it in a library, but for this
>> case I thought it made more sense to keep it purely as a sample app.
> I feel some code deserves to be in a library.
> For instance, having different implementations per CPU is a good reason
> to make a library.
>

Sure, I can look at moving some of the code into the library in a future
release. However, I believe it's OK as it is for the current merge window.

Regards,
Dave.


* [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling
  2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 1/9] examples/vm_power: add check for port count David Hunt
@ 2018-07-13 14:22           ` David Hunt
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 1/9] examples/vm_power: add check for port count David Hunt
                               ` (9 more replies)
  0 siblings, 10 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:22 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

This patch set adds the capability to do out-of-band power
monitoring on a system. It uses a thread to monitor the branch
counters in the targeted cores, and calculates the branch ratio
of the running code.

If the branch ratio is low (<0.01), then
the code is most likely running in a tight poll loop and doing
nothing, i.e. receiving no packets. In this case we scale down
the frequency of that core.

If the branch ratio is higher (>0.01), then it is likely that
the code is receiving and processing packets. In this case, we
scale up the frequency of that core.
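
As an illustration only (this is not code from the patch set), the
scaling decision described above can be sketched as follows. The sketch
assumes the branch ratio is the number of branch misses divided by the
number of branches seen in one sampling interval; the names
decide_scaling, branch_sample and BRANCH_RATIO_THRESHOLD are
hypothetical.

```c
#include <stdint.h>

/* Hypothetical names, chosen for this sketch only. */
enum scale_action { SCALE_DOWN, SCALE_UP };

/* Default threshold quoted in the cover letter. */
#define BRANCH_RATIO_THRESHOLD 0.01f

/* One snapshot of the two cumulative counters for a core. */
struct branch_sample {
	uint64_t branches;      /* branches seen so far */
	uint64_t branch_misses; /* mispredicted branches so far */
};

/*
 * Compare two successive snapshots and decide how to scale the core.
 * Assumption: ratio = misses / branches over the interval.
 */
enum scale_action
decide_scaling(const struct branch_sample *prev,
		const struct branch_sample *now, float threshold)
{
	uint64_t hits = now->branches - prev->branches;
	uint64_t misses = now->branch_misses - prev->branch_misses;
	float ratio;

	/* No branches in the interval: treat the core as idle. */
	if (hits == 0)
		return SCALE_DOWN;

	ratio = (float)misses / (float)hits;

	/* A low ratio suggests an empty poll loop, so scale down. */
	return ratio < threshold ? SCALE_DOWN : SCALE_UP;
}
```

Working on counter deltas rather than absolute values keeps the decision
tied to the most recent interval, so a core that starts receiving
packets is scaled up again on the next sample.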

The CPU counters are read via /dev/cpu/x/msr, so the msr kernel
module must be loaded. Because this method is used,
the patch set is implemented with one file for x86 systems, and
another for non-x86 systems, with conditional compilation in
the Makefile. The non-x86 functions are stubs, and do not
currently implement any functionality.

The vm_power_manager app has been modified to take a new parameter
   --core-list or -l
which takes a list of cores in a comma-separated list format,
e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
These cores will then be enabled for oob monitoring. When the
OOB monitoring thread starts, it reads the branch hits/miss
counters of each monitored core, and scales up/down accordingly.

The guest_cli app has also been modified to allow sending of a
policy of type BRANCH_RATIO where all of the cores included in
the policy will be monitored by the vm_power_manager oob thread.

v2 changes:
   * Add the guest_cli patch into this patch set, including the
     ability to set the policy to BRANCH_RATIO.
     http://patches.dpdk.org/patch/40742/
   * When vm_power_manager receives a policy with type BRANCH_RATIO,
     add the relevant cores to the monitoring thread.

v3 changes:
   * Added a command line parameter to allow changing of the
     default branch ratio threshold. You can now use -b 0.3 or
     --branch-ratio=0.3 to set the ratio for scaling up/down.

v4 changes:
   * Removed some un-needed header file includes
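
For readers unfamiliar with getopt_long, the -b / --branch-ratio option
mentioned in the v3 changes could be handled roughly as below. This is a
hedged sketch, not the code in patch 9/9: parse_branch_ratio is a
hypothetical helper, and rejecting non-positive values is an assumption
made here.

```c
#include <errno.h>
#include <getopt.h>
#include <stdlib.h>

/*
 * Hypothetical helper: scan argv for -b <val> or --branch-ratio=<val>.
 * Returns 0 on success (storing the value in *ratio if the option was
 * given) and -1 on an unknown option or a malformed value.
 */
int
parse_branch_ratio(int argc, char **argv, float *ratio)
{
	static const struct option lgopts[] = {
		{ "branch-ratio", required_argument, NULL, 'b' },
		{ NULL, 0, NULL, 0 }
	};
	int opt;

	optind = 1; /* restart scanning so the helper can be reused */
	opterr = 0; /* report errors via the return value only */

	while ((opt = getopt_long(argc, argv, "b:", lgopts, NULL)) != -1) {
		char *end;
		float val;

		if (opt != 'b')
			return -1;

		errno = 0;
		val = strtof(optarg, &end);
		/* Reject empty input, trailing junk and values <= 0. */
		if (errno != 0 || end == optarg || *end != '\0' ||
				val <= 0.0f)
			return -1;
		*ratio = val;
	}
	return 0;
}
```

Leaving *ratio untouched when the option is absent lets the caller
preload it with the default threshold before parsing.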

[1/9] examples/vm_power: add check for port count
[2/9] examples/vm_power: add core list parameter
[3/9] examples/vm_power: add oob monitoring functions
[4/9] examples/vm_power: allow greater than 64 cores
[5/9] examples/vm_power: add thread for oob core monitor
[6/9] examples/vm_power: add port-list to command line
[7/9] examples/vm_power: add branch ratio policy type
[8/9] examples/vm_power: add cli args to guest app
[9/9] examples/vm_power: make branch ratio configurable


* [dpdk-dev] [PATCH v4 1/9] examples/vm_power: add check for port count
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
@ 2018-07-13 14:22             ` David Hunt
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 2/9] examples/vm_power: add core list parameter David Hunt
                               ` (8 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:22 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

If we don't pass any ports to the app, we don't need to create
any mempools, and we don't need to init any ports.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/main.c | 81 +++++++++++++++++---------------
 1 file changed, 43 insertions(+), 38 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 8911f2659..0d3846971 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -278,51 +278,56 @@ main(int argc, char **argv)
 
 	nb_ports = rte_eth_dev_count_avail();
 
-	mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
-		MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+	if (nb_ports > 0) {
+		mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL",
+				NUM_MBUFS * nb_ports, MBUF_CACHE_SIZE, 0,
+				RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
 
-	if (mbuf_pool == NULL)
-		rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+		if (mbuf_pool == NULL)
+			rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
 
-	/* Initialize ports. */
-	RTE_ETH_FOREACH_DEV(portid) {
-		struct ether_addr eth;
-		int w, j;
-		int ret;
+		/* Initialize ports. */
+		RTE_ETH_FOREACH_DEV(portid) {
+			struct ether_addr eth;
+			int w, j;
+			int ret;
 
-		if ((enabled_port_mask & (1 << portid)) == 0)
-			continue;
+			if ((enabled_port_mask & (1 << portid)) == 0)
+				continue;
 
-		eth.addr_bytes[0] = 0xe0;
-		eth.addr_bytes[1] = 0xe0;
-		eth.addr_bytes[2] = 0xe0;
-		eth.addr_bytes[3] = 0xe0;
-		eth.addr_bytes[4] = portid + 0xf0;
+			eth.addr_bytes[0] = 0xe0;
+			eth.addr_bytes[1] = 0xe0;
+			eth.addr_bytes[2] = 0xe0;
+			eth.addr_bytes[3] = 0xe0;
+			eth.addr_bytes[4] = portid + 0xf0;
 
-		if (port_init(portid, mbuf_pool) != 0)
-			rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
+			if (port_init(portid, mbuf_pool) != 0)
+				rte_exit(EXIT_FAILURE,
+					"Cannot init port %"PRIu8 "\n",
 					portid);
 
-		for (w = 0; w < MAX_VFS; w++) {
-			eth.addr_bytes[5] = w + 0xf0;
-
-			ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
-						w, &eth);
-			if (ret == -ENOTSUP)
-				ret = rte_pmd_i40e_set_vf_mac_addr(portid,
-						w, &eth);
-			if (ret == -ENOTSUP)
-				ret = rte_pmd_bnxt_set_vf_mac_addr(portid,
-						w, &eth);
-
-			switch (ret) {
-			case 0:
-				printf("Port %d VF %d MAC: ",
-						portid, w);
-				for (j = 0; j < 6; j++) {
-					printf("%02x", eth.addr_bytes[j]);
-					if (j < 5)
-						printf(":");
+			for (w = 0; w < MAX_VFS; w++) {
+				eth.addr_bytes[5] = w + 0xf0;
+
+				ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
+							w, &eth);
+				if (ret == -ENOTSUP)
+					ret = rte_pmd_i40e_set_vf_mac_addr(
+							portid, w, &eth);
+				if (ret == -ENOTSUP)
+					ret = rte_pmd_bnxt_set_vf_mac_addr(
+							portid, w, &eth);
+
+				switch (ret) {
+				case 0:
+					printf("Port %d VF %d MAC: ",
+							portid, w);
+					for (j = 0; j < 5; j++) {
+						printf("%02x:",
+							eth.addr_bytes[j]);
+					}
+					printf("%02x\n", eth.addr_bytes[5]);
+					break;
 				}
 				printf("\n");
 				break;
-- 
2.17.1


* [dpdk-dev] [PATCH v4 2/9] examples/vm_power: add core list parameter
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 1/9] examples/vm_power: add check for port count David Hunt
@ 2018-07-13 14:22             ` David Hunt
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 3/9] examples/vm_power: add oob monitoring functions David Hunt
                               ` (7 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:22 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

Add the '-l' command line parameter (also --core-list),
so the user can now pass --core-list=4,6,8-10 and it will
expand out to 4,6,8,9,10 using the parse function provided
in parse.c (parse_set).
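
The expansion can be illustrated with a small standalone function. This
is a simplified sketch rather than the parse_set() added by this patch:
expand_core_list is a hypothetical name, it returns the number of cores
selected, and unlike parse_set() it rejects blanks and descending
ranges.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: mark set[i] = 1 for every core named in input,
 * e.g. "4,6,8-10" marks 4, 6, 8, 9 and 10. num is the size of set[].
 * Returns the number of distinct cores marked, or -1 on bad input.
 */
int
expand_core_list(const char *input, uint16_t set[], unsigned int num)
{
	const char *p = input;
	int count = 0;

	memset(set, 0, num * sizeof(set[0]));

	while (*p != '\0') {
		char *end;
		unsigned long lo, hi, i;

		if (!isdigit((unsigned char)*p))
			return -1;
		errno = 0;
		lo = strtoul(p, &end, 10);
		if (errno != 0)
			return -1;
		hi = lo;

		/* "a-b" form: read the upper bound of the range. */
		if (*end == '-') {
			p = end + 1;
			if (!isdigit((unsigned char)*p))
				return -1;
			hi = strtoul(p, &end, 10);
			if (errno != 0)
				return -1;
		}
		if (lo > hi || hi >= num)
			return -1;

		for (i = lo; i <= hi; i++) {
			if (!set[i])
				count++;
			set[i] = 1;
		}

		if (*end == ',') {
			p = end + 1;
			if (*p == '\0')
				return -1; /* trailing comma */
		} else if (*end == '\0') {
			p = end;
		} else {
			return -1;
		}
	}
	return count > 0 ? count : -1;
}
```

The array is indexed by core number, which matches how the '-l' handler
in main.c walks oob_enable[] to set oob_enabled per core.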

This list of cores is then used to enable out-of-band monitoring
to scale up and down these cores based on the ratio of branch
hits versus branch misses. The ratio will be low when a poll
loop is spinning with no packets being received, so the frequency
will be scaled down.

Also, as part of this change, we introduce a core_info struct
which keeps information on each core in the system, and whether
we're doing out-of-band monitoring on them.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/Makefile        |  2 +-
 examples/vm_power_manager/main.c          | 34 +++++++++-
 examples/vm_power_manager/parse.c         | 81 +++++++++++++++++++++++
 examples/vm_power_manager/parse.h         | 20 ++++++
 examples/vm_power_manager/power_manager.c | 31 +++++++++
 examples/vm_power_manager/power_manager.h | 20 ++++++
 6 files changed, 185 insertions(+), 3 deletions(-)
 create mode 100644 examples/vm_power_manager/parse.c
 create mode 100644 examples/vm_power_manager/parse.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index ef2a9f959..0c925967c 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -19,7 +19,7 @@ APP = vm_power_mgr
 
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
-SRCS-y += channel_monitor.c
+SRCS-y += channel_monitor.c parse.c
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 0d3846971..613a40af0 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "parse.h"
 #include <rte_pmd_ixgbe.h>
 #include <rte_pmd_i40e.h>
 #include <rte_pmd_bnxt.h>
@@ -133,18 +134,22 @@ parse_portmask(const char *portmask)
 static int
 parse_args(int argc, char **argv)
 {
-	int opt, ret;
+	int opt, ret, cnt, i;
 	char **argvopt;
+	uint16_t *oob_enable;
 	int option_index;
 	char *prgname = argv[0];
+	struct core_info *ci;
 	static struct option lgopts[] = {
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
+		{ "core-list", optional_argument, 0, 'l'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
+	ci = get_core_info();
 
-	while ((opt = getopt_long(argc, argvopt, "p:q:T:",
+	while ((opt = getopt_long(argc, argvopt, "l:p:q:T:",
 				  lgopts, &option_index)) != EOF) {
 
 		switch (opt) {
@@ -156,6 +161,27 @@ parse_args(int argc, char **argv)
 				return -1;
 			}
 			break;
+		case 'l':
+			oob_enable = malloc(ci->core_count * sizeof(uint16_t));
+			if (oob_enable == NULL) {
+				printf("Error - Unable to allocate memory\n");
+				return -1;
+			}
+			cnt = parse_set(optarg, oob_enable, ci->core_count);
+			if (cnt < 0) {
+				printf("Invalid core-list - [%s]\n", optarg);
+				free(oob_enable);
+				break;
+			}
+			for (i = 0; i < ci->core_count; i++) {
+				if (oob_enable[i]) {
+					printf("***Using core %d\n", i);
+					ci->cd[i].oob_enabled = 1;
+					ci->cd[i].global_enabled_cpus = 1;
+				}
+			}
+			free(oob_enable);
+			break;
 		/* long options */
 		case 0:
 			break;
@@ -261,6 +287,10 @@ main(int argc, char **argv)
 	uint16_t portid;
 
 
+	ret = core_info_init();
+	if (ret < 0)
+		rte_panic("Cannot allocate core info\n");
+
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
 		rte_panic("Cannot init EAL\n");
diff --git a/examples/vm_power_manager/parse.c b/examples/vm_power_manager/parse.c
new file mode 100644
index 000000000..8231533b6
--- /dev/null
+++ b/examples/vm_power_manager/parse.c
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation.
+ * Copyright(c) 2014 6WIND S.A.
+ */
+
+#include <string.h>
+#include <rte_log.h>
+#include "parse.h"
+
+/*
+ * Parse elem, the elem could be single number/range or group
+ * 1) A single number elem, it's just a simple digit. e.g. 9
+ * 2) A single range elem, two digits with a '-' between. e.g. 2-6
+ * 3) A group elem, combines multiple 1) or 2) e.g 0,2-4,6
+ *    Within group, '-' used for a range separator;
+ *                       ',' used for a single number.
+ */
+int
+parse_set(const char *input, uint16_t set[], unsigned int num)
+{
+	unsigned int idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned int min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only a digit qualifies as a start point */
+	if (!isdigit(*str) || *str == '\0')
+		return -1;
+
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = num;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-' and ',' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == num)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == '\0')) {
+			max = idx;
+
+			if (min == num)
+				min = idx;
+
+			for (idx = RTE_MIN(min, max);
+					idx <= RTE_MAX(min, max); idx++) {
+				set[idx] = 1;
+			}
+			min = num;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0');
+
+	return str - input;
+}
diff --git a/examples/vm_power_manager/parse.h b/examples/vm_power_manager/parse.h
new file mode 100644
index 000000000..a5971e9a2
--- /dev/null
+++ b/examples/vm_power_manager/parse.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef PARSE_H_
+#define PARSE_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+int
+parse_set(const char *, uint16_t [], unsigned int);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* PARSE_H_ */
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 35db25591..a7849e48a 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -12,6 +12,7 @@
 #include <dirent.h>
 #include <errno.h>
 
+#include <sys/sysinfo.h>
 #include <sys/types.h>
 
 #include <rte_log.h>
@@ -54,6 +55,7 @@ struct freq_info {
 
 static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
 
+struct core_info ci;
 static uint64_t global_enabled_cpus;
 
 #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
@@ -76,6 +78,35 @@ set_host_cpus_mask(void)
 	return num_cpus;
 }
 
+struct core_info *
+get_core_info(void)
+{
+	return &ci;
+}
+
+int
+core_info_init(void)
+{
+	struct core_info *ci;
+	int i;
+
+	ci = get_core_info();
+
+	ci->core_count = get_nprocs_conf();
+	ci->cd = malloc(ci->core_count * sizeof(struct core_details));
+	if (!ci->cd) {
+		RTE_LOG(ERR, POWER_MANAGER, "Failed to allocate memory for core info.\n");
+		return -1;
+	}
+	for (i = 0; i < ci->core_count; i++) {
+		ci->cd[i].global_enabled_cpus = 1;
+		ci->cd[i].oob_enabled = 0;
+		ci->cd[i].msr_fd = 0;
+	}
+	printf("%d cores in system\n", ci->core_count);
+	return 0;
+}
+
 int
 power_manager_init(void)
 {
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index 8a8a84aa4..45385de37 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -8,6 +8,26 @@
 #ifdef __cplusplus
 extern "C" {
 #endif
+struct core_details {
+	uint64_t last_branches;
+	uint64_t last_branch_misses;
+	uint16_t global_enabled_cpus;
+	uint16_t oob_enabled;
+	int msr_fd;
+};
+
+struct core_info {
+	uint16_t core_count;
+	struct core_details *cd;
+};
+
+struct core_info *
+get_core_info(void);
+
+int
+core_info_init(void);
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
 
 /* Maximum number of CPUS to manage */
 #define POWER_MGR_MAX_CPUS 64
-- 
2.17.1


* [dpdk-dev] [PATCH v4 3/9] examples/vm_power: add oob monitoring functions
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 1/9] examples/vm_power: add check for port count David Hunt
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 2/9] examples/vm_power: add core list parameter David Hunt
@ 2018-07-13 14:22             ` David Hunt
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 4/9] examples/vm_power: allow greater than 64 cores David Hunt
                               ` (6 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:22 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

This patch introduces the out-of-band (oob) core monitoring
functions.

The functions are similar to the channel manager functions.
There are functions to add and remove cores from the
list of cores being monitored. There is a function to initialise
the monitor setup, run the monitor thread, and exit the monitor.

The monitor thread runs in its own lcore, and is separate
functionality to the channel monitor, which is epoll based.
This thread is timer based. It loops through all monitored cores,
calculates the branch ratio, scales up or down the core, then
sleeps for an interval (~250 us).
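
The shape of such a timer-based loop can be sketched as below. This is
an illustration under stated assumptions, not the run_branch_monitor()
implementation: monitored_core is a reduced stand-in for core_details,
and monitor_loop, record_visit, visit_count and the max_iterations
argument are hypothetical additions that make the sketch easy to
exercise.

```c
#include <stdint.h>
#include <unistd.h>

#define MONITOR_INTERVAL_US 250 /* interval quoted in the commit message */
#define SKETCH_MAX_CORES 64     /* arbitrary bound for this sketch */

/* Reduced, hypothetical stand-in for struct core_details. */
struct monitored_core {
	uint16_t oob_enabled;
};

typedef void (*policy_fn)(unsigned int core);

/* Example policy used only for illustration: count visits per core. */
unsigned int visit_count[SKETCH_MAX_CORES];

void
record_visit(unsigned int core)
{
	visit_count[core]++;
}

/*
 * Visit every enabled core, apply the policy, then sleep.
 * Stops when *run_flag is cleared or after max_iterations passes
 * (0 means no limit). Returns the number of passes completed.
 */
unsigned long
monitor_loop(const struct monitored_core *cores, unsigned int core_count,
		policy_fn apply, volatile int *run_flag,
		unsigned long max_iterations)
{
	unsigned long iterations = 0;

	while (*run_flag) {
		unsigned int i;

		for (i = 0; i < core_count; i++) {
			if (cores[i].oob_enabled)
				apply(i);
		}
		iterations++;
		if (max_iterations != 0 && iterations >= max_iterations)
			break;
		usleep(MONITOR_INTERVAL_US);
	}
	return iterations;
}
```

In the real application the policy callback would read the branch
counters and request a frequency change; the run flag gives the exit
function a way to stop the loop.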

The method it uses to read the branch counters is a pread on the
/dev/cpu/x/msr file, so the 'msr' kernel module needs to be loaded.
Also, since the msr.h file has been made unavailable in recent
kernels, we have #defines for the relevant MSRs included in the
code.
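
A minimal sketch of reading one MSR through the msr driver follows. It
relies on the documented behaviour of /dev/cpu/N/msr, where the file
offset selects the register and each read returns 8 bytes. The helper
names msr_path and read_msr are hypothetical, and using counter
register 0xC1 (IA32_PMC0) is an assumption about which counter is read;
what that counter counts depends on how it has been programmed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for pread() under strict -std= modes */
#endif
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* General-purpose performance counter 0 (assumed for this sketch). */
#define MSR_IA32_PMC0 0xC1

/* Build "/dev/cpu/<core>/msr" in buf; returns 0, or -1 if truncated. */
int
msr_path(char *buf, size_t len, unsigned int core)
{
	int n = snprintf(buf, len, "/dev/cpu/%u/msr", core);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read one 64-bit MSR from the given core.
 * Returns 0 on success, -1 if the device cannot be opened (module not
 * loaded, insufficient privileges) or the read is short.
 */
int
read_msr(unsigned int core, uint32_t reg, uint64_t *value)
{
	char path[64];
	ssize_t n;
	int fd;

	if (msr_path(path, sizeof(path), core) < 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* The file offset is the MSR number; the value is 8 bytes. */
	n = pread(fd, value, sizeof(*value), (off_t)reg);
	close(fd);

	return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

A monitor that samples every few hundred microseconds would more likely
keep the file descriptor open per core (compare the msr_fd field in
struct core_details) instead of reopening the device on each read.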

The makefile has a switch for x86 and non-x86 platforms,
and compiles stub functions for non-x86 platforms.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/Makefile          |   5 +
 examples/vm_power_manager/oob_monitor.h     |  68 +++++
 examples/vm_power_manager/oob_monitor_nop.c |  38 +++
 examples/vm_power_manager/oob_monitor_x86.c | 259 ++++++++++++++++++++
 4 files changed, 370 insertions(+)
 create mode 100644 examples/vm_power_manager/oob_monitor.h
 create mode 100644 examples/vm_power_manager/oob_monitor_nop.c
 create mode 100644 examples/vm_power_manager/oob_monitor_x86.c

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index 0c925967c..13a5205ba 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -20,6 +20,11 @@ APP = vm_power_mgr
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
 SRCS-y += channel_monitor.c parse.c
+ifeq ($(CONFIG_RTE_ARCH_X86_64),y)
+SRCS-y += oob_monitor_x86.c
+else
+SRCS-y += oob_monitor_nop.c
+endif
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/oob_monitor.h b/examples/vm_power_manager/oob_monitor.h
new file mode 100644
index 000000000..b96e08df7
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef OOB_MONITOR_H_
+#define OOB_MONITOR_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Branch Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int branch_monitor_init(void);
+
+/**
+ * Run the OOB branch monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_branch_monitor(void);
+
+/**
+ * Exit the OOB Branch Monitor.
+ *
+ * @return
+ *  None
+ */
+void branch_monitor_exit(void);
+
+/**
+ * Add a core to the list of cores to monitor.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_core_to_monitor(int core);
+
+/**
+ * Remove a previously added core from core list.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_core_from_monitor(int core);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* OOB_MONITOR_H_ */
diff --git a/examples/vm_power_manager/oob_monitor_nop.c b/examples/vm_power_manager/oob_monitor_nop.c
new file mode 100644
index 000000000..7e7b8bc14
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_nop.c
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#include "oob_monitor.h"
+
+void branch_monitor_exit(void)
+{
+}
+
+__attribute__((unused)) static float
+apply_policy(__attribute__((unused)) int core)
+{
+	return 0.0;
+}
+
+int
+add_core_to_monitor(__attribute__((unused)) int core)
+{
+	return 0;
+}
+
+int
+remove_core_from_monitor(__attribute__((unused)) int core)
+{
+	return 0;
+}
+
+int
+branch_monitor_init(void)
+{
+	return 0;
+}
+
+void
+run_branch_monitor(void)
+{
+}
diff --git a/examples/vm_power_manager/oob_monitor_x86.c b/examples/vm_power_manager/oob_monitor_x86.c
new file mode 100644
index 000000000..62d503ca5
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_x86.c
@@ -0,0 +1,259 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <fcntl.h>
+#include <rte_log.h>
+
+#include "oob_monitor.h"
+#include "power_manager.h"
+#include "channel_manager.h"
+
+static volatile unsigned run_loop = 1;
+static uint64_t g_branches, g_branch_misses;
+static int g_active;
+
+void branch_monitor_exit(void)
+{
+	run_loop = 0;
+}
+
+/* Number of microseconds between each poll */
+#define INTERVAL 100
+#define PRINT_LOOP_COUNT (1000000/INTERVAL)
+#define RATIO_THRESHOLD 0.03
+#define IA32_PERFEVTSEL0 0x186
+#define IA32_PERFEVTSEL1 0x187
+#define IA32_PERFCTR0 0xc1
+#define IA32_PERFCTR1 0xc2
+#define IA32_PERFEVT_BRANCH_HITS 0x05300c4
+#define IA32_PERFEVT_BRANCH_MISS 0x05300c5
+
+static float
+apply_policy(int core)
+{
+	struct core_info *ci;
+	uint64_t counter;
+	uint64_t branches, branch_misses;
+	uint32_t last_branches, last_branch_misses;
+	int hits_diff, miss_diff;
+	float ratio;
+	int ret;
+
+	g_active = 0;
+	ci = get_core_info();
+
+	last_branches = ci->cd[core].last_branches;
+	last_branch_misses = ci->cd[core].last_branch_misses;
+
+	ret = pread(ci->cd[core].msr_fd, &counter,
+			sizeof(counter), IA32_PERFCTR0);
+	if (ret < 0)
+		RTE_LOG(ERR, POWER_MANAGER,
+				"unable to read counter for core %u\n",
+				core);
+	branches = counter;
+
+	ret = pread(ci->cd[core].msr_fd, &counter,
+			sizeof(counter), IA32_PERFCTR1);
+	if (ret < 0)
+		RTE_LOG(ERR, POWER_MANAGER,
+				"unable to read counter for core %u\n",
+				core);
+	branch_misses = counter;
+
+
+	ci->cd[core].last_branches = branches;
+	ci->cd[core].last_branch_misses = branch_misses;
+
+	hits_diff = (int)branches - (int)last_branches;
+	if (hits_diff <= 0) {
+		/* Likely a counter overflow condition, skip this round */
+		return -1.0;
+	}
+
+	miss_diff = (int)branch_misses - (int)last_branch_misses;
+	if (miss_diff <= 0) {
+		/* Likely a counter overflow condition, skip this round */
+		return -1.0;
+	}
+
+	g_branches = hits_diff;
+	g_branch_misses = miss_diff;
+
+	if (hits_diff < (INTERVAL*100)) {
+		/* Likely no workload running on this core. Skip. */
+		return -1.0;
+	}
+
+	ratio = (float)miss_diff * (float)100 / (float)hits_diff;
+
+	if (ratio < RATIO_THRESHOLD)
+		power_manager_scale_core_min(core);
+	else
+		power_manager_scale_core_max(core);
+
+	g_active = 1;
+	return ratio;
+}
+
+int
+add_core_to_monitor(int core)
+{
+	struct core_info *ci;
+	char proc_file[UNIX_PATH_MAX];
+	int ret;
+
+	ci = get_core_info();
+
+	if (core < ci->core_count) {
+		long setup;
+
+		snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core);
+		ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		/*
+		 * Set up branch counters
+		 */
+		setup = IA32_PERFEVT_BRANCH_HITS;
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL0);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		setup = IA32_PERFEVT_BRANCH_MISS;
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL1);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		/*
+		 * Close the file and re-open as read only so
+		 * as not to hog the resource
+		 */
+		close(ci->cd[core].msr_fd);
+		ci->cd[core].msr_fd = open(proc_file, O_RDONLY);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		ci->cd[core].oob_enabled = 1;
+	}
+	return 0;
+}
+
+int
+remove_core_from_monitor(int core)
+{
+	struct core_info *ci;
+	char proc_file[UNIX_PATH_MAX];
+	int ret;
+
+	ci = get_core_info();
+
+	if (ci->cd[core].oob_enabled) {
+		long setup;
+
+		/*
+		 * close the msr file, then reopen rw so we can
+		 * disable the counters
+		 */
+		if (ci->cd[core].msr_fd != 0)
+			close(ci->cd[core].msr_fd);
+		snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core);
+		ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC);
+		if (ci->cd[core].msr_fd < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"Error opening MSR file for core %d "
+					"(is msr kernel module loaded?)\n",
+					core);
+			return -1;
+		}
+		setup = 0x0; /* clear event */
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL0);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+		setup = 0x0; /* clear event */
+		ret = pwrite(ci->cd[core].msr_fd, &setup,
+				sizeof(setup), IA32_PERFEVTSEL1);
+		if (ret < 0) {
+			RTE_LOG(ERR, POWER_MANAGER,
+					"unable to set counter for core %u\n",
+					core);
+			return ret;
+		}
+
+		close(ci->cd[core].msr_fd);
+		ci->cd[core].msr_fd = 0;
+		ci->cd[core].oob_enabled = 0;
+	}
+	return 0;
+}
+
+int
+branch_monitor_init(void)
+{
+	return 0;
+}
+
+void
+run_branch_monitor(void)
+{
+	struct core_info *ci;
+	int print = 0;
+	float ratio;
+	int printed;
+	int reads = 0;
+
+	ci = get_core_info();
+
+	while (run_loop) {
+
+		if (!run_loop)
+			break;
+		usleep(INTERVAL);
+		int j;
+		print++;
+		printed = 0;
+		for (j = 0; j < ci->core_count; j++) {
+			if (ci->cd[j].oob_enabled) {
+				ratio = apply_policy(j);
+				if ((print > PRINT_LOOP_COUNT) && (g_active)) {
+					printf("  %d: %.4f {%lu} {%d}", j,
+							ratio, g_branches,
+							reads);
+					printed = 1;
+					reads = 0;
+				} else {
+					reads++;
+				}
+			}
+		}
+		if (print > PRINT_LOOP_COUNT) {
+			if (printed)
+				printf("\n");
+			print = 0;
+		}
+	}
+}
-- 
2.17.1

* [dpdk-dev] [PATCH v4 4/9] examples/vm_power: allow greater than 64 cores
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
                               ` (2 preceding siblings ...)
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 3/9] examples/vm_power: add oob monitoring functions David Hunt
@ 2018-07-13 14:22             ` David Hunt
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 5/9] examples/vm_power: add thread for oob core monitor David Hunt
                               ` (5 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:22 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

To facilitate more info per core, change the global_enabled_cpus
mask from a uint64_t to an array. This also removes the limit of
64 cores, allocating the array at run time based on the number of
cores found in the system.
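
A minimal sketch of the idea, assuming simplified stand-in types: the
names core_state, core_table and the three helpers below are
hypothetical and are not the structures used by vm_power_manager. The
point is only that a run-time sized array with a bounds check can
replace a test such as "mask & (1ULL << core)", which cannot address
core 64 or above.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the app's per-core state. */
struct core_state {
	int enabled;
};

struct core_table {
	unsigned int count;
	struct core_state *cores;
};

/*
 * Allocate one zeroed entry per core. 'count' would normally come from
 * the system (for example sysconf(_SC_NPROCESSORS_CONF)); it is a
 * parameter here so the sketch is easy to exercise.
 * Returns 0 on success, -1 on failure.
 */
int
core_table_init(struct core_table *t, unsigned int count)
{
	if (count == 0)
		return -1;
	t->cores = calloc(count, sizeof(*t->cores));
	if (t->cores == NULL)
		return -1;
	t->count = count;
	return 0;
}

/* Bounds-checked replacement for "mask & (1ULL << core)". */
int
core_is_enabled(const struct core_table *t, unsigned int core)
{
	return core < t->count && t->cores[core].enabled;
}

void
core_table_free(struct core_table *t)
{
	free(t->cores);
	t->cores = NULL;
	t->count = 0;
}
```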

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/power_manager.c | 115 +++++++++++-----------
 1 file changed, 58 insertions(+), 57 deletions(-)

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index a7849e48a..4bdde23da 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -19,14 +19,14 @@
 #include <rte_power.h>
 #include <rte_spinlock.h>
 
+#include "channel_manager.h"
 #include "power_manager.h"
-
-#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+#include "oob_monitor.h"
 
 #define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \
-	if (core_num >= POWER_MGR_MAX_CPUS) \
+	if (core_num >= ci.core_count) \
 		return -1; \
-	if (!(global_enabled_cpus & (1ULL << core_num))) \
+	if (!(ci.cd[core_num].global_enabled_cpus)) \
 		return -1; \
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
 	ret = rte_power_freq_##DIRECTION(core_num); \
@@ -37,7 +37,7 @@
 	int i; \
 	for (i = 0; core_mask; core_mask &= ~(1 << i++)) { \
 		if ((core_mask >> i) & 1) { \
-			if (!(global_enabled_cpus & (1ULL << i))) \
+			if (!(ci.cd[i].global_enabled_cpus)) \
 				continue; \
 			rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
 			if (rte_power_freq_##DIRECTION(i) != 1) \
@@ -56,28 +56,9 @@ struct freq_info {
 static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
 
 struct core_info ci;
-static uint64_t global_enabled_cpus;
 
 #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
 
-static unsigned
-set_host_cpus_mask(void)
-{
-	char path[PATH_MAX];
-	unsigned i;
-	unsigned num_cpus = 0;
-
-	for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
-		snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
-		if (access(path, F_OK) == 0) {
-			global_enabled_cpus |= 1ULL << i;
-			num_cpus++;
-		} else
-			return num_cpus;
-	}
-	return num_cpus;
-}
-
 struct core_info *
 get_core_info(void)
 {
@@ -110,38 +91,45 @@ core_info_init(void)
 int
 power_manager_init(void)
 {
-	unsigned int i, num_cpus, num_freqs;
-	uint64_t cpu_mask;
+	unsigned int i, num_cpus = 0, num_freqs = 0;
 	int ret = 0;
+	struct core_info *ci;
+
+	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
 
-	num_cpus = set_host_cpus_mask();
-	if (num_cpus == 0) {
-		RTE_LOG(ERR, POWER_MANAGER, "Unable to detected host CPUs, please "
-			"ensure that sufficient privileges exist to inspect sysfs\n");
+	ci = get_core_info();
+	if (!ci) {
+		RTE_LOG(ERR, POWER_MANAGER,
+				"Failed to get core info!\n");
 		return -1;
 	}
-	rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
-	cpu_mask = global_enabled_cpus;
-	for (i = 0; cpu_mask; cpu_mask &= ~(1 << i++)) {
-		if (rte_power_init(i) < 0)
-			RTE_LOG(ERR, POWER_MANAGER,
-					"Unable to initialize power manager "
-					"for core %u\n", i);
-		num_freqs = rte_power_freqs(i, global_core_freq_info[i].freqs,
+
+	for (i = 0; i < ci->core_count; i++) {
+		if (ci->cd[i].global_enabled_cpus) {
+			if (rte_power_init(i) < 0)
+				RTE_LOG(ERR, POWER_MANAGER,
+						"Unable to initialize power manager "
+						"for core %u\n", i);
+			num_cpus++;
+			num_freqs = rte_power_freqs(i,
+					global_core_freq_info[i].freqs,
 					RTE_MAX_LCORE_FREQS);
-		if (num_freqs == 0) {
-			RTE_LOG(ERR, POWER_MANAGER,
-				"Unable to get frequency list for core %u\n",
-				i);
-			global_enabled_cpus &= ~(1 << i);
-			num_cpus--;
-			ret = -1;
+			if (num_freqs == 0) {
+				RTE_LOG(ERR, POWER_MANAGER,
+					"Unable to get frequency list for core %u\n",
+					i);
+				ci->cd[i].oob_enabled = 0;
+				ret = -1;
+			}
+			global_core_freq_info[i].num_freqs = num_freqs;
+
+			rte_spinlock_init(&global_core_freq_info[i].power_sl);
 		}
-		global_core_freq_info[i].num_freqs = num_freqs;
-		rte_spinlock_init(&global_core_freq_info[i].power_sl);
+		if (ci->cd[i].oob_enabled)
+			add_core_to_monitor(i);
 	}
-	RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs , enabled core mask:"
-					" 0x%"PRIx64"\n", num_cpus, global_enabled_cpus);
+	RTE_LOG(INFO, POWER_MANAGER, "Managing %u cores out of %u available host cores\n",
+			num_cpus, ci->core_count);
 	return ret;
 
 }
@@ -156,7 +144,7 @@ power_manager_get_current_frequency(unsigned core_num)
 				core_num, POWER_MGR_MAX_CPUS-1);
 		return -1;
 	}
-	if (!(global_enabled_cpus & (1ULL << core_num)))
+	if (!(ci.cd[core_num].global_enabled_cpus))
 		return 0;
 
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
@@ -175,15 +163,26 @@ power_manager_exit(void)
 {
 	unsigned int i;
 	int ret = 0;
+	struct core_info *ci;
 
-	for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1 << i++)) {
-		if (rte_power_exit(i) < 0) {
-			RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
-					"for core %u\n", i);
-			ret = -1;
+	ci = get_core_info();
+	if (!ci) {
+		RTE_LOG(ERR, POWER_MANAGER,
+				"Failed to get core info!\n");
+		return -1;
+	}
+
+	for (i = 0; i < ci->core_count; i++) {
+		if (ci->cd[i].global_enabled_cpus) {
+			if (rte_power_exit(i) < 0) {
+				RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager "
+						"for core %u\n", i);
+				ret = -1;
+			}
+			ci->cd[i].global_enabled_cpus = 0;
 		}
+		remove_core_from_monitor(i);
 	}
-	global_enabled_cpus = 0;
 	return ret;
 }
 
@@ -299,10 +298,12 @@ int
 power_manager_scale_core_med(unsigned int core_num)
 {
 	int ret = 0;
+	struct core_info *ci;
 
+	ci = get_core_info();
 	if (core_num >= POWER_MGR_MAX_CPUS)
 		return -1;
-	if (!(global_enabled_cpus & (1ULL << core_num)))
+	if (!(ci->cd[core_num].global_enabled_cpus))
 		return -1;
 	rte_spinlock_lock(&global_core_freq_info[core_num].power_sl);
 	ret = rte_power_set_freq(core_num,
-- 
2.17.1

* [dpdk-dev] [PATCH v4 5/9] examples/vm_power: add thread for oob core monitor
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
                               ` (3 preceding siblings ...)
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 4/9] examples/vm_power: allow greater than 64 cores David Hunt
@ 2018-07-13 14:22             ` David Hunt
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 6/9] examples/vm_power: add port-list to command line David Hunt
                               ` (4 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:22 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

Change the app to require three cores, as the third core
is used to run the oob monitoring thread.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/main.c | 37 +++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 613a40af0..aef97b9ae 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "oob_monitor.h"
 #include "parse.h"
 #include <rte_pmd_ixgbe.h>
 #include <rte_pmd_i40e.h>
@@ -267,6 +268,17 @@ run_monitor(__attribute__((unused)) void *arg)
 	return 0;
 }
 
+static int
+run_core_monitor(__attribute__((unused)) void *arg)
+{
+	if (branch_monitor_init() < 0) {
+		printf("Unable to initialize core monitor\n");
+		return -1;
+	}
+	run_branch_monitor();
+	return 0;
+}
+
 static void
 sig_handler(int signo)
 {
@@ -285,12 +297,15 @@ main(int argc, char **argv)
 	unsigned int nb_ports;
 	struct rte_mempool *mbuf_pool;
 	uint16_t portid;
+	struct core_info *ci;
 
 
 	ret = core_info_init();
 	if (ret < 0)
 		rte_panic("Cannot allocate core info\n");
 
+	ci = get_core_info();
+
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
 		rte_panic("Cannot init EAL\n");
@@ -365,16 +380,23 @@ main(int argc, char **argv)
 		}
 	}
 
+	check_all_ports_link_status(enabled_port_mask);
+
 	lcore_id = rte_get_next_lcore(-1, 1, 0);
 	if (lcore_id == RTE_MAX_LCORE) {
-		RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
+		RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
 				"application\n");
 		return 0;
 	}
-
-	check_all_ports_link_status(enabled_port_mask);
+	printf("Running channel monitor on lcore id %d\n", lcore_id);
 	rte_eal_remote_launch(run_monitor, NULL, lcore_id);
 
+	lcore_id = rte_get_next_lcore(lcore_id, 1, 0);
+	if (lcore_id == RTE_MAX_LCORE) {
+		RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
+				"application\n");
+		return 0;
+	}
 	if (power_manager_init() < 0) {
 		printf("Unable to initialize power manager\n");
 		return -1;
@@ -383,8 +405,17 @@ main(int argc, char **argv)
 		printf("Unable to initialize channel manager\n");
 		return -1;
 	}
+
+	printf("Running core monitor on lcore id %d\n", lcore_id);
+	rte_eal_remote_launch(run_core_monitor, NULL, lcore_id);
+
 	run_cli(NULL);
 
+	branch_monitor_exit();
+
 	rte_eal_mp_wait_lcore();
+
+	free(ci->cd);
+
 	return 0;
 }
-- 
2.17.1

* [dpdk-dev] [PATCH v4 6/9] examples/vm_power: add port-list to command line
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
                               ` (4 preceding siblings ...)
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 5/9] examples/vm_power: add thread for oob core monitor David Hunt
@ 2018-07-13 14:22             ` David Hunt
  2018-07-13 14:23             ` [dpdk-dev] [PATCH v4 7/9] examples/vm_power: add branch ratio policy type David Hunt
                               ` (3 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:22 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

Add the long form of -p, which is --port-list.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index aef97b9ae..f9990f153 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -145,6 +145,7 @@ parse_args(int argc, char **argv)
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
 		{ "core-list", optional_argument, 0, 'l'},
+		{ "port-list", optional_argument, 0, 'p'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
-- 
2.17.1

* [dpdk-dev] [PATCH v4 7/9] examples/vm_power: add branch ratio policy type
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
                               ` (5 preceding siblings ...)
  2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 6/9] examples/vm_power: add port-list to command line David Hunt
@ 2018-07-13 14:23             ` David Hunt
  2018-07-13 14:23             ` [dpdk-dev] [PATCH v4 8/9] examples/vm_power: add cli args to guest app David Hunt
                               ` (2 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

Add the capability for the vm_power_manager to receive
a policy of type BRANCH_RATIO. This will add any vcpus
in the policy to the oob monitoring thread.
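
The mask walk used to find the physical cores behind each vcpu can be
sketched as below. This is an illustration only: for_each_pcpu and
mark_monitored are hypothetical names, and in the patch the role of the
callback is played by add_core_to_monitor() when the policy type is
BRANCH_RATIO.

```c
#include <stdint.h>

/*
 * Hypothetical helper: call 'fn' once for every bit set in 'mask',
 * lowest bit first, and return how many calls succeeded (returned 0).
 * Each visited bit position is cleared, so the loop stops as soon as
 * no set bits remain.
 */
unsigned int
for_each_pcpu(uint64_t mask, int (*fn)(int pcpu, void *arg), void *arg)
{
	unsigned int ok = 0;
	int pcpu;

	for (pcpu = 0; mask != 0; mask &= ~(1ULL << pcpu), pcpu++) {
		if (((mask >> pcpu) & 1) && fn(pcpu, arg) == 0)
			ok++;
	}
	return ok;
}

/* Example callback: mark the pcpu as monitored in an int[64] table. */
int
mark_monitored(int pcpu, void *arg)
{
	int *table = arg;

	table[pcpu] = 1;
	return 0;
}
```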

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/channel_monitor.c | 23 +++++++++++++++++++--
 lib/librte_power/channel_commands.h         |  3 ++-
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index 73bddd993..7fa47ba97 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -27,6 +27,7 @@
 #include "channel_commands.h"
 #include "channel_manager.h"
 #include "power_manager.h"
+#include "oob_monitor.h"
 
 #define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
 
@@ -92,6 +93,10 @@ get_pcpu_to_control(struct policy *pol)
 	struct vm_info info;
 	int pcpu, count;
 	uint64_t mask_u64b;
+	struct core_info *ci;
+	int ret;
+
+	ci = get_core_info();
 
 	RTE_LOG(INFO, CHANNEL_MONITOR, "Looking for pcpu for %s\n",
 			pol->pkt.vm_name);
@@ -100,8 +105,22 @@ get_pcpu_to_control(struct policy *pol)
 	for (count = 0; count < pol->pkt.num_vcpu; count++) {
 		mask_u64b = info.pcpu_mask[pol->pkt.vcpu_to_control[count]];
 		for (pcpu = 0; mask_u64b; mask_u64b &= ~(1ULL << pcpu++)) {
-			if ((mask_u64b >> pcpu) & 1)
-				pol->core_share[count].pcpu = pcpu;
+			if ((mask_u64b >> pcpu) & 1) {
+				if (pol->pkt.policy_to_use == BRANCH_RATIO) {
+					ci->cd[pcpu].oob_enabled = 1;
+					ret = add_core_to_monitor(pcpu);
+					if (ret == 0)
+						printf("Monitoring pcpu %d via Branch Ratio\n",
+								pcpu);
+					else
+						printf("Failed to start OOB Monitoring pcpu %d\n",
+								pcpu);
+
+				} else {
+					pol->core_share[count].pcpu = pcpu;
+					printf("Monitoring pcpu %d\n", pcpu);
+				}
+			}
 		}
 	}
 }
diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
index 5e8b4ab5d..ee638eefa 100644
--- a/lib/librte_power/channel_commands.h
+++ b/lib/librte_power/channel_commands.h
@@ -48,7 +48,8 @@ enum workload {HIGH, MEDIUM, LOW};
 enum policy_to_use {
 	TRAFFIC,
 	TIME,
-	WORKLOAD
+	WORKLOAD,
+	BRANCH_RATIO
 };
 
 struct traffic {
-- 
2.17.1

* [dpdk-dev] [PATCH v4 8/9] examples/vm_power: add cli args to guest app
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
                               ` (6 preceding siblings ...)
  2018-07-13 14:23             ` [dpdk-dev] [PATCH v4 7/9] examples/vm_power: add branch ratio policy type David Hunt
@ 2018-07-13 14:23             ` David Hunt
  2018-07-13 14:23             ` [dpdk-dev] [PATCH v4 9/9] examples/vm_power: make branch ratio configurable David Hunt
  2018-07-20 22:06             ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling Thomas Monjalon
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

Add new command line arguments to the guest app to make
testing and validation of the policy usage easier.
These arguments are mainly around setting up the power
management policy that is sent from the guest vm to
the vm_power_manager in the host.

New command line parameters:
-n or --vm-name
   sets the name of the vm to be used by the host OS.
-b or --busy-hours
   sets the list of hours that are predicted to be busy
-q or --quiet-hours
   sets the list of hours that are predicted to be quiet
-l or --vcpu-list
   sets the list of vcpus to monitor
-p or --port-list
   sets the list of ports to monitor when using a
   workload policy.
-o or --policy
   sets the default policy type
      TIME
      WORKLOAD
      TRAFFIC
      BRANCH_RATIO

The format of the hours or list parameters is a comma-separated
list of integers, which can take the form of
   a. x    e.g. --vcpu-list=1
   b. x,y  e.g. --quiet-hours=3,4
   c. x-y  e.g. --busy-hours=9-12
   d. combination of above (e.g. --busy-hours=4,5-7,9)
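
The list format above can be parsed along the following lines. This is
a standalone sketch, not the parse_set() added by this patch: the name
parse_list is hypothetical, and it is deliberately stricter (no blanks
allowed, reversed ranges such as 7-5 rejected) and returns the number of
values set rather than the number of characters consumed.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical standalone parser for lists such as "4,5-7,9".
 * Sets set[i] = 1 for every listed value i, where i < num.
 * Returns the number of distinct values set, or -1 on malformed input.
 */
int
parse_list(const char *input, uint16_t set[], unsigned int num)
{
	const char *p = input;
	unsigned int count = 0;

	memset(set, 0, num * sizeof(set[0]));

	for (;;) {
		unsigned long lo, hi, i;
		char *end;

		/* Every element starts with a number. */
		if (!isdigit((unsigned char)*p))
			return -1;
		errno = 0;
		lo = strtoul(p, &end, 10);
		if (errno != 0 || lo >= num)
			return -1;
		hi = lo;
		p = end;

		/* Optional "-y" turns the element into a range x-y. */
		if (*p == '-') {
			p++;
			if (!isdigit((unsigned char)*p))
				return -1;
			errno = 0;
			hi = strtoul(p, &end, 10);
			if (errno != 0 || hi >= num || hi < lo)
				return -1;
			p = end;
		}

		for (i = lo; i <= hi; i++) {
			if (!set[i])
				count++;
			set[i] = 1;
		}

		if (*p == '\0')
			return (int)count;
		if (*p != ',')
			return -1;
		p++;
	}
}
```

With a 24-entry array, the input "4,5-7,9" marks hours 4, 5, 6, 7 and 9.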

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/guest_cli/Makefile  |   2 +-
 examples/vm_power_manager/guest_cli/main.c    | 151 +++++++++++++++++-
 examples/vm_power_manager/guest_cli/parse.c   |  82 ++++++++++
 examples/vm_power_manager/guest_cli/parse.h   |  19 +++
 .../guest_cli/vm_power_cli_guest.c            | 113 +++++++------
 .../guest_cli/vm_power_cli_guest.h            |   6 +
 6 files changed, 319 insertions(+), 54 deletions(-)
 create mode 100644 examples/vm_power_manager/guest_cli/parse.c
 create mode 100644 examples/vm_power_manager/guest_cli/parse.h

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
index d710e22d9..8b1db861e 100644
--- a/examples/vm_power_manager/guest_cli/Makefile
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -14,7 +14,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 APP = guest_vm_power_mgr
 
 # all source are stored in SRCS-y
-SRCS-y := main.c vm_power_cli_guest.c
+SRCS-y := main.c vm_power_cli_guest.c parse.c
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
index b17936d6b..36365b124 100644
--- a/examples/vm_power_manager/guest_cli/main.c
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -2,23 +2,20 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
-/*
 #include <stdio.h>
-#include <string.h>
-#include <stdint.h>
-#include <sys/epoll.h>
-#include <fcntl.h>
-#include <unistd.h>
 #include <stdlib.h>
-#include <errno.h>
-*/
 #include <signal.h>
+#include <getopt.h>
+#include <string.h>
 
 #include <rte_lcore.h>
 #include <rte_power.h>
 #include <rte_debug.h>
+#include <rte_eal.h>
+#include <rte_log.h>
 
 #include "vm_power_cli_guest.h"
+#include "parse.h"
 
 static void
 sig_handler(int signo)
@@ -32,6 +29,136 @@ sig_handler(int signo)
 
 }
 
+#define MAX_HOURS 24
+
+/* Parse the argument given in the command line of the application */
+static int
+parse_args(int argc, char **argv)
+{
+	int opt, ret;
+	char **argvopt;
+	int option_index;
+	char *prgname = argv[0];
+	const struct option lgopts[] = {
+		{ "vm-name", required_argument, 0, 'n'},
+		{ "busy-hours", required_argument, 0, 'b'},
+		{ "quiet-hours", required_argument, 0, 'q'},
+		{ "port-list", required_argument, 0, 'p'},
+		{ "vcpu-list", required_argument, 0, 'l'},
+		{ "policy", required_argument, 0, 'o'},
+		{NULL, 0, 0, 0}
+	};
+	struct channel_packet *policy;
+	unsigned short int hours[MAX_HOURS];
+	unsigned short int cores[MAX_VCPU_PER_VM];
+	unsigned short int ports[MAX_VCPU_PER_VM];
+	int i, cnt, idx;
+
+	policy = get_policy();
+	set_policy_defaults(policy);
+
+	argvopt = argv;
+
+	while ((opt = getopt_long(argc, argvopt, "n:b:q:p:l:o:",
+				  lgopts, &option_index)) != EOF) {
+
+		switch (opt) {
+		/* VM name */
+		case 'n':
+			strcpy(policy->vm_name, optarg);
+			printf("Setting VM Name to [%s]\n", policy->vm_name);
+			break;
+		case 'b':
+		case 'q':
+			//printf("***Processing set using [%s]\n", optarg);
+			cnt = parse_set(optarg, hours, MAX_HOURS);
+			if (cnt < 0) {
+				printf("Invalid value passed to quiet/busy hours - [%s]\n",
+						optarg);
+				break;
+			}
+			idx = 0;
+			for (i = 0; i < MAX_HOURS; i++) {
+				if (hours[i]) {
+					if (opt == 'b') {
+						printf("***Busy Hour %d\n", i);
+						policy->timer_policy.busy_hours
+							[idx++] = i;
+					} else {
+						printf("***Quiet Hour %d\n", i);
+						policy->timer_policy.quiet_hours
+							[idx++] = i;
+					}
+				}
+			}
+			break;
+		case 'l':
+			cnt = parse_set(optarg, cores, MAX_VCPU_PER_VM);
+			if (cnt < 0) {
+				printf("Invalid value passed to vcpu-list - [%s]\n",
+						optarg);
+				break;
+			}
+			idx = 0;
+			for (i = 0; i < MAX_VCPU_PER_VM; i++) {
+				if (cores[i]) {
+					printf("***Using core %d\n", i);
+					policy->vcpu_to_control[idx++] = i;
+				}
+			}
+			policy->num_vcpu = idx;
+			printf("Total cores: %d\n", idx);
+			break;
+		case 'p':
+			cnt = parse_set(optarg, ports, MAX_VCPU_PER_VM);
+			if (cnt < 0) {
+				printf("Invalid value passed to port-list - [%s]\n",
+						optarg);
+				break;
+			}
+			idx = 0;
+			for (i = 0; i < MAX_VCPU_PER_VM; i++) {
+				if (ports[i]) {
+					printf("***Using port %d\n", i);
+					set_policy_mac(i, idx++);
+				}
+			}
+			policy->nb_mac_to_monitor = idx;
+			printf("Total Ports: %d\n", idx);
+			break;
+		case 'o':
+			if (!strcmp(optarg, "TRAFFIC"))
+				policy->policy_to_use = TRAFFIC;
+			else if (!strcmp(optarg, "TIME"))
+				policy->policy_to_use = TIME;
+			else if (!strcmp(optarg, "WORKLOAD"))
+				policy->policy_to_use = WORKLOAD;
+			else if (!strcmp(optarg, "BRANCH_RATIO"))
+				policy->policy_to_use = BRANCH_RATIO;
+			else {
+				printf("Invalid policy specified: %s\n",
+						optarg);
+				return -1;
+			}
+			break;
+		/* long options */
+
+		case 0:
+			break;
+
+		default:
+			return -1;
+		}
+	}
+
+	if (optind >= 0)
+		argv[optind-1] = prgname;
+
+	ret = optind-1;
+	optind = 0; /* reset getopt lib */
+	return ret;
+}
+
 int
 main(int argc, char **argv)
 {
@@ -45,6 +172,14 @@ main(int argc, char **argv)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
+	argc -= ret;
+	argv += ret;
+
+	/* parse application arguments (after the EAL ones) */
+	ret = parse_args(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid arguments\n");
+
 	rte_power_set_env(PM_ENV_KVM_VM);
 	RTE_LCORE_FOREACH(lcore_id) {
 		rte_power_init(lcore_id);
diff --git a/examples/vm_power_manager/guest_cli/parse.c b/examples/vm_power_manager/guest_cli/parse.c
new file mode 100644
index 000000000..528df6d6f
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/parse.c
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation.
+ * Copyright(c) 2014 6WIND S.A.
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include <rte_log.h>
+#include "parse.h"
+
+/*
+ * Parse elem, the elem could be single number/range or group
+ * 1) A single number elem, it's just a simple digit. e.g. 9
+ * 2) A single range elem, two digits with a '-' between. e.g. 2-6
+ * 3) A group elem, combines multiple 1) or 2) e.g 0,2-4,6
+ *    Within group, '-' used for a range separator;
+ *                       ',' used for a single number.
+ */
+int
+parse_set(const char *input, uint16_t set[], unsigned int num)
+{
+	unsigned int idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned int min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only a digit qualifies as the start point */
+	if (!isdigit(*str) || *str == '\0')
+		return -1;
+
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = num;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-' and ',' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == num)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == '\0')) {
+			max = idx;
+
+			if (min == num)
+				min = idx;
+
+			for (idx = RTE_MIN(min, max);
+					idx <= RTE_MAX(min, max); idx++) {
+				set[idx] = 1;
+			}
+			min = num;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0');
+
+	return str - input;
+}
diff --git a/examples/vm_power_manager/guest_cli/parse.h b/examples/vm_power_manager/guest_cli/parse.h
new file mode 100644
index 000000000..c8aa0ea50
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/parse.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef PARSE_H_
+#define PARSE_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+int
+parse_set(const char *, uint16_t [], unsigned int);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* PARSE_H_ */
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 43bdeacef..0db1b804f 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -33,6 +33,71 @@ struct cmd_quit_result {
 	cmdline_fixed_string_t quit;
 };
 
+union PFID {
+	struct ether_addr addr;
+	uint64_t pfid;
+};
+
+static struct channel_packet policy;
+
+struct channel_packet *
+get_policy(void)
+{
+	return &policy;
+}
+
+int
+set_policy_mac(int port, int idx)
+{
+	struct channel_packet *policy;
+	union PFID pfid;
+
+	/* Use port MAC address as the vfid */
+	rte_eth_macaddr_get(port, &pfid.addr);
+
+	printf("Port %u MAC: %02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 ":"
+			"%02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 "\n",
+			port,
+			pfid.addr.addr_bytes[0], pfid.addr.addr_bytes[1],
+			pfid.addr.addr_bytes[2], pfid.addr.addr_bytes[3],
+			pfid.addr.addr_bytes[4], pfid.addr.addr_bytes[5]);
+	policy = get_policy();
+	policy->vfid[idx] = pfid.pfid;
+	return 0;
+}
+
+void
+set_policy_defaults(struct channel_packet *pkt)
+{
+	set_policy_mac(0, 0);
+	pkt->nb_mac_to_monitor = 1;
+
+	pkt->t_boost_status.tbEnabled = false;
+
+	pkt->vcpu_to_control[0] = 0;
+	pkt->vcpu_to_control[1] = 1;
+	pkt->num_vcpu = 2;
+	/* Dummy Population. */
+	pkt->traffic_policy.min_packet_thresh = 96000;
+	pkt->traffic_policy.avg_max_packet_thresh = 1800000;
+	pkt->traffic_policy.max_max_packet_thresh = 2000000;
+
+	pkt->timer_policy.busy_hours[0] = 3;
+	pkt->timer_policy.busy_hours[1] = 4;
+	pkt->timer_policy.busy_hours[2] = 5;
+	pkt->timer_policy.quiet_hours[0] = 11;
+	pkt->timer_policy.quiet_hours[1] = 12;
+	pkt->timer_policy.quiet_hours[2] = 13;
+
+	pkt->timer_policy.hours_to_use_traffic_profile[0] = 8;
+	pkt->timer_policy.hours_to_use_traffic_profile[1] = 10;
+
+	pkt->workload = LOW;
+	pkt->policy_to_use = TIME;
+	pkt->command = PKT_POLICY;
+	strcpy(pkt->vm_name, "ubuntu2");
+}
+
 static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
 				__attribute__((unused)) struct cmdline *cl,
 			    __attribute__((unused)) void *data)
@@ -118,54 +183,12 @@ struct cmd_send_policy_result {
 	cmdline_fixed_string_t cmd;
 };
 
-union PFID {
-	struct ether_addr addr;
-	uint64_t pfid;
-};
-
 static inline int
-send_policy(void)
+send_policy(struct channel_packet *pkt)
 {
-	struct channel_packet pkt;
 	int ret;
 
-	union PFID pfid;
-	/* Use port MAC address as the vfid */
-	rte_eth_macaddr_get(0, &pfid.addr);
-	printf("Port %u MAC: %02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 ":"
-			"%02" PRIx8 ":%02" PRIx8 ":%02" PRIx8 "\n",
-			1,
-			pfid.addr.addr_bytes[0], pfid.addr.addr_bytes[1],
-			pfid.addr.addr_bytes[2], pfid.addr.addr_bytes[3],
-			pfid.addr.addr_bytes[4], pfid.addr.addr_bytes[5]);
-	pkt.vfid[0] = pfid.pfid;
-
-	pkt.nb_mac_to_monitor = 1;
-	pkt.t_boost_status.tbEnabled = false;
-
-	pkt.vcpu_to_control[0] = 0;
-	pkt.vcpu_to_control[1] = 1;
-	pkt.num_vcpu = 2;
-	/* Dummy Population. */
-	pkt.traffic_policy.min_packet_thresh = 96000;
-	pkt.traffic_policy.avg_max_packet_thresh = 1800000;
-	pkt.traffic_policy.max_max_packet_thresh = 2000000;
-
-	pkt.timer_policy.busy_hours[0] = 3;
-	pkt.timer_policy.busy_hours[1] = 4;
-	pkt.timer_policy.busy_hours[2] = 5;
-	pkt.timer_policy.quiet_hours[0] = 11;
-	pkt.timer_policy.quiet_hours[1] = 12;
-	pkt.timer_policy.quiet_hours[2] = 13;
-
-	pkt.timer_policy.hours_to_use_traffic_profile[0] = 8;
-	pkt.timer_policy.hours_to_use_traffic_profile[1] = 10;
-
-	pkt.workload = LOW;
-	pkt.policy_to_use = TIME;
-	pkt.command = PKT_POLICY;
-	strcpy(pkt.vm_name, "ubuntu2");
-	ret = rte_power_guest_channel_send_msg(&pkt, 1);
+	ret = rte_power_guest_channel_send_msg(pkt, 1);
 	if (ret == 0)
 		return 1;
 	RTE_LOG(DEBUG, POWER, "Error sending message: %s\n",
@@ -182,7 +205,7 @@ cmd_send_policy_parsed(void *parsed_result, struct cmdline *cl,
 
 	if (!strcmp(res->cmd, "now")) {
 		printf("Sending Policy down now!\n");
-		ret = send_policy();
+		ret = send_policy(&policy);
 	}
 	if (ret != 1)
 		cmdline_printf(cl, "Error sending message: %s\n",
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
index 75a262967..fd77f6a69 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.h
@@ -11,6 +11,12 @@ extern "C" {
 
 #include "channel_commands.h"
 
+struct channel_packet *get_policy(void);
+
+int set_policy_mac(int port, int idx);
+
+void set_policy_defaults(struct channel_packet *pkt);
+
 void run_cli(__attribute__((unused)) void *arg);
 
 #ifdef __cplusplus
-- 
2.17.1
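
Editor's note: the list grammar described in the parse_set() comment in
the patch above (single numbers, ranges, and comma-separated groups) can
be illustrated with a small standalone sketch. This is not the code from
the patch: core_list_to_set() is a hypothetical helper written only for
illustration, it does not skip blanks the way parse_set() does, and it
returns 0/-1 rather than a consumed length.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): set set[i] = 1 for every
 * index named in a list such as "1,3,5-7,9". Indexes must be < num.
 * Returns 0 on success, -1 on malformed input.
 */
static int
core_list_to_set(const char *list, uint16_t set[], unsigned int num)
{
	const char *p = list;
	char *end;

	memset(set, 0, num * sizeof(set[0]));
	if (*p == '\0')
		return -1;		/* empty list */

	while (*p != '\0') {
		unsigned long lo, hi, i;

		if (!isdigit((unsigned char)*p))
			return -1;
		errno = 0;
		lo = strtoul(p, &end, 10);
		if (errno != 0 || lo >= num)
			return -1;
		hi = lo;

		if (*end == '-') {	/* range: parse the second bound */
			p = end + 1;
			if (!isdigit((unsigned char)*p))
				return -1;
			errno = 0;
			hi = strtoul(p, &end, 10);
			if (errno != 0 || hi >= num)
				return -1;
		}
		if (hi < lo) {		/* accept "7-5" as well as "5-7" */
			unsigned long tmp = lo;

			lo = hi;
			hi = tmp;
		}
		for (i = lo; i <= hi; i++)
			set[i] = 1;

		if (*end == ',') {
			p = end + 1;
			if (*p == '\0')
				return -1;	/* trailing comma */
		} else if (*end == '\0') {
			p = end;
		} else {
			return -1;	/* unexpected separator */
		}
	}
	return 0;
}
```

With num = 16, the list "1,3,5-7,9" marks entries 1, 3, 5, 6, 7 and 9,
which matches the expansion given in the cover letter.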

* [dpdk-dev] [PATCH v4 9/9] examples/vm_power: make branch ratio configurable
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
                               ` (7 preceding siblings ...)
  2018-07-13 14:23             ` [dpdk-dev] [PATCH v4 8/9] examples/vm_power: add cli args to guest app David Hunt
@ 2018-07-13 14:23             ` David Hunt
  2018-07-20 22:06             ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling Thomas Monjalon
  9 siblings, 0 replies; 46+ messages in thread
From: David Hunt @ 2018-07-13 14:23 UTC (permalink / raw)
  To: dev; +Cc: david.hunt, thomas

For different workloads and poll loops, the threshold at which
you want to scale up and down may be different.

This patch allows the default branch ratio to be changed
with the -b command line argument (or --branch-ratio=).

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
 examples/vm_power_manager/main.c            | 16 +++++++++++++++-
 examples/vm_power_manager/oob_monitor_x86.c |  3 +--
 examples/vm_power_manager/power_manager.c   |  1 +
 examples/vm_power_manager/power_manager.h   |  3 +++
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index f9990f153..58c5fa45c 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -141,17 +141,19 @@ parse_args(int argc, char **argv)
 	int option_index;
 	char *prgname = argv[0];
 	struct core_info *ci;
+	float branch_ratio;
 	static struct option lgopts[] = {
 		{ "mac-updating", no_argument, 0, 1},
 		{ "no-mac-updating", no_argument, 0, 0},
 		{ "core-list", optional_argument, 0, 'l'},
 		{ "port-list", optional_argument, 0, 'p'},
+		{ "branch-ratio", optional_argument, 0, 'b'},
 		{NULL, 0, 0, 0}
 	};
 	argvopt = argv;
 	ci = get_core_info();
 
-	while ((opt = getopt_long(argc, argvopt, "l:p:q:T:",
+	while ((opt = getopt_long(argc, argvopt, "l:p:q:T:b:",
 				  lgopts, &option_index)) != EOF) {
 
 		switch (opt) {
@@ -184,6 +186,18 @@ parse_args(int argc, char **argv)
 			}
 			free(oob_enable);
 			break;
+		case 'b':
+			branch_ratio = 0.0;
+			if (optarg != NULL && strlen(optarg))
+				branch_ratio = atof(optarg);
+			if (branch_ratio <= 0.0) {
+				printf("invalid branch ratio specified\n");
+				return -1;
+			}
+			ci->branch_ratio_threshold = branch_ratio;
+			printf("***Setting branch ratio to %f\n",
+					branch_ratio);
+			break;
 		/* long options */
 		case 0:
 			break;
diff --git a/examples/vm_power_manager/oob_monitor_x86.c b/examples/vm_power_manager/oob_monitor_x86.c
index 62d503ca5..589c604e5 100644
--- a/examples/vm_power_manager/oob_monitor_x86.c
+++ b/examples/vm_power_manager/oob_monitor_x86.c
@@ -22,7 +22,6 @@ void branch_monitor_exit(void)
 /* Number of microseconds between each poll */
 #define INTERVAL 100
 #define PRINT_LOOP_COUNT (1000000/INTERVAL)
-#define RATIO_THRESHOLD 0.03
 #define IA32_PERFEVTSEL0 0x186
 #define IA32_PERFEVTSEL1 0x187
 #define IA32_PERFCTR0 0xc1
@@ -89,7 +88,7 @@ apply_policy(int core)
 
 	ratio = (float)miss_diff * (float)100 / (float)hits_diff;
 
-	if (ratio < RATIO_THRESHOLD)
+	if (ratio < ci->branch_ratio_threshold)
 		power_manager_scale_core_min(core);
 	else
 		power_manager_scale_core_max(core);
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 4bdde23da..b7769c3c3 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -74,6 +74,7 @@ core_info_init(void)
 	ci = get_core_info();
 
 	ci->core_count = get_nprocs_conf();
+	ci->branch_ratio_threshold = BRANCH_RATIO_THRESHOLD;
 	ci->cd = malloc(ci->core_count * sizeof(struct core_details));
 	if (!ci->cd) {
 		RTE_LOG(ERR, POWER_MANAGER, "Failed to allocate memory for core info.");
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index 45385de37..605b3c8f6 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -19,8 +19,11 @@ struct core_details {
 struct core_info {
 	uint16_t core_count;
 	struct core_details *cd;
+	float branch_ratio_threshold;
 };
 
+#define BRANCH_RATIO_THRESHOLD 0.1
+
 struct core_info *
 get_core_info(void);
 
-- 
2.17.1

* Re: [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling
  2018-07-13  8:43               ` Hunt, David
@ 2018-07-18 15:23                 ` Thomas Monjalon
  0 siblings, 0 replies; 46+ messages in thread
From: Thomas Monjalon @ 2018-07-18 15:23 UTC (permalink / raw)
  To: Hunt, David
  Cc: dev, jerin.jacob, hemant.agrawal, arybchenko, ferruh.yigit,
	bruce.richardson

13/07/2018 10:43, Hunt, David:
> 
> On 13/7/2018 9:33 AM, Thomas Monjalon wrote:
> > 13/07/2018 10:31, Hunt, David:
> >> Hi Thomas,
> >>
> >> On 12/7/2018 8:09 PM, Thomas Monjalon wrote:
> >>> 26/06/2018 11:23, David Hunt:
> >>>> This patch set adds the capability to do out-of-band power
> >>>> monitoring on a system. It uses a thread to monitor the branch
> >>>> counters in the targeted cores, and calculates the branch ratio
> >>>> of the running code.
> >>>>
> >>>> If the branch ratio is low (0.01), then
> >>>> the code is most likely running in a tight poll loop and doing
> >>>> nothing, i.e. receiving no packets. In this case we scale down
> >>>> the frequency of that core.
> >>>>
> >>>> If the branch ratio is higher (>0.01), then it is likely that
> >>>> the code is receiving and processing packets. In this case, we
> >>>> scale up the frequency of that core.
> >>>>
> >>>> The cpu counters are read via /dev/cpu/x/msr, so requires the
> >>>> msr kernel module to be loaded. Because this method is used,
> >>>> the patch set is implemented with one file for x86 systems, and
> >>>> another for non-x86 systems, with conditional compilation in
> >>>> the Makefile. The non-x86 functions are stubs, and do not
> >>>> currently implement any functionality.
> >>>>
> >>>> The vm_power_manager app has been modified to take a new parameter
> >>>>      --core-list or -l
> >>>> which takes a list of cores in a comma-separated list format,
> >>>> e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
> >>>> These cores will then be enabled for oob monitoring. When the
> >>>> OOB monitoring thread starts, it reads the branch hits/miss
> >>>> counters of each monitored core, and scales up/down accordingly.
> >>> It looks to be a feature which could be integrated in DPDK libs.
> >>> Why choose to implement it fully in an example?
> >> I needed to set up a thread that looped tightly (~100uS interval) and
> >> run it on its own core. From what I have seen in other cases, it is
> >> usually the application that allocates cores and decides what to run
> >> on them. I did think about putting some of it in a library, but for
> >> this case I thought it made more sense to keep it purely as a sample
> >> app.
> > I feel some code deserves to be in a library.
> > For instance, having different implementations per CPU is a good reason
> > to make a library.
> >
> 
> Sure, I can look at moving some of the code into the library in a
> future release. However, I believe it's OK as it is for the current
> merge window.

I will pull it in 18.08-rc2 if compilation is fine.
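
Editor's note: the scaling rule described in the quoted cover letter can
be sketched as below. The ratio formula (misses * 100 / hits) follows
the apply_policy() context line shown in patch 9/9. The function name,
the enum, and the handling of a zero hit count are assumptions made only
for this illustration.

```c
#include <stdint.h>

enum scale_action { SCALE_DOWN, SCALE_UP };

/*
 * Hypothetical helper (not part of the patch): decide how to scale a core
 * from the branch hit/miss counter deltas seen between two polls.
 */
static enum scale_action
branch_ratio_action(uint64_t hits_diff, uint64_t miss_diff, float threshold)
{
	float ratio;

	/* Assumption: with no branch hits recorded, do not scale down. */
	if (hits_diff == 0)
		return SCALE_UP;

	ratio = (float)miss_diff * 100.0f / (float)hits_diff;

	/* A low ratio suggests a tight, empty poll loop: scale down. */
	return ratio < threshold ? SCALE_DOWN : SCALE_UP;
}
```

For example, 10 misses against 1,000,000 hits gives a ratio of 0.001,
which is below a threshold of 0.1, so the core would be scaled down.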

* Re: [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling
  2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
                               ` (8 preceding siblings ...)
  2018-07-13 14:23             ` [dpdk-dev] [PATCH v4 9/9] examples/vm_power: make branch ratio configurable David Hunt
@ 2018-07-20 22:06             ` Thomas Monjalon
  9 siblings, 0 replies; 46+ messages in thread
From: Thomas Monjalon @ 2018-07-20 22:06 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

13/07/2018 16:22, David Hunt:
> [1/9] examples/vm_power: add check for port count
> [2/9] examples/vm_power: add core list parameter
> [3/9] examples/vm_power: add oob monitoring functions
> [4/9] examples/vm_power: allow greater than 64 cores
> [5/9] examples/vm_power: add thread for oob core monitor
> [6/9] examples/vm_power: add port-list to command line
> [7/9] examples/vm_power: add branch ratio policy type
> [8/9] examples/vm_power: add cli args to guest app
> [9/9] examples/vm_power: make branch ratio configurable

Applied, thanks

Some features could be hosted in a DPDK library.
It has been agreed in the technical board to use examples
as a staging area, but it must be reconsidered regularly
whether the code must stay, move or be removed.

end of thread, other threads:[~2018-07-20 22:06 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
2018-06-07  7:36 [dpdk-dev] [PATCH v1 0/6] examples/vm_power: 100% Busy Polling David Hunt
2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 1/6] examples/vm_power: add check for port count David Hunt
2018-06-21 13:24   ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling David Hunt
2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 1/8] examples/vm_power: add check for port count David Hunt
2018-06-26  9:23       ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling David Hunt
2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 1/9] examples/vm_power: add check for port count David Hunt
2018-07-13 14:22           ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling David Hunt
2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 1/9] examples/vm_power: add check for port count David Hunt
2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 2/9] examples/vm_power: add core list parameter David Hunt
2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 3/9] examples/vm_power: add oob monitoring functions David Hunt
2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 4/9] examples/vm_power: allow greater than 64 cores David Hunt
2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 5/9] examples/vm_power: add thread for oob core monitor David Hunt
2018-07-13 14:22             ` [dpdk-dev] [PATCH v4 6/9] examples/vm_power: add port-list to command line David Hunt
2018-07-13 14:23             ` [dpdk-dev] [PATCH v4 7/9] examples/vm_power: add branch ratio policy type David Hunt
2018-07-13 14:23             ` [dpdk-dev] [PATCH v4 8/9] examples/vm_power: add cli args to guest app David Hunt
2018-07-13 14:23             ` [dpdk-dev] [PATCH v4 9/9] examples/vm_power: make branch ratio configurable David Hunt
2018-07-20 22:06             ` [dpdk-dev] [PATCH v4 0/9] examples/vm_power: 100% Busy Polling Thomas Monjalon
2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 2/9] examples/vm_power: add core list parameter David Hunt
2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 3/9] examples/vm_power: add oob monitoring functions David Hunt
2018-07-12 19:13           ` Thomas Monjalon
2018-07-12 22:18             ` Stephen Hemminger
2018-07-13  8:24             ` Hunt, David
2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 4/9] examples/vm_power: allow greater than 64 cores David Hunt
2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 5/9] examples/vm_power: add thread for oob core monitor David Hunt
2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 6/9] examples/vm_power: add port-list to command line David Hunt
2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 7/9] examples/vm_power: add branch ratio policy type David Hunt
2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 8/9] examples/vm_power: add cli args to guest app David Hunt
2018-06-26  9:23         ` [dpdk-dev] [PATCH v3 9/9] examples/vm_power: make branch ratio configurable David Hunt
2018-07-12 19:09         ` [dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling Thomas Monjalon
2018-07-13  8:31           ` Hunt, David
2018-07-13  8:33             ` Thomas Monjalon
2018-07-13  8:43               ` Hunt, David
2018-07-18 15:23                 ` Thomas Monjalon
2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 2/8] examples/vm_power: add core list parameter David Hunt
2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 3/8] examples/vm_power: add oob monitoring functions David Hunt
2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 4/8] examples/vm_power: allow greater than 64 cores David Hunt
2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 5/8] examples/vm_power: add thread for oob core monitor David Hunt
2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 6/8] examples/vm_power: add port-list to command line David Hunt
2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 7/8] examples/vm_power: add branch ratio policy type David Hunt
2018-06-21 13:24     ` [dpdk-dev] [PATCH v2 8/8] examples/vm_power: add cli args to guest app David Hunt
2018-06-21 14:28     ` [dpdk-dev] [PATCH v2 0/8] examples/vm_power: 100% Busy Polling Radu Nicolau
2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 2/6] examples/vm_power: add core list parameter David Hunt
2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 3/6] examples/vm_power: add oob monitoring functions David Hunt
2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 4/6] examples/vm_power: allow greater than 64 cores David Hunt
2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 5/6] examples/vm_power: add thread for oob core monitor David Hunt
2018-06-07  7:37 ` [dpdk-dev] [PATCH v1 6/6] examples/vm_power: add port-list to command line David Hunt
