From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by dpdk.org (Postfix) with ESMTP id 7A7741B204 for ; Fri, 28 Sep 2018 13:19:35 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Sep 2018 04:19:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,314,1534834800"; d="scan'208";a="94713483" Received: from dhunt5-mobl2.ger.corp.intel.com (HELO [10.237.221.37]) ([10.237.221.37]) by orsmga001.jf.intel.com with ESMTP; 28 Sep 2018 04:19:32 -0700 To: Liang Ma Cc: dev@dpdk.org, lei.a.yao@intel.com, ktraynor@redhat.com, john.geary@intel.com References: <1536070228-6545-1-git-send-email-liang.j.ma@intel.com> <1537191016-26330-1-git-send-email-liang.j.ma@intel.com> <1537191016-26330-2-git-send-email-liang.j.ma@intel.com> From: "Hunt, David" Message-ID: <1da54734-35f2-aaea-b95d-01f3a651d74a@intel.com> Date: Fri, 28 Sep 2018 12:19:31 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <1537191016-26330-2-git-send-email-liang.j.ma@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Subject: Re: [dpdk-dev] [PATCH v8 2/4] examples/l3fwd-power: simple app update for new API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Sep 2018 11:19:36 -0000 Hi Liang, A few tweaks below: On 17/9/2018 2:30 PM, Liang Ma wrote: > Add the support for new traffic pattern aware power control > power management API. > > Example: > ./l3fwd-power -l xxx -n 4 -w 0000:xx:00.0 -w 0000:xx:00.1 -- -p 0x3 > -P --config="(0,0,xx),(1,0,xx)" --empty-poll="0,0,0" -l 14 -m 9 -h 1 > > Please Reference l3fwd-power document for all parameter except > empty-poll. The docs should probably include empty poll parameter. Suggest re-wording to Please Reference l3fwd-power document for full parameter usage > The option "l", "m", "h" are used to set the power index for > LOW, MED, HIGH power state. only is useful after enable empty-poll > > --empty-poll="training_flag, med_threshold, high_threshold" > > The option training_flag is used to enable/disable training mode. > > The option med_threshold is used to indicate the empty poll threshold > of modest state which is customized by user. > > The option high_threshold is used to indicate the empty poll threshold > of busy state which is customized by user. > > Above three option default value is all 0. > > Once enable empty-poll. System will apply the default parameter. > Training mode is disabled as default. Suggest: Once empty-poll is enabled, the system will apply the default parameters is no other command line options are provided. > If training mode is triggered, there should not has any traffic > pass-through during training phase. Suggest: If training mode is enabled, the user should ensure that no traffic is allowed to pass through the system. > When training phase complete, system transfer to normal phase. When training phase complete, the application transfer to normal operation > > System will running with modest power stat at beginning. System will start running with the modest power mode. > If the system busyness percentage above 70%, then system will adjust > power state move to High power state. If the traffic become lower(eg. The > system busyness percentage drop below 30%), system will fallback > to the modest power state. If the traffic goes above 70%, then system will move to High power state. If the traffic drops below 30%, the system will fallback to the modest power state. > Example code use master thread to monitoring worker thread busyness. > the default timer resolution is 10ms. > > ChangeLog: > v2 fix some coding style issues > v3 rename the API. > v6 re-work the API. > v7 no change. > v8 disable training as default option. > > Signed-off-by: Liang Ma > > Reviewed-by: Lei Yao > --- > examples/l3fwd-power/Makefile | 3 + > examples/l3fwd-power/main.c | 325 +++++++++++++++++++++++++++++++++++++-- > examples/l3fwd-power/meson.build | 1 + > 3 files changed, 312 insertions(+), 17 deletions(-) > > diff --git a/examples/l3fwd-power/Makefile b/examples/l3fwd-power/Makefile > index d7e39a3..772ec7b 100644 > --- a/examples/l3fwd-power/Makefile > +++ b/examples/l3fwd-power/Makefile > @@ -23,6 +23,8 @@ CFLAGS += -O3 $(shell pkg-config --cflags libdpdk) > LDFLAGS_SHARED = $(shell pkg-config --libs libdpdk) > LDFLAGS_STATIC = -Wl,-Bstatic $(shell pkg-config --static --libs libdpdk) > > +CFLAGS += -DALLOW_EXPERIMENTAL_API > + > build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build > $(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED) > > @@ -54,6 +56,7 @@ please change the definition of the RTE_TARGET environment variable) > all: > else > > +CFLAGS += -DALLOW_EXPERIMENTAL_API > CFLAGS += -O3 > CFLAGS += $(WERROR_FLAGS) > > diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c > index 68527d2..1465608 100644 > --- a/examples/l3fwd-power/main.c > +++ b/examples/l3fwd-power/main.c > @@ -43,6 +43,7 @@ > #include > #include > #include > +#include > > #include "perf_core.h" > #include "main.h" > @@ -55,6 +56,8 @@ > > /* 100 ms interval */ > #define TIMER_NUMBER_PER_SECOND 10 > +/* (10ms) */ > +#define INTERVALS_PER_SECOND 100 > /* 100000 us */ > #define SCALING_PERIOD (1000000/TIMER_NUMBER_PER_SECOND) > #define SCALING_DOWN_TIME_RATIO_THRESHOLD 0.25 > @@ -117,6 +120,11 @@ > */ > #define RTE_TEST_RX_DESC_DEFAULT 1024 > #define RTE_TEST_TX_DESC_DEFAULT 1024 > +#define EMPTY_POLL_MED_THRESHOLD 350000UL > +#define EMPTY_POLL_HGH_THRESHOLD 580000UL I'd suggest adding some explanation around these two numbers. E.g. /*  * These two thresholds were decided on by running the training algorithm on  * a 2.5GHz Xeon. These defaults can be overridden by supplying non-zero values  * for the med_threshold and high_threshold parameters on the command line.  */ > + > + > + > static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT; > static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT; > > @@ -132,6 +140,14 @@ static uint32_t enabled_port_mask = 0; > static int promiscuous_on = 0; > /* NUMA is enabled by default. */ > static int numa_on = 1; > +/* emptypoll is disabled by default. */ > +static bool empty_poll_on; > +static bool empty_poll_train; > +volatile bool empty_poll_stop; > +static struct ep_params *ep_params; > +static struct ep_policy policy; > +static long ep_med_edpi, ep_hgh_edpi; > + > static int parse_ptype; /**< Parse packet type using rx callback, and */ > /**< disabled by default */ > > @@ -330,6 +346,13 @@ static inline uint32_t power_idle_heuristic(uint32_t zero_rx_packet_count); > static inline enum freq_scale_hint_t power_freq_scaleup_heuristic( \ > unsigned int lcore_id, uint16_t port_id, uint16_t queue_id); > > +static uint8_t freq_tlb[] = {14, 9, 1}; > + Maybe an explanation on where these numbers came from. E.g. /*  * These defaults are using the max frequency index (1), a medium index (9) and a  * typical low frequency index (14). These can be adjusted to use different  * indexes using the relevant command line parameters.  */ > +static int is_done(void) > +{ > + return empty_poll_stop; > +} > + > /* exit signal handler */ > static void > signal_exit_now(int sigtype) > @@ -338,7 +361,15 @@ signal_exit_now(int sigtype) > unsigned int portid; > int ret; > > + RTE_SET_USED(lcore_id); > + RTE_SET_USED(portid); > + RTE_SET_USED(ret); > + > if (sigtype == SIGINT) { > + if (empty_poll_on) > + empty_poll_stop = true; > + > + > for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { > if (rte_lcore_is_enabled(lcore_id) == 0) > continue; > @@ -351,16 +382,19 @@ signal_exit_now(int sigtype) > "core%u\n", lcore_id); > } > > - RTE_ETH_FOREACH_DEV(portid) { > - if ((enabled_port_mask & (1 << portid)) == 0) > - continue; > + if (!empty_poll_on) { > + RTE_ETH_FOREACH_DEV(portid) { > + if ((enabled_port_mask & (1 << portid)) == 0) > + continue; > > - rte_eth_dev_stop(portid); > - rte_eth_dev_close(portid); > + rte_eth_dev_stop(portid); > + rte_eth_dev_close(portid); > + } > } > } > > - rte_exit(EXIT_SUCCESS, "User forced exit\n"); > + if (!empty_poll_on) > + rte_exit(EXIT_SUCCESS, "User forced exit\n"); > } > > /* Freqency scale down timer callback */ > @@ -825,7 +859,107 @@ static int event_register(struct lcore_conf *qconf) > > return 0; > } > +/* main processing loop */ > +static int > +main_empty_poll_loop(__attribute__((unused)) void *dummy) > +{ > + struct rte_mbuf *pkts_burst[MAX_PKT_BURST]; > + unsigned int lcore_id; > + uint64_t prev_tsc, diff_tsc, cur_tsc; > + int i, j, nb_rx; > + uint8_t queueid; > + uint16_t portid; > + struct lcore_conf *qconf; > + struct lcore_rx_queue *rx_queue; > + > + const uint64_t drain_tsc = > + (rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US; > + > + prev_tsc = 0; > + > + lcore_id = rte_lcore_id(); > + qconf = &lcore_conf[lcore_id]; > + > + if (qconf->n_rx_queue == 0) { > + RTE_LOG(INFO, L3FWD_POWER, "lcore %u has nothing to do\n", lcore_id); > + return 0; > + } > + > + for (i = 0; i < qconf->n_rx_queue; i++) { > + portid = qconf->rx_queue_list[i].port_id; > + queueid = qconf->rx_queue_list[i].queue_id; > + RTE_LOG(INFO, L3FWD_POWER, " -- lcoreid=%u portid=%u " > + "rxqueueid=%hhu\n", lcore_id, portid, queueid); > + } > + > + while (!is_done()) { > + stats[lcore_id].nb_iteration_looped++; > + > + cur_tsc = rte_rdtsc(); > + /* > + * TX burst queue drain > + */ > + diff_tsc = cur_tsc - prev_tsc; > + if (unlikely(diff_tsc > drain_tsc)) { > + for (i = 0; i < qconf->n_tx_port; ++i) { > + portid = qconf->tx_port_id[i]; > + rte_eth_tx_buffer_flush(portid, > + qconf->tx_queue_id[portid], > + qconf->tx_buffer[portid]); > + } > + prev_tsc = cur_tsc; > + } > + > + /* > + * Read packet from RX queues > + */ > + for (i = 0; i < qconf->n_rx_queue; ++i) { > + rx_queue = &(qconf->rx_queue_list[i]); > + rx_queue->idle_hint = 0; > + portid = rx_queue->port_id; > + queueid = rx_queue->queue_id; > + > + nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst, > + MAX_PKT_BURST); > + > + stats[lcore_id].nb_rx_processed += nb_rx; > + > + if (nb_rx == 0) { > + > + rte_power_empty_poll_stat_update(lcore_id); > + > + continue; > + } else { > + rte_power_poll_stat_update(lcore_id, nb_rx); > + } > + > + > + /* Prefetch first packets */ > + for (j = 0; j < PREFETCH_OFFSET && j < nb_rx; j++) { > + rte_prefetch0(rte_pktmbuf_mtod( > + pkts_burst[j], void *)); > + } > + > + /* Prefetch and forward already prefetched packets */ > + for (j = 0; j < (nb_rx - PREFETCH_OFFSET); j++) { > + rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[ > + j + PREFETCH_OFFSET], void *)); > + l3fwd_simple_forward(pkts_burst[j], portid, > + qconf); > + } > > + /* Forward remaining prefetched packets */ > + for (; j < nb_rx; j++) { > + l3fwd_simple_forward(pkts_burst[j], portid, > + qconf); > + } > + > + } > + > + } > + > + return 0; > +} > /* main processing loop */ > static int > main_loop(__attribute__((unused)) void *dummy) > @@ -1127,7 +1261,8 @@ print_usage(const char *prgname) > " --no-numa: optional, disable numa awareness\n" > " --enable-jumbo: enable jumbo frame" > " which max packet len is PKTLEN in decimal (64-9600)\n" > - " --parse-ptype: parse packet type by software\n", > + " --parse-ptype: parse packet type by software\n" > + " --empty=poll: enable empty poll detection\n", typo: "empty=poll" should be "empty-poll" I really think some info on what should be supplied with the empty-poll parameter should be mentioned here e.g. --empty=poll "training_flag, high_threshold, med_threshold" > prgname); > } > > @@ -1220,7 +1355,55 @@ parse_config(const char *q_arg) > > return 0; > } > +static int > +parse_ep_config(const char *q_arg) > +{ > + char s[256]; > + const char *p = q_arg; > + char *end; > + int num_arg; > + > + char *str_fld[3]; > + > + int training_flag; > + int med_edpi; > + int hgh_edpi; > + > + ep_med_edpi = EMPTY_POLL_MED_THRESHOLD; > + ep_hgh_edpi = EMPTY_POLL_MED_THRESHOLD; > + > + snprintf(s, sizeof(s), "%s", p); > + > + num_arg = rte_strsplit(s, sizeof(s), str_fld, 3, ','); > + > + empty_poll_train = false; > + > + if (num_arg == 0) > + return 0; > > + if (num_arg == 3) { > + > + training_flag = strtoul(str_fld[0], &end, 0); > + med_edpi = strtoul(str_fld[1], &end, 0); > + hgh_edpi = strtoul(str_fld[2], &end, 0); > + > + if (training_flag == 1) > + empty_poll_train = true; > + > + if (med_edpi > 0) > + ep_med_edpi = med_edpi; > + > + if (med_edpi > 0) > + ep_hgh_edpi = hgh_edpi; > + > + } else { > + > + return -1; > + } > + > + return 0; > + > +} > #define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype" > > /* Parse the argument given in the command line of the application */ > @@ -1230,6 +1413,7 @@ parse_args(int argc, char **argv) > int opt, ret; > char **argvopt; > int option_index; > + uint32_t limit; > char *prgname = argv[0]; > static struct option lgopts[] = { > {"config", 1, 0, 0}, > @@ -1237,13 +1421,14 @@ parse_args(int argc, char **argv) > {"high-perf-cores", 1, 0, 0}, > {"no-numa", 0, 0, 0}, > {"enable-jumbo", 0, 0, 0}, > + {"empty-poll", 1, 0, 0}, > {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0}, > {NULL, 0, 0, 0} > }; > > argvopt = argv; > > - while ((opt = getopt_long(argc, argvopt, "p:P", > + while ((opt = getopt_long(argc, argvopt, "p:l:m:h:P", > lgopts, &option_index)) != EOF) { > > switch (opt) { > @@ -1260,7 +1445,18 @@ parse_args(int argc, char **argv) > printf("Promiscuous mode selected\n"); > promiscuous_on = 1; > break; > - > + case 'l': > + limit = parse_max_pkt_len(optarg); > + freq_tlb[LOW] = limit; > + break; > + case 'm': > + limit = parse_max_pkt_len(optarg); > + freq_tlb[MED] = limit; > + break; > + case 'h': > + limit = parse_max_pkt_len(optarg); > + freq_tlb[HGH] = limit; > + break; > /* long options */ > case 0: > if (!strncmp(lgopts[option_index].name, "config", 6)) { > @@ -1299,6 +1495,20 @@ parse_args(int argc, char **argv) > } > > if (!strncmp(lgopts[option_index].name, > + "empty-poll", 10)) { > + printf("empty-poll is enabled\n"); > + empty_poll_on = true; > + ret = parse_ep_config(optarg); > + > + if (ret) { > + printf("invalid empty poll config\n"); > + print_usage(prgname); > + return -1; > + } > + > + } > + > + if (!strncmp(lgopts[option_index].name, > "enable-jumbo", 12)) { > struct option lenopts = > {"max-pkt-len", required_argument, \ > @@ -1646,6 +1856,59 @@ init_power_library(void) > } > return ret; > } > +static void > +empty_poll_setup_timer(void) > +{ > + int lcore_id = rte_lcore_id(); > + uint64_t hz = rte_get_timer_hz(); > + > + struct ep_params *ep_ptr = ep_params; > + > + ep_ptr->interval_ticks = hz / INTERVALS_PER_SECOND; > + > + rte_timer_reset_sync(&ep_ptr->timer0, > + ep_ptr->interval_ticks, > + PERIODICAL, > + lcore_id, > + rte_empty_poll_detection, > + (void *)ep_ptr); > + > +} > +static int > +launch_timer(unsigned int lcore_id) > +{ > + int64_t prev_tsc = 0, cur_tsc, diff_tsc, cycles_10ms; > + > + RTE_SET_USED(lcore_id); > + > + > + if (rte_get_master_lcore() != lcore_id) { > + rte_panic("timer on lcore:%d which is not master core:%d\n", > + lcore_id, > + rte_get_master_lcore()); > + } > + > + RTE_LOG(INFO, POWER, "Bring up the Timer\n"); > + > + empty_poll_setup_timer(); > + > + cycles_10ms = rte_get_timer_hz() / 100; > + > + while (!is_done()) { > + cur_tsc = rte_rdtsc(); > + diff_tsc = cur_tsc - prev_tsc; > + if (diff_tsc > cycles_10ms) { > + rte_timer_manage(); > + prev_tsc = cur_tsc; > + cycles_10ms = rte_get_timer_hz() / 100; > + } > + } > + > + RTE_LOG(INFO, POWER, "Timer_subsystem is done\n"); > + > + return 0; > +} > + > > int > main(int argc, char **argv) > @@ -1828,13 +2091,15 @@ main(int argc, char **argv) > if (rte_lcore_is_enabled(lcore_id) == 0) > continue; > > - /* init timer structures for each enabled lcore */ > - rte_timer_init(&power_timers[lcore_id]); > - hz = rte_get_timer_hz(); > - rte_timer_reset(&power_timers[lcore_id], > - hz/TIMER_NUMBER_PER_SECOND, SINGLE, lcore_id, > - power_timer_cb, NULL); > - > + if (empty_poll_on == false) { > + /* init timer structures for each enabled lcore */ > + rte_timer_init(&power_timers[lcore_id]); > + hz = rte_get_timer_hz(); > + rte_timer_reset(&power_timers[lcore_id], > + hz/TIMER_NUMBER_PER_SECOND, > + SINGLE, lcore_id, > + power_timer_cb, NULL); > + } > qconf = &lcore_conf[lcore_id]; > printf("\nInitializing rx queues on lcore %u ... ", lcore_id ); > fflush(stdout); > @@ -1905,12 +2170,38 @@ main(int argc, char **argv) > > check_all_ports_link_status(enabled_port_mask); > > + if (empty_poll_on == true) { > + > + if (empty_poll_train) { > + policy.state = TRAINING; > + } else { > + policy.state = MED_NORMAL; > + policy.med_base_edpi = ep_med_edpi; > + policy.hgh_base_edpi = ep_hgh_edpi; > + } > + > + rte_power_empty_poll_stat_init(&ep_params, freq_tlb, &policy); > + } > + > + > /* launch per-lcore init on every lcore */ > - rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER); > + if (empty_poll_on == false) { > + rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER); > + } else { > + empty_poll_stop = false; > + rte_eal_mp_remote_launch(main_empty_poll_loop, NULL, SKIP_MASTER); > + } > + > + if (empty_poll_on == true) > + launch_timer(rte_lcore_id()); > + > RTE_LCORE_FOREACH_SLAVE(lcore_id) { > if (rte_eal_wait_lcore(lcore_id) < 0) > return -1; > } > > + if (empty_poll_on) > + rte_power_empty_poll_stat_free(); > + > return 0; > } > diff --git a/examples/l3fwd-power/meson.build b/examples/l3fwd-power/meson.build > index 20c8054..a3c5c2f 100644 > --- a/examples/l3fwd-power/meson.build > +++ b/examples/l3fwd-power/meson.build > @@ -9,6 +9,7 @@ > if host_machine.system() != 'linux' > build = false > endif > +allow_experimental_apis = true > deps += ['power', 'timer', 'lpm', 'hash'] > sources = files( > 'main.c', 'perf_core.c' Checkpatch throws up some warnings: ### examples/l3fwd-power: simple app update for new API WARNING:LONG_LINE: line over 80 characters #201: FILE: examples/l3fwd-power/main.c:876: +               (rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US; WARNING:LONG_LINE: line over 80 characters #209: FILE: examples/l3fwd-power/main.c:884: +               RTE_LOG(INFO, L3FWD_POWER, "lcore %u has nothing to do\n", lcore_id); WARNING:LONG_LINE: line over 80 characters #271: FILE: examples/l3fwd-power/main.c:946: +                                                       j + PREFETCH_OFFSET], void *)); WARNING:LONG_LINE: line over 80 characters #529: FILE: examples/l3fwd-power/main.c:2192: +               rte_eal_mp_remote_launch(main_empty_poll_loop, NULL, SKIP_MASTER); total: 0 errors, 4 warnings, 467 lines checked Rgds, Dave.