* [dpdk-dev] [PATCH 0/2] new headroom stats library and example application
From: Pawel Wodkowski @ 2015-01-29 11:50 UTC (permalink / raw)
To: dev
Hi community,
I would like to introduce a library for measuring the load of arbitrary jobs. It
can be used to profile any kind of job set on any execution unit.
In the provided l2fwd-headroom example I demonstrate how to use this library to
profile packet forwarding (the job set is forward, flush and stats) on lcores
(the execution units). This example does not limit the schemes in which this
library can be used.
Pawel Wodkowski (2):
librte_headroom: New library for checking core/system/app load
examples: introduce new l2fwd-headroom example
config/common_bsdapp | 6 +
config/common_linuxapp | 6 +
examples/Makefile | 1 +
examples/l2fwd-headroom/Makefile | 51 +++
examples/l2fwd-headroom/main.c | 875 ++++++++++++++++++++++++++++++++++++
lib/Makefile | 1 +
lib/librte_headroom/Makefile | 50 +++
lib/librte_headroom/rte_headroom.c | 368 +++++++++++++++
lib/librte_headroom/rte_headroom.h | 481 ++++++++++++++++++++
mk/rte.app.mk | 4 +
10 files changed, 1843 insertions(+)
create mode 100644 examples/l2fwd-headroom/Makefile
create mode 100644 examples/l2fwd-headroom/main.c
create mode 100644 lib/librte_headroom/Makefile
create mode 100644 lib/librte_headroom/rte_headroom.c
create mode 100644 lib/librte_headroom/rte_headroom.h
--
1.7.9.5
* [dpdk-dev] [PATCH 1/2] librte_headroom: New library for checking core/system/app load
From: Pawel Wodkowski @ 2015-01-29 11:50 UTC (permalink / raw)
To: dev
To calculate headroom we need some parts of code that do actual
work. Those parts of code are called jobs (not tasks, to avoid
confusion). Jobs are managed by the headroom library, which is responsible
for executing them when needed.
The rte_headroom_next_job() function waits for the first job
to become ready. When a job is ready, the time spent
waiting for it is added to the overall idle time and is also saved.
The job is then executed.
An executed job must return an integer value. This value
is used to calculate the next execution time (the time when the job will be
considered ready). For example: if the job is a forward job, it returns the
number of received packets. The returned value is then compared to the target
value. If the returned value differs from the target, the job's period, and
with it next_exec_time, is adjusted. The previously saved idle time is
considered to be the job's idle time (it is added to the job's idle time).
After execution of the last ready job, the number of loops is incremented
and the whole process starts all over again.
Please note that the reported headroom is not absolute. For example:
if an app has 100us of headroom on average, adding a job that consumes 90us
will not mean that there is 10us left. You need to run headroom profiling
again after adding this 90us job.
Additionally, the user can define their own handlers:
- idle handler - function called when no job is ready to execute.
- loop hook - function called when all ready jobs have been executed.
- job update period callback - used if something more sophisticated than the
default function is required to calculate a job's execution period.
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
config/common_bsdapp | 6 +
config/common_linuxapp | 6 +
lib/Makefile | 1 +
lib/librte_headroom/Makefile | 50 ++++
lib/librte_headroom/rte_headroom.c | 368 +++++++++++++++++++++++++++
lib/librte_headroom/rte_headroom.h | 481 ++++++++++++++++++++++++++++++++++++
mk/rte.app.mk | 4 +
7 files changed, 916 insertions(+)
create mode 100644 lib/librte_headroom/Makefile
create mode 100644 lib/librte_headroom/rte_headroom.c
create mode 100644 lib/librte_headroom/rte_headroom.h
diff --git a/config/common_bsdapp b/config/common_bsdapp
index 9177db1..eca9299 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -282,6 +282,12 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+CONFIG_RTE_HEADROOM_MAX_JOBS=32
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 2f9643b..54c9458 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -290,6 +290,12 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+CONFIG_RTE_HEADROOM_MAX_JOBS=32
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/lib/Makefile b/lib/Makefile
index 0ffc982..ab9e474 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -53,6 +53,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
+DIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += librte_headroom
DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
diff --git a/lib/librte_headroom/Makefile b/lib/librte_headroom/Makefile
new file mode 100644
index 0000000..f0137e3
--- /dev/null
+++ b/lib/librte_headroom/Makefile
@@ -0,0 +1,50 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_headroom.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_HEADROOM) := rte_headroom.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_HEADROOM)-include := rte_headroom.h
+
+# this lib needs eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += lib/librte_mbuf
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_headroom/rte_headroom.c b/lib/librte_headroom/rte_headroom.c
new file mode 100644
index 0000000..5f19c83
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom.c
@@ -0,0 +1,368 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+#include <errno.h>
+
+#include <rte_errno.h>
+#include <rte_common.h>
+#include <rte_cycles.h>
+#include <rte_branch_prediction.h>
+#include <rte_debug.h>
+
+#include "rte_headroom.h"
+
+/* Those are steps used to adjust job period.
+ * Experiments show that for forwarding apps the up step must be less than down
+ * step to achieve optimal performance.
+ */
+#define JOB_UPDATE_STEP_UP 1
+#define JOB_UPDATE_STEP_DOWN 4
+
+/*
+ * Default update function that implements simple period adjustment.
+ */
+static void
+default_update_function(struct rte_headroom_job *job, int64_t result)
+{
+ int64_t err = job->job_target - result;
+
+ /* Job is happy. Nothing to do */
+ if (err == 0)
+ return;
+
+ if (err > 0) {
+ if (job->period + JOB_UPDATE_STEP_UP < job->max_period)
+ job->period += JOB_UPDATE_STEP_UP;
+ } else {
+ if (job->min_period + JOB_UPDATE_STEP_DOWN < job->period)
+ job->period -= JOB_UPDATE_STEP_DOWN;
+ }
+}
+
+static uint32_t
+select_next_job(struct rte_headroom *hdr)
+{
+ const uint64_t now = rte_get_timer_cycles();
+ const uint32_t count = hdr->job_count;
+ uint32_t idx = hdr->job_idx;
+
+ for (; idx < count; idx++) {
+ if (hdr->jobs[idx].next_exec_time <= now)
+ break;
+ }
+
+ hdr->job_idx = idx;
+ return idx;
+}
+
+static void
+default_loop_hook(__rte_unused struct rte_headroom *hdr)
+{
+ rte_pause();
+}
+
+static void
+default_idle_hook(__rte_unused struct rte_headroom *hdr)
+{
+ rte_pause();
+}
+
+int
+rte_headroom_init(struct rte_headroom *hdr)
+{
+ if (hdr == NULL)
+ return -EINVAL;
+
+ memset(hdr, 0, sizeof(*hdr));
+
+ /* Set some initial values */
+ hdr->idle_hook = &default_idle_hook;
+ hdr->loop_hook = &default_loop_hook;
+
+ hdr->user_data = NULL;
+
+ return 0;
+}
+
+void
+rte_headroom_deinit(struct rte_headroom *hdr)
+{
+ RTE_VERIFY(hdr != NULL);
+
+ hdr->job_count = 0;
+ hdr->job_idx = 0;
+}
+
+void
+rte_headroom_set_user_data(struct rte_headroom *hdr, void *user_data)
+{
+ hdr->user_data = user_data;
+}
+
+void
+rte_headroom_set_loop_hook(struct rte_headroom *hdr,
+ rte_headroom_loop_hook_t loop_end_hook)
+{
+ if (loop_end_hook == NULL)
+ loop_end_hook = default_loop_hook;
+
+ hdr->loop_hook = loop_end_hook;
+}
+
+void
+rte_headroom_set_idle_hook(struct rte_headroom *hdr,
+ rte_headroom_idle_hook_t idle_hook)
+{
+ if (idle_hook == NULL)
+ idle_hook = default_idle_hook;
+
+ hdr->idle_hook = idle_hook;
+}
+
+static void
+update_stats(struct rte_headroom_stats *stats, uint64_t idle, uint64_t runtime)
+{
+ stats->idle += idle;
+
+ if (idle < stats->idle_min)
+ stats->idle_min = idle;
+
+ if (idle > stats->idle_max)
+ stats->idle_max = idle;
+
+ stats->run_time += runtime;
+
+ if (runtime < stats->run_time_min)
+ stats->run_time_min = runtime;
+
+ if (runtime > stats->run_time_max)
+ stats->run_time_max = runtime;
+
+ stats->exec_cnt++;
+}
+
+int
+rte_headroom_next_job(struct rte_headroom *hdr)
+{
+ uint64_t start_time, run_time, idle_time = 0, now;
+ int64_t retval;
+ struct rte_headroom_job *job;
+
+ if (unlikely(hdr == NULL))
+ return -EINVAL;
+
+ if (unlikely(hdr->job_count == 0))
+ return -ENOENT;
+
+ /* Wait for any job to be ready. */
+ if (unlikely(select_next_job(hdr) == hdr->job_count)) {
+ /* All jobs done. Update statistics and go to the next loop. */
+ update_stats(&hdr->stats, hdr->loop_idle, hdr->loop_runtime);
+ hdr->loop_idle = 0;
+ hdr->loop_runtime = 0;
+ hdr->job_idx = 0;
+
+ /* Execute loop end hook. */
+ (*hdr->loop_hook)(hdr);
+
+ /* Get a ready job or wait for the first one to become ready. */
+ while (select_next_job(hdr) == hdr->job_count) {
+ hdr->job_idx = 0;
+ (*hdr->idle_hook)(hdr);
+ }
+ }
+
+ /* Calculate idle time for this job. Idle time is the time from the
+ * end of the previous execution until the next job becomes ready. */
+ start_time = rte_get_timer_cycles();
+ idle_time = start_time - hdr->loop_end_time;
+
+ job = &hdr->jobs[hdr->job_idx];
+ retval = (*job->job_cb)(job);
+
+ (*job->update_period_cb)(job, retval);
+ job->next_exec_time += job->period;
+
+ now = rte_get_timer_cycles();
+ if (job->next_exec_time < now)
+ job->next_exec_time = now;
+
+ run_time = now - start_time;
+
+ /* Update job stats. */
+ update_stats(&job->stats, idle_time, run_time);
+
+ /* Update this loop stats. */
+ hdr->loop_idle += idle_time;
+ hdr->loop_runtime += run_time;
+
+ /* Mark time when this iteration ended */
+ hdr->loop_end_time = rte_get_timer_cycles();
+
+ /* Try next job. */
+ hdr->job_idx++;
+
+ /* Job finished */
+ return 0;
+}
+
+struct rte_headroom_job *
+rte_headroom_find_job(struct rte_headroom *hdr,
+ rte_headroom_job_callbak_t job_cb, void *job_data)
+{
+ size_t i;
+
+ if (unlikely(job_cb == NULL)) {
+ rte_errno = EINVAL;
+ return NULL;
+ }
+
+ /* Search the array. */
+ for (i = 0; i < hdr->job_count; i++) {
+ if (hdr->jobs[i].job_cb == job_cb &&
+ hdr->jobs[i].job_data == job_data)
+ return &hdr->jobs[i];
+ }
+
+ if (hdr->job_count == 0)
+ rte_errno = ENOENT;
+
+ return NULL;
+}
+
+struct rte_headroom_job *
+rte_headroom_find_job_by_name(struct rte_headroom *hdr, const char *job_name)
+{
+ size_t i;
+
+ if (unlikely(job_name == NULL)) {
+ rte_errno = EINVAL;
+ return NULL;
+ }
+
+ /* Search the array. */
+ for (i = 0; i < hdr->job_count; i++) {
+ if (hdr->jobs[i].name[0] == '\0')
+ continue;
+
+ if (strcmp(hdr->jobs[i].name, job_name) == 0)
+ return &hdr->jobs[i];
+ }
+
+ if (hdr->job_count == 0)
+ rte_errno = ENOENT;
+
+ return NULL;
+}
+
+int
+rte_headroom_add_job(struct rte_headroom *hdr, const char *name,
+ rte_headroom_job_callbak_t job_cb, void *job_data, uint64_t min_period,
+ uint64_t max_period, uint64_t initial_period, int64_t target,
+ struct rte_headroom_job **job_handle)
+{
+ struct rte_headroom_job *job;
+
+ if (hdr == NULL || job_cb == NULL)
+ return -EINVAL;
+
+ /* Add job by finding free slot. */
+ if (hdr->job_count == RTE_DIM(hdr->jobs))
+ return -ENOBUFS;
+
+ job = &hdr->jobs[hdr->job_count];
+ hdr->job_count++;
+
+ memset(job, 0, sizeof(*job));
+ job->next_exec_time = rte_get_timer_cycles();
+ job->job_cb = job_cb;
+ job->job_data = job_data;
+
+ job->update_period_cb = default_update_function;
+
+ if (initial_period <= min_period)
+ job->period = min_period;
+ else if (initial_period >= max_period)
+ job->period = max_period;
+ else
+ job->period = initial_period;
+
+ job->min_period = min_period;
+ job->max_period = max_period;
+ job->job_target = target;
+
+ if (name != NULL)
+ strncpy(job->name, name, RTE_DIM(job->name) - 1);
+ else
+ memset(job->name, 0, sizeof(job->name));
+
+ job->headroom = hdr;
+
+ if (job_handle)
+ *job_handle = job;
+ return 0;
+}
+
+int
+rte_headroom_del_job(struct rte_headroom_job *job)
+{
+ struct rte_headroom *hdr;
+ size_t cnt, idx;
+
+ if (unlikely(job == NULL))
+ return -EINVAL;
+
+ hdr = job->headroom;
+ RTE_VERIFY(hdr->jobs <= job &&
+ (size_t)(job - hdr->jobs) < RTE_DIM(hdr->jobs));
+
+ idx = job - hdr->jobs;
+ cnt = hdr->job_count - idx - 1;
+ hdr->job_count--;
+
+ if (cnt > 0)
+ memmove(&hdr->jobs[idx], &hdr->jobs[idx + 1], sizeof(*job) * cnt);
+
+ return hdr->job_count;
+}
+
+void
+rte_headroom_set_update_period_function(struct rte_headroom_job *job,
+ rte_headroom_update_fn_t update_pedriod_cb)
+{
+ if (update_pedriod_cb == NULL)
+ update_pedriod_cb = default_update_function;
+
+ job->update_period_cb = update_pedriod_cb;
+}
diff --git a/lib/librte_headroom/rte_headroom.h b/lib/librte_headroom/rte_headroom.h
new file mode 100644
index 0000000..20f99fd
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom.h
@@ -0,0 +1,481 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef HEADROOM_H_
+#define HEADROOM_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#if RTE_HEADROOM_MAX_JOBS < 1 || RTE_HEADROOM_MAX_JOBS >= 0xFFFF
+# error Define RTE_HEADROOM_MAX_JOBS to be at least 1 and less than 65535
+#endif
+
+/* Forward declarations. */
+struct rte_headroom;
+struct rte_headroom_job;
+
+typedef void (*rte_headroom_idle_hook_t)(struct rte_headroom *hdr);
+typedef void (*rte_headroom_loop_hook_t)(struct rte_headroom *hdr);
+typedef int64_t (*rte_headroom_job_callbak_t)(struct rte_headroom_job *job);
+
+/**
+ * This function should calculate new period and set it using
+ * rte_headroom_set_period() function. Time spent in this function will be
+ * added to job's runtime.
+ *
+ * @param job
+ * The job data structure handler.
+ * @param job_result
+ * Result of calling job callback.
+ */
+typedef void (*rte_headroom_update_fn_t)(struct rte_headroom_job *job,
+ int64_t job_result);
+
+struct rte_headroom_stats {
+ uint64_t idle;
+ /**< Sum of idle time before this job became ready. */
+
+ uint64_t idle_min;
+ /**< Minimum idle time. */
+
+ uint64_t idle_max;
+ /**< Maximum idle time. */
+
+ uint64_t run_time;
+ /**< Total time that this job was executing. */
+
+ uint64_t run_time_min;
+ /**< Minimum execute time. */
+
+ uint64_t run_time_max;
+ /**< Maximum execute time. */
+
+ uint64_t exec_cnt;
+ /**< Job execute count. */
+};
+
+struct rte_headroom_job {
+ uint64_t next_exec_time;
+ /**< Next time when job should be executed. */
+
+ rte_headroom_job_callbak_t job_cb;
+ /**< Callback function for this job. */
+
+ void *job_data;
+ /**< Pointer to custom job's data. */
+
+ rte_headroom_update_fn_t update_period_cb;
+ /**< Period update callback. */
+
+ uint64_t period;
+ /**< Estimated period of execution. */
+
+ uint64_t min_period;
+ /**< Minimum period. */
+
+ uint64_t max_period;
+ /**< Maximum period. */
+
+ struct rte_headroom_stats stats;
+
+ int64_t job_target;
+ /**< Desired value for this job. */
+
+#define RTE_HEADROOM_JOB_NAMESIZE 32
+ char name[RTE_HEADROOM_JOB_NAMESIZE];
+ /**< Name of this job */
+
+ struct rte_headroom *headroom;
+ /**< Headroom object that this job belongs to. */
+} __rte_cache_aligned;
+
+struct rte_headroom {
+ struct rte_headroom_job jobs[RTE_HEADROOM_MAX_JOBS];
+ /**< Array of job objects. */
+ uint32_t job_count;
+ /**< Job count. */
+ uint32_t job_idx;
+ /**< Index of job that is executing. */
+ struct rte_headroom_stats stats;
+
+ uint64_t loop_end_time;
+ /**< Time when last loop finished its execution. */
+
+ uint64_t loop_idle;
+ uint64_t loop_runtime;
+ /**< This loop idle time. */
+
+ rte_headroom_idle_hook_t idle_hook;
+ rte_headroom_loop_hook_t loop_hook;
+
+ void *user_data;
+ /**< User specified data for all user hooks. Not used by default hooks. */
+} __rte_cache_aligned;
+
+/**
+ * Allocate resource and initialize given headroom object with default
+ * values.
+ *
+ * @param hdr
+ * The headroom object to be initialized.
+ *
+ * @return
+ * 0 if successful or negative error value:
+ * -EINVAL - if hdr is NULL
+ */
+int
+rte_headroom_init(struct rte_headroom *hdr);
+
+/**
+ * Deallocate any reserved resources and make the headroom object invalid.
+ *
+ * @pre All jobs from headroom object
+ *
+ * @param hdr The headroom object to be deinitialized.
+ */
+void
+rte_headroom_deinit(struct rte_headroom *hdr);
+
+/**
+ * Set new user data for headroom hooks.
+ *
+ * @param hdr
+ * Headroom object whose user data is set.
+ * @param user_data
+ * New user data for all user hooks. Can be NULL.
+ */
+void
+rte_headroom_set_user_data(struct rte_headroom *hdr, void *user_data);
+
+/**
+ * Set hook function executed when all jobs are executed in this loop turn.
+ *
+ * Time spent in this function is considered idle and counted as headroom. User
+ * application should not do any time consuming tasks in this hook as it can
+ * delay execution of job that became ready during time spent in this function.
+ *
+ * @param hdr
+ * Headroom object to change a hook.
+ * @param loop_end_hook
+ * New hook function. Can be NULL.
+ */
+void
+rte_headroom_set_loop_hook(struct rte_headroom *hdr,
+ rte_headroom_loop_hook_t loop_end_hook);
+
+/**
+ * Set hook function executed when no job is ready to execute (idle loop).
+ *
+ * Time spent in this function is considered idle and counted as headroom. User
+ * application should not do any time consuming tasks in this hook as it can
+ * delay execution of job that became ready during time spent in this function.
+ *
+ * @param hdr
+ * Headroom object to change a hook.
+ * @param idle_hook
+ * New hook function. Can be NULL.
+ */
+void
+rte_headroom_set_idle_hook(struct rte_headroom *hdr,
+ rte_headroom_idle_hook_t idle_hook);
+
+/**
+ * Wait for next job to be ready and execute it.
+ *
+ * @param hdr
+ * Headroom object that contains jobs.
+ *
+ * @return
+ * 0 on success or negative error code:
+ * -EINVAL if hdr is NULL.
+ * -ENOENT if no jobs have been added.
+ */
+int
+rte_headroom_next_job(struct rte_headroom *hdr);
+
+/**
+ * Return the first job handle that matches the callback and data parameters.
+ *
+ * @param hdr
+ * Headroom object to be searched.
+ * @param job_cb
+ * Job callback to find.
+ * @param job_data
+ * Job's private data pointer to find.
+ *
+ * @return
+ * Job object
+ * NULL if job is not found.
+ * NULL and set rte_errno:
+ * EINVAL if job_cb is NULL
+ * ENOENT if no jobs have been added to the headroom object
+ */
+struct rte_headroom_job *
+rte_headroom_find_job(struct rte_headroom *hdr,
+ rte_headroom_job_callbak_t job_cb, void *job_data);
+
+/**
+ * Return the first job handle that matches the given name. Jobs without
+ * a name can't be found.
+ *
+ * @param hdr
+ * Headroom object to be searched.
+ * @param job_name
+ * Name of job to be found.
+ *
+ * @return
+ * Job object
+ * NULL if job is not found.
+ * NULL and set rte_errno:
+ * EINVAL if job_name is NULL
+ * ENOENT if no jobs have been added to the headroom object
+ */
+struct rte_headroom_job *
+rte_headroom_find_job_by_name(struct rte_headroom *hdr, const char *job_name);
+
+/**
+ * Return current number of jobs in headroom object.
+ * @param hdr
+ * Headroom object to interrogate.
+ * @return
+ * Number of jobs.
+ */
+static inline uint16_t
+rte_headroom_job_count(struct rte_headroom *hdr)
+{
+ return hdr->job_count;
+}
+
+/**
+ * Add new job to headroom object.
+ *
+ * @param hdr The headroom object to which job will be added.
+ * @param name Name of the new job. Can be NULL.
+ * @param job_cb Callback for this job.
+ * @param job_data Job's private data pointer.
+ * @param job_handle Output - added job handle. Can be NULL.
+ *
+ * @return 0 on success or negative error code:
+ * -EINVAL if hdr or job_cb is NULL.
+ * -ENOBUFS if there is no space in headroom object to add new job.
+ */
+int
+rte_headroom_add_job(struct rte_headroom *hdr, const char *name,
+ rte_headroom_job_callbak_t job_cb, void *job_data, uint64_t min_period,
+ uint64_t max_period, uint64_t initial_period, int64_t target,
+ struct rte_headroom_job **job_handle);
+
+/**
+ * Remove given job from headroom.
+ *
+ * @param job
+ * Job to be removed.
+ *
+ * @return
+ * Count of jobs remaining in the headroom object to which the removed job
+ * belonged, or negative error value:
+ * -EINVAL job is NULL
+ */
+int
+rte_headroom_del_job(struct rte_headroom_job *job);
+
+/**
+ * Set job desired target value. Difference between target and job callback
+ * return value must be used to properly adjust job execute period value.
+ * @param job
+ * The job object.
+ * @param target
+ * New target.
+ */
+static inline void
+rte_headroom_set_target(struct rte_headroom_job *job, int64_t target)
+{
+ job->job_target = target;
+}
+
+/**
+ * Set execute period of given job.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New period value.
+ * @param saturate
+ * If zero, skip saturating the period to the [min, max] range.
+ */
+static inline void
+rte_headroom_set_period(struct rte_headroom_job *job, uint64_t period,
+ uint8_t saturate)
+{
+ if (saturate != 0) {
+ if (period < job->min_period)
+ period = job->min_period;
+ else if (period > job->max_period)
+ period = job->max_period;
+ }
+
+ job->period = period;
+}
+
+/**
+ * Set minimum execute period of given job.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New minimum period value.
+ */
+static inline void
+rte_headroom_set_min_period(struct rte_headroom_job *job, uint64_t period)
+{
+ job->min_period = period;
+ if (job->period < period)
+ job->period = period;
+}
+
+/**
+ * Set maximum execute period of given job.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New maximum period value.
+ */
+static inline void
+rte_headroom_set_max_period(struct rte_headroom_job *job, uint64_t period)
+{
+ job->max_period = period;
+ if (job->period > period)
+ job->period = period;
+}
+
+/**
+ * Set update period callback that is invoked after the job finishes.
+ * If application want to
+ *
+ * @param job
+ * Job object.
+ * @param update_pedriod_cb
+ * Callback to set. If NULL restore default update function.
+ */
+void
+rte_headroom_set_update_period_function(struct rte_headroom_job *job,
+ rte_headroom_update_fn_t update_pedriod_cb);
+
+/**
+ * Retrieve job stats.
+ *
+ * @param job
+ * Job which statistics will be copied.
+ * @param stats
+ * The output stats buffer.
+ *
+ */
+static inline void
+rte_headroom_get_job_stats(struct rte_headroom_job *job,
+ struct rte_headroom_stats *stats)
+{
+ rte_memcpy(stats, &job->stats, sizeof(job->stats));
+}
+
+/**
+ * Function resets job statistics.
+ *
+ * @param job
+ * Job which statistics will be reset.
+ */
+static inline void
+rte_headroom_reset_job_stats(struct rte_headroom_job *job)
+{
+ struct rte_headroom_stats *s = &job->stats;
+
+ s->idle = 0;
+ s->idle_min = UINT64_MAX;
+ s->idle_max = 0;
+
+ s->run_time = 0;
+ s->run_time_min = UINT64_MAX;
+ s->run_time_max = 0;
+
+ s->exec_cnt = 0;
+}
+
+/**
+ * Retrieve headroom stats.
+ *
+ * @param hdr
+ * Headroom which statistics will be copied.
+ * @param stats
+ * The output stats buffer.
+ */
+static inline void
+rte_headroom_get_stats(struct rte_headroom *hdr,
+ struct rte_headroom_stats *stats)
+{
+ rte_memcpy(stats, &hdr->stats, sizeof(hdr->stats));
+}
+
+/**
+ * Function resets headroom statistics.
+ *
+ * @param hdr
+ * Headroom which statistics will be reset.
+ */
+static inline void
+rte_headroom_reset_stats(struct rte_headroom *hdr)
+{
+ struct rte_headroom_stats *s = &hdr->stats;
+
+ s->idle = 0;
+ s->idle_min = UINT64_MAX;
+ s->idle_max = 0;
+
+ s->run_time = 0;
+ s->run_time_min = UINT64_MAX;
+ s->run_time_max = 0;
+
+ s->exec_cnt = 0;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* HEADROOM_H_ */
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 40afb2c..bc2af88 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -103,6 +103,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_HASH),y)
LDLIBS += -lrte_hash
endif
+ifeq ($(CONFIG_RTE_LIBRTE_HEADROOM),y)
+LDLIBS += -lrte_headroom
+endif
+
ifeq ($(CONFIG_RTE_LIBRTE_LPM),y)
LDLIBS += -lrte_lpm
endif
--
1.7.9.5
* [dpdk-dev] [PATCH 2/2] examples: introduce new l2fwd-headroom example
From: Pawel Wodkowski @ 2015-01-29 11:50 UTC (permalink / raw)
To: dev
This app demonstrates usage of the new headroom library.
It is basically the original l2fwd with the following modifications to meet
headroom library requirements:
- main_loop() was split into two jobs: forward job and flush job. Logic
for those jobs is almost the same as in the original application.
- stats are moved to their own job.
- If there are more lcores available than queues/ports, the stats job is
run on the first free core, otherwise it is run on the master core.
- stats are expanded to show headroom statistics.
Comparing the original l2fwd and l2fwd-headroom apps shows what
is needed to properly write your own application with headroom measurements.
Please note that assigning a separate core for printing stats is
preferred because flushing stdout is terribly slow and might impact
headroom statistics.
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
examples/Makefile | 1 +
examples/l2fwd-headroom/Makefile | 51 +++
examples/l2fwd-headroom/main.c | 875 ++++++++++++++++++++++++++++++++++++++
3 files changed, 927 insertions(+)
create mode 100644 examples/l2fwd-headroom/Makefile
create mode 100644 examples/l2fwd-headroom/main.c
diff --git a/examples/Makefile b/examples/Makefile
index 81f1d2f..8a459b7 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ip_fragmentation
DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
DIRS-y += l2fwd
+DIRS-y += l2fwd-headroom
DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
DIRS-y += l3fwd
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
diff --git a/examples/l2fwd-headroom/Makefile b/examples/l2fwd-headroom/Makefile
new file mode 100644
index 0000000..07da286
--- /dev/null
+++ b/examples/l2fwd-headroom/Makefile
@@ -0,0 +1,51 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l2fwd-headroom
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l2fwd-headroom/main.c b/examples/l2fwd-headroom/main.c
new file mode 100644
index 0000000..4a6c392
--- /dev/null
+++ b/examples/l2fwd-headroom/main.c
@@ -0,0 +1,875 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <ctype.h>
+#include <getopt.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+#include <rte_memzone.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_launch.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_prefetch.h>
+#include <rte_lcore.h>
+#include <rte_per_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_interrupts.h>
+#include <rte_pci.h>
+#include <rte_debug.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+#include <rte_ring.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_errno.h>
+#include <rte_headroom.h>
+
+#define RTE_LOGTYPE_L2FWD RTE_LOGTYPE_USER1
+
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NB_MBUF 8192
+
+#define MAX_PKT_BURST 32
+#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */
+
+/*
+ * Configurable number of RX/TX ring descriptors
+ */
+#define RTE_TEST_RX_DESC_DEFAULT 128
+#define RTE_TEST_TX_DESC_DEFAULT 512
+static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
+static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
+
+/* ethernet addresses of ports */
+static struct ether_addr l2fwd_ports_eth_addr[RTE_MAX_ETHPORTS];
+
+/* mask of enabled ports */
+static uint32_t l2fwd_enabled_port_mask;
+
+/* list of enabled ports */
+static uint32_t l2fwd_dst_ports[RTE_MAX_ETHPORTS];
+
+static unsigned int l2fwd_rx_queue_per_lcore = 1;
+
+struct mbuf_table {
+ uint64_t next_flush_time;
+ unsigned len;
+ struct rte_mbuf *mbufs[MAX_PKT_BURST];
+};
+
+#define MAX_RX_QUEUE_PER_LCORE 16
+#define MAX_TX_QUEUE_PER_PORT 16
+struct lcore_queue_conf {
+ unsigned n_rx_port;
+ unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
+ struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+
+ struct rte_headroom headroom;
+
+} __rte_cache_aligned;
+struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];
+
+static const struct rte_eth_conf port_conf = {
+ .rxmode = {
+ .split_hdr_size = 0,
+ .header_split = 0, /**< Header Split disabled */
+ .hw_ip_checksum = 0, /**< IP checksum offload disabled */
+ .hw_vlan_filter = 0, /**< VLAN filtering disabled */
+ .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
+ .hw_strip_crc = 0, /**< CRC stripped by hardware */
+ },
+ .txmode = {
+ .mq_mode = ETH_MQ_TX_NONE,
+ },
+};
+
+struct rte_mempool *l2fwd_pktmbuf_pool = NULL;
+
+/* Per-port statistics struct */
+struct l2fwd_port_statistics {
+ uint64_t tx;
+ uint64_t rx;
+ uint64_t dropped;
+} __rte_cache_aligned;
+struct l2fwd_port_statistics port_statistics[RTE_MAX_ETHPORTS];
+
+/* 1 day max */
+#define MAX_TIMER_PERIOD 86400
+/* default period is 10 seconds */
+static int64_t timer_period = 10;
+/* default timer frequency */
+static uint64_t hz;
+/* BURST_TX_DRAIN_US converted to cycles */
+uint64_t drain_tsc;
+/* Convert cycles to ns */
+static inline uint64_t
+cycles_to_ns(uint64_t cycles)
+{
+ double t = cycles;
+ t *= NS_PER_S;
+ t /= hz;
+ return t;
+}
+
+/* Print out statistics on packets dropped */
+static int64_t
+print_stats_job(struct rte_headroom_job *this_job)
+{
+ struct rte_headroom *hdr;
+ struct rte_headroom_job *job;
+ struct rte_headroom_stats stats;
+ uint64_t total_packets_dropped, total_packets_tx, total_packets_rx;
+ uint64_t stats_start = rte_get_timer_cycles();
+ unsigned portid, lcore_id;
+ uint32_t job_idx;
+
+ total_packets_dropped = 0;
+ total_packets_tx = 0;
+ total_packets_rx = 0;
+
+ const char clr[] = { 27, '[', '2', 'J', '\0' };
+ const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
+
+ /* Clear screen and move to top left */
+ printf("%s%s"
+ "\nPort statistics ====================================",
+ clr, topLeft);
+
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ /* skip disabled ports */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+ printf("\nStatistics for port %u ------------------------------"
+ "\nPackets sent: %24"PRIu64
+ "\nPackets received: %20"PRIu64
+ "\nPackets dropped: %21"PRIu64,
+ portid,
+ port_statistics[portid].tx,
+ port_statistics[portid].rx,
+ port_statistics[portid].dropped);
+
+ total_packets_dropped += port_statistics[portid].dropped;
+ total_packets_tx += port_statistics[portid].tx;
+ total_packets_rx += port_statistics[portid].rx;
+ }
+
+ printf("\nAggregate statistics ==============================="
+ "\nTotal packets sent: %18"PRIu64
+ "\nTotal packets received: %14"PRIu64
+ "\nTotal packets dropped: %15"PRIu64
+ "\n====================================================\n",
+ total_packets_tx,
+ total_packets_rx,
+ total_packets_dropped);
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ if (lcore_queue_conf[lcore_id].n_rx_port == 0)
+ continue;
+
+ hdr = &lcore_queue_conf[lcore_id].headroom;
+ rte_headroom_get_stats(hdr, &stats);
+
+ printf("\nLCore %u: headroom statistics (time in ns) ========"
+ "\nLoop count: %26"PRIu64
+ "\nTotal headroom: %22"PRIu64
+ "\nHeadroom per loop: %19"PRIu64
+ "\nHeadroom min: %24"PRIu64
+ "\nHeadroom max: %24"PRIu64
+ "\nLoop min time: %23"PRIu64
+ "\nLoop max time: %23"PRIu64,
+ lcore_id,
+ stats.exec_cnt,
+ cycles_to_ns(stats.idle),
+ cycles_to_ns(stats.exec_cnt ? stats.idle / stats.exec_cnt : 0),
+ cycles_to_ns(stats.idle_min),
+ cycles_to_ns(stats.idle_max),
+ cycles_to_ns(stats.run_time_min),
+ cycles_to_ns(stats.run_time_max));
+
+ for (job_idx = 0; job_idx < hdr->job_count; job_idx++) {
+ job = &hdr->jobs[job_idx];
+ rte_headroom_get_job_stats(job, &stats);
+ rte_headroom_reset_job_stats(job);
+
+ printf("\nJob %" PRIu32 ":%20s -----------------------"
+ "\nExec count: %26"PRIu64
+ "\nExec period: %25"PRIu64
+ "\nTotal headroom: %22"PRIu64
+ "\nHeadroom per exec: %19"PRIu64
+ "\nHeadroom min: %24"PRIu64
+ "\nHeadroom max: %24"PRIu64
+ "\nExec min time: %23"PRIu64
+ "\nExec max time: %23"PRIu64,
+ job_idx, job->name,
+ stats.exec_cnt,
+ cycles_to_ns(job->period),
+ cycles_to_ns(stats.idle),
+ cycles_to_ns(stats.exec_cnt ? stats.idle / stats.exec_cnt : 0),
+ cycles_to_ns(stats.idle_min),
+ cycles_to_ns(stats.idle_max),
+ cycles_to_ns(stats.run_time_min),
+ cycles_to_ns(stats.run_time_max));
+ }
+
+ rte_headroom_reset_stats(hdr);
+ }
+
+ printf("\n==== Stats gen time %19"PRIu64" ========= \n",
+ cycles_to_ns(rte_get_timer_cycles() - stats_start));
+
+ /* Return the setpoint to indicate that this job is happy with the time
+ * interval at which it was called. */
+ return this_job->job_target;
+}
+
+/* Send the burst of packets on an output interface */
+static void
+l2fwd_send_burst(struct lcore_queue_conf *qconf, uint8_t port)
+{
+ struct mbuf_table *m_table;
+ uint16_t ret;
+ uint16_t queueid = 0;
+ uint16_t n;
+
+ m_table = &qconf->tx_mbufs[port];
+ n = m_table->len;
+
+ m_table->next_flush_time = rte_get_timer_cycles() + drain_tsc;
+ m_table->len = 0;
+
+ ret = rte_eth_tx_burst(port, queueid, m_table->mbufs, n);
+
+ port_statistics[port].tx += ret;
+ if (unlikely(ret < n)) {
+ port_statistics[port].dropped += (n - ret);
+ do {
+ rte_pktmbuf_free(m_table->mbufs[ret]);
+ } while (++ret < n);
+ }
+}
+
+/* Enqueue packets for TX and prepare them to be sent */
+static int
+l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
+{
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct mbuf_table *m_table = &qconf->tx_mbufs[port];
+ uint16_t len = qconf->tx_mbufs[port].len;
+
+ m_table->mbufs[len] = m;
+
+ len++;
+ m_table->len = len;
+
+ /* Enough pkts to be sent. */
+ if (unlikely(len == MAX_PKT_BURST))
+ l2fwd_send_burst(qconf, port);
+
+ return 0;
+}
+
+static void
+l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
+{
+ struct ether_hdr *eth;
+ void *tmp;
+ unsigned dst_port;
+
+ dst_port = l2fwd_dst_ports[portid];
+ eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ /* 02:00:00:00:00:xx */
+ tmp = &eth->d_addr.addr_bytes[0];
+ *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40);
+
+ /* src addr */
+ ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], &eth->s_addr);
+
+ l2fwd_send_packet(m, (uint8_t) dst_port);
+}
+
+static int64_t
+l2fwd_fwd_job(struct rte_headroom_job *job)
+{
+ struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+ struct rte_mbuf *m;
+
+ unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ const uint8_t port_idx = (uint8_t) (uintptr_t) job->job_data;
+ const uint8_t portid = qconf->rx_port_list[port_idx];
+ uint8_t j;
+ uint16_t nb_rx, total_nb_rx;
+
+ /* Call rx burst 2 times. This allows the headroom logic to see if this
+ * function must be called more frequently. */
+
+ nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst, MAX_PKT_BURST);
+
+ total_nb_rx = nb_rx;
+ port_statistics[portid].rx += nb_rx;
+
+ for (j = 0; j < nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+
+ if (nb_rx < MAX_PKT_BURST)
+ return total_nb_rx;
+
+ nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst, MAX_PKT_BURST);
+
+ total_nb_rx += nb_rx;
+ port_statistics[portid].rx += nb_rx;
+
+ for (j = 0; j < nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+
+ return total_nb_rx;
+}
+
+static int64_t
+l2fwd_flush_job(struct rte_headroom_job *job)
+{
+ const uint64_t now = rte_get_timer_cycles();
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct mbuf_table *m_table;
+ uint8_t portid;
+
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ m_table = &qconf->tx_mbufs[portid];
+ if (m_table->len == 0 || m_table->next_flush_time <= now)
+ continue;
+
+ l2fwd_send_burst(qconf, portid);
+ }
+
+ /* Return the setpoint to indicate that this job is happy with the time
+ * interval at which it was called. */
+ return job->job_target;
+}
+
+/* main processing loop */
+static void
+l2fwd_main_loop(void)
+{
+ unsigned lcore_id;
+ unsigned i, portid;
+ struct lcore_queue_conf *qconf;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ if (rte_headroom_job_count(&qconf->headroom) == 0) {
+ RTE_LOG(INFO, L2FWD, "lcore %u has nothing to do\n", lcore_id);
+ return;
+ }
+
+ RTE_LOG(INFO, L2FWD, "entering main loop on lcore %u\n", lcore_id);
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+
+ portid = qconf->rx_port_list[i];
+ RTE_LOG(INFO, L2FWD, " -- lcoreid=%u portid=%u\n", lcore_id,
+ portid);
+ }
+
+ while (1)
+ rte_headroom_next_job(&qconf->headroom);
+}
+
+static int
+l2fwd_launch_one_lcore(__attribute__((unused)) void *dummy)
+{
+ l2fwd_main_loop();
+ return 0;
+}
+
+/* display usage */
+static void
+l2fwd_usage(const char *prgname)
+{
+ printf("%s [EAL options] -- -p PORTMASK [-q NQ]\n"
+ " -p PORTMASK: hexadecimal bitmask of ports to configure\n"
+ " -q NQ: number of queue (=ports) per lcore (default is 1)\n"
+ " -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default, 86400 maximum)\n",
+ prgname);
+}
+
+static int
+l2fwd_parse_portmask(const char *portmask)
+{
+ char *end = NULL;
+ unsigned long pm;
+
+ /* parse hexadecimal string */
+ pm = strtoul(portmask, &end, 16);
+ if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+
+ if (pm == 0)
+ return -1;
+
+ return pm;
+}
+
+static unsigned int
+l2fwd_parse_nqueue(const char *q_arg)
+{
+ char *end = NULL;
+ unsigned long n;
+
+ /* parse decimal string */
+ n = strtoul(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return 0;
+ if (n == 0)
+ return 0;
+ if (n >= MAX_RX_QUEUE_PER_LCORE)
+ return 0;
+
+ return n;
+}
+
+static int
+l2fwd_parse_timer_period(const char *q_arg)
+{
+ char *end = NULL;
+ int n;
+
+ /* parse number string */
+ n = strtol(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+ if (n >= MAX_TIMER_PERIOD)
+ return -1;
+
+ return n;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+l2fwd_parse_args(int argc, char **argv)
+{
+ int opt, ret;
+ char **argvopt;
+ int option_index;
+ char *prgname = argv[0];
+ static struct option lgopts[] = {
+ {NULL, 0, 0, 0}
+ };
+
+ argvopt = argv;
+
+ while ((opt = getopt_long(argc, argvopt, "p:q:T:",
+ lgopts, &option_index)) != EOF) {
+
+ switch (opt) {
+ /* portmask */
+ case 'p':
+ l2fwd_enabled_port_mask = l2fwd_parse_portmask(optarg);
+ if (l2fwd_enabled_port_mask == 0) {
+ printf("invalid portmask\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* nqueue */
+ case 'q':
+ l2fwd_rx_queue_per_lcore = l2fwd_parse_nqueue(optarg);
+ if (l2fwd_rx_queue_per_lcore == 0) {
+ printf("invalid queue number\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* timer period */
+ case 'T':
+ timer_period = l2fwd_parse_timer_period(optarg);
+ if (timer_period < 0) {
+ printf("invalid timer period\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* long options */
+ case 0:
+ l2fwd_usage(prgname);
+ return -1;
+
+ default:
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ }
+
+ if (optind >= 0)
+ argv[optind-1] = prgname;
+
+ ret = optind-1;
+ optind = 0; /* reset getopt lib */
+ return ret;
+}
+
+/* Check the link status of all ports in up to 9s, and print them finally */
+static void
+check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+ uint8_t portid, count, all_ports_up, print_flag = 0;
+ struct rte_eth_link link;
+
+ printf("\nChecking link status");
+ fflush(stdout);
+ for (count = 0; count <= MAX_CHECK_TIME; count++) {
+ all_ports_up = 1;
+ for (portid = 0; portid < port_num; portid++) {
+ if ((port_mask & (1 << portid)) == 0)
+ continue;
+ memset(&link, 0, sizeof(link));
+ rte_eth_link_get_nowait(portid, &link);
+ /* print link status if flag set */
+ if (print_flag == 1) {
+ if (link.link_status)
+ printf("Port %d Link Up - speed %u "
+ "Mbps - %s\n", (uint8_t)portid,
+ (unsigned)link.link_speed,
+ (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+ ("full-duplex") : ("half-duplex\n"));
+ else
+ printf("Port %d Link Down\n",
+ (uint8_t)portid);
+ continue;
+ }
+ /* clear all_ports_up flag if any link down */
+ if (link.link_status == 0) {
+ all_ports_up = 0;
+ break;
+ }
+ }
+ /* after finally printing all link status, get out */
+ if (print_flag == 1)
+ break;
+
+ if (all_ports_up == 0) {
+ printf(".");
+ fflush(stdout);
+ rte_delay_ms(CHECK_INTERVAL);
+ }
+
+ /* set the print_flag if all ports up or timeout */
+ if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+ print_flag = 1;
+ printf("done\n");
+ }
+ }
+}
+
+int
+main(int argc, char **argv)
+{
+ struct lcore_queue_conf *qconf;
+ struct rte_eth_dev_info dev_info;
+ struct rte_headroom_job *job;
+ unsigned lcore_id, rx_lcore_id, stats_lcore;
+ unsigned nb_ports_in_mask = 0;
+ int ret;
+ uint8_t nb_ports;
+ uint8_t nb_ports_available;
+ uint8_t portid, last_port;
+ uint8_t i;
+ char job_name[RTE_HEADROOM_JOB_NAMESIZE];
+
+ /* init EAL */
+ ret = rte_eal_init(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+ argc -= ret;
+ argv += ret;
+
+ /* parse application arguments (after the EAL ones) */
+ ret = l2fwd_parse_args(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
+
+ /* fetch default timer frequency. */
+ hz = rte_get_timer_hz();
+
+ /* create the mbuf pool */
+ l2fwd_pktmbuf_pool =
+ rte_mempool_create("mbuf_pool", NB_MBUF,
+ MBUF_SIZE, 32,
+ sizeof(struct rte_pktmbuf_pool_private),
+ rte_pktmbuf_pool_init, NULL,
+ rte_pktmbuf_init, NULL,
+ rte_socket_id(), 0);
+ if (l2fwd_pktmbuf_pool == NULL)
+ rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+ nb_ports = rte_eth_dev_count();
+ if (nb_ports == 0)
+ rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
+
+ if (nb_ports > RTE_MAX_ETHPORTS)
+ nb_ports = RTE_MAX_ETHPORTS;
+
+ /* reset l2fwd_dst_ports */
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
+ l2fwd_dst_ports[portid] = 0;
+ last_port = 0;
+
+ /*
+ * Each logical core is assigned a dedicated TX queue on each port.
+ */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ if (nb_ports_in_mask % 2) {
+ l2fwd_dst_ports[portid] = last_port;
+ l2fwd_dst_ports[last_port] = portid;
+ } else
+ last_port = portid;
+
+ nb_ports_in_mask++;
+
+ rte_eth_dev_info_get(portid, &dev_info);
+ }
+ if (nb_ports_in_mask % 2) {
+ printf("Notice: odd number of ports in portmask.\n");
+ l2fwd_dst_ports[last_port] = last_port;
+ }
+
+ rx_lcore_id = 0;
+ qconf = NULL;
+
+ /* Initialize the port/queue configuration of each logical core */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ /* get the lcore_id for this port */
+ while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
+ lcore_queue_conf[rx_lcore_id].n_rx_port ==
+ l2fwd_rx_queue_per_lcore) {
+ rx_lcore_id++;
+ if (rx_lcore_id >= RTE_MAX_LCORE)
+ rte_exit(EXIT_FAILURE, "Not enough cores\n");
+ }
+
+ if (qconf != &lcore_queue_conf[rx_lcore_id])
+ /* Assigned a new logical core in the loop above. */
+ qconf = &lcore_queue_conf[rx_lcore_id];
+
+ qconf->rx_port_list[qconf->n_rx_port] = portid;
+ qconf->n_rx_port++;
+ printf("Lcore %u: RX port %u\n", rx_lcore_id, (unsigned) portid);
+ }
+
+ nb_ports_available = nb_ports;
+
+ /* Initialise each port */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0) {
+ printf("Skipping disabled port %u\n", (unsigned) portid);
+ nb_ports_available--;
+ continue;
+ }
+ /* init port */
+ printf("Initializing port %u... ", (unsigned) portid);
+ fflush(stdout);
+ ret = rte_eth_dev_configure(portid, 1, 1, &port_conf);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ rte_eth_macaddr_get(portid, &l2fwd_ports_eth_addr[portid]);
+
+ /* init one RX queue */
+ fflush(stdout);
+ ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+ rte_eth_dev_socket_id(portid),
+ NULL,
+ l2fwd_pktmbuf_pool);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* init one TX queue on each port */
+ fflush(stdout);
+ ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+ rte_eth_dev_socket_id(portid),
+ NULL);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* Start device */
+ ret = rte_eth_dev_start(portid);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ printf("done:\n");
+
+ rte_eth_promiscuous_enable(portid);
+
+ printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
+ (unsigned) portid,
+ l2fwd_ports_eth_addr[portid].addr_bytes[0],
+ l2fwd_ports_eth_addr[portid].addr_bytes[1],
+ l2fwd_ports_eth_addr[portid].addr_bytes[2],
+ l2fwd_ports_eth_addr[portid].addr_bytes[3],
+ l2fwd_ports_eth_addr[portid].addr_bytes[4],
+ l2fwd_ports_eth_addr[portid].addr_bytes[5]);
+
+ /* initialize port stats */
+ memset(&port_statistics, 0, sizeof(port_statistics));
+ }
+
+ if (!nb_ports_available) {
+ rte_exit(EXIT_FAILURE,
+ "All available ports are disabled. Please set portmask.\n");
+ }
+
+ check_all_ports_link_status(nb_ports, l2fwd_enabled_port_mask);
+
+ drain_tsc = (hz + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
+ stats_lcore = rte_get_master_lcore();
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ qconf = &lcore_queue_conf[lcore_id];
+
+ if (rte_headroom_init(&qconf->headroom) != 0)
+ rte_panic("Headroom for core %u init failed\n", lcore_id);
+
+ if (qconf->n_rx_port == 0) {
+ if (stats_lcore == rte_get_master_lcore()) {
+ stats_lcore = lcore_id;
+ RTE_LOG(INFO, L2FWD,
+ "lcore %u: this core is free. Statistics will be "
+ "displayed on this core.\n",
+ lcore_id);
+ } else {
+ RTE_LOG(INFO, L2FWD,
+ "lcore %u: no ports so no headroom initialization\n",
+ lcore_id);
+ }
+
+ continue;
+ }
+
+ /* Add flush job.
+ * Set fixed period by setting min = max = initial period. Set target to
+ * zero as it is irrelevant for this job. */
+ ret = rte_headroom_add_job(&qconf->headroom, "flush",
+ &l2fwd_flush_job, NULL, drain_tsc, drain_tsc, drain_tsc, 0,
+ &job);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to add flush job for lcore %u: %s",
+ lcore_id, rte_strerror(-ret));
+ }
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+ printf("%u ", qconf->rx_port_list[i]);
+
+ snprintf(job_name, RTE_DIM(job_name), "port %u fwd",
+ qconf->rx_port_list[i]);
+
+ /* Add forward job.
+ * Set min, max and initial period. Set target to MAX_PKT_BURST as
+ * this is desired optimal RX/TX burst size. */
+ ret = rte_headroom_add_job(&qconf->headroom, job_name,
+ &l2fwd_fwd_job, (void *)(uintptr_t)i, 0, drain_tsc, 0,
+ MAX_PKT_BURST, &job);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to add job (lcore: %u, port %u): %s",
+ lcore_id, qconf->rx_port_list[i], rte_strerror(-ret));
+ }
+ }
+ }
+
+ if (timer_period) {
+ /* Convert timer period to cycles */
+ timer_period *= hz;
+ qconf = &lcore_queue_conf[stats_lcore];
+
+ /* Add stats display job.
+ * Set fixed period by setting min = max = initial period. Set target to
+ * zero as it is irrelevant for this job. */
+ ret = rte_headroom_add_job(&qconf->headroom, "stats", &print_stats_job,
+ NULL, timer_period, timer_period, timer_period, 0, &job);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to add print stats job for lcore %u: %s",
+ stats_lcore, rte_strerror(-ret));
+ }
+
+ RTE_LOG(INFO, L2FWD, "Stats display on LCore %u\n", stats_lcore);
+ } else {
+ RTE_LOG(INFO, L2FWD, "Stats display disabled\n");
+ }
+
+ /* launch per-lcore init on every lcore */
+ rte_eal_mp_remote_launch(l2fwd_launch_one_lcore, NULL, CALL_MASTER);
+ RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+ if (rte_eal_wait_lcore(lcore_id) < 0)
+ return -1;
+ }
+
+ return 0;
+}
--
1.7.9.5
* Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and example application
2015-01-29 11:50 [dpdk-dev] [PATCH 0/2] new headroom stats library and example application Pawel Wodkowski
2015-01-29 11:50 ` [dpdk-dev] [PATCH 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
2015-01-29 11:50 ` [dpdk-dev] [PATCH 2/2] examples: introduce new l2fwd-headroom example Pawel Wodkowski
@ 2015-01-29 13:25 ` Neil Horman
2015-01-29 17:10 ` Wodkowski, PawelX
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 " Pawel Wodkowski
3 siblings, 1 reply; 48+ messages in thread
From: Neil Horman @ 2015-01-29 13:25 UTC (permalink / raw)
To: Pawel Wodkowski; +Cc: dev
On Thu, Jan 29, 2015 at 12:50:04PM +0100, Pawel Wodkowski wrote:
> Hi community,
> I would like to introduce library for measuring load of some arbitrary jobs. It
> can be used to profile every kind of job sets on any arbitrary execution unit.
> In provided l2fwd-headroom example I demonstrate how to use this library to
> profile packet forwarding (job set is froward, flush and stats) on LCores
> (execution unit). This example does no limit possible schemes on which this
> library can be used.
>
> Pawel Wodkowski (2):
> librte_headroom: New library for checking core/system/app load
> examples: introduce new l2fwd-headroom example
>
> config/common_bsdapp | 6 +
> config/common_linuxapp | 6 +
> examples/Makefile | 1 +
> examples/l2fwd-headroom/Makefile | 51 +++
> examples/l2fwd-headroom/main.c | 875 ++++++++++++++++++++++++++++++++++++
> lib/Makefile | 1 +
> lib/librte_headroom/Makefile | 50 +++
> lib/librte_headroom/rte_headroom.c | 368 +++++++++++++++
> lib/librte_headroom/rte_headroom.h | 481 ++++++++++++++++++++
> mk/rte.app.mk | 4 +
> 10 files changed, 1843 insertions(+)
> create mode 100644 examples/l2fwd-headroom/Makefile
> create mode 100644 examples/l2fwd-headroom/main.c
> create mode 100644 lib/librte_headroom/Makefile
> create mode 100644 lib/librte_headroom/rte_headroom.c
> create mode 100644 lib/librte_headroom/rte_headroom.h
>
> --
> 1.7.9.5
>
>
What's the advantage of this library over the other tools to perform the same
function? Perf can provide all the information in this library, and do so
without having to directly modify the source for the execution unit under test.
Neil
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and example application
2015-01-29 13:25 ` [dpdk-dev] [PATCH 0/2] new headroom stats library and example application Neil Horman
@ 2015-01-29 17:10 ` Wodkowski, PawelX
2015-01-29 19:13 ` Neil Horman
0 siblings, 1 reply; 48+ messages in thread
From: Wodkowski, PawelX @ 2015-01-29 17:10 UTC (permalink / raw)
To: Neil Horman; +Cc: dev
> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Thursday, January 29, 2015 2:25 PM
> To: Wodkowski, PawelX
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and example
> application
>
> On Thu, Jan 29, 2015 at 12:50:04PM +0100, Pawel Wodkowski wrote:
> > Hi community,
> > I would like to introduce library for measuring load of some arbitrary jobs. It
> > can be used to profile every kind of job sets on any arbitrary execution unit.
> > In provided l2fwd-headroom example I demonstrate how to use this library to
> > profile packet forwarding (job set is froward, flush and stats) on LCores
> > (execution unit). This example does no limit possible schemes on which this
> > library can be used.
> >
> > Pawel Wodkowski (2):
> > librte_headroom: New library for checking core/system/app load
> > examples: introduce new l2fwd-headroom example
> >
> > config/common_bsdapp | 6 +
> > config/common_linuxapp | 6 +
> > examples/Makefile | 1 +
> > examples/l2fwd-headroom/Makefile | 51 +++
> > examples/l2fwd-headroom/main.c | 875 ++++++++++++++++++++++++++++++++++++
> > lib/Makefile | 1 +
> > lib/librte_headroom/Makefile | 50 +++
> > lib/librte_headroom/rte_headroom.c | 368 +++++++++++++++
> > lib/librte_headroom/rte_headroom.h | 481 ++++++++++++++++++++
> > mk/rte.app.mk | 4 +
> > 10 files changed, 1843 insertions(+)
> > create mode 100644 examples/l2fwd-headroom/Makefile
> > create mode 100644 examples/l2fwd-headroom/main.c
> > create mode 100644 lib/librte_headroom/Makefile
> > create mode 100644 lib/librte_headroom/rte_headroom.c
> > create mode 100644 lib/librte_headroom/rte_headroom.h
> >
> > --
> > 1.7.9.5
> >
> >
>
> What's the advantage of this library over the other tools to perform the same
> function?
Hi Neil,
Good point: what is the advantage over perf? The answer is that this library is
not supposed to compete with perf and is not for profiling an app the way perf does.
It is a small and fast extension. Its main task is to manage a job list, invoke
the jobs exactly when needed, and provide some basic stats about application idle
time (whatever the programmer considers idle) and busy time.
For example:
an application might decide to add or remove some jobs to/from lcore(s) dynamically
based on current idle time (e.g. move a job from one core to another).
An application might also have some information about the traffic type it handles
and provide its own algorithm to calculate invocation time (it can also dynamically
switch between those algorithms just by replacing handlers).
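To make the "move a job based on idle time" idea concrete, here is a minimal sketch of how an application could choose a destination core from per-lcore idle figures. It is only an illustration: most_idle_lcore() and the idle_cycles array are hypothetical and not part of the library; the application would fill the array from whatever statistics it collects.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the entry with the largest idle time, or -1 when
 * the array is empty. Ties resolve to the lowest index.
 */
static int
most_idle_lcore(const uint64_t *idle_cycles, size_t n_lcores)
{
	size_t i;
	int best = -1;

	for (i = 0; i < n_lcores; i++) {
		if (best < 0 || idle_cycles[i] > idle_cycles[best])
			best = (int)i;
	}
	return best;
}
```

A caller would sample idle time per lcore over one stats period, pass the samples in, and move a job to the returned index if that core has enough spare time.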
> Perf can provide all the information in this library, and do so
> without having to directly modify the source for the execution unit under test
Yes, perf can provide that information, but it can't handle the case where
you are polling for packets too fast or too slow and waste time getting only a couple
of them. The library adjusts the time at which it executes a job based on the value the
job returned previously. The code modifications are not that deep, as you can see by
comparing the l2fwd and l2fwd-headroom apps.
For example, in the application I introduced, when the forward job returns less than
MAX_PKT_BURST, the execution period is increased. If it returns more, the execution
period is decreased. The statistics provided can be used to determine whether the
application is behaving correctly and whether there is time for handling another port
(which is what I did for tests).
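A minimal sketch of that period adjustment, written as standalone C rather than taken from the library. The struct, the function name and the step sizes (1 up, 4 down) are assumptions chosen for illustration; the target would be MAX_PKT_BURST in a forwarding app.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical job descriptor; the field names are illustrative only. */
struct toy_job {
	uint64_t period;     /* current interval between invocations */
	uint64_t min_period; /* lower bound for period */
	uint64_t max_period; /* upper bound for period */
	int64_t target;      /* desired result per invocation, e.g. a full burst */
};

/*
 * Lengthen the period after a poll that returned less than the target,
 * shorten it after one that returned more, and leave it alone otherwise.
 */
void
toy_adjust_period(struct toy_job *job, int64_t result)
{
	assert(job != NULL);

	if (result < job->target) {
		/* Polled too early: wait a little longer next time. */
		if (job->period + 1 <= job->max_period)
			job->period += 1;
	} else if (result > job->target) {
		/* Polled too late: come back sooner, in a bigger step. */
		if (job->period >= job->min_period + 4)
			job->period -= 4;
	}
}
```

Calling toy_adjust_period() after every poll makes the period drift up slowly while polls come back short and drop quickly once they come back over the target.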
Pawel
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and example application
2015-01-29 17:10 ` Wodkowski, PawelX
@ 2015-01-29 19:13 ` Neil Horman
2015-01-30 10:47 ` Wodkowski, PawelX
0 siblings, 1 reply; 48+ messages in thread
From: Neil Horman @ 2015-01-29 19:13 UTC (permalink / raw)
To: Wodkowski, PawelX; +Cc: dev
On Thu, Jan 29, 2015 at 05:10:36PM +0000, Wodkowski, PawelX wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Thursday, January 29, 2015 2:25 PM
> > To: Wodkowski, PawelX
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and example
> > application
> >
> > Whats the advantage of this library over the other tools to preform the same
> > function.
>
> Hi Neil,
>
> Good point, what is advantage over perf. Answer is: this library does not
> supposed to be a perf competition and is not for profiling app in the way perf does.
> It is an small and fast extension. It's main task is to manage job list to invoke
> them exactly when needed and provide some basic stats about application idle
> time (whatever programmer will consider the idle) and busy time.
>
> For example:
> application might decide to add remove some jobs to/from LCore(s) dynamically
> basing on current idle time (ex: move job from one core to another).
>
> Also application might have some information's about traffic type it handles
> and provide own algorithm to calculate invocation time (it can also dynamically
> switch between those algorithms only replacing handlers).
>
> > Perf can provide all the information in this library, and do so
> > without having to directly modify the source for the execution unit under test
>
> Yes, perf can provide those information's but it can't handle the case when
> you are poling for packets too fast or too slow and waist time getting only couple
> of them. Library will adjust time when it execute job basing on value this job
> returned previously. Code modifications are not so deep, as you can see comparing
> l2wf vs l2fwd-headroom app.
>
> For example in application I introduced, when forward job return less than
> MAX_PKT_BURST execution period will be increased. If it return more it will decrease
> execution period. Stats provided for that can be used to determine if application is
> behaving correctly and if there is a time for handling another port (what did for tests).
>
You're still re-inventing the wheel here, and I don't see any advantage to doing
so. If the goal of the library is to profile the run time of a task, then you
have perf and systemtap for such purposes. If the goal is to create a job
scheduler that allows you to track multiple parallel tasks, and adjust their
execution, there are several pre-existing libraries that any application
programmer can already leverage to do just that (Berkeley UPC or libtask to name
just two examples). Truthfully, on a dedicated cpu, you could just as easily
create multiple child processes running at SCHED_RR and set their priorities
accordingly.
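A rough sketch of the SCHED_RR approach, assuming Linux and the POSIX scheduling calls. The toy_* helpers are hypothetical names, and switching to SCHED_RR normally needs root or CAP_SYS_NICE, so the child simply exits with an error if the request is refused.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Clamp a requested priority into the range the system allows for SCHED_RR. */
int
toy_clamp_rr_priority(int wanted)
{
	int lo = sched_get_priority_min(SCHED_RR);
	int hi = sched_get_priority_max(SCHED_RR);

	if (wanted < lo)
		return lo;
	if (wanted > hi)
		return hi;
	return wanted;
}

/*
 * Fork a child that moves itself to SCHED_RR at the given priority and
 * then runs fn(). Returns the child's pid in the parent, -1 if fork() failed.
 */
pid_t
toy_spawn_rr_child(void (*fn)(void), int wanted_prio)
{
	pid_t pid;

	assert(fn != NULL);

	pid = fork();
	if (pid == 0) {
		struct sched_param sp = {
			.sched_priority = toy_clamp_rr_priority(wanted_prio)
		};

		/* pid 0 means "the calling process", i.e. this child. */
		if (sched_setscheduler(0, SCHED_RR, &sp) != 0)
			_exit(1);
		fn();
		_exit(0);
	}
	return pid;
}
```

The parent would call toy_spawn_rr_child() once per job with a different priority each time and later reap the children with waitpid().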
I don't see why we need another library to do what several other tools/libraries
can do quite well.
Neil
> Pawel
>
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and example application
2015-01-29 19:13 ` Neil Horman
@ 2015-01-30 10:47 ` Wodkowski, PawelX
2015-01-30 18:02 ` Neil Horman
0 siblings, 1 reply; 48+ messages in thread
From: Wodkowski, PawelX @ 2015-01-30 10:47 UTC (permalink / raw)
To: Neil Horman; +Cc: dev
> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Thursday, January 29, 2015 8:13 PM
> To: Wodkowski, PawelX
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and example
> application
>
> On Thu, Jan 29, 2015 at 05:10:36PM +0000, Wodkowski, PawelX wrote:
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Thursday, January 29, 2015 2:25 PM
> > > To: Wodkowski, PawelX
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and
> example
> > > application
> > >
> > > Whats the advantage of this library over the other tools to preform the same
> > > function.
> >
> > Hi Neil,
> >
> > Good point, what is advantage over perf. Answer is: this library does not
> > supposed to be a perf competition and is not for profiling app in the way perf
> does.
> > It is an small and fast extension. It's main task is to manage job list to invoke
> > them exactly when needed and provide some basic stats about application idle
> > time (whatever programmer will consider the idle) and busy time.
> >
> > For example:
> > application might decide to add remove some jobs to/from LCore(s)
> dynamically
> > basing on current idle time (ex: move job from one core to another).
> >
> > Also application might have some information's about traffic type it handles
> > and provide own algorithm to calculate invocation time (it can also
> dynamically
> > switch between those algorithms only replacing handlers).
> >
> > > Perf can provide all the information in this library, and do so
> > > without having to directly modify the source for the execution unit under
> test
> >
> > Yes, perf can provide those information's but it can't handle the case when
> > you are poling for packets too fast or too slow and waist time getting only
> couple
> > of them. Library will adjust time when it execute job basing on value this job
> > returned previously. Code modifications are not so deep, as you can see
> comparing
> > l2wf vs l2fwd-headroom app.
> >
> > For example in application I introduced, when forward job return less than
> > MAX_PKT_BURST execution period will be increased. If it return more it will
> decrease
> > execution period. Stats provided for that can be used to determine if
> application is
> > behaving correctly and if there is a time for handling another port (what did for
> tests).
> >
> You're still re-inventing the wheel here, and I don't see any advantage to doing
> so. If the goal of the library is to profile the run time of a task, then you
> have perf and systemtap for such purposes. If the goal is to create a job
> scheduler that allows you to track multiple parallel tasks, and adjust their
> execution, there are several pre-existing libraries that any application
> programmer can already leverage to do just that (Berkely UPC or libtask to name
> just two examples). Truthfully, on a dedicated cpu, you could just as easily
> create multiple child processes runnnig at SCHED_RR and set their priorities
> accordingly.
>
> I don't see why we need another library to do what several other tools/libraries
> can do quite well.
>
I am under the impression that I have not expressed myself clearly enough.
I did not mean to compete with perf or to reinvent a tasking library.
1. Idle and runtime statistics are provided "by the way"; you can use them,
or use perf on Linux, libpmc on FreeBSD, or whatever tool you like to get more
sophisticated ones.
2. You can also use whatever tasking library you like, with a headroom object on top
of every task you create, with any scheduling you want. This way you can assign
priorities to job sets and adjust sleep time as you like. You can also decide what
counts as idle/load time by placing a task-dependent yield/sleep in the idle callback,
a separate job callback, or the loop-end callback.
Neither of the two points above covers one gap: we are running in poll
mode, and here we lack the mutexes, semaphores and other synchronization mechanisms
provided by the OS/tasking layer. This library tries to provide a facility for
estimating the optimal job execution time, because every mis-poll is a waste of time
and can give the impression that a core/task is fully loaded when all it does is poll
and get 1 or 2 packets instead of 32 (which degrades total NIC throughput).
You can use perf (or whatever tool you like) to profile your code for performance. If
the code is good enough, you use the headroom library, which estimates how often or
when your profiled and optimized code needs to be executed. When the execution
period estimate settles, you have your headroom for other jobs on this core/task.
You can also provide your own algorithm for estimating the execution period. You are
also able to stop using the headroom library at runtime and switch to a tasking one
once you decide that you have enough data to decide how many jobs you can execute
on an lcore/task based on the current throughput.
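As a small illustration of what "headroom" could mean numerically, here is a sketch that turns a busy-time counter and a total-time counter into a percentage. The function name and the definition (share of the measurement window not spent executing jobs) are assumptions for this example, not the library's API.

```c
#include <stdint.h>

/*
 * Percentage of the measurement window that was not spent executing jobs.
 * Returns 0 for an empty window or when busy time fills the whole window.
 * Assumes the counters are small enough that multiplying by 100 does not
 * overflow 64 bits.
 */
unsigned
toy_headroom_percent(uint64_t busy_cycles, uint64_t total_cycles)
{
	if (total_cycles == 0 || busy_cycles >= total_cycles)
		return 0;
	return (unsigned)(((total_cycles - busy_cycles) * 100) / total_cycles);
}
```

With 750 busy cycles out of 1000, this reports 25, i.e. roughly a quarter of the core is still free for other jobs.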
I deliberately used 'job' rather than 'task' so as not to give the impression that I am
reinventing another task library. This code is meant to be simple and fast. Of course
you can do all those things in every application you write, but this library is provided
so that this logic does not have to be reinvented every time.
Pawel
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and example application
2015-01-30 10:47 ` Wodkowski, PawelX
@ 2015-01-30 18:02 ` Neil Horman
0 siblings, 0 replies; 48+ messages in thread
From: Neil Horman @ 2015-01-30 18:02 UTC (permalink / raw)
To: Wodkowski, PawelX; +Cc: dev
On Fri, Jan 30, 2015 at 10:47:21AM +0000, Wodkowski, PawelX wrote:
>
>
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Thursday, January 29, 2015 8:13 PM
> > To: Wodkowski, PawelX
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and example
> > application
> >
> > On Thu, Jan 29, 2015 at 05:10:36PM +0000, Wodkowski, PawelX wrote:
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Thursday, January 29, 2015 2:25 PM
> > > > To: Wodkowski, PawelX
> > > > Cc: dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH 0/2] new headroom stats library and
> > example
> > > > application
> > > >
> > > > Whats the advantage of this library over the other tools to preform the same
> > > > function.
> > >
> > > Hi Neil,
> > >
> > > Good point, what is advantage over perf. Answer is: this library does not
> > > supposed to be a perf competition and is not for profiling app in the way perf
> > does.
> > > It is an small and fast extension. It's main task is to manage job list to invoke
> > > them exactly when needed and provide some basic stats about application idle
> > > time (whatever programmer will consider the idle) and busy time.
> > >
> > > For example:
> > > application might decide to add remove some jobs to/from LCore(s)
> > dynamically
> > > basing on current idle time (ex: move job from one core to another).
> > >
> > > Also application might have some information's about traffic type it handles
> > > and provide own algorithm to calculate invocation time (it can also
> > dynamically
> > > switch between those algorithms only replacing handlers).
> > >
> > > > Perf can provide all the information in this library, and do so
> > > > without having to directly modify the source for the execution unit under
> > test
> > >
> > > Yes, perf can provide those information's but it can't handle the case when
> > > you are poling for packets too fast or too slow and waist time getting only
> > couple
> > > of them. Library will adjust time when it execute job basing on value this job
> > > returned previously. Code modifications are not so deep, as you can see
> > comparing
> > > l2wf vs l2fwd-headroom app.
> > >
> > > For example in application I introduced, when forward job return less than
> > > MAX_PKT_BURST execution period will be increased. If it return more it will
> > decrease
> > > execution period. Stats provided for that can be used to determine if
> > application is
> > > behaving correctly and if there is a time for handling another port (what did for
> > tests).
> > >
> > You're still re-inventing the wheel here, and I don't see any advantage to doing
> > so. If the goal of the library is to profile the run time of a task, then you
> > have perf and systemtap for such purposes. If the goal is to create a job
> > scheduler that allows you to track multiple parallel tasks, and adjust their
> > execution, there are several pre-existing libraries that any application
> > programmer can already leverage to do just that (Berkely UPC or libtask to name
> > just two examples). Truthfully, on a dedicated cpu, you could just as easily
> > create multiple child processes runnnig at SCHED_RR and set their priorities
> > accordingly.
> >
> > I don't see why we need another library to do what several other tools/libraries
> > can do quite well.
> >
>
> I am under impression that I am unable to express myself clearly enough.
>
No, I think you're making yourself clear, you've created a library that allows
you to run a set of jobs, measure various metrics about them, and adjust their
execution time. Am I correct?
> I did not meant to make competition to perf nor reinvent the tasking library.
>
I understand that's not what your intent was, but it seems to have been your
result.
> 1. Idle and runtime statistics are provided "by the way" and you can use them
> or use perf on linux or libpmc on freebsd or whatever tool you like to get more
> sophisticated ones.
Ok, so we're in agreement that the idle and runtime statistics in this library
are similar to, and somewhat less sophisticated than, those provided by perf or
libpmc.
> 2. You can also use tasking library what you like with a headroom object on top
> of every task you created with any scheduling you want. This way you can assign
> priorities to job sets and adjust sleep time as you like. You can also decide what
> is your idle/load time by placing task-dependet-yeld/sleep in idle callback or separate
> job callback or loop end callback.
>
I'm a little unclear on what you're saying here. What I think you're saying is
that we can also use other tasking libraries to achieve what you've done with
libheadroom, but I might be misreading you. If so, please clarify.
> Both of those two points above are no covering one gap - we are running in poll
> mode and here is a lack of mutex, semaphores or other synchronization mechanisms
> provided by OS/tasking layer.
Ok, I'm not sure how that's relevant though. The OS doesn't have to be
involved here at all if you don't want it to. Take a look at libtask:
http://swtch.com/libtask.tar.gz
It's a rudimentary tasking library that makes locking available if need be, but
in no way requires it.
UPC is similar:
http://upc.lbl.gov/task.shtml
It's got locking, but with proper cpu separation you don't need to use it.
> This library try to provide facility for estimating optimal
> job execution time because every mis-poll is a waste of time and can lead to
> impression that core/task is fully loaded but what it does is only polling and getting 1
> or 2 packets instead of 32 (it degrades total NIC throughput).
>
Yes, I get that you've got a specific use case in mind (optimizing the time
spent in a specific task dynamically). My question is: Why did you re-create
the code already available in other libraries to do it? There's no reason you
can't use the UPC library to create an example of a job set consisting of 4
jobs:
1) A job to record the start time of job (2)
2) A job to poll an interface
3) A job to measure the end time of job (2) and the number of packets it
recorded
4) A job to adjust the run time of job (2) based on the results obtained from
stats gathered in job (3)
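To make the four steps concrete, here is a sketch in plain C. It does not use UPC or libtask; it is a stand-in that runs the four jobs back to back. Every toy_* name and the adjustment rule in step 4 are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* State shared by the four jobs in the set. */
struct toy_poll_state {
	uint64_t start_ns;    /* written by job 1 */
	uint64_t last_run_ns; /* written by job 3 */
	unsigned last_pkts;   /* written by job 3 */
	uint64_t period_ns;   /* adjusted by job 4 */
};

/* Monotonic clock reading in nanoseconds. */
uint64_t
toy_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Stand-in for an interface poll: reports the packet count stored in arg. */
unsigned
toy_fake_poll(void *arg)
{
	return *(unsigned *)arg;
}

void
toy_run_job_set(struct toy_poll_state *st, unsigned (*poll)(void *),
		void *arg, unsigned burst)
{
	unsigned pkts;

	st->start_ns = toy_now_ns();                   /* job 1: record start */
	pkts = poll(arg);                              /* job 2: poll */
	st->last_run_ns = toy_now_ns() - st->start_ns; /* job 3: end time ... */
	st->last_pkts = pkts;                          /* ... and packet count */

	/* job 4: back off after a short poll, tighten after a full burst. */
	if (st->last_pkts < burst)
		st->period_ns += st->period_ns / 8 + 1;
	else
		st->period_ns -= st->period_ns / 4;
}
```

In a tasking library each numbered step would be its own task; here they are simply consecutive statements, which is enough to show the data flow between them.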
I don't deny that you have a useful use case here, you definitely do. My only
concern is that you've re-invented the wheel to make it happen. You don't
really need the library here, just the example that makes use of an existing
library.
> You can use perf (or whatever tool you like) to profile your code for performance. If
> the code good enough, you use headroom library that will estimate how often or
> when your perfect-profiled-and-optimized code need to be executed. When execution
> period estimation settles you have your headroom for other jobs on this core/task.
> You can also provide your own algorithm for estimating execution period. You are also
> able to drop using this headroom library at runtime and switch to tasking one when
> you decide that you have enough data make decision how many jobs you can execute
> on lcore/task basing on current throughput.
>
> I deliberately did not used 'task' but 'job' to not make an impression that I am
> reinventing another task library. This code is to be simple and fast. Of course you can
> do all those things in every application you make but this library is provided to not
> reinvent this logic all the time.
>
All user space task libraries are simple and fast, it's the only reason they
exist :). There's no need to create another one here.
Neil
^ permalink raw reply [flat|nested] 48+ messages in thread
* [dpdk-dev] [PATCH v2 0/2] new headroom stats library and example application
2015-01-29 11:50 [dpdk-dev] [PATCH 0/2] new headroom stats library and example application Pawel Wodkowski
` (2 preceding siblings ...)
2015-01-29 13:25 ` [dpdk-dev] [PATCH 0/2] new headroom stats library and example application Neil Horman
@ 2015-02-17 15:37 ` Pawel Wodkowski
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
` (2 more replies)
3 siblings, 3 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-17 15:37 UTC (permalink / raw)
To: dev
Hi community,
I would like to introduce a library for measuring the load of arbitrary jobs. It
can be used to profile any kind of job set on any execution unit or
tasking library.
In the provided l2fwd-headroom example I demonstrate how to use this library to
select the optimal rx burst poll time. Jobs are selected using existing rte_timer
library calls. This example does not limit the possible schemes in which this library
can be used.
PATCH v2 changes:
- Remove job management/callbacks from the library so as not to duplicate tasking
library behaviour.
- Clean up/remove useless statistics.
- Rework the example application to use the rte_timer library for job selection.
- Introduce a new app parameter '-l' for automatic thousands separators in stats.
- More readable statistics format.
Pawel Wodkowski (2):
librte_headroom: New library for checking core/system/app load
examples: introduce new l2fwd-headroom example
config/common_bsdapp | 5 +
config/common_linuxapp | 5 +
examples/Makefile | 1 +
examples/l2fwd-headroom/Makefile | 51 ++
examples/l2fwd-headroom/main.c | 1039 ++++++++++++++++++++++++++
lib/Makefile | 1 +
lib/librte_headroom/Makefile | 54 ++
lib/librte_headroom/rte_headroom.c | 271 +++++++
lib/librte_headroom/rte_headroom.h | 324 ++++++++
lib/librte_headroom/rte_headroom_version.map | 20 +
mk/rte.app.mk | 4 +
11 files changed, 1775 insertions(+)
create mode 100644 examples/l2fwd-headroom/Makefile
create mode 100644 examples/l2fwd-headroom/main.c
create mode 100644 lib/librte_headroom/Makefile
create mode 100644 lib/librte_headroom/rte_headroom.c
create mode 100644 lib/librte_headroom/rte_headroom.h
create mode 100644 lib/librte_headroom/rte_headroom_version.map
--
1.7.9.5
^ permalink raw reply [flat|nested] 48+ messages in thread
* [dpdk-dev] [PATCH v2 1/2] librte_headroom: New library for checking core/system/app load
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 " Pawel Wodkowski
@ 2015-02-17 15:37 ` Pawel Wodkowski
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 2/2] examples: introduce new l2fwd-headroom example Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Pawel Wodkowski
2 siblings, 0 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-17 15:37 UTC (permalink / raw)
To: dev
This library provides an API to measure the time spent in particular parts of
code and to calculate the optimal polling time.
To calculate those statistics, application code needs to be divided into
parts (called jobs) that do something. It is up to the application to decide
what is considered a job.
A series of jobs must be surrounded with rte_headroom_start_loop() and
rte_headroom_finish_loop() calls. After that, jobs may be started.
Each job must be surrounded with rte_headroom_start_job() and
rte_headroom_finish_job() calls.
After a job finishes its execution, the period in which it should be called
again is adjusted to minimize time wasted on unnecessary polls/calls.
The adjustment is based on data provided by the job itself (e.g. the number of
packets it processed).
After all jobs in a series are executed, the following statistics are updated
and may be used by the application. Statistics can be reset. Some of the
provided statistics:
- total/min/max execution - time spent executing jobs.
- total/min/max management - time spent outside the execution area. This
value may be used to measure the overhead of scheduling jobs. This time also
contains the overhead of the headroom library itself.
- number of loops that executed at least one job
- executed jobs
- time when statistics were reset.
Each job provides total/min/max execution time and execution count
statistics.
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
config/common_bsdapp | 5 +
config/common_linuxapp | 5 +
lib/Makefile | 1 +
lib/librte_headroom/Makefile | 54 +++++
lib/librte_headroom/rte_headroom.c | 271 +++++++++++++++++++++
lib/librte_headroom/rte_headroom.h | 324 ++++++++++++++++++++++++++
lib/librte_headroom/rte_headroom_version.map | 20 ++
7 files changed, 680 insertions(+)
create mode 100644 lib/librte_headroom/Makefile
create mode 100644 lib/librte_headroom/rte_headroom.c
create mode 100644 lib/librte_headroom/rte_headroom.h
create mode 100644 lib/librte_headroom/rte_headroom_version.map
diff --git a/config/common_bsdapp b/config/common_bsdapp
index 57bacb8..aa2e5fd 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -282,6 +282,11 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d428f84..055a37b 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -290,6 +290,11 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/lib/Makefile b/lib/Makefile
index d617d81..4fc2819 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -54,6 +54,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
+DIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += librte_headroom
DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
diff --git a/lib/librte_headroom/Makefile b/lib/librte_headroom/Makefile
new file mode 100644
index 0000000..faefb3b
--- /dev/null
+++ b/lib/librte_headroom/Makefile
@@ -0,0 +1,54 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_headroom.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_headroom_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_HEADROOM) := rte_headroom.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_HEADROOM)-include := rte_headroom.h
+
+# this lib needs eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += lib/librte_mbuf
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_headroom/rte_headroom.c b/lib/librte_headroom/rte_headroom.c
new file mode 100644
index 0000000..a2cc671
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom.c
@@ -0,0 +1,271 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+#include <stdlib.h>
+
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_common.h>
+#include <rte_cycles.h>
+#include <rte_branch_prediction.h>
+#include <rte_debug.h>
+#include <rte_eal.h>
+#include <rte_malloc.h>
+
+#include "rte_headroom.h"
+
+/* These are the steps used to adjust the job period.
+ * Experiments show that for forwarding apps the up step must be smaller than
+ * the down step to achieve optimal performance.
+ */
+#define JOB_UPDATE_STEP_UP 1
+#define JOB_UPDATE_STEP_DOWN 4
+
+/*
+ * Default update function that implements simple period adjustment.
+ */
+static void
+default_update_function(struct rte_headroom_job *job, int64_t result)
+{
+ int64_t err = job->target - result;
+
+ /* Job is happy. Nothing to do */
+ if (err == 0)
+ return;
+
+ if (err > 0) {
+ if (job->period + JOB_UPDATE_STEP_UP < job->max_period)
+ job->period += JOB_UPDATE_STEP_UP;
+ } else {
+ if (job->min_period + JOB_UPDATE_STEP_DOWN < job->period)
+ job->period -= JOB_UPDATE_STEP_DOWN;
+ }
+}
+
+#define HDR_ADD_TIME_MIN_MAX(obj, type, value) do { \
+ typeof(value) tmp = (value); \
+ (obj)->type ## _time += tmp; \
+ if (tmp < (obj)->min_ ## type ## _time) \
+ (obj)->min_ ## type ## _time = tmp; \
+ if (tmp > (obj)->max_ ## type ## _time) \
+ (obj)->max_ ## type ## _time = tmp; \
+} while (0)
+
+#define HDR_RESET_TIME_MIN_MAX(obj, type) do { \
+ (obj)->type ## _time = 0; \
+ (obj)->min_ ## type ## _time = UINT64_MAX; \
+ (obj)->max_ ## type ## _time = 0; \
+} while (0)
+
+int
+rte_headroom_init(struct rte_headroom *hdr)
+{
+ if (hdr == NULL)
+ return -EINVAL;
+
+ /* Init only needed parameters. Zero out everything else. */
+ memset(hdr, 0, sizeof(struct rte_headroom));
+
+ rte_headroom_reset_stats(hdr);
+
+ return 0;
+}
+
+void
+rte_headroom_start_loop(struct rte_headroom *hdr)
+{
+ uint64_t now;
+
+ hdr->loop_executed_jobs = 0;
+
+ rte_mb();
+ now = rte_get_timer_cycles();
+ HDR_ADD_TIME_MIN_MAX(hdr, management, now - hdr->state_time);
+ hdr->state_time = now;
+}
+
+void
+rte_headroom_finish_loop(struct rte_headroom *hdr)
+{
+ uint64_t now;
+
+ if (likely(hdr->loop_executed_jobs))
+ hdr->loop_cnt++;
+
+ rte_mb();
+ now = rte_get_timer_cycles();
+ HDR_ADD_TIME_MIN_MAX(hdr, management, now - hdr->state_time);
+ hdr->state_time = now;
+}
+
+void
+rte_headroom_set_job_target(struct rte_headroom_job *job, int64_t target)
+{
+ job->target = target;
+}
+
+int
+rte_headroom_start_job(struct rte_headroom *hdr, struct rte_headroom_job *job)
+{
+ uint64_t now;
+
+ /* Some sanity check. */
+ if (unlikely(hdr == NULL || job == NULL || job->headroom != NULL))
+ return -EINVAL;
+
+ /* Link job with headroom object. */
+ job->headroom = hdr;
+
+ rte_mb();
+ now = rte_get_timer_cycles();
+ HDR_ADD_TIME_MIN_MAX(hdr, management, now - hdr->state_time);
+ hdr->state_time = now;
+
+ return 0;
+}
+
+int
+rte_headroom_finish_job(struct rte_headroom_job *job, int64_t job_value)
+{
+ struct rte_headroom *hdr;
+ uint64_t now, exec_time;
+ int need_update;
+
+ /* Some sanity check. */
+ if (unlikely(job == NULL || job->headroom == NULL))
+ return -EINVAL;
+
+ need_update = job->target != job_value;
+ /* Adjust period only if job is unhappy with its current period. */
+ if (need_update)
+ (*job->update_period_cb)(job, job_value);
+
+ hdr = job->headroom;
+
+ /* Time spent in the period update callback counts as job runtime, so
+ * read the timer only after the callback has run. */
+ rte_mb();
+ now = rte_get_timer_cycles();
+ exec_time = now - hdr->state_time;
+ HDR_ADD_TIME_MIN_MAX(job, exec, exec_time);
+ HDR_ADD_TIME_MIN_MAX(hdr, exec, exec_time);
+
+ hdr->state_time = now;
+
+ hdr->loop_executed_jobs++;
+ hdr->job_exec_cnt++;
+
+ job->exec_cnt++;
+ job->headroom = NULL;
+
+ return need_update;
+}
+
+void
+rte_headroom_job_set_period(struct rte_headroom_job *job, uint64_t period,
+ uint8_t saturate)
+{
+ if (saturate != 0) {
+ if (period < job->min_period)
+ period = job->min_period;
+ else if (period > job->max_period)
+ period = job->max_period;
+ }
+
+ job->period = period;
+}
+
+void
+rte_headroom_set_min_period(struct rte_headroom_job *job, uint64_t period)
+{
+ job->min_period = period;
+ if (job->period < period)
+ job->period = period;
+}
+
+void
+rte_headroom_set_max_period(struct rte_headroom_job *job, uint64_t period)
+{
+ job->max_period = period;
+ if (job->period > period)
+ job->period = period;
+}
+
+int
+rte_headroom_job_init(struct rte_headroom_job *job, const char *name,
+ uint64_t min_period, uint64_t max_period, uint64_t initial_period,
+ int64_t target)
+{
+ if (job == NULL)
+ return -EINVAL;
+
+ job->period = initial_period;
+ job->min_period = min_period;
+ job->max_period = max_period;
+ job->target = target;
+ job->update_period_cb = &default_update_function;
+ rte_headroom_reset_job_stats(job);
+ snprintf(job->name, RTE_DIM(job->name), "%s", name == NULL ? "" : name);
+ job->headroom = NULL;
+
+ return 0;
+}
+
+void
+rte_headroom_set_update_period_function(struct rte_headroom_job *job,
+ rte_headroom_update_fn_t update_period_cb)
+{
+ if (update_period_cb == NULL)
+ update_period_cb = default_update_function;
+
+ job->update_period_cb = update_period_cb;
+}
+
+void
+rte_headroom_reset_job_stats(struct rte_headroom_job *job)
+{
+ HDR_RESET_TIME_MIN_MAX(job, exec);
+ job->exec_cnt = 0;
+}
+
+void
+rte_headroom_reset_stats(struct rte_headroom *hdr)
+{
+ HDR_RESET_TIME_MIN_MAX(hdr, exec);
+ HDR_RESET_TIME_MIN_MAX(hdr, management);
+ hdr->start_time = rte_get_timer_cycles();
+ hdr->state_time = hdr->start_time;
+ hdr->job_exec_cnt = 0;
+ hdr->loop_cnt = 0;
+}
diff --git a/lib/librte_headroom/rte_headroom.h b/lib/librte_headroom/rte_headroom.h
new file mode 100644
index 0000000..f389ca7
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom.h
@@ -0,0 +1,324 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef HEADROOM_H_
+#define HEADROOM_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_HEADROOM_JOB_NAMESIZE 32
+#define RTE_HEADROOM_NAMESIZE 32
+#define RTE_HEADROOM_MZ_PREFIX "HDR_"
+
+/* Forward declarations. */
+struct rte_headroom;
+struct rte_headroom_job;
+
+/**
+ * This function should calculate the new period and set it using the
+ * rte_headroom_job_set_period() function. Time spent in this function will be
+ * added to the job's runtime.
+ *
+ * @param job
+ * The job data structure handler.
+ * @param job_result
+ * Result of calling job callback.
+ */
+typedef void (*rte_headroom_update_fn_t)(struct rte_headroom_job *job,
+ int64_t job_result);
+
+struct rte_headroom_job {
+ uint64_t period;
+ /**< Estimated period of execution. */
+
+ uint64_t min_period;
+ /**< Minimum period. */
+
+ uint64_t max_period;
+ /**< Maximum period. */
+
+ int64_t target;
+ /**< Desired value for this job. */
+
+ rte_headroom_update_fn_t update_period_cb;
+ /**< Period update callback. */
+
+ uint64_t exec_time;
+ /**< Total time (sum) that this job was executing. */
+
+ uint64_t min_exec_time;
+ /**< Minimum execute time. */
+
+ uint64_t max_exec_time;
+ /**< Maximum execute time. */
+
+ uint64_t exec_cnt;
+ /**< Execute count. */
+
+ char name[RTE_HEADROOM_JOB_NAMESIZE];
+ /**< Name of this job */
+
+ struct rte_headroom *headroom;
+ /**< Headroom object that is executing this job. */
+} __rte_cache_aligned;
+
+struct rte_headroom {
+ /** Variable holding time at different points:
+ * -# loop start time if loop was started but no job executed yet.
+ * -# job start time if job is currently executing.
+ * -# job finish time if job finished its execution.
+ * -# loop finish time if loop finished its execution. */
+ uint64_t state_time;
+
+ uint64_t loop_executed_jobs;
+ /**< Count of executed jobs in this loop. */
+
+ /* Statistics start. */
+
+ uint64_t exec_time;
+ /**< Total time taken to execute jobs, not including management time. */
+
+ uint64_t min_exec_time;
+ /**< Minimum loop execute time. */
+
+ uint64_t max_exec_time;
+ /**< Maximum loop execute time. */
+
+ /**
+ * Sum of time that is not the execute time (ex: from job finish to next
+ * job start).
+ *
+ * This time might be considered as overhead of headroom library + job
+ * scheduling.
+ */
+ uint64_t management_time;
+
+ uint64_t min_management_time;
+ /**< Minimum management time */
+
+ uint64_t max_management_time;
+ /**< Maximum management time */
+
+ uint64_t start_time;
+ /**< Time of the last stats reset. */
+
+ uint64_t job_exec_cnt;
+ /**< Total count of executed jobs. */
+
+ uint64_t loop_cnt;
+ /**< Total count of executed loops with at least one executed job. */
+} __rte_cache_aligned;
+
+/**
+ * Initialize given headroom object with default values.
+ *
+ * @param hdr
+ * Headroom object to initialize.
+ *
+ * @return
+ * 0 on success
+ * -EINVAL if *hdr* is NULL
+ */
+int
+rte_headroom_init(struct rte_headroom *hdr);
+
+/**
+ * Mark that a new set of jobs starts executing.
+ *
+ * @param hdr
+ * Headroom object.
+ */
+void
+rte_headroom_start_loop(struct rte_headroom *hdr);
+
+/**
+ * Mark that there are no more jobs ready to execute in this turn. Calculate
+ * stats for this loop turn.
+ *
+ * @param hdr
+ * Headroom object.
+ */
+void
+rte_headroom_finish_loop(struct rte_headroom *hdr);
+
+/**
+ * Initialize given job stats object.
+ *
+ * @param job
+ * Job object.
+ * @param name
+ * Optional job name.
+ * @param min_period
+ * Minimum period that this job can accept.
+ * @param max_period
+ * Maximum period that this job can accept.
+ * @param initial_period
+ * Initial period. It is stored as given; this function does not clamp it to
+ * *min_period* and *max_period*.
+ * @param target
+ * Target value that this job tries to achieve.
+ *
+ * @return
+ * 0 on success
+ * -EINVAL if *job* is NULL
+ */
+int
+rte_headroom_job_init(struct rte_headroom_job *job, const char *name,
+ uint64_t min_period, uint64_t max_period, uint64_t initial_period,
+ int64_t target);
+
+/**
+ * Set job desired target value. Difference between target and job callback
+ * return value must be used to properly adjust job execute period value.
+ *
+ * @param job
+ * The job object.
+ * @param target
+ * New target.
+ */
+void
+rte_headroom_set_job_target(struct rte_headroom_job *job, int64_t target);
+
+/**
+ * Mark that *job* is starting its execution in the context of *hdr* object.
+ *
+ * @param hdr
+ * Headroom object context.
+ * @param job
+ * Job object.
+ * @return
+ * 0 on success
+ * -EINVAL if *hdr* or *job* is NULL or *job* is already executing in another
+ * headroom context.
+ */
+int
+rte_headroom_start_job(struct rte_headroom *hdr, struct rte_headroom_job *job);
+
+/**
+ * Mark that *job* finished its execution. The context in which it was
+ * executing will receive a stats update. After this function call the *job*
+ * object is ready to be executed in another headroom context.
+ *
+ * @param job
+ * Job object.
+ * @param job_value
+ * Job value. The job should pass in this parameter the value that it tries to
+ * optimize, for example the number of packets it processed.
+ *
+ * @return
+ * 0 if job's period was not updated (job target equals *job_value*)
+ * 1 if the period update callback was invoked (job target differs from
+ * *job_value*)
+ * -EINVAL if job is NULL or job was not started (it has no headroom context).
+ */
+int
+rte_headroom_finish_job(struct rte_headroom_job *job, int64_t job_value);
+
+/**
+ * Set execute period of given job.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New period value.
+ * @param saturate
+ * If zero, skip period saturation to min, max range.
+ */
+void
+rte_headroom_job_set_period(struct rte_headroom_job *job, uint64_t period,
+ uint8_t saturate);
+/**
+ * Set minimum execute period of given job. Current period will be checked
+ * against new minimum value.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New minimum period value.
+ */
+void
+rte_headroom_set_min_period(struct rte_headroom_job *job, uint64_t period);
+/**
+ * Set maximum execute period of given job. Current period will be checked
+ * against new maximum value.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New maximum period value.
+ */
+void
+rte_headroom_set_max_period(struct rte_headroom_job *job, uint64_t period);
+
+/**
+ * Set update period callback that is invoked after job finish.
+ *
+ * If the application wants to do more sophisticated calculations than the
+ * default, it can provide this handler.
+ *
+ * @param job
+ * Job object.
+ * @param update_period_cb
+ * Callback to set. If NULL restore default update function.
+ */
+void
+rte_headroom_set_update_period_function(struct rte_headroom_job *job,
+ rte_headroom_update_fn_t update_period_cb);
+
+/**
+ * Function resets job statistics.
+ *
+ * @param job
+ * Job whose statistics will be reset.
+ */
+void
+rte_headroom_reset_job_stats(struct rte_headroom_job *job);
+/**
+ * Function resets headroom statistics.
+ *
+ * @param hdr
+ * Headroom object whose statistics will be reset.
+ */
+void
+rte_headroom_reset_stats(struct rte_headroom *hdr);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* HEADROOM_H_ */
diff --git a/lib/librte_headroom/rte_headroom_version.map b/lib/librte_headroom/rte_headroom_version.map
new file mode 100644
index 0000000..1f20016
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom_version.map
@@ -0,0 +1,20 @@
+DPDK_2.0 {
+ global:
+
+ rte_headroom_init;
+ rte_headroom_start_loop;
+ rte_headroom_finish_loop;
+ rte_headroom_job_init;
+ rte_headroom_set_job_target;
+ rte_headroom_start_job;
+ rte_headroom_finish_job;
+ rte_headroom_job_set_period;
+ rte_headroom_set_min_period;
+ rte_headroom_set_max_period;
+ rte_headroom_set_update_period_function;
+ rte_headroom_reset_job_stats;
+ rte_headroom_reset_stats;
+
+ local: *;
+};
+
\ No newline at end of file
--
1.7.9.5
* [dpdk-dev] [PATCH v2 2/2] examples: introduce new l2fwd-headroom example
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 " Pawel Wodkowski
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
@ 2015-02-17 15:37 ` Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Pawel Wodkowski
2 siblings, 0 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-17 15:37 UTC (permalink / raw)
To: dev
This app demonstrates usage of the new headroom library.
It is basically the original l2fwd with the following modifications to meet
headroom library requirements:
- main_loop() was split into two jobs: forward job and flush job. Logic
for those jobs is almost the same as in the original application.
- stats printing is moved to an rte_alarm callback so that it does not add
printing overhead to the forwarding loop.
- stats are expanded to show headroom statistics.
- added new parameter '-l' to enable the automatic thousands separator.
Comparing the original l2fwd and l2fwd-headroom apps shows what is needed
to properly write your own application with headroom measurements.
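For readers comparing the two apps, the period adaptation applied to each
job after it finishes can be sketched as below. This is a minimal,
standalone illustration that follows the rule of default_update_function()
in rte_headroom.c from patch 1/2; struct job_sketch, update_period() and the
STEP_* names are stand-ins invented for this sketch and are not part of the
library.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_headroom_job: only the fields that
 * the period update rule reads or writes are kept. */
struct job_sketch {
	uint64_t period;      /* current execution period */
	uint64_t min_period;  /* lower bound for period */
	uint64_t max_period;  /* upper bound for period */
	int64_t target;       /* value the job tries to reach per run */
};

/* Same step sizes as JOB_UPDATE_STEP_UP/DOWN in the patch. */
#define STEP_UP 1
#define STEP_DOWN 4

/* Adjust the period from the value reported by the last run. */
static void
update_period(struct job_sketch *job, int64_t result)
{
	int64_t err = job->target - result;

	/* Target met exactly: keep the current period. */
	if (err == 0)
		return;

	if (err > 0) {
		/* Less work than the target: run the job less often. */
		if (job->period + STEP_UP < job->max_period)
			job->period += STEP_UP;
	} else {
		/* More work than the target: run the job more often. */
		if (job->min_period + STEP_DOWN < job->period)
			job->period -= STEP_DOWN;
	}
}
```

With a target of 32 and a period of 100, a run that reports 10 moves the
period to 101, and a following run that reports 64 moves it to 97.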
New available statistics:
- Total and % of fwd and flush execution time
- management time - overhead of rte_timer + overhead of headroom library
- Idle time and % of time spent waiting for fwd or flush to be ready to
execute.
- per job execution time and period.
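The totals and percentages listed above can be derived from the raw cycle
counters roughly as follows. This is a hedged sketch that mirrors the
arithmetic in show_lcore_headroom_stats() from this patch; struct
lcore_sample and the helper names are hypothetical, the zero-length guard in
pct() is an addition of this sketch, and the timer frequency is passed in as
a parameter instead of being read from a global.

```c
#include <stdint.h>

/* Hypothetical snapshot of the counters read from one lcore. */
struct lcore_sample {
	uint64_t stats_period; /* cycles covered by this sample */
	uint64_t exec;         /* cycles spent in all jobs, idle job included */
	uint64_t idle;         /* cycles spent in the idle job */
	uint64_t management;   /* cycles spent between jobs */
};

/* Share of the sample period, in percent. Returns 0 for an empty period
 * (guard added in this sketch; the example app divides unconditionally). */
static double
pct(uint64_t part, uint64_t total)
{
	return total ? part * 100.0 / total : 0.0;
}

/* Busy time: job execution without the idle job, plus management time. */
static uint64_t
busy_cycles(const struct lcore_sample *s)
{
	return (s->exec - s->idle) + s->management;
}

/* Convert timer cycles to nanoseconds for a timer running at hz. */
static double
cycles_to_ns(uint64_t cycles, double hz)
{
	return (double)cycles * 1e9 / hz;
}
```

For a sample of 1000 cycles with exec = 700, idle = 400 and management =
100, busy_cycles() gives 400, which is 40% of the period.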
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
examples/Makefile | 1 +
examples/l2fwd-headroom/Makefile | 51 ++
examples/l2fwd-headroom/main.c | 1039 ++++++++++++++++++++++++++++++++++++++
mk/rte.app.mk | 4 +
4 files changed, 1095 insertions(+)
create mode 100644 examples/l2fwd-headroom/Makefile
create mode 100644 examples/l2fwd-headroom/main.c
diff --git a/examples/Makefile b/examples/Makefile
index 81f1d2f..8a459b7 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ip_fragmentation
DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
DIRS-y += l2fwd
+DIRS-y += l2fwd-headroom
DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
DIRS-y += l3fwd
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
diff --git a/examples/l2fwd-headroom/Makefile b/examples/l2fwd-headroom/Makefile
new file mode 100644
index 0000000..07da286
--- /dev/null
+++ b/examples/l2fwd-headroom/Makefile
@@ -0,0 +1,51 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l2fwd-headroom
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c
+
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l2fwd-headroom/main.c b/examples/l2fwd-headroom/main.c
new file mode 100644
index 0000000..7ba1743
--- /dev/null
+++ b/examples/l2fwd-headroom/main.c
@@ -0,0 +1,1039 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <locale.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <ctype.h>
+#include <getopt.h>
+
+#include <rte_alarm.h>
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+#include <rte_memzone.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_launch.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_prefetch.h>
+#include <rte_lcore.h>
+#include <rte_per_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_interrupts.h>
+#include <rte_pci.h>
+#include <rte_debug.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+#include <rte_ring.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_spinlock.h>
+
+#include <rte_errno.h>
+#include <rte_headroom.h>
+#include <rte_timer.h>
+#include <rte_alarm.h>
+
+#define RTE_LOGTYPE_L2FWD RTE_LOGTYPE_USER1
+
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NB_MBUF 8192
+
+#define MAX_PKT_BURST 32
+#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */
+
+/*
+ * Configurable number of RX/TX ring descriptors
+ */
+#define RTE_TEST_RX_DESC_DEFAULT 128
+#define RTE_TEST_TX_DESC_DEFAULT 512
+static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
+static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
+
+/* ethernet addresses of ports */
+static struct ether_addr l2fwd_ports_eth_addr[RTE_MAX_ETHPORTS];
+
+/* mask of enabled ports */
+static uint32_t l2fwd_enabled_port_mask;
+
+/* list of enabled ports */
+static uint32_t l2fwd_dst_ports[RTE_MAX_ETHPORTS];
+
+#define UPDATE_STEP_UP 1
+#define UPDATE_STEP_DOWN 32
+
+static unsigned int l2fwd_rx_queue_per_lcore = 1;
+
+struct mbuf_table {
+ uint64_t next_flush_time;
+ unsigned len;
+ struct rte_mbuf *mbufs[MAX_PKT_BURST];
+};
+
+#define MAX_RX_QUEUE_PER_LCORE 16
+#define MAX_TX_QUEUE_PER_PORT 16
+struct lcore_queue_conf {
+ unsigned n_rx_port;
+ unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
+ struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+
+ struct rte_timer rx_timers[MAX_RX_QUEUE_PER_LCORE];
+ struct rte_headroom_job port_fwd_jobs[MAX_RX_QUEUE_PER_LCORE];
+
+ struct rte_timer flush_timer;
+ struct rte_headroom_job flush_job;
+ struct rte_headroom_job idle_job;
+ struct rte_headroom headroom;
+
+ rte_atomic16_t stats_read_pending;
+ rte_spinlock_t lock;
+} __rte_cache_aligned;
+struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];
+
+static const struct rte_eth_conf port_conf = {
+ .rxmode = {
+ .split_hdr_size = 0,
+ .header_split = 0, /**< Header Split disabled */
+ .hw_ip_checksum = 0, /**< IP checksum offload disabled */
+ .hw_vlan_filter = 0, /**< VLAN filtering disabled */
+ .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
+ .hw_strip_crc = 0, /**< CRC stripping by hardware disabled */
+ },
+ .txmode = {
+ .mq_mode = ETH_MQ_TX_NONE,
+ },
+};
+
+struct rte_mempool *l2fwd_pktmbuf_pool = NULL;
+
+/* Per-port statistics struct */
+struct l2fwd_port_statistics {
+ uint64_t tx;
+ uint64_t rx;
+ uint64_t dropped;
+} __rte_cache_aligned;
+struct l2fwd_port_statistics port_statistics[RTE_MAX_ETHPORTS];
+
+/* 1 day max */
+#define MAX_TIMER_PERIOD 86400
+/* default period is 10 seconds */
+static int64_t timer_period = 10;
+/* default timer frequency */
+static double hz;
+/* BURST_TX_DRAIN_US converted to cycles */
+uint64_t drain_tsc;
+/* Convert cycles to ns */
+static inline double
+cycles_to_ns(uint64_t cycles)
+{
+ double t = cycles;
+
+ t *= (double)NS_PER_S;
+ t /= hz;
+ return t;
+}
+
+static void
+show_lcore_headroom_stats(unsigned lcore_id)
+{
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct rte_headroom *hdr = &qconf->headroom;
+ struct rte_headroom_job *job;
+ uint8_t i;
+
+ /* Headroom statistics. */
+ uint64_t stats_period, loop_count;
+ uint64_t exec, exec_min, exec_max;
+ uint64_t management, management_min, management_max;
+ uint64_t busy, busy_min, busy_max;
+
+ /* Jobs statistics. */
+ const uint8_t port_cnt = qconf->n_rx_port;
+ uint64_t jobs_exec_cnt[port_cnt], jobs_period[port_cnt];
+ uint64_t jobs_exec[port_cnt], jobs_exec_min[port_cnt],
+ jobs_exec_max[port_cnt];
+
+ uint64_t flush_exec_cnt, flush_period;
+ uint64_t flush_exec, flush_exec_min, flush_exec_max;
+
+ uint64_t idle_exec_cnt;
+ uint64_t idle_exec, idle_exec_min, idle_exec_max;
+ uint64_t collection_time = rte_get_timer_cycles();
+
+ /* Ask forwarding thread to give us stats. */
+ rte_atomic16_set(&qconf->stats_read_pending, 1);
+ rte_spinlock_lock(&qconf->lock);
+ rte_atomic16_set(&qconf->stats_read_pending, 0);
+
+ /* Collect headroom statistics. */
+ stats_period = hdr->state_time - hdr->start_time;
+ loop_count = hdr->loop_cnt;
+
+ exec = hdr->exec_time;
+ exec_min = hdr->min_exec_time;
+ exec_max = hdr->max_exec_time;
+
+ management = hdr->management_time;
+ management_min = hdr->min_management_time;
+ management_max = hdr->max_management_time;
+
+ rte_headroom_reset_stats(hdr);
+
+ for (i = 0; i < port_cnt; i++) {
+ job = &qconf->port_fwd_jobs[i];
+
+ jobs_exec_cnt[i] = job->exec_cnt;
+ jobs_period[i] = job->period;
+
+ jobs_exec[i] = job->exec_time;
+ jobs_exec_min[i] = job->min_exec_time;
+ jobs_exec_max[i] = job->max_exec_time;
+
+ rte_headroom_reset_job_stats(job);
+ }
+
+ flush_exec_cnt = qconf->flush_job.exec_cnt;
+ flush_period = qconf->flush_job.period;
+ flush_exec = qconf->flush_job.exec_time;
+ flush_exec_min = qconf->flush_job.min_exec_time;
+ flush_exec_max = qconf->flush_job.max_exec_time;
+ rte_headroom_reset_job_stats(&qconf->flush_job);
+
+ idle_exec_cnt = qconf->idle_job.exec_cnt;
+ idle_exec = qconf->idle_job.exec_time;
+ idle_exec_min = qconf->idle_job.min_exec_time;
+ idle_exec_max = qconf->idle_job.max_exec_time;
+ rte_headroom_reset_job_stats(&qconf->idle_job);
+
+ rte_spinlock_unlock(&qconf->lock);
+
+ exec -= idle_exec;
+ busy = exec + management;
+ busy_min = exec_min + management_min;
+ busy_max = exec_max + management_max;
+
+
+ collection_time = rte_get_timer_cycles() - collection_time;
+
+#define STAT_FMT "\n%-18s %'14.0f %6.1f%% %'10.0f %'10.0f %'10.0f"
+
+ printf("\n----------------"
+ "\nLCore %3u: headroom statistics (time in ns, collected in %'9.0f)"
+ "\n%-18s %14s %7s %10s %10s %10s "
+ "\n%-18s %'14.0f"
+ "\n%-18s %'14" PRIu64
+ STAT_FMT /* Exec */
+ STAT_FMT /* Management */
+ STAT_FMT /* Busy */
+ STAT_FMT, /* Idle */
+ lcore_id, cycles_to_ns(collection_time),
+ "Stat type", "total", "%total", "avg", "min", "max",
+ "Stats duration:", cycles_to_ns(stats_period),
+ "Loop count:", loop_count,
+ "Exec time",
+ cycles_to_ns(exec), exec * 100.0 / stats_period ,
+ cycles_to_ns(loop_count ? exec / loop_count : 0),
+ cycles_to_ns(exec_min),
+ cycles_to_ns(exec_max),
+ "Management time",
+ cycles_to_ns(management), management * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? management / loop_count : 0),
+ cycles_to_ns(management_min),
+ cycles_to_ns(management_max),
+ "Exec + management",
+ cycles_to_ns(busy), busy * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? busy / loop_count : 0),
+ cycles_to_ns(busy_min),
+ cycles_to_ns(busy_max),
+ "Idle (job)",
+ cycles_to_ns(idle_exec), idle_exec * 100.0 / stats_period,
+ cycles_to_ns(idle_exec_cnt ? idle_exec / idle_exec_cnt : 0),
+ cycles_to_ns(idle_exec_min),
+ cycles_to_ns(idle_exec_max));
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+ job = &qconf->port_fwd_jobs[i];
+ printf("\n\nJob %" PRIu32 ": %-20s "
+ "\n%-18s %'14" PRIu64
+ "\n%-18s %'14.0f"
+ STAT_FMT,
+ i, job->name,
+ "Exec count:", jobs_exec_cnt[i],
+ "Exec period: ", cycles_to_ns(jobs_period[i]),
+ "Exec time",
+ cycles_to_ns(jobs_exec[i]), jobs_exec[i] * 100.0 / stats_period,
+ cycles_to_ns(jobs_exec_cnt[i] ? jobs_exec[i] / jobs_exec_cnt[i]
+ : 0),
+ cycles_to_ns(jobs_exec_min[i]),
+ cycles_to_ns(jobs_exec_max[i]));
+ }
+
+ if (qconf->n_rx_port > 0) {
+ job = &qconf->flush_job;
+ printf("\n\nJob %" PRIu32 ": %-20s "
+ "\n%-18s %'14" PRIu64
+ "\n%-18s %'14.0f"
+ STAT_FMT,
+ i, job->name,
+ "Exec count:", flush_exec_cnt,
+ "Exec period: ", cycles_to_ns(flush_period),
+ "Exec time",
+ cycles_to_ns(flush_exec), flush_exec * 100.0 / stats_period ,
+ cycles_to_ns(flush_exec_cnt ? flush_exec / flush_exec_cnt : 0),
+ cycles_to_ns(flush_exec_min),
+ cycles_to_ns(flush_exec_max));
+ }
+}
+
+/* Print out statistics on packets dropped */
+static void
+show_stats_cb(__rte_unused void *param)
+{
+ uint64_t total_packets_dropped, total_packets_tx, total_packets_rx;
+ unsigned portid, lcore_id;
+
+ total_packets_dropped = 0;
+ total_packets_tx = 0;
+ total_packets_rx = 0;
+
+ const char clr[] = { 27, '[', '2', 'J', '\0' };
+ const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
+
+ /* Clear screen and move to top left */
+ printf("%s%s"
+ "\nPort statistics ===================================",
+ clr, topLeft);
+
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ /* skip disabled ports */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+ printf("\nStatistics for port %u ------------------------------"
+ "\nPackets sent: %24"PRIu64
+ "\nPackets received: %20"PRIu64
+ "\nPackets dropped: %21"PRIu64,
+ portid,
+ port_statistics[portid].tx,
+ port_statistics[portid].rx,
+ port_statistics[portid].dropped);
+
+ total_packets_dropped += port_statistics[portid].dropped;
+ total_packets_tx += port_statistics[portid].tx;
+ total_packets_rx += port_statistics[portid].rx;
+ }
+
+ printf("\nAggregate statistics ==============================="
+ "\nTotal packets sent: %18"PRIu64
+ "\nTotal packets received: %14"PRIu64
+ "\nTotal packets dropped: %15"PRIu64
+ "\n====================================================",
+ total_packets_tx,
+ total_packets_rx,
+ total_packets_dropped);
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ if (lcore_queue_conf[lcore_id].n_rx_port > 0)
+ show_lcore_headroom_stats(lcore_id);
+ }
+
+ printf("\n====================================================\n");
+ rte_eal_alarm_set(timer_period * US_PER_S, show_stats_cb, NULL);
+}
+
+/* Send the burst of packets on an output interface */
+static void
+l2fwd_send_burst(struct lcore_queue_conf *qconf, uint8_t port)
+{
+ struct mbuf_table *m_table;
+ uint16_t ret;
+ uint16_t queueid = 0;
+ uint16_t n;
+
+ m_table = &qconf->tx_mbufs[port];
+ n = m_table->len;
+
+ m_table->next_flush_time = rte_get_timer_cycles() + drain_tsc;
+ m_table->len = 0;
+
+ ret = rte_eth_tx_burst(port, queueid, m_table->mbufs, n);
+
+ port_statistics[port].tx += ret;
+ if (unlikely(ret < n)) {
+ port_statistics[port].dropped += (n - ret);
+ do {
+ rte_pktmbuf_free(m_table->mbufs[ret]);
+ } while (++ret < n);
+ }
+}
+
+/* Enqueue packets for TX and prepare them to be sent */
+static int
+l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
+{
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct mbuf_table *m_table = &qconf->tx_mbufs[port];
+ uint16_t len = qconf->tx_mbufs[port].len;
+
+ m_table->mbufs[len] = m;
+
+ len++;
+ m_table->len = len;
+
+ /* Enough pkts to be sent. */
+ if (unlikely(len == MAX_PKT_BURST))
+ l2fwd_send_burst(qconf, port);
+
+ return 0;
+}
+
+static void
+l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
+{
+ struct ether_hdr *eth;
+ void *tmp;
+ unsigned dst_port;
+
+ dst_port = l2fwd_dst_ports[portid];
+ eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ /* 02:00:00:00:00:xx */
+ tmp = &eth->d_addr.addr_bytes[0];
+ *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40);
+
+ /* src addr */
+ ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], &eth->s_addr);
+
+ l2fwd_send_packet(m, (uint8_t) dst_port);
+}
+
+static void
+l2fwd_job_update_cb(struct rte_headroom_job *job, int64_t result)
+{
+ int64_t err = job->target - result;
+ int64_t hysteresis = job->target / 8;
+
+ if (err < -hysteresis) {
+ if (job->min_period + UPDATE_STEP_DOWN < job->period)
+ job->period -= UPDATE_STEP_DOWN;
+ } else if (err > hysteresis) {
+ if (job->period + UPDATE_STEP_UP < job->max_period)
+ job->period += UPDATE_STEP_UP;
+ }
+}
+
+static void
+l2fwd_fwd_job(__rte_unused struct rte_timer *timer, void *arg)
+{
+ struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+ struct rte_mbuf *m;
+
+ const uint8_t port_idx = (uintptr_t) arg;
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct rte_headroom_job *job = &qconf->port_fwd_jobs[port_idx];
+ const uint8_t portid = qconf->rx_port_list[port_idx];
+
+ uint8_t j;
+ uint16_t total_nb_rx;
+
+ rte_headroom_start_job(&qconf->headroom, job);
+
+ /* Call rx burst 2 times. This allows the headroom logic to see if this
+ * function must be called more frequently. */
+
+ total_nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ for (j = 0; j < total_nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+
+ if (total_nb_rx == MAX_PKT_BURST) {
+ const uint16_t nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ total_nb_rx += nb_rx;
+ for (j = 0; j < nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+ }
+
+ port_statistics[portid].rx += total_nb_rx;
+
+ /* Adjust period time in which we are running here. */
+ if (rte_headroom_finish_job(job, total_nb_rx) != 0) {
+ rte_timer_reset(&qconf->rx_timers[port_idx], job->period, PERIODICAL,
+ lcore_id, l2fwd_fwd_job, arg);
+ }
+}
+
+static void
+l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused void *arg)
+{
+ uint64_t now;
+ unsigned lcore_id;
+ struct lcore_queue_conf *qconf;
+ struct mbuf_table *m_table;
+ uint8_t portid;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_headroom_start_job(&qconf->headroom, &qconf->flush_job);
+
+ now = rte_get_timer_cycles();
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ m_table = &qconf->tx_mbufs[portid];
+ if (m_table->len == 0 || m_table->next_flush_time <= now)
+ continue;
+
+ l2fwd_send_burst(qconf, portid);
+ }
+
+
+ /* Pass target to indicate that this job is happy with the time interval
+ * in which it was called. */
+ rte_headroom_finish_job(&qconf->flush_job, qconf->flush_job.target);
+}
+
+/* main processing loop */
+static void
+l2fwd_main_loop(void)
+{
+ unsigned lcore_id;
+ unsigned i, portid;
+ struct lcore_queue_conf *qconf;
+ uint8_t stats_read_pending = 0;
+ uint8_t need_manage;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ if (qconf->n_rx_port == 0) {
+ RTE_LOG(INFO, L2FWD, "lcore %u has nothing to do\n", lcore_id);
+ return;
+ }
+
+ RTE_LOG(INFO, L2FWD, "entering main loop on lcore %u\n", lcore_id);
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+
+ portid = qconf->rx_port_list[i];
+ RTE_LOG(INFO, L2FWD, " -- lcoreid=%u portid=%u\n", lcore_id,
+ portid);
+ }
+
+ rte_headroom_job_init(&qconf->idle_job, "idle", 0, 0, 0, 0);
+
+ for (;;) {
+ rte_spinlock_lock(&qconf->lock);
+
+ do {
+ rte_headroom_start_loop(&qconf->headroom);
+
+ /* Do the Idle job:
+ * - Read stats_read_pending flag
+ * - check if some real job needs to be executed
+ */
+ rte_headroom_start_job(&qconf->headroom, &qconf->idle_job);
+
+ do {
+ uint8_t i;
+ uint64_t now = rte_get_timer_cycles();
+ need_manage = qconf->flush_timer.expire < now;
+ /* Check if we were asked to provide stats. */
+ stats_read_pending =
+ rte_atomic16_read(&qconf->stats_read_pending);
+ need_manage |= stats_read_pending;
+
+ for (i = 0; i < qconf->n_rx_port && !need_manage; i++)
+ need_manage = qconf->rx_timers[i].expire < now;
+
+ } while (!need_manage);
+ rte_headroom_finish_job(&qconf->idle_job, qconf->idle_job.target);
+
+ rte_timer_manage();
+ rte_headroom_finish_loop(&qconf->headroom);
+ } while (likely(stats_read_pending == 0));
+
+ rte_spinlock_unlock(&qconf->lock);
+ rte_pause();
+ }
+}
+
+static int
+l2fwd_launch_one_lcore(__attribute__((unused)) void *dummy)
+{
+ l2fwd_main_loop();
+ return 0;
+}
+
+/* display usage */
+static void
+l2fwd_usage(const char *prgname)
+{
+ printf("%s [EAL options] -- -p PORTMASK [-q NQ]\n"
+ " -p PORTMASK: hexadecimal bitmask of ports to configure\n"
+ " -q NQ: number of queue (=ports) per lcore (default is 1)\n"
+ " -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default, 86400 maximum)\n"
+ " -l set system default locale instead of default (\"C\" locale) for thousands separator in stats.",
+ prgname);
+}
+
+static int
+l2fwd_parse_portmask(const char *portmask)
+{
+ char *end = NULL;
+ unsigned long pm;
+
+ /* parse hexadecimal string */
+ pm = strtoul(portmask, &end, 16);
+ if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+
+ if (pm == 0)
+ return -1;
+
+ return pm;
+}
+
+static unsigned int
+l2fwd_parse_nqueue(const char *q_arg)
+{
+ char *end = NULL;
+ unsigned long n;
+
+ /* parse decimal string */
+ n = strtoul(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return 0;
+ if (n == 0)
+ return 0;
+ if (n >= MAX_RX_QUEUE_PER_LCORE)
+ return 0;
+
+ return n;
+}
+
+static int
+l2fwd_parse_timer_period(const char *q_arg)
+{
+ char *end = NULL;
+ int n;
+
+ /* parse number string */
+ n = strtol(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+ if (n >= MAX_TIMER_PERIOD)
+ return -1;
+
+ return n;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+l2fwd_parse_args(int argc, char **argv)
+{
+ int opt, ret;
+ char **argvopt;
+ int option_index;
+ char *prgname = argv[0];
+ static struct option lgopts[] = {
+ {NULL, 0, 0, 0}
+ };
+
+ argvopt = argv;
+
+ while ((opt = getopt_long(argc, argvopt, "p:q:T:l",
+ lgopts, &option_index)) != EOF) {
+
+ switch (opt) {
+ /* portmask */
+ case 'p':
+ l2fwd_enabled_port_mask = l2fwd_parse_portmask(optarg);
+ if (l2fwd_enabled_port_mask == 0) {
+ printf("invalid portmask\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* nqueue */
+ case 'q':
+ l2fwd_rx_queue_per_lcore = l2fwd_parse_nqueue(optarg);
+ if (l2fwd_rx_queue_per_lcore == 0) {
+ printf("invalid queue number\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* timer period */
+ case 'T':
+ timer_period = l2fwd_parse_timer_period(optarg);
+ if (timer_period < 0) {
+ printf("invalid timer period\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* For thousands separator in printf. */
+ case 'l':
+ setlocale(LC_ALL, "");
+ break;
+
+ /* long options */
+ case 0:
+ l2fwd_usage(prgname);
+ return -1;
+
+ default:
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ }
+
+ if (optind >= 0)
+ argv[optind-1] = prgname;
+
+ ret = optind-1;
+ optind = 0; /* reset getopt lib */
+ return ret;
+}
+
+/* Check the link status of all ports in up to 9s, and print them finally */
+static void
+check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+ uint8_t portid, count, all_ports_up, print_flag = 0;
+ struct rte_eth_link link;
+
+ printf("\nChecking link status");
+ fflush(stdout);
+ for (count = 0; count <= MAX_CHECK_TIME; count++) {
+ all_ports_up = 1;
+ for (portid = 0; portid < port_num; portid++) {
+ if ((port_mask & (1 << portid)) == 0)
+ continue;
+ memset(&link, 0, sizeof(link));
+ rte_eth_link_get_nowait(portid, &link);
+ /* print link status if flag set */
+ if (print_flag == 1) {
+ if (link.link_status)
+ printf("Port %d Link Up - speed %u "
+ "Mbps - %s\n", (uint8_t)portid,
+ (unsigned)link.link_speed,
+ (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+ ("full-duplex") : ("half-duplex\n"));
+ else
+ printf("Port %d Link Down\n",
+ (uint8_t)portid);
+ continue;
+ }
+ /* clear all_ports_up flag if any link down */
+ if (link.link_status == 0) {
+ all_ports_up = 0;
+ break;
+ }
+ }
+ /* after finally printing all link status, get out */
+ if (print_flag == 1)
+ break;
+
+ if (all_ports_up == 0) {
+ printf(".");
+ fflush(stdout);
+ rte_delay_ms(CHECK_INTERVAL);
+ }
+
+ /* set the print_flag if all ports up or timeout */
+ if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+ print_flag = 1;
+ printf("done\n");
+ }
+ }
+}
+
+int
+main(int argc, char **argv)
+{
+ struct lcore_queue_conf *qconf;
+ struct rte_eth_dev_info dev_info;
+ unsigned lcore_id, rx_lcore_id;
+ unsigned nb_ports_in_mask = 0;
+ int ret;
+ char name[RTE_HEADROOM_JOB_NAMESIZE];
+ uint8_t nb_ports;
+ uint8_t nb_ports_available;
+ uint8_t portid, last_port;
+ uint8_t i;
+
+ /* init EAL */
+ ret = rte_eal_init(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+ argc -= ret;
+ argv += ret;
+
+ /* parse application arguments (after the EAL ones) */
+ ret = l2fwd_parse_args(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
+
+ rte_timer_subsystem_init();
+
+ /* fetch default timer frequency. */
+ hz = rte_get_timer_hz();
+
+ /* create the mbuf pool */
+ l2fwd_pktmbuf_pool =
+ rte_mempool_create("mbuf_pool", NB_MBUF,
+ MBUF_SIZE, 32,
+ sizeof(struct rte_pktmbuf_pool_private),
+ rte_pktmbuf_pool_init, NULL,
+ rte_pktmbuf_init, NULL,
+ rte_socket_id(), 0);
+ if (l2fwd_pktmbuf_pool == NULL)
+ rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+ nb_ports = rte_eth_dev_count();
+ if (nb_ports == 0)
+ rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
+
+ if (nb_ports > RTE_MAX_ETHPORTS)
+ nb_ports = RTE_MAX_ETHPORTS;
+
+ /* reset l2fwd_dst_ports */
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
+ l2fwd_dst_ports[portid] = 0;
+ last_port = 0;
+
+ /*
+ * Each logical core is assigned a dedicated TX queue on each port.
+ */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ if (nb_ports_in_mask % 2) {
+ l2fwd_dst_ports[portid] = last_port;
+ l2fwd_dst_ports[last_port] = portid;
+ } else
+ last_port = portid;
+
+ nb_ports_in_mask++;
+
+ rte_eth_dev_info_get(portid, &dev_info);
+ }
+ if (nb_ports_in_mask % 2) {
+ printf("Notice: odd number of ports in portmask.\n");
+ l2fwd_dst_ports[last_port] = last_port;
+ }
+
+ rx_lcore_id = 0;
+ qconf = NULL;
+
+ /* Initialize the port/queue configuration of each logical core */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ /* get the lcore_id for this port */
+ while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
+ lcore_queue_conf[rx_lcore_id].n_rx_port ==
+ l2fwd_rx_queue_per_lcore) {
+ rx_lcore_id++;
+ if (rx_lcore_id >= RTE_MAX_LCORE)
+ rte_exit(EXIT_FAILURE, "Not enough cores\n");
+ }
+
+ if (qconf != &lcore_queue_conf[rx_lcore_id])
+ /* Assigned a new logical core in the loop above. */
+ qconf = &lcore_queue_conf[rx_lcore_id];
+
+ qconf->rx_port_list[qconf->n_rx_port] = portid;
+ qconf->n_rx_port++;
+ printf("Lcore %u: RX port %u\n", rx_lcore_id, (unsigned) portid);
+ }
+
+ nb_ports_available = nb_ports;
+
+ /* Initialise each port */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0) {
+ printf("Skipping disabled port %u\n", (unsigned) portid);
+ nb_ports_available--;
+ continue;
+ }
+ /* init port */
+ printf("Initializing port %u... ", (unsigned) portid);
+ fflush(stdout);
+ ret = rte_eth_dev_configure(portid, 1, 1, &port_conf);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ rte_eth_macaddr_get(portid, &l2fwd_ports_eth_addr[portid]);
+
+ /* init one RX queue */
+ fflush(stdout);
+ ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+ rte_eth_dev_socket_id(portid),
+ NULL,
+ l2fwd_pktmbuf_pool);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* init one TX queue on each port */
+ fflush(stdout);
+ ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+ rte_eth_dev_socket_id(portid),
+ NULL);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* Start device */
+ ret = rte_eth_dev_start(portid);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ printf("done:\n");
+
+ rte_eth_promiscuous_enable(portid);
+
+ printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
+ (unsigned) portid,
+ l2fwd_ports_eth_addr[portid].addr_bytes[0],
+ l2fwd_ports_eth_addr[portid].addr_bytes[1],
+ l2fwd_ports_eth_addr[portid].addr_bytes[2],
+ l2fwd_ports_eth_addr[portid].addr_bytes[3],
+ l2fwd_ports_eth_addr[portid].addr_bytes[4],
+ l2fwd_ports_eth_addr[portid].addr_bytes[5]);
+
+ /* initialize port stats */
+ memset(&port_statistics, 0, sizeof(port_statistics));
+ }
+
+ if (!nb_ports_available) {
+ rte_exit(EXIT_FAILURE,
+ "All available ports are disabled. Please set portmask.\n");
+ }
+
+ check_all_ports_link_status(nb_ports, l2fwd_enabled_port_mask);
+
+ drain_tsc = (hz + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_spinlock_init(&qconf->lock);
+
+ if (rte_headroom_init(&qconf->headroom) != 0)
+ rte_panic("Headroom for core %u init failed\n", lcore_id);
+
+ if (qconf->n_rx_port == 0) {
+ RTE_LOG(INFO, L2FWD,
+ "lcore %u: no ports so no headroom initialization\n",
+ lcore_id);
+ continue;
+ }
+ /* Add flush job.
+ * Set fixed period by setting min = max = initial period. Set target to
+ * zero as it is irrelevant for this job. */
+ rte_headroom_job_init(&qconf->flush_job, "flush", drain_tsc, drain_tsc,
+ drain_tsc, 0);
+
+ rte_timer_init(&qconf->flush_timer);
+ rte_timer_reset(&qconf->flush_timer, drain_tsc, PERIODICAL, lcore_id,
+ &l2fwd_flush_job, NULL);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to add flush job for lcore %u: %s",
+ lcore_id, rte_strerror(-ret));
+ }
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+ struct rte_headroom_job *job = &qconf->port_fwd_jobs[i];
+
+ portid = qconf->rx_port_list[i];
+ printf("Setting forward job for port %u\n", portid);
+
+ snprintf(name, RTE_DIM(name), "port %u fwd", portid);
+ /* Setup forward job.
+ * Set min, max and initial period. Set target to MAX_PKT_BURST as
+ * this is desired optimal RX/TX burst size. */
+ rte_headroom_job_init(job, name, 0, drain_tsc, 0, MAX_PKT_BURST);
+ rte_headroom_set_update_period_function(job, l2fwd_job_update_cb);
+
+ rte_timer_init(&qconf->rx_timers[i]);
+ rte_timer_reset(&qconf->rx_timers[i], 0, PERIODICAL, lcore_id,
+ &l2fwd_fwd_job, (void *)(uintptr_t)i);
+ }
+ }
+
+ if (timer_period)
+ rte_eal_alarm_set(timer_period * US_PER_S, show_stats_cb, NULL);
+ else
+ RTE_LOG(INFO, L2FWD, "Stats display disabled\n");
+
+ /* launch per-lcore init on every lcore */
+ rte_eal_mp_remote_launch(l2fwd_launch_one_lcore, NULL, CALL_MASTER);
+ RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+ if (rte_eal_wait_lcore(lcore_id) < 0)
+ return -1;
+ }
+
+ return 0;
+}
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 334cb25..3db7222 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -103,6 +103,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_HASH),y)
LDLIBS += -lrte_hash
endif
+ifeq ($(CONFIG_RTE_LIBRTE_HEADROOM),y)
+LDLIBS += -lrte_headroom
+endif
+
ifeq ($(CONFIG_RTE_LIBRTE_LPM),y)
LDLIBS += -lrte_lpm
endif
--
1.7.9.5
* [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 " Pawel Wodkowski
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 2/2] examples: introduce new l2fwd-headroom example Pawel Wodkowski
@ 2015-02-17 16:19 ` Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 1/2] pmd: enable DCB in SRIOV Pawel Wodkowski
` (3 more replies)
2 siblings, 4 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-17 16:19 UTC (permalink / raw)
To: dev
Hi community,
I would like to introduce a library for measuring the load of arbitrary jobs. It
can be used to profile any kind of job set on any execution unit or
tasking library.
In the provided l2fwd-headroom example I demonstrate how to use this library to
select the optimal rx burst poll time. Jobs are selected by using existing
rte_timer library calls. This example does not limit the possible schemes in
which this library can be used.
PATCH v3 changes:
- spelling fixes.
PATCH v2 changes:
- Remove jobs management/callback from library to not duplicate tasking library
behaviour.
- Cleanup/remove useless statistics.
- Rework example application to use rte_timer library for jobs selection.
- Introduce new app parameter '-l' for automatic thousands separating in stats.
- More readable statistics format.
Pawel Wodkowski (2):
pmd: enable DCB in SRIOV
tespmd: fix DCB in SRIOV mode support
app/test-pmd/cmdline.c | 4 ++--
app/test-pmd/testpmd.c | 39 +++++++++++++++++++++++++++----------
app/test-pmd/testpmd.h | 10 ----------
lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 2 +-
lib/librte_pmd_ixgbe/ixgbe_pf.c | 19 +++++++++---------
lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 7 +++----
6 files changed, 45 insertions(+), 36 deletions(-)
--
1.9.1
* [dpdk-dev] [PATCH v3 1/2] pmd: enable DCB in SRIOV
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Pawel Wodkowski
@ 2015-02-17 16:19 ` Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 2/2] tespmd: fix DCB in SRIOV mode support Pawel Wodkowski
` (2 subsequent siblings)
3 siblings, 0 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-17 16:19 UTC (permalink / raw)
To: dev
This patch enables DCB in SRIOV mode for ixgbe (Niantic) driver.
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 2 +-
lib/librte_pmd_ixgbe/ixgbe_pf.c | 19 ++++++++++---------
lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 7 +++----
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index 412bab2..7e7434d 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -1514,7 +1514,7 @@ ixgbe_dev_configure(struct rte_eth_dev *dev)
if (conf->nb_queue_pools != ETH_16_POOLS &&
conf->nb_queue_pools != ETH_32_POOLS) {
PMD_INIT_LOG(ERR, " VMDQ+DCB selected, "
- "number of TX qqueue pools must be %d or %d\n",
+ "number of TX queue pools must be %d or %d\n",
ETH_16_POOLS, ETH_32_POOLS);
return (-EINVAL);
}
diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c
index 255c996..8411445 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
@@ -137,7 +137,7 @@ int ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
/*
- * Functin that make SRIOV configuration, based on device configuration,
+ * Function that make SRIOV configuration, based on device configuration,
* number of requested queues and number of VF created.
* Function returns:
* 1 - SRIOV is not enabled (no VF created)
@@ -191,7 +191,7 @@ ixgbe_pf_configure_mq_sriov(struct rte_eth_dev *dev)
break;
case ETH_MQ_RX_RSS:
PMD_INIT_LOG(INFO, " RSS (SRIOV active) mode, "
- "Rx mq mode is changed from:"
+ "Rx mq mode is changed from "
"mq_mode %u into VMDQ mq_mode %u\n",
dev_conf->rxmode.mq_mode,
dev->data->dev_conf.rxmode.mq_mode);
@@ -295,7 +295,7 @@ ixgbe_pf_configure_mq_sriov(struct rte_eth_dev *dev)
/* Check if available queus count is not less than allocated.*/
if (dev->data->nb_rx_queues > sriov->nb_rx_q_per_pool ||
- dev->data->nb_rx_queues > sriov->nb_tx_q_per_pool) {
+ dev->data->nb_tx_queues > sriov->nb_tx_q_per_pool) {
PMD_INIT_LOG(ERR, "SRIOV active, "
"rx/tx queue number must less or equal to %d/%d\n",
sriov->nb_rx_q_per_pool, sriov->nb_tx_q_per_pool);
@@ -305,7 +305,6 @@ ixgbe_pf_configure_mq_sriov(struct rte_eth_dev *dev)
return 0;
}
-
int ixgbe_pf_host_configure(struct rte_eth_dev *eth_dev)
{
uint32_t vtctl, fcrth;
@@ -659,7 +658,9 @@ ixgbe_get_vf_queues(struct rte_eth_dev *dev, uint32_t vf, uint32_t *msgbuf)
{
struct ixgbe_vf_info *vfinfo =
*IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
- uint32_t default_q = vf * RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool;
+ struct ixgbe_dcb_config *dcbinfo =
+ IXGBE_DEV_PRIVATE_TO_DCB_CFG(dev->data->dev_private);
+ uint32_t default_q = RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx;
/* Verify if the PF supports the mbox APIs version or not */
switch (vfinfo[vf].api_version) {
@@ -677,10 +678,10 @@ ixgbe_get_vf_queues(struct rte_eth_dev *dev, uint32_t vf, uint32_t *msgbuf)
/* Notify VF of default queue */
msgbuf[IXGBE_VF_DEF_QUEUE] = default_q;
- /*
- * FIX ME if it needs fill msgbuf[IXGBE_VF_TRANS_VLAN]
- * for VLAN strip or VMDQ_DCB or VMDQ_DCB_RSS
- */
+ if (dcbinfo->num_tcs.pg_tcs)
+ msgbuf[IXGBE_VF_TRANS_VLAN] = dcbinfo->num_tcs.pg_tcs;
+ else
+ msgbuf[IXGBE_VF_TRANS_VLAN] = 1;
return 0;
}
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index e6766b3..f845bb0 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3166,10 +3166,9 @@ void ixgbe_configure_dcb(struct rte_eth_dev *dev)
/* check support mq_mode for DCB */
if ((dev_conf->rxmode.mq_mode != ETH_MQ_RX_VMDQ_DCB) &&
- (dev_conf->rxmode.mq_mode != ETH_MQ_RX_DCB))
- return;
-
- if (dev->data->nb_rx_queues != ETH_DCB_NUM_QUEUES)
+ (dev_conf->rxmode.mq_mode != ETH_MQ_RX_DCB) &&
+ (dev_conf->txmode.mq_mode != ETH_MQ_TX_VMDQ_DCB) &&
+ (dev_conf->txmode.mq_mode != ETH_MQ_TX_DCB))
return;
/** Configure DCB hardware **/
--
1.9.1
* [dpdk-dev] [PATCH v3 2/2] tespmd: fix DCB in SRIOV mode support
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 1/2] pmd: enable DCB in SRIOV Pawel Wodkowski
@ 2015-02-17 16:19 ` Pawel Wodkowski
2015-02-17 16:33 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Wodkowski, PawelX
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 " Pawel Wodkowski
3 siblings, 0 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-17 16:19 UTC (permalink / raw)
To: dev
This patch incorporates fixes to support DCB in SRIOV mode in testpmd.
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
app/test-pmd/cmdline.c | 4 ++--
app/test-pmd/testpmd.c | 39 +++++++++++++++++++++++++++++----------
app/test-pmd/testpmd.h | 10 ----------
3 files changed, 31 insertions(+), 22 deletions(-)
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4beb404..eb9877e 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1942,9 +1942,9 @@ cmd_config_dcb_parsed(void *parsed_result,
/* DCB in VT mode */
if (!strncmp(res->vt_en, "on",2))
- dcb_conf.dcb_mode = DCB_VT_ENABLED;
+ dcb_conf.vt_en = 1;
else
- dcb_conf.dcb_mode = DCB_ENABLED;
+ dcb_conf.vt_en = 0;
if (!strncmp(res->pfc_en, "on",2)) {
dcb_conf.pfc_en = 1;
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 773b8af..9b12c25 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1743,7 +1743,8 @@ const uint16_t vlan_tags[] = {
};
static int
-get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct dcb_config *dcb_conf)
+get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct dcb_config *dcb_conf,
+ uint16_t sriov)
{
uint8_t i;
@@ -1751,7 +1752,7 @@ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct dcb_config *dcb_conf)
* Builds up the correct configuration for dcb+vt based on the vlan tags array
* given above, and the number of traffic classes available for use.
*/
- if (dcb_conf->dcb_mode == DCB_VT_ENABLED) {
+ if (dcb_conf->vt_en == 1) {
struct rte_eth_vmdq_dcb_conf vmdq_rx_conf;
struct rte_eth_vmdq_dcb_tx_conf vmdq_tx_conf;
@@ -1768,9 +1769,17 @@ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct dcb_config *dcb_conf)
vmdq_rx_conf.pool_map[i].vlan_id = vlan_tags[ i ];
vmdq_rx_conf.pool_map[i].pools = 1 << (i % vmdq_rx_conf.nb_queue_pools);
}
- for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++) {
- vmdq_rx_conf.dcb_queue[i] = i;
- vmdq_tx_conf.dcb_queue[i] = i;
+
+ if (sriov == 0) {
+ for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++) {
+ vmdq_rx_conf.dcb_queue[i] = i;
+ vmdq_tx_conf.dcb_queue[i] = i;
+ }
+ } else {
+ for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++) {
+ vmdq_rx_conf.dcb_queue[i] = i % dcb_conf->num_tcs;
+ vmdq_tx_conf.dcb_queue[i] = i % dcb_conf->num_tcs;
+ }
}
/*set DCB mode of RX and TX of multiple queues*/
@@ -1828,22 +1837,32 @@ init_port_dcb_config(portid_t pid,struct dcb_config *dcb_conf)
uint16_t nb_vlan;
uint16_t i;
- /* rxq and txq configuration in dcb mode */
- nb_rxq = 128;
- nb_txq = 128;
rx_free_thresh = 64;
+ rte_port = &ports[pid];
memset(&port_conf,0,sizeof(struct rte_eth_conf));
/* Enter DCB configuration status */
dcb_config = 1;
nb_vlan = sizeof( vlan_tags )/sizeof( vlan_tags[ 0 ]);
/*set configuration of DCB in vt mode and DCB in non-vt mode*/
- retval = get_eth_dcb_conf(&port_conf, dcb_conf);
+ retval = get_eth_dcb_conf(&port_conf, dcb_conf, rte_port->dev_info.max_vfs);
+
+ /* rxq and txq configuration in dcb mode */
+ nb_rxq = rte_port->dev_info.max_rx_queues;
+ nb_txq = rte_port->dev_info.max_tx_queues;
+
+ if (rte_port->dev_info.max_vfs) {
+ if (port_conf.rxmode.mq_mode == ETH_MQ_RX_VMDQ_DCB)
+ nb_rxq /= port_conf.rx_adv_conf.vmdq_dcb_conf.nb_queue_pools;
+
+ if (port_conf.txmode.mq_mode == ETH_MQ_TX_VMDQ_DCB)
+ nb_txq /= port_conf.tx_adv_conf.vmdq_dcb_tx_conf.nb_queue_pools;
+ }
+
if (retval < 0)
return retval;
- rte_port = &ports[pid];
memcpy(&rte_port->dev_conf, &port_conf,sizeof(struct rte_eth_conf));
rte_port->rx_conf.rx_thresh = rx_thresh;
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 8f5e6c7..695e893 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -227,20 +227,10 @@ struct fwd_config {
portid_t nb_fwd_ports; /**< Nb. of ports involved. */
};
-/**
- * DCB mode enable
- */
-enum dcb_mode_enable
-{
- DCB_VT_ENABLED,
- DCB_ENABLED
-};
-
/*
* DCB general config info
*/
struct dcb_config {
- enum dcb_mode_enable dcb_mode;
uint8_t vt_en;
enum rte_eth_nb_tcs num_tcs;
uint8_t pfc_en;
--
1.9.1
* Re: [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 1/2] pmd: enable DCB in SRIOV Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 2/2] tespmd: fix DCB in SRIOV mode support Pawel Wodkowski
@ 2015-02-17 16:33 ` Wodkowski, PawelX
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 " Pawel Wodkowski
3 siblings, 0 replies; 48+ messages in thread
From: Wodkowski, PawelX @ 2015-02-17 16:33 UTC (permalink / raw)
To: Wodkowski, PawelX, dev
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Pawel Wodkowski
> Sent: Tuesday, February 17, 2015 5:20 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example
> application
>
> Hi community,
> I would like to introduce library for measuring load of some arbitrary jobs. It
> can be used to profile every kind of job sets on any arbitrary execution unit or
> tasking library.
>
> In provided l2fwd-headroom example I demonstrate how to use this library to
> select optimal rx burst poll time. Jobs are selected by using existing rte_timer
> library calls. This example does no limit possible schemes on which this library
> can be used.
>
> PATCH v3 changes:
> - spelling fixes.
>
> PATCH v2 changes:
> - Remove jobs management/callback from library to not duplicate tasking
> library
> behaviour.
> - Cleenup/remove useless statistics.
> - Rework example application to use rte_timer library for jobs selection.
> - Introduce new app parameter '-l' for automatic thousands separating in stats.
> - More readable statistics format.
>
>
> Pawel Wodkowski (2):
> pmd: enable DCB in SRIOV
> tespmd: fix DCB in SRIOV mode support
>
> app/test-pmd/cmdline.c | 4 ++--
> app/test-pmd/testpmd.c | 39 +++++++++++++++++++++++++++----------
> app/test-pmd/testpmd.h | 10 ----------
> lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 2 +-
> lib/librte_pmd_ixgbe/ixgbe_pf.c | 19 +++++++++---------
> lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 7 +++----
> 6 files changed, 45 insertions(+), 36 deletions(-)
>
> --
> 1.9.1
Not this branch :(
Self-NACK
* [dpdk-dev] [PATCH v4 0/2] new headroom stats library and example application
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Pawel Wodkowski
` (2 preceding siblings ...)
2015-02-17 16:33 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Wodkowski, PawelX
@ 2015-02-17 16:42 ` Pawel Wodkowski
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
` (2 more replies)
3 siblings, 3 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-17 16:42 UTC (permalink / raw)
To: dev
Hi community,
I would like to introduce a library for measuring the load of arbitrary jobs. It
can be used to profile any kind of job set on any execution unit or
tasking library.
In the provided l2fwd-headroom example I demonstrate how to use this library to
select the optimal rx burst poll time. Jobs are selected by using existing
rte_timer library calls. This example does not limit the possible schemes in
which this library can be used.
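As an illustration of how the rx burst poll time is selected, the following is
a minimal standalone sketch of the period adjustment done by
l2fwd_job_update_cb() in the example application. It is not the library code:
struct job_sketch, adjust_period() and the SKETCH_STEP_* values are
hypothetical stand-ins chosen for this sketch, and only the field names mirror
those used in the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for the job state; not the real rte_headroom_job. */
struct job_sketch {
	uint64_t period;     /* current poll period, in timer cycles */
	uint64_t min_period; /* lower bound for the period */
	uint64_t max_period; /* upper bound for the period */
	int64_t target;      /* desired result per call, e.g. burst size */
};

/* Assumed step sizes, for illustration only. */
#define SKETCH_STEP_UP   1
#define SKETCH_STEP_DOWN 32

/* Shorten the period when the job reports clearly more work than the target,
 * lengthen it when it reports clearly less. Results within target/8 of the
 * target leave the period unchanged (hysteresis band). */
static void
adjust_period(struct job_sketch *job, int64_t result)
{
	const int64_t err = job->target - result;
	const int64_t hysteresis = job->target / 8;

	if (err < -hysteresis) {
		if (job->min_period + SKETCH_STEP_DOWN < job->period)
			job->period -= SKETCH_STEP_DOWN;
	} else if (err > hysteresis) {
		if (job->period + SKETCH_STEP_UP < job->max_period)
			job->period += SKETCH_STEP_UP;
	}
}
```

With a target of 32 packets, a call that received 64 packets shortens the
period by one down step, while a call that received nothing lengthens it by
one up step. The asymmetric steps are a design choice of this sketch: the job
backs off slowly but reacts quickly when traffic arrives.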
PATCH v4 changes:
- use proper branch for generating patch.
PATCH v3 changes:
- Fix spelling.
PATCH v2 changes:
- Remove jobs management/callback from library to not duplicate tasking library
behaviour.
- Cleanup/remove useless statistics.
- Rework example application to use rte_timer library for jobs selection.
- Introduce new app parameter '-l' for automatic thousands separating in stats.
- More readable statistics format.
Pawel Wodkowski (2):
librte_headroom: New library for checking core/system/app load
examples: introduce new l2fwd-headroom example
config/common_bsdapp | 5 +
config/common_linuxapp | 5 +
examples/Makefile | 1 +
examples/l2fwd-headroom/Makefile | 51 ++
examples/l2fwd-headroom/main.c | 1039 ++++++++++++++++++++++++++
lib/Makefile | 1 +
lib/librte_headroom/Makefile | 54 ++
lib/librte_headroom/rte_headroom.c | 271 +++++++
lib/librte_headroom/rte_headroom.h | 324 ++++++++
lib/librte_headroom/rte_headroom_version.map | 20 +
mk/rte.app.mk | 4 +
11 files changed, 1775 insertions(+)
create mode 100644 examples/l2fwd-headroom/Makefile
create mode 100644 examples/l2fwd-headroom/main.c
create mode 100644 lib/librte_headroom/Makefile
create mode 100644 lib/librte_headroom/rte_headroom.c
create mode 100644 lib/librte_headroom/rte_headroom.h
create mode 100644 lib/librte_headroom/rte_headroom_version.map
--
1.9.1
^ permalink raw reply [flat|nested] 48+ messages in thread
* [dpdk-dev] [PATCH v4 1/2] librte_headroom: New library for checking core/system/app load
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 " Pawel Wodkowski
@ 2015-02-17 16:42 ` Pawel Wodkowski
2015-02-18 13:36 ` De Lara Guarch, Pablo
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 2/2] examples: introduce new l2fwd-headroom example Pawel Wodkowski
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Pawel Wodkowski
2 siblings, 1 reply; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-17 16:42 UTC (permalink / raw)
To: dev
This library provides an API to measure the time spent in particular parts
of code and to calculate the optimal polling time.
To calculate those statistics, application code needs to be divided into
parts (called jobs) that do something. It is up to the application to
decide what is considered a job.
A series of jobs must be surrounded with rte_headroom_start_loop() and
rte_headroom_finish_loop() calls. Once the loop is started, jobs can be
started. Each job must be surrounded with rte_headroom_start_job() and
rte_headroom_finish_job() calls.
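The bracketing described above can be sketched without DPDK as follows.
This is only an illustration: the sketch_* names and struct sketch_loop
are hypothetical stand-ins, and timestamps are passed in by the caller
instead of being read with rte_get_timer_cycles(). It shows how a single
"last state change" timestamp is enough to split elapsed time into
execution and management time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_headroom; not the library type. */
struct sketch_loop {
    uint64_t state_time;      /* timestamp of the last state change */
    uint64_t exec_time;       /* cycles spent inside jobs */
    uint64_t management_time; /* cycles spent between jobs */
    uint64_t loop_jobs;       /* jobs finished in the current loop */
    uint64_t loop_cnt;        /* loops that ran at least one job */
};

/* Charge the time since the last state change to management. */
static void
sketch_charge_management(struct sketch_loop *l, uint64_t now)
{
    assert(now >= l->state_time);
    l->management_time += now - l->state_time;
    l->state_time = now;
}

static void
sketch_start_loop(struct sketch_loop *l, uint64_t now)
{
    l->loop_jobs = 0;
    sketch_charge_management(l, now);
}

static void
sketch_start_job(struct sketch_loop *l, uint64_t now)
{
    sketch_charge_management(l, now);
}

/* Charge the time since the job started to execution. */
static void
sketch_finish_job(struct sketch_loop *l, uint64_t now)
{
    assert(now >= l->state_time);
    l->exec_time += now - l->state_time;
    l->state_time = now;
    l->loop_jobs++;
}

static void
sketch_finish_loop(struct sketch_loop *l, uint64_t now)
{
    /* Only loops that executed at least one job are counted. */
    if (l->loop_jobs != 0)
        l->loop_cnt++;
    sketch_charge_management(l, now);
}
```

Everything between a start_job and the matching finish_job is counted as
execution; every other interval, including the calls into the library
itself, ends up in management time.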
After a job finishes its execution, the period in which it should be
called again is adjusted to minimize time wasted on unnecessary
polls/calls. The adjustment is based on data provided by the job itself
(e.g. the number of packets it processed).
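A minimal standalone sketch of this adjustment, modelled on the default
update function in this patch (step up by 1, step down by 4). The
sketch_* names and struct sketch_job are hypothetical and carry only the
fields needed here; the return value is added for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_STEP_UP   1
#define SKETCH_STEP_DOWN 4

/* Hypothetical stand-in for the period-related fields of a job. */
struct sketch_job {
    uint64_t period;     /* current period between executions */
    uint64_t min_period; /* lower bound for the period */
    uint64_t max_period; /* upper bound for the period */
    int64_t target;      /* value the job would like to report */
};

/*
 * Adjust the period from the value the job reported.
 * Returns 0 if the job hit its target, 1 if an adjustment was attempted.
 */
static int
sketch_update_period(struct sketch_job *job, int64_t result)
{
    int64_t err = job->target - result;

    assert(job->min_period <= job->max_period);

    if (err == 0)
        return 0;

    if (err > 0) {
        /* Result below target: run the job less often. */
        if (job->period + SKETCH_STEP_UP < job->max_period)
            job->period += SKETCH_STEP_UP;
    } else {
        /* Result above target: run the job more often. */
        if (job->min_period + SKETCH_STEP_DOWN < job->period)
            job->period -= SKETCH_STEP_DOWN;
    }
    return 1;
}
```

The asymmetric steps make the period back off slowly when a poll comes
up short and shrink quickly when a poll returns more than the target.
When a step would reach or cross a bound, the period is left unchanged
rather than clamped.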
After all jobs in a series are executed, the following statistics are
updated and may be used by the application. Statistics can be reset. Some
of the provided statistics:
- total/min/max execution - time spent executing jobs.
- total/min/max management - time spent outside the execution area. This
value may be used to measure the overhead of scheduling jobs. This time
also contains the overhead of the headroom library itself.
- number of loops that executed at least one job
- executed jobs
- time when statistics were reset.
Each job provides total/min/max execution time and execution count
statistics.
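The total/min/max bookkeeping can be sketched as below. The struct and
function names are hypothetical; the library itself keeps separate
fields per statistic and updates them with macros, but the idea is the
same: a reset sets the minimum to UINT64_MAX so that the first sample
always replaces it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one total/min/max statistic. */
struct sketch_time_stat {
    uint64_t total; /* sum of all samples */
    uint64_t min;   /* smallest sample seen since reset */
    uint64_t max;   /* largest sample seen since reset */
    uint64_t count; /* number of samples since reset */
};

static void
sketch_stat_reset(struct sketch_time_stat *s)
{
    assert(s != NULL);
    s->total = 0;
    s->min = UINT64_MAX; /* any first sample is smaller or equal */
    s->max = 0;
    s->count = 0;
}

static void
sketch_stat_add(struct sketch_time_stat *s, uint64_t sample)
{
    assert(s != NULL);
    s->total += sample;
    if (sample < s->min)
        s->min = sample;
    if (sample > s->max)
        s->max = sample;
    s->count++;
}
```

A reader of these statistics should treat a minimum of UINT64_MAX as "no
sample yet", which is what a freshly reset object reports.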
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
config/common_bsdapp | 5 +
config/common_linuxapp | 5 +
lib/Makefile | 1 +
lib/librte_headroom/Makefile | 54 +++++
lib/librte_headroom/rte_headroom.c | 271 ++++++++++++++++++++++
lib/librte_headroom/rte_headroom.h | 324 +++++++++++++++++++++++++++
lib/librte_headroom/rte_headroom_version.map | 20 ++
7 files changed, 680 insertions(+)
create mode 100644 lib/librte_headroom/Makefile
create mode 100644 lib/librte_headroom/rte_headroom.c
create mode 100644 lib/librte_headroom/rte_headroom.h
create mode 100644 lib/librte_headroom/rte_headroom_version.map
diff --git a/config/common_bsdapp b/config/common_bsdapp
index 57bacb8..aa2e5fd 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -282,6 +282,11 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d428f84..055a37b 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -290,6 +290,11 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/lib/Makefile b/lib/Makefile
index d617d81..4fc2819 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -54,6 +54,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
+DIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += librte_headroom
DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
diff --git a/lib/librte_headroom/Makefile b/lib/librte_headroom/Makefile
new file mode 100644
index 0000000..faefb3b
--- /dev/null
+++ b/lib/librte_headroom/Makefile
@@ -0,0 +1,54 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_headroom.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_headroom_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_HEADROOM) := rte_headroom.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_HEADROOM)-include := rte_headroom.h
+
+# this lib needs eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += lib/librte_mbuf
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_headroom/rte_headroom.c b/lib/librte_headroom/rte_headroom.c
new file mode 100644
index 0000000..a2cc671
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom.c
@@ -0,0 +1,271 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+#include <stdlib.h>
+
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_common.h>
+#include <rte_cycles.h>
+#include <rte_branch_prediction.h>
+#include <rte_debug.h>
+#include <rte_eal.h>
+#include <rte_malloc.h>
+
+#include "rte_headroom.h"
+
+/* These are the steps used to adjust the job period.
+ * Experiments show that for forwarding apps the up step must be less than down
+ * step to achieve optimal performance.
+ */
+#define JOB_UPDATE_STEP_UP 1
+#define JOB_UPDATE_STEP_DOWN 4
+
+/*
+ * Default update function that implements simple period adjustment.
+ */
+static void
+default_update_function(struct rte_headroom_job *job, int64_t result)
+{
+ int64_t err = job->target - result;
+
+ /* Job is happy. Nothing to do */
+ if (err == 0)
+ return;
+
+ if (err > 0) {
+ if (job->period + JOB_UPDATE_STEP_UP < job->max_period)
+ job->period += JOB_UPDATE_STEP_UP;
+ } else {
+ if (job->min_period + JOB_UPDATE_STEP_DOWN < job->period)
+ job->period -= JOB_UPDATE_STEP_DOWN;
+ }
+}
+
+#define HDR_ADD_TIME_MIN_MAX(obj, type, value) do { \
+ typeof(value) tmp = (value); \
+ (obj)->type ## _time += tmp; \
+ if (tmp < (obj)->min_ ## type ## _time) \
+ (obj)->min_ ## type ## _time = tmp; \
+ if (tmp > (obj)->max_ ## type ## _time) \
+ (obj)->max_ ## type ## _time = tmp; \
+} while (0)
+
+#define HDR_RESET_TIME_MIN_MAX(obj, type) do { \
+ (obj)->type ## _time = 0; \
+ (obj)->min_ ## type ## _time = UINT64_MAX; \
+ (obj)->max_ ## type ## _time = 0; \
+} while (0)
+
+int
+rte_headroom_init(struct rte_headroom *hdr)
+{
+ if (hdr == NULL)
+ return -EINVAL;
+
+ /* Init only needed parameters. Zero out everything else. */
+ memset(hdr, 0, sizeof(struct rte_headroom));
+
+ rte_headroom_reset_stats(hdr);
+
+ return 0;
+}
+
+void
+rte_headroom_start_loop(struct rte_headroom *hdr)
+{
+ uint64_t now;
+
+ hdr->loop_executed_jobs = 0;
+
+ rte_mb();
+ now = rte_get_timer_cycles();
+ HDR_ADD_TIME_MIN_MAX(hdr, management, now - hdr->state_time);
+ hdr->state_time = now;
+}
+
+void
+rte_headroom_finish_loop(struct rte_headroom *hdr)
+{
+ uint64_t now;
+
+ if (likely(hdr->loop_executed_jobs))
+ hdr->loop_cnt++;
+
+ rte_mb();
+ now = rte_get_timer_cycles();
+ HDR_ADD_TIME_MIN_MAX(hdr, management, now - hdr->state_time);
+ hdr->state_time = now;
+}
+
+void
+rte_headroom_set_job_target(struct rte_headroom_job *job, int64_t target)
+{
+ job->target = target;
+}
+
+int
+rte_headroom_start_job(struct rte_headroom *hdr, struct rte_headroom_job *job)
+{
+ uint64_t now;
+
+ /* Some sanity check. */
+ if (unlikely(hdr == NULL || job == NULL || job->headroom != NULL))
+ return -EINVAL;
+
+ /* Link job with headroom object. */
+ job->headroom = hdr;
+
+ rte_mb();
+ now = rte_get_timer_cycles();
+ HDR_ADD_TIME_MIN_MAX(hdr, management, now - hdr->state_time);
+ hdr->state_time = now;
+
+ return 0;
+}
+
+int
+rte_headroom_finish_job(struct rte_headroom_job *job, int64_t job_value)
+{
+ struct rte_headroom *hdr;
+ uint64_t now, exec_time;
+ int need_update;
+
+ /* Some sanity check. */
+ if (unlikely(job == NULL || job->headroom == NULL))
+ return -EINVAL;
+
+ need_update = job->target != job_value;
+ /* Adjust period only if job is unhappy with its current period. */
+ if (need_update)
+ (*job->update_period_cb)(job, job_value);
+
+ hdr = job->headroom;
+
+ /* Time spent in the period update callback counts as job runtime, so
+ * take the timestamp after the callback has run. */
+ rte_mb();
+ now = rte_get_timer_cycles();
+ exec_time = now - hdr->state_time;
+ HDR_ADD_TIME_MIN_MAX(job, exec, exec_time);
+ HDR_ADD_TIME_MIN_MAX(hdr, exec, exec_time);
+
+ hdr->state_time = now;
+
+ hdr->loop_executed_jobs++;
+ hdr->job_exec_cnt++;
+
+ job->exec_cnt++;
+ job->headroom = NULL;
+
+ return need_update;
+}
+
+void
+rte_headroom_job_set_period(struct rte_headroom_job *job, uint64_t period,
+ uint8_t saturate)
+{
+ if (saturate != 0) {
+ if (period < job->min_period)
+ period = job->min_period;
+ else if (period > job->max_period)
+ period = job->max_period;
+ }
+
+ job->period = period;
+}
+
+void
+rte_headroom_set_min_period(struct rte_headroom_job *job, uint64_t period)
+{
+ job->min_period = period;
+ if (job->period < period)
+ job->period = period;
+}
+
+void
+rte_headroom_set_max_period(struct rte_headroom_job *job, uint64_t period)
+{
+ job->max_period = period;
+ if (job->period > period)
+ job->period = period;
+}
+
+int
+rte_headroom_job_init(struct rte_headroom_job *job, const char *name,
+ uint64_t min_period, uint64_t max_period, uint64_t initial_period,
+ int64_t target)
+{
+ if (job == NULL)
+ return -EINVAL;
+
+ job->period = initial_period;
+ job->min_period = min_period;
+ job->max_period = max_period;
+ job->target = target;
+ job->update_period_cb = &default_update_function;
+ rte_headroom_reset_job_stats(job);
+ snprintf(job->name, RTE_DIM(job->name), "%s", name == NULL ? "" : name);
+ job->headroom = NULL;
+
+ return 0;
+}
+
+void
+rte_headroom_set_update_period_function(struct rte_headroom_job *job,
+ rte_headroom_update_fn_t update_period_cb)
+{
+ if (update_period_cb == NULL)
+ update_period_cb = default_update_function;
+
+ job->update_period_cb = update_period_cb;
+}
+
+void
+rte_headroom_reset_job_stats(struct rte_headroom_job *job)
+{
+ HDR_RESET_TIME_MIN_MAX(job, exec);
+ job->exec_cnt = 0;
+}
+
+void
+rte_headroom_reset_stats(struct rte_headroom *hdr)
+{
+ HDR_RESET_TIME_MIN_MAX(hdr, exec);
+ HDR_RESET_TIME_MIN_MAX(hdr, management);
+ hdr->start_time = rte_get_timer_cycles();
+ hdr->state_time = hdr->start_time;
+ hdr->job_exec_cnt = 0;
+ hdr->loop_cnt = 0;
+}
diff --git a/lib/librte_headroom/rte_headroom.h b/lib/librte_headroom/rte_headroom.h
new file mode 100644
index 0000000..2232c13
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom.h
@@ -0,0 +1,324 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef HEADROOM_H_
+#define HEADROOM_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_HEADROOM_JOB_NAMESIZE 32
+#define RTE_HEADROOM_NAMESIZE 32
+#define RTE_HEADROOM_MZ_PREFIX "HDR_"
+
+/* Forward declarations. */
+struct rte_headroom;
+struct rte_headroom_job;
+
+/**
+ * This function should calculate the new period and set it using the
+ * rte_headroom_job_set_period() function. Time spent in this function is
+ * added to the job's runtime.
+ *
+ * @param job
+ * The job data structure handler.
+ * @param job_result
+ * Result of calling job callback.
+ */
+typedef void (*rte_headroom_update_fn_t)(struct rte_headroom_job *job,
+ int64_t job_result);
+
+struct rte_headroom_job {
+ uint64_t period;
+ /**< Estimated period of execution. */
+
+ uint64_t min_period;
+ /**< Minimum period. */
+
+ uint64_t max_period;
+ /**< Maximum period. */
+
+ int64_t target;
+ /**< Desired value for this job. */
+
+ rte_headroom_update_fn_t update_period_cb;
+ /**< Period update callback. */
+
+ uint64_t exec_time;
+ /**< Total time (sum) that this job was executing. */
+
+ uint64_t min_exec_time;
+ /**< Minimum execute time. */
+
+ uint64_t max_exec_time;
+ /**< Maximum execute time. */
+
+ uint64_t exec_cnt;
+ /**< Execute count. */
+
+ char name[RTE_HEADROOM_JOB_NAMESIZE];
+ /**< Name of this job */
+
+ struct rte_headroom *headroom;
+ /**< Headroom object that is executing this job. */
+} __rte_cache_aligned;
+
+struct rte_headroom {
+ /** Variable holding time at different points:
+ * -# loop start time if loop was started but no job executed yet.
+ * -# job start time if job is currently executing.
+ * -# job finish time if job finished its execution.
+ * -# loop finish time if loop finished its execution. */
+ uint64_t state_time;
+
+ uint64_t loop_executed_jobs;
+ /**< Count of executed jobs in this loop. */
+
+ /* Statistics start. */
+
+ uint64_t exec_time;
+ /**< Total time taken to execute jobs, not including management time. */
+
+ uint64_t min_exec_time;
+ /**< Minimum loop execute time. */
+
+ uint64_t max_exec_time;
+ /**< Maximum loop execute time. */
+
+ /**
+ * Sum of time that is not the execute time (ex: from job finish to next
+ * job start).
+ *
+ * This time might be considered as overhead of headroom library + job
+ * scheduling.
+ */
+ uint64_t management_time;
+
+ uint64_t min_management_time;
+ /**< Minimum management time */
+
+ uint64_t max_management_time;
+ /**< Maximum management time */
+
+ uint64_t start_time;
+ /**< Timestamp of the last statistics reset. */
+
+ uint64_t job_exec_cnt;
+ /**< Total count of executed jobs. */
+
+ uint64_t loop_cnt;
+ /**< Total count of executed loops with at least one executed job. */
+} __rte_cache_aligned;
+
+/**
+ * Initialize given headroom object with default values.
+ *
+ * @param hdr
+ * Headroom object to initialize.
+ *
+ * @return
+ * 0 on success
+ * -EINVAL if *hdr* is NULL
+ */
+int
+rte_headroom_init(struct rte_headroom *hdr);
+
+/**
+ * Mark that a new set of jobs starts executing.
+ *
+ * @param hdr
+ * Headroom object.
+ */
+void
+rte_headroom_start_loop(struct rte_headroom *hdr);
+
+/**
+ * Mark that there are no more jobs ready to execute in this turn. Calculate
+ * stats for this loop turn.
+ *
+ * @param hdr
+ * Headroom object.
+ */
+void
+rte_headroom_finish_loop(struct rte_headroom *hdr);
+
+/**
+ * Initialize given job stats object.
+ *
+ * @param job
+ * Job object.
+ * @param name
+ * Optional job name.
+ * @param min_period
+ * Minimum period that this job can accept.
+ * @param max_period
+ * Maximum period that this job can accept.
+ * @param initial_period
+ * Initial period. It will be checked against *min_period* and *max_period*.
+ * @param target
+ * Target value that this job tries to achieve.
+ *
+ * @return
+ * 0 on success
+ * -EINVAL if *job* is NULL
+ */
+int
+rte_headroom_job_init(struct rte_headroom_job *job, const char *name,
+ uint64_t min_period, uint64_t max_period, uint64_t initial_period,
+ int64_t target);
+
+/**
+ * Set job desired target value. Difference between target and job callback
+ * return value must be used to properly adjust job execute period value.
+ *
+ * @param job
+ * The job object.
+ * @param target
+ * New target.
+ */
+void
+rte_headroom_set_job_target(struct rte_headroom_job *job, int64_t target);
+
+/**
+ * Mark that *job* is starting its execution in the context of *hdr* object.
+ *
+ * @param hdr
+ * Headroom object context.
+ * @param job
+ * Job object.
+ * @return
+ * 0 on success
+ * -EINVAL if *hdr* or *job* is NULL or *job* is executing in another headroom
+ * context already.
+ */
+int
+rte_headroom_start_job(struct rte_headroom *hdr, struct rte_headroom_job *job);
+
+/**
+ * Mark that *job* finished its execution. Context in which it was executing will
+ * receive stat update. After this function call *job* object is ready to be
+ * executed in other headroom context.
+ *
+ * @param job
+ * Job object.
+ * @param job_value
+ * Job value. The job should pass in this parameter the value it tries to
+ * optimize, for example the number of packets it processed.
+ *
+ * @return
+ * 0 if job's period was not updated (job target equals *job_value*)
+ * 1 if job's period was updated
+ * -EINVAL if job is NULL or job was not started (it has no headroom context).
+ */
+int
+rte_headroom_finish_job(struct rte_headroom_job *job, int64_t job_value);
+
+/**
+ * Set execute period of given job.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New period value.
+ * @param saturate
+ * If zero, skip period saturation to min, max range.
+ */
+void
+rte_headroom_job_set_period(struct rte_headroom_job *job, uint64_t period,
+ uint8_t saturate);
+/**
+ * Set minimum execute period of given job. Current period will be checked
+ * against new minimum value.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New minimum period value.
+ */
+void
+rte_headroom_set_min_period(struct rte_headroom_job *job, uint64_t period);
+/**
+ * Set maximum execute period of given job. Current period will be checked
+ * against new maximum value.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New maximum period value.
+ */
+void
+rte_headroom_set_max_period(struct rte_headroom_job *job, uint64_t period);
+
+/**
+ * Set update period callback that is invoked after job finish.
+ *
+ * If application wants to do more sophisticated calculations than default
+ * it can provide this handler.
+ *
+ * @param job
+ * Job object.
+ * @param update_period_cb
+ * Callback to set. If NULL restore default update function.
+ */
+void
+rte_headroom_set_update_period_function(struct rte_headroom_job *job,
+ rte_headroom_update_fn_t update_period_cb);
+
+/**
+ * Function resets job statistics.
+ *
+ * @param job
+ * Job whose statistics will be reset.
+ */
+void
+rte_headroom_reset_job_stats(struct rte_headroom_job *job);
+/**
+ * Function resets headroom statistics.
+ *
+ * @param hdr
+ * Headroom whose statistics will be reset.
+ */
+void
+rte_headroom_reset_stats(struct rte_headroom *hdr);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* HEADROOM_H_ */
diff --git a/lib/librte_headroom/rte_headroom_version.map b/lib/librte_headroom/rte_headroom_version.map
new file mode 100644
index 0000000..1f20016
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom_version.map
@@ -0,0 +1,20 @@
+DPDK_2.0 {
+ global:
+
+ rte_headroom_init;
+ rte_headroom_start_loop;
+ rte_headroom_finish_loop;
+ rte_headroom_job_init;
+ rte_headroom_set_job_target;
+ rte_headroom_start_job;
+ rte_headroom_finish_job;
+ rte_headroom_job_set_period;
+ rte_headroom_set_min_period;
+ rte_headroom_set_max_period;
+ rte_headroom_set_update_period_function;
+ rte_headroom_reset_job_stats;
+ rte_headroom_reset_stats;
+
+ local: *;
+};
+
\ No newline at end of file
--
1.9.1
^ permalink raw reply [flat|nested] 48+ messages in thread
* [dpdk-dev] [PATCH v4 2/2] examples: introduce new l2fwd-headroom example
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 " Pawel Wodkowski
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
@ 2015-02-17 16:42 ` Pawel Wodkowski
2015-02-18 13:41 ` De Lara Guarch, Pablo
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Pawel Wodkowski
2 siblings, 1 reply; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-17 16:42 UTC (permalink / raw)
To: dev
This app demonstrates usage of the new headroom library.
It is basically the original l2fwd with the following modifications to
meet headroom library requirements:
- main_loop() was split into two jobs: forward job and flush job. Logic
for those jobs is almost the same as in the original application.
- stats are moved to an rte_alarm callback to not introduce the overhead
of printing.
- stats are expanded to show headroom statistics.
- added new parameter '-l' for automatic thousands separators.
Comparing the original l2fwd and l2fwd-headroom apps shows what is needed
to properly write your own application with headroom measurements.
New available statistics:
- Total and % of fwd and flush execution time
- management time - overhead of rte_timer + overhead of headroom library
- Idle time and % of time spent waiting for fwd or flush to be ready to
execute.
- per job execution time and period.
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
examples/Makefile | 1 +
examples/l2fwd-headroom/Makefile | 51 ++
examples/l2fwd-headroom/main.c | 1039 ++++++++++++++++++++++++++++++++++++++
mk/rte.app.mk | 4 +
4 files changed, 1095 insertions(+)
create mode 100644 examples/l2fwd-headroom/Makefile
create mode 100644 examples/l2fwd-headroom/main.c
diff --git a/examples/Makefile b/examples/Makefile
index 81f1d2f..8a459b7 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ip_fragmentation
DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
DIRS-y += l2fwd
+DIRS-y += l2fwd-headroom
DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
DIRS-y += l3fwd
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
diff --git a/examples/l2fwd-headroom/Makefile b/examples/l2fwd-headroom/Makefile
new file mode 100644
index 0000000..07da286
--- /dev/null
+++ b/examples/l2fwd-headroom/Makefile
@@ -0,0 +1,51 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l2fwd-headroom
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l2fwd-headroom/main.c b/examples/l2fwd-headroom/main.c
new file mode 100644
index 0000000..7ba1743
--- /dev/null
+++ b/examples/l2fwd-headroom/main.c
@@ -0,0 +1,1039 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <locale.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <ctype.h>
+#include <getopt.h>
+
+#include <rte_alarm.h>
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+#include <rte_memzone.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_launch.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_prefetch.h>
+#include <rte_lcore.h>
+#include <rte_per_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_interrupts.h>
+#include <rte_pci.h>
+#include <rte_debug.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+#include <rte_ring.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_spinlock.h>
+
+#include <rte_errno.h>
+#include <rte_headroom.h>
+#include <rte_timer.h>
+#include <rte_alarm.h>
+
+#define RTE_LOGTYPE_L2FWD RTE_LOGTYPE_USER1
+
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NB_MBUF 8192
+
+#define MAX_PKT_BURST 32
+#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */
+
+/*
+ * Configurable number of RX/TX ring descriptors
+ */
+#define RTE_TEST_RX_DESC_DEFAULT 128
+#define RTE_TEST_TX_DESC_DEFAULT 512
+static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
+static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
+
+/* ethernet addresses of ports */
+static struct ether_addr l2fwd_ports_eth_addr[RTE_MAX_ETHPORTS];
+
+/* mask of enabled ports */
+static uint32_t l2fwd_enabled_port_mask;
+
+/* list of enabled ports */
+static uint32_t l2fwd_dst_ports[RTE_MAX_ETHPORTS];
+
+#define UPDATE_STEP_UP 1
+#define UPDATE_STEP_DOWN 32
+
+static unsigned int l2fwd_rx_queue_per_lcore = 1;
+
+struct mbuf_table {
+ uint64_t next_flush_time;
+ unsigned len;
+ struct rte_mbuf *mbufs[MAX_PKT_BURST];
+};
+
+#define MAX_RX_QUEUE_PER_LCORE 16
+#define MAX_TX_QUEUE_PER_PORT 16
+struct lcore_queue_conf {
+ unsigned n_rx_port;
+ unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
+ struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+
+ struct rte_timer rx_timers[MAX_RX_QUEUE_PER_LCORE];
+ struct rte_headroom_job port_fwd_jobs[MAX_RX_QUEUE_PER_LCORE];
+
+ struct rte_timer flush_timer;
+ struct rte_headroom_job flush_job;
+ struct rte_headroom_job idle_job;
+ struct rte_headroom headroom;
+
+ rte_atomic16_t stats_read_pending;
+ rte_spinlock_t lock;
+} __rte_cache_aligned;
+struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];
+
+static const struct rte_eth_conf port_conf = {
+ .rxmode = {
+ .split_hdr_size = 0,
+ .header_split = 0, /**< Header Split disabled */
+ .hw_ip_checksum = 0, /**< IP checksum offload disabled */
+ .hw_vlan_filter = 0, /**< VLAN filtering disabled */
+ .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
+ .hw_strip_crc = 0, /**< CRC stripping by hardware disabled */
+ },
+ .txmode = {
+ .mq_mode = ETH_MQ_TX_NONE,
+ },
+};
+
+struct rte_mempool *l2fwd_pktmbuf_pool = NULL;
+
+/* Per-port statistics struct */
+struct l2fwd_port_statistics {
+ uint64_t tx;
+ uint64_t rx;
+ uint64_t dropped;
+} __rte_cache_aligned;
+struct l2fwd_port_statistics port_statistics[RTE_MAX_ETHPORTS];
+
+/* 1 day max */
+#define MAX_TIMER_PERIOD 86400
+/* default period is 10 seconds */
+static int64_t timer_period = 10;
+/* default timer frequency */
+static double hz;
+/* BURST_TX_DRAIN_US converted to cycles */
+uint64_t drain_tsc;
+/* Convert cycles to ns */
+static inline double
+cycles_to_ns(uint64_t cycles)
+{
+ double t = cycles;
+
+ t *= (double)NS_PER_S;
+ t /= hz;
+ return t;
+}
+
+static void
+show_lcore_headroom_stats(unsigned lcore_id)
+{
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct rte_headroom *hdr = &qconf->headroom;
+ struct rte_headroom_job *job;
+ uint8_t i;
+
+ /* Headroom statistics. */
+ uint64_t stats_period, loop_count;
+ uint64_t exec, exec_min, exec_max;
+ uint64_t management, management_min, management_max;
+ uint64_t busy, busy_min, busy_max;
+
+ /* Jobs statistics. */
+ const uint8_t port_cnt = qconf->n_rx_port;
+ uint64_t jobs_exec_cnt[port_cnt], jobs_period[port_cnt];
+ uint64_t jobs_exec[port_cnt], jobs_exec_min[port_cnt],
+ jobs_exec_max[port_cnt];
+
+ uint64_t flush_exec_cnt, flush_period;
+ uint64_t flush_exec, flush_exec_min, flush_exec_max;
+
+ uint64_t idle_exec_cnt;
+ uint64_t idle_exec, idle_exec_min, idle_exec_max;
+ uint64_t collection_time = rte_get_timer_cycles();
+
+ /* Ask forwarding thread to give us stats. */
+ rte_atomic16_set(&qconf->stats_read_pending, 1);
+ rte_spinlock_lock(&qconf->lock);
+ rte_atomic16_set(&qconf->stats_read_pending, 0);
+
+ /* Collect headroom statistics. */
+ stats_period = hdr->state_time - hdr->start_time;
+ loop_count = hdr->loop_cnt;
+
+ exec = hdr->exec_time;
+ exec_min = hdr->min_exec_time;
+ exec_max = hdr->max_exec_time;
+
+ management = hdr->management_time;
+ management_min = hdr->min_management_time;
+ management_max = hdr->max_management_time;
+
+ rte_headroom_reset_stats(hdr);
+
+ for (i = 0; i < port_cnt; i++) {
+ job = &qconf->port_fwd_jobs[i];
+
+ jobs_exec_cnt[i] = job->exec_cnt;
+ jobs_period[i] = job->period;
+
+ jobs_exec[i] = job->exec_time;
+ jobs_exec_min[i] = job->min_exec_time;
+ jobs_exec_max[i] = job->max_exec_time;
+
+ rte_headroom_reset_job_stats(job);
+ }
+
+ flush_exec_cnt = qconf->flush_job.exec_cnt;
+ flush_period = qconf->flush_job.period;
+ flush_exec = qconf->flush_job.exec_time;
+ flush_exec_min = qconf->flush_job.min_exec_time;
+ flush_exec_max = qconf->flush_job.max_exec_time;
+ rte_headroom_reset_job_stats(&qconf->flush_job);
+
+ idle_exec_cnt = qconf->idle_job.exec_cnt;
+ idle_exec = qconf->idle_job.exec_time;
+ idle_exec_min = qconf->idle_job.min_exec_time;
+ idle_exec_max = qconf->idle_job.max_exec_time;
+ rte_headroom_reset_job_stats(&qconf->idle_job);
+
+ rte_spinlock_unlock(&qconf->lock);
+
+ exec -= idle_exec;
+ busy = exec + management;
+ busy_min = exec_min + management_min;
+ busy_max = exec_max + management_max;
+
+
+ collection_time = rte_get_timer_cycles() - collection_time;
+
+#define STAT_FMT "\n%-18s %'14.0f %6.1f%% %'10.0f %'10.0f %'10.0f"
+
+ printf("\n----------------"
+ "\nLCore %3u: headroom statistics (time in ns, collected in %'9.0f)"
+ "\n%-18s %14s %7s %10s %10s %10s "
+ "\n%-18s %'14.0f"
+ "\n%-18s %'14" PRIu64
+ STAT_FMT /* Exec */
+ STAT_FMT /* Management */
+ STAT_FMT /* Busy */
+ STAT_FMT, /* Idle */
+ lcore_id, cycles_to_ns(collection_time),
+ "Stat type", "total", "%total", "avg", "min", "max",
+ "Stats duration:", cycles_to_ns(stats_period),
+ "Loop count:", loop_count,
+ "Exec time",
+ cycles_to_ns(exec), exec * 100.0 / stats_period ,
+ cycles_to_ns(loop_count ? exec / loop_count : 0),
+ cycles_to_ns(exec_min),
+ cycles_to_ns(exec_max),
+ "Management time",
+ cycles_to_ns(management), management * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? management / loop_count : 0),
+ cycles_to_ns(management_min),
+ cycles_to_ns(management_max),
+ "Exec + management",
+ cycles_to_ns(busy), busy * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? busy / loop_count : 0),
+ cycles_to_ns(busy_min),
+ cycles_to_ns(busy_max),
+ "Idle (job)",
+ cycles_to_ns(idle_exec), idle_exec * 100.0 / stats_period,
+ cycles_to_ns(idle_exec_cnt ? idle_exec / idle_exec_cnt : 0),
+ cycles_to_ns(idle_exec_min),
+ cycles_to_ns(idle_exec_max));
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+ job = &qconf->port_fwd_jobs[i];
+ printf("\n\nJob %" PRIu32 ": %-20s "
+ "\n%-18s %'14" PRIu64
+ "\n%-18s %'14.0f"
+ STAT_FMT,
+ i, job->name,
+ "Exec count:", jobs_exec_cnt[i],
+ "Exec period: ", cycles_to_ns(jobs_period[i]),
+ "Exec time",
+ cycles_to_ns(jobs_exec[i]), jobs_exec[i] * 100.0 / stats_period,
+ cycles_to_ns(jobs_exec_cnt[i] ? jobs_exec[i] / jobs_exec_cnt[i]
+ : 0),
+ cycles_to_ns(jobs_exec_min[i]),
+ cycles_to_ns(jobs_exec_max[i]));
+ }
+
+ if (qconf->n_rx_port > 0) {
+ job = &qconf->flush_job;
+ printf("\n\nJob %" PRIu32 ": %-20s "
+ "\n%-18s %'14" PRIu64
+ "\n%-18s %'14.0f"
+ STAT_FMT,
+ i, job->name,
+ "Exec count:", flush_exec_cnt,
+ "Exec period: ", cycles_to_ns(flush_period),
+ "Exec time",
+ cycles_to_ns(flush_exec), flush_exec * 100.0 / stats_period ,
+ cycles_to_ns(flush_exec_cnt ? flush_exec / flush_exec_cnt : 0),
+ cycles_to_ns(flush_exec_min),
+ cycles_to_ns(flush_exec_max));
+ }
+}
+
+/* Print out statistics on packets dropped */
+static void
+show_stats_cb(__rte_unused void *param)
+{
+ uint64_t total_packets_dropped, total_packets_tx, total_packets_rx;
+ unsigned portid, lcore_id;
+
+ total_packets_dropped = 0;
+ total_packets_tx = 0;
+ total_packets_rx = 0;
+
+ const char clr[] = { 27, '[', '2', 'J', '\0' };
+ const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
+
+ /* Clear screen and move to top left */
+ printf("%s%s"
+ "\nPort statistics ===================================",
+ clr, topLeft);
+
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ /* skip disabled ports */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+ printf("\nStatistics for port %u ------------------------------"
+ "\nPackets sent: %24"PRIu64
+ "\nPackets received: %20"PRIu64
+ "\nPackets dropped: %21"PRIu64,
+ portid,
+ port_statistics[portid].tx,
+ port_statistics[portid].rx,
+ port_statistics[portid].dropped);
+
+ total_packets_dropped += port_statistics[portid].dropped;
+ total_packets_tx += port_statistics[portid].tx;
+ total_packets_rx += port_statistics[portid].rx;
+ }
+
+ printf("\nAggregate statistics ==============================="
+ "\nTotal packets sent: %18"PRIu64
+ "\nTotal packets received: %14"PRIu64
+ "\nTotal packets dropped: %15"PRIu64
+ "\n====================================================",
+ total_packets_tx,
+ total_packets_rx,
+ total_packets_dropped);
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ if (lcore_queue_conf[lcore_id].n_rx_port > 0)
+ show_lcore_headroom_stats(lcore_id);
+ }
+
+ printf("\n====================================================\n");
+ rte_eal_alarm_set(timer_period * US_PER_S, show_stats_cb, NULL);
+}
+
+/* Send the burst of packets on an output interface */
+static void
+l2fwd_send_burst(struct lcore_queue_conf *qconf, uint8_t port)
+{
+ struct mbuf_table *m_table;
+ uint16_t ret;
+ uint16_t queueid = 0;
+ uint16_t n;
+
+ m_table = &qconf->tx_mbufs[port];
+ n = m_table->len;
+
+ m_table->next_flush_time = rte_get_timer_cycles() + drain_tsc;
+ m_table->len = 0;
+
+ ret = rte_eth_tx_burst(port, queueid, m_table->mbufs, n);
+
+ port_statistics[port].tx += ret;
+ if (unlikely(ret < n)) {
+ port_statistics[port].dropped += (n - ret);
+ do {
+ rte_pktmbuf_free(m_table->mbufs[ret]);
+ } while (++ret < n);
+ }
+}
+
+/* Enqueue packets for TX and prepare them to be sent */
+static int
+l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
+{
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct mbuf_table *m_table = &qconf->tx_mbufs[port];
+ uint16_t len = qconf->tx_mbufs[port].len;
+
+ m_table->mbufs[len] = m;
+
+ len++;
+ m_table->len = len;
+
+ /* Enough pkts to be sent. */
+ if (unlikely(len == MAX_PKT_BURST))
+ l2fwd_send_burst(qconf, port);
+
+ return 0;
+}
+
+static void
+l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
+{
+ struct ether_hdr *eth;
+ void *tmp;
+ unsigned dst_port;
+
+ dst_port = l2fwd_dst_ports[portid];
+ eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ /* 02:00:00:00:00:xx */
+ tmp = &eth->d_addr.addr_bytes[0];
+ *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40);
+
+ /* src addr */
+ ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], &eth->s_addr);
+
+ l2fwd_send_packet(m, (uint8_t) dst_port);
+}
+
+static void
+l2fwd_job_update_cb(struct rte_headroom_job *job, int64_t result)
+{
+ int64_t err = job->target - result;
+ int64_t hysteresis = job->target / 8;
+
+ if (err < -hysteresis) {
+ if (job->min_period + UPDATE_STEP_DOWN < job->period)
+ job->period -= UPDATE_STEP_DOWN;
+ } else if (err > hysteresis) {
+ if (job->period + UPDATE_STEP_UP < job->max_period)
+ job->period += UPDATE_STEP_UP;
+ }
+}
+
+static void
+l2fwd_fwd_job(__rte_unused struct rte_timer *timer, void *arg)
+{
+ struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+ struct rte_mbuf *m;
+
+ const uint8_t port_idx = (uintptr_t) arg;
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct rte_headroom_job *job = &qconf->port_fwd_jobs[port_idx];
+ const uint8_t portid = qconf->rx_port_list[port_idx];
+
+ uint8_t j;
+ uint16_t total_nb_rx;
+
+ rte_headroom_start_job(&qconf->headroom, job);
+
+ /* Call rx burst 2 times. This allows the headroom logic to see if this
+ * function must be called more frequently. */
+
+ total_nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ for (j = 0; j < total_nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+
+ if (total_nb_rx == MAX_PKT_BURST) {
+ const uint16_t nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ total_nb_rx += nb_rx;
+ for (j = 0; j < nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+ }
+
+ port_statistics[portid].rx += total_nb_rx;
+
+ /* Adjust period time in which we are running here. */
+ if (rte_headroom_finish_job(job, total_nb_rx) != 0) {
+ rte_timer_reset(&qconf->rx_timers[port_idx], job->period, PERIODICAL,
+ lcore_id, l2fwd_fwd_job, arg);
+ }
+}
+
+static void
+l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused void *arg)
+{
+ uint64_t now;
+ unsigned lcore_id;
+ struct lcore_queue_conf *qconf;
+ struct mbuf_table *m_table;
+ uint8_t portid;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_headroom_start_job(&qconf->headroom, &qconf->flush_job);
+
+ now = rte_get_timer_cycles();
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ m_table = &qconf->tx_mbufs[portid];
+ if (m_table->len == 0 || m_table->next_flush_time <= now)
+ continue;
+
+ l2fwd_send_burst(qconf, portid);
+ }
+
+
+ /* Pass target to indicate that this job is happy with the time interval
+ * in which it was called. */
+ rte_headroom_finish_job(&qconf->flush_job, qconf->flush_job.target);
+}
+
+/* main processing loop */
+static void
+l2fwd_main_loop(void)
+{
+ unsigned lcore_id;
+ unsigned i, portid;
+ struct lcore_queue_conf *qconf;
+ uint8_t stats_read_pending = 0;
+ uint8_t need_manage;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ if (qconf->n_rx_port == 0) {
+ RTE_LOG(INFO, L2FWD, "lcore %u has nothing to do\n", lcore_id);
+ return;
+ }
+
+ RTE_LOG(INFO, L2FWD, "entering main loop on lcore %u\n", lcore_id);
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+
+ portid = qconf->rx_port_list[i];
+ RTE_LOG(INFO, L2FWD, " -- lcoreid=%u portid=%u\n", lcore_id,
+ portid);
+ }
+
+ rte_headroom_job_init(&qconf->idle_job, "idle", 0, 0, 0, 0);
+
+ for (;;) {
+ rte_spinlock_lock(&qconf->lock);
+
+ do {
+ rte_headroom_start_loop(&qconf->headroom);
+
+ /* Do the Idle job:
+ * - Read stats_read_pending flag
+ * - check if some real job need to be executed
+ */
+ rte_headroom_start_job(&qconf->headroom, &qconf->idle_job);
+
+ do {
+ uint8_t i;
+ uint64_t now = rte_get_timer_cycles();
+ need_manage = qconf->flush_timer.expire < now;
+ /* Check if we were asked to give stats. */
+ stats_read_pending =
+ rte_atomic16_read(&qconf->stats_read_pending);
+ need_manage |= stats_read_pending;
+
+ for (i = 0; i < qconf->n_rx_port && !need_manage; i++)
+ need_manage = qconf->rx_timers[i].expire < now;
+
+ } while (!need_manage);
+ rte_headroom_finish_job(&qconf->idle_job, qconf->idle_job.target);
+
+ rte_timer_manage();
+ rte_headroom_finish_loop(&qconf->headroom);
+ } while (likely(stats_read_pending == 0));
+
+ rte_spinlock_unlock(&qconf->lock);
+ rte_pause();
+ }
+}
+
+static int
+l2fwd_launch_one_lcore(__attribute__((unused)) void *dummy)
+{
+ l2fwd_main_loop();
+ return 0;
+}
+
+/* display usage */
+static void
+l2fwd_usage(const char *prgname)
+{
+ printf("%s [EAL options] -- -p PORTMASK [-q NQ]\n"
+ " -p PORTMASK: hexadecimal bitmask of ports to configure\n"
+ " -q NQ: number of queue (=ports) per lcore (default is 1)\n"
+ " -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default, 86400 maximum)\n"
+ " -l set system default locale instead of default (\"C\" locale) for thousands separator in stats.",
+ prgname);
+}
+
+static int
+l2fwd_parse_portmask(const char *portmask)
+{
+ char *end = NULL;
+ unsigned long pm;
+
+ /* parse hexadecimal string */
+ pm = strtoul(portmask, &end, 16);
+ if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+
+ if (pm == 0)
+ return -1;
+
+ return pm;
+}
+
+static unsigned int
+l2fwd_parse_nqueue(const char *q_arg)
+{
+ char *end = NULL;
+ unsigned long n;
+
+ /* parse decimal string */
+ n = strtoul(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return 0;
+ if (n == 0)
+ return 0;
+ if (n >= MAX_RX_QUEUE_PER_LCORE)
+ return 0;
+
+ return n;
+}
+
+static int
+l2fwd_parse_timer_period(const char *q_arg)
+{
+ char *end = NULL;
+ int n;
+
+ /* parse number string */
+ n = strtol(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+ if (n >= MAX_TIMER_PERIOD)
+ return -1;
+
+ return n;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+l2fwd_parse_args(int argc, char **argv)
+{
+ int opt, ret;
+ char **argvopt;
+ int option_index;
+ char *prgname = argv[0];
+ static struct option lgopts[] = {
+ {NULL, 0, 0, 0}
+ };
+
+ argvopt = argv;
+
+ while ((opt = getopt_long(argc, argvopt, "p:q:T:l",
+ lgopts, &option_index)) != EOF) {
+
+ switch (opt) {
+ /* portmask */
+ case 'p':
+ l2fwd_enabled_port_mask = l2fwd_parse_portmask(optarg);
+ if (l2fwd_enabled_port_mask == 0) {
+ printf("invalid portmask\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* nqueue */
+ case 'q':
+ l2fwd_rx_queue_per_lcore = l2fwd_parse_nqueue(optarg);
+ if (l2fwd_rx_queue_per_lcore == 0) {
+ printf("invalid queue number\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* timer period */
+ case 'T':
+ timer_period = l2fwd_parse_timer_period(optarg);
+ if (timer_period < 0) {
+ printf("invalid timer period\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* For thousands separator in printf. */
+ case 'l':
+ setlocale(LC_ALL, "");
+ break;
+
+ /* long options */
+ case 0:
+ l2fwd_usage(prgname);
+ return -1;
+
+ default:
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ }
+
+ if (optind >= 0)
+ argv[optind-1] = prgname;
+
+ ret = optind-1;
+ optind = 0; /* reset getopt lib */
+ return ret;
+}
+
+/* Check the link status of all ports in up to 9s, and print them finally */
+static void
+check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+ uint8_t portid, count, all_ports_up, print_flag = 0;
+ struct rte_eth_link link;
+
+ printf("\nChecking link status");
+ fflush(stdout);
+ for (count = 0; count <= MAX_CHECK_TIME; count++) {
+ all_ports_up = 1;
+ for (portid = 0; portid < port_num; portid++) {
+ if ((port_mask & (1 << portid)) == 0)
+ continue;
+ memset(&link, 0, sizeof(link));
+ rte_eth_link_get_nowait(portid, &link);
+ /* print link status if flag set */
+ if (print_flag == 1) {
+ if (link.link_status)
+ printf("Port %d Link Up - speed %u "
+ "Mbps - %s\n", (uint8_t)portid,
+ (unsigned)link.link_speed,
+ (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+ ("full-duplex") : ("half-duplex"));
+ else
+ printf("Port %d Link Down\n",
+ (uint8_t)portid);
+ continue;
+ }
+ /* clear all_ports_up flag if any link down */
+ if (link.link_status == 0) {
+ all_ports_up = 0;
+ break;
+ }
+ }
+ /* after finally printing all link status, get out */
+ if (print_flag == 1)
+ break;
+
+ if (all_ports_up == 0) {
+ printf(".");
+ fflush(stdout);
+ rte_delay_ms(CHECK_INTERVAL);
+ }
+
+ /* set the print_flag if all ports up or timeout */
+ if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+ print_flag = 1;
+ printf("done\n");
+ }
+ }
+}
+
+int
+main(int argc, char **argv)
+{
+ struct lcore_queue_conf *qconf;
+ struct rte_eth_dev_info dev_info;
+ unsigned lcore_id, rx_lcore_id;
+ unsigned nb_ports_in_mask = 0;
+ int ret;
+ char name[RTE_HEADROOM_JOB_NAMESIZE];
+ uint8_t nb_ports;
+ uint8_t nb_ports_available;
+ uint8_t portid, last_port;
+ uint8_t i;
+
+ /* init EAL */
+ ret = rte_eal_init(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+ argc -= ret;
+ argv += ret;
+
+ /* parse application arguments (after the EAL ones) */
+ ret = l2fwd_parse_args(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
+
+ rte_timer_subsystem_init();
+
+ /* fetch default timer frequency. */
+ hz = rte_get_timer_hz();
+
+ /* create the mbuf pool */
+ l2fwd_pktmbuf_pool =
+ rte_mempool_create("mbuf_pool", NB_MBUF,
+ MBUF_SIZE, 32,
+ sizeof(struct rte_pktmbuf_pool_private),
+ rte_pktmbuf_pool_init, NULL,
+ rte_pktmbuf_init, NULL,
+ rte_socket_id(), 0);
+ if (l2fwd_pktmbuf_pool == NULL)
+ rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+ nb_ports = rte_eth_dev_count();
+ if (nb_ports == 0)
+ rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
+
+ if (nb_ports > RTE_MAX_ETHPORTS)
+ nb_ports = RTE_MAX_ETHPORTS;
+
+ /* reset l2fwd_dst_ports */
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
+ l2fwd_dst_ports[portid] = 0;
+ last_port = 0;
+
+ /*
+ * Each logical core is assigned a dedicated TX queue on each port.
+ */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ if (nb_ports_in_mask % 2) {
+ l2fwd_dst_ports[portid] = last_port;
+ l2fwd_dst_ports[last_port] = portid;
+ } else
+ last_port = portid;
+
+ nb_ports_in_mask++;
+
+ rte_eth_dev_info_get(portid, &dev_info);
+ }
+ if (nb_ports_in_mask % 2) {
+ printf("Notice: odd number of ports in portmask.\n");
+ l2fwd_dst_ports[last_port] = last_port;
+ }
+
+ rx_lcore_id = 0;
+ qconf = NULL;
+
+ /* Initialize the port/queue configuration of each logical core */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ /* get the lcore_id for this port */
+ while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
+ lcore_queue_conf[rx_lcore_id].n_rx_port ==
+ l2fwd_rx_queue_per_lcore) {
+ rx_lcore_id++;
+ if (rx_lcore_id >= RTE_MAX_LCORE)
+ rte_exit(EXIT_FAILURE, "Not enough cores\n");
+ }
+
+ if (qconf != &lcore_queue_conf[rx_lcore_id])
+ /* Assigned a new logical core in the loop above. */
+ qconf = &lcore_queue_conf[rx_lcore_id];
+
+ qconf->rx_port_list[qconf->n_rx_port] = portid;
+ qconf->n_rx_port++;
+ printf("Lcore %u: RX port %u\n", rx_lcore_id, (unsigned) portid);
+ }
+
+ nb_ports_available = nb_ports;
+
+ /* Initialise each port */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0) {
+ printf("Skipping disabled port %u\n", (unsigned) portid);
+ nb_ports_available--;
+ continue;
+ }
+ /* init port */
+ printf("Initializing port %u... ", (unsigned) portid);
+ fflush(stdout);
+ ret = rte_eth_dev_configure(portid, 1, 1, &port_conf);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ rte_eth_macaddr_get(portid, &l2fwd_ports_eth_addr[portid]);
+
+ /* init one RX queue */
+ fflush(stdout);
+ ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+ rte_eth_dev_socket_id(portid),
+ NULL,
+ l2fwd_pktmbuf_pool);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* init one TX queue on each port */
+ fflush(stdout);
+ ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+ rte_eth_dev_socket_id(portid),
+ NULL);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* Start device */
+ ret = rte_eth_dev_start(portid);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ printf("done:\n");
+
+ rte_eth_promiscuous_enable(portid);
+
+ printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
+ (unsigned) portid,
+ l2fwd_ports_eth_addr[portid].addr_bytes[0],
+ l2fwd_ports_eth_addr[portid].addr_bytes[1],
+ l2fwd_ports_eth_addr[portid].addr_bytes[2],
+ l2fwd_ports_eth_addr[portid].addr_bytes[3],
+ l2fwd_ports_eth_addr[portid].addr_bytes[4],
+ l2fwd_ports_eth_addr[portid].addr_bytes[5]);
+
+ /* initialize port stats */
+ memset(&port_statistics, 0, sizeof(port_statistics));
+ }
+
+ if (!nb_ports_available) {
+ rte_exit(EXIT_FAILURE,
+ "All available ports are disabled. Please set portmask.\n");
+ }
+
+ check_all_ports_link_status(nb_ports, l2fwd_enabled_port_mask);
+
+ drain_tsc = (hz + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_spinlock_init(&qconf->lock);
+
+ if (rte_headroom_init(&qconf->headroom) != 0)
+ rte_panic("Headroom for core %u init failed\n", lcore_id);
+
+ if (qconf->n_rx_port == 0) {
+ RTE_LOG(INFO, L2FWD,
+ "lcore %u: no ports so no headroom initialization\n",
+ lcore_id);
+ continue;
+ }
+ /* Add flush job.
+ * Set fixed period by setting min = max = initial period. Set target to
+ * zero as it is irrelevant for this job. */
+ rte_headroom_job_init(&qconf->flush_job, "flush", drain_tsc, drain_tsc,
+ drain_tsc, 0);
+
+ rte_timer_init(&qconf->flush_timer);
+ ret = rte_timer_reset(&qconf->flush_timer, drain_tsc, PERIODICAL,
+ lcore_id, &l2fwd_flush_job, NULL);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to add flush job for lcore %u: %s",
+ lcore_id, rte_strerror(-ret));
+ }
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+ struct rte_headroom_job *job = &qconf->port_fwd_jobs[i];
+
+ portid = qconf->rx_port_list[i];
+ printf("Setting forward job for port %u\n", portid);
+
+ snprintf(name, RTE_DIM(name), "port %u fwd", portid);
+ /* Setup forward job.
+ * Set min, max and initial period. Set target to MAX_PKT_BURST as
+ * this is desired optimal RX/TX burst size. */
+ rte_headroom_job_init(job, name, 0, drain_tsc, 0, MAX_PKT_BURST);
+ rte_headroom_set_update_period_function(job, l2fwd_job_update_cb);
+
+ rte_timer_init(&qconf->rx_timers[i]);
+ rte_timer_reset(&qconf->rx_timers[i], 0, PERIODICAL, lcore_id,
+ &l2fwd_fwd_job, (void *)(uintptr_t)i);
+ }
+ }
+
+ if (timer_period)
+ rte_eal_alarm_set(timer_period * US_PER_S, show_stats_cb, NULL);
+ else
+ RTE_LOG(INFO, L2FWD, "Stats display disabled\n");
+
+ /* launch per-lcore init on every lcore */
+ rte_eal_mp_remote_launch(l2fwd_launch_one_lcore, NULL, CALL_MASTER);
+ RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+ if (rte_eal_wait_lcore(lcore_id) < 0)
+ return -1;
+ }
+
+ return 0;
+}
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 334cb25..3db7222 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -103,6 +103,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_HASH),y)
LDLIBS += -lrte_hash
endif
+ifeq ($(CONFIG_RTE_LIBRTE_HEADROOM),y)
+LDLIBS += -lrte_headroom
+endif
+
ifeq ($(CONFIG_RTE_LIBRTE_LPM),y)
LDLIBS += -lrte_lpm
endif
--
1.9.1
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/2] librte_headroom: New library for checking core/system/app load
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
@ 2015-02-18 13:36 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 48+ messages in thread
From: De Lara Guarch, Pablo @ 2015-02-18 13:36 UTC (permalink / raw)
To: Wodkowski, PawelX, dev
Hi Pawel,
A few things to fix in this patch:
> -----Original Message-----
> From: Wodkowski, PawelX
> Sent: Tuesday, February 17, 2015 4:42 PM
> To: dev@dpdk.org
> Cc: De Lara Guarch, Pablo
> Subject: [PATCH v4 1/2] librte_headroom: New library for checking
> core/system/app load
>
> This library provide API to measure time spend in particular parts of
> code and to calculate optimal polling time.
>
> To calculate a those statistics application code need to be devided into
Typo in "devided"
> parts (called jobs) that do something. It is up to application to decide
> what is considered a job.
>
> Series of jobs must be surrounded with the rte_headroom_start_loop() and
> rte_headroom_finish_loop() calls. After that, jobs might be started.
> Each job must be surrounded with rte_headroom_start_job() and
> rte_headroom_finish_job() calls.
>
> After job finish its execution, period in which it should be called
Finishes
> again is adjusted to minimize time wasted on unnecessary polls/calls.
> Adjustmend is based on data provided by job itself (ex: number of
> packets it processed).
Adjustment
>
> After all jobs in serie are executed fallowing statistics are updated
> and might be used by application. Statistics can be reset. Some of
> provided statistic data:
> - total/min/max execution - time spent in executing jobs.
> - total/min/max management - time spent outside execution area. This
> value might used to measure overhead of sheduling jobs. This time also
Be used, scheduling
> contains overhead of headroom library itself.
> - number of loops that executed at least one job
> - executed jobs
> - time when statistics were reset.
>
> Each job provide total/min/max execution time and execution count
> statistics.
>
> Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> ---
> config/common_bsdapp | 5 +
> config/common_linuxapp | 5 +
> lib/Makefile | 1 +
> lib/librte_headroom/Makefile | 54 +++++
> lib/librte_headroom/rte_headroom.c | 271
> ++++++++++++++++++++++
> lib/librte_headroom/rte_headroom.h | 324
> +++++++++++++++++++++++++++
> lib/librte_headroom/rte_headroom_version.map | 20 ++
> 7 files changed, 680 insertions(+)
> create mode 100644 lib/librte_headroom/Makefile
> create mode 100644 lib/librte_headroom/rte_headroom.c
> create mode 100644 lib/librte_headroom/rte_headroom.h
> create mode 100644 lib/librte_headroom/rte_headroom_version.map
>
[...]
> diff --git a/lib/librte_headroom/rte_headroom_version.map
> b/lib/librte_headroom/rte_headroom_version.map
> new file mode 100644
> index 0000000..1f20016
> --- /dev/null
> +++ b/lib/librte_headroom/rte_headroom_version.map
> @@ -0,0 +1,20 @@
> +DPDK_2.0 {
> + global:
> +
> + rte_headroom_init;
> + rte_headroom_start_loop;
> + rte_headroom_finish_loop;
> + rte_headroom_job_init;
> + rte_headroom_set_job_target;
> + rte_headroom_start_job;
> + rte_headroom_finish_job;
> + rte_headroom_job_set_period;
> + rte_headroom_set_min_period;
> + rte_headroom_set_max_period;
> + rte_headroom_set_update_period_function;
> + rte_headroom_reset_job_stats;
> + rte_headroom_reset_stats;
> +
Trailing whitespaces here.
> + local: *;
> +};
> +
Trailing whitespaces here.
> \ No newline at end of file
> --
> 1.9.1
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v4 2/2] examples: introduce new l2fwd-headroom example
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 2/2] examples: introduce new l2fwd-headroom example Pawel Wodkowski
@ 2015-02-18 13:41 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 48+ messages in thread
From: De Lara Guarch, Pablo @ 2015-02-18 13:41 UTC (permalink / raw)
To: Wodkowski, PawelX, dev
Hi Pawel,
A few things to fix:
> -----Original Message-----
> From: Wodkowski, PawelX
> Sent: Tuesday, February 17, 2015 4:42 PM
> To: dev@dpdk.org
> Cc: De Lara Guarch, Pablo
> Subject: [PATCH v4 2/2] examples: introduce new l2fwd-headroom example
>
> This app demonstrate usage of new headroom library.
> It is basicaly orginal l2fwd with following modificantions to met
Typo: Basically the original, modifications
> headroom library requirements:
> - main_loop() was split into two jobs: forward job and flush job. Logic
> for those jobs is almost the same as in orginal application.
original
> - stats is moved to rte_alarm callbac to not introduce overhead of
callback
> printing.
> - stats are expanded to show headroom statistics.
> - added new parameter '-l' to automatic thousands separator.
>
> Comparing orginal l2fwd and l2fwd-headroom apps will show approach what
original
> is needed to properly write own application with headroom measurements.
>
> New available statistics:
> - Total and % of fwd and flush execution time
> - management time - overhead of rte_timer + overhead of headroom library
> - Idle time and % of time spent waiting for fwd or flush to be ready to
> execute.
> - per job execution time and period.
>
>
> Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> ---
> examples/Makefile | 1 +
> examples/l2fwd-headroom/Makefile | 51 ++
> examples/l2fwd-headroom/main.c | 1039
> ++++++++++++++++++++++++++++++++++++++
> mk/rte.app.mk | 4 +
> 4 files changed, 1095 insertions(+)
> create mode 100644 examples/l2fwd-headroom/Makefile
> create mode 100644 examples/l2fwd-headroom/main.c
>
> diff --git a/examples/Makefile b/examples/Makefile
> index 81f1d2f..8a459b7 100644
> --- a/examples/Makefile
> +++ b/examples/Makefile
> @@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) +=
> ip_fragmentation
> DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
> DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
> DIRS-y += l2fwd
> +DIRS-y += l2fwd-headroom
> DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
> DIRS-y += l3fwd
> DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
> diff --git a/examples/l2fwd-headroom/Makefile b/examples/l2fwd-
> headroom/Makefile
> new file mode 100644
> index 0000000..07da286
> --- /dev/null
> +++ b/examples/l2fwd-headroom/Makefile
> @@ -0,0 +1,51 @@
> +# BSD LICENSE
> +#
> +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> +# All rights reserved.
> +#
> +# Redistribution and use in source and binary forms, with or without
> +# modification, are permitted provided that the following conditions
> +# are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +# notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +# notice, this list of conditions and the following disclaimer in
> +# the documentation and/or other materials provided with the
> +# distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +# contributors may be used to endorse or promote products derived
> +# from this software without specific prior written permission.
> +#
> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> +
> +ifeq ($(RTE_SDK),)
> +$(error "Please define RTE_SDK environment variable")
> +endif
> +
> +# Default target, can be overriden by command line or environment
overridden
> +RTE_TARGET ?= x86_64-native-linuxapp-gcc
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# binary name
> +APP = l2fwd-headroom
> +
> +# all source are stored in SRCS-y
> +SRCS-y := main.c
> +
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +include $(RTE_SDK)/mk/rte.extapp.mk
> diff --git a/examples/l2fwd-headroom/main.c b/examples/l2fwd-
> headroom/main.c
[...]
> + if (qconf->n_rx_port > 0) {
> + job = &qconf->flush_job;
> + printf("\n\nJob %" PRIu32 ": %-20s "
> + "\n%-18s %'14" PRIu64
> + "\n%-18s %'14.0f"
> + STAT_FMT,
> + i, job->name,
> + "Exec count:", flush_exec_cnt,
> + "Exec period: ", cycles_to_ns(flush_period),
> + "Exec time",
> + cycles_to_ns(flush_exec), flush_exec * 100.0
> / stats_period ,
Remove space before last comma.
> + cycles_to_ns(flush_exec_cnt ? flush_exec /
> flush_exec_cnt : 0),
> + cycles_to_ns(flush_exec_min),
> + cycles_to_ns(flush_exec_max));
> + }
> +}
> +
* [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 " Pawel Wodkowski
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 2/2] examples: introduce new l2fwd-headroom example Pawel Wodkowski
@ 2015-02-19 12:18 ` Pawel Wodkowski
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 1/3] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
` (4 more replies)
2 siblings, 5 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-19 12:18 UTC (permalink / raw)
To: dev
Hi community,
I would like to introduce a library for measuring the load of arbitrary jobs. It
can be used to profile any kind of job set on any execution unit or tasking
library.
In the provided l2fwd-headroom example I demonstrate how to use this library to
select the optimal rx burst poll time. Jobs are selected using existing rte_timer
library calls. This example does not limit the possible schemes in which this
library can be used.
PATCH v5 changes:
- Fix spelling and checkpatch.pl errors.
- Add maintainer claim for library and example app.
PATCH v4 changes:
- use proper branch for generating patch.
PATCH v3 changes:
- Fix spelling.
PATCH v2 changes:
- Remove jobs management/callback from library to not duplicate tasking library
behaviour.
- Clean up/remove useless statistics.
- Rework example application to use rte_timer library for jobs selection.
- Introduce new app parameter '-l' for automatic thousands separating in stats.
- More readable statistics format.
Pawel Wodkowski (3):
librte_headroom: New library for checking core/system/app load
examples: introduce new l2fwd-headroom example
MAINTAINERS: claim responsibility for headroom library and example app
MAINTAINERS | 4 +
config/common_bsdapp | 5 +
config/common_linuxapp | 5 +
examples/Makefile | 1 +
examples/l2fwd-headroom/Makefile | 51 ++
examples/l2fwd-headroom/main.c | 1040 ++++++++++++++++++++++++++
lib/Makefile | 1 +
lib/librte_headroom/Makefile | 54 ++
lib/librte_headroom/rte_headroom.c | 271 +++++++
lib/librte_headroom/rte_headroom.h | 324 ++++++++
lib/librte_headroom/rte_headroom_version.map | 19 +
mk/rte.app.mk | 4 +
12 files changed, 1779 insertions(+)
create mode 100644 examples/l2fwd-headroom/Makefile
create mode 100644 examples/l2fwd-headroom/main.c
create mode 100644 lib/librte_headroom/Makefile
create mode 100644 lib/librte_headroom/rte_headroom.c
create mode 100644 lib/librte_headroom/rte_headroom.h
create mode 100644 lib/librte_headroom/rte_headroom_version.map
--
1.9.1
* [dpdk-dev] [PATCH v5 1/3] librte_headroom: New library for checking core/system/app load
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Pawel Wodkowski
@ 2015-02-19 12:18 ` Pawel Wodkowski
2015-02-24 1:55 ` Thomas Monjalon
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 2/3] examples: introduce new l2fwd-headroom example Pawel Wodkowski
` (3 subsequent siblings)
4 siblings, 1 reply; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-19 12:18 UTC (permalink / raw)
To: dev
This library provides an API to measure time spent in particular parts of
code and to calculate the optimal polling time.
To calculate those statistics, application code needs to be divided into
parts (called jobs) that do something. It is up to the application to decide
what is considered a job.
A series of jobs must be surrounded by rte_headroom_start_loop() and
rte_headroom_finish_loop() calls. Once the loop has been started, jobs may
be started. Each job must be surrounded by rte_headroom_start_job() and
rte_headroom_finish_job() calls.
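The bracketing described above can be sketched as a small stand-alone model.
This is an illustration only, not the library code: struct hdr_model and the
model_*() functions are hypothetical names, timestamps are passed in by the
caller instead of being read with rte_get_timer_cycles(), and a single
job_running flag stands in for the per-job link to the headroom object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_headroom: only the fields needed to
 * show how elapsed time is attributed. Not the library's real layout. */
struct hdr_model {
	uint64_t state_time;      /* timestamp of the last state change */
	uint64_t exec_time;       /* time spent inside jobs */
	uint64_t management_time; /* time spent between jobs */
	uint64_t job_exec_cnt;    /* number of finished jobs */
	int job_running;          /* simplification: flag instead of a per-job link */
};

/* Charge the time since the last state change to management. */
static void model_charge_management(struct hdr_model *h, uint64_t now)
{
	assert(h != NULL);
	h->management_time += now - h->state_time;
	h->state_time = now;
}

void model_start_loop(struct hdr_model *h, uint64_t now)
{
	model_charge_management(h, now);
}

void model_finish_loop(struct hdr_model *h, uint64_t now)
{
	model_charge_management(h, now);
}

int model_start_job(struct hdr_model *h, uint64_t now)
{
	if (h == NULL || h->job_running)
		return -1; /* the patch returns -EINVAL for an already started job */
	h->job_running = 1;
	model_charge_management(h, now);
	return 0;
}

int model_finish_job(struct hdr_model *h, uint64_t now)
{
	if (h == NULL || !h->job_running)
		return -1; /* finishing a job that was never started */
	h->job_running = 0;
	/* Time since the job started counts as execution time. */
	h->exec_time += now - h->state_time;
	h->state_time = now;
	h->job_exec_cnt++;
	return 0;
}

/* One loop iteration with two jobs, using made-up timestamps. */
struct hdr_model model_demo(void)
{
	struct hdr_model h = {0};

	model_start_loop(&h, 10);
	model_start_job(&h, 12);
	model_finish_job(&h, 20);
	model_start_job(&h, 21);
	model_finish_job(&h, 30);
	model_finish_loop(&h, 33);
	return h;
}
```

In this model every interval that is not between a job start and the matching
job finish is charged to management time, which is how the timestamps are
handled in the start/finish functions of this patch.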
After a job finishes its execution, the period in which it should be called
again is adjusted to minimize time wasted on unnecessary polls/calls.
The adjustment is based on data provided by the job itself (e.g. the number
of packets it processed).
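A minimal stand-alone sketch of such an adjustment is shown below. It follows
the default update function from this patch (step up by 1, step down by 4),
but struct job_model and adjust_period() are hypothetical names used only for
illustration, and the struct carries just the fields the adjustment needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step sizes taken from JOB_UPDATE_STEP_UP/DOWN in the patch. */
#define MODEL_STEP_UP   1
#define MODEL_STEP_DOWN 4

/* Hypothetical stand-in for struct rte_headroom_job. */
struct job_model {
	uint64_t period;     /* current execution period */
	uint64_t min_period; /* lower bound for the period */
	uint64_t max_period; /* upper bound for the period */
	int64_t target;      /* value the job wants to reach, e.g. burst size */
};

/* Move the period one step towards the target, staying inside the bounds. */
void adjust_period(struct job_model *job, int64_t result)
{
	int64_t err;

	assert(job != NULL);
	err = job->target - result;

	if (err == 0)
		return; /* job hit its target, keep the period */

	if (err > 0) {
		/* Less work than wanted: call the job less often. */
		if (job->period + MODEL_STEP_UP < job->max_period)
			job->period += MODEL_STEP_UP;
	} else {
		/* More work than wanted: call the job more often. */
		if (job->min_period + MODEL_STEP_DOWN < job->period)
			job->period -= MODEL_STEP_DOWN;
	}
}
```

For example, with a target of 32 packets and a period of 100, a result of 16
moves the period to 101, while a result of 64 moves it from 101 to 97. The
period shrinks faster than it grows, which matches the comment in the patch
that the up step must be smaller than the down step for forwarding apps.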
After all jobs in a series are executed, the following statistics are updated
and may be used by the application. Statistics can be reset. Some of the
provided statistics:
- total/min/max execution - time spent executing jobs.
- total/min/max management - time spent outside the execution area. This
value might be used to measure the overhead of scheduling jobs. This time
also contains the overhead of the headroom library itself.
- number of loops that executed at least one job
- executed jobs
- time when statistics were reset.
Each job provides total/min/max execution time and execution count
statistics.
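The total/min/max bookkeeping can be sketched as follows. This mirrors the
idea of the HDR_ADD_TIME_MIN_MAX and HDR_RESET_TIME_MIN_MAX macros in this
patch, but struct time_stat and the two functions are hypothetical names, and
plain functions are used instead of macros to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one total/min/max triple. */
struct time_stat {
	uint64_t total; /* sum of all samples */
	uint64_t min;   /* smallest sample seen since reset */
	uint64_t max;   /* largest sample seen since reset */
};

/* Reset so that the first sample always becomes both min and max. */
void time_stat_reset(struct time_stat *s)
{
	assert(s != NULL);
	s->total = 0;
	s->min = UINT64_MAX;
	s->max = 0;
}

/* Account one measured interval. */
void time_stat_add(struct time_stat *s, uint64_t value)
{
	assert(s != NULL);
	s->total += value;
	if (value < s->min)
		s->min = value;
	if (value > s->max)
		s->max = value;
}
```

Starting the minimum at UINT64_MAX means a statistic that received no samples
since the last reset still reports UINT64_MAX as its minimum, so a reader of
the statistics may want to check the execution count before printing it.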
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
config/common_bsdapp | 5 +
config/common_linuxapp | 5 +
lib/Makefile | 1 +
lib/librte_headroom/Makefile | 54 +++++
lib/librte_headroom/rte_headroom.c | 271 ++++++++++++++++++++++
lib/librte_headroom/rte_headroom.h | 324 +++++++++++++++++++++++++++
lib/librte_headroom/rte_headroom_version.map | 19 ++
7 files changed, 679 insertions(+)
create mode 100644 lib/librte_headroom/Makefile
create mode 100644 lib/librte_headroom/rte_headroom.c
create mode 100644 lib/librte_headroom/rte_headroom.h
create mode 100644 lib/librte_headroom/rte_headroom_version.map
diff --git a/config/common_bsdapp b/config/common_bsdapp
index 57bacb8..aa2e5fd 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -282,6 +282,11 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d428f84..055a37b 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -290,6 +290,11 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/lib/Makefile b/lib/Makefile
index d617d81..4fc2819 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -54,6 +54,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
+DIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += librte_headroom
DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
diff --git a/lib/librte_headroom/Makefile b/lib/librte_headroom/Makefile
new file mode 100644
index 0000000..faefb3b
--- /dev/null
+++ b/lib/librte_headroom/Makefile
@@ -0,0 +1,54 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_headroom.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_headroom_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_HEADROOM) := rte_headroom.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_HEADROOM)-include := rte_headroom.h
+
+# this lib needs eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += lib/librte_mbuf
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_headroom/rte_headroom.c b/lib/librte_headroom/rte_headroom.c
new file mode 100644
index 0000000..a2cc671
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom.c
@@ -0,0 +1,271 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+#include <stdlib.h>
+
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_common.h>
+#include <rte_cycles.h>
+#include <rte_branch_prediction.h>
+#include <rte_debug.h>
+#include <rte_eal.h>
+#include <rte_malloc.h>
+
+#include "rte_headroom.h"
+
+/* Those are steps used to adjust job period.
+ * Experiments show that for forwarding apps the up step must be less than down
+ * step to achieve optimal performance.
+ */
+#define JOB_UPDATE_STEP_UP 1
+#define JOB_UPDATE_STEP_DOWN 4
+
+/*
+ * Default update function that implements simple period adjustment.
+ */
+static void
+default_update_function(struct rte_headroom_job *job, int64_t result)
+{
+ int64_t err = job->target - result;
+
+ /* Job is happy. Nothing to do */
+ if (err == 0)
+ return;
+
+ if (err > 0) {
+ if (job->period + JOB_UPDATE_STEP_UP < job->max_period)
+ job->period += JOB_UPDATE_STEP_UP;
+ } else {
+ if (job->min_period + JOB_UPDATE_STEP_DOWN < job->period)
+ job->period -= JOB_UPDATE_STEP_DOWN;
+ }
+}
+
+#define HDR_ADD_TIME_MIN_MAX(obj, type, value) do { \
+ typeof(value) tmp = (value); \
+ (obj)->type ## _time += tmp; \
+ if (tmp < (obj)->min_ ## type ## _time) \
+ (obj)->min_ ## type ## _time = tmp; \
+ if (tmp > (obj)->max_ ## type ## _time) \
+ (obj)->max_ ## type ## _time = tmp; \
+} while (0)
+
+#define HDR_RESET_TIME_MIN_MAX(obj, type) do { \
+ (obj)->type ## _time = 0; \
+ (obj)->min_ ## type ## _time = UINT64_MAX; \
+ (obj)->max_ ## type ## _time = 0; \
+} while (0)
+
+int
+rte_headroom_init(struct rte_headroom *hdr)
+{
+ if (hdr == NULL)
+ return -EINVAL;
+
+ /* Init only needed parameters. Zero out everything else. */
+ memset(hdr, 0, sizeof(struct rte_headroom));
+
+ rte_headroom_reset_stats(hdr);
+
+ return 0;
+}
+
+void
+rte_headroom_start_loop(struct rte_headroom *hdr)
+{
+ uint64_t now;
+
+ hdr->loop_executed_jobs = 0;
+
+ rte_mb();
+ now = rte_get_timer_cycles();
+ HDR_ADD_TIME_MIN_MAX(hdr, management, now - hdr->state_time);
+ hdr->state_time = now;
+}
+
+void
+rte_headroom_finish_loop(struct rte_headroom *hdr)
+{
+ uint64_t now;
+
+ if (likely(hdr->loop_executed_jobs))
+ hdr->loop_cnt++;
+
+ rte_mb();
+ now = rte_get_timer_cycles();
+ HDR_ADD_TIME_MIN_MAX(hdr, management, now - hdr->state_time);
+ hdr->state_time = now;
+}
+
+void
+rte_headroom_set_job_target(struct rte_headroom_job *job, int64_t target)
+{
+ job->target = target;
+}
+
+int
+rte_headroom_start_job(struct rte_headroom *hdr, struct rte_headroom_job *job)
+{
+ uint64_t now;
+
+ /* Some sanity check. */
+ if (unlikely(hdr == NULL || job == NULL || job->headroom != NULL))
+ return -EINVAL;
+
+ /* Link job with headroom object. */
+ job->headroom = hdr;
+
+ rte_mb();
+ now = rte_get_timer_cycles();
+ HDR_ADD_TIME_MIN_MAX(hdr, management, now - hdr->state_time);
+ hdr->state_time = now;
+
+ return 0;
+}
+
+int
+rte_headroom_finish_job(struct rte_headroom_job *job, int64_t job_value)
+{
+ struct rte_headroom *hdr;
+ uint64_t now, exec_time;
+ int need_update;
+
+ /* Some sanity check. */
+ if (unlikely(job == NULL || job->headroom == NULL))
+ return -EINVAL;
+
+ need_update = job->target != job_value;
+ /* Adjust period only if job is unhappy of its current period. */
+ if (need_update)
+ (*job->update_period_cb)(job, job_value);
+
+ hdr = job->headroom;
+
+ /* Update execution time is considered as runtime so get time after it is
+ * executed. */
+ rte_mb();
+ now = rte_get_timer_cycles();
+ exec_time = now - hdr->state_time;
+ HDR_ADD_TIME_MIN_MAX(job, exec, exec_time);
+ HDR_ADD_TIME_MIN_MAX(hdr, exec, exec_time);
+
+ hdr->state_time = now;
+
+ hdr->loop_executed_jobs++;
+ hdr->job_exec_cnt++;
+
+ job->exec_cnt++;
+ job->headroom = NULL;
+
+ return need_update;
+}
+
+void
+rte_headroom_job_set_period(struct rte_headroom_job *job, uint64_t period,
+ uint8_t saturate)
+{
+ if (saturate != 0) {
+ if (period < job->min_period)
+ period = job->min_period;
+ else if (period > job->max_period)
+ period = job->max_period;
+ }
+
+ job->period = period;
+}
+
+void
+rte_headroom_set_min_period(struct rte_headroom_job *job, uint64_t period)
+{
+ job->min_period = period;
+ if (job->period < period)
+ job->period = period;
+}
+
+void
+rte_headroom_set_max_period(struct rte_headroom_job *job, uint64_t period)
+{
+ job->max_period = period;
+ if (job->period > period)
+ job->period = period;
+}
+
+int
+rte_headroom_job_init(struct rte_headroom_job *job, const char *name,
+ uint64_t min_period, uint64_t max_period, uint64_t initial_period,
+ int64_t target)
+{
+ if (job == NULL)
+ return -EINVAL;
+
+ job->period = initial_period;
+ job->min_period = min_period;
+ job->max_period = max_period;
+ job->target = target;
+ job->update_period_cb = &default_update_function;
+ rte_headroom_reset_job_stats(job);
+ snprintf(job->name, RTE_DIM(job->name), "%s", name == NULL ? "" : name);
+ job->headroom = NULL;
+
+ return 0;
+}
+
+void
+rte_headroom_set_update_period_function(struct rte_headroom_job *job,
+ rte_headroom_update_fn_t update_period_cb)
+{
+ if (update_period_cb == NULL)
+ update_period_cb = default_update_function;
+
+ job->update_period_cb = update_period_cb;
+}
+
+void
+rte_headroom_reset_job_stats(struct rte_headroom_job *job)
+{
+ HDR_RESET_TIME_MIN_MAX(job, exec);
+ job->exec_cnt = 0;
+}
+
+void
+rte_headroom_reset_stats(struct rte_headroom *hdr)
+{
+ HDR_RESET_TIME_MIN_MAX(hdr, exec);
+ HDR_RESET_TIME_MIN_MAX(hdr, management);
+ hdr->start_time = rte_get_timer_cycles();
+ hdr->state_time = hdr->start_time;
+ hdr->job_exec_cnt = 0;
+ hdr->loop_cnt = 0;
+}
diff --git a/lib/librte_headroom/rte_headroom.h b/lib/librte_headroom/rte_headroom.h
new file mode 100644
index 0000000..2232c13
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom.h
@@ -0,0 +1,324 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef HEADROOM_H_
+#define HEADROOM_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_HEADROOM_JOB_NAMESIZE 32
+#define RTE_HEADROOM_NAMESIZE 32
+#define RTE_HEADROOM_MZ_PREFIX "HDR_"
+
+/* Forward declarations. */
+struct rte_headroom;
+struct rte_headroom_job;
+
+/**
+ * This function should calculate new period and set it using
+ * rte_headroom_job_set_period() function. Time spent in this function will be
+ * added to job's runtime.
+ *
+ * @param job
+ * The job data structure handler.
+ * @param job_result
+ * Result of calling job callback.
+ */
+typedef void (*rte_headroom_update_fn_t)(struct rte_headroom_job *job,
+ int64_t job_result);
+
+struct rte_headroom_job {
+ uint64_t period;
+ /**< Estimated period of execution. */
+
+ uint64_t min_period;
+ /**< Minimum period. */
+
+ uint64_t max_period;
+ /**< Maximum period. */
+
+ int64_t target;
+ /**< Desired value for this job. */
+
+ rte_headroom_update_fn_t update_period_cb;
+ /**< Period update callback. */
+
+ uint64_t exec_time;
+ /**< Total time (sum) that this job was executing. */
+
+ uint64_t min_exec_time;
+ /**< Minimum execute time. */
+
+ uint64_t max_exec_time;
+ /**< Maximum execute time. */
+
+ uint64_t exec_cnt;
+ /**< Execute count. */
+
+ char name[RTE_HEADROOM_JOB_NAMESIZE];
+ /**< Name of this job */
+
+ struct rte_headroom *headroom;
+ /**< Headroom object that is executing this job. */
+} __rte_cache_aligned;
+
+struct rte_headroom {
+ /** Variable holding time at different points:
+ * -# loop start time if loop was started but no job executed yet.
+ * -# job start time if job is currently executing.
+ * -# job finish time if job finished its execution.
+ * -# loop finish time if loop finished its execution. */
+ uint64_t state_time;
+
+ uint64_t loop_executed_jobs;
+ /**< Count of executed jobs in this loop. */
+
+ /* Statistics start. */
+
+ uint64_t exec_time;
+ /**< Total time taken to execute jobs, not including management time. */
+
+ uint64_t min_exec_time;
+ /**< Minimum loop execute time. */
+
+ uint64_t max_exec_time;
+ /**< Maximum loop execute time. */
+
+ /**
+ * Sum of time that is not the execute time (ex: from job finish to next
+ * job start).
+ *
+ * This time might be considered as overhead of headroom library + job
+ * scheduling.
+ */
+ uint64_t management_time;
+
+ uint64_t min_management_time;
+ /**< Minimum management time */
+
+ uint64_t max_management_time;
+ /**< Maximum management time */
+
+ uint64_t start_time;
+ /**< Time of the last statistics reset. */
+
+ uint64_t job_exec_cnt;
+ /**< Total count of executed jobs. */
+
+ uint64_t loop_cnt;
+ /**< Total count of executed loops with at least one executed job. */
+} __rte_cache_aligned;
+
+/**
+ * Initialize given headroom object with default values.
+ *
+ * @param hdr
+ * Headroom object to initialize.
+ *
+ * @return
+ * 0 on success
+ * -EINVAL if *hdr* is NULL
+ */
+int
+rte_headroom_init(struct rte_headroom *hdr);
+
+/**
+ * Mark that new set of jobs start executing.
+ *
+ * @param hdr
+ * Headroom object.
+ */
+void
+rte_headroom_start_loop(struct rte_headroom *hdr);
+
+/**
+ * Mark that there are no more jobs ready to execute in this turn. Calculate
+ * stats for this loop turn.
+ *
+ * @param hdr
+ * Headroom object.
+ */
+void
+rte_headroom_finish_loop(struct rte_headroom *hdr);
+
+/**
+ * Initialize given job stats object.
+ *
+ * @param job
+ * Job object.
+ * @param name
+ * Optional job name.
+ * @param min_period
+ * Minimum period that this job can accept.
+ * @param max_period
+ * Maximum period that this job can accept.
+ * @param initial_period
+ * Initial period. It will be checked against *min_period* and *max_period*.
+ * @param target
+ * Target value that this job tries to achieve.
+ *
+ * @return
+ * 0 on success
+ * -EINVAL if *job* is NULL
+ */
+int
+rte_headroom_job_init(struct rte_headroom_job *job, const char *name,
+ uint64_t min_period, uint64_t max_period, uint64_t initial_period,
+ int64_t target);
+
+/**
+ * Set job desired target value. Difference between target and job callback
+ * return value must be used to properly adjust job execute period value.
+ *
+ * @param job
+ * The job object.
+ * @param target
+ * New target.
+ */
+void
+rte_headroom_set_job_target(struct rte_headroom_job *job, int64_t target);
+
+/**
+ * Mark that *job* is starting its execution in context of *hdr* object.
+ *
+ * @param hdr
+ * Headroom object context.
+ * @param job
+ * Job object.
+ * @return
+ * 0 on success
+ * -EINVAL if *hdr* or *job* is NULL or *job* is executing in another headroom
+ * context already.
+ */
+int
+rte_headroom_start_job(struct rte_headroom *hdr, struct rte_headroom_job *job);
+
+/**
+ * Mark that *job* finished its execution. Context in which it was executing will
+ * receive stat update. After this function call *job* object is ready to be
+ * executed in other headroom context.
+ *
+ * @param job
+ * Job object.
+ * @param job_value
+ * Job value. The job should pass in this parameter the value it tries to
+ * optimize, for example the number of packets it processed.
+ *
+ * @return
+ * 0 if job's period was not updated (job target equals *job_value*)
+ * 1 if job's period was updated
+ * -EINVAL if job is NULL or job was not started (it has no headroom context).
+ */
+int
+rte_headroom_finish_job(struct rte_headroom_job *job, int64_t job_value);
+
+/**
+ * Set execute period of given job.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New period value.
+ * @param saturate
+ * If zero, skip period saturation to min, max range.
+ */
+void
+rte_headroom_job_set_period(struct rte_headroom_job *job, uint64_t period,
+ uint8_t saturate);
+/**
+ * Set minimum execute period of given job. Current period will be checked
+ * against new minimum value.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New minimum period value.
+ */
+void
+rte_headroom_set_min_period(struct rte_headroom_job *job, uint64_t period);
+/**
+ * Set maximum execute period of given job. Current period will be checked
+ * against new maximum value.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New maximum period value.
+ */
+void
+rte_headroom_set_max_period(struct rte_headroom_job *job, uint64_t period);
+
+/**
+ * Set update period callback that is invoked after job finish.
+ *
+ * If application wants to do more sophisticated calculations than default
+ * it can provide this handler.
+ *
+ * @param job
+ * Job object.
+ * @param update_period_cb
+ * Callback to set. If NULL restore default update function.
+ */
+void
+rte_headroom_set_update_period_function(struct rte_headroom_job *job,
+ rte_headroom_update_fn_t update_period_cb);
+
+/**
+ * Function resets job statistics.
+ *
+ * @param job
+ * Job whose statistics will be reset.
+ */
+void
+rte_headroom_reset_job_stats(struct rte_headroom_job *job);
+/**
+ * Function resets headroom statistics.
+ *
+ * @param hdr
+ * Headroom whose statistics will be reset.
+ */
+void
+rte_headroom_reset_stats(struct rte_headroom *hdr);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* HEADROOM_H_ */
diff --git a/lib/librte_headroom/rte_headroom_version.map b/lib/librte_headroom/rte_headroom_version.map
new file mode 100644
index 0000000..3c66812
--- /dev/null
+++ b/lib/librte_headroom/rte_headroom_version.map
@@ -0,0 +1,19 @@
+DPDK_2.0 {
+ global:
+
+ rte_headroom_init;
+ rte_headroom_start_loop;
+ rte_headroom_finish_loop;
+ rte_headroom_job_init;
+ rte_headroom_set_job_target;
+ rte_headroom_start_job;
+ rte_headroom_finish_job;
+ rte_headroom_job_set_period;
+ rte_headroom_set_min_period;
+ rte_headroom_set_max_period;
+ rte_headroom_set_update_period_function;
+ rte_headroom_reset_job_stats;
+ rte_headroom_reset_stats;
+
+ local: *;
+};
--
1.9.1
* [dpdk-dev] [PATCH v5 2/3] examples: introduce new l2fwd-headroom example
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Pawel Wodkowski
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 1/3] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
@ 2015-02-19 12:18 ` Pawel Wodkowski
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 3/3] MAINTAINERS: claim responsibility for headroom library and example app Pawel Wodkowski
` (2 subsequent siblings)
4 siblings, 0 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-19 12:18 UTC (permalink / raw)
To: dev
This app demonstrates usage of the new headroom library.
It is basically the original l2fwd with the following modifications to meet
headroom library requirements:
- main_loop() was split into two jobs: forward job and flush job. Logic
for those jobs is almost the same as in the original application.
- stats printing is moved to an rte_alarm callback to avoid introducing
printing overhead.
- stats are expanded to show headroom statistics.
- added new parameter '-l' to enable automatic thousands separators.
Comparing the original l2fwd and l2fwd-headroom apps shows the approach
needed to properly write your own application with headroom measurements.
New available statistics:
- Total and % of fwd and flush execution time
- management time - overhead of rte_timer + overhead of headroom library
- Idle time and % of time spent waiting for fwd or flush to be ready to
execute.
- per job execution time and period.
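How such figures might be derived from raw cycle counters is sketched below.
The percentage formula follows the "exec * 100.0 / stats_period" expression
used in the example's stats printing; everything else is an assumption made
for illustration: the function names are hypothetical, the timer frequency is
passed in as a plain parameter, and idle time is assumed to be whatever is
left of the stats period after execution and management time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a cycle count to nanoseconds for a timer running at hz ticks/s.
 * Assumption: the real app may convert differently. */
double model_cycles_to_ns(uint64_t cycles, uint64_t hz)
{
	assert(hz != 0);
	return (double)cycles * 1e9 / (double)hz;
}

/* Share of the stats period taken by one counter, in percent. */
double model_percent(uint64_t part, uint64_t period)
{
	return period ? (double)part * 100.0 / (double)period : 0.0;
}

/* Assumption: idle = period - (exec + management), clamped at zero. */
uint64_t model_idle_cycles(uint64_t period, uint64_t exec, uint64_t mgmt)
{
	uint64_t busy = exec + mgmt;

	return busy < period ? period - busy : 0;
}
```

With a stats period of 100 cycles, 60 cycles of execution and 15 cycles of
management, this sketch reports 60% execution time and 25 idle cycles.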
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
examples/Makefile | 1 +
examples/l2fwd-headroom/Makefile | 51 ++
examples/l2fwd-headroom/main.c | 1040 ++++++++++++++++++++++++++++++++++++++
mk/rte.app.mk | 4 +
4 files changed, 1096 insertions(+)
create mode 100644 examples/l2fwd-headroom/Makefile
create mode 100644 examples/l2fwd-headroom/main.c
diff --git a/examples/Makefile b/examples/Makefile
index 81f1d2f..8a459b7 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ip_fragmentation
DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
DIRS-y += l2fwd
+DIRS-y += l2fwd-headroom
DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
DIRS-y += l3fwd
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
diff --git a/examples/l2fwd-headroom/Makefile b/examples/l2fwd-headroom/Makefile
new file mode 100644
index 0000000..07da286
--- /dev/null
+++ b/examples/l2fwd-headroom/Makefile
@@ -0,0 +1,51 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l2fwd-headroom
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l2fwd-headroom/main.c b/examples/l2fwd-headroom/main.c
new file mode 100644
index 0000000..d7e557d
--- /dev/null
+++ b/examples/l2fwd-headroom/main.c
@@ -0,0 +1,1040 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <locale.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <ctype.h>
+#include <getopt.h>
+
+#include <rte_alarm.h>
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+#include <rte_memzone.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_launch.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_prefetch.h>
+#include <rte_lcore.h>
+#include <rte_per_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_interrupts.h>
+#include <rte_pci.h>
+#include <rte_debug.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+#include <rte_ring.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_spinlock.h>
+
+#include <rte_errno.h>
+#include <rte_headroom.h>
+#include <rte_timer.h>
+#include <rte_alarm.h>
+
+#define RTE_LOGTYPE_L2FWD RTE_LOGTYPE_USER1
+
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NB_MBUF 8192
+
+#define MAX_PKT_BURST 32
+#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */
+
+/*
+ * Configurable number of RX/TX ring descriptors
+ */
+#define RTE_TEST_RX_DESC_DEFAULT 128
+#define RTE_TEST_TX_DESC_DEFAULT 512
+static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
+static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
+
+/* ethernet addresses of ports */
+static struct ether_addr l2fwd_ports_eth_addr[RTE_MAX_ETHPORTS];
+
+/* mask of enabled ports */
+static uint32_t l2fwd_enabled_port_mask;
+
+/* list of enabled ports */
+static uint32_t l2fwd_dst_ports[RTE_MAX_ETHPORTS];
+
+#define UPDATE_STEP_UP 1
+#define UPDATE_STEP_DOWN 32
+
+static unsigned int l2fwd_rx_queue_per_lcore = 1;
+
+struct mbuf_table {
+ uint64_t next_flush_time;
+ unsigned len;
+ struct rte_mbuf *mbufs[MAX_PKT_BURST];
+};
+
+#define MAX_RX_QUEUE_PER_LCORE 16
+#define MAX_TX_QUEUE_PER_PORT 16
+struct lcore_queue_conf {
+ unsigned n_rx_port;
+ unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
+ struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+
+ struct rte_timer rx_timers[MAX_RX_QUEUE_PER_LCORE];
+ struct rte_headroom_job port_fwd_jobs[MAX_RX_QUEUE_PER_LCORE];
+
+ struct rte_timer flush_timer;
+ struct rte_headroom_job flush_job;
+ struct rte_headroom_job idle_job;
+ struct rte_headroom headroom;
+
+ rte_atomic16_t stats_read_pending;
+ rte_spinlock_t lock;
+} __rte_cache_aligned;
+struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];
+
+static const struct rte_eth_conf port_conf = {
+ .rxmode = {
+ .split_hdr_size = 0,
+ .header_split = 0, /**< Header Split disabled */
+ .hw_ip_checksum = 0, /**< IP checksum offload disabled */
+ .hw_vlan_filter = 0, /**< VLAN filtering disabled */
+ .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
+ .hw_strip_crc = 0, /**< CRC stripped by hardware */
+ },
+ .txmode = {
+ .mq_mode = ETH_MQ_TX_NONE,
+ },
+};
+
+struct rte_mempool *l2fwd_pktmbuf_pool = NULL;
+
+/* Per-port statistics struct */
+struct l2fwd_port_statistics {
+ uint64_t tx;
+ uint64_t rx;
+ uint64_t dropped;
+} __rte_cache_aligned;
+struct l2fwd_port_statistics port_statistics[RTE_MAX_ETHPORTS];
+
+/* 1 day max */
+#define MAX_TIMER_PERIOD 86400
+/* default period is 10 seconds */
+static int64_t timer_period = 10;
+/* default timer frequency */
+static double hz;
+/* BURST_TX_DRAIN_US converted to cycles */
+uint64_t drain_tsc;
+/* Convert cycles to ns */
+static inline double
+cycles_to_ns(uint64_t cycles)
+{
+ double t = cycles;
+
+ t *= (double)NS_PER_S;
+ t /= hz;
+ return t;
+}
+
+static void
+show_lcore_headroom_stats(unsigned lcore_id)
+{
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct rte_headroom *hdr = &qconf->headroom;
+ struct rte_headroom_job *job;
+ uint8_t i;
+
+ /* Headroom statistics. */
+ uint64_t stats_period, loop_count;
+ uint64_t exec, exec_min, exec_max;
+ uint64_t management, management_min, management_max;
+ uint64_t busy, busy_min, busy_max;
+
+ /* Jobs statistics. */
+ const uint8_t port_cnt = qconf->n_rx_port;
+ uint64_t jobs_exec_cnt[port_cnt], jobs_period[port_cnt];
+ uint64_t jobs_exec[port_cnt], jobs_exec_min[port_cnt],
+ jobs_exec_max[port_cnt];
+
+ uint64_t flush_exec_cnt, flush_period;
+ uint64_t flush_exec, flush_exec_min, flush_exec_max;
+
+ uint64_t idle_exec_cnt;
+ uint64_t idle_exec, idle_exec_min, idle_exec_max;
+ uint64_t collection_time = rte_get_timer_cycles();
+
+ /* Ask forwarding thread to give us stats. */
+ rte_atomic16_set(&qconf->stats_read_pending, 1);
+ rte_spinlock_lock(&qconf->lock);
+ rte_atomic16_set(&qconf->stats_read_pending, 0);
+
+ /* Collect headroom statistics. */
+ stats_period = hdr->state_time - hdr->start_time;
+ loop_count = hdr->loop_cnt;
+
+ exec = hdr->exec_time;
+ exec_min = hdr->min_exec_time;
+ exec_max = hdr->max_exec_time;
+
+ management = hdr->management_time;
+ management_min = hdr->min_management_time;
+ management_max = hdr->max_management_time;
+
+ rte_headroom_reset_stats(hdr);
+
+ for (i = 0; i < port_cnt; i++) {
+ job = &qconf->port_fwd_jobs[i];
+
+ jobs_exec_cnt[i] = job->exec_cnt;
+ jobs_period[i] = job->period;
+
+ jobs_exec[i] = job->exec_time;
+ jobs_exec_min[i] = job->min_exec_time;
+ jobs_exec_max[i] = job->max_exec_time;
+
+ rte_headroom_reset_job_stats(job);
+ }
+
+ flush_exec_cnt = qconf->flush_job.exec_cnt;
+ flush_period = qconf->flush_job.period;
+ flush_exec = qconf->flush_job.exec_time;
+ flush_exec_min = qconf->flush_job.min_exec_time;
+ flush_exec_max = qconf->flush_job.max_exec_time;
+ rte_headroom_reset_job_stats(&qconf->flush_job);
+
+ idle_exec_cnt = qconf->idle_job.exec_cnt;
+ idle_exec = qconf->idle_job.exec_time;
+ idle_exec_min = qconf->idle_job.min_exec_time;
+ idle_exec_max = qconf->idle_job.max_exec_time;
+ rte_headroom_reset_job_stats(&qconf->idle_job);
+
+ rte_spinlock_unlock(&qconf->lock);
+
+ exec -= idle_exec;
+ busy = exec + management;
+ busy_min = exec_min + management_min;
+ busy_max = exec_max + management_max;
+
+
+ collection_time = rte_get_timer_cycles() - collection_time;
+
+#define STAT_FMT "\n%-18s %'14.0f %6.1f%% %'10.0f %'10.0f %'10.0f"
+
+ printf("\n----------------"
+ "\nLCore %3u: headroom statistics (time in ns, collected in %'9.0f)"
+ "\n%-18s %14s %7s %10s %10s %10s "
+ "\n%-18s %'14.0f"
+ "\n%-18s %'14" PRIu64
+ STAT_FMT /* Exec */
+ STAT_FMT /* Management */
+ STAT_FMT /* Busy */
+ STAT_FMT, /* Idle */
+ lcore_id, cycles_to_ns(collection_time),
+ "Stat type", "total", "%total", "avg", "min", "max",
+ "Stats duration:", cycles_to_ns(stats_period),
+ "Loop count:", loop_count,
+ "Exec time",
+ cycles_to_ns(exec), exec * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? exec / loop_count : 0),
+ cycles_to_ns(exec_min),
+ cycles_to_ns(exec_max),
+ "Management time",
+ cycles_to_ns(management), management * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? management / loop_count : 0),
+ cycles_to_ns(management_min),
+ cycles_to_ns(management_max),
+ "Exec + management",
+ cycles_to_ns(busy), busy * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? busy / loop_count : 0),
+ cycles_to_ns(busy_min),
+ cycles_to_ns(busy_max),
+ "Idle (job)",
+ cycles_to_ns(idle_exec), idle_exec * 100.0 / stats_period,
+ cycles_to_ns(idle_exec_cnt ? idle_exec / idle_exec_cnt : 0),
+ cycles_to_ns(idle_exec_min),
+ cycles_to_ns(idle_exec_max));
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+ job = &qconf->port_fwd_jobs[i];
+ printf("\n\nJob %" PRIu32 ": %-20s "
+ "\n%-18s %'14" PRIu64
+ "\n%-18s %'14.0f"
+ STAT_FMT,
+ i, job->name,
+ "Exec count:", jobs_exec_cnt[i],
+ "Exec period: ", cycles_to_ns(jobs_period[i]),
+ "Exec time",
+ cycles_to_ns(jobs_exec[i]), jobs_exec[i] * 100.0 / stats_period,
+ cycles_to_ns(jobs_exec_cnt[i] ? jobs_exec[i] / jobs_exec_cnt[i]
+ : 0),
+ cycles_to_ns(jobs_exec_min[i]),
+ cycles_to_ns(jobs_exec_max[i]));
+ }
+
+ if (qconf->n_rx_port > 0) {
+ job = &qconf->flush_job;
+ printf("\n\nJob %" PRIu32 ": %-20s "
+ "\n%-18s %'14" PRIu64
+ "\n%-18s %'14.0f"
+ STAT_FMT,
+ i, job->name,
+ "Exec count:", flush_exec_cnt,
+ "Exec period: ", cycles_to_ns(flush_period),
+ "Exec time",
+ cycles_to_ns(flush_exec), flush_exec * 100.0 / stats_period,
+ cycles_to_ns(flush_exec_cnt ? flush_exec / flush_exec_cnt : 0),
+ cycles_to_ns(flush_exec_min),
+ cycles_to_ns(flush_exec_max));
+ }
+}
+
+/* Print out statistics on packets dropped */
+static void
+show_stats_cb(__rte_unused void *param)
+{
+ uint64_t total_packets_dropped, total_packets_tx, total_packets_rx;
+ unsigned portid, lcore_id;
+
+ total_packets_dropped = 0;
+ total_packets_tx = 0;
+ total_packets_rx = 0;
+
+ const char clr[] = { 27, '[', '2', 'J', '\0' };
+ const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
+
+ /* Clear screen and move to top left */
+ printf("%s%s"
+ "\nPort statistics ===================================",
+ clr, topLeft);
+
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ /* skip disabled ports */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+ printf("\nStatistics for port %u ------------------------------"
+ "\nPackets sent: %24"PRIu64
+ "\nPackets received: %20"PRIu64
+ "\nPackets dropped: %21"PRIu64,
+ portid,
+ port_statistics[portid].tx,
+ port_statistics[portid].rx,
+ port_statistics[portid].dropped);
+
+ total_packets_dropped += port_statistics[portid].dropped;
+ total_packets_tx += port_statistics[portid].tx;
+ total_packets_rx += port_statistics[portid].rx;
+ }
+
+ printf("\nAggregate statistics ==============================="
+ "\nTotal packets sent: %18"PRIu64
+ "\nTotal packets received: %14"PRIu64
+ "\nTotal packets dropped: %15"PRIu64
+ "\n====================================================",
+ total_packets_tx,
+ total_packets_rx,
+ total_packets_dropped);
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ if (lcore_queue_conf[lcore_id].n_rx_port > 0)
+ show_lcore_headroom_stats(lcore_id);
+ }
+
+ printf("\n====================================================\n");
+ rte_eal_alarm_set(timer_period * US_PER_S, show_stats_cb, NULL);
+}
+
+/* Send the burst of packets on an output interface */
+static void
+l2fwd_send_burst(struct lcore_queue_conf *qconf, uint8_t port)
+{
+ struct mbuf_table *m_table;
+ uint16_t ret;
+ uint16_t queueid = 0;
+ uint16_t n;
+
+ m_table = &qconf->tx_mbufs[port];
+ n = m_table->len;
+
+ m_table->next_flush_time = rte_get_timer_cycles() + drain_tsc;
+ m_table->len = 0;
+
+ ret = rte_eth_tx_burst(port, queueid, m_table->mbufs, n);
+
+ port_statistics[port].tx += ret;
+ if (unlikely(ret < n)) {
+ port_statistics[port].dropped += (n - ret);
+ do {
+ rte_pktmbuf_free(m_table->mbufs[ret]);
+ } while (++ret < n);
+ }
+}
+
+/* Enqueue packets for TX and prepare them to be sent */
+static int
+l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
+{
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct mbuf_table *m_table = &qconf->tx_mbufs[port];
+ uint16_t len = qconf->tx_mbufs[port].len;
+
+ m_table->mbufs[len] = m;
+
+ len++;
+ m_table->len = len;
+
+ /* Enough pkts to be sent. */
+ if (unlikely(len == MAX_PKT_BURST))
+ l2fwd_send_burst(qconf, port);
+
+ return 0;
+}
+
+static void
+l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
+{
+ struct ether_hdr *eth;
+ void *tmp;
+ unsigned dst_port;
+
+ dst_port = l2fwd_dst_ports[portid];
+ eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ /* 02:00:00:00:00:xx */
+ tmp = &eth->d_addr.addr_bytes[0];
+ *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40);
+
+ /* src addr */
+ ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], &eth->s_addr);
+
+ l2fwd_send_packet(m, (uint8_t) dst_port);
+}
+
+static void
+l2fwd_job_update_cb(struct rte_headroom_job *job, int64_t result)
+{
+ int64_t err = job->target - result;
+ int64_t hysteresis = job->target / 8;
+
+ if (err < -hysteresis) {
+ if (job->min_period + UPDATE_STEP_DOWN < job->period)
+ job->period -= UPDATE_STEP_DOWN;
+ } else if (err > hysteresis) {
+ if (job->period + UPDATE_STEP_UP < job->max_period)
+ job->period += UPDATE_STEP_UP;
+ }
+}
+
+static void
+l2fwd_fwd_job(__rte_unused struct rte_timer *timer, void *arg)
+{
+ struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+ struct rte_mbuf *m;
+
+ const uint8_t port_idx = (uintptr_t) arg;
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct rte_headroom_job *job = &qconf->port_fwd_jobs[port_idx];
+ const uint8_t portid = qconf->rx_port_list[port_idx];
+
+ uint8_t j;
+ uint16_t total_nb_rx;
+
+ rte_headroom_start_job(&qconf->headroom, job);
+
+ /* Call rx burst twice. This lets the headroom logic see whether this
+ * function must be called more frequently. */
+
+ total_nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ for (j = 0; j < total_nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+
+ if (total_nb_rx == MAX_PKT_BURST) {
+ const uint16_t nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ total_nb_rx += nb_rx;
+ for (j = 0; j < nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+ }
+
+ port_statistics[portid].rx += total_nb_rx;
+
+ /* Adjust period time in which we are running here. */
+ if (rte_headroom_finish_job(job, total_nb_rx) != 0) {
+ rte_timer_reset(&qconf->rx_timers[port_idx], job->period, PERIODICAL,
+ lcore_id, l2fwd_fwd_job, arg);
+ }
+}
+
+static void
+l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused void *arg)
+{
+ uint64_t now;
+ unsigned lcore_id;
+ struct lcore_queue_conf *qconf;
+ struct mbuf_table *m_table;
+ uint8_t portid;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_headroom_start_job(&qconf->headroom, &qconf->flush_job);
+
+ now = rte_get_timer_cycles();
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ m_table = &qconf->tx_mbufs[portid];
+ if (m_table->len == 0 || m_table->next_flush_time <= now)
+ continue;
+
+ l2fwd_send_burst(qconf, portid);
+ }
+
+
+ /* Pass the target to indicate that this job is happy with the time
+ * interval at which it was called. */
+ rte_headroom_finish_job(&qconf->flush_job, qconf->flush_job.target);
+}
+
+/* main processing loop */
+static void
+l2fwd_main_loop(void)
+{
+ unsigned lcore_id;
+ unsigned i, portid;
+ struct lcore_queue_conf *qconf;
+ uint8_t stats_read_pending = 0;
+ uint8_t need_manage;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ if (qconf->n_rx_port == 0) {
+ RTE_LOG(INFO, L2FWD, "lcore %u has nothing to do\n", lcore_id);
+ return;
+ }
+
+ RTE_LOG(INFO, L2FWD, "entering main loop on lcore %u\n", lcore_id);
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+
+ portid = qconf->rx_port_list[i];
+ RTE_LOG(INFO, L2FWD, " -- lcoreid=%u portid=%u\n", lcore_id,
+ portid);
+ }
+
+ rte_headroom_job_init(&qconf->idle_job, "idle", 0, 0, 0, 0);
+
+ for (;;) {
+ rte_spinlock_lock(&qconf->lock);
+
+ do {
+ rte_headroom_start_loop(&qconf->headroom);
+
+ /* Do the Idle job:
+ * - Read stats_read_pending flag
+ * - check if some real job need to be executed
+ */
+ rte_headroom_start_job(&qconf->headroom, &qconf->idle_job);
+
+ do {
+ uint8_t i;
+ uint64_t now = rte_get_timer_cycles();
+
+ need_manage = qconf->flush_timer.expire < now;
+ /* Check if we were asked to provide stats. */
+ stats_read_pending =
+ rte_atomic16_read(&qconf->stats_read_pending);
+ need_manage |= stats_read_pending;
+
+ for (i = 0; i < qconf->n_rx_port && !need_manage; i++)
+ need_manage = qconf->rx_timers[i].expire < now;
+
+ } while (!need_manage);
+ rte_headroom_finish_job(&qconf->idle_job, qconf->idle_job.target);
+
+ rte_timer_manage();
+ rte_headroom_finish_loop(&qconf->headroom);
+ } while (likely(stats_read_pending == 0));
+
+ rte_spinlock_unlock(&qconf->lock);
+ rte_pause();
+ }
+}
+
+static int
+l2fwd_launch_one_lcore(__attribute__((unused)) void *dummy)
+{
+ l2fwd_main_loop();
+ return 0;
+}
+
+/* display usage */
+static void
+l2fwd_usage(const char *prgname)
+{
+ printf("%s [EAL options] -- -p PORTMASK [-q NQ]\n"
+ " -p PORTMASK: hexadecimal bitmask of ports to configure\n"
+ " -q NQ: number of queue (=ports) per lcore (default is 1)\n"
+ " -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default, 86400 maximum)\n"
+ " -l set system default locale instead of default (\"C\" locale) for thousands separator in stats.",
+ prgname);
+}
+
+static int
+l2fwd_parse_portmask(const char *portmask)
+{
+ char *end = NULL;
+ unsigned long pm;
+
+ /* parse hexadecimal string */
+ pm = strtoul(portmask, &end, 16);
+ if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+
+ if (pm == 0)
+ return -1;
+
+ return pm;
+}
+
+static unsigned int
+l2fwd_parse_nqueue(const char *q_arg)
+{
+ char *end = NULL;
+ unsigned long n;
+
+ /* parse decimal string */
+ n = strtoul(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return 0;
+ if (n == 0)
+ return 0;
+ if (n >= MAX_RX_QUEUE_PER_LCORE)
+ return 0;
+
+ return n;
+}
+
+static int
+l2fwd_parse_timer_period(const char *q_arg)
+{
+ char *end = NULL;
+ int n;
+
+ /* parse number string */
+ n = strtol(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+ if (n >= MAX_TIMER_PERIOD)
+ return -1;
+
+ return n;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+l2fwd_parse_args(int argc, char **argv)
+{
+ int opt, ret;
+ char **argvopt;
+ int option_index;
+ char *prgname = argv[0];
+ static struct option lgopts[] = {
+ {NULL, 0, 0, 0}
+ };
+
+ argvopt = argv;
+
+ while ((opt = getopt_long(argc, argvopt, "p:q:T:l",
+ lgopts, &option_index)) != EOF) {
+
+ switch (opt) {
+ /* portmask */
+ case 'p':
+ l2fwd_enabled_port_mask = l2fwd_parse_portmask(optarg);
+ if (l2fwd_enabled_port_mask == 0) {
+ printf("invalid portmask\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* nqueue */
+ case 'q':
+ l2fwd_rx_queue_per_lcore = l2fwd_parse_nqueue(optarg);
+ if (l2fwd_rx_queue_per_lcore == 0) {
+ printf("invalid queue number\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* timer period */
+ case 'T':
+ timer_period = l2fwd_parse_timer_period(optarg);
+ if (timer_period < 0) {
+ printf("invalid timer period\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* For thousands separator in printf. */
+ case 'l':
+ setlocale(LC_ALL, "");
+ break;
+
+ /* long options */
+ case 0:
+ l2fwd_usage(prgname);
+ return -1;
+
+ default:
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ }
+
+ if (optind >= 0)
+ argv[optind-1] = prgname;
+
+ ret = optind-1;
+ optind = 0; /* reset getopt lib */
+ return ret;
+}
+
+/* Check the link status of all ports in up to 9s, and print them finally */
+static void
+check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+ uint8_t portid, count, all_ports_up, print_flag = 0;
+ struct rte_eth_link link;
+
+ printf("\nChecking link status");
+ fflush(stdout);
+ for (count = 0; count <= MAX_CHECK_TIME; count++) {
+ all_ports_up = 1;
+ for (portid = 0; portid < port_num; portid++) {
+ if ((port_mask & (1 << portid)) == 0)
+ continue;
+ memset(&link, 0, sizeof(link));
+ rte_eth_link_get_nowait(portid, &link);
+ /* print link status if flag set */
+ if (print_flag == 1) {
+ if (link.link_status)
+ printf("Port %d Link Up - speed %u "
+ "Mbps - %s\n", (uint8_t)portid,
+ (unsigned)link.link_speed,
+ (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+ ("full-duplex") : ("half-duplex\n"));
+ else
+ printf("Port %d Link Down\n",
+ (uint8_t)portid);
+ continue;
+ }
+ /* clear all_ports_up flag if any link down */
+ if (link.link_status == 0) {
+ all_ports_up = 0;
+ break;
+ }
+ }
+ /* after finally printing all link status, get out */
+ if (print_flag == 1)
+ break;
+
+ if (all_ports_up == 0) {
+ printf(".");
+ fflush(stdout);
+ rte_delay_ms(CHECK_INTERVAL);
+ }
+
+ /* set the print_flag if all ports up or timeout */
+ if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+ print_flag = 1;
+ printf("done\n");
+ }
+ }
+}
+
+int
+main(int argc, char **argv)
+{
+ struct lcore_queue_conf *qconf;
+ struct rte_eth_dev_info dev_info;
+ unsigned lcore_id, rx_lcore_id;
+ unsigned nb_ports_in_mask = 0;
+ int ret;
+ char name[RTE_HEADROOM_JOB_NAMESIZE];
+ uint8_t nb_ports;
+ uint8_t nb_ports_available;
+ uint8_t portid, last_port;
+ uint8_t i;
+
+ /* init EAL */
+ ret = rte_eal_init(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+ argc -= ret;
+ argv += ret;
+
+ /* parse application arguments (after the EAL ones) */
+ ret = l2fwd_parse_args(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
+
+ rte_timer_subsystem_init();
+
+ /* fetch default timer frequency. */
+ hz = rte_get_timer_hz();
+
+ /* create the mbuf pool */
+ l2fwd_pktmbuf_pool =
+ rte_mempool_create("mbuf_pool", NB_MBUF,
+ MBUF_SIZE, 32,
+ sizeof(struct rte_pktmbuf_pool_private),
+ rte_pktmbuf_pool_init, NULL,
+ rte_pktmbuf_init, NULL,
+ rte_socket_id(), 0);
+ if (l2fwd_pktmbuf_pool == NULL)
+ rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+ nb_ports = rte_eth_dev_count();
+ if (nb_ports == 0)
+ rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
+
+ if (nb_ports > RTE_MAX_ETHPORTS)
+ nb_ports = RTE_MAX_ETHPORTS;
+
+ /* reset l2fwd_dst_ports */
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
+ l2fwd_dst_ports[portid] = 0;
+ last_port = 0;
+
+ /*
+ * Each logical core is assigned a dedicated TX queue on each port.
+ */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ if (nb_ports_in_mask % 2) {
+ l2fwd_dst_ports[portid] = last_port;
+ l2fwd_dst_ports[last_port] = portid;
+ } else
+ last_port = portid;
+
+ nb_ports_in_mask++;
+
+ rte_eth_dev_info_get(portid, &dev_info);
+ }
+ if (nb_ports_in_mask % 2) {
+ printf("Notice: odd number of ports in portmask.\n");
+ l2fwd_dst_ports[last_port] = last_port;
+ }
+
+ rx_lcore_id = 0;
+ qconf = NULL;
+
+ /* Initialize the port/queue configuration of each logical core */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ /* get the lcore_id for this port */
+ while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
+ lcore_queue_conf[rx_lcore_id].n_rx_port ==
+ l2fwd_rx_queue_per_lcore) {
+ rx_lcore_id++;
+ if (rx_lcore_id >= RTE_MAX_LCORE)
+ rte_exit(EXIT_FAILURE, "Not enough cores\n");
+ }
+
+ if (qconf != &lcore_queue_conf[rx_lcore_id])
+ /* Assigned a new logical core in the loop above. */
+ qconf = &lcore_queue_conf[rx_lcore_id];
+
+ qconf->rx_port_list[qconf->n_rx_port] = portid;
+ qconf->n_rx_port++;
+ printf("Lcore %u: RX port %u\n", rx_lcore_id, (unsigned) portid);
+ }
+
+ nb_ports_available = nb_ports;
+
+ /* Initialise each port */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0) {
+ printf("Skipping disabled port %u\n", (unsigned) portid);
+ nb_ports_available--;
+ continue;
+ }
+ /* init port */
+ printf("Initializing port %u... ", (unsigned) portid);
+ fflush(stdout);
+ ret = rte_eth_dev_configure(portid, 1, 1, &port_conf);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ rte_eth_macaddr_get(portid, &l2fwd_ports_eth_addr[portid]);
+
+ /* init one RX queue */
+ fflush(stdout);
+ ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+ rte_eth_dev_socket_id(portid),
+ NULL,
+ l2fwd_pktmbuf_pool);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* init one TX queue on each port */
+ fflush(stdout);
+ ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+ rte_eth_dev_socket_id(portid),
+ NULL);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* Start device */
+ ret = rte_eth_dev_start(portid);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ printf("done:\n");
+
+ rte_eth_promiscuous_enable(portid);
+
+ printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
+ (unsigned) portid,
+ l2fwd_ports_eth_addr[portid].addr_bytes[0],
+ l2fwd_ports_eth_addr[portid].addr_bytes[1],
+ l2fwd_ports_eth_addr[portid].addr_bytes[2],
+ l2fwd_ports_eth_addr[portid].addr_bytes[3],
+ l2fwd_ports_eth_addr[portid].addr_bytes[4],
+ l2fwd_ports_eth_addr[portid].addr_bytes[5]);
+
+ /* initialize port stats */
+ memset(&port_statistics, 0, sizeof(port_statistics));
+ }
+
+ if (!nb_ports_available) {
+ rte_exit(EXIT_FAILURE,
+ "All available ports are disabled. Please set portmask.\n");
+ }
+
+ check_all_ports_link_status(nb_ports, l2fwd_enabled_port_mask);
+
+ drain_tsc = (hz + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_spinlock_init(&qconf->lock);
+
+ if (rte_headroom_init(&qconf->headroom) != 0)
+ rte_panic("Headroom for core %u init failed\n", lcore_id);
+
+ if (qconf->n_rx_port == 0) {
+ RTE_LOG(INFO, L2FWD,
+ "lcore %u: no ports so no headroom initialization\n",
+ lcore_id);
+ continue;
+ }
+ /* Add flush job.
+ * Set fixed period by setting min = max = initial period. Set target to
+ * zero as it is irrelevant for this job. */
+ rte_headroom_job_init(&qconf->flush_job, "flush", drain_tsc, drain_tsc,
+ drain_tsc, 0);
+
+ rte_timer_init(&qconf->flush_timer);
+ ret = rte_timer_reset(&qconf->flush_timer, drain_tsc, PERIODICAL, lcore_id,
+ &l2fwd_flush_job, NULL);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to add flush job for lcore %u: %s",
+ lcore_id, rte_strerror(-ret));
+ }
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+ struct rte_headroom_job *job = &qconf->port_fwd_jobs[i];
+
+ portid = qconf->rx_port_list[i];
+ printf("Setting forward job for port %u\n", portid);
+
+ snprintf(name, RTE_DIM(name), "port %u fwd", portid);
+ /* Setup forward job.
+ * Set min, max and initial period. Set target to MAX_PKT_BURST as
+ * this is desired optimal RX/TX burst size. */
+ rte_headroom_job_init(job, name, 0, drain_tsc, 0, MAX_PKT_BURST);
+ rte_headroom_set_update_period_function(job, l2fwd_job_update_cb);
+
+ rte_timer_init(&qconf->rx_timers[i]);
+ rte_timer_reset(&qconf->rx_timers[i], 0, PERIODICAL, lcore_id,
+ &l2fwd_fwd_job, (void *)(uintptr_t)i);
+ }
+ }
+
+ if (timer_period)
+ rte_eal_alarm_set(timer_period * US_PER_S, show_stats_cb, NULL);
+ else
+ RTE_LOG(INFO, L2FWD, "Stats display disabled\n");
+
+ /* launch per-lcore init on every lcore */
+ rte_eal_mp_remote_launch(l2fwd_launch_one_lcore, NULL, CALL_MASTER);
+ RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+ if (rte_eal_wait_lcore(lcore_id) < 0)
+ return -1;
+ }
+
+ return 0;
+}
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 334cb25..3db7222 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -103,6 +103,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_HASH),y)
LDLIBS += -lrte_hash
endif
+ifeq ($(CONFIG_RTE_LIBRTE_HEADROOM),y)
+LDLIBS += -lrte_headroom
+endif
+
ifeq ($(CONFIG_RTE_LIBRTE_LPM),y)
LDLIBS += -lrte_lpm
endif
--
1.9.1
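Reviewer note on the period adaptation done by l2fwd_job_update_cb() in the
patch above: the callback compares the number of packets received with the
job target and moves the polling period only when the error leaves a band of
target / 8. The sketch below reproduces that logic in standalone form so it
can be tried without DPDK. struct job_period is a hypothetical stand-in for
the period-related fields of struct rte_headroom_job, and its field types are
assumptions rather than the library's definition.

```c
#include <stdint.h>

#define STEP_UP   1   /* lengthen the period slowly */
#define STEP_DOWN 32  /* shorten it quickly when bursts fill up */

/* Hypothetical stand-in for the period fields of struct rte_headroom_job. */
struct job_period {
	uint64_t period, min_period, max_period;
	int64_t target;
};

/* Adjust the polling period from the result of the last run. */
static void
update_period(struct job_period *job, int64_t result)
{
	int64_t err = job->target - result;
	int64_t hysteresis = job->target / 8;

	if (err < -hysteresis) {
		/* More packets than the target: poll more often. */
		if (job->min_period + STEP_DOWN < job->period)
			job->period -= STEP_DOWN;
	} else if (err > hysteresis) {
		/* Fewer packets than the target: poll less often. */
		if (job->period + STEP_UP < job->max_period)
			job->period += STEP_UP;
	}
}
```

With a target of 32 the band is +/-4 packets, so a result of 30 leaves the
period unchanged, a result of 64 shortens it by 32 cycles, and a result of 0
lengthens it by 1 cycle. The asymmetric steps make the job react quickly to
load and back off slowly.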
* [dpdk-dev] [PATCH v5 3/3] MAINTAINERS: claim responsibility for headroom library and example app
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Pawel Wodkowski
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 1/3] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 2/3] examples: introduce new l2fwd-headroom example Pawel Wodkowski
@ 2015-02-19 12:18 ` Pawel Wodkowski
2015-02-19 14:33 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Neil Horman
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats " Pawel Wodkowski
4 siblings, 0 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-19 12:18 UTC (permalink / raw)
To: dev
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
MAINTAINERS | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index a771fa3..782b585 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -362,6 +362,10 @@ F: app/test/test_timer*
F: examples/timer/
F: doc/guides/sample_app_ug/timer.rst
+Headroom
+M: Pawel Wodkowski <pawelx.wodkowski@intel.com>
+F: lib/librte_headroom/
+F: examples/l2fwd-headroom/
Test Applications
-----------------
--
1.9.1
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Pawel Wodkowski
` (2 preceding siblings ...)
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 3/3] MAINTAINERS: claim responsibility for headroom library and example app Pawel Wodkowski
@ 2015-02-19 14:33 ` Neil Horman
2015-02-20 15:46 ` Jastrzebski, MichalX K
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats " Pawel Wodkowski
4 siblings, 1 reply; 48+ messages in thread
From: Neil Horman @ 2015-02-19 14:33 UTC (permalink / raw)
To: Pawel Wodkowski; +Cc: dev
On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> Hi community,
> I would like to introduce library for measuring load of some arbitrary jobs. It
> can be used to profile every kind of job sets on any arbitrary execution unit or
> tasking library.
>
> In provided l2fwd-headroom example I demonstrate how to use this library to
> select optimal rx burst poll time. Jobs are selected by using existing rte_timer
> library calls. This example does no limit possible schemes on which this library
> can be used.
>
> PATCH v5 changes:
> - Fix spelling and checkpatch.pl errors.
> - Add maintainer claim for library and example app.
>
> PATCH v4 changes:
> - use proper branch for generating patch.
>
> PATCH v3 changes:
> - Fix spelling.
>
> PATCH v2 changes:
> - Remove jobs management/callback from library to not duplicate tasking library
> behaviour.
> - Cleenup/remove useless statistics.
> - Rework example application to use rte_timer library for jobs selection.
> - Introduce new app parameter '-l' for automatic thousands separating in stats.
> - More readable statistics format.
>
>
>
> Pawel Wodkowski (3):
> librte_headroom: New library for checking core/system/app load
> examples: introduce new l2fwd-headroom example
> MAINTAINERS: claim responsibility for headroom library and example app
>
> MAINTAINERS | 4 +
> config/common_bsdapp | 5 +
> config/common_linuxapp | 5 +
> examples/Makefile | 1 +
> examples/l2fwd-headroom/Makefile | 51 ++
> examples/l2fwd-headroom/main.c | 1040 ++++++++++++++++++++++++++
> lib/Makefile | 1 +
> lib/librte_headroom/Makefile | 54 ++
> lib/librte_headroom/rte_headroom.c | 271 +++++++
> lib/librte_headroom/rte_headroom.h | 324 ++++++++
> lib/librte_headroom/rte_headroom_version.map | 19 +
> mk/rte.app.mk | 4 +
> 12 files changed, 1779 insertions(+)
> create mode 100644 examples/l2fwd-headroom/Makefile
> create mode 100644 examples/l2fwd-headroom/main.c
> create mode 100644 lib/librte_headroom/Makefile
> create mode 100644 lib/librte_headroom/rte_headroom.c
> create mode 100644 lib/librte_headroom/rte_headroom.h
> create mode 100644 lib/librte_headroom/rte_headroom_version.map
>
> --
> 1.9.1
>
>
I'm sorry but I still fail to see how this is a particularly useful library. It
clearly works fine, but it composes an application event loop in its own terms,
and measures stats based on that. While that's ok, any application is already
going to have to write its own event loop, and can make the same measurements
synchronously within that loop, using a lot less code to optimize its polling time.
In other words, I think this is one of those cases where this library is
probably somewhat useful for anyone who just wants to write an application in
terms of the semantics exposed by this library, but not at all useful for anyone
else. I'd personally rather not have the extra code to maintain here.
Stephen just gave a presentation at netdev about some of the performance
optimization measurements Brocade did with DPDK and how they fine tuned their
environment. One of the big takeaways for me was that making time based
measurements (especially if it was using the tsc), created cpu stalls that
skewed the measurements, and so the best optimizations they made avoided time
measurements, opting instead for packet count metrics.
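To make the packet-count idea concrete, here is a minimal sketch of a timer-free load estimate that an application could keep inside its own poll loop. It is an illustration only: the struct, the function names and BURST_SIZE are hypothetical and are not part of DPDK or of the proposed library.

```c
#include <assert.h>
#include <stdint.h>

#define BURST_SIZE 32u /* hypothetical maximum packets returned per poll */

/* Hypothetical per-loop counters; not part of any DPDK API. */
struct poll_stats {
	uint64_t polls;       /* total RX polls */
	uint64_t empty_polls; /* polls that returned no packets */
	uint64_t full_polls;  /* polls that returned a full burst */
	uint64_t packets;     /* packets received overall */
};

/* Record the result of one RX poll; nb_rx is the packet count it returned. */
static void poll_stats_update(struct poll_stats *s, unsigned int nb_rx)
{
	assert(nb_rx <= BURST_SIZE);
	s->polls++;
	s->packets += nb_rx;
	if (nb_rx == 0)
		s->empty_polls++;
	else if (nb_rx == BURST_SIZE)
		s->full_polls++;
}

/* Percentage of polls that found work: a load estimate without timestamps. */
static unsigned int poll_stats_busy_percent(const struct poll_stats *s)
{
	if (s->polls == 0)
		return 0;
	return (unsigned int)(((s->polls - s->empty_polls) * 100) / s->polls);
}
```

An application would call poll_stats_update() with the value returned by its RX burst call on every iteration. A high share of full bursts suggests the core is saturated, while a high share of empty polls suggests spare capacity.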
Neil
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-19 14:33 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Neil Horman
@ 2015-02-20 15:46 ` Jastrzebski, MichalX K
2015-02-23 11:45 ` Thomas Monjalon
0 siblings, 1 reply; 48+ messages in thread
From: Jastrzebski, MichalX K @ 2015-02-20 15:46 UTC (permalink / raw)
To: Neil Horman, Wodkowski, PawelX; +Cc: dev
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> Sent: Thursday, February 19, 2015 3:34 PM
> To: Wodkowski, PawelX
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and
> example application
>
> On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > Hi community,
> > I would like to introduce library for measuring load of some arbitrary jobs.
> It
> > can be used to profile every kind of job sets on any arbitrary execution unit
> or
> > tasking library.
> >
> > In provided l2fwd-headroom example I demonstrate how to use this library
> to
> > select optimal rx burst poll time. Jobs are selected by using existing
> rte_timer
> > library calls. This example does no limit possible schemes on which this
> library
> > can be used.
> >
> > PATCH v5 changes:
> > - Fix spelling and checkpatch.pl errors.
> > - Add maintainer claim for library and example app.
> >
> > PATCH v4 changes:
> > - use proper branch for generating patch.
> >
> > PATCH v3 changes:
> > - Fix spelling.
> >
> > PATCH v2 changes:
> > - Remove jobs management/callback from library to not duplicate tasking
> library
> > behaviour.
> > - Cleenup/remove useless statistics.
> > - Rework example application to use rte_timer library for jobs selection.
> > - Introduce new app parameter '-l' for automatic thousands separating in
> stats.
> > - More readable statistics format.
> >
> >
> >
> > Pawel Wodkowski (3):
> > librte_headroom: New library for checking core/system/app load
> > examples: introduce new l2fwd-headroom example
> > MAINTAINERS: claim responsibility for headroom library and example app
> >
> > MAINTAINERS | 4 +
> > config/common_bsdapp | 5 +
> > config/common_linuxapp | 5 +
> > examples/Makefile | 1 +
> > examples/l2fwd-headroom/Makefile | 51 ++
> > examples/l2fwd-headroom/main.c | 1040
> ++++++++++++++++++++++++++
> > lib/Makefile | 1 +
> > lib/librte_headroom/Makefile | 54 ++
> > lib/librte_headroom/rte_headroom.c | 271 +++++++
> > lib/librte_headroom/rte_headroom.h | 324 ++++++++
> > lib/librte_headroom/rte_headroom_version.map | 19 +
> > mk/rte.app.mk | 4 +
> > 12 files changed, 1779 insertions(+)
> > create mode 100644 examples/l2fwd-headroom/Makefile
> > create mode 100644 examples/l2fwd-headroom/main.c
> > create mode 100644 lib/librte_headroom/Makefile
> > create mode 100644 lib/librte_headroom/rte_headroom.c
> > create mode 100644 lib/librte_headroom/rte_headroom.h
> > create mode 100644 lib/librte_headroom/rte_headroom_version.map
> >
> > --
> > 1.9.1
> >
> >
> I'm sorry but I still fail to see how this is a particularly useful library. It
> clearly works fine, but it composes an application event loop in its own
> terms,
> and measures stats based on that. While thats ok, any application is already
> going to have to write its own event loop, and can makethe same
> measurements
> synchnously within that loop, using alot less code to optimize its polling time.
>
> In other words, I think this is one of those cases where this library is
> probably somewhat useful for anyone who just wants to write an application
> in
> terms the semantics exposed by this library, but not at all useful for anyone
> else. I'd personally rather not have the extra code to maintain here.
>
> Stephen just gave a presentation at netdev about some of the performance
> optimization measurements Brocade did with DPDK and how they fine tuned
> their
> environment. One of the big take aways for me was that making time based
> measurements (especially if it was using the tsc), created cpu stalls that
> skewed the measurements, and so the best optimizations they made avoided
> time
> measurements, opting instead for packet count metrics.
>
> Neil
Hi Neil,
I think this library offers something quite useful, probably not for everyone
but for many people who use DPDK: it measures quite accurately
how many spare cycles a CPU has after executing any serial tasks (as you will know).
If you look at two places in the example application, the main_loop()
and l2fwd_fwd() functions, you will see two possible approaches, but
the library is not limited to those. You can even nest headroom objects and measure
the processing time of particular packet types.
Of course, this will add overhead due to the measurements,
but that time is also measured, so any user can know the relative
time "wasted" on measuring all this.
If time delays are measured over larger intervals and handled reliably,
the cost of measuring will be low.
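As a rough illustration of accounting for both job time and measurement overhead, the sketch below uses an injectable cycle counter. None of these names come from rte_headroom; the struct, the functions and the fake counter are hypothetical, and the overhead figure is only an approximation based on two back-to-back counter reads.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t (*cycles_fn)(void); /* hypothetical hook, e.g. a TSC reader */

struct load_stats {
	uint64_t busy;     /* cycles spent inside measured jobs */
	uint64_t overhead; /* estimated cycles spent reading the counter */
	uint64_t start;    /* counter value when the current job began */
};

/* Deterministic stand-in for a real cycle counter: advances 10 per read. */
static uint64_t fake_cycles;
static uint64_t fake_now(void)
{
	fake_cycles += 10;
	return fake_cycles;
}

/* Estimate the cost of one counter read from two back-to-back reads. */
static uint64_t calibrate_overhead(cycles_fn now)
{
	uint64_t t0 = now();
	uint64_t t1 = now();
	return t1 - t0;
}

static void job_begin(struct load_stats *s, cycles_fn now)
{
	s->start = now();
}

static void job_end(struct load_stats *s, cycles_fn now, uint64_t read_cost)
{
	uint64_t end = now();
	assert(end >= s->start);
	s->busy += end - s->start;
	s->overhead += 2 * read_cost; /* one read in begin, one in end */
}

/* Spare share of a period, in percent, after jobs and measurement cost. */
static unsigned int headroom_percent(const struct load_stats *s, uint64_t period)
{
	uint64_t used = s->busy + s->overhead;
	if (period == 0 || used >= period)
		return 0;
	return (unsigned int)(((period - used) * 100) / period);
}
```

In a real loop the fake counter would be replaced by an actual cycle reader, and calling job_begin()/job_end() once per burst rather than once per packet keeps the measured overhead small relative to the work.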
I find this quite similar to the power library case. I would say that library is not useful
for every application, but there are several cases where it can be
(as demonstrated with l3fwd-power app).
About your last bit, not sure if I understood it right, but in case of the included sample app,
the main measurement to see if we are overusing a CPU is the packet count
in a queue (in this case RX queue), and I believe this should be used for other apps,
especially in those that use a pipeline model, where queues and rings are the key part.
As a final point, last week (12th of February), there was a request for a tool/library like this
from a user on the mailing list (Ilan Borenshtein), which indicates that this would be useful
(probably not just for him, but for others). It probably could be achieved by the user
by adding their own code, but I believe this library would be good to have
in case a user is looking for an easy way to calculate the above.
Let us give the users an example of this method, and we will expand it with a more
advanced application that may show the capabilities of dynamic load scaling based on headroom library measurements.
Best regards
Michal
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-20 15:46 ` Jastrzebski, MichalX K
@ 2015-02-23 11:45 ` Thomas Monjalon
2015-02-23 14:36 ` Jastrzebski, MichalX K
0 siblings, 1 reply; 48+ messages in thread
From: Thomas Monjalon @ 2015-02-23 11:45 UTC (permalink / raw)
To: Wodkowski, PawelX; +Cc: dev
2015-02-20 15:46, Jastrzebski, MichalX K:
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > Hi community,
> > > I would like to introduce library for measuring load of some arbitrary jobs.
> > > It can be used to profile every kind of job sets on any arbitrary execution unit
> > > or tasking library.
> > >
> > > In provided l2fwd-headroom example I demonstrate how to use this library to
> > > select optimal rx burst poll time. Jobs are selected by using existing rte_timer
> > > library calls. This example does no limit possible schemes on which this
> > > library can be used.
> > >
> > > Pawel Wodkowski (3):
> > > librte_headroom: New library for checking core/system/app load
> > > examples: introduce new l2fwd-headroom example
> > > MAINTAINERS: claim responsibility for headroom library and example app
> >
> > I'm sorry but I still fail to see how this is a particularly useful library. It
> > clearly works fine, but it composes an application event loop in its own
> > terms,
> > and measures stats based on that. While thats ok, any application is already
> > going to have to write its own event loop, and can makethe same
> > measurements
> > synchnously within that loop, using alot less code to optimize its polling time.
> >
> > In other words, I think this is one of those cases where this library is
> > probably somewhat useful for anyone who just wants to write an application
> > in
> > terms the semantics exposed by this library, but not at all useful for anyone
> > else. I'd personally rather not have the extra code to maintain here.
> >
> > Stephen just gave a presentation at netdev about some of the performance
> > optimization measurements Brocade did with DPDK and how they fine tuned
> > their
> > environment. One of the big take aways for me was that making time based
> > measurements (especially if it was using the tsc), created cpu stalls that
> > skewed the measurements, and so the best optimizations they made avoided
> > time
> > measurements, opting instead for packet count metrics.
> >
> > Neil
>
> Hi Neil,
>
> I think this library offers something quite useful probably not for everyone,
> but for many people that use DPDK, and it is measuring quite accurately,
> how many spare cycles a CPU have after executing any serial tasks (as you will know).
> If you look at two places in example application: main_loop()
> and l2fwd_fwd() functions, you will see two possible approach there, but
> this is not limited to that. You can even nest headroom objects and measure
> process time of particular packets type.
> Of course, this will add an overhead due to the measurements,
> but that time is also measured, so any user can know what is the relative
> time "wasted" for measuring all this.
> If time delays are measured in bigger timestamps, are handled reliably,
> the cost of measuring will be low.
> I find this quite similar to the power library case. I would say that library is not useful
> for every application, but there are several cases where it can be
> (as demonstrated with l3fwd-power app).
>
> About your last bit, not sure if I understood it right, but in case of the included sample app,
> the main measurement to see if we are overusing a CPU is the packet count
> in a queue (in this case RX queue), and I believe this should be used for other apps,
> especially in those that use a pipeline model, where queues and rings are the key part.
>
> As a final point, last week (12th of February), there was a request for a tool/library like this
> from a user in the mailing list (Ilan Borenshtein), which indicates that this would be useful
> (probably not just for him, but for others). It probably could be achieved by the user
> by adding their own code, but I believe this library would be a good-to-have,
> in case a user is looking for an easy way to calculate the exposed above.
> Let us give the users an example of this method and we will expand it with more
> advanced application that may show capabilities of dynamic load scaling based on headroom library measurement.
I wonder how this library is related to DPDK.
I'm not against its integration, though the question must be asked.
DPDK is a set of libraries. What kind of library fits DPDK's goals and
deserves to be integrated?
I don't know whether it's related but nobody acknowledged this patchset.
I also feel that the name of this library is a bit too vague. Some people
were first asking what "headroom" means. It's actually for CPU headroom monitoring.
What about "cpuheadroom", "cpuheadroomstat", "jobstat"?
Last comment, less important: like many of your colleagues, you don't pay
attention to the copyright dates. I'm pretty sure this code was not written
in 2010. So why claim it?
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-23 11:45 ` Thomas Monjalon
@ 2015-02-23 14:36 ` Jastrzebski, MichalX K
2015-02-23 14:46 ` Thomas Monjalon
0 siblings, 1 reply; 48+ messages in thread
From: Jastrzebski, MichalX K @ 2015-02-23 14:36 UTC (permalink / raw)
To: Thomas Monjalon, Wodkowski, PawelX; +Cc: dev
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, February 23, 2015 12:46 PM
> To: Wodkowski, PawelX
> Cc: dev@dpdk.org; Jastrzebski, MichalX K; Neil Horman
> Subject: Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and
> example application
>
> 2015-02-20 15:46, Jastrzebski, MichalX K:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> > > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > > Hi community,
> > > > I would like to introduce library for measuring load of some arbitrary
> jobs.
> > > > It can be used to profile every kind of job sets on any arbitrary execution
> unit
> > > > or tasking library.
> > > >
> > > > In provided l2fwd-headroom example I demonstrate how to use this
> library to
> > > > select optimal rx burst poll time. Jobs are selected by using existing
> rte_timer
> > > > library calls. This example does no limit possible schemes on which this
> > > > library can be used.
> > > >
> > > > Pawel Wodkowski (3):
> > > > librte_headroom: New library for checking core/system/app load
> > > > examples: introduce new l2fwd-headroom example
> > > > MAINTAINERS: claim responsibility for headroom library and example
> app
> > >
> > > I'm sorry but I still fail to see how this is a particularly useful library. It
> > > clearly works fine, but it composes an application event loop in its own
> > > terms,
> > > and measures stats based on that. While thats ok, any application is
> already
> > > going to have to write its own event loop, and can makethe same
> > > measurements
> > > synchnously within that loop, using alot less code to optimize its polling
> time.
> > >
> > > In other words, I think this is one of those cases where this library is
> > > probably somewhat useful for anyone who just wants to write an
> application
> > > in
> > > terms the semantics exposed by this library, but not at all useful for
> anyone
> > > else. I'd personally rather not have the extra code to maintain here.
> > >
> > > Stephen just gave a presentation at netdev about some of the
> performance
> > > optimization measurements Brocade did with DPDK and how they fine
> tuned
> > > their
> > > environment. One of the big take aways for me was that making time
> based
> > > measurements (especially if it was using the tsc), created cpu stalls that
> > > skewed the measurements, and so the best optimizations they made
> avoided
> > > time
> > > measurements, opting instead for packet count metrics.
> > >
> > > Neil
> >
> > Hi Neil,
> >
> > I think this library offers something quite useful probably not for everyone,
> > but for many people that use DPDK, and it is measuring quite accurately,
> > how many spare cycles a CPU have after executing any serial tasks (as you
> will know).
> > If you look at two places in example application: main_loop()
> > and l2fwd_fwd() functions, you will see two possible approach there, but
> > this is not limited to that. You can even nest headroom objects and
> measure
> > process time of particular packets type.
> > Of course, this will add an overhead due to the measurements,
> > but that time is also measured, so any user can know what is the relative
> > time "wasted" for measuring all this.
> > If time delays are measured in bigger timestamps, are handled reliably,
> > the cost of measuring will be low.
> > I find this quite similar to the power library case. I would say that library is
> not useful
> > for every application, but there are several cases where it can be
> > (as demonstrated with l3fwd-power app).
> >
> > About your last bit, not sure if I understood it right, but in case of the
> included sample app,
> > the main measurement to see if we are overusing a CPU is the packet count
> > in a queue (in this case RX queue), and I believe this should be used for
> other apps,
> > especially in those that use a pipeline model, where queues and rings are
> the key part.
> >
> > As a final point, last week (12th of February), there was a request for a
> tool/library like this
> > from a user in the mailing list (Ilan Borenshtein), which indicates that this
> would be useful
> > (probably not just for him, but for others). It probably could be achieved by
> the user
> > by adding their own code, but I believe this library would be a good-to-
> have,
> > in case a user is looking for an easy way to calculate the exposed above.
> > Let us give the users an example of this method and we will expand it with
> more
> > advanced application that may show capabilities of dynamic load scaling
> based on headroom library measurement.
>
Hi Thomas,
> I wonder how this library is related to DPDK.
> I'm not against its integration, though the question must be asked.
> DPDK is a set of libraries. What kind of library fit with DPDK goals and
> deserve to be integrated?
>
I think this library fits DPDK's goals, because it is simple and optimized for fast packet processing.
The library provides an easy way for existing DPDK users to modify their applications to measure available CPU headroom.
If we integrate it, users will verify it and decide what else they need and we will implement this.
> I don't know whether it's related but nobody acknowledged this patchset.
We were waiting for Neil's final comments. He did not mention anything else since the last time.
When Pawel sends the next version with the copyright date corrected, Pablo will ack this.
>
> I also feel that the name of this library is a bit too vague. Some people
> were asking first what means "headroom". It's actually for CPU headroom
> monitoring.
> What about "cpuheadroom", "cpuheadroomstat", "jobstat"?
I think we can change the name to "cpuheadroom" as it describes this library more clearly.
>
> Last comment, less important: as many of your colleagues, you don't pay
> attention to the copyright dates. I'm pretty sure this code was not written
> in 2010. So why claiming it?
Sorry, we will fix this of course and pay more attention now.
Best regards
Michal
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-23 14:36 ` Jastrzebski, MichalX K
@ 2015-02-23 14:46 ` Thomas Monjalon
2015-02-23 15:55 ` Jastrzebski, MichalX K
2015-02-24 9:49 ` Jastrzebski, MichalX K
0 siblings, 2 replies; 48+ messages in thread
From: Thomas Monjalon @ 2015-02-23 14:46 UTC (permalink / raw)
To: Jastrzebski, MichalX K; +Cc: dev
2015-02-23 14:36, Jastrzebski, MichalX K:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2015-02-20 15:46, Jastrzebski, MichalX K:
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> > > > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > > > Hi community,
> > > > > I would like to introduce library for measuring load of some arbitrary
> > jobs.
> > > > > It can be used to profile every kind of job sets on any arbitrary execution
> > unit
> > > > > or tasking library.
> > > > >
> > > > > In provided l2fwd-headroom example I demonstrate how to use this
> > library to
> > > > > select optimal rx burst poll time. Jobs are selected by using existing
> > rte_timer
> > > > > library calls. This example does no limit possible schemes on which this
> > > > > library can be used.
> > > > >
> > > > > Pawel Wodkowski (3):
> > > > > librte_headroom: New library for checking core/system/app load
> > > > > examples: introduce new l2fwd-headroom example
> > > > > MAINTAINERS: claim responsibility for headroom library and example
> > app
> > > >
> > > > I'm sorry but I still fail to see how this is a particularly useful library. It
> > > > clearly works fine, but it composes an application event loop in its own
> > > > terms,
> > > > and measures stats based on that. While thats ok, any application is
> > already
> > > > going to have to write its own event loop, and can makethe same
> > > > measurements
> > > > synchnously within that loop, using alot less code to optimize its polling
> > time.
> > > >
> > > > In other words, I think this is one of those cases where this library is
> > > > probably somewhat useful for anyone who just wants to write an
> > application
> > > > in
> > > > terms the semantics exposed by this library, but not at all useful for
> > anyone
> > > > else. I'd personally rather not have the extra code to maintain here.
> > > >
> > > > Stephen just gave a presentation at netdev about some of the
> > performance
> > > > optimization measurements Brocade did with DPDK and how they fine
> > tuned
> > > > their
> > > > environment. One of the big take aways for me was that making time
> > based
> > > > measurements (especially if it was using the tsc), created cpu stalls that
> > > > skewed the measurements, and so the best optimizations they made
> > avoided
> > > > time
> > > > measurements, opting instead for packet count metrics.
> > > >
> > > > Neil
> > >
> > > Hi Neil,
> > >
> > > I think this library offers something quite useful probably not for everyone,
> > > but for many people that use DPDK, and it is measuring quite accurately,
> > > how many spare cycles a CPU have after executing any serial tasks (as you
> > will know).
> > > If you look at two places in example application: main_loop()
> > > and l2fwd_fwd() functions, you will see two possible approach there, but
> > > this is not limited to that. You can even nest headroom objects and
> > measure
> > > process time of particular packets type.
> > > Of course, this will add an overhead due to the measurements,
> > > but that time is also measured, so any user can know what is the relative
> > > time "wasted" for measuring all this.
> > > If time delays are measured in bigger timestamps, are handled reliably,
> > > the cost of measuring will be low.
> > > I find this quite similar to the power library case. I would say that library is
> > not useful
> > > for every application, but there are several cases where it can be
> > > (as demonstrated with l3fwd-power app).
> > >
> > > About your last bit, not sure if I understood it right, but in case of the
> > included sample app,
> > > the main measurement to see if we are overusing a CPU is the packet count
> > > in a queue (in this case RX queue), and I believe this should be used for
> > other apps,
> > > especially in those that use a pipeline model, where queues and rings are
> > the key part.
> > >
> > > As a final point, last week (12th of February), there was a request for a
> > tool/library like this
> > > from a user in the mailing list (Ilan Borenshtein), which indicates that this
> > would be useful
> > > (probably not just for him, but for others). It probably could be achieved by
> > the user
> > > by adding their own code, but I believe this library would be a good-to-
> > have,
> > > in case a user is looking for an easy way to calculate the exposed above.
> > > Let us give the users an example of this method and we will expand it with
> > more
> > > advanced application that may show capabilities of dynamic load scaling
> > based on headroom library measurement.
> >
> Hi Thomas,
>
> > I wonder how this library is related to DPDK.
> > I'm not against its integration, though the question must be asked.
> > DPDK is a set of libraries. What kind of library fit with DPDK goals and
> > deserve to be integrated?
> >
> I think this library fits into dpdk goals, because it is simple and optimized for fast packet processing.
I don't have a strong opinion here. If anyone else wants to comment, please speak now.
> The library provides an easy way for existing DPDK users to modify their applications to measure available CPU headroom.
> If we integrate it, users will verify it and decide what else they need and we will implement this.
Do you mean that you plan to add some features to this library?
Is it going to stay limited to providing some stats, or could it take some actions,
such as time-sharing helpers?
> > I don't know whether it's related but nobody acknowledged this patchset.
> We were waiting for Neil's final comments. He did not mention anything else since the last time.
> When Pawel sends the next version with the copyright date corrected, Pablo will ack this.
>
> > I also feel that the name of this library is a bit too vague. Some people
> > were asking first what means "headroom". It's actually for CPU headroom
> > monitoring.
> > What about "cpuheadroom", "cpuheadroomstat", "jobstat"?
>
> I think we can change the name to "cpuheadroom" as it describes more clear this library.
If you're focusing on CPU usage with possible actions, yes.
If you're focusing on a decision helper, jobstat would be better IMHO.
> > Last comment, less important: as many of your colleagues, you don't pay
> > attention to the copyright dates. I'm pretty sure this code was not written
> > in 2010. So why claiming it?
> Sorry, we will fix this of course and pay more attention now.
OK, thanks
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-23 14:46 ` Thomas Monjalon
@ 2015-02-23 15:55 ` Jastrzebski, MichalX K
2015-02-23 16:04 ` Thomas Monjalon
2015-02-24 9:49 ` Jastrzebski, MichalX K
1 sibling, 1 reply; 48+ messages in thread
From: Jastrzebski, MichalX K @ 2015-02-23 15:55 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, February 23, 2015 3:47 PM
> To: Jastrzebski, MichalX K
> Cc: Wodkowski, PawelX; dev@dpdk.org; Neil Horman
> Subject: Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and
> example application
>
> 2015-02-23 14:36, Jastrzebski, MichalX K:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > 2015-02-20 15:46, Jastrzebski, MichalX K:
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> > > > > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > > > > Hi community,
> > > > > > I would like to introduce library for measuring load of some arbitrary
> > > jobs.
> > > > > > It can be used to profile every kind of job sets on any arbitrary
> execution
> > > unit
> > > > > > or tasking library.
> > > > > >
> > > > > > In provided l2fwd-headroom example I demonstrate how to use this
> > > library to
> > > > > > select optimal rx burst poll time. Jobs are selected by using existing
> > > rte_timer
> > > > > > library calls. This example does no limit possible schemes on which
> this
> > > > > > library can be used.
> > > > > >
> > > > > > Pawel Wodkowski (3):
> > > > > > librte_headroom: New library for checking core/system/app load
> > > > > > examples: introduce new l2fwd-headroom example
> > > > > > MAINTAINERS: claim responsibility for headroom library and
> example
> > > app
> > > > >
> > > > > I'm sorry but I still fail to see how this is a particularly useful library. It
> > > > > clearly works fine, but it composes an application event loop in its
> own
> > > > > terms,
> > > > > and measures stats based on that. While thats ok, any application is
> > > already
> > > > > going to have to write its own event loop, and can makethe same
> > > > > measurements
> > > > > synchnously within that loop, using alot less code to optimize its
> polling
> > > time.
> > > > >
> > > > > In other words, I think this is one of those cases where this library is
> > > > > probably somewhat useful for anyone who just wants to write an
> > > application
> > > > > in
> > > > > terms the semantics exposed by this library, but not at all useful for
> > > anyone
> > > > > else. I'd personally rather not have the extra code to maintain here.
> > > > >
> > > > > Stephen just gave a presentation at netdev about some of the
> > > performance
> > > > > optimization measurements Brocade did with DPDK and how they fine
> > > tuned
> > > > > their
> > > > > environment. One of the big take aways for me was that making time
> > > based
> > > > > measurements (especially if it was using the tsc), created cpu stalls
> that
> > > > > skewed the measurements, and so the best optimizations they made
> > > avoided
> > > > > time
> > > > > measurements, opting instead for packet count metrics.
> > > > >
> > > > > Neil
> > > >
> > > > Hi Neil,
> > > >
> > > > I think this library offers something quite useful probably not for
> everyone,
> > > > but for many people that use DPDK, and it is measuring quite
> accurately,
> > > > how many spare cycles a CPU have after executing any serial tasks (as
> you
> > > will know).
> > > > If you look at two places in example application: main_loop()
> > > > and l2fwd_fwd() functions, you will see two possible approach there,
> but
> > > > this is not limited to that. You can even nest headroom objects and
> > > measure
> > > > process time of particular packets type.
> > > > Of course, this will add an overhead due to the measurements,
> > > > but that time is also measured, so any user can know what is the
> relative
> > > > time "wasted" for measuring all this.
> > > > If time delays are measured in bigger timestamps, are handled reliably,
> > > > the cost of measuring will be low.
> > > > I find this quite similar to the power library case. I would say that library
> is
> > > not useful
> > > > for every application, but there are several cases where it can be
> > > > (as demonstrated with l3fwd-power app).
> > > >
> > > > About your last bit, not sure if I understood it right, but in case of the
> > > included sample app,
> > > > the main measurement to see if we are overusing a CPU is the packet
> count
> > > > in a queue (in this case RX queue), and I believe this should be used for
> > > other apps,
> > > > especially in those that use a pipeline model, where queues and rings
> are
> > > the key part.
> > > >
> > > > As a final point, last week (12th of February), there was a request for a
> > > tool/library like this
> > > > from a user in the mailing list (Ilan Borenshtein), which indicates that
> this
> > > would be useful
> > > > (probably not just for him, but for others). It probably could be achieved
> by
> > > the user
> > > > by adding their own code, but I believe this library would be a good-to-
> > > have,
> > > > in case a user is looking for an easy way to calculate the exposed above.
> > > > Let us give the users an example of this method and we will expand it
> with
> > > more
> > > > advanced application that may show capabilities of dynamic load
> scaling
> > > based on headroom library measurement.
> > >
> > Hi Thomas,
> >
> > > I wonder how this library is related to DPDK.
> > > I'm not against its integration, though the question must be asked.
> > > DPDK is a set of libraries. What kind of library fit with DPDK goals and
> > > deserve to be integrated?
> > >
> > I think this library fits into dpdk goals, because it is simple and optimized
> for fast packet processing.
>
> I don't have a strong opinion here. If anyone else wants to comment, please
> speak now.
>
> > The library provides an easy way for existing DPDK users to modify their
> applications to measure available CPU headroom.
> > If we integrate it, users will verify it and decide what else they need and we
> will implement this.
>
> Do you mean that you plan to add some features to this library?
> Is it going to stay at providing some stats or could you make some actions
> like time-sharing helpers?
What do you mean by time-sharing here?
>
> > > I don't know whether it's related but nobody acknowledged this patchset.
> > We were waiting for Neil's final comments. He did not mention anything
> else since the last time.
> > When Pawel sends the next version with the copyright date corrected,
> Pablo will ack this.
> >
> > > I also feel that the name of this library is a bit too vague. Some people
> > > were asking first what means "headroom". It's actually for CPU headroom
> > > monitoring.
> > > What about "cpuheadroom", "cpuheadroomstat", "jobstat"?
> >
> > I think we can change the name to "cpuheadroom" as it describes more
> clear this library.
>
> If you're focusing on CPU usage with possible actions, yes.
> If you're focusing on decision helper, jobstat would be better IMHO.
>
> > > Last comment, less important: as many of your colleagues, you don't pay
> > > attention to the copyright dates. I'm pretty sure this code was not written
> > > in 2010. So why claiming it?
> > Sorry, we will fix this of course and pay more attention now.
>
> OK, thanks
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-23 15:55 ` Jastrzebski, MichalX K
@ 2015-02-23 16:04 ` Thomas Monjalon
2015-02-24 8:44 ` Pawel Wodkowski
0 siblings, 1 reply; 48+ messages in thread
From: Thomas Monjalon @ 2015-02-23 16:04 UTC (permalink / raw)
To: Jastrzebski, MichalX K; +Cc: dev
2015-02-23 15:55, Jastrzebski, MichalX K:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2015-02-23 14:36, Jastrzebski, MichalX K:
> > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > 2015-02-20 15:46, Jastrzebski, MichalX K:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> > > > > > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > > > > > Hi community,
> > > > > > > I would like to introduce library for measuring load of some arbitrary
> > > > jobs.
> > > > > > > It can be used to profile every kind of job sets on any arbitrary
> > execution
> > > > unit
> > > > > > > or tasking library.
> > > > > > >
> > > > > > > In provided l2fwd-headroom example I demonstrate how to use this
> > > > library to
> > > > > > > select optimal rx burst poll time. Jobs are selected by using existing
> > > > rte_timer
> > > > > > > library calls. This example does no limit possible schemes on which
> > this
> > > > > > > library can be used.
> > > > > > >
> > > > > > > Pawel Wodkowski (3):
> > > > > > > librte_headroom: New library for checking core/system/app load
> > > > > > > examples: introduce new l2fwd-headroom example
> > > > > > > MAINTAINERS: claim responsibility for headroom library and
> > example
> > > > app
> > > > > >
> > > > > > I'm sorry but I still fail to see how this is a particularly useful library. It
> > > > > > clearly works fine, but it composes an application event loop in its
> > own
> > > > > > terms,
> > > > > > and measures stats based on that. While thats ok, any application is
> > > > already
> > > > > > going to have to write its own event loop, and can makethe same
> > > > > > measurements
> > > > > > synchnously within that loop, using alot less code to optimize its
> > polling
> > > > time.
> > > > > >
> > > > > > In other words, I think this is one of those cases where this library is
> > > > > > probably somewhat useful for anyone who just wants to write an
> > > > application
> > > > > > in
> > > > > > terms the semantics exposed by this library, but not at all useful for
> > > > anyone
> > > > > > else. I'd personally rather not have the extra code to maintain here.
> > > > > >
> > > > > > Stephen just gave a presentation at netdev about some of the
> > > > performance
> > > > > > optimization measurements Brocade did with DPDK and how they fine
> > > > tuned
> > > > > > their
> > > > > > environment. One of the big take aways for me was that making time
> > > > based
> > > > > > measurements (especially if it was using the tsc), created cpu stalls
> > that
> > > > > > skewed the measurements, and so the best optimizations they made
> > > > avoided
> > > > > > time
> > > > > > measurements, opting instead for packet count metrics.
> > > > > >
> > > > > > Neil
> > > > >
> > > > > Hi Neil,
> > > > >
> > > > > I think this library offers something quite useful probably not for
> > everyone,
> > > > > but for many people that use DPDK, and it is measuring quite
> > accurately,
> > > > > how many spare cycles a CPU have after executing any serial tasks (as
> > you
> > > > will know).
> > > > > If you look at two places in example application: main_loop()
> > > > > and l2fwd_fwd() functions, you will see two possible approach there,
> > but
> > > > > this is not limited to that. You can even nest headroom objects and
> > > > measure
> > > > > process time of particular packets type.
> > > > > Of course, this will add an overhead due to the measurements,
> > > > > but that time is also measured, so any user can know what is the
> > relative
> > > > > time "wasted" for measuring all this.
> > > > > If time delays are measured in bigger timestamps, are handled reliably,
> > > > > the cost of measuring will be low.
> > > > > I find this quite similar to the power library case. I would say that library
> > is
> > > > not useful
> > > > > for every application, but there are several cases where it can be
> > > > > (as demonstrated with l3fwd-power app).
> > > > >
> > > > > About your last bit, not sure if I understood it right, but in case of the
> > > > included sample app,
> > > > > the main measurement to see if we are overusing a CPU is the packet
> > count
> > > > > in a queue (in this case RX queue), and I believe this should be used for
> > > > other apps,
> > > > > especially in those that use a pipeline model, where queues and rings
> > are
> > > > the key part.
> > > > >
> > > > > As a final point, last week (12th of February), there was a request for a
> > > > tool/library like this
> > > > > from a user in the mailing list (Ilan Borenshtein), which indicates that
> > this
> > > > would be useful
> > > > > (probably not just for him, but for others). It probably could be achieved
> > by
> > > > the user
> > > > > by adding their own code, but I believe this library would be a good-to-
> > > > have,
> > > > > in case a user is looking for an easy way to calculate the exposed above.
> > > > > Let us give the users an example of this method and we will expand it
> > with
> > > > more
> > > > > advanced application that may show capabilities of dynamic load
> > scaling
> > > > based on headroom library measurement.
> > > >
> > > Hi Thomas,
> > >
> > > > I wonder how this library is related to DPDK.
> > > > I'm not against its integration, though the question must be asked.
> > > > DPDK is a set of libraries. What kind of library fit with DPDK goals and
> > > > deserve to be integrated?
> > > >
> > > I think this library fits into dpdk goals, because it is simple and optimized
> > for fast packet processing.
> >
> > I don't have a strong opinion here. If anyone else wants to comment, please
> > speak now.
> >
> > > The library provides an easy way for existing DPDK users to modify their
> > applications to measure available CPU headroom.
> > > If we integrate it, users will verify it and decide what else they need and we
> > will implement this.
> >
> > Do you mean that you plan to add some features to this library?
> > Is it going to stay at providing some stats or could you make some actions
> > like time-sharing helpers?
> What do you mean by time-sharing here?
I mean helpers to stop processing at a defined rate in order to share CPU.
> > > > I don't know whether it's related but nobody acknowledged this patchset.
> > > We were waiting for Neil's final comments. He did not mention anything
> > else since the last time.
> > > When Pawel sends the next version with the copyright date corrected,
> > Pablo will ack this.
> > >
> > > > I also feel that the name of this library is a bit too vague. Some people
> > > > were asking first what means "headroom". It's actually for CPU headroom
> > > > monitoring.
> > > > What about "cpuheadroom", "cpuheadroomstat", "jobstat"?
> > >
> > > I think we can change the name to "cpuheadroom" as it describes more
> > clear this library.
> >
> > If you're focusing on CPU usage with possible actions, yes.
> > If you're focusing on decision helper, jobstat would be better IMHO.
> >
> > > > Last comment, less important: as many of your colleagues, you don't pay
> > > > attention to the copyright dates. I'm pretty sure this code was not written
> > > > in 2010. So why claiming it?
> > > Sorry, we will fix this of course and pay more attention now.
> >
> > OK, thanks
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 1/3] librte_headroom: New library for checking core/system/app load
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 1/3] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
@ 2015-02-24 1:55 ` Thomas Monjalon
0 siblings, 0 replies; 48+ messages in thread
From: Thomas Monjalon @ 2015-02-24 1:55 UTC (permalink / raw)
To: Pawel Wodkowski; +Cc: dev
2015-02-19 13:18, Pawel Wodkowski:
> This library provides an API to measure time spent in particular parts of
> code and to calculate optimal polling time.
>
> To calculate those statistics, application code needs to be divided into
> parts (called jobs) that do something. It is up to the application to decide
> what is considered a job.
>
> A series of jobs must be surrounded with the rte_headroom_start_loop() and
> rte_headroom_finish_loop() calls. After that, jobs might be started.
> Each job must be surrounded with rte_headroom_start_job() and
> rte_headroom_finish_job() calls.
>
> After a job finishes its execution, the period in which it should be called
> again is adjusted to minimize time wasted on unnecessary polls/calls.
> The adjustment is based on data provided by the job itself (e.g. the number
> of packets it processed).
>
> After all jobs in a series are executed, the following statistics are updated
> and might be used by the application. Statistics can be reset. Some of the
> provided statistics:
> - total/min/max execution - time spent in executing jobs.
> - total/min/max management - time spent outside execution area. This
> value might be used to measure overhead of scheduling jobs. This time
> also contains overhead of headroom library itself.
> - number of loops that executed at least one job
> - executed jobs
> - time when statistics were reset.
>
> Each job provides total/min/max execution time and execution count
> statistics.
>
> Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> ---
> config/common_bsdapp | 5 +
> config/common_linuxapp | 5 +
> lib/Makefile | 1 +
> lib/librte_headroom/Makefile | 54 +++++
> lib/librte_headroom/rte_headroom.c | 271 ++++++++++++++++++++++
> lib/librte_headroom/rte_headroom.h | 324 +++++++++++++++++++++++++++
> lib/librte_headroom/rte_headroom_version.map | 19 ++
Please add the library in doc/api/doxy-api.conf.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-23 16:04 ` Thomas Monjalon
@ 2015-02-24 8:44 ` Pawel Wodkowski
0 siblings, 0 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-24 8:44 UTC (permalink / raw)
To: Thomas Monjalon, Jastrzebski, MichalX K; +Cc: dev
On 2015-02-23 17:04, Thomas Monjalon wrote:
>>> Do you mean that you plan to add some features to this library?
>>> > >Is it going to stay at providing some stats or could you make some actions
>>> > >like time-sharing helpers?
>> >What do you mean here saying time-sharing?
> I mean helpers to stop processing at a defined rate in order to share CPU.
>
I am not sure we are talking about the same thing, but this is already
provided by the period field in struct rte_headroom_job (or whatever it
will be called in the next version). This field is a hint for the
application and allows it to execute jobs only when needed. If the
application decides that there is no time to execute some jobs it can
skip them, but that decision is up to the application.
The job stats plus the period hint, the ability to skip or invoke jobs at
any point, and even to decide dynamically whether a part of the code is
currently considered a job, are the added value of this library. Each of
these on its own has limited use.
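To illustrate the period hint, here is a minimal standalone sketch. It does
not use the library: struct demo_job, demo_update_period() and
demo_job_is_due() are hypothetical names made up for this example. The
adjustment rule and the step sizes only mirror the default update function
from the rte_jobstats patch later in this thread, and the "is due" check is
an assumption about how an application might consume the hint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a job descriptor (not the library's struct). */
struct demo_job {
	uint64_t period;     /* current hint: how long to wait between runs */
	uint64_t min_period; /* lower bound for the hint */
	uint64_t max_period; /* upper bound for the hint */
	int64_t target;      /* desired result per run, e.g. packets per poll */
};

/* Step sizes mirroring the patch: slow increase, faster decrease. */
#define DEMO_STEP_UP   1
#define DEMO_STEP_DOWN 4

/* Adjust the period hint from the value the job reported for its last run. */
void
demo_update_period(struct demo_job *job, int64_t result)
{
	int64_t err;

	assert(job != NULL);
	err = job->target - result;

	/* Result matched the target: keep the current period. */
	if (err == 0)
		return;

	if (err > 0) {
		/* Less work than the target: the job can run less often. */
		if (job->period + DEMO_STEP_UP < job->max_period)
			job->period += DEMO_STEP_UP;
	} else {
		/* More work than the target: the job should run more often. */
		if (job->min_period + DEMO_STEP_DOWN < job->period)
			job->period -= DEMO_STEP_DOWN;
	}
}

/* Assumed application-side check: the hint is advisory, so the caller
 * decides whether to run the job now or skip it. */
int
demo_job_is_due(const struct demo_job *job, uint64_t now, uint64_t last_run)
{
	assert(job != NULL);
	return now - last_run >= job->period;
}
```

The application stays in control here: demo_job_is_due() only reports what
the hint suggests, and the caller is free to run or skip the job anyway.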
--
Pawel
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-23 14:46 ` Thomas Monjalon
2015-02-23 15:55 ` Jastrzebski, MichalX K
@ 2015-02-24 9:49 ` Jastrzebski, MichalX K
2015-02-24 10:00 ` Thomas Monjalon
1 sibling, 1 reply; 48+ messages in thread
From: Jastrzebski, MichalX K @ 2015-02-24 9:49 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev
> > > I also feel that the name of this library is a bit too vague. Some people
> > > were asking first what means "headroom". It's actually for CPU headroom
> > > monitoring.
> > > What about "cpuheadroom", "cpuheadroomstat", "jobstat"?
> >
> > I think we can change the name to "cpuheadroom" as it describes more
> clear this library.
>
> If you're focusing on CPU usage with possible actions, yes.
> If you're focusing on decision helper, jobstat would be better IMHO.
We will change the name to "jobstats".
Best regards
Michal
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-24 9:49 ` Jastrzebski, MichalX K
@ 2015-02-24 10:00 ` Thomas Monjalon
2015-02-24 10:05 ` Wodkowski, PawelX
2015-02-24 10:53 ` Wodkowski, PawelX
0 siblings, 2 replies; 48+ messages in thread
From: Thomas Monjalon @ 2015-02-24 10:00 UTC (permalink / raw)
To: Jastrzebski, MichalX K; +Cc: dev
2015-02-24 09:49, Jastrzebski, MichalX K:
> > > > I also feel that the name of this library is a bit too vague. Some people
> > > > were asking first what means "headroom". It's actually for CPU headroom
> > > > monitoring.
> > > > What about "cpuheadroom", "cpuheadroomstat", "jobstat"?
> > >
> > > I think we can change the name to "cpuheadroom" as it describes more
> > clear this library.
> >
> > If you're focusing on CPU usage with possible actions, yes.
> > If you're focusing on decision helper, jobstat would be better IMHO.
>
> We will change the name to "jobstats".
OK, I don't want to impose any name, it should be your choice.
I guess you did an internal survey?
Would you be able to send a v6 today?
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-24 10:00 ` Thomas Monjalon
@ 2015-02-24 10:05 ` Wodkowski, PawelX
2015-02-24 10:53 ` Wodkowski, PawelX
1 sibling, 0 replies; 48+ messages in thread
From: Wodkowski, PawelX @ 2015-02-24 10:05 UTC (permalink / raw)
To: Thomas Monjalon, Jastrzebski, MichalX K; +Cc: dev
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, February 24, 2015 11:00 AM
> To: Jastrzebski, MichalX K
> Cc: Wodkowski, PawelX; dev@dpdk.org; Neil Horman
> Subject: Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example
> application
>
> 2015-02-24 09:49, Jastrzebski, MichalX K:
> > > > > I also feel that the name of this library is a bit too vague. Some people
> > > > > > were asking first what means "headroom". It's actually for CPU headroom
> > > > > > monitoring.
> > > > > What about "cpuheadroom", "cpuheadroomstat", "jobstat"?
> > > >
> > > > I think we can change the name to "cpuheadroom" as it describes more
> > > clear this library.
> > >
> > > If you're focusing on CPU usage with possible actions, yes.
> > > If you're focusing on decision helper, jobstat would be better IMHO.
> >
> > We will change the name to "jobstats".
>
> OK, I don't want to impose any name, it should be your choice.
> I guess you did an internal survey?
Yes, there was no better name that we could fit in.
>
> Would you be able to send a v6 today?
In two or three hours it should be ready.
Thanks
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application
2015-02-24 10:00 ` Thomas Monjalon
2015-02-24 10:05 ` Wodkowski, PawelX
@ 2015-02-24 10:53 ` Wodkowski, PawelX
1 sibling, 0 replies; 48+ messages in thread
From: Wodkowski, PawelX @ 2015-02-24 10:53 UTC (permalink / raw)
To: Thomas Monjalon, Jastrzebski, MichalX K; +Cc: dev
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, February 24, 2015 11:00 AM
> To: Jastrzebski, MichalX K
> Cc: Wodkowski, PawelX; dev@dpdk.org; Neil Horman
> Subject: Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example
> application
>
> 2015-02-24 09:49, Jastrzebski, MichalX K:
> > > > > I also feel that the name of this library is a bit too vague. Some people
> > > > > > were asking first what means "headroom". It's actually for CPU headroom
> > > > > > monitoring.
> > > > > What about "cpuheadroom", "cpuheadroomstat", "jobstat"?
> > > >
> > > > I think we can change the name to "cpuheadroom" as it describes more
> > > clear this library.
> > >
> > > If you're focusing on CPU usage with possible actions, yes.
> > > If you're focusing on decision helper, jobstat would be better IMHO.
> >
> > We will change the name to "jobstats".
>
> OK, I don't want to impose any name, it should be your choice.
> I guess you did an internal survey?
Yes, there was no better name that we could fit in.
>
> Would you be able to send a v6 today?
In two or three hours it should be ready.
Thanks
Pawel
^ permalink raw reply [flat|nested] 48+ messages in thread
* [dpdk-dev] [PATCH v6 0/3] new rte_jobstats library and example application
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Pawel Wodkowski
` (3 preceding siblings ...)
2015-02-19 14:33 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Neil Horman
@ 2015-02-24 16:33 ` Pawel Wodkowski
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 1/3] librte_jobstats: New library for checking core/system/app load Pawel Wodkowski
` (3 more replies)
4 siblings, 4 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-24 16:33 UTC (permalink / raw)
To: dev, pablo.de.lara.guarch
Hi community,
I would like to introduce a library for measuring the load of arbitrary jobs
and helping to find the optimal poll time in poll mode applications. It can be
used to measure and drive any kind of job set on any execution unit or
tasking library.
In the provided l2fwd-jobstats example I demonstrate how to use this library
to select the optimal rx burst poll time and find out the idle time. Jobs are
selected by using existing rte_timer library calls. This example does not
limit the possible schemes in which this library can be used.
PATCH v6 changes:
- rename library name to rte_jobstats.
- clean unused includes and dependencies in library.
- change/fix API documentation.
- reword cover letter.
PATCH v5 changes:
- Fix spelling and checkpatch.pl errors.
- Add maintainer claim for library and example app.
PATCH v4 changes:
- use proper branch for generating patch.
PATCH v3 changes:
- Fix spelling.
PATCH v2 changes:
- Remove jobs management/callback from library to not duplicate tasking library
behaviour.
- Clean up/remove useless statistics.
- Rework example application to use rte_timer library for jobs selection.
- Introduce new app parameter '-l' for automatic thousands separating in stats.
- More readable statistics format.
Pawel Wodkowski (3):
librte_jobstats: New library for checking core/system/app load
examples: introduce new l2fwd-jobstats example
MAINTAINERS: claim responsibility for rte_jobstats library and example
app
MAINTAINERS | 4 +
config/common_bsdapp | 5 +
config/common_linuxapp | 5 +
doc/api/doxy-api.conf | 1 +
examples/Makefile | 1 +
examples/l2fwd-jobstats/Makefile | 51 ++
examples/l2fwd-jobstats/main.c | 1040 ++++++++++++++++++++++++++
lib/Makefile | 1 +
lib/librte_jobstats/Makefile | 53 ++
lib/librte_jobstats/rte_jobstats.c | 273 +++++++
lib/librte_jobstats/rte_jobstats.h | 322 ++++++++
lib/librte_jobstats/rte_jobstats_version.map | 19 +
mk/rte.app.mk | 4 +
13 files changed, 1779 insertions(+)
create mode 100644 examples/l2fwd-jobstats/Makefile
create mode 100644 examples/l2fwd-jobstats/main.c
create mode 100644 lib/librte_jobstats/Makefile
create mode 100644 lib/librte_jobstats/rte_jobstats.c
create mode 100644 lib/librte_jobstats/rte_jobstats.h
create mode 100644 lib/librte_jobstats/rte_jobstats_version.map
--
1.9.1
^ permalink raw reply [flat|nested] 48+ messages in thread
* [dpdk-dev] [PATCH v6 1/3] librte_jobstats: New library for checking core/system/app load
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats " Pawel Wodkowski
@ 2015-02-24 16:33 ` Pawel Wodkowski
2015-02-24 21:18 ` Thomas Monjalon
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example Pawel Wodkowski
` (2 subsequent siblings)
3 siblings, 1 reply; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-24 16:33 UTC (permalink / raw)
To: dev, pablo.de.lara.guarch
This library provides an API to measure time spent in particular parts of
code and to calculate optimal polling time.
To calculate those statistics, application code needs to be divided into
parts (called jobs) that do something. It is up to the application to decide
what is considered a job.
A series of jobs must be surrounded with the rte_jobstats_context_start()
and rte_jobstats_context_finish() calls. After that, jobs might be
started. Each job must be surrounded with rte_jobstats_start() and
rte_jobstats_finish() calls.
After a job finishes its execution, the period in which it should be called
again is adjusted. This might be used to minimize time wasted on
unnecessary polls/calls. The adjustment is based on data provided by the
job itself (e.g. the number of packets it processed).
After all jobs in a series are executed, the following statistics are updated
and might be used by the application. Statistics can be reset. Some of the
provided statistics:
- total/min/max execution - time spent in executing jobs.
- total/min/max management - time spent outside execution area. This
value might be used to measure overhead of scheduling jobs. This time
also contains overhead of rte_jobstats library itself.
- number of loops that executed at least one job
- executed jobs
- time when statistics were reset.
Each job provides total/min/max execution time and execution count
statistics.
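As a rough illustration of the bracketing described above, the following
standalone sketch models how execution and management time could be
accumulated. It is not the library code: every demo_* name is hypothetical,
timestamps are passed in by the caller instead of being read from a cycle
counter, and the bookkeeping done at job start, job finish and context
finish is an assumption modelled on rte_jobstats_context_start() in this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Running total/min/max, in the spirit of the *_TIME_MIN_MAX macros. */
struct demo_minmax {
	uint64_t total;
	uint64_t min;
	uint64_t max;
};

/* Hypothetical context holding the two time classes and loop counters. */
struct demo_ctx {
	struct demo_minmax exec;       /* time spent inside jobs */
	struct demo_minmax management; /* time spent outside jobs */
	uint64_t state_time;           /* timestamp of the last state change */
	uint64_t loop_executed_jobs;   /* jobs run in the current loop */
	uint64_t loop_cnt;             /* loops that ran at least one job */
};

void
demo_minmax_reset(struct demo_minmax *s)
{
	s->total = 0;
	s->min = UINT64_MAX;
	s->max = 0;
}

void
demo_minmax_add(struct demo_minmax *s, uint64_t value)
{
	s->total += value;
	if (value < s->min)
		s->min = value;
	if (value > s->max)
		s->max = value;
}

void
demo_ctx_reset(struct demo_ctx *ctx, uint64_t now)
{
	assert(ctx != NULL);
	demo_minmax_reset(&ctx->exec);
	demo_minmax_reset(&ctx->management);
	ctx->state_time = now;
	ctx->loop_executed_jobs = 0;
	ctx->loop_cnt = 0;
}

/* Start of a series of jobs: time since the last state change is management. */
void
demo_ctx_start(struct demo_ctx *ctx, uint64_t now)
{
	ctx->loop_executed_jobs = 0;
	demo_minmax_add(&ctx->management, now - ctx->state_time);
	ctx->state_time = now;
}

/* Start of one job (assumed): close the preceding management interval. */
void
demo_job_start(struct demo_ctx *ctx, uint64_t now)
{
	demo_minmax_add(&ctx->management, now - ctx->state_time);
	ctx->state_time = now;
}

/* End of one job (assumed): time since job start counts as execution. */
void
demo_job_finish(struct demo_ctx *ctx, uint64_t now)
{
	demo_minmax_add(&ctx->exec, now - ctx->state_time);
	ctx->state_time = now;
	ctx->loop_executed_jobs++;
}

/* End of the series (assumed): count the loop only if a job actually ran. */
void
demo_ctx_finish(struct demo_ctx *ctx, uint64_t now)
{
	if (ctx->loop_executed_jobs)
		ctx->loop_cnt++;
	demo_minmax_add(&ctx->management, now - ctx->state_time);
	ctx->state_time = now;
}
```

Passing the timestamp in keeps the sketch deterministic; a real caller
would read a cycle counter at each of these points.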
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
config/common_bsdapp | 5 +
config/common_linuxapp | 5 +
doc/api/doxy-api.conf | 1 +
lib/Makefile | 1 +
lib/librte_jobstats/Makefile | 53 +++++
lib/librte_jobstats/rte_jobstats.c | 273 +++++++++++++++++++++++
lib/librte_jobstats/rte_jobstats.h | 322 +++++++++++++++++++++++++++
lib/librte_jobstats/rte_jobstats_version.map | 19 ++
8 files changed, 679 insertions(+)
create mode 100644 lib/librte_jobstats/Makefile
create mode 100644 lib/librte_jobstats/rte_jobstats.c
create mode 100644 lib/librte_jobstats/rte_jobstats.h
create mode 100644 lib/librte_jobstats/rte_jobstats_version.map
diff --git a/config/common_bsdapp b/config/common_bsdapp
index 57bacb8..86dc329 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -282,6 +282,11 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_jobstats
+#
+CONFIG_RTE_LIBRTE_JOBSTATS=y
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d428f84..6cfadef 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -290,6 +290,11 @@ CONFIG_RTE_LIBRTE_HASH=y
CONFIG_RTE_LIBRTE_HASH_DEBUG=n
#
+# Compile librte_jobstats
+#
+CONFIG_RTE_LIBRTE_JOBSTATS=y
+
+#
# Compile librte_lpm
#
CONFIG_RTE_LIBRTE_LPM=y
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 27c782c..8a6a5e6 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -37,6 +37,7 @@ INPUT = doc/api/doxy-api-index.md \
lib/librte_ether \
lib/librte_hash \
lib/librte_ip_frag \
+ lib/librte_jobstats \
lib/librte_kni \
lib/librte_kvargs \
lib/librte_lpm \
diff --git a/lib/Makefile b/lib/Makefile
index d617d81..42ffe2f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -58,6 +58,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
DIRS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += librte_ip_frag
+DIRS-$(CONFIG_RTE_LIBRTE_JOBSTATS) += librte_jobstats
DIRS-$(CONFIG_RTE_LIBRTE_POWER) += librte_power
DIRS-$(CONFIG_RTE_LIBRTE_METER) += librte_meter
DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += librte_sched
diff --git a/lib/librte_jobstats/Makefile b/lib/librte_jobstats/Makefile
new file mode 100644
index 0000000..136a448
--- /dev/null
+++ b/lib/librte_jobstats/Makefile
@@ -0,0 +1,53 @@
+# BSD LICENSE
+#
+# Copyright(c) 2015 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_jobstats.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_jobstats_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_JOBSTATS) := rte_jobstats.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_JOBSTATS)-include := rte_jobstats.h
+
+# this lib needs eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_JOBSTATS) += lib/librte_eal
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_jobstats/rte_jobstats.c b/lib/librte_jobstats/rte_jobstats.c
new file mode 100644
index 0000000..2eaac0c
--- /dev/null
+++ b/lib/librte_jobstats/rte_jobstats.c
@@ -0,0 +1,273 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+
+#include <rte_errno.h>
+#include <rte_common.h>
+#include <rte_eal.h>
+#include <rte_log.h>
+#include <rte_cycles.h>
+#include <rte_branch_prediction.h>
+
+#include "rte_jobstats.h"
+
+#define ADD_TIME_MIN_MAX(obj, type, value) do { \
+ typeof(value) tmp = (value); \
+ (obj)->type ## _time += tmp; \
+ if (tmp < (obj)->min_ ## type ## _time) \
+ (obj)->min_ ## type ## _time = tmp; \
+ if (tmp > (obj)->max_ ## type ## _time) \
+ (obj)->max_ ## type ## _time = tmp; \
+} while (0)
+
+#define RESET_TIME_MIN_MAX(obj, type) do { \
+ (obj)->type ## _time = 0; \
+ (obj)->min_ ## type ## _time = UINT64_MAX; \
+ (obj)->max_ ## type ## _time = 0; \
+} while (0)
+
+static inline uint64_t
+get_time(void)
+{
+ rte_rmb();
+ return rte_get_timer_cycles();
+}
+
+/* These are the steps used to adjust the job period.
+ * Experiments show that for forwarding apps the up step must be smaller than
+ * the down step to achieve optimal performance.
+ */
+#define JOB_UPDATE_STEP_UP 1
+#define JOB_UPDATE_STEP_DOWN 4
+
+/*
+ * Default update function that implements simple period adjustment.
+ */
+static void
+default_update_function(struct rte_jobstats *job, int64_t result)
+{
+ int64_t err = job->target - result;
+
+ /* Job is happy. Nothing to do */
+ if (err == 0)
+ return;
+
+ if (err > 0) {
+ if (job->period + JOB_UPDATE_STEP_UP < job->max_period)
+ job->period += JOB_UPDATE_STEP_UP;
+ } else {
+ if (job->min_period + JOB_UPDATE_STEP_DOWN < job->period)
+ job->period -= JOB_UPDATE_STEP_DOWN;
+ }
+}
+
+int
+rte_jobstats_context_init(struct rte_jobstats_context *ctx)
+{
+ if (ctx == NULL)
+ return -EINVAL;
+
+ /* Init only needed parameters. Zero out everything else. */
+ memset(ctx, 0, sizeof(struct rte_jobstats_context));
+
+ rte_jobstats_context_reset(ctx);
+
+ return 0;
+}
+
+void
+rte_jobstats_context_start(struct rte_jobstats_context *ctx)
+{
+ uint64_t now;
+
+ ctx->loop_executed_jobs = 0;
+
+ now = get_time();
+ ADD_TIME_MIN_MAX(ctx, management, now - ctx->state_time);
+ ctx->state_time = now;
+}
+
+void
+rte_jobstats_context_finish(struct rte_jobstats_context *ctx)
+{
+ uint64_t now;
+
+ if (likely(ctx->loop_executed_jobs))
+ ctx->loop_cnt++;
+
+ now = get_time();
+ ADD_TIME_MIN_MAX(ctx, management, now - ctx->state_time);
+ ctx->state_time = now;
+}
+
+void
+rte_jobstats_context_reset(struct rte_jobstats_context *ctx)
+{
+ RESET_TIME_MIN_MAX(ctx, exec);
+ RESET_TIME_MIN_MAX(ctx, management);
+ ctx->start_time = get_time();
+ ctx->state_time = ctx->start_time;
+ ctx->job_exec_cnt = 0;
+ ctx->loop_cnt = 0;
+}
+
+void
+rte_jobstats_set_target(struct rte_jobstats *job, int64_t target)
+{
+ job->target = target;
+}
+
+int
+rte_jobstats_start(struct rte_jobstats_context *ctx, struct rte_jobstats *job)
+{
+ uint64_t now;
+
+ /* Some sanity check. */
+ if (unlikely(ctx == NULL || job == NULL || job->context != NULL))
+ return -EINVAL;
+
+ /* Link job with context object. */
+ job->context = ctx;
+
+ now = get_time();
+ ADD_TIME_MIN_MAX(ctx, management, now - ctx->state_time);
+ ctx->state_time = now;
+
+ return 0;
+}
+
+int
+rte_jobstats_finish(struct rte_jobstats *job, int64_t job_value)
+{
+ struct rte_jobstats_context *ctx;
+ uint64_t now, exec_time;
+ int need_update;
+
+ /* Some sanity check. */
+ if (unlikely(job == NULL || job->context == NULL))
+ return -EINVAL;
+
+ need_update = job->target != job_value;
+ /* Adjust period only if job is unhappy with its current period. */
+ if (need_update)
+ (*job->update_period_cb)(job, job_value);
+
+ ctx = job->context;
+
+ /* Time spent in the update callback counts as job runtime, so take the
+ * timestamp after the callback has run. */
+ now = get_time();
+ exec_time = now - ctx->state_time;
+ ADD_TIME_MIN_MAX(job, exec, exec_time);
+ ADD_TIME_MIN_MAX(ctx, exec, exec_time);
+
+ ctx->state_time = now;
+
+ ctx->loop_executed_jobs++;
+ ctx->job_exec_cnt++;
+
+ job->exec_cnt++;
+ job->context = NULL;
+
+ return need_update;
+}
+
+void
+rte_jobstats_set_period(struct rte_jobstats *job, uint64_t period,
+ uint8_t saturate)
+{
+ if (saturate != 0) {
+ if (period < job->min_period)
+ period = job->min_period;
+ else if (period > job->max_period)
+ period = job->max_period;
+ }
+
+ job->period = period;
+}
+
+void
+rte_jobstats_set_min(struct rte_jobstats *job, uint64_t period)
+{
+ job->min_period = period;
+ if (job->period < period)
+ job->period = period;
+}
+
+void
+rte_jobstats_set_max(struct rte_jobstats *job, uint64_t period)
+{
+ job->max_period = period;
+ if (job->period > period)
+ job->period = period;
+}
+
+int
+rte_jobstats_init(struct rte_jobstats *job, const char *name,
+ uint64_t min_period, uint64_t max_period, uint64_t initial_period,
+ int64_t target)
+{
+ if (job == NULL)
+ return -EINVAL;
+
+ job->period = initial_period;
+ job->min_period = min_period;
+ job->max_period = max_period;
+ job->target = target;
+ job->update_period_cb = &default_update_function;
+ rte_jobstats_reset(job);
+ snprintf(job->name, RTE_DIM(job->name), "%s", name == NULL ? "" : name);
+ job->context = NULL;
+
+ return 0;
+}
+
+void
+rte_jobstats_set_update_period_function(struct rte_jobstats *job,
+ rte_job_update_period_cb_t update_period_cb)
+{
+ if (update_period_cb == NULL)
+ update_period_cb = default_update_function;
+
+ job->update_period_cb = update_period_cb;
+}
+
+void
+rte_jobstats_reset(struct rte_jobstats *job)
+{
+ RESET_TIME_MIN_MAX(job, exec);
+ job->exec_cnt = 0;
+}
diff --git a/lib/librte_jobstats/rte_jobstats.h b/lib/librte_jobstats/rte_jobstats.h
new file mode 100644
index 0000000..de6a89a
--- /dev/null
+++ b/lib/librte_jobstats/rte_jobstats.h
@@ -0,0 +1,322 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef JOBSTATS_H_
+#define JOBSTATS_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_JOBSTATS_NAMESIZE 32
+
+/* Forward declarations. */
+struct rte_jobstats_context;
+struct rte_jobstats;
+
+/**
+ * This function should calculate new period and set it using
+ * rte_jobstats_set_period() function. Time spent in this function will be
+ * added to job's runtime.
+ *
+ * @param job
+ * The job data structure handler.
+ * @param job_result
+ * Result of calling job callback.
+ */
+typedef void (*rte_job_update_period_cb_t)(struct rte_jobstats *job,
+ int64_t job_result);
+
+struct rte_jobstats {
+ uint64_t period;
+ /**< Estimated period of execution. */
+
+ uint64_t min_period;
+ /**< Minimum period. */
+
+ uint64_t max_period;
+ /**< Maximum period. */
+
+ int64_t target;
+ /**< Desired value for this job. */
+
+ rte_job_update_period_cb_t update_period_cb;
+ /**< Period update callback. */
+
+ uint64_t exec_time;
+ /**< Total time (sum) that this job was executing. */
+
+ uint64_t min_exec_time;
+ /**< Minimum execute time. */
+
+ uint64_t max_exec_time;
+ /**< Maximum execute time. */
+
+ uint64_t exec_cnt;
+ /**< Execute count. */
+
+ char name[RTE_JOBSTATS_NAMESIZE];
+ /**< Name of this job */
+
+ struct rte_jobstats_context *context;
+ /**< Job stats context object that is executing this job. */
+} __rte_cache_aligned;
+
+struct rte_jobstats_context {
+ /** Variable holding time at different points:
+ * -# loop start time if loop was started but no job executed yet.
+ * -# job start time if job is currently executing.
+ * -# job finish time if job finished its execution.
+ * -# loop finish time if loop finished its execution. */
+ uint64_t state_time;
+
+ uint64_t loop_executed_jobs;
+ /**< Count of executed jobs in this loop. */
+
+ /* Statistics start. */
+
+ uint64_t exec_time;
+ /**< Total time taken to execute jobs, not including management time. */
+
+ uint64_t min_exec_time;
+ /**< Minimum loop execute time. */
+
+ uint64_t max_exec_time;
+ /**< Maximum loop execute time. */
+
+ /**
+ * Sum of time that is not the execute time (ex: from job finish to next
+ * job start).
+ *
+ * This time might be considered as overhead of library + job scheduling.
+ */
+ uint64_t management_time;
+
+ uint64_t min_management_time;
+ /**< Minimum management time */
+
+ uint64_t max_management_time;
+ /**< Maximum management time */
+
+ uint64_t start_time;
+ /**< Timestamp of the last statistics reset. */
+
+ uint64_t job_exec_cnt;
+ /**< Total count of executed jobs. */
+
+ uint64_t loop_cnt;
+ /**< Total count of executed loops with at least one executed job. */
+} __rte_cache_aligned;
+
+/**
+ * Initialize given context object with default values.
+ *
+ * @param ctx
+ * Job stats context object to initialize.
+ *
+ * @return
+ * 0 on success
+ * -EINVAL if *ctx* is NULL
+ */
+int
+rte_jobstats_context_init(struct rte_jobstats_context *ctx);
+
+/**
+ * Mark that a new set of jobs starts executing.
+ *
+ * @param ctx
+ * Job stats context object.
+ */
+void
+rte_jobstats_context_start(struct rte_jobstats_context *ctx);
+
+/**
+ * Mark that there are no more jobs ready to execute in this turn. Calculate
+ * stats for this loop turn.
+ *
+ * @param ctx
+ * Job stats context.
+ */
+void
+rte_jobstats_context_finish(struct rte_jobstats_context *ctx);
+
+/**
+ * Function resets job context statistics.
+ *
+ * @param ctx
+ * Job stats context whose statistics will be reset.
+ */
+void
+rte_jobstats_context_reset(struct rte_jobstats_context *ctx);
+
+/**
+ * Initialize given job stats object.
+ *
+ * @param job
+ * Job object.
+ * @param name
+ * Optional job name.
+ * @param min_period
+ * Minimum period that this job can accept.
+ * @param max_period
+ * Maximum period that this job can accept.
+ * @param initial_period
+ * Initial period. It will be checked against *min_period* and *max_period*.
+ * @param target
+ * Target value that this job tries to achieve.
+ *
+ * @return
+ * 0 on success
+ * -EINVAL if *job* is NULL
+ */
+int
+rte_jobstats_init(struct rte_jobstats *job, const char *name,
+ uint64_t min_period, uint64_t max_period, uint64_t initial_period,
+ int64_t target);
+
+/**
+ * Set the job's desired target value. The difference between the target and
+ * the job value must be used to properly adjust the job's execute period.
+ *
+ * @param job
+ * The job object.
+ * @param target
+ * New target.
+ */
+void
+rte_jobstats_set_target(struct rte_jobstats *job, int64_t target);
+
+/**
+ * Mark that *job* is starting its execution in the context of *ctx* object.
+ *
+ * @param ctx
+ * Job stats context.
+ * @param job
+ * Job object.
+ * @return
+ * 0 on success
+ * -EINVAL if *ctx* or *job* is NULL or *job* is already executing in
+ * another context.
+ */
+int
+rte_jobstats_start(struct rte_jobstats_context *ctx, struct rte_jobstats *job);
+
+/**
+ * Mark that *job* finished its execution. Context in which it was executing
+ * will receive stat update. After this function call *job* object is ready to
+ * be executed in other context.
+ *
+ * @param job
+ * Job object.
+ * @param job_value
+ * Job value. Job should pass in this parameter the value that it tries to
+ * optimize, for example the number of packets it processed.
+ *
+ * @return
+ * 0 if job's period was not updated (job target equals *job_value*)
+ * 1 if job's period was updated
+ * -EINVAL if job is NULL or job was not started (it has no context).
+ */
+int
+rte_jobstats_finish(struct rte_jobstats *job, int64_t job_value);
+
+/**
+ * Set execute period of given job.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New period value.
+ * @param saturate
+ * If zero, skip period saturation to min, max range.
+ */
+void
+rte_jobstats_set_period(struct rte_jobstats *job, uint64_t period,
+ uint8_t saturate);
+/**
+ * Set minimum execute period of given job. Current period will be checked
+ * against new minimum value.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New minimum period value.
+ */
+void
+rte_jobstats_set_min(struct rte_jobstats *job, uint64_t period);
+/**
+ * Set maximum execute period of given job. Current period will be checked
+ * against new maximum value.
+ *
+ * @param job
+ * The job object.
+ * @param period
+ * New maximum period value.
+ */
+void
+rte_jobstats_set_max(struct rte_jobstats *job, uint64_t period);
+
+/**
+ * Set update period callback that is invoked after job finish.
+ *
+ * If application wants to do more sophisticated calculations than default
+ * it can provide this handler.
+ *
+ * @param job
+ * Job object.
+ * @param update_period_cb
+ * Callback to set. If NULL restore default update function.
+ */
+void
+rte_jobstats_set_update_period_function(struct rte_jobstats *job,
+ rte_job_update_period_cb_t update_period_cb);
+
+/**
+ * Function resets job statistics.
+ *
+ * @param job
+ * Job whose statistics will be reset.
+ */
+void
+rte_jobstats_reset(struct rte_jobstats *job);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* JOBSTATS_H_ */
diff --git a/lib/librte_jobstats/rte_jobstats_version.map b/lib/librte_jobstats/rte_jobstats_version.map
new file mode 100644
index 0000000..ef29819
--- /dev/null
+++ b/lib/librte_jobstats/rte_jobstats_version.map
@@ -0,0 +1,19 @@
+DPDK_2.0 {
+ global:
+
+ rte_jobstats_context_init;
+ rte_jobstats_context_start;
+ rte_jobstats_context_finish;
+ rte_jobstats_context_reset;
+ rte_jobstats_init;
+ rte_jobstats_set_target;
+ rte_jobstats_start;
+ rte_jobstats_finish;
+ rte_jobstats_set_period;
+ rte_jobstats_set_min;
+ rte_jobstats_set_max;
+ rte_jobstats_set_update_period_function;
+ rte_jobstats_reset;
+
+ local: *;
+};
--
1.9.1
^ permalink raw reply [flat|nested] 48+ messages in thread
* [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats " Pawel Wodkowski
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 1/3] librte_jobstats: New library for checking core/system/app load Pawel Wodkowski
@ 2015-02-24 16:33 ` Pawel Wodkowski
2015-02-24 19:10 ` De Lara Guarch, Pablo
2015-02-24 21:19 ` Thomas Monjalon
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 3/3] MAINTAINERS: claim responsibility for rte_jobstats library and example app Pawel Wodkowski
2015-02-24 20:34 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats library and example application De Lara Guarch, Pablo
3 siblings, 2 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-24 16:33 UTC (permalink / raw)
To: dev, pablo.de.lara.guarch
This app demonstrates usage of the new rte_jobstats library.
It is basically the original l2fwd with the following modifications to meet
library requirements:
- main_loop() was split into two jobs: forward job and flush job. Logic
for those jobs is almost the same as in the original application.
- stats printing is moved to an rte_alarm callback to avoid the overhead of
printing.
- stats are expanded to show rte_jobstats statistics.
- added new parameter '-l' for automatic thousands separator.
Comparing the original l2fwd and l2fwd-jobstats apps shows what is
needed to properly write your own application with rte_jobstats
measurements.
New available statistics:
- Total and % of fwd and flush execution time.
- Management time: overhead of rte_timer plus overhead of the rte_jobstats
library.
- Idle time: total and % of time spent waiting for fwd or flush to be ready to
execute.
- Per-job execution time and period.
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
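Note on the period adjustment: the example installs its own update callback
(l2fwd_job_update_cb) instead of the library default. The sketch below is a
standalone model of that logic, not the library code. struct job_model and
job_model_update() are hypothetical names that mirror only the period-related
fields of struct rte_jobstats, so the snippet builds without DPDK. The step
sizes are the UPDATE_STEP_UP and UPDATE_STEP_DOWN values from main.c.

```c
#include <stdint.h>

/* Hypothetical stand-in for the period-related fields of struct rte_jobstats. */
struct job_model {
	uint64_t period;     /* current execution period */
	uint64_t min_period; /* lower bound for the period */
	uint64_t max_period; /* upper bound for the period */
	int64_t target;      /* value the job tries to reach per run */
};

#define STEP_UP 1    /* same value as UPDATE_STEP_UP in main.c */
#define STEP_DOWN 32 /* same value as UPDATE_STEP_DOWN in main.c */

/*
 * Adjust the period with a dead band of target / 8 around the target.
 * A result well above the target shortens the period in large steps,
 * a result well below the target lengthens it in small steps, and a
 * result inside the band leaves the period unchanged.
 */
static void
job_model_update(struct job_model *job, int64_t result)
{
	int64_t err = job->target - result;
	int64_t hysteresis = job->target / 8;

	if (err < -hysteresis) {
		if (job->min_period + STEP_DOWN < job->period)
			job->period -= STEP_DOWN;
	} else if (err > hysteresis) {
		if (job->period + STEP_UP < job->max_period)
			job->period += STEP_UP;
	}
}
```

With target = 32 and period = 100, a result of 64 moves the period to 68, a
following result of 8 moves it to 69, and a result of 30 falls inside the band
and changes nothing. The asymmetric steps let the job react quickly when it
sees more work than expected and back off slowly otherwise.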
examples/Makefile | 1 +
examples/l2fwd-jobstats/Makefile | 51 ++
examples/l2fwd-jobstats/main.c | 1040 ++++++++++++++++++++++++++++++++++++++
mk/rte.app.mk | 4 +
4 files changed, 1096 insertions(+)
create mode 100644 examples/l2fwd-jobstats/Makefile
create mode 100644 examples/l2fwd-jobstats/main.c
diff --git a/examples/Makefile b/examples/Makefile
index 81f1d2f..e847ded 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ip_fragmentation
DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
DIRS-y += l2fwd
+DIRS-y += l2fwd-jobstats
DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
DIRS-y += l3fwd
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
diff --git a/examples/l2fwd-jobstats/Makefile b/examples/l2fwd-jobstats/Makefile
new file mode 100644
index 0000000..d57a0ae
--- /dev/null
+++ b/examples/l2fwd-jobstats/Makefile
@@ -0,0 +1,51 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l2fwd-jobstats
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c
+
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l2fwd-jobstats/main.c b/examples/l2fwd-jobstats/main.c
new file mode 100644
index 0000000..a5a1aaa
--- /dev/null
+++ b/examples/l2fwd-jobstats/main.c
@@ -0,0 +1,1040 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <locale.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <ctype.h>
+#include <getopt.h>
+
+#include <rte_alarm.h>
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+#include <rte_memzone.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_launch.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_prefetch.h>
+#include <rte_lcore.h>
+#include <rte_per_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_interrupts.h>
+#include <rte_pci.h>
+#include <rte_debug.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+#include <rte_ring.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_spinlock.h>
+
+#include <rte_errno.h>
+#include <rte_jobstats.h>
+#include <rte_timer.h>
+#include <rte_alarm.h>
+
+#define RTE_LOGTYPE_L2FWD RTE_LOGTYPE_USER1
+
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NB_MBUF 8192
+
+#define MAX_PKT_BURST 32
+#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */
+
+/*
+ * Configurable number of RX/TX ring descriptors
+ */
+#define RTE_TEST_RX_DESC_DEFAULT 128
+#define RTE_TEST_TX_DESC_DEFAULT 512
+static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
+static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
+
+/* ethernet addresses of ports */
+static struct ether_addr l2fwd_ports_eth_addr[RTE_MAX_ETHPORTS];
+
+/* mask of enabled ports */
+static uint32_t l2fwd_enabled_port_mask;
+
+/* destination port for each enabled port */
+static uint32_t l2fwd_dst_ports[RTE_MAX_ETHPORTS];
+
+#define UPDATE_STEP_UP 1
+#define UPDATE_STEP_DOWN 32
+
+static unsigned int l2fwd_rx_queue_per_lcore = 1;
+
+struct mbuf_table {
+ uint64_t next_flush_time;
+ unsigned len;
+ struct rte_mbuf *mbufs[MAX_PKT_BURST];
+};
+
+#define MAX_RX_QUEUE_PER_LCORE 16
+#define MAX_TX_QUEUE_PER_PORT 16
+struct lcore_queue_conf {
+ unsigned n_rx_port;
+ unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
+ struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+
+ struct rte_timer rx_timers[MAX_RX_QUEUE_PER_LCORE];
+ struct rte_jobstats port_fwd_jobs[MAX_RX_QUEUE_PER_LCORE];
+
+ struct rte_timer flush_timer;
+ struct rte_jobstats flush_job;
+ struct rte_jobstats idle_job;
+ struct rte_jobstats_context jobs_context;
+
+ rte_atomic16_t stats_read_pending;
+ rte_spinlock_t lock;
+} __rte_cache_aligned;
+struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];
+
+static const struct rte_eth_conf port_conf = {
+ .rxmode = {
+ .split_hdr_size = 0,
+ .header_split = 0, /**< Header Split disabled */
+ .hw_ip_checksum = 0, /**< IP checksum offload disabled */
+ .hw_vlan_filter = 0, /**< VLAN filtering disabled */
+ .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
+ .hw_strip_crc = 0, /**< CRC strip by hardware disabled */
+ },
+ .txmode = {
+ .mq_mode = ETH_MQ_TX_NONE,
+ },
+};
+
+struct rte_mempool *l2fwd_pktmbuf_pool = NULL;
+
+/* Per-port statistics struct */
+struct l2fwd_port_statistics {
+ uint64_t tx;
+ uint64_t rx;
+ uint64_t dropped;
+} __rte_cache_aligned;
+struct l2fwd_port_statistics port_statistics[RTE_MAX_ETHPORTS];
+
+/* 1 day max */
+#define MAX_TIMER_PERIOD 86400
+/* default period is 10 seconds */
+static int64_t timer_period = 10;
+/* default timer frequency */
+static double hz;
+/* BURST_TX_DRAIN_US converted to cycles */
+uint64_t drain_tsc;
+/* Convert cycles to ns */
+static inline double
+cycles_to_ns(uint64_t cycles)
+{
+ double t = cycles;
+
+ t *= (double)NS_PER_S;
+ t /= hz;
+ return t;
+}
+
+static void
+show_lcore_stats(unsigned lcore_id)
+{
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct rte_jobstats_context *ctx = &qconf->jobs_context;
+ struct rte_jobstats *job;
+ uint8_t i;
+
+ /* LCore statistics. */
+ uint64_t stats_period, loop_count;
+ uint64_t exec, exec_min, exec_max;
+ uint64_t management, management_min, management_max;
+ uint64_t busy, busy_min, busy_max;
+
+ /* Jobs statistics. */
+ const uint8_t port_cnt = qconf->n_rx_port;
+ uint64_t jobs_exec_cnt[port_cnt], jobs_period[port_cnt];
+ uint64_t jobs_exec[port_cnt], jobs_exec_min[port_cnt],
+ jobs_exec_max[port_cnt];
+
+ uint64_t flush_exec_cnt, flush_period;
+ uint64_t flush_exec, flush_exec_min, flush_exec_max;
+
+ uint64_t idle_exec_cnt;
+ uint64_t idle_exec, idle_exec_min, idle_exec_max;
+ uint64_t collection_time = rte_get_timer_cycles();
+
+ /* Ask forwarding thread to give us stats. */
+ rte_atomic16_set(&qconf->stats_read_pending, 1);
+ rte_spinlock_lock(&qconf->lock);
+ rte_atomic16_set(&qconf->stats_read_pending, 0);
+
+ /* Collect context statistics. */
+ stats_period = ctx->state_time - ctx->start_time;
+ loop_count = ctx->loop_cnt;
+
+ exec = ctx->exec_time;
+ exec_min = ctx->min_exec_time;
+ exec_max = ctx->max_exec_time;
+
+ management = ctx->management_time;
+ management_min = ctx->min_management_time;
+ management_max = ctx->max_management_time;
+
+ rte_jobstats_context_reset(ctx);
+
+ for (i = 0; i < port_cnt; i++) {
+ job = &qconf->port_fwd_jobs[i];
+
+ jobs_exec_cnt[i] = job->exec_cnt;
+ jobs_period[i] = job->period;
+
+ jobs_exec[i] = job->exec_time;
+ jobs_exec_min[i] = job->min_exec_time;
+ jobs_exec_max[i] = job->max_exec_time;
+
+ rte_jobstats_reset(job);
+ }
+
+ flush_exec_cnt = qconf->flush_job.exec_cnt;
+ flush_period = qconf->flush_job.period;
+ flush_exec = qconf->flush_job.exec_time;
+ flush_exec_min = qconf->flush_job.min_exec_time;
+ flush_exec_max = qconf->flush_job.max_exec_time;
+ rte_jobstats_reset(&qconf->flush_job);
+
+ idle_exec_cnt = qconf->idle_job.exec_cnt;
+ idle_exec = qconf->idle_job.exec_time;
+ idle_exec_min = qconf->idle_job.min_exec_time;
+ idle_exec_max = qconf->idle_job.max_exec_time;
+ rte_jobstats_reset(&qconf->idle_job);
+
+ rte_spinlock_unlock(&qconf->lock);
+
+ exec -= idle_exec;
+ busy = exec + management;
+ busy_min = exec_min + management_min;
+ busy_max = exec_max + management_max;
+
+
+ collection_time = rte_get_timer_cycles() - collection_time;
+
+#define STAT_FMT "\n%-18s %'14.0f %6.1f%% %'10.0f %'10.0f %'10.0f"
+
+ printf("\n----------------"
+ "\nLCore %3u: statistics (time in ns, collected in %'9.0f)"
+ "\n%-18s %14s %7s %10s %10s %10s "
+ "\n%-18s %'14.0f"
+ "\n%-18s %'14" PRIu64
+ STAT_FMT /* Exec */
+ STAT_FMT /* Management */
+ STAT_FMT /* Busy */
+ STAT_FMT, /* Idle */
+ lcore_id, cycles_to_ns(collection_time),
+ "Stat type", "total", "%total", "avg", "min", "max",
+ "Stats duration:", cycles_to_ns(stats_period),
+ "Loop count:", loop_count,
+ "Exec time",
+ cycles_to_ns(exec), exec * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? exec / loop_count : 0),
+ cycles_to_ns(exec_min),
+ cycles_to_ns(exec_max),
+ "Management time",
+ cycles_to_ns(management), management * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? management / loop_count : 0),
+ cycles_to_ns(management_min),
+ cycles_to_ns(management_max),
+ "Exec + management",
+ cycles_to_ns(busy), busy * 100.0 / stats_period,
+ cycles_to_ns(loop_count ? busy / loop_count : 0),
+ cycles_to_ns(busy_min),
+ cycles_to_ns(busy_max),
+ "Idle (job)",
+ cycles_to_ns(idle_exec), idle_exec * 100.0 / stats_period,
+ cycles_to_ns(idle_exec_cnt ? idle_exec / idle_exec_cnt : 0),
+ cycles_to_ns(idle_exec_min),
+ cycles_to_ns(idle_exec_max));
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+ job = &qconf->port_fwd_jobs[i];
+ printf("\n\nJob %" PRIu32 ": %-20s "
+ "\n%-18s %'14" PRIu64
+ "\n%-18s %'14.0f"
+ STAT_FMT,
+ i, job->name,
+ "Exec count:", jobs_exec_cnt[i],
+ "Exec period: ", cycles_to_ns(jobs_period[i]),
+ "Exec time",
+ cycles_to_ns(jobs_exec[i]), jobs_exec[i] * 100.0 / stats_period,
+ cycles_to_ns(jobs_exec_cnt[i] ? jobs_exec[i] / jobs_exec_cnt[i]
+ : 0),
+ cycles_to_ns(jobs_exec_min[i]),
+ cycles_to_ns(jobs_exec_max[i]));
+ }
+
+ if (qconf->n_rx_port > 0) {
+ job = &qconf->flush_job;
+ printf("\n\nJob %" PRIu32 ": %-20s "
+ "\n%-18s %'14" PRIu64
+ "\n%-18s %'14.0f"
+ STAT_FMT,
+ i, job->name,
+ "Exec count:", flush_exec_cnt,
+ "Exec period: ", cycles_to_ns(flush_period),
+ "Exec time",
+ cycles_to_ns(flush_exec), flush_exec * 100.0 / stats_period,
+ cycles_to_ns(flush_exec_cnt ? flush_exec / flush_exec_cnt : 0),
+ cycles_to_ns(flush_exec_min),
+ cycles_to_ns(flush_exec_max));
+ }
+}
+
+/* Print out statistics on packets dropped */
+static void
+show_stats_cb(__rte_unused void *param)
+{
+ uint64_t total_packets_dropped, total_packets_tx, total_packets_rx;
+ unsigned portid, lcore_id;
+
+ total_packets_dropped = 0;
+ total_packets_tx = 0;
+ total_packets_rx = 0;
+
+ const char clr[] = { 27, '[', '2', 'J', '\0' };
+ const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
+
+ /* Clear screen and move to top left */
+ printf("%s%s"
+ "\nPort statistics ===================================",
+ clr, topLeft);
+
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ /* skip disabled ports */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+ printf("\nStatistics for port %u ------------------------------"
+ "\nPackets sent: %24"PRIu64
+ "\nPackets received: %20"PRIu64
+ "\nPackets dropped: %21"PRIu64,
+ portid,
+ port_statistics[portid].tx,
+ port_statistics[portid].rx,
+ port_statistics[portid].dropped);
+
+ total_packets_dropped += port_statistics[portid].dropped;
+ total_packets_tx += port_statistics[portid].tx;
+ total_packets_rx += port_statistics[portid].rx;
+ }
+
+ printf("\nAggregate statistics ==============================="
+ "\nTotal packets sent: %18"PRIu64
+ "\nTotal packets received: %14"PRIu64
+ "\nTotal packets dropped: %15"PRIu64
+ "\n====================================================",
+ total_packets_tx,
+ total_packets_rx,
+ total_packets_dropped);
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ if (lcore_queue_conf[lcore_id].n_rx_port > 0)
+ show_lcore_stats(lcore_id);
+ }
+
+ printf("\n====================================================\n");
+ rte_eal_alarm_set(timer_period * US_PER_S, show_stats_cb, NULL);
+}
+
+/* Send the burst of packets on an output interface */
+static void
+l2fwd_send_burst(struct lcore_queue_conf *qconf, uint8_t port)
+{
+ struct mbuf_table *m_table;
+ uint16_t ret;
+ uint16_t queueid = 0;
+ uint16_t n;
+
+ m_table = &qconf->tx_mbufs[port];
+ n = m_table->len;
+
+ m_table->next_flush_time = rte_get_timer_cycles() + drain_tsc;
+ m_table->len = 0;
+
+ ret = rte_eth_tx_burst(port, queueid, m_table->mbufs, n);
+
+ port_statistics[port].tx += ret;
+ if (unlikely(ret < n)) {
+ port_statistics[port].dropped += (n - ret);
+ do {
+ rte_pktmbuf_free(m_table->mbufs[ret]);
+ } while (++ret < n);
+ }
+}
+
+/* Enqueue packets for TX and prepare them to be sent */
+static int
+l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
+{
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct mbuf_table *m_table = &qconf->tx_mbufs[port];
+ uint16_t len = qconf->tx_mbufs[port].len;
+
+ m_table->mbufs[len] = m;
+
+ len++;
+ m_table->len = len;
+
+ /* Enough pkts to be sent. */
+ if (unlikely(len == MAX_PKT_BURST))
+ l2fwd_send_burst(qconf, port);
+
+ return 0;
+}
+
+static void
+l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
+{
+ struct ether_hdr *eth;
+ void *tmp;
+ unsigned dst_port;
+
+ dst_port = l2fwd_dst_ports[portid];
+ eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ /* 02:00:00:00:00:xx */
+ tmp = &eth->d_addr.addr_bytes[0];
+ *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40);
+
+ /* src addr */
+ ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], &eth->s_addr);
+
+ l2fwd_send_packet(m, (uint8_t) dst_port);
+}
+
+static void
+l2fwd_job_update_cb(struct rte_jobstats *job, int64_t result)
+{
+ int64_t err = job->target - result;
+ int64_t hysteresis = job->target / 8;
+
+ if (err < -hysteresis) {
+ if (job->min_period + UPDATE_STEP_DOWN < job->period)
+ job->period -= UPDATE_STEP_DOWN;
+ } else if (err > hysteresis) {
+ if (job->period + UPDATE_STEP_UP < job->max_period)
+ job->period += UPDATE_STEP_UP;
+ }
+}
+
+static void
+l2fwd_fwd_job(__rte_unused struct rte_timer *timer, void *arg)
+{
+ struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+ struct rte_mbuf *m;
+
+ const uint8_t port_idx = (uintptr_t) arg;
+ const unsigned lcore_id = rte_lcore_id();
+ struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+ struct rte_jobstats *job = &qconf->port_fwd_jobs[port_idx];
+ const uint8_t portid = qconf->rx_port_list[port_idx];
+
+ uint8_t j;
+ uint16_t total_nb_rx;
+
+ rte_jobstats_start(&qconf->jobs_context, job);
+
+ /* Call rx burst 2 times. This allows rte_jobstats logic to see if this
+ * function must be called more frequently. */
+
+ total_nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ for (j = 0; j < total_nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+
+ if (total_nb_rx == MAX_PKT_BURST) {
+ const uint16_t nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ total_nb_rx += nb_rx;
+ for (j = 0; j < nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+ }
+
+ port_statistics[portid].rx += total_nb_rx;
+
+ /* Adjust period time in which we are running here. */
+ if (rte_jobstats_finish(job, total_nb_rx) != 0) {
+ rte_timer_reset(&qconf->rx_timers[port_idx], job->period, PERIODICAL,
+ lcore_id, l2fwd_fwd_job, arg);
+ }
+}
+
+static void
+l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused void *arg)
+{
+ uint64_t now;
+ unsigned lcore_id;
+ struct lcore_queue_conf *qconf;
+ struct mbuf_table *m_table;
+ uint8_t portid;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_jobstats_start(&qconf->jobs_context, &qconf->flush_job);
+
+ now = rte_get_timer_cycles();
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ m_table = &qconf->tx_mbufs[portid];
+ if (m_table->len == 0 || m_table->next_flush_time <= now)
+ continue;
+
+ l2fwd_send_burst(qconf, portid);
+ }
+
+
+ /* Pass target to indicate that this job is happy with the time interval
+ * in which it was called. */
+ rte_jobstats_finish(&qconf->flush_job, qconf->flush_job.target);
+}
+
+/* main processing loop */
+static void
+l2fwd_main_loop(void)
+{
+ unsigned lcore_id;
+ unsigned i, portid;
+ struct lcore_queue_conf *qconf;
+ uint8_t stats_read_pending = 0;
+ uint8_t need_manage;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ if (qconf->n_rx_port == 0) {
+ RTE_LOG(INFO, L2FWD, "lcore %u has nothing to do\n", lcore_id);
+ return;
+ }
+
+ RTE_LOG(INFO, L2FWD, "entering main loop on lcore %u\n", lcore_id);
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+
+ portid = qconf->rx_port_list[i];
+ RTE_LOG(INFO, L2FWD, " -- lcoreid=%u portid=%u\n", lcore_id,
+ portid);
+ }
+
+ rte_jobstats_init(&qconf->idle_job, "idle", 0, 0, 0, 0);
+
+ for (;;) {
+ rte_spinlock_lock(&qconf->lock);
+
+ do {
+ rte_jobstats_context_start(&qconf->jobs_context);
+
+ /* Do the Idle job:
+ * - Read stats_read_pending flag
+ * - check if some real job needs to be executed
+ */
+ rte_jobstats_start(&qconf->jobs_context, &qconf->idle_job);
+
+ do {
+ uint8_t i;
+ uint64_t now = rte_get_timer_cycles();
+
+ need_manage = qconf->flush_timer.expire < now;
+ /* Check if we were asked to report stats. */
+ stats_read_pending =
+ rte_atomic16_read(&qconf->stats_read_pending);
+ need_manage |= stats_read_pending;
+
+ for (i = 0; i < qconf->n_rx_port && !need_manage; i++)
+ need_manage = qconf->rx_timers[i].expire < now;
+
+ } while (!need_manage);
+ rte_jobstats_finish(&qconf->idle_job, qconf->idle_job.target);
+
+ rte_timer_manage();
+ rte_jobstats_context_finish(&qconf->jobs_context);
+ } while (likely(stats_read_pending == 0));
+
+ rte_spinlock_unlock(&qconf->lock);
+ rte_pause();
+ }
+}
+
+static int
+l2fwd_launch_one_lcore(__attribute__((unused)) void *dummy)
+{
+ l2fwd_main_loop();
+ return 0;
+}
+
+/* display usage */
+static void
+l2fwd_usage(const char *prgname)
+{
+ printf("%s [EAL options] -- -p PORTMASK [-q NQ]\n"
+ " -p PORTMASK: hexadecimal bitmask of ports to configure\n"
+ " -q NQ: number of queue (=ports) per lcore (default is 1)\n"
+ " -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default, 86400 maximum)\n"
+ " -l set system default locale instead of default (\"C\" locale) for thousands separator in stats.",
+ prgname);
+}
+
+static int
+l2fwd_parse_portmask(const char *portmask)
+{
+ char *end = NULL;
+ unsigned long pm;
+
+ /* parse hexadecimal string */
+ pm = strtoul(portmask, &end, 16);
+ if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+
+ if (pm == 0)
+ return -1;
+
+ return pm;
+}
+
+static unsigned int
+l2fwd_parse_nqueue(const char *q_arg)
+{
+ char *end = NULL;
+ unsigned long n;
+
+ /* parse decimal string */
+ n = strtoul(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return 0;
+ if (n == 0)
+ return 0;
+ if (n >= MAX_RX_QUEUE_PER_LCORE)
+ return 0;
+
+ return n;
+}
+
+static int
+l2fwd_parse_timer_period(const char *q_arg)
+{
+ char *end = NULL;
+ int n;
+
+ /* parse number string */
+ n = strtol(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+ if (n >= MAX_TIMER_PERIOD)
+ return -1;
+
+ return n;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+l2fwd_parse_args(int argc, char **argv)
+{
+ int opt, ret;
+ char **argvopt;
+ int option_index;
+ char *prgname = argv[0];
+ static struct option lgopts[] = {
+ {NULL, 0, 0, 0}
+ };
+
+ argvopt = argv;
+
+ while ((opt = getopt_long(argc, argvopt, "p:q:T:l",
+ lgopts, &option_index)) != EOF) {
+
+ switch (opt) {
+ /* portmask */
+ case 'p':
+ l2fwd_enabled_port_mask = l2fwd_parse_portmask(optarg);
+ if (l2fwd_enabled_port_mask == 0) {
+ printf("invalid portmask\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* nqueue */
+ case 'q':
+ l2fwd_rx_queue_per_lcore = l2fwd_parse_nqueue(optarg);
+ if (l2fwd_rx_queue_per_lcore == 0) {
+ printf("invalid queue number\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* timer period */
+ case 'T':
+ timer_period = l2fwd_parse_timer_period(optarg);
+ if (timer_period < 0) {
+ printf("invalid timer period\n");
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ break;
+
+ /* For thousands separator in printf. */
+ case 'l':
+ setlocale(LC_ALL, "");
+ break;
+
+ /* long options */
+ case 0:
+ l2fwd_usage(prgname);
+ return -1;
+
+ default:
+ l2fwd_usage(prgname);
+ return -1;
+ }
+ }
+
+ if (optind >= 0)
+ argv[optind-1] = prgname;
+
+ ret = optind-1;
+ optind = 0; /* reset getopt lib */
+ return ret;
+}
+
+/* Check the link status of all ports in up to 9s, and print them finally */
+static void
+check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+ uint8_t portid, count, all_ports_up, print_flag = 0;
+ struct rte_eth_link link;
+
+ printf("\nChecking link status");
+ fflush(stdout);
+ for (count = 0; count <= MAX_CHECK_TIME; count++) {
+ all_ports_up = 1;
+ for (portid = 0; portid < port_num; portid++) {
+ if ((port_mask & (1 << portid)) == 0)
+ continue;
+ memset(&link, 0, sizeof(link));
+ rte_eth_link_get_nowait(portid, &link);
+ /* print link status if flag set */
+ if (print_flag == 1) {
+ if (link.link_status)
+ printf("Port %d Link Up - speed %u "
+ "Mbps - %s\n", (uint8_t)portid,
+ (unsigned)link.link_speed,
+ (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+ ("full-duplex") : ("half-duplex\n"));
+ else
+ printf("Port %d Link Down\n",
+ (uint8_t)portid);
+ continue;
+ }
+ /* clear all_ports_up flag if any link down */
+ if (link.link_status == 0) {
+ all_ports_up = 0;
+ break;
+ }
+ }
+ /* after finally printing all link status, get out */
+ if (print_flag == 1)
+ break;
+
+ if (all_ports_up == 0) {
+ printf(".");
+ fflush(stdout);
+ rte_delay_ms(CHECK_INTERVAL);
+ }
+
+ /* set the print_flag if all ports up or timeout */
+ if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+ print_flag = 1;
+ printf("done\n");
+ }
+ }
+}
+
+int
+main(int argc, char **argv)
+{
+ struct lcore_queue_conf *qconf;
+ struct rte_eth_dev_info dev_info;
+ unsigned lcore_id, rx_lcore_id;
+ unsigned nb_ports_in_mask = 0;
+ int ret;
+ char name[RTE_JOBSTATS_NAMESIZE];
+ uint8_t nb_ports;
+ uint8_t nb_ports_available;
+ uint8_t portid, last_port;
+ uint8_t i;
+
+ /* init EAL */
+ ret = rte_eal_init(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+ argc -= ret;
+ argv += ret;
+
+ /* parse application arguments (after the EAL ones) */
+ ret = l2fwd_parse_args(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
+
+ rte_timer_subsystem_init();
+
+ /* fetch default timer frequency. */
+ hz = rte_get_timer_hz();
+
+ /* create the mbuf pool */
+ l2fwd_pktmbuf_pool =
+ rte_mempool_create("mbuf_pool", NB_MBUF,
+ MBUF_SIZE, 32,
+ sizeof(struct rte_pktmbuf_pool_private),
+ rte_pktmbuf_pool_init, NULL,
+ rte_pktmbuf_init, NULL,
+ rte_socket_id(), 0);
+ if (l2fwd_pktmbuf_pool == NULL)
+ rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+ nb_ports = rte_eth_dev_count();
+ if (nb_ports == 0)
+ rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
+
+ if (nb_ports > RTE_MAX_ETHPORTS)
+ nb_ports = RTE_MAX_ETHPORTS;
+
+ /* reset l2fwd_dst_ports */
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
+ l2fwd_dst_ports[portid] = 0;
+ last_port = 0;
+
+ /*
+ * Each logical core is assigned a dedicated TX queue on each port.
+ */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ if (nb_ports_in_mask % 2) {
+ l2fwd_dst_ports[portid] = last_port;
+ l2fwd_dst_ports[last_port] = portid;
+ } else
+ last_port = portid;
+
+ nb_ports_in_mask++;
+
+ rte_eth_dev_info_get(portid, &dev_info);
+ }
+ if (nb_ports_in_mask % 2) {
+ printf("Notice: odd number of ports in portmask.\n");
+ l2fwd_dst_ports[last_port] = last_port;
+ }
+
+ rx_lcore_id = 0;
+ qconf = NULL;
+
+ /* Initialize the port/queue configuration of each logical core */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ /* get the lcore_id for this port */
+ while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
+ lcore_queue_conf[rx_lcore_id].n_rx_port ==
+ l2fwd_rx_queue_per_lcore) {
+ rx_lcore_id++;
+ if (rx_lcore_id >= RTE_MAX_LCORE)
+ rte_exit(EXIT_FAILURE, "Not enough cores\n");
+ }
+
+ if (qconf != &lcore_queue_conf[rx_lcore_id])
+ /* Assigned a new logical core in the loop above. */
+ qconf = &lcore_queue_conf[rx_lcore_id];
+
+ qconf->rx_port_list[qconf->n_rx_port] = portid;
+ qconf->n_rx_port++;
+ printf("Lcore %u: RX port %u\n", rx_lcore_id, (unsigned) portid);
+ }
+
+ nb_ports_available = nb_ports;
+
+ /* Initialise each port */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0) {
+ printf("Skipping disabled port %u\n", (unsigned) portid);
+ nb_ports_available--;
+ continue;
+ }
+ /* init port */
+ printf("Initializing port %u... ", (unsigned) portid);
+ fflush(stdout);
+ ret = rte_eth_dev_configure(portid, 1, 1, &port_conf);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ rte_eth_macaddr_get(portid, &l2fwd_ports_eth_addr[portid]);
+
+ /* init one RX queue */
+ fflush(stdout);
+ ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+ rte_eth_dev_socket_id(portid),
+ NULL,
+ l2fwd_pktmbuf_pool);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* init one TX queue on each port */
+ fflush(stdout);
+ ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+ rte_eth_dev_socket_id(portid),
+ NULL);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ /* Start device */
+ ret = rte_eth_dev_start(portid);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+ printf("done:\n");
+
+ rte_eth_promiscuous_enable(portid);
+
+ printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
+ (unsigned) portid,
+ l2fwd_ports_eth_addr[portid].addr_bytes[0],
+ l2fwd_ports_eth_addr[portid].addr_bytes[1],
+ l2fwd_ports_eth_addr[portid].addr_bytes[2],
+ l2fwd_ports_eth_addr[portid].addr_bytes[3],
+ l2fwd_ports_eth_addr[portid].addr_bytes[4],
+ l2fwd_ports_eth_addr[portid].addr_bytes[5]);
+
+ /* initialize port stats */
+ memset(&port_statistics, 0, sizeof(port_statistics));
+ }
+
+ if (!nb_ports_available) {
+ rte_exit(EXIT_FAILURE,
+ "All available ports are disabled. Please set portmask.\n");
+ }
+
+ check_all_ports_link_status(nb_ports, l2fwd_enabled_port_mask);
+
+ drain_tsc = (hz + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
+
+ RTE_LCORE_FOREACH(lcore_id) {
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_spinlock_init(&qconf->lock);
+
+ if (rte_jobstats_context_init(&qconf->jobs_context) != 0)
+ rte_panic("Jobs stats context for core %u init failed\n", lcore_id);
+
+ if (qconf->n_rx_port == 0) {
+ RTE_LOG(INFO, L2FWD,
+ "lcore %u: no ports so no jobs stats context initialization\n",
+ lcore_id);
+ continue;
+ }
+ /* Add flush job.
+ * Set fixed period by setting min = max = initial period. Set target to
+ * zero as it is irrelevant for this job. */
+ rte_jobstats_init(&qconf->flush_job, "flush", drain_tsc, drain_tsc,
+ drain_tsc, 0);
+
+ rte_timer_init(&qconf->flush_timer);
+ ret = rte_timer_reset(&qconf->flush_timer, drain_tsc, PERIODICAL,
+ lcore_id, &l2fwd_flush_job, NULL);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to add flush job for lcore %u: %s",
+ lcore_id, rte_strerror(-ret));
+ }
+
+ for (i = 0; i < qconf->n_rx_port; i++) {
+ struct rte_jobstats *job = &qconf->port_fwd_jobs[i];
+
+ portid = qconf->rx_port_list[i];
+ printf("Setting forward job for port %u\n", portid);
+
+ snprintf(name, RTE_DIM(name), "port %u fwd", portid);
+ /* Setup forward job.
+ * Set min, max and initial period. Set target to MAX_PKT_BURST as
+ * this is desired optimal RX/TX burst size. */
+ rte_jobstats_init(job, name, 0, drain_tsc, 0, MAX_PKT_BURST);
+ rte_jobstats_set_update_period_function(job, l2fwd_job_update_cb);
+
+ rte_timer_init(&qconf->rx_timers[i]);
+ rte_timer_reset(&qconf->rx_timers[i], 0, PERIODICAL, lcore_id,
+ &l2fwd_fwd_job, (void *)(uintptr_t)i);
+ }
+ }
+
+ if (timer_period)
+ rte_eal_alarm_set(timer_period * MS_PER_S, show_stats_cb, NULL);
+ else
+ RTE_LOG(INFO, L2FWD, "Stats display disabled\n");
+
+ /* launch per-lcore init on every lcore */
+ rte_eal_mp_remote_launch(l2fwd_launch_one_lcore, NULL, CALL_MASTER);
+ RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+ if (rte_eal_wait_lcore(lcore_id) < 0)
+ return -1;
+ }
+
+ return 0;
+}
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 334cb25..cfca8f5 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -103,6 +103,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_HASH),y)
LDLIBS += -lrte_hash
endif
+ifeq ($(CONFIG_RTE_LIBRTE_JOBSTATS),y)
+LDLIBS += -lrte_jobstats
+endif
+
ifeq ($(CONFIG_RTE_LIBRTE_LPM),y)
LDLIBS += -lrte_lpm
endif
--
1.9.1
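For readers following the period adaptation done by l2fwd_job_update_cb() in the patch above, the logic can be sketched in isolation as below. This is only an illustration under stated assumptions: struct job_sketch, update_period() and the STEP_UP/STEP_DOWN values are hypothetical stand-ins. The patch itself uses struct rte_jobstats with UPDATE_STEP_UP and UPDATE_STEP_DOWN, whose definitions are not part of this hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical step sizes, chosen for illustration only. */
#define STEP_UP   1
#define STEP_DOWN 32

/* Hypothetical stand-in for the struct rte_jobstats fields used here. */
struct job_sketch {
	uint64_t period;     /* current polling period, in timer cycles */
	uint64_t min_period; /* lower bound for period */
	uint64_t max_period; /* upper bound for period */
	int64_t target;      /* desired result per run, e.g. packets per poll */
};

/*
 * Adjust the period with a dead band of target/8 around the target:
 * - result well above target: shorten the period (poll more often),
 * - result well below target: lengthen the period (poll less often),
 * - result inside the band: leave the period unchanged.
 */
void
update_period(struct job_sketch *job, int64_t result)
{
	int64_t err = job->target - result;
	int64_t hysteresis = job->target / 8;

	if (err < -hysteresis) {
		if (job->min_period + STEP_DOWN < job->period)
			job->period -= STEP_DOWN;
	} else if (err > hysteresis) {
		if (job->period + STEP_UP < job->max_period)
			job->period += STEP_UP;
	}
}
```

With a target of 32 and a period of 100, a result of 64 shortens the period by STEP_DOWN, a result of 30 falls inside the dead band and changes nothing, and a result of 0 lengthens the period by STEP_UP. The asymmetric steps in this sketch make the period shrink quickly under load and grow back slowly.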
^ permalink raw reply [flat|nested] 48+ messages in thread
* [dpdk-dev] [PATCH v6 3/3] MAINTAINERS: claim responsibility for rte_jobstats library and example app
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats " Pawel Wodkowski
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 1/3] librte_jobstats: New library for checking core/system/app load Pawel Wodkowski
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example Pawel Wodkowski
@ 2015-02-24 16:33 ` Pawel Wodkowski
2015-02-24 20:34 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats library and example application De Lara Guarch, Pablo
3 siblings, 0 replies; 48+ messages in thread
From: Pawel Wodkowski @ 2015-02-24 16:33 UTC (permalink / raw)
To: dev, pablo.de.lara.guarch
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
MAINTAINERS | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index a771fa3..7b3ef00 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -362,6 +362,10 @@ F: app/test/test_timer*
F: examples/timer/
F: doc/guides/sample_app_ug/timer.rst
+Job stats
+M: Pawel Wodkowski <pawelx.wodkowski@intel.com>
+F: lib/librte_jobstats/
+F: examples/l2fwd-jobstats/
Test Applications
-----------------
--
1.9.1
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example Pawel Wodkowski
@ 2015-02-24 19:10 ` De Lara Guarch, Pablo
2015-02-24 19:16 ` De Lara Guarch, Pablo
2015-02-24 21:19 ` Thomas Monjalon
1 sibling, 1 reply; 48+ messages in thread
From: De Lara Guarch, Pablo @ 2015-02-24 19:10 UTC (permalink / raw)
To: Wodkowski, PawelX, dev
> -----Original Message-----
> From: Wodkowski, PawelX
> Sent: Tuesday, February 24, 2015 4:33 PM
> To: dev@dpdk.org; De Lara Guarch, Pablo
> Subject: [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example
>
> This app demonstrates usage of the new rte_jobstats library.
> It is basically the original l2fwd with the following modifications to meet
> library requirements:
> - main_loop() was split into two jobs: forward job and flush job. Logic
> for those jobs is almost the same as in the original application.
> - stats are moved to an rte_alarm callback to avoid the overhead of
> printing.
> - stats are expanded to show rte_jobstats statistics.
> - added new parameter '-l' for automatic thousands separator.
>
> Comparing the original l2fwd and l2fwd-jobstats apps shows what
> is needed to properly write your own application with rte_jobstats
> measurements.
>
> New available statistics:
> - Total and % of fwd and flush execution time
> - management time - overhead of rte_timer + overhead of rte_jobstats
> library
> - Idle time and % of time spent waiting for fwd or flush to be ready to
> execute.
> - per job execution time and period.
>
> Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> ---
> examples/Makefile | 1 +
> examples/l2fwd-jobstats/Makefile | 51 ++
> examples/l2fwd-jobstats/main.c | 1040
> ++++++++++++++++++++++++++++++++++++++
> mk/rte.app.mk | 4 +
> 4 files changed, 1096 insertions(+)
> create mode 100644 examples/l2fwd-jobstats/Makefile
> create mode 100644 examples/l2fwd-jobstats/main.c
>
> diff --git a/examples/Makefile b/examples/Makefile
> index 81f1d2f..e847ded 100644
> --- a/examples/Makefile
> +++ b/examples/Makefile
> @@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) +=
> ip_fragmentation
> DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
> DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
> DIRS-y += l2fwd
> +DIRS-y += l2fwd-jobstats
> DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
> DIRS-y += l3fwd
> DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
> diff --git a/examples/l2fwd-jobstats/Makefile b/examples/l2fwd-
> jobstats/Makefile
> new file mode 100644
> index 0000000..d57a0ae
> --- /dev/null
> +++ b/examples/l2fwd-jobstats/Makefile
> @@ -0,0 +1,51 @@
> +# BSD LICENSE
> +#
> +# Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
Fix these dates. Plus, there was a conflict due to a recent commit,
modifying examples/Makefile, so make sure you rebase ;)
Thanks,
Pablo
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example
2015-02-24 19:10 ` De Lara Guarch, Pablo
@ 2015-02-24 19:16 ` De Lara Guarch, Pablo
2015-02-24 20:08 ` Thomas Monjalon
0 siblings, 1 reply; 48+ messages in thread
From: De Lara Guarch, Pablo @ 2015-02-24 19:16 UTC (permalink / raw)
To: De Lara Guarch, Pablo, Wodkowski, PawelX, dev
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of De Lara Guarch,
> Pablo
> Sent: Tuesday, February 24, 2015 7:11 PM
> To: Wodkowski, PawelX; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-
> jobstats example
>
>
>
> > -----Original Message-----
> > From: Wodkowski, PawelX
> > Sent: Tuesday, February 24, 2015 4:33 PM
> > To: dev@dpdk.org; De Lara Guarch, Pablo
> > Subject: [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example
> >
> > This app demonstrates usage of the new rte_jobstats library.
> > It is basically the original l2fwd with the following modifications to meet
> > library requirements:
> > - main_loop() was split into two jobs: forward job and flush job. Logic
> > for those jobs is almost the same as in the original application.
> > - stats are moved to an rte_alarm callback to avoid the overhead of
> > printing.
> > - stats are expanded to show rte_jobstats statistics.
> > - added new parameter '-l' for automatic thousands separator.
> >
> > Comparing the original l2fwd and l2fwd-jobstats apps shows what
> > is needed to properly write your own application with rte_jobstats
> > measurements.
> >
> > New available statistics:
> > - Total and % of fwd and flush execution time
> > - management time - overhead of rte_timer + overhead of rte_jobstats
> > library
> > - Idle time and % of time spent waiting for fwd or flush to be ready to
> > execute.
> > - per job execution time and period.
> >
> > Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> > ---
> > examples/Makefile | 1 +
> > examples/l2fwd-jobstats/Makefile | 51 ++
> > examples/l2fwd-jobstats/main.c | 1040
> > ++++++++++++++++++++++++++++++++++++++
> > mk/rte.app.mk | 4 +
> > 4 files changed, 1096 insertions(+)
> > create mode 100644 examples/l2fwd-jobstats/Makefile
> > create mode 100644 examples/l2fwd-jobstats/main.c
> >
> > diff --git a/examples/Makefile b/examples/Makefile
> > index 81f1d2f..e847ded 100644
> > --- a/examples/Makefile
> > +++ b/examples/Makefile
> > @@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) +=
> > ip_fragmentation
> > DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
> > DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
> > DIRS-y += l2fwd
> > +DIRS-y += l2fwd-jobstats
> > DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
> > DIRS-y += l3fwd
> > DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
> > diff --git a/examples/l2fwd-jobstats/Makefile b/examples/l2fwd-
> > jobstats/Makefile
> > new file mode 100644
> > index 0000000..d57a0ae
> > --- /dev/null
> > +++ b/examples/l2fwd-jobstats/Makefile
> > @@ -0,0 +1,51 @@
> > +# BSD LICENSE
> > +#
> > +# Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
>
> Fix these dates. Plus, there was a conflict due to a recent commit,
> modifying examples/Makefile, so make sure you rebase ;)
Well, actually, I am in doubt. This is a modified version of an existing app.
In that case, should the copyright dates include the dates of that app, or just the year in which this modified app was created?
>
> Thanks,
> Pablo
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example
2015-02-24 19:16 ` De Lara Guarch, Pablo
@ 2015-02-24 20:08 ` Thomas Monjalon
0 siblings, 0 replies; 48+ messages in thread
From: Thomas Monjalon @ 2015-02-24 20:08 UTC (permalink / raw)
To: De Lara Guarch, Pablo; +Cc: dev
2015-02-24 19:16, De Lara Guarch, Pablo:
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of De Lara Guarch,
> > Pablo
> > Sent: Tuesday, February 24, 2015 7:11 PM
> > To: Wodkowski, PawelX; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-
> > jobstats example
> >
> >
> >
> > > -----Original Message-----
> > > From: Wodkowski, PawelX
> > > Sent: Tuesday, February 24, 2015 4:33 PM
> > > To: dev@dpdk.org; De Lara Guarch, Pablo
> > > Subject: [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example
> > >
> > > This app demonstrates usage of the new rte_jobstats library.
> > > It is basically the original l2fwd with the following modifications to meet
> > > library requirements:
> > > - main_loop() was split into two jobs: forward job and flush job. Logic
> > > for those jobs is almost the same as in the original application.
> > > - stats are moved to an rte_alarm callback to avoid the overhead of
> > > printing.
> > > - stats are expanded to show rte_jobstats statistics.
> > > - added new parameter '-l' for automatic thousands separator.
> > >
> > > Comparing the original l2fwd and l2fwd-jobstats apps shows what
> > > is needed to properly write your own application with rte_jobstats
> > > measurements.
> > >
> > > New available statistics:
> > > - Total and % of fwd and flush execution time
> > > - management time - overhead of rte_timer + overhead of rte_jobstats
> > > library
> > > - Idle time and % of time spent waiting for fwd or flush to be ready to
> > > execute.
> > > - per job execution time and period.
> > >
> > > Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> > > ---
> > > examples/Makefile | 1 +
> > > examples/l2fwd-jobstats/Makefile | 51 ++
> > > examples/l2fwd-jobstats/main.c | 1040
> > > ++++++++++++++++++++++++++++++++++++++
> > > mk/rte.app.mk | 4 +
> > > 4 files changed, 1096 insertions(+)
> > > create mode 100644 examples/l2fwd-jobstats/Makefile
> > > create mode 100644 examples/l2fwd-jobstats/main.c
> > >
> > > diff --git a/examples/Makefile b/examples/Makefile
> > > index 81f1d2f..e847ded 100644
> > > --- a/examples/Makefile
> > > +++ b/examples/Makefile
> > > @@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) +=
> > > ip_fragmentation
> > > DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
> > > DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
> > > DIRS-y += l2fwd
> > > +DIRS-y += l2fwd-jobstats
> > > DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
> > > DIRS-y += l3fwd
> > > DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
> > > diff --git a/examples/l2fwd-jobstats/Makefile b/examples/l2fwd-
> > > jobstats/Makefile
> > > new file mode 100644
> > > index 0000000..d57a0ae
> > > --- /dev/null
> > > +++ b/examples/l2fwd-jobstats/Makefile
> > > @@ -0,0 +1,51 @@
> > > +# BSD LICENSE
> > > +#
> > > +# Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> >
> > Fix these dates. Plus, there was a conflict due to a recent commit,
> > modifying examples/Makefile, so make sure you rebase ;)
>
> Well, actually, I am in doubt. This is a modified version of an existing app.
> In that case, should the copyright dates include the dates of that app, or
> just the year in which this modified app was created?
I think you should keep the dates of the original file.
It would be interesting to have a lawyer's opinion.
If there is no other problem, I'm going to apply this patchset.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v6 0/3] new rte_jobstats library and example application
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats " Pawel Wodkowski
` (2 preceding siblings ...)
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 3/3] MAINTAINERS: claim responsibility for rte_jobstats library and example app Pawel Wodkowski
@ 2015-02-24 20:34 ` De Lara Guarch, Pablo
2015-02-24 21:25 ` Thomas Monjalon
3 siblings, 1 reply; 48+ messages in thread
From: De Lara Guarch, Pablo @ 2015-02-24 20:34 UTC (permalink / raw)
To: Wodkowski, PawelX, dev
> -----Original Message-----
> From: Wodkowski, PawelX
> Sent: Tuesday, February 24, 2015 4:33 PM
> To: dev@dpdk.org; De Lara Guarch, Pablo
> Subject: [PATCH v6 0/3] new rte_jobstats library and example application
>
> Hi community,
> I would like to introduce a library for measuring the load of arbitrary jobs
> and for helping to find the optimal poll time in poll mode applications. It can
> be used to measure and drive any kind of job set on any execution unit or
> tasking library.
>
> In the provided l2fwd-jobstats example I demonstrate how to use this library to
> select the optimal rx burst poll time and find out the idle time. Jobs are
> selected by using existing rte_timer library calls. This example does not limit
> the possible schemes in which this library can be used.
>
> PATCH v6 changes:
> - rename library name to rte_jobstats.
> - clean unused includes and dependencies in library.
> - change/fix API documentation.
> - reword cover letter.
>
> PATCH v5 changes:
> - Fix spelling and checkpatch.pl errors.
> - Add maintainer claim for library and example app.
>
> PATCH v4 changes:
> - use proper branch for generating patch.
>
> PATCH v3 changes:
> - Fix spelling.
>
> PATCH v2 changes:
> - Remove jobs management/callback from library to not duplicate tasking
> library behaviour.
> - Cleanup/remove useless statistics.
> - Rework example application to use rte_timer library for jobs selection.
> - Introduce new app parameter '-l' for automatic thousands separating in
> stats.
> - More readable statistics format.
>
> Pawel Wodkowski (3):
> librte_jobstats: New library for checking core/system/app load
> examples: introduce new l2fwd-jobstats example
> MAINTAINERS: claim responsibility for rte_jobstats library and example
> app
>
> MAINTAINERS | 4 +
> config/common_bsdapp | 5 +
> config/common_linuxapp | 5 +
> doc/api/doxy-api.conf | 1 +
> examples/Makefile | 1 +
> examples/l2fwd-jobstats/Makefile | 51 ++
> examples/l2fwd-jobstats/main.c | 1040
> ++++++++++++++++++++++++++
> lib/Makefile | 1 +
> lib/librte_jobstats/Makefile | 53 ++
> lib/librte_jobstats/rte_jobstats.c | 273 +++++++
> lib/librte_jobstats/rte_jobstats.h | 322 ++++++++
> lib/librte_jobstats/rte_jobstats_version.map | 19 +
> mk/rte.app.mk | 4 +
> 13 files changed, 1779 insertions(+)
> create mode 100644 examples/l2fwd-jobstats/Makefile
> create mode 100644 examples/l2fwd-jobstats/main.c
> create mode 100644 lib/librte_jobstats/Makefile
> create mode 100644 lib/librte_jobstats/rte_jobstats.c
> create mode 100644 lib/librte_jobstats/rte_jobstats.h
> create mode 100644 lib/librte_jobstats/rte_jobstats_version.map
>
> --
> 1.9.1
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v6 1/3] librte_jobstats: New library for checking core/system/app load
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 1/3] librte_jobstats: New library for checking core/system/app load Pawel Wodkowski
@ 2015-02-24 21:18 ` Thomas Monjalon
0 siblings, 0 replies; 48+ messages in thread
From: Thomas Monjalon @ 2015-02-24 21:18 UTC (permalink / raw)
To: dev, Pawel Wodkowski
2015-02-24 17:33, Pawel Wodkowski:
> --- /dev/null
> +++ b/lib/librte_jobstats/rte_jobstats_version.map
> @@ -0,0 +1,19 @@
> +DPDK_2.0 {
> + global:
> +
> + rte_jobstats_context_init;
> + rte_jobstats_context_start;
> + rte_jobstats_context_finish;
> + rte_jobstats_context_reset;
> + rte_jobstats_init;
> + rte_jobstats_set_target;
> + rte_jobstats_start;
> + rte_jobstats_finish;
> + rte_jobstats_set_period;
> + rte_jobstats_set_min;
> + rte_jobstats_set_max;
> + rte_jobstats_set_update_period_function;
> + rte_jobstats_reset;
> +
> + local: *;
> +};
This list should be alphabetically ordered.
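For illustration, the export list quoted above would read as follows once sorted alphabetically. This is only a sketch of the requested ordering using the symbol names from the patch, not the content of the commit that was eventually applied:

```
DPDK_2.0 {
	global:

	rte_jobstats_context_finish;
	rte_jobstats_context_init;
	rte_jobstats_context_reset;
	rte_jobstats_context_start;
	rte_jobstats_finish;
	rte_jobstats_init;
	rte_jobstats_reset;
	rte_jobstats_set_max;
	rte_jobstats_set_min;
	rte_jobstats_set_period;
	rte_jobstats_set_target;
	rte_jobstats_set_update_period_function;
	rte_jobstats_start;

	local: *;
};
```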
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example Pawel Wodkowski
2015-02-24 19:10 ` De Lara Guarch, Pablo
@ 2015-02-24 21:19 ` Thomas Monjalon
1 sibling, 0 replies; 48+ messages in thread
From: Thomas Monjalon @ 2015-02-24 21:19 UTC (permalink / raw)
To: Pawel Wodkowski; +Cc: dev
2015-02-24 17:33, Pawel Wodkowski:
> --- a/examples/Makefile
> +++ b/examples/Makefile
> @@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ip_fragmentation
> DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
> DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
> DIRS-y += l2fwd
> +DIRS-y += l2fwd-jobstats
You should use $(CONFIG_RTE_LIBRTE_JOBSTATS) instead of "y".
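A minimal sketch of the suggested change, assuming the build option is named CONFIG_RTE_LIBRTE_JOBSTATS as in the comment above, so that the example is only built when the library itself is enabled:

```
DIRS-y += l2fwd
DIRS-$(CONFIG_RTE_LIBRTE_JOBSTATS) += l2fwd-jobstats
```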
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [dpdk-dev] [PATCH v6 0/3] new rte_jobstats library and example application
2015-02-24 20:34 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats library and example application De Lara Guarch, Pablo
@ 2015-02-24 21:25 ` Thomas Monjalon
0 siblings, 0 replies; 48+ messages in thread
From: Thomas Monjalon @ 2015-02-24 21:25 UTC (permalink / raw)
To: Wodkowski, PawelX; +Cc: dev
> > Hi community,
> > I would like to introduce library for measuring load of some arbitrary jobs and
> > help finding optimal poll time in poll mode applications. It can be used to
> > measure and drive every kind of job sets on any arbitrary execution unit or
> > tasking library.
> >
> > In provided l2fwd-jobstats example I demonstrate how to use this library to
> > select optimal rx burst poll time and find out idle time. Jobs are selected by
> > using existing rte_timer library calls. This example does not limit the
> > possible schemes in which this library can be used.
> >
> > PATCH v6 changes:
> > - rename library name to rte_jobstats.
> > - clean unused includes and dependencies in library.
> > - change/fix API documentation.
> > - reword cover letter.
> >
> > PATCH v5 changes:
> > - Fix spelling and checkpatch.pl errors.
> > - Add maintainer claim for library and example app.
> >
> > PATCH v4 changes:
> > - use proper branch for generating patch.
> >
> > PATCH v3 changes:
> > - Fix spelling.
> >
> > PATCH v2 changes:
> > - Remove jobs management/callback from library to not duplicate tasking
> > library behaviour.
> > - Clean up/remove useless statistics.
> > - Rework example application to use rte_timer library for jobs selection.
> > - Introduce new app parameter '-l' for automatic thousands separating in
> > stats.
> > - More readable statistics format.
> >
> > Pawel Wodkowski (3):
> > librte_jobstats: New library for checking core/system/app load
> > examples: introduce new l2fwd-jobstats example
> > MAINTAINERS: claim responsibility for rte_jobstats library and example app
>
> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Applied with small fixes, thanks
^ permalink raw reply [flat|nested] 48+ messages in thread
end of thread, other threads:[~2015-02-24 21:26 UTC | newest]
Thread overview: 48+ messages
2015-01-29 11:50 [dpdk-dev] [PATCH 0/2] new headroom stats library and example application Pawel Wodkowski
2015-01-29 11:50 ` [dpdk-dev] [PATCH 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
2015-01-29 11:50 ` [dpdk-dev] [PATCH 2/2] examples: introduce new l2fwd-headroom example Pawel Wodkowski
2015-01-29 13:25 ` [dpdk-dev] [PATCH 0/2] new headroom stats library and example application Neil Horman
2015-01-29 17:10 ` Wodkowski, PawelX
2015-01-29 19:13 ` Neil Horman
2015-01-30 10:47 ` Wodkowski, PawelX
2015-01-30 18:02 ` Neil Horman
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 " Pawel Wodkowski
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
2015-02-17 15:37 ` [dpdk-dev] [PATCH v2 2/2] examples: introduce new l2fwd-headroom example Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 1/2] pmd: enable DCB in SRIOV Pawel Wodkowski
2015-02-17 16:19 ` [dpdk-dev] [PATCH v3 2/2] tespmd: fix DCB in SRIOV mode support Pawel Wodkowski
2015-02-17 16:33 ` [dpdk-dev] [PATCH v3 0/2] new headroom stats library and example application Wodkowski, PawelX
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 " Pawel Wodkowski
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 1/2] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
2015-02-18 13:36 ` De Lara Guarch, Pablo
2015-02-17 16:42 ` [dpdk-dev] [PATCH v4 2/2] examples: introduce new l2fwd-headroom example Pawel Wodkowski
2015-02-18 13:41 ` De Lara Guarch, Pablo
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Pawel Wodkowski
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 1/3] librte_headroom: New library for checking core/system/app load Pawel Wodkowski
2015-02-24 1:55 ` Thomas Monjalon
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 2/3] examples: introduce new l2fwd-headroom example Pawel Wodkowski
2015-02-19 12:18 ` [dpdk-dev] [PATCH v5 3/3] MAINTAINERS: claim responsibility for headroom library and example app Pawel Wodkowski
2015-02-19 14:33 ` [dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application Neil Horman
2015-02-20 15:46 ` Jastrzebski, MichalX K
2015-02-23 11:45 ` Thomas Monjalon
2015-02-23 14:36 ` Jastrzebski, MichalX K
2015-02-23 14:46 ` Thomas Monjalon
2015-02-23 15:55 ` Jastrzebski, MichalX K
2015-02-23 16:04 ` Thomas Monjalon
2015-02-24 8:44 ` Pawel Wodkowski
2015-02-24 9:49 ` Jastrzebski, MichalX K
2015-02-24 10:00 ` Thomas Monjalon
2015-02-24 10:05 ` Wodkowski, PawelX
2015-02-24 10:53 ` Wodkowski, PawelX
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats " Pawel Wodkowski
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 1/3] librte_jobstats: New library for checking core/system/app load Pawel Wodkowski
2015-02-24 21:18 ` Thomas Monjalon
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 2/3] examples: introduce new l2fwd-jobstats example Pawel Wodkowski
2015-02-24 19:10 ` De Lara Guarch, Pablo
2015-02-24 19:16 ` De Lara Guarch, Pablo
2015-02-24 20:08 ` Thomas Monjalon
2015-02-24 21:19 ` Thomas Monjalon
2015-02-24 16:33 ` [dpdk-dev] [PATCH v6 3/3] MAINTAINERS: claim responsibility for rte_jobstats library and example app Pawel Wodkowski
2015-02-24 20:34 ` [dpdk-dev] [PATCH v6 0/3] new rte_jobstats library and example application De Lara Guarch, Pablo
2015-02-24 21:25 ` Thomas Monjalon