From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pawelx.wodkowski@intel.com>
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by dpdk.org (Postfix) with ESMTP id 17DFF58CB
 for <dev@dpdk.org>; Thu, 29 Jan 2015 13:11:16 +0100 (CET)
Received: from orsmga002.jf.intel.com ([10.7.209.21])
 by fmsmga102.fm.intel.com with ESMTP; 29 Jan 2015 04:11:15 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.09,485,1418112000"; d="scan'208";a="677948802"
Received: from unknown (HELO Sent) ([10.217.248.233])
 by orsmga002.jf.intel.com with SMTP; 29 Jan 2015 04:11:13 -0800
Received: by Sent (sSMTP sendmail emulation); Thu, 29 Jan 2015 13:05:27 +0100
From: Pawel Wodkowski <pawelx.wodkowski@intel.com>
To: dev@dpdk.org
Date: Thu, 29 Jan 2015 12:50:06 +0100
Message-Id: <1422532206-10662-3-git-send-email-pawelx.wodkowski@intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1422532206-10662-1-git-send-email-pawelx.wodkowski@intel.com>
References: <1422532206-10662-1-git-send-email-pawelx.wodkowski@intel.com>
Subject: [dpdk-dev] [PATCH 2/2] examples: introduce new l2fwd-headroom
	example
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 29 Jan 2015 12:11:18 -0000

This app demonstrates usage of the new headroom library.
It is basically the original l2fwd with the following modifications to
meet headroom library requirements:
- main_loop() was split into two jobs: a forward job and a flush job. The
logic for those jobs is almost the same as in the original application.
- stats are moved to their own job.
- If there are more lcores available than queues/ports, the stats job is
run on the first free core; otherwise it is run on the master core.
- stats are expanded to show headroom statistics.

Comparing the original l2fwd and l2fwd-headroom apps shows the approach
needed to properly write your own application with headroom measurements.

Please note that assigning a separate core for printing stats is
preferred because flushing stdout is terribly slow and might impact
headroom statistics.

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
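Note for readers new to the idea: the split described in the commit
message (a forward job, a flush job and a stats job, each run at its own
period) can be pictured with the minimal, DPDK-free sketch below. It is
only an illustration under stated assumptions. All names in it
(struct job, job_list_add(), job_list_tick(), job_list_run(), count_job())
are hypothetical and are not the rte_headroom API, time is a simulated
tick counter rather than rte_get_timer_cycles(), and the idle tick count
merely stands in for what the library reports as headroom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is NOT the rte_headroom API. */
typedef void (*job_fn)(void *ctx);

struct job {
	const char *name;
	job_fn fn;
	void *ctx;
	uint64_t period;   /* ticks between two runs */
	uint64_t next_run; /* absolute tick of the next run */
	uint64_t exec_cnt; /* how many times the job has run */
};

#define MAX_JOBS 8

struct job_list {
	struct job jobs[MAX_JOBS];
	size_t count;
	uint64_t idle_ticks; /* ticks in which no job was due */
};

static void
job_list_init(struct job_list *jl)
{
	jl->count = 0;
	jl->idle_ticks = 0;
}

/* Register a periodic job; the first run is due one period after tick 0. */
static int
job_list_add(struct job_list *jl, const char *name, job_fn fn, void *ctx,
		uint64_t period)
{
	struct job *j;

	assert(jl != NULL && fn != NULL);
	if (jl->count >= MAX_JOBS || period == 0)
		return -1;

	j = &jl->jobs[jl->count++];
	j->name = name;
	j->fn = fn;
	j->ctx = ctx;
	j->period = period;
	j->next_run = period;
	j->exec_cnt = 0;
	return 0;
}

/* Run every job that is due at tick `now`; otherwise count an idle tick. */
static void
job_list_tick(struct job_list *jl, uint64_t now)
{
	size_t i;
	int ran = 0;

	for (i = 0; i < jl->count; i++) {
		struct job *j = &jl->jobs[i];

		if (now < j->next_run)
			continue; /* not due yet */
		j->fn(j->ctx);
		j->exec_cnt++;
		j->next_run = now + j->period;
		ran = 1;
	}
	if (!ran)
		jl->idle_ticks++;
}

/* Drive the list with a simulated clock for ticks 1..ticks. */
static void
job_list_run(struct job_list *jl, uint64_t ticks)
{
	uint64_t now;

	for (now = 1; now <= ticks; now++)
		job_list_tick(jl, now);
}

/* Stand-in job body: just counts its own invocations. */
static void
count_job(void *ctx)
{
	(*(unsigned *)ctx)++;
}
```

Registering count_job() three times with periods of 1, 10 and 100 ticks
and calling job_list_run(&jl, 100) runs the three jobs 100, 10 and 1
times respectively. The real example differs in that the forward job
period is adapted between a minimum and a maximum by the library instead
of being fixed.
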
 examples/Makefile                |    1 +
 examples/l2fwd-headroom/Makefile |   51 +++
 examples/l2fwd-headroom/main.c   |  875 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 927 insertions(+)
 create mode 100644 examples/l2fwd-headroom/Makefile
 create mode 100644 examples/l2fwd-headroom/main.c

diff --git a/examples/Makefile b/examples/Makefile
index 81f1d2f..8a459b7 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ip_fragmentation
 DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
 DIRS-y += l2fwd
+DIRS-y += l2fwd-headroom
 DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
 DIRS-y += l3fwd
 DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
diff --git a/examples/l2fwd-headroom/Makefile b/examples/l2fwd-headroom/Makefile
new file mode 100644
index 0000000..07da286
--- /dev/null
+++ b/examples/l2fwd-headroom/Makefile
@@ -0,0 +1,51 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l2fwd-headroom
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l2fwd-headroom/main.c b/examples/l2fwd-headroom/main.c
new file mode 100644
index 0000000..4a6c392
--- /dev/null
+++ b/examples/l2fwd-headroom/main.c
@@ -0,0 +1,875 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include <inttypes.h> /* PRIu64; also provides <stdint.h> */
+#include <getopt.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+#include <rte_memzone.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_launch.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_prefetch.h>
+#include <rte_lcore.h>
+#include <rte_per_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_interrupts.h>
+#include <rte_pci.h>
+#include <rte_debug.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+#include <rte_ring.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_errno.h>
+#include <rte_headroom.h>
+
+#define RTE_LOGTYPE_L2FWD RTE_LOGTYPE_USER1
+
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NB_MBUF   8192
+
+#define MAX_PKT_BURST 32
+#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */
+
+/*
+ * Configurable number of RX/TX ring descriptors
+ */
+#define RTE_TEST_RX_DESC_DEFAULT 128
+#define RTE_TEST_TX_DESC_DEFAULT 512
+static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
+static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
+
+/* ethernet addresses of ports */
+static struct ether_addr l2fwd_ports_eth_addr[RTE_MAX_ETHPORTS];
+
+/* mask of enabled ports */
+static uint32_t l2fwd_enabled_port_mask;
+
+/* list of enabled ports */
+static uint32_t l2fwd_dst_ports[RTE_MAX_ETHPORTS];
+
+static unsigned int l2fwd_rx_queue_per_lcore = 1;
+
+struct mbuf_table {
+	uint64_t next_flush_time;
+	unsigned len;
+	struct rte_mbuf *mbufs[MAX_PKT_BURST];
+};
+
+#define MAX_RX_QUEUE_PER_LCORE 16
+#define MAX_TX_QUEUE_PER_PORT 16
+struct lcore_queue_conf {
+	unsigned n_rx_port;
+	unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
+	struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+
+	struct rte_headroom headroom;
+
+} __rte_cache_aligned;
+struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];
+
+static const struct rte_eth_conf port_conf = {
+	.rxmode = {
+		.split_hdr_size = 0,
+		.header_split   = 0, /**< Header Split disabled */
+		.hw_ip_checksum = 0, /**< IP checksum offload disabled */
+		.hw_vlan_filter = 0, /**< VLAN filtering disabled */
+		.jumbo_frame    = 0, /**< Jumbo Frame Support disabled */
+		.hw_strip_crc   = 0, /**< CRC stripping by hardware disabled */
+	},
+	.txmode = {
+		.mq_mode = ETH_MQ_TX_NONE,
+	},
+};
+
+struct rte_mempool *l2fwd_pktmbuf_pool = NULL;
+
+/* Per-port statistics struct */
+struct l2fwd_port_statistics {
+	uint64_t tx;
+	uint64_t rx;
+	uint64_t dropped;
+} __rte_cache_aligned;
+struct l2fwd_port_statistics port_statistics[RTE_MAX_ETHPORTS];
+
+/* 1 day max */
+#define MAX_TIMER_PERIOD 86400
+/* default period is 10 seconds */
+static int64_t timer_period = 10;
+/* default timer frequency */
+static uint64_t hz;
+/* BURST_TX_DRAIN_US converted to cycles */
+uint64_t drain_tsc;
+/* Convert cycles to ns */
+static inline uint64_t
+cycles_to_ns(uint64_t cycles)
+{
+	double t = cycles;
+	t *= NS_PER_S;
+	t /= hz;
+	return t;
+}
+
+/* Print out statistics on packets dropped */
+static int64_t
+print_stats_job(struct rte_headroom_job *this_job)
+{
+	struct rte_headroom *hdr;
+	struct rte_headroom_job *job;
+	struct rte_headroom_stats stats;
+	uint64_t total_packets_dropped, total_packets_tx, total_packets_rx;
+	uint64_t stats_start = rte_get_timer_cycles();
+	unsigned portid, lcore_id;
+	uint32_t job_idx;
+
+	total_packets_dropped = 0;
+	total_packets_tx = 0;
+	total_packets_rx = 0;
+
+	const char clr[] = { 27, '[', '2', 'J', '\0' };
+	const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
+
+	/* Clear screen and move to top left */
+	printf("%s%s"
+			"\nPort statistics ====================================",
+			clr, topLeft);
+
+	for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+		/* skip disabled ports */
+		if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+			continue;
+		printf("\nStatistics for port %u ------------------------------"
+				"\nPackets sent: %24"PRIu64
+				"\nPackets received: %20"PRIu64
+				"\nPackets dropped: %21"PRIu64,
+				portid,
+				port_statistics[portid].tx,
+				port_statistics[portid].rx,
+				port_statistics[portid].dropped);
+
+		total_packets_dropped += port_statistics[portid].dropped;
+		total_packets_tx += port_statistics[portid].tx;
+		total_packets_rx += port_statistics[portid].rx;
+	}
+
+	printf("\nAggregate statistics ==============================="
+			"\nTotal packets sent: %18"PRIu64
+			"\nTotal packets received: %14"PRIu64
+			"\nTotal packets dropped: %15"PRIu64
+			"\n====================================================\n",
+			total_packets_tx,
+			total_packets_rx,
+			total_packets_dropped);
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		if (lcore_queue_conf[lcore_id].n_rx_port == 0)
+			continue;
+
+		hdr = &lcore_queue_conf[lcore_id].headroom;
+		rte_headroom_get_stats(hdr, &stats);
+
+		printf("\nLCore %u: headroom statistics (time in ns) ========"
+				"\nLoop count: %26"PRIu64
+				"\nTotal headroom: %22"PRIu64
+				"\nHeadroom per loop: %19"PRIu64
+				"\nHeadroom min: %24"PRIu64
+				"\nHeadroom max: %24"PRIu64
+				"\nLoop min time: %23"PRIu64
+				"\nLoop max time: %23"PRIu64,
+				lcore_id,
+				stats.exec_cnt,
+				cycles_to_ns(stats.idle),
+				cycles_to_ns(stats.exec_cnt  ? stats.idle / stats.exec_cnt : 0),
+				cycles_to_ns(stats.idle_min),
+				cycles_to_ns(stats.idle_max),
+				cycles_to_ns(stats.run_time_min),
+				cycles_to_ns(stats.run_time_max));
+
+		for (job_idx = 0; job_idx < hdr->job_count; job_idx++) {
+			job = &hdr->jobs[job_idx];
+			rte_headroom_get_job_stats(job, &stats);
+			rte_headroom_reset_job_stats(job);
+
+			printf("\nJob %" PRIu32 ":%20s -----------------------"
+					"\nExec count: %26"PRIu64
+					"\nExec period: %25"PRIu64
+					"\nTotal headroom: %22"PRIu64
+					"\nHeadroom per exec: %19"PRIu64
+					"\nHeadroom min: %24"PRIu64
+					"\nHeadroom max: %24"PRIu64
+					"\nExec min time: %23"PRIu64
+					"\nExec max time: %23"PRIu64,
+					job_idx, job->name,
+					stats.exec_cnt,
+					cycles_to_ns(job->period),
+					cycles_to_ns(stats.idle),
+					cycles_to_ns(stats.exec_cnt ? stats.idle / stats.exec_cnt : 0),
+					cycles_to_ns(stats.idle_min),
+					cycles_to_ns(stats.idle_max),
+					cycles_to_ns(stats.run_time_min),
+					cycles_to_ns(stats.run_time_max));
+		}
+
+		rte_headroom_reset_stats(hdr);
+	}
+
+	printf("\n==== Stats gen time %19"PRIu64" =========	\n",
+			cycles_to_ns(rte_get_timer_cycles() - stats_start));
+
+	/* Return setpoint to indicate that this job is happy with the time
+	 * interval in which it was called. */
+	return this_job->job_target;
+}
+
+/* Send the burst of packets on an output interface */
+static void
+l2fwd_send_burst(struct lcore_queue_conf *qconf, uint8_t port)
+{
+	struct mbuf_table *m_table;
+	uint16_t ret;
+	uint16_t queueid = 0;
+	uint16_t n;
+
+	m_table = &qconf->tx_mbufs[port];
+	n = m_table->len;
+
+	m_table->next_flush_time = rte_get_timer_cycles() + drain_tsc;
+	m_table->len = 0;
+
+	ret = rte_eth_tx_burst(port, queueid, m_table->mbufs, n);
+
+	port_statistics[port].tx += ret;
+	if (unlikely(ret < n)) {
+		port_statistics[port].dropped += (n - ret);
+		do {
+			rte_pktmbuf_free(m_table->mbufs[ret]);
+		} while (++ret < n);
+	}
+}
+
+/* Enqueue packets for TX and prepare them to be sent */
+static int
+l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
+{
+	const unsigned lcore_id = rte_lcore_id();
+	struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+	struct mbuf_table *m_table = &qconf->tx_mbufs[port];
+	uint16_t len = qconf->tx_mbufs[port].len;
+
+	m_table->mbufs[len] = m;
+
+	len++;
+	m_table->len = len;
+
+	/* Enough pkts to be sent. */
+	if (unlikely(len == MAX_PKT_BURST))
+		l2fwd_send_burst(qconf, port);
+
+	return 0;
+}
+
+static void
+l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
+{
+	struct ether_hdr *eth;
+	void *tmp;
+	unsigned dst_port;
+
+	dst_port = l2fwd_dst_ports[portid];
+	eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+	/* 02:00:00:00:00:xx */
+	tmp = &eth->d_addr.addr_bytes[0];
+	*((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40);
+
+	/* src addr */
+	ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], &eth->s_addr);
+
+	l2fwd_send_packet(m, (uint8_t) dst_port);
+}
+
+static int64_t
+l2fwd_fwd_job(struct rte_headroom_job *job)
+{
+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *m;
+
+	unsigned lcore_id = rte_lcore_id();
+	struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+	const uint8_t port_idx = (uint8_t) (uintptr_t) job->job_data;
+	const uint8_t portid = qconf->rx_port_list[port_idx];
+	uint8_t j;
+	uint16_t nb_rx, total_nb_rx;
+
+	/* Call rx burst 2 times. This allows headroom logic to see if this
+	 * function must be called more frequently. */
+
+	nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst, MAX_PKT_BURST);
+
+	total_nb_rx = nb_rx;
+	port_statistics[portid].rx += nb_rx;
+
+	for (j = 0; j < nb_rx; j++) {
+		m = pkts_burst[j];
+		rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+		l2fwd_simple_forward(m, portid);
+	}
+
+	if (nb_rx < MAX_PKT_BURST)
+		return total_nb_rx;
+
+	nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst, MAX_PKT_BURST);
+
+	total_nb_rx += nb_rx;
+	port_statistics[portid].rx += nb_rx;
+
+	for (j = 0; j < nb_rx; j++) {
+		m = pkts_burst[j];
+		rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+		l2fwd_simple_forward(m, portid);
+	}
+
+	return total_nb_rx;
+}
+
+static int64_t
+l2fwd_flush_job(struct rte_headroom_job *job)
+{
+	const uint64_t now = rte_get_timer_cycles();
+	const unsigned lcore_id = rte_lcore_id();
+	struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
+	struct mbuf_table *m_table;
+	uint8_t portid;
+
+	for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+		m_table = &qconf->tx_mbufs[portid];
+		if (m_table->len == 0 || m_table->next_flush_time > now)
+			continue;
+
+		l2fwd_send_burst(qconf, portid);
+	}
+
+	/* Return setpoint to indicate that this job is happy with the time
+	 * interval in which it was called. */
+	return job->job_target;
+}
+
+/* main processing loop */
+static void
+l2fwd_main_loop(void)
+{
+	unsigned lcore_id;
+	unsigned i, portid;
+	struct lcore_queue_conf *qconf;
+
+	lcore_id = rte_lcore_id();
+	qconf = &lcore_queue_conf[lcore_id];
+
+	if (rte_headroom_job_count(&qconf->headroom) == 0) {
+		RTE_LOG(INFO, L2FWD, "lcore %u has nothing to do\n", lcore_id);
+		return;
+	}
+
+	RTE_LOG(INFO, L2FWD, "entering main loop on lcore %u\n", lcore_id);
+
+	for (i = 0; i < qconf->n_rx_port; i++) {
+
+		portid = qconf->rx_port_list[i];
+		RTE_LOG(INFO, L2FWD, " -- lcoreid=%u portid=%u\n", lcore_id,
+			portid);
+	}
+
+	while (1)
+		rte_headroom_next_job(&qconf->headroom);
+}
+
+static int
+l2fwd_launch_one_lcore(__attribute__((unused)) void *dummy)
+{
+	l2fwd_main_loop();
+	return 0;
+}
+
+/* display usage */
+static void
+l2fwd_usage(const char *prgname)
+{
+	printf("%s [EAL options] -- -p PORTMASK [-q NQ]\n"
+	       "  -p PORTMASK: hexadecimal bitmask of ports to configure\n"
+	       "  -q NQ: number of queue (=ports) per lcore (default is 1)\n"
+		   "  -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default, 86400 maximum)\n",
+	       prgname);
+}
+
+static int
+l2fwd_parse_portmask(const char *portmask)
+{
+	char *end = NULL;
+	unsigned long pm;
+
+	/* parse hexadecimal string */
+	pm = strtoul(portmask, &end, 16);
+	if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return 0;
+
+	if (pm == 0)
+		return 0;
+
+	return pm;
+}
+
+static unsigned int
+l2fwd_parse_nqueue(const char *q_arg)
+{
+	char *end = NULL;
+	unsigned long n;
+
+	/* parse decimal string */
+	n = strtoul(q_arg, &end, 10);
+	if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return 0;
+	if (n == 0)
+		return 0;
+	if (n >= MAX_RX_QUEUE_PER_LCORE)
+		return 0;
+
+	return n;
+}
+
+static int
+l2fwd_parse_timer_period(const char *q_arg)
+{
+	char *end = NULL;
+	int n;
+
+	/* parse number string */
+	n = strtol(q_arg, &end, 10);
+	if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return -1;
+	if (n > MAX_TIMER_PERIOD)
+		return -1;
+
+	return n;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+l2fwd_parse_args(int argc, char **argv)
+{
+	int opt, ret;
+	char **argvopt;
+	int option_index;
+	char *prgname = argv[0];
+	static struct option lgopts[] = {
+		{NULL, 0, 0, 0}
+	};
+
+	argvopt = argv;
+
+	while ((opt = getopt_long(argc, argvopt, "p:q:T:",
+				  lgopts, &option_index)) != EOF) {
+
+		switch (opt) {
+		/* portmask */
+		case 'p':
+			l2fwd_enabled_port_mask = l2fwd_parse_portmask(optarg);
+			if (l2fwd_enabled_port_mask == 0) {
+				printf("invalid portmask\n");
+				l2fwd_usage(prgname);
+				return -1;
+			}
+			break;
+
+		/* nqueue */
+		case 'q':
+			l2fwd_rx_queue_per_lcore = l2fwd_parse_nqueue(optarg);
+			if (l2fwd_rx_queue_per_lcore == 0) {
+				printf("invalid queue number\n");
+				l2fwd_usage(prgname);
+				return -1;
+			}
+			break;
+
+		/* timer period */
+		case 'T':
+			timer_period = l2fwd_parse_timer_period(optarg);
+			if (timer_period < 0) {
+				printf("invalid timer period\n");
+				l2fwd_usage(prgname);
+				return -1;
+			}
+			break;
+
+		/* long options */
+		case 0:
+			l2fwd_usage(prgname);
+			return -1;
+
+		default:
+			l2fwd_usage(prgname);
+			return -1;
+		}
+	}
+
+	if (optind >= 0)
+		argv[optind-1] = prgname;
+
+	ret = optind-1;
+	optind = 0; /* reset getopt lib */
+	return ret;
+}
+
+/* Check the link status of all ports in up to 9s, and print them finally */
+static void
+check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+	uint8_t portid, count, all_ports_up, print_flag = 0;
+	struct rte_eth_link link;
+
+	printf("\nChecking link status");
+	fflush(stdout);
+	for (count = 0; count <= MAX_CHECK_TIME; count++) {
+		all_ports_up = 1;
+		for (portid = 0; portid < port_num; portid++) {
+			if ((port_mask & (1 << portid)) == 0)
+				continue;
+			memset(&link, 0, sizeof(link));
+			rte_eth_link_get_nowait(portid, &link);
+			/* print link status if flag set */
+			if (print_flag == 1) {
+				if (link.link_status)
+					printf("Port %d Link Up - speed %u "
+						"Mbps - %s\n", (uint8_t)portid,
+						(unsigned)link.link_speed,
+				(link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+					("full-duplex") : ("half-duplex"));
+				else
+					printf("Port %d Link Down\n",
+						(uint8_t)portid);
+				continue;
+			}
+			/* clear all_ports_up flag if any link down */
+			if (link.link_status == 0) {
+				all_ports_up = 0;
+				break;
+			}
+		}
+		/* after finally printing all link status, get out */
+		if (print_flag == 1)
+			break;
+
+		if (all_ports_up == 0) {
+			printf(".");
+			fflush(stdout);
+			rte_delay_ms(CHECK_INTERVAL);
+		}
+
+		/* set the print_flag if all ports up or timeout */
+		if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+			print_flag = 1;
+			printf("done\n");
+		}
+	}
+}
+
+int
+main(int argc, char **argv)
+{
+	struct lcore_queue_conf *qconf;
+	struct rte_eth_dev_info dev_info;
+	struct rte_headroom_job *job;
+	unsigned lcore_id, rx_lcore_id, stats_lcore;
+	unsigned nb_ports_in_mask = 0;
+	int ret;
+	uint8_t nb_ports;
+	uint8_t nb_ports_available;
+	uint8_t portid, last_port;
+	uint8_t i;
+	char job_name[RTE_HEADROOM_JOB_NAMESIZE];
+
+	/* init EAL */
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+	argc -= ret;
+	argv += ret;
+
+	/* parse application arguments (after the EAL ones) */
+	ret = l2fwd_parse_args(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
+
+	/* fetch default timer frequency. */
+	hz = rte_get_timer_hz();
+
+	/* create the mbuf pool */
+	l2fwd_pktmbuf_pool =
+		rte_mempool_create("mbuf_pool", NB_MBUF,
+				   MBUF_SIZE, 32,
+				   sizeof(struct rte_pktmbuf_pool_private),
+				   rte_pktmbuf_pool_init, NULL,
+				   rte_pktmbuf_init, NULL,
+				   rte_socket_id(), 0);
+	if (l2fwd_pktmbuf_pool == NULL)
+		rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+	nb_ports = rte_eth_dev_count();
+	if (nb_ports == 0)
+		rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
+
+	if (nb_ports > RTE_MAX_ETHPORTS)
+		nb_ports = RTE_MAX_ETHPORTS;
+
+	/* reset l2fwd_dst_ports */
+	for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
+		l2fwd_dst_ports[portid] = 0;
+	last_port = 0;
+
+	/*
+	 * Each logical core is assigned a dedicated TX queue on each port.
+	 */
+	for (portid = 0; portid < nb_ports; portid++) {
+		/* skip ports that are not enabled */
+		if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+			continue;
+
+		if (nb_ports_in_mask % 2) {
+			l2fwd_dst_ports[portid] = last_port;
+			l2fwd_dst_ports[last_port] = portid;
+		} else
+			last_port = portid;
+
+		nb_ports_in_mask++;
+
+		rte_eth_dev_info_get(portid, &dev_info);
+	}
+	if (nb_ports_in_mask % 2) {
+		printf("Notice: odd number of ports in portmask.\n");
+		l2fwd_dst_ports[last_port] = last_port;
+	}
+
+	rx_lcore_id = 0;
+	qconf = NULL;
+
+	/* Initialize the port/queue configuration of each logical core */
+	for (portid = 0; portid < nb_ports; portid++) {
+		/* skip ports that are not enabled */
+		if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+			continue;
+
+		/* get the lcore_id for this port */
+		while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
+		       lcore_queue_conf[rx_lcore_id].n_rx_port ==
+		       l2fwd_rx_queue_per_lcore) {
+			rx_lcore_id++;
+			if (rx_lcore_id >= RTE_MAX_LCORE)
+				rte_exit(EXIT_FAILURE, "Not enough cores\n");
+		}
+
+		if (qconf != &lcore_queue_conf[rx_lcore_id])
+			/* Assigned a new logical core in the loop above. */
+			qconf = &lcore_queue_conf[rx_lcore_id];
+
+		qconf->rx_port_list[qconf->n_rx_port] = portid;
+		qconf->n_rx_port++;
+		printf("Lcore %u: RX port %u\n", rx_lcore_id, (unsigned) portid);
+	}
+
+	nb_ports_available = nb_ports;
+
+	/* Initialise each port */
+	for (portid = 0; portid < nb_ports; portid++) {
+		/* skip ports that are not enabled */
+		if ((l2fwd_enabled_port_mask & (1 << portid)) == 0) {
+			printf("Skipping disabled port %u\n", (unsigned) portid);
+			nb_ports_available--;
+			continue;
+		}
+		/* init port */
+		printf("Initializing port %u... ", (unsigned) portid);
+		fflush(stdout);
+		ret = rte_eth_dev_configure(portid, 1, 1, &port_conf);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n",
+				  ret, (unsigned) portid);
+
+		rte_eth_macaddr_get(portid, &l2fwd_ports_eth_addr[portid]);
+
+		/* init one RX queue */
+		fflush(stdout);
+		ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+					     rte_eth_dev_socket_id(portid),
+					     NULL,
+					     l2fwd_pktmbuf_pool);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
+				  ret, (unsigned) portid);
+
+		/* init one TX queue on each port */
+		fflush(stdout);
+		ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+				rte_eth_dev_socket_id(portid),
+				NULL);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
+				ret, (unsigned) portid);
+
+		/* Start device */
+		ret = rte_eth_dev_start(portid);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
+				  ret, (unsigned) portid);
+
+		printf("done:\n");
+
+		rte_eth_promiscuous_enable(portid);
+
+		printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
+				(unsigned) portid,
+				l2fwd_ports_eth_addr[portid].addr_bytes[0],
+				l2fwd_ports_eth_addr[portid].addr_bytes[1],
+				l2fwd_ports_eth_addr[portid].addr_bytes[2],
+				l2fwd_ports_eth_addr[portid].addr_bytes[3],
+				l2fwd_ports_eth_addr[portid].addr_bytes[4],
+				l2fwd_ports_eth_addr[portid].addr_bytes[5]);
+
+		/* initialize port stats */
+		memset(&port_statistics, 0, sizeof(port_statistics));
+	}
+
+	if (!nb_ports_available) {
+		rte_exit(EXIT_FAILURE,
+			"All available ports are disabled. Please set portmask.\n");
+	}
+
+	check_all_ports_link_status(nb_ports, l2fwd_enabled_port_mask);
+
+	drain_tsc = (hz + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
+	stats_lcore = rte_get_master_lcore();
+
+	RTE_LCORE_FOREACH(lcore_id) {
+		qconf = &lcore_queue_conf[lcore_id];
+
+		if (rte_headroom_init(&qconf->headroom) != 0)
+			rte_panic("Headroom for core %u init failed\n", lcore_id);
+
+		if (qconf->n_rx_port == 0) {
+			if (stats_lcore == rte_get_master_lcore()) {
+				stats_lcore = lcore_id;
+				RTE_LOG(INFO, L2FWD,
+						"lcore %u: this core is free. Statistics will be "
+						"displayed on this core.\n",
+						lcore_id);
+			} else {
+				RTE_LOG(INFO, L2FWD,
+						"lcore %u: no ports so no headroom initialization\n",
+						lcore_id);
+			}
+
+			continue;
+		}
+
+		/* Add flush job.
+		 * Set fixed period by setting min = max = initial period. Set target to
+		 * zero as it is irrelevant for this job. */
+		ret = rte_headroom_add_job(&qconf->headroom, "flush",
+				&l2fwd_flush_job, NULL, drain_tsc, drain_tsc, drain_tsc, 0,
+				&job);
+
+		if (ret < 0) {
+			rte_exit(1, "Failed to add flush job for lcore %u: %s\n",
+					lcore_id, rte_strerror(-ret));
+		}
+
+		for (i = 0; i < qconf->n_rx_port; i++) {
+			printf("%u ", qconf->rx_port_list[i]);
+
+			snprintf(job_name, RTE_DIM(job_name), "port %u fwd",
+					qconf->rx_port_list[i]);
+
+			/* Add forward job.
+			 * Set min, max and initial period. Set target to MAX_PKT_BURST as
+			 * this is desired optimal RX/TX burst size. */
+			ret = rte_headroom_add_job(&qconf->headroom, job_name,
+						&l2fwd_fwd_job, (void *)(uintptr_t)i, 0, drain_tsc, 0,
+						MAX_PKT_BURST, &job);
+
+			if (ret < 0) {
+				rte_exit(1, "Failed to add job (lcore: %u, port %u): %s\n",
+						lcore_id, qconf->rx_port_list[i], rte_strerror(-ret));
+			}
+		}
+	}
+
+	if (timer_period) {
+		/* Convert timer period to cycles */
+		timer_period *= hz;
+		qconf = &lcore_queue_conf[stats_lcore];
+
+		/* Add stats display job.
+		 * Set fixed period by setting min = max = initial period. Set target to
+		 * zero as it is irrelevant for this job. */
+		ret = rte_headroom_add_job(&qconf->headroom, "stats", &print_stats_job,
+				NULL, timer_period, timer_period, timer_period, 0, &job);
+
+		if (ret < 0) {
+			rte_exit(1, "Failed to add print stats job for lcore %u: %s\n",
+					stats_lcore, rte_strerror(-ret));
+		}
+
+		RTE_LOG(INFO, L2FWD, "Stats display on LCore %u\n", stats_lcore);
+	} else {
+		RTE_LOG(INFO, L2FWD, "Stats display disabled\n");
+	}
+
+	/* launch per-lcore init on every lcore */
+	rte_eal_mp_remote_launch(l2fwd_launch_one_lcore, NULL, CALL_MASTER);
+	RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+		if (rte_eal_wait_lcore(lcore_id) < 0)
+			return -1;
+	}
+
+	return 0;
+}
-- 
1.7.9.5