* [dpdk-dev] [PATCH] doc: add l2fwd-jobstats user guide
@ 2015-03-11 9:41 Pawel Wodkowski
2015-03-12 7:26 ` Butler, Siobhan A
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Pawel Wodkowski @ 2015-03-11 9:41 UTC (permalink / raw)
To: dev
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
doc/guides/sample_app_ug/index.rst | 1 +
doc/guides/sample_app_ug/l2_forward_job_stats.rst | 637 ++++++++++++++++++++++
2 files changed, 638 insertions(+)
create mode 100644 doc/guides/sample_app_ug/l2_forward_job_stats.rst
diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index 5720181..c89a2f0 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -47,6 +47,7 @@ Sample Applications User Guide
ipv4_multicast
ip_reassembly
kernel_nic_interface
+ l2_forward_job_stats
l2_forward_real_virtual
l3_forward
l3_forward_power_man
diff --git a/doc/guides/sample_app_ug/l2_forward_job_stats.rst b/doc/guides/sample_app_ug/l2_forward_job_stats.rst
new file mode 100644
index 0000000..76cea71
--- /dev/null
+++ b/doc/guides/sample_app_ug/l2_forward_job_stats.rst
@@ -0,0 +1,637 @@
+.. BSD LICENSE
+ Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+
+ * Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in
+ the documentation and/or other materials provided with the
+ distribution.
+ * Neither the name of Intel Corporation nor the names of its
+ contributors may be used to endorse or promote products derived
+ from this software without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+L2 Forwarding Sample Application (in Real and Virtualized Environments) with core load statistics.
+==================================================================================================
+
+The L2 Forwarding sample application is a simple example of packet processing using
+the Data Plane Development Kit (DPDK) which
+also takes advantage of Single Root I/O Virtualization (SR-IOV) features in a virtualized environment.
+
+.. note::
+
+ This application is a variation of L2 Forwarding sample application. It demonstrate possible
+ scheme of job stats library usage therefore some parts of this document is identical with original
+ L2 forwarding application.
+
+Overview
+--------
+
+The L2 Forwarding sample application, which can operate in real and virtualized environments,
+performs L2 forwarding for each packet that is received.
+The destination port is the adjacent port from the enabled portmask, that is,
+if the first four ports are enabled (portmask 0xf),
+ports 1 and 2 forward into each other, and ports 3 and 4 forward into each other.
+Also, the MAC addresses are affected as follows:
+
+* The source MAC address is replaced by the TX port MAC address
+
+* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID
+
+This application can be used to benchmark performance using a traffic-generator, as shown in the Figure 3.
+
+The application can also be used in a virtualized environment as shown in Figure 4.
+
+The L2 Forwarding application can also be used as a starting point for developing a new application based on the DPDK.
+
+.. _figure_3:
+
+**Figure 3. Performance Benchmark Setup (Basic Environment)**
+
+.. image4_png has been replaced
+
+|l2_fwd_benchmark_setup|
+
+.. _figure_4:
+
+**Figure 4. Performance Benchmark Setup (Virtualized Environment)**
+
+.. image5_png has been renamed
+
+|l2_fwd_virtenv_benchmark_setup|
+
+Virtual Function Setup Instructions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This application can use the virtual function available in the system and
+therefore can be used in a virtual machine without passing through
+the whole Network Device into a guest machine in a virtualized scenario.
+The virtual functions can be enabled in the host machine or the hypervisor with the respective physical function driver.
+
+For example, in a Linux* host machine, it is possible to enable a virtual function using the following command:
+
+.. code-block:: console
+
+ modprobe ixgbe max_vfs=2,2
+
+This command enables two Virtual Functions on each of Physical Function of the NIC,
+with two physical ports in the PCI configuration space.
+It is important to note that enabled Virtual Function 0 and 2 would belong to Physical Function 0
+and Virtual Function 1 and 3 would belong to Physical Function 1,
+in this case enabling a total of four Virtual Functions.
+
+Compiling the Application
+-------------------------
+
+#. Go to the example directory:
+
+ .. code-block:: console
+
+ export RTE_SDK=/path/to/rte_sdk cd ${RTE_SDK}/examples/l2fwd-jobstats
+
+#. Set the target (a default target is used if not specified). For example:
+
+ .. code-block:: console
+
+ export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+ *See the DPDK Getting Started Guide* for possible RTE_TARGET values.
+
+#. Build the application:
+
+ .. code-block:: console
+
+ make
+
+Running the Application
+-----------------------
+
+The application requires a number of command line options:
+
+.. code-block:: console
+
+ ./build/l2fwd-jobstats [EAL options] -- -p PORTMASK [-q NQ] [-l]
+
+where,
+
+* p PORTMASK: A hexadecimal bitmask of the ports to configure
+
+* q NQ: A number of queues (=ports) per lcore (default is 1)
+
+* l: Use locale thousands separator when formatting big numbers.
+
+To run the application in linuxapp environment with 4 lcores, 16 ports, 8 RX queues per lcore and
+thousands separator printing, issue the command:
+
+.. code-block:: console
+
+ $ ./build/l2fwd-jobstats -c f -n 4 -- -q 8 -p ffff -l
+
+Refer to the *DPDK Getting Started Guide* for general information on running applications
+and the Environment Abstraction Layer (EAL) options.
+
+Explanation
+-----------
+
+The following sections provide some explanation of the code.
+
+Command Line Arguments
+~~~~~~~~~~~~~~~~~~~~~~
+
+The L2 Forwarding sample application takes specific parameters,
+in addition to Environment Abstraction Layer (EAL) arguments (see Section 9.3).
+The preferred way to parse parameters is to use the getopt() function,
+since it is part of a well-defined and portable library.
+
+The parsing of arguments is done in the l2fwd_parse_args() function.
+The method of argument parsing is not described here.
+Refer to the *glibc getopt(3)* man page for details.
+
+EAL arguments are parsed first, then application-specific arguments.
+This is done at the beginning of the main() function:
+
+.. code-block:: c
+
+ /* init EAL */
+
+ ret = rte_eal_init(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+
+ argc -= ret;
+ argv += ret;
+
+ /* parse application arguments (after the EAL ones) */
+
+ ret = l2fwd_parse_args(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
+
+Mbuf Pool Initialization
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Once the arguments are parsed, the mbuf pool is created.
+The mbuf pool contains a set of mbuf objects that will be used by the driver
+and the application to store network packet data:
+
+.. code-block:: c
+
+ /* create the mbuf pool */
+ l2fwd_pktmbuf_pool =
+ rte_mempool_create("mbuf_pool", NB_MBUF,
+ MBUF_SIZE, 32,
+ sizeof(struct rte_pktmbuf_pool_private),
+ rte_pktmbuf_pool_init, NULL,
+ rte_pktmbuf_init, NULL,
+ rte_socket_id(), 0);
+
+ if (l2fwd_pktmbuf_pool == NULL)
+ rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+The rte_mempool is a generic structure used to handle pools of objects.
+In this case, it is necessary to create a pool that will be used by the driver,
+which expects to have some reserved space in the mempool structure,
+sizeof(struct rte_pktmbuf_pool_private) bytes.
+The number of allocated pkt mbufs is NB_MBUF, with a size of MBUF_SIZE each.
+A per-lcore cache of 32 mbufs is kept.
+The memory is allocated in rte_socket_id() socket,
+but it is possible to extend this code to allocate one mbuf pool per socket.
+
+Two callback pointers are also given to the rte_mempool_create() function:
+
+* The first callback pointer is to rte_pktmbuf_pool_init() and is used
+ to initialize the private data of the mempool, which is needed by the driver.
+ This function is provided by the mbuf API, but can be copied and extended by the developer.
+
+* The second callback pointer given to rte_mempool_create() is the mbuf initializer.
+ The default is used, that is, rte_pktmbuf_init(), which is provided in the rte_mbuf library.
+ If a more complex application wants to extend the rte_pktmbuf structure for its own needs,
+ a new function derived from rte_pktmbuf_init( ) can be created.
+
+Driver Initialization
+~~~~~~~~~~~~~~~~~~~~~
+
+The main part of the code in the main() function relates to the initialization of the driver.
+To fully understand this code, it is recommended to study the chapters that related to the Poll Mode Driver
+in the *DPDK Programmer's Guide* and the *DPDK API Reference*.
+
+.. code-block:: c
+
+ nb_ports = rte_eth_dev_count();
+
+ if (nb_ports == 0)
+ rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
+
+ if (nb_ports > RTE_MAX_ETHPORTS)
+ nb_ports = RTE_MAX_ETHPORTS;
+
+ /* reset l2fwd_dst_ports */
+
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
+ l2fwd_dst_ports[portid] = 0;
+
+ last_port = 0;
+
+ /*
+ * Each logical core is assigned a dedicated TX queue on each port.
+ */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ if (nb_ports_in_mask % 2) {
+ l2fwd_dst_ports[portid] = last_port;
+ l2fwd_dst_ports[last_port] = portid;
+ }
+ else
+ last_port = portid;
+
+ nb_ports_in_mask++;
+
+ rte_eth_dev_info_get((uint8_t) portid, &dev_info);
+ }
+
+The next step is to configure the RX and TX queues.
+For each port, there is only one RX queue (only one lcore is able to poll a given port).
+The number of TX queues depends on the number of available lcores.
+The rte_eth_dev_configure() function is used to configure the number of queues for a port:
+
+.. code-block:: c
+
+ ret = rte_eth_dev_configure((uint8_t)portid, 1, 1, &port_conf);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Cannot configure device: "
+ "err=%d, port=%u\n",
+ ret, portid);
+
+The global configuration is stored in a static structure:
+
+.. code-block:: c
+
+ static const struct rte_eth_conf port_conf = {
+ .rxmode = {
+ .split_hdr_size = 0,
+ .header_split = 0, /**< Header Split disabled */
+ .hw_ip_checksum = 0, /**< IP checksum offload disabled */
+ .hw_vlan_filter = 0, /**< VLAN filtering disabled */
+ .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
+ .hw_strip_crc= 0, /**< CRC stripped by hardware */
+ },
+
+ .txmode = {
+ .mq_mode = ETH_DCB_NONE
+ },
+ };
+
+RX Queue Initialization
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The application uses one lcore to poll one or several ports, depending on the -q option,
+which specifies the number of queues per lcore.
+
+For example, if the user specifies -q 4, the application is able to poll four ports with one lcore.
+If there are 16 ports on the target (and if the portmask argument is -p ffff ),
+the application will need four lcores to poll all the ports.
+
+.. code-block:: c
+
+ ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+ rte_eth_dev_socket_id(portid),
+ NULL,
+ l2fwd_pktmbuf_pool);
+
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+The list of queues that must be polled for a given lcore is stored in a private structure called struct lcore_queue_conf.
+
+.. code-block:: c
+
+ struct lcore_queue_conf {
+ unsigned n_rx_port;
+ unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
+ truct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+
+ struct rte_timer rx_timers[MAX_RX_QUEUE_PER_LCORE];
+ struct rte_jobstats port_fwd_jobs[MAX_RX_QUEUE_PER_LCORE];
+
+ struct rte_timer flush_timer;
+ struct rte_jobstats flush_job;
+ struct rte_jobstats idle_job;
+ struct rte_jobstats_context jobs_context;
+
+ rte_atomic16_t stats_read_pending;
+ rte_spinlock_t lock;
+ } __rte_cache_aligned;
+
+Values of struct lcore_queue_conf:
+
+* n_rx_port and rx_port_list[] are used in the main packet processing loop
+ (see Section 9.4.6 "Receive, Process and Transmit Packets" later in this chapter).
+
+* rx_timers and flush_timer are used to ensure forced TX on low packet rate.
+
+* flush_job, idle_job and jobs_context are librte_jobstats objects used for managing l2fwd jobs.
+
+* stats_read_pending and lock are used during job stats read phase.
+
+TX Queue Initialization
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Each lcore should be able to transmit on any port. For every port, a single TX queue is initialized.
+
+.. code-block:: c
+
+ /* init one TX queue on each port */
+
+ fflush(stdout);
+ ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+ rte_eth_dev_socket_id(portid),
+ NULL);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+Jobs statistics initialization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+There are several statistics objects available:
+
+* Flush job statistics
+
+.. code-block:: c
+
+ rte_jobstats_init(&qconf->flush_job, "flush", drain_tsc, drain_tsc,
+ drain_tsc, 0);
+
+ rte_timer_init(&qconf->flush_timer);
+ ret = rte_timer_reset(&qconf->flush_timer, drain_tsc, PERIODICAL,
+ lcore_id, &l2fwd_flush_job, NULL);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to reset flush job timer for lcore %u: %s",
+ lcore_id, rte_strerror(-ret));
+ }
+
+* Statistics per RX port
+
+.. code-block:: c
+
+ rte_jobstats_init(job, name, 0, drain_tsc, 0, MAX_PKT_BURST);
+ rte_jobstats_set_update_period_function(job, l2fwd_job_update_cb);
+
+ rte_timer_init(&qconf->rx_timers[i]);
+ ret = rte_timer_reset(&qconf->rx_timers[i], 0, PERIODICAL, lcore_id,
+ l2fwd_fwd_job, (void *)(uintptr_t)i);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to reset lcore %u port %u job timer: %s",
+ lcore_id, qconf->rx_port_list[i], rte_strerror(-ret));
+ }
+
+Following parameters are passed to rte_jobstats_init():
+
+* 0 as minimal poll period
+
+* drain_tsc as maximum poll period
+
+* MAX_PKT_BURST as desired target value (RX burst size)
+
+Main loop
+~~~~~~~~~
+
+The forwarding path is reworked comparing to orginal L2 Forwarding application.
+In the l2fwd_main_loop() function three loops are placed.
+
+.. code-block:: c
+
+ for (;;) {
+ rte_spinlock_lock(&qconf->lock);
+
+ do {
+ rte_jobstats_context_start(&qconf->jobs_context);
+
+ /* Do the Idle job:
+ * - Read stats_read_pending flag
+ * - check if some real job need to be executed
+ */
+ rte_jobstats_start(&qconf->jobs_context, &qconf->idle_job);
+
+ do {
+ uint8_t i;
+ uint64_t now = rte_get_timer_cycles();
+
+ need_manage = qconf->flush_timer.expire < now;
+ /* Check if we was esked to give a stats. */
+ stats_read_pending =
+ rte_atomic16_read(&qconf->stats_read_pending);
+ need_manage |= stats_read_pending;
+
+ for (i = 0; i < qconf->n_rx_port && !need_manage; i++)
+ need_manage = qconf->rx_timers[i].expire < now;
+
+ } while (!need_manage);
+ rte_jobstats_finish(&qconf->idle_job, qconf->idle_job.target);
+
+ rte_timer_manage();
+ rte_jobstats_context_finish(&qconf->jobs_context);
+ } while (likely(stats_read_pending == 0));
+
+ rte_spinlock_unlock(&qconf->lock);
+ rte_pause();
+ }
+
+First inifnite for loop is to minimize impact of stats reading. Lock is only locked/unlocked when asked.
+
+Second inner while loop do the whole jobs management. When any job is ready, the use rte_timer_manage() is used to call the job handler.
+In this place functions l2fwd_fwd_job() and l2fwd_flush_job() are called when needed.
+Then rte_jobstats_context_finish() is called to mark loop end - no other jobs are ready to execute. By this time stats are ready to be read
+and if stats_read_pending is set, loop breaks allowing stats to be read.
+
+Third do-while loop is the idle job (idle stats counter). Its only purpose is moniting if any job is ready or stats job read is pending
+for this lcore. Statistics from this part of code is considered as the headroom avalable fo additional processing.
+
+Receive, Process and Transmit Packets
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The main task of l2fwd_fwd_job() function is to read ingress packets from the RX queue of particular port and forward it.
+This is done using the following code:
+
+.. code-block:: c
+
+ total_nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ for (j = 0; j < total_nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+
+Packets are read in a burst of size MAX_PKT_BURST.
+Then, each mbuf in the table is processed by the l2fwd_simple_forward() function.
+The processing is very simple: process the TX port from the RX port, then replace the source and destination MAC addresses.
+
+The rte_eth_rx_burst() function writes the mbuf pointers in a local table and returns the number of available mbufs in the table.
+
+After first read second try is issued.
+
+.. code::
+
+ if (total_nb_rx == MAX_PKT_BURST) {
+ const uint16_t nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ total_nb_rx += nb_rx;
+ for (j = 0; j < nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+ }
+
+This second read is important to give job stats library a feedback how many packets was procesed.
+
+.. code::
+
+ /* Adjust period time in which we are running here. */
+ if (rte_jobstats_finish(job, total_nb_rx) != 0) {
+ rte_timer_reset(&qconf->rx_timers[port_idx], job->period, PERIODICAL,
+ lcore_id, l2fwd_fwd_job, arg);
+ }
+
+To maximize performance exactly MAX_PKT_BURST is expected (the target value) to be read for each l2fwd_fwd_job() call.
+If total_nb_rx is smaller than target value job->period will be increased. If it is greater the period will be decreased.
+
+.. note::
+
+ In the following code, one line for getting the output port requires some explanation.
+
+During the initialization process, a static array of destination ports (l2fwd_dst_ports[]) is filled such that for each source port,
+a destination port is assigned that is either the next or previous enabled port from the portmask.
+Naturally, the number of ports in the portmask must be even, otherwise, the application exits.
+
+.. code-block:: c
+
+ static void
+ l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
+ {
+ struct ether_hdr *eth;
+ void *tmp;
+ unsigned dst_port;
+
+ dst_port = l2fwd_dst_ports[portid];
+
+ eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ /* 02:00:00:00:00:xx */
+
+ tmp = ð->d_addr.addr_bytes[0];
+
+ *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t) dst_port << 40);
+
+ /* src addr */
+
+ ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], ð->s_addr);
+
+ l2fwd_send_packet(m, (uint8_t) dst_port);
+ }
+
+Then, the packet is sent using the l2fwd_send_packet (m, dst_port) function.
+For this test application, the processing is exactly the same for all packets arriving on the same RX port.
+Therefore, it would have been possible to call the l2fwd_send_burst() function directly from the main loop
+to send all the received packets on the same TX port,
+using the burst-oriented send function, which is more efficient.
+
+However, in real-life applications (such as, L3 routing),
+packet N is not necessarily forwarded on the same port as packet N-1.
+The application is implemented to illustrate that, so the same approach can be reused in a more complex application.
+
+The l2fwd_send_packet() function stores the packet in a per-lcore and per-txport table.
+If the table is full, the whole packets table is transmitted using the l2fwd_send_burst() function:
+
+.. code-block:: c
+
+ /* Send the packet on an output interface */
+
+ static int
+ l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
+ {
+ unsigned lcore_id, len;
+ struct lcore_queue_conf *qconf;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+ len = qconf->tx_mbufs[port].len;
+ qconf->tx_mbufs[port].m_table[len] = m;
+ len++;
+
+ /* enough pkts to be sent */
+
+ if (unlikely(len == MAX_PKT_BURST)) {
+ l2fwd_send_burst(qconf, MAX_PKT_BURST, port);
+ len = 0;
+ }
+
+ qconf->tx_mbufs[port].len = len; return 0;
+ }
+
+To ensure that no packets remain in the tables, the flush job exists. The l2fwd_flush_job()
+is called periodicaly to for each lcore draining TX queue of each port.
+This technique introduces some latency when there are not many packets to send,
+however it improves performance:
+
+.. code-block:: c
+
+ static void
+ l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused void *arg)
+ {
+ uint64_t now;
+ unsigned lcore_id;
+ struct lcore_queue_conf *qconf;
+ struct mbuf_table *m_table;
+ uint8_t portid;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_jobstats_start(&qconf->jobs_context, &qconf->flush_job);
+
+ now = rte_get_timer_cycles();
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ m_table = &qconf->tx_mbufs[portid];
+ if (m_table->len == 0 || m_table->next_flush_time <= now)
+ continue;
+
+ l2fwd_send_burst(qconf, portid);
+ }
+
+
+ /* Pass target to indicate that this job is happy of time interval
+ * in which it was called. */
+ rte_jobstats_finish(&qconf->flush_job, qconf->flush_job.target);
+ }
+
+.. |l2_fwd_benchmark_setup| image:: img/l2_fwd_benchmark_setup.svg
+
+.. |l2_fwd_virtenv_benchmark_setup| image:: img/l2_fwd_virtenv_benchmark_setup.png
--
1.9.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [dpdk-dev] [PATCH] doc: add l2fwd-jobstats user guide
2015-03-11 9:41 [dpdk-dev] [PATCH] doc: add l2fwd-jobstats user guide Pawel Wodkowski
@ 2015-03-12 7:26 ` Butler, Siobhan A
2015-03-17 20:53 ` Thomas Monjalon
2015-03-18 10:20 ` [dpdk-dev] [PATCH v2] " Pawel Wodkowski
2 siblings, 0 replies; 6+ messages in thread
From: Butler, Siobhan A @ 2015-03-12 7:26 UTC (permalink / raw)
To: Wodkowski, PawelX, dev
> -----Original Message-----
> From: Wodkowski, PawelX
> Sent: Wednesday, March 11, 2015 9:42 AM
> To: dev@dpdk.org
> Cc: Butler, Siobhan A
> Subject: [PATCH] doc: add l2fwd-jobstats user guide
>
> Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> ---
> doc/guides/sample_app_ug/index.rst | 1 +
> doc/guides/sample_app_ug/l2_forward_job_stats.rst | 637
> ++++++++++++++++++++++
> 2 files changed, 638 insertions(+)
> create mode 100644 doc/guides/sample_app_ug/l2_forward_job_stats.rst
>
> diff --git a/doc/guides/sample_app_ug/index.rst
> b/doc/guides/sample_app_ug/index.rst
> index 5720181..c89a2f0 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -47,6 +47,7 @@ Sample Applications User Guide
> ipv4_multicast
> ip_reassembly
> kernel_nic_interface
> + l2_forward_job_stats
> l2_forward_real_virtual
> l3_forward
> l3_forward_power_man
> diff --git a/doc/guides/sample_app_ug/l2_forward_job_stats.rst
> b/doc/guides/sample_app_ug/l2_forward_job_stats.rst
> new file mode 100644
> index 0000000..76cea71
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/l2_forward_job_stats.rst
> @@ -0,0 +1,637 @@
> +.. BSD LICENSE
> + Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> + All rights reserved.
> +
> + Redistribution and use in source and binary forms, with or without
> + modification, are permitted provided that the following conditions
> + are met:
> +
> + * Redistributions of source code must retain the above copyright
> + notice, this list of conditions and the following disclaimer.
> + * Redistributions in binary form must reproduce the above copyright
> + notice, this list of conditions and the following disclaimer in
> + the documentation and/or other materials provided with the
> + distribution.
> + * Neither the name of Intel Corporation nor the names of its
> + contributors may be used to endorse or promote products derived
> + from this software without specific prior written permission.
> +
> + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> +
> +L2 Forwarding Sample Application (in Real and Virtualized Environments)
> with core load statistics.
> +=========================================================
> ==============
> +===========================
> +
> +The L2 Forwarding sample application is a simple example of packet
> +processing using the Data Plane Development Kit (DPDK) which also takes
> +advantage of Single Root I/O Virtualization (SR-IOV) features in a virtualized
> environment.
> +
> +.. note::
> +
> + This application is a variation of L2 Forwarding sample application. It
> demonstrate possible
> + scheme of job stats library usage therefore some parts of this document
> is identical with original
> + L2 forwarding application.
> +
> +Overview
> +--------
> +
> +The L2 Forwarding sample application, which can operate in real and
> +virtualized environments, performs L2 forwarding for each packet that is
> received.
> +The destination port is the adjacent port from the enabled portmask,
> +that is, if the first four ports are enabled (portmask 0xf), ports 1
> +and 2 forward into each other, and ports 3 and 4 forward into each other.
> +Also, the MAC addresses are affected as follows:
> +
> +* The source MAC address is replaced by the TX port MAC address
> +
> +* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID
> +
> +This application can be used to benchmark performance using a traffic-
> generator, as shown in the Figure 3.
> +
> +The application can also be used in a virtualized environment as shown in
> Figure 4.
> +
> +The L2 Forwarding application can also be used as a starting point for
> developing a new application based on the DPDK.
> +
> +.. _figure_3:
> +
> +**Figure 3. Performance Benchmark Setup (Basic Environment)**
> +
> +.. image4_png has been replaced
> +
> +|l2_fwd_benchmark_setup|
> +
> +.. _figure_4:
> +
> +**Figure 4. Performance Benchmark Setup (Virtualized Environment)**
> +
> +.. image5_png has been renamed
> +
> +|l2_fwd_virtenv_benchmark_setup|
> +
> +Virtual Function Setup Instructions
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +This application can use the virtual function available in the system
> +and therefore can be used in a virtual machine without passing through
> +the whole Network Device into a guest machine in a virtualized scenario.
> +The virtual functions can be enabled in the host machine or the hypervisor
> with the respective physical function driver.
> +
> +For example, in a Linux* host machine, it is possible to enable a virtual
> function using the following command:
> +
> +.. code-block:: console
> +
> + modprobe ixgbe max_vfs=2,2
> +
> +This command enables two Virtual Functions on each of Physical Function
> +of the NIC, with two physical ports in the PCI configuration space.
> +It is important to note that enabled Virtual Function 0 and 2 would
> +belong to Physical Function 0 and Virtual Function 1 and 3 would belong
> +to Physical Function 1, in this case enabling a total of four Virtual Functions.
> +
> +Compiling the Application
> +-------------------------
> +
> +#. Go to the example directory:
> +
> + .. code-block:: console
> +
> + export RTE_SDK=/path/to/rte_sdk cd
> + ${RTE_SDK}/examples/l2fwd-jobstats
> +
> +#. Set the target (a default target is used if not specified). For example:
> +
> + .. code-block:: console
> +
> + export RTE_TARGET=x86_64-native-linuxapp-gcc
> +
> + *See the DPDK Getting Started Guide* for possible RTE_TARGET values.
> +
> +#. Build the application:
> +
> + .. code-block:: console
> +
> + make
> +
> +Running the Application
> +-----------------------
> +
> +The application requires a number of command line options:
> +
> +.. code-block:: console
> +
> + ./build/l2fwd-jobstats [EAL options] -- -p PORTMASK [-q NQ] [-l]
> +
> +where,
> +
> +* p PORTMASK: A hexadecimal bitmask of the ports to configure
> +
> +* q NQ: A number of queues (=ports) per lcore (default is 1)
> +
> +* l: Use locale thousands separator when formatting big numbers.
> +
> +To run the application in linuxapp environment with 4 lcores, 16 ports,
> +8 RX queues per lcore and thousands separator printing, issue the
> command:
> +
> +.. code-block:: console
> +
> + $ ./build/l2fwd-jobstats -c f -n 4 -- -q 8 -p ffff -l
> +
> +Refer to the *DPDK Getting Started Guide* for general information on
> +running applications and the Environment Abstraction Layer (EAL) options.
> +
> +Explanation
> +-----------
> +
> +The following sections provide some explanation of the code.
> +
> +Command Line Arguments
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +The L2 Forwarding sample application takes specific parameters, in
> +addition to Environment Abstraction Layer (EAL) arguments (see Section
> 9.3).
> +The preferred way to parse parameters is to use the getopt() function,
> +since it is part of a well-defined and portable library.
> +
> +The parsing of arguments is done in the l2fwd_parse_args() function.
> +The method of argument parsing is not described here.
> +Refer to the *glibc getopt(3)* man page for details.
> +
> +EAL arguments are parsed first, then application-specific arguments.
> +This is done at the beginning of the main() function:
> +
> +.. code-block:: c
> +
> + /* init EAL */
> +
> + ret = rte_eal_init(argc, argv);
> + if (ret < 0)
> + rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
> +
> + argc -= ret;
> + argv += ret;
> +
> + /* parse application arguments (after the EAL ones) */
> +
> + ret = l2fwd_parse_args(argc, argv);
> + if (ret < 0)
> + rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
> +
> +Mbuf Pool Initialization
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Once the arguments are parsed, the mbuf pool is created.
> +The mbuf pool contains a set of mbuf objects that will be used by the
> +driver and the application to store network packet data:
> +
> +.. code-block:: c
> +
> + /* create the mbuf pool */
> + l2fwd_pktmbuf_pool =
> + rte_mempool_create("mbuf_pool", NB_MBUF,
> + MBUF_SIZE, 32,
> + sizeof(struct rte_pktmbuf_pool_private),
> + rte_pktmbuf_pool_init, NULL,
> + rte_pktmbuf_init, NULL,
> + rte_socket_id(), 0);
> +
> + if (l2fwd_pktmbuf_pool == NULL)
> + rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
> +
> +The rte_mempool is a generic structure used to handle pools of objects.
> +In this case, it is necessary to create a pool that will be used by the
> +driver, which expects to have some reserved space in the mempool
> +structure, sizeof(struct rte_pktmbuf_pool_private) bytes.
> +The number of allocated pkt mbufs is NB_MBUF, with a size of MBUF_SIZE
> each.
> +A per-lcore cache of 32 mbufs is kept.
> +The memory is allocated in rte_socket_id() socket, but it is possible
> +to extend this code to allocate one mbuf pool per socket.
> +
> +Two callback pointers are also given to the rte_mempool_create() function:
> +
> +* The first callback pointer is to rte_pktmbuf_pool_init() and is used
> + to initialize the private data of the mempool, which is needed by the
> driver.
> + This function is provided by the mbuf API, but can be copied and
> extended by the developer.
> +
> +* The second callback pointer given to rte_mempool_create() is the mbuf
> initializer.
> + The default is used, that is, rte_pktmbuf_init(), which is provided in the
> rte_mbuf library.
> + If a more complex application wants to extend the rte_pktmbuf structure
> for its own needs,
> + a new function derived from rte_pktmbuf_init( ) can be created.
> +
> +Driver Initialization
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +The main part of the code in the main() function relates to the initialization
> of the driver.
> +To fully understand this code, it is recommended to study the chapters
> +that related to the Poll Mode Driver in the *DPDK Programmer's Guide*
> and the *DPDK API Reference*.
> +
> +.. code-block:: c
> +
> + nb_ports = rte_eth_dev_count();
> +
> + if (nb_ports == 0)
> + rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
> +
> + if (nb_ports > RTE_MAX_ETHPORTS)
> + nb_ports = RTE_MAX_ETHPORTS;
> +
> + /* reset l2fwd_dst_ports */
> +
> + for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
> + l2fwd_dst_ports[portid] = 0;
> +
> + last_port = 0;
> +
> + /*
> + * Each logical core is assigned a dedicated TX queue on each port.
> + */
> + for (portid = 0; portid < nb_ports; portid++) {
> + /* skip ports that are not enabled */
> + if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
> + continue;
> +
> + if (nb_ports_in_mask % 2) {
> + l2fwd_dst_ports[portid] = last_port;
> + l2fwd_dst_ports[last_port] = portid;
> + }
> + else
> + last_port = portid;
> +
> + nb_ports_in_mask++;
> +
> + rte_eth_dev_info_get((uint8_t) portid, &dev_info);
> + }
> +
> +The next step is to configure the RX and TX queues.
> +For each port, there is only one RX queue (only one lcore is able to poll a
> given port).
> +The number of TX queues depends on the number of available lcores.
> +The rte_eth_dev_configure() function is used to configure the number of
> queues for a port:
> +
> +.. code-block:: c
> +
> + ret = rte_eth_dev_configure((uint8_t)portid, 1, 1, &port_conf);
> + if (ret < 0)
> + rte_exit(EXIT_FAILURE, "Cannot configure device: "
> + "err=%d, port=%u\n",
> + ret, portid);
> +
> +The global configuration is stored in a static structure:
> +
> +.. code-block:: c
> +
> + static const struct rte_eth_conf port_conf = {
> + .rxmode = {
> + .split_hdr_size = 0,
> + .header_split = 0, /**< Header Split disabled */
> + .hw_ip_checksum = 0, /**< IP checksum offload disabled */
> + .hw_vlan_filter = 0, /**< VLAN filtering disabled */
> + .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
> + .hw_strip_crc= 0, /**< CRC stripped by hardware */
> + },
> +
> + .txmode = {
> + .mq_mode = ETH_DCB_NONE
> + },
> + };
> +
> +RX Queue Initialization
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The application uses one lcore to poll one or several ports, depending
> +on the -q option, which specifies the number of queues per lcore.
> +
> +For example, if the user specifies -q 4, the application is able to poll four
> ports with one lcore.
> +If there are 16 ports on the target (and if the portmask argument is -p
> +ffff ), the application will need four lcores to poll all the ports.
> +
> +.. code-block:: c
> +
> + ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
> + rte_eth_dev_socket_id(portid),
> + NULL,
> + l2fwd_pktmbuf_pool);
> +
> + if (ret < 0)
> + rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d,
> port=%u\n",
> + ret, (unsigned) portid);
> +
> +The list of queues that must be polled for a given lcore is stored in a private
> structure called struct lcore_queue_conf.
> +
> +.. code-block:: c
> +
> + struct lcore_queue_conf {
> + unsigned n_rx_port;
> + unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
> + truct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
> +
> + struct rte_timer rx_timers[MAX_RX_QUEUE_PER_LCORE];
> + struct rte_jobstats port_fwd_jobs[MAX_RX_QUEUE_PER_LCORE];
> +
> + struct rte_timer flush_timer;
> + struct rte_jobstats flush_job;
> + struct rte_jobstats idle_job;
> + struct rte_jobstats_context jobs_context;
> +
> + rte_atomic16_t stats_read_pending;
> + rte_spinlock_t lock;
> + } __rte_cache_aligned;
> +
> +Values of struct lcore_queue_conf:
> +
> +* n_rx_port and rx_port_list[] are used in the main packet processing loop
> + (see Section 9.4.6 "Receive, Process and Transmit Packets" later in this
> chapter).
> +
> +* rx_timers and flush_timer are used to ensure forced TX on low packet
> rate.
> +
> +* flush_job, idle_job and jobs_context are librte_jobstats objects used for
> managing l2fwd jobs.
> +
> +* stats_read_pending and lock are used during job stats read phase.
> +
> +TX Queue Initialization
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Each lcore should be able to transmit on any port. For every port, a single TX
> queue is initialized.
> +
> +.. code-block:: c
> +
> + /* init one TX queue on each port */
> +
> + fflush(stdout);
> + ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
> + rte_eth_dev_socket_id(portid),
> + NULL);
> + if (ret < 0)
> + rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d,
> port=%u\n",
> + ret, (unsigned) portid);
> +
> +Jobs statistics initialization
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +There are several statistics objects available:
> +
> +* Flush job statistics
> +
> +.. code-block:: c
> +
> + rte_jobstats_init(&qconf->flush_job, "flush", drain_tsc, drain_tsc,
> + drain_tsc, 0);
> +
> + rte_timer_init(&qconf->flush_timer);
> + ret = rte_timer_reset(&qconf->flush_timer, drain_tsc, PERIODICAL,
> + lcore_id, &l2fwd_flush_job, NULL);
> +
> + if (ret < 0) {
> + rte_exit(1, "Failed to reset flush job timer for lcore %u: %s",
> + lcore_id, rte_strerror(-ret));
> + }
> +
> +* Statistics per RX port
> +
> +.. code-block:: c
> +
> + rte_jobstats_init(job, name, 0, drain_tsc, 0, MAX_PKT_BURST);
> + rte_jobstats_set_update_period_function(job, l2fwd_job_update_cb);
> +
> + rte_timer_init(&qconf->rx_timers[i]);
> + ret = rte_timer_reset(&qconf->rx_timers[i], 0, PERIODICAL, lcore_id,
> + l2fwd_fwd_job, (void *)(uintptr_t)i);
> +
> + if (ret < 0) {
> + rte_exit(1, "Failed to reset lcore %u port %u job timer: %s",
> + lcore_id, qconf->rx_port_list[i], rte_strerror(-ret));
> + }
> +
> +Following parameters are passed to rte_jobstats_init():
> +
> +* 0 as minimal poll period
> +
> +* drain_tsc as maximum poll period
> +
> +* MAX_PKT_BURST as desired target value (RX burst size)
> +
> +Main loop
> +~~~~~~~~~
> +
> +The forwarding path is reworked comparing to orginal L2 Forwarding
> application.
> +In the l2fwd_main_loop() function three loops are placed.
> +
> +.. code-block:: c
> +
> + for (;;) {
> + rte_spinlock_lock(&qconf->lock);
> +
> + do {
> + rte_jobstats_context_start(&qconf->jobs_context);
> +
> + /* Do the Idle job:
> + * - Read stats_read_pending flag
> + * - check if some real job need to be executed
> + */
> + rte_jobstats_start(&qconf->jobs_context, &qconf->idle_job);
> +
> + do {
> + uint8_t i;
> + uint64_t now = rte_get_timer_cycles();
> +
> + need_manage = qconf->flush_timer.expire < now;
> + /* Check if we was esked to give a stats. */
> + stats_read_pending =
> + rte_atomic16_read(&qconf->stats_read_pending);
> + need_manage |= stats_read_pending;
> +
> + for (i = 0; i < qconf->n_rx_port && !need_manage; i++)
> + need_manage = qconf->rx_timers[i].expire < now;
> +
> + } while (!need_manage);
> + rte_jobstats_finish(&qconf->idle_job,
> + qconf->idle_job.target);
> +
> + rte_timer_manage();
> + rte_jobstats_context_finish(&qconf->jobs_context);
> + } while (likely(stats_read_pending == 0));
> +
> + rte_spinlock_unlock(&qconf->lock);
> + rte_pause();
> + }
> +
> +First inifnite for loop is to minimize impact of stats reading. Lock is only
> locked/unlocked when asked.
> +
> +Second inner while loop do the whole jobs management. When any job is
> ready, the use rte_timer_manage() is used to call the job handler.
> +In this place functions l2fwd_fwd_job() and l2fwd_flush_job() are called
> when needed.
> +Then rte_jobstats_context_finish() is called to mark loop end - no
> +other jobs are ready to execute. By this time stats are ready to be read and
> if stats_read_pending is set, loop breaks allowing stats to be read.
> +
> +Third do-while loop is the idle job (idle stats counter). Its only
> +purpose is moniting if any job is ready or stats job read is pending for this
> lcore. Statistics from this part of code is considered as the headroom avalable
> fo additional processing.
> +
> +Receive, Process and Transmit Packets
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The main task of l2fwd_fwd_job() function is to read ingress packets from
> the RX queue of particular port and forward it.
> +This is done using the following code:
> +
> +.. code-block:: c
> +
> + total_nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
> + MAX_PKT_BURST);
> +
> + for (j = 0; j < total_nb_rx; j++) {
> + m = pkts_burst[j];
> + rte_prefetch0(rte_pktmbuf_mtod(m, void *));
> + l2fwd_simple_forward(m, portid);
> + }
> +
> +Packets are read in a burst of size MAX_PKT_BURST.
> +Then, each mbuf in the table is processed by the l2fwd_simple_forward()
> function.
> +The processing is very simple: process the TX port from the RX port, then
> replace the source and destination MAC addresses.
> +
> +The rte_eth_rx_burst() function writes the mbuf pointers in a local table
> and returns the number of available mbufs in the table.
> +
> +After first read second try is issued.
> +
> +.. code::
> +
> + if (total_nb_rx == MAX_PKT_BURST) {
> + const uint16_t nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
> + MAX_PKT_BURST);
> +
> + total_nb_rx += nb_rx;
> + for (j = 0; j < nb_rx; j++) {
> + m = pkts_burst[j];
> + rte_prefetch0(rte_pktmbuf_mtod(m, void *));
> + l2fwd_simple_forward(m, portid);
> + }
> + }
> +
> +This second read is important to give job stats library a feedback how many
> packets was procesed.
> +
> +.. code::
> +
> + /* Adjust period time in which we are running here. */
> + if (rte_jobstats_finish(job, total_nb_rx) != 0) {
> + rte_timer_reset(&qconf->rx_timers[port_idx], job->period,
> PERIODICAL,
> + lcore_id, l2fwd_fwd_job, arg);
> + }
> +
> +To maximize performance exactly MAX_PKT_BURST is expected (the target
> value) to be read for each l2fwd_fwd_job() call.
> +If total_nb_rx is smaller than target value job->period will be increased. If it
> is greater the period will be decreased.
> +
> +.. note::
> +
> + In the following code, one line for getting the output port requires some
> explanation.
> +
> +During the initialization process, a static array of destination ports
> +(l2fwd_dst_ports[]) is filled such that for each source port, a destination
> port is assigned that is either the next or previous enabled port from the
> portmask.
> +Naturally, the number of ports in the portmask must be even, otherwise,
> the application exits.
> +
> +.. code-block:: c
> +
> + static void
> + l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
> + {
> + struct ether_hdr *eth;
> + void *tmp;
> + unsigned dst_port;
> +
> + dst_port = l2fwd_dst_ports[portid];
> +
> + eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
> +
> + /* 02:00:00:00:00:xx */
> +
> + tmp = ð->d_addr.addr_bytes[0];
> +
> + *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t) dst_port <<
> + 40);
> +
> + /* src addr */
> +
> + ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], ð->s_addr);
> +
> + l2fwd_send_packet(m, (uint8_t) dst_port);
> + }
> +
> +Then, the packet is sent using the l2fwd_send_packet (m, dst_port)
> function.
> +For this test application, the processing is exactly the same for all packets
> arriving on the same RX port.
> +Therefore, it would have been possible to call the l2fwd_send_burst()
> +function directly from the main loop to send all the received packets
> +on the same TX port, using the burst-oriented send function, which is more
> efficient.
> +
> +However, in real-life applications (such as, L3 routing), packet N is
> +not necessarily forwarded on the same port as packet N-1.
> +The application is implemented to illustrate that, so the same approach can
> be reused in a more complex application.
> +
> +The l2fwd_send_packet() function stores the packet in a per-lcore and per-
> txport table.
> +If the table is full, the whole packets table is transmitted using the
> l2fwd_send_burst() function:
> +
> +.. code-block:: c
> +
> + /* Send the packet on an output interface */
> +
> + static int
> + l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
> + {
> + unsigned lcore_id, len;
> + struct lcore_queue_conf *qconf;
> +
> + lcore_id = rte_lcore_id();
> + qconf = &lcore_queue_conf[lcore_id];
> + len = qconf->tx_mbufs[port].len;
> + qconf->tx_mbufs[port].m_table[len] = m;
> + len++;
> +
> + /* enough pkts to be sent */
> +
> + if (unlikely(len == MAX_PKT_BURST)) {
> + l2fwd_send_burst(qconf, MAX_PKT_BURST, port);
> + len = 0;
> + }
> +
> + qconf->tx_mbufs[port].len = len; return 0;
> + }
> +
> +To ensure that no packets remain in the tables, the flush job exists.
> +The l2fwd_flush_job() is called periodicaly to for each lcore draining TX
> queue of each port.
> +This technique introduces some latency when there are not many packets
> +to send, however it improves performance:
> +
> +.. code-block:: c
> +
> + static void
> + l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused
> void *arg)
> + {
> + uint64_t now;
> + unsigned lcore_id;
> + struct lcore_queue_conf *qconf;
> + struct mbuf_table *m_table;
> + uint8_t portid;
> +
> + lcore_id = rte_lcore_id();
> + qconf = &lcore_queue_conf[lcore_id];
> +
> + rte_jobstats_start(&qconf->jobs_context, &qconf->flush_job);
> +
> + now = rte_get_timer_cycles();
> + lcore_id = rte_lcore_id();
> + qconf = &lcore_queue_conf[lcore_id];
> + for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
> + m_table = &qconf->tx_mbufs[portid];
> + if (m_table->len == 0 || m_table->next_flush_time <= now)
> + continue;
> +
> + l2fwd_send_burst(qconf, portid);
> + }
> +
> +
> + /* Pass target to indicate that this job is happy of time interval
> + * in which it was called. */
> + rte_jobstats_finish(&qconf->flush_job, qconf->flush_job.target);
> + }
> +
> +.. |l2_fwd_benchmark_setup| image:: img/l2_fwd_benchmark_setup.svg
> +
> +.. |l2_fwd_virtenv_benchmark_setup| image::
> +img/l2_fwd_virtenv_benchmark_setup.png
> --
> 1.9.1
Acked-by Siobhan Butler <siobhan.a. butler@intel.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [dpdk-dev] [PATCH] doc: add l2fwd-jobstats user guide
2015-03-11 9:41 [dpdk-dev] [PATCH] doc: add l2fwd-jobstats user guide Pawel Wodkowski
2015-03-12 7:26 ` Butler, Siobhan A
@ 2015-03-17 20:53 ` Thomas Monjalon
2015-03-18 10:20 ` [dpdk-dev] [PATCH v2] " Pawel Wodkowski
2 siblings, 0 replies; 6+ messages in thread
From: Thomas Monjalon @ 2015-03-17 20:53 UTC (permalink / raw)
To: Pawel Wodkowski; +Cc: dev
Hi Pawel,
2015-03-11 10:41, Pawel Wodkowski:
> doc/guides/sample_app_ug/index.rst | 1 +
> doc/guides/sample_app_ug/l2_forward_job_stats.rst | 637 ++++++++++++++++++++++
> 2 files changed, 638 insertions(+)
My checkpatch has detected 25 trailing whitespaces and 2 typos:
WARNING:TYPO_SPELLING: 'avalable' may be misspelled - perhaps 'available'?
#541: FILE: doc/guides/sample_app_ug/l2_forward_job_stats.rst:470:
+for this lcore. Statistics from this part of code is considered as the headroom avalable fo additional processing.
WARNING:TYPO_SPELLING: 'procesed' may be misspelled - perhaps 'processed'?
#582: FILE: doc/guides/sample_app_ug/l2_forward_job_stats.rst:511:
+This second read is important to give job stats library a feedback how many packets was procesed.
Please send a v2. Thanks
^ permalink raw reply [flat|nested] 6+ messages in thread
* [dpdk-dev] [PATCH v2] doc: add l2fwd-jobstats user guide
2015-03-11 9:41 [dpdk-dev] [PATCH] doc: add l2fwd-jobstats user guide Pawel Wodkowski
2015-03-12 7:26 ` Butler, Siobhan A
2015-03-17 20:53 ` Thomas Monjalon
@ 2015-03-18 10:20 ` Pawel Wodkowski
2015-03-27 15:33 ` De Lara Guarch, Pablo
2 siblings, 1 reply; 6+ messages in thread
From: Pawel Wodkowski @ 2015-03-18 10:20 UTC (permalink / raw)
To: dev
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
---
Changes v2
1. Fix trailing spaces and typos.
2. Add maintaners claim
MAINTAINERS | 1 +
doc/guides/sample_app_ug/index.rst | 1 +
doc/guides/sample_app_ug/l2_forward_job_stats.rst | 637 ++++++++++++++++++++++
3 files changed, 639 insertions(+)
create mode 100644 doc/guides/sample_app_ug/l2_forward_job_stats.rst
diff --git a/MAINTAINERS b/MAINTAINERS
index 07fdf5e..a82e2f0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -391,6 +391,7 @@ Job statistics
M: Pawel Wodkowski <pawelx.wodkowski@intel.com>
F: lib/librte_jobstats/
F: examples/l2fwd-jobstats/
+F: doc/guides/sample_app_ug/l2_forward_job_stats.rst
Test Applications
diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index 5720181..c89a2f0 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -47,6 +47,7 @@ Sample Applications User Guide
ipv4_multicast
ip_reassembly
kernel_nic_interface
+ l2_forward_job_stats
l2_forward_real_virtual
l3_forward
l3_forward_power_man
diff --git a/doc/guides/sample_app_ug/l2_forward_job_stats.rst b/doc/guides/sample_app_ug/l2_forward_job_stats.rst
new file mode 100644
index 0000000..e25d7b1
--- /dev/null
+++ b/doc/guides/sample_app_ug/l2_forward_job_stats.rst
@@ -0,0 +1,637 @@
+.. BSD LICENSE
+ Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+
+ * Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in
+ the documentation and/or other materials provided with the
+ distribution.
+ * Neither the name of Intel Corporation nor the names of its
+ contributors may be used to endorse or promote products derived
+ from this software without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+L2 Forwarding Sample Application (in Real and Virtualized Environments) with core load statistics.
+==================================================================================================
+
+The L2 Forwarding sample application is a simple example of packet processing using
+the Data Plane Development Kit (DPDK) which
+also takes advantage of Single Root I/O Virtualization (SR-IOV) features in a virtualized environment.
+
+.. note::
+
+ This application is a variation of L2 Forwarding sample application. It demonstrate possible
+ scheme of job stats library usage therefore some parts of this document is identical with original
+ L2 forwarding application.
+
+Overview
+--------
+
+The L2 Forwarding sample application, which can operate in real and virtualized environments,
+performs L2 forwarding for each packet that is received.
+The destination port is the adjacent port from the enabled portmask, that is,
+if the first four ports are enabled (portmask 0xf),
+ports 1 and 2 forward into each other, and ports 3 and 4 forward into each other.
+Also, the MAC addresses are affected as follows:
+
+* The source MAC address is replaced by the TX port MAC address
+
+* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID
+
+This application can be used to benchmark performance using a traffic-generator, as shown in the Figure 3.
+
+The application can also be used in a virtualized environment as shown in Figure 4.
+
+The L2 Forwarding application can also be used as a starting point for developing a new application based on the DPDK.
+
+.. _figure_3:
+
+**Figure 3. Performance Benchmark Setup (Basic Environment)**
+
+.. image4_png has been replaced
+
+|l2_fwd_benchmark_setup|
+
+.. _figure_4:
+
+**Figure 4. Performance Benchmark Setup (Virtualized Environment)**
+
+.. image5_png has been renamed
+
+|l2_fwd_virtenv_benchmark_setup|
+
+Virtual Function Setup Instructions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This application can use the virtual function available in the system and
+therefore can be used in a virtual machine without passing through
+the whole Network Device into a guest machine in a virtualized scenario.
+The virtual functions can be enabled in the host machine or the hypervisor with the respective physical function driver.
+
+For example, in a Linux* host machine, it is possible to enable a virtual function using the following command:
+
+.. code-block:: console
+
+ modprobe ixgbe max_vfs=2,2
+
+This command enables two Virtual Functions on each of Physical Function of the NIC,
+with two physical ports in the PCI configuration space.
+It is important to note that enabled Virtual Function 0 and 2 would belong to Physical Function 0
+and Virtual Function 1 and 3 would belong to Physical Function 1,
+in this case enabling a total of four Virtual Functions.
+
+Compiling the Application
+-------------------------
+
+#. Go to the example directory:
+
+ .. code-block:: console
+
+ export RTE_SDK=/path/to/rte_sdk cd ${RTE_SDK}/examples/l2fwd-jobstats
+
+#. Set the target (a default target is used if not specified). For example:
+
+ .. code-block:: console
+
+ export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+ *See the DPDK Getting Started Guide* for possible RTE_TARGET values.
+
+#. Build the application:
+
+ .. code-block:: console
+
+ make
+
+Running the Application
+-----------------------
+
+The application requires a number of command line options:
+
+.. code-block:: console
+
+ ./build/l2fwd-jobstats [EAL options] -- -p PORTMASK [-q NQ] [-l]
+
+where,
+
+* p PORTMASK: A hexadecimal bitmask of the ports to configure
+
+* q NQ: A number of queues (=ports) per lcore (default is 1)
+
+* l: Use locale thousands separator when formatting big numbers.
+
+To run the application in linuxapp environment with 4 lcores, 16 ports, 8 RX queues per lcore and
+thousands separator printing, issue the command:
+
+.. code-block:: console
+
+ $ ./build/l2fwd-jobstats -c f -n 4 -- -q 8 -p ffff -l
+
+Refer to the *DPDK Getting Started Guide* for general information on running applications
+and the Environment Abstraction Layer (EAL) options.
+
+Explanation
+-----------
+
+The following sections provide some explanation of the code.
+
+Command Line Arguments
+~~~~~~~~~~~~~~~~~~~~~~
+
+The L2 Forwarding sample application takes specific parameters,
+in addition to Environment Abstraction Layer (EAL) arguments (see Section 9.3).
+The preferred way to parse parameters is to use the getopt() function,
+since it is part of a well-defined and portable library.
+
+The parsing of arguments is done in the l2fwd_parse_args() function.
+The method of argument parsing is not described here.
+Refer to the *glibc getopt(3)* man page for details.
+
+EAL arguments are parsed first, then application-specific arguments.
+This is done at the beginning of the main() function:
+
+.. code-block:: c
+
+ /* init EAL */
+
+ ret = rte_eal_init(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+
+ argc -= ret;
+ argv += ret;
+
+ /* parse application arguments (after the EAL ones) */
+
+ ret = l2fwd_parse_args(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
+
+Mbuf Pool Initialization
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Once the arguments are parsed, the mbuf pool is created.
+The mbuf pool contains a set of mbuf objects that will be used by the driver
+and the application to store network packet data:
+
+.. code-block:: c
+
+ /* create the mbuf pool */
+ l2fwd_pktmbuf_pool =
+ rte_mempool_create("mbuf_pool", NB_MBUF,
+ MBUF_SIZE, 32,
+ sizeof(struct rte_pktmbuf_pool_private),
+ rte_pktmbuf_pool_init, NULL,
+ rte_pktmbuf_init, NULL,
+ rte_socket_id(), 0);
+
+ if (l2fwd_pktmbuf_pool == NULL)
+ rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+The rte_mempool is a generic structure used to handle pools of objects.
+In this case, it is necessary to create a pool that will be used by the driver,
+which expects to have some reserved space in the mempool structure,
+sizeof(struct rte_pktmbuf_pool_private) bytes.
+The number of allocated pkt mbufs is NB_MBUF, with a size of MBUF_SIZE each.
+A per-lcore cache of 32 mbufs is kept.
+The memory is allocated in rte_socket_id() socket,
+but it is possible to extend this code to allocate one mbuf pool per socket.
+
+Two callback pointers are also given to the rte_mempool_create() function:
+
+* The first callback pointer is to rte_pktmbuf_pool_init() and is used
+ to initialize the private data of the mempool, which is needed by the driver.
+ This function is provided by the mbuf API, but can be copied and extended by the developer.
+
+* The second callback pointer given to rte_mempool_create() is the mbuf initializer.
+ The default is used, that is, rte_pktmbuf_init(), which is provided in the rte_mbuf library.
+ If a more complex application wants to extend the rte_pktmbuf structure for its own needs,
+ a new function derived from rte_pktmbuf_init( ) can be created.
+
+Driver Initialization
+~~~~~~~~~~~~~~~~~~~~~
+
+The main part of the code in the main() function relates to the initialization of the driver.
+To fully understand this code, it is recommended to study the chapters that related to the Poll Mode Driver
+in the *DPDK Programmer's Guide* and the *DPDK API Reference*.
+
+.. code-block:: c
+
+ nb_ports = rte_eth_dev_count();
+
+ if (nb_ports == 0)
+ rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
+
+ if (nb_ports > RTE_MAX_ETHPORTS)
+ nb_ports = RTE_MAX_ETHPORTS;
+
+ /* reset l2fwd_dst_ports */
+
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
+ l2fwd_dst_ports[portid] = 0;
+
+ last_port = 0;
+
+ /*
+ * Each logical core is assigned a dedicated TX queue on each port.
+ */
+ for (portid = 0; portid < nb_ports; portid++) {
+ /* skip ports that are not enabled */
+ if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
+ continue;
+
+ if (nb_ports_in_mask % 2) {
+ l2fwd_dst_ports[portid] = last_port;
+ l2fwd_dst_ports[last_port] = portid;
+ }
+ else
+ last_port = portid;
+
+ nb_ports_in_mask++;
+
+ rte_eth_dev_info_get((uint8_t) portid, &dev_info);
+ }
+
+The next step is to configure the RX and TX queues.
+For each port, there is only one RX queue (only one lcore is able to poll a given port).
+The number of TX queues depends on the number of available lcores.
+The rte_eth_dev_configure() function is used to configure the number of queues for a port:
+
+.. code-block:: c
+
+ ret = rte_eth_dev_configure((uint8_t)portid, 1, 1, &port_conf);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Cannot configure device: "
+ "err=%d, port=%u\n",
+ ret, portid);
+
+The global configuration is stored in a static structure:
+
+.. code-block:: c
+
+ static const struct rte_eth_conf port_conf = {
+ .rxmode = {
+ .split_hdr_size = 0,
+ .header_split = 0, /**< Header Split disabled */
+ .hw_ip_checksum = 0, /**< IP checksum offload disabled */
+ .hw_vlan_filter = 0, /**< VLAN filtering disabled */
+ .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
+ .hw_strip_crc= 0, /**< CRC stripped by hardware */
+ },
+
+ .txmode = {
+ .mq_mode = ETH_DCB_NONE
+ },
+ };
+
+RX Queue Initialization
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The application uses one lcore to poll one or several ports, depending on the -q option,
+which specifies the number of queues per lcore.
+
+For example, if the user specifies -q 4, the application is able to poll four ports with one lcore.
+If there are 16 ports on the target (and if the portmask argument is -p ffff ),
+the application will need four lcores to poll all the ports.
+
+.. code-block:: c
+
+ ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+ rte_eth_dev_socket_id(portid),
+ NULL,
+ l2fwd_pktmbuf_pool);
+
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+The list of queues that must be polled for a given lcore is stored in a private structure called struct lcore_queue_conf.
+
+.. code-block:: c
+
+ struct lcore_queue_conf {
+ unsigned n_rx_port;
+ unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
+ truct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+
+ struct rte_timer rx_timers[MAX_RX_QUEUE_PER_LCORE];
+ struct rte_jobstats port_fwd_jobs[MAX_RX_QUEUE_PER_LCORE];
+
+ struct rte_timer flush_timer;
+ struct rte_jobstats flush_job;
+ struct rte_jobstats idle_job;
+ struct rte_jobstats_context jobs_context;
+
+ rte_atomic16_t stats_read_pending;
+ rte_spinlock_t lock;
+ } __rte_cache_aligned;
+
+Values of struct lcore_queue_conf:
+
+* n_rx_port and rx_port_list[] are used in the main packet processing loop
+ (see Section 9.4.6 "Receive, Process and Transmit Packets" later in this chapter).
+
+* rx_timers and flush_timer are used to ensure forced TX on low packet rate.
+
+* flush_job, idle_job and jobs_context are librte_jobstats objects used for managing l2fwd jobs.
+
+* stats_read_pending and lock are used during job stats read phase.
+
+TX Queue Initialization
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Each lcore should be able to transmit on any port. For every port, a single TX queue is initialized.
+
+.. code-block:: c
+
+ /* init one TX queue on each port */
+
+ fflush(stdout);
+ ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+ rte_eth_dev_socket_id(portid),
+ NULL);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
+ ret, (unsigned) portid);
+
+Jobs statistics initialization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+There are several statistics objects available:
+
+* Flush job statistics
+
+.. code-block:: c
+
+ rte_jobstats_init(&qconf->flush_job, "flush", drain_tsc, drain_tsc,
+ drain_tsc, 0);
+
+ rte_timer_init(&qconf->flush_timer);
+ ret = rte_timer_reset(&qconf->flush_timer, drain_tsc, PERIODICAL,
+ lcore_id, &l2fwd_flush_job, NULL);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to reset flush job timer for lcore %u: %s",
+ lcore_id, rte_strerror(-ret));
+ }
+
+* Statistics per RX port
+
+.. code-block:: c
+
+ rte_jobstats_init(job, name, 0, drain_tsc, 0, MAX_PKT_BURST);
+ rte_jobstats_set_update_period_function(job, l2fwd_job_update_cb);
+
+ rte_timer_init(&qconf->rx_timers[i]);
+ ret = rte_timer_reset(&qconf->rx_timers[i], 0, PERIODICAL, lcore_id,
+ l2fwd_fwd_job, (void *)(uintptr_t)i);
+
+ if (ret < 0) {
+ rte_exit(1, "Failed to reset lcore %u port %u job timer: %s",
+ lcore_id, qconf->rx_port_list[i], rte_strerror(-ret));
+ }
+
+Following parameters are passed to rte_jobstats_init():
+
+* 0 as minimal poll period
+
+* drain_tsc as maximum poll period
+
+* MAX_PKT_BURST as desired target value (RX burst size)
+
+Main loop
+~~~~~~~~~
+
+The forwarding path is reworked comparing to original L2 Forwarding application.
+In the l2fwd_main_loop() function three loops are placed.
+
+.. code-block:: c
+
+ for (;;) {
+ rte_spinlock_lock(&qconf->lock);
+
+ do {
+ rte_jobstats_context_start(&qconf->jobs_context);
+
+ /* Do the Idle job:
+ * - Read stats_read_pending flag
+ * - check if some real job need to be executed
+ */
+ rte_jobstats_start(&qconf->jobs_context, &qconf->idle_job);
+
+ do {
+ uint8_t i;
+ uint64_t now = rte_get_timer_cycles();
+
+ need_manage = qconf->flush_timer.expire < now;
+ /* Check if we was esked to give a stats. */
+ stats_read_pending =
+ rte_atomic16_read(&qconf->stats_read_pending);
+ need_manage |= stats_read_pending;
+
+ for (i = 0; i < qconf->n_rx_port && !need_manage; i++)
+ need_manage = qconf->rx_timers[i].expire < now;
+
+ } while (!need_manage);
+ rte_jobstats_finish(&qconf->idle_job, qconf->idle_job.target);
+
+ rte_timer_manage();
+ rte_jobstats_context_finish(&qconf->jobs_context);
+ } while (likely(stats_read_pending == 0));
+
+ rte_spinlock_unlock(&qconf->lock);
+ rte_pause();
+ }
+
+First inifnite for loop is to minimize impact of stats reading. Lock is only locked/unlocked when asked.
+
+Second inner while loop do the whole jobs management. When any job is ready, the use rte_timer_manage() is used to call the job handler.
+In this place functions l2fwd_fwd_job() and l2fwd_flush_job() are called when needed.
+Then rte_jobstats_context_finish() is called to mark loop end - no other jobs are ready to execute. By this time stats are ready to be read
+and if stats_read_pending is set, loop breaks allowing stats to be read.
+
+Third do-while loop is the idle job (idle stats counter). Its only purpose is moniting if any job is ready or stats job read is pending
+for this lcore. Statistics from this part of code is considered as the headroom available fo additional processing.
+
+Receive, Process and Transmit Packets
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The main task of l2fwd_fwd_job() function is to read ingress packets from the RX queue of particular port and forward it.
+This is done using the following code:
+
+.. code-block:: c
+
+ total_nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ for (j = 0; j < total_nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+
+Packets are read in a burst of size MAX_PKT_BURST.
+Then, each mbuf in the table is processed by the l2fwd_simple_forward() function.
+The processing is very simple: process the TX port from the RX port, then replace the source and destination MAC addresses.
+
+The rte_eth_rx_burst() function writes the mbuf pointers in a local table and returns the number of available mbufs in the table.
+
+After first read second try is issued.
+
+.. code::
+
+ if (total_nb_rx == MAX_PKT_BURST) {
+ const uint16_t nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
+ MAX_PKT_BURST);
+
+ total_nb_rx += nb_rx;
+ for (j = 0; j < nb_rx; j++) {
+ m = pkts_burst[j];
+ rte_prefetch0(rte_pktmbuf_mtod(m, void *));
+ l2fwd_simple_forward(m, portid);
+ }
+ }
+
+This second read is important to give job stats library a feedback how many packets was processed.
+
+.. code::
+
+ /* Adjust period time in which we are running here. */
+ if (rte_jobstats_finish(job, total_nb_rx) != 0) {
+ rte_timer_reset(&qconf->rx_timers[port_idx], job->period, PERIODICAL,
+ lcore_id, l2fwd_fwd_job, arg);
+ }
+
+To maximize performance exactly MAX_PKT_BURST is expected (the target value) to be read for each l2fwd_fwd_job() call.
+If total_nb_rx is smaller than target value job->period will be increased. If it is greater the period will be decreased.
+
+.. note::
+
+ In the following code, one line for getting the output port requires some explanation.
+
+During the initialization process, a static array of destination ports (l2fwd_dst_ports[]) is filled such that for each source port,
+a destination port is assigned that is either the next or previous enabled port from the portmask.
+Naturally, the number of ports in the portmask must be even, otherwise, the application exits.
+
+.. code-block:: c
+
+ static void
+ l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
+ {
+ struct ether_hdr *eth;
+ void *tmp;
+ unsigned dst_port;
+
+ dst_port = l2fwd_dst_ports[portid];
+
+ eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ /* 02:00:00:00:00:xx */
+
+ tmp = ð->d_addr.addr_bytes[0];
+
+ *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t) dst_port << 40);
+
+ /* src addr */
+
+ ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], ð->s_addr);
+
+ l2fwd_send_packet(m, (uint8_t) dst_port);
+ }
+
+Then, the packet is sent using the l2fwd_send_packet (m, dst_port) function.
+For this test application, the processing is exactly the same for all packets arriving on the same RX port.
+Therefore, it would have been possible to call the l2fwd_send_burst() function directly from the main loop
+to send all the received packets on the same TX port,
+using the burst-oriented send function, which is more efficient.
+
+However, in real-life applications (such as, L3 routing),
+packet N is not necessarily forwarded on the same port as packet N-1.
+The application is implemented to illustrate that, so the same approach can be reused in a more complex application.
+
+The l2fwd_send_packet() function stores the packet in a per-lcore and per-txport table.
+If the table is full, the whole packets table is transmitted using the l2fwd_send_burst() function:
+
+.. code-block:: c
+
+ /* Send the packet on an output interface */
+
+ static int
+ l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
+ {
+ unsigned lcore_id, len;
+ struct lcore_queue_conf *qconf;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+ len = qconf->tx_mbufs[port].len;
+ qconf->tx_mbufs[port].m_table[len] = m;
+ len++;
+
+ /* enough pkts to be sent */
+
+ if (unlikely(len == MAX_PKT_BURST)) {
+ l2fwd_send_burst(qconf, MAX_PKT_BURST, port);
+ len = 0;
+ }
+
+ qconf->tx_mbufs[port].len = len; return 0;
+ }
+
+To ensure that no packets remain in the tables, the flush job exists. The l2fwd_flush_job()
+is called periodicaly to for each lcore draining TX queue of each port.
+This technique introduces some latency when there are not many packets to send,
+however it improves performance:
+
+.. code-block:: c
+
+ static void
+ l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused void *arg)
+ {
+ uint64_t now;
+ unsigned lcore_id;
+ struct lcore_queue_conf *qconf;
+ struct mbuf_table *m_table;
+ uint8_t portid;
+
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+
+ rte_jobstats_start(&qconf->jobs_context, &qconf->flush_job);
+
+ now = rte_get_timer_cycles();
+ lcore_id = rte_lcore_id();
+ qconf = &lcore_queue_conf[lcore_id];
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+ m_table = &qconf->tx_mbufs[portid];
+ if (m_table->len == 0 || m_table->next_flush_time <= now)
+ continue;
+
+ l2fwd_send_burst(qconf, portid);
+ }
+
+
+ /* Pass target to indicate that this job is happy of time interval
+ * in which it was called. */
+ rte_jobstats_finish(&qconf->flush_job, qconf->flush_job.target);
+ }
+
+.. |l2_fwd_benchmark_setup| image:: img/l2_fwd_benchmark_setup.svg
+
+.. |l2_fwd_virtenv_benchmark_setup| image:: img/l2_fwd_virtenv_benchmark_setup.png
--
1.9.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [dpdk-dev] [PATCH v2] doc: add l2fwd-jobstats user guide
2015-03-18 10:20 ` [dpdk-dev] [PATCH v2] " Pawel Wodkowski
@ 2015-03-27 15:33 ` De Lara Guarch, Pablo
2015-03-31 0:19 ` Thomas Monjalon
0 siblings, 1 reply; 6+ messages in thread
From: De Lara Guarch, Pablo @ 2015-03-27 15:33 UTC (permalink / raw)
To: Wodkowski, PawelX, dev
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Pawel Wodkowski
> Sent: Wednesday, March 18, 2015 10:21 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: add l2fwd-jobstats user guide
>
> Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> ---
>
> Changes v2
> 1. Fix trailing spaces and typos.
> 2. Add maintaners claim
>
[...]
> diff --git a/doc/guides/sample_app_ug/l2_forward_job_stats.rst
> b/doc/guides/sample_app_ug/l2_forward_job_stats.rst
> new file mode 100644
> index 0000000..e25d7b1
[...]
> + /* Pass target to indicate that this job is happy of time interval
> + * in which it was called. */
> + rte_jobstats_finish(&qconf->flush_job, qconf->flush_job.target);
> + }
> +
> +.. |l2_fwd_benchmark_setup| image:: img/l2_fwd_benchmark_setup.svg
> +
> +.. |l2_fwd_virtenv_benchmark_setup| image::
> img/l2_fwd_virtenv_benchmark_setup.png
> --
> 1.9.1
PDF will not build due to the extension of the images. Can you change them from ".svg" to ".*"?
Thanks,
Pablo
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [dpdk-dev] [PATCH v2] doc: add l2fwd-jobstats user guide
2015-03-27 15:33 ` De Lara Guarch, Pablo
@ 2015-03-31 0:19 ` Thomas Monjalon
0 siblings, 0 replies; 6+ messages in thread
From: Thomas Monjalon @ 2015-03-31 0:19 UTC (permalink / raw)
To: Wodkowski, PawelX; +Cc: dev
2015-03-27 15:33, De Lara Guarch, Pablo:
> > Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> > ---
> >
> > Changes v2
> > 1. Fix trailing spaces and typos.
> > 2. Add maintaners claim
>
> PDF will not build due to the extension of the images. Can you change them from ".svg" to ".*"?
Applied with above fix, thanks
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2015-03-31 0:19 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-11 9:41 [dpdk-dev] [PATCH] doc: add l2fwd-jobstats user guide Pawel Wodkowski
2015-03-12 7:26 ` Butler, Siobhan A
2015-03-17 20:53 ` Thomas Monjalon
2015-03-18 10:20 ` [dpdk-dev] [PATCH v2] " Pawel Wodkowski
2015-03-27 15:33 ` De Lara Guarch, Pablo
2015-03-31 0:19 ` Thomas Monjalon
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).