From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Richardson, Bruce" <bruce.richardson@intel.com>,
Neil Horman <nhorman@tuxdriver.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v3] distributor_app: new sample app
Date: Thu, 2 Oct 2014 09:04:27 +0000 [thread overview]
Message-ID: <2601191342CEEE43887BDE71AB977258213904F6@IRSMSX105.ger.corp.intel.com> (raw)
In-Reply-To: <20141001153730.GA9292@BRICHA3-MOBL>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Wednesday, October 01, 2014 4:38 PM
> To: Neil Horman
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3] distributor_app: new sample app
>
> On Wed, Oct 01, 2014 at 10:56:20AM -0400, Neil Horman wrote:
> > On Wed, Oct 01, 2014 at 02:47:00PM +0000, Pattan, Reshma wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Tuesday, September 30, 2014 2:40 PM
> > > > To: Richardson, Bruce
> > > > Cc: Pattan, Reshma; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v3] distributor_app: new sample app
> > > >
> > > > On Tue, Sep 30, 2014 at 01:18:28PM +0100, Bruce Richardson wrote:
> > > > > On Tue, Sep 30, 2014 at 07:34:45AM -0400, Neil Horman wrote:
> > > > > > On Tue, Sep 30, 2014 at 11:39:37AM +0100, reshmapa wrote:
> > > > > > > From: Reshma Pattan <reshma.pattan@intel.com>
> > > > > > >
> > > > > > > A new sample app that shows the usage of the distributor library.
> > > > > > > This app works as follows:
> > > > > > >
> > > > > > > * An RX thread runs which pulls packets from each ethernet port in turn
> > > > > > > and passes those packets to worker using a distributor component.
> > > > > > > * The workers take the packets in turn, and determine the output port
> > > > > > > for those packets using basic l2forwarding doing an xor on the source
> > > > > > > port id.
> > > > > > > * The RX thread takes the returned packets from the workers and enqueue
> > > > > > > those packets into an rte_ring structure.
> > > > > > > * A TX thread pulls the packets off the rte_ring structure and then
> > > > > > > sends each packet out the output port specified previously by
> > > > > > > the worker
> > > > > > > * Command-line option support provided only for portmask.
> > > > > > >
> > > > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > > > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > > > > > > ---
> > > > > > > examples/Makefile | 1 +
> > > > > > > examples/distributor_app/Makefile | 57 ++++
> > > > > > > examples/distributor_app/main.c | 600
> > > > +++++++++++++++++++++++++++++++++++++
> > > > > > > examples/distributor_app/main.h | 46 +++
> > > > > > > 4 files changed, 704 insertions(+), 0 deletions(-) create mode
> > > > > > > 100644 examples/distributor_app/Makefile create mode 100644
> > > > > > > examples/distributor_app/main.c create mode 100644
> > > > > > > examples/distributor_app/main.h
> > > > > > >
> > > > > > > diff --git a/examples/Makefile b/examples/Makefile index
> > > > > > > 6245f83..2ba82b0 100644
> > > > > > > --- a/examples/Makefile
> > > > > > > +++ b/examples/Makefile
> > > > > > > @@ -66,5 +66,6 @@ DIRS-y += vhost
> > > > > > > DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen DIRS-y += vmdq
> > > > > > > DIRS-y += vmdq_dcb
> > > > > > > +DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor_app
> > > > > > >
> > > > > > > include $(RTE_SDK)/mk/rte.extsubdir.mk diff --git
> > > > > > > a/examples/distributor_app/Makefile
> > > > > > > b/examples/distributor_app/Makefile
> > > > > > > new file mode 100644
> > > > > > > index 0000000..6a5bada
> > > > > > > --- /dev/null
> > > > > > > +++ b/examples/distributor_app/Makefile
> > > > > > > @@ -0,0 +1,57 @@
> > > > > > > +# BSD LICENSE
> > > > > > > +#
> > > > > > > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > > > > +# All rights reserved.
> > > > > > > +#
> > > > > > > +# Redistribution and use in source and binary forms, with or without
> > > > > > > +# modification, are permitted provided that the following conditions
> > > > > > > +# are met:
> > > > > > > +#
> > > > > > > +# * Redistributions of source code must retain the above copyright
> > > > > > > +# notice, this list of conditions and the following disclaimer.
> > > > > > > +# * Redistributions in binary form must reproduce the above copyright
> > > > > > > +# notice, this list of conditions and the following disclaimer in
> > > > > > > +# the documentation and/or other materials provided with the
> > > > > > > +# distribution.
> > > > > > > +# * Neither the name of Intel Corporation nor the names of its
> > > > > > > +# contributors may be used to endorse or promote products derived
> > > > > > > +# from this software without specific prior written permission.
> > > > > > > +#
> > > > > > > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > > > CONTRIBUTORS
> > > > > > > +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> > > > NOT
> > > > > > > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> > > > FITNESS FOR
> > > > > > > +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> > > > COPYRIGHT
> > > > > > > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> > > > INCIDENTAL,
> > > > > > > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
> > > > BUT NOT
> > > > > > > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> > > > LOSS OF USE,
> > > > > > > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> > > > AND ON ANY
> > > > > > > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> > > > TORT
> > > > > > > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
> > > > OF THE USE
> > > > > > > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> > > > DAMAGE.
> > > > > > > +
> > > > > > > +ifeq ($(RTE_SDK),)
> > > > > > > +$(error "Please define RTE_SDK environment variable") endif
> > > > > > > +
> > > > > > > +# Default target, can be overriden by command line or environment
> > > > > > > +RTE_TARGET ?= x86_64-native-linuxapp-gcc
> > > > > > > +
> > > > > > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > +
> > > > > > > +# binary name
> > > > > > > +APP = distributor_app
> > > > > > > +
> > > > > > > +# all source are stored in SRCS-y SRCS-y := main.c
> > > > > > > +
> > > > > > > +CFLAGS += $(WERROR_FLAGS)
> > > > > > > +
> > > > > > > +# workaround for a gcc bug with noreturn attribute #
> > > > > > > +http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
> > > > > > > +ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y) CFLAGS_main.o +=
> > > > > > > +-Wno-return-type endif
> > > > > > > +
> > > > > > > +EXTRA_CFLAGS += -O3 -Wfatal-errors
> > > > > > > +
> > > > > > > +include $(RTE_SDK)/mk/rte.extapp.mk
> > > > > > > diff --git a/examples/distributor_app/main.c
> > > > > > > b/examples/distributor_app/main.c new file mode 100644 index
> > > > > > > 0000000..f555d93
> > > > > > > --- /dev/null
> > > > > > > +++ b/examples/distributor_app/main.c
> > > > > > > @@ -0,0 +1,600 @@
> > > > > > > +/*-
> > > > > > > + * BSD LICENSE
> > > > > > > + *
> > > > > > > + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > > > > + * All rights reserved.
> > > > > > > + *
> > > > > > > + * Redistribution and use in source and binary forms, with or without
> > > > > > > + * modification, are permitted provided that the following conditions
> > > > > > > + * are met:
> > > > > > > + *
> > > > > > > + * * Redistributions of source code must retain the above copyright
> > > > > > > + * notice, this list of conditions and the following disclaimer.
> > > > > > > + * * Redistributions in binary form must reproduce the above copyright
> > > > > > > + * notice, this list of conditions and the following disclaimer in
> > > > > > > + * the documentation and/or other materials provided with the
> > > > > > > + * distribution.
> > > > > > > + * * Neither the name of Intel Corporation nor the names of its
> > > > > > > + * contributors may be used to endorse or promote products derived
> > > > > > > + * from this software without specific prior written permission.
> > > > > > > + *
> > > > > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > > > CONTRIBUTORS
> > > > > > > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
> > > > BUT NOT
> > > > > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> > > > FITNESS FOR
> > > > > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> > > > COPYRIGHT
> > > > > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> > > > INCIDENTAL,
> > > > > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
> > > > BUT NOT
> > > > > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> > > > LOSS OF USE,
> > > > > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> > > > AND ON ANY
> > > > > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> > > > TORT
> > > > > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
> > > > OF THE USE
> > > > > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> > > > DAMAGE.
> > > > > > > + */
> > > > > > > +
> > > > > > > +#include <stdint.h>
> > > > > > > +#include <inttypes.h>
> > > > > > > +#include <unistd.h>
> > > > > > > +#include <signal.h>
> > > > > > > +#include <getopt.h>
> > > > > > > +
> > > > > > > +#include <rte_eal.h>
> > > > > > > +#include <rte_ethdev.h>
> > > > > > > +#include <rte_cycles.h>
> > > > > > > +#include <rte_malloc.h>
> > > > > > > +#include <rte_debug.h>
> > > > > > > +#include <rte_distributor.h>
> > > > > > > +
> > > > > > > +#include "main.h"
> > > > > > > +
> > > > > > > +#define RX_RING_SIZE 256
> > > > > > > +#define RX_FREE_THRESH 32
> > > > > > > +#define RX_PTHRESH 8
> > > > > > > +#define RX_HTHRESH 8
> > > > > > > +#define RX_WTHRESH 0
> > > > > > > +
> > > > > > > +#define TX_RING_SIZE 512
> > > > > > > +#define TX_FREE_THRESH 32
> > > > > > > +#define TX_PTHRESH 32
> > > > > > > +#define TX_HTHRESH 0
> > > > > > > +#define TX_WTHRESH 0
> > > > > > > +#define TX_RSBIT_THRESH 32
> > > > > > > +#define TX_Q_FLAGS (ETH_TXQ_FLAGS_NOMULTSEGS |
> > > > ETH_TXQ_FLAGS_NOVLANOFFL |\
> > > > > > > + ETH_TXQ_FLAGS_NOXSUMSCTP | ETH_TXQ_FLAGS_NOXSUMUDP | \
> > > > > > > + ETH_TXQ_FLAGS_NOXSUMTCP)
> > > > > > > +
> > > > > > > +#define NUM_MBUFS ((64*1024)-1)
> > > > > > > +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) +
> > > > > > > +RTE_PKTMBUF_HEADROOM) #define MBUF_CACHE_SIZE 250 #define
> > > > > > > +BURST_SIZE 32 #define RTE_RING_SZ 1024
> > > > > > > +
> > > > > > > +/* uncommnet below line to enable debug logs */
> > > > > > > +/* #define DEBUG */
> > > > > > > +
> > > > > > > +#ifdef DEBUG
> > > > > > > +#define LOG_LEVEL RTE_LOG_DEBUG
> > > > > > > +#define LOG_DEBUG(log_type, fmt, args...) do { \
> > > > > > > + RTE_LOG(DEBUG, log_type, fmt, ##args) \
> > > > > > > +} while (0)
> > > > > > > +#else
> > > > > > > +#define LOG_LEVEL RTE_LOG_INFO
> > > > > > > +#define LOG_DEBUG(log_type, fmt, args...) do {} while (0) #endif
> > > > > > > +
> > > > > > > +#define RTE_LOGTYPE_DISTRAPP RTE_LOGTYPE_USER1
> > > > > > > +
> > > > > > > +/* mask of enabled ports */
> > > > > > > +static uint32_t enabled_port_mask = 0;
> > > > > > > +
> > > > > > > +static volatile struct app_stats {
> > > > > > > + struct {
> > > > > > > + uint64_t rx_pkts;
> > > > > > > + uint64_t returned_pkts;
> > > > > > > + uint64_t enqueued_pkts;
> > > > > > > + } rx __rte_cache_aligned;
> > > > > > > +
> > > > > > > + struct {
> > > > > > > + uint64_t dequeue_pkts;
> > > > > > > + uint64_t tx_pkts;
> > > > > > > + } tx __rte_cache_aligned;
> > > > > > > +} app_stats;
> > > > > > > +
> > > > > > > +static const struct rte_eth_conf port_conf_default = {
> > > > > > > + .rxmode = {
> > > > > > > + .mq_mode = ETH_MQ_RX_RSS,
> > > > > > > + .max_rx_pkt_len = ETHER_MAX_LEN,
> > > > > > > + .split_hdr_size = 0,
> > > > > > > + .header_split = 0, /**< Header Split disabled */
> > > > > > > + .hw_ip_checksum = 0, /**< IP checksum offload enabled */
> > > > > > > + .hw_vlan_filter = 0, /**< VLAN filtering disabled */
> > > > > > > + .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
> > > > > > > + .hw_strip_crc = 0, /**< CRC stripped by hardware */
> > > > > > > + },
> > > > > > > + .txmode = {
> > > > > > > + .mq_mode = ETH_MQ_TX_NONE,
> > > > > > > + },
> > > > > > > + .lpbk_mode = 0,
> > > > > > > + .rx_adv_conf = {
> > > > > > > + .rss_conf = {
> > > > > > > + .rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6 |
> > > > > > > + ETH_RSS_IPV4_TCP |
> > > > ETH_RSS_IPV4_UDP |
> > > > > > > + ETH_RSS_IPV6_TCP |
> > > > ETH_RSS_IPV6_UDP,
> > > > > > > + }
> > > > > > > + },
> > > > > > > +};
> > > > > > > +
> > > > > > > +static const struct rte_eth_rxconf rx_conf_default = {
> > > > > > > + .rx_thresh = {
> > > > > > > + .pthresh = RX_PTHRESH,
> > > > > > > + .hthresh = RX_HTHRESH,
> > > > > > > + .wthresh = RX_WTHRESH,
> > > > > > > + },
> > > > > > > + .rx_free_thresh = RX_FREE_THRESH,
> > > > > > > + .rx_drop_en = 0,
> > > > > > > +};
> > > > > > > +
> > > > > > > +static const struct rte_eth_txconf tx_conf_default = {
> > > > > > > + .tx_thresh = {
> > > > > > > + .pthresh = TX_PTHRESH,
> > > > > > > + .hthresh = TX_HTHRESH,
> > > > > > > + .wthresh = TX_WTHRESH,
> > > > > > > + },
> > > > > > > + .tx_free_thresh = TX_FREE_THRESH,
> > > > > > > + .tx_rs_thresh = TX_RSBIT_THRESH,
> > > > > > > + .txq_flags = TX_Q_FLAGS
> > > > > > > +
> > > > > > > +};
> > > > > > > +
> > > > > > > +struct output_buffer {
> > > > > > > + unsigned count;
> > > > > > > + struct rte_mbuf *mbufs[BURST_SIZE]; };
> > > > > > > +
> > > > > > > +/*
> > > > > > > + * Initialises a given port using global settings and with the rx
> > > > > > > +buffers
> > > > > > > + * coming from the mbuf_pool passed as parameter */ static
> > > > > > > +inline int port_init(uint8_t port, struct rte_mempool *mbuf_pool)
> > > > > > > +{
> > > > > > > + struct rte_eth_conf port_conf = port_conf_default;
> > > > > > > + const uint16_t rxRings = 1, txRings = rte_lcore_count() - 1;
> > > > > > > + int retval;
> > > > > > > + uint16_t q;
> > > > > > > +
> > > > > > > + if (port >= rte_eth_dev_count())
> > > > > > > + return -1;
> > > > > > > +
> > > > > > > + retval = rte_eth_dev_configure(port, rxRings, txRings, &port_conf);
> > > > > > > + if (retval != 0)
> > > > > > > + return retval;
> > > > > > > +
> > > > > > > + for (q = 0; q < rxRings; q++) {
> > > > > > > + retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
> > > > > > > + rte_eth_dev_socket_id(port),
> > > > > > > + &rx_conf_default, mbuf_pool);
> > > > > > > + if (retval < 0)
> > > > > > > + return retval;
> > > > > > > + }
> > > > > > > +
> > > > > > > + for (q = 0; q < txRings; q++) {
> > > > > > > + retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
> > > > > > > + rte_eth_dev_socket_id(port),
> > > > > > > + &tx_conf_default);
> > > > > > > + if (retval < 0)
> > > > > > > + return retval;
> > > > > > > + }
> > > > > > > +
> > > > > > > + retval = rte_eth_dev_start(port);
> > > > > > > + if (retval < 0)
> > > > > > > + return retval;
> > > > > > > +
> > > > > > > + struct rte_eth_link link;
> > > > > > > + rte_eth_link_get_nowait(port, &link);
> > > > > > > + if (!link.link_status) {
> > > > > > > + sleep(1);
> > > > > > > + rte_eth_link_get_nowait(port, &link);
> > > > > > > + }
> > > > > > > +
> > > > > > > + if (!link.link_status) {
> > > > > > > + printf("Link down on port %"PRIu8"\n", port);
> > > > > > > + return 0;
> > > > > > > + }
> > > > > > > +
> > > > > > > + struct ether_addr addr;
> > > > > > > + rte_eth_macaddr_get(port, &addr);
> > > > > > > + printf("Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8
> > > > > > > + " %02"PRIx8" %02"PRIx8" %02"PRIx8"\n",
> > > > > > > + (unsigned)port,
> > > > > > > + addr.addr_bytes[0], addr.addr_bytes[1],
> > > > > > > + addr.addr_bytes[2], addr.addr_bytes[3],
> > > > > > > + addr.addr_bytes[4], addr.addr_bytes[5]);
> > > > > > > +
> > > > > > > + rte_eth_promiscuous_enable(port);
> > > > > > > +
> > > > > > > + return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +struct lcore_params {
> > > > > > > + unsigned worker_id;
> > > > > > > + struct rte_distributor *d;
> > > > > > > + struct rte_ring *r;
> > > > > > > +};
> > > > > > > +
> > > > > > > +static __attribute__((noreturn)) void lcore_rx(struct
> > > > > > > +lcore_params *p) {
> > > > > > > + struct rte_distributor *d = p->d;
> > > > > > > + struct rte_ring *r = p->r;
> > > > > > > + const uint8_t nb_ports = rte_eth_dev_count();
> > > > > > > + const int socket_id = rte_socket_id();
> > > > > > > + uint8_t port;
> > > > > > > +
> > > > > > > + for (port = 0; port < nb_ports; port++) {
> > > > > > > + /* skip ports that are not enabled */
> > > > > > > + if ((enabled_port_mask & (1 << port)) == 0)
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + if (rte_eth_dev_socket_id(port) > 0 &&
> > > > > > > + rte_eth_dev_socket_id(port) != socket_id)
> > > > > > > + printf("WARNING, port %u is on remote NUMA node to
> > > > "
> > > > > > > + "RX thread.\n\tPerformance will not "
> > > > > > > + "be optimal.\n", port);
> > > > > > > + }
> > > > > > > +
> > > > > > > + printf("\nCore %u doing packet RX.\n", rte_lcore_id());
> > > > > > > + port = 0;
> > > > > > > + for (;;) {
> > > > > > > + /* skip ports that are not enabled */
> > > > > > > + if ((enabled_port_mask & (1 << port)) == 0) {
> > > > > > > + if (++port == nb_ports)
> > > > > > > + port = 0;
> > > > > > > + continue;
> > > > > > > + }
> > > > > > > + struct rte_mbuf *bufs[BURST_SIZE*2];
> > > > > > > + const uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs,
> > > > > > > + BURST_SIZE);
> > > > > > > + app_stats.rx.rx_pkts += nb_rx;
> > > > > > > +
> > > > > > > + rte_distributor_process(d, bufs, nb_rx);
> > > > > > > + const uint16_t nb_ret = rte_distributor_returned_pkts(d,
> > > > > > > + bufs, BURST_SIZE*2);
> > > > > > > + app_stats.rx.returned_pkts += nb_ret;
> > > > > > > + if (unlikely(nb_ret == 0))
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + uint16_t sent = rte_ring_enqueue_burst(r, (void *)bufs, nb_ret);
> > > > > > > + app_stats.rx.enqueued_pkts += sent;
> > > > > > > + if (unlikely(sent < nb_ret)) {
> > > > > > > + LOG_DEBUG(DISTRAPP, "%s:Packet loss due to full
> > > > ring\n", __func__);
> > > > > > > + while (sent < nb_ret)
> > > > > > > + rte_pktmbuf_free(bufs[sent++]);
> > > > > > > + }
> > > > > > > + if (++port == nb_ports)
> > > > > > > + port = 0;
> > > > > > > + }
> > > > > > > +}
> > > > > > > +
> > > > > > > +static inline void
> > > > > > > +flush_one_port(struct output_buffer *outbuf, uint8_t outp) {
> > > > > > > + unsigned nb_tx = rte_eth_tx_burst(outp, 0, outbuf->mbufs,
> > > > > > > + outbuf->count);
> > > > > > > + app_stats.tx.tx_pkts += nb_tx;
> > > > > > > +
> > > > > > > + if (unlikely(nb_tx < outbuf->count)) {
> > > > > > > + LOG_DEBUG(DISTRAPP, "%s:Packet loss with tx_burst\n",
> > > > __func__);
> > > > > > > + do {
> > > > > > > + rte_pktmbuf_free(outbuf->mbufs[nb_tx]);
> > > > > > > + } while (++nb_tx < outbuf->count);
> > > > > > > + }
> > > > > > > + outbuf->count = 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static inline void
> > > > > > > +flush_all_ports(struct output_buffer *tx_buffers, uint8_t
> > > > > > > +nb_ports) {
> > > > > > > + uint8_t outp;
> > > > > > > + for (outp = 0; outp < nb_ports; outp++) {
> > > > > > > + /* skip ports that are not enabled */
> > > > > > > + if ((enabled_port_mask & (1 << outp)) == 0)
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + if (tx_buffers[outp].count == 0)
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + flush_one_port(&tx_buffers[outp], outp);
> > > > > > > + }
> > > > > > > +}
> > > > > > > +
> > > > > > > +static __attribute__((noreturn)) void lcore_tx(struct rte_ring
> > > > > > > +*in_r) {
> > > > > > > + static struct output_buffer tx_buffers[RTE_MAX_ETHPORTS];
> > > > > > > + const uint8_t nb_ports = rte_eth_dev_count();
> > > > > > > + const int socket_id = rte_socket_id();
> > > > > > > + uint8_t port;
> > > > > > > +
> > > > > > > + for (port = 0; port < nb_ports; port++) {
> > > > > > > + /* skip ports that are not enabled */
> > > > > > > + if ((enabled_port_mask & (1 << port)) == 0)
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + if (rte_eth_dev_socket_id(port) > 0 &&
> > > > > > > + rte_eth_dev_socket_id(port) != socket_id)
> > > > > > > + printf("WARNING, port %u is on remote NUMA node to
> > > > "
> > > > > > > + "TX thread.\n\tPerformance will not "
> > > > > > > + "be optimal.\n", port);
> > > > > > > + }
> > > > > > > +
> > > > > > > + printf("\nCore %u doing packet TX.\n", rte_lcore_id());
> > > > > > > + for (;;) {
> > > > > > > + for (port = 0; port < nb_ports; port++) {
> > > > > > > + /* skip ports that are not enabled */
> > > > > > > + if ((enabled_port_mask & (1 << port)) == 0)
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + struct rte_mbuf *bufs[BURST_SIZE];
> > > > > > > + const uint16_t nb_rx = rte_ring_dequeue_burst(in_r,
> > > > > > > + (void *)bufs, BURST_SIZE);
> > > > > > > + app_stats.tx.dequeue_pkts += nb_rx;
> > > > > > > +
> > > > > > > + /* if we get no traffic, flush anything we have */
> > > > > > > + if (unlikely(nb_rx == 0)) {
> > > > > > > + flush_all_ports(tx_buffers, nb_ports);
> > > > > > > + continue;
> > > > > > > + }
> > > > > > > +
> > > > > > > + /* for traffic we receive, queue it up for transmit */
> > > > > > > + uint16_t i;
> > > > > > > + _mm_prefetch(bufs[0], 0);
> > > > > > > + _mm_prefetch(bufs[1], 0);
> > > > > > > + _mm_prefetch(bufs[2], 0);
> > > > > > > + for (i = 0; i < nb_rx; i++) {
> > > > > > > + struct output_buffer *outbuf;
> > > > > > > + uint8_t outp;
> > > > > > > + _mm_prefetch(bufs[i + 3], 0);
> > > > > > > + /* workers should update in_port to hold the
> > > > > > > + * output port value */
> > > > > > > + outp = bufs[i]->port;
> > > > > > > + /* skip ports that are not enabled */
> > > > > > > + if ((enabled_port_mask & (1 << outp)) == 0)
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + outbuf = &tx_buffers[outp];
> > > > > > > + outbuf->mbufs[outbuf->count++] = bufs[i];
> > > > > > > + if (outbuf->count == BURST_SIZE)
> > > > > > > + flush_one_port(outbuf, outp);
> > > > > > > + }
> > > > > > > + }
> > > > > > > + }
> > > > > > > +}
> > > > > > > +
> > > > > > > +
> > > > > > > +static __attribute__((noreturn)) void lcore_worker(struct
> > > > > > > +lcore_params *p) {
> > > > > > > + struct rte_distributor *d = p->d;
> > > > > > > + const unsigned id = p->worker_id;
> > > > > > > + /* for single port, xor_val will be zero so we won't modify the output
> > > > > > > + * port, otherwise we send traffic from 0 to 1, 2 to 3, and vice versa
> > > > > > > + */
> > > > > > > + const unsigned xor_val = (rte_eth_dev_count() > 1);
> > > > > > > + struct rte_mbuf *buf = NULL;
> > > > > > > +
> > > > > > > + printf("\nCore %u acting as worker core.\n", rte_lcore_id());
> > > > > > > + for (;;) {
> > > > > > > + buf = rte_distributor_get_pkt(d, id, buf);
> > > > > > > + buf->port ^= xor_val;
> > > > > > > + }
> > > > > > > +}
> > > > > > > +
> > > > > > > +static void
> > > > > > > +int_handler(int sig_num)
> > > > > > > +{
> > > > > > > + struct rte_eth_stats eth_stats;
> > > > > > > + unsigned i;
> > > > > > > +
> > > > > > > + printf("Exiting on signal %d\n", sig_num);
> > > > > > > +
> > > > > > > + printf("\nRX thread stats:\n");
> > > > > > > + printf(" - Received: %"PRIu64"\n", app_stats.rx.rx_pkts);
> > > > > > > + printf(" - Processed: %"PRIu64"\n", app_stats.rx.returned_pkts);
> > > > > > > + printf(" - Enqueued: %"PRIu64"\n", app_stats.rx.enqueued_pkts);
> > > > > > > +
> > > > > > > + printf("\nTX thread stats:\n");
> > > > > > > + printf(" - Dequeued: %"PRIu64"\n", app_stats.tx.dequeue_pkts);
> > > > > > > + printf(" - Transmitted: %"PRIu64"\n", app_stats.tx.tx_pkts);
> > > > > > > +
> > > > > > > + for (i = 0; i < rte_eth_dev_count(); i++) {
> > > > > > > + rte_eth_stats_get(i, ð_stats);
> > > > > > > + printf("\nPort %u stats:\n", i);
> > > > > > > + printf(" - Pkts in: %"PRIu64"\n", eth_stats.ipackets);
> > > > > > > + printf(" - Pkts out: %"PRIu64"\n", eth_stats.opackets);
> > > > > > > + printf(" - In Errs: %"PRIu64"\n", eth_stats.ierrors);
> > > > > > > + printf(" - Out Errs: %"PRIu64"\n", eth_stats.oerrors);
> > > > > > > + printf(" - Mbuf Errs: %"PRIu64"\n", eth_stats.rx_nombuf);
> > > > > > > + }
> > > > > > > + exit(0);
> > > > > > rte_exit here? Also, this is a pretty ungraceful exit strategy as
> > > > > > all the threads you've created and memory you've allocated are just
> > > > forgotten here.
> > > > > > Given that dpdk mempools are shared, this has the potential to leak
> > > > > > lots of memory if other apps are using the dpdk at the same time
> > > > > > that you run this. You probably want to use the sigint handler to
> > > > > > raise a flag to the tx/rx threads to shutdown gracefully, and then free your
> > > > allocated memory and mempool.
> > > > > >
> > > > > > Neil
> > > > > >
> > > > >
> > > > > Unless the different processes are explicitly cooperating as
> > > > > primary/secondary, the mempools are not shared. I just don't see the
> > > > > need for this app to do more cleanup on ctrl-c signal, as it's not
> > > > > intended to be a multiprocess app, and there is little that any
> > > > > secondary process could do to work with this app, except possibly some
> > > > > resource monitoring, which would be completely unaffected by it exiting the
> > > > way it does.
> > > > >
> > > > Ah, ok, so we don't use a common shared pool between isolated processes
> > > > then, thats good. Still though, this is a sample application, I think its lazy
> > > > programming practice to illustrate to application developers that its generally ok
> > > > to exit programs without freeing your resources. Its about 20 lines of additional
> > > > code to change the sigint handler to flag an exit condition, and have all the
> > > > other threads join on it.
> > > >
> > > > Neil
> > >
> > > 1)I had sent v5 patch which handles graceful shutdown of rx and tx threads upon SIGINT
> > I see it and will take a look shortly, thanks.
> >
> > > 2)Worker thread graceful shutdown was not handled as of now as it needs some change in lcore_worker logic , which will be done
> in future enhancements.
> > Not sure I understand what you mean here. Can you elaborate?
> >
> > > 3)Freeing of mempool is also not handled , as the framework support is not available.
> > Ew, I hadn't noticed that, freeing of mempools seems like something we should
> > implement.
> >
> > > 4)Cleaning of rx/tx queues not done, as it needs some extensive logic which we haven't planned as of now. Will check the
> possibility of doing it in future enhancements i.e in next version of sample application.
> > We can't just flush the queues after we shutdown the workers? I presume a queue
> > flush operation exists, yes?
> > Neil
>
> Other than code hygiene, which does have some value in itself, I can't
> really see what the practical point of such cleanup would be.
>
> If traffic is going through the system, and the process is killed packets
> will be dropped, whatever we do, as packet reception will stop. If traffic
> is not going through the system, then there are no packets in flight and
> therefore no relevant cleanup to be done. [And if the traffic is stopped
> just before shutting down the app, at a throughput rate of a couple of
> million packets per second, the app should be flushed of packets within tiny
> fractions of a second].
I think that in theory not resetting HW at process termination can cause a problem.
Something like that:
- DPDK app has HW RX/TX queues active with armed RXDs, but the link is idle (no packets are flying).
- DPDK app terminates abnormally.
- user deletes DPDK hugepages files.
- Hugepage memory that was used by DPDK for RXDs/data buffers are given to other app (or kernel).
- Packet arrives -a s HW is still active - it will do a write to the RXD and data-buffer.
- Silent memory corruption.
Saying that, I don't think we can completely eliminate that problem from user-space code -
as not process signals can be handled.
>From other side - our whole concept is to move away from custom kernel modules...
Though we probably can make it less possible to happen - create a termination handler,
that would try to reset all active HW.
And make sure it is called atexit and all catchable signals that cause process termination.
But I don' t think it is a good idea to duplicate such code in each and every sample app.
I think it should be in the librte_eal, and yes - it is a subject of a separate patch/discussion :)
Konstantin
> So overall, I just don't see complicated jumping through hoops for flushing
> and cleaning up things being worth the effort. These apps are designed to
> run in a forever loop, and, in the exception case, when killed the cleanup
> done by kernel is sufficient.
>
> regards,
> /Bruce
next prev parent reply other threads:[~2014-10-02 8:57 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-16 12:13 [dpdk-dev] [PATCH 0/3] distributor_app: new sample application for distributor library reshmapa
2014-09-16 12:13 ` [dpdk-dev] [PATCH 1/3] distributor_app: new sample app reshmapa
2014-09-16 12:13 ` [dpdk-dev] [PATCH 2/3] distributor_app: code review comments implementation reshmapa
2014-09-16 12:13 ` [dpdk-dev] [PATCH 3/3] distributor_app: removed extra spaces reshmapa
2014-09-23 12:55 ` [dpdk-dev] [PATCH 0/3] distributor_app: new sample application for distributor library Bruce Richardson
2014-09-24 14:16 ` [dpdk-dev] [PATCH v2] distributor_app: new sample app reshmapa
2014-09-26 15:11 ` De Lara Guarch, Pablo
2014-09-26 15:51 ` Ananyev, Konstantin
2014-09-29 12:39 ` Pattan, Reshma
2014-09-29 13:06 ` Ananyev, Konstantin
2014-09-29 13:35 ` De Lara Guarch, Pablo
2014-09-29 14:35 ` Neil Horman
2014-09-30 8:02 ` Pattan, Reshma
2014-09-30 9:21 ` Ananyev, Konstantin
2014-09-30 10:39 ` [dpdk-dev] [PATCH v3] " reshmapa
2014-09-30 11:34 ` Neil Horman
2014-09-30 12:18 ` Bruce Richardson
2014-09-30 13:39 ` Neil Horman
2014-10-01 14:47 ` Pattan, Reshma
2014-10-01 14:56 ` Neil Horman
2014-10-01 15:37 ` Bruce Richardson
2014-10-01 16:07 ` Neil Horman
2014-10-06 14:16 ` Pattan, Reshma
2014-10-06 14:44 ` Neil Horman
2014-10-06 17:34 ` Pattan, Reshma
2014-10-06 19:02 ` Neil Horman
2014-10-02 9:04 ` Ananyev, Konstantin [this message]
2014-10-01 13:33 ` [dpdk-dev] [PATCH v4] distributor_app: gracefull shutdown of tx/rx threads on SIGINT reshmapa
2014-10-01 13:46 ` Pattan, Reshma
2014-10-01 13:49 ` Thomas Monjalon
2014-10-01 14:33 ` [dpdk-dev] [PATCH v5] distributor_app: new sample app reshmapa
2014-10-17 13:59 ` [dpdk-dev] [PATCH v6] " Reshma Pattan
2014-11-02 20:08 ` De Lara Guarch, Pablo
2014-11-03 15:49 ` [dpdk-dev] [PATCH v7] " Reshma Pattan
2014-11-03 16:03 ` De Lara Guarch, Pablo
2014-11-13 21:30 ` Thomas Monjalon
2014-11-14 8:44 ` Pattan, Reshma
2014-11-16 21:59 ` Thomas Monjalon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2601191342CEEE43887BDE71AB977258213904F6@IRSMSX105.ger.corp.intel.com \
--to=konstantin.ananyev@intel.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=nhorman@tuxdriver.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).