Date: Thu, 12 Sep 2019 10:52:51 +0100
From: Bruce Richardson
To: Marcin Baran
Cc: dev@dpdk.org, Pawel Modrak
Message-ID: <20190912095251.GA1913@bricha3-MOBL.ger.corp.intel.com>
References: <20190909082939.1629-1-marcinx.baran@intel.com>
In-Reply-To: <20190909082939.1629-1-marcinx.baran@intel.com>
Subject: Re: [dpdk-dev] [PATCH] examples/ioat: create sample app on ioat driver usage

On Mon, Sep 09, 2019 at 10:29:38AM +0200, Marcin Baran wrote:
> From: Pawel Modrak
>
> A new sample app demonstrating use of driver for CBDMA.
> The app receives packets, performs software or hardware
> copy, changes packets' MAC addresses (if enabled) and
> forwards them.
> The patch includes sample application
> as well as it's guide.
>
> Signed-off-by: Pawel Modrak
> Signed-off-by: Marcin Baran
> ---

Thanks, Pawel and Marcin. Some comments on doc and code inline below.

>  doc/guides/sample_app_ug/index.rst | 1 +
>  doc/guides/sample_app_ug/intro.rst | 4 +
>  doc/guides/sample_app_ug/ioat.rst | 691 +++++++++++++++++++
>  examples/Makefile | 3 +
>  examples/ioat/Makefile | 54 ++
>  examples/ioat/ioatfwd.c | 1010 ++++++++++++++++++++++++++++
>  examples/ioat/meson.build | 13 +
>  examples/meson.build | 1 +
>  8 files changed, 1777 insertions(+)
>  create mode 100644 doc/guides/sample_app_ug/ioat.rst
>  create mode 100644 examples/ioat/Makefile
>  create mode 100644 examples/ioat/ioatfwd.c
>  create mode 100644 examples/ioat/meson.build
>
> diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
> index f23f8f59e..a6a1d9e7a 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -23,6 +23,7 @@ Sample Applications User Guides
>      ip_reassembly
>      kernel_nic_interface
>      keep_alive
> +    ioat
>      l2_forward_crypto
>      l2_forward_job_stats
>      l2_forward_real_virtual
> diff --git a/doc/guides/sample_app_ug/intro.rst b/doc/guides/sample_app_ug/intro.rst
> index 90704194a..74462312f 100644
> --- a/doc/guides/sample_app_ug/intro.rst
> +++ b/doc/guides/sample_app_ug/intro.rst
> @@ -91,6 +91,10 @@ examples are highlighted below.
>    forwarding, or ``l3fwd`` application does forwarding based on Internet
>    Protocol, IPv4 or IPv6 like a simple router.
>
> +* :doc:`Hardware packet copying`: The Hardware packet copying,
> +  or ``ioatfwd`` application demonstrates how to use IOAT rawdev driver for
> +  copying packets between two threads.
> +
>  * :doc:`Packet Distributor`: The Packet Distributor
>    demonstrates how to distribute packets arriving on an Rx port to different
>    cores for processing and transmission.
> diff --git a/doc/guides/sample_app_ug/ioat.rst b/doc/guides/sample_app_ug/ioat.rst
> new file mode 100644
> index 000000000..378d70b81
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/ioat.rst
> @@ -0,0 +1,691 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2019 Intel Corporation.
> +
> +Sample Application of packet copying using Intel\|reg| QuickData Technology
> +============================================================================

You need a space before the |reg| bit, otherwise the substitution is not
applied and the symbol does not appear. It should be "Intel\ |reg|".

> +
> +Overview
> +--------
> +
> +This sample is intended as a demonstration of the basic components of a DPDK
> +forwarding application and example of how to use IOAT driver API to make
> +packets copies.
> +
> +Also while forwarding, the MAC addresses are affected as follows:
> +
> +* The source MAC address is replaced by the TX port MAC address
> +
> +* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID
> +
> +This application can be used to compare performance of using software packet
> +copy with copy done using a DMA device for different sizes of packets.
> +The example will print out statistics each second. The stats shows
> +received/send packets and packets dropped or failed to copy.
> +
> +Compiling the Application
> +-------------------------
> +
> +To compile the sample application see :doc:`compiling`.
> +
> +The application is located in the ``ioat`` sub-directory.
> +
> +
> +Running the Application
> +-----------------------
> +
> +In order to run the hardware copy application, the copying device
> +needs to be bound to user-space IO driver.
> +
> +Refer to the *IOAT Rawdev Driver for Intel\ |reg| QuickData Technology*
> +guide for information on using the driver.
> +
> +The application requires a number of command line options:
> +
> +.. code-block:: console
> +
> +    ./build/ioatfwd [EAL options] -- -p MASK [-C CT] [--[no-]mac-updating]

I think the app uses lower case "c" rather than upper case, as called out
below. Since the "CT" value can only be one of two possibilities, I think
you should explicitly include them, e.g. "[-c ]". "rawdev" is also a rather
long name for this parameter value; why not just call the two values sw
and hw?

> +
> +where,
> +
> +* p MASK: A hexadecimal bitmask of the ports to configure
> +
> +* c CT: Performed packet copy type: software (sw) or hardware using
> +  DMA (rawdev)
> +
> +* s RS: size of IOAT rawdev ring for hardware copy mode or rte_ring for
> +  software copy mode
> +

This parameter is missing from the summary above.

> +* --[no-]mac-updating: Whether MAC address of packets should be changed
> +  or not
> +
> +The application can be launched in 2 different configurations:
> +
> +* Performing software packet copying
> +
> +* Performing hardware packet copying

Two thoughts here:
a) is this not obvious from the parameter list?
b) is it not more than two configurations, given that you can have:
  * sw copy with mac updating
  * sw copy without mac updating
  * etc.
  not including the possible port-mask, ring size and single-core vs
  two-core configurations.

> +
> +Each port needs 2 lcores: one of them receives incoming traffic and makes
> +a copy of each packet. The second lcore then updates MAC address and sends
> +the copy. For each configuration an additional lcore is needed since
> +master lcore in use which is responsible for configuration, statistics
> +printing and safe deinitialization of all ports and devices.
> +

I believe the app also supports running with 1 or 2 cores total, right?

> +The application can use a maximum of 8 ports.

Why this limitation?

> +
> +To run the application in a Linux environment with 3 lcores (one of them
> +is master lcore), 1 port (port 0), software copying and MAC updating issue
> +the command:
> +
> +.. code-block:: console
> +
> +    $ ./build/ioatfwd -l 0-2 -n 2 -- -p 0x1 --mac-updating -c sw
> +
> +To run the application in a Linux environment with 5 lcores (one of them
> +is master lcore), 2 ports (ports 0 and 1), hardware copying and no MAC
> +updating issue the command:
> +
> +.. code-block:: console
> +
> +    $ ./build/ioatfwd -l 0-4 -n 1 -- -p 0x3 --no-mac-updating -c rawdev
> +
> +Refer to the *DPDK Getting Started Guide* for general information on
> +running applications and the Environment Abstraction Layer (EAL) options.
> +
> diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c
> new file mode 100644
> index 000000000..8463d82f3
> --- /dev/null
> +++ b/examples/ioat/ioatfwd.c
> @@ -0,0 +1,1010 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2019 Intel Corporation
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +#include
> +
> +/* size of ring used for software copying between rx and tx. */
> +#define RTE_LOGTYPE_IOAT RTE_LOGTYPE_USER1
> +#define MAX_PKT_BURST 32

This seems a low max; I assume this is actually the default burst size?
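
Coming back to the copy-type naming raised in the doc comments above, a
minimal sketch of how a short "sw"/"hw" value could be mapped to a mode
follows. The enum and the parse_copy_mode() helper are hypothetical names
used only for illustration; they are not taken from the patch, which
defines its own copy_mode_t.

```c
#include <string.h>

/* Hypothetical copy modes; the names are illustrative only. */
enum copy_mode {
	COPY_MODE_SW,      /* software copy */
	COPY_MODE_HW,      /* hardware (DMA) copy */
	COPY_MODE_INVALID  /* unrecognised or missing argument */
};

/* Map the option string to a copy mode, rejecting anything else. */
static inline enum copy_mode
parse_copy_mode(const char *arg)
{
	if (arg == NULL)
		return COPY_MODE_INVALID;
	if (strcmp(arg, "sw") == 0)
		return COPY_MODE_SW;
	if (strcmp(arg, "hw") == 0)
		return COPY_MODE_HW;
	return COPY_MODE_INVALID;
}
```

With something along these lines the argument parser can print the usage
text and exit whenever COPY_MODE_INVALID comes back, so the usage string
and the accepted values stay in step.
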
> +#define MEMPOOL_CACHE_SIZE 512
> +#define MIN_POOL_SIZE 65536U
> +#define CMD_LINE_OPT_MAC_UPDATING "mac-updating"
> +#define CMD_LINE_OPT_NO_MAC_UPDATING "no-mac-updating"
> +#define CMD_LINE_OPT_PORTMASK "portmask"
> +#define CMD_LINE_OPT_NB_QUEUE "nb-queue"
> +#define CMD_LINE_OPT_COPY_TYPE "copy-type"
> +#define CMD_LINE_OPT_RING_SIZE "ring-size"
> +
> +/* configurable number of RX/TX ring descriptors */
> +#define RX_DEFAULT_RINGSIZE 1024
> +#define TX_DEFAULT_RINGSIZE 1024
> +
> +/* max number of RX queues per port */
> +#define MAX_RX_QUEUES_COUNT 8
> +
> +struct rxtx_port_config {
> +	/* common config */
> +	uint16_t rxtx_port;
> +	uint16_t nb_queues;
> +	/* for software copy mode */
> +	struct rte_ring *rx_to_tx_ring;
> +	/* for IOAT rawdev copy mode */
> +	uint16_t ioat_ids[MAX_RX_QUEUES_COUNT];
> +};
> +
> +struct rxtx_transmission_config {
> +	struct rxtx_port_config ports[RTE_MAX_ETHPORTS];
> +	uint16_t nb_ports;
> +	uint16_t nb_lcores;
> +};
> +
> +/* per-port statistics struct */
> +struct ioat_port_statistics {
> +	uint64_t rx[RTE_MAX_ETHPORTS];
> +	uint64_t tx[RTE_MAX_ETHPORTS];
> +	uint64_t tx_dropped[RTE_MAX_ETHPORTS];
> +	uint64_t copy_dropped[RTE_MAX_ETHPORTS];
> +};
> +struct ioat_port_statistics port_statistics;
> +
> +struct total_statistics {
> +	uint64_t total_packets_dropped;
> +	uint64_t total_packets_tx;
> +	uint64_t total_packets_rx;
> +	uint64_t total_successful_enqueues;
> +	uint64_t total_failed_enqueues;
> +};
> +
> +typedef enum copy_mode_t {
> +#define COPY_MODE_SW "sw"
> +	COPY_MODE_SW_NUM,
> +#define COPY_MODE_IOAT "rawdev"
> +	COPY_MODE_IOAT_NUM,
> +	COPY_MODE_INVALID_NUM,
> +	COPY_MODE_SIZE_NUM = COPY_MODE_INVALID_NUM
> +} copy_mode_t;
> +
> +/* mask of enabled ports */
> +static uint32_t ioat_enabled_port_mask;
> +
> +/* number of RX queues per port */
> +static uint16_t nb_queues = 1;
> +
> +/* MAC updating enabled by default. */
> +static int mac_updating = 1;
> +
> +/* hardare copy mode enabled by default. */
> +static copy_mode_t copy_mode = COPY_MODE_IOAT_NUM;
> +
> +/* size of IOAT rawdev ring for hardware copy mode or
> + * rte_ring for software copy mode
> + */
> +static unsigned short ring_size = 2048;
> +
> +/* global transmission config */
> +struct rxtx_transmission_config cfg;
> +
> +/* configurable number of RX/TX ring descriptors */
> +static uint16_t nb_rxd = RX_DEFAULT_RINGSIZE;
> +static uint16_t nb_txd = TX_DEFAULT_RINGSIZE;
> +
> +static volatile bool force_quit;
> +
> +/* ethernet addresses of ports */
> +static struct rte_ether_addr ioat_ports_eth_addr[RTE_MAX_ETHPORTS];
> +
> +static struct rte_eth_dev_tx_buffer *tx_buffer[RTE_MAX_ETHPORTS];
> +struct rte_mempool *ioat_pktmbuf_pool;
> +
> +/* Print out statistics for one port. */
> +static void
> +print_port_stats(uint16_t port_id)
> +{
> +	printf("\nStatistics for port %u ------------------------------"
> +		"\nPackets sent: %34"PRIu64
> +		"\nPackets received: %30"PRIu64
> +		"\nPackets dropped on tx: %25"PRIu64
> +		"\nPackets dropped on copy: %23"PRIu64,
> +		port_id,
> +		port_statistics.tx[port_id],
> +		port_statistics.rx[port_id],
> +		port_statistics.tx_dropped[port_id],
> +		port_statistics.copy_dropped[port_id]);
> +}
> +
> +/* Print out statistics for one IOAT rawdev device. */
> +static void
> +print_rawdev_stats(uint32_t dev_id, uint64_t *xstats,
> +	uint16_t nb_xstats, struct rte_rawdev_xstats_name *names_xstats)
> +{
> +	uint16_t i;
> +
> +	printf("\nIOAT channel %u", dev_id);
> +	for (i = 0; i < nb_xstats; i++)
> +		if (strstr(names_xstats[i].name, "enqueues"))
> +			printf("\n\t %s: %*"PRIu64,
> +				names_xstats[i].name,
> +				(int)(37 - strlen(names_xstats[i].name)),
> +				xstats[i]);
> +}
> +
> +static void
> +print_total_stats(struct total_statistics *ts)
> +{
> +	printf("\nAggregate statistics ==============================="
> +		"\nTotal packets sent: %28"PRIu64
> +		"\nTotal packets received: %24"PRIu64
> +		"\nTotal packets dropped: %25"PRIu64,
> +		ts->total_packets_tx,
> +		ts->total_packets_rx,
> +		ts->total_packets_dropped);
> +
> +	if (copy_mode == COPY_MODE_IOAT_NUM) {
> +		printf("\nTotal IOAT successful enqueues: %16"PRIu64
> +			"\nTotal IOAT failed enqueues: %20"PRIu64,
> +			ts->total_successful_enqueues,
> +			ts->total_failed_enqueues);
> +	}
> +
> +	printf("\n====================================================\n");
> +}
> +

For these stats, it would be nice to have deltas i.e. pps, rather than (or
as well as) the raw packet count numbers. Since your main stats loop below
has a "sleep(1)" at the start, just computing the deltas should give a good
enough PPS value.

> +/* Print out statistics on packets dropped. */
> +static void
> +print_stats(char *prgname)
> +{
> +	struct total_statistics ts;
> +	uint32_t i, port_id, dev_id;
> +	struct rte_rawdev_xstats_name *names_xstats;
> +	uint64_t *xstats;
> +	unsigned int *ids_xstats;
> +	unsigned int nb_xstats, id_fail_enq, id_succ_enq;
> +	char status_string[120]; /* to print at the top of the output */
> +	int status_strlen;
> +
> +
> +	const char clr[] = { 27, '[', '2', 'J', '\0' };
> +	const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
> +
> +	status_strlen = snprintf(status_string, sizeof(status_string),
> +		"%s, ", prgname);
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Worker Threads = %d, ",
> +		rte_lcore_count() > 2 ? 2 : 1);
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Copy Mode = %s,\n", copy_mode == COPY_MODE_SW_NUM ?
> +		COPY_MODE_SW : COPY_MODE_IOAT);
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Updating MAC = %s, ", mac_updating ?
> +		"enabled" : "disabled");
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Rx Queues = %d, ", nb_queues);
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Ring Size = %d\n", ring_size);
> +
> +	/* Allocate memory for xstats names and values */
> +	nb_xstats = rte_rawdev_xstats_names_get(
> +		cfg.ports[0].ioat_ids[0], NULL, 0);
> +
> +	names_xstats = malloc(sizeof(*names_xstats) * nb_xstats);
> +	if (names_xstats == NULL) {
> +		rte_exit(EXIT_FAILURE,
> +			"Error allocating xstat names memory\n");
> +	}
> +	rte_rawdev_xstats_names_get(cfg.ports[0].ioat_ids[0],
> +		names_xstats, nb_xstats);
> +
> +	ids_xstats = malloc(sizeof(*ids_xstats) * nb_xstats);
> +	if (ids_xstats == NULL) {
> +		rte_exit(EXIT_FAILURE,
> +			"Error allocating xstat ids_xstats memory\n");
> +	}
> +
> +	for (i = 0; i < nb_xstats; i++)
> +		ids_xstats[i] = i;
> +
> +	xstats = malloc(sizeof(*xstats) * nb_xstats);
> +	if (xstats == NULL) {
> +		rte_exit(EXIT_FAILURE,
> +			"Error allocating xstat memory\n");
> +	}
> +
> +	/* Get failed/successful enqueues stats index */
> +	id_fail_enq = id_succ_enq = nb_xstats;
> +	for (i = 0; i < nb_xstats; i++) {
> +		if (!strcmp(names_xstats[i].name, "failed_enqueues"))
> +			id_fail_enq = i;
> +		else if (!strcmp(names_xstats[i].name, "successful_enqueues"))
> +			id_succ_enq = i;
> +		if (id_fail_enq < nb_xstats && id_succ_enq < nb_xstats)
> +			break;
> +	}
> +	if (id_fail_enq == nb_xstats || id_succ_enq == nb_xstats) {
> +		rte_exit(EXIT_FAILURE,
> +			"Error getting failed/successful enqueues stats index\n");
> +	}
> +
> +	while (!force_quit) {
> +		/* Sleep for 1 second each round - init sleep allows reading
> +		 * messages from app startup.
> +		 */
> +		sleep(1);
> +
> +		/* Clear screen and move to top left */
> +		printf("%s%s", clr, topLeft);
> +
> +		memset(&ts, 0, sizeof(struct total_statistics));
> +
> +		printf("%s", status_string);
> +
> +		for (i = 0; i < cfg.nb_ports; i++) {
> +			port_id = cfg.ports[i].rxtx_port;
> +			print_port_stats(port_id);
> +
> +			ts.total_packets_dropped +=
> +				port_statistics.tx_dropped[port_id]
> +				+ port_statistics.copy_dropped[port_id];
> +			ts.total_packets_tx += port_statistics.tx[port_id];
> +			ts.total_packets_rx += port_statistics.rx[port_id];
> +
> +			if (copy_mode == COPY_MODE_IOAT_NUM) {
> +				uint32_t j;
> +
> +				for (j = 0; j < cfg.ports[i].nb_queues; j++) {
> +					dev_id = cfg.ports[i].ioat_ids[j];
> +					rte_rawdev_xstats_get(dev_id,
> +						ids_xstats, xstats, nb_xstats);
> +
> +					print_rawdev_stats(dev_id, xstats,
> +						nb_xstats, names_xstats);
> +
> +					ts.total_successful_enqueues +=
> +						xstats[id_succ_enq];
> +					ts.total_failed_enqueues +=
> +						xstats[id_fail_enq];
> +				}
> +			}
> +		}
> +		printf("\n");
> +
> +		print_total_stats(&ts);
> +	}
> +
> +	free(names_xstats);
> +	free(xstats);
> +	free(ids_xstats);
> +}
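
To illustrate the delta suggestion made earlier for the stats output, here
is a minimal sketch of one way to turn a running counter into a per-interval
value. The rate_tracker struct and rate_update() helper are hypothetical
names that do not exist in the patch, and the result is only an
approximate packets-per-second figure, on the assumption that the function
is called once per pass of the loop with its sleep(1).

```c
#include <stdint.h>

/* Hypothetical helper state: the previous sample of a counter that only
 * ever increases (e.g. a per-port rx or tx packet count).
 */
struct rate_tracker {
	uint64_t prev;
};

/* Return the increase since the previous call and remember the new
 * sample. Called once per one-second loop iteration, the return value
 * approximates packets per second. Unsigned subtraction keeps the
 * result correct even if the 64-bit counter wraps once.
 */
static inline uint64_t
rate_update(struct rate_tracker *rt, uint64_t current)
{
	uint64_t delta = current - rt->prev;

	rt->prev = current;
	return delta;
}
```

One tracker per counter and per port would be kept across loop iterations,
and the returned delta printed alongside (or instead of) the raw total.
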