From: "Baran, MarcinX"
To: "Richardson, Bruce"
CC: "dev@dpdk.org", "Modrak, PawelX"
Date: Thu, 12 Sep 2019 12:18:24 +0000
Message-ID: <06CDC4676D44784DA2DF9423D4B672BE1055D961@HASMSX113.ger.corp.intel.com>
References: <20190909082939.1629-1-marcinx.baran@intel.com> <20190912095251.GA1913@bricha3-MOBL.ger.corp.intel.com>
In-Reply-To: <20190912095251.GA1913@bricha3-MOBL.ger.corp.intel.com>
Subject: Re: [dpdk-dev] [PATCH] examples/ioat: create sample app on ioat driver usage
List-Id: DPDK patches and discussions

On Mon, Sep 09, 2019 at 10:29:38AM +0200, Marcin Baran wrote:
> From: Pawel Modrak
>
> A new sample app demonstrating use of driver for CBDMA.
> The app receives packets, performs software or hardware copy, changes
> packets' MAC addresses (if enabled) and forwards them. The patch
> includes sample application as well as it's guide.
>
> Signed-off-by: Pawel Modrak
> Signed-off-by: Marcin Baran
> ---

Thanks, Pawel and Marcin. Some comments on doc and code inline below.

>  doc/guides/sample_app_ug/index.rst |    1 +
>  doc/guides/sample_app_ug/intro.rst |    4 +
>  doc/guides/sample_app_ug/ioat.rst  |  691 +++++++++++++++++++
>  examples/Makefile                  |    3 +
>  examples/ioat/Makefile             |   54 ++
>  examples/ioat/ioatfwd.c            | 1010 ++++++++++++++++++++++++++++
>  examples/ioat/meson.build          |   13 +
>  examples/meson.build               |    1 +
>  8 files changed, 1777 insertions(+)
>  create mode 100644 doc/guides/sample_app_ug/ioat.rst
>  create mode 100644 examples/ioat/Makefile
>  create mode 100644 examples/ioat/ioatfwd.c
>  create mode 100644 examples/ioat/meson.build
>
> diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
> index f23f8f59e..a6a1d9e7a 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -23,6 +23,7 @@ Sample Applications User Guides
>      ip_reassembly
>      kernel_nic_interface
>      keep_alive
> +    ioat
>      l2_forward_crypto
>      l2_forward_job_stats
>      l2_forward_real_virtual
> diff --git a/doc/guides/sample_app_ug/intro.rst b/doc/guides/sample_app_ug/intro.rst
> index 90704194a..74462312f 100644
> --- a/doc/guides/sample_app_ug/intro.rst
> +++ b/doc/guides/sample_app_ug/intro.rst
> @@ -91,6 +91,10 @@ examples are highlighted below.
>    forwarding, or ``l3fwd`` application does forwarding based on Internet
>    Protocol, IPv4 or IPv6 like a simple router.
>
> +* :doc:`Hardware packet copying`: The Hardware packet copying,
> +  or ``ioatfwd`` application demonstrates how to use IOAT rawdev driver for
> +  copying packets between two threads.
> +
>  * :doc:`Packet Distributor`: The Packet Distributor
>    demonstrates how to distribute packets arriving on an Rx port to different
>    cores for processing and transmission.
> diff --git a/doc/guides/sample_app_ug/ioat.rst b/doc/guides/sample_app_ug/ioat.rst
> new file mode 100644
> index 000000000..378d70b81
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/ioat.rst
> @@ -0,0 +1,691 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright(c) 2019 Intel Corporation.
> +
> +Sample Application of packet copying using Intel\|reg| QuickData Technology
> +============================================================================

You need a space before the |reg| bit otherwise the reg doesn't get the
symbol replaced. It should be "Intel\ |reg|".

[Marcin] Will fix this in v2

> +
> +Overview
> +--------
> +
> +This sample is intended as a demonstration of the basic components of
> +a DPDK forwarding application and example of how to use IOAT driver
> +API to make packets copies.
> +
> +Also while forwarding, the MAC addresses are affected as follows:
> +
> +* The source MAC address is replaced by the TX port MAC address
> +
> +* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID
> +
> +This application can be used to compare performance of using software
> +packet copy with copy done using a DMA device for different sizes of packets.
> +The example will print out statistics each second. The stats shows
> +received/send packets and packets dropped or failed to copy.
> +
> +Compiling the Application
> +-------------------------
> +
> +To compile the sample application see :doc:`compiling`.
> +
> +The application is located in the ``ioat`` sub-directory.
> +
> +
> +Running the Application
> +-----------------------
> +
> +In order to run the hardware copy application, the copying device
> +needs to be bound to user-space IO driver.
> +
> +Refer to the *IOAT Rawdev Driver for Intel\ |reg| QuickData
> +Technology* guide for information on using the driver.
> +
> +The application requires a number of command line options:
> +
> +.. code-block:: console
> +
> +    ./build/ioatfwd [EAL options] -- -p MASK [-C CT]
> +    [--[no-]mac-updating]

I think the app uses lower case "c" rather than upper case, as called out
below. Since the "CT" value can only be one of two possibilities, I think
you should explicitly include them, e.g. "[-c ]". "rawdev" is also a rather
long name for this parameter, why not just call them sw and hw?

[Marcin] Yes, it should be lower case. It is my mistake, the doc was not
updated before sending the patch. A proper guide is already prepared for v2.
As for the "rawdev" parameter name, I will correct it.

> +
> +where,
> +
> +* p MASK: A hexadecimal bitmask of the ports to configure
> +
> +* c CT: Performed packet copy type: software (sw) or hardware using
> +  DMA (rawdev)
> +
> +* s RS: size of IOAT rawdev ring for hardware copy mode or rte_ring for
> +  software copy mode
> +

This parameter is missing from the summary above.

[Marcin] Didn't update doc along with code. It is fixed for v2.

> +* --[no-]mac-updating: Whether MAC address of packets should be changed
> +  or not
> +
> +The application can be launched in 2 different configurations:
> +
> +* Performing software packet copying
> +
> +* Performing hardware packet copying

Two thoughts here:
a) is this not obvious from the parameter list?
b) is it not more than two configurations, given that you can have:
   * sw copy with mac updating
   * sw copy without mac updating
   * etc.
   not including the possible port-mask, ring size and single-core vs
   two-core configurations.

[Marcin] Good point. I will rephrase that.

> +
> +Each port needs 2 lcores: one of them receives incoming traffic and
> +makes a copy of each packet. The second lcore then updates MAC
> +address and sends the copy. For each configuration an additional
> +lcore is needed since master lcore in use which is responsible for
> +configuration, statistics printing and safe deinitialization of all ports and devices.
> +

I believe the app also supports running with 1 or 2 cores total, right?

[Marcin] Yes, that's right. Didn't update doc along with code. It is fixed
for v2.

> +The application can use a maximum of 8 ports.

Why this limitation?

[Marcin] Seemed reasonable since on testing machine one IOAT device had 8
channels.

> +
> +To run the application in a Linux environment with 3 lcores (one of
> +them is master lcore), 1 port (port 0), software copying and MAC
> +updating issue the command:
> +
> +.. code-block:: console
> +
> +    $ ./build/ioatfwd -l 0-2 -n 2 -- -p 0x1 --mac-updating -c sw
> +
> +To run the application in a Linux environment with 5 lcores (one of
> +them is master lcore), 2 ports (ports 0 and 1), hardware copying and
> +no MAC updating issue the command:
> +
> +.. code-block:: console
> +
> +    $ ./build/ioatfwd -l 0-4 -n 1 -- -p 0x3 --no-mac-updating -c rawdev
> +
> +Refer to the *DPDK Getting Started Guide* for general information on
> +running applications and the Environment Abstraction Layer (EAL) options.
> +
> diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c
> new file mode 100644
> index 000000000..8463d82f3
> --- /dev/null
> +++ b/examples/ioat/ioatfwd.c
> @@ -0,0 +1,1010 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2019 Intel Corporation
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +#include
> +
> +/* size of ring used for software copying between rx and tx. */
> +#define RTE_LOGTYPE_IOAT RTE_LOGTYPE_USER1
> +#define MAX_PKT_BURST 32

Seems a low max, assume this is actually the default burst size?

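As an aside on the "-c" naming discussed in the guide comments above, one way
the two accepted values could be mapped to a copy mode is sketched below. This
is an illustration only, not code from the patch: enum copy_type,
parse_copy_type() and copy_type_name() are hypothetical names, and the sketch
assumes the option accepts exactly "sw" and "hw" as suggested in the review.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only; not taken from the patch. */
enum copy_type {
	COPY_TYPE_SW,		/* copy done by the CPU */
	COPY_TYPE_HW,		/* copy offloaded to a DMA channel */
	COPY_TYPE_INVALID	/* unrecognised argument */
};

/* Accepted spellings, indexed by enum value (assumed to be "sw" and "hw"). */
static const char *const copy_type_names[] = {
	[COPY_TYPE_SW] = "sw",
	[COPY_TYPE_HW] = "hw",
};

/* Map the option argument to a copy type; anything else is rejected. */
static enum copy_type
parse_copy_type(const char *arg)
{
	size_t i;

	if (arg == NULL)
		return COPY_TYPE_INVALID;

	for (i = 0; i < sizeof(copy_type_names) / sizeof(copy_type_names[0]); i++)
		if (strcmp(arg, copy_type_names[i]) == 0)
			return (enum copy_type)i;

	return COPY_TYPE_INVALID;
}

/* Reverse mapping, e.g. for a status line; the caller passes a valid type. */
static const char *
copy_type_name(enum copy_type type)
{
	assert(type < COPY_TYPE_INVALID);
	return copy_type_names[type];
}
```

Keeping the spellings in one table means the parser and any status output
cannot drift apart, which is the kind of doc/code mismatch noted above.
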
> +#define MEMPOOL_CACHE_SIZE 512
> +#define MIN_POOL_SIZE 65536U
> +#define CMD_LINE_OPT_MAC_UPDATING "mac-updating"
> +#define CMD_LINE_OPT_NO_MAC_UPDATING "no-mac-updating"
> +#define CMD_LINE_OPT_PORTMASK "portmask"
> +#define CMD_LINE_OPT_NB_QUEUE "nb-queue"
> +#define CMD_LINE_OPT_COPY_TYPE "copy-type"
> +#define CMD_LINE_OPT_RING_SIZE "ring-size"
> +
> +/* configurable number of RX/TX ring descriptors */
> +#define RX_DEFAULT_RINGSIZE 1024
> +#define TX_DEFAULT_RINGSIZE 1024
> +
> +/* max number of RX queues per port */
> +#define MAX_RX_QUEUES_COUNT 8
> +
> +struct rxtx_port_config {
> +	/* common config */
> +	uint16_t rxtx_port;
> +	uint16_t nb_queues;
> +	/* for software copy mode */
> +	struct rte_ring *rx_to_tx_ring;
> +	/* for IOAT rawdev copy mode */
> +	uint16_t ioat_ids[MAX_RX_QUEUES_COUNT];
> +};
> +
> +struct rxtx_transmission_config {
> +	struct rxtx_port_config ports[RTE_MAX_ETHPORTS];
> +	uint16_t nb_ports;
> +	uint16_t nb_lcores;
> +};
> +
> +/* per-port statistics struct */
> +struct ioat_port_statistics {
> +	uint64_t rx[RTE_MAX_ETHPORTS];
> +	uint64_t tx[RTE_MAX_ETHPORTS];
> +	uint64_t tx_dropped[RTE_MAX_ETHPORTS];
> +	uint64_t copy_dropped[RTE_MAX_ETHPORTS];
> +};
> +struct ioat_port_statistics port_statistics;
> +
> +struct total_statistics {
> +	uint64_t total_packets_dropped;
> +	uint64_t total_packets_tx;
> +	uint64_t total_packets_rx;
> +	uint64_t total_successful_enqueues;
> +	uint64_t total_failed_enqueues;
> +};
> +
> +typedef enum copy_mode_t {
> +#define COPY_MODE_SW "sw"
> +	COPY_MODE_SW_NUM,
> +#define COPY_MODE_IOAT "rawdev"
> +	COPY_MODE_IOAT_NUM,
> +	COPY_MODE_INVALID_NUM,
> +	COPY_MODE_SIZE_NUM = COPY_MODE_INVALID_NUM
> +} copy_mode_t;
> +
> +/* mask of enabled ports */
> +static uint32_t ioat_enabled_port_mask;
> +
> +/* number of RX queues per port */
> +static uint16_t nb_queues = 1;
> +
> +/* MAC updating enabled by default. */
> +static int mac_updating = 1;
> +
> +/* hardare copy mode enabled by default. */
> +static copy_mode_t copy_mode = COPY_MODE_IOAT_NUM;
> +
> +/* size of IOAT rawdev ring for hardware copy mode or
> + * rte_ring for software copy mode
> + */
> +static unsigned short ring_size = 2048;
> +
> +/* global transmission config */
> +struct rxtx_transmission_config cfg;
> +
> +/* configurable number of RX/TX ring descriptors */
> +static uint16_t nb_rxd = RX_DEFAULT_RINGSIZE;
> +static uint16_t nb_txd = TX_DEFAULT_RINGSIZE;
> +
> +static volatile bool force_quit;
> +
> +/* ethernet addresses of ports */
> +static struct rte_ether_addr ioat_ports_eth_addr[RTE_MAX_ETHPORTS];
> +
> +static struct rte_eth_dev_tx_buffer *tx_buffer[RTE_MAX_ETHPORTS];
> +struct rte_mempool *ioat_pktmbuf_pool;
> +
> +/* Print out statistics for one port. */
> +static void
> +print_port_stats(uint16_t port_id)
> +{
> +	printf("\nStatistics for port %u ------------------------------"
> +		"\nPackets sent: %34"PRIu64
> +		"\nPackets received: %30"PRIu64
> +		"\nPackets dropped on tx: %25"PRIu64
> +		"\nPackets dropped on copy: %23"PRIu64,
> +		port_id,
> +		port_statistics.tx[port_id],
> +		port_statistics.rx[port_id],
> +		port_statistics.tx_dropped[port_id],
> +		port_statistics.copy_dropped[port_id]);
> +}
> +
> +/* Print out statistics for one IOAT rawdev device. */
> +static void
> +print_rawdev_stats(uint32_t dev_id, uint64_t *xstats,
> +	uint16_t nb_xstats, struct rte_rawdev_xstats_name *names_xstats)
> +{
> +	uint16_t i;
> +
> +	printf("\nIOAT channel %u", dev_id);
> +	for (i = 0; i < nb_xstats; i++)
> +		if (strstr(names_xstats[i].name, "enqueues"))
> +			printf("\n\t %s: %*"PRIu64,
> +				names_xstats[i].name,
> +				(int)(37 - strlen(names_xstats[i].name)),
> +				xstats[i]);
> +}
> +
> +static void
> +print_total_stats(struct total_statistics *ts)
> +{
> +	printf("\nAggregate statistics ==============================="
> +		"\nTotal packets sent: %28"PRIu64
> +		"\nTotal packets received: %24"PRIu64
> +		"\nTotal packets dropped: %25"PRIu64,
> +		ts->total_packets_tx,
> +		ts->total_packets_rx,
> +		ts->total_packets_dropped);
> +
> +	if (copy_mode == COPY_MODE_IOAT_NUM) {
> +		printf("\nTotal IOAT successful enqueues: %16"PRIu64
> +			"\nTotal IOAT failed enqueues: %20"PRIu64,
> +			ts->total_successful_enqueues,
> +			ts->total_failed_enqueues);
> +	}
> +
> +	printf("\n====================================================\n");
> +}
> +

For these stats, it would be nice to have deltas i.e. pps, rather than (or
as well as) the raw packet count numbers. Since your main stats loop below
has a "sleep(1)" at the start, just computing the deltas should give a good
enough PPS value.

[Marcin] Ok, will be prepared for v2 as PPS value.

> +/* Print out statistics on packets dropped. */
> +static void
> +print_stats(char *prgname)
> +{
> +	struct total_statistics ts;
> +	uint32_t i, port_id, dev_id;
> +	struct rte_rawdev_xstats_name *names_xstats;
> +	uint64_t *xstats;
> +	unsigned int *ids_xstats;
> +	unsigned int nb_xstats, id_fail_enq, id_succ_enq;
> +	char status_string[120]; /* to print at the top of the output */
> +	int status_strlen;
> +
> +
> +	const char clr[] = { 27, '[', '2', 'J', '\0' };
> +	const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
> +
> +	status_strlen = snprintf(status_string, sizeof(status_string),
> +		"%s, ", prgname);
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Worker Threads = %d, ",
> +		rte_lcore_count() > 2 ? 2 : 1);
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Copy Mode = %s,\n", copy_mode == COPY_MODE_SW_NUM ?
> +		COPY_MODE_SW : COPY_MODE_IOAT);
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Updating MAC = %s, ", mac_updating ?
> +		"enabled" : "disabled");
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Rx Queues = %d, ", nb_queues);
> +	status_strlen += snprintf(status_string + status_strlen,
> +		sizeof(status_string) - status_strlen,
> +		"Ring Size = %d\n", ring_size);
> +
> +	/* Allocate memory for xstats names and values */
> +	nb_xstats = rte_rawdev_xstats_names_get(
> +		cfg.ports[0].ioat_ids[0], NULL, 0);
> +
> +	names_xstats = malloc(sizeof(*names_xstats) * nb_xstats);
> +	if (names_xstats == NULL) {
> +		rte_exit(EXIT_FAILURE,
> +			"Error allocating xstat names memory\n");
> +	}
> +	rte_rawdev_xstats_names_get(cfg.ports[0].ioat_ids[0],
> +		names_xstats, nb_xstats);
> +
> +	ids_xstats = malloc(sizeof(*ids_xstats) * nb_xstats);
> +	if (ids_xstats == NULL) {
> +		rte_exit(EXIT_FAILURE,
> +			"Error allocating xstat ids_xstats memory\n");
> +	}
> +
> +	for (i = 0; i < nb_xstats; i++)
> +		ids_xstats[i] = i;
> +
> +	xstats = malloc(sizeof(*xstats) * nb_xstats);
> +	if (xstats == NULL) {
> +		rte_exit(EXIT_FAILURE,
> +			"Error allocating xstat memory\n");
> +	}
> +
> +	/* Get failed/successful enqueues stats index */
> +	id_fail_enq = id_succ_enq = nb_xstats;
> +	for (i = 0; i < nb_xstats; i++) {
> +		if (!strcmp(names_xstats[i].name, "failed_enqueues"))
> +			id_fail_enq = i;
> +		else if (!strcmp(names_xstats[i].name, "successful_enqueues"))
> +			id_succ_enq = i;
> +		if (id_fail_enq < nb_xstats && id_succ_enq < nb_xstats)
> +			break;
> +	}
> +	if (id_fail_enq == nb_xstats || id_succ_enq == nb_xstats) {
> +		rte_exit(EXIT_FAILURE,
> +			"Error getting failed/successful enqueues stats index\n");
> +	}
> +
> +	while (!force_quit) {
> +		/* Sleep for 1 second each round - init sleep allows reading
> +		 * messages from app startup.
> +		 */
> +		sleep(1);
> +
> +		/* Clear screen and move to top left */
> +		printf("%s%s", clr, topLeft);
> +
> +		memset(&ts, 0, sizeof(struct total_statistics));
> +
> +		printf("%s", status_string);
> +
> +		for (i = 0; i < cfg.nb_ports; i++) {
> +			port_id = cfg.ports[i].rxtx_port;
> +			print_port_stats(port_id);
> +
> +			ts.total_packets_dropped +=
> +				port_statistics.tx_dropped[port_id]
> +				+ port_statistics.copy_dropped[port_id];
> +			ts.total_packets_tx += port_statistics.tx[port_id];
> +			ts.total_packets_rx += port_statistics.rx[port_id];
> +
> +			if (copy_mode == COPY_MODE_IOAT_NUM) {
> +				uint32_t j;
> +
> +				for (j = 0; j < cfg.ports[i].nb_queues; j++) {
> +					dev_id = cfg.ports[i].ioat_ids[j];
> +					rte_rawdev_xstats_get(dev_id,
> +						ids_xstats, xstats, nb_xstats);
> +
> +					print_rawdev_stats(dev_id, xstats,
> +						nb_xstats, names_xstats);
> +
> +					ts.total_successful_enqueues +=
> +						xstats[id_succ_enq];
> +					ts.total_failed_enqueues +=
> +						xstats[id_fail_enq];
> +				}
> +			}
> +		}
> +		printf("\n");
> +
> +		print_total_stats(&ts);
> +	}
> +
> +	free(names_xstats);
> +	free(xstats);
> +	free(ids_xstats);
> +}
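
The delta suggestion made above for the stats output could look roughly like
the following. This is a minimal sketch only, not taken from the patch:
struct stats_snapshot, stats_delta() and rate_per_sec() are hypothetical
names, and the sketch assumes the counters are cumulative 64-bit values that
are sampled once per pass of the stats loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the cumulative counters; not from the patch. */
struct stats_snapshot {
	uint64_t rx;
	uint64_t tx;
	uint64_t dropped;
};

/*
 * Difference between two samples of the cumulative counters.
 * The counters only grow, so unsigned subtraction gives the number of
 * packets seen in the interval (and stays correct across a wraparound).
 */
static struct stats_snapshot
stats_delta(const struct stats_snapshot *prev, const struct stats_snapshot *cur)
{
	struct stats_snapshot d;

	d.rx = cur->rx - prev->rx;
	d.tx = cur->tx - prev->tx;
	d.dropped = cur->dropped - prev->dropped;
	return d;
}

/*
 * Scale a delta to packets per second for an interval given in
 * microseconds. The multiplication can overflow for deltas above roughly
 * 1.8e13, which is far beyond a one-second packet count.
 */
static uint64_t
rate_per_sec(uint64_t delta, uint64_t elapsed_us)
{
	assert(elapsed_us > 0);
	return delta * 1000000ULL / elapsed_us;
}
```

With the sleep(1) at the top of the loop, the raw delta is already a
reasonable PPS figure; rate_per_sec() only matters if the interval is
measured rather than assumed to be exactly one second.
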