From: "Baran, MarcinX"
To: "Richardson, Bruce"
CC: "dev@dpdk.org"
Date: Fri, 27 Sep 2019 14:51:48 +0000
Subject: Re: [dpdk-dev] [PATCH v5 6/6] doc/guides/: provide IOAT sample app guide
Message-ID: <06CDC4676D44784DA2DF9423D4B672BE15ECCD51@HASMSX114.ger.corp.intel.com>
In-Reply-To: <20190927110130.GH1847@bricha3-MOBL.ger.corp.intel.com>
References: <20190919093850.460-1-marcinx.baran@intel.com>
 <20190920073714.1314-1-marcinx.baran@intel.com>
 <20190920073714.1314-7-marcinx.baran@intel.com>
 <20190927110130.GH1847@bricha3-MOBL.ger.corp.intel.com>

-----Original Message-----
From: Bruce Richardson
Sent: Friday, September 27, 2019 1:02 PM
To: Baran, MarcinX
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v5 6/6] doc/guides/: provide IOAT sample app guide

On Fri, Sep 20, 2019 at 09:37:14AM +0200, Marcin Baran wrote:
> Added guide for IOAT sample app usage and code description.
>
> Signed-off-by: Marcin Baran
> ---
>  doc/guides/sample_app_ug/index.rst |   1 +
>  doc/guides/sample_app_ug/intro.rst |   4 +
>  doc/guides/sample_app_ug/ioat.rst  | 764 +++++++++++++++++++++++++++++
>  3 files changed, 769 insertions(+)
>  create mode 100644 doc/guides/sample_app_ug/ioat.rst
>
> diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
> index f23f8f59e..a6a1d9e7a 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -23,6 +23,7 @@ Sample Applications User Guides
>      ip_reassembly
>      kernel_nic_interface
>      keep_alive
> +    ioat
>      l2_forward_crypto
>      l2_forward_job_stats
>      l2_forward_real_virtual
> diff --git a/doc/guides/sample_app_ug/intro.rst b/doc/guides/sample_app_ug/intro.rst
> index 90704194a..74462312f 100644
> --- a/doc/guides/sample_app_ug/intro.rst
> +++ b/doc/guides/sample_app_ug/intro.rst
> @@ -91,6 +91,10 @@ examples are highlighted below.
>    forwarding, or ``l3fwd`` application does forwarding based on Internet
>    Protocol, IPv4 or IPv6 like a simple router.
>
> +* :doc:`Hardware packet copying`: The Hardware packet copying,
> +  or ``ioatfwd`` application demonstrates how to use IOAT rawdev driver for
> +  copying packets between two threads.
> +
>  * :doc:`Packet Distributor`: The Packet Distributor
>    demonstrates how to distribute packets arriving on an Rx port to different
>    cores for processing and transmission.
> diff --git a/doc/guides/sample_app_ug/ioat.rst b/doc/guides/sample_app_ug/ioat.rst
> new file mode 100644
> index 000000000..69621673b
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/ioat.rst
> @@ -0,0 +1,764 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright(c) 2019 Intel Corporation.
> +
> +Sample Application of packet copying using Intel\ |reg| QuickData Technology
> +============================================================================
> +
> +Overview
> +--------
> +
> +This sample is intended as a demonstration of the basic components of
> +a DPDK forwarding application and example of how to use IOAT driver
> +API to make packets copies.
> +
> +Also while forwarding, the MAC addresses are affected as follows:
> +
> +* The source MAC address is replaced by the TX port MAC address
> +
> +* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID
> +
> +This application can be used to compare performance of using software
> +packet copy with copy done using a DMA device for different sizes of packets.
> +The example will print out statistics each second. The stats shows
> +received/send packets and packets dropped or failed to copy.
> +
> +Compiling the Application
> +-------------------------
> +
> +To compile the sample application see :doc:`compiling`.
> +
> +The application is located in the ``ioat`` sub-directory.
> +
> +
> +Running the Application
> +-----------------------
> +
> +In order to run the hardware copy application, the copying device
> +needs to be bound to user-space IO driver.
> +
> +Refer to the *IOAT Rawdev Driver for Intel\ |reg| QuickData
> +Technology* guide for information on using the driver.
> +

The document is not called that, as the IOAT guide is just part of the overall rawdev document. So I suggest you just reference the rawdev guide.

[Marcin] I wanted to refer to the ioat guide in /doc/guides/rawdevs/ioat.rst, which has that title. Is there another document, or did I reference this one incorrectly?

> +The application requires a number of command line options:
> +
> +.. code-block:: console
> +
> +    ./build/ioatfwd [EAL options] -- -p MASK [-q NQ] [-s RS] [-c ]
> +        [--[no-]mac-updating]
> +
> +where,
> +
> +* p MASK: A hexadecimal bitmask of the ports to configure

Is this a mandatory parameter, or does the app use all detected ports by default, e.g. like testpmd?

[Marcin] It is optional; the app uses all detected ports. Added a default value comment and tagged -p as optional.

> +
> +* q NQ: Number of Rx queues used per port equivalent to CBDMA channels
> +  per port
> +
> +* c CT: Performed packet copy type: software (sw) or hardware using
> +  DMA (hw)

What is the default? Same for next two parameters.

[Marcin] Added a description of the default values.

> +
> +* s RS: Size of IOAT rawdev ring for hardware copy mode or rte_ring for
> +  software copy mode
> +
> +* --[no-]mac-updating: Whether MAC address of packets should be changed
> +  or not
> +
> +The application can be launched in various configurations depending
> +on provided parameters. Each port can use up to 2 lcores: one of them
> +receives

The app uses 2 data plane cores, total, rather than 2 per-port, I believe. It would be good to explain the difference here that with 2 cores the copies are done on one core, and the mac updates on the second one.

[Marcin] Changed the description accordingly.

> +incoming traffic and makes a copy of each packet. The second lcore
> +then updates MAC address and sends the copy. If one lcore per port is
> +used, both operations are done sequentially. For each configuration
> +an additional lcore is needed since master lcore in use which is
> +responsible for

... since the master lcore does not handle traffic but is responsible for

[Marcin] Changed the description accordingly.

> +configuration, statistics printing and safe deinitialization of all
> +ports and devices.

s/deinitialization/shutdown/

[Marcin] Changed the description accordingly.

> +
> +The application can use a maximum of 8 ports.

Is this a hard limit in the app? If so, explain why. I see the stats arrays are limited by "RTE_MAX_ETHPORTS".

[Marcin] The limit was set for simplicity, but also because the testing board had 16 CBDMA channels in total, so when 8 ports are used, each of them can still be set to work with more than one Rx queue.

> +
> +To run the application in a Linux environment with 3 lcores (one of
> +them is master lcore), 1 port (port 0), software copying and MAC
> +updating issue the command:

s/1 port/a single port/
s/one of them is master lcore/the master lcore, plus two forwarding cores/

Similar comments would apply to text immediately below too.

[Marcin] Changed the description accordingly.

> +
> +.. code-block:: console
> +
> +    $ ./build/ioatfwd -l 0-2 -n 2 -- -p 0x1 --mac-updating -c sw
> +
> +To run the application in a Linux environment with 2 lcores (one of
> +them is master lcore), 2 ports (ports 0 and 1), hardware copying and
> +no MAC updating issue the command:
> +
> +.. code-block:: console
> +
> +    $ ./build/ioatfwd -l 0-1 -n 1 -- -p 0x3 --no-mac-updating -c hw
> +
> +Refer to the *DPDK Getting Started Guide* for general information on
> +running applications and the Environment Abstraction Layer (EAL) options.
> +
> +Explanation
> +-----------
> +
> +The following sections provide an explanation of the main components
> +of the code.
> +
> +All DPDK library functions used in the sample code are prefixed with
> +``rte_`` and are explained in detail in the *DPDK API Documentation*.
> +
> +
> +The Main Function
> +~~~~~~~~~~~~~~~~~
> +
> +The ``main()`` function performs the initialization and calls the
> +execution threads for each lcore.
> +
> +The first task is to initialize the Environment Abstraction Layer (EAL).
> +The ``argc`` and ``argv`` arguments are provided to the
> +``rte_eal_init()`` function. The value returned is the number of parsed arguments:
> +
> +.. code-block:: c
> +
> +    /* init EAL */
> +    ret = rte_eal_init(argc, argv);
> +    if (ret < 0)
> +        rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
> +
> +
> +The ``main()`` also allocates a mempool to hold the mbufs (Message
> +Buffers) used by the application:
> +
> +.. code-block:: c
> +
> +    nb_mbufs = RTE_MAX(rte_eth_dev_count_avail() * (nb_rxd + nb_txd
> +        + MAX_PKT_BURST + rte_lcore_count() * MEMPOOL_CACHE_SIZE),
> +        MIN_POOL_SIZE);
> +
> +    /* Create the mbuf pool */
> +    ioat_pktmbuf_pool = rte_pktmbuf_pool_create("mbuf_pool", nb_mbufs,
> +        MEMPOOL_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
> +        rte_socket_id());
> +    if (ioat_pktmbuf_pool == NULL)
> +        rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
> +
> +Mbufs are the packet buffer structure used by DPDK. They are
> +explained in detail in the "Mbuf Library" section of the *DPDK Programmer's Guide*.
> +
> +The ``main()`` function also initializes the ports:
> +
> +.. code-block:: c
> +
> +    /* Initialise each port */
> +    RTE_ETH_FOREACH_DEV(portid) {
> +        port_init(portid, ioat_pktmbuf_pool);
> +    }
> +
> +Each port is configured using ``port_init()``:
> +
> +.. code-block:: c
> +
> +    /*
> +     * Initializes a given port using global settings and with the RX buffers
> +     * coming from the mbuf_pool passed as a parameter.
> +     */
> +    static inline void
> +    port_init(uint16_t portid, struct rte_mempool *mbuf_pool, uint16_t nb_queues)
> +    {
> +        /* configuring port to use RSS for multiple RX queues */
> +        static const struct rte_eth_conf port_conf = {
> +            .rxmode = {
> +                .mq_mode = ETH_MQ_RX_RSS,
> +                .max_rx_pkt_len = RTE_ETHER_MAX_LEN
> +            },
> +            .rx_adv_conf = {
> +                .rss_conf = {
> +                    .rss_key = NULL,
> +                    .rss_hf = ETH_RSS_PROTO_MASK,
> +                }
> +            }
> +        };
> +
> +        struct rte_eth_rxconf rxq_conf;
> +        struct rte_eth_txconf txq_conf;
> +        struct rte_eth_conf local_port_conf = port_conf;
> +        struct rte_eth_dev_info dev_info;
> +        int ret, i;
> +
> +        /* Skip ports that are not enabled */
> +        if ((ioat_enabled_port_mask & (1 << portid)) == 0) {
> +            printf("Skipping disabled port %u\n", portid);
> +            return;
> +        }
> +
> +        /* Init port */
> +        printf("Initializing port %u... ", portid);
> +        fflush(stdout);
> +        rte_eth_dev_info_get(portid, &dev_info);
> +        local_port_conf.rx_adv_conf.rss_conf.rss_hf &=
> +            dev_info.flow_type_rss_offloads;
> +        if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE)
> +            local_port_conf.txmode.offloads |=
> +                DEV_TX_OFFLOAD_MBUF_FAST_FREE;
> +        ret = rte_eth_dev_configure(portid, nb_queues, 1, &local_port_conf);
> +        if (ret < 0)
> +            rte_exit(EXIT_FAILURE, "Cannot configure device:"
> +                " err=%d, port=%u\n", ret, portid);
> +
> +        ret = rte_eth_dev_adjust_nb_rx_tx_desc(portid, &nb_rxd,
> +            &nb_txd);
> +        if (ret < 0)
> +            rte_exit(EXIT_FAILURE,
> +                "Cannot adjust number of descriptors: err=%d, port=%u\n",
> +                ret, portid);
> +
> +        rte_eth_macaddr_get(portid, &ioat_ports_eth_addr[portid]);
> +
> +        /* Init Rx queues */
> +        rxq_conf = dev_info.default_rxconf;
> +        rxq_conf.offloads = local_port_conf.rxmode.offloads;
> +        for (i = 0; i < nb_queues; i++) {
> +            ret = rte_eth_rx_queue_setup(portid, i, nb_rxd,
> +                rte_eth_dev_socket_id(portid), &rxq_conf,
> +                mbuf_pool);
> +            if (ret < 0)
> +                rte_exit(EXIT_FAILURE,
> +                    "rte_eth_rx_queue_setup:err=%d,port=%u, queue_id=%u\n",
> +                    ret, portid, i);
> +        }
> +
> +        /* Init one TX queue on each port */
> +        txq_conf = dev_info.default_txconf;
> +        txq_conf.offloads = local_port_conf.txmode.offloads;
> +        ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
> +            rte_eth_dev_socket_id(portid),
> +            &txq_conf);
> +        if (ret < 0)
> +            rte_exit(EXIT_FAILURE,
> +                "rte_eth_tx_queue_setup:err=%d,port=%u\n",
> +                ret, portid);
> +
> +        /* Initialize TX buffers */
> +        tx_buffer[portid] = rte_zmalloc_socket("tx_buffer",
> +            RTE_ETH_TX_BUFFER_SIZE(MAX_PKT_BURST), 0,
> +            rte_eth_dev_socket_id(portid));
> +        if (tx_buffer[portid] == NULL)
> +            rte_exit(EXIT_FAILURE,
> +                "Cannot allocate buffer for tx on port %u\n",
> +                portid);
> +
> +        rte_eth_tx_buffer_init(tx_buffer[portid], MAX_PKT_BURST);
> +
> +        ret = rte_eth_tx_buffer_set_err_callback(tx_buffer[portid],
> +            rte_eth_tx_buffer_count_callback,
> +            &port_statistics.tx_dropped[portid]);
> +        if (ret < 0)
> +            rte_exit(EXIT_FAILURE,
> +                "Cannot set error callback for tx buffer on port %u\n",
> +                portid);
> +
> +        /* Start device */
> +        ret = rte_eth_dev_start(portid);
> +        if (ret < 0)
> +            rte_exit(EXIT_FAILURE,
> +                "rte_eth_dev_start:err=%d, port=%u\n",
> +                ret, portid);
> +
> +        rte_eth_promiscuous_enable(portid);
> +
> +        printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
> +            portid,
> +            ioat_ports_eth_addr[portid].addr_bytes[0],
> +            ioat_ports_eth_addr[portid].addr_bytes[1],
> +            ioat_ports_eth_addr[portid].addr_bytes[2],
> +            ioat_ports_eth_addr[portid].addr_bytes[3],
> +            ioat_ports_eth_addr[portid].addr_bytes[4],
> +            ioat_ports_eth_addr[portid].addr_bytes[5]);
> +
> +        cfg.ports[cfg.nb_ports].rxtx_port = portid;
> +        cfg.ports[cfg.nb_ports++].nb_queues = nb_queues;
> +    }
> +

This code is probably quite similar to that in other sample apps, so I don't think we need to include the full function here.
It makes updating the code more difficult, so just refer to the function as doing the port init and leave it at that, I think. The snippets below give enough detail.

[Marcin] Changed the description accordingly.

> +The Ethernet ports are configured with local settings using the
> +``rte_eth_dev_configure()`` function and the ``port_conf`` struct.
> +The RSS is enabled so that multiple Rx queues could be used for
> +packet receiving and copying by multiple CBDMA channels per port:
> +
> +.. code-block:: c
> +
> +    /* configuring port to use RSS for multiple RX queues */
> +    static const struct rte_eth_conf port_conf = {
> +        .rxmode = {
> +            .mq_mode = ETH_MQ_RX_RSS,
> +            .max_rx_pkt_len = RTE_ETHER_MAX_LEN
> +        },
> +        .rx_adv_conf = {
> +            .rss_conf = {
> +                .rss_key = NULL,
> +                .rss_hf = ETH_RSS_PROTO_MASK,
> +            }
> +        }
> +    };
> +
> +For this example the ports are set up with the number of Rx queues
> +provided with -q option and 1 Tx queue using the
> +``rte_eth_rx_queue_setup()`` and ``rte_eth_tx_queue_setup()`` functions.
> +
> +The Ethernet port is then started:
> +
> +.. code-block:: c
> +
> +    ret = rte_eth_dev_start(portid);
> +    if (ret < 0)
> +        rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
> +            ret, portid);
> +
> +
> +Finally the Rx port is set in promiscuous mode:
> +
> +.. code-block:: c
> +
> +    rte_eth_promiscuous_enable(portid);
> +
> +
> +After that each port application assigns resources needed.
> +
> +.. code-block:: c
> +
> +    check_link_status(ioat_enabled_port_mask);
> +
> +    if (!cfg.nb_ports) {
> +        rte_exit(EXIT_FAILURE,
> +            "All available ports are disabled. Please set portmask.\n");
> +    }
> +
> +    /* Check if there is enough lcores for all ports. */
> +    cfg.nb_lcores = rte_lcore_count() - 1;
> +    if (cfg.nb_lcores < 1)
> +        rte_exit(EXIT_FAILURE,
> +            "There should be at least one slave lcore.\n");
> +
> +    ret = 0;
> +
> +    if (copy_mode == COPY_MODE_IOAT_NUM) {
> +        assign_rawdevs();
> +    } else /* copy_mode == COPY_MODE_SW_NUM */ {
> +        assign_rings();
> +    }
> +
> +A link status is checked of each port enabled by port mask using
> +``check_link_status()`` function.
> +

I don't think this block needs to be covered. No need to go into everything in detail, just focus on the key parts of the app that are unique to it, i.e. the copying and passing mbufs between threads parts.

[Marcin] Changed the description accordingly.

> +.. code-block:: c
> +
> +    /* check link status, return true if at least one port is up */
> +    static int
> +    check_link_status(uint32_t port_mask)
> +    {
> +        uint16_t portid;
> +        struct rte_eth_link link;
> +        int retval = 0;
> +
> +        printf("\nChecking link status\n");
> +        RTE_ETH_FOREACH_DEV(portid) {
> +            if ((port_mask & (1 << portid)) == 0)
> +                continue;
> +
> +            memset(&link, 0, sizeof(link));
> +            rte_eth_link_get(portid, &link);
> +
> +            /* Print link status */
> +            if (link.link_status) {
> +                printf(
> +                    "Port %d Link Up. Speed %u Mbps - %s\n",
> +                    portid, link.link_speed,
> +                    (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
> +                    ("full-duplex") : ("half-duplex\n"));
> +                retval = 1;
> +            } else
> +                printf("Port %d Link Down\n", portid);
> +        }
> +        return retval;
> +    }
> +
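
For reference, both port_init() and check_link_status() in the quoted patch skip ports with the test (port_mask & (1 << portid)) == 0. The following is only a minimal, DPDK-free sketch of that check, not code from the patch: the helper names port_is_enabled(), count_enabled_ports() and port_mask_selfcheck() are hypothetical, and the sketch assumes a 32-bit mask as in the check_link_status() signature. It shifts an unsigned constant and rejects port ids of 32 or more, since shifting a plain int by that many bits is undefined in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns true when bit
 * `portid` is set in `port_mask`. Port ids that do not fit in the
 * 32-bit mask are treated as disabled.
 */
static bool
port_is_enabled(uint32_t port_mask, uint16_t portid)
{
	if (portid >= 32)
		return false;
	return (port_mask & (UINT32_C(1) << portid)) != 0;
}

/*
 * Hypothetical helper: counts how many of the first `nb_ports` port ids
 * are enabled in `port_mask`.
 */
static unsigned int
count_enabled_ports(uint32_t port_mask, uint16_t nb_ports)
{
	unsigned int count = 0;
	uint16_t portid;

	for (portid = 0; portid < nb_ports; portid++)
		if (port_is_enabled(port_mask, portid))
			count++;
	return count;
}

/* Small self-check using the masks from the two example command lines. */
static void
port_mask_selfcheck(void)
{
	assert(port_is_enabled(0x1, 0));          /* -p 0x1: port 0 only */
	assert(!port_is_enabled(0x1, 1));
	assert(count_enabled_ports(0x3, 8) == 2); /* -p 0x3: ports 0 and 1 */
}
```

With -p 0x3, as in the second example command, ports 0 and 1 pass the check and every other detected port is skipped.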