Date: Fri, 27 Sep 2019 14:22:59 +0100
From: Bruce Richardson
To: Marcin Baran
Cc: dev@dpdk.org, John McNamara, Marko Kovacevic
Subject: Re: [dpdk-dev] [PATCH v5 6/6] doc/guides/: provide IOAT sample app guide
Message-ID: <20190927132259.GA1859@bricha3-MOBL.ger.corp.intel.com>
References: <20190919093850.460-1-marcinx.baran@intel.com>
 <20190920073714.1314-1-marcinx.baran@intel.com>
 <20190920073714.1314-7-marcinx.baran@intel.com>
In-Reply-To: <20190920073714.1314-7-marcinx.baran@intel.com>

On Fri, Sep 20, 2019 at 09:37:14AM +0200, Marcin Baran wrote:
> Added guide for IOAT sample app usage and
> code description.
> 
> Signed-off-by: Marcin Baran
> ---
>  doc/guides/sample_app_ug/index.rst |   1 +
>  doc/guides/sample_app_ug/intro.rst |   4 +
>  doc/guides/sample_app_ug/ioat.rst  | 764 +++++++++++++++++++++++++++++
>  3 files changed, 769 insertions(+)
>  create mode 100644 doc/guides/sample_app_ug/ioat.rst
> 
> +Depending on mode set (whether copy should be done by software or by hardware)
> +special structures are assigned to each port. If software copy was chosen,
> +application have to assign ring structures for packet exchanging between lcores
> +assigned to ports.
> +
> +.. code-block:: c
> +
> +   static void
> +   assign_rings(void)
> +   {
> +       uint32_t i;
> +
> +       for (i = 0; i < cfg.nb_ports; i++) {
> +           char ring_name[20];
> +
> +           snprintf(ring_name, 20, "rx_to_tx_ring_%u", i);
> +           /* Create ring for inter core communication */
> +           cfg.ports[i].rx_to_tx_ring = rte_ring_create(
> +               ring_name, ring_size,
> +               rte_socket_id(), RING_F_SP_ENQ);
> +
> +           if (cfg.ports[i].rx_to_tx_ring == NULL)
> +               rte_exit(EXIT_FAILURE, "%s\n",
> +                   rte_strerror(rte_errno));
> +       }
> +   }
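
One thought on the rings section: since the ring is created with only
RING_F_SP_ENQ, enqueues take the single-producer path while dequeues use the
default multi-consumer one; if there is only ever one tx lcore per ring,
adding RING_F_SC_DEQ would be a cheap win. It might also help doc readers to
see how the two lcores actually use this ring, since that is the heart of
the software-copy path. Rough, untested sketch only, and the function names
below are mine rather than anything in the app:

    #include <rte_ring.h>
    #include <rte_mbuf.h>

    /* sketch: producer side, run on the rx lcore (single producer) */
    static unsigned int
    sw_copy_enqueue(struct rte_ring *r, struct rte_mbuf **bufs, unsigned int n)
    {
        /* returns how many of the n mbufs were actually enqueued */
        return rte_ring_enqueue_burst(r, (void **)bufs, n, NULL);
    }

    /* sketch: consumer side, run on the tx lcore */
    static unsigned int
    sw_copy_dequeue(struct rte_ring *r, struct rte_mbuf **bufs, unsigned int n)
    {
        /* returns how many mbufs were dequeued, up to n */
        return rte_ring_dequeue_burst(r, (void **)bufs, n, NULL);
    }

Anything the producer fails to enqueue has to be freed by the caller, which
the app does handle below, so this is just about making the pattern visible
in the guide.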
> +
> +When using hardware copy each Rx queue of the port is assigned an
> +IOAT device (``assign_rawdevs()``) using IOAT Rawdev Driver API
> +functions:
> +
> +.. code-block:: c
> +
> +   static void
> +   assign_rawdevs(void)
> +   {
> +       uint16_t nb_rawdev = 0, rdev_id = 0;
> +       uint32_t i, j;
> +
> +       for (i = 0; i < cfg.nb_ports; i++) {
> +           for (j = 0; j < cfg.ports[i].nb_queues; j++) {
> +               struct rte_rawdev_info rdev_info = { 0 };
> +
> +               do {
> +                   if (rdev_id == rte_rawdev_count())
> +                       goto end;
> +                   rte_rawdev_info_get(rdev_id++, &rdev_info);
> +               } while (strcmp(rdev_info.driver_name,
> +                   IOAT_PMD_RAWDEV_NAME_STR) != 0);
> +
> +               cfg.ports[i].ioat_ids[j] = rdev_id - 1;
> +               configure_rawdev_queue(cfg.ports[i].ioat_ids[j]);
> +               ++nb_rawdev;
> +           }
> +       }
> +   end:
> +       if (nb_rawdev < cfg.nb_ports * cfg.ports[0].nb_queues)
> +           rte_exit(EXIT_FAILURE,
> +               "Not enough IOAT rawdevs (%u) for all queues (%u).\n",
> +               nb_rawdev, cfg.nb_ports * cfg.ports[0].nb_queues);
> +       RTE_LOG(INFO, IOAT, "Number of used rawdevs: %u.\n", nb_rawdev);
> +   }
> +
> +
> +The initialization of hardware device is done by ``rte_rawdev_configure()``
> +function and ``rte_rawdev_info`` struct.

... using ``rte_rawdev_info`` struct

> +After configuration the device is
> +started using ``rte_rawdev_start()`` function. Each of the above operations
> +is done in ``configure_rawdev_queue()``.

In the block below, there is no mention of where the dev_config structure
comes from. I presume it's a global variable, so maybe mention that in the
text.

> +
> +.. code-block:: c
> +
> +   static void
> +   configure_rawdev_queue(uint32_t dev_id)
> +   {
> +       struct rte_rawdev_info info = { .dev_private = &dev_config };
> +
> +       /* Configure hardware copy device */
> +       dev_config.ring_size = ring_size;
> +
> +       if (rte_rawdev_configure(dev_id, &info) != 0) {
> +           rte_exit(EXIT_FAILURE,
> +               "Error with rte_rawdev_configure()\n");
> +       }
> +       rte_rawdev_info_get(dev_id, &info);
> +       if (dev_config.ring_size != ring_size) {
> +           rte_exit(EXIT_FAILURE,
> +               "Error, ring size is not %d (%d)\n",
> +               ring_size, (int)dev_config.ring_size);
> +       }
> +       if (rte_rawdev_start(dev_id) != 0) {
> +           rte_exit(EXIT_FAILURE,
> +               "Error with rte_rawdev_start()\n");
> +       }
> +   }
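
On my dev_config question above: if it is indeed file-scope state, a short
snippet in the guide would save readers a lookup. My guess at its shape,
based purely on how configure_rawdev_queue() uses it, so treat this as a
sketch rather than the actual code (the 2048 default in particular is a
placeholder):

    #include <rte_ioat_rawdev.h>

    /* presumed file-scope state shared with configure_rawdev_queue();
     * rte_ioat_rawdev_config is the IOAT driver's private config struct,
     * passed to the device via rte_rawdev_info.dev_private above */
    static struct rte_ioat_rawdev_config dev_config;
    static unsigned short ring_size = 2048; /* hypothetical default */

A sentence in the text saying the struct is global (or wherever it lives)
would be enough.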
> +
> +If initialization is successful memory for hardware device
> +statistics is allocated.

Missing "," after successful.
Where is this memory allocated? Is it done in main() or elsewhere?

> +
> +Finally ``main()`` functions starts all processing lcores and starts

s/functions/function/
s/processing lcores/packet handling lcores/

> +printing stats in a loop on master lcore. The application can be

s/master lcore/the master lcore/

> +interrupted and closed using ``Ctrl-C``. The master lcore waits for
> +all slave processes to finish, deallocates resources and exits.
> +
> +The processing lcores launching function are described below.
> +
> +The Lcores Launching Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +As described above ``main()`` function invokes ``start_forwarding_cores()``

Missing "," after above.

> +function in order to start processing for each lcore:
> +
> +.. code-block:: c
> +
> +   static void start_forwarding_cores(void)
> +   {
> +       uint32_t lcore_id = rte_lcore_id();
> +
> +       RTE_LOG(INFO, IOAT, "Entering %s on lcore %u\n",
> +           __func__, rte_lcore_id());
> +
> +       if (cfg.nb_lcores == 1) {
> +           lcore_id = rte_get_next_lcore(lcore_id, true, true);
> +           rte_eal_remote_launch((lcore_function_t *)rxtx_main_loop,
> +               NULL, lcore_id);
> +       } else if (cfg.nb_lcores > 1) {
> +           lcore_id = rte_get_next_lcore(lcore_id, true, true);
> +           rte_eal_remote_launch((lcore_function_t *)rx_main_loop,
> +               NULL, lcore_id);
> +
> +           lcore_id = rte_get_next_lcore(lcore_id, true, true);
> +           rte_eal_remote_launch((lcore_function_t *)tx_main_loop, NULL,
> +               lcore_id);
> +       }
> +   }
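
While I'm at this function: the return value of rte_eal_remote_launch() is
ignored, and the casts to lcore_function_t suggest the loop functions don't
have the standard int (*)(void *) signature. Not a blocker for a docs patch,
but if the guide wants to show the conventional launch pattern, something
like the following untested sketch may be clearer. All names here are
placeholders of mine, not the app's:

    #include <stdlib.h>
    #include <rte_common.h>
    #include <rte_lcore.h>
    #include <rte_launch.h>
    #include <rte_debug.h>

    /* standard lcore_function_t signature: int fn(void *), no cast needed */
    static int
    worker_main(void *arg __rte_unused)
    {
        /* per-lcore processing loop would go here */
        return 0;
    }

    static void
    launch_workers(void)
    {
        unsigned int lcore_id;

        /* iterate over all enabled lcores except the master */
        RTE_LCORE_FOREACH_SLAVE(lcore_id) {
            if (rte_eal_remote_launch(worker_main, NULL, lcore_id) != 0)
                rte_exit(EXIT_FAILURE, "Cannot launch lcore %u\n",
                    lcore_id);
        }
    }

Matching the signature also removes the need for the function-pointer casts,
which are technically undefined behaviour if the signatures differ.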
> +
> +The function launches Rx/Tx processing functions on configured lcores
> +for each port using ``rte_eal_remote_launch()``. The configured ports,

Remove "for each port"

> +their number and number of assigned lcores are stored in user-defined
> +``rxtx_transmission_config`` struct that is initialized before launching

s/is/has been/
Did you describe how that structure was set up previously?

> +tasks:
> +
> +.. code-block:: c
> +
> +   struct rxtx_transmission_config {
> +       struct rxtx_port_config ports[RTE_MAX_ETHPORTS];
> +       uint16_t nb_ports;
> +       uint16_t nb_lcores;
> +   };
> +
> +The Lcores Processing Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +For receiving packets on each port an ``ioat_rx_port()`` function is used.

Missing "," after port.
s/an/the/

> +The function receives packets on each configured Rx queue. Depending on mode

s/mode/the mode/

> +the user chose, it will enqueue packets to IOAT rawdev channels and then invoke
> +copy process (hardware copy), or perform software copy of each packet using
> +``pktmbuf_sw_copy()`` function and enqueue them to 1 rte_ring:

s/1 rte_ring/an rte_ring/

> +
> +.. code-block:: c
> +
> +   /* Receive packets on one port and enqueue to IOAT rawdev or rte_ring. */
> +   static void
> +   ioat_rx_port(struct rxtx_port_config *rx_config)
> +   {
> +       uint32_t nb_rx, nb_enq, i, j;
> +       struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
> +       for (i = 0; i < rx_config->nb_queues; i++) {
> +
> +           nb_rx = rte_eth_rx_burst(rx_config->rxtx_port, i,
> +               pkts_burst, MAX_PKT_BURST);
> +
> +           if (nb_rx == 0)
> +               continue;
> +
> +           port_statistics.rx[rx_config->rxtx_port] += nb_rx;
> +
> +           if (copy_mode == COPY_MODE_IOAT_NUM) {
> +               /* Perform packet hardware copy */
> +               nb_enq = ioat_enqueue_packets(pkts_burst,
> +                   nb_rx, rx_config->ioat_ids[i]);
> +               if (nb_enq > 0)
> +                   rte_ioat_do_copies(rx_config->ioat_ids[i]);
> +           } else {
> +               /* Perform packet software copy, free source packets */
> +               int ret;
> +               struct rte_mbuf *pkts_burst_copy[MAX_PKT_BURST];
> +
> +               ret = rte_mempool_get_bulk(ioat_pktmbuf_pool,
> +                   (void *)pkts_burst_copy, nb_rx);
> +
> +               if (unlikely(ret < 0))
> +                   rte_exit(EXIT_FAILURE,
> +                       "Unable to allocate memory.\n");
> +
> +               for (j = 0; j < nb_rx; j++)
> +                   pktmbuf_sw_copy(pkts_burst[j],
> +                       pkts_burst_copy[j]);
> +
> +               rte_mempool_put_bulk(ioat_pktmbuf_pool,
> +                   (void *)pkts_burst, nb_rx);
> +
> +               nb_enq = rte_ring_enqueue_burst(
> +                   rx_config->rx_to_tx_ring,
> +                   (void *)pkts_burst_copy, nb_rx, NULL);
> +
> +               /* Free any not enqueued packets. */
> +               rte_mempool_put_bulk(ioat_pktmbuf_pool,
> +                   (void *)&pkts_burst_copy[nb_enq],
> +                   nb_rx - nb_enq);
> +           }
> +
> +           port_statistics.copy_dropped[rx_config->rxtx_port] +=
> +               (nb_rx - nb_enq);
> +       }
> +   }
> +
> +The packets are received in burst mode using ``rte_eth_rx_burst()``
> +function. When using hardware copy mode the packets are enqueued in
> +copying device's buffer using ``ioat_enqueue_packets()`` which calls
> +``rte_ioat_enqueue_copy()``. When all received packets are in the
> +buffer the copies are invoked by calling ``rte_ioat_do_copies()``.

s/copies are invoked/copy operations are started/

> +Function ``rte_ioat_enqueue_copy()`` operates on physical address of
> +the packet. Structure ``rte_mbuf`` contains only physical address to
> +start of the data buffer (``buf_iova``). Thus the address is shifted

s/shifted/adjusted/

> +by ``addr_offset`` value in order to get pointer to ``rearm_data``

s/pointer to/the address of/

> +member of ``rte_mbuf``. That way the packet is copied all at once
> +(with data and metadata).

"That way both the packet data and metadata can be copied in a single
operation."
Should also note that this shortcut can be used because the mbufs are
"direct" mbufs allocated by the app. If another app uses external buffers
or indirect mbufs, then multiple copy operations must be used.

> +
> +.. code-block:: c
> +
> +   static uint32_t
> +   ioat_enqueue_packets(struct rte_mbuf **pkts,
> +       uint32_t nb_rx, uint16_t dev_id)
> +   {
> +       int ret;
> +       uint32_t i;
> +       struct rte_mbuf *pkts_copy[MAX_PKT_BURST];
> +
> +       const uint64_t addr_offset = RTE_PTR_DIFF(pkts[0]->buf_addr,
> +           &pkts[0]->rearm_data);
> +
> +       ret = rte_mempool_get_bulk(ioat_pktmbuf_pool,
> +           (void *)pkts_copy, nb_rx);
> +
> +       if (unlikely(ret < 0))
> +           rte_exit(EXIT_FAILURE, "Unable to allocate memory.\n");
> +
> +       for (i = 0; i < nb_rx; i++) {
> +           /* Perform data copy */
> +           ret = rte_ioat_enqueue_copy(dev_id,
> +               pkts[i]->buf_iova
> +               - addr_offset,
> +               pkts_copy[i]->buf_iova
> +               - addr_offset,
> +               rte_pktmbuf_data_len(pkts[i])
> +               + addr_offset,
> +               (uintptr_t)pkts[i],
> +               (uintptr_t)pkts_copy[i],
> +               0 /* nofence */);
> +
> +           if (ret != 1)
> +               break;
> +       }
> +
> +       ret = i;
> +       /* Free any not enqueued packets. */
> +       rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts[i], nb_rx - i);
> +       rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts_copy[i],
> +           nb_rx - i);
> +
> +       return ret;
> +   }
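
To expand on my direct-mbuf note above: with external or indirect buffers
the metadata and the packet data are not one contiguous region, so the
single-copy trick doesn't apply. If the doc wants to spell that out, the
per-packet handling would look roughly like the untested sketch below,
copying the metadata on the CPU and handing only the data to the device.
The function name is a placeholder of mine, and real code would need more
care with the refcnt/ol_flags fields in the copied metadata region:

    #include <stddef.h>
    #include <rte_mbuf.h>
    #include <rte_memcpy.h>
    #include <rte_ioat_rawdev.h>

    static int
    enqueue_one_nondirect(uint16_t dev_id, struct rte_mbuf *src,
        struct rte_mbuf *dst)
    {
        /* metadata copied by the CPU, same region as pktmbuf_sw_copy() */
        rte_memcpy(&dst->rearm_data, &src->rearm_data,
            offsetof(struct rte_mbuf, cacheline1)
            - offsetof(struct rte_mbuf, rearm_data));

        /* only the packet data is handed to the IOAT device;
         * rte_pktmbuf_iova() gives the IOVA of the data itself,
         * wherever the underlying buffer lives */
        return rte_ioat_enqueue_copy(dev_id,
            rte_pktmbuf_iova(src),
            rte_pktmbuf_iova(dst),
            rte_pktmbuf_data_len(src),
            (uintptr_t)src, (uintptr_t)dst,
            0 /* no fence */);
    }

Two operations per packet instead of one, which is exactly why the
direct-mbuf shortcut in the app is worth calling out explicitly.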
> +
> +
> +All done copies are processed by ``ioat_tx_port()`` function. When using

s/done/completed/

> +hardware copy mode the function invokes ``rte_ioat_completed_copies()``
> +on each assigned IOAT channel to gather copied packets. If software copy
> +mode is used the function dequeues copied packets from the rte_ring. Then each
> +packet MAC address is changed if it was enabled. After that copies are sent
> +in burst mode using `` rte_eth_tx_burst()``.
> +
> +
> +.. code-block:: c
> +
> +   /* Transmit packets from IOAT rawdev/rte_ring for one port. */
> +   static void
> +   ioat_tx_port(struct rxtx_port_config *tx_config)
> +   {
> +       uint32_t i, j, nb_dq = 0;
> +       struct rte_mbuf *mbufs_src[MAX_PKT_BURST];
> +       struct rte_mbuf *mbufs_dst[MAX_PKT_BURST];
> +
> +       if (copy_mode == COPY_MODE_IOAT_NUM) {
> +           for (i = 0; i < tx_config->nb_queues; i++) {
> +               /* Dequeue the mbufs from IOAT device. */
> +               nb_dq = rte_ioat_completed_copies(
> +                   tx_config->ioat_ids[i], MAX_PKT_BURST,
> +                   (void *)mbufs_src, (void *)mbufs_dst);
> +
> +               if (nb_dq == 0)
> +                   break;
> +
> +               rte_mempool_put_bulk(ioat_pktmbuf_pool,
> +                   (void *)mbufs_src, nb_dq);
> +
> +               /* Update macs if enabled */
> +               if (mac_updating) {
> +                   for (j = 0; j < nb_dq; j++)
> +                       update_mac_addrs(mbufs_dst[j],
> +                           tx_config->rxtx_port);
> +               }
> +
> +               const uint16_t nb_tx = rte_eth_tx_burst(
> +                   tx_config->rxtx_port, 0,
> +                   (void *)mbufs_dst, nb_dq);
> +
> +               port_statistics.tx[tx_config->rxtx_port] += nb_tx;
> +
> +               /* Free any unsent packets. */
> +               if (unlikely(nb_tx < nb_dq))
> +                   rte_mempool_put_bulk(ioat_pktmbuf_pool,
> +                       (void *)&mbufs_dst[nb_tx],
> +                       nb_dq - nb_tx);
> +           }
> +       }
> +       else {
> +           for (i = 0; i < tx_config->nb_queues; i++) {
> +               /* Dequeue the mbufs from rte_ring. */
> +               nb_dq = rte_ring_dequeue_burst(tx_config->rx_to_tx_ring,
> +                   (void *)mbufs_dst, MAX_PKT_BURST, NULL);
> +
> +               if (nb_dq == 0)
> +                   return;
> +
> +               /* Update macs if enabled */
> +               if (mac_updating) {
> +                   for (j = 0; j < nb_dq; j++)
> +                       update_mac_addrs(mbufs_dst[j],
> +                           tx_config->rxtx_port);
> +               }
> +
> +               const uint16_t nb_tx = rte_eth_tx_burst(tx_config->rxtx_port,
> +                   0, (void *)mbufs_dst, nb_dq);
> +
> +               port_statistics.tx[tx_config->rxtx_port] += nb_tx;
> +
> +               /* Free any unsent packets. */
> +               if (unlikely(nb_tx < nb_dq))
> +                   rte_mempool_put_bulk(ioat_pktmbuf_pool,
> +                       (void *)&mbufs_dst[nb_tx],
> +                       nb_dq - nb_tx);
> +           }
> +       }
> +   }
> +
> +The Packet Copying Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +In order to perform packet copy there is a user-defined function
> +``pktmbuf_sw_copy()`` used. It copies a whole packet by copying
> +metadata from source packet to new mbuf, and then copying a data
> +chunk of source packet. Both memory copies are done using
> +``rte_memcpy()``:
> +
> +.. code-block:: c
> +
> +   static inline void
> +   pktmbuf_sw_copy(struct rte_mbuf *src, struct rte_mbuf *dst)
> +   {
> +       /* Copy packet metadata */
> +       rte_memcpy(&dst->rearm_data,
> +           &src->rearm_data,
> +           offsetof(struct rte_mbuf, cacheline1)
> +           - offsetof(struct rte_mbuf, rearm_data));
> +
> +       /* Copy packet data */
> +       rte_memcpy(rte_pktmbuf_mtod(dst, char *),
> +           rte_pktmbuf_mtod(src, char *), src->data_len);
> +   }
> +
> +The metadata in this example is copied from ``rearm_data`` member of
> +``rte_mbuf`` struct up to ``cacheline1``.
> +
> +In order to understand why software packet copying is done as shown
> +above please refer to the "Mbuf Library" section of the
> +*DPDK Programmer's Guide*.
> \ No newline at end of file

Use a text editor that adds a newline automatically :-)

/Bruce