From: Aaron Conole
To: Marcin Baran
Cc: dev@dpdk.org, Pawel Modrak
References: <20190909082939.1629-1-marcinx.baran@intel.com>
In-Reply-To: <20190909082939.1629-1-marcinx.baran@intel.com> (Marcin Baran's message of "Mon, 9 Sep 2019 10:29:38 +0200")
Date: Mon, 09 Sep 2019 09:12:32 -0400
Subject: Re: [dpdk-dev] [PATCH] examples/ioat: create sample app on ioat driver usage

Marcin Baran writes:

> From: Pawel Modrak
>
> A new sample app demonstrating use of the driver for CBDMA.
> The app receives packets, performs software or hardware
> copy, changes packets' MAC addresses (if enabled) and
> forwards them. The patch includes the sample application
> as well as its guide.
>
> Signed-off-by: Pawel Modrak
> Signed-off-by: Marcin Baran
> ---
>  doc/guides/sample_app_ug/index.rst | 1 +
>  doc/guides/sample_app_ug/intro.rst | 4 +
>  doc/guides/sample_app_ug/ioat.rst | 691 +++++++++++++++++++
>  examples/Makefile | 3 +
>  examples/ioat/Makefile | 54 ++
>  examples/ioat/ioatfwd.c | 1010 ++++++++++++++++++++++++++++
>  examples/ioat/meson.build | 13 +
>  examples/meson.build | 1 +
>  8 files changed, 1777 insertions(+)
>  create mode 100644 doc/guides/sample_app_ug/ioat.rst
>  create mode 100644 examples/ioat/Makefile
>  create mode 100644 examples/ioat/ioatfwd.c
>  create mode 100644 examples/ioat/meson.build
>
> diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
> index f23f8f59e..a6a1d9e7a 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -23,6 +23,7 @@ Sample Applications User Guides
>      ip_reassembly
>      kernel_nic_interface
>      keep_alive
> +    ioat
>      l2_forward_crypto
>      l2_forward_job_stats
>      l2_forward_real_virtual
> diff --git a/doc/guides/sample_app_ug/intro.rst b/doc/guides/sample_app_ug/intro.rst
> index 90704194a..74462312f 100644
> --- a/doc/guides/sample_app_ug/intro.rst
> +++ b/doc/guides/sample_app_ug/intro.rst
> @@ -91,6 +91,10 @@ examples are highlighted below.
>    forwarding, or ``l3fwd`` application does forwarding based on Internet
>    Protocol, IPv4 or IPv6 like a simple router.
>
> +* :doc:`Hardware packet copying`: The Hardware packet copying,
> +  or ``ioatfwd`` application, demonstrates how to use the IOAT rawdev driver
> +  for copying packets between two threads.
> +
>  * :doc:`Packet Distributor`: The Packet Distributor
>    demonstrates how to distribute packets arriving on an Rx port to different
>    cores for processing and transmission.
> diff --git a/doc/guides/sample_app_ug/ioat.rst b/doc/guides/sample_app_ug/ioat.rst
> new file mode 100644
> index 000000000..378d70b81
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/ioat.rst
> @@ -0,0 +1,691 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright(c) 2019 Intel Corporation.
> +
> +Sample Application of packet copying using Intel\ |reg| QuickData Technology
> +============================================================================
> +
> +Overview
> +--------
> +
> +This sample is intended as a demonstration of the basic components of a DPDK
> +forwarding application and an example of how to use the IOAT driver API to
> +make packet copies.
> +
> +Also while forwarding, the MAC addresses are affected as follows:
> +
> +* The source MAC address is replaced by the TX port MAC address
> +
> +* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID
> +
> +This application can be used to compare the performance of software packet
> +copying with packet copying done using a DMA device, for different packet
> +sizes. The example prints statistics each second, showing received/sent
> +packets and packets dropped or failed to copy.
> +
> +Compiling the Application
> +-------------------------
> +
> +To compile the sample application see :doc:`compiling`.
> +
> +The application is located in the ``ioat`` sub-directory.
> +
> +
> +Running the Application
> +-----------------------
> +
> +In order to run the hardware copy application, the copying device
> +needs to be bound to a user-space IO driver.
> +
> +Refer to the *IOAT Rawdev Driver for Intel\ |reg| QuickData Technology*
> +guide for information on using the driver.
> +
> +The application requires a number of command line options:
> +
> +.. code-block:: console
> +
> +    ./build/ioatfwd [EAL options] -- -p MASK [-c CT] [-s RS] [--[no-]mac-updating]
> +
> +where,
> +
> +* -p MASK: A hexadecimal bitmask of the ports to configure
> +
> +* -c CT: The packet copy type to perform: software (sw) or hardware using
> +  DMA (rawdev)
> +
> +* -s RS: The size of the IOAT rawdev ring for hardware copy mode, or of the
> +  rte_ring for software copy mode
> +
> +* --[no-]mac-updating: Whether the MAC addresses of packets should be changed
> +  or not
> +
> +The application can be launched in two different configurations:
> +
> +* Performing software packet copying
> +
> +* Performing hardware packet copying
> +
> +Each port needs two lcores: one of them receives incoming traffic and makes
> +a copy of each packet. The second lcore then updates the MAC address and
> +sends the copy. For each configuration an additional lcore is needed, since
> +the master lcore is responsible for configuration, statistics printing and
> +safe deinitialization of all ports and devices.
> +
> +The application can use a maximum of 8 ports.
> +
> +To run the application in a Linux environment with 3 lcores (one of them
> +being the master lcore), 1 port (port 0), software copying and MAC updating,
> +issue the command:
> +
> +.. 
code-block:: console > + > + $ ./build/ioatfwd -l 0-2 -n 2 -- -p 0x1 --mac-updating -c sw > + > +To run the application in a Linux environment with 5 lcores (one of them > +is master lcore), 2 ports (ports 0 and 1), hardware copying and no MAC > +updating issue the command: > + > +.. code-block:: console > + > + $ ./build/ioatfwd -l 0-4 -n 1 -- -p 0x3 --no-mac-updating -c rawdev > + > +Refer to the *DPDK Getting Started Guide* for general information on > +running applications and the Environment Abstraction Layer (EAL) options. > + > +Explanation > +----------- > + > +The following sections provide an explanation of the main components of the > +code. > + > +All DPDK library functions used in the sample code are prefixed with > +``rte_`` and are explained in detail in the *DPDK API Documentation*. > + > + > +The Main Function > +~~~~~~~~~~~~~~~~~ > + > +The ``main()`` function performs the initialization and calls the execution > +threads for each lcore. > + > +The first task is to initialize the Environment Abstraction Layer (EAL). > +The ``argc`` and ``argv`` arguments are provided to the ``rte_eal_init()`` > +function. The value returned is the number of parsed arguments: > + > +.. code-block:: c > + > + /* init EAL */ > + ret = rte_eal_init(argc, argv); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n"); > + > + > +The ``main()`` also allocates a mempool to hold the mbufs (Message Buffers) > +used by the application: > + > +.. code-block:: c > + > + nb_mbufs = RTE_MAX(rte_eth_dev_count_avail() * (nb_rxd + nb_txd > + + MAX_PKT_BURST + rte_lcore_count() * MEMPOOL_CACHE_SIZE), > + MIN_POOL_SIZE); > + > + /* Create the mbuf pool */ > + ioat_pktmbuf_pool = rte_pktmbuf_pool_create("mbuf_pool", nb_mbufs, > + MEMPOOL_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, > + rte_socket_id()); > + if (ioat_pktmbuf_pool == NULL) > + rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n"); > + > +Mbufs are the packet buffer structure used by DPDK. They are explained in > +detail in the "Mbuf Library" section of the *DPDK Programmer's Guide*. > + > +The ``main()`` function also initializes the ports: > + > +.. code-block:: c > + > + /* Initialise each port */ > + RTE_ETH_FOREACH_DEV(portid) { > + port_init(portid, ioat_pktmbuf_pool); > + } > + > +Each port is configured using ``port_init()``: > + > +.. code-block:: c > + > + static inline void > + port_init(uint16_t portid, struct rte_mempool *mbuf_pool) > + { > + struct rte_eth_rxconf rxq_conf; > + struct rte_eth_txconf txq_conf; > + struct rte_eth_conf local_port_conf = port_conf; > + struct rte_eth_dev_info dev_info; > + int ret; > + > + /* Skip ports that are not enabled */ > + if ((ioat_enabled_port_mask & (1 << portid)) == 0) { > + printf("Skipping disabled port %u\n", portid); > + return; > + } > + > + /* Init port */ > + printf("Initializing port %u... 
", portid); > + fflush(stdout); > + rte_eth_dev_info_get(portid, &dev_info); > + if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE) > + local_port_conf.txmode.offloads |= > + DEV_TX_OFFLOAD_MBUF_FAST_FREE; > + ret = rte_eth_dev_configure(portid, 1, 1, &local_port_conf); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n", > + ret, portid); > + > + ret = rte_eth_dev_adjust_nb_rx_tx_desc(portid, &nb_rxd, > + &nb_txd); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, > + "Cannot adjust number of descriptors: err=%d, port=%u\n", > + ret, portid); > + > + rte_eth_macaddr_get(portid, &ioat_ports_eth_addr[portid]); > + > + /* Init one RX queue */ > + fflush(stdout); > + rxq_conf = dev_info.default_rxconf; > + rxq_conf.offloads = local_port_conf.rxmode.offloads; > + ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd, > + rte_eth_dev_socket_id(portid), > + &rxq_conf, > + mbuf_pool); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n", > + ret, portid); > + > + /* Init one TX queue on each port */ > + fflush(stdout); > + txq_conf = dev_info.default_txconf; > + txq_conf.offloads = local_port_conf.txmode.offloads; > + ret = rte_eth_tx_queue_setup(portid, 0, nb_txd, > + rte_eth_dev_socket_id(portid), > + &txq_conf); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n", > + ret, portid); > + > + /* Initialize TX buffers */ > + tx_buffer[portid] = rte_zmalloc_socket("tx_buffer", > + RTE_ETH_TX_BUFFER_SIZE(MAX_PKT_BURST), 0, > + rte_eth_dev_socket_id(portid)); > + if (tx_buffer[portid] == NULL) > + rte_exit(EXIT_FAILURE, "Cannot allocate buffer for tx " > + "on port %u\n", portid); > + > + rte_eth_tx_buffer_init(tx_buffer[portid], MAX_PKT_BURST); > + > + ret = rte_eth_tx_buffer_set_err_callback(tx_buffer[portid], > + rte_eth_tx_buffer_count_callback, > + &port_statistics[portid].tx_dropped); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, > + "Cannot set error callback for tx buffer on port %u\n", > + portid); > + > + /* Start device */ > + ret = rte_eth_dev_start(portid); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n", > + ret, portid); > + > + rte_eth_promiscuous_enable(portid); > + > + printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n", > + portid, > + ioat_ports_eth_addr[portid].addr_bytes[0], > + ioat_ports_eth_addr[portid].addr_bytes[1], > + ioat_ports_eth_addr[portid].addr_bytes[2], > + ioat_ports_eth_addr[portid].addr_bytes[3], > + ioat_ports_eth_addr[portid].addr_bytes[4], > + ioat_ports_eth_addr[portid].addr_bytes[5]); > + } > + > +The Ethernet ports are configured with local settings using the > +``rte_eth_dev_configure()`` function and the ``port_conf`` struct: > + > +.. code-block:: c > + > + static struct rte_eth_conf port_conf = { > + .rxmode = { > + .max_rx_pkt_len = RTE_ETHER_MAX_LEN, > + }, > + }; > + > +For this example the ports are set up with 1 RX and 1 TX queue using the > +``rte_eth_rx_queue_setup()`` and ``rte_eth_tx_queue_setup()`` functions. > + > +The Ethernet port is then started: > + > +.. code-block:: c > + > + ret = rte_eth_dev_start(portid); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n", > + ret, portid); > + > + > +Finally the RX port is set in promiscuous mode: > + > +.. code-block:: c > + > + rte_eth_promiscuous_enable(portid); > + > + > +After that each port application assigns resources needed. > + > +.. 
code-block:: c
> +
> +    check_link_status(ioat_enabled_port_mask);
> +
> +    if (!cfg.nb_ports) {
> +        rte_exit(EXIT_FAILURE,
> +            "All available ports are disabled. Please set portmask.\n");
> +    }
> +
> +    /* Check if there is enough lcores for all ports. */
> +    cfg.nb_lcores = rte_lcore_count() - 1;
> +    if (cfg.nb_lcores < 1)
> +        rte_exit(EXIT_FAILURE,
> +            "There should be at least one slave lcore.\n");
> +
> +    ret = 0;
> +
> +    if (copy_mode == COPY_MODE_IOAT_NUM) {
> +        assign_rawdevs();
> +    } else /* copy_mode == COPY_MODE_SW_NUM */ {
> +        assign_rings();
> +    }
> +
> +The link status of each port enabled by the port mask is checked using
> +the ``check_link_status()`` function.
> +
> +.. code-block:: c
> +
> +    /* Check the link status of all ports in up to 9s, and print them finally */
> +    static void
> +    check_link_status(uint32_t port_mask)
> +    {
> +
> +        uint16_t portid;
> +        struct rte_eth_link link;
> +
> +        cfg.nb_ports = 0;
> +
> +        printf("\nChecking link status\n");
> +        fflush(stdout);
> +        RTE_ETH_FOREACH_DEV(portid) {
> +            if (force_quit)
> +                return;
> +            if ((port_mask & (1 << portid)) == 0)
> +                continue;
> +
> +            store_port_nb(portid);
> +
> +            memset(&link, 0, sizeof(link));
> +            rte_eth_link_get(portid, &link);
> +
> +            /* Print link status */
> +            if (link.link_status) {
> +                printf(
> +                    "Port %d Link Up. Speed %u Mbps - %s\n",
> +                    portid, link.link_speed,
> +                    (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
> +                    ("full-duplex") : ("half-duplex"));
> +            } else
> +                printf("Port %d Link Down\n", portid);
> +        }
> +    }
> +
> +Depending on the mode set (whether the copy should be done in software or
> +in hardware), special structures are assigned to each port. If software
> +copy was chosen, the application has to assign ring structures for
> +exchanging packets between the lcores assigned to the ports.
> +
> +.. code-block:: c
> +
> +    static void
> +    assign_rings(void)
> +    {
> +        uint32_t i;
> +
> +        for (i = 0; i < cfg.nb_ports; i++) {
> +            char ring_name[20];
> +
> +            snprintf(ring_name, 20, "rx_to_tx_ring_%u", i);
> +            /* Create ring for inter core communication */
> +            cfg.ports[i].rx_to_tx_ring = rte_ring_create(
> +                ring_name, ring_size,
> +                rte_socket_id(), RING_F_SP_ENQ);
> +
> +            if (cfg.ports[i].rx_to_tx_ring == NULL)
> +                rte_exit(EXIT_FAILURE, "%s\n",
> +                    rte_strerror(rte_errno));
> +        }
> +    }
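One note on the ring setup above: rte_ring_create() only accepts a
power-of-2 count unless the RING_F_EXACT_SZ flag is passed, and the -s
option is only checked for being non-zero. A guard along these lines (a
sketch, reusing this patch's ring_size and ioat_usage() from ioatfwd.c)
might save users a confusing failure:

    /* In ioat_parse_args(), 's' case: reject sizes that
     * rte_ring_create() would refuse anyway (power-of-2 required
     * when RING_F_EXACT_SZ is not used). */
    case 's':
        ring_size = atoi(optarg);
        if (ring_size == 0 || !rte_is_power_of_2(ring_size)) {
            printf("Invalid ring size, %s. Must be a power of 2.\n",
                optarg);
            ioat_usage(prgname);
            return -1;
        }
        break;
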
> +
> +When using hardware copy each port is assigned an IOAT device
> +(``assign_rawdevs()``) using IOAT Rawdev Driver API functions:
> +
> +.. code-block:: c
> +
> +    static void
> +    assign_rawdevs(void)
> +    {
> +        uint16_t nb_rawdev = 0;
> +        uint32_t i;
> +
> +        for (i = 0; i < cfg.nb_ports; i++) {
> +            struct rte_rawdev_info rdev_info = {0};
> +            rte_rawdev_info_get(i, &rdev_info);
> +
> +            if (strcmp(rdev_info.driver_name, "rawdev_ioat") == 0) {
> +                configure_rawdev_queue(i);
> +                cfg.ports[i].dev_id = i;
> +                ++nb_rawdev;
> +            }
> +        }
> +
> +        RTE_LOG(INFO, IOAT, "Number of used rawdevs: %u.\n", nb_rawdev);
> +
> +        if (nb_rawdev < cfg.nb_ports)
> +            rte_exit(EXIT_FAILURE, "Not enough IOAT rawdevs (%u) for ports (%u).\n",
> +                nb_rawdev, cfg.nb_ports);
> +    }
> +
> +
> +The initialization of the hardware device is done using the
> +``rte_rawdev_configure()`` function and the ``rte_rawdev_info`` struct. After
> +configuration the device is started using the ``rte_rawdev_start()`` function.
> +Each of the above operations is done in ``configure_rawdev_queue()``.
> +
> +.. code-block:: c
> +
> +    static void
> +    configure_rawdev_queue(uint32_t dev_id)
> +    {
> +        struct rte_rawdev_info info = { .dev_private = &dev_config };
> +
> +        /* Configure hardware copy device */
> +        dev_config.ring_size = ring_size;
> +
> +        if (rte_rawdev_configure(dev_id, &info) != 0) {
> +            rte_exit(EXIT_FAILURE,
> +                "Error with rte_rawdev_configure()\n");
> +        }
> +        rte_rawdev_info_get(dev_id, &info);
> +        if (dev_config.ring_size != ring_size) {
> +            rte_exit(EXIT_FAILURE,
> +                "Error, ring size is not %d (%d)\n",
> +                ring_size, (int)dev_config.ring_size);
> +        }
> +        if (rte_rawdev_start(dev_id) != 0) {
> +            rte_exit(EXIT_FAILURE,
> +                "Error with rte_rawdev_start()\n");
> +        }
> +    }
> +
> +If initialization is successful, memory for hardware device
> +statistics is allocated.
> +
> +Finally, the ``main()`` function starts all processing lcores and begins
> +printing stats in a loop on the master lcore. The application can be
> +interrupted and closed using ``Ctrl-C``. The master lcore waits for
> +all slave lcores to finish, deallocates resources and exits.
> +
> +The functions launching the processing lcores are described below.
> +
> +The Lcores Launching Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +As described above, the ``main()`` function invokes the ``run_transmission()``
> +function in order to start processing on each lcore:
> +
> +.. code-block:: c
> +
> +    static void run_transmission(void)
> +    {
> +        uint32_t lcore_id = rte_lcore_id();
> +
> +        RTE_LOG(INFO, IOAT, "Entering %s on lcore %u\n",
> +            __func__, rte_lcore_id());
> +
> +        if (cfg.nb_lcores == 1) {
> +            lcore_id = rte_get_next_lcore(lcore_id, true, true);
> +            rte_eal_remote_launch((lcore_function_t *)rxtx_main_loop, NULL, lcore_id);
> +        } else if (cfg.nb_lcores > 1) {
> +            lcore_id = rte_get_next_lcore(lcore_id, true, true);
> +            rte_eal_remote_launch((lcore_function_t *)rx_main_loop, NULL, lcore_id);
> +
> +            lcore_id = rte_get_next_lcore(lcore_id, true, true);
> +            rte_eal_remote_launch((lcore_function_t *)tx_main_loop, NULL, lcore_id);
> +        }
> +    }
> +
> +The function launches rx/tx processing functions on configured lcores
> +for each port using ``rte_eal_remote_launch()``. The configured ports,
> +their number and the number of assigned lcores are stored in the
> +user-defined ``rxtx_transmission_config`` struct that is initialized
> +before launching tasks:
> +
> +.. code-block:: c
> +
> +    struct rxtx_transmission_config {
> +        struct rxtx_port_config ports[RTE_MAX_ETHPORTS];
> +        uint16_t nb_ports;
> +        uint16_t nb_lcores;
> +    };
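The (lcore_function_t *) casts above (and in start_forwarding_cores() in
ioatfwd.c) convert between incompatible function pointer types, since the
loop functions return void; calling through such a pointer is undefined
behaviour. A cast-free sketch, assuming the loops are reworked to match
the signature lcore_function_t expects:

    /* Give the loop the int (*)(void *) signature that
     * rte_eal_remote_launch() expects, instead of casting. */
    static int
    rxtx_main_loop(__rte_unused void *arg)
    {
        uint16_t i;
        const uint16_t nb_ports = cfg.nb_ports;

        while (!force_quit)
            for (i = 0; i < nb_ports; i++) {
                ioat_rx_port(&cfg.ports[i]);
                ioat_tx_port(&cfg.ports[i]);
            }
        return 0;
    }

    /* ...then: rte_eal_remote_launch(rxtx_main_loop, NULL, lcore_id); */
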
> +
> +The Lcores Processing Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The ``ioat_rx_port()`` function is used for receiving packets on each port.
> +Depending on the mode the user chose, it will enqueue packets to the IOAT
> +rawdev and then invoke the copy process (hardware copy), or perform a
> +software copy of each packet using the ``pktmbuf_sw_copy()`` function and
> +enqueue them to an rte_ring:
> +
> +.. code-block:: c
> +
> +    /* Receive packets on one port and enqueue to IOAT rawdev or rte_ring. */
> +    static void
> +    ioat_rx_port(struct rxtx_port_config *rx_config)
> +    {
> +        uint32_t nb_rx, nb_enq, i;
> +        struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
> +
> +        nb_rx = rte_eth_rx_burst(rx_config->rx_portId, 0,
> +            pkts_burst, MAX_PKT_BURST);
> +
> +        if (nb_rx == 0)
> +            return;
> +
> +        port_statistics[rx_config->rx_portId].rx += nb_rx;
> +
> +        if (copy_mode == COPY_MODE_IOAT_NUM) {
> +            /* Perform packet hardware copy */
> +            nb_enq = ioat_enqueue_packets(rx_config,
> +                pkts_burst, nb_rx);
> +
> +            if (nb_enq > 0)
> +                rte_ioat_do_copies(rx_config->dev_id);
> +        } else {
> +            /* Perform packet software copy, free source packets */
> +            int ret;
> +            struct rte_mbuf *pkts_burst_copy[MAX_PKT_BURST];
> +
> +            ret = rte_pktmbuf_alloc_bulk(ioat_pktmbuf_pool,
> +                pkts_burst_copy, nb_rx);
> +
> +            if (unlikely(ret < 0))
> +                rte_exit(EXIT_FAILURE, "Unable to allocate memory.\n");
> +
> +            for (i = 0; i < nb_rx; i++) {
> +                pktmbuf_sw_copy(pkts_burst[i], pkts_burst_copy[i]);
> +                rte_pktmbuf_free(pkts_burst[i]);
> +            }
> +
> +            nb_enq = rte_ring_enqueue_burst(rx_config->rx_to_tx_ring,
> +                (void *)pkts_burst_copy, nb_rx, NULL);
> +
> +            /* Free any not enqueued packets. */
> +            for (i = nb_enq; i < nb_rx; i++)
> +                rte_pktmbuf_free(pkts_burst_copy[i]);
> +        }
> +
> +        port_statistics[rx_config->rx_portId].copy_dropped
> +            += (nb_rx - nb_enq);
> +    }
> +
> +The packets are received in burst mode using the ``rte_eth_rx_burst()``
> +function. When using hardware copy mode the packets are enqueued in the
> +copying device's buffer using ``ioat_enqueue_packets()``, which calls
> +``rte_ioat_enqueue_copy()``. When all received packets are in the
> +buffer the copies are invoked by calling ``rte_ioat_do_copies()``.
> +The ``rte_ioat_enqueue_copy()`` function operates on the physical address
> +of the packet. The ``rte_mbuf`` structure contains only the physical
> +address of the start of the data buffer (``buf_iova``). Thus the address
> +is shifted back by the ``addr_offset`` value in order to reach the
> +``rearm_data`` member of the ``rte_mbuf``. That way the whole packet
> +(data and metadata) is copied at once.
> +
> +.. code-block:: c
> +
> +    static uint32_t
> +    ioat_enqueue_packets(struct rxtx_port_config *rx_config,
> +        struct rte_mbuf **pkts, uint32_t nb_rx)
> +    {
> +        int ret;
> +        uint32_t i;
> +        struct rte_mbuf *pkts_copy[MAX_PKT_BURST];
> +
> +        const uint64_t addr_offset = RTE_PTR_DIFF(pkts[0]->buf_addr,
> +            &pkts[0]->rearm_data);
> +
> +        ret = rte_pktmbuf_alloc_bulk(ioat_pktmbuf_pool, pkts_copy, nb_rx);
> +
> +        if (unlikely(ret < 0))
> +            rte_exit(EXIT_FAILURE, "Unable to allocate memory.\n");
> +
> +        for (i = 0; i < nb_rx; i++) {
> +            /* Perform data copy */
> +            ret = rte_ioat_enqueue_copy(rx_config->dev_id,
> +                pkts[i]->buf_iova
> +                - addr_offset,
> +                pkts_copy[i]->buf_iova
> +                - addr_offset,
> +                rte_pktmbuf_data_len(pkts[i])
> +                + addr_offset,
> +                (uintptr_t)pkts[i],
> +                (uintptr_t)pkts_copy[i],
> +                0 /* nofence */);
> +
> +            if (ret != 1)
> +                break;
> +        }
> +
> +        ret = i;
> +        /* Free any not enqueued packets. */
> +        for (; i < nb_rx; i++) {
> +            rte_pktmbuf_free(pkts[i]);
> +            rte_pktmbuf_free(pkts_copy[i]);
> +        }
> +
> +        return ret;
> +    }
> +
> +
> +All completed copies are processed by the ``ioat_tx_port()`` function. When
> +using hardware copy mode the function invokes ``rte_ioat_completed_copies()``
> +to gather the copied packets. If software copy mode is used, the function
> +dequeues the copied packets from the rte_ring. Then each packet's MAC
> +address is updated if MAC updating is enabled. After that the copies are
> +sent in burst mode using ``rte_eth_tx_burst()``.
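A side note on rte_ioat_completed_copies(): it returns an int and, per the
rte_ioat_rawdev.h docs if I'm reading them right, -1 on error, while
ioat_tx_port() in ioatfwd.c stores the result in a uint32_t, so an error
would be read as a huge burst size. Something along these lines (a sketch
against the ioat_tx_port() in ioatfwd.c) seems safer:

    /* Take the result as an int first, since -1 signals an error,
     * and only then use it as a burst size. */
    int rc = rte_ioat_completed_copies(tx_config->ioat_ids[i],
        MAX_PKT_BURST, (void *)mbufs_src, (void *)mbufs_dst);
    if (rc < 0)
        rte_exit(EXIT_FAILURE, "rte_ioat_completed_copies failed\n");
    nb_dq = rc;
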
> +
> +
> +.. code-block:: c
> +
> +    /* Transmit packets from IOAT rawdev/rte_ring for one port. */
> +    static void
> +    ioat_tx_port(struct rxtx_port_config *tx_config)
> +    {
> +        uint32_t i, nb_dq;
> +        struct rte_mbuf *mbufs_src[MAX_PKT_BURST];
> +        struct rte_mbuf *mbufs_dst[MAX_PKT_BURST];
> +
> +        if (copy_mode == COPY_MODE_IOAT_NUM) {
> +            /* Dequeue the mbufs from IOAT device. */
> +            nb_dq = rte_ioat_completed_copies(tx_config->dev_id,
> +                MAX_PKT_BURST, (void *)mbufs_src, (void *)mbufs_dst);
> +        } else {
> +            /* Dequeue the mbufs from rx_to_tx_ring. */
> +            nb_dq = rte_ring_dequeue_burst(tx_config->rx_to_tx_ring,
> +                (void *)mbufs_dst, MAX_PKT_BURST, NULL);
> +        }
> +
> +        if (nb_dq == 0)
> +            return;
> +
> +        /* Free source packets */
> +        if (copy_mode == COPY_MODE_IOAT_NUM) {
> +            for (i = 0; i < nb_dq; i++)
> +                rte_pktmbuf_free(mbufs_src[i]);
> +        }
> +
> +        /* Update macs if enabled */
> +        if (mac_updating) {
> +            for (i = 0; i < nb_dq; i++)
> +                update_mac_addrs(mbufs_dst[i],
> +                    tx_config->tx_portId);
> +        }
> +
> +        const uint16_t nb_tx = rte_eth_tx_burst(tx_config->tx_portId,
> +            0, (void *)mbufs_dst, nb_dq);
> +
> +        port_statistics[tx_config->tx_portId].tx += nb_tx;
> +
> +        /* Free any unsent packets. */
> +        if (unlikely(nb_tx < nb_dq)) {
> +            for (i = nb_tx; i < nb_dq; i++)
> +                rte_pktmbuf_free(mbufs_dst[i]);
> +        }
> +    }
> +
> +The Packet Copying Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Packet copying is done using the user-defined ``pktmbuf_sw_copy()``
> +function. It copies a whole packet by copying the metadata from the
> +source packet to a new mbuf, and then copying the data chunk of the
> +source packet. Both memory copies are done using ``rte_memcpy()``:
> +
> +.. code-block:: c
> +
> +    static inline void
> +    pktmbuf_sw_copy(struct rte_mbuf *src, struct rte_mbuf *dst)
> +    {
> +        /* Copy packet metadata */
> +        rte_memcpy(&dst->rearm_data,
> +            &src->rearm_data,
> +            offsetof(struct rte_mbuf, cacheline1)
> +            - offsetof(struct rte_mbuf, rearm_data));
> +
> +        /* Copy packet data */
> +        rte_memcpy(rte_pktmbuf_mtod(dst, char *),
> +            rte_pktmbuf_mtod(src, char *), src->data_len);
> +    }
> +
> +The metadata in this example is copied from the ``rearm_data`` member of
> +the ``rte_mbuf`` struct up to ``cacheline1``.
> +
> +In order to understand why software packet copying is done as shown
> +above, please refer to the "Mbuf Library" section of the
> +*DPDK Programmer's Guide*.
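One more observation on pktmbuf_sw_copy() (and the equivalent offset trick
in ioat_enqueue_packets()): both copy src->data_len bytes from the first
segment only, so chained (multi-segment) mbufs would be silently
truncated. If jumbo frames are out of scope for the sample that's fine,
but it may be worth a comment. A hedged sketch of a segment-aware software
copy, assuming the destination buffer is large enough to hold the
linearized packet and that metadata is still copied as in
pktmbuf_sw_copy():

    /* Hypothetical multi-segment variant: walk src->next and
     * linearize into dst, which must have pkt_len bytes of room. */
    static inline void
    pktmbuf_sw_copy_chained(struct rte_mbuf *src, struct rte_mbuf *dst)
    {
        char *dst_data = rte_pktmbuf_mtod(dst, char *);
        const struct rte_mbuf *seg;

        for (seg = src; seg != NULL; seg = seg->next) {
            rte_memcpy(dst_data, rte_pktmbuf_mtod(seg, char *),
                seg->data_len);
            dst_data += seg->data_len;
        }
        dst->data_len = src->pkt_len;
        dst->pkt_len = src->pkt_len;
        dst->nb_segs = 1;
        dst->next = NULL;
    }
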
> \ No newline at end of file > diff --git a/examples/Makefile b/examples/Makefile > index de11dd487..3cb313d7d 100644 > --- a/examples/Makefile > +++ b/examples/Makefile > @@ -23,6 +23,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += fips_validation > DIRS-$(CONFIG_RTE_LIBRTE_FLOW_CLASSIFY) += flow_classify > DIRS-y += flow_filtering > DIRS-y += helloworld > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_IOAT_RAWDEV),y) > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_IOAT_RAWDEV) += ioat > +endif > DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += ip_pipeline > ifeq ($(CONFIG_RTE_LIBRTE_LPM),y) > DIRS-$(CONFIG_RTE_IP_FRAG) += ip_reassembly > diff --git a/examples/ioat/Makefile b/examples/ioat/Makefile > new file mode 100644 > index 000000000..2a4d1da2d > --- /dev/null > +++ b/examples/ioat/Makefile > @@ -0,0 +1,54 @@ > +# SPDX-License-Identifier: BSD-3-Clause > +# Copyright(c) 2019 Intel Corporation > + > +# binary name > +APP = ioatfwd > + > +# all source are stored in SRCS-y > +SRCS-y := ioatfwd.c > + > +# Build using pkg-config variables if possible > +ifeq ($(shell pkg-config --exists libdpdk && echo 0),0) > + > +all: shared > +.PHONY: shared static > +shared: build/$(APP)-shared > + ln -sf $(APP)-shared build/$(APP) > +static: build/$(APP)-static > + ln -sf $(APP)-static build/$(APP) > + > +PC_FILE := $(shell pkg-config --path libdpdk) > +CFLAGS += -O3 $(shell pkg-config --cflags libdpdk) > +LDFLAGS_SHARED = $(shell pkg-config --libs libdpdk) > +LDFLAGS_STATIC = -Wl,-Bstatic $(shell pkg-config --static --libs libdpdk) > + > +build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build > + $(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED) > + > +build/$(APP)-static: $(SRCS-y) Makefile $(PC_FILE) | build > + $(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_STATIC) > + > +build: > + @mkdir -p $@ > + > +.PHONY: clean > +clean: > + rm -f build/$(APP) build/$(APP)-static build/$(APP)-shared > + test -d build && rmdir -p build || true > + > +else # Build using legacy build system > +ifeq ($(RTE_SDK),) > +$(error "Please define RTE_SDK environment variable") > +endif > + > +# Default target, detect a build directory, by looking for a path with a .config > +RTE_TARGET ?= $(notdir $(abspath $(dir $(firstword $(wildcard $(RTE_SDK)/*/.config))))) > + > +include $(RTE_SDK)/mk/rte.vars.mk > + > + > +CFLAGS += -O3 > +CFLAGS += $(WERROR_FLAGS) > + > +include $(RTE_SDK)/mk/rte.extapp.mk > +endif > diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c > new file mode 100644 > index 000000000..8463d82f3 > --- /dev/null > +++ b/examples/ioat/ioatfwd.c > @@ -0,0 +1,1010 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2019 Intel Corporation > + */ > + > +#include > +#include > +#include > +#include > +#include > + > +#include > +#include > +#include > +#include > + > +/* size of ring used for software copying between rx and tx. 
*/
> +#define RTE_LOGTYPE_IOAT RTE_LOGTYPE_USER1
> +#define MAX_PKT_BURST 32
> +#define MEMPOOL_CACHE_SIZE 512
> +#define MIN_POOL_SIZE 65536U
> +#define CMD_LINE_OPT_MAC_UPDATING "mac-updating"
> +#define CMD_LINE_OPT_NO_MAC_UPDATING "no-mac-updating"
> +#define CMD_LINE_OPT_PORTMASK "portmask"
> +#define CMD_LINE_OPT_NB_QUEUE "nb-queue"
> +#define CMD_LINE_OPT_COPY_TYPE "copy-type"
> +#define CMD_LINE_OPT_RING_SIZE "ring-size"
> +
> +/* configurable number of RX/TX ring descriptors */
> +#define RX_DEFAULT_RINGSIZE 1024
> +#define TX_DEFAULT_RINGSIZE 1024
> +
> +/* max number of RX queues per port */
> +#define MAX_RX_QUEUES_COUNT 8
> +
> +struct rxtx_port_config {
> +    /* common config */
> +    uint16_t rxtx_port;
> +    uint16_t nb_queues;
> +    /* for software copy mode */
> +    struct rte_ring *rx_to_tx_ring;
> +    /* for IOAT rawdev copy mode */
> +    uint16_t ioat_ids[MAX_RX_QUEUES_COUNT];
> +};
> +
> +struct rxtx_transmission_config {
> +    struct rxtx_port_config ports[RTE_MAX_ETHPORTS];
> +    uint16_t nb_ports;
> +    uint16_t nb_lcores;
> +};
> +
> +/* per-port statistics struct */
> +struct ioat_port_statistics {
> +    uint64_t rx[RTE_MAX_ETHPORTS];
> +    uint64_t tx[RTE_MAX_ETHPORTS];
> +    uint64_t tx_dropped[RTE_MAX_ETHPORTS];
> +    uint64_t copy_dropped[RTE_MAX_ETHPORTS];
> +};
> +struct ioat_port_statistics port_statistics;
> +
> +struct total_statistics {
> +    uint64_t total_packets_dropped;
> +    uint64_t total_packets_tx;
> +    uint64_t total_packets_rx;
> +    uint64_t total_successful_enqueues;
> +    uint64_t total_failed_enqueues;
> +};
> +
> +typedef enum copy_mode_t {
> +#define COPY_MODE_SW "sw"
> +    COPY_MODE_SW_NUM,
> +#define COPY_MODE_IOAT "rawdev"
> +    COPY_MODE_IOAT_NUM,
> +    COPY_MODE_INVALID_NUM,
> +    COPY_MODE_SIZE_NUM = COPY_MODE_INVALID_NUM
> +} copy_mode_t;
> +
> +/* mask of enabled ports */
> +static uint32_t ioat_enabled_port_mask;
> +
> +/* number of RX queues per port */
> +static uint16_t nb_queues = 1;
> +
> +/* MAC updating enabled by default. */
> +static int mac_updating = 1;
> +
> +/* hardware copy mode enabled by default. */
> +static copy_mode_t copy_mode = COPY_MODE_IOAT_NUM;
> +
> +/* size of IOAT rawdev ring for hardware copy mode or
> + * rte_ring for software copy mode
> + */
> +static unsigned short ring_size = 2048;
> +
> +/* global transmission config */
> +struct rxtx_transmission_config cfg;
> +
> +/* configurable number of RX/TX ring descriptors */
> +static uint16_t nb_rxd = RX_DEFAULT_RINGSIZE;
> +static uint16_t nb_txd = TX_DEFAULT_RINGSIZE;
> +
> +static volatile bool force_quit;
> +
> +/* ethernet addresses of ports */
> +static struct rte_ether_addr ioat_ports_eth_addr[RTE_MAX_ETHPORTS];
> +
> +static struct rte_eth_dev_tx_buffer *tx_buffer[RTE_MAX_ETHPORTS];
> +struct rte_mempool *ioat_pktmbuf_pool;
> +
> +/* Print out statistics for one port. */
> +static void
> +print_port_stats(uint16_t port_id)
> +{
> +    printf("\nStatistics for port %u ------------------------------"
> +        "\nPackets sent: %34"PRIu64
> +        "\nPackets received: %30"PRIu64
> +        "\nPackets dropped on tx: %25"PRIu64
> +        "\nPackets dropped on copy: %23"PRIu64,
> +        port_id,
> +        port_statistics.tx[port_id],
> +        port_statistics.rx[port_id],
> +        port_statistics.tx_dropped[port_id],
> +        port_statistics.copy_dropped[port_id]);
> +}
> +
> +/* Print out statistics for one IOAT rawdev device. 
*/ > +static void > +print_rawdev_stats(uint32_t dev_id, uint64_t *xstats, > + uint16_t nb_xstats, struct rte_rawdev_xstats_name *names_xstats) > +{ > + uint16_t i; > + > + printf("\nIOAT channel %u", dev_id); > + for (i = 0; i < nb_xstats; i++) > + if (strstr(names_xstats[i].name, "enqueues")) > + printf("\n\t %s: %*"PRIu64, > + names_xstats[i].name, > + (int)(37 - strlen(names_xstats[i].name)), > + xstats[i]); > +} > + > +static void > +print_total_stats(struct total_statistics *ts) > +{ > + printf("\nAggregate statistics ===============================" > + "\nTotal packets sent: %28"PRIu64 > + "\nTotal packets received: %24"PRIu64 > + "\nTotal packets dropped: %25"PRIu64, > + ts->total_packets_tx, > + ts->total_packets_rx, > + ts->total_packets_dropped); > + > + if (copy_mode == COPY_MODE_IOAT_NUM) { > + printf("\nTotal IOAT successful enqueues: %16"PRIu64 > + "\nTotal IOAT failed enqueues: %20"PRIu64, > + ts->total_successful_enqueues, > + ts->total_failed_enqueues); > + } > + > + printf("\n====================================================\n"); > +} > + > +/* Print out statistics on packets dropped. */ > +static void > +print_stats(char *prgname) > +{ > + struct total_statistics ts; > + uint32_t i, port_id, dev_id; > + struct rte_rawdev_xstats_name *names_xstats; > + uint64_t *xstats; > + unsigned int *ids_xstats; > + unsigned int nb_xstats, id_fail_enq, id_succ_enq; > + char status_string[120]; /* to print at the top of the output */ > + int status_strlen; > + > + > + const char clr[] = { 27, '[', '2', 'J', '\0' }; > + const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' }; > + > + status_strlen = snprintf(status_string, sizeof(status_string), > + "%s, ", prgname); > + status_strlen += snprintf(status_string + status_strlen, > + sizeof(status_string) - status_strlen, > + "Worker Threads = %d, ", > + rte_lcore_count() > 2 ? 2 : 1); > + status_strlen += snprintf(status_string + status_strlen, > + sizeof(status_string) - status_strlen, > + "Copy Mode = %s,\n", copy_mode == COPY_MODE_SW_NUM ? > + COPY_MODE_SW : COPY_MODE_IOAT); > + status_strlen += snprintf(status_string + status_strlen, > + sizeof(status_string) - status_strlen, > + "Updating MAC = %s, ", mac_updating ? 
> + "enabled" : "disabled"); > + status_strlen += snprintf(status_string + status_strlen, > + sizeof(status_string) - status_strlen, > + "Rx Queues = %d, ", nb_queues); > + status_strlen += snprintf(status_string + status_strlen, > + sizeof(status_string) - status_strlen, > + "Ring Size = %d\n", ring_size); > + > + /* Allocate memory for xstats names and values */ > + nb_xstats = rte_rawdev_xstats_names_get( > + cfg.ports[0].ioat_ids[0], NULL, 0); > + > + names_xstats = malloc(sizeof(*names_xstats) * nb_xstats); > + if (names_xstats == NULL) { > + rte_exit(EXIT_FAILURE, > + "Error allocating xstat names memory\n"); > + } > + rte_rawdev_xstats_names_get(cfg.ports[0].ioat_ids[0], > + names_xstats, nb_xstats); > + > + ids_xstats = malloc(sizeof(*ids_xstats) * nb_xstats); > + if (ids_xstats == NULL) { > + rte_exit(EXIT_FAILURE, > + "Error allocating xstat ids_xstats memory\n"); > + } > + > + for (i = 0; i < nb_xstats; i++) > + ids_xstats[i] = i; > + > + xstats = malloc(sizeof(*xstats) * nb_xstats); > + if (xstats == NULL) { > + rte_exit(EXIT_FAILURE, > + "Error allocating xstat memory\n"); > + } > + > + /* Get failed/successful enqueues stats index */ > + id_fail_enq = id_succ_enq = nb_xstats; > + for (i = 0; i < nb_xstats; i++) { > + if (!strcmp(names_xstats[i].name, "failed_enqueues")) > + id_fail_enq = i; > + else if (!strcmp(names_xstats[i].name, "successful_enqueues")) > + id_succ_enq = i; > + if (id_fail_enq < nb_xstats && id_succ_enq < nb_xstats) > + break; > + } > + if (id_fail_enq == nb_xstats || id_succ_enq == nb_xstats) { > + rte_exit(EXIT_FAILURE, > + "Error getting failed/successful enqueues stats index\n"); > + } > + > + while (!force_quit) { > + /* Sleep for 1 second each round - init sleep allows reading > + * messages from app startup. 
> + */ > + sleep(1); > + > + /* Clear screen and move to top left */ > + printf("%s%s", clr, topLeft); > + > + memset(&ts, 0, sizeof(struct total_statistics)); > + > + printf("%s", status_string); > + > + for (i = 0; i < cfg.nb_ports; i++) { > + port_id = cfg.ports[i].rxtx_port; > + print_port_stats(port_id); > + > + ts.total_packets_dropped += > + port_statistics.tx_dropped[port_id] > + + port_statistics.copy_dropped[port_id]; > + ts.total_packets_tx += port_statistics.tx[port_id]; > + ts.total_packets_rx += port_statistics.rx[port_id]; > + > + if (copy_mode == COPY_MODE_IOAT_NUM) { > + uint32_t j; > + > + for (j = 0; j < cfg.ports[i].nb_queues; j++) { > + dev_id = cfg.ports[i].ioat_ids[j]; > + rte_rawdev_xstats_get(dev_id, > + ids_xstats, xstats, nb_xstats); > + > + print_rawdev_stats(dev_id, xstats, > + nb_xstats, names_xstats); > + > + ts.total_successful_enqueues += > + xstats[id_succ_enq]; > + ts.total_failed_enqueues += > + xstats[id_fail_enq]; > + } > + } > + } > + printf("\n"); > + > + print_total_stats(&ts); > + } > + > + free(names_xstats); > + free(xstats); > + free(ids_xstats); > +} > + > +static void > +update_mac_addrs(struct rte_mbuf *m, uint32_t dest_portid) > +{ > + struct rte_ether_hdr *eth; > + void *tmp; > + > + eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *); > + > + /* 02:00:00:00:00:xx - overwriting 2 bytes of source address but > + * it's acceptable cause it gets overwritten by rte_ether_addr_copy > + */ > + tmp = ð->d_addr.addr_bytes[0]; > + *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dest_portid << 40); > + > + /* src addr */ > + rte_ether_addr_copy(&ioat_ports_eth_addr[dest_portid], ð->s_addr); > +} > + > +static inline void > +pktmbuf_sw_copy(struct rte_mbuf *src, struct rte_mbuf *dst) > +{ > + /* Copy packet metadata */ > + rte_memcpy(&dst->rearm_data, > + &src->rearm_data, > + offsetof(struct rte_mbuf, cacheline1) > + - offsetof(struct rte_mbuf, rearm_data)); > + > + /* Copy packet data */ > + rte_memcpy(rte_pktmbuf_mtod(dst, char *), > + rte_pktmbuf_mtod(src, char *), src->data_len); > +} > + > +static uint32_t > +ioat_enqueue_packets(struct rte_mbuf **pkts, > + uint32_t nb_rx, uint16_t dev_id) > +{ > + int ret; > + uint32_t i; > + struct rte_mbuf *pkts_copy[MAX_PKT_BURST]; > + > + const uint64_t addr_offset = RTE_PTR_DIFF(pkts[0]->buf_addr, > + &pkts[0]->rearm_data); > + > + ret = rte_mempool_get_bulk(ioat_pktmbuf_pool, > + (void *)pkts_copy, nb_rx); > + > + if (unlikely(ret < 0)) > + rte_exit(EXIT_FAILURE, "Unable to allocate memory.\n"); > + > + for (i = 0; i < nb_rx; i++) { > + /* Perform data copy */ > + ret = rte_ioat_enqueue_copy(dev_id, > + pkts[i]->buf_iova > + - addr_offset, > + pkts_copy[i]->buf_iova > + - addr_offset, > + rte_pktmbuf_data_len(pkts[i]) > + + addr_offset, > + (uintptr_t)pkts[i], > + (uintptr_t)pkts_copy[i], > + 0 /* nofence */); > + > + if (ret != 1) > + break; > + } > + > + ret = i; > + /* Free any not enqueued packets. */ > + rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts[i], nb_rx - i); > + rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts_copy[i], > + nb_rx - i); > + > + > + return ret; > +} > + > +/* Receive packets on one port and enqueue to IOAT rawdev or rte_ring. 
*/ > +static void > +ioat_rx_port(struct rxtx_port_config *rx_config) > +{ > + uint32_t nb_rx, nb_enq, i, j; > + struct rte_mbuf *pkts_burst[MAX_PKT_BURST]; > + > + for (i = 0; i < rx_config->nb_queues; i++) { > + > + nb_rx = rte_eth_rx_burst(rx_config->rxtx_port, i, > + pkts_burst, MAX_PKT_BURST); > + > + if (nb_rx == 0) > + continue; > + > + port_statistics.rx[rx_config->rxtx_port] += nb_rx; > + > + if (copy_mode == COPY_MODE_IOAT_NUM) { > + /* Perform packet hardware copy */ > + nb_enq = ioat_enqueue_packets(pkts_burst, > + nb_rx, rx_config->ioat_ids[i]); > + if (nb_enq > 0) > + rte_ioat_do_copies(rx_config->ioat_ids[i]); > + } else { > + /* Perform packet software copy, free source packets */ > + int ret; > + struct rte_mbuf *pkts_burst_copy[MAX_PKT_BURST]; > + > + ret = rte_mempool_get_bulk(ioat_pktmbuf_pool, > + (void *)pkts_burst_copy, nb_rx); > + > + if (unlikely(ret < 0)) > + rte_exit(EXIT_FAILURE, > + "Unable to allocate memory.\n"); > + > + for (j = 0; j < nb_rx; j++) > + pktmbuf_sw_copy(pkts_burst[j], > + pkts_burst_copy[j]); > + > + rte_mempool_put_bulk(ioat_pktmbuf_pool, > + (void *)pkts_burst, nb_rx); > + > + nb_enq = rte_ring_enqueue_burst( > + rx_config->rx_to_tx_ring, > + (void *)pkts_burst_copy, nb_rx, NULL); > + > + /* Free any not enqueued packets. */ > + rte_mempool_put_bulk(ioat_pktmbuf_pool, > + (void *)&pkts_burst_copy[nb_enq], > + nb_rx - nb_enq); > + } > + > + port_statistics.copy_dropped[rx_config->rxtx_port] += > + (nb_rx - nb_enq); > + } > +} > + > +/* Transmit packets from IOAT rawdev/rte_ring for one port. */ > +static void > +ioat_tx_port(struct rxtx_port_config *tx_config) > +{ > + uint32_t i, j, nb_dq = 0; > + struct rte_mbuf *mbufs_src[MAX_PKT_BURST]; > + struct rte_mbuf *mbufs_dst[MAX_PKT_BURST]; > + > + if (copy_mode == COPY_MODE_IOAT_NUM) { > + /* Deque the mbufs from IOAT device. */ > + for (i = 0; i < tx_config->nb_queues; i++) { > + nb_dq = rte_ioat_completed_copies( > + tx_config->ioat_ids[i], MAX_PKT_BURST, > + (void *)mbufs_src, (void *)mbufs_dst); > + > + if (nb_dq == 0) > + break; > + > + rte_mempool_put_bulk(ioat_pktmbuf_pool, > + (void *)mbufs_src, nb_dq); > + > + /* Update macs if enabled */ > + if (mac_updating) { > + for (j = 0; j < nb_dq; j++) > + update_mac_addrs(mbufs_dst[j], > + tx_config->rxtx_port); > + } > + > + const uint16_t nb_tx = rte_eth_tx_burst( > + tx_config->rxtx_port, 0, > + (void *)mbufs_dst, nb_dq); > + > + port_statistics.tx[tx_config->rxtx_port] += nb_tx; > + > + /* Free any unsent packets. */ > + if (unlikely(nb_tx < nb_dq)) > + rte_mempool_put_bulk(ioat_pktmbuf_pool, > + (void *)&mbufs_dst[nb_tx], > + nb_dq - nb_tx); > + } > + } else { > + /* Deque the mbufs from rx_to_tx_ring. */ > + nb_dq = rte_ring_dequeue_burst(tx_config->rx_to_tx_ring, > + (void *)mbufs_dst, MAX_PKT_BURST, NULL); > + > + if (nb_dq == 0) > + return; > + > + /* Update macs if enabled */ > + if (mac_updating) { > + for (i = 0; i < nb_dq; i++) > + update_mac_addrs(mbufs_dst[i], > + tx_config->rxtx_port); > + } > + > + const uint16_t nb_tx = rte_eth_tx_burst(tx_config->rxtx_port, > + 0, (void *)mbufs_dst, nb_dq); > + > + port_statistics.tx[tx_config->rxtx_port] += nb_tx; > + > + /* Free any unsent packets. */ > + if (unlikely(nb_tx < nb_dq)) > + rte_mempool_put_bulk(ioat_pktmbuf_pool, > + (void *)&mbufs_dst[nb_tx], > + nb_dq - nb_tx); > + } > +} > + > +/* Main rx processing loop for IOAT rawdev. 
*/ > +static void > +rx_main_loop(void) > +{ > + uint16_t i; > + uint16_t nb_ports = cfg.nb_ports; > + > + RTE_LOG(INFO, IOAT, "Entering main rx loop for copy on lcore %u\n", > + rte_lcore_id()); > + > + while (!force_quit) > + for (i = 0; i < nb_ports; i++) > + ioat_rx_port(&cfg.ports[i]); > +} > + > +/* Main tx processing loop for hardware copy. */ > +static void > +tx_main_loop(void) > +{ > + uint16_t i; > + uint16_t nb_ports = cfg.nb_ports; > + > + RTE_LOG(INFO, IOAT, "Entering main tx loop for copy on lcore %u\n", > + rte_lcore_id()); > + > + while (!force_quit) > + for (i = 0; i < nb_ports; i++) > + ioat_tx_port(&cfg.ports[i]); > +} > + > +/* Main rx and tx loop if only one slave lcore available */ > +static void > +rxtx_main_loop(void) > +{ > + uint16_t i; > + uint16_t nb_ports = cfg.nb_ports; > + > + RTE_LOG(INFO, IOAT, "Entering main rx and tx loop for copy on" > + " lcore %u\n", rte_lcore_id()); > + > + while (!force_quit) > + for (i = 0; i < nb_ports; i++) { > + ioat_rx_port(&cfg.ports[i]); > + ioat_tx_port(&cfg.ports[i]); > + } > +} > + > +static void start_forwarding_cores(void) > +{ > + uint32_t lcore_id = rte_lcore_id(); > + > + RTE_LOG(INFO, IOAT, "Entering %s on lcore %u\n", > + __func__, rte_lcore_id()); > + > + if (cfg.nb_lcores == 1) { > + lcore_id = rte_get_next_lcore(lcore_id, true, true); > + rte_eal_remote_launch((lcore_function_t *)rxtx_main_loop, > + NULL, lcore_id); > + } else if (cfg.nb_lcores > 1) { > + lcore_id = rte_get_next_lcore(lcore_id, true, true); > + rte_eal_remote_launch((lcore_function_t *)rx_main_loop, > + NULL, lcore_id); > + > + lcore_id = rte_get_next_lcore(lcore_id, true, true); > + rte_eal_remote_launch((lcore_function_t *)tx_main_loop, NULL, > + lcore_id); > + } > +} > + > +/* Display usage */ > +static void > +ioat_usage(const char *prgname) > +{ > + printf("%s [EAL options] -- -p PORTMASK [-q NQ]\n" > + " -p --portmask: hexadecimal bitmask of ports to configure\n" > + " -q NQ: number of RX queues per port (default is 1)\n" > + " --[no-]mac-updating: Enable or disable MAC addresses updating (enabled by default)\n" > + " When enabled:\n" > + " - The source MAC address is replaced by the TX port MAC address\n" > + " - The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID\n" > + " -c --copy-type CT: type of copy: sw|rawdev\n" > + " -s --ring-size RS: size of IOAT rawdev ring for hardware copy mode or rte_ring for software copy mode\n", > + prgname); > +} > + > +static int > +ioat_parse_portmask(const char *portmask) > +{ > + char *end = NULL; > + unsigned long pm; > + > + /* Parse hexadecimal string */ > + pm = strtoul(portmask, &end, 16); > + if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0')) > + return -1; > + > + return pm; > +} > + > +static copy_mode_t > +ioat_parse_copy_mode(const char *copy_mode) > +{ > + if (strcmp(copy_mode, COPY_MODE_SW) == 0) > + return COPY_MODE_SW_NUM; > + else if (strcmp(copy_mode, COPY_MODE_IOAT) == 0) > + return COPY_MODE_IOAT_NUM; > + > + return COPY_MODE_INVALID_NUM; > +} > + > +/* Parse the argument given in the command line of the application */ > +static int > +ioat_parse_args(int argc, char **argv, unsigned int nb_ports) > +{ > + static const char short_options[] = > + "p:" /* portmask */ > + "q:" /* number of RX queues per port */ > + "c:" /* copy type (sw|rawdev) */ > + "s:" /* ring size */ > + ; > + > + static const struct option lgopts[] = { > + {CMD_LINE_OPT_MAC_UPDATING, no_argument, &mac_updating, 1}, > + {CMD_LINE_OPT_NO_MAC_UPDATING, no_argument, &mac_updating, 0}, > + 
{CMD_LINE_OPT_PORTMASK, required_argument, NULL, 'p'}, > + {CMD_LINE_OPT_NB_QUEUE, required_argument, NULL, 'q'}, > + {CMD_LINE_OPT_COPY_TYPE, required_argument, NULL, 'c'}, > + {CMD_LINE_OPT_RING_SIZE, required_argument, NULL, 's'}, > + {NULL, 0, 0, 0} > + }; > + > + const unsigned int default_port_mask = (1 << nb_ports) - 1; > + int opt, ret; > + char **argvopt; > + int option_index; > + char *prgname = argv[0]; > + > + ioat_enabled_port_mask = default_port_mask; > + argvopt = argv; > + > + while ((opt = getopt_long(argc, argvopt, short_options, > + lgopts, &option_index)) != EOF) { > + > + switch (opt) { > + /* portmask */ > + case 'p': > + ioat_enabled_port_mask = ioat_parse_portmask(optarg); > + if (ioat_enabled_port_mask & ~default_port_mask || > + ioat_enabled_port_mask <= 0) { > + printf("Invalid portmask, %s, suggest 0x%x\n", > + optarg, default_port_mask); > + ioat_usage(prgname); > + return -1; > + } > + break; > + > + case 'q': > + nb_queues = atoi(optarg); > + if (nb_queues == 0 || nb_queues > MAX_RX_QUEUES_COUNT) { > + printf("Invalid RX queues number %s. Max %u\n", > + optarg, MAX_RX_QUEUES_COUNT); > + ioat_usage(prgname); > + return -1; > + } > + break; > + > + case 'c': > + copy_mode = ioat_parse_copy_mode(optarg); > + if (copy_mode == COPY_MODE_INVALID_NUM) { > + printf("Invalid copy type. Use: sw, rawdev\n"); > + ioat_usage(prgname); > + return -1; > + } > + break; > + > + case 's': > + ring_size = atoi(optarg); > + if (ring_size == 0) { > + printf("Invalid ring size, %s.\n", optarg); > + ioat_usage(prgname); > + return -1; > + } > + break; > + > + /* long options */ > + case 0: > + break; > + > + default: > + ioat_usage(prgname); > + return -1; > + } > + } > + > + printf("MAC updating %s\n", mac_updating ? "enabled" : "disabled"); > + if (optind >= 0) > + argv[optind-1] = prgname; > + > + ret = optind-1; > + optind = 1; /* reset getopt lib */ > + return ret; > +} > + > +/* check link status, return true if at least one port is up */ > +static int > +check_link_status(uint32_t port_mask) > +{ > + uint16_t portid; > + struct rte_eth_link link; > + int retval = 0; > + > + printf("\nChecking link status\n"); > + RTE_ETH_FOREACH_DEV(portid) { > + if ((port_mask & (1 << portid)) == 0) > + continue; > + > + memset(&link, 0, sizeof(link)); > + rte_eth_link_get(portid, &link); > + > + /* Print link status */ > + if (link.link_status) { > + printf( > + "Port %d Link Up. Speed %u Mbps - %s\n", > + portid, link.link_speed, > + (link.link_duplex == ETH_LINK_FULL_DUPLEX) ? 
> +                ("full-duplex") : ("half-duplex"));
> +            retval = 1;
> +        } else
> +            printf("Port %d Link Down\n", portid);
> +    }
> +    return retval;
> +}
> +
> +static void
> +configure_rawdev_queue(uint32_t dev_id)
> +{
> +    struct rte_ioat_rawdev_config dev_config = { .ring_size = ring_size };
> +    struct rte_rawdev_info info = { .dev_private = &dev_config };
> +
> +    if (rte_rawdev_configure(dev_id, &info) != 0) {
> +        rte_exit(EXIT_FAILURE,
> +            "Error with rte_rawdev_configure()\n");
> +    }
> +    if (rte_rawdev_start(dev_id) != 0) {
> +        rte_exit(EXIT_FAILURE,
> +            "Error with rte_rawdev_start()\n");
> +    }
> +}
> +
> +static void
> +assign_rawdevs(void)
> +{
> +    uint16_t nb_rawdev = 0, rdev_id = 0;
> +    uint32_t i, j;
> +
> +    for (i = 0; i < cfg.nb_ports; i++) {
> +        for (j = 0; j < cfg.ports[i].nb_queues; j++) {
> +            struct rte_rawdev_info rdev_info = { 0 };
> +
> +            do {
> +                if (rdev_id == rte_rawdev_count())
> +                    goto end;
> +                rte_rawdev_info_get(rdev_id++, &rdev_info);
> +            } while (strcmp(rdev_info.driver_name,
> +                IOAT_PMD_RAWDEV_NAME_STR) != 0);
> +
> +            cfg.ports[i].ioat_ids[j] = rdev_id - 1;
> +            configure_rawdev_queue(cfg.ports[i].ioat_ids[j]);
> +            ++nb_rawdev;
> +        }
> +    }
> +end:
> +    if (nb_rawdev < cfg.nb_ports * cfg.ports[0].nb_queues)
> +        rte_exit(EXIT_FAILURE,
> +            "Not enough IOAT rawdevs (%u) for all queues (%u).\n",
> +            nb_rawdev, cfg.nb_ports * cfg.ports[0].nb_queues);
> +    RTE_LOG(INFO, IOAT, "Number of used rawdevs: %u.\n", nb_rawdev);
> +}
> +
> +static void
> +assign_rings(void)
> +{
> +    uint32_t i;
> +
> +    for (i = 0; i < cfg.nb_ports; i++) {
> +        char ring_name[RTE_RING_NAMESIZE];
> +
> +        snprintf(ring_name, sizeof(ring_name), "rx_to_tx_ring_%u", i);
> +        /* Create ring for inter core communication */
> +        cfg.ports[i].rx_to_tx_ring = rte_ring_create(
> +            ring_name, ring_size,
> +            rte_socket_id(), RING_F_SP_ENQ | RING_F_SC_DEQ);
> +
> +        if (cfg.ports[i].rx_to_tx_ring == NULL)
> +            rte_exit(EXIT_FAILURE, "Ring create failed: %s\n",
> +                rte_strerror(rte_errno));
> +    }
> +}
> +
> +/*
> + * Initializes a given port using global settings and with the RX buffers
> + * coming from the mbuf_pool passed as a parameter.
> + */
> +static inline void
> +port_init(uint16_t portid, struct rte_mempool *mbuf_pool, uint16_t nb_queues)
> +{
> +    /* configuring port to use RSS for multiple RX queues */
> +    static const struct rte_eth_conf port_conf = {
> +        .rxmode = {
> +            .mq_mode = ETH_MQ_RX_RSS,
> +            .max_rx_pkt_len = RTE_ETHER_MAX_LEN
> +        },
> +        .rx_adv_conf = {
> +            .rss_conf = {
> +                .rss_key = NULL,
> +                .rss_hf = ETH_RSS_PROTO_MASK,
> +            }
> +        }
> +    };
> +
> +    struct rte_eth_rxconf rxq_conf;
> +    struct rte_eth_txconf txq_conf;
> +    struct rte_eth_conf local_port_conf = port_conf;
> +    struct rte_eth_dev_info dev_info;
> +    int ret, i;
> +
> +    /* Skip ports that are not enabled */
> +    if ((ioat_enabled_port_mask & (1 << portid)) == 0) {
> +        printf("Skipping disabled port %u\n", portid);
> +        return;
> +    }
> +
> +    /* Init port */
> +    printf("Initializing port %u... 
", portid); > + fflush(stdout); > + rte_eth_dev_info_get(portid, &dev_info); > + local_port_conf.rx_adv_conf.rss_conf.rss_hf &= > + dev_info.flow_type_rss_offloads; > + if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE) > + local_port_conf.txmode.offloads |= > + DEV_TX_OFFLOAD_MBUF_FAST_FREE; > + ret = rte_eth_dev_configure(portid, nb_queues, 1, &local_port_conf); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, "Cannot configure device:" > + " err=%d, port=%u\n", ret, portid); > + > + ret = rte_eth_dev_adjust_nb_rx_tx_desc(portid, &nb_rxd, > + &nb_txd); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, > + "Cannot adjust number of descriptors: err=%d, port=%u\n", > + ret, portid); > + > + rte_eth_macaddr_get(portid, &ioat_ports_eth_addr[portid]); > + > + /* Init RX queues */ > + rxq_conf = dev_info.default_rxconf; > + rxq_conf.offloads = local_port_conf.rxmode.offloads; > + for (i = 0; i < nb_queues; i++) { > + ret = rte_eth_rx_queue_setup(portid, i, nb_rxd, > + rte_eth_dev_socket_id(portid), &rxq_conf, > + mbuf_pool); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, > + "rte_eth_rx_queue_setup:err=%d,port=%u, queue_id=%u\n", > + ret, portid, i); > + } > + > + /* Init one TX queue on each port */ > + txq_conf = dev_info.default_txconf; > + txq_conf.offloads = local_port_conf.txmode.offloads; > + ret = rte_eth_tx_queue_setup(portid, 0, nb_txd, > + rte_eth_dev_socket_id(portid), > + &txq_conf); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, > + "rte_eth_tx_queue_setup:err=%d,port=%u\n", > + ret, portid); > + > + /* Initialize TX buffers */ > + tx_buffer[portid] = rte_zmalloc_socket("tx_buffer", > + RTE_ETH_TX_BUFFER_SIZE(MAX_PKT_BURST), 0, > + rte_eth_dev_socket_id(portid)); > + if (tx_buffer[portid] == NULL) > + rte_exit(EXIT_FAILURE, > + "Cannot allocate buffer for tx on port %u\n", > + portid); > + > + rte_eth_tx_buffer_init(tx_buffer[portid], MAX_PKT_BURST); > + > + ret = rte_eth_tx_buffer_set_err_callback(tx_buffer[portid], > + rte_eth_tx_buffer_count_callback, > + &port_statistics.tx_dropped[portid]); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, > + "Cannot set error callback for tx buffer on port %u\n", > + portid); > + > + /* Start device */ > + ret = rte_eth_dev_start(portid); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, > + "rte_eth_dev_start:err=%d, port=%u\n", > + ret, portid); > + > + rte_eth_promiscuous_enable(portid); > + > + printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n", > + portid, > + ioat_ports_eth_addr[portid].addr_bytes[0], > + ioat_ports_eth_addr[portid].addr_bytes[1], > + ioat_ports_eth_addr[portid].addr_bytes[2], > + ioat_ports_eth_addr[portid].addr_bytes[3], > + ioat_ports_eth_addr[portid].addr_bytes[4], > + ioat_ports_eth_addr[portid].addr_bytes[5]); > + > + cfg.ports[cfg.nb_ports].rxtx_port = portid; > + cfg.ports[cfg.nb_ports++].nb_queues = nb_queues; > +} > + > +static void > +signal_handler(int signum) > +{ > + if (signum == SIGINT || signum == SIGTERM) { > + printf("\n\nSignal %d received, preparing to exit...\n", > + signum); > + force_quit = true; > + } > +} > + > +int > +main(int argc, char **argv) > +{ > + int ret; > + uint16_t nb_ports, portid; > + uint32_t i; > + unsigned int nb_mbufs; > + > + /* Init EAL */ > + ret = rte_eal_init(argc, argv); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n"); > + argc -= ret; > + argv += ret; > + > + force_quit = false; > + signal(SIGINT, signal_handler); > + signal(SIGTERM, signal_handler); > + > + nb_ports = rte_eth_dev_count_avail(); > + if (nb_ports == 0) > + rte_exit(EXIT_FAILURE, "No 
Ethernet ports - bye\n"); > + > + /* Parse application arguments (after the EAL ones) */ > + ret = ioat_parse_args(argc, argv, nb_ports); > + if (ret < 0) > + rte_exit(EXIT_FAILURE, "Invalid IOAT arguments\n"); > + > + nb_mbufs = RTE_MAX(nb_ports * (nb_queues * (nb_rxd + nb_txd + > + 4 * MAX_PKT_BURST) + rte_lcore_count() * MEMPOOL_CACHE_SIZE), > + MIN_POOL_SIZE); > + > + /* Create the mbuf pool */ > + ioat_pktmbuf_pool = rte_pktmbuf_pool_create("mbuf_pool", nb_mbufs, > + MEMPOOL_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, > + rte_socket_id()); > + if (ioat_pktmbuf_pool == NULL) > + rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n"); > + > + /* Initialise each port */ > + cfg.nb_ports = 0; > + RTE_ETH_FOREACH_DEV(portid) > + port_init(portid, ioat_pktmbuf_pool, nb_queues); > + > + /* Initialize port xstats */ > + memset(&port_statistics, 0, sizeof(port_statistics)); > + > + while (!check_link_status(ioat_enabled_port_mask) && !force_quit) > + sleep(1); > + > + /* Check if there is enough lcores for all ports. */ > + cfg.nb_lcores = rte_lcore_count() - 1; > + if (cfg.nb_lcores < 1) > + rte_exit(EXIT_FAILURE, > + "There should be at least one slave lcore.\n"); > + > + if (copy_mode == COPY_MODE_IOAT_NUM) > + assign_rawdevs(); > + else /* copy_mode == COPY_MODE_SW_NUM */ > + assign_rings(); > + > + start_forwarding_cores(); > + /* master core prints stats while other cores forward */ > + print_stats(argv[0]); > + > + /* force_quit is true when we get here */ > + rte_eal_mp_wait_lcore(); > + > + uint32_t j; > + > + for (i = 0; i < cfg.nb_ports; i++) { > + printf("Closing port %d\n", cfg.ports[i].rxtx_port); > + rte_eth_dev_stop(cfg.ports[i].rxtx_port); > + rte_eth_dev_close(cfg.ports[i].rxtx_port); > + if (copy_mode == COPY_MODE_IOAT_NUM) { > + for (j = 0; j < cfg.ports[i].nb_queues; j++) { > + printf("Stopping rawdev %d\n", > + cfg.ports[i].ioat_ids[j]); > + rte_rawdev_stop(cfg.ports[i].ioat_ids[j]); > + } > + } else /* copy_mode == COPY_MODE_SW_NUM */ > + rte_ring_free(cfg.ports[i].rx_to_tx_ring); > + } > + > + printf("Bye...\n"); > + return 0; > +} > diff --git a/examples/ioat/meson.build b/examples/ioat/meson.build > new file mode 100644 > index 000000000..ff56dc99c > --- /dev/null > +++ b/examples/ioat/meson.build > @@ -0,0 +1,13 @@ > +# SPDX-License-Identifier: BSD-3-Clause > +# Copyright(c) 2019 Intel Corporation > + > +# meson file, for building this example as part of a main DPDK build. > +# > +# To build this example as a standalone application with an already-installed > +# DPDK instance, use 'make' > + > +deps += ['pmd_ioat'] > + > +sources = files( > + 'ioatfwd.c' > +) > diff --git a/examples/meson.build b/examples/meson.build > index a046b74ad..c2e18b59b 100644 > --- a/examples/meson.build > +++ b/examples/meson.build You'll need to add the pmd_ioat dependency to this example it seems: examples/meson.build:89:4: ERROR: Problem encountered: Missing dependency "pmd_ioat" for example "ioat" > @@ -16,6 +16,7 @@ all_examples = [ > 'eventdev_pipeline', 'exception_path', > 'fips_validation', 'flow_classify', > 'flow_filtering', 'helloworld', > + 'ioat', > 'ip_fragmentation', 'ip_pipeline', > 'ip_reassembly', 'ipsec-secgw', > 'ipv4_multicast', 'kni',
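Also, since the driver itself can be switched off in the config, the
example probably wants to be skipped when the rawdev PMD is not built,
in addition to declaring the dependency. A sketch of what I mean for
examples/ioat/meson.build (the config name is taken from the Makefile
above; whether the meson dependency ends up being called 'pmd_ioat' or
something like 'rawdev_ioat' I haven't checked):

    # Skip building the example when the IOAT rawdev PMD is disabled,
    # and pull in the driver dependency otherwise.
    build = dpdk_conf.has('RTE_LIBRTE_PMD_IOAT_RAWDEV')
    deps += ['pmd_ioat']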