From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 69A3BA0471 for ; Mon, 9 Sep 2019 10:30:41 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id E5C521EAF9; Mon, 9 Sep 2019 10:30:39 +0200 (CEST)
Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id AE4831EADD for ; Mon, 9 Sep 2019 10:30:37 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 09 Sep 2019 01:30:36 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.64,484,1559545200"; d="scan'208";a="200012809"
Received: from baranmx-mobl.ger.corp.intel.com ([10.103.104.83]) by fmsmga001.fm.intel.com with ESMTP; 09 Sep 2019 01:30:34 -0700
From: Marcin Baran 
To: dev@dpdk.org
Cc: Pawel Modrak , Marcin Baran 
Date: Mon, 9 Sep 2019 10:29:38 +0200
Message-Id: <20190909082939.1629-1-marcinx.baran@intel.com>
X-Mailer: git-send-email 2.22.0.windows.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [dpdk-dev] [PATCH] examples/ioat: create sample app on ioat driver usage
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev" 

From: Pawel Modrak 

A new sample app demonstrating use of the IOAT rawdev driver for CBDMA
hardware. The app receives packets, performs a software or hardware copy,
changes the packets' MAC addresses (if enabled) and forwards them. The
patch includes the sample application as well as its guide.

Signed-off-by: Pawel Modrak 
Signed-off-by: Marcin Baran 
---
 doc/guides/sample_app_ug/index.rst |    1 +
 doc/guides/sample_app_ug/intro.rst |    4 +
 doc/guides/sample_app_ug/ioat.rst  |  691 +++++++++++++++++++
 examples/Makefile                  |    3 +
 examples/ioat/Makefile             |   54 ++
 examples/ioat/ioatfwd.c            | 1010 ++++++++++++++++++++++++++++
 examples/ioat/meson.build          |   13 +
 examples/meson.build               |    1 +
 8 files changed, 1777 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/ioat.rst
 create mode 100644 examples/ioat/Makefile
 create mode 100644 examples/ioat/ioatfwd.c
 create mode 100644 examples/ioat/meson.build

diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index f23f8f59e..a6a1d9e7a 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -23,6 +23,7 @@ Sample Applications User Guides
     ip_reassembly
     kernel_nic_interface
     keep_alive
+    ioat
     l2_forward_crypto
     l2_forward_job_stats
     l2_forward_real_virtual
diff --git a/doc/guides/sample_app_ug/intro.rst b/doc/guides/sample_app_ug/intro.rst
index 90704194a..74462312f 100644
--- a/doc/guides/sample_app_ug/intro.rst
+++ b/doc/guides/sample_app_ug/intro.rst
@@ -91,6 +91,10 @@ examples are highlighted below.
   forwarding, or ``l3fwd`` application does forwarding based on Internet
   Protocol, IPv4 or IPv6 like a simple router.

+* :doc:`Hardware packet copying <ioat>`: The Hardware packet copying,
+  or ``ioatfwd``, application demonstrates how to use the IOAT rawdev
+  driver for copying packets between two threads.
+
 * :doc:`Packet Distributor`: The Packet Distributor demonstrates how to
   distribute packets arriving on an Rx port to different cores for
   processing and transmission.
diff --git a/doc/guides/sample_app_ug/ioat.rst b/doc/guides/sample_app_ug/ioat.rst
new file mode 100644
index 000000000..378d70b81
--- /dev/null
+++ b/doc/guides/sample_app_ug/ioat.rst
@@ -0,0 +1,691 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright(c) 2019 Intel Corporation.
+
+Sample Application of packet copying using Intel\ |reg| QuickData Technology
+=============================================================================
+
+Overview
+--------
+
+This sample is intended as a demonstration of the basic components of a DPDK
+forwarding application and an example of how to use the IOAT driver API to
+make packet copies.
+
+Also, while forwarding, the MAC addresses are changed as follows:
+
+* The source MAC address is replaced by the TX port MAC address
+
+* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID
+
+This application can be used to compare the performance of a software packet
+copy with a copy done using a DMA device, for different packet sizes.
+The example will print out statistics each second. The statistics show
+received/sent packets and packets dropped or failed to copy.
+
+Compiling the Application
+-------------------------
+
+To compile the sample application see :doc:`compiling`.
+
+The application is located in the ``ioat`` sub-directory.
+
+
+Running the Application
+-----------------------
+
+In order to run the hardware copy application, the copying device
+needs to be bound to a user-space IO driver.
+
+Refer to the *IOAT Rawdev Driver for Intel\ |reg| QuickData Technology*
+guide for information on using the driver.
+
+The application requires a number of command line options:
+
+.. code-block:: console
+
+    ./build/ioatfwd [EAL options] -- -p MASK [-c CT] [-s RS] [--[no-]mac-updating]
+
+where,
+
+* -p MASK: A hexadecimal bitmask of the ports to configure
+
+* -c CT: Performed packet copy type: software (sw) or hardware using
+  DMA (rawdev)
+
+* -s RS: Size of the IOAT rawdev ring for hardware copy mode or of the
+  rte_ring for software copy mode
+
+* --[no-]mac-updating: Whether the MAC addresses of packets should be
+  updated or not
+
+The application can be launched in 2 different configurations:
+
+* Performing software packet copying
+
+* Performing hardware packet copying
+
+Each port needs 2 lcores: one receives the incoming traffic and makes
+a copy of each packet, the second updates the MAC address and sends the
+copy. In each configuration an additional lcore is needed, since the
+master lcore is responsible for configuration, statistics printing and
+safe deinitialization of all ports and devices.
+
+The application can use a maximum of 8 ports.
+
+To run the application in a Linux environment with 3 lcores (one of them
+being the master lcore), 1 port (port 0), software copying and MAC
+updating, issue the command:
+
+.. code-block:: console
+
+    $ ./build/ioatfwd -l 0-2 -n 2 -- -p 0x1 --mac-updating -c sw
+
+To run the application in a Linux environment with 5 lcores (one of them
+being the master lcore), 2 ports (ports 0 and 1), hardware copying and no
+MAC updating, issue the command:
+
+.. code-block:: console
+
+    $ ./build/ioatfwd -l 0-4 -n 1 -- -p 0x3 --no-mac-updating -c rawdev
+
+Refer to the *DPDK Getting Started Guide* for general information on
+running applications and the Environment Abstraction Layer (EAL) options.
+
+Explanation
+-----------
+
+The following sections provide an explanation of the main components of the
+code.
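+
+Before stepping through the code, note that for hardware copy mode the
+copying devices have to be bound to a user-space IO driver before the
+application is started, as mentioned in `Running the Application`_. A
+minimal sketch of this step is shown below; the PCI address
+``0000:00:04.0`` is only an example, so first check the CBDMA channel
+addresses reported for the target system:
+
+.. code-block:: console
+
+    $ ./usertools/dpdk-devbind.py --status
+    $ ./usertools/dpdk-devbind.py --bind=vfio-pci 0000:00:04.0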
+
+All DPDK library functions used in the sample code are prefixed with
+``rte_`` and are explained in detail in the *DPDK API Documentation*.
+
+
+The Main Function
+~~~~~~~~~~~~~~~~~
+
+The ``main()`` function performs the initialization and launches the
+execution threads for each lcore.
+
+The first task is to initialize the Environment Abstraction Layer (EAL).
+The ``argc`` and ``argv`` arguments are provided to the ``rte_eal_init()``
+function. The value returned is the number of parsed arguments:
+
+.. code-block:: c
+
+    /* init EAL */
+    ret = rte_eal_init(argc, argv);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+
+
+The ``main()`` function also allocates a mempool to hold the mbufs
+(Message Buffers) used by the application:
+
+.. code-block:: c
+
+    nb_mbufs = RTE_MAX(rte_eth_dev_count_avail() * (nb_rxd + nb_txd +
+        MAX_PKT_BURST + rte_lcore_count() * MEMPOOL_CACHE_SIZE),
+        MIN_POOL_SIZE);
+
+    /* Create the mbuf pool */
+    ioat_pktmbuf_pool = rte_pktmbuf_pool_create("mbuf_pool", nb_mbufs,
+        MEMPOOL_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
+        rte_socket_id());
+    if (ioat_pktmbuf_pool == NULL)
+        rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+Mbufs are the packet buffer structure used by DPDK. They are explained in
+detail in the "Mbuf Library" section of the *DPDK Programmer's Guide*.
+
+The ``main()`` function also initializes the ports:
+
+.. code-block:: c
+
+    /* Initialise each port */
+    RTE_ETH_FOREACH_DEV(portid) {
+        port_init(portid, ioat_pktmbuf_pool);
+    }
+
+Each port is configured using ``port_init()``:
+
+.. code-block:: c
+
+    static inline void
+    port_init(uint16_t portid, struct rte_mempool *mbuf_pool)
+    {
+        struct rte_eth_rxconf rxq_conf;
+        struct rte_eth_txconf txq_conf;
+        struct rte_eth_conf local_port_conf = port_conf;
+        struct rte_eth_dev_info dev_info;
+        int ret;
+
+        /* Skip ports that are not enabled */
+        if ((ioat_enabled_port_mask & (1 << portid)) == 0) {
+            printf("Skipping disabled port %u\n", portid);
+            return;
+        }
+
+        /* Init port */
+        printf("Initializing port %u... ", portid);
+        fflush(stdout);
+        rte_eth_dev_info_get(portid, &dev_info);
+        if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE)
+            local_port_conf.txmode.offloads |=
+                DEV_TX_OFFLOAD_MBUF_FAST_FREE;
+        ret = rte_eth_dev_configure(portid, 1, 1, &local_port_conf);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n",
+                ret, portid);
+
+        ret = rte_eth_dev_adjust_nb_rx_tx_desc(portid, &nb_rxd,
+            &nb_txd);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE,
+                "Cannot adjust number of descriptors: err=%d, port=%u\n",
+                ret, portid);
+
+        rte_eth_macaddr_get(portid, &ioat_ports_eth_addr[portid]);
+
+        /* Init one RX queue */
+        fflush(stdout);
+        rxq_conf = dev_info.default_rxconf;
+        rxq_conf.offloads = local_port_conf.rxmode.offloads;
+        ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+            rte_eth_dev_socket_id(portid),
+            &rxq_conf,
+            mbuf_pool);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
+                ret, portid);
+
+        /* Init one TX queue on each port */
+        fflush(stdout);
+        txq_conf = dev_info.default_txconf;
+        txq_conf.offloads = local_port_conf.txmode.offloads;
+        ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+            rte_eth_dev_socket_id(portid),
+            &txq_conf);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
+                ret, portid);
+
+        /* Initialize TX buffers */
+        tx_buffer[portid] = rte_zmalloc_socket("tx_buffer",
+            RTE_ETH_TX_BUFFER_SIZE(MAX_PKT_BURST), 0,
+            rte_eth_dev_socket_id(portid));
+        if (tx_buffer[portid] == NULL)
+            rte_exit(EXIT_FAILURE, "Cannot allocate buffer for tx "
+                "on port %u\n", portid);
+
+        rte_eth_tx_buffer_init(tx_buffer[portid], MAX_PKT_BURST);
+
+        ret = rte_eth_tx_buffer_set_err_callback(tx_buffer[portid],
+            rte_eth_tx_buffer_count_callback,
+            &port_statistics[portid].tx_dropped);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE,
+                "Cannot set error callback for tx buffer on port %u\n",
+                portid);
+
+        /* Start device */
+        ret = rte_eth_dev_start(portid);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
+                ret, portid);
+
+        rte_eth_promiscuous_enable(portid);
+
+        printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
+            portid,
+            ioat_ports_eth_addr[portid].addr_bytes[0],
+            ioat_ports_eth_addr[portid].addr_bytes[1],
+            ioat_ports_eth_addr[portid].addr_bytes[2],
+            ioat_ports_eth_addr[portid].addr_bytes[3],
+            ioat_ports_eth_addr[portid].addr_bytes[4],
+            ioat_ports_eth_addr[portid].addr_bytes[5]);
+    }
+
+The Ethernet ports are configured with local settings using the
+``rte_eth_dev_configure()`` function and the ``port_conf`` struct:
+
+.. code-block:: c
+
+    static struct rte_eth_conf port_conf = {
+        .rxmode = {
+            .max_rx_pkt_len = RTE_ETHER_MAX_LEN,
+        },
+    };
+
+For this example the ports are set up with 1 RX and 1 TX queue using the
+``rte_eth_rx_queue_setup()`` and ``rte_eth_tx_queue_setup()`` functions.
+
+The Ethernet port is then started:
+
+.. code-block:: c
+
+    ret = rte_eth_dev_start(portid);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
+            ret, portid);
+
+
+Finally, the RX port is set in promiscuous mode:
+
+.. code-block:: c
+
+    rte_eth_promiscuous_enable(portid);
+
+
+After that, the application assigns the resources needed by each port:
+
+.. code-block:: c
+
+    check_link_status(ioat_enabled_port_mask);
+
+    if (!cfg.nb_ports) {
+        rte_exit(EXIT_FAILURE,
+            "All available ports are disabled. Please set portmask.\n");
+    }
+
+    /* Check if there is enough lcores for all ports. */
+    cfg.nb_lcores = rte_lcore_count() - 1;
+    if (cfg.nb_lcores < 1)
+        rte_exit(EXIT_FAILURE,
+            "There should be at least one slave lcore.\n");
+
+    ret = 0;
+
+    if (copy_mode == COPY_MODE_IOAT_NUM) {
+        assign_rawdevs();
+    } else /* copy_mode == COPY_MODE_SW_NUM */ {
+        assign_rings();
+    }
+
+The link status of each port enabled by the port mask is checked using
+the ``check_link_status()`` function:
+
+.. code-block:: c
+
+    /* Check the link status of all ports in up to 9s, and print them finally */
+    static void
+    check_link_status(uint32_t port_mask)
+    {
+        uint16_t portid;
+        struct rte_eth_link link;
+
+        cfg.nb_ports = 0;
+
+        printf("\nChecking link status\n");
+        fflush(stdout);
+        RTE_ETH_FOREACH_DEV(portid) {
+            if (force_quit)
+                return;
+            if ((port_mask & (1 << portid)) == 0)
+                continue;
+
+            store_port_nb(portid);
+
+            memset(&link, 0, sizeof(link));
+            rte_eth_link_get(portid, &link);
+
+            /* Print link status */
+            if (link.link_status) {
+                printf(
+                    "Port %d Link Up. Speed %u Mbps - %s\n",
+                    portid, link.link_speed,
+                    (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+                    ("full-duplex") : ("half-duplex"));
+            }
+            else
+                printf("Port %d Link Down\n", portid);
+        }
+    }
+
+Depending on the mode set (whether the copy should be done in software or
+in hardware), special structures are assigned to each port. If software
+copy was chosen, the application assigns a ring structure for packet
+exchange between the lcores assigned to the ports:
+
+.. code-block:: c
+
+    static void
+    assign_rings(void)
+    {
+        uint32_t i;
+
+        for (i = 0; i < cfg.nb_ports; i++) {
+            char ring_name[20];
+
+            snprintf(ring_name, 20, "rx_to_tx_ring_%u", i);
+            /* Create ring for inter core communication */
+            cfg.ports[i].rx_to_tx_ring = rte_ring_create(
+                ring_name, ring_size,
+                rte_socket_id(), RING_F_SP_ENQ | RING_F_SC_DEQ);
+
+            if (cfg.ports[i].rx_to_tx_ring == NULL)
+                rte_exit(EXIT_FAILURE, "%s\n",
+                    rte_strerror(rte_errno));
+        }
+    }
+
+
+When using hardware copy, each port is assigned an IOAT device
+(``assign_rawdevs()``) using IOAT Rawdev Driver API functions:
+
+.. code-block:: c
+
+    static void
+    assign_rawdevs(void)
+    {
+        uint16_t nb_rawdev = 0;
+        uint32_t i;
+
+        for (i = 0; i < cfg.nb_ports; i++) {
+            struct rte_rawdev_info rdev_info = {0};
+            rte_rawdev_info_get(i, &rdev_info);
+
+            if (strcmp(rdev_info.driver_name, "rawdev_ioat") == 0) {
+                configure_rawdev_queue(i);
+                cfg.ports[i].dev_id = i;
+                ++nb_rawdev;
+            }
+        }
+
+        RTE_LOG(INFO, IOAT, "Number of used rawdevs: %u.\n", nb_rawdev);
+
+        if (nb_rawdev < cfg.nb_ports)
+            rte_exit(EXIT_FAILURE, "Not enough IOAT rawdevs (%u) for ports (%u).\n",
+                nb_rawdev, cfg.nb_ports);
+    }
+
+
+The initialization of the hardware device is done using the
+``rte_rawdev_configure()`` function with an ``rte_rawdev_info`` struct.
+After configuration, the device is started using the ``rte_rawdev_start()``
+function. Each of the above operations is done in
+``configure_rawdev_queue()``.
+
+.. code-block:: c
+
+    static void
+    configure_rawdev_queue(uint32_t dev_id)
+    {
+        struct rte_rawdev_info info = { .dev_private = &dev_config };
+
+        /* Configure hardware copy device */
+        dev_config.ring_size = ring_size;
+
+        if (rte_rawdev_configure(dev_id, &info) != 0) {
+            rte_exit(EXIT_FAILURE,
+                "Error with rte_rawdev_configure()\n");
+        }
+        rte_rawdev_info_get(dev_id, &info);
+        if (dev_config.ring_size != ring_size) {
+            rte_exit(EXIT_FAILURE,
+                "Error, ring size is not %d (%d)\n",
+                ring_size, (int)dev_config.ring_size);
+        }
+        if (rte_rawdev_start(dev_id) != 0) {
+            rte_exit(EXIT_FAILURE,
+                "Error with rte_rawdev_start()\n");
+        }
+    }
+
+If initialization is successful, memory for hardware device statistics is
+allocated.
+
+Finally, the ``main()`` function starts all processing lcores and starts
+printing stats in a loop on the master lcore. The application can be
+interrupted and closed using ``Ctrl-C``. The master lcore waits for all
+slave lcores to finish, deallocates resources and exits.
+
+The function that launches the processing lcores is described below.
+
+The Lcores Launching Functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As described above, the ``main()`` function invokes the
+``start_forwarding_cores()`` function in order to start processing on
+each lcore:
+
+.. code-block:: c
+
+    static void start_forwarding_cores(void)
+    {
+        uint32_t lcore_id = rte_lcore_id();
+
+        RTE_LOG(INFO, IOAT, "Entering %s on lcore %u\n",
+            __func__, rte_lcore_id());
+
+        if (cfg.nb_lcores == 1) {
+            lcore_id = rte_get_next_lcore(lcore_id, true, true);
+            rte_eal_remote_launch((lcore_function_t *)rxtx_main_loop, NULL, lcore_id);
+        } else if (cfg.nb_lcores > 1) {
+            lcore_id = rte_get_next_lcore(lcore_id, true, true);
+            rte_eal_remote_launch((lcore_function_t *)rx_main_loop, NULL, lcore_id);
+
+            lcore_id = rte_get_next_lcore(lcore_id, true, true);
+            rte_eal_remote_launch((lcore_function_t *)tx_main_loop, NULL, lcore_id);
+        }
+    }
+
+The function launches the rx/tx processing functions on the configured
+lcores using ``rte_eal_remote_launch()``. The configured ports, their
+number and the number of assigned lcores are stored in the user-defined
+``rxtx_transmission_config`` struct that is initialized before launching
+tasks:
+
+.. code-block:: c
+
+    struct rxtx_transmission_config {
+        struct rxtx_port_config ports[RTE_MAX_ETHPORTS];
+        uint16_t nb_ports;
+        uint16_t nb_lcores;
+    };
+
+The Lcores Processing Functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Packets are received on each port using the ``ioat_rx_port()`` function.
+Depending on the mode the user chose, it will enqueue packets to the IOAT
+rawdev and then invoke the copy process (hardware copy), or perform a
+software copy of each packet using the ``pktmbuf_sw_copy()`` function and
+enqueue them to an rte_ring:
+
+.. code-block:: c
+
+    /* Receive packets on one port and enqueue to IOAT rawdev or rte_ring. */
+    static void
+    ioat_rx_port(struct rxtx_port_config *rx_config)
+    {
+        uint32_t nb_rx, nb_enq, i;
+        struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+
+        nb_rx = rte_eth_rx_burst(rx_config->rx_portId, 0,
+            pkts_burst, MAX_PKT_BURST);
+
+        if (nb_rx == 0)
+            return;
+
+        port_statistics[rx_config->rx_portId].rx += nb_rx;
+
+        if (copy_mode == COPY_MODE_IOAT_NUM) {
+            /* Perform packet hardware copy */
+            nb_enq = ioat_enqueue_packets(rx_config,
+                pkts_burst, nb_rx);
+
+            if (nb_enq > 0)
+                rte_ioat_do_copies(rx_config->dev_id);
+        } else {
+            /* Perform packet software copy, free source packets */
+            int ret;
+            struct rte_mbuf *pkts_burst_copy[MAX_PKT_BURST];
+
+            ret = rte_pktmbuf_alloc_bulk(ioat_pktmbuf_pool,
+                pkts_burst_copy, nb_rx);
+
+            if (unlikely(ret < 0))
+                rte_exit(EXIT_FAILURE, "Unable to allocate memory.\n");
+
+            for (i = 0; i < nb_rx; i++) {
+                pktmbuf_sw_copy(pkts_burst[i], pkts_burst_copy[i]);
+                rte_pktmbuf_free(pkts_burst[i]);
+            }
+
+            nb_enq = rte_ring_enqueue_burst(rx_config->rx_to_tx_ring,
+                (void *)pkts_burst_copy, nb_rx, NULL);
+
+            /* Free any not enqueued packets. */
+            for (i = nb_enq; i < nb_rx; i++)
+                rte_pktmbuf_free(pkts_burst_copy[i]);
+        }
+
+        port_statistics[rx_config->rx_portId].copy_dropped
+            += (nb_rx - nb_enq);
+    }
+
+The packets are received in burst mode using the ``rte_eth_rx_burst()``
+function. When using hardware copy mode the packets are enqueued in the
+copying device's buffer using ``ioat_enqueue_packets()``, which calls
+``rte_ioat_enqueue_copy()``. When all received packets are in the buffer,
+the copies are invoked by calling ``rte_ioat_do_copies()``. The
+``rte_ioat_enqueue_copy()`` function operates on the physical addresses of
+the packets. The ``rte_mbuf`` structure contains only the physical address
+of the start of the data buffer (``buf_iova``). Thus this address is
+shifted back by the ``addr_offset`` value in order to get the physical
+address of the ``rearm_data`` member of ``rte_mbuf``. That way the packet
+is copied all at once (metadata and data together).
+
+.. code-block:: c
+
+    static uint32_t
+    ioat_enqueue_packets(struct rxtx_port_config *rx_config,
+        struct rte_mbuf **pkts, uint32_t nb_rx)
+    {
+        int ret;
+        uint32_t i;
+        struct rte_mbuf *pkts_copy[MAX_PKT_BURST];
+
+        const uint64_t addr_offset = RTE_PTR_DIFF(pkts[0]->buf_addr,
+            &pkts[0]->rearm_data);
+
+        ret = rte_pktmbuf_alloc_bulk(ioat_pktmbuf_pool, pkts_copy, nb_rx);
+
+        if (unlikely(ret < 0))
+            rte_exit(EXIT_FAILURE, "Unable to allocate memory.\n");
+
+        for (i = 0; i < nb_rx; i++) {
+            /* Perform data copy */
+            ret = rte_ioat_enqueue_copy(rx_config->dev_id,
+                pkts[i]->buf_iova
+                    - addr_offset,
+                pkts_copy[i]->buf_iova
+                    - addr_offset,
+                rte_pktmbuf_data_len(pkts[i])
+                    + addr_offset,
+                (uintptr_t)pkts[i],
+                (uintptr_t)pkts_copy[i],
+                0 /* nofence */);
+
+            if (ret != 1)
+                break;
+        }
+
+        ret = i;
+        /* Free any not enqueued packets. */
+        for (; i < nb_rx; i++) {
+            rte_pktmbuf_free(pkts[i]);
+            rte_pktmbuf_free(pkts_copy[i]);
+        }
+
+        return ret;
+    }
+
+
+All completed copies are processed by the ``ioat_tx_port()`` function.
+When using hardware copy mode, the function invokes
+``rte_ioat_completed_copies()`` to gather the copied packets. If software
+copy mode is used, the function dequeues the copied packets from the
+rte_ring. Then the MAC address of each packet is updated (if MAC updating
+is enabled). After that the copies are sent in burst mode using
+``rte_eth_tx_burst()``.
+
+
+.. code-block:: c
+
+    /* Transmit packets from IOAT rawdev/rte_ring for one port. */
+    static void
+    ioat_tx_port(struct rxtx_port_config *tx_config)
+    {
+        uint32_t i, nb_dq;
+        struct rte_mbuf *mbufs_src[MAX_PKT_BURST];
+        struct rte_mbuf *mbufs_dst[MAX_PKT_BURST];
+
+        if (copy_mode == COPY_MODE_IOAT_NUM) {
+            /* Dequeue the mbufs from the IOAT device. */
+            nb_dq = rte_ioat_completed_copies(tx_config->dev_id,
+                MAX_PKT_BURST, (void *)mbufs_src, (void *)mbufs_dst);
+        } else {
+            /* Dequeue the mbufs from the rx_to_tx_ring. */
+            nb_dq = rte_ring_dequeue_burst(tx_config->rx_to_tx_ring,
+                (void *)mbufs_dst, MAX_PKT_BURST, NULL);
+        }
+
+        if (nb_dq == 0)
+            return;
+
+        /* Free source packets */
+        if (copy_mode == COPY_MODE_IOAT_NUM) {
+            for (i = 0; i < nb_dq; i++)
+                rte_pktmbuf_free(mbufs_src[i]);
+        }
+
+        /* Update macs if enabled */
+        if (mac_updating) {
+            for (i = 0; i < nb_dq; i++)
+                update_mac_addrs(mbufs_dst[i],
+                    tx_config->tx_portId);
+        }
+
+        const uint16_t nb_tx = rte_eth_tx_burst(tx_config->tx_portId,
+            0, (void *)mbufs_dst, nb_dq);
+
+        port_statistics[tx_config->tx_portId].tx += nb_tx;
+
+        /* Free any unsent packets. */
+        if (unlikely(nb_tx < nb_dq)) {
+            for (i = nb_tx; i < nb_dq; i++)
+                rte_pktmbuf_free(mbufs_dst[i]);
+        }
+    }
+
+The Packet Copying Functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In order to perform a packet copy, the user-defined function
+``pktmbuf_sw_copy()`` is used. It copies a whole packet by copying the
+metadata from the source packet to the new mbuf, and then copying the data
+chunk of the source packet. Both memory copies are done using
+``rte_memcpy()``:
+
+.. code-block:: c
+
+    static inline void
+    pktmbuf_sw_copy(struct rte_mbuf *src, struct rte_mbuf *dst)
+    {
+        /* Copy packet metadata */
+        rte_memcpy(&dst->rearm_data,
+            &src->rearm_data,
+            offsetof(struct rte_mbuf, cacheline1)
+                - offsetof(struct rte_mbuf, rearm_data));
+
+        /* Copy packet data */
+        rte_memcpy(rte_pktmbuf_mtod(dst, char *),
+            rte_pktmbuf_mtod(src, char *), src->data_len);
+    }
+
+The metadata in this example is copied from the ``rearm_data`` member of
+the ``rte_mbuf`` struct up to ``cacheline1``.
+
+In order to understand why software packet copying is done as shown above,
+please refer to the "Mbuf Library" section of the
+*DPDK Programmer's Guide*.
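+
+As a usage sketch (hypothetical code, not part of the application), copying
+a single packet with this helper could look as follows, assuming a valid
+mempool ``pool`` and a received mbuf ``src``:
+
+.. code-block:: c
+
+    struct rte_mbuf *dst = rte_pktmbuf_alloc(pool);
+
+    if (dst != NULL) {
+        /* copy metadata and data from src into dst */
+        pktmbuf_sw_copy(src, dst);
+        /* the source packet is no longer needed */
+        rte_pktmbuf_free(src);
+    }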
\ No newline at end of file
diff --git a/examples/Makefile b/examples/Makefile
index de11dd487..3cb313d7d 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -23,6 +23,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += fips_validation
 DIRS-$(CONFIG_RTE_LIBRTE_FLOW_CLASSIFY) += flow_classify
 DIRS-y += flow_filtering
 DIRS-y += helloworld
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_IOAT_RAWDEV),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_IOAT_RAWDEV) += ioat
+endif
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += ip_pipeline
 ifeq ($(CONFIG_RTE_LIBRTE_LPM),y)
 DIRS-$(CONFIG_RTE_IP_FRAG) += ip_reassembly
diff --git a/examples/ioat/Makefile b/examples/ioat/Makefile
new file mode 100644
index 000000000..2a4d1da2d
--- /dev/null
+++ b/examples/ioat/Makefile
@@ -0,0 +1,54 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2019 Intel Corporation
+
+# binary name
+APP = ioatfwd
+
+# all source are stored in SRCS-y
+SRCS-y := ioatfwd.c
+
+# Build using pkg-config variables if possible
+ifeq ($(shell pkg-config --exists libdpdk && echo 0),0)
+
+all: shared
+.PHONY: shared static
+shared: build/$(APP)-shared
+	ln -sf $(APP)-shared build/$(APP)
+static: build/$(APP)-static
+	ln -sf $(APP)-static build/$(APP)
+
+PC_FILE := $(shell pkg-config --path libdpdk)
+CFLAGS += -O3 $(shell pkg-config --cflags libdpdk)
+LDFLAGS_SHARED = $(shell pkg-config --libs libdpdk)
+LDFLAGS_STATIC = -Wl,-Bstatic $(shell pkg-config --static --libs libdpdk)
+
+build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build
+	$(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED)
+
+build/$(APP)-static: $(SRCS-y) Makefile $(PC_FILE) | build
+	$(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_STATIC)
+
+build:
+	@mkdir -p $@
+
+.PHONY: clean
+clean:
+	rm -f build/$(APP) build/$(APP)-static build/$(APP)-shared
+	test -d build && rmdir -p build || true
+
+else # Build using legacy build system
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, detect a build directory, by looking for a path with a .config
+RTE_TARGET ?= $(notdir $(abspath $(dir $(firstword $(wildcard $(RTE_SDK)/*/.config)))))
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
+endif
diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c
new file mode 100644
index 000000000..8463d82f3
--- /dev/null
+++ b/examples/ioat/ioatfwd.c
@@ -0,0 +1,1010 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation
+ */
+
+#include <stdint.h>
+#include <getopt.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <unistd.h>
+
+#include <rte_malloc.h>
+#include <rte_ethdev.h>
+#include <rte_rawdev.h>
+#include <rte_ioat_rawdev.h>
+
+/* size of ring used for software copying between rx and tx. */
+#define RTE_LOGTYPE_IOAT RTE_LOGTYPE_USER1
+#define MAX_PKT_BURST 32
+#define MEMPOOL_CACHE_SIZE 512
+#define MIN_POOL_SIZE 65536U
+#define CMD_LINE_OPT_MAC_UPDATING "mac-updating"
+#define CMD_LINE_OPT_NO_MAC_UPDATING "no-mac-updating"
+#define CMD_LINE_OPT_PORTMASK "portmask"
+#define CMD_LINE_OPT_NB_QUEUE "nb-queue"
+#define CMD_LINE_OPT_COPY_TYPE "copy-type"
+#define CMD_LINE_OPT_RING_SIZE "ring-size"
+
+/* configurable number of RX/TX ring descriptors */
+#define RX_DEFAULT_RINGSIZE 1024
+#define TX_DEFAULT_RINGSIZE 1024
+
+/* max number of RX queues per port */
+#define MAX_RX_QUEUES_COUNT 8
+
+struct rxtx_port_config {
+    /* common config */
+    uint16_t rxtx_port;
+    uint16_t nb_queues;
+    /* for software copy mode */
+    struct rte_ring *rx_to_tx_ring;
+    /* for IOAT rawdev copy mode */
+    uint16_t ioat_ids[MAX_RX_QUEUES_COUNT];
+};
+
+struct rxtx_transmission_config {
+    struct rxtx_port_config ports[RTE_MAX_ETHPORTS];
+    uint16_t nb_ports;
+    uint16_t nb_lcores;
+};
+
+/* per-port statistics struct */
+struct ioat_port_statistics {
+    uint64_t rx[RTE_MAX_ETHPORTS];
+    uint64_t tx[RTE_MAX_ETHPORTS];
+    uint64_t tx_dropped[RTE_MAX_ETHPORTS];
+    uint64_t copy_dropped[RTE_MAX_ETHPORTS];
+};
+struct ioat_port_statistics port_statistics;
+
+struct total_statistics {
+    uint64_t total_packets_dropped;
+    uint64_t total_packets_tx;
+    uint64_t total_packets_rx;
+    uint64_t total_successful_enqueues;
+    uint64_t total_failed_enqueues;
+};
+
+typedef enum copy_mode_t {
+#define COPY_MODE_SW "sw"
+    COPY_MODE_SW_NUM,
+#define COPY_MODE_IOAT "rawdev"
+    COPY_MODE_IOAT_NUM,
+    COPY_MODE_INVALID_NUM,
+    COPY_MODE_SIZE_NUM = COPY_MODE_INVALID_NUM
+} copy_mode_t;
+
+/* mask of enabled ports */
+static uint32_t ioat_enabled_port_mask;
+
+/* number of RX queues per port */
+static uint16_t nb_queues = 1;
+
+/* MAC updating enabled by default. */
+static int mac_updating = 1;
+
+/* hardware copy mode enabled by default. */
+static copy_mode_t copy_mode = COPY_MODE_IOAT_NUM;
+
+/* size of IOAT rawdev ring for hardware copy mode or
+ * rte_ring for software copy mode
+ */
+static unsigned short ring_size = 2048;
+
+/* global transmission config */
+struct rxtx_transmission_config cfg;
+
+/* configurable number of RX/TX ring descriptors */
+static uint16_t nb_rxd = RX_DEFAULT_RINGSIZE;
+static uint16_t nb_txd = TX_DEFAULT_RINGSIZE;
+
+static volatile bool force_quit;
+
+/* ethernet addresses of ports */
+static struct rte_ether_addr ioat_ports_eth_addr[RTE_MAX_ETHPORTS];
+
+static struct rte_eth_dev_tx_buffer *tx_buffer[RTE_MAX_ETHPORTS];
+struct rte_mempool *ioat_pktmbuf_pool;
+
+/* Print out statistics for one port. */
+static void
+print_port_stats(uint16_t port_id)
+{
+    printf("\nStatistics for port %u ------------------------------"
+        "\nPackets sent: %34"PRIu64
+        "\nPackets received: %30"PRIu64
+        "\nPackets dropped on tx: %25"PRIu64
+        "\nPackets dropped on copy: %23"PRIu64,
+        port_id,
+        port_statistics.tx[port_id],
+        port_statistics.rx[port_id],
+        port_statistics.tx_dropped[port_id],
+        port_statistics.copy_dropped[port_id]);
+}
+
+/* Print out statistics for one IOAT rawdev device. */
+static void
+print_rawdev_stats(uint32_t dev_id, uint64_t *xstats,
+    uint16_t nb_xstats, struct rte_rawdev_xstats_name *names_xstats)
+{
+    uint16_t i;
+
+    printf("\nIOAT channel %u", dev_id);
+    for (i = 0; i < nb_xstats; i++)
+        if (strstr(names_xstats[i].name, "enqueues"))
+            printf("\n\t %s: %*"PRIu64,
+                names_xstats[i].name,
+                (int)(37 - strlen(names_xstats[i].name)),
+                xstats[i]);
+}
+
+static void
+print_total_stats(struct total_statistics *ts)
+{
+    printf("\nAggregate statistics ==============================="
+        "\nTotal packets sent: %28"PRIu64
+        "\nTotal packets received: %24"PRIu64
+        "\nTotal packets dropped: %25"PRIu64,
+        ts->total_packets_tx,
+        ts->total_packets_rx,
+        ts->total_packets_dropped);
+
+    if (copy_mode == COPY_MODE_IOAT_NUM) {
+        printf("\nTotal IOAT successful enqueues: %16"PRIu64
+            "\nTotal IOAT failed enqueues: %20"PRIu64,
+            ts->total_successful_enqueues,
+            ts->total_failed_enqueues);
+    }
+
+    printf("\n====================================================\n");
+}
+
+/* Print out statistics on packets dropped. */
+static void
+print_stats(char *prgname)
+{
+    struct total_statistics ts;
+    uint32_t i, port_id, dev_id;
+    struct rte_rawdev_xstats_name *names_xstats;
+    uint64_t *xstats;
+    unsigned int *ids_xstats;
+    unsigned int nb_xstats, id_fail_enq, id_succ_enq;
+    char status_string[120]; /* to print at the top of the output */
+    int status_strlen;
+
+    const char clr[] = { 27, '[', '2', 'J', '\0' };
+    const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
+
+    status_strlen = snprintf(status_string, sizeof(status_string),
+        "%s, ", prgname);
+    status_strlen += snprintf(status_string + status_strlen,
+        sizeof(status_string) - status_strlen,
+        "Worker Threads = %d, ",
+        rte_lcore_count() > 2 ? 2 : 1);
+    status_strlen += snprintf(status_string + status_strlen,
+        sizeof(status_string) - status_strlen,
+        "Copy Mode = %s,\n", copy_mode == COPY_MODE_SW_NUM ?
+        COPY_MODE_SW : COPY_MODE_IOAT);
+    status_strlen += snprintf(status_string + status_strlen,
+        sizeof(status_string) - status_strlen,
+        "Updating MAC = %s, ", mac_updating ?
+        "enabled" : "disabled");
+    status_strlen += snprintf(status_string + status_strlen,
+        sizeof(status_string) - status_strlen,
+        "Rx Queues = %d, ", nb_queues);
+    status_strlen += snprintf(status_string + status_strlen,
+        sizeof(status_string) - status_strlen,
+        "Ring Size = %d\n", ring_size);
+
+    /* Allocate memory for xstats names and values */
+    nb_xstats = rte_rawdev_xstats_names_get(
+        cfg.ports[0].ioat_ids[0], NULL, 0);
+
+    names_xstats = malloc(sizeof(*names_xstats) * nb_xstats);
+    if (names_xstats == NULL) {
+        rte_exit(EXIT_FAILURE,
+            "Error allocating xstat names memory\n");
+    }
+    rte_rawdev_xstats_names_get(cfg.ports[0].ioat_ids[0],
+        names_xstats, nb_xstats);
+
+    ids_xstats = malloc(sizeof(*ids_xstats) * nb_xstats);
+    if (ids_xstats == NULL) {
+        rte_exit(EXIT_FAILURE,
+            "Error allocating xstat ids_xstats memory\n");
+    }
+
+    for (i = 0; i < nb_xstats; i++)
+        ids_xstats[i] = i;
+
+    xstats = malloc(sizeof(*xstats) * nb_xstats);
+    if (xstats == NULL) {
+        rte_exit(EXIT_FAILURE,
+            "Error allocating xstat memory\n");
+    }
+
+    /* Get failed/successful enqueues stats index */
+    id_fail_enq = id_succ_enq = nb_xstats;
+    for (i = 0; i < nb_xstats; i++) {
+        if (!strcmp(names_xstats[i].name, "failed_enqueues"))
+            id_fail_enq = i;
+        else if (!strcmp(names_xstats[i].name, "successful_enqueues"))
+            id_succ_enq = i;
+        if (id_fail_enq < nb_xstats && id_succ_enq < nb_xstats)
+            break;
+    }
+    if (id_fail_enq == nb_xstats || id_succ_enq == nb_xstats) {
+        rte_exit(EXIT_FAILURE,
+            "Error getting failed/successful enqueues stats index\n");
+    }
+
+    while (!force_quit) {
+        /* Sleep for 1 second each round - init sleep allows reading
+         * messages from app startup.
+         */
+        sleep(1);
+
+        /* Clear screen and move to top left */
+        printf("%s%s", clr, topLeft);
+
+        memset(&ts, 0, sizeof(struct total_statistics));
+
+        printf("%s", status_string);
+
+        for (i = 0; i < cfg.nb_ports; i++) {
+            port_id = cfg.ports[i].rxtx_port;
+            print_port_stats(port_id);
+
+            ts.total_packets_dropped +=
+                port_statistics.tx_dropped[port_id]
+                + port_statistics.copy_dropped[port_id];
+            ts.total_packets_tx += port_statistics.tx[port_id];
+            ts.total_packets_rx += port_statistics.rx[port_id];
+
+            if (copy_mode == COPY_MODE_IOAT_NUM) {
+                uint32_t j;
+
+                for (j = 0; j < cfg.ports[i].nb_queues; j++) {
+                    dev_id = cfg.ports[i].ioat_ids[j];
+                    rte_rawdev_xstats_get(dev_id,
+                        ids_xstats, xstats, nb_xstats);
+
+                    print_rawdev_stats(dev_id, xstats,
+                        nb_xstats, names_xstats);
+
+                    ts.total_successful_enqueues +=
+                        xstats[id_succ_enq];
+                    ts.total_failed_enqueues +=
+                        xstats[id_fail_enq];
+                }
+            }
+        }
+        printf("\n");
+
+        print_total_stats(&ts);
+    }
+
+    free(names_xstats);
+    free(xstats);
+    free(ids_xstats);
+}
+
+static void
+update_mac_addrs(struct rte_mbuf *m, uint32_t dest_portid)
+{
+    struct rte_ether_hdr *eth;
+    void *tmp;
+
+    eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
+
+    /* 02:00:00:00:00:xx - overwriting 2 bytes of source address but
+     * it's acceptable because it gets overwritten by rte_ether_addr_copy
+     */
+    tmp = &eth->d_addr.addr_bytes[0];
+    *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dest_portid << 40);
+
+    /* src addr */
+    rte_ether_addr_copy(&ioat_ports_eth_addr[dest_portid], &eth->s_addr);
+}
+
+static inline void
+pktmbuf_sw_copy(struct rte_mbuf *src, struct rte_mbuf *dst)
+{
+    /* Copy packet metadata */
+    rte_memcpy(&dst->rearm_data,
+        &src->rearm_data,
+        offsetof(struct rte_mbuf, cacheline1)
+            - offsetof(struct rte_mbuf, rearm_data));
+
+    /* Copy packet data */
+    rte_memcpy(rte_pktmbuf_mtod(dst, char *),
+        rte_pktmbuf_mtod(src, char *), src->data_len);
+}
+
+static uint32_t
+ioat_enqueue_packets(struct rte_mbuf **pkts,
+    uint32_t nb_rx, uint16_t dev_id)
+{
+    int ret;
+    uint32_t i;
+    struct rte_mbuf *pkts_copy[MAX_PKT_BURST];
+
+    const uint64_t addr_offset = RTE_PTR_DIFF(pkts[0]->buf_addr,
+        &pkts[0]->rearm_data);
+
+    ret = rte_mempool_get_bulk(ioat_pktmbuf_pool,
+        (void *)pkts_copy, nb_rx);
+
+    if (unlikely(ret < 0))
+        rte_exit(EXIT_FAILURE, "Unable to allocate memory.\n");
+
+    for (i = 0; i < nb_rx; i++) {
+        /* Perform data copy */
+        ret = rte_ioat_enqueue_copy(dev_id,
+            pkts[i]->buf_iova
+                - addr_offset,
+            pkts_copy[i]->buf_iova
+                - addr_offset,
+            rte_pktmbuf_data_len(pkts[i])
+                + addr_offset,
+            (uintptr_t)pkts[i],
+            (uintptr_t)pkts_copy[i],
+            0 /* nofence */);
+
+        if (ret != 1)
+            break;
+    }
+
+    ret = i;
+    /* Free any not enqueued packets. */
+    rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts[i], nb_rx - i);
+    rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts_copy[i],
+        nb_rx - i);
+
+    return ret;
+}
+
+/* Receive packets on one port and enqueue to IOAT rawdev or rte_ring. */
+static void
+ioat_rx_port(struct rxtx_port_config *rx_config)
+{
+    uint32_t nb_rx, nb_enq, i, j;
+    struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+
+    for (i = 0; i < rx_config->nb_queues; i++) {
+
+        nb_rx = rte_eth_rx_burst(rx_config->rxtx_port, i,
+            pkts_burst, MAX_PKT_BURST);
+
+        if (nb_rx == 0)
+            continue;
+
+        port_statistics.rx[rx_config->rxtx_port] += nb_rx;
+
+        if (copy_mode == COPY_MODE_IOAT_NUM) {
+            /* Perform packet hardware copy */
+            nb_enq = ioat_enqueue_packets(pkts_burst,
+                nb_rx, rx_config->ioat_ids[i]);
+            if (nb_enq > 0)
+                rte_ioat_do_copies(rx_config->ioat_ids[i]);
+        } else {
+            /* Perform packet software copy, free source packets */
+            int ret;
+            struct rte_mbuf *pkts_burst_copy[MAX_PKT_BURST];
+
+            ret = rte_mempool_get_bulk(ioat_pktmbuf_pool,
+                (void *)pkts_burst_copy, nb_rx);
+
+            if (unlikely(ret < 0))
+                rte_exit(EXIT_FAILURE,
+                    "Unable to allocate memory.\n");
+
+            for (j = 0; j < nb_rx; j++)
+                pktmbuf_sw_copy(pkts_burst[j],
+                    pkts_burst_copy[j]);
+
+            rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                (void *)pkts_burst, nb_rx);
+
+            nb_enq = rte_ring_enqueue_burst(
+                rx_config->rx_to_tx_ring,
+                (void *)pkts_burst_copy, nb_rx, NULL);
+
+            /* Free any not enqueued packets. */
+            rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                (void *)&pkts_burst_copy[nb_enq],
+                nb_rx - nb_enq);
+        }
+
+        port_statistics.copy_dropped[rx_config->rxtx_port] +=
+            (nb_rx - nb_enq);
+    }
+}
+
+/* Transmit packets from IOAT rawdev/rte_ring for one port. */
+static void
+ioat_tx_port(struct rxtx_port_config *tx_config)
+{
+    uint32_t i, j, nb_dq = 0;
+    struct rte_mbuf *mbufs_src[MAX_PKT_BURST];
+    struct rte_mbuf *mbufs_dst[MAX_PKT_BURST];
+
+    if (copy_mode == COPY_MODE_IOAT_NUM) {
+        /* Dequeue the mbufs from the IOAT device. */
+        for (i = 0; i < tx_config->nb_queues; i++) {
+            nb_dq = rte_ioat_completed_copies(
+                tx_config->ioat_ids[i], MAX_PKT_BURST,
+                (void *)mbufs_src, (void *)mbufs_dst);
+
+            if (nb_dq == 0)
+                break;
+
+            rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                (void *)mbufs_src, nb_dq);
+
+            /* Update macs if enabled */
+            if (mac_updating) {
+                for (j = 0; j < nb_dq; j++)
+                    update_mac_addrs(mbufs_dst[j],
+                        tx_config->rxtx_port);
+            }
+
+            const uint16_t nb_tx = rte_eth_tx_burst(
+                tx_config->rxtx_port, 0,
+                (void *)mbufs_dst, nb_dq);
+
+            port_statistics.tx[tx_config->rxtx_port] += nb_tx;
+
+            /* Free any unsent packets.
             */
+            if (unlikely(nb_tx < nb_dq))
+                rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                    (void *)&mbufs_dst[nb_tx],
+                    nb_dq - nb_tx);
+        }
+    } else {
+        /* Dequeue the mbufs from rx_to_tx_ring. */
+        nb_dq = rte_ring_dequeue_burst(tx_config->rx_to_tx_ring,
+            (void *)mbufs_dst, MAX_PKT_BURST, NULL);
+
+        if (nb_dq == 0)
+            return;
+
+        /* Update macs if enabled */
+        if (mac_updating) {
+            for (i = 0; i < nb_dq; i++)
+                update_mac_addrs(mbufs_dst[i],
+                    tx_config->rxtx_port);
+        }
+
+        const uint16_t nb_tx = rte_eth_tx_burst(tx_config->rxtx_port,
+            0, (void *)mbufs_dst, nb_dq);
+
+        port_statistics.tx[tx_config->rxtx_port] += nb_tx;
+
+        /* Free any unsent packets. */
+        if (unlikely(nb_tx < nb_dq))
+            rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                (void *)&mbufs_dst[nb_tx],
+                nb_dq - nb_tx);
+    }
+}
+
+/* Main rx processing loop (both copy modes). */
+static void
+rx_main_loop(void)
+{
+    uint16_t i;
+    uint16_t nb_ports = cfg.nb_ports;
+
+    RTE_LOG(INFO, IOAT, "Entering main rx loop for copy on lcore %u\n",
+        rte_lcore_id());
+
+    while (!force_quit)
+        for (i = 0; i < nb_ports; i++)
+            ioat_rx_port(&cfg.ports[i]);
+}
+
+/* Main tx processing loop (both copy modes). */
+static void
+tx_main_loop(void)
+{
+    uint16_t i;
+    uint16_t nb_ports = cfg.nb_ports;
+
+    RTE_LOG(INFO, IOAT, "Entering main tx loop for copy on lcore %u\n",
+        rte_lcore_id());
+
+    while (!force_quit)
+        for (i = 0; i < nb_ports; i++)
+            ioat_tx_port(&cfg.ports[i]);
+}
+
+/* Main rx and tx loop if only one slave lcore available */
+static void
+rxtx_main_loop(void)
+{
+    uint16_t i;
+    uint16_t nb_ports = cfg.nb_ports;
+
+    RTE_LOG(INFO, IOAT, "Entering main rx and tx loop for copy on"
+        " lcore %u\n", rte_lcore_id());
+
+    while (!force_quit)
+        for (i = 0; i < nb_ports; i++) {
+            ioat_rx_port(&cfg.ports[i]);
+            ioat_tx_port(&cfg.ports[i]);
+        }
+}
+
+static void start_forwarding_cores(void)
+{
+    uint32_t lcore_id = rte_lcore_id();
+
+    RTE_LOG(INFO, IOAT, "Entering %s on lcore %u\n",
+        __func__, rte_lcore_id());
+
+    if (cfg.nb_lcores == 1) {
+        lcore_id = rte_get_next_lcore(lcore_id, true, true);
+        rte_eal_remote_launch((lcore_function_t *)rxtx_main_loop,
+            NULL, lcore_id);
+    } else if (cfg.nb_lcores > 1) {
+        lcore_id = rte_get_next_lcore(lcore_id, true, true);
+        rte_eal_remote_launch((lcore_function_t *)rx_main_loop,
+            NULL, lcore_id);
+
+        lcore_id = rte_get_next_lcore(lcore_id, true, true);
+        rte_eal_remote_launch((lcore_function_t *)tx_main_loop, NULL,
+            lcore_id);
+    }
+}
+
+/* Display usage */
+static void
+ioat_usage(const char *prgname)
+{
+    printf("%s [EAL options] -- -p PORTMASK [-q NQ] [-c CT] [-s RS] [--[no-]mac-updating]\n"
+        "  -p --portmask: hexadecimal bitmask of ports to configure\n"
+        "  -q NQ: number of RX queues per port (default is 1)\n"
+        "  --[no-]mac-updating: Enable or disable MAC addresses updating (enabled by default)\n"
+        "      When enabled:\n"
+        "       - The source MAC address is replaced by the TX port MAC address\n"
+        "       - The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID\n"
+        "  -c --copy-type CT: type of copy: sw|rawdev\n"
+        "  -s --ring-size RS: size of IOAT rawdev ring for hardware copy mode or rte_ring for software copy mode\n",
+        prgname);
+}
+
+static int
+ioat_parse_portmask(const char *portmask)
+{
+    char *end = NULL;
+    unsigned long pm;
+
+    /* Parse hexadecimal string */
+    pm = strtoul(portmask, &end, 16);
+    if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+        return -1;
+
+    return pm;
+}
+
+static copy_mode_t
+ioat_parse_copy_mode(const char *copy_mode)
+{
+    if (strcmp(copy_mode, COPY_MODE_SW) == 0)
+        return COPY_MODE_SW_NUM;
+    else if (strcmp(copy_mode, COPY_MODE_IOAT) == 0)
+        return COPY_MODE_IOAT_NUM;
+
+    return COPY_MODE_INVALID_NUM;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+ioat_parse_args(int argc, char **argv, unsigned int nb_ports)
+{
+    static const char short_options[] =
+        "p:"  /* portmask */
+        "q:"  /* number of RX queues per port */
+        "c:"  /* copy type (sw|rawdev) */
+        "s:"  /* ring size */
+        ;
+
+    static const struct option lgopts[] = {
+        {CMD_LINE_OPT_MAC_UPDATING, no_argument, &mac_updating, 1},
+        {CMD_LINE_OPT_NO_MAC_UPDATING, no_argument, &mac_updating, 0},
+        {CMD_LINE_OPT_PORTMASK, required_argument, NULL, 'p'},
+        {CMD_LINE_OPT_NB_QUEUE, required_argument, NULL, 'q'},
+        {CMD_LINE_OPT_COPY_TYPE, required_argument, NULL, 'c'},
+        {CMD_LINE_OPT_RING_SIZE, required_argument, NULL, 's'},
+        {NULL, 0, 0, 0}
+    };
+
+    const unsigned int default_port_mask = (1 << nb_ports) - 1;
+    int opt, ret;
+    char **argvopt;
+    int option_index;
+    char *prgname = argv[0];
+
+    ioat_enabled_port_mask = default_port_mask;
+    argvopt = argv;
+
+    while ((opt = getopt_long(argc, argvopt, short_options,
+            lgopts, &option_index)) != EOF) {
+
+        switch (opt) {
+        /* portmask */
+        case 'p':
+            ioat_enabled_port_mask = ioat_parse_portmask(optarg);
+            if (ioat_enabled_port_mask & ~default_port_mask ||
+                    ioat_enabled_port_mask <= 0) {
+                printf("Invalid portmask, %s, suggest 0x%x\n",
+                    optarg, default_port_mask);
+                ioat_usage(prgname);
+                return -1;
+            }
+            break;
+
+        case 'q':
+            nb_queues = atoi(optarg);
+            if (nb_queues == 0 || nb_queues > MAX_RX_QUEUES_COUNT) {
+                printf("Invalid RX queues number %s. Max %u\n",
+                    optarg, MAX_RX_QUEUES_COUNT);
+                ioat_usage(prgname);
+                return -1;
+            }
+            break;
+
+        case 'c':
+            copy_mode = ioat_parse_copy_mode(optarg);
+            if (copy_mode == COPY_MODE_INVALID_NUM) {
+                printf("Invalid copy type. Use: sw, rawdev\n");
+                ioat_usage(prgname);
+                return -1;
+            }
+            break;
+
+        case 's':
+            ring_size = atoi(optarg);
+            if (ring_size == 0) {
+                printf("Invalid ring size, %s.\n", optarg);
+                ioat_usage(prgname);
+                return -1;
+            }
+            break;
+
+        /* long options */
+        case 0:
+            break;
+
+        default:
+            ioat_usage(prgname);
+            return -1;
+        }
+    }
+
+    printf("MAC updating %s\n", mac_updating ? "enabled" : "disabled");
+    if (optind >= 0)
+        argv[optind-1] = prgname;
+
+    ret = optind-1;
+    optind = 1; /* reset getopt lib */
+    return ret;
+}
+
+/* check link status, return true if at least one port is up */
+static int
+check_link_status(uint32_t port_mask)
+{
+    uint16_t portid;
+    struct rte_eth_link link;
+    int retval = 0;
+
+    printf("\nChecking link status\n");
+    RTE_ETH_FOREACH_DEV(portid) {
+        if ((port_mask & (1 << portid)) == 0)
+            continue;
+
+        memset(&link, 0, sizeof(link));
+        rte_eth_link_get(portid, &link);
+
+        /* Print link status */
+        if (link.link_status) {
+            printf(
+                "Port %d Link Up. Speed %u Mbps - %s\n",
+                portid, link.link_speed,
+                (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+                ("full-duplex") : ("half-duplex"));
+            retval = 1;
+        } else
+            printf("Port %d Link Down\n", portid);
+    }
+    return retval;
+}
+
+static void
+configure_rawdev_queue(uint32_t dev_id)
+{
+    struct rte_ioat_rawdev_config dev_config = { .ring_size = ring_size };
+    struct rte_rawdev_info info = { .dev_private = &dev_config };
+
+    if (rte_rawdev_configure(dev_id, &info) != 0) {
+        rte_exit(EXIT_FAILURE,
+            "Error with rte_rawdev_configure()\n");
+    }
+    if (rte_rawdev_start(dev_id) != 0) {
+        rte_exit(EXIT_FAILURE,
+            "Error with rte_rawdev_start()\n");
+    }
+}
+
+static void
+assign_rawdevs(void)
+{
+    uint16_t nb_rawdev = 0, rdev_id = 0;
+    uint32_t i, j;
+
+    for (i = 0; i < cfg.nb_ports; i++) {
+        for (j = 0; j < cfg.ports[i].nb_queues; j++) {
+            struct rte_rawdev_info rdev_info = { 0 };
+
+            do {
+                if (rdev_id == rte_rawdev_count())
+                    goto end;
+                rte_rawdev_info_get(rdev_id++, &rdev_info);
+            } while (strcmp(rdev_info.driver_name,
+                IOAT_PMD_RAWDEV_NAME_STR) != 0);
+
+            cfg.ports[i].ioat_ids[j] = rdev_id - 1;
+            configure_rawdev_queue(cfg.ports[i].ioat_ids[j]);
+            ++nb_rawdev;
+        }
+    }
+end:
+    if (nb_rawdev < cfg.nb_ports * cfg.ports[0].nb_queues)
+        rte_exit(EXIT_FAILURE,
+            "Not enough IOAT rawdevs (%u) for all queues (%u).\n",
+            nb_rawdev, cfg.nb_ports * cfg.ports[0].nb_queues);
+    RTE_LOG(INFO, IOAT, "Number of used rawdevs: %u.\n", nb_rawdev);
+}
+
+static void
+assign_rings(void)
+{
+    uint32_t i;
+
+    for (i = 0; i < cfg.nb_ports; i++) {
+        char ring_name[RTE_RING_NAMESIZE];
+
+        snprintf(ring_name, sizeof(ring_name), "rx_to_tx_ring_%u", i);
+        /* Create ring for inter core communication */
+        cfg.ports[i].rx_to_tx_ring = rte_ring_create(
+            ring_name, ring_size,
+            rte_socket_id(), RING_F_SP_ENQ | RING_F_SC_DEQ);
+
+        if (cfg.ports[i].rx_to_tx_ring == NULL)
+            rte_exit(EXIT_FAILURE, "Ring create failed: %s\n",
+                rte_strerror(rte_errno));
+    }
+}
+
+/*
+ * Initializes a given port using global settings and with the RX buffers
+ * coming from the mbuf_pool passed as a parameter.
+ */
+static inline void
+port_init(uint16_t portid, struct rte_mempool *mbuf_pool, uint16_t nb_queues)
+{
+    /* configuring port to use RSS for multiple RX queues */
+    static const struct rte_eth_conf port_conf = {
+        .rxmode = {
+            .mq_mode = ETH_MQ_RX_RSS,
+            .max_rx_pkt_len = RTE_ETHER_MAX_LEN
+        },
+        .rx_adv_conf = {
+            .rss_conf = {
+                .rss_key = NULL,
+                .rss_hf = ETH_RSS_PROTO_MASK,
+            }
+        }
+    };
+
+    struct rte_eth_rxconf rxq_conf;
+    struct rte_eth_txconf txq_conf;
+    struct rte_eth_conf local_port_conf = port_conf;
+    struct rte_eth_dev_info dev_info;
+    int ret, i;
+
+    /* Skip ports that are not enabled */
+    if ((ioat_enabled_port_mask & (1 << portid)) == 0) {
+        printf("Skipping disabled port %u\n", portid);
+        return;
+    }
+
+    /* Init port */
+    printf("Initializing port %u... ", portid);
+    fflush(stdout);
+    rte_eth_dev_info_get(portid, &dev_info);
+    local_port_conf.rx_adv_conf.rss_conf.rss_hf &=
+        dev_info.flow_type_rss_offloads;
+    if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE)
+        local_port_conf.txmode.offloads |=
+            DEV_TX_OFFLOAD_MBUF_FAST_FREE;
+    ret = rte_eth_dev_configure(portid, nb_queues, 1, &local_port_conf);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE, "Cannot configure device:"
+            " err=%d, port=%u\n", ret, portid);
+
+    ret = rte_eth_dev_adjust_nb_rx_tx_desc(portid, &nb_rxd,
+        &nb_txd);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE,
+            "Cannot adjust number of descriptors: err=%d, port=%u\n",
+            ret, portid);
+
+    rte_eth_macaddr_get(portid, &ioat_ports_eth_addr[portid]);
+
+    /* Init RX queues */
+    rxq_conf = dev_info.default_rxconf;
+    rxq_conf.offloads = local_port_conf.rxmode.offloads;
+    for (i = 0; i < nb_queues; i++) {
+        ret = rte_eth_rx_queue_setup(portid, i, nb_rxd,
+            rte_eth_dev_socket_id(portid), &rxq_conf,
+            mbuf_pool);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE,
+                "rte_eth_rx_queue_setup:err=%d,port=%u, queue_id=%u\n",
+                ret, portid, i);
+    }
+
+    /* Init one TX queue on each port */
+    txq_conf = dev_info.default_txconf;
+    txq_conf.offloads = local_port_conf.txmode.offloads;
+    ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+        rte_eth_dev_socket_id(portid),
+        &txq_conf);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE,
+            "rte_eth_tx_queue_setup:err=%d,port=%u\n",
+            ret, portid);
+
+    /* Initialize TX buffers */
+    tx_buffer[portid] = rte_zmalloc_socket("tx_buffer",
+        RTE_ETH_TX_BUFFER_SIZE(MAX_PKT_BURST), 0,
+        rte_eth_dev_socket_id(portid));
+    if (tx_buffer[portid] == NULL)
+        rte_exit(EXIT_FAILURE,
+            "Cannot allocate buffer for tx on port %u\n",
+            portid);
+
+    rte_eth_tx_buffer_init(tx_buffer[portid], MAX_PKT_BURST);
+
+    ret = rte_eth_tx_buffer_set_err_callback(tx_buffer[portid],
+        rte_eth_tx_buffer_count_callback,
+        &port_statistics.tx_dropped[portid]);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE,
+            "Cannot set error callback for tx buffer on port %u\n",
+            portid);
+
+    /* Start device */
+    ret = rte_eth_dev_start(portid);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE,
+            "rte_eth_dev_start:err=%d, port=%u\n",
+            ret, portid);
+
+    rte_eth_promiscuous_enable(portid);
+
+    printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
+        portid,
+        ioat_ports_eth_addr[portid].addr_bytes[0],
+        ioat_ports_eth_addr[portid].addr_bytes[1],
+        ioat_ports_eth_addr[portid].addr_bytes[2],
+        ioat_ports_eth_addr[portid].addr_bytes[3],
+        ioat_ports_eth_addr[portid].addr_bytes[4],
+        ioat_ports_eth_addr[portid].addr_bytes[5]);
+
+    cfg.ports[cfg.nb_ports].rxtx_port = portid;
+    cfg.ports[cfg.nb_ports++].nb_queues = nb_queues;
+}
+
+static void
+signal_handler(int signum)
+{
+    if (signum == SIGINT || signum == SIGTERM) {
+        printf("\n\nSignal %d received, preparing to exit...\n",
+            signum);
+        force_quit = true;
+    }
+}
+
+int
+main(int argc, char **argv)
+{
+    int ret;
+    uint16_t nb_ports, portid;
+    uint32_t i;
+    unsigned int nb_mbufs;
+
+    /* Init EAL */
+    ret = rte_eal_init(argc, argv);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+    argc -= ret;
+    argv += ret;
+
+    force_quit = false;
+    signal(SIGINT, signal_handler);
+    signal(SIGTERM, signal_handler);
+
+    nb_ports = rte_eth_dev_count_avail();
+    if (nb_ports == 0)
+        rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
+
+    /* Parse application arguments (after the EAL ones) */
+    ret = ioat_parse_args(argc, argv, nb_ports);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE, "Invalid IOAT arguments\n");
+
+    nb_mbufs = RTE_MAX(nb_ports * (nb_queues * (nb_rxd + nb_txd +
+        4 * MAX_PKT_BURST) + rte_lcore_count() * MEMPOOL_CACHE_SIZE),
+        MIN_POOL_SIZE);
+
+    /* Create the mbuf pool */
+    ioat_pktmbuf_pool = rte_pktmbuf_pool_create("mbuf_pool", nb_mbufs,
+        MEMPOOL_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
+        rte_socket_id());
+    if (ioat_pktmbuf_pool == NULL)
+        rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+    /* Initialise each port */
+    cfg.nb_ports = 0;
+    RTE_ETH_FOREACH_DEV(portid)
+        port_init(portid, ioat_pktmbuf_pool, nb_queues);
+
+    /* Initialize port statistics */
+    memset(&port_statistics, 0, sizeof(port_statistics));
+
+    while (!check_link_status(ioat_enabled_port_mask) && !force_quit)
+        sleep(1);
+
+    /* Check if there is enough lcores for all ports. */
+    cfg.nb_lcores = rte_lcore_count() - 1;
+    if (cfg.nb_lcores < 1)
+        rte_exit(EXIT_FAILURE,
+            "There should be at least one slave lcore.\n");
+
+    if (copy_mode == COPY_MODE_IOAT_NUM)
+        assign_rawdevs();
+    else /* copy_mode == COPY_MODE_SW_NUM */
+        assign_rings();
+
+    start_forwarding_cores();
+    /* master core prints stats while other cores forward */
+    print_stats(argv[0]);
+
+    /* force_quit is true when we get here */
+    rte_eal_mp_wait_lcore();
+
+    uint32_t j;
+
+    for (i = 0; i < cfg.nb_ports; i++) {
+        printf("Closing port %d\n", cfg.ports[i].rxtx_port);
+        rte_eth_dev_stop(cfg.ports[i].rxtx_port);
+        rte_eth_dev_close(cfg.ports[i].rxtx_port);
+        if (copy_mode == COPY_MODE_IOAT_NUM) {
+            for (j = 0; j < cfg.ports[i].nb_queues; j++) {
+                printf("Stopping rawdev %d\n",
+                    cfg.ports[i].ioat_ids[j]);
+                rte_rawdev_stop(cfg.ports[i].ioat_ids[j]);
+            }
+        } else /* copy_mode == COPY_MODE_SW_NUM */
+            rte_ring_free(cfg.ports[i].rx_to_tx_ring);
+    }
+
+    printf("Bye...\n");
+    return 0;
+}
diff --git a/examples/ioat/meson.build b/examples/ioat/meson.build
new file mode 100644
index 000000000..ff56dc99c
--- /dev/null
+++ b/examples/ioat/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2019 Intel Corporation
+
+# meson file, for building this example as part of a main DPDK build.
+#
+# To build this example as a standalone application with an already-installed
+# DPDK instance, use 'make'
+
+deps += ['pmd_ioat']
+
+sources = files(
+	'ioatfwd.c'
+)
diff --git a/examples/meson.build b/examples/meson.build
index a046b74ad..c2e18b59b 100644
--- a/examples/meson.build
+++ b/examples/meson.build
@@ -16,6 +16,7 @@ all_examples = [
 	'eventdev_pipeline', 'exception_path',
 	'fips_validation', 'flow_classify',
 	'flow_filtering', 'helloworld',
+	'ioat',
 	'ip_fragmentation', 'ip_pipeline',
 	'ip_reassembly', 'ipsec-secgw',
 	'ipv4_multicast', 'kni',
-- 
2.22.0.windows.1