From: Tomasz Kulasek
To: dev@dpdk.org
Cc: konstantin.ananyev@intel.com, olivier.matz@6wind.com
Date: Wed, 23 Nov 2016 18:36:19 +0100
Message-Id: <1479922585-8640-1-git-send-email-tomaszx.kulasek@intel.com>
In-Reply-To: <1477486575-25148-1-git-send-email-tomaszx.kulasek@intel.com>
References: <1477486575-25148-1-git-send-email-tomaszx.kulasek@intel.com>
Subject: [dpdk-dev] [PATCH v12 0/6] add Tx preparation

As discussed in that thread:

http://dpdk.org/ml/archives/dev/2015-September/023603.html

Different NIC models, depending on the HW offload requested, might
impose different requirements on packets to be transmitted, in terms of:

 - Max number of fragments per packet allowed
 - Max number of fragments per TSO segment
 - The way the pseudo-header checksum should be pre-calculated
 - L3/L4 header fields filling
 - etc.


MOTIVATION:
-----------

1) Some work cannot (and should not) be done in rte_eth_tx_burst.
   However, this work is sometimes required, and currently it is left
   to the application.

2) Different hardware may have different requirements for TX offloads,
   may support a different subset of them, and so on.

3) Some parameters (e.g. number of segments in the ixgbe driver) may
   hang the device. These parameters may vary between devices.

   For example, i40e HW allows 8 fragments per packet, but that is
   after TSO segmentation, while ixgbe has a 38-fragment pre-TSO limit.

4) Fields in the packet may require different initialization (e.g.
   pseudo-header checksum precalculation, sometimes done in a different
   way depending on packet type, and so on). Currently the application
   needs to take care of it.

5) Using an additional API (rte_eth_tx_prepare) before rte_eth_tx_burst
   allows a packet burst to be prepared in a form acceptable to the
   specific device.

6) Some additional checks may be done in debug mode, keeping the
   tx_burst implementation clean.


PROPOSAL:
---------

To help the user deal with all this variety, we propose to:

1) Introduce the rte_eth_tx_prepare() function to make the necessary
   preparations for a packet burst to be safely transmitted on the
   device for the desired HW offloads (set/reset checksum field
   according to the hardware requirements) and to check HW constraints
   (number of segments per packet, etc).

   Since the limitations and requirements may differ between devices,
   this requires extending the rte_eth_dev structure with a new
   function pointer, "tx_pkt_prepare", which can be implemented in the
   driver to prepare and verify packets in a device-specific way before
   the burst. This should prevent the application from sending
   malformed packets.

2) Also, new fields will be introduced in rte_eth_desc_lim:
   nb_seg_max and nb_mtu_seg_max, providing information about the
   maximum number of segments in TSO and non-TSO packets acceptable to
   the device.

   This information lets the application avoid creating packets that
   the device cannot accept.


APPLICATION (USE CASE):
-----------------------

1) The application should initialize the burst of packets to send, and
   set the required Tx offload flags and fields, like l2_len, l3_len,
   l4_len, and tso_segsz.

2) The application passes the burst to rte_eth_tx_prepare to check the
   conditions required to send packets through the NIC.

3) The result of rte_eth_tx_prepare can be used to send the valid
   packets and/or to restore the invalid ones if the function fails.

e.g.

	for (i = 0; i < nb_pkts; i++) {

		/* initialize or process packet */

		bufs[i]->tso_segsz = 800;
		bufs[i]->ol_flags = PKT_TX_TCP_SEG | PKT_TX_IPV4
				| PKT_TX_IP_CKSUM;
		bufs[i]->l2_len = sizeof(struct ether_hdr);
		bufs[i]->l3_len = sizeof(struct ipv4_hdr);
		bufs[i]->l4_len = sizeof(struct tcp_hdr);
	}

	/* Prepare burst of TX packets */
	nb_prep = rte_eth_tx_prepare(port, 0, bufs, nb_pkts);

	if (nb_prep < nb_pkts) {
		printf("Tx prepare failed\n");

		/* nb_prep is the index of the first invalid packet here.
		 * rte_eth_tx_prepare can be called on the remaining
		 * packets to find further invalid ones.
		 */

	}

	/* Send burst of TX packets */
	nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_prep);

	/* Free any unsent packets. */


v12 changes:
 - renamed API function from "rte_eth_tx_prep" to "rte_eth_tx_prepare"
   (to avoid confusion with "prepend")
 - changed "rte_phdr_cksum_fix" to "rte_net_intel_cksum_prepare"
 - added "csum txprep (on|off)" command to the csum engine, allowing
   selection of the txprep path for packet processing

v11 changes:
 - updated comments
 - added information to the API description about packet data
   requirements/limitations.

v10 changes:
 - moved the driver's Tx callback check in rte_eth_tx_prep after the
   queue_id check

v9 changes:
 - fixed headers structure fragmentation check
 - moved fragmentation check into rte_validate_tx_offload()

v8 changes:
 - mbuf argument in rte_validate_tx_offload declared as const

v7 changes:
 - comments reworded/added
 - changed errno values returned from Tx prep API
 - added check in rte_phdr_cksum_fix if headers are in the first
   data segment and can be safely modified
 - moved rte_validate_tx_offload to rte_mbuf
 - moved rte_phdr_cksum_fix to rte_net.h
 - removed the new rte_pkt.h file as useless

v6 changes:
 - added performance impact test results to the patch description

v5 changes:
 - rebased csum engine modification
 - added information to the csum engine about performance tests
 - some performance improvements

v4 changes:
 - tx_prep is now set to default behavior (NULL) for the simple/vector
   path in the fm10k, i40e and ixgbe drivers to increase performance,
   when Tx offloads are not intentionally available

v3 changes:
 - reworked the csum testpmd engine instead of adding a new one,
 - fixed the checksum initialization procedure to also include outer
   checksum offloads,
 - some minor formatting changes and optimizations

v2 changes:
 - rte_eth_tx_prep() returns the number of packets when the device
   doesn't support tx_prep functionality,
 - introduced CONFIG_RTE_ETHDEV_TX_PREP allowing tx_prep to be turned
   off


Tomasz Kulasek (6):
  ethdev: add Tx preparation
  e1000: add Tx preparation
  fm10k: add Tx preparation
  i40e: add Tx preparation
  ixgbe: add Tx preparation
  testpmd: use Tx preparation in csum engine

 app/test-pmd/cmdline.c           |  49 ++++++++++++++++++
 app/test-pmd/csumonly.c          |  33 +++++++++---
 app/test-pmd/testpmd.c           |   5 ++
 app/test-pmd/testpmd.h           |   2 +
 config/common_base               |   1 +
 drivers/net/e1000/e1000_ethdev.h |  11 ++++
 drivers/net/e1000/em_ethdev.c    |   5 +-
 drivers/net/e1000/em_rxtx.c      |  48 ++++++++++++++++-
 drivers/net/e1000/igb_ethdev.c   |   4 ++
 drivers/net/e1000/igb_rxtx.c     |  52 ++++++++++++++++++-
 drivers/net/fm10k/fm10k.h        |   6 +++
 drivers/net/fm10k/fm10k_ethdev.c |   5 ++
 drivers/net/fm10k/fm10k_rxtx.c   |  50 +++++++++++++++++-
 drivers/net/i40e/i40e_ethdev.c   |   3 ++
 drivers/net/i40e/i40e_rxtx.c     |  72 +++++++++++++++++++++++++-
 drivers/net/i40e/i40e_rxtx.h     |   8 +++
 drivers/net/ixgbe/ixgbe_ethdev.c |   3 ++
 drivers/net/ixgbe/ixgbe_ethdev.h |   5 +-
 drivers/net/ixgbe/ixgbe_rxtx.c   |  56 ++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_rxtx.h   |   2 +
 lib/librte_ether/rte_ethdev.h    | 106 ++++++++++++++++++++++++++++++++++++++
 lib/librte_mbuf/rte_mbuf.h       |  64 +++++++++++++++++++++++
 lib/librte_net/rte_net.h         |  85 ++++++++++++++++++++++++++++++
 23 files changed, 662 insertions(+), 13 deletions(-)

-- 
1.7.9.5