From: "Ananyev, Konstantin"
To: "Kulasek, TomaszX", "dev@dpdk.org"
CC: "olivier.matz@6wind.com"
Date: Mon, 24 Oct 2016 17:26:23 +0000
Subject: Re: [dpdk-dev] [PATCH v10 0/6] add Tx preparation
Message-ID: <2601191342CEEE43887BDE71AB9772583F0CC4C6@irsmsx105.ger.corp.intel.com>
References: <1477317933-14144-1-git-send-email-tomaszx.kulasek@intel.com>
 <1477327917-18564-1-git-send-email-tomaszx.kulasek@intel.com>
In-Reply-To: <1477327917-18564-1-git-send-email-tomaszx.kulasek@intel.com>
List-Id: patches and discussions about DPDK

>
> As discussed in that thread:
>
> http://dpdk.org/ml/archives/dev/2015-September/023603.html
>
> Different NIC models, depending on the HW offload requested, might impose
> different requirements on packets to be
> TX-ed in terms of:
>
>  - Max number of fragments per packet allowed
>  - Max number of fragments per TSO segment
>  - The way the pseudo-header checksum should be pre-calculated
>  - L3/L4 header fields filling
>  - etc.
>
>
> MOTIVATION:
> -----------
>
> 1) Some work cannot (and should not) be done in rte_eth_tx_burst.
>    However, this work is sometimes required, and today it is left to
>    the application.
>
> 2) Different hardware may have different requirements for TX offloads,
>    a different subset of offloads may be supported, and so on.
>
> 3) Some parameters (e.g. number of segments in the ixgbe driver) may hang
>    the device. These parameters may vary between devices.
>
>    For example, i40e HW allows 8 fragments per packet, but that is after
>    TSO segmentation, while ixgbe has a 38-fragment pre-TSO limit.
>
> 4) Fields in a packet may require different initialization (e.g.
>    pseudo-header checksum precalculation, sometimes done in a
>    different way depending on packet type, and so on). Today the
>    application needs to take care of it.
>
> 5) Using an additional API (rte_eth_tx_prep) before rte_eth_tx_burst lets
>    the application prepare a packet burst in a form acceptable to the
>    specific device.
>
> 6) Some additional checks may be done in debug mode, keeping the tx_burst
>    implementation clean.
>
>
> PROPOSAL:
> ---------
>
> To help the user deal with all these varieties we propose to:
>
> 1) Introduce the rte_eth_tx_prep() function to do the necessary
>    preparation of a packet burst so that it can be safely transmitted on
>    the device for the desired HW offloads (set/reset checksum field
>    according to the hardware requirements) and to check HW constraints
>    (number of segments per packet, etc).
>
>    Since the limitations and requirements may differ between devices, this
>    requires extending the rte_eth_dev structure with a new function
>    pointer, "tx_pkt_prep", which can be implemented in the driver to
>    prepare and verify packets in a device-specific way before the burst.
>    This should prevent the application from sending malformed packets.
>
> 2) Also, new fields will be introduced in rte_eth_desc_lim:
>    nb_seg_max and nb_mtu_seg_max, providing information about the max
>    number of segments in TSO and non-TSO packets acceptable by the device.
>
>    This information helps the application avoid creating packets that
>    exceed these limits.
>
>
> APPLICATION (CASE OF USE):
> --------------------------
>
> 1) The application should initialize the burst of packets to send, and
>    set the required tx offload flags and fields, like l2_len, l3_len,
>    l4_len, and tso_segsz.
>
> 2) The application passes the burst to rte_eth_tx_prep to check the
>    conditions required to send packets through the NIC.
>
> 3) The result of rte_eth_tx_prep can be used to send valid packets
>    and/or restore invalid ones if the function fails.
>
> e.g.
>
>     for (i = 0; i < nb_pkts; i++) {
>
>         /* initialize or process packet */
>
>         bufs[i]->tso_segsz = 800;
>         bufs[i]->ol_flags = PKT_TX_TCP_SEG | PKT_TX_IPV4
>                 | PKT_TX_IP_CKSUM;
>         bufs[i]->l2_len = sizeof(struct ether_hdr);
>         bufs[i]->l3_len = sizeof(struct ipv4_hdr);
>         bufs[i]->l4_len = sizeof(struct tcp_hdr);
>     }
>
>     /* Prepare burst of TX packets */
>     nb_prep = rte_eth_tx_prep(port, 0, bufs, nb_pkts);
>
>     if (nb_prep < nb_pkts) {
>         printf("tx_prep failed\n");
>
>         /* nb_prep is the index of the first invalid packet.
>          * rte_eth_tx_prep can be called on the remaining packets
>          * to find further invalid ones.
>          */
>
>     }
>
>     /* Send burst of TX packets */
>     nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_prep);
>
>     /* Free any unsent packets.
>      */
>
> v10 changes:
>  - moved driver's tx callback check in rte_eth_tx_prep after queue_id check
>
> v9 changes:
>  - fixed headers structure fragmentation check
>  - moved fragmentation check into rte_validate_tx_offload()
>
> v8 changes:
>  - mbuf argument in rte_validate_tx_offload declared as const
>
> v7 changes:
>  - comments reworded/added
>  - changed errno values returned from Tx prep API
>  - added check in rte_phdr_cksum_fix if headers are in the first
>    data segment and can be safely modified
>  - moved rte_validate_tx_offload to rte_mbuf
>  - moved rte_phdr_cksum_fix to rte_net.h
>  - removed the new rte_pkt.h file as unnecessary
>
> v6 changes:
>  - added performance impact test results to the patch description
>
> v5 changes:
>  - rebased csum engine modification
>  - added information to the csum engine about performance tests
>  - some performance improvements
>
> v4 changes:
>  - tx_prep is now set to default behavior (NULL) for simple/vector path
>    in fm10k, i40e and ixgbe drivers to increase performance, when
>    Tx offloads are intentionally not available
>
> v3 changes:
>  - reworked csum testpmd engine instead of adding a new one,
>  - fixed checksum initialization procedure to also include outer
>    checksum offloads,
>  - some minor formatting changes and optimizations
>
> v2 changes:
>  - rte_eth_tx_prep() returns number of packets when device doesn't
>    support tx_prep functionality,
>  - introduced CONFIG_RTE_ETHDEV_TX_PREP allowing to turn off tx_prep
>
> Tomasz Kulasek (6):
>   ethdev: add Tx preparation
>   e1000: add Tx preparation
>   fm10k: add Tx preparation
>   i40e: add Tx preparation
>   ixgbe: add Tx preparation
>   testpmd: use Tx preparation in csum engine
>
>  app/test-pmd/csumonly.c          |   36 ++++++--------
>  config/common_base               |    1 +
>  drivers/net/e1000/e1000_ethdev.h |   11 +++++
>  drivers/net/e1000/em_ethdev.c    |    5 +-
>  drivers/net/e1000/em_rxtx.c      |   48 ++++++++++++++++++-
>  drivers/net/e1000/igb_ethdev.c   |    4 ++
>  drivers/net/e1000/igb_rxtx.c     |   52 ++++++++++++++++++++-
>  drivers/net/fm10k/fm10k.h        |    6 +++
>  drivers/net/fm10k/fm10k_ethdev.c |    5 ++
>  drivers/net/fm10k/fm10k_rxtx.c   |   50 +++++++++++++++++++-
>  drivers/net/i40e/i40e_ethdev.c   |    3 ++
>  drivers/net/i40e/i40e_rxtx.c     |   72 +++++++++++++++++++++++++++-
>  drivers/net/i40e/i40e_rxtx.h     |    8 ++++
>  drivers/net/ixgbe/ixgbe_ethdev.c |    3 ++
>  drivers/net/ixgbe/ixgbe_ethdev.h |    5 +-
>  drivers/net/ixgbe/ixgbe_rxtx.c   |   58 ++++++++++++++++++++++-
>  drivers/net/ixgbe/ixgbe_rxtx.h   |    2 +
>  lib/librte_ether/rte_ethdev.h    |   96 ++++++++++++++++++++++++++++++++++++++
>  lib/librte_mbuf/rte_mbuf.h       |   64 +++++++++++++++++++++++++
>  lib/librte_net/rte_net.h         |   85 +++++++++++++++++++++++++++++++++
>  20 files changed, 584 insertions(+), 30 deletions(-)
>
> --

Acked-by: Konstantin Ananyev

> 1.7.9.5
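
For readers of the archive: the comment in the quoted example notes that the
prep call can be repeated on the remaining packets to find further invalid
ones. A minimal, self-contained sketch of that loop follows. It does not use
DPDK itself: struct mock_pkt, mock_tx_prep(), mock_tx_burst(),
send_skipping_invalid() and MOCK_NB_SEG_MAX are hypothetical stand-ins
invented for illustration, and the only check modelled is a per-packet
segment limit of the kind the proposed nb_mtu_seg_max would describe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf: only the field this sketch checks. */
struct mock_pkt {
    uint16_t nb_segs; /* number of segments in the packet */
};

/* Hypothetical per-packet segment limit (assumption, not a real HW value). */
#define MOCK_NB_SEG_MAX 8

/* Mock prep step: returns how many leading packets passed the check, so the
 * return value is also the index of the first invalid packet, if any. */
static uint16_t
mock_tx_prep(struct mock_pkt **pkts, uint16_t nb_pkts)
{
    uint16_t i;

    for (i = 0; i < nb_pkts; i++)
        if (pkts[i]->nb_segs == 0 || pkts[i]->nb_segs > MOCK_NB_SEG_MAX)
            break;
    return i;
}

/* Mock transmit: accepts everything. A real burst call may accept fewer
 * packets, and a real application would have to handle that as well. */
static uint16_t
mock_tx_burst(struct mock_pkt **pkts, uint16_t nb_pkts)
{
    (void)pkts;
    return nb_pkts;
}

/* Prepare and send a burst, skipping every packet that prep rejects.
 * Returns the number of packets sent; *nb_dropped gets the number skipped. */
static uint16_t
send_skipping_invalid(struct mock_pkt **pkts, uint16_t nb_pkts,
                      uint16_t *nb_dropped)
{
    uint16_t sent = 0;
    uint16_t done = 0;

    assert(pkts != NULL && nb_dropped != NULL);
    *nb_dropped = 0;

    while (done < nb_pkts) {
        uint16_t nb_prep = mock_tx_prep(&pkts[done],
                                        (uint16_t)(nb_pkts - done));

        sent = (uint16_t)(sent + mock_tx_burst(&pkts[done], nb_prep));
        done = (uint16_t)(done + nb_prep);

        if (done < nb_pkts) {
            /* pkts[done] is the first invalid packet: skip it and
             * run prep again on whatever follows. */
            (*nb_dropped)++;
            done++;
        }
    }
    return sent;
}
```

Whether an application should drop, fix up, or report rejected packets is a
policy decision outside the scope of the patch set; skipping is used here
only to keep the sketch short.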