From: "Ananyev, Konstantin"
To: "Kulasek, TomaszX", "dev@dpdk.org"
Date: Fri, 30 Sep 2016 09:55:34 +0000
Subject: Re: [dpdk-dev] [PATCH v4 0/6] add Tx preparation
Message-ID: <2601191342CEEE43887BDE71AB9772583F0BCD64@irsmsx105.ger.corp.intel.com>
References: <20160928111052.9968-1-tomaszx.kulasek@intel.com> <20160930090039.10164-1-tomaszx.kulasek@intel.com>
In-Reply-To: <20160930090039.10164-1-tomaszx.kulasek@intel.com>

>
> As discussed in that thread:
>
> http://dpdk.org/ml/archives/dev/2015-September/023603.html
>
> Different NIC models, depending on the HW offload requested, might impose
> different requirements on packets to be TX-ed in terms of:
>
>  - Max number of fragments per packet allowed
>  - Max number of fragments per TSO segment
>  - The way the pseudo-header checksum should be pre-calculated
>  - L3/L4 header fields filling
>  - etc.
>
>
> MOTIVATION:
> -----------
>
> 1) Some work cannot (and should not) be done in rte_eth_tx_burst.
>    However, this work is sometimes required, and today it is left to
>    the application.
>
> 2) Different hardware may have different requirements for TX offloads,
>    may support a different subset of them, and so on.
>
> 3) Some parameters (e.g. number of segments in ixgbe driver) may hang
>    the device. These parameters may vary between devices.
>
>    For example, i40e HW allows 8 fragments per packet, but that is after
>    TSO segmentation, while ixgbe has a 38-fragment pre-TSO limit.
>
> 4) Fields in a packet may require different initialization (e.g.
>    pseudo-header checksum precalculation, sometimes done in a
>    different way depending on packet type, and so on). Today the
>    application has to take care of it.
>
> 5) Using an additional API (rte_eth_tx_prep) before rte_eth_tx_burst
>    allows a packet burst to be prepared in a form acceptable to the
>    specific device.
>
> 6) Some additional checks may be done in debug mode, keeping the
>    tx_burst implementation clean.
>
>
> PROPOSAL:
> ---------
>
> To help users deal with all these variations we propose to:
>
> 1) Introduce the rte_eth_tx_prep() function to do the necessary
>    preparation of a packet burst so that it can be safely transmitted
>    on the device with the desired HW offloads (set/reset checksum field
>    according to the hardware requirements) and to check HW constraints
>    (number of segments per packet, etc).
>
>    Since the limitations and requirements may differ between devices,
>    this requires extending the rte_eth_dev structure with a new function
>    pointer "tx_pkt_prep", which can be implemented in the driver to
>    prepare and verify packets in a device-specific way before the burst.
>    This should prevent the application from sending malformed packets.
>
> 2) Also, new fields will be introduced in rte_eth_desc_lim:
>    nb_seg_max and nb_mtu_seg_max, providing information about the
>    maximum number of segments in TSO and non-TSO packets acceptable
>    to the device.
>
>    This information helps the application avoid creating malformed
>    packets.
>
>
> APPLICATION (CASE OF USE):
> --------------------------
>
> 1) The application should initialize the burst of packets to send and
>    set the required tx offload flags and fields, like l2_len, l3_len,
>    l4_len, and tso_segsz.
>
> 2) The application passes the burst to rte_eth_tx_prep to check the
>    conditions required to send packets through the NIC.
>
> 3) The result of rte_eth_tx_prep can be used to send the valid packets
>    and/or restore the invalid ones if the function fails.
>
> e.g.
>
>     for (i = 0; i < nb_pkts; i++) {
>
>         /* initialize or process packet */
>
>         bufs[i]->tso_segsz = 800;
>         bufs[i]->ol_flags = PKT_TX_TCP_SEG | PKT_TX_IPV4
>                 | PKT_TX_IP_CKSUM;
>         bufs[i]->l2_len = sizeof(struct ether_hdr);
>         bufs[i]->l3_len = sizeof(struct ipv4_hdr);
>         bufs[i]->l4_len = sizeof(struct tcp_hdr);
>     }
>
>     /* Prepare burst of TX packets */
>     nb_prep = rte_eth_tx_prep(port, 0, bufs, nb_pkts);
>
>     if (nb_prep < nb_pkts) {
>         printf("tx_prep failed\n");
>
>         /* nb_prep indicates here the first invalid packet.
>          * rte_eth_tx_prep can be used on the remaining packets
>          * to find other invalid ones.
>          */
>
>     }
>
>     /* Send burst of TX packets */
>     nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_prep);
>
>     /* Free any unsent packets. */
>
> v4 changes:
>  - tx_prep is now set to default behavior (NULL) for simple/vector path
>    in fm10k, i40e and ixgbe drivers to increase performance, when
>    Tx offloads are intentionally unavailable
>
> v3 changes:
>  - reworked csum testpmd engine instead of adding a new one,
>  - fixed checksum initialization procedure to also include outer
>    checksum offloads,
>  - some minor formatting and optimizations
>
> v2 changes:
>  - rte_eth_tx_prep() returns the number of packets when the device
>    doesn't support tx_prep functionality,
>  - introduced CONFIG_RTE_ETHDEV_TX_PREP allowing to turn off tx_prep
>
>
> Tomasz Kulasek (6):
>   ethdev: add Tx preparation
>   e1000: add Tx preparation
>   fm10k: add Tx preparation
>   i40e: add Tx preparation
>   ixgbe: add Tx preparation
>   testpmd: use Tx preparation in csum engine
>
>  app/test-pmd/csumonly.c          |  97 +++++++++++++++------------
>  config/common_base               |   1 +
>  drivers/net/e1000/e1000_ethdev.h |  11 ++++
>  drivers/net/e1000/em_ethdev.c    |   5 +-
>  drivers/net/e1000/em_rxtx.c      |  48 +++++++++++++-
>  drivers/net/e1000/igb_ethdev.c   |   4 ++
>  drivers/net/e1000/igb_rxtx.c     |  52 ++++++++++++++-
>  drivers/net/fm10k/fm10k.h        |   6 ++
>  drivers/net/fm10k/fm10k_ethdev.c |   5 ++
>  drivers/net/fm10k/fm10k_rxtx.c   |  50 +++++++++++++-
>  drivers/net/i40e/i40e_ethdev.c   |   3 +
>  drivers/net/i40e/i40e_rxtx.c     |  72 ++++++++++++++++++++-
>  drivers/net/i40e/i40e_rxtx.h     |   8 +++
>  drivers/net/ixgbe/ixgbe_ethdev.c |   3 +
>  drivers/net/ixgbe/ixgbe_ethdev.h |   5 +-
>  drivers/net/ixgbe/ixgbe_rxtx.c   |  56 +++++++++++++++-
>  drivers/net/ixgbe/ixgbe_rxtx.h   |   2 +
>  lib/librte_ether/rte_ethdev.h    |  85 ++++++++++++++++++++++++
>  lib/librte_mbuf/rte_mbuf.h       |   8 +++
>  lib/librte_net/Makefile          |   2 +-
>  lib/librte_net/rte_pkt.h         | 133 ++++++++++++++++++++++++++++++++++++++
>  21 files changed, 605 insertions(+), 51 deletions(-)
>  create mode 100644 lib/librte_net/rte_pkt.h
>
> --

Acked-by: Konstantin Ananyev

> 1.7.9.5
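
The comment in the quoted example says that rte_eth_tx_prep can be called
again on the remaining packets to find further invalid ones. A minimal
sketch of that retry loop follows, in plain C without DPDK. Everything in
it is a hypothetical stand-in and not part of the patch set: mock_pkt
replaces the mbuf, MOCK_NB_SEG_MAX plays the role of nb_seg_max,
mock_tx_prep imitates only the return-value contract described in the
cover letter (index of the first invalid packet), and mock_tx_burst
assumes the device accepts everything it is offered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf: only the segment count matters here. */
struct mock_pkt {
    uint16_t nb_segs;
};

/* Assumed device limit, playing the role of nb_seg_max (not a real value). */
#define MOCK_NB_SEG_MAX 8

/*
 * Mock of the tx_prep contract from the cover letter: returns the number
 * of leading packets that passed the check, i.e. the index of the first
 * invalid packet, or nb_pkts if all of them are valid.
 */
static uint16_t
mock_tx_prep(struct mock_pkt **pkts, uint16_t nb_pkts)
{
    uint16_t i;

    for (i = 0; i < nb_pkts; i++)
        if (pkts[i]->nb_segs == 0 || pkts[i]->nb_segs > MOCK_NB_SEG_MAX)
            break;
    return i;
}

/* Mock of tx_burst: assumes the device accepts every packet offered. */
static uint16_t
mock_tx_burst(struct mock_pkt **pkts, uint16_t nb_pkts)
{
    (void)pkts;
    return nb_pkts;
}

/*
 * Send every valid packet of the burst and step over the invalid ones.
 * Returns the number of packets sent; *nb_bad receives the number of
 * packets rejected by the prep step.
 */
static uint16_t
prep_and_send(struct mock_pkt **pkts, uint16_t nb_pkts, uint16_t *nb_bad)
{
    uint16_t done = 0, sent = 0;

    assert(nb_bad != NULL);
    *nb_bad = 0;
    while (done < nb_pkts) {
        uint16_t nb_prep = mock_tx_prep(pkts + done,
                                        (uint16_t)(nb_pkts - done));

        sent += mock_tx_burst(pkts + done, nb_prep);
        done += nb_prep;
        if (done < nb_pkts) {
            /* pkts[done] failed the check: count it and step past it. */
            (*nb_bad)++;
            done++;
        }
    }
    return sent;
}
```

With a burst of five packets whose segment counts are 1, 9, 3, 0 and 8,
prep_and_send() sends three packets and reports two as rejected. A real
application would also have to free or repair the rejected packets and
handle rte_eth_tx_burst accepting fewer packets than offered, which this
sketch leaves out.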