From: Avi Kivity
Organization: ScyllaDB
To: Tomasz Kulasek, dev@dpdk.org
Cc: Vladislav Zolotarov, Takuya ASADA
Date: Thu, 28 Jul 2016 15:04:31 +0300
Subject: Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure
Message-ID: <83855193-c7ea-55ad-5a02-7f26a8984878@scylladb.com>
In-Reply-To: <1469114659-66063-1-git-send-email-tomaszx.kulasek@intel.com>
References: <1469024691-58750-1-git-send-email-tomaszx.kulasek@intel.com>
 <1469114659-66063-1-git-send-email-tomaszx.kulasek@intel.com>

On 07/21/2016 06:24 PM, Tomasz Kulasek wrote:
> This is an ABI deprecation notice for DPDK 16.11 in librte_ether about
> changes in the rte_eth_dev and rte_eth_desc_lim structures.
>
> As discussed in this thread:
>
> http://dpdk.org/ml/archives/dev/2015-September/023603.html
>
> Different NIC models, depending on the HW offload requested, might impose
> different requirements on packets to be TX-ed in terms of:
>
>  - Max number of fragments per packet allowed
>  - Max number of fragments per TSO segment
>  - The way the pseudo-header checksum should be pre-calculated
>  - L3/L4 header fields filling
>  - etc.
>
>
> MOTIVATION:
> -----------
>
> 1) Some work cannot (and should not) be done in rte_eth_tx_burst.
>    However, this work is sometimes required, and for now it is left
>    to the application.
>
> 2) Different hardware may have different requirements for TX offloads,
>    a different subset may be supported, and so on.
>
> 3) Some parameters (e.g. number of segments in the ixgbe driver) may hang
>    the device.
>    These parameters may vary for different devices.
>
>    For example, i40e HW allows 8 fragments per packet, but that is after
>    TSO segmentation, while ixgbe has a 38-fragment pre-TSO limit.
>
> 4) Fields in the packet may require different initialization (e.g.
>    pseudo-header checksum precalculation, sometimes done in a
>    different way depending on packet type, and so on). For now the
>    application has to take care of it.
>
> 5) Using an additional API (rte_eth_tx_prep) before rte_eth_tx_burst
>    allows the packet burst to be prepared in a form acceptable to the
>    specific device.
>
> 6) Some additional checks may be done in debug mode, keeping the
>    tx_burst implementation clean.

Thanks a lot for this.  Seastar suffered from this issue and had to
apply NIC-specific workarounds.

The proposal will work well for Seastar.

>
> PROPOSAL:
> ---------
>
> To help users deal with all these variations we propose to:
>
> 1. Introduce the rte_eth_tx_prep() function to do the necessary
>    preparation of a packet burst so that it can be safely transmitted
>    on the device for the desired HW offloads (set/reset checksum field
>    according to the hardware requirements) and to check HW constraints
>    (number of segments per packet, etc).
>
>    Because the limitations and requirements may differ between devices,
>    this requires extending the rte_eth_dev structure with a new function
>    pointer "tx_pkt_prep", which can be implemented in the driver to
>    prepare and verify packets, in a device-specific way, before the
>    burst. This should prevent the application from sending malformed
>    packets.
>
> 2. Also, new fields will be introduced in rte_eth_desc_lim:
>    nb_seg_max and nb_mtu_seg_max, providing information about the
>    maximum number of segments in TSO and non-TSO packets accepted by
>    the device.
>
>    This information helps the application avoid creating malformed
>    packets, or limit them.
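
To make the two proposed limits concrete, here is a minimal sketch of how
an application might check a packet's segment count against them before
building a burst. Everything in it is an assumption for illustration: the
struct and function names are hypothetical stand-ins rather than DPDK API,
and reading nb_seg_max as the TSO limit and nb_mtu_seg_max as the non-TSO
limit is only my interpretation of the notice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields proposed for rte_eth_desc_lim. */
struct desc_lim_sketch {
    uint16_t nb_seg_max;     /* assumed: max segments in a TSO packet */
    uint16_t nb_mtu_seg_max; /* assumed: max segments in a non-TSO packet */
};

/*
 * Returns true when a packet made of nb_segs segments fits the limit that
 * applies to it (TSO or non-TSO).
 */
static bool segs_within_limit(const struct desc_lim_sketch *lim,
                              uint16_t nb_segs, bool is_tso)
{
    assert(lim != NULL);

    uint16_t max = is_tso ? lim->nb_seg_max : lim->nb_mtu_seg_max;

    return nb_segs <= max;
}
```

An application could run a check like this while assembling packets and
linearize or split any packet that fails it, instead of finding out only
when the device misbehaves.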
>
>
> APPLICATION (USE CASE):
> -----------------------
>
> 1) The application should initialize the burst of packets to send and
>    set the required TX offload flags and required fields, like l2_len,
>    l3_len, l4_len, and tso_segsz.
>
> 2) The application passes the burst to rte_eth_tx_prep to check the
>    conditions required to send the packets through the NIC.
>
> 3) The result of rte_eth_tx_prep can be used to send the valid packets
>    and/or restore the invalid ones if the function fails.
>
> e.g.
>
> 	for (i = 0; i < nb_pkts; i++) {
>
> 		/* initialize or process packet */
>
> 		bufs[i]->tso_segsz = 800;
> 		bufs[i]->ol_flags = PKT_TX_TCP_SEG | PKT_TX_IPV4
> 				| PKT_TX_IP_CKSUM;
> 		bufs[i]->l2_len = sizeof(struct ether_hdr);
> 		bufs[i]->l3_len = sizeof(struct ipv4_hdr);
> 		bufs[i]->l4_len = sizeof(struct tcp_hdr);
> 	}
>
> 	/* Prepare burst of TX packets */
> 	nb_prep = rte_eth_tx_prep(port, 0, bufs, nb_pkts);
>
> 	if (nb_prep < nb_pkts) {
> 		printf("tx_prep failed\n");
>
> 		/* drop or restore invalid packets */
>
> 	}
>
> 	/* Send burst of TX packets */
> 	nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_prep);
>
> 	/* Free any unsent packets. */
>
>
>
> Signed-off-by: Tomasz Kulasek
> ---
>  doc/guides/rel_notes/deprecation.rst | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index f502f86..485aacb 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -41,3 +41,10 @@ Deprecation Notices
>  * The mempool functions for single/multi producer/consumer are deprecated and
>    will be removed in 16.11.
>    It is replaced by rte_mempool_generic_get/put functions.
> +
> +* In 16.11 ABI changes are planned: the ``rte_eth_dev`` structure will be
> +  extended with new function pointer ``tx_pkt_prep`` allowing verification
> +  and processing of packet burst to meet HW specific requirements before
> +  transmit. Also new fields will be added to the ``rte_eth_desc_lim`` structure:
> +  ``nb_seg_max`` and ``nb_mtu_seg_max`` providing information about number of
> +  segments limit to be transmitted by device for TSO/non-TSO packets.
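
The "drop or restore invalid packets" step in the example is left open, so
here is one possible way to fill it in, as a sketch only. It assumes, as
the example suggests, that the prepare call returns the number of leading
packets that passed. The names pkt_sketch, prep_sketch and burst_sketch are
hypothetical stand-ins for the mbuf type, rte_eth_tx_prep and
rte_eth_tx_burst, and the stand-in burst function accepts every packet,
which a real queue does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf: only the segment count matters here. */
struct pkt_sketch {
    uint16_t nb_segs;
};

/* Stand-in for rte_eth_tx_prep: count of leading packets that pass. */
static uint16_t prep_sketch(struct pkt_sketch **pkts, uint16_t n,
                            uint16_t max_segs)
{
    uint16_t i;

    for (i = 0; i < n; i++)
        if (pkts[i]->nb_segs > max_segs)
            break;
    return i;
}

/* Stand-in for rte_eth_tx_burst: assumes the queue takes every packet. */
static uint16_t burst_sketch(struct pkt_sketch **pkts, uint16_t n)
{
    (void)pkts;
    return n;
}

/*
 * Sends every packet that passes preparation and skips the ones that fail.
 * Returns the number sent; *dropped receives the number skipped.
 * Real code would also free each skipped packet and handle a partial burst.
 */
static uint16_t send_skipping_invalid(struct pkt_sketch **pkts, uint16_t n,
                                      uint16_t max_segs, uint16_t *dropped)
{
    uint16_t sent = 0, off = 0;

    assert(dropped != NULL);
    *dropped = 0;

    while (off < n) {
        uint16_t ok = prep_sketch(pkts + off, (uint16_t)(n - off), max_segs);

        sent = (uint16_t)(sent + burst_sketch(pkts + off, ok));
        off = (uint16_t)(off + ok);

        if (off < n) {
            /* pkts[off] failed preparation: skip it and carry on */
            (*dropped)++;
            off++;
        }
    }
    return sent;
}
```

Whether dropping is the right policy depends on the application. One that
can fix up the offending packet (for example by linearizing it) would do
that at the skip point instead.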