From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id ECC242BA2 for ; Thu, 28 Jul 2016 15:58:55 +0200 (CEST) Received: from [10.16.0.195] (unknown [10.16.0.195]) by proxy.6wind.com (Postfix) with ESMTP id A7D212596B; Thu, 28 Jul 2016 15:58:55 +0200 (CEST) To: "Ananyev, Konstantin" , Jerin Jacob References: <1469024691-58750-1-git-send-email-tomaszx.kulasek@intel.com> <1469114659-66063-1-git-send-email-tomaszx.kulasek@intel.com> <2601191342CEEE43887BDE71AB97725836B80AD8@irsmsx105.ger.corp.intel.com> <2146153.nVzdynOqdk@xps13> <20160727171043.GA22116@localhost.localdomain> <2601191342CEEE43887BDE71AB97725836B8884E@irsmsx105.ger.corp.intel.com> <20160727174133.GA22895@localhost.localdomain> <2601191342CEEE43887BDE71AB97725836B88894@irsmsx105.ger.corp.intel.com> <20160728021345.GA3617@localhost.localdomain> <2601191342CEEE43887BDE71AB97725836B8AA48@irsmsx105.ger.corp.intel.com> <20160728113853.GA14755@localhost.localdomain> <2601191342CEEE43887BDE71AB97725836B8AB57@irsmsx105.ger.corp.intel.com> Cc: Thomas Monjalon , "dev@dpdk.org" From: Olivier MATZ Message-ID: <579A0F9E.6090200@6wind.com> Date: Thu, 28 Jul 2016 15:58:54 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.6.0 MIME-Version: 1.0 In-Reply-To: <2601191342CEEE43887BDE71AB97725836B8AB57@irsmsx105.ger.corp.intel.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Jul 2016 13:58:56 -0000 Hi, Jumping into this thread, it looks it's the last pending patch remaining for the release. For reference, the idea of tx_prep() was mentionned some time ago in http://dpdk.org/ml/archives/dev/2014-May/002504.html Few comments below. On 07/28/2016 03:01 PM, Ananyev, Konstantin wrote: > Right now to make HW TX offloads to work user is required to do particular actions: > 1. set mbuf.ol_flags properly. > 2. setup mbuf.tx_offload fields properly. > 3. update L3/L4 header fields in a particular way. > > We move #3 into tx_prep(), to hide that complexity from the user simplify things for him. > Though if he still prefers to do #3 by himself - that's ok too. I think moving #3 out of the application is a good idea. Today, for TSO, the offload dpdk API requires to set a specific pseudo header checksum (which does not include the ip len, as expected by Intel drivers), and set the IP checksum to 0. In our app, the network stack sets the TCP checksum to the standard pseudo header checksum, and before sending the mbuf: - packets are split in sw if the driver does not support tso - the tcp csum is patched to match dpdk api if the driver supports tso In the patchs I've recently sent adding tso support for virtio-pmd, it conforms to the dpdk API (phdr csum without ip len), so the tx function need to patch the mbuf data inside the driver, which is something what we want to avoid, for some good reasons explained by Konstantin. So, I think having a tx_prep would also be the occasion to standardize a bit the dpdk offload api, and let the driver-specific stuff in tx_prep(). Konstantin, any opinion about this? >>> Another main purpose of tx_prep(): for multi-segment packets is to >>> check that number of segments doesn't exceed HW limit. >>> Again right now users have to do that on their own. If calling tx_prep() is optional, does it mean that this check may be done twice? (once in tx_prep() and once in tx_burst()) What would be the advantage of doing this check in tx_prep() instead of keeping it in tx_burst(), as it does not touch the mbuf data? >>> 3. Having it a s separate function would allow user control when/where >>> to call it, let say only for some packets, or probably call tx_prep() >>> on one core, and do actual tx_burst() for these packets on the other. Yes, from what I remember, the pipeline model was the main reason why we do not just modify the packet in tx_burst(). Right? >>>>> If you refer as lost cycles here something like: >>>>> RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_prep, -ENOTSUP); then >>>>> yes. >>>>> Though comparing to actual work need to be done for most HW TX >>>>> offloads, I think it is neglectable. >>>> >>>> Not sure. I think doing this test on a per-bulk basis should not impact performance. > To be honest, I don't understand what is your concern here. > That proposed change doesn't break any existing functionality, > doesn't introduce any new requirements to the existing API, > and wouldn't introduce any performance regression for existing apps. > It is a an extension, and user is free not to use it, if it doesn't fit his needs. > From other side there are users who are interested in that functionality, > and they do have use-cases for it. In my opinion, using tx_prep() will implicitly become mandatory as soon as the application want to do offload. An application that won't use it will have to prepare the mbuf, and this preparation will depend on the device, which is not acceptable inside an application. So, to conclude, the api change notification looks good to me, even if there is still some room to discuss the implementation details. Acked-by: Olivier Matz