From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id D6EDB2BE1 for ; Thu, 28 Jul 2016 16:21:04 +0200 (CEST) Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga104.fm.intel.com with ESMTP; 28 Jul 2016 07:21:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.28,434,1464678000"; d="scan'208";a="1015540506" Received: from irsmsx101.ger.corp.intel.com ([163.33.3.153]) by fmsmga001.fm.intel.com with ESMTP; 28 Jul 2016 07:21:02 -0700 Received: from irsmsx105.ger.corp.intel.com ([169.254.7.102]) by IRSMSX101.ger.corp.intel.com ([169.254.1.183]) with mapi id 14.03.0248.002; Thu, 28 Jul 2016 15:21:02 +0100 From: "Ananyev, Konstantin" To: Olivier MATZ , Jerin Jacob CC: Thomas Monjalon , "dev@dpdk.org" Thread-Topic: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure Thread-Index: AQHR6C4vhCk5+MDYj0GuIonGmNNwxqAsvetQgABMTACAAI6goIAAD0eAgAAetsCAAAhmAIAAE1hQ Date: Thu, 28 Jul 2016 14:21:01 +0000 Message-ID: <2601191342CEEE43887BDE71AB97725836B8ABEB@irsmsx105.ger.corp.intel.com> References: <1469024691-58750-1-git-send-email-tomaszx.kulasek@intel.com> <1469114659-66063-1-git-send-email-tomaszx.kulasek@intel.com> <2601191342CEEE43887BDE71AB97725836B80AD8@irsmsx105.ger.corp.intel.com> <2146153.nVzdynOqdk@xps13> <20160727171043.GA22116@localhost.localdomain> <2601191342CEEE43887BDE71AB97725836B8884E@irsmsx105.ger.corp.intel.com> <20160727174133.GA22895@localhost.localdomain> <2601191342CEEE43887BDE71AB97725836B88894@irsmsx105.ger.corp.intel.com> <20160728021345.GA3617@localhost.localdomain> <2601191342CEEE43887BDE71AB97725836B8AA48@irsmsx105.ger.corp.intel.com> <20160728113853.GA14755@localhost.localdomain> <2601191342CEEE43887BDE71AB97725836B8AB57@irsmsx105.ger.corp.intel.com> <579A0F9E.6090200@6wind.com> In-Reply-To: <579A0F9E.6090200@6wind.com> Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [163.33.239.180] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Jul 2016 14:21:05 -0000 Hi Olivier, >=20 > Hi, >=20 > Jumping into this thread, it looks it's the last pending patch remaining = for the release. >=20 > For reference, the idea of tx_prep() was mentionned some time ago in http= ://dpdk.org/ml/archives/dev/2014-May/002504.html >=20 > Few comments below. >=20 > On 07/28/2016 03:01 PM, Ananyev, Konstantin wrote: > > Right now to make HW TX offloads to work user is required to do particu= lar actions: > > 1. set mbuf.ol_flags properly. > > 2. setup mbuf.tx_offload fields properly. > > 3. update L3/L4 header fields in a particular way. > > > > We move #3 into tx_prep(), to hide that complexity from the user simpli= fy things for him. > > Though if he still prefers to do #3 by himself - that's ok too. >=20 > I think moving #3 out of the application is a good idea. Today, for TSO, = the offload dpdk API requires to set a specific pseudo header > checksum (which does not include the ip len, as expected by Intel drivers= ), and set the IP checksum to 0. >=20 > In our app, the network stack sets the TCP checksum to the standard pseud= o header checksum, and before sending the mbuf: > - packets are split in sw if the driver does not support tso > - the tcp csum is patched to match dpdk api if the driver supports tso >=20 > In the patchs I've recently sent adding tso support for virtio-pmd, it co= nforms to the dpdk API (phdr csum without ip len), so the tx function > need to patch the mbuf data inside the driver, which is something what we= want to avoid, for some good reasons explained by Konstantin. Yep, that would be another good use-case for tx_prep() I suppose. >=20 > So, I think having a tx_prep would also be the occasion to standardize a = bit the dpdk offload api, and let the driver-specific stuff in tx_prep(). >=20 > Konstantin, any opinion about this? Yes, that sounds like a good thing to me. >=20 > >>> Another main purpose of tx_prep(): for multi-segment packets is to > >>> check that number of segments doesn't exceed HW limit. > >>> Again right now users have to do that on their own. >=20 > If calling tx_prep() is optional, does it mean that this check may be don= e twice? (once in tx_prep() and once in tx_burst()) I meant 'optional' in a way, that if user doesn't want to use tx_prep() and do step #3 from the above on his own (what happens now) that is still ok. But I think step #3 (modify packet's data) still needs to be done before tx= _burst() is called for the packets. >=20 > What would be the advantage of doing this check in tx_prep() instead of k= eeping it in tx_burst(), as it does not touch the mbuf data? >=20 > >>> 3. Having it a s separate function would allow user control when/whe= re > >>> to call it, let say only for some packets, or probably call tx= _prep() > >>> on one core, and do actual tx_burst() for these packets on the= other. >=20 > Yes, from what I remember, the pipeline model was the main reason why we = do not just modify the packet in tx_burst(). Right? Yes. >=20 > >>>>> If you refer as lost cycles here something like: > >>>>> RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_prep, -ENOTSUP); then > >>>>> yes. > >>>>> Though comparing to actual work need to be done for most HW TX > >>>>> offloads, I think it is neglectable. > >>>> > >>>> Not sure. >=20 > I think doing this test on a per-bulk basis should not impact performance= . >=20 > > To be honest, I don't understand what is your concern here. > > That proposed change doesn't break any existing functionality, doesn't > > introduce any new requirements to the existing API, and wouldn't > > introduce any performance regression for existing apps. > > It is a an extension, and user is free not to use it, if it doesn't fit= his needs. > > From other side there are users who are interested in that > > functionality, and they do have use-cases for it. >=20 > In my opinion, using tx_prep() will implicitly become mandatory as soon a= s the application want to do offload. An application that won't use > it will have to prepare the mbuf, and this preparation will depend on the= device, which is not acceptable inside an application. Yes, I also hope that most apps that do use TX offloads will start to use i= t, as I think it will be much more convenient way, then what we have right now= . I just liked to emphasis that user wouldn't be forced to. Konstantin >=20 >=20 > So, to conclude, the api change notification looks good to me, even if th= ere is still some room to discuss the implementation details. >=20 >=20 > Acked-by: Olivier Matz