From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ananyev, Konstantin"
To: Jerin Jacob
CC: Thomas Monjalon, "dev@dpdk.org"
Date: Thu, 28 Jul 2016 10:36:07 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725836B8AA48@irsmsx105.ger.corp.intel.com>
References: <1469024691-58750-1-git-send-email-tomaszx.kulasek@intel.com>
 <1469114659-66063-1-git-send-email-tomaszx.kulasek@intel.com>
 <2601191342CEEE43887BDE71AB97725836B80AD8@irsmsx105.ger.corp.intel.com>
 <2146153.nVzdynOqdk@xps13>
 <20160727171043.GA22116@localhost.localdomain>
 <2601191342CEEE43887BDE71AB97725836B8884E@irsmsx105.ger.corp.intel.com>
 <20160727174133.GA22895@localhost.localdomain>
 <2601191342CEEE43887BDE71AB97725836B88894@irsmsx105.ger.corp.intel.com>
 <20160728021345.GA3617@localhost.localdomain>
In-Reply-To: <20160728021345.GA3617@localhost.localdomain>
Content-Type: text/plain;
 charset="us-ascii"
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure
List-Id: patches and discussions about DPDK

> > > >
> > > > > -----Original Message-----
> > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > Sent: Wednesday, July 27, 2016 6:11 PM
> > > > > To: Thomas Monjalon
> > > > > Cc: Kulasek, TomaszX ; dev@dpdk.org;
> > > > > Ananyev, Konstantin
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for
> > > > > rte_eth_dev structure
> > > > >
> > > > > On Wed, Jul 27, 2016 at 01:59:01AM -0700, Thomas Monjalon wrote:
> > > > > > > > Signed-off-by: Tomasz Kulasek
> > > > > > > > ---
> > > > > > > > +* In 16.11 ABI changes are plained: the ``rte_eth_dev``
> > > > > > > > +structure will be
> > > > > > > > + extended with new function pointer ``tx_pkt_prep`` allowing
> > > > > > > > +verification
> > > > > > > > + and processing of packet burst to meet HW specific
> > > > > > > > +requirements before
> > > > > > > > + transmit. Also new fields will be added to the ``rte_eth_desc_lim`` structure:
> > > > > > > > + ``nb_seg_max`` and ``nb_mtu_seg_max`` provideing
> > > > > > > > +information about number of
> > > > > > > > + segments limit to be transmitted by device for TSO/non-TSO packets.
> > > > > > > >
> > > > > > > Acked-by: Konstantin Ananyev
> > > > > > >
> > > > > > I think I understand you want to split the TX processing:
> > > > > > 1/ modify/write in mbufs
> > > > > > 2/ write in HW
> > > > > > and let application decide:
> > > > > > - where the TX prep is done (which core)
> > > > >
> > > > > In what basics applications knows when and where to call tx_pkt_prep in fast path.
> > > > > if all the time it needs to call before tx_burst then the PMD won't
> > > > > have/don't need this callback waste cycles in fast path.Is this the expected behavior ?
> > > > > Anything think it as compile time to make other PMDs wont suffer because of this change.
> > > > >
> > > > Not sure what suffering you are talking about...
> > > > Current model - i.e. when application does preparations (or doesn't if
> > > > none is required) on its own and just call tx_burst() would still be there.
> > > > If the app doesn't want to use tx_prep() by some reason - that still
> > > > ok, and decision is up to the particular app.
> > > >
> > > So my question is in what basics application decides to call the preparation.
> > > Can you tell me the use case in application perspective?
> > >
> > I suppose one most common use-case when application uses HW TX offloads,
> > and don't' to cope on its own which L3/L4 header fields need to be filled
> > for that particular dev_type/hw_offload combination.
>
> If it does not cope up then it can skip tx'ing in the actual tx burst
> itself and move the "skipped" tx packets to end of the list in the tx
> burst so that application can take the action on "skipped" packet after
> the tx burst

Sorry, that's too cryptic for me.
Can you reword it somehow?

>
>
> > Instead it just setups the ol_flags, fills tx_offload fields and calls tx_prep().
> > Please read the original Tomasz's patch, I think he explained possible use-cases
> > with lot of details.
>
> Sorry, it is not very clear in terms of use cases.

OK, what I meant to say:
Right now, if a user wants to use HW TX cksum/TSO offloads, he might have to:
- set up the IPv4 header cksum field,
- calculate the pseudo-header checksum,
- set up the TCP/UDP cksum field.
The rules for how these calculations need to be done, and which fields
need to be updated, may vary depending on the underlying HW and the
requested offloads.
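
As a rough illustration of the manual steps listed above, here is a minimal
sketch in plain C. The helper names are made up for this example and are not
DPDK APIs. It assumes IPv4 and assumes the pseudo-header sum is wanted
non-complemented and including the L4 length; whether a given device expects
it that way is exactly the kind of per-HW detail that varies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add the 16-bit big-endian words of buf to sum. */
static uint32_t sum16_be(const uint8_t *buf, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len) /* odd trailing byte is padded with a zero byte */
        sum += (uint32_t)buf[i] << 8;
    return sum;
}

/* Hypothetical helper: fold the carries back into the low 16 bits. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * IPv4 header checksum: one's complement of the folded sum over the
 * header. The caller must have zeroed the checksum field beforehand.
 */
uint16_t ipv4_hdr_cksum(const uint8_t *hdr, size_t hdr_len)
{
    assert(hdr_len >= 20 && hdr_len % 4 == 0);
    return (uint16_t)~fold16(sum16_be(hdr, hdr_len, 0));
}

/*
 * Pseudo-header sum over source address, destination address, protocol
 * and L4 length. Assumption of this sketch: the result is returned
 * non-complemented, to be written into the TCP/UDP checksum field
 * before handing the packet to HW.
 */
uint16_t ipv4_phdr_cksum(const uint8_t src[4], const uint8_t dst[4],
                         uint8_t proto, uint16_t l4_len)
{
    uint32_t sum = 0;

    sum = sum16_be(src, 4, sum);
    sum = sum16_be(dst, 4, sum);
    sum += proto;
    sum += l4_len;
    return fold16(sum);
}
```

An application doing this by hand would call both helpers per packet and
write the results into the headers before tx_burst().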
tx_prep() is supposed to hide all these nuances from the user and allow
him to use TX HW offloads in a transparent way.

Another main purpose of tx_prep() is, for multi-segment packets, to check
that the number of segments doesn't exceed the HW limit.
Again, right now users have to do that on their own.

>
> In HW perspective, It it tries to avoid the illegal state. But not sure
> calling "back to back" tx prepare and then tx burst how does it improve the
> situation as the check illegal state check introduce in actual tx burst
> it self.
>
> In SW perspective, its try to avoid sending malformed packets. In my
> view the same can achieved with existing tx burst it self as PMD is the
> one finally send the packets on the wire.

OK, so your question is: why not put that functionality into
tx_burst() itself, right?
For a few reasons:
1. Putting that functionality into tx_burst() would introduce an
unnecessary slowdown for cases when it is not needed
(one segment per packet, no HW offloads).
2. The user might not want to use tx_prep(): he/she might have his/her
own analog, which he/she believes is faster/smarter, etc.
3. Having it as a separate function would allow the user to control
when/where to call it: let's say only for some packets, or call tx_prep()
on one core and do the actual tx_burst() for these packets on another.

>
> proposal quote:
>
> 1. Introduce rte_eth_tx_prep() function to do necessary preparations of
> packet burst to be safely transmitted on device for desired HW
> offloads (set/reset checksum field according to the hardware
> requirements) and check HW constraints (number of segments per
> packet, etc).
>
> While the limitations and requirements may differ for devices, it
> requires to extend rte_eth_dev structure with new function pointer
> "tx_pkt_prep" which can be implemented in the driver to prepare and
> verify packets, in devices specific way, before burst, what should to
> prevent application to send malformed packets.
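
To illustrate reason 3 and the segment-count check, here is a hedged sketch
of an application running the prep step separately from the burst.
Everything in it is a hypothetical stand-in (mock_pkt, mock_desc_lim,
mock_tx_prep, mock_tx_burst, send_with_prep). Only the nb_seg_max and
nb_mtu_seg_max field names come from the quoted patch, and the convention of
returning the count of leading packets that passed is an assumption of this
sketch, not something the proposal text specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet; not the DPDK mbuf. */
struct mock_pkt {
    uint16_t nb_segs; /* number of segments in the packet */
    int tso;          /* non-zero if TSO is requested */
};

/* Hypothetical stand-in carrying the two limits named in the patch. */
struct mock_desc_lim {
    uint16_t nb_seg_max;     /* segment limit, TSO packets */
    uint16_t nb_mtu_seg_max; /* segment limit, non-TSO packets */
};

/*
 * Prep step: verify each packet against the device limits and stop at
 * the first one that fails. Returns how many leading packets are OK
 * (assumed semantics).
 */
uint16_t mock_tx_prep(const struct mock_desc_lim *lim,
                      const struct mock_pkt *pkts, uint16_t nb_pkts)
{
    uint16_t i;

    assert(lim != NULL && pkts != NULL);
    for (i = 0; i < nb_pkts; i++) {
        uint16_t max = pkts[i].tso ? lim->nb_seg_max
                                   : lim->nb_mtu_seg_max;

        if (pkts[i].nb_segs == 0 || pkts[i].nb_segs > max)
            break;
    }
    return i;
}

/* Burst step stand-in: pretends the HW queue accepts every packet. */
uint16_t mock_tx_burst(const struct mock_pkt *pkts, uint16_t nb_pkts)
{
    (void)pkts;
    return nb_pkts;
}

/* Application side: prep first, then send only what passed the check. */
uint16_t send_with_prep(const struct mock_desc_lim *lim,
                        const struct mock_pkt *pkts, uint16_t nb_pkts)
{
    uint16_t nb_ok = mock_tx_prep(lim, pkts, nb_pkts);

    return mock_tx_burst(pkts, nb_ok);
}
```

Because the two steps are separate calls, an application could equally run
mock_tx_prep() on one core and mock_tx_burst() on another, or skip the prep
step entirely for single-segment packets without offloads.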
>
>
> >
> > > and what if the PMD does not implement that callback then it is of waste cycles. Right?
> >
> > If you refer as lost cycles here something like:
> > RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_prep, -ENOTSUP);
> > then yes.
> > Though comparing to actual work need to be done for most HW TX offloads,
> > I think it is neglectable.
>
> Not sure.
>
> > Again, as I said before, it is totally voluntary for the application.
>
> Not according to proposal. It can't be too as application has no idea
> what PMD driver does with "prep" what is the implication on a HW if
> application does not

Why wouldn't the application writer have an idea?
We would document what tx_prep() is supposed to do, and in which cases
the user doesn't need it.
Then it would be up to the user:
- not to use it at all (one segment per packet, no HW TX offloads),
- not to use tx_prep(), and make the necessary preparations himself,
  which is what people have to do now,
- use tx_prep().

Konstantin