From: "Ananyev, Konstantin"
To: Olivier MATZ, Yong Wang, "Liu, Jijiang"
Cc: "dev@dpdk.org"
Date: Mon, 10 Nov 2014 11:39:48 +0000
Subject: Re: [dpdk-dev] [PATCH v8 10/10] app/testpmd:test VxLAN Tx checksum offload
Message-ID: <2601191342CEEE43887BDE71AB977258213A38D2@IRSMSX105.ger.corp.intel.com>
In-Reply-To: <545CFE56.60605@6wind.com>
List-Id: patches and discussions about DPDK

Hi Olivier,

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier MATZ
> Sent: Friday, November 07, 2014 5:16 PM
> To: Yong Wang; Liu, Jijiang
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v8 10/10] app/testpmd:test VxLAN Tx checksum offload
>
> Hello Yong,
>
> On 11/07/2014 01:43 AM, Yong Wang wrote:
> >>> As to HW TX checksum offload, do you have special requirement for implementing TSO?
> >
> >> Yes. TSO implies TX TCP and IP checksum offload.
> >
> > Is this a general requirement or something specific to ixgbe/i40e? FWIW,
> > vmxnet3 device does not support tx IP checksum offload but does support
> > TSO. In that case, we cannot leave IP checksum field as 0 (the correct
> > checksum needs to be filled in the header) before passing it to the NIC
> > when TSO is enabled.
>
> This is a good question because we need to define the proper API that
> will work on other PMDs in the future.
>
> Indeed, there is a hardware specificity in ixgbe: when TSO is enabled,
> the IP checksum flag must also be passed to the driver if it's IPv4.
> From 82599 datasheets (7.2.3.2.4 Advanced Transmit Data Descriptor):
>
>   IXSM (bit 0) - Insert IP Checksum: This field indicates that IP
>   checksum must be inserted. In IPv6 mode, it must be reset to 0b.
>   If DCMD.TSE and TUCMD.IPV4 are set, IXSM must be set as well.
>   If this bit is set, the packet should at least contain an
>   IP header.
>
> If we allow the user to give the TSO flag without the IP checksum
> flag in mbuf flags, the ixgbe driver would have to set the IP checksum
> flag in hardware descriptors if the packet is IPv4. The driver would
> have to parse the IP header: this is not a problem as we already need
> it for TCP checksum.
>
> To summarize, I think we have 3 options when transmitting a packet to be
> segmented using TSO:
>
> - set IP checksum to 0 in the application: in this case, it would
>   require additional work in virtual drivers if the peer expects
>   to receive a packet with a valid IP checksum. But I'm wondering
>   what is the need for calculating a checksum when transmitting on
>   a virtual device (the peer receiving the packet knows that the
>   packet is not corrupted as it comes from memory). Moreover, if the
>   device advertises TSO, I assume it can also advertise IP checksum
>   offload.
>
> - calculate the IP checksum in the application. It would take additional
>   cycles although it may not be needed as the driver probably knows
>   how to calculate it.
>
> - if the driver supports both TSO and IP checksum, the 2 flags MUST
>   be given to the driver and the IP checksum must be set to 0 and the
>   checksum cannot be calculated in software. If the driver only
>   supports TSO, the checksum has to be calculated in software.
>
> Currently, I chose the first solution, but I'm open to changing the
> design. Maybe the 3rd one is also a good solution.
>
> By the way, we had the same kind of discussion with Konstantin [1]
> about what to do with the TCP checksum. My feeling is that setting it
> to the pseudo-header checksum is the best we can do:
> - linux does that
> - much hardware requires that (this is not the case for ixgbe, which
>   needs a pshdr checksum without the IP len)
> - it can be reused if received by a virtual device and sent to a
>   physical device supporting TSO

Yes, I remember that discussion.
I still think we had better avoid any read/write access to the packet data
inside the PMD TX routine (packet header parsing and/or pseudo-header
checksum calculations).
As I said before - if different HW has different requirements for what has
to be recalculated for HW TX offloads - why not introduce a new function
dev_prep_tx(portid, queueid, mbuf[], num)?
The PMD developer can put all necessary calculations/updates of the packet
data and related mbuf fields inside that function.
It would then be the PMD's responsibility to provide that function, and the
app layer's responsibility to call it for mbufs with TX offload flags
before calling tx_burst().

Konstantin

>
> Best regards,
> Olivier
>
>
> [1] http://dpdk.org/ml/archives/dev/2014-May/002766.html
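
For illustration only, here is a minimal sketch of the dev_prep_tx() idea in
plain C. It is not DPDK code: everything named demo_* or DEMO_* (the packet
struct, the flag values, the per-port table and its "pseudo-header with or
without length" quirk) is a hypothetical stand-in, and it assumes a
contiguous IPv4/TCP packet. It only shows how the packet writes discussed
above (zeroing the IP checksum, storing the pseudo-header checksum in the
TCP checksum field) could live in a separate prep step rather than in the
TX routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offload flags; NOT the real DPDK mbuf flag values. */
#define DEMO_TX_IP_CKSUM  (1u << 0)
#define DEMO_TX_TCP_CKSUM (1u << 1)
#define DEMO_TX_TCP_SEG   (1u << 2)

/* Hypothetical, simplified stand-in for an mbuf (IPv4/TCP only). */
struct demo_pkt {
    uint8_t *l3;        /* start of the IPv4 header */
    uint16_t l3_len;    /* IPv4 header length in bytes */
    uint16_t l4_len;    /* TCP header + payload length in bytes */
    uint32_t ol_flags;  /* DEMO_TX_* flags requested by the app */
};

/* Hypothetical per-port quirk: does the HW expect the TCP length to be
 * included in the pseudo-header checksum? Port 0: yes, port 1: no. */
struct demo_dev { int phdr_with_len; };
static const struct demo_dev demo_devs[] = { { 1 }, { 0 } };
#define DEMO_NB_PORTS (sizeof(demo_devs) / sizeof(demo_devs[0]))

/* Read/write 16-bit big-endian (network order) values. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static void wr16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/* Non-complemented one's-complement sum over the IPv4 pseudo-header:
 * source address, destination address, protocol, optional TCP length. */
uint16_t demo_ipv4_phdr_cksum(const uint8_t *ip, uint16_t l4_len, int with_len)
{
    uint32_t sum = 0;

    sum += rd16(ip + 12);   /* source address, high half */
    sum += rd16(ip + 14);   /* source address, low half */
    sum += rd16(ip + 16);   /* destination address, high half */
    sum += rd16(ip + 18);   /* destination address, low half */
    sum += ip[9];           /* zero byte + protocol */
    if (with_len)
        sum += l4_len;
    while (sum >> 16)       /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Hypothetical prep step: all packet-data updates happen here, so the
 * TX routine itself never has to touch the headers. Returns the number
 * of packets prepared. */
uint16_t demo_prep_tx(uint16_t port_id, uint16_t queue_id,
                      struct demo_pkt *pkts[], uint16_t num)
{
    uint16_t i;

    (void)queue_id;         /* unused in this sketch */
    assert(pkts != NULL);
    if (port_id >= DEMO_NB_PORTS)
        return 0;

    for (i = 0; i < num; i++) {
        struct demo_pkt *p = pkts[i];

        /* Assumption: IP checksum field is zeroed when HW fills it in. */
        if (p->ol_flags & (DEMO_TX_IP_CKSUM | DEMO_TX_TCP_SEG))
            wr16(p->l3 + 10, 0);

        /* Store the pseudo-header checksum in the TCP checksum field. */
        if (p->ol_flags & (DEMO_TX_TCP_CKSUM | DEMO_TX_TCP_SEG)) {
            uint16_t ck = demo_ipv4_phdr_cksum(p->l3, p->l4_len,
                              demo_devs[port_id].phdr_with_len);
            wr16(p->l3 + p->l3_len + 16, ck);
        }
    }
    return num;
}
```

In this sketch the app would call demo_prep_tx(port, queue, pkts, n) on a
burst that carries TX offload flags and only then hand the same burst to
tx_burst(); packets without offload flags pass through untouched.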