DPDK patches and discussions
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Olivier MATZ <olivier.matz@6wind.com>,
	Yong Wang <yongwang@vmware.com>,
	"Liu, Jijiang" <jijiang.liu@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v8 10/10] app/testpmd:test VxLAN Tx checksum offload
Date: Mon, 10 Nov 2014 11:39:48 +0000
Message-ID: <2601191342CEEE43887BDE71AB977258213A38D2@IRSMSX105.ger.corp.intel.com>
In-Reply-To: <545CFE56.60605@6wind.com>

Hi Olivier,

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier MATZ
> Sent: Friday, November 07, 2014 5:16 PM
> To: Yong Wang; Liu, Jijiang
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v8 10/10] app/testpmd:test VxLAN Tx checksum offload
> 
> Hello Yong,
> 
> On 11/07/2014 01:43 AM, Yong Wang wrote:
> >>> As to HW TX checksum offload, do you have special requirement for implementing TSO?
> >
> >> Yes. TSO implies TX TCP and IP checksum offload.
> >
> > Is this a general requirement or something specific to ixgbe/i40e? FWIW,
> > the vmxnet3 device does not support TX IP checksum offload but does support
> > TSO.  In that case, we cannot leave the IP checksum field as 0 (the correct
> > checksum needs to be filled in the header) before passing it to the NIC
> > when TSO is enabled.
> 
> This is a good question because we need to define the proper API that
> will work on other PMDs in the future.
> 
> Indeed, there is a hardware specificity in ixgbe: when TSO is enabled,
> the IP checksum flag must also be passed to the driver if it's IPv4.
> From 82599 datasheets (7.2.3.2.4 Advanced Transmit Data Descriptor):
> 
>     IXSM (bit 0) - Insert IP Checksum: This field indicates that IP
>     checksum must be inserted. In IPv6 mode, it must be reset to 0b.
>     If DCMD.TSE and TUCMD.IPV4 are set, IXSM must be set as well.
>     If this bit is set, the packet should at least contain an
>     IP header.
> 
> If we allow the user to give the TSO flag without the IP checksum
> flag in mbuf flags, the ixgbe driver would have to set the IP checksum
> flag in hardware descriptors if the packet is IPv4. The driver would
> have to parse the IP header: this is not a problem as we already need
> it for TCP checksum.
> 
> To summarize, I think we have 3 options when transmitting a packet to be
> segmented using TSO:
> 
> - set IP checksum to 0 in the application: in this case, it would
>   require additional work in virtual drivers if the peer expects
>   to receive a packet with a valid IP checksum. But I'm wondering
>   what is the need for calculating a checksum when transmitting on
>   a virtual device (the peer receiving the packet knows that the
>   packet is not corrupted as it comes from memory). Moreover, if the
>   device advertises TSO, I assume it can also advertise IP checksum
>   offload.
> 
> - calculate the IP checksum in the application. It would take additional
>   cycles although it may not be needed as the driver probably knows
>   how to calculate it.
> 
> - if the driver supports both TSO and IP checksum, the 2 flags MUST
>   be given to the driver and the IP checksum must be set to 0 and the
>   checksum cannot be calculated in software. If the driver only
>   supports TSO, the checksum has to be calculated in software.
> 
> Currently, I have chosen the first solution, but I'm open to changing the
> design. Maybe the 3rd one is also a good solution.
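
To make the third option concrete, here is a minimal sketch in plain C. It is
not DPDK code: the CAP_* and FLAG_* bits and both function names are
hypothetical, and the IPv4 header is handled as raw bytes on the assumption
that the application already knows the header length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device capability bits (not DPDK definitions). */
#define CAP_TSO       (1u << 0)
#define CAP_IP_CKSUM  (1u << 1)
/* Hypothetical per-packet TX offload flags (not DPDK definitions). */
#define FLAG_TCP_SEG  (1u << 0)
#define FLAG_IP_CKSUM (1u << 1)

/* RFC 1071 checksum over an IPv4 header given as raw bytes.
 * The checksum field (bytes 10-11) must be zero on entry. */
static uint16_t
ipv4_hdr_cksum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Option 3: always request TSO; request IP checksum offload only if the
 * device has it, otherwise fill in the checksum in software. */
static uint32_t
tso_tx_flags(uint32_t caps, uint8_t *ip_hdr, size_t ip_hdr_len)
{
	uint32_t flags = FLAG_TCP_SEG;

	ip_hdr[10] = 0;
	ip_hdr[11] = 0;
	if (caps & CAP_IP_CKSUM) {
		flags |= FLAG_IP_CKSUM;      /* hardware inserts the checksum */
	} else {
		uint16_t c = ipv4_hdr_cksum(ip_hdr, ip_hdr_len);

		ip_hdr[10] = (uint8_t)(c >> 8);  /* stored in network order */
		ip_hdr[11] = (uint8_t)(c & 0xff);
	}
	return flags;
}
```

With a device that only advertises TSO, the header leaves the application
with a valid checksum; with both capabilities, the field stays 0 and both
flags are passed down.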
> 
> By the way, we had the same kind of discussion with Konstantin [1]
> about what to do with the TCP checksum. My feeling is that setting it
> to the pseudo-header checksum is the best we can do:
>  - Linux does that
>  - much hardware requires that (this is not the case for ixgbe, which
>    needs a pshdr checksum without the IP len)
>  - it can be reused if received by a virtual device and sent to a
>    physical device supporting TSO

Yes, I remember that discussion.
I still think we had better avoid any read/write access to the packet data inside the PMD TX routine
(packet header parsing and/or pseudo-header checksum calculations).
As I said before, if different HW has different requirements for what has to be recalculated for HW TX offloads,
why not introduce a new function dev_prep_tx(portid, queueid, mbuf[], num)?
The PMD developer can put all necessary calculations/updates of the packet data and related mbuf fields inside that function.
It would then be the PMD's responsibility to provide that function, and the app layer's responsibility to call it for
mbufs with TX offload flags before calling tx_burst().
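
A rough sketch of the idea, in self-contained C rather than real DPDK code.
Only the name dev_prep_tx and its (portid, queueid, mbuf[], num) arguments come
from the proposal above; struct pkt, struct port, the flag bits, the extra
ports argument and the example callback are all made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TX offload flags (stand-ins, not DPDK definitions). */
#define TX_OFFLOAD_TCP_SEG  (1u << 0)
#define TX_OFFLOAD_IP_CKSUM (1u << 1)

/* Hypothetical stand-in for an mbuf with already-parsed header fields. */
struct pkt {
	uint32_t ol_flags;
	uint16_t ip_cksum;    /* IPv4 header checksum field */
	uint16_t tcp_cksum;   /* TCP checksum field */
	uint16_t phdr_cksum;  /* pseudo-header checksum, precomputed here */
};

/* Per-PMD preparation callback. */
typedef uint16_t (*prep_tx_t)(struct pkt **pkts, uint16_t num);

/* Hypothetical stand-in for a port: only the callback matters here. */
struct port {
	prep_tx_t prep_tx;
};

/* Example of what one PMD might do for TSO packets: clear the IP checksum
 * and seed the TCP checksum. Other hardware would need other fix-ups. */
static uint16_t
example_pmd_prep_tx(struct pkt **pkts, uint16_t num)
{
	uint16_t i;

	for (i = 0; i < num; i++) {
		struct pkt *p = pkts[i];

		if (p->ol_flags & TX_OFFLOAD_TCP_SEG) {
			p->ip_cksum = 0;
			p->tcp_cksum = p->phdr_cksum;
		}
	}
	return num;
}

/* Generic entry point the application calls before tx_burst().
 * Returns the number of packets that were prepared. */
static uint16_t
dev_prep_tx(struct port *ports, uint8_t portid, uint16_t queueid,
	    struct pkt **pkts, uint16_t num)
{
	(void)queueid;  /* unused in this sketch */

	if (ports[portid].prep_tx == NULL)
		return num;  /* this PMD needs no fix-ups */
	return ports[portid].prep_tx(pkts, num);
}
```

This way all packet-data accesses stay outside the TX routine itself, and an
application that sets no TX offload flags never has to pay for the call.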

Konstantin

> 
> Best regards,
> Olivier
> 
> 
> [1] http://dpdk.org/ml/archives/dev/2014-May/002766.html
