DPDK patches and discussions
 help / color / mirror / Atom feed
From: Olivier MATZ <olivier.matz@6wind.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	 "Liu, Jijiang" <jijiang.liu@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine
Date: Fri, 16 Jan 2015 18:27:52 +0100
Message-ID: <54B94A18.5030700@6wind.com> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB977258213D4FCF@irsmsx105.ger.corp.intel.com>

Hi Konstantin, Hi Jijiang,

On 01/15/2015 02:31 PM, Ananyev, Konstantin wrote:
> To be honest, there are so many mails around that subject, so I am already lost :)
> Oliver, as I understand you are not happy with the test-pmd commands Frank is proposing.
> Both syntax and semantics.
> Is that correct?
> If so, could you suggest something from your side?
> That would allow to configure test-pmd to behave in any of 4 possible ways we discussed previously:
> http://dpdk.org/ml/archives/dev/2014-December/009213.html

I first wanted to send a mail to describe the current problem with
testpmd command line and the 2 solutions (Jijiang's and mine).
But, first, I think we need to fully clarify the checksum offload
API through examples as it will help to implement testpmd and do
the documentation. They are based on Jijiang's previous mail [1].

I will submit a patchset fixing the problems described below in
the coming days. If we agree on it, I'll submit another one for testpmd.

Let's use the following packet for all the examples below:
  out_eth / out_ip / out_udp / vxlan / in_eth / in_ip / in_tcp


The following cases are supposed to work on niantic and fortville:

case 1) calculate checksum of out_ip  (was case A in [1])

   mb->l2_len = len(out_eth)
   mb->l3_len = len(out_ip)
   mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CSUM
   set out_ip checksum to 0 in the packet

   supported on hardware advertising DEV_TX_OFFLOAD_IPV4_CKSUM

case 2) calculate checksum of out_ip and out_udp

   mb->l2_len = len(out_eth)
   mb->l3_len = len(out_ip)
   mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CSUM | PKT_TX_UDP_CKSUM
   set out_ip checksum to 0 in the packet
   set out_udp checksum to pseudo header using rte_ipv4_phdr_cksum()

   supported on hardware advertising DEV_TX_OFFLOAD_IPV4_CKSUM and
   DEV_TX_OFFLOAD_UDP_CKSUM

   *Problem 1*: The comment above PKT_TX_IPV4 says "Packet is IPv4
   without requiring IP checksum offload" [2], and the help of L4
   checksum and TSO says that it is required to set the PKT_TX_IPV4
   flag [3]. This is not coherent.

   We are back on the debate about the meaning of PKT_TX_IPV4 vs
   PKT_TX_IP_CSUM from [4]. This incoherency in comments are introduced
   by patch [5]. The question is "when an application should set
   this flag? for any IP packet that does not require IP checksum?".
   This would break many applications. I think a good definition would
   be:

   Packet is IPv4. This flag must be set when using any offload
   feature (TSO, L3 or L4 checksum) to tell the NIC that the packet
   is an IPv4 packet.

   That's why I added PKT_TX_IPV4 in the examples.

case 3) calculate checksum of in_ip

   mb->l2_len = len(out_eth + out_ip + out_udp + vxlan + in_eth)
   mb->l3_len = len(in_ip)
   mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CSUM
   set in_ip checksum to 0 in the packet

   This is similar to case 1), but l2_len is different.

   supported on hardware advertising DEV_TX_OFFLOAD_IPV4_CKSUM

   Note that it can only work if outer L4 checksum is 0.

case 4) calculate checksum of in_ip and in_tcp  (was case B.2 in [1])

   mb->l2_len = len(out_eth + out_ip + out_udp + vxlan + in_eth)
   mb->l3_len = len(in_ip)
   mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CSUM | PKT_TX_TCP_CKSUM
   set in_ip checksum to 0 in the packet
   set in_tcp checksum to pseudo header using rte_ipv4_phdr_cksum()

   This is similar to case 2), but l2_len is different.

   supported on hardware advertising DEV_TX_OFFLOAD_IPV4_CKSUM and
   DEV_TX_OFFLOAD_TCP_CKSUM

   Note that it can only work if outer L4 checksum is 0.

case 5) segment inner TCP

   mb->l2_len = len(out_eth + out_ip + out_udp + vxlan + in_eth)
   mb->l3_len = len(in_ip)
   mb->l4_len = len(in_tcp)
   mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
     PKT_TX_TCP_SEG;
   set in_ip checksum to 0 in the packet
   set in_tcp checksum to pseudo header without including the IP
     payload length using rte_ipv4_phdr_cksum()

   supported on hardware advertising DEV_TX_OFFLOAD_TCP_TSO.

   Note that it can only work if outer L4 checksum is 0.

   Problem 1 is also visible here.


The following cases are supposed to *work on fortville*:

case 6) calculate checksum of out_ip, in_ip, in_tcp (was case C in [1])

   mb->outer_l2_len = len(out_eth)
   mb->outer_l3_len = len(out_ip)
   mb->l2_len = len(out_udp + vxlan + in_eth)
   mb->l3_len = len(in_ip)
   mb->ol_flags |= PKT_TX_OUTER_IPV4 | PKT_TX_OUTER_IP_CKSUM  | \
     PKT_TX_IP_CKSUM |  PKT_TX_TCP_CKSUM;
   set out_ip checksum to 0 in the packet
   set in_ip checksum to 0 in the packet
   set in_tcp checksum to pseudo header using rte_ipv4_phdr_cksum()

   supported on hardware advertising DEV_TX_OFFLOAD_IPV4_CKSUM,
   DEV_TX_OFFLOAD_UDP_CKSUM and DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM

   *Problem 2*: it is not written in the API comments that out_ip
   checksum field should be set to 0 by the application. They should
   be enhanced.

   *Problem 3*: without using the word "fortville", it is difficult
   to understand the goal of the flag PKT_TX_UDP_TUNNEL_PKT. Indeed,
   once PKT_TX_OUTER_IPV4/6 is set, it looks obvious that it's a
   tunnel packet. I suggest to remove the PKT_TX_UDP_TUNNEL_PKT
   flag. In linux, the driver doesn't care about the tunnel type,
   it always set I40E_TXD_CTX_UDP_TUNNELING for all encapsulations [6].

   *Problem 4*: features flags are missing here. A flag
   DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM should be added. This is already
   addressed by one patch from Jijiang [7]


The cases should work in some *future drivers*:

case 7) calculate checksum of out_ip, out_udp, in_ip and in_tcp

   mb->outer_l2_len = len(out_eth)
   mb->outer_l3_len = len(out_ip)
   mb->l2_len = len(out_udp + vxlan + in_eth)
   mb->l3_len = len(in_ip)
   mb->ol_flags |= PKT_TX_OUTER_IPV4 | PKT_TX_OUTER_IP_CKSUM | \
     PKT_TX_OUTER_UDP_CKSUM | PKT_TX_IP_CKSUM |  PKT_TX_TCP_CKSUM;
   set out_ip checksum to 0 in the packet
   set out_udp checksum to pseudo header using rte_ipv4_phdr_cksum()
   set in_ip checksum to 0 in the packet
   set in_tcp checksum to pseudo header using rte_ipv4_phdr_cksum()

  We need to add the flag PKT_TX_OUTER_UDP_CKSUM.

case 8) TSO on inner header + out_ip checksum

   This is not supported yet, but latest patch from Jijiang [8]
   implements this feature.

   mb->outer_l2_len = len(out_eth)
   mb->outer_l3_len = len(out_ip)
   mb->l2_len = len(out_udp + vxlan + in_eth)
   mb->l3_len = len(in_ip)
   mb->l4_len = len(in_tcp)
   mb->ol_flags |= PKT_TX_OUTER_IP_CKSUM | PKT_TX_OUTER_IPV4 | \
     PKT_TX_IPV4 | PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | \
     PKT_TX_TCP_SEG;
   set out_ip checksum to 0 in the packet
   set in_ip checksum to 0 in the packet
   set in_tcp checksum to pseudo header without including the IP
     payload length using rte_ipv4_phdr_cksum()

   supported on hardware advertising DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM
   and DEV_TX_OFFLOAD_TCP_TSO.


I think the following cases should be *forbidden by the API*:

case 9) calculate checksum of in_ip and in_tcp  (was case B.1 in [1])

   mb->outer_l2_len = len(out_eth)
   mb->outer_l3_len = len(out_ip)
   mb->l2_len = len(out_udp + vxlan + in_eth)
   mb->l3_len = len(out_ip)
   mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_UDP_TUNNEL_PKT | \
     PKT_TX_IP_CSUM | PKT_TX_UDP_CKSUM;
   set out_ip checksum to 0 in the packet
   set out_udp checksum to pseudo header using rte_ipv4_phdr_cksum()

   If we remove the flag PKT_TX_UDP_TUNNEL_PKT, this cannot be
   supported, but there is no reason to support it as there is
   already one way to do the same.

   I think the driver should not even look at mb->outer_l2_len
   and mb->outer_l3_len if no flag PKT_TX_OUTER_* is set.

case 10) calculate a checksum using only outer_lX fields

   The outer_lX fields or PKT_TX_OUTER_* flags can only be used
   if a inner checksum is enabled. So it's not possible to do
   the following:

   mb->outer_l2_len = len(out_eth)
   mb->outer_l3_len = len(out_ip)
   mb->ol_flags |= PKT_TX_OUTER_IPV4 | PKT_TX_OUTER_IP_CSUM
   set out_ip checksum to 0 in the packet

Regards,
Olivier



[1] http://dpdk.org/ml/archives/dev/2014-December/009213.html
[2]
http://dpdk.org/browse/dpdk/tree/lib/librte_mbuf/rte_mbuf.h?id=v1.8.0#n147
[3]
http://dpdk.org/browse/dpdk/tree/lib/librte_mbuf/rte_mbuf.h?id=v1.8.0#n108
[4] http://dpdk.org/ml/archives/dev/2014-December/009352.html
[5]
http://dpdk.org/browse/dpdk/commit/lib/librte_mbuf/rte_mbuf.h?id=711ba9e23e681b97d547219de8af199ea03a33b3
[6]
http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/i40evf/i40e_txrx.c?v=3.17#L1223
[7] http://dpdk.org/dev/patchwork/patch/1907/
[8] http://dpdk.org/dev/patchwork/patch/2329/

  reply	other threads:[~2015-01-16 17:28 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-10  1:03 Jijiang Liu
2014-12-10  1:03 ` [dpdk-dev] [PATCH v3 1/3] librte_ether:add outer IP offload capability flag Jijiang Liu
2014-12-11 10:33   ` Olivier MATZ
2014-12-10  1:03 ` [dpdk-dev] [PATCH v3 2/3] i40e:support outer IPv4 checksum capability Jijiang Liu
2014-12-11 10:34   ` Olivier MATZ
2014-12-10  1:03 ` [dpdk-dev] [PATCH v3 3/3] app/testpmd:change tx_checksum command and csum forwarding engine Jijiang Liu
2014-12-11 10:52   ` Olivier MATZ
2014-12-12  4:06     ` Liu, Jijiang
2014-12-11 10:17 ` [dpdk-dev] [PATCH v3 0/3] enhance TX checksum " Olivier MATZ
2014-12-12  3:48   ` Liu, Jijiang
2014-12-12 16:33     ` Olivier MATZ
2015-01-07  2:03       ` Liu, Jijiang
2015-01-07  9:59         ` Ananyev, Konstantin
2015-01-07 11:39           ` Liu, Jijiang
2015-01-07 12:07             ` Ananyev, Konstantin
2015-01-08  8:51               ` Liu, Jijiang
2015-01-08 10:54                 ` Ananyev, Konstantin
2015-01-09 10:45                   ` Olivier MATZ
2015-01-12  3:41                     ` Liu, Jijiang
2015-01-12 11:43                       ` Olivier MATZ
2015-01-13  3:04                         ` Liu, Jijiang
2015-01-13  9:55                           ` Olivier MATZ
2015-01-14  3:01                             ` Liu, Jijiang
2015-01-15 13:31                               ` Ananyev, Konstantin
2015-01-16 17:27                                 ` Olivier MATZ [this message]
2015-01-19 13:04                                   ` Ananyev, Konstantin
2015-01-19 14:38                                     ` Olivier MATZ
2015-01-20  1:12                                       ` Ananyev, Konstantin
2015-01-20 12:39                                         ` Olivier MATZ
2015-01-20 15:18                                           ` Thomas Monjalon
2015-01-20 17:10                                             ` Stephen Hemminger
2015-01-20 17:23                                           ` Ananyev, Konstantin
2015-01-20 18:15                                             ` Olivier MATZ
2015-01-21  3:12                                               ` Liu, Jijiang
2015-01-21 15:25                                                 ` Olivier MATZ
2015-01-21 16:28                                                   ` Ananyev, Konstantin
2015-01-21 17:13                                                     ` Olivier MATZ
2015-01-26  4:13                                                   ` Ananyev, Konstantin
2015-01-26  6:02                                                     ` Liu, Jijiang
2015-01-26 14:07                                                       ` Olivier MATZ
2015-01-26 14:15                                                         ` Ananyev, Konstantin
2015-01-27  8:34                                                           ` Olivier MATZ
2015-01-27 15:26                                                             ` Ananyev, Konstantin
2015-01-21 19:44                                                 ` Stephen Hemminger
2015-01-22  1:40                                                   ` Liu, Jijiang
2015-01-21  8:01                                               ` Liu, Jijiang
2015-01-21  9:10                                                 ` Olivier MATZ
2015-01-21 11:52                                                   ` Ananyev, Konstantin
2015-01-07 13:06 ` Qiu, Michael

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54B94A18.5030700@6wind.com \
    --to=olivier.matz@6wind.com \
    --cc=dev@dpdk.org \
    --cc=jijiang.liu@intel.com \
    --cc=konstantin.ananyev@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git