DPDK patches and discussions
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Liu, Jijiang" <jijiang.liu@intel.com>,
	"Olivier Matz (olivier.matz@6wind.com)" <olivier.matz@6wind.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
Date: Thu, 27 Nov 2014 17:01:33 +0000
Message-ID: <2601191342CEEE43887BDE71AB977258213BAE90@IRSMSX105.ger.corp.intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB977258213BADB8@IRSMSX105.ger.corp.intel.com>



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Thursday, November 27, 2014 2:56 PM
> To: Liu, Jijiang; Olivier Matz (olivier.matz@6wind.com)
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
> 
> 
> 
> >
> > -----Original Message-----
> > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> > Sent: Thursday, November 27, 2014 6:00 PM
> > To: Liu, Jijiang; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
> >
> > Hi Jijiang,
> >
> > Please see some comments below.
> >
> > On 11/27/2014 09:18 AM, Jijiang Liu wrote:
> > > In place of removing the PKT_TX_VXLAN_CKSUM, we introduce 2 new flags: PKT_TX_OUT_IP_CKSUM,
> > > PKT_TX_UDP_TUNNEL_PKT, and a new field: l4_tun_len.
> > > Replace the inner_l2_len and the inner_l3_len field with the outer_l2_len and outer_l3_len field.
> > >
> > > > PKT_TX_OUT_IP_CKSUM: not used for non-tunnelling packets; requests hardware outer IP checksum for tunnelling packets.
> > > PKT_TX_UDP_TUNNEL_PKT: is used to tell PMD that the transmit packet is a UDP tunneling packet.
> > > l4_tun_len: for VXLAN packet, it should be udp header length plus VXLAN header length.
> > >
> > > Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
> > > ---
> > >   lib/librte_mbuf/rte_mbuf.c |    2 +-
> > >   lib/librte_mbuf/rte_mbuf.h |   23 ++++++++++++++---------
> > >   2 files changed, 15 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > index 87c2963..e89c310 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > @@ -240,7 +240,7 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > >   	case PKT_TX_SCTP_CKSUM: return "PKT_TX_SCTP_CKSUM";
> > >   	case PKT_TX_UDP_CKSUM: return "PKT_TX_UDP_CKSUM";
> > >   	case PKT_TX_IEEE1588_TMST: return "PKT_TX_IEEE1588_TMST";
> > > -	case PKT_TX_VXLAN_CKSUM: return "PKT_TX_VXLAN_CKSUM";
> > > +	case PKT_TX_UDP_TUNNEL_PKT: return "PKT_TX_UDP_TUNNEL_PKT";
> > >   	case PKT_TX_TCP_SEG: return "PKT_TX_TCP_SEG";
> > >   	default: return NULL;
> >
> > As I said in my reply to the cover letter, I suggest using PKT_TX_OUT_UDP_CKSUM instead of PKT_TX_UDP_TUNNEL_PKT.
> 
> HW doesn't support outer L4 checksum offload.
> But to calculate inner checksums correctly, it needs a hint from SW about L4 Tunneling Type.
> Currently the following values are recognised by HW:
> 
> L4 Tunneling Type (Teredo / GRE header / VXLAN header) indication:
> 00b - No UDP / GRE tunneling (field must be set to zero if EIPT equals to zero)
> 01b - UDP tunneling header (any UDP tunneling, VXLAN and Geneve).
> 10b - GRE tunneling header
> Else - reserved
> 
> You can check yourself:
> http://www.intel.com/content/www/us/en/embedded/products/networking/xl710-10-40-controller-datasheet.html
> Sections 8.4.2.2.1 and 8.4.4.2
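
For illustration only (not part of the original message): a minimal C sketch of how SW might derive the 2-bit L4 Tunneling Type hint from a TX flag. The enum, the helper name and the EX_ macro are hypothetical and are not taken from the patch or from the i40e PMD; only the 00b/01b/10b values come from the list quoted above, and bit 50 mirrors the PKT_TX_UDP_TUNNEL_PKT definition in the patch.

```c
#include <stdint.h>

/* Hypothetical names; the values are the 2-bit encodings quoted above. */
enum ex_l4_tun_type {
	EX_L4_TUN_NONE = 0x0, /* 00b: no UDP / GRE tunneling */
	EX_L4_TUN_UDP  = 0x1, /* 01b: any UDP tunneling (VXLAN, Geneve) */
	EX_L4_TUN_GRE  = 0x2  /* 10b: GRE tunneling header */
};

/* Stand-in for PKT_TX_UDP_TUNNEL_PKT as defined in the patch (bit 50). */
#define EX_PKT_TX_UDP_TUNNEL_PKT (1ULL << 50)

/*
 * Map the mbuf TX offload flags to the tunneling type hint.
 * GRE is never returned here because the patch defines no GRE flag.
 */
static enum ex_l4_tun_type
ex_l4_tun_type(uint64_t ol_flags)
{
	if (ol_flags & EX_PKT_TX_UDP_TUNNEL_PKT)
		return EX_L4_TUN_UDP;
	return EX_L4_TUN_NONE;
}
```

The point of the sketch is that the flag only tells the NIC which kind of tunnel header to skip; it does not request an outer L4 checksum.
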
> 
> >
> > Also, the PKT_TX_OUT_IP_CKSUM case is missing here.
> >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 367fc56..48cd8e1 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -99,10 +99,9 @@ extern "C" {
> > >   #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 header. */
> > >   #define PKT_RX_FDIR_ID       (1ULL << 13) /**< FD id reported if FDIR match. */
> > >   #define PKT_RX_FDIR_FLX      (1ULL << 14) /**< Flexible bytes reported if FDIR match. */
> > > -/* add new RX flags here */
> > >
> >
> > We should probably not remove this line.
> >
> >
> > >   /* add new TX flags here */
> > > -#define PKT_TX_VXLAN_CKSUM   (1ULL << 50) /**< TX checksum of VXLAN computed by NIC */
> > > > +#define PKT_TX_UDP_TUNNEL_PKT (1ULL << 50) /**< TX packet is an UDP tunneling packet */
> > > >   #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to timestamp. */
> > >
> > >   /**
> > > @@ -125,13 +124,20 @@ extern "C" {
> > >   #define PKT_TX_IP_CKSUM      (1ULL << 54) /**< IP cksum of TX pkt. computed by NIC. */
> > >   #define PKT_TX_IPV4_CSUM     PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
> > >
> > > +#define PKT_TX_VLAN_PKT      (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
> > > +
> > >   /** Tell the NIC it's an IPv4 packet. Required for L4 checksum offload or TSO. */
> > > -#define PKT_TX_IPV4          PKT_RX_IPV4_HDR
> > > +#define PKT_TX_IPV4          (1ULL << 56)
> > >
> > >   /** Tell the NIC it's an IPv6 packet. Required for L4 checksum offload or TSO. */
> > > -#define PKT_TX_IPV6          PKT_RX_IPV6_HDR
> > > +#define PKT_TX_IPV6          (1ULL << 57)
> >
> > The description in comment does not match the description in the cover letter.
> >
> > Also, I think replacing PKT_RX_IPV[46]_HDR by the value may go in another commit.
> >
> >
> > > -#define PKT_TX_VLAN_PKT      (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
> > > +/** Outer IP cksum of TX pkt. computed by NIC for tunneling packet */
> > > +#define PKT_TX_OUTER_IP_CKSUM   (1ULL << 58)
> > > > +#define PKT_TX_OUTER_IPV4_CSUM  PKT_TX_OUTER_IP_CKSUM /**< Alias of PKT_TX_OUTER_IP_CKSUM. */
> >
> > Why do we need an alias?
> >
> > By the way, I think the alias of PKT_TX_IP_CKSUM is also unneeded and can be removed. But it's not the topic of your series.
> >
> > Also, the name PKT_TX_OUTER_IP_CKSUM does not match the name in the cover letter and commit logs.
> >
> >
> > > +
> > > +/** Tell the NIC it's an outer IPv6 packet for tunneling packet.*/
> > > +#define PKT_TX_OUTER_IPV6    (1ULL << 59)
> > >
> >
> > This flag is not in the cover letter or commit log. What is its purpose?
> 
> 
> My bad, I forgot that for outer IP we will also need to specify its type.
> So it is the same story here as for inner IP.
> In total, we might need 3 flags for outer IP:
> 
> /* Tells HW that outer IP is IPV4 and checksum for it should be calculated by HW. */
> PKT_TX_OUTER_IP_CKSUM
> 
> /* Tells HW that outer IP is IPV4 and checksum for it should not be calculated by HW. */
> PKT_TX_OUTER_IPV4
> 
> /* Tells HW that outer IP is IPV6. */
> PKT_TX_OUTER_IPV6
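
For illustration only (not part of the original message): a minimal C sketch of how an application might choose among the three outer-IP flags proposed above. Bits 58 and 59 follow the patch under review; the bit used for PKT_TX_OUTER_IPV4 and the helper name are assumptions, since that flag is only proposed in this thread and has no value yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values for bits 58 and 59 are taken from the patch under review. */
#define EX_PKT_TX_OUTER_IP_CKSUM (1ULL << 58)
#define EX_PKT_TX_OUTER_IPV6     (1ULL << 59)
/* Assumption: the thread proposes this flag but assigns no bit to it. */
#define EX_PKT_TX_OUTER_IPV4     (1ULL << 60)

/*
 * Pick exactly one outer-IP flag for a tunneled packet:
 *  - outer IPv6: no IP header checksum exists, so only the type is given;
 *  - outer IPv4 with HW checksum: the CKSUM flag also implies IPv4;
 *  - outer IPv4 without HW checksum: type only.
 */
static uint64_t
ex_outer_ip_tx_flags(bool outer_is_ipv6, bool hw_cksum)
{
	if (outer_is_ipv6)
		return EX_PKT_TX_OUTER_IPV6;
	return hw_cksum ? EX_PKT_TX_OUTER_IP_CKSUM : EX_PKT_TX_OUTER_IPV4;
}
```

A caller would OR the result into ol_flags together with the inner-packet flags.
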
> 
> >
> >
> > >   /**
> > > >    * TCP segmentation offload. To enable this offload feature for a
> > > > @@ -266,10 +272,9 @@ struct rte_mbuf {
> > >   			uint64_t tso_segsz:16; /**< TCP TSO segment size */
> > >
> > >   			/* fields for TX offloading of tunnels */
> > > -			uint64_t inner_l3_len:9; /**< inner L3 (IP) Hdr Length. */
> > > -			uint64_t inner_l2_len:7; /**< inner L2 (MAC) Hdr Length. */
> > > -
> > > -			/* uint64_t unused:8; */
> > > +			uint64_t outer_l3_len:9; /**< outer L3 (IP) Hdr Length. */
> > > +			uint64_t outer_l2_len:7; /**< outer L2 (MAC) Hdr Length. */
> > > +			uint64_t l4_tun_len:8; /**< L4 tunnelling header length */
> > >   		};
> > >   	};
> > >   } __rte_cache_aligned;
> > >
> >
> > About l4_tun_len, I have another comment I forgot to add in the cover letter. Can we remove it and include its length in
> > outer_l2_len instead? For instance, replace:
> >
> >       mb->l2_len =  eth_hdr_in;
> >       mb->l3_len = ipv4_hdr_in;
> >       mb->outer_l2_len = eth_hdr_out;
> >       mb->outer_l3_len = ipv4_hdr_out;
> >       mb->l4tun_len = vxlan_hdr;
> >       mb->ol_flags |= PKT_TX_OUT_IP_CKSUM  | PKT_TX_UDP_TUNNEL |
> >         PKT_TX_IP_CKSUM |  PKT_TX_TCP_CKSUM;
> >
> > by:
> >
> >       mb->l2_len =  eth_hdr_in;
> >       mb->l3_len = ipv4_hdr_in;
> >       mb->outer_l2_len = eth_hdr_out + vxlan_hdr;
> >       mb->outer_l3_len = ipv4_hdr_out;
> >       mb->ol_flags |= PKT_TX_OUT_IP_CKSUM  | PKT_TX_UDP_TUNNEL |
> >         PKT_TX_IP_CKSUM |  PKT_TX_TCP_CKSUM;
> >
> > I think it won't bother the driver, and it's coherent with case B.2 of your cover letter.
> 
> You probably meant:
> mb->l2_len =  eth_hdr_in + vxlan_hdr;
> ?
> Yes, I think it could be done that way too.
> Though I still prefer to keep l4tun_len - it makes things a bit cleaner (at least to me).
> After all  we do have space for it in mbuf's tx_offload.

One more argument in favour of a separate l4tun_len field:
l2_len is only 7 bits long, so in theory it might not be enough, given that for FVL:
12:18 L4TUNLEN L4 Tunneling Length (Teredo / GRE header / VXLAN header) defined in Words.
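
The size argument can be sketched in C as follows. This is an illustration under stated assumptions, not driver code: the macro and helper names are hypothetical, and a "Word" is assumed to be 2 bytes. Under that assumption a 7-bit l2_len can describe at most 127 bytes, while the 7-bit L4TUNLEN field (bits 12:18) can describe up to 127 words, i.e. 254 bytes.

```c
#include <stdint.h>

#define EX_L2_LEN_BITS     7   /* width of l2_len in the mbuf */
#define EX_L4TUNLEN_BITS   7   /* bits 12:18 inclusive */
#define EX_L4TUNLEN_SHIFT  12
#define EX_WORD_BYTES      2   /* assumption: one "Word" is 2 bytes */

/* Largest value an unsigned bit-field of the given width can hold. */
static uint32_t
ex_field_max(unsigned int bits)
{
	return (1u << bits) - 1u;
}

/*
 * Encode a tunnel header length given in bytes into the L4TUNLEN position.
 * Returns 0 on success, -1 if the length is not a whole number of words
 * or does not fit into the field.
 */
static int
ex_l4tunlen_encode(uint32_t tun_bytes, uint32_t *out)
{
	uint32_t words;

	if (tun_bytes % EX_WORD_BYTES != 0)
		return -1;
	words = tun_bytes / EX_WORD_BYTES;
	if (words > ex_field_max(EX_L4TUNLEN_BITS))
		return -1;
	*out = words << EX_L4TUNLEN_SHIFT;
	return 0;
}
```

For a VXLAN packet, an 8-byte UDP header plus an 8-byte VXLAN header gives 16 bytes, which encodes as 8 words.
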


> Konstantin
> 
> >
> > Regards,
> > Olivier
