From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
"Liu, Jijiang" <jijiang.liu@intel.com>,
"Olivier Matz (olivier.matz@6wind.com)" <olivier.matz@6wind.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
Date: Thu, 27 Nov 2014 17:01:33 +0000 [thread overview]
Message-ID: <2601191342CEEE43887BDE71AB977258213BAE90@IRSMSX105.ger.corp.intel.com> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB977258213BADB8@IRSMSX105.ger.corp.intel.com>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Thursday, November 27, 2014 2:56 PM
> To: Liu, Jijiang; Olivier Matz (olivier.matz@6wind.com)
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
>
>
>
> >
> > -----Original Message-----
> > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> > Sent: Thursday, November 27, 2014 6:00 PM
> > To: Liu, Jijiang; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
> >
> > Hi Jijiang,
> >
> > Please see some comments below.
> >
> > On 11/27/2014 09:18 AM, Jijiang Liu wrote:
> > > In place of removing the PKT_TX_VXLAN_CKSUM, we introduce 2 new flags: PKT_TX_OUT_IP_CKSUM,
> PKT_TX_UDP_TUNNEL_PKT,
> > and a new field: l4_tun_len.
> > > Replace the inner_l2_len and the inner_l3_len field with the outer_l2_len and outer_l3_len field.
> > >
> > > PKT_TX_OUT_IP_CKSUM: is not used for non-tunnelling packet;hardware outer checksum for tunnelling packet.
> > > PKT_TX_UDP_TUNNEL_PKT: is used to tell PMD that the transmit packet is a UDP tunneling packet.
> > > l4_tun_len: for VXLAN packet, it should be udp header length plus VXLAN header length.
> > >
> > > Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
> > > ---
> > > lib/librte_mbuf/rte_mbuf.c | 2 +-
> > > lib/librte_mbuf/rte_mbuf.h | 23 ++++++++++++++---------
> > > 2 files changed, 15 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > index 87c2963..e89c310 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > @@ -240,7 +240,7 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > > case PKT_TX_SCTP_CKSUM: return "PKT_TX_SCTP_CKSUM";
> > > case PKT_TX_UDP_CKSUM: return "PKT_TX_UDP_CKSUM";
> > > case PKT_TX_IEEE1588_TMST: return "PKT_TX_IEEE1588_TMST";
> > > - case PKT_TX_VXLAN_CKSUM: return "PKT_TX_VXLAN_CKSUM";
> > > + case PKT_TX_UDP_TUNNEL_PKT: return "PKT_TX_UDP_TUNNEL_PKT";
> > > case PKT_TX_TCP_SEG: return "PKT_TX_TCP_SEG";
> > > default: return NULL;
> >
> > As I said as a reply to the cover letter, I suggest to use PKT_TX_OUT_UDP_CKSUM instead of PKT_TX_UDP_TUNNEL_PKT.
>
> HW don't support outer L4 checksum offload.
> But to calculate inner checksums correctly, it needs a hint from SW about L4 Tunneling Type.
> Currently the following values are recognised by HW:
>
> L4 Tunneling Type (Teredo / GRE header / VXLAN header) indication:
> 00b - No UDP / GRE tunneling (field must be set to zero if EIPT equals to zero)
> 01b - UDP tunneling header (any UDP tunneling, VXLAN and Geneve).
> 10b - GRE tunneling header
> Else - reserved
>
> You can check yourself:
> http://www.intel.com/content/www/us/en/embedded/products/networking/xl710-10-40-controller-datasheet.html
> Sections 8.4.2.2.1 and 8.4.4.2
>
> >
> > Also, the PKT_TX_OUT_IP_CKSUM case is missing here.
> >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 367fc56..48cd8e1 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -99,10 +99,9 @@ extern "C" {
> > > #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 header. */
> > > #define PKT_RX_FDIR_ID (1ULL << 13) /**< FD id reported if FDIR match. */
> > > #define PKT_RX_FDIR_FLX (1ULL << 14) /**< Flexible bytes reported if FDIR match. */
> > > -/* add new RX flags here */
> > >
> >
> > We should probably not remove this line.
> >
> >
> > > /* add new TX flags here */
> > > -#define PKT_TX_VXLAN_CKSUM (1ULL << 50) /**< TX checksum of VXLAN computed by NIC */
> > > +#define PKT_TX_UDP_TUNNEL_PKT (1ULL << 50) /**< TX packet is an UDP
> > > +tunneling packet */
> > > #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to
> > > timestamp. */
> > >
> > > /**
> > > @@ -125,13 +124,20 @@ extern "C" {
> > > #define PKT_TX_IP_CKSUM (1ULL << 54) /**< IP cksum of TX pkt. computed by NIC. */
> > > #define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
> > >
> > > +#define PKT_TX_VLAN_PKT (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
> > > +
> > > /** Tell the NIC it's an IPv4 packet. Required for L4 checksum offload or TSO. */
> > > -#define PKT_TX_IPV4 PKT_RX_IPV4_HDR
> > > +#define PKT_TX_IPV4 (1ULL << 56)
> > >
> > > /** Tell the NIC it's an IPv6 packet. Required for L4 checksum offload or TSO. */
> > > -#define PKT_TX_IPV6 PKT_RX_IPV6_HDR
> > > +#define PKT_TX_IPV6 (1ULL << 57)
> >
> > The description in comment does not match the description in the cover letter.
> >
> > Also, I think replacing PKT_RX_IPV[46]_HDR by the value may go in another commit.
> >
> >
> > > -#define PKT_TX_VLAN_PKT (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
> > > +/** Outer IP cksum of TX pkt. computed by NIC for tunneling packet */
> > > +#define PKT_TX_OUTER_IP_CKSUM (1ULL << 58)
> > > +#define PKT_TX_OUTER_IPV4_CSUM PKT_TX_OUTER_IP_CKSUM /**< Alias of
> > > +PKT_TX_OUTER_IP_CKSUM. */
> >
> > Why do we need an alias?
> >
> > By the way, I think the alias of PKT_TX_IP_CKSUM is also uneeded and can be removed. But it's not the topic of your series.
> >
> > Also, the name PKT_TX_OUTER_IP_CKSUM does not match the name in the cover letter and commit logs.
> >
> >
> > > +
> > > +/** Tell the NIC it's an outer IPv6 packet for tunneling packet.*/
> > > +#define PKT_TX_OUTER_IPV6 (1ULL << 59)
> > >
> >
> > This flag is not in the cover letter or commit log. What is its purpose?
>
>
> My bad, forgot that for outer IP, will also need to specify it's type.
> So same story here as for inner IP.
> So in total, we might need 3 flags for outer IP:
>
> /* Tells HW that outer IP is IPV4 and checksum for it should be calculated by HW. */
> PKT_TX_OUTER_IP_CKSUM
>
> /* Tells HW that outer IP is IPV4 and checksum for it should not be calculated by HW. */
> PKT_TX_OUTER_IPV4
>
> /* Tells HW that outer IP is IPV6. */
> PKT_TX_OUTER_IPV6
>
> >
> >
> > > /**
> > > * TCP segmentation offload. To enable this offload feature for a @@
> > > -266,10 +272,9 @@ struct rte_mbuf {
> > > uint64_t tso_segsz:16; /**< TCP TSO segment size */
> > >
> > > /* fields for TX offloading of tunnels */
> > > - uint64_t inner_l3_len:9; /**< inner L3 (IP) Hdr Length. */
> > > - uint64_t inner_l2_len:7; /**< inner L2 (MAC) Hdr Length. */
> > > -
> > > - /* uint64_t unused:8; */
> > > + uint64_t outer_l3_len:9; /**< outer L3 (IP) Hdr Length. */
> > > + uint64_t outer_l2_len:7; /**< outer L2 (MAC) Hdr Length. */
> > > + uint64_t l4_tun_len:8; /**< L4 tunnelling header length */
> > > };
> > > };
> > > } __rte_cache_aligned;
> > >
> >
> > About l4_tun_len, I have another comment I forgot to add in the cover letter. Can we remove it and include its length in
> outer_l2_len
> > instead? For instance, replace:
> >
> > mb->l2_len = eth_hdr_in;
> > mb->l3_len = ipv4_hdr_in;
> > mb->outer_l2_len = eth_hdr_out;
> > mb->outer_l3_len = ipv4_hdr_out;
> > mb->l4tun_len = vxlan_hdr;
> > mb->ol_flags |= PKT_TX_OUT_IP_CKSUM | PKT_TX_UDP_TUNNEL |
> > PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
> >
> > by:
> >
> > mb->l2_len = eth_hdr_in;
> > mb->l3_len = ipv4_hdr_in;
> > mb->outer_l2_len = eth_hdr_out + vxlan_hdr;
> > mb->outer_l3_len = ipv4_hdr_out;
> > mb->ol_flags |= PKT_TX_OUT_IP_CKSUM | PKT_TX_UDP_TUNNEL |
> > PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
> >
> > I think it won't bother the driver, and it's coherent with case B.2 of your cover letter.
>
> You probably meant:
> mb->l2_len = eth_hdr_in + vxlan_hdr;
> ?
> Yes, I think it could be done that way too.
> Though I still prefer to keep l4tun_len - it makes things a bit cleaner (at least to me).
> After all we do have space for it in mbuf's tx_offload.
As one more thing in favour of separate l4tun_len field:
l2_len is 7 bit long, so in theory it might be not enough, as for FVL:
12:18 L4TUNLEN L4 Tunneling Length (Teredo / GRE header / VXLAN header) defined in Words.
> Konstantin
>
> >
> > Regards,
> > Olivier
next prev parent reply other threads:[~2014-11-27 17:01 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-27 8:18 [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework Jijiang Liu
2014-11-27 8:18 ` [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields Jijiang Liu
2014-11-27 10:00 ` Olivier MATZ
2014-11-27 13:14 ` Liu, Jijiang
2014-11-28 9:17 ` Olivier MATZ
[not found] ` <1ED644BD7E0A5F4091CF203DAFB8E4CC01D9EEA0@SHSMSX101.ccr.corp.intel.com>
2014-11-27 14:56 ` Ananyev, Konstantin
2014-11-27 17:01 ` Ananyev, Konstantin [this message]
2014-11-28 10:45 ` Olivier MATZ
2014-11-28 11:16 ` Ananyev, Konstantin
2014-11-30 14:50 ` Ananyev, Konstantin
2014-12-01 2:30 ` Liu, Jijiang
2014-12-01 9:52 ` Olivier MATZ
2014-12-01 11:58 ` Ananyev, Konstantin
2014-12-01 12:28 ` Olivier MATZ
2014-12-01 13:07 ` Liu, Jijiang
2014-12-01 14:31 ` Ananyev, Konstantin
2014-11-27 8:18 ` [dpdk-dev] [PATCH 2/3] i40e:PMD change for VXLAN TX checksum Jijiang Liu
2014-11-27 8:18 ` [dpdk-dev] [PATCH 3/3] testpmd:rework csum forward engine Jijiang Liu
2014-11-27 10:23 ` Olivier MATZ
2014-11-27 8:50 ` [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework Liu, Jijiang
2014-11-27 9:44 ` Olivier MATZ
2014-11-27 10:12 ` Olivier MATZ
2014-11-27 12:06 ` Liu, Jijiang
2014-11-27 12:07 ` Liu, Jijiang
2014-11-27 15:29 ` Ananyev, Konstantin
2014-11-27 16:31 ` Liu, Jijiang
2014-12-03 8:02 ` Liu, Jijiang
2014-11-28 9:26 ` Olivier MATZ
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2601191342CEEE43887BDE71AB977258213BAE90@IRSMSX105.ger.corp.intel.com \
--to=konstantin.ananyev@intel.com \
--cc=dev@dpdk.org \
--cc=jijiang.liu@intel.com \
--cc=olivier.matz@6wind.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).