From: "Ananyev, Konstantin"
To: "Ananyev, Konstantin", "Liu, Jijiang", "Olivier Matz (olivier.matz@6wind.com)"
Cc: "dev@dpdk.org"
Date: Thu, 27 Nov 2014 17:01:33 +0000
Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
Message-ID: <2601191342CEEE43887BDE71AB977258213BAE90@IRSMSX105.ger.corp.intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB977258213BADB8@IRSMSX105.ger.corp.intel.com>
References: <1417076319-629-1-git-send-email-jijiang.liu@intel.com>
 <1417076319-629-2-git-send-email-jijiang.liu@intel.com>
 <5476F626.2020708@6wind.com>
 <1ED644BD7E0A5F4091CF203DAFB8E4CC01D9EEA0@SHSMSX101.ccr.corp.intel.com>
 <2601191342CEEE43887BDE71AB977258213BADB8@IRSMSX105.ger.corp.intel.com>

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Thursday, November 27, 2014 2:56 PM
> To: Liu, Jijiang; Olivier Matz (olivier.matz@6wind.com)
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
>
>
>
> >
> > -----Original Message-----
> > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> > Sent: Thursday, November 27, 2014 6:00 PM
> > To: Liu, Jijiang; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
> >
> > Hi Jijiang,
> >
> > Please see some comments below.
> >
> > On 11/27/2014 09:18 AM, Jijiang Liu wrote:
> > > In place of removing the PKT_TX_VXLAN_CKSUM, we introduce 2 new flags: PKT_TX_OUT_IP_CKSUM, PKT_TX_UDP_TUNNEL_PKT, and a new field: l4_tun_len.
> > > Replace the inner_l2_len and the inner_l3_len field with the outer_l2_len and outer_l3_len field.
> > >
> > > PKT_TX_OUT_IP_CKSUM: is not used for non-tunnelling packet;hardware outer checksum for tunnelling packet.
> > > PKT_TX_UDP_TUNNEL_PKT: is used to tell PMD that the transmit packet is a UDP tunneling packet.
> > > l4_tun_len: for VXLAN packet, it should be udp header length plus VXLAN header length.
> > >
> > > Signed-off-by: Jijiang Liu
> > > ---
> > >  lib/librte_mbuf/rte_mbuf.c |  2 +-
> > >  lib/librte_mbuf/rte_mbuf.h | 23 ++++++++++++++---------
> > >  2 files changed, 15 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > index 87c2963..e89c310 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > @@ -240,7 +240,7 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > >      case PKT_TX_SCTP_CKSUM: return "PKT_TX_SCTP_CKSUM";
> > >      case PKT_TX_UDP_CKSUM: return "PKT_TX_UDP_CKSUM";
> > >      case PKT_TX_IEEE1588_TMST: return "PKT_TX_IEEE1588_TMST";
> > > -    case PKT_TX_VXLAN_CKSUM: return "PKT_TX_VXLAN_CKSUM";
> > > +    case PKT_TX_UDP_TUNNEL_PKT: return "PKT_TX_UDP_TUNNEL_PKT";
> > >      case PKT_TX_TCP_SEG: return "PKT_TX_TCP_SEG";
> > >      default: return NULL;
> >
> > As I said as a reply to the cover letter, I suggest to use PKT_TX_OUT_UDP_CKSUM instead of PKT_TX_UDP_TUNNEL_PKT.
>
> HW don't support outer L4 checksum offload.
> But to calculate inner checksums correctly, it needs a hint from SW about L4 Tunneling Type.
> Currently the following values are recognised by HW:
>
> L4 Tunneling Type (Teredo / GRE header / VXLAN header) indication:
> 00b - No UDP / GRE tunneling (field must be set to zero if EIPT equals to zero)
> 01b - UDP tunneling header (any UDP tunneling, VXLAN and Geneve).
> 10b - GRE tunneling header
> Else - reserved
>
> You can check yourself:
> http://www.intel.com/content/www/us/en/embedded/products/networking/xl710-10-40-controller-datasheet.html
> Sections 8.4.2.2.1 and 8.4.4.2
>
> >
> > Also, the PKT_TX_OUT_IP_CKSUM case is missing here.
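
As a side illustration of the L4 Tunneling Type encoding quoted above, here is a minimal sketch in C. The enum and helper names (l4_tunnel_type, l4_tunnel_type_from_flags) are hypothetical and are not part of DPDK or of any driver; only the 2-bit values and the (1ULL << 50) flag bit come from the text and the patch quoted in this thread. It assumes a PMD would map the PKT_TX_UDP_TUNNEL_PKT flag onto the 01b value; a real driver may well do this differently.

```c
#include <stdint.h>

/* Hypothetical names; the values follow the 2-bit encoding quoted above. */
enum l4_tunnel_type {
    L4_TUN_NONE = 0x0, /* 00b: no UDP / GRE tunneling */
    L4_TUN_UDP  = 0x1, /* 01b: UDP tunneling header (VXLAN, Geneve, ...) */
    L4_TUN_GRE  = 0x2  /* 10b: GRE tunneling header */
    /* all other values are reserved */
};

/* Flag bit as proposed in the patch under review. */
#define PKT_TX_UDP_TUNNEL_PKT (1ULL << 50)

/*
 * ASSUMPTION: the flag is the only hint SW gives about the tunnel type,
 * so it is translated directly to the "UDP tunneling" value.
 */
static inline enum l4_tunnel_type
l4_tunnel_type_from_flags(uint64_t ol_flags)
{
    return (ol_flags & PKT_TX_UDP_TUNNEL_PKT) ? L4_TUN_UDP : L4_TUN_NONE;
}
```

The point of the sketch is only that the flag carries a packet-type hint, not a request for an outer UDP checksum.
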
> >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 367fc56..48cd8e1 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -99,10 +99,9 @@ extern "C" {
> > >  #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 header. */
> > >  #define PKT_RX_FDIR_ID       (1ULL << 13) /**< FD id reported if FDIR match. */
> > >  #define PKT_RX_FDIR_FLX      (1ULL << 14) /**< Flexible bytes reported if FDIR match. */
> > > -/* add new RX flags here */
> > >
> >
> > We should probably not remove this line.
> >
> >
> > >  /* add new TX flags here */
> > > -#define PKT_TX_VXLAN_CKSUM   (1ULL << 50) /**< TX checksum of VXLAN computed by NIC */
> > > +#define PKT_TX_UDP_TUNNEL_PKT (1ULL << 50) /**< TX packet is an UDP tunneling packet */
> > >  #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to timestamp. */
> > >
> > >  /**
> > > @@ -125,13 +124,20 @@ extern "C" {
> > >  #define PKT_TX_IP_CKSUM      (1ULL << 54) /**< IP cksum of TX pkt. computed by NIC. */
> > >  #define PKT_TX_IPV4_CSUM     PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
> > >
> > > +#define PKT_TX_VLAN_PKT      (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
> > > +
> > >  /** Tell the NIC it's an IPv4 packet. Required for L4 checksum offload or TSO. */
> > > -#define PKT_TX_IPV4          PKT_RX_IPV4_HDR
> > > +#define PKT_TX_IPV4          (1ULL << 56)
> > >
> > >  /** Tell the NIC it's an IPv6 packet. Required for L4 checksum offload or TSO. */
> > > -#define PKT_TX_IPV6          PKT_RX_IPV6_HDR
> > > +#define PKT_TX_IPV6          (1ULL << 57)
> >
> > The description in comment does not match the description in the cover letter.
> >
> > Also, I think replacing PKT_RX_IPV[46]_HDR by the value may go in another commit.
> >
> >
> > > -#define PKT_TX_VLAN_PKT      (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
> > > +/** Outer IP cksum of TX pkt. computed by NIC for tunneling packet */
> > > +#define PKT_TX_OUTER_IP_CKSUM   (1ULL << 58)
> > > +#define PKT_TX_OUTER_IPV4_CSUM  PKT_TX_OUTER_IP_CKSUM /**< Alias of PKT_TX_OUTER_IP_CKSUM. */
> >
> > Why do we need an alias?
> >
> > By the way, I think the alias of PKT_TX_IP_CKSUM is also uneeded and can be removed. But it's not the topic of your series.
> >
> > Also, the name PKT_TX_OUTER_IP_CKSUM does not match the name in the cover letter and commit logs.
> >
> >
> > > +
> > > +/** Tell the NIC it's an outer IPv6 packet for tunneling packet.*/
> > > +#define PKT_TX_OUTER_IPV6    (1ULL << 59)
> > >
> >
> > This flag is not in the cover letter or commit log. What is its purpose?
>
>
> My bad, forgot that for outer IP, will also need to specify it's type.
> So same story here as for inner IP.
> So in total, we might need 3 flags for outer IP:
>
> /* Tells HW that outer IP is IPV4 and checksum for it should be calculated by HW. */
> PKT_TX_OUTER_IP_CKSUM
>
> /* Tells HW that outer IP is IPV4 and checksum for it should not be calculated by HW. */
> PKT_TX_OUTER_IPV4
>
> /* Tells HW that outer IP is IPV6. */
> PKT_TX_OUTER_IPV6
>
> >
> > >
> > >  /**
> > >   * TCP segmentation offload. To enable this offload feature for a
> > > @@ -266,10 +272,9 @@ struct rte_mbuf {
> > >              uint64_t tso_segsz:16; /**< TCP TSO segment size */
> > >
> > >              /* fields for TX offloading of tunnels */
> > > -            uint64_t inner_l3_len:9; /**< inner L3 (IP) Hdr Length. */
> > > -            uint64_t inner_l2_len:7; /**< inner L2 (MAC) Hdr Length. */
> > > -
> > > -            /* uint64_t unused:8; */
> > > +            uint64_t outer_l3_len:9; /**< outer L3 (IP) Hdr Length. */
> > > +            uint64_t outer_l2_len:7; /**< outer L2 (MAC) Hdr Length. */
> > > +            uint64_t l4_tun_len:8; /**< L4 tunnelling header length */
> > >          };
> > >      };
> > >  } __rte_cache_aligned;
> > >
> >
> > About l4_tun_len, I have another comment I forgot to add in the cover letter.
> > Can we remove it and include its length in outer_l2_len instead? For instance, replace:
> >
> >   mb->l2_len = eth_hdr_in;
> >   mb->l3_len = ipv4_hdr_in;
> >   mb->outer_l2_len = eth_hdr_out;
> >   mb->outer_l3_len = ipv4_hdr_out;
> >   mb->l4tun_len = vxlan_hdr;
> >   mb->ol_flags |= PKT_TX_OUT_IP_CKSUM | PKT_TX_UDP_TUNNEL |
> >     PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
> >
> > by:
> >
> >   mb->l2_len = eth_hdr_in;
> >   mb->l3_len = ipv4_hdr_in;
> >   mb->outer_l2_len = eth_hdr_out + vxlan_hdr;
> >   mb->outer_l3_len = ipv4_hdr_out;
> >   mb->ol_flags |= PKT_TX_OUT_IP_CKSUM | PKT_TX_UDP_TUNNEL |
> >     PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
> >
> > I think it won't bother the driver, and it's coherent with case B.2 of your cover letter.
>
> You probably meant:
> mb->l2_len = eth_hdr_in + vxlan_hdr;
> ?
> Yes, I think it could be done that way too.
> Though I still prefer to keep l4tun_len - it makes things a bit cleaner (at least to me).
> After all we do have space for it in mbuf's tx_offload.

One more argument in favour of a separate l4tun_len field:
l2_len is only 7 bits long, so in theory it might not be enough, given that for FVL the corresponding descriptor field is:
12:18 L4TUNLEN L4 Tunneling Length (Teredo / GRE header / VXLAN header) defined in Words.

> Konstantin
>
> >
> > Regards,
> > Olivier
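
To make the width argument above concrete, here is a rough sketch in C of the arithmetic. The helper names (bitfield_max, l2_len_max_bytes, l4tunlen_max_bytes, fits_in_l2_len) are hypothetical and exist only for this illustration. The 7-bit width of l2_len and the 12:18 bit range of L4TUNLEN are taken from the thread; the size of a "word" (2 bytes here) is an assumption that should be checked against the datasheet.

```c
#include <stdint.h>

/* Width of the mbuf l2_len field, as stated in this thread. */
#define L2_LEN_BITS         7u
/* Width of the descriptor L4TUNLEN field: bits 12..18 inclusive. */
#define L4TUNLEN_BITS       (18u - 12u + 1u)
/* ASSUMPTION: one "word" is 2 bytes; verify against the datasheet. */
#define L4TUNLEN_WORD_BYTES 2u

/* Largest value that fits in an unsigned bit-field of the given width. */
static inline uint32_t bitfield_max(uint32_t bits)
{
    return (1u << bits) - 1u;
}

/* l2_len is expressed in bytes. */
static inline uint32_t l2_len_max_bytes(void)
{
    return bitfield_max(L2_LEN_BITS);
}

/* L4TUNLEN is expressed in words, so the byte range is larger. */
static inline uint32_t l4tunlen_max_bytes(void)
{
    return bitfield_max(L4TUNLEN_BITS) * L4TUNLEN_WORD_BYTES;
}

/* Returns 1 if inner L2 header plus tunnel header still fits into l2_len. */
static inline int fits_in_l2_len(uint32_t inner_eth_len, uint32_t tun_hdr_len)
{
    return inner_eth_len + tun_hdr_len <= l2_len_max_bytes();
}
```

Under these assumptions, a 14-byte inner Ethernet header plus 16 bytes of UDP and VXLAN header fits comfortably, so the limit only matters for much longer tunnel headers; that is why it is an "in theory" concern rather than a problem for plain VXLAN.
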