From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
	by dpdk.org (Postfix) with ESMTP id EBB5C2A9
	for ; Thu, 27 Nov 2014 16:30:05 +0100 (CET)
Received: from orsmga002.jf.intel.com ([10.7.209.21])
	by orsmga101.jf.intel.com with ESMTP; 27 Nov 2014 07:30:04 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.07,470,1413270000"; d="scan'208";a="644597608"
Received: from irsmsx102.ger.corp.intel.com ([163.33.3.155])
	by orsmga002.jf.intel.com with ESMTP; 27 Nov 2014 07:30:02 -0800
Received: from irsmsx105.ger.corp.intel.com ([169.254.7.144])
	by IRSMSX102.ger.corp.intel.com ([169.254.2.93]) with mapi id 14.03.0195.001;
	Thu, 27 Nov 2014 15:30:01 +0000
From: "Ananyev, Konstantin"
To: Olivier MATZ, "Liu, Jijiang", "dev@dpdk.org"
Thread-Topic: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
Thread-Index: AQHQChvQ5FrFR2KMP06/Q61mQ+hfUJx0OVeAgABbTqA=
Date: Thu, 27 Nov 2014 15:29:59 +0000
Message-ID: <2601191342CEEE43887BDE71AB977258213BADE4@IRSMSX105.ger.corp.intel.com>
References: <1417076319-629-1-git-send-email-jijiang.liu@intel.com>
	<5476F28F.7010802@6wind.com>
In-Reply-To: <5476F28F.7010802@6wind.com>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [163.33.239.182]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 27 Nov 2014 15:30:06 -0000

Hi Olivier,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier MATZ
> Sent: Thursday, November 27, 2014 9:45 AM
> To: Liu, Jijiang; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
>
> Hi Jijiang,
>
> Please find below some comments about the specifications. The global
> picture looks fine to me.
>
> I've not reviewed the patch right now, but it's in the pipe.
>
> On 11/27/2014 09:18 AM, Jijiang Liu wrote:
> > We have got some feedback about backward compatibility of VXLAN TX checksum offload API with 1G/10G NIC after the i40e VXLAN
> > TX checksum codes were applied, so we have to rework the APIs on i40e, including the changes of mbuf, i40e PMD and csum engine.
> >
> > The main changes in mbuf are as follows,
> > In place of removing PKT_TX_VXLAN_CKSUM, we are introducing 2 new flags: PKT_TX_OUT_IP_CKSUM, PKT_TX_UDP_TUNNEL_PKT,
> > and a new field: l4_tun_len.
>
> What about PKT_TX_OUT_UDP_CKSUM instead of PKT_TX_UDP_TUNNEL_PKT? It's
> maybe more coherent with the other names.

FVL HW doesn't support outer L4 checksum offload.
But to calculate inner checksums correctly, it needs a hint from SW about the L4 tunnelling type.

>
>
> > Replace the inner_l2_len and the inner_l3_len field with the outer_l2_len and outer_l3_len field.
> >
> > The existing flags are listed below,
> > PKT_TX_IP_CKSUM:   HW IPv4 checksum for non-tunnelling packet / HW inner IPv4 checksum for tunnelling packet
> > PKT_TX_TCP_CKSUM:  HW TCP checksum for non-tunnelling packet / HW inner TCP checksum for tunnelling packet
> > PKT_TX_SCTP_CKSUM: HW SCTP checksum for non-tunnelling packet / HW inner SCTP checksum for tunnelling packet
> > PKT_TX_UDP_CKSUM:  HW UDP checksum for non-tunnelling packet / HW inner UDP checksum for tunnelling packet
> > PKT_TX_IPV4:       IPv4 with no HW checksum offload for non-tunnelling packet / inner IPv4 with no HW checksum offload for
> >                    tunnelling packet
> > PKT_TX_IPV6:       IPv6 non-tunnelling packet / inner IPv6 with no HW checksum offload for tunnelling packet
>
> As I suggested in the TSO thread, I think the following semantics
> is easier to understand for the user:
>
> - PKT_TX_IP_CKSUM: tell the NIC to compute IP cksum
>
> - PKT_TX_IPV4: tell the NIC it's an IPv4 packet. Required for L4
>   checksum offload or TSO.
>
> - PKT_TX_IPV6: tell the NIC it's an IPv6 packet. Required for L4
>   checksum offload or TSO.
>
> I think it won't make a big difference in the FVL driver.

No, no big difference here, but I still think it would be a bit cleaner if all 3 flags were mutually exclusive.
In fact, we can unite all 3 of them into 2 bits, same as we are doing for L4 checksum flags.

>
>
> > let's use a few examples to demonstrate how to use these flags:
> > Let's say we have a tunnel packet: eth_hdr_out/ipv4_hdr_out/udp_hdr_out/vxlan_hdr/eth_hdr_in/ipv4_hdr_in/tcp_hdr_in. There
> > could be several scenarios:
> >
> > A) User requests HW offload for ipv4_hdr_out checksum.
> > He doesn't care whether it is a tunnelled packet or not.
> > So he sets:
> >
> >     mb->l2_len = eth_hdr_out;
> >     mb->l3_len = ipv4_hdr_out;
> >     mb->ol_flags |= PKT_TX_IP_CKSUM;
> >
> > B) User is aware that it is a tunnelled packet and requests HW offload for ipv4_hdr_in and tcp_hdr_in *only*.
> > He doesn't care about outer IP checksum offload.
> > In that case, for FVL he has 2 choices:
> > 1. Treat that packet as a 'proper' tunnelled packet, and fill all the fields:
> >     mb->l2_len = eth_hdr_in;
> >     mb->l3_len = ipv4_hdr_in;
> >     mb->outer_l2_len = eth_hdr_out;
> >     mb->outer_l3_len = ipv4_hdr_out;
> >     mb->l4_tun_len = vxlan_hdr;
> >     mb->ol_flags |= PKT_TX_UDP_TUNNEL_PKT | PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
> >
> > 2. As the user doesn't care about outer IP hdr checksum, he can treat everything before ipv4_hdr_in as L2 header.
> > So he knows that it is a tunnelled packet, but makes HW treat it as an ordinary (non-tunnelled) packet:
> >     mb->l2_len = eth_hdr_out + ipv4_hdr_out + udp_hdr_out + vxlan_hdr + eth_hdr_in;
> >     mb->l3_len = ipv4_hdr_in;
> >     mb->ol_flags |= PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
> >
> > i40e PMD will support both B.1 and B.2.
> > ixgbe/igb/em PMD supports only B.2.
> > if HW supports both - it will be up to user app which method to choose.
>
> I think we should have a flag to advertise outer ip and outer udp
> checksum offload support, so the application knows which mode can
> be used.

You mean a new DEV_TX_OFFLOAD_* value, right?
Something like: DEV_TX_OFFLOAD_UDP_TUNNEL?
And make i40e_dev_info_get() return it?
Yes, forgot about it, sounds like a proper thing to do.
Konstantin

>
>
> Regards,
> Olivier
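
For illustration only, the choice between B.1 and B.2 could be driven by the
capability bit discussed at the end of this thread. The sketch below makes
several assumptions: struct sketch_mbuf, struct hdr_lens, request_inner_csum()
and all bit values are hypothetical stand-ins written for this example, not the
real rte_mbuf/rte_ethdev definitions, and DEV_TX_OFFLOAD_UDP_TUNNEL is only the
name floated above, not a confirmed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values; the real flags live in rte_mbuf.h / rte_ethdev.h. */
#define PKT_TX_IP_CKSUM           (1ULL << 0)
#define PKT_TX_TCP_CKSUM          (1ULL << 1)
#define PKT_TX_UDP_TUNNEL_PKT     (1ULL << 2)
#define DEV_TX_OFFLOAD_UDP_TUNNEL (1U << 0)  /* name proposed in this thread */

/* Hypothetical stand-in for the mbuf fields discussed in the thread. */
struct sketch_mbuf {
	uint64_t ol_flags;
	uint16_t l2_len;
	uint16_t l3_len;
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
	uint16_t l4_tun_len;
};

/* Header lengths (bytes) of eth_out/ipv4_out/udp_out/vxlan/eth_in/ipv4_in. */
struct hdr_lens {
	uint16_t eth_out, ip_out, udp_out, vxlan, eth_in, ip_in;
};

/*
 * Request inner IPv4 + TCP checksum offload for a VXLAN packet.
 * If the port advertises tunnel support, describe the packet as a
 * 'proper' tunnel (B.1); otherwise fold every header that precedes
 * the inner IPv4 header into l2_len (B.2).
 */
static void
request_inner_csum(struct sketch_mbuf *mb, const struct hdr_lens *h,
		   uint32_t tx_offload_capa)
{
	assert(mb != NULL && h != NULL);

	if (tx_offload_capa & DEV_TX_OFFLOAD_UDP_TUNNEL) {
		/* B.1: fill inner and outer lengths plus the tunnel hint. */
		mb->l2_len = h->eth_in;
		mb->l3_len = h->ip_in;
		mb->outer_l2_len = h->eth_out;
		mb->outer_l3_len = h->ip_out;
		mb->l4_tun_len = h->vxlan;
		mb->ol_flags |= PKT_TX_UDP_TUNNEL_PKT | PKT_TX_IP_CKSUM |
				PKT_TX_TCP_CKSUM;
	} else {
		/* B.2: HW sees an ordinary packet with a long "L2 header". */
		mb->l2_len = (uint16_t)(h->eth_out + h->ip_out + h->udp_out +
					h->vxlan + h->eth_in);
		mb->l3_len = h->ip_in;
		mb->ol_flags |= PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
	}
}
```

With such a helper an application would read the capability once per port and
pass it in for every packet; on a port without the bit (the ixgbe/igb/em case
described above) the code falls back to B.2 and never sets the tunnel flag.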