From: "Liu, Jijiang"
To: Thomas Monjalon
Cc: "'dev@dpdk.org'"
Date: Wed, 3 Dec 2014 08:02:01 +0000
Subject: Re: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
Message-ID: <1ED644BD7E0A5F4091CF203DAFB8E4CC01D9FD8A@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <1ED644BD7E0A5F4091CF203DAFB8E4CC01D9EF72@SHSMSX101.ccr.corp.intel.com>

Hi Thomas,

> -----Original Message-----
> From: Liu, Jijiang
> Sent: Friday, November 28, 2014 12:32 AM
> To: Olivier MATZ
> Cc: Ananyev, Konstantin; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
>
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Thursday, November 27, 2014 11:30 PM
> > To: Olivier MATZ; Liu, Jijiang; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
> >
> > Hi Olivier,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier MATZ
> > > Sent: Thursday, November 27, 2014 9:45 AM
> > > To: Liu, Jijiang; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
> > >
> > > Hi Jijiang,
> > >
> > > Please find below some comments about the specifications. The global
> > > picture looks fine to me.
> > >
> > > I've not reviewed the patch right now, but it's in the pipe.
> > >
> > > On 11/27/2014 09:18 AM, Jijiang Liu wrote:
> > > > We have got some feedback about backward compatibility of the VXLAN TX
> > > > checksum offload API with 1G/10G NICs after the i40e VXLAN TX checksum
> > > > code was applied, so we have to rework the APIs on i40e, including
> > > > the changes to mbuf, the i40e PMD and the csum engine.
> > > >
> > > > The main changes in mbuf are as follows.
> > > > In place of removing PKT_TX_VXLAN_CKSUM, we are introducing 2 new flags:
> > > > PKT_TX_OUT_IP_CKSUM, PKT_TX_UDP_TUNNEL_PKT, and a new field: l4_tun_len.
> > >
> > > What about PKT_TX_OUT_UDP_CKSUM instead of PKT_TX_UDP_TUNNEL_PKT? It's
> > > maybe more coherent with the other names.
> >
> > FVL HW doesn't support outer L4 checksum offload.
> > But to calculate inner checksums correctly, it needs a hint from SW
> > about the L4 tunnelling type.
> > > >
> > > > Replace the inner_l2_len and inner_l3_len fields with the
> > > > outer_l2_len and outer_l3_len fields.
> > > >
> > > > The existing flags are listed below:
> > > > PKT_TX_IP_CKSUM:   HW IPv4 checksum for non-tunnelling packet /
> > > >                    HW inner IPv4 checksum for tunnelling packet
> > > > PKT_TX_TCP_CKSUM:  HW TCP checksum for non-tunnelling packet /
> > > >                    HW inner TCP checksum for tunnelling packet
> > > > PKT_TX_SCTP_CKSUM: HW SCTP checksum for non-tunnelling packet /
> > > >                    HW inner SCTP checksum for tunnelling packet
> > > > PKT_TX_UDP_CKSUM:  HW UDP checksum for non-tunnelling packet /
> > > >                    HW inner UDP checksum for tunnelling packet
> > > > PKT_TX_IPV4:       IPv4 with no HW checksum offload for non-tunnelling packet /
> > > >                    inner IPv4 with no HW checksum offload for tunnelling packet
> > > > PKT_TX_IPV6:       IPv6 non-tunnelling packet /
> > > >                    inner IPv6 with no HW checksum offload for tunnelling packet
> > >
> > > As I suggested in the TSO thread, I think the following semantics are
> > > easier to understand for the user:
> > >
> > > - PKT_TX_IP_CKSUM: tell the NIC to compute IP cksum
> > >
> > > - PKT_TX_IPV4: tell the NIC it's an IPv4 packet. Required for L4
> > >   checksum offload or TSO.
> > >
> > > - PKT_TX_IPV6: tell the NIC it's an IPv6 packet. Required for L4
> > >   checksum offload or TSO.
> > >
> > > I think it won't make a big difference in the FVL driver.
> >
> > No, no big difference here, but I still think it will be a bit cleaner
> > if all 3 flags were mutually exclusive.
> > In fact, we can unite all 3 of them into 2 bits, the same as we are doing
> > for the L4 checksum flags.
> >
> > > > Let's use a few examples to demonstrate how to use these flags.
> > > > Let's say we have a tunnel packet:
> > > > eth_hdr_out/ipv4_hdr_out/udp_hdr_out/vxlan_hdr/eth_hdr_in/ipv4_hdr_in/tcp_hdr_in.
> > > > There could be several scenarios:
> > > >
> > > > A) User requests HW offload for ipv4_hdr_out checksum.
> > > > He doesn't care whether it is a tunnelled packet or not.
> > > > So he sets:
> > > >
> > > > mb->l2_len = eth_hdr_out;
> > > > mb->l3_len = ipv4_hdr_out;
> > > > mb->ol_flags |= PKT_TX_IP_CKSUM;
> > > >
> > > > B) User is aware that it is a tunnelled packet and requests HW
> > > > offload for ipv4_hdr_in and tcp_hdr_in *only*.
> > > > He doesn't care about outer IP checksum offload.
> > > > In that case, for FVL he has 2 choices:
> > > > 1. Treat that packet as a 'proper' tunnelled packet, and fill all the fields:
> > > > mb->l2_len = eth_hdr_in;
> > > > mb->l3_len = ipv4_hdr_in;
> > > > mb->outer_l2_len = eth_hdr_out;
> > > > mb->outer_l3_len = ipv4_hdr_out;
> > > > mb->l4tun_len = vxlan_hdr;
> > > > mb->ol_flags |= PKT_TX_UDP_TUNNEL_PKT | PKT_TX_IP_CKSUM |
> > > >                 PKT_TX_TCP_CKSUM;
> > > >
> > > > 2. As the user doesn't care about the outer IP hdr checksum, he can
> > > > treat everything before ipv4_hdr_in as the L2 header.
> > > > So he knows that it is a tunnelled packet, but makes HW treat it
> > > > as an ordinary (non-tunnelled) packet:
> > > > mb->l2_len = eth_hdr_out + ipv4_hdr_out + udp_hdr_out +
> > > >              vxlan_hdr + eth_hdr_in;
> > > > mb->l3_len = ipv4_hdr_in;
> > > > mb->ol_flags |= PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
> > > >
> > > > i40e PMD will support both B.1 and B.2.
> > > > ixgbe/igb/em PMD supports only B.2.
> > > > If HW supports both, it will be up to the user app which method to choose.
> > >
> > > I think we should have a flag to advertise outer ip and outer udp
> > > checksum offload support, so the application knows which mode can be
> > > used.
> >
> > You mean a new DEV_TX_OFFLOAD_* value, right?
> > Something like: DEV_TX_OFFLOAD_UDP_TUNNEL?
> > And make i40e_dev_info_get() return it?
> > Yes, forgot about it, sounds like a proper thing to do.
>
> Yes, makes sense, I will send a separate patch (bug fix) to do this. Thanks.

I'm preparing this patch and will send it out soon. I hope it can also be included in DPDK 1.8.

Thanks.

> > Konstantin
> >
> > >
> > > Regards,
> > > Olivier
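
For readers following the thread, the two options in scenario B can be sketched in C as below. This is only an illustration under stated assumptions: struct demo_mbuf, the DEMO_* flag bits and the fill_* helpers are hypothetical stand-ins that mirror the field and flag names used in this discussion; they are not the real rte_mbuf definitions, and the bit values are arbitrary. Header sizes assume option-less IPv4 and untagged Ethernet. As in the B.1 example above, only the VXLAN header length goes into the tunnel length field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; names follow the thread, values are arbitrary. */
#define DEMO_TX_IP_CKSUM       (1ULL << 0)
#define DEMO_TX_TCP_CKSUM      (1ULL << 1)
#define DEMO_TX_UDP_TUNNEL_PKT (1ULL << 2)

/* Hypothetical stand-in for the mbuf fields discussed in the thread. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint16_t l2_len;
	uint16_t l3_len;
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
	uint16_t l4_tun_len;
};

/* Assumed header sizes: untagged Ethernet, IPv4 without options. */
enum {
	ETH_HDR_LEN = 14,
	IPV4_HDR_LEN = 20,
	UDP_HDR_LEN = 8,
	VXLAN_HDR_LEN = 8
};

/* B.1: describe the packet as a proper tunnelled packet. */
static void fill_b1(struct demo_mbuf *mb)
{
	assert(mb != NULL);
	*mb = (struct demo_mbuf){0};
	mb->l2_len = ETH_HDR_LEN;        /* eth_hdr_in */
	mb->l3_len = IPV4_HDR_LEN;       /* ipv4_hdr_in */
	mb->outer_l2_len = ETH_HDR_LEN;  /* eth_hdr_out */
	mb->outer_l3_len = IPV4_HDR_LEN; /* ipv4_hdr_out */
	mb->l4_tun_len = VXLAN_HDR_LEN;  /* vxlan_hdr */
	mb->ol_flags |= DEMO_TX_UDP_TUNNEL_PKT | DEMO_TX_IP_CKSUM |
			DEMO_TX_TCP_CKSUM;
}

/* B.2: fold everything before ipv4_hdr_in into the L2 length. */
static void fill_b2(struct demo_mbuf *mb)
{
	assert(mb != NULL);
	*mb = (struct demo_mbuf){0};
	mb->l2_len = ETH_HDR_LEN + IPV4_HDR_LEN + UDP_HDR_LEN +
		     VXLAN_HDR_LEN + ETH_HDR_LEN;
	mb->l3_len = IPV4_HDR_LEN;       /* ipv4_hdr_in */
	mb->ol_flags |= DEMO_TX_IP_CKSUM | DEMO_TX_TCP_CKSUM;
}

/*
 * Pick B.1 when the port advertises tunnel offload support (for example
 * through a capability bit such as the DEV_TX_OFFLOAD_UDP_TUNNEL value
 * proposed above), otherwise fall back to B.2.
 */
static void fill_inner_csum(struct demo_mbuf *mb, int tunnel_offload_supported)
{
	if (tunnel_offload_supported)
		fill_b1(mb);
	else
		fill_b2(mb);
}
```

With these assumed sizes, B.2 ends up with l2_len = 64 and l3_len = 20, while B.1 keeps the inner and outer lengths separate. An application would call fill_inner_csum() with the result of its own capability check, which is exactly the information the proposed DEV_TX_OFFLOAD_* value is meant to provide.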