From: Jijiang Liu
To: dev@dpdk.org
Date: Thu, 27 Nov 2014 16:18:36 +0800
Message-Id: <1417076319-629-1-git-send-email-jijiang.liu@intel.com>
Subject: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework

We have received some feedback about backward compatibility of the VXLAN
TX checksum offload API with 1G/10G NICs after the i40e VXLAN TX checksum
code was applied, so we have to rework the APIs on i40e, including changes
to mbuf, the i40e PMD and the csum forward engine.

The main changes in mbuf are as follows.

In place of PKT_TX_VXLAN_CKSUM, we introduce 2 new flags,
PKT_TX_OUT_IP_CKSUM and PKT_TX_UDP_TUNNEL_PKT, and a new field, l4_tun_len.
The inner_l2_len and inner_l3_len fields are replaced with the outer_l2_len
and outer_l3_len fields.

The existing flags are listed below:

  PKT_TX_IP_CKSUM:   HW IPv4 checksum for non-tunnelling packet /
                     HW inner IPv4 checksum for tunnelling packet
  PKT_TX_TCP_CKSUM:  HW TCP checksum for non-tunnelling packet /
                     HW inner TCP checksum for tunnelling packet
  PKT_TX_SCTP_CKSUM: HW SCTP checksum for non-tunnelling packet /
                     HW inner SCTP checksum for tunnelling packet
  PKT_TX_UDP_CKSUM:  HW UDP checksum for non-tunnelling packet /
                     HW inner UDP checksum for tunnelling packet
  PKT_TX_IPV4:       IPv4 with no HW checksum offload for non-tunnelling packet /
                     inner IPv4 with no HW checksum offload for tunnelling packet
  PKT_TX_IPV6:       IPv6 non-tunnelling packet /
                     inner IPv6 with no HW checksum offload for tunnelling packet

Let's use a few examples to demonstrate how to use these flags.
Let's say we have a tunnel packet:
eth_hdr_out/ipv4_hdr_out/udp_hdr_out/vxlan_hdr/eth_hdr_in/ipv4_hdr_in/tcp_hdr_in.
There could be several scenarios:

A) User requests HW offload for the ipv4_hdr_out checksum.
He doesn't care whether it is a tunnelled packet or not.
So he sets:

    mb->l2_len = eth_hdr_out;
    mb->l3_len = ipv4_hdr_out;
    mb->ol_flags |= PKT_TX_IP_CKSUM;

B) User is aware that it is a tunnelled packet and requests HW offload for
ipv4_hdr_in and tcp_hdr_in *only*.
He doesn't care about outer IP checksum offload.
In that case, for FVL he has 2 choices:

1. Treat that packet as a 'proper' tunnelled packet, and fill all the fields:

    mb->l2_len = eth_hdr_in;
    mb->l3_len = ipv4_hdr_in;
    mb->outer_l2_len = eth_hdr_out;
    mb->outer_l3_len = ipv4_hdr_out;
    mb->l4_tun_len = vxlan_hdr;
    mb->ol_flags |= PKT_TX_UDP_TUNNEL_PKT | PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;

2. As the user doesn't care about the outer IP hdr checksum, he can treat
everything before ipv4_hdr_in as the L2 header.
So he knows that it is a tunnelled packet, but makes HW treat it as an
ordinary (non-tunnelled) packet:

    mb->l2_len = eth_hdr_out + ipv4_hdr_out + udp_hdr_out + vxlan_hdr + eth_hdr_in;
    mb->l3_len = ipv4_hdr_in;
    mb->ol_flags |= PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;

The i40e PMD will support both B.1 and B.2.
The ixgbe/igb/em PMDs support only B.2.
If HW supports both, it will be up to the user app which method to choose.
testpmd will support both methods, and which approach to use should be
configurable by the user (cmdline parameter), so the user can try/test both
methods and select the one appropriate for him.

Now, B.2 is exactly what Oliver suggested.
I think it has a few important advantages over B.1:
First of all - compatibility. It works across all HW we currently support
(i40e/ixgbe/igb/em).
Second - it is probably faster even on FVL, as for it we have to fill only
the TXD, while with B.1 we have to fill both the TCD and the TXD.

C) User knows that it is a tunnelled packet, and wants HW offload for all 3
checksums: outer IP hdr checksum, inner IP checksum, inner TCP checksum.
Then he has to set up all TX checksum fields:

    mb->l2_len = eth_hdr_in;
    mb->l3_len = ipv4_hdr_in;
    mb->outer_l2_len = eth_hdr_out;
    mb->outer_l3_len = ipv4_hdr_out;
    mb->l4_tun_len = vxlan_hdr;
    mb->ol_flags |= PKT_TX_OUT_IP_CKSUM | PKT_TX_UDP_TUNNEL_PKT | PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;

Jijiang Liu (3):
  mbuf change
  i40e PMD change in i40e_rxtx.c
  rework csum forward engine

 app/test-pmd/csumonly.c         | 55 +++++++++++++++++++++-----------------
 lib/librte_mbuf/rte_mbuf.c      |  2 +-
 lib/librte_mbuf/rte_mbuf.h      | 23 ++++++++++------
 lib/librte_pmd_i40e/i40e_rxtx.c | 40 ++++++++++++---------------
 4 files changed, 63 insertions(+), 57 deletions(-)

-- 
1.7.7.6
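
For illustration only, scenarios B.1, B.2 and C from the cover letter could be
sketched in C roughly as follows. This is a minimal sketch under stated
assumptions, not the code from the patches: struct sketch_mbuf, the flag bit
values, the setup_*() helper names and the fixed header lengths (no VLAN tag,
no IPv4 options) are all hypothetical stand-ins and do not match the real
rte_mbuf layout or flag values.

```c
#include <stdint.h>

/* Hypothetical bit values; the real ones live in rte_mbuf.h. */
#define PKT_TX_IP_CKSUM        (1ULL << 0)
#define PKT_TX_TCP_CKSUM       (1ULL << 1)
#define PKT_TX_OUT_IP_CKSUM    (1ULL << 2)
#define PKT_TX_UDP_TUNNEL_PKT  (1ULL << 3)

/* Hypothetical stand-in holding only the fields named in the cover letter. */
struct sketch_mbuf {
    uint64_t ol_flags;
    uint16_t l2_len;
    uint16_t l3_len;
    uint16_t outer_l2_len;
    uint16_t outer_l3_len;
    uint16_t l4_tun_len;
};

/* Assumed header sizes in bytes: untagged Ethernet, IPv4 without options. */
enum { ETH_HDR = 14, IPV4_HDR = 20, UDP_HDR = 8, VXLAN_HDR = 8 };

/* B.1: describe the packet as a tunnel, offload inner IP + inner TCP only. */
void setup_b1(struct sketch_mbuf *mb)
{
    mb->l2_len = ETH_HDR;        /* eth_hdr_in */
    mb->l3_len = IPV4_HDR;       /* ipv4_hdr_in */
    mb->outer_l2_len = ETH_HDR;  /* eth_hdr_out */
    mb->outer_l3_len = IPV4_HDR; /* ipv4_hdr_out */
    mb->l4_tun_len = VXLAN_HDR;  /* vxlan_hdr */
    mb->ol_flags |= PKT_TX_UDP_TUNNEL_PKT | PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
}

/* B.2: fold everything before ipv4_hdr_in into the "L2 header". */
void setup_b2(struct sketch_mbuf *mb)
{
    mb->l2_len = ETH_HDR + IPV4_HDR + UDP_HDR + VXLAN_HDR + ETH_HDR;
    mb->l3_len = IPV4_HDR;       /* ipv4_hdr_in */
    mb->ol_flags |= PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
}

/* C: same lengths as B.1, plus the outer IP checksum request. */
void setup_c(struct sketch_mbuf *mb)
{
    setup_b1(mb);
    mb->ol_flags |= PKT_TX_OUT_IP_CKSUM;
}
```

With these assumed sizes, B.2 ends up with l2_len = 64 and no tunnel fields
set, which is why it can also work on HW that knows nothing about tunnels.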