From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 460E31B64F for ; Wed, 4 Oct 2017 16:35:10 +0200 (CEST) Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP; 04 Oct 2017 07:35:09 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.42,477,1500966000"; d="scan'208";a="159324506" Received: from irsmsx103.ger.corp.intel.com ([163.33.3.157]) by fmsmga006.fm.intel.com with ESMTP; 04 Oct 2017 07:35:08 -0700 Received: from irsmsx101.ger.corp.intel.com ([169.254.1.22]) by IRSMSX103.ger.corp.intel.com ([169.254.3.49]) with mapi id 14.03.0319.002; Wed, 4 Oct 2017 15:35:07 +0100 From: "Kavanagh, Mark B" To: "Ananyev, Konstantin" , "dev@dpdk.org" CC: "Hu, Jiayu" , "Tan, Jianfeng" , "Yigit, Ferruh" , "thomas@monjalon.net" Thread-Topic: [PATCH v6 3/6] gso: add VxLAN GSO support Thread-Index: AQHTO534X2MTV/wGJEiiJNZvszd2EKLTrZUAgAAW0SA= Date: Wed, 4 Oct 2017 14:35:07 +0000 Message-ID: References: <1506636833-25851-1-git-send-email-mark.b.kavanagh@intel.com> <1506962749-106779-1-git-send-email-mark.b.kavanagh@intel.com> <1506962749-106779-4-git-send-email-mark.b.kavanagh@intel.com> <2601191342CEEE43887BDE71AB9772585FAA3DDE@IRSMSX103.ger.corp.intel.com> In-Reply-To: <2601191342CEEE43887BDE71AB9772585FAA3DDE@IRSMSX103.ger.corp.intel.com> Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ctpclassification: CTP_IC x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiNTU5NDc0OGItYjhjNC00MGExLWIzZDItYTA3NWI1NTEyNmI0IiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE1LjkuNi42IiwiVHJ1c3RlZExhYmVsSGFzaCI6InVsRksyclFndExuNEdlVlQ3d3JcL1wvQnFIbzJ3RnNCOTVmTGZjRytcL3JhSFU9In0= dlp-product: dlpe-windows dlp-version: 11.0.0.116 dlp-reaction: no-action x-originating-ip: [163.33.239.181] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v6 3/6] gso: add VxLAN GSO support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Oct 2017 14:35:11 -0000 >-----Original Message----- >From: Ananyev, Konstantin >Sent: Wednesday, October 4, 2017 3:12 PM >To: Kavanagh, Mark B ; dev@dpdk.org >Cc: Hu, Jiayu ; Tan, Jianfeng = ; >Yigit, Ferruh ; thomas@monjalon.net >Subject: RE: [PATCH v6 3/6] gso: add VxLAN GSO support > > > >> -----Original Message----- >> From: Kavanagh, Mark B >> Sent: Monday, October 2, 2017 5:46 PM >> To: dev@dpdk.org >> Cc: Hu, Jiayu ; Tan, Jianfeng ; >Ananyev, Konstantin ; Yigit, >> Ferruh ; thomas@monjalon.net; Kavanagh, Mark B > >> Subject: [PATCH v6 3/6] gso: add VxLAN GSO support >> >> This patch adds a framework that allows GSO on tunneled packets. >> Furthermore, it leverages that framework to provide GSO support for >> VxLAN-encapsulated packets. >> >> Supported VxLAN packets must have an outer IPv4 header (prepended by an >> optional VLAN tag), and contain an inner TCP/IPv4 packet (with an option= al >> inner VLAN tag). >> >> VxLAN GSO doesn't check if input packets have correct checksums and >> doesn't update checksums for output packets. Additionally, it doesn't >> process IP fragmented packets. >> >> As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each >> output packet, which mandates support for multi-segment mbufs in the TX >> functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO >> reduces its MBUF refcnt by 1. As a result, when all of its GSO'd segment= s >> are freed, the packet is freed automatically. >> >> Signed-off-by: Mark Kavanagh >> Signed-off-by: Jiayu Hu >> --- >> doc/guides/rel_notes/release_17_11.rst | 3 + >> lib/librte_gso/Makefile | 1 + >> lib/librte_gso/gso_common.h | 25 +++++++ >> lib/librte_gso/gso_tunnel_tcp4.c | 123 >+++++++++++++++++++++++++++++++++ >> lib/librte_gso/gso_tunnel_tcp4.h | 75 ++++++++++++++++++++ >> lib/librte_gso/rte_gso.c | 13 +++- >> 6 files changed, 237 insertions(+), 3 deletions(-) >> create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c >> create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h >> >> diff --git a/doc/guides/rel_notes/release_17_11.rst >b/doc/guides/rel_notes/release_17_11.rst >> index c414f73..25b8a78 100644 >> --- a/doc/guides/rel_notes/release_17_11.rst >> +++ b/doc/guides/rel_notes/release_17_11.rst >> @@ -48,6 +48,9 @@ New Features >> ones (e.g. MTU is 1500B). Supported packet types are: >> >> * TCP/IPv4 packets, which may include a single VLAN tag. >> + * VxLAN packets, which must have an outer IPv4 header (prepended by >> + an optional VLAN tag), and contain an inner TCP/IPv4 packet (with >> + an optional VLAN tag). >> >> The GSO library doesn't check if the input packets have correct >> checksums, and doesn't update checksums for output packets. >> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile >> index 2be64d1..e6d41df 100644 >> --- a/lib/librte_gso/Makefile >> +++ b/lib/librte_gso/Makefile >> @@ -44,6 +44,7 @@ LIBABIVER :=3D 1 >> SRCS-$(CONFIG_RTE_LIBRTE_GSO) +=3D rte_gso.c >> SRCS-$(CONFIG_RTE_LIBRTE_GSO) +=3D gso_common.c >> SRCS-$(CONFIG_RTE_LIBRTE_GSO) +=3D gso_tcp4.c >> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) +=3D gso_tunnel_tcp4.c >> >> # install this header file >> SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include +=3D rte_gso.h >> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h >> index 8d9b94e..c051295 100644 >> --- a/lib/librte_gso/gso_common.h >> +++ b/lib/librte_gso/gso_common.h >> @@ -39,6 +39,7 @@ >> #include >> #include >> #include >> +#include >> >> #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != =3D 0 \ >> || ((frag_off) & IPV4_HDR_MF_FLAG) =3D=3D IPV4_HDR_MF_FLAG) >> @@ -49,6 +50,30 @@ >> #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) = =3D=3D \ >> (PKT_TX_TCP_SEG | PKT_TX_IPV4)) >> >> +#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IP= V4 | >\ >> + PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) =3D=3D \ >> + (PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \ >> + PKT_TX_TUNNEL_VXLAN)) >> + >> +/** >> + * Internal function which updates the UDP header of a packet, followin= g >> + * segmentation. This is required to update the header's datagram lengt= h >field. >> + * >> + * @param pkt >> + * The packet containing the UDP header. >> + * @param udp_offset >> + * The offset of the UDP header from the start of the packet. >> + */ >> +static inline void >> +update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset) >> +{ >> + struct udp_hdr *udp_hdr; >> + >> + udp_hdr =3D (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) + >> + udp_offset); >> + udp_hdr->dgram_len =3D rte_cpu_to_be_16(pkt->pkt_len - udp_offset); >> +} >> + >> /** >> * Internal function which updates the TCP header of a packet, followin= g >> * segmentation. This is required to update the header's 'sent' sequenc= e >> diff --git a/lib/librte_gso/gso_tunnel_tcp4.c >b/lib/librte_gso/gso_tunnel_tcp4.c >> new file mode 100644 >> index 0000000..34bbbd7 >> --- /dev/null >> +++ b/lib/librte_gso/gso_tunnel_tcp4.c >> @@ -0,0 +1,123 @@ >> +/*- >> + * BSD LICENSE >> + * >> + * Copyright(c) 2017 Intel Corporation. All rights reserved. >> + * All rights reserved. >> + * >> + * Redistribution and use in source and binary forms, with or without >> + * modification, are permitted provided that the following conditions >> + * are met: >> + * >> + * * Redistributions of source code must retain the above copyright >> + * notice, this list of conditions and the following disclaimer. >> + * * Redistributions in binary form must reproduce the above copyri= ght >> + * notice, this list of conditions and the following disclaimer i= n >> + * the documentation and/or other materials provided with the >> + * distribution. >> + * * Neither the name of Intel Corporation nor the names of its >> + * contributors may be used to endorse or promote products derive= d >> + * from this software without specific prior written permission. >> + * >> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTOR= S >> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT >> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS = FOR >> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIG= HT >> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENT= AL, >> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT >> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF U= SE, >> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON = ANY >> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TOR= T >> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE = USE >> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAG= E. >> + */ >> + >> +#include "gso_common.h" >> +#include "gso_tunnel_tcp4.h" >> + >> +static void >> +update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta= , >> + struct rte_mbuf **segs, uint16_t nb_segs) >> +{ >> + struct ipv4_hdr *ipv4_hdr; >> + struct tcp_hdr *tcp_hdr; >> + uint32_t sent_seq; >> + uint16_t outer_id, inner_id, tail_idx, i; >> + uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset; >> + >> + outer_ipv4_offset =3D pkt->outer_l2_len; >> + udp_offset =3D outer_ipv4_offset + pkt->outer_l3_len; >> + inner_ipv4_offset =3D udp_offset + pkt->l2_len; >> + tcp_offset =3D inner_ipv4_offset + pkt->l3_len; >> + >> + /* Outer IPv4 header. */ >> + ipv4_hdr =3D (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) + >> + outer_ipv4_offset); >> + outer_id =3D rte_be_to_cpu_16(ipv4_hdr->packet_id); >> + >> + /* Inner IPv4 header. */ >> + ipv4_hdr =3D (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) + >> + inner_ipv4_offset); >> + inner_id =3D rte_be_to_cpu_16(ipv4_hdr->packet_id); >> + >> + tcp_hdr =3D (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len); >> + sent_seq =3D rte_be_to_cpu_32(tcp_hdr->sent_seq); >> + tail_idx =3D nb_segs - 1; >> + >> + for (i =3D 0; i < nb_segs; i++) { >> + update_ipv4_header(segs[i], outer_ipv4_offset, outer_id); >> + update_udp_header(segs[i], udp_offset); >> + update_ipv4_header(segs[i], inner_ipv4_offset, inner_id); >> + update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx); >> + outer_id++; >> + inner_id +=3D ipid_delta; >> + sent_seq +=3D (segs[i]->pkt_len - segs[i]->data_len); >> + } >> +} >> + >> +int >> +gso_tunnel_tcp4_segment(struct rte_mbuf *pkt, >> + uint16_t gso_size, >> + uint8_t ipid_delta, >> + struct rte_mempool *direct_pool, >> + struct rte_mempool *indirect_pool, >> + struct rte_mbuf **pkts_out, >> + uint16_t nb_pkts_out) >> +{ >> + struct ipv4_hdr *inner_ipv4_hdr; >> + uint16_t pyld_unit_size, hdr_offset; >> + uint16_t tcp_dl, frag_off; >> + int ret =3D 1; >> + >> + hdr_offset =3D pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len; >> + inner_ipv4_hdr =3D (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) + >> + hdr_offset); >> + /* >> + * Don't process the packet whose MF bit or offset in the inner >> + * IPv4 header are non-zero. >> + */ >> + frag_off =3D rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset); >> + if (unlikely(IS_FRAGMENTED(frag_off))) { >> + pkts_out[0] =3D pkt; >> + return 1; >> + } >> + >> + /* Don't process the packet without data */ >> + tcp_dl =3D pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len; >> + if (unlikely(tcp_dl =3D=3D 0)) { > >You probably need to take into account outer_len* too.. >Probably better to move that check after final hdr_offset calculations: Agreed - thanks. > >... >hdr_offset +=3D pkt->l3_len + pkt->l4_len; >if (hdr_offset >=3D pkt->pkt_len) {..;' return 1;} >... > >> + pkts_out[0] =3D pkt; >> + return 1; >> + } >> + >> + hdr_offset +=3D pkt->l3_len + pkt->l4_len; >> + pyld_unit_size =3D gso_size - hdr_offset; >> + >> + /* Segment the payload */ >> + ret =3D gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool, >> + indirect_pool, pkts_out, nb_pkts_out); >> + if (ret <=3D 1) >> + return ret; >> + >> + update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret); >> + >> + return ret; >> +} >> diff --git a/lib/librte_gso/gso_tunnel_tcp4.h >b/lib/librte_gso/gso_tunnel_tcp4.h >> new file mode 100644 >> index 0000000..3c67f0c >> --- /dev/null >> +++ b/lib/librte_gso/gso_tunnel_tcp4.h >> @@ -0,0 +1,75 @@ >> +/*- >> + * BSD LICENSE >> + * >> + * Copyright(c) 2017 Intel Corporation. All rights reserved. >> + * All rights reserved. >> + * >> + * Redistribution and use in source and binary forms, with or without >> + * modification, are permitted provided that the following conditions >> + * are met: >> + * >> + * * Redistributions of source code must retain the above copyright >> + * notice, this list of conditions and the following disclaimer. >> + * * Redistributions in binary form must reproduce the above copyri= ght >> + * notice, this list of conditions and the following disclaimer i= n >> + * the documentation and/or other materials provided with the >> + * distribution. >> + * * Neither the name of Intel Corporation nor the names of its >> + * contributors may be used to endorse or promote products derive= d >> + * from this software without specific prior written permission. >> + * >> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTOR= S >> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT >> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS = FOR >> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIG= HT >> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENT= AL, >> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT >> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF U= SE, >> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON = ANY >> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TOR= T >> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE = USE >> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAG= E. >> + */ >> + >> +#ifndef _GSO_TUNNEL_TCP4_H_ >> +#define _GSO_TUNNEL_TCP4_H_ >> + >> +#include >> +#include >> + >> +/** >> + * Segment a tunneling packet with inner TCP/IPv4 headers. This functio= n >> + * doesn't check if the input packet has correct checksums, and doesn't >> + * update checksums for output GSO segments. Furthermore, it doesn't >> + * process IP fragment packets. >> + * >> + * @param pkt >> + * The packet mbuf to segment. >> + * @param gso_size >> + * The max length of a GSO segment, measured in bytes. >> + * @param ipid_delta >> + * The increasing unit of IP ids. >> + * @param direct_pool >> + * MBUF pool used for allocating direct buffers for output segments. >> + * @param indirect_pool >> + * MBUF pool used for allocating indirect buffers for output segments. >> + * @param pkts_out >> + * Pointer array used to store the MBUF addresses of output GSO >> + * segments, when it succeeds. If the memory space in pkts_out is >> + * insufficient, it fails and returns -EINVAL. >> + * @param nb_pkts_out >> + * The max number of items that 'pkts_out' can keep. >> + * >> + * @return >> + * - The number of GSO segments filled in pkts_out on success. >> + * - Return -ENOMEM if run out of memory in MBUF pools. >> + * - Return -EINVAL for invalid parameters. >> + */ >> +int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt, >> + uint16_t gso_size, >> + uint8_t ipid_delta, >> + struct rte_mempool *direct_pool, >> + struct rte_mempool *indirect_pool, >> + struct rte_mbuf **pkts_out, >> + uint16_t nb_pkts_out); >> +#endif >> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c >> index a4fce50..6095689 100644 >> --- a/lib/librte_gso/rte_gso.c >> +++ b/lib/librte_gso/rte_gso.c >> @@ -39,6 +39,7 @@ >> #include "rte_gso.h" >> #include "gso_common.h" >> #include "gso_tcp4.h" >> +#include "gso_tunnel_tcp4.h" >> >> int >> rte_gso_segment(struct rte_mbuf *pkt, >> @@ -58,8 +59,9 @@ >> return -EINVAL; >> >> if ((gso_ctx->gso_size >=3D pkt->pkt_len) || (gso_ctx->gso_types & >> - DEV_TX_OFFLOAD_TCP_TSO) !=3D >> - gso_ctx->gso_types) { >> + (DEV_TX_OFFLOAD_TCP_TSO | >> + DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=3D >> + gso_ctx->gso_types) { >> pkt->ol_flags &=3D (~PKT_TX_TCP_SEG); >> pkts_out[0] =3D pkt; >> return 1; >> @@ -71,7 +73,12 @@ >> ipid_delta =3D (gso_ctx->ipid_flag !=3D RTE_GSO_IPID_FIXED); >> ol_flags =3D pkt->ol_flags; >> >> - if (IS_IPV4_TCP(pkt->ol_flags)) { >> + if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) { >> + pkt->ol_flags &=3D (~PKT_TX_TCP_SEG); >> + ret =3D gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta, >> + direct_pool, indirect_pool, >> + pkts_out, nb_pkts_out); >> + } else if (IS_IPV4_TCP(pkt->ol_flags)) { > >Hmm it doesn't look quite right. >Imagine user doesn't want libgso to segment plain TCP packets with that ct= x, >just VXLAN+TCP. That's a very good point - I'll update the code as per your suggestion. Thanks! > >I think you need to merge that if and one above to something like that: > >If (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) > && (gso_ctx->gso_types & (DEV_TX_OFFLOAD_VXLAN_TNL_TSO | >DEV_TX_OFFLOAD_TCP_TSO)) =3D=3D > (DEV_TX_OFFLOAD_VXLAN_TNL_TSO | DEV_TX_OFFLOAD_TCP_TSO)) { > ... >} else if (IS_IPV4_TCP(pkt->ol_flags) && (gso_ctx->gso_types & >DEV_TX_OFFLOAD_TCP_TSO)) { > ... >} else { > /* unsupported packet, skip */ >} > >Konstantin > >> pkt->ol_flags &=3D (~PKT_TX_TCP_SEG); >> ret =3D gso_tcp4_segment(pkt, gso_size, ipid_delta, >> direct_pool, indirect_pool, >> -- >> 1.9.3