From: "Hu, Jiayu"
To: Ophir Munk, "dev@dpdk.org"
CC: "Wang, Xiao W", "Ananyev, Konstantin", "Zhang, Yuwei1", "Iremonger, Bernard", Thomas Monjalon
Date: Wed, 27 Jun 2018 02:28:44 +0000
Subject: Re: [dpdk-dev] [PATCH v3 1/3] gso: support UDP/IPv4 fragmentation
References: <1529205194-87434-1-git-send-email-jiayu.hu@intel.com> <1529646843-45903-1-git-send-email-jiayu.hu@intel.com> <1529646843-45903-2-git-send-email-jiayu.hu@intel.com>
List-Id: DPDK patches and discussions

Hi Ophir,

Replies are inline.

> -----Original Message-----
> From: Ophir Munk [mailto:ophirmu@mellanox.com]
> Sent: Wednesday, June 27, 2018 7:59 AM
> To: Hu, Jiayu; dev@dpdk.org
> Cc: Wang, Xiao W; Ananyev, Konstantin; Zhang, Yuwei1; Iremonger, Bernard;
> Thomas Monjalon
> Subject: RE: [dpdk-dev] [PATCH v3 1/3] gso: support UDP/IPv4 fragmentation
>
> Hi,
> Please find some comments below.
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jiayu Hu
> > Sent: Friday, June 22, 2018 8:54 AM
> > To: dev@dpdk.org
> > Cc: xiao.w.wang@intel.com; konstantin.ananyev@intel.com;
> > yuwei1.zhang@intel.com; bernard.iremonger@intel.com; Thomas Monjalon;
> > Jiayu Hu
> > Subject: [dpdk-dev] [PATCH v3 1/3] gso: support UDP/IPv4 fragmentation
> >
> > This patch adds GSO support for UDP/IPv4 packets. Supported packets may
> > include a single VLAN tag.
> > UDP/IPv4 GSO doesn't check if input packets have
> > correct checksums, and doesn't update checksums for output packets (the
> > responsibility for this lies with the application).
> > Additionally, UDP/IPv4 GSO doesn't process IP fragmented packets.
> >
> > UDP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
> > MBUF, to organize an output packet. The direct MBUF stores the packet
> > header, while the indirect mbuf simply points to a location within the
> > original packet's payload. Consequently, use of UDP GSO requires
> > multi-segment MBUF support in the TX functions of the NIC driver.
> >
> > If a packet is GSO'd, UDP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > result, when all of its GSOed segments are freed, the packet is freed
> > automatically.
> >
> > Signed-off-by: Jiayu Hu
> > ---
> >  lib/librte_gso/Makefile     |  1 +
> >  lib/librte_gso/gso_common.h |  3 ++
> >  lib/librte_gso/gso_udp4.c   | 81 +++++++++++++++++++++++++++++++++++++++++++++
> >  lib/librte_gso/gso_udp4.h   | 42 +++++++++++++++++++++++
> >  lib/librte_gso/meson.build  |  2 +-
> >  lib/librte_gso/rte_gso.c    | 24 +++++++++++---
> >  lib/librte_gso/rte_gso.h    |  6 +++-
> >  7 files changed, 152 insertions(+), 7 deletions(-)
> >  create mode 100644 lib/librte_gso/gso_udp4.c
> >  create mode 100644 lib/librte_gso/gso_udp4.h
> >
> > diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> > index 3648ec0..1fac53a 100644
> > --- a/lib/librte_gso/Makefile
> > +++ b/lib/librte_gso/Makefile
> > @@ -19,6 +19,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_udp4.c
> >
> >  # install this header file
> >  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> > diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> > index 5ca5974..6cd764f 100644
> > --- a/lib/librte_gso/gso_common.h
> > +++ b/lib/librte_gso/gso_common.h
> > @@ -31,6 +31,9 @@
> >  		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
> >  		 PKT_TX_TUNNEL_GRE))
> >
> > +#define IS_IPV4_UDP(flag) (((flag) & (PKT_TX_UDP_SEG | PKT_TX_IPV4)) == \
> > +		(PKT_TX_UDP_SEG | PKT_TX_IPV4))
> > +
> >  /**
> >   * Internal function which updates the UDP header of a packet, following
> >   * segmentation. This is required to update the header's datagram length
> >   * field.
> > diff --git a/lib/librte_gso/gso_udp4.c b/lib/librte_gso/gso_udp4.c
> > new file mode 100644
> > index 0000000..927dee1
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_udp4.c
>
> File gso_udp4.c could be very similar to file gso_tcp4.c and that would
> avoid code duplication.
> In a unified file you could use a tcp vs. udp flag to distinguish between
> them when necessary.
> The files are short (~75 lines) so it is not a critical issue.

The functions gso_tcp4_segment and gso_udp4_segment have different
prototypes, and their GSO rules are different. They are two different basic
GSO types. The GSO library gives each GSO type (TCP, tunnel) an internal .c
and .h file, and the file names represent their content. rte_gso_segment
calls a different function according to the GSO type.
This style is very clear for developers and users to understand. In
addition, as you said, the code is short. So I think it's better to keep
this style for UDP/IPv4 GSO.
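
As a rough illustration of the per-type dispatch described in this reply, the standalone C sketch below classifies a packet by its offload flags and maps each GSO type to the file that would handle it. The DEMO_* flag values, the enum, and both helper functions are hypothetical stand-ins invented for this sketch; they are not the DPDK definitions, and the real dispatch also consults gso_ctx->gso_types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; the values do not match the real PKT_TX_* macros. */
#define DEMO_TX_TCP_SEG (1ULL << 0)
#define DEMO_TX_UDP_SEG (1ULL << 1)
#define DEMO_TX_IPV4    (1ULL << 2)

/* Same shape as IS_IPV4_UDP() in the patch: all listed bits must be set. */
#define DEMO_IS_IPV4_TCP(flag) (((flag) & (DEMO_TX_TCP_SEG | DEMO_TX_IPV4)) == \
		(DEMO_TX_TCP_SEG | DEMO_TX_IPV4))
#define DEMO_IS_IPV4_UDP(flag) (((flag) & (DEMO_TX_UDP_SEG | DEMO_TX_IPV4)) == \
		(DEMO_TX_UDP_SEG | DEMO_TX_IPV4))

enum demo_gso_type {
	DEMO_GSO_NONE = 0,
	DEMO_GSO_TCP4,
	DEMO_GSO_UDP4,
	DEMO_GSO_MAX
};

/* Pick the GSO type from the packet flags; unsupported packets map to NONE. */
static enum demo_gso_type
demo_classify(uint64_t ol_flags)
{
	if (DEMO_IS_IPV4_TCP(ol_flags))
		return DEMO_GSO_TCP4;
	if (DEMO_IS_IPV4_UDP(ol_flags))
		return DEMO_GSO_UDP4;
	return DEMO_GSO_NONE;
}

/* One internal source file per GSO type, as argued in the reply. */
static const char *
demo_handler_file(enum demo_gso_type type)
{
	static const char *const files[DEMO_GSO_MAX] = {
		[DEMO_GSO_NONE] = "(none)",
		[DEMO_GSO_TCP4] = "gso_tcp4.c",
		[DEMO_GSO_UDP4] = "gso_udp4.c",
	};

	assert((unsigned int)type < (unsigned int)DEMO_GSO_MAX);
	return files[type];
}
```

With this layout, adding another GSO type means adding one enum value, one flag test, and one file, without touching the existing handlers.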
>
> > @@ -0,0 +1,81 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2018 Intel Corporation
> > + */
> > +
> > +#include "gso_common.h"
> > +#include "gso_udp4.h"
> > +
> > +#define IPV4_HDR_MF_BIT (1U << 13)
> > +
> > +static inline void
> > +update_ipv4_udp_headers(struct rte_mbuf *pkt, struct rte_mbuf **segs,
> > +		uint16_t nb_segs)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	uint16_t frag_offset = 0, is_mf;
> > +	uint16_t l2_hdrlen = pkt->l2_len, l3_hdrlen = pkt->l3_len;
> > +	uint16_t tail_idx = nb_segs - 1, length, i;
> > +
> > +	/*
> > +	 * Update IP header fields for output segments. Specifically,
> > +	 * keep the same IP id, update fragment offset and total
> > +	 * length.
> > +	 */
> > +	for (i = 0; i < nb_segs; i++) {
> > +		ipv4_hdr = rte_pktmbuf_mtod_offset(segs[i], struct ipv4_hdr *,
> > +			l2_hdrlen);
> > +		length = segs[i]->pkt_len - l2_hdrlen;
> > +		ipv4_hdr->total_length = rte_cpu_to_be_16(length);
> > +
> > +		is_mf = i < tail_idx ? IPV4_HDR_MF_BIT : 0;
> > +		ipv4_hdr->fragment_offset =
> > +			rte_cpu_to_be_16(frag_offset | is_mf);
> > +		frag_offset += ((length - l3_hdrlen) >> 3);
> > +	}
> > +}
> > +
> > +int
> > +gso_udp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	uint16_t pyld_unit_size, hdr_offset;
> > +	uint16_t frag_off;
> > +	int ret;
> > +
> > +	/* Don't process the fragmented packet */
> > +	ipv4_hdr = rte_pktmbuf_mtod_offset(pkt, struct ipv4_hdr *,
> > +			pkt->l2_len);
> > +	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
> > +	if (unlikely(IS_FRAGMENTED(frag_off))) {
> > +		pkts_out[0] = pkt;
> > +		return 1;
> > +	}
> > +
> > +	/*
> > +	 * UDP fragmentation is the same as IP fragmentation.
> > +	 * Except the first one, other output packets just have l2
> > +	 * and l3 headers.
> > +	 */
> > +	hdr_offset = pkt->l2_len + pkt->l3_len;
> > +
> > +	/* Don't process the packet without data. */
> > +	if (unlikely(hdr_offset + pkt->l4_len >= pkt->pkt_len)) {
> > +		pkts_out[0] = pkt;
> > +		return 1;
> > +	}
> > +
> > +	pyld_unit_size = gso_size - hdr_offset;
> > +
> > +	/* Segment the payload */
> > +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> > +			indirect_pool, pkts_out, nb_pkts_out);
> > +	if (ret > 1)
> > +		update_ipv4_udp_headers(pkt, pkts_out, ret);
> > +
> > +	return ret;
> > +}
> > diff --git a/lib/librte_gso/gso_udp4.h b/lib/librte_gso/gso_udp4.h
> > new file mode 100644
> > index 0000000..b2a2908
>
> File gso_udp4.h is almost identical to file gso_tcp4.h so both files
> (although short ~40 lines) could have been unified into one file.

Ditto.

>
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_udp4.h
> > @@ -0,0 +1,42 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2018 Intel Corporation
> > + */
> > +
> > +#ifndef _GSO_UDP4_H_
> > +#define _GSO_UDP4_H_
> > +
> > +#include
> > +#include
> > +
> > +/**
> > + * Segment an UDP/IPv4 packet. This function doesn't check if the input
> > + * packet has correct checksums, and doesn't update checksums for output
> > + * GSO segments. Furthermore, it doesn't process IP fragment packets.
> > + *
> > + * @param pkt
> > + *  The packet mbuf to segment.
> > + * @param gso_size
> > + *  The max length of a GSO segment, measured in bytes.
> > + * @param direct_pool
> > + *  MBUF pool used for allocating direct buffers for output segments.
> > + * @param indirect_pool
> > + *  MBUF pool used for allocating indirect buffers for output segments.
> > + * @param pkts_out
> > + *  Pointer array used to store the MBUF addresses of output GSO
> > + *  segments, when the function succeeds. If the memory space in
> > + *  pkts_out is insufficient, it fails and returns -EINVAL.
> > + * @param nb_pkts_out
> > + *  The max number of items that 'pkts_out' can keep.
> > + *
> > + * @return
> > + *  - The number of GSO segments filled in pkts_out on success.
> > + *  - Return -ENOMEM if run out of memory in MBUF pools.
> > + *  - Return -EINVAL for invalid parameters.
> > + */
> > +int gso_udp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +#endif
> > diff --git a/lib/librte_gso/meson.build b/lib/librte_gso/meson.build
> > index 056534f..ad8dd85 100644
> > --- a/lib/librte_gso/meson.build
> > +++ b/lib/librte_gso/meson.build
> > @@ -1,7 +1,7 @@
> >  # SPDX-License-Identifier: BSD-3-Clause
> >  # Copyright(c) 2017 Intel Corporation
> >
> > -sources = files('gso_common.c', 'gso_tcp4.c',
> > +sources = files('gso_common.c', 'gso_tcp4.c', 'gso_udp4.c',
> >  	'gso_tunnel_tcp4.c', 'rte_gso.c')
> >  headers = files('rte_gso.h')
> >  deps += ['ethdev']
> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > index a44e3d4..751b5b6 100644
> > --- a/lib/librte_gso/rte_gso.c
> > +++ b/lib/librte_gso/rte_gso.c
> > @@ -11,6 +11,17 @@
> >  #include "gso_common.h"
> >  #include "gso_tcp4.h"
> >  #include "gso_tunnel_tcp4.h"
> > +#include "gso_udp4.h"
> > +
> > +#define ILLEGAL_UDP_GSO_CTX(ctx) \
> > +	((((ctx)->gso_types & DEV_TX_OFFLOAD_UDP_TSO) == 0) || \
> > +	(ctx)->gso_size < RTE_GSO_UDP_SEG_SIZE_MIN)
> > +
> > +#define ILLEGAL_TCP_GSO_CTX(ctx) \
> > +	((((ctx)->gso_types & (DEV_TX_OFFLOAD_TCP_TSO | \
> > +		DEV_TX_OFFLOAD_VXLAN_TNL_TSO | \
> > +		DEV_TX_OFFLOAD_GRE_TNL_TSO)) == 0) || \
> > +	(ctx)->gso_size < RTE_GSO_SEG_SIZE_MIN)
>
> Can you please explain why it is correct that the min len for VXLAN_TNL or
> GRE_TNL is that of TCP MIN size (RTE_GSO_SEG_SIZE_MIN)?

The logic here is a little complicated. First, we have two GSO types, i.e.
TCP and UDP, and they have different requirements for min lengths and
gso_types flags. In the code, we use ILLEGAL_UDP_GSO_CTX() and
ILLEGAL_TCP_GSO_CTX() to check whether the input fails the UDP or the TCP
requirements. rte_gso_segment() starts to process the input packet only when
the UDP or the TCP requirements are met. In other words, rte_gso_segment()
stops only if neither set of requirements is met. This is why I use
"(ILLEGAL_UDP_GSO_CTX(gso_ctx) && ILLEGAL_TCP_GSO_CTX(gso_ctx))" as the exit
condition.

RTE_GSO_SEG_SIZE_MIN is not used to decide whether input tunnel packets meet
a length requirement; it is just a minimal length check. In fact, we cannot
decide a truly correct min length for a packet type, since it may or may not
have a VLAN header. The original GSO code leverages this macro for tunnel
packets to do a minimal check, so I think we can keep it here.

>
> >
>
> To make the macros above and their usage below clearer:
>
> 1. I would change the || with &&. and == with !=
>
> #define ILLEGAL_UDP_GSO_CTX(ctx) \
> 	((((ctx)->gso_types & DEV_TX_OFFLOAD_UDP_TSO) != 0) && \
> 	(ctx)->gso_size < RTE_GSO_UDP_SEG_SIZE_MIN)

This macro doesn't check all conditions for illegal UDP GSO. For example, if
we input a UDP/IPv4 packet with gso_size set to 1500 and without
DEV_TX_OFFLOAD_UDP_TSO set in gso_types, this macro returns 0, which wrongly
reports it as legal for UDP GSO.

If you mean we'd better use LEGAL rather than ILLEGAL as the check in the
code, the exit condition should be:

#define LEGAL_UDP_GSO_CTX(ctx) \
	((((ctx)->gso_types & DEV_TX_OFFLOAD_UDP_TSO) != 0) && \
	(ctx)->gso_size >= RTE_GSO_UDP_SEG_SIZE_MIN)

if (!(LEGAL_UDP_GSO_CTX(..) || LEGAL_TCP_GSO_CTX(..)))
	return -EINVAL;

But in fact, it's the same as the current implementation. I can add some
explanations to the code for users to better understand the logic, if you
think it's OK.
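
To make the equivalence claimed in this reply concrete, here is a minimal standalone C sketch that compares the ILLEGAL-based exit condition with the LEGAL-based one over a range of contexts. Everything prefixed DEMO_ or demo_ is a hypothetical stand-in written for this sketch: the flag bits are arbitrary, the TCP macros cover only one offload flag, and the minimum sizes assume 14-byte Ethernet, 20-byte IPv4, 20-byte TCP and 8-byte UDP headers plus one payload byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits and sizes; not the real DPDK definitions. */
#define DEMO_OFFLOAD_TCP_TSO  (1U << 0)
#define DEMO_OFFLOAD_UDP_TSO  (1U << 1)
#define DEMO_TCP_SEG_SIZE_MIN 55 /* 14 + 20 + 20 + 1 */
#define DEMO_UDP_SEG_SIZE_MIN 43 /* 14 + 20 + 8 + 1 */

struct demo_gso_ctx {
	uint32_t gso_types;
	uint16_t gso_size;
};

/* Same shape as the ILLEGAL_*_GSO_CTX() macros in the patch. */
#define ILLEGAL_UDP(ctx) \
	((((ctx)->gso_types & DEMO_OFFLOAD_UDP_TSO) == 0) || \
	 (ctx)->gso_size < DEMO_UDP_SEG_SIZE_MIN)
#define ILLEGAL_TCP(ctx) \
	((((ctx)->gso_types & DEMO_OFFLOAD_TCP_TSO) == 0) || \
	 (ctx)->gso_size < DEMO_TCP_SEG_SIZE_MIN)

/* The LEGAL form from the reply: the logical negation of the above. */
#define LEGAL_UDP(ctx) \
	((((ctx)->gso_types & DEMO_OFFLOAD_UDP_TSO) != 0) && \
	 (ctx)->gso_size >= DEMO_UDP_SEG_SIZE_MIN)
#define LEGAL_TCP(ctx) \
	((((ctx)->gso_types & DEMO_OFFLOAD_TCP_TSO) != 0) && \
	 (ctx)->gso_size >= DEMO_TCP_SEG_SIZE_MIN)

/* The variant proposed in the review (flag set AND size too small). */
#define PROPOSED_ILLEGAL_UDP(ctx) \
	((((ctx)->gso_types & DEMO_OFFLOAD_UDP_TSO) != 0) && \
	 (ctx)->gso_size < DEMO_UDP_SEG_SIZE_MIN)

static int
demo_reject_illegal_form(const struct demo_gso_ctx *ctx)
{
	return ILLEGAL_UDP(ctx) && ILLEGAL_TCP(ctx);
}

static int
demo_reject_legal_form(const struct demo_gso_ctx *ctx)
{
	/* Logical NOT is required here; bitwise ~ of 0 or 1 is never 0. */
	return !(LEGAL_UDP(ctx) || LEGAL_TCP(ctx));
}

/* Count contexts on which the two exit conditions disagree. */
static int
demo_count_mismatches(void)
{
	int mismatches = 0;
	uint32_t types;
	uint16_t size;

	for (types = 0; types < 4; types++) {
		for (size = 0; size < 100; size++) {
			struct demo_gso_ctx ctx = { types, size };

			if (demo_reject_illegal_form(&ctx) !=
					demo_reject_legal_form(&ctx))
				mismatches++;
		}
	}
	return mismatches;
}

/* Aborts if the two forms ever disagree. */
static void
demo_self_check(void)
{
	assert(demo_count_mismatches() == 0);
}
```

By De Morgan's law the two forms reject exactly the same contexts. The reviewer's variant differs: for a context with gso_size 1500 and no UDP flag, ILLEGAL_UDP() is 1 while PROPOSED_ILLEGAL_UDP() is 0, which is the case described in the reply.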
>
> #define ILLEGAL_TCP_GSO_CTX(ctx) \
> 	((((ctx)->gso_types & (DEV_TX_OFFLOAD_TCP_TSO | \
> 		DEV_TX_OFFLOAD_VXLAN_TNL_TSO | \
> 		DEV_TX_OFFLOAD_GRE_TNL_TSO)) != 0) && \
> 	(ctx)->gso_size < RTE_GSO_SEG_SIZE_MIN)
>

Ditto.

> 2. Then later I would change the && with ||
>
> Changing original:
> 	(ILLEGAL_UDP_GSO_CTX(gso_ctx) &&
> 	ILLEGAL_TCP_GSO_CTX(gso_ctx)))
>
> With this:
> 	ILLEGAL_UDP_GSO_CTX(gso_ctx) ||
> 	ILLEGAL_TCP_GSO_CTX(gso_ctx))

Ditto.

>
>
> >  int
> >  rte_gso_segment(struct rte_mbuf *pkt,
> > @@ -27,14 +38,12 @@ rte_gso_segment(struct rte_mbuf *pkt,
> >
> >  	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
> >  			nb_pkts_out < 1 ||
> > -			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
> > -			((gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
> > -			DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
> > -			DEV_TX_OFFLOAD_GRE_TNL_TSO)) == 0))
> > +			(ILLEGAL_UDP_GSO_CTX(gso_ctx) &&
> > +			 ILLEGAL_TCP_GSO_CTX(gso_ctx)))
> >  		return -EINVAL;
> >
> >  	if (gso_ctx->gso_size >= pkt->pkt_len) {
> > -		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > +		pkt->ol_flags &= (~(PKT_TX_TCP_SEG | PKT_TX_UDP_SEG));
> >  		pkts_out[0] = pkt;
> >  		return 1;
> >  	}
> > @@ -59,6 +68,11 @@ rte_gso_segment(struct rte_mbuf *pkt,
> >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> >  				direct_pool, indirect_pool,
> >  				pkts_out, nb_pkts_out);
> > +	} else if (IS_IPV4_UDP(pkt->ol_flags) &&
> > +			(gso_ctx->gso_types & DEV_TX_OFFLOAD_UDP_TSO)) {
> > +		pkt->ol_flags &= (~PKT_TX_UDP_SEG);
> > +		ret = gso_udp4_segment(pkt, gso_size, direct_pool,
> > +				indirect_pool, pkts_out, nb_pkts_out);
> >  	} else {
> >  		/* unsupported packet, skip */
> >  		pkts_out[0] = pkt;
> > diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
> > index f4abd61..a626a11 100644
> > --- a/lib/librte_gso/rte_gso.h
> > +++ b/lib/librte_gso/rte_gso.h
> > @@ -17,10 +17,14 @@ extern "C" {
> >  #include
> >  #include
> >
> > -/* Minimum GSO segment size. */
> > +/* Minimum GSO segment size for TCP based packets. */
> >  #define RTE_GSO_SEG_SIZE_MIN (sizeof(struct ether_hdr) + \
> >  		sizeof(struct ipv4_hdr) + sizeof(struct tcp_hdr) + 1)
>
> RTE_GSO_SEG_SIZE_MIN is actually TCP min size. Can you name this macro as
> RTE_GSO_TCP_SEG_SIZE_MIN (symmetrically to the UDP macro below)?
>

Yes, you are right. The name is not good. But I don't know whether changing
the name would introduce an ABI change, so I kept this name as a workaround.

Thanks,
Jiayu

> >
> > +/* Minimum GSO segment size for UDP based packets. */
> > +#define RTE_GSO_UDP_SEG_SIZE_MIN (sizeof(struct ether_hdr) + \
> > +		sizeof(struct ipv4_hdr) + sizeof(struct udp_hdr) + 1)
> > +
> >  /* GSO flags for rte_gso_ctx. */
> >  #define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0) /**< Use fixed IP ids for
> >  output GSO segments. Setting
> > --
> > 2.7.4
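
For readers following the fragment-offset arithmetic in the quoted update_ipv4_udp_headers(), the standalone C sketch below reproduces just that calculation in host byte order. The function demo_frag_fields() and the DEMO_IPV4_MF_BIT name are hypothetical and written for this note; the sketch also asserts that every fragment except the last carries a multiple of 8 bytes after the IPv4 header, which the IPv4 offset encoding requires. Whether the library enforces that elsewhere is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* "More fragments" flag, same bit as IPV4_HDR_MF_BIT in the patch. */
#define DEMO_IPV4_MF_BIT (1U << 13)

/*
 * Compute the host-order fragment_offset field for each of nb_segs
 * segments. l3_payload_len[i] is the number of bytes that follow the
 * IPv4 header in segment i (length - l3_hdrlen in the patch).
 */
static void
demo_frag_fields(const uint16_t *l3_payload_len, uint16_t nb_segs,
		uint16_t *frag_field)
{
	uint16_t frag_offset = 0; /* running offset in 8-byte units */
	uint16_t i;

	assert(nb_segs > 0);
	for (i = 0; i < nb_segs; i++) {
		/* Every segment but the last has the MF bit set. */
		uint16_t is_mf = (i < nb_segs - 1) ? DEMO_IPV4_MF_BIT : 0;

		/* Non-final fragments must carry a multiple of 8 bytes. */
		assert(is_mf == 0 || (l3_payload_len[i] & 7) == 0);

		frag_field[i] = (uint16_t)(frag_offset | is_mf);
		frag_offset += l3_payload_len[i] >> 3;
	}
}
```

For example, with a 14-byte L2 header, a 20-byte IPv4 header and gso_size 1514, each full segment carries 1480 bytes after the IPv4 header, so consecutive offsets advance by 185 units.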