From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Hu, Jiayu"
To: "yang_y_yi@163.com", "dev@dpdk.org"
CC: "thomas@monjalon.net", "yangyi01@inspur.com"
Date: Sun, 27 Sep 2020 04:56:21 +0000
References: <20200924085740.270192-1-yang_y_yi@163.com> <20200924085740.270192-3-yang_y_yi@163.com>
In-Reply-To: <20200924085740.270192-3-yang_y_yi@163.com>
Subject: Re: [dpdk-dev] [PATCH v7 2/3] gro: add VXLAN UDP/IPv4 GRO support
List-Id: DPDK patches and discussions

Acked-by: Jiayu Hu

> -----Original Message-----
> From: yang_y_yi@163.com
> Sent: Thursday, September 24, 2020 4:58 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu; thomas@monjalon.net; yangyi01@inspur.com; yang_y_yi@163.com
> Subject: [PATCH v7 2/3] gro: add VXLAN UDP/IPv4 GRO support
>
> From: Yi Yang
>
> VXLAN UDP/IPv4 GRO can help improve VM-to-VM UDP
> performance when UFO or GSO is enabled in the VM. GRO
> must be supported if UFO or GSO is enabled;
> otherwise, performance can't improve much
> if only GSO is there.
>
> With this enabled in DPDK, OVS DPDK can leverage it
> to improve VM-to-VM UDP performance: it reassembles
> VXLAN UDP/IPv4 fragments immediately after they are
> received from a physical NIC. This is very helpful in
> the OVS DPDK VXLAN use case.
>
> Signed-off-by: Yi Yang
> ---
>  lib/librte_gro/gro_udp4.h       |   1 +
>  lib/librte_gro/gro_vxlan_udp4.c | 545 ++++++++++++++++++++++++++++++++++++++++
>  lib/librte_gro/gro_vxlan_udp4.h | 153 +++++++++++
>  lib/librte_gro/meson.build      |   2 +-
>  lib/librte_gro/rte_gro.c        | 115 +++++++--
>  lib/librte_gro/rte_gro.h        |   3 +
>  6 files changed, 792 insertions(+), 27 deletions(-)
>  create mode 100644 lib/librte_gro/gro_vxlan_udp4.c
>  create mode 100644 lib/librte_gro/gro_vxlan_udp4.h
>
> diff --git a/lib/librte_gro/gro_udp4.h b/lib/librte_gro/gro_udp4.h
> index 0a078e4..d38b393 100644
> --- a/lib/librte_gro/gro_udp4.h
> +++ b/lib/librte_gro/gro_udp4.h
> @@ -7,6 +7,7 @@
>
>  #include
>  #include
> +#include
>
>  #define INVALID_ARRAY_INDEX 0xffffffffUL
>  #define GRO_UDP4_TBL_MAX_ITEM_NUM (1024UL * 1024UL)
> diff --git a/lib/librte_gro/gro_vxlan_udp4.c b/lib/librte_gro/gro_vxlan_udp4.c
> new file mode 100644
> index 0000000..3747636
> --- /dev/null
> +++ b/lib/librte_gro/gro_vxlan_udp4.c
> @@ -0,0 +1,545 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Inspur Corporation
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include "gro_vxlan_udp4.h"
> +
> +void *
> +gro_vxlan_udp4_tbl_create(uint16_t socket_id,
> +		uint16_t max_flow_num,
> +		uint16_t max_item_per_flow)
> +{
> +	struct gro_vxlan_udp4_tbl *tbl;
> +	size_t size;
> +	uint32_t entries_num, i;
> +
> +	entries_num = max_flow_num * max_item_per_flow;
> +	entries_num = RTE_MIN(entries_num, GRO_VXLAN_UDP4_TBL_MAX_ITEM_NUM);
> +
> +	if (entries_num == 0)
> +		return NULL;
> +
> +	tbl = rte_zmalloc_socket(__func__,
> +			sizeof(struct gro_vxlan_udp4_tbl),
> +			RTE_CACHE_LINE_SIZE,
> +			socket_id);
> +	if (tbl == NULL)
> +		return NULL;
> +
> +	size = sizeof(struct gro_vxlan_udp4_item) * entries_num;
> +	tbl->items = rte_zmalloc_socket(__func__,
> +			size,
> +			RTE_CACHE_LINE_SIZE,
> +			socket_id);
> +	if (tbl->items == NULL) {
> +		rte_free(tbl);
> +		return NULL;
> +	}
> +	tbl->max_item_num = entries_num;
> +
> +	size = sizeof(struct gro_vxlan_udp4_flow) * entries_num;
> +	tbl->flows = rte_zmalloc_socket(__func__,
> +			size,
> +			RTE_CACHE_LINE_SIZE,
> +			socket_id);
> +	if (tbl->flows == NULL) {
> +		rte_free(tbl->items);
> +		rte_free(tbl);
> +		return NULL;
> +	}
> +
> +	for (i = 0; i < entries_num; i++)
> +		tbl->flows[i].start_index = INVALID_ARRAY_INDEX;
> +	tbl->max_flow_num = entries_num;
> +
> +	return tbl;
> +}
> +
> +void
> +gro_vxlan_udp4_tbl_destroy(void *tbl)
> +{
> +	struct gro_vxlan_udp4_tbl *vxlan_tbl = tbl;
> +
> +	if (vxlan_tbl) {
> +		rte_free(vxlan_tbl->items);
> +		rte_free(vxlan_tbl->flows);
> +	}
> +	rte_free(vxlan_tbl);
> +}
> +
> +static inline uint32_t
> +find_an_empty_item(struct gro_vxlan_udp4_tbl *tbl)
> +{
> +	uint32_t max_item_num = tbl->max_item_num, i;
> +
> +	for (i = 0; i < max_item_num; i++)
> +		if (tbl->items[i].inner_item.firstseg == NULL)
> +			return i;
> +	return INVALID_ARRAY_INDEX;
> +}
> +
> +static inline uint32_t
> +find_an_empty_flow(struct gro_vxlan_udp4_tbl *tbl)
> +{
> +	uint32_t max_flow_num = tbl->max_flow_num, i;
> +
> +	for (i = 0; i < max_flow_num; i++)
> +		if (tbl->flows[i].start_index == INVALID_ARRAY_INDEX)
> +			return i;
> +	return INVALID_ARRAY_INDEX;
> +}
> +
> +static inline uint32_t
> +insert_new_item(struct gro_vxlan_udp4_tbl *tbl,
> +		struct rte_mbuf *pkt,
> +		uint64_t start_time,
> +		uint32_t prev_idx,
> +		uint16_t frag_offset,
> +		uint8_t is_last_frag)
> +{
> +	uint32_t item_idx;
> +
> +	item_idx = find_an_empty_item(tbl);
> +	if (unlikely(item_idx == INVALID_ARRAY_INDEX))
> +		return INVALID_ARRAY_INDEX;
> +
> +	tbl->items[item_idx].inner_item.firstseg = pkt;
> +	tbl->items[item_idx].inner_item.lastseg = rte_pktmbuf_lastseg(pkt);
> +	tbl->items[item_idx].inner_item.start_time = start_time;
> +	tbl->items[item_idx].inner_item.next_pkt_idx = INVALID_ARRAY_INDEX;
> +	tbl->items[item_idx].inner_item.frag_offset = frag_offset;
> +	tbl->items[item_idx].inner_item.is_last_frag = is_last_frag;
> +	tbl->items[item_idx].inner_item.nb_merged = 1;
> +	tbl->item_num++;
> +
> +	/* If the previous packet exists, chain the new one with it. */
> +	if (prev_idx != INVALID_ARRAY_INDEX) {
> +		tbl->items[item_idx].inner_item.next_pkt_idx =
> +			tbl->items[prev_idx].inner_item.next_pkt_idx;
> +		tbl->items[prev_idx].inner_item.next_pkt_idx = item_idx;
> +	}
> +
> +	return item_idx;
> +}
> +
> +static inline uint32_t
> +delete_item(struct gro_vxlan_udp4_tbl *tbl,
> +		uint32_t item_idx,
> +		uint32_t prev_item_idx)
> +{
> +	uint32_t next_idx = tbl->items[item_idx].inner_item.next_pkt_idx;
> +
> +	/* NULL indicates an empty item. */
> +	tbl->items[item_idx].inner_item.firstseg = NULL;
> +	tbl->item_num--;
> +	if (prev_item_idx != INVALID_ARRAY_INDEX)
> +		tbl->items[prev_item_idx].inner_item.next_pkt_idx = next_idx;
> +
> +	return next_idx;
> +}
> +
> +static inline uint32_t
> +insert_new_flow(struct gro_vxlan_udp4_tbl *tbl,
> +		struct vxlan_udp4_flow_key *src,
> +		uint32_t item_idx)
> +{
> +	struct vxlan_udp4_flow_key *dst;
> +	uint32_t flow_idx;
> +
> +	flow_idx = find_an_empty_flow(tbl);
> +	if (unlikely(flow_idx == INVALID_ARRAY_INDEX))
> +		return INVALID_ARRAY_INDEX;
> +
> +	dst = &(tbl->flows[flow_idx].key);
> +
> +	rte_ether_addr_copy(&(src->inner_key.eth_saddr),
> +			&(dst->inner_key.eth_saddr));
> +	rte_ether_addr_copy(&(src->inner_key.eth_daddr),
> +			&(dst->inner_key.eth_daddr));
> +	dst->inner_key.ip_src_addr = src->inner_key.ip_src_addr;
> +	dst->inner_key.ip_dst_addr = src->inner_key.ip_dst_addr;
> +	dst->inner_key.ip_id = src->inner_key.ip_id;
> +
> +	dst->vxlan_hdr.vx_flags = src->vxlan_hdr.vx_flags;
> +	dst->vxlan_hdr.vx_vni = src->vxlan_hdr.vx_vni;
> +	rte_ether_addr_copy(&(src->outer_eth_saddr), &(dst->outer_eth_saddr));
> +	rte_ether_addr_copy(&(src->outer_eth_daddr), &(dst->outer_eth_daddr));
> +	dst->outer_ip_src_addr = src->outer_ip_src_addr;
> +	dst->outer_ip_dst_addr = src->outer_ip_dst_addr;
> +	dst->outer_dst_port = src->outer_dst_port;
> +
> +	tbl->flows[flow_idx].start_index = item_idx;
> +	tbl->flow_num++;
> +
> +	return flow_idx;
> +}
> +
> +static inline int
> +is_same_vxlan_udp4_flow(struct vxlan_udp4_flow_key k1,
> +		struct vxlan_udp4_flow_key k2)
> +{
> +	/* For VxLAN packet, outer udp src port is calculated from
> +	 * inner packet RSS hash, udp src port of the first UDP
> +	 * fragment is different from one of other UDP fragments
> +	 * even if they are same flow, so we have to skip outer udp
> +	 * src port comparison here.
> +	 */
> +	return (rte_is_same_ether_addr(&k1.outer_eth_saddr,
> +					&k2.outer_eth_saddr) &&
> +			rte_is_same_ether_addr(&k1.outer_eth_daddr,
> +				&k2.outer_eth_daddr) &&
> +			(k1.outer_ip_src_addr == k2.outer_ip_src_addr) &&
> +			(k1.outer_ip_dst_addr == k2.outer_ip_dst_addr) &&
> +			(k1.outer_dst_port == k2.outer_dst_port) &&
> +			(k1.vxlan_hdr.vx_flags == k2.vxlan_hdr.vx_flags) &&
> +			(k1.vxlan_hdr.vx_vni == k2.vxlan_hdr.vx_vni) &&
> +			is_same_udp4_flow(k1.inner_key, k2.inner_key));
> +}
> +
> +static inline int
> +udp4_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item,
> +		uint16_t frag_offset,
> +		uint16_t ip_dl)
> +{
> +	struct rte_mbuf *pkt = item->inner_item.firstseg;
> +	int cmp;
> +	uint16_t l2_offset;
> +	int ret = 0;
> +
> +	/* Note: if outer DF bit is set, i.e outer_is_atomic is 0,
> +	 * we needn't compare outer_ip_id because they are same,
> +	 * for the case outer_is_atomic is 1, we also have no way
> +	 * to compare outer_ip_id because the difference between
> +	 * outer_ip_ids of two received packets isn't always +/-1.
> +	 * So skip outer_ip_id comparison here.
> +	 */
> +
> +	l2_offset = pkt->outer_l2_len + pkt->outer_l3_len;
> +	cmp = udp4_check_neighbor(&item->inner_item, frag_offset, ip_dl,
> +			l2_offset);
> +	if (cmp > 0)
> +		/* Append the new packet. */
> +		ret = 1;
> +	else if (cmp < 0)
> +		/* Prepend the new packet. */
> +		ret = -1;
> +
> +	return ret;
> +}
> +
> +static inline int
> +merge_two_vxlan_udp4_packets(struct gro_vxlan_udp4_item *item,
> +		struct rte_mbuf *pkt,
> +		int cmp,
> +		uint16_t frag_offset,
> +		uint8_t is_last_frag)
> +{
> +	if (merge_two_udp4_packets(&item->inner_item, pkt, cmp, frag_offset,
> +				is_last_frag,
> +				pkt->outer_l2_len + pkt->outer_l3_len)) {
> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +static inline void
> +update_vxlan_header(struct gro_vxlan_udp4_item *item)
> +{
> +	struct rte_ipv4_hdr *ipv4_hdr;
> +	struct rte_udp_hdr *udp_hdr;
> +	struct rte_mbuf *pkt = item->inner_item.firstseg;
> +	uint16_t len;
> +	uint16_t frag_offset;
> +
> +	/* Update the outer IPv4 header. */
> +	len = pkt->pkt_len - pkt->outer_l2_len;
> +	ipv4_hdr = (struct rte_ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			pkt->outer_l2_len);
> +	ipv4_hdr->total_length = rte_cpu_to_be_16(len);
> +
> +	/* Update the outer UDP header. */
> +	len -= pkt->outer_l3_len;
> +	udp_hdr = (struct rte_udp_hdr *)((char *)ipv4_hdr + pkt->outer_l3_len);
> +	udp_hdr->dgram_len = rte_cpu_to_be_16(len);
> +
> +	/* Update the inner IPv4 header. */
> +	len -= pkt->l2_len;
> +	ipv4_hdr = (struct rte_ipv4_hdr *)((char *)udp_hdr + pkt->l2_len);
> +	ipv4_hdr->total_length = rte_cpu_to_be_16(len);
> +
> +	/* Clear the MF bit if this is the last fragment. */
> +	if (item->inner_item.is_last_frag) {
> +		frag_offset = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
> +		ipv4_hdr->fragment_offset =
> +			rte_cpu_to_be_16(frag_offset & ~RTE_IPV4_HDR_MF_FLAG);
> +	}
> +}
> +
> +int32_t
> +gro_vxlan_udp4_reassemble(struct rte_mbuf *pkt,
> +		struct gro_vxlan_udp4_tbl *tbl,
> +		uint64_t start_time)
> +{
> +	struct rte_ether_hdr *outer_eth_hdr, *eth_hdr;
> +	struct rte_ipv4_hdr *outer_ipv4_hdr, *ipv4_hdr;
> +	struct rte_udp_hdr *udp_hdr;
> +	struct rte_vxlan_hdr *vxlan_hdr;
> +	uint16_t frag_offset;
> +	uint8_t is_last_frag;
> +	int16_t ip_dl;
> +	uint16_t ip_id;
> +
> +	struct vxlan_udp4_flow_key key;
> +	uint32_t cur_idx, prev_idx, item_idx;
> +	uint32_t i, max_flow_num, remaining_flow_num;
> +	int cmp;
> +	uint16_t hdr_len;
> +	uint8_t find;
> +
> +	outer_eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
> +	outer_ipv4_hdr = (struct rte_ipv4_hdr *)((char *)outer_eth_hdr +
> +			pkt->outer_l2_len);
> +
> +	udp_hdr = (struct rte_udp_hdr *)((char *)outer_ipv4_hdr +
> +			pkt->outer_l3_len);
> +	vxlan_hdr = (struct rte_vxlan_hdr *)((char *)udp_hdr +
> +			sizeof(struct rte_udp_hdr));
> +	eth_hdr = (struct rte_ether_hdr *)((char *)vxlan_hdr +
> +			sizeof(struct rte_vxlan_hdr));
> +	/* l2_len = outer udp hdr len + vxlan hdr len + inner l2 len */
> +	ipv4_hdr = (struct rte_ipv4_hdr *)((char *)udp_hdr + pkt->l2_len);
> +
> +	/*
> +	 * Don't process the packet which has non-fragment inner IP.
> +	 */
> +	if (!is_ipv4_fragment(ipv4_hdr))
> +		return -1;
> +
> +	hdr_len = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len +
> +		pkt->l3_len;
> +	/*
> +	 * Don't process the packet whose payload length is less than or
> +	 * equal to 0.
> +	 */
> +	if (pkt->pkt_len <= hdr_len)
> +		return -1;
> +
> +	ip_dl = pkt->pkt_len - hdr_len;
> +
> +	ip_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +	frag_offset = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
> +	is_last_frag = ((frag_offset & RTE_IPV4_HDR_MF_FLAG) == 0) ? 1 : 0;
> +	frag_offset = (uint16_t)(frag_offset & RTE_IPV4_HDR_OFFSET_MASK) << 3;
> +
> +	rte_ether_addr_copy(&(eth_hdr->s_addr), &(key.inner_key.eth_saddr));
> +	rte_ether_addr_copy(&(eth_hdr->d_addr), &(key.inner_key.eth_daddr));
> +	key.inner_key.ip_src_addr = ipv4_hdr->src_addr;
> +	key.inner_key.ip_dst_addr = ipv4_hdr->dst_addr;
> +	key.inner_key.ip_id = ip_id;
> +
> +	key.vxlan_hdr.vx_flags = vxlan_hdr->vx_flags;
> +	key.vxlan_hdr.vx_vni = vxlan_hdr->vx_vni;
> +	rte_ether_addr_copy(&(outer_eth_hdr->s_addr), &(key.outer_eth_saddr));
> +	rte_ether_addr_copy(&(outer_eth_hdr->d_addr), &(key.outer_eth_daddr));
> +	key.outer_ip_src_addr = outer_ipv4_hdr->src_addr;
> +	key.outer_ip_dst_addr = outer_ipv4_hdr->dst_addr;
> +	/* Note: It is unnecessary to save outer_src_port here because it can
> +	 * be different for VxLAN UDP fragments from the same flow.
> +	 */
> +	key.outer_dst_port = udp_hdr->dst_port;
> +
> +	/* Search for a matched flow. */
> +	max_flow_num = tbl->max_flow_num;
> +	remaining_flow_num = tbl->flow_num;
> +	find = 0;
> +	for (i = 0; i < max_flow_num && remaining_flow_num; i++) {
> +		if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
> +			if (is_same_vxlan_udp4_flow(tbl->flows[i].key, key)) {
> +				find = 1;
> +				break;
> +			}
> +			remaining_flow_num--;
> +		}
> +	}
> +
> +	/*
> +	 * Can't find a matched flow. Insert a new flow and store the
> +	 * packet into the flow.
> +	 */
> +	if (find == 0) {
> +		item_idx = insert_new_item(tbl, pkt, start_time,
> +				INVALID_ARRAY_INDEX, frag_offset,
> +				is_last_frag);
> +		if (unlikely(item_idx == INVALID_ARRAY_INDEX))
> +			return -1;
> +		if (insert_new_flow(tbl, &key, item_idx) ==
> +				INVALID_ARRAY_INDEX) {
> +			/*
> +			 * Fail to insert a new flow, so
> +			 * delete the inserted packet.
> +			 */
> +			delete_item(tbl, item_idx, INVALID_ARRAY_INDEX);
> +			return -1;
> +		}
> +		return 0;
> +	}
> +
> +	/* Check all packets in the flow and try to find a neighbor. */
> +	cur_idx = tbl->flows[i].start_index;
> +	prev_idx = cur_idx;
> +	do {
> +		cmp = udp4_check_vxlan_neighbor(&(tbl->items[cur_idx]),
> +				frag_offset, ip_dl);
> +		if (cmp) {
> +			if (merge_two_vxlan_udp4_packets(
> +						&(tbl->items[cur_idx]),
> +						pkt, cmp, frag_offset,
> +						is_last_frag)) {
> +				return 1;
> +			}
> +			/*
> +			 * Can't merge two packets, as the packet
> +			 * length will be greater than the max value.
> +			 * Insert the packet into the flow.
> +			 */
> +			if (insert_new_item(tbl, pkt, start_time, prev_idx,
> +						frag_offset, is_last_frag) ==
> +					INVALID_ARRAY_INDEX)
> +				return -1;
> +			return 0;
> +		}
> +
> +		/* Ensure inserted items are ordered by frag_offset */
> +		if (frag_offset
> +			< tbl->items[cur_idx].inner_item.frag_offset) {
> +			break;
> +		}
> +
> +		prev_idx = cur_idx;
> +		cur_idx = tbl->items[cur_idx].inner_item.next_pkt_idx;
> +	} while (cur_idx != INVALID_ARRAY_INDEX);
> +
> +	/* Can't find neighbor. Insert the packet into the flow. */
> +	if (cur_idx == tbl->flows[i].start_index) {
> +		/* Insert it before the first packet of the flow */
> +		item_idx = insert_new_item(tbl, pkt, start_time,
> +				INVALID_ARRAY_INDEX, frag_offset,
> +				is_last_frag);
> +		if (unlikely(item_idx == INVALID_ARRAY_INDEX))
> +			return -1;
> +		tbl->items[item_idx].inner_item.next_pkt_idx = cur_idx;
> +		tbl->flows[i].start_index = item_idx;
> +	} else {
> +		if (insert_new_item(tbl, pkt, start_time, prev_idx,
> +				frag_offset, is_last_frag
> +				) == INVALID_ARRAY_INDEX)
> +			return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +gro_vxlan_udp4_merge_items(struct gro_vxlan_udp4_tbl *tbl,
> +		uint32_t start_idx)
> +{
> +	uint16_t frag_offset;
> +	uint8_t is_last_frag;
> +	int16_t ip_dl;
> +	struct rte_mbuf *pkt;
> +	int cmp;
> +	uint32_t item_idx;
> +	uint16_t hdr_len;
> +
> +	item_idx = tbl->items[start_idx].inner_item.next_pkt_idx;
> +	while (item_idx != INVALID_ARRAY_INDEX) {
> +		pkt = tbl->items[item_idx].inner_item.firstseg;
> +		hdr_len = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len +
> +			pkt->l3_len;
> +		ip_dl = pkt->pkt_len - hdr_len;
> +		frag_offset = tbl->items[item_idx].inner_item.frag_offset;
> +		is_last_frag = tbl->items[item_idx].inner_item.is_last_frag;
> +		cmp = udp4_check_vxlan_neighbor(&(tbl->items[start_idx]),
> +				frag_offset, ip_dl);
> +		if (cmp) {
> +			if (merge_two_vxlan_udp4_packets(
> +					&(tbl->items[start_idx]),
> +					pkt, cmp, frag_offset,
> +					is_last_frag)) {
> +				item_idx = delete_item(tbl, item_idx,
> +							INVALID_ARRAY_INDEX);
> +				tbl->items[start_idx].inner_item.next_pkt_idx
> +					= item_idx;
> +			} else
> +				return 0;
> +		} else
> +			return 0;
> +	}
> +
> +	return 0;
> +}
> +
> +uint16_t
> +gro_vxlan_udp4_tbl_timeout_flush(struct gro_vxlan_udp4_tbl *tbl,
> +		uint64_t flush_timestamp,
> +		struct rte_mbuf **out,
> +		uint16_t nb_out)
> +{
> +	uint16_t k = 0;
> +	uint32_t i, j;
> +	uint32_t max_flow_num = tbl->max_flow_num;
> +
> +	for (i = 0; i < max_flow_num; i++) {
> +		if (unlikely(tbl->flow_num == 0))
> +			return k;
> +
> +		j = tbl->flows[i].start_index;
> +		while (j != INVALID_ARRAY_INDEX) {
> +			if (tbl->items[j].inner_item.start_time <=
> +					flush_timestamp) {
> +				gro_vxlan_udp4_merge_items(tbl, j);
> +				out[k++] = tbl->items[j].inner_item.firstseg;
> +				if (tbl->items[j].inner_item.nb_merged > 1)
> +					update_vxlan_header(&(tbl->items[j]));
> +				/*
> +				 * Delete the item and get the next packet
> +				 * index.
> +				 */
> +				j = delete_item(tbl, j, INVALID_ARRAY_INDEX);
> +				tbl->flows[i].start_index = j;
> +				if (j == INVALID_ARRAY_INDEX)
> +					tbl->flow_num--;
> +
> +				if (unlikely(k == nb_out))
> +					return k;
> +			} else
> +				/*
> +				 * Flushing packets does not strictly follow
> +				 * timestamp. It does not flush left packets of
> +				 * the flow this time once it finds one item
> +				 * whose start_time is greater than
> +				 * flush_timestamp. So go to check other flows.
> +				 */
> +				break;
> +		}
> +	}
> +	return k;
> +}
> +
> +uint32_t
> +gro_vxlan_udp4_tbl_pkt_count(void *tbl)
> +{
> +	struct gro_vxlan_udp4_tbl *gro_tbl = tbl;
> +
> +	if (gro_tbl)
> +		return gro_tbl->item_num;
> +
> +	return 0;
> +}
> diff --git a/lib/librte_gro/gro_vxlan_udp4.h b/lib/librte_gro/gro_vxlan_udp4.h
> new file mode 100644
> index 0000000..d045221
> --- /dev/null
> +++ b/lib/librte_gro/gro_vxlan_udp4.h
> @@ -0,0 +1,153 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Inspur Corporation
> + */
> +
> +#ifndef _GRO_VXLAN_UDP4_H_
> +#define _GRO_VXLAN_UDP4_H_
> +
> +#include "gro_udp4.h"
> +
> +#define GRO_VXLAN_UDP4_TBL_MAX_ITEM_NUM (1024UL * 1024UL)
> +
> +/* Header fields representing a VxLAN flow */
> +struct vxlan_udp4_flow_key {
> +	struct udp4_flow_key inner_key;
> +	struct rte_vxlan_hdr vxlan_hdr;
> +
> +	struct rte_ether_addr outer_eth_saddr;
> +	struct rte_ether_addr outer_eth_daddr;
> +
> +	uint32_t outer_ip_src_addr;
> +	uint32_t outer_ip_dst_addr;
> +
> +	/* Note: It is unnecessary to save outer_src_port here because it can
> +	 * be different for VxLAN UDP fragments from the same flow.
> +	 */
> +	uint16_t outer_dst_port;
> +};
> +
> +struct gro_vxlan_udp4_flow {
> +	struct vxlan_udp4_flow_key key;
> +	/*
> +	 * The index of the first packet in the flow. INVALID_ARRAY_INDEX
> +	 * indicates an empty flow.
> +	 */
> +	uint32_t start_index;
> +};
> +
> +struct gro_vxlan_udp4_item {
> +	struct gro_udp4_item inner_item;
> +	/* Note: VXLAN UDP/IPv4 GRO needn't check outer_ip_id because
> +	 * the difference between outer_ip_ids of two received packets
> +	 * isn't always +/-1 in case of OVS DPDK. So no outer_ip_id
> +	 * and outer_is_atomic fields here.
> +	 */
> +};
> +
> +/*
> + * VxLAN (with an outer IPv4 header and an inner UDP/IPv4 packet)
> + * reassembly table structure
> + */
> +struct gro_vxlan_udp4_tbl {
> +	/* item array */
> +	struct gro_vxlan_udp4_item *items;
> +	/* flow array */
> +	struct gro_vxlan_udp4_flow *flows;
> +	/* current item number */
> +	uint32_t item_num;
> +	/* current flow number */
> +	uint32_t flow_num;
> +	/* the maximum item number */
> +	uint32_t max_item_num;
> +	/* the maximum flow number */
> +	uint32_t max_flow_num;
> +};
> +
> +/**
> + * This function creates a VxLAN reassembly table for VxLAN packets
> + * which have an outer IPv4 header and an inner UDP/IPv4 packet.
> + *
> + * @param socket_id
> + *  Socket index for allocating the table
> + * @param max_flow_num
> + *  The maximum number of flows in the table
> + * @param max_item_per_flow
> + *  The maximum number of packets per flow
> + *
> + * @return
> + *  - Return the table pointer on success.
> + *  - Return NULL on failure.
> + */
> +void *gro_vxlan_udp4_tbl_create(uint16_t socket_id,
> +		uint16_t max_flow_num,
> +		uint16_t max_item_per_flow);
> +
> +/**
> + * This function destroys a VxLAN reassembly table.
> + *
> + * @param tbl
> + *  Pointer pointing to the VxLAN reassembly table
> + */
> +void gro_vxlan_udp4_tbl_destroy(void *tbl);
> +
> +/**
> + * This function merges a VxLAN packet which has an outer IPv4 header and
> + * an inner UDP/IPv4 packet. It does not process the packet which does not
> + * have payload.
> + *
> + * This function does not check if the packet has correct checksums and
> + * does not re-calculate checksums for the merged packet. It returns the
> + * packet if there is no available space in the table.
> + *
> + * @param pkt
> + *  Packet to reassemble
> + * @param tbl
> + *  Pointer pointing to the VxLAN reassembly table
> + * @param start_time
> + *  The time when the packet is inserted into the table
> + *
> + * @return
> + *  - Return a positive value if the packet is merged.
> + *  - Return zero if the packet isn't merged but stored in the table.
> + *  - Return a negative value for invalid parameters or no available
> + *    space in the table.
> + */
> +int32_t gro_vxlan_udp4_reassemble(struct rte_mbuf *pkt,
> +		struct gro_vxlan_udp4_tbl *tbl,
> +		uint64_t start_time);
> +
> +/**
> + * This function flushes timed-out packets from the VxLAN reassembly
> + * table without updating checksums.
> + *
> + * @param tbl
> + *  Pointer pointing to a VxLAN GRO table
> + * @param flush_timestamp
> + *  This function flushes packets which are inserted into the table
> + *  before or at the flush_timestamp.
> + * @param out
> + *  Pointer array used to keep flushed packets
> + * @param nb_out
> + *  The element number in 'out'. It also determines the maximum number of
> + *  packets that can be flushed finally.
> + *
> + * @return
> + *  The number of flushed packets
> + */
> +uint16_t gro_vxlan_udp4_tbl_timeout_flush(struct gro_vxlan_udp4_tbl *tbl,
> +		uint64_t flush_timestamp,
> +		struct rte_mbuf **out,
> +		uint16_t nb_out);
> +
> +/**
> + * This function returns the number of the packets in a VxLAN
> + * reassembly table.
> + *
> + * @param tbl
> + *  Pointer pointing to the VxLAN reassembly table
> + *
> + * @return
> + *  The number of packets in the table
> + */
> +uint32_t gro_vxlan_udp4_tbl_pkt_count(void *tbl);
> +#endif
> diff --git a/lib/librte_gro/meson.build b/lib/librte_gro/meson.build
> index 0d18dc2..ea8b45c 100644
> --- a/lib/librte_gro/meson.build
> +++ b/lib/librte_gro/meson.build
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: BSD-3-Clause
>  # Copyright(c) 2017 Intel Corporation
>
> -sources = files('rte_gro.c', 'gro_tcp4.c', 'gro_udp4.c', 'gro_vxlan_tcp4.c')
> +sources = files('rte_gro.c', 'gro_tcp4.c', 'gro_udp4.c', 'gro_vxlan_tcp4.c', 'gro_vxlan_udp4.c')
>  headers = files('rte_gro.h')
>  deps += ['ethdev']
> diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
> index ac23df1..e56bd20 100644
> --- a/lib/librte_gro/rte_gro.c
> +++ b/lib/librte_gro/rte_gro.c
> @@ -11,6 +11,7 @@
>  #include "gro_tcp4.h"
>  #include "gro_udp4.h"
>  #include "gro_vxlan_tcp4.h"
> +#include "gro_vxlan_udp4.h"
>
>  typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
>  		uint16_t max_flow_num,
> @@ -20,14 +21,14 @@
>
>  static gro_tbl_create_fn tbl_create_fn[RTE_GRO_TYPE_MAX_NUM] = {
>  		gro_tcp4_tbl_create, gro_vxlan_tcp4_tbl_create,
> -		gro_udp4_tbl_create, NULL};
> +		gro_udp4_tbl_create, gro_vxlan_udp4_tbl_create, NULL};
>  static gro_tbl_destroy_fn tbl_destroy_fn[RTE_GRO_TYPE_MAX_NUM] = {
>  			gro_tcp4_tbl_destroy, gro_vxlan_tcp4_tbl_destroy,
> -			gro_udp4_tbl_destroy,
> +			gro_udp4_tbl_destroy, gro_vxlan_udp4_tbl_destroy,
>  			NULL};
>  static gro_tbl_pkt_count_fn tbl_pkt_count_fn[RTE_GRO_TYPE_MAX_NUM] = {
>  			gro_tcp4_tbl_pkt_count, gro_vxlan_tcp4_tbl_pkt_count,
> -			gro_udp4_tbl_pkt_count,
> +			gro_udp4_tbl_pkt_count, gro_vxlan_udp4_tbl_pkt_count,
>  			NULL};
>
>  #define IS_IPV4_TCP_PKT(ptype) (RTE_ETH_IS_IPV4_HDR(ptype) && \
> @@ -47,6 +48,16 @@
>  		   RTE_PTYPE_INNER_L3_IPV4_EXT | \
>  		   RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN)) != 0))
>
> +#define IS_IPV4_VXLAN_UDP4_PKT(ptype) (RTE_ETH_IS_IPV4_HDR(ptype) && \
> +		((ptype & RTE_PTYPE_L4_UDP) == RTE_PTYPE_L4_UDP) && \
> +		((ptype & RTE_PTYPE_TUNNEL_VXLAN) == \
> +		 RTE_PTYPE_TUNNEL_VXLAN) && \
> +		((ptype & RTE_PTYPE_INNER_L4_UDP) == \
> +		 RTE_PTYPE_INNER_L4_UDP) && \
> +		(((ptype & RTE_PTYPE_INNER_L3_MASK) & \
> +		  (RTE_PTYPE_INNER_L3_IPV4 | \
> +		   RTE_PTYPE_INNER_L3_IPV4_EXT | \
> +		   RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN)) != 0))
>
>  /*
>   * GRO context structure. It keeps the table structures, which are
> @@ -137,19 +148,27 @@ struct gro_ctx {
>  	struct gro_udp4_item udp_items[RTE_GRO_MAX_BURST_ITEM_NUM] = {{0} };
>
>  	/* Allocate a reassembly table for VXLAN TCP GRO */
> -	struct gro_vxlan_tcp4_tbl vxlan_tbl;
> -	struct gro_vxlan_tcp4_flow vxlan_flows[RTE_GRO_MAX_BURST_ITEM_NUM];
> -	struct gro_vxlan_tcp4_item vxlan_items[RTE_GRO_MAX_BURST_ITEM_NUM]
> +	struct gro_vxlan_tcp4_tbl vxlan_tcp_tbl;
> +	struct gro_vxlan_tcp4_flow vxlan_tcp_flows[RTE_GRO_MAX_BURST_ITEM_NUM];
> +	struct gro_vxlan_tcp4_item vxlan_tcp_items[RTE_GRO_MAX_BURST_ITEM_NUM]
>  			= {{{0}, 0, 0} };
>
> +	/* Allocate a reassembly table for VXLAN UDP GRO */
> +	struct gro_vxlan_udp4_tbl vxlan_udp_tbl;
> +	struct gro_vxlan_udp4_flow vxlan_udp_flows[RTE_GRO_MAX_BURST_ITEM_NUM];
> +	struct gro_vxlan_udp4_item vxlan_udp_items[RTE_GRO_MAX_BURST_ITEM_NUM]
> +			= {{{0}} };
> +
>  	struct rte_mbuf *unprocess_pkts[nb_pkts];
>  	uint32_t item_num;
>  	int32_t ret;
>  	uint16_t i, unprocess_num = 0, nb_after_gro = nb_pkts;
> -	uint8_t do_tcp4_gro = 0, do_vxlan_gro = 0, do_udp4_gro = 0;
> +	uint8_t do_tcp4_gro = 0, do_vxlan_tcp_gro = 0, do_udp4_gro = 0,
> +		do_vxlan_udp_gro = 0;
>
>  	if (unlikely((param->gro_types & (RTE_GRO_IPV4_VXLAN_TCP_IPV4 |
>  					RTE_GRO_TCP_IPV4 |
> +					RTE_GRO_IPV4_VXLAN_UDP_IPV4 |
>  					RTE_GRO_UDP_IPV4)) == 0))
>  		return nb_pkts;
>
> @@ -160,15 +179,28 @@ struct gro_ctx {
>
>  	if (param->gro_types & RTE_GRO_IPV4_VXLAN_TCP_IPV4) {
>  		for (i = 0; i < item_num; i++)
> -			vxlan_flows[i].start_index = INVALID_ARRAY_INDEX;
> -
> -		vxlan_tbl.flows = vxlan_flows;
> -		vxlan_tbl.items = vxlan_items;
> -		vxlan_tbl.flow_num = 0;
> -		vxlan_tbl.item_num = 0;
> -		vxlan_tbl.max_flow_num = item_num;
> -		vxlan_tbl.max_item_num = item_num;
> -		do_vxlan_gro = 1;
> +			vxlan_tcp_flows[i].start_index = INVALID_ARRAY_INDEX;
> +
> +		vxlan_tcp_tbl.flows = vxlan_tcp_flows;
> +		vxlan_tcp_tbl.items = vxlan_tcp_items;
> +		vxlan_tcp_tbl.flow_num = 0;
> +		vxlan_tcp_tbl.item_num = 0;
> +		vxlan_tcp_tbl.max_flow_num = item_num;
> +		vxlan_tcp_tbl.max_item_num = item_num;
> +		do_vxlan_tcp_gro = 1;
> +	}
> +
> +	if (param->gro_types & RTE_GRO_IPV4_VXLAN_UDP_IPV4) {
> +		for (i = 0; i < item_num; i++)
> +			vxlan_udp_flows[i].start_index = INVALID_ARRAY_INDEX;
> +
> +		vxlan_udp_tbl.flows = vxlan_udp_flows;
> +		vxlan_udp_tbl.items = vxlan_udp_items;
> +		vxlan_udp_tbl.flow_num = 0;
> +		vxlan_udp_tbl.item_num = 0;
> +		vxlan_udp_tbl.max_flow_num = item_num;
> +		vxlan_udp_tbl.max_item_num = item_num;
> +		do_vxlan_udp_gro = 1;
>  	}
>
>  	if (param->gro_types & RTE_GRO_TCP_IPV4) {
> @@ -204,9 +236,18 @@ struct gro_ctx {
>  		 * will be flushed from the tables.
>  		 */
>  		if (IS_IPV4_VXLAN_TCP4_PKT(pkts[i]->packet_type) &&
> -				do_vxlan_gro) {
> +				do_vxlan_tcp_gro) {
>  			ret = gro_vxlan_tcp4_reassemble(pkts[i],
> -					&vxlan_tbl, 0);
> +					&vxlan_tcp_tbl, 0);
> +			if (ret > 0)
> +				/* Merge successfully */
> +				nb_after_gro--;
> +			else if (ret < 0)
> +				unprocess_pkts[unprocess_num++] = pkts[i];
> +		} else if (IS_IPV4_VXLAN_UDP4_PKT(pkts[i]->packet_type) &&
> +				do_vxlan_udp_gro) {
> +			ret = gro_vxlan_udp4_reassemble(pkts[i],
> +					&vxlan_udp_tbl, 0);
>  			if (ret > 0)
>  				/* Merge successfully */
>  				nb_after_gro--;
> @@ -236,11 +277,17 @@ struct gro_ctx {
>  		 || (unprocess_num < nb_pkts)) {
>  		i = 0;
>  		/* Flush all packets from the tables */
> -		if (do_vxlan_gro) {
> -			i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tbl,
> +		if (do_vxlan_tcp_gro) {
> +			i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
>  					0, pkts, nb_pkts);
>  		}
>
> +		if (do_vxlan_udp_gro) {
> +			i += gro_vxlan_udp4_tbl_timeout_flush(&vxlan_udp_tbl,
> +					0, &pkts[i], nb_pkts - i);
> +
> +		}
> +
>  		if (do_tcp4_gro) {
>  			i += gro_tcp4_tbl_timeout_flush(&tcp_tbl, 0,
>  					&pkts[i], nb_pkts - i);
> @@ -269,33 +316,42 @@ struct gro_ctx {
>  {
>  	struct rte_mbuf *unprocess_pkts[nb_pkts];
>  	struct gro_ctx *gro_ctx = ctx;
> -	void *tcp_tbl, *udp_tbl, *vxlan_tbl;
> +	void *tcp_tbl, *udp_tbl, *vxlan_tcp_tbl, *vxlan_udp_tbl;
>  	uint64_t current_time;
>  	uint16_t i, unprocess_num = 0;
> -	uint8_t do_tcp4_gro, do_vxlan_gro, do_udp4_gro;
> +	uint8_t do_tcp4_gro, do_vxlan_tcp_gro, do_udp4_gro, do_vxlan_udp_gro;
>
>  	if (unlikely((gro_ctx->gro_types & (RTE_GRO_IPV4_VXLAN_TCP_IPV4 |
>  					RTE_GRO_TCP_IPV4 |
> +					RTE_GRO_IPV4_VXLAN_UDP_IPV4 |
>  					RTE_GRO_UDP_IPV4)) == 0))
>  		return nb_pkts;
>
>  	tcp_tbl = gro_ctx->tbls[RTE_GRO_TCP_IPV4_INDEX];
> -	vxlan_tbl = gro_ctx->tbls[RTE_GRO_IPV4_VXLAN_TCP_IPV4_INDEX];
> +	vxlan_tcp_tbl = gro_ctx->tbls[RTE_GRO_IPV4_VXLAN_TCP_IPV4_INDEX];
>  	udp_tbl = gro_ctx->tbls[RTE_GRO_UDP_IPV4_INDEX];
> +	vxlan_udp_tbl = gro_ctx->tbls[RTE_GRO_IPV4_VXLAN_UDP_IPV4_INDEX];
>
>  	do_tcp4_gro = (gro_ctx->gro_types & RTE_GRO_TCP_IPV4) ==
>  		RTE_GRO_TCP_IPV4;
> -	do_vxlan_gro = (gro_ctx->gro_types & RTE_GRO_IPV4_VXLAN_TCP_IPV4) ==
> +	do_vxlan_tcp_gro = (gro_ctx->gro_types & RTE_GRO_IPV4_VXLAN_TCP_IPV4) ==
>  		RTE_GRO_IPV4_VXLAN_TCP_IPV4;
>  	do_udp4_gro = (gro_ctx->gro_types & RTE_GRO_UDP_IPV4) ==
>  		RTE_GRO_UDP_IPV4;
> +	do_vxlan_udp_gro = (gro_ctx->gro_types & RTE_GRO_IPV4_VXLAN_UDP_IPV4) ==
> +		RTE_GRO_IPV4_VXLAN_UDP_IPV4;
>
>  	current_time = rte_rdtsc();
>
>  	for (i = 0; i < nb_pkts; i++) {
>  		if (IS_IPV4_VXLAN_TCP4_PKT(pkts[i]->packet_type) &&
> -				do_vxlan_gro) {
> -			if (gro_vxlan_tcp4_reassemble(pkts[i], vxlan_tbl,
> +				do_vxlan_tcp_gro) {
> +			if (gro_vxlan_tcp4_reassemble(pkts[i], vxlan_tcp_tbl,
> +						current_time) < 0)
> +				unprocess_pkts[unprocess_num++] = pkts[i];
> +		} else if (IS_IPV4_VXLAN_UDP4_PKT(pkts[i]->packet_type) &&
> +				do_vxlan_udp_gro) {
> +			if (gro_vxlan_udp4_reassemble(pkts[i], vxlan_udp_tbl,
>  						current_time) < 0)
>  				unprocess_pkts[unprocess_num++] = pkts[i];
>  		} else if (IS_IPV4_TCP_PKT(pkts[i]->packet_type) &&
> @@ -341,6 +397,13 @@ struct gro_ctx {
>  		left_nb_out = max_nb_out - num;
>  	}
>
> +	if ((gro_types & RTE_GRO_IPV4_VXLAN_UDP_IPV4) && left_nb_out > 0) {
> +		num += gro_vxlan_udp4_tbl_timeout_flush(gro_ctx->tbls[
> +				RTE_GRO_IPV4_VXLAN_UDP_IPV4_INDEX],
> +				flush_timestamp, &out[num], left_nb_out);
> +		left_nb_out = max_nb_out - num;
> +	}
> +
>  	/* If no available space in 'out', stop flushing. */
>  	if ((gro_types & RTE_GRO_TCP_IPV4) && left_nb_out > 0) {
>  		num += gro_tcp4_tbl_timeout_flush(
> diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
> index 470f3ed..9f9ed49 100644
> --- a/lib/librte_gro/rte_gro.h
> +++ b/lib/librte_gro/rte_gro.h
> @@ -35,6 +35,9 @@
>  #define RTE_GRO_UDP_IPV4_INDEX 2
>  #define RTE_GRO_UDP_IPV4 (1ULL << RTE_GRO_UDP_IPV4_INDEX)
>  /**< UDP/IPv4 GRO flag */
> +#define RTE_GRO_IPV4_VXLAN_UDP_IPV4_INDEX 3
> +#define RTE_GRO_IPV4_VXLAN_UDP_IPV4 (1ULL << RTE_GRO_IPV4_VXLAN_UDP_IPV4_INDEX)
> +/**< VxLAN UDP/IPv4 GRO flag. */
>
>  /**
>   * Structure used to create GRO context objects or used to pass
> --
> 1.8.3.1