From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id C0D6BA04C7; Wed, 16 Sep 2020 04:44:42 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 16E771BFBC; Wed, 16 Sep 2020 04:44:42 +0200 (CEST) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id CCBB01BF78 for ; Wed, 16 Sep 2020 04:44:39 +0200 (CEST) IronPort-SDR: UfTv1PopUZntQoXLb4fQAwCi7ddFaYSCt+3CTd/e+o7Hblra1L9+ZvPRq2sQYxtg3GxsQOmYp3 GMkcPXPiWyRA== X-IronPort-AV: E=McAfee;i="6000,8403,9745"; a="159439349" X-IronPort-AV: E=Sophos;i="5.76,430,1592895600"; d="scan'208";a="159439349" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2020 19:44:38 -0700 IronPort-SDR: h4a/rsjzVO7Lt7ta/xpqxRU8ufgm9uT1oWrWf4PTLzPYWIFyBgXJ9fx/3Wcwbr/IR53iTM4Arg FhGwW/mX5CTQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,430,1592895600"; d="scan'208";a="483107458" Received: from npg_dpdk_virtio_jiayuhu_15.sh.intel.com ([10.67.117.43]) by orsmga005.jf.intel.com with ESMTP; 15 Sep 2020 19:44:36 -0700 Date: Wed, 16 Sep 2020 10:54:24 +0800 From: Jiayu Hu To: yang_y_yi Cc: yangyi01@inspur.com, thomas@monjalon.net, dev@dpdk.org, jiayu.hu@intel.com Message-ID: <20200916025424.GA90702@NPG_DPDK_VIRTIO_jiayuhu_15.sh.intel.com> References: <20200910092914.120452-1-yang_y_yi@163.com> <20200910092914.120452-3-yang_y_yi@163.com> <20200911042416.GA46174@NPG_DPDK_VIRTIO_jiayuhu_15.sh.intel.com> <1623572b.1b82.1748a63b63f.Coremail.yang_y_yi@163.com> <20200914042126.GA15258@NPG_DPDK_VIRTIO_jiayuhu_15.sh.intel.com> <20ddfcd3.5b19.1748bb90144.Coremail.yang_y_yi@163.com> <662209b31e344cfb8994a5b5cdcb0375@intel.com> <778c4739.666a.1748be56189.Coremail.yang_y_yi@163.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <778c4739.666a.1748be56189.Coremail.yang_y_yi@163.com> User-Agent: Mutt/1.7.1 (2016-10-04) Subject: Re: [dpdk-dev] [PATCH v5 2/2] gro: add VXLAN UDP GRO support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Mon, Sep 14, 2020 at 05:14:59PM +0800, yang_y_yi wrote: > > Jiayu, VM to VM case for big UDP packet (say udp parload size is 8192K, but mtu > 1500). VM will segment it as UDP framgments, but when ovs dpdk uses VxLan to > encapsulate them, it doesn't know they are same flow, OVS DPDK has no knowldge > about them, just encapsulate them as if they are not related. But on receive > side on another machine, OVS DPDK will GRO them, at this point, outer IP ID is > even random for every packet. For TCP, I think same issue is there, GRO just > considered them as different flow and doesn't do GRO, in my opinion, we also > should do GRO for such case. I am a little confused about ovs behavior. Does ovs know they are udp fragments? Or it knows they are fragments but doesn't know they are from one flow? Given ovs receives 3 udp fragments of a udp packet, if ovs doesn't know they are fragments of a udp packet, their mbuf->packet_type will be ether/ipv4/udp, ether/ipv4, and ether/ipv4. dpdk udp gro will not process the second and third one as their packet type in mbuf is not ether/ipv4/udp. > > > 2020-09-14 16:50:08 "Hu, Jiayu" ะด > > Out IP ID is not used to check if they belong to the same flow, > > but check if they are neighbors. If two vxlan/udp fragments are > > neighbors, their outer IP ID and outer frag_oft should be incremental. > > I still cannot understand why ignore outer IP ID when DF is 0. > > Can you give more explanation? > > > > From: yang_y_yi > Sent: Monday, September 14, 2020 4:27 PM > To: Hu, Jiayu > Cc: thomas@monjalon.net; yangyi01@inspur.com; dev@dpdk.org > Subject: Re:Re: Re: [PATCH v5 2/2] gro: add VXLAN UDP GRO support > Importance: High > > > > For outer_ip_id, there is same ip_id disorder issue existing as UDP GRO, so > we can't use ip_id +/-1 to check if they are same flow, here is my > incrmental change for it with more comments. > > > > @@ -189,7 +188,12 @@ > is_same_vxlan_udp4_flow(struct vxlan_udp4_flow_key k1, > struct vxlan_udp4_flow_key k2) > { > - /* src port is changing, so shouldn't use it for flow check */ > + /* For VxLAN packet, outer udp src port is calculated from > + * inner packet RSS hash, udp src port of the first UDP > + * fragment is different from one of other UDP fragments > + * even if they are same flow, so we have to skip outer udp > + * src port comparison here. > + */ > return (rte_is_same_ether_addr(&k1.outer_eth_saddr, > &k2.outer_eth_saddr) && > rte_is_same_ether_addr(&k1.outer_eth_daddr, > @@ -212,12 +216,16 @@ > uint16_t l2_offset; > int ret = 0; > > + /* Note: if outer DF bit is set, i.e outer_is_atomic is 0, > + * we needn't compare outer_ip_id because they are same, > + * for the case outer_is_atomic is 1, we also have no way > + * to comapre outer_ip_id because the received packets can > + * be out of order. So ignore outer_ip_id comparison here. > + */ > + > l2_offset = pkt->outer_l2_len + pkt->outer_l3_len; > cmp = udp4_check_neighbor(&item->inner_item, frag_offset, ip_dl, > l2_offset); > - /* VXLAN outer IP ID is out of order, so don't touch it and > - * don't compare it. > - */ > if (cmp > 0) > /* Append the new packet. */ > ret = 1; > @@ -354,7 +362,9 @@ > rte_ether_addr_copy(&(outer_eth_hdr->d_addr), & > (key.outer_eth_daddr)); > key.outer_ip_src_addr = outer_ipv4_hdr->src_addr; > key.outer_ip_dst_addr = outer_ipv4_hdr->dst_addr; > - key.outer_src_port = udp_hdr->src_port; > + /* Note: It is unnecessary to save outer_src_port here because it > can > + * be different for VxLAN UDP fragments from the same flow. > + */ > key.outer_dst_port = udp_hdr->dst_port; > > /* Search for a matched flow. */ > > > > > > > > > At 2020-09-14 12:21:26, "Jiayu Hu" wrote: > > >Replies are inline. > > > > > >BTW, you need to update the programmer guide > > >doc/guides/prog_guide/generic_receive_offload_lib.rst and > > >release note doc/guides/rel_notes/release_20_11.rst. > > > > > >Thanks, > > >Jiayu > > > > > >On Mon, Sep 14, 2020 at 10:13:44AM +0800, yang_y_yi wrote: > > >> Jiayu, thank you so much, please check my replies for some of your comments, > > >> here is incremental patch. I built it by meson this time :-) > > >> > > >> diff --git a/lib/librte_gro/gro_vxlan_udp4.c b/lib/librte_gro/gro_vxlan_udp4.c > > >> index 9e9df72..1fcfaf1 100644 > > >> --- a/lib/librte_gro/gro_vxlan_udp4.c > > >> +++ b/lib/librte_gro/gro_vxlan_udp4.c > > >> @@ -203,7 +203,7 @@ > > >> } > > >> > > >> static inline int > > >> -udp_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item, > > >> +udp4_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item, > > >> uint16_t frag_offset, > > >> uint16_t ip_dl) > > >> { > > >> @@ -213,7 +213,7 @@ > > >> int ret = 0; > > >> > > >> l2_offset = pkt->outer_l2_len + pkt->outer_l3_len; > > >> - cmp = udp_check_neighbor(&item->inner_item, frag_offset, ip_dl, > > >> + cmp = udp4_check_neighbor(&item->inner_item, frag_offset, ip_dl, > > >> l2_offset); > > >> /* VXLAN outer IP ID is out of order, so don't touch it and > > >> * don't compare it. > > >> @@ -379,7 +379,7 @@ > > >> item_idx = insert_new_item(tbl, pkt, start_time, > > >> INVALID_ARRAY_INDEX, frag_offset, > > >> is_last_frag, outer_ip_id, outer_is_atomic); > > >> - if (item_idx == INVALID_ARRAY_INDEX) > > >> + if (unlikely(item_idx == INVALID_ARRAY_INDEX)) > > >> return -1; > > >> if (insert_new_flow(tbl, &key, item_idx) == > > >> INVALID_ARRAY_INDEX) { > > >> @@ -397,7 +397,7 @@ > > >> cur_idx = tbl->flows[i].start_index; > > >> prev_idx = cur_idx; > > >> do { > > >> - cmp = udp_check_vxlan_neighbor(&(tbl->items[cur_idx]), > > >> + cmp = udp4_check_vxlan_neighbor(&(tbl->items[cur_idx]), > > >> frag_offset, ip_dl); > > >> if (cmp) { > > >> if (merge_two_vxlan_udp4_packets( > > >> @@ -436,7 +436,7 @@ > > >> item_idx = insert_new_item(tbl, pkt, start_time, > > >> INVALID_ARRAY_INDEX, frag_offset, > > >> is_last_frag, outer_ip_id, outer_is_atomic); > > >> - if (item_idx == INVALID_ARRAY_INDEX) > > >> + if (unlikely(item_idx == INVALID_ARRAY_INDEX)) > > >> return -1; > > >> tbl->items[item_idx].inner_item.next_pkt_idx = cur_idx; > > >> tbl->flows[i].start_index = item_idx; > > >> @@ -470,7 +470,7 @@ > > >> ip_dl = pkt->pkt_len - hdr_len; > > >> diff --git a/lib/librte_gro/gro_vxlan_udp4.c b/lib/librte_gro/gro_vxlan_udp4.c > > >> index 9e9df72..1fcfaf1 100644 > > >> --- a/lib/librte_gro/gro_vxlan_udp4.c > > >> +++ b/lib/librte_gro/gro_vxlan_udp4.c > > >> @@ -203,7 +203,7 @@ > > >> } > > >> > > >> static inline int > > >> -udp_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item, > > >> +udp4_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item, > > >> uint16_t frag_offset, > > >> uint16_t ip_dl) > > >> { > > >> @@ -213,7 +213,7 @@ > > >> int ret = 0; > > >> > > >> l2_offset = pkt->outer_l2_len + pkt->outer_l3_len; > > >> - cmp = udp_check_neighbor(&item->inner_item, frag_offset, ip_dl, > > >> + cmp = udp4_check_neighbor(&item->inner_item, frag_offset, ip_dl, > > >> l2_offset); > > >> /* VXLAN outer IP ID is out of order, so don't touch it and > > >> * don't compare it. > > >> @@ -379,7 +379,7 @@ > > >> item_idx = insert_new_item(tbl, pkt, start_time, > > >> INVALID_ARRAY_INDEX, frag_offset, > > >> is_last_frag, outer_ip_id, outer_is_atomic); > > >> - if (item_idx == INVALID_ARRAY_INDEX) > > >> + if (unlikely(item_idx == INVALID_ARRAY_INDEX)) > > >> return -1; > > >> if (insert_new_flow(tbl, &key, item_idx) == > > >> INVALID_ARRAY_INDEX) { > > >> @@ -397,7 +397,7 @@ > > >> cur_idx = tbl->flows[i].start_index; > > >> prev_idx = cur_idx; > > >> do { > > >> - cmp = udp_check_vxlan_neighbor(&(tbl->items[cur_idx]), > > >> + cmp = udp4_check_vxlan_neighbor(&(tbl->items[cur_idx]), > > >> frag_offset, ip_dl); > > >> if (cmp) { > > >> if (merge_two_vxlan_udp4_packets( > > >> @@ -436,7 +436,7 @@ > > >> item_idx = insert_new_item(tbl, pkt, start_time, > > >> INVALID_ARRAY_INDEX, frag_offset, > > >> is_last_frag, outer_ip_id, outer_is_atomic); > > >> - if (item_idx == INVALID_ARRAY_INDEX) > > >> + if (unlikely(item_idx == INVALID_ARRAY_INDEX)) > > >> return -1; > > >> tbl->items[item_idx].inner_item.next_pkt_idx = cur_idx; > > >> tbl->flows[i].start_index = item_idx; > > >> @@ -470,7 +470,7 @@ > > >> ip_dl = pkt->pkt_len - hdr_len; > > >> frag_offset = tbl->items[item_idx].inner_item.frag_offset; > > >> is_last_frag = tbl->items[item_idx].inner_item.is_last_frag; > > >> - cmp = udp_check_vxlan_neighbor(&(tbl->items[start_idx]), > > >> + cmp = udp4_check_vxlan_neighbor(&(tbl->items[start_idx]), > > >> frag_offset, ip_dl); > > >> if (cmp) { > > >> if (merge_two_vxlan_udp4_packets( > > >> @@ -481,12 +481,10 @@ > > >> INVALID_ARRAY_INDEX); > > >> tbl->items[start_idx].inner_item.next_pkt_idx > > >> = item_idx; > > >> - } else { > > >> + } else > > >> return 0; > > >> - } > > >> - } else { > > >> + } else > > >> return 0; > > >> - } > > >> } > > >> > > >> return 0; > > >> > > >> > > >> At 2020-09-11 12:24:16, "Jiayu Hu" wrote: > > >> >Some comments are inline. > > >> > > > >> >Thanks, > > >> >Jiayu > > >> > > > >> >On Thu, Sep 10, 2020 at 05:29:14PM +0800, yang_y_yi@163.com wrote: > > >> >> From: Yi Yang > > >> >> > > >> >> VXLAN UDP GRO can help improve VM-to-VM UDP performance > > >> >> when VM is enabled UFO or GSO, GRO must be supported if > > >> >> GSO or UFO is enabled, otherwise, performance gain will > > >> >> be hurt. > > >> >> > > >> >> With this enabled in DPDK, OVS DPDK can leverage it to > > >> >> improve VM-to-VM UDP performance, this will make sure > > >> >> IP fragments will be reassembled once it is received > > >> >> from physical NIC. It is very helpful in OVS DPDK VXLAN > > >> >> TSO case. > > >> >> > > >> >> Signed-off-by: Yi Yang > > >> >> --- > > >> >> lib/librte_gro/gro_udp4.h | 1 + > > >> >> lib/librte_gro/gro_vxlan_udp4.c | 548 ++++++++++++++++++++++++++++++++++++++++ > > >> >> lib/librte_gro/gro_vxlan_udp4.h | 152 +++++++++++ > > >> >> lib/librte_gro/meson.build | 2 +- > > >> >> lib/librte_gro/rte_gro.c | 115 +++++++-- > > >> >> lib/librte_gro/rte_gro.h | 3 + > > >> >> 6 files changed, 794 insertions(+), 27 deletions(-) > > >> >> create mode 100644 lib/librte_gro/gro_vxlan_udp4.c > > >> >> create mode 100644 lib/librte_gro/gro_vxlan_udp4.h > > >> >> > > >> >> diff --git a/lib/librte_gro/gro_udp4.h b/lib/librte_gro/gro_udp4.h > > >> >> index 0a078e4..d38b393 100644 > > >> >> --- a/lib/librte_gro/gro_udp4.h > > >> >> +++ b/lib/librte_gro/gro_udp4.h > > >> >> @@ -7,6 +7,7 @@ > > >> >> > > >> >> #include > > >> >> #include > > >> >> +#include > > >> >> > > >> >> #define INVALID_ARRAY_INDEX 0xffffffffUL > > >> >> #define GRO_UDP4_TBL_MAX_ITEM_NUM (1024UL * 1024UL) > > >> >> diff --git a/lib/librte_gro/gro_vxlan_udp4.c b/lib/librte_gro/gro_vxlan_udp4.c > > >> >> new file mode 100644 > > >> >> index 0000000..9e9df72 > > >> >> --- /dev/null > > >> >> +++ b/lib/librte_gro/gro_vxlan_udp4.c > > >> >> @@ -0,0 +1,548 @@ > > >> >> +/* SPDX-License-Identifier: BSD-3-Clause > > >> >> + * Copyright(c) 2020 Inspur Corporation > > >> >> + */ > > >> >> + > > >> >> + > > >> >> + > > >> >> +static inline uint32_t > > >> >> +delete_item(struct gro_vxlan_udp4_tbl *tbl, > > >> >> + uint32_t item_idx, > > >> >> + uint32_t prev_item_idx) > > >> >> +{ > > >> >> + uint32_t next_idx = tbl->items[item_idx].inner_item.next_pkt_idx; > > >> >> + > > >> >> + /* NULL indicates an empty item. */ > > >> >> + tbl->items[item_idx].inner_item.firstseg = NULL; > > >> >> + tbl->item_num--; > > >> >> + if (prev_item_idx != INVALID_ARRAY_INDEX) > > >> >> + tbl->items[prev_item_idx].inner_item.next_pkt_idx = next_idx; > > >> >> + > > >> >> + return next_idx; > > >> >> +} > > >> >> + > > >> >> +static inline uint32_t > > >> >> +insert_new_flow(struct gro_vxlan_udp4_tbl *tbl, > > >> >> + struct vxlan_udp4_flow_key *src, > > >> >> + uint32_t item_idx) > > >> >> +{ > > >> >> + struct vxlan_udp4_flow_key *dst; > > >> >> + uint32_t flow_idx; > > >> >> + > > >> >> + flow_idx = find_an_empty_flow(tbl); > > >> >> + if (unlikely(flow_idx == INVALID_ARRAY_INDEX)) > > >> >> + return INVALID_ARRAY_INDEX; > > >> >> + > > >> >> + dst = &(tbl->flows[flow_idx].key); > > >> >> + > > >> >> + rte_ether_addr_copy(&(src->inner_key.eth_saddr), > > >> >> + &(dst->inner_key.eth_saddr)); > > >> >> + rte_ether_addr_copy(&(src->inner_key.eth_daddr), > > >> >> + &(dst->inner_key.eth_daddr)); > > >> >> + dst->inner_key.ip_src_addr = src->inner_key.ip_src_addr; > > >> >> + dst->inner_key.ip_dst_addr = src->inner_key.ip_dst_addr; > > >> >> + dst->inner_key.ip_id = src->inner_key.ip_id; > > >> >> + > > >> >> + dst->vxlan_hdr.vx_flags = src->vxlan_hdr.vx_flags; > > >> >> + dst->vxlan_hdr.vx_vni = src->vxlan_hdr.vx_vni; > > >> >> + rte_ether_addr_copy(&(src->outer_eth_saddr), &(dst->outer_eth_saddr)); > > >> >> + rte_ether_addr_copy(&(src->outer_eth_daddr), &(dst->outer_eth_daddr)); > > >> >> + dst->outer_ip_src_addr = src->outer_ip_src_addr; > > >> >> + dst->outer_ip_dst_addr = src->outer_ip_dst_addr; > > >> >> + dst->outer_src_port = src->outer_src_port; > > >> >> + dst->outer_dst_port = src->outer_dst_port; > > >> >> + > > >> >> + tbl->flows[flow_idx].start_index = item_idx; > > >> >> + tbl->flow_num++; > > >> >> + > > >> >> + return flow_idx; > > >> >> +} > > >> >> + > > >> >> +static inline int > > >> >> +is_same_vxlan_udp4_flow(struct vxlan_udp4_flow_key k1, > > >> >> + struct vxlan_udp4_flow_key k2) > > >> >> +{ > > >> >> + /* src port is changing, so shouldn't use it for flow check */ > > >> > > > >> >As I know, fragments of a vxlan/udp/ipv4 packet have same outer source > > >> >port. In what cases outer source port will change? > > >> > > >> In OVS, vxlan udp soure port is calculated from inner packet RSS, so for UDP fragment, RSS of first packet is > > >> different from other fragments because RSS is calculated based on 5 tuples (src ip, src port, dst ip, dst port, > > >> protocol type). > > >> > > >> ovs/lib/netdev-native-tnl.h: > > >> > > >> static inline uint32_t * > > >> dp_packet_rss_ptr(const struct dp_packet *b) > > >> { > > >> return CONST_CAST(uint32_t *, &b->mbuf.hash.rss); > > >> } > > >> > > >> static inline uint32_t > > >> dp_packet_get_rss_hash(const struct dp_packet *p) > > >> { > > >> return *dp_packet_rss_ptr(p); > > >> } > > >> > > >> static inline ovs_be16 > > >> netdev_tnl_get_src_port(struct dp_packet *packet) > > >> { > > >> uint32_t hash; > > >> > > >> hash = dp_packet_get_rss_hash(packet); > > >> > > >> return htons((((uint64_t) hash * (tnl_udp_port_max - tnl_udp_port_min)) >> 32) + > > >> tnl_udp_port_min); > > >> } > > >> > > >> void > > >> netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED, > > >> struct dp_packet *packet, > > >> const struct ovs_action_push_tnl *data) > > >> { > > >> struct udp_header *udp; > > >> int ip_tot_size; > > >> > > >> udp = netdev_tnl_push_ip_header(packet, data->header, data->header_len, &ip_tot_size); > > >> > > >> /* set udp src port */ > > >> udp->udp_src = netdev_tnl_get_src_port(packet); > > >> udp->udp_len = htons(ip_tot_size); > > >> > > >> if (udp->udp_csum) { > > >> netdev_tnl_calc_udp_csum(udp, packet, ip_tot_size); > > >> } > > >> } > > >> > > >> This is special for ECMP which only checks outer header to differentiate different flows (innner packet). The > > >> first fragment of original inner UDP is considered as different flow from other fragments. > > >> > > >> I found this by debugging becuase the same UDP traffic did generate different vxlan UDP source port. > > >> > > > > > >You are right. I checked VxLAN RFC 7348, and outer source port is > > >calculated from inner packet. In OVS case, the first fragment > > >will have different outer source port compared with other fragments. > > >Please add some comments in the code for people to better understand > > >the code. In addition, need to remove outer_src_port from the structure > > >vxlan_udp4_flow_key. > > > > > >> > > > >> >> + return (rte_is_same_ether_addr(&k1.outer_eth_saddr, > > >> >> + &k2.outer_eth_saddr) && > > >> >> + rte_is_same_ether_addr(&k1.outer_eth_daddr, > > >> >> + &k2.outer_eth_daddr) && > > >> >> + (k1.outer_ip_src_addr == k2.outer_ip_src_addr) && > > >> >> + (k1.outer_ip_dst_addr == k2.outer_ip_dst_addr) && > > >> >> + (k1.outer_dst_port == k2.outer_dst_port) && > > >> >> + (k1.vxlan_hdr.vx_flags == k2.vxlan_hdr.vx_flags) && > > >> >> + (k1.vxlan_hdr.vx_vni == k2.vxlan_hdr.vx_vni) && > > >> >> + is_same_udp4_flow(k1.inner_key, k2.inner_key)); > > >> >> +} > > >> >> + > > >> >> +static inline int > > >> >> +udp_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item, > > >> >> + uint16_t frag_offset, > > >> >> + uint16_t ip_dl) > > >> > > > >> >Better rename it as udp4_check_vxlan_neighbor. > > >> > > >> Ok > > >> > > >> > > > >> >> +{ > > >> >> + struct rte_mbuf *pkt = item->inner_item.firstseg; > > >> >> + int cmp; > > >> >> + uint16_t l2_offset; > > >> >> + int ret = 0; > > >> >> + > > >> >> + l2_offset = pkt->outer_l2_len + pkt->outer_l3_len; > > >> >> + cmp = udp_check_neighbor(&item->inner_item, frag_offset, ip_dl, > > >> >> + l2_offset); > > >> > > > >> >Why don't check outer IP ID? If outer DF bit is not set, all fragments > > >> >should have incremental outer IP ID; if DF bit is set, we don't need > > >> >to check outer IP ID. > > >> > > >> By default, OVS set DF flag. I don't think we need to compare outer IP id because inner check > > >> is enough to differentiate if they are same flow, isn't it? > > > > > >OVS is one of workloads using DPDK, and we can assume > > >other cases have the same design as OVS. I think IP ID > > >check is necessary. > > > > > >> > > >> > > > >> >In addition, udp_check_neighbor is renamed, but you didn't change here. > > >> > > >> Forgot to build it, Makefile has been removed from dpdk main branch, build isn't so convinient, will > > >> use meson build to check it for next version. > > >> > > >> > > > >> >> + /* VXLAN outer IP ID is out of order, so don't touch it and > > >> >> + * don't compare it. > > >> >> + */ > > >> >> + if (cmp > 0) > > >> >> + /* Append the new packet. */ > > >> >> + ret = 1; > > >> >> + else if (cmp < 0) > > >> >> + /* Prepend the new packet. */ > > >> >> + ret = -1; > > >> >> + > > >> >> + return ret; > > >> >> +} > > >> >> + > > >> >> > > >> >> /* > > >> >> * GRO context structure. It keeps the table structures, which are > > >> >> @@ -137,19 +148,27 @@ struct gro_ctx { > > >> >> struct gro_udp4_item udp_items[RTE_GRO_MAX_BURST_ITEM_NUM] = {{0} }; > > >> >> > > >> >> /* Allocate a reassembly table for VXLAN TCP GRO */ > > >> >> - struct gro_vxlan_tcp4_tbl vxlan_tbl; > > >> >> - struct gro_vxlan_tcp4_flow vxlan_flows[RTE_GRO_MAX_BURST_ITEM_NUM]; > > >> >> - struct gro_vxlan_tcp4_item vxlan_items[RTE_GRO_MAX_BURST_ITEM_NUM] > > >> >> + struct gro_vxlan_tcp4_tbl vxlan_tcp_tbl; > > >> >> + struct gro_vxlan_tcp4_flow vxlan_tcp_flows[RTE_GRO_MAX_BURST_ITEM_NUM]; > > >> >> + struct gro_vxlan_tcp4_item vxlan_tcp_items[RTE_GRO_MAX_BURST_ITEM_NUM] > > >> >> + = {{{0}, 0, 0} }; > > >> >> + > > >> >> + /* Allocate a reassembly table for VXLAN UDP GRO */ > > >> >> + struct gro_vxlan_udp4_tbl vxlan_udp_tbl; > > >> >> + struct gro_vxlan_udp4_flow vxlan_udp_flows[RTE_GRO_MAX_BURST_ITEM_NUM]; > > >> >> + struct gro_vxlan_udp4_item vxlan_udp_items[RTE_GRO_MAX_BURST_ITEM_NUM] > > >> >> = {{{0}, 0, 0} }; > > >> >> > > >> >> struct rte_mbuf *unprocess_pkts[nb_pkts]; > > >> >> uint32_t item_num; > > >> >> int32_t ret; > > >> >> uint16_t i, unprocess_num = 0, nb_after_gro = nb_pkts; > > >> >> - uint8_t do_tcp4_gro = 0, do_vxlan_gro = 0, do_udp4_gro = 0; > > >> >> + uint8_t do_tcp4_gro = 0, do_vxlan_tcp_gro = 0, do_udp4_gro = 0, > > >> >> + do_vxlan_udp_gro = 0; > > >> >> > > >> >> if (unlikely((param->gro_types & (RTE_GRO_IPV4_VXLAN_TCP_IPV4 | > > >> >> RTE_GRO_TCP_IPV4 | > > >> >> + RTE_GRO_IPV4_VXLAN_UDP_IPV4 | > > >> >> RTE_GRO_UDP_IPV4)) == 0)) > > >> >> return nb_pkts; > > >> >> > > >> >> @@ -160,15 +179,28 @@ struct gro_ctx { > > >> >> > > >> >> if (param->gro_types & RTE_GRO_IPV4_VXLAN_TCP_IPV4) { > > >> >> for (i = 0; i < item_num; i++) > > >> >> - vxlan_flows[i].start_index = INVALID_ARRAY_INDEX; > > >> >> - > > >> >> - vxlan_tbl.flows = vxlan_flows; > > >> >> - vxlan_tbl.items = vxlan_items; > > >> >> - vxlan_tbl.flow_num = 0; > > >> >> - vxlan_tbl.item_num = 0; > > >> >> - vxlan_tbl.max_flow_num = item_num; > > >> >> - vxlan_tbl.max_item_num = item_num; > > >> >> - do_vxlan_gro = 1; > > >> >> + vxlan_tcp_flows[i].start_index = INVALID_ARRAY_INDEX; > > >> >> + > > >> >> + vxlan_tcp_tbl.flows = vxlan_tcp_flows; > > >> >> + vxlan_tcp_tbl.items = vxlan_tcp_items; > > >> >> + vxlan_tcp_tbl.flow_num = 0; > > >> >> + vxlan_tcp_tbl.item_num = 0; > > >> >> + vxlan_tcp_tbl.max_flow_num = item_num; > > >> >> + vxlan_tcp_tbl.max_item_num = item_num; > > >> >> + do_vxlan_tcp_gro = 1; > > >> >> + } > > >> >> + > > >> >> + if (param->gro_types & RTE_GRO_IPV4_VXLAN_UDP_IPV4) { > > >> >> + for (i = 0; i < item_num; i++) > > >> >> + vxlan_udp_flows[i].start_index = INVALID_ARRAY_INDEX; > > >> >> + > > >> >> + vxlan_udp_tbl.flows = vxlan_udp_flows; > > >> >> + vxlan_udp_tbl.items = vxlan_udp_items; > > >> >> + vxlan_udp_tbl.flow_num = 0; > > >> >> + vxlan_udp_tbl.item_num = 0; > > >> >> + vxlan_udp_tbl.max_flow_num = item_num; > > >> >> + vxlan_udp_tbl.max_item_num = item_num; > > >> >> + do_vxlan_udp_gro = 1; > > >> >> } > > >> >> > > >> >> if (param->gro_types & RTE_GRO_TCP_IPV4) { > > >> >> @@ -204,9 +236,18 @@ struct gro_ctx { > > >> >> * will be flushed from the tables. > > >> >> */ > > >> >> if (IS_IPV4_VXLAN_TCP4_PKT(pkts[i]->packet_type) && > > >> >> - do_vxlan_gro) { > > >> >> + do_vxlan_tcp_gro) { > > >> >> ret = gro_vxlan_tcp4_reassemble(pkts[i], > > >> >> - &vxlan_tbl, 0); > > >> >> + &vxlan_tcp_tbl, 0); > > >> >> + if (ret > 0) > > >> >> + /* Merge successfully */ > > >> >> + nb_after_gro--; > > >> >> + else if (ret < 0) > > >> >> + unprocess_pkts[unprocess_num++] = pkts[i]; > > >> >> + } else if (IS_IPV4_VXLAN_UDP4_PKT(pkts[i]->packet_type) && > > >> >> + do_vxlan_udp_gro) { > > >> >> + ret = gro_vxlan_udp4_reassemble(pkts[i], > > >> >> + &vxlan_udp_tbl, 0); > > >> >> if (ret > 0) > > >> >> /* Merge successfully */ > > >> >> nb_after_gro--; > > >> >> @@ -236,11 +277,17 @@ struct gro_ctx { > > >> >> || (unprocess_num < nb_pkts)) { > > >> >> i = 0; > > >> >> /* Flush all packets from the tables */ > > >> >> - if (do_vxlan_gro) { > > >> >> - i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tbl, > > >> >> + if (do_vxlan_tcp_gro) { > > >> >> + i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl, > > >> >> 0, pkts, nb_pkts); > > >> >> } > > >> >> > > >> >> + if (do_vxlan_udp_gro) { > > >> >> + i += gro_vxlan_udp4_tbl_timeout_flush(&vxlan_udp_tbl, > > >> >> + 0, &pkts[i], nb_pkts - i); > > >> >> + > > >> >> + } > > >> >> + > > >> >> if (do_tcp4_gro) { > > >> >> i += gro_tcp4_tbl_timeout_flush(&tcp_tbl, 0, > > >> >> &pkts[i], nb_pkts - i); > > >> >> @@ -269,33 +316,42 @@ struct gro_ctx { > > >> >> { > > >> >> struct rte_mbuf *unprocess_pkts[nb_pkts]; > > >> >> struct gro_ctx *gro_ctx = ctx; > > >> >> - void *tcp_tbl, *udp_tbl, *vxlan_tbl; > > >> >> + void *tcp_tbl, *udp_tbl, *vxlan_tcp_tbl, *vxlan_udp_tbl; > > >> >> uint64_t current_time; > > >> >> uint16_t i, unprocess_num = 0; > > >> >> - uint8_t do_tcp4_gro, do_vxlan_gro, do_udp4_gro; > > >> >> + uint8_t do_tcp4_gro, do_vxlan_tcp_gro, do_udp4_gro, do_vxlan_udp_gro; > > >> >> > > >> >> if (unlikely((gro_ctx->gro_types & (RTE_GRO_IPV4_VXLAN_TCP_IPV4 | > > >> >> RTE_GRO_TCP_IPV4 | > > >> >> + RTE_GRO_IPV4_VXLAN_UDP_IPV4 | > > >> >> RTE_GRO_UDP_IPV4)) == 0)) > > >> >> return nb_pkts; > > >> >> > > >> >> tcp_tbl = gro_ctx->tbls[RTE_GRO_TCP_IPV4_INDEX]; > > >> >> - vxlan_tbl = gro_ctx->tbls[RTE_GRO_IPV4_VXLAN_TCP_IPV4_INDEX]; > > >> >> + vxlan_tcp_tbl = gro_ctx->tbls[RTE_GRO_IPV4_VXLAN_TCP_IPV4_INDEX]; > > >> >> udp_tbl = gro_ctx->tbls[RTE_GRO_UDP_IPV4_INDEX]; > > >> >> + vxlan_udp_tbl = gro_ctx->tbls[RTE_GRO_IPV4_VXLAN_UDP_IPV4_INDEX]; > > >> >> > > >> >> do_tcp4_gro = (gro_ctx->gro_types & RTE_GRO_TCP_IPV4) == > > >> >> RTE_GRO_TCP_IPV4; > > >> >> - do_vxlan_gro = (gro_ctx->gro_types & RTE_GRO_IPV4_VXLAN_TCP_IPV4) == > > >> >> + do_vxlan_tcp_gro = (gro_ctx->gro_types & RTE_GRO_IPV4_VXLAN_TCP_IPV4) == > > >> >> RTE_GRO_IPV4_VXLAN_TCP_IPV4; > > >> >> do_udp4_gro = (gro_ctx->gro_types & RTE_GRO_UDP_IPV4) == > > >> >> RTE_GRO_UDP_IPV4; > > >> >> + do_vxlan_udp_gro = (gro_ctx->gro_types & RTE_GRO_IPV4_VXLAN_UDP_IPV4) == > > >> >> + RTE_GRO_IPV4_VXLAN_UDP_IPV4; > > >> >> > > >> >> current_time = rte_rdtsc(); > > >> >> > > >> >> for (i = 0; i < nb_pkts; i++) { > > >> >> if (IS_IPV4_VXLAN_TCP4_PKT(pkts[i]->packet_type) && > > >> >> - do_vxlan_gro) { > > >> >> - if (gro_vxlan_tcp4_reassemble(pkts[i], vxlan_tbl, > > >> >> + do_vxlan_tcp_gro) { > > >> >> + if (gro_vxlan_tcp4_reassemble(pkts[i], vxlan_tcp_tbl, > > >> >> + current_time) < 0) > > >> >> + unprocess_pkts[unprocess_num++] = pkts[i]; > > >> >> + } else if (IS_IPV4_VXLAN_UDP4_PKT(pkts[i]->packet_type) && > > >> >> + do_vxlan_udp_gro) { > > >> >> + if (gro_vxlan_udp4_reassemble(pkts[i], vxlan_udp_tbl, > > >> >> current_time) < 0) > > >> >> unprocess_pkts[unprocess_num++] = pkts[i]; > > >> >> } else if (IS_IPV4_TCP_PKT(pkts[i]->packet_type) && > > >> >> @@ -341,6 +397,13 @@ struct gro_ctx { > > >> >> left_nb_out = max_nb_out - num; > > >> >> } > > >> >> > > >> >> + if ((gro_types & RTE_GRO_IPV4_VXLAN_UDP_IPV4) && max_nb_out > 0) { > > >> >> + num += gro_vxlan_udp4_tbl_timeout_flush(gro_ctx->tbls[ > > >> >> + RTE_GRO_IPV4_VXLAN_UDP_IPV4_INDEX], > > >> >> + flush_timestamp, &out[num], left_nb_out); > > >> >> + left_nb_out = max_nb_out - num; > > >> > > > >> >Don't need to calculate left_nb_out here. In the previous patch, you > > >> >also calculate it for UDP GRO. Please remove it both. > > >> > > >> I think keeping it here isn't bad thing, if someone will add a new GRO type here, he/she won't make a mistake. > > > > > >I don't think so. It's unnecessary for current design. > > >If others want to add new GRO types, they will write > > >correct code. > > > > > >> > > >> > > > >> > > >> > > >> > > >> > > >> > > > > > > > > >