From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 04A7337A0 for ; Fri, 30 Jun 2017 14:07:17 +0200 (CEST) Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga105.fm.intel.com with ESMTP; 30 Jun 2017 05:07:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.40,286,1496127600"; d="scan'208";a="280606196" Received: from irsmsx107.ger.corp.intel.com ([163.33.3.99]) by fmsmga004.fm.intel.com with ESMTP; 30 Jun 2017 05:07:16 -0700 Received: from irsmsx109.ger.corp.intel.com ([169.254.13.115]) by IRSMSX107.ger.corp.intel.com ([169.254.10.129]) with mapi id 14.03.0319.002; Fri, 30 Jun 2017 13:07:14 +0100 From: "Ananyev, Konstantin" To: "Hu, Jiayu" CC: "dev@dpdk.org" , "Tan, Jianfeng" , "stephen@networkplumber.org" , "yliu@fridaylinux.org" , "Wu, Jingjing" , "Yao, Lei A" , "Wiles, Keith" , "Bie, Tiwei" Thread-Topic: [PATCH v7 2/3] lib/gro: add TCP/IPv4 GRO support Thread-Index: AQHS7kdlXNHGhRYN9Eipev8pdR3kGKI6kXCQgAB/Z4CAAjYfoA== Date: Fri, 30 Jun 2017 12:07:13 +0000 Message-ID: <2601191342CEEE43887BDE71AB9772583FB18595@IRSMSX109.ger.corp.intel.com> References: <1498229000-94867-1-git-send-email-jiayu.hu@intel.com> <1498459430-116048-1-git-send-email-jiayu.hu@intel.com> <1498459430-116048-3-git-send-email-jiayu.hu@intel.com> <2601191342CEEE43887BDE71AB9772583FB13208@IRSMSX109.ger.corp.intel.com> <20170629022614.GB123467@dpdk15.sh.intel.com> In-Reply-To: <20170629022614.GB123467@dpdk15.sh.intel.com> Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 10.0.102.7 dlp-reaction: no-action x-originating-ip: [163.33.239.181] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v7 2/3] lib/gro: add TCP/IPv4 GRO support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Jun 2017 12:07:18 -0000 Hi Jiayu, > > > + > > > +int32_t > > > +gro_tcp4_reassemble(struct rte_mbuf *pkt, > > > + struct gro_tcp_tbl *tbl, > > > + uint32_t max_packet_size) > > > +{ > > > + struct ether_hdr *eth_hdr; > > > + struct ipv4_hdr *ipv4_hdr; > > > + struct tcp_hdr *tcp_hdr; > > > + uint16_t ipv4_ihl, tcp_hl, tcp_dl; > > > + > > > + struct tcp_key key; > > > + uint32_t cur_idx, prev_idx, item_idx; > > > + uint32_t i, key_idx; > > > + > > > + eth_hdr =3D rte_pktmbuf_mtod(pkt, struct ether_hdr *); > > > + ipv4_hdr =3D (struct ipv4_hdr *)(eth_hdr + 1); > > > + ipv4_ihl =3D IPv4_HDR_LEN(ipv4_hdr); > > > + > > > + /* check if the packet should be processed */ > > > + if (ipv4_ihl < sizeof(struct ipv4_hdr)) > > > + goto fail; > > > + tcp_hdr =3D (struct tcp_hdr *)((char *)ipv4_hdr + ipv4_ihl); > > > + tcp_hl =3D TCP_HDR_LEN(tcp_hdr); > > > + tcp_dl =3D rte_be_to_cpu_16(ipv4_hdr->total_length) - ipv4_ihl > > > + - tcp_hl; > > > + if (tcp_dl =3D=3D 0) > > > + goto fail; > > > + > > > + /* find a key and traverse all packets in its item group */ > > > + key.eth_saddr =3D eth_hdr->s_addr; > > > + key.eth_daddr =3D eth_hdr->d_addr; > > > + key.ip_src_addr[0] =3D rte_be_to_cpu_32(ipv4_hdr->src_addr); > > > + key.ip_dst_addr[0] =3D rte_be_to_cpu_32(ipv4_hdr->dst_addr); > > > > Your key.ip_src_addr[1-3] still contains some junk. > > How memcmp below supposed to worj properly? >=20 > When allocate an item, we already guarantee the content of its > memory space is 0. So memcpy won't be error. key is allocated on the stack. Obviously fileds that are not initialized manually will contain undefined v= alues, i.e.: ip_src-addr[1-3]. Then below you are doing: memcp((&(tbl->keys[i].key), &key, sizeof(struct tcp_key)); ... memcpy(&(tbl->keys[key_idx].key), &key, sizeof(struct tcp_key)); So I still think you are comparing/copying some junk here. >=20 > > BTW why do you need 4 elems here, why just not uint32_t ip_src_addr;? > > Same for ip_dst_addr. >=20 > I think tcp6 and tcp4 can share the same table structure. So I use > 128bit IP address here. You mean we need to use different structures > for tcp4 and tcp6? That would be my preference. >=20 > > > > > + key.src_port =3D rte_be_to_cpu_16(tcp_hdr->src_port); > > > + key.dst_port =3D rte_be_to_cpu_16(tcp_hdr->dst_port); > > > + key.recv_ack =3D rte_be_to_cpu_32(tcp_hdr->recv_ack); > > > + key.tcp_flags =3D tcp_hdr->tcp_flags; > > > + > > > + for (i =3D 0; i < tbl->max_key_num; i++) { > > > + if (tbl->keys[i].is_valid && > > > + (memcmp(&(tbl->keys[i].key), &key, > > > + sizeof(struct tcp_key)) > > > + =3D=3D 0)) { > > > + cur_idx =3D tbl->keys[i].start_index; > > > + prev_idx =3D cur_idx; > > > + while (cur_idx !=3D INVALID_ARRAY_INDEX) { > > > + if (check_seq_option(tbl->items[cur_idx].pkt, > > > + tcp_hdr, > > > + tcp_hl) > 0) { > > > > As I remember linux gro also check ipv4 packet_id - it should be consec= utive. >=20 > IP fragmented packets have the same IP ID, but others are consecutive. Yes, you assume that they are consecutive. But the only way to know for sure that they are - check it. Another thing - I think we need to perform GRO only for TCP packets with on= ly ACK bit set (i.e. - no GRO for FIN/SYN/PUSH/URG/, etc.). > As we suppose GRO can merge IP fragmented packets, so I think we shouldn= 't check if > the IP ID is consecutive here. How do you think? >=20 > > > > > + if (merge_two_tcp4_packets( > > > + tbl->items[cur_idx].pkt, > > > + pkt, > > > + max_packet_size) > 0) { > > > + /* successfully merge two packets */ > > > + tbl->items[cur_idx].is_groed =3D 1; > > > + return 1; > > > + } > > > > If you allow more then packet per flow to be stored in the table, then = you should be > > prepared that new segment can fill a gap between 2 packets. > > Probably the easiest thing - don't allow more then one 'item' per flow. >=20 > We allow the table to store same flow but out-of-order arriving packets. = For > these packets, they will occupy different 'item' and we won't re-merge th= em. > For example, there are three same flow tcp packets: p1, p2 and p3. And p1= arrives > first, then p3, and last is p2. So TCP GRO will allocate one 'item' for p= 1 and one > 'item' for p3, and when p2 arrives, p2 will be merged with p1. Therefore,= in the > table, we will have two 'item': item1 to store merged p1 and p2, item2 to= store p3. >=20 > As you can see, TCP GRO can only merges sequential arriving packets. If w= e want to > merge all out-of-order arriving packets, we need to re-process these pack= ets which > are already processed and have one 'item'. IMO, this procedure will be ve= ry complicated. > So we don't do that. >=20 > Sorry, I don't understand how to allow one 'item' per-flow. Because packe= ts are arriving > out-of-order. If we don't re-process these packets which already have one= 'item', how to > guarantee it? As I understand you'll have an input array: - you wouldn't be able to merge it. So I think your merge need be prepared to merge both smaller and bigger seq= uence. About one item my thought : instead of allowing N items per key(flow) - for= simplicity just allow one item per flow. In that case we wouldn't allow holes, but still would be able to merge reor= dered packets. Alternatively you can store items ordered by seq, and after merge, try to m= erge neighbor Items too. >=20 > > > > > + /** > > > + * fail to merge two packets since > > > + * it's beyond the max packet length. > > > + * Insert it into the item group. > > > + */ > > > + goto insert_to_item_group; > > > + } else { > > > + prev_idx =3D cur_idx; > > > + cur_idx =3D tbl->items[cur_idx].next_pkt_idx; > > > + } > > > + } > > > + /** > > > + * find a corresponding item group but fails to find > > > + * one packet to merge. Insert it into this item group. > > > + */ > > > +insert_to_item_group: > > > + item_idx =3D find_an_empty_item(tbl); > > > + /* the item number is greater than the max value */ > > > + if (item_idx =3D=3D INVALID_ARRAY_INDEX) > > > + return -1; > > > + tbl->items[prev_idx].next_pkt_idx =3D item_idx; > > > + tbl->items[item_idx].pkt =3D pkt; > > > + tbl->items[item_idx].is_groed =3D 0; > > > + tbl->items[item_idx].next_pkt_idx =3D INVALID_ARRAY_INDEX; > > > + tbl->items[item_idx].is_valid =3D 1; > > > + tbl->items[item_idx].start_time =3D rte_rdtsc(); > > > + tbl->item_num++; > > > + return 0; > > > + } > > > + } > > > + > > > + /** > > > + * merge fail as the given packet has a new key. > > > + * So insert a new key. > > > + */ > > > + item_idx =3D find_an_empty_item(tbl); > > > + key_idx =3D find_an_empty_key(tbl); > > > + /** > > > + * if current key or item number is greater than the max > > > + * value, don't insert the packet into the table and return > > > + * immediately. > > > + */ > > > + if (item_idx =3D=3D INVALID_ARRAY_INDEX || > > > + key_idx =3D=3D INVALID_ARRAY_INDEX) > > > + return -1; > > > + tbl->items[item_idx].pkt =3D pkt; > > > + tbl->items[item_idx].next_pkt_idx =3D INVALID_ARRAY_INDEX; > > > + tbl->items[item_idx].is_groed =3D 0; > > > + tbl->items[item_idx].is_valid =3D 1; > > > + tbl->items[item_idx].start_time =3D rte_rdtsc(); > >