From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 21AAD42C3E; Tue, 6 Jun 2023 11:31:19 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 0C0A040A84; Tue, 6 Jun 2023 11:31:19 +0200 (CEST) Received: from mail-yb1-f179.google.com (mail-yb1-f179.google.com [209.85.219.179]) by mails.dpdk.org (Postfix) with ESMTP id F0A1A40697 for ; Tue, 6 Jun 2023 11:31:16 +0200 (CEST) Received: by mail-yb1-f179.google.com with SMTP id 3f1490d57ef6-bacfc573647so6546953276.1 for ; Tue, 06 Jun 2023 02:31:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1686043876; x=1688635876; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=9mTbGJPTbMMWpDOHhbd0PB4lazuUMRLmNCcHZlHGwrk=; b=TAiTGMke5XCtkAVvun0F//ojugwQzwOLrKpWjrzyRnpRcmr0j3H5d/+AQFcvsnYysA 6McN3LQtTtrsRjo21b81LqSL7R46PwmY/It9IPwdnwYXxubWem4nKMGeSJgUDkWC/r1N ZvDSdCjF2FjAPgA/WOep6oBM8aOB3Wybv9aoHckB9LyKkxVE4S0utgF9mwVPfq3Fgaet VfBAnc2Ba+nMEfXMLYYLgwMp0JXX4CLZ5+oJiUqlx6TxDgTqkCdrJVEpC61NL3mbA6A1 DHotSJdsjbSj8vMa4e8Fpksdntwq+AExgI1qSbnohUwSbLvGa94vrVFKOih0tOmUBLvu lHuw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686043876; x=1688635876; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=9mTbGJPTbMMWpDOHhbd0PB4lazuUMRLmNCcHZlHGwrk=; b=bp9znXfmDNYEOoboxNtqf2+6GwkhNR81wtIk57jBiqjmQh632nAU++dHHD316TldNc gqne6a66miLQEYIckGePSGmjMJqeWK8Af5cJ1mbk/QmE2uep4nwaiU7dny3lb7iOkPqV wVq2emhkBB3s4+SlSeEKPFxkFXjRU9DxczE7k44aKQuzpHDTZSaQiaBBS2eWRzQn3k2p DfMTcf4Ye4glgONLlkx23tvBX73LP2z1/fZZaFzc1me1q+49gu/gbXq10CwrBP9loHh7 USOBege7w4eKMbCNkebrkAY7ZzCd/LPhzy+VTEa5VDRMR5lX/pdSgq91F8jpMvyr+AZ6 r9/Q== X-Gm-Message-State: AC+VfDzXz4pntfQ/GrBx/ha/55Sup7a+j/i3c42RSsQ5siyWi7khsw92 qJmlVz+92YLkm06QzfeXYfjKIr78t1HhacL3RWbndEXigqI= X-Google-Smtp-Source: ACHHUZ5obNmtWje8q8hvV7vCUrAOVzBUQPhASBIuUX7lQJhINjJYGb9pA1tSwGPwzn5EIM9ANXBObxMhBc3Vkf91Scs= X-Received: by 2002:a81:9209:0:b0:567:aea3:75ba with SMTP id j9-20020a819209000000b00567aea375bamr1500978ywg.4.1686043875972; Tue, 06 Jun 2023 02:31:15 -0700 (PDT) MIME-Version: 1.0 References: <20221020181425.48006-1-kumaraparmesh92@gmail.com> <20230602063423.30312-1-kumaraparamesh92@gmail.com> In-Reply-To: From: kumaraparameshwaran rathinavel Date: Tue, 6 Jun 2023 15:01:04 +0530 Message-ID: Subject: Re: [PATCH v3] gro : ipv6 changes to support GRO for TCP/ipv6 To: "Hu, Jiayu" Cc: "dev@dpdk.org" Content-Type: multipart/alternative; boundary="0000000000004d9a0f05fd72ae09" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org --0000000000004d9a0f05fd72ae09 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Jun 6, 2023 at 10:05=E2=80=AFAM Hu, Jiayu wrot= e: > Hi Kumara, > > The v3 patch is not complete and it seems to be a patch based on v2. > In addition, did you test the code for tcp4 and tcp6 after your change? > >> Will make sure that next patch contains the entire diff. >> > For others, please see replies inline. > > Thanks, > Jiayu > > > -----Original Message----- > > From: Kumara Parameshwaran > > Sent: Friday, June 2, 2023 2:34 PM > > To: Hu, Jiayu > > Cc: dev@dpdk.org; Kumara Parameshwaran > > > > Subject: [PATCH v3] gro : ipv6 changes to support GRO for TCP/ipv6 > > > > The patch adds GRO support for TCP/ipv6 packets. This does not include > the > > support for vxlan, udp ipv6 packets. > > > > Signed-off-by: Kumara Parameshwaran > > --- > > v1: > > * Changes to support GRO for TCP/ipv6 packets. This does not > > include > > vxlan changes. > > * The GRO is performed only for ipv6 packets that does not contai= n > > extension headers. > > * The logic for the TCP coalescing remains the same, in ipv6 head= er > > the source address, destination address, flow label, version > fields > > are expected to be the same. > > * Re-organised the code to reuse certain tcp functions for both > ipv4 > > and > > ipv6 flows. > > v2: > > * Fix comments in gro_tcp6.h header file. > > > > v3: > > * Adderess review comments to fix code duplication for v4 and v6 > > > > lib/gro/gro_tcp.c | 160 ++++++++++++++++++++++++ > > lib/gro/gro_tcp.h | 63 ++++++++++ > > lib/gro/gro_tcp4.c | 255 ++++++++++++--------------------------- > > lib/gro/gro_tcp4.h | 18 +-- > > lib/gro/gro_tcp6.c | 243 ++++++++++--------------------------- > > lib/gro/gro_tcp6.h | 31 +++-- > > lib/gro/gro_vxlan_tcp4.c | 18 +-- > > lib/gro/meson.build | 1 + > > 8 files changed, 396 insertions(+), 393 deletions(-) create mode 1006= 44 > > lib/gro/gro_tcp.c > > > > diff --git a/lib/gro/gro_tcp.c b/lib/gro/gro_tcp.c new file mode 100644 > index > > 0000000000..6a5aaada58 > > --- /dev/null > > +++ b/lib/gro/gro_tcp.c > > @@ -0,0 +1,160 @@ > > +/* SPDX-License-Identifier: BSD-3-Clause > > + * Copyright(c) 2017 Intel Corporation > > + */ > > +#include > > +#include > > +#include > > + > > +#include "gro_tcp.h" > > + > > +static inline uint32_t > > +find_an_empty_item(struct gro_tcp_item *items, > > + uint32_t table_size) > > +{ > > + uint32_t i; > > + > > + for (i =3D 0; i < table_size; i++) > > + if (items[i].firstseg =3D=3D NULL) > > + return i; > > + return INVALID_ARRAY_INDEX; > > +} > > + > > +uint32_t > > +insert_new_tcp_item(struct rte_mbuf *pkt, > > + struct gro_tcp_item *items, > > + uint32_t *item_num, > > + uint32_t table_size, > > + uint64_t start_time, > > + uint32_t prev_idx, > > + uint32_t sent_seq, > > + uint16_t ip_id, > > + uint8_t is_atomic) > > +{ > > + uint32_t item_idx; > > + > > + item_idx =3D find_an_empty_item(items, table_size); > > + if (item_idx =3D=3D INVALID_ARRAY_INDEX) > > + return INVALID_ARRAY_INDEX; > > + > > + items[item_idx].firstseg =3D pkt; > > + items[item_idx].lastseg =3D rte_pktmbuf_lastseg(pkt); > > + items[item_idx].start_time =3D start_time; > > + items[item_idx].next_pkt_idx =3D INVALID_ARRAY_INDEX; > > + items[item_idx].sent_seq =3D sent_seq; > > + items[item_idx].ip_id =3D ip_id; > > + items[item_idx].nb_merged =3D 1; > > + items[item_idx].is_atomic =3D is_atomic; > > + (*item_num) +=3D 1; > > + > > + /* if the previous packet exists, chain them together. */ > > + if (prev_idx !=3D INVALID_ARRAY_INDEX) { > > + items[item_idx].next_pkt_idx =3D > > + items[prev_idx].next_pkt_idx; > > + items[prev_idx].next_pkt_idx =3D item_idx; > > + } > > + > > + return item_idx; > > +} > > + > > +uint32_t > > +delete_tcp_item(struct gro_tcp_item *items, uint32_t item_idx, > > + uint32_t *item_num, > > + uint32_t prev_item_idx) > > +{ > > + uint32_t next_idx =3D items[item_idx].next_pkt_idx; > > + > > + /* NULL indicates an empty item */ > > + items[item_idx].firstseg =3D NULL; > > + (*item_num) -=3D 1; > > + if (prev_item_idx !=3D INVALID_ARRAY_INDEX) > > + items[prev_item_idx].next_pkt_idx =3D next_idx; > > + > > + return next_idx; > > +} > > + > > +int32_t > > +gro_tcp_reassemble(struct rte_mbuf *pkt, > > + void *tbl, > > + void *key, > > + int32_t tcp_dl, > > + struct gro_tcp_flow_ops *ops, > > + struct gro_tcp_item *items, > > + uint32_t *item_num, > > + uint32_t table_size, > > + uint16_t ip_id, > > + uint8_t is_atomic, > > + uint64_t start_time) > > In general, TCP4 and TCP6 share struct gro_tcp_item and have private flow > structures, > i.e., struct gro_tcp4/6_flow, and I like this abstraction. IMO, the code > processing > struct gro_tcp_item should be implemented as common functions shared by > gro_tcp4.c and gro_tcp6.c. The code processing struct gro_tcp4/6_flow is > tcp4 and > tcp6 dependent and no need to abstract and share. > > In gro_tcp_reassemble(), it uses callback functions defined in struct > gro_tcp_flow_ops > to provide the different operations on struct gro_tcp4/6_flow. I don't > think it's necessary > for abstraction purpose as gro_tcp4/6_flows_ops implementations are alike > and it also > introduces extra cost on function calls. > >> Sure, would make the changes accordingly. >> > > The gro_tcp_reassemble() has two parts: the common part to process struct > gro_tcp_item > and the private part to process struct gro_tcp4/6_flow. I think a better > way is to remove > gro_tcp_reassemble() and struct gro_tcp_flow_ops, and implement the commo= n > part as > an internal function in gro_tcp.c/gro_tcp.h, and make both > gro_tcp4/6_reassemble() > implement own private part and invoke the common function to process > struct gro_tcp_item. > With this change, there is no callback cost anymore and tcp4/6 can share > the common function. > >> Sure, got it. >> >> > +{ > > + uint32_t item_idx; > > + uint32_t cur_idx; > > + uint32_t prev_idx; > > + struct rte_tcp_hdr *tcp_hdr; > > + int cmp; > > + uint32_t sent_seq; > > + > > + tcp_hdr =3D rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *, pk= t- > > >l2_len + pkt->l3_len); > > + /* > > + * Don't process the packet which has FIN, SYN, RST, PSH, URG, EC= E > > + * or CWR set. > > + */ > > + if (tcp_hdr->tcp_flags !=3D RTE_TCP_ACK_FLAG) > > + return -1; > > + sent_seq =3D rte_be_to_cpu_32(tcp_hdr->sent_seq); > > + > > + ops->tcp_flow_key_init(key, tcp_hdr); > > + > > + item_idx =3D ops->tcp_flow_lookup(tbl, key); > > + if (item_idx =3D=3D INVALID_ARRAY_INDEX) { > > + item_idx =3D insert_new_tcp_item(pkt, items, item_num, > > table_size, start_time, > > + > > INVALID_ARRAY_INDEX, sent_seq, ip_id, > > + is_atomic); > > + if (item_idx =3D=3D INVALID_ARRAY_INDEX) > > + return -1; > > + if (ops->tcp_flow_insert(tbl, key, item_idx) =3D=3D > > + INVALID_ARRAY_INDEX) { > > + /* > > + * Fail to insert a new flow, so delete the > > + * stored packet. > > + */ > > + delete_tcp_item(items, item_idx, item_num, > > INVALID_ARRAY_INDEX); > > + return -1; > > + } > > + return 0; > > + } > > + /* > > + * Check all packets in the flow and try to find a neighbor for > > + * the input packet. > > + */ > > + cur_idx =3D item_idx; > > + prev_idx =3D cur_idx; > > + do { > > + cmp =3D check_seq_option(&items[cur_idx], tcp_hdr, > > + sent_seq, ip_id, pkt->l4_len, tcp_dl, 0, > > + is_atomic); > > + if (cmp) { > > + if (merge_two_tcp_packets(&items[cur_idx], > > + pkt, cmp, sent_seq, ip_id= , > 0)) > > + return 1; > > + /* > > + * Fail to merge the two packets, as the packet > > + * length is greater than the max value. Store > > + * the packet into the flow. > > + */ > > + if (insert_new_tcp_item(pkt, items, item_num, > > table_size, start_time, cur_idx, > > + sent_seq, ip_id, > is_atomic) =3D=3D > > + INVALID_ARRAY_INDEX) > > + return -1; > > + return 0; > > + } > > + prev_idx =3D cur_idx; > > + cur_idx =3D items[cur_idx].next_pkt_idx; > > + } while (cur_idx !=3D INVALID_ARRAY_INDEX); > > + > > + /* Fail to find a neighbor, so store the packet into the flow. */ > > + if (insert_new_tcp_item(pkt, items, item_num, table_size, > start_time, > > prev_idx, sent_seq, > > + ip_id, is_atomic) =3D=3D INVALID_ARRAY_IN= DEX) > > + return -1; > > + > > + return 0; > > + > > +} > > diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h index > > c5d248a022..202f485c18 100644 > > --- a/lib/gro/gro_tcp.h > > +++ b/lib/gro/gro_tcp.h > > @@ -1,6 +1,8 @@ > > #ifndef _GRO_TCP_H_ > > #define _GRO_TCP_H_ > > > > +#define INVALID_ARRAY_INDEX 0xffffffffUL > > + > > #include > > > > /* > > @@ -14,6 +16,31 @@ > > #define INVALID_TCP_HDRLEN(len) \ > > (((len) < sizeof(struct rte_tcp_hdr)) || ((len) > MAX_TCP_HLEN)) > > > > +struct gro_tcp_flow { > > + struct rte_ether_addr eth_saddr; > > + struct rte_ether_addr eth_daddr; > > + uint32_t recv_ack; > > + uint16_t src_port; > > + uint16_t dst_port; > > +}; > > + > > +#define ASSIGN_TCP_FLOW_KEY(k1, k2) \ > > + rte_ether_addr_copy(&(k1->eth_saddr), &(k2->eth_saddr)); \ > > + rte_ether_addr_copy(&(k1->eth_daddr), &(k2->eth_daddr)); \ > > + k2->recv_ack =3D k1->recv_ack; \ > > + k2->src_port =3D k1->src_port; \ > > + k2->dst_port =3D k1->dst_port; > > For multiline macro, it's better to use do{...}while(0) to avoid > unexpected errors > in the future. > > > + > > +typedef uint32_t (*gro_tcp_flow_lookup)(void *table, void *key); > > +typedef uint32_t (*gro_tcp_flow_insert)(void *table, void *key, > > +uint32_t item_idx); typedef void (*gro_tcp_flow_key_init)(void *key, > > +struct rte_tcp_hdr *tcp_hdr); > > + > > +struct gro_tcp_flow_ops { > > + gro_tcp_flow_lookup tcp_flow_lookup; > > + gro_tcp_flow_insert tcp_flow_insert; > > + gro_tcp_flow_key_init tcp_flow_key_init; }; > > + > > struct gro_tcp_item { > > /* > > * The first MBUF segment of the packet. If the value @@ -44,6 > > +71,36 @@ struct gro_tcp_item { > > uint8_t is_atomic; > > }; > > > > +uint32_t > > +insert_new_tcp_item(struct rte_mbuf *pkt, > > + struct gro_tcp_item *items, > > + uint32_t *item_num, > > + uint32_t table_size, > > + uint64_t start_time, > > + uint32_t prev_idx, > > + uint32_t sent_seq, > > + uint16_t ip_id, > > + uint8_t is_atomic); > > + > > +uint32_t > > +delete_tcp_item(struct gro_tcp_item *items, > > + uint32_t item_idx, > > + uint32_t *item_num, > > + uint32_t prev_item_idx); > > + > > +int32_t > > +gro_tcp_reassemble(struct rte_mbuf *pkt, > > + void *tbl, > > + void *key, > > + int32_t tcp_dl, > > + struct gro_tcp_flow_ops *ops, > > + struct gro_tcp_item *items, > > + uint32_t *item_num, > > + uint32_t table_size, > > + uint16_t ip_id, > > + uint8_t is_atomic, > > + uint64_t start_time); > > + > > /* > > * Merge two TCP packets without updating checksums. > > * If cmp is larger than 0, append the new packet to the @@ -152,4 > +209,10 > > @@ check_seq_option(struct gro_tcp_item *item, > > return 0; > > } > > > > +static inline int > > +is_same_tcp_flow(struct gro_tcp_flow *k1, struct gro_tcp_flow *k2) { > > + return (!memcmp(k1, k2, sizeof(struct gro_tcp_flow))); } > > + > > #endif > > --0000000000004d9a0f05fd72ae09 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Tue, Jun 6, 2023 at 10:05=E2=80=AF= AM Hu, Jiayu <jiayu.hu@intel.com> wrote:
Hi= Kumara,

The v3 patch is not complete and it seems to be a patch based on v2.
In addition, did you test the code for tcp4 and tcp6 after your change?
=
Will make sure that next = patch contains the entire diff.
For others, please see replies inline.

Thanks,
Jiayu

> -----Original Message-----
> From: Kumara Parameshwaran <
kumaraparamesh92@gmail.com>
> Sent: Friday, June 2, 2023 2:34 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org= ; Kumara Parameshwaran
> <ku= maraparamesh92@gmail.com>
> Subject: [PATCH v3] gro : ipv6 changes to support GRO for TCP/ipv6
>
> The patch adds GRO support for TCP/ipv6 packets. This does not include= the
> support for vxlan, udp ipv6 packets.
>
> Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> ---
> v1:
>=C2=A0 =C2=A0 =C2=A0 =C2=A0* Changes to support GRO for TCP/ipv6 packet= s. This does not
> include
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0vxlan changes.
>=C2=A0 =C2=A0 =C2=A0 =C2=A0* The GRO is performed only for ipv6 packets= that does not contain
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 extension headers.
>=C2=A0 =C2=A0 =C2=A0 =C2=A0* The logic for the TCP coalescing remains t= he same, in ipv6 header
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0the source address, destination addre= ss, flow label, version fields
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0are expected to be the same.
>=C2=A0 =C2=A0 =C2=A0 =C2=A0* Re-organised the code to reuse certain tcp= functions for both ipv4
> and
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0ipv6 flows.
> v2:
>=C2=A0 =C2=A0 =C2=A0 =C2=A0* Fix comments in gro_tcp6.h header file. >
> v3:
>=C2=A0 =C2=A0 =C2=A0 =C2=A0* Adderess review comments to fix code dupli= cation for v4 and v6
>
>=C2=A0 lib/gro/gro_tcp.c=C2=A0 =C2=A0 =C2=A0 =C2=A0 | 160 +++++++++++++= +++++++++++
>=C2=A0 lib/gro/gro_tcp.h=C2=A0 =C2=A0 =C2=A0 =C2=A0 |=C2=A0 63 ++++++++= ++
>=C2=A0 lib/gro/gro_tcp4.c=C2=A0 =C2=A0 =C2=A0 =C2=A0| 255 ++++++++++++-= --------------------------
>=C2=A0 lib/gro/gro_tcp4.h=C2=A0 =C2=A0 =C2=A0 =C2=A0|=C2=A0 18 +--
>=C2=A0 lib/gro/gro_tcp6.c=C2=A0 =C2=A0 =C2=A0 =C2=A0| 243 ++++++++++---= ------------------------
>=C2=A0 lib/gro/gro_tcp6.h=C2=A0 =C2=A0 =C2=A0 =C2=A0|=C2=A0 31 +++-- >=C2=A0 lib/gro/gro_vxlan_tcp4.c |=C2=A0 18 +--
>=C2=A0 lib/gro/meson.build=C2=A0 =C2=A0 =C2=A0 |=C2=A0 =C2=A01 +
>=C2=A0 8 files changed, 396 insertions(+), 393 deletions(-)=C2=A0 creat= e mode 100644
> lib/gro/gro_tcp.c
>
> diff --git a/lib/gro/gro_tcp.c b/lib/gro/gro_tcp.c new file mode 10064= 4 index
> 0000000000..6a5aaada58
> --- /dev/null
> +++ b/lib/gro/gro_tcp.c
> @@ -0,0 +1,160 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017 Intel Corporation
> + */
> +#include <rte_malloc.h>
> +#include <rte_mbuf.h>
> +#include <rte_ethdev.h>
> +
> +#include "gro_tcp.h"
> +
> +static inline uint32_t
> +find_an_empty_item(struct gro_tcp_item *items,
> +=C2=A0 =C2=A0 =C2=A0uint32_t table_size)
> +{
> +=C2=A0 =C2=A0 =C2=A0uint32_t i;
> +
> +=C2=A0 =C2=A0 =C2=A0for (i =3D 0; i < table_size; i++)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (items[i].firstseg= =3D=3D NULL)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0return i;
> +=C2=A0 =C2=A0 =C2=A0return INVALID_ARRAY_INDEX;
> +}
> +
> +uint32_t
> +insert_new_tcp_item(struct rte_mbuf *pkt,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0struct gro_tcp_item *= items,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t *item_num, > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t table_size,<= br> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint64_t start_time,<= br> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t prev_idx, > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t sent_seq, > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint16_t ip_id,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint8_t is_atomic) > +{
> +=C2=A0 =C2=A0 =C2=A0uint32_t item_idx;
> +
> +=C2=A0 =C2=A0 =C2=A0item_idx =3D find_an_empty_item(items, table_size= );
> +=C2=A0 =C2=A0 =C2=A0if (item_idx =3D=3D INVALID_ARRAY_INDEX)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return INVALID_ARRAY_= INDEX;
> +
> +=C2=A0 =C2=A0 =C2=A0items[item_idx].firstseg =3D pkt;
> +=C2=A0 =C2=A0 =C2=A0items[item_idx].lastseg =3D rte_pktmbuf_lastseg(p= kt);
> +=C2=A0 =C2=A0 =C2=A0items[item_idx].start_time =3D start_time;
> +=C2=A0 =C2=A0 =C2=A0items[item_idx].next_pkt_idx =3D INVALID_ARRAY_IN= DEX;
> +=C2=A0 =C2=A0 =C2=A0items[item_idx].sent_seq =3D sent_seq;
> +=C2=A0 =C2=A0 =C2=A0items[item_idx].ip_id =3D ip_id;
> +=C2=A0 =C2=A0 =C2=A0items[item_idx].nb_merged =3D 1;
> +=C2=A0 =C2=A0 =C2=A0items[item_idx].is_atomic =3D is_atomic;
> +=C2=A0 =C2=A0 =C2=A0(*item_num) +=3D 1;
> +
> +=C2=A0 =C2=A0 =C2=A0/* if the previous packet exists, chain them toge= ther. */
> +=C2=A0 =C2=A0 =C2=A0if (prev_idx !=3D INVALID_ARRAY_INDEX) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0items[item_idx].next_= pkt_idx =3D
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0items[prev_idx].next_pkt_idx;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0items[prev_idx].next_= pkt_idx =3D item_idx;
> +=C2=A0 =C2=A0 =C2=A0}
> +
> +=C2=A0 =C2=A0 =C2=A0return item_idx;
> +}
> +
> +uint32_t
> +delete_tcp_item(struct gro_tcp_item *items, uint32_t item_idx,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t *item_num, > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t prev_item_id= x)
> +{
> +=C2=A0 =C2=A0 =C2=A0uint32_t next_idx =3D items[item_idx].next_pkt_id= x;
> +
> +=C2=A0 =C2=A0 =C2=A0/* NULL indicates an empty item */
> +=C2=A0 =C2=A0 =C2=A0items[item_idx].firstseg =3D NULL;
> +=C2=A0 =C2=A0 =C2=A0(*item_num) -=3D 1;
> +=C2=A0 =C2=A0 =C2=A0if (prev_item_idx !=3D INVALID_ARRAY_INDEX)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0items[prev_item_idx].= next_pkt_idx =3D next_idx;
> +
> +=C2=A0 =C2=A0 =C2=A0return next_idx;
> +}
> +
> +int32_t
> +gro_tcp_reassemble(struct rte_mbuf *pkt,
> +=C2=A0 =C2=A0 =C2=A0void *tbl,
> +=C2=A0 =C2=A0 =C2=A0void *key,
> +=C2=A0 =C2=A0 =C2=A0int32_t tcp_dl,
> +=C2=A0 =C2=A0 =C2=A0struct gro_tcp_flow_ops *ops,
> +=C2=A0 =C2=A0 =C2=A0struct gro_tcp_item *items,
> +=C2=A0 =C2=A0 =C2=A0uint32_t *item_num,
> +=C2=A0 =C2=A0 =C2=A0uint32_t table_size,
> +=C2=A0 =C2=A0 =C2=A0uint16_t ip_id,
> +=C2=A0 =C2=A0 =C2=A0uint8_t is_atomic,
> +=C2=A0 =C2=A0 =C2=A0uint64_t start_time)

In general, TCP4 and TCP6 share struct gro_tcp_item and have private flow s= tructures,
i.e., struct gro_tcp4/6_flow, and I like this abstraction. IMO, the code pr= ocessing
struct gro_tcp_item should be implemented as common functions shared by
gro_tcp4.c and gro_tcp6.c. The code processing struct gro_tcp4/6_flow is tc= p4 and
tcp6 dependent and no need to abstract and share.

In gro_tcp_reassemble(), it uses callback functions defined in struct gro_t= cp_flow_ops
to provide the different operations on struct gro_tcp4/6_flow. I don't = think it's necessary
for abstraction purpose as gro_tcp4/6_flows_ops implementations are alike a= nd it also
introduces extra cost on function calls.
Sure, would make the changes accordingly.=C2=A0
=C2=A0
The gro_tcp_reassemble() has two parts: the common part to process struct g= ro_tcp_item
and the private part to process struct gro_tcp4/6_flow. I think a better wa= y is to remove
gro_tcp_reassemble() and struct gro_tcp_flow_ops, and implement the common = part as
an internal function in gro_tcp.c/gro_tcp.h, and make both gro_tcp4/6_reass= emble()
implement own private part and invoke the common function to process struct= gro_tcp_item.
With this change, there is no callback cost anymore and tcp4/6 can share th= e common function.
Sur= e, got it.

> +{
> +=C2=A0 =C2=A0 =C2=A0uint32_t item_idx;
> +=C2=A0 =C2=A0 =C2=A0uint32_t cur_idx;
> +=C2=A0 =C2=A0 =C2=A0uint32_t prev_idx;
> +=C2=A0 =C2=A0 =C2=A0struct rte_tcp_hdr *tcp_hdr;
> +=C2=A0 =C2=A0 =C2=A0int cmp;
> +=C2=A0 =C2=A0 =C2=A0uint32_t sent_seq;
> +
> +=C2=A0 =C2=A0 =C2=A0tcp_hdr =3D rte_pktmbuf_mtod_offset(pkt, struct r= te_tcp_hdr *, pkt-
> >l2_len + pkt->l3_len);
> +=C2=A0 =C2=A0 =C2=A0/*
> +=C2=A0 =C2=A0 =C2=A0 * Don't process the packet which has FIN, SY= N, RST, PSH, URG, ECE
> +=C2=A0 =C2=A0 =C2=A0 * or CWR set.
> +=C2=A0 =C2=A0 =C2=A0 */
> +=C2=A0 =C2=A0 =C2=A0if (tcp_hdr->tcp_flags !=3D RTE_TCP_ACK_FLAG)<= br> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return -1;
> +=C2=A0 =C2=A0 =C2=A0sent_seq =3D rte_be_to_cpu_32(tcp_hdr->sent_se= q);
> +
> +=C2=A0 =C2=A0 =C2=A0ops->tcp_flow_key_init(key, tcp_hdr);
> +
> +=C2=A0 =C2=A0 =C2=A0item_idx =3D ops->tcp_flow_lookup(tbl, key); > +=C2=A0 =C2=A0 =C2=A0if (item_idx =3D=3D INVALID_ARRAY_INDEX) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0item_idx =3D insert_n= ew_tcp_item(pkt, items, item_num,
> table_size, start_time,
> +
>=C2=A0 =C2=A0 =C2=A0 =C2=A0INVALID_ARRAY_INDEX, sent_seq, ip_id,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0is_atomic);
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (item_idx =3D=3D I= NVALID_ARRAY_INDEX)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0return -1;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (ops->tcp_flow_= insert(tbl, key, item_idx) =3D=3D
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0INVALID_ARRAY_INDEX) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0/*
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 * Fail to insert a new flow, so delete the
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 * stored packet.
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 */
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0delete_tcp_item(items, item_idx, item_num,
> INVALID_ARRAY_INDEX);
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0return -1;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return 0;
> +=C2=A0 =C2=A0 =C2=A0}
> +=C2=A0 =C2=A0 =C2=A0/*
> +=C2=A0 =C2=A0 =C2=A0 * Check all packets in the flow and try to find = a neighbor for
> +=C2=A0 =C2=A0 =C2=A0 * the input packet.
> +=C2=A0 =C2=A0 =C2=A0 */
> +=C2=A0 =C2=A0 =C2=A0cur_idx =3D item_idx;
> +=C2=A0 =C2=A0 =C2=A0prev_idx =3D cur_idx;
> +=C2=A0 =C2=A0 =C2=A0do {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0cmp =3D check_seq_opt= ion(&items[cur_idx], tcp_hdr,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0sent_seq, ip_id, pkt->l4_len, tcp_dl,= 0,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0is_atomic);
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (cmp) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0if (merge_two_tcp_packets(&items[cur_idx],
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0pkt, cmp, sent_seq, ip_id, 0))
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return 1;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0/*
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 * Fail to merge the two packets, as the packet
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 * length is greater than the max value. Store
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 * the packet into the flow.
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 */
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0if (insert_new_tcp_item(pkt, items, item_num,
> table_size, start_time, cur_idx,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0sent_seq, ip_id, is_atomic) =3D=3D
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0INVALID_ARRA= Y_INDEX)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return -1;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0return 0;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0prev_idx =3D cur_idx;=
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0cur_idx =3D items[cur= _idx].next_pkt_idx;
> +=C2=A0 =C2=A0 =C2=A0} while (cur_idx !=3D INVALID_ARRAY_INDEX);
> +
> +=C2=A0 =C2=A0 =C2=A0/* Fail to find a neighbor, so store the packet i= nto the flow. */
> +=C2=A0 =C2=A0 =C2=A0if (insert_new_tcp_item(pkt, items, item_num, tab= le_size, start_time,
> prev_idx, sent_seq,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0ip_id, is_atomic) =3D=3D INVALID_ARRAY_I= NDEX)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return -1;
> +
> +=C2=A0 =C2=A0 =C2=A0return 0;
> +
> +}
> diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h index
> c5d248a022..202f485c18 100644
> --- a/lib/gro/gro_tcp.h
> +++ b/lib/gro/gro_tcp.h
> @@ -1,6 +1,8 @@
>=C2=A0 #ifndef _GRO_TCP_H_
>=C2=A0 #define _GRO_TCP_H_
>
> +#define INVALID_ARRAY_INDEX 0xffffffffUL
> +
>=C2=A0 #include <rte_tcp.h>
>
>=C2=A0 /*
> @@ -14,6 +16,31 @@
>=C2=A0 #define INVALID_TCP_HDRLEN(len) \
>=C2=A0 =C2=A0 =C2=A0 =C2=A0(((len) < sizeof(struct rte_tcp_hdr)) || = ((len) > MAX_TCP_HLEN))
>
> +struct gro_tcp_flow {
> +=C2=A0 =C2=A0 =C2=A0struct rte_ether_addr eth_saddr;
> +=C2=A0 =C2=A0 =C2=A0struct rte_ether_addr eth_daddr;
> +=C2=A0 =C2=A0 =C2=A0uint32_t recv_ack;
> +=C2=A0 =C2=A0 =C2=A0uint16_t src_port;
> +=C2=A0 =C2=A0 =C2=A0uint16_t dst_port;
> +};
> +
> +#define ASSIGN_TCP_FLOW_KEY(k1, k2) \
> +=C2=A0 =C2=A0 =C2=A0rte_ether_addr_copy(&(k1->eth_saddr), &= ;(k2->eth_saddr)); \
> +=C2=A0 =C2=A0 =C2=A0rte_ether_addr_copy(&(k1->eth_daddr), &= ;(k2->eth_daddr)); \
> +=C2=A0 =C2=A0 =C2=A0k2->recv_ack =3D k1->recv_ack; \
> +=C2=A0 =C2=A0 =C2=A0k2->src_port =3D k1->src_port; \
> +=C2=A0 =C2=A0 =C2=A0k2->dst_port =3D k1->dst_port;

For multiline macro, it's better to use do{...}while(0) to avoid unexpe= cted errors
in the future.

> +
> +typedef uint32_t (*gro_tcp_flow_lookup)(void *table, void *key);
> +typedef uint32_t (*gro_tcp_flow_insert)(void *table, void *key,
> +uint32_t item_idx); typedef void (*gro_tcp_flow_key_init)(void *key,<= br> > +struct rte_tcp_hdr *tcp_hdr);
> +
> +struct gro_tcp_flow_ops {
> +=C2=A0 =C2=A0 =C2=A0gro_tcp_flow_lookup tcp_flow_lookup;
> +=C2=A0 =C2=A0 =C2=A0gro_tcp_flow_insert tcp_flow_insert;
> +=C2=A0 =C2=A0 =C2=A0gro_tcp_flow_key_init tcp_flow_key_init; };
> +
>=C2=A0 struct gro_tcp_item {
>=C2=A0 =C2=A0 =C2=A0 =C2=A0/*
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 * The first MBUF segment of the packet. If = the value @@ -44,6
> +71,36 @@ struct gro_tcp_item {
>=C2=A0 =C2=A0 =C2=A0 =C2=A0uint8_t is_atomic;
>=C2=A0 };
>
> +uint32_t
> +insert_new_tcp_item(struct rte_mbuf *pkt,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0struct gro_tcp_item *= items,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t *item_num, > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t table_size,<= br> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint64_t start_time,<= br> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t prev_idx, > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t sent_seq, > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint16_t ip_id,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint8_t is_atomic); > +
> +uint32_t
> +delete_tcp_item(struct gro_tcp_item *items,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t item_idx, > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t *item_num, > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uint32_t prev_item_id= x);
> +
> +int32_t
> +gro_tcp_reassemble(struct rte_mbuf *pkt,
> +=C2=A0 =C2=A0 =C2=A0void *tbl,
> +=C2=A0 =C2=A0 =C2=A0void *key,
> +=C2=A0 =C2=A0 =C2=A0int32_t tcp_dl,
> +=C2=A0 =C2=A0 =C2=A0struct gro_tcp_flow_ops *ops,
> +=C2=A0 =C2=A0 =C2=A0struct gro_tcp_item *items,
> +=C2=A0 =C2=A0 =C2=A0uint32_t *item_num,
> +=C2=A0 =C2=A0 =C2=A0uint32_t table_size,
> +=C2=A0 =C2=A0 =C2=A0uint16_t ip_id,
> +=C2=A0 =C2=A0 =C2=A0uint8_t is_atomic,
> +=C2=A0 =C2=A0 =C2=A0uint64_t start_time);
> +
>=C2=A0 /*
>=C2=A0 =C2=A0* Merge two TCP packets without updating checksums.
>=C2=A0 =C2=A0* If cmp is larger than 0, append the new packet to the @@= -152,4 +209,10
> @@ check_seq_option(struct gro_tcp_item *item,
>=C2=A0 =C2=A0 =C2=A0 =C2=A0return 0;
>=C2=A0 }
>
> +static inline int
> +is_same_tcp_flow(struct gro_tcp_flow *k1, struct gro_tcp_flow *k2) {<= br> > +=C2=A0 =C2=A0 =C2=A0return (!memcmp(k1, k2, sizeof(struct gro_tcp_flo= w))); }
> +
>=C2=A0 #endif

--0000000000004d9a0f05fd72ae09--