From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 4984FA00C4; Tue, 26 Jul 2022 09:08:35 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 278AA40DDD; Tue, 26 Jul 2022 09:08:35 +0200 (CEST) Received: from mail-vk1-f170.google.com (mail-vk1-f170.google.com [209.85.221.170]) by mails.dpdk.org (Postfix) with ESMTP id C7BDC40695; Tue, 26 Jul 2022 09:08:33 +0200 (CEST) Received: by mail-vk1-f170.google.com with SMTP id z9so5808593vkb.9; Tue, 26 Jul 2022 00:08:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=jvNwCbxwAetTyCdfbxFgIIIFOd//i3CpcwIIx7PGpDI=; b=gWIQb/TlLpUgGX0gLq28hT80tsitR1Kp78u3ElNSOFhR8SKXaQYqeRzzcwYDMjHBUV 4E6aLjYldmtcle5XwlbPUn5SM+YYF8yOwihKq/ik1Rt+ZG3kp7XOYZKVwAOWyYHdjFxj iJdGr5+XJ8tDhLwnmRg6RCuam7NwSd+spt5flW4cpmtT5JLCH9vLQEgYcrvTV5nRsgzM vHIEE8J6DOFzFOw8aQ9Nf2iZWLkScMAAehSHIGABT3e1etD2FhwEJSWsKSLb+ihqTSMu JrUwyppAtPYZ0mVu4j9vuhHOet3zBPm3JF/gnUy5a8wbna3X2t7/S+4v+ShmgeFBGDH8 Dgfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=jvNwCbxwAetTyCdfbxFgIIIFOd//i3CpcwIIx7PGpDI=; b=xVHchrGCKpH7fK7/0E2zDgFi3TnPwgoLPlTd7QsZizIINstWZFlrC6FjWNrZUGtslG aVjXdOS4zwrxoQWrJZLtecKOoUtEGg6TQ/D27NCl+ujZ7zyXQFQaYt8CTrD1fwFqKB7M 9GrYSWFnUvHK2HTQMp22RrUnJ1hjodAJ7CvRT0LrkwTlz3ujCvX8S1tRO99vW4sIoWr8 Vqr3jqzdKTzkWCH7YTA2nA4qG9RKFoHohg1kxaKHZ0xxENpEYRC4T5yQHZe/BlYrHBVZ clZLCZbK+nBO7lbx2BU3Es71kA02hVnQWazSScx2CzqEmvQ9bAL9Ouk+cMnKx+kcFVHF Z8Xw== X-Gm-Message-State: AJIora8BUBq4GFBfEW7d20gzmHpMyzbRJGru8s8uF7Bmj83Y+MXJ/bWM OsQxIYD0G/yON2rFHCkrib1FJ7MvLysc3zUVWIM= X-Google-Smtp-Source: AGRyM1uQ5sAp3J8B2m59yMUVySr9gWjhUGxxu/Rh5JsCJ9C+3nTbmMsA2L8At4Und/1Zi9KU9Y4MV9h1NAEUnwmbVxs= X-Received: by 2002:a1f:c687:0:b0:374:9549:741b with SMTP id w129-20020a1fc687000000b003749549741bmr4597929vkf.16.1658819313025; Tue, 26 Jul 2022 00:08:33 -0700 (PDT) MIME-Version: 1.0 References: <20220726061741.21021-1-jun.qiu@jaguarmicro.com> In-Reply-To: From: kumaraparameshwaran rathinavel Date: Tue, 26 Jul 2022 12:38:21 +0530 Message-ID: Subject: Re: [PATCH] gro: fix gro with tcp push flag To: Jun Qiu Cc: "dev@dpdk.org" , "jiayu.hu@intel.com" , "stable@dpdk.org" Content-Type: multipart/alternative; boundary="000000000000e645b205e4aff7fd" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org --000000000000e645b205e4aff7fd Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable We should do it for the rte_gro_reassemble as well, with timer mode it could lead to more duplicate ACKs. I had a proposal for the enhancement which would handle both rte_gro_reassemble and rte_gro_reassemble_burst but have not got any response yet. I have a custom patch which is working fine for timer mode where there is no packet reordering, earlier without the patch there were DUP-ACKs and this could potentially affect the window scaling. On Tue, Jul 26, 2022 at 12:27 PM Jun Qiu wrote: > May be in rte_gro_reassemble_burst, where no delay is introduced, PUSH > packets can be merged > > > > *=E5=8F=91=E4=BB=B6=E4=BA=BA:* kumaraparameshwaran rathinavel > *=E5=8F=91=E9=80=81=E6=97=B6=E9=97=B4:* 2022=E5=B9=B47=E6=9C=8826=E6=97= =A5 14:41 > *=E6=94=B6=E4=BB=B6=E4=BA=BA:* Jun Qiu > *=E6=8A=84=E9=80=81:* dev@dpdk.org; jiayu.hu@intel.com; stable@dpdk.org > *=E4=B8=BB=E9=A2=98:* Re: [PATCH] gro: fix gro with tcp push flag > > > > > > > > On Tue, Jul 26, 2022 at 11:48 AM Jun Qiu wrote: > > TCP data packets sometimes carry a PUSH flag. Currently, > only the packets that do not have PUSH flag can be GROed. > The packets that have a PUSH flag cannot be GROed, the packets > that cannot be processed by GRO are placed last. > In this case, the received packets may be out of order. > For example, there are two packets mbuf1 and mbuf2. mbuf1 > contains PUSH flag, mbuf2 does not contain PUSH flag. > After GRO processing, mbuf2 is sent for processing before mbuf1. > This out-of-order will affect TCP processing performance and > lead to unnecessary dup-ACK. > > Referring to the Linux kernel implementation, packets with PUSH > flag can also perform GRO. And if one of the packets containing > PUSH flag, the packets after GRO will carry PUSH flag. > > > > In case of smaller transfers in which the TCP segment size is not more > than one MTU, it is a single TCP packet with PSH flag set, so in those > cases we are introducing unwanted delay. I think the better approach > would be if there are previous packets in the flow and the current packet > received has PSH flag then coalesce with the previous packet, if lookup i= s > failure and the current packet has PSH flag set then deliver it > immediately. > > > > Fixes: 0d2cbe59b719 ("lib/gro: support TCP/IPv4") > Cc: stable@dpdk.org > > Signed-off-by: Jun Qiu > --- > lib/gro/gro_tcp4.c | 4 ++-- > lib/gro/gro_tcp4.h | 16 +++++++++++++--- > lib/gro/gro_vxlan_tcp4.c | 4 ++-- > 3 files changed, 17 insertions(+), 7 deletions(-) > > diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c > index 7498c66141..7849a2bd1d 100644 > --- a/lib/gro/gro_tcp4.c > +++ b/lib/gro/gro_tcp4.c > @@ -220,10 +220,10 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt, > hdr_len =3D pkt->l2_len + pkt->l3_len + pkt->l4_len; > > /* > - * Don't process the packet which has FIN, SYN, RST, PSH, URG, EC= E > + * Don't process the packet which has FIN, SYN, RST, URG, ECE > * or CWR set. > */ > - if (tcp_hdr->tcp_flags !=3D RTE_TCP_ACK_FLAG) > + if (tcp_hdr->tcp_flags & (~(RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG))= ) > return -1; > /* > * Don't process the packet whose payload length is less than or > diff --git a/lib/gro/gro_tcp4.h b/lib/gro/gro_tcp4.h > index 212f97a042..2974faf228 100644 > --- a/lib/gro/gro_tcp4.h > +++ b/lib/gro/gro_tcp4.h > @@ -210,7 +210,8 @@ merge_two_tcp4_packets(struct gro_tcp4_item *item, > uint16_t l2_offset) > { > struct rte_mbuf *pkt_head, *pkt_tail, *lastseg; > - uint16_t hdr_len, l2_len; > + struct rte_tcp_hdr *head_tcp_hdr, *tail_tcp_hdr; > + uint16_t hdr_len, l2_len, l3_offset; > > if (cmp > 0) { > pkt_head =3D item->firstseg; > @@ -221,13 +222,22 @@ merge_two_tcp4_packets(struct gro_tcp4_item *item, > } > > /* check if the IPv4 packet length is greater than the max value = */ > - hdr_len =3D l2_offset + pkt_head->l2_len + pkt_head->l3_len + > - pkt_head->l4_len; > + l3_offset =3D l2_offset + pkt_head->l2_len + pkt_head->l3_len; > + hdr_len =3D l3_offset + pkt_head->l4_len; > l2_len =3D l2_offset > 0 ? pkt_head->outer_l2_len : pkt_head->l2_= len; > if (unlikely(pkt_head->pkt_len - l2_len + pkt_tail->pkt_len - > hdr_len > MAX_IPV4_PKT_LENGTH)) > return 0; > > + /* merge push flag to pkt_head */ > + tail_tcp_hdr =3D rte_pktmbuf_mtod_offset(pkt_tail, > + struct rte_tcp_hdr *, l3_offset); > + if (tail_tcp_hdr->tcp_flags & RTE_TCP_PSH_FLAG) { > + head_tcp_hdr =3D rte_pktmbuf_mtod_offset(pkt_head, > + struct rte_tcp_hdr *, l3_offset); > + head_tcp_hdr->tcp_flags |=3D RTE_TCP_PSH_FLAG; > + } > + > /* remove the packet header for the tail packet */ > rte_pktmbuf_adj(pkt_tail, hdr_len); > > diff --git a/lib/gro/gro_vxlan_tcp4.c b/lib/gro/gro_vxlan_tcp4.c > index 3be4deb7c7..884802af0b 100644 > --- a/lib/gro/gro_vxlan_tcp4.c > +++ b/lib/gro/gro_vxlan_tcp4.c > @@ -326,10 +326,10 @@ gro_vxlan_tcp4_reassemble(struct rte_mbuf *pkt, > tcp_hdr =3D (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len= ); > > /* > - * Don't process the packet which has FIN, SYN, RST, PSH, URG, > + * Don't process the packet which has FIN, SYN, RST, URG, > * ECE or CWR set. > */ > - if (tcp_hdr->tcp_flags !=3D RTE_TCP_ACK_FLAG) > + if (tcp_hdr->tcp_flags & (~(RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG))= ) > return -1; > > hdr_len =3D pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len + > -- > 2.25.1 > > --000000000000e645b205e4aff7fd Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
We should do it for the rte_gro_reassemble as well, w= ith timer mode it could lead to more duplicate ACKs. I had a proposal for t= he enhancement which would handle both=C2=A0 rte_gro_reassemble and rte_gro= _reassemble_burst but have not got any response yet.

I have a custom patch which is working fine for timer mode where th= ere is no packet reordering, earlier without the patch there were DUP-ACKs = and this could potentially affect the window scaling.

<= div class=3D"gmail_quote">
On Tue, Jul= 26, 2022 at 12:27 PM Jun Qiu <jun.qiu@jaguarmicro.com> wrote:

May be in rte_gro_reassemble_burst= , where no delay is introduced, PUSH packets can be merged

=C2=A0

=E5=8F=91=E4=BB=B6=E4=BA=BA: kumaraparameshwaran rathinavel <kumaraparamesh92@gmail.com>
=E5= =8F=91=E9=80=81=E6=97=B6=E9=97=B4:= 2022= =E5=B9=B47=E6=9C=8826=E6=97=A5 14:41
=E6=94=B6=E4=BB=B6=E4=BA=BA: Jun Qiu <jun.qiu@jaguarmicro.com>
=E6=8A=84=E9=80=81: dev@dpdk.org;= jiayu.hu@intel.com= ; stable@dpdk.org<= /a>
=E4=B8=BB=E9=A2=98: Re: [PATCH] gro: fix gro with tcp push flag

=C2=A0

=C2=A0

=C2=A0

TC= P data packets sometimes carry a PUSH flag. Currently,
only the packets that do not have PUSH flag can be GROed.
The packets that have a PUSH flag cannot be GROed, the packets
that cannot be processed by GRO are placed last.
In this case, the received packets may be out of order.
For example, there are two packets mbuf1 and mbuf2. mbuf1
contains PUSH flag, mbuf2 does not contain PUSH flag.
After GRO processing, mbuf2 is sent for processing before mbuf1.
This out-of-order will affect TCP processing performance and
lead to unnecessary dup-ACK.

Referring to the Linux kernel implementation, packets with PUSH
flag can also perform GRO. And if one of the packets containing
PUSH flag, the packets after GRO will carry PUSH flag.
=

=C2=A0

In case of smaller transfers in= which the TCP segment size is not more than one MTU, it is a single TCP pa= cket with PSH flag set, so in those cases=C2=A0 we are introducing unwanted= delay.=C2=A0 I think the better approach would be if there are previous packets in the flow and the current packet receiv= ed has PSH flag then coalesce with the previous packet, if lookup is failur= e and the current packet has PSH flag set then deliver it immediately.

=C2=A0

Fi= xes: 0d2cbe59b719 ("lib/gro: support TCP/IPv4")
Cc: stable@dpdk.org

Signed-off-by: Jun Qiu <
jun.qiu@jaguarmicro.com>
---
=C2=A0lib/gro/gro_tcp4.c=C2=A0 =C2=A0 =C2=A0 =C2=A0|=C2=A0 4 ++--
=C2=A0lib/gro/gro_tcp4.h=C2=A0 =C2=A0 =C2=A0 =C2=A0| 16 +++++++++++++--- =C2=A0lib/gro/gro_vxlan_tcp4.c |=C2=A0 4 ++--
=C2=A03 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 7498c66141..7849a2bd1d 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -220,10 +220,10 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 hdr_len =3D pkt->l2_len + pkt->l3_len + p= kt->l4_len;

=C2=A0 =C2=A0 =C2=A0 =C2=A0 /*
-=C2=A0 =C2=A0 =C2=A0 =C2=A0 * Don't process the packet which has FIN, = SYN, RST, PSH, URG, ECE
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 * Don't process the packet which has FIN, = SYN, RST, URG, ECE
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* or CWR set.
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0*/
-=C2=A0 =C2=A0 =C2=A0 =C2=A0if (tcp_hdr->tcp_flags !=3D RTE_TCP_ACK_FLAG= )
+=C2=A0 =C2=A0 =C2=A0 =C2=A0if (tcp_hdr->tcp_flags & (~(RTE_TCP_ACK_= FLAG | RTE_TCP_PSH_FLAG)))
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return -1;
=C2=A0 =C2=A0 =C2=A0 =C2=A0 /*
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* Don't process the packet whose payl= oad length is less than or
diff --git a/lib/gro/gro_tcp4.h b/lib/gro/gro_tcp4.h
index 212f97a042..2974faf228 100644
--- a/lib/gro/gro_tcp4.h
+++ b/lib/gro/gro_tcp4.h
@@ -210,7 +210,8 @@ merge_two_tcp4_packets(struct gro_tcp4_item *item,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 uint16_t l2_offset)=
=C2=A0{
=C2=A0 =C2=A0 =C2=A0 =C2=A0 struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;=
-=C2=A0 =C2=A0 =C2=A0 =C2=A0uint16_t hdr_len, l2_len;
+=C2=A0 =C2=A0 =C2=A0 =C2=A0struct rte_tcp_hdr *head_tcp_hdr, *tail_tcp_hdr= ;
+=C2=A0 =C2=A0 =C2=A0 =C2=A0uint16_t hdr_len, l2_len, l3_offset;

=C2=A0 =C2=A0 =C2=A0 =C2=A0 if (cmp > 0) {
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 pkt_head =3D item-&= gt;firstseg;
@@ -221,13 +222,22 @@ merge_two_tcp4_packets(struct gro_tcp4_item *item, =C2=A0 =C2=A0 =C2=A0 =C2=A0 }

=C2=A0 =C2=A0 =C2=A0 =C2=A0 /* check if the IPv4 packet length is greater t= han the max value */
-=C2=A0 =C2=A0 =C2=A0 =C2=A0hdr_len =3D l2_offset + pkt_head->l2_len + p= kt_head->l3_len +
-=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0pkt_head->l4_len= ;
+=C2=A0 =C2=A0 =C2=A0 =C2=A0l3_offset =3D l2_offset + pkt_head->l2_len += pkt_head->l3_len;
+=C2=A0 =C2=A0 =C2=A0 =C2=A0hdr_len =3D l3_offset + pkt_head->l4_len; =C2=A0 =C2=A0 =C2=A0 =C2=A0 l2_len =3D l2_offset > 0 ? pkt_head->oute= r_l2_len : pkt_head->l2_len;
=C2=A0 =C2=A0 =C2=A0 =C2=A0 if (unlikely(pkt_head->pkt_len - l2_len + pk= t_tail->pkt_len -
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 hdr_len > MAX_IPV4_PKT_LENGTH)) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return 0;

+=C2=A0 =C2=A0 =C2=A0 =C2=A0/* merge push flag to pkt_head */
+=C2=A0 =C2=A0 =C2=A0 =C2=A0tail_tcp_hdr =3D rte_pktmbuf_mtod_offset(pkt_ta= il,
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0struct rte_tcp_hdr *, l3_offset);
+=C2=A0 =C2=A0 =C2=A0 =C2=A0if (tail_tcp_hdr->tcp_flags & RTE_TCP_PS= H_FLAG) {
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0head_tcp_hdr =3D rt= e_pktmbuf_mtod_offset(pkt_head,
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0struct rt= e_tcp_hdr *, l3_offset);
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0head_tcp_hdr->tc= p_flags |=3D RTE_TCP_PSH_FLAG;
+=C2=A0 =C2=A0 =C2=A0 =C2=A0}
+
=C2=A0 =C2=A0 =C2=A0 =C2=A0 /* remove the packet header for the tail packet= */
=C2=A0 =C2=A0 =C2=A0 =C2=A0 rte_pktmbuf_adj(pkt_tail, hdr_len);

diff --git a/lib/gro/gro_vxlan_tcp4.c b/lib/gro/gro_vxlan_tcp4.c
index 3be4deb7c7..884802af0b 100644
--- a/lib/gro/gro_vxlan_tcp4.c
+++ b/lib/gro/gro_vxlan_tcp4.c
@@ -326,10 +326,10 @@ gro_vxlan_tcp4_reassemble(struct rte_mbuf *pkt,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 tcp_hdr =3D (struct rte_tcp_hdr *)((char *)ipv4= _hdr + pkt->l3_len);

=C2=A0 =C2=A0 =C2=A0 =C2=A0 /*
-=C2=A0 =C2=A0 =C2=A0 =C2=A0 * Don't process the packet which has FIN, = SYN, RST, PSH, URG,
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 * Don't process the packet which has FIN, = SYN, RST, URG,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* ECE or CWR set.
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0*/
-=C2=A0 =C2=A0 =C2=A0 =C2=A0if (tcp_hdr->tcp_flags !=3D RTE_TCP_ACK_FLAG= )
+=C2=A0 =C2=A0 =C2=A0 =C2=A0if (tcp_hdr->tcp_flags & (~(RTE_TCP_ACK_= FLAG | RTE_TCP_PSH_FLAG)))
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return -1;

=C2=A0 =C2=A0 =C2=A0 =C2=A0 hdr_len =3D pkt->outer_l2_len + pkt->oute= r_l3_len + pkt->l2_len +
--
2.25.1

--000000000000e645b205e4aff7fd--