DPDK patches and discussions
From: kumaraparameshwaran rathinavel <kumaraparamesh92@gmail.com>
To: "Hu, Jiayu" <jiayu.hu@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	Kumara Parameshwaran <kparameshwar@vmware.com>,
	 "thomas@monjalon.net" <thomas@monjalon.net>
Subject: Re: [PATCH v5] gro : fix reordering of packets in GRO library
Date: Fri, 30 Jun 2023 17:02:55 +0530	[thread overview]
Message-ID: <CANxNyastJxZ-CTNrXMAiFJC-wivKhemmV2uVFdNBZQ8zjhy6=g@mail.gmail.com> (raw)
In-Reply-To: <DS0PR11MB6494520A13C8C1132DA9E6ED925CA@DS0PR11MB6494.namprd11.prod.outlook.com>


On Tue, Jun 20, 2023 at 1:06 PM Hu, Jiayu <jiayu.hu@intel.com> wrote:

> Hi Kumara,
>
> Please see replies inline.
>
> Thanks,
> Jiayu
>
> > -----Original Message-----
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Sent: Tuesday, November 1, 2022 3:06 PM
> > To: Hu, Jiayu <jiayu.hu@intel.com>
> > Cc: dev@dpdk.org; Kumara Parameshwaran
> > <kumaraparamesh92@gmail.com>; Kumara Parameshwaran
> > <kparameshwar@vmware.com>
> > Subject: [PATCH v5] gro : fix reordering of packets in GRO library
> >
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> >
> > When a TCP packet contains flags like PSH it is returned immediately to the
> > application though there might be packets of the same flow in the GRO table.
> > If PSH flag is set on a segment packets up to the segment should be delivered
> > immediately. But the current implementation delivers the last arrived packet
> > with PSH flag set causing re-ordering
> >
> > With this patch, if a packet does not contain only ACK flag and if there are no
> > previous packets for the flow the packet would be returned immediately,
> > else will be merged with the previous segment and the flag on the last
> > segment will be set on the entire segment.
> > This is the behaviour with linux stack as well.
> >
> > Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
> > ---
> > v1:
> >       If the received packet is not a pure ACK packet, we check if
> >       there are any previous packets in the flow, if present we include
> >       the received packet also in the coalescing logic and update the flags
> >       of the last received packet to the entire segment which would avoid
> >       re-ordering.
> >
> >       Lets say a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode,
> >       P1 contains PSH flag and since it does not contain any prior packets in the flow
> >       we copy it to unprocess_packets and P2(ACK) and P3(ACK) are merged together.
> >       In the existing case the P2,P3 would be delivered as single segment first and the
> >       unprocess_packets will be copied later which will cause reordering. With the patch
> >       copy the unprocess packets first and then the packets from the GRO table.
> >
> >       Testing done
> >       The csum test-pmd was modified to support the following
> >       GET request of 10MB from client to server via test-pmd (static arp entries added in client
> >       and server). Enable GRO and TSO in test-pmd where the packets received from the client mac
> >       would be sent to server mac and vice versa.
> >       In above testing, without the patch the client observed re-ordering of 25 packets
> >       and with the patch there was no packet re-ordering observed.
> >
> > v2:
> >       Fix warnings in commit and comment.
> >       Do not consider packet as candidate to merge if it contains SYN/RST
> > flag.
> >
> > v3:
> >       Fix warnings.
> >
> > v4:
> >       Rebase with master.
> >
> > v5:
> >       Adding co-author email
> >
> >  lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
> >  lib/gro/rte_gro.c  | 18 +++++++++---------
> >  2 files changed, 46 insertions(+), 17 deletions(-)
> >
> > diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
> > index 0014096e63..7363c5d540 100644
> > --- a/lib/gro/gro_tcp4.c
> > +++ b/lib/gro/gro_tcp4.c
> > @@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
> >                       pkt->l2_len);
> >  }
> >
> > +static inline void
> > +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
> > +{
> > +     struct rte_ether_hdr *eth_hdr;
> > +     struct rte_ipv4_hdr *ipv4_hdr;
> > +     struct rte_tcp_hdr *merged_tcp_hdr;
> > +
> > +     eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
> > +     ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> > +     merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +     merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
> > +}
>
> The Linux kernel updates the TCP flag via "tcp_flag_word(th2) |= flags &
> (TCP_FLAG_FIN | TCP_FLAG_PSH)",
> which only adds FIN and PSH at most to the merge packet.
>
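For reference, restricting the copied flags to FIN and PSH, as the kernel
expression quoted above does, could look roughly like the sketch below. The
TCP_* constants are local stand-ins for the RTE_TCP_*_FLAG values in
rte_tcp.h, and merge_tcp_flags() is a hypothetical helper, not something in
the patch.

```c
#include <stdint.h>

/* Local stand-ins for RTE_TCP_FIN_FLAG, RTE_TCP_PSH_FLAG, RTE_TCP_ACK_FLAG. */
#define TCP_FIN 0x01
#define TCP_PSH 0x08
#define TCP_ACK 0x10

/*
 * Hypothetical helper: carry only FIN and PSH from the newly merged
 * segment into the flags of the already coalesced packet. All other
 * bits of new_flags are ignored.
 */
static inline uint8_t
merge_tcp_flags(uint8_t merged_flags, uint8_t new_flags)
{
	return merged_flags | (new_flags & (TCP_FIN | TCP_PSH));
}
```

With this, merging a FIN|PSH|ACK segment into an ACK-only packet yields
FIN|PSH|ACK, while a stray URG bit on the new segment is not propagated.
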
> > +
> >  int32_t
> >  gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >               struct gro_tcp4_tbl *tbl,
> > @@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >       uint32_t i, max_flow_num, remaining_flow_num;
> >       int cmp;
> >       uint8_t find;
> > +     uint32_t start_idx;
> >
> >       /*
> >        * Don't process the packet whose TCP header length is greater
> > @@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >       tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> >       hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> >
> > -     /*
> > -      * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
> > -      * or CWR set.
> > -      */
> > -     if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> > -             return -1;
> > -
> >       /* trim the tail padding bytes */
> >       ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
> >       if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
> > @@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >               if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
> >                       if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
> >                               find = 1;
> > +                             start_idx = tbl->flows[i].start_index;
> >                               break;
> >                       }
> >                       remaining_flow_num--;
> >               }
> >       }
> >
> > +     if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
> > +             /*
> > +              * Check and try merging the current TCP segment with the previous
> > +              * TCP segment if the TCP header does not contain RST and SYN flag
> > +              * There are cases where the last segment is sent with FIN|PSH|ACK
> > +              * which should also be considered for merging with previous segments.
> > +              */
> > +             if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
> > +                     /*
> > +                      * Since PSH flag is set, start time will be set to 0 so it will be flushed
> > +                      * immediately.
> > +                      */
> > +                     tbl->items[start_idx].start_time = 0;
> > +             else
> > +                     return -1;
> > +     }
>
> The nested if-else check is not straightforward, and it's hard to read the
> condition-action of
> different combinations of flag bits. In addition, are all flag bits
> considered like Linux kernel?
>
>> In the Linux kernel, packets are flushed even if the ACK numbers differ
>> or if the TCP options differ. In DPDK, if the options differ the packet
>> is inserted as a new item in the table. Is this intended? Should we
>> follow the same approach? The Linux kernel also considers additional
>> flags such as CWR, SYN, RST and URG. I think we can consider them as
>> well, and if one of these flags is set we can flush the entire flow.
>> As you mentioned, the flags can be updated only for FIN and PSH, but
>> what we should make sure of is that packet ordering is maintained on
>> delivery: if a packet with PSH arrives and there are existing packets
>> in the GRO table, we copy the flags and deliver the entire packet in
>> order. This should hold for any packet that has one of these flags set
>> while there are packets of the flow in the GRO table. Please let me
>> know your thoughts.
>>
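To make the condition-action combinations easier to read, the proposal above
could be written as a flat classification like the sketch below. This is
only an illustration under my assumptions: the enum, the function name and
the handling of the remaining bits (for example ECE) are hypothetical and
are not taken from the patch or from the kernel. The TCP_* constants again
stand in for the RTE_TCP_*_FLAG values.

```c
#include <stdint.h>

/* Local stand-ins for the RTE_TCP_*_FLAG values in rte_tcp.h. */
#define TCP_FIN 0x01
#define TCP_SYN 0x02
#define TCP_RST 0x04
#define TCP_PSH 0x08
#define TCP_ACK 0x10
#define TCP_URG 0x20
#define TCP_CWR 0x80

/* Hypothetical actions, named for this sketch only. */
enum gro_action {
	GRO_MERGE,           /* pure ACK: coalesce as usual */
	GRO_MERGE_AND_FLUSH, /* FIN/PSH: coalesce, then flush the flow */
	GRO_FLUSH_NO_MERGE,  /* flush the flow, deliver this packet after it */
	GRO_BYPASS           /* no flow in the table: return the packet as is */
};

static enum gro_action
classify_tcp_flags(uint8_t flags, int flow_found)
{
	if (flags == TCP_ACK)
		return GRO_MERGE;
	if (!flow_found)
		return GRO_BYPASS;
	if (flags & (TCP_SYN | TCP_RST | TCP_URG | TCP_CWR))
		return GRO_FLUSH_NO_MERGE;
	if (flags & (TCP_FIN | TCP_PSH))
		return GRO_MERGE_AND_FLUSH;
	/* Anything else is handled conservatively (an assumption). */
	return GRO_FLUSH_NO_MERGE;
}
```

In every non-bypass case the packets already held for the flow are delivered
no later than the flagged packet, which is the ordering property we need.
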
> > +
> >       /*
> >        * Fail to find a matched flow. Insert a new flow and store the
> >        * packet into the flow.
> > @@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >                               is_atomic);
> >               if (cmp) {
> >                       if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
> > -                                             pkt, cmp, sent_seq, ip_id, 0))
> > +                                             pkt, cmp, sent_seq, ip_id, 0)) {
> > +                             if (tbl->items[cur_idx].start_time == 0)
> > +                                     update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
> >                               return 1;
> > +                     }
> > +
> >                       /*
> >                        * Fail to merge the two packets, as the packet
> >                        * length is greater than the max value. Store
> > diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
> > index e35399fd42..87c5502dce 100644
> > --- a/lib/gro/rte_gro.c
> > +++ b/lib/gro/rte_gro.c
> > @@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> >       if ((nb_after_gro < nb_pkts)
> >                || (unprocess_num < nb_pkts)) {
> >               i = 0;
> > +             /* Copy unprocessed packets */
> > +             if (unprocess_num > 0) {
> > +                     memcpy(&pkts[i], unprocess_pkts,
> > +                                     sizeof(struct rte_mbuf *) *
> > +                                     unprocess_num);
> > +                     i = unprocess_num;
> > +             }
>
> Why copy unprocess pkts first? This is for avoiding out-of-order?
>
>> Yes, this is to avoid out-of-order delivery.
>>
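The ordering argument can be shown in isolation with the sketch below: the
packets that bypassed GRO (P1 with PSH in the v1 example) are written to the
output array first, followed by the packets flushed from the table (the
merged P2+P3). emit_in_order() is a hypothetical stand-alone helper using
plain pointers, not the code in rte_gro_reassemble_burst().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the GRO API: fill out[] with the
 * bypassed packets first, then with the packets flushed from the table.
 * Returns the number of entries written, or 0 if out[] is too small.
 */
static size_t
emit_in_order(void **out, size_t out_cap,
	      void *const *unprocessed, size_t n_unprocessed,
	      void *const *flushed, size_t n_flushed)
{
	if (n_unprocessed + n_flushed > out_cap)
		return 0;
	if (n_unprocessed > 0)
		memcpy(out, unprocessed, n_unprocessed * sizeof(*out));
	if (n_flushed > 0)
		memcpy(out + n_unprocessed, flushed, n_flushed * sizeof(*out));
	return n_unprocessed + n_flushed;
}
```

This ordering is only correct when the bypassed packets precede the table
contents in sequence space, as in the P1, P2, P3 example.
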
>
> Thanks,
> Jiayu
> >               /* Flush all packets from the tables */
> >               if (do_vxlan_tcp_gro) {
> > -                     i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > -                                     0, pkts, nb_pkts);
> > +                     i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > +                                     0, &pkts[i], nb_pkts - i);
> >               }
> >
> >               if (do_vxlan_udp_gro) {
> > @@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> >                       i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
> >                                       &pkts[i], nb_pkts - i);
> >               }
> > -             /* Copy unprocessed packets */
> > -             if (unprocess_num > 0) {
> > -                     memcpy(&pkts[i], unprocess_pkts,
> > -                                     sizeof(struct rte_mbuf *) *
> > -                                     unprocess_num);
> > -             }
> > -             nb_after_gro = i + unprocess_num;
> >       }
> >
> >       return nb_after_gro;
> > --
> > 2.25.1
>
>

