From: "Hu, Jiayu" <jiayu.hu@intel.com>
To: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
Kumara Parameshwaran <kparameshwar@vmware.com>,
"thomas@monjalon.net" <thomas@monjalon.net>
Subject: RE: [PATCH v5] gro : fix reordering of packets in GRO library
Date: Tue, 20 Jun 2023 07:35:49 +0000
Message-ID: <DS0PR11MB6494520A13C8C1132DA9E6ED925CA@DS0PR11MB6494.namprd11.prod.outlook.com>
In-Reply-To: <20221101070557.58808-1-kumaraparmesh92@gmail.com>
Hi Kumara,
Please see replies inline.
Thanks,
Jiayu
> -----Original Message-----
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> Sent: Tuesday, November 1, 2022 3:06 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Kumara Parameshwaran
> <kumaraparamesh92@gmail.com>; Kumara Parameshwaran
> <kparameshwar@vmware.com>
> Subject: [PATCH v5] gro : fix reordering of packets in GRO library
>
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
>
> When a TCP packet contains flags like PSH, it is returned immediately to the
> application even though there might be packets of the same flow in the GRO table.
> If the PSH flag is set on a segment, packets up to that segment should be delivered
> immediately. But the current implementation delivers the last arrived packet
> with the PSH flag set, causing re-ordering.
>
> With this patch, if a packet has flags other than just ACK and there are no
> previous packets for the flow, the packet is returned immediately;
> otherwise it is merged with the previous segment and the flags on the last
> segment are set on the entire segment.
> This is the behaviour of the Linux stack as well.
>
> Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
> ---
> v1:
> If the received packet is not a pure ACK packet, we check whether
> there are any previous packets in the flow. If there are, we include
> the received packet in the coalescing logic as well and apply the flags
> of the last received packet to the entire segment, which avoids
> re-ordering.
>
> Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
> P1 contains the PSH flag, and since there are no prior packets in the flow,
> we copy it to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
> In the existing code, P2 and P3 would be delivered as a single segment first
> and the unprocess_packets would be copied later, which causes reordering.
> With the patch, the unprocessed packets are copied first and then the
> packets from the GRO table.
>
> Testing done
> The csum test-pmd was modified to support the following:
> a GET request of 10MB from client to server via test-pmd (static ARP
> entries added in client and server). GRO and TSO were enabled in test-pmd,
> where the packets received from the client MAC are sent to the server MAC
> and vice versa.
> In the above test, without the patch the client observed re-ordering
> of 25 packets, and with the patch no packet re-ordering was observed.
>
> v2:
> Fix warnings in commit and comment.
> Do not consider packet as candidate to merge if it contains SYN/RST
> flag.
>
> v3:
> Fix warnings.
>
> v4:
> Rebase with master.
>
> v5:
> Adding co-author email
>
> lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
> lib/gro/rte_gro.c | 18 +++++++++---------
> 2 files changed, 46 insertions(+), 17 deletions(-)
>
> diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
> index 0014096e63..7363c5d540 100644
> --- a/lib/gro/gro_tcp4.c
> +++ b/lib/gro/gro_tcp4.c
> @@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
> pkt->l2_len);
> }
>
> +static inline void
> +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
> +{
> + struct rte_ether_hdr *eth_hdr;
> + struct rte_ipv4_hdr *ipv4_hdr;
> + struct rte_tcp_hdr *merged_tcp_hdr;
> +
> + eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
> + ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> + merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> + merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
> +}
The Linux kernel updates the TCP flags via "tcp_flag_word(th2) |= flags & (TCP_FLAG_FIN | TCP_FLAG_PSH)",
which adds at most FIN and PSH to the merged packet.
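A minimal sketch of what a FIN/PSH-only flag merge could look like. The helper
name merge_tcp_flags and the local TCP_* macros are hypothetical and exist only
so the sketch compiles without DPDK; the bit values are those of the TCP header
flags byte. This is an illustration, not the code in the patch or in the kernel.

```c
#include <stdint.h>

/* TCP header flag bits (local names, assumed for this sketch only). */
#define TCP_FIN 0x01
#define TCP_SYN 0x02
#define TCP_RST 0x04
#define TCP_PSH 0x08
#define TCP_ACK 0x10
#define TCP_URG 0x20

/*
 * Hypothetical helper: fold only FIN and PSH from the newly merged
 * segment into the flags already present on the merged packet.
 * Any other bit on the new segment (URG, ECE, CWR, ...) is ignored.
 */
static inline uint8_t
merge_tcp_flags(uint8_t merged_flags, uint8_t new_flags)
{
    return (uint8_t)(merged_flags | (new_flags & (TCP_FIN | TCP_PSH)));
}
```

With this mask, a trailing FIN|PSH|ACK segment marks the merged packet with
FIN and PSH, while a stray URG bit is not propagated.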
> +
> int32_t
> gro_tcp4_reassemble(struct rte_mbuf *pkt,
> struct gro_tcp4_tbl *tbl,
> @@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> uint32_t i, max_flow_num, remaining_flow_num;
> int cmp;
> uint8_t find;
> + uint32_t start_idx;
>
> /*
> * Don't process the packet whose TCP header length is greater
> @@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>
> - /*
> - * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
> - * or CWR set.
> - */
> - if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> - return -1;
> -
> /* trim the tail padding bytes */
> ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
> if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
> @@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
> if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
> find = 1;
> + start_idx = tbl->flows[i].start_index;
> break;
> }
> remaining_flow_num--;
> }
> }
>
> + if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
> + /*
> + * Check and try merging the current TCP segment with the previous
> + * TCP segment if the TCP header does not contain RST and SYN flag
> + * There are cases where the last segment is sent with FIN|PSH|ACK
> + * which should also be considered for merging with previous segments.
> + */
> + if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
> + /*
> + * Since PSH flag is set, start time will be set to 0 so it will be flushed
> + * immediately.
> + */
> + tbl->items[start_idx].start_time = 0;
> + else
> + return -1;
> + }
The nested if-else check is not straightforward, and it is hard to tell which action is taken
for each combination of flag bits. In addition, are all flag bits handled the same way as in
the Linux kernel?
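One way to make the condition-action mapping explicit is a flat classifier
with early returns. The sketch below only restates the condition quoted above
(pure ACK merges as before; SYN/RST or no existing flow bypasses GRO; anything
else merges and is flushed immediately). The enum, the function name and the
local TCP_* macros are hypothetical, and the sketch does not settle whether
URG, ECE or CWR should be treated as the Linux kernel treats them.

```c
#include <stdint.h>

/* TCP header flag bits (local names, assumed for this sketch only). */
#define TCP_FIN 0x01
#define TCP_SYN 0x02
#define TCP_RST 0x04
#define TCP_PSH 0x08
#define TCP_ACK 0x10

/* Hypothetical action codes for one incoming TCP segment. */
enum gro_flag_action {
    GRO_MERGE,           /* pure ACK: normal coalescing */
    GRO_MERGE_AND_FLUSH, /* merge into the existing flow, flush at once */
    GRO_BYPASS           /* hand the packet back unprocessed */
};

/*
 * Hypothetical classifier mirroring the quoted condition:
 * flow_found is non-zero when the flow already has packets in the table.
 */
static enum gro_flag_action
classify_tcp_flags(uint8_t flags, int flow_found)
{
    if (flags == TCP_ACK)
        return GRO_MERGE;
    if (flags & (TCP_SYN | TCP_RST))
        return GRO_BYPASS;
    if (!flow_found)
        return GRO_BYPASS;
    return GRO_MERGE_AND_FLUSH;
}
```

Each early return corresponds to one row of the condition-action table, which
makes it easier to see where an extra rule for a specific flag would go.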
> +
> /*
> * Fail to find a matched flow. Insert a new flow and store the
> * packet into the flow.
> @@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> is_atomic);
> if (cmp) {
> if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
> - pkt, cmp, sent_seq, ip_id, 0))
> + pkt, cmp, sent_seq, ip_id, 0)) {
> + if (tbl->items[cur_idx].start_time == 0)
> + update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
> return 1;
> + }
> +
> /*
> * Fail to merge the two packets, as the packet
> * length is greater than the max value. Store
> diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
> index e35399fd42..87c5502dce 100644
> --- a/lib/gro/rte_gro.c
> +++ b/lib/gro/rte_gro.c
> @@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> if ((nb_after_gro < nb_pkts)
> || (unprocess_num < nb_pkts)) {
> i = 0;
> + /* Copy unprocessed packets */
> + if (unprocess_num > 0) {
> + memcpy(&pkts[i], unprocess_pkts,
> + sizeof(struct rte_mbuf *) *
> + unprocess_num);
> + i = unprocess_num;
> + }
Why copy the unprocessed packets first? Is this to avoid out-of-order delivery?
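For reference, the ordering that the patch description gives for its
P1(PSH), P2(ACK), P3(ACK) example can be sketched as below. Packets are
modelled as plain integers, and emit_burst is a hypothetical stand-in for the
output step of burst-mode reassembly, not the rte_gro code itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical output step: write the packets that bypassed GRO first,
 * then the packets flushed from the GRO table. Returns the number of
 * packets written, or 0 if the output array is too small.
 */
static size_t
emit_burst(int *out, size_t out_cap,
           const int *unprocessed, size_t n_unprocessed,
           const int *flushed, size_t n_flushed)
{
    size_t i = 0;

    if (n_unprocessed + n_flushed > out_cap)
        return 0;

    /* Packets that were never inserted into the table go out first. */
    if (n_unprocessed > 0) {
        memcpy(&out[i], unprocessed, n_unprocessed * sizeof(out[0]));
        i = n_unprocessed;
    }

    /* Merged packets flushed from the table follow. */
    if (n_flushed > 0) {
        memcpy(&out[i], flushed, n_flushed * sizeof(out[0]));
        i += n_flushed;
    }

    return i;
}
```

In the example from the patch notes, P1 is the only unprocessed packet and the
merged P2+P3 is the only flushed packet, so P1 lands ahead of P2+P3 in the
output array.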
Thanks,
Jiayu
> /* Flush all packets from the tables */
> if (do_vxlan_tcp_gro) {
> - i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> - 0, pkts, nb_pkts);
> + i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> + 0, &pkts[i], nb_pkts - i);
> }
>
> if (do_vxlan_udp_gro) {
> @@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
> &pkts[i], nb_pkts - i);
> }
> - /* Copy unprocessed packets */
> - if (unprocess_num > 0) {
> - memcpy(&pkts[i], unprocess_pkts,
> - sizeof(struct rte_mbuf *) *
> - unprocess_num);
> - }
> - nb_after_gro = i + unprocess_num;
> }
>
> return nb_after_gro;
> --
> 2.25.1