DPDK patches and discussions
* [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets
@ 2022-09-07  8:59 Kumara Parameshwaran
  2022-09-07  9:32 ` Kumara Parameshwaran
  2022-11-01  7:03 ` [PATCH v5] gro : fix reordering of packets in GRO library Kumara Parameshwaran
  0 siblings, 2 replies; 10+ messages in thread
From: Kumara Parameshwaran @ 2022-09-07  8:59 UTC (permalink / raw)
  To: jiayu.hu; +Cc: dev, Kumara Parameshwaran

From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>

When more than two packets are chained in a flow and a third packet
arrives whose sequence number follows the second packet, prev_idx still
refers to the first item rather than the second, so the new item is
inserted in the wrong position, resulting in packet reordering.

Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
---
V1:
	Initial changes to fix packet reordering issue when
	more than 2 items are chained in a flow.
	Ex:
		3 mergeable TCP packets received in order.
		packet_0 - no flow found, so insert the packet with new start
		index -> 0
		packet_1 - flow found. prev_idx = cur_idx = 0. So the merge
		works fine: packet_0->packet_1
		packet_2 - flow found. prev_idx = 0, cur_idx = 1. Matching
		sequence numbers found, but chained as
		packet_0->packet_2->packet_1

 lib/gro/gro_tcp4.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 7498c66141..9758e28fd5 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -305,7 +305,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 			 * length is greater than the max value. Store
 			 * the packet into the flow.
 			 */
-			if (insert_new_item(tbl, pkt, start_time, prev_idx,
+			if (insert_new_item(tbl, pkt, start_time, cur_idx,
 						sent_seq, ip_id, is_atomic) ==
 					INVALID_ARRAY_INDEX)
 				return -1;
-- 
2.25.1
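
A minimal sketch of the chaining problem described in this patch, assuming
a simplified item table that keeps only the chain link. The names
insert_after(), walk() and build_chain() are hypothetical and are not part
of lib/gro; the real insert_new_item() takes more arguments. The sketch only
shows why inserting after prev_idx gives 0->2->1 while inserting after
cur_idx gives 0->1->2.

```c
#include <assert.h>
#include <stdint.h>

#define INVALID_IDX UINT32_MAX
#define MAX_ITEMS 8

/* Simplified stand-in for a GRO item: only the chain link matters here. */
struct item {
	uint32_t seq;          /* illustrative sequence number */
	uint32_t next_pkt_idx; /* index of the next item in the flow */
};

static struct item items[MAX_ITEMS];
static uint32_t item_num;

/* Insert a new item after 'after_idx', or start a chain if INVALID_IDX. */
static uint32_t
insert_after(uint32_t after_idx, uint32_t seq)
{
	uint32_t idx;

	assert(item_num < MAX_ITEMS);
	idx = item_num++;
	items[idx].seq = seq;
	items[idx].next_pkt_idx = INVALID_IDX;
	if (after_idx != INVALID_IDX) {
		items[idx].next_pkt_idx = items[after_idx].next_pkt_idx;
		items[after_idx].next_pkt_idx = idx;
	}
	return idx;
}

/* Walk the chain from 'start' and record the visited indices in 'out'. */
static uint32_t
walk(uint32_t start, uint32_t *out)
{
	uint32_t n = 0;
	uint32_t i;

	for (i = start; i != INVALID_IDX; i = items[i].next_pkt_idx)
		out[n++] = i;
	return n;
}

/*
 * Build a three-item chain. When the third packet arrives, the search
 * has stopped at the second item: prev_idx = 0, cur_idx = 1.
 */
static uint32_t
build_chain(int use_cur_idx, uint32_t *out)
{
	uint32_t first, second, prev_idx, cur_idx;

	item_num = 0;
	first = insert_after(INVALID_IDX, 100);
	second = insert_after(first, 200);
	prev_idx = first;
	cur_idx = second;

	insert_after(use_cur_idx ? cur_idx : prev_idx, 300);
	return walk(first, out);
}
```

Calling build_chain(0, out) models the old behaviour (insert after
prev_idx) and build_chain(1, out) models the behaviour after this patch.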



* [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets
  2022-09-07  8:59 [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets Kumara Parameshwaran
@ 2022-09-07  9:32 ` Kumara Parameshwaran
  2022-09-08  6:06   ` Hu, Jiayu
  2022-11-01  7:03 ` [PATCH v5] gro : fix reordering of packets in GRO library Kumara Parameshwaran
  1 sibling, 1 reply; 10+ messages in thread
From: Kumara Parameshwaran @ 2022-09-07  9:32 UTC (permalink / raw)
  To: jiayu.hu; +Cc: dev, Kumara Parameshwaran

From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>

When more than two packets are chained in a flow and a third packet
arrives whose sequence number follows the second packet, prev_idx still
refers to the first item rather than the second, so the new item is
inserted in the wrong position, resulting in packet reordering.

Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
---
V1:
	Initial changes to fix packet reordering issue when
	more than 2 items are chained in a flow.
	Ex:
		3 mergeable TCP packets received in order.
		packet_0 - no flow found, so insert the packet with new start
		index -> 0
		packet_1 - flow found. prev_idx = cur_idx = 0. So the merge
		works fine: packet_0->packet_1
		packet_2 - flow found. prev_idx = 0, cur_idx = 1. Matching
		sequence numbers found, but chained as
		packet_0->packet_2->packet_1

 lib/gro/gro_tcp4.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 7498c66141..9758e28fd5 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -305,7 +305,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 			 * length is greater than the max value. Store
 			 * the packet into the flow.
 			 */
-			if (insert_new_item(tbl, pkt, start_time, prev_idx,
+			if (insert_new_item(tbl, pkt, start_time, cur_idx,
 						sent_seq, ip_id, is_atomic) ==
 					INVALID_ARRAY_INDEX)
 				return -1;
-- 
2.25.1



* RE: [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets
  2022-09-07  9:32 ` Kumara Parameshwaran
@ 2022-09-08  6:06   ` Hu, Jiayu
  2022-10-05 12:16     ` Thomas Monjalon
  0 siblings, 1 reply; 10+ messages in thread
From: Hu, Jiayu @ 2022-09-08  6:06 UTC (permalink / raw)
  To: Kumara Parameshwaran; +Cc: dev



> -----Original Message-----
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> Sent: Wednesday, September 7, 2022 5:32 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Kumara Parameshwaran
> <kumaraparamesh92@gmail.com>
> Subject: [PATCH] gro: fix the chain index in insert_new_item for more than 2
> packets
> 
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> 
> When more than two packets are merged in a flow, and if we receive a 3rd
> packet which is matching the sequence of the 2nd packet the prev_idx will
> be 1 and not 2, hence resulting in packet re-ordering
> 
> Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> ---
> V1:
> 	Initial changes to fix packet reordering issue when
> 	more than 2 items are chained in a flow.
> 	Ex:
> 		3 mergeable TCP packets received in order.
> 		packet_0 - no flow found so insert the packet and new start
> 		index -> 0
> 		packet_1 - flow found. prev_idx = cur_idx = 0. So the merge
> 		works fine: packet_0->packet_1
> 		packet_2 - flow found. prev_idx = 0, cur_idx = 1. Matching
> 		sequence numbers found, but chained as
> 		packet_0->packet_2->packet_1
> 
>  lib/gro/gro_tcp4.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c index
> 7498c66141..9758e28fd5 100644
> --- a/lib/gro/gro_tcp4.c
> +++ b/lib/gro/gro_tcp4.c
> @@ -305,7 +305,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
>  			 * length is greater than the max value. Store
>  			 * the packet into the flow.
>  			 */
> -			if (insert_new_item(tbl, pkt, start_time, prev_idx,
> +			if (insert_new_item(tbl, pkt, start_time, cur_idx,
>  						sent_seq, ip_id, is_atomic)

Good catch.

Acked-by: Jiayu Hu <Jiayu.hu@intel.com>

Thanks,
Jiayu
> ==
>  					INVALID_ARRAY_INDEX)
>  				return -1;
> --
> 2.25.1



* Re: [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets
  2022-09-08  6:06   ` Hu, Jiayu
@ 2022-10-05 12:16     ` Thomas Monjalon
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Monjalon @ 2022-10-05 12:16 UTC (permalink / raw)
  To: Kumara Parameshwaran; +Cc: dev, Hu, Jiayu

> > When more than two packets are merged in a flow, and if we receive a 3rd
> > packet which is matching the sequence of the 2nd packet the prev_idx will
> > be 1 and not 2, hence resulting in packet re-ordering
> > 
> > Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > ---
> > V1:
> > 	Initial changes to fix packet reordering issue when
> > 	more than 2 items are chained in a flow.
> > 	Ex:
> > 		3 mergeable TCP packets received in order.
> > 		packet_0 - no flow found so insert the packet and new start
> > 		index -> 0
> > 		packet_1 - flow found. prev_idx = cur_idx = 0. So the merge
> > 		works fine: packet_0->packet_1
> > 		packet_2 - flow found. prev_idx = 0, cur_idx = 1. Matching
> > 		sequence numbers found, but chained as
> > 		packet_0->packet_2->packet_1
> > 
> >  lib/gro/gro_tcp4.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c index
> > 7498c66141..9758e28fd5 100644
> > --- a/lib/gro/gro_tcp4.c
> > +++ b/lib/gro/gro_tcp4.c
> > @@ -305,7 +305,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >  			 * length is greater than the max value. Store
> >  			 * the packet into the flow.
> >  			 */
> > -			if (insert_new_item(tbl, pkt, start_time, prev_idx,
> > +			if (insert_new_item(tbl, pkt, start_time, cur_idx,
> >  						sent_seq, ip_id, is_atomic)
> 
> Good catch.
> 
> Acked-by: Jiayu Hu <Jiayu.hu@intel.com>

Applied, thanks.





* [PATCH v5] gro : fix reordering of packets in GRO library
  2022-09-07  8:59 [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets Kumara Parameshwaran
  2022-09-07  9:32 ` Kumara Parameshwaran
@ 2022-11-01  7:03 ` Kumara Parameshwaran
  1 sibling, 0 replies; 10+ messages in thread
From: Kumara Parameshwaran @ 2022-11-01  7:03 UTC (permalink / raw)
  To: jiayu.hu; +Cc: dev, Kumara Parameshwaran, Kumara Parameshwaran

From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>

When a TCP packet carries flags such as PSH, it is returned
immediately to the application even though packets of the same flow
may still be in the GRO table. If the PSH flag is set on a segment,
all packets up to and including that segment should be delivered
immediately. However, the current implementation delivers only the
last-arrived packet, the one with the PSH flag set, which causes
reordering.

With this patch, if a packet carries flags other than just ACK and
there are no previous packets for the flow, the packet is returned
immediately; otherwise it is merged with the previous segment and the
flags of the last segment are set on the entire merged segment.
This is the behaviour of the Linux stack as well.

Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
---
v1:
	If the received packet is not a pure ACK packet, we check whether
	there are any previous packets in the flow. If there are, we include
	the received packet in the coalescing logic as well and apply the flags
	of the last received packet to the entire segment, which avoids
	reordering.

	Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
	P1 contains the PSH flag, and since there are no prior packets in the flow,
	we copy it to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
	In the existing code, P2 and P3 would be delivered as a single segment first and
	unprocess_pkts would be copied later, which causes reordering. With the patch,
	the unprocessed packets are copied first, followed by the packets from the GRO table.

	Testing done
	The csum test-pmd was modified to support the following:
	a GET request of 10MB from client to server via test-pmd (static ARP entries added in client
	and server). GRO and TSO were enabled in test-pmd, where the packets received from the client MAC
	are sent to the server MAC and vice versa.
	In the above test, without the patch the client observed reordering of 25 packets,
	and with the patch no packet reordering was observed.

v2: 
	Fix warnings in commit and comment.
	Do not consider packet as candidate to merge if it contains SYN/RST flag.

v3:
	Fix warnings.

v4:
	Rebase with master.

v5:
	Adding co-author email

 lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
 lib/gro/rte_gro.c  | 18 +++++++++---------
 2 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 0014096e63..7363c5d540 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
 			pkt->l2_len);
 }
 
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+	struct rte_ether_hdr *eth_hdr;
+	struct rte_ipv4_hdr *ipv4_hdr;
+	struct rte_tcp_hdr *merged_tcp_hdr;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
+	ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+	merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+}
+
 int32_t
 gro_tcp4_reassemble(struct rte_mbuf *pkt,
 		struct gro_tcp4_tbl *tbl,
@@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 	uint32_t i, max_flow_num, remaining_flow_num;
 	int cmp;
 	uint8_t find;
+	uint32_t start_idx;
 
 	/*
 	 * Don't process the packet whose TCP header length is greater
@@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 	tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
 	hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
 
-	/*
-	 * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
-	 * or CWR set.
-	 */
-	if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
-		return -1;
-
 	/* trim the tail padding bytes */
 	ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
 	if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 		if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
 			if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
 				find = 1;
+				start_idx = tbl->flows[i].start_index;
 				break;
 			}
 			remaining_flow_num--;
 		}
 	}
 
+	if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
+		/*
+		 * Check and try merging the current TCP segment with the previous
+		 * TCP segment if the TCP header does not contain RST and SYN flag
+		 * There are cases where the last segment is sent with FIN|PSH|ACK
+		 * which should also be considered for merging with previous segments.
+		 */
+		if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
+			/*
+			 * Since PSH flag is set, start time will be set to 0 so it will be flushed
+			 * immediately.
+			 */
+			tbl->items[start_idx].start_time = 0;
+		else
+			return -1;
+	}
+
 	/*
 	 * Fail to find a matched flow. Insert a new flow and store the
 	 * packet into the flow.
@@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 				is_atomic);
 		if (cmp) {
 			if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
-						pkt, cmp, sent_seq, ip_id, 0))
+						pkt, cmp, sent_seq, ip_id, 0)) {
+				if (tbl->items[cur_idx].start_time == 0)
+					update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
 				return 1;
+			}
+
 			/*
 			 * Fail to merge the two packets, as the packet
 			 * length is greater than the max value. Store
diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
index e35399fd42..87c5502dce 100644
--- a/lib/gro/rte_gro.c
+++ b/lib/gro/rte_gro.c
@@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
 	if ((nb_after_gro < nb_pkts)
 		 || (unprocess_num < nb_pkts)) {
 		i = 0;
+		/* Copy unprocessed packets */
+		if (unprocess_num > 0) {
+			memcpy(&pkts[i], unprocess_pkts,
+					sizeof(struct rte_mbuf *) *
+					unprocess_num);
+			i = unprocess_num;
+		}
 		/* Flush all packets from the tables */
 		if (do_vxlan_tcp_gro) {
-			i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
-					0, pkts, nb_pkts);
+			i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
+					0, &pkts[i], nb_pkts - i);
 		}
 
 		if (do_vxlan_udp_gro) {
@@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
 			i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
 					&pkts[i], nb_pkts - i);
 		}
-		/* Copy unprocessed packets */
-		if (unprocess_num > 0) {
-			memcpy(&pkts[i], unprocess_pkts,
-					sizeof(struct rte_mbuf *) *
-					unprocess_num);
-		}
-		nb_after_gro = i + unprocess_num;
 	}
 
 	return nb_after_gro;
-- 
2.25.1
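
A minimal sketch of the output ordering that the rte_gro.c hunk aims for,
assuming plain ints stand in for the mbuf pointers. The helper name
emit_in_order() and its parameters are hypothetical and do not exist in
lib/gro; the sketch only illustrates copying the unprocessed entries first
and appending the entries flushed from the table after them.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write the unprocessed entries to 'out' first,
 * then the entries flushed from the table. Returns the number of
 * entries written, or 0 if 'out' is too small to hold both sets.
 */
static size_t
emit_in_order(const int *unprocessed, size_t nb_unprocessed,
	      const int *flushed, size_t nb_flushed,
	      int *out, size_t out_cap)
{
	size_t i = 0;

	if (nb_unprocessed + nb_flushed > out_cap)
		return 0;

	/* Unprocessed packets go first (P1 with PSH in the example). */
	if (nb_unprocessed > 0) {
		memcpy(&out[i], unprocessed, sizeof(*out) * nb_unprocessed);
		i = nb_unprocessed;
	}

	/* Packets flushed from the table follow (merged P2+P3). */
	if (nb_flushed > 0) {
		memcpy(&out[i], flushed, sizeof(*out) * nb_flushed);
		i += nb_flushed;
	}

	return i;
}
```

With one unprocessed entry standing in for P1 and one flushed entry
standing in for the merged P2+P3, the output keeps P1 ahead of the merged
segment, which matches the arrival order in the v1 example.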



* Re: [PATCH v5] gro : fix reordering of packets in GRO library
  2023-06-20  7:35   ` Hu, Jiayu
  2023-06-21  8:47     ` kumaraparameshwaran rathinavel
@ 2023-06-30 11:32     ` kumaraparameshwaran rathinavel
  1 sibling, 0 replies; 10+ messages in thread
From: kumaraparameshwaran rathinavel @ 2023-06-30 11:32 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Kumara Parameshwaran, thomas


On Tue, Jun 20, 2023 at 1:06 PM Hu, Jiayu <jiayu.hu@intel.com> wrote:

> Hi Kumara,
>
> Please see replies inline.
>
> Thanks,
> Jiayu
>
> > -----Original Message-----
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Sent: Tuesday, November 1, 2022 3:06 PM
> > To: Hu, Jiayu <jiayu.hu@intel.com>
> > Cc: dev@dpdk.org; Kumara Parameshwaran
> > <kumaraparamesh92@gmail.com>; Kumara Parameshwaran
> > <kparameshwar@vmware.com>
> > Subject: [PATCH v5] gro : fix reordering of packets in GRO library
> >
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> >
> > When a TCP packet carries flags such as PSH, it is returned
> > immediately to the application even though packets of the same flow
> > may still be in the GRO table. If the PSH flag is set on a segment,
> > all packets up to and including that segment should be delivered
> > immediately. However, the current implementation delivers only the
> > last-arrived packet, the one with the PSH flag set, which causes
> > reordering.
> >
> > With this patch, if a packet carries flags other than just ACK and
> > there are no previous packets for the flow, the packet is returned
> > immediately; otherwise it is merged with the previous segment and the
> > flags of the last segment are set on the entire merged segment.
> > This is the behaviour of the Linux stack as well.
> >
> > Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
> > ---
> > v1:
> >       If the received packet is not a pure ACK packet, we check whether
> >       there are any previous packets in the flow. If there are, we include
> >       the received packet in the coalescing logic as well and apply the flags
> >       of the last received packet to the entire segment, which avoids
> >       reordering.
> >
> >       Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
> >       P1 contains the PSH flag, and since there are no prior packets in the flow,
> >       we copy it to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
> >       In the existing code, P2 and P3 would be delivered as a single segment first and
> >       unprocess_pkts would be copied later, which causes reordering. With the patch,
> >       the unprocessed packets are copied first, followed by the packets from the GRO table.
> >
> >       Testing done
> >       The csum test-pmd was modified to support the following:
> >       a GET request of 10MB from client to server via test-pmd (static ARP entries added in client
> >       and server). GRO and TSO were enabled in test-pmd, where the packets received from the client MAC
> >       are sent to the server MAC and vice versa.
> >       In the above test, without the patch the client observed reordering of 25 packets,
> >       and with the patch no packet reordering was observed.
> >
> > v2:
> >       Fix warnings in commit and comment.
> >       Do not consider packet as candidate to merge if it contains SYN/RST flag.
> >
> > v3:
> >       Fix warnings.
> >
> > v4:
> >       Rebase with master.
> >
> > v5:
> >       Adding co-author email
> >
> >  lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
> >  lib/gro/rte_gro.c  | 18 +++++++++---------
> >  2 files changed, 46 insertions(+), 17 deletions(-)
> >
> > diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
> > index 0014096e63..7363c5d540 100644
> > --- a/lib/gro/gro_tcp4.c
> > +++ b/lib/gro/gro_tcp4.c
> > @@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
> >                       pkt->l2_len);
> >  }
> >
> > +static inline void
> > +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
> > +{
> > +     struct rte_ether_hdr *eth_hdr;
> > +     struct rte_ipv4_hdr *ipv4_hdr;
> > +     struct rte_tcp_hdr *merged_tcp_hdr;
> > +
> > +     eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
> > +     ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> > +     merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +     merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
> > +}
>
> The Linux kernel updates the TCP flag via "tcp_flag_word(th2) |= flags &
> (TCP_FLAG_FIN | TCP_FLAG_PSH)",
> which only adds FIN and PSH at most to the merge packet.
>
> > +
> >  int32_t
> >  gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >               struct gro_tcp4_tbl *tbl,
> > @@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >       uint32_t i, max_flow_num, remaining_flow_num;
> >       int cmp;
> >       uint8_t find;
> > +     uint32_t start_idx;
> >
> >       /*
> >        * Don't process the packet whose TCP header length is greater
> > @@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >       tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> >       hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> >
> > -     /*
> > -      * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
> > -      * or CWR set.
> > -      */
> > -     if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> > -             return -1;
> > -
> >       /* trim the tail padding bytes */
> >       ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
> >       if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
> > @@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >               if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
> >                       if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
> >                               find = 1;
> > +                             start_idx = tbl->flows[i].start_index;
> >                               break;
> >                       }
> >                       remaining_flow_num--;
> >               }
> >       }
> >
> > +     if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
> > +             /*
> > +              * Check and try merging the current TCP segment with the previous
> > +              * TCP segment if the TCP header does not contain RST and SYN flag
> > +              * There are cases where the last segment is sent with FIN|PSH|ACK
> > +              * which should also be considered for merging with previous segments.
> > +              */
> > +             if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
> > +                     /*
> > +                      * Since PSH flag is set, start time will be set to 0 so it will be flushed
> > +                      * immediately.
> > +                      */
> > +                     tbl->items[start_idx].start_time = 0;
> > +             else
> > +                     return -1;
> > +     }
>
> The nested if-else check is not straightforward, and it's hard to read the
> condition-action of
> different combinations of flag bits. In addition, are all flag bits
> considered like Linux kernel?
>
>> In the Linux kernel, packets are flushed even if the ACK numbers
>> are different or if the TCP options are different. In DPDK, if the options
>> are different, the packet is inserted as a new item in the table. Is this
>> intended? Should we maintain the same approach? The Linux kernel also
>> considers additional flags like CWR, SYN, RST and URG. I think we can
>> consider them as well, and if one of these flags is set we can flush the
>> entire flow. As you mentioned, the flags can be updated only for FIN and
>> PSH, but what we should make sure of is that packet ordering is maintained
>> on delivery: if a packet with PSH arrives and there are existing packets
>> in the GRO table, we copy the flags and deliver the entire packet in
>> order. This should be the case whenever a packet with one of these flags
>> set arrives and there are packets in the GRO table. Please let me
>> know your thoughts.
>>
> > +
> >       /*
> >        * Fail to find a matched flow. Insert a new flow and store the
> >        * packet into the flow.
> > @@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >                               is_atomic);
> >               if (cmp) {
> >                       if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
> > -                                             pkt, cmp, sent_seq, ip_id, 0))
> > +                                             pkt, cmp, sent_seq, ip_id, 0)) {
> > +                             if (tbl->items[cur_idx].start_time == 0)
> > +                                     update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
> >                               return 1;
> > +                     }
> > +
> >                       /*
> >                        * Fail to merge the two packets, as the packet
> >                        * length is greater than the max value. Store
> > diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
> > index e35399fd42..87c5502dce 100644
> > --- a/lib/gro/rte_gro.c
> > +++ b/lib/gro/rte_gro.c
> > @@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> >       if ((nb_after_gro < nb_pkts)
> >                || (unprocess_num < nb_pkts)) {
> >               i = 0;
> > +             /* Copy unprocessed packets */
> > +             if (unprocess_num > 0) {
> > +                     memcpy(&pkts[i], unprocess_pkts,
> > +                                     sizeof(struct rte_mbuf *) *
> > +                                     unprocess_num);
> > +                     i = unprocess_num;
> > +             }
>
> Why copy unprocess pkts first? This is for avoiding out-of-order?
>
>> Yes, this is to avoid out-of-order delivery.
>>
>
> Thanks,
> Jiayu
> >               /* Flush all packets from the tables */
> >               if (do_vxlan_tcp_gro) {
> > -                     i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > -                                     0, pkts, nb_pkts);
> > +                     i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > +                                     0, &pkts[i], nb_pkts - i);
> >               }
> >
> >               if (do_vxlan_udp_gro) {
> > @@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> >                       i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
> >                                       &pkts[i], nb_pkts - i);
> >               }
> > -             /* Copy unprocessed packets */
> > -             if (unprocess_num > 0) {
> > -                     memcpy(&pkts[i], unprocess_pkts,
> > -                                     sizeof(struct rte_mbuf *) *
> > -                                     unprocess_num);
> > -             }
> > -             nb_after_gro = i + unprocess_num;
> >       }
> >
> >       return nb_after_gro;
> > --
> > 2.25.1
>
>
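
A minimal sketch of the two points raised in this review, assuming local
stand-in flag macros (the TCP_* values below follow the standard TCP header
bit layout) and hypothetical helpers classify_segment() and merge_flags()
that are not part of lib/gro. It flattens the v5 condition into a single
decision function and restricts the propagated flags to FIN and PSH, as
suggested in the comment about the Linux kernel. It does not cover the
additional kernel checks discussed above (ACK number, TCP options, CWR, URG).

```c
#include <stdint.h>

/* Local stand-ins for the TCP flag bits (standard header layout). */
#define TCP_FIN 0x01
#define TCP_SYN 0x02
#define TCP_RST 0x04
#define TCP_PSH 0x08
#define TCP_ACK 0x10
#define TCP_URG 0x20

/* Hypothetical action codes for an incoming segment. */
enum gro_action {
	GRO_MERGE,           /* pure ACK: normal GRO processing */
	GRO_MERGE_AND_FLUSH, /* other flags, flow exists: merge, then flush */
	GRO_BYPASS           /* return the packet to the caller unprocessed */
};

/* Flattened form of the v5 condition: one early return per case. */
static enum gro_action
classify_segment(uint8_t tcp_flags, int flow_found)
{
	if (tcp_flags == TCP_ACK)
		return GRO_MERGE;
	if (!flow_found)
		return GRO_BYPASS;
	if (tcp_flags & (TCP_SYN | TCP_RST))
		return GRO_BYPASS;
	return GRO_MERGE_AND_FLUSH;
}

/* Propagate at most FIN and PSH from the incoming segment. */
static uint8_t
merge_flags(uint8_t merged, uint8_t incoming)
{
	return merged | (incoming & (TCP_FIN | TCP_PSH));
}
```

Writing the decision as early returns makes each flag combination map to
exactly one action, which is one way to address the readability concern
about the nested if-else.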



* Re: [PATCH v5] gro : fix reordering of packets in GRO library
  2023-06-20  7:35   ` Hu, Jiayu
@ 2023-06-21  8:47     ` kumaraparameshwaran rathinavel
  2023-06-30 11:32     ` kumaraparameshwaran rathinavel
  1 sibling, 0 replies; 10+ messages in thread
From: kumaraparameshwaran rathinavel @ 2023-06-21  8:47 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Kumara Parameshwaran, thomas


Hi Jiayu,

Thanks for the comments. I have replied inline below. Will address the
review comments, but I think this would be a good patch to have in general.
Please let me know your thoughts.

On Tue, Jun 20, 2023 at 1:06 PM Hu, Jiayu <jiayu.hu@intel.com> wrote:

> Hi Kumara,
>
> Please see replies inline.
>
> Thanks,
> Jiayu
>
> > -----Original Message-----
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Sent: Tuesday, November 1, 2022 3:06 PM
> > To: Hu, Jiayu <jiayu.hu@intel.com>
> > Cc: dev@dpdk.org; Kumara Parameshwaran
> > <kumaraparamesh92@gmail.com>; Kumara Parameshwaran
> > <kparameshwar@vmware.com>
> > Subject: [PATCH v5] gro : fix reordering of packets in GRO library
> >
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> >
> > When a TCP packet carries flags such as PSH, it is returned
> > immediately to the application even though packets of the same flow
> > may still be in the GRO table. If the PSH flag is set on a segment,
> > all packets up to and including that segment should be delivered
> > immediately. However, the current implementation delivers only the
> > last-arrived packet, the one with the PSH flag set, which causes
> > reordering.
> >
> > With this patch, if a packet carries flags other than just ACK and
> > there are no previous packets for the flow, the packet is returned
> > immediately; otherwise it is merged with the previous segment and the
> > flags of the last segment are set on the entire merged segment.
> > This is the behaviour of the Linux stack as well.
> >
> > Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
> > ---
> > v1:
> >       If the received packet is not a pure ACK packet, we check whether
> >       there are any previous packets in the flow. If there are, we include
> >       the received packet in the coalescing logic as well and apply the flags
> >       of the last received packet to the entire segment, which avoids
> >       reordering.
> >
> >       Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
> >       P1 contains the PSH flag, and since there are no prior packets in the flow,
> >       we copy it to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
> >       In the existing code, P2 and P3 would be delivered as a single segment first and
> >       unprocess_pkts would be copied later, which causes reordering. With the patch,
> >       the unprocessed packets are copied first, followed by the packets from the GRO table.
> >
> >       Testing done
> >       The csum test-pmd was modified to support the following:
> >       a GET request of 10MB from client to server via test-pmd (static ARP entries added in client
> >       and server). GRO and TSO were enabled in test-pmd, where the packets received from the client MAC
> >       are sent to the server MAC and vice versa.
> >       In the above test, without the patch the client observed reordering of 25 packets,
> >       and with the patch no packet reordering was observed.
> >
> > v2:
> >       Fix warnings in commit and comment.
> >       Do not consider packet as candidate to merge if it contains SYN/RST flag.
> >
> > v3:
> >       Fix warnings.
> >
> > v4:
> >       Rebase with master.
> >
> > v5:
> >       Adding co-author email
> >
> >  lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
> >  lib/gro/rte_gro.c  | 18 +++++++++---------
> >  2 files changed, 46 insertions(+), 17 deletions(-)
> >
> > diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
> > index 0014096e63..7363c5d540 100644
> > --- a/lib/gro/gro_tcp4.c
> > +++ b/lib/gro/gro_tcp4.c
> > @@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
> >                       pkt->l2_len);
> >  }
> >
> > +static inline void
> > +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
> > +{
> > +     struct rte_ether_hdr *eth_hdr;
> > +     struct rte_ipv4_hdr *ipv4_hdr;
> > +     struct rte_tcp_hdr *merged_tcp_hdr;
> > +
> > +     eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
> > +     ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> > +     merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +     merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
> > +}
>
> The Linux kernel updates the TCP flag via "tcp_flag_word(th2) |= flags &
> (TCP_FLAG_FIN | TCP_FLAG_PSH)",
> which only adds FIN and PSH at most to the merge packet.
>
>> Sure, will change it to add only FIN and PSH.
>>
>
> > +
> >  int32_t
> >  gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >               struct gro_tcp4_tbl *tbl,
> > @@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >       uint32_t i, max_flow_num, remaining_flow_num;
> >       int cmp;
> >       uint8_t find;
> > +     uint32_t start_idx;
> >
> >       /*
> >        * Don't process the packet whose TCP header length is greater
> > @@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >       tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> >       hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> >
> > -     /*
> > -      * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
> > -      * or CWR set.
> > -      */
> > -     if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> > -             return -1;
> > -
> >       /* trim the tail padding bytes */
> >       ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
> >       if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
> > @@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >               if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
> >                       if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
> >                               find = 1;
> > +                             start_idx = tbl->flows[i].start_index;
> >                               break;
> >                       }
> >                       remaining_flow_num--;
> >               }
> >       }
> >
> > +     if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
> > +             /*
> > +              * Check and try merging the current TCP segment with the previous
> > +              * TCP segment if the TCP header does not contain RST and SYN flag
> > +              * There are cases where the last segment is sent with FIN|PSH|ACK
> > +              * which should also be considered for merging with previous segments.
> > +              */
> > +             if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
> > +                     /*
> > +                      * Since PSH flag is set, start time will be set to 0 so it will be flushed
> > +                      * immediately.
> > +                      */
> > +                     tbl->items[start_idx].start_time = 0;
> > +             else
> > +                     return -1;
> > +     }
>
> The nested if-else check is not straightforward, and it's hard to read the condition-action of
> different combinations of flag bits. In addition, are all flag bits considered like Linux kernel?
>
>> Not all bits are considered. Will fix to make sure that all bits are
>> considered. Will try to reorganise the code better. Let me know if you have
>> any other suggestions here.
>>
>
> > +
> >       /*
> >        * Fail to find a matched flow. Insert a new flow and store the
> >        * packet into the flow.
> > @@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >                               is_atomic);
> >               if (cmp) {
> >                       if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
> > -                                             pkt, cmp, sent_seq, ip_id, 0))
> > +                                             pkt, cmp, sent_seq, ip_id, 0)) {
> > +                             if (tbl->items[cur_idx].start_time == 0)
> > +                                     update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
> >                               return 1;
> > +                     }
> > +
> >                       /*
> >                        * Fail to merge the two packets, as the packet
> >                        * length is greater than the max value. Store
> > diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
> > index e35399fd42..87c5502dce 100644
> > --- a/lib/gro/rte_gro.c
> > +++ b/lib/gro/rte_gro.c
> > @@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> >       if ((nb_after_gro < nb_pkts)
> >                || (unprocess_num < nb_pkts)) {
> >               i = 0;
> > +             /* Copy unprocessed packets */
> > +             if (unprocess_num > 0) {
> > +                     memcpy(&pkts[i], unprocess_pkts,
> > +                                     sizeof(struct rte_mbuf *) *
> > +                                     unprocess_num);
> > +                     i = unprocess_num;
> > +             }
>
> Why copy unprocess pkts first? This is for avoiding out-of-order?
>
> Thanks,
> Jiayu
> >               /* Flush all packets from the tables */
> >               if (do_vxlan_tcp_gro) {
> > -                     i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > -                                     0, pkts, nb_pkts);
> > +                     i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > +                                     0, &pkts[i], nb_pkts - i);
> >               }
> >
> >               if (do_vxlan_udp_gro) {
> > @@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> >                       i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
> >                                       &pkts[i], nb_pkts - i);
> >               }
> > -             /* Copy unprocessed packets */
> > -             if (unprocess_num > 0) {
> > -                     memcpy(&pkts[i], unprocess_pkts,
> > -                                     sizeof(struct rte_mbuf *) *
> > -                                     unprocess_num);
> > -             }
> > -             nb_after_gro = i + unprocess_num;
> >       }
> >
> >       return nb_after_gro;
> > --
> > 2.25.1
>
>



* RE: [PATCH v5] gro : fix reordering of packets in GRO library
  2022-11-01  7:05 ` [PATCH v5] " Kumara Parameshwaran
  2023-06-19 13:25   ` Thomas Monjalon
@ 2023-06-20  7:35   ` Hu, Jiayu
  2023-06-21  8:47     ` kumaraparameshwaran rathinavel
  2023-06-30 11:32     ` kumaraparameshwaran rathinavel
  1 sibling, 2 replies; 10+ messages in thread
From: Hu, Jiayu @ 2023-06-20  7:35 UTC (permalink / raw)
  To: Kumara Parameshwaran; +Cc: dev, Kumara Parameshwaran, thomas

Hi Kumara,

Please see replies inline.

Thanks,
Jiayu

> -----Original Message-----
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> Sent: Tuesday, November 1, 2022 3:06 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Kumara Parameshwaran
> <kumaraparamesh92@gmail.com>; Kumara Parameshwaran
> <kparameshwar@vmware.com>
> Subject: [PATCH v5] gro : fix reordering of packets in GRO library
> 
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> 
> When a TCP packet carries a flag such as PSH, it is returned immediately to
> the application even though packets of the same flow may still be in the GRO
> table. If the PSH flag is set on a segment, all packets up to and including
> that segment should be delivered immediately. The current implementation
> instead delivers the last-arrived packet (the one with PSH set) ahead of the
> packets still held in the table, causing reordering.
> 
> With this patch, a packet whose flags are not exactly ACK is returned
> immediately only if there are no previous packets for the flow; otherwise it
> is merged with the previous segment, and the flags of the last segment are
> set on the entire merged segment.
> This matches the behaviour of the Linux stack.
> 
> Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
> ---
> v1:
> 	If the received packet is not a pure ACK packet, we check whether
> 	there are any previous packets in the flow. If there are, we include
> 	the received packet in the coalescing logic as well and apply the
> 	flags of the last received packet to the entire segment, which
> 	avoids reordering.
> 
> 	Consider a case where P1(PSH), P2(ACK) and P3(ACK) are received in burst mode.
> 	P1 has the PSH flag set and, since there are no prior packets in the flow,
> 	it is copied to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
> 	In the existing code, the merged P2/P3 segment is delivered first and the
> 	unprocessed packets are copied afterwards, which causes reordering. With this
> 	patch, the unprocessed packets are copied first, followed by the packets from
> 	the GRO table.
> 
> 	Testing done:
> 	The csum test-pmd was modified to support the following:
> 	a GET request of 10MB from client to server via test-pmd (static ARP entries
> 	added on the client and server). GRO and TSO were enabled in test-pmd, so that
> 	packets received from the client MAC are sent to the server MAC and vice versa.
> 	In this test, the client observed reordering of 25 packets without the patch,
> 	and no packet reordering with the patch.
> 
> v2:
> 	Fix warnings in commit and comment.
> 	Do not consider packet as candidate to merge if it contains SYN/RST
> flag.
> 
> v3:
> 	Fix warnings.
> 
> v4:
> 	Rebase with master.
> 
> v5:
> 	Adding co-author email
> 
>  lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
>  lib/gro/rte_gro.c  | 18 +++++++++---------
>  2 files changed, 46 insertions(+), 17 deletions(-)
> 
> diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
> index 0014096e63..7363c5d540 100644
> --- a/lib/gro/gro_tcp4.c
> +++ b/lib/gro/gro_tcp4.c
> @@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
>  			pkt->l2_len);
>  }
> 
> +static inline void
> +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
> +{
> +	struct rte_ether_hdr *eth_hdr;
> +	struct rte_ipv4_hdr *ipv4_hdr;
> +	struct rte_tcp_hdr *merged_tcp_hdr;
> +
> +	eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
> +	ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> +	merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +	merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
> +}

The Linux kernel updates the TCP flag via "tcp_flag_word(th2) |= flags & (TCP_FLAG_FIN | TCP_FLAG_PSH)",
which only adds FIN and PSH at most to the merge packet.
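
To make the suggestion concrete, here is a minimal sketch of the masking rule quoted above. It is not the code from the patch: the `SKETCH_` constants and `sketch_merge_tcp_flags()` are hypothetical names used only for illustration, and the helper works on a plain flags byte rather than on `struct rte_tcp_hdr`. The bit values are the standard TCP header flag bits, which in DPDK correspond to `RTE_TCP_FIN_FLAG` and `RTE_TCP_PSH_FLAG`.

```c
#include <stdint.h>

/* Standard TCP header flag bits (hypothetical local names for this sketch). */
#define SKETCH_TCP_FIN 0x01u
#define SKETCH_TCP_PSH 0x08u

/*
 * Hypothetical helper: compute the flags byte the merged segment should
 * carry after absorbing a segment whose flags byte is `incoming`.
 * Only FIN and PSH are propagated; all other bits of `merged` are kept
 * and all other bits of `incoming` are ignored.
 */
static inline uint8_t
sketch_merge_tcp_flags(uint8_t merged, uint8_t incoming)
{
	uint8_t propagated = incoming & (SKETCH_TCP_FIN | SKETCH_TCP_PSH);

	return (uint8_t)(merged | propagated);
}
```

With this rule, merging a FIN|PSH|ACK segment into an ACK-only segment yields FIN|PSH|ACK, while a stray URG bit on the incoming segment would not leak into the merged header.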

> +
>  int32_t
>  gro_tcp4_reassemble(struct rte_mbuf *pkt,
>  		struct gro_tcp4_tbl *tbl,
> @@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
>  	uint32_t i, max_flow_num, remaining_flow_num;
>  	int cmp;
>  	uint8_t find;
> +	uint32_t start_idx;
> 
>  	/*
>  	 * Don't process the packet whose TCP header length is greater
> @@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
>  	tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
>  	hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> 
> -	/*
> -	 * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
> -	 * or CWR set.
> -	 */
> -	if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> -		return -1;
> -
>  	/* trim the tail padding bytes */
>  	ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
>  	if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
> @@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
>  		if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
>  			if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
>  				find = 1;
> +				start_idx = tbl->flows[i].start_index;
>  				break;
>  			}
>  			remaining_flow_num--;
>  		}
>  	}
> 
> +	if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
> +		/*
> +		 * Check and try merging the current TCP segment with the previous
> +		 * TCP segment if the TCP header does not contain RST and SYN flag
> +		 * There are cases where the last segment is sent with FIN|PSH|ACK
> +		 * which should also be considered for merging with previous segments.
> +		 */
> +		if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
> +			/*
> +			 * Since PSH flag is set, start time will be set to 0 so it will be flushed
> +			 * immediately.
> +			 */
> +			tbl->items[start_idx].start_time = 0;
> +		else
> +			return -1;
> +	}

The nested if-else check is not straightforward, and it's hard to read the condition-action of
different combinations of flag bits. In addition, are all flag bits considered like Linux kernel?
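
One possible way to flatten the check is to classify the flags byte once and act on the result. The sketch below is only an illustration of that idea: `sketch_classify()`, the `sketch_gro_action` enum and the `SKETCH_` constants are hypothetical, and the policy it encodes (only ACK, optionally with FIN and/or PSH, is eligible for merging; every other flag bypasses GRO) is an assumption made for the example, not the behaviour of the posted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard TCP header flag bits (hypothetical local names for this sketch). */
#define SKETCH_TCP_FIN 0x01u
#define SKETCH_TCP_PSH 0x08u
#define SKETCH_TCP_ACK 0x10u

enum sketch_gro_action {
	SKETCH_GRO_BYPASS,      /* return the packet to the caller unprocessed */
	SKETCH_GRO_MERGE,       /* normal coalescing path */
	SKETCH_GRO_MERGE_FLUSH  /* coalesce, then flush the flow immediately */
};

/*
 * Hypothetical classifier: one decision per flag combination.
 * Assumed policy for illustration only:
 *  - any bit other than ACK/FIN/PSH (SYN, RST, URG, ECE, CWR): bypass
 *  - ACK missing: bypass
 *  - exactly ACK: merge as before
 *  - ACK plus FIN and/or PSH: merge and flush if the flow already
 *    exists, otherwise bypass
 */
static inline enum sketch_gro_action
sketch_classify(uint8_t tcp_flags, bool flow_found)
{
	const uint8_t mergeable =
		SKETCH_TCP_ACK | SKETCH_TCP_FIN | SKETCH_TCP_PSH;

	if ((tcp_flags & (uint8_t)~mergeable) != 0)
		return SKETCH_GRO_BYPASS;
	if ((tcp_flags & SKETCH_TCP_ACK) == 0)
		return SKETCH_GRO_BYPASS;
	if (tcp_flags == SKETCH_TCP_ACK)
		return SKETCH_GRO_MERGE;

	return flow_found ? SKETCH_GRO_MERGE_FLUSH : SKETCH_GRO_BYPASS;
}
```

Each flag combination then maps to exactly one action, and the caller can switch on the result instead of nesting conditions.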

> +
>  	/*
>  	 * Fail to find a matched flow. Insert a new flow and store the
>  	 * packet into the flow.
> @@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
>  				is_atomic);
>  		if (cmp) {
>  			if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
> -						pkt, cmp, sent_seq, ip_id, 0))
> +						pkt, cmp, sent_seq, ip_id, 0)) {
> +				if (tbl->items[cur_idx].start_time == 0)
> +					update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
>  				return 1;
> +			}
> +
>  			/*
>  			 * Fail to merge the two packets, as the packet
>  			 * length is greater than the max value. Store
> diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
> index e35399fd42..87c5502dce 100644
> --- a/lib/gro/rte_gro.c
> +++ b/lib/gro/rte_gro.c
> @@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
>  	if ((nb_after_gro < nb_pkts)
>  		 || (unprocess_num < nb_pkts)) {
>  		i = 0;
> +		/* Copy unprocessed packets */
> +		if (unprocess_num > 0) {
> +			memcpy(&pkts[i], unprocess_pkts,
> +					sizeof(struct rte_mbuf *) *
> +					unprocess_num);
> +			i = unprocess_num;
> +		}

Why copy the unprocessed packets first? Is this to avoid out-of-order delivery?

Thanks,
Jiayu
>  		/* Flush all packets from the tables */
>  		if (do_vxlan_tcp_gro) {
> -			i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> -					0, pkts, nb_pkts);
> +			i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> +					0, &pkts[i], nb_pkts - i);
>  		}
> 
>  		if (do_vxlan_udp_gro) {
> @@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
>  			i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
>  					&pkts[i], nb_pkts - i);
>  		}
> -		/* Copy unprocessed packets */
> -		if (unprocess_num > 0) {
> -			memcpy(&pkts[i], unprocess_pkts,
> -					sizeof(struct rte_mbuf *) *
> -					unprocess_num);
> -		}
> -		nb_after_gro = i + unprocess_num;
>  	}
> 
>  	return nb_after_gro;
> --
> 2.25.1



* Re: [PATCH v5] gro : fix reordering of packets in GRO library
  2022-11-01  7:05 ` [PATCH v5] " Kumara Parameshwaran
@ 2023-06-19 13:25   ` Thomas Monjalon
  2023-06-20  7:35   ` Hu, Jiayu
  1 sibling, 0 replies; 10+ messages in thread
From: Thomas Monjalon @ 2023-06-19 13:25 UTC (permalink / raw)
  To: jiayu.hu
  Cc: dev, Kumara Parameshwaran, Kumara Parameshwaran, Kumara Parameshwaran

01/11/2022 08:05, Kumara Parameshwaran:
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> 
> When a TCP packet carries a flag such as PSH, it is returned
> immediately to the application even though packets of the same
> flow may still be in the GRO table. If the PSH flag is set on a
> segment, all packets up to and including that segment should be
> delivered immediately. The current implementation instead delivers
> the last-arrived packet (the one with PSH set) ahead of the packets
> still held in the table, causing reordering.
> 
> With this patch, a packet whose flags are not exactly ACK is
> returned immediately only if there are no previous packets for the
> flow; otherwise it is merged with the previous segment, and the
> flags of the last segment are set on the entire merged segment.
> This matches the behaviour of the Linux stack.
> 
> Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>

This is yourself, right?
Please choose one email address and use it in your git config.

Jiayu, any comment about the patch content?




* [PATCH v5] gro : fix reordering of packets in GRO library
  2022-10-28  9:51 [PATCH v4] " Kumara Parameshwaran
@ 2022-11-01  7:05 ` Kumara Parameshwaran
  2023-06-19 13:25   ` Thomas Monjalon
  2023-06-20  7:35   ` Hu, Jiayu
  0 siblings, 2 replies; 10+ messages in thread
From: Kumara Parameshwaran @ 2022-11-01  7:05 UTC (permalink / raw)
  To: jiayu.hu; +Cc: dev, Kumara Parameshwaran, Kumara Parameshwaran

From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>

When a TCP packet carries a flag such as PSH, it is returned
immediately to the application even though packets of the same
flow may still be in the GRO table. If the PSH flag is set on a
segment, all packets up to and including that segment should be
delivered immediately. The current implementation instead delivers
the last-arrived packet (the one with PSH set) ahead of the packets
still held in the table, causing reordering.

With this patch, a packet whose flags are not exactly ACK is
returned immediately only if there are no previous packets for the
flow; otherwise it is merged with the previous segment, and the
flags of the last segment are set on the entire merged segment.
This matches the behaviour of the Linux stack.

Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
---
v1:
	If the received packet is not a pure ACK packet, we check whether
	there are any previous packets in the flow. If there are, we include
	the received packet in the coalescing logic as well and apply the
	flags of the last received packet to the entire segment, which
	avoids reordering.

	Consider a case where P1(PSH), P2(ACK) and P3(ACK) are received in burst mode.
	P1 has the PSH flag set and, since there are no prior packets in the flow,
	it is copied to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
	In the existing code, the merged P2/P3 segment is delivered first and the
	unprocessed packets are copied afterwards, which causes reordering. With this
	patch, the unprocessed packets are copied first, followed by the packets from
	the GRO table.
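
The ordering change described in the P1/P2/P3 case can be sketched in isolation as follows. This is a simplified stand-in, not the code in rte_gro.c: `sketch_emit_burst()` is a hypothetical function that takes two already-built arrays (packets that bypassed GRO and packets drained from the GRO tables) and writes them to the output array in the order the patch proposes. Packets are represented as opaque pointers so the example does not depend on DPDK headers.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch of the output ordering at the end of a burst.
 * `unprocessed` holds packets that bypassed GRO (e.g. P1 with PSH set),
 * `flushed` holds packets drained from the GRO tables (e.g. merged P2+P3).
 * Unprocessed packets are written first, then the flushed ones.
 * Returns the number of entries written to `out` (at most `out_cap`).
 */
static size_t
sketch_emit_burst(void **out, size_t out_cap,
		void *const *unprocessed, size_t n_unprocessed,
		void *const *flushed, size_t n_flushed)
{
	size_t i = 0;

	/* Copy unprocessed packets first so they keep their arrival order. */
	if (n_unprocessed > out_cap)
		n_unprocessed = out_cap;
	if (n_unprocessed > 0) {
		memcpy(out, unprocessed, n_unprocessed * sizeof(*out));
		i = n_unprocessed;
	}

	/* Then append whatever was flushed from the tables. */
	if (n_flushed > out_cap - i)
		n_flushed = out_cap - i;
	if (n_flushed > 0)
		memcpy(&out[i], flushed, n_flushed * sizeof(*out));

	return i + n_flushed;
}
```

For the example above, calling it with P1 as the only unprocessed packet and the merged P2/P3 segment as the only flushed packet places P1 at index 0 and the merged segment at index 1. The sketch covers only the case described here; it does not claim that this ordering is correct for every mix of bypassed and merged packets.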

	Testing done:
	The csum test-pmd was modified to support the following:
	a GET request of 10MB from client to server via test-pmd (static ARP entries added
	on the client and server). GRO and TSO were enabled in test-pmd, so that packets
	received from the client MAC are sent to the server MAC and vice versa.
	In this test, the client observed reordering of 25 packets without the patch,
	and no packet reordering with the patch.

v2: 
	Fix warnings in commit and comment.
	Do not consider packet as candidate to merge if it contains SYN/RST flag.

v3:
	Fix warnings.

v4:
	Rebase with master.

v5:
	Adding co-author email

 lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
 lib/gro/rte_gro.c  | 18 +++++++++---------
 2 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 0014096e63..7363c5d540 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
 			pkt->l2_len);
 }
 
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+	struct rte_ether_hdr *eth_hdr;
+	struct rte_ipv4_hdr *ipv4_hdr;
+	struct rte_tcp_hdr *merged_tcp_hdr;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
+	ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+	merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+}
+
 int32_t
 gro_tcp4_reassemble(struct rte_mbuf *pkt,
 		struct gro_tcp4_tbl *tbl,
@@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 	uint32_t i, max_flow_num, remaining_flow_num;
 	int cmp;
 	uint8_t find;
+	uint32_t start_idx;
 
 	/*
 	 * Don't process the packet whose TCP header length is greater
@@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 	tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
 	hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
 
-	/*
-	 * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
-	 * or CWR set.
-	 */
-	if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
-		return -1;
-
 	/* trim the tail padding bytes */
 	ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
 	if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 		if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
 			if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
 				find = 1;
+				start_idx = tbl->flows[i].start_index;
 				break;
 			}
 			remaining_flow_num--;
 		}
 	}
 
+	if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
+		/*
+		 * Check and try merging the current TCP segment with the previous
+		 * TCP segment if the TCP header does not contain RST and SYN flag
+		 * There are cases where the last segment is sent with FIN|PSH|ACK
+		 * which should also be considered for merging with previous segments.
+		 */
+		if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
+			/*
+			 * Since PSH flag is set, start time will be set to 0 so it will be flushed
+			 * immediately.
+			 */
+			tbl->items[start_idx].start_time = 0;
+		else
+			return -1;
+	}
+
 	/*
 	 * Fail to find a matched flow. Insert a new flow and store the
 	 * packet into the flow.
@@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 				is_atomic);
 		if (cmp) {
 			if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
-						pkt, cmp, sent_seq, ip_id, 0))
+						pkt, cmp, sent_seq, ip_id, 0)) {
+				if (tbl->items[cur_idx].start_time == 0)
+					update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
 				return 1;
+			}
+
 			/*
 			 * Fail to merge the two packets, as the packet
 			 * length is greater than the max value. Store
diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
index e35399fd42..87c5502dce 100644
--- a/lib/gro/rte_gro.c
+++ b/lib/gro/rte_gro.c
@@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
 	if ((nb_after_gro < nb_pkts)
 		 || (unprocess_num < nb_pkts)) {
 		i = 0;
+		/* Copy unprocessed packets */
+		if (unprocess_num > 0) {
+			memcpy(&pkts[i], unprocess_pkts,
+					sizeof(struct rte_mbuf *) *
+					unprocess_num);
+			i = unprocess_num;
+		}
 		/* Flush all packets from the tables */
 		if (do_vxlan_tcp_gro) {
-			i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
-					0, pkts, nb_pkts);
+			i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
+					0, &pkts[i], nb_pkts - i);
 		}
 
 		if (do_vxlan_udp_gro) {
@@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
 			i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
 					&pkts[i], nb_pkts - i);
 		}
-		/* Copy unprocessed packets */
-		if (unprocess_num > 0) {
-			memcpy(&pkts[i], unprocess_pkts,
-					sizeof(struct rte_mbuf *) *
-					unprocess_num);
-		}
-		nb_after_gro = i + unprocess_num;
 	}
 
 	return nb_after_gro;
-- 
2.25.1



end of thread, other threads:[~2023-06-30 11:33 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-07  8:59 [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets Kumara Parameshwaran
2022-09-07  9:32 ` Kumara Parameshwaran
2022-09-08  6:06   ` Hu, Jiayu
2022-10-05 12:16     ` Thomas Monjalon
2022-11-01  7:03 ` [PATCH v5] gro : fix reordering of packets in GRO library Kumara Parameshwaran
2022-10-28  9:51 [PATCH v4] " Kumara Parameshwaran
2022-11-01  7:05 ` [PATCH v5] " Kumara Parameshwaran
2023-06-19 13:25   ` Thomas Monjalon
2023-06-20  7:35   ` Hu, Jiayu
2023-06-21  8:47     ` kumaraparameshwaran rathinavel
2023-06-30 11:32     ` kumaraparameshwaran rathinavel
