* [PATCH] gro : fix reordering of packets in GRO library
@ 2022-10-13 10:18 Kumara Parameshwaran
2022-10-13 10:20 ` kumaraparameshwaran rathinavel
` (3 more replies)
0 siblings, 4 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2022-10-13 10:18 UTC (permalink / raw)
To: jiayu.hu; +Cc: dev, Kumara Parameshwaran
From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
When a TCP packet contains flags like PSH, it is returned
immediately to the application even though there might be packets of
the same flow in the GRO table. If the PSH flag is set on a segment,
packets up to that segment should be delivered immediately. But the
current implementation can deliver the packet carrying the PSH flag
after segments that arrived later, causing re-ordering.
With this patch, if a packet is not a pure ACK and there are no
previous packets for the flow, the packet is returned immediately;
otherwise it is merged with the previous segments and the flags of
the last segment are applied to the entire merged segment.
This is the behaviour of the Linux stack as well.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
---
v1:
If the received packet is not a pure ACK packet, we check whether
there are any previous packets in the flow; if there are, we include
the received packet in the coalescing logic as well and apply the
flags of the last received packet to the entire merged segment,
which avoids re-ordering.
Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 carries the PSH flag, and since the flow contains no prior packets,
it is copied to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
In the existing code the merged P2,P3 segment is delivered first and the
unprocessed packets are copied afterwards, which causes re-ordering. With
the patch, the unprocessed packets are copied first, followed by the
packets flushed from the GRO table.
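The scenario above can be sketched as a tiny standalone model (plain C, no DPDK types; the function name and integer packet IDs are illustrative stand-ins for mbuf pointers):

```c
#include <string.h>

/*
 * Model of the ordering fix in rte_gro_reassemble_burst(): packets left
 * unprocessed (e.g. a flagged segment with no prior data in its flow)
 * are written to the output array before the coalesced segments flushed
 * from the GRO table, so they are not delivered after later data.
 */
static int gro_output_order(const int *unprocess, int n_unprocess,
			    const int *flushed, int n_flushed, int *out)
{
	int i = 0;

	/* Copy unprocessed packets first (the patched behaviour) */
	memcpy(&out[i], unprocess, sizeof(*out) * n_unprocess);
	i = n_unprocess;
	/* Then append the segments flushed from the GRO table */
	memcpy(&out[i], flushed, sizeof(*out) * n_flushed);
	return i + n_flushed;
}
```

With P1 unprocessed and P2/P3 merged into one flushed segment, the output is P1 followed by the merged segment, matching the order of the original stream.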
Testing done:
The csum test-pmd was modified to support the following:
a 10MB GET request from client to server via test-pmd (static ARP entries
added on client and server). GRO and TSO are enabled in test-pmd, where
packets received from the client MAC are sent to the server MAC and vice
versa.
In the above testing, without the patch the client observed re-ordering
of 25 packets; with the patch no packet re-ordering was observed.
lib/gro/gro_tcp4.c | 35 ++++++++++++++++++++++++++++-------
lib/gro/rte_gro.c | 18 +++++++++---------
2 files changed, 37 insertions(+), 16 deletions(-)
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 8f5e800250..9ed891c253 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
pkt->l2_len);
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_ether_hdr *eth_hdr;
+ struct rte_ipv4_hdr *ipv4_hdr;
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
+ ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+ merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+}
+
int32_t
gro_tcp4_reassemble(struct rte_mbuf *pkt,
struct gro_tcp4_tbl *tbl,
@@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t i, max_flow_num, remaining_flow_num;
int cmp;
uint8_t find;
+ uint32_t start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -219,12 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
/*
* Don't process the packet whose payload length is less than or
* equal to 0.
@@ -263,12 +271,21 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
+ if (find)
+ /* Since PSH flag is set, start time will be set to 0 so it will be flushed immediately */
+ tbl->items[start_idx].start_time = 0;
+ else
+ return -1;
+ }
+
/*
* Fail to find a matched flow. Insert a new flow and store the
* packet into the flow.
@@ -303,8 +320,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, ip_id, 0)) {
+ if (tbl->items[cur_idx].start_time == 0)
+ update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
return 1;
+ }
+
/*
* Fail to merge the two packets, as the packet
* length is greater than the max value. Store
diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
index e35399fd42..87c5502dce 100644
--- a/lib/gro/rte_gro.c
+++ b/lib/gro/rte_gro.c
@@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
if ((nb_after_gro < nb_pkts)
|| (unprocess_num < nb_pkts)) {
i = 0;
+ /* Copy unprocessed packets */
+ if (unprocess_num > 0) {
+ memcpy(&pkts[i], unprocess_pkts,
+ sizeof(struct rte_mbuf *) *
+ unprocess_num);
+ i = unprocess_num;
+ }
/* Flush all packets from the tables */
if (do_vxlan_tcp_gro) {
- i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
- 0, pkts, nb_pkts);
+ i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
+ 0, &pkts[i], nb_pkts - i);
}
if (do_vxlan_udp_gro) {
@@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
&pkts[i], nb_pkts - i);
}
- /* Copy unprocessed packets */
- if (unprocess_num > 0) {
- memcpy(&pkts[i], unprocess_pkts,
- sizeof(struct rte_mbuf *) *
- unprocess_num);
- }
- nb_after_gro = i + unprocess_num;
}
return nb_after_gro;
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
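The update_tcp_hdr_flags() helper in the patch above ORs the TCP flags of the segment being merged into the header of the head segment, so a PSH or FIN on the last packet survives coalescing. A minimal standalone model of that semantics (plain C; the struct is an illustrative stand-in for the real parsed headers, and the flag values match DPDK's RTE_TCP_*_FLAG constants):

```c
#include <stdint.h>

#define TCP_FIN_FLAG 0x01
#define TCP_PSH_FLAG 0x08
#define TCP_ACK_FLAG 0x10

struct tcp_hdr_model {
	uint8_t tcp_flags;
};

/*
 * Model of update_tcp_hdr_flags(): the merged (head) segment inherits
 * every flag set on the segment that was just coalesced into it.
 */
static void merge_tcp_flags(struct tcp_hdr_model *merged_hdr,
			    const struct tcp_hdr_model *incoming_hdr)
{
	merged_hdr->tcp_flags |= incoming_hdr->tcp_flags;
}
```

This is why a FIN|PSH|ACK tail segment ends up marking the whole coalesced packet rather than being lost.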
* Re: [PATCH] gro : fix reordering of packets in GRO library
2022-10-13 10:18 [PATCH] gro : fix reordering of packets in GRO library Kumara Parameshwaran
@ 2022-10-13 10:20 ` kumaraparameshwaran rathinavel
2022-10-28 8:09 ` [PATCH v2] " Kumara Parameshwaran
` (2 subsequent siblings)
3 siblings, 0 replies; 24+ messages in thread
From: kumaraparameshwaran rathinavel @ 2022-10-13 10:20 UTC (permalink / raw)
To: jiayu.hu; +Cc: dev
[-- Attachment #1.1: Type: text/plain, Size: 7722 bytes --]
Please find the attached pcap files for the testing done.
Thanks,
Kumara.
On Thu, Oct 13, 2022 at 3:49 PM Kumara Parameshwaran <kumaraparamesh92@gmail.com> wrote:
[-- Attachment #1.2: Type: text/html, Size: 9654 bytes --]
[-- Attachment #2: file_client_with_patch.pcap --]
[-- Type: application/octet-stream, Size: 291054 bytes --]
[-- Attachment #3: file_client_without_patch.pcap --]
[-- Type: application/octet-stream, Size: 291158 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v2] gro : fix reordering of packets in GRO library
2022-10-13 10:18 [PATCH] gro : fix reordering of packets in GRO library Kumara Parameshwaran
2022-10-13 10:20 ` kumaraparameshwaran rathinavel
@ 2022-10-28 8:09 ` Kumara Parameshwaran
2022-10-28 8:27 ` [PATCH v3] " Kumara Parameshwaran
2022-10-28 9:51 ` [PATCH v4] " Kumara Parameshwaran
3 siblings, 0 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2022-10-28 8:09 UTC (permalink / raw)
To: jiayu.hu; +Cc: dev, Kumara Parameshwaran
From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
When a TCP packet contains flags like PSH, it is returned
immediately to the application even though there might be packets of
the same flow in the GRO table. If the PSH flag is set on a segment,
packets up to that segment should be delivered immediately. But the
current implementation can deliver the packet carrying the PSH flag
after segments that arrived later, causing re-ordering.
With this patch, if a packet is not a pure ACK and there are no
previous packets for the flow, the packet is returned immediately;
otherwise it is merged with the previous segments and the flags of
the last segment are applied to the entire merged segment.
This is the behaviour of the Linux stack as well.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
---
v1:
If the received packet is not a pure ACK packet, we check whether
there are any previous packets in the flow; if there are, we include
the received packet in the coalescing logic as well and apply the
flags of the last received packet to the entire merged segment,
which avoids re-ordering.
Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 carries the PSH flag, and since the flow contains no prior packets,
it is copied to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
In the existing code the merged P2,P3 segment is delivered first and the
unprocessed packets are copied afterwards, which causes re-ordering. With
the patch, the unprocessed packets are copied first, followed by the
packets flushed from the GRO table.
Testing done:
The csum test-pmd was modified to support the following:
a 10MB GET request from client to server via test-pmd (static ARP entries
added on client and server). GRO and TSO are enabled in test-pmd, where
packets received from the client MAC are sent to the server MAC and vice
versa.
In the above testing, without the patch the client observed re-ordering
of 25 packets; with the patch no packet re-ordering was observed.
v2:
* Fix warnings in the commit message and comments
* Do not consider a packet as a merge candidate if it carries the SYN or RST flag
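The v2 rule — a segment that is not a pure ACK may still be merged only when the flow already has packets queued and the segment carries neither SYN nor RST — can be written as a small predicate (plain C; the function name is illustrative, and the flag values match DPDK's RTE_TCP_*_FLAG constants):

```c
#include <stdbool.h>
#include <stdint.h>

#define TCP_FIN_FLAG 0x01
#define TCP_SYN_FLAG 0x02
#define TCP_RST_FLAG 0x04
#define TCP_PSH_FLAG 0x08
#define TCP_ACK_FLAG 0x10

/*
 * Sketch of the v2 merge decision: a pure ACK always goes through the
 * normal GRO path; any other flag combination is merged only if the
 * flow already exists and neither SYN nor RST is set.
 */
static bool can_merge_flagged_segment(uint8_t tcp_flags, bool flow_found)
{
	if (tcp_flags == TCP_ACK_FLAG)
		return true;
	if (!flow_found)
		return false;
	return !(tcp_flags & (TCP_RST_FLAG | TCP_SYN_FLAG));
}
```

This also admits the FIN|PSH|ACK last-segment case that the patch comment calls out.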
lib/gro/gro_tcp4.c | 43 ++++++++++++++++++++++++++++++++++++-------
lib/gro/rte_gro.c | 18 +++++++++---------
2 files changed, 45 insertions(+), 16 deletions(-)
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 8f5e800250..c4fa8ff226 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
pkt->l2_len);
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_ether_hdr *eth_hdr;
+ struct rte_ipv4_hdr *ipv4_hdr;
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
+ ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+ merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+}
+
int32_t
gro_tcp4_reassemble(struct rte_mbuf *pkt,
struct gro_tcp4_tbl *tbl,
@@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t i, max_flow_num, remaining_flow_num;
int cmp;
uint8_t find;
+ uint32_t start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -219,12 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
/*
* Don't process the packet whose payload length is less than or
* equal to 0.
@@ -263,12 +271,29 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
+ /*
+ * Check and try merging the current TCP segment with the previous
+ * TCP segment if the TCP header does not contain RST and SYN flag
+ * There are cases where the last segment is sent with FIN|PSH|ACK
+ * which should also be considered for merging with previous segments.
+ */
+ if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
+ /*
+ * Since PSH flag is set, start time will be set to 0 so it will be flushed immediately
+ */
+ tbl->items[start_idx].start_time = 0;
+ else
+ return -1;
+ }
+
/*
* Fail to find a matched flow. Insert a new flow and store the
* packet into the flow.
@@ -303,8 +328,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, ip_id, 0)) {
+ if (tbl->items[cur_idx].start_time == 0)
+ update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
return 1;
+ }
+
/*
* Fail to merge the two packets, as the packet
* length is greater than the max value. Store
diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
index e35399fd42..87c5502dce 100644
--- a/lib/gro/rte_gro.c
+++ b/lib/gro/rte_gro.c
@@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
if ((nb_after_gro < nb_pkts)
|| (unprocess_num < nb_pkts)) {
i = 0;
+ /* Copy unprocessed packets */
+ if (unprocess_num > 0) {
+ memcpy(&pkts[i], unprocess_pkts,
+ sizeof(struct rte_mbuf *) *
+ unprocess_num);
+ i = unprocess_num;
+ }
/* Flush all packets from the tables */
if (do_vxlan_tcp_gro) {
- i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
- 0, pkts, nb_pkts);
+ i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
+ 0, &pkts[i], nb_pkts - i);
}
if (do_vxlan_udp_gro) {
@@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
&pkts[i], nb_pkts - i);
}
- /* Copy unprocessed packets */
- if (unprocess_num > 0) {
- memcpy(&pkts[i], unprocess_pkts,
- sizeof(struct rte_mbuf *) *
- unprocess_num);
- }
- nb_after_gro = i + unprocess_num;
}
return nb_after_gro;
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v3] gro : fix reordering of packets in GRO library
2022-10-13 10:18 [PATCH] gro : fix reordering of packets in GRO library Kumara Parameshwaran
2022-10-13 10:20 ` kumaraparameshwaran rathinavel
2022-10-28 8:09 ` [PATCH v2] " Kumara Parameshwaran
@ 2022-10-28 8:27 ` Kumara Parameshwaran
2022-10-28 9:51 ` [PATCH v4] " Kumara Parameshwaran
3 siblings, 0 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2022-10-28 8:27 UTC (permalink / raw)
To: jiayu.hu; +Cc: dev, Kumara Parameshwaran
From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
When a TCP packet contains flags like PSH, it is returned
immediately to the application even though there might be packets of
the same flow in the GRO table. If the PSH flag is set on a segment,
packets up to that segment should be delivered immediately. But the
current implementation can deliver the packet carrying the PSH flag
after segments that arrived later, causing re-ordering.
With this patch, if a packet is not a pure ACK and there are no
previous packets for the flow, the packet is returned immediately;
otherwise it is merged with the previous segments and the flags of
the last segment are applied to the entire merged segment.
This is the behaviour of the Linux stack as well.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
---
v1:
If the received packet is not a pure ACK packet, we check whether
there are any previous packets in the flow; if there are, we include
the received packet in the coalescing logic as well and apply the
flags of the last received packet to the entire merged segment,
which avoids re-ordering.
Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 carries the PSH flag, and since the flow contains no prior packets,
it is copied to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
In the existing code the merged P2,P3 segment is delivered first and the
unprocessed packets are copied afterwards, which causes re-ordering. With
the patch, the unprocessed packets are copied first, followed by the
packets flushed from the GRO table.
Testing done:
The csum test-pmd was modified to support the following:
a 10MB GET request from client to server via test-pmd (static ARP entries
added on client and server). GRO and TSO are enabled in test-pmd, where
packets received from the client MAC are sent to the server MAC and vice
versa.
In the above testing, without the patch the client observed re-ordering
of 25 packets; with the patch no packet re-ordering was observed.
v2:
Fix warnings in the commit message and comments.
Do not consider a packet as a merge candidate if it carries the SYN or RST flag.
v3:
Fix warnings.
lib/gro/gro_tcp4.c | 44 +++++++++++++++++++++++++++++++++++++-------
lib/gro/rte_gro.c | 18 +++++++++---------
2 files changed, 46 insertions(+), 16 deletions(-)
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 8f5e800250..2ce0c1391c 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
pkt->l2_len);
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_ether_hdr *eth_hdr;
+ struct rte_ipv4_hdr *ipv4_hdr;
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
+ ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+ merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+}
+
int32_t
gro_tcp4_reassemble(struct rte_mbuf *pkt,
struct gro_tcp4_tbl *tbl,
@@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t i, max_flow_num, remaining_flow_num;
int cmp;
uint8_t find;
+ uint32_t start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -219,12 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
/*
* Don't process the packet whose payload length is less than or
* equal to 0.
@@ -263,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
+ /*
+ * Check and try merging the current TCP segment with the previous
+ * TCP segment if the TCP header does not contain RST and SYN flag
+ * There are cases where the last segment is sent with FIN|PSH|ACK
+ * which should also be considered for merging with previous segments.
+ */
+ if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
+ /*
+ * Since PSH flag is set, start time will be set to 0 so it will be flushed
+ * immediately.
+ */
+ tbl->items[start_idx].start_time = 0;
+ else
+ return -1;
+ }
+
/*
* Fail to find a matched flow. Insert a new flow and store the
* packet into the flow.
@@ -303,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, ip_id, 0)) {
+ if (tbl->items[cur_idx].start_time == 0)
+ update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
return 1;
+ }
+
/*
* Fail to merge the two packets, as the packet
* length is greater than the max value. Store
diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
index e35399fd42..87c5502dce 100644
--- a/lib/gro/rte_gro.c
+++ b/lib/gro/rte_gro.c
@@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
if ((nb_after_gro < nb_pkts)
|| (unprocess_num < nb_pkts)) {
i = 0;
+ /* Copy unprocessed packets */
+ if (unprocess_num > 0) {
+ memcpy(&pkts[i], unprocess_pkts,
+ sizeof(struct rte_mbuf *) *
+ unprocess_num);
+ i = unprocess_num;
+ }
/* Flush all packets from the tables */
if (do_vxlan_tcp_gro) {
- i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
- 0, pkts, nb_pkts);
+ i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
+ 0, &pkts[i], nb_pkts - i);
}
if (do_vxlan_udp_gro) {
@@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
&pkts[i], nb_pkts - i);
}
- /* Copy unprocessed packets */
- if (unprocess_num > 0) {
- memcpy(&pkts[i], unprocess_pkts,
- sizeof(struct rte_mbuf *) *
- unprocess_num);
- }
- nb_after_gro = i + unprocess_num;
}
return nb_after_gro;
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v4] gro : fix reordering of packets in GRO library
2022-10-13 10:18 [PATCH] gro : fix reordering of packets in GRO library Kumara Parameshwaran
` (2 preceding siblings ...)
2022-10-28 8:27 ` [PATCH v3] " Kumara Parameshwaran
@ 2022-10-28 9:51 ` Kumara Parameshwaran
2022-11-01 7:05 ` [PATCH v5] " Kumara Parameshwaran
3 siblings, 1 reply; 24+ messages in thread
From: Kumara Parameshwaran @ 2022-10-28 9:51 UTC (permalink / raw)
To: jiayu.hu; +Cc: dev, Kumara Parameshwaran
From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
When a TCP packet contains flags like PSH, it is returned
immediately to the application even though there might be packets of
the same flow in the GRO table. If the PSH flag is set on a segment,
packets up to that segment should be delivered immediately. But the
current implementation can deliver the packet carrying the PSH flag
after segments that arrived later, causing re-ordering.
With this patch, if a packet is not a pure ACK and there are no
previous packets for the flow, the packet is returned immediately;
otherwise it is merged with the previous segments and the flags of
the last segment are applied to the entire merged segment.
This is the behaviour of the Linux stack as well.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
---
v1:
If the received packet is not a pure ACK packet, we check whether
there are any previous packets in the flow; if there are, we include
the received packet in the coalescing logic as well and apply the
flags of the last received packet to the entire merged segment,
which avoids re-ordering.
Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 carries the PSH flag, and since the flow contains no prior packets,
it is copied to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
In the existing code the merged P2,P3 segment is delivered first and the
unprocessed packets are copied afterwards, which causes re-ordering. With
the patch, the unprocessed packets are copied first, followed by the
packets flushed from the GRO table.
Testing done:
The csum test-pmd was modified to support the following:
a 10MB GET request from client to server via test-pmd (static ARP entries
added on client and server). GRO and TSO are enabled in test-pmd, where
packets received from the client MAC are sent to the server MAC and vice
versa.
In the above testing, without the patch the client observed re-ordering
of 25 packets; with the patch no packet re-ordering was observed.
v2:
Fix warnings in the commit message and comments.
Do not consider a packet as a merge candidate if it carries the SYN or RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
lib/gro/rte_gro.c | 18 +++++++++---------
2 files changed, 46 insertions(+), 17 deletions(-)
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 0014096e63..7363c5d540 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
pkt->l2_len);
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_ether_hdr *eth_hdr;
+ struct rte_ipv4_hdr *ipv4_hdr;
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
+ ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+ merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+}
+
int32_t
gro_tcp4_reassemble(struct rte_mbuf *pkt,
struct gro_tcp4_tbl *tbl,
@@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t i, max_flow_num, remaining_flow_num;
int cmp;
uint8_t find;
+ uint32_t start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
-
/* trim the tail padding bytes */
ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
+ /*
+ * Check and try merging the current TCP segment with the previous
+ * TCP segment if the TCP header does not contain RST and SYN flag
+ * There are cases where the last segment is sent with FIN|PSH|ACK
+ * which should also be considered for merging with previous segments.
+ */
+ if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
+ /*
+ * Since PSH flag is set, start time will be set to 0 so it will be flushed
+ * immediately.
+ */
+ tbl->items[start_idx].start_time = 0;
+ else
+ return -1;
+ }
+
/*
* Fail to find a matched flow. Insert a new flow and store the
* packet into the flow.
@@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, ip_id, 0)) {
+ if (tbl->items[cur_idx].start_time == 0)
+ update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
return 1;
+ }
+
/*
* Fail to merge the two packets, as the packet
* length is greater than the max value. Store
diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
index e35399fd42..87c5502dce 100644
--- a/lib/gro/rte_gro.c
+++ b/lib/gro/rte_gro.c
@@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
if ((nb_after_gro < nb_pkts)
|| (unprocess_num < nb_pkts)) {
i = 0;
+ /* Copy unprocessed packets */
+ if (unprocess_num > 0) {
+ memcpy(&pkts[i], unprocess_pkts,
+ sizeof(struct rte_mbuf *) *
+ unprocess_num);
+ i = unprocess_num;
+ }
/* Flush all packets from the tables */
if (do_vxlan_tcp_gro) {
- i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
- 0, pkts, nb_pkts);
+ i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
+ 0, &pkts[i], nb_pkts - i);
}
if (do_vxlan_udp_gro) {
@@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
&pkts[i], nb_pkts - i);
}
- /* Copy unprocessed packets */
- if (unprocess_num > 0) {
- memcpy(&pkts[i], unprocess_pkts,
- sizeof(struct rte_mbuf *) *
- unprocess_num);
- }
- nb_after_gro = i + unprocess_num;
}
return nb_after_gro;
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v5] gro : fix reordering of packets in GRO library
2022-10-28 9:51 ` [PATCH v4] " Kumara Parameshwaran
@ 2022-11-01 7:05 ` Kumara Parameshwaran
2023-06-19 13:25 ` Thomas Monjalon
` (5 more replies)
0 siblings, 6 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2022-11-01 7:05 UTC (permalink / raw)
To: jiayu.hu; +Cc: dev, Kumara Parameshwaran, Kumara Parameshwaran
From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
When a TCP packet contains flags like PSH, it is returned
immediately to the application even though there might be packets of
the same flow in the GRO table. If the PSH flag is set on a segment,
packets up to and including that segment should be delivered
immediately. But the current implementation delivers the last arrived
packet with the PSH flag set first, causing re-ordering.
With this patch, if a packet carries flags other than just ACK and
there are no previous packets for the flow, the packet is returned
immediately; otherwise it is merged with the previous segment and the
flags of the last segment are applied to the entire merged segment.
This matches the behaviour of the Linux stack.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
---
v1:
If the received packet is not a pure ACK packet, we check if
there are any previous packets in the flow; if present, we include
the received packet in the coalescing logic as well and apply the flags
of the last received packet to the entire segment, which avoids
re-ordering.
Let's take a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 contains the PSH flag and, since there are no prior packets in the flow,
it is copied to unprocess_pkts, while P2(ACK) and P3(ACK) are merged together.
In the existing code, P2 and P3 would be delivered as a single segment first
and the unprocessed packets copied later, which causes re-ordering. With the
patch, the unprocessed packets are copied first, followed by the packets from
the GRO table.
Testing done:
The csum test-pmd was modified to support the following:
GET request of 10MB from client to server via test-pmd (static ARP entries
added on client and server). Enable GRO and TSO in test-pmd, where packets
received from the client MAC are sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of
25 packets, and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider a packet as a merge candidate if it contains the SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
lib/gro/rte_gro.c | 18 +++++++++---------
2 files changed, 46 insertions(+), 17 deletions(-)
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 0014096e63..7363c5d540 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
pkt->l2_len);
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_ether_hdr *eth_hdr;
+ struct rte_ipv4_hdr *ipv4_hdr;
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
+ ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+ merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+}
+
int32_t
gro_tcp4_reassemble(struct rte_mbuf *pkt,
struct gro_tcp4_tbl *tbl,
@@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t i, max_flow_num, remaining_flow_num;
int cmp;
uint8_t find;
+ uint32_t start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
-
/* trim the tail padding bytes */
ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
+ /*
+ * Check and try merging the current TCP segment with the previous
+ * TCP segment if the TCP header does not contain RST and SYN flag
+ * There are cases where the last segment is sent with FIN|PSH|ACK
+ * which should also be considered for merging with previous segments.
+ */
+ if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
+ /*
+ * Since PSH flag is set, start time will be set to 0 so it will be flushed
+ * immediately.
+ */
+ tbl->items[start_idx].start_time = 0;
+ else
+ return -1;
+ }
+
/*
* Fail to find a matched flow. Insert a new flow and store the
* packet into the flow.
@@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, ip_id, 0)) {
+ if (tbl->items[cur_idx].start_time == 0)
+ update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
return 1;
+ }
+
/*
* Fail to merge the two packets, as the packet
* length is greater than the max value. Store
diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
index e35399fd42..87c5502dce 100644
--- a/lib/gro/rte_gro.c
+++ b/lib/gro/rte_gro.c
@@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
if ((nb_after_gro < nb_pkts)
|| (unprocess_num < nb_pkts)) {
i = 0;
+ /* Copy unprocessed packets */
+ if (unprocess_num > 0) {
+ memcpy(&pkts[i], unprocess_pkts,
+ sizeof(struct rte_mbuf *) *
+ unprocess_num);
+ i = unprocess_num;
+ }
/* Flush all packets from the tables */
if (do_vxlan_tcp_gro) {
- i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
- 0, pkts, nb_pkts);
+ i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
+ 0, &pkts[i], nb_pkts - i);
}
if (do_vxlan_udp_gro) {
@@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
&pkts[i], nb_pkts - i);
}
- /* Copy unprocessed packets */
- if (unprocess_num > 0) {
- memcpy(&pkts[i], unprocess_pkts,
- sizeof(struct rte_mbuf *) *
- unprocess_num);
- }
- nb_after_gro = i + unprocess_num;
}
return nb_after_gro;
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5] gro : fix reordering of packets in GRO library
2022-11-01 7:05 ` [PATCH v5] " Kumara Parameshwaran
@ 2023-06-19 13:25 ` Thomas Monjalon
2023-06-20 7:35 ` Hu, Jiayu
` (4 subsequent siblings)
5 siblings, 0 replies; 24+ messages in thread
From: Thomas Monjalon @ 2023-06-19 13:25 UTC (permalink / raw)
To: jiayu.hu
Cc: dev, Kumara Parameshwaran, Kumara Parameshwaran, Kumara Parameshwaran
01/11/2022 08:05, Kumara Parameshwaran:
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
>
> When a TCP packet contains flags like PSH it is returned
> immediately to the application though there might be packets of
> the same flow in the GRO table. If PSH flag is set on a segment
> packets up to the segment should be delivered immediately. But the
> current implementation delivers the last arrived packet with PSH flag
> set causing re-ordering
>
> With this patch, if a packet does not contain only ACK flag and if
> there are no previous packets for the flow the packet would be returned
> immediately, else will be merged with the previous segment and the
> flag on the last segment will be set on the entire segment.
> This is the behaviour with linux stack as well.
>
> Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
This is yourself, right?
Please choose one email address and use it in your git config.
Jiayu, any comment about the patch content?
^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: [PATCH v5] gro : fix reordering of packets in GRO library
2022-11-01 7:05 ` [PATCH v5] " Kumara Parameshwaran
2023-06-19 13:25 ` Thomas Monjalon
@ 2023-06-20 7:35 ` Hu, Jiayu
2023-06-21 8:47 ` kumaraparameshwaran rathinavel
2023-06-30 11:32 ` kumaraparameshwaran rathinavel
2023-12-08 17:54 ` [PATCH v6] gro: fix reordering of packets in GRO layer Kumara Parameshwaran
` (3 subsequent siblings)
5 siblings, 2 replies; 24+ messages in thread
From: Hu, Jiayu @ 2023-06-20 7:35 UTC (permalink / raw)
To: Kumara Parameshwaran; +Cc: dev, Kumara Parameshwaran, thomas
Hi Kumara,
Please see replies inline.
Thanks,
Jiayu
> -----Original Message-----
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> Sent: Tuesday, November 1, 2022 3:06 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Kumara Parameshwaran
> <kumaraparamesh92@gmail.com>; Kumara Parameshwaran
> <kparameshwar@vmware.com>
> Subject: [PATCH v5] gro : fix reordering of packets in GRO library
>
> From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
>
> When a TCP packet contains flags like PSH it is returned immediately to the
> application though there might be packets of the same flow in the GRO table.
> If PSH flag is set on a segment packets up to the segment should be delivered
> immediately. But the current implementation delivers the last arrived packet
> with PSH flag set causing re-ordering
>
> With this patch, if a packet does not contain only ACK flag and if there are no
> previous packets for the flow the packet would be returned immediately,
> else will be merged with the previous segment and the flag on the last
> segment will be set on the entire segment.
> This is the behaviour with linux stack as well.
>
> Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
> ---
> v1:
> If the received packet is not a pure ACK packet, we check if
> there are any previous packets in the flow, if present we indulge
> the received packet also in the coalescing logic and update the flags
> of the last recived packet to the entire segment which would avoid
> re-ordering.
>
> Lets say a case where P1(PSH), P2(ACK), P3(ACK) are received in
> burst mode,
> P1 contains PSH flag and since it does not contain any prior packets in
> the flow
> we copy it to unprocess_packets and P2(ACK) and P3(ACK) are
> merged together.
> In the existing case the P2,P3 would be delivered as single segment
> first and the
> unprocess_packets will be copied later which will cause reordering.
> With the patch
> copy the unprocess packets first and then the packets from the GRO
> table.
>
> Testing done
> The csum test-pmd was modifited to support the following
> GET request of 10MB from client to server via test-pmd (static arp
> entries added in client
> and server). Enable GRO and TSO in test-pmd where the packets
> recived from the client mac
> would be sent to server mac and vice versa.
> In above testing, without the patch the client observerd re-ordering
> of 25 packets
> and with the patch there were no packet re-ordering observerd.
>
> v2:
> Fix warnings in commit and comment.
> Do not consider packet as candidate to merge if it contains SYN/RST
> flag.
>
> v3:
> Fix warnings.
>
> v4:
> Rebase with master.
>
> v5:
> Adding co-author email
>
> lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
> lib/gro/rte_gro.c | 18 +++++++++---------
> 2 files changed, 46 insertions(+), 17 deletions(-)
>
> diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c index
> 0014096e63..7363c5d540 100644
> --- a/lib/gro/gro_tcp4.c
> +++ b/lib/gro/gro_tcp4.c
> @@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
> pkt->l2_len);
> }
>
> +static inline void
> +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
> +{
> + struct rte_ether_hdr *eth_hdr;
> + struct rte_ipv4_hdr *ipv4_hdr;
> + struct rte_tcp_hdr *merged_tcp_hdr;
> +
> + eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
> + ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> + merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt-
> >l3_len);
> + merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags; }
The Linux kernel updates the TCP flags via "tcp_flag_word(th2) |= flags & (TCP_FLAG_FIN | TCP_FLAG_PSH)",
which adds at most FIN and PSH to the merged packet.
> +
> int32_t
> gro_tcp4_reassemble(struct rte_mbuf *pkt,
> struct gro_tcp4_tbl *tbl,
> @@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> uint32_t i, max_flow_num, remaining_flow_num;
> int cmp;
> uint8_t find;
> + uint32_t start_idx;
>
> /*
> * Don't process the packet whose TCP header length is greater @@ -
> 219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>
> - /*
> - * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
> - * or CWR set.
> - */
> - if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> - return -1;
> -
> /* trim the tail padding bytes */
> ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
> if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len)) @@ -264,12
> +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
> if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
> find = 1;
> + start_idx = tbl->flows[i].start_index;
> break;
> }
> remaining_flow_num--;
> }
> }
>
> + if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
> + /*
> + * Check and try merging the current TCP segment with the
> previous
> + * TCP segment if the TCP header does not contain RST and
> SYN flag
> + * There are cases where the last segment is sent with
> FIN|PSH|ACK
> + * which should also be considered for merging with previous
> segments.
> + */
> + if (find && !(tcp_hdr->tcp_flags &
> (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
> + /*
> + * Since PSH flag is set, start time will be set to 0 so it
> will be flushed
> + * immediately.
> + */
> + tbl->items[start_idx].start_time = 0;
> + else
> + return -1;
> + }
The nested if-else check is not straightforward, and it is hard to read the
condition-action mapping for the different combinations of flag bits. In
addition, are all flag bits considered, as in the Linux kernel?
> +
> /*
> * Fail to find a matched flow. Insert a new flow and store the
> * packet into the flow.
> @@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> is_atomic);
> if (cmp) {
> if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
> - pkt, cmp, sent_seq, ip_id, 0))
> + pkt, cmp, sent_seq, ip_id, 0))
> {
> + if (tbl->items[cur_idx].start_time == 0)
> + update_tcp_hdr_flags(tcp_hdr, tbl-
> >items[cur_idx].firstseg);
> return 1;
> + }
> +
> /*
> * Fail to merge the two packets, as the packet
> * length is greater than the max value. Store diff --git
> a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c index e35399fd42..87c5502dce
> 100644
> --- a/lib/gro/rte_gro.c
> +++ b/lib/gro/rte_gro.c
> @@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> if ((nb_after_gro < nb_pkts)
> || (unprocess_num < nb_pkts)) {
> i = 0;
> + /* Copy unprocessed packets */
> + if (unprocess_num > 0) {
> + memcpy(&pkts[i], unprocess_pkts,
> + sizeof(struct rte_mbuf *) *
> + unprocess_num);
> + i = unprocess_num;
> + }
Why copy the unprocessed packets first? Is this to avoid out-of-order delivery?
Thanks,
Jiayu
> /* Flush all packets from the tables */
> if (do_vxlan_tcp_gro) {
> - i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> - 0, pkts, nb_pkts);
> + i +=
> gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> + 0, &pkts[i], nb_pkts - i);
> }
>
> if (do_vxlan_udp_gro) {
> @@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
> &pkts[i], nb_pkts - i);
> }
> - /* Copy unprocessed packets */
> - if (unprocess_num > 0) {
> - memcpy(&pkts[i], unprocess_pkts,
> - sizeof(struct rte_mbuf *) *
> - unprocess_num);
> - }
> - nb_after_gro = i + unprocess_num;
> }
>
> return nb_after_gro;
> --
> 2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5] gro : fix reordering of packets in GRO library
2023-06-20 7:35 ` Hu, Jiayu
@ 2023-06-21 8:47 ` kumaraparameshwaran rathinavel
2023-06-30 11:32 ` kumaraparameshwaran rathinavel
1 sibling, 0 replies; 24+ messages in thread
From: kumaraparameshwaran rathinavel @ 2023-06-21 8:47 UTC (permalink / raw)
To: Hu, Jiayu; +Cc: dev, Kumara Parameshwaran, thomas
Hi Jiayu,
Thanks for the comments. I have replied inline below. Will address the
review comments, but I think this would be a good patch to have in general.
Please let me know your thoughts.
On Tue, Jun 20, 2023 at 1:06 PM Hu, Jiayu <jiayu.hu@intel.com> wrote:
> Hi Kumara,
>
> Please see replies inline.
>
> Thanks,
> Jiayu
>
> > -----Original Message-----
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Sent: Tuesday, November 1, 2022 3:06 PM
> > To: Hu, Jiayu <jiayu.hu@intel.com>
> > Cc: dev@dpdk.org; Kumara Parameshwaran
> > <kumaraparamesh92@gmail.com>; Kumara Parameshwaran
> > <kparameshwar@vmware.com>
> > Subject: [PATCH v5] gro : fix reordering of packets in GRO library
> >
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> >
> > When a TCP packet contains flags like PSH it is returned immediately to
> the
> > application though there might be packets of the same flow in the GRO
> table.
> > If PSH flag is set on a segment packets up to the segment should be
> delivered
> > immediately. But the current implementation delivers the last arrived
> packet
> > with PSH flag set causing re-ordering
> >
> > With this patch, if a packet does not contain only ACK flag and if there
> are no
> > previous packets for the flow the packet would be returned immediately,
> > else will be merged with the previous segment and the flag on the last
> > segment will be set on the entire segment.
> > This is the behaviour with linux stack as well.
> >
> > Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
> > ---
> > v1:
> > If the received packet is not a pure ACK packet, we check if
> > there are any previous packets in the flow, if present we indulge
> > the received packet also in the coalescing logic and update the
> flags
> > of the last recived packet to the entire segment which would avoid
> > re-ordering.
> >
> > Lets say a case where P1(PSH), P2(ACK), P3(ACK) are received in
> > burst mode,
> > P1 contains PSH flag and since it does not contain any prior
> packets in
> > the flow
> > we copy it to unprocess_packets and P2(ACK) and P3(ACK) are
> > merged together.
> > In the existing case the P2,P3 would be delivered as single
> segment
> > first and the
> > unprocess_packets will be copied later which will cause reordering.
> > With the patch
> > copy the unprocess packets first and then the packets from the GRO
> > table.
> >
> > Testing done
> > The csum test-pmd was modifited to support the following
> > GET request of 10MB from client to server via test-pmd (static arp
> > entries added in client
> > and server). Enable GRO and TSO in test-pmd where the packets
> > recived from the client mac
> > would be sent to server mac and vice versa.
> > In above testing, without the patch the client observerd
> re-ordering
> > of 25 packets
> > and with the patch there were no packet re-ordering observerd.
> >
> > v2:
> > Fix warnings in commit and comment.
> > Do not consider packet as candidate to merge if it contains SYN/RST
> > flag.
> >
> > v3:
> > Fix warnings.
> >
> > v4:
> > Rebase with master.
> >
> > v5:
> > Adding co-author email
> >
> > lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
> > lib/gro/rte_gro.c | 18 +++++++++---------
> > 2 files changed, 46 insertions(+), 17 deletions(-)
> >
> > diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c index
> > 0014096e63..7363c5d540 100644
> > --- a/lib/gro/gro_tcp4.c
> > +++ b/lib/gro/gro_tcp4.c
> > @@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
> > pkt->l2_len);
> > }
> >
> > +static inline void
> > +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
> > +{
> > + struct rte_ether_hdr *eth_hdr;
> > + struct rte_ipv4_hdr *ipv4_hdr;
> > + struct rte_tcp_hdr *merged_tcp_hdr;
> > +
> > + eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
> > + ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> > + merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt-
> > >l3_len);
> > + merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags; }
>
> The Linux kernel updates the TCP flag via "tcp_flag_word(th2) |= flags &
> (TCP_FLAG_FIN | TCP_FLAG_PSH)",
> which only adds FIN and PSH at most to the merge packet.
>
>> Sure, will change it to add only FIN and PSH.
>>
>
> > +
> > int32_t
> > gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > struct gro_tcp4_tbl *tbl,
> > @@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > uint32_t i, max_flow_num, remaining_flow_num;
> > int cmp;
> > uint8_t find;
> > + uint32_t start_idx;
> >
> > /*
> > * Don't process the packet whose TCP header length is greater @@ -
> > 219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> >
> > - /*
> > - * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
> > - * or CWR set.
> > - */
> > - if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> > - return -1;
> > -
> > /* trim the tail padding bytes */
> > ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
> > if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len)) @@ -264,12
> > +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
> > if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
> > find = 1;
> > + start_idx = tbl->flows[i].start_index;
> > break;
> > }
> > remaining_flow_num--;
> > }
> > }
> >
> > + if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
> > + /*
> > + * Check and try merging the current TCP segment with the
> > previous
> > + * TCP segment if the TCP header does not contain RST and
> > SYN flag
> > + * There are cases where the last segment is sent with
> > FIN|PSH|ACK
> > + * which should also be considered for merging with
> previous
> > segments.
> > + */
> > + if (find && !(tcp_hdr->tcp_flags &
> > (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
> > + /*
> > + * Since PSH flag is set, start time will be set
> to 0 so it
> > will be flushed
> > + * immediately.
> > + */
> > + tbl->items[start_idx].start_time = 0;
> > + else
> > + return -1;
> > + }
>
> The nested if-else check is not straightforward, and it's hard to read the
> condition-action of
> different combinations of flag bits. In addition, are all flag bits
> considered like Linux kernel?
>
>> Not all flag bits are considered; will fix that, and will try to
>> reorganise the code to be more readable. Let me know if you have
>> any other suggestions here.
>>
>
> > +
> > /*
> > * Fail to find a matched flow. Insert a new flow and store the
> > * packet into the flow.
> > @@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > is_atomic);
> > if (cmp) {
> > if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
> > - pkt, cmp, sent_seq, ip_id,
> 0))
> > + pkt, cmp, sent_seq, ip_id,
> 0))
> > {
> > + if (tbl->items[cur_idx].start_time == 0)
> > + update_tcp_hdr_flags(tcp_hdr, tbl-
> > >items[cur_idx].firstseg);
> > return 1;
> > + }
> > +
> > /*
> > * Fail to merge the two packets, as the packet
> > * length is greater than the max value. Store
> diff --git
> > a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c index e35399fd42..87c5502dce
> > 100644
> > --- a/lib/gro/rte_gro.c
> > +++ b/lib/gro/rte_gro.c
> > @@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> > if ((nb_after_gro < nb_pkts)
> > || (unprocess_num < nb_pkts)) {
> > i = 0;
> > + /* Copy unprocessed packets */
> > + if (unprocess_num > 0) {
> > + memcpy(&pkts[i], unprocess_pkts,
> > + sizeof(struct rte_mbuf *) *
> > + unprocess_num);
> > + i = unprocess_num;
> > + }
>
> Why copy unprocess pkts first? This is for avoiding out-of-order?
>
> Thanks,
> Jiayu
> > /* Flush all packets from the tables */
> > if (do_vxlan_tcp_gro) {
> > - i =
> gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > - 0, pkts, nb_pkts);
> > + i +=
> > gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > + 0, &pkts[i], nb_pkts - i);
> > }
> >
> > if (do_vxlan_udp_gro) {
> > @@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> > i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
> > &pkts[i], nb_pkts - i);
> > }
> > - /* Copy unprocessed packets */
> > - if (unprocess_num > 0) {
> > - memcpy(&pkts[i], unprocess_pkts,
> > - sizeof(struct rte_mbuf *) *
> > - unprocess_num);
> > - }
> > - nb_after_gro = i + unprocess_num;
> > }
> >
> > return nb_after_gro;
> > --
> > 2.25.1
>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5] gro : fix reordering of packets in GRO library
2023-06-20 7:35 ` Hu, Jiayu
2023-06-21 8:47 ` kumaraparameshwaran rathinavel
@ 2023-06-30 11:32 ` kumaraparameshwaran rathinavel
1 sibling, 0 replies; 24+ messages in thread
From: kumaraparameshwaran rathinavel @ 2023-06-30 11:32 UTC (permalink / raw)
To: Hu, Jiayu; +Cc: dev, Kumara Parameshwaran, thomas
On Tue, Jun 20, 2023 at 1:06 PM Hu, Jiayu <jiayu.hu@intel.com> wrote:
> Hi Kumara,
>
> Please see replies inline.
>
> Thanks,
> Jiayu
>
> > -----Original Message-----
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Sent: Tuesday, November 1, 2022 3:06 PM
> > To: Hu, Jiayu <jiayu.hu@intel.com>
> > Cc: dev@dpdk.org; Kumara Parameshwaran
> > <kumaraparamesh92@gmail.com>; Kumara Parameshwaran
> > <kparameshwar@vmware.com>
> > Subject: [PATCH v5] gro : fix reordering of packets in GRO library
> >
> > From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> >
> > When a TCP packet contains flags like PSH it is returned immediately to
> the
> > application though there might be packets of the same flow in the GRO
> table.
> > If PSH flag is set on a segment packets up to the segment should be
> delivered
> > immediately. But the current implementation delivers the last arrived
> packet
> > with PSH flag set causing re-ordering
> >
> > With this patch, if a packet does not contain only ACK flag and if there
> are no
> > previous packets for the flow the packet would be returned immediately,
> > else will be merged with the previous segment and the flag on the last
> > segment will be set on the entire segment.
> > This is the behaviour with linux stack as well.
> >
> > Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> > Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
> > ---
> > v1:
> > If the received packet is not a pure ACK packet, we check if
> > there are any previous packets in the flow, if present we indulge
> > the received packet also in the coalescing logic and update the
> flags
> > of the last recived packet to the entire segment which would avoid
> > re-ordering.
> >
> > Lets say a case where P1(PSH), P2(ACK), P3(ACK) are received in
> > burst mode,
> > P1 contains PSH flag and since it does not contain any prior
> packets in
> > the flow
> > we copy it to unprocess_packets and P2(ACK) and P3(ACK) are
> > merged together.
> > In the existing case the P2,P3 would be delivered as single
> segment
> > first and the
> > unprocess_packets will be copied later which will cause reordering.
> > With the patch
> > copy the unprocess packets first and then the packets from the GRO
> > table.
> >
> > Testing done
> > The csum test-pmd was modifited to support the following
> > GET request of 10MB from client to server via test-pmd (static arp
> > entries added in client
> > and server). Enable GRO and TSO in test-pmd where the packets
> > recived from the client mac
> > would be sent to server mac and vice versa.
> > In above testing, without the patch the client observerd
> re-ordering
> > of 25 packets
> > and with the patch there were no packet re-ordering observerd.
> >
> > v2:
> > Fix warnings in commit and comment.
> > Do not consider packet as candidate to merge if it contains SYN/RST
> > flag.
> >
> > v3:
> > Fix warnings.
> >
> > v4:
> > Rebase with master.
> >
> > v5:
> > Adding co-author email
> >
> > lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
> > lib/gro/rte_gro.c | 18 +++++++++---------
> > 2 files changed, 46 insertions(+), 17 deletions(-)
> >
> > diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c index
> > 0014096e63..7363c5d540 100644
> > --- a/lib/gro/gro_tcp4.c
> > +++ b/lib/gro/gro_tcp4.c
> > @@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
> > pkt->l2_len);
> > }
> >
> > +static inline void
> > +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
> > +{
> > + struct rte_ether_hdr *eth_hdr;
> > + struct rte_ipv4_hdr *ipv4_hdr;
> > + struct rte_tcp_hdr *merged_tcp_hdr;
> > +
> > + eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
> > + ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> > + merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt-
> > >l3_len);
> > + merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags; }
>
> The Linux kernel updates the TCP flag via "tcp_flag_word(th2) |= flags &
> (TCP_FLAG_FIN | TCP_FLAG_PSH)",
> which only adds FIN and PSH at most to the merge packet.
>
> > +
> > int32_t
> > gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > struct gro_tcp4_tbl *tbl,
> > @@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > uint32_t i, max_flow_num, remaining_flow_num;
> > int cmp;
> > uint8_t find;
> > + uint32_t start_idx;
> >
> > /*
> > * Don't process the packet whose TCP header length is greater @@ -
> > 219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> >
> > - /*
> > - * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
> > - * or CWR set.
> > - */
> > - if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> > - return -1;
> > -
> > /* trim the tail padding bytes */
> > ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
> > if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len)) @@ -264,12
> > +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
> > if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
> > find = 1;
> > + start_idx = tbl->flows[i].start_index;
> > break;
> > }
> > remaining_flow_num--;
> > }
> > }
> >
> > + if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
> > + /*
> > + * Check and try merging the current TCP segment with the previous
> > + * TCP segment if the TCP header does not contain RST and SYN flag.
> > + * There are cases where the last segment is sent with FIN|PSH|ACK
> > + * which should also be considered for merging with previous segments.
> > + */
> > + if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
> > + /*
> > + * Since PSH flag is set, start time will be set to 0 so it
> > + * will be flushed immediately.
> > + */
> > + tbl->items[start_idx].start_time = 0;
> > + else
> > + return -1;
> > + }
>
> The nested if-else check is not straightforward, and it's hard to read the
> condition-action of different combinations of flag bits. In addition, are
> all flag bits considered as in the Linux kernel?
>
>> In the Linux kernel, the packets are flushed even if the ACK numbers
>> are different or if the TCP options are different. In the DPDK case, if the
>> options differ the packet is inserted as a new item in the table. Is this
>> intended? Should we maintain the same approach? In the Linux kernel,
>> additional flags like CWR, SYN, RST and URG are also considered. I think we
>> can consider them as well, and if one of these flags is set we can flush
>> the entire flow. As you mentioned, the flags can be updated only for FIN
>> and PSH, but what we should make sure is that packet ordering is maintained
>> on delivery: if a packet with PSH arrives and there are existing packets in
>> the GRO table, we must copy the flags and deliver the entire segment
>> in-order. This should be the case whenever a packet with one of these flags
>> is set and there are packets in the GRO table. Please let me
>> know your thoughts.
>>
> > +
> > /*
> > * Fail to find a matched flow. Insert a new flow and store the
> > * packet into the flow.
> > @@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > is_atomic);
> > if (cmp) {
> > if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
> > - pkt, cmp, sent_seq, ip_id, 0))
> > + pkt, cmp, sent_seq, ip_id, 0)) {
> > + if (tbl->items[cur_idx].start_time == 0)
> > + update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
> > return 1;
> > + }
> > +
> > /*
> > * Fail to merge the two packets, as the packet
> > * length is greater than the max value. Store
> > diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
> > index e35399fd42..87c5502dce 100644
> > --- a/lib/gro/rte_gro.c
> > +++ b/lib/gro/rte_gro.c
> > @@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> > if ((nb_after_gro < nb_pkts)
> > || (unprocess_num < nb_pkts)) {
> > i = 0;
> > + /* Copy unprocessed packets */
> > + if (unprocess_num > 0) {
> > + memcpy(&pkts[i], unprocess_pkts,
> > + sizeof(struct rte_mbuf *) *
> > + unprocess_num);
> > + i = unprocess_num;
> > + }
>
> Why copy unprocess pkts first? This is for avoiding out-of-order?
>
>> Yes, this is to avoid out-of-order delivery.
>>
>
> Thanks,
> Jiayu
> > /* Flush all packets from the tables */
> > if (do_vxlan_tcp_gro) {
> > - i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > - 0, pkts, nb_pkts);
> > + i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
> > + 0, &pkts[i], nb_pkts - i);
> > }
> >
> > if (do_vxlan_udp_gro) {
> > @@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> > i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
> > &pkts[i], nb_pkts - i);
> > }
> > - /* Copy unprocessed packets */
> > - if (unprocess_num > 0) {
> > - memcpy(&pkts[i], unprocess_pkts,
> > - sizeof(struct rte_mbuf *) *
> > - unprocess_num);
> > - }
> > - nb_after_gro = i + unprocess_num;
> > }
> >
> > return nb_after_gro;
> > --
> > 2.25.1
>
>
* [PATCH v6] gro: fix reordering of packets in GRO layer
2022-11-01 7:05 ` [PATCH v5] " Kumara Parameshwaran
2023-06-19 13:25 ` Thomas Monjalon
2023-06-20 7:35 ` Hu, Jiayu
@ 2023-12-08 17:54 ` Kumara Parameshwaran
2023-12-08 18:05 ` [PATCH v7] " Kumara Parameshwaran
` (2 subsequent siblings)
5 siblings, 0 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2023-12-08 17:54 UTC (permalink / raw)
To: dev; +Cc: hujiayu.hu, Kumara Parameshwaran, Kumara Parameshwaran
In the current implementation, when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set, which is how Linux GRO is also implemented.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Co-authored-by: Kumara Parameshwaran <krathinavel@microsoft.com>
---
If the received packet is not a pure ACK packet, we check if
there are any previous packets in the flow; if present, we include
the received packet in the coalescing logic as well and apply the flags
of the last received packet to the entire segment, which avoids
re-ordering.
Let's take a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 contains the PSH flag, and since there are no prior packets in the flow
we copy it to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
In the existing code P2,P3 would be delivered as a single segment first and
unprocess_packets would be copied later, which causes reordering. With the patch we
copy the unprocessed packets first and then the packets from the GRO table.
Testing done:
The csum test-pmd was modified to support the following.
GET request of 10MB from client to server via test-pmd (static ARP entries added in client
and server). Enable GRO and TSO in test-pmd where the packets received from the client MAC
would be sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code
and handle only special flags PSH,FIN
lib/gro/gro_tcp.h | 10 +++++++
lib/gro/gro_tcp4.c | 65 +++++++++++++++++++++++++++++-----------------
2 files changed, 51 insertions(+), 24 deletions(-)
diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..d2073b02f3 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -187,4 +187,14 @@ is_same_common_tcp_key(struct cmn_tcp_key *k1, struct cmn_tcp_key *k2)
return (!memcmp(k1, k2, sizeof(struct cmn_tcp_key)));
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ merged_tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *, pkt->l2_len + pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+
+}
+
#endif
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..6745ebc45e 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+ uint32_t item_start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -139,13 +140,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
-
/* trim the tail padding bytes */
ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -183,6 +177,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ item_start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
@@ -190,28 +185,50 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
}
if (find == 0) {
- sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
- item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
- tbl->max_item_num, start_time,
- INVALID_ARRAY_INDEX, sent_seq, ip_id,
- is_atomic);
- if (item_idx == INVALID_ARRAY_INDEX)
+ /*
+ * Add new flow to the table only if contains ACK flag with data.
+ * Do not add any packets with additional tcp flags to the GRO table
+ */
+ if (tcp_hdr->tcp_flags == RTE_TCP_ACK_FLAG) {
+ sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+ item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
+ tbl->max_item_num, start_time,
+ INVALID_ARRAY_INDEX, sent_seq, ip_id,
+ is_atomic);
+ if (item_idx == INVALID_ARRAY_INDEX)
+ return -1;
+ if (insert_new_flow(tbl, &key, item_idx) ==
+ INVALID_ARRAY_INDEX) {
+ /*
+ * Fail to insert a new flow, so delete the
+ * stored packet.
+ */
+ delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
+ return -1;
+ }
+ return 0;
+ } else {
return -1;
- if (insert_new_flow(tbl, &key, item_idx) ==
- INVALID_ARRAY_INDEX) {
- /*
- * Fail to insert a new flow, so delete the
- * stored packet.
- */
- delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
+ }
+ } else {
+ /*
+ * Any packet with additional flags like PSH,FIN should be processed and flushed immediately.
+ * Hence marking the start time to 0, so that the packets will be flushed immediately in timer
+ * mode.
+ */
+ if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG|RTE_TCP_PSH_FLAG|RTE_TCP_FIN_FLAG)) {
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
+ tbl->items[item_start_idx].start_time = 0;
+ }
+ return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
+ &tbl->item_num, tbl->max_item_num,
+ ip_id, is_atomic, start_time);
+ } else {
return -1;
}
- return 0;
}
- return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
- &tbl->item_num, tbl->max_item_num,
- ip_id, is_atomic, start_time);
+ return -1;
}
/*
--
2.34.1
* [PATCH v7] gro: fix reordering of packets in GRO layer
2022-11-01 7:05 ` [PATCH v5] " Kumara Parameshwaran
` (2 preceding siblings ...)
2023-12-08 17:54 ` [PATCH v6] gro: fix reordering of packets in GRO layer Kumara Parameshwaran
@ 2023-12-08 18:05 ` Kumara Parameshwaran
2023-12-08 18:12 ` [PATCH v8] " Kumara Parameshwaran
2023-12-08 18:17 ` [PATCH v9] " Kumara Parameshwaran
5 siblings, 0 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2023-12-08 18:05 UTC (permalink / raw)
To: dev; +Cc: hujiayu.hu, Kumara Parameshwaran, Kumara Parameshwaran
In the current implementation, when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set, which is how Linux GRO is also implemented.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Co-authored-by: Kumara Parameshwaran <krathinavel@microsoft.com>
---
If the received packet is not a pure ACK packet, we check if
there are any previous packets in the flow; if present, we include
the received packet in the coalescing logic as well and apply the flags
of the last received packet to the entire segment, which avoids
re-ordering.
Let's take a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 contains the PSH flag, and since there are no prior packets in the flow
we copy it to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
In the existing code P2,P3 would be delivered as a single segment first and
unprocess_packets would be copied later, which causes reordering. With the patch we
copy the unprocessed packets first and then the packets from the GRO table.
Testing done:
The csum test-pmd was modified to support the following.
GET request of 10MB from client to server via test-pmd (static ARP entries added in client
and server). Enable GRO and TSO in test-pmd where the packets received from the client MAC
would be sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code
and handle only special flags PSH,FIN
lib/gro/gro_tcp.h | 11 ++++++++
lib/gro/gro_tcp4.c | 65 +++++++++++++++++++++++++++++-----------------
2 files changed, 52 insertions(+), 24 deletions(-)
diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..137a03bc96 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -187,4 +187,15 @@ is_same_common_tcp_key(struct cmn_tcp_key *k1, struct cmn_tcp_key *k2)
return (!memcmp(k1, k2, sizeof(struct cmn_tcp_key)));
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ merged_tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *, pkt->l2_len +
+ pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+
+}
+
#endif
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..7a68615031 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+ uint32_t item_start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -139,13 +140,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
-
/* trim the tail padding bytes */
ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -183,6 +177,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ item_start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
@@ -190,28 +185,50 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
}
if (find == 0) {
- sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
- item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
- tbl->max_item_num, start_time,
- INVALID_ARRAY_INDEX, sent_seq, ip_id,
- is_atomic);
- if (item_idx == INVALID_ARRAY_INDEX)
+ /*
+ * Add new flow to the table only if contains ACK flag with data.
+ * Do not add any packets with additional tcp flags to the GRO table
+ */
+ if (tcp_hdr->tcp_flags == RTE_TCP_ACK_FLAG) {
+ sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+ item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
+ tbl->max_item_num, start_time,
+ INVALID_ARRAY_INDEX, sent_seq, ip_id,
+ is_atomic);
+ if (item_idx == INVALID_ARRAY_INDEX)
+ return -1;
+ if (insert_new_flow(tbl, &key, item_idx) ==
+ INVALID_ARRAY_INDEX) {
+ /*
+ * Fail to insert a new flow, so delete the
+ * stored packet.
+ */
+ delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
+ return -1;
+ }
+ return 0;
+ } else {
return -1;
- if (insert_new_flow(tbl, &key, item_idx) ==
- INVALID_ARRAY_INDEX) {
- /*
- * Fail to insert a new flow, so delete the
- * stored packet.
- */
- delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
+ }
+ } else {
+ /*
+ * Any packet with additional flags like PSH,FIN should be processed
+ * and flushed immediately.
+ * Hence marking the start time to 0, so that the packets will be flushed
+ * immediately in timer mode.
+ */
+ if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG|RTE_TCP_PSH_FLAG|RTE_TCP_FIN_FLAG)) {
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ tbl->items[item_start_idx].start_time = 0;
+ return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
+ &tbl->item_num, tbl->max_item_num,
+ ip_id, is_atomic, start_time);
+ } else {
return -1;
}
- return 0;
}
- return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
- &tbl->item_num, tbl->max_item_num,
- ip_id, is_atomic, start_time);
+ return -1;
}
/*
--
2.34.1
* [PATCH v8] gro: fix reordering of packets in GRO layer
2022-11-01 7:05 ` [PATCH v5] " Kumara Parameshwaran
` (3 preceding siblings ...)
2023-12-08 18:05 ` [PATCH v7] " Kumara Parameshwaran
@ 2023-12-08 18:12 ` Kumara Parameshwaran
2023-12-08 18:17 ` [PATCH v9] " Kumara Parameshwaran
5 siblings, 0 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2023-12-08 18:12 UTC (permalink / raw)
To: dev; +Cc: hujiayu.hu, Kumara Parameshwaran, Kumara Parameshwaran
In the current implementation, when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set, which is how Linux GRO is also implemented.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Co-authored-by: Kumara Parameshwaran <krathinavel@microsoft.com>
---
If the received packet is not a pure ACK packet, we check if
there are any previous packets in the flow; if present, we include
the received packet in the coalescing logic as well and apply the flags
of the last received packet to the entire segment, which avoids
re-ordering.
Let's take a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 contains the PSH flag, and since there are no prior packets in the flow
we copy it to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
In the existing code P2,P3 would be delivered as a single segment first and
unprocess_packets would be copied later, which causes reordering. With the patch we
copy the unprocessed packets first and then the packets from the GRO table.
Testing done:
The csum test-pmd was modified to support the following.
GET request of 10MB from client to server via test-pmd (static ARP entries added in client
and server). Enable GRO and TSO in test-pmd where the packets received from the client MAC
would be sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code
and handle only special flags PSH,FIN
v7:
Fix warnings and errors
v8:
Fix warnings and errors
lib/gro/gro_tcp.h | 11 ++++++++
lib/gro/gro_tcp4.c | 67 +++++++++++++++++++++++++++++-----------------
2 files changed, 54 insertions(+), 24 deletions(-)
diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..137a03bc96 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -187,4 +187,15 @@ is_same_common_tcp_key(struct cmn_tcp_key *k1, struct cmn_tcp_key *k2)
return (!memcmp(k1, k2, sizeof(struct cmn_tcp_key)));
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ merged_tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *, pkt->l2_len +
+ pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+
+}
+
#endif
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..8af5a8d8a9 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+ uint32_t item_start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -139,13 +140,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
-
/* trim the tail padding bytes */
ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -183,6 +177,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ item_start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
@@ -190,28 +185,52 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
}
if (find == 0) {
- sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
- item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
- tbl->max_item_num, start_time,
- INVALID_ARRAY_INDEX, sent_seq, ip_id,
- is_atomic);
- if (item_idx == INVALID_ARRAY_INDEX)
+ /*
+ * Add new flow to the table only if contains ACK flag with data.
+ * Do not add any packets with additional tcp flags to the GRO table
+ */
+ if (tcp_hdr->tcp_flags == RTE_TCP_ACK_FLAG) {
+ sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+ item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
+ tbl->max_item_num, start_time,
+ INVALID_ARRAY_INDEX, sent_seq, ip_id,
+ is_atomic);
+ if (item_idx == INVALID_ARRAY_INDEX)
+ return -1;
+ if (insert_new_flow(tbl, &key, item_idx) ==
+ INVALID_ARRAY_INDEX) {
+ /*
+ * Fail to insert a new flow, so delete the
+ * stored packet.
+ */
+ delete_tcp_item(tbl->items, item_idx, &tbl->item_num,
+ INVALID_ARRAY_INDEX);
+ return -1;
+ }
+ return 0;
+ } else {
return -1;
- if (insert_new_flow(tbl, &key, item_idx) ==
- INVALID_ARRAY_INDEX) {
- /*
- * Fail to insert a new flow, so delete the
- * stored packet.
- */
- delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
+ }
+ } else {
+ /*
+ * Any packet with additional flags like PSH,FIN should be processed
+ * and flushed immediately.
+ * Hence marking the start time to 0, so that the packets will be flushed
+ * immediately in timer mode.
+ */
+ if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG|RTE_TCP_PSH_FLAG|RTE_TCP_FIN_FLAG)) {
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ tbl->items[item_start_idx].start_time = 0;
+ return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items,
+ tbl->flows[i].start_index,
+ &tbl->item_num, tbl->max_item_num,
+ ip_id, is_atomic, start_time);
+ } else {
return -1;
}
- return 0;
}
- return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
- &tbl->item_num, tbl->max_item_num,
- ip_id, is_atomic, start_time);
+ return -1;
}
/*
--
2.34.1
* [PATCH v9] gro: fix reordering of packets in GRO layer
2022-11-01 7:05 ` [PATCH v5] " Kumara Parameshwaran
` (4 preceding siblings ...)
2023-12-08 18:12 ` [PATCH v8] " Kumara Parameshwaran
@ 2023-12-08 18:17 ` Kumara Parameshwaran
2024-01-04 15:49 ` 胡嘉瑜
` (4 more replies)
5 siblings, 5 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2023-12-08 18:17 UTC (permalink / raw)
To: dev; +Cc: hujiayu.hu, Kumara Parameshwaran, Kumara Parameshwaran
In the current implementation when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set, which is how Linux GRO is also implemented.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Co-authored-by: Kumara Parameshwaran <krathinavel@microsoft.com>
---
If the received packet is not a pure ACK packet, we check if
there are any previous packets in the flow; if present, we include
the received packet in the coalescing logic as well and apply the flags
of the last received packet to the entire segment, which avoids
re-ordering.
Let's take a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 contains the PSH flag, and since there are no prior packets in the flow
we copy it to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
In the existing code P2,P3 would be delivered as a single segment first and
unprocess_packets would be copied later, which causes reordering. With the patch we
copy the unprocessed packets first and then the packets from the GRO table.
Testing done:
The csum test-pmd was modified to support the following.
GET request of 10MB from client to server via test-pmd (static ARP entries added in client
and server). Enable GRO and TSO in test-pmd where the packets received from the client MAC
would be sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code
and handle only special flags PSH,FIN
v7:
Fix warnings and errors
v8:
Fix warnings and errors
v9:
Fix commit message
lib/gro/gro_tcp.h | 11 ++++++++
lib/gro/gro_tcp4.c | 67 +++++++++++++++++++++++++++++-----------------
2 files changed, 54 insertions(+), 24 deletions(-)
diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..137a03bc96 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -187,4 +187,15 @@ is_same_common_tcp_key(struct cmn_tcp_key *k1, struct cmn_tcp_key *k2)
return (!memcmp(k1, k2, sizeof(struct cmn_tcp_key)));
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ merged_tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *, pkt->l2_len +
+ pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+
+}
+
#endif
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..8af5a8d8a9 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+ uint32_t item_start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -139,13 +140,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
-
/* trim the tail padding bytes */
ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -183,6 +177,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ item_start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
@@ -190,28 +185,52 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
}
if (find == 0) {
- sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
- item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
- tbl->max_item_num, start_time,
- INVALID_ARRAY_INDEX, sent_seq, ip_id,
- is_atomic);
- if (item_idx == INVALID_ARRAY_INDEX)
+ /*
+ * Add new flow to the table only if contains ACK flag with data.
+ * Do not add any packets with additional tcp flags to the GRO table
+ */
+ if (tcp_hdr->tcp_flags == RTE_TCP_ACK_FLAG) {
+ sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+ item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
+ tbl->max_item_num, start_time,
+ INVALID_ARRAY_INDEX, sent_seq, ip_id,
+ is_atomic);
+ if (item_idx == INVALID_ARRAY_INDEX)
+ return -1;
+ if (insert_new_flow(tbl, &key, item_idx) ==
+ INVALID_ARRAY_INDEX) {
+ /*
+ * Fail to insert a new flow, so delete the
+ * stored packet.
+ */
+ delete_tcp_item(tbl->items, item_idx, &tbl->item_num,
+ INVALID_ARRAY_INDEX);
+ return -1;
+ }
+ return 0;
+ } else {
return -1;
- if (insert_new_flow(tbl, &key, item_idx) ==
- INVALID_ARRAY_INDEX) {
- /*
- * Fail to insert a new flow, so delete the
- * stored packet.
- */
- delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
+ }
+ } else {
+ /*
+ * Any packet with additional flags like PSH,FIN should be processed
+ * and flushed immediately.
+ * Hence marking the start time to 0, so that the packets will be flushed
+ * immediately in timer mode.
+ */
+ if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG|RTE_TCP_PSH_FLAG|RTE_TCP_FIN_FLAG)) {
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ tbl->items[item_start_idx].start_time = 0;
+ return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items,
+ tbl->flows[i].start_index,
+ &tbl->item_num, tbl->max_item_num,
+ ip_id, is_atomic, start_time);
+ } else {
return -1;
}
- return 0;
}
- return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
- &tbl->item_num, tbl->max_item_num,
- ip_id, is_atomic, start_time);
+ return -1;
}
/*
--
2.34.1
* Re: [PATCH v9] gro: fix reordering of packets in GRO layer
2023-12-08 18:17 ` [PATCH v9] " Kumara Parameshwaran
@ 2024-01-04 15:49 ` 胡嘉瑜
2024-01-07 11:21 ` [PATCH v10] " Kumara Parameshwaran
` (3 subsequent siblings)
4 siblings, 0 replies; 24+ messages in thread
From: 胡嘉瑜 @ 2024-01-04 15:49 UTC (permalink / raw)
To: Kumara Parameshwaran, dev; +Cc: Kumara Parameshwaran
On 2023/12/9 at 2:17 AM, Kumara Parameshwaran wrote:
> In the current implementation when a packet is received with
> special TCP flag(s) set, only that packet is delivered out of order.
> There could be already coalesced packets in the GRO table
> belonging to the same flow but not delivered.
> This fix makes sure that the entire segment is delivered with the
> special flag(s) set, which is how Linux GRO is also implemented.
>
> Signed-off-by: Kumara Parameshwaran<kumaraparamesh92@gmail.com>
> Co-authored-by: Kumara Parameshwaran<krathinavel@microsoft.com>
> ---
> If the received packet is not a pure ACK packet, we check if
> there are any previous packets in the flow; if present, we include
> the received packet in the coalescing logic as well and apply the flags
> of the last received packet to the entire segment, which avoids
> re-ordering.
>
> Let's take a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
> P1 contains the PSH flag, and since there are no prior packets in the flow
> we copy it to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
> In the existing code P2,P3 would be delivered as a single segment first and
> unprocess_packets would be copied later, which causes reordering. With the patch we
> copy the unprocessed packets first and then the packets from the GRO table.
>
> Testing done:
> The csum test-pmd was modified to support the following.
> GET request of 10MB from client to server via test-pmd (static ARP entries added in client
> and server). Enable GRO and TSO in test-pmd where the packets received from the client MAC
> would be sent to the server MAC and vice versa.
> In the above testing, without the patch the client observed re-ordering of 25 packets,
> and with the patch no packet re-ordering was observed.
>
> v2:
> Fix warnings in commit and comment.
> Do not consider packet as candidate to merge if it contains SYN/RST flag.
>
> v3:
> Fix warnings.
>
> v4:
> Rebase with master.
>
> v5:
> Adding co-author email
> v6:
> Address review comments from the maintainer to restructure the code
> and handle only special flags PSH,FIN
>
> v7:
> Fix warnings and errors
>
> v8:
> Fix warnings and errors
>
> v9:
> Fix commit message
>
> lib/gro/gro_tcp.h | 11 ++++++++
> lib/gro/gro_tcp4.c | 67 +++++++++++++++++++++++++++++-----------------
> 2 files changed, 54 insertions(+), 24 deletions(-)
>
> diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
> index d926c4b8cc..137a03bc96 100644
> --- a/lib/gro/gro_tcp.h
> +++ b/lib/gro/gro_tcp.h
> @@ -187,4 +187,15 @@ is_same_common_tcp_key(struct cmn_tcp_key *k1, struct cmn_tcp_key *k2)
> return (!memcmp(k1, k2, sizeof(struct cmn_tcp_key)));
> }
>
> +static inline void
> +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
> +{
> + struct rte_tcp_hdr *merged_tcp_hdr;
> +
> + merged_tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *, pkt->l2_len +
> + pkt->l3_len);
> + merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
> +
> +}
> +
> #endif
> diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
> index 6645de592b..8af5a8d8a9 100644
> --- a/lib/gro/gro_tcp4.c
> +++ b/lib/gro/gro_tcp4.c
> @@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> uint32_t item_idx;
> uint32_t i, max_flow_num, remaining_flow_num;
> uint8_t find;
> + uint32_t item_start_idx;
>
> /*
> * Don't process the packet whose TCP header length is greater
> @@ -139,13 +140,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>
> - /*
> - * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
> - * or CWR set.
> - */
> - if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> - return -1;
> -
> /* trim the tail padding bytes */
> ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
> if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
> @@ -183,6 +177,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
> if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
> find = 1;
> + item_start_idx = tbl->flows[i].start_index;
> break;
> }
> remaining_flow_num--;
> @@ -190,28 +185,52 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> }
>
> if (find == 0) {
It is more likely that a matching flow is found, so it is better to put the
logic below in the else statement.
> - sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> - item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
> - tbl->max_item_num, start_time,
> - INVALID_ARRAY_INDEX, sent_seq, ip_id,
> - is_atomic);
> - if (item_idx == INVALID_ARRAY_INDEX)
> + /*
> + * Add new flow to the table only if contains ACK flag with data.
> + * Do not add any packets with additional tcp flags to the GRO table
> + */
> + if (tcp_hdr->tcp_flags == RTE_TCP_ACK_FLAG) {
> + sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> + item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
> + tbl->max_item_num, start_time,
> + INVALID_ARRAY_INDEX, sent_seq, ip_id,
> + is_atomic);
> + if (item_idx == INVALID_ARRAY_INDEX)
> + return -1;
> + if (insert_new_flow(tbl, &key, item_idx) ==
> + INVALID_ARRAY_INDEX) {
> + /*
> + * Fail to insert a new flow, so delete the
> + * stored packet.
> + */
> + delete_tcp_item(tbl->items, item_idx, &tbl->item_num,
> + INVALID_ARRAY_INDEX);
> + return -1;
> + }
> + return 0;
> + } else {
"else" is not needed.
> return -1;
> - if (insert_new_flow(tbl, &key, item_idx) ==
> - INVALID_ARRAY_INDEX) {
> - /*
> - * Fail to insert a new flow, so delete the
> - * stored packet.
> - */
> - delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
> + }
> + } else {
> + /*
> + * Any packet with additional flags like PSH,FIN should be processed
> + * and flushed immediately.
> + * Hence marking the start time to 0, so that the packets will be flushed
> + * immediately in timer mode.
> + */
> + if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG|RTE_TCP_PSH_FLAG|RTE_TCP_FIN_FLAG)) {
Add a space between RTE_TCP_ACK_FLAG and '|'.
> + if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> + tbl->items[item_start_idx].start_time = 0;
> + return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items,
> + tbl->flows[i].start_index,
> + &tbl->item_num, tbl->max_item_num,
> + ip_id, is_atomic, start_time);
> + } else {
It is better to check the "invalid" flags, like SYN, RST and URG, at the
beginning of gro_tcp4_reassemble(), as the packet can be returned earlier.
Thanks,
Jiayu
> return -1;
> }
> - return 0;
> }
>
> - return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
> - &tbl->item_num, tbl->max_item_num,
> - ip_id, is_atomic, start_time);
> + return -1;
> }
>
> /*
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v10] gro: fix reordering of packets in GRO layer
2023-12-08 18:17 ` [PATCH v9] " Kumara Parameshwaran
2024-01-04 15:49 ` 胡嘉瑜
@ 2024-01-07 11:21 ` Kumara Parameshwaran
2024-01-07 11:29 ` [PATCH v11] " Kumara Parameshwaran
` (2 subsequent siblings)
4 siblings, 0 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2024-01-07 11:21 UTC (permalink / raw)
To: hujiayu.hu; +Cc: dev, Kumara Parameshwaran, Kumara Parameshwaran
In the current implementation when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set, which is how Linux GRO is implemented as well.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Co-authored-by: Kumara Parameshwaran <krathinavel@microsoft.com>
---
If the received packet is not a pure ACK packet, we check whether
there are any previous packets in the flow; if present, the received
packet is also included in the coalescing logic, and the flags of the
last received packet are applied to the entire segment, which avoids
re-ordering.
Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 carries the PSH flag, and since there are no prior packets in the flow
it is copied to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
In the existing code, P2 and P3 would be delivered as a single segment first and
unprocess_packets would be copied later, causing reordering. With the patch, the
unprocessed packets are copied first and then the packets from the GRO table.
Testing done:
The csum test-pmd was modified to support the following.
GET request of 10MB from client to server via test-pmd (static ARP entries added in client
and server). Enable GRO and TSO in test-pmd, where the packets received from the client MAC
would be sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code
and handle only special flags PSH,FIN
v7:
Fix warnings and errors
v8:
Fix warnings and errors
v9:
Fix commit message
v10:
Update tcp header flags and address review comments
lib/gro/gro_tcp.h | 9 ++++++++
lib/gro/gro_tcp4.c | 46 ++++++++++++++++++++++++++++----------
lib/gro/gro_tcp_internal.h | 2 +-
lib/gro/gro_vxlan_tcp4.c | 5 +++--
4 files changed, 47 insertions(+), 15 deletions(-)
diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..2c68b5f23e 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -19,6 +19,8 @@
#define INVALID_TCP_HDRLEN(len) \
(((len) < sizeof(struct rte_tcp_hdr)) || ((len) > MAX_TCP_HLEN))
+#define VALID_GRO_TCP_FLAGS (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)
+
struct cmn_tcp_key {
struct rte_ether_addr eth_saddr;
struct rte_ether_addr eth_daddr;
@@ -81,11 +83,13 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+ uint8_t tcp_flags,
uint16_t ip_id,
uint16_t l2_offset)
{
struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
uint16_t hdr_len, l2_len;
+ struct rte_tcp_hdr *tcp_hdr;
if (cmp > 0) {
pkt_head = item->firstseg;
@@ -128,6 +132,11 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
/* update MBUF metadata for the merged packet */
pkt_head->nb_segs += pkt_tail->nb_segs;
pkt_head->pkt_len += pkt_tail->pkt_len;
+ if (tcp_flags != RTE_TCP_ACK_FLAG) {
+ tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *,
+ l2_offset + pkt_head->l2_len + pkt_head->l3_len);
+ tcp_hdr->tcp_flags |= tcp_flags;
+ }
return 1;
}
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..707cd050da 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+ uint32_t item_start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -139,11 +140,8 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ /* Return early if the TCP flags are not handled in GRO layer */
+ if (tcp_hdr->tcp_flags & (~(VALID_GRO_TCP_FLAGS)))
return -1;
/* trim the tail padding bytes */
@@ -183,13 +181,36 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ item_start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
- if (find == 0) {
+ if (find == 1) {
+ /*
+ * Any packet with additional flags like PSH,FIN should be processed
+ * and flushed immediately.
+ * Hence marking the start time to 0, so that the packets will be flushed
+ * immediately in timer mode.
+ */
+ if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)) {
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ tbl->items[item_start_idx].start_time = 0;
+ return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items,
+ tbl->flows[i].start_index,
+ &tbl->item_num, tbl->max_item_num,
+ ip_id, is_atomic, start_time);
+ } else {
+ return -1;
+ }
+ }
+ /*
+ * Add new flow to the table only if contains ACK flag with data.
+ * Do not add any packets with additional tcp flags to the GRO table
+ */
+ if (tcp_hdr->tcp_flags == RTE_TCP_ACK_FLAG) {
sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
tbl->max_item_num, start_time,
@@ -200,18 +221,19 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (insert_new_flow(tbl, &key, item_idx) ==
INVALID_ARRAY_INDEX) {
/*
- * Fail to insert a new flow, so delete the
- * stored packet.
+ * Fail to insert a new flow, so delete the
+ * stored packet.
*/
- delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
+ delete_tcp_item(tbl->items, item_idx, &tbl->item_num,
+ INVALID_ARRAY_INDEX);
return -1;
}
return 0;
+ } else {
+ return -1;
}
- return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
- &tbl->item_num, tbl->max_item_num,
- ip_id, is_atomic, start_time);
+ return -1;
}
/*
diff --git a/lib/gro/gro_tcp_internal.h b/lib/gro/gro_tcp_internal.h
index cc84abeaeb..e4855da1ad 100644
--- a/lib/gro/gro_tcp_internal.h
+++ b/lib/gro/gro_tcp_internal.h
@@ -101,7 +101,7 @@ process_tcp_item(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp_packets(&items[cur_idx],
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, tcp_hdr->tcp_flags, ip_id, 0))
return 1;
/*
* Fail to merge the two packets, as the packet
diff --git a/lib/gro/gro_vxlan_tcp4.c b/lib/gro/gro_vxlan_tcp4.c
index 6ab7001922..8dd62a949c 100644
--- a/lib/gro/gro_vxlan_tcp4.c
+++ b/lib/gro/gro_vxlan_tcp4.c
@@ -239,10 +239,11 @@ merge_two_vxlan_tcp4_packets(struct gro_vxlan_tcp4_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+ uint8_t tcp_flags,
uint16_t outer_ip_id,
uint16_t ip_id)
{
- if (merge_two_tcp_packets(&item->inner_item, pkt, cmp, sent_seq,
+ if (merge_two_tcp_packets(&item->inner_item, pkt, cmp, sent_seq, tcp_flags,
ip_id, pkt->outer_l2_len +
pkt->outer_l3_len)) {
/* Update the outer IPv4 ID to the large value. */
@@ -413,7 +414,7 @@ gro_vxlan_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_dl, outer_is_atomic, is_atomic);
if (cmp) {
if (merge_two_vxlan_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq,
+ pkt, cmp, sent_seq, tcp_hdr->tcp_flags,
outer_ip_id, ip_id))
return 1;
/*
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v11] gro: fix reordering of packets in GRO layer
2023-12-08 18:17 ` [PATCH v9] " Kumara Parameshwaran
2024-01-04 15:49 ` 胡嘉瑜
2024-01-07 11:21 ` [PATCH v10] " Kumara Parameshwaran
@ 2024-01-07 11:29 ` Kumara Parameshwaran
2024-01-07 17:20 ` Stephen Hemminger
2024-01-08 15:50 ` [PATCH v12] " Kumara Parameshwaran
2024-01-08 16:04 ` [PATCH v13] " Kumara Parameshwaran
4 siblings, 1 reply; 24+ messages in thread
From: Kumara Parameshwaran @ 2024-01-07 11:29 UTC (permalink / raw)
To: hujiayu.hu; +Cc: dev, Kumara Parameshwaran
In the current implementation when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set, which is how Linux GRO is implemented as well.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
---
If the received packet is not a pure ACK packet, we check whether
there are any previous packets in the flow; if present, the received
packet is also included in the coalescing logic, and the flags of the
last received packet are applied to the entire segment, which avoids
re-ordering.
Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 carries the PSH flag, and since there are no prior packets in the flow
it is copied to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
In the existing code, P2 and P3 would be delivered as a single segment first and
unprocess_packets would be copied later, causing reordering. With the patch, the
unprocessed packets are copied first and then the packets from the GRO table.
Testing done:
The csum test-pmd was modified to support the following.
GET request of 10MB from client to server via test-pmd (static ARP entries added in client
and server). Enable GRO and TSO in test-pmd, where the packets received from the client MAC
would be sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code
and handle only special flags PSH,FIN
v7:
Fix warnings and errors
v8:
Fix warnings and errors
v9:
Fix commit message
v10:
Update tcp header flags and address review comments
v11:
Fix warnings
lib/gro/gro_tcp.h | 9 ++++++++
lib/gro/gro_tcp4.c | 46 ++++++++++++++++++++++++++++----------
lib/gro/gro_tcp_internal.h | 2 +-
lib/gro/gro_vxlan_tcp4.c | 5 +++--
4 files changed, 47 insertions(+), 15 deletions(-)
diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..2c68b5f23e 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -19,6 +19,8 @@
#define INVALID_TCP_HDRLEN(len) \
(((len) < sizeof(struct rte_tcp_hdr)) || ((len) > MAX_TCP_HLEN))
+#define VALID_GRO_TCP_FLAGS (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)
+
struct cmn_tcp_key {
struct rte_ether_addr eth_saddr;
struct rte_ether_addr eth_daddr;
@@ -81,11 +83,13 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+ uint8_t tcp_flags,
uint16_t ip_id,
uint16_t l2_offset)
{
struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
uint16_t hdr_len, l2_len;
+ struct rte_tcp_hdr *tcp_hdr;
if (cmp > 0) {
pkt_head = item->firstseg;
@@ -128,6 +132,11 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
/* update MBUF metadata for the merged packet */
pkt_head->nb_segs += pkt_tail->nb_segs;
pkt_head->pkt_len += pkt_tail->pkt_len;
+ if (tcp_flags != RTE_TCP_ACK_FLAG) {
+ tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *,
+ l2_offset + pkt_head->l2_len + pkt_head->l3_len);
+ tcp_hdr->tcp_flags |= tcp_flags;
+ }
return 1;
}
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..d426127dbd 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+ uint32_t item_start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -139,11 +140,8 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ /* Return early if the TCP flags are not handled in GRO layer */
+ if (tcp_hdr->tcp_flags & (~(VALID_GRO_TCP_FLAGS)))
return -1;
/* trim the tail padding bytes */
@@ -183,13 +181,36 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ item_start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
- if (find == 0) {
+ if (find == 1) {
+ /*
+ * Any packet with additional flags like PSH,FIN should be processed
+ * and flushed immediately.
+ * Hence marking the start time to 0, so that the packets will be flushed
+ * immediately in timer mode.
+ */
+ if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)) {
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ tbl->items[item_start_idx].start_time = 0;
+ return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items,
+ tbl->flows[i].start_index,
+ &tbl->item_num, tbl->max_item_num,
+ ip_id, is_atomic, start_time);
+ } else {
+ return -1;
+ }
+ }
+ /*
+ * Add new flow to the table only if contains ACK flag with data.
+ * Do not add any packets with additional tcp flags to the GRO table
+ */
+ if (tcp_hdr->tcp_flags == RTE_TCP_ACK_FLAG) {
sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
tbl->max_item_num, start_time,
@@ -200,18 +221,19 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (insert_new_flow(tbl, &key, item_idx) ==
INVALID_ARRAY_INDEX) {
/*
- * Fail to insert a new flow, so delete the
- * stored packet.
+ * Fail to insert a new flow, so delete the
+ * stored packet.
*/
- delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
+ delete_tcp_item(tbl->items, item_idx, &tbl->item_num,
+ INVALID_ARRAY_INDEX);
return -1;
}
return 0;
+ } else {
+ return -1;
}
- return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
- &tbl->item_num, tbl->max_item_num,
- ip_id, is_atomic, start_time);
+ return -1;
}
/*
diff --git a/lib/gro/gro_tcp_internal.h b/lib/gro/gro_tcp_internal.h
index cc84abeaeb..e4855da1ad 100644
--- a/lib/gro/gro_tcp_internal.h
+++ b/lib/gro/gro_tcp_internal.h
@@ -101,7 +101,7 @@ process_tcp_item(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp_packets(&items[cur_idx],
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, tcp_hdr->tcp_flags, ip_id, 0))
return 1;
/*
* Fail to merge the two packets, as the packet
diff --git a/lib/gro/gro_vxlan_tcp4.c b/lib/gro/gro_vxlan_tcp4.c
index 6ab7001922..8dd62a949c 100644
--- a/lib/gro/gro_vxlan_tcp4.c
+++ b/lib/gro/gro_vxlan_tcp4.c
@@ -239,10 +239,11 @@ merge_two_vxlan_tcp4_packets(struct gro_vxlan_tcp4_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+ uint8_t tcp_flags,
uint16_t outer_ip_id,
uint16_t ip_id)
{
- if (merge_two_tcp_packets(&item->inner_item, pkt, cmp, sent_seq,
+ if (merge_two_tcp_packets(&item->inner_item, pkt, cmp, sent_seq, tcp_flags,
ip_id, pkt->outer_l2_len +
pkt->outer_l3_len)) {
/* Update the outer IPv4 ID to the large value. */
@@ -413,7 +414,7 @@ gro_vxlan_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_dl, outer_is_atomic, is_atomic);
if (cmp) {
if (merge_two_vxlan_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq,
+ pkt, cmp, sent_seq, tcp_hdr->tcp_flags,
outer_ip_id, ip_id))
return 1;
/*
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v11] gro: fix reordering of packets in GRO layer
2024-01-07 11:29 ` [PATCH v11] " Kumara Parameshwaran
@ 2024-01-07 17:20 ` Stephen Hemminger
2024-01-08 16:11 ` kumaraparameshwaran rathinavel
0 siblings, 1 reply; 24+ messages in thread
From: Stephen Hemminger @ 2024-01-07 17:20 UTC (permalink / raw)
To: Kumara Parameshwaran; +Cc: hujiayu.hu, dev
On Sun, 7 Jan 2024 16:59:20 +0530
Kumara Parameshwaran <kumaraparamesh92@gmail.com> wrote:
> + /* Return early if the TCP flags are not handled in GRO layer */
> + if (tcp_hdr->tcp_flags & (~(VALID_GRO_TCP_FLAGS)))
Nit, lots of extra paren here. Could be:
if (tcp_hdr->tcp_flags & ~VALID_GRO_TCP_FLAGS)
> + if (find == 1) {
> + /*
> + * Any packet with additional flags like PSH,FIN should be processed
> + * and flushed immediately.
> + * Hence marking the start time to 0, so that the packets will be flushed
> + * immediately in timer mode.
> + */
> + if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)) {
> + if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> + tbl->items[item_start_idx].start_time = 0;
> + return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items,
> + tbl->flows[i].start_index,
> + &tbl->item_num, tbl->max_item_num,
> + ip_id, is_atomic, start_time);
> + } else {
> + return -1;
> + }
> + }
Reordering this conditional would keep code from being so indented.
> - delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
> + delete_tcp_item(tbl->items, item_idx, &tbl->item_num,
> + INVALID_ARRAY_INDEX);
> return -1;
This change is unnecessary; the maximum line length in DPDK is 100 characters for readability.
> return 0;
> + } else {
> + return -1;
> }
>
> - return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
> - &tbl->item_num, tbl->max_item_num,
> - ip_id, is_atomic, start_time);
> + return -1;
> }
Since end of else and end of function both return -1, the else clause is unnecessary.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v12] gro: fix reordering of packets in GRO layer
2023-12-08 18:17 ` [PATCH v9] " Kumara Parameshwaran
` (2 preceding siblings ...)
2024-01-07 11:29 ` [PATCH v11] " Kumara Parameshwaran
@ 2024-01-08 15:50 ` Kumara Parameshwaran
2024-01-08 16:04 ` [PATCH v13] " Kumara Parameshwaran
4 siblings, 0 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2024-01-08 15:50 UTC (permalink / raw)
To: hujiayu.hu; +Cc: dev, Kumara Parameshwaran
In the current implementation when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set, which is how Linux GRO is implemented as well.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
---
If the received packet is not a pure ACK packet, we check whether
there are any previous packets in the flow; if present, the received
packet is also included in the coalescing logic, and the flags of the
last received packet are applied to the entire segment, which avoids
re-ordering.
Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 carries the PSH flag, and since there are no prior packets in the flow
it is copied to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
In the existing code, P2 and P3 would be delivered as a single segment first and
unprocess_packets would be copied later, causing reordering. With the patch, the
unprocessed packets are copied first and then the packets from the GRO table.
Testing done:
The csum test-pmd was modified to support the following.
GET request of 10MB from client to server via test-pmd (static ARP entries added in client
and server). Enable GRO and TSO in test-pmd, where the packets received from the client MAC
would be sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code
and handle only special flags PSH,FIN
v7:
Fix warnings and errors
v8:
Fix warnings and errors
v9:
Fix commit message
v10:
Update tcp header flags and address review comments
v11:
Fix warnings
v12:
Fix nit review comments
lib/gro/gro_tcp.h | 9 +++++++
lib/gro/gro_tcp4.c | 48 ++++++++++++++++++++++++--------------
lib/gro/gro_tcp_internal.h | 2 +-
lib/gro/gro_vxlan_tcp4.c | 5 ++--
4 files changed, 44 insertions(+), 20 deletions(-)
diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..2c68b5f23e 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -19,6 +19,8 @@
#define INVALID_TCP_HDRLEN(len) \
(((len) < sizeof(struct rte_tcp_hdr)) || ((len) > MAX_TCP_HLEN))
+#define VALID_GRO_TCP_FLAGS (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)
+
struct cmn_tcp_key {
struct rte_ether_addr eth_saddr;
struct rte_ether_addr eth_daddr;
@@ -81,11 +83,13 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+ uint8_t tcp_flags,
uint16_t ip_id,
uint16_t l2_offset)
{
struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
uint16_t hdr_len, l2_len;
+ struct rte_tcp_hdr *tcp_hdr;
if (cmp > 0) {
pkt_head = item->firstseg;
@@ -128,6 +132,11 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
/* update MBUF metadata for the merged packet */
pkt_head->nb_segs += pkt_tail->nb_segs;
pkt_head->pkt_len += pkt_tail->pkt_len;
+ if (tcp_flags != RTE_TCP_ACK_FLAG) {
+ tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *,
+ l2_offset + pkt_head->l2_len + pkt_head->l3_len);
+ tcp_hdr->tcp_flags |= tcp_flags;
+ }
return 1;
}
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..ad9cf04dbe 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+ uint32_t item_start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -139,11 +140,8 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ /* Return early if the TCP flags are not handled in GRO layer */
+ if (tcp_hdr->tcp_flags & ~VALID_GRO_TCP_FLAGS)
return -1;
/* trim the tail padding bytes */
@@ -183,25 +181,43 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ item_start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
- if (find == 0) {
+ if (find == 1) {
+ /*
+ * Any packet with additional flags like PSH,FIN should be processed
+ * and flushed immediately.
+ * Hence marking the start time to 0, so that the packets will be flushed
+ * immediately in timer mode.
+ */
+ if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)) {
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ tbl->items[item_start_idx].start_time = 0;
+ return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
+ &tbl->item_num, tbl->max_item_num, ip_id, is_atomic, start_time);
+ } else {
+ return -1;
+ }
+ }
+ /*
+ * Add new flow to the table only if contains ACK flag with data.
+ * Do not add any packets with additional tcp flags to the GRO table
+ */
+ if (tcp_hdr->tcp_flags == RTE_TCP_ACK_FLAG) {
sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
- item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
- tbl->max_item_num, start_time,
- INVALID_ARRAY_INDEX, sent_seq, ip_id,
- is_atomic);
+ item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num, tbl->max_item_num,
+ start_time, INVALID_ARRAY_INDEX, sent_seq, ip_id, is_atomic);
if (item_idx == INVALID_ARRAY_INDEX)
return -1;
- if (insert_new_flow(tbl, &key, item_idx) ==
- INVALID_ARRAY_INDEX) {
+ if (insert_new_flow(tbl, &key, item_idx) == INVALID_ARRAY_INDEX) {
/*
- * Fail to insert a new flow, so delete the
- * stored packet.
+ * Fail to insert a new flow, so delete the
+ * stored packet.
*/
delete_tcp_item(tbl->items, item_idx, &tbl->item_num, INVALID_ARRAY_INDEX);
return -1;
@@ -209,9 +225,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
return 0;
}
- return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
- &tbl->item_num, tbl->max_item_num,
- ip_id, is_atomic, start_time);
+ return -1;
}
/*
diff --git a/lib/gro/gro_tcp_internal.h b/lib/gro/gro_tcp_internal.h
index cc84abeaeb..e4855da1ad 100644
--- a/lib/gro/gro_tcp_internal.h
+++ b/lib/gro/gro_tcp_internal.h
@@ -101,7 +101,7 @@ process_tcp_item(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp_packets(&items[cur_idx],
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, tcp_hdr->tcp_flags, ip_id, 0))
return 1;
/*
* Fail to merge the two packets, as the packet
diff --git a/lib/gro/gro_vxlan_tcp4.c b/lib/gro/gro_vxlan_tcp4.c
index 6ab7001922..8dd62a949c 100644
--- a/lib/gro/gro_vxlan_tcp4.c
+++ b/lib/gro/gro_vxlan_tcp4.c
@@ -239,10 +239,11 @@ merge_two_vxlan_tcp4_packets(struct gro_vxlan_tcp4_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+ uint8_t tcp_flags,
uint16_t outer_ip_id,
uint16_t ip_id)
{
- if (merge_two_tcp_packets(&item->inner_item, pkt, cmp, sent_seq,
+ if (merge_two_tcp_packets(&item->inner_item, pkt, cmp, sent_seq, tcp_flags,
ip_id, pkt->outer_l2_len +
pkt->outer_l3_len)) {
/* Update the outer IPv4 ID to the large value. */
@@ -413,7 +414,7 @@ gro_vxlan_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_dl, outer_is_atomic, is_atomic);
if (cmp) {
if (merge_two_vxlan_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq,
+ pkt, cmp, sent_seq, tcp_hdr->tcp_flags,
outer_ip_id, ip_id))
return 1;
/*
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v13] gro: fix reordering of packets in GRO layer
2023-12-08 18:17 ` [PATCH v9] " Kumara Parameshwaran
` (3 preceding siblings ...)
2024-01-08 15:50 ` [PATCH v12] " Kumara Parameshwaran
@ 2024-01-08 16:04 ` Kumara Parameshwaran
2024-01-16 14:28 ` 胡嘉瑜
4 siblings, 1 reply; 24+ messages in thread
From: Kumara Parameshwaran @ 2024-01-08 16:04 UTC (permalink / raw)
To: hujiayu.hu; +Cc: dev, Kumara Parameshwaran
In the current implementation when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set, which is how Linux GRO is implemented as well.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
---
If the received packet is not a pure ACK packet, we check whether
there are any previous packets in the flow; if present, the received
packet is also included in the coalescing logic, and the flags of the
last received packet are applied to the entire segment, which avoids
re-ordering.
Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 carries the PSH flag and, since there are no prior packets in the flow,
it is copied to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
In the existing code, P2 and P3 would be delivered as a single segment first and
the unprocess_packets copied later, causing reordering. With the patch, the
unprocessed packets are copied first and then the packets from the GRO table.
Testing done
The csum test-pmd was modified to support the following
GET request of 10MB from client to server via test-pmd (static ARP entries added in client
and server). Enable GRO and TSO in test-pmd, where the packets received from the client MAC
are sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code
and handle only the special flags PSH and FIN
v7:
Fix warnings and errors
v8:
Fix warnings and errors
v9:
Fix commit message
v10:
Update tcp header flags and address review comments
v11:
Fix warnings
v12:
Fix nit review comments
v13:
Fix warnings
lib/gro/gro_tcp.h | 9 +++++++++
lib/gro/gro_tcp4.c | 36 +++++++++++++++++++++++++++---------
lib/gro/gro_tcp_internal.h | 2 +-
lib/gro/gro_vxlan_tcp4.c | 5 +++--
4 files changed, 40 insertions(+), 12 deletions(-)
diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..2c68b5f23e 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -19,6 +19,8 @@
#define INVALID_TCP_HDRLEN(len) \
(((len) < sizeof(struct rte_tcp_hdr)) || ((len) > MAX_TCP_HLEN))
+#define VALID_GRO_TCP_FLAGS (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)
+
struct cmn_tcp_key {
struct rte_ether_addr eth_saddr;
struct rte_ether_addr eth_daddr;
@@ -81,11 +83,13 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+ uint8_t tcp_flags,
uint16_t ip_id,
uint16_t l2_offset)
{
struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
uint16_t hdr_len, l2_len;
+ struct rte_tcp_hdr *tcp_hdr;
if (cmp > 0) {
pkt_head = item->firstseg;
@@ -128,6 +132,11 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
/* update MBUF metadata for the merged packet */
pkt_head->nb_segs += pkt_tail->nb_segs;
pkt_head->pkt_len += pkt_tail->pkt_len;
+ if (tcp_flags != RTE_TCP_ACK_FLAG) {
+ tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *,
+ l2_offset + pkt_head->l2_len + pkt_head->l3_len);
+ tcp_hdr->tcp_flags |= tcp_flags;
+ }
return 1;
}
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..c8b8d7990c 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+ uint32_t item_start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -139,11 +140,8 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ /* Return early if the TCP flags are not handled in GRO layer */
+ if (tcp_hdr->tcp_flags & ~VALID_GRO_TCP_FLAGS)
return -1;
/* trim the tail padding bytes */
@@ -183,13 +181,35 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ item_start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
- if (find == 0) {
+ if (find == 1) {
+ /*
+ * Any packet with additional flags like PSH,FIN should be processed
+ * and flushed immediately.
+ * Hence marking the start time to 0, so that the packets will be flushed
+ * immediately in timer mode.
+ */
+ if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)) {
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+ tbl->items[item_start_idx].start_time = 0;
+ return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items,
+ tbl->flows[i].start_index, &tbl->item_num,
+ tbl->max_item_num, ip_id, is_atomic, start_time);
+ } else {
+ return -1;
+ }
+ }
+ /*
+ * Add new flow to the table only if contains ACK flag with data.
+ * Do not add any packets with additional tcp flags to the GRO table
+ */
+ if (tcp_hdr->tcp_flags == RTE_TCP_ACK_FLAG) {
sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
tbl->max_item_num, start_time,
@@ -209,9 +229,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
return 0;
}
- return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items, tbl->flows[i].start_index,
- &tbl->item_num, tbl->max_item_num,
- ip_id, is_atomic, start_time);
+ return -1;
}
/*
diff --git a/lib/gro/gro_tcp_internal.h b/lib/gro/gro_tcp_internal.h
index cc84abeaeb..e4855da1ad 100644
--- a/lib/gro/gro_tcp_internal.h
+++ b/lib/gro/gro_tcp_internal.h
@@ -101,7 +101,7 @@ process_tcp_item(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp_packets(&items[cur_idx],
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, tcp_hdr->tcp_flags, ip_id, 0))
return 1;
/*
* Fail to merge the two packets, as the packet
diff --git a/lib/gro/gro_vxlan_tcp4.c b/lib/gro/gro_vxlan_tcp4.c
index 6ab7001922..8dd62a949c 100644
--- a/lib/gro/gro_vxlan_tcp4.c
+++ b/lib/gro/gro_vxlan_tcp4.c
@@ -239,10 +239,11 @@ merge_two_vxlan_tcp4_packets(struct gro_vxlan_tcp4_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+ uint8_t tcp_flags,
uint16_t outer_ip_id,
uint16_t ip_id)
{
- if (merge_two_tcp_packets(&item->inner_item, pkt, cmp, sent_seq,
+ if (merge_two_tcp_packets(&item->inner_item, pkt, cmp, sent_seq, tcp_flags,
ip_id, pkt->outer_l2_len +
pkt->outer_l3_len)) {
/* Update the outer IPv4 ID to the large value. */
@@ -413,7 +414,7 @@ gro_vxlan_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_dl, outer_is_atomic, is_atomic);
if (cmp) {
if (merge_two_vxlan_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq,
+ pkt, cmp, sent_seq, tcp_hdr->tcp_flags,
outer_ip_id, ip_id))
return 1;
/*
--
2.25.1
* Re: [PATCH v11] gro: fix reordering of packets in GRO layer
2024-01-07 17:20 ` Stephen Hemminger
@ 2024-01-08 16:11 ` kumaraparameshwaran rathinavel
0 siblings, 0 replies; 24+ messages in thread
From: kumaraparameshwaran rathinavel @ 2024-01-08 16:11 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: hujiayu.hu, dev
On Sun, Jan 7, 2024 at 10:50 PM Stephen Hemminger <
stephen@networkplumber.org> wrote:
> On Sun, 7 Jan 2024 16:59:20 +0530
> Kumara Parameshwaran <kumaraparamesh92@gmail.com> wrote:
>
> > + /* Return early if the TCP flags are not handled in GRO layer */
> > + if (tcp_hdr->tcp_flags & (~(VALID_GRO_TCP_FLAGS)))
>
> Nit, lots of extra paren here. Could be:
> if (tcp_hdr->tcp_flags & ~VALID_GRO_TCP_FLAGS)
>
>> Done.
>>
> > + if (find == 1) {
> > + /*
> > + * Any packet with additional flags like PSH,FIN should be
> processed
> > + * and flushed immediately.
> > + * Hence marking the start time to 0, so that the packets
> will be flushed
> > + * immediately in timer mode.
> > + */
> > + if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG |
> RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)) {
> > + if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> > + tbl->items[item_start_idx].start_time = 0;
> > + return process_tcp_item(pkt, tcp_hdr, tcp_dl,
> tbl->items,
> > + tbl->flows[i].start_index,
> > + &tbl->item_num,
> tbl->max_item_num,
> > + ip_id, is_atomic,
> start_time);
> > + } else {
> > + return -1;
> > + }
> > + }
>
> Reordering this conditional would keep code from being so indented.
>
>> Doing this reordering as suggested by Jiayu, since find == 1 would be
>> likely in most cases.
>>
> > - delete_tcp_item(tbl->items, item_idx,
> &tbl->item_num, INVALID_ARRAY_INDEX);
> > + delete_tcp_item(tbl->items, item_idx,
> &tbl->item_num,
> > + INVALID_ARRAY_INDEX);
> > return -1;
>
> This change is unnecessary, max line length in DPDK is 100 characters for
> readability.
>
>> Done.
>>
> > return 0;
> > + } else {
> > + return -1;
> > }
> >
> > - return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items,
> tbl->flows[i].start_index,
> > - &tbl->item_num,
> tbl->max_item_num,
> > - ip_id, is_atomic,
> start_time);
> > + return -1;
> > }
>
> Since end of else and end of function both return -1, the else clause is
> unnecessary.
>
>> Done.
>>
>
* Re: [PATCH v13] gro: fix reordering of packets in GRO layer
2024-01-08 16:04 ` [PATCH v13] " Kumara Parameshwaran
@ 2024-01-16 14:28 ` 胡嘉瑜
2024-02-12 14:30 ` Thomas Monjalon
0 siblings, 1 reply; 24+ messages in thread
From: 胡嘉瑜 @ 2024-01-16 14:28 UTC (permalink / raw)
To: Kumara Parameshwaran; +Cc: dev
在 2024/1/9 上午12:04, Kumara Parameshwaran 写道:
> In the current implementation, when a packet is received with
> special TCP flag(s) set, only that packet is delivered out of order.
> There could be already coalesced packets in the GRO table
> belonging to the same flow that have not yet been delivered.
> This fix makes sure that the entire segment is delivered with the
> special flag(s) set, which is how Linux GRO is implemented as well.
>
> Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
> ---
> If the received packet is not a pure ACK packet, we check if
> there are any previous packets in the flow; if present, we include
> the received packet in the coalescing logic as well and apply the flags
> of the last received packet to the entire segment, which avoids
> re-ordering.
>
> Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
> P1 carries the PSH flag and, since there are no prior packets in the flow,
> it is copied to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
> In the existing code, P2 and P3 would be delivered as a single segment first and
> the unprocess_packets copied later, causing reordering. With the patch, the
> unprocessed packets are copied first and then the packets from the GRO table.
>
> Testing done
> The csum test-pmd was modified to support the following
> GET request of 10MB from client to server via test-pmd (static ARP entries added in client
> and server). Enable GRO and TSO in test-pmd, where the packets received from the client MAC
> are sent to the server MAC and vice versa.
> In the above testing, without the patch the client observed re-ordering of 25 packets,
> and with the patch no packet re-ordering was observed.
>
> v2:
> Fix warnings in commit and comment.
> Do not consider packet as candidate to merge if it contains SYN/RST flag.
>
> v3:
> Fix warnings.
>
> v4:
> Rebase with master.
>
> v5:
> Adding co-author email
> v6:
> Address review comments from the maintainer to restructure the code
> and handle only special flags PSH,FIN
>
> v7:
> Fix warnings and errors
>
> v8:
> Fix warnings and errors
>
> v9:
> Fix commit message
>
> v10:
> Update tcp header flags and address review comments
>
> v11:
> Fix warnings
>
> v12:
> Fix nit review comments
>
> v13:
> Fix warnings
>
> lib/gro/gro_tcp.h | 9 +++++++++
> lib/gro/gro_tcp4.c | 36 +++++++++++++++++++++++++++---------
> lib/gro/gro_tcp_internal.h | 2 +-
> lib/gro/gro_vxlan_tcp4.c | 5 +++--
> 4 files changed, 40 insertions(+), 12 deletions(-)
>
Reviewed-by: Jiayu Hu <hujiayu.hu@foxmail.com>
Thanks,
Jiayu
* Re: [PATCH v13] gro: fix reordering of packets in GRO layer
2024-01-16 14:28 ` 胡嘉瑜
@ 2024-02-12 14:30 ` Thomas Monjalon
0 siblings, 0 replies; 24+ messages in thread
From: Thomas Monjalon @ 2024-02-12 14:30 UTC (permalink / raw)
To: Kumara Parameshwaran; +Cc: dev, 胡嘉瑜, stable
16/01/2024 15:28, 胡嘉瑜:
>
> 在 2024/1/9 上午12:04, Kumara Parameshwaran 写道:
> > In the current implementation when a packet is received with
> > special TCP flag(s) set, only that packet is delivered out of order.
> > There could be already coalesced packets in the GRO table
> > belonging to the same flow but not delivered.
> > This fix makes sure that the entire segment is delivered with the
> > special flag(s) set which is how the Linux GRO is also implemented
> >
> > Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
>
> Reviewed-by: Jiayu Hu <hujiayu.hu@foxmail.com>
Applied with Cc stable, thanks.
* [PATCH v5] gro : fix reordering of packets in GRO library
2022-09-07 8:59 [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets Kumara Parameshwaran
@ 2022-11-01 7:03 ` Kumara Parameshwaran
0 siblings, 0 replies; 24+ messages in thread
From: Kumara Parameshwaran @ 2022-11-01 7:03 UTC (permalink / raw)
To: jiayu.hu; +Cc: dev, Kumara Parameshwaran, Kumara Parameshwaran
From: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
When a TCP packet contains flags like PSH, it is returned
immediately to the application even though there might be packets of
the same flow in the GRO table. If the PSH flag is set on a segment,
packets up to that segment should be delivered immediately. But the
current implementation delivers the last-arrived packet with the PSH
flag set, causing re-ordering.
With this patch, if a packet does not contain only the ACK flag and
there are no previous packets for the flow, the packet is returned
immediately; otherwise it is merged with the previous segment and the
flag on the last segment is set on the entire segment.
This is the behaviour of the Linux stack as well.
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Co-authored-by: Kumara Parameshwaran <kparameshwar@vmware.com>
---
v1:
If the received packet is not a pure ACK packet, we check if
there are any previous packets in the flow; if present, we include
the received packet in the coalescing logic as well and apply the flags
of the last received packet to the entire segment, which avoids
re-ordering.
Consider a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 carries the PSH flag and, since there are no prior packets in the flow,
it is copied to unprocess_packets, while P2(ACK) and P3(ACK) are merged together.
In the existing code, P2 and P3 would be delivered as a single segment first and
the unprocess_packets copied later, causing reordering. With the patch, the
unprocessed packets are copied first and then the packets from the GRO table.
Testing done
The csum test-pmd was modified to support the following
GET request of 10MB from client to server via test-pmd (static ARP entries added in client
and server). Enable GRO and TSO in test-pmd, where the packets received from the client MAC
are sent to the server MAC and vice versa.
In the above testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.
v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
Fix warnings.
v4:
Rebase with master.
v5:
Adding co-author email
lib/gro/gro_tcp4.c | 45 +++++++++++++++++++++++++++++++++++++--------
lib/gro/rte_gro.c | 18 +++++++++---------
2 files changed, 46 insertions(+), 17 deletions(-)
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 0014096e63..7363c5d540 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -188,6 +188,19 @@ update_header(struct gro_tcp4_item *item)
pkt->l2_len);
}
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+ struct rte_ether_hdr *eth_hdr;
+ struct rte_ipv4_hdr *ipv4_hdr;
+ struct rte_tcp_hdr *merged_tcp_hdr;
+
+ eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
+ ipv4_hdr = (struct rte_ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+ merged_tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+ merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+}
+
int32_t
gro_tcp4_reassemble(struct rte_mbuf *pkt,
struct gro_tcp4_tbl *tbl,
@@ -206,6 +219,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t i, max_flow_num, remaining_flow_num;
int cmp;
uint8_t find;
+ uint32_t start_idx;
/*
* Don't process the packet whose TCP header length is greater
@@ -219,13 +233,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
- /*
- * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
- * or CWR set.
- */
- if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
- return -1;
-
/* trim the tail padding bytes */
ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -264,12 +271,30 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+ start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
}
}
+ if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) {
+ /*
+ * Check and try merging the current TCP segment with the previous
+ * TCP segment if the TCP header does not contain RST and SYN flag
+ * There are cases where the last segment is sent with FIN|PSH|ACK
+ * which should also be considered for merging with previous segments.
+ */
+ if (find && !(tcp_hdr->tcp_flags & (RTE_TCP_RST_FLAG|RTE_TCP_SYN_FLAG)))
+ /*
+ * Since PSH flag is set, start time will be set to 0 so it will be flushed
+ * immediately.
+ */
+ tbl->items[start_idx].start_time = 0;
+ else
+ return -1;
+ }
+
/*
* Fail to find a matched flow. Insert a new flow and store the
* packet into the flow.
@@ -304,8 +329,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
is_atomic);
if (cmp) {
if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
- pkt, cmp, sent_seq, ip_id, 0))
+ pkt, cmp, sent_seq, ip_id, 0)) {
+ if (tbl->items[cur_idx].start_time == 0)
+ update_tcp_hdr_flags(tcp_hdr, tbl->items[cur_idx].firstseg);
return 1;
+ }
+
/*
* Fail to merge the two packets, as the packet
* length is greater than the max value. Store
diff --git a/lib/gro/rte_gro.c b/lib/gro/rte_gro.c
index e35399fd42..87c5502dce 100644
--- a/lib/gro/rte_gro.c
+++ b/lib/gro/rte_gro.c
@@ -283,10 +283,17 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
if ((nb_after_gro < nb_pkts)
|| (unprocess_num < nb_pkts)) {
i = 0;
+ /* Copy unprocessed packets */
+ if (unprocess_num > 0) {
+ memcpy(&pkts[i], unprocess_pkts,
+ sizeof(struct rte_mbuf *) *
+ unprocess_num);
+ i = unprocess_num;
+ }
/* Flush all packets from the tables */
if (do_vxlan_tcp_gro) {
- i = gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
- 0, pkts, nb_pkts);
+ i += gro_vxlan_tcp4_tbl_timeout_flush(&vxlan_tcp_tbl,
+ 0, &pkts[i], nb_pkts - i);
}
if (do_vxlan_udp_gro) {
@@ -304,13 +311,6 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
i += gro_udp4_tbl_timeout_flush(&udp_tbl, 0,
&pkts[i], nb_pkts - i);
}
- /* Copy unprocessed packets */
- if (unprocess_num > 0) {
- memcpy(&pkts[i], unprocess_pkts,
- sizeof(struct rte_mbuf *) *
- unprocess_num);
- }
- nb_after_gro = i + unprocess_num;
}
return nb_after_gro;
--
2.25.1
end of thread, other threads:[~2024-02-12 14:30 UTC | newest]
Thread overview: 24+ messages
2022-10-13 10:18 [PATCH] gro : fix reordering of packets in GRO library Kumara Parameshwaran
2022-10-13 10:20 ` kumaraparameshwaran rathinavel
2022-10-28 8:09 ` [PATCH v2] " Kumara Parameshwaran
2022-10-28 8:27 ` [PATCH v3] " Kumara Parameshwaran
2022-10-28 9:51 ` [PATCH v4] " Kumara Parameshwaran
2022-11-01 7:05 ` [PATCH v5] " Kumara Parameshwaran
2023-06-19 13:25 ` Thomas Monjalon
2023-06-20 7:35 ` Hu, Jiayu
2023-06-21 8:47 ` kumaraparameshwaran rathinavel
2023-06-30 11:32 ` kumaraparameshwaran rathinavel
2023-12-08 17:54 ` [PATCH v6] gro: fix reordering of packets in GRO layer Kumara Parameshwaran
2023-12-08 18:05 ` [PATCH v7] " Kumara Parameshwaran
2023-12-08 18:12 ` [PATCH v8] " Kumara Parameshwaran
2023-12-08 18:17 ` [PATCH v9] " Kumara Parameshwaran
2024-01-04 15:49 ` 胡嘉瑜
2024-01-07 11:21 ` [PATCH v10] " Kumara Parameshwaran
2024-01-07 11:29 ` [PATCH v11] " Kumara Parameshwaran
2024-01-07 17:20 ` Stephen Hemminger
2024-01-08 16:11 ` kumaraparameshwaran rathinavel
2024-01-08 15:50 ` [PATCH v12] " Kumara Parameshwaran
2024-01-08 16:04 ` [PATCH v13] " Kumara Parameshwaran
2024-01-16 14:28 ` 胡嘉瑜
2024-02-12 14:30 ` Thomas Monjalon
-- strict thread matches above, loose matches on Subject: below --
2022-09-07 8:59 [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets Kumara Parameshwaran
2022-11-01 7:03 ` [PATCH v5] gro : fix reordering of packets in GRO library Kumara Parameshwaran