* [PATCH 1/7] net/gve: send whole packet when mbuf is large
From: Joshua Washington @ 2025-07-07 23:18 UTC (permalink / raw)
To: Thomas Monjalon, Jeroen de Borst, Joshua Washington, Junfeng Guo,
Rushil Gupta
Cc: dev, junfeng.guo, stable, Ankit Garg
Before this patch, only one descriptor would be written per mbuf in a
packet. In cases like TSO, a single mbuf can hold more bytes than
GVE_TX_MAX_BUF_SIZE_DQO. Instead of simply truncating the data down to
this size, the driver should write additional descriptors covering the
rest of the data in the mbuf.
To that effect, the number of descriptors needed to send a packet must
be corrected to account for the potential additional descriptors.
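For example, assuming a 16 KB GVE_TX_MAX_BUF_SIZE_DQO (a value used
here purely for illustration), a 40000-byte TSO mbuf now produces
ceil(40000 / 16384) = 3 data descriptors of 16384, 16384 and 7232
bytes, rather than a single descriptor truncated to 16384 bytes.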
Fixes: 4022f9999f56 ("net/gve: support basic Tx data path for DQO")
Cc: junfeng.guo@intel.com
Cc: stable@dpdk.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Ankit Garg <nktgrg@google.com>
---
.mailmap | 1 +
drivers/net/gve/gve_tx_dqo.c | 54 ++++++++++++++++++++++++------------
2 files changed, 38 insertions(+), 17 deletions(-)
diff --git a/.mailmap b/.mailmap
index 1ea4f9446d..758878bd8b 100644
--- a/.mailmap
+++ b/.mailmap
@@ -124,6 +124,7 @@ Andy Green <andy@warmcat.com>
Andy Moreton <andy.moreton@amd.com> <amoreton@xilinx.com> <amoreton@solarflare.com>
Andy Pei <andy.pei@intel.com>
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
+Ankit Garg <nktgrg@google.com>
Ankur Dwivedi <adwivedi@marvell.com> <ankur.dwivedi@caviumnetworks.com> <ankur.dwivedi@cavium.com>
Anna Lukin <annal@silicom.co.il>
Anoob Joseph <anoobj@marvell.com> <anoob.joseph@caviumnetworks.com>
diff --git a/drivers/net/gve/gve_tx_dqo.c b/drivers/net/gve/gve_tx_dqo.c
index 6984f92443..6227fa73b0 100644
--- a/drivers/net/gve/gve_tx_dqo.c
+++ b/drivers/net/gve/gve_tx_dqo.c
@@ -74,6 +74,19 @@ gve_tx_clean_dqo(struct gve_tx_queue *txq)
txq->complq_tail = next;
}
+static uint16_t
+gve_tx_pkt_nb_data_descs(struct rte_mbuf *tx_pkt)
+{
+ int nb_descs = 0;
+
+ while (tx_pkt) {
+ nb_descs += (GVE_TX_MAX_BUF_SIZE_DQO - 1 + tx_pkt->data_len) /
+ GVE_TX_MAX_BUF_SIZE_DQO;
+ tx_pkt = tx_pkt->next;
+ }
+ return nb_descs;
+}
+
static inline void
gve_tx_fill_seg_desc_dqo(volatile union gve_tx_desc_dqo *desc, struct rte_mbuf *tx_pkt)
{
@@ -97,7 +110,7 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
uint16_t nb_to_clean;
uint16_t nb_tx = 0;
uint64_t ol_flags;
- uint16_t nb_used;
+ uint16_t nb_descs;
uint16_t tx_id;
uint16_t sw_id;
uint64_t bytes;
@@ -124,14 +137,14 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
}
ol_flags = tx_pkt->ol_flags;
- nb_used = tx_pkt->nb_segs;
first_sw_id = sw_id;
tso = !!(ol_flags & RTE_MBUF_F_TX_TCP_SEG);
csum = !!(ol_flags & GVE_TX_CKSUM_OFFLOAD_MASK_DQO);
- nb_used += tso;
- if (txq->nb_free < nb_used)
+ nb_descs = gve_tx_pkt_nb_data_descs(tx_pkt);
+ nb_descs += tso;
+ if (txq->nb_free < nb_descs)
break;
if (tso) {
@@ -144,21 +157,28 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
if (sw_ring[sw_id] != NULL)
PMD_DRV_LOG(DEBUG, "Overwriting an entry in sw_ring");
- txd = &txr[tx_id];
sw_ring[sw_id] = tx_pkt;
- /* fill Tx descriptor */
- txd->pkt.buf_addr = rte_cpu_to_le_64(rte_mbuf_data_iova(tx_pkt));
- txd->pkt.dtype = GVE_TX_PKT_DESC_DTYPE_DQO;
- txd->pkt.compl_tag = rte_cpu_to_le_16(first_sw_id);
- txd->pkt.buf_size = RTE_MIN(tx_pkt->data_len, GVE_TX_MAX_BUF_SIZE_DQO);
- txd->pkt.end_of_packet = 0;
- txd->pkt.checksum_offload_enable = csum;
+ /* fill Tx descriptors */
+ int mbuf_offset = 0;
+ while (mbuf_offset < tx_pkt->data_len) {
+ uint64_t buf_addr = rte_mbuf_data_iova(tx_pkt) +
+ mbuf_offset;
+
+ txd = &txr[tx_id];
+ txd->pkt.buf_addr = rte_cpu_to_le_64(buf_addr);
+ txd->pkt.compl_tag = rte_cpu_to_le_16(first_sw_id);
+ txd->pkt.dtype = GVE_TX_PKT_DESC_DTYPE_DQO;
+ txd->pkt.buf_size = RTE_MIN(tx_pkt->data_len - mbuf_offset,
+ GVE_TX_MAX_BUF_SIZE_DQO);
+ txd->pkt.end_of_packet = 0;
+ txd->pkt.checksum_offload_enable = csum;
+
+ mbuf_offset += txd->pkt.buf_size;
+ tx_id = (tx_id + 1) & mask;
+ }
- /* size of desc_ring and sw_ring could be different */
- tx_id = (tx_id + 1) & mask;
sw_id = (sw_id + 1) & sw_mask;
-
bytes += tx_pkt->data_len;
tx_pkt = tx_pkt->next;
} while (tx_pkt);
@@ -166,8 +186,8 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
/* fill the last descriptor with End of Packet (EOP) bit */
txd->pkt.end_of_packet = 1;
- txq->nb_free -= nb_used;
- txq->nb_used += nb_used;
+ txq->nb_free -= nb_descs;
+ txq->nb_used += nb_descs;
}
/* update the tail pointer if any packets were processed */
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH 2/7] net/gve: clean when there are insufficient Tx descs
From: Joshua Washington @ 2025-07-07 23:18 UTC (permalink / raw)
To: Jeroen de Borst, Joshua Washington, Rushil Gupta, Junfeng Guo
Cc: dev, junfeng.guo, stable, Ankit Garg
A single packet can require more than 32 (free_thresh) descriptors to
send. Count the number of descriptors needed to send a packet on the
DQO Tx path, and ensure that there are enough free descriptors in the
ring before writing. If there are still not enough free descriptors
after cleaning, drop the packet and increment the drop counters.
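For instance (numbers purely illustrative), a TSO packet whose payload
is scattered across 40 small mbufs needs 40 data descriptors plus one
TSO context descriptor -- 41 in total, more than a free_thresh of 32 --
so cleaning only when nb_free drops to free_thresh is not sufficient on
its own; the ring must also be cleaned against the packet's actual
descriptor count.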
Fixes: 4022f9999f56 ("net/gve: support basic Tx data path for DQO")
Cc: junfeng.guo@intel.com
Cc: stable@dpdk.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Ankit Garg <nktgrg@google.com>
---
drivers/net/gve/gve_tx_dqo.c | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/drivers/net/gve/gve_tx_dqo.c b/drivers/net/gve/gve_tx_dqo.c
index 6227fa73b0..652a0e5175 100644
--- a/drivers/net/gve/gve_tx_dqo.c
+++ b/drivers/net/gve/gve_tx_dqo.c
@@ -74,6 +74,12 @@ gve_tx_clean_dqo(struct gve_tx_queue *txq)
txq->complq_tail = next;
}
+static inline void
+gve_tx_clean_descs_dqo(struct gve_tx_queue *txq, uint16_t nb_descs) {
+ while (--nb_descs)
+ gve_tx_clean_dqo(txq);
+}
+
static uint16_t
gve_tx_pkt_nb_data_descs(struct rte_mbuf *tx_pkt)
{
@@ -107,7 +113,6 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
struct rte_mbuf **sw_ring;
struct rte_mbuf *tx_pkt;
uint16_t mask, sw_mask;
- uint16_t nb_to_clean;
uint16_t nb_tx = 0;
uint64_t ol_flags;
uint16_t nb_descs;
@@ -130,11 +135,9 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
for (nb_tx = 0; nb_tx < nb_pkts; nb_tx++) {
tx_pkt = tx_pkts[nb_tx];
- if (txq->nb_free <= txq->free_thresh) {
- nb_to_clean = DQO_TX_MULTIPLIER * txq->rs_thresh;
- while (nb_to_clean--)
- gve_tx_clean_dqo(txq);
- }
+ if (txq->nb_free <= txq->free_thresh)
+ gve_tx_clean_descs_dqo(txq, DQO_TX_MULTIPLIER *
+ txq->rs_thresh);
ol_flags = tx_pkt->ol_flags;
first_sw_id = sw_id;
@@ -144,8 +147,16 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
nb_descs = gve_tx_pkt_nb_data_descs(tx_pkt);
nb_descs += tso;
- if (txq->nb_free < nb_descs)
- break;
+
+ /* Clean if there aren't enough descriptors to send the packet. */
+ if (unlikely(txq->nb_free < nb_descs)) {
+ int nb_to_clean = RTE_MAX(DQO_TX_MULTIPLIER * txq->rs_thresh,
+ nb_descs);
+
+ gve_tx_clean_descs_dqo(txq, nb_to_clean);
+ if (txq->nb_free < nb_descs)
+ break;
+ }
if (tso) {
txd = &txr[tx_id];
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH 3/7] net/gve: don't write zero-length descriptors
From: Joshua Washington @ 2025-07-07 23:18 UTC (permalink / raw)
To: Jeroen de Borst, Joshua Washington, Junfeng Guo, Rushil Gupta
Cc: dev, junfeng.guo, stable, Ankit Garg
Writing zero-length descriptors to the hardware can cause it to reject
the packet and stop transmitting altogether.
Fixes: 4022f9999f56 ("net/gve: support basic Tx data path for DQO")
Cc: junfeng.guo@intel.com
Cc: stable@dpdk.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Ankit Garg <nktgrg@google.com>
---
drivers/net/gve/gve_tx_dqo.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/net/gve/gve_tx_dqo.c b/drivers/net/gve/gve_tx_dqo.c
index 652a0e5175..b2e5aae634 100644
--- a/drivers/net/gve/gve_tx_dqo.c
+++ b/drivers/net/gve/gve_tx_dqo.c
@@ -168,6 +168,11 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
if (sw_ring[sw_id] != NULL)
PMD_DRV_LOG(DEBUG, "Overwriting an entry in sw_ring");
+ /* Skip writing descriptor if mbuf has no data. */
+ if (!tx_pkt->data_len)
+ goto finish_mbuf;
+
+ txd = &txr[tx_id];
sw_ring[sw_id] = tx_pkt;
/* fill Tx descriptors */
@@ -189,12 +194,14 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
tx_id = (tx_id + 1) & mask;
}
+finish_mbuf:
sw_id = (sw_id + 1) & sw_mask;
bytes += tx_pkt->data_len;
tx_pkt = tx_pkt->next;
} while (tx_pkt);
- /* fill the last descriptor with End of Packet (EOP) bit */
+ /* fill the last written descriptor with End of Packet (EOP) bit */
+ txd = &txr[(tx_id - 1) & mask];
txd->pkt.end_of_packet = 1;
txq->nb_free -= nb_descs;
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH 4/7] net/gve: validate Tx packet before sending
From: Joshua Washington @ 2025-07-07 23:18 UTC (permalink / raw)
To: Jeroen de Borst, Joshua Washington, Rushil Gupta, Junfeng Guo
Cc: dev, junfeng.guo, stable, Ankit Garg
The hardware treats a mismatch between the reported packet length and
the total amount of data in the descriptors as the work of a malicious
driver and disables transmission altogether. To avoid such a scenario,
use rte_mbuf_check() to validate that the mbuf is correctly formed
before processing it.
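A minimal sketch of the check in isolation (tx_pkt_is_sane() is a
hypothetical helper used only for illustration):

	#include <stdio.h>
	#include <stdbool.h>
	#include <rte_mbuf.h>

	/* With is_header = 1, rte_mbuf_check() walks the whole segment
	 * chain and verifies, among other things, that pkt_len matches
	 * the sum of the per-segment data_len values and that nb_segs
	 * matches the chain length -- the mismatches the hardware
	 * treats as malicious.
	 */
	static bool
	tx_pkt_is_sane(const struct rte_mbuf *m)
	{
		const char *reason;

		if (rte_mbuf_check(m, 1, &reason) != 0) {
			printf("invalid mbuf: %s\n", reason);
			return false;
		}
		return true;
	}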
Fixes: 4022f9999f56 ("net/gve: support basic Tx data path for DQO")
Cc: junfeng.guo@intel.com
Cc: stable@dpdk.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Ankit Garg <nktgrg@google.com>
---
drivers/net/gve/gve_tx_dqo.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/net/gve/gve_tx_dqo.c b/drivers/net/gve/gve_tx_dqo.c
index b2e5aae634..7e03e75b20 100644
--- a/drivers/net/gve/gve_tx_dqo.c
+++ b/drivers/net/gve/gve_tx_dqo.c
@@ -113,13 +113,14 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
struct rte_mbuf **sw_ring;
struct rte_mbuf *tx_pkt;
uint16_t mask, sw_mask;
+ uint16_t first_sw_id;
+ const char *reason;
uint16_t nb_tx = 0;
uint64_t ol_flags;
uint16_t nb_descs;
uint16_t tx_id;
uint16_t sw_id;
uint64_t bytes;
- uint16_t first_sw_id;
uint8_t tso;
uint8_t csum;
@@ -139,6 +140,12 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
gve_tx_clean_descs_dqo(txq, DQO_TX_MULTIPLIER *
txq->rs_thresh);
+
+ if (rte_mbuf_check(tx_pkt, true, &reason)) {
+ PMD_DRV_LOG(DEBUG, "Invalid mbuf: %s", reason);
+ break;
+ }
+
ol_flags = tx_pkt->ol_flags;
first_sw_id = sw_id;
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH 5/7] net/gve: add DQO Tx descriptor limit
From: Joshua Washington @ 2025-07-07 23:18 UTC (permalink / raw)
To: Jeroen de Borst, Joshua Washington, Junfeng Guo, Rushil Gupta
Cc: dev, junfeng.guo, stable, Ankit Garg
The hardware supports at most 10 data descriptors per MTU-sized
segment. GVE_TX_MAX_DATA_DESCS was defined in the initial
implementation, but the descriptor limit was never actually enforced.
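For applications, this limit is surfaced through the standard device
info query; a minimal sketch (port id 0 assumed only for illustration):

	#include <stdio.h>
	#include <rte_ethdev.h>

	static void
	print_tx_seg_limit(uint16_t port_id)
	{
		struct rte_eth_dev_info dev_info;

		/* With this patch the gve PMD reports its data-descriptor
		 * limit in nb_mtu_seg_max, so applications can cap the
		 * number of segments in an MTU-sized packet accordingly. */
		if (rte_eth_dev_info_get(port_id, &dev_info) == 0)
			printf("max segs per MTU-sized packet: %u\n",
			       dev_info.tx_desc_lim.nb_mtu_seg_max);
	}

	/* e.g. print_tx_seg_limit(0); */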
Fixes: 4022f9999f56 ("net/gve: support basic Tx data path for DQO")
Cc: junfeng.guo@intel.com
Cc: stable@dpdk.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Ankit Garg <nktgrg@google.com>
---
drivers/net/gve/gve_ethdev.c | 1 +
drivers/net/gve/gve_ethdev.h | 1 +
drivers/net/gve/gve_tx_dqo.c | 6 ++++++
3 files changed, 8 insertions(+)
diff --git a/drivers/net/gve/gve_ethdev.c b/drivers/net/gve/gve_ethdev.c
index bdb7f1d075..81325ba98c 100644
--- a/drivers/net/gve/gve_ethdev.c
+++ b/drivers/net/gve/gve_ethdev.c
@@ -603,6 +603,7 @@ gve_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
.nb_max = priv->max_tx_desc_cnt,
.nb_min = priv->min_tx_desc_cnt,
.nb_align = 1,
+ .nb_mtu_seg_max = GVE_TX_MAX_DATA_DESCS,
};
dev_info->flow_type_rss_offloads = GVE_RTE_RSS_OFFLOAD_ALL;
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index 35cb9062b1..195dadc4d4 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -102,6 +102,7 @@ struct gve_tx_stats {
uint64_t packets;
uint64_t bytes;
uint64_t errors;
+ uint64_t too_many_descs;
};
struct gve_rx_stats {
diff --git a/drivers/net/gve/gve_tx_dqo.c b/drivers/net/gve/gve_tx_dqo.c
index 7e03e75b20..27f98cdeb3 100644
--- a/drivers/net/gve/gve_tx_dqo.c
+++ b/drivers/net/gve/gve_tx_dqo.c
@@ -165,6 +165,12 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
break;
}
+ /* Drop packet if it doesn't adhere to hardware limits. */
+ if (!tso && nb_descs > GVE_TX_MAX_DATA_DESCS) {
+ txq->stats.too_many_descs++;
+ break;
+ }
+
if (tso) {
txd = &txr[tx_id];
gve_tx_fill_seg_desc_dqo(txd, tx_pkt);
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH 6/7] net/gve: fix DQO TSO descriptor limit
From: Joshua Washington @ 2025-07-07 23:18 UTC (permalink / raw)
To: Jeroen de Borst, Joshua Washington, Varun Lakkur Ambaji Rao,
Tathagat Priyadarshi, Rushil Gupta
Cc: dev, stable, Ankit Garg
The DQ queue format expects that any MTU-sized packet or segment will
span at most 10 data descriptors.
In the non-TSO case, this simply means that a given packet can use at
most 10 data descriptors.
In the TSO case, things are a bit more complex. Large TSO packets must
be parsed and split into tso_segsz-sized (MSS) segments, and for each
such MSS segment, the number of descriptors that would be used to
transmit it must be counted. The following restrictions apply when
counting descriptors:
1) Every TSO segment (including the very first) will be prepended by a
_separate_ data descriptor holding only header data,
2) The hardware can send at most 16K bytes in a single data
descriptor, and
3) The start of every mbuf counts as a separator between data
descriptors -- data is not assumed to be coalesced or copied.
The value of nb_mtu_seg_max is set to GVE_TX_MAX_DATA_DESCS-1 to
account for the extra header descriptor prepended to the beginning of
each segment in the TSO case.
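As a worked example (numbers chosen purely for illustration): with
tso_segsz = 9000 and the payload spread across 2048-byte mbufs, one MSS
segment touches at most ceil(9000 / 2048) + 1 = 6 mbufs, so it needs at
most 1 header descriptor + 6 data descriptors = 7, within the limit of
10. If the same payload were spread across 256-byte mbufs, a single
segment would need roughly 1 + ceil(9000 / 256) = 37 descriptors (one
per mbuf touched, plus the header), so the packet must be dropped.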
Fixes: 403c671a46b6 ("net/gve: support TSO in DQO RDA")
Cc: tathagat.dpdk@gmail.com
Cc: stable@dpdk.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Ankit Garg <nktgrg@google.com>
---
drivers/net/gve/gve_ethdev.c | 2 +-
drivers/net/gve/gve_tx_dqo.c | 64 +++++++++++++++++++++++++++++++++++-
2 files changed, 64 insertions(+), 2 deletions(-)
diff --git a/drivers/net/gve/gve_ethdev.c b/drivers/net/gve/gve_ethdev.c
index 81325ba98c..ef1c543aac 100644
--- a/drivers/net/gve/gve_ethdev.c
+++ b/drivers/net/gve/gve_ethdev.c
@@ -603,7 +603,7 @@ gve_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
.nb_max = priv->max_tx_desc_cnt,
.nb_min = priv->min_tx_desc_cnt,
.nb_align = 1,
- .nb_mtu_seg_max = GVE_TX_MAX_DATA_DESCS,
+ .nb_mtu_seg_max = GVE_TX_MAX_DATA_DESCS - 1,
};
dev_info->flow_type_rss_offloads = GVE_RTE_RSS_OFFLOAD_ALL;
diff --git a/drivers/net/gve/gve_tx_dqo.c b/drivers/net/gve/gve_tx_dqo.c
index 27f98cdeb3..3befbbcacb 100644
--- a/drivers/net/gve/gve_tx_dqo.c
+++ b/drivers/net/gve/gve_tx_dqo.c
@@ -80,6 +80,68 @@ gve_tx_clean_descs_dqo(struct gve_tx_queue *txq, uint16_t nb_descs) {
gve_tx_clean_dqo(txq);
}
+/* GVE expects at most 10 data descriptors per mtu-sized segment. Beyond this,
+ * the hardware will assume the driver is malicious and stop transmitting
+ * packets altogether. Validate that a packet can be sent to avoid sending
+ * posting descriptors for an invalid packet.
+ */
+static inline bool
+gve_tx_validate_descs(struct rte_mbuf *tx_pkt, uint16_t nb_descs, bool is_tso)
+{
+ if (!is_tso)
+ return nb_descs <= GVE_TX_MAX_DATA_DESCS;
+
+ int tso_segsz = tx_pkt->tso_segsz;
+ int num_descs, seg_offset, mbuf_len;
+ int headlen = tx_pkt->l2_len + tx_pkt->l3_len + tx_pkt->l4_len;
+
+ /* Headers will be split into their own buffer. */
+ num_descs = 1;
+ seg_offset = 0;
+ mbuf_len = tx_pkt->data_len - headlen;
+
+ while (tx_pkt) {
+ if (!mbuf_len)
+ goto next_mbuf;
+
+ int seg_remain = tso_segsz - seg_offset;
+ if (num_descs == GVE_TX_MAX_DATA_DESCS && seg_remain)
+ return false;
+
+ if (seg_remain < mbuf_len) {
+ seg_offset = mbuf_len % tso_segsz;
+ /* The MSS is bound from above by 9728B, so a
+ * single TSO segment in the middle of an mbuf
+ * will be part of at most two descriptors, and
+ * is not at risk of defying this limitation.
+ * Thus, such segments are ignored.
+ */
+ int mbuf_remain = tx_pkt->data_len % GVE_TX_MAX_BUF_SIZE_DQO;
+
+ /* For each TSO segment, HW will prepend
+ * headers. The remaining bytes of this mbuf
+ * will be the start of the payload of the next
+ * TSO segment. In addition, if the final
+ * segment in this mbuf is divided between two
+ * descriptors, both must be counted.
+ */
+ num_descs = 1 + !!(seg_offset) +
+ (mbuf_remain < seg_offset && mbuf_remain);
+ } else {
+ seg_offset += mbuf_len;
+ num_descs++;
+ }
+
+next_mbuf:
+ tx_pkt = tx_pkt->next;
+ if (tx_pkt)
+ mbuf_len = tx_pkt->data_len;
+ }
+
+
+ return true;
+}
+
static uint16_t
gve_tx_pkt_nb_data_descs(struct rte_mbuf *tx_pkt)
{
@@ -166,7 +228,7 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
}
/* Drop packet if it doesn't adhere to hardware limits. */
- if (!tso && nb_descs > GVE_TX_MAX_DATA_DESCS) {
+ if (!gve_tx_validate_descs(tx_pkt, nb_descs, tso)) {
txq->stats.too_many_descs++;
break;
}
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH 7/7] net/gve: clear DQO Tx descriptors before writing
From: Joshua Washington @ 2025-07-07 23:18 UTC (permalink / raw)
To: Jeroen de Borst, Joshua Washington, Tathagat Priyadarshi,
Rushil Gupta, Varun Lakkur Ambaji Rao
Cc: dev, stable, Ankit Garg
When TSO was introduced, it became possible for two different
descriptor formats to be written to the descriptor ring,
GVE_TX_PKT_DESC_DTYPE_DQO and GVE_TX_TSO_CTX_DESC_DTYPE_DQO. Because
these descriptor types lay out their fields differently, stale fields
left over from a previously written descriptor can be misinterpreted by
the hardware if the descriptor is not fully cleared before being
reused.
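A self-contained toy illustrating the stale-field problem (the
structures below are invented stand-ins, not the real gve descriptor
layouts):

	#include <stdio.h>
	#include <stdint.h>
	#include <string.h>

	struct pkt_desc {	/* stand-in for a data descriptor */
		uint64_t buf_addr;
		uint16_t buf_size;
		uint16_t compl_tag;
		uint32_t flags;
	};

	struct tso_desc {	/* stand-in for a TSO context descriptor */
		uint64_t dummy;
		uint32_t tso_total_len;
		uint32_t cmd;
	};

	union tx_desc {
		struct pkt_desc pkt;
		struct tso_desc tso;
		uint8_t raw[16];
	};

	int main(void)
	{
		union tx_desc d;

		memset(&d, 0, sizeof(d));
		/* The slot previously held a TSO context descriptor. */
		d.tso.cmd = 0xdeadbeef;

		/* Reused as a packet descriptor without clearing: the
		 * bytes not covered by the fields written below still
		 * hold the old cmd value. */
		d.pkt.buf_addr = 0x1000;
		d.pkt.buf_size = 1500;
		d.pkt.compl_tag = 7;
		printf("stale tail bytes: %02x %02x %02x %02x\n",
		       d.raw[12], d.raw[13], d.raw[14], d.raw[15]);

		/* The fix, mirroring txd->pkt = (struct gve_tx_pkt_desc_dqo) {};
		 * in the patch: zero the member before filling it. */
		d.pkt = (struct pkt_desc){ 0 };
		d.pkt.buf_addr = 0x1000;
		d.pkt.buf_size = 1500;
		d.pkt.compl_tag = 7;
		printf("after clearing:   %02x %02x %02x %02x\n",
		       d.raw[12], d.raw[13], d.raw[14], d.raw[15]);
		return 0;
	}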
Fixes: 403c671a46b6 ("net/gve: support TSO in DQO RDA")
Cc: tathagat.dpdk@gmail.com
Cc: stable@dpdk.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Ankit Garg <nktgrg@google.com>
---
drivers/net/gve/gve_tx_dqo.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/gve/gve_tx_dqo.c b/drivers/net/gve/gve_tx_dqo.c
index 3befbbcacb..169c40d5b0 100644
--- a/drivers/net/gve/gve_tx_dqo.c
+++ b/drivers/net/gve/gve_tx_dqo.c
@@ -159,6 +159,8 @@ static inline void
gve_tx_fill_seg_desc_dqo(volatile union gve_tx_desc_dqo *desc, struct rte_mbuf *tx_pkt)
{
uint32_t hlen = tx_pkt->l2_len + tx_pkt->l3_len + tx_pkt->l4_len;
+
+ desc->tso_ctx = (struct gve_tx_tso_context_desc_dqo) {};
desc->tso_ctx.cmd_dtype.dtype = GVE_TX_TSO_CTX_DESC_DTYPE_DQO;
desc->tso_ctx.cmd_dtype.tso = 1;
desc->tso_ctx.mss = (uint16_t)tx_pkt->tso_segsz;
@@ -257,6 +259,7 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
mbuf_offset;
txd = &txr[tx_id];
+ txd->pkt = (struct gve_tx_pkt_desc_dqo) {};
txd->pkt.buf_addr = rte_cpu_to_le_64(buf_addr);
txd->pkt.compl_tag = rte_cpu_to_le_16(first_sw_id);
txd->pkt.dtype = GVE_TX_PKT_DESC_DTYPE_DQO;
--
2.50.0.727.gbf7dc18ff4-goog