* [RFC PATCH] ethdev: support Rx data discard
From: Gregory Etelson @ 2026-01-04 13:13 UTC
To: dev; +Cc: getelson, viacheslavo, Thomas Monjalon, Andrew Rybchenko
In some cases an application does not need to receive the entire packet
from the port hardware.
If the application could fetch only the required data and safely discard
the rest of the Rx packet data, port performance could improve through
reduced PCI bandwidth.

The RTE_ETH_DEV_DISCARD_RX_DATA device capability flag indicates that
the port hardware supports Rx data discard.
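
For illustration, an application could probe the proposed flag through
rte_eth_dev_info_get(); a minimal sketch, assuming the new bit is
reported in dev_info.dev_flags like the other RTE_ETH_DEV_* flags:

    #include <rte_ethdev.h>

    static int
    port_supports_rx_discard(uint16_t port_id)
    {
        struct rte_eth_dev_info dev_info;

        if (rte_eth_dev_info_get(port_id, &dev_info) != 0)
            return 0;
        /* dev_flags points to the RTE_ETH_DEV_* bits of the port. */
        return (*dev_info.dev_flags & RTE_ETH_DEV_DISCARD_RX_DATA) != 0;
    }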
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
lib/ethdev/rte_ethdev.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index a66c2abbdb..10938ddad3 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2170,6 +2170,8 @@ struct rte_eth_dev_owner {
* PMDs filling the queue xstats themselves should not set this flag
*/
#define RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS RTE_BIT32(6)
+/** Device supports Rx data discard */
+#define RTE_ETH_DEV_DISCARD_RX_DATA RTE_BIT32(7)
/**@}*/
/**
--
2.51.0
* Re: [RFC PATCH] ethdev: support Rx data discard
From: Stephen Hemminger @ 2026-01-04 18:04 UTC
To: Gregory Etelson
Cc: dev, viacheslavo, Thomas Monjalon, Andrew Rybchenko
On Sun, 4 Jan 2026 15:13:01 +0200
Gregory Etelson <getelson@nvidia.com> wrote:
> In some cases an application does not need to receive the entire packet
> from the port hardware.
> If the application could fetch only the required data and safely discard
> the rest of the Rx packet data, port performance could improve through
> reduced PCI bandwidth.
>
> The RTE_ETH_DEV_DISCARD_RX_DATA device capability flag indicates that
> the port hardware supports Rx data discard.
>
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> ---
> lib/ethdev/rte_ethdev.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index a66c2abbdb..10938ddad3 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -2170,6 +2170,8 @@ struct rte_eth_dev_owner {
> * PMDs filling the queue xstats themselves should not set this flag
> */
> #define RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS RTE_BIT32(6)
> +/** Device supports Rx data discard */
> +#define RTE_ETH_DEV_DISCARD_RX_DATA RTE_BIT32(7)
> /**@}*/
>
> /**
Just because HW can do a feature doesn't mean that DPDK has to support it.
There needs to be more justification.
Why not use a flow rule?
* [RFC PATCH v2] ethdev: support selective Rx data
From: Gregory Etelson @ 2026-01-05 16:51 UTC
To: getelson
Cc: andrew.rybchenko, dev, mkashani, thomas, viacheslavo, matan, stephen
In some cases an application does not need to receive the entire packet
from the port hardware.
If the application could receive only the required Rx data and safely
discard the rest of the Rx packet data, port performance could improve
through reduced PCI bandwidth and application memory consumption.

Selective Rx data allows the application to receive only pre-configured
packet segments and discard the rest.
For example:
- Deliver the first N bytes only.
- Deliver the last N bytes only.
- Deliver N1 bytes from offset Off1 and N2 bytes from offset Off2.
Selective Rx data is implemented on top of the existing Rx
BUFFER_SPLIT functionality:
- struct rte_eth_rxseg_split uses a NULL mempool for data segments
  that should be discarded.
- The PMD does not create mbuf segments when no data is read.
For example, to deliver the Ethernet header only:

Rx queue segments configuration:

    struct rte_eth_rxseg_split split[2] = {
        {
            .mp = <some mempool>,
            .length = sizeof(struct rte_ether_hdr)
        },
        {
            .mp = NULL, /* discard data */
            .length = <MTU>
        }
    };
Received mbuf configuration:

    mbuf[0].pkt_len = <original packet length>;
    mbuf[0].data_len = sizeof(struct rte_ether_hdr);
    mbuf[0].next = NULL; /* the next segment did not deliver data */
A PMD advertises the selective Rx data capability by setting the
rte_eth_rxseg_capa.selective_read bit.
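
For illustration, the configuration above could be applied through the
existing buffer-split queue setup; a minimal sketch, where hdr_pool,
the descriptor count, and the 1500-byte MTU are assumptions:

    #include <rte_common.h>
    #include <rte_ethdev.h>
    #include <rte_ether.h>
    #include <rte_lcore.h>

    static int
    setup_selective_rx_queue(uint16_t port_id, struct rte_mempool *hdr_pool)
    {
        /* First segment keeps the Ethernet header; the rest is discarded. */
        union rte_eth_rxseg rx_seg[2] = {
            { .split = { .mp = hdr_pool,
                         .length = sizeof(struct rte_ether_hdr) } },
            { .split = { .mp = NULL, /* discard data */
                         .length = 1500 /* assumed MTU */ } },
        };
        struct rte_eth_rxconf rxconf = { 0 };

        rxconf.rx_seg = rx_seg;
        rxconf.rx_nseg = RTE_DIM(rx_seg);
        rxconf.offloads = RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT;
        /* The mempool argument is NULL: segment pools come from rx_seg[]. */
        return rte_eth_rx_queue_setup(port_id, 0, 512, rte_socket_id(),
                                      &rxconf, NULL);
    }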
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
lib/ethdev/rte_ethdev.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index a66c2abbdb..bc456bb6d7 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1121,7 +1121,11 @@ struct rte_eth_txmode {
* The rest will be put into the last valid pool.
*/
struct rte_eth_rxseg_split {
- struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
+ /**
+ * Memory pool to allocate segment from.
+ * NULL means skipped segment in selective Rx data. @see selective_read.
+ */
+ struct rte_mempool *mp;
uint16_t length; /**< Segment data length, configures split point. */
uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
/**
@@ -1757,6 +1761,7 @@ struct rte_eth_rxseg_capa {
__extension__
uint32_t multi_pools:1; /**< Supports receiving to multiple pools.*/
uint32_t offset_allowed:1; /**< Supports buffer offsets. */
+ uint32_t selective_read:1; /**< Supports selective read. */
uint32_t offset_align_log2:4; /**< Required offset alignment. */
uint16_t max_nseg; /**< Maximum amount of segments to split. */
uint16_t reserved; /**< Reserved field. */
--
2.51.0
* Re: [RFC PATCH v2] ethdev: support selective Rx data
From: Thomas Monjalon @ 2026-01-06 15:04 UTC
To: Gregory Etelson
Cc: andrew.rybchenko, dev, mkashani, viacheslavo, matan, stephen,
David Marchand
05/01/2026 17:51, Gregory Etelson:
> @@ -1757,6 +1761,7 @@ struct rte_eth_rxseg_capa {
> __extension__
> uint32_t multi_pools:1; /**< Supports receiving to multiple pools.*/
> uint32_t offset_allowed:1; /**< Supports buffer offsets. */
> + uint32_t selective_read:1; /**< Supports selective read. */
> uint32_t offset_align_log2:4; /**< Required offset alignment. */
> uint16_t max_nseg; /**< Maximum amount of segments to split. */
> uint16_t reserved; /**< Reserved field. */
Adding a field in the middle is an ABI breakage.
If the ABI checker is smart enough, it should be OK to move the 1-bit field
to the end of the bit-fields.
If we still have an error, we may add an exception to
devtools/libabigail.abignore.
* [RFC PATCH v3] ethdev: support selective Rx data
From: Gregory Etelson @ 2026-01-06 15:45 UTC
To: getelson
Cc: andrew.rybchenko, dev, mkashani, thomas, viacheslavo, matan, stephen
In some cases an application does not need to receive the entire packet
from the port hardware.
If the application could receive only the required Rx data and safely
discard the rest of the Rx packet data, port performance could improve
through reduced PCI bandwidth and application memory consumption.

Selective Rx data allows the application to receive only pre-configured
packet segments and discard the rest.
For example:
- Deliver the first N bytes only.
- Deliver the last N bytes only.
- Deliver N1 bytes from offset Off1 and N2 bytes from offset Off2.
Selective Rx data is implemented on top of the existing Rx
BUFFER_SPLIT functionality:
- struct rte_eth_rxseg_split uses a NULL mempool for data segments
  that should be discarded.
- The PMD does not create mbuf segments when no data is read.
For example, to deliver the Ethernet header only:

Rx queue segments configuration:

    struct rte_eth_rxseg_split split[2] = {
        {
            .mp = <some mempool>,
            .length = sizeof(struct rte_ether_hdr)
        },
        {
            .mp = NULL, /* discard data */
            .length = <MTU>
        }
    };
Received mbuf configuration:

    mbuf[0].pkt_len = <original packet length>;
    mbuf[0].data_len = sizeof(struct rte_ether_hdr);
    mbuf[0].next = NULL; /* the next segment did not deliver data */
A PMD advertises the selective Rx data capability by setting the
rte_eth_rxseg_capa.selective_read bit.
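
For illustration, an application could gate such a queue configuration
on the new bit; a minimal sketch using the existing
rte_eth_dev_info_get() path:

    #include <rte_ethdev.h>

    static int
    port_supports_selective_rx(uint16_t port_id)
    {
        struct rte_eth_dev_info dev_info;

        if (rte_eth_dev_info_get(port_id, &dev_info) != 0)
            return 0;
        /* rx_seg_capa reports the PMD buffer-split capabilities. */
        return dev_info.rx_seg_capa.selective_read;
    }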
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
v3: Change the selective_read bit location.
---
lib/ethdev/rte_ethdev.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index a66c2abbdb..84769d3d26 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1121,7 +1121,11 @@ struct rte_eth_txmode {
* The rest will be put into the last valid pool.
*/
struct rte_eth_rxseg_split {
- struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
+ /**
+ * Memory pool to allocate segment from.
+ * NULL means skipped segment in selective Rx data. @see selective_read.
+ */
+ struct rte_mempool *mp;
uint16_t length; /**< Segment data length, configures split point. */
uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
/**
@@ -1758,6 +1762,7 @@ struct rte_eth_rxseg_capa {
uint32_t multi_pools:1; /**< Supports receiving to multiple pools.*/
uint32_t offset_allowed:1; /**< Supports buffer offsets. */
uint32_t offset_align_log2:4; /**< Required offset alignment. */
+ uint32_t selective_read:1; /**< Supports selective read. */
uint16_t max_nseg; /**< Maximum amount of segments to split. */
uint16_t reserved; /**< Reserved field. */
};
--
2.51.0
* Re: [RFC PATCH v3] ethdev: support selective Rx data
From: Stephen Hemminger @ 2026-01-06 16:33 UTC
To: Gregory Etelson
Cc: andrew.rybchenko, dev, mkashani, thomas, viacheslavo, matan
On Tue, 6 Jan 2026 17:45:54 +0200
Gregory Etelson <getelson@nvidia.com> wrote:
> In some cases an application does not need to receive the entire packet
> from the port hardware.
> If the application could receive only the required Rx data and safely
> discard the rest of the Rx packet data, port performance could improve
> through reduced PCI bandwidth and application memory consumption.
>
> Selective Rx data allows the application to receive only pre-configured
> packet segments and discard the rest.
> For example:
> - Deliver the first N bytes only.
> - Deliver the last N bytes only.
> - Deliver N1 bytes from offset Off1 and N2 bytes from offset Off2.
>
> Selective Rx data is implemented on top of the existing Rx
> BUFFER_SPLIT functionality:
> - struct rte_eth_rxseg_split uses a NULL mempool for data segments
>   that should be discarded.
> - The PMD does not create mbuf segments when no data is read.
>
> For example, to deliver the Ethernet header only:
>
> Rx queue segments configuration:
>
>     struct rte_eth_rxseg_split split[2] = {
>         {
>             .mp = <some mempool>,
>             .length = sizeof(struct rte_ether_hdr)
>         },
>         {
>             .mp = NULL, /* discard data */
>             .length = <MTU>
>         }
>     };
>
> Received mbuf configuration:
>
>     mbuf[0].pkt_len = <original packet length>;
>     mbuf[0].data_len = sizeof(struct rte_ether_hdr);
>     mbuf[0].next = NULL; /* the next segment did not deliver data */
>
> A PMD advertises the selective Rx data capability by setting the
> rte_eth_rxseg_capa.selective_read bit.
>
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> ---
> v3: Change the selective_read bit location.
> ---
> lib/ethdev/rte_ethdev.h | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index a66c2abbdb..84769d3d26 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1121,7 +1121,11 @@ struct rte_eth_txmode {
> * The rest will be put into the last valid pool.
> */
> struct rte_eth_rxseg_split {
> - struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
> + /**
> + * Memory pool to allocate segment from.
> + * NULL means skipped segment in selective Rx data. @see selective_read.
> + */
> + struct rte_mempool *mp;
> uint16_t length; /**< Segment data length, configures split point. */
> uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
> /**
> @@ -1758,6 +1762,7 @@ struct rte_eth_rxseg_capa {
> uint32_t multi_pools:1; /**< Supports receiving to multiple pools.*/
> uint32_t offset_allowed:1; /**< Supports buffer offsets. */
> uint32_t offset_align_log2:4; /**< Required offset alignment. */
> + uint32_t selective_read:1; /**< Supports selective read. */
> uint16_t max_nseg; /**< Maximum amount of segments to split. */
> uint16_t reserved; /**< Reserved field. */
> };
This will need a testpmd extension and at least one driver (ideally
several) that supports it.
* Re: [RFC PATCH v3] ethdev: support selective Rx data
From: Etelson, Gregory @ 2026-01-06 16:52 UTC
To: Stephen Hemminger
Cc: Gregory Etelson, andrew.rybchenko, dev, mkashani, thomas,
viacheslavo, matan
Hello Stephen,
>
>> In some cases an application does not need to receive the entire packet
>> from the port hardware.
>> If the application could receive only the required Rx data and safely
>> discard the rest of the Rx packet data, port performance could improve
>> through reduced PCI bandwidth and application memory consumption.
>>
[ snip ]
>
> This will need a testpmd extension and at least one driver (ideally
> several) that supports it.
>
Ethdev library, MLX5 PMD, and testpmd patches with the selective Rx data
implementation will be provided after the RFC is accepted.
Additional PMD implementations depend on hardware support for safely
discarding Rx data.
Regards,
Gregory