DPDK patches and discussions
 help / color / mirror / Atom feed
From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
To: Stephen Hemminger <stephen@networkplumber.org>, dev@dpdk.org
Cc: bruce.richardson@intel.com, Thomas Monjalon <thomas@monjalon.net>,
	John McNamara <john.mcnamara@intel.com>
Subject: Re: [RFC] ethdev: fix UAF in rte_eth_tx_burst with filtering
Date: Sun, 9 Nov 2025 18:22:41 +0300	[thread overview]
Message-ID: <11bf07b2-30fe-4935-a43c-fe64a37cc0b4@oktetlabs.ru> (raw)
In-Reply-To: <20251029235400.1486861-1-stephen@networkplumber.org>

If I remember correctly we should avoid function names in summary.

(I'm really surprised that callbacks and rte_eth_tx_burst() are
supposed to reorder mbufs in tx_pkts array, but it looks like it
the only sensible option.)

On 10/30/25 02:54, Stephen Hemminger wrote:
> When packets are filtered by Tx callback, they return value
> needs to include those packets; otherwise the application will
> do a duplicate free of those mbufs.
> 
> The example would be application that sent a burst of 8 packets,
> and 2 were filtered but the PMD was busy and could not send any.
> The result should be a return value of 2 and mbufs 0..1 should
> be freed and 2..7 should be untouched and up to application to
> resend or drop.
> 
> Original buggy code would return 0 and 0..1 would be freed
> 2..7 would be unused. Application would then try and drop
> or resend an already freed mbuf.
> 
> Bugzilla ID: 1816
> Fixes: 4dc294158cac ("ethdev: support optional Rx and Tx callbacks")
> Cc: bruce.richardson@intel.com
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>   lib/ethdev/rte_ethdev.h | 47 ++++++++++++++++++++++++++++++-----------
>   1 file changed, 35 insertions(+), 12 deletions(-)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index a66c2abbdb..d2cfe390f9 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -6714,6 +6714,10 @@ rte_eth_tx_burst(uint16_t port_id, uint16_t queue_id,
>   	}
>   #endif
>   
> +	uint16_t requested_pkts = nb_pkts;
> +	uint16_t filtered_pkts = 0;
> +	rte_mbuf_history_mark_bulk(tx_pkts, nb_pkts, RTE_MBUF_HISTORY_OP_TX);
> +
>   #ifdef RTE_ETHDEV_RXTX_CALLBACKS
>   	{
>   		void *cb;
> @@ -6727,22 +6731,41 @@ rte_eth_tx_burst(uint16_t port_id, uint16_t queue_id,
>   		cb = rte_atomic_load_explicit(&p->txq.clbk[queue_id],
>   				rte_memory_order_relaxed);
>   		if (unlikely(cb != NULL))
> -			nb_pkts = rte_eth_call_tx_callbacks(port_id, queue_id,
> -					tx_pkts, nb_pkts, cb);
> +			filtered_pkts = nb_pkts -
> +				rte_eth_call_tx_callbacks(port_id, queue_id, tx_pkts, nb_pkts, cb);
>   	}
>   #endif
>   
> -	uint16_t requested_pkts = nb_pkts;
> -	rte_mbuf_history_mark_bulk(tx_pkts, nb_pkts, RTE_MBUF_HISTORY_OP_TX);
> -
> -	nb_pkts = p->tx_pkt_burst(qd, tx_pkts, nb_pkts);
> -
> -	if (requested_pkts > nb_pkts)
> -		rte_mbuf_history_mark_bulk(tx_pkts + nb_pkts,
> -				requested_pkts - nb_pkts, RTE_MBUF_HISTORY_OP_TX_BUSY);
> +	uint16_t sent_pkts = p->tx_pkt_burst(qd, tx_pkts, nb_pkts - filtered_pkts);
> +	uint16_t result_pkts = sent_pkts + filtered_pkts;
> +
> +	if (unlikely(result_pkts < requested_pkts)) {

First of all unlikely() is wrong here. It is not an error path. It is OK
when not everything is sent and it should not cost too much as unlikely
does.
Second, I'd remove the condition completely. It is not a big deal to
calculate unsent_pkts and call trace (if enabled) with zero unsent_pkts.

> +		uint16_t unsent_pkts = requested_pkts - result_pkts;
> +
> +		if (unlikely(filtered_pkts > 0)) {

Same here for unlikely().
Also since filtered_pkts appear in RTE_ETHDEV_RXTX_CALLBACKS case,
I'd put it under corresponding conditional compilation to be consistent.
Yes, code with conditional compilation is hard to read, but consistency
is more important here.

> +			/* Need to reshuffle packet list so that unsent packets are
> +			 * after the return value
> +			 *
> +			 * Original: MMMMMM
> +			 * Filter:   MMMMMF filtered = 1
> +			 * Tx_burst: SSSUUF sent = 3 unsent = 2
> +			 *
> +			 * Return:   SSSFUU
> +			 *   return from tx_burst is 4
> +			 */
> +			memmove(tx_pkts + result_pkts, tx_pkts + sent_pkts,
> +				unsent_pkts * sizeof(struct rte_mbuf *));
> +
> +			/* Tx filter process does not drop the packets until here */
> +			rte_pktmbuf_free_bulk(tx_pkts + sent_pkts, filtered_pkts);
> +		}
> +
> +		rte_mbuf_history_mark_bulk(tx_pkts + result_pkts, unsent_pkts,
> +					   RTE_MBUF_HISTORY_OP_TX_BUSY);
> +	}
>   
> -	rte_ethdev_trace_tx_burst(port_id, queue_id, (void **)tx_pkts, nb_pkts);
> -	return nb_pkts;
> +	rte_ethdev_trace_tx_burst(port_id, queue_id, (void **)tx_pkts, result_pkts);
> +	return result_pkts;
>   }
>   
>   /**


  reply	other threads:[~2025-11-09 15:22 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-29 23:54 Stephen Hemminger
2025-11-09 15:22 ` Andrew Rybchenko [this message]
2025-11-10  8:34   ` Morten Brørup
2025-11-10  9:02     ` Andrew Rybchenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=11bf07b2-30fe-4935-a43c-fe64a37cc0b4@oktetlabs.ru \
    --to=andrew.rybchenko@oktetlabs.ru \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=john.mcnamara@intel.com \
    --cc=stephen@networkplumber.org \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).