From: Bruce Richardson <bruce.richardson@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: <thomas@monjalon.net>, <andrew.rybchenko@oktetlabs.ru>,
<dev@dpdk.org>, Praveen Shetty <praveen.shetty@intel.com>,
Vladimir Medvedkin <vladimir.medvedkin@intel.com>,
Anatoly Burakov <anatoly.burakov@intel.com>,
Jingjing Wu <jingjing.wu@intel.com>
Subject: Re: [PATCH] net/intel: improve Rx descriptor ring size checks
Date: Tue, 16 Dec 2025 09:52:26 +0000
Message-ID: <aUEr2u81g44Osaim@bricha3-mobl1.ger.corp.intel.com>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35F655E9@smartserver.smartshare.dk>
On Tue, Dec 16, 2025 at 10:25:41AM +0100, Morten Brørup wrote:
> +TO: Ethdev maintainers
>
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Tuesday, 16 December 2025 09.48
> >
> > On Mon, Dec 15, 2025 at 07:53:27PM +0100, Morten Brørup wrote:
> > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > Sent: Monday, 15 December 2025 19.21
> > > >
> > > > On Mon, Dec 15, 2025 at 05:58:33PM +0000, Bruce Richardson wrote:
> > > > > On Mon, Dec 15, 2025 at 06:54:50PM +0100, Morten Brørup wrote:
> > > > > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > > > > Sent: Monday, 15 December 2025 18.36
> > > > > > >
> > > > > > > > The default Rx ring size checks did not account for the fact that the
> > > > > > > > port would not work correctly if the Rx ring size was only twice the
> > > > > > > > free threshold size or less, so add in a new check for this. This would
> > > > > > > > generally only apply in cases where very small rings sizes are being
> > > > > > > > used, for example, with default Rx free thresh of 32, only ring size of
> > > > > > > > 64 would cause issues.
> > > > > > >
> > > > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > > > > ---
> > > > > >
> > > > > > > Does dev_info.rx_desc_lim.nb_min returned by rte_eth_dev_info_get() need correction too?
> > > > > >
> > > > > > The minimum number of descriptors stays the same, however, if choosing the
> > > > > > minimum number of descriptors you may need to reduce the rx_free_thresh
> > > > > > value.
> > > > >
> > > > > However, I think you raise a good point. I'll see about adding a specific
> > > > > error message in case the user is using the default threshold and setting
> > > > > the min ring size.
> > >
> > > > The applications need some generic code sequence that always works, on all NICs.
> > > >
> > > > E.g. if an application uses rte_eth_dev_adjust_nb_rx_tx_desc() to move a requested crazy number of descriptors within bounds, and uses the default values for all other parameters, it should work.
> > >
> >
> > > This is surprisingly difficult to make working with the way things are set
> > > up right now. For example, if the user wants defaults for config settings
> > > and passes in NULL to the ethdev API, the ethdev library queries the
> > > defaults from the driver and fills those in before calling the relevant
> > > ring setup functions. Therefore, the driver level has no knowledge of
> > > whether the user explicitly requested a value which happens to match the
> > > default, or if the user just wants a working default value.
> > >
> > > Another option would be to set the default low enough that it would work
> > > with any ring size possible, but that would then cause a perf impact for
> > > apps which don't need such low values (as an extreme example, think on a
> > > theoretical driver which allows ring sizes of as small as 16 or 8, a free
> > > threshold to work there is likely suboptimal for more normal ring sizes of
> > > e.g. 1k or 2k).
>
> Small descriptor rings are not theoretical.
> Our application configures very small descriptor rings on unused ports, to be able to process background traffic (e.g. slow protocols) on those ports, but still conserve memory.
>
> E.g. the igb driver reports dev_info.rx_desc_lim.nb_min = 32, but multiple times this value is required with default thresholds.
> The i40e driver reports dev_info.rx_desc_lim.nb_min = 64, and IIRC more is required with default thresholds.
>
> I agree that defaults should remain optimized for performance (maximum packets per second).
>
> The problem is rte_eth_dev_adjust_nb_rx_tx_desc() not having sufficient information about all the thresholds to correctly calculate its output values. I'll file a bug report.
>
> Updating the drivers to report dev_info.rx/tx_desc_lim.nb_min and dev_info.rx/tx_desc_lim.nb_align that work with default thresholds could be a workaround.
>
I'm not sure I like that option. How would one then query the limits with
non-default thresholds? Also, it doesn't inform the user as to which
thresholds are causing what limits, vs limits that are hard ones from the HW.

Other options may be greater use of e.g. 0 as a sentinel value for allowing
the driver to pick, or changing things internally so that the source of the
rx_conf is passed to the drivers. However, I actually feel that the best
option, if we want a "most usable" solution here, is to document that the
free_thresh is a hint, and that it may be adjusted by the driver if
necessary to accommodate a requested ring size. [We could log a warning on
adjustment]
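
As a rough illustration of the "free_thresh is a hint" idea, here is a
minimal sketch in plain C. It is not taken from any driver: the function name
adjust_rx_free_thresh() is hypothetical, and the only rule it encodes is the
one from the patch description in this thread, namely that the ring must be
larger than twice the free threshold. Real drivers may well have further
constraints (alignment, burst sizes and so on) which are ignored here, and
fprintf() merely stands in for whatever logging the driver would use.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, NOT an existing DPDK or driver function.
 * Assumed rule (from the patch description): the port only works if
 * nb_desc > 2 * free_thresh. Treat the requested threshold as a hint
 * and lower it when the requested ring size cannot accommodate it.
 */
static uint16_t
adjust_rx_free_thresh(uint16_t nb_desc, uint16_t free_thresh)
{
	/* Largest threshold t for which nb_desc > 2 * t still holds. */
	uint16_t max_thresh = (uint16_t)((nb_desc - 1) / 2);

	/* Assumption: never go below 1; too-small rings are a separate check. */
	if (nb_desc < 3)
		max_thresh = 1;

	if (free_thresh <= max_thresh)
		return free_thresh; /* the hint is usable as-is */

	/* Stand-in for a driver log call warning about the adjustment. */
	fprintf(stderr,
		"rx_free_thresh %u too large for ring size %u, using %u\n",
		(unsigned int)free_thresh, (unsigned int)nb_desc,
		(unsigned int)max_thresh);
	return max_thresh;
}
```

With this rule, a ring of 1024 descriptors keeps a requested threshold of 32
untouched, while a ring of 64 descriptors has the same request lowered to 31
and a warning is printed.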
/Bruce