* Fixing MBUF_FAST_FREE TX offload requirements?
From: Morten Brørup @ 2025-09-18 8:50 UTC
To: Ajit Khaparde, Somnath Kotur, Nithin Dabilpuram, Kiran Kumar K,
    Sunil Kumar Kori, Satha Rao, Harman Kalra, Hemant Agrawal,
    Sachin Saxena, Shai Brandes, Evgeny Schemeilin, Ron Beider,
    Amit Bernstein, Wajeeh Atrash, Gaetan Rivet, Xingui Yang,
    Chengwen Feng, Bruce Richardson, Praveen Shetty, Vladimir Medvedkin,
    Anatoly Burakov, Jingjing Wu, Praveen Shetty, Rosen Xu, Andrew Boyer,
    Dariusz Sosnowski, Viacheslav Ovsiienko, Bing Zhao, Ori Kam,
    Suanming Mou, Matan Azrad, Harman Kalra, Wenbo Cao, Andrew Rybchenko,
    Jerin Jacob, Maciej Czekaj
Cc: dev, techboard, Konstantin Ananyev, Ivan Malov, Thomas Monjalon

Dear NIC driver maintainers (CC: DPDK Tech Board),

The DPDK Tech Board has discussed that patch [1] (included in DPDK 25.07) extended the documented requirements to the RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE offload.
These changes put additional limitations on applications' use of the MBUF_FAST_FREE TX offload, and made MBUF_FAST_FREE mutually exclusive with MULTI_SEGS (which is typically used for jumbo frame support).
The Tech Board discussed that these changes do not reflect the intention of the MBUF_FAST_FREE TX offload, and wants to fix it.
Mainly, MBUF_FAST_FREE and MULTI_SEGS should not be mutually exclusive.

The original RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE requirements were:
When set, application must guarantee that
1) per-queue all mbufs come from the same mempool, and
2) mbufs have refcnt = 1.

The patch added the following requirements to the MBUF_FAST_FREE offload, reflecting rte_pktmbuf_prefree_seg() postconditions:
3) mbufs are direct,
4) mbufs have next = NULL and nb_segs = 1.

Now, the key question is:
Can we roll back to the original two requirements?
Or do the drivers also depend on the third and/or fourth requirements?

<advertisement>
Drivers freeing mbufs directly to a mempool should use the new rte_mbuf_raw_free_bulk() instead of rte_mempool_put_bulk(), so the preconditions for freeing mbufs directly into a mempool are validated in mbuf debug mode (with RTE_LIBRTE_MBUF_DEBUG enabled).
Similarly, rte_mbuf_raw_alloc_bulk() should be used instead of rte_mempool_get_bulk().
</advertisement>

PS: The feature documentation [2] still reflects the original requirements.

[1]: https://github.com/DPDK/dpdk/commit/55624173bacb2becaa67793b71391884876673c1
[2]: https://elixir.bootlin.com/dpdk/v25.07/source/doc/guides/nics/features.rst#L125


Venlig hilsen / Kind regards,
-Morten Brørup
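For readers following along, the four requirements listed in the message above can be written as two predicates. This is only a sketch under stated assumptions: `struct toy_mbuf`, `meets_original_reqs()`, `meets_extended_reqs()` and `debug_validate()` are hypothetical names made up for illustration, not part of the DPDK API, and the struct models just the fields the requirements mention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few mbuf fields the requirements refer to. */
struct toy_mbuf {
    const void *pool;       /* mempool the mbuf was allocated from */
    struct toy_mbuf *next;  /* next segment, NULL for the last one */
    uint16_t refcnt;        /* reference counter */
    uint16_t nb_segs;       /* number of segments in the chain */
    bool indirect;          /* true if attached to another mbuf's buffer */
};

/* Requirements 1) and 2): same mempool per queue, refcnt = 1. */
static bool meets_original_reqs(const struct toy_mbuf *m, const void *queue_pool)
{
    return m->pool == queue_pool && m->refcnt == 1;
}

/* Requirements 1) to 4): additionally direct and not segmented. */
static bool meets_extended_reqs(const struct toy_mbuf *m, const void *queue_pool)
{
    return meets_original_reqs(m, queue_pool) &&
           !m->indirect &&
           m->next == NULL && m->nb_segs == 1;
}

/* Debug-build style validation before a direct free (no-op with NDEBUG). */
static void debug_validate(const struct toy_mbuf *m, const void *queue_pool)
{
    assert(meets_extended_reqs(m, queue_pool));
}
```

In this model, the head of a two-segment packet satisfies the original pair of requirements but fails the extended set, which is the mutual exclusion with MULTI_SEGS that the message questions.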
* Re: Fixing MBUF_FAST_FREE TX offload requirements?
From: Bruce Richardson @ 2025-09-18 9:09 UTC
To: Morten Brørup
Cc: Ajit Khaparde, Somnath Kotur, Nithin Dabilpuram, Kiran Kumar K,
    Sunil Kumar Kori, Satha Rao, Harman Kalra, Hemant Agrawal,
    Sachin Saxena, Shai Brandes, Evgeny Schemeilin, Ron Beider,
    Amit Bernstein, Wajeeh Atrash, Gaetan Rivet, Xingui Yang,
    Chengwen Feng, Praveen Shetty, Vladimir Medvedkin, Anatoly Burakov,
    Jingjing Wu, Rosen Xu, Andrew Boyer, Dariusz Sosnowski,
    Viacheslav Ovsiienko, Bing Zhao, Ori Kam, Suanming Mou, Matan Azrad,
    Wenbo Cao, Andrew Rybchenko, Jerin Jacob, Maciej Czekaj, dev,
    techboard, Konstantin Ananyev, Ivan Malov, Thomas Monjalon

On Thu, Sep 18, 2025 at 10:50:11AM +0200, Morten Brørup wrote:
> Dear NIC driver maintainers (CC: DPDK Tech Board),
>
> The DPDK Tech Board has discussed that patch [1] (included in DPDK 25.07) extended the documented requirements to the RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE offload.
> These changes put additional limitations on applications' use of the MBUF_FAST_FREE TX offload, and made MBUF_FAST_FREE mutually exclusive with MULTI_SEGS (which is typically used for jumbo frame support).
> The Tech Board discussed that these changes do not reflect the intention of the MBUF_FAST_FREE TX offload, and wants to fix it.
> Mainly, MBUF_FAST_FREE and MULTI_SEGS should not be mutually exclusive.
>
> The original RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE requirements were:
> When set, application must guarantee that
> 1) per-queue all mbufs come from the same mempool, and
> 2) mbufs have refcnt = 1.
>
> The patch added the following requirements to the MBUF_FAST_FREE offload, reflecting rte_pktmbuf_prefree_seg() postconditions:
> 3) mbufs are direct,
> 4) mbufs have next = NULL and nb_segs = 1.
>
> Now, the key question is:
> Can we roll back to the original two requirements?
> Or do the drivers also depend on the third and/or fourth requirements?
>
> <advertisement>
> Drivers freeing mbufs directly to a mempool should use the new rte_mbuf_raw_free_bulk() instead of rte_mempool_put_bulk(), so the preconditions for freeing mbufs directly into a mempool are validated in mbuf debug mode (with RTE_LIBRTE_MBUF_DEBUG enabled).
> Similarly, rte_mbuf_raw_alloc_bulk() should be used instead of rte_mempool_get_bulk().
> </advertisement>
>
> PS: The feature documentation [2] still reflects the original requirements.
>
> [1]: https://github.com/DPDK/dpdk/commit/55624173bacb2becaa67793b71391884876673c1
> [2]: https://elixir.bootlin.com/dpdk/v25.07/source/doc/guides/nics/features.rst#L125
>
>
> Venlig hilsen / Kind regards,
> -Morten Brørup
>

I'm a little torn on this question, because I can see benefits for both
approaches. Firstly, it would be nice if we made FAST_FREE as accessible
for driver use as it was originally, with minimal requirements. However, on
looking at the code, I believe that many drivers actually took it to mean
that scattered packets couldn't occur in that case either, so the use was
incorrect. Similarly, and secondly, if we do have the extra requirements
for FAST_FREE, it does mean that any use of it can be very, very minimal
and efficient, since we don't need to check anything before freeing the
buffers.

Given where we are now, I think keeping the more restrictive definition of
FAST_FREE is the way to go - keeping it exclusive with MULTI_SEGS - because
it means that we are less likely to have bugs. If we look to change it
back, I think we'd have to check all drivers to ensure they are using the
flag safely.

/Bruce
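The efficiency argument in the reply above can be illustrated with a toy model. The sketch below is not driver code: `toy_mbuf`, `toy_pool`, `free_lean()` and `free_checked()` are hypothetical names, and the pool is a plain array standing in for a mempool. It assumes that, under the restrictive definition, the completion path only copies pointers, while without those guarantees every mbuf has to be read (and possibly written) before it can be returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mbuf: only the fields touched when freeing. */
struct toy_mbuf {
    struct toy_mbuf *next;
    uint16_t refcnt;
    uint16_t nb_segs;
};

#define POOL_CAP 64

/* Hypothetical pool: a stack of free mbuf pointers. */
struct toy_pool {
    struct toy_mbuf *slot[POOL_CAP];
    unsigned int count;
};

/* Restrictive definition: return the pointers without touching any mbuf. */
static void free_lean(struct toy_pool *p, struct toy_mbuf **done, unsigned int n)
{
    assert(p->count + n <= POOL_CAP);
    for (unsigned int i = 0; i < n; i++)
        p->slot[p->count++] = done[i];
}

/* No guarantees: walk every chain, drop a reference, reset before returning. */
static void free_checked(struct toy_pool *p, struct toy_mbuf **done, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++) {
        struct toy_mbuf *m = done[i];
        while (m != NULL) {
            struct toy_mbuf *next = m->next;
            if (--m->refcnt == 0) {
                /* Last reference: restore the pooled state and return it. */
                m->refcnt = 1;
                m->next = NULL;
                m->nb_segs = 1;
                assert(p->count < POOL_CAP);
                p->slot[p->count++] = m;
            }
            m = next;
        }
    }
}
```

The lean path never dereferences an mbuf, so it is only correct when all four requirements hold; the checked path pays one load and up to three stores per segment.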
* RE: Fixing MBUF_FAST_FREE TX offload requirements?
From: Morten Brørup @ 2025-09-18 10:00 UTC
To: Bruce Richardson
Cc: Ajit Khaparde, Somnath Kotur, Nithin Dabilpuram, Kiran Kumar K,
    Sunil Kumar Kori, Satha Rao, Harman Kalra, Hemant Agrawal,
    Sachin Saxena, Shai Brandes, Evgeny Schemeilin, Ron Beider,
    Amit Bernstein, Wajeeh Atrash, Gaetan Rivet, Xingui Yang,
    Chengwen Feng, Praveen Shetty, Vladimir Medvedkin, Anatoly Burakov,
    Jingjing Wu, Rosen Xu, Andrew Boyer, Dariusz Sosnowski,
    Viacheslav Ovsiienko, Bing Zhao, Ori Kam, Suanming Mou, Matan Azrad,
    Wenbo Cao, Andrew Rybchenko, Jerin Jacob, Maciej Czekaj, dev,
    techboard, Konstantin Ananyev, Ivan Malov, Thomas Monjalon

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Thursday, 18 September 2025 11.09
>
> On Thu, Sep 18, 2025 at 10:50:11AM +0200, Morten Brørup wrote:
> > Dear NIC driver maintainers (CC: DPDK Tech Board),
> >
> > The DPDK Tech Board has discussed that patch [1] (included in DPDK 25.07) extended the documented requirements to the RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE offload.
> > These changes put additional limitations on applications' use of the MBUF_FAST_FREE TX offload, and made MBUF_FAST_FREE mutually exclusive with MULTI_SEGS (which is typically used for jumbo frame support).
> > The Tech Board discussed that these changes do not reflect the intention of the MBUF_FAST_FREE TX offload, and wants to fix it.
> > Mainly, MBUF_FAST_FREE and MULTI_SEGS should not be mutually exclusive.
> >
> > The original RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE requirements were:
> > When set, application must guarantee that
> > 1) per-queue all mbufs come from the same mempool, and
> > 2) mbufs have refcnt = 1.
> >
> > The patch added the following requirements to the MBUF_FAST_FREE offload, reflecting rte_pktmbuf_prefree_seg() postconditions:
> > 3) mbufs are direct,
> > 4) mbufs have next = NULL and nb_segs = 1.
> >
> > Now, the key question is:
> > Can we roll back to the original two requirements?
> > Or do the drivers also depend on the third and/or fourth requirements?
> >
> > <advertisement>
> > Drivers freeing mbufs directly to a mempool should use the new rte_mbuf_raw_free_bulk() instead of rte_mempool_put_bulk(), so the preconditions for freeing mbufs directly into a mempool are validated in mbuf debug mode (with RTE_LIBRTE_MBUF_DEBUG enabled).
> > Similarly, rte_mbuf_raw_alloc_bulk() should be used instead of rte_mempool_get_bulk().
> > </advertisement>
> >
> > PS: The feature documentation [2] still reflects the original requirements.
> >
> > [1]: https://github.com/DPDK/dpdk/commit/55624173bacb2becaa67793b71391884876673c1
> > [2]: https://elixir.bootlin.com/dpdk/v25.07/source/doc/guides/nics/features.rst#L125
> >
> >
> > Venlig hilsen / Kind regards,
> > -Morten Brørup
> >
> I'm a little torn on this question, because I can see benefits for both
> approaches. Firstly, it would be nice if we made FAST_FREE as accessible
> for driver use as it was originally, with minimal requirements. However, on
> looking at the code, I believe that many drivers actually took it to mean
> that scattered packets couldn't occur in that case either, so the use was
> incorrect.

I primarily look at Intel drivers, and that's how I read the driver code too.

> Similarly, and secondly, if we do have the extra requirements
> for FAST_FREE, it does mean that any use of it can be very, very minimal
> and efficient, since we don't need to check anything before freeing the
> buffers.
>
> Given where we are now, I think keeping the more restrictive definition of
> FAST_FREE is the way to go - keeping it exclusive with MULTI_SEGS - because
> it means that we are less likely to have bugs. If we look to change it
> back, I think we'd have to check all drivers to ensure they are using the
> flag safely.

However, those driver bugs are not new.
If we haven't received bug reports from users affected by them, maybe we can disregard them (in this discussion about pros and cons).
I prefer we register them as driver bugs, instead of changing the API to accommodate bugs in the drivers.

From an application perspective, here's an idea for consideration:
Assuming that indirect mbufs are uncommon, we keep requirement #3.
To allow MULTI_SEGS (jumbo frames) with FAST_FREE, we get rid of requirement #4.
Since the driver knows that refcnt == 1, the driver can set next = NULL and nb_segs = 1 at any time, either when writing the TX descriptor (when it reads the mbuf anyway), or when freeing the mbuf.
Regarding performance, this means that the driver's TX code path has to write to the mbufs (i.e. adding the performance cost of memory store operations) when segmented - but that is a universal requirement when freeing segmented mbufs to the mempool.

For even more optimized driver performance, as Bruce mentions...
If a port is configured for FAST_FREE and not MULTI_SEGS, the driver can use a super lean transmit function.
Since the driver's transmit function pointer is per port (not per queue), this would require the driver to provide the MULTI_SEGS capability only per port, and not per queue.
(Or we would have to add a NOT_MULTI_SEGS offload flag, to ensure that no queue is configured for MULTI_SEGS.)
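The idea of keeping requirements #1 to #3 while dropping #4 could look roughly like the following on the free path. This is a minimal sketch, assuming a simplified mbuf: `struct toy_mbuf` and `unchain_for_free()` are hypothetical names, and the output array stands in for whatever the driver would then hand to a bulk free. Because refcnt == 1 is guaranteed by the application, the sketch never reads or modifies the reference counter; it only performs the stores to `next` and `nb_segs` that the message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mbuf: only the fields relevant to chaining. */
struct toy_mbuf {
    struct toy_mbuf *next;
    uint16_t refcnt;   /* guaranteed to be 1 by the application, never touched here */
    uint16_t nb_segs;  /* meaningful in the first segment of a packet */
};

/*
 * Split one transmitted packet into its segments, restoring
 * next = NULL and nb_segs = 1 in each, and collect them in out[]
 * so they can be returned to the pool in bulk.
 * The caller must provide room for every segment of the packet.
 * Returns the number of segments written to out[].
 */
static unsigned int
unchain_for_free(struct toy_mbuf *pkt, struct toy_mbuf **out, unsigned int cap)
{
    unsigned int n = 0;

    while (pkt != NULL) {
        struct toy_mbuf *next = pkt->next;

        assert(n < cap);
        pkt->next = NULL;   /* the extra stores paid only for segmented packets */
        pkt->nb_segs = 1;
        out[n++] = pkt;
        pkt = next;
    }
    return n;
}
```

For a single-segment packet the loop runs once and the two stores rewrite values that are already in place, so a driver could skip them on a port that does not enable MULTI_SEGS, in line with the "super lean transmit function" mentioned above.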