DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH] mbuf: add fast free bulk function
@ 2025-01-14 16:25 Morten Brørup
  2025-01-14 16:39 ` [PATCH v2] " Morten Brørup
  0 siblings, 1 reply; 6+ messages in thread
From: Morten Brørup @ 2025-01-14 16:25 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Slava Ovsiienko,
	Shahaf Shuler, Olivier Matz
  Cc: dev, Morten Brørup

When putting an mbuf back into its mempool, there are certain requirements
to the mbuf. Specifically, some of its fields must be initialized.

These requirements are in fact invariants about free mbufs, held in
mempools, and thus also apply when allocating an mbuf from a mempool.
With this in mind, the additional assertions in rte_mbuf_raw_free() were
moved to __rte_mbuf_raw_sanity_check().
Furthermore, the assertion regarding pinned external buffer was enhanced;
it now also asserts that the referenced pinned external buffer has
refcnt == 1.

The description of RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE was updated to
include the remaining requirements, which were missing here.

And finally:
A new rte_mbuf_fast_free_bulk() inline function was added for the benefit
of ethdev drivers supporting fast release of mbufs.
It asserts these requirements and that the mbufs belong to the specified
mempool, and then calls rte_mempool_put_bulk().

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/ethdev/rte_ethdev.h |  6 ++++--
 lib/mbuf/rte_mbuf.h     | 39 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 1f71cad244..e9267fca79 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1612,8 +1612,10 @@ struct rte_eth_conf {
 #define RTE_ETH_TX_OFFLOAD_MULTI_SEGS       RTE_BIT64(15)
 /**
  * Device supports optimization for fast release of mbufs.
- * When set application must guarantee that per-queue all mbufs comes from
- * the same mempool and has refcnt = 1.
+ * When set application must guarantee that per-queue all mbufs come from the same mempool,
+ * are direct, have refcnt=1, next=NULL and nb_segs=1, as done by rte_pktmbuf_prefree_seg().
+ *
+ * @see rte_mbuf_fast_free_bulk()
  */
 #define RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE   RTE_BIT64(16)
 #define RTE_ETH_TX_OFFLOAD_SECURITY         RTE_BIT64(17)
diff --git a/lib/mbuf/rte_mbuf.h b/lib/mbuf/rte_mbuf.h
index 0d2e0e64b3..cae96bed20 100644
--- a/lib/mbuf/rte_mbuf.h
+++ b/lib/mbuf/rte_mbuf.h
@@ -568,6 +568,10 @@ __rte_mbuf_raw_sanity_check(__rte_unused const struct rte_mbuf *m)
 	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
 	RTE_ASSERT(m->next == NULL);
 	RTE_ASSERT(m->nb_segs == 1);
+	RTE_ASSERT(!RTE_MBUF_CLONED(m));
+	RTE_ASSERT(!RTE_MBUF_HAS_EXTBUF(m) ||
+			(RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
+			rte_mbuf_ext_refcnt_read(m->shinfo) == 1));
 	__rte_mbuf_sanity_check(m, 0);
 }
 
@@ -623,12 +627,43 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 static __rte_always_inline void
 rte_mbuf_raw_free(struct rte_mbuf *m)
 {
-	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
-		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
 	__rte_mbuf_raw_sanity_check(m);
 	rte_mempool_put(m->pool, m);
 }
 
+/**
+ * Put a bulk of mbufs allocated from the same mempool back into the mempool.
+ *
+ * The caller must ensure that the mbufs come from the specified mempool,
+ * are direct and properly reinitialized (refcnt=1, next=NULL, nb_segs=1), as done by
+ * rte_pktmbuf_prefree_seg().
+ *
+ * This function should be used with care, when optimization is
+ * required. For standard needs, prefer rte_pktmbuf_free_bulk().
+ *
+ * @see RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE
+ *
+ * @param mp
+ *   The mempool to which the mbufs belong.
+ * @param mbufs
+ *   Array of pointers to packet mbufs.
+ *   The array must not contain NULL pointers.
+ * @param count
+ *   Array size.
+ */
+void
+rte_mbuf_fast_free_bulk(struct rte_mempool *mp, struct rte_mbuf **mbufs, unsigned int count)
+{
+	for (unsigned int idx = 0; idx < count; idx++) {
+		const struct rte_mbuf *m = mbufs[idx];
+		RTE_ASSERT(m != NULL);
+		RTE_ASSERT(m->pool == mp);
+		__rte_mbuf_raw_sanity_check(m);
+	}
+
+	rte_mempool_put_bulk(mp, (void **)mbufs, count);
+}
+
 /**
  * The packet mbuf constructor.
  *
-- 
2.43.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] mbuf: add fast free bulk function
  2025-01-14 16:25 [PATCH] mbuf: add fast free bulk function Morten Brørup
@ 2025-01-14 16:39 ` Morten Brørup
  2025-01-14 17:39   ` Stephen Hemminger
  2025-01-15  6:52   ` huangdengdui
  0 siblings, 2 replies; 6+ messages in thread
From: Morten Brørup @ 2025-01-14 16:39 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Slava Ovsiienko,
	Shahaf Shuler, Olivier Matz
  Cc: dev, Morten Brørup

mbuf: add fast free bulk function

When putting an mbuf back into its mempool, there are certain requirements
to the mbuf. Specifically, some of its fields must be initialized.

These requirements are in fact invariants about free mbufs, held in
mempools, and thus also apply when allocating an mbuf from a mempool.
With this in mind, the additional assertions in rte_mbuf_raw_free() were
moved to __rte_mbuf_raw_sanity_check().
Furthermore, the assertion regarding pinned external buffer was enhanced;
it now also asserts that the referenced pinned external buffer has
refcnt == 1.

The description of RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE was updated to
include the remaining requirements, which were missing here.

And finally:
A new rte_mbuf_fast_free_bulk() inline function was added for the benefit
of ethdev drivers supporting fast release of mbufs.
It asserts these requirements and that the mbufs belong to the specified
mempool, and then calls rte_mempool_put_bulk().

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
v2:
* Fixed missing inline.
---
 lib/ethdev/rte_ethdev.h |  6 ++++--
 lib/mbuf/rte_mbuf.h     | 39 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 1f71cad244..e9267fca79 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1612,8 +1612,10 @@ struct rte_eth_conf {
 #define RTE_ETH_TX_OFFLOAD_MULTI_SEGS       RTE_BIT64(15)
 /**
  * Device supports optimization for fast release of mbufs.
- * When set application must guarantee that per-queue all mbufs comes from
- * the same mempool and has refcnt = 1.
+ * When set application must guarantee that per-queue all mbufs come from the same mempool,
+ * are direct, have refcnt=1, next=NULL and nb_segs=1, as done by rte_pktmbuf_prefree_seg().
+ *
+ * @see rte_mbuf_fast_free_bulk()
  */
 #define RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE   RTE_BIT64(16)
 #define RTE_ETH_TX_OFFLOAD_SECURITY         RTE_BIT64(17)
diff --git a/lib/mbuf/rte_mbuf.h b/lib/mbuf/rte_mbuf.h
index 0d2e0e64b3..7590d82689 100644
--- a/lib/mbuf/rte_mbuf.h
+++ b/lib/mbuf/rte_mbuf.h
@@ -568,6 +568,10 @@ __rte_mbuf_raw_sanity_check(__rte_unused const struct rte_mbuf *m)
 	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
 	RTE_ASSERT(m->next == NULL);
 	RTE_ASSERT(m->nb_segs == 1);
+	RTE_ASSERT(!RTE_MBUF_CLONED(m));
+	RTE_ASSERT(!RTE_MBUF_HAS_EXTBUF(m) ||
+			(RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
+			rte_mbuf_ext_refcnt_read(m->shinfo) == 1));
 	__rte_mbuf_sanity_check(m, 0);
 }
 
@@ -623,12 +627,43 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
 static __rte_always_inline void
 rte_mbuf_raw_free(struct rte_mbuf *m)
 {
-	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
-		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
 	__rte_mbuf_raw_sanity_check(m);
 	rte_mempool_put(m->pool, m);
 }
 
+/**
+ * Put a bulk of mbufs allocated from the same mempool back into the mempool.
+ *
+ * The caller must ensure that the mbufs come from the specified mempool,
+ * are direct and properly reinitialized (refcnt=1, next=NULL, nb_segs=1), as done by
+ * rte_pktmbuf_prefree_seg().
+ *
+ * This function should be used with care, when optimization is
+ * required. For standard needs, prefer rte_pktmbuf_free_bulk().
+ *
+ * @see RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE
+ *
+ * @param mp
+ *   The mempool to which the mbufs belong.
+ * @param mbufs
+ *   Array of pointers to packet mbufs.
+ *   The array must not contain NULL pointers.
+ * @param count
+ *   Array size.
+ */
+static __rte_always_inline void
+rte_mbuf_fast_free_bulk(struct rte_mempool *mp, struct rte_mbuf **mbufs, unsigned int count)
+{
+	for (unsigned int idx = 0; idx < count; idx++) {
+		const struct rte_mbuf *m = mbufs[idx];
+		RTE_ASSERT(m != NULL);
+		RTE_ASSERT(m->pool == mp);
+		__rte_mbuf_raw_sanity_check(m);
+	}
+
+	rte_mempool_put_bulk(mp, (void **)mbufs, count);
+}
+
 /**
  * The packet mbuf constructor.
  *
-- 
2.43.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] mbuf: add fast free bulk function
  2025-01-14 16:39 ` [PATCH v2] " Morten Brørup
@ 2025-01-14 17:39   ` Stephen Hemminger
  2025-01-15  6:52   ` huangdengdui
  1 sibling, 0 replies; 6+ messages in thread
From: Stephen Hemminger @ 2025-01-14 17:39 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Slava Ovsiienko,
	Shahaf Shuler, Olivier Matz, dev

On Tue, 14 Jan 2025 16:39:51 +0000
Morten Brørup <mb@smartsharesystems.com> wrote:

> mbuf: add fast free bulk function
> 
> When putting an mbuf back into its mempool, there are certain requirements
> to the mbuf. Specifically, some of its fields must be initialized.
> 
> These requirements are in fact invariants about free mbufs, held in
> mempools, and thus also apply when allocating an mbuf from a mempool.
> With this in mind, the additional assertions in rte_mbuf_raw_free() were
> moved to __rte_mbuf_raw_sanity_check().
> Furthermore, the assertion regarding pinned external buffer was enhanced;
> it now also asserts that the referenced pinned external buffer has
> refcnt == 1.
> 
> The description of RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE was updated to
> include the remaining requirements, which were missing here.
> 
> And finally:
> A new rte_mbuf_fast_free_bulk() inline function was added for the benefit
> of ethdev drivers supporting fast release of mbufs.
> It asserts these requirements and that the mbufs belong to the specified
> mempool, and then calls rte_mempool_put_bulk().
> 
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>

Since it is a new function it should be marked experimental for now.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] mbuf: add fast free bulk function
  2025-01-14 16:39 ` [PATCH v2] " Morten Brørup
  2025-01-14 17:39   ` Stephen Hemminger
@ 2025-01-15  6:52   ` huangdengdui
  2025-01-15  9:38     ` Morten Brørup
  1 sibling, 1 reply; 6+ messages in thread
From: huangdengdui @ 2025-01-15  6:52 UTC (permalink / raw)
  To: dev


On 2025/1/15 0:39, Morten Brørup wrote:
> mbuf: add fast free bulk function
> 
> When putting an mbuf back into its mempool, there are certain requirements
> to the mbuf. Specifically, some of its fields must be initialized.
> 
> These requirements are in fact invariants about free mbufs, held in
> mempools, and thus also apply when allocating an mbuf from a mempool.
> With this in mind, the additional assertions in rte_mbuf_raw_free() were
> moved to __rte_mbuf_raw_sanity_check().
> Furthermore, the assertion regarding pinned external buffer was enhanced;
> it now also asserts that the referenced pinned external buffer has
> refcnt == 1.
> 
> The description of RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE was updated to
> include the remaining requirements, which were missing here.
> 
> And finally:
> A new rte_mbuf_fast_free_bulk() inline function was added for the benefit
> of ethdev drivers supporting fast release of mbufs.
> It asserts these requirements and that the mbufs belong to the specified
> mempool, and then calls rte_mempool_put_bulk().
> 
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
> v2:
> * Fixed missing inline.
> ---
>  lib/ethdev/rte_ethdev.h |  6 ++++--
>  lib/mbuf/rte_mbuf.h     | 39 +++++++++++++++++++++++++++++++++++++--
>  2 files changed, 41 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 1f71cad244..e9267fca79 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1612,8 +1612,10 @@ struct rte_eth_conf {
>  #define RTE_ETH_TX_OFFLOAD_MULTI_SEGS       RTE_BIT64(15)
>  /**
>   * Device supports optimization for fast release of mbufs.
> - * When set application must guarantee that per-queue all mbufs comes from
> - * the same mempool and has refcnt = 1.
> + * When set application must guarantee that per-queue all mbufs come from the same mempool,
> + * are direct, have refcnt=1, next=NULL and nb_segs=1, as done by rte_pktmbuf_prefree_seg().
> + *
> + * @see rte_mbuf_fast_free_bulk()
>   */
>  #define RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE   RTE_BIT64(16)
>  #define RTE_ETH_TX_OFFLOAD_SECURITY         RTE_BIT64(17)
> diff --git a/lib/mbuf/rte_mbuf.h b/lib/mbuf/rte_mbuf.h
> index 0d2e0e64b3..7590d82689 100644
> --- a/lib/mbuf/rte_mbuf.h
> +++ b/lib/mbuf/rte_mbuf.h
> @@ -568,6 +568,10 @@ __rte_mbuf_raw_sanity_check(__rte_unused const struct rte_mbuf *m)
>  	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
>  	RTE_ASSERT(m->next == NULL);
>  	RTE_ASSERT(m->nb_segs == 1);
> +	RTE_ASSERT(!RTE_MBUF_CLONED(m));
> +	RTE_ASSERT(!RTE_MBUF_HAS_EXTBUF(m) ||
> +			(RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
> +			rte_mbuf_ext_refcnt_read(m->shinfo) == 1));
>  	__rte_mbuf_sanity_check(m, 0);
>  }
>  
> @@ -623,12 +627,43 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
>  static __rte_always_inline void
>  rte_mbuf_raw_free(struct rte_mbuf *m)
>  {
> -	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
> -		  (!RTE_MBUF_HAS_EXTBUF(m) || RTE_MBUF_HAS_PINNED_EXTBUF(m)));
>  	__rte_mbuf_raw_sanity_check(m);
>  	rte_mempool_put(m->pool, m);
>  }
>  
> +/**
> + * Put a bulk of mbufs allocated from the same mempool back into the mempool.
> + *
> + * The caller must ensure that the mbufs come from the specified mempool,
> + * are direct and properly reinitialized (refcnt=1, next=NULL, nb_segs=1), as done by
> + * rte_pktmbuf_prefree_seg().
> + *
> + * This function should be used with care, when optimization is
> + * required. For standard needs, prefer rte_pktmbuf_free_bulk().
> + *
> + * @see RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE
> + *
> + * @param mp
> + *   The mempool to which the mbufs belong.
> + * @param mbufs
> + *   Array of pointers to packet mbufs.
> + *   The array must not contain NULL pointers.
> + * @param count
> + *   Array size.
> + */
> +static __rte_always_inline void
> +rte_mbuf_fast_free_bulk(struct rte_mempool *mp, struct rte_mbuf **mbufs, unsigned int count)
> +{
> +	for (unsigned int idx = 0; idx < count; idx++) {
> +		const struct rte_mbuf *m = mbufs[idx];
> +		RTE_ASSERT(m != NULL);
> +		RTE_ASSERT(m->pool == mp);
> +		__rte_mbuf_raw_sanity_check(m);
> +	}

Is there some way to avoid executing a loop in non-debug mode? Like the following or other better way

#ifdef RTE_LIBRTE_MBUF_DEBUG
	{
		for (unsigned int idx = 0; idx < count; idx++) {
			const struct rte_mbuf *m = mbufs[idx];
			RTE_ASSERT(m != NULL);
			RTE_ASSERT(m->pool == mp);
			__rte_mbuf_raw_sanity_check(m);
		}
	}
#endif

> +
> +	rte_mempool_put_bulk(mp, (void **)mbufs, count);

Can the mp be obtained from the mbuf?

> +}
> +
>  /**
>   * The packet mbuf constructor.
>   *

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v2] mbuf: add fast free bulk function
  2025-01-15  6:52   ` huangdengdui
@ 2025-01-15  9:38     ` Morten Brørup
  2025-01-15 12:14       ` huangdengdui
  0 siblings, 1 reply; 6+ messages in thread
From: Morten Brørup @ 2025-01-15  9:38 UTC (permalink / raw)
  To: huangdengdui, dev
  Cc: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Slava Ovsiienko,
	Shahaf Shuler, Olivier Matz

> From: huangdengdui [mailto:huangdengdui@huawei.com]
> Sent: Wednesday, 15 January 2025 07.52
> 
> On 2025/1/15 0:39, Morten Brørup wrote:
> > mbuf: add fast free bulk function
> >
> > When putting an mbuf back into its mempool, there are certain
> requirements
> > to the mbuf. Specifically, some of its fields must be initialized.
> >
> > These requirements are in fact invariants about free mbufs, held in
> > mempools, and thus also apply when allocating an mbuf from a mempool.
> > With this in mind, the additional assertions in rte_mbuf_raw_free()
> were
> > moved to __rte_mbuf_raw_sanity_check().
> > Furthermore, the assertion regarding pinned external buffer was
> enhanced;
> > it now also asserts that the referenced pinned external buffer has
> > refcnt == 1.
> >
> > The description of RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE was updated to
> > include the remaining requirements, which were missing here.
> >
> > And finally:
> > A new rte_mbuf_fast_free_bulk() inline function was added for the
> benefit
> > of ethdev drivers supporting fast release of mbufs.
> > It asserts these requirements and that the mbufs belong to the
> specified
> > mempool, and then calls rte_mempool_put_bulk().
> >
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> > v2:
> > * Fixed missing inline.
> > ---
> >  lib/ethdev/rte_ethdev.h |  6 ++++--
> >  lib/mbuf/rte_mbuf.h     | 39 +++++++++++++++++++++++++++++++++++++--
> >  2 files changed, 41 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index 1f71cad244..e9267fca79 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -1612,8 +1612,10 @@ struct rte_eth_conf {
> >  #define RTE_ETH_TX_OFFLOAD_MULTI_SEGS       RTE_BIT64(15)
> >  /**
> >   * Device supports optimization for fast release of mbufs.
> > - * When set application must guarantee that per-queue all mbufs
> comes from
> > - * the same mempool and has refcnt = 1.
> > + * When set application must guarantee that per-queue all mbufs come
> from the same mempool,
> > + * are direct, have refcnt=1, next=NULL and nb_segs=1, as done by
> rte_pktmbuf_prefree_seg().
> > + *
> > + * @see rte_mbuf_fast_free_bulk()
> >   */
> >  #define RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE   RTE_BIT64(16)
> >  #define RTE_ETH_TX_OFFLOAD_SECURITY         RTE_BIT64(17)
> > diff --git a/lib/mbuf/rte_mbuf.h b/lib/mbuf/rte_mbuf.h
> > index 0d2e0e64b3..7590d82689 100644
> > --- a/lib/mbuf/rte_mbuf.h
> > +++ b/lib/mbuf/rte_mbuf.h
> > @@ -568,6 +568,10 @@ __rte_mbuf_raw_sanity_check(__rte_unused const
> struct rte_mbuf *m)
> >  	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
> >  	RTE_ASSERT(m->next == NULL);
> >  	RTE_ASSERT(m->nb_segs == 1);
> > +	RTE_ASSERT(!RTE_MBUF_CLONED(m));
> > +	RTE_ASSERT(!RTE_MBUF_HAS_EXTBUF(m) ||
> > +			(RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
> > +			rte_mbuf_ext_refcnt_read(m->shinfo) == 1));
> >  	__rte_mbuf_sanity_check(m, 0);
> >  }
> >
> > @@ -623,12 +627,43 @@ static inline struct rte_mbuf
> *rte_mbuf_raw_alloc(struct rte_mempool *mp)
> >  static __rte_always_inline void
> >  rte_mbuf_raw_free(struct rte_mbuf *m)
> >  {
> > -	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
> > -		  (!RTE_MBUF_HAS_EXTBUF(m) ||
> RTE_MBUF_HAS_PINNED_EXTBUF(m)));
> >  	__rte_mbuf_raw_sanity_check(m);
> >  	rte_mempool_put(m->pool, m);
> >  }
> >
> > +/**
> > + * Put a bulk of mbufs allocated from the same mempool back into the
> mempool.
> > + *
> > + * The caller must ensure that the mbufs come from the specified
> mempool,
> > + * are direct and properly reinitialized (refcnt=1, next=NULL,
> nb_segs=1), as done by
> > + * rte_pktmbuf_prefree_seg().
> > + *
> > + * This function should be used with care, when optimization is
> > + * required. For standard needs, prefer rte_pktmbuf_free_bulk().
> > + *
> > + * @see RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE
> > + *
> > + * @param mp
> > + *   The mempool to which the mbufs belong.
> > + * @param mbufs
> > + *   Array of pointers to packet mbufs.
> > + *   The array must not contain NULL pointers.
> > + * @param count
> > + *   Array size.
> > + */
> > +static __rte_always_inline void
> > +rte_mbuf_fast_free_bulk(struct rte_mempool *mp, struct rte_mbuf
> **mbufs, unsigned int count)
> > +{
> > +	for (unsigned int idx = 0; idx < count; idx++) {
> > +		const struct rte_mbuf *m = mbufs[idx];
> > +		RTE_ASSERT(m != NULL);
> > +		RTE_ASSERT(m->pool == mp);
> > +		__rte_mbuf_raw_sanity_check(m);
> > +	}
> 
> Is there some way to avoid executing a loop in non-debug mode? Like the
> following or other better way
> 
> #ifdef RTE_LIBRTE_MBUF_DEBUG
> 	{
> 		for (unsigned int idx = 0; idx < count; idx++) {
> 			const struct rte_mbuf *m = mbufs[idx];
> 			RTE_ASSERT(m != NULL);
> 			RTE_ASSERT(m->pool == mp);
> 			__rte_mbuf_raw_sanity_check(m);
> 		}
> 	}
> #endif

The loop is already omitted in non-debug mode:
RTE_ASSERT() [1] is omitted unless RTE_ENABLE_ASSERT is set.
__rte_mbuf_raw_sanity_check() [2] consist of some RTE_ASSERTs and __rte_mbuf_sanity_check().
__rte_mbuf_sanity_check() [3] is omitted unless RTE_LIBRTE_MBUF_DEBUG is set.

So the compiler will detect that the loop has no effect, and optimize away the loop.

[1]: https://elixir.bootlin.com/dpdk/v24.11.1/source/lib/eal/include/rte_debug.h#L46
[2]: https://elixir.bootlin.com/dpdk/v24.11.1/source/lib/mbuf/rte_mbuf.h#L566
[3]: https://elixir.bootlin.com/dpdk/v24.11.1/source/lib/mbuf/rte_mbuf.h#L348

> 
> > +
> > +	rte_mempool_put_bulk(mp, (void **)mbufs, count);
> 
> Can the mp be obtained from the mbuf?

Passing "mp" as a parameter avoids a potential CPU cache miss by not dereferencing the first mbuf, if the driver/application already has the mempool pointer readily available (and hot in the CPU cache) from somewhere else.
If the driver/or application doesn't have the mempool pointer readily available, it can obtain it from the mbuf when calling this function.

And as a bonus side effect, passing "mp" as a parameter allows calling the function with count=0 without special handling inside the function.

Obviously, if the driver/application gets "mp" from mbuf[0]->pool, it needs to first check that count>0; but that would be the situation for the driver/application whenever it accesses an mbuf array, regardless what it is doing with that array.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] mbuf: add fast free bulk function
  2025-01-15  9:38     ` Morten Brørup
@ 2025-01-15 12:14       ` huangdengdui
  0 siblings, 0 replies; 6+ messages in thread
From: huangdengdui @ 2025-01-15 12:14 UTC (permalink / raw)
  To: Morten Brørup, dev
  Cc: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Slava Ovsiienko,
	Shahaf Shuler, Olivier Matz


On 2025/1/15 17:38, Morten Brørup wrote:
>> From: huangdengdui [mailto:huangdengdui@huawei.com]
>> Sent: Wednesday, 15 January 2025 07.52
>>
>> On 2025/1/15 0:39, Morten Brørup wrote:
>>> mbuf: add fast free bulk function
>>>
>>> When putting an mbuf back into its mempool, there are certain
>> requirements
>>> to the mbuf. Specifically, some of its fields must be initialized.
>>>
>>> These requirements are in fact invariants about free mbufs, held in
>>> mempools, and thus also apply when allocating an mbuf from a mempool.
>>> With this in mind, the additional assertions in rte_mbuf_raw_free()
>> were
>>> moved to __rte_mbuf_raw_sanity_check().
>>> Furthermore, the assertion regarding pinned external buffer was
>> enhanced;
>>> it now also asserts that the referenced pinned external buffer has
>>> refcnt == 1.
>>>
>>> The description of RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE was updated to
>>> include the remaining requirements, which were missing here.
>>>
>>> And finally:
>>> A new rte_mbuf_fast_free_bulk() inline function was added for the
>> benefit
>>> of ethdev drivers supporting fast release of mbufs.
>>> It asserts these requirements and that the mbufs belong to the
>> specified
>>> mempool, and then calls rte_mempool_put_bulk().
>>>
>>> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
>>> ---
>>> v2:
>>> * Fixed missing inline.
>>> ---
>>>  lib/ethdev/rte_ethdev.h |  6 ++++--
>>>  lib/mbuf/rte_mbuf.h     | 39 +++++++++++++++++++++++++++++++++++++--
>>>  2 files changed, 41 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>> index 1f71cad244..e9267fca79 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -1612,8 +1612,10 @@ struct rte_eth_conf {
>>>  #define RTE_ETH_TX_OFFLOAD_MULTI_SEGS       RTE_BIT64(15)
>>>  /**
>>>   * Device supports optimization for fast release of mbufs.
>>> - * When set application must guarantee that per-queue all mbufs
>> comes from
>>> - * the same mempool and has refcnt = 1.
>>> + * When set application must guarantee that per-queue all mbufs come
>> from the same mempool,
>>> + * are direct, have refcnt=1, next=NULL and nb_segs=1, as done by
>> rte_pktmbuf_prefree_seg().
>>> + *
>>> + * @see rte_mbuf_fast_free_bulk()
>>>   */
>>>  #define RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE   RTE_BIT64(16)
>>>  #define RTE_ETH_TX_OFFLOAD_SECURITY         RTE_BIT64(17)
>>> diff --git a/lib/mbuf/rte_mbuf.h b/lib/mbuf/rte_mbuf.h
>>> index 0d2e0e64b3..7590d82689 100644
>>> --- a/lib/mbuf/rte_mbuf.h
>>> +++ b/lib/mbuf/rte_mbuf.h
>>> @@ -568,6 +568,10 @@ __rte_mbuf_raw_sanity_check(__rte_unused const
>> struct rte_mbuf *m)
>>>  	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
>>>  	RTE_ASSERT(m->next == NULL);
>>>  	RTE_ASSERT(m->nb_segs == 1);
>>> +	RTE_ASSERT(!RTE_MBUF_CLONED(m));
>>> +	RTE_ASSERT(!RTE_MBUF_HAS_EXTBUF(m) ||
>>> +			(RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
>>> +			rte_mbuf_ext_refcnt_read(m->shinfo) == 1));
>>>  	__rte_mbuf_sanity_check(m, 0);
>>>  }
>>>
>>> @@ -623,12 +627,43 @@ static inline struct rte_mbuf
>> *rte_mbuf_raw_alloc(struct rte_mempool *mp)
>>>  static __rte_always_inline void
>>>  rte_mbuf_raw_free(struct rte_mbuf *m)
>>>  {
>>> -	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
>>> -		  (!RTE_MBUF_HAS_EXTBUF(m) ||
>> RTE_MBUF_HAS_PINNED_EXTBUF(m)));
>>>  	__rte_mbuf_raw_sanity_check(m);
>>>  	rte_mempool_put(m->pool, m);
>>>  }
>>>
>>> +/**
>>> + * Put a bulk of mbufs allocated from the same mempool back into the
>> mempool.
>>> + *
>>> + * The caller must ensure that the mbufs come from the specified
>> mempool,
>>> + * are direct and properly reinitialized (refcnt=1, next=NULL,
>> nb_segs=1), as done by
>>> + * rte_pktmbuf_prefree_seg().
>>> + *
>>> + * This function should be used with care, when optimization is
>>> + * required. For standard needs, prefer rte_pktmbuf_free_bulk().
>>> + *
>>> + * @see RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE
>>> + *
>>> + * @param mp
>>> + *   The mempool to which the mbufs belong.
>>> + * @param mbufs
>>> + *   Array of pointers to packet mbufs.
>>> + *   The array must not contain NULL pointers.
>>> + * @param count
>>> + *   Array size.
>>> + */
>>> +static __rte_always_inline void
>>> +rte_mbuf_fast_free_bulk(struct rte_mempool *mp, struct rte_mbuf
>> **mbufs, unsigned int count)
>>> +{
>>> +	for (unsigned int idx = 0; idx < count; idx++) {
>>> +		const struct rte_mbuf *m = mbufs[idx];
>>> +		RTE_ASSERT(m != NULL);
>>> +		RTE_ASSERT(m->pool == mp);
>>> +		__rte_mbuf_raw_sanity_check(m);
>>> +	}
>>
>> Is there some way to avoid executing a loop in non-debug mode? Like the
>> following or other better way
>>
>> #ifdef RTE_LIBRTE_MBUF_DEBUG
>> 	{
>> 		for (unsigned int idx = 0; idx < count; idx++) {
>> 			const struct rte_mbuf *m = mbufs[idx];
>> 			RTE_ASSERT(m != NULL);
>> 			RTE_ASSERT(m->pool == mp);
>> 			__rte_mbuf_raw_sanity_check(m);
>> 		}
>> 	}
>> #endif
> 
> The loop is already omitted in non-debug mode:
> RTE_ASSERT() [1] is omitted unless RTE_ENABLE_ASSERT is set.
> __rte_mbuf_raw_sanity_check() [2] consist of some RTE_ASSERTs and __rte_mbuf_sanity_check().
> __rte_mbuf_sanity_check() [3] is omitted unless RTE_LIBRTE_MBUF_DEBUG is set.
> 
> So the compiler will detect that the loop has no effect, and optimize away the loop.

Okay, you're right, I compiled the code and this loop code didn't generate any instructions after the compiler optimization was enabled.

> 
> [1]: https://elixir.bootlin.com/dpdk/v24.11.1/source/lib/eal/include/rte_debug.h#L46
> [2]: https://elixir.bootlin.com/dpdk/v24.11.1/source/lib/mbuf/rte_mbuf.h#L566
> [3]: https://elixir.bootlin.com/dpdk/v24.11.1/source/lib/mbuf/rte_mbuf.h#L348
> 
>>
>>> +
>>> +	rte_mempool_put_bulk(mp, (void **)mbufs, count);
>>
>> Can the mp be obtained from the mbuf?
> 
> Passing "mp" as a parameter avoids a potential CPU cache miss by not dereferencing the first mbuf, if the driver/application already has the mempool pointer readily available (and hot in the CPU cache) from somewhere else.
> If the driver/or application doesn't have the mempool pointer readily available, it can obtain it from the mbuf when calling this function.
> 
> And as a bonus side effect, passing "mp" as a parameter allows calling the function with count=0 without special handling inside the function.
> 
> Obviously, if the driver/application gets "mp" from mbuf[0]->pool, it needs to first check that count>0; but that would be the situation for the driver/application whenever it accesses an mbuf array, regardless what it is doing with that array.
> 

Okay, It's better this way.

Acked-by: huangdengdui@huawei.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-01-15 12:14 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-01-14 16:25 [PATCH] mbuf: add fast free bulk function Morten Brørup
2025-01-14 16:39 ` [PATCH v2] " Morten Brørup
2025-01-14 17:39   ` Stephen Hemminger
2025-01-15  6:52   ` huangdengdui
2025-01-15  9:38     ` Morten Brørup
2025-01-15 12:14       ` huangdengdui

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).