DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Andrew Rybchenko" <arybchenko@solarflare.com>, <olivier.matz@6wind.com>
Cc: <stephen@networkplumber.org>, <harry.van.haaren@intel.com>,
	<konstantin.ananyev@intel.com>, <mattias.ronnblom@ericsson.com>,
	<bruce.richardson@intel.com>, <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v4 1/1] mbuf: add bulk free function
Date: Tue, 8 Oct 2019 22:12:45 +0200	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35C60B62@smartserver.smartshare.dk> (raw)
In-Reply-To: <e7df1048-1f58-9cad-32b3-3c70dac42643@solarflare.com>

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
> Sent: Tuesday, October 8, 2019 7:26 PM
> 
> On 10/7/19 1:14 PM, Morten Brørup wrote:
> > Add function for freeing a bulk of mbufs.
> >
> > v4:
> > * Marked as experimental by adding __rte_experimental.
> > * Add function to experimental section of map file.
> > * Fix source code formatting regarding pointer to pointer.
> > * Squashed multiple modifications into one.
> > v3:
> > * Bugfix: Handle pakets with multiple segments.
> > * Added inline helper function, mainly for readability.
> > * Fix source code formatting regarding indentation.
> > v2:
> > * Function is not inline.
> > * Optimized to free multible mbufs belonging to the same mempool in
> > bulk. Inspired by ixgbe_tx_free_bufs(), but allowing NULL pointers
> > in the array, just like rte_pktmbuf_free() can take a NULL pointer.
> > * Use unsigned int instead of unsigned. Passes checkpatch, but
> > mismatches the original coding style of the modified files.
> > * Fixed a typo in the description headline: mempools is plural.
> >
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> >   lib/librte_mbuf/rte_mbuf.c           | 50 ++++++++++++++++++++++++++++
> >   lib/librte_mbuf/rte_mbuf.h           | 12 +++++++
> >   lib/librte_mbuf/rte_mbuf_version.map |  1 +
> >   3 files changed, 63 insertions(+)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > index 37718d49c..0088117a3 100644
> > --- a/lib/librte_mbuf/rte_mbuf.c
> > +++ b/lib/librte_mbuf/rte_mbuf.c
> > @@ -245,6 +245,56 @@ int rte_mbuf_check(const struct rte_mbuf *m, int
> is_header,
> >   	return 0;
> >   }
> >
> > +/**
> > + * Size of the array holding mbufs from the same membool to be freed in
> bulk.
> > + */
> > +#define RTE_PKTMBUF_FREE_BULK_SZ 64
> 
> How is the constant chosen? When we benchmarked similar thing in
> net/sfc, we finally stick to 32 as the best option.
> 

I took it from the ixgbe driver that was the inspiration:
http://code.dpdk.org/dpdk/latest/source/drivers/net/ixgbe/ixgbe_rxtx.c#L111
http://code.dpdk.org/dpdk/latest/source/drivers/net/ixgbe/ixgbe_rxtx.h#L32

If only DPDK provided a common #define for recommended burst array sizes. However, I vaguely recall a discussion recommending it to be dependent on the application and various drivers due to requirements of 100 Gbps drivers. I guess it probably also depends on the CPU being used. I have no qualified opinion about this constant, so I can change it to anything you recommend.

> > +
> > +/**
> > + * @internal helper function for freeing a bulk of mbufs via an array
> holding
> > + * mbufs from the same mempool.
> > + */
> > +static __rte_always_inline void
> > +rte_pktmbuf_free_seg_via_array(struct rte_mbuf *m,
> 
> Other internal functions in the file have __ (two underscore) prefix.
> 

The other internal function is public, i.e. not private (static). However, this might also go public one day, so I'll update it in the next version.

> > +	struct rte_mbuf ** const free, unsigned int * const nb_free)
> 
> free is bad variable name because of libc free() function.
> It is a bit confusing. Same below.
> 

I agree. I simply took it from the ixgbe driver without thinking further about it. How about "pending" or "bulk", or do you have a better suggestion?

> > +{
> > +	m = rte_pktmbuf_prefree_seg(m);
> > +	if (likely(m != NULL)) {
> 
> I'm not sure about likely() here, but hope that compiler is wise
> enough to inline rte_pktmbuf_prefree_seg() here and avoid the
> branch completely.
> 

I somewhat agree with you about this likely() being questionable, and also hope for the compiler to be wise enough anyway.

This line of code in my function stems from unfolding rte_pktmbuf_free_seg(), which has the same likely(), so I kept it.

Apparently, the DPDK mbuf library has a strong preference for single-reference mbufs, with lots of likely()/unlikely() assuming that the m.refcnt is never more than one. rte_pktmbuf_prefree_seg() has the same preference. So I think that we should follow the library's convention, respecting that this function is at the very core of the data plane.

The DPDK rationale behind these likely()/unlikely() hints must be that multicasting, flooding and mirroring packets are rare exceptions. We have previously discussed this internally in our organization, and we tend to agree under normal circumstances. It is just an unfortunate side effect that a common network debugging feature - mirroring - introduces an additional performance penalty of having likely()/unlikely() go in the wrong direction, which might again affect what is being debugged.

> > +		if (*nb_free >= RTE_PKTMBUF_FREE_BULK_SZ ||
> > +		    (*nb_free > 0 && m->pool != free[0]->pool)) {
> > +			rte_mempool_put_bulk(free[0]->pool, (void **)free,
> > +					     *nb_free);
> > +			*nb_free = 0;
> > +		}
> > +
> > +		free[(*nb_free)++] = m;
> > +	}
> > +}
> > +
> > +/* Free a bulk of mbufs back into their original mempools. */
> > +void rte_pktmbuf_free_bulk(struct rte_mbuf **mbufs, unsigned int count)
> > +{
> > +	struct rte_mbuf *m, *m_next, *free[RTE_PKTMBUF_FREE_BULK_SZ];
> > +	unsigned int idx, nb_free = 0;
> > +
> > +	for (idx = 0; idx < count; idx++) {
> > +		m = mbufs[idx];
> > +		if (unlikely(m == NULL))
> > +			continue;
> > +
> > +		__rte_mbuf_sanity_check(m, 1);
> > +
> > +		do {
> > +			m_next = m->next;
> > +			rte_pktmbuf_free_seg_via_array(m, free, &nb_free);
> > +			m = m_next;
> > +		} while (m != NULL);
> > +	}
> > +
> > +	if (nb_free > 0)
> > +		rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
> > +}
> > +
> >   /* dump a mbuf on console */
> >   void
> >   rte_pktmbuf_dump(FILE *f, const struct rte_mbuf *m, unsigned dump_len)
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 98225ec80..871145796 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -1907,6 +1907,18 @@ static inline void rte_pktmbuf_free(struct
> rte_mbuf *m)
> >   	}
> >   }
> >
> > +/**
> > + * Free a bulk of mbufs back into their original mempools.
> > + *
> > + *  @param mbufs
> > + *    Array of pointers to mbufs.
> > + *    The array may contain NULL pointers.
> > + *  @param count
> > + *    Array size.
> > + */
> > +__rte_experimental
> > +void rte_pktmbuf_free_bulk(struct rte_mbuf **mbufs, unsigned int count);
> > +
> >   /**
> >    * Creates a "clone" of the given packet mbuf.
> >    *
> > diff --git a/lib/librte_mbuf/rte_mbuf_version.map
> b/lib/librte_mbuf/rte_mbuf_version.map
> > index 2662a37bf..cd6154ef2 100644
> > --- a/lib/librte_mbuf/rte_mbuf_version.map
> > +++ b/lib/librte_mbuf/rte_mbuf_version.map
> > @@ -50,4 +50,5 @@ EXPERIMENTAL {
> >   	global:
> >
> >   	rte_mbuf_check;
> > +	rte_pktmbuf_free_bulk;
> >   } DPDK_18.08;
> 


  reply	other threads:[~2019-10-08 20:12 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-07 10:14 [dpdk-dev] [PATCH v4 0/1] " Morten Brørup
2019-10-07 10:14 ` [dpdk-dev] [PATCH v4 1/1] " Morten Brørup
2019-10-08 17:25   ` Andrew Rybchenko
2019-10-08 20:12     ` Morten Brørup [this message]
2019-10-09  6:46       ` Andrew Rybchenko
2019-10-09  9:32   ` Ananyev, Konstantin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35C60B62@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=arybchenko@solarflare.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=harry.van.haaren@intel.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=mattias.ronnblom@ericsson.com \
    --cc=olivier.matz@6wind.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).