DPDK patches and discussions
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Olivier Matz" <olivier.matz@6wind.com>
Cc: "Harris, James R" <james.r.harris@intel.com>,
	"Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	<dev@dpdk.org>, "nd" <nd@arm.com>,
	<andrew.rybchenko@oktetlabs.ru>, <bruce.richardson@intel.com>
Subject: RE: Bug in rte_mempool_do_generic_get?
Date: Mon, 27 Feb 2023 10:48:09 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8777F@smartserver.smartshare.dk>
In-Reply-To: <Y/xzT2NZQf/7thxY@platinum>

> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Monday, 27 February 2023 10.10
> 
> Hi,
> 
> On Sun, Feb 26, 2023 at 11:45:48AM +0100, Morten Brørup wrote:
> > > From: Harris, James R [mailto:james.r.harris@intel.com]
> > > Sent: Friday, 24 February 2023 17.43
> > >
> > > > On Feb 24, 2023, at 6:56 AM, Honnappa Nagarahalli
> > > > <Honnappa.Nagarahalli@arm.com> wrote:
> > > >
> > > >> From: Morten Brørup <mb@smartsharesystems.com>
> > > >> Sent: Friday, February 24, 2023 6:13 AM
> > > >> To: Harris, James R <james.r.harris@intel.com>; dev@dpdk.org
> > > >> Subject: RE: Bug in rte_mempool_do_generic_get?
> > > >>
> > > >>> If you have a mempool with 2048 objects, shouldn't 4 cores each be able
> > > >>> to do a 512 buffer bulk get, regardless of the configured cache size?
> > > >>
> > > >> No, the scenario you described above is the expected behavior. I think it
> > > >> is documented somewhere that objects in the caches are unavailable for
> > > >> other cores, but now I cannot find where this is documented.
> > > >>
> > >
> > > Thanks Morten.
> > >
> > > Yeah, I think it is documented somewhere, but I also couldn’t find it.
> > > I was aware of cores not being able to allocate from another core’s
> > > cache.  My surprise was that, in a pristine new mempool, 4 cores
> > > could not each do one initial 512 buffer bulk get.  But I also see that
> > > even before the a2833ecc5 patch, the cache would get populated on gets
> > > less than cache size, in addition to the buffers requested by the user.
> > > So if cache size is 256, and bulk get is for 128 buffers, it pulls 384
> > > buffers from the backing pool - 128 for the caller, another 256 to prefill
> > > the cache.  Your patch makes this cache filling consistent between less-
> > > than-cache-size and greater-than-or-equal-to-cache-size cases.
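For readers not familiar with the code, here is a minimal sketch of the get path James describes above, as it behaved before the a2833ecc5 patch. This is an illustration only - not the actual rte_mempool_do_generic_get() code, and the function name is made up:

    #include <errno.h>
    #include <rte_mempool.h>

    /*
     * Simplified sketch of the pre-a2833ecc5 get path.
     * Illustration only - not the real rte_mempool_do_generic_get().
     */
    static int
    sketch_get_bulk(struct rte_mempool *mp, struct rte_mempool_cache *cache,
                    void **obj_table, unsigned int n)
    {
            if (cache == NULL || n >= cache->size)
                    /* Requests of at least the cache size bypass the cache. */
                    return rte_mempool_ops_dequeue_bulk(mp, obj_table, n);

            if (cache->len < n) {
                    /*
                     * Refill: dequeue the n requested objects plus enough to
                     * fill the cache up to its configured size. With
                     * cache->size = 256, n = 128 and an empty cache, this
                     * pulls 128 + 256 = 384 objects from the backing pool.
                     */
                    unsigned int req = n + (cache->size - cache->len);

                    if (rte_mempool_ops_dequeue_bulk(mp,
                                    &cache->objs[cache->len], req) < 0)
                            return -ENOENT;
                    cache->len += req;
            }

            /* Hand n objects to the caller; the remainder stays cached. */
            while (n-- > 0)
                    *obj_table++ = cache->objs[--cache->len];

            return 0;
    }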
> >
> > Yeah... I have tried hard to clean up some of the mess introduced by the
> > <insert curse words here> patch [1] that increased the effective cache size
> > by a factor of 1.5.
> >
> > [1]
> > http://git.dpdk.org/dpdk/commit/lib/librte_mempool/rte_mempool.h?id=ea5dd2744b90b330f07fd10f327ab99ef55c7266
> >
> > E.g., somewhat related to your use case, I have argued against the design
> > goal of trying to keep the mempool caches full at all times, but have not
> > yet been able to convince the community otherwise.
> >
> > Overall, this is one of the absolute core DPDK libraries, so the support for
> > the changes I would like to see is near zero - even minor adjustments are
> > considered very high risk, which I must admit has quite a lot of merit.
> >
> > So instead of fixing the library's implementation to make it behave as
> > reasonably expected according to its documentation, the library's
> > implementation is considered the reference. And as a consequence, the
> > library's internals, such as the cache size multiplier, are getting promoted
> > to be part of the public API, e.g. for applications to implement mempool
> > sizing calculations like the one below.
> >
> > I recall running into problems once myself, when I had sized a mempool with
> > a specific cache size for a specific number of cores, because 50 % additional
> > objects got stuck in the caches. If you ask me, this bug is so fundamental
> > that it should be fixed at the root, not by trying to document the weird
> > behavior of allocating 50 % more than specified.
> 
> I think we should document it: even if the 1.5 factor is
> removed, it is helpful to document that a user needs to reserve more
> objects when using a cache, to ensure that all cores can allocate their
> objects.
> 
> Morten, what would you think about this change below?

A big improvement! Only a few comments, inline below.

> 
>   --- a/doc/guides/prog_guide/mempool_lib.rst
>   +++ b/doc/guides/prog_guide/mempool_lib.rst
>   @@ -89,9 +89,13 @@ In this way, each core has full access to its own cache (with locks) of free obj
>    only when the cache fills does the core need to shuffle some of the free objects back to the pools ring or
>    obtain more objects when the cache is empty.
> 
>   -While this may mean a number of buffers may sit idle on some core's cache,
>   +While this may mean a number of buffers may sit idle on some core's cache (up to ``1.5 * cache_length``),

Typo: cache_length -> cache_size

Looking at this, I noticed another detail: Chapter 10.4 uses the term "buffers", but it should use the term "objects", because a mempool may hold objects other than packet buffers. (Using the term "packet buffers" in the example in chapter 10.3 is correct.)

>    the speed at which a core can access its own cache for a specific memory pool without locks provides performance gains.
> 
>   +However, to ensure that all cores can get objects when a cache is used, it is required to allocate
>   +more objects: if N objects are needed, allocate a mempool with ``N + (number_of_active_cores *
>   +cache_size * 1.5)`` objects.
>   +

Perhaps add here that the factor 1.5 stems from the cache being allowed to exceed its configured size by 50 %.
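To make the formula concrete (the numbers are only an example, not taken from this thread): an application needing N = 2048 objects on 4 active cores with cache_size = 256 would create the mempool with 2048 + 4 * 256 * 1.5 = 3584 objects.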

>    The cache is composed of a small, per-core table of pointers and its length (used as a stack).
>    This internal cache can be enabled or disabled at creation of the pool.
> 

If something similar is added to the description of the "cache_size" parameter of the "rte_mempool_create()" function, the most obvious places to look for documentation will be covered.

A lengthy description might be redundant here, so perhaps only mention that the actual cache size is 1.5 * cache_size, because the cache is allowed to exceed its configured size by 50 %.
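For example, the sizing rule could be applied like this when creating a pool (a sketch only; the pool name, object counts and element size below are made-up example values, not anything from this thread):

    #include <rte_mempool.h>

    /* Example sizing values - purely illustrative. */
    #define OBJS_NEEDED   2048U  /* N objects the application must be able to get */
    #define ACTIVE_CORES  4U     /* cores that will allocate from this mempool */
    #define CACHE_SIZE    256U   /* per-core cache size passed to the mempool */
    #define OBJ_SIZE      2048U  /* element size in bytes */

    static struct rte_mempool *
    create_example_pool(void)
    {
            /*
             * Each per-core cache may hold up to 1.5 * CACHE_SIZE objects,
             * so reserve that many extra objects per active core on top of
             * the N objects actually needed:
             * 2048 + 4 * 256 * 3 / 2 = 3584 objects.
             */
            unsigned int pool_size =
                    OBJS_NEEDED + ACTIVE_CORES * CACHE_SIZE * 3U / 2U;

            return rte_mempool_create("example_pool", pool_size, OBJ_SIZE,
                                      CACHE_SIZE,
                                      0,          /* no private data */
                                      NULL, NULL, /* no pool constructor */
                                      NULL, NULL, /* no per-object init */
                                      SOCKET_ID_ANY, 0);
    }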

> 
> >
> > >
> > > >> Furthermore, since the effective per-core cache size is 1.5 * configured cache
> > > >> size, a configured cache size of 256 may leave up to 384 objects in each per-
> > > >> core cache.
> > > >>
> > > >> With 4 cores, you can expect up to 3 * 384 = 1152 objects sitting in the
> > > >> caches of other cores. If you want to be able to pull 512 objects with each
> > > >> core, the pool size should be 4 * 512 + 1152 objects.
> > > > Maybe we should document this in the mempool library?
> > > >
> > >
> > > Maybe.  But the case I described here is a bit wonky - SPDK should
> > > never have been specifying a non-zero cache in this case.  We only
> > > noticed this change in behavior because we were creating the mempool
> > > with a cache when we shouldn’t have.
> >
> > So, looking at the positive side, this regression revealed a "bug" in SPDK. ;-)
> >
> > PS: I assume that you are aware that mempools generally perform better with a
> > cache, so I assume that you know what you are doing when you say that you
> > don't need the cache.
> >
> > PPS: Sorry about venting here. If nothing else, I hope it had some
> > entertainment value. :-)
> >



Thread overview: 8+ messages
2023-02-24  3:02 Harris, James R
2023-02-24 12:13 ` Morten Brørup
2023-02-24 13:56   ` Honnappa Nagarahalli
2023-02-24 16:42     ` Harris, James R
2023-02-26 10:45       ` Morten Brørup
2023-02-27  9:09         ` Olivier Matz
2023-02-27  9:48           ` Morten Brørup [this message]
2023-02-27 10:39             ` Olivier Matz
