From: Bruce Richardson <bruce.richardson@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: Olivier Matz <olivier.matz@6wind.com>,
Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
dev@dpdk.org
Subject: Re: [RFC] mempool: modify flush threshold
Date: Mon, 10 Jan 2022 09:40:48 +0000 [thread overview]
Message-ID: <Ydv/IMz8eIRPSguY@bricha3-MOBL.ger.corp.intel.com> (raw)
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D86DF0@smartserver.smartshare.dk>
On Sat, Jan 08, 2022 at 12:00:17PM +0100, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Friday, 7 January 2022 16.12
> >
> > On Tue, Dec 28, 2021 at 03:28:45PM +0100, Morten Brørup wrote:
> > > Hi mempool maintainers and DPDK team.
> > >
> > > Does anyone know the reason or history why
> > CACHE_FLUSHTHRESH_MULTIPLIER was chosen to be 1.5? I think it is
> > counterintuitive.
> > >
> > > The mempool cache flush threshold was introduced in DPDK version 1.3;
> > it was not in DPDK version 1.2. The copyright notice for rte_mempool.c
> > says year 2012.
> > >
> > >
> > > Here is my analysis:
> > >
> > > With the multiplier of 1.5, a mempool cache is allowed to be filled
> > > up to 50 % above its target size before its excess entries are
> > > flushed to the mempool (thereby reducing the cache length to the
> > > target size).
> > >
> > > In the opposite direction, a mempool cache is allowed to be drained
> > > completely, i.e. up to 100 % below its target size.
> > >
> > > My instinct tells me that it would be more natural to let a mempool
> > > cache go the same amount above and below its target size, i.e. using
> > > a flush multiplier of 2 instead of 1.5.
> > >
> > > Also, the cache should be allowed to fill up to and including the
> > > flush threshold, so it is flushed when the threshold is exceeded,
> > > instead of when it is reached.
> > >
> > > Here is a simplified example:
> > >
> > > Imagine a cache target size of 32, corresponding to a typical
> > > packet burst. With a flush multiplier of 2 (and len > threshold
> > > instead of len >= threshold), the cache could hold 1 +/- 1 packet
> > > bursts. With the current multiplier it can only hold [0 .. 1.5[
> > > packet bursts, not really providing a lot of elasticity.
> > >
> > Hi Morten,
> >
> > Interesting to see this being looked at again. The original idea of
> > adding in some extra room above the requested value was to avoid the
> > worst-case scenario of a pool oscillating between full and empty
> > repeatedly due to the addition/removal of perhaps a single packet. As
> > for why 1.5 was chosen as the value, I don't recall any particular
> > reason for it myself. The main objective was to have separate flush
> > and size values, so that we could go a bit above full and, when
> > flushing, not empty the entire cache down to zero.
>
> Thanks for providing the historical background for this feature, Bruce.
>
> >
> > In terms of the behavioural points you make above, I wonder if
> > symmetry is actually necessary or desirable in this case. After all,
> > the ideal case is probably to keep the mempool cache neither full nor
> > empty, so that both allocations and frees can be done without having
> > to go to the underlying shared data structure. To accommodate this,
> > the mempool will only flush when the number of elements is greater
> > than size * 1.5, and then it only flushes elements down to size,
> > ensuring that allocations can still take place. On allocation, new
> > buffers are taken when we don't have enough in the cache to fulfil
> > the request, and then the cache is filled up to size, not to the
> > flush threshold.
>
> I agree with the ideal case.
>
> However, it looks like the addition of the flush threshold also changed the "size" parameter to effectively become a "desired length" instead. This interpretation is also supported by the flush algorithm, which doesn't flush below the "size", but down to the "size". So based on this interpretation, I was wondering why it is not symmetrical around the "desired length", but allowed to go 100 % below and only 50 % above.
>
> >
> > Now, for the scenario you describe - where the mempool cache size is
> > set to be the same as the burst size - this scheme probably does
> > break down, in that we don't really have any burst elasticity.
> > However, I would query whether that is a configuration that is
> > actually used, since it should be apparent to the user that it
> > provides no elasticity. Looking at testpmd and our example apps, the
> > standard there is a burst size of 32 and a mempool cache of ~256. In
> > OVS code, netdev-dpdk.c seems to initialize the mempool with a cache
> > size of RTE_MEMPOOL_CACHE_MAX_SIZE (through the define MP_CACHE_SZ).
> > In all these cases, I think the 1.5 threshold should work just fine
> > for us.
>
> My example was for demonstration only, and hopefully such a configuration is not used by any applications.
>
> The simplified example was intended to demonstrate the theoretical effect of the imbalance in using the 1.5 threshold. It is similar with a cache size of 256 objects: You are allowed to go 8 bursts (calculation: 256 objects / 32 objects per burst) below the cache "size" without retrieving objects from the backing pool, but only 4 bursts above the cache "size" before flushing objects to the backing pool.
>
> > That said, if you want to bump it up to 2x, I can't say I'd object
> > strongly, as it should be harmless, I think.
>
> From an academic viewpoint, 2x seems much better to me than 1.5x. And based on your background information from the introduction of the flush threshold, there was no academic reason why 1.5x was chosen - this is certainly valuable feedback.
>
> I will consider providing a patch. Worst case it probably doesn't do any harm. And if anyone is really interested, they can look at the mempool debug stats to see the difference it makes for their specific application, e.g. the put_common_pool_bulk/put_bulk and get_common_pool_bulk/get_success_bulk ratios should decrease.
Sure, thanks.
/Bruce
Thread overview: 7+ messages
2021-12-28 14:28 Morten Brørup
2022-01-07 15:12 ` Bruce Richardson
2022-01-08 11:00 ` Morten Brørup
2022-01-10 9:40 ` Bruce Richardson [this message]
2022-01-24 15:56 ` Olivier Matz
2022-01-28 8:40 ` Morten Brørup
2022-01-28 8:52 ` Morten Brørup