Subject: RE: [RFC] mempool: modify flush threshold
Date: Sat, 8 Jan 2022 12:00:17 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D86DF0@smartserver.smartshare.dk>
References: <98CBD80474FA8B44BF855DF32C47DC35D86DB9@smartserver.smartshare.dk>
From: Morten Brørup
To: Bruce Richardson
Cc: Olivier Matz, Andrew Rybchenko, dev@dpdk.org

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Friday, 7 January 2022 16.12
>
> On Tue, Dec 28, 2021 at 03:28:45PM +0100, Morten Brørup wrote:
> > Hi mempool maintainers and DPDK team.
> >
> > Does anyone know the reason or history why CACHE_FLUSHTHRESH_MULTIPLIER
> > was chosen to be 1.5? I think it is counterintuitive.
> >
> > The mempool cache flush threshold was introduced in DPDK version 1.3; it
> > was not in DPDK version 1.2. The copyright notice for rte_mempool.c says
> > year 2012.
> >
> > Here is my analysis:
> >
> > With the multiplier of 1.5, a mempool cache is allowed to fill up to
> > 50 % above its target size before its excess entries are flushed to the
> > mempool (thereby reducing the cache length to the target size).
> >
> > In the opposite direction, a mempool cache is allowed to be drained
> > completely, i.e. to go up to 100 % below its target size.
> >
> > My instinct tells me that it would be more natural to let a mempool
> > cache go the same amount above and below its target size, i.e. using a
> > flush multiplier of 2 instead of 1.5.
> >
> > Also, the cache should be allowed to fill up to and including the flush
> > threshold, so it is flushed when the threshold is exceeded, instead of
> > when it is reached.
> >
> > Here is a simplified example:
> >
> > Imagine a cache target size of 32, corresponding to a typical packet
> > burst. With a flush threshold multiplier of 2 (and len > threshold
> > instead of len >= threshold), the cache could hold 1 +/- 1 packet
> > bursts. With the current multiplier it can only hold [0 .. 1.5[ packet
> > bursts, not really providing a lot of elasticity.
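Let me spell out the mechanism behind the example above, since it is central
to my point. The following is only a simplified model of how I read the
current put-path behaviour - it is not the actual rte_mempool implementation,
and the names merely approximate the fields in rte_mempool.h:

/*
 * Simplified model of the current mempool cache put path.
 * Illustration only; not the actual rte_mempool code.
 */
struct cache_model {
	unsigned int size;        /* requested cache size, e.g. 32 */
	unsigned int flushthresh; /* currently size * 1.5, i.e. 48 for size 32 */
	unsigned int len;         /* current number of objects in the cache */
};

/* Objects always go into the cache first; only when the length reaches the
 * flush threshold is the excess returned to the backing pool, and then only
 * down to 'size', never below it. */
static void
cache_model_put(struct cache_model *c, unsigned int n_obj)
{
	c->len += n_obj;
	if (c->len >= c->flushthresh) {
		/* enqueue (c->len - c->size) objects to the backing pool */
		c->len = c->size;
	}
}

So with size = 32 and the current multiplier, flushthresh is 48: after any
put the cache holds at most 47 objects (less than half a burst above "size"),
while on the get side it may be drained all the way to 0 (a full burst below
"size") before it is refilled.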
>
> Hi Morten,
>
> Interesting to see this being looked at again. The original idea of adding
> in some extra room above the requested value was to avoid the worst-case
> scenario of a pool oscillating between full and empty repeatedly due to
> the addition/removal of perhaps a single packet. As for why 1.5 was chosen
> as the value, I don't recall any particular reason for it myself. The main
> objective was to have separate flush and size values, so that we could go
> a bit above full and, when flushing, not empty the entire cache down to
> zero.

Thanks for providing the historical background for this feature, Bruce.

>
> In terms of the behavioural points you make above, I wonder if symmetry is
> actually necessary or desirable in this case. After all, the ideal case is
> probably to keep the mempool neither full nor empty, so that both
> allocations and frees can be done without having to go to the underlying
> shared data structure. To accommodate this, the mempool will only flush
> when the number of elements is greater than size * 1.5, and then it only
> flushes elements down to size, ensuring that allocations can still take
> place. On allocation, new buffers are taken when we don't have enough in
> the cache to fulfil the request, and then the cache is filled up to size,
> not to the flush threshold.

I agree with the ideal case.

However, it looks like the addition of the flush threshold also changed the
"size" parameter to effectively become "desired length" instead. This
interpretation is also supported by the flush algorithm, which doesn't flush
below the "size", but down to the "size". So based on this interpretation, I
was wondering why it is not symmetrical around the "desired length", but
allowed to go 100 % below and only 50 % above.

>
> Now, for the scenario you describe - where the mempool cache size is set
> to be the same as the burst size - this scheme probably does break down,
> in that we don't really have any burst elasticity. However, I would query
> whether that is a configuration that is used, since to the user it should
> appear, correctly, to provide no elasticity. Looking at testpmd and our
> example apps, the standard there is a burst size of 32 and a mempool cache
> of ~256. In OVS code, netdev-dpdk.c seems to initialize the mempool with a
> cache size of RTE_MEMPOOL_CACHE_MAX_SIZE (through the define MP_CACHE_SZ).
> In all these cases, I think the 1.5 threshold should work just fine for us.

My example was for demonstration only, and hopefully not being used by any
applications.

The simplified example was intended to demonstrate the theoretical effect of
the imbalance in using the 1.5 threshold. It will be similar with a cache
size of 256 objects: you will be allowed to go 8 bursts (256 objects / 32
objects per burst) below the cache "size" without retrieving objects from
the backing pool, but only 4 bursts above the cache "size" before flushing
objects to the backing pool.

> That said, if you want to bump it up to 2x, I can't say I'd object
> strongly as it should be harmless, I think.

From an academic viewpoint, 2x seems much better to me than 1.5x. And based
on your background information from the introduction of the flush threshold,
there was no academic reason why 1.5x was chosen - this is certainly
valuable feedback.

I will consider providing a patch. Worst case, it probably doesn't do any
harm. And if anyone is really interested, they can look at the mempool debug
stats to see the difference it makes for their specific application, e.g.
the put_common_pool_bulk/put_bulk and get_common_pool_bulk/get_success_bulk
ratios should decrease.

>
> Just my initial thoughts,
>
> /Bruce
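PS: For the record, the change I have in mind would be along the lines
below. It is written from memory and untested, so consider it a sketch of
the intent rather than the actual patch:

 /* rte_mempool.c: let the cache go as far above "size" as it can go below. */
-#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5
+#define CACHE_FLUSHTHRESH_MULTIPLIER 2

 /* rte_mempool.h, rte_mempool_do_generic_put(): flush when the threshold is
  * exceeded, not when it is reached. */
-	if (cache->len >= cache->flushthresh) {
+	if (cache->len > cache->flushthresh) {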