From: "Morten Brørup" <mb@smartsharesystems.com>
To: olivier.matz@6wind.com, andrew.rybchenko@oktetlabs.ru
Cc: bruce.richardson@intel.com, jerinjacobk@gmail.com, dev@dpdk.org,
	"Morten Brørup" <mb@smartsharesystems.com>
Subject: [PATCH v4] mempool: fix mempool cache flushing algorithm
Date: Wed,  2 Feb 2022 11:33:54 +0100	[thread overview]
Message-ID: <20220202103354.79832-1-mb@smartsharesystems.com> (raw)
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D86DB2@smartserver.smartshare.dk>

This patch fixes the rte_mempool_do_generic_put() caching algorithm,
which was fundamentally wrong, causing multiple performance issues when
flushing.

Although the bugs have serious performance implications when flushing,
the function did not fail when flushing (or otherwise), so backporting
can be considered optional.

The algorithm was:
 1. Add the objects to the cache
 2. Anything greater than the cache size (if it crosses the cache flush
    threshold) is flushed to the ring.

Please note that the description in the source code said that it kept
"cache min value" objects after flushing, whereas the function actually
kept the cache full after flushing; the description of the old
algorithm above reflects the actual behaviour.

Now, the algorithm is:
 1. If the objects cannot be added to the cache without crossing the
    flush threshold, flush the cache to the ring.
 2. Add the objects to the cache.
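
In terms of the code, the new logic is (a condensed sketch of the diff
below; the paths that bypass the cache for oversized requests are
omitted):

    if (cache->len + n <= cache->flushthresh) {
        /* The objects fit without crossing the flush threshold:
         * append them at the current top of the cache.
         */
        cache_objs = &cache->objs[cache->len];
        cache->len += n;
    } else {
        /* Adding would cross the flush threshold: flush the whole
         * cache to the ring first, then keep only the new (hot)
         * objects in the cache.
         */
        rte_mempool_ops_enqueue_bulk(mp, &cache->objs[0], cache->len);
        cache_objs = &cache->objs[0];
        cache->len = n;
    }
    /* Add the objects to the cache. */
    rte_memcpy(cache_objs, obj_table, sizeof(void *) * n);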

This patch fixes these bugs:

1. The cache was still full after flushing.
In the opposite direction, i.e. when getting objects from the cache, the
cache is refilled to full level when it crosses the low watermark (which
happens to be zero).
Similarly, the cache should be flushed to empty level when it crosses
the high watermark (which happens to be 1.5 x the size of the cache).
The existing flushing behaviour was suboptimal for real applications,
because crossing the low or high watermark typically happens when the
application is in a state where the number of put/get events is out of
balance, e.g. when absorbing a burst of packets into a QoS queue
(getting more mbufs from the mempool), or when a burst of packets is
trickling out from the QoS queue (putting the mbufs back into the
mempool).
Now, the mempool cache is completely flushed when crossing the flush
threshold, so only the newly put (hot) objects remain in the mempool
cache afterwards.

This bug degraded performance by causing too frequent flushing.

Consider this application scenario:

Either an lcore thread in the application is in a state of balance,
where it uses the mempool cache within its flush/refill boundaries; in
this situation, the flush method is less important, and this fix is
irrelevant.

Or an lcore thread in the application is out of balance (either
permanently or temporarily), and mostly gets or puts objects from/to the
mempool. If it mostly puts objects, not flushing all of the objects will
cause more frequent flushing. This is the scenario addressed by this
fix. E.g.:

Cache size=256, flushthresh=384 (1.5x size), initial len=256;
application burst len=32.

If there are "size" objects in the cache after flushing, the cache is
flushed at every 4th burst.

If the cache is flushed completely, the cache is only flushed at every
16th burst.

As you can see, this bug caused the cache to be flushed 4x too
frequently in this example.
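
(For reference, the first figure follows directly from the numbers
above: with "size" objects remaining after a flush, the number of
bursts between flushes is (flushthresh - size) / burst len =
(384 - 256) / 32 = 4.)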

And when/if the application thread breaks its pattern of continuously
putting objects, and suddenly starts to get objects instead, it will
either get objects already in the cache, or the get() function will
refill the cache.

The concept of not flushing the cache completely was probably based on
an assumption that it is more likely for an application's lcore thread
to get() after flushing than to put() after flushing.
I strongly disagree with this assumption! If an application thread is
continuously putting so much that it overflows the cache, it is much
more likely to keep putting than it is to start getting. If in doubt,
consider how CPU branch predictors work: When the application has done
something many times consecutively, the branch predictor will expect the
application to do the same again, rather than suddenly do something
else.

Also, if you consider the description of the algorithm in the source
code, and agree that "cache min value" cannot mean "cache size", the
function did not behave as intended. This in itself is a bug.

2. The flush threshold comparison was off by one.
It must be "len > flushthresh", not "len >= flushthresh".
Consider a flush multiplier of 1 instead of 1.5; the cache would
already be flushed when reaching "size" objects, not when exceeding
"size" objects. In other words, the cache would not be able to hold
"size" objects, which is clearly a bug.
Now, flushing is triggered when the flush threshold is exceeded, not
when reached.
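
To illustrate with a hypothetical flush multiplier of 1, so that
flushthresh == size == 256:

    if (cache->len >= cache->flushthresh) /* old: flushes a merely full cache (256 >= 256) */
    if (cache->len > cache->flushthresh)  /* fixed: flushes only when the threshold is exceeded */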

This bug degraded performance due to premature flushing. In my example
above, this bug caused flushing every 3rd burst instead of every 4th.

3. The most recent (hot) objects were flushed, leaving the oldest (cold)
objects in the mempool cache.
This bug degraded performance, because flushing prevented immediate
reuse of the (hot) objects already in the CPU cache.
Now, the existing (cold) objects in the mempool cache are flushed
before the new (hot) objects are added to the mempool cache.
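
In terms of the code, the difference is which slice of the cache array
is enqueued to the ring (taken from the diff below):

    /* Old: flush only the newest objects, i.e. those above "size",
     * keeping the oldest objects at the bottom of the cache.
     */
    rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache->size],
            cache->len - cache->size);

    /* New: flush everything currently in the cache; the new (hot)
     * objects are copied in at the bottom afterwards.
     */
    rte_mempool_ops_enqueue_bulk(mp, &cache->objs[0], cache->len);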

4. With RTE_LIBRTE_MEMPOOL_DEBUG defined, the return value of
rte_mempool_ops_enqueue_bulk() was not checked when flushing the cache.
Now, it is checked in both locations where used; and obviously still
only if RTE_LIBRTE_MEMPOOL_DEBUG is defined.
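
I.e. the checked variant, as it appears in the diff below:

    #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
        if (rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len) < 0)
            rte_panic("cannot put objects in mempool\n");
    #else
        rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
    #endif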

v2 changes:

- Not adding the new objects to the mempool cache before flushing it
also allows the memory allocated for the mempool cache to be reduced
from 3 x to 2 x RTE_MEMPOOL_CACHE_MAX_SIZE.
However, this change would break the ABI, so it was removed in v2.

- The mempool cache should be cache line aligned for the benefit of the
copying method, which on some CPU architectures performs worse on data
crossing a cache boundary.
However, this change would break the ABI, so it was removed in v2;
instead, an alternative copying method replaced the rte_memcpy().

v3 changes:

- Actually remove my modifications of the rte_mempool_cache structure.

v4 changes:

- Updated patch title to reflect that the scope of the patch is only
mempool cache flushing.

- Do not replace rte_memcpy() with alternative copying method. This was
a pure optimization, not a fix.

- Elaborate even more on the bugs fixed by the modifications.

- Added 4th bullet item to the patch description, regarding
rte_mempool_ops_enqueue_bulk() with RTE_LIBRTE_MEMPOOL_DEBUG.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/mempool/rte_mempool.h | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 1e7a3c1527..e7e09e48fc 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1344,31 +1344,41 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 	if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
 		goto ring_enqueue;
 
-	cache_objs = &cache->objs[cache->len];
+	/* If the request itself is too big for the cache */
+	if (unlikely(n > cache->flushthresh))
+		goto ring_enqueue;
 
 	/*
 	 * The cache follows the following algorithm
-	 *   1. Add the objects to the cache
-	 *   2. Anything greater than the cache min value (if it crosses the
-	 *   cache flush threshold) is flushed to the ring.
+	 *   1. If the objects cannot be added to the cache without
+	 *   crossing the flush threshold, flush the cache to the ring.
+	 *   2. Add the objects to the cache.
 	 */
 
-	/* Add elements back into the cache */
-	rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
+	if (cache->len + n <= cache->flushthresh) {
+		cache_objs = &cache->objs[cache->len];
 
-	cache->len += n;
+		cache->len += n;
+	} else {
+		cache_objs = &cache->objs[0];
 
-	if (cache->len >= cache->flushthresh) {
-		rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache->size],
-				cache->len - cache->size);
-		cache->len = cache->size;
+#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
+		if (rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len) < 0)
+			rte_panic("cannot put objects in mempool\n");
+#else
+		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
+#endif
+		cache->len = n;
 	}
 
+	/* Add the objects to the cache. */
+	rte_memcpy(cache_objs, obj_table, sizeof(void *) * n);
+
 	return;
 
 ring_enqueue:
 
-	/* push remaining objects in ring */
+	/* Put the objects into the ring */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	if (rte_mempool_ops_enqueue_bulk(mp, obj_table, n) < 0)
 		rte_panic("cannot put objects in mempool\n");
-- 
2.17.1

