DPDK patches and discussions
* [PATCH v5 0/4] mempool: fix mempool cache flushing algorithm
@ 2022-10-09 13:25 Andrew Rybchenko
  2022-10-09 13:25 ` [PATCH v5 1/4] mempool: check driver enqueue result in one place Andrew Rybchenko
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Andrew Rybchenko @ 2022-10-09 13:25 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Morten Brørup, Bruce Richardson

v5 changes (Andrew Rybchenko):

- Factor out cosmetic fixes into separate patches to make all
  patches smaller and easier to review
- Remove extra check as per review notes
- Factor out the complete cache flushing into a separate patch.
  It keeps the logical changes separate, making them easier to
  bisect and revert.

v4 changes:

- Updated patch title to reflect that the scope of the patch is only
mempool cache flushing.

- Do not replace rte_memcpy() with an alternative copying method. This
was a pure optimization, not a fix.

- Elaborate even more on the bugs fixed by the modifications.

- Added a 4th bullet item to the patch description, regarding
rte_mempool_ops_enqueue_bulk() with RTE_LIBRTE_MEMPOOL_DEBUG.

v3 changes:

- Actually remove my modifications of the rte_mempool_cache structure.

v2 changes:

- Not adding the new objects to the mempool cache before flushing it
also allows the memory allocated for the mempool cache to be reduced
from 3 x to 2 x RTE_MEMPOOL_CACHE_MAX_SIZE.
However, such a change would break the ABI, so it was removed in v2.

- The mempool cache should be cache line aligned for the benefit of the
copying method, which on some CPU architectures performs worse on data
crossing a cache line boundary.
However, such a change would break the ABI, so it was removed in v2;
and yet another alternative copying method replaced rte_memcpy().

Andrew Rybchenko (3):
  mempool: check driver enqueue result in one place
  mempool: avoid usage of term ring on put
  mempool: flush cache completely on overflow

Morten Brørup (1):
  mempool: fix cache flushing algorithm

 lib/mempool/rte_mempool.c |  5 ++++
 lib/mempool/rte_mempool.h | 55 ++++++++++++++++++++-------------------
 2 files changed, 33 insertions(+), 27 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v5 1/4] mempool: check driver enqueue result in one place
  2022-10-09 13:25 [PATCH v5 0/4] mempool: fix mempool cache flushing algorithm Andrew Rybchenko
@ 2022-10-09 13:25 ` Andrew Rybchenko
  2022-10-09 13:25 ` [PATCH v5 2/4] mempool: avoid usage of term ring on put Andrew Rybchenko
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Andrew Rybchenko @ 2022-10-09 13:25 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Morten Brørup, Bruce Richardson

The enqueue operation must not fail. Move the corresponding debug check
from one particular case into the enqueue operation helper so that it
covers all invocations.

Log a critical message with useful information instead of calling
rte_panic().

Make the rte_mempool_do_generic_put() implementation more readable and
fix the inconsistency where the return value is checked in one place
but not in another.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/mempool/rte_mempool.h | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 2401c4ac80..bc29d49aab 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -786,12 +786,19 @@ rte_mempool_ops_enqueue_bulk(struct rte_mempool *mp, void * const *obj_table,
 		unsigned n)
 {
 	struct rte_mempool_ops *ops;
+	int ret;
 
 	RTE_MEMPOOL_STAT_ADD(mp, put_common_pool_bulk, 1);
 	RTE_MEMPOOL_STAT_ADD(mp, put_common_pool_objs, n);
 	rte_mempool_trace_ops_enqueue_bulk(mp, obj_table, n);
 	ops = rte_mempool_get_ops(mp->ops_index);
-	return ops->enqueue(mp, obj_table, n);
+	ret = ops->enqueue(mp, obj_table, n);
+#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
+	if (unlikely(ret < 0))
+		RTE_LOG(CRIT, MEMPOOL, "cannot enqueue %u objects to mempool %s\n",
+			n, mp->name);
+#endif
+	return ret;
 }
 
 /**
@@ -1351,12 +1358,7 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 ring_enqueue:
 
 	/* push remaining objects in ring */
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-	if (rte_mempool_ops_enqueue_bulk(mp, obj_table, n) < 0)
-		rte_panic("cannot put objects in mempool\n");
-#else
 	rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
-#endif
 }
 
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v5 2/4] mempool: avoid usage of term ring on put
  2022-10-09 13:25 [PATCH v5 0/4] mempool: fix mempool cache flushing algorithm Andrew Rybchenko
  2022-10-09 13:25 ` [PATCH v5 1/4] mempool: check driver enqueue result in one place Andrew Rybchenko
@ 2022-10-09 13:25 ` Andrew Rybchenko
  2022-10-09 13:25 ` [PATCH v5 3/4] mempool: fix cache flushing algorithm Andrew Rybchenko
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Andrew Rybchenko @ 2022-10-09 13:25 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Morten Brørup, Bruce Richardson

The term "ring" is misleading since the ring is the default, but still
just one of the possible drivers used to store objects.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/mempool/rte_mempool.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index bc29d49aab..a072e5554b 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1331,7 +1331,7 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 
 	/* No cache provided or if put would overflow mem allocated for cache */
 	if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
-		goto ring_enqueue;
+		goto driver_enqueue;
 
 	cache_objs = &cache->objs[cache->len];
 
@@ -1339,7 +1339,7 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 	 * The cache follows the following algorithm
 	 *   1. Add the objects to the cache
 	 *   2. Anything greater than the cache min value (if it crosses the
-	 *   cache flush threshold) is flushed to the ring.
+	 *   cache flush threshold) is flushed to the backend.
 	 */
 
 	/* Add elements back into the cache */
@@ -1355,9 +1355,9 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 
 	return;
 
-ring_enqueue:
+driver_enqueue:
 
-	/* push remaining objects in ring */
+	/* push objects to the backend */
 	rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v5 3/4] mempool: fix cache flushing algorithm
  2022-10-09 13:25 [PATCH v5 0/4] mempool: fix mempool cache flushing algorithm Andrew Rybchenko
  2022-10-09 13:25 ` [PATCH v5 1/4] mempool: check driver enqueue result in one place Andrew Rybchenko
  2022-10-09 13:25 ` [PATCH v5 2/4] mempool: avoid usage of term ring on put Andrew Rybchenko
@ 2022-10-09 13:25 ` Andrew Rybchenko
  2022-10-09 13:25 ` [PATCH v5 4/4] mempool: flush cache completely on overflow Andrew Rybchenko
  2022-10-09 13:29 ` [PATCH v5 0/4] mempool: fix mempool cache flushing algorithm Andrew Rybchenko
  4 siblings, 0 replies; 6+ messages in thread
From: Andrew Rybchenko @ 2022-10-09 13:25 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Morten Brørup, Bruce Richardson

From: Morten Brørup <mb@smartsharesystems.com>

Fix the rte_mempool_do_generic_put() cache flushing algorithm to keep
hot objects in the cache instead of cold ones.

The algorithm was:
 1. Add the objects to the cache.
 2. Anything greater than the cache size (if it crosses the cache flush
    threshold) is flushed to the backend.

Please note that the description in the source code said that it kept
"cache min value" objects after flushing, but the function actually kept
the cache full after flushing, which the above description reflects.

Now, the algorithm is:
 1. If the objects cannot be added to the cache without crossing the
    flush threshold, flush some cached objects to the backend to
    free up the required space.
 2. Add the objects to the cache.

Previously, the most recent (hot) objects were flushed, leaving the
oldest (cold) objects in the mempool cache. This bug degraded
performance, because flushing prevented immediate reuse of the (hot)
objects already in the CPU cache. Now, the existing (cold) objects in
the mempool cache are flushed before the new (hot) objects are added
to the mempool cache.

Since the nearby code is touched anyway, fix the flush threshold
comparison to flush only if the threshold is really exceeded, not just
reached; i.e. it must be "len > flushthresh", not "len >= flushthresh".
Consider a flush multiplier of 1 instead of 1.5: the cache would be
flushed already when reaching "size" objects, not when exceeding "size"
objects. In other words, the cache would not be able to hold "size"
objects, which is clearly a bug. That bug could degrade performance
due to premature flushing.

Since the flush threshold is never exceeded now, the cache object array
in the mempool may be decreased from RTE_MEMPOOL_CACHE_MAX_SIZE * 3 to
RTE_MEMPOOL_CACHE_MAX_SIZE * 2 entries. In fact it could be
CALC_CACHE_FLUSHTHRESH(RTE_MEMPOOL_CACHE_MAX_SIZE), but the flush
threshold multiplier is internal.
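
For illustration only, the put-side logic after this patch behaves
roughly like the following minimal stand-alone sketch. The struct and
helper names (cache_model, put_bulk_model, backend_enqueue) are
illustrative stand-ins, not the real DPDK types or functions; the
statistics, tracing and the no-cache/too-big-request paths from the
diff below are omitted, so it assumes n <= cache->flushthresh:

#include <string.h>

/* Simplified stand-in for struct rte_mempool_cache. */
struct cache_model {
	unsigned int size;        /* nominal cache size */
	unsigned int flushthresh; /* CALC_CACHE_FLUSHTHRESH(size), > size */
	unsigned int len;         /* current number of cached objects */
	void *objs[1024];         /* large enough for this illustration */
};

/* backend_enqueue() stands in for rte_mempool_ops_enqueue_bulk(). */
static void
put_bulk_model(struct cache_model *cache, void * const *obj_table,
	       unsigned int n,
	       void (*backend_enqueue)(void * const *objs, unsigned int count))
{
	void **cache_objs;

	if (cache->len + n <= cache->flushthresh) {
		/* Fits below the flush threshold: just append. */
		cache_objs = &cache->objs[cache->len];
		cache->len += n;
	} else {
		/*
		 * Adding would cross the threshold: flush the oldest (cold)
		 * objects to the backend, keeping at most "size - n" of
		 * them (keep < cache->len holds, see the comment in the
		 * diff below), then append the new (hot) objects.
		 */
		unsigned int keep = (n >= cache->size) ? 0 : (cache->size - n);

		cache_objs = &cache->objs[keep];
		backend_enqueue(cache_objs, cache->len - keep);
		cache->len = keep + n;
	}

	/* Add the new objects to the cache. */
	memcpy(cache_objs, obj_table, sizeof(void *) * n);
}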

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 lib/mempool/rte_mempool.c |  5 +++++
 lib/mempool/rte_mempool.h | 43 +++++++++++++++++++++++----------------
 2 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index de59009baf..4ba8ab7b63 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -746,6 +746,11 @@ rte_mempool_free(struct rte_mempool *mp)
 static void
 mempool_cache_init(struct rte_mempool_cache *cache, uint32_t size)
 {
+	/* Check that the cache has enough space for the flush threshold */
+	RTE_BUILD_BUG_ON(CALC_CACHE_FLUSHTHRESH(RTE_MEMPOOL_CACHE_MAX_SIZE) >
+			 RTE_SIZEOF_FIELD(struct rte_mempool_cache, objs) /
+			 RTE_SIZEOF_FIELD(struct rte_mempool_cache, objs[0]));
+
 	cache->size = size;
 	cache->flushthresh = CALC_CACHE_FLUSHTHRESH(size);
 	cache->len = 0;
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index a072e5554b..e3364ed7b8 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -90,7 +90,7 @@ struct rte_mempool_cache {
 	 * Cache is allocated to this size to allow it to overflow in certain
 	 * cases to avoid needless emptying of cache.
 	 */
-	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
+	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 2]; /**< Cache objects */
 } __rte_cache_aligned;
 
 /**
@@ -1329,30 +1329,39 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 	RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
 	RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
 
-	/* No cache provided or if put would overflow mem allocated for cache */
-	if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
+	/* No cache provided or the request itself is too big for the cache */
+	if (unlikely(cache == NULL || n > cache->flushthresh))
 		goto driver_enqueue;
 
-	cache_objs = &cache->objs[cache->len];
-
 	/*
-	 * The cache follows the following algorithm
-	 *   1. Add the objects to the cache
-	 *   2. Anything greater than the cache min value (if it crosses the
-	 *   cache flush threshold) is flushed to the backend.
+	 * The cache follows the following algorithm:
+	 *   1. If the objects cannot be added to the cache without crossing
+	 *      the flush threshold, flush the cache to the backend.
+	 *   2. Add the objects to the cache.
 	 */
 
-	/* Add elements back into the cache */
-	rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
-
-	cache->len += n;
+	if (cache->len + n <= cache->flushthresh) {
+		cache_objs = &cache->objs[cache->len];
+		cache->len += n;
+	} else {
+		unsigned int keep = (n >= cache->size) ? 0 : (cache->size - n);
 
-	if (cache->len >= cache->flushthresh) {
-		rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache->size],
-				cache->len - cache->size);
-		cache->len = cache->size;
+		/*
+		 * If the number of objects to keep in the cache is positive:
+		 * keep = cache->size - n < cache->flushthresh - n < cache->len
+		 * since cache->flushthresh > cache->size.
+		 * If keep is 0, cache->len cannot be 0 anyway since
+		 * n <= cache->flushthresh and we'd not be here with
+		 * cache->len == 0.
+		 */
+		cache_objs = &cache->objs[keep];
+		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len - keep);
+		cache->len = keep + n;
 	}
 
+	/* Add the objects to the cache. */
+	rte_memcpy(cache_objs, obj_table, sizeof(void *) * n);
+
 	return;
 
 driver_enqueue:
-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v5 4/4] mempool: flush cache completely on overflow
  2022-10-09 13:25 [PATCH v5 0/4] mempool: fix mempool cache flushing algorithm Andrew Rybchenko
                   ` (2 preceding siblings ...)
  2022-10-09 13:25 ` [PATCH v5 3/4] mempool: fix cache flushing algorithm Andrew Rybchenko
@ 2022-10-09 13:25 ` Andrew Rybchenko
  2022-10-09 13:29 ` [PATCH v5 0/4] mempool: fix mempool cache flushing algorithm Andrew Rybchenko
  4 siblings, 0 replies; 6+ messages in thread
From: Andrew Rybchenko @ 2022-10-09 13:25 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Morten Brørup, Bruce Richardson

Previously, the cache was still full after flushing. In the opposite
direction, i.e. when getting objects from the cache, the cache is
refilled to full level when it crosses the low watermark (which happens
to be zero). Similarly, the cache should be flushed to empty level when
it crosses the high watermark (which happens to be 1.5 x the size of
the cache). The existing flushing behaviour was suboptimal for real
applications, because crossing the low or high watermark typically
happens when the application is in a state where the number of put/get
events is out of balance, e.g. when absorbing a burst of packets into a
QoS queue (getting more mbufs from the mempool), or when a burst of
packets is trickling out from the QoS queue (putting the mbufs back
into the mempool).

Now, the mempool cache is completely flushed when crossing the flush
threshold, so only the newly put (hot) objects remain in the mempool
cache afterwards.

This bug degraded performance by causing too frequent flushing.

Consider this application scenario:

Either, an lcore thread in the application is in a state of balance,
where it uses the mempool cache within its flush/refill boundaries; in
this situation, the flush method is less important, and this fix is
irrelevant.

Or, an lcore thread in the application is out of balance (either
permanently or temporarily), and mostly gets or puts objects from/to the
mempool. If it mostly puts objects, not flushing all of the objects will
cause more frequent flushing. This is the scenario addressed by this
fix. E.g.:

Cache size=256, flushthresh=384 (1.5x size), initial len=256;
application burst len=32.

If there are "size" objects in the cache after flushing, the cache is
flushed at every 4th burst.

If the cache is flushed completely, the cache is only flushed at every
16th burst.

As you can see, this bug caused the cache to be flushed 4x too
frequently in this example.

And when/if the application thread breaks its pattern of continuously
putting objects, and suddenly starts to get objects instead, it will
either get objects already in the cache, or the get() function will
refill the cache.

The concept of not flushing the cache completely was probably based on
an assumption that it is more likely for an application's lcore thread
to get() after flushing than to put() after flushing.
I strongly disagree with this assumption! If an application thread is
continuously putting so much that it overflows the cache, it is much
more likely to keep putting than it is to start getting. If in doubt,
consider how CPU branch predictors work: When the application has done
something many times consecutively, the branch predictor will expect the
application to do the same again, rather than suddenly do something
else.
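
For illustration, with this patch the overflow branch of the sketch
shown under patch 3/4 reduces to a complete flush (again a simplified
stand-in, not the real code):

	} else {
		/*
		 * Adding would cross the flush threshold: flush the whole
		 * cache to the backend, so that only the newly put (hot)
		 * objects remain in the cache afterwards.
		 */
		cache_objs = &cache->objs[0];
		backend_enqueue(cache_objs, cache->len);
		cache->len = n;
	}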

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 lib/mempool/rte_mempool.h | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index e3364ed7b8..26b2697572 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1344,19 +1344,9 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 		cache_objs = &cache->objs[cache->len];
 		cache->len += n;
 	} else {
-		unsigned int keep = (n >= cache->size) ? 0 : (cache->size - n);
-
-		/*
-		 * If the number of objects to keep in the cache is positive:
-		 * keep = cache->size - n < cache->flushthresh - n < cache->len
-		 * since cache->flushthresh > cache->size.
-		 * If keep is 0, cache->len cannot be 0 anyway since
-		 * n <= cache->flushthresh and we'd not be here with
-		 * cache->len == 0.
-		 */
-		cache_objs = &cache->objs[keep];
-		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len - keep);
-		cache->len = keep + n;
+		cache_objs = &cache->objs[0];
+		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
+		cache->len = n;
 	}
 
 	/* Add the objects to the cache. */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 0/4] mempool: fix mempool cache flushing algorithm
  2022-10-09 13:25 [PATCH v5 0/4] mempool: fix mempool cache flushing algorithm Andrew Rybchenko
                   ` (3 preceding siblings ...)
  2022-10-09 13:25 ` [PATCH v5 4/4] mempool: flush cache completely on overflow Andrew Rybchenko
@ 2022-10-09 13:29 ` Andrew Rybchenko
  4 siblings, 0 replies; 6+ messages in thread
From: Andrew Rybchenko @ 2022-10-09 13:29 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Morten Brørup, Bruce Richardson

Sorry, I forgot --in-reply-to.
I'll send v6 with spelling fixes and --in-reply-to, so that the thread
for such an interesting topic is complete.

^ permalink raw reply	[flat|nested] 6+ messages in thread
