DPDK patches and discussions
* [PATCH] mempool: optimize get objects with constant n
@ 2023-04-11  6:48 Morten Brørup
  2023-04-18 11:06 ` Bruce Richardson
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Morten Brørup @ 2023-04-11  6:48 UTC
  To: olivier.matz, andrew.rybchenko; +Cc: dev, Morten Brørup

When getting objects from the mempool, the number of objects to get is
often constant at build time.

This patch adds another code path for this case, so the compiler can
optimize more, e.g. unroll the copy loop when the entire request is
satisfied from the cache.

On an Intel(R) Xeon(R) E5-2620 v4 CPU, and compiled with gcc 9.4.0,
mempool_perf_test with constant n shows an increase in rate_persec by an
average of 17 %, minimum 9.5 %, maximum 24 %.

The code path where the number of objects to get is unknown at build time
remains essentially unchanged.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/mempool/rte_mempool.h | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 9f530db24b..ade0100ec7 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1500,15 +1500,33 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 	if (unlikely(cache == NULL))
 		goto driver_dequeue;
 
-	/* Use the cache as much as we have to return hot objects first */
-	len = RTE_MIN(remaining, cache->len);
 	cache_objs = &cache->objs[cache->len];
+
+	if (__extension__(__builtin_constant_p(n)) && n <= cache->len) {
+		/*
+		 * The request size is known at build time, and
+		 * the entire request can be satisfied from the cache,
+		 * so let the compiler unroll the fixed length copy loop.
+		 */
+		cache->len -= n;
+		for (index = 0; index < n; index++)
+			*obj_table++ = *--cache_objs;
+
+		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
+		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
+
+		return 0;
+	}
+
+	/* Use the cache as much as we have to return hot objects first */
+	len = __extension__(__builtin_constant_p(n)) ? cache->len :
+			RTE_MIN(remaining, cache->len);
 	cache->len -= len;
 	remaining -= len;
 	for (index = 0; index < len; index++)
 		*obj_table++ = *--cache_objs;
 
-	if (remaining == 0) {
+	if (!__extension__(__builtin_constant_p(n)) && remaining == 0) {
 		/* The entire request is satisfied from the cache. */
 
 		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
-- 
2.17.1
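
The technique used in the patch can be tried outside of DPDK. Below is a
minimal sketch, not the mempool code itself: struct toy_cache,
toy_cache_fill() and toy_cache_get() are hypothetical names invented for
this illustration, and the always_inline attribute is an assumption made so
that a literal n at the call site is visible to __builtin_constant_p()
after inlining. It assumes GCC or clang. Without optimization the builtin
usually evaluates to 0, in which case the generic path runs and returns the
same result.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SIZE 64

/* Hypothetical stand-in for a mempool cache: a LIFO stack of pointers. */
struct toy_cache {
	unsigned int len;            /* number of valid entries in objs[] */
	void *objs[TOY_CACHE_SIZE];
};

/* Push 'count' objects; the last one pushed is the first one returned. */
static inline void
toy_cache_fill(struct toy_cache *cache, void **objs, unsigned int count)
{
	unsigned int i;

	assert(cache->len + count <= TOY_CACHE_SIZE);
	for (i = 0; i < count; i++)
		cache->objs[cache->len++] = objs[i];
}

/*
 * Pop 'n' objects, hottest first.
 * Returns 0 on success, or -1 (cache untouched) if fewer than 'n' objects
 * are available. Unlike the real mempool, there is no backing driver here.
 */
static inline __attribute__((always_inline)) int
toy_cache_get(struct toy_cache *cache, void **obj_table, unsigned int n)
{
	void **cache_objs = &cache->objs[cache->len];
	unsigned int index;

	if (__extension__(__builtin_constant_p(n)) && n <= cache->len) {
		/*
		 * 'n' is known at build time, so the trip count of this
		 * loop is fixed and the compiler is free to unroll it.
		 */
		cache->len -= n;
		for (index = 0; index < n; index++)
			*obj_table++ = *--cache_objs;
		return 0;
	}

	/* Generic path: same behavior, but 'n' is a run-time value here. */
	if (n > cache->len)
		return -1;
	cache->len -= n;
	for (index = 0; index < n; index++)
		*obj_table++ = *--cache_objs;
	return 0;
}
```

A call such as toy_cache_get(&cache, objs, 8) can take the first branch when
the compiler folds the builtin to 1, while a call with a run-time count takes
the generic path. In this sketch both branches produce identical results; the
only difference is what the compiler knows about the loop bound.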



end of thread, newest message: 2023-06-07  9:12 UTC

Thread overview: 19+ messages
2023-04-11  6:48 [PATCH] mempool: optimize get objects with constant n Morten Brørup
2023-04-18 11:06 ` Bruce Richardson
2023-04-18 11:29   ` Morten Brørup
2023-04-18 12:54     ` Bruce Richardson
2023-04-18 12:55     ` Bruce Richardson
2023-06-07  7:51       ` Thomas Monjalon
2023-06-07  8:03         ` Morten Brørup
2023-06-07  8:10           ` Thomas Monjalon
2023-06-07  8:33             ` Morten Brørup
2023-06-07  8:41             ` Morten Brørup
2023-04-18 15:15   ` Tyler Retzlaff
2023-04-18 15:30     ` Morten Brørup
2023-04-18 15:44       ` Tyler Retzlaff
2023-04-18 15:50         ` Morten Brørup
2023-04-18 16:01           ` Tyler Retzlaff
2023-04-18 16:05   ` Morten Brørup
2023-04-18 19:51 ` [PATCH v3] " Morten Brørup
2023-04-18 20:09 ` [PATCH v4] " Morten Brørup
2023-06-07  9:12   ` Thomas Monjalon
