DPDK patches and discussions
From: "Morten Brørup" <mb@smartsharesystems.com>
To: dev@dpdk.org, olivier.matz@6wind.com, andrew.rybchenko@oktetlabs.ru
Cc: bruce.richardson@intel.com, roretzla@linux.microsoft.com,
	"Morten Brørup" <mb@smartsharesystems.com>
Subject: [PATCH v4] mempool: optimize get objects with constant n
Date: Tue, 18 Apr 2023 22:09:24 +0200	[thread overview]
Message-ID: <20230418200924.13290-1-mb@smartsharesystems.com> (raw)
In-Reply-To: <20230411064845.37713-1-mb@smartsharesystems.com>

When getting objects from the mempool, the number of objects to get is
often constant at build time.

This patch adds another code path for this case, so the compiler can
optimize more, e.g. unroll the copy loop when the entire request is
satisfied from the cache.

On an Intel(R) Xeon(R) E5-2620 v4 CPU, compiled with gcc 9.4.0,
mempool_perf_test with constant n shows rate_persec increasing by 17 % on
average (minimum 9.5 %, maximum 24 %).

The code path where the number of objects to get is unknown at build time
remains essentially unchanged.
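
As a minimal, self-contained sketch of the pattern (with made-up names;
the real code is in the diff below): when 'n' is a build-time constant,
__builtin_constant_p(n) is true, and the fixed-length branch gives the
compiler a known trip count it is free to unroll.

	struct tiny_cache {
		unsigned int len;	/* number of cached objects */
		void *objs[512];	/* cached object pointers, hottest last */
	};

	static inline unsigned int
	tiny_cache_get(struct tiny_cache *c, void **obj_table, unsigned int n)
	{
		void **cache_objs = &c->objs[c->len];
		unsigned int len, index;

		if (__builtin_constant_p(n) && n <= c->len) {
			/* Build-time constant n, fully served from the cache:
			 * fixed trip count, so the compiler may unroll the loop. */
			c->len -= n;
			for (index = 0; index < n; index++)
				*obj_table++ = *--cache_objs;
			return 0;
		}

		/* Run-time n: clamp to what the cache currently holds. */
		len = n < c->len ? n : c->len;
		c->len -= len;
		for (index = 0; index < len; index++)
			*obj_table++ = *--cache_objs;

		return n - len;	/* remainder to be dequeued from the backend */
	}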

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

---
v4:
* Rebased to main.
v3:
* Tried to rebase to main, but failed to include all my changes.
v2:
* Added comments describing why some code is omitted when 'n' is known
  at build time.
* Improved source code readability by explicitly setting 'remaining' where
  relevant, instead of at initialization.
---
 lib/mempool/rte_mempool.h | 41 +++++++++++++++++++++++++++++++++------
 1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 9f530db24b..7d14dea5c4 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1492,23 +1492,52 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 			   unsigned int n, struct rte_mempool_cache *cache)
 {
 	int ret;
-	unsigned int remaining = n;
+	unsigned int remaining;
 	uint32_t index, len;
 	void **cache_objs;
 
 	/* No cache provided */
-	if (unlikely(cache == NULL))
+	if (unlikely(cache == NULL)) {
+		remaining = n;
 		goto driver_dequeue;
+	}
 
-	/* Use the cache as much as we have to return hot objects first */
-	len = RTE_MIN(remaining, cache->len);
 	cache_objs = &cache->objs[cache->len];
+
+	if (__extension__(__builtin_constant_p(n)) && n <= cache->len) {
+		/*
+		 * The request size is known at build time, and
+		 * the entire request can be satisfied from the cache,
+		 * so let the compiler unroll the fixed length copy loop.
+		 */
+		cache->len -= n;
+		for (index = 0; index < n; index++)
+			*obj_table++ = *--cache_objs;
+
+		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
+		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
+
+		return 0;
+	}
+
+	/*
+	 * Use the cache as much as we have to return hot objects first.
+	 * If the request size 'n' is known at build time, the above comparison
+	 * ensures that n > cache->len here, so omit RTE_MIN().
+	 */
+	len = __extension__(__builtin_constant_p(n)) ? cache->len :
+			RTE_MIN(n, cache->len);
 	cache->len -= len;
-	remaining -= len;
+	remaining = n - len;
 	for (index = 0; index < len; index++)
 		*obj_table++ = *--cache_objs;
 
-	if (remaining == 0) {
+	/*
+	 * If the request size 'n' is known at build time, the case
+	 * where the entire request can be satisfied from the cache
+	 * has already been handled above, so omit handling it here.
+	 */
+	if (!__extension__(__builtin_constant_p(n)) && remaining == 0) {
 		/* The entire request is satisfied from the cache. */
 
 		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
-- 
2.17.1
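
As a usage illustration (hypothetical caller, assuming 'mp' is an existing
mempool): a literal bulk size propagates through the inline
rte_mempool_get_bulk() wrapper into rte_mempool_do_generic_get(), so the
new constant-n path applies.

	void *objs[32];

	/* 32 is a build-time constant here, so the compiler can take the
	 * unrolled fast path whenever the cache holds at least 32 objects. */
	if (rte_mempool_get_bulk(mp, objs, 32) == 0) {
		/* ... use the objects ... */
		rte_mempool_put_bulk(mp, objs, 32);
	}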

