From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 1898042980;
	Tue, 18 Apr 2023 21:51:22 +0200 (CEST)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 9D7994021F;
	Tue, 18 Apr 2023 21:51:21 +0200 (CEST)
Received: from smartserver.smartsharesystems.com
	(smartserver.smartsharesystems.com [77.243.40.215])
	by mails.dpdk.org (Postfix) with ESMTP id 1A0BE4014F
	for ; Tue, 18 Apr 2023 21:51:20 +0200 (CEST)
Received: from dkrd2.smartsharesys.local ([192.168.4.12])
	by smartserver.smartsharesystems.com with Microsoft SMTPSVC(6.0.3790.4675);
	Tue, 18 Apr 2023 21:51:19 +0200
From: =?UTF-8?q?Morten=20Br=C3=B8rup?=
To: dev@dpdk.org, olivier.matz@6wind.com, andrew.rybchenko@oktetlabs.ru
Cc: bruce.richardson@intel.com, roretzla@linux.microsoft.com,
	=?UTF-8?q?Morten=20Br=C3=B8rup?=
Subject: [PATCH v3] mempool: optimize get objects with constant n
Date: Tue, 18 Apr 2023 21:51:17 +0200
Message-Id: <20230418195117.12862-1-mb@smartsharesystems.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230411064845.37713-1-mb@smartsharesystems.com>
References: <20230411064845.37713-1-mb@smartsharesystems.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-OriginalArrivalTime: 18 Apr 2023 19:51:19.0628 (UTC)
	FILETIME=[2523E8C0:01D9722F]
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

When getting objects from the mempool, the number of objects to get is
often constant at build time.

This patch adds another code path for this case, so the compiler can
optimize more, e.g. unroll the copy loop when the entire request is
satisfied from the cache.

On an Intel(R) Xeon(R) E5-2620 v4 CPU, and compiled with gcc 9.4.0,
mempool_perf_test with constant n shows an increase in rate_persec by an
average of 17 %, minimum 9.5 %, maximum 24 %.

The code path where the number of objects to get is unknown at build time
remains essentially unchanged.

Signed-off-by: Morten Brørup
Acked-by: Bruce Richardson
---
v3:
* Rebase to main.
v2:
* Added comments describing why some code is omitted when 'n' is known
  at build time.
* Improved source code readability by explicitly setting 'remaining' where
  relevant, instead of at initialization.
---
 lib/mempool/rte_mempool.h | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index ade0100ec7..7d14dea5c4 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1492,13 +1492,15 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 			   unsigned int n, struct rte_mempool_cache *cache)
 {
 	int ret;
-	unsigned int remaining = n;
+	unsigned int remaining;
 	uint32_t index, len;
 	void **cache_objs;
 
 	/* No cache provided */
-	if (unlikely(cache == NULL))
+	if (unlikely(cache == NULL)) {
+		remaining = n;
 		goto driver_dequeue;
+	}
 
 	cache_objs = &cache->objs[cache->len];
 
@@ -1518,14 +1520,23 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 		return 0;
 	}
 
-	/* Use the cache as much as we have to return hot objects first */
+	/*
+	 * Use the cache as much as we have to return hot objects first.
+	 * If the request size 'n' is known at build time, the above comparison
+	 * ensures that n > cache->len here, so omit RTE_MIN().
+	 */
 	len = __extension__(__builtin_constant_p(n)) ? cache->len :
-			RTE_MIN(remaining, cache->len);
+			RTE_MIN(n, cache->len);
 	cache->len -= len;
-	remaining -= len;
+	remaining = n - len;
 	for (index = 0; index < len; index++)
 		*obj_table++ = *--cache_objs;
 
+	/*
+	 * If the request size 'n' is known at build time, the case
+	 * where the entire request can be satisfied from the cache
+	 * has already been handled above, so omit handling it here.
+	 */
 	if (!__extension__(__builtin_constant_p(n)) && remaining == 0) {
 		/* The entire request is satisfied from the cache. */
 
-- 
2.17.1
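For readers who have not seen the technique before, here is a minimal
standalone sketch of the same dispatch on __builtin_constant_p() (a
GCC/Clang builtin), reduced to a toy stack cache. The names toy_cache,
toy_cache_get, toy_cache_get4 and TOY_CACHE_SIZE are hypothetical and are
not part of DPDK; the sketch only mirrors the shape of the code touched
by the patch, with the driver dequeue left out. Whether the compiler
actually unrolls the copy loop depends on optimization level and
inlining; the result is the same on either path.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SIZE 64

/* Hypothetical stand-in for struct rte_mempool_cache; not the DPDK type. */
struct toy_cache {
	unsigned int len;           /* number of valid entries in objs[] */
	void *objs[TOY_CACHE_SIZE]; /* used as a stack; hottest object on top */
};

/*
 * Pop up to n objects from the cache, hottest first.
 * Returns the number of objects still missing, which a real mempool
 * would then dequeue from its backend driver.
 */
static inline __attribute__((always_inline)) unsigned int
toy_cache_get(struct toy_cache *cache, void **obj_table, unsigned int n)
{
	void **cache_objs;
	unsigned int index, len;

	assert(cache->len <= TOY_CACHE_SIZE);
	cache_objs = &cache->objs[cache->len];

	if (__builtin_constant_p(n) && n <= cache->len) {
		/* n is a build-time constant: the loop bound is known. */
		cache->len -= n;
		for (index = 0; index < n; index++)
			*obj_table++ = *--cache_objs;
		return 0;
	}

	/* If n is a constant, n > cache->len holds here, so no min is needed. */
	len = __builtin_constant_p(n) ? cache->len :
			(n < cache->len ? n : cache->len);
	cache->len -= len;
	for (index = 0; index < len; index++)
		*obj_table++ = *--cache_objs;

	return n - len;
}

/* Call site with a literal request size: a candidate for the constant path. */
static inline unsigned int
toy_cache_get4(struct toy_cache *cache, void **obj_table)
{
	return toy_cache_get(cache, obj_table, 4);
}
```

A caller that passes a literal, as toy_cache_get4() does, is a candidate
for the constant path once the function is inlined; a caller that passes
a run-time value goes through the generic path, which computes the
minimum as before.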