From: Andrew Rybchenko
To: Olivier Matz
Cc: dev@dpdk.org, Morten Brørup, Bruce Richardson
Subject: [PATCH v6 3/4] mempool: fix cache flushing algorithm
Date: Sun, 9 Oct 2022 16:37:36 +0300
Message-Id: <20221009133737.795377-4-andrew.rybchenko@oktetlabs.ru>
In-Reply-To: <20221009133737.795377-1-andrew.rybchenko@oktetlabs.ru>
References: <98CBD80474FA8B44BF855DF32C47DC35D86DB2@smartserver.smartshare.dk>
 <20221009133737.795377-1-andrew.rybchenko@oktetlabs.ru>

From: Morten Brørup

Fix the rte_mempool_do_generic_put() cache flushing algorithm to keep
hot objects in the cache instead of cold ones.

The algorithm was:
1. Add the objects to the cache.
2. Anything greater than the cache size (if it crosses the cache flush
   threshold) is flushed to the backend.

Please note that the description in the source code said that it kept
"cache min value" objects after flushing, but the function actually
kept the cache full after flushing, which the above description
reflects.

Now, the algorithm is:
1. If the objects cannot be added to the cache without crossing the
   flush threshold, flush some cached objects to the backend to free
   up the required space.
2. Add the objects to the cache.

Previously, the most recent (hot) objects were flushed, leaving the
oldest (cold) objects in the mempool cache. This degraded performance,
because flushing prevented immediate reuse of the (hot) objects
already in the CPU cache. Now, the existing (cold) objects in the
mempool cache are flushed before the new (hot) objects are added to
the mempool cache.

Since nearby code is touched anyway, also fix the flush threshold
comparison to flush only if the threshold is actually exceeded, not
merely reached, i.e. it must be "len > flushthresh", not
"len >= flushthresh". Consider a flush multiplier of 1 instead of 1.5:
the cache would already be flushed when reaching "size" objects, not
when exceeding "size" objects. In other words, the cache would not be
able to hold "size" objects, which is clearly a bug. That bug could
degrade performance due to premature flushing.

Since we never exceed the flush threshold now, the cache size in the
mempool may be decreased from RTE_MEMPOOL_CACHE_MAX_SIZE * 3 to
RTE_MEMPOOL_CACHE_MAX_SIZE * 2. In fact it could be
CALC_CACHE_FLUSHTHRESH(RTE_MEMPOOL_CACHE_MAX_SIZE), but the flush
threshold multiplier is internal.

Signed-off-by: Morten Brørup
Signed-off-by: Andrew Rybchenko
---
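For illustration, here is a minimal standalone sketch of the new
put-side flush logic (not part of the patch; the "toy_" names and the
simplified struct are hypothetical stand-ins for the real rte_mempool
structures, and the numbers assume the current 1.5 flush threshold
multiplier):

#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct rte_mempool_cache. */
struct toy_cache {
	unsigned int size;        /* nominal cache size */
	unsigned int flushthresh; /* size * 1.5, per CALC_CACHE_FLUSHTHRESH */
	unsigned int len;         /* current number of cached objects */
	void *objs[64];           /* cached object pointers */
};

/* Stand-in for the backend (driver) enqueue operation. */
static void
toy_backend_enqueue(void * const *objs, unsigned int n)
{
	(void)objs;
	printf("flushed %u cold objects to the backend\n", n);
}

/*
 * New algorithm: flush the oldest (cold) objects first, then add the
 * new (hot) objects, so the hot objects stay in the cache.
 * Precondition (checked by the caller in the real code): n <= flushthresh.
 */
static void
toy_put(struct toy_cache *cache, void * const *obj_table, unsigned int n)
{
	void **cache_objs;

	if (cache->len + n <= cache->flushthresh) {
		/* Fits without crossing the threshold: just append. */
		cache_objs = &cache->objs[cache->len];
		cache->len += n;
	} else {
		/* Keep at most size - n old objects; flush the rest. */
		unsigned int keep = (n >= cache->size) ? 0 : (cache->size - n);

		cache_objs = &cache->objs[keep];
		toy_backend_enqueue(cache_objs, cache->len - keep);
		cache->len = keep + n;
	}

	/* The new (hot) objects always end up in the cache. */
	memcpy(cache_objs, obj_table, sizeof(void *) * n);
}

int
main(void)
{
	/*
	 * size 32 gives flushthresh 48; with len 40, putting 16 objects
	 * would cross 48, so keep = 32 - 16 = 16, 40 - 16 = 24 cold
	 * objects are flushed, and len becomes 16 + 16 = 32.
	 */
	struct toy_cache cache = { .size = 32, .flushthresh = 48, .len = 40 };
	void *objs[16] = { NULL };

	toy_put(&cache, objs, 16);
	printf("cache len after put: %u\n", cache.len);
	return 0;
}

Note that with this logic cache->len never ends up above
cache->flushthresh, i.e. at most 1.5 * size with the current
multiplier, which is why 2 * RTE_MEMPOOL_CACHE_MAX_SIZE entries
suffice for the objs array (the RTE_BUILD_BUG_ON below enforces this
at compile time).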
 lib/mempool/rte_mempool.c |  5 +++++
 lib/mempool/rte_mempool.h | 43 +++++++++++++++++++++++----------------
 2 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index de59009baf..4ba8ab7b63 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -746,6 +746,11 @@ rte_mempool_free(struct rte_mempool *mp)
 static void
 mempool_cache_init(struct rte_mempool_cache *cache, uint32_t size)
 {
+	/* Check that the cache has enough space for the flush threshold */
+	RTE_BUILD_BUG_ON(CALC_CACHE_FLUSHTHRESH(RTE_MEMPOOL_CACHE_MAX_SIZE) >
+			 RTE_SIZEOF_FIELD(struct rte_mempool_cache, objs) /
+			 RTE_SIZEOF_FIELD(struct rte_mempool_cache, objs[0]));
+
 	cache->size = size;
 	cache->flushthresh = CALC_CACHE_FLUSHTHRESH(size);
 	cache->len = 0;
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index a072e5554b..e3364ed7b8 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -90,7 +90,7 @@ struct rte_mempool_cache {
 	 * Cache is allocated to this size to allow it to overflow in certain
 	 * cases to avoid needless emptying of cache.
 	 */
-	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
+	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 2]; /**< Cache objects */
 } __rte_cache_aligned;
 
 /**
@@ -1329,30 +1329,39 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 	RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
 	RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
 
-	/* No cache provided or if put would overflow mem allocated for cache */
-	if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
+	/* No cache provided or the request itself is too big for the cache */
+	if (unlikely(cache == NULL || n > cache->flushthresh))
 		goto driver_enqueue;
 
-	cache_objs = &cache->objs[cache->len];
-
 	/*
-	 * The cache follows the following algorithm
-	 * 1. Add the objects to the cache
-	 * 2. Anything greater than the cache min value (if it crosses the
-	 *    cache flush threshold) is flushed to the backend.
+	 * The cache follows the following algorithm:
+	 * 1. If the objects cannot be added to the cache without crossing
+	 *    the flush threshold, flush the cache to the backend.
+	 * 2. Add the objects to the cache.
 	 */
 
-	/* Add elements back into the cache */
-	rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
-
-	cache->len += n;
+	if (cache->len + n <= cache->flushthresh) {
+		cache_objs = &cache->objs[cache->len];
+		cache->len += n;
+	} else {
+		unsigned int keep = (n >= cache->size) ? 0 : (cache->size - n);
 
-	if (cache->len >= cache->flushthresh) {
-		rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache->size],
-				cache->len - cache->size);
-		cache->len = cache->size;
+		/*
+		 * If the number of objects to keep in the cache is positive:
+		 * keep = cache->size - n < cache->flushthresh - n < cache->len
+		 * since cache->flushthresh > cache->size.
+		 * If keep is 0, cache->len cannot be 0 anyway, since
+		 * n <= cache->flushthresh and we would not be here with
+		 * cache->len == 0.
+		 */
+		cache_objs = &cache->objs[keep];
+		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len - keep);
+		cache->len = keep + n;
 	}
 
+	/* Add the objects to the cache. */
+	rte_memcpy(cache_objs, obj_table, sizeof(void *) * n);
+
 	return;
 
 driver_enqueue:
-- 
2.30.2