From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id E38EBA0542; Sun, 9 Oct 2022 15:38:04 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id B238C4281E; Sun, 9 Oct 2022 15:37:48 +0200 (CEST)
Received: from shelob.oktetlabs.ru (shelob.oktetlabs.ru [91.220.146.113]) by mails.dpdk.org (Postfix) with ESMTP id 0DA9E4021E for ; Sun, 9 Oct 2022 15:37:45 +0200 (CEST)
Received: by shelob.oktetlabs.ru (Postfix, from userid 115) id B0BD690; Sun, 9 Oct 2022 16:37:44 +0300 (MSK)
Received: from aros.oktetlabs.ru (aros.oktetlabs.ru [192.168.38.17]) by shelob.oktetlabs.ru (Postfix) with ESMTP id 277B083; Sun, 9 Oct 2022 16:37:43 +0300 (MSK)
From: Andrew Rybchenko
To: Olivier Matz
Cc: dev@dpdk.org, Morten Brørup, Bruce Richardson
Subject: [PATCH v6 4/4] mempool: flush cache completely on overflow
Date: Sun, 9 Oct 2022 16:37:37 +0300
Message-Id: <20221009133737.795377-5-andrew.rybchenko@oktetlabs.ru>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221009133737.795377-1-andrew.rybchenko@oktetlabs.ru>
References: <98CBD80474FA8B44BF855DF32C47DC35D86DB2@smartserver.smartshare.dk> <20221009133737.795377-1-andrew.rybchenko@oktetlabs.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

The cache was
still full after flushing.

In the opposite direction, i.e. when getting objects from the cache, the cache is refilled to full level when it crosses the low watermark (which happens to be zero). Similarly, the cache should be flushed to empty level when it crosses the high watermark (which happens to be 1.5 x the size of the cache).

The existing flushing behaviour was suboptimal for real applications, because crossing the low or high watermark typically happens when the application is in a state where the number of put/get events is out of balance, e.g. when absorbing a burst of packets into a QoS queue (getting more mbufs from the mempool), or when a burst of packets is trickling out from the QoS queue (putting the mbufs back into the mempool).

Now, the mempool cache is completely flushed when crossing the flush threshold, so only the newly put (hot) objects remain in the mempool cache afterwards.

This bug degraded performance by causing too frequent flushing.

Consider this application scenario:

Either, an lcore thread in the application is in a state of balance, where it uses the mempool cache within its flush/refill boundaries; in this situation, the flush method is less important, and this fix is irrelevant.

Or, an lcore thread in the application is out of balance (either permanently or temporarily), and mostly gets or puts objects from/to the mempool. If it mostly puts objects, not flushing all of the objects will cause more frequent flushing. This is the scenario addressed by this fix.

E.g.: Cache size=256, flushthresh=384 (1.5x size), initial len=256; application burst len=32.

If there are "size" objects in the cache after flushing, the cache is flushed at every 4th burst. If the cache is flushed completely, the cache is only flushed at every 16th burst.

As you can see, this bug caused the cache to be flushed 4x too frequently in this example.
And when/if the application thread breaks its pattern of continuously putting objects, and suddenly starts to get objects instead, it will either get objects already in the cache, or the get() function will refill the cache.

The concept of not flushing the cache completely was probably based on an assumption that it is more likely for an application's lcore thread to get() after flushing than to put() after flushing. I strongly disagree with this assumption! If an application thread is continuously putting so much that it overflows the cache, it is much more likely to keep putting than it is to start getting. If in doubt, consider how CPU branch predictors work: when the application has done something many times consecutively, the branch predictor will expect the application to do the same again, rather than suddenly do something else.

Signed-off-by: Morten Brørup
Signed-off-by: Andrew Rybchenko
---
 lib/mempool/rte_mempool.h | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index e3364ed7b8..26b2697572 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1344,19 +1344,9 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 		cache_objs = &cache->objs[cache->len];
 		cache->len += n;
 	} else {
-		unsigned int keep = (n >= cache->size) ? 0 : (cache->size - n);
-
-		/*
-		 * If number of object to keep in the cache is positive:
-		 * keep = cache->size - n < cache->flushthresh - n < cache->len
-		 * since cache->flushthresh > cache->size.
-		 * If keep is 0, cache->len cannot be 0 anyway since
-		 * n <= cache->flushthresh and we'd no be here with
-		 * cache->len == 0.
-		 */
-		cache_objs = &cache->objs[keep];
-		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len - keep);
-		cache->len = keep + n;
+		cache_objs = &cache->objs[0];
+		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
+		cache->len = n;
 	}
 
 	/* Add the objects to the cache. */
-- 
2.30.2