From: Andrew Rybchenko
Organization: OKTET Labs
Date: Sun, 9 Oct 2022 17:51:46 +0300
Subject: Re: [PATCH v6 3/4] mempool: fix cache flushing algorithm
To: Morten Brørup, Olivier Matz
Cc: dev@dpdk.org, Bruce Richardson
On 10/9/22 17:31, Morten Brørup wrote:
>> From: Andrew Rybchenko [mailto:andrew.rybchenko@oktetlabs.ru]
>> Sent: Sunday, 9 October 2022 15.38
>>
>> From: Morten Brørup
>>
>> Fix the rte_mempool_do_generic_put() cache flushing algorithm to
>> keep hot objects in the cache instead of cold ones.
>>
>> The algorithm was:
>> 1. Add the objects to the cache.
>> 2. Anything greater than the cache size (if it crosses the cache
>>    flush threshold) is flushed to the backend.
>>
>> Please note that the description in the source code said that it
>> kept "cache min value" objects after flushing, but the function
>> actually kept the cache full after flushing, which the above
>> description reflects.
>>
>> Now, the algorithm is:
>> 1. If the objects cannot be added to the cache without crossing the
>>    flush threshold, flush some cached objects to the backend to
>>    free up the required space.
>> 2. Add the objects to the cache.
>>
>> Previously, the most recent (hot) objects were flushed, leaving the
>> oldest (cold) objects in the mempool cache. The bug degraded
>> performance, because flushing prevented immediate reuse of the (hot)
>> objects already in the CPU cache. Now, the existing (cold) objects
>> in the mempool cache are flushed before the new (hot) objects are
>> added to the mempool cache.
>>
>> Since nearby code is touched anyway, fix the flush threshold
>> comparison to flush only if the threshold is really exceeded, not
>> just reached. I.e. it must be "len > flushthresh", not
>> "len >= flushthresh".
>> Consider a flush multiplier of 1 instead of 1.5; the cache would
>> already be flushed when reaching "size" objects, not when exceeding
>> "size" objects. In other words, the cache would not be able to hold
>> "size" objects, which is clearly a bug. The bug could degrade
>> performance due to premature flushing.
>>
>> Since we never exceed the flush threshold now, the cache size in the
>> mempool may be decreased from RTE_MEMPOOL_CACHE_MAX_SIZE * 3 to
>> RTE_MEMPOOL_CACHE_MAX_SIZE * 2. In fact it could be
>> CALC_CACHE_FLUSHTHRESH(RTE_MEMPOOL_CACHE_MAX_SIZE), but the flush
>> threshold multiplier is internal.
>>
>> Signed-off-by: Morten Brørup
>> Signed-off-by: Andrew Rybchenko
>> ---
>
> [...]
>
>> --- a/lib/mempool/rte_mempool.h
>> +++ b/lib/mempool/rte_mempool.h
>> @@ -90,7 +90,7 @@ struct rte_mempool_cache {
>>  	 * Cache is allocated to this size to allow it to overflow in certain
>>  	 * cases to avoid needless emptying of cache.
>>  	 */
>> -	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
>> +	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 2]; /**< Cache objects */
>>  } __rte_cache_aligned;
>
> How much are we allowed to break the ABI here?
>
> This patch reduces the size of the structure by removing a now unused
> part at the end, which should be harmless.
>
> If we may also move the position of the objs array, I would add
> __rte_cache_aligned to the objs array. It makes no difference in the
> general case, but if get/put operations are always 32 objects, it will
> reduce the number of memory (or last level cache) accesses from five
> to four 64 B cache lines for every get/put operation.
>
>  	uint32_t len;	      /**< Current cache count */
> -	/*
> -	 * Cache is allocated to this size to allow it to overflow in certain
> -	 * cases to avoid needless emptying of cache.
> -	 */
> -	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
> +	/**
> +	 * Cache objects
> +	 *
> +	 * Cache is allocated to this size to allow it to overflow in certain
> +	 * cases to avoid needless emptying of cache.
> +	 */
> +	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 2] __rte_cache_aligned;
> } __rte_cache_aligned;

I think aligning objs on a cache line should be a separate patch.

> With or without the above suggested optimization...
>
> Reviewed-by: Morten Brørup
>