Date: Mon, 7 Nov 2022 08:30:40 +0100
Subject: Re: [PATCH v4 3/3] mempool: use cache for frequently updated stats
To: Morten Brørup, olivier.matz@6wind.com, andrew.rybchenko@oktetlabs.ru, mattias.ronnblom@ericsson.com, stephen@networkplumber.org, jerinj@marvell.com, bruce.richardson@intel.com
Cc: thomas@monjalon.net, dev@dpdk.org
Message-ID: <532b74d2-37f3-7f0d-f7bc-f8f7034b6559@lysator.liu.se>
References: <20221104111740.330-1-mb@smartsharesystems.com>
    <20221104120329.1219-1-mb@smartsharesystems.com> <20221104120329.1219-3-mb@smartsharesystems.com>
From: Mattias Rönnblom
In-Reply-To: <20221104120329.1219-3-mb@smartsharesystems.com>

On 2022-11-04 13:03, Morten Brørup wrote:
> When built with stats enabled (RTE_LIBRTE_MEMPOOL_STATS defined), the
> performance of mempools with caches is improved as follows.
>
> When accessing objects in the mempool, either the put_bulk and put_objs or
> the get_success_bulk and get_success_objs statistics counters are likely
> to be incremented.
>
> By adding an alternative set of these counters to the mempool cache
> structure, accessing the dedicated statistics structure is avoided in the
> likely cases where these counters are incremented.
>
> The trick here is that the cache line holding the mempool cache structure
> is accessed anyway, in order to access the 'len' or 'flushthresh' fields.
> Updating some statistics counters in the same cache line has lower
> performance cost than accessing the statistics counters in the dedicated
> statistics structure, which resides in another cache line.
>
> mempool_perf_autotest with this patch shows the following improvements in
> rate_persec.
>
> The cost of enabling mempool stats (without debug) after this patch:
> -6.8 % and -6.7 %, respectively without and with cache.
>
> v4:
> * Fix checkpatch warnings:
>   A couple of typos in the patch description.
>   The macro to add to a mempool cache stat variable should not use
>   do {} while (0). Personally, I would tend to disagree with this, but
>   whatever keeps the CI happy.
> v3:
> * Don't update the description of the RTE_MEMPOOL_STAT_ADD macro.
>   This change belongs in the first patch of the series.
> v2:
> * Move the statistics counters into a stats structure.
>
> Signed-off-by: Morten Brørup
> ---
>  lib/mempool/rte_mempool.c |  9 ++++++
>  lib/mempool/rte_mempool.h | 66 ++++++++++++++++++++++++++++++++-------
>  2 files changed, 64 insertions(+), 11 deletions(-)
>
> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> index e6208125e0..a18e39af04 100644
> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c
> @@ -1286,6 +1286,15 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
>          sum.get_success_blks += mp->stats[lcore_id].get_success_blks;
>          sum.get_fail_blks += mp->stats[lcore_id].get_fail_blks;
>      }
> +    if (mp->cache_size != 0) {
> +        /* Add the statistics stored in the mempool caches. */
> +        for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> +            sum.put_bulk += mp->local_cache[lcore_id].stats.put_bulk;
> +            sum.put_objs += mp->local_cache[lcore_id].stats.put_objs;
> +            sum.get_success_bulk += mp->local_cache[lcore_id].stats.get_success_bulk;
> +            sum.get_success_objs += mp->local_cache[lcore_id].stats.get_success_objs;
> +        }
> +    }
>      fprintf(f, "  stats:\n");
>      fprintf(f, "    put_bulk=%"PRIu64"\n", sum.put_bulk);
>      fprintf(f, "    put_objs=%"PRIu64"\n", sum.put_objs);
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index abfe34c05f..e6eb573739 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -86,6 +86,19 @@ struct rte_mempool_cache {
>      uint32_t size;        /**< Size of the cache */
>      uint32_t flushthresh; /**< Threshold before we flush excess elements */
>      uint32_t len;         /**< Current cache count */
> +#ifdef RTE_LIBRTE_MEMPOOL_STATS
> +    uint32_t unused;
> +    /*
> +     * Alternative location for the most frequently updated mempool statistics (per-lcore),
> +     * providing faster update access when using a mempool cache.
> +     */
> +    struct {
> +        uint64_t put_bulk;         /**< Number of puts. */
> +        uint64_t put_objs;         /**< Number of objects successfully put. */
> +        uint64_t get_success_bulk; /**< Successful allocation number. */
> +        uint64_t get_success_objs; /**< Objects successfully allocated. */
> +    } stats;                       /**< Statistics */
> +#endif
>      /**
>       * Cache objects
>       *
> @@ -319,6 +332,22 @@ struct rte_mempool {
>  #define RTE_MEMPOOL_STAT_ADD(mp, name, n) do {} while (0)
>  #endif
>
> +/**
> + * @internal When stats is enabled, store some statistics.
> + *
> + * @param cache
> + *   Pointer to the memory pool cache.
> + * @param name
> + *   Name of the statistics field to increment in the memory pool cache.
> + * @param n
> + *   Number to add to the statistics.
> + */
> +#ifdef RTE_LIBRTE_MEMPOOL_STATS
> +#define RTE_MEMPOOL_CACHE_STAT_ADD(cache, name, n) (cache)->stats.name += n
> +#else
> +#define RTE_MEMPOOL_CACHE_STAT_ADD(cache, name, n) do {} while (0)
> +#endif
> +
>  /**
>   * @internal Calculate the size of the mempool header.
>   *
> @@ -1333,13 +1362,17 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
>  {
>      void **cache_objs;
>
> +    /* No cache provided */
> +    if (unlikely(cache == NULL))
> +        goto driver_enqueue;
> +
>      /* increment stat now, adding in mempool always success */
> -    RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> -    RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> +    RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> +    RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
>
> -    /* No cache provided or the request itself is too big for the cache */
> -    if (unlikely(cache == NULL || n > cache->flushthresh))
> -        goto driver_enqueue;
> +    /* The request itself is too big for the cache */
> +    if (unlikely(n > cache->flushthresh))
> +        goto driver_enqueue_stats_incremented;
>
>      /*
>       * The cache follows the following algorithm:
> @@ -1364,6 +1397,12 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
>
>  driver_enqueue:
>
> +    /* increment stat now, adding in mempool always success */
> +    RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> +    RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> +
> +driver_enqueue_stats_incremented:
> +
>      /* push objects to the backend */
>      rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
>  }
> @@ -1470,8 +1509,8 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
>      if (remaining == 0) {
>          /* The entire request is satisfied from the cache. */
>
> -        RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
> -        RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
> +        RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
> +        RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
>
>          return 0;
>      }
> @@ -1500,8 +1539,8 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
>
>      cache->len = cache->size;
>
> -    RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
> -    RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
> +    RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
> +    RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
>
>      return 0;
>
> @@ -1523,8 +1562,13 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
>          RTE_MEMPOOL_STAT_ADD(mp, get_fail_bulk, 1);
>          RTE_MEMPOOL_STAT_ADD(mp, get_fail_objs, n);
>      } else {
> -        RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
> -        RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
> +        if (likely(cache != NULL)) {
> +            RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
> +            RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
> +        } else {
> +            RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
> +            RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
> +        }
>      }
>
>      return ret;

For the series,

Reviewed-by: Mattias Rönnblom
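
For readers following along, the layout trick in the commit message can be sketched outside of DPDK. This is a minimal toy, not the actual rte_mempool code; the names toy_cache, TOY_STATS and TOY_CACHE_STAT_ADD are invented for illustration. The point is that the hot counters live in the same struct (and typically the same cache line) as the 'len'/'flushthresh' fields the fast path reads anyway, and that the add macro is a plain statement rather than a do/while block, mirroring the v4 checkpatch note:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_STATS 1 /* stand-in for RTE_LIBRTE_MEMPOOL_STATS */

/*
 * Toy version of the idea in this patch: keep the frequently updated
 * counters next to the fields the fast path touches anyway, instead of
 * in a separate per-lcore statistics structure in another cache line.
 */
struct toy_cache {
	uint32_t size;        /* capacity of the cache */
	uint32_t flushthresh; /* threshold before excess elements are flushed */
	uint32_t len;         /* current number of cached objects */
#ifdef TOY_STATS
	struct {
		uint64_t put_bulk; /* number of put operations */
		uint64_t put_objs; /* number of objects put */
	} stats;
#endif
};

/* Compiles away entirely when stats are disabled. */
#ifdef TOY_STATS
#define TOY_CACHE_STAT_ADD(cache, name, n) ((cache)->stats.name += (n))
#else
#define TOY_CACHE_STAT_ADD(cache, name, n) do {} while (0)
#endif
```

With TOY_STATS disabled, the struct shrinks back and the macro expands to nothing, so the zero-cost-when-off property of the existing RTE_MEMPOOL_STAT_ADD scheme is preserved.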
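
The reordered control flow in rte_mempool_do_generic_put can likewise be sketched as a toy (toy_pool, toy_cache2 and toy_put are invented types and names, not DPDK API): once a cache is known to exist, its resident counters are bumped up front, and the second label lets the oversized-request path reach the backend without counting the operation twice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_stats { uint64_t put_bulk; uint64_t put_objs; };

struct toy_cache2 {
	uint32_t flushthresh;   /* requests above this bypass the cache */
	uint32_t len;           /* current number of cached objects */
	struct toy_stats stats; /* cache-resident hot counters */
};

struct toy_pool {
	struct toy_stats stats; /* pool-resident counters (cache-less path) */
	uint64_t backend_objs;  /* stand-in for objects held by the driver */
};

/*
 * Toy mirror of the put path after this patch: with a cache, stats are
 * incremented in the cache structure as soon as the NULL check passes;
 * an oversized request then jumps past the pool-resident counters,
 * which only serve the cache-less path.
 */
static void
toy_put(struct toy_pool *p, struct toy_cache2 *c, unsigned int n)
{
	if (c == NULL)
		goto driver_enqueue;

	/* counters live next to 'len'/'flushthresh', touched anyway */
	c->stats.put_bulk += 1;
	c->stats.put_objs += n;

	if (n > c->flushthresh)
		goto driver_enqueue_stats_incremented;

	c->len += n; /* fast path: objects stay in the cache */
	return;

driver_enqueue:
	/* no cache: account in the pool-resident counters instead */
	p->stats.put_bulk += 1;
	p->stats.put_objs += n;

driver_enqueue_stats_incremented:
	/* stand-in for rte_mempool_ops_enqueue_bulk() */
	p->backend_objs += n;
}
```

Exercising the three paths (cached, oversized, cache-less) shows that each put is counted exactly once, in exactly one of the two counter locations.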