From: Olivier MATZ
To: Zoltan Kiss, dev@dpdk.org
Date: Tue, 30 Jun 2015 13:58:39 +0200
Message-ID: <5592846F.8050505@6wind.com>
In-Reply-To: <1435258110-17140-1-git-send-email-zoltan.kiss@linaro.org>
References: <1435258110-17140-1-git-send-email-zoltan.kiss@linaro.org>
Subject: Re: [dpdk-dev] [PATCH] mempool: improbe cache search

Hi Zoltan,

On 06/25/2015 08:48 PM, Zoltan Kiss wrote:
> The current way has a few problems:
>
> - if cache->len < n, we copy our elements into the cache first, then
>   into obj_table, that's unnecessary
> - if n >= cache_size (or the backfill fails), and we can't fulfil the
>   request from the ring alone, we don't try to combine with the cache
> - if refill fails, we don't return anything, even if the ring has
>   enough for our request
>
> This patch rewrites it severely:
> - at the first part of the function we only try the cache if
>   cache->len >= n
> - otherwise take our elements straight from the ring
> - if that fails but we have something in the cache, try to combine
>   them
> - the refill happens at the end, and its failure doesn't modify our
>   return value

Indeed, it looks easier to read that way.

I checked the performance with "mempool_perf_autotest" of app/test and
it shows that there is no regression (it is even slightly better in
some test cases).

There is a small typo in the title:
s/improbe/improve

Please see also a comment below.
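
By the way, to make sure I read the new flow correctly, I sketched it
as a small standalone program. This is plain C and *not* the patch
itself: the "ring" here is just an array with an all-or-nothing bulk
dequeue, the names ring_dequeue_bulk()/get_bulk() are mine, and the
multi-consumer, stats and #ifdef details are left out.

/* toy_get_bulk.c - standalone model of the reworked get-bulk flow */
#include <stdio.h>
#include <string.h>

#define RING_OBJS  64
#define CACHE_SIZE  8

static int ring[RING_OBJS];
static unsigned ring_len;

static int cache[CACHE_SIZE];
static unsigned cache_len;

/* all-or-nothing bulk dequeue, standing in for rte_ring_*_dequeue_bulk() */
static int ring_dequeue_bulk(int *objs, unsigned n)
{
	if (ring_len < n)
		return -1;
	ring_len -= n;
	memcpy(objs, &ring[ring_len], n * sizeof(*objs));
	return 0;
}

static int get_bulk(int *obj_table, unsigned n)
{
	unsigned i;
	int ret;

	/* 1) fast path: the cache alone holds at least n objects */
	if (cache_len >= n) {
		for (i = 0; i < n; i++)
			obj_table[i] = cache[--cache_len];
		ret = 0;
		goto refill;
	}

	/* 2) otherwise take all n objects straight from the ring */
	ret = ring_dequeue_bulk(obj_table, n);

	/* 3) the ring alone was not enough: try ring + cache combined */
	if (ret < 0 && cache_len > 0) {
		unsigned req = n - cache_len;

		ret = ring_dequeue_bulk(obj_table, req);
		if (ret == 0) {
			for (i = 0; i < cache_len; i++)
				obj_table[req + i] = cache[i];
			cache_len = 0;
		}
	}

refill:
	/* 4) refill happens last; its failure must not change ret */
	if (ret == 0 && cache_len < n) {
		unsigned req = CACHE_SIZE - cache_len;

		if (ring_dequeue_bulk(&cache[cache_len], req) == 0)
			cache_len += req;
	}
	return ret;
}

int main(void)
{
	int objs[16];
	unsigned i;

	for (i = 0; i < RING_OBJS; i++)
		ring[ring_len++] = i;

	printf("get 4 -> %d (cache_len=%u)\n", get_bulk(objs, 4), cache_len);
	printf("get 2 -> %d (cache_len=%u)\n", get_bulk(objs, 2), cache_len);
	return 0;
}

The properties it tries to mirror are the ones from your description:
the cache is only used when it can serve the whole request, the
ring+cache combination only kicks in when the ring alone fails, and
the refill at the end can never turn a successful get into a failure.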
>
> Signed-off-by: Zoltan Kiss
> ---
>  lib/librte_mempool/rte_mempool.h | 63 +++++++++++++++++++++++++---------------
>  1 file changed, 39 insertions(+), 24 deletions(-)
>
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index a8054e1..896946c 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -948,34 +948,14 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
>  	unsigned lcore_id = rte_lcore_id();
>  	uint32_t cache_size = mp->cache_size;
>
> -	/* cache is not enabled or single consumer */
> +	cache = &mp->local_cache[lcore_id];
> +	/* cache is not enabled or single consumer or not enough */
>  	if (unlikely(cache_size == 0 || is_mc == 0 ||
> -		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
> +		     cache->len < n || lcore_id >= RTE_MAX_LCORE))
>  		goto ring_dequeue;
>
> -	cache = &mp->local_cache[lcore_id];
>  	cache_objs = cache->objs;
>
> -	/* Can this be satisfied from the cache? */
> -	if (cache->len < n) {
> -		/* No. Backfill the cache first, and then fill from it */
> -		uint32_t req = n + (cache_size - cache->len);
> -
> -		/* How many do we require i.e. number to fill the cache + the request */
> -		ret = rte_ring_mc_dequeue_bulk(mp->ring, &cache->objs[cache->len], req);
> -		if (unlikely(ret < 0)) {
> -			/*
> -			 * In the offchance that we are buffer constrained,
> -			 * where we are not able to allocate cache + n, go to
> -			 * the ring directly. If that fails, we are truly out of
> -			 * buffers.
> -			 */
> -			goto ring_dequeue;
> -		}
> -
> -		cache->len += req;
> -	}
> -
>  	/* Now fill in the response ... */
>  	for (index = 0, len = cache->len - 1; index < n; ++index, len--, obj_table++)
>  		*obj_table = cache_objs[len];
> @@ -984,7 +964,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
>
>  	__MEMPOOL_STAT_ADD(mp, get_success, n);
>
> -	return 0;
> +	ret = 0;
> +	goto cache_refill;
>
>  ring_dequeue:
>  #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
> @@ -995,11 +976,45 @@ ring_dequeue:
>  	else
>  		ret = rte_ring_sc_dequeue_bulk(mp->ring, obj_table, n);
>
> +#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> +	if (ret < 0 && is_mc == 1 && cache->len > 0) {

if (unlikely(ret < 0 && is_mc == 1 && cache->len > 0)) ?

> +		uint32_t req = n - cache->len;
> +
> +		ret = rte_ring_mc_dequeue_bulk(mp->ring, obj_table, req);
> +		if (ret == 0) {
> +			cache_objs = cache->objs;
> +			obj_table += req;
> +			for (index = 0; index < cache->len;
> +			     ++index, ++obj_table)
> +				*obj_table = cache_objs[index];
> +			cache->len = 0;
> +		}
> +	}
> +#endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
> +
>  	if (ret < 0)
>  		__MEMPOOL_STAT_ADD(mp, get_fail, n);
>  	else
>  		__MEMPOOL_STAT_ADD(mp, get_success, n);
>
> +#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> +cache_refill:
> +	/* If previous dequeue was OK and we have less than n, start refill */
> +	if (ret == 0 && cache_size > 0 && cache->len < n) {

Not sure it's likely or unlikely there. I'd tend to say unlikely, as
the cache size is probably big compared to n most of the time.

I don't know if it would have a real performance impact though, but I
think it won't hurt.

Regards,
Olivier

> +		uint32_t req = cache_size - cache->len;
> +
> +		cache_objs = cache->objs;
> +		ret = rte_ring_mc_dequeue_bulk(mp->ring,
> +					       &cache->objs[cache->len],
> +					       req);
> +		if (likely(ret == 0))
> +			cache->len += req;
> +		else
> +			/* Don't spoil the return value */
> +			ret = 0;
> +	}
> +#endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
> +
>  	return ret;
> }
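
PS: to be concrete about the hint I mean above, here is a tiny
compilable illustration. Again, this is not the patch: my_unlikely()
just stands in for DPDK's unlikely() from rte_branch_prediction.h
(which is built on __builtin_expect), and refill_needed() is an
invented helper with the same shape as the refill test.

#include <stdio.h>

/* stand-in for DPDK's unlikely() from rte_branch_prediction.h */
#define my_unlikely(x) __builtin_expect(!!(x), 0)

/* same shape as the refill condition in the patch, with the hint
 * added: the assumption is that the cache usually holds more than n
 * objects, so the refill branch is the cold path */
static int refill_needed(int ret, unsigned cache_size,
			 unsigned cache_len, unsigned n)
{
	return my_unlikely(ret == 0 && cache_size > 0 && cache_len < n);
}

int main(void)
{
	printf("%d\n", refill_needed(0, 8, 2, 4)); /* 1: refill expected */
	printf("%d\n", refill_needed(0, 8, 6, 4)); /* 0: cache has enough */
	return 0;
}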