From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 30 Oct 2015 13:52:34 +0000
From: Bruce Richardson
To: Pablo de Lara
Cc: dev@dpdk.org
Message-ID: <20151030135233.GA10248@bricha3-MOBL3>
References: <1443802033-245423-1-git-send-email-pablo.de.lara.guarch@intel.com>
 <1446207772-58543-1-git-send-email-pablo.de.lara.guarch@intel.com>
In-Reply-To: <1446207772-58543-1-git-send-email-pablo.de.lara.guarch@intel.com>
Organization: Intel Shannon Ltd.
Subject: Re: [dpdk-dev] [PATCH v2] hash: fix scaling by reducing contention

On Fri, Oct 30, 2015 at 12:22:52PM +0000, Pablo de Lara wrote:
> From: "De Lara Guarch, Pablo"
>
> When using multiple cores on a system with hardware transactional
> memory support, thread scaling did not work, because there was a
> single point of contention in the hash library that bottlenecked
> all threads: the "free_slots" ring, which stores the indices of
> all the free slots in the table.
>
> This patch fixes the problem by creating a local cache per logical
> core, which stores indices of free slots locally, so that most of
> the time writer threads do not interfere with each other.
>
> Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")
>
> Signed-off-by: Pablo de Lara
> ---
>
> Changes in v2:
> - Include patch dependency below
>
> This patch depends on patch "hash: free internal ring when freeing hash"
> (http://www.dpdk.org/dev/patchwork/patch/7377/)
>
>  app/test/test_hash_scaling.c         |   1 +
>  doc/guides/rel_notes/release_2_2.rst |   5 ++
>  lib/librte_hash/rte_cuckoo_hash.c    | 144 ++++++++++++++++++++++++++++++-----
>  lib/librte_hash/rte_hash.h           |   3 +
>  4 files changed, 133 insertions(+), 20 deletions(-)
>
> diff --git a/app/test/test_hash_scaling.c b/app/test/test_hash_scaling.c
> index 39602cb..e7cb7e2 100644
> --- a/app/test/test_hash_scaling.c
> +++ b/app/test/test_hash_scaling.c
> @@ -133,6 +133,7 @@ test_hash_scaling(int locking_mode)
>  		.hash_func = rte_hash_crc,
>  		.hash_func_init_val = 0,
>  		.socket_id = rte_socket_id(),
> +		.extra_flag = 1 << RTE_HASH_TRANS_MEM_SUPPORT_FLAG,

This flag could do with having its name adjusted: as used here it's not
really a flag, but a shift value.

>
> +static inline void
> +enqueue_slot(const struct rte_hash *h, struct lcore_cache *cached_free_slots,
> +		void *slot_id)
> +{
> +	if (h->hw_trans_mem_support) {
> +		cached_free_slots->objs[cached_free_slots->len] = slot_id;
> +		cached_free_slots->len++;
> +	} else
> +		rte_ring_sp_enqueue(h->free_slots, slot_id);
> +}
> +

This function could do with a comment explaining that it is only used
inside the add function and is not a generic enqueue function: it assumes
we are putting back an element we have just dequeued, so there is
guaranteed to be space in the local cache for it. Otherwise the lack of
support for a full cache is a glaring omission. :-)

/Bruce