From: "Hunt, David"
To: Olivier Matz, dev@dpdk.org
Date: Fri, 17 Jun 2016 15:18:26 +0100
Message-ID: <576406B2.5060605@intel.com>
In-Reply-To: <5742FDA6.5070108@6wind.com>
References: <1462472982-49782-1-git-send-email-david.hunt@intel.com>
 <1463669335-30378-1-git-send-email-david.hunt@intel.com>
 <1463669335-30378-2-git-send-email-david.hunt@intel.com>
 <5742FDA6.5070108@6wind.com>
Subject: Re: [dpdk-dev] [PATCH v2 1/3] mempool: add stack (lifo) mempool handler

Hi Olivier,

On 23/5/2016 1:55 PM, Olivier Matz wrote:
> Hi David,
>
> Please find some comments below.
>
> On 05/19/2016 04:48 PM, David Hunt wrote:
>> [...]
>> +++ b/lib/librte_mempool/rte_mempool_stack.c
>> @@ -0,0 +1,145 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>
> Should be 2016?

Yes, fixed.

>> ...
>> +
>> +static void *
>> +common_stack_alloc(struct rte_mempool *mp)
>> +{
>> +	struct rte_mempool_common_stack *s;
>> +	unsigned n = mp->size;
>> +	int size = sizeof(*s) + (n+16)*sizeof(void *);
>> +
>> +	/* Allocate our local memory structure */
>> +	s = rte_zmalloc_socket("common-stack",
>
> "mempool-stack" ?

Done.

>> +			size,
>> +			RTE_CACHE_LINE_SIZE,
>> +			mp->socket_id);
>> +	if (s == NULL) {
>> +		RTE_LOG(ERR, MEMPOOL, "Cannot allocate stack!\n");
>> +		return NULL;
>> +	}
>> +
>> +	rte_spinlock_init(&s->sl);
>> +
>> +	s->size = n;
>> +	mp->pool = s;
>> +	rte_mempool_set_handler(mp, "stack");
>
> rte_mempool_set_handler() is a user function, it shouldn't be called here.

Removed.

>> +
>> +	return s;
>> +}
>> +
>> +static int common_stack_put(void *p, void * const *obj_table,
>> +		unsigned n)
>> +{
>> +	struct rte_mempool_common_stack *s = p;
>> +	void **cache_objs;
>> +	unsigned index;
>> +
>> +	rte_spinlock_lock(&s->sl);
>> +	cache_objs = &s->objs[s->len];
>> +
>> +	/* Is there sufficient space in the stack ? */
>> +	if ((s->len + n) > s->size) {
>> +		rte_spinlock_unlock(&s->sl);
>> +		return -ENOENT;
>> +	}
>
> The usual return value for a failing put() is -ENOBUFS (see rte_ring).

Done.
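For reference, here's roughly what the put path looks like with that change. The push loop at the end isn't in the quoted hunk above, so treat this as an illustrative completion rather than the exact v3 code:

/*
 * Sketch of the updated put callback. Assumes the
 * rte_mempool_common_stack layout from this patch
 * (spinlock sl, size, len, objs[]); the copy loop is
 * illustrative, not copied verbatim from the series.
 */
static int common_stack_put(void *p, void * const *obj_table,
		unsigned n)
{
	struct rte_mempool_common_stack *s = p;
	void **cache_objs;
	unsigned index;

	rte_spinlock_lock(&s->sl);
	cache_objs = &s->objs[s->len];

	/* Is there sufficient space in the stack? */
	if ((s->len + n) > s->size) {
		rte_spinlock_unlock(&s->sl);
		return -ENOBUFS;	/* match rte_ring's convention */
	}

	/* Push the objects onto the stack and update the length. */
	for (index = 0; index < n; index++, obj_table++)
		cache_objs[index] = *obj_table;
	s->len += n;

	rte_spinlock_unlock(&s->sl);
	return 0;
}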
> After reading it, I realize that it's nearly exactly the same code as
> in "app/test: test external mempool handler".
> http://patchwork.dpdk.org/dev/patchwork/patch/12896/
>
> We should drop one of them. If this stack handler is really useful for
> a performance use-case, it could go in librte_mempool. At first read,
> the code looks like a demo example: it uses a simple spinlock for
> concurrent accesses to the common pool. Maybe the mempool cache hides
> this cost, in which case we could also consider removing the use of
> the rte_ring.

While I agree that the code is similar, the handler in the test is a
ring-based handler, whereas this patch adds an array-based handler. I
think the case for leaving it in as a test of the external handler
mechanism from the previous mempool patches is still valid, but there
may be a case for removing it once the stack handler is added. Maybe a
future patch?

> Do you have some performance numbers? Do you know if it scales
> with the number of cores?

For the mempool_perf_autotest, I'm seeing a 30% increase in performance
for the local-cache use-case for 1 to 36 cores (results vary between a
10% and 45% gain across those tests, averaging about 30%). However, for
the tests with no local cache configured, enqueue/dequeue throughput
drops by about 30%, with the 36-core case yielding the largest drop, of
about 40%. So this handler would not be recommended for no-cache
applications.

> If we can identify the conditions where this mempool handler
> outperforms the default handler, it would be valuable to have them
> in the documentation.
>

Regards,
Dave.
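P.S. For anyone who wants to try the handler out, below is a minimal
usage sketch. It uses the rte_mempool_set_handler() name from this
patch series (which may change in later revisions) and a non-zero
per-lcore cache, per the numbers above; the pool and element sizes are
placeholders only.

#include <rte_mempool.h>
#include <rte_lcore.h>

/* Create a pool that uses the "stack" handler; sizes are examples. */
static struct rte_mempool *
create_stack_pool(void)
{
	struct rte_mempool *mp;

	mp = rte_mempool_create_empty("stack_pool",
			8192,	/* number of elements */
			2048,	/* element size */
			256,	/* per-lcore cache, as recommended above */
			0, rte_socket_id(), 0);
	if (mp == NULL)
		return NULL;

	/* Select the handler before populating the pool. */
	if (rte_mempool_set_handler(mp, "stack") < 0 ||
			rte_mempool_populate_default(mp) < 0) {
		rte_mempool_free(mp);
		return NULL;
	}

	return mp;
}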