From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9360141CB0; Thu, 16 Feb 2023 11:40:58 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1F1C640FDF; Thu, 16 Feb 2023 11:40:58 +0100 (CET) Received: from sender11-of-o51.zoho.eu (sender11-of-o51.zoho.eu [31.186.226.237]) by mails.dpdk.org (Postfix) with ESMTP id D38ED40EE3 for ; Thu, 16 Feb 2023 11:40:56 +0100 (CET) ARC-Seal: i=1; a=rsa-sha256; t=1676544012; cv=none; d=zohomail.eu; s=zohoarc; b=lfZCgkT1Ay+ZPVFjF5hy8sGaUO7t2BYXWotSu5PL+IR47zJZNlUFkP+yWy6HOBa75avi3oXUQRSl2u+UONiVRqRVPCnBBFpSn9RNYZ4XAREAEdVITq08+j3P+NNfD1yebAE+OaNIW4HhbetJKDVanM7H2FFW/P9EBOXrN9TF9Wo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.eu; s=zohoarc; t=1676544012; h=Content-Type:Cc:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To; bh=reheq8YehJDOAYyOIGAWoo+tgscQqpO6y1Udi2jUHfw=; b=f+f3atLQjB8TP4lFZfiSr19gM1xRSMGQeLVrzFnAjIPVtcIoLGgSV/VqHpCK7HzwJmFyrxXx801pdYKptx2ARy8iQhz/PBGdD41ct2cqC9DezG5Akc+cAn+/Fga4Je/JtAWNv3M3JSYotIAGqlXiR6NU/zL76XzOLw9R5Q2zLZY= ARC-Authentication-Results: i=1; mx.zohomail.eu; spf=pass smtp.mailfrom=liangma@liangbit.com; dmarc=pass header.from= Received: from C02GF04TMD6V (ec2-54-93-135-31.eu-central-1.compute.amazonaws.com [54.93.135.31]) by mx.zoho.eu with SMTPS id 1676544008903910.6440960528311; Thu, 16 Feb 2023 11:40:08 +0100 (CET) Date: Thu, 16 Feb 2023 10:40:05 +0000 From: Liang Ma To: Thomas Monjalon Cc: anatoly.burakov@intel.com, Fengnan Chang , dev@dpdk.org, rsanford@akamai.com, bruce.richardson@intel.com, Morten =?iso-8859-1?Q?Br=F8rup?= , maxime.coquelin@redhat.com, david.marchand@redhat.com, jerinj@marvell.com, honnappa.nagarahalli@arm.com Subject: Re: [PATCH] malloc: fix malloc performance may becomes worse as the number of malloc increases Message-ID: References: <20230210063022.52171-1-changfengnan@bytedance.com> <2134819.GUh0CODmnK@thomas> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2134819.GUh0CODmnK@thomas> X-ZohoMailClient: External X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Wed, Feb 15, 2023 at 11:10:25AM +0100, Thomas Monjalon wrote: > Looking for reviewers please. I will help on this. > > 10/02/2023 07:30, Fengnan Chang: > > Here is a simple test case: > > " > > uint64_t entry_time, time; > > size_t size = 4096; > > unsigned align = 4096; > > for (int j = 0; j < 10; j++) { > > entry_time = rte_get_timer_cycles(); > > for (int i = 0; i < 2000; i++) { > > rte_malloc(NULL, size, align); > > } > > time = (rte_get_timer_cycles()-entry_time) * 1000000 / > > rte_get_timer_hz(); > > printf("total open time %lu avg time %lu\n", time, time/2000); > > } > > " > > > > Single rte_malloc cost time may becomes wrose as the number of malloc > > increases, In my env, first round avg time is 15us, second is 44us, > > third is 77us, fourth is 168us... > > > > The reason is,in the malloc process, malloc_elem_alloc may split new_elem > > if there have too much free space after new_elem, and insert the trailer > > into freelist. When alloc 4k with align 4k, the trailer very likely insert > > to free_head[2] again, it makes free_head[2] longer. when alloc 4k again, > > it will search free_head[2] from begin, with the number of malloc increases, > > search free_head[2] need more time, so the performance will become worse. > > Same problem will also occurs in alloc 64k with align 64k, but if alloc > > 4k with align 64, doesn't have this problem. > > > > Fix this by adjust free_head list size range, make free_head[3] hold > > elements which bigger or equal 4k, free_head[4] hold elements which bigger > > or equal 16k. > > In terms of probabilities, when alloc 4k or 16k, the probability of finding > > a suitable elem from a larger size list is greater than from a smaller > > size list. > > > > Signed-off-by: Fengnan Chang > > --- > > lib/eal/common/malloc_elem.c | 14 +++++++------- > > 1 file changed, 7 insertions(+), 7 deletions(-) > > > > diff --git a/lib/eal/common/malloc_elem.c b/lib/eal/common/malloc_elem.c > > index 83f05497cc..35a2313d04 100644 > > --- a/lib/eal/common/malloc_elem.c > > +++ b/lib/eal/common/malloc_elem.c > > @@ -367,11 +367,11 @@ prev_elem_is_adjacent(struct malloc_elem *elem) > > * containing larger elements. > > * > > * Example element size ranges for a heap with five free lists: > > - * heap->free_head[0] - (0 , 2^8] > > - * heap->free_head[1] - (2^8 , 2^10] > > - * heap->free_head[2] - (2^10 ,2^12] > > - * heap->free_head[3] - (2^12, 2^14] > > - * heap->free_head[4] - (2^14, MAX_SIZE] > > + * heap->free_head[0] - (0 , 2^8) > > + * heap->free_head[1] - [2^8 , 2^10) > > + * heap->free_head[2] - [2^10 ,2^12) > > + * heap->free_head[3] - [2^12, 2^14) > > + * heap->free_head[4] - [2^14, MAX_SIZE] > > */ > > size_t > > malloc_elem_free_list_index(size_t size) > > @@ -385,8 +385,8 @@ malloc_elem_free_list_index(size_t size) > > if (size <= (1UL << MALLOC_MINSIZE_LOG2)) > > return 0; > > > > - /* Find next power of 2 >= size. */ > > - log2 = sizeof(size) * 8 - __builtin_clzl(size - 1); > > + /* Find next power of 2 > size. */ > > + log2 = sizeof(size) * 8 - __builtin_clzl(size); > > > > /* Compute freelist index, based on log2(size). */ > > index = (log2 - MALLOC_MINSIZE_LOG2 + MALLOC_LOG2_INCREMENT - 1) / > > > > > > >