From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7432341CA4; Wed, 15 Feb 2023 11:10:32 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 60D7C40EE3; Wed, 15 Feb 2023 11:10:32 +0100 (CET) Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) by mails.dpdk.org (Postfix) with ESMTP id 7180240A8B for ; Wed, 15 Feb 2023 11:10:31 +0100 (CET) Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.nyi.internal (Postfix) with ESMTP id 067B85C0143; Wed, 15 Feb 2023 05:10:29 -0500 (EST) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Wed, 15 Feb 2023 05:10:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=monjalon.net; h= cc:cc:content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm1; t=1676455829; x= 1676542229; bh=y5hdlowHYBcjW2Tv3y9NbxZfXgtxKVcJT7Ku2Q4Yk0I=; b=j m0Rda4hin9bgPQGpUOU/slq70v8z73ksqrXFn8IyotQohpqNWgV20BvDmyEgeTuz lRbfEzZnDkLDL3AY+rmy0n4ngI5dS4MncAhjiBd8i3nsmmipCNqxV7wMarMq0wty 1jeID6s8jaTl4pMP9MUK5I5PNCe9/8rdODb5C9nXMf1db/3Y3/OB921fJm7x1+Zh abN1IqiQJOZBK29gekWHZBaTn/kIJLOfcWYjJH2134tAWycqE2BIvuOMmkYLgGqZ YwmhTk2ALFZeoFI6+m5vNI5MI2hM6Ynjnow1l3b/e9gsdzrSFhADWLeGd90SWsN9 1x6Ltr+4vvsIdE4OTJeOg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1676455829; x= 1676542229; bh=y5hdlowHYBcjW2Tv3y9NbxZfXgtxKVcJT7Ku2Q4Yk0I=; b=e pE35KESFa+RW3usIZNtX4pLB85MtdeBEj9YLEfxDxs4H12UhnSgWO1HZ24FYKUes +cQusinDMextfsCA/n5G/dVCBlTVUeRqxS4Z5/azsazv2tXd91eVOnXKFX4B4rgd X/BOVJc0+ahW4oJgi/4oAd+5BdHJ36gMVXFm3kjQmkFfORSj+YbZKTHnl4MryLeI XYrKdf/GmNXyKsR2rmQbVj90vCCsRCryjHR4H7ZadgnofzMY5QslceOuRVFUKpwq UybCMU7aqA0AT57g+vXlDVH+EE3nqxLySznaQeVcDgpknqtfZY4OT+Qfo15m9Vrj EvGuw5ISdgGXQnzIzm6PA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrudeihedgtdeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhephffvvefufffkjghfggfgtgesthfuredttddtvdenucfhrhhomhepvfhhohhm rghsucfoohhnjhgrlhhonhcuoehthhhomhgrshesmhhonhhjrghlohhnrdhnvghtqeenuc ggtffrrghtthgvrhhnpedtjeeiieefhedtfffgvdelteeufeefheeujefgueetfedttdei kefgkeduhedtgfenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfh hrohhmpehthhhomhgrshesmhhonhhjrghlohhnrdhnvght X-ME-Proxy: Feedback-ID: i47234305:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 15 Feb 2023 05:10:27 -0500 (EST) From: Thomas Monjalon To: anatoly.burakov@intel.com, Fengnan Chang Cc: dev@dpdk.org, rsanford@akamai.com, bruce.richardson@intel.com, Morten =?ISO-8859-1?Q?Br=F8rup?= , maxime.coquelin@redhat.com, david.marchand@redhat.com, jerinj@marvell.com, honnappa.nagarahalli@arm.com Subject: Re: [PATCH] malloc: fix malloc performance may becomes worse as the number of malloc increases Date: Wed, 15 Feb 2023 11:10:25 +0100 Message-ID: <2134819.GUh0CODmnK@thomas> In-Reply-To: <20230210063022.52171-1-changfengnan@bytedance.com> References: <20230210063022.52171-1-changfengnan@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Looking for reviewers please. 10/02/2023 07:30, Fengnan Chang: > Here is a simple test case: > " > uint64_t entry_time, time; > size_t size = 4096; > unsigned align = 4096; > for (int j = 0; j < 10; j++) { > entry_time = rte_get_timer_cycles(); > for (int i = 0; i < 2000; i++) { > rte_malloc(NULL, size, align); > } > time = (rte_get_timer_cycles()-entry_time) * 1000000 / > rte_get_timer_hz(); > printf("total open time %lu avg time %lu\n", time, time/2000); > } > " > > Single rte_malloc cost time may becomes wrose as the number of malloc > increases, In my env, first round avg time is 15us, second is 44us, > third is 77us, fourth is 168us... > > The reason is,in the malloc process, malloc_elem_alloc may split new_elem > if there have too much free space after new_elem, and insert the trailer > into freelist. When alloc 4k with align 4k, the trailer very likely insert > to free_head[2] again, it makes free_head[2] longer. when alloc 4k again, > it will search free_head[2] from begin, with the number of malloc increases, > search free_head[2] need more time, so the performance will become worse. > Same problem will also occurs in alloc 64k with align 64k, but if alloc > 4k with align 64, doesn't have this problem. > > Fix this by adjust free_head list size range, make free_head[3] hold > elements which bigger or equal 4k, free_head[4] hold elements which bigger > or equal 16k. > In terms of probabilities, when alloc 4k or 16k, the probability of finding > a suitable elem from a larger size list is greater than from a smaller > size list. > > Signed-off-by: Fengnan Chang > --- > lib/eal/common/malloc_elem.c | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/lib/eal/common/malloc_elem.c b/lib/eal/common/malloc_elem.c > index 83f05497cc..35a2313d04 100644 > --- a/lib/eal/common/malloc_elem.c > +++ b/lib/eal/common/malloc_elem.c > @@ -367,11 +367,11 @@ prev_elem_is_adjacent(struct malloc_elem *elem) > * containing larger elements. > * > * Example element size ranges for a heap with five free lists: > - * heap->free_head[0] - (0 , 2^8] > - * heap->free_head[1] - (2^8 , 2^10] > - * heap->free_head[2] - (2^10 ,2^12] > - * heap->free_head[3] - (2^12, 2^14] > - * heap->free_head[4] - (2^14, MAX_SIZE] > + * heap->free_head[0] - (0 , 2^8) > + * heap->free_head[1] - [2^8 , 2^10) > + * heap->free_head[2] - [2^10 ,2^12) > + * heap->free_head[3] - [2^12, 2^14) > + * heap->free_head[4] - [2^14, MAX_SIZE] > */ > size_t > malloc_elem_free_list_index(size_t size) > @@ -385,8 +385,8 @@ malloc_elem_free_list_index(size_t size) > if (size <= (1UL << MALLOC_MINSIZE_LOG2)) > return 0; > > - /* Find next power of 2 >= size. */ > - log2 = sizeof(size) * 8 - __builtin_clzl(size - 1); > + /* Find next power of 2 > size. */ > + log2 = sizeof(size) * 8 - __builtin_clzl(size); > > /* Compute freelist index, based on log2(size). */ > index = (log2 - MALLOC_MINSIZE_LOG2 + MALLOC_LOG2_INCREMENT - 1) / >