From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 3F28041C5B; Fri, 10 Feb 2023 07:30:33 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D060740EE6; Fri, 10 Feb 2023 07:30:32 +0100 (CET) Received: from mail-pj1-f52.google.com (mail-pj1-f52.google.com [209.85.216.52]) by mails.dpdk.org (Postfix) with ESMTP id 1941540EE3 for ; Fri, 10 Feb 2023 07:30:31 +0100 (CET) Received: by mail-pj1-f52.google.com with SMTP id m2-20020a17090a414200b00231173c006fso8014225pjg.5 for ; Thu, 09 Feb 2023 22:30:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=YI/O9k0a04f7MxwmetmCJXBBwep+zkDerDRUxi8tIlI=; b=o7XaNqMb6K52gBPGjspmVWY5nFIa9YghadXeoPSRGKcDQZYoTvu8/S/TCj4+JjUnwk fo/13b856RkMgvAR7+7XSVCd8d0Gndo84KIdcfv9R+YojtTluyQMXxzb2ruJee1ByC36 /B0NZsc1k6pe/A5igfu0p8vkMS/LpSPt4F2l88smJlh/2mSh4Xpz1vtnFRXRf1h1rqVG p9iY+hqzgzlWKvZDRX4vXlPlqWfH+5JYxgSgSnk3OsrAId+g3Ya61Qyyq60wWVn+JMfx WoqBvPNZPGeaw6fbi7/0dUzZBImYlrIQ6maB2ZaRvMhOgW6N64DpbKywUvM8nsZLod3R 1iEg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=YI/O9k0a04f7MxwmetmCJXBBwep+zkDerDRUxi8tIlI=; b=bRclAfdNkLIwal7SEwwdLFaJ2ro4PY9e/kc7O3+jGR/5BRlSdVgei31utd96Pg5DM1 CJf8Pv2ql8VTtaWMEfEmuo4gGNLWajI9Y/c2tv944TNjJIUK9o66AjfCxa5zM+YZn/6T DLHiEs3/Hcx/Y461UWYvq2LbK60kcLEdK9BEPeU74L4sDrCIEG+gMAszmB2/6ZoHO/1I uPmHspXIgDhS07BJUwfrQjNBzo5ChO5VyQmX4yVxR7fYjdD8CAHEbkvL9Xth4LM8oz4i ZwO04umjGeE54teAoG3Ze5ERCxXJ+2mBMUA5xraYtejPCPe/dyCGTmsFOZ7/gZxrLF7H V6+w== X-Gm-Message-State: AO0yUKV3mHq4M5x6+ktZhA2LNsZIivb71AwOdfeqJj2lpiaXY6QRWmrF CL0HJFs2PSnuvKJqH9UZKWfKdw== X-Google-Smtp-Source: AK7set88dfDq5cVtzP5Xg8OdGASbAHgBNQNjO+Cn5giA93yGfuGGwi4FqNORkwgT1ptZsU8nUEJnEQ== X-Received: by 2002:a17:902:e748:b0:19a:723a:81b2 with SMTP id p8-20020a170902e74800b0019a723a81b2mr1266675plf.5.1676010630225; Thu, 09 Feb 2023 22:30:30 -0800 (PST) Received: from HTW5T2C6VL.bytedance.net ([139.177.225.231]) by smtp.gmail.com with ESMTPSA id jl14-20020a170903134e00b0019117164732sm2554333plb.213.2023.02.09.22.30.27 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 09 Feb 2023 22:30:29 -0800 (PST) From: Fengnan Chang To: anatoly.burakov@intel.com Cc: rsanford@akamai.com, dev@dpdk.org, Fengnan Chang Subject: [PATCH] malloc: fix malloc performance may becomes worse as the number of malloc increases Date: Fri, 10 Feb 2023 14:30:22 +0800 Message-Id: <20230210063022.52171-1-changfengnan@bytedance.com> X-Mailer: git-send-email 2.37.0 (Apple Git-136) MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Here is a simple test case: " uint64_t entry_time, time; size_t size = 4096; unsigned align = 4096; for (int j = 0; j < 10; j++) { entry_time = rte_get_timer_cycles(); for (int i = 0; i < 2000; i++) { rte_malloc(NULL, size, align); } time = (rte_get_timer_cycles()-entry_time) * 1000000 / rte_get_timer_hz(); printf("total open time %lu avg time %lu\n", time, time/2000); } " Single rte_malloc cost time may becomes wrose as the number of malloc increases, In my env, first round avg time is 15us, second is 44us, third is 77us, fourth is 168us... The reason is,in the malloc process, malloc_elem_alloc may split new_elem if there have too much free space after new_elem, and insert the trailer into freelist. When alloc 4k with align 4k, the trailer very likely insert to free_head[2] again, it makes free_head[2] longer. when alloc 4k again, it will search free_head[2] from begin, with the number of malloc increases, search free_head[2] need more time, so the performance will become worse. Same problem will also occurs in alloc 64k with align 64k, but if alloc 4k with align 64, doesn't have this problem. Fix this by adjust free_head list size range, make free_head[3] hold elements which bigger or equal 4k, free_head[4] hold elements which bigger or equal 16k. In terms of probabilities, when alloc 4k or 16k, the probability of finding a suitable elem from a larger size list is greater than from a smaller size list. Signed-off-by: Fengnan Chang --- lib/eal/common/malloc_elem.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/lib/eal/common/malloc_elem.c b/lib/eal/common/malloc_elem.c index 83f05497cc..35a2313d04 100644 --- a/lib/eal/common/malloc_elem.c +++ b/lib/eal/common/malloc_elem.c @@ -367,11 +367,11 @@ prev_elem_is_adjacent(struct malloc_elem *elem) * containing larger elements. * * Example element size ranges for a heap with five free lists: - * heap->free_head[0] - (0 , 2^8] - * heap->free_head[1] - (2^8 , 2^10] - * heap->free_head[2] - (2^10 ,2^12] - * heap->free_head[3] - (2^12, 2^14] - * heap->free_head[4] - (2^14, MAX_SIZE] + * heap->free_head[0] - (0 , 2^8) + * heap->free_head[1] - [2^8 , 2^10) + * heap->free_head[2] - [2^10 ,2^12) + * heap->free_head[3] - [2^12, 2^14) + * heap->free_head[4] - [2^14, MAX_SIZE] */ size_t malloc_elem_free_list_index(size_t size) @@ -385,8 +385,8 @@ malloc_elem_free_list_index(size_t size) if (size <= (1UL << MALLOC_MINSIZE_LOG2)) return 0; - /* Find next power of 2 >= size. */ - log2 = sizeof(size) * 8 - __builtin_clzl(size - 1); + /* Find next power of 2 > size. */ + log2 = sizeof(size) * 8 - __builtin_clzl(size); /* Compute freelist index, based on log2(size). */ index = (log2 - MALLOC_MINSIZE_LOG2 + MALLOC_LOG2_INCREMENT - 1) / -- 2.37.0 (Apple Git-136)