DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH] malloc: fix malloc performance may becomes worse as the number of malloc increases
@ 2023-02-10  6:30 Fengnan Chang
  2023-02-15 10:10 ` Thomas Monjalon
  2023-02-20 10:59 ` David Marchand
  0 siblings, 2 replies; 9+ messages in thread
From: Fengnan Chang @ 2023-02-10  6:30 UTC (permalink / raw)
  To: anatoly.burakov; +Cc: rsanford, dev, Fengnan Chang

Here is a simple test case:
"
uint64_t entry_time, time;
size_t size = 4096;
unsigned align = 4096;
for (int j = 0; j < 10; j++) {
	entry_time = rte_get_timer_cycles();
	for (int i = 0; i < 2000; i++) {
		rte_malloc(NULL, size, align);
	}
	time = (rte_get_timer_cycles()-entry_time) * 1000000 /
		rte_get_timer_hz();
	printf("total open time %lu avg time %lu\n", time, time/2000);
}
"

Single rte_malloc cost time may becomes wrose as the number of malloc
increases, In my env, first round avg time is 15us, second is 44us,
third is 77us, fourth is 168us...

The reason is,in the malloc process, malloc_elem_alloc may split new_elem
if there have too much free space after new_elem, and insert the trailer
into freelist. When alloc 4k with align 4k, the trailer very likely insert
to free_head[2] again, it makes free_head[2] longer. when alloc 4k again,
it will search free_head[2] from begin, with the number of malloc increases,
search free_head[2] need more time, so the performance will become worse.
Same problem will also occurs in alloc 64k with align 64k, but if alloc
4k with align 64, doesn't have this problem.

Fix this by adjust free_head list size range, make free_head[3] hold
elements which bigger or equal 4k, free_head[4] hold elements which bigger
or equal 16k.
In terms of probabilities, when alloc 4k or 16k, the probability of finding
a suitable elem from a larger size list is greater than from a smaller
size list.

Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
---
 lib/eal/common/malloc_elem.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/eal/common/malloc_elem.c b/lib/eal/common/malloc_elem.c
index 83f05497cc..35a2313d04 100644
--- a/lib/eal/common/malloc_elem.c
+++ b/lib/eal/common/malloc_elem.c
@@ -367,11 +367,11 @@ prev_elem_is_adjacent(struct malloc_elem *elem)
  * containing larger elements.
  *
  * Example element size ranges for a heap with five free lists:
- *   heap->free_head[0] - (0   , 2^8]
- *   heap->free_head[1] - (2^8 , 2^10]
- *   heap->free_head[2] - (2^10 ,2^12]
- *   heap->free_head[3] - (2^12, 2^14]
- *   heap->free_head[4] - (2^14, MAX_SIZE]
+ *   heap->free_head[0] - (0   , 2^8)
+ *   heap->free_head[1] - [2^8 , 2^10)
+ *   heap->free_head[2] - [2^10 ,2^12)
+ *   heap->free_head[3] - [2^12, 2^14)
+ *   heap->free_head[4] - [2^14, MAX_SIZE]
  */
 size_t
 malloc_elem_free_list_index(size_t size)
@@ -385,8 +385,8 @@ malloc_elem_free_list_index(size_t size)
 	if (size <= (1UL << MALLOC_MINSIZE_LOG2))
 		return 0;
 
-	/* Find next power of 2 >= size. */
-	log2 = sizeof(size) * 8 - __builtin_clzl(size - 1);
+	/* Find next power of 2 > size. */
+	log2 = sizeof(size) * 8 - __builtin_clzl(size);
 
 	/* Compute freelist index, based on log2(size). */
 	index = (log2 - MALLOC_MINSIZE_LOG2 + MALLOC_LOG2_INCREMENT - 1) /
-- 
2.37.0 (Apple Git-136)


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-02-20 10:59 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-10  6:30 [PATCH] malloc: fix malloc performance may becomes worse as the number of malloc increases Fengnan Chang
2023-02-15 10:10 ` Thomas Monjalon
2023-02-15 11:10   ` Morten Brørup
2023-02-15 17:16     ` Stephen Hemminger
2023-02-16  2:54       ` [External] " Fengnan Chang
2023-02-16 14:02     ` Liang Ma
2023-02-17  2:14       ` [External] " Fengnan Chang
2023-02-16 10:40   ` Liang Ma
2023-02-20 10:59 ` David Marchand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).