From: Pavel Vazharov
Date: Sat, 24 Jul 2021 17:54:29 +0300
To: users@dpdk.org
Subject: [dpdk-users] rte_malloc behavior

Hi,

A short intro to the original cause of my problem. We have an application
where we run the FreeBSD networking stack on top of DPDK. It's based on
F-stack, but with lots of modifications at this point. There are situations
where we send lots of TCP data in a short period of time, mostly as 16 KB
blocks. The FreeBSD networking stack internally splits these blocks into
so-called jumbo clusters (4 KB each) before putting them into the TCP socket
buffers. All allocations needed by the FreeBSD stack are redirected to
rte_malloc. During such TCP sends I observed peak delays in the sendmsg()
call and tracked these delays down to the rte_malloc calls.
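For context, the redirection itself is just a thin shim around
rte_malloc/rte_free. A minimal sketch of the idea (the wrapper names here
are illustrative, not our actual glue code):

#include <cstddef>
#include <rte_malloc.h>

// Hypothetical wrappers that the stack's allocator hooks point at.
static void* stack_malloc(std::size_t size, unsigned align) noexcept
{
    // rte_malloc(type, size, align): the type tag may be NULL and
    // align == 0 means the default (cache-line) alignment.
    return rte_malloc(nullptr, size, align);
}

static void stack_free(void* ptr) noexcept
{
    rte_free(ptr); // rte_free(NULL) is a no-op, like free()
}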
To isolate the issue further, I did some tests with the following piece of
C++ code:

#include <chrono>
#include <vector>
#include <fmt/core.h>
#include <rte_malloc.h>

[[gnu::noinline]] static void
test_allocations(int allocations, int size, int align) noexcept
{
    using namespace std::chrono;
    std::vector<void*> mem(allocations);
    const auto beg = high_resolution_clock::now();
    for (int i = 0; i < allocations; ++i) {
        mem[i] = rte_malloc(nullptr, size, align);
        X3ME_ENFORCE(mem[i]); // internal assert macro
    }
    const auto end = high_resolution_clock::now();
    fmt::print(
        "Allocations:{} Size:{} Align:{} Time_msecs:{} Avg_time_usecs:{}\n",
        allocations, size, align,
        duration_cast<milliseconds>(end - beg).count(),
        duration_cast<microseconds>(end - beg).count() / allocations);
    for (void* m : mem)
        rte_free(m);
}

The results show big delays in rte_malloc when we ask for 1 KB or 4 KB
blocks. The delays are not present when the size is not an exact power of
two (the 1040- and 4112-byte cases below):

Allocations:4096 Size:4096 Align:4096 Time_msecs:330 Avg_time_usecs:80
Allocations:16384 Size:4096 Align:4096 Time_msecs:8724 Avg_time_usecs:532
Allocations:32768 Size:4096 Align:4096 Time_msecs:38291 Avg_time_usecs:1168
Allocations:4096 Size:4112 Align:4096 Time_msecs:12 Avg_time_usecs:3
Allocations:16384 Size:4112 Align:4096 Time_msecs:45 Avg_time_usecs:2
Allocations:32768 Size:4112 Align:4096 Time_msecs:83 Avg_time_usecs:2
Allocations:4096 Size:1024 Align:1024 Time_msecs:244 Avg_time_usecs:59
Allocations:16384 Size:1024 Align:1024 Time_msecs:4428 Avg_time_usecs:270
Allocations:32768 Size:1024 Align:1024 Time_msecs:26901 Avg_time_usecs:820
Allocations:4096 Size:1040 Align:1024 Time_msecs:4 Avg_time_usecs:1
Allocations:16384 Size:1040 Align:1024 Time_msecs:16 Avg_time_usecs:1
Allocations:32768 Size:1040 Align:1024 Time_msecs:30 Avg_time_usecs:0

And just for reference, the speed of the same allocations using the standard
aligned_alloc/free API instead of rte_malloc/rte_free:

Allocations:32768 Size:1024 Align:1024 Time_msecs:66 Avg_time_usecs:2
Allocations:32768 Size:4096 Align:4096 Time_msecs:118 Avg_time_usecs:3

As far as I know, some allocators are inefficient for particular allocation
sizes, but I didn't expect such a big difference. Am I missing something in
the documentation that explains this behavior? Should I report it to the dev
mailing list?

Regards,
Pavel.
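P.S. For completeness, the aligned_alloc/free reference numbers come from
the same measurement loop with only the allocator swapped. Roughly (a
sketch, not the exact harness):

#include <cstdlib>

[[gnu::noinline]] static void
test_allocations_libc(int allocations, std::size_t size, std::size_t align) noexcept
{
    using namespace std::chrono;
    std::vector<void*> mem(allocations);
    const auto beg = high_resolution_clock::now();
    for (int i = 0; i < allocations; ++i) {
        // aligned_alloc requires size to be a multiple of align, which
        // holds for the 1024/1024 and 4096/4096 cases measured above.
        mem[i] = std::aligned_alloc(align, size);
    }
    const auto end = high_resolution_clock::now();
    fmt::print(
        "Allocations:{} Size:{} Align:{} Time_msecs:{} Avg_time_usecs:{}\n",
        allocations, size, align,
        duration_cast<milliseconds>(end - beg).count(),
        duration_cast<microseconds>(end - beg).count() / allocations);
    for (void* m : mem)
        std::free(m);
}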