DPDK usage discussions
From: fwefew 4t4tg <7532yahoo@gmail.com>
To: users@dpdk.org
Subject: Cache misses on allocating mbufs
Date: Sat, 22 Jun 2024 19:22:21 -0400
Message-ID: <CA+Tq66XZr+7AXsgnCfzYkFxKC9Mbrbj19Bu5XB_EjD9_=i_ZzQ@mail.gmail.com>

I happened to be looking into mbuf allocation, and am a little
underwhelmed by the default DPDK performance. There are a lot of
last-level cache misses as measured by Intel's PMU.

I made a single-producer/single-consumer mempool and benchmarked
rte_pktmbuf_alloc and rte_pktmbuf_free using 1 GB huge pages. This ran
on one pinned core doing nothing else. A representative loop:

  /* grab MAX mbufs from the pool, keeping the pointers to free later */
  for (uint64_t i = 0; i < MAX; ++i) {
    data[i] = rte_pktmbuf_alloc(pool);
  }
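
(For reference, cycle counts like the ones below can be taken by
wrapping that loop with rte_rdtsc(); a minimal sketch, with the
function name and parameters as placeholders:)

  #include <rte_cycles.h>
  #include <rte_mbuf.h>

  /* Average TSC cycles spent per rte_pktmbuf_alloc() over max mbufs. */
  static double cycles_per_alloc(struct rte_mempool *pool,
                                 struct rte_mbuf **data, uint64_t max)
  {
    uint64_t start = rte_rdtsc();
    for (uint64_t i = 0; i < max; ++i)
      data[i] = rte_pktmbuf_alloc(pool);
    return (double)(rte_rdtsc() - start) / max;
  }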

The salient results are:

- allocating 100,000 mbufs in a tight loop
  * 246.836380 cycles (72.598935 ns) per alloc
  * 797231 LLC misses, or 90.8% of all LLC references

That's darn near 8 LLC misses per allocation.

Now, we may assume some level of cache coldness the first time
through. If one frees everything and then reallocates 100,000 mbufs
from the same pool, the numbers are slightly better:

- reallocating the same 100,000 mbufs in a tight loop:
  * 221.091480 cycles (65.026906 ns) per alloc
  * 521377 LLC misses, or 62.8% of all LLC references
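
The free-then-reallocate pass is just the inverse of the loop above
followed by the same alloc loop, roughly:

  /* return every mbuf to the pool, then allocate them all again */
  for (uint64_t i = 0; i < MAX; ++i)
    rte_pktmbuf_free(data[i]);
  for (uint64_t i = 0; i < MAX; ++i)
    data[i] = rte_pktmbuf_alloc(pool);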

It's not so much the allocation of memory; it's the DPDK
initialization of it. Contrast with raw allocation, which skips some
of the init work rte_pktmbuf_alloc() does; it's an order of magnitude
better:

- allocating 100,000 mbufs with rte_mbuf_raw_alloc
  * 62.446280 cycles (18.366553 ns) per alloc
  * 8244 LLC misses, or 14.9% of all LLC references
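
The raw variant is the same loop with rte_mbuf_raw_alloc() swapped in;
a sketch:

  /* rte_mbuf_raw_alloc() only dequeues from the pool; unlike
   * rte_pktmbuf_alloc() it does not reset the mbuf headers, so that
   * work (e.g. rte_pktmbuf_reset()) still has to happen before the
   * mbuf is handed to TX. */
  for (uint64_t i = 0; i < MAX; ++i)
    data[i] = rte_mbuf_raw_alloc(pool);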

The bigger problem I have is that there seems to be no way around
this. Even if I allocate my own memory with mmap and add it to the
DPDK heap, I still have to build a mempool on top of it and go through
the same rte_pktmbuf_alloc calls.
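
Roughly, that external-heap route looks like the sketch below; the
heap name, sizes, and mmap flags are placeholders, the per-page IOVA
table is omitted, and error checks are dropped, but the end result is
still an ordinary pktmbuf pool serviced by rte_pktmbuf_alloc:

  #include <sys/mman.h>
  #include <rte_malloc.h>
  #include <rte_mbuf.h>

  static struct rte_mempool *make_external_pool(void)
  {
    size_t len = 1UL << 30;   /* assume one 1 GB huge page backs this */
    void *va = mmap(NULL, len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    rte_malloc_heap_create("my_heap");
    /* For real NIC DMA the per-page IOVAs must be supplied, not NULL. */
    rte_malloc_heap_memory_add("my_heap", va, len, NULL, 1, len);

    return rte_pktmbuf_pool_create_by_ops(
        "ext_pool", 102399, 512, 8, 1700,
        rte_malloc_heap_get_socket("my_heap"), "ring_sp_sc");
  }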

Ideally, I want one mempool per TX queue, each owned by one thread
pinned to one core, with NO/NONE/ZERO contention with anything else.

Ideas?

Env:
CPU: Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz
Ubuntu 24.04 stock
MLX5 driver
DPDK 24.07-rc1
OFED MLNX_OFED_LINUX-24.04-0.6.6.0-ubuntu24.04-x86_64.tgz

Mempool created like this:

    struct rte_mempool *mempool = rte_pktmbuf_pool_create_by_ops(
      name,
      102399,         /* number of mbufs */
      512,            /* per-lcore cache size */
      8,              /* private data size */
      1700,           /* data room size */
      0,              /* socket id */
      "ring_sp_sc");  /* single-producer/single-consumer ring ops */
