for (size_t i = 0; i < kMaxQueuesPerPort; i++) {
  const std::string pname = get_mempool_name(phy_port, i);
  rte_mempool *mempool =
      rte_pktmbuf_pool_create(pname.c_str(), kNumMbufs, 0 /* cache */,
                              0 /* priv size */, kMbufSize, numa_node);
2022-01-29 18:46 (UTC-0500), fwefew 4t4tg:
[...]
> 1. Does cache_size include or exclude data_room_size?
> 2. Does cache_size include or exclude sizeof(struct rte_mbuf)?
> 3. Does cache size include or exclude RTE_PKTMBUF_HEADROOM?
Cache size is measured in the number of elements, irrespective of their size.
It is not a memory size, so the questions above are not really meaningful.
> 4. What lcore is the allocated memory pinned to?
Memory is associated with a NUMA node (DPDK calls it a "socket"), not an lcore.
Each lcore belongs to one NUMA node; see rte_lcore_to_socket_id().
> The lcore of the caller
> when this method is run? The answer here is important. If it's the lcore of
> the caller when called, this routine should be called in the lcore's entry
> point so it's on the right lcore the memory is intended. Calling it on the
> lcore that happens to be running main, for example, could have a bad side
> effect if it's different from where the memory will be ultimately used.
The NUMA node is controlled by the "socket_id" parameter.
Your reasoning is correct: you should often create separate mempools
for each NUMA node to avoid this performance issue. (You should also
consider which NUMA node each device belongs to.)
> 5. Which one of the formal arguments represents tail room indicated in
> https://doc.dpdk.org/guides/prog_guide/mbuf_lib.html#figure-mbuf1
[...]
> 5. Unknown. Perhaps if you want private data which corresponds to tail room
> in the diagram above one has to call rte_mempool_create() instead and focus
> on private_data_size.
Incorrect; tail room is simply the unused part at the end of the data room.
The private data of rte_mempool_create() belongs to the entire mempool,
not to individual mbufs.
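
As a rough illustration of where tail room comes from, here is a plain C++
sketch that models a single-segment mbuf with three fields named after those
in struct rte_mbuf (buf_len, data_off, data_len). It does not include any DPDK
header; MbufModel and the two helpers are hypothetical names, and the
arithmetic is meant to mirror what rte_pktmbuf_headroom() and
rte_pktmbuf_tailroom() are documented to return.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the three struct rte_mbuf fields that matter here.
struct MbufModel {
  uint16_t buf_len;   // size of the data buffer (headroom + data + tail room)
  uint16_t data_off;  // offset of the first data byte, i.e. the headroom
  uint16_t data_len;  // bytes of packet data in this segment
};

// Headroom: the unused bytes before the packet data.
inline uint16_t model_headroom(const MbufModel &m) { return m.data_off; }

// Tail room: whatever is left in the buffer after the headroom and the data.
inline uint16_t model_tailroom(const MbufModel &m) {
  assert(m.data_off + m.data_len <= m.buf_len);
  return static_cast<uint16_t>(m.buf_len - m.data_off - m.data_len);
}
```

For example, with buf_len = 2176, data_off = 128 and data_len = 64,
model_tailroom() gives 1984. No pool creation argument is involved: the tail
room shrinks as data is appended and grows as it is trimmed.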
> Mempool creation is like malloc: you request the total number of absolute
> bytes required. The API will not add or remove bytes to the number you
> specify. Therefore the number you give must be inclusive of all needs
> including your payload, any DPDK overhead, headroom, tailroom, and so on.
> DPDK is not adding to the number you give for its own purposes. Clearer?
> Perhaps ... but what needs? Read on ...
On the contrary: rte_pktmbuf_pool_create() takes the data room size
(usable memory plus the headroom) and adds space for struct rte_mbuf
and the per-mbuf private area.
Furthermore, the underlying mempool code ensures element (mbuf)
alignment, may spread the elements between pages, etc.
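
To make the sizing concrete, here is a minimal plain C++ sketch of the
arithmetic. It assumes, as the API reference for rte_pktmbuf_pool_create()
describes data_room_size, that the data room already includes
RTE_PKTMBUF_HEADROOM. The two constants are assumed typical values, not read
from DPDK headers, and the function names are hypothetical. The results are
lower bounds only, because the mempool layer adds its own per-element header
and padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed values, NOT read from DPDK headers: both are typical on builds
// with 64-byte cache lines and may differ on your build.
constexpr std::size_t kAssumedMbufStructSize = 128;  // sizeof(struct rte_mbuf)
constexpr uint16_t kAssumedHeadroom = 128;           // RTE_PKTMBUF_HEADROOM

// Lower bound on the size of one mempool element (one mbuf).
constexpr std::size_t min_element_size(uint16_t priv_size,
                                       uint16_t data_room_size) {
  return kAssumedMbufStructSize + priv_size + data_room_size;
}

// Bytes available for packet data in a freshly allocated mbuf.
inline uint16_t payload_capacity(uint16_t data_room_size) {
  assert(data_room_size >= kAssumedHeadroom);
  return static_cast<uint16_t>(data_room_size - kAssumedHeadroom);
}

// Lower bound on the memory needed for n mbufs.
constexpr std::size_t min_pool_bytes(unsigned n, uint16_t priv_size,
                                     uint16_t data_room_size) {
  return static_cast<std::size_t>(n) *
         min_element_size(priv_size, data_room_size);
}
```

With data_room_size = 2176 and priv_size = 0, each element takes at least
2304 bytes and offers 2048 bytes for packet data; 8192 such mbufs need at
least 18874368 bytes (18 MiB) under these assumptions.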
[...]
> No. I might not. I might have half my TXQ and RXQs dealing with tiny
> mbufs/packets, and the other half dealing with completely different traffic
> of a completely different size and structure. So I might want memory pool
> allocation to be done on a smaller scale e.g. per RXQ/TXQ/lcore. DPDK
> doesn't seem to permit this.
You can create a different mempool for each purpose
and pass the appropriate mempool to rte_eth_rx_queue_setup().
When creating them, you can and should also take NUMA into account.
Take a look at the init_mem() function in examples/l3fwd.
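
As a rough sketch of the bookkeeping this implies (not the l3fwd code), the
plain C++ below groups RX queues by NUMA node and buffer size, so that each
group gets one pool. QueueNeed, PoolPlan, plan_pools() and the pool naming
scheme are all hypothetical. In a real application each plan entry would be
created once with rte_pktmbuf_pool_create(), using its socket_id and
data_room_size, and then passed as the mb_pool argument of
rte_eth_rx_queue_setup() for every queue listed in it.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of what one RX queue needs.
struct QueueNeed {
  uint16_t queue_id;
  int socket_id;            // NUMA node of the lcore polling this queue
  uint16_t data_room_size;  // buffer size this traffic class needs
};

// Hypothetical plan entry: one pool per (socket, data room size) pair.
struct PoolPlan {
  std::string name;              // mempool names must be unique
  int socket_id = 0;
  uint16_t data_room_size = 0;
  std::vector<uint16_t> queues;  // RX queues that will share this pool
};

std::vector<PoolPlan> plan_pools(const std::vector<QueueNeed> &needs) {
  std::map<std::pair<int, uint16_t>, PoolPlan> by_key;
  for (const QueueNeed &q : needs) {
    PoolPlan &p = by_key[std::make_pair(q.socket_id, q.data_room_size)];
    if (p.name.empty()) {  // first queue seen for this key: fill in the plan
      p.name = "pool_s" + std::to_string(q.socket_id) + "_d" +
               std::to_string(q.data_room_size);
      p.socket_id = q.socket_id;
      p.data_room_size = q.data_room_size;
    }
    p.queues.push_back(q.queue_id);
  }
  std::vector<PoolPlan> plans;
  for (auto &kv : by_key) plans.push_back(std::move(kv.second));
  assert(plans.size() <= needs.size());  // never more pools than queues
  return plans;
}
```

Four queues split across two NUMA nodes, with one of them carrying jumbo
frames, would yield three pools here instead of one pool per queue or a
single pool for everything.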