We have a DPDK application that only calls rte_eth_rx_burst() (we
do not transmit packets) and it must process the payload very quickly. The payload of a single network packet
MUST be in contiguous memory.
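For context, here is roughly what our hot loop looks like (a minimal sketch; port_id, queue_id, the burst size and process_payload() are placeholders, not our real code):

```c
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Hypothetical parser entry point: it needs the entire payload behind a
 * single pointer, which is why contiguity matters to us. */
extern void process_payload(const uint8_t *data, uint16_t len);

static void
rx_loop_once(uint16_t port_id, uint16_t queue_id)
{
    struct rte_mbuf *bufs[BURST_SIZE];
    uint16_t nb = rte_eth_rx_burst(port_id, queue_id, bufs, BURST_SIZE);

    for (uint16_t i = 0; i < nb; i++) {
        struct rte_mbuf *m = bufs[i];
        if (rte_pktmbuf_is_contiguous(m))   /* i.e. m->nb_segs == 1 */
            process_payload(rte_pktmbuf_mtod(m, const uint8_t *),
                            rte_pktmbuf_data_len(m));
        /* else: the segmented case this question is about */
        rte_pktmbuf_free(m);
    }
}
```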
The DPDK API is optimized around memory pools of fixed-size mbufs. If a packet received on the DPDK port is larger than the mbuf data room but smaller than the max MTU, it is segmented into a chain of mbufs, as shown in the figure in the mbuf documentation.
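To make the fixed-size aspect concrete, pool creation looks roughly like this (a sketch; the pool name and the counts are illustrative). Every mbuf in the pool carries the same data room, no matter how small the received packet is:

```c
#include <rte_lcore.h>
#include <rte_mbuf.h>

/* One pool, one fixed data room for every packet: sized for jumbo
 * frames, so a 100-byte packet still occupies a whole ~9K buffer. */
static struct rte_mempool *
create_rx_pool(void)
{
    return rte_pktmbuf_pool_create(
        "rx_pool",                   /* name (illustrative) */
        8192,                        /* number of mbufs (illustrative) */
        256,                         /* per-lcore cache size */
        0,                           /* application private area size */
        9216 + RTE_PKTMBUF_HEADROOM, /* fixed data room per mbuf */
        rte_socket_id());
}
```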
This leads us to the following problems:
· If we configure the memory pool to store large packets (for example, max MTU size), then we will always store the payload in contiguous memory, but we will waste huge amounts of memory whenever the traffic consists of small packets. Imagine that our mbuf size is 9216 bytes but we are receiving mostly packets of 100-300 bytes: we are wasting memory by a factor of up to ~90!
· If we reduce the mbuf size to, say, 512 bytes, then we need special handling of those segments to get the payload into contiguous memory (see the gather sketch after this list). Special handling and copying of segments hurt our performance, so they should be kept to a minimum.
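For reference, the special handling meant in the second point amounts to a gather step like the following (a sketch; gather_payload() and the scratch buffer are our naming for illustration):

```c
#include <stdint.h>
#include <rte_mbuf.h>
#include <rte_memcpy.h>

/* Copy a segmented packet's payload into one contiguous buffer by
 * walking the mbuf chain; this per-packet copy is the cost we want
 * to avoid. */
static inline int
gather_payload(const struct rte_mbuf *m, uint8_t *scratch, uint32_t cap)
{
    if (rte_pktmbuf_pkt_len(m) > cap)
        return -1;

    uint32_t off = 0;
    for (const struct rte_mbuf *seg = m; seg != NULL; seg = seg->next) {
        rte_memcpy(scratch + off,
                   rte_pktmbuf_mtod(seg, const void *),
                   rte_pktmbuf_data_len(seg));
        off += rte_pktmbuf_data_len(seg);
    }
    return 0;
}
```

(DPDK's rte_pktmbuf_read() performs essentially this walk internally, copying into a caller buffer when the requested range spans segments.)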
Considering the above, my questions are as follows:
1. What strategy is recommended for a DPDK application that needs to process the payload of network packets in contiguous memory, handling both small (100-300 bytes) and large (9216 bytes) packets, without wasting huge amounts of memory on 9K-sized mbuf pools? Is copying segmented jumbo frames into a larger max-MTU mbuf the only option?
2. Some frameworks and drivers allow decoupling the RX descriptors (mbufs) from the payload, such that payloads are stored contiguously. This does not seem to be possible in the DPDK framework for segmented packets, since each mbuf is stored right next to its payload in memory (see the layout sketch below). Is that understanding correct?
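To illustrate the coupling I mean in question 2, here is a sketch that assumes the default rte_pktmbuf_pool_create() layout, where each mempool object holds the mbuf header, the private area, the headroom, and the data room back to back:

```c
#include <stdio.h>
#include <rte_mbuf.h>

/* Show that buf_addr points just past the mbuf header (plus the private
 * area): descriptor and payload live in the same mempool object. */
static void
show_layout(struct rte_mempool *pool)
{
    struct rte_mbuf *m = rte_pktmbuf_alloc(pool);
    if (m == NULL)
        return;

    printf("mbuf at %p, buf_addr at %p (offset %td bytes)\n",
           (void *)m, m->buf_addr,
           (char *)m->buf_addr - (char *)m);
    rte_pktmbuf_free(m);
}
```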