I have completed code to initialize an AWS ENA adapter with RX and TX queues. With this in place, DPDK creates one thread pinned to the right core as per the --lcores argument. So far so good. The DPDK documentation and example code are fairly clear here.

What's not as clear is how RX packets are handled. As far as I can tell, the canonical way to deal with RX packets is to call 'rte_eth_add_rx_callback' for each RXQ. This allows one to process each received packet (for a given RXQ) via a provided callback in the same lcore/hardware thread that DPDK created for me. As such, there is no need to create additional threads. Correct?

Furthermore, I hope the mbufs the callback gets correspond to the mbufs associated with the RX descriptors provided to the RXQs, so that there is no need to copy packets between the NIC receiving them and the callback acting on them. As far as I can tell, this hope is ill-founded. A lot of DPDK code I've seen allocates more mbufs per RXQ than the number of RX descriptors. To me this seems to imply that DPDK's RXQ threads copy the packets received off the wire into separate mbufs for delivery to app code.

TX is less clear to me. For TX there seems to be no way to transmit packets (burst or otherwise) without creating another thread, that is, a thread beyond the one DPDK makes for me. This other thread must, at the appropriate time, prepare mbufs and call 'rte_eth_tx_burst' on the correct TXQ. DPDK seems to want to keep its thread for its own work. Yes, DPDK provides 'rte_eth_add_tx_callback', but that callback only runs after the mbufs have been created and submitted for transmission, so it is too late to be the place where packets are created. Putting this together, DPDK requires me to create new threads for TX, unlike RX. Correct?

While creating additional threads for TX is not the end of the world, I do not want the DPDK TX thread to copy mbufs; I want zero-copy.
Here, then, I gather DPDK's TXQ thread takes the mbufs that the helper TX thread provides in the 'rte_eth_tx_burst' call and attaches them to the TXQ's descriptors, so they go out on the wire without copying. Is this correct?

It's also worth pointing out that 'rte_eth_tx_queue_setup', unlike its RX equivalent, does not accept a mempool. So, in addition to the above points, the additional TX helper threads (those which call 'rte_eth_tx_burst') will need to arrange for their own mempool. That's not hard to do, but I just want confirmation. Thanks.
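
To make the zero-copy expectation concrete, here is a toy model in plain C of the behavior I am hoping for on the TX side. It is not DPDK code: every name in it ('toy_buf', 'toy_pool', 'toy_txq', 'toy_tx_burst', 'toy_tx_complete') is hypothetical, and the pool and ring sizes are arbitrary assumptions. It only illustrates the idea that the sender allocates from its own pool, the queue's descriptors hold pointers to those same buffers rather than copies, and a burst accepts no more buffers than there are free descriptors.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8   /* buffers owned by the sender's pool (arbitrary) */
#define RING_SIZE 4   /* TX descriptors in the toy queue (arbitrary) */
#define BUF_BYTES 64

/* Hypothetical stand-in for an mbuf. */
struct toy_buf {
    size_t len;
    unsigned char data[BUF_BYTES];
};

/* Hypothetical stand-in for a mempool owned by the sending thread. */
struct toy_pool {
    struct toy_buf storage[POOL_SIZE];
    struct toy_buf *free_list[POOL_SIZE];
    size_t n_free;
};

/* Hypothetical TX queue: descriptors hold pointers, never payload copies. */
struct toy_txq {
    struct toy_buf *desc[RING_SIZE];
    size_t n_used;
};

static void toy_pool_init(struct toy_pool *p)
{
    for (size_t i = 0; i < POOL_SIZE; i++)
        p->free_list[i] = &p->storage[i];
    p->n_free = POOL_SIZE;
}

/* Take one buffer from the pool, or NULL if the pool is empty. */
static struct toy_buf *toy_alloc(struct toy_pool *p)
{
    return p->n_free ? p->free_list[--p->n_free] : NULL;
}

/* Return one buffer to the pool. */
static void toy_free(struct toy_pool *p, struct toy_buf *b)
{
    p->free_list[p->n_free++] = b;
}

static void toy_txq_init(struct toy_txq *q)
{
    for (size_t i = 0; i < RING_SIZE; i++)
        q->desc[i] = NULL;
    q->n_used = 0;
}

/* Hand up to n buffers to the queue by pointer; returns how many were
 * accepted. Buffers that were not accepted stay owned by the caller. */
static size_t toy_tx_burst(struct toy_txq *q, struct toy_buf **pkts, size_t n)
{
    size_t sent = 0;
    while (sent < n && q->n_used < RING_SIZE)
        q->desc[q->n_used++] = pkts[sent++];
    return sent;
}

/* Simulate the NIC finishing transmission: every queued buffer goes back
 * to the pool it came from. Returns the number of buffers released. */
static size_t toy_tx_complete(struct toy_txq *q, struct toy_pool *p)
{
    size_t done = q->n_used;
    for (size_t i = 0; i < done; i++) {
        toy_free(p, q->desc[i]);
        q->desc[i] = NULL;
    }
    q->n_used = 0;
    return done;
}
```

In this model, if the sender allocates six buffers and bursts them into the four-slot ring, only four are accepted, and the descriptors point at the very buffers the sender filled in. The sender would have to retry or free the remaining two itself. Whether 'rte_eth_tx_burst' and the ENA driver behave like this is exactly what I would like confirmed.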