Hi folks!

I'm using a Mellanox ConnectX-6 Dx EN adapter card (100GbE; dual-port QSFP56; PCIe 4.0/3.0 x16) with DPDK 21.11 on a server with an AMD EPYC 7702 64-core processor (a 2-socket NUMA system). SMT (hyperthreading) is turned off.
I'm testing the maximum receive throughput I can get from a single port using the testpmd utility shipped with DPDK. My generator produces random UDP packets with zero-length payload.
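
Everything is pinned to the NIC's NUMA node. I'm assuming here that the adapter sits on node 1 (which matches the numactl binding in the command below); that can be checked with:

cat /sys/bus/pci/devices/0000:c1:00.0/numa_node   # NUMA node the NIC is attached to
numactl --hardware                                # CPU-to-node map (cores 64-127 are node 1 on this box)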

I get the maximum performance with 8-12 lcores (120-125 Mpps overall on the receive path of a single port):

numactl -N 1 -m 1 /opt/dpdk-21.11/build/app/dpdk-testpmd -l 64-127 -n 4  -a 0000:c1:00.0 -- --stats-period 1 --nb-cores=12 --rxq=12 --txq=12 --rxd=512

With more than 12 lcores, overall receive performance degrades: with 16-32 lcores I get 100-110 Mpps, there is a sharp drop at 33 lcores (84 Mpps), and with 63 lcores receive performance falls all the way to 35 Mpps.
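
For reference, the numbers above come from a sweep along these lines (a minimal sketch; the timeout duration and the exact queue counts are placeholders, and the aggregate Rx-pps is read from the stats testpmd prints):

for n in 8 12 16 32 33 63; do
    # timeout ends each run so the loop advances; testpmd prints final stats on exit
    timeout 30 numactl -N 1 -m 1 /opt/dpdk-21.11/build/app/dpdk-testpmd \
        -l 64-127 -n 4 -a 0000:c1:00.0 -- \
        --stats-period 1 --nb-cores=$n --rxq=$n --txq=$n --rxd=512
done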

Are there any limitations on the total number of receive queues (and thus lcores) that can serve a single port on this NIC?

Thanks,
Dmitriy Stepanov