Hi,
I am testing the performance of the Mellanox ConnectX-5 (CX5) 100GbE NIC. I found that in the single-core scenario, the performance (Mpps) of an AMD (x86) server is better than that of a Kunpeng 920 (ARM) server. The test commands are as follows:
RX-side:
dpdk-testpmd -l 1-23 -n 4 -a XXXX -- -i --rxq=1 --txq=1 --txd=1024 --rxd=1024 --nb-cores=1 --eth-peer=0,xxxx --burst=128 --forward-mode=rxonly -a --txpkts=128 --mbcache=512 --rss-udp
TX-side (payload size is 128 bytes):
dpdk-testpmd -a XXXXX -l 1-23 -n 4 -- -i --rxq=4 --txq=4 --txd=1024 --rxd=1024 --nb-cores=4 --eth-peer=0,XXXXX --burst=64 --forward-mode=txonly -a --txpkts=128 --mbcache=512 --rss-udp
firmware-version:
16.32.1010 (HUA0000000004)
OS:
OpenEuler 22.03
Kernel:
5.10
DPDK:
21.11.5
Results:
ARM:
28.598Gbps, 27.928Mpps
X86:
34.015Gbps, 33.218Mpps
After some checks, I suspect the bottleneck is mainly on the NIC side. Have you tested the performance of the CX5 on ARM servers?
Do you have any optimization suggestions for ARM servers, such as tuning parameters or a recommended firmware version?
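For example, would mlx5 PMD devargs such as CQE compression or Multi-Packet RQ (described in the DPDK mlx5 driver guide) be expected to help small-packet Rx throughput on ARM? A sketch of how I understand they would be passed (the PCI address is a placeholder, and these particular devarg values are just an assumption on my part, not something I have validated):

```shell
# Hypothetical RX-side invocation with mlx5 devargs appended to the
# device's -a (allow) option; 0000:xx:00.0 is a placeholder PCI address.
#   rxq_cqe_comp_en=1  - enable CQE compression (reduces PCIe traffic)
#   mprq_en=1          - enable Multi-Packet RQ
#   rxqs_min_mprq=1    - allow MPRQ even with a single Rx queue
dpdk-testpmd -l 1-23 -n 4 \
  -a 0000:xx:00.0,rxq_cqe_comp_en=1,mprq_en=1,rxqs_min_mprq=1 \
  -- -i --rxq=1 --txq=1 --txd=1024 --rxd=1024 --nb-cores=1 \
  --burst=128 --forward-mode=rxonly -a --mbcache=512
```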
Thanks