Hi,

I am testing the performance of the Mellanox CX5 (ConnectX-5) 100GbE NIC. I found that in the single-core scenario, the performance (Mpps) of an AMD (x86) server is better than that of a Kunpeng 920 (ARM) server. The test commands are as follows:

RX side:

dpdk-testpmd -l 1-23 -n 4 -a XXXX -- -i --rxq=1 --txq=1 --txd=1024 --rxd=1024 --nb-cores=1 --eth-peer=0,xxxx  --burst=128 --forward-mode=rxonly -a --txpkts=128 --mbcache=512 --rss-udp

TX side (packet size is 128 bytes):

dpdk-testpmd -a XXXXX -l 1-23 -n 4 -- -i --rxq=4 --txq=4 --txd=1024 --rxd=1024 --nb-cores=4  --eth-peer=0,XXXXX --burst=64    --forward-mode=txonly -a --txpkts=128 --mbcache=512 --rss-udp

 

Firmware version:

16.32.1010 (HUA0000000004)

OS:

OpenEuler 22.03

Kernel:

5.10

DPDK:

21.11.5

 

Results:

ARM:

28.598Gbps, 27.928Mpps

X86:

34.015Gbps, 33.218Mpps
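For reference, the Gbps and Mpps figures above are mutually consistent with the 128-byte packets set by --txpkts=128; a quick sanity check:

```python
# Sanity check: Gbps should equal Mpps * 128 bytes * 8 bits
# for the 128-byte packets used in this test (--txpkts=128).
FRAME_BYTES = 128

for label, mpps in [("ARM", 27.928), ("x86", 33.218)]:
    gbps = mpps * 1e6 * FRAME_BYTES * 8 / 1e9
    print(f"{label}: {gbps:.3f} Gbps")
# ARM: 28.598 Gbps, x86: 34.015 Gbps -- matching the measured numbers,
# so the x86 server is ahead by roughly 19% in both Mpps and Gbps.
```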

 

After some checks, I suspect that the bottleneck is mainly the NIC. Have you tested the performance of the CX5 on an ARM server?

Do you have any optimization suggestions for ARM servers, such as tuning parameters or a different firmware version?

 

Thanks