DPDK usage discussions
* Mellanox Connectx-6 Dx dual port performance
@ 2022-03-22  9:03 Дмитрий Степанов
  2022-04-10  7:30 ` Asaf Penso
  0 siblings, 1 reply; 2+ messages in thread
From: Дмитрий Степанов @ 2022-03-22  9:03 UTC (permalink / raw)
  To: users


Hi!

I'm testing overall dual-port performance on a ConnectX-6 Dx EN adapter card
(100GbE; dual-port QSFP56; PCIe 4.0/3.0 x16) with DPDK 21.11 on Ubuntu
20.04.
I have 2 dual-port NICs installed on the same server (but on different NUMA
nodes), which I use as a generator and a receiver respectively.
First, I started a custom packet generator on port 0 and got 148 Mpps TX
(64-byte TCP packets with zero payload length), which equals the maximum of
the 100 Gbps line rate. Then I launched the same generator with the same
parameters simultaneously on port 1.
Performance on both ports decreased to 105-106 Mpps per port (210-212 Mpps
in sum). If I use 512-byte TCP packets, then running generators on both
ports gives me 23 Mpps for each port (46 Mpps in sum, which for the given
TCP packet size equals the maximum line rate).
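For reference, these per-port figures can be checked against the Ethernet
line-rate arithmetic (each frame occupies 20 extra bytes on the wire for
preamble, start-of-frame delimiter, and inter-frame gap); a quick sketch:

```shell
# Max packet rate at 100 Gbps = line_rate / ((frame_size + 20 bytes) * 8 bits);
# the 20 bytes are preamble + SFD + inter-frame gap.
for size in 64 512; do
    awk -v s="$size" 'BEGIN { printf "%d bytes: %.1f Mpps\n", s, 100e9 / ((s + 20) * 8) / 1e6 }'
done
# 64 bytes: 148.8 Mpps
# 512 bytes: 23.5 Mpps
```

These match the 148 Mpps and 23 Mpps figures quoted above.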

The Mellanox performance report
http://fast.dpdk.org/doc/perf/DPDK_21_08_Mellanox_NIC_performance_report.pdf
doesn't contain measurements for the TX path, only for RX.
Test #11, "Mellanox ConnectX-6 Dx 100GbE PCIe Gen4 Throughput at Zero
Packet Loss (2x 100GbE)", for the RX path shows nearly the same results
that I got for the TX path (214 Mpps for 64-byte packets, 47 Mpps for
512-byte packets). The question is: should my results for the TX path
coincide with the reported results for the RX path? Why can't I get 148 x 2
Mpps for small packets when using both ports? What is the bottleneck here:
PCIe, RAM, or the NIC itself?
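One quick sanity check on the PCIe side (device address taken from the
commands below; field names as printed by lspci) is to confirm the slot
actually negotiated Gen4 x16 rather than downtraining:

```shell
# A downtrained link (e.g. "Speed 8GT/s" or "Width x8" in LnkSta) would halve
# the available DMA bandwidth and could explain a per-port shortfall.
sudo lspci -s c1:00.0 -vvv | grep -E 'LnkCap:|LnkSta:'
# For a healthy Gen4 x16 slot, LnkSta should report "Speed 16GT/s, Width x16".
```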

To test the RX path I used the testpmd and l3fwd (slightly modified to
print RX stats) utilities.

./dpdk-testpmd -l 64-127 -n 4 -a
0000:c1:00.0,mprq_en=1,mprq_log_stride_num=9 -a
0000:c1:00.1,mprq_en=1,mprq_log_stride_num=9 -- --stats-period 1
--nb-cores=16 --rxq=16 --txq=16 --rxd=4096 --txd=4096 --burst=64
--mbcache=512

./build/examples/dpdk-l3fwd -l 96-111 -n 4 --socket-mem=0,4096 -a
0000:c1:00.0,mprq_en=1,rxqs_min_mprq=1,mprq_log_stride_num=9,txq_inline_mpw=128,rxq_pkt_pad_en=1
-a
0000:c1:00.1,mprq_en=1,rxqs_min_mprq=1,mprq_log_stride_num=9,txq_inline_mpw=128,rxq_pkt_pad_en=1
-- -p 0x3 -P
--config='(0,0,111),(0,1,110),(0,2,109),(0,3,108),(0,4,107),(0,5,106),(0,6,105),(0,7,104),(1,0,103),(1,1,102),(1,2,101),(1,3,100),(1,4,99),(1,5,98),(1,6,97),(1,7,96)'
--eth-dest=0,00:15:77:1f:eb:fb --eth-dest=1,00:15:77:1f:eb:fb
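Since the two NICs sit on different NUMA nodes, it may also be worth
double-checking that the lcores given to -l are local to the device under
test (sysfs path and BDF as used above; a sketch):

```shell
# NUMA node the NIC is attached to (-1 means unknown / single-node system):
cat /sys/bus/pci/devices/0000:c1:00.0/numa_node
# CPU-to-node map, to verify that cores 96-111 belong to that same node:
lscpu | grep -i 'NUMA node'
```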

Then I provided 105 Mpps of 64-byte TCP packets from another dual-port NIC
to each port (210 Mpps in sum). As described above, I can't get more than
210 Mpps in sum from the generator. In both cases I was not able to get
more than 75-85 Mpps per port (150-170 Mpps in sum) on the RX path. This
contradicts the results provided in the Mellanox performance report (214
Mpps for both ports, 112 Mpps per port on the RX path). Running only a
single generator gives me 148 Mpps on both the TX and RX sides. But after
starting the generator on the second port, TX performance decreased to 105
Mpps per port (210 Mpps in sum) and RX performance decreased to 75-85 Mpps
per port (150-170 Mpps in sum for both ports). Could these poor RX results
be due to the generator not being fully utilized, or should I be getting
the full 210 Mpps supplied by the generator across both ports? I applied
all the system tuning suggestions described in the Mellanox performance
report document.
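In case it helps localize where the RX drops happen: mlx5 devices expose
hardware counters through ethtool that distinguish wire-side drops from
host-side ones (counter names as in recent mlx5 kernel drivers; the
interface name here is a hypothetical placeholder):

```shell
IFACE=enp193s0f0   # hypothetical netdev name; substitute your port's interface
# rx_discards_phy  - dropped on the wire by the NIC (device/pipeline limit)
# rx_out_of_buffer - dropped because RX descriptors ran out (host too slow)
ethtool -S "$IFACE" | grep -E 'rx_discards_phy|rx_out_of_buffer|rx_packets_phy'
```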
I would be grateful for any advice.

Thanks in advance!



* RE: Mellanox Connectx-6 Dx dual port performance
  2022-03-22  9:03 Mellanox Connectx-6 Dx dual port performance Дмитрий Степанов
@ 2022-04-10  7:30 ` Asaf Penso
  0 siblings, 0 replies; 2+ messages in thread
From: Asaf Penso @ 2022-04-10  7:30 UTC (permalink / raw)
  To: Дмитрий Степанов, users


Hello,

Thanks for your mail and analysis.
The results below, a maximum packet rate of 214 Mpps for the dual-port ConnectX-6 Dx, are expected and are aligned with the NIC's capabilities.

Regards,
Asaf Penso

From: Дмитрий Степанов <stepanov.dmit@gmail.com>
Sent: Tuesday, March 22, 2022 11:04 AM
To: users@dpdk.org
Subject: Mellanox Connectx-6 Dx dual port performance


