Thanks for the answer.
 
Oops, sorry; here are some details:
- one core runs the generator routine
- one core runs a routine to save/display statistics (a minimal launch sketch follows below)
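For reference, a minimal sketch of that two-core layout, assuming a standard EAL setup; generator_loop() and stats_loop() are hypothetical placeholder names, not the real routines:

#include <stdlib.h>
#include <rte_eal.h>
#include <rte_launch.h>
#include <rte_lcore.h>
#include <rte_debug.h>

/* hypothetical names for the two routines described above */
static int generator_loop(void *arg)
{
    (void)arg;
    /* packet generation loop, see the sketch further below */
    return 0;
}

static void stats_loop(void)
{
    /* periodically call rte_eth_stats_get() and save/display the counters */
}

int main(int argc, char **argv)
{
    if (rte_eal_init(argc, argv) < 0)
        rte_exit(EXIT_FAILURE, "EAL init failed\n");

    /* launch the packet generator on the first worker lcore */
    unsigned int worker = rte_get_next_lcore(-1, 1, 0);
    rte_eal_remote_launch(generator_loop, NULL, worker);

    /* the main lcore saves/displays statistics */
    stats_loop();

    rte_eal_mp_wait_lcore();
    return 0;
}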
 
The generator core runs a routine like this:
 
while (1) {
    get a buffer from the pool
    make the eth+ip+udp header (static content)
    generate the payload: memset(packet.payload, 'A' + something, payload_size);
    generate the packet sequence number and CRC32C, and add them to the payload part
    "send" the packet to tx_buffer

    if (tx_buffer.size == tx_buffer.length)
        flush()
}
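Since the size/length check above matches the fields of struct rte_eth_dev_tx_buffer, here is a sketch of one loop iteration using that API. The send_one() wrapper, the port/queue/payload_size/seq parameters and the omitted CRC32C step are my assumptions for illustration, not the real code:

#include <string.h>
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_ether.h>
#include <rte_ip.h>
#include <rte_udp.h>

static void
send_one(struct rte_mempool *pool, struct rte_eth_dev_tx_buffer *txb,
         uint16_t port, uint16_t queue, uint16_t payload_size, uint32_t seq)
{
    const uint16_t hdr_len = sizeof(struct rte_ether_hdr) +
                             sizeof(struct rte_ipv4_hdr) +
                             sizeof(struct rte_udp_hdr);

    /* get a buffer from the pool */
    struct rte_mbuf *m = rte_pktmbuf_alloc(pool);
    if (m == NULL)
        return;

    char *pkt = rte_pktmbuf_append(m, hdr_len + payload_size);
    if (pkt == NULL) {
        rte_pktmbuf_free(m);
        return;
    }

    /* eth+ip+udp header: static content, could be memcpy'd from a template */
    memset(pkt, 0, hdr_len);

    /* payload, then the sequence number (CRC32C omitted in this sketch) */
    memset(pkt + hdr_len, 'A' + (seq & 7), payload_size);
    memcpy(pkt + hdr_len, &seq, sizeof(seq));

    /* "send" to tx_buffer; rte_eth_tx_buffer() flushes by itself once the
     * buffer holds 'size' packets, so an explicit length check is only
     * needed when flushing is done by hand */
    rte_eth_tx_buffer(port, queue, txb, m);
}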
 
"header; part of the packet : sizeof(eth+ip+udp) -
"payload" part - 20-1024 octets
 
RSS - that is on the receive side, yes?
 
testpmd - I have not tried it yet; I will.
 
 
26.12.2022, 16:07, "Dmitry Kozlyuk" <dmitry.kozliuk@gmail.com>:

Hi,

2022-12-26 15:20 (UTC+0300), Ruslan R. Laishev:

 I am studying programming with the DPDK SDK, so I wrote a small app to send/receive packets. Now I am testing it and see the following situation:
 iperf3 shows 9.4-9.7 Gbps on TCP
  
 my app can *send* only at 4+ Gbps (I see the counters in rte_eth_stats). I have tried to speed up my app by:
 - using more than one TX queue (the device claims to support 64)
 - increasing the burst size from 32 up to 128
 - turning off all checksum-related offloads
  
 No effect.


Please tell more about what your app does and how (w.r.t. DPDK usage).
Are you sure that all cores are loaded? E.g. if you send identical packets,
RSS can steer them all to a single queue and thus a single core.

What performance do you see using testpmd with txonly/rxonly forward mode,
if applicable?

What is the packet performance, i.e. Mpps, not Gbps, and packet size?
Unless you do TCP payload processing (or compute large payload checksums),
packets per second usually matter rather than bits per second.

 
 
--- 
Best regards,
Ruslan R. Laishev
OpenVMS bigot, natural born system/network progger, C contractor.
+79013163222
+79910009922