DPDK usage discussions
* Performance "problem" or how to get power of the DPDK
@ 2022-12-26 12:20 Ruslan R. Laishev
  2022-12-26 13:07 ` Dmitry Kozlyuk
  0 siblings, 1 reply; 9+ messages in thread
From: Ruslan R. Laishev @ 2022-12-26 12:20 UTC (permalink / raw)
  To: users

[-- Attachment #1: Type: text/html, Size: 748 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Performance "problem" or how to get power of the DPDK
  2022-12-26 12:20 Performance "problem" or how to get power of the DPDK Ruslan R. Laishev
@ 2022-12-26 13:07 ` Dmitry Kozlyuk
  2022-12-26 13:22   ` Ruslan R. Laishev
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Kozlyuk @ 2022-12-26 13:07 UTC (permalink / raw)
  To: Ruslan R. Laishev; +Cc: users

Hi,

2022-12-26 15:20 (UTC+0300), Ruslan R. Laishev:
> I am studying programming with the DPDK SDK, so I wrote a small app to send/receive packets. Now I am testing it and see the following situation:
> iperf3 shows 9.4-9.7 Gbps over TCP
>  
> My app can *send* at only 4+ Gbps (I see the counters in rte_eth_stats). I have tried to speed up my app by:
> - using more than one TX queue (the device claims to support 64)
> - increasing the burst size from 32 up to 128
> - turning off any offloads related to checksumming
>  
> No effect.

Please tell us more about what your app does and how it uses DPDK.
Are you sure that all cores are loaded? E.g. if you send identical packets,
RSS can steer them all to a single queue and thus a single core.

What performance do you see using testpmd with txonly/rxonly forward mode,
if applicable?

What is the packet performance, i.e. in Mpps rather than Gbps, and what is the packet size?
Unless you do TCP payload processing (or compute checksums over large payloads),
packets per second usually matter more than bits per second.

* Re: Performance "problem" or how to get power of the DPDK
  2022-12-26 13:07 ` Dmitry Kozlyuk
@ 2022-12-26 13:22   ` Ruslan R. Laishev
  2022-12-26 19:21     ` Ruslan R. Laishev
  0 siblings, 1 reply; 9+ messages in thread
From: Ruslan R. Laishev @ 2022-12-26 13:22 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users

[-- Attachment #1: Type: text/html, Size: 2728 bytes --]

* Re: Performance "problem" or how to get power of the DPDK
  2022-12-26 13:22   ` Ruslan R. Laishev
@ 2022-12-26 19:21     ` Ruslan R. Laishev
  2022-12-26 19:24       ` Ruslan R. Laishev
  0 siblings, 1 reply; 9+ messages in thread
From: Ruslan R. Laishev @ 2022-12-26 19:21 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users

[-- Attachment #1: Type: text/html, Size: 6142 bytes --]

* Re: Performance "problem" or how to get power of the DPDK
  2022-12-26 19:21     ` Ruslan R. Laishev
@ 2022-12-26 19:24       ` Ruslan R. Laishev
  2022-12-26 20:04         ` Dmitry Kozlyuk
  0 siblings, 1 reply; 9+ messages in thread
From: Ruslan R. Laishev @ 2022-12-26 19:24 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users

[-- Attachment #1: Type: text/html, Size: 7360 bytes --]

* Re: Performance "problem" or how to get power of the DPDK
  2022-12-26 19:24       ` Ruslan R. Laishev
@ 2022-12-26 20:04         ` Dmitry Kozlyuk
  2022-12-26 20:10           ` Ruslan R. Laishev
  2023-01-08 13:23           ` One transmitter worker stop after "some time" (Was: Performance "problem" or how to get power of the DPDK ) Ruslan R. Laishev
  0 siblings, 2 replies; 9+ messages in thread
From: Dmitry Kozlyuk @ 2022-12-26 20:04 UTC (permalink / raw)
  To: Ruslan R. Laishev; +Cc: users

2022-12-26 22:24 (UTC+0300), Ruslan R. Laishev:
> Sorry, right interface:
>  
> Network devices using DPDK-compatible driver
> ============================================
> 0000:09:00.0 'Ethernet Connection X553 10 GbE SFP+ 15c4' drv=igb_uio unused=ixgbe,vfio-pci
>  
>  
> 26.12.2022, 22:21, "Ruslan R. Laishev" <zator@yandex.ru>:
> Here is what I do on the transmitter side; maybe you will find a moment to look at a pupil's code: https://pastebin.com/1WMyXtr5

I see nothing terribly wrong there.
If you run it on more than one core, each using a distinct Tx queue,
do you see a performance increase?

You might want to remove the shared atomic counter;
use simple per-lcore counters and sum them for display.
Also, if "g_nqueue" is not const, "%" should not be used on the data path.

> > I spent some time with testpmd; sorry, but there is no way to get rate information while sending. Maybe I'll add it into the code ...

Run testpmd as follows to make it transmit packets
(add DPDK options before "--" as needed):

dpdk-testpmd -- --forward-mode=txonly --tx-first

It will print statistics on exit ("exit" or Ctrl+C).

> > Some statistics (rte_eth_stats)  :
> >  
> > Date;Time;Device;Port;Name;Area;In pkts;Out pkts;In bytes;Out bytes;In missed;In errors;Out errors;No mbufs;
> > (payload is 0 octets)
> > 26-12-2022; 22:14:10; 0000:09:00.0; _PEA00:; WAN0; WAN; 0; 21122753; 0; 1563085750; 0; 0; 0; 0
> > 26-12-2022; 22:14:20; 0000:09:00.0; _PEA00:; WAN0; WAN; 0; 21122392; 0; 1563057008; 0; 0; 0; 0
> > 26-12-2022; 22:14:30; 0000:09:00.0; _PEA00:; WAN0; WAN; 0; 21121978; 0; 1563024500; 0; 0; 0; 0
> > 26-12-2022; 22:14:40; 0000:09:00.0; _PEA00:; WAN0; WAN; 0; 21122012; 0; 1563028888; 0; 0; 0; 0
> >  
> > (payload is 1024 octets)
> > 26-12-2022; 22:15:20; 0000:09:00.0; _PEA00:; WAN0; WAN; 0; 5246799; 0; 4648659464; 0; 0; 0; 0
> > 26-12-2022; 22:15:30; 0000:09:00.0; _PEA00:; WAN0; WAN; 0; 5246456; 0; 4648360016; 0; 0; 0; 0
> > 26-12-2022; 22:15:40; 0000:09:00.0; _PEA00:; WAN0; WAN; 0; 5246168; 0; 4648108408; 0; 0; 0; 0
> > 26-12-2022; 22:15:50; 0000:09:00.0; _PEA00:; WAN0; WAN; 0; 5246143; 0; 4648084478; 0; 0; 0; 0
> > 26-12-2022; 22:16:00; 0000:09:00.0; _PEA00:; WAN0; WAN; 0; 5246129; 0; 4648070294; 0; 0; 0; 0
> >  
> > A piece of the dpdk-devbind.sh output:
> > 0000:02:00.0 'I211 Gigabit Network Connection 1539' if=enp2s0 drv=igb unused=igb_uio,vfio-pci *Active*
> >  
> > 26.12.2022, 16:22, "Ruslan R. Laishev" <zator@yandex.ru>:
> > Thanks for the answer.
> >>  
> >> Oops, sorry, some details:
> >> - one core runs the generator routine
> >> - one core runs a routine to save/display statistics
> >>  
> >> The generator core runs a loop like:
> >>  
> >> while (1) {
> >> get buffer from pool
> >> make eth+ip+udp header (it's static content)
> >> generate payload like memset(packet.payload , 'A' + something, payload_size);
> >> generate packet sequence and CRC32C  - and add it to the payload part
> >> "send" packet to tx_buffer
> >>  
> >> if (tx_buffer.size == tx_buffer.length)
> >> do flush()
> >> }
> >>  
> >> "header; part of the packet : sizeof(eth+ip+udp) -
> >> "payload" part - 20-1024 octets

From your statistics I calculate 74 bytes per packet (Ethernet),
i.e. the theoretical maximum for 10 Gbps is 12.25 Mpps,
with a packet budget of 81 ns per packet.

https://calc.pktgen.com/#gbe=20&payload=40&rate=12250000&header=20

> >> RSS - it's on received side, yes ?

Correct.
I asked because it was unclear from the initial message
whether your app does the receiving or not.

P.S. Please avoid top-posting, i.e. reply below the quote.

* Re: Performance "problem" or how to get power of the DPDK
  2022-12-26 20:04         ` Dmitry Kozlyuk
@ 2022-12-26 20:10           ` Ruslan R. Laishev
  2023-01-08 13:23           ` One transmitter worker stop after "some time" (Was: Performance "problem" or how to get power of the DPDK ) Ruslan R. Laishev
  1 sibling, 0 replies; 9+ messages in thread
From: Ruslan R. Laishev @ 2022-12-26 20:10 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users

[-- Attachment #1: Type: text/html, Size: 5310 bytes --]

* One transmitter worker stop after "some time" (Was: Performance "problem" or how to get power of the DPDK )
  2022-12-26 20:04         ` Dmitry Kozlyuk
  2022-12-26 20:10           ` Ruslan R. Laishev
@ 2023-01-08 13:23           ` Ruslan R. Laishev
  2023-01-08 17:13             ` Stephen Hemminger
  1 sibling, 1 reply; 9+ messages in thread
From: Ruslan R. Laishev @ 2023-01-08 13:23 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users

[-- Attachment #1: Type: text/html, Size: 4504 bytes --]

* Re: One transmitter worker stop after "some time" (Was: Performance "problem" or how to get power of the DPDK )
  2023-01-08 13:23           ` One transmitter worker stop after "some time" (Was: Performance "problem" or how to get power of the DPDK ) Ruslan R. Laishev
@ 2023-01-08 17:13             ` Stephen Hemminger
  0 siblings, 0 replies; 9+ messages in thread
From: Stephen Hemminger @ 2023-01-08 17:13 UTC (permalink / raw)
  To: Ruslan R. Laishev; +Cc: Dmitry Kozlyuk, users

On Sun, 08 Jan 2023 16:23:13 +0300
Ruslan R. Laishev <zator@yandex.ru> wrote:

> Hello!
>  
> Following the advice from the previous mails, I recoded my little app to use several lcore-queue pairs to generate traffic. Thanks, it works fine; I now see 8+ Gbps with 2 workers.
> But now I have another situation which I cannot resolve. There are 2 workers (every worker runs on an assigned lcore and puts packets into a separate tx queue):
> after the app starts, both workers work for some time, but at "some moment" one worker can no longer get mbufs from rte_pktmbuf_alloc_bulk(). Just for demonstration, a piece of the stats:
>  
> At start :
> 08-01-2023 16:03:20.065  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#001] TX/NoMbufs/Flush:1981397/0/1981397
> 08-01-2023 16:03:20.065  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#002] TX/NoMbufs/Flush:1989108/0/1989108
>  
> Since "some moment"
> 08-01-2023 16:15:20.110  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#001] TX/NoMbufs/Flush:2197615/5778976181/2197631
> 08-01-2023 16:15:20.110  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#002] TX/NoMbufs/Flush:3952732/0/3952732
>  
> 08-01-2023 16:15:30.111  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#001] TX/NoMbufs/Flush:2197615/5869388078/2197631
> 08-01-2023 16:15:30.111  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#002] TX/NoMbufs/Flush:3980054/0/3980054
>  
> 08-01-2023 16:15:40.111  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#001] TX/NoMbufs/Flush:2197615/5959777107/2197631
> 08-01-2023 16:15:40.111  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#002] TX/NoMbufs/Flush:4007378/0/4007378
>  
> 08-01-2023 16:15:50.112  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#001] TX/NoMbufs/Flush:2197615/6050173812/2197631
> 08-01-2023 16:15:50.112  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#002] TX/NoMbufs/Flush:4034699/0/4034699
>  
> 08-01-2023 16:16:00.112  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#001] TX/NoMbufs/Flush:2197615/6140583818/2197631
> 08-01-2023 16:16:00.112  58628 [CPPROC\s_proc_auxilary:822] %TTR2CP-I:  [LCore:#002] TX/NoMbufs/Flush:4062021/0/4062021
>  
> So one worker works fine and as expected; the second worker permanently fails to get mbufs.
> Is there something I should check?
> Thanks in advance!
>  
> --- 
> С уважением,
> Ruslan R. Laishev
> OpenVMS bigot, natural born system/network progger, C contractor.
> +79013163222
> +79910009922
>  
> 

Two things to look at. First, is the allocated mbuf pool big enough to handle the
maximum number of mbufs in flight in your application? For Tx, that is the number
of transmit queues multiplied by the number of transmit descriptors per ring, plus
some additional buffers for staging. Similar for the receive side.

Second, transmit mbufs need to be cleaned up by the device driver after they are
sent. Depending on the device, this may be triggered by the receive path, so
polling for received data may be needed even if you aren't doing any receives.

end of thread, other threads:[~2023-01-08 17:13 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-26 12:20 Performance "problem" or how to get power of the DPDK Ruslan R. Laishev
2022-12-26 13:07 ` Dmitry Kozlyuk
2022-12-26 13:22   ` Ruslan R. Laishev
2022-12-26 19:21     ` Ruslan R. Laishev
2022-12-26 19:24       ` Ruslan R. Laishev
2022-12-26 20:04         ` Dmitry Kozlyuk
2022-12-26 20:10           ` Ruslan R. Laishev
2023-01-08 13:23           ` One transmitter worker stop after "some time" (Was: Performance "problem" or how to get power of the DPDK ) Ruslan R. Laishev
2023-01-08 17:13             ` Stephen Hemminger
