DPDK patches and discussions
* High packet capturing rate in DPDK enabled port
@ 2024-05-05  7:09 Fuji Nafiul
  2024-05-05 16:02 ` Stephen Hemminger
  0 siblings, 1 reply; 3+ messages in thread
From: Fuji Nafiul @ 2024-05-05  7:09 UTC (permalink / raw)
  To: users, dev


I have a DPDK-enabled port (Linux server) that serves around 5,000-50,000
concurrent calls, with packet sizes of 80 to 200 bytes. So at peak times I
require a combined packet-capture and file-writing rate of around
1 GByte/s, or 8 Gbit/s (at least 0.5 GByte/s is expected). The
documentation for DPDK's official packet capture tool, "dpdk-dumpcap",
says it can handle around 10 MByte/s, which is far less than required. I
implemented simple packet-capture and pcap-writing code that was able to
dump data for around 5,000-7,000 concurrent calls using 1 core and 1 ring
of size 4096, all integrated into the actual media code (I didn't use
librte_pdump; I simply copied packets to a separate rte_ring after
receiving them via rte_eth_rx_burst() and before sending them via
rte_eth_tx_burst()). I know I can run this on multiple cores with
multiple rings and so on, but is there an existing project that already
does this?
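
[A minimal sketch of the copy-to-ring scheme described above, in DPDK C.
The ring and mempool names are hypothetical, error handling is trimmed,
and rte_pktmbuf_copy() is used so the capture path owns its own buffer;
this is an illustration of the approach, not the poster's actual code.]

```c
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_ring.h>

#define BURST 32

static struct rte_ring    *cap_ring;  /* created with rte_ring_create()   */
static struct rte_mempool *cap_pool;  /* mempool for the capture copies   */

static void fwd_and_capture(uint16_t port, uint16_t queue)
{
    struct rte_mbuf *bufs[BURST];
    uint16_t nb = rte_eth_rx_burst(port, queue, bufs, BURST);

    for (uint16_t i = 0; i < nb; i++) {
        /* Copy so the capture path never touches an mbuf the TX path
         * may free or rewrite. */
        struct rte_mbuf *dup =
            rte_pktmbuf_copy(bufs[i], cap_pool, 0, UINT32_MAX);
        if (dup != NULL && rte_ring_enqueue(cap_ring, dup) < 0)
            rte_pktmbuf_free(dup);  /* ring full: drop the capture copy */
    }
    rte_eth_tx_burst(port, queue, bufs, nb);
}
```

A separate writer core would drain cap_ring and write pcap records;
the copy per packet is exactly the overhead discussed later in this
thread.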

I found a third-party project named "dpdkcap" which says it can support up
to 10 Gbit/s. Has anyone used it, and what is your experience with it?

Or should I modify the "dpdk-dumpcap" project to my needs and implement
multicore and multi-ring support to extend its capability?
Thanks in advance.



* Re: High packet capturing rate in DPDK enabled port
  2024-05-05  7:09 High packet capturing rate in DPDK enabled port Fuji Nafiul
@ 2024-05-05 16:02 ` Stephen Hemminger
       [not found]   ` <CA+3hWeyG3tKv1+8gv0D3SzV_9oObwx5LFhp8_3xfCB4CGuU9GA@mail.gmail.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Stephen Hemminger @ 2024-05-05 16:02 UTC (permalink / raw)
  To: Fuji Nafiul; +Cc: users, dev

On Sun, 5 May 2024 13:09:42 +0600
Fuji Nafiul <nafiul.fuji@gmail.com> wrote:

> I have a DPDK-enabled port (Linux server) that serves around 5,000-50,000
> concurrent calls, with packet sizes of 80 to 200 bytes. So at peak times I
> require a combined packet-capture and file-writing rate of around
> 1 GByte/s, or 8 Gbit/s (at least 0.5 GByte/s is expected). The
> documentation for DPDK's official packet capture tool, "dpdk-dumpcap",
> says it can handle around 10 MByte/s, which is far less than required. I
> implemented simple packet-capture and pcap-writing code that was able to
> dump data for around 5,000-7,000 concurrent calls using 1 core and 1 ring
> of size 4096, all integrated into the actual media code (I didn't use
> librte_pdump; I simply copied packets to a separate rte_ring after
> receiving them via rte_eth_rx_burst() and before sending them via
> rte_eth_tx_burst()). I know I can run this on multiple cores with
> multiple rings and so on, but is there an existing project that already
> does this?
> 
> I found a third-party project named "dpdkcap" which says it can support up
> to 10 Gbit/s. Has anyone used it, and what is your experience with it?
> 
> Or should I modify the "dpdk-dumpcap" project to my needs and implement
> multicore and multi-ring support to extend its capability?
> Thanks in advance.

The limiting factor in high-speed packet capture is usually the speed of
writing to disk, and doing a single write per packet is part of the
problem. Getting higher performance requires a faster SSD and using the
io_uring API.

I do not believe that dpdkcap really supports writing at 10 Gbit/s, only
that it can capture data from a 10 Gbit/s device.


* Re: High packet capturing rate in DPDK enabled port
       [not found]   ` <CA+3hWeyG3tKv1+8gv0D3SzV_9oObwx5LFhp8_3xfCB4CGuU9GA@mail.gmail.com>
@ 2024-05-06 17:59     ` Stephen Hemminger
  0 siblings, 0 replies; 3+ messages in thread
From: Stephen Hemminger @ 2024-05-06 17:59 UTC (permalink / raw)
  To: Fuji Nafiul, dev

On Mon, 6 May 2024 02:15:10 +0600
Fuji Nafiul <nafiul.fuji@gmail.com> wrote:

> I understand that I will need more cores and a faster SSD, which I have.
> The thing is: is there any existing project that exposes parameters to
> dump at the highest possible rate with the available resources, or do I
> have to use the pdump framework and implement it myself? I previously
> wrote dumping code integrated with my DPDK media application that was
> able to dump around 0.5 Gbit/s (1 big rte_ring and 2 cores, not much
> optimized), then found out that the pdump framework does a similar kind
> of thing, just with a secondary process intercepting rx/tx. But I need to
> modify it to scale, which is why I was wondering whether there is already
> a project that aims to dump at the highest rate possible on a DPDK port;
> otherwise, I will start modifying it. I haven't looked into the "dpdkcap"
> code, but it says it aims to dump around 10 Gbit/s if resources are
> available. Has anyone used or tested this project, or tried to modify the
> pdump code to scale?

The things that could speed up dpdk-dumpcap are:

1. Use Linux async I/O via io_uring. This creates some work around
   supporting older distros; I would not make it an option, because if
   io_uring works it should simply be used. It might be easier now that
   RHEL/CentOS 7 is end of life and no longer needs to be supported.

2. Get rid of the copy on the pdump side by using reference counts. But
   this exposes potential issues with drivers and applications that don't
   handle mbufs with refcnt > 1. It means that if refcnt > 1, the
   application cannot overwrite the buffer. On the Tx side, that makes
   handling VLANs more complicated. On the Rx side, it needs to be an
   option, and most applications (especially 3rd-party ones) can't handle
   refcounts.

3. Get rid of the callback and just put the mbuf into the ring directly.
   Indirect calls slow things down and introduce bugs when the secondary
   process is doing rx/tx.

4. Have dumpcap use multiple threads (one per queue) when doing the
   ring -> write step.

These are in order of complexity/performance gain.
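
[Item 2, dropping the copy in favor of reference counts, could look
roughly like the sketch below; the function name is illustrative. The
caveat from the list applies: once refcnt > 1, the data path must treat
the mbuf as read-only until the writer releases its reference.]

```c
#include <rte_mbuf.h>
#include <rte_ring.h>

static inline void capture_by_ref(struct rte_ring *cap_ring,
                                  struct rte_mbuf *m)
{
    /* Take an extra reference instead of copying the packet data;
     * the capture path now co-owns m with the data path. */
    rte_mbuf_refcnt_update(m, 1);

    if (rte_ring_enqueue(cap_ring, m) < 0)
        rte_pktmbuf_free(m);  /* ring full: drops only our reference */
}
```

The writer core calls rte_pktmbuf_free() after dumping each mbuf, which
decrements the refcount and returns the buffer to its pool only when the
data path has also released it.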

I haven't done them because I don't work full time on this, and it would
require a lot of testing effort as well.


end of thread, other threads:[~2024-05-06 18:04 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-05  7:09 High packet capturing rate in DPDK enabled port Fuji Nafiul
2024-05-05 16:02 ` Stephen Hemminger
     [not found]   ` <CA+3hWeyG3tKv1+8gv0D3SzV_9oObwx5LFhp8_3xfCB4CGuU9GA@mail.gmail.com>
2024-05-06 17:59     ` Stephen Hemminger
