From: "Zhou, YidingX" <yidingx.zhou@intel.com>
To: Ferruh Yigit <ferruh.yigit@xilinx.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Zhang, Qi Z" <qi.z.zhang@intel.com>,
"stable@dpdk.org" <stable@dpdk.org>
Subject: RE: [PATCH] net/pcap: reduce time for stopping device
Date: Thu, 1 Sep 2022 07:40:46 +0000
Message-ID: <DM5PR1101MB21071904C922173A1A5E6F93857B9@DM5PR1101MB2107.namprd11.prod.outlook.com>
In-Reply-To: <DM5PR1101MB2107A250EC67513C52E95D5885769@DM5PR1101MB2107.namprd11.prod.outlook.com>
Hi, Ferruh
> > >>> The pcap file will be synchronized to the disk when stopping the device.
> > >>> It takes a long time if the file is large, which can cause the
> > >>> 'detach sync request' to time out when the device is closed in a
> > >>> multi-process scenario.
> > >>>
> > >>> This commit fixes the issue by performing the synchronization in
> > >>> the Tx path.
> > >>>
> > >>> Fixes: 4c173302c307 ("pcap: add new driver")
> > >>> Cc: stable@dpdk.org
> > >>>
> > >>> Signed-off-by: Yiding Zhou <yidingx.zhou@intel.com>
> > >>> ---
> > >>> drivers/net/pcap/pcap_ethdev.c | 18 ++++++++++++++++--
> > >>> 1 file changed, 16 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/drivers/net/pcap/pcap_ethdev.c b/drivers/net/pcap/pcap_ethdev.c
> > >>> index ec29fd6bc5..52eafa5674 100644
> > >>> --- a/drivers/net/pcap/pcap_ethdev.c
> > >>> +++ b/drivers/net/pcap/pcap_ethdev.c
> > >>> @@ -3,7 +3,7 @@
> > >>> * Copyright(c) 2014 6WIND S.A.
> > >>> * All rights reserved.
> > >>> */
> > >>> -
> > >>> +#include <unistd.h>
> > >>> #include <time.h>
> > >>>
> > >>> #include <pcap.h>
> > >>> @@ -38,6 +38,8 @@
> > >>>
> > >>> #define RTE_PMD_PCAP_MAX_QUEUES 16
> > >>>
> > >>> +#define ETH_PCAP_SYNC_THRESHOLD 0x20000000
> > >>> +
> >
> > I guess this is 512MB; can you please add a comment for it?
> > Is there any specific reason to select this value, or is it arbitrary?
> >
> >
>
> 512M is arbitrary, because there is no API to get the disk cache size associated
> with a specific file.
> I will test the performance impact of different values.
>
> > >>> static char errbuf[PCAP_ERRBUF_SIZE];
> > >>> static struct timespec start_time;
> > >>> static uint64_t start_cycles;
> > >>> @@ -47,6 +49,8 @@ static uint8_t iface_idx;
> > >>> static uint64_t timestamp_rx_dynflag;
> > >>> static int timestamp_dynfield_offset = -1;
> > >>>
> > >>> +RTE_DEFINE_PER_LCORE(uint64_t, _pcap_cached_bytes);
> > >>> +
> > >>> struct queue_stat {
> > >>> volatile unsigned long pkts;
> > >>> volatile unsigned long bytes;
> > >>> @@ -144,6 +148,16 @@ static struct rte_eth_link pmd_link = {
> > >>>
> > >>> RTE_LOG_REGISTER_DEFAULT(eth_pcap_logtype, NOTICE);
> > >>>
> > >>> +static inline void
> > >>> +pcap_dumper_data_sync(pcap_dumper_t *dumper, uint32_t bytes)
> > >>> +{
> >
> > 'pcap_' is the namespace of libpcap; can you select another
> > prefix, like 'eth_pcap_', as many driver functions do?
> >
> > >>> + RTE_PER_LCORE(_pcap_cached_bytes) += bytes;
> > >>> + if (unlikely(RTE_PER_LCORE(_pcap_cached_bytes) > ETH_PCAP_SYNC_THRESHOLD)) {
> > >>> + if (!fdatasync(fileno(pcap_dump_file(dumper))))
> > >>> + RTE_PER_LCORE(_pcap_cached_bytes) = 0;
> > >>> + }
> > >>> +}
> > >>> +
> >
> > pcap supports multiple queues, and each queue creates its own pcap
> > dumper, while a single core/thread can serve several of these
> > dumpers. In that case I think the per-lcore variable logic above doesn't work.
> >
> > And instead of having a global value, what do you think about adding
> > a variable to 'struct pcap_tx_queue' for this purpose?
> >
>
> Thanks for the comments, I will follow this.
>
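For reference, a rough sketch of that per-queue counter idea (the 'sync_bytes' field and the 'eth_pcap_dumper_data_sync()' name below are hypothetical, not current driver code):

#include <stdint.h>
#include <stdio.h>   /* fileno() */
#include <unistd.h>  /* fdatasync() */
#include <pcap.h>

#define ETH_PCAP_SYNC_THRESHOLD 0x20000000 /* 512 MB, same arbitrary value */

/* Per-queue byte counter instead of the per-lcore one, so each dumper is
 * synced independently even when one lcore serves several queues. */
struct pcap_tx_queue {
	pcap_dumper_t *dumper;
	uint64_t sync_bytes;     /* bytes dumped since the last fdatasync() */
	/* ... existing fields (stats, name, type) omitted ... */
};

static inline void
eth_pcap_dumper_data_sync(struct pcap_tx_queue *q, uint32_t bytes)
{
	q->sync_bytes += bytes;
	if (q->sync_bytes > ETH_PCAP_SYNC_THRESHOLD &&
	    fdatasync(fileno(pcap_dump_file(q->dumper))) == 0)
		q->sync_bytes = 0;
}
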
> > >>> static struct queue_missed_stat*
> > >>> queue_missed_stat_update(struct rte_eth_dev *dev, unsigned int qid)
> > >>> {
> > >>> @@ -421,7 +435,7 @@ eth_pcap_tx_dumper(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> > >>> * process stops and to make sure the pcap file is actually written,
> > >>> * we flush the pcap dumper within each burst.
> > >>> */
> > >>> - pcap_dump_flush(dumper);
> > >>> + pcap_dumper_data_sync(dumper, tx_bytes);
> > >>
> > >> 'pcap_dump_flush()' should be doing the same thing, writing the
> > >> buffer to the file; isn't it working?
> > >>
> > >> Can you check the return value of the 'pcap_dump_flush()' API? I
> > >> wonder if it keeps failing for some reason.
> > >>
> > >
> > > 'pcap_dump_flush()' returns 0 each time without error. It calls
> > > 'fflush()' to flush userspace buffers to kernel buffers, not to
> > > disk; 'fdatasync()' is needed to ensure data is written to disk.
> > >
> >
> > 'pcap_dump_flush()' API documentation says "flushes the output buffer
> > to the ``savefile,''", but as you said it uses 'fflush()' internally,
> > so there is a chance that data is not written to the disk.
> >
> > In this case, won't we need to keep both: first flush, and later fsync/fdatasync?
> >
>
> I drew a diagram to describe it more clearly:
>
>    fwrite                      fclose/fflush                                fclose/fdatasync
>  ----------> | libc buffer | --------------> | disk cache in RAM        | -----------------> |disk|
>              | 4096 Bytes  |                 | size is determined by OS |
>
> When the libc buffer is full, the system automatically flushes it to the
> disk cache. Since the buffer is only 4096 bytes, it fills up easily, so
> there is no need to call 'fflush()' every time.
> The really time-consuming action is syncing the disk cache to the disk.
> Because the disk cache is very large, syncing it all at once during
> 'fclose()' takes a long time, so 'fdatasync()' needs to be called
> periodically to amortize that cost.
>
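To make the two levels concrete, a minimal standalone illustration in plain C (not driver code):

#include <stdio.h>
#include <unistd.h>

/* The two flush levels discussed above: fflush() only reaches the
 * kernel's disk cache in RAM; fdatasync() is what reaches the disk. */
static int
flush_to_disk(FILE *f)
{
	if (fflush(f) != 0)           /* libc buffer -> disk cache in RAM */
		return -1;
	return fdatasync(fileno(f));  /* disk cache in RAM -> disk */
}
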
> > Do you observe any performance difference after the change, since the
> > datapath now writes to the actual disk?
> >
>
> I will verify the performance difference between these two scenarios.
>
I have done some tests: using 'fdatasync()' increases the packet loss rate by
15%-20%, so it's not a good solution and needs rework. I plan to use a timer
to call 'pcap_dump_close()' to fix this; do you have any suggestions?
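For discussion, a rough sketch of that timer idea using DPDK's alarm API; doing a periodic 'fdatasync()' from the callback, rather than a full 'pcap_dump_close()', is just one possible shape, and the period and names below are placeholders:

#include <stdio.h>
#include <unistd.h>
#include <pcap.h>
#include <rte_alarm.h>

#define PCAP_SYNC_PERIOD_US (500 * 1000) /* 500 ms; placeholder value */

/* Runs in the EAL alarm (interrupt) thread: push cached pcap data to
 * disk off the datapath, then re-arm the alarm for the next period. */
static void
eth_pcap_sync_cb(void *arg)
{
	pcap_dumper_t *dumper = arg;

	fflush(pcap_dump_file(dumper));            /* libc buffer -> disk cache */
	fdatasync(fileno(pcap_dump_file(dumper))); /* disk cache -> disk */
	rte_eal_alarm_set(PCAP_SYNC_PERIOD_US, eth_pcap_sync_cb, dumper);
}

/* Armed once, e.g. when the Tx queue is started:
 *	rte_eal_alarm_set(PCAP_SYNC_PERIOD_US, eth_pcap_sync_cb, dumper);
 */

One caveat with this shape: the callback and the Tx path would touch the same FILE concurrently, so the real fix would need some synchronization between them.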
Thanks
> > >>> dumper_q->tx_stat.pkts += num_tx;
> > >>> dumper_q->tx_stat.bytes += tx_bytes;
> > >>> dumper_q->tx_stat.err_pkts += nb_pkts - num_tx;
> > >