patches for DPDK stable branches
 help / color / mirror / Atom feed
From: "Zhou, YidingX" <yidingx.zhou@intel.com>
To: Ferruh Yigit <ferruh.yigit@xilinx.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Zhang, Qi Z" <qi.z.zhang@intel.com>,
	"stable@dpdk.org" <stable@dpdk.org>
Subject: RE: [PATCH] net/pcap: reduce time for stopping device
Date: Mon, 29 Aug 2022 11:50:29 +0000	[thread overview]
Message-ID: <DM5PR1101MB2107A250EC67513C52E95D5885769@DM5PR1101MB2107.namprd11.prod.outlook.com> (raw)
In-Reply-To: <af6bef36-1e0a-4f9d-b4f6-95f99c686c75@xilinx.com>

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@xilinx.com>
> Sent: Thursday, August 25, 2022 8:22 PM
> To: Zhou, YidingX <yidingx.zhou@intel.com>; dev@dpdk.org
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; stable@dpdk.org
> Subject: Re: [PATCH] net/pcap: reduce time for stopping device
> 
> On 8/25/2022 12:17 PM, Zhou, YidingX wrote:
> >
> >
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yigit@xilinx.com>
> >> Sent: Thursday, August 25, 2022 6:09 PM
> >> To: Zhou, YidingX <yidingx.zhou@intel.com>; dev@dpdk.org
> >> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; stable@dpdk.org
> >> Subject: Re: [PATCH] net/pcap: reduce time for stopping device
> >>
> >> On 8/25/2022 8:20 AM, Yiding Zhou wrote:
> >>> The pcap file will be synchronized to the disk when stopping the device.
> >>> It takes a long time if the file is large that would cause the
> >>> 'detach sync request' timeout when the device is closed under
> >>> multi-process scenario.
> >>>
> >>> This commit fixes the issue by performing synchronization in Tx path
> >>>
> >>> Fixes: 4c173302c307 ("pcap: add new driver")
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Yiding Zhou <yidingx.zhou@intel.com>
> >>> ---
> >>>    drivers/net/pcap/pcap_ethdev.c | 18 ++++++++++++++++--
> >>>    1 file changed, 16 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/net/pcap/pcap_ethdev.c
> >>> b/drivers/net/pcap/pcap_ethdev.c index ec29fd6bc5..52eafa5674 100644
> >>> --- a/drivers/net/pcap/pcap_ethdev.c
> >>> +++ b/drivers/net/pcap/pcap_ethdev.c
> >>> @@ -3,7 +3,7 @@
> >>>     * Copyright(c) 2014 6WIND S.A.
> >>>     * All rights reserved.
> >>>     */
> >>> -
> >>> +#include <unistd.h>
> >>>    #include <time.h>
> >>>
> >>>    #include <pcap.h>
> >>> @@ -38,6 +38,8 @@
> >>>
> >>>    #define RTE_PMD_PCAP_MAX_QUEUES 16
> >>>
> >>> +#define ETH_PCAP_SYNC_THRESHOLD 0x20000000
> >>> +
> 
> I guess this is 512MB, can you please comment this.
> Is there any specific reason to select this value, or is it arbitrary?
> 
> 

512M is arbitrary, because there is no API to get the disk cache size associated with a specific file.
I will test the performance impact of different values.

> >>>    static char errbuf[PCAP_ERRBUF_SIZE];
> >>>    static struct timespec start_time;
> >>>    static uint64_t start_cycles;
> >>> @@ -47,6 +49,8 @@ static uint8_t iface_idx;
> >>>    static uint64_t timestamp_rx_dynflag;
> >>>    static int timestamp_dynfield_offset = -1;
> >>>
> >>> +RTE_DEFINE_PER_LCORE(uint64_t, _pcap_cached_bytes);
> >>> +
> >>>    struct queue_stat {
> >>>    	volatile unsigned long pkts;
> >>>    	volatile unsigned long bytes;
> >>> @@ -144,6 +148,16 @@ static struct rte_eth_link pmd_link = {
> >>>
> >>>    RTE_LOG_REGISTER_DEFAULT(eth_pcap_logtype, NOTICE);
> >>>
> >>> +static inline void
> >>> +pcap_dumper_data_sync(pcap_dumper_t *dumper, uint32_t bytes) {
> 
> 'pcap_' is the namespace for the libpcap, can you select another prefix, like
> 'eth_pcap_' as many driver functions does.
> 
> >>> +	RTE_PER_LCORE(_pcap_cached_bytes) += bytes;
> >>> +	if (unlikely(RTE_PER_LCORE(_pcap_cached_bytes) >
> >> ETH_PCAP_SYNC_THRESHOLD)) {
> >>> +		if (!fdatasync(fileno(pcap_dump_file(dumper))))
> >>> +			RTE_PER_LCORE(_pcap_cached_bytes) = 0;
> >>> +	}
> >>> +}
> >>> +
> 
> pcap supports multiple queue, and each queue creates a new pcap dumper and
> single core/thread can be used for this multiple dumpers. In that case I think
> above per lcore variable logic doesn't work.
> 
> And instead of having a global value, what do you think to add a variable to
> 'struct pcap_tx_queue' for this purpose?
> 

Thanks for the comments, I will follow this.

> >>>    static struct queue_missed_stat*
> >>>    queue_missed_stat_update(struct rte_eth_dev *dev, unsigned int qid)
> >>>    {
> >>> @@ -421,7 +435,7 @@ eth_pcap_tx_dumper(void *queue, struct rte_mbuf
> >> **bufs, uint16_t nb_pkts)
> >>>    	 * process stops and to make sure the pcap file is actually written,
> >>>    	 * we flush the pcap dumper within each burst.
> >>>    	 */
> >>> -	pcap_dump_flush(dumper);
> >>> +	pcap_dumper_data_sync(dumper, tx_bytes);
> >>
> >> 'pcap_dump_flush()' should be doing the same thing, to write buffer
> >> to file, isn't it working?
> >>
> >> Can you check the return value of the 'pcap_dump_flush()' API, I
> >> wonder if it keeps failing, for some reason?
> >>
> >
> > 'pcap_dump_flush()' returns 0 each time without error, it calls 'fflush()' to
> flush userspace buffers to kernel buffers, not disk. 'fdatasync()' to ensure data
> is written to disk.
> >
> 
> 'pcap_dump_flush()' API documentation says "flushes the output buffer to the
> ``savefile,''", but as you said it uses 'fflush()' internally, so there is a chance that
> data is not written to the disk.
> 
> In this case, won't need to keep both, first flush and later fsync/fdatasync?
> 

I draw a diagram to describe it more clearly

 fwrite                             fclose/fflush                                                   fclose/fdatasync
--------->| libc buffer  |----------------> |    disk cache in RAM          |---------------------> |disk|
              | 4096 Bytes |                         | size is determined by OS |                                |       |

When the libc buffer is full, the system will automatically sync it to the disk cache.
It is easily full as it's only 4096 B size. so there is no need to call 'fflush()' every time.
The real time consuming action is syncing the disk cache to disk.
Because the disk cache is very large, it will take a long time to sync all at one time during 'fclose()',
so need to call 'fdatasync()' periodically to amortize the time.

> Do you observe any performance difference after change, since now writing to
> actual disk on datapath?
> 

I will verify any performance difference under these 2 scenarios.

> >>>    	dumper_q->tx_stat.pkts += num_tx;
> >>>    	dumper_q->tx_stat.bytes += tx_bytes;
> >>>    	dumper_q->tx_stat.err_pkts += nb_pkts - num_tx;
> >


  reply	other threads:[~2022-08-29 11:50 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-25  7:20 Yiding Zhou
2022-08-25 10:09 ` Ferruh Yigit
2022-08-25 11:17   ` Zhou, YidingX
2022-08-25 12:21     ` Ferruh Yigit
2022-08-29 11:50       ` Zhou, YidingX [this message]
2022-08-31 16:42         ` Stephen Hemminger
2022-09-01  7:40         ` Zhou, YidingX
2022-09-06  8:05 ` [PATCH v2] net/pcap: fix timeout of " Yiding Zhou
2022-09-06 14:57   ` Stephen Hemminger
2022-09-06 16:21     ` Zhou, YidingX
2022-09-21  7:14     ` Zhou, YidingX
2022-10-03 15:00       ` Ferruh Yigit
2022-11-22  9:25       ` Zhou, YidingX
2022-11-22 17:28         ` Stephen Hemminger
2022-12-02 10:22           ` Zhou, YidingX
2022-11-29 14:11         ` Ferruh Yigit
2022-12-02 10:13           ` Zhou, YidingX
2022-12-02 11:19             ` Ferruh Yigit
2022-12-05  1:58               ` Zhou, YidingX
  -- strict thread matches above, loose matches on Subject: below --
2022-08-25  6:27 [PATCH] net/pcap: reduce time for " Yiding Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM5PR1101MB2107A250EC67513C52E95D5885769@DM5PR1101MB2107.namprd11.prod.outlook.com \
    --to=yidingx.zhou@intel.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@xilinx.com \
    --cc=qi.z.zhang@intel.com \
    --cc=stable@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).