DPDK patches and discussions
From: "Polehn, Mike A" <mike.a.polehn@intel.com>
To: "Richardson, Bruce" <bruce.richardson@intel.com>,
	Moon-Sang Lee <sang0627@gmail.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] rte_prefetch0() is effective?
Date: Wed, 13 Jan 2016 15:17:36 +0000
Message-ID: <745DB4B8861F8E4B9849C970520ABBF149847573@ORSMSX102.amr.corp.intel.com>
In-Reply-To: <20160113113432.GA7216@bricha3-MOBL3>

Prefetches make a big difference because a powerful CPU like IA is always trying to find items to prefetch, and the priority of these is not always easy to determine. This is especially a problem across subroutine calls: the compiler cannot determine what has priority in the other subroutines, and the CPU's runtime logic cannot always predict far enough ahead for all possible paths. The cost is highest on a cache miss, where a memory access takes eons of clock cycles and probably stalls the CPU.

Until computers fully understand the logic of a program and write optimal code themselves (putting programmers out of business), it is the programmer's understanding of what matters as the program progresses that tells what is desirable to prefetch. It is difficult to determine whether the CPU will give the prefetch the same priority, so under some conditions an explicit prefetch may not show up as a measurable performance improvement. Having the prefetch in place, however, makes the prefetch priority decision correct in the other cases, and that is where the performance improvement comes from.
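As an illustration of program knowledge that the hardware cannot easily infer, consider walking a linked list whose nodes are scattered in memory. The sketch below is not DPDK code: `struct node`, `work()` and `walk()` are hypothetical names, and `__builtin_prefetch()` (a GCC/Clang builtin) stands in for rte_prefetch0(). The programmer knows the next node will be needed as soon as the current one is done, and its address is already in hand, so the prefetch costs no extra address calculation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node; the layout is an assumption for illustration. */
struct node {
    struct node *next;
    uint32_t payload[15];
};

/* Hypothetical per-node work: sums the payload of one node. */
static uint32_t work(const struct node *n)
{
    uint32_t sum = 0;

    assert(n != NULL);
    for (size_t i = 0; i < sizeof(n->payload) / sizeof(n->payload[0]); i++)
        sum += n->payload[i];
    return sum;
}

/* Walk the list, requesting the next node before working on the current one. */
static uint32_t walk(const struct node *head)
{
    uint32_t total = 0;

    for (const struct node *n = head; n != NULL; n = n->next) {
        /*
         * n->next is already loaded, so the prefetch address is free.
         * A prefetch of a NULL or invalid address does not fault, so
         * no extra check is needed for the last node.
         */
        __builtin_prefetch(n->next, 0, 3);
        total += work(n);
    }
    return total;
}
```

Whether this is measurably faster depends on how much work is done per node and on where the nodes sit in memory; the result of `walk()` is the same with or without the prefetch.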

Removing a prefetch without thinking through and fully understanding why it is there, or what its added cost is, if any (for example, when calculating the prefetch address affects other current operations), is just plain amateur work. That is not to say people never misjudge what needs to be prefetched or place prefetches poorly, but a prefetch should only be removed if it is not logically proper for the expected runtime operation.

Only more primitive CPUs with no prefetch capabilities don't benefit from properly placed prefetches. 

Mike

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
Sent: Wednesday, January 13, 2016 3:35 AM
To: Moon-Sang Lee
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] rte_prefetch0() is effective?

On Thu, Dec 24, 2015 at 03:35:14PM +0900, Moon-Sang Lee wrote:
> I see code like the following in the examples directory, and I wonder
> whether it is effective. Coherent I/O is adopted in modern
> architectures, so I think that DMA initiation by rte_eth_rx_burst()
> might already fill the cache lines of the RX buffers.
> Do I really need to call rte_prefetchX()?
> 
>             nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
>                     MAX_PKT_BURST);
>             ...
>             /* Prefetch and forward already prefetched packets */
>             for (j = 0; j < (nb_rx - PREFETCH_OFFSET); j++) {
>                 rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[
>                         j + PREFETCH_OFFSET], void *));
>                 l3fwd_simple_forward(pkts_burst[j], portid,
>                     qconf);
>             }
> 

Good question.
When the first example apps using this style of prefetch were originally written, yes, there was a noticeable performance increase from using the prefetch.
Since then, I'm not sure that anyone has checked with each generation of platforms whether the prefetches are still necessary and how much they help, but I suspect that they still help a bit and don't hurt performance.
It would be an interesting exercise to check whether the prefetch offsets used in code like the above can be adjusted to give better performance on our latest supported platforms.
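For anyone wanting to try that exercise, one possible starting point is to make the prefetch distance a parameter. The following is a minimal, DPDK-free sketch and not the l3fwd code: `struct pkt`, `handle_pkt()` and `process_burst()` are hypothetical stand-ins, and `__builtin_prefetch()` (GCC/Clang) is used in place of rte_prefetch0(). It assumes the loop is split into three phases: prime the first `offset` packets, prefetch ahead while processing, then drain the packets that were already prefetched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet buffer; not a DPDK type. */
struct pkt {
    uint8_t data[64];
};

/* Hypothetical stand-in for the per-packet work: touches the packet data. */
static uint32_t handle_pkt(const struct pkt *p)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < sizeof(p->data); i++)
        sum += p->data[i];
    return sum;
}

/* Process a burst of nb_rx packets, prefetching `offset` packets ahead. */
static uint32_t process_burst(struct pkt *const *pkts, int nb_rx, int offset)
{
    uint32_t total = 0;
    int j;

    assert(pkts != NULL && nb_rx >= 0 && offset >= 0);

    /* Prime: prefetch the first `offset` packets (or fewer if the burst is short). */
    for (j = 0; j < offset && j < nb_rx; j++)
        __builtin_prefetch(pkts[j]->data, 0, 3);

    /* Steady state: prefetch packet j + offset, process packet j. */
    for (j = 0; j < nb_rx - offset; j++) {
        __builtin_prefetch(pkts[j + offset]->data, 0, 3);
        total += handle_pkt(pkts[j]);
    }

    /* Drain: process the remaining packets, which were already prefetched. */
    for (; j < nb_rx; j++)
        total += handle_pkt(pkts[j]);

    return total;
}
```

The return value does not depend on `offset`, so the same burst can be timed with several offsets (including 0, which disables the look-ahead) on the platform of interest. Signed `int` counters are used deliberately so that `nb_rx - offset` goes negative, rather than wrapping, when the burst is shorter than the offset.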

/Bruce

Thread overview: 4+ messages
2015-12-24  6:35 Moon-Sang Lee
2016-01-13 11:34 ` Bruce Richardson
2016-01-13 15:17   ` Polehn, Mike A [this message]
2016-01-13 17:29   ` Matthew Hall
