DPDK usage discussions
* [dpdk-users] Why not prefetch the second cache line of struct rte_mbuf for better performance ?
@ 2019-03-26  9:03 Dell Will
  2019-03-26 10:53 ` Van Haaren, Harry
  0 siblings, 1 reply; 2+ messages in thread
From: Dell Will @ 2019-03-26  9:03 UTC (permalink / raw)
  To: users

Hello, everybody

I see that much of the code in DPDK prefetches only the first cache line of struct rte_mbuf.
struct rte_mbuf spans two cache lines.
Why not prefetch the second line as well?
Is the assumption that the CPU (x64 or ARM) always automatically prefetches the immediately following cache line?

Thanks a lot !

________________________________
coolwilled@hotmail.com


* Re: [dpdk-users] Why not prefetch the second cache line of struct rte_mbuf for better performance ?
  2019-03-26  9:03 [dpdk-users] Why not prefetch the second cache line of struct rte_mbuf for better performance ? Dell Will
@ 2019-03-26 10:53 ` Van Haaren, Harry
  0 siblings, 0 replies; 2+ messages in thread
From: Van Haaren, Harry @ 2019-03-26 10:53 UTC (permalink / raw)
  To: Dell Will, users

> -----Original Message-----
> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Dell Will
> Sent: Tuesday, March 26, 2019 9:04 AM
> To: users <users@dpdk.org>
> Subject: [dpdk-users] Why not prefetch the second cache line of struct
> rte_mbuf for better performance ?
> 
> Hello, everybody

Hi,

> I see that much of the code in DPDK prefetches only the first cache line
> of struct rte_mbuf.
> struct rte_mbuf spans two cache lines.
> Why not prefetch the second line as well?

One reason the second cache line is not always prefetched is that it is not
always used. Prefetching a line that is never touched wastes memory bandwidth
and cache capacity.

For example, the packet RX routines modify only the first cache line and do
not need the second to be available.
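
As a rough sketch of that point (not actual DPDK code): the struct below is a
hypothetical two-cache-line descriptor named fake_mbuf, with illustrative
field names and a 64-byte line size assumed. The RX-style loop prefetches and
writes only the first line, while a routine that follows the chain pointer on
the second line prefetches that line instead. It relies on the GCC/Clang
builtin __builtin_prefetch. In DPDK itself, rte_mbuf.h provides
rte_mbuf_prefetch_part1() and rte_mbuf_prefetch_part2() for the two halves of
a real mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical descriptor, NOT the real rte_mbuf layout. */
struct fake_mbuf {
    /* first cache line: fields an RX routine fills in */
    _Alignas(CACHE_LINE) void *buf_addr;
    uint32_t pkt_len;
    uint16_t data_len;
    /* second cache line: fields used by later processing stages */
    _Alignas(CACHE_LINE) struct fake_mbuf *next;
    uint64_t tx_offload;
};

static_assert(offsetof(struct fake_mbuf, next) == CACHE_LINE,
              "next must start the second cache line");
static_assert(sizeof(struct fake_mbuf) == 2 * CACHE_LINE,
              "descriptor must span exactly two cache lines");

/* Prefetch the first line for writing. */
static inline void prefetch_line0(const struct fake_mbuf *m)
{
    __builtin_prefetch(m, 1, 3);
}

/* Prefetch the second line for reading. */
static inline void prefetch_line1(const struct fake_mbuf *m)
{
    __builtin_prefetch((const char *)m + CACHE_LINE, 0, 3);
}

/* RX-style fill: touches only the first line, so only that line is prefetched. */
static uint32_t rx_fill(struct fake_mbuf **pkts, size_t n, uint16_t len)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            prefetch_line0(pkts[i + 1]);
        pkts[i]->data_len = len;
        pkts[i]->pkt_len = len;
        total += len;
    }
    return total;
}

/* Segment counting reads 'next' on the second line, so that line is prefetched. */
static size_t count_segments(struct fake_mbuf **pkts, size_t n)
{
    size_t segs = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            prefetch_line1(pkts[i + 1]);
        for (const struct fake_mbuf *m = pkts[i]; m != NULL; m = m->next)
            segs++;
    }
    return segs;
}
```

Whether the extra prefetch in the second routine pays off depends on the
workload, so it is worth measuring rather than assuming.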


> Is the assumption that the CPU (x64 or ARM) always automatically
> prefetches the immediately following cache line?

Some details on the x86-64 hardware prefetchers are here; the "Adjacent Cache-Line Prefetch" is of particular interest:
https://software.intel.com/en-us/articles/optimizing-application-performance-on-intel-coret-microarchitecture-using-hardware-implemented-prefetchers
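
To illustrate why the adjacent-line prefetcher cannot simply be relied on: as
I understand Intel's description, it completes a fetched line with its partner
inside the same 128-byte aligned block, so it only covers the second line of a
two-line object if the object starts on a 128-byte boundary. The helpers below
are hypothetical and just do that address arithmetic, assuming 64-byte lines
and a 64-byte aligned object. Actual behaviour depends on the CPU and on BIOS
settings, and other architectures may differ.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64u              /* assumed cache line size */
#define PAIR_SIZE (2u * CACHE_LINE) /* assumed adjacent-line pairing granularity */

static_assert(PAIR_SIZE == 128u, "sketch assumes 128-byte aligned line pairs");

/* Start address of the line that shares a 128-byte aligned block with addr. */
static inline uintptr_t buddy_line(uintptr_t addr)
{
    uintptr_t line = addr & ~(uintptr_t)(CACHE_LINE - 1);
    return line ^ CACHE_LINE;
}

/*
 * For a 64-byte aligned, two-line object starting at obj: returns 1 if its
 * second line is the buddy of its first line, 0 otherwise.
 */
static inline int lines_share_pair(uintptr_t obj)
{
    return buddy_line(obj) == obj + CACHE_LINE;
}
```

With these assumptions, an object at 0x1000 has both lines in one pair, while
an object at 0x1040 does not: touching its first line would pull in the line
at 0x1000 instead of the object's own second line.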

[Side note: "x64" is commonly used as another name for x86-64 (AMD64 / Intel 64); the genuinely different architecture is IA-64 (Itanium).]


> Thanks a lot !

Hope that helps, -Harry
