* [dpdk-users] Why not prefetch the second cache line of struct rte_mbuf for better performance ?
From: Dell Will @ 2019-03-26 9:03 UTC
To: users
Hello, everybody
I find that much of the code in DPDK prefetches only the first cache line of struct rte_mbuf.
struct rte_mbuf spans two cache lines.
Why not prefetch the second line?
Is the assumption that the CPU (x64 or ARM) always automatically prefetches the immediately following cache line?
Thanks a lot!
________________________________
coolwilled@hotmail.com
* Re: [dpdk-users] Why not prefetch the second cache line of struct rte_mbuf for better performance ?
From: Van Haaren, Harry @ 2019-03-26 10:53 UTC
To: Dell Will, users
> -----Original Message-----
> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Dell Will
> Sent: Tuesday, March 26, 2019 9:04 AM
> To: users <users@dpdk.org>
> Subject: [dpdk-users] Why not prefetch the second cache line of struct
> rte_mbuf for better performance ?
>
> Hello, everybody
Hi,
> I find that much of the code in DPDK prefetches only the first cache line of
> struct rte_mbuf.
> struct rte_mbuf spans two cache lines.
> Why not prefetch the second line?
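One reason the second cache line is not always prefetched is that it is not
always going to be used. Prefetching a line that is never touched costs memory
bandwidth and cache space for no benefit.
For example, the packet RX routines modify only the first cache line and do
not require the second to be available.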
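As a rough illustration of that point, here is a minimal C sketch. It does not use the real struct rte_mbuf: struct pkt_meta, prefetch_line0(), prefetch_line1(), rx_total_len() and count_segments() are hypothetical names made up for this example, a 64-byte cache line is assumed, and __builtin_prefetch is the GCC/Clang builtin. In DPDK itself, rte_mbuf.h provides rte_mbuf_prefetch_part1() and rte_mbuf_prefetch_part2() for the two halves of the mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumption: 64-byte cache lines */

/* Hypothetical stand-in for struct rte_mbuf: exactly two cache lines. */
struct pkt_meta {
    /* Line 0: fields an RX-style path reads and writes. */
    void *buf_addr;
    uint16_t data_off;
    uint16_t data_len;
    uint32_t pkt_len;
    uint8_t pad0[CACHE_LINE - sizeof(void *) - 8];
    /* Line 1: fields needed only by some paths (here: segment chaining). */
    struct pkt_meta *next;
    uint8_t pad1[CACHE_LINE - sizeof(void *)];
} __attribute__((aligned(CACHE_LINE)));

static_assert(offsetof(struct pkt_meta, next) == CACHE_LINE,
              "next must start the second cache line");
static_assert(sizeof(struct pkt_meta) == 2 * CACHE_LINE,
              "struct must span exactly two cache lines");

/* Hint that the first cache line will be read soon. */
static inline void prefetch_line0(const struct pkt_meta *m)
{
    __builtin_prefetch(m, 0, 3);
}

/* Hint that the second cache line will be read soon. */
static inline void prefetch_line1(const struct pkt_meta *m)
{
    __builtin_prefetch((const uint8_t *)m + CACHE_LINE, 0, 3);
}

/* RX-style loop: only line 0 is touched, so only line 0 is prefetched. */
static uint64_t rx_total_len(struct pkt_meta **pkts, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            prefetch_line0(pkts[i + 1]);
        total += pkts[i]->pkt_len;
    }
    return total;
}

/* Chain-walking loop: reads `next`, so line 1 is the one worth prefetching. */
static size_t count_segments(struct pkt_meta **pkts, size_t n)
{
    size_t segs = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            prefetch_line1(pkts[i + 1]);
        for (const struct pkt_meta *m = pkts[i]; m != NULL; m = m->next)
            segs++;
    }
    return segs;
}
```

The RX-style loop never reads the second line, so an extra prefetch there would only add memory traffic. The chain-walking loop does read it, so there the prefetch may pay off, but whether it actually helps is something to measure on the target CPU.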
> Is the assumption that the CPU (x64 or ARM) always automatically prefetches
> the immediately following cache line?
Some details on the x86-64 hardware prefetchers are here; the "Adjacent Cache-Line Prefetch" is of particular interest:
https://software.intel.com/en-us/articles/optimizing-application-performance-on-intel-coret-microarchitecture-using-hardware-implemented-prefetchers
[Side note: "x64" is commonly used as another name for x86-64 (AMD64 / Intel 64); the architecture that is actually different is IA-64 (Itanium).]
> Thanks a lot!
Hope that helps, -Harry