> -----Original Message-----
> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Dell Will
> Sent: Tuesday, March 26, 2019 9:04 AM
> To: users <users@dpdk.org>
> Subject: [dpdk-users] Why not prefetch the second cache line of struct
> rte_mbuf for better performance ?
>
> Hello, everybody
Hi,
> I find that many codes in DPDK only prefetch the first cache line of struct
> rte_mbuf.
> The struct rte_mbuf has 2 cache lines.
> Why not prefetch the second line ?
One reason the second cache line is not always prefetched is that it is not
always going to be used; prefetching a line that is never touched only costs
memory bandwidth and cache space.
For example, the packet RX routines modify only the first cache line and do
not require the second to be available.
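
To make that concrete, here is a rough sketch of prefetching only what a
given code path touches. Everything named demo_* is made up for illustration
and is not DPDK's actual layout or API; the struct is just a two-cache-line
stand-in for rte_mbuf, and a 64-byte cache line plus GCC/Clang's
__builtin_prefetch are assumed. In real DPDK code, rte_mbuf.h provides
rte_mbuf_prefetch_part1() and rte_mbuf_prefetch_part2() for the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64 /* assumed cache-line size */

/* Hypothetical stand-in for struct rte_mbuf: exactly two cache lines. */
struct demo_mbuf {
    /* cache line 0: fields an RX-style path reads and writes */
    void *buf_addr;
    uint16_t data_off;
    uint16_t data_len;
    uint32_t pkt_len;
    uint8_t pad0[DEMO_CACHE_LINE - sizeof(void *) - 8];
    /* cache line 1: fields only some paths need */
    void *pool;
    struct demo_mbuf *next;
    uint8_t pad1[DEMO_CACHE_LINE - 2 * sizeof(void *)];
};

static_assert(sizeof(struct demo_mbuf) == 2 * DEMO_CACHE_LINE,
              "demo_mbuf must span two cache lines");
static_assert(offsetof(struct demo_mbuf, pool) == DEMO_CACHE_LINE,
              "second half must start on the second cache line");

/* Prefetch the first cache line of an mbuf (read hint, keep in cache). */
static inline void demo_prefetch_part1(const struct demo_mbuf *m)
{
    __builtin_prefetch(m, 0, 3);
}

/* Prefetch the second cache line of an mbuf. */
static inline void demo_prefetch_part2(const struct demo_mbuf *m)
{
    __builtin_prefetch((const uint8_t *)m + DEMO_CACHE_LINE, 0, 3);
}

/* Touches only cache line 0, so only part 1 of the next mbuf is prefetched. */
static uint64_t demo_total_len(struct demo_mbuf **pkts, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            demo_prefetch_part1(pkts[i + 1]);
        total += pkts[i]->pkt_len;
    }
    return total;
}

/* Reads 'next' in cache line 1, so here prefetching part 2 is worthwhile. */
static size_t demo_count_chained(struct demo_mbuf **pkts, size_t n)
{
    size_t chained = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            demo_prefetch_part2(pkts[i + 1]);
        if (pkts[i]->next != NULL)
            chained++;
    }
    return chained;
}
```

The point is that the caller, not the library, knows which half of the mbuf
it is about to touch, so the second-line prefetch is left to code paths like
demo_count_chained() that actually read it.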
> Is it hinted that the CPU (x64 or ARM) always automatically prefetch the
> next immediately followed cache line ?
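Not always; it depends on which hardware prefetchers the CPU has and whether
they are enabled. Some details on the x86-64 prefetchers are here; the
"Adjacent Cache-Line Prefetch" is of particular interest: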
https://software.intel.com/en-us/articles/optimizing-application-performance-on-intel-coret-microarchitecture-using-hardware-implemented-prefetchers
[Side note: "x64" is normally used as another name for x86-64; the 64-bit
architecture that is actually different is IA-64 (Itanium).]
> Thanks a lot !
Hope that helps, -Harry