From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>
Cc: <dev@dpdk.org>, <olivier.matz@6wind.com>,
	<andrew.rybchenko@oktetlabs.ru>
Subject: RE: cache thrashing question
Date: Fri, 25 Aug 2023 11:06:01 +0200	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D87B3A@smartserver.smartshare.dk> (raw)
In-Reply-To: <ZOhk4SRRog9E1mlq@bricha3-MOBL.ger.corp.intel.com>

+CC mempool maintainers

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Friday, 25 August 2023 10.23
> 
> On Fri, Aug 25, 2023 at 08:45:12AM +0200, Morten Brørup wrote:
> > Bruce,
> >
> > With this patch [1], it is noted that the ring producer and consumer data
> > should not be on adjacent cache lines, for performance reasons.
> >
> > [1]: https://git.dpdk.org/dpdk/commit/lib/librte_ring/rte_ring.h?id=d9f0d3a1ffd4b66e75485cc8b63b9aedfbdfe8b0
> >
> > (It's obvious that they cannot share the same cache line, because they are
> > accessed by two different threads.)
> >
> > Intuitively, I would think that having them on different cache lines would
> > suffice. Why does having an empty cache line between them make a difference?
> >
> > And does it need to be an empty cache line? Or does it suffice having the
> > second structure start at two cache lines after the start of the first
> > structure (e.g. if the size of the first structure is two cache lines)?
> >
> > I'm asking because the same principle might apply to other code too.
> >
> Hi Morten,
> 
> this was something we discovered when working on the distributor library.
> If we have cachelines per core where there is heavy access, having some
> cachelines as a gap between the content cachelines can help performance. We
> believe this helps due to avoiding issues with the HW prefetchers (e.g.
> adjacent cacheline prefetcher) bringing in the second cacheline
> speculatively when an operation is done on the first line.

I guessed that it had something to do with speculative prefetching, but wasn't sure. Good to get confirmation, and to hear that it has a measurable effect somewhere. Very interesting!
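
To illustrate the technique being discussed, here is a minimal sketch of such a layout. It is not the actual rte_ring definition; the CACHE_LINE_SIZE constant and the field names are just placeholders:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE_SIZE 64   /* placeholder; DPDK uses RTE_CACHE_LINE_SIZE */

struct headtail {
	uint32_t head;   /* indices updated only by the owning thread */
	uint32_t tail;
} __attribute__((aligned(CACHE_LINE_SIZE)));

struct ring_sketch {
	struct headtail prod;        /* written by the producer thread */
	char pad[CACHE_LINE_SIZE];   /* empty guard cache line, never written */
	struct headtail cons;        /* written by the consumer thread */
};

int main(void)
{
	/* prod sits at offset 0 and cons starts two cache lines later,
	 * leaving one cache line in between that is never touched. */
	printf("prod: %zu, cons: %zu\n",
	       offsetof(struct ring_sketch, prod),
	       offsetof(struct ring_sketch, cons));
	return 0;
}

With this layout, prod occupies its own cache line, the pad line is never written, and cons starts two cache lines after prod, so an adjacent-cacheline prefetch triggered on one side does not speculatively pull in the other side's line.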

NB: More comments in the ring lib about stuff like this would be nice.

So, for the mempool lib, what do you think about applying the same technique to the rte_mempool_debug_stats structure (which is an array indexed per lcore)? It seems likely to me that two adjacent lcores will be heavily accessing their local mempool caches (and thereby updating their stats entries) at the same time. But how heavy does the access need to be for this technique to be relevant?
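
As a sketch of what that could look like (the counter fields, constants and guard placement below are illustrative; this is not the actual rte_mempool_debug_stats definition):

#include <stdint.h>

#define CACHE_LINE_SIZE 64   /* placeholder; DPDK uses RTE_CACHE_LINE_SIZE */
#define MAX_LCORE 128        /* placeholder; DPDK uses RTE_MAX_LCORE */

struct lcore_stats_sketch {
	uint64_t put_bulk;           /* illustrative counters only */
	uint64_t put_objs;
	uint64_t get_success_bulk;
	uint64_t get_success_objs;
	/* guard: keeps the next lcore's entry at least one full,
	 * never-written cache line away, so updates by lcore N do not
	 * trigger speculative prefetches into lcore N+1's line */
	char guard[CACHE_LINE_SIZE];
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* one entry per lcore, as in the per-lcore stats array */
struct lcore_stats_sketch stats[MAX_LCORE];

The cost is one extra cache line per lcore entry, so whether it pays off presumably depends on how hot the stats updates are, which is exactly the question above.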

For the rte_mempool_cache structure (also an array indexed per lcore), the last entries of the "objs" array at the end of the structure are unlikely to be used, so they already serve as a gap, and an additional guard cache line seems unnecessary here.


Thread overview: 19+ messages
2023-08-25  6:45 Morten Brørup
2023-08-25  8:22 ` Bruce Richardson
2023-08-25  9:06   ` Morten Brørup [this message]
2023-08-25  9:23     ` Bruce Richardson
2023-08-27  8:34       ` [RFC] cache guard Morten Brørup
2023-08-27 13:55         ` Mattias Rönnblom
2023-08-27 15:40           ` Morten Brørup
2023-08-27 22:30             ` Mattias Rönnblom
2023-08-28  6:32               ` Morten Brørup
2023-08-28  8:46                 ` Mattias Rönnblom
2023-08-28  9:54                   ` Morten Brørup
2023-08-28 10:40                     ` Stephen Hemminger
2023-08-28  7:57             ` Bruce Richardson
2023-09-01 12:26         ` Thomas Monjalon
2023-09-01 16:57           ` Mattias Rönnblom
2023-09-01 18:52             ` Morten Brørup
2023-09-04 12:07               ` Mattias Rönnblom
2023-09-04 12:48                 ` Morten Brørup
2023-09-05  5:50                   ` Mattias Rönnblom
