DPDK patches and discussions
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>,
	"Manish Sharma" <manish.sharmajee75@gmail.com>
Cc: <dev@dpdk.org>
Subject: Re: [dpdk-dev] rte_memcpy - fence and stream
Date: Thu, 27 May 2021 17:49:19 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35C617E1@smartserver.smartshare.dk>
In-Reply-To: <YKzBWEKuHU2AlUtI@bricha3-MOBL.ger.corp.intel.com>

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Tuesday, 25 May 2021 11.20
> 
> On Mon, May 24, 2021 at 11:43:24PM +0530, Manish Sharma wrote:
> > I am looking at the source for rte_memcpy (this is a discussion only
> > for x86-64)
> >
> > For one of the cases, when aligned correctly, it uses
> >
> > /**
> >  * Copy 64 bytes from one location to another,
> >  * locations should not overlap.
> >  */
> > static __rte_always_inline void
> > rte_mov64(uint8_t *dst, const uint8_t *src)
> > {
> >         __m512i zmm0;
> >
> >         zmm0 = _mm512_loadu_si512((const void *)src);
> >         _mm512_storeu_si512((void *)dst, zmm0);
> > }
> >
> > I had some questions about this:
> >

[snip]

> > 3. Why isn't the code using  stream variants,
> > _mm512_stream_load_si512 and friends?
> > It would not pollute the cache, so should be better - unless
> > the required fence instructions cause a drop in performance?
> >
> Whether the stream variants perform better really depends on what you are
> trying to measure. However, in the vast majority of cases a core is making
> a copy of data to work on it, in which case having the data not in cache is
> going to cause massive stalls on the next access to that data as it's
> fetched from DRAM. Therefore, the best option is to use regular
> instructions meaning that the data is in local cache afterwards, giving far
> better performance when the data is actually used.
> 

Good response, Bruce. And you are probably right about most use cases looking like you describe.

I'm certainly no expert on deep x86-64 optimization, but please let me think out loud here...

I can come up with a different scenario: One core is analyzing packet headers, and decides to copy some of the packets in their entirety (e.g. using rte_pktmbuf_copy) for another core to process later, so it enqueues the copies to a ring, which the other core dequeues from.

The first core doesn't care about the packet contents, and the second core will read the copy of the packet much later, because it needs to process the packets ahead of it in the queue first.

Wouldn't this scenario benefit from a rte_memcpy variant that doesn't pollute the cache?

I know that rte_pktmbuf_clone might be better to use in the described scenario, but rte_pktmbuf_copy must be there for a reason - and I guess that some uses of rte_pktmbuf_copy could benefit from a non-cache-polluting variant of rte_memcpy.
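
To make the idea concrete, here is a rough sketch of what such a variant might look like. It is not DPDK code: the name copy_nt_sketch is hypothetical, and it uses the SSE2 intrinsics (_mm_stream_si128, _mm_sfence) instead of the AVX-512 ones quoted above, only so that it builds on any x86-64 compiler. It assumes the buffers do not overlap and falls back to plain memcpy when the destination is not 16-byte aligned or SSE2 is not available. Non-temporal stores are weakly ordered, so the sketch issues a store fence before returning; in the ring scenario that fence would have to come before the copy is enqueued.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper, not part of DPDK: copy n bytes using
 * non-temporal (streaming) stores where possible, so that the
 * destination data is not pulled into the writer's cache.
 * Assumption: dst and src do not overlap.
 */
static void
copy_nt_sketch(void *dst, const void *src, size_t n)
{
	uint8_t *d = (uint8_t *)dst;
	const uint8_t *s = (const uint8_t *)src;

#if defined(__SSE2__)
	/* _mm_stream_si128 requires a 16-byte aligned destination. */
	if (((uintptr_t)d & 15) == 0) {
		while (n >= 16) {
			/* Regular unaligned load, streaming store. */
			__m128i x = _mm_loadu_si128((const __m128i *)s);
			_mm_stream_si128((__m128i *)d, x);
			d += 16;
			s += 16;
			n -= 16;
		}
		/* Order the streaming stores before any later stores,
		 * e.g. the one that publishes the copy to a ring. */
		_mm_sfence();
	}
#endif
	/* Tail bytes, or the whole copy if the fast path was skipped. */
	memcpy(d, s, n);
}
```

Whether this actually helps would have to be measured; the loads still bring the source data into the cache, and only the stores bypass it.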

-Morten