DPDK patches and discussions
* [dpdk-dev] rte_memcpy
@ 2021-05-24  8:49 Manish Sharma
  2021-05-24 18:13 ` [dpdk-dev] rte_memcpy - fence and stream Manish Sharma
  0 siblings, 1 reply; 9+ messages in thread
From: Manish Sharma @ 2021-05-24  8:49 UTC (permalink / raw)
  To: dev

I am looking at the source for rte_memcpy (this discussion is only about
x86-64).

For one of the cases, when aligned correctly, it uses:

/**
 * Copy 64 bytes from one location to another,
 * locations should not overlap.
 */
static __rte_always_inline void
rte_mov64(uint8_t *dst, const uint8_t *src)
{
        __m512i zmm0;

        zmm0 = _mm512_loadu_si512((const void *)src);
        _mm512_storeu_si512((void *)dst, zmm0);
}

I had some questions about this:

1. What I don't see is any use of x86 fence (rmb, wmb) instructions. Are
they not required in this case, and if not, why aren't they needed?

2. Are _mm512_loadu_si512 and _mm512_storeu_si512 non-temporal?

3. Why isn't the code using the stream variants, _mm512_stream_load_si512
and friends?

4. Does _mm512_stream_load_si512 need fence instructions?
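
To make questions 3 and 4 concrete, here is a minimal sketch of what a
streaming-store version of a 64-byte copy might look like. It is not DPDK
code: nt_mov64 and nt_copy_and_publish are hypothetical names, and it uses
the SSE2 intrinsic _mm_stream_si128 rather than the AVX-512 stream variants
so that it builds on any x86-64 compiler. The assumption behind it is that
ordinary loads and stores to normal write-back memory are already strongly
ordered on x86-64, whereas non-temporal stores are weakly ordered and are
usually followed by an sfence before anything that tells another core the
data is ready.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_stream_si128 */
#include <xmmintrin.h>  /* _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of DPDK: copy 64 bytes using
 * non-temporal (streaming) stores. dst must be 16-byte aligned,
 * src may be unaligned. Locations should not overlap.
 */
static inline void
nt_mov64(uint8_t *dst, const uint8_t *src)
{
        size_t i;

        assert(((uintptr_t)dst & 15) == 0);

        for (i = 0; i < 64; i += 16) {
                __m128i x = _mm_loadu_si128((const __m128i *)(src + i));
                _mm_stream_si128((__m128i *)(dst + i), x);
        }
}

/*
 * Hypothetical helper: copy with streaming stores, then publish a
 * "ready" flag. The sfence orders the weakly ordered streaming stores
 * before the ordinary store to the flag.
 */
static inline void
nt_copy_and_publish(uint8_t *dst, const uint8_t *src, volatile int *ready)
{
        nt_mov64(dst, src);
        _mm_sfence();
        *ready = 1;
}
```

With plain _mm512_loadu_si512/_mm512_storeu_si512, as in rte_mov64 above,
the copy goes through the cache like any other load and store, so the
sfence step in this sketch would have no counterpart.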

TIA,
Manish

Thread overview: 9+ messages
2021-05-24  8:49 [dpdk-dev] rte_memcpy Manish Sharma
2021-05-24 18:13 ` [dpdk-dev] rte_memcpy - fence and stream Manish Sharma
2021-05-25  9:20   ` Bruce Richardson
2021-05-27 15:49     ` Morten Brørup
2021-05-27 16:25       ` Bruce Richardson
2021-05-27 17:09         ` Manish Sharma
2021-05-27 17:22           ` Bruce Richardson
2021-05-27 18:15             ` Morten Brørup
2021-06-22 21:55               ` Morten Brørup
