Re: [dpdk-dev] rte_memcpy - fence and stream

From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Manish Sharma" <manish.sharmajee75@gmail.com>
Cc: <dev@dpdk.org>, "Bruce Richardson" <bruce.richardson@intel.com>
Subject: Re: [dpdk-dev] rte_memcpy - fence and stream
Date: Tue, 22 Jun 2021 23:55:55 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35C61880@smartserver.smartshare.dk>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35C617E4@smartserver.smartshare.dk>

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Morten Brørup
> Sent: Thursday, 27 May 2021 20.15
> 
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Thursday, 27 May 2021 19.22
> >
> > On Thu, May 27, 2021 at 10:39:59PM +0530, Manish Sharma wrote:
> > > >    For the case I have, hardly 2% of the data buffers which are being
> > > >    copied get looked at - mostly it's for DMA.

Which data buffers are you not looking at, Manish? The original data buffers, or the copies, or both?

> > >    Having a version of DPDK
> > >    memcopy that does non temporal copies would definitely be good.
> > > >    If in my case, I have a lot of CPUs doing the copy in parallel,
> > > >    would I/OAT driver copy accelerator still help?
> > >
> > It will depend upon the size of the copies being done. For bigger
> > packets the accelerator can help free up CPU cycles for other things.
> >
> > However, if only 2% of the data which is being copied gets looked at,
> > why does it need to be copied? Can the original buffers not be used in
> > that case?
> 
> I can only speak for myself here...
> 
> Our firmware has a packet capture feature with a filter.
> 
> If a packet matches the capture filter, a metadata header and the
> relevant part of the packet contents ("snap length" in tcpdump
> terminology) is appended to a large memory area (the "capture buffer")
> using rte_pktmbuf_read/rte_memcpy. This capture buffer is only read
> through the GUI or management API by the network administrator, i.e. it
> will only be read minutes or hours later, so there is no need to put
> any of it in any CPU cache.
> 
> It does not make sense to clone and hold on to many thousands of mbufs
> when we only need some of their contents. So we copy the contents
> instead of increasing the mbuf refcount.
> 
> We currently only use our packet capture feature for R&D purposes, so
> we have not optimized it yet. However, we will need to optimize it for
> production use at some point. So I find this discussion initiated by
> Manish very interesting.
> 
> -Morten

Here's some code for inspiration. I haven't tested it yet, and it can be optimized further.

/**
 * Copy 16 bytes from one location to another, using non-temporal storage
 * at the destination.
 * The locations must not overlap.
 *
 * @param dst
 *   Pointer to the destination of the data.
 *   Must be aligned on a 16-byte boundary.
 * @param src
 *   Pointer to the source data.
 *   Does not need to be aligned on any particular boundary.
 */
static __rte_always_inline void
rte_mov16_aligned16_non_temporal(uint8_t *dst, const uint8_t *src)
{
    __m128i xmm0;

    xmm0 = _mm_loadu_si128((const __m128i *)src);
    _mm_stream_si128((__m128i *)dst, xmm0);
}

/**
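 * Copy bytes from one location to another, using non-temporal storage
 * at the destination.
 * The locations must not overlap.
 * Non-temporal stores are weakly ordered, so issue a store fence
 * (e.g. _mm_sfence()) before another core or device reads the data.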
 *
 * @param dst
 *   Pointer to the destination of the data.
 *   Must be aligned on a 16-byte boundary.
 * @param src
 *   Pointer to the source data.
 *   Does not need to be aligned on any particular boundary.
 * @param n
 *   Number of bytes to copy.
 *   Must be divisible by 4.
 * @return
 *   Pointer to the destination data.
 */
static __rte_always_inline void *
rte_memcpy_aligned16_non_temporal(void *dst, const void *src, size_t n)
{
    void * const ret = dst;

    RTE_ASSERT(!((uintptr_t)dst & 0xF));
    RTE_ASSERT(!(n & 3));

    while (n >= 16) {
        rte_mov16_aligned16_non_temporal(dst, src);
        src = (const uint8_t *)src + 16;
        dst = (uint8_t *)dst + 16;
        n -= 16;
    }
    if (n & 8) {
        int64_t a;

        /* src may be unaligned: load via memcpy() (<string.h>), not a cast. */
        memcpy(&a, src, sizeof(a));
        _mm_stream_si64((long long int *)dst, a);
        src = (const uint8_t *)src + 8;
        dst = (uint8_t *)dst + 8;
        n -= 8;
    }
    if (n & 4) {
        int32_t a;

        memcpy(&a, src, sizeof(a));
        _mm_stream_si32((int32_t *)dst, a);
        src = (const uint8_t *)src + 4;
        dst = (uint8_t *)dst + 4;
        n -= 4;
    }

    return ret;
}
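
To show how a copy like the one above might be used for the capture buffer case, here is a rough, standalone sketch that appends one record (metadata header plus snap-length bytes) with non-temporal stores and then issues a store fence. It does not depend on DPDK headers. The names struct cap_hdr, struct cap_buf, nt_copy() and cap_append() are made up for illustration and are not part of DPDK or of our firmware. It assumes x86 with SSE2 and a 16-byte aligned capture buffer, and it pads each record to a multiple of 16 bytes so the next record starts aligned.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics; also provides _mm_sfence() */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical metadata header: 16 bytes, so the payload stays aligned. */
struct cap_hdr {
    uint64_t timestamp;
    uint32_t wire_len; /* original packet length */
    uint32_t snap_len; /* captured bytes stored, before padding */
};
_Static_assert(sizeof(struct cap_hdr) == 16, "header must be 16 bytes");

/* Hypothetical capture buffer descriptor. */
struct cap_buf {
    uint8_t *base; /* must be 16-byte aligned */
    size_t size;   /* total size in bytes */
    size_t used;   /* bytes used; always a multiple of 16 */
};

/* Non-temporal copy: dst 16-byte aligned, n a multiple of 4, no overlap. */
static void
nt_copy(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    assert(((uintptr_t)d & 0xF) == 0);
    assert((n & 3) == 0);

    for (; n >= 16; n -= 16, d += 16, s += 16)
        _mm_stream_si128((__m128i *)d,
                _mm_loadu_si128((const __m128i *)s));
    for (; n >= 4; n -= 4, d += 4, s += 4) {
        int32_t w;

        memcpy(&w, s, sizeof(w)); /* source may be unaligned */
        _mm_stream_si32((int *)d, w);
    }
}

/* Append one record; returns 0 on success, -1 if it does not fit. */
static int
cap_append(struct cap_buf *cb, const struct cap_hdr *hdr, const uint8_t *pkt)
{
    uint8_t tail[16] = {0}; /* zero-padded staging for the last chunk */
    size_t full = hdr->snap_len & ~(size_t)15;
    size_t rest = hdr->snap_len - full;
    size_t need = sizeof(*hdr) + full + (rest ? 16 : 0);
    uint8_t *dst = cb->base + cb->used;

    if (need > cb->size - cb->used)
        return -1;

    nt_copy(dst, hdr, sizeof(*hdr));
    nt_copy(dst + sizeof(*hdr), pkt, full);
    if (rest) {
        /* Stage the tail so we never read past the end of the packet. */
        memcpy(tail, pkt + full, rest);
        nt_copy(dst + sizeof(*hdr) + full, tail, sizeof(tail));
    }

    /* Make the streamed stores globally visible before publishing. */
    _mm_sfence();
    cb->used += need;
    return 0;
}
```

Staging the last partial chunk through a small stack buffer costs a few cached bytes, but it removes the "n must be divisible by 4" restriction for the caller and avoids reading beyond the packet data. Whether one fence per record is cheap enough for production use is something I would have to measure.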
