DPDK patches and discussions
From: Anuj Kalia <anujkaliaiitd@gmail.com>
To: Parikshith Chowdaiah <pchowdai@gmail.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] rte_prefetch0() performance info
Date: Thu, 5 Mar 2015 03:51:01 -0500	[thread overview]
Message-ID: <CADPSxAh+riebDqBzBd4hKGcgXoac_C7-O1RRmTg7g5M0byPzcg@mail.gmail.com> (raw)
In-Reply-To: <CABxpG4ZA5FmaNOUZODGCnDcgFoT1eSfo5=4gO-rp_wSv8SJeCw@mail.gmail.com>

Hi Parikshith,

A CPU core can keep only a limited number of prefetches in flight (around 10).
So if you issue 64 prefetches (or, in general, nb_rx > 10) in quick
succession, you'll stall on memory accesses anyway. The main idea here is to
overlap the prefetches for some packets with computation on other packets.

This paper explains it in the context of hash tables, but the idea is
similar: https://www.cs.cmu.edu/~binfan/papers/conext13_cuckooswitch.pdf

--Anuj

On Thu, Mar 5, 2015 at 3:46 AM, Parikshith Chowdaiah <pchowdai@gmail.com>
wrote:

> Hi all,
> I have a question related to the usage of the rte_prefetch0() function. In
> one of the sample files, we have an implementation like:
>
>     /* Prefetch first packets */
>     for (j = 0; j < PREFETCH_OFFSET && j < nb_rx; j++) {
>         rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j], void *));
>     }
>
>     /* Prefetch and forward already prefetched packets */
>     for (j = 0; j < (nb_rx - PREFETCH_OFFSET); j++) {
>         rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j + PREFETCH_OFFSET],
>                 void *));
>         l3fwd_simple_forward(pkts_burst[j], portid, qconf);
>     }
>
>     /* Forward remaining prefetched packets */
>     for (; j < nb_rx; j++) {
>         l3fwd_simple_forward(pkts_burst[j], portid, qconf);
>     }
>
>
> where the rte_prefetch0() calls are carried out in multiple split
> iterations. I would like some insight into whether this improves
> performance compared to something like:
>
>
>
>     for (j = 0; j < nb_rx; j++) {
>         rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j], void *));
>     }
>
>
> and how frequently rte_prefetch() needs to be called for the same packet,
> and whether there is any mechanism to call it in bulk for 64 packets at once?
>
>
> thanks
>
> Parikshith
>

Thread overview: 2+ messages
2015-03-05  8:46 Parikshith Chowdaiah
2015-03-05  8:51 ` Anuj Kalia [this message]
