DPDK patches and discussions
From: Stephen Hemminger <stephen@networkplumber.org>
To: Bruce Richardson <bruce.richardson@intel.com>
Cc: Alireza Sanaee <a.sanaee@qmul.ac.uk>, <dev@dpdk.org>
Subject: Re: AF_XDP performance
Date: Wed, 24 May 2023 11:29:19 -0700
Message-ID: <20230524112919.64fc806c@hermes.local>
In-Reply-To: <ZG49EApcTGFoADmE@bricha3-MOBL.ger.corp.intel.com>

On Wed, 24 May 2023 17:36:32 +0100
Bruce Richardson <bruce.richardson@intel.com> wrote:

> On Wed, May 24, 2023 at 01:32:17PM +0100, Alireza Sanaee wrote:
> > Hi everyone,
> > 
> > I was looking at this deck of slides
> > https://www.dpdk.org/wp-content/uploads/sites/35/2020/11/XDP_ZC_PMD-1.pdf
> > 
> > I tried to reproduce the results with the testpmd application. I am
> > working with a BlueField-2 NIC and could sustain ~10 Mpps with testpmd
> > using AF_XDP, and about 20 Mpps without AF_XDP, in the RX drop
> > experiment. I was wondering why AF_XDP is so much slower than the
> > native PCIe scenario, given that both cases are zero-copy. Is it
> > because of the frame size?
> >   
> While I can't claim to explain all the differences, in short I believe the
> AF_XDP version is just doing more work. With a native DPDK driver, the
> driver takes the packet descriptors directly from the NIC RX ring and uses
> the metadata to construct a packet mbuf, which is returned to the
> application.
> 
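For concreteness, the application-side receive path in the native case is
just the standard rte_eth_rx_burst() API; a minimal sketch of an RX-drop
loop (port and queue IDs are placeholders, setup and error handling are
omitted, and this is not testpmd's actual code) might look like:

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Poll one RX queue and immediately free (drop) whatever arrives,
 * roughly what an "RX drop" measurement exercises. */
static void
rx_drop_loop(uint16_t port_id, uint16_t queue_id)
{
        struct rte_mbuf *bufs[BURST_SIZE];
        uint16_t nb_rx, i;

        for (;;) {
                /* The PMD turns NIC descriptors into mbufs here. */
                nb_rx = rte_eth_rx_burst(port_id, queue_id, bufs, BURST_SIZE);
                for (i = 0; i < nb_rx; i++)
                        rte_pktmbuf_free(bufs[i]);
        }
}
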
> With AF_XDP, however, the NIC descriptor ring is not directly accessible by
> the app. Therefore the processing is (AFAIK):
> * kernel reads the NIC descriptor ring and processes each descriptor
> * kernel calls the BPF program for the received packets to determine what
>   action to take, e.g. forward to an AF_XDP socket (see the sketch after
>   this list)
> * kernel writes an AF_XDP descriptor to the AF_XDP socket RX ring
> * the AF_XDP PMD reads the AF_XDP ring entry written by the kernel and
>   then creates a DPDK mbuf to return to the application.
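
The per-packet BPF step above is normally just the small default program
that libxdp/libbpf loads when an AF_XDP socket is created: it redirects
each frame to the socket bound to the receiving queue. A rough sketch of
such a program (map name and size are illustrative, not the exact program
DPDK loads; the fallback-action form of bpf_redirect_map() assumes a
5.3+ kernel):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* One AF_XDP socket file descriptor per RX queue index. */
struct {
        __uint(type, BPF_MAP_TYPE_XSKMAP);
        __uint(max_entries, 64);
        __type(key, __u32);
        __type(value, __u32);
} xsks_map SEC(".maps");

SEC("xdp")
int xdp_redirect_xsk(struct xdp_md *ctx)
{
        /* Redirect to the socket registered for this RX queue;
         * fall back to the normal kernel stack (XDP_PASS) if none. */
        return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, XDP_PASS);
}

char _license[] SEC("license") = "GPL";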
> 
> There are also other considerations, such as the potential cache locality
> of the descriptors, that could affect things, but I would expect the extra
> descriptor-processing work outlined above explains most of the
> difference.
> 
> Regards,
> /Bruce

There is also a context switch from the kernel polling thread to the DPDK polling
thread to consider, plus the overhead of running the BPF program.  The context
switches mean that both the instruction and data caches are likely to have lots
of misses.  Remember, on modern processors the limiting factor is usually memory
performance (how well the caches are used), not the number of instructions.
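
As an aside, one way to take that kernel-thread/application-thread hand-off
out of the picture on recent kernels (5.11+ is assumed here, where preferred
busy polling for AF_XDP was added) is to let the kernel process the socket's
rings from the application's own syscall context. A rough sketch, assuming
xsk_fd is the file descriptor of an already-created and bound AF_XDP socket
and that the kernel headers define the newer socket options:

#include <stdio.h>
#include <sys/socket.h>

/* Ask the kernel to drive this AF_XDP socket's rings from our own
 * syscall context (preferred busy polling) rather than from NAPI
 * softirq context, so descriptors stay on the application's core.
 * The numeric values are illustrative tuning knobs, not recommendations. */
static int
enable_busy_poll(int xsk_fd)
{
        int prefer = 1;
        int usecs = 20;    /* microseconds to busy-poll per syscall */
        int budget = 64;   /* max packets handled per busy-poll pass */

        if (setsockopt(xsk_fd, SOL_SOCKET, SO_PREFER_BUSY_POLL,
                       &prefer, sizeof(prefer)) < 0 ||
            setsockopt(xsk_fd, SOL_SOCKET, SO_BUSY_POLL,
                       &usecs, sizeof(usecs)) < 0 ||
            setsockopt(xsk_fd, SOL_SOCKET, SO_BUSY_POLL_BUDGET,
                       &budget, sizeof(budget)) < 0) {
                perror("setsockopt");
                return -1;
        }
        return 0;
}

If memory serves, the DPDK AF_XDP PMD exposes something similar through a
busy_budget devarg, but check the PMD documentation for the exact name.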
