DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Bruce Richardson <bruce.richardson@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Avi Kivity <avi@scylladb.com>
Subject: Re: [dpdk-dev] Having troubles binding an SR-IOV VF to uio_pci_generic on Amazon instance
Date: Thu, 1 Oct 2015 14:23:17 +0300	[thread overview]
Message-ID: <20151001141124-mutt-send-email-mst@redhat.com> (raw)
In-Reply-To: <20151001110806.GA16248@bricha3-MOBL3>

On Thu, Oct 01, 2015 at 12:08:07PM +0100, Bruce Richardson wrote:
> On Thu, Oct 01, 2015 at 01:38:37PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Oct 01, 2015 at 12:59:47PM +0300, Avi Kivity wrote:
> > > 
> > > 
> > > On 10/01/2015 12:55 PM, Michael S. Tsirkin wrote:
> > > >On Thu, Oct 01, 2015 at 12:22:46PM +0300, Avi Kivity wrote:
> > > >>It's easy to claim that
> > > >>a solution is around the corner, only no one was looking for it, but the
> > > >>reality is that kernel bypass has been a solution for years for high
> > > >>performance users,
> > > >I never said that it's trivial.
> > > >
> > > >It's probably a lot of work. It's definitely more work than just abusing
> > > >sysfs.
> > > >
> > > >But it looks like a write system call into an eventfd is about 1.5
> > > >microseconds on my laptop. Even with a system call per packet, system
> > > >call overhead is not what makes DPDK drivers outperform Linux ones.
> > > >
> > > 
> > > 1.5 us = 0.6 Mpps per core limit.
> > 
> > Oh, I calculated it incorrectly. It's 0.15 us. So 6Mpps.
> > But for RX, you can batch a lot of packets.
> > 
> > You can see by now I'm not that good at benchmarking.
> > Here's what I wrote:
> > 
> > 
> > #include <stdbool.h>
> > #include <sys/eventfd.h>
> > #include <inttypes.h>
> > #include <unistd.h>
> > 
> > 
> > int main(int argc, char **argv)
> > {
> >         int e = eventfd(0, 0);
> >         uint64_t v = 1;
> > 
> >         int i;
> > 
> >         for (i = 0; i < 10000000; ++i) {
> >                 write(e, &v, sizeof v);
> >         }
> > }
> > 
> > 
> > This takes 1.5 seconds to run on my laptop:
> > 
> > $ time ./a.out 
> > 
> > real    0m1.507s
> > user    0m0.179s
> > sys     0m1.328s
> > 
> > 
> > > dpdk performance is in the tens of
> > > millions of packets per system.
> > 
> > I think that's with a bunch of batching though.
> > 
> > > It's not just the lack of system calls, of course, the architecture is
> > > completely different.
> > 
> > Absolutely - I'm not saying move all of DPDK into kernel.
> > We just need to protect the RX rings so hardware does
> > not corrupt kernel memory.
> > 
> > 
> > Thinking about it some more, many devices
> > have separate rings for DMA: TX (device reads memory)
> > and RX (device writes memory).
> > With such devices, a mode where userspace can write TX ring
> > but not RX ring might make sense.
> > 
> > This will mean userspace might read kernel memory
> > through the device, but can not corrupt it.
> > 
> > That's already a big win!
> > 
> > And RX buffers do not have to be added one at a time.
> > If we assume 0.2usec per system call, batching some 100 buffers per
> > system call gives you 2 nano seconds overhead.  That seems quite
> > reasonable.
> > 
> Hi,
> 
> just to jump in a bit on this.
> 
> Batching of 100 packets is a very large batch, and will add to latency.



This is not on transmit or receive path!
This is only for re-adding buffers to the receive ring.
This batching should not add latency at all:


process rx:
	get packet
	packets[n] = alloc packet
	if (++n > 100) {
		system call: add bufs(packets, n);
	}





> The
> standard batch size in DPDK right now is 32, and even that may be too high for
> applications in certain domains.
> 
> However, even with that 2ns of overhead calculation, I'd make a few additional
> points.
> * For DPDK, we are reasonably close to being able to do 40GB of IO - both RX 
> and TX on a single thread. 10GB of IO doesn't really stress a core any more. For
> 40GB of small packet traffic, the packet arrival rate is 16.8ns, so even with a
> huge batch size of 100 packets, your system call overhead on RX is taking almost
> 12% of our processing time. For a batch size of 32 this overhead would rise to
> over 35% of our packet processing time.

As I said, yes, measureable, but not breaking the bank, and that's with
40GB which still are not widespread.
With 10GB and 100 packets, only 3% overhead.

> For 100G line rate, the packet arrival
> rate is just 6.7ns...

Hypervisors still have time get their act together and support IOMMUs
by the time 100G systems become widespread.

> * As well as this overhead from the system call itself, you are also omitting
> the overhead of scanning the RX descriptors.

I omit it because scanning descriptors can still be done in userspace,
just write-protect the RX ring page.


> This in itself is going to use up
> a good proportion of the processing time, as well as that we have to spend cycles
> copying the descriptors from one ring in memory to another. Given that right now
> with the vector ixgbe driver, the cycle cost per packet of RX is just a few dozen
> cycles on modern cores, every additional cycle (fraction of a nanosecond) has
> an impact.
> 
> Regards,
> /Bruce

See above.  There is no need for that on data path. Only re-adding
buffers requires a system call.

-- 
MST

  reply	other threads:[~2015-10-01 11:23 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-27  7:05 Vlad Zolotarov
2015-09-27  9:43 ` Michael S. Tsirkin
2015-09-27 10:50   ` Vladislav Zolotarov
2015-09-29 16:41   ` Vlad Zolotarov
2015-09-29 20:54     ` Michael S. Tsirkin
2015-09-29 21:46       ` Stephen Hemminger
2015-09-29 21:49         ` Michael S. Tsirkin
2015-09-30 10:37           ` Vlad Zolotarov
2015-09-30 10:58             ` Michael S. Tsirkin
2015-09-30 11:26               ` Vlad Zolotarov
     [not found]                 ` <20150930143927-mutt-send-email-mst@redhat.com>
2015-09-30 11:53                   ` Vlad Zolotarov
2015-09-30 12:03                     ` Michael S. Tsirkin
2015-09-30 12:16                       ` Vlad Zolotarov
2015-09-30 12:27                         ` Michael S. Tsirkin
2015-09-30 12:50                           ` Vlad Zolotarov
2015-09-30 15:26                             ` Michael S. Tsirkin
2015-09-30 18:15                               ` Vlad Zolotarov
2015-09-30 18:55                                 ` Michael S. Tsirkin
2015-09-30 19:06                                   ` Vlad Zolotarov
2015-09-30 19:10                                     ` Vlad Zolotarov
2015-09-30 19:11                                       ` Vlad Zolotarov
2015-09-30 19:39                                     ` Michael S. Tsirkin
2015-09-30 20:09                                       ` Vlad Zolotarov
2015-09-30 21:36                                         ` Stephen Hemminger
2015-09-30 21:53                                           ` Michael S. Tsirkin
2015-09-30 22:20                                           ` Vlad Zolotarov
2015-10-01  8:00                                           ` Vlad Zolotarov
2015-10-01 14:47                                             ` Stephen Hemminger
2015-10-01 15:03                                               ` Vlad Zolotarov
2015-09-30 13:05                           ` Avi Kivity
2015-09-30 14:39                             ` Michael S. Tsirkin
2015-09-30 14:53                               ` Avi Kivity
2015-09-30 15:21                                 ` Michael S. Tsirkin
2015-09-30 15:36                                   ` Avi Kivity
2015-09-30 20:40                                     ` Michael S. Tsirkin
2015-09-30 21:00                                       ` Avi Kivity
2015-10-01  8:44                                       ` Michael S. Tsirkin
2015-10-01  8:46                                         ` Vlad Zolotarov
2015-10-01  8:52                                         ` Avi Kivity
2015-10-01  9:15                                           ` Michael S. Tsirkin
2015-10-01  9:22                                             ` Avi Kivity
2015-10-01  9:42                                               ` Michael S. Tsirkin
2015-10-01  9:53                                                 ` Avi Kivity
2015-10-01 10:17                                                   ` Michael S. Tsirkin
2015-10-01 10:24                                                     ` Avi Kivity
2015-10-01 10:25                                                       ` Avi Kivity
2015-10-01 10:44                                                         ` Michael S. Tsirkin
2015-10-01 10:55                                                           ` Avi Kivity
2015-10-01 21:17                                                 ` Alexander Duyck
2015-10-02 13:50                                                   ` Michael S. Tsirkin
2015-10-01  9:42                                               ` Vincent JARDIN
2015-10-01  9:43                                                 ` Avi Kivity
2015-10-01  9:48                                                   ` Vincent JARDIN
2015-10-01  9:54                                                     ` Avi Kivity
2015-10-01 10:14                                                   ` Michael S. Tsirkin
2015-10-01 10:23                                                     ` Avi Kivity
2015-10-01 14:55                                                     ` Stephen Hemminger
2015-10-01 15:49                                                       ` Michael S. Tsirkin
2015-10-01 14:54                                                 ` Stephen Hemminger
2015-10-01  9:55                                               ` Michael S. Tsirkin
2015-10-01  9:59                                                 ` Avi Kivity
2015-10-01 10:38                                                   ` Michael S. Tsirkin
2015-10-01 10:50                                                     ` Avi Kivity
2015-10-01 11:09                                                       ` Michael S. Tsirkin
2015-10-01 11:20                                                         ` Avi Kivity
2015-10-01 11:27                                                           ` Michael S. Tsirkin
2015-10-01 11:32                                                             ` Avi Kivity
2015-10-01 15:01                                                               ` Stephen Hemminger
2015-10-01 15:08                                                                 ` Avi Kivity
2015-10-01 15:46                                                                 ` Michael S. Tsirkin
2015-10-01 15:11                                                               ` Michael S. Tsirkin
2015-10-01 15:19                                                                 ` Avi Kivity
2015-10-01 15:40                                                                   ` Michael S. Tsirkin
2015-10-01 11:31                                                           ` Michael S. Tsirkin
2015-10-01 11:34                                                             ` Avi Kivity
2015-10-01 11:08                                                     ` Bruce Richardson
2015-10-01 11:23                                                       ` Michael S. Tsirkin [this message]
2015-10-01 12:07                                                         ` Bruce Richardson
2015-10-01 13:14                                                           ` Michael S. Tsirkin
2015-10-01 16:04                                                             ` Michael S. Tsirkin
2015-10-01 21:02                                                             ` Alexander Duyck
2015-10-02 14:00                                                               ` Michael S. Tsirkin
2015-10-02 14:07                                                                 ` Bruce Richardson
2015-10-04  9:07                                                                   ` Michael S. Tsirkin
2015-10-02 15:56                                                                 ` Gleb Natapov
2015-10-02 16:57                                                                 ` Alexander Duyck
2015-10-01  9:15                                           ` Avi Kivity
2015-10-01  9:29                                             ` Michael S. Tsirkin
2015-10-01  9:38                                               ` Avi Kivity
2015-10-01 10:07                                                 ` Michael S. Tsirkin
2015-10-01 10:11                                                   ` Avi Kivity
2015-10-01  9:16                                         ` Michael S. Tsirkin
2015-09-30 17:28             ` Stephen Hemminger
2015-09-30 17:39               ` Michael S. Tsirkin
2015-09-30 17:43                 ` Stephen Hemminger
2015-09-30 18:50                   ` Michael S. Tsirkin
2015-09-30 20:00                     ` Gleb Natapov
2015-09-30 20:36                       ` Michael S. Tsirkin
2015-10-01  5:04                         ` Gleb Natapov
2015-09-30 17:44                 ` Gleb Natapov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151001141124-mutt-send-email-mst@redhat.com \
    --to=mst@redhat.com \
    --cc=avi@scylladb.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).