DPDK patches and discussions
* [dpdk-dev] vSwitch Performance Comparison for NFV Use Case
@ 2015-08-21 19:18 Jun Xiao
  2015-08-24 16:50 ` Traynor, Kevin
  0 siblings, 1 reply; 2+ messages in thread
From: Jun Xiao @ 2015-08-21 19:18 UTC (permalink / raw)
  To: Gray, Mark D; +Cc: dev

Hi Mark,
Last time we discussed methodologies for vSwitch performance comparison, and the performance data we published was more for typical TCP-based applications in virtualized data centers. Today we shared more data for small-packet traffic at http://cloudnetengine.com/en/blog/2015/08/21/vswitch-performance-comparison-nfv-use-case, and the performance gap narrows to around 10-20% between OVS-DPDK and CNE vSwitch, as these tests are bare forwarding without any other features.

On the other hand, it's really hard to find any public performance data for OVS-DPDK in the pNIC -> vSwitch -> VM -> vSwitch -> pNIC case. What I observed is that OVS-DPDK generally achieves less than 3 Mpps on my setup (vhost-user is used instead of IVSHMEM); I don't know whether that is in line with what you have seen?
Thanks,
Jun
www.cloudnetengine.com
From: Stephen Hemminger <stephen@networkplumber.org> @ 2015-08-21 20:25 UTC (permalink / raw)
  To: Zoltan Kiss <zoltan.kiss@linaro.org>; +Cc: dev@dpdk.org, dev@openvswitch.org
Subject: Re: [dpdk-dev] OVS-DPDK performance problem on ixgbe vector PMD

Use perf top; it gives much better data than oprofile.
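
(As a concrete illustration, assuming the PMD thread runs inside ovs-vswitchd
and perf is installed: "perf top -p $(pidof ovs-vswitchd)" samples just that
process, and "perf top -C <n>" restricts sampling to the core the PMD thread
is pinned to; both invocations are examples, not taken from this thread.)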

On Fri, Aug 21, 2015 at 11:05 AM, Zoltan Kiss <zoltan.kiss@linaro.org>
wrote:

> Hi,
>
> I've set up a simple packet forwarding perf test on a dual-port 10G
> 82599ES: one port receives 64-byte UDP packets and the other sends them
> out, with one core used. I've used the latest OVS with DPDK 2.1, and the
> first result was only 13.2 Mpps, which is a bit short of the 13.9 I saw
> last year with the same test. The first thing I changed was to revert to
> the old behaviour regarding this issue:
>
> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/22731
>
> So instead of the new default I passed 2048 + RTE_PKTMBUF_HEADROOM. That
> increased the performance to 13.5 Mpps, but to figure out what's wrong I
> started to play with the receive functions. First I disabled the vector
> PMD, but ixgbe_recv_pkts_bulk_alloc() was even worse, only 12.5 Mpps. So
> then I enabled scattered RX, and with ixgbe_recv_pkts_lro_bulk_alloc() I
> managed to get 13.98 Mpps, which is, I guess, as close as possible to the
> 14.2 Mpps line rate (on my HW at least, with one core).
> Does anyone have a good explanation for why the vector PMD performs so
> significantly worse? I would expect that on a 3.2 GHz i5-4570 one core
> should be able to reach ~14 Mpps; SG and the vector PMD shouldn't make a
> difference.
> I've tried to look into it with oprofile, but the results were quite
> strange: 35% of the samples were from miniflow_extract, in the part where
> parse_vlan calls data_pull to advance past the MAC addresses. The oprofile
> snippet (1M samples):
>
>   511454 19        0.0037  flow.c:511
>   511458 149       0.0292  dp-packet.h:266
>   51145f 4264      0.8357  dp-packet.h:267
>   511466 18        0.0035  dp-packet.h:268
>   51146d 43        0.0084  dp-packet.h:269
>   511474 172       0.0337  flow.c:511
>   51147a 4320      0.8467  string3.h:51
>   51147e 358763   70.3176  flow.c:99
>   511482 2        3.9e-04  string3.h:51
>   511485 3060      0.5998  string3.h:51
>   511488 1693      0.3318  string3.h:51
>   51148c 2933      0.5749  flow.c:326
>   511491 47        0.0092  flow.c:326
>
> And the corresponding disassembled code:
>
>   511454:       49 83 f9 0d             cmp    r9,0xd
>   511458:       c6 83 81 00 00 00 00    mov    BYTE PTR [rbx+0x81],0x0
>   51145f:       66 89 83 82 00 00 00    mov    WORD PTR [rbx+0x82],ax
>   511466:       66 89 93 84 00 00 00    mov    WORD PTR [rbx+0x84],dx
>   51146d:       66 89 8b 86 00 00 00    mov    WORD PTR [rbx+0x86],cx
>   511474:       0f 86 af 01 00 00       jbe    511629 <miniflow_extract+0x279>
>   51147a:       48 8b 45 00             mov    rax,QWORD PTR [rbp+0x0]
>   51147e:       4c 8d 5d 0c             lea    r11,[rbp+0xc]
>   511482:       49 89 00                mov    QWORD PTR [r8],rax
>   511485:       8b 45 08                mov    eax,DWORD PTR [rbp+0x8]
>   511488:       41 89 40 08             mov    DWORD PTR [r8+0x8],eax
>   51148c:       44 0f b7 55 0c          movzx  r10d,WORD PTR [rbp+0xc]
>   511491:       66 41 81 fa 81 00       cmp    r10w,0x81
>
> My only explanation so far is that I'm misunderstanding something about
> the oprofile results.
>
> Regards,
>
> Zoltan
>
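
Regarding the 2048 + RTE_PKTMBUF_HEADROOM data room size mentioned above, the
following is a minimal sketch of passing that value when the mbuf pool is
created with the plain DPDK API; the pool name, mbuf count, cache size and the
helper function are illustrative and not taken from OVS's netdev-dpdk code:

    #include <rte_lcore.h>
    #include <rte_mbuf.h>
    #include <rte_mempool.h>

    /* Sketch: create an mbuf pool whose per-mbuf data room is 2048 bytes
     * plus the reserved headroom, i.e. the "old" size reverted to above.
     * All other parameters are illustrative values only. */
    static struct rte_mempool *
    make_pool_with_2048_data_room(void)
    {
        return rte_pktmbuf_pool_create("mbuf_pool_2048",
                                       16384,  /* number of mbufs */
                                       256,    /* per-lcore cache size */
                                       0,      /* private data size */
                                       2048 + RTE_PKTMBUF_HEADROOM,
                                       rte_socket_id());
    }

The buffer sizing matters here because the receive path the ixgbe PMD ends up
using (vector, bulk-alloc or scattered) depends in part on the RX configuration
and buffer size, which is presumably why the choice shows up as a throughput
difference in the numbers above.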

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [dpdk-dev] vSwitch Performance Comparison for NFV Use Case
  2015-08-21 19:18 [dpdk-dev] vSwitch Performance Comparison for NFV Use Case Jun Xiao
@ 2015-08-24 16:50 ` Traynor, Kevin
  0 siblings, 0 replies; 2+ messages in thread
From: Traynor, Kevin @ 2015-08-24 16:50 UTC (permalink / raw)
  To: Jun Xiao, Gray, Mark D; +Cc: dev


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jun Xiao
> Sent: Friday, August 21, 2015 8:18 PM
> To: Gray, Mark D
> Cc: dev
> Subject: [dpdk-dev] vSwitch Performance Comparison for NFV Use Case
> 
> Hi Mark,
> Last time we discussed methodologies for vSwitch performance comparison, and
> the performance data we published was more for typical TCP-based applications
> in virtualized data centers. Today we shared more data for small-packet
> traffic at http://cloudnetengine.com/en/blog/2015/08/21/vswitch-performance-
> comparison-nfv-use-case, and the performance gap narrows to around 10-20%
> between OVS-DPDK and CNE vSwitch, as these tests are bare forwarding without
> any other features.
> 
> On the other hand, it's really hard to find any public performance data for
> OVS-DPDK in the pNIC -> vSwitch -> VM -> vSwitch -> pNIC case. What I observed
> is that OVS-DPDK generally achieves less than 3 Mpps on my setup (vhost-user
> is used instead of IVSHMEM); I don't know whether that is in line with what
> you have seen?

That seems reasonable enough (maybe a little low) considering you are on a 2.4 GHz
CPU and are using one logical core, so the pmd will be sharing a physical core. As
you said, you will get greater performance if you add another pmd. You will also
get better performance if you set rx_mrgbuf=off.
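
(For rough context, and assuming the 2.4 GHz figure above together with the
~3 Mpps Jun reported: that is a budget of roughly 2.4e9 / 3e6 = 800 cycles per
packet, and since each packet crosses the vSwitch twice on the
pNIC -> vSwitch -> VM -> vSwitch -> pNIC path, about 400 cycles per traversal.)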

Did you affinitize the pmd to an empty core, and likewise the packet-forwarding
qemu thread, to make sure they both get the cycles they need?
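
(Presumably the knobs meant here are OVS's other_config:pmd-cpu-mask for adding
and placing pmd threads, and QEMU's virtio-net mrg_rxbuf property for the
mergeable RX buffer setting; the exact option names depend on the OVS and QEMU
versions in use.)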

> Thanks,
> Jun
> www.cloudnetengine.com

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2015-08-24 16:50 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-21 19:18 [dpdk-dev] vSwitch Performance Comparison for NFV Use Case Jun Xiao
2015-08-24 16:50 ` Traynor, Kevin
