From: Zoltan Kiss <zoltan.kiss@linaro.org>
To: "dev@dpdk.org" <dev@dpdk.org>, dev@openvswitch.org
Subject: Re: [dpdk-dev] OVS-DPDK performance problem on ixgbe vector PMD
Date: Wed, 26 Aug 2015 19:49:45 +0100
Message-ID: <55DE0A49.8060803@linaro.org>
In-Reply-To: <55D76854.5010306@linaro.org>
Hi,
I've looked into this further and, following Stephen's suggestion, tried
perf top as well. The results were the same: a lot of time is spent in
that part of the code, and there is a high number of branch load misses
(BR_MISS_PRED_RETIRED) around there too.
I've also started to strip down miniflow_extract() to remove the parts
which are not relevant to this very simple testcase. I've removed the
metadata checking branches and the "size < sizeof(struct eth_header)"
check. I've also removed the size check from emc_processing, and placed
log messages in flow_extract and netdev_flow_key_from_flow, to make sure
the excessive time spent in miniflow_extract is not because these two are
somehow calling it.
That way I've ruled out all of the branches preceding this instruction.
Oddly, the high sample count has now moved down a few instructions:
...
dp_packet_reset_offsets
5113eb: b8 ff ff ff ff mov $0xffffffff,%eax
5113f0: 66 89 8f 86 00 00 00 mov %cx,0x86(%rdi)
5113f7: c6 87 81 00 00 00 00 movb $0x0,0x81(%rdi)
5113fe: 66 89 87 82 00 00 00 mov %ax,0x82(%rdi)
data_pull
511405: 48 8d 4d 0c lea 0xc(%rbp),%rcx
dp_packet_reset_offsets
511409: 66 89 97 84 00 00 00 mov %dx,0x84(%rdi)
memcpy
511410: 48 8b 45 00 mov 0x0(%rbp),%rax
511414: 48 89 46 18 mov %rax,0x18(%rsi)
This last instruction moves the first 8 bytes of the MAC addresses (coming
from 0x0(%rbp)) to 0x18(%rsi), which is basically the memory pointed to by
the parameter "struct miniflow *dst". It is allocated on the stack by
emc_processing.
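To make the data movement concrete, here is a minimal Python sketch of what the inlined memcpy does: the 12 address bytes at the start of the frame are copied as one 8-byte move followed by one 4-byte move. The 4-byte half is taken from the earlier disassembly in the quoted message below; the frame contents and the 64-byte destination buffer are made up for illustration, and only the 0x18 store offset comes from the listing above.

```python
import struct

ETH_ADDRS_LEN = 12           # destination MAC (6 bytes) + source MAC (6 bytes)
MINIFLOW_DST_OFFSET = 0x18   # store offset seen in the disassembly


def copy_mac_addresses(frame: bytes, dst: bytearray,
                       dst_off: int = MINIFLOW_DST_OFFSET) -> None:
    """Copy the 12 address bytes as one 8-byte and one 4-byte move."""
    (first8,) = struct.unpack_from("<Q", frame, 0)    # mov 0x0(%rbp),%rax
    struct.pack_into("<Q", dst, dst_off, first8)      # mov %rax,0x18(%rsi)
    (last4,) = struct.unpack_from("<I", frame, 8)     # 4-byte load from offset 8
    struct.pack_into("<I", dst, dst_off + 8, last4)   # 4-byte store, 8 bytes further


# Hypothetical 64-byte frame: broadcast destination, made-up source, IPv4 ethertype.
frame = bytes.fromhex("ffffffffffff" "001122334455" "0800") + bytes(50)
dst = bytearray(64)          # stand-in for the on-stack miniflow buffer
copy_mac_addresses(frame, dst)
print(dst[MINIFLOW_DST_OFFSET:MINIFLOW_DST_OFFSET + ETH_ADDRS_LEN].hex())
```

The point of the sketch is only that the store at 0x18(%rsi) depends on the preceding load from the packet data, so samples attributed to it may really reflect the cost of touching the packet for the first time.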
I couldn't find any branch that could cause this miss, but then I
checked the PMD stats:
pmd thread numa_id 0 core_id 1:
emc hits:4395834176
megaflow hits:1
miss:1
lost:0
polling cycles:166083129380 (16.65%)
processing cycles:831536059972 (83.35%)
avg cycles per packet: 226.95 (997619189352/4395834178)
avg processing cycles per packet: 189.16 (831536059972/4395834178)
So everything hits the EMC. When I measured the change of that counter
over 10 seconds, the result was also around 13.3 Mpps. The cycle
statistics show that it should be able to handle more than 15M packets per
second, yet it doesn't receive that many, while with the non-vector PMD it
can max out the link.
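As a sanity check on these numbers, the following Python sketch recomputes the per-packet cycle costs from the counters above and converts them into an implied packet rate. It assumes the cycle counters tick at the 3.2 GHz nominal clock of the i5-4570 mentioned in the quoted message; that clock rate is an assumption, not something the stats report.

```python
# Counters copied from the PMD stats above.
EMC_HITS = 4395834176
MEGAFLOW_HITS = 1
MISSES = 1
POLLING_CYCLES = 166083129380
PROCESSING_CYCLES = 831536059972

# Assumption: the cycle counters tick at the CPU's 3.2 GHz nominal clock.
CLOCK_HZ = 3.2e9


def implied_mpps(cycles: int, packets: int, clock_hz: float = CLOCK_HZ) -> float:
    """Millions of packets per second one core could sustain at this cost."""
    return clock_hz / (cycles / packets) / 1e6


packets = EMC_HITS + MEGAFLOW_HITS + MISSES
total_cycles = POLLING_CYCLES + PROCESSING_CYCLES

emc_hit_ratio = EMC_HITS / packets
total_mpps = implied_mpps(total_cycles, packets)
processing_mpps = implied_mpps(PROCESSING_CYCLES, packets)

print(f"packets: {packets}")
print(f"EMC hit ratio: {emc_hit_ratio:.9f}")
print(f"avg cycles per packet: {total_cycles / packets:.2f}")
print(f"avg processing cycles per packet: {PROCESSING_CYCLES / packets:.2f}")
print(f"implied rate, all cycles: {total_mpps:.2f} Mpps")
print(f"implied rate, processing cycles only: {processing_mpps:.2f} Mpps")
```

Under that assumption, counting only the processing cycles gives roughly 16.9 Mpps, while counting polling cycles as well gives roughly 14.1 Mpps. So the "more than 15M" figure holds only if the polling cycles are treated as idle time.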
Any more suggestions?
Regards,
Zoltan
On 21/08/15 19:05, Zoltan Kiss wrote:
> Hi,
>
> I've set up a simple packet forwarding perf test on a dual-port 10G
> 82599ES: one port receives 64 byte UDP packets, the other sends them out,
> one core used. I've used the latest OVS with DPDK 2.1, and the first result
> was only 13.2 Mpps, which was a bit far from the 13.9 I've seen last
> year with the same test. The first thing I've changed was to revert
> to the old behaviour regarding this issue:
>
> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/22731
>
> So instead of the new default I've passed 2048 + RTE_PKTMBUF_HEADROOM.
> That increased the performance to 13.5 Mpps, but to figure out what's wrong
> I started to play with the receive functions. First I've disabled the vector
> PMD, but ixgbe_recv_pkts_bulk_alloc() was even worse, only 12.5 Mpps. So
> then I've enabled scattered RX, and with
> ixgbe_recv_pkts_lro_bulk_alloc() I could manage to get 13.98 Mpps, which
> is I guess as close as possible to the 14.2 line rate (on my HW at
> least, with one core).
> Does anyone have a good explanation for why the vector PMD performs so
> significantly worse? I would expect that on a 3.2 GHz i5-4570 one core
> should be able to reach ~14 Mpps, and SG and vector PMD shouldn't make a
> difference.
> I've tried to look into it with oprofile, but the results were quite
> strange: 35% of the samples were from miniflow_extract, the part where
> parse_vlan calls data_pull to jump after the MAC addresses. The oprofile
> snippet (1M samples):
>
> 511454 19 0.0037 flow.c:511
> 511458 149 0.0292 dp-packet.h:266
> 51145f 4264 0.8357 dp-packet.h:267
> 511466 18 0.0035 dp-packet.h:268
> 51146d 43 0.0084 dp-packet.h:269
> 511474 172 0.0337 flow.c:511
> 51147a 4320 0.8467 string3.h:51
> 51147e 358763 70.3176 flow.c:99
> 511482 2 3.9e-04 string3.h:51
> 511485 3060 0.5998 string3.h:51
> 511488 1693 0.3318 string3.h:51
> 51148c 2933 0.5749 flow.c:326
> 511491 47 0.0092 flow.c:326
>
> And the corresponding disassembled code:
>
> 511454: 49 83 f9 0d cmp r9,0xd
> 511458: c6 83 81 00 00 00 00 mov BYTE PTR [rbx+0x81],0x0
> 51145f: 66 89 83 82 00 00 00 mov WORD PTR [rbx+0x82],ax
> 511466: 66 89 93 84 00 00 00 mov WORD PTR [rbx+0x84],dx
> 51146d: 66 89 8b 86 00 00 00 mov WORD PTR [rbx+0x86],cx
> 511474: 0f 86 af 01 00 00 jbe 511629 <miniflow_extract+0x279>
> 51147a: 48 8b 45 00 mov rax,QWORD PTR [rbp+0x0]
> 51147e: 4c 8d 5d 0c lea r11,[rbp+0xc]
> 511482: 49 89 00 mov QWORD PTR [r8],rax
> 511485: 8b 45 08 mov eax,DWORD PTR [rbp+0x8]
> 511488: 41 89 40 08 mov DWORD PTR [r8+0x8],eax
> 51148c: 44 0f b7 55 0c movzx r10d,WORD PTR [rbp+0xc]
> 511491: 66 41 81 fa 81 00 cmp r10w,0x81
>
> My only explanation for this so far is that I misunderstand something
> about the oprofile results.
>
> Regards,
>
> Zoltan