From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: "Wang, Zhihong" <zhihong.wang@intel.com>,
Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: "stephen@networkplumber.org" <stephen@networkplumber.org>,
"Pierre Pfister (ppfister)" <ppfister@cisco.com>,
"Xie, Huawei" <huawei.xie@intel.com>,
"dev@dpdk.org" <dev@dpdk.org>,
"vkaplans@redhat.com" <vkaplans@redhat.com>,
"mst@redhat.com" <mst@redhat.com>
Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path
Date: Fri, 4 Nov 2016 08:59:57 +0100 [thread overview]
Message-ID: <7e1c8953-db15-f377-cece-85cb7169bb17@redhat.com> (raw)
In-Reply-To: <17d285a9-818c-b060-8969-daccb052dc1f@redhat.com>
On 11/04/2016 08:57 AM, Maxime Coquelin wrote:
> Hi Zhihong,
>
> On 11/04/2016 08:20 AM, Wang, Zhihong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>> Sent: Thursday, November 3, 2016 4:11 PM
>>> To: Wang, Zhihong <zhihong.wang@intel.com>; Yuanhan Liu
>>> <yuanhan.liu@linux.intel.com>
>>> Cc: stephen@networkplumber.org; Pierre Pfister (ppfister)
>>> <ppfister@cisco.com>; Xie, Huawei <huawei.xie@intel.com>; dev@dpdk.org;
>>> vkaplans@redhat.com; mst@redhat.com
>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>>> support
>>> to the TX path
>>>
>>>
>>>
>>> On 11/02/2016 11:51 AM, Maxime Coquelin wrote:
>>>>
>>>>
>>>> On 10/31/2016 11:01 AM, Wang, Zhihong wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>>>>> Sent: Friday, October 28, 2016 3:42 PM
>>>>>> To: Wang, Zhihong <zhihong.wang@intel.com>; Yuanhan Liu
>>>>>> <yuanhan.liu@linux.intel.com>
>>>>>> Cc: stephen@networkplumber.org; Pierre Pfister (ppfister)
>>>>>> <ppfister@cisco.com>; Xie, Huawei <huawei.xie@intel.com>;
>>> dev@dpdk.org;
>>>>>> vkaplans@redhat.com; mst@redhat.com
>>>>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>>>>>> support
>>>>>> to the TX path
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/28/2016 02:49 AM, Wang, Zhihong wrote:
>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
>>>>>>>>> Sent: Thursday, October 27, 2016 6:46 PM
>>>>>>>>> To: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>>> Cc: Wang, Zhihong <zhihong.wang@intel.com>;
>>>>>>>>> stephen@networkplumber.org; Pierre Pfister (ppfister)
>>>>>>>>> <ppfister@cisco.com>; Xie, Huawei <huawei.xie@intel.com>;
>>>>>> dev@dpdk.org;
>>>>>>>>> vkaplans@redhat.com; mst@redhat.com
>>>>>>>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>>>>>> support
>>>>>>>>> to the TX path
>>>>>>>>>
>>>>>>>>> On Thu, Oct 27, 2016 at 12:35:11PM +0200, Maxime Coquelin wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 10/27/2016 12:33 PM, Yuanhan Liu wrote:
>>>>>>>>>>>>> On Thu, Oct 27, 2016 at 11:10:34AM +0200, Maxime Coquelin
>>>>>> wrote:
>>>>>>>>>>>>>>> Hi Zhihong,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10/27/2016 11:00 AM, Wang, Zhihong wrote:
>>>>>>>>>>>>>>>>> Hi Maxime,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Seems indirect desc feature is causing serious
>>> performance
>>>>>>>>>>>>>>>>> degradation on Haswell platform, about 20% drop for both
>>>>>>>>>>>>>>>>> mrg=on and mrg=off (--txqflags=0xf00, non-vector
>>> version),
>>>>>>>>>>>>>>>>> both iofwd and macfwd.
>>>>>>>>>>>>>>> I tested PVP (with macswap on guest) and Txonly/Rxonly on
>>> an
>>>>>> Ivy
>>>>>>>>> Bridge
>>>>>>>>>>>>>>> platform, and didn't faced such a drop.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was actually wondering that may be the cause. I tested it
>>>>>>>>>>>>> with
>>>>>>>>>>>>> my IvyBridge server as well, I saw no drop.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Maybe you should find a similar platform (Haswell) and have a
>>>>>>>>>>>>> try?
>>>>>>>>>>> Yes, that's why I asked Zhihong whether he could test Txonly in
>>>>>>>>>>> guest
>>>>>> to
>>>>>>>>>>> see if issue is reproducible like this.
>>>>>>>>>
>>>>>>>>> I have no Haswell box, otherwise I could do a quick test for you.
>>>>>>>>> IIRC,
>>>>>>>>> he tried to disable the indirect_desc feature, then the
>>>>>>>>> performance
>>>>>>>>> recovered. So, it's likely the indirect_desc is the culprit here.
>>>>>>>>>
>>>>>>>>>>> I will be easier for me to find an Haswell machine if it has not
>>>>>>>>>>> to be
>>>>>>>>>>> connected back to back to and HW/SW packet generator.
>>>>>>> In fact simple loopback test will also do, without pktgen.
>>>>>>>
>>>>>>> Start testpmd in both host and guest, and do "start" in one
>>>>>>> and "start tx_first 32" in another.
>>>>>>>
>>>>>>> Perf drop is about 24% in my test.
>>>>>>>
>>>>>>
>>>>>> Thanks, I never tried this test.
>>>>>> I managed to find an Haswell platform (Intel(R) Xeon(R) CPU
>>>>>> E5-2699 v3
>>>>>> @ 2.30GHz), and can reproduce the problem with the loop test you
>>>>>> mention. I see a performance drop about 10% (8.94Mpps/8.08Mpps).
>>>>>> Out of curiosity, what are the numbers you get with your setup?
>>>>>
>>>>> Hi Maxime,
>>>>>
>>>>> Let's align our test case to RC2, mrg=on, loopback, on Haswell.
>>>>> My results below:
>>>>> 1. indirect=1: 5.26 Mpps
>>>>> 2. indirect=0: 6.54 Mpps
>>>>>
>>>>> It's about 24% drop.
>>>> OK, so on my side, same setup on Haswell:
>>>> 1. indirect=1: 7.44 Mpps
>>>> 2. indirect=0: 8.18 Mpps
>>>>
>>>> Still 10% drop in my case with mrg=on.
>>>>
>>>> The strange thing with both of our figures is that this is below from
>>>> what I obtain with my SandyBridge machine. The SB cpu freq is 4%
>>>> higher,
>>>> but that doesn't explain the gap between the measurements.
>>>>
>>>> I'm continuing the investigations on my side.
>>>> Maybe we should fix a deadline, and decide do disable indirect in
>>>> Virtio PMD if root cause not identified/fixed at some point?
>>>>
>>>> Yuanhan, what do you think?
>>>
>>> I have done some measurements using perf, and know understand better
>>> what happens.
>>>
>>> With indirect descriptors, I can see a cache miss when fetching the
>>> descriptors in the indirect table. Actually, this is expected, so
>>> we prefetch the first desc as soon as possible, but still not soon
>>> enough to make it transparent.
>>> In direct descriptors case, the desc in the virtqueue seems to be
>>> remain in the cache from its previous use, so we have a hit.
>>>
>>> That said, in realistic use-case, I think we should not have a hit,
>>> even with direct descriptors.
>>> Indeed, the test case use testpmd on guest side with the forwarding set
>>> in IO mode. It means the packet content is never accessed by the guest.
>>>
>>> In my experiments, I am used to set the "macswap" forwarding mode, which
>>> swaps src and dest MAC addresses in the packet. I find it more
>>> realistic, because I don't see the point in sending packets to the guest
>>> if it is not accessed (not even its header).
>>>
>>> I tried again the test case, this time with setting the forwarding mode
>>> to macswap in the guest. This time, I get same performance with both
>>> direct and indirect (indirect even a little better with a small
>>> optimization, consisting in prefetching the 2 first descs
>>> systematically as we know there are contiguous).
>>
>>
>> Hi Maxime,
>>
>> I did a little more macswap test and found out more stuff here:
> Thanks for doing more tests.
>
>>
>> 1. I did loopback test on another HSW machine with the same H/W,
>> and indirect_desc on and off seems have close perf
>>
>> 2. So I checked the gcc version:
>>
>> * Previous: gcc version 6.2.1 20160916 (Fedora 24)
>>
>> * New: gcc version 5.4.0 20160609 (Ubuntu 16.04.1 LTS)
>
> On my side, I tested with RHEL7.3:
> - gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
>
> It certainly contains some backports from newer GCC versions.
>
>>
>> On previous one indirect_desc has 20% drop
>>
>> 3. Then I compiled binary on Ubuntu and scp to Fedora, and as
>> expected I got the same perf as on Ubuntu, and the perf gap
>> disappeared, so gcc is definitely one factor here
>>
>> 4. Then I use the Ubuntu binary on Fedora for PVP test, then the
>> perf gap comes back again and the same with the Fedora binary
>> results, indirect_desc causes about 20% drop
>
> Let me know if I understand correctly:
> Loopback test with macswap:
> - gcc version 6.2.1 : 20% perf drop
> - gcc version 5.4.0 : No drop
>
> PVP test with macswap:
> - gcc version 6.2.1 : 20% perf drop
> - gcc version 5.4.0 : 20% perf drop
I forgot to ask, did you recompile only host, or both host and guest
testmpd's in your test?
>
>>
>> So in all, could you try PVP traffic on HSW to see how it works?
> Sadly, the HSW machine I borrowed does not have other device connected
> back to back on its 10G port. I can only test PVP with SNB machines
> currently.
>
>>
>>
>>>
>>> Do you agree we should assume that the packet (header or/and buf) will
>>> always be accessed by the guest application?
>>> If so, do you agree we should keep indirect descs enabled, and maybe
>>> update the test cases?
>>
>>
>> I agree with you that mac/macswap test is more realistic and makes
>> more sense for real applications.
>
> Thanks,
> Maxime
next prev parent reply other threads:[~2016-11-04 8:00 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-09-23 8:28 [dpdk-dev] [PATCH v3] " Maxime Coquelin
2016-09-23 15:49 ` Michael S. Tsirkin
2016-09-23 18:02 ` Maxime Coquelin
2016-09-23 18:06 ` Michael S. Tsirkin
2016-09-23 18:16 ` Maxime Coquelin
2016-09-23 18:22 ` Michael S. Tsirkin
2016-09-23 20:24 ` Stephen Hemminger
2016-09-26 3:03 ` Yuanhan Liu
2016-09-26 12:25 ` Michael S. Tsirkin
2016-09-26 13:04 ` Yuanhan Liu
2016-09-27 4:15 ` Yuanhan Liu
2016-09-27 7:25 ` Maxime Coquelin
2016-09-27 8:42 ` [dpdk-dev] [PATCH v4] " Maxime Coquelin
2016-09-27 12:18 ` Yuanhan Liu
2016-10-14 7:24 ` Wang, Zhihong
2016-10-14 7:34 ` Wang, Zhihong
2016-10-14 15:50 ` Maxime Coquelin
2016-10-17 11:23 ` Maxime Coquelin
2016-10-17 13:21 ` Yuanhan Liu
2016-10-17 14:14 ` Maxime Coquelin
2016-10-27 9:00 ` Wang, Zhihong
2016-10-27 9:10 ` Maxime Coquelin
2016-10-27 9:55 ` Maxime Coquelin
2016-10-27 10:19 ` Wang, Zhihong
2016-10-28 7:32 ` Pierre Pfister (ppfister)
2016-10-28 7:58 ` Maxime Coquelin
2016-11-01 8:15 ` Yuanhan Liu
2016-11-01 9:39 ` Thomas Monjalon
2016-11-02 2:44 ` Yuanhan Liu
2016-10-27 10:33 ` Yuanhan Liu
2016-10-27 10:35 ` Maxime Coquelin
2016-10-27 10:46 ` Yuanhan Liu
2016-10-28 0:49 ` Wang, Zhihong
2016-10-28 7:42 ` Maxime Coquelin
2016-10-31 10:01 ` Wang, Zhihong
2016-11-02 10:51 ` Maxime Coquelin
2016-11-03 8:11 ` Maxime Coquelin
2016-11-04 6:18 ` Xu, Qian Q
2016-11-04 7:41 ` Maxime Coquelin
2016-11-04 7:20 ` Wang, Zhihong
2016-11-04 7:57 ` Maxime Coquelin
2016-11-04 7:59 ` Maxime Coquelin [this message]
2016-11-04 10:43 ` Wang, Zhihong
2016-11-04 11:22 ` Maxime Coquelin
2016-11-04 11:36 ` Yuanhan Liu
2016-11-04 11:39 ` Maxime Coquelin
2016-11-04 12:30 ` Wang, Zhihong
2016-11-04 12:54 ` Maxime Coquelin
2016-11-04 13:09 ` Wang, Zhihong
2016-11-08 10:51 ` Wang, Zhihong
2016-10-27 10:53 ` Maxime Coquelin
2016-10-28 6:05 ` Xu, Qian Q
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7e1c8953-db15-f377-cece-85cb7169bb17@redhat.com \
--to=maxime.coquelin@redhat.com \
--cc=dev@dpdk.org \
--cc=huawei.xie@intel.com \
--cc=mst@redhat.com \
--cc=ppfister@cisco.com \
--cc=stephen@networkplumber.org \
--cc=vkaplans@redhat.com \
--cc=yuanhan.liu@linux.intel.com \
--cc=zhihong.wang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).