DPDK patches and discussions
From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: "Wang, Zhihong" <zhihong.wang@intel.com>,
	Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: "stephen@networkplumber.org" <stephen@networkplumber.org>,
	"Pierre Pfister (ppfister)" <ppfister@cisco.com>,
	"Xie, Huawei" <huawei.xie@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"vkaplans@redhat.com" <vkaplans@redhat.com>,
	"mst@redhat.com" <mst@redhat.com>
Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path
Date: Fri, 4 Nov 2016 08:57:49 +0100	[thread overview]
Message-ID: <17d285a9-818c-b060-8969-daccb052dc1f@redhat.com> (raw)
In-Reply-To: <8F6C2BD409508844A0EFC19955BE09414E7DC40F@SHSMSX103.ccr.corp.intel.com>

Hi Zhihong,

On 11/04/2016 08:20 AM, Wang, Zhihong wrote:
>
>
>> -----Original Message-----
>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>> Sent: Thursday, November 3, 2016 4:11 PM
>> To: Wang, Zhihong <zhihong.wang@intel.com>; Yuanhan Liu
>> <yuanhan.liu@linux.intel.com>
>> Cc: stephen@networkplumber.org; Pierre Pfister (ppfister)
>> <ppfister@cisco.com>; Xie, Huawei <huawei.xie@intel.com>; dev@dpdk.org;
>> vkaplans@redhat.com; mst@redhat.com
>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support
>> to the TX path
>>
>>
>>
>> On 11/02/2016 11:51 AM, Maxime Coquelin wrote:
>>>
>>>
>>> On 10/31/2016 11:01 AM, Wang, Zhihong wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>>>> Sent: Friday, October 28, 2016 3:42 PM
>>>>> To: Wang, Zhihong <zhihong.wang@intel.com>; Yuanhan Liu
>>>>> <yuanhan.liu@linux.intel.com>
>>>>> Cc: stephen@networkplumber.org; Pierre Pfister (ppfister)
>>>>> <ppfister@cisco.com>; Xie, Huawei <huawei.xie@intel.com>; dev@dpdk.org;
>>>>> vkaplans@redhat.com; mst@redhat.com
>>>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>>>>> support to the TX path
>>>>>
>>>>>
>>>>>
>>>>> On 10/28/2016 02:49 AM, Wang, Zhihong wrote:
>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
>>>>>>>> Sent: Thursday, October 27, 2016 6:46 PM
>>>>>>>> To: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>> Cc: Wang, Zhihong <zhihong.wang@intel.com>;
>>>>>>>> stephen@networkplumber.org; Pierre Pfister (ppfister)
>>>>>>>> <ppfister@cisco.com>; Xie, Huawei <huawei.xie@intel.com>; dev@dpdk.org;
>>>>>>>> vkaplans@redhat.com; mst@redhat.com
>>>>>>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>>>>>>>> support to the TX path
>>>>>>>>
>>>>>>>> On Thu, Oct 27, 2016 at 12:35:11PM +0200, Maxime Coquelin wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 10/27/2016 12:33 PM, Yuanhan Liu wrote:
>>>>>>>>>>>> On Thu, Oct 27, 2016 at 11:10:34AM +0200, Maxime Coquelin wrote:
>>>>>>>>>>>>>> Hi Zhihong,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 10/27/2016 11:00 AM, Wang, Zhihong wrote:
>>>>>>>>>>>>>>>> Hi Maxime,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Seems indirect desc feature is causing serious performance
>>>>>>>>>>>>>>>> degradation on Haswell platform, about 20% drop for both
>>>>>>>>>>>>>>>> mrg=on and mrg=off (--txqflags=0xf00, non-vector version),
>>>>>>>>>>>>>>>> both iofwd and macfwd.
>>>>>>>>>>>>>> I tested PVP (with macswap on guest) and Txonly/Rxonly on an
>>>>>>>>>>>>>> Ivy Bridge platform, and didn't face such a drop.
>>>>>>>>>>>>
>>>>>>>>>>>> I was actually wondering whether that may be the cause. I tested
>>>>>>>>>>>> it with my IvyBridge server as well, and saw no drop.
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe you should find a similar platform (Haswell) and have a
>>>>>>>>>>>> try?
>>>>>>>>>> Yes, that's why I asked Zhihong whether he could test Txonly in
>>>>>>>>>> guest to see if the issue is reproducible like this.
>>>>>>>>
>>>>>>>> I have no Haswell box, otherwise I could do a quick test for you.
>>>>>>>> IIRC,
>>>>>>>> he tried to disable the indirect_desc feature, then the performance
>>>>>>>> recovered. So, it's likely the indirect_desc is the culprit here.
>>>>>>>>
>>>>>>>>>> It will be easier for me to find a Haswell machine if it does not
>>>>>>>>>> have to be connected back to back to a HW/SW packet generator.
>>>>>> In fact a simple loopback test will also do, without a pktgen.
>>>>>>
>>>>>> Start testpmd in both host and guest, and do "start" in one
>>>>>> and "start tx_first 32" in another.
>>>>>>
>>>>>> Perf drop is about 24% in my test.
>>>>>>
>>>>>
>>>>> Thanks, I never tried this test.
>>>>> I managed to find a Haswell platform (Intel(R) Xeon(R) CPU E5-2699 v3
>>>>> @ 2.30GHz), and can reproduce the problem with the loopback test you
>>>>> mention. I see a performance drop of about 10% (8.94Mpps/8.08Mpps).
>>>>> Out of curiosity, what are the numbers you get with your setup?
>>>>
>>>> Hi Maxime,
>>>>
>>>> Let's align our test case to RC2, mrg=on, loopback, on Haswell.
>>>> My results below:
>>>>  1. indirect=1: 5.26 Mpps
>>>>  2. indirect=0: 6.54 Mpps
>>>>
>>>> It's about 24% drop.
>>> OK, so on my side, same setup on Haswell:
>>> 1. indirect=1: 7.44 Mpps
>>> 2. indirect=0: 8.18 Mpps
>>>
>>> Still 10% drop in my case with mrg=on.
>>>
>>> The strange thing with both of our figures is that they are below what
>>> I obtain with my SandyBridge machine. The SB cpu freq is 4% higher,
>>> but that doesn't explain the gap between the measurements.
>>>
>>> I'm continuing the investigations on my side.
>>> Maybe we should fix a deadline, and decide to disable indirect in
>>> Virtio PMD if the root cause is not identified/fixed at some point?
>>>
>>> Yuanhan, what do you think?
>>
>> I have done some measurements using perf, and now understand better
>> what happens.
>>
>> With indirect descriptors, I can see a cache miss when fetching the
>> descriptors in the indirect table. Actually, this is expected, so
>> we prefetch the first desc as soon as possible, but still not soon
>> enough to make it transparent.
>> In the direct descriptors case, the desc in the virtqueue seems to
>> remain in the cache from its previous use, so we have a hit.
>>
>> That said, in a realistic use-case, I think we should not have a hit,
>> even with direct descriptors.
>> Indeed, the test case uses testpmd on the guest side with the forwarding
>> set in IO mode. It means the packet content is never accessed by the
>> guest.
>>
>> In my experiments, I usually set the "macswap" forwarding mode, which
>> swaps src and dest MAC addresses in the packet. I find it more
>> realistic, because I don't see the point in sending packets to the guest
>> if they are not accessed (not even their headers).
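[Editor's note: the macswap forwarding mode discussed above amounts to swapping the two MAC addresses in each packet's Ethernet header. A minimal standalone sketch, not the actual testpmd code; the struct below is a local illustration of the standard 14-byte Ethernet header (DPDK's real type is struct ether_hdr in rte_ether.h):]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative Ethernet header layout (destination MAC, source MAC,
 * EtherType), matching the on-wire format. */
struct eth_hdr {
    uint8_t  dst[6];
    uint8_t  src[6];
    uint16_t ethertype;
};

/* Swap source and destination MAC addresses, as testpmd's "macswap"
 * forwarding mode does for every received packet. The point for the
 * benchmark is that this forces the guest to actually touch the
 * packet header, unlike IO-mode forwarding. */
static void macswap(struct eth_hdr *h)
{
    uint8_t tmp[6];

    memcpy(tmp, h->dst, 6);
    memcpy(h->dst, h->src, 6);
    memcpy(h->src, tmp, 6);
}
```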
>>
>> I tried the test case again, this time setting the forwarding mode
>> to macswap in the guest. This time, I get the same performance with
>> both direct and indirect (indirect even a little better with a small
>> optimization, consisting of systematically prefetching the first 2
>> descs, as we know they are contiguous).
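[Editor's note: the small optimization mentioned above, prefetching the first two descriptors of the indirect table together since they are contiguous in memory, could look roughly like this. A sketch only: it uses the GCC builtin that underlies prefetch helpers such as DPDK's rte_prefetch0(), and the struct mirrors the virtio spec's vring_desc layout but is local to this example:]

```c
#include <assert.h>
#include <stdint.h>

/* Descriptor layout as defined by the virtio specification
 * (struct vring_desc), reproduced here for illustration. */
struct vring_desc {
    uint64_t addr;   /* guest-physical buffer address */
    uint32_t len;    /* buffer length */
    uint16_t flags;  /* NEXT / WRITE / INDIRECT */
    uint16_t next;   /* index of the chained descriptor */
};

/* Prefetch the first two descriptors of an indirect table before
 * dereferencing them, to hide part of the cache-miss latency seen
 * when the table is fetched for the first time. Descriptors in an
 * indirect table are contiguous, so both prefetches are useful. */
static inline void prefetch_indirect(const struct vring_desc *descs)
{
    __builtin_prefetch(&descs[0], 0 /* read */, 3 /* keep in all caches */);
    __builtin_prefetch(&descs[1], 0, 3);
}
```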
>
>
> Hi Maxime,
>
> I did a little more macswap test and found out more stuff here:
Thanks for doing more tests.

>
>  1. I did a loopback test on another HSW machine with the same H/W,
>     and indirect_desc on and off seem to have close perf
>
>  2. So I checked the gcc version:
>
>      *  Previous: gcc version 6.2.1 20160916 (Fedora 24)
>
>      *  New: gcc version 5.4.0 20160609 (Ubuntu 16.04.1 LTS)

On my side, I tested with RHEL7.3:
  - gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)

It certainly contains some backports from newer GCC versions.

>
>     On previous one indirect_desc has 20% drop
>
>  3. Then I compiled the binary on Ubuntu and scp'd it to Fedora, and as
>     expected I got the same perf as on Ubuntu, and the perf gap
>     disappeared, so gcc is definitely one factor here
>
>  4. Then I used the Ubuntu binary on Fedora for the PVP test, and the
>     perf gap came back again, the same as with the Fedora binary
>     results: indirect_desc causes about 20% drop

Let me know if I understand correctly:
Loopback test with macswap:
  - gcc version 6.2.1 : 20% perf drop
  - gcc version 5.4.0 : No drop

PVP test with macswap:
  - gcc version 6.2.1 : 20% perf drop
  - gcc version 5.4.0 : 20% perf drop

>
> So in all, could you try PVP traffic on HSW to see how it works?
Sadly, the HSW machine I borrowed does not have another device connected 
back to back on its 10G port. I can only test PVP with SNB machines
currently.

>
>
>>
>> Do you agree we should assume that the packet (header and/or buf) will
>> always be accessed by the guest application?
>> If so, do you agree we should keep indirect descs enabled, and maybe
>> update the test cases?
>
>
> I agree with you that mac/macswap test is more realistic and makes
> more sense for real applications.

Thanks,
Maxime

Thread overview: 52+ messages
2016-09-23  8:28 [dpdk-dev] [PATCH v3] " Maxime Coquelin
2016-09-23 15:49 ` Michael S. Tsirkin
2016-09-23 18:02   ` Maxime Coquelin
2016-09-23 18:06     ` Michael S. Tsirkin
2016-09-23 18:16       ` Maxime Coquelin
2016-09-23 18:22         ` Michael S. Tsirkin
2016-09-23 20:24           ` Stephen Hemminger
2016-09-26  3:03             ` Yuanhan Liu
2016-09-26 12:25               ` Michael S. Tsirkin
2016-09-26 13:04                 ` Yuanhan Liu
2016-09-27  4:15 ` Yuanhan Liu
2016-09-27  7:25   ` Maxime Coquelin
2016-09-27  8:42 ` [dpdk-dev] [PATCH v4] " Maxime Coquelin
2016-09-27 12:18   ` Yuanhan Liu
2016-10-14  7:24   ` Wang, Zhihong
2016-10-14  7:34     ` Wang, Zhihong
2016-10-14 15:50     ` Maxime Coquelin
2016-10-17 11:23       ` Maxime Coquelin
2016-10-17 13:21         ` Yuanhan Liu
2016-10-17 14:14           ` Maxime Coquelin
2016-10-27  9:00             ` Wang, Zhihong
2016-10-27  9:10               ` Maxime Coquelin
2016-10-27  9:55                 ` Maxime Coquelin
2016-10-27 10:19                   ` Wang, Zhihong
2016-10-28  7:32                     ` Pierre Pfister (ppfister)
2016-10-28  7:58                       ` Maxime Coquelin
2016-11-01  8:15                         ` Yuanhan Liu
2016-11-01  9:39                           ` Thomas Monjalon
2016-11-02  2:44                             ` Yuanhan Liu
2016-10-27 10:33                 ` Yuanhan Liu
2016-10-27 10:35                   ` Maxime Coquelin
2016-10-27 10:46                     ` Yuanhan Liu
2016-10-28  0:49                       ` Wang, Zhihong
2016-10-28  7:42                         ` Maxime Coquelin
2016-10-31 10:01                           ` Wang, Zhihong
2016-11-02 10:51                             ` Maxime Coquelin
2016-11-03  8:11                               ` Maxime Coquelin
2016-11-04  6:18                                 ` Xu, Qian Q
2016-11-04  7:41                                   ` Maxime Coquelin
2016-11-04  7:20                                 ` Wang, Zhihong
2016-11-04  7:57                                   ` Maxime Coquelin [this message]
2016-11-04  7:59                                     ` Maxime Coquelin
2016-11-04 10:43                                       ` Wang, Zhihong
2016-11-04 11:22                                         ` Maxime Coquelin
2016-11-04 11:36                                           ` Yuanhan Liu
2016-11-04 11:39                                             ` Maxime Coquelin
2016-11-04 12:30                                           ` Wang, Zhihong
2016-11-04 12:54                                             ` Maxime Coquelin
2016-11-04 13:09                                               ` Wang, Zhihong
2016-11-08 10:51                                                 ` Wang, Zhihong
2016-10-27 10:53                   ` Maxime Coquelin
2016-10-28  6:05                     ` Xu, Qian Q
