From: "Wang, Zhihong"
To: Maxime Coquelin, Yuanhan Liu
CC: "stephen@networkplumber.org", "Pierre Pfister (ppfister)", "Xie, Huawei", "dev@dpdk.org", "vkaplans@redhat.com", "mst@redhat.com"
Date: Fri, 4 Nov 2016 10:43:16 +0000
Message-ID: <8F6C2BD409508844A0EFC19955BE09414E7DC5B6@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <7e1c8953-db15-f377-cece-85cb7169bb17@redhat.com>
Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path
List-Id: patches and discussions about DPDK

> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Friday, November 4, 2016 4:00 PM
> To: Wang, Zhihong; Yuanhan Liu
> Cc: stephen@networkplumber.org; Pierre Pfister (ppfister); Xie, Huawei;
> dev@dpdk.org; vkaplans@redhat.com; mst@redhat.com
> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support
> to the TX path
>
> On 11/04/2016 08:57 AM, Maxime Coquelin wrote:
> > Hi Zhihong,
> >
> > On 11/04/2016 08:20 AM, Wang, Zhihong wrote:
> >>
> >>> -----Original Message-----
> >>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> >>> Sent: Thursday, November 3, 2016 4:11 PM
> >>> To: Wang, Zhihong; Yuanhan Liu
> >>> Cc: stephen@networkplumber.org; Pierre Pfister (ppfister); Xie, Huawei;
> >>> dev@dpdk.org; vkaplans@redhat.com; mst@redhat.com
> >>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to
> >>> the TX path
> >>>
> >>> On 11/02/2016 11:51 AM, Maxime Coquelin wrote:
> >>>>
> >>>> On 10/31/2016 11:01 AM, Wang, Zhihong wrote:
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> >>>>>> Sent: Friday, October 28, 2016 3:42 PM
> >>>>>> To: Wang, Zhihong; Yuanhan Liu
> >>>>>> Cc: stephen@networkplumber.org; Pierre Pfister (ppfister); Xie, Huawei;
> >>>>>> dev@dpdk.org; vkaplans@redhat.com; mst@redhat.com
> >>>>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
> >>>>>> support to the TX path
> >>>>>>
> >>>>>> On 10/28/2016 02:49 AM, Wang, Zhihong wrote:
> >>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> >>>>>>>>> Sent: Thursday, October 27, 2016 6:46 PM
> >>>>>>>>> To: Maxime Coquelin
> >>>>>>>>> Cc: Wang, Zhihong; stephen@networkplumber.org;
> >>>>>>>>> Pierre Pfister (ppfister); Xie, Huawei; dev@dpdk.org;
> >>>>>>>>> vkaplans@redhat.com; mst@redhat.com
> >>>>>>>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
> >>>>>>>>> support to the TX path
> >>>>>>>>>
> >>>>>>>>> On Thu, Oct 27, 2016 at 12:35:11PM +0200, Maxime Coquelin wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> On 10/27/2016 12:33 PM, Yuanhan Liu wrote:
> >>>>>>>>>>>>> On Thu, Oct 27, 2016 at 11:10:34AM +0200, Maxime Coquelin wrote:
> >>>>>>>>>>>>>>> Hi Zhihong,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 10/27/2016 11:00 AM, Wang, Zhihong wrote:
> >>>>>>>>>>>>>>>>> Hi Maxime,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Seems indirect desc feature is causing serious performance
> >>>>>>>>>>>>>>>>> degradation on Haswell platform, about 20% drop for both
> >>>>>>>>>>>>>>>>> mrg=on and mrg=off (--txqflags=0xf00, non-vector version),
> >>>>>>>>>>>>>>>>> both iofwd and macfwd.
> >>>>>>>>>>>>>>> I tested PVP (with macswap on guest) and Txonly/Rxonly on an
> >>>>>>>>>>>>>>> Ivy Bridge platform, and didn't face such a drop.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I was actually wondering whether that may be the cause. I tested
> >>>>>>>>>>>>> it with my IvyBridge server as well, and I saw no drop.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Maybe you should find a similar platform (Haswell) and have a
> >>>>>>>>>>>>> try?
> >>>>>>>>>>> Yes, that's why I asked Zhihong whether he could test Txonly in
> >>>>>>>>>>> guest to see if the issue is reproducible like this.
> >>>>>>>>>
> >>>>>>>>> I have no Haswell box, otherwise I could do a quick test for you.
> >>>>>>>>> IIRC, he tried to disable the indirect_desc feature, then the
> >>>>>>>>> performance recovered. So, it's likely the indirect_desc is the
> >>>>>>>>> culprit here.
> >>>>>>>>>
> >>>>>>>>>>> It will be easier for me to find a Haswell machine if it does not
> >>>>>>>>>>> have to be connected back to back to a HW/SW packet generator.
> >>>>>>> In fact a simple loopback test will also do, without pktgen.
> >>>>>>>
> >>>>>>> Start testpmd in both host and guest, and do "start" in one
> >>>>>>> and "start tx_first 32" in another.
> >>>>>>>
> >>>>>>> Perf drop is about 24% in my test.
> >>>>>>>
> >>>>>>
> >>>>>> Thanks, I never tried this test.
> >>>>>> I managed to find a Haswell platform (Intel(R) Xeon(R) CPU E5-2699 v3
> >>>>>> @ 2.30GHz), and can reproduce the problem with the loop test you
> >>>>>> mention. I see a performance drop of about 10% (8.94Mpps/8.08Mpps).
> >>>>>> Out of curiosity, what are the numbers you get with your setup?
> >>>>>
> >>>>> Hi Maxime,
> >>>>>
> >>>>> Let's align our test case to RC2, mrg=on, loopback, on Haswell.
> >>>>> My results below:
> >>>>> 1. indirect=1: 5.26 Mpps
> >>>>> 2. indirect=0: 6.54 Mpps
> >>>>>
> >>>>> It's about a 24% drop.
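
The percentages quoted so far depend on which figure is taken as the
denominator. The following is a minimal sketch in C of both conventions,
applied to the Mpps figures above. The function names are made up for
illustration and are not part of DPDK or testpmd. Measured against the
indirect=0 baseline, 6.54 -> 5.26 Mpps is a drop of roughly 19.6%; the
"about 24%" figure appears to correspond to dividing by the indirect=1
result instead.

```c
#include <assert.h>
#include <stdio.h>

/* Throughput drop in percent, relative to the baseline (indirect=0)
 * figure. Both arguments are in Mpps. */
double drop_vs_baseline(double baseline_mpps, double measured_mpps)
{
    assert(baseline_mpps > 0.0);
    return (baseline_mpps - measured_mpps) / baseline_mpps * 100.0;
}

/* The same gap expressed relative to the slower (indirect=1) figure,
 * i.e. how much faster the baseline is. */
double gain_vs_measured(double baseline_mpps, double measured_mpps)
{
    assert(measured_mpps > 0.0);
    return (baseline_mpps - measured_mpps) / measured_mpps * 100.0;
}

/* Print both conventions side by side for one pair of measurements. */
void report_gap(const char *label, double baseline_mpps, double measured_mpps)
{
    printf("%s: %.1f%% below baseline, baseline %.1f%% faster\n",
           label,
           drop_vs_baseline(baseline_mpps, measured_mpps),
           gain_vs_measured(baseline_mpps, measured_mpps));
}
```

Calling report_gap("Zhihong", 6.54, 5.26) and report_gap("Maxime", 8.94, 8.08)
gives roughly 19.6% / 24.3% and 9.6% / 10.6% respectively, so stating the
denominator avoids comparing numbers computed in different ways.
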
> >>>> OK, so on my side, same setup on Haswell:
> >>>> 1. indirect=1: 7.44 Mpps
> >>>> 2. indirect=0: 8.18 Mpps
> >>>>
> >>>> Still a 10% drop in my case with mrg=on.
> >>>>
> >>>> The strange thing with both of our figures is that they are below
> >>>> what I obtain with my SandyBridge machine. The SB CPU frequency is 4%
> >>>> higher, but that doesn't explain the gap between the measurements.
> >>>>
> >>>> I'm continuing the investigations on my side.
> >>>> Maybe we should fix a deadline, and decide to disable indirect in
> >>>> Virtio PMD if the root cause is not identified/fixed at some point?
> >>>>
> >>>> Yuanhan, what do you think?
> >>>
> >>> I have done some measurements using perf, and now understand better
> >>> what happens.
> >>>
> >>> With indirect descriptors, I can see a cache miss when fetching the
> >>> descriptors in the indirect table. Actually, this is expected, so
> >>> we prefetch the first desc as soon as possible, but still not soon
> >>> enough to make it transparent.
> >>> In the direct descriptors case, the desc in the virtqueue seems to
> >>> remain in the cache from its previous use, so we have a hit.
> >>>
> >>> That said, in a realistic use case, I think we should not have a hit,
> >>> even with direct descriptors.
> >>> Indeed, the test case uses testpmd on the guest side with the forwarding
> >>> set in IO mode. It means the packet content is never accessed by the
> >>> guest.
> >>>
> >>> In my experiments, I usually set the "macswap" forwarding mode, which
> >>> swaps src and dest MAC addresses in the packet. I find it more
> >>> realistic, because I don't see the point in sending packets to the guest
> >>> if they are not accessed (not even their headers).
> >>>
> >>> I tried the test case again, this time setting the forwarding mode
> >>> to macswap in the guest.
> >>> This time, I get the same performance with both
> >>> direct and indirect (indirect even a little better with a small
> >>> optimization, consisting in prefetching the first 2 descs
> >>> systematically as we know they are contiguous).
> >>
> >> Hi Maxime,
> >>
> >> I did a little more macswap testing and found out more stuff here:
> > Thanks for doing more tests.
> >
> >>
> >> 1. I did the loopback test on another HSW machine with the same H/W,
> >>    and indirect_desc on and off seem to have close perf
> >>
> >> 2. So I checked the gcc version:
> >>
> >>    * Previous: gcc version 6.2.1 20160916 (Fedora 24)
> >>
> >>    * New: gcc version 5.4.0 20160609 (Ubuntu 16.04.1 LTS)
> >
> > On my side, I tested with RHEL7.3:
> >  - gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
> >
> > It certainly contains some backports from newer GCC versions.
> >
> >>
> >>    On the previous one indirect_desc has a 20% drop
> >>
> >> 3. Then I compiled the binary on Ubuntu and scp'd it to Fedora, and as
> >>    expected I got the same perf as on Ubuntu, and the perf gap
> >>    disappeared, so gcc is definitely one factor here
> >>
> >> 4. Then I used the Ubuntu binary on Fedora for the PVP test, and the
> >>    perf gap came back, the same as with the Fedora binary:
> >>    indirect_desc causes about a 20% drop
> >
> > Let me know if I understand correctly:

Yes, and it's hard to break down further at this time. Also we may need
to check whether it's caused by a certain NIC model. Unfortunately I don't
have the right setup right now.

> > Loopback test with macswap:
> >  - gcc version 6.2.1 : 20% perf drop
> >  - gcc version 5.4.0 : No drop
> >
> > PVP test with macswap:
> >  - gcc version 6.2.1 : 20% perf drop
> >  - gcc version 5.4.0 : 20% perf drop
>
> I forgot to ask, did you recompile only host, or both host and guest
> testpmd's in your test?

Both.

> >
> >>
> >> So in all, could you try PVP traffic on HSW to see how it works?
> > Sadly, the HSW machine I borrowed does not have another device connected
> > back to back on its 10G port. I can only test PVP with SNB machines
> > currently.
> >
> >>
> >>>
> >>> Do you agree we should assume that the packet (header and/or buf) will
> >>> always be accessed by the guest application?
> >>> If so, do you agree we should keep indirect descs enabled, and maybe
> >>> update the test cases?
> >>
> >> I agree with you that the mac/macswap test is more realistic and makes
> >> more sense for real applications.
> >
> > Thanks,
> > Maxime
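
The prefetch optimization mentioned in the quoted discussion (touching the
first two descriptors of an indirect table early, since they are
contiguous) might look roughly like the sketch below. This is not the
actual DPDK vhost code. The descriptor layout and flag values follow the
virtio split-virtqueue definition, but the helper name
prefetch_indirect_table() is hypothetical, and the sketch assumes that
desc->addr is already a valid host virtual address, whereas a real vhost
backend has to translate the guest address first. __builtin_prefetch is
the GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values and descriptor layout of a virtio split virtqueue. */
#define VRING_DESC_F_NEXT     1
#define VRING_DESC_F_INDIRECT 4

struct vring_desc {
    uint64_t addr;   /* buffer address (guest-physical in a real vhost) */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;  /* VRING_DESC_F_* bits */
    uint16_t next;   /* index of the next descriptor in the chain */
};

/*
 * Hypothetical helper: if `desc` refers to an indirect table, issue
 * prefetch hints for its first two entries and return the table,
 * otherwise return NULL.
 * Assumption: desc->addr can be used directly as a host pointer.
 */
struct vring_desc *prefetch_indirect_table(const struct vring_desc *desc)
{
    struct vring_desc *table;

    assert(desc != NULL);
    if (!(desc->flags & VRING_DESC_F_INDIRECT))
        return NULL;

    table = (struct vring_desc *)(uintptr_t)desc->addr;

    /* Read prefetch with high temporal locality for the first entry. */
    __builtin_prefetch(&table[0], 0, 3);
    /* Entries are contiguous, so the second one sits at table + 1;
     * only touch it if the table is large enough to contain it. */
    if (desc->len >= 2 * sizeof(*table))
        __builtin_prefetch(&table[1], 0, 3);

    return table;
}
```

The prefetch is only a hint, so whether it hides the cache miss depends on
how much work is done between issuing it and reading the descriptor, which
is the "not soon enough" effect described in the thread.
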