From: Maxime Coquelin <maxime.coquelin@redhat.com>
Date: Fri, 28 Oct 2016 09:58:51 +0200
To: "Pierre Pfister (ppfister)", "Wang, Zhihong"
Cc: mst@redhat.com, dev@dpdk.org, vkaplans@redhat.com
Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path
Message-ID: <108f3c0a-bef5-5124-fde6-01fa9870c970@redhat.com>
In-Reply-To: <79DA8EFC-215C-4075-8D1A-FF81EC3CBB21@cisco.com>

On 10/28/2016 09:32 AM, Pierre Pfister (ppfister) wrote:
>
>> On 27 Oct 2016, at 12:19, Wang, Zhihong wrote:
>>
>>> -----Original Message-----
>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>> Sent: Thursday, October 27, 2016 5:55 PM
>>> To: Wang, Zhihong; Yuanhan Liu; stephen@networkplumber.org;
>>> Pierre Pfister (ppfister)
>>> Cc: Xie, Huawei; dev@dpdk.org; vkaplans@redhat.com; mst@redhat.com
>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>>> support to the TX path
>>>
>>> On 10/27/2016 11:10 AM, Maxime Coquelin wrote:
>>>> Hi Zhihong,
>>>>
>>>> On 10/27/2016 11:00 AM, Wang, Zhihong wrote:
>>>>> Hi Maxime,
>>>>>
>>>>> Seems the indirect desc feature is causing serious performance
>>>>> degradation on the Haswell platform, about a 20% drop for both
>>>>> mrg=on and mrg=off (--txqflags=0xf00, non-vector version),
>>>>> both iofwd and macfwd.
>>>> I tested PVP (with macswap on guest) and Txonly/Rxonly on an Ivy Bridge
>>>> platform, and didn't face such a drop.
>>>> Have you tried to pass indirect_desc=off to the qemu cmdline to see if
>>>> you recover the performance?
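For reference, indirect_desc is a per-device virtio property in QEMU, so
disabling it on a vhost-user port looks roughly like the fragment below
(the socket path and ids are placeholders):

    qemu-system-x86_64 ... \
        -chardev socket,id=char0,path=/tmp/vhost-user.sock \
        -netdev type=vhost-user,id=net0,chardev=char0 \
        -device virtio-net-pci,netdev=net0,indirect_desc=off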
>>>>
>>>> Yuanhan, which platform did you use when you tested it with zero copy?
>>>>
>>>>> I'm using RC2, and the CPU is a Xeon E5-2699 v3 @ 2.30GHz.
>>>>>
>>>>> Could you please verify if this is true in your test?
>>>> I'll try -rc1/-rc2 on my platform, and let you know.
>>> As a first test, I tried Txonly again from the guest to the host
>>> (Rxonly), where Tx indirect descriptors are used, on my E5-2665 @2.40GHz:
>>> v16.11-rc1: 10.81Mpps
>>> v16.11-rc2: 10.91Mpps
>>>
>>> -rc2 is even slightly better in my case.
>>> Could you please run the same test on your platform?
>>
>> I mean to use rc2 as both host and guest, and compare the
>> perf between indirect=0 and indirect=1.
>>
>> I use PVP traffic, and tried both testpmd and OvS as the forwarding
>> engine in the host, with testpmd in the guest.
>>
>> Thanks
>> Zhihong
>
> From my experience, and as Michael pointed out, the best mode for small
> packets is obviously ANY_LAYOUT, so that a single descriptor is used per
> packet.

Of course, having a single descriptor is in theory the best way.

But in the current Virtio PMD implementation, with no offload supported, we
never access the virtio header at transmit time: it is allocated and zeroed
at startup.

In the ANY_LAYOUT case, the virtio header is prepended to the packet and
needs to be zeroed at packet transmit time. The performance impact is quite
significant, as shown by the measurements I made one month ago (txonly):
 - 2 descs per packet: 11.6Mpps
 - 1 desc per packet: 9.6Mpps

As Michael suggested, I tried to replace the memset with direct field
assignments, but it only recovers a small part of the drop.

What I suggested is to introduce a new feature, so that we can skip the
virtio header when no offload is negotiated.

Maybe you have other ideas?
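To make the difference concrete, here is a minimal sketch of the two
header-handling strategies; it is simplified, not the actual PMD code: the
struct is abbreviated and all the ring plumbing is elided.

    #include <stdint.h>
    #include <string.h>
    #include <rte_mbuf.h>

    /* Abbreviated stand-in for the virtio-net header the device expects. */
    struct virtio_net_hdr {
            uint8_t  flags;
            uint8_t  gso_type;
            uint16_t hdr_len;
            uint16_t gso_size;
            uint16_t csum_start;
            uint16_t csum_offset;
    };

    /* 2-desc layout: headers live in a dedicated region that is zeroed
     * once at queue setup; the first descriptor of each chain just points
     * into it, so transmit never writes the header again. */
    static void setup_headers(struct virtio_net_hdr *hdrs, uint16_t nb_slots)
    {
            memset(hdrs, 0, nb_slots * sizeof(*hdrs)); /* one-time cost */
    }

    /* ANY_LAYOUT 1-desc layout: the header is prepended in the mbuf
     * headroom, so it must be cleared again for every single packet. */
    static void tx_prepend_header(struct rte_mbuf *m)
    {
            struct virtio_net_hdr *hdr = rte_pktmbuf_mtod_offset(m,
                    struct virtio_net_hdr *, -(int)sizeof(*hdr));
            memset(hdr, 0, sizeof(*hdr)); /* per-packet cost */
    }

The feature proposed above would let the 1-desc path skip that per-packet
memset when no offload is negotiated.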
> So, disabling indirect descriptors may give you better pps for 64-byte
> packets, but that doesn't mean you should not implement, or enable, it in
> your driver. It just means that the guest is not taking the right
> decision, and uses indirect while it should actually use any_layout.

+1, it really depends on the use-case.

> Given the virtio/vhost design (most decisions come from the guest), the
> host should be liberal in what it accepts, and not try to influence the
> guest implementation by carefully picking the features it supports.
> Otherwise guests will never get a chance to make the right decisions
> either.

Agree. What we need is to be able to disable Virtio PMD features without
having to rebuild the PMD. It will certainly require an API change to add
this option.

Thanks,
Maxime

>
> - Pierre
>
>>
>>> And could you provide me more info on your fwd bench?
>>> Do you use dpdk-pktgen on the host, or do you do fwd on the host with a
>>> real NIC also?
>>>
>>> Thanks,
>>> Maxime
>>>> Thanks,
>>>> Maxime
>>>>
>>>>> Thanks
>>>>> Zhihong
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>>>>> Sent: Monday, October 17, 2016 10:15 PM
>>>>>> To: Yuanhan Liu
>>>>>> Cc: Wang, Zhihong; Xie, Huawei; dev@dpdk.org; vkaplans@redhat.com;
>>>>>> mst@redhat.com; stephen@networkplumber.org
>>>>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>>>>>> support to the TX path
>>>>>>
>>>>>> On 10/17/2016 03:21 PM, Yuanhan Liu wrote:
>>>>>>> On Mon, Oct 17, 2016 at 01:23:23PM +0200, Maxime Coquelin wrote:
>>>>>>>>> On my side, I just set up 2 Windows 2016 VMs, and confirmed the
>>>>>>>>> issue. I'll continue the investigation early next week.
>>>>>>>>
>>>>>>>> The root cause is identified.
>>>>>>>> When the INDIRECT_DESC feature is negotiated, Windows guests use
>>>>>>>> indirect for both Tx and Rx descriptors, whereas Linux guests
>>>>>>>> (Virtio PMD & virtio-net kernel driver) use indirect only for Tx.
>>>>>>>> I'll implement indirect support for the Rx path in the vhost lib,
>>>>>>>> but the change will be too big for the -rc release.
>>>>>>>> I propose in the meantime to disable the INDIRECT_DESC feature in
>>>>>>>> the vhost lib; we can still enable it locally for testing.
>>>>>>>>
>>>>>>>> Yuanhan, is it ok for you?
>>>>>>>
>>>>>>> That's okay.
>>>>>> I'll send a patch to disable it then.
>>>>>>
>>>>>>>>> Has anyone already tested a Windows guest with vhost-net, which
>>>>>>>>> also has indirect descs support?
>>>>>>>>
>>>>>>>> I tested and confirm it works with vhost-net.
>>>>>>>
>>>>>>> I'm a bit confused then. IIRC, vhost-net also doesn't support
>>>>>>> indirect for the Rx path, right?
>>>>>>
>>>>>> No, it actually does support it.
>>>>>> I thought it didn't either; I had misread the kernel implementations
>>>>>> of vhost-net and virtio-net. Actually, virtio-net makes use of
>>>>>> indirect in the Rx path when mergeable buffers are disabled.
>>>>>>
>>>>>> The confusion certainly comes from me, sorry about that.
>>>>>>
>>>>>> Maxime
>
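Regarding the stop-gap proposed above, disabling INDIRECT_DESC from the
application side should be a one-liner with the vhost lib's feature API; a
minimal sketch, assuming the rte_vhost_feature_disable() API present in the
16.11-era vhost lib:

    #include <stdint.h>
    #include <linux/virtio_ring.h>  /* VIRTIO_RING_F_INDIRECT_DESC */
    #include <rte_virtio_net.h>     /* vhost lib API in the 16.11 timeframe */

    /* Mask the feature bit before any vhost-user socket is registered,
     * so INDIRECT_DESC is never offered during negotiation. */
    static void vhost_disable_indirect(void)
    {
            rte_vhost_feature_disable(1ULL << VIRTIO_RING_F_INDIRECT_DESC);
    }

A guest can then still be forced either way via the indirect_desc device
property shown earlier, which is handy for A/B testing.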