From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id C23555589 for ; Wed, 17 Aug 2016 11:17:49 +0200 (CEST) Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 20D0380B22; Wed, 17 Aug 2016 09:17:49 +0000 (UTC) Received: from [10.36.4.8] (vpn1-4-8.ams2.redhat.com [10.36.4.8]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u7H9HkJQ001783 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 17 Aug 2016 05:17:48 -0400 To: "Wang, Zhihong" , Yuanhan Liu References: <1471319402-112998-1-git-send-email-zhihong.wang@intel.com> <8F6C2BD409508844A0EFC19955BE09411077206B@SHSMSX103.ccr.corp.intel.com> <20160817023825.GO30752@yliu-dev.sh.intel.com> <8F6C2BD409508844A0EFC19955BE09411077220A@SHSMSX103.ccr.corp.intel.com> Cc: "dev@dpdk.org" From: Maxime Coquelin Message-ID: <020de331-94f0-049a-6e7d-30825faf54dd@redhat.com> Date: Wed, 17 Aug 2016 11:17:46 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <8F6C2BD409508844A0EFC19955BE09411077220A@SHSMSX103.ccr.corp.intel.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Wed, 17 Aug 2016 09:17:49 +0000 (UTC) Subject: Re: [dpdk-dev] [PATCH] optimize vhost enqueue X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Aug 2016 09:17:50 -0000 On 08/17/2016 08:41 AM, Wang, Zhihong wrote: > > >> -----Original Message----- >> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com] >> Sent: Wednesday, August 17, 2016 10:38 AM >> To: Wang, Zhihong >> Cc: Maxime Coquelin ; dev@dpdk.org >> Subject: Re: [dpdk-dev] [PATCH] optimize vhost enqueue >> >> On Wed, Aug 17, 2016 at 01:45:26AM +0000, Wang, Zhihong wrote: >>> >>> >>>> -----Original Message----- >>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com] >>>> Sent: Tuesday, August 16, 2016 10:00 PM >>>> To: Wang, Zhihong ; dev@dpdk.org >>>> Subject: Re: [dpdk-dev] [PATCH] optimize vhost enqueue >>>> >>>> Hi Zhihong, >>>> >>>> On 08/16/2016 05:50 AM, Zhihong Wang wrote: >>>>> This patch optimizes the vhost enqueue function: >> rte_vhost_enqueue_burst. >>>>> >>>>> Currently there're 2 callbacks for vhost enqueue: >>>>> * virtio_dev_merge_rx for mrg_rxbuf turned on cases. >>>>> * virtio_dev_rx for mrg_rxbuf turned off cases. >>>>> >>>>> The virtio_dev_merge_rx doesn't provide optimal performance, also it is >>>>> reported having compatibility issue working with Windows VMs. >>>> Could you tell us more please about this compatibility issue? >>> >>> >>> For example, when you have testpmd in the host and Window VM as the >> guest, >>> with mrg_rxbuf turned on, the guest will hang once there's packets enqueued >>> by virtio_dev_merge_rx. >> >> You should put it into commit log. > > > Okay. > > >> >>> Let me know if you see the same issue. >>> >>> >>>> >>>>> >>>>> Besides, having 2 separated functions increases maintenance efforts. >>>>> >>>>> This patch uses a single function logic to replace the current 2 for >>>>> better maintainability, and provides better performance by optimizing >>>>> caching behavior especially for mrg_rxbuf turned on cases. >> >> Here, here sounds two parts to me: >> >> - one to unite mergeable and non-mergeable Rx >> >> - another one to optimize the mergeable path >> >> That means you should do it in two patches, with that we can have clear >> understanding what changes the performance boost. It also helps review. > > > Please see explanation below. > > >> >>>> Do you have some benchmark comparison before and after your change? >>>> >>>> Also, for maintainability, I would suggest the that the enqueue >>>> function be split. Because vhost_enqueue_burst becomes very long (220 >>>> LoC), and max level of indentation is too high (6). >>>> >>>> It makes the code hard to understand, and prone to miss bugs during >>>> review and maintenance. >> >> Agreed. >> >>> >>> This is something I've thought about while writing the code, the reason I >>> keep it as one function body is that: >>> >>> 1. This function is very performance sensitive, and we need full control of >>> code ordering (You can compare with the current performance with the >>> mrg_rxbuf feature turned on to see the difference). >> >> Will inline functions help? > > > Optimization in this patch actually reorganizes the code from its logic, > so it's not suitable for making separated functions. > > I'll explain this in v2. I agree with Yuanhan. Inline functions should not break the optimizations. IMHO, this is mandatory for the patch to be accepted. > > >> >>> 2. I somehow find that a single function logic makes it easier to understand, >>> surely I can add comments to make it easiler to read for . >>> >>> Please let me know if you still insist, we can discuss more on it. >> >> I am personally not a fan of huge function; I would try hard to avoid >> too many levels of indentation as well. >> >>> >>>> >>>>> >>>>> It also fixes the issue working with Windows VMs. >>>> Ideally, the fix should be sent separately, before the rework. >>>> Indeed, we might want to have the fix in the stable branch, without >>>> picking the optimization. >> >> Agreed. > > > The fact is that I don't have much time to debug with the current code > since it's messy and I don't have Windows virtio code and the debugging > environment. It seems you are not the only one facing the issue: https://github.com/YanVugenfirer/kvm-guest-drivers-windows/issues/70 So a dedicated fix is really important. > This patch doesn't try to fix this issue, it rewrites the logic totally, > and somehow fixes this issue. > > Do you think integrating this whole patch into the stable branch will work? > Personally I think it makes more sense. No. We don't even know why/how it fixes the Windows issue, which would be the first thing to understand before integrating a fix in stable branch. And the stable branch is not meant for integrating such big reworks, it is only meant to fix bugs. The risk of regressions have to be avoided as much as possible. > > >> >>>> >>>>> >>>>> Signed-off-by: Zhihong Wang >>>>> --- >>>>> lib/librte_vhost/vhost-net.h | 6 +- >>>>> lib/librte_vhost/vhost_rxtx.c | 582 >> ++++++++++++++---------------------------- >>>>> lib/librte_vhost/virtio-net.c | 15 +- >>>>> 3 files changed, 208 insertions(+), 395 deletions(-) >>>> 582 lines changed is a huge patch. >>>> If possible, it would be better splitting it in incremental changes, >>>> making the review process easier. >>> >>> >>> It looks like a huge patch, but it simply deletes the current implementation >>> and add the new code. I think perhaps split it into 2, 1st one to replace >>> just the rte_vhost_enqueue_burst, 2nd one to delete all the obsolete >> functions. >>> It should make the patch clear, how do you think? :) >> >> Nope, it's not working in that way. It should be: >> >> - one patch to fix the hang issue for windows guest >> >> Please cc it to stable@dpdk.org as well so that we could pick it for >> v16.07 stable release. >> >> - one patch to unite the two different Rx code path >> >> - another patch to optimize mergeable code path > > > I can separate optimization from the basic code in v2, however as I explained > this patch is built from scratch and doesn't take anything from the existing > code, so there's no way to transform from the existing code incrementally into > the new code. > > >> >> --yliu