Date: Fri, 8 Sep 2017 08:48:50 +0800
From: Tiwei Bie
To: Maxime Coquelin
Cc: dev@dpdk.org, yliu@fridaylinux.org, Zhihong Wang, Zhiyong Yang
Message-ID: <20170908004849.GA18498@debian-ZGViaWFuCg>
References: <20170824021939.21306-1-tiwei.bie@intel.com>
Subject: Re: [dpdk-dev] [PATCH] vhost: adaptively batch small guest memory copies

Hi Maxime,

On Thu, Sep 07, 2017 at 07:47:57PM +0200, Maxime Coquelin wrote:
> Hi Tiwei,
>
> On 08/24/2017 04:19 AM, Tiwei Bie wrote:
> > This patch adaptively batches the small guest memory copies.
> > By batching the small copies, the efficiency of executing the
> > memory LOAD instructions can be improved greatly, because the
> > memory LOAD latency can be effectively hidden by the pipeline.
> > We saw great performance boosts for small packets PVP test.
> >
> > This patch improves the performance for small packets, and has
> > distinguished the packets by size. So although the performance
> > for big packets doesn't change, it makes it relatively easy to
> > do some special optimizations for the big packets too.
> >
> > Signed-off-by: Tiwei Bie
> > Signed-off-by: Zhihong Wang
> > Signed-off-by: Zhiyong Yang
> > ---
> > This optimization depends on the CPU internal pipeline design.
> > So further tests (e.g. ARM) from the community is appreciated.
> >
> >   lib/librte_vhost/vhost.c      |   2 +-
> >   lib/librte_vhost/vhost.h      |  13 +++
> >   lib/librte_vhost/vhost_user.c |  12 +++
> >   lib/librte_vhost/virtio_net.c | 240 ++++++++++++++++++++++++++++++++----------
> >   4 files changed, 209 insertions(+), 58 deletions(-)
>
> I did some PVP benchmark with your patch.
> First I tried my standard PVP setup, with io forwarding on host and
> macswap on guest in bidirectional mode.
>
> With this, I notice no improvement (18.8Mpps), but I think it explains
> because guest is the bottleneck here.
> So I change my setup to do csum forwarding on host side, so that host's
> PMD threads are more loaded.
>
> In this case, I notice a great improvement, I get 18.8Mpps with your
> patch instead of 14.8Mpps without! Great work!
>
> Reviewed-by: Maxime Coquelin
>

Thank you very much for taking time to review and test this patch! :-)

Best regards,
Tiwei Bie
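
For readers without the patch at hand, the batching idea in the quoted commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: the names `copy_elem`, `copy_batch`, `batch_copy`, `batch_flush`, and the values of `BATCH_MAX` and `SMALL_COPY_LEN` are all hypothetical. Small copies are queued and later executed back to back in one tight loop, while large copies are done immediately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names and constants here are hypothetical, chosen for illustration. */
#define BATCH_MAX      64   /* assumed capacity of the deferred-copy queue */
#define SMALL_COPY_LEN 256  /* assumed size threshold for a "small" copy   */

/* One deferred copy: destination, source, and length in bytes. */
struct copy_elem {
    void *dst;
    const void *src;
    size_t len;
};

/* Queue of small copies waiting to be executed together. */
struct copy_batch {
    struct copy_elem elems[BATCH_MAX];
    unsigned int count;
};

/* Execute every queued copy in one loop, then empty the queue. */
static void batch_flush(struct copy_batch *b)
{
    assert(b->count <= BATCH_MAX);
    for (unsigned int i = 0; i < b->count; i++)
        memcpy(b->elems[i].dst, b->elems[i].src, b->elems[i].len);
    b->count = 0;
}

/*
 * Queue a small copy, or copy right away when the request is large or the
 * queue is full. The caller must keep src and dst valid and unmodified
 * until batch_flush() runs, and this sketch assumes the buffers of
 * different requests do not overlap (deferred copies may otherwise be
 * reordered relative to immediate ones).
 */
static void batch_copy(struct copy_batch *b, void *dst, const void *src,
                       size_t len)
{
    if (len > SMALL_COPY_LEN || b->count == BATCH_MAX) {
        memcpy(dst, src, len);
        return;
    }
    b->elems[b->count].dst = dst;
    b->elems[b->count].src = src;
    b->elems[b->count].len = len;
    b->count++;
}
```

In such a scheme the caller would invoke `batch_flush()` once per burst, before the copied data is made visible to the other side. Whether this actually helps depends on the CPU pipeline, as the commit message notes.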