From: Jens Freimann
To: Tiwei Bie
Cc: dev@dpdk.org, maxime.coquelin@redhat.com, Gavin.Hu@arm.com, zhihong.wang@intel.com
Subject: Re: [dpdk-dev] [PATCH v6 00/11] implement packed virtqueues
Date: Fri, 21 Sep 2018 16:06:44 +0200
Message-ID: <20180921140644.oiwl7gwekreflc7v@jenstp.localdomain>
References: <20180921103308.16357-1-jfreimann@redhat.com> <20180921123222.GA25292@debian>
In-Reply-To: <20180921123222.GA25292@debian>

On Fri, Sep 21, 2018 at 08:32:22PM +0800, Tiwei Bie wrote:
>On Fri, Sep 21, 2018 at 12:32:57PM +0200, Jens Freimann wrote:
>> This is a basic implementation of packed virtqueues as specified in the
>> Virtio 1.1 draft. A compiled version of the current draft is available
>> at https://github.com/oasis-tcs/virtio-docs.git (or as .pdf at
>> https://github.com/oasis-tcs/virtio-docs/blob/master/virtio-v1.1-packed-wd10.pdf).
>>
>> A packed virtqueue differs from a split virtqueue in that it consists
>> of only a single descriptor ring, which replaces the available and used
>> rings, their indexes and the descriptor buffer.
>>
>> Each descriptor is readable and writable and has a flags field. These
>> flags mark whether a descriptor is available or used. To detect new
>> available descriptors even after the ring has wrapped, device and
>> driver each keep a single-bit wrap counter that is flipped from 0 to 1
>> and vice versa every time the last descriptor in the ring is used or
>> made available.
>>
>> The idea behind this is to 1) improve performance by avoiding cache
>> misses and 2) make the ring easier for devices to implement.
>>
>> Regarding performance: with these patches I get 21.13 Mpps on my system,
>> compared to 18.8 Mpps with the virtio 1.0 code. Packet size was 64
>> bytes.
>
>Did you enable multiple queues and use multiple cores on the
>vhost side? If not, I guess the above performance gain is a
>gain on the vhost side rather than the virtio side.

I tested several variations back then and they all looked very good.
But the code has changed a lot in the meantime, and I need to do more
benchmarking in any case.
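As a quick refresher for the list, the layout and availability check
described in the cover letter boil down to something like the sketch
below. It follows the virtio 1.1 draft and is illustrative only; names
and details may differ from the actual patches.

/*
 * Packed-ring descriptor layout and availability check, per the
 * virtio 1.1 draft (illustrative sketch, not the PMD code).
 */
#include <stdint.h>

#define VRING_DESC_F_AVAIL (1 << 7)  /* toggled by the driver when a desc is made available */
#define VRING_DESC_F_USED  (1 << 15) /* toggled by the device when a desc is used */

struct vring_packed_desc {
	uint64_t addr;  /* buffer address */
	uint32_t len;   /* buffer length */
	uint16_t id;    /* buffer id, echoed back by the device */
	uint16_t flags; /* AVAIL/USED plus NEXT/WRITE etc. */
};

/*
 * A descriptor is available when its AVAIL bit matches the observer's
 * wrap counter and its USED bit does not. Because both bits change
 * meaning each time the ring wraps, old and new entries can be told
 * apart without separate available/used indexes.
 */
static inline int
desc_is_avail(const struct vring_packed_desc *desc, int wrap_counter)
{
	uint16_t flags = desc->flags; /* real code needs an acquire/read barrier here */
	int avail = !!(flags & VRING_DESC_F_AVAIL);
	int used  = !!(flags & VRING_DESC_F_USED);

	return avail == wrap_counter && used != wrap_counter;
}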
>
>If you use more cores on the vhost side or the virtio side, do
>you see any performance changes?
>
>Did you do any performance tests with the kernel vhost-net
>backend (with zero-copy enabled and disabled)? I think we also
>need some performance data for these two cases. And it can help
>us make sure that it works with the kernel backends.

I tested against vhost-kernel, but only to verify functionality, not to
benchmark.

>
>And for the "virtio-PMD + vhost-PMD" test cases, I think we
>need the following performance data:
>
>#1. The maximum 1-core performance of the virtio PMD when using split ring.
>#2. The maximum 1-core performance of the virtio PMD when using packed ring.
>#3. The maximum 1-core performance of the vhost PMD when using split ring.
>#4. The maximum 1-core performance of the vhost PMD when using packed ring.
>
>Then we can have a clear understanding of the performance gain
>in DPDK with packed ring.
>
>And FYI, the maximum 1-core performance of the virtio PMD can
>be obtained with the following steps:
>
>1. Launch vhost-PMD with multiple queues, and use multiple
>   CPU cores for forwarding.
>2. Launch virtio-PMD with multiple queues and use 1 CPU
>   core for forwarding.
>3. Repeat the above two steps, adding more CPU cores for
>   forwarding on the vhost-PMD side, until no further
>   performance increase is seen.

Thanks for the suggestions, I'll come back with more numbers.

>
>Besides, I just did a quick glance at the Tx implementation;
>it still assumes the descriptors will be written back in order
>by the device. You can find more details in my comments on
>that patch.

Saw it and noted. I had hoped to be able to avoid the list, but I see
no way around it now.

Thanks for your review, Tiwei!

regards,
Jens
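P.S. For anyone following along: "the list" refers to tracking in-flight
buffers by descriptor id, so that descriptors written back out of order
by the device can still be reclaimed. One possible shape of that
bookkeeping is sketched below; the names are made up for illustration
and this is not necessarily what the final patches will do.

#include <stdint.h>
#include <stddef.h>

#define RING_SIZE 256 /* example ring size */

/* Per-id state for a buffer that is currently in flight. */
struct desc_state {
	void    *cookie; /* e.g. the mbuf submitted under this id */
	uint16_t ndescs; /* descriptor slots consumed by this buffer */
	uint16_t next;   /* next free id (free-list link) */
};

struct ring_state {
	struct desc_state desc_state[RING_SIZE];
	uint16_t free_head; /* first free id */
	uint16_t num_free;  /* free descriptor slots in the ring */
};

/*
 * On completion the device echoes back the buffer id; look the buffer
 * up by that id and return the id to the free list, regardless of the
 * order in which completions arrive.
 */
static void *
reclaim_id(struct ring_state *rs, uint16_t id)
{
	void *cookie = rs->desc_state[id].cookie;

	rs->desc_state[id].cookie = NULL;
	rs->num_free += rs->desc_state[id].ndescs;
	rs->desc_state[id].next = rs->free_head;
	rs->free_head = id;

	return cookie;
}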