From: Jens Freimann
To: Tiwei Bie
Cc: dev@dpdk.org, maxime.coquelin@redhat.com, Gavin.Hu@arm.com, zhihong.wang@intel.com
Subject: Re: [dpdk-dev] [PATCH v6 00/11] implement packed virtqueues
Date: Fri, 21 Sep 2018 16:06:44 +0200
Message-ID: <20180921140644.oiwl7gwekreflc7v@jenstp.localdomain>
References: <20180921103308.16357-1-jfreimann@redhat.com> <20180921123222.GA25292@debian>
In-Reply-To: <20180921123222.GA25292@debian>

On Fri, Sep 21, 2018 at 08:32:22PM +0800, Tiwei Bie wrote:
>On Fri, Sep 21, 2018 at 12:32:57PM +0200, Jens Freimann wrote:
>> This is a basic implementation of packed virtqueues as specified in the
>> Virtio 1.1 draft. A compiled version of the current draft is available
>> at https://github.com/oasis-tcs/virtio-docs.git (or as .pdf at
>> https://github.com/oasis-tcs/virtio-docs/blob/master/virtio-v1.1-packed-wd10.pdf).
>>
>> A packed virtqueue differs from a split virtqueue in that it consists
>> of only a single descriptor ring, which replaces the available and used
>> rings, their indexes and the descriptor buffer.
>>
>> Each descriptor is readable and writable and has a flags field. These
>> flags mark whether a descriptor is available or used. To detect new
>> available descriptors even after the ring has wrapped, device and
>> driver each keep a single-bit wrap counter that is flipped from 0 to 1
>> and vice versa every time the last descriptor in the ring is used or
>> made available.
>>
>> The idea behind this is to 1) improve performance by avoiding cache
>> misses and 2) make the ring easier for devices to implement.
>>
>> Regarding performance: with these patches I get 21.13 Mpps on my system,
>> compared to 18.8 Mpps with the virtio 1.0 code. Packet size was 64
>> bytes.
>
>Did you enable multiple queues and use multiple cores on the
>vhost side? If not, I guess the above performance gain is a
>gain on the vhost side rather than the virtio side.

I tested several variations back then and they all looked very good.
But the code has changed a lot in the meantime, and I need to do more
benchmarking in any case.
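As a quick refresher for the list, the layout and availability check
described in the cover letter boil down to something like the sketch
below. It follows the virtio 1.1 draft and is illustrative only; names
and details may differ from the actual patches.

/*
 * Packed-ring descriptor layout and availability check, per the
 * virtio 1.1 draft (illustrative sketch, not the PMD code).
 */
#include <stdint.h>

#define VRING_DESC_F_AVAIL (1 << 7)  /* toggled by the driver when a desc is made available */
#define VRING_DESC_F_USED  (1 << 15) /* toggled by the device when a desc is used */

struct vring_packed_desc {
	uint64_t addr;  /* buffer address */
	uint32_t len;   /* buffer length */
	uint16_t id;    /* buffer id, echoed back by the device */
	uint16_t flags; /* AVAIL/USED plus NEXT/WRITE etc. */
};

/*
 * A descriptor is available when its AVAIL bit matches the observer's
 * wrap counter and its USED bit does not. Because both bits change
 * meaning each time the ring wraps, old and new entries can be told
 * apart without separate available/used indexes.
 */
static inline int
desc_is_avail(const struct vring_packed_desc *desc, int wrap_counter)
{
	uint16_t flags = desc->flags; /* real code needs an acquire/read barrier here */
	int avail = !!(flags & VRING_DESC_F_AVAIL);
	int used  = !!(flags & VRING_DESC_F_USED);

	return avail == wrap_counter && used != wrap_counter;
}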
>
>If you use more cores on the vhost side or the virtio side, do
>you see any performance changes?
>
>Did you do any performance tests with the kernel vhost-net
>backend (with zero-copy enabled and disabled)? I think we also
>need some performance data for these two cases. And it can help
>us make sure that it works with the kernel backends.

I tested against vhost-kernel, but only to verify functionality, not to
benchmark.

>
>And for the "virtio-PMD + vhost-PMD" test cases, I think we
>need the following performance data:
>
>#1. The maximum 1-core performance of the virtio PMD when using split ring.
>#2. The maximum 1-core performance of the virtio PMD when using packed ring.
>#3. The maximum 1-core performance of the vhost PMD when using split ring.
>#4. The maximum 1-core performance of the vhost PMD when using packed ring.
>
>Then we can have a clear understanding of the performance gain
>in DPDK with packed ring.
>
>And FYI, the maximum 1-core performance of the virtio PMD can
>be obtained with the following steps:
>
>1. Launch vhost-PMD with multiple queues, and use multiple
>   CPU cores for forwarding.
>2. Launch virtio-PMD with multiple queues and use 1 CPU
>   core for forwarding.
>3. Repeat the above two steps, adding more CPU cores for
>   forwarding on the vhost-PMD side, until no further
>   performance increase is seen.

Thanks for the suggestions, I'll come back with more numbers.

>
>Besides, I just did a quick glance at the Tx implementation;
>it still assumes the descriptors will be written back in order
>by the device. You can find more details in my comments on
>that patch.

Saw it and noted. I had hoped to be able to avoid the list, but I see
no way around it now.

Thanks for your review, Tiwei!

regards,
Jens
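P.S. For anyone following along: "the list" refers to tracking in-flight
buffers by descriptor id, so that descriptors written back out of order
by the device can still be reclaimed. One possible shape of that
bookkeeping is sketched below; the names are made up for illustration
and this is not necessarily what the final patches will do.

#include <stdint.h>
#include <stddef.h>

#define RING_SIZE 256 /* example ring size */

/* Per-id state for a buffer that is currently in flight. */
struct desc_state {
	void    *cookie; /* e.g. the mbuf submitted under this id */
	uint16_t ndescs; /* descriptor slots consumed by this buffer */
	uint16_t next;   /* next free id (free-list link) */
};

struct ring_state {
	struct desc_state desc_state[RING_SIZE];
	uint16_t free_head; /* first free id */
	uint16_t num_free;  /* free descriptor slots in the ring */
};

/*
 * On completion the device echoes back the buffer id; look the buffer
 * up by that id and return the id to the free list, regardless of the
 * order in which completions arrive.
 */
static void *
reclaim_id(struct ring_state *rs, uint16_t id)
{
	void *cookie = rs->desc_state[id].cookie;

	rs->desc_state[id].cookie = NULL;
	rs->num_free += rs->desc_state[id].ndescs;
	rs->desc_state[id].next = rs->free_head;
	rs->free_head = id;

	return cookie;
}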