From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <maxime.coquelin@redhat.com>
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by dpdk.org (Postfix) with ESMTP id DD9C42946
 for <dev@dpdk.org>; Thu, 29 Sep 2016 17:30:56 +0200 (CEST)
Received: from int-mx10.intmail.prod.int.phx2.redhat.com
 (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 4877D68803;
 Thu, 29 Sep 2016 15:30:56 +0000 (UTC)
Received: from [10.36.5.134] (vpn1-5-134.ams2.redhat.com [10.36.5.134])
 by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id
 u8TFUrWR026822
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Thu, 29 Sep 2016 11:30:54 -0400
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>,
 "Michael S. Tsirkin" <mst@redhat.com>
References: <1474872056-24665-1-git-send-email-yuanhan.liu@linux.intel.com>
 <1474872056-24665-2-git-send-email-yuanhan.liu@linux.intel.com>
 <CAOaVG17N_giiwPwWg-=8eHCCTGOcLHPxJBUxrx04YhLp2EjCdw@mail.gmail.com>
 <20160926221112-mutt-send-email-mst@kernel.org>
 <20160927031158.GA25823@yliu-dev.sh.intel.com>
 <20160927224935-mutt-send-email-mst@kernel.org>
 <20160928022848.GE1597@yliu-dev.sh.intel.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>, dev@dpdk.org,
 qemu-devel@nongnu.org
From: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-ID: <bfe463e9-0d1a-c136-4115-7620f9e6ec90@redhat.com>
Date: Thu, 29 Sep 2016 17:30:53 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.2.0
MIME-Version: 1.0
In-Reply-To: <20160928022848.GE1597@yliu-dev.sh.intel.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.38]); Thu, 29 Sep 2016 15:30:56 +0000 (UTC)
Subject: Re: [dpdk-dev] [Qemu-devel] [PATCH 1/2] vhost: enable any layout
	feature
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 29 Sep 2016 15:30:57 -0000



On 09/28/2016 04:28 AM, Yuanhan Liu wrote:
> On Tue, Sep 27, 2016 at 10:56:40PM +0300, Michael S. Tsirkin wrote:
>> On Tue, Sep 27, 2016 at 11:11:58AM +0800, Yuanhan Liu wrote:
>>> On Mon, Sep 26, 2016 at 10:24:55PM +0300, Michael S. Tsirkin wrote:
>>>> On Mon, Sep 26, 2016 at 11:01:58AM -0700, Stephen Hemminger wrote:
>>>>> I assume that if using Version 1 that the bit will be ignored
>>>
>>> Yes, but I will just quote what you just said: what if the guest
>>> virtio device is a legacy device? I also gave my reasons in another
>>> email why I consistently set this flag:
>>>
>>>   - we have to return all features we support to the guest.
>>>
>>>     We don't know the guest is a modern or legacy device. That means
>>>     we should claim we support both: VERSION_1 and ANY_LAYOUT.
>>>
>>>     Assume guest is a legacy device and we just set VERSION_1 (the current
>>>     case), ANY_LAYOUT will never be negotiated.
>>>
>>>   - I'm following the way Linux kernel takes: it also set both features.
>>>
>>>   Maybe, we could unset ANY_LAYOUT when VERSION_1 is _negotiated_?
>>>
>>> The unset after negotiation I proposed turned out it won't work: the
>>> feature is already negotiated; unsetting it only in vhost side doesn't
>>> change anything. Besides, it may break the migration as Michael stated
>>> below.
>>
>> I think the reverse. Teach vhost user that for future machine types
>> only VERSION_1 implies ANY_LAYOUT.
>>
>>
>>>> Therein lies a problem. If dpdk tweaks flags, updating it
>>>> will break guest migration.
>>>>
>>>> One way is to require that users specify all flags fully when
>>>> creating the virtio net device.
>>>
>>> Like how? By a new command line option? And user has to type
>>> all those features?
>>
>> Make libvirt do this.  users use management normally. those that don't
>> likely don't migrate VMs.
>
> Fair enough.
>
>>
>>>> QEMU could verify that all required
>>>> flags are set, and fail init if not.
>>>>
>>>> This has other advantages, e.g. it adds ability to
>>>> init device without waiting for dpdk to connect.
>
> Will the feature negotiation between DPDK and QEMU still exist
> in your proposal?
>
>>>>
>>>> However, enabling each new feature would now require
>>>> management work. How about dpdk ships the list
>>>> of supported features instead?
>>>> Management tools could read them on source and destination
>>>> and select features supported on both sides.
>>>
>>> That means the management tool would somehow has a dependency on
>>> DPDK project, which I have no objection at all. But, is that
>>> a good idea?
>>
>> It already starts the bridge somehow, does it not?
>
> Indeed. I was firstly thinking about reading the dpdk source file
> to determine the DPDK supported feature list, with which the bind
> is too tight. I later realized you may ask DPDK to provide a binary
> to dump the list, or something like that.
>
>>
>>> BTW, I'm not quite sure I followed your idea. I mean, how it supposed
>>> to fix the ANY_LAYOUT issue here? How this flag will be set for
>>> legacy device?
>>>
>>> 	--yliu
>>
>> For ANY_LAYOUT, I think we should just set in in qemu,
>> but only for new machine types.
>
> What do you mean by "new machine types"? Virtio device with newer
> virtio-spec version?
>
>> This addresses migration
>> concerns.
>
> To make sure I followed you, do you mean the migration issue from
> an older "dpdk + qemu" combo to a newer "dpdk + qemu" combo (that
> more new features might be shipped)?
>
> Besides that, your proposal looks like a big work to accomplish.
> Are you okay to make it simple first: set it consistently like
> what Linux kernel does? This would at least make the ANY_LAYOUT
> actually be enabled for legacy device (which is also the default
> one that's widely used so far).

Before enabling anything by default, we should first optimize the 1 slot
case. Indeed, micro-benchmark using testpmd in txonly[0] shows ~17%
perf regression for 64 bytes case:
  - 2 descs per packet: 11.6Mpps
  - 1 desc per packet: 9.6Mpps

This is due to the virtio header clearing in virtqueue_enqueue_xmit().
Removing it, we get better results than with 2 descs (1.20Mpps).
Since the Virtio PMD doesn't support offloads, I wonder whether we can
just drop the memset?

  -- Maxime
[0]: For testing, you'll need these patches, else only first packets
will use a single slot:
  - http://dpdk.org/dev/patchwork/patch/16222/
  - http://dpdk.org/dev/patchwork/patch/16223/