From: Maxime Coquelin
To: "Daniel P. Berrange"
Cc: Michal Privoznik, Kevin Traynor, "Michael S. Tsirkin", Ciara Loftus,
 mark.b.kavanagh@intel.com, Flavio Leitner, Yuanhan Liu,
 Daniele Di Proietto, "dev@openvswitch.org", "dev@dpdk.org",
 "libvir-list@redhat.com"
Date: Thu, 2 Feb 2017 15:14:01 +0100
Subject: Re: [dpdk-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support
References: <20170201094304.GA3232@redhat.com> <20170201114119.GE3232@redhat.com>
In-Reply-To: <20170201114119.GE3232@redhat.com>
List-Id: DPDK patches and discussions

On 02/01/2017 12:41 PM, Daniel P.
Berrange wrote:
> On Wed, Feb 01, 2017 at 12:33:22PM +0100, Maxime Coquelin wrote:
>>
>>
>> On 02/01/2017 10:43 AM, Daniel P. Berrange wrote:
>>> On Wed, Feb 01, 2017 at 10:14:54AM +0100, Michal Privoznik wrote:
>>>> On 02/01/2017 09:35 AM, Maxime Coquelin wrote:
>>>
>>>>> Solution 3: Libvirt queries OVS for vhost backend version string: *OK*
>>>>> ======================================================================
>>>>>
>>>>>
>>>>> The idea is to have a table of supported versions, associated with
>>>>> key/value pairs. Libvirt could query the list of supported version
>>>>> strings for each host, and select the first common one among all hosts.
>>>>
>>>> How does libvirt know what hosts to ask? Libvirt aims at managing a
>>>> single host. It has no knowledge of other hosts on the network. That's
>>>> a task for upper layers like RHEV, OpenStack, etc.
>>>>
>>>>>
>>>>> Then, libvirt would ask OVS to probe the vhost-user interfaces in the
>>>>> selected version (compatibility mode). For example host A runs OVS-2.7,
>>>>> and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode
>>>>> (e.g. with indirect descriptors disabled), which should be selected at
>>>>> vhost-user interface probe time.
>>>>>
>>>>> The advantage of doing so is that libvirt does not need any update if new
>>>>> keys are introduced (i.e. it does not need to know how the new keys have
>>>>> to be handled); all these checks remain in OVS's vhost-user implementation.
>>>>
>>>> And that's where they should stay. Duplicating code between projects
>>>> will inevitably lead to a divergence.
>>>>
>>>>>
>>>>> Ideally, we would support a per vhost-user interface compatibility mode,
>>>>> which may also have an impact on the DPDK API, as the Virtio feature update
>>>>> API is global, and not per port.
>>>>
>>>> In general, I don't think we want any kind of this logic in libvirt. Either:
>>>>
>>>> a) fallback logic should be implemented in qemu (e.g.
>>>> upon migration it
>>>> should detect that the migrated guest uses a certain version and thus set
>>>> the backend to use that version, or error out and cancel migration), or
>>>>
>>>> b) libvirt would grow another element/attribute to specify the version of
>>>> the vhost-user backend in use and do nothing more than pass it to qemu. At
>>>> the same time, we can provide an API (or extend an existing one, e.g.
>>>> virsh domcapabilities) to list all available versions on a given host.
>>>> The upper layer, which knows which hosts are suitable for
>>>> virtualization, can then use this API to ask all the hosts, construct
>>>> the matrix, select the preferred version and put it into libvirt's domain XML.
>>>>
>>>> But frankly, I don't like b) that much. Let's put the fact this is OVS
>>>> aside for a moment. Just pretend this is a generic device in qemu. Would
>>>> we do the same magic with it? No! Or let's talk about machine types. You
>>>> spawn a -M type$((X+1)) guest and then decide to migrate it to a host with
>>>> an older qemu which supports just typeX. Well, you get an error. Do we care?
>>>> Not at all! It's your responsibility (as user/admin) to upgrade the qemu
>>>> so that it supports the new machine type. I think the same applies to OVS.
>>>
>>> With machine types, if the latest machine type is X, libvirt allows
>>> the mgmt app to spawn a guest with machine type X-1, so that you can
>>> later migrate the VM to a host with older QEMU.
>>>
>>> With the vhost user device, the VM will always be launched with version
>>> Y. There's currently no way to request launching the vhost user device
>>> with version Y-1. So even if you set your machine type to X-1 for
>>> compat with an older host, vhost user will be stuck at version Y, preventing
>>> the migration.
>>>
>>> One argument would be to say that the vhost user featureset should be
>>> determined by the machine type version, instead of introducing a new
>>> version.
>>> The complexity is that vhost-user is a pretty dumb device
>>> from QEMU's POV - most of the interesting logic & the features that
>>> need to be constrained lie in code outside of QEMU, in whatever
>>> server is connected to the vhost user socket.
>>>
>>> So I can see the value in allowing a simple version string to be
>>> associated with the vhost-user NIC.
>>>
>>> What I'm unclear about is how we would be able to report capabilities
>>> for a host to enumerate what versions were possible. Libvirt doesn't
>>> interact with any of the 3rd party vhost user servers, so it is probably
>>> out of scope - it'd be up to the higher level mgmt app to talk to DPDK
>>> and figure out what versions it supports.
>>>
>>> That does make me wonder though if libvirt & QEMU need to be involved
>>> at all in any part.
>>
>> Indeed, if the higher level management tool decides on the VM's machine
>> type, that is where it should also be done for the vhost-user backend. I
>> now understand it does not make much sense to have libvirt
>> involved in this; all of it (querying/selecting/setting compat mode) should be
>> manageable in the upper layer.
>>
>> I'm not familiar with these layers, so your inputs are really helpful.
>>
>>>
>>> When provisioning a new guest, the mgmt app presumably has to talk to
>>> DPDK to set up the NIC port, so DPDK is ready when QEMU launches and
>>> connects. Surely as part of that NIC port setup, it could directly
>>> tell DPDK which version or featureset to permit on the port? It is
>>> not obvious why the version string has to be fed in via QEMU.
>> No it is not, my proposal was that libvirt set the version string in
>> OVS, QEMU was not involved.
>>
>> From these inputs, the idea remains valid, except that libvirt is not
>> the right place to manage this. Instead, RHEV, OpenStack or any other
>> management tool should handle the compat mode selection.
>
> It depends where / how in OVS it needs to be set.
> The only stuff libvirt
> does with OVS is to run 'add-port' and 'del-port' commands via the ovs
> cli tool. We pass through arguments from the port profile stored in the
> XML config.
>
> eg those things in [...] get passed as cli args to the 'add-port'
> command. So if add-port needs this new version string, then we'd need
> to add the version to the openvswitch virtualport XML.
>
> If the version is provided to OVS in a different command, then it would
> probably be outside the scope of libvirt.

I think it would make sense for it to be a parameter of the add-port command.
But it would only apply to the vhost-user related add-port command, and I
didn't find where, or if, this is managed in the libvirt XML.

Regards,
Maxime
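
For illustration only, the selection step discussed in this thread (the upper layer collecting the supported version strings of each host and picking the first common one) could be sketched as below. The function name, the host names and the version labels are hypothetical; nothing here is an existing OVS, DPDK or libvirt interface, and it assumes each host reports its versions in preference order, newest first.

```python
from typing import Dict, List, Optional


def select_common_version(host_versions: Dict[str, List[str]]) -> Optional[str]:
    """Return the first version string supported by every host, or None.

    Assumption: each host lists its supported versions in preference
    order (newest first). The order of the first host decides which
    common version is preferred.
    """
    if not host_versions:
        return None
    per_host = list(host_versions.values())
    # Versions that appear in every host's list.
    common = set(per_host[0]).intersection(*per_host[1:])
    for version in per_host[0]:
        if version in common:
            return version
    return None


# Hypothetical example: host A has a 2.6 compatibility mode, host B is 2.6 only.
hosts = {
    "host-a": ["ovs-2.7", "ovs-2.6"],
    "host-b": ["ovs-2.6"],
}
print(select_common_version(hosts))
```

The management tool would then pass the selected string to each host when setting up the vhost-user port; how that string is passed (for example as an add-port parameter) is exactly the open question above.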