Date: Fri, 3 Feb 2017 17:34:07 +0200
From: "Michael S. Tsirkin"
To: Maxime Coquelin
Cc: Ciara Loftus, mark.b.kavanagh@intel.com, Flavio Leitner,
 Daniele Di Proietto, "dev@openvswitch.org", Kevin Traynor,
 "Daniel P. Berrange", Yuanhan Liu, "dev@dpdk.org",
 "libvir-list@redhat.com", sean.k.mooney@intel.com
Message-ID: <20170203172140-mutt-send-email-mst@kernel.org>
References: <4cad5796-7024-4a48-a73a-8dd780259968@redhat.com>
In-Reply-To: <4cad5796-7024-4a48-a73a-8dd780259968@redhat.com>
Subject: Re: [dpdk-dev] [RFC] Vhost-user backends cross-version migration support

On Fri, Feb 03, 2017 at 03:11:10PM +0100, Maxime Coquelin wrote:
> Hi,
>
> On 02/01/2017 09:35 AM, Maxime Coquelin wrote:
> > Hi,
> >
> > A few months ago, Michael reported a problem
> > about migrating VMs relying on vhost-user between hosts supporting
> > different backend versions:
> > - Message-Id: <20161011173526-mutt-send-email-mst@kernel.org>
> > - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html
> >
> > The goal of this thread is to draft a proposal based on the outcomes
> > of discussions with contributors of the different parties (DPDK/OVS
> > /libvirt/...).
>
> Thanks for the first feedback. It seems to converge that this is Nova's
> role, not libvirt's, to manage these versions from the management tool
> layer.

I think the conclusion is not that it should go up the stack. I think
this will just get broken all the time. No one understands versions and
stuff. Even QEMU developers get confused and break compatibility once in
a while.

My conclusion is that doing it from OVS side is wrong. Migration is not
an OVS thing, it's a QEMU thing, and libvirt abstracts QEMU.
People just want migration to work, ok? It's our job to do it,
we do not really need a "make things work" flag.

If libvirt does not want to use the vhost-user protocol
(which sounds reasonable, it's rather complex)
how about qemu providing a small utility to query the port?
We could output json or whatever.

This can help with MTU as well.

And maybe it will help with nowait support - if someone uses
the utility to dump backend config once, QEMU can later
start the device without feature queries.

> This change has no impact from OVS perspective, same requirements
> apply. I am interested in OVS contributors' feedback on the below design
> proposal.
>
> Especially, I would like to have your opinion on the best way for OVS to
> expose its supported versions:
>  - Static file generated at build time from version table described below
>  - Entries in the OVS DB
>  - Dedicated tool listing strings from the version table described below
>
> For selecting the right version of the vhost-user backend, do you agree
> it should be done via a new parameter of the ovs-vsctl add-port command
> for dpdkvhostuser ports?
>
> > Problem statement:
> > ==================
> >
> > When migrating a VM from one host to another, the interfaces exposed by
> > QEMU must stay unchanged in order to guarantee a successful migration.
> > In the case of vhost-user interface, parameters like supported Virtio
> > feature set, max number of queues, max vring sizes,... must remain
> > compatible. Indeed, the frontend not being re-initialized, no
> > renegotiation happens at migration time.
> >
> > For example, we have a VM that runs on host A, which has its vhost-user
> > backend advertising VIRTIO_RING_F_INDIRECT_DESC feature. Since the guest
> > also supports this feature, it is successfully negotiated, and the guest
> > transmits packets using indirect descriptor tables, which the backend
> > knows how to handle.
> > At some point, the VM is being migrated to host B, which runs an older
> > version of the backend not supporting this VIRTIO_RING_F_INDIRECT_DESC
> > feature. The migration would break, because the guest still has the
> > VIRTIO_RING_F_INDIRECT_DESC bit set, and the virtqueue contains some
> > descriptors pointing to indirect tables, which backend B doesn't know
> > how to handle.
> > This is just an example about Virtio features compatibility, but other
> > backend implementation details could cause other failures.
> >
> > What we need is to be able to query the destination host's backend to
> > ensure migration is possible.
> > Also, we would need to query this statically, even before the VM is
> > started, to be sure it could be migrated elsewhere for any reason.
> >
> > ...
> >
> > Solution 3: Libvirt queries OVS for vhost backend version string: *OK*
> > ======================================================================
> >
> > The idea is to have a table of supported versions, associated to
> > key/value pairs. Libvirt could query the list of supported version
> > strings for each host, and select the first common one among all hosts.
> >
> > Then, libvirt would ask OVS to probe the vhost-user interfaces in the
> > selected version (compatibility mode). For example host A runs OVS-2.7,
> > and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode
> > (e.g. with indirect descriptors disabled), which should be selected at
> > vhost-user interface probe time.
> >
> > The advantage of doing so is that libvirt does not need any update if
> > new keys are introduced (i.e. it does not need to know how the new keys
> > have to be handled); all these checks remain in OVS's vhost-user
> > implementation.
> >
> > Ideally, we would support per vhost-user interface compatibility mode,
> > which may have an impact also on DPDK API, as the Virtio feature update
> > API is global, and not per port.
> >
> > - Implementation:
> > -----------------
> >
> > Goal here is just to illustrate this proposal, I'm sure you will have
> > good suggestions to improve it.
> > In OVS vhost-user library, we would introduce a new structure, for
> > example (neither compiled nor tested):
> >
> > struct vhostuser_compat {
> >     char *version;
> >     uint64_t virtio_features;
> >     uint32_t max_rx_queue_sz;
> >     uint32_t max_nr_queues;
> > };
> >
> > *version* field is the compatibility version string.
> > It could be something like: "upstream.ovs-dpdk.v2.6"
> > In case for example Fedora adds some more patches to its
> > package that would break migration to upstream version, it could have
> > a dedicated compatibility string: "fc26.ovs-dpdk.v2.6".
> > In case OVS-v2.7 does not break compatibility with previous OVS-v2.6
> > version, then no need to create a new compatibility entry, just keep
> > v2.6 one.
> >
> > *virtio_features* field is the Virtio feature set for a given
> > compatibility version. When an OVS tag is to be created, it would be
> > associated to a DPDK version. The Virtio features for this version
> > would be stored in this field. It would allow upgrading the DPDK
> > package for example from v16.07 to v16.11 without breaking migration.
> > In case the distribution wants to benefit from the latest Virtio
> > features, it would have to create a new entry to ensure migration
> > won't be broken.
> >
> > *max_rx_queue_sz*
> > *max_nr_queues* fields are just here for example, don't think this is
> > needed today. I just want to illustrate that we have to anticipate
> > other parameters than the Virtio feature set, even if not necessary
> > at the moment.
> >
> > We create a table with different compatibility versions in OVS
> > vhost-user lib:
> >
> > static struct vhostuser_compat vu_compat[] = {
> >     {
> >         .version = "upstream.ovs-dpdk.v2.7",
> >         .virtio_features = 0x12045694,
> >         .max_rx_queue_sz = 512,
> >     },
> >     {
> >         .version = "upstream.ovs-dpdk.v2.6",
> >         .virtio_features = 0x10045694,
> >         .max_rx_queue_sz = 1024,
> >     },
> > };
> >
> > At some time during installation, or system init, the table would be
> > parsed, and compatibility version strings would be stored into the OVS
> > database, or a new tool would be created to list these strings.
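
For illustration only, here is a rough sketch of how such a table could be
looked up by version string and how its strings could be listed. The struct
and the table values are taken from the proposal above; the helper names
vu_compat_find() and vu_compat_list() are hypothetical and do not exist in
OVS or DPDK, and "version" is made const here because it points at string
literals.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Structure from the proposal; "version" is const in this sketch. */
struct vhostuser_compat {
    const char *version;
    uint64_t virtio_features;
    uint32_t max_rx_queue_sz;
    uint32_t max_nr_queues;
};

/* Illustrative values copied from the proposal, newest entry first. */
static const struct vhostuser_compat vu_compat[] = {
    {
        .version = "upstream.ovs-dpdk.v2.7",
        .virtio_features = 0x12045694,
        .max_rx_queue_sz = 512,
    },
    {
        .version = "upstream.ovs-dpdk.v2.6",
        .virtio_features = 0x10045694,
        .max_rx_queue_sz = 1024,
    },
};

#define VU_COMPAT_COUNT (sizeof vu_compat / sizeof vu_compat[0])

/* Hypothetical helper: return the entry for a version string, or NULL
 * if this build does not know that compatibility version. */
const struct vhostuser_compat *
vu_compat_find(const char *version)
{
    for (size_t i = 0; i < VU_COMPAT_COUNT; i++) {
        if (strcmp(vu_compat[i].version, version) == 0) {
            return &vu_compat[i];
        }
    }
    return NULL;
}

/* Hypothetical helper: copy up to "max" version strings into "out" in
 * table order and return how many were written. A listing tool or a
 * DB-population step could be built on top of this. */
size_t
vu_compat_list(const char **out, size_t max)
{
    size_t n = 0;

    for (size_t i = 0; i < VU_COMPAT_COUNT && n < max; i++) {
        out[n++] = vu_compat[i].version;
    }
    return n;
}
```

An unknown string such as "fc26.ovs-dpdk.v2.6" yields NULL with this table,
which is the point where a caller would have to refuse the requested
compatibility mode.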
> >
> > Before launching the VM, libvirt will query the version strings for
> > each host using for example the JSON RPC API of OVS (maybe not the best
> > solution, looking forward to your comments on this). Libvirt would then
> > select the first common supported version, and insert this string into
> > the vhost-user interfaces parameters in the OVS DBs of each host.
> >
> > When the vhost-user connection is initiated, OVS would know in which
> > compatibility mode to init the interface, for example by restricting
> > the supported Virtio features of the interface.
> >
> > Do you think this is reasonable? Or maybe you have alternative ideas
> > that would be best fit to ensure successful migration?
>
> Thanks,
> Maxime
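
To make the selection and restriction steps discussed in this thread
concrete, here is a minimal sketch, neither an OVS nor a libvirt
implementation. All three function names are hypothetical. It assumes each
host reports its version strings in preference order, handles only two
hosts (more hosts would repeat the intersection), and treats the feature
set as a plain 64-bit mask.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: return the first string of "local" (in its preference
 * order) that also appears in "remote", or NULL if none is shared. */
const char *
first_common_version(const char *const *local, size_t n_local,
                     const char *const *remote, size_t n_remote)
{
    for (size_t i = 0; i < n_local; i++) {
        for (size_t j = 0; j < n_remote; j++) {
            if (strcmp(local[i], remote[j]) == 0) {
                return local[i];
            }
        }
    }
    return NULL;
}

/* Hypothetical: features a backend would advertise in compatibility
 * mode, i.e. its native set restricted to the compat entry's set. */
uint64_t
compat_restrict_features(uint64_t backend_features, uint64_t compat_features)
{
    return backend_features & compat_features;
}

/* Hypothetical: migration is only safe if every feature negotiated on
 * the source is also supported by the destination backend. Returns 1
 * if so, 0 otherwise. */
int
features_migratable(uint64_t negotiated, uint64_t dest_features)
{
    return (negotiated & ~dest_features) == 0;
}
```

With the illustrative values from the proposal's table, a guest that
negotiated 0x12045694 could not move to a backend offering only 0x10045694,
while the reverse direction passes the check. Restricting the newer backend
to the older entry's mask up front is what avoids the first case.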