From: "Michael S. Tsirkin" <mst@redhat.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Ciara Loftus <ciara.loftus@intel.com>,
mark.b.kavanagh@intel.com, Flavio Leitner <fleitner@redhat.com>,
Daniele Di Proietto <diproiettod@vmware.com>,
"dev@openvswitch.org" <dev@openvswitch.org>,
Kevin Traynor <ktraynor@redhat.com>,
"Daniel P. Berrange" <berrange@redhat.com>,
Yuanhan Liu <yuanhan.liu@linux.intel.com>,
"dev@dpdk.org" <dev@dpdk.org>,
"libvir-list@redhat.com" <libvir-list@redhat.com>,
sean.k.mooney@intel.com
Subject: Re: [dpdk-dev] [RFC] Vhost-user backends cross-version migration support
Date: Fri, 3 Feb 2017 17:34:07 +0200
Message-ID: <20170203172140-mutt-send-email-mst@kernel.org>
In-Reply-To: <4cad5796-7024-4a48-a73a-8dd780259968@redhat.com>
On Fri, Feb 03, 2017 at 03:11:10PM +0100, Maxime Coquelin wrote:
> Hi,
>
> On 02/01/2017 09:35 AM, Maxime Coquelin wrote:
> > Hi,
> >
> > A few months ago, Michael reported a problem with migrating VMs relying
> > on vhost-user between hosts supporting different backend versions:
> > - Message-Id: <20161011173526-mutt-send-email-mst@kernel.org>
> > - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html
> >
> > The goal of this thread is to draft a proposal based on the outcomes
> > of discussions with contributors of the different parties (DPDK/OVS
> > /libvirt/...).
>
> Thanks for the first feedback. The consensus seems to be that managing
> these versions from the management tool layer is Nova's role, not
> libvirt's.

I think the conclusion is not that it should go up the stack. I think
this will just get broken all the time. No one understands versions and
stuff. Even QEMU developers get confused and break compatibility once in
a while.

My conclusion is that doing it from OVS side is wrong. Migration is not
an OVS thing, it's a QEMU thing, and libvirt abstracts QEMU. People
just want migration to work, ok? It's our job to do it, we do not really
need a "make things work" flag.

If libvirt does not want to use the vhost-user protocol (which sounds
reasonable, it's rather complex) how about qemu providing a small
utility to query the port? We could output json or whatever.

This can help with MTU as well.

And maybe it will help with nowait support - if someone uses the utility
to dump backend config once, QEMU can later start the device without
feature queries.
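A minimal sketch of what such a one-shot dump could look like, assuming
a hypothetical `struct backend_config` and made-up JSON key names
(neither exists in QEMU today; they are only here to illustrate the
idea):

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of what a query utility could report for one
 * vhost-user port. Field names are assumptions, not an existing format. */
struct backend_config {
    uint64_t features;   /* Virtio feature bits offered by the backend */
    uint32_t max_queues; /* maximum number of queues */
    uint32_t mtu;        /* MTU configured on the port */
};

/* Format the snapshot as a single JSON object.
 * Returns 0 on success, -1 if buf is too small or formatting failed. */
static int backend_config_to_json(const struct backend_config *cfg,
                                  char *buf, size_t len)
{
    int n = snprintf(buf, len,
                     "{\"features\": \"0x%" PRIx64 "\", "
                     "\"max-queues\": %" PRIu32 ", "
                     "\"mtu\": %" PRIu32 "}",
                     cfg->features, cfg->max_queues, cfg->mtu);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The output could be saved once and handed back to QEMU later, so that
the device can be started without querying the backend again.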
> This change has no impact from the OVS perspective; the same requirements
> apply. I am interested in OVS contributors' feedback on the design
> proposal below.
>
> Especially, I would like to have your opinion on the best way for OVS to
> expose its supported versions:
> - Static file generated at build time from version table described below
> - Entries in the OVS DB
> - Dedicated tool listing strings from the version table described below
>
> For selecting the right version of the vhost-user backend, do you agree
> it should be done via a new parameter of the ovs-vsctl add-port command
> for dpdkvhostuser ports?
>
>
> > Problem statement:
> > ==================
> >
> > When migrating a VM from one host to another, the interfaces exposed by
> > QEMU must stay unchanged in order to guarantee a successful migration.
> > In the case of a vhost-user interface, parameters such as the supported
> > Virtio feature set, max number of queues, max vring sizes, ... must
> > remain compatible. Indeed, since the frontend is not re-initialized, no
> > renegotiation happens at migration time.
> >
> > For example, we have a VM that runs on host A, whose vhost-user
> > backend advertises the VIRTIO_F_RING_INDIRECT_DESC feature. Since the
> > guest also supports this feature, it is successfully negotiated, and the
> > guest transmits packets using indirect descriptor tables, which the
> > backend knows how to handle.
> > At some point, the VM is migrated to host B, which runs an older
> > version of the backend that does not support the
> > VIRTIO_F_RING_INDIRECT_DESC feature. The migration would break, because
> > the guest still has the VIRTIO_F_RING_INDIRECT_DESC bit set, and the
> > virtqueue contains some descriptors pointing to indirect tables, which
> > backend B does not know how to handle.
> > This is just an example about Virtio features compatibility, but other
> > backend implementation details could cause other failures.
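The feature part of this check can be sketched as a subset test. This is
a minimal illustration, assuming the negotiated and offered feature sets
are available as 64-bit masks; the helper name is hypothetical, and
bit 28 is the indirect descriptor feature bit from the Virtio
specification:

```c
#include <stdbool.h>
#include <stdint.h>

/* Indirect descriptors are feature bit 28 in the Virtio specification. */
#define FEATURE_INDIRECT_DESC (1ULL << 28)

/* Migration is only safe, feature-wise, if every feature the guest
 * negotiated on the source is also offered by the destination backend.
 * Hypothetical helper, not an existing QEMU/OVS/DPDK function. */
static bool features_migratable(uint64_t negotiated, uint64_t dest_offered)
{
    return (negotiated & ~dest_offered) == 0;
}
```

With host A having negotiated FEATURE_INDIRECT_DESC and host B not
offering it, the helper returns false, which is the broken case
described above.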
> >
> > What we need is to be able to query the destination host's backend to
> > ensure migration is possible. Also, we would need to query this
> > statically, even before the VM is started, to be sure it could be
> > migrated elsewhere for any reason.
>
> ...
>
> >
> > Solution 3: Libvirt queries OVS for vhost backend version string: *OK*
> > ======================================================================
> >
> >
> > The idea is to have a table of supported versions, associated with
> > key/value pairs. Libvirt could query the list of supported version
> > strings for each host, and select the first common one among all hosts.
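The selection step could look roughly like the sketch below. It assumes
each host reports its version strings ordered by preference, and that
"first common" means the first entry of the first host's list that all
other hosts also report. The type and function names are made up for
illustration:

```c
#include <stddef.h>
#include <string.h>

/* One host's supported compatibility strings, most preferred first. */
struct host_versions {
    const char *const *versions;
    size_t count;
};

/* Return 1 if host h lists version v, 0 otherwise. */
static int host_supports(const struct host_versions *h, const char *v)
{
    for (size_t i = 0; i < h->count; i++) {
        if (strcmp(h->versions[i], v) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Return the first entry of hosts[0] that every other host also lists,
 * or NULL if the hosts share no version. */
static const char *first_common_version(const struct host_versions *hosts,
                                        size_t n_hosts)
{
    if (n_hosts == 0) {
        return NULL;
    }
    for (size_t i = 0; i < hosts[0].count; i++) {
        const char *candidate = hosts[0].versions[i];
        size_t h;

        for (h = 1; h < n_hosts; h++) {
            if (!host_supports(&hosts[h], candidate)) {
                break;
            }
        }
        if (h == n_hosts) {
            return candidate;
        }
    }
    return NULL;
}
```

A NULL result means there is no version all hosts agree on, so
migration between them cannot be guaranteed.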
> >
> > Then, libvirt would ask OVS to probe the vhost-user interfaces in the
> > selected version (compatibility mode). For example host A runs OVS-2.7,
> > and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode
> > (e.g. with indirect descriptors disabled), which should be selected at
> > vhost-user interface probe time.
> >
> > The advantage of doing so is that libvirt does not need any update if new
> > keys are introduced (i.e. it does not need to know how the new keys have
> > to be handled); all these checks remain in OVS's vhost-user implementation.
> >
> > Ideally, we would support per vhost-user interface compatibility mode,
> > which may have an impact also on DPDK API, as the Virtio feature update
> > API is global, and not per port.
> >
> > - Implementation:
> > -----------------
> >
> > The goal here is just to illustrate this proposal; I'm sure you will have
> > good suggestions to improve it.
> > In the OVS vhost-user library, we would introduce a new structure, for
> > example (neither compiled nor tested):
> >
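> > struct vhostuser_compat {
> >     const char *version;       /* compatibility version string */
> >     uint64_t virtio_features;  /* Virtio feature set for this version */
> >     uint32_t max_rx_queue_sz;  /* illustrative only */
> >     uint32_t max_nr_queues;    /* illustrative only */
> > };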
> >
> > *version* field is the compatibility version string.
> > It could be something like: "upstream.ovs-dpdk.v2.6"
> > In case for example Fedora adds some more patches to its
> > package that would break migration to upstream version, it could have
> > a dedicated compatibility string: "fc26.ovs-dpdk.v2.6".
> > In case OVS-v2.7 does not break compatibility with previous OVS-v2.6
> > version, then no need to create a new compatibility entry, just keep
> > v2.6 one.
> >
> > *virtio_features* field is the Virtio feature set for a given
> > compatibility version. When an OVS tag is to be created, it would be
> > associated with a DPDK version. The Virtio features for this version
> > would be stored in this field. This would allow upgrading the DPDK
> > package, for example from v16.07 to v16.11, without breaking migration.
> > In case the distribution wants to benefit from the latest Virtio
> > features, it would have to create a new entry to ensure migration
> > won't be broken.
> >
> > *max_rx_queue_sz* and *max_nr_queues* fields are just here as examples;
> > I don't think they are needed today. I just want to illustrate that we
> > have to anticipate other parameters than the Virtio feature set, even
> > if not necessary at the moment.
> >
> > We create a table with different compatibility versions in OVS
> > vhost-user lib:
> >
> > static struct vhostuser_compat vu_compat[] = {
> >     {
> >         .version = "upstream.ovs-dpdk.v2.7",
> >         .virtio_features = 0x12045694,
> >         .max_rx_queue_sz = 512,
> >     },
> >     {
> >         .version = "upstream.ovs-dpdk.v2.6",
> >         .virtio_features = 0x10045694,
> >         .max_rx_queue_sz = 1024,
> >     },
> > };
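Looking up an entry in such a table by its version string could be
sketched as follows. The struct and values are copied from the example
above; `vu_compat_lookup` is a hypothetical helper name, not an existing
OVS function:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct vhostuser_compat {
    const char *version;
    uint64_t virtio_features;
    uint32_t max_rx_queue_sz;
    uint32_t max_nr_queues;
};

/* Example values taken from the proposal; they are illustrative only. */
static const struct vhostuser_compat vu_compat[] = {
    {
        .version = "upstream.ovs-dpdk.v2.7",
        .virtio_features = 0x12045694,
        .max_rx_queue_sz = 512,
    },
    {
        .version = "upstream.ovs-dpdk.v2.6",
        .virtio_features = 0x10045694,
        .max_rx_queue_sz = 1024,
    },
};

/* Return the table entry matching the version string, or NULL if this
 * build does not know the requested compatibility version. */
static const struct vhostuser_compat *vu_compat_lookup(const char *version)
{
    size_t n = sizeof vu_compat / sizeof vu_compat[0];

    for (size_t i = 0; i < n; i++) {
        if (strcmp(vu_compat[i].version, version) == 0) {
            return &vu_compat[i];
        }
    }
    return NULL;
}
```

A NULL return would let the caller refuse to create the port rather
than silently fall back to the newest feature set.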
> >
> > At some time during installation, or system init, the table would be
> > parsed, and compatibility version strings would be stored into the OVS
> > database, or a new tool would be created to list these strings.
> >
> > Before launching the VM, libvirt will query the version strings for
> > each host using, for example, the JSON RPC API of OVS (maybe not the best
> > solution, looking forward to your comments on this). Libvirt would then
> > select the first common supported version, and insert this string into
> > the vhost-user interface parameters in the OVS DBs of each host.
> >
> > When the vhost-user connection is initiated, OVS would know in which
> > compatibility mode to init the interface, for example by restricting
> > the supported Virtio features of the interface.
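Restricting the features amounts to masking what the running backend
supports with the set recorded for the selected compatibility version.
A minimal sketch, assuming both sets are plain 64-bit masks (the helper
name is hypothetical, and how the result is applied per port in DPDK is
left open):

```c
#include <stdint.h>

/* Features to advertise on a port running in compatibility mode: only
 * bits that the backend supports AND the compat entry allows. The result
 * never contains a bit the running backend cannot handle. */
static uint64_t compat_restrict_features(uint64_t backend_features,
                                         uint64_t compat_features)
{
    return backend_features & compat_features;
}
```

With the example values from the table, a backend supporting 0x12045694
that is started in the v2.6 mode (0x10045694) would advertise
0x10045694.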
> >
> > Do you think this is reasonable? Or maybe you have alternative ideas
> > that would be a better fit to ensure successful migration?
>
> Thanks,
> Maxime