From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
dev@dpdk.org, "Stephen Hemminger" <stephen@networkplumber.org>,
qemu-devel@nongnu.org, libvir-list@redhat.com,
vpp-dev@lists.fd.io,
"Marc-André Lureau" <marcandre.lureau@redhat.com>
Subject: Re: [dpdk-dev] dpdk/vpp and cross-version migration for vhost
Date: Thu, 17 Nov 2016 17:49:36 +0800
Message-ID: <20161117094936.GN5048@yliu-dev.sh.intel.com>
In-Reply-To: <b9e55320-f53d-d7d3-978f-ec696f3c1d93@redhat.com>
On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
>
>
> On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> >As usual, sorry for the late response :/
> >
> >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> >>Hi!
> >>So it looks like we face a problem with cross-version
> >>migration when using vhost. It's not new but became more
> >>acute with the advent of vhost user.
> >>
> >>For users to be able to migrate between different versions
> >>of the hypervisor the interface exposed to guests
> >>by hypervisor must stay unchanged.
> >>
> >>The problem is that a qemu device is connected
> >>to a backend in another process, so the interface
> >>exposed to guests depends on the capabilities of that
> >>process.
> >>
> >>Specifically, for vhost user interface based on virtio, this includes
> >>the "host features" bitmap that defines the interface, as well as more
> >>host values such as the max ring size. Adding new features/changing
> >>values to this interface is required to make progress, but on the other
> >>hand we need ability to get the old host features to be compatible.
> >
> >It looks like the same issue as vhost-user reconnect to me. For example,
> >
> >- start dpdk 16.07 & qemu 2.5
> >- kill dpdk
> >- start dpdk 16.11
> >
> >Though DPDK 16.11 has more features compared to DPDK 16.07 (say, indirect),
> >the above should work, because qemu saves the negotiated features before
> >the disconnect and restores them after the reconnection.
> >
> > commit a463215b087c41d7ca94e51aa347cde523831873
> > Author: Marc-André Lureau <marcandre.lureau@redhat.com>
> > Date: Mon Jun 6 18:45:05 2016 +0200
> >
> > vhost-net: save & restore vhost-user acked features
> >
> > The initial vhost-user connection sets the features to be negotiated
> > with the driver. Renegotiation isn't possible without device reset.
> >
> > To handle reconnection of vhost-user backend, ensure the same set of
> > features are provided, and reuse already acked features.
> >
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> >
> >
> >So could we do something similar to the vhost-user reconnect case? I mean,
> >save the acked features before migration and restore them after it. This
> >should preserve compatibility. If the user downgrades the DPDK version,
> >that could also be easily detected, and then exit with an error to the
> >user: migration failed due to incompatible vhost features.
> >
> >Just some rough thoughts. Makes tiny sense?
>
> My understanding is that the management tool has to know whether
> versions are compatible before initiating the migration:
Makes sense. How about getting and restoring the acked features through
qemu command lines then, say, through the monitor interface?

With that, it would be something like:

- start vhost-user backend (DPDK, VPP, or whatever) & qemu on the src host
- read the acked features (through the monitor interface)
- start vhost-user backend on the dst host
- start qemu on the dst host with the just-queried acked features

QEMU is then expected to use this feature set for the later vhost-user
feature negotiation, and to exit if feature compatibility is broken.
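
The final step above amounts to a subset check on feature bitmaps. A minimal
sketch, assuming the acked features and the destination backend's offered
features are both available as 64-bit bitmaps. The function names are
hypothetical and are not QEMU or DPDK APIs; only the bit numbers come from
the virtio spec:

```python
from typing import List

# Feature bit numbers as defined by the virtio specification.
VIRTIO_F_ANY_LAYOUT = 27
VIRTIO_RING_F_INDIRECT_DESC = 28
VIRTIO_F_VERSION_1 = 32


def missing_features(acked: int, backend: int) -> List[int]:
    """Return the bit numbers that were acked but are not offered by the backend."""
    missing = acked & ~backend
    return [bit for bit in range(64) if (missing >> bit) & 1]


def features_for_destination(acked: int, backend: int) -> int:
    """Hypothetical check done when starting qemu on the dst host.

    Returns the feature set to keep exposing to the guest (the acked set),
    or raises if the destination backend cannot provide all of it.
    """
    missing = missing_features(acked, backend)
    if missing:
        raise RuntimeError(
            "migration failed due to incompatible vhost features, "
            "missing bits: %s" % missing
        )
    # Extra features offered by a newer backend are simply not used.
    return acked


# Backend feature sets made up for this example.
old_backend = (1 << VIRTIO_F_ANY_LAYOUT) | (1 << VIRTIO_F_VERSION_1)
new_backend = old_backend | (1 << VIRTIO_RING_F_INDIRECT_DESC)

# Upgrade: features acked against the old backend still work on the new one.
upgraded = features_for_destination(old_backend, new_backend)

# Downgrade: the indirect descriptor bit was acked but is no longer offered.
downgrade_missing = missing_features(new_backend, old_backend)
```

In the upgrade case the guest keeps seeing exactly the old feature set, which
is the point: the interface stays unchanged even though the backend gained
features. In the downgrade case the check fails before the guest is resumed.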
Thoughts?
--yliu
> 1. The downtime could be unpredictable if a VM has to move from host
> to host multiple times, which is problematic, especially for NFV.
> 2. If migration is not possible, maybe the management tool would
> prefer not to interrupt the VM on current host.
>
> I have little experience with migration though, so I could be mistaken.
>
> Thanks,
> Maxime
>
> >
> > --yliu
> >>
> >>To solve this problem within qemu, qemu has a versioning system based on
> >>a machine type concept, which fundamentally is a version string; by
> >>specifying that string one can get hardware compatible with a previous
> >>qemu version. QEMU also reports the latest version and the list of
> >>versions supported, so libvirt records the version at VM creation and then
> >>is careful to use this machine version whenever it migrates a VM.
> >>
> >>One might wonder how this is solved with a kernel vhost backend. The
> >>answer is that it mostly isn't - instead an assumption is made that
> >>qemu versions are deployed together with the kernel - this is generally
> >>true for downstreams. Thus whenever qemu gains a new feature, it is
> >>already supported by the kernel as well. However, if one attempts
> >>migration with a new qemu from a system with a new kernel to one with an
> >>old kernel, one would get a failure.
> >>
> >>In a world where we have multiple userspace backends, with some of
> >>these supplied by ISVs, this seems unrealistic.
> >>
> >>IMO we need to support vhost backend versioning, ideally
> >>in a way that will also work for vhost kernel backends.
> >>
> >>So I'd like to get some input from both backend and management
> >>developers on what a good solution would look like.
> >>
> >>If we want to emulate the qemu solution, this involves adding the
> >>concept of interface versions to dpdk. For example, dpdk could supply a
> >>file (or utility printing?) with list of versions: latest and versions
> >>supported. libvirt could read that and
> >>- store latest version at vm creation
> >>- pass it around with the vm
> >>- pass it to qemu
> >>
> >>From here, qemu could pass this over the vhost-user channel,
> >>thus making sure it's initialized with the correct
> >>compatible interface.
> >>
> >>As version here is an opaque string for libvirt and qemu,
> >>anything can be used - but I suggest either a list
> >>of values defining the interface, e.g.
> >>any_layout=on,max_ring=256
> >>or a version including the name and vendor of the backend,
> >>e.g. "org.dpdk.v4.5.6".
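
For illustration, a minimal sketch of how a tool could compare two interface
descriptions written in the key=value form above. The comma-separated format
and the helper names are assumptions made for this example only; they are not
an existing libvirt, QEMU, or DPDK interface:

```python
from typing import Dict


def parse_interface(desc: str) -> Dict[str, str]:
    """Parse a string like 'any_layout=on,max_ring=256' into a dict of strings."""
    result = {}
    for item in desc.split(","):
        key, sep, value = item.strip().partition("=")
        if not key or not sep:
            raise ValueError("malformed entry: %r" % item)
        result[key] = value
    return result


def same_interface(a: str, b: str) -> bool:
    """True if both descriptions define the same values, regardless of order."""
    return parse_interface(a) == parse_interface(b)


recorded = "any_layout=on,max_ring=256"   # stored at VM creation
reordered = "max_ring=256, any_layout=on"  # same interface, different order
changed = "any_layout=on,max_ring=1024"    # ring size differs

parsed = parse_interface(recorded)
```

Since the string is opaque to libvirt and qemu, only the backend would need
this kind of parsing; the management layer would just store the string and
pass it along.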
> >>
> >>Note that typically the list of supported versions can only be
> >>extended, not shrunk. Also, if the host/guest interface
> >>does not change, don't change the current version as
> >>this just creates work for everyone.
> >>
> >>Thoughts? Would this work well for management? dpdk? vpp?
> >>
> >>Thanks!
> >>
> >>--
> >>MST