DPDK patches and discussions
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: "Maxime Coquelin" <maxime.coquelin@redhat.com>,
	dev@dpdk.org, "Stephen Hemminger" <stephen@networkplumber.org>,
	qemu-devel@nongnu.org, libvir-list@redhat.com,
	vpp-dev@lists.fd.io,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>
Subject: Re: [dpdk-dev] dpdk/vpp and cross-version migration for vhost
Date: Tue, 22 Nov 2016 16:53:05 +0200
Message-ID: <20161122164143-mutt-send-email-mst@kernel.org>
In-Reply-To: <20161122130223.GW5048@yliu-dev.sh.intel.com>

On Tue, Nov 22, 2016 at 09:02:23PM +0800, Yuanhan Liu wrote:
> On Thu, Nov 17, 2016 at 07:37:09PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Nov 17, 2016 at 05:49:36PM +0800, Yuanhan Liu wrote:
> > > On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
> > > > 
> > > > 
> > > > On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> > > > >As usual, sorry for the late response :/
> > > > >
> > > > >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> > > > >>Hi!
> > > > >>So it looks like we face a problem with cross-version
> > > > >>migration when using vhost. It's not new but became more
> > > > >>acute with the advent of vhost user.
> > > > >>
> > > > >>For users to be able to migrate between different versions
> > > > >>of the hypervisor the interface exposed to guests
> > > > >>by hypervisor must stay unchanged.
> > > > >>
> > > > >>The problem is that a qemu device is connected
> > > > >>to a backend in another process, so the interface
> > > > >>exposed to guests depends on the capabilities of that
> > > > >>process.
> > > > >>
> > > > >>Specifically, for vhost user interface based on virtio, this includes
> > > > >>the "host features" bitmap that defines the interface, as well as more
> > > > >>host values such as the max ring size.  Adding new features/changing
> > > > >>values to this interface is required to make progress, but on the other
> > > > >>hand we need ability to get the old host features to be compatible.
> > > > >
> > > > >It looks like the same issue as vhost-user reconnect to me. For example,
> > > > >
> > > > >- start dpdk 16.07 & qemu 2.5
> > > > >- kill dpdk
> > > > >- start dpdk 16.11
> > > > >
> > > > >Though DPDK 16.11 has more features compared to DPDK 16.07 (say, indirect),
> > > > >the above should work, because qemu saves the negotiated features before the
> > > > >disconnect and restores them after the reconnection.
> > > > >
> > > > >    commit a463215b087c41d7ca94e51aa347cde523831873
> > > > >    Author: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > > >    Date:   Mon Jun 6 18:45:05 2016 +0200
> > > > >
> > > > >        vhost-net: save & restore vhost-user acked features
> > > > >
> > > > >        The initial vhost-user connection sets the features to be negotiated
> > > > >        with the driver. Renegotiation isn't possible without device reset.
> > > > >
> > > > >        To handle reconnection of vhost-user backend, ensure the same set of
> > > > >        features are provided, and reuse already acked features.
> > > > >
> > > > >        Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > > >
> > > > >
> > > > >So could we do something similar to vhost-user reconnect? I mean, save the
> > > > >acked features before migration and restore them after it. This should be
> > > > >able to keep the compatibility. If the user downgrades the DPDK version, that
> > > > >could also be easily detected, and we then exit with an error to the user:
> > > > >migration failed due to incompatible vhost features.
> > > > >
> > > > >Just some rough thoughts. Makes tiny sense?
> > > > 
> > > > My understanding is that the management tool has to know whether
> > > > versions are compatible before initiating the migration:
> > > 
> > > Makes sense. How about getting and restoring the acked features through
> > > qemu command lines then, say, through the monitor interface?
> > > 
> > > With that, it would be something like:
> > > 
> > > - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
> > > 
> > > - read the acked features (through monitor interface)
> > > 
> > > - start vhost-user backend in the dst host
> > > 
> > > - start qemu in the dst host with the just queried acked features
> > > 
> > > >   QEMU then is expected to use this feature set for the later vhost-user
> > > >   feature negotiation. Exit if feature compatibility is broken.
> > > 
> > > Thoughts?
> > > 
> > > 	--yliu
> > 
> > 
> > You keep assuming that you have the VM started first and
> > figure out things afterwards, but this does not work.
> > 
> > Think about a cluster of machines. You want to start a VM in
> > a way that will ensure compatibility with all hosts
> > in a cluster.
> 
> I see. I was thinking more about the case where the dst
> host (including the qemu and dpdk combo) is given, and
> we then determine whether the migration will be successful
> or not.
> 
> And you are saying that we need to know which host could
> be a good candidate before starting the migration. In such
> a case, we indeed need some input from both qemu and the
> vhost-user backend.
> 
> For DPDK, I think it could be simple. Just as you said, it
> could be either a tiny script, or even a macro defined in
> a source file (we extend it every time we add a
> new feature) that libvirt can read. Or something
> else.

There's the issue of APIs that tweak features, as Maxime
suggested. Maybe the only thing to do is to deprecate them,
but I feel some way for an application to pass info into
the guest might be beneficial.
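
To make the concern concrete, here is a minimal sketch, assuming the host
features are a plain bitmap. The bit numbers are the ones from the virtio
spec; the helper names (offered_features, can_resume) are made up for
illustration and are not an existing DPDK or QEMU API.

```python
# Feature bit numbers as defined by the virtio specification.
VIRTIO_NET_F_MTU = 3
VIRTIO_F_ANY_LAYOUT = 27
VIRTIO_RING_F_INDIRECT_DESC = 28


def bit(n):
    """Return a bitmap with only feature bit n set."""
    return 1 << n


def offered_features(backend_supported, app_disabled):
    """Features the backend offers after the application masked some out.

    Hypothetical helper: models an API that lets the application
    tweak the feature set the library would otherwise expose.
    """
    return backend_supported & ~app_disabled


def can_resume(acked, offered):
    """True if every feature the guest already acked is still offered."""
    return acked & ~offered == 0


old_backend = bit(VIRTIO_F_ANY_LAYOUT)
new_backend = old_backend | bit(VIRTIO_RING_F_INDIRECT_DESC)

# Guest negotiated against the old backend: moving to the new one is fine.
print(can_resume(old_backend, offered_features(new_backend, 0)))

# Guest acked indirect descriptors, but the application on the
# destination disabled them: the guest-visible interface would change.
print(can_resume(new_backend,
                 offered_features(new_backend,
                                  bit(VIRTIO_RING_F_INDIRECT_DESC))))
```

The point of the second case is that a version string for the backend alone
would not catch it: the library version matches, but the application changed
what is offered.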


> > If you don't, guest visible interface will change
> > and you won't be able to migrate.
> > 
> > It does not make sense to discuss feature bits specifically
> > since that is not the only part of interface.
> > For example, max ring size supported might change.
> 
> I don't quite understand why we have to consider the max ring
> size here. Isn't it a virtio device attribute, for which QEMU
> could provide such compatibility information?
>
> I mean, DPDK is supposed to support varying vring sizes; it's up
> to QEMU to give a specific value.

If backend supports s/g of any size up to 2^16, there's no issue.

ATM some backends might be assuming up to 1K s/g since
QEMU never supported bigger ones. We might classify this
as a bug, or we might not and add a feature flag instead.

But it's just an example. There might be more values at issue
in the future.

> > Let me describe how it works in qemu/libvirt.
> > When you install a VM, you can specify compatibility
> > level (aka "machine type"), and you can query the supported compatibility
> > levels. Management uses that to find the supported compatibility
> > and stores the compatibility in XML that is migrated with the VM.
> > There's also a way to find the latest level which is the
> > default unless overridden by user, again this level
> > is recorded and then
> > - management can make sure migration destination is compatible
> > - management can avoid migration to hosts without that support
> 
> Thanks for the info, it helps.
> 
> ...
> > > > >>As version here is an opaque string for libvirt and qemu,
> > > > >>anything can be used - but I suggest either a list
> > > > >>of values defining the interface, e.g.
> > > > >>any_layout=on,max_ring=256
> > > > >>or a version including the name and vendor of the backend,
> > > > >>e.g. "org.dpdk.v4.5.6".
> 
> The version scheme may not be ideal here. Assume a QEMU is supposed
> to work with a specific DPDK version. However, the user may disable some
> newer features through the qemu command line, so that it could also work
> with an older DPDK version. Using the version scheme would not allow us
> to do such a migration to an older DPDK version. MTU is a live example
> here: when the MTU feature is provided by QEMU but is actually disabled
> by the user, it could also work with an older DPDK without MTU support.
> 
> 	--yliu

OK, so does a list of values look better to you then?
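
For illustration, a rough sketch of how management could compare such a list
against what a destination host offers. The string format follows the
any_layout=on,max_ring=256 example quoted in this thread; the function names
and the comparison rules (an "on" flag must be offered, a number is treated
as a minimum the host must reach, an "off" flag matches any host) are
assumptions, not an agreed interface.

```python
def parse_interface(spec):
    """Parse 'key=value,key=value' into a dict.

    'on'/'off' become booleans, decimal numbers become ints,
    anything else is kept as a string.
    """
    result = {}
    for item in spec.split(","):
        key, _, value = item.strip().partition("=")
        if value in ("on", "off"):
            result[key] = (value == "on")
        elif value.isdigit():
            result[key] = int(value)
        else:
            result[key] = value
    return result


def host_supports(required, offered):
    """True if a host offering `offered` can expose `required` to the guest."""
    for key, want in required.items():
        have = offered.get(key)
        if isinstance(want, bool):
            # A feature the VM has turned off can run on any host.
            if want and have is not True:
                return False
        elif isinstance(want, int):
            # Assumed rule: numeric values are limits the host must reach.
            if isinstance(have, bool) or not isinstance(have, int) or have < want:
                return False
        elif have != want:
            return False
    return True


vm = parse_interface("any_layout=on,max_ring=256")
newer_host = parse_interface("any_layout=on,max_ring=1024,mtu=on")
older_host = parse_interface("any_layout=off,max_ring=1024")

print(host_supports(vm, newer_host))
print(host_supports(vm, older_host))
# A VM with MTU disabled can still land on a host that lacks MTU support.
print(host_supports(parse_interface("mtu=off"), older_host))
```

The last call is the MTU case raised above: with a list of values, a feature
that is present in QEMU but disabled for this VM does not rule out an older
backend.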



> > > > >>
> > > > >>Note that typically the list of supported versions can only be
> > > > >>extended, not shrunk. Also, if the host/guest interface
> > > > >>does not change, don't change the current version as
> > > > >>this just creates work for everyone.
> > > > >>
> > > > >>Thoughts? Would this work well for management? dpdk? vpp?
> > > > >>
> > > > >>Thanks!
> > > > >>
> > > > >>--
> > > > >>MST
