DPDK patches and discussions
From: Thomas F Herbert <therbert@redhat.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>,
	Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	dev@dpdk.org, qemu-devel@nongnu.org, libvir-list@redhat.com,
	vpp-dev@lists.fd.io,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Billy McFall" <bmcfall@redhat.com>
Subject: Re: [dpdk-dev] [vpp-dev] dpdk/vpp and cross-version migration for vhost
Date: Thu, 17 Nov 2016 10:25:16 -0500
Message-ID: <01af8441-14e2-4933-c96e-0c8f3789c33c@redhat.com>
In-Reply-To: <20161117094936.GN5048@yliu-dev.sh.intel.com>

+Billy McFall


On 11/17/2016 04:49 AM, Yuanhan Liu wrote:
> On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
>>
>> On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
>>> As usual, sorry for the late response :/
>>>
>>> On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
>>>> Hi!
>>>> So it looks like we face a problem with cross-version
>>>> migration when using vhost. It's not new but became more
>>>> acute with the advent of vhost user.
>>>>
>>>> For users to be able to migrate between different versions
>>>> of the hypervisor, the interface exposed to guests
>>>> by the hypervisor must stay unchanged.
>>>>
>>>> The problem is that a qemu device is connected
>>>> to a backend in another process, so the interface
>>>> exposed to guests depends on the capabilities of that
>>>> process.
>>>>
>>>> Specifically, for the vhost-user interface based on virtio, this includes
>>>> the "host features" bitmap that defines the interface, as well as other
>>>> host values such as the max ring size.  Adding new features or changing
>>>> values in this interface is required to make progress, but on the other
>>>> hand we need the ability to get the old host features to stay compatible.
>>> It looks like the same issue as vhost-user reconnect to me. For example,
>>>
>>> - start dpdk 16.07 & qemu 2.5
>>> - kill dpdk
>>> - start dpdk 16.11
>>>
>>> Though DPDK 16.11 has more features compared to DPDK 16.07 (say, indirect),
>>> the above should work, because qemu saves the negotiated features before the
>>> disconnect and restores them after the reconnection.
>>>
>>>     commit a463215b087c41d7ca94e51aa347cde523831873
>>>     Author: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>     Date:   Mon Jun 6 18:45:05 2016 +0200
>>>
>>>         vhost-net: save & restore vhost-user acked features
>>>
>>>         The initial vhost-user connection sets the features to be negotiated
>>>         with the driver. Renegotiation isn't possible without device reset.
>>>
>>>         To handle reconnection of vhost-user backend, ensure the same set of
>>>         features are provided, and reuse already acked features.
>>>
>>>         Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>
>>>
>>> So could we do something similar for migration? I mean, save the acked
>>> features before migration and restore them after it. This should be able
>>> to keep the compatibility. If the user downgrades the DPDK version, that
>>> could also be easily detected, and we could then exit with an error to the
>>> user: migration failed due to incompatible vhost features.
>>>
>>> Just some rough thoughts. Does that make any sense?
>> My understanding is that the management tool has to know whether
>> versions are compatible before initiating the migration:
> Makes sense. How about getting and restoring the acked features through
> qemu command lines then, say, through the monitor interface?
>
> With that, it would be something like:
>
> - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
>
> - read the acked features (through monitor interface)
>
> - start vhost-user backend in the dst host
>
> - start qemu in the dst host with the just queried acked features
>
>    QEMU is then expected to use this feature set for the later vhost-user
>    feature negotiation, and to exit if feature compatibility is broken.
>
> Thoughts?
>
> 	--yliu
>
>>   1. The downtime could be unpredictable if a VM has to move from host
>>      to host multiple times, which is problematic, especially for NFV.
>>   2. If migration is not possible, maybe the management tool would
>>      prefer not to interrupt the VM on current host.
>>
>> I have little experience with migration though, so I could be mistaken.
>>
>> Thanks,
>> Maxime
>>
>>> 	--yliu
>>>> To solve this problem within qemu, qemu has a versioning system based on
>>>> a machine type concept, which fundamentally is a version string: by
>>>> specifying that string one can get hardware compatible with a previous
>>>> qemu version. QEMU also reports the latest version and list of versions
>>>> supported so libvirt records the version at VM creation and then is
>>>> careful to use this machine version whenever it migrates a VM.
>>>>
>>>> One might wonder how this is solved with a kernel vhost backend. The
>>>> answer is that it mostly isn't - instead an assumption is made, that
>>>> qemu versions are deployed together with the kernel - this is generally
>>>> true for downstreams.  Thus whenever qemu gains a new feature, it is
>>>> already supported by the kernel as well.  However, if one attempts
>>>> migration with a new qemu from a system with a new kernel to one with
>>>> an old kernel, one would get a failure.
>>>>
>>>> In a world where we have multiple userspace backends, with some of
>>>> these supplied by ISVs, this seems unrealistic.
>>>>
>>>> IMO we need to support vhost backend versioning, ideally
>>>> in a way that will also work for vhost kernel backends.
>>>>
>>>> So I'd like to get some input from both backend and management
>>>> developers on what a good solution would look like.
>>>>
>>>> If we want to emulate the qemu solution, this involves adding the
>>>> concept of interface versions to dpdk.  For example, dpdk could supply a
>>>> file (or a utility printing it?) with a list of versions: the latest and
>>>> the versions supported. libvirt could read that and
>>>> - store latest version at vm creation
>>>> - pass it around with the vm
>>>> - pass it to qemu
>>>>
>>>> From here, qemu could pass this over the vhost-user channel,
>>>> thus making sure it's initialized with the correct
>>>> compatible interface.
>>>>
>>>> As version here is an opaque string for libvirt and qemu,
>>>> anything can be used - but I suggest either a list
>>>> of values defining the interface, e.g.
>>>> any_layout=on,max_ring=256
>>>> or a version including the name and vendor of the backend,
>>>> e.g. "org.dpdk.v4.5.6".
>>>>
>>>> Note that typically the list of supported versions can only be
>>>> extended, not shrunk. Also, if the host/guest interface
>>>> does not change, don't change the current version as
>>>> this just creates work for everyone.
>>>>
>>>> Thoughts? Would this work well for management? dpdk? vpp?
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> MST
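
The acked-features idea in the quoted discussion can be sketched as a bitmap subset check: migration is safe only if every feature bit the guest acked on the source is also offered by the destination backend. This is only an illustration, not QEMU or DPDK code. The function names are hypothetical, and the bit positions are assumptions modelled on the virtio feature bits.

```python
# Illustrative feature bits (assumed positions, modelled on virtio).
MRG_RXBUF = 1 << 15
ANY_LAYOUT = 1 << 27
INDIRECT_DESC = 1 << 28


def missing_features(acked, backend):
    """Return the acked feature bits that the backend does not offer."""
    return acked & ~backend


def can_migrate(acked, dst_backend):
    """True if every acked bit is also offered by the destination backend."""
    return missing_features(acked, dst_backend) == 0


# Hypothetical feature sets for an older and a newer backend release.
OLD_BACKEND = MRG_RXBUF | ANY_LAYOUT
NEW_BACKEND = OLD_BACKEND | INDIRECT_DESC

if __name__ == "__main__":
    # Upgrade: the guest acked only old features, the new backend has them all.
    print(can_migrate(OLD_BACKEND, NEW_BACKEND))
    # Downgrade: the guest acked a feature the old backend lacks.
    print(can_migrate(NEW_BACKEND, OLD_BACKEND))
```

In this sketch the upgrade direction passes and the downgrade direction is rejected, which matches the "exit with an error" behaviour suggested in the thread.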

-- 
*Thomas F Herbert*
SDN Group
Office of Technology
*Red Hat*
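
The interface-version proposal in the quoted message (an opaque string such as any_layout=on,max_ring=256, plus a list of supported versions that may only grow) can be sketched as follows. All function names here are hypothetical, and the version strings in the usage lines are made up for illustration. Nothing below is an existing DPDK, QEMU or libvirt API.

```python
def parse_interface(spec):
    """Parse a 'key=value,key=value' interface description into a dict."""
    values = {}
    for item in spec.split(","):
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            raise ValueError("malformed entry: %r" % item)
        values[key.strip()] = value.strip()
    return values


def select_version(recorded, supported):
    """Return the version recorded at VM creation if the backend supports it."""
    if recorded not in supported:
        raise RuntimeError("backend does not support interface %r" % recorded)
    return recorded


def is_valid_update(old_supported, new_supported):
    """A backend update may extend the supported list but never shrink it."""
    return set(old_supported) <= set(new_supported)


if __name__ == "__main__":
    print(parse_interface("any_layout=on,max_ring=256"))
    # Made-up version names, for illustration only.
    supported = ["org.example.v1", "org.example.v2"]
    print(select_version("org.example.v1", supported))
    print(is_valid_update(["org.example.v1"], supported))
```

Treating the string as opaque in the management layer and parsing it only in the backend keeps libvirt and qemu independent of what any particular backend puts into it.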
