From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
    by dpdk.org (Postfix) with ESMTP id 14EEB5688;
    Thu, 17 Nov 2016 10:48:48 +0100 (CET)
Received: from fmsmga005.fm.intel.com ([10.253.24.32])
    by orsmga105.jf.intel.com with ESMTP; 17 Nov 2016 01:48:47 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.31,653,1473145200"; d="scan'208";a="32371050"
Received: from yliu-dev.sh.intel.com (HELO yliu-dev) ([10.239.67.162])
    by fmsmga005.fm.intel.com with ESMTP; 17 Nov 2016 01:48:46 -0800
Date: Thu, 17 Nov 2016 17:49:36 +0800
From: Yuanhan Liu
To: Maxime Coquelin
Cc: "Michael S. Tsirkin", dev@dpdk.org, Stephen Hemminger,
    qemu-devel@nongnu.org, libvir-list@redhat.com, vpp-dev@lists.fd.io,
    Marc-André Lureau
Message-ID: <20161117094936.GN5048@yliu-dev.sh.intel.com>
References: <20161011173526-mutt-send-email-mst@kernel.org>
    <20161117082902.GM5048@yliu-dev.sh.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.5.23 (2014-03-12)
Subject: Re: [dpdk-dev] dpdk/vpp and cross-version migration for vhost
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
X-List-Received-Date: Thu, 17 Nov 2016 09:48:49 -0000

On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
>
>
> On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> >As usual, sorry for late response :/
> >
> >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> >>Hi!
> >>So it looks like we face a problem with cross-version
> >>migration when using vhost. It's not new but became more
> >>acute with the advent of vhost user.
> >>
> >>For users to be able to migrate between different versions
> >>of the hypervisor the interface exposed to guests
> >>by hypervisor must stay unchanged.
> >>
> >>The problem is that a qemu device is connected
> >>to a backend in another process, so the interface
> >>exposed to guests depends on the capabilities of that
> >>process.
> >>
> >>Specifically, for vhost user interface based on virtio, this includes
> >>the "host features" bitmap that defines the interface, as well as more
> >>host values such as the max ring size. Adding new features/changing
> >>values to this interface is required to make progress, but on the other
> >>hand we need ability to get the old host features to be compatible.
> >
> >It looks like the same issue as vhost-user reconnect to me. For example,
> >
> >- start dpdk 16.07 & qemu 2.5
> >- kill dpdk
> >- start dpdk 16.11
> >
> >Though DPDK 16.11 has more features compared to dpdk 16.07 (say, indirect),
> >the above should work, because qemu saves the negotiated features before
> >the disconnect and restores them after the reconnection.
> >
> >    commit a463215b087c41d7ca94e51aa347cde523831873
> >    Author: Marc-André Lureau
> >    Date:   Mon Jun 6 18:45:05 2016 +0200
> >
> >        vhost-net: save & restore vhost-user acked features
> >
> >        The initial vhost-user connection sets the features to be negotiated
> >        with the driver. Renegotiation isn't possible without device reset.
> >
> >        To handle reconnection of vhost-user backend, ensure the same set of
> >        features are provided, and reuse already acked features.
> >
> >        Signed-off-by: Marc-André Lureau
> >
> >
> >So we could do something similar for migration? I mean, save the acked
> >features before migration and restore them after it. This should be able
> >to keep the compatibility. If the user downgrades the DPDK version, it
> >also could be easily detected, and then exit with an error to the user:
> >migration failed due to incompatible vhost features.
> >
> >Just some rough thoughts. Makes tiny sense?
>
> My understanding is that the management tool has to know whether
> versions are compatible before initiating the migration:

Makes sense. How about getting and restoring the acked features through
qemu command lines then, say, through the monitor interface?

With that, it would be something like:

- start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host

- read the acked features (through monitor interface)

- start vhost-user backend in the dst host

- start qemu in the dst host with the just queried acked features

  QEMU then is expected to use this feature set for the later vhost-user
  feature negotiation. Exit if feature compatibility is broken.

Thoughts?

    --yliu

> 1. The downtime could be unpredictable if a VM has to move from host
>    to host multiple times, which is problematic, especially for NFV.
> 2. If migration is not possible, maybe the management tool would
>    prefer not to interrupt the VM on current host.
>
> I have little experience with migration though, so I could be mistaken.
>
> Thanks,
> Maxime
>
> >
> >    --yliu
> >>
> >>To solve this problem within qemu, qemu has a versioning system based on
> >>a machine type concept which fundamentally is a version string, by
> >>specifying that string one can get hardware compatible with a previous
> >>qemu version. QEMU also reports the latest version and list of versions
> >>supported so libvirt records the version at VM creation and then is
> >>careful to use this machine version whenever it migrates a VM.
> >>
> >>One might wonder how is this solved with a kernel vhost backend. The
> >>answer is that it mostly isn't - instead an assumption is made, that
> >>qemu versions are deployed together with the kernel - this is generally
> >>true for downstreams. Thus whenever qemu gains a new feature, it is
> >>already supported by the kernel as well. However, if one attempts
> >>migration with a new qemu from a system with a new kernel to one with
> >>an old kernel, one would get a failure.
> >>
> >>In the world where we have multiple userspace backends, with some of
> >>these supplied by ISVs, this seems non-realistic.
> >>
> >>IMO we need to support vhost backend versioning, ideally
> >>in a way that will also work for vhost kernel backends.
> >>
> >>So I'd like to get some input from both backend and management
> >>developers on what a good solution would look like.
> >>
> >>If we want to emulate the qemu solution, this involves adding the
> >>concept of interface versions to dpdk. For example, dpdk could supply a
> >>file (or utility printing?) with list of versions: latest and versions
> >>supported. libvirt could read that and
> >>- store latest version at vm creation
> >>- pass it around with the vm
> >>- pass it to qemu
> >>
> >>From here, qemu could pass this over the vhost-user channel,
> >>thus making sure it's initialized with the correct
> >>compatible interface.
> >>
> >>As version here is an opaque string for libvirt and qemu,
> >>anything can be used - but I suggest either a list
> >>of values defining the interface, e.g.
> >>any_layout=on,max_ring=256
> >>or a version including the name and vendor of the backend,
> >>e.g. "org.dpdk.v4.5.6".
> >>
> >>Note that typically the list of supported versions can only be
> >>extended, not shrunk. Also, if the host/guest interface
> >>does not change, don't change the current version as
> >>this just creates work for everyone.
> >>
> >>Thoughts? Would this work well for management? dpdk? vpp?
> >>
> >>Thanks!
> >>
> >>--
> >>MST
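
For illustration, the check proposed in the reply above (offer only the
features acked on the src host, exit if the dst backend lacks any of them)
could be sketched roughly as below. This is only a sketch under stated
assumptions: features are treated as plain integer bitmaps, and the function
name features_to_offer and the way the two bitmaps are obtained are
hypothetical, not an existing QEMU, DPDK or libvirt interface. Only the two
bit positions are taken from the virtio specification.

```python
# Hypothetical sketch; not QEMU, DPDK or libvirt code.

# Feature bit positions as defined by the virtio specification.
VIRTIO_F_ANY_LAYOUT = 1 << 27
VIRTIO_RING_F_INDIRECT_DESC = 1 << 28


def features_to_offer(acked_on_src: int, dst_backend: int) -> int:
    """Return the feature set the dst side should negotiate.

    acked_on_src: features the guest acked on the source host
                  (assumed to be queried before migration).
    dst_backend:  features the destination vhost-user backend supports.
    """
    # Any acked bit that the destination backend cannot provide would
    # change the interface seen by the guest, so refuse to continue.
    missing = acked_on_src & ~dst_backend
    if missing:
        raise RuntimeError(
            "migration failed due to incompatible vhost features: "
            f"{missing:#x}"
        )
    # Extra features on the destination are ignored: only the set that
    # was already acked is offered again.
    return acked_on_src


if __name__ == "__main__":
    older = VIRTIO_F_ANY_LAYOUT
    newer = VIRTIO_F_ANY_LAYOUT | VIRTIO_RING_F_INDIRECT_DESC
    # Moving to a backend with more features keeps the old acked set.
    print(hex(features_to_offer(older, newer)))
```

Moving in the other direction (acked set "newer", destination backend
"older") raises the error, which corresponds to the DPDK downgrade case
mentioned earlier in the thread.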