From: Yuanhan Liu
To: "Michael S. Tsirkin"
Cc: Maxime Coquelin, dev@dpdk.org, Stephen Hemminger,
    qemu-devel@nongnu.org, libvir-list@redhat.com, vpp-dev@lists.fd.io,
    Marc-André Lureau
Date: Thu, 24 Nov 2016 14:31:29 +0800
Subject: Re: [dpdk-dev] dpdk/vpp and cross-version migration for vhost
Message-ID: <20161124063129.GE5048@yliu-dev.sh.intel.com>
In-Reply-To: <20161122164143-mutt-send-email-mst@kernel.org>
References: <20161011173526-mutt-send-email-mst@kernel.org>
    <20161117082902.GM5048@yliu-dev.sh.intel.com>
    <20161117094936.GN5048@yliu-dev.sh.intel.com>
    <20161117192445-mutt-send-email-mst@kernel.org>
    <20161122130223.GW5048@yliu-dev.sh.intel.com>
    <20161122164143-mutt-send-email-mst@kernel.org>

On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
> > > You keep assuming that you have the VM started first and
> > > figure out things afterwards, but this does not work.
> > >
> > > Think about a cluster of machines. You want to start a VM in
> > > a way that will ensure compatibility with all hosts
> > > in a cluster.
> >
> > I see. I was thinking more about the case where the dst
> > host (including the qemu and dpdk combo) is given, and
> > we then determine whether the migration will be successful
> > or not.
> >
> > And you are saying that we need to know which host could
> > be a good candidate before starting the migration. In that
> > case, we indeed need some input from both qemu and the
> > vhost-user backend.
> >
> > For DPDK, I think it could be simple, just as you said: it
> > could be either a tiny script, or even a macro defined in
> > a source file (we extend it every time we add a
> > new feature) for libvirt to read. Or something
> > else.
>
> There's the issue of APIs that tweak features as Maxime
> suggested.

Yes, it's a good point.

> Maybe the only thing to do is to deprecate it,

Looks like so.

> but I feel some way for application to pass info into
> guest might be beneficial.

The two APIs are just for tweaking the feature bits DPDK supports
before any device gets connected. It's another way to disable some
features (the other obvious way is through the QEMU command line).

IMO, it's only a bit handy in a case like this: we have a bunch of
VMs. Instead of disabling something through qemu one by one, we could
disable it once in DPDK.

But I doubt its usefulness. It's only used in DPDK's vhost example
after all. It is used neither in the vhost pmd nor in OVS.

> > > If you don't, guest visible interface will change
> > > and you won't be able to migrate.
> > >
> > > It does not make sense to discuss feature bits specifically
> > > since that is not the only part of interface.
> > > For example, max ring size supported might change.
> >
> > I don't quite understand why we have to consider the max ring
> > size here. Isn't it a virtio device attribute, for which QEMU
> > could provide such compatibility information?
> >
> > I mean, DPDK is supposed to support varying vring sizes; it's
> > up to QEMU to give a specific value.
>
> If backend supports s/g of any size up to 2^16, there's no issue.

I don't know about others, but I see no issues in DPDK.

> ATM some backends might be assuming up to 1K s/g since
> QEMU never supported bigger ones. We might classify this
> as a bug, or not and add a feature flag.
>
> But it's just an example. There might be more values at issue
> in the future.

Yeah, maybe. But we could analyze them one by one.

> > > Let me describe how it works in qemu/libvirt.
> > > When you install a VM, you can specify compatibility
> > > level (aka "machine type"), and you can query the supported
> > > compatibility levels. Management uses that to find the
> > > supported compatibility and stores the compatibility in XML
> > > that is migrated with the VM.
> > > There's also a way to find the latest level which is the
> > > default unless overridden by user, again this level
> > > is recorded and then
> > > - management can make sure migration destination is compatible
> > > - management can avoid migration to hosts without that support
> >
> > Thanks for the info, it helps.
> >
> > ...
> > > > > >>As version here is an opaque string for libvirt and qemu,
> > > > > >>anything can be used - but I suggest either a list
> > > > > >>of values defining the interface, e.g.
> > > > > >>any_layout=on,max_ring=256
> > > > > >>or a version including the name and vendor of the backend,
> > > > > >>e.g. "org.dpdk.v4.5.6".
> >
> > The version scheme may not be ideal here. Assume a QEMU is meant
> > to work with a specific DPDK version; the user may still disable
> > some newer features through the qemu command line, so that it
> > could also work with an older DPDK version. Using the version
> > scheme would not allow us to do such a migration to an older DPDK
> > version. MTU is a live example here (when the MTU feature is
> > provided by QEMU but is actually disabled by the user, it could
> > also work with an older DPDK without MTU support).
> >
> > 	--yliu
>
> OK, so does a list of values look better to you then?

Yes, if there is no better way.

And I think it may be better not to list all those features
literally. Instead, using a number would be better, say,
features=0xdeadbeef.

Listing the feature names means all the components involved here
(QEMU, libvirt, DPDK, VPP, and maybe more backends) have to agree on
the exact same feature names. Though it may not be a big deal, it
lacks some flexibility.

A feature bitmask will not have this issue.

	--yliu
>
>
> > > > > >>
> > > > > >>Note that typically the list of supported versions can only be
> > > > > >>extended, not shrunk. Also, if the host/guest interface
> > > > > >>does not change, don't change the current version as
> > > > > >>this just creates work for everyone.
> > > > > >>
> > > > > >>Thoughts? Would this work well for management? dpdk? vpp?
> > > > > >>
> > > > > >>Thanks!
> > > > > >>
> > > > > >>--
> > > > > >>MST
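
To make the features=0x... idea above a bit more concrete, here is a
rough sketch of the check a management layer might do. It is only an
illustration under assumptions: the "features=<hex>" string format, the
helper names and the small bit table are hypothetical, not an existing
QEMU, libvirt or DPDK interface. The point is that the comparison uses
the guest-visible bits (after the user disabled some on the QEMU command
line), not the backend version.

```python
# Illustrative feature bit positions. The names mirror virtio feature
# names, but this table is an example only, not an authoritative list.
VIRTIO_NET_F_MTU = 3
VIRTIO_NET_F_MRG_RXBUF = 15
VIRTIO_F_ANY_LAYOUT = 27


def mask(*bits):
    """Build a feature bitmask from bit positions."""
    m = 0
    for b in bits:
        m |= 1 << b
    return m


def parse_features(text):
    """Parse a hypothetical 'features=0x...' string into an integer."""
    key, sep, value = text.partition("=")
    if key != "features" or not sep:
        raise ValueError("expected 'features=<hex>', got %r" % text)
    return int(value, 16)


def missing_features(guest_visible, dst_supported):
    """Bits the guest already sees that the destination backend lacks."""
    return guest_visible & ~dst_supported


def can_migrate(guest_visible, dst_supported):
    """Migration is safe only if no guest-visible bit would disappear."""
    return missing_features(guest_visible, dst_supported) == 0


# A newer backend with MTU support and an older one without it.
new_backend = mask(VIRTIO_NET_F_MTU, VIRTIO_NET_F_MRG_RXBUF,
                   VIRTIO_F_ANY_LAYOUT)
old_backend = mask(VIRTIO_NET_F_MRG_RXBUF, VIRTIO_F_ANY_LAYOUT)

# The user disabled MTU on the QEMU command line, so the guest never saw it.
guest_no_mtu = new_backend & ~mask(VIRTIO_NET_F_MTU)
```

With these values, can_migrate(guest_no_mtu, old_backend) holds even
though the VM was started on the newer backend, while a guest that
negotiated MTU could not move to the older one. A plain version string
could not express that difference.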