DPDK patches and discussions
From: "Daniel P. Berrange" <berrange@redhat.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michal Privoznik <mprivozn@redhat.com>,
	Kevin Traynor <ktraynor@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Ciara Loftus <ciara.loftus@intel.com>,
	mark.b.kavanagh@intel.com, Flavio Leitner <fleitner@redhat.com>,
	Yuanhan Liu <yuanhan.liu@linux.intel.com>,
	Daniele Di Proietto <diproiettod@vmware.com>,
	"dev@openvswitch.org" <dev@openvswitch.org>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"libvir-list@redhat.com" <libvir-list@redhat.com>
Subject: Re: [dpdk-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support
Date: Wed, 1 Feb 2017 11:41:19 +0000
Message-ID: <20170201114119.GE3232@redhat.com>
In-Reply-To: <f7454909-a95a-9438-c2ac-26bcde83601c@redhat.com>

On Wed, Feb 01, 2017 at 12:33:22PM +0100, Maxime Coquelin wrote:
> 
> 
> On 02/01/2017 10:43 AM, Daniel P. Berrange wrote:
> > On Wed, Feb 01, 2017 at 10:14:54AM +0100, Michal Privoznik wrote:
> > > On 02/01/2017 09:35 AM, Maxime Coquelin wrote:
> > 
> > > > Solution 3: Libvirt queries OVS for vhost backend version string: *OK*
> > > > ======================================================================
> > > > 
> > > > 
> > > >  The idea is to have a table of supported versions, associated to
> > > > key/value pairs. Libvirt could query the list of supported versions
> > > > strings for each hosts, and select the first common one among all hosts.
> > > 
> > > > How does libvirt know what hosts to ask? Libvirt aims at managing a
> > > > single host. It has no knowledge of other hosts on the network. That's
> > > > a task for upper layers like RHEV, OpenStack, etc.
> > > 
> > > > 
> > > >  Then, libvirt would ask OVS to probe the vhost-user interfaces in the
> > > > selected version (compatibility mode). For example host A runs OVS-2.7,
> > > > and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode
> > > > (e.g. with indirect descriptors disabled), which should be selected at
> > > > vhost-user interface probe time.
> > > > 
> > > >  Advantage of doing so is that libvirt does not need any update if new
> > > > keys are introduced (i.e. it does not need to know how the new keys have
> > > > to be handled), all these checks remain in OVS's vhost-user implementation.
> > > 
> > > And that's where they should stay. Duplicating code between projects
> > > will inevitably lead to a divergence.
> > > 
> > > > 
> > > >  Ideally, we would support per vhost-user interface compatibility mode,
> > > > which may have an impact also on DPDK API, as the Virtio feature update
> > > > API is global, and not per port.
> > > 
> > > In general, I don't think we want any kind of this logic in libvirt. Either:
> > > 
> > > a) fallback logic should be implemented in qemu (e.g. upon migration it
> > > should detect that the migrated guest uses certain version and thus set
> > > backend to use that version or error out and cancel migration), or
> > > 
> > > > b) libvirt would grow another element/attribute to specify the version of
> > > > the vhost-user backend in use and do nothing more than pass it to qemu. At
> > > > the same time, we can provide an API (or extend an existing one, e.g.
> > > > virsh domcapabilities) to list all available versions on a given host.
> > > Upper layer, which knows what are the possible hosts suitable for
> > > virtualization, can then use this API to ask all the hosts, construct
> > > the matrix, select preferred version and put it into libvirt's domain XML.
> > > 
> > > > But frankly, I don't like b) that much. Let's put the fact this is OVS
> > > > aside for a moment. Just pretend this is a generic device in qemu. Would
> > > > we do the same magic with it? No! Or let's talk about machine types. You
> > > > spawn -M type$((X+1)) guest and then decide to migrate it to a host with
> > > > older qemu which supports just typeX. Well, you get an error. Do we care?
> > > > Not at all! It's your responsibility (as user/admin) to upgrade the qemu
> > > > so that it supports the new machine type. I think the same applies to OVS.
> > 
> > With machine types, if the latest machine type is X, libvirt allows
> > the mgmt app to spawn a guest with machine type X-1, so that you can
> > later migrate the VM to a host with older QEMU.
> > 
> > With the vhost user device, the VM will always be launched with version
> > Y. There's currently no way to request launching the vhost user device
> > with version Y-1. So even if you set your machine type to X-1 for
> > compat with an older host, vhost user will be stuck at version Y, preventing
> > the migration.
> > 
> > One argument would be to say that the vhost user featureset should be
> > determined by the machine type version, instead of introducing a new
> > version. The complexity is that vhost-user is a pretty dumb device
> > from QEMU's POV - most of the interesting logic & the features that
> > need to be constrained lie in code outside of QEMU, in whatever
> > server is connected to the vhost user socket.
> > 
> > So I can see the value in allowing a simple version string to be
> > associated with the vhost-user NIC.
> > 
> > What I'm unclear about is how we would be able to report capabilities
> > for a host to enumerate what versions are possible. Libvirt doesn't
> > interact with any of the 3rd party vhost user servers, so it is probably
> > out of scope - it'd be up to the higher level mgmt app to talk to DPDK
> > and figure out what versions it supports.
> > 
> > That does make me wonder though if libvirt & QEMU need to be involved
> > at all in any part.
> 
> Indeed, if the higher level management tool decides on the VM's machine
> type, that is also where the decision should be made for the vhost-user
> backend. I now understand it does not make much sense to have libvirt
> involved in this; all of it (querying/selecting/setting compat mode) should
> be manageable in the upper layer.
> 
> I'm not familiar with these layers, so your inputs are really helpful.
> 
> > 
> > When provisioning a new guest, the mgmt app presumably has to talk to
> > DPDK to setup the NIC port, so DPDK is ready when QEMU launches and
> > connects. Surely as part of that NIC port setup, it could directly
> > tell DPDK which version or featureset to permit on the port ? It is
> > not obvious why the version string has to be fed in via QEMU.
> No it is not, my proposal was that libvirt set the version string in
> OVS, QEMU was not involved.
> 
> From these inputs, the idea remains valid, except that libvirt is not
> the right place to manage this. Instead, RHEV, OpenStack or any other
> management tool should handle the compat mode selection.
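
For illustration only, the selection step discussed above ("select the first
common one among all hosts") could look roughly like the sketch below when done
in the management layer. The host names, the version strings and the
select_common_version() helper are all hypothetical; nothing here is an
existing OVS, DPDK or libvirt API, and how the per-host lists would be obtained
is left open.

```python
# Hypothetical input: for each host, the version strings its vhost-user
# backend claims to support, listed in that host's order of preference.
def select_common_version(supported_by_host):
    """Return the first version, in the first host's preference order,
    that every host supports, or None if there is no common version."""
    host_lists = list(supported_by_host.values())
    if not host_lists:
        return None
    # Versions present on every host.
    common = set(host_lists[0]).intersection(*host_lists[1:])
    # Walk the first host's list so the preference order is respected.
    for version in host_lists[0]:
        if version in common:
            return version
    return None


hosts = {
    "host-a": ["ovs-2.7", "ovs-2.6"],  # newer host with a 2.6 compat mode
    "host-b": ["ovs-2.6"],             # older host
}
selected = select_common_version(hosts)
print(selected)
```

With the two example hosts this picks the 2.6 compatibility mode, which matches
the OVS-2.7 / OVS-2.6 scenario from the original proposal.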

It depends where / how in OVS it needs to be set. The only stuff libvirt
does with OVS is to run 'add-port' and 'del-port' commands via the ovs
cli tool. We pass through arguments from the port profile stored in the
XML config.

  <interface type='bridge'>
    <source bridge='ovsbr'/>
    <virtualport type='openvswitch'>
      <parameters profileid='menial' interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
    </virtualport>
  </interface>

e.g. the attributes in <parameters/> get passed as CLI args to the 'add-port'
command. So if add-port needs this new version string, then we'd need
to add the version to the openvswitch virtualport XML.

If the version is provided to OVS in a different command, then it would
probably be outside the scope of libvirt.
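
As a rough sketch of the pass-through described above, assuming a new
'version' attribute were added to <parameters/> (it does not exist today),
the mapping from XML to add-port arguments might look like this. The
external-ids key names, the 'vnet0' interface name and the
build_add_port_args() helper are placeholders for illustration; they are not
the exact arguments libvirt generates, and OVS does not currently interpret
such a version key.

```python
import xml.etree.ElementTree as ET

# The interface XML from above, plus a HYPOTHETICAL 'version' attribute.
INTERFACE_XML = """
<interface type='bridge'>
  <source bridge='ovsbr'/>
  <virtualport type='openvswitch'>
    <parameters profileid='menial'
                interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'
                version='ovs-2.6'/>
  </virtualport>
</interface>
"""


def build_add_port_args(interface_xml, ifname):
    """Turn the virtualport parameters into an 'ovs-vsctl add-port' argv.

    Each attribute of <parameters/> is forwarded as an external-ids
    key/value pair; the key names are illustrative placeholders.
    """
    root = ET.fromstring(interface_xml)
    bridge = root.find("source").get("bridge")
    params = root.find("virtualport/parameters").attrib

    args = ["ovs-vsctl", "add-port", bridge, ifname,
            "--", "set", "Interface", ifname]
    for name, value in sorted(params.items()):
        args.append(f"external-ids:{name}={value}")
    return args


args = build_add_port_args(INTERFACE_XML, "vnet0")
print(" ".join(args))
```

The point is only that libvirt forwards whatever the port profile contains; it
would not interpret the version string itself.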

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
