From: Thomas F Herbert
Date: Thu, 17 Nov 2016 10:25:16 -0500
To: Yuanhan Liu, Maxime Coquelin
Cc: "Michael S. Tsirkin", dev@dpdk.org, qemu-devel@nongnu.org, libvir-list@redhat.com, vpp-dev@lists.fd.io, Marc-André Lureau, Billy McFall
Subject: Re: [dpdk-dev] [vpp-dev] dpdk/vpp and cross-version migration for vhost
Message-ID: <01af8441-14e2-4933-c96e-0c8f3789c33c@redhat.com>
In-Reply-To: <20161117094936.GN5048@yliu-dev.sh.intel.com>
References: <20161011173526-mutt-send-email-mst@kernel.org> <20161117082902.GM5048@yliu-dev.sh.intel.com> <20161117094936.GN5048@yliu-dev.sh.intel.com>

+Billy McFall

On 11/17/2016 04:49 AM, Yuanhan Liu wrote:
> On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
>>
>> On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
>>> As usual, sorry for late response :/
>>>
>>> On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
>>>> Hi!
>>>> So it looks like we face a problem with cross-version
>>>> migration when using vhost. It's not new but became more
>>>> acute with the advent of vhost user.
>>>>
>>>> For users to be able to migrate between different versions
>>>> of the hypervisor the interface exposed to guests
>>>> by hypervisor must stay unchanged.
>>>>
>>>> The problem is that a qemu device is connected
>>>> to a backend in another process, so the interface
>>>> exposed to guests depends on the capabilities of that
>>>> process.
>>>>
>>>> Specifically, for vhost user interface based on virtio, this includes
>>>> the "host features" bitmap that defines the interface, as well as more
>>>> host values such as the max ring size.
>>>> Adding new features/changing
>>>> values to this interface is required to make progress, but on the other
>>>> hand we need ability to get the old host features to be compatible.
>>> It looks like the same issue as vhost-user reconnect to me. For example,
>>>
>>> - start dpdk 16.07 & qemu 2.5
>>> - kill dpdk
>>> - start dpdk 16.11
>>>
>>> Though DPDK 16.11 has more features compared to dpdk 16.07 (say, indirect),
>>> above should work, because qemu saves the negotiated features before the
>>> disconnect and restores them after the reconnection.
>>>
>>> commit a463215b087c41d7ca94e51aa347cde523831873
>>> Author: Marc-André Lureau
>>> Date: Mon Jun 6 18:45:05 2016 +0200
>>>
>>>     vhost-net: save & restore vhost-user acked features
>>>
>>>     The initial vhost-user connection sets the features to be negotiated
>>>     with the driver. Renegotiation isn't possible without device reset.
>>>
>>>     To handle reconnection of vhost-user backend, ensure the same set of
>>>     features are provided, and reuse already acked features.
>>>
>>>     Signed-off-by: Marc-André Lureau
>>>
>>>
>>> So we could do similar to vhost-user? I mean, save the acked features
>>> before migration and restore them after it. This should be able to
>>> keep the compatibility. If user downgrades DPDK version, it also could
>>> be easily detected, and then exit with an error to user: migration
>>> failed due to incompatible vhost features.
>>>
>>> Just some rough thoughts. Makes tiny sense?
>> My understanding is that the management tool has to know whether
>> versions are compatible before initiating the migration:
> Makes sense. How about getting and restoring the acked features through
> qemu command lines then, say, through the monitor interface?
>
> With that, it would be something like:
>
> - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
>
> - read the acked features (through monitor interface)
>
> - start vhost-user backend in the dst host
>
> - start qemu in the dst host with the just queried acked features
>
> QEMU then is expected to use this feature set for the later vhost-user
> feature negotiation. Exit if features compatibility is broken.
>
> Thoughts?
>
> --yliu
>
>> 1. The downtime could be unpredictable if a VM has to move from host
>> to host multiple times, which is problematic, especially for NFV.
>> 2. If migration is not possible, maybe the management tool would
>> prefer not to interrupt the VM on current host.
>>
>> I have little experience with migration though, so I could be mistaken.
>>
>> Thanks,
>> Maxime
>>
>>> --yliu
>>>> To solve this problem within qemu, qemu has a versioning system based on
>>>> a machine type concept which fundamentally is a version string; by
>>>> specifying that string one can get hardware compatible with a previous
>>>> qemu version. QEMU also reports the latest version and list of versions
>>>> supported so libvirt records the version at VM creation and then is
>>>> careful to use this machine version whenever it migrates a VM.
>>>>
>>>> One might wonder how this is solved with a kernel vhost backend. The
>>>> answer is that it mostly isn't - instead an assumption is made, that
>>>> qemu versions are deployed together with the kernel - this is generally
>>>> true for downstreams. Thus whenever qemu gains a new feature, it is
>>>> already supported by the kernel as well. However, if one attempts
>>>> migration with a new qemu from a system with a new kernel to one with
>>>> an old kernel, one would get a failure.
>>>>
>>>> In the world where we have multiple userspace backends, with some of
>>>> these supplied by ISVs, this seems non-realistic.
>>>>
>>>> IMO we need to support vhost backend versioning, ideally
>>>> in a way that will also work for vhost kernel backends.
>>>>
>>>> So I'd like to get some input from both backend and management
>>>> developers on what a good solution would look like.
>>>>
>>>> If we want to emulate the qemu solution, this involves adding the
>>>> concept of interface versions to dpdk. For example, dpdk could supply a
>>>> file (or utility printing?) with list of versions: latest and versions
>>>> supported. libvirt could read that and
>>>> - store latest version at vm creation
>>>> - pass it around with the vm
>>>> - pass it to qemu
>>>>
>>>> From here, qemu could pass this over the vhost-user channel,
>>>> thus making sure it's initialized with the correct
>>>> compatible interface.
>>>>
>>>> As version here is an opaque string for libvirt and qemu,
>>>> anything can be used - but I suggest either a list
>>>> of values defining the interface, e.g.
>>>> any_layout=on,max_ring=256
>>>> or a version including the name and vendor of the backend,
>>>> e.g. "org.dpdk.v4.5.6".
>>>>
>>>> Note that typically the list of supported versions can only be
>>>> extended, not shrunk. Also, if the host/guest interface
>>>> does not change, don't change the current version as
>>>> this just creates work for everyone.
>>>>
>>>> Thoughts? Would this work well for management? dpdk? vpp?
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> MST
> _______________________________________________
> vpp-dev mailing list
> vpp-dev@lists.fd.io
> https://lists.fd.io/mailman/listinfo/vpp-dev

-- 
*Thomas F Herbert*
SDN Group
Office of Technology
*Red Hat*
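
The acked-features proposal quoted in this thread (read the features the guest acked on the source, then refuse to continue if the destination backend cannot offer them) comes down to a subset test on feature bitmaps. What follows is a minimal Python sketch of that test only; it is not QEMU, DPDK, or libvirt code. The helper names (mask, missing_features, can_migrate) and the two example feature sets are hypothetical and chosen for illustration. Only the bit positions are taken from the virtio specification.

```python
# Virtio feature bit positions, as defined in the virtio specification.
VIRTIO_NET_F_MRG_RXBUF = 15
VIRTIO_F_ANY_LAYOUT = 27
VIRTIO_RING_F_INDIRECT_DESC = 28
VIRTIO_F_VERSION_1 = 32


def mask(*bits):
    """Build a feature bitmap from a list of bit positions."""
    value = 0
    for bit in bits:
        value |= 1 << bit
    return value


def missing_features(acked, backend):
    """Return the acked bits that the backend does not offer (0 if none)."""
    return acked & ~backend


def can_migrate(acked, dst_backend):
    """The guest-visible interface is preserved only if every feature the
    guest acked on the source is still offered by the destination backend."""
    return missing_features(acked, dst_backend) == 0


# Hypothetical feature sets for illustration; these are NOT the real
# bitmaps of any DPDK release.
old_backend = mask(VIRTIO_NET_F_MRG_RXBUF, VIRTIO_F_VERSION_1)
new_backend = old_backend | mask(VIRTIO_RING_F_INDIRECT_DESC)

acked_on_old = old_backend  # guest acked everything the old backend offered
acked_on_new = new_backend  # guest acked everything the new backend offered

print(can_migrate(acked_on_old, new_backend))            # upgrade: True
print(can_migrate(acked_on_new, old_backend))            # downgrade: False
print(hex(missing_features(acked_on_new, old_backend)))  # 0x10000000
```

Under these assumptions, moving to a backend with more features passes, while moving to a backend that lacks an already acked feature (indirect descriptors here) is rejected, which is the "exit with an error" case described above. A management tool could run the same test before starting the migration, which is the concern Maxime raises.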