From: Maxime Coquelin
To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com,
    jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com,
    bruce.richardson@intel.com, alejandro.lucero@netronome.com
Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin
Date: Tue, 9 Oct 2018 22:54:07 +0200
Message-Id: <20181009205426.21219-1-maxime.coquelin@redhat.com>
Subject: [dpdk-dev] [PATCH v5 00/19] vhost: add postcopy live-migration support

In this v5:
- Move VHOST_USER_PROTOCOL_F_PAGEFAULT disablement if zero copy
  is requested to socket.c, and emit a warning if it happens (Ilya)
- Reset postcopy_ufd after it is closed in
  vhost_user_postcopy_advise error path (Ilya)
- Rework pre/post message handler callback to return vh_result
  enum type, and adapt its user (Tiwei)
- Sort VHOST_USER_PROTOCOL_F_PAGEFAULT definition (Tiwei)
- Point to the right fixing commit in patch 1 (Tiwei)

With classic live-migration, the VM runs
on source while its content is being migrated to destination. When
pages already migrated to destination are dirtied by the source,
they get copied again until both source and destination memory
converge. At that time, the source is stopped and destination is
started.

With postcopy live-migration, the VM is started on destination before
all the memory has been migrated. When the VM tries to access a page
that hasn't been migrated yet, a pagefault is triggered and handled by
userfaultfd, which pauses the thread. A Qemu thread in charge of
postcopy requests the missing page from the source. Once the page is
received and mapped, the paused thread gets resumed.

Userfaultfd supports handling faults from a different process, and
Qemu supports postcopy with vhost-user backends since v2.12.

One problem encountered with classic live-migration for VMs relying on
vhost-user backends is that when the traffic is high (e.g. PVP), the
migration may never converge, as pages get dirtied at a faster rate
than they are copied to the destination.
It is expected that this problem should be solved by using postcopy,
as rings memory and buffers will be copied only once, when the
destination pagefaults on them.

Note that this series will certainly require a rebase to apply on top
of Nikolay's vhost-user message handling rework.

Steps to test postcopy:
1. Run DPDK's Testpmd application on source:
./install/bin/testpmd -m 512 --file-prefix=src -l 0,2 -n 4 \
  --vdev 'net_vhost0,iface=/tmp/vu-src' -- --portmask=1 -i \
  --rxq=1 --txq=1 --nb-cores=1 --eth-peer=0,52:54:00:11:22:12 \
  --no-mlockall

2. Run DPDK's Testpmd application on destination:
./install/bin/testpmd -m 512 --file-prefix=dst -l 0,2 -n 4 \
  --vdev 'net_vhost0,iface=/tmp/vu-dst,postcopy-support=1' -- --portmask=1 -i \
  --rxq=1 --txq=1 --nb-cores=1 --eth-peer=0,52:54:00:11:22:12 \
  --no-mlockall

3. Launch VM on source:
./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 3G -smp 2 -cpu host \
  -object memory-backend-file,id=mem,size=3G,mem-path=/dev/shm,share=on \
  -numa node,memdev=mem -mem-prealloc \
  -chardev socket,id=char0,path=/tmp/vu-src \
  -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
  -device virtio-net-pci,netdev=mynet1 /home/virt/rhel7.6-1-clone.qcow2 \
  -net none -vnc :0 -monitor stdio

4. Launch VM on destination:
./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 3G -smp 2 -cpu host \
  -object memory-backend-file,id=mem,size=3G,mem-path=/dev/shm,share=on \
  -numa node,memdev=mem -mem-prealloc \
  -chardev socket,id=char0,path=/tmp/vu-dst \
  -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
  -device virtio-net-pci,netdev=mynet1 /home/virt/rhel7.6-1-clone.qcow2 \
  -net none -vnc :1 -monitor stdio -incoming tcp::8888

5. In both testpmd prompts, start flooding the virtio-net device:
testpmd> set fwd txonly
testpmd> start

6. In destination's Qemu monitor, enable postcopy:
(qemu) migrate_set_capability postcopy-ram on

7. In source's Qemu monitor, enable postcopy and launch migration:
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate -d tcp:0:8888
(qemu) migrate_start_postcopy

Maxime Coquelin (19):
  vhost: fix messages results handling
  vhost: fix return code of messages requiring replies
  vhost: clarify reply-ack in case a reply was already sent
  vhost: fix payload size of reply
  vhost: fix error handling when mem table gets updated
  vhost: define postcopy protocol flag
  vhost: add number of fds to vhost-user messages and use it
  vhost: pass socket fd to message handling callbacks
  vhost: enable fds passing when sending vhost-user messages
  vhost: add config flag for postcopy feature
  vhost: introduce postcopy's advise message
  vhost: add support for postcopy's listen message
  vhost: register new regions with userfaultfd
  vhost: avoid useless VhostUserMemory copy
  vhost: send userfault range addresses back to qemu
  vhost: add support to postcopy's end request
  vhost: restrict postcopy live-migration enablement
  net/vhost: add parameter to enable postcopy support
  vhost: enable postcopy protocol feature

 config/common_linuxapp              |   1 +
 doc/guides/nics/vhost.rst           |   5 +
 doc/guides/prog_guide/vhost_lib.rst |   8 +
 drivers/net/vhost/rte_eth_vhost.c   |  13 ++
 lib/librte_vhost/meson.build        |   2 +
 lib/librte_vhost/rte_vhost.h        |   5 +
 lib/librte_vhost/socket.c           |  55 ++++-
 lib/librte_vhost/vhost.h            |  22 +-
 lib/librte_vhost/vhost_crypto.c     |  24 +-
 lib/librte_vhost/vhost_user.c       | 340 +++++++++++++++++++++-------
 lib/librte_vhost/vhost_user.h       |  21 +-
 11 files changed, 381 insertions(+), 115 deletions(-)

-- 
2.17.1
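
For readers who want to see why classic live-migration may fail to
converge under heavy traffic, here is a toy model in C. It is only a
sketch and not Qemu's actual algorithm: the function name
precopy_rounds(), its parameters, and the constant copy and dirty rates
are all assumptions made for illustration. Each round re-sends the
pages dirtied while the previous round was being copied.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of classic (precopy) live-migration convergence.
 * Hypothetical helper for illustration only, not part of DPDK or Qemu.
 *
 * total_pages: guest memory size, in pages
 * copy_rate:   pages sent to the destination per second (must be > 0)
 * dirty_rate:  pages dirtied by the running guest per second
 * threshold:   remaining pages at which the source can be stopped
 * max_rounds:  give up after this many copy rounds
 *
 * Returns the number of copy rounds performed before the remaining
 * set drops to the threshold, or -1 if that does not happen within
 * max_rounds.
 */
int precopy_rounds(uint64_t total_pages, uint64_t copy_rate,
                   uint64_t dirty_rate, uint64_t threshold,
                   int max_rounds)
{
    uint64_t remaining = total_pages;
    int round;

    assert(copy_rate > 0);

    for (round = 0; round < max_rounds; round++) {
        uint64_t dirtied;

        if (remaining <= threshold)
            return round;

        /*
         * Copying `remaining` pages takes remaining / copy_rate
         * seconds; the pages dirtied during that time have to be
         * sent again in the next round. The dirty set cannot exceed
         * the guest memory size.
         */
        dirtied = remaining * dirty_rate / copy_rate;
        remaining = dirtied < total_pages ? dirtied : total_pages;
    }

    return -1;
}
```

With the 3G guest from the steps above (786432 pages of 4K) and made-up
rates, precopy_rounds(786432, 100000, 50000, 1000, 30) returns 10,
whereas raising dirty_rate to 120000 makes it return -1: the remaining
set never shrinks. In this model postcopy sidesteps the loop entirely,
since each page is transferred once, when the destination faults on it.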