From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com,
	jfreimann@redhat.com
Cc: dgilbert@redhat.com, Maxime Coquelin <maxime.coquelin@redhat.com>
Subject: [dpdk-dev] [RFC 00/10] vhost: add postcopy live-migration support
Date: Thu, 23 Aug 2018 18:51:47 +0200	[thread overview]
Message-ID: <20180823165157.30001-1-maxime.coquelin@redhat.com> (raw)

This RFC adds support for postcopy live-migration.

With classic live-migration, the VM runs on source while its
content is being migrated to destination. When pages already
migrated to destination are dirtied by the source, they get
copied until both source and destination memory converge.
At that time, the source is stopped and destination is
started.

With postcopy live-migration, the VM is started on the destination
before all the memory has been migrated. When the VM tries to
access a page that hasn't been migrated yet, a pagefault is
triggered and handled via userfaultfd, which pauses the faulting
thread. A Qemu thread in charge of postcopy requests the missing
page from the source. Once the page is received and mapped, the
paused thread resumes.

Userfaultfd supports handling faults from a different process,
and Qemu supports postcopy with vhost-user backends since
v2.12.

One problem encountered with classic live-migration of VMs
relying on vhost-user backends is that when the traffic is
high (e.g. PVP), migration may never converge, because pages
get dirtied at a faster rate than they are copied to the
destination.
Postcopy is expected to solve this problem, as the rings'
memory and buffers are copied only once, when the destination
pagefaults on them.

My current test bench is limited, so I couldn't test the above
scenario. Instead, I flooded the guest using testpmd's txonly
forwarding mode, with the virtio-net device bound either to its
kernel driver or to the DPDK driver in the guest. In my setup,
migration is done on the same machine. Results are averages of
5 runs.

A. Flooding virtio-net kernel driver (~3Mpps):
1. Classic live-migration:
 - Total time: 12592ms
 - Downtime: 53ms

2. Postcopy live-migration:
 - Total time: 2324ms
 - Downtime: 48ms

B. Flooding Virtio PMD (~15Mpps):
1. Classic live-migration:
 - Total time: 22101ms
 - Downtime: 35ms

2. Postcopy live-migration:
 - Total time: 2995ms
 - Downtime: 47ms

Note that the total time reported by Qemu is very steady
across runs, whereas the downtime is very unsteady.

One problem remaining to be fixed is memory locking.
Indeed, userfaultfd requires that the registered memory is
neither mmapped with the MAP_POPULATE flag nor mlocked.
For the former, the series addresses this by not advertising
the postcopy feature when dequeue zero-copy, which requires
MAP_POPULATE, is enabled.
For the latter, this is really problematic because the
vhost-user backend is a library, so unlike Qemu it cannot
prevent the application from calling mlockall(). When using
testpmd, one just has to append --no-mlockall to the
command-line, but forgetting it results in non-trivial
warnings in the Qemu logs.

Steps to test postcopy:
1. Run DPDK's Testpmd application on source:
./install/bin/testpmd -m 512 --file-prefix=src -l 0,2 -n 4 \
  --vdev 'net_vhost0,iface=/tmp/vu-src' -- --portmask=1 -i \
  --rxq=1 --txq=1 --nb-cores=1 --eth-peer=0,52:54:00:11:22:12 \
  --no-mlockall

2. Run DPDK's Testpmd application on destination:
./install/bin/testpmd -m 512 --file-prefix=dst -l 0,2 -n 4 \
  --vdev 'net_vhost0,iface=/tmp/vu-dst' -- --portmask=1 -i \
  --rxq=1 --txq=1 --nb-cores=1 --eth-peer=0,52:54:00:11:22:12 \
  --no-mlockall

3. Launch VM on source:
./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 3G -smp 2 -cpu host \
  -object memory-backend-file,id=mem,size=3G,mem-path=/dev/shm,share=on \
  -numa node,memdev=mem -mem-prealloc \
  -chardev socket,id=char0,path=/tmp/vu-src \
  -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
  -device virtio-net-pci,netdev=mynet1 /home/virt/rhel7.6-1-clone.qcow2 \
  -net none -vnc :0 -monitor stdio

4. Launch VM on destination:
./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 3G -smp 2 -cpu host \
  -object memory-backend-file,id=mem,size=3G,mem-path=/dev/shm,share=on \
  -numa node,memdev=mem -mem-prealloc \
  -chardev socket,id=char0,path=/tmp/vu-dst \
  -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
  -device virtio-net-pci,netdev=mynet1 /home/virt/rhel7.6-1-clone.qcow2 \
  -net none -vnc :1 -monitor stdio -incoming tcp::8888

5. In both testpmd prompts, start flooding the virtio-net device:
testpmd> set fwd txonly
testpmd> start

6. In destination's Qemu monitor, enable postcopy:
(qemu) migrate_set_capability postcopy-ram on

7. In source's Qemu monitor, enable postcopy and launch migration:
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate -d tcp:0:8888
(qemu) migrate_start_postcopy

Maxime Coquelin (10):
  vhost: define postcopy protocol flag
  vhost: add number of fds to vhost-user messages and use it
  vhost: enable fds passing when sending vhost-user messages
  vhost: introduce postcopy's advise message
  vhost: add support for postcopy's listen message
  vhost: register new regions with userfaultfd
  vhost: avoid useless VhostUserMemory copy
  vhost: send userfault range addresses back to qemu
  vhost: add support to postcopy's end request
  vhost: enable postcopy protocol feature

 lib/librte_vhost/rte_vhost.h  |   4 +
 lib/librte_vhost/socket.c     |  21 +++-
 lib/librte_vhost/vhost.h      |   3 +
 lib/librte_vhost/vhost_user.c | 201 +++++++++++++++++++++++++++++-----
 lib/librte_vhost/vhost_user.h |  12 +-
 5 files changed, 206 insertions(+), 35 deletions(-)

-- 
2.17.1

