Date: Wed, 16 Dec 2015 21:00:27 +0800
From: Yuanhan Liu
To: Pavel Fedin
Cc: dev@dpdk.org, 'Victor Kaplansky'
Subject: Re: [dpdk-dev] [PATCH 0/4 for 2.3] vhost-user live migration support
Message-ID: <20151216130027.GS29571@yliu-dev.sh.intel.com>
In-Reply-To: <006d01d137ff$50650180$f12f0480$@samsung.com>
References: <00b601d13733$97e063a0$c7a12ae0$@samsung.com>
 <20151215133612.GJ29571@yliu-dev.sh.intel.com>
 <00ca01d1373f$3dd4ab30$b97e0190$@samsung.com>
 <20151215135907.GK29571@yliu-dev.sh.intel.com>
 <00f101d13749$0eb97330$2c2c5990$@samsung.com>
 <20151216072818.GO29571@yliu-dev.sh.intel.com>
 <005501d137f8$e89c0090$b9d401b0$@samsung.com>
 <20151216120817.GQ29571@yliu-dev.sh.intel.com>
 <006d01d137ff$50650180$f12f0480$@samsung.com>

On Wed, Dec 16, 2015 at 03:43:06PM +0300, Pavel Fedin wrote:
>  Hello!
>
> > However, I'm more curious about the ping loss? Did you still see
> > that? And to be more specific, has wireshark captured the
> > GARP from the guest?
>
>  Yes, everything is fine.

Great!

>
> root@nfv_test_x86_64 /var/log/libvirt/qemu # tshark -i ovs-br0
> Running as user "root" and group "root". This could be dangerous.
> Capturing on 'ovs-br0'
>   1   0.000000 RealtekU_3b:83:1a -> Broadcast    ARP 42 Gratuitous ARP for 192.168.6.2 (Request)
>   2   0.000024 fe80::5054:ff:fe3b:831a -> ff02::1      ICMPv6 86 Neighbor Advertisement fe80::5054:ff:fe3b:831a (ovr) is at 52:54:00:3b:83:1a
>   3   0.049490 RealtekU_3b:83:1a -> Broadcast    ARP 42 Gratuitous ARP for 192.168.6.2 (Request)
>   4   0.049497 fe80::5054:ff:fe3b:831a -> ff02::1      ICMPv6 86 Neighbor Advertisement fe80::5054:ff:fe3b:831a (ovr) is at 52:54:00:3b:83:1a
>   5   0.199485 RealtekU_3b:83:1a -> Broadcast    ARP 42 Gratuitous ARP for 192.168.6.2 (Request)
>   6   0.199492 fe80::5054:ff:fe3b:831a -> ff02::1      ICMPv6 86 Neighbor Advertisement fe80::5054:ff:fe3b:831a (ovr) is at 52:54:00:3b:83:1a
>   7   0.449500 RealtekU_3b:83:1a -> Broadcast    ARP 42 Gratuitous ARP for 192.168.6.2 (Request)
>   8   0.449508 fe80::5054:ff:fe3b:831a -> ff02::1      ICMPv6 86 Neighbor Advertisement fe80::5054:ff:fe3b:831a (ovr) is at 52:54:00:3b:83:1a
>   9   0.517229  192.168.6.2 -> 192.168.6.1  ICMP 98 Echo (ping) request  id=0x04af, seq=70/17920, ttl=64
>  10   0.517277  192.168.6.1 -> 192.168.6.2  ICMP 98 Echo (ping) reply    id=0x04af, seq=70/17920, ttl=64 (request in 9)
>  11   0.799521 RealtekU_3b:83:1a -> Broadcast    ARP 42 Gratuitous ARP for 192.168.6.2 (Request)
>  12   0.799553 fe80::5054:ff:fe3b:831a -> ff02::1      ICMPv6 86 Neighbor Advertisement fe80::5054:ff:fe3b:831a (ovr) is at 52:54:00:3b:83:1a
>  13   1.517210  192.168.6.2 -> 192.168.6.1  ICMP 98 Echo (ping) request  id=0x04af, seq=71/18176, ttl=64
>  14   1.517238  192.168.6.1 -> 192.168.6.2  ICMP 98 Echo (ping) reply    id=0x04af, seq=71/18176, ttl=64 (request in 13)
>  15   2.517219  192.168.6.2 -> 192.168.6.1  ICMP 98 Echo (ping) request  id=0x04af, seq=72/18432, ttl=64
>  16   2.517256  192.168.6.1 -> 192.168.6.2  ICMP 98 Echo (ping) reply    id=0x04af, seq=72/18432, ttl=64 (request in 15)
>  17   3.517497  192.168.6.2 -> 192.168.6.1  ICMP 98 Echo (ping) request  id=0x04af, seq=73/18688, ttl=64
>  18   3.517518  192.168.6.1 -> 192.168.6.2  ICMP 98 Echo (ping) reply    id=0x04af, seq=73/18688, ttl=64 (request in 17)
>  19   4.517219  192.168.6.2 -> 192.168.6.1  ICMP 98 Echo (ping) request  id=0x04af, seq=74/18944, ttl=64
>  20   4.517237  192.168.6.1 -> 192.168.6.2  ICMP 98 Echo (ping) reply    id=0x04af, seq=74/18944, ttl=64 (request in 19)
>  21   5.517222  192.168.6.2 -> 192.168.6.1  ICMP 98 Echo (ping) request  id=0x04af, seq=75/19200, ttl=64
>  22   5.517242  192.168.6.1 -> 192.168.6.2  ICMP 98 Echo (ping) reply    id=0x04af, seq=75/19200, ttl=64 (request in 21)
>  23   6.517235  192.168.6.2 -> 192.168.6.1  ICMP 98 Echo (ping) request  id=0x04af, seq=76/19456, ttl=64
>  24   6.517256  192.168.6.1 -> 192.168.6.2  ICMP 98 Echo (ping) reply    id=0x04af, seq=76/19456, ttl=64 (request in 23)
>  25   6.531466 be:e1:71:c1:47:4d -> RealtekU_3b:83:1a ARP 42 Who has 192.168.6.2?  Tell 192.168.6.1
>  26   6.531619 RealtekU_3b:83:1a -> be:e1:71:c1:47:4d ARP 42 192.168.6.2 is at 52:54:00:3b:83:1a
>  27   7.517212  192.168.6.2 -> 192.168.6.1  ICMP 98 Echo (ping) request  id=0x04af, seq=77/19712, ttl=64
>  28   7.517229  192.168.6.1 -> 192.168.6.2  ICMP 98 Echo (ping) reply    id=0x04af, seq=77/19712, ttl=64 (request in 27)
>
>  But there's one important detail here. Any replicated network interfaces
> (LOCAL port in my example) should be fully cloned on both hosts, including
> MAC addresses. Otherwise, after the migration the guest continues to send
> packets to the old MAC and, obviously, there's still ping loss until it
> redoes the ARP for its ping target.

I see. What I care about more here is whether we can get the GARP from
the target guest just after the migration. If you can, everything
should be fine.

>
> > And what's the output of 'grep virtio /proc/interrupts' inside guest?
>
>  11:          0          0          0          0   IO-APIC  11-fasteoi   uhci_hcd:usb1, virtio3
>  24:          0          0          0          0   PCI-MSI 114688-edge      virtio2-config
>  25:       3544          0          0          0   PCI-MSI 114689-edge      virtio2-req.0
>  26:         10          0          0          0   PCI-MSI 49152-edge      virtio0-config

The GUEST_ANNOUNCE has indeed been triggered. That's great!

I just have no idea why I can't get any config IRQ from the guest after
the migration. (I can for migration within a single host, but not
across two hosts.) In my first tries, I just got an error message
telling me that the MSI was lost. I then found it may be because I was
using a customized guest kernel. After switching to the kernel shipped
by Fedora 22, I no longer see that error, but I still don't see any
such interrupt generated inside the guest, either.

It might still be an issue on my side. Even if it's not, it's likely a
KVM bug rather than a vhost-user one. And I'm glad it works on your
side :)

So, I will send v2 tomorrow.

	--yliu