Date: Tue, 22 Sep 2015 22:22:20 +0800
From: Yuanhan Liu
To: Marcel Apfelbaum
Cc: dev@dpdk.org, "Michael S. Tsirkin"
Message-ID: <20150922142220.GY2339@yliu-dev.sh.intel.com>
In-Reply-To: <56012819.2090302@redhat.com>
Subject: Re: [dpdk-dev] [PATCH v5 resend 03/12] vhost: vring queue setup for multiple queue support

On Tue, Sep 22, 2015 at 01:06:17PM +0300, Marcel Apfelbaum wrote:
> On 09/22/2015 12:21 PM, Yuanhan Liu wrote:
> >On Tue, Sep 22, 2015 at 11:47:34AM +0300, Marcel Apfelbaum wrote:
> >>On 09/22/2015 11:34 AM, Yuanhan Liu wrote:
> >>>On Tue, Sep 22, 2015 at 11:10:13AM +0300, Marcel Apfelbaum wrote:
> >>>>On 09/22/2015 10:31 AM, Yuanhan Liu wrote:
> >>>>>On Mon, Sep 21, 2015 at 08:56:30PM +0300, Marcel Apfelbaum wrote:
> >>>>[...]
> >>>>>>>
> >>>>>>>Hi,
> >>>>>>>
> >>>>>>>I made 4 cleanup patches a few weeks ago, including the patch
> >>>>>>>that defines kickfd and callfd as int type. They have already been
> >>>>>>>ACKed by Huawei Xie and Changchun Ouyang, so it's likely they will
> >>>>>>>be merged, and hence I made this patchset based on them.
> >>>>>>>
> >>>>>>>This also answers the question from your other email: that is why
> >>>>>>>the patchset doesn't apply for you.
> >>>>>>
> >>>>>>Hi,
> >>>>>>Thank you for the response, it makes sense now.
> >>>>>>
> >>>>>>I have another issue, maybe you can help.
> >>>>>>I have some problems making it work with the OVS/DPDK backend and
> >>>>>>the virtio-net driver in the guest.
> >>>>>>
> >>>>>>I am using a simple setup:
> >>>>>>    http://wiki.qemu.org/Features/vhost-user-ovs-dpdk
> >>>>>>that connects 2 VMs using OVS's dpdkvhostuser ports (the regular
> >>>>>>virtio-net driver in the guest, not the PMD driver).
> >>>>>>
> >>>>>>The setup worked fine with the previous DPDK MQ implementation (v4);
> >>>>>>however, on this one the traffic stops once I set queues=n in the guest.
> >>>>>
> >>>>>Hi,
> >>>>>
> >>>>>Could you be more specific about that? It would also be helpful if you
> >>>>>could tell me the steps you took for testing, besides the setup steps
> >>>>>you mentioned in the QEMU wiki and this email.
> >>>>>
> >>>>
> >>>>Hi,
> >>>>Thank you for your help.
> >>>>
> >>>>I am sorry the wiki is not enough; I'll be happy to add all the missing parts.
> >>>>In the meantime maybe you can tell me where the problem is; I also
> >>>>suggest posting the output of the journalctl command here.
> >>>>
> >>>>We only need a regular machine, and we want traffic between 2 VMs.
> >>>>I'll try to summarize the steps:
> >>>>
> >>>>1. Be sure you have enough hugepages enabled (2M pages are enough) and mounted.
> >>>>2. Configure and start OVS following the wiki
> >>>>   - we only want one bridge with 2 dpdkvhostuser ports.
> >>>>3. Start the VMs using the wiki command line
> >>>>   - check journalctl for possible errors. You can use
> >>>>         journalctl --since `date +%T --date="-10 minutes"`
> >>>>     to see only the last 10 minutes.
> >>>>4. Configure the guests' IPs.
> >>>>   - Disable NetworkManager as described below in the mail.
> >>>>5. At this point you should be able to ping between the guests.
> >>>>
> >>>>Please let me know if you have any problem up to this point.
> >>>>I'll be happy to help. Please point out any special steps you took that
> >>>>are not in the wiki. The journalctl logs would also help.
> >>>>
> >>>>Does the ping between the VMs work now?
> >>>
> >>>Yes, it works, too. I can ping the other VM from inside a VM.
> >>>
> >>>    [root@dpdk-kvm ~]# ethtool -l eth0
> >>>    Channel parameters for eth0:
> >>>    Pre-set maximums:
> >>>    RX:             0
> >>>    TX:             0
> >>>    Other:          0
> >>>    Combined:       2
> >>>    Current hardware settings:
> >>>    RX:             0
> >>>    TX:             0
> >>>    Other:          0
> >>>    Combined:       2
> >>>
> >>>    [root@dpdk-kvm ~]# ifconfig eth0
> >>>    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >>>         inet 192.168.100.11  netmask 255.255.255.0  broadcast 192.168.100.255
> >>>         inet6 fe80::5054:ff:fe12:3459  prefixlen 64  scopeid 0x20<link>
> >>>         ether 52:54:00:12:34:59  txqueuelen 1000  (Ethernet)
> >>>         RX packets 56  bytes 5166 (5.0 KiB)
> >>>         RX errors 0  dropped 0  overruns 0  frame 0
> >>>         TX packets 84  bytes 8303 (8.1 KiB)
> >>>         TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
> >>>
> >>>    [root@dpdk-kvm ~]# ping 192.168.100.10
> >>>    PING 192.168.100.10 (192.168.100.10) 56(84) bytes of data.
> >>>    64 bytes from 192.168.100.10: icmp_seq=1 ttl=64 time=0.213 ms
> >>>    64 bytes from 192.168.100.10: icmp_seq=2 ttl=64 time=0.094 ms
> >>>    64 bytes from 192.168.100.10: icmp_seq=3 ttl=64 time=0.246 ms
> >>>    64 bytes from 192.168.100.10: icmp_seq=4 ttl=64 time=0.153 ms
> >>>    64 bytes from 192.168.100.10: icmp_seq=5 ttl=64 time=0.104 ms
> >>>    ^C
> >>>>
> >>>>If yes, please let me know and I'll go over MQ enabling.
> >>>
> >>>I'm just wondering why it doesn't work on your side.
> >>
> >>Hi,
> >>
> >>This also works for me, but without enabling MQ (ethtool -L eth0 combined n, with n > 1).
> >>The problem starts when I apply the patches and enable MQ (which needs a
> >>slightly different QEMU command line).
> >>
> >>>
> >>>>
> >>>>>I did some very rough testing based on your test guide, and I indeed
> >>>>>found an issue: the IP address assigned by "ifconfig" disappears soon
> >>>>>the first few times, and after about 2 or 3 resets, it never changes.
> >>>>>
> >>>>>(Well, I saw that quite a few times before while trying different QEMU
> >>>>>net devices. So, it might be a system configuration issue, or something
> >>>>>else?)
> >>>>>
> >>>>
> >>>>You are right, this is a guest config issue; I think you should disable
> >>>>NetworkManager
> >>>
> >>>Yeah, I figured it out by myself, and it worked when I hardcoded it in
> >>>/etc/sysconfig/network-scripts/ifcfg-eth0.
> >>>
> >>>>for static IP addresses. Please use only the virtio-net device.
> >>>>
> >>>>You can try this:
> >>>>    sudo systemctl stop NetworkManager
> >>>>    sudo systemctl disable NetworkManager
> >>>
> >>>Thanks for the info and tip!
> >>>
> >>>>
> >>>>>Besides that, it works; say, I can wget a big file from the host.
> >>>>>
> >>>>
> >>>>The target here is traffic between 2 VMs.
> >>>>We want to be able to ping (for example) between the VMs when MQ > 1 is
> >>>>enabled on both guests:
> >>>>    ethtool -L eth0 combined <n>
> >>>
> >>>As you can see from my command log, I did so and it worked :)
> >>>
> >>
> >>Let me understand: it worked after applying the MQ patches to all 3
> >>projects (DPDK, QEMU and OVS)?
> >>It worked with MQ enabled? MQ > 1?
> >
> >Yes. However, I tried a few more times this time, and found it sometimes
> >worked, and sometimes not. Sounds like there is a bug somewhere.

I put two quick debug printfs into the OVS DPDK code and found out why it
sometimes works and sometimes not: when all data goes through the first
queue, it works; otherwise, it fails. Here are the working runs:

    :: TX: vhost-dev: /var/run/openvswitch/vhost-user2, qp_index: 0, asked: 32, dequeued: 1
    :: RX: vhost-dev: /var/run/openvswitch/vhost-user1, qp_index: 0, asked: 1, enqueued: 1
    :: TX: vhost-dev: /var/run/openvswitch/vhost-user1, qp_index: 0, asked: 32, dequeued: 1
    :: RX: vhost-dev: /var/run/openvswitch/vhost-user2, qp_index: 0, asked: 1, enqueued: 1
    :: TX: vhost-dev: /var/run/openvswitch/vhost-user2, qp_index: 0, asked: 32, dequeued: 1
    :: RX: vhost-dev: /var/run/openvswitch/vhost-user1, qp_index: 0, asked: 1, enqueued: 1
    :: TX: vhost-dev: /var/run/openvswitch/vhost-user1, qp_index: 0, asked: 32, dequeued: 1
    :: RX: vhost-dev: /var/run/openvswitch/vhost-user2, qp_index: 0, asked: 1, enqueued: 1
    :: TX: vhost-dev: /var/run/openvswitch/vhost-user2, qp_index: 0, asked: 32, dequeued: 1
    :: RX: vhost-dev: /var/run/openvswitch/vhost-user1, qp_index: 0, asked: 1, enqueued: 1
    :: TX: vhost-dev: /var/run/openvswitch/vhost-user1, qp_index: 0, asked: 32, dequeued: 1
    :: RX: vhost-dev: /var/run/openvswitch/vhost-user2, qp_index: 0, asked: 1, enqueued: 1

And the failed ones:

    :: TX: vhost-dev: /var/run/openvswitch/vhost-user2, qp_index: 1, asked: 32, dequeued: 1
    :: RX: vhost-dev: /var/run/openvswitch/vhost-user1, qp_index: 1, asked: 1, enqueued: 1
    :: TX: vhost-dev: /var/run/openvswitch/vhost-user2, qp_index: 1, asked: 32, dequeued: 1
    :: RX: vhost-dev: /var/run/openvswitch/vhost-user1, qp_index: 1, asked: 1, enqueued: 1
    :: TX: vhost-dev: /var/run/openvswitch/vhost-user2, qp_index: 1, asked: 32, dequeued: 1
    :: RX: vhost-dev: /var/run/openvswitch/vhost-user1, qp_index: 1, asked: 1, enqueued: 1

You can see that vhost-user1 never transfers a packet back, hence the ping
didn't work. And if you run ifconfig in VM-1, you will see packet drops.

---

I then spent some time figuring out why the packet drops happened, and
found that vq->vhost_hlen for the second (and above) queue pairs is set
wrongly: that's why it failed whenever packets were not transferred
through the first queue.

What's "ironic" is that while making this patchset I was somehow aware
that I had missed it, and I had planned to fix it. But I simply forgot,
and then it took me (as well as you) some time to figure it out, in a
more painful way.

So, thank you a lot for your testing, as well as for the effort to guide
me through the OVS DPDK test. It's proved to work after the fix (at least
in my testing), but it's late here, so I'm gonna send a new version
tomorrow, addressing some other comments as well.
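[To make the fix concrete, here is a minimal sketch in C of what "set
vhost_hlen for every queue pair" means, built on the DPDK vhost types of
that era (struct virtio_net, rte_virtio_net.h). The virt_qp_nb field and
the qp_index * VIRTIO_QNUM vring mapping follow this patchset's
conventions as discussed in the thread; treat this as illustrative, not
the exact patch that was later sent.]

    /*
     * Illustrative sketch, not the exact DPDK patch: on feature
     * negotiation, compute the virtio-net header length once and apply
     * it to the RX and TX vring of *every* allocated queue pair.
     * Setting it only for queue pair 0 reproduces the bug described
     * above: traffic on qp_index > 0 then runs with a wrong vhost_hlen
     * and the packets are dropped.
     */
    static void
    set_vhost_hlen_all_qps(struct virtio_net *dev)
    {
        uint32_t hdr_len;
        uint32_t i;

        if (dev->features & (1ULL << VIRTIO_NET_F_MRG_RXBUF))
            hdr_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
        else
            hdr_len = sizeof(struct virtio_net_hdr);

        for (i = 0; i < dev->virt_qp_nb; i++) {
            /* Each queue pair owns one RX and one TX vring. */
            dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ]->vhost_hlen = hdr_len;
            dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ]->vhost_hlen = hdr_len;
        }
    }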
Please do more testing then :)

	--yliu

> >
>
> Yes, I've been hunting it since you submitted the series :)
>
> >>
> >>You can be sure by using the following command in one of the VMs:
> >>    cat /proc/interrupts | grep virtio
> >>and see that you have interrupts for all virtio0-input.0/1/...
> >
> >    [root@dpdk-kvm ~]# cat /proc/interrupts | grep virtio
> >    24:          0          0   PCI-MSI-edge      virtio0-config
> >    25:        425          0   PCI-MSI-edge      virtio0-virtqueues
> >
>
> Here it shows that MQ is not enabled in the guest.
> For queues=2 on the QEMU command line and 'ethtool -L eth0 combined 2' in
> the guest you should see:
>
>    24:          0          0          0          0   PCI-MSI 65536-edge      virtio0-config
>    25:         32          0         14          0   PCI-MSI 65537-edge      virtio0-input.0
>    26:          1          0          0          0   PCI-MSI 65538-edge      virtio0-output.0
>    27:         53          0          0          0   PCI-MSI 65539-edge      virtio0-input.1
>    28:          1          0          0          0   PCI-MSI 65540-edge      virtio0-output.1
>
>
> So, you are very close to reproducing the MQ bug.
> Please ensure:
> 1. You have applied the MQ patches to QEMU/DPDK.
> 2. You have applied the MQ patch to *OVS*:
>        https://www.mail-archive.com/dev@openvswitch.org/msg49198.html
>    - It does not apply cleanly; just remove the chunk with the "if"
>      statement that fails to compile.
> 3. Configure OVS for 2 queues:
>    - ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=2
>    - ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xff00
> 4. Enable MQ on the virtio-net device:
>    -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce,queues=2 \
>    -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:$2,mq=on,vectors=8 \
>
> At this stage you should still have ping working between the VMs.
>
> However, when running on both VMs:
>    ethtool -L eth0 combined 2
> the traffic stops...
>
> Thanks again for the help!
>
> >
> >BTW, I have seen some warnings from OVS:
> >
> >    2015-09-22T02:08:58Z|00003|ofproto_dpif_upcall(pmd45)|WARN|upcall_cb failure: ukey installation fails
> >
> >    2015-09-22T02:11:05Z|00003|ofproto_dpif_upcall(pmd44)|WARN|Dropped 29 log messages in last 127 seconds (most recently, 82 seconds ago) due to excessive rate
> >    2015-09-22T02:11:05Z|00004|ofproto_dpif_upcall(pmd44)|WARN|upcall_cb failure: ukey installation fails
> >    2015-09-22T02:12:17Z|00005|ofproto_dpif_upcall(pmd44)|WARN|Dropped 11 log messages in last 32 seconds (most recently, 14 seconds ago) due to excessive rate
> >    2015-09-22T02:12:17Z|00006|ofproto_dpif_upcall(pmd44)|WARN|upcall_cb failure: ukey installation fails
> >    2015-09-22T02:14:59Z|00007|ofproto_dpif_upcall(pmd44)|WARN|Dropped 2 log messages in last 161 seconds (most recently, 161 seconds ago) due to excessive rate
> >    2015-09-22T02:14:59Z|00008|ofproto_dpif_upcall(pmd44)|WARN|upcall_cb failure: ukey installation fails
> >
> >
> >Does that look abnormal to you?
>
> Nope, but since you have ping between the VMs it should not bother you.
>
> >
> >Anyway, I'll check here if there is anything I can fix.
>
> Thanks!!!
>
> Marcel
>
> >
> >	--yliu
> >
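[As a footnote: instrumentation like the ":: TX / :: RX" debug lines quoted
earlier in the thread could be produced by wrapping the two vhost burst
calls roughly as below. This is a sketch, not the actual OVS code: the
qp_index-to-queue-id mapping (qp_index * VIRTIO_QNUM + VIRTIO_RXQ/TXQ)
follows the MQ patchset's convention, and names such as debug_vhost_tx,
debug_vhost_rx and mbuf_pool are placeholders.]

    #include <stdio.h>
    #include <rte_mbuf.h>
    #include <rte_virtio_net.h>

    #define BURST_SZ 32  /* matches the "asked: 32" in the logs above */

    /* Guest TX path: dequeue a burst from the guest ring, log the count. */
    static uint16_t
    debug_vhost_tx(struct virtio_net *dev, uint16_t qp_index,
                   struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts)
    {
        uint16_t qid = qp_index * VIRTIO_QNUM + VIRTIO_TXQ;
        uint16_t n = rte_vhost_dequeue_burst(dev, qid, mbuf_pool, pkts, BURST_SZ);

        printf(":: TX: vhost-dev: %s, qp_index: %u, asked: %u, dequeued: %u\n",
               dev->ifname, qp_index, BURST_SZ, n);
        return n;
    }

    /* Guest RX path: enqueue a burst towards the guest ring, log the count. */
    static uint16_t
    debug_vhost_rx(struct virtio_net *dev, uint16_t qp_index,
                   struct rte_mbuf **pkts, uint16_t count)
    {
        uint16_t qid = qp_index * VIRTIO_QNUM + VIRTIO_RXQ;
        uint16_t n = rte_vhost_enqueue_burst(dev, qid, pkts, count);

        printf(":: RX: vhost-dev: %s, qp_index: %u, asked: %u, enqueued: %u\n",
               dev->ifname, qp_index, count, n);
        return n;
    }

[In a healthy run both directions make progress on whichever qp_index the
flow lands on; in the failing runs above, only one side of qp_index 1 ever
moves, which is what pointed at the per-queue-pair vhost_hlen
initialization.]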