From: Ilya Maximets
To: Yuanhan Liu
Cc: dev@dpdk.org, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan
Date: Fri, 08 Jul 2016 14:48:56 +0300
Subject: Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
Message-id: <577F9328.1030901@samsung.com>
In-reply-to: <20160706122446.GO26521@yliu-dev.sh.intel.com>
References: <1463748604-27251-1-git-send-email-i.maximets@samsung.com>
 <20160701073506.GQ2831@yliu-dev.sh.intel.com>
 <577CE930.2070007@samsung.com>
 <20160706122446.GO26521@yliu-dev.sh.intel.com>
List-Id: patches and discussions about DPDK

On 06.07.2016 15:24, Yuanhan Liu wrote:
> On Wed, Jul 06, 2016 at 02:19:12PM +0300, Ilya Maximets wrote:
>> On 01.07.2016 10:35, Yuanhan Liu wrote:
>>> Hi,
>>>
>>> Sorry for the long delay.
>>>
>>> On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
>>>> In current implementation guest application can reinitialize vrings
>>>> by executing start after stop. In the same time host application
>>>> can still poll virtqueue while device stopped in guest and it will
>>>> crash with segmentation fault while vring reinitialization because
>>>> of dereferencing of bad descriptor addresses.
>>>
>>> Yes, you are right that vring will be reinitialized after restart.
>>> But even though, I don't see the reason it will cause a vhost crash,
>>> since the reinitialization will reset all the vring memory by 0:
>>>
>>>     memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
>>>
>>> That means those bad descriptors will be skipped, safely, at vhost
>>> side by:
>>>
>>>     if (unlikely(desc->len < dev->vhost_hlen))
>>>         return -1;
>>>
>>>>
>>>> OVS crash for example:
>>>> <------------------------------------------------------------------------>
>>>> [test-pmd inside guest VM]
>>>>
>>>> testpmd> port stop all
>>>> Stopping ports...
>>>> Checking link statuses...
>>>> Port 0 Link Up - speed 10000 Mbps - full-duplex
>>>> Done
>>>> testpmd> port config all rxq 2
>>>> testpmd> port config all txq 2
>>>> testpmd> port start all
>>>> Configuring Port 0 (socket 0)
>>>> Port 0: 52:54:00:CB:44:C8
>>>> Checking link statuses...
>>>> Port 0 Link Up - speed 10000 Mbps - full-duplex
>>>> Done
>>>>
>>>> [OVS on host]
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
>>>
>>> Interesting, so it bypasses the above check since desc->len is non-zero
>>> while desc->addr is zero. The size (2056) also looks weird.
>>>
>>> Do you mind to check this issue a bit deeper, say why desc->addr is
>>> zero, however, desc->len is not?
>>
>> OK. I checked this few more times.
>
> Thanks!
>
>> Actually, I see, that desc->addr is
>> not zero. All desc memory looks like some rubbish:
>>
>> <------------------------------------------------------------------------------>
>> (gdb)
>> #3 copy_desc_to_mbuf (mbuf_pool=0x7fe9da9f4480, desc_idx=65363,
>>     m=0x7fe9db269400, vq=0x7fe9fff7bac0, dev=0x7fe9fff7cbc0)
>>         desc = 0x2aabc00ff530
>>         desc_addr = 0
>>         mbuf_offset = 0
>>         prev = 0x7fe9db269400
>>         nr_desc = 1
>>         desc_offset = 12
>>         cur = 0x7fe9db269400
>>         hdr = 0x0
>>         desc_avail = 1012591375
>>         mbuf_avail = 1526
>>         cpy_len = 1526
>>
>> (gdb) p *desc
>> $2 = {addr = 8507655620301055744, len = 1012591387, flags = 3845, next = 48516}
>>
>> <------------------------------------------------------------------------------>
>>
>> And 'desc_addr' equals zero because 'gpa_to_vva' just can't map this huge
>> address to host's.
>>
>> Scenario was the same. SIGSEGV received right after 'port start all'.
>>
>> Another thought:
>>
>> Actually, there is a race window between 'memset' in guest and reading
>> of 'desc->len' and 'desc->addr' on host. So, it's possible to read non
>> zero 'len' and zero 'addr' right after that.
>
> That's also what I was thinking, that it should the only reason caused
> such issue.
>
>> But you're right, this case should be very rare.
>
> Yes, it should be very rare. What troubles me is that seems you can
> reproduce this issue very easily, that I doubt it's caused by this
> rare race. The reason could be something else?

I don't know exactly what happens, but it happens consistently when the
'port start all' command is executed. The descriptors are broken for some
time, and after that they become normal again. I think so because the
network works with my patch applied. This means that broken descriptors
appear for a short time while virtio restarts with the new configuration.

Another point is that the crash consistently happens on queue_id=3 (the
second RX queue) in my scenario. This is the virtqueue that is newly
allocated during reconfiguration from rxq=1 to rxq=2.

Obviously, virtio needs to be fixed, but we need to check the address on
the vhost side anyway, because in the general case we don't know what
happens in the guest.

Best regards, Ilya Maximets.
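
To illustrate the kind of vhost-side check argued for above, here is a
minimal, self-contained sketch. It is not the actual patch and does not use
the DPDK API: 'struct mem_region', 'translate()' and 'copy_desc_checked()'
are hypothetical names invented for this example, and the single-region
memory map is an assumption made to keep it short. The idea is only that a
failed guest-physical to host-virtual translation (the case where
'gpa_to_vva' yields zero in the backtrace) is treated as a descriptor to
skip rather than a pointer to copy from.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as a virtio ring descriptor. */
struct desc {
    uint64_t addr;   /* guest physical address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;
    uint16_t next;
};

/* Hypothetical single-region guest memory map (assumption for this sketch). */
struct mem_region {
    uint64_t gpa_start;   /* first guest physical address of the region */
    uint64_t size;        /* region size in bytes */
    uint8_t *host_base;   /* where the region is mapped in the host process */
};

/*
 * Translate a guest physical range to a host pointer.
 * Returns NULL if any part of [gpa, gpa + len) lies outside the region.
 */
static uint8_t *
translate(const struct mem_region *r, uint64_t gpa, uint32_t len)
{
    uint64_t off;

    if (gpa < r->gpa_start)
        return NULL;
    off = gpa - r->gpa_start;
    if (off >= r->size || len > r->size - off)
        return NULL;
    return r->host_base + off;
}

/*
 * Copy the payload that follows a header of hdr_len bytes into dst.
 * Returns the number of bytes copied, or -1 if the descriptor must be
 * skipped (too short, or its address cannot be translated).
 */
static int
copy_desc_checked(const struct mem_region *r, const struct desc *d,
                  uint32_t hdr_len, uint8_t *dst, size_t dst_len)
{
    /* Read each field once: the guest may rewrite the ring concurrently. */
    uint64_t addr = d->addr;
    uint32_t len = d->len;
    uint8_t *src;
    size_t n;

    if (len < hdr_len)
        return -1;          /* existing length check */

    src = translate(r, addr, len);
    if (src == NULL)
        return -1;          /* added check: bad descriptor address */

    n = len - hdr_len;
    if (n > dst_len)
        n = dst_len;        /* never write past the destination buffer */
    memcpy(dst, src + hdr_len, n);
    return (int)n;
}
```

With the descriptor contents from the gdb output above (addr =
8507655620301055744, len = 1012591387), 'translate()' fails and the
descriptor is skipped instead of being passed to the copy routine. A
descriptor that is still all zeroes after the guest's memset is rejected by
the length check, as before.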