DPDK patches and discussions
From: Ilya Maximets <i.maximets@samsung.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: dev@dpdk.org, Huawei Xie <huawei.xie@intel.com>,
	Dyasly Sergey <s.dyasly@samsung.com>,
	Heetae Ahn <heetae82.ahn@samsung.com>,
	Jianfeng Tan <jianfeng.tan@intel.com>
Subject: Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
Date: Wed, 06 Jul 2016 14:19:12 +0300
Message-ID: <577CE930.2070007@samsung.com>
In-Reply-To: <20160701073506.GQ2831@yliu-dev.sh.intel.com>

On 01.07.2016 10:35, Yuanhan Liu wrote:
> Hi,
> 
> Sorry for the long delay.
> 
> On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
>> In the current implementation, a guest application can reinitialize
>> vrings by executing start after stop. At the same time, the host
>> application can still poll the virtqueue while the device is stopped
>> in the guest, and it will crash with a segmentation fault during vring
>> reinitialization because it dereferences bad descriptor addresses.
> 
> Yes, you are right that the vring will be reinitialized after restart.
> But even so, I don't see why it would cause a vhost crash, since the
> reinitialization resets all the vring memory to 0:
> 
>     memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
> 
> That means those bad descriptors will be skipped, safely, at vhost
> side by:
> 
> 	if (unlikely(desc->len < dev->vhost_hlen))
> 		return -1;
> 
>>
>> OVS crash for example:
>> <------------------------------------------------------------------------>
>> [test-pmd inside guest VM]
>>
>> 	testpmd> port stop all
>> 	    Stopping ports...
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
>> 	testpmd> port config all rxq 2
>> 	testpmd> port config all txq 2
>> 	testpmd> port start all
>> 	    Configuring Port 0 (socket 0)
>> 	    Port 0: 52:54:00:CB:44:C8
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
>>
>> [OVS on host]
>> 	Program received signal SIGSEGV, Segmentation fault.
>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
> 
> Interesting, so it bypasses the above check since desc->len is non-zero
> while desc->addr is zero. The size (2056) also looks weird.
> 
> Would you mind checking this issue a bit deeper, say, why desc->addr is
> zero while desc->len is not?

OK. I checked this a few more times. Actually, I see that desc->addr is
not zero. All of the desc memory looks like garbage:

<------------------------------------------------------------------------------>
(gdb)
#3 copy_desc_to_mbuf (mbuf_pool=0x7fe9da9f4480, desc_idx=65363,
                      m=0x7fe9db269400, vq=0x7fe9fff7bac0, dev=0x7fe9fff7cbc0)
        desc = 0x2aabc00ff530
        desc_addr = 0
        mbuf_offset = 0
        prev = 0x7fe9db269400
        nr_desc = 1
        desc_offset = 12
        cur = 0x7fe9db269400
        hdr = 0x0
        desc_avail = 1012591375
        mbuf_avail = 1526
        cpy_len = 1526

(gdb) p *desc
$2 = {addr = 8507655620301055744, len = 1012591387, flags = 3845, next = 48516}

<------------------------------------------------------------------------------>

And 'desc_addr' equals zero because 'gpa_to_vva' simply can't map this huge
guest address to a host virtual address.
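
For illustration, a minimal sketch of how the copy source ends up at 0xc,
assuming a simplified region table. 'struct mem_region', 'sketch_gpa_to_vva'
and 'sketch_copy_source' are hypothetical names, not the actual vhost code.
A lookup that returns 0 for an unmapped address, combined with
'desc_offset = 12', gives 0 + 12 = 0xc, which matches 'src=0xc' in the
backtrace. The second helper shows the kind of address check the patch
proposes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one guest memory region. */
struct mem_region {
	uint64_t guest_phys_start; /* first guest physical address */
	uint64_t guest_phys_end;   /* last guest physical address (inclusive) */
	uint64_t host_offset;      /* added to a guest address to get a host one */
};

/* Translate a guest physical address; 0 means "not mapped". */
static uint64_t
sketch_gpa_to_vva(const struct mem_region *regions, size_t nregions,
		  uint64_t guest_pa)
{
	size_t i;

	for (i = 0; i < nregions; i++) {
		if (guest_pa >= regions[i].guest_phys_start &&
		    guest_pa <= regions[i].guest_phys_end)
			return guest_pa + regions[i].host_offset;
	}
	return 0;
}

/*
 * Return the host address to copy from, or 0 if the descriptor address
 * cannot be translated and the descriptor must be rejected.
 */
static uint64_t
sketch_copy_source(const struct mem_region *regions, size_t nregions,
		   uint64_t desc_guest_addr, uint32_t desc_offset)
{
	uint64_t desc_addr;

	desc_addr = sketch_gpa_to_vva(regions, nregions, desc_guest_addr);
	if (desc_addr == 0)
		return 0; /* unchecked, the copy would read from 0 + desc_offset */
	return desc_addr + desc_offset;
}
```

With the 'addr' value from the gdb output above and a region table that
does not cover it, the lookup returns 0 and the descriptor is rejected
instead of being passed to the copy.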

The scenario was the same: SIGSEGV was received right after 'port start all'.

Another thought:

	Actually, there is a race window between the 'memset' in the guest and
	the reading of 'desc->len' and 'desc->addr' on the host. So, it's
	possible to read a non-zero 'len' and a zero 'addr' right after that.
	But you're right, this case should be very rare.

> 
>> 	(gdb) bt
>> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
>> 	    #1  copy_desc_to_mbuf
>> 	    #2  rte_vhost_dequeue_burst
>> 	    #3  netdev_dpdk_vhost_rxq_recv
>> 	    ...
>>
>> 	(gdb) bt full
>> 	    #0  rte_memcpy
>> 	        ...
>> 	    #1  copy_desc_to_mbuf
>> 	        desc_addr = 0
>> 	        mbuf_offset = 0
>> 	        desc_offset = 12
>> 	        ...
>> <------------------------------------------------------------------------>
>>
>> Fix that by checking addresses of descriptors before using them.
>>
>> Note: For mergeable buffers this patch checks only the guest's address
>> for zero, but in the non-mergeable case the host's address is checked.
>> This is done because checking the host's address in the mergeable case
>> requires additional refactoring to keep the virtqueue in a consistent
>> state in case of error.
>>
>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>> ---
>>
>> Actually, the current virtio implementation looks broken to me, because
>> 'virtio_dev_start' breaks the virtqueue while it is still available from
>> the vhost side.
> 
> Yes, this sounds buggy. Maybe we could avoid resetting the avail idx; in
> that case vhost dequeue/enqueue will just return, as there are no more
> packets to dequeue and no more space to enqueue, respectively?

Maybe this will be a good fix for virtio, because vhost will not try to receive
from wrong descriptors. But it will not help if vhost is already in the middle
of receiving something at the time of the guest's reconfiguration.
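
For illustration, a minimal sketch of the avail idx arithmetic behind that
suggestion. 'sketch_pending_entries' is a hypothetical helper, not the
actual vhost code. It assumes the host derives the number of entries to
process from the 16-bit difference between the guest's avail idx and the
index it has consumed so far.

```c
#include <stdint.h>

/*
 * Number of avail ring entries the host has not consumed yet.
 * The subtraction is done modulo 2^16, so index wraparound is handled.
 */
static uint16_t
sketch_pending_entries(uint16_t avail_idx, uint16_t last_used_idx)
{
	return (uint16_t)(avail_idx - last_used_idx);
}
```

If the avail idx is left untouched, the difference is 0 and the host has
nothing to process. If the guest resets it to 0 while the host still
remembers 100, the same subtraction yields 65436, so the host would go on
to read entries that hold no valid descriptors.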

Best regards, Ilya Maximets.

