DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
@ 2016-05-20 12:50 Ilya Maximets
  2016-05-23 10:57 ` Yuanhan Liu
                   ` (5 more replies)
  0 siblings, 6 replies; 39+ messages in thread
From: Ilya Maximets @ 2016-05-20 12:50 UTC (permalink / raw)
  To: dev, Huawei Xie, Yuanhan Liu
  Cc: Dyasly Sergey, Heetae Ahn, Jianfeng Tan, Ilya Maximets

In current implementation guest application can reinitialize vrings
by executing start after stop. In the same time host application
can still poll virtqueue while device stopped in guest and it will
crash with segmentation fault while vring reinitialization because
of dereferencing of bad descriptor addresses.

OVS crash for example:
<------------------------------------------------------------------------>
[test-pmd inside guest VM]

	testpmd> port stop all
	    Stopping ports...
	    Checking link statuses...
	    Port 0 Link Up - speed 10000 Mbps - full-duplex
	    Done
	testpmd> port config all rxq 2
	testpmd> port config all txq 2
	testpmd> port start all
	    Configuring Port 0 (socket 0)
	    Port 0: 52:54:00:CB:44:C8
	    Checking link statuses...
	    Port 0 Link Up - speed 10000 Mbps - full-duplex
	    Done

[OVS on host]
	Program received signal SIGSEGV, Segmentation fault.
	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h

	(gdb) bt
	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
	    #1  copy_desc_to_mbuf
	    #2  rte_vhost_dequeue_burst
	    #3  netdev_dpdk_vhost_rxq_recv
	    ...

	(gdb) bt full
	    #0  rte_memcpy
	        ...
	    #1  copy_desc_to_mbuf
	        desc_addr = 0
	        mbuf_offset = 0
	        desc_offset = 12
	        ...
<------------------------------------------------------------------------>

Fix that by checking addresses of descriptors before using them.

Note: For mergeable buffers this patch checks only guest's address for
zero, but in non-meargeable case host's address checked. This is done
because checking of host's address in mergeable case requires additional
refactoring to keep virtqueue in consistent state in case of error.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---

Actually, current virtio implementation looks broken for me. Because
'virtio_dev_start' breaks virtqueue while it still available from the vhost
side.

There was 2 patches about this behaviour:

	1. a85786dc816f ("virtio: fix states handling during initialization")
	2. 9a0615af7746 ("virtio: fix restart")

The second patch fixes somehow issue intoduced in the first patch, but actually
also breaks vhost in the way described above.
It's not pretty clear for me what to do in current situation with virtio,
because it will be broken for guest application even if vhost will not crash.

May be it'll be better to forbid stopping of virtio device and force user to
exit and start again (may be implemented in hidden from user way)?

This patch adds additional sane checks, so it should be applied anyway, IMHO.

 lib/librte_vhost/vhost_rxtx.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 750821a..9d05739 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
 
 	desc = &vq->desc[desc_idx];
-	if (unlikely(desc->len < vq->vhost_hlen))
+	desc_addr = gpa_to_vva(dev, desc->addr);
+	if (unlikely(desc->len < vq->vhost_hlen || !desc_addr))
 		return -1;
 
-	desc_addr = gpa_to_vva(dev, desc->addr);
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 	virtio_enqueue_offload(m, &virtio_hdr.hdr);
@@ -184,6 +184,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 
 			desc = &vq->desc[desc->next];
 			desc_addr   = gpa_to_vva(dev, desc->addr);
+			if (unlikely(!desc_addr))
+				return -1;
+
 			desc_offset = 0;
 			desc_avail  = desc->len;
 		}
@@ -344,7 +347,8 @@ fill_vec_buf(struct vhost_virtqueue *vq, uint32_t avail_idx,
 	uint32_t len    = *allocated;
 
 	while (1) {
-		if (unlikely(vec_id >= BUF_VECTOR_MAX || idx >= vq->size))
+		if (unlikely(vec_id >= BUF_VECTOR_MAX || idx >= vq->size
+			|| !vq->desc[idx].addr))
 			return -1;
 
 		len += vq->desc[idx].len;
@@ -747,10 +751,10 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	uint32_t nr_desc = 1;
 
 	desc = &vq->desc[desc_idx];
-	if (unlikely(desc->len < vq->vhost_hlen))
+	desc_addr = gpa_to_vva(dev, desc->addr);
+	if (unlikely(desc->len < vq->vhost_hlen || !desc_addr))
 		return -1;
 
-	desc_addr = gpa_to_vva(dev, desc->addr);
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 	/* Retrieve virtio net header */
@@ -769,6 +773,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			desc = &vq->desc[desc->next];
 
 			desc_addr = gpa_to_vva(dev, desc->addr);
+			if (unlikely(!desc_addr))
+				return -1;
+
 			rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 			desc_offset = 0;
-- 
2.5.0

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-20 12:50 [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address Ilya Maximets
@ 2016-05-23 10:57 ` Yuanhan Liu
  2016-05-23 11:04   ` Ilya Maximets
  2016-05-30 12:00 ` Tan, Jianfeng
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-05-23 10:57 UTC (permalink / raw)
  To: Ilya Maximets; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
> In current implementation guest application can reinitialize vrings
> by executing start after stop. In the same time host application
> can still poll virtqueue while device stopped in guest and it will
> crash with segmentation fault while vring reinitialization because
> of dereferencing of bad descriptor addresses.
> 
> OVS crash for example:
> <------------------------------------------------------------------------>
> [test-pmd inside guest VM]
> 
> 	testpmd> port stop all
> 	    Stopping ports...
> 	    Checking link statuses...
> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> 	    Done
> 	testpmd> port config all rxq 2
> 	testpmd> port config all txq 2
> 	testpmd> port start all
> 	    Configuring Port 0 (socket 0)
> 	    Port 0: 52:54:00:CB:44:C8
> 	    Checking link statuses...
> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> 	    Done

I actually didn't manage to reproduce it on my side, with the
vhost-example instead of OVS though. Is that all the commands
to reproduce it, and run them just after start test-pmd?

> [OVS on host]
> 	Program received signal SIGSEGV, Segmentation fault.
> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
> 
> 	(gdb) bt
> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
> 	    #1  copy_desc_to_mbuf
> 	    #2  rte_vhost_dequeue_burst
> 	    #3  netdev_dpdk_vhost_rxq_recv
> 	    ...
> 
> 	(gdb) bt full
> 	    #0  rte_memcpy
> 	        ...
> 	    #1  copy_desc_to_mbuf
> 	        desc_addr = 0
> 	        mbuf_offset = 0
> 	        desc_offset = 12
> 	        ...
> <------------------------------------------------------------------------>
> 
> Fix that by checking addresses of descriptors before using them.
> 
> Note: For mergeable buffers this patch checks only guest's address for
> zero, but in non-meargeable case host's address checked. This is done
> because checking of host's address in mergeable case requires additional
> refactoring to keep virtqueue in consistent state in case of error.
> 
> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> ---
> 
> Actually, current virtio implementation looks broken for me. Because
> 'virtio_dev_start' breaks virtqueue while it still available from the vhost
> side.
> 
> There was 2 patches about this behaviour:
> 
> 	1. a85786dc816f ("virtio: fix states handling during initialization")
> 	2. 9a0615af7746 ("virtio: fix restart")
> 
> The second patch fixes somehow issue intoduced in the first patch, but actually
> also breaks vhost in the way described above.
> It's not pretty clear for me what to do in current situation with virtio,
> because it will be broken for guest application even if vhost will not crash.
> 
> May be it'll be better to forbid stopping of virtio device and force user to
> exit and start again (may be implemented in hidden from user way)?
> 
> This patch adds additional sane checks, so it should be applied anyway, IMHO.

Agreed.

	--yliu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-23 10:57 ` Yuanhan Liu
@ 2016-05-23 11:04   ` Ilya Maximets
  2016-05-30 11:05     ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-05-23 11:04 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On 23.05.2016 13:57, Yuanhan Liu wrote:
> On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
>> In current implementation guest application can reinitialize vrings
>> by executing start after stop. In the same time host application
>> can still poll virtqueue while device stopped in guest and it will
>> crash with segmentation fault while vring reinitialization because
>> of dereferencing of bad descriptor addresses.
>>
>> OVS crash for example:
>> <------------------------------------------------------------------------>
>> [test-pmd inside guest VM]
>>
>> 	testpmd> port stop all
>> 	    Stopping ports...
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
>> 	testpmd> port config all rxq 2
>> 	testpmd> port config all txq 2
>> 	testpmd> port start all
>> 	    Configuring Port 0 (socket 0)
>> 	    Port 0: 52:54:00:CB:44:C8
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
> 
> I actually didn't manage to reproduce it on my side, with the
> vhost-example instead of OVS though. Is that all the commands
> to reproduce it, and run them just after start test-pmd?

Actually, I think, packet flow should be enabled while performing
above actions and some traffic already should be sent through port
to change last used idx on vhost side.

Something like:
	start
	..wait a while.. see that packets are flowing.
	stop
	port stop
	port config
	port config
	port start
> 
>> [OVS on host]
>> 	Program received signal SIGSEGV, Segmentation fault.
>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
>>
>> 	(gdb) bt
>> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
>> 	    #1  copy_desc_to_mbuf
>> 	    #2  rte_vhost_dequeue_burst
>> 	    #3  netdev_dpdk_vhost_rxq_recv
>> 	    ...
>>
>> 	(gdb) bt full
>> 	    #0  rte_memcpy
>> 	        ...
>> 	    #1  copy_desc_to_mbuf
>> 	        desc_addr = 0
>> 	        mbuf_offset = 0
>> 	        desc_offset = 12
>> 	        ...
>> <------------------------------------------------------------------------>
>>
>> Fix that by checking addresses of descriptors before using them.
>>
>> Note: For mergeable buffers this patch checks only guest's address for
>> zero, but in non-meargeable case host's address checked. This is done
>> because checking of host's address in mergeable case requires additional
>> refactoring to keep virtqueue in consistent state in case of error.
>>
>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>> ---
>>
>> Actually, current virtio implementation looks broken for me. Because
>> 'virtio_dev_start' breaks virtqueue while it still available from the vhost
>> side.
>>
>> There was 2 patches about this behaviour:
>>
>> 	1. a85786dc816f ("virtio: fix states handling during initialization")
>> 	2. 9a0615af7746 ("virtio: fix restart")
>>
>> The second patch fixes somehow issue intoduced in the first patch, but actually
>> also breaks vhost in the way described above.
>> It's not pretty clear for me what to do in current situation with virtio,
>> because it will be broken for guest application even if vhost will not crash.
>>
>> May be it'll be better to forbid stopping of virtio device and force user to
>> exit and start again (may be implemented in hidden from user way)?
>>
>> This patch adds additional sane checks, so it should be applied anyway, IMHO.
> 
> Agreed.
> 
> 	--yliu
> 
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-23 11:04   ` Ilya Maximets
@ 2016-05-30 11:05     ` Ilya Maximets
  2016-05-30 14:25       ` Yuanhan Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-05-30 11:05 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

Ping.

Best regards, Ilya Maximets.

On 23.05.2016 14:04, Ilya Maximets wrote:
> On 23.05.2016 13:57, Yuanhan Liu wrote:
>> On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
>>> In current implementation guest application can reinitialize vrings
>>> by executing start after stop. In the same time host application
>>> can still poll virtqueue while device stopped in guest and it will
>>> crash with segmentation fault while vring reinitialization because
>>> of dereferencing of bad descriptor addresses.
>>>
>>> OVS crash for example:
>>> <------------------------------------------------------------------------>
>>> [test-pmd inside guest VM]
>>>
>>> 	testpmd> port stop all
>>> 	    Stopping ports...
>>> 	    Checking link statuses...
>>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>>> 	    Done
>>> 	testpmd> port config all rxq 2
>>> 	testpmd> port config all txq 2
>>> 	testpmd> port start all
>>> 	    Configuring Port 0 (socket 0)
>>> 	    Port 0: 52:54:00:CB:44:C8
>>> 	    Checking link statuses...
>>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>>> 	    Done
>>
>> I actually didn't manage to reproduce it on my side, with the
>> vhost-example instead of OVS though. Is that all the commands
>> to reproduce it, and run them just after start test-pmd?
> 
> Actually, I think, packet flow should be enabled while performing
> above actions and some traffic already should be sent through port
> to change last used idx on vhost side.
> 
> Something like:
> 	start
> 	..wait a while.. see that packets are flowing.
> 	stop
> 	port stop
> 	port config
> 	port config
> 	port start
>>
>>> [OVS on host]
>>> 	Program received signal SIGSEGV, Segmentation fault.
>>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
>>>
>>> 	(gdb) bt
>>> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
>>> 	    #1  copy_desc_to_mbuf
>>> 	    #2  rte_vhost_dequeue_burst
>>> 	    #3  netdev_dpdk_vhost_rxq_recv
>>> 	    ...
>>>
>>> 	(gdb) bt full
>>> 	    #0  rte_memcpy
>>> 	        ...
>>> 	    #1  copy_desc_to_mbuf
>>> 	        desc_addr = 0
>>> 	        mbuf_offset = 0
>>> 	        desc_offset = 12
>>> 	        ...
>>> <------------------------------------------------------------------------>
>>>
>>> Fix that by checking addresses of descriptors before using them.
>>>
>>> Note: For mergeable buffers this patch checks only guest's address for
>>> zero, but in non-meargeable case host's address checked. This is done
>>> because checking of host's address in mergeable case requires additional
>>> refactoring to keep virtqueue in consistent state in case of error.
>>>
>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>> ---
>>>
>>> Actually, current virtio implementation looks broken for me. Because
>>> 'virtio_dev_start' breaks virtqueue while it still available from the vhost
>>> side.
>>>
>>> There was 2 patches about this behaviour:
>>>
>>> 	1. a85786dc816f ("virtio: fix states handling during initialization")
>>> 	2. 9a0615af7746 ("virtio: fix restart")
>>>
>>> The second patch fixes somehow issue intoduced in the first patch, but actually
>>> also breaks vhost in the way described above.
>>> It's not pretty clear for me what to do in current situation with virtio,
>>> because it will be broken for guest application even if vhost will not crash.
>>>
>>> May be it'll be better to forbid stopping of virtio device and force user to
>>> exit and start again (may be implemented in hidden from user way)?
>>>
>>> This patch adds additional sane checks, so it should be applied anyway, IMHO.
>>
>> Agreed.
>>
>> 	--yliu
>>
>>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-20 12:50 [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address Ilya Maximets
  2016-05-23 10:57 ` Yuanhan Liu
@ 2016-05-30 12:00 ` Tan, Jianfeng
  2016-05-30 12:24   ` Ilya Maximets
  2016-05-31 22:06 ` Rich Lane
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 39+ messages in thread
From: Tan, Jianfeng @ 2016-05-30 12:00 UTC (permalink / raw)
  To: Ilya Maximets, dev, Xie, Huawei, Yuanhan Liu; +Cc: Dyasly Sergey, Heetae Ahn

Hi,

> -----Original Message-----
> From: Ilya Maximets [mailto:i.maximets@samsung.com]
> Sent: Friday, May 20, 2016 8:50 PM
> To: dev@dpdk.org; Xie, Huawei; Yuanhan Liu
> Cc: Dyasly Sergey; Heetae Ahn; Tan, Jianfeng; Ilya Maximets
> Subject: [PATCH] vhost: fix segfault on bad descriptor address.
> 
> In current implementation guest application can reinitialize vrings
> by executing start after stop. In the same time host application
> can still poll virtqueue while device stopped in guest and it will
> crash with segmentation fault while vring reinitialization because
> of dereferencing of bad descriptor addresses.
> 
> OVS crash for example:
> <------------------------------------------------------------------------>
> [test-pmd inside guest VM]
> 
> 	testpmd> port stop all
> 	    Stopping ports...
> 	    Checking link statuses...
> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> 	    Done
> 	testpmd> port config all rxq 2
> 	testpmd> port config all txq 2
> 	testpmd> port start all
> 	    Configuring Port 0 (socket 0)
> 	    Port 0: 52:54:00:CB:44:C8
> 	    Checking link statuses...
> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> 	    Done
> 
> [OVS on host]
> 	Program received signal SIGSEGV, Segmentation fault.
> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at
> rte_memcpy.h
> 
> 	(gdb) bt
> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
> 	    #1  copy_desc_to_mbuf
> 	    #2  rte_vhost_dequeue_burst
> 	    #3  netdev_dpdk_vhost_rxq_recv
> 	    ...
> 
> 	(gdb) bt full
> 	    #0  rte_memcpy
> 	        ...
> 	    #1  copy_desc_to_mbuf
> 	        desc_addr = 0
> 	        mbuf_offset = 0
> 	        desc_offset = 12
> 	        ...
> <------------------------------------------------------------------------>
> 
> Fix that by checking addresses of descriptors before using them.
> 
> Note: For mergeable buffers this patch checks only guest's address for
> zero, but in non-meargeable case host's address checked. This is done
> because checking of host's address in mergeable case requires additional
> refactoring to keep virtqueue in consistent state in case of error.


I agree with you that it should be fixed because malicious guest could launch DOS attack on vswitch with the current implementation.

But I don't understand why you do not fix the mergable case in copy_mbuf_to_desc_mergable() on where gpa_to_vva() happens? And the change in fill_vec_buf(), checking !vq->desc[idx].addr, make any sense?

Thanks,
Jianfeng

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-30 12:00 ` Tan, Jianfeng
@ 2016-05-30 12:24   ` Ilya Maximets
  2016-05-31  6:53     ` Tan, Jianfeng
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-05-30 12:24 UTC (permalink / raw)
  To: Tan, Jianfeng, dev, Xie, Huawei, Yuanhan Liu; +Cc: Dyasly Sergey, Heetae Ahn

On 30.05.2016 15:00, Tan, Jianfeng wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Ilya Maximets [mailto:i.maximets@samsung.com]
>> Sent: Friday, May 20, 2016 8:50 PM
>> To: dev@dpdk.org; Xie, Huawei; Yuanhan Liu
>> Cc: Dyasly Sergey; Heetae Ahn; Tan, Jianfeng; Ilya Maximets
>> Subject: [PATCH] vhost: fix segfault on bad descriptor address.
>>
>> In current implementation guest application can reinitialize vrings
>> by executing start after stop. In the same time host application
>> can still poll virtqueue while device stopped in guest and it will
>> crash with segmentation fault while vring reinitialization because
>> of dereferencing of bad descriptor addresses.
>>
>> OVS crash for example:
>> <------------------------------------------------------------------------>
>> [test-pmd inside guest VM]
>>
>> 	testpmd> port stop all
>> 	    Stopping ports...
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
>> 	testpmd> port config all rxq 2
>> 	testpmd> port config all txq 2
>> 	testpmd> port start all
>> 	    Configuring Port 0 (socket 0)
>> 	    Port 0: 52:54:00:CB:44:C8
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
>>
>> [OVS on host]
>> 	Program received signal SIGSEGV, Segmentation fault.
>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at
>> rte_memcpy.h
>>
>> 	(gdb) bt
>> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
>> 	    #1  copy_desc_to_mbuf
>> 	    #2  rte_vhost_dequeue_burst
>> 	    #3  netdev_dpdk_vhost_rxq_recv
>> 	    ...
>>
>> 	(gdb) bt full
>> 	    #0  rte_memcpy
>> 	        ...
>> 	    #1  copy_desc_to_mbuf
>> 	        desc_addr = 0
>> 	        mbuf_offset = 0
>> 	        desc_offset = 12
>> 	        ...
>> <------------------------------------------------------------------------>
>>
>> Fix that by checking addresses of descriptors before using them.
>>
>> Note: For mergeable buffers this patch checks only guest's address for
>> zero, but in non-meargeable case host's address checked. This is done
>> because checking of host's address in mergeable case requires additional
>> refactoring to keep virtqueue in consistent state in case of error.
> 
> 
> I agree with you that it should be fixed because malicious guest could launch
> DOS attack on vswitch with the current implementation.
> 
> But I don't understand why you do not fix the mergable case in
> copy_mbuf_to_desc_mergable() on where gpa_to_vva() happens? And the change in
> fill_vec_buf(), checking !vq->desc[idx].addr, make any sense?
> 
> Thanks,
> Jianfeng

Hi.
As I said inside commit-message, checking of host's address in mergeable case
requires additional refactoring to keep virtqueue in consistent state.

There are few issues with checking inside copy_mbuf_to_desc_mergable() :

	1. Ring elements already reserved and we must fill them with some
	   sane data before going out of virtio_dev_merge_rx().

	2. copy_mbuf_to_desc_mergable() can't return an error in current
	   implementation (additional checking needed), otherwise used->idx
	   will be decremented (I think, it's bad).


Checking !vq->desc[idx].addr inside fill_vec_buf() make sense in case of virtio
reinitialization, because guest's address will be zero (case described in
commit-message). Checking of guest's address will not help in case of bad and
not NULL address, but useful in this common case.
Also, we can't catch bad address what we able to map, so, malicious guest could
break vhost anyway.

I agree, that checking of host's address is better, but this may be done later
together with resolving above issues.

Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-30 11:05     ` Ilya Maximets
@ 2016-05-30 14:25       ` Yuanhan Liu
  2016-05-31  9:12         ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-05-30 14:25 UTC (permalink / raw)
  To: Ilya Maximets; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

Hi Ilya,

Generically speaking, this patch looks good to me. But I guess still
need more time to check this issue later; I still failed to reproduce
it on my side after all. So, please allow a late merge.

Thanks.

	--yliu

On Mon, May 30, 2016 at 02:05:07PM +0300, Ilya Maximets wrote:
> Ping.
> 
> Best regards, Ilya Maximets.
> 
> On 23.05.2016 14:04, Ilya Maximets wrote:
> > On 23.05.2016 13:57, Yuanhan Liu wrote:
> >> On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
> >>> In current implementation guest application can reinitialize vrings
> >>> by executing start after stop. In the same time host application
> >>> can still poll virtqueue while device stopped in guest and it will
> >>> crash with segmentation fault while vring reinitialization because
> >>> of dereferencing of bad descriptor addresses.
> >>>
> >>> OVS crash for example:
> >>> <------------------------------------------------------------------------>
> >>> [test-pmd inside guest VM]
> >>>
> >>> 	testpmd> port stop all
> >>> 	    Stopping ports...
> >>> 	    Checking link statuses...
> >>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> >>> 	    Done
> >>> 	testpmd> port config all rxq 2
> >>> 	testpmd> port config all txq 2
> >>> 	testpmd> port start all
> >>> 	    Configuring Port 0 (socket 0)
> >>> 	    Port 0: 52:54:00:CB:44:C8
> >>> 	    Checking link statuses...
> >>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> >>> 	    Done
> >>
> >> I actually didn't manage to reproduce it on my side, with the
> >> vhost-example instead of OVS though. Is that all the commands
> >> to reproduce it, and run them just after start test-pmd?
> > 
> > Actually, I think, packet flow should be enabled while performing
> > above actions and some traffic already should be sent through port
> > to change last used idx on vhost side.
> > 
> > Something like:
> > 	start
> > 	..wait a while.. see that packets are flowing.
> > 	stop
> > 	port stop
> > 	port config
> > 	port config
> > 	port start
> >>
> >>> [OVS on host]
> >>> 	Program received signal SIGSEGV, Segmentation fault.
> >>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
> >>>
> >>> 	(gdb) bt
> >>> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
> >>> 	    #1  copy_desc_to_mbuf
> >>> 	    #2  rte_vhost_dequeue_burst
> >>> 	    #3  netdev_dpdk_vhost_rxq_recv
> >>> 	    ...
> >>>
> >>> 	(gdb) bt full
> >>> 	    #0  rte_memcpy
> >>> 	        ...
> >>> 	    #1  copy_desc_to_mbuf
> >>> 	        desc_addr = 0
> >>> 	        mbuf_offset = 0
> >>> 	        desc_offset = 12
> >>> 	        ...
> >>> <------------------------------------------------------------------------>
> >>>
> >>> Fix that by checking addresses of descriptors before using them.
> >>>
> >>> Note: For mergeable buffers this patch checks only guest's address for
> >>> zero, but in non-meargeable case host's address checked. This is done
> >>> because checking of host's address in mergeable case requires additional
> >>> refactoring to keep virtqueue in consistent state in case of error.
> >>>
> >>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> >>> ---
> >>>
> >>> Actually, current virtio implementation looks broken for me. Because
> >>> 'virtio_dev_start' breaks virtqueue while it still available from the vhost
> >>> side.
> >>>
> >>> There was 2 patches about this behaviour:
> >>>
> >>> 	1. a85786dc816f ("virtio: fix states handling during initialization")
> >>> 	2. 9a0615af7746 ("virtio: fix restart")
> >>>
> >>> The second patch fixes somehow issue intoduced in the first patch, but actually
> >>> also breaks vhost in the way described above.
> >>> It's not pretty clear for me what to do in current situation with virtio,
> >>> because it will be broken for guest application even if vhost will not crash.
> >>>
> >>> May be it'll be better to forbid stopping of virtio device and force user to
> >>> exit and start again (may be implemented in hidden from user way)?
> >>>
> >>> This patch adds additional sane checks, so it should be applied anyway, IMHO.
> >>
> >> Agreed.
> >>
> >> 	--yliu
> >>
> >>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-30 12:24   ` Ilya Maximets
@ 2016-05-31  6:53     ` Tan, Jianfeng
  2016-05-31  9:10       ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Tan, Jianfeng @ 2016-05-31  6:53 UTC (permalink / raw)
  To: Ilya Maximets, dev, Xie, Huawei, Yuanhan Liu; +Cc: Dyasly Sergey, Heetae Ahn

Hi,


On 5/30/2016 8:24 PM, Ilya Maximets wrote:
> On 30.05.2016 15:00, Tan, Jianfeng wrote:
>> Hi,
>>
>>> -----Original Message-----
>>> From: Ilya Maximets [mailto:i.maximets@samsung.com]
>>> Sent: Friday, May 20, 2016 8:50 PM
>>> To: dev@dpdk.org; Xie, Huawei; Yuanhan Liu
>>> Cc: Dyasly Sergey; Heetae Ahn; Tan, Jianfeng; Ilya Maximets
>>> Subject: [PATCH] vhost: fix segfault on bad descriptor address.
>>>
>>> In current implementation guest application can reinitialize vrings
>>> by executing start after stop. In the same time host application
>>> can still poll virtqueue while device stopped in guest and it will
>>> crash with segmentation fault while vring reinitialization because
>>> of dereferencing of bad descriptor addresses.
>>>
>>> OVS crash for example:
>>> <------------------------------------------------------------------------>
>>> [test-pmd inside guest VM]
>>>
>>> 	testpmd> port stop all
>>> 	    Stopping ports...
>>> 	    Checking link statuses...
>>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>>> 	    Done
>>> 	testpmd> port config all rxq 2
>>> 	testpmd> port config all txq 2
>>> 	testpmd> port start all
>>> 	    Configuring Port 0 (socket 0)
>>> 	    Port 0: 52:54:00:CB:44:C8
>>> 	    Checking link statuses...
>>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>>> 	    Done
>>>
>>> [OVS on host]
>>> 	Program received signal SIGSEGV, Segmentation fault.
>>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at
>>> rte_memcpy.h
>>>
>>> 	(gdb) bt
>>> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
>>> 	    #1  copy_desc_to_mbuf
>>> 	    #2  rte_vhost_dequeue_burst
>>> 	    #3  netdev_dpdk_vhost_rxq_recv
>>> 	    ...
>>>
>>> 	(gdb) bt full
>>> 	    #0  rte_memcpy
>>> 	        ...
>>> 	    #1  copy_desc_to_mbuf
>>> 	        desc_addr = 0
>>> 	        mbuf_offset = 0
>>> 	        desc_offset = 12
>>> 	        ...
>>> <------------------------------------------------------------------------>
>>>
>>> Fix that by checking addresses of descriptors before using them.
>>>
>>> Note: For mergeable buffers this patch checks only guest's address for
>>> zero, but in non-meargeable case host's address checked. This is done
>>> because checking of host's address in mergeable case requires additional
>>> refactoring to keep virtqueue in consistent state in case of error.
>>
>> I agree with you that it should be fixed because malicious guest could launch
>> DOS attack on vswitch with the current implementation.
>>
>> But I don't understand why you do not fix the mergable case in
>> copy_mbuf_to_desc_mergable() on where gpa_to_vva() happens? And the change in
>> fill_vec_buf(), checking !vq->desc[idx].addr, make any sense?
>>
>> Thanks,
>> Jianfeng
> Hi.
> As I said inside commit-message, checking of host's address in mergeable case
> requires additional refactoring to keep virtqueue in consistent state.
>
> There are few issues with checking inside copy_mbuf_to_desc_mergable() :
>
> 	1. Ring elements already reserved and we must fill them with some
> 	   sane data before going out of virtio_dev_merge_rx().
>
> 	2. copy_mbuf_to_desc_mergable() can't return an error in current
> 	   implementation (additional checking needed), otherwise used->idx
> 	   will be decremented (I think, it's bad).

Yes, currently there is no way to return these invalid desc back to 
virtio because there's no invalid flag in virtio_net_hdr to indicate 
this desc contains no pkt. I see kernel just skips those descriptors 
with bad addr. I think it may rely on reset of the virtio device to 
improve such situation.

Another thing is that, your patch only checks the desc->addr, but we 
should check desc->addr + desc->len too, right?

Thanks,
Jianfeng

>
>
> Checking !vq->desc[idx].addr inside fill_vec_buf() make sense in case of virtio
> reinitialization, because guest's address will be zero (case described in
> commit-message). Checking of guest's address will not help in case of bad and
> not NULL address, but useful in this common case.
> Also, we can't catch bad address what we able to map, so, malicious guest could
> break vhost anyway.
>
> I agree, that checking of host's address is better, but this may be done later
> together with resolving above issues.
>
> Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-31  6:53     ` Tan, Jianfeng
@ 2016-05-31  9:10       ` Ilya Maximets
  0 siblings, 0 replies; 39+ messages in thread
From: Ilya Maximets @ 2016-05-31  9:10 UTC (permalink / raw)
  To: Tan, Jianfeng, dev, Xie, Huawei, Yuanhan Liu; +Cc: Dyasly Sergey, Heetae Ahn

On 31.05.2016 09:53, Tan, Jianfeng wrote:
> Hi,
> 
> 
> On 5/30/2016 8:24 PM, Ilya Maximets wrote:
>> On 30.05.2016 15:00, Tan, Jianfeng wrote:
>>> Hi,
>>>
>>>> -----Original Message-----
>>>> From: Ilya Maximets [mailto:i.maximets@samsung.com]
>>>> Sent: Friday, May 20, 2016 8:50 PM
>>>> To: dev@dpdk.org; Xie, Huawei; Yuanhan Liu
>>>> Cc: Dyasly Sergey; Heetae Ahn; Tan, Jianfeng; Ilya Maximets
>>>> Subject: [PATCH] vhost: fix segfault on bad descriptor address.
>>>>
>>>> In current implementation guest application can reinitialize vrings
>>>> by executing start after stop. In the same time host application
>>>> can still poll virtqueue while device stopped in guest and it will
>>>> crash with segmentation fault while vring reinitialization because
>>>> of dereferencing of bad descriptor addresses.
>>>>
>>>> OVS crash for example:
>>>> <------------------------------------------------------------------------>
>>>> [test-pmd inside guest VM]
>>>>
>>>>     testpmd> port stop all
>>>>         Stopping ports...
>>>>         Checking link statuses...
>>>>         Port 0 Link Up - speed 10000 Mbps - full-duplex
>>>>         Done
>>>>     testpmd> port config all rxq 2
>>>>     testpmd> port config all txq 2
>>>>     testpmd> port start all
>>>>         Configuring Port 0 (socket 0)
>>>>         Port 0: 52:54:00:CB:44:C8
>>>>         Checking link statuses...
>>>>         Port 0 Link Up - speed 10000 Mbps - full-duplex
>>>>         Done
>>>>
>>>> [OVS on host]
>>>>     Program received signal SIGSEGV, Segmentation fault.
>>>>     rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at
>>>> rte_memcpy.h
>>>>
>>>>     (gdb) bt
>>>>         #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
>>>>         #1  copy_desc_to_mbuf
>>>>         #2  rte_vhost_dequeue_burst
>>>>         #3  netdev_dpdk_vhost_rxq_recv
>>>>         ...
>>>>
>>>>     (gdb) bt full
>>>>         #0  rte_memcpy
>>>>             ...
>>>>         #1  copy_desc_to_mbuf
>>>>             desc_addr = 0
>>>>             mbuf_offset = 0
>>>>             desc_offset = 12
>>>>             ...
>>>> <------------------------------------------------------------------------>
>>>>
>>>> Fix that by checking addresses of descriptors before using them.
>>>>
>>>> Note: For mergeable buffers this patch checks only guest's address for
>>>> zero, but in non-meargeable case host's address checked. This is done
>>>> because checking of host's address in mergeable case requires additional
>>>> refactoring to keep virtqueue in consistent state in case of error.
>>>
>>> I agree with you that it should be fixed because malicious guest could launch
>>> DOS attack on vswitch with the current implementation.
>>>
>>> But I don't understand why you do not fix the mergable case in
>>> copy_mbuf_to_desc_mergable() on where gpa_to_vva() happens? And the change in
>>> fill_vec_buf(), checking !vq->desc[idx].addr, make any sense?
>>>
>>> Thanks,
>>> Jianfeng
>> Hi.
>> As I said inside commit-message, checking of host's address in mergeable case
>> requires additional refactoring to keep virtqueue in consistent state.
>>
>> There are few issues with checking inside copy_mbuf_to_desc_mergable() :
>>
>>     1. Ring elements already reserved and we must fill them with some
>>        sane data before going out of virtio_dev_merge_rx().
>>
>>     2. copy_mbuf_to_desc_mergable() can't return an error in current
>>        implementation (additional checking needed), otherwise used->idx
>>        will be decremented (I think, it's bad).
> 
> Yes, currently there is no way to return these invalid desc back to virtio because there's no invalid flag in virtio_net_hdr to indicate this desc contains no pkt. I see kernel just skips those descriptors with bad addr. I think it may rely on reset of the virtio device to improve such situation.
> 
> Another thing is that, your patch only checks the desc->addr, but we should check desc->addr + desc->len too, right?

To do it fast we need to check whole range inside gpa_to_vva(), but even
more refactoring is required for that. Also, this can be a different
patch because checking of addr + len not required to fix original issue
with virtio reconfiguration.

> 
> Thanks,
> Jianfeng
> 
>>
>>
>> Checking !vq->desc[idx].addr inside fill_vec_buf() make sense in case of virtio
>> reinitialization, because guest's address will be zero (case described in
>> commit-message). Checking of guest's address will not help in case of bad and
>> not NULL address, but useful in this common case.
>> Also, we can't catch bad address what we able to map, so, malicious guest could
>> break vhost anyway.
>>
>> I agree, that checking of host's address is better, but this may be done later
>> together with resolving above issues.
>>
>> Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-30 14:25       ` Yuanhan Liu
@ 2016-05-31  9:12         ` Ilya Maximets
  0 siblings, 0 replies; 39+ messages in thread
From: Ilya Maximets @ 2016-05-31  9:12 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

OK.

On 30.05.2016 17:25, Yuanhan Liu wrote:
> Hi Ilya,
> 
> Generically speaking, this patch looks good to me. But I guess still
> need more time to check this issue later; I still failed to reproduce
> it on my side after all. So, please allow a late merge.
> 
> Thanks.
> 
> 	--yliu
> 
> On Mon, May 30, 2016 at 02:05:07PM +0300, Ilya Maximets wrote:
>> Ping.
>>
>> Best regards, Ilya Maximets.
>>
>> On 23.05.2016 14:04, Ilya Maximets wrote:
>>> On 23.05.2016 13:57, Yuanhan Liu wrote:
>>>> On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
>>>>> In current implementation guest application can reinitialize vrings
>>>>> by executing start after stop. In the same time host application
>>>>> can still poll virtqueue while device stopped in guest and it will
>>>>> crash with segmentation fault while vring reinitialization because
>>>>> of dereferencing of bad descriptor addresses.
>>>>>
>>>>> OVS crash for example:
>>>>> <------------------------------------------------------------------------>
>>>>> [test-pmd inside guest VM]
>>>>>
>>>>> 	testpmd> port stop all
>>>>> 	    Stopping ports...
>>>>> 	    Checking link statuses...
>>>>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>>>>> 	    Done
>>>>> 	testpmd> port config all rxq 2
>>>>> 	testpmd> port config all txq 2
>>>>> 	testpmd> port start all
>>>>> 	    Configuring Port 0 (socket 0)
>>>>> 	    Port 0: 52:54:00:CB:44:C8
>>>>> 	    Checking link statuses...
>>>>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>>>>> 	    Done
>>>>
>>>> I actually didn't manage to reproduce it on my side, with the
>>>> vhost-example instead of OVS though. Is that all the commands
>>>> to reproduce it, and run them just after start test-pmd?
>>>
>>> Actually, I think, packet flow should be enabled while performing
>>> above actions and some traffic already should be sent through port
>>> to change last used idx on vhost side.
>>>
>>> Something like:
>>> 	start
>>> 	..wait a while.. see that packets are flowing.
>>> 	stop
>>> 	port stop
>>> 	port config
>>> 	port config
>>> 	port start
>>>>
>>>>> [OVS on host]
>>>>> 	Program received signal SIGSEGV, Segmentation fault.
>>>>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
>>>>>
>>>>> 	(gdb) bt
>>>>> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
>>>>> 	    #1  copy_desc_to_mbuf
>>>>> 	    #2  rte_vhost_dequeue_burst
>>>>> 	    #3  netdev_dpdk_vhost_rxq_recv
>>>>> 	    ...
>>>>>
>>>>> 	(gdb) bt full
>>>>> 	    #0  rte_memcpy
>>>>> 	        ...
>>>>> 	    #1  copy_desc_to_mbuf
>>>>> 	        desc_addr = 0
>>>>> 	        mbuf_offset = 0
>>>>> 	        desc_offset = 12
>>>>> 	        ...
>>>>> <------------------------------------------------------------------------>
>>>>>
>>>>> Fix that by checking addresses of descriptors before using them.
>>>>>
>>>>> Note: For mergeable buffers this patch checks only guest's address for
>>>>> zero, but in non-meargeable case host's address checked. This is done
>>>>> because checking of host's address in mergeable case requires additional
>>>>> refactoring to keep virtqueue in consistent state in case of error.
>>>>>
>>>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>>>> ---
>>>>>
>>>>> Actually, current virtio implementation looks broken for me. Because
>>>>> 'virtio_dev_start' breaks virtqueue while it still available from the vhost
>>>>> side.
>>>>>
>>>>> There was 2 patches about this behaviour:
>>>>>
>>>>> 	1. a85786dc816f ("virtio: fix states handling during initialization")
>>>>> 	2. 9a0615af7746 ("virtio: fix restart")
>>>>>
>>>>> The second patch fixes somehow issue intoduced in the first patch, but actually
>>>>> also breaks vhost in the way described above.
>>>>> It's not pretty clear for me what to do in current situation with virtio,
>>>>> because it will be broken for guest application even if vhost will not crash.
>>>>>
>>>>> May be it'll be better to forbid stopping of virtio device and force user to
>>>>> exit and start again (may be implemented in hidden from user way)?
>>>>>
>>>>> This patch adds additional sane checks, so it should be applied anyway, IMHO.
>>>>
>>>> Agreed.
>>>>
>>>> 	--yliu
>>>>
>>>>
> 
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-20 12:50 [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address Ilya Maximets
  2016-05-23 10:57 ` Yuanhan Liu
  2016-05-30 12:00 ` Tan, Jianfeng
@ 2016-05-31 22:06 ` Rich Lane
  2016-06-02 10:46   ` Ilya Maximets
  2016-07-01  7:35 ` Yuanhan Liu
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 39+ messages in thread
From: Rich Lane @ 2016-05-31 22:06 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: dev, Huawei Xie, Yuanhan Liu, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On Fri, May 20, 2016 at 5:50 AM, Ilya Maximets <i.maximets@samsung.com>
wrote:

> In current implementation guest application can reinitialize vrings
> by executing start after stop. In the same time host application
> can still poll virtqueue while device stopped in guest and it will
> crash with segmentation fault while vring reinitialization because
> of dereferencing of bad descriptor addresses.
>

I see a performance regression with this patch at large packet sizes (> 768
bytes). rte_vhost_enqueue_burst is consuming 10% more cycles. Strangely,
there's actually a ~1% performance improvement at small packet sizes.

The regression happens with GCC 4.8.4 and 5.3.0, but not 6.1.1.

AFAICT this is just the compiler generating bad code. One difference is
that it's storing the offset on the stack instead of in a register. A
workaround is to move the !desc_addr check outside the unlikely macros.

--- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct
> vhost_virtqueue *vq,
>         struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0},
> 0};
>
>         desc = &vq->desc[desc_idx];
> -       if (unlikely(desc->len < vq->vhost_hlen))
> +       desc_addr = gpa_to_vva(dev, desc->addr);
> +       if (unlikely(desc->len < vq->vhost_hlen || !desc_addr))
>

 Workaround: change to "if (unlikely(desc->len < vq->vhost_hlen) ||
!desc_addr)".

                return -1;


> -       desc_addr = gpa_to_vva(dev, desc->addr);
>         rte_prefetch0((void *)(uintptr_t)desc_addr);
>
>         virtio_enqueue_offload(m, &virtio_hdr.hdr);
> @@ -184,6 +184,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct
> vhost_virtqueue *vq,
>
>                         desc = &vq->desc[desc->next];
>                         desc_addr   = gpa_to_vva(dev, desc->addr);
> +                       if (unlikely(!desc_addr))
>

Workaround: change to "if (!desc_addr)".


> +                               return -1;
> +
>                         desc_offset = 0;
>                         desc_avail  = desc->len;
>                 }
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-31 22:06 ` Rich Lane
@ 2016-06-02 10:46   ` Ilya Maximets
  2016-06-02 16:22     ` Rich Lane
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-06-02 10:46 UTC (permalink / raw)
  To: Rich Lane
  Cc: dev, Huawei Xie, Yuanhan Liu, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

Hi, Rich.
Thank you for testing and analysing.

On 01.06.2016 01:06, Rich Lane wrote:
> On Fri, May 20, 2016 at 5:50 AM, Ilya Maximets <i.maximets@samsung.com <mailto:i.maximets@samsung.com>> wrote:
> 
>     In current implementation guest application can reinitialize vrings
>     by executing start after stop. In the same time host application
>     can still poll virtqueue while device stopped in guest and it will
>     crash with segmentation fault while vring reinitialization because
>     of dereferencing of bad descriptor addresses.
> 
> 
> I see a performance regression with this patch at large packet sizes (> 768 bytes). rte_vhost_enqueue_burst is consuming 10% more cycles. Strangely, there's actually a ~1% performance improvement at small packet sizes.
> 
> The regression happens with GCC 4.8.4 and 5.3.0, but not 6.1.1.
> 
> AFAICT this is just the compiler generating bad code. One difference is that it's storing the offset on the stack instead of in a register. A workaround is to move the !desc_addr check outside the unlikely macros.
> 
>     --- a/lib/librte_vhost/vhost_rxtx.c
>     +++ b/lib/librte_vhost/vhost_rxtx.c
>     @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>             struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
> 
>             desc = &vq->desc[desc_idx];
>     -       if (unlikely(desc->len < vq->vhost_hlen))
>     +       desc_addr = gpa_to_vva(dev, desc->addr);
>     +       if (unlikely(desc->len < vq->vhost_hlen || !desc_addr))
> 
> 
>  Workaround: change to "if (unlikely(desc->len < vq->vhost_hlen) || !desc_addr)".
> 
>                     return -1;
> 
> 
>     -       desc_addr = gpa_to_vva(dev, desc->addr);
>             rte_prefetch0((void *)(uintptr_t)desc_addr);
> 
>             virtio_enqueue_offload(m, &virtio_hdr.hdr);
>     @@ -184,6 +184,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
> 
>                             desc = &vq->desc[desc->next];
>                             desc_addr   = gpa_to_vva(dev, desc->addr);
>     +                       if (unlikely(!desc_addr))
> 
> 
> Workaround: change to "if (!desc_addr)".
>  
> 
>     +                               return -1;
>     +
>                             desc_offset = 0;
>                             desc_avail  = desc->len;
>                     }
> 

What about other places? Is there same issues or it's only inside copy_mbuf_to_desc() ?

Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-06-02 10:46   ` Ilya Maximets
@ 2016-06-02 16:22     ` Rich Lane
  2016-06-03  6:01       ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Rich Lane @ 2016-06-02 16:22 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: dev, Huawei Xie, Yuanhan Liu, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On Thu, Jun 2, 2016 at 3:46 AM, Ilya Maximets <i.maximets@samsung.com>
wrote:

> Hi, Rich.
> Thank you for testing and analysing.
>
> On 01.06.2016 01:06, Rich Lane wrote:
> > On Fri, May 20, 2016 at 5:50 AM, Ilya Maximets <i.maximets@samsung.com
> <mailto:i.maximets@samsung.com>> wrote:
> >
> >     In current implementation guest application can reinitialize vrings
> >     by executing start after stop. In the same time host application
> >     can still poll virtqueue while device stopped in guest and it will
> >     crash with segmentation fault while vring reinitialization because
> >     of dereferencing of bad descriptor addresses.
> >
> >
> > I see a performance regression with this patch at large packet sizes (>
> 768 bytes). rte_vhost_enqueue_burst is consuming 10% more cycles.
> Strangely, there's actually a ~1% performance improvement at small packet
> sizes.
> >
> > The regression happens with GCC 4.8.4 and 5.3.0, but not 6.1.1.
> >
> > AFAICT this is just the compiler generating bad code. One difference is
> that it's storing the offset on the stack instead of in a register. A
> workaround is to move the !desc_addr check outside the unlikely macros.
> >
> >     --- a/lib/librte_vhost/vhost_rxtx.c
> >     +++ b/lib/librte_vhost/vhost_rxtx.c
> >     @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev,
> struct vhost_virtqueue *vq,
> >             struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0,
> 0, 0}, 0};
> >
> >             desc = &vq->desc[desc_idx];
> >     -       if (unlikely(desc->len < vq->vhost_hlen))
> >     +       desc_addr = gpa_to_vva(dev, desc->addr);
> >     +       if (unlikely(desc->len < vq->vhost_hlen || !desc_addr))
> >
> >
> >  Workaround: change to "if (unlikely(desc->len < vq->vhost_hlen) ||
> !desc_addr)".
> >
> >                     return -1;
> >
> >
> >     -       desc_addr = gpa_to_vva(dev, desc->addr);
> >             rte_prefetch0((void *)(uintptr_t)desc_addr);
> >
> >             virtio_enqueue_offload(m, &virtio_hdr.hdr);
> >     @@ -184,6 +184,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct
> vhost_virtqueue *vq,
> >
> >                             desc = &vq->desc[desc->next];
> >                             desc_addr   = gpa_to_vva(dev, desc->addr);
> >     +                       if (unlikely(!desc_addr))
> >
> >
> > Workaround: change to "if (!desc_addr)".
> >
> >
> >     +                               return -1;
> >     +
> >                             desc_offset = 0;
> >                             desc_avail  = desc->len;
> >                     }
> >
>
> What about other places? Is there same issues or it's only inside
> copy_mbuf_to_desc() ?
>

Only copy_mbuf_to_desc has the issue.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-06-02 16:22     ` Rich Lane
@ 2016-06-03  6:01       ` Ilya Maximets
  0 siblings, 0 replies; 39+ messages in thread
From: Ilya Maximets @ 2016-06-03  6:01 UTC (permalink / raw)
  To: Rich Lane
  Cc: dev, Huawei Xie, Yuanhan Liu, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On 02.06.2016 19:22, Rich Lane wrote:
> On Thu, Jun 2, 2016 at 3:46 AM, Ilya Maximets <i.maximets@samsung.com <mailto:i.maximets@samsung.com>> wrote:
> 
>     Hi, Rich.
>     Thank you for testing and analysing.
> 
>     On 01.06.2016 01:06, Rich Lane wrote:
>     > On Fri, May 20, 2016 at 5:50 AM, Ilya Maximets <i.maximets@samsung.com <mailto:i.maximets@samsung.com> <mailto:i.maximets@samsung.com <mailto:i.maximets@samsung.com>>> wrote:
>     >
>     >     In current implementation guest application can reinitialize vrings
>     >     by executing start after stop. In the same time host application
>     >     can still poll virtqueue while device stopped in guest and it will
>     >     crash with segmentation fault while vring reinitialization because
>     >     of dereferencing of bad descriptor addresses.
>     >
>     >
>     > I see a performance regression with this patch at large packet sizes (> 768 bytes). rte_vhost_enqueue_burst is consuming 10% more cycles. Strangely, there's actually a ~1% performance improvement at small packet sizes.
>     >
>     > The regression happens with GCC 4.8.4 and 5.3.0, but not 6.1.1.
>     >
>     > AFAICT this is just the compiler generating bad code. One difference is that it's storing the offset on the stack instead of in a register. A workaround is to move the !desc_addr check outside the unlikely macros.
>     >
>     >     --- a/lib/librte_vhost/vhost_rxtx.c
>     >     +++ b/lib/librte_vhost/vhost_rxtx.c
>     >     @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>     >             struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
>     >
>     >             desc = &vq->desc[desc_idx];
>     >     -       if (unlikely(desc->len < vq->vhost_hlen))
>     >     +       desc_addr = gpa_to_vva(dev, desc->addr);
>     >     +       if (unlikely(desc->len < vq->vhost_hlen || !desc_addr))
>     >
>     >
>     >  Workaround: change to "if (unlikely(desc->len < vq->vhost_hlen) || !desc_addr)".
>     >
>     >                     return -1;
>     >
>     >
>     >     -       desc_addr = gpa_to_vva(dev, desc->addr);
>     >             rte_prefetch0((void *)(uintptr_t)desc_addr);
>     >
>     >             virtio_enqueue_offload(m, &virtio_hdr.hdr);
>     >     @@ -184,6 +184,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>     >
>     >                             desc = &vq->desc[desc->next];
>     >                             desc_addr   = gpa_to_vva(dev, desc->addr);
>     >     +                       if (unlikely(!desc_addr))
>     >
>     >
>     > Workaround: change to "if (!desc_addr)".
>     >
>     >
>     >     +                               return -1;
>     >     +
>     >                             desc_offset = 0;
>     >                             desc_avail  = desc->len;
>     >                     }
>     >
> 
>     What about other places? Is there same issues or it's only inside copy_mbuf_to_desc() ?
> 
> 
> Only copy_mbuf_to_desc has the issue.

Ok.
Actually, I can't reproduce this performance issue using gcc 4.8.5 from RHEL 7.2.
I'm not sure if I should post v2 with above fixes. May be them could be applied
while pushing patch to repository? 

Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-05-20 12:50 [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address Ilya Maximets
                   ` (2 preceding siblings ...)
  2016-05-31 22:06 ` Rich Lane
@ 2016-07-01  7:35 ` Yuanhan Liu
  2016-07-06 11:19   ` Ilya Maximets
  2016-07-14  8:18 ` [dpdk-dev] [PATCH v2] " Ilya Maximets
  2016-07-15 11:15 ` [dpdk-dev] [PATCH v3 0/2] " Ilya Maximets
  5 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-01  7:35 UTC (permalink / raw)
  To: Ilya Maximets; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

Hi,

Sorry for the long delay.

On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
> In current implementation guest application can reinitialize vrings
> by executing start after stop. In the same time host application
> can still poll virtqueue while device stopped in guest and it will
> crash with segmentation fault while vring reinitialization because
> of dereferencing of bad descriptor addresses.

Yes, you are right that vring will be reinitialized after restart.
But even though, I don't see the reason it will cause a vhost crash,
since the reinitialization will reset all the vring memeory by 0:

    memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);

That means those bad descriptors will be skipped, safely, at vhost
side by:

	if (unlikely(desc->len < dev->vhost_hlen))
		return -1;

> 
> OVS crash for example:
> <------------------------------------------------------------------------>
> [test-pmd inside guest VM]
> 
> 	testpmd> port stop all
> 	    Stopping ports...
> 	    Checking link statuses...
> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> 	    Done
> 	testpmd> port config all rxq 2
> 	testpmd> port config all txq 2
> 	testpmd> port start all
> 	    Configuring Port 0 (socket 0)
> 	    Port 0: 52:54:00:CB:44:C8
> 	    Checking link statuses...
> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> 	    Done
> 
> [OVS on host]
> 	Program received signal SIGSEGV, Segmentation fault.
> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h

Interesting, so it bypasses the above check since desc->len is non-zero
while desc->addr is zero. The size (2056) also looks weird.

Do you mind to check this issue a bit deeper, say why desc->addr is
zero, however, desc->len is not?

> 	(gdb) bt
> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
> 	    #1  copy_desc_to_mbuf
> 	    #2  rte_vhost_dequeue_burst
> 	    #3  netdev_dpdk_vhost_rxq_recv
> 	    ...
> 
> 	(gdb) bt full
> 	    #0  rte_memcpy
> 	        ...
> 	    #1  copy_desc_to_mbuf
> 	        desc_addr = 0
> 	        mbuf_offset = 0
> 	        desc_offset = 12
> 	        ...
> <------------------------------------------------------------------------>
> 
> Fix that by checking addresses of descriptors before using them.
> 
> Note: For mergeable buffers this patch checks only guest's address for
> zero, but in non-meargeable case host's address checked. This is done
> because checking of host's address in mergeable case requires additional
> refactoring to keep virtqueue in consistent state in case of error.
> 
> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> ---
> 
> Actually, current virtio implementation looks broken for me. Because
> 'virtio_dev_start' breaks virtqueue while it still available from the vhost
> side.

Yes, this sounds buggy. Maybe we could not reset the avail idx, in such
case vhost dequeue/enqueue will just return as there are no more packets
to dequeue and no more space to enqueue, respectively?

	--yliu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-01  7:35 ` Yuanhan Liu
@ 2016-07-06 11:19   ` Ilya Maximets
  2016-07-06 12:24     ` Yuanhan Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-07-06 11:19 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On 01.07.2016 10:35, Yuanhan Liu wrote:
> Hi,
> 
> Sorry for the long delay.
> 
> On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
>> In current implementation guest application can reinitialize vrings
>> by executing start after stop. In the same time host application
>> can still poll virtqueue while device stopped in guest and it will
>> crash with segmentation fault while vring reinitialization because
>> of dereferencing of bad descriptor addresses.
> 
> Yes, you are right that vring will be reinitialized after restart.
> But even though, I don't see the reason it will cause a vhost crash,
> since the reinitialization will reset all the vring memeory by 0:
> 
>     memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
> 
> That means those bad descriptors will be skipped, safely, at vhost
> side by:
> 
> 	if (unlikely(desc->len < dev->vhost_hlen))
> 		return -1;
> 
>>
>> OVS crash for example:
>> <------------------------------------------------------------------------>
>> [test-pmd inside guest VM]
>>
>> 	testpmd> port stop all
>> 	    Stopping ports...
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
>> 	testpmd> port config all rxq 2
>> 	testpmd> port config all txq 2
>> 	testpmd> port start all
>> 	    Configuring Port 0 (socket 0)
>> 	    Port 0: 52:54:00:CB:44:C8
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
>>
>> [OVS on host]
>> 	Program received signal SIGSEGV, Segmentation fault.
>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
> 
> Interesting, so it bypasses the above check since desc->len is non-zero
> while desc->addr is zero. The size (2056) also looks weird.
> 
> Do you mind to check this issue a bit deeper, say why desc->addr is
> zero, however, desc->len is not?

OK. I checked this few more times. Actually, I see, that desc->addr is
not zero. All desc memory looks like some rubbish:

<------------------------------------------------------------------------------>
(gdb)
#3 copy_desc_to_mbuf (mbuf_pool=0x7fe9da9f4480, desc_idx=65363,
                      m=0x7fe9db269400, vq=0x7fe9fff7bac0, dev=0x7fe9fff7cbc0)
        desc = 0x2aabc00ff530
        desc_addr = 0
        mbuf_offset = 0
        prev = 0x7fe9db269400
        nr_desc = 1
        desc_offset = 12
        cur = 0x7fe9db269400
        hdr = 0x0
        desc_avail = 1012591375
        mbuf_avail = 1526
        cpy_len = 1526

(gdb) p *desc
$2 = {addr = 8507655620301055744, len = 1012591387, flags = 3845, next = 48516}

<------------------------------------------------------------------------------>

And 'desc_addr' equals zero because 'gpa_to_vva' just can't map this huge
address to host's.

Scenario was the same. SIGSEGV received right after 'port start all'.

Another thought:

	Actually, there is a race window between 'memset' in guest and reading
	of 'desc->len' and 'desc->addr' on host. So, it's possible to read non
	zero 'len' and zero 'addr' right after that. But you're right, this
	case should be very rare.

> 
>> 	(gdb) bt
>> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
>> 	    #1  copy_desc_to_mbuf
>> 	    #2  rte_vhost_dequeue_burst
>> 	    #3  netdev_dpdk_vhost_rxq_recv
>> 	    ...
>>
>> 	(gdb) bt full
>> 	    #0  rte_memcpy
>> 	        ...
>> 	    #1  copy_desc_to_mbuf
>> 	        desc_addr = 0
>> 	        mbuf_offset = 0
>> 	        desc_offset = 12
>> 	        ...
>> <------------------------------------------------------------------------>
>>
>> Fix that by checking addresses of descriptors before using them.
>>
>> Note: For mergeable buffers this patch checks only guest's address for
>> zero, but in non-meargeable case host's address checked. This is done
>> because checking of host's address in mergeable case requires additional
>> refactoring to keep virtqueue in consistent state in case of error.
>>
>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>> ---
>>
>> Actually, current virtio implementation looks broken for me. Because
>> 'virtio_dev_start' breaks virtqueue while it still available from the vhost
>> side.
> 
> Yes, this sounds buggy. Maybe we could not reset the avail idx, in such
> case vhost dequeue/enqueue will just return as there are no more packets
> to dequeue and no more space to enqueue, respectively?

Maybe this will be a good fix for virtio because vhost will not try to receive
from wrong descriptors. But this will not help if vhost already tries to receive
something in time of guest's reconfiguration.

Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-06 11:19   ` Ilya Maximets
@ 2016-07-06 12:24     ` Yuanhan Liu
  2016-07-08 11:48       ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-06 12:24 UTC (permalink / raw)
  To: Ilya Maximets; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On Wed, Jul 06, 2016 at 02:19:12PM +0300, Ilya Maximets wrote:
> On 01.07.2016 10:35, Yuanhan Liu wrote:
> > Hi,
> > 
> > Sorry for the long delay.
> > 
> > On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
> >> In current implementation guest application can reinitialize vrings
> >> by executing start after stop. In the same time host application
> >> can still poll virtqueue while device stopped in guest and it will
> >> crash with segmentation fault while vring reinitialization because
> >> of dereferencing of bad descriptor addresses.
> > 
> > Yes, you are right that vring will be reinitialized after restart.
> > But even though, I don't see the reason it will cause a vhost crash,
> > since the reinitialization will reset all the vring memeory by 0:
> > 
> >     memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
> > 
> > That means those bad descriptors will be skipped, safely, at vhost
> > side by:
> > 
> > 	if (unlikely(desc->len < dev->vhost_hlen))
> > 		return -1;
> > 
> >>
> >> OVS crash for example:
> >> <------------------------------------------------------------------------>
> >> [test-pmd inside guest VM]
> >>
> >> 	testpmd> port stop all
> >> 	    Stopping ports...
> >> 	    Checking link statuses...
> >> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> >> 	    Done
> >> 	testpmd> port config all rxq 2
> >> 	testpmd> port config all txq 2
> >> 	testpmd> port start all
> >> 	    Configuring Port 0 (socket 0)
> >> 	    Port 0: 52:54:00:CB:44:C8
> >> 	    Checking link statuses...
> >> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
> >> 	    Done
> >>
> >> [OVS on host]
> >> 	Program received signal SIGSEGV, Segmentation fault.
> >> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
> > 
> > Interesting, so it bypasses the above check since desc->len is non-zero
> > while desc->addr is zero. The size (2056) also looks weird.
> > 
> > Do you mind to check this issue a bit deeper, say why desc->addr is
> > zero, however, desc->len is not?
> 
> OK. I checked this few more times.

Thanks!

> Actually, I see, that desc->addr is
> not zero. All desc memory looks like some rubbish:
> 
> <------------------------------------------------------------------------------>
> (gdb)
> #3 copy_desc_to_mbuf (mbuf_pool=0x7fe9da9f4480, desc_idx=65363,
>                       m=0x7fe9db269400, vq=0x7fe9fff7bac0, dev=0x7fe9fff7cbc0)
>         desc = 0x2aabc00ff530
>         desc_addr = 0
>         mbuf_offset = 0
>         prev = 0x7fe9db269400
>         nr_desc = 1
>         desc_offset = 12
>         cur = 0x7fe9db269400
>         hdr = 0x0
>         desc_avail = 1012591375
>         mbuf_avail = 1526
>         cpy_len = 1526
> 
> (gdb) p *desc
> $2 = {addr = 8507655620301055744, len = 1012591387, flags = 3845, next = 48516}
> 
> <------------------------------------------------------------------------------>
> 
> And 'desc_addr' equals zero because 'gpa_to_vva' just can't map this huge
> address to host's.
> 
> Scenario was the same. SIGSEGV received right after 'port start all'.
> 
> Another thought:
> 
> 	Actually, there is a race window between 'memset' in guest and reading
> 	of 'desc->len' and 'desc->addr' on host. So, it's possible to read non
> 	zero 'len' and zero 'addr' right after that.

That's also what I was thinking, that it should the only reason caused
such issue.

> But you're right, this case should be very rare.

Yes, it should be very rare. What troubles me is that seems you can
reproduce this issue very easily, that I doubt it's caused by this
rare race. The reason could be something else?

	--yliu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-06 12:24     ` Yuanhan Liu
@ 2016-07-08 11:48       ` Ilya Maximets
  2016-07-10 13:17         ` Yuanhan Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-07-08 11:48 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On 06.07.2016 15:24, Yuanhan Liu wrote:
> On Wed, Jul 06, 2016 at 02:19:12PM +0300, Ilya Maximets wrote:
>> On 01.07.2016 10:35, Yuanhan Liu wrote:
>>> Hi,
>>>
>>> Sorry for the long delay.
>>>
>>> On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
>>>> In current implementation guest application can reinitialize vrings
>>>> by executing start after stop. In the same time host application
>>>> can still poll virtqueue while device stopped in guest and it will
>>>> crash with segmentation fault while vring reinitialization because
>>>> of dereferencing of bad descriptor addresses.
>>>
>>> Yes, you are right that vring will be reinitialized after restart.
>>> But even though, I don't see the reason it will cause a vhost crash,
>>> since the reinitialization will reset all the vring memeory by 0:
>>>
>>>     memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
>>>
>>> That means those bad descriptors will be skipped, safely, at vhost
>>> side by:
>>>
>>> 	if (unlikely(desc->len < dev->vhost_hlen))
>>> 		return -1;
>>>
>>>>
>>>> OVS crash for example:
>>>> <------------------------------------------------------------------------>
>>>> [test-pmd inside guest VM]
>>>>
>>>> 	testpmd> port stop all
>>>> 	    Stopping ports...
>>>> 	    Checking link statuses...
>>>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>>>> 	    Done
>>>> 	testpmd> port config all rxq 2
>>>> 	testpmd> port config all txq 2
>>>> 	testpmd> port start all
>>>> 	    Configuring Port 0 (socket 0)
>>>> 	    Port 0: 52:54:00:CB:44:C8
>>>> 	    Checking link statuses...
>>>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>>>> 	    Done
>>>>
>>>> [OVS on host]
>>>> 	Program received signal SIGSEGV, Segmentation fault.
>>>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
>>>
>>> Interesting, so it bypasses the above check since desc->len is non-zero
>>> while desc->addr is zero. The size (2056) also looks weird.
>>>
>>> Do you mind to check this issue a bit deeper, say why desc->addr is
>>> zero, however, desc->len is not?
>>
>> OK. I checked this few more times.
> 
> Thanks!
> 
>> Actually, I see, that desc->addr is
>> not zero. All desc memory looks like some rubbish:
>>
>> <------------------------------------------------------------------------------>
>> (gdb)
>> #3 copy_desc_to_mbuf (mbuf_pool=0x7fe9da9f4480, desc_idx=65363,
>>                       m=0x7fe9db269400, vq=0x7fe9fff7bac0, dev=0x7fe9fff7cbc0)
>>         desc = 0x2aabc00ff530
>>         desc_addr = 0
>>         mbuf_offset = 0
>>         prev = 0x7fe9db269400
>>         nr_desc = 1
>>         desc_offset = 12
>>         cur = 0x7fe9db269400
>>         hdr = 0x0
>>         desc_avail = 1012591375
>>         mbuf_avail = 1526
>>         cpy_len = 1526
>>
>> (gdb) p *desc
>> $2 = {addr = 8507655620301055744, len = 1012591387, flags = 3845, next = 48516}
>>
>> <------------------------------------------------------------------------------>
>>
>> And 'desc_addr' equals zero because 'gpa_to_vva' just can't map this huge
>> address to host's.
>>
>> Scenario was the same. SIGSEGV received right after 'port start all'.
>>
>> Another thought:
>>
>> 	Actually, there is a race window between 'memset' in guest and reading
>> 	of 'desc->len' and 'desc->addr' on host. So, it's possible to read non
>> 	zero 'len' and zero 'addr' right after that.
> 
> That's also what I was thinking, that it should the only reason caused
> such issue.
> 
>> But you're right, this case should be very rare.
> 
> Yes, it should be very rare. What troubles me is that seems you can
> reproduce this issue very easily, that I doubt it's caused by this
> rare race. The reason could be something else?

I don't know what exactly happens, but it constantly happens on 'port start all'
command execution. Descriptors becomes broken for some time and after that time
descriptors becomes normal. I think so, because with my patch applied network
is working. It means, that broken descriptors appears for some little time
while virtio restarting with new configuration.

Another point is that crash constantly happens on queue_id=3 (second RX queue) in
my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
rxq=2.

Obviously, virtio needs to be fixed, but we need to check address anyway on
vhost side, because we don't know what happens in guest in common case.

Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-08 11:48       ` Ilya Maximets
@ 2016-07-10 13:17         ` Yuanhan Liu
  2016-07-11  8:38           ` Yuanhan Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-10 13:17 UTC (permalink / raw)
  To: Ilya Maximets; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
> 
> Another point is that crash constantly happens on queue_id=3 (second RX queue) in
> my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
> rxq=2.

That's a valuable message: what's your DPDK HEAD commit while triggering
this issue?

	--yliu
> 
> Obviously, virtio needs to be fixed, but we need to check address anyway on
> vhost side, because we don't know what happens in guest in common case.
> 
> Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-10 13:17         ` Yuanhan Liu
@ 2016-07-11  8:38           ` Yuanhan Liu
  2016-07-11  9:50             ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-11  8:38 UTC (permalink / raw)
  To: Ilya Maximets; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On Sun, Jul 10, 2016 at 09:17:31PM +0800, Yuanhan Liu wrote:
> On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
> > 
> > Another point is that crash constantly happens on queue_id=3 (second RX queue) in
> > my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
> > rxq=2.
> 
> That's a valuable message: what's your DPDK HEAD commit while triggering
> this issue?

I guess I have understood what goes wrong in you case.

I would guess that your vhost has 2 queues (here I mean queue-pairs,
including one Tx and Rx queue; below usage is the same) configured,
so does to your QEMU. However, you just enabled 1 queue while starting
testpmd inside the guest, and you want to enable 2 queues by running
following testpmd commands:

    stop
    port stop all
    port config all rxq 2
    port config all txq 2
    port start all

Badly, that won't work for current virtio PMD implementation, and what's
worse, it triggers a vhost crash, the one you saw.

Here is how it comes. Since you just enabled 1 queue while starting
testpmd, it will setup 1 queue only, meaning only one queue's **valid**
information will be sent to vhost. You might see SET_VRING_ADDR
(and related vhost messages) for the other queue as well, but they
are just the dummy messages: they don't include any valid/real
information about the 2nd queue: the driver don't setup it after all.

So far, so good. It became broken when you run above commands. Those
commands do setup for the 2nd queue, however, they failed to trigger
the QEMU virtio device to start the vhost-user negotiation, meaning
no SET_VRING_ADDR will be sent for the 2nd queue, leaving vhost
untold and not updated.

What's worse, above commands trigger the QEMU to send SET_VRING_ENABLE
messages, to enable all the vrings. And since the vrings for the 2nd
queue are not properly configured, the crash happens.

So maybe we should do virtio reset on port start?

	--yliu 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-11  8:38           ` Yuanhan Liu
@ 2016-07-11  9:50             ` Ilya Maximets
  2016-07-11 11:05               ` Yuanhan Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-07-11  9:50 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan

On 11.07.2016 11:38, Yuanhan Liu wrote:
> On Sun, Jul 10, 2016 at 09:17:31PM +0800, Yuanhan Liu wrote:
>> On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
>>>
>>> Another point is that crash constantly happens on queue_id=3 (second RX queue) in
>>> my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
>>> rxq=2.
>>
>> That's a valuable message: what's your DPDK HEAD commit while triggering
>> this issue?

fbfd99551ca3 ("mbuf: add raw allocation function")

> 
> I guess I have understood what goes wrong in you case.
> 
> I would guess that your vhost has 2 queues (here I mean queue-pairs,
> including one Tx and Rx queue; below usage is the same) configured,
> so does to your QEMU. However, you just enabled 1 queue while starting
> testpmd inside the guest, and you want to enable 2 queues by running
> following testpmd commands:
> 
>     stop
>     port stop all
>     port config all rxq 2
>     port config all txq 2
>     port start all
> 
> Badly, that won't work for current virtio PMD implementation, and what's
> worse, it triggers a vhost crash, the one you saw.
> 
> Here is how it comes. Since you just enabled 1 queue while starting
> testpmd, it will setup 1 queue only, meaning only one queue's **valid**
> information will be sent to vhost. You might see SET_VRING_ADDR
> (and related vhost messages) for the other queue as well, but they
> are just the dummy messages: they don't include any valid/real
> information about the 2nd queue: the driver don't setup it after all.
> 
> So far, so good. It became broken when you run above commands. Those
> commands do setup for the 2nd queue, however, they failed to trigger
> the QEMU virtio device to start the vhost-user negotiation, meaning
> no SET_VRING_ADDR will be sent for the 2nd queue, leaving vhost
> untold and not updated.
> 
> What's worse, above commands trigger the QEMU to send SET_VRING_ENABLE
> messages, to enable all the vrings. And since the vrings for the 2nd
> queue are not properly configured, the crash happens.

Hmm, why 2nd queue works properly with my fix to vhost in this case?
 
> So maybe we should do virtio reset on port start?

I guess it was removed by this patch:
a85786dc816f ("virtio: fix states handling during initialization").

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-11  9:50             ` Ilya Maximets
@ 2016-07-11 11:05               ` Yuanhan Liu
  2016-07-11 11:47                 ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-11 11:05 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan,
	Stephen Hemminger

On Mon, Jul 11, 2016 at 12:50:24PM +0300, Ilya Maximets wrote:
> On 11.07.2016 11:38, Yuanhan Liu wrote:
> > On Sun, Jul 10, 2016 at 09:17:31PM +0800, Yuanhan Liu wrote:
> >> On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
> >>>
> >>> Another point is that crash constantly happens on queue_id=3 (second RX queue) in
> >>> my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
> >>> rxq=2.
> >>
> >> That's a valuable message: what's your DPDK HEAD commit while triggering
> >> this issue?
> 
> fbfd99551ca3 ("mbuf: add raw allocation function")
> 
> > 
> > I guess I have understood what goes wrong in you case.
> > 
> > I would guess that your vhost has 2 queues (here I mean queue-pairs,
> > including one Tx and Rx queue; below usage is the same) configured,
> > so does to your QEMU. However, you just enabled 1 queue while starting
> > testpmd inside the guest, and you want to enable 2 queues by running
> > following testpmd commands:
> > 
> >     stop
> >     port stop all
> >     port config all rxq 2
> >     port config all txq 2
> >     port start all
> > 
> > Badly, that won't work for current virtio PMD implementation, and what's
> > worse, it triggers a vhost crash, the one you saw.
> > 
> > Here is how it comes. Since you just enabled 1 queue while starting
> > testpmd, it will setup 1 queue only, meaning only one queue's **valid**
> > information will be sent to vhost. You might see SET_VRING_ADDR
> > (and related vhost messages) for the other queue as well, but they
> > are just the dummy messages: they don't include any valid/real
> > information about the 2nd queue: the driver don't setup it after all.
> > 
> > So far, so good. It became broken when you run above commands. Those
> > commands do setup for the 2nd queue, however, they failed to trigger
> > the QEMU virtio device to start the vhost-user negotiation, meaning
> > no SET_VRING_ADDR will be sent for the 2nd queue, leaving vhost
> > untold and not updated.
> > 
> > What's worse, above commands trigger the QEMU to send SET_VRING_ENABLE
> > messages, to enable all the vrings. And since the vrings for the 2nd
> > queue are not properly configured, the crash happens.
> 
> Hmm, why 2nd queue works properly with my fix to vhost in this case?

Hmm, really? You are sure that data flows in your 2nd queue after those
commands? From what I know is that your patch just avoid a crash, but
does not fix it.

> > So maybe we should do virtio reset on port start?
> 
> I guess it was removed by this patch:
> a85786dc816f ("virtio: fix states handling during initialization").

Seems yes.

	--yliu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-11 11:05               ` Yuanhan Liu
@ 2016-07-11 11:47                 ` Ilya Maximets
  2016-07-12  2:43                   ` Yuanhan Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-07-11 11:47 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan,
	Stephen Hemminger

On 11.07.2016 14:05, Yuanhan Liu wrote:
> On Mon, Jul 11, 2016 at 12:50:24PM +0300, Ilya Maximets wrote:
>> On 11.07.2016 11:38, Yuanhan Liu wrote:
>>> On Sun, Jul 10, 2016 at 09:17:31PM +0800, Yuanhan Liu wrote:
>>>> On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
>>>>>
>>>>> Another point is that crash constantly happens on queue_id=3 (second RX queue) in
>>>>> my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
>>>>> rxq=2.
>>>>
>>>> That's a valuable message: what's your DPDK HEAD commit while triggering
>>>> this issue?
>>
>> fbfd99551ca3 ("mbuf: add raw allocation function")
>>
>>>
>>> I guess I have understood what goes wrong in you case.
>>>
>>> I would guess that your vhost has 2 queues (here I mean queue-pairs,
>>> including one Tx and Rx queue; below usage is the same) configured,
>>> so does to your QEMU. However, you just enabled 1 queue while starting
>>> testpmd inside the guest, and you want to enable 2 queues by running
>>> following testpmd commands:
>>>
>>>     stop
>>>     port stop all
>>>     port config all rxq 2
>>>     port config all txq 2
>>>     port start all
>>>
>>> Badly, that won't work for current virtio PMD implementation, and what's
>>> worse, it triggers a vhost crash, the one you saw.
>>>
>>> Here is how it comes. Since you just enabled 1 queue while starting
>>> testpmd, it will setup 1 queue only, meaning only one queue's **valid**
>>> information will be sent to vhost. You might see SET_VRING_ADDR
>>> (and related vhost messages) for the other queue as well, but they
>>> are just the dummy messages: they don't include any valid/real
>>> information about the 2nd queue: the driver don't setup it after all.
>>>
>>> So far, so good. It became broken when you run above commands. Those
>>> commands do setup for the 2nd queue, however, they failed to trigger
>>> the QEMU virtio device to start the vhost-user negotiation, meaning
>>> no SET_VRING_ADDR will be sent for the 2nd queue, leaving vhost
>>> untold and not updated.
>>>
>>> What's worse, above commands trigger the QEMU to send SET_VRING_ENABLE
>>> messages, to enable all the vrings. And since the vrings for the 2nd
>>> queue are not properly configured, the crash happens.
>>
>> Hmm, why 2nd queue works properly with my fix to vhost in this case?
> 
> Hmm, really? You are sure that data flows in your 2nd queue after those
> commands? From what I know is that your patch just avoid a crash, but
> does not fix it.

Oh, sorry. Yes, it doesn't work. With my patch applied I have a QEMU hang.

>>> So maybe we should do virtio reset on port start?
>>
>> I guess it was removed by this patch:
>> a85786dc816f ("virtio: fix states handling during initialization").
> 
> Seems yes.
> 
> 	--yliu
> 
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-11 11:47                 ` Ilya Maximets
@ 2016-07-12  2:43                   ` Yuanhan Liu
  2016-07-12  5:53                     ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-12  2:43 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan,
	Stephen Hemminger, Thomas Monjalon

On Mon, Jul 11, 2016 at 02:47:56PM +0300, Ilya Maximets wrote:
> On 11.07.2016 14:05, Yuanhan Liu wrote:
> > On Mon, Jul 11, 2016 at 12:50:24PM +0300, Ilya Maximets wrote:
> >> On 11.07.2016 11:38, Yuanhan Liu wrote:
> >>> On Sun, Jul 10, 2016 at 09:17:31PM +0800, Yuanhan Liu wrote:
> >>>> On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
> >>>>>
> >>>>> Another point is that crash constantly happens on queue_id=3 (second RX queue) in
> >>>>> my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
> >>>>> rxq=2.
> >>>>
> >>>> That's a valuable message: what's your DPDK HEAD commit while triggering
> >>>> this issue?
> >>
> >> fbfd99551ca3 ("mbuf: add raw allocation function")
> >>
> >>>
> >>> I guess I have understood what goes wrong in you case.
> >>>
> >>> I would guess that your vhost has 2 queues (here I mean queue-pairs,
> >>> including one Tx and Rx queue; below usage is the same) configured,
> >>> so does to your QEMU. However, you just enabled 1 queue while starting
> >>> testpmd inside the guest, and you want to enable 2 queues by running
> >>> following testpmd commands:
> >>>
> >>>     stop
> >>>     port stop all
> >>>     port config all rxq 2
> >>>     port config all txq 2
> >>>     port start all
> >>>
> >>> Badly, that won't work for current virtio PMD implementation, and what's
> >>> worse, it triggers a vhost crash, the one you saw.
> >>>
> >>> Here is how it comes. Since you just enabled 1 queue while starting
> >>> testpmd, it will setup 1 queue only, meaning only one queue's **valid**
> >>> information will be sent to vhost. You might see SET_VRING_ADDR
> >>> (and related vhost messages) for the other queue as well, but they
> >>> are just the dummy messages: they don't include any valid/real
> >>> information about the 2nd queue: the driver don't setup it after all.
> >>>
> >>> So far, so good. It became broken when you run above commands. Those
> >>> commands do setup for the 2nd queue, however, they failed to trigger
> >>> the QEMU virtio device to start the vhost-user negotiation, meaning
> >>> no SET_VRING_ADDR will be sent for the 2nd queue, leaving vhost
> >>> untold and not updated.
> >>>
> >>> What's worse, above commands trigger the QEMU to send SET_VRING_ENABLE
> >>> messages, to enable all the vrings. And since the vrings for the 2nd
> >>> queue are not properly configured, the crash happens.
> >>
> >> Hmm, why 2nd queue works properly with my fix to vhost in this case?
> > 
> > Hmm, really? You are sure that data flows in your 2nd queue after those
> > commands? From what I know is that your patch just avoid a crash, but
> > does not fix it.
> 
> Oh, sorry. Yes, it doesn't work. With my patch applied I have a QEMU hang.

The crash actually could be avoided by commit 0823c1cb0a73 ("vhost:
workaround stale vring base"), accidentally. That's why I asked you
above the HEAD commit you were using.

> >>> So maybe we should do virtio reset on port start?
> >>
> >> I guess it was removed by this patch:
> >> a85786dc816f ("virtio: fix states handling during initialization").
> > 
> > Seems yes.

Actually, we should not do that: do reset on port start. The right fix
should be allocating MAX queues virtio device supports (2 here). This
would allow us changing the queue number dynamically.

But this doesn't sound a simple fix; it involves many code changes, due
to it was not designed this way before. Therefore, we will not fix it
in this release, due to it's too late. Let's fix it in the next release
instead. For the crash issue, it will not happen with the latest HEAD.
Though it's accident fix, I think we are fine here.

	--yliu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-12  2:43                   ` Yuanhan Liu
@ 2016-07-12  5:53                     ` Ilya Maximets
  2016-07-13  7:34                       ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-07-12  5:53 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan,
	Stephen Hemminger, Thomas Monjalon

On 12.07.2016 05:43, Yuanhan Liu wrote:
> On Mon, Jul 11, 2016 at 02:47:56PM +0300, Ilya Maximets wrote:
>> On 11.07.2016 14:05, Yuanhan Liu wrote:
>>> On Mon, Jul 11, 2016 at 12:50:24PM +0300, Ilya Maximets wrote:
>>>> On 11.07.2016 11:38, Yuanhan Liu wrote:
>>>>> On Sun, Jul 10, 2016 at 09:17:31PM +0800, Yuanhan Liu wrote:
>>>>>> On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
>>>>>>>
>>>>>>> Another point is that crash constantly happens on queue_id=3 (second RX queue) in
>>>>>>> my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
>>>>>>> rxq=2.
>>>>>>
>>>>>> That's a valuable message: what's your DPDK HEAD commit while triggering
>>>>>> this issue?
>>>>
>>>> fbfd99551ca3 ("mbuf: add raw allocation function")
>>>>
>>>>>
>>>>> I guess I have understood what goes wrong in you case.
>>>>>
>>>>> I would guess that your vhost has 2 queues (here I mean queue-pairs,
>>>>> including one Tx and Rx queue; below usage is the same) configured,
>>>>> so does to your QEMU. However, you just enabled 1 queue while starting
>>>>> testpmd inside the guest, and you want to enable 2 queues by running
>>>>> following testpmd commands:
>>>>>
>>>>>     stop
>>>>>     port stop all
>>>>>     port config all rxq 2
>>>>>     port config all txq 2
>>>>>     port start all
>>>>>
>>>>> Badly, that won't work for current virtio PMD implementation, and what's
>>>>> worse, it triggers a vhost crash, the one you saw.
>>>>>
>>>>> Here is how it comes. Since you just enabled 1 queue while starting
>>>>> testpmd, it will setup 1 queue only, meaning only one queue's **valid**
>>>>> information will be sent to vhost. You might see SET_VRING_ADDR
>>>>> (and related vhost messages) for the other queue as well, but they
>>>>> are just the dummy messages: they don't include any valid/real
>>>>> information about the 2nd queue: the driver don't setup it after all.
>>>>>
>>>>> So far, so good. It became broken when you run above commands. Those
>>>>> commands do setup for the 2nd queue, however, they failed to trigger
>>>>> the QEMU virtio device to start the vhost-user negotiation, meaning
>>>>> no SET_VRING_ADDR will be sent for the 2nd queue, leaving vhost
>>>>> untold and not updated.
>>>>>
>>>>> What's worse, above commands trigger the QEMU to send SET_VRING_ENABLE
>>>>> messages, to enable all the vrings. And since the vrings for the 2nd
>>>>> queue are not properly configured, the crash happens.
>>>>
>>>> Hmm, why 2nd queue works properly with my fix to vhost in this case?
>>>
>>> Hmm, really? You are sure that data flows in your 2nd queue after those
>>> commands? From what I know is that your patch just avoid a crash, but
>>> does not fix it.
>>
>> Oh, sorry. Yes, it doesn't work. With my patch applied I have a QEMU hang.
> 
> The crash actually could be avoided by commit 0823c1cb0a73 ("vhost:
> workaround stale vring base"), accidentally. That's why I asked you
> above the HEAD commit you were using.

Thanks for pointing this. I'll check it.
 
>>>>> So maybe we should do virtio reset on port start?
>>>>
>>>> I guess it was removed by this patch:
>>>> a85786dc816f ("virtio: fix states handling during initialization").
>>>
>>> Seems yes.
> 
> Actually, we should not do that: do reset on port start. The right fix
> should be allocating MAX queues virtio device supports (2 here). This
> would allow us changing the queue number dynamically.

Yes, I agree that this is the right way to fix this issue.
 
> But this doesn't sound a simple fix; it involves many code changes, due
> to it was not designed this way before. Therefore, we will not fix it
> in this release, due to it's too late. Let's fix it in the next release
> instead. For the crash issue, it will not happen with the latest HEAD.
> Though it's accident fix, I think we are fine here.
> 
> 	--yliu
> 
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-12  5:53                     ` Ilya Maximets
@ 2016-07-13  7:34                       ` Ilya Maximets
  2016-07-13  8:47                         ` Yuanhan Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-07-13  7:34 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan,
	Stephen Hemminger, Thomas Monjalon

On 12.07.2016 08:53, Ilya Maximets wrote:
> On 12.07.2016 05:43, Yuanhan Liu wrote:
>> On Mon, Jul 11, 2016 at 02:47:56PM +0300, Ilya Maximets wrote:
>>> On 11.07.2016 14:05, Yuanhan Liu wrote:
>>>> On Mon, Jul 11, 2016 at 12:50:24PM +0300, Ilya Maximets wrote:
>>>>> On 11.07.2016 11:38, Yuanhan Liu wrote:
>>>>>> On Sun, Jul 10, 2016 at 09:17:31PM +0800, Yuanhan Liu wrote:
>>>>>>> On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
>>>>>>>>
>>>>>>>> Another point is that crash constantly happens on queue_id=3 (second RX queue) in
>>>>>>>> my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
>>>>>>>> rxq=2.
>>>>>>>
>>>>>>> That's a valuable message: what's your DPDK HEAD commit while triggering
>>>>>>> this issue?
>>>>>
>>>>> fbfd99551ca3 ("mbuf: add raw allocation function")
>>>>>
>>>>>>
>>>>>> I guess I have understood what goes wrong in you case.
>>>>>>
>>>>>> I would guess that your vhost has 2 queues (here I mean queue-pairs,
>>>>>> including one Tx and Rx queue; below usage is the same) configured,
>>>>>> so does to your QEMU. However, you just enabled 1 queue while starting
>>>>>> testpmd inside the guest, and you want to enable 2 queues by running
>>>>>> following testpmd commands:
>>>>>>
>>>>>>     stop
>>>>>>     port stop all
>>>>>>     port config all rxq 2
>>>>>>     port config all txq 2
>>>>>>     port start all
>>>>>>
>>>>>> Badly, that won't work for current virtio PMD implementation, and what's
>>>>>> worse, it triggers a vhost crash, the one you saw.
>>>>>>
>>>>>> Here is how it comes. Since you just enabled 1 queue while starting
>>>>>> testpmd, it will setup 1 queue only, meaning only one queue's **valid**
>>>>>> information will be sent to vhost. You might see SET_VRING_ADDR
>>>>>> (and related vhost messages) for the other queue as well, but they
>>>>>> are just the dummy messages: they don't include any valid/real
>>>>>> information about the 2nd queue: the driver don't setup it after all.
>>>>>>
>>>>>> So far, so good. It became broken when you run above commands. Those
>>>>>> commands do setup for the 2nd queue, however, they failed to trigger
>>>>>> the QEMU virtio device to start the vhost-user negotiation, meaning
>>>>>> no SET_VRING_ADDR will be sent for the 2nd queue, leaving vhost
>>>>>> untold and not updated.
>>>>>>
>>>>>> What's worse, above commands trigger the QEMU to send SET_VRING_ENABLE
>>>>>> messages, to enable all the vrings. And since the vrings for the 2nd
>>>>>> queue are not properly configured, the crash happens.
>>>>>
>>>>> Hmm, why 2nd queue works properly with my fix to vhost in this case?
>>>>
>>>> Hmm, really? You are sure that data flows in your 2nd queue after those
>>>> commands? From what I know is that your patch just avoid a crash, but
>>>> does not fix it.
>>>
>>> Oh, sorry. Yes, it doesn't work. With my patch applied I have a QEMU hang.
>>
>> The crash actually could be avoided by commit 0823c1cb0a73 ("vhost:
>> workaround stale vring base"), accidentally. That's why I asked you
>> above the HEAD commit you were using.
> 
> Thanks for pointing this. I'll check it.

I've checked my DPDK HEAD with above commit backported. Yes, it helps to
avoid vhost crash in my scenario. As expected, after reconfiguration new
virtqueue doesn't work, QEMU hangs sometimes.

>>>>>> So maybe we should do virtio reset on port start?
>>>>>
>>>>> I guess it was removed by this patch:
>>>>> a85786dc816f ("virtio: fix states handling during initialization").
>>>>
>>>> Seems yes.
>>
>> Actually, we should not do that: do reset on port start. The right fix
>> should be allocating MAX queues virtio device supports (2 here). This
>> would allow us changing the queue number dynamically.
> 
> Yes, I agree that this is the right way to fix this issue.
>  
>> But this doesn't sound a simple fix; it involves many code changes, due
>> to it was not designed this way before. Therefore, we will not fix it
>> in this release, due to it's too late. Let's fix it in the next release
>> instead. For the crash issue, it will not happen with the latest HEAD.
>> Though it's accident fix, I think we are fine here.

This scenario fixed somehow, I agree. But this patch still needed to protect
vhost from untrusted VM, from malicious or buggy virtio application.
Maybe we could change the commit-message and resend this patch as a
security enhancement? What do you think?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-13  7:34                       ` Ilya Maximets
@ 2016-07-13  8:47                         ` Yuanhan Liu
  2016-07-13 15:54                           ` Rich Lane
  0 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-13  8:47 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan,
	Stephen Hemminger, Thomas Monjalon, Rich Lane

On Wed, Jul 13, 2016 at 10:34:07AM +0300, Ilya Maximets wrote:
> On 12.07.2016 08:53, Ilya Maximets wrote:
> > On 12.07.2016 05:43, Yuanhan Liu wrote:
> >> On Mon, Jul 11, 2016 at 02:47:56PM +0300, Ilya Maximets wrote:
> >>> On 11.07.2016 14:05, Yuanhan Liu wrote:
> >>>> On Mon, Jul 11, 2016 at 12:50:24PM +0300, Ilya Maximets wrote:
> >>>>> On 11.07.2016 11:38, Yuanhan Liu wrote:
> >>>>>> On Sun, Jul 10, 2016 at 09:17:31PM +0800, Yuanhan Liu wrote:
> >>>>>>> On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
> >>>>>>>>
> >>>>>>>> Another point is that crash constantly happens on queue_id=3 (second RX queue) in
> >>>>>>>> my scenario. It is newly allocated virtqueue while reconfiguration from rxq=1 to
> >>>>>>>> rxq=2.
> >>>>>>>
> >>>>>>> That's a valuable message: what's your DPDK HEAD commit while triggering
> >>>>>>> this issue?
> >>>>>
> >>>>> fbfd99551ca3 ("mbuf: add raw allocation function")
> >>>>>
> >>>>>>
> >>>>>> I guess I have understood what goes wrong in you case.
> >>>>>>
> >>>>>> I would guess that your vhost has 2 queues (here I mean queue-pairs,
> >>>>>> including one Tx and Rx queue; below usage is the same) configured,
> >>>>>> so does to your QEMU. However, you just enabled 1 queue while starting
> >>>>>> testpmd inside the guest, and you want to enable 2 queues by running
> >>>>>> following testpmd commands:
> >>>>>>
> >>>>>>     stop
> >>>>>>     port stop all
> >>>>>>     port config all rxq 2
> >>>>>>     port config all txq 2
> >>>>>>     port start all
> >>>>>>
> >>>>>> Badly, that won't work for current virtio PMD implementation, and what's
> >>>>>> worse, it triggers a vhost crash, the one you saw.
> >>>>>>
> >>>>>> Here is how it comes. Since you just enabled 1 queue while starting
> >>>>>> testpmd, it will setup 1 queue only, meaning only one queue's **valid**
> >>>>>> information will be sent to vhost. You might see SET_VRING_ADDR
> >>>>>> (and related vhost messages) for the other queue as well, but they
> >>>>>> are just the dummy messages: they don't include any valid/real
> >>>>>> information about the 2nd queue: the driver don't setup it after all.
> >>>>>>
> >>>>>> So far, so good. It became broken when you run above commands. Those
> >>>>>> commands do setup for the 2nd queue, however, they failed to trigger
> >>>>>> the QEMU virtio device to start the vhost-user negotiation, meaning
> >>>>>> no SET_VRING_ADDR will be sent for the 2nd queue, leaving vhost
> >>>>>> untold and not updated.
> >>>>>>
> >>>>>> What's worse, above commands trigger the QEMU to send SET_VRING_ENABLE
> >>>>>> messages, to enable all the vrings. And since the vrings for the 2nd
> >>>>>> queue are not properly configured, the crash happens.
> >>>>>
> >>>>> Hmm, why 2nd queue works properly with my fix to vhost in this case?
> >>>>
> >>>> Hmm, really? You are sure that data flows in your 2nd queue after those
> >>>> commands? From what I know is that your patch just avoid a crash, but
> >>>> does not fix it.
> >>>
> >>> Oh, sorry. Yes, it doesn't work. With my patch applied I have a QEMU hang.
> >>
> >> The crash actually could be avoided by commit 0823c1cb0a73 ("vhost:
> >> workaround stale vring base"), accidentally. That's why I asked you
> >> above the HEAD commit you were using.
> > 
> > Thanks for pointing this. I'll check it.
> 
> I've checked my DPDK HEAD with above commit backported. Yes, it helps to
> avoid vhost crash in my scenario. As expected, after reconfiguration new
> virtqueue doesn't work, QEMU hangs sometimes.
> >>>>>> So maybe we should do virtio reset on port start?
> >>>>>
> >>>>> I guess it was removed by this patch:
> >>>>> a85786dc816f ("virtio: fix states handling during initialization").
> >>>>
> >>>> Seems yes.
> >>
> >> Actually, we should not do that: do reset on port start. The right fix
> >> should be allocating MAX queues virtio device supports (2 here). This
> >> would allow us changing the queue number dynamically.
> > 
> > Yes, I agree that this is the right way to fix this issue.
> >  
> >> But this doesn't sound a simple fix; it involves many code changes, due
> >> to it was not designed this way before. Therefore, we will not fix it
> >> in this release, due to it's too late. Let's fix it in the next release
> >> instead. For the crash issue, it will not happen with the latest HEAD.
> >> Though it's accident fix, I think we are fine here.
> 
> This scenario fixed somehow, I agree. But this patch still needed to protect
> vhost from untrusted VM, from malicious or buggy virtio application.
> Maybe we could change the commit-message and resend this patch as a
> security enhancement? What do you think?

Indeed, but I'm a bit concerned about the performance regression found
by Rich, yet I am not quite sure why it happens, though Rich claimed
that it seems to be a problem related to compiler.

	--yliu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-13  8:47                         ` Yuanhan Liu
@ 2016-07-13 15:54                           ` Rich Lane
  2016-07-14  1:42                             ` Yuanhan Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Rich Lane @ 2016-07-13 15:54 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: Ilya Maximets, dev, Huawei Xie, Dyasly Sergey, Heetae Ahn,
	Jianfeng Tan, Stephen Hemminger, Thomas Monjalon

On Wednesday, July 13, 2016, Yuanhan Liu <yuanhan.liu@linux.intel.com>
wrote:

> On Wed, Jul 13, 2016 at 10:34:07AM +0300, Ilya Maximets wrote:
> > This scenario fixed somehow, I agree. But this patch still needed to
> protect
> > vhost from untrusted VM, from malicious or buggy virtio application.
> > Maybe we could change the commit-message and resend this patch as a
> > security enhancement? What do you think?
>
> Indeed, but I'm a bit concerned about the performance regression found
> by Rich, yet I am not quite sure why it happens, though Rich claimed
> that it seems to be a problem related to compiler.


The workaround I suggested solves the performance regression. But even if
it hadn't, this is a security fix that should be merged regardless of the
performance impact.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-13 15:54                           ` Rich Lane
@ 2016-07-14  1:42                             ` Yuanhan Liu
  2016-07-14  4:38                               ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-14  1:42 UTC (permalink / raw)
  To: Rich Lane
  Cc: Ilya Maximets, dev, Huawei Xie, Dyasly Sergey, Heetae Ahn,
	Jianfeng Tan, Stephen Hemminger, Thomas Monjalon

On Wed, Jul 13, 2016 at 08:54:08AM -0700, Rich Lane wrote:
> On Wednesday, July 13, 2016, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote:
> 
>     On Wed, Jul 13, 2016 at 10:34:07AM +0300, Ilya Maximets wrote:
>     > This scenario fixed somehow, I agree. But this patch still needed to
>     protect
>     > vhost from untrusted VM, from malicious or buggy virtio application.
>     > Maybe we could change the commit-message and resend this patch as a
>     > security enhancement? What do you think?
> 
>     Indeed, but I'm a bit concerned about the performance regression found
>     by Rich, yet I am not quite sure why it happens, though Rich claimed
>     that it seems to be a problem related to compiler.
> 
> 
> The workaround I suggested solves the performance regression. But even if it
> hadn't, this is a security fix that should be merged regardless of the
> performance impact.

Good point. Ilya, would you reword the commit log and resend based on
latest code?

	--yliu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.
  2016-07-14  1:42                             ` Yuanhan Liu
@ 2016-07-14  4:38                               ` Ilya Maximets
  0 siblings, 0 replies; 39+ messages in thread
From: Ilya Maximets @ 2016-07-14  4:38 UTC (permalink / raw)
  To: Yuanhan Liu, Rich Lane
  Cc: dev, Huawei Xie, Dyasly Sergey, Heetae Ahn, Jianfeng Tan,
	Stephen Hemminger, Thomas Monjalon

On 14.07.2016 04:42, Yuanhan Liu wrote:
> On Wed, Jul 13, 2016 at 08:54:08AM -0700, Rich Lane wrote:
>> On Wednesday, July 13, 2016, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote:
>>
>>     On Wed, Jul 13, 2016 at 10:34:07AM +0300, Ilya Maximets wrote:
>>     > This scenario fixed somehow, I agree. But this patch still needed to
>>     protect
>>     > vhost from untrusted VM, from malicious or buggy virtio application.
>>     > Maybe we could change the commit-message and resend this patch as a
>>     > security enhancement? What do you think?
>>
>>     Indeed, but I'm a bit concerned about the performance regression found
>>     by Rich, yet I am not quite sure why it happens, though Rich claimed
>>     that it seems to be a problem related to compiler.
>>
>>
>> The workaround I suggested solves the performance regression. But even if it
>> hadn't, this is a security fix that should be merged regardless of the
>> performance impact.
> 
> Good point. Ilya, would you reword the commit log and resend based on
> latest code?

OK.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH v2] vhost: fix segfault on bad descriptor address
  2016-05-20 12:50 [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address Ilya Maximets
                   ` (3 preceding siblings ...)
  2016-07-01  7:35 ` Yuanhan Liu
@ 2016-07-14  8:18 ` Ilya Maximets
  2016-07-15  6:17   ` Yuanhan Liu
  2016-07-15 11:15 ` [dpdk-dev] [PATCH v3 0/2] " Ilya Maximets
  5 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-07-14  8:18 UTC (permalink / raw)
  To: dev, Huawei Xie, Yuanhan Liu, Rich Lane
  Cc: Dyasly Sergey, Heetae Ahn, Jianfeng Tan, Stephen Hemminger,
	Thomas Monjalon, Ilya Maximets

In current implementation vhost will crash with segmentation fault
if malicious or buggy virtio application breaks addresses of descriptors.

Before commit 0823c1cb0a73 this crash was reproducible even with
normal DPDK application that tries to change number of virtqueues
dynamically inside VM.

Fix that by checking addresses of descriptors before using.

Also fixed return value on error for 'copy_mbuf_to_desc_mergeable()'
from '-1' to '0' because it returns unsigned value and it means
number of used descriptors.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---
Version 2:
	* Rebased on top of current master.
	* host's address now checked in meargeable case,
	  because needed refactoring already done.
	* Commit-message changed because old issue with
	  virtio reload accidentially fixed by commit
	  0823c1cb0a73.

 lib/librte_vhost/vhost_rxtx.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 15ca956..31e8b58 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
 
 	desc = &vq->desc[desc_idx];
-	if (unlikely(desc->len < dev->vhost_hlen))
+	desc_addr = gpa_to_vva(dev, desc->addr);
+	if (unlikely(desc->len < dev->vhost_hlen || !desc_addr))
 		return -1;
 
-	desc_addr = gpa_to_vva(dev, desc->addr);
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 	virtio_enqueue_offload(m, &virtio_hdr.hdr);
@@ -182,7 +182,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				return -1;
 
 			desc = &vq->desc[desc->next];
-			desc_addr   = gpa_to_vva(dev, desc->addr);
+			desc_addr = gpa_to_vva(dev, desc->addr);
+			if (unlikely(!desc_addr))
+				return -1;
+
 			desc_offset = 0;
 			desc_avail  = desc->len;
 		}
@@ -387,10 +390,10 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
 		dev->vid, cur_idx, end_idx);
 
-	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
-		return -1;
-
 	desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
+		return 0;
+
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 	virtio_hdr.num_buffers = end_idx - start_idx;
@@ -425,6 +428,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 
 			vec_idx++;
 			desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+			if (unlikely(!desc_addr))
+				return 0;
 
 			/* Prefetch buffer address. */
 			rte_prefetch0((void *)(uintptr_t)desc_addr);
@@ -507,7 +512,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 		*(volatile uint16_t *)&vq->used->idx += nr_used;
 		vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
 			sizeof(vq->used->idx));
-		vq->last_used_idx = end;
+		vq->last_used_idx += nr_used;
 	}
 
 	if (likely(pkt_idx)) {
@@ -688,6 +693,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		return -1;
 
 	desc_addr = gpa_to_vva(dev, desc->addr);
+	if (unlikely(!desc_addr))
+		return -1;
+
 	hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr);
 	rte_prefetch0(hdr);
 
@@ -701,6 +709,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		desc = &vq->desc[desc->next];
 
 		desc_addr = gpa_to_vva(dev, desc->addr);
+		if (unlikely(!desc_addr))
+			return -1;
+
 		rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 		desc_offset = 0;
@@ -737,6 +748,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			desc = &vq->desc[desc->next];
 
 			desc_addr = gpa_to_vva(dev, desc->addr);
+			if (unlikely(!desc_addr))
+				return -1;
+
 			rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 			desc_offset = 0;
-- 
2.7.4

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH v2] vhost: fix segfault on bad descriptor address
  2016-07-14  8:18 ` [dpdk-dev] [PATCH v2] " Ilya Maximets
@ 2016-07-15  6:17   ` Yuanhan Liu
  2016-07-15  7:23     ` Ilya Maximets
  0 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-15  6:17 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: dev, Huawei Xie, Rich Lane, Dyasly Sergey, Heetae Ahn,
	Jianfeng Tan, Stephen Hemminger, Thomas Monjalon

On Thu, Jul 14, 2016 at 11:18:39AM +0300, Ilya Maximets wrote:
> In current implementation vhost will crash with segmentation fault
> if malicious or buggy virtio application breaks addresses of descriptors.
> 
> Before commit 0823c1cb0a73 this crash was reproducible even with
> normal DPDK application that tries to change number of virtqueues
> dynamically inside VM.
> 
> Fix that by checking addresses of descriptors before using.
> 
> Also fixed return value on error for 'copy_mbuf_to_desc_mergeable()'
> from '-1' to '0' because it returns unsigned value and it means
> number of used descriptors.

Yeah, that's a good fix. Thanks.

Maybe you'd better make it a standalone patch.

> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> ---
> Version 2:
> 	* Rebased on top of current master.
> 	* host's address now checked in meargeable case,
> 	  because needed refactoring already done.
> 	* Commit-message changed because old issue with
> 	  virtio reload accidentially fixed by commit
> 	  0823c1cb0a73.
> 
>  lib/librte_vhost/vhost_rxtx.c | 28 +++++++++++++++++++++-------
>  1 file changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 15ca956..31e8b58 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
>  
>  	desc = &vq->desc[desc_idx];
> -	if (unlikely(desc->len < dev->vhost_hlen))
> +	desc_addr = gpa_to_vva(dev, desc->addr);
> +	if (unlikely(desc->len < dev->vhost_hlen || !desc_addr))
>  		return -1;

So, you discards the workaround from Rich?

>  
> -	desc_addr = gpa_to_vva(dev, desc->addr);
>  	rte_prefetch0((void *)(uintptr_t)desc_addr);
>  
>  	virtio_enqueue_offload(m, &virtio_hdr.hdr);
> @@ -182,7 +182,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  				return -1;
>  
>  			desc = &vq->desc[desc->next];
> -			desc_addr   = gpa_to_vva(dev, desc->addr);
> +			desc_addr = gpa_to_vva(dev, desc->addr);
> +			if (unlikely(!desc_addr))
> +				return -1;
> +
>  			desc_offset = 0;
>  			desc_avail  = desc->len;
>  		}
> @@ -387,10 +390,10 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  	LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
>  		dev->vid, cur_idx, end_idx);
>  
> -	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
> -		return -1;
> -
>  	desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
> +	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
> +		return 0;
> +
>  	rte_prefetch0((void *)(uintptr_t)desc_addr);
>  
>  	virtio_hdr.num_buffers = end_idx - start_idx;
> @@ -425,6 +428,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  
>  			vec_idx++;
>  			desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
> +			if (unlikely(!desc_addr))
> +				return 0;
>  
>  			/* Prefetch buffer address. */
>  			rte_prefetch0((void *)(uintptr_t)desc_addr);
> @@ -507,7 +512,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>  		*(volatile uint16_t *)&vq->used->idx += nr_used;
>  		vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>  			sizeof(vq->used->idx));
> -		vq->last_used_idx = end;
> +		vq->last_used_idx += nr_used;

Ditto, this may deserve another patch, too.

	--yliu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH v2] vhost: fix segfault on bad descriptor address
  2016-07-15  6:17   ` Yuanhan Liu
@ 2016-07-15  7:23     ` Ilya Maximets
  2016-07-15  8:40       ` Yuanhan Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Ilya Maximets @ 2016-07-15  7:23 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: dev, Huawei Xie, Rich Lane, Dyasly Sergey, Heetae Ahn,
	Jianfeng Tan, Stephen Hemminger, Thomas Monjalon

On 15.07.2016 09:17, Yuanhan Liu wrote:
> On Thu, Jul 14, 2016 at 11:18:39AM +0300, Ilya Maximets wrote:
>> In current implementation vhost will crash with segmentation fault
>> if malicious or buggy virtio application breaks addresses of descriptors.
>>
>> Before commit 0823c1cb0a73 this crash was reproducible even with
>> normal DPDK application that tries to change number of virtqueues
>> dynamically inside VM.
>>
>> Fix that by checking addresses of descriptors before using.
>>
>> Also fixed return value on error for 'copy_mbuf_to_desc_mergeable()'
>> from '-1' to '0' because it returns unsigned value and it means
>> number of used descriptors.
> 
> Yeah, that's a good fix. Thanks.
> 
> Maybe you'd better make it a standalone patch.

Ok. Maybe I should split this patch in two:
1. Fix return value + using of this value (vq->last_used_idx += nr_used;)
2. Check addresses of descriptors.
What do you think?

>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>> ---
>> Version 2:
>> 	* Rebased on top of current master.
>> 	* host's address now checked in meargeable case,
>> 	  because needed refactoring already done.
>> 	* Commit-message changed because old issue with
>> 	  virtio reload accidentially fixed by commit
>> 	  0823c1cb0a73.
>>
>>  lib/librte_vhost/vhost_rxtx.c | 28 +++++++++++++++++++++-------
>>  1 file changed, 21 insertions(+), 7 deletions(-)
>>
>> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
>> index 15ca956..31e8b58 100644
>> --- a/lib/librte_vhost/vhost_rxtx.c
>> +++ b/lib/librte_vhost/vhost_rxtx.c
>> @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>  	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
>>  
>>  	desc = &vq->desc[desc_idx];
>> -	if (unlikely(desc->len < dev->vhost_hlen))
>> +	desc_addr = gpa_to_vva(dev, desc->addr);
>> +	if (unlikely(desc->len < dev->vhost_hlen || !desc_addr))
>>  		return -1;
> 
> So, you discards the workaround from Rich?

I can apply it, if you wish. Should I?

>>  
>> -	desc_addr = gpa_to_vva(dev, desc->addr);
>>  	rte_prefetch0((void *)(uintptr_t)desc_addr);
>>  
>>  	virtio_enqueue_offload(m, &virtio_hdr.hdr);
>> @@ -182,7 +182,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>  				return -1;
>>  
>>  			desc = &vq->desc[desc->next];
>> -			desc_addr   = gpa_to_vva(dev, desc->addr);
>> +			desc_addr = gpa_to_vva(dev, desc->addr);
>> +			if (unlikely(!desc_addr))
>> +				return -1;
>> +
>>  			desc_offset = 0;
>>  			desc_avail  = desc->len;
>>  		}
>> @@ -387,10 +390,10 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>  	LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
>>  		dev->vid, cur_idx, end_idx);
>>  
>> -	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
>> -		return -1;
>> -
>>  	desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
>> +	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
>> +		return 0;
>> +
>>  	rte_prefetch0((void *)(uintptr_t)desc_addr);
>>  
>>  	virtio_hdr.num_buffers = end_idx - start_idx;
>> @@ -425,6 +428,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>  
>>  			vec_idx++;
>>  			desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
>> +			if (unlikely(!desc_addr))
>> +				return 0;
>>  
>>  			/* Prefetch buffer address. */
>>  			rte_prefetch0((void *)(uintptr_t)desc_addr);
>> @@ -507,7 +512,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>>  		*(volatile uint16_t *)&vq->used->idx += nr_used;
>>  		vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>>  			sizeof(vq->used->idx));
>> -		vq->last_used_idx = end;
>> +		vq->last_used_idx += nr_used;
> 
> Ditto, this may deserve another patch, too.
> 
> 	--yliu
> 
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH v2] vhost: fix segfault on bad descriptor address
  2016-07-15  7:23     ` Ilya Maximets
@ 2016-07-15  8:40       ` Yuanhan Liu
  0 siblings, 0 replies; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-15  8:40 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: dev, Huawei Xie, Rich Lane, Dyasly Sergey, Heetae Ahn,
	Jianfeng Tan, Stephen Hemminger, Thomas Monjalon

On Fri, Jul 15, 2016 at 10:23:12AM +0300, Ilya Maximets wrote:
> On 15.07.2016 09:17, Yuanhan Liu wrote:
> > On Thu, Jul 14, 2016 at 11:18:39AM +0300, Ilya Maximets wrote:
> >> In current implementation vhost will crash with segmentation fault
> >> if malicious or buggy virtio application breaks addresses of descriptors.
> >>
> >> Before commit 0823c1cb0a73 this crash was reproducible even with
> >> normal DPDK application that tries to change number of virtqueues
> >> dynamically inside VM.
> >>
> >> Fix that by checking addresses of descriptors before using.
> >>
> >> Also fixed return value on error for 'copy_mbuf_to_desc_mergeable()'
> >> from '-1' to '0' because it returns unsigned value and it means
> >> number of used descriptors.
> > 
> > Yeah, that's a good fix. Thanks.
> > 
> > Maybe you'd better make it a standalone patch.
> 
> Ok. Maybe I should split this patch in two:
> 1. Fix return value + using of this value (vq->last_used_idx += nr_used;)
> 2. Check addresses of descriptors.
> What do you think?

Good to me.

> >> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> >> ---
> >> Version 2:
> >> 	* Rebased on top of current master.
> >> 	* host's address now checked in meargeable case,
> >> 	  because needed refactoring already done.
> >> 	* Commit-message changed because old issue with
> >> 	  virtio reload accidentially fixed by commit
> >> 	  0823c1cb0a73.
> >>
> >>  lib/librte_vhost/vhost_rxtx.c | 28 +++++++++++++++++++++-------
> >>  1 file changed, 21 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> >> index 15ca956..31e8b58 100644
> >> --- a/lib/librte_vhost/vhost_rxtx.c
> >> +++ b/lib/librte_vhost/vhost_rxtx.c
> >> @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
> >>  	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
> >>  
> >>  	desc = &vq->desc[desc_idx];
> >> -	if (unlikely(desc->len < dev->vhost_hlen))
> >> +	desc_addr = gpa_to_vva(dev, desc->addr);
> >> +	if (unlikely(desc->len < dev->vhost_hlen || !desc_addr))
> >>  		return -1;
> > 
> > So, you discards the workaround from Rich?
> 
> I can apply it, if you wish. Should I?

Yeah, it's hard to tell. The performace regression is weird after all.
I'm thinking we should appy it anyway: it saves 10% regression, which
is worthwhile. I think we should also add comments there.

	--yliu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad descriptor address
  2016-05-20 12:50 [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address Ilya Maximets
                   ` (4 preceding siblings ...)
  2016-07-14  8:18 ` [dpdk-dev] [PATCH v2] " Ilya Maximets
@ 2016-07-15 11:15 ` Ilya Maximets
  2016-07-15 11:15   ` [dpdk-dev] [PATCH v3 1/2] vhost: fix using of bad return value on mergeable enqueue Ilya Maximets
                     ` (2 more replies)
  5 siblings, 3 replies; 39+ messages in thread
From: Ilya Maximets @ 2016-07-15 11:15 UTC (permalink / raw)
  To: dev, Huawei Xie, Yuanhan Liu, Rich Lane
  Cc: Dyasly Sergey, Heetae Ahn, Jianfeng Tan, Stephen Hemminger,
	Thomas Monjalon, Ilya Maximets

Version 3:
	* Patch splitted in two.
	* Applied workaround from Rich Lane and added comment about
	  performance issue with some compilers and 'unlikely' macro.

Version 2:
	* Rebased on top of current master.
	* host's address now checked in meargeable case,
	  because needed refactoring already done.
	* Commit-message changed because old issue with
	  virtio reload accidentially fixed by commit
	  0823c1cb0a73.

Ilya Maximets (2):
  vhost: fix using of bad return value on mergeable enqueue
  vhost: do sanity check for ring descriptor address

 lib/librte_vhost/vhost_rxtx.c | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH v3 1/2] vhost: fix using of bad return value on mergeable enqueue
  2016-07-15 11:15 ` [dpdk-dev] [PATCH v3 0/2] " Ilya Maximets
@ 2016-07-15 11:15   ` Ilya Maximets
  2016-07-15 11:15   ` [dpdk-dev] [PATCH v3 2/2] vhost: do sanity check for ring descriptor address Ilya Maximets
  2016-07-15 12:14   ` [dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad " Yuanhan Liu
  2 siblings, 0 replies; 39+ messages in thread
From: Ilya Maximets @ 2016-07-15 11:15 UTC (permalink / raw)
  To: dev, Huawei Xie, Yuanhan Liu, Rich Lane
  Cc: Dyasly Sergey, Heetae Ahn, Jianfeng Tan, Stephen Hemminger,
	Thomas Monjalon, Ilya Maximets

Return value on error changed from '-1' to '0' because it returns
unsigned value and it means number of used descriptors.

Also fixed updating of 'last_used_idx' by using actual number of
used descriptors.

Fixes: 623bc47054d0 ("vhost: do sanity check for ring descriptor length")

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---
 lib/librte_vhost/vhost_rxtx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 15ca956..a9b04df 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -388,7 +388,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		dev->vid, cur_idx, end_idx);
 
 	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
-		return -1;
+		return 0;
 
 	desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
@@ -507,7 +507,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 		*(volatile uint16_t *)&vq->used->idx += nr_used;
 		vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
 			sizeof(vq->used->idx));
-		vq->last_used_idx = end;
+		vq->last_used_idx += nr_used;
 	}
 
 	if (likely(pkt_idx)) {
-- 
2.7.4

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH v3 2/2] vhost: do sanity check for ring descriptor address
  2016-07-15 11:15 ` [dpdk-dev] [PATCH v3 0/2] " Ilya Maximets
  2016-07-15 11:15   ` [dpdk-dev] [PATCH v3 1/2] vhost: fix using of bad return value on mergeable enqueue Ilya Maximets
@ 2016-07-15 11:15   ` Ilya Maximets
  2016-07-15 12:14   ` [dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad " Yuanhan Liu
  2 siblings, 0 replies; 39+ messages in thread
From: Ilya Maximets @ 2016-07-15 11:15 UTC (permalink / raw)
  To: dev, Huawei Xie, Yuanhan Liu, Rich Lane
  Cc: Dyasly Sergey, Heetae Ahn, Jianfeng Tan, Stephen Hemminger,
	Thomas Monjalon, Ilya Maximets

In current implementation vhost will crash with segmentation fault
if malicious or buggy virtio application breaks addresses of descriptors.

Before commit 0823c1cb0a73 ("vhost: workaround stale vring base")
this crash was reproducible even with normal DPDK application that tries
to change number of virtqueues dynamically inside VM.

Fix that by checking addresses of descriptors before using.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---
 lib/librte_vhost/vhost_rxtx.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index a9b04df..bc00518 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -147,10 +147,15 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
 
 	desc = &vq->desc[desc_idx];
-	if (unlikely(desc->len < dev->vhost_hlen))
+	desc_addr = gpa_to_vva(dev, desc->addr);
+	/*
+	 * Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid
+	 * performance issue with some versions of gcc (4.8.4 and 5.3.0) which
+	 * otherwise stores offset on the stack instead of in a register.
+	 */
+	if (unlikely(desc->len < dev->vhost_hlen) || !desc_addr)
 		return -1;
 
-	desc_addr = gpa_to_vva(dev, desc->addr);
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 	virtio_enqueue_offload(m, &virtio_hdr.hdr);
@@ -182,7 +187,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				return -1;
 
 			desc = &vq->desc[desc->next];
-			desc_addr   = gpa_to_vva(dev, desc->addr);
+			desc_addr = gpa_to_vva(dev, desc->addr);
+			if (unlikely(!desc_addr))
+				return -1;
+
 			desc_offset = 0;
 			desc_avail  = desc->len;
 		}
@@ -387,10 +395,10 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
 		dev->vid, cur_idx, end_idx);
 
-	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
+	desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
 		return 0;
 
-	desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 	virtio_hdr.num_buffers = end_idx - start_idx;
@@ -425,6 +433,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 
 			vec_idx++;
 			desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+			if (unlikely(!desc_addr))
+				return 0;
 
 			/* Prefetch buffer address. */
 			rte_prefetch0((void *)(uintptr_t)desc_addr);
@@ -688,6 +698,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		return -1;
 
 	desc_addr = gpa_to_vva(dev, desc->addr);
+	if (unlikely(!desc_addr))
+		return -1;
+
 	hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr);
 	rte_prefetch0(hdr);
 
@@ -701,6 +714,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		desc = &vq->desc[desc->next];
 
 		desc_addr = gpa_to_vva(dev, desc->addr);
+		if (unlikely(!desc_addr))
+			return -1;
+
 		rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 		desc_offset = 0;
@@ -737,6 +753,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			desc = &vq->desc[desc->next];
 
 			desc_addr = gpa_to_vva(dev, desc->addr);
+			if (unlikely(!desc_addr))
+				return -1;
+
 			rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 			desc_offset = 0;
-- 
2.7.4

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad descriptor address
  2016-07-15 11:15 ` [dpdk-dev] [PATCH v3 0/2] " Ilya Maximets
  2016-07-15 11:15   ` [dpdk-dev] [PATCH v3 1/2] vhost: fix using of bad return value on mergeable enqueue Ilya Maximets
  2016-07-15 11:15   ` [dpdk-dev] [PATCH v3 2/2] vhost: do sanity check for ring descriptor address Ilya Maximets
@ 2016-07-15 12:14   ` Yuanhan Liu
  2016-07-15 19:37     ` Thomas Monjalon
  2 siblings, 1 reply; 39+ messages in thread
From: Yuanhan Liu @ 2016-07-15 12:14 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: dev, Huawei Xie, Rich Lane, Dyasly Sergey, Heetae Ahn,
	Jianfeng Tan, Stephen Hemminger, Thomas Monjalon

On Fri, Jul 15, 2016 at 02:15:03PM +0300, Ilya Maximets wrote:
> Version 3:
> 	* Patch splitted in two.
> 	* Applied workaround from Rich Lane and added comment about
> 	  performance issue with some compilers and 'unlikely' macro.

Thanks a lot for the patches.

Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>

	--yliu
> 
> Version 2:
> 	* Rebased on top of current master.
> 	* host's address now checked in meargeable case,
> 	  because needed refactoring already done.
> 	* Commit-message changed because old issue with
> 	  virtio reload accidentially fixed by commit
> 	  0823c1cb0a73.
> 
> Ilya Maximets (2):
>   vhost: fix using of bad return value on mergeable enqueue
>   vhost: do sanity check for ring descriptor address
> 
>  lib/librte_vhost/vhost_rxtx.c | 33 ++++++++++++++++++++++++++-------
>  1 file changed, 26 insertions(+), 7 deletions(-)
> 
> -- 
> 2.7.4

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad descriptor address
  2016-07-15 12:14   ` [dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad " Yuanhan Liu
@ 2016-07-15 19:37     ` Thomas Monjalon
  0 siblings, 0 replies; 39+ messages in thread
From: Thomas Monjalon @ 2016-07-15 19:37 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: Yuanhan Liu, dev, Huawei Xie, Rich Lane, Dyasly Sergey,
	Heetae Ahn, Jianfeng Tan, Stephen Hemminger

2016-07-15 20:14, Yuanhan Liu:
> On Fri, Jul 15, 2016 at 02:15:03PM +0300, Ilya Maximets wrote:
> > Version 3:
> > 	* Patch splitted in two.
> > 	* Applied workaround from Rich Lane and added comment about
> > 	  performance issue with some compilers and 'unlikely' macro.
> 
> Thanks a lot for the patches.
> 
> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>

Applied, thanks

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2016-07-15 19:37 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-20 12:50 [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address Ilya Maximets
2016-05-23 10:57 ` Yuanhan Liu
2016-05-23 11:04   ` Ilya Maximets
2016-05-30 11:05     ` Ilya Maximets
2016-05-30 14:25       ` Yuanhan Liu
2016-05-31  9:12         ` Ilya Maximets
2016-05-30 12:00 ` Tan, Jianfeng
2016-05-30 12:24   ` Ilya Maximets
2016-05-31  6:53     ` Tan, Jianfeng
2016-05-31  9:10       ` Ilya Maximets
2016-05-31 22:06 ` Rich Lane
2016-06-02 10:46   ` Ilya Maximets
2016-06-02 16:22     ` Rich Lane
2016-06-03  6:01       ` Ilya Maximets
2016-07-01  7:35 ` Yuanhan Liu
2016-07-06 11:19   ` Ilya Maximets
2016-07-06 12:24     ` Yuanhan Liu
2016-07-08 11:48       ` Ilya Maximets
2016-07-10 13:17         ` Yuanhan Liu
2016-07-11  8:38           ` Yuanhan Liu
2016-07-11  9:50             ` Ilya Maximets
2016-07-11 11:05               ` Yuanhan Liu
2016-07-11 11:47                 ` Ilya Maximets
2016-07-12  2:43                   ` Yuanhan Liu
2016-07-12  5:53                     ` Ilya Maximets
2016-07-13  7:34                       ` Ilya Maximets
2016-07-13  8:47                         ` Yuanhan Liu
2016-07-13 15:54                           ` Rich Lane
2016-07-14  1:42                             ` Yuanhan Liu
2016-07-14  4:38                               ` Ilya Maximets
2016-07-14  8:18 ` [dpdk-dev] [PATCH v2] " Ilya Maximets
2016-07-15  6:17   ` Yuanhan Liu
2016-07-15  7:23     ` Ilya Maximets
2016-07-15  8:40       ` Yuanhan Liu
2016-07-15 11:15 ` [dpdk-dev] [PATCH v3 0/2] " Ilya Maximets
2016-07-15 11:15   ` [dpdk-dev] [PATCH v3 1/2] vhost: fix using of bad return value on mergeable enqueue Ilya Maximets
2016-07-15 11:15   ` [dpdk-dev] [PATCH v3 2/2] vhost: do sanity check for ring descriptor address Ilya Maximets
2016-07-15 12:14   ` [dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad " Yuanhan Liu
2016-07-15 19:37     ` Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).