* [dpdk-dev] vhost: no protection against malformed queue descriptors in rte_vhost_dequeue_burst()
@ 2016-03-16 12:52 Patrik Andersson R
2016-03-17 1:35 ` Xie, Huawei
0 siblings, 1 reply; 3+ messages in thread
From: Patrik Andersson R @ 2016-03-16 12:52 UTC (permalink / raw)
To: dev
Hello,
When taking a snapshot of a running VM instance, using OpenStack
"nova image-create", I noticed that one OVS pmd-thread eventually
failed in DPDK rte_vhost_dequeue_burst() with repeating log entries:
compute-0-6 ovs-vswitchd[38172]: VHOST_DATA: Failed to allocate
memory for mbuf.
Debugging (data included further down) this issue lead to the
observation that there is no protection against malformed vhost
queue descriptors, thus tenant separation might be violated as a
single faulty VM might bring down the connectivity of all VMs
connected to the same virtual switch.
To avoid this, validation would be needed at some points in the
rte_vhost_dequeue_burst() code:
1) when the queue descriptor is picked up for processing,
desc->flags and desc->len might both be 0
...
desc = &vq->desc[head[entry_success]];
...
/* Discard first buffer as it is the virtio header */
if (desc->flags & VRING_DESC_F_NEXT) {
desc = &vq->desc[desc->next];
vb_offset = 0;
vb_avail = desc->len;
} else {
vb_offset = vq->vhost_hlen;
vb_avail = desc->len - vb_offset;
}
....
2) at buffer address translation gpa_to_vva(), might fail
returning NULL as indication
vb_addr = gpa_to_vva(dev, desc->addr);
...
while (cpy_len != 0) {
rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, seg_offset),
(void *)((uintptr_t)(vb_addr + vb_offset)),
cpy_len);
...
}
...
Wondering if there are any plans of adding any kind of validation in
DPDK, or if it would be useful to suggest specific implementation of
such validations in the DPDK code?
Or is there some mechanism that gives us the confidence to trust
the vhost queue content absolutely?
Debugging data:
For my scenario the problem occurs in DPDK rte_vhost_dequeue_burst()
due to use of a vhost queue descriptor that has all fields 0:
(gdb) print *desc
{addr = 0, len = 0, flags = 0, next = 0}
Subsequent use of desc->len to compute vb_avail = desc->len - vb_offset,
leads to the problem observed. What happens is that the packet needs to
be segmented -- on my system it fails roughly at segment 122000 when
memory available for mbufs run out.
The relevant local variables for rte_vhost_dequeue_burst() when breaking
on the condition desc->len == 0:
vb_avail = 4294967284 (0xfffffff4)
seg_avail = 2608
vb_offset = 12
cpy_len = 2608
seg_num = 1
desc = 0x2aadb6e5c000
vb_addr = 46928960159744
entry_success = 0
Note also that there is no crash despite to the desc->addr being zero,
it is a valid address in the regions mapped to the device. Although, the
3 regions mapped does not seem to be correct either at this stage.
The versions that I'm running are OVS 2.4.0, with corrections from the
2.4 branch, and DPDK 2.1.0. QEMU emulator version 2.2.0 and
libvirt version 1.2.12.
Regards,
Patrik
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [dpdk-dev] vhost: no protection against malformed queue descriptors in rte_vhost_dequeue_burst()
2016-03-16 12:52 [dpdk-dev] vhost: no protection against malformed queue descriptors in rte_vhost_dequeue_burst() Patrik Andersson R
@ 2016-03-17 1:35 ` Xie, Huawei
2016-03-17 6:37 ` Patrik Andersson R
0 siblings, 1 reply; 3+ messages in thread
From: Xie, Huawei @ 2016-03-17 1:35 UTC (permalink / raw)
To: Patrik Andersson R, dev
On 3/16/2016 8:53 PM, Patrik Andersson R wrote:
> Hello,
>
> When taking a snapshot of a running VM instance, using OpenStack
> "nova image-create", I noticed that one OVS pmd-thread eventually
> failed in DPDK rte_vhost_dequeue_burst() with repeating log entries:
>
> compute-0-6 ovs-vswitchd[38172]: VHOST_DATA: Failed to allocate
> memory for mbuf.
>
>
> Debugging (data included further down) this issue lead to the
> observation that there is no protection against malformed vhost
> queue descriptors, thus tenant separation might be violated as a
> single faulty VM might bring down the connectivity of all VMs
> connected to the same virtual switch.
>
> To avoid this, validation would be needed at some points in the
> rte_vhost_dequeue_burst() code:
>
> 1) when the queue descriptor is picked up for processing,
> desc->flags and desc->len might both be 0
>
> ...
> desc = &vq->desc[head[entry_success]];
> ...
> /* Discard first buffer as it is the virtio header */
> if (desc->flags & VRING_DESC_F_NEXT) {
> desc = &vq->desc[desc->next];
> vb_offset = 0;
> vb_avail = desc->len;
> } else {
> vb_offset = vq->vhost_hlen;
> vb_avail = desc->len - vb_offset;
> }
> ....
>
> 2) at buffer address translation gpa_to_vva(), might fail
> returning NULL as indication
>
> vb_addr = gpa_to_vva(dev, desc->addr);
> ...
> while (cpy_len != 0) {
> rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, seg_offset),
> (void *)((uintptr_t)(vb_addr + vb_offset)),
> cpy_len);
> ...
> }
> ...
>
>
> Wondering if there are any plans of adding any kind of validation in
> DPDK, or if it would be useful to suggest specific implementation of
> such validations in the DPDK code?
>
> Or is there some mechanism that gives us the confidence to trust
> the vhost queue content absolutely?
>
>
>
> Debugging data:
>
> For my scenario the problem occurs in DPDK rte_vhost_dequeue_burst()
> due to use of a vhost queue descriptor that has all fields 0:
>
> (gdb) print *desc
> {addr = 0, len = 0, flags = 0, next = 0}
>
>
> Subsequent use of desc->len to compute vb_avail = desc->len - vb_offset,
> leads to the problem observed. What happens is that the packet needs to
> be segmented -- on my system it fails roughly at segment 122000 when
> memory available for mbufs run out.
>
> The relevant local variables for rte_vhost_dequeue_burst() when breaking
> on the condition desc->len == 0:
>
> vb_avail = 4294967284 (0xfffffff4)
> seg_avail = 2608
> vb_offset = 12
> cpy_len = 2608
> seg_num = 1
> desc = 0x2aadb6e5c000
> vb_addr = 46928960159744
> entry_success = 0
>
> Note also that there is no crash despite to the desc->addr being zero,
> it is a valid address in the regions mapped to the device. Although, the
> 3 regions mapped does not seem to be correct either at this stage.
>
>
> The versions that I'm running are OVS 2.4.0, with corrections from the
> 2.4 branch, and DPDK 2.1.0. QEMU emulator version 2.2.0 and
> libvirt version 1.2.12.
>
>
> Regards,
>
> Patrik
Thanks Patrik. You are right. We had planned to enhance the robustness
of vhost so that neither malicious nor buggy guest virtio driver could
corrupt vhost. Actually the 16.04 RC1 has fixed some issues (the return
of gpa_to_vva isn't checked).
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [dpdk-dev] vhost: no protection against malformed queue descriptors in rte_vhost_dequeue_burst()
2016-03-17 1:35 ` Xie, Huawei
@ 2016-03-17 6:37 ` Patrik Andersson R
0 siblings, 0 replies; 3+ messages in thread
From: Patrik Andersson R @ 2016-03-17 6:37 UTC (permalink / raw)
To: Xie, Huawei, dev
Hi Huawei,
thank you for the quick response and for the pointer to the 16.04-rc1
version. Nice!
I think it would be great also to have a sanity check on the gpa_to_vva().
Although nothing recent has hit it we had some problems in that area
in the past.
Regards,
Patrik
On 03/17/2016 02:35 AM, Xie, Huawei wrote:
> On 3/16/2016 8:53 PM, Patrik Andersson R wrote:
>> Hello,
>>
>> When taking a snapshot of a running VM instance, using OpenStack
>> "nova image-create", I noticed that one OVS pmd-thread eventually
>> failed in DPDK rte_vhost_dequeue_burst() with repeating log entries:
>>
>> compute-0-6 ovs-vswitchd[38172]: VHOST_DATA: Failed to allocate
>> memory for mbuf.
>>
>>
>> Debugging (data included further down) this issue lead to the
>> observation that there is no protection against malformed vhost
>> queue descriptors, thus tenant separation might be violated as a
>> single faulty VM might bring down the connectivity of all VMs
>> connected to the same virtual switch.
>>
>> To avoid this, validation would be needed at some points in the
>> rte_vhost_dequeue_burst() code:
>>
>> 1) when the queue descriptor is picked up for processing,
>> desc->flags and desc->len might both be 0
>>
>> ...
>> desc = &vq->desc[head[entry_success]];
>> ...
>> /* Discard first buffer as it is the virtio header */
>> if (desc->flags & VRING_DESC_F_NEXT) {
>> desc = &vq->desc[desc->next];
>> vb_offset = 0;
>> vb_avail = desc->len;
>> } else {
>> vb_offset = vq->vhost_hlen;
>> vb_avail = desc->len - vb_offset;
>> }
>> ....
>>
>> 2) at buffer address translation gpa_to_vva(), might fail
>> returning NULL as indication
>>
>> vb_addr = gpa_to_vva(dev, desc->addr);
>> ...
>> while (cpy_len != 0) {
>> rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, seg_offset),
>> (void *)((uintptr_t)(vb_addr + vb_offset)),
>> cpy_len);
>> ...
>> }
>> ...
>>
>>
>> Wondering if there are any plans of adding any kind of validation in
>> DPDK, or if it would be useful to suggest specific implementation of
>> such validations in the DPDK code?
>>
>> Or is there some mechanism that gives us the confidence to trust
>> the vhost queue content absolutely?
>>
>>
>>
>> Debugging data:
>>
>> For my scenario the problem occurs in DPDK rte_vhost_dequeue_burst()
>> due to use of a vhost queue descriptor that has all fields 0:
>>
>> (gdb) print *desc
>> {addr = 0, len = 0, flags = 0, next = 0}
>>
>>
>> Subsequent use of desc->len to compute vb_avail = desc->len - vb_offset,
>> leads to the problem observed. What happens is that the packet needs to
>> be segmented -- on my system it fails roughly at segment 122000 when
>> memory available for mbufs run out.
>>
>> The relevant local variables for rte_vhost_dequeue_burst() when breaking
>> on the condition desc->len == 0:
>>
>> vb_avail = 4294967284 (0xfffffff4)
>> seg_avail = 2608
>> vb_offset = 12
>> cpy_len = 2608
>> seg_num = 1
>> desc = 0x2aadb6e5c000
>> vb_addr = 46928960159744
>> entry_success = 0
>>
>> Note also that there is no crash despite to the desc->addr being zero,
>> it is a valid address in the regions mapped to the device. Although, the
>> 3 regions mapped does not seem to be correct either at this stage.
>>
>>
>> The versions that I'm running are OVS 2.4.0, with corrections from the
>> 2.4 branch, and DPDK 2.1.0. QEMU emulator version 2.2.0 and
>> libvirt version 1.2.12.
>>
>>
>> Regards,
>>
>> Patrik
> Thanks Patrik. You are right. We had planned to enhance the robustness
> of vhost so that neither malicious nor buggy guest virtio driver could
> corrupt vhost. Actually the 16.04 RC1 has fixed some issues (the return
> of gpa_to_vva isn't checked).
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-03-17 6:37 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-16 12:52 [dpdk-dev] vhost: no protection against malformed queue descriptors in rte_vhost_dequeue_burst() Patrik Andersson R
2016-03-17 1:35 ` Xie, Huawei
2016-03-17 6:37 ` Patrik Andersson R
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).