From: Patrik Andersson R <patrik.r.andersson@ericsson.com>
To: "Xie, Huawei" <huawei.xie@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] vhost: no protection against malformed queue descriptors in rte_vhost_dequeue_burst()
Date: Thu, 17 Mar 2016 07:37:16 +0100 [thread overview]
Message-ID: <56EA509C.8080905@ericsson.com> (raw)
In-Reply-To: <C37D651A908B024F974696C65296B57B4C672777@SHSMSX101.ccr.corp.intel.com>
Hi Huawei,
thank you for the quick response and for the pointer to the 16.04-rc1
version. Nice!
I think it would be great also to have a sanity check on the gpa_to_vva().
Although nothing recent has hit it we had some problems in that area
in the past.
Regards,
Patrik
On 03/17/2016 02:35 AM, Xie, Huawei wrote:
> On 3/16/2016 8:53 PM, Patrik Andersson R wrote:
>> Hello,
>>
>> When taking a snapshot of a running VM instance, using OpenStack
>> "nova image-create", I noticed that one OVS pmd-thread eventually
>> failed in DPDK rte_vhost_dequeue_burst() with repeating log entries:
>>
>> compute-0-6 ovs-vswitchd[38172]: VHOST_DATA: Failed to allocate
>> memory for mbuf.
>>
>>
>> Debugging (data included further down) this issue lead to the
>> observation that there is no protection against malformed vhost
>> queue descriptors, thus tenant separation might be violated as a
>> single faulty VM might bring down the connectivity of all VMs
>> connected to the same virtual switch.
>>
>> To avoid this, validation would be needed at some points in the
>> rte_vhost_dequeue_burst() code:
>>
>> 1) when the queue descriptor is picked up for processing,
>> desc->flags and desc->len might both be 0
>>
>> ...
>> desc = &vq->desc[head[entry_success]];
>> ...
>> /* Discard first buffer as it is the virtio header */
>> if (desc->flags & VRING_DESC_F_NEXT) {
>> desc = &vq->desc[desc->next];
>> vb_offset = 0;
>> vb_avail = desc->len;
>> } else {
>> vb_offset = vq->vhost_hlen;
>> vb_avail = desc->len - vb_offset;
>> }
>> ....
>>
>> 2) at buffer address translation gpa_to_vva(), might fail
>> returning NULL as indication
>>
>> vb_addr = gpa_to_vva(dev, desc->addr);
>> ...
>> while (cpy_len != 0) {
>> rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, seg_offset),
>> (void *)((uintptr_t)(vb_addr + vb_offset)),
>> cpy_len);
>> ...
>> }
>> ...
>>
>>
>> Wondering if there are any plans of adding any kind of validation in
>> DPDK, or if it would be useful to suggest specific implementation of
>> such validations in the DPDK code?
>>
>> Or is there some mechanism that gives us the confidence to trust
>> the vhost queue content absolutely?
>>
>>
>>
>> Debugging data:
>>
>> For my scenario the problem occurs in DPDK rte_vhost_dequeue_burst()
>> due to use of a vhost queue descriptor that has all fields 0:
>>
>> (gdb) print *desc
>> {addr = 0, len = 0, flags = 0, next = 0}
>>
>>
>> Subsequent use of desc->len to compute vb_avail = desc->len - vb_offset,
>> leads to the problem observed. What happens is that the packet needs to
>> be segmented -- on my system it fails roughly at segment 122000 when
>> memory available for mbufs run out.
>>
>> The relevant local variables for rte_vhost_dequeue_burst() when breaking
>> on the condition desc->len == 0:
>>
>> vb_avail = 4294967284 (0xfffffff4)
>> seg_avail = 2608
>> vb_offset = 12
>> cpy_len = 2608
>> seg_num = 1
>> desc = 0x2aadb6e5c000
>> vb_addr = 46928960159744
>> entry_success = 0
>>
>> Note also that there is no crash despite to the desc->addr being zero,
>> it is a valid address in the regions mapped to the device. Although, the
>> 3 regions mapped does not seem to be correct either at this stage.
>>
>>
>> The versions that I'm running are OVS 2.4.0, with corrections from the
>> 2.4 branch, and DPDK 2.1.0. QEMU emulator version 2.2.0 and
>> libvirt version 1.2.12.
>>
>>
>> Regards,
>>
>> Patrik
> Thanks Patrik. You are right. We had planned to enhance the robustness
> of vhost so that neither malicious nor buggy guest virtio driver could
> corrupt vhost. Actually the 16.04 RC1 has fixed some issues (the return
> of gpa_to_vva isn't checked).
>
prev parent reply other threads:[~2016-03-17 6:37 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-16 12:52 Patrik Andersson R
2016-03-17 1:35 ` Xie, Huawei
2016-03-17 6:37 ` Patrik Andersson R [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56EA509C.8080905@ericsson.com \
--to=patrik.r.andersson@ericsson.com \
--cc=dev@dpdk.org \
--cc=huawei.xie@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).