patches for DPDK stable branches
 help / color / mirror / Atom feed
* [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes
@ 2018-04-23 16:00 Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 01/11] vhost: fix indirect descriptors table translation size Maxime Coquelin
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This series fixes the security vulnerability referenced
as CVE-2018-1059.

Patches are already applied to the branch, but reviews
are encouraged. Any issues spotted would be fixed on top.


Maxime Coquelin (11):
  vhost: fix indirect descriptors table translation size
  vhost: check all range is mapped when translating GPAs
  vhost: introduce safe API for GPA translation
  vhost: ensure all range is mapped when translating QVAs
  vhost: add support for non-contiguous indirect descs tables
  vhost: handle virtually non-contiguous buffers in Tx
  vhost: handle virtually non-contiguous buffers in Rx
  vhost: handle virtually non-contiguous buffers in Rx-mrg
  examples/vhost: move to safe GPA translation API
  examples/vhost_scsi: move to safe GPA translation API
  vhost: deprecate unsafe GPA translation API

 examples/vhost/virtio_net.c            |  94 +++++++-
 examples/vhost_scsi/vhost_scsi.c       |  56 ++++-
 lib/librte_vhost/rte_vhost.h           |  46 ++++
 lib/librte_vhost/rte_vhost_version.map |   6 +
 lib/librte_vhost/vhost.c               |  39 ++--
 lib/librte_vhost/vhost.h               |   8 +-
 lib/librte_vhost/vhost_user.c          |  58 +++--
 lib/librte_vhost/virtio_net.c          | 411 ++++++++++++++++++++++++++++-----
 8 files changed, 602 insertions(+), 116 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 01/11] vhost: fix indirect descriptors table translation size
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 02/11] vhost: check all range is mapped when translating GPAs Maxime Coquelin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This patch fixes the size passed at the indirect descriptor
table translation time, which is the len field of the descriptor,
and not a single descriptor.

This issue has been assigned CVE-2018-1059.

Fixes: 62fdb8255ae7 ("vhost: use the guest IOVA to host VA helper")

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/virtio_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index d34703074..cb1d0cfc4 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1328,7 +1328,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 			desc = (struct vring_desc *)(uintptr_t)
 				vhost_iova_to_vva(dev, vq,
 						vq->desc[desc_indexes[i]].addr,
-						sizeof(*desc),
+						vq->desc[desc_indexes[i]].len,
 						VHOST_ACCESS_RO);
 			if (unlikely(!desc))
 				break;
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 02/11] vhost: check all range is mapped when translating GPAs
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 01/11] vhost: fix indirect descriptors table translation size Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 03/11] vhost: introduce safe API for GPA translation Maxime Coquelin
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

There is currently no check done on the length when translating
guest addresses into host virtual addresses. Also, there is no
guanrantee that the guest addresses range is contiguous in
the host virtual address space.

This patch prepares vhost_iova_to_vva() and its callers to
return and check the mapped size. If the mapped size is smaller
than the requested size, the caller handle it as an error.

This issue has been assigned CVE-2018-1059.

Reported-by: Yongji Xie <xieyongji@baidu.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost.c      | 39 +++++++++++++++-----------
 lib/librte_vhost/vhost.h      |  6 ++--
 lib/librte_vhost/virtio_net.c | 64 +++++++++++++++++++++++++++----------------
 3 files changed, 67 insertions(+), 42 deletions(-)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 51ea720a3..a8ed40b1f 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -58,17 +58,17 @@ struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
 /* Called with iotlb_lock read-locked */
 uint64_t
 __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
-		    uint64_t iova, uint64_t size, uint8_t perm)
+		    uint64_t iova, uint64_t *size, uint8_t perm)
 {
 	uint64_t vva, tmp_size;
 
-	if (unlikely(!size))
+	if (unlikely(!*size))
 		return 0;
 
-	tmp_size = size;
+	tmp_size = *size;
 
 	vva = vhost_user_iotlb_cache_find(vq, iova, &tmp_size, perm);
-	if (tmp_size == size)
+	if (tmp_size == *size)
 		return vva;
 
 	iova += tmp_size;
@@ -158,32 +158,39 @@ free_device(struct virtio_net *dev)
 int
 vring_translate(struct virtio_net *dev, struct vhost_virtqueue *vq)
 {
-	uint64_t size;
+	uint64_t req_size, size;
 
 	if (!(dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)))
 		goto out;
 
-	size = sizeof(struct vring_desc) * vq->size;
+	req_size = sizeof(struct vring_desc) * vq->size;
+	size = req_size;
 	vq->desc = (struct vring_desc *)(uintptr_t)vhost_iova_to_vva(dev, vq,
 						vq->ring_addrs.desc_user_addr,
-						size, VHOST_ACCESS_RW);
-	if (!vq->desc)
+						&size, VHOST_ACCESS_RW);
+	if (!vq->desc || size != req_size)
 		return -1;
 
-	size = sizeof(struct vring_avail);
-	size += sizeof(uint16_t) * vq->size;
+	req_size = sizeof(struct vring_avail);
+	req_size += sizeof(uint16_t) * vq->size;
+	if (dev->features & (1ULL << VIRTIO_RING_F_EVENT_IDX))
+		req_size += sizeof(uint16_t);
+	size = req_size;
 	vq->avail = (struct vring_avail *)(uintptr_t)vhost_iova_to_vva(dev, vq,
 						vq->ring_addrs.avail_user_addr,
-						size, VHOST_ACCESS_RW);
-	if (!vq->avail)
+						&size, VHOST_ACCESS_RW);
+	if (!vq->avail || size != req_size)
 		return -1;
 
-	size = sizeof(struct vring_used);
-	size += sizeof(struct vring_used_elem) * vq->size;
+	req_size = sizeof(struct vring_used);
+	req_size += sizeof(struct vring_used_elem) * vq->size;
+	if (dev->features & (1ULL << VIRTIO_RING_F_EVENT_IDX))
+		req_size += sizeof(uint16_t);
+	size = req_size;
 	vq->used = (struct vring_used *)(uintptr_t)vhost_iova_to_vva(dev, vq,
 						vq->ring_addrs.used_user_addr,
-						size, VHOST_ACCESS_RW);
-	if (!vq->used)
+						&size, VHOST_ACCESS_RW);
+	if (!vq->used || size != req_size)
 		return -1;
 
 out:
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index c8f2a8176..de300c107 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -381,18 +381,18 @@ struct vhost_device_ops const *vhost_driver_callback_get(const char *path);
 void vhost_backend_cleanup(struct virtio_net *dev);
 
 uint64_t __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
-			uint64_t iova, uint64_t size, uint8_t perm);
+			uint64_t iova, uint64_t *len, uint8_t perm);
 int vring_translate(struct virtio_net *dev, struct vhost_virtqueue *vq);
 void vring_invalidate(struct virtio_net *dev, struct vhost_virtqueue *vq);
 
 static __rte_always_inline uint64_t
 vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
-			uint64_t iova, uint64_t size, uint8_t perm)
+			uint64_t iova, uint64_t *len, uint8_t perm)
 {
 	if (!(dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)))
 		return rte_vhost_gpa_to_vva(dev->mem, iova);
 
-	return __vhost_iova_to_vva(dev, vq, iova, size, perm);
+	return __vhost_iova_to_vva(dev, vq, iova, len, perm);
 }
 
 #endif /* _VHOST_NET_CDEV_H_ */
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index cb1d0cfc4..79bac590d 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -204,6 +204,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	uint32_t desc_avail, desc_offset;
 	uint32_t mbuf_avail, mbuf_offset;
 	uint32_t cpy_len;
+	uint64_t dlen;
 	struct vring_desc *desc;
 	uint64_t desc_addr;
 	/* A counter to avoid desc dead loop chain */
@@ -213,14 +214,16 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	int error = 0;
 
 	desc = &descs[desc_idx];
+	dlen = desc->len;
 	desc_addr = vhost_iova_to_vva(dev, vq, desc->addr,
-					desc->len, VHOST_ACCESS_RW);
+					&dlen, VHOST_ACCESS_RW);
 	/*
 	 * Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid
 	 * performance issue with some versions of gcc (4.8.4 and 5.3.0) which
 	 * otherwise stores offset on the stack instead of in a register.
 	 */
-	if (unlikely(desc->len < dev->vhost_hlen) || !desc_addr) {
+	if (unlikely(dlen != desc->len || desc->len < dev->vhost_hlen) ||
+			!desc_addr) {
 		error = -1;
 		goto out;
 	}
@@ -258,10 +261,11 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			}
 
 			desc = &descs[desc->next];
+			dlen = desc->len;
 			desc_addr = vhost_iova_to_vva(dev, vq, desc->addr,
-							desc->len,
+							&dlen,
 							VHOST_ACCESS_RW);
-			if (unlikely(!desc_addr)) {
+			if (unlikely(!desc_addr || dlen != desc->len)) {
 				error = -1;
 				goto out;
 			}
@@ -375,12 +379,13 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 		int err;
 
 		if (vq->desc[desc_idx].flags & VRING_DESC_F_INDIRECT) {
+			uint64_t dlen = vq->desc[desc_idx].len;
 			descs = (struct vring_desc *)(uintptr_t)
 				vhost_iova_to_vva(dev,
 						vq, vq->desc[desc_idx].addr,
-						vq->desc[desc_idx].len,
-						VHOST_ACCESS_RO);
-			if (unlikely(!descs)) {
+						&dlen, VHOST_ACCESS_RO);
+			if (unlikely(!descs ||
+					dlen != vq->desc[desc_idx].len)) {
 				count = i;
 				break;
 			}
@@ -438,16 +443,18 @@ fill_vec_buf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	uint16_t idx = vq->avail->ring[avail_idx & (vq->size - 1)];
 	uint32_t vec_id = *vec_idx;
 	uint32_t len    = 0;
+	uint64_t dlen;
 	struct vring_desc *descs = vq->desc;
 
 	*desc_chain_head = idx;
 
 	if (vq->desc[idx].flags & VRING_DESC_F_INDIRECT) {
+		dlen = vq->desc[idx].len;
 		descs = (struct vring_desc *)(uintptr_t)
 			vhost_iova_to_vva(dev, vq, vq->desc[idx].addr,
-						vq->desc[idx].len,
+						&dlen,
 						VHOST_ACCESS_RO);
-		if (unlikely(!descs))
+		if (unlikely(!descs || dlen != vq->desc[idx].len))
 			return -1;
 
 		idx = 0;
@@ -530,6 +537,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	uint32_t mbuf_offset, mbuf_avail;
 	uint32_t desc_offset, desc_avail;
 	uint32_t cpy_len;
+	uint64_t dlen;
 	uint64_t hdr_addr, hdr_phys_addr;
 	struct rte_mbuf *hdr_mbuf;
 	struct batch_copy_elem *batch_copy = vq->batch_copy_elems;
@@ -541,10 +549,12 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		goto out;
 	}
 
+	dlen = buf_vec[vec_idx].buf_len;
 	desc_addr = vhost_iova_to_vva(dev, vq, buf_vec[vec_idx].buf_addr,
-						buf_vec[vec_idx].buf_len,
-						VHOST_ACCESS_RW);
-	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr) {
+						&dlen, VHOST_ACCESS_RW);
+	if (dlen != buf_vec[vec_idx].buf_len ||
+			buf_vec[vec_idx].buf_len < dev->vhost_hlen ||
+			!desc_addr) {
 		error = -1;
 		goto out;
 	}
@@ -566,12 +576,14 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		/* done with current desc buf, get the next one */
 		if (desc_avail == 0) {
 			vec_idx++;
+			dlen = buf_vec[vec_idx].buf_len;
 			desc_addr =
 				vhost_iova_to_vva(dev, vq,
 					buf_vec[vec_idx].buf_addr,
-					buf_vec[vec_idx].buf_len,
+					&dlen,
 					VHOST_ACCESS_RW);
-			if (unlikely(!desc_addr)) {
+			if (unlikely(!desc_addr ||
+					dlen != buf_vec[vec_idx].buf_len)) {
 				error = -1;
 				goto out;
 			}
@@ -911,6 +923,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	uint32_t desc_avail, desc_offset;
 	uint32_t mbuf_avail, mbuf_offset;
 	uint32_t cpy_len;
+	uint64_t dlen;
 	struct rte_mbuf *cur = m, *prev = m;
 	struct virtio_net_hdr *hdr = NULL;
 	/* A counter to avoid desc dead loop chain */
@@ -926,11 +939,12 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		goto out;
 	}
 
+	dlen = desc->len;
 	desc_addr = vhost_iova_to_vva(dev,
 					vq, desc->addr,
-					desc->len,
+					&dlen,
 					VHOST_ACCESS_RO);
-	if (unlikely(!desc_addr)) {
+	if (unlikely(!desc_addr || dlen != desc->len)) {
 		error = -1;
 		goto out;
 	}
@@ -953,11 +967,12 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			goto out;
 		}
 
+		dlen = desc->len;
 		desc_addr = vhost_iova_to_vva(dev,
 							vq, desc->addr,
-							desc->len,
+							&dlen,
 							VHOST_ACCESS_RO);
-		if (unlikely(!desc_addr)) {
+		if (unlikely(!desc_addr || dlen != desc->len)) {
 			error = -1;
 			goto out;
 		}
@@ -1041,11 +1056,11 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				goto out;
 			}
 
+			dlen = desc->len;
 			desc_addr = vhost_iova_to_vva(dev,
 							vq, desc->addr,
-							desc->len,
-							VHOST_ACCESS_RO);
-			if (unlikely(!desc_addr)) {
+							&dlen, VHOST_ACCESS_RO);
+			if (unlikely(!desc_addr || dlen != desc->len)) {
 				error = -1;
 				goto out;
 			}
@@ -1319,18 +1334,21 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	for (i = 0; i < count; i++) {
 		struct vring_desc *desc;
 		uint16_t sz, idx;
+		uint64_t dlen;
 		int err;
 
 		if (likely(i + 1 < count))
 			rte_prefetch0(&vq->desc[desc_indexes[i + 1]]);
 
 		if (vq->desc[desc_indexes[i]].flags & VRING_DESC_F_INDIRECT) {
+			dlen = vq->desc[desc_indexes[i]].len;
 			desc = (struct vring_desc *)(uintptr_t)
 				vhost_iova_to_vva(dev, vq,
 						vq->desc[desc_indexes[i]].addr,
-						vq->desc[desc_indexes[i]].len,
+						&dlen,
 						VHOST_ACCESS_RO);
-			if (unlikely(!desc))
+			if (unlikely(!desc ||
+					dlen != vq->desc[desc_indexes[i]].len))
 				break;
 
 			rte_prefetch0(desc);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 03/11] vhost: introduce safe API for GPA translation
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 01/11] vhost: fix indirect descriptors table translation size Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 02/11] vhost: check all range is mapped when translating GPAs Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 04/11] vhost: ensure all range is mapped when translating QVAs Maxime Coquelin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This new rte_vhost_va_from_guest_pa API takes an extra len
parameter, used to specify the size of the range to be mapped.
Effective mapped range is returned via len parameter.

This issue has been assigned CVE-2018-1059.

Reported-by: Yongji Xie <xieyongji@baidu.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/rte_vhost.h           | 40 ++++++++++++++++++++++++++++++++++
 lib/librte_vhost/rte_vhost_version.map |  6 +++++
 lib/librte_vhost/vhost.h               |  2 +-
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index f65364495..f2d6c95e5 100644
--- a/lib/librte_vhost/rte_vhost.h
+++ b/lib/librte_vhost/rte_vhost.h
@@ -142,6 +142,46 @@ rte_vhost_gpa_to_vva(struct rte_vhost_memory *mem, uint64_t gpa)
 	return 0;
 }
 
+/**
+ * Convert guest physical address to host virtual address safely
+ *
+ * This variant of rte_vhost_gpa_to_vva() takes care all the
+ * requested length is mapped and contiguous in process address
+ * space.
+ *
+ * @param mem
+ *  the guest memory regions
+ * @param gpa
+ *  the guest physical address for querying
+ * @param len
+ *  the size of the requested area to map, updated with actual size mapped
+ * @return
+ *  the host virtual address on success, 0 on failure
+ */
+static __rte_always_inline uint64_t
+rte_vhost_va_from_guest_pa(struct rte_vhost_memory *mem,
+						   uint64_t gpa, uint64_t *len)
+{
+	struct rte_vhost_mem_region *r;
+	uint32_t i;
+
+	for (i = 0; i < mem->nregions; i++) {
+		r = &mem->regions[i];
+		if (gpa >= r->guest_phys_addr &&
+		    gpa <  r->guest_phys_addr + r->size) {
+
+			if (unlikely(*len > r->guest_phys_addr + r->size - gpa))
+				*len = r->guest_phys_addr + r->size - gpa;
+
+			return gpa - r->guest_phys_addr +
+			       r->host_user_addr;
+		}
+	}
+	*len = 0;
+
+	return 0;
+}
+
 #define RTE_VHOST_NEED_LOG(features)	((features) & (1ULL << VHOST_F_LOG_ALL))
 
 /**
diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
index 1e7049535..9cb1d8ca6 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -52,3 +52,9 @@ DPDK_17.08 {
 	rte_vhost_rx_queue_count;
 
 } DPDK_17.05;
+
+DPDK_17.11.2 {
+	global;
+
+	rte_vhost_va_from_guest_pa;
+} DPDK_17.08;
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index de300c107..16d6b8913 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -390,7 +390,7 @@ vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			uint64_t iova, uint64_t *len, uint8_t perm)
 {
 	if (!(dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)))
-		return rte_vhost_gpa_to_vva(dev->mem, iova);
+		return rte_vhost_va_from_guest_pa(dev->mem, iova, len);
 
 	return __vhost_iova_to_vva(dev, vq, iova, len, perm);
 }
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 04/11] vhost: ensure all range is mapped when translating QVAs
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
                   ` (2 preceding siblings ...)
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 03/11] vhost: introduce safe API for GPA translation Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 05/11] vhost: add support for non-contiguous indirect descs tables Maxime Coquelin
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This patch ensures that all the address range is mapped when
translating addresses from master's addresses (e.g. QEMU host
addressess) to process VAs.

This issue has been assigned CVE-2018-1059.

Reported-by: Yongji Xie <xieyongji@baidu.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost_user.c | 58 +++++++++++++++++++++++++++----------------
 1 file changed, 36 insertions(+), 22 deletions(-)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 3acaacf58..50e654db1 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -329,21 +329,26 @@ numa_realloc(struct virtio_net *dev, int index __rte_unused)
 
 /* Converts QEMU virtual address to Vhost virtual address. */
 static uint64_t
-qva_to_vva(struct virtio_net *dev, uint64_t qva)
+qva_to_vva(struct virtio_net *dev, uint64_t qva, uint64_t *len)
 {
-	struct rte_vhost_mem_region *reg;
+	struct rte_vhost_mem_region *r;
 	uint32_t i;
 
 	/* Find the region where the address lives. */
 	for (i = 0; i < dev->mem->nregions; i++) {
-		reg = &dev->mem->regions[i];
+		r = &dev->mem->regions[i];
+
+		if (qva >= r->guest_user_addr &&
+		    qva <  r->guest_user_addr + r->size) {
+
+			if (unlikely(*len > r->guest_user_addr + r->size - qva))
+				*len = r->guest_user_addr + r->size - qva;
 
-		if (qva >= reg->guest_user_addr &&
-		    qva <  reg->guest_user_addr + reg->size) {
-			return qva - reg->guest_user_addr +
-			       reg->host_user_addr;
+			return qva - r->guest_user_addr +
+			       r->host_user_addr;
 		}
 	}
+	*len = 0;
 
 	return 0;
 }
@@ -356,20 +361,20 @@ qva_to_vva(struct virtio_net *dev, uint64_t qva)
  */
 static uint64_t
 ring_addr_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
-		uint64_t ra, uint64_t size)
+		uint64_t ra, uint64_t *size)
 {
 	if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) {
 		uint64_t vva;
 
 		vva = vhost_user_iotlb_cache_find(vq, ra,
-					&size, VHOST_ACCESS_RW);
+					size, VHOST_ACCESS_RW);
 		if (!vva)
 			vhost_user_iotlb_miss(dev, ra, VHOST_ACCESS_RW);
 
 		return vva;
 	}
 
-	return qva_to_vva(dev, ra);
+	return qva_to_vva(dev, ra, size);
 }
 
 static struct virtio_net *
@@ -377,16 +382,18 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
 {
 	struct vhost_virtqueue *vq = dev->virtqueue[vq_index];
 	struct vhost_vring_addr *addr = &vq->ring_addrs;
+	uint64_t len;
 
 	/* The addresses are converted from QEMU virtual to Vhost virtual. */
 	if (vq->desc && vq->avail && vq->used)
 		return dev;
 
+	len = sizeof(struct vring_desc) * vq->size;
 	vq->desc = (struct vring_desc *)(uintptr_t)ring_addr_to_vva(dev,
-			vq, addr->desc_user_addr, sizeof(struct vring_desc));
-	if (vq->desc == 0) {
+			vq, addr->desc_user_addr, &len);
+	if (vq->desc == 0 || len != sizeof(struct vring_desc) * vq->size) {
 		RTE_LOG(DEBUG, VHOST_CONFIG,
-			"(%d) failed to find desc ring address.\n",
+			"(%d) failed to map desc ring.\n",
 			dev->vid);
 		return dev;
 	}
@@ -395,20 +402,26 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
 	vq = dev->virtqueue[vq_index];
 	addr = &vq->ring_addrs;
 
+	len = sizeof(struct vring_avail) + sizeof(uint16_t) * vq->size;
 	vq->avail = (struct vring_avail *)(uintptr_t)ring_addr_to_vva(dev,
-			vq, addr->avail_user_addr, sizeof(struct vring_avail));
-	if (vq->avail == 0) {
+			vq, addr->avail_user_addr, &len);
+	if (vq->avail == 0 ||
+			len != sizeof(struct vring_avail) +
+			sizeof(uint16_t) * vq->size) {
 		RTE_LOG(DEBUG, VHOST_CONFIG,
-			"(%d) failed to find avail ring address.\n",
+			"(%d) failed to map avail ring.\n",
 			dev->vid);
 		return dev;
 	}
 
+	len = sizeof(struct vring_used) +
+		sizeof(struct vring_used_elem) * vq->size;
 	vq->used = (struct vring_used *)(uintptr_t)ring_addr_to_vva(dev,
-			vq, addr->used_user_addr, sizeof(struct vring_used));
-	if (vq->used == 0) {
+			vq, addr->used_user_addr, &len);
+	if (vq->used == 0 || len != sizeof(struct vring_used) +
+			sizeof(struct vring_used_elem) * vq->size) {
 		RTE_LOG(DEBUG, VHOST_CONFIG,
-			"(%d) failed to find used ring address.\n",
+			"(%d) failed to map used ring.\n",
 			dev->vid);
 		return dev;
 	}
@@ -1094,11 +1107,12 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, struct VhostUserMsg *msg)
 	struct virtio_net *dev = *pdev;
 	struct vhost_iotlb_msg *imsg = &msg->payload.iotlb;
 	uint16_t i;
-	uint64_t vva;
+	uint64_t vva, len;
 
 	switch (imsg->type) {
 	case VHOST_IOTLB_UPDATE:
-		vva = qva_to_vva(dev, imsg->uaddr);
+		len = imsg->size;
+		vva = qva_to_vva(dev, imsg->uaddr, &len);
 		if (!vva)
 			return -1;
 
@@ -1106,7 +1120,7 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, struct VhostUserMsg *msg)
 			struct vhost_virtqueue *vq = dev->virtqueue[i];
 
 			vhost_user_iotlb_cache_insert(vq, imsg->iova, vva,
-					imsg->size, imsg->perm);
+					len, imsg->perm);
 
 			if (is_vring_iotlb_update(vq, imsg))
 				*pdev = dev = translate_ring_addresses(dev, i);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 05/11] vhost: add support for non-contiguous indirect descs tables
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
                   ` (3 preceding siblings ...)
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 04/11] vhost: ensure all range is mapped when translating QVAs Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 06/11] vhost: handle virtually non-contiguous buffers in Tx Maxime Coquelin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This patch adds support for non-contiguous indirect descriptor
tables in VA space.

When it happens, which is unlikely, a table is allocated and the
non-contiguous content is copied into it.

This issue has been assigned CVE-2018-1059.

Reported-by: Yongji Xie <xieyongji@baidu.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/virtio_net.c | 108 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 101 insertions(+), 7 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 79bac590d..13252e6ec 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -45,6 +45,7 @@
 #include <rte_sctp.h>
 #include <rte_arp.h>
 #include <rte_spinlock.h>
+#include <rte_malloc.h>
 
 #include "iotlb.h"
 #include "vhost.h"
@@ -59,6 +60,46 @@ is_valid_virt_queue_idx(uint32_t idx, int is_tx, uint32_t nr_vring)
 	return (is_tx ^ (idx & 1)) == 0 && idx < nr_vring;
 }
 
+static __rte_always_inline struct vring_desc *
+alloc_copy_ind_table(struct virtio_net *dev, struct vhost_virtqueue *vq,
+					 struct vring_desc *desc)
+{
+	struct vring_desc *idesc;
+	uint64_t src, dst;
+	uint64_t len, remain = desc->len;
+	uint64_t desc_addr = desc->addr;
+
+	idesc = rte_malloc(__func__, desc->len, 0);
+	if (unlikely(!idesc))
+		return 0;
+
+	dst = (uint64_t)(uintptr_t)idesc;
+
+	while (remain) {
+		len = remain;
+		src = vhost_iova_to_vva(dev, vq, desc_addr, &len,
+				VHOST_ACCESS_RO);
+		if (unlikely(!src || !len)) {
+			rte_free(idesc);
+			return 0;
+		}
+
+		rte_memcpy((void *)(uintptr_t)dst, (void *)(uintptr_t)src, len);
+
+		remain -= len;
+		dst += len;
+		desc_addr += len;
+	}
+
+	return idesc;
+}
+
+static __rte_always_inline void
+free_ind_table(struct vring_desc *idesc)
+{
+	rte_free(idesc);
+}
+
 static __rte_always_inline void
 do_flush_shadow_used_ring(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			  uint16_t to, uint16_t from, uint16_t size)
@@ -375,6 +416,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 
 	rte_prefetch0(&vq->desc[desc_indexes[0]]);
 	for (i = 0; i < count; i++) {
+		struct vring_desc *idesc = NULL;
 		uint16_t desc_idx = desc_indexes[i];
 		int err;
 
@@ -384,12 +426,24 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 				vhost_iova_to_vva(dev,
 						vq, vq->desc[desc_idx].addr,
 						&dlen, VHOST_ACCESS_RO);
-			if (unlikely(!descs ||
-					dlen != vq->desc[desc_idx].len)) {
+			if (unlikely(!descs)) {
 				count = i;
 				break;
 			}
 
+			if (unlikely(dlen < vq->desc[desc_idx].len)) {
+				/*
+				 * The indirect desc table is not contiguous
+				 * in process VA space, we have to copy it.
+				 */
+				idesc = alloc_copy_ind_table(dev, vq,
+							&vq->desc[desc_idx]);
+				if (unlikely(!idesc))
+					break;
+
+				descs = idesc;
+			}
+
 			desc_idx = 0;
 			sz = vq->desc[desc_idx].len / sizeof(*descs);
 		} else {
@@ -400,11 +454,15 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 		err = copy_mbuf_to_desc(dev, vq, descs, pkts[i], desc_idx, sz);
 		if (unlikely(err)) {
 			count = i;
+			free_ind_table(idesc);
 			break;
 		}
 
 		if (i + 1 < count)
 			rte_prefetch0(&vq->desc[desc_indexes[i+1]]);
+
+		if (unlikely(!!idesc))
+			free_ind_table(idesc);
 	}
 
 	do_data_copy_enqueue(dev, vq);
@@ -445,6 +503,7 @@ fill_vec_buf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	uint32_t len    = 0;
 	uint64_t dlen;
 	struct vring_desc *descs = vq->desc;
+	struct vring_desc *idesc = NULL;
 
 	*desc_chain_head = idx;
 
@@ -454,15 +513,29 @@ fill_vec_buf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			vhost_iova_to_vva(dev, vq, vq->desc[idx].addr,
 						&dlen,
 						VHOST_ACCESS_RO);
-		if (unlikely(!descs || dlen != vq->desc[idx].len))
+		if (unlikely(!descs))
 			return -1;
 
+		if (unlikely(dlen < vq->desc[idx].len)) {
+			/*
+			 * The indirect desc table is not contiguous
+			 * in process VA space, we have to copy it.
+			 */
+			idesc = alloc_copy_ind_table(dev, vq, &vq->desc[idx]);
+			if (unlikely(!idesc))
+				return -1;
+
+			descs = idesc;
+		}
+
 		idx = 0;
 	}
 
 	while (1) {
-		if (unlikely(vec_id >= BUF_VECTOR_MAX || idx >= vq->size))
+		if (unlikely(vec_id >= BUF_VECTOR_MAX || idx >= vq->size)) {
+			free_ind_table(idesc);
 			return -1;
+		}
 
 		len += descs[idx].len;
 		buf_vec[vec_id].buf_addr = descs[idx].addr;
@@ -479,6 +552,9 @@ fill_vec_buf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	*desc_chain_len = len;
 	*vec_idx = vec_id;
 
+	if (unlikely(!!idesc))
+		free_ind_table(idesc);
+
 	return 0;
 }
 
@@ -1332,7 +1408,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	/* Prefetch descriptor index. */
 	rte_prefetch0(&vq->desc[desc_indexes[0]]);
 	for (i = 0; i < count; i++) {
-		struct vring_desc *desc;
+		struct vring_desc *desc, *idesc = NULL;
 		uint16_t sz, idx;
 		uint64_t dlen;
 		int err;
@@ -1347,10 +1423,22 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 						vq->desc[desc_indexes[i]].addr,
 						&dlen,
 						VHOST_ACCESS_RO);
-			if (unlikely(!desc ||
-					dlen != vq->desc[desc_indexes[i]].len))
+			if (unlikely(!desc))
 				break;
 
+			if (unlikely(dlen < vq->desc[desc_indexes[i]].len)) {
+				/*
+				 * The indirect desc table is not contiguous
+				 * in process VA space, we have to copy it.
+				 */
+				idesc = alloc_copy_ind_table(dev, vq,
+						&vq->desc[desc_indexes[i]]);
+				if (unlikely(!idesc))
+					break;
+
+				desc = idesc;
+			}
+
 			rte_prefetch0(desc);
 			sz = vq->desc[desc_indexes[i]].len / sizeof(*desc);
 			idx = 0;
@@ -1364,6 +1452,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 		if (unlikely(pkts[i] == NULL)) {
 			RTE_LOG(ERR, VHOST_DATA,
 				"Failed to allocate memory for mbuf.\n");
+			free_ind_table(idesc);
 			break;
 		}
 
@@ -1371,6 +1460,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 					mbuf_pool);
 		if (unlikely(err)) {
 			rte_pktmbuf_free(pkts[i]);
+			free_ind_table(idesc);
 			break;
 		}
 
@@ -1380,6 +1470,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 			zmbuf = get_zmbuf(vq);
 			if (!zmbuf) {
 				rte_pktmbuf_free(pkts[i]);
+				free_ind_table(idesc);
 				break;
 			}
 			zmbuf->mbuf = pkts[i];
@@ -1396,6 +1487,9 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 			vq->nr_zmbuf += 1;
 			TAILQ_INSERT_TAIL(&vq->zmbuf_list, zmbuf, next);
 		}
+
+		if (unlikely(!!idesc))
+			free_ind_table(idesc);
 	}
 	vq->last_avail_idx += i;
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 06/11] vhost: handle virtually non-contiguous buffers in Tx
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
                   ` (4 preceding siblings ...)
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 05/11] vhost: add support for non-contiguous indirect descs tables Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 07/11] vhost: handle virtually non-contiguous buffers in Rx Maxime Coquelin
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This patch enables the handling of buffers non-contiguous in
process virtual address space in the dequeue path.

When virtio-net header doesn't fit in a single chunck, it is
copied into a local variablei before being processed.

For packet content, the copy length is limited to the chunck
size, next chuncks VAs being fetched afterward.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/virtio_net.c | 117 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 95 insertions(+), 22 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 13252e6ec..47717a16d 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -995,12 +995,13 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		  struct rte_mempool *mbuf_pool)
 {
 	struct vring_desc *desc;
-	uint64_t desc_addr;
+	uint64_t desc_addr, desc_gaddr;
 	uint32_t desc_avail, desc_offset;
 	uint32_t mbuf_avail, mbuf_offset;
 	uint32_t cpy_len;
-	uint64_t dlen;
+	uint64_t desc_chunck_len;
 	struct rte_mbuf *cur = m, *prev = m;
+	struct virtio_net_hdr tmp_hdr;
 	struct virtio_net_hdr *hdr = NULL;
 	/* A counter to avoid desc dead loop chain */
 	uint32_t nr_desc = 1;
@@ -1015,19 +1016,52 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		goto out;
 	}
 
-	dlen = desc->len;
+	desc_chunck_len = desc->len;
+	desc_gaddr = desc->addr;
 	desc_addr = vhost_iova_to_vva(dev,
-					vq, desc->addr,
-					&dlen,
+					vq, desc_gaddr,
+					&desc_chunck_len,
 					VHOST_ACCESS_RO);
-	if (unlikely(!desc_addr || dlen != desc->len)) {
+	if (unlikely(!desc_addr)) {
 		error = -1;
 		goto out;
 	}
 
 	if (virtio_net_with_host_offload(dev)) {
-		hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr);
-		rte_prefetch0(hdr);
+		if (unlikely(desc_chunck_len < sizeof(struct virtio_net_hdr))) {
+			uint64_t len = desc_chunck_len;
+			uint64_t remain = sizeof(struct virtio_net_hdr);
+			uint64_t src = desc_addr;
+			uint64_t dst = (uint64_t)(uintptr_t)&tmp_hdr;
+			uint64_t guest_addr = desc_gaddr;
+
+			/*
+			 * No luck, the virtio-net header doesn't fit
+			 * in a contiguous virtual area.
+			 */
+			while (remain) {
+				len = remain;
+				src = vhost_iova_to_vva(dev, vq,
+						guest_addr, &len,
+						VHOST_ACCESS_RO);
+				if (unlikely(!src || !len)) {
+					error = -1;
+					goto out;
+				}
+
+				rte_memcpy((void *)(uintptr_t)dst,
+						   (void *)(uintptr_t)src, len);
+
+				guest_addr += len;
+				remain -= len;
+				dst += len;
+			}
+
+			hdr = &tmp_hdr;
+		} else {
+			hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr);
+			rte_prefetch0(hdr);
+		}
 	}
 
 	/*
@@ -1043,12 +1077,13 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			goto out;
 		}
 
-		dlen = desc->len;
+		desc_chunck_len = desc->len;
+		desc_gaddr = desc->addr;
 		desc_addr = vhost_iova_to_vva(dev,
-							vq, desc->addr,
-							&dlen,
+							vq, desc_gaddr,
+							&desc_chunck_len,
 							VHOST_ACCESS_RO);
-		if (unlikely(!desc_addr || dlen != desc->len)) {
+		if (unlikely(!desc_addr)) {
 			error = -1;
 			goto out;
 		}
@@ -1058,19 +1093,37 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		nr_desc    += 1;
 	} else {
 		desc_avail  = desc->len - dev->vhost_hlen;
-		desc_offset = dev->vhost_hlen;
+
+		if (unlikely(desc_chunck_len < dev->vhost_hlen)) {
+			desc_chunck_len = desc_avail;
+			desc_gaddr += dev->vhost_hlen;
+			desc_addr = vhost_iova_to_vva(dev,
+					vq, desc_gaddr,
+					&desc_chunck_len,
+					VHOST_ACCESS_RO);
+			if (unlikely(!desc_addr)) {
+				error = -1;
+				goto out;
+			}
+
+			desc_offset = 0;
+		} else {
+			desc_offset = dev->vhost_hlen;
+			desc_chunck_len -= dev->vhost_hlen;
+		}
 	}
 
 	rte_prefetch0((void *)(uintptr_t)(desc_addr + desc_offset));
 
-	PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset), desc_avail, 0);
+	PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset),
+			desc_chunck_len, 0);
 
 	mbuf_offset = 0;
 	mbuf_avail  = m->buf_len - RTE_PKTMBUF_HEADROOM;
 	while (1) {
 		uint64_t hpa;
 
-		cpy_len = RTE_MIN(desc_avail, mbuf_avail);
+		cpy_len = RTE_MIN(desc_chunck_len, mbuf_avail);
 
 		/*
 		 * A desc buf might across two host physical pages that are
@@ -1078,7 +1131,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		 * will be copied even though zero copy is enabled.
 		 */
 		if (unlikely(dev->dequeue_zero_copy && (hpa = gpa_to_hpa(dev,
-					desc->addr + desc_offset, cpy_len)))) {
+					desc_gaddr + desc_offset, cpy_len)))) {
 			cur->data_len = cpy_len;
 			cur->data_off = 0;
 			cur->buf_addr = (void *)(uintptr_t)(desc_addr
@@ -1093,7 +1146,8 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		} else {
 			if (likely(cpy_len > MAX_BATCH_LEN ||
 				   copy_nb >= vq->size ||
-				   (hdr && cur == m))) {
+				   (hdr && cur == m) ||
+				   desc->len != desc_chunck_len)) {
 				rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *,
 								   mbuf_offset),
 					   (void *)((uintptr_t)(desc_addr +
@@ -1114,6 +1168,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		mbuf_avail  -= cpy_len;
 		mbuf_offset += cpy_len;
 		desc_avail  -= cpy_len;
+		desc_chunck_len -= cpy_len;
 		desc_offset += cpy_len;
 
 		/* This desc reaches to its end, get the next one */
@@ -1132,11 +1187,13 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				goto out;
 			}
 
-			dlen = desc->len;
+			desc_chunck_len = desc->len;
+			desc_gaddr = desc->addr;
 			desc_addr = vhost_iova_to_vva(dev,
-							vq, desc->addr,
-							&dlen, VHOST_ACCESS_RO);
-			if (unlikely(!desc_addr || dlen != desc->len)) {
+							vq, desc_gaddr,
+							&desc_chunck_len,
+							VHOST_ACCESS_RO);
+			if (unlikely(!desc_addr)) {
 				error = -1;
 				goto out;
 			}
@@ -1146,7 +1203,23 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			desc_offset = 0;
 			desc_avail  = desc->len;
 
-			PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0);
+			PRINT_PACKET(dev, (uintptr_t)desc_addr,
+					desc_chunck_len, 0);
+		} else if (unlikely(desc_chunck_len == 0)) {
+			desc_chunck_len = desc_avail;
+			desc_gaddr += desc_offset;
+			desc_addr = vhost_iova_to_vva(dev, vq,
+					desc_gaddr,
+					&desc_chunck_len,
+					VHOST_ACCESS_RO);
+			if (unlikely(!desc_addr)) {
+				error = -1;
+				goto out;
+			}
+			desc_offset = 0;
+
+			PRINT_PACKET(dev, (uintptr_t)desc_addr,
+					desc_chunck_len, 0);
 		}
 
 		/*
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 07/11] vhost: handle virtually non-contiguous buffers in Rx
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
                   ` (5 preceding siblings ...)
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 06/11] vhost: handle virtually non-contiguous buffers in Tx Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 08/11] vhost: handle virtually non-contiguous buffers in Rx-mrg Maxime Coquelin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This patch enables the handling of buffers non-contiguous in
process virtual address space in the enqueue path when mergeable
buffers aren't used.

When virtio-net header doesn't fit in a single chunck, it is
computed in a local variable and copied to the buffer chuncks
afterwards.

For packet content, the copy length is limited to the chunck
size, next chuncks VAs being fetched afterward.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/virtio_net.c | 95 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 77 insertions(+), 18 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 47717a16d..5bd4e58d6 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -245,9 +245,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	uint32_t desc_avail, desc_offset;
 	uint32_t mbuf_avail, mbuf_offset;
 	uint32_t cpy_len;
-	uint64_t dlen;
+	uint64_t desc_chunck_len;
 	struct vring_desc *desc;
-	uint64_t desc_addr;
+	uint64_t desc_addr, desc_gaddr;
 	/* A counter to avoid desc dead loop chain */
 	uint16_t nr_desc = 1;
 	struct batch_copy_elem *batch_copy = vq->batch_copy_elems;
@@ -255,28 +255,74 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	int error = 0;
 
 	desc = &descs[desc_idx];
-	dlen = desc->len;
-	desc_addr = vhost_iova_to_vva(dev, vq, desc->addr,
-					&dlen, VHOST_ACCESS_RW);
+	desc_chunck_len = desc->len;
+	desc_gaddr = desc->addr;
+	desc_addr = vhost_iova_to_vva(dev, vq, desc_gaddr,
+					&desc_chunck_len, VHOST_ACCESS_RW);
 	/*
 	 * Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid
 	 * performance issue with some versions of gcc (4.8.4 and 5.3.0) which
 	 * otherwise stores offset on the stack instead of in a register.
 	 */
-	if (unlikely(dlen != desc->len || desc->len < dev->vhost_hlen) ||
-			!desc_addr) {
+	if (unlikely(desc->len < dev->vhost_hlen) || !desc_addr) {
 		error = -1;
 		goto out;
 	}
 
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
-	virtio_enqueue_offload(m, (struct virtio_net_hdr *)(uintptr_t)desc_addr);
-	vhost_log_write(dev, desc->addr, dev->vhost_hlen);
-	PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0);
+	if (likely(desc_chunck_len >= dev->vhost_hlen)) {
+		virtio_enqueue_offload(m,
+				(struct virtio_net_hdr *)(uintptr_t)desc_addr);
+		PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0);
+		vhost_log_write(dev, desc_gaddr, dev->vhost_hlen);
+	} else {
+		struct virtio_net_hdr vnet_hdr;
+		uint64_t remain = dev->vhost_hlen;
+		uint64_t len;
+		uint64_t src = (uint64_t)(uintptr_t)&vnet_hdr, dst;
+		uint64_t guest_addr = desc_gaddr;
+
+		virtio_enqueue_offload(m, &vnet_hdr);
+
+		while (remain) {
+			len = remain;
+			dst = vhost_iova_to_vva(dev, vq, guest_addr,
+					&len, VHOST_ACCESS_RW);
+			if (unlikely(!dst || !len)) {
+				error = -1;
+				goto out;
+			}
+
+			rte_memcpy((void *)(uintptr_t)dst,
+					(void *)(uintptr_t)src, len);
+
+			PRINT_PACKET(dev, (uintptr_t)dst, len, 0);
+			vhost_log_write(dev, guest_addr, len);
+			remain -= len;
+			guest_addr += len;
+			dst += len;
+		}
+	}
 
-	desc_offset = dev->vhost_hlen;
 	desc_avail  = desc->len - dev->vhost_hlen;
+	if (unlikely(desc_chunck_len < dev->vhost_hlen)) {
+		desc_chunck_len = desc_avail;
+		desc_gaddr = desc->addr + dev->vhost_hlen;
+		desc_addr = vhost_iova_to_vva(dev,
+				vq, desc_gaddr,
+				&desc_chunck_len,
+				VHOST_ACCESS_RW);
+		if (unlikely(!desc_addr)) {
+			error = -1;
+			goto out;
+		}
+
+		desc_offset = 0;
+	} else {
+		desc_offset = dev->vhost_hlen;
+		desc_chunck_len -= dev->vhost_hlen;
+	}
 
 	mbuf_avail  = rte_pktmbuf_data_len(m);
 	mbuf_offset = 0;
@@ -302,26 +348,38 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			}
 
 			desc = &descs[desc->next];
-			dlen = desc->len;
-			desc_addr = vhost_iova_to_vva(dev, vq, desc->addr,
-							&dlen,
+			desc_chunck_len = desc->len;
+			desc_gaddr = desc->addr;
+			desc_addr = vhost_iova_to_vva(dev, vq, desc_gaddr,
+							&desc_chunck_len,
 							VHOST_ACCESS_RW);
-			if (unlikely(!desc_addr || dlen != desc->len)) {
+			if (unlikely(!desc_addr)) {
 				error = -1;
 				goto out;
 			}
 
 			desc_offset = 0;
 			desc_avail  = desc->len;
+		} else if (unlikely(desc_chunck_len == 0)) {
+			desc_chunck_len = desc_avail;
+			desc_gaddr += desc_offset;
+			desc_addr = vhost_iova_to_vva(dev,
+					vq, desc_gaddr,
+					&desc_chunck_len, VHOST_ACCESS_RW);
+			if (unlikely(!desc_addr)) {
+				error = -1;
+				goto out;
+			}
+			desc_offset = 0;
 		}
 
-		cpy_len = RTE_MIN(desc_avail, mbuf_avail);
+		cpy_len = RTE_MIN(desc_chunck_len, mbuf_avail);
 		if (likely(cpy_len > MAX_BATCH_LEN || copy_nb >= vq->size)) {
 			rte_memcpy((void *)((uintptr_t)(desc_addr +
 							desc_offset)),
 				rte_pktmbuf_mtod_offset(m, void *, mbuf_offset),
 				cpy_len);
-			vhost_log_write(dev, desc->addr + desc_offset, cpy_len);
+			vhost_log_write(dev, desc_gaddr + desc_offset, cpy_len);
 			PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset),
 				     cpy_len, 0);
 		} else {
@@ -329,7 +387,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				(void *)((uintptr_t)(desc_addr + desc_offset));
 			batch_copy[copy_nb].src =
 				rte_pktmbuf_mtod_offset(m, void *, mbuf_offset);
-			batch_copy[copy_nb].log_addr = desc->addr + desc_offset;
+			batch_copy[copy_nb].log_addr = desc_gaddr + desc_offset;
 			batch_copy[copy_nb].len = cpy_len;
 			copy_nb++;
 		}
@@ -338,6 +396,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		mbuf_offset += cpy_len;
 		desc_avail  -= cpy_len;
 		desc_offset += cpy_len;
+		desc_chunck_len -= cpy_len;
 	}
 
 out:
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 08/11] vhost: handle virtually non-contiguous buffers in Rx-mrg
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
                   ` (6 preceding siblings ...)
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 07/11] vhost: handle virtually non-contiguous buffers in Rx Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 09/11] examples/vhost: move to safe GPA translation API Maxime Coquelin
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This patch enables the handling of buffers non-contiguous in
process virtual address space in the enqueue path when mergeable
buffers are used.

When virtio-net header doesn't fit in a single chunck, it is
computed in a local variable and copied to the buffer chuncks
afterwards.

For packet content, the copy length is limited to the chunck
size, next chuncks VAs being fetched afterward.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/virtio_net.c | 115 ++++++++++++++++++++++++++++++++----------
 1 file changed, 87 insertions(+), 28 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 5bd4e58d6..a013c07b0 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -668,14 +668,15 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			    uint16_t num_buffers)
 {
 	uint32_t vec_idx = 0;
-	uint64_t desc_addr;
+	uint64_t desc_addr, desc_gaddr;
 	uint32_t mbuf_offset, mbuf_avail;
 	uint32_t desc_offset, desc_avail;
 	uint32_t cpy_len;
-	uint64_t dlen;
+	uint64_t desc_chunck_len;
 	uint64_t hdr_addr, hdr_phys_addr;
 	struct rte_mbuf *hdr_mbuf;
 	struct batch_copy_elem *batch_copy = vq->batch_copy_elems;
+	struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL;
 	uint16_t copy_nb = vq->batch_copy_nb_elems;
 	int error = 0;
 
@@ -684,26 +685,48 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		goto out;
 	}
 
-	dlen = buf_vec[vec_idx].buf_len;
-	desc_addr = vhost_iova_to_vva(dev, vq, buf_vec[vec_idx].buf_addr,
-						&dlen, VHOST_ACCESS_RW);
-	if (dlen != buf_vec[vec_idx].buf_len ||
-			buf_vec[vec_idx].buf_len < dev->vhost_hlen ||
-			!desc_addr) {
+	desc_chunck_len = buf_vec[vec_idx].buf_len;
+	desc_gaddr = buf_vec[vec_idx].buf_addr;
+	desc_addr = vhost_iova_to_vva(dev, vq,
+					desc_gaddr,
+					&desc_chunck_len,
+					VHOST_ACCESS_RW);
+	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr) {
 		error = -1;
 		goto out;
 	}
 
 	hdr_mbuf = m;
 	hdr_addr = desc_addr;
-	hdr_phys_addr = buf_vec[vec_idx].buf_addr;
+	if (unlikely(desc_chunck_len < dev->vhost_hlen))
+		hdr = &tmp_hdr;
+	else
+		hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)hdr_addr;
+	hdr_phys_addr = desc_gaddr;
 	rte_prefetch0((void *)(uintptr_t)hdr_addr);
 
 	LOG_DEBUG(VHOST_DATA, "(%d) RX: num merge buffers %d\n",
 		dev->vid, num_buffers);
 
 	desc_avail  = buf_vec[vec_idx].buf_len - dev->vhost_hlen;
-	desc_offset = dev->vhost_hlen;
+	if (unlikely(desc_chunck_len < dev->vhost_hlen)) {
+		desc_chunck_len = desc_avail;
+		desc_gaddr += dev->vhost_hlen;
+		desc_addr = vhost_iova_to_vva(dev, vq,
+				desc_gaddr,
+				&desc_chunck_len,
+				VHOST_ACCESS_RW);
+		if (unlikely(!desc_addr)) {
+			error = -1;
+			goto out;
+		}
+
+		desc_offset = 0;
+	} else {
+		desc_offset = dev->vhost_hlen;
+		desc_chunck_len -= dev->vhost_hlen;
+	}
+
 
 	mbuf_avail  = rte_pktmbuf_data_len(m);
 	mbuf_offset = 0;
@@ -711,14 +734,14 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		/* done with current desc buf, get the next one */
 		if (desc_avail == 0) {
 			vec_idx++;
-			dlen = buf_vec[vec_idx].buf_len;
+			desc_chunck_len = buf_vec[vec_idx].buf_len;
+			desc_gaddr = buf_vec[vec_idx].buf_addr;
 			desc_addr =
 				vhost_iova_to_vva(dev, vq,
-					buf_vec[vec_idx].buf_addr,
-					&dlen,
+					desc_gaddr,
+					&desc_chunck_len,
 					VHOST_ACCESS_RW);
-			if (unlikely(!desc_addr ||
-					dlen != buf_vec[vec_idx].buf_len)) {
+			if (unlikely(!desc_addr)) {
 				error = -1;
 				goto out;
 			}
@@ -727,6 +750,17 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			rte_prefetch0((void *)(uintptr_t)desc_addr);
 			desc_offset = 0;
 			desc_avail  = buf_vec[vec_idx].buf_len;
+		} else if (unlikely(desc_chunck_len == 0)) {
+			desc_chunck_len = desc_avail;
+			desc_gaddr += desc_offset;
+			desc_addr = vhost_iova_to_vva(dev, vq,
+					desc_gaddr,
+					&desc_chunck_len, VHOST_ACCESS_RW);
+			if (unlikely(!desc_addr)) {
+				error = -1;
+				goto out;
+			}
+			desc_offset = 0;
 		}
 
 		/* done with current mbuf, get the next one */
@@ -738,30 +772,55 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		}
 
 		if (hdr_addr) {
-			struct virtio_net_hdr_mrg_rxbuf *hdr;
-
-			hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)
-				hdr_addr;
 			virtio_enqueue_offload(hdr_mbuf, &hdr->hdr);
 			ASSIGN_UNLESS_EQUAL(hdr->num_buffers, num_buffers);
 
-			vhost_log_write(dev, hdr_phys_addr, dev->vhost_hlen);
-			PRINT_PACKET(dev, (uintptr_t)hdr_addr,
-				     dev->vhost_hlen, 0);
+			if (unlikely(hdr == &tmp_hdr)) {
+				uint64_t len;
+				uint64_t remain = dev->vhost_hlen;
+				uint64_t src = (uint64_t)(uintptr_t)hdr, dst;
+				uint64_t guest_addr = hdr_phys_addr;
+
+				while (remain) {
+					len = remain;
+					dst = vhost_iova_to_vva(dev, vq,
+							guest_addr, &len,
+							VHOST_ACCESS_RW);
+					if (unlikely(!dst || !len)) {
+						error = -1;
+						goto out;
+					}
+
+					rte_memcpy((void *)(uintptr_t)dst,
+							(void *)(uintptr_t)src,
+							len);
+
+					PRINT_PACKET(dev, (uintptr_t)dst,
+							len, 0);
+					vhost_log_write(dev, guest_addr, len);
+
+					remain -= len;
+					guest_addr += len;
+					dst += len;
+				}
+			} else {
+				PRINT_PACKET(dev, (uintptr_t)hdr_addr,
+						dev->vhost_hlen, 0);
+				vhost_log_write(dev, hdr_phys_addr,
+						dev->vhost_hlen);
+			}
 
 			hdr_addr = 0;
 		}
 
-		cpy_len = RTE_MIN(desc_avail, mbuf_avail);
+		cpy_len = RTE_MIN(desc_chunck_len, mbuf_avail);
 
 		if (likely(cpy_len > MAX_BATCH_LEN || copy_nb >= vq->size)) {
 			rte_memcpy((void *)((uintptr_t)(desc_addr +
 							desc_offset)),
 				rte_pktmbuf_mtod_offset(m, void *, mbuf_offset),
 				cpy_len);
-			vhost_log_write(dev,
-				buf_vec[vec_idx].buf_addr + desc_offset,
-				cpy_len);
+			vhost_log_write(dev, desc_gaddr + desc_offset, cpy_len);
 			PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset),
 				cpy_len, 0);
 		} else {
@@ -769,8 +828,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				(void *)((uintptr_t)(desc_addr + desc_offset));
 			batch_copy[copy_nb].src =
 				rte_pktmbuf_mtod_offset(m, void *, mbuf_offset);
-			batch_copy[copy_nb].log_addr =
-				buf_vec[vec_idx].buf_addr + desc_offset;
+			batch_copy[copy_nb].log_addr = desc_gaddr + desc_offset;
 			batch_copy[copy_nb].len = cpy_len;
 			copy_nb++;
 		}
@@ -779,6 +837,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		mbuf_offset += cpy_len;
 		desc_avail  -= cpy_len;
 		desc_offset += cpy_len;
+		desc_chunck_len -= cpy_len;
 	}
 
 out:
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 09/11] examples/vhost: move to safe GPA translation API
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
                   ` (7 preceding siblings ...)
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 08/11] vhost: handle virtually non-contiguous buffers in Rx-mrg Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 10/11] examples/vhost_scsi: " Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 11/11] vhost: deprecate unsafe " Maxime Coquelin
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This patch uses the new rte_vhost_va_from_guest_pa() API
to ensure the application doesn't perform out-of-bound
accesses either because of a malicious guest providing an
incorrect descriptor length, or because the buffer is
contiguous in guest physical address space but not in the
host process virtual address space.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 examples/vhost/virtio_net.c | 94 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 83 insertions(+), 11 deletions(-)

diff --git a/examples/vhost/virtio_net.c b/examples/vhost/virtio_net.c
index 1ab57f526..31c3dd064 100644
--- a/examples/vhost/virtio_net.c
+++ b/examples/vhost/virtio_net.c
@@ -85,16 +85,20 @@ enqueue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 	    struct rte_mbuf *m, uint16_t desc_idx)
 {
 	uint32_t desc_avail, desc_offset;
+	uint64_t desc_chunck_len;
 	uint32_t mbuf_avail, mbuf_offset;
 	uint32_t cpy_len;
 	struct vring_desc *desc;
-	uint64_t desc_addr;
+	uint64_t desc_addr, desc_gaddr;
 	struct virtio_net_hdr virtio_hdr = {0, 0, 0, 0, 0, 0};
 	/* A counter to avoid desc dead loop chain */
 	uint16_t nr_desc = 1;
 
 	desc = &vr->desc[desc_idx];
-	desc_addr = rte_vhost_gpa_to_vva(dev->mem, desc->addr);
+	desc_chunck_len = desc->len;
+	desc_gaddr = desc->addr;
+	desc_addr = rte_vhost_va_from_guest_pa(
+			dev->mem, desc_gaddr, &desc_chunck_len);
 	/*
 	 * Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid
 	 * performance issue with some versions of gcc (4.8.4 and 5.3.0) which
@@ -106,9 +110,42 @@ enqueue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 	/* write virtio-net header */
-	*(struct virtio_net_hdr *)(uintptr_t)desc_addr = virtio_hdr;
+	if (likely(desc_chunck_len >= dev->hdr_len)) {
+		*(struct virtio_net_hdr *)(uintptr_t)desc_addr = virtio_hdr;
+		desc_offset = dev->hdr_len;
+	} else {
+		uint64_t len;
+		uint64_t remain = dev->hdr_len;
+		uint64_t src = (uint64_t)(uintptr_t)&virtio_hdr, dst;
+		uint64_t guest_addr = desc_gaddr;
+
+		while (remain) {
+			len = remain;
+			dst = rte_vhost_va_from_guest_pa(dev->mem,
+					guest_addr, &len);
+			if (unlikely(!dst || !len))
+				return -1;
+
+			rte_memcpy((void *)(uintptr_t)dst,
+					(void *)(uintptr_t)src,
+					len);
+
+			remain -= len;
+			guest_addr += len;
+			dst += len;
+		}
+
+		desc_chunck_len = desc->len - dev->hdr_len;
+		desc_gaddr += dev->hdr_len;
+		desc_addr = rte_vhost_va_from_guest_pa(
+				dev->mem, desc_gaddr,
+				&desc_chunck_len);
+		if (unlikely(!desc_addr))
+			return -1;
+
+		desc_offset = 0;
+	}
 
-	desc_offset = dev->hdr_len;
 	desc_avail  = desc->len - dev->hdr_len;
 
 	mbuf_avail  = rte_pktmbuf_data_len(m);
@@ -133,15 +170,28 @@ enqueue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 				return -1;
 
 			desc = &vr->desc[desc->next];
-			desc_addr = rte_vhost_gpa_to_vva(dev->mem, desc->addr);
+			desc_chunck_len = desc->len;
+			desc_gaddr = desc->addr;
+			desc_addr = rte_vhost_va_from_guest_pa(
+					dev->mem, desc_gaddr, &desc_chunck_len);
 			if (unlikely(!desc_addr))
 				return -1;
 
 			desc_offset = 0;
 			desc_avail  = desc->len;
+		} else if (unlikely(desc_chunck_len == 0)) {
+			desc_chunck_len = desc_avail;
+			desc_gaddr += desc_offset;
+			desc_addr = rte_vhost_va_from_guest_pa(dev->mem,
+					desc_gaddr,
+					&desc_chunck_len);
+			if (unlikely(!desc_addr))
+				return -1;
+
+			desc_offset = 0;
 		}
 
-		cpy_len = RTE_MIN(desc_avail, mbuf_avail);
+		cpy_len = RTE_MIN(desc_chunck_len, mbuf_avail);
 		rte_memcpy((void *)((uintptr_t)(desc_addr + desc_offset)),
 			rte_pktmbuf_mtod_offset(m, void *, mbuf_offset),
 			cpy_len);
@@ -150,6 +200,7 @@ enqueue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 		mbuf_offset += cpy_len;
 		desc_avail  -= cpy_len;
 		desc_offset += cpy_len;
+		desc_chunck_len -= cpy_len;
 	}
 
 	return 0;
@@ -223,8 +274,9 @@ dequeue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 	    struct rte_mempool *mbuf_pool)
 {
 	struct vring_desc *desc;
-	uint64_t desc_addr;
+	uint64_t desc_addr, desc_gaddr;
 	uint32_t desc_avail, desc_offset;
+	uint64_t desc_chunck_len;
 	uint32_t mbuf_avail, mbuf_offset;
 	uint32_t cpy_len;
 	struct rte_mbuf *cur = m, *prev = m;
@@ -236,7 +288,10 @@ dequeue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 			(desc->flags & VRING_DESC_F_INDIRECT))
 		return -1;
 
-	desc_addr = rte_vhost_gpa_to_vva(dev->mem, desc->addr);
+	desc_chunck_len = desc->len;
+	desc_gaddr = desc->addr;
+	desc_addr = rte_vhost_va_from_guest_pa(
+			dev->mem, desc_gaddr, &desc_chunck_len);
 	if (unlikely(!desc_addr))
 		return -1;
 
@@ -250,7 +305,10 @@ dequeue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 	 * header.
 	 */
 	desc = &vr->desc[desc->next];
-	desc_addr = rte_vhost_gpa_to_vva(dev->mem, desc->addr);
+	desc_chunck_len = desc->len;
+	desc_gaddr = desc->addr;
+	desc_addr = rte_vhost_va_from_guest_pa(
+			dev->mem, desc_gaddr, &desc_chunck_len);
 	if (unlikely(!desc_addr))
 		return -1;
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
@@ -262,7 +320,7 @@ dequeue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 	mbuf_offset = 0;
 	mbuf_avail  = m->buf_len - RTE_PKTMBUF_HEADROOM;
 	while (1) {
-		cpy_len = RTE_MIN(desc_avail, mbuf_avail);
+		cpy_len = RTE_MIN(desc_chunck_len, mbuf_avail);
 		rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *,
 						   mbuf_offset),
 			(void *)((uintptr_t)(desc_addr + desc_offset)),
@@ -272,6 +330,7 @@ dequeue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 		mbuf_offset += cpy_len;
 		desc_avail  -= cpy_len;
 		desc_offset += cpy_len;
+		desc_chunck_len -= cpy_len;
 
 		/* This desc reaches to its end, get the next one */
 		if (desc_avail == 0) {
@@ -283,13 +342,26 @@ dequeue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr,
 				return -1;
 			desc = &vr->desc[desc->next];
 
-			desc_addr = rte_vhost_gpa_to_vva(dev->mem, desc->addr);
+			desc_chunck_len = desc->len;
+			desc_gaddr = desc->addr;
+			desc_addr = rte_vhost_va_from_guest_pa(
+					dev->mem, desc_gaddr, &desc_chunck_len);
 			if (unlikely(!desc_addr))
 				return -1;
 			rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 			desc_offset = 0;
 			desc_avail  = desc->len;
+		} else if (unlikely(desc_chunck_len == 0)) {
+			desc_chunck_len = desc_avail;
+			desc_gaddr += desc_offset;
+			desc_addr = rte_vhost_va_from_guest_pa(dev->mem,
+					desc_gaddr,
+					&desc_chunck_len);
+			if (unlikely(!desc_addr))
+				return -1;
+
+			desc_offset = 0;
 		}
 
 		/*
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 10/11] examples/vhost_scsi: move to safe GPA translation API
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
                   ` (8 preceding siblings ...)
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 09/11] examples/vhost: move to safe GPA translation API Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 11/11] vhost: deprecate unsafe " Maxime Coquelin
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This patch uses the new rte_vhost_va_from_guest_pa() API
to ensure all the descriptor buffer is mapped contiguously
in the application virtual address space.

As the application did not checked return of previous API,
this patch just print an error if the buffer address isn't in
the vhost memory regions or if it is scattered. Ideally, it
should handle scattered buffers gracefully.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 examples/vhost_scsi/vhost_scsi.c | 56 +++++++++++++++++++++++++++++++++-------
 1 file changed, 47 insertions(+), 9 deletions(-)

diff --git a/examples/vhost_scsi/vhost_scsi.c b/examples/vhost_scsi/vhost_scsi.c
index b4f1f8d27..b40f99363 100644
--- a/examples/vhost_scsi/vhost_scsi.c
+++ b/examples/vhost_scsi/vhost_scsi.c
@@ -68,7 +68,7 @@ vhost_scsi_ctrlr_find(__rte_unused const char *ctrlr_name)
 	return g_vhost_ctrlr;
 }
 
-static uint64_t gpa_to_vva(int vid, uint64_t gpa)
+static uint64_t gpa_to_vva(int vid, uint64_t gpa, uint64_t *len)
 {
 	char path[PATH_MAX];
 	struct vhost_scsi_ctrlr *ctrlr;
@@ -88,7 +88,7 @@ static uint64_t gpa_to_vva(int vid, uint64_t gpa)
 
 	assert(ctrlr->mem != NULL);
 
-	return rte_vhost_gpa_to_vva(ctrlr->mem, gpa);
+	return rte_vhost_va_from_guest_pa(ctrlr->mem, gpa, len);
 }
 
 static struct vring_desc *
@@ -138,15 +138,29 @@ static void
 vhost_process_read_payload_chain(struct vhost_scsi_task *task)
 {
 	void *data;
+	uint64_t chunck_len;
 
 	task->iovs_cnt = 0;
+	chunck_len = task->desc->len;
 	task->resp = (void *)(uintptr_t)gpa_to_vva(task->bdev->vid,
-						   task->desc->addr);
+						   task->desc->addr,
+						   &chunck_len);
+	if (!task->resp || chunck_len != task->desc->len) {
+		fprintf(stderr, "failed to translate desc address.\n");
+		return;
+	}
 
 	while (descriptor_has_next(task->desc)) {
 		task->desc = descriptor_get_next(task->vq->desc, task->desc);
+		chunck_len = task->desc->len;
 		data = (void *)(uintptr_t)gpa_to_vva(task->bdev->vid,
-						     task->desc->addr);
+						     task->desc->addr,
+							 &chunck_len);
+		if (!data || chunck_len != task->desc->len) {
+			fprintf(stderr, "failed to translate desc address.\n");
+			return;
+		}
+
 		task->iovs[task->iovs_cnt].iov_base = data;
 		task->iovs[task->iovs_cnt].iov_len = task->desc->len;
 		task->data_len += task->desc->len;
@@ -158,12 +172,20 @@ static void
 vhost_process_write_payload_chain(struct vhost_scsi_task *task)
 {
 	void *data;
+	uint64_t chunck_len;
 
 	task->iovs_cnt = 0;
 
 	do {
+		chunck_len = task->desc->len;
 		data = (void *)(uintptr_t)gpa_to_vva(task->bdev->vid,
-						     task->desc->addr);
+						     task->desc->addr,
+							 &chunck_len);
+		if (!data || chunck_len != task->desc->len) {
+			fprintf(stderr, "failed to translate desc address.\n");
+			return;
+		}
+
 		task->iovs[task->iovs_cnt].iov_base = data;
 		task->iovs[task->iovs_cnt].iov_len = task->desc->len;
 		task->data_len += task->desc->len;
@@ -171,8 +193,12 @@ vhost_process_write_payload_chain(struct vhost_scsi_task *task)
 		task->desc = descriptor_get_next(task->vq->desc, task->desc);
 	} while (descriptor_has_next(task->desc));
 
+	chunck_len = task->desc->len;
 	task->resp = (void *)(uintptr_t)gpa_to_vva(task->bdev->vid,
-						   task->desc->addr);
+						   task->desc->addr,
+						   &chunck_len);
+	if (!task->resp || chunck_len != task->desc->len)
+		fprintf(stderr, "failed to translate desc address.\n");
 }
 
 static struct vhost_block_dev *
@@ -218,6 +244,7 @@ process_requestq(struct vhost_scsi_ctrlr *ctrlr, uint32_t q_idx)
 		int req_idx;
 		uint16_t last_idx;
 		struct vhost_scsi_task *task;
+		uint64_t chunck_len;
 
 		last_idx = scsi_vq->last_used_idx & (vq->size - 1);
 		req_idx = vq->avail->ring[last_idx];
@@ -235,16 +262,27 @@ process_requestq(struct vhost_scsi_ctrlr *ctrlr, uint32_t q_idx)
 		assert((task->desc->flags & VRING_DESC_F_INDIRECT) == 0);
 		scsi_vq->last_used_idx++;
 
+		chunck_len = task->desc->len;
 		task->req = (void *)(uintptr_t)gpa_to_vva(task->bdev->vid,
-							  task->desc->addr);
+							  task->desc->addr,
+							  &chunck_len);
+		if (!task->req || chunck_len != task->desc->len) {
+			fprintf(stderr, "failed to translate desc address.\n");
+			return;
+		}
 
 		task->desc = descriptor_get_next(task->vq->desc, task->desc);
 		if (!descriptor_has_next(task->desc)) {
 			task->dxfer_dir = SCSI_DIR_NONE;
+			chunck_len = task->desc->len;
 			task->resp = (void *)(uintptr_t)
 					      gpa_to_vva(task->bdev->vid,
-							 task->desc->addr);
-
+							 task->desc->addr,
+							 &chunck_len);
+			if (!task->resp || chunck_len != task->desc->len) {
+				fprintf(stderr, "failed to translate desc address.\n");
+				return;
+			}
 		} else if (!descriptor_is_wr(task->desc)) {
 			task->dxfer_dir = SCSI_DIR_TO_DEV;
 			vhost_process_write_payload_chain(task);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-stable] [PATCH v17.11 LTS 11/11] vhost: deprecate unsafe GPA translation API
  2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
                   ` (9 preceding siblings ...)
  2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 10/11] examples/vhost_scsi: " Maxime Coquelin
@ 2018-04-23 16:00 ` Maxime Coquelin
  10 siblings, 0 replies; 12+ messages in thread
From: Maxime Coquelin @ 2018-04-23 16:00 UTC (permalink / raw)
  To: stable; +Cc: Maxime Coquelin

This patch marks rte_vhost_gpa_to_vva() as deprecated because
it is unsafe. Application relying on this API should move
to the new rte_vhost_va_from_guest_pa() API, and check
returned length to avoid out-of-bound accesses.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/rte_vhost.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index f2d6c95e5..3fc6034de 100644
--- a/lib/librte_vhost/rte_vhost.h
+++ b/lib/librte_vhost/rte_vhost.h
@@ -117,6 +117,11 @@ struct vhost_device_ops {
 /**
  * Convert guest physical address to host virtual address
  *
+ * This function is deprecated because unsafe.
+ * New rte_vhost_va_from_guest_pa() should be used instead to ensure
+ * guest physical ranges are fully and contiguously mapped into
+ * process virtual address space.
+ *
  * @param mem
  *  the guest memory regions
  * @param gpa
@@ -124,6 +129,7 @@ struct vhost_device_ops {
  * @return
  *  the host virtual address on success, 0 on failure
  */
+__rte_deprecated
 static __rte_always_inline uint64_t
 rte_vhost_gpa_to_vva(struct rte_vhost_memory *mem, uint64_t gpa)
 {
-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-04-23 16:01 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-23 16:00 [dpdk-stable] [PATCH v17.11 LTS 00/11] Vhost: CVE-2018-1059 fixes Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 01/11] vhost: fix indirect descriptors table translation size Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 02/11] vhost: check all range is mapped when translating GPAs Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 03/11] vhost: introduce safe API for GPA translation Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 04/11] vhost: ensure all range is mapped when translating QVAs Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 05/11] vhost: add support for non-contiguous indirect descs tables Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 06/11] vhost: handle virtually non-contiguous buffers in Tx Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 07/11] vhost: handle virtually non-contiguous buffers in Rx Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 08/11] vhost: handle virtually non-contiguous buffers in Rx-mrg Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 09/11] examples/vhost: move to safe GPA translation API Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 10/11] examples/vhost_scsi: " Maxime Coquelin
2018-04-23 16:00 ` [dpdk-stable] [PATCH v17.11 LTS 11/11] vhost: deprecate unsafe " Maxime Coquelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).