From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id E86134C6F for ; Mon, 23 Apr 2018 17:58:48 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 92E16F1207 for ; Mon, 23 Apr 2018 15:58:48 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-58.ams2.redhat.com [10.36.112.58]) by smtp.corp.redhat.com (Postfix) with ESMTP id E8A712023239; Mon, 23 Apr 2018 15:58:47 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org Cc: Maxime Coquelin Date: Mon, 23 Apr 2018 17:58:13 +0200 Message-Id: <20180423155818.21285-8-maxime.coquelin@redhat.com> In-Reply-To: <20180423155818.21285-1-maxime.coquelin@redhat.com> References: <20180423155818.21285-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Mon, 23 Apr 2018 15:58:48 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Mon, 23 Apr 2018 15:58:48 +0000 (UTC) for IP:'10.11.54.4' DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [PATCH 07/12] vhost: handle virtually non-contiguous buffers in Rx X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Apr 2018 15:58:49 -0000 This patch enables the handling of buffers non-contiguous in process virtual address space in the enqueue path when mergeable buffers aren't used. When virtio-net header doesn't fit in a single chunck, it is computed in a local variable and copied to the buffer chuncks afterwards. For packet content, the copy length is limited to the chunck size, next chuncks VAs being fetched afterward. This issue has been assigned CVE-2018-1059. Signed-off-by: Maxime Coquelin --- lib/librte_vhost/virtio_net.c | 95 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 77 insertions(+), 18 deletions(-) diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index dcbfbd5ef..531425792 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -221,9 +221,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, uint32_t desc_avail, desc_offset; uint32_t mbuf_avail, mbuf_offset; uint32_t cpy_len; - uint64_t dlen; + uint64_t desc_chunck_len; struct vring_desc *desc; - uint64_t desc_addr; + uint64_t desc_addr, desc_gaddr; /* A counter to avoid desc dead loop chain */ uint16_t nr_desc = 1; struct batch_copy_elem *batch_copy = vq->batch_copy_elems; @@ -231,28 +231,74 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, int error = 0; desc = &descs[desc_idx]; - dlen = desc->len; - desc_addr = vhost_iova_to_vva(dev, vq, desc->addr, - &dlen, VHOST_ACCESS_RW); + desc_chunck_len = desc->len; + desc_gaddr = desc->addr; + desc_addr = vhost_iova_to_vva(dev, vq, desc_gaddr, + &desc_chunck_len, VHOST_ACCESS_RW); /* * Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid * performance issue with some versions of gcc (4.8.4 and 5.3.0) which * otherwise stores offset on the stack instead of in a register. */ - if (unlikely(dlen != desc->len || desc->len < dev->vhost_hlen) || - !desc_addr) { + if (unlikely(desc->len < dev->vhost_hlen) || !desc_addr) { error = -1; goto out; } rte_prefetch0((void *)(uintptr_t)desc_addr); - virtio_enqueue_offload(m, (struct virtio_net_hdr *)(uintptr_t)desc_addr); - vhost_log_write(dev, desc->addr, dev->vhost_hlen); - PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0); + if (likely(desc_chunck_len >= dev->vhost_hlen)) { + virtio_enqueue_offload(m, + (struct virtio_net_hdr *)(uintptr_t)desc_addr); + PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0); + vhost_log_write(dev, desc_gaddr, dev->vhost_hlen); + } else { + struct virtio_net_hdr vnet_hdr; + uint64_t remain = dev->vhost_hlen; + uint64_t len; + uint64_t src = (uint64_t)(uintptr_t)&vnet_hdr, dst; + uint64_t guest_addr = desc_gaddr; + + virtio_enqueue_offload(m, &vnet_hdr); + + while (remain) { + len = remain; + dst = vhost_iova_to_vva(dev, vq, guest_addr, + &len, VHOST_ACCESS_RW); + if (unlikely(!dst || !len)) { + error = -1; + goto out; + } + + rte_memcpy((void *)(uintptr_t)dst, + (void *)(uintptr_t)src, len); + + PRINT_PACKET(dev, (uintptr_t)dst, (uint32_t)len, 0); + vhost_log_write(dev, guest_addr, len); + remain -= len; + guest_addr += len; + dst += len; + } + } - desc_offset = dev->vhost_hlen; desc_avail = desc->len - dev->vhost_hlen; + if (unlikely(desc_chunck_len < dev->vhost_hlen)) { + desc_chunck_len = desc_avail; + desc_gaddr = desc->addr + dev->vhost_hlen; + desc_addr = vhost_iova_to_vva(dev, + vq, desc_gaddr, + &desc_chunck_len, + VHOST_ACCESS_RW); + if (unlikely(!desc_addr)) { + error = -1; + goto out; + } + + desc_offset = 0; + } else { + desc_offset = dev->vhost_hlen; + desc_chunck_len -= dev->vhost_hlen; + } mbuf_avail = rte_pktmbuf_data_len(m); mbuf_offset = 0; @@ -278,26 +324,38 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, } desc = &descs[desc->next]; - dlen = desc->len; - desc_addr = vhost_iova_to_vva(dev, vq, desc->addr, - &dlen, + desc_chunck_len = desc->len; + desc_gaddr = desc->addr; + desc_addr = vhost_iova_to_vva(dev, vq, desc_gaddr, + &desc_chunck_len, VHOST_ACCESS_RW); - if (unlikely(!desc_addr || dlen != desc->len)) { + if (unlikely(!desc_addr)) { error = -1; goto out; } desc_offset = 0; desc_avail = desc->len; + } else if (unlikely(desc_chunck_len == 0)) { + desc_chunck_len = desc_avail; + desc_gaddr += desc_offset; + desc_addr = vhost_iova_to_vva(dev, + vq, desc_gaddr, + &desc_chunck_len, VHOST_ACCESS_RW); + if (unlikely(!desc_addr)) { + error = -1; + goto out; + } + desc_offset = 0; } - cpy_len = RTE_MIN(desc_avail, mbuf_avail); + cpy_len = RTE_MIN(desc_chunck_len, mbuf_avail); if (likely(cpy_len > MAX_BATCH_LEN || copy_nb >= vq->size)) { rte_memcpy((void *)((uintptr_t)(desc_addr + desc_offset)), rte_pktmbuf_mtod_offset(m, void *, mbuf_offset), cpy_len); - vhost_log_write(dev, desc->addr + desc_offset, cpy_len); + vhost_log_write(dev, desc_gaddr + desc_offset, cpy_len); PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset), cpy_len, 0); } else { @@ -305,7 +363,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, (void *)((uintptr_t)(desc_addr + desc_offset)); batch_copy[copy_nb].src = rte_pktmbuf_mtod_offset(m, void *, mbuf_offset); - batch_copy[copy_nb].log_addr = desc->addr + desc_offset; + batch_copy[copy_nb].log_addr = desc_gaddr + desc_offset; batch_copy[copy_nb].len = cpy_len; copy_nb++; } @@ -314,6 +372,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, mbuf_offset += cpy_len; desc_avail -= cpy_len; desc_offset += cpy_len; + desc_chunck_len -= cpy_len; } out: -- 2.14.3