From: Maxime Coquelin
To: Claudio Fontana, Chenbo Xia
Cc: dev@dpdk.org
Subject: Re: [PATCH v3 1/2] vhost: check for nr_vec == 0 in desc_to_mbuf, mbuf_to_desc
Date: Wed, 28 Sep 2022 16:37:02 +0200
Message-ID: <228bf109-57c2-91f0-ab2c-6ac92d688781@redhat.com>
In-Reply-To: <20220802004938.23670-2-cfontana@suse.de>
References: <20220802004938.23670-1-cfontana@suse.de> <20220802004938.23670-2-cfontana@suse.de>

Hi Claudio,

The title should be reworded, maybe something like below?
"vhost: fix possible out-of-bounds access in buffer vectors"

On 8/2/22 02:49, Claudio Fontana wrote:
> in virtio_dev_split we cannot currently call desc_to_mbuf with
> nr_vec == 0, or we end up trying to rte_memcpy from a source address
> buf_vec[0] that is an uninitialized stack variable.
> 
> Improve this in general by having desc_to_mbuf and mbuf_to_desc
> return -1 when called with an invalid nr_vec == 0, which should
> fix any other instance of this problem.
> 
> This should fix errors that have been reported in multiple occasions
> from telcos to the DPDK, OVS and QEMU projects, as this affects in
> particular the openvswitch/DPDK, QEMU vhost-user setup when the
> guest DPDK application abruptly goes away via SIGKILL and then
> reconnects.

Looking at fill_vec_buf_split() and map_one_desc(), I suspect that on
guest DPDK application kill/crash, the shared virtqueue memory gets
unmapped on the guest side, so the backend reads zeroed descriptors,
and hence zero lengths.

If that's the case, then your fix might not be enough: if, for example,
the backend reads the .next field of the descriptor as 0, it may cause
undefined behaviour.

I'm not sure how we could address this, though. Maybe we are missing
something on the Vhost side to detect that the guest application has
crashed, because we should not try to access the vrings until the new
guest app instance is done with its initialization.

I'm going to try to reproduce the issue, but I'm not sure I will
succeed. Could you please share the Vhost logs from when the issue is
reproduced and you face the crash?

> 
> The back trace looks roughly like this, depending on the specific
> rte_memcpy selected, etc, in any case the "src" parameter is garbage
> (in this example containing 0 + dev->host_hlen(12 = 0xc)).
> 
> Thread 153 "pmd-c88/id:150" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7f64e5e6b700 (LWP 141373)]
> rte_mov128blocks (n=2048, src=0xc,
>     dst=0x150da4480) at ../lib/eal/x86/include/rte_memcpy.h:384
> (gdb) bt
> 0  rte_mov128blocks (n=2048, src=0xc, dst=0x150da4480)
> 1  rte_memcpy_generic (n=2048, src=0xc, dst=0x150da4480)
> 2  rte_memcpy (n=2048, src=0xc, dst=)
> 3  sync_fill_seg
> 4  desc_to_mbuf
> 5  virtio_dev_tx_split
> 6  virtio_dev_tx_split_legacy
> 7  0x00007f676fea0fef in rte_vhost_dequeue_burst
> 8  0x00007f6772005a62 in netdev_dpdk_vhost_rxq_recv
> 9  0x00007f6771f38116 in netdev_rxq_recv
> 10 0x00007f6771f03d96 in dp_netdev_process_rxq_port
> 11 0x00007f6771f04239 in pmd_thread_main
> 12 0x00007f6771f92aff in ovsthread_wrapper
> 13 0x00007f6771c1b6ea in start_thread
> 14 0x00007f6771933a8f in clone

It is a fix, so it should contain the Fixes tag, and also Cc
stable@dpdk.org.

> 
> Tested-by: Claudio Fontana

Tested-by can be removed as you're already the author.

> Signed-off-by: Claudio Fontana
> ---
>  lib/vhost/virtio_net.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> index 35fa4670fd..eb19e54c2b 100644
> --- a/lib/vhost/virtio_net.c
> +++ b/lib/vhost/virtio_net.c
> @@ -1153,7 +1153,7 @@ mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  	struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL;
>  	struct vhost_async *async = vq->async;
>  
> -	if (unlikely(m == NULL))
> +	if (unlikely(m == NULL) || nr_vec == 0)

I think nr_vec == 0 is also unlikely.
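
Something along these lines (untested sketch on my side, just to
illustrate keeping both conditions under the branch-prediction hint):

	/* Both failure cases are cold paths, so hint them together. */
	if (unlikely(m == NULL || nr_vec == 0))
		return -1;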
>  		return -1;
>  
>  	buf_addr = buf_vec[vec_idx].buf_addr;
> @@ -2673,6 +2673,9 @@ desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  	struct vhost_async *async = vq->async;
>  	struct async_inflight_info *pkts_info;
>  
> +	if (unlikely(nr_vec == 0))
> +		return -1;
> +
>  	buf_addr = buf_vec[vec_idx].buf_addr;
>  	buf_iova = buf_vec[vec_idx].buf_iova;
>  	buf_len = buf_vec[vec_idx].buf_len;
> @@ -2917,9 +2920,11 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  						vq->last_avail_idx + i,
>  						&nr_vec, buf_vec,
>  						&head_idx, &buf_len,
> -						VHOST_ACCESS_RO) < 0))
> +						VHOST_ACCESS_RO) < 0)) {
> +			dropped += 1;
> +			i++;
>  			break;
> -
> +		}
>  		update_shadow_used_ring_split(vq, head_idx, 0);
>  
>  		err = virtio_dev_pktmbuf_prep(dev, pkts[i], buf_len);
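
To double-check I read the intent of the last hunk correctly, here is a
reduced sketch of the loop as I understand it after the change (heavily
simplified, not the actual function, and assuming the surrounding
handling of 'dropped' and 'i' is otherwise unchanged):

	for (i = 0; i < count; i++) {
		if (unlikely(fill_vec_buf_split(dev, vq,
						vq->last_avail_idx + i,
						&nr_vec, buf_vec,
						&head_idx, &buf_len,
						VHOST_ACCESS_RO) < 0)) {
			/*
			 * Treat the unusable head like the other drop
			 * paths: count it as dropped and as consumed
			 * before leaving the loop.
			 */
			dropped += 1;
			i++;
			break;
		}

		update_shadow_used_ring_split(vq, head_idx, 0);
		/* ... prepare pkts[i] and copy the descriptor chain ... */
	}

Is my reading correct?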