From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: Rich Lane <rich.lane@bigswitch.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: avoid buffer overflow in update_secure_len
Date: Tue, 17 Nov 2015 21:23:49 +0800	[thread overview]
Message-ID: <20151117132349.GT2326@yliu-dev.sh.intel.com> (raw)
In-Reply-To: <CAGSMBPOLNsc-+_Zj7FgBhmD0kpUAoy3fu5urxN74YTfmE20Qzw@mail.gmail.com>

On Thu, Nov 12, 2015 at 01:46:03PM -0800, Rich Lane wrote:
> You can reproduce this with l2fwd and the vhost PMD.
> 
> You'll need this patch on top of the vhost PMD patches:
> --- a/lib/librte_vhost/virtio-net.c
> +++ b/lib/librte_vhost/virtio-net.c
> @@ -471,7 +471,7 @@ reset_owner(struct vhost_device_ctx ctx)
>                 return -1;
>  
>         if (dev->flags & VIRTIO_DEV_RUNNING)
> -               notify_ops->destroy_device(dev);
> +               notify_destroy_device(dev);
>  
>         cleanup_device(dev);
>         reset_device(dev);
> 
> 1. Start l2fwd on the host: l2fwd -l 0,1 --vdev eth_null --vdev
> eth_vhost0,iface=/run/vhost0.sock -- -p3
> 2. Start a VM using vhost-user and set up uio, hugepages, etc.
> 3. Start l2fwd inside the VM: l2fwd -l 0,1 --vdev eth_null -- -p3
> 4. Kill the l2fwd inside the VM with SIGINT.
> 5. Start l2fwd inside the VM.
> 6. l2fwd on the host crashes.
> 
> I found the source of the memory corruption by setting a watchpoint in
> gdb: watch -l rte_eth_devices[1].data->rx_queues

Rich,

Thanks for the detailed steps for reproducing this issue, and sorry for
being a bit late: I finally got the time to dig this issue today.

Put simply, the buffer overflow is not the root cause; the fact that "we do
not release resources on stop/exit" is.

And here is how the issue arises.  After step 4 (terminating l2fwd), neither
l2fwd nor the virtio PMD driver releases any resources. Hence, l2fwd on the
HOST does not notice the change and keeps trying to receive and enqueue
packets to the vhost device. That is not an issue as long as we don't
start l2fwd again, for there are actually no packets to forward, and
rte_vhost_dequeue_burst returns early from:

    596         avail_idx =  *((volatile uint16_t *)&vq->avail->idx);
    597
    598         /* If there are no available buffers then return. */
    599         if (vq->last_used_idx == avail_idx)
    600                 return 0; 

But at the init stage while starting l2fwd again (step 5), rte_eal_memory_init()
resets all huge-page memory to zero, which resets all vq->desc[] entries
to zero and in turn causes update_secure_len() to set secure_len to 0
on return.

(BTW, I'm not quite sure why resetting the huge-page memory inside the VM
would result in vq->desc being reset.)

The vq desc reset results in an endless loop at virtio_dev_merge_rx(),
as update_secure_len() keeps setting secure_len to 0:

    511                    do {
    512                            avail_idx = *((volatile uint16_t *)&vq->avail->idx);
    513                            if (unlikely(res_cur_idx == avail_idx)) {
    514                                    LOG_DEBUG(VHOST_DATA,
    515                                            "(%"PRIu64") Failed "
    516                                            "to get enough desc from "
    517                                            "vring\n",
    518                                            dev->device_fh);
    519                                    goto merge_rx_exit;
    520                            } else {
    521                                    update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
    522                                    res_cur_idx++;
    523                            }
    524                    } while (pkt_len > secure_len);

The endless loop then causes vec_idx to keep increasing, and it quickly
runs past the end of buf_vec[], leading to the crash you saw.
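
Just to make it concrete, here is a simplified sketch of what
update_secure_len() does (paraphrased, not the verbatim code; see
lib/librte_vhost/vhost_rxtx.c for the real thing). It walks one descriptor
chain, adding each descriptor's length to *secure_len and recording one
buf_vec[] entry per descriptor:

    /*
     * Simplified sketch of update_secure_len() (not the verbatim code):
     * walk one descriptor chain starting at avail->ring[id], record each
     * descriptor into vq->buf_vec[] and add its length to *secure_len.
     */
    static void
    update_secure_len(struct vhost_virtqueue *vq, uint32_t id,
                      uint32_t *secure_len, uint32_t *vec_idx)
    {
        uint16_t wrapped_idx = id & (vq->size - 1);
        uint32_t idx = vq->avail->ring[wrapped_idx];
        uint32_t len = *secure_len;
        uint32_t vec_id = *vec_idx;
        uint8_t next_desc;

        do {
            next_desc = 0;
            len += vq->desc[idx].len;          /* zeroed desc => adds 0      */
            vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
            vq->buf_vec[vec_id].buf_len  = vq->desc[idx].len;
            vq->buf_vec[vec_id].desc_idx = idx;
            vec_id++;                          /* ...but a slot is consumed  */

            if (vq->desc[idx].flags & VRING_DESC_F_NEXT) {
                idx = vq->desc[idx].next;
                next_desc = 1;
            }
        } while (next_desc);

        *secure_len = len;   /* stays 0 when the descriptors are all zero    */
        *vec_idx = vec_id;   /* keeps growing; nothing checks BUF_VECTOR_MAX */
    }

With all-zero descriptors the length added is always 0, so secure_len never
reaches pkt_len and the do/while above never terminates, while every
iteration consumes one more buf_vec[] slot. Once vec_idx passes
BUF_VECTOR_MAX the writes land in whatever memory happens to follow
buf_vec[], hence the corruption you caught with the watchpoint.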

So, the following would resolve this issue in the right way (I
guess); it covers the virtio PMD and l2fwd only so far.

---
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 12fcc23..8d6bf56 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1507,9 +1507,12 @@ static void
 virtio_dev_stop(struct rte_eth_dev *dev)
 {
 	struct rte_eth_link link;
+	struct virtio_hw *hw = dev->data->dev_private;
 
 	PMD_INIT_LOG(DEBUG, "stop");
 
+	vtpci_reset(hw);
+
 	if (dev->data->dev_conf.intr_conf.lsc)
 		rte_intr_disable(&dev->pci_dev->intr_handle);
 
diff --git a/examples/l2fwd/main.c b/examples/l2fwd/main.c
index 720fd5a..565f648 100644
--- a/examples/l2fwd/main.c
+++ b/examples/l2fwd/main.c
@@ -44,6 +44,7 @@
 #include <ctype.h>
 #include <errno.h>
 #include <getopt.h>
+#include <signal.h>
 
 #include <rte_common.h>
 #include <rte_log.h>
@@ -534,14 +535,40 @@ check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
 	}
 }
 
+static uint8_t nb_ports;
+static uint8_t nb_ports_available;
+
+/* When we receive a INT signal, unregister vhost driver */
+static void
+sigint_handler(__rte_unused int signum)
+{
+	uint8_t portid;
+
+	for (portid = 0; portid < nb_ports; portid++) {
+		/* skip ports that are not enabled */
+		if ((l2fwd_enabled_port_mask & (1 << portid)) == 0) {
+			printf("Skipping disabled port %u\n", (unsigned) portid);
+			nb_ports_available--;
+			continue;
+		}
+
+		/* stopping port */
+		printf("Stopping port %u... ", (unsigned) portid);
+		fflush(stdout);
+		rte_eth_dev_stop(portid);
+
+		printf("done: \n");
+	}
+
+        exit(0);
+}
+
 int
 main(int argc, char **argv)
 {
 	struct lcore_queue_conf *qconf;
 	struct rte_eth_dev_info dev_info;
 	int ret;
-	uint8_t nb_ports;
-	uint8_t nb_ports_available;
 	uint8_t portid, last_port;
 	unsigned lcore_id, rx_lcore_id;
 	unsigned nb_ports_in_mask = 0;
@@ -688,6 +715,8 @@ main(int argc, char **argv)
 		/* initialize port stats */
 		memset(&port_statistics, 0, sizeof(port_statistics));
 	}
+	signal(SIGINT, sigint_handler);
+
 
 	if (!nb_ports_available) {
 		rte_exit(EXIT_FAILURE,


----

And if you think about this issue again, you will find it's neither a
vhost-pmd nor an l2fwd specific issue. I could easily reproduce it with the
vhost-switch and virtio testpmd combo. The reason behind it is the
same: we don't release/stop the resources on stop/exit.
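
For those apps the workaround follows the same pattern as the l2fwd hunk
above: catch SIGINT and stop the ports, so the PMD's dev_stop callback (and,
with the virtio hunk above, vtpci_reset()) gets a chance to tell the other
end that the queues are gone. A rough, generic, untested sketch (the port
loop, enabled-port mask and the final exit are application specific):

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>

    #include <rte_common.h>
    #include <rte_ethdev.h>

    /* Generic SIGINT handler: stop every port before exiting, so the
     * PMD can reset the device instead of leaving stale vrings behind
     * for the vhost backend to keep polling. */
    static void
    sigint_stop_ports(__rte_unused int signum)
    {
        uint8_t portid;

        for (portid = 0; portid < rte_eth_dev_count(); portid++) {
            printf("Stopping port %u...\n", portid);
            rte_eth_dev_stop(portid);
        }

        exit(0);
    }

    /* ...and in main(), after the ports are started:
     *     signal(SIGINT, sigint_stop_ports);
     */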

It's kind of a known issue so far, and it's on Zhihong's (cc'ed) TODO
list to handle it correctly in the next release.

	--yliu

> 
> On Thu, Nov 12, 2015 at 1:23 AM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
> wrote:
> 
>     On Thu, Nov 12, 2015 at 12:02:33AM -0800, Rich Lane wrote:
>     > The guest could trigger this buffer overflow by creating a cycle of
>     > descriptors (which would also cause an infinite loop). The more common
>     > case is that vq->avail->idx jumps out of the range [last_used_idx,
>     > last_used_idx+256). This happens nearly every time when restarting a
>     > DPDK app inside a VM connected to a vhost-user vswitch because the
>     > virtqueue memory allocated by the previous run is zeroed.
> 
>     Hi,
> 
>     I somehow was aware of this issue before while reading the code.
>     Thinking that we never met that, I delayed the fix (it was still
>     in my TODO list).
> 
>     Would you please tell me the steps (commands would be better) to
>     reproduce your issue? I'd like to know more about the issue: I'm
>     guessing maybe we need to fix it with a bit more care.
> 
>             --yliu
>     >
>     > Signed-off-by: Rich Lane <rlane@bigswitch.com>
>     > ---
>     >  lib/librte_vhost/vhost_rxtx.c | 2 +-
>     >  1 file changed, 1 insertion(+), 1 deletion(-)
>     >
>     > diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
>     > index 9322ce6..d95b478 100644
>     > --- a/lib/librte_vhost/vhost_rxtx.c
>     > +++ b/lib/librte_vhost/vhost_rxtx.c
>     > @@ -453,7 +453,7 @@ update_secure_len(struct vhost_virtqueue *vq, uint32_t id,
>     >               vq->buf_vec[vec_id].desc_idx = idx;
>     >               vec_id++;
>     >
>     > -             if (vq->desc[idx].flags & VRING_DESC_F_NEXT) {
>     > +             if (vq->desc[idx].flags & VRING_DESC_F_NEXT && vec_id < BUF_VECTOR_MAX) {
>     >                       idx = vq->desc[idx].next;
>     >                       next_desc = 1;
>     >               }
>     > --
>     > 1.9.1
> 
> 

Thread overview: 18+ messages
2015-11-12  8:02 Rich Lane
2015-11-12  9:23 ` Yuanhan Liu
2015-11-12 21:46   ` Rich Lane
2015-11-17 13:23     ` Yuanhan Liu [this message]
2015-11-17 16:39       ` Rich Lane
2015-11-18  2:56         ` Yuanhan Liu
2015-11-18  5:23           ` Wang, Zhihong
2015-11-18  5:26           ` Rich Lane
2015-11-18  5:32             ` Yuanhan Liu
2015-11-18  6:13           ` Xie, Huawei
2015-11-18  6:25             ` Yuanhan Liu
2015-11-18  8:13               ` Xie, Huawei
2015-11-18 15:53             ` Stephen Hemminger
2015-11-18 16:00               ` Xie, Huawei
2015-11-18  7:53           ` Xie, Huawei
2015-11-18  8:48             ` Yuanhan Liu
2015-11-18 11:15               ` Xie, Huawei
2015-11-19  5:51                 ` Yuanhan Liu
