From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <20151117132349.GT2326@yliu-dev.sh.intel.com>
References: <1447315353-42152-1-git-send-email-rlane@bigswitch.com>
 <20151112092305.GI2326@yliu-dev.sh.intel.com>
 <20151117132349.GT2326@yliu-dev.sh.intel.com>
Date: Tue, 17 Nov 2015 08:39:30 -0800
From: Rich Lane
To: Yuanhan Liu
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: avoid buffer overflow in update_secure_len
Content-Type: text/plain; charset=UTF-8
List-Id: patches and discussions about DPDK

On Tue, Nov 17, 2015 at 5:23 AM, Yuanhan Liu wrote:

> On Thu, Nov 12, 2015 at 01:46:03PM -0800, Rich Lane wrote:
> > You can reproduce this with l2fwd and the vhost PMD.
> >
> > You'll need this patch on top of the vhost PMD patches:
> > --- a/lib/librte_vhost/virtio-net.c
> > +++ b/lib/librte_vhost/virtio-net.c
> > @@ -471,7 +471,7 @@ reset_owner(struct vhost_device_ctx ctx)
> >                 return -1;
> >
> >         if (dev->flags & VIRTIO_DEV_RUNNING)
> > -               notify_ops->destroy_device(dev);
> > +               notify_destroy_device(dev);
> >
> >         cleanup_device(dev);
> >         reset_device(dev);
> >
> > 1. Start l2fwd on the host: l2fwd -l 0,1 --vdev eth_null --vdev
> > eth_vhost0,iface=/run/vhost0.sock -- -p3
> > 2. Start a VM using vhost-user and set up uio, hugepages, etc.
> > 3. Start l2fwd inside the VM: l2fwd -l 0,1 --vdev eth_null -- -p3
> > 4. Kill the l2fwd inside the VM with SIGINT.
> > 5. Start l2fwd inside the VM.
> > 6. l2fwd on the host crashes.
> >
> > I found the source of the memory corruption by setting a watchpoint in
> > gdb: watch -l rte_eth_devices[1].data->rx_queues
>
> Rich,
>
> Thanks for the detailed steps for reproducing this issue, and sorry for
> being a bit late: I finally got the time to dig into this issue today.
>
> Put simply, buffer overflow is not the root cause, but the fact "we do
> not release resources on stop/exit" is.
>
> And here is how the issue arises.
> After step 4 (terminating l2fwd), neither
> l2fwd nor the virtio PMD releases any resources. Hence,
> l2fwd on the host does not notice the change and keeps trying to receive
> and queue packets to the vhost dev. That is not an issue as long as we
> don't start l2fwd again, for there are actually no packets to forward, and
> rte_vhost_dequeue_burst returns from:
>
>         avail_idx = *((volatile uint16_t *)&vq->avail->idx);
>
>         /* If there are no available buffers then return. */
>         if (vq->last_used_idx == avail_idx)
>                 return 0;
>
> But at the init stage while starting l2fwd (step 5),
> rte_eal_memory_init() resets all hugepage memory to zero, resulting in
> all vq->desc[] items being reset to zero, which in turn ends up with
> secure_len being set to 0 on return.
>
> (BTW, I'm not quite sure why resetting the hugepage memory inside the VM
> would result in vq->desc being reset.)
>
> The vq desc reset results in a dead loop in virtio_dev_merge_rx(),
> as update_secure_len() keeps setting secure_len to 0:
>
>         do {
>                 avail_idx = *((volatile uint16_t *)&vq->avail->idx);
>                 if (unlikely(res_cur_idx == avail_idx)) {
>                         LOG_DEBUG(VHOST_DATA,
>                                 "(%"PRIu64") Failed "
>                                 "to get enough desc from "
>                                 "vring\n",
>                                 dev->device_fh);
>                         goto merge_rx_exit;
>                 } else {
>                         update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
>                         res_cur_idx++;
>                 }
>         } while (pkt_len > secure_len);
>
> The dead loop makes vec_idx keep increasing until it quickly overflows,
> leading to the crash you saw.
>
> So, the following would resolve this issue, in the right way (I
> guess), and it's for virtio-pmd and l2fwd only so far.
>
> ---
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index 12fcc23..8d6bf56 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1507,9 +1507,12 @@ static void
>  virtio_dev_stop(struct rte_eth_dev *dev)
>  {
>         struct rte_eth_link link;
> +       struct virtio_hw *hw = dev->data->dev_private;
>
>         PMD_INIT_LOG(DEBUG, "stop");
>
> +       vtpci_reset(hw);
> +
>         if (dev->data->dev_conf.intr_conf.lsc)
>                 rte_intr_disable(&dev->pci_dev->intr_handle);
>
> diff --git a/examples/l2fwd/main.c b/examples/l2fwd/main.c
> index 720fd5a..565f648 100644
> --- a/examples/l2fwd/main.c
> +++ b/examples/l2fwd/main.c
> @@ -44,6 +44,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>  #include
> @@ -534,14 +535,40 @@ check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
>         }
>  }
>
> +static uint8_t nb_ports;
> +static uint8_t nb_ports_available;
> +
> +/* When we receive a INT signal, unregister vhost driver */
> +static void
> +sigint_handler(__rte_unused int signum)
> +{
> +       uint8_t portid;
> +
> +       for (portid = 0; portid < nb_ports; portid++) {
> +               /* skip ports that are not enabled */
> +               if ((l2fwd_enabled_port_mask & (1 << portid)) == 0) {
> +                       printf("Skipping disabled port %u\n", (unsigned) portid);
> +                       nb_ports_available--;
> +                       continue;
> +               }
> +
> +               /* stopping port */
> +               printf("Stopping port %u... ", (unsigned) portid);
> +               fflush(stdout);
> +               rte_eth_dev_stop(portid);
> +
> +               printf("done: \n");
> +       }
> +
> +       exit(0);
> +}
> +
>  int
>  main(int argc, char **argv)
>  {
>         struct lcore_queue_conf *qconf;
>         struct rte_eth_dev_info dev_info;
>         int ret;
> -       uint8_t nb_ports;
> -       uint8_t nb_ports_available;
>         uint8_t portid, last_port;
>         unsigned lcore_id, rx_lcore_id;
>         unsigned nb_ports_in_mask = 0;
> @@ -688,6 +715,8 @@ main(int argc, char **argv)
>                 /* initialize port stats */
>                 memset(&port_statistics, 0, sizeof(port_statistics));
>         }
> +       signal(SIGINT, sigint_handler);
> +
>
>         if (!nb_ports_available) {
>                 rte_exit(EXIT_FAILURE,
>
>
> ----
>
> And if you think about this issue again, you will find it's neither a
> vhost-pmd nor an l2fwd specific issue. I could easily reproduce it with
> the vhost-switch and virtio testpmd combo. The reason behind that would
> be the same: we don't release/stop the resources at stop.
>
> It's kind of a known issue so far, and it's on Zhihong's (cc'ed) TODO
> list to handle them correctly in the next release.
>
>         --yliu

Thanks for looking into this. I agree with your description of the root
cause; it's what I was referring to when I mentioned that the virtqueue
memory is zeroed when the guest app is restarted. Agreed that it's not
specific to l2fwd/vhost PMD.

When the guest zeroes the avail virtqueue idx it goes backwards from the
perspective of the host. The host then loops up to 2^16 times until
res_cur_idx == avail_idx, overflowing the buf_vec array after the first 256
iterations. No real packet TX is needed.

I don't think that adding a SIGINT handler is the right solution, though.
The guest app could be killed with another signal (SIGKILL). Worse, a
malicious or buggy guest could write to just that field. vhost should not
crash no matter what the guest writes into the virtqueues.