From: Tetsuya Mukawa <mukawa@igel.co.jp>
To: "Xie, Huawei" <huawei.xie@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH RFC v2 08/12] lib/librte_vhost: vhost-user support
Date: Wed, 17 Dec 2014 13:22:15 +0900
Message-ID: <549104F7.20906@igel.co.jp>
In-Reply-To: <5490F90E.6050701@igel.co.jp>

(2014/12/17 12:31), Tetsuya Mukawa wrote:
> (2014/12/17 10:06), Xie, Huawei wrote:
>>>> +{
>>>> +	struct virtio_net *dev = get_device(ctx);
>>>> +
>>>> +	/* We have to stop the queue (virtio) if it is running. */
>>>> +	if (dev->flags & VIRTIO_DEV_RUNNING)
>>>> +		notify_ops->destroy_device(dev);
>>> I have an one concern about finalization of vrings.
>>> Can vhost-backend stop accessing RX/TX to the vring before replying to
>>> this message?
>>>
>>> QEMU sends this message when virtio-net device is finalized by
>>> virtio-net driver on the guest.
>>> After finalization, memories used by the vring will be freed by
>>> virtio-net driver, because these memories are allocated by virtio-net
>>> driver.
>>> Because of this, I guess vhost-backend must stop accessing to vring
>>> before replying to this message.
>>>
>>> I am not sure what is a good way to stop accessing.
>>> One idea is adding a condition checking when rte_vhost_dequeue_burst()
>>> and rte_vhost_enqueue_burst() is called.
>>> Anyway we probably need to wait for stopping access before replying.
>>>
>>> Thanks,
>>> Tetsuya
>>>
>> I think we have discussed the similar question.
> Sorry, probably I might not be able to got your point correctly at the
> former email.
>
>> It is actually the same issue whether the virtio APP in guest is crashed, or is finalized.
> I guess when the APP is finalized correctly, we can have a solution.
> Could you please read comment I wrote later?
>
>> The virtio APP will only write to the STATUS register without waiting/syncing to vhost backend.
> Yes, virtio APP only write to the STATUS register. I agree with it.
>
> When the register is written by guest, KVM will catch it, and the
> context will be change to QEMU. And QEMU will works like below.
> (Also while doing following steps, guest is waiting because the context
> is in QEMU)
>
> Could you please see below with latest QEMU code?
> 1. virtio_ioport_write() [hw/virtio/virtio-pci.c] <= virtio APP will
> wait for replying of this function.
> 2. virtio_set_status() [hw/virtio/virtio.c]
> 3. virtio_net_set_status() [hw/net/virtio-net.c]
> 4. virtio_net_vhost_status() [hw/net/virtio-net.c]
> 5. vhost_net_stop() [hw/net/vhost_net.c]
> 6. vhost_net_stop_one() [hw/net/vhost_net.c]
> 7. vhost_dev_stop() [hw/virtio/vhost.c]
> 8. vhost_virtqueue_stop() [hw/virtio/vhost.c]
> 9. vhost_user_call() [virtio/vhost-user.c]
> 10. VHOST_USER_GET_VRING_BASE message is sent to backend. And waiting
> for backend reply.
>
> When the vhost-user backend receives GET_VRING_BASE, I guess the guest
> APP is stopped.
> Also QEMU will wait for vhost-user backend reply because GET_VRING_BASE
> is synchronous message.
> Because of above, I guess virtio APP can wait for vhost-backend
> finalization.
>
>> After that, not only the guest memory pointed by vring entry but also the vring itself isn't usable any more.
>> The memory for vring or pointed by vring entry might be used by other APPs.
> I agree.
>
>> This will crash guest(rather than the vhost, do you agree?).
> I guess we cannot assume how the freed memory is used.
> In some cases, a new APP still works, but vhost backend can access
> inconsistency vring structure.
> In the case vhost backend could receive illegal packets.
> For example, avail_idx value might be changed to be 0xFFFF by a new APP.
> (I am not sure RX/TX functions can handle such a value correctly)
>
> Anyway, my point is if we can finalize vhost backend correctly, we only
> need to take care of crashing case.
> (If so, it's very nice :))
> So let's make sure whether virtio APP can wait for finalization, or not.
> I am thinking how to do it now.
>

I added a sleep() to QEMU, as shown below.

--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -300,7 +300,10 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
             virtio_pci_stop_ioeventfd(proxy);
         }
         virtio_set_status(vdev, val & 0xFF);
+        if (val == 0)
+            sleep(10);
         if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
             virtio_pci_start_ioeventfd(proxy);

When I run 'dpdk_nic_bind.py' to trigger GET_VRING_BASE, the command
takes 10 seconds to finish.
So we can assume that the virtio APP is able to wait for the vhost
backend to finish its finalization.

Thanks,
Tetsuya

>> If you mean this issue, I think we have no solution but one walk around: keep the huge page files of crashed app, and
>> bind virtio to igb_uio and then delete the huge page files.
> Yes I agree.
> If the virtio APP is crashed, this will be a solution.
>
> Thanks,
> Tetsuya
>
>> In our implementation, when vhost sends the message, we will call the destroy_device provided by the vSwitch to ask the
>> vSwitch to stop processing the vring, but this willn't solve the issue I mention above, because the virtio APP in guest will n't
>> wait us.
>>
>> Could you explain a bit more? Is it the same issue?
>>
>> -huawei
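
If the guest really does block until the backend replies to GET_VRING_BASE, the "condition checking" idea mentioned earlier in the thread could look roughly like the sketch below. This is only an illustration under that assumption, not librte_vhost code: struct vring_guard and the vring_guard_*() helpers are hypothetical names. The datapath would call enter/exit around each burst, and the message handler would call stop before sending its reply.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device guard; not part of librte_vhost. */
struct vring_guard {
	atomic_bool running;   /* cleared once the guest tears the vring down */
	atomic_uint in_flight; /* datapath calls currently touching the vring */
};

static void vring_guard_init(struct vring_guard *g)
{
	atomic_init(&g->running, true);
	atomic_init(&g->in_flight, 0);
}

/*
 * Datapath side: call before touching the vring (for example at the top
 * of an enqueue/dequeue burst). Returns false if the vring was stopped,
 * in which case the caller must not access it.
 */
static bool vring_guard_enter(struct vring_guard *g)
{
	/* Announce ourselves first, then check the flag. */
	atomic_fetch_add(&g->in_flight, 1);
	if (!atomic_load(&g->running)) {
		atomic_fetch_sub(&g->in_flight, 1);
		return false;
	}
	return true;
}

/* Datapath side: call once the burst no longer touches the vring. */
static void vring_guard_exit(struct vring_guard *g)
{
	unsigned int prev = atomic_fetch_sub(&g->in_flight, 1);

	assert(prev > 0); /* exit without a matching successful enter */
	(void)prev;
}

/*
 * Control side: call from the (hypothetical) GET_VRING_BASE handler
 * before the reply is sent. Returns only after every burst that was
 * already running has left the vring.
 */
static void vring_guard_stop(struct vring_guard *g)
{
	atomic_store(&g->running, false);
	while (atomic_load(&g->in_flight) != 0)
		sched_yield();
}
```

The counter is incremented before the flag is read so that, with the default sequentially consistent atomics, either the burst sees running == false or the stop path sees in_flight != 0. This costs two atomic operations per burst, so it is a starting point for discussion rather than a tuned design.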