DPDK patches and discussions
From: Tetsuya Mukawa <mukawa@igel.co.jp>
To: "Xie, Huawei" <huawei.xie@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH RFC v2 08/12] lib/librte_vhost: vhost-user support
Date: Wed, 17 Dec 2014 12:31:26 +0900
Message-ID: <5490F90E.6050701@igel.co.jp> (raw)
In-Reply-To: <C37D651A908B024F974696C65296B57B0F326812@SHSMSX101.ccr.corp.intel.com>

(2014/12/17 10:06), Xie, Huawei wrote:
>>> +{
>>> +	struct virtio_net *dev = get_device(ctx);
>>> +
>>> +	/* We have to stop the queue (virtio) if it is running. */
>>> +	if (dev->flags & VIRTIO_DEV_RUNNING)
>>> +		notify_ops->destroy_device(dev);
>> I have one concern about the finalization of vrings.
>> Can the vhost backend stop RX/TX access to the vring before replying to
>> this message?
>>
>> QEMU sends this message when the virtio-net device is finalized by
>> the virtio-net driver on the guest.
>> After finalization, the memory used by the vring will be freed by the
>> virtio-net driver, because that memory is allocated by the virtio-net
>> driver.
>> Because of this, I guess the vhost backend must stop accessing the vring
>> before replying to this message.
>>
>> I am not sure what a good way to stop the access is.
>> One idea is to add a condition check when rte_vhost_dequeue_burst()
>> and rte_vhost_enqueue_burst() are called.
>> Anyway, we probably need to wait for the access to stop before replying.
>>
>> Thanks,
>> Tetsuya
>>
> I think we have discussed a similar question.

Sorry, I probably did not understand your point correctly in the
earlier email.

> It is actually the same issue whether the virtio APP in the guest has crashed or is finalized.

I guess we can have a solution for the case where the APP is finalized
correctly.
Could you please read the comments I wrote below?

> The virtio APP will only write to the STATUS register without waiting/syncing to vhost backend.

Yes, the virtio APP only writes to the STATUS register. I agree.

When the guest writes the register, KVM traps the write and the
context switches to QEMU. QEMU then works as described below.
(While QEMU performs the following steps, the guest is waiting, because
the context is in QEMU.)

Could you please check the following call path against the latest QEMU code?
1. virtio_ioport_write() [hw/virtio/virtio-pci.c] <= the virtio APP waits
for this function to return.
2. virtio_set_status() [hw/virtio/virtio.c]
3. virtio_net_set_status() [hw/net/virtio-net.c]
4. virtio_net_vhost_status() [hw/net/virtio-net.c]
5. vhost_net_stop() [hw/net/vhost_net.c]
6. vhost_net_stop_one() [hw/net/vhost_net.c]
7. vhost_dev_stop() [hw/virtio/vhost.c]
8. vhost_virtqueue_stop() [hw/virtio/vhost.c]
9. vhost_user_call() [hw/virtio/vhost-user.c]
10. The VHOST_USER_GET_VRING_BASE message is sent to the backend, and QEMU
waits for the backend's reply.

When the vhost-user backend receives GET_VRING_BASE, I guess the guest
APP is stopped, because it is still blocked in the STATUS register write.
QEMU also waits for the vhost-user backend's reply, because GET_VRING_BASE
is a synchronous message.
Because of the above, I guess the virtio APP can wait for the vhost
backend to finish its finalization.
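
One way to picture "stop accessing the vring before replying" is a small
guard around the RX/TX burst functions. The following is only a sketch
under assumptions: struct vq_guard, vq_enter(), vq_leave(), vq_quiesce(),
guarded_burst() and on_get_vring_base() are hypothetical names, not part of
librte_vhost or of this patch, and the real ring processing is replaced by
a placeholder. The GET_VRING_BASE handler clears the flag, waits until no
burst is in flight, and only then returns the value to reply with.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-virtqueue guard; not an existing librte_vhost type. */
struct vq_guard {
	atomic_bool enabled;    /* false once the backend must stop touching the vring */
	atomic_uint in_flight;  /* number of bursts currently inside the vring */
};

static void vq_guard_init(struct vq_guard *g)
{
	atomic_init(&g->enabled, true);
	atomic_init(&g->in_flight, 0);
}

/* Data path: announce entry first, then check the flag. */
static bool vq_enter(struct vq_guard *g)
{
	atomic_fetch_add(&g->in_flight, 1);
	if (!atomic_load(&g->enabled)) {
		atomic_fetch_sub(&g->in_flight, 1);
		return false;
	}
	return true;
}

static void vq_leave(struct vq_guard *g)
{
	atomic_fetch_sub(&g->in_flight, 1);
}

/* Control path: forbid new bursts, then wait for running ones to drain. */
static void vq_quiesce(struct vq_guard *g)
{
	atomic_store(&g->enabled, false);
	while (atomic_load(&g->in_flight) != 0)
		; /* busy-wait; a real backend might pause or yield here */
}

/* Stand-in for an enqueue/dequeue burst: returns the packets handled. */
static uint16_t guarded_burst(struct vq_guard *g, uint16_t count)
{
	if (!vq_enter(g))
		return 0;       /* vring is being finalized, do not touch it */
	/* ... the real vring access would happen here ... */
	vq_leave(g);
	return count;
}

/* Hypothetical handler: quiesce first, then hand back the index to reply with. */
static uint16_t on_get_vring_base(struct vq_guard *g, uint16_t last_idx)
{
	vq_quiesce(g);
	return last_idx;
}
```

With this ordering, a burst that has already passed the flag check is
counted in in_flight before the handler looks at the counter, so the reply
is not sent while that burst is still reading or writing the vring.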

> After that, not only the guest memory pointed by vring entry but also the vring itself isn't usable any more.
> The memory for vring or pointed by vring entry might be used by other APPs.

I agree.

> This will crash the guest (rather than the vhost, do you agree?).

I guess we cannot assume how the freed memory will be used.
In some cases a new APP keeps working, but the vhost backend may access
an inconsistent vring structure.
In that case the vhost backend could receive illegal packets.
For example, the avail_idx value might be changed to 0xFFFF by a new APP.
(I am not sure the RX/TX functions can handle such a value correctly.)
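
To make the avail_idx concern concrete, here is a small sketch of a sanity
check. It assumes the usual virtio convention that ring indexes are
free-running 16-bit counters, so the number of pending entries is the
wrapped difference between the guest's avail index and the last index the
backend has consumed. The function names and the last_seen_idx parameter
are hypothetical, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Entries the guest claims to have made available since the backend last
 * looked. The subtraction wraps modulo 2^16 on purpose.
 */
static uint16_t vring_pending(uint16_t avail_idx, uint16_t last_seen_idx)
{
	return (uint16_t)(avail_idx - last_seen_idx);
}

/*
 * A consistent ring can never have more pending entries than it has
 * descriptors, so a larger value means the index cannot be trusted.
 */
static bool vring_avail_sane(uint16_t avail_idx, uint16_t last_seen_idx,
			     uint16_t vq_size)
{
	return vring_pending(avail_idx, last_seen_idx) <= vq_size;
}
```

For a 256-entry ring with last_seen_idx at 10, an avail_idx of 0xFFFF gives
65525 pending entries and is rejected, while a legitimate wrap from 0xFFFE
to 2 gives 4 and is accepted. Such a check only detects some inconsistent
values; it does not make access to freed vring memory safe.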

Anyway, my point is that if we can finalize the vhost backend correctly,
we only need to take care of the crash case.
(If so, it's very nice :))
So let's make sure whether the virtio APP can wait for the finalization
or not.
I am thinking about how to do it now.


> If you mean this issue, I think we have no solution but one workaround: keep the huge page files of the crashed app,
> bind virtio to igb_uio, and then delete the huge page files.

Yes, I agree.
If the virtio APP has crashed, this will be a solution.

Thanks,
Tetsuya

>
> In our implementation, when vhost sends the message, we will call the destroy_device provided by the vSwitch to ask the
> vSwitch to stop processing the vring, but this won't solve the issue I mentioned above, because the virtio APP in the guest won't
> wait for us.
>
> Could you explain a bit more? Is it the same issue?
>
>
> -huawei
>
>
>
