From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <huawei.xie@intel.com>
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
 by dpdk.org (Postfix) with ESMTP id 8C85B68AF
 for <dev@dpdk.org>; Sat, 15 Nov 2014 02:32:48 +0100 (CET)
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by fmsmga101.fm.intel.com with ESMTP; 14 Nov 2014 17:42:52 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.97,862,1389772800"; d="scan'208";a="416793238"
Received: from pgsmsx101.gar.corp.intel.com ([10.221.44.78])
 by FMSMGA003.fm.intel.com with ESMTP; 14 Nov 2014 17:33:45 -0800
Received: from pgsmsx105.gar.corp.intel.com (10.221.44.96) by
 PGSMSX101.gar.corp.intel.com (10.221.44.78) with Microsoft SMTP Server (TLS)
 id 14.3.195.1; Sat, 15 Nov 2014 09:42:51 +0800
Received: from shsmsx152.ccr.corp.intel.com (10.239.6.52) by
 pgsmsx105.gar.corp.intel.com (10.221.44.96) with Microsoft SMTP Server (TLS)
 id 14.3.195.1; Sat, 15 Nov 2014 09:42:50 +0800
Received: from shsmsx101.ccr.corp.intel.com ([169.254.1.110]) by
 SHSMSX152.ccr.corp.intel.com ([169.254.6.5]) with mapi id 14.03.0195.001;
 Sat, 15 Nov 2014 09:42:49 +0800
From: "Xie, Huawei" <huawei.xie@intel.com>
To: Tetsuya Mukawa <mukawa@igel.co.jp>, "dev@dpdk.org" <dev@dpdk.org>
Thread-Topic: vhost-user technical isssues
Thread-Index: Ac/997srCGFXGQlUQCu5VHdgBPnJPv//6DqA//yYxXCABnWcgP/+AnEw
Date: Sat, 15 Nov 2014 01:42:49 +0000
Message-ID: <C37D651A908B024F974696C65296B57B0F3044C6@SHSMSX101.ccr.corp.intel.com>
References: <C37D651A908B024F974696C65296B57B0F2F19EF@SHSMSX101.ccr.corp.intel.com>
 <5462DE39.1070006@igel.co.jp>
 <C37D651A908B024F974696C65296B57B0F30265D@SHSMSX101.ccr.corp.intel.com>
 <54656E87.8090801@igel.co.jp>
In-Reply-To: <54656E87.8090801@igel.co.jp>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.239.127.40]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] vhost-user technical isssues
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Sat, 15 Nov 2014 01:32:49 -0000


> -----Original Message-----
> From: Tetsuya Mukawa [mailto:mukawa@igel.co.jp]
> Sent: Thursday, November 13, 2014 7:53 PM
> To: Xie, Huawei; dev@dpdk.org
> Cc: Long, Thomas
> Subject: Re: vhost-user technical isssues
>
> Hi Xie,
>
> (2014/11/14 9:22), Xie, Huawei wrote:
> >> I think so. I guess we need to consider 2 types of restarting. One is
> >> virtio-net driver restarting, the other is vhost-user backend
> >> restarting. But, so far, it's nice to start to think about virtio-net
> >> driver restarting first. Probably we need to implement a way to let
> >> vhost-user backend know virtio-net driver is restarted. I am not sure
> >> what is good way to let vhost-user backend know it. But how about
> >> followings RFC?
> > I checked your code today, and didn't find the logic to deal with virtio
> > reconfiguration.
> Yes.
> I guess the first implementation of librte_vhost may just replace
> vhost-example function.
> Probably vhost-example doesn't think about restarting.
> Because of this, I haven't implemented.
>
> > My thought without new message support: When vhost-user receives
> > another configuration message since last time it is ready for
> > processing, then we could release it from data core, and process the
> > next reconfiguration message, and then re-add it to data core when it
> > is ready again(check new kick message as before). The candidate
> > message is set_mem_table. It is ok we keep the device on data core
> > until we receive the new reconfiguration message. Just waste vhost
> > some cycles checking the avail idx.
>
> For example, let's assume DPDK app1 is started on guest with virtio-net
> device port.
> If DPDK app1 on guest is stopped, and other DPDK app2 on guest is
> started without virtio-net device port.
> Hugepages DPDK app1 used will be used by DPDK app2.
> It means the memory accessed by vhost-user backend might be changed by
> DPDK app2.
> And vhost-user backend will be crashed.
> So I guess we need some kinds of reset message.
>

A virtio DPDK app crashing silently is an issue.
Let us check whether there is any possibility that the guest could crash
the vhost backend.
Even with kernel virtio, I think the basic principle is that the host
shouldn't trust the guest totally.
If vhost can be crashed by a virtio guest corrupting the ring, that is a
problem in the design of our vhost backend. :) I hope that is clear.
Let us check case by case later. I understand that some security checks
might slow down performance.
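
To illustrate the kind of check I mean, here is a minimal sketch. It is
not taken from librte_vhost or from any patch: struct mem_region,
desc_index_ok() and gpa_to_hva_checked() are hypothetical names, and the
region layout is simplified. The idea is only that every index and
address read from guest memory is range-checked before the backend
dereferences it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one guest memory region. */
struct mem_region {
    uint64_t guest_phys_addr; /* start of the region in guest physical space */
    uint64_t size;            /* length of the region in bytes */
    uint64_t host_base;       /* host virtual address the region is mapped at */
};

/* A descriptor index read from the avail ring must lie inside the ring. */
static bool desc_index_ok(uint16_t idx, uint16_t ring_size)
{
    return idx < ring_size;
}

/*
 * Translate the guest range [gpa, gpa + len) to a host virtual address.
 * Returns 0 if the range is not fully contained in one known region,
 * so a corrupted descriptor is rejected instead of being dereferenced.
 */
static uint64_t gpa_to_hva_checked(const struct mem_region *regs, size_t n,
                                   uint64_t gpa, uint32_t len)
{
    for (size_t i = 0; i < n; i++) {
        const struct mem_region *r = &regs[i];

        if (gpa < r->guest_phys_addr)
            continue;

        uint64_t off = gpa - r->guest_phys_addr;

        /* Written this way to avoid overflow in off + len. */
        if (off < r->size && len <= r->size - off)
            return r->host_base + off;
    }
    return 0;
}
```

The cost is one compare per ring index and one bounded lookup per
descriptor, which is the performance trade-off mentioned above.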

I think the real problem is the opposite: the host could crash the guest
app, or even the kernel, if the guest is using the old memory that vhost
is also still using.

Btw, I checked the qemu code. In the current qemu implementation, the
VHOST_USER_GET_VRING_BASE message is sent, and only sent, during vring
stop. So this message could be used to remove the virtio device from the
data core temporarily. However, judging from the name of the message,
this does not sound like its intended purpose.
I did an implementation based on this message, and the virtio PMD can
run now.
Previously we couldn't switch virtio from the kernel driver to igb_uio.
Now it can switch smoothly between the two.
Check the RFC patch.
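
For readers who want the flow without opening the patch, here is a rough
sketch of the idea. It is an illustration only and not the code in the
RFC patch: struct vdev, handle_msg() and the SKETCH_* message IDs are
made-up names, and the IDs do not carry the numeric values of the real
vhost-user messages. It assumes the device is (re)added to the data core
when a kick message arrives, as before, and removed when the
get-vring-base message arrives.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical message IDs for this sketch only. */
enum sketch_msg {
    SKETCH_SET_MEM_TABLE,
    SKETCH_SET_VRING_KICK,
    SKETCH_GET_VRING_BASE
};

/* Hypothetical per-device state kept by the backend. */
struct vdev {
    bool on_data_core;           /* is a data core currently polling it? */
    unsigned int last_avail_idx; /* ring position to report back to qemu */
};

/*
 * Handle one control message. Returns the ring base for
 * SKETCH_GET_VRING_BASE and 0 for the other messages.
 */
static unsigned int handle_msg(struct vdev *dev, enum sketch_msg m)
{
    switch (m) {
    case SKETCH_GET_VRING_BASE:
        /* Sent by qemu on vring stop: stop polling before replying.
         * A real backend must also wait for the data core to quiesce. */
        dev->on_data_core = false;
        return dev->last_avail_idx;
    case SKETCH_SET_VRING_KICK:
        /* Device is ready (again): hand it back to the data core. */
        dev->on_data_core = true;
        return 0;
    case SKETCH_SET_MEM_TABLE:
        /* New memory layout: the old mappings must not be polled. */
        dev->on_data_core = false;
        return 0;
    }
    return 0;
}
```

With this flow, a driver switch on the guest side shows up on the
backend as remove, reconfigure, re-add.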
Besides, as stated in the patch, I think we should leave only the most
common operations in the virtio layer, and move the related control
handling to the cuse/fuse layer. It is difficult to handle all the
differences in message flow.

> Thanks,
> Tetsuya