From: "Xie, Huawei"
To: Tetsuya Mukawa, "dev@dpdk.org", "yuanhan.liu@linux.intel.com", "Tan, Jianfeng"
Date: Mon, 25 Jan 2016 10:15:19 +0000
Subject: Re: [dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

On 1/22/2016 6:38 PM, Tetsuya Mukawa wrote:
> On 2016/01/22 17:14, Xie, Huawei wrote:
>> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>>> virtio: Extend virtio-net PMD to support container environment
>>>
>>> The patch adds a new virtio-net PMD configuration that allows the PMD
>>> to work on the host as if it were running in a VM.
>>> Here is the new configuration for the virtio-net PMD.
>>>  - CONFIG_RTE_LIBRTE_VIRTIO_HOST_MODE
>>> To use this mode, EAL needs physically contiguous memory. To allocate
>>> such memory, add the "--shm" option to the application command line.
>>>
>>> To prepare a virtio-net device on the host, the user needs to invoke a
>>> QEMU process in the special qtest mode. This mode is mainly used for
>>> testing QEMU devices from an outer process. In this mode, no guest runs.
>>> Here is the QEMU command line.
>>>
>>>  $ qemu-system-x86_64 \
>>>      -machine pc-i440fx-1.4,accel=qtest \
>>>      -display none -qtest-log /dev/null \
>>>      -qtest unix:/tmp/socket,server \
>>>      -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \
>>>      -device virtio-net-pci,netdev=net0,mq=on \
>>>      -chardev socket,id=chr1,path=/tmp/ivshmem,server \
>>>      -device ivshmem,size=1G,chardev=chr1,vectors=1
>>>
>>> * One QEMU process is needed per port.
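
As background for other readers: in qtest mode QEMU executes no guest code
at all; the outer process drives the emulated machine by sending line-based
text commands over the qtest socket. Below is a rough sketch of such a
client in C, not the patch's actual code; it assumes the socket path from
the command line above, and the PCI addresses are made up for illustration.
The authoritative definition of the protocol is in QEMU's qtest code.

    /* Rough sketch of a qtest client: connect to the socket created by
     * "-qtest unix:/tmp/socket,server" and issue two PCI config-space
     * accesses.  Addresses are illustrative only. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    static void xfer(int fd, const char *cmd)
    {
        char buf[128];
        ssize_t n;

        dprintf(fd, "%s\n", cmd);           /* qtest commands are text lines */
        n = read(fd, buf, sizeof(buf) - 1); /* reply is "OK" or "OK <value>" */
        if (n > 0) {
            buf[n] = '\0';
            printf("%s -> %s", cmd, buf);
        }
    }

    int main(void)
    {
        struct sockaddr_un sa = { .sun_family = AF_UNIX };
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        strcpy(sa.sun_path, "/tmp/socket");
        if (fd < 0 || connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
            return 1;
        xfer(fd, "outl 0xcf8 0x80001010"); /* PIO write: select a config reg */
        xfer(fd, "inl 0xcfc");             /* PIO read: fetch its value */
        close(fd);
        return 0;
    }

If I read the patches right, the virtio-net PMD plays this client role only
during initialization (PCI enumeration, virtio setup); afterwards it sends
nothing more, so QEMU just blocks reading the socket, as noted below.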
>> Does qtest support hot-plugging a virtio-net PCI device, so that we
>> could run one QEMU process on the host, which provisions the virtio-net
>> virtual devices for the containers?
> Theoretically, we can use hot plug in some cases.
> But I guess we have 3 concerns here.
>
> 1. Security.
> If we share a QEMU process between multiple DPDK applications, this QEMU
> process will have all the fds of the applications in different containers.
> In some cases, this will be a security concern.
> So I guess we need to support the current 1:1 configuration at least.
>
> 2. Shared memory.
> Currently, QEMU and the DPDK application map the shared memory using the
> same virtual address.
> So if multiple DPDK applications connected to one QEMU process, each DPDK
> application would need a different address for its shared memory. I guess
> this will be a big limitation (see the sketch at the end of this mail).
>
> 3. PCI bridge.
> So far, QEMU has one PCI bridge, so we can connect roughly 10 PCI devices
> to QEMU.
> (I forget the exact number, but it's about 10, because some slots are
> reserved by QEMU.)
> A DPDK application needs both a virtio-net and an ivshmem device, so I
> guess about 5 DPDK applications can connect to one QEMU process, so far.
> Adding more PCI bridges would solve this, but we would need a lot of
> additional implementation to support cascaded PCI bridges and PCI devices.
> (Also, we would need to solve the 2nd concern above.)
>
> Anyway, if we use the virtio-net PMD and the vhost-user PMD, the QEMU
> process will not do anything after initialization.
> (QEMU will try to read the qtest socket, then block, because there are no
> more messages after initialization.)
> So I guess we can ignore the overhead of these QEMU processes.
> If someone cannot ignore it, I guess this is one of the cases where it's
> nice to use your lightweight container implementation.

Thanks for the explanation. Also, in your opinion, where is the best place
to run the QEMU instances? If we run the QEMU instances on the host, we
could get rid of the root privilege issue for vhost-kernel support.

Another issue: do you plan to support multiple virtio devices in a
container? Currently I find the code assumes only one virtio-net device
per QEMU, right?

Btw, I have read most of your qtest code. No obvious issues found so far,
but quite a few nits. You must have spent a lot of time on this.
It is great work!

> Thanks,
> Tetsuya
>
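
P.S. A small sketch to make concern 2 above concrete for readers following
along: QEMU and the DPDK application have to map the shared region at one
agreed virtual address, presumably because the structures exchanged through
it carry raw addresses. The path and base address below are hypothetical,
chosen only for illustration; this is not the code from the patches.

    /* Sketch: map a shared file at an agreed fixed virtual address so that
     * pointers stored inside the region are valid in both processes.
     * SHM_PATH and SHM_BASE are hypothetical, not from the patches. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define SHM_PATH "/dev/shm/dpdk_qtest_mem"  /* hypothetical */
    #define SHM_SIZE (1UL << 30)                /* 1G, matching ivshmem size=1G */
    #define SHM_BASE ((void *)0x700000000000UL) /* hypothetical agreed base */

    static void *map_shared(void)
    {
        int fd = open(SHM_PATH, O_RDWR);
        if (fd < 0) {
            perror("open");
            exit(1);
        }
        /* MAP_FIXED forces the mapping to SHM_BASE.  A second process whose
         * address space already uses that range cannot attach -- which is
         * the 1:1 limitation discussed in concern 2. */
        void *p = mmap(SHM_BASE, SHM_SIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_FIXED, fd, 0);
        close(fd);
        if (p == MAP_FAILED) {
            perror("mmap");
            exit(1);
        }
        return p;
    }

    int main(void)
    {
        printf("shared region at %p\n", map_shared());
        return 0;
    }

Whatever the exact mechanism, the base address must be agreed upon before
either side touches the region, which is why the 1:1 QEMU-per-application
model is the safe default.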