From: "Tan, Jianfeng"
To: Tetsuya Mukawa, "dev@dpdk.org"
Cc: "nakajima.yoshihiro@lab.ntt.co.jp", "mst@redhat.com"
Date: Thu, 24 Dec 2015 14:05:13 +0000
Subject: Re: [dpdk-dev] [PATCH v1 0/2] Virtio-net PMD Extension to work on host
References: <1447930650-26023-2-git-send-email-mukawa@igel.co.jp> <1450255049-2263-1-git-send-email-mukawa@igel.co.jp>
In-Reply-To: <1450255049-2263-1-git-send-email-mukawa@igel.co.jp>
List-Id: patches and discussions about DPDK

Hi Tetsuya,

After studying your patch for several days, I have a few questions:

1. Is physically contiguous memory really necessary?
This is too strong a requirement IMHO. IVSHMEM as originally designed
does not require it. What do you think of Huawei Xie's idea of using
virtual addresses for address translation? (The virtual addresses in
mem_table could differ between the application and QTest, but that can
be handled because the SET_MEM_TABLE message will be intercepted by
QTest.)

2. Is root privilege acceptable for containers?
Another reason we would like to drop the physically contiguous
requirement is that it needs root privilege to read /proc/self/pagemap.
Containers are already widely criticized for weak security isolation,
and running with root privilege makes that worse.
On the other hand, removing the need for root entirely is not easy
either. If we use vhost-net as the backend, the kernel will definitely
require root privilege to create a tap device or raw socket. We tend to
move work that requires root into the runtime preparation of the
container. Do you agree?

3. Is one QTest process per virtio device too heavy?
Although we can expect each container to own only one virtio device,
containers may be deployed at high density: hundreds or even thousands
of containers would need the same number of QTest processes. Since you
mention that port hotplug is supported, is it possible to use just one
QTest process to emulate all virtio devices?

As you know, we have another solution for this (which is under heavy
internal review). But I think we have a lot of common problems to
solve, right?

Thanks for your great work!
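
To make question 2 concrete, here is a rough sketch of the kind of
/proc/self/pagemap lookup I have in mind. It is not taken from the patch
or from EAL; the function and macro names are only for illustration. The
bit layout (bit 63 = page present, bits 0-54 = page frame number) is
from the kernel's pagemap documentation. As far as I know, kernels since
4.2 report a zero PFN to callers without CAP_SYS_ADMIN, which is exactly
why this path ends up needing root.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Layout of one 64-bit /proc/<pid>/pagemap entry (Linux). */
#define PM_PRESENT  (1ULL << 63)        /* page is resident in RAM */
#define PM_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54: page frame number */

/*
 * Decode one pagemap entry into a physical address.
 * Returns 0 when the page is not present or the PFN is hidden
 * (unprivileged readers see a zero PFN on recent kernels).
 */
static uint64_t
pagemap_entry_to_phys(uint64_t entry, uint64_t page_size, uint64_t offset)
{
    uint64_t pfn = entry & PM_PFN_MASK;

    if (!(entry & PM_PRESENT) || pfn == 0)
        return 0;
    return pfn * page_size + offset;
}

/*
 * Hypothetical helper: translate a virtual address of the calling
 * process into a physical address. Returns 0 on any failure.
 */
static uint64_t
virt_to_phys(const void *virt)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t vaddr = (uint64_t)(uintptr_t)virt;
    uint64_t entry = 0;
    ssize_t n;
    int fd;

    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return 0;

    /* One 8-byte entry per virtual page, indexed by page number. */
    if (lseek(fd, (off_t)(vaddr / page_size * sizeof(entry)), SEEK_SET) < 0) {
        close(fd);
        return 0;
    }
    n = read(fd, &entry, sizeof(entry));
    close(fd);
    if (n != (ssize_t)sizeof(entry))
        return 0;

    return pagemap_entry_to_phys(entry, page_size, vaddr % page_size);
}
```

Run without root, virt_to_phys() is expected to return 0 on such
kernels even for a resident page, so the caller cannot tell "not
mapped" from "not permitted" with this sketch alone.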

Thanks,
Jianfeng

> -----Original Message-----
> From: Tetsuya Mukawa [mailto:mukawa@igel.co.jp]
> Sent: Wednesday, December 16, 2015 4:37 PM
> To: dev@dpdk.org
> Cc: nakajima.yoshihiro@lab.ntt.co.jp; Tan, Jianfeng; Xie, Huawei;
> mst@redhat.com; marcandre.lureau@gmail.com; Tetsuya Mukawa
> Subject: [PATCH v1 0/2] Virtio-net PMD Extension to work on host
>
> [Change log]
>
> PATCH v1:
> (Just listing functionality changes and important bug fix)
> * Support virtio-net interrupt handling.
>   (It means virtio-net PMD on host and guest have same virtio-net features)
> * Fix memory allocation method to allocate contiguous memory correctly.
> * Port Hotplug is supported.
> * Rebase on DPDK-2.2.
>
>
> [Abstraction]
>
> Normally, virtio-net PMD only works on VM, because there is no virtio-net
> device on host.
> This RFC patch extends virtio-net PMD to be able to work on host as virtual
> PMD.
> But we didn't implement virtio-net device as a part of virtio-net PMD.
> To prepare virtio-net device for the PMD, start QEMU process with special
> QTest mode, then connect it from virtio-net PMD through unix domain
> socket.
>
> The virtio-net PMD on host is fully compatible with the PMD on guest.
> We can use same functionalities, and connect to anywhere QEMU virtio-net
> device can.
> For example, the PMD can use virtio-net multi queues function. Also it can
> connects to vhost-net kernel module and vhost-user backend application.
> Similar to virtio-net PMD on QEMU, application memory that uses virtio-net
> PMD will be shared between vhost backend application. But vhost backend
> application memory will not be shared.
>
> Main target of this PMD is container like docker, rkt, lxc and etc.
> We can isolate related processes(virtio-net PMD process, QEMU and
> vhost-user backend process) by container.
> But, to communicate through unix domain socket, shared directory will be
> needed.
>
>
> [How to use]
>
> So far, we need QEMU patch to connect to vhost-user backend.
> See below patch.
>  - http://patchwork.ozlabs.org/patch/552549/
> To know how to use, check commit log.
>
>
> [Detailed Description]
>
> - virtio-net device implementation
> This host mode PMD uses QEMU virtio-net device. To do that, QEMU QTest
> functionality is used.
> QTest is a test framework of QEMU devices. It allows us to implement a
> device driver outside of QEMU.
> With QTest, we can implement DPDK application and virtio-net PMD as
> standalone process on host.
> When QEMU is invoked as QTest mode, any guest code will not run.
> To know more about QTest, see below.
>  - http://wiki.qemu.org/Features/QTest
>
> - probing devices
> QTest provides a unix domain socket. Through this socket, driver process can
> access to I/O port and memory of QEMU virtual machine.
> The PMD will send I/O port accesses to probe pci devices.
> If we can find virtio-net and ivshmem device, initialize the devices.
> Also, I/O port accesses of virtio-net PMD will be sent through socket, and
> virtio-net PMD can initialize vitio-net device on QEMU correctly.
>
> - ivshmem device to share memory
> To share memory that virtio-net PMD process uses, ivshmem device will be
> used.
> Because ivshmem device can only handle one file descriptor, shared memory
> should be consist of one file.
> To allocate such a memory, EAL has new option called "--contig-mem".
> If the option is specified, EAL will open a file and allocate memory from
> hugepages.
> While initializing ivshmem device, we can set BAR(Base Address Register).
> It represents which memory QEMU vcpu can access to this shared memory.
> We will specify host physical address of shared memory as this address.
> It is very useful because we don't need to apply patch to QEMU to calculate
> address offset.
> (For example, if virtio-net PMD process will allocate memory from shared
> memory, then specify the physical address of it to virtio-net register, QEMU
> virtio-net device can understand it without calculating address offset.)
>
>
> [Known issues]
>
> - vhost-user
> So far, to use vhost-user, we need to apply a patch to QEMU.
> This is because, QEMU will not send memory information and file descriptor
> of ivshmem device to vhost-user backend.
> I have submitted the patch to QEMU.
> See "http://patchwork.ozlabs.org/patch/552549/".
> Also, we may have an issue in DPDK vhost library to handle kickfd and callfd.
> The patch for this issue is needed. I have a workaround patch, but let me
> check it more.
> If someone wants to check vhost-user behavior, I will describe it more in
> later email.
>
>
>
>
> Tetsuya Mukawa (2):
>   EAL: Add new EAL "--contig-mem" option
>   virtio: Extend virtio-net PMD to support container environment
>
>  config/common_linuxapp                     |    1 +
>  drivers/net/virtio/Makefile                |    4 +
>  drivers/net/virtio/qtest.c                 | 1107 ++++++++++++++++++++++++++++
>  drivers/net/virtio/virtio_ethdev.c         |  341 ++++++++-
>  drivers/net/virtio/virtio_ethdev.h         |   12 +
>  drivers/net/virtio/virtio_pci.h            |   25 +
>  lib/librte_eal/common/eal_common_options.c |    7 +
>  lib/librte_eal/common/eal_internal_cfg.h   |    1 +
>  lib/librte_eal/common/eal_options.h        |    2 +
>  lib/librte_eal/linuxapp/eal/eal_memory.c   |   77 +-
>  10 files changed, 1543 insertions(+), 34 deletions(-)
>  create mode 100644 drivers/net/virtio/qtest.c
>
> --
> 2.1.4
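
For readers following the "probing devices" part of the quoted cover
letter, here is a small sketch of how a PCI vendor/device ID read might
be expressed as QTest I/O port commands. It is not code from the patch's
qtest.c; all helper names are made up for illustration. It assumes the
standard x86 PCI configuration mechanism (address port 0xcf8, data port
0xcfc) and the QTest text protocol, where "outl ADDR VALUE" is answered
with "OK" and "inl ADDR" with "OK VALUE". The virtio-net IDs (vendor
0x1af4, device 0x1000) are the legacy/transitional values from the
virtio specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PCI_CONFIG_ADDR 0xcf8u  /* x86 PCI configuration address port */
#define PCI_CONFIG_DATA 0xcfcu  /* x86 PCI configuration data port */

/* Build the value written to 0xcf8 to select bus/dev/fn/register. */
static uint32_t
pci_cfg_addr(unsigned bus, unsigned dev, unsigned fn, unsigned reg)
{
    return 0x80000000u | ((bus & 0xffu) << 16) | ((dev & 0x1fu) << 11) |
           ((fn & 0x7u) << 8) | (reg & 0xfcu);
}

/*
 * Format the two QTest commands for one 32-bit config space read.
 * QEMU answers each command with its own reply line.
 * Returns the string length, or -1 if buf is too small.
 */
static int
qtest_fmt_cfg_read(char *buf, size_t len, unsigned bus, unsigned dev,
                   unsigned fn, unsigned reg)
{
    int n = snprintf(buf, len, "outl 0x%x 0x%x\ninl 0x%x\n",
                     PCI_CONFIG_ADDR,
                     (unsigned)pci_cfg_addr(bus, dev, fn, reg),
                     PCI_CONFIG_DATA);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Parse an "OK VALUE" reply line. Returns 0 on success, -1 otherwise. */
static int
qtest_parse_value(const char *reply, uint32_t *val)
{
    char *end;
    unsigned long v;

    if (strncmp(reply, "OK ", 3) != 0)
        return -1;
    v = strtoul(reply + 3, &end, 0);
    if (end == reply + 3)
        return -1;
    *val = (uint32_t)v;
    return 0;
}

/* Register 0 holds the vendor ID (low 16 bits) and device ID (high 16). */
static int
is_virtio_net(uint32_t id_reg)
{
    return (id_reg & 0xffffu) == 0x1af4u && (id_reg >> 16) == 0x1000u;
}
```

A caller would write the formatted commands to the QTest unix domain
socket, read back the two reply lines, and pass the second one to
qtest_parse_value(). Scanning all device numbers on bus 0 this way is
one plausible way to find the virtio-net and ivshmem devices.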