From: "Qiu, Michael"
To: Tetsuya Mukawa, "dev@dpdk.org"
Cc: "nakajima.yoshihiro@lab.ntt.co.jp", "zhbzg@huawei.com", "mst@redhat.com",
    "gaoxiaoqiu@huawei.com", "oscar.zhangbo@huawei.com",
    "ann.zhuangyanying@huawei.com", "zhoujingbin@huawei.com",
    "guohongzhen@huawei.com"
Date: Mon, 28 Dec 2015 05:15:58 +0000
Message-ID: <533710CFB86FA344BFBF2D6802E6028622F04DCF@SHSMSX101.ccr.corp.intel.com>
References: <1447930650-26023-1-git-send-email-mukawa@igel.co.jp>
Subject: Re: [dpdk-dev] [RFC PATCH 0/2] Virtio-net PMD Extension to work on host.
Hi, Tetsuya

I have a question about your solution: as I understand it, you plan to run both QEMU and DPDK in containers, right?

If so, I think it is a bit tricky: DPDK is a library and QEMU is an application, and it does not seem suitable for a library to depend on an application.

Also, so far I don't see any use case for running QEMU inside a container.

Thanks,
Michael

On 11/19/2015 6:58 PM, Tetsuya Mukawa wrote:
> THIS IS A PoC IMPLEMENTATION.
>
> [Abstraction]
>
> Normally, the virtio-net PMD only works in a VM, because there is no virtio-net device on the host.
> This RFC patch extends the virtio-net PMD so that it can also work on the host as a virtual PMD.
> We did not implement a virtio-net device as part of the virtio-net PMD.
> Instead, to provide a virtio-net device for the PMD, a QEMU process is started in its special QTest mode, and the virtio-net PMD connects to it through a unix domain socket.
>
> The PMD can connect to any backend a QEMU virtio-net device can.
> For example, the PMD can connect to the vhost-net kernel module or to a vhost-user backend application.
> As with the virtio-net PMD in a guest, the memory of the application that uses the virtio-net PMD is shared with the vhost backend application.
> The vhost backend application's own memory is not shared.
>
> The main targets of this PMD are containers such as docker, rkt, lxc, etc.
> The related processes (the virtio-net PMD process, QEMU and the vhost-user backend process) can each be isolated in a container.
> However, a shared directory is needed so that they can communicate over unix domain sockets.
>
>
> [How to use]
>
> So far, a QEMU patch is needed to connect to a vhost-user backend.
> Please check the known issues in a later section.
> Because of this, the example below uses the vhost-net kernel module.
>
> - Compile
> Set "CONFIG_RTE_VIRTIO_VDEV=y" in config/common_linuxapp, then compile.
>
> - Start QEMU like below.
> $ sudo qemu-system-x86_64 -qtest unix:/tmp/qtest0,server -machine accel=qtest \
>              -display none -qtest-log /dev/null \
>              -netdev type=tap,script=/etc/qemu-ifup,id=net0,vhost=on \
>              -device virtio-net-pci,netdev=net0 \
>              -chardev socket,id=chr1,path=/tmp/ivshmem0,server \
>              -device ivshmem,size=1G,chardev=chr1,vectors=1
>
> - Start the DPDK application like below.
> $ sudo ./testpmd -c f -n 1 -m 1024 --shm \
>              --vdev="eth_cvio0,qtest=/tmp/qtest0,ivshmem=/tmp/ivshmem0" -- \
>              --disable-hw-vlan --txqflags=0xf00 -i
>
> - Check the created tap device.
>
> (*1) Please specify the same memory size on the QEMU and DPDK command lines.
>
>
> [Detailed Description]
>
> - virtio-net device implementation
> The PMD uses QEMU's virtio-net device. To do that, QEMU's QTest functionality is used.
> QTest is a test framework for QEMU devices. It allows a device driver to be implemented outside of QEMU.
> With QTest, the DPDK application and the virtio-net PMD can run as a standalone process on the host.
> When QEMU is invoked in QTest mode, no guest code runs.
> To learn more about QTest, see:
> http://wiki.qemu.org/Features/QTest
>
> - probing devices
> QTest provides a unix domain socket. Through this socket, the driver process can access the I/O ports and memory of the QEMU virtual machine.
> The PMD sends I/O port accesses over it to probe PCI devices.
> If a virtio-net device and an ivshmem device are found, they are initialized.
> The virtio-net PMD's own I/O port accesses are also sent through the socket, so the PMD can correctly initialize the virtio-net device in QEMU.
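As a rough illustration (not part of the RFC), probing a PCI device over the qtest socket could look roughly like the sketch below, assuming the plain-text "outl"/"inl" commands and "OK ..." replies described on the QTest wiki page above; the helper names and the /tmp/qtest0 path are only examples:

    /* Sketch: talk to QEMU started with "-qtest unix:/tmp/qtest0,server"
     * and read PCI config dword 0 (vendor/device ID) of bus 0, dev 0, fn 0
     * using PCI configuration mechanism #1 (ports 0xcf8/0xcfc). */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <sys/un.h>
    #include <unistd.h>

    /* Connect to the unix domain socket QEMU listens on in qtest mode. */
    static int qtest_connect(const char *path)
    {
        struct sockaddr_un sa;
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        memset(&sa, 0, sizeof(sa));
        sa.sun_family = AF_UNIX;
        strncpy(sa.sun_path, path, sizeof(sa.sun_path) - 1);
        if (fd < 0 || connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
            return -1;
        return fd;
    }

    /* Send one qtest command line and read back the reply line. */
    static int qtest_cmd(int fd, const char *cmd, char *reply, size_t len)
    {
        ssize_t n;

        if (write(fd, cmd, strlen(cmd)) < 0)
            return -1;
        n = read(fd, reply, len - 1);
        if (n < 0)
            return -1;
        reply[n] = '\0';
        return 0;
    }

    int main(void)
    {
        char reply[128];
        int fd = qtest_connect("/tmp/qtest0");

        if (fd < 0)
            return 1;

        /* Select bus 0 / dev 0 / fn 0, offset 0 via port 0xcf8, then read
         * the vendor/device ID from 0xcfc. The PMD would walk all devfn
         * values looking for the virtio-net and ivshmem devices. */
        qtest_cmd(fd, "outl 0xcf8 0x80000000\n", reply, sizeof(reply));
        qtest_cmd(fd, "inl 0xcfc\n", reply, sizeof(reply));
        printf("bus 0, slot 0, config dword 0: %s", reply);

        close(fd);
        return 0;
    }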
>
> - ivshmem device to share memory
> The ivshmem device is used to share the memory that the virtio-net PMD process uses.
> Because the ivshmem device can only handle one file descriptor, the shared memory has to consist of a single file.
> To allocate such memory, EAL gets a new option called "--shm".
> If the option is specified, EAL opens one file and allocates its memory from hugepages.
> While initializing the ivshmem device, we can set its BAR (Base Address Register), which determines at which address the QEMU vcpu sees this shared memory.
> We specify the host physical address of the shared memory as that address.
> This is very useful because no QEMU patch is needed to calculate an address offset.
> (For example, if the virtio-net PMD process allocates memory from the shared area and writes its physical address into a virtio-net register, the QEMU virtio-net device can interpret it without any offset calculation.)

(A rough sketch of such a single-file hugepage allocation follows after the quoted message.)

> - Known limitation
> So far, the PMD does not handle interrupts from QEMU devices.
> Because of this, the VIRTIO_NET_F_STATUS functionality is dropped.
> All other virtio-net functions can still be used without it.
>
> - Known issues
> So far, to use vhost-user, patches need to be applied to QEMU and to the DPDK vhost library.
> This is because QEMU does not send the memory information and file descriptor of the ivshmem device to the vhost-user backend.
> (The vhost-net kernel module does receive the information, so only the vhost-user behavior is incorrect. I will submit the patch to QEMU soon.)
> Also, there may be an issue in the DPDK vhost library in handling kickfd and callfd; a patch for that is needed as well.
> (Let me check it more.)
> If someone wants to check the vhost-user behavior, I will describe it in more detail in a later email.
>
>
> [Addition]
>
> The same approach can be used to handle any kind of QEMU device from a DPDK application.
> So far, I don't have any ideas beyond the virtio-net device, but someone else might.
>
>
> Tetsuya Mukawa (2):
>   EAL: Add new EAL "--shm" option.
>   virtio: Extend virtio-net PMD to support container environment
>
>  config/common_linuxapp                      |   5 +
>  drivers/net/virtio/Makefile                 |   4 +
>  drivers/net/virtio/qtest.c                  | 590 ++++++++++++++++++++++++++++
>  drivers/net/virtio/virtio_ethdev.c          | 214 ++++++++++-
>  drivers/net/virtio/virtio_ethdev.h          |  16 +
>  drivers/net/virtio/virtio_pci.h             |  25 ++
>  lib/librte_eal/common/eal_common_options.c  |   5 +
>  lib/librte_eal/common/eal_internal_cfg.h    |   1 +
>  lib/librte_eal/common/eal_options.h         |   2 +
>  lib/librte_eal/common/include/rte_memory.h  |   5 +
>  lib/librte_eal/linuxapp/eal/eal_memory.c    |  71 ++++
>  11 files changed, 917 insertions(+), 21 deletions(-)
>  create mode 100644 drivers/net/virtio/qtest.c
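As a rough illustration of the single-file shared memory described in the "--shm"/ivshmem section above (not taken from the patch; the hugetlbfs path, file name and the 1 GB size are only placeholders, and the size must match the ivshmem "size=" option):

    /* Sketch: back the whole shared memory area with one hugetlbfs file so
     * that a single file descriptor can be handed to QEMU's ivshmem device
     * (and, in turn, to a vhost backend). The real "--shm" logic lives in
     * the RFC's eal_memory.c changes. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        const size_t size = 1UL << 30;   /* must match "-device ivshmem,size=1G" */
        int fd = open("/dev/hugepages/rte_shm0", O_CREAT | O_RDWR, 0600);
        void *va;

        if (fd < 0 || ftruncate(fd, size) < 0) {
            perror("hugetlbfs file");
            return 1;
        }

        /* One file, one mapping: this fd is what gets shared with QEMU. */
        va = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (va == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        printf("shared area mapped at %p, fd=%d\n", va, fd);
        munmap(va, size);
        close(fd);
        return 0;
    }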