From: "Tan, Jianfeng"
To: "dev@dpdk.org"
CC: "Xie, Huawei", "rich.lane@bigswitch.com", "yuanhan.liu@linux.intel.com", "mst@redhat.com", "nakajima.yoshihiro@lab.ntt.co.jp", "p.fedin@samsung.com", "Qiu, Michael", "ann.zhuangyanying@huawei.com", "mukawa@igel.co.jp", "nhorman@tuxdriver.com"
Date: Fri, 29 Apr 2016 01:35:05 +0000
Subject: Re: [dpdk-dev] [PATCH v4 0/8] virtio support for container
References: <1446748276-132087-1-git-send-email-jianfeng.tan@intel.com> <1461892716-19122-1-git-send-email-jianfeng.tan@intel.com>
In-Reply-To: <1461892716-19122-1-git-send-email-jianfeng.tan@intel.com>
List-Id: patches and discussions about DPDK

Sorry, I forgot to mention that this patchset depends on:
 - [PATCH v2] virtio: fix modify drv_flags for specific device
 - [PATCH v3 0/2] virtio: fix memory leak of virtqueue memzones

Thanks,
Jianfeng

> -----Original Message-----
> From: Tan, Jianfeng
> Sent: Friday, April 29, 2016 9:18 AM
> To: dev@dpdk.org
> Cc: Tan, Jianfeng; Xie, Huawei; rich.lane@bigswitch.com;
> yuanhan.liu@linux.intel.com; mst@redhat.com;
> nakajima.yoshihiro@lab.ntt.co.jp; p.fedin@samsung.com; Qiu, Michael;
> ann.zhuangyanying@huawei.com; mukawa@igel.co.jp;
> nhorman@tuxdriver.com
> Subject: [PATCH v4 0/8] virtio support for container
>
> v4:
>  - Avoid using dev_type; instead use (eth_dev->pci_device is NULL) to
>    judge whether it's a virtual device or a physical device.
>  - Change the added device name to virtio-user.
>  - Split into vhost_user.c, vhost_kernel.c, vhost.c, virtio_user_pci.c,
>    virtio_user_dev.c.
>  - Move virtio-user specific data from struct virtio_hw into struct
>    virtio_user_hw.
>  - Add support to send reset_owner message.
>  - Change del_queue implementation. (This needs more checking.)
>  - Remove rte_panic() and replace it with logging.
>  - Add reset_owner into virtio_pci_ops.reset.
>  - Merge parameters "rx" and "tx" into "queues" to eliminate confusion.
>  - Move get_features to after set_owner.
>  - Redefine path in virtio_user_hw from char * to char [].
>
> v3:
>  - Remove --single-file option; make no change to EAL memory.
>  - Remove the added API rte_eal_get_backfile_info(); instead we check all
>    opened files with HUGEFILE_FMT to find hugepage files owned by DPDK.
>  - Accordingly, add more restrictions to the "Known issues" section.
>  - Rename parameter from queue_num to queue_size to avoid confusion.
>  - Rename vhost_embedded.c to rte_eth_virtio_vdev.c.
>  - Move code related to the newly added vdev to rte_eth_virtio_vdev.c; to
>    reuse eth_virtio_dev_init(), remove its static declaration.
>  - Implement dev_uninit() for rte_eth_dev_detach().
>  - WARN -> ERR, in vhost_embedded.c
>  - Expand the commit message to clarify the model.
>
> v2:
>  - Rebase on the patchset of virtio 1.0 support.
>  - Fix failure to create non-hugepage memory.
>  - Fix wrong size of memory region when "single-file" is used.
>  - Fix setting of offset in virtqueue to use virtual address.
>  - Fix setting TUNSETVNETHDRSZ in vhost-user's branch.
>  - Add mac option to specify the mac address of this virtual device.
>  - Update doc.
>
> This patchset provides a high-performance networking interface (virtio)
> for container-based DPDK applications. Starting DPDK apps in containers
> with exclusive ownership of NIC devices is beyond its scope.
> The basic idea is to present a new virtual device (named virtio-user),
> which can be discovered and initialized by DPDK. To minimize changes,
> we reuse the existing virtio PMD code (drivers/net/virtio/).
>
> Background: Previously, we usually used a virtio device in the context of
> QEMU/VM, as the picture below shows. The virtio NIC is emulated in QEMU
> and is usually presented in the VM as a PCI device.
>
>   ------------------
>   |  virtio driver |  ----->  VM
>   ------------------
>         |
>         | ----------> (over PCI bus or MMIO or Channel I/O)
>         |
>   ------------------
>   | device emulate |
>   |                |  ----->  QEMU
>   | vhost adapter  |
>   ------------------
>         |
>         | ----------> (vhost-user protocol or vhost-net ioctls)
>         |
>   ------------------
>   | vhost backend  |
>   ------------------
>
> Compared to the QEMU/VM case, virtio support for containers requires
> embedding a device framework inside the virtio PMD.
> So this converged driver actually plays three roles:
>   - virtio driver, to drive this new kind of virtual device;
>   - device emulation, to present this virtual device and respond to the
>     virtio driver, which was originally done by QEMU;
>   - and the role of communicating with the vhost backend, which was also
>     originally done by QEMU.
>
> The code layout and functionality of each module:
>
>   ----------------------
>   | ------------------ |
>   | | virtio driver  | |----> (virtio_user_pci.c)
>   | ------------------ |
>   |         |          |
>   | ------------------ | ------>  virtio-user PMD
>   | | device emulate |-|----> (virtio_user_dev.c)
>   | |                | |
>   | | vhost adapter  |-|----> (vhost_user.c, vhost_kernel.c, vhost.c)
>   | ------------------ |
>   ----------------------
>          |
>          | -------------- --> (vhost-user protocol or vhost-net ioctls)
>          |
>    ------------------
>    | vhost backend  |
>    ------------------
>
> How to share memory? In the VM case, QEMU always shares the entire
> physical memory layout with the backend. But it's not feasible for a
> container, as a process, to share all of its virtual memory regions with
> the backend. So only specified virtual memory regions (of shared type)
> are sent to the backend. The limitation is that only addresses in these
> areas can be used to transmit or receive packets.
>
> Known issues:
>  - Control queue and multi-queue are not supported yet.
>  - Cannot work with --huge-unlink.
>  - Cannot work with no-huge.
>  - Cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8)
>    hugepages.
>  - Root privilege is a must (mainly because of sorting hugepages according
>    to physical address).
>  - Applications should not use file names like HUGEFILE_FMT ("%smap_%d").
>
> How to use?
>
> a. Apply this patchset.
>
> b. Compile the container apps:
> $: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/l2fwd RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/vhost RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
>
> c. Build a docker image using the Dockerfile below:
> $: cat ./Dockerfile
> FROM ubuntu:latest
> WORKDIR /usr/src/dpdk
> COPY . /usr/src/dpdk
> ENV PATH "$PATH:/usr/src/dpdk/examples/l2fwd/build/"
> $: docker build -t dpdk-app-l2fwd .
>
> d. Use with vhost-user:
> $: ./examples/vhost/build/vhost-switch -c 3 -n 4 \
> 	--socket-mem 1024,1024 -- -p 0x1 --stats 1
> $: docker run -i -t -v :/var/run/usvhost \
> 	-v /dev/hugepages:/dev/hugepages \
> 	dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
> 	--vdev=virtio-user0,path=/var/run/usvhost -- -p 0x1
>
> e. Use with vhost-net:
> $: modprobe vhost
> $: modprobe vhost-net
> $: docker run -i -t --privileged \
> 	-v /dev/vhost-net:/dev/vhost-net \
> 	-v /dev/net/tun:/dev/net/tun \
> 	-v /dev/hugepages:/dev/hugepages \
> 	dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
> 	--vdev=virtio-user0,path=/dev/vhost-net -- -p 0x1
>
> By the way, it's not necessary to run in a container.
>
> Signed-off-by: Huawei Xie
> Signed-off-by: Jianfeng Tan
> Acked-By: Neil Horman
>
> Jianfeng Tan (8):
>   virtio: hide phys addr check inside pci ops
>   virtio: abstract vring hdr desc init as a method
>   virtio: enable use virtual address to fill desc
>   virtio-user: add vhost adapter layer
>   virtio-user: add device emulation layer APIs
>   virtio-user: add new virtual pci driver for virtio
>   virtio-user: add a new virtual device named virtio-user
>   doc: update doc for virtio-user
>
>  config/common_linuxapp                           |   3 +
>  doc/guides/nics/overview.rst                     |  64 +--
>  doc/guides/rel_notes/release_16_07.rst           |   4 +
>  drivers/net/virtio/Makefile                      |   8 +
>  drivers/net/virtio/virtio_ethdev.c               |  69 ++--
>  drivers/net/virtio/virtio_ethdev.h               |   2 +
>  drivers/net/virtio/virtio_pci.c                  |  30 +-
>  drivers/net/virtio/virtio_pci.h                  |   3 +-
>  drivers/net/virtio/virtio_rxtx.c                 |   5 +-
>  drivers/net/virtio/virtio_rxtx_simple.c          |  13 +-
>  drivers/net/virtio/virtio_user/vhost.c           | 105 +++++
>  drivers/net/virtio/virtio_user/vhost.h           | 221 +++++++++++
>  drivers/net/virtio/virtio_user/vhost_kernel.c    | 254 ++++++++++++
>  drivers/net/virtio/virtio_user/vhost_user.c      | 375 ++++++++++++++++++
>  drivers/net/virtio/virtio_user/virtio_user_dev.c | 475 +++++++++++++++++++++++
>  drivers/net/virtio/virtio_user/virtio_user_dev.h |  61 +++
>  drivers/net/virtio/virtio_user/virtio_user_pci.c | 209 ++++++++++
>  drivers/net/virtio/virtqueue.h                   |  33 +-
>  18 files changed, 1849 insertions(+), 85 deletions(-)
>  create mode 100644 drivers/net/virtio/virtio_user/vhost.c
>  create mode 100644 drivers/net/virtio/virtio_user/vhost.h
>  create mode 100644 drivers/net/virtio/virtio_user/vhost_kernel.c
>  create mode 100644 drivers/net/virtio/virtio_user/vhost_user.c
>  create mode 100644 drivers/net/virtio/virtio_user/virtio_user_dev.c
>  create mode 100644 drivers/net/virtio/virtio_user/virtio_user_dev.h
>  create mode 100644 drivers/net/virtio/virtio_user/virtio_user_pci.c
>
> --
> 2.1.4
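
On the memory-sharing limitation described in the cover letter (only addresses inside the regions sent to the backend can carry packets, and at most VHOST_MEMORY_MAX_NREGIONS(8) regions can be sent), the following is a rough sketch of the bookkeeping this implies. It is an illustration under stated assumptions, not the patchset's code: `shared_region`, `shared_region_table`, `region_table_add()` and `region_table_covers()` are hypothetical names, and the constant simply mirrors the value 8 quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors VHOST_MEMORY_MAX_NREGIONS(8) from the cover letter. */
#define MAX_SHARED_REGIONS 8

/* Hypothetical types for illustration only. */
struct shared_region {
	uint64_t start; /* process virtual address where the region begins */
	uint64_t size;  /* length of the region in bytes */
};

struct shared_region_table {
	size_t nregions;
	struct shared_region regions[MAX_SHARED_REGIONS];
};

/*
 * Record one shared region. Returns 0 on success, -1 if the table is
 * full, the size is zero, or start + size would wrap around.
 */
static int
region_table_add(struct shared_region_table *t, uint64_t start, uint64_t size)
{
	if (t->nregions >= MAX_SHARED_REGIONS || size == 0 ||
	    start + size < start)
		return -1;
	t->regions[t->nregions].start = start;
	t->regions[t->nregions].size = size;
	t->nregions++;
	return 0;
}

/*
 * Return 1 if the buffer [addr, addr + len) lies entirely inside one
 * recorded region, 0 otherwise. Only such buffers would be visible to
 * the backend.
 */
static int
region_table_covers(const struct shared_region_table *t,
		    uint64_t addr, uint64_t len)
{
	size_t i;

	for (i = 0; i < t->nregions; i++) {
		const struct shared_region *r = &t->regions[i];

		/* Written to avoid overflow in addr + len. */
		if (addr >= r->start && len <= r->size &&
		    addr - r->start <= r->size - len)
			return 1;
	}
	return 0;
}
```

A ninth call to region_table_add() fails, which corresponds to the "more than VHOST_MEMORY_MAX_NREGIONS(8) hugepages" item in the known issues; a buffer that straddles the end of a region is reported as not covered.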