From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65])
 by dpdk.org (Postfix) with ESMTP id 425AA5687
 for ; Fri, 29 Apr 2016 03:18:38 +0200 (CEST)
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by orsmga103.jf.intel.com with ESMTP; 28 Apr 2016 18:18:36 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.24,549,1455004800";
 d="scan'208";a="694039273"
Received: from dpdk06.sh.intel.com ([10.239.128.225])
 by FMSMGA003.fm.intel.com with ESMTP; 28 Apr 2016 18:18:35 -0700
From: Jianfeng Tan
To: dev@dpdk.org
Cc: Jianfeng Tan, Huawei Xie, rich.lane@bigswitch.com,
 yuanhan.liu@linux.intel.com, mst@redhat.com,
 nakajima.yoshihiro@lab.ntt.co.jp, p.fedin@samsung.com,
 michael.qiu@intel.com, ann.zhuangyanying@huawei.com, mukawa@igel.co.jp,
 nhorman@tuxdriver.com
Date: Fri, 29 Apr 2016 01:18:28 +0000
Message-Id: <1461892716-19122-1-git-send-email-jianfeng.tan@intel.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1446748276-132087-1-git-send-email-jianfeng.tan@intel.com>
References: <1446748276-132087-1-git-send-email-jianfeng.tan@intel.com>
Subject: [dpdk-dev] [PATCH v4 0/8] virtio support for container
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
X-List-Received-Date: Fri, 29 Apr 2016 01:18:39 -0000

v4:
 - Avoid using dev_type; instead use (eth_dev->pci_device is NULL) to
   judge whether it is a virtual device or a physical device.
 - Change the added device name to virtio-user.
 - Split into vhost_user.c, vhost_kernel.c, vhost.c, virtio_user_pci.c,
   virtio_user_dev.c.
 - Move virtio-user specific data from struct virtio_hw into struct
   virtio_user_hw.
 - Add support to send reset_owner message.
 - Change del_queue implementation. (This needs more checking.)
 - Remove rte_panic() and replace it with logging.
 - Add reset_owner into virtio_pci_ops.reset.
 - Merge parameters "rx" and "tx" into "queues" to eliminate confusion.
 - Move get_features to after set_owner.
 - Redefine path in virtio_user_hw from char * to char [].

v3:
 - Remove the --single-file option; make no change to EAL memory.
 - Remove the added API rte_eal_get_backfile_info(); instead we check all
   opened files against HUGEFILE_FMT to find hugepage files owned by DPDK.
 - Accordingly, add more restrictions to the "Known issues" section.
 - Rename parameter from queue_num to queue_size to avoid confusion.
 - Rename vhost_embedded.c to rte_eth_virtio_vdev.c.
 - Move code related to the newly added vdev to rte_eth_virtio_vdev.c; to
   reuse eth_virtio_dev_init(), remove its static declaration.
 - Implement dev_uninit() for rte_eth_dev_detach().
 - WARN -> ERR, in vhost_embedded.c
 - Expand the commit message to clarify the model.

v2:
 - Rebase on the patchset of virtio 1.0 support.
 - Fix failure to create non-hugepage memory.
 - Fix wrong size of memory region when "single-file" is used.
 - Fix setting of offset in virtqueue to use virtual address.
 - Fix setting TUNSETVNETHDRSZ in vhost-user's branch.
 - Add mac option to specify the mac address of this virtual device.
 - Update doc.

This patchset provides a high-performance networking interface (virtio)
for container-based DPDK applications. Starting DPDK apps in containers
with exclusive ownership of NIC devices is beyond its scope. The basic
idea here is to present a new virtual device (named virtio-user), which
can be discovered and initialized by DPDK. To minimize the change, we
reuse the already-existing virtio PMD code (drivers/net/virtio/).

Background: Previously, we usually used a virtio device in the context of
QEMU/VM, as the picture below shows. The virtio NIC is emulated in QEMU
and usually presented in the VM as a PCI device.

  ------------------
  |  virtio driver |  ----->  VM
  ------------------
        |
        | ----------> (over PCI bus or MMIO or Channel I/O)
        |
  ------------------
  | device emulate |
  |                |  ----->  QEMU
  | vhost adapter  |
  ------------------
        |
        | ----------> (vhost-user protocol or vhost-net ioctls)
        |
  ------------------
  | vhost backend  |
  ------------------

Compared to the QEMU/VM case, virtio support for containers requires
embedding the device framework inside the virtio PMD. So this converged
driver actually plays three roles:
  - virtio driver, to drive this new kind of virtual device;
  - device emulation, to present this virtual device and respond to the
    virtio driver, which is originally done by QEMU;
  - and the role of communicating with the vhost backend, which is also
    originally done by QEMU.

The code layout and functionality of each module:

  ----------------------
  | ------------------ |
  | | virtio driver  | |----> (virtio_user_pci.c)
  | ------------------ |
  |         |          |
  | ------------------ | ------>  virtio-user PMD
  | | device emulate |-|----> (virtio_user_dev.c)
  | |                | |
  | | vhost adapter  |-|----> (vhost_user.c, vhost_kernel.c, vhost.c)
  | ------------------ |
  ----------------------
         |
         | -------------- --> (vhost-user protocol or vhost-net ioctls)
         |
   ------------------
   | vhost backend  |
   ------------------

How to share memory?

In the VM case, QEMU always shares the whole physical memory layout with
the backend. But it is not feasible for a container, as a process, to
share all of its virtual memory regions with the backend. So only
specified virtual memory regions (those of shared type) are sent to the
backend. The limitation is that only addresses in these areas can be used
to transmit or receive packets.

Known issues:
 - Control queue and multi-queue are not supported yet.
 - Cannot work with --huge-unlink.
 - Cannot work with --no-huge.
 - Cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8)
   hugepages.
 - Root privilege is a must (mainly because of sorting hugepages according
   to physical address).
 - Applications should not use file names that match HUGEFILE_FMT
   ("%smap_%d").

How to use?

a. Apply this patchset.

b. Compile the container apps:
$: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
$: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
$: make -C examples/l2fwd RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
$: make -C examples/vhost RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc

c. Build a docker image using the Dockerfile below.
$: cat ./Dockerfile
FROM ubuntu:latest
WORKDIR /usr/src/dpdk
COPY . /usr/src/dpdk
ENV PATH "$PATH:/usr/src/dpdk/examples/l2fwd/build/"
$: docker build -t dpdk-app-l2fwd .

d. Use with vhost-user
$: ./examples/vhost/build/vhost-switch -c 3 -n 4 \
	--socket-mem 1024,1024 -- -p 0x1 --stats 1
$: docker run -i -t -v :/var/run/usvhost \
	-v /dev/hugepages:/dev/hugepages \
	dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
	--vdev=virtio-user0,path=/var/run/usvhost -- -p 0x1

e. Use with vhost-net
$: modprobe vhost
$: modprobe vhost-net
$: docker run -i -t --privileged \
	-v /dev/vhost-net:/dev/vhost-net \
	-v /dev/net/tun:/dev/net/tun \
	-v /dev/hugepages:/dev/hugepages \
	dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
	--vdev=virtio-user0,path=/dev/vhost-net -- -p 0x1

By the way, it's not necessary to run in a container.
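
For readers who want a concrete picture of the "How to share memory?"
rule, here is a minimal sketch in C. It is not code from this patchset:
struct mapping, struct region_table, select_shared_regions() and
addr_is_usable() are hypothetical names invented for illustration. Only
the limit VHOST_MEMORY_MAX_NREGIONS (8) comes from the "Known issues"
list. The sketch assumes the caller already knows which mappings are
shared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limit quoted in the "Known issues" list of this cover letter. */
#define VHOST_MEMORY_MAX_NREGIONS 8

/* Hypothetical description of one mapping in the process (not a DPDK type). */
struct mapping {
	uint64_t addr;  /* virtual address where the mapping starts */
	uint64_t size;  /* length of the mapping in bytes */
	int shared;     /* non-zero if mapped shared, e.g. a hugepage file */
};

/* Hypothetical table of regions that would be announced to the backend. */
struct region_table {
	uint32_t nregions;
	struct mapping regions[VHOST_MEMORY_MAX_NREGIONS];
};

/*
 * Copy only the shared mappings into the table.
 * Returns 0 on success, or -1 if there are more shared mappings than
 * the table can hold.
 */
static int
select_shared_regions(const struct mapping *maps, size_t n,
		      struct region_table *out)
{
	size_t i;

	assert(out != NULL);
	assert(maps != NULL || n == 0);

	out->nregions = 0;
	for (i = 0; i < n; i++) {
		if (!maps[i].shared)
			continue; /* private memory is never sent */
		if (out->nregions == VHOST_MEMORY_MAX_NREGIONS)
			return -1; /* too many regions for the backend */
		out->regions[out->nregions++] = maps[i];
	}
	return 0;
}

/* Return 1 if [addr, addr + len) lies fully inside one announced region. */
static int
addr_is_usable(const struct region_table *t, uint64_t addr, uint64_t len)
{
	uint32_t i;

	for (i = 0; i < t->nregions; i++) {
		const struct mapping *r = &t->regions[i];

		/* Written to avoid overflow in addr + len. */
		if (addr >= r->addr && len <= r->size &&
		    addr - r->addr <= r->size - len)
			return 1;
	}
	return 0;
}
```

A packet buffer would be acceptable for transmit or receive only if
addr_is_usable() returns 1 for it; a buffer in private memory, or one
that straddles the end of a region, is rejected.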

Signed-off-by: Huawei Xie
Signed-off-by: Jianfeng Tan
Acked-By: Neil Horman

Jianfeng Tan (8):
  virtio: hide phys addr check inside pci ops
  virtio: abstract vring hdr desc init as a method
  virtio: enable use virtual address to fill desc
  virtio-user: add vhost adapter layer
  virtio-user: add device emulation layer APIs
  virtio-user: add new virtual pci driver for virtio
  virtio-user: add a new virtual device named virtio-user
  doc: update doc for virtio-user

 config/common_linuxapp                           |   3 +
 doc/guides/nics/overview.rst                     |  64 +--
 doc/guides/rel_notes/release_16_07.rst           |   4 +
 drivers/net/virtio/Makefile                      |   8 +
 drivers/net/virtio/virtio_ethdev.c               |  69 ++--
 drivers/net/virtio/virtio_ethdev.h               |   2 +
 drivers/net/virtio/virtio_pci.c                  |  30 +-
 drivers/net/virtio/virtio_pci.h                  |   3 +-
 drivers/net/virtio/virtio_rxtx.c                 |   5 +-
 drivers/net/virtio/virtio_rxtx_simple.c          |  13 +-
 drivers/net/virtio/virtio_user/vhost.c           | 105 +++++
 drivers/net/virtio/virtio_user/vhost.h           | 221 +++++++++++
 drivers/net/virtio/virtio_user/vhost_kernel.c    | 254 ++++++++++++
 drivers/net/virtio/virtio_user/vhost_user.c      | 375 ++++++++++++++++++
 drivers/net/virtio/virtio_user/virtio_user_dev.c | 475 +++++++++++++++++++++++
 drivers/net/virtio/virtio_user/virtio_user_dev.h |  61 +++
 drivers/net/virtio/virtio_user/virtio_user_pci.c | 209 ++++++++++
 drivers/net/virtio/virtqueue.h                   |  33 +-
 18 files changed, 1849 insertions(+), 85 deletions(-)
 create mode 100644 drivers/net/virtio/virtio_user/vhost.c
 create mode 100644 drivers/net/virtio/virtio_user/vhost.h
 create mode 100644 drivers/net/virtio/virtio_user/vhost_kernel.c
 create mode 100644 drivers/net/virtio/virtio_user/vhost_user.c
 create mode 100644 drivers/net/virtio/virtio_user/virtio_user_dev.c
 create mode 100644 drivers/net/virtio/virtio_user/virtio_user_dev.h
 create mode 100644 drivers/net/virtio/virtio_user/virtio_user_pci.c

--
2.1.4