Date: Tue, 14 Jun 2016 16:34:19 +0800
From: Yuanhan Liu
To: Jianfeng Tan
Cc: dev@dpdk.org, Huawei Xie, rich.lane@bigswitch.com, mst@redhat.com,
 nakajima.yoshihiro@lab.ntt.co.jp, p.fedin@samsung.com,
 ann.zhuangyanying@huawei.com, mukawa@igel.co.jp, nhorman@tuxdriver.com
Message-ID: <20160614083419.GV10038@yliu-dev.sh.intel.com>
References: <1465799943-138577-1-git-send-email-jianfeng.tan@intel.com>
In-Reply-To: <1465799943-138577-1-git-send-email-jianfeng.tan@intel.com>
Subject: Re: [dpdk-dev] [PATCH v8 0/6] virtio support for container

Series Acked-by: Yuanhan Liu

	--yliu

On Mon, Jun 13, 2016 at 06:38:57AM +0000, Jianfeng Tan wrote:
> v8:
> - Change to use max_queue_pairs instead of queue_pairs to initialize
>   and deinitialize queues.
> - Remove vhost-kernel support.
>
> v7:
> - CONFIG_RTE_VIRTIO_VDEV -> CONFIG_RTE_VIRTIO_USER; and correspondingly,
>   RTE_VIRTIO_VDEV -> RTE_VIRTIO_USER.
> - uint64_t -> uintptr_t, so that it can be compiled on 32-bit platforms.
> - Rebase on latest dpdk-next-virtio branch.
> - Abandon abstracting related code into vring_hdr_desc_init(); instead,
>   just move it behind setup_queue().
>
> v6:
> - Move driver related code from drivers/net/virtio/virtio-user/ to the
>   drivers/net/virtio/ directory, inside virtio_user_ethdev.c.
> - Rename vdev to virtio_user in comments and code.
> - Merge code, which lies in virtio_user_pci.c, into virtio_user_ethdev.c.
> - Add some comments on virtio-user special handling in virtio_dev_ethdev.c.
> - Merge document update into the 7nd commit where virtio-user is added.
> - Add usage with vhost-switch in vhost.rst.
>
> v5:
> - Rename struct virtio_user_hw to struct virtio_user_dev.
> - Rename "vdev_private" to "virtio_user_dev".
> - Move special handling into virtio_ethdev.c from queue_setup().
> - Add vring in virtio_user_dev (remove rte_eth_dev_data), so that
>   device does not depend on driver's data structure (rte_eth_dev_data).
> - Remove update on doc/guides/nics/overview.rst, because virtio-user has
>   exactly the same feature set as virtio.
> - Change "unsigned long int" to "uint64_t", "unsigned" to "uint32_t".
> - Remove unnecessary cast in vdev_read_dev_config().
> - Add functions in virtio_user_dev.c with prefix of "virtio_user_".
> - Rebase on virtio-next-virtio.
>
> v4:
> - Avoid using dev_type; instead use (eth_dev->pci_device is NULL) to
>   judge whether it's a virtual device or a physical device.
> - Change the added device name to virtio-user.
> - Split into vhost_user.c, vhost_kernel.c, vhost.c, virtio_user_pci.c,
>   virtio_user_dev.c.
> - Move virtio-user specific data from struct virtio_hw into struct
>   virtio_user_hw.
> - Add support to send reset_owner message.
> - Change del_queue implementation. (This needs more checking.)
> - Remove rte_panic() and replace it with logging.
> - Add reset_owner into virtio_pci_ops.reset.
> - Merge parameters "rx" and "tx" into "queues" to eliminate confusion.
> - Move get_features to after set_owner.
> - Redefine path in virtio_user_hw from char * to char [].
>
> v3:
> - Remove --single-file option; make no change to EAL memory.
> - Remove the added API rte_eal_get_backfile_info(); instead we check all
>   opened files with HUGEFILE_FMT to find hugepage files owned by DPDK.
> - Accordingly, add more restrictions to the "Known issues" section.
> - Rename parameter from queue_num to queue_size to avoid confusion.
> - Rename vhost_embedded.c to rte_eth_virtio_vdev.c.
> - Move code related to the newly added vdev to rte_eth_virtio_vdev.c; to
>   reuse eth_virtio_dev_init(), remove its static declaration.
> - Implement dev_uninit() for rte_eth_dev_detach().
> - WARN -> ERR, in vhost_embedded.c
> - Expand the commit message to clarify the model.
>
> v2:
> - Rebase on the patchset of virtio 1.0 support.
> - Fix failure to create non-hugepage memory.
> - Fix wrong size of memory region when "single-file" is used.
> - Fix setting of offset in virtqueue to use virtual address.
> - Fix setting TUNSETVNETHDRSZ in vhost-user's branch.
> - Add mac option to specify the mac address of this virtual device.
> - Update doc.
>
> This patchset provides a high performance networking interface (virtio)
> for container-based DPDK applications. How to start DPDK apps in
> containers with exclusive ownership of NIC devices is beyond its scope.
> The basic idea here is to present a new virtual device (named virtio-user),
> which can be discovered and initialized by DPDK. To minimize the change,
> we reuse the already-existing virtio PMD code (drivers/net/virtio/).
>
> Background: Previously, we usually used a virtio device in the context of
> QEMU/VM, as the figure below shows. The virtio NIC is emulated in QEMU, and
> usually presented in the VM as a PCI device.
>
>   ------------------
>   |  virtio driver |  ----->  VM
>   ------------------
>            |
>            | ----------> (over PCI bus or MMIO or Channel I/O)
>            |
>   ------------------
>   | device emulate |
>   |                |  ----->  QEMU
>   |  vhost adapter |
>   ------------------
>            |
>            | ----------> (vhost-user protocol or vhost-net ioctls)
>            |
>   ------------------
>   | vhost backend  |
>   ------------------
>
> Compared to the QEMU/VM case, virtio support for containers requires
> embedding the device framework inside the virtio PMD. So this converged
> driver actually plays three roles:
>   - virtio driver to drive this new kind of virtual device;
>   - device emulation to present this virtual device and respond to the
>     virtio driver, which was originally done by QEMU;
>   - and the role to communicate with the vhost backend, which was also
>     originally done by QEMU.
>
> The code layout and functionality of each module:
>
>   ----------------------
>   | ------------------ |
>   | | virtio driver  | |----> (virtio_user_ethdev.c)
>   | ------------------ |
>   |         |          |
>   | ------------------ | ------>  virtio-user PMD
>   | | device emulate |-|----> (virtio_user_dev.c)
>   | |                | |
>   | | vhost adapter  |-|----> (vhost_user.c, vhost_kernel.c, vhost.c)
>   | ------------------ |
>   ----------------------
>            |
>            | -------------- --> (vhost-user protocol)
>            |
>   ------------------
>   | vhost backend  |
>   ------------------
>
> How to share memory? In the VM case, QEMU always shares the whole physical
> memory layout with the backend. But it's not feasible for a container, as
> a process, to share all of its virtual memory regions with the backend. So
> only specified virtual memory regions (of shared type) are sent to the
> backend. The limitation is that only addresses in these areas can be used
> to transmit or receive packets.
>
> Known issues:
>  - Control queue and multi-queue are not supported yet.
>  - Cannot work with --huge-unlink.
>  - Cannot work with --no-huge.
>  - Cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8)
>    hugepages.
>  - Root privilege is a must (mainly because of sorting hugepages according
>    to physical address).
>  - Applications should not use file names like HUGEFILE_FMT ("%smap_%d").
>  - Cannot work with vhost kernel.
>
> How to use?
>
> a. Apply this patchset.
>
> b. To compile container apps:
> $: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/l2fwd RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/vhost RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
>
> c. To build a docker image using the Dockerfile below:
> $: cat ./Dockerfile
> FROM ubuntu:latest
> WORKDIR /usr/src/dpdk
> COPY . /usr/src/dpdk
> ENV PATH "$PATH:/usr/src/dpdk/examples/l2fwd/build/"
> $: docker build -t dpdk-app-l2fwd .
>
> d. To use with vhost-user:
> $: ./examples/vhost/build/vhost-switch -c 3 -n 4 \
> 	--socket-mem 1024,1024 -- -p 0x1 --stats 1
> $: docker run -i -t -v :/var/run/usvhost \
> 	-v /dev/hugepages:/dev/hugepages \
> 	dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
> 	--vdev=virtio-user0,path=/var/run/usvhost -- -p 0x1
>
> By the way, it's not necessary to run in a container.
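
On the memory-sharing limits listed under "Known issues": a rough way to
picture them is a scan of the process's mappings that keeps only shared
hugepage files and gives up once there are more than
VHOST_MEMORY_MAX_NREGIONS of them. The Python sketch below is only an
illustration under stated assumptions, not the code in this series: the
/proc/<pid>/maps-style input, the helper name shared_hugepage_regions() and
the regular expression built from HUGEFILE_FMT are all hypothetical choices
made for the example.

```python
import os
import re
from typing import List, NamedTuple

# Limit quoted in the "Known issues" list of the cover letter.
VHOST_MEMORY_MAX_NREGIONS = 8

# Assumption: basenames shaped like HUGEFILE_FMT ("%smap_%d"), e.g. "rtemap_0".
HUGEFILE_RE = re.compile(r"^.*map_\d+$")


class Region(NamedTuple):
    start: int  # virtual address where the mapping begins
    size: int   # length of the mapping in bytes
    path: str   # backing hugepage file


def shared_hugepage_regions(maps_text: str,
                            hugepage_dir: str = "/dev/hugepages") -> List[Region]:
    """Pick shared hugepage-file mappings out of /proc/<pid>/maps-style text.

    Hypothetical helper: raises ValueError when more regions are found than
    VHOST_MEMORY_MAX_NREGIONS allows.
    """
    regions = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # no backing file (anonymous mapping)
        addr, perms, _offset, _dev, _inode, path = fields
        if len(perms) < 4 or perms[3] != "s":
            continue  # private mapping, not visible to the backend
        if os.path.dirname(path) != hugepage_dir:
            continue  # not under the hugepage mount point
        if not HUGEFILE_RE.match(os.path.basename(path)):
            continue  # file name does not look like a DPDK hugepage file
        start_s, end_s = addr.split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        regions.append(Region(start, end - start, path))
    if len(regions) > VHOST_MEMORY_MAX_NREGIONS:
        raise ValueError("too many hugepage regions: %d > %d"
                         % (len(regions), VHOST_MEMORY_MAX_NREGIONS))
    return regions
```

This also shows why an application file named like HUGEFILE_FMT would be a
problem for such a scan: it would be mistaken for a DPDK hugepage file.
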
>
> Signed-off-by: Huawei Xie
> Signed-off-by: Jianfeng Tan
>
>
> Jianfeng Tan (6):
>   virtio: hide phys addr check inside pci ops
>   virtio: enable use virtual address to fill desc
>   virtio-user: add vhost user adapter layer
>   virtio-user: add device emulation layer APIs
>   virtio-user: add new virtual pci driver for virtio
>   virtio-user: add a new vdev named virtio-user
>
>  config/common_linuxapp                           |   1 +
>  doc/guides/rel_notes/release_16_07.rst           |  12 +
>  doc/guides/sample_app_ug/vhost.rst               |  17 +
>  drivers/net/virtio/Makefile                      |   6 +
>  drivers/net/virtio/virtio_ethdev.c               |  77 ++--
>  drivers/net/virtio/virtio_ethdev.h               |   2 +
>  drivers/net/virtio/virtio_pci.c                  |  30 +-
>  drivers/net/virtio/virtio_pci.h                  |   3 +-
>  drivers/net/virtio/virtio_rxtx.c                 |   5 +-
>  drivers/net/virtio/virtio_rxtx_simple.c          |  13 +-
>  drivers/net/virtio/virtio_user/vhost.h           | 141 ++++++++
>  drivers/net/virtio/virtio_user/vhost_user.c      | 404 +++++++++++++++++++++
>  drivers/net/virtio/virtio_user/virtio_user_dev.c | 227 ++++++++++++
>  drivers/net/virtio/virtio_user/virtio_user_dev.h |  62 ++++
>  drivers/net/virtio/virtio_user_ethdev.c          | 427 +++++++++++++++++++++++
>  drivers/net/virtio/virtqueue.h                   |  10 +
>  16 files changed, 1395 insertions(+), 42 deletions(-)
>  create mode 100644 drivers/net/virtio/virtio_user/vhost.h
>  create mode 100644 drivers/net/virtio/virtio_user/vhost_user.c
>  create mode 100644 drivers/net/virtio/virtio_user/virtio_user_dev.c
>  create mode 100644 drivers/net/virtio/virtio_user/virtio_user_dev.h
>  create mode 100644 drivers/net/virtio/virtio_user_ethdev.c
>
> --
> 2.1.4