From: Chenbo Xia
To: dev@dpdk.org, thomas@monjalon.net, xuan.ding@intel.com, xiuchun.lu@intel.com, cunming.liang@intel.com, changpeng.liu@intel.com
Cc: zhihong.wang@intel.com
Date: Fri, 14 Aug 2020 19:16:04 +0000
Message-Id: <20200814191606.26312-1-chenbo.xia@intel.com>
Subject: [dpdk-dev] [RFC v1 0/2] Add device emulation support in DPDK

This series
enables DPDK to be an alternative I/O device emulation library for
building virtualized devices in separate processes outside QEMU. It
introduces a new library (librte_vfio_user), a new device class (emudev)
and one pilot device provider (avf_emudev) with its backend Ethdev PMD
(avfbe_ethdev).

*librte_vfio_user* is a server implementation of VFIO-over-socket[1]
(also known as vfio-user), a protocol that allows a device to be
virtualized in a separate process outside of QEMU.

*emudev* is a device type for emulated devices. It is up to the device
provider to choose the transport. In the avf_emudev case, it uses
vfio-user as the transport to communicate with its client (e.g., QEMU).

*avf_emudev* is the emudev provider of AVF, a device specification for
Intel Virtual Functions across hardware generations. It is implemented
by an AVF emudev driver which offers a few APIs for avfbe_ethdev or
application logic to operate.

*avfbe_ethdev* is a normal ethdev PMD that supplies the basic I/O as the
backend data path of avf_emudev. One simple usage of avfbe_ethdev could
be a para-virtualized backend connected with network application logic.

Background & Motivation
-----------------------
In order to reduce the attack surface, the QEMU community is
disaggregating QEMU by removing part of the device emulation from it.
The disaggregated/multi-process QEMU uses VFIO-over-socket/vfio-user as
the main transport mechanism to disaggregate I/O services from QEMU[2].
Vfio-user essentially implements the VFIO device model presented to the
user process by a set of messages over a unix-domain socket. The main
difference between an application using vfio-user and one using the vfio
kernel module is that device manipulation is based on socket messages
for vfio-user but on system calls for the vfio kernel module. The
vfio-user devices consist of a generic VFIO device type, living in QEMU,
which is called the client[3], and the core device implementation
(emulated device), living outside of QEMU, which is called the server.
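To illustrate the difference between the two models, here is a minimal
sketch in C of device manipulation driven by socket messages instead of
system calls. It is only an illustration: struct demo_msg, the
DEMO_CMD_* codes and the demo_* helpers are hypothetical names made up
for this example and do not reflect the real vfio-user wire format or
the librte_vfio_user API. Client and server share one process here to
keep the sketch short; in the real setup they are QEMU and DPDK,
connected through a unix-domain socket.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical message layout, for illustration only.
 * This is NOT the vfio-user wire format. */
struct demo_msg {
	uint16_t id;     /* request identifier, echoed in the reply */
	uint16_t cmd;    /* one of the DEMO_CMD_* codes below */
	uint32_t offset; /* byte offset inside the emulated region */
	uint32_t value;  /* value to write, or value read back */
};

enum { DEMO_CMD_REGION_READ = 1, DEMO_CMD_REGION_WRITE = 2 };

#define DEMO_REGION_WORDS 16

/* Server side: read one request from the socket, apply it to the
 * emulated register array and send the reply. Returns 0 or -1. */
static int demo_server_handle(int fd, uint32_t *regs)
{
	struct demo_msg m;
	uint32_t idx;

	if (read(fd, &m, sizeof(m)) != (ssize_t)sizeof(m))
		return -1;
	if (m.offset % sizeof(uint32_t) != 0)
		return -1;
	idx = m.offset / sizeof(uint32_t);
	if (idx >= DEMO_REGION_WORDS)
		return -1;

	if (m.cmd == DEMO_CMD_REGION_WRITE)
		regs[idx] = m.value;
	else if (m.cmd == DEMO_CMD_REGION_READ)
		m.value = regs[idx];
	else
		return -1;

	if (write(fd, &m, sizeof(m)) != (ssize_t)sizeof(m))
		return -1;
	return 0;
}

/* Client side: where a kernel VFIO user would issue a system call on a
 * device fd, this sends a message and waits for the reply. The server
 * is invoked inline only because this demo runs in a single process. */
static int demo_client_access(int client_fd, int server_fd, uint32_t *regs,
			      uint16_t cmd, uint32_t offset, uint32_t *value)
{
	struct demo_msg m;

	assert(regs != NULL && value != NULL);
	memset(&m, 0, sizeof(m));
	m.id = 1;
	m.cmd = cmd;
	m.offset = offset;
	m.value = *value;

	if (write(client_fd, &m, sizeof(m)) != (ssize_t)sizeof(m))
		return -1;
	if (demo_server_handle(server_fd, regs) != 0)
		return -1;
	if (read(client_fd, &m, sizeof(m)) != (ssize_t)sizeof(m))
		return -1;
	*value = m.value;
	return 0;
}
```

Given a pair of descriptors from socketpair(AF_UNIX, SOCK_STREAM, 0, sv),
a DEMO_CMD_REGION_WRITE to offset 8 followed by a DEMO_CMD_REGION_READ of
the same offset returns the written value, with no system call issued on
a kernel VFIO device descriptor.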
With the introduction and support of vfio-user in QEMU, QEMU is
explicitly adding support for external emulated devices and data paths.
We are trying to leverage that by introducing vfio-user support in DPDK.
By doing so, DPDK can serve as an alternative I/O device emulation
library for building virtualized devices, along with a high-performance
data path, in separate processes outside QEMU. It will be easy for
hardware vendors to provide virtualized solutions of their hardware
devices by implementing emulated devices in DPDK.

Besides the vfio-user support introduced in DPDK, this series also
introduces the first emulated device implementation: the emulated AVF
device (avf_emudev), implemented by the AVF emulation driver (avf_emudev
driver). The emulated AVF device demonstrates how an emulated device
could be implemented in DPDK. SPDK is also investigating a use case for
NVMe.

Design overview
---------------

                    +------------------------------------------------------+
                    |   +---------------+      +---------------+           |
                    |   |  avf_emudev   |      | avfbe_ethdev  |           |
                    |   |    driver     |      |     driver    |           |
                    |   +---------------+      +---------------+           |
                    |           |                      |                   |
                    | ------------------------------------------- VDEV BUS |
                    |           |                      |                   |
                    |   +---------------+       +--------------+           |
+--------------+    |   | vdev:         |       | vdev:        |           |
| +----------+ |    |   | /path/to/vfio |       | avf_emudev_# |           |
| | Generic  | |    |   +---------------+       +--------------+           |
| | vfio-dev | |    |           |                                          |
| +----------+ |    |           |                                          |
| +----------+ |    |      +----------+                                    |
| | vfio-user| |    |      | vfio-user|                                    |
| | client   | |<---|----->| server   |                                    |
| +----------+ |    |      +----------+                                    |
| QEMU         |    | DPDK                                                 |
+--------------+    +------------------------------------------------------+

- vfio-user. Vfio-user in DPDK refers to the vfio-user protocol
  implementation playing the server role. It provides the transport
  between the emulated device and the generic VFIO device in QEMU. The
  emulated device in DPDK and the generic VFIO device in QEMU work
  together to present the VFIO device model to the VM.
  This series introduces the vfio-user implementation as a library
  called librte_vfio_user, which is under lib/librte_vfio_user.

- vdev:/path/to/vfio. It defines the emudev device and binds to the vdev
  bus driver. The emudev device is defined by DPDK applications through
  the command line as '--vdev=emu_iavf,path=/path/to/socket,id=#' in the
  avf_emudev case. The command-line parameters are the device name
  (emu_iavf), which identifies the corresponding driver (in this case,
  the avf_emudev driver, which implements the emudev device of AVF);
  path=/path/to/socket, which is used to open the transport interface to
  the vfio-user client in QEMU; and id, which is the index of the emudev
  device.

- avf_emudev driver. It implements the emulated AVF device, which is the
  emudev provider of AVF. The avf_emudev driver implements a few APIs,
  exposed through the emudev device APIs, for the avfbe_ethdev PMD or
  application logic to operate. These APIs are described in
  lib/librte_emudev/rte_emudev.h.

- vdev: avf_emudev_#. The vdev device is defined by the DPDK application
  through the command line as '--vdev=net_avfbe,id=#,avf_emu_id=#'. It
  is associated with the emudev provider of AVF by 'avf_emu_id=#'.

- avfbe_ethdev driver. It is a normal ethdev PMD that supplies the basic
  I/O as the backend data path of avf_emudev.

Why not rawdev for emulated device
----------------------------------
Instead of introducing the new class emudev, an emulated device could be
presented as a rawdev. However, the existing rawdev APIs cannot meet the
requirements of emulated devices. There are three API categories for
emudev: emudev device lifecycle management, backend-facing APIs, and
emudev device provider-facing APIs. The existing rawdev APIs could only
cover the lifecycle management APIs and some of the backend-facing APIs.
The other APIs, even if added to the rawdev API, are not required by
other rawdev applications.
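
The '--vdev' strings in the design overview are a device name followed
by comma-separated key=value pairs. As a rough sketch of how the
emu_iavf arguments could be split, assuming exactly the 'path' and 'id'
keys described above: struct demo_emu_args and demo_parse_vdev() are
hypothetical names for this example and not part of this series. An
actual driver would more likely rely on DPDK's rte_kvargs library rather
than hand-written parsing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed emu_iavf arguments. */
struct demo_emu_args {
	char name[32];  /* device name, e.g. "emu_iavf" */
	char path[108]; /* unix-domain socket path for the vfio-user client */
	int id;         /* index of the emudev device, -1 if not given */
};

/* Parse "name,path=...,id=N" (the part after "--vdev=").
 * Returns 0 on success, -1 on malformed or incomplete input. */
static int demo_parse_vdev(const char *arg, struct demo_emu_args *out)
{
	char buf[256];
	char *p, *next;

	if (arg == NULL || out == NULL || strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);
	memset(out, 0, sizeof(*out));
	out->id = -1;

	/* The first comma-separated field is the device name. */
	next = strchr(buf, ',');
	if (next != NULL)
		*next++ = '\0';
	if (buf[0] == '\0' || strlen(buf) >= sizeof(out->name))
		return -1;
	strcpy(out->name, buf);

	/* The remaining fields are key=value pairs. */
	for (p = next; p != NULL && *p != '\0'; p = next) {
		char *eq, *val, *end;
		long v;

		next = strchr(p, ',');
		if (next != NULL)
			*next++ = '\0';
		eq = strchr(p, '=');
		if (eq == NULL)
			return -1;
		*eq = '\0';
		val = eq + 1;

		if (strcmp(p, "path") == 0) {
			if (*val == '\0' || strlen(val) >= sizeof(out->path))
				return -1;
			strcpy(out->path, val);
		} else if (strcmp(p, "id") == 0) {
			v = strtol(val, &end, 10);
			if (*val == '\0' || *end != '\0' || v < 0 || v > 0xffff)
				return -1;
			out->id = (int)v;
		} else {
			return -1; /* unknown key */
		}
	}

	/* Both parameters are treated as mandatory in this sketch. */
	return (out->path[0] != '\0' && out->id >= 0) ? 0 : -1;
}
```

For example, demo_parse_vdev("emu_iavf,path=/tmp/vfio.sock,id=0", &args)
fills in name "emu_iavf", path "/tmp/vfio.sock" and id 0, while a string
missing 'path' or carrying an unknown key is rejected.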

References
----------
[1]: https://patchew.org/QEMU/1594913503-52271-1-git-send-email-thanos.makatos@nutanix.com/
[2]: https://wiki.qemu.org/Features/MultiProcessQEMU
[3]: https://github.com/elmarco/qemu/blob/wip/vfio-user/hw/vfio/libvfio-user.c

Chenbo Xia (2):
  vfio_user: Add library for vfio over socket
  emudev: Add library for emulated device

 lib/librte_emudev/rte_emudev.h       | 315 +++++++++++++++++++++++++
 lib/librte_vfio_user/rte_vfio_user.h | 335 +++++++++++++++++++++++++++
 2 files changed, 650 insertions(+)
 create mode 100644 lib/librte_emudev/rte_emudev.h
 create mode 100644 lib/librte_vfio_user/rte_vfio_user.h

-- 
2.17.1