From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga12.intel.com (mga12.intel.com [192.55.52.136])
	by dpdk.org (Postfix) with ESMTP id 8FBB44C96
	for ; Mon, 25 Jun 2018 13:01:46 +0200 (CEST)
X-Amp-Result: UNKNOWN
X-Amp-Original-Verdict: FILE UNKNOWN
X-Amp-File-Uploaded: False
Received: from fmsmga008.fm.intel.com ([10.253.24.58])
	by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
	25 Jun 2018 04:01:45 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.51,270,1526367600"; d="scan'208";a="50342640"
Received: from debian.sh.intel.com (HELO debian) ([10.67.104.228])
	by fmsmga008.fm.intel.com with ESMTP; 25 Jun 2018 04:01:43 -0700
Date: Mon, 25 Jun 2018 19:01:46 +0800
From: Tiwei Bie
To: Dariusz Stojaczyk
Cc: dev@dpdk.org, Maxime Coquelin, Tetsuya Mukawa, Stefan Hajnoczi,
	Thomas Monjalon, yliu@fridaylinux.org, James Harris,
	Tomasz Kulasek, Pawel Wodkowski
Message-ID: <20180625110146.GA18211@debian>
References: <1526648465-62579-1-git-send-email-dariuszx.stojaczyk@intel.com>
	<20180607151227.23660-1-darek.stojaczyk@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20180607151227.23660-1-darek.stojaczyk@gmail.com>
User-Agent: Mutt/1.9.5 (2018-04-13)
Subject: Re: [dpdk-dev] [RFC v3 0/7] vhost2: new librte_vhost2 proposal
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 25 Jun 2018 11:01:47 -0000

On Thu, Jun 07, 2018 at 05:12:20PM +0200, Dariusz Stojaczyk wrote:
> rte_vhost is not vhost-user spec compliant. Some Vhost drivers have
> been already confirmed not to work with rte_vhost. virtio-user-scsi-pci
> in QEMU 2.12 doesn't fully initialize its management queues at SeaBIOS
> stage. This is perfectly fine from the Vhost-user spec perspective, but
> doesn't meet rte_vhost expectations.
> rte_vhost waits for all queues
> to be fully initialized before it allows the entire device to be
> processed.
>
> The problem can be temporarily worked around by start-stopping entire
> device (all queues) on each vhost-user message that could change queue
> state. This would increase general VM boot time and is totally
> unacceptable though.
>
> Fixing rte_vhost properly would require quite a big amount
> of changes which would completely break backward compatibility.
> We're introducing a new rte_vhost2 library intended to smooth the transition.
> It exposes a low-level API for implementing new vhost devices.
> The existing rte_vhost is about to be refactored to use rte_vhost2
> underneath and demanding users could now use rte_vhost2 directly.
>
> We're designing a fresh library here, so this opens up a great
> amount of possibilities and improvements we can make to the public
> API that will pay off significantly for the sake of future
> specification/library extensions.
>
> rte_vhost2 abstracts away most vhost-user/virtio-vhost-user specifics
> and allows developers to implement vhost devices with an ease.
> It calls user-provided callbacks once proper device initialization
> state has been reached. That is - memory mappings have changed,
> virtqueues are ready to be processed, features have changed at
> runtime, etc.
>
> Compared to the rte_vhost, this lib additionally allows the following:
>
> * ability to start/stop particular queues
> * most callbacks are now asynchronous - it greatly simplifies
>   the event handling for asynchronous applications and doesn't
>   make anything harder for synchronous ones.
> * this is low-level vhost API. It doesn't have any vhost-net, nvme
>   or crypto references. These backend-specific libraries will
>   be later refactored to use *this* generic library underneath.
>   This implies that the library doesn't do any virtqueue processing,
>   it only delivers vring addresses to the user, so he can process
>   virtqueues by himself.
> * abstracting away Unix domain sockets (vhost-user) and PCI
>   (virtio-vhost-user).
> * APIs for interrupt-driven drivers
> * replacing gpa_to_vva function with an IOMMU-aware variant
> * The API imposes how public functions can be called and how
>   internal data can change, so there's only a minimal work required
>   to ensure thread-safety. Possibly no mutexes are required at all.
> * full Virtio 1.0/vhost-user specification compliance.
>
> The proposed structure for the new library is described below.
> * rte_vhost2.h
>   - public API
>     - registering targets with provided ops
>     - unregistering targets
>     - iova_to_vva()
> * transport.h/.c
>   - implements rte_vhost2.h
>   - allows registering vhost transports, which are opaquely required by the
>     rte_vhost2.h API (target register function requires transport name).
>   - exposes a set of callbacks to be implemented by each transport
> * vhost_user.c
>   - vhost-user Unix domain socket transport
>   - does recvmsg()
>   - uses the underlying vhost-user helper lib to process messages, but still
>     handles some transport-specific ones, e.g. SET_MEM_TABLE
>   - calls some of the rte_vhost2.h ops registered with a target
> * fd_man.h/.c
>   - polls provided descriptors, calls user callbacks on fd events
>   - based on the original rte_vhost version
>   - additionally allows calling user-provided callbacks on the poll thread
> * vhost.h/.c
>   - a transport-agnostic vhost-user library
>   - calls most of the rte_vhost2.h ops registered with a target
>   - manages virtqueues state
>   - hopefully to be reused by the virtio-vhost-user
>   - exposes a set of callbacks to be implemented by each transport
>     (for e.g. sending message replies)
>
> This series includes only vhost-user transport. Virtio-vhost-user
> is to be done later.
>
> The following items are still TBD:
> * vhost-user slave channel
> * get/set_config
> * cfg_call() implementation
> * IOTLB
> * NUMA awareness
> * Live migration
> * vhost-net implementation (to port rte_vhost over)
> * vhost-crypto implementation (to port rte_vhost over)
> * vhost-user client mode (connecting to an external socket)
> * various error logging
>
> Dariusz Stojaczyk (7):
>   vhost2: add initial implementation
>   vhost2: add vhost-user helper layer
>   vhost2/vhost: handle queue_stop/device_destroy failure
>   vhost2: add asynchronous fd_man
>   vhost2: add initial vhost-user transport
>   vhost2/vhost: support protocol features
>   vhost2/user: implement tgt unregister path

Hi Dariusz,

Thank you for putting effort into making the DPDK vhost more generic!

From my understanding, your proposal is to:

1) Introduce rte_vhost2 to provide the APIs which allow users to
   implement vhost backends like SCSI, net, crypto, ...

2) Refactor the existing rte_vhost to use rte_vhost2. rte_vhost will
   still provide the existing sets of APIs below:
   1. The APIs which allow users to implement external vhost backends
      (these APIs were designed for SPDK previously)
   2. The APIs provided by the net backend
   3. The APIs provided by the crypto backend
   And the above APIs in rte_vhost won't be changed.

Is my understanding above correct? Thanks!

Best regards,
Tiwei Bie

>
>  lib/Makefile                             |   4 +-
>  lib/librte_vhost2/Makefile               |  25 +
>  lib/librte_vhost2/fd_man.c               | 288 +++++++++++
>  lib/librte_vhost2/fd_man.h               | 125 +++++
>  lib/librte_vhost2/rte_vhost2.h           | 304 +++++++++++
>  lib/librte_vhost2/rte_vhost2_version.map |  12 +
>  lib/librte_vhost2/transport.c            |  74 +++
>  lib/librte_vhost2/transport.h            |  32 ++
>  lib/librte_vhost2/vhost.c                | 728 ++++++++++++++++++++++++++
>  lib/librte_vhost2/vhost.h                | 203 ++++++++
>  lib/librte_vhost2/vhost_user.c           | 845 +++++++++++++++++++++++++++++++
>  11 files changed, 2638 insertions(+), 2 deletions(-)
>  create mode 100644 lib/librte_vhost2/Makefile
>  create mode 100644 lib/librte_vhost2/fd_man.c
>  create mode 100644 lib/librte_vhost2/fd_man.h
>  create mode 100644 lib/librte_vhost2/rte_vhost2.h
>  create mode 100644 lib/librte_vhost2/rte_vhost2_version.map
>  create mode 100644 lib/librte_vhost2/transport.c
>  create mode 100644 lib/librte_vhost2/transport.h
>  create mode 100644 lib/librte_vhost2/vhost.c
>  create mode 100644 lib/librte_vhost2/vhost.h
>  create mode 100644 lib/librte_vhost2/vhost_user.c
>
> --
> 2.11.0
>
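
For readers following the thread, here is a minimal sketch of the
"start/stop particular queues" idea from the quoted cover letter: a
queue is handed to the backend as soon as that one queue is ready,
without waiting for the rest of the device. Every name in it
(sketch_dev, sketch_ops, sketch_queue_ready, ...) is hypothetical and
made up for illustration. It is not the rte_vhost2 API from the patch
series, and the callbacks here are plain synchronous calls, whereas the
proposal describes most callbacks as asynchronous.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_QUEUES 8

/* Hypothetical per-queue state; not the rte_vhost2 layout. */
struct sketch_queue {
	bool started; /* the backend has been told to process this queue */
};

/* Hypothetical callbacks a backend would register with a target. */
struct sketch_ops {
	void (*queue_start)(void *ctx, unsigned int idx);
	void (*queue_stop)(void *ctx, unsigned int idx);
};

struct sketch_dev {
	struct sketch_queue q[SKETCH_MAX_QUEUES];
	const struct sketch_ops *ops;
	void *ctx; /* opaque backend context passed to every callback */
};

/* Start only queue idx; the state of the other queues is not consulted. */
static int
sketch_queue_ready(struct sketch_dev *dev, unsigned int idx)
{
	if (dev == NULL || idx >= SKETCH_MAX_QUEUES)
		return -1;
	if (!dev->q[idx].started) {
		dev->q[idx].started = true;
		dev->ops->queue_start(dev->ctx, idx);
	}
	return 0;
}

/* Stop only queue idx; a queue that is not running is left alone. */
static int
sketch_queue_disable(struct sketch_dev *dev, unsigned int idx)
{
	if (dev == NULL || idx >= SKETCH_MAX_QUEUES)
		return -1;
	if (dev->q[idx].started) {
		dev->q[idx].started = false;
		dev->ops->queue_stop(dev->ctx, idx);
	}
	return 0;
}

/* Example backend: counts how many queues are currently running. */
static void
count_start(void *ctx, unsigned int idx)
{
	(void)idx;
	(*(unsigned int *)ctx)++;
}

static void
count_stop(void *ctx, unsigned int idx)
{
	(void)idx;
	(*(unsigned int *)ctx)--;
}

static unsigned int
sketch_demo(void)
{
	unsigned int running = 0;
	const struct sketch_ops ops = {
		.queue_start = count_start,
		.queue_stop = count_stop,
	};
	struct sketch_dev dev = { .ops = &ops, .ctx = &running };

	sketch_queue_ready(&dev, 0);
	sketch_queue_ready(&dev, 2);
	sketch_queue_ready(&dev, 2);   /* repeated: no second callback */
	assert(!dev.q[1].started);     /* queue 1 was never set up */
	sketch_queue_disable(&dev, 0); /* stop one queue, keep the other */
	return running;
}
```

In this sketch, a device whose driver only ever sets up some of its
queues (as in the SeaBIOS case described in the cover letter) still gets
those queues processed, and stopping one queue does not disturb the
others.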