From: Maxime Coquelin
Date: Thu, 20 Jun 2019 13:35:13 +0200
To: Nikos Dragazis, dev@dpdk.org
Cc: Tiwei Bie, Zhihong Wang, Stefan Hajnoczi, Wei Wang, Stojaczyk Dariusz, Vangelis Koukis
Subject: Re: [dpdk-dev] [PATCH 00/28] vhost: add virtio-vhost-user transport
In-Reply-To: <1560957293-17294-1-git-send-email-ndragazis@arrikto.com>
List-Id: DPDK patches and discussions

Hi Nikos,

On 6/19/19 5:14 PM, Nikos Dragazis wrote:
> Hi everyone,
>
> this patch series introduces the concept of the virtio-vhost-user
> transport. This is actually a revised version of an earlier RFC
> implementation that has been proposed by Stefan Hajnoczi [1]. Though
> this is a great feature, it seems to have been stalled, so I’d like to
> restart the conversation on this and hopefully get it merged with your
> help. Let me give you an overview.

Thanks for taking over the series!
I think you are already aware of this, but it arrives too late to be
considered for v19.08, as the proposal deadline passed almost 3 weeks
ago.

That said, it is good that you sent it early, so that we can work to
get it in for v19.11.

> The virtio-vhost-user transport is a vhost-user transport implementation
> that is based on the virtio-vhost-user device. Its key difference with
> the existing transport is that it allows deploying vhost-user targets
> inside dedicated Storage Appliance VMs instead of host user space. In
> other words, it allows having guests that act as vhost-user backends for
> other guests.
>
> The virtio-vhost-user device implements the vhost-user control plane
> (master-slave communication) as follows:
>
> 1. it parses the vhost-user messages from the vhost-user unix domain
>    socket and forwards them to the slave guest through virtqueues
>
> 2. it maps the vhost memory regions in QEMU’s process address space and
>    exposes them to the slave guest as a RAM-backed PCI MMIO region
>
> 3. it hooks up doorbells to the callfds. The slave guest can use these
>    doorbells to interrupt the master guest driver
>
> The device code has not yet been merged into upstream QEMU, but this is
> definitely the end goal.

Could you provide a pointer to the QEMU series, and instructions to test
this new device?

> The current state is that we are waiting for
> the approval of the virtio spec.

Ditto, a link to the spec patches would be useful.

> I have Cced Darek from the SPDK community who has helped me a lot by
> reviewing this series. Note that any device type could be implemented
> over this new transport. So, adding the virtio-vhost-user transport in
> DPDK would allow using it from SPDK as well.
>
> Getting into the code internals, this patch series makes the following
> changes:
>
> 1. introduce a generic interface for the transport-specific operations.
>    Each of the two available transports, the pre-existing AF_UNIX
>    transport and the virtio-vhost-user transport, is going to implement
>    this interface. The AF_UNIX-specific code has been extracted from the
>    core vhost-user code and is now part of the AF_UNIX transport
>    implementation in trans_af_unix.c.
>
> 2. introduce the virtio-vhost-user transport. The virtio-vhost-user
>    transport requires a driver for the virtio-vhost-user devices. The
>    driver along with the transport implementation have been packed into
>    a separate library in `drivers/virtio_vhost_user/`. The necessary
>    virtio-pci code has been copied from `drivers/net/virtio/`. Some
>    additional changes have been made so that the driver can utilize the
>    additional resources of the virtio-vhost-user device.
>
> 3. update librte_vhost public API to enable choosing transport for each
>    new vhost device. Extend the vhost net driver and vhost-scsi example
>    application to export this new API to the end user.
>
> The primary changes I did to Stefan’s RFC implementation are the
> following:
>
> 1. moved postcopy live migration code into trans_af_unix.c. Postcopy
>    live migration relies on the userfault fd mechanism, which cannot be
>    supported by virtio-vhost-user.
>
> 2. moved setup of the log memory region into trans_af_unix.c. Setting up
>    the log memory region involves mapping/unmapping guest memory. This
>    is an AF_UNIX transport-specific operation.
>
> 3. introduced a vhost transport operation for
>    process_slave_message_reply()
>
> 4. moved the virtio-vhost-user transport/driver into a separate library
>    in `drivers/virtio_vhost_user/`. This required making vhost.h and
>    vhost_user.h part of librte_vhost public API and exporting some
>    private symbols via the version script. This looks better to me than
>    just moving the entire librte_vhost into `drivers/`. I am not sure if
>    this is the most appropriate solution. I am looking forward to your
>    suggestions on this.

I'm not sure this is the right place to put it.

> 5. made use of the virtio PCI capabilities for the additional device
>    resources (doorbells, shared memory). This required changes in
>    virtio_pci.c and trans_virtio_vhost_user.c.
>
> 6. [minor] changed some commit headlines to comply with
>    check-git-log.sh.
>
> Please, have a look and let me know about your thoughts. Any
> reviews/pointers/suggestions are welcome.

Maxime
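
For readers following the thread, the "generic interface for the transport-specific operations" described in the cover letter can be pictured roughly as a table of function pointers that each transport fills in, with the transport chosen per vhost device. The sketch below is only an illustration under that assumption: every identifier in it (`demo_transport_ops`, `demo_dev_init`, `DEMO_TRANSPORT_VVU`, and so on) is hypothetical and is not taken from the patch series or from librte_vhost. The only behaviour borrowed from the cover letter is that postcopy live migration is available on the AF_UNIX transport and not on virtio-vhost-user.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout: this is an illustration of a
 * per-device transport ops table, not the API of the patch series. */

enum demo_transport_type {
	DEMO_TRANSPORT_AF_UNIX,	/* pre-existing unix domain socket transport */
	DEMO_TRANSPORT_VVU,	/* virtio-vhost-user transport */
	DEMO_TRANSPORT_MAX
};

struct demo_dev;

/* Operations that every transport has to provide. */
struct demo_transport_ops {
	const char *name;
	int (*send_reply)(struct demo_dev *dev, const char *msg);
	int (*supports_postcopy)(void);
};

struct demo_dev {
	const struct demo_transport_ops *ops;
	char last_reply[64];	/* stands in for the real message channel */
};

static int af_unix_send_reply(struct demo_dev *dev, const char *msg)
{
	/* A real transport would write to the unix domain socket here. */
	snprintf(dev->last_reply, sizeof(dev->last_reply), "socket:%s", msg);
	return 0;
}

static int af_unix_supports_postcopy(void)
{
	return 1;	/* userfault fd is usable from host user space */
}

static int vvu_send_reply(struct demo_dev *dev, const char *msg)
{
	/* A real transport would enqueue the reply on a virtqueue here. */
	snprintf(dev->last_reply, sizeof(dev->last_reply), "virtqueue:%s", msg);
	return 0;
}

static int vvu_supports_postcopy(void)
{
	return 0;	/* per the cover letter, not supportable on this transport */
}

static const struct demo_transport_ops demo_transports[DEMO_TRANSPORT_MAX] = {
	[DEMO_TRANSPORT_AF_UNIX] = {
		.name = "af_unix",
		.send_reply = af_unix_send_reply,
		.supports_postcopy = af_unix_supports_postcopy,
	},
	[DEMO_TRANSPORT_VVU] = {
		.name = "virtio-vhost-user",
		.send_reply = vvu_send_reply,
		.supports_postcopy = vvu_supports_postcopy,
	},
};

/* Bind a device to one transport; returns 0 on success, -1 on a bad type. */
int demo_dev_init(struct demo_dev *dev, enum demo_transport_type type)
{
	if ((int)type < 0 || type >= DEMO_TRANSPORT_MAX)
		return -1;
	dev->ops = &demo_transports[type];
	dev->last_reply[0] = '\0';
	return 0;
}

/* Core code only ever goes through the ops table. */
int demo_dev_reply(struct demo_dev *dev, const char *msg)
{
	assert(dev->ops != NULL);
	return dev->ops->send_reply(dev, msg);
}
```

With a layout like this, core code would call `demo_dev_init(&dev, DEMO_TRANSPORT_VVU)` once and then use `demo_dev_reply()` without knowing which transport is underneath, while transport-specific features such as postcopy stay behind the ops table.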