From: Jerin Jacob
Date: Fri, 17 Apr 2020 13:31:58 +0530
To: "Fu, Patrick"
Cc: "dev@dpdk.org", Maxime Coquelin, "Ye, Xiaolong", "Hu, Jiayu", "Wang, Zhihong", "Liang, Cunming"
Subject: Re: [dpdk-dev] [RFC] Accelerating Data Movement for DPDK vHost with DMA Engines
In-Reply-To: <89B17B9B05A1964E8D40D6090018F28151277ADF@SHSMSX107.ccr.corp.intel.com>

On Fri, Apr 17, 2020 at 12:56 PM Fu, Patrick wrote:
>
> Background
> ====================================
> The DPDK vhost library implements a user-space VirtIO net backend that allows host applications to communicate directly with VirtIO front-ends in VMs and containers. However, every vhost enqueue/dequeue operation requires copying packet buffers between guest and host memory. The overhead of copying large amounts of data makes the vhost backend the I/O bottleneck. DMA engines, including un-core DMA accelerators such as Crystal Beach DMA (CBDMA) and Data Streaming Accelerator (DSA), as well as discrete-card general-purpose DMA, are extremely efficient at moving data within system memory. Therefore, we propose a set of asynchronous DMA data movement APIs in the vhost library for DMA acceleration.
> Offloading packet copies in the vhost data path from the CPU to the DMA engine not only accelerates data transfers but also saves precious CPU core resources.
>
> New API Overview
> ====================================
> The proposed APIs in the vhost library support various DMA engines to accelerate data transfers in the data path. For higher performance, DMA engines work asynchronously: DMA data transfers and CPU computations execute in parallel. The proposed API consists of a control path API and a data path API. The control path API includes the registration API and the DMA operation callbacks, and the data path API includes the asynchronous API. To remove the dependency on vendor-specific DMA engines, the DMA operation callbacks provide generic DMA data transfer abstractions. To support asynchronous DMA data movement, the new async API provides asynchronous ring operation semantics in the data path. To enable or disable DMA acceleration for virtqueues, users call the registration API to register or unregister DMA callback implementations with the vhost library and to bind DMA channels to virtqueues. The DMA channels used by virtqueues are provided by DPDK applications and are backed by virtual or physical DMA devices.
> The proposed APIs consist of 3 subsets:
> 1. DMA Registration APIs
> 2. DMA Operation Callbacks
> 3. Async Data APIs
>
> DMA Registration APIs
> ====================================
> DMA acceleration is configured on a per-queue basis. DPDK applications need to explicitly decide whether a virtqueue needs DMA acceleration and which DMA channel to use. In addition, a DMA channel is dedicated to a virtqueue: it cannot be bound to multiple virtqueues at the same time.
> To enable DMA acceleration for a virtqueue, DPDK applications first implement the DMA operation callbacks for a specific DMA type (e.g. CBDMA), then register the callbacks with the vhost library and bind a DMA channel to the virtqueue, and finally use the new async API to perform data-path operations on the virtqueue.
> The definitions of the registration API are shown below:
> int rte_vhost_async_channel_register(int vid, uint16_t queue_id,
>                                      struct rte_vdma_device_ops *ops);
>
> int rte_vhost_async_channel_unregister(int vid, uint16_t queue_id);

We already have multiple DMA implementations over rawdev.
Why not make a new dmadev class for DMA acceleration and use it from virtio and any other clients?
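
To make the registration flow quoted above concrete, here is a minimal toy model of it in plain C. It is not DPDK code. Only the two prototypes come from the RFC; the member of struct rte_vdma_device_ops, the table sizes, the error codes, and the helpers sw_copy() and toy_demo() are all hypothetical, chosen only to illustrate "implement callbacks, register them on a virtqueue, unregister". A memcpy stands in for the DMA engine.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limits for the toy binding table. */
#define TOY_MAX_VIDS   4
#define TOY_MAX_QUEUES 8

/* Hypothetical layout: the RFC excerpt names this struct but does not
 * show its members. One copy callback is enough for the sketch. */
struct rte_vdma_device_ops {
    int (*transfer_data)(void *dst, const void *src, size_t len);
};

/* One slot per (vid, queue_id); NULL means DMA acceleration is off. */
static struct rte_vdma_device_ops *toy_bound[TOY_MAX_VIDS][TOY_MAX_QUEUES];

/* Toy stand-in with the RFC's prototype. Error codes are assumptions. */
int rte_vhost_async_channel_register(int vid, uint16_t queue_id,
                                     struct rte_vdma_device_ops *ops)
{
    if (vid < 0 || vid >= TOY_MAX_VIDS || queue_id >= TOY_MAX_QUEUES ||
        ops == NULL)
        return -EINVAL;
    if (toy_bound[vid][queue_id] != NULL)
        return -EEXIST;          /* queue already has callbacks bound */
    toy_bound[vid][queue_id] = ops;
    return 0;
}

/* Toy stand-in with the RFC's prototype. Error codes are assumptions. */
int rte_vhost_async_channel_unregister(int vid, uint16_t queue_id)
{
    if (vid < 0 || vid >= TOY_MAX_VIDS || queue_id >= TOY_MAX_QUEUES)
        return -EINVAL;
    if (toy_bound[vid][queue_id] == NULL)
        return -ENOENT;          /* nothing was registered on this queue */
    toy_bound[vid][queue_id] = NULL;
    return 0;
}

/* Hypothetical callback: a synchronous CPU copy standing in for a DMA
 * engine such as CBDMA. */
static int sw_copy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/* Walks the three steps from the RFC text on vid 0, queue 1. */
void toy_demo(void)
{
    struct rte_vdma_device_ops ops = { .transfer_data = sw_copy };
    char src[] = "pkt";
    char dst[sizeof(src)] = { 0 };
    int rc;

    rc = rte_vhost_async_channel_register(0, 1, &ops);
    assert(rc == 0);

    /* The data path would go through the bound callback. */
    rc = toy_bound[0][1]->transfer_data(dst, src, sizeof(src));
    assert(rc == 0 && memcmp(dst, src, sizeof(src)) == 0);

    /* A second registration on the same queue is rejected. */
    rc = rte_vhost_async_channel_register(0, 1, &ops);
    assert(rc == -EEXIST);

    rc = rte_vhost_async_channel_unregister(0, 1);
    assert(rc == 0);
}
```

Calling toy_demo() from a main() exercises the whole flow. Because the callbacks sit behind an ops table, the same shape would also fit a generic DMA device class shared by vhost and other clients, which is the direction the question above points at.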