From: Maxime Coquelin
To: Yongji Xie
Cc: dev@dpdk.org, David Marchand, chenbo.xia@intel.com, mkp@redhat.com,
 fbl@redhat.com, Jason Wang, cunming.liang@intel.com, echaudro@redhat.com,
 Eugenio Perez Martin, Adrian Moreno Zapata
Subject: Re: [RFC 00/27] Add VDUSE support to Vhost library
Date: Thu, 6 Apr 2023 10:16:36 +0200
Message-ID: <7199500b-5c34-0e8b-07a2-1bf1b549bb27@redhat.com>
References: <20230331154259.1447831-1-maxime.coquelin@redhat.com>

Hi Yongji,

On 4/6/23 05:44, Yongji Xie wrote:
> Hi Maxime,
>
> On Fri, Mar 31, 2023 at 11:43 PM Maxime Coquelin wrote:
>>
>> This series introduces a new type of backend, VDUSE, to the Vhost
>> library.
>>
>> VDUSE stands for vDPA Device in Userspace; it enables implementing a
>> Virtio device in userspace and attaching it to the Kernel vDPA bus.
>>
>> Once attached to the vDPA bus, it can be used either by Kernel Virtio
>> drivers, like virtio-net in our case, via the virtio-vdpa driver. In
>> that case, the device is visible to the Kernel networking stack and is
>> exposed to userspace as a regular netdev.
>>
>> It can also be exposed to userspace thanks to the vhost-vdpa driver,
>> via a vhost-vdpa chardev that can be passed to QEMU or the Virtio-user
>> PMD.
>>
>> While VDUSE support is already available in the upstream Kernel, a
>> couple of patches are required to support the network device type:
>>
>> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
>>
>> In order to attach the created VDUSE device to the vDPA bus, a recent
>> iproute2 version containing the vdpa tool is required.
>>
>> Usage:
>> ======
>>
>> 1. Probe the required Kernel modules
>> # modprobe vdpa
>> # modprobe vduse
>> # modprobe virtio-vdpa
>>
>> 2. Build (requires the VDUSE kernel headers to be available)
>> # meson build
>> # ninja -C build
>>
>> 3. Create a VDUSE device (vduse0) using the Vhost PMD with testpmd
>>    (with 4 queue pairs in this example)
>> # ./build/app/dpdk-testpmd --no-pci --vdev=net_vhost0,iface=/dev/vduse/vduse0,queues=4 --log-level=*:9 -- -i --txq=4 --rxq=4
>>
>> 4. Attach the VDUSE device to the vDPA bus
>> # vdpa dev add name vduse0 mgmtdev vduse
>> => The virtio-net netdev shows up (eth0 here)
>> # ip l show eth0
>> 21: eth0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
>>     link/ether c2:73:ea:a7:68:6d brd ff:ff:ff:ff:ff:ff
>>
>> 5. Start/stop traffic in testpmd
>> testpmd> start
>> testpmd> show port stats 0
>> ######################## NIC statistics for port 0 ########################
>> RX-packets: 11    RX-missed: 0    RX-bytes: 1482
>> RX-errors: 0
>> RX-nombuf: 0
>> TX-packets: 1     TX-errors: 0    TX-bytes: 62
>>
>> Throughput (since last show)
>> Rx-pps: 0    Rx-bps: 0
>> Tx-pps: 0    Tx-bps: 0
>> ############################################################################
>> testpmd> stop
>>
>> 6. Detach the VDUSE device from the vDPA bus
>> # vdpa dev del vduse0
>>
>> 7. Quit testpmd
>> testpmd> quit
>>
>> Known issues & remaining work:
>> ==============================
>> - Fix issue in the FD manager (still polling while the FD has been removed)
>> - Add Netlink support in the Vhost library
>> - Support device reconnection
>> - Support packed ring
>> - Enable & test more Virtio features
>> - Provide performance benchmark results
>>
>
> Nice work! Thanks for bringing VDUSE to the network area. I wonder if
> you have some plan to support userspace memory registration [1]? I
> think this feature can benefit the performance since an extra data
> copy could be eliminated in our case.

I plan to have a closer look later, once VDUSE support has been added.

I think it will be difficult to support in the case of DPDK for
networking:

- For the dequeue path, it would basically mean re-introducing the
  dequeue zero-copy support that we removed some time ago. It was a hack
  where we replaced the regular mbuf buffer with the descriptor one,
  increased the reference counter, and at the next dequeue API calls
  checked whether the former mbuf's refcount was back to 1 and, if so,
  restored the mbuf (a rough sketch of the idea follows below). The
  issue is that physical NIC drivers usually release transmitted mbufs
  in batches, once a certain threshold is met. So it can cause draining
  of the virtqueue, as the descriptors are not written back into the
  used ring for quite some time, depending on the NIC/traffic/...

- For the enqueue path, I don't think this is possible with virtual
  switches by design, as when an mbuf is received on a physical port, we
  don't know to which Vhost/VDUSE port it will be switched. And for VM
  to VM communication, should it use the src VM buffer or the dest VM
  one? The only case where it could work is a simple forwarder between a
  VDUSE device and a physical port, but I don't think there is much
  interest in such a use case.
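To make the dequeue point a bit more concrete, here is a rough,
simplified sketch of what that removed hack looked like conceptually.
This is not the actual code that was in the Vhost library; the
zcopy_* helpers and the zcopy_entry structure are made up for the
example, only the mbuf fields and refcount helpers are real DPDK API.

/*
 * Illustrative sketch of the removed dequeue zero-copy hack:
 * lend the guest descriptor buffer through the mbuf instead of
 * copying, keep the mbuf alive via its refcount, and only mark
 * the descriptor as used once the mbuf has been released.
 */
#include <stdbool.h>
#include <rte_mbuf.h>

struct zcopy_entry {
        struct rte_mbuf *mbuf;     /* mbuf handed to the application/NIC */
        void *orig_buf_addr;       /* saved original mbuf buffer */
        rte_iova_t orig_buf_iova;
        uint16_t desc_idx;         /* descriptor to flag as used later */
};

/* Dequeue time: point the mbuf at the guest descriptor buffer. */
static void
zcopy_lend_desc(struct zcopy_entry *e, struct rte_mbuf *m,
                void *desc_addr, rte_iova_t desc_iova, uint16_t desc_idx)
{
        e->mbuf = m;
        e->orig_buf_addr = m->buf_addr;
        e->orig_buf_iova = m->buf_iova;
        e->desc_idx = desc_idx;

        /* Replace the regular mbuf buffer with the descriptor one. */
        m->buf_addr = desc_addr;
        m->buf_iova = desc_iova;
        /* Bump the refcount so the mbuf is not recycled under us. */
        rte_mbuf_refcnt_update(m, 1);
}

/* Later dequeue calls: check whether the mbuf was released downstream. */
static bool
zcopy_try_reclaim(struct zcopy_entry *e)
{
        if (rte_mbuf_refcnt_read(e->mbuf) > 1)
                return false;   /* still in flight, used ring not updated */

        /* Restore the original buffer; the caller can now write
         * e->desc_idx back into the used ring. */
        e->mbuf->buf_addr = e->orig_buf_addr;
        e->mbuf->buf_iova = e->orig_buf_iova;
        return true;
}

The draining problem mentioned above shows up here: if the NIC driver
only frees transmitted mbufs in batches, zcopy_try_reclaim() keeps
returning false and the used ring is not updated for a long time.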
Any thoughts?

Thanks,
Maxime

> [1] https://lwn.net/Articles/902809/
>
> Thanks,
> Yongji
>