Subject: Re: [RFC 00/27] Add VDUSE support to Vhost library
From: Yongji Xie
Date: Thu, 6 Apr 2023 19:04:20 +0800
To: Maxime Coquelin
Cc: dev@dpdk.org, David Marchand, chenbo.xia@intel.com, mkp@redhat.com,
 fbl@redhat.com, Jason Wang, cunming.liang@intel.com, echaudro@redhat.com,
 Eugenio Perez Martin, Adrian Moreno Zapata

On Thu, Apr 6, 2023 at 4:17 PM Maxime Coquelin wrote:
>
> Hi Yongji,
>
> On 4/6/23 05:44, Yongji Xie wrote:
> > Hi Maxime,
> >
> > On Fri, Mar 31, 2023 at 11:43 PM Maxime Coquelin wrote:
> >>
> >> This series introduces a new type of backend, VDUSE,
> >> to the Vhost library.
> >>
> >> VDUSE stands for vDPA device in Userspace. It enables
> >> implementing a Virtio device in userspace and having it
> >> attached to the Kernel vDPA bus.
> >>
> >> Once attached to the vDPA bus, it can be used by Kernel
> >> Virtio drivers, like virtio-net in our case, via the
> >> virtio-vdpa driver. In that case, the device is visible
> >> to the Kernel networking stack and is exposed to userspace
> >> as a regular netdev.
> >>
> >> It can also be exposed to userspace thanks to the
> >> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> >> passed to QEMU or the Virtio-user PMD.
> >>
> >> While VDUSE support is already available in the upstream
> >> Kernel, a couple of patches are required to support the
> >> network device type:
> >>
> >> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> >>
> >> In order to attach the created VDUSE device to the vDPA
> >> bus, a recent iproute2 version containing the vdpa tool is
> >> required.
> >>
> >> Usage:
> >> ======
> >>
> >> 1. Probe the required Kernel modules
> >> # modprobe vdpa
> >> # modprobe vduse
> >> # modprobe virtio-vdpa
> >>
> >> 2. Build (requires the vduse kernel headers to be available)
> >> # meson build
> >> # ninja -C build
> >>
> >> 3. Create a VDUSE device (vduse0) using the Vhost PMD with
> >> testpmd (with 4 queue pairs in this example)
> >> # ./build/app/dpdk-testpmd --no-pci --vdev=net_vhost0,iface=/dev/vduse/vduse0,queues=4 --log-level=*:9 -- -i --txq=4 --rxq=4
> >>
> >> 4. Attach the VDUSE device to the vDPA bus
> >> # vdpa dev add name vduse0 mgmtdev vduse
> >> => The virtio-net netdev shows up (eth0 here)
> >> # ip l show eth0
> >> 21: eth0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
> >>     link/ether c2:73:ea:a7:68:6d brd ff:ff:ff:ff:ff:ff
> >>
> >> 5. Start/stop traffic in testpmd
> >> testpmd> start
> >> testpmd> show port stats 0
> >>   ######################## NIC statistics for port 0 ########################
> >>   RX-packets: 11         RX-missed: 0          RX-bytes:  1482
> >>   RX-errors: 0
> >>   RX-nombuf:  0
> >>   TX-packets: 1          TX-errors: 0          TX-bytes:  62
> >>
> >>   Throughput (since last show)
> >>   Rx-pps:            0          Rx-bps:            0
> >>   Tx-pps:            0          Tx-bps:            0
> >>   ############################################################################
> >> testpmd> stop
> >>
> >> 6. Detach the VDUSE device from the vDPA bus
> >> # vdpa dev del vduse0
> >>
> >> 7. Quit testpmd
> >> testpmd> quit
> >>
> >> Known issues & remaining work:
> >> ==============================
> >> - Fix issue in the FD manager (still polling while the FD has been removed)
> >> - Add Netlink support to the Vhost library
> >> - Support device reconnection
> >> - Support packed ring
> >> - Enable & test more Virtio features
> >> - Provide performance benchmark results
> >>
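For illustration, here is a minimal sketch of what driving this new backend from an application could look like. It assumes, based on the testpmd --vdev example above, that the series reuses the existing Vhost registration API and that a path under /dev/vduse/ selects VDUSE instead of a vhost-user socket; this is an assumption, not something the cover letter confirms, and the callback set is reduced to the minimum:

#include <stdio.h>
#include <rte_vhost.h>

/* Called by the Vhost library once the VDUSE device is ready. */
static int
new_device(int vid)
{
	printf("VDUSE device %d is ready\n", vid);
	return 0;
}

/* Called when the device is detached or destroyed. */
static void
destroy_device(int vid)
{
	printf("VDUSE device %d was removed\n", vid);
}

static const struct rte_vhost_device_ops vduse_ops = {
	.new_device = new_device,
	.destroy_device = destroy_device,
};

static int
init_vduse_backend(void)
{
	/* Assumption: the /dev/vduse/ prefix makes the library create
	 * a VDUSE device rather than a vhost-user socket. */
	const char *path = "/dev/vduse/vduse0";

	if (rte_vhost_driver_register(path, 0) != 0)
		return -1;
	if (rte_vhost_driver_callback_register(path, &vduse_ops) != 0)
		return -1;
	return rte_vhost_driver_start(path);
}

With such a flow, attaching the device to the vDPA bus with "vdpa dev add" (step 4 above) should be what eventually triggers the new_device() callback, since that is when the Kernel side starts the Virtio device.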
> >
> > Nice work! Thanks for bringing VDUSE to the network area. I wonder
> > if you have some plan to support userspace memory registration [1]?
> > I think this feature could benefit performance, since an extra data
> > copy could be eliminated in our case.
>
> I plan to have a closer look later, once VDUSE support has been added.
> I think it will be difficult to support in the case of DPDK for
> networking:
>
> - For the dequeue path, it would basically mean re-introducing the
> dequeue zero-copy support that we removed some time ago. It was a hack
> where we replaced the regular mbuf buffer with the descriptor one,
> increased the reference counter, and at subsequent dequeue API calls
> checked whether the former mbuf's refcount was back to 1, in which
> case we restored the mbuf. The issue is that physical NIC drivers
> usually release sent mbufs by pool, once a certain threshold is met.
> So it can cause draining of the virtqueue, as the descriptors are not
> written back into the used ring for quite some time, depending on the
> NIC/traffic/...
>
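To make the hack more concrete, here is a simplified, illustrative sketch of the mechanism described above. It is not the code that was actually removed from the Vhost library; update_used_ring() and restore_mbuf() are hypothetical stubs standing in for the real used-ring write-back and buffer-restore logic:

#include <stdint.h>
#include <sys/queue.h>
#include <rte_mbuf.h>

/* Tracks an mbuf whose buffer was swapped for a guest descriptor buffer. */
struct zcopy_mbuf {
	TAILQ_ENTRY(zcopy_mbuf) next;
	struct rte_mbuf *mbuf;
	uint16_t desc_idx;	/* descriptor to write back once consumed */
};
TAILQ_HEAD(zcopy_mbuf_list, zcopy_mbuf);

/* Hypothetical helpers, stubbed out for the sketch. */
static void update_used_ring(uint16_t desc_idx) { (void)desc_idx; }
static void restore_mbuf(struct rte_mbuf *m) { (void)m; }

/* Dequeue side: instead of copying the payload, point the mbuf at the
 * guest buffer and take an extra reference so the mempool cannot
 * recycle the mbuf while the guest buffer is in flight. (The real code
 * also had to fix up buf_iova.) */
static void
attach_guest_buffer(struct rte_mbuf *m, void *guest_buf, uint32_t len,
		uint16_t desc_idx, struct zcopy_mbuf *zm,
		struct zcopy_mbuf_list *list)
{
	m->buf_addr = guest_buf;
	m->data_off = 0;
	m->data_len = (uint16_t)len;
	m->pkt_len = len;
	rte_mbuf_refcnt_update(m, 1);	/* 1 ref for the app, 1 for vhost */

	zm->mbuf = m;
	zm->desc_idx = desc_idx;
	TAILQ_INSERT_TAIL(list, zm, next);
}

/* Run at each subsequent dequeue call: a refcount back to 1 means the
 * application (typically a NIC PMD, on transmit completion) dropped
 * its reference, so the descriptor can finally go back to the guest. */
static void
reclaim_consumed_mbufs(struct zcopy_mbuf_list *list)
{
	struct zcopy_mbuf *zm, *tmp;

	for (zm = TAILQ_FIRST(list); zm != NULL; zm = tmp) {
		tmp = TAILQ_NEXT(zm, next);
		if (rte_mbuf_refcnt_read(zm->mbuf) != 1)
			continue;	/* still held by the NIC driver */
		update_used_ring(zm->desc_idx);
		restore_mbuf(zm->mbuf);	/* re-point at its own pool buffer */
		rte_pktmbuf_free(zm->mbuf);
		TAILQ_REMOVE(list, zm, next);
	}
}

The stall Maxime describes follows directly from this scheme: if the NIC PMD only releases transmitted mbufs in batches once a free threshold is crossed, reclaim_consumed_mbufs() finds no refcount back at 1 for many calls in a row, and the guest virtqueue can run dry in the meantime.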
OK, I see. Could this issue be mitigated by releasing sent mbufs one
by one as soon as they are sent out, or simply by increasing the
virtqueue size?

> - For the enqueue path, I don't think this is possible with virtual
> switches by design: when an mbuf is received on a physical port, we
> don't know which Vhost/VDUSE port it will be switched to. And for
> VM-to-VM communication, should it use the src VM buffer or the dest
> VM one?
>

Yes, I agree that it's hard to achieve that in the enqueue path.

> The only case where it could work is if you had a simple forwarder
> between a VDUSE device and a physical port. But I don't think there
> is much interest in such a use case.
>

OK, I get it.

Thanks,
Yongji