DPDK patches and discussions
From: "Yang, Zhiyong" <zhiyong.yang@intel.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"yliu@fridaylinux.org" <yliu@fridaylinux.org>
Cc: "Wang, Wei W" <wei.w.wang@intel.com>,
	"Tan, Jianfeng" <jianfeng.tan@intel.com>
Subject: Re: [dpdk-dev] [PATCH 00/11] net/vhostpci: A new vhostpci PMD supporting VM2VM scenario
Date: Thu, 11 Jan 2018 11:13:56 +0000
Message-ID: <E182254E98A5DA4EB1E657AC7CB9BD2A8B023492@BGSMSX101.gar.corp.intel.com>
In-Reply-To: <961a2372-39c8-70ff-41a1-5379122c0427@redhat.com>

Hi Maxime, all, 

> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Tuesday, December 19, 2017 7:15 PM
> To: Yang, Zhiyong <zhiyong.yang@intel.com>; dev@dpdk.org;
> yliu@fridaylinux.org
> Cc: Wang, Wei W <wei.w.wang@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>
> Subject: Re: [PATCH 00/11] net/vhostpci: A new vhostpci PMD supporting
> VM2VM scenario
> 
> Hi Zhiyong,
> 
> On 11/30/2017 10:46 AM, Zhiyong Yang wrote:
> > Vhostpci PMD is a new type of driver that runs in the guest OS and
> > drives the vhostpci modern PCI device, which is a new virtio device.
> >
> > The following link describes the vhostpci design:
> >
> > An initial device design was presented at KVM Forum'16:
> > http://www.linux-kvm.org/images/5/55/02x07A-Wei_Wang-Design_of-Vhost-pci.pdf
> > The latest device design and implementation will be posted to
> > the QEMU community soon.
> >
> > Vhostpci PMD works in pair with virtio-net PMD to achieve
> > point-to-point communication between VMs. DPDK already has a
> > virtio/vhost-user PMD pair to implement RX/TX of packets in the
> > guest/host scenario. However, for VM2VM use cases, the virtio PMD must
> > first transmit packets from VM1 to the host OS through a vhost-user
> > port, and the packets are then transmitted to the second VM through
> > another virtio PMD port. The virtio/vhostpci PMD pair can use shared
> > memory to receive/transmit packets directly between two VMs. Currently,
> > the entire memory of the virtio-net side VM is shared with the
> > vhost-pci side VM and mapped via device BAR2, and the first 4KB of
> > BAR2 is reserved to store the metadata.
> >
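
To make the BAR2 layout described above concrete, here is a minimal sketch in C of how a remote guest-physical address could be turned into a local pointer inside the mapped BAR. This is not code from the series. The names VPNET_METADATA_SIZE, struct vpnet_remote_region and vpnet_gpa_to_local() are hypothetical, and the bar_offset field assumes that each remote region is placed at a known offset in BAR2 after the 4KB metadata area; the real layout is defined by the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the metadata area at the start of BAR2 (4KB per the cover letter).
 * The macro name is hypothetical. */
#define VPNET_METADATA_SIZE 4096u

/* Hypothetical description of one memory region of the virtio-net side VM. */
struct vpnet_remote_region {
	uint64_t guest_phys_addr; /* start GPA in the remote VM */
	uint64_t size;            /* region length in bytes */
	uint64_t bar_offset;      /* assumed offset of the region inside BAR2 */
};

/* Translate a remote GPA into a pointer inside the locally mapped BAR2.
 * Returns NULL when no region covers the address. */
static inline void *
vpnet_gpa_to_local(uint8_t *bar2_base,
		   const struct vpnet_remote_region *regions,
		   unsigned int nr_regions, uint64_t gpa)
{
	unsigned int i;

	assert(bar2_base != NULL);

	for (i = 0; i < nr_regions; i++) {
		const struct vpnet_remote_region *r = &regions[i];

		/* The subtraction is safe because gpa >= start is checked first. */
		if (gpa >= r->guest_phys_addr &&
		    gpa - r->guest_phys_addr < r->size)
			return bar2_base + r->bar_offset +
			       (gpa - r->guest_phys_addr);
	}
	return NULL;
}
```

With a single region starting at GPA 0x1000 and bar_offset equal to VPNET_METADATA_SIZE, an address of 0x1800 maps to bar2_base + 4096 + 0x800, and any address outside the region yields NULL.
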
> > The vhostpci/virtio PMD workflow is as follows:
> >
> > 1. VM1 starts up with a vhostpci device, binds the device to DPDK in
> > guest1, launches DPDK testpmd, and then waits for the remote memory
> > info (the memory shared by VM2, its memory regions and vring info).
> >
> > 2. VM2 starts up with a virtio-net device, binds the virtio-net device
> > to DPDK in VM2, and runs testpmd using the virtio PMD.
> >
> > 3. The vhostpci device negotiates virtio messages with the virtio-net
> > device via a socket, as vhost-user/virtio-net do.
> >
> > 4. The vhostpci device gets VM2's memory region and vring info and
> > writes the metadata to VM2's shared memory.
> >
> > 5. When the metadata is ready to be read by the vhostpci PMD, the PMD
> > receives a config interrupt with LINK_UP set in the status config.
> >
> > 6. The vhostpci PMD and virtio PMD can then transmit/receive packets.
> >
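
Step 5 of the workflow above could be handled roughly as in the following sketch, which is an illustration and not the code in patch 06/11. VPNET_S_LINK_UP, struct vpnet_dev_state and vpnet_handle_config_status() are hypothetical names. The bit value 0x1 is an assumption that mirrors VIRTIO_NET_S_LINK_UP in virtio-net; the vhostpci device may define its status field differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bit, assumed to mirror VIRTIO_NET_S_LINK_UP (1). */
#define VPNET_S_LINK_UP 0x1u

/* Hypothetical per-device state tracked by the PMD. */
struct vpnet_dev_state {
	bool metadata_ready; /* remote memory/vring metadata can be read */
	bool link_up;        /* last link state seen in the status config */
};

/* Meant to be called from the config-interrupt handler with the status
 * value read from the device config space.
 * Returns true if the link state changed. */
static bool
vpnet_handle_config_status(struct vpnet_dev_state *st, uint16_t status)
{
	bool up;

	assert(st != NULL);

	up = (status & VPNET_S_LINK_UP) != 0;
	if (up == st->link_up)
		return false; /* nothing changed, ignore the interrupt */

	st->link_up = up;
	/* LINK_UP signals that the metadata is ready; on link down the
	 * previously read metadata is treated as stale. */
	st->metadata_ready = up;
	return true;
}
```

The PMD would only parse the metadata area and start RX/TX after this function reports a change to link up.
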
> > How to test?
> >
> > 1. Launch VM1 with the vhostpci device.
> > qemu/x86_64-softmmu/qemu-system-x86_64 -cpu host -M pc -enable-kvm \
> > -smp 16,threads=1,sockets=1 -m 8G -mem-prealloc -realtime mlock=on \
> > -object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
> > -numa node,memdev=mem \
> > -drive if=virtio,file=/root/vhost-pci/guest1.img,format=raw \
> > -kernel /opt/guest_kernel \
> > -append 'root=/dev/vda1 ro default_hugepagesz=1G hugepagesz=1G hugepages=2 console=ttyS0,115200,8n1 3' \
> > -netdev tap,id=net1,br=br0,script=/etc/qemu-ifup \
> > -chardev socket,id=slave1,server,wait=off,path=/opt/vhost-pci-slave1 \
> > -device vhost-pci-net-pci,chardev=slave1 \
> > -nographic
> >
> > 2. Bind the vhostpci device to DPDK using igb_uio, then start DPDK:
> > ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 -- -i
> >
> > 3. Launch VM2 with the virtio-net device.
> >
> > qemu/x86_64-softmmu/qemu-system-x86_64 -cpu host -M pc -enable-kvm \
> > -smp 4,threads=1,sockets=1 -m 8G -mem-prealloc -realtime mlock=on \
> > -object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
> > -numa node,memdev=mem \
> > -drive if=virtio,file=/root/vhost-pci/guest2.img,format=raw \
> > -net none -no-hpet -kernel /opt/guest_kernel \
> > -append 'root=/dev/vda1 ro default_hugepagesz=1G hugepagesz=1G hugepages=2 console=ttyS0,115200,8n1 3' \
> > -chardev socket,id=sock2,path=/opt/vhost-pci-slave1 \
> > -netdev type=vhost-user,id=net2,chardev=sock2,vhostforce \
> > -device virtio-net-pci,mac=52:54:00:00:00:02,netdev=net2 \
> > -nographic
> >
> > 4. Bind virtio-net to DPDK using igb_uio, then run DPDK:
> >
> > ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 --socket-mem 512,0 \
> > -- -i --rxq=1 --txq=1 --nb-cores=1
> >
> > 5. On the vhostpci PMD side, run "start".
> >
> > 6. On the virtio PMD side, run "start tx_first".
> >
> > Loopback testing then works.
> >
> > Notes:
> > 1. Only igb_uio is supported for now.
> > 2. The vhostpci device is a modern PCI device. The vhostpci PMD only
> > supports mergeable mode, so the virtio device side must also use
> > mergeable mode.
> > 3. The vhostpci PMD supports one queue pair for now.
> >
> > Zhiyong Yang (11):
> >    drivers/net: add vhostpci PMD base files
> >    net/vhostpci: public header files
> >    net/vhostpci: add debugging log macros
> >    net/vhostpci: add basic framework
> >    net/vhostpci: add queue setup
> >    net/vhostpci: add support for link status change
> >    net/vhostpci: get remote memory region and vring info
> >    net/vhostpci: add RX function
> >    net/vhostpci: add TX function
> >    net/vhostpci: support RX/TX packets statistics
> >    net/vhostpci: update release note
> >
> >   MAINTAINERS                                       |    6 +
> >   config/common_base                                |    9 +
> >   config/common_linuxapp                            |    1 +
> >   doc/guides/rel_notes/release_18_02.rst            |    6 +
> >   drivers/net/Makefile                              |    1 +
> >   drivers/net/vhostpci/Makefile                     |   54 +
> >   drivers/net/vhostpci/rte_pmd_vhostpci_version.map |    3 +
> >   drivers/net/vhostpci/vhostpci_ethdev.c            | 1521 +++++++++++++++++++++
> >   drivers/net/vhostpci/vhostpci_ethdev.h            |  176 +++
> >   drivers/net/vhostpci/vhostpci_logs.h              |   69 +
> >   drivers/net/vhostpci/vhostpci_net.h               |   74 +
> >   drivers/net/vhostpci/vhostpci_pci.c               |  334 +++++
> >   drivers/net/vhostpci/vhostpci_pci.h               |  240 ++++
> >   mk/rte.app.mk                                     |    1 +
> >   14 files changed, 2495 insertions(+)
> >   create mode 100644 drivers/net/vhostpci/Makefile
> >   create mode 100644 drivers/net/vhostpci/rte_pmd_vhostpci_version.map
> >   create mode 100644 drivers/net/vhostpci/vhostpci_ethdev.c
> >   create mode 100644 drivers/net/vhostpci/vhostpci_ethdev.h
> >   create mode 100644 drivers/net/vhostpci/vhostpci_logs.h
> >   create mode 100644 drivers/net/vhostpci/vhostpci_net.h
> >   create mode 100644 drivers/net/vhostpci/vhostpci_pci.c
> >   create mode 100644 drivers/net/vhostpci/vhostpci_pci.h
> >
> 
> Thanks for the RFC.
> It seems there is a lot of code duplication between this series and
> libvhost-user.
> 
> Would the non-RFC version reuse libvhost-user? I'm thinking of all the
> code copied from virtio-net.c, for example.
> 
> If not, I think this is problematic as it will double the maintenance cost.
>

I'm trying to reuse the librte_vhost RX/TX logic, and it seems feasible.
However, I would have to expose many internal librte_vhost data structures, such as virtio_net and vhost_virtqueue, to the PMD layer.
Since the vhostpci PMD uses a virtio PCI device (the vhostpci device) in the guest, memory allocation and release should be done in drivers/net/vhostpci, as the virtio PMD does.
Vhostpci and vhost can share struct virtio_net to manage the different drivers, since they are very similar.
Features such as zero copy and sending RARP packets do not need to be supported for vhostpci, so we can always disable them.
What do you think of this approach?
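
As a rough illustration of the sharing idea, the sketch below shows one device structure used by both backends, with the features that vhostpci does not need masked out. It is only a sketch under stated assumptions: struct shared_net_dev, enum backend_type and the DEV_F_* flags are hypothetical names invented for this example and are not the librte_vhost definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend discriminator for a shared device structure. */
enum backend_type {
	BACKEND_VHOST_USER,
	BACKEND_VHOSTPCI
};

/* Hypothetical feature flags, not the real librte_vhost bit values. */
#define DEV_F_DEQUEUE_ZERO_COPY (1u << 0)
#define DEV_F_RARP              (1u << 1)
#define DEV_F_MRG_RXBUF         (1u << 2)

/* Stand-in for a structure shared by the vhost and vhostpci drivers. */
struct shared_net_dev {
	enum backend_type backend;
	uint32_t flags; /* features requested for this device */
};

/* Return the features actually enabled for the device. Zero copy and
 * RARP are always cleared for the vhostpci backend. */
static uint32_t
shared_net_dev_effective_flags(const struct shared_net_dev *dev)
{
	assert(dev != NULL);

	if (dev->backend == BACKEND_VHOSTPCI)
		return dev->flags & ~(DEV_F_DEQUEUE_ZERO_COPY | DEV_F_RARP);
	return dev->flags;
}
```

The common RX/TX path would consult the effective flags, so the vhostpci-specific behaviour stays in one place instead of being duplicated in the PMD.
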

thanks
Zhiyong
 
> Cheers,
> Maxime



