From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 757AD2C16 for ; Fri, 22 Apr 2016 12:12:25 +0200 (CEST) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP; 22 Apr 2016 03:12:24 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.24,516,1455004800"; d="scan'208";a="690418920" Received: from shwdeisgchi083.ccr.corp.intel.com (HELO [10.239.67.193]) ([10.239.67.193]) by FMSMGA003.fm.intel.com with ESMTP; 22 Apr 2016 03:12:21 -0700 To: Yuanhan Liu References: <1446748276-132087-1-git-send-email-jianfeng.tan@intel.com> <1461207396-42742-1-git-send-email-jianfeng.tan@intel.com> <1461207396-42742-2-git-send-email-jianfeng.tan@intel.com> <20160421220121.GA7603@yliu-dev.sh.intel.com> Cc: dev@dpdk.org, Huawei Xie , rich.lane@bigswitch.com, mst@redhat.com, nakajima.yoshihiro@lab.ntt.co.jp, p.fedin@samsung.com, michael.qiu@intel.com, ann.zhuangyanying@huawei.com, mukawa@igel.co.jp, nhorman@tuxdrver.com, Thomas Monjalon From: "Tan, Jianfeng" Message-ID: <5719F905.1060704@intel.com> Date: Fri, 22 Apr 2016 18:12:21 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2 MIME-Version: 1.0 In-Reply-To: <20160421220121.GA7603@yliu-dev.sh.intel.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v3 1/2] virtio/vdev: add embeded device emulation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Apr 2016 10:12:26 -0000 Hi Yuanhan, On 4/22/2016 6:01 AM, Yuanhan Liu wrote: > On Thu, Apr 21, 2016 at 02:56:35AM +0000, Jianfeng Tan wrote: >> Background: Previously, we usually use a virtio device in QEMU/VM's >> context as below pic shows. Virtio nic is emulated in QEMU, and usually >> presented in VM as a PCI device. >> >> |-----------| >> | vm | >> |-----------| (over PCI bus or MMIO or Channel I/O) >> | QEMU | -> device emulation >> |-----------| >> | >> | (vhost-user protocol or vhost-net ioctls) >> | >> |-----------| >> | vhost | >> |-----------| >> >> Then we come to the topic that how to present a virtio device in an app >> or container, which uses virtio device to do inter process communication >> with vhost backend process. To achieve that, first of all, we need way >> in DPDK to interract with vhost backend. And then emulate a virtual >> virtio device in DPDK (which is addressed in following patch). >> >> |-----------| >> | DPDK app | >> |-----------| >> | DPDK lib | -> device emulation (addressed by following patch) >> |-----------| >> | >> | (vhost-user protocol or vhost-net ioctls), addressed by this patch >> | >> |-----------| >> | vhost | >> |-----------| >> >> How: we implement another instance of struct virtio_pci_ops to intercept >> the communications between VM and QEMU. Instead of rd/wr ioport or PCI >> configuration space, here we directly talk with backend through the vhost >> file. > Nope, that's wrong, and here becomes a bit subtle. I will try to make > some explanation here. > > Let's talk about the normal case (with QEMU) first. Where, virtio PMD > is a driver, and virito device is emulated inside QEMU, and exposed by > PCI. So, virtio PMD talks to the device with ioport rd/wr (or MMIO for > virtio 1.0). > > Till now, you are right. > > However, vhost-user socket is for establishing a connection, providing > HOST with enough information, so that host can directly manipulate the > vring, to dequeue/enqueue buffers. > > So, what you were saying about "directly talk with backend (the virtual > virtio device you created) through vhost file" is not right. Instead, > in your case, the (virtual) virtio device and the PMD driver is in > the same process space, therefore, you could actually access or the > device info simply by normal read/write. Here by _backend_, I mean vhost. I know you take backend as the device emulation. I ask Huawei, he thinks backend = device emulation + vhost. So maybe I should use the phrase vhost for accuracy. > > As you can see, It's a bit messy to mix all of them (virtio PMD driver, > virtio device emulation, and vhost-uesr frontend) in one single directory > (or even in one single file as you did). Therefore, I'd suggest you to > make a new dir, say "virtio-user" (a good name from Thomas), and put all > files related to virtio device emulation and vhost-user frontend there. > > Further more, I'd suggest to divide the code into following files: > > - virtio-user/virtio.c > > All virtio device emulation goes here. > > - virtio-user/vhost-user.c > > The vhost-user frontend implementation > > - virtio-user/vhost-kernel.c > > vhost kernel hanldings, including setting the tap device. > > - And, __maybe__ another standalone file for handling the talk > between the driver and the device. (See more for the comments > about virtio_pci_ops below). > > > That would make it much clearer, IMO. Got your point here, but to be honest, I'm a little concerned to split a 1k-lined file into 5+ files. Previously I'd like to make them converged together. But your suggestion is great for clean code. If any other could give some comments? > > Besides that, I came up with few minor nits below. You might want to > fix them all in the next version. > >> +static int >> +vhost_user_write(int fd, void *buf, int len, int *fds, int fd_num) >> +{ >> + int r; >> + struct msghdr msgh; >> + struct iovec iov; >> + size_t fd_size = fd_num * sizeof(int); >> + char control[CMSG_SPACE(fd_size)]; >> + struct cmsghdr *cmsg; >> + >> + bzero(&msgh, sizeof(msgh)); >> + bzero(control, sizeof(control)); > bzero is marked as deprecated (see the man page), use memset instead. I should apologize about this, you actually told me once before, but I failed to fix it in this version. > >> + >> +static struct vhost_user_msg m __rte_unused; > Hmm, if it's not used, why define it. If it's used, why decorate it > with __rte_unused? This variable is to make it easy to calculate length of some fields inside this struct. But seems that removing it does not leading to any compiling error. I'll remove it. > >> + >> +static void >> +prepare_vhost_memory_user(struct vhost_user_msg *msg, int fds[]) >> +{ >> + int i, num; >> + struct hugepage_file_info huges[VHOST_MEMORY_MAX_NREGIONS]; >> + struct vhost_memory_region *mr; >> + >> + num = get_hugepage_file_info(huges, VHOST_MEMORY_MAX_NREGIONS); >> + if (num < 0) >> + rte_panic("Failed to prepare memory for vhost-user\n"); > Do not use rte_panic, unless it's really needed. I see no good reason > to use it in a driver. If something we need is out of order, just > return and print some log and tell the user that this driver will not > work. This would keep other components work. You may then argue that > we have only one driver in container usage, but still, it's not a > good habit. Just want to make it "fail early, fair loudly". But I agree to keep same stype with other driver. > >> +static void >> +vdev_reset(struct virtio_hw *hw __rte_unused) >> +{ >> + /* do nothing according to qemu vhost user spec */ > That's not the right way to quote spec, it barely tells us anything > useful. So, you should quote the content here. OK, I'll add more info here. > >> + >> +static const struct virtio_pci_ops vdev_ops = { >> + .read_dev_cfg = vdev_read_dev_config, >> + .write_dev_cfg = vdev_write_dev_config, >> + .reset = vdev_reset, >> + .get_status = vdev_get_status, >> + .set_status = vdev_set_status, >> + .get_features = vdev_get_features, >> + .set_features = vdev_set_features, >> + .get_isr = vdev_get_isr, >> + .set_config_irq = vdev_set_config_irq, >> + .get_queue_num = vdev_get_queue_num, >> + .setup_queue = vdev_setup_queue, >> + .del_queue = vdev_del_queue, >> + .notify_queue = vdev_notify_queue, >> +}; > As stated above, this acutally does NOT belong to the virtual virtio > device emulation. It should be part of the code of the PMD driver. > You should seperate them. > >> diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h >> index b69785e..68097e6 100644 >> --- a/drivers/net/virtio/virtio_pci.h >> +++ b/drivers/net/virtio/virtio_pci.h >> @@ -260,6 +260,20 @@ struct virtio_hw { >> struct virtio_pci_common_cfg *common_cfg; >> struct virtio_net_config *dev_cfg; >> const struct virtio_pci_ops *vtpci_ops; >> +#ifdef RTE_VIRTIO_VDEV >> +#define VHOST_KERNEL 0 >> +#define VHOST_USER 1 >> + int type; /* type of backend */ >> + uint32_t queue_num; >> + char *path; >> + int mac_specified; >> + int vhostfd; >> + int backfd; /* tap device used in vhost-net */ >> + int callfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1]; >> + int kickfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1]; >> + uint8_t status; >> + struct rte_eth_dev_data *data; >> +#endif > And put all of them to the virtio "device" context, in the virtio-user/ > then. OK, I'll define a new struct to store these fields. I consider there are two methods to refer these data. One is to store a pointer to the instance of the new struct in the struct virtio_hw; the other is to main them totally closed inside virtio-user, such as using a linked chain. Which do you prefer? Thanks, Jianfeng > > --yliu