From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf0-f178.google.com (mail-pf0-f178.google.com [209.85.192.178]) by dpdk.org (Postfix) with ESMTP id 4823C91EF for ; Tue, 26 Jan 2016 15:05:47 +0100 (CET) Received: by mail-pf0-f178.google.com with SMTP id n128so99445939pfn.3 for ; Tue, 26 Jan 2016 06:05:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mvista-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=cn+jDI0VCDy4e8LV+RwEWqlcYepn53W1tn+0lXLDypw=; b=ryar4yC1HkmQQnbgmsL5rKkLTyYq/H8jc/b5STwN2tFHd6NcayWF8WZg9MsgSJGc/5 HnAczqoAsnH1L2A9ISBHnpVc0HQ/nB60Ql3zxBvTp9rKyTC5LQFRmuQdVMVaoQtVGf7X zrKDS03yk9tefbWWMERXlve6FBBKLMYr9puAzU+NHYQpGxctEDpG7lb5K7AHka9hZUCy noVTKVIDqmukUiRiP8LD4/2kpPzhpUkb0XBOcs0Rw9hNxYcwdL5o1AQXVkCygOO3vJge nylzCwEgBs50YRK/SRFYF70JTf72ZXLvgoNp3XHV73/dZRyJ1jIsHHTAkBZXPOSQOKG+ 521g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=cn+jDI0VCDy4e8LV+RwEWqlcYepn53W1tn+0lXLDypw=; b=Yl8YcjSdLyDNxjhaFnhfU7f8NKgowbrUCK7pLYy11sYVZGUp3kpck2Ng9/d+glpbeg g8KZuB2RxoCmuiIr3U828dTOST7WvBMrZ0AWYkLhrg+lso4ipvTQqC3W1sr3wv2Tdm2E dPZH3e28lz7PfNGafibsPawtDzwV+kZtz7S3BjuTRx4XmACQcjC7v4W5xHnlX2DFAVo6 BAvenOUvwd6lt1KzK5FuFgGqf1eLqkuKbHUpCMOQD2dFDjghDSlHPYPrLBi+TXite8Fb PWMZJsblzWI19QMuN1L0/vC8OUK5sF+OFDnpdYoVWYKYYMP4o2mdwvTRBt3scESoXEPy pzdA== X-Gm-Message-State: AG10YOR6hoEMGbo0xriLFbujFKc9DTdFJmtWNg2qgLto8eRUS5eqVoC3QSmjjd6GgyAlCTL+A24P/Um9FDrjnTSf MIME-Version: 1.0 X-Received: by 10.98.42.81 with SMTP id q78mr33785697pfq.142.1453817146528; Tue, 26 Jan 2016 06:05:46 -0800 (PST) Received: by 10.66.196.81 with HTTP; Tue, 26 Jan 2016 06:05:46 -0800 (PST) In-Reply-To: <6703609.YCn1Se5Uby@xps13> References: <1453229842-15310-1-git-send-email-sshukla@mvista.com> <2661661.zCOWZL8G2B@xps13> <6703609.YCn1Se5Uby@xps13> Date: Tue, 26 Jan 2016 19:35:46 +0530 Message-ID: From: Santosh Shukla To: Thomas Monjalon Content-Type: text/plain; charset=UTF-8 Cc: "Michael S. Tsirkin" , dev@dpdk.org, Alex Williamson Subject: Re: [dpdk-dev] [PATCH v6 08/11] eal: pci: introduce RTE_KDRV_VFIO_NOIOMMUi driver mode X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Jan 2016 14:05:47 -0000 On Tue, Jan 26, 2016 at 6:30 PM, Thomas Monjalon wrote: > 2016-01-26 15:56, Santosh Shukla: >> On Mon, Jan 25, 2016 at 8:59 PM, Thomas Monjalon >> wrote: >> > 2016-01-21 22:47, Santosh Shukla: >> >> On Thu, Jan 21, 2016 at 8:16 PM, Thomas Monjalon >> >> wrote: >> >> > 2016-01-21 17:34, Santosh Shukla: >> >> >> On Thu, Jan 21, 2016 at 4:58 PM, Thomas Monjalon >> >> >> wrote: >> >> >> > 2016-01-21 16:43, Santosh Shukla: >> >> >> >> David Marchand wrote: >> >> >> >> > This is a mode (specific to vfio), not a new kernel driver. >> >> >> >> > >> >> >> >> Yes, Specific to VFIO and this is why noiommu appended after vfio i.e.. >> >> >> >> __VFIO and __VFIO_NOIOMMU. >> >> >> > >> >> >> > Woaaa! Your logic is really disappointing :) >> >> >> > Specific to VFIO => append _NOIOMMU >> >> >> > If it's for VFIO, it should be called VFIO (that's my logic). >> >> >> > >> >> >> I am confused by reading your comment. vfio works for default iommu >> >> >> and now with noiommu. drv->kdrv need to know driver mode for vfio >> >> >> case. So that user can simply read drv->kdrv value in their driver and >> >> >> accordingly use vfio rd/wr api for example {pread/pwrite}. This is how >> >> >> rte_eal_pci_vfio_read/write_bar() api implemented. >> >> > >> >> > Sorry I don't understand. Why EAL read/write functions should be different >> >> > depending of the VFIO mode? >> >> >> >> no, EAL rd/wr functions are not different for vfio or vfio modes {same >> >> for iommu or noiommu}. Pl. see pci_eal_read/write_bar() api. Those >> >> apis currently used for VFIO, Irrespective of vfio mode. If required, >> >> we can add UIO bar_rd/wr api too. pci_eal_rd/wr_bar() are abstract >> >> apis. Underneath implementation can be vfio or uio type. >> > >> > It means you agree the suffix _NOIOMMU is not needed? >> > It seems we go nowhere in this discussion. You said >> > "drv->kdrv need to know driver mode for vfio" >> >> In my observation, currently virtio work for vfio-noiommu, that's why >> said drv->kdrv need to know vfio mode. > > It is your observation. It may change in near future. > so that mean till then, virtio support for non-x86 arch has to wait? We have working model with vfio-noiommu, don't you think it make sense to let vfio_noiommu implementation exist and later in-case virtio+iommu gets mainline then switch to vfio __mode__ agnostic approach. And for that All it takes to replace __noiommu suffix with default. >> > and after >> > "Those apis currently used for VFIO, Irrespective of vfio mode" >> > That's why I assume your first assumption was wrong. >> > >> >> Newly introduced dpdk global api pci_eal_rd/wr_bar(), can be used for >> vfio and uio both; can be used for vfio w/IOMMU and vfio w/o IOMMU >> both. >> >> >> >> > Why do we care to parse noiommu only? >> >> >> >> >> >> Because pmd drivers example virtio can work with vfio only in >> >> >> _noiommu_ mode. In particular, virtio spec 0.95 / legacy virtio. >> >> > >> >> > Please could you explain the limitation (except IOMMU availability)? >> >> >> >> Ok. >> >> >> >> I believe - we both agree that noiommu mode is a need for pmd drivers >> >> like virtio, right? if so then other reason is implementation driven >> > >> > No, noiommu is a need for some environment having no IOMMU. >> > But in my understanding, virtio could run with a nested IOMMU. >> >> Interesting, like to understand nested one, I did tried in past by >> passing "iommu=pt intel_iommu=on kvm-intel.nested=1" in cmdline for >> x86 (for guest/host both), but virtio pci device binding to vfio-pci >> driver fails. Tried on 4.2 kernel (qemu version 2.5), is it working >> for >4.2 kernel/ qemu-version? > > I haven't tried. > >> >> i.e.. >> >> >> >> Pl. look at virtio_pci.c in this patch.. VIRTIO_RD/WR/_1/2/4() >> >> implementation. They are in-use and applicable to virtio spec 0.95, >> >> so far support uio/ioport-way rd/wr. Now to support vfio-way rd/wr - >> >> need to check drv->kdrv value, that value should be of vfio_noiommu >> >> types __not__ generic _vfio types. >> > >> > I still don't understand why it would not work with VFIO w/IOMMU. >> >> with vfio+iommu; binding virtio pci device to vfio-pci driver fail; >> giving below error: >> [ 53.053464] VFIO - User Level meta-driver version: 0.3 >> [ 73.077805] vfio-pci: probe of 0000:00:03.0 failed with error -22 >> [ 73.077852] vfio-pci: probe of 0000:00:03.0 failed with error -22 >> >> vfio_pci_probe() --> vfio_iommu_group_get() --> iommu_group_get() >> fails: iommu doesn't have group for virtio pci device. > > Yes it fails when binding. > So the later check in the virtio PMD is useless. > Which check? >> In case of noiommu, it prepares the group / add device to iommu group, >> so it passes. >> >> Jason in other thread mentioned that he is working on virtio+iommu >> approach [1], Patches are not merged and I am yet to evaluate his >> patches for virtio pmd driver for iommu(+vfio). so wondering how >> virtio pci device could work unless jason patches used? >> >> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg337079.html > > I haven't tried nested IOMMU. > All this thread was about the kernel module in use, i.e. VFIO. > We are saying that virtio could work in both VFIO modes. I agree and I am looking at stuff that works, vfio-noiommu is in linus-next, and soon be in mainline. This series will let non-x86 archs to use virtio pmd driver in first place and could take benefits of for example vhost, ovs-dpdk-vhost-user etc.. > Furthermore restricting virtio to no-iommu mode doesn't bring > any improvement. We're not __restricting__, as soon as virtio+iommu gets working state, we'll simply replace __noiommu with default. Then its upto user to try out virtio with vfio default or vfio_noiommu. > That's why I suggest to keep the initial semantic of kdrv and > not pollute it with VFIO modes. > I am okay to live with default and forget suffix __noiommu but there are implementation problem which was discussed in other thread - Virtio pmd driver should avoid interface parsing i.e. virtio_resource_init_uio/vfio() etc.. For vfio case - We could easily get rid of by moving /sys parsing to pci_eal layer, Right? If so then virtio currently works with vfio-noiommu, it make sense to me that pci_eal layer does parsing for pmd driver before that pmd driver get initialized. - Another case could be: iommu-less-pmd-driver. eal layer to do parsing before updating drv->kdrv. >> >> >> So at >> >> >> the initialization (example .. virtio-net) of such pmd driver, pmd >> >> >> driver should know that vfio-with-noiommu mode enabled or not? for >> >> >> that pmd driver simply checks drv->kdrv value. >> >> > >> >> > If a check is needed, I would prefer using your function >> >> > pci_vfio_is_noiommu() and remove driver modes from struct rte_kernel_driver. >> >> >> >> I don't think calling pci_vfio_no_iommu() inside >> >> virtio_reg_rd/wr_1/2/3() would be a good idea. >> > >> > Why? The value may be cached in the priv properties. >> > >> pci_vfio_is_noiommu() parses /sys for >> - enable_noiommu param >> - attached driver name is vfio-noiommu or not. >> >> It does file operation for that, I meant to say that calling this api >> within register_rd/wr function is not correct. It would be better if >> those low level register_rd/wr api only checks driver_types. > > Yes, that's why I said the return of pci_vfio_is_noiommu() may be cached > to keep efficiency. I am not convinced though, Still find pmd driver checking driver_types using drv->kdrv is better approach than introducing a new global variable which may look something like; At pci_eal layer ---- bool vfio_mode; vfio_mode = pci_vfio_is_noiommu(); At virtio pmd driver layer ---- Checking value at vfio_mode variable before doing virtio_rd/wr for vfio interface. Instead virtio pmd driver doing virtio_reg_rd/wr_1/2/4() { if (drv->kdrv == VFIO) do pread()/pwrite() else in()/out() } is better approach. Let me know if you still think former is better than latter then I'll send patch revision right-away.