Message-ID: <1450240711.2674.11.camel@redhat.com>
From: Alex Williamson
To: Ferruh Yigit
Cc: "dev@dpdk.org"
Date: Tue, 15 Dec 2015 21:38:31 -0700
In-Reply-To: <20151216040408.GA18363@sivlogin002.ir.intel.com>
Subject: Re: [dpdk-dev] VFIO no-iommu
List-Id: patches and discussions about DPDK

On Wed, 2015-12-16 at 04:04 +0000, Ferruh Yigit wrote:
> On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote:
> > On Tue, 2015-12-15 at 13:43 +0000, O'Driscoll, Tim wrote:
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alex
> > > > Williamson
> > > > Sent: Friday, December 11, 2015 11:03 PM
> > > > To: Vincent JARDIN; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] VFIO no-iommu
> > > >
> > > > On Fri, 2015-12-11 at 23:12 +0100, Vincent JARDIN wrote:
> > > > > Thanks Thomas for putting back this topic.
> > > > >
> > > > > Alex,
> > > > >
> > > > > I'd like to hear more about the impacts of "unsupported":
> > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=033291eccbdb1b70ffc02641edae19ac825dc75d
> > > > >    Use of this mode, specifically binding a device without a
> > > > >    native IOMMU group to a VFIO bus driver will taint the
> > > > >    kernel and should therefore not be considered supported.
> > > > >
> > > > > It means that we get rid of uio; so it is a nice code
> > > > > cleanup: but why would VFIO/NO IOMMU be better if the
> > > > > bottom line is "unsupported"?
> > > >
> > > > How supportable do you think the uio method is?  Fundamentally
> > > > we have a userspace driver doing unrestricted DMA; it can
> > > > access and modify any memory in the system.  This is the
> > > > reason uio won't provide a mechanism to enable MSI and if you
> > > > ask the uio maintainer, they don't support DMA at all, it's
> > > > only intended as a programmed IO interface to the device.
> > > > Unless we can sandbox a user owned device within an IOMMU
> > > > protected container, it's not supportable.  The VFIO no-iommu
> > > > mode can simply provide you that unsupported mode more easily
> > > > since it leverages code from the supported mode, which is
> > > > IOMMU protected.  Thanks,
> > >
> > > Thanks for clarifying.
> > >
> > > This does seem like it would be useful for DPDK. We're doing
> > > some further investigation to see if it works out of the box
> > > with DPDK or if we need to make any changes to support it.
> >
> > The iommu model is different, there's no type1 interface available
> > when using this mode since we have no ability to provide
> > translation.
> > The no-iommu iommu model really does nothing, which is a possible
> > issue for userspace.  Is it sufficient?  We stopped short of
> > creating a page pinning interface through the no-iommu model
> > because it requires code and adding piles of new code for an
> > interface we claim is unsupported doesn't make a lot of sense.
> > The device interface should be identical to existing vfio support.
> >
> > > Thomas highlighted that your original commit for this had been
> > > reverted. What specifically would you need from us in order to
> > > re-submit the VFIO No-IOMMU support?
> >
> > No API changes should ever go into the kernel without being
> > validated by a user.  Without that we're risking that the kernel
> > interface is broken and we're stuck supporting it.  In this case I
> > tried to make sure we had a working user before it went in,
> > gambled that it was close enough to put in anyway, then paid the
> > price when development went silent on the user side.  To get it
> > back in, I'm going to need a working user first.  You can re-apply
> > 033291eccbdb or re-revert ae5515d66362 for development of that.
> > I need to see that it works and that there's some consensus from
> > the dpdk community that it's a worthwhile path forward for cases
> > without an iommu.  There's no point in merging it if it only
> > becomes a userspace proof of concept.  Thanks,
> >
> I tested the DPDK (HEAD of master) with the patch, with help of
> Anatoly, and DPDK works in no-iommu environment with a little
> modification.
>
> Basically the only modification is to adapt to the new group naming
> (noiommu-$) and

Sorry, forgot to mention that one.  The intention with the modified
group name is that I want to be very certain that a user intending to
only support properly iommu isolated devices doesn't accidentally need
to deal with these no-iommu mode devices.

> disable dma mapping (VFIO_IOMMU_MAP_DMA)
>
> Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in vfio
> module, container->noiommu is not set before doing a
> vfio_group_set_container() and vfio_for_each_iommu_driver selects
> wrong driver.

Running CHECK_EXTENSION on a container without the group attached is
only going to tell you what extensions vfio is capable of, not
necessarily what extensions are available to you with that group.  Is
this just a general dpdk-vfio ordering bug?

> What I test is bind two different types of NICs into VFIO driver, and
> use testpmd to confirm transfer is working.  Kernel booted without
> iommu enabled, vfio module inserted with
> "enable_unsafe_noiommu_support" parameter.

So it works.  Is it acceptable?  Useful?  Sufficiently complete?  Does
it imply deprecating the uio interface?  I believe the feature that
started this discussion was support for MSI/X interrupts so that VFs
can support some kind of interrupt (uio only supports INTx since it
doesn't allow DMA).  Implementing that would be the ultimate test of
whether this provides dpdk with not only a more consistent interface,
but the feature dpdk wants that's missing in uio.

Thanks,
Alex
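
To make the ordering point above concrete, here is a rough userspace
sketch of the sequence I'd expect: attach the group to the container
first, and only then ask the container which extensions it offers.
Treat it as an illustration under assumptions rather than tested dpdk
code.  The helper names vfio_group_path() and vfio_noiommu_open() are
made up for this example, the fallback value for VFIO_NOIOMMU_IOMMU is
assumed from the no-iommu patch for headers that lack it, and the
group number is whatever the caller already looked up.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

#ifndef VFIO_NOIOMMU_IOMMU
#define VFIO_NOIOMMU_IOMMU 8 /* assumed value, for headers without the patch */
#endif

/* Hypothetical helper: build the group node path.  no-iommu groups get a
 * "noiommu-" prefix so they cannot be mistaken for isolated groups.
 * Returns 0 on success, -1 if the buffer is too small. */
int vfio_group_path(char *buf, size_t len, int group, int noiommu)
{
    int n;

    assert(buf != NULL);
    if (noiommu)
        n = snprintf(buf, len, "/dev/vfio/noiommu-%d", group);
    else
        n = snprintf(buf, len, "/dev/vfio/%d", group);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: open a no-iommu group and its container.
 * Returns the container fd (group fd via *group_fd_out), or -1. */
int vfio_noiommu_open(int group, int *group_fd_out)
{
    struct vfio_group_status status = { .argsz = sizeof(status) };
    char path[64];
    int container, gfd;

    if (vfio_group_path(path, sizeof(path), group, 1) < 0)
        return -1;

    container = open("/dev/vfio/vfio", O_RDWR);
    if (container < 0)
        return -1;

    gfd = open(path, O_RDWR);
    if (gfd < 0)
        goto err_container;

    /* The group must be viable before it can be attached. */
    if (ioctl(gfd, VFIO_GROUP_GET_STATUS, &status) < 0 ||
        !(status.flags & VFIO_GROUP_FLAGS_VIABLE))
        goto err_group;

    /* Attach the group first ... */
    if (ioctl(gfd, VFIO_GROUP_SET_CONTAINER, &container) < 0)
        goto err_group;

    /* ... so that CHECK_EXTENSION answers for this group, not for vfio
     * in general. */
    if (ioctl(container, VFIO_CHECK_EXTENSION, VFIO_NOIOMMU_IOMMU) <= 0)
        goto err_group;

    if (ioctl(container, VFIO_SET_IOMMU, VFIO_NOIOMMU_IOMMU) < 0)
        goto err_group;

    /* No VFIO_IOMMU_MAP_DMA here: the no-iommu model has no translation. */
    *group_fd_out = gfd;
    return container;

err_group:
    close(gfd);
err_container:
    close(container);
    return -1;
}
```

A caller would pass the group number it already resolved from sysfs
and fall back to the regular path (noiommu = 0, type1 model) when the
noiommu- node does not exist.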