From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ferruh.yigit@intel.com>
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by dpdk.org (Postfix) with ESMTP id 3A18F3005
 for <dev@dpdk.org>; Wed, 16 Dec 2015 05:04:13 +0100 (CET)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
 by fmsmga102.fm.intel.com with ESMTP; 15 Dec 2015 20:04:12 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.20,435,1444719600"; d="scan'208";a="842109466"
Received: from irvmail001.ir.intel.com ([163.33.26.43])
 by orsmga001.jf.intel.com with ESMTP; 15 Dec 2015 20:04:11 -0800
Received: from sivlogin002.ir.intel.com (sivlogin002.ir.intel.com
 [10.237.217.37])
 by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id
 tBG44AR3030529; Wed, 16 Dec 2015 04:04:10 GMT
Received: from sivlogin002.ir.intel.com (localhost [127.0.0.1])
 by sivlogin002.ir.intel.com with ESMTP id tBG4493D022736;
 Wed, 16 Dec 2015 04:04:09 GMT
Received: (from fyigit@localhost)
	by sivlogin002.ir.intel.com with œ id tBG449ug022732;
	Wed, 16 Dec 2015 04:04:09 GMT
X-Authentication-Warning: sivlogin002.ir.intel.com: fyigit set sender to
 ferruh.yigit@intel.com using -f
Date: Wed, 16 Dec 2015 04:04:08 +0000
From: Ferruh Yigit <ferruh.yigit@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Message-ID: <20151216040408.GA18363@sivlogin002.ir.intel.com>
Mail-Followup-To: Alex Williamson <alex.williamson@redhat.com>,
 "O'Driscoll, Tim" <tim.odriscoll@intel.com>,
 Vincent JARDIN <vincent.jardin@6wind.com>,
 "dev@dpdk.org" <dev@dpdk.org>
References: <60420822.AbcfvjLZCk@xps13> <566B4A50.9090607@6wind.com>
 <1449874953.20509.6.camel@redhat.com>
 <26FA93C7ED1EAA44AB77D62FBE1D27BA6747CE55@IRSMSX108.ger.corp.intel.com>
 <1450198398.6042.32.camel@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1450198398.6042.32.camel@redhat.com>
User-Agent: Mutt/1.5.17 (2007-11-01)
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] VFIO no-iommu
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 16 Dec 2015 04:04:13 -0000

On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote:
> On Tue, 2015-12-15 at 13:43 +0000, O'Driscoll, Tim wrote:
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alex
> > > Williamson
> > > Sent: Friday, December 11, 2015 11:03 PM
> > > To: Vincent JARDIN; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] VFIO no-iommu
> > > 
> > > On Fri, 2015-12-11 at 23:12 +0100, Vincent JARDIN wrote:
> > > > Thanks Thomas for putting back this topic.
> > > > 
> > > > Alex,
> > > > 
> > > > I'd like to hear more about the impacts of "unsupported":
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/c
> > > > ommi
> > > > t/?id=033291eccbdb1b70ffc02641edae19ac825dc75d
> > > >    Use of this mode, specifically binding a device without a
> > > > native
> > > >    IOMMU group to a VFIO bus driver will taint the kernel and
> > > > should
> > > >    therefore not be considered supported.
> > > > 
> > > > It means that we get ride of uio; so it is a nice code cleanup:
> > > > but
> > > > why
> > > > would VFIO/NO IOMMU be better if the bottomline is "unsupported"?
> > > 
> > > How supportable do you think the uio method is?  Fundamentally we
> > > have
> > > a userspace driver doing unrestricted DMA; it can access and modify
> > > any
> > > memory in the system.  This is the reason uio won't provide a
> > > mechanism
> > > to enable MSI and if you ask the uio maintainer, they don't support
> > > DMA
> > > at all, it's only intended as a programmed IO interface to the
> > > device.
> > >  Unless we can sandbox a user owned device within an IOMMU
> > > protected
> > > container, it's not supportable.  The VFIO no-iommu mode can simply
> > > provide you that unsupported mode more easily since it leverages
> > > code
> > > from the supported mode, which is IOMMU protected.  Thanks,
> > 
> > Thanks for clarifying.
> > 
> > This does seem like it would be useful for DPDK. We're doing some
> > further investigation to see if it works out of the box with DPDK or
> > if we need to make any changes to support it.
> 
> The iommu model is different, there's no type1 interface available when
> using this mode since we have no ability to provide translation.  The
> no-iommu iommu model really does nothing, which is a possible issue for
> userspace.  Is it sufficient?  We stopped short of creating a page
> pinning interface through the no-iommu model because it requires code
> and adding piles of new code for an interface we claim is unsupported
> doesn't make a lot of sense.  The device interface should be identical
> to existing vfio support.
> 
> > Thomas highlighted that your original commit for this had been
> > reverted. What specifically would you need from us in order to re-
> > submit the VFIO No-IOMMU support?
> 
> No API changes should ever go into the kernel without being validated
> by a user.  Without that we're risking that the kernel interface is
> broken and we're stuck supporting it.  In this case I tried to make
> sure we had a working user before it went it, gambled that it was close
> enough to put in anyway, then paid the price when development went
> silent on the user side.  To get it back in, I'm going to need a
> working use first.  You can re-apply 033291eccbdb or re-
> revert ae5515d66362 for development of that.  I need to see that it
> works and that there's some consensus from the dpdk community that it's
> a worthwhile path forward for cases without an iommu.  There's no point
> in merging it if it only becomes a userspace proof of concept.  Thanks,
> 
I tested DPDK (HEAD of master) with the patch, with Anatoly's help, and
DPDK works in a no-iommu environment with a few small modifications.

Basically, the only modifications are adapting to the new group naming
(noiommu-$) and disabling DMA mapping (VFIO_IOMMU_MAP_DMA).
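
To illustrate the group naming change, here is a minimal sketch. It is not
the actual DPDK change: vfio_group_path() is a hypothetical helper, and it
assumes no-iommu groups show up as /dev/vfio/noiommu-<group> while
IOMMU-backed groups stay at /dev/vfio/<group>.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of DPDK): build the VFIO group device path.
 * Assumption: no-iommu groups are exposed as /dev/vfio/noiommu-<group>,
 * regular IOMMU-backed groups as /dev/vfio/<group>.
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int
vfio_group_path(char *buf, size_t len, int group, int noiommu)
{
	int n;

	if (noiommu)
		n = snprintf(buf, len, "/dev/vfio/noiommu-%d", group);
	else
		n = snprintf(buf, len, "/dev/vfio/%d", group);

	/* snprintf returns the untruncated length; reject truncated paths. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would open the resulting path as usual and, in the no-iommu case,
skip the VFIO_IOMMU_MAP_DMA step.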

I also needed to disable the VFIO_CHECK_EXTENSION ioctl, because in the vfio
module container->noiommu is not set until vfio_group_set_container() has
run, so vfio_for_each_iommu_driver selects the wrong driver.

For the test, I bound two different types of NICs to the VFIO driver and used
testpmd to confirm that packet transfer works.  The kernel was booted without
the iommu enabled, and the vfio module was inserted with the
"enable_unsafe_noiommu_support" parameter.

Thanks,
ferruh