* [dpdk-dev] VFIO no-iommu @ 2015-12-11 16:28 Thomas Monjalon 2015-12-11 22:12 ` Vincent JARDIN 2015-12-11 23:20 ` Jan Viktorin 0 siblings, 2 replies; 35+ messages in thread From: Thomas Monjalon @ 2015-12-11 16:28 UTC (permalink / raw) To: dev; +Cc: Alex Williamson Recently there were some discussions to have an upstream replacement for our igb_uio module. Several solutions were discussed (new uio driver, uio_pci_generic, vfio): https://lkml.org/lkml/2015/10/16/700 Alex Williamson (maintainer of VFIO driver), submitted a solution and was waiting some feedback. Unfortunately, nobody caught it and he has reverted his work: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ae5515d It is an important challenge to remove our out-of-tree modules and especially igb_uio. It is a long way to have a standard solution integrated in every distributions. The current cooking Linux kernel is 4.4 and will have a long term maintenance: https://kernel.org/releases.html So it is a pity to miss this opportunity. Stephen has fixed a bug to use the IOMMU group zero: http://dpdk.org/browse/dpdk/commit/?id=22215f141b1 Is there someone interested to work on VFIO no-iommu and provide some feedbacks? We also need to prepare a documentation patch to explain its usage compared to the standard VFIO mode. Thanks ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-11 16:28 [dpdk-dev] VFIO no-iommu Thomas Monjalon @ 2015-12-11 22:12 ` Vincent JARDIN 2015-12-11 23:02 ` Alex Williamson 2015-12-11 23:20 ` Jan Viktorin 1 sibling, 1 reply; 35+ messages in thread From: Vincent JARDIN @ 2015-12-11 22:12 UTC (permalink / raw) To: dev, Alex Williamson Thanks Thomas for putting back this topic. Alex, I'd like to hear more about the impacts of "unsupported": https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=033291eccbdb1b70ffc02641edae19ac825dc75d Use of this mode, specifically binding a device without a native IOMMU group to a VFIO bus driver will taint the kernel and should therefore not be considered supported. It means that we get ride of uio; so it is a nice code cleanup: but why would VFIO/NO IOMMU be better if the bottomline is "unsupported"? Thank you, Vincent On 11/12/2015 17:28, Thomas Monjalon wrote: > Recently there were some discussions to have an upstream replacement > for our igb_uio module. > Several solutions were discussed (new uio driver, uio_pci_generic, vfio): > https://lkml.org/lkml/2015/10/16/700 > > Alex Williamson (maintainer of VFIO driver), submitted a solution > and was waiting some feedback. Unfortunately, nobody caught it and > he has reverted his work: > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ae5515d > > It is an important challenge to remove our out-of-tree modules and > especially igb_uio. It is a long way to have a standard solution integrated > in every distributions. > The current cooking Linux kernel is 4.4 and will have a long term maintenance: > https://kernel.org/releases.html > So it is a pity to miss this opportunity. > > Stephen has fixed a bug to use the IOMMU group zero: > http://dpdk.org/browse/dpdk/commit/?id=22215f141b1 > > Is there someone interested to work on VFIO no-iommu and provide > some feedbacks? > We also need to prepare a documentation patch to explain its usage > compared to the standard VFIO mode. > > Thanks > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-11 22:12 ` Vincent JARDIN @ 2015-12-11 23:02 ` Alex Williamson 2015-12-15 13:43 ` O'Driscoll, Tim 0 siblings, 1 reply; 35+ messages in thread From: Alex Williamson @ 2015-12-11 23:02 UTC (permalink / raw) To: Vincent JARDIN, dev On Fri, 2015-12-11 at 23:12 +0100, Vincent JARDIN wrote: > Thanks Thomas for putting back this topic. > > Alex, > > I'd like to hear more about the impacts of "unsupported": > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commi > t/?id=033291eccbdb1b70ffc02641edae19ac825dc75d > Use of this mode, specifically binding a device without a native > IOMMU group to a VFIO bus driver will taint the kernel and should > therefore not be considered supported. > > It means that we get ride of uio; so it is a nice code cleanup: but > why > would VFIO/NO IOMMU be better if the bottomline is "unsupported"? How supportable do you think the uio method is? Fundamentally we have a userspace driver doing unrestricted DMA; it can access and modify any memory in the system. This is the reason uio won't provide a mechanism to enable MSI and if you ask the uio maintainer, they don't support DMA at all, it's only intended as a programmed IO interface to the device. Unless we can sandbox a user owned device within an IOMMU protected container, it's not supportable. The VFIO no-iommu mode can simply provide you that unsupported mode more easily since it leverages code from the supported mode, which is IOMMU protected. Thanks, Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-11 23:02 ` Alex Williamson @ 2015-12-15 13:43 ` O'Driscoll, Tim 2015-12-15 16:53 ` Alex Williamson 0 siblings, 1 reply; 35+ messages in thread From: O'Driscoll, Tim @ 2015-12-15 13:43 UTC (permalink / raw) To: Alex Williamson, Vincent JARDIN, dev > -----Original Message----- > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alex Williamson > Sent: Friday, December 11, 2015 11:03 PM > To: Vincent JARDIN; dev@dpdk.org > Subject: Re: [dpdk-dev] VFIO no-iommu > > On Fri, 2015-12-11 at 23:12 +0100, Vincent JARDIN wrote: > > Thanks Thomas for putting back this topic. > > > > Alex, > > > > I'd like to hear more about the impacts of "unsupported": > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commi > > t/?id=033291eccbdb1b70ffc02641edae19ac825dc75d > > Use of this mode, specifically binding a device without a native > > IOMMU group to a VFIO bus driver will taint the kernel and should > > therefore not be considered supported. > > > > It means that we get ride of uio; so it is a nice code cleanup: but > > why > > would VFIO/NO IOMMU be better if the bottomline is "unsupported"? > > How supportable do you think the uio method is? Fundamentally we have > a userspace driver doing unrestricted DMA; it can access and modify any > memory in the system. This is the reason uio won't provide a mechanism > to enable MSI and if you ask the uio maintainer, they don't support DMA > at all, it's only intended as a programmed IO interface to the device. > Unless we can sandbox a user owned device within an IOMMU protected > container, it's not supportable. The VFIO no-iommu mode can simply > provide you that unsupported mode more easily since it leverages code > from the supported mode, which is IOMMU protected. Thanks, Thanks for clarifying. This does seem like it would be useful for DPDK. We're doing some further investigation to see if it works out of the box with DPDK or if we need to make any changes to support it. Thomas highlighted that your original commit for this had been reverted. What specifically would you need from us in order to re-submit the VFIO No-IOMMU support? Tim > > Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-15 13:43 ` O'Driscoll, Tim @ 2015-12-15 16:53 ` Alex Williamson 2015-12-16 4:04 ` Ferruh Yigit 0 siblings, 1 reply; 35+ messages in thread From: Alex Williamson @ 2015-12-15 16:53 UTC (permalink / raw) To: O'Driscoll, Tim, Vincent JARDIN, dev On Tue, 2015-12-15 at 13:43 +0000, O'Driscoll, Tim wrote: > > -----Original Message----- > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alex > > Williamson > > Sent: Friday, December 11, 2015 11:03 PM > > To: Vincent JARDIN; dev@dpdk.org > > Subject: Re: [dpdk-dev] VFIO no-iommu > > > > On Fri, 2015-12-11 at 23:12 +0100, Vincent JARDIN wrote: > > > Thanks Thomas for putting back this topic. > > > > > > Alex, > > > > > > I'd like to hear more about the impacts of "unsupported": > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/c > > > ommi > > > t/?id=033291eccbdb1b70ffc02641edae19ac825dc75d > > > Use of this mode, specifically binding a device without a > > > native > > > IOMMU group to a VFIO bus driver will taint the kernel and > > > should > > > therefore not be considered supported. > > > > > > It means that we get ride of uio; so it is a nice code cleanup: > > > but > > > why > > > would VFIO/NO IOMMU be better if the bottomline is "unsupported"? > > > > How supportable do you think the uio method is? Fundamentally we > > have > > a userspace driver doing unrestricted DMA; it can access and modify > > any > > memory in the system. This is the reason uio won't provide a > > mechanism > > to enable MSI and if you ask the uio maintainer, they don't support > > DMA > > at all, it's only intended as a programmed IO interface to the > > device. > > Unless we can sandbox a user owned device within an IOMMU > > protected > > container, it's not supportable. The VFIO no-iommu mode can simply > > provide you that unsupported mode more easily since it leverages > > code > > from the supported mode, which is IOMMU protected. Thanks, > > Thanks for clarifying. > > This does seem like it would be useful for DPDK. We're doing some > further investigation to see if it works out of the box with DPDK or > if we need to make any changes to support it. The iommu model is different, there's no type1 interface available when using this mode since we have no ability to provide translation. The no-iommu iommu model really does nothing, which is a possible issue for userspace. Is it sufficient? We stopped short of creating a page pinning interface through the no-iommu model because it requires code and adding piles of new code for an interface we claim is unsupported doesn't make a lot of sense. The device interface should be identical to existing vfio support. > Thomas highlighted that your original commit for this had been > reverted. What specifically would you need from us in order to re- > submit the VFIO No-IOMMU support? No API changes should ever go into the kernel without being validated by a user. Without that we're risking that the kernel interface is broken and we're stuck supporting it. In this case I tried to make sure we had a working user before it went it, gambled that it was close enough to put in anyway, then paid the price when development went silent on the user side. To get it back in, I'm going to need a working use first. You can re-apply 033291eccbdb or re- revert ae5515d66362 for development of that. I need to see that it works and that there's some consensus from the dpdk community that it's a worthwhile path forward for cases without an iommu. There's no point in merging it if it only becomes a userspace proof of concept. Thanks, Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-15 16:53 ` Alex Williamson @ 2015-12-16 4:04 ` Ferruh Yigit 2015-12-16 4:38 ` Alex Williamson 0 siblings, 1 reply; 35+ messages in thread From: Ferruh Yigit @ 2015-12-16 4:04 UTC (permalink / raw) To: Alex Williamson; +Cc: dev On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > On Tue, 2015-12-15 at 13:43 +0000, O'Driscoll, Tim wrote: > > > -----Original Message----- > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alex > > > Williamson > > > Sent: Friday, December 11, 2015 11:03 PM > > > To: Vincent JARDIN; dev@dpdk.org > > > Subject: Re: [dpdk-dev] VFIO no-iommu > > > > > > On Fri, 2015-12-11 at 23:12 +0100, Vincent JARDIN wrote: > > > > Thanks Thomas for putting back this topic. > > > > > > > > Alex, > > > > > > > > I'd like to hear more about the impacts of "unsupported": > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/c > > > > ommi > > > > t/?id=033291eccbdb1b70ffc02641edae19ac825dc75d > > > > Use of this mode, specifically binding a device without a > > > > native > > > > IOMMU group to a VFIO bus driver will taint the kernel and > > > > should > > > > therefore not be considered supported. > > > > > > > > It means that we get ride of uio; so it is a nice code cleanup: > > > > but > > > > why > > > > would VFIO/NO IOMMU be better if the bottomline is "unsupported"? > > > > > > How supportable do you think the uio method is? Fundamentally we > > > have > > > a userspace driver doing unrestricted DMA; it can access and modify > > > any > > > memory in the system. This is the reason uio won't provide a > > > mechanism > > > to enable MSI and if you ask the uio maintainer, they don't support > > > DMA > > > at all, it's only intended as a programmed IO interface to the > > > device. > > > Unless we can sandbox a user owned device within an IOMMU > > > protected > > > container, it's not supportable. The VFIO no-iommu mode can simply > > > provide you that unsupported mode more easily since it leverages > > > code > > > from the supported mode, which is IOMMU protected. Thanks, > > > > Thanks for clarifying. > > > > This does seem like it would be useful for DPDK. We're doing some > > further investigation to see if it works out of the box with DPDK or > > if we need to make any changes to support it. > > The iommu model is different, there's no type1 interface available when > using this mode since we have no ability to provide translation. The > no-iommu iommu model really does nothing, which is a possible issue for > userspace. Is it sufficient? We stopped short of creating a page > pinning interface through the no-iommu model because it requires code > and adding piles of new code for an interface we claim is unsupported > doesn't make a lot of sense. The device interface should be identical > to existing vfio support. > > > Thomas highlighted that your original commit for this had been > > reverted. What specifically would you need from us in order to re- > > submit the VFIO No-IOMMU support? > > No API changes should ever go into the kernel without being validated > by a user. Without that we're risking that the kernel interface is > broken and we're stuck supporting it. In this case I tried to make > sure we had a working user before it went it, gambled that it was close > enough to put in anyway, then paid the price when development went > silent on the user side. To get it back in, I'm going to need a > working use first. You can re-apply 033291eccbdb or re- > revert ae5515d66362 for development of that. I need to see that it > works and that there's some consensus from the dpdk community that it's > a worthwhile path forward for cases without an iommu. There's no point > in merging it if it only becomes a userspace proof of concept. Thanks, > I tested the DPDK (HEAD of master) with the patch, with help of Anatoly, and DPDK works in no-iommu environment with a little modification. Basically the only modification is adapt new group naming (noiommu-$) and disable dma mapping (VFIO_IOMMU_MAP_DMA) Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in vfio module, container->noiommu is not set before doing a vfio_group_set_container() and vfio_for_each_iommu_driver selects wrong driver. What I test is bind two different type of NICs into VFIO driver, and use testpmd to confirm transfer is working. Kernel booted without iommu enabled, vfio module inserted with "enable_unsafe_noiommu_support" parameter. Thanks, ferruh ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-16 4:04 ` Ferruh Yigit @ 2015-12-16 4:38 ` Alex Williamson 2015-12-16 8:35 ` Burakov, Anatoly 2016-01-14 6:03 ` Jike Song 0 siblings, 2 replies; 35+ messages in thread From: Alex Williamson @ 2015-12-16 4:38 UTC (permalink / raw) To: Ferruh Yigit; +Cc: dev On Wed, 2015-12-16 at 04:04 +0000, Ferruh Yigit wrote: > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > On Tue, 2015-12-15 at 13:43 +0000, O'Driscoll, Tim wrote: > > > > -----Original Message----- > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alex > > > > Williamson > > > > Sent: Friday, December 11, 2015 11:03 PM > > > > To: Vincent JARDIN; dev@dpdk.org > > > > Subject: Re: [dpdk-dev] VFIO no-iommu > > > > > > > > On Fri, 2015-12-11 at 23:12 +0100, Vincent JARDIN wrote: > > > > > Thanks Thomas for putting back this topic. > > > > > > > > > > Alex, > > > > > > > > > > I'd like to hear more about the impacts of "unsupported": > > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.g > > > > > it/c > > > > > ommi > > > > > t/?id=033291eccbdb1b70ffc02641edae19ac825dc75d > > > > > Use of this mode, specifically binding a device without a > > > > > native > > > > > IOMMU group to a VFIO bus driver will taint the kernel and > > > > > should > > > > > therefore not be considered supported. > > > > > > > > > > It means that we get ride of uio; so it is a nice code > > > > > cleanup: > > > > > but > > > > > why > > > > > would VFIO/NO IOMMU be better if the bottomline is > > > > > "unsupported"? > > > > > > > > How supportable do you think the uio method is? Fundamentally > > > > we > > > > have > > > > a userspace driver doing unrestricted DMA; it can access and > > > > modify > > > > any > > > > memory in the system. This is the reason uio won't provide a > > > > mechanism > > > > to enable MSI and if you ask the uio maintainer, they don't > > > > support > > > > DMA > > > > at all, it's only intended as a programmed IO interface to the > > > > device. > > > > Unless we can sandbox a user owned device within an IOMMU > > > > protected > > > > container, it's not supportable. The VFIO no-iommu mode can > > > > simply > > > > provide you that unsupported mode more easily since it > > > > leverages > > > > code > > > > from the supported mode, which is IOMMU protected. Thanks, > > > > > > Thanks for clarifying. > > > > > > This does seem like it would be useful for DPDK. We're doing some > > > further investigation to see if it works out of the box with DPDK > > > or > > > if we need to make any changes to support it. > > > > The iommu model is different, there's no type1 interface available > > when > > using this mode since we have no ability to provide translation. > > The > > no-iommu iommu model really does nothing, which is a possible issue > > for > > userspace. Is it sufficient? We stopped short of creating a page > > pinning interface through the no-iommu model because it requires > > code > > and adding piles of new code for an interface we claim is > > unsupported > > doesn't make a lot of sense. The device interface should be > > identical > > to existing vfio support. > > > > > Thomas highlighted that your original commit for this had been > > > reverted. What specifically would you need from us in order to > > > re- > > > submit the VFIO No-IOMMU support? > > > > No API changes should ever go into the kernel without being > > validated > > by a user. Without that we're risking that the kernel interface is > > broken and we're stuck supporting it. In this case I tried to make > > sure we had a working user before it went it, gambled that it was > > close > > enough to put in anyway, then paid the price when development went > > silent on the user side. To get it back in, I'm going to need a > > working use first. You can re-apply 033291eccbdb or re- > > revert ae5515d66362 for development of that. I need to see that it > > works and that there's some consensus from the dpdk community that > > it's > > a worthwhile path forward for cases without an iommu. There's no > > point > > in merging it if it only becomes a userspace proof of concept. > > Thanks, > > > I tested the DPDK (HEAD of master) with the patch, with help of > Anatoly, > and DPDK works in no-iommu environment with a little modification. > > Basically the only modification is adapt new group naming (noiommu-$) > and Sorry, forgot to mention that one. The intention with the modified group name is that I want to be very certain that a user intending to only support properly iommu isolated devices doesn't accidentally need to deal with these no-iommu mode devices. > disable dma mapping (VFIO_IOMMU_MAP_DMA) > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in vfio > module, > container->noiommu is not set before doing a > vfio_group_set_container() > and vfio_for_each_iommu_driver selects wrong driver. Running CHECK_EXTENSION on a container without the group attached is only going to tell you what extensions vfio is capable of, not necessarily what extensions are available to you with that group. Is this just a general dpdk-vfio ordering bug? > What I test is bind two different type of NICs into VFIO driver, and > use > testpmd to confirm transfer is working. Kernel booted without iommu > enabled, > vfio module inserted with "enable_unsafe_noiommu_support" parameter. So it works. Is it acceptable? Useful? Sufficiently complete? Does it imply deprecating the uio interface? I believe the feature that started this discussion was support for MSI/X interrupts so that VFs can support some kind of interrupt (uio only supports INTx since it doesn't allow DMA). Implementing that would be the ultimate test of whether this provides dpdk with not only a more consistent interface, but the feature dpdk wants that's missing in uio. Thanks, Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-16 4:38 ` Alex Williamson @ 2015-12-16 8:35 ` Burakov, Anatoly 2015-12-16 16:23 ` Burakov, Anatoly 2015-12-16 17:11 ` Alex Williamson 2016-01-14 6:03 ` Jike Song 1 sibling, 2 replies; 35+ messages in thread From: Burakov, Anatoly @ 2015-12-16 8:35 UTC (permalink / raw) To: Alex Williamson, Yigit, Ferruh; +Cc: dev Hi Alex, > On Wed, 2015-12-16 at 04:04 +0000, Ferruh Yigit wrote: > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > I tested the DPDK (HEAD of master) with the patch, with help of > > Anatoly, and DPDK works in no-iommu environment with a little > > modification. > > > > Basically the only modification is adapt new group naming (noiommu-$) > > and > > Sorry, forgot to mention that one. The intention with the modified group > name is that I want to be very certain that a user intending to only support > properly iommu isolated devices doesn't accidentally need to deal with these > no-iommu mode devices. > > > disable dma mapping (VFIO_IOMMU_MAP_DMA) > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in vfio > > module, > > container->noiommu is not set before doing a > > vfio_group_set_container() > > and vfio_for_each_iommu_driver selects wrong driver. > > Running CHECK_EXTENSION on a container without the group attached is > only going to tell you what extensions vfio is capable of, not necessarily what > extensions are available to you with that group. Is this just a general dpdk- > vfio ordering bug? Yes, that is how VFIO was implemented in DPDK. I was under the impression that checking extension before assigning devices was the correct way to do things, so as to not to try anything we know would fail anyway. Does this imply that CHECK_EXTENSION needs to be called on both container and groups (or just on groups)? > > > What I test is bind two different type of NICs into VFIO driver, and > > use testpmd to confirm transfer is working. Kernel booted without > > iommu enabled, vfio module inserted with > > "enable_unsafe_noiommu_support" parameter. > > So it works. Is it acceptable? Useful? Sufficiently complete? Does it imply > deprecating the uio interface? I believe the feature that started this > discussion was support for MSI/X interrupts so that VFs can support some > kind of interrupt (uio only supports INTx since it doesn't allow > DMA). Implementing that would be the ultimate test of whether this > provides dpdk with not only a more consistent interface, but the feature > dpdk wants that's missing in uio. Thanks, More testing will be needed, especially regarding interrupts, we will keep you updated. Thanks, Anatoly > > Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-16 8:35 ` Burakov, Anatoly @ 2015-12-16 16:23 ` Burakov, Anatoly 2015-12-16 23:17 ` Thomas Monjalon 2015-12-16 17:11 ` Alex Williamson 1 sibling, 1 reply; 35+ messages in thread From: Burakov, Anatoly @ 2015-12-16 16:23 UTC (permalink / raw) To: Thomas Monjalon (thomas.monjalon@6wind.com); +Cc: 'dev@dpdk.org' Hi Thomas, > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > So it works. Is it acceptable? Useful? Sufficiently complete? Does > > it imply deprecating the uio interface? I believe the feature that > > started this discussion was support for MSI/X interrupts so that VFs > > can support some kind of interrupt (uio only supports INTx since it > > doesn't allow DMA). Implementing that would be the ultimate test of > > whether this provides dpdk with not only a more consistent interface, > > but the feature dpdk wants that's missing in uio. Thanks, Ferruh has done a great job so far testing Alex's patch, very few changes from DPDK side seem to be required as far as existing functionality goes (not sure about VF interrupts mentioned by Alex). However, one thing that concerns me is usability. While it is true that no-IOMMU mode in VFIO would mean uio interfaces could be deprecated in time, the no-iommu mode is way more hassle than using igb_uio/uio_pci_generic because it will require a kernel recompile as opposed to simply compiling and insmod'ding an out-of-tree driver. So, in essence, if you don't want an IOMMU, it's becoming that much harder to use DPDK. Would that be something DPDK is willing to live with in the absence of uio interfaces? Thanks, Anatoly ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-16 16:23 ` Burakov, Anatoly @ 2015-12-16 23:17 ` Thomas Monjalon 2015-12-17 9:52 ` Burakov, Anatoly 0 siblings, 1 reply; 35+ messages in thread From: Thomas Monjalon @ 2015-12-16 23:17 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: dev 2015-12-16 16:23, Burakov, Anatoly: > Hi Thomas, > > > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > > So it works. Is it acceptable? Useful? Sufficiently complete? Does > > > it imply deprecating the uio interface? I believe the feature that > > > started this discussion was support for MSI/X interrupts so that VFs > > > can support some kind of interrupt (uio only supports INTx since it > > > doesn't allow DMA). Implementing that would be the ultimate test of > > > whether this provides dpdk with not only a more consistent interface, > > > but the feature dpdk wants that's missing in uio. Thanks, > > Ferruh has done a great job so far testing Alex's patch, very few changes from DPDK side seem to be required as far as existing functionality goes (not sure about VF interrupts mentioned by Alex). However, one thing that concerns me is usability. While it is true that no-IOMMU mode in VFIO would mean uio interfaces could be deprecated in time, the no-iommu mode is way more hassle than using igb_uio/uio_pci_generic because it will require a kernel recompile as opposed to simply compiling and insmod'ding an out-of-tree driver. So, in essence, if you don't want an IOMMU, it's becoming that much harder to use DPDK. Would that be something DPDK is willing to live with in the absence of uio interfaces? Excuse me if I missed something obvious. Why a kernel compilation is needed? ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-16 23:17 ` Thomas Monjalon @ 2015-12-17 9:52 ` Burakov, Anatoly 2015-12-17 10:09 ` Thomas Monjalon 0 siblings, 1 reply; 35+ messages in thread From: Burakov, Anatoly @ 2015-12-17 9:52 UTC (permalink / raw) To: Thomas Monjalon; +Cc: dev Hi Thomas, > > Hi Thomas, > > > > > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > > > So it works. Is it acceptable? Useful? Sufficiently complete? > > > > Does it imply deprecating the uio interface? I believe the > > > > feature that started this discussion was support for MSI/X > > > > interrupts so that VFs can support some kind of interrupt (uio > > > > only supports INTx since it doesn't allow DMA). Implementing that > > > > would be the ultimate test of whether this provides dpdk with not > > > > only a more consistent interface, but the feature dpdk wants > > > > that's missing in uio. Thanks, > > > > Ferruh has done a great job so far testing Alex's patch, very few changes > from DPDK side seem to be required as far as existing functionality goes (not > sure about VF interrupts mentioned by Alex). However, one thing that > concerns me is usability. While it is true that no-IOMMU mode in VFIO would > mean uio interfaces could be deprecated in time, the no-iommu mode is way > more hassle than using igb_uio/uio_pci_generic because it will require a > kernel recompile as opposed to simply compiling and insmod'ding an out-of- > tree driver. So, in essence, if you don't want an IOMMU, it's becoming that > much harder to use DPDK. Would that be something DPDK is willing to live > with in the absence of uio interfaces? > > Excuse me if I missed something obvious. > Why a kernel compilation is needed? Well, not really full kernel compilation, but in the default configuration, VFIO driver would not support NOIOMMU mode. I.e. it's not compiled by default. Support for no-iommu should be enabled in kernel config and compiled in. So, whoever is going to use DPDK with VFIO-no-iommu will have to download kernel tree and recompile the VFIO module and install it. That's obviously way more hassle than simply compiling an out-of-tree driver that's already included and works with an out-of-the-box kernel. Thanks, Anatoly ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-17 9:52 ` Burakov, Anatoly @ 2015-12-17 10:09 ` Thomas Monjalon 2015-12-17 19:38 ` Jan Viktorin 0 siblings, 1 reply; 35+ messages in thread From: Thomas Monjalon @ 2015-12-17 10:09 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: dev Hi, 2015-12-17 09:52, Burakov, Anatoly: > > > > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > > > > So it works. Is it acceptable? Useful? Sufficiently complete? > > > > > Does it imply deprecating the uio interface? I believe the > > > > > feature that started this discussion was support for MSI/X > > > > > interrupts so that VFs can support some kind of interrupt (uio > > > > > only supports INTx since it doesn't allow DMA). Implementing that > > > > > would be the ultimate test of whether this provides dpdk with not > > > > > only a more consistent interface, but the feature dpdk wants > > > > > that's missing in uio. Thanks, > > > > > > Ferruh has done a great job so far testing Alex's patch, very few changes > > from DPDK side seem to be required as far as existing functionality goes (not > > sure about VF interrupts mentioned by Alex). However, one thing that > > concerns me is usability. While it is true that no-IOMMU mode in VFIO would > > mean uio interfaces could be deprecated in time, the no-iommu mode is way > > more hassle than using igb_uio/uio_pci_generic because it will require a > > kernel recompile as opposed to simply compiling and insmod'ding an out-of- > > tree driver. So, in essence, if you don't want an IOMMU, it's becoming that > > much harder to use DPDK. Would that be something DPDK is willing to live > > with in the absence of uio interfaces? > > > > Excuse me if I missed something obvious. > > Why a kernel compilation is needed? > > Well, not really full kernel compilation, but in the default configuration, VFIO driver would not support NOIOMMU mode. I.e. it's not compiled by default. Support for no-iommu should be enabled in kernel config and compiled in. So, whoever is going to use DPDK with VFIO-no-iommu will have to download kernel tree and recompile the VFIO module and install it. That's obviously way more hassle than simply compiling an out-of-tree driver that's already included and works with an out-of-the-box kernel. The "out-of-the-box kernel" is configured by your distribution. So we don't know yet what will be their choice. If the distribution supports DPDK, it should be enabled. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-17 10:09 ` Thomas Monjalon @ 2015-12-17 19:38 ` Jan Viktorin 2015-12-17 21:16 ` Vincent JARDIN 2015-12-17 23:29 ` Stephen Hemminger 0 siblings, 2 replies; 35+ messages in thread From: Jan Viktorin @ 2015-12-17 19:38 UTC (permalink / raw) To: Thomas Monjalon; +Cc: dev On Thu, 17 Dec 2015 11:09:23 +0100 Thomas Monjalon <thomas.monjalon@6wind.com> wrote: > Hi, > > 2015-12-17 09:52, Burakov, Anatoly: > > > > > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > > > > > So it works. Is it acceptable? Useful? Sufficiently complete? > > > > > > Does it imply deprecating the uio interface? I believe the > > > > > > feature that started this discussion was support for MSI/X > > > > > > interrupts so that VFs can support some kind of interrupt (uio > > > > > > only supports INTx since it doesn't allow DMA). Implementing that > > > > > > would be the ultimate test of whether this provides dpdk with not > > > > > > only a more consistent interface, but the feature dpdk wants > > > > > > that's missing in uio. Thanks, > > > > > > > > Ferruh has done a great job so far testing Alex's patch, very few changes > > > from DPDK side seem to be required as far as existing functionality goes (not > > > sure about VF interrupts mentioned by Alex). However, one thing that > > > concerns me is usability. While it is true that no-IOMMU mode in VFIO would > > > mean uio interfaces could be deprecated in time, the no-iommu mode is way > > > more hassle than using igb_uio/uio_pci_generic because it will require a > > > kernel recompile as opposed to simply compiling and insmod'ding an out-of- > > > tree driver. So, in essence, if you don't want an IOMMU, it's becoming that > > > much harder to use DPDK. Would that be something DPDK is willing to live > > > with in the absence of uio interfaces? > > > > > > Excuse me if I missed something obvious. > > > Why a kernel compilation is needed? > > > > Well, not really full kernel compilation, but in the default configuration, VFIO driver would not support NOIOMMU mode. I.e. it's not compiled by default. Support for no-iommu should be enabled in kernel config and compiled in. So, whoever is going to use DPDK with VFIO-no-iommu will have to download kernel tree and recompile the VFIO module and install it. That's obviously way more hassle than simply compiling an out-of-tree driver that's already included and works with an out-of-the-box kernel. > > The "out-of-the-box kernel" is configured by your distribution. > So we don't know yet what will be their choice. > If the distribution supports DPDK, it should be enabled. I have a question as I am not involved in all possible DPDK configurations, platforms, etc. and not yet very involved in vfio. What are the devices which do not have IOMMU? If I have, say, DPDK 2.3 with vfio-noiommu, which platforms (or computer systems) I am targeting? Would it be an Intel-based system? Would it be PPC8, ARM? If it is ARMv7... I would say that the fact I have to explicitly enable the no-IOMMU feature and rebuild the kernel (or whatever) is just OK. As for such systems, it is common to have a quite customized OS. Well, the big distributions are able to run on those devices, that's true... However, in such case, the users are usually skilled enough to take care of having their own special Linux kernel. So, is the fact the distributions would not support the no-IOMMU setup in their default configuration really an issue? Will some very common Intel/DPDK-based box need this? Regards Jan ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-17 19:38 ` Jan Viktorin @ 2015-12-17 21:16 ` Vincent JARDIN 2015-12-17 23:29 ` Stephen Hemminger 1 sibling, 0 replies; 35+ messages in thread From: Vincent JARDIN @ 2015-12-17 21:16 UTC (permalink / raw) To: Jan Viktorin; +Cc: dev On 17/12/2015 20:38, Jan Viktorin wrote: > which platforms (or computer systems) I am targeting? It is about VMs on IOMMU capable systems. What if you need to use SRIOV with IXGBE, or IGB devices? For some DPDK cases, like Mellanox or virtio, you do not need to use VFIO/UIO into the guests, so no issue. But for some other PMDs, you need a VFIO/UIO. Best regards, Vincent ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-17 19:38 ` Jan Viktorin 2015-12-17 21:16 ` Vincent JARDIN @ 2015-12-17 23:29 ` Stephen Hemminger 1 sibling, 0 replies; 35+ messages in thread From: Stephen Hemminger @ 2015-12-17 23:29 UTC (permalink / raw) To: Jan Viktorin; +Cc: dev On Thu, 17 Dec 2015 20:38:16 +0100 Jan Viktorin <viktorin@rehivetech.com> wrote: > On Thu, 17 Dec 2015 11:09:23 +0100 > Thomas Monjalon <thomas.monjalon@6wind.com> wrote: > > > Hi, > > > > 2015-12-17 09:52, Burakov, Anatoly: > > > > > > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > > > > > > So it works. Is it acceptable? Useful? Sufficiently complete? > > > > > > > Does it imply deprecating the uio interface? I believe the > > > > > > > feature that started this discussion was support for MSI/X > > > > > > > interrupts so that VFs can support some kind of interrupt (uio > > > > > > > only supports INTx since it doesn't allow DMA). Implementing that > > > > > > > would be the ultimate test of whether this provides dpdk with not > > > > > > > only a more consistent interface, but the feature dpdk wants > > > > > > > that's missing in uio. Thanks, > > > > > > > > > > Ferruh has done a great job so far testing Alex's patch, very few changes > > > > from DPDK side seem to be required as far as existing functionality goes (not > > > > sure about VF interrupts mentioned by Alex). However, one thing that > > > > concerns me is usability. While it is true that no-IOMMU mode in VFIO would > > > > mean uio interfaces could be deprecated in time, the no-iommu mode is way > > > > more hassle than using igb_uio/uio_pci_generic because it will require a > > > > kernel recompile as opposed to simply compiling and insmod'ding an out-of- > > > > tree driver. So, in essence, if you don't want an IOMMU, it's becoming that > > > > much harder to use DPDK. Would that be something DPDK is willing to live > > > > with in the absence of uio interfaces? > > > > > > > > Excuse me if I missed something obvious. > > > > Why a kernel compilation is needed? > > > > > > Well, not really full kernel compilation, but in the default configuration, VFIO driver would not support NOIOMMU mode. I.e. it's not compiled by default. Support for no-iommu should be enabled in kernel config and compiled in. So, whoever is going to use DPDK with VFIO-no-iommu will have to download kernel tree and recompile the VFIO module and install it. That's obviously way more hassle than simply compiling an out-of-tree driver that's already included and works with an out-of-the-box kernel. > > > > The "out-of-the-box kernel" is configured by your distribution. > > So we don't know yet what will be their choice. > > If the distribution supports DPDK, it should be enabled. > > I have a question as I am not involved in all possible DPDK > configurations, platforms, etc. and not yet very involved in vfio. What > are the devices which do not have IOMMU? If I have, say, DPDK 2.3 with > vfio-noiommu, which platforms (or computer systems) I am targeting? > > Would it be an Intel-based system? Would it be PPC8, ARM? > > If it is ARMv7... I would say that the fact I have to explicitly enable > the no-IOMMU feature and rebuild the kernel (or whatever) is just OK. As > for such systems, it is common to have a quite customized OS. Well, > the big distributions are able to run on those devices, that's true... > However, in such case, the users are usually skilled enough to take > care of having their own special Linux kernel. > > So, is the fact the distributions would not support the no-IOMMU setup > in their default configuration really an issue? Will some very common > Intel/DPDK-based box need this? > > Regards > Jan So far: * broken hardware (many systems including those from Dell) do not provide working IOMMU because of Bios bugs etc. * Linux guest in VMware/KVM/Hyper-V. There is no IOMMU emulation in most of these systems. * Older smaller systems (ie Atom) may not have IOMMU ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-16 8:35 ` Burakov, Anatoly 2015-12-16 16:23 ` Burakov, Anatoly @ 2015-12-16 17:11 ` Alex Williamson 2015-12-16 17:22 ` Burakov, Anatoly 1 sibling, 1 reply; 35+ messages in thread From: Alex Williamson @ 2015-12-16 17:11 UTC (permalink / raw) To: Burakov, Anatoly, Yigit, Ferruh; +Cc: dev On Wed, 2015-12-16 at 08:35 +0000, Burakov, Anatoly wrote: > Hi Alex, > > > On Wed, 2015-12-16 at 04:04 +0000, Ferruh Yigit wrote: > > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > > I tested the DPDK (HEAD of master) with the patch, with help of > > > Anatoly, and DPDK works in no-iommu environment with a little > > > modification. > > > > > > Basically the only modification is adapt new group naming > > > (noiommu-$) > > > and > > > > Sorry, forgot to mention that one. The intention with the modified > > group > > name is that I want to be very certain that a user intending to > > only support > > properly iommu isolated devices doesn't accidentally need to deal > > with these > > no-iommu mode devices. > > > > > disable dma mapping (VFIO_IOMMU_MAP_DMA) > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in > > > vfio > > > module, > > > container->noiommu is not set before doing a > > > vfio_group_set_container() > > > and vfio_for_each_iommu_driver selects wrong driver. > > > > Running CHECK_EXTENSION on a container without the group attached > > is > > only going to tell you what extensions vfio is capable of, not > > necessarily what > > extensions are available to you with that group. Is this just a > > general dpdk- > > vfio ordering bug? > > Yes, that is how VFIO was implemented in DPDK. I was under the > impression that checking extension before assigning devices was the > correct way to do things, so as to not to try anything we know would > fail anyway. Does this imply that CHECK_EXTENSION needs to be called > on both container and groups (or just on groups)? Hmm, in Documentation/vfio.txt we do give the following algorithm: if (ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION) /* Unknown API version */ if (!ioctl(container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) /* Doesn't support the IOMMU driver we want. */ ... That's just going to query each iommu driver and we can't yet say whether the group the user attaches to the container later will actually support that extension until we try to do it, that would come at VFIO_SET_IOMMU. So is it perhaps a vfio bug that we're not advertising no-iommu until the group is attached? After all, we are capable of it with just an empty container, just like we are with type1, but we're going to fail SET_IOMMU for the wrong combination. This is exactly the sort of thing that makes me glad we reverted it without feedback from a working user driver. Thanks, Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-16 17:11 ` Alex Williamson @ 2015-12-16 17:22 ` Burakov, Anatoly 2015-12-17 16:43 ` Alex Williamson 0 siblings, 1 reply; 35+ messages in thread From: Burakov, Anatoly @ 2015-12-16 17:22 UTC (permalink / raw) To: Alex Williamson, Yigit, Ferruh; +Cc: dev Hi Alex, > On Wed, 2015-12-16 at 08:35 +0000, Burakov, Anatoly wrote: > > Hi Alex, > > > > > On Wed, 2015-12-16 at 04:04 +0000, Ferruh Yigit wrote: > > > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson wrote: > > > > I tested the DPDK (HEAD of master) with the patch, with help of > > > > Anatoly, and DPDK works in no-iommu environment with a little > > > > modification. > > > > > > > > Basically the only modification is adapt new group naming > > > > (noiommu-$) > > > > and > > > > > > Sorry, forgot to mention that one. The intention with the modified > > > group name is that I want to be very certain that a user intending > > > to only support properly iommu isolated devices doesn't accidentally > > > need to deal with these no-iommu mode devices. > > > > > > > disable dma mapping (VFIO_IOMMU_MAP_DMA) > > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in vfio > > > > module, > > > > container->noiommu is not set before doing a > > > > vfio_group_set_container() > > > > and vfio_for_each_iommu_driver selects wrong driver. > > > > > > Running CHECK_EXTENSION on a container without the group attached is > > > only going to tell you what extensions vfio is capable of, not > > > necessarily what extensions are available to you with that group. > > > Is this just a general dpdk- vfio ordering bug? > > > > Yes, that is how VFIO was implemented in DPDK. I was under the > > impression that checking extension before assigning devices was the > > correct way to do things, so as to not to try anything we know would > > fail anyway. Does this imply that CHECK_EXTENSION needs to be called > > on both container and groups (or just on groups)? > > Hmm, in Documentation/vfio.txt we do give the following algorithm: > > if (ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION) > /* Unknown API version */ > > if (!ioctl(container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) > /* Doesn't support the IOMMU driver we want. */ > ... > > That's just going to query each iommu driver and we can't yet say whether > the group the user attaches to the container later will actually support that > extension until we try to do it, that would come at VFIO_SET_IOMMU. So is > it perhaps a vfio bug that we're not advertising no-iommu until the group is > attached? After all, we are capable of it with just an empty container, just > like we are with type1, but we're going to fail SET_IOMMU for the wrong > combination. > This is exactly the sort of thing that makes me glad we reverted it without > feedback from a working user driver. Thanks, Whether it should be considered a "bug" in VFIO or "by design" is up to you, of course, but at least according to the VFIO documentation, we are meant to check for type 1 extension and then attach devices, so it would be expected to get VFIO_NOIOMMU_IOMMU marked as supported even without any devices attached to the container (just like we get type 1 as supported without any devices attached). Having said that, if it was meant to attach devices first and then check the extensions, then perhaps the documentation should also point out that fact (or perhaps I missed that detail in my readings of the docs, in which case my apologies). Thanks, Anatoly ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-16 17:22 ` Burakov, Anatoly @ 2015-12-17 16:43 ` Alex Williamson 2015-12-18 10:43 ` Yigit, Ferruh 0 siblings, 1 reply; 35+ messages in thread From: Alex Williamson @ 2015-12-17 16:43 UTC (permalink / raw) To: Burakov, Anatoly, Yigit, Ferruh; +Cc: dev On Wed, 2015-12-16 at 17:22 +0000, Burakov, Anatoly wrote: > Hi Alex, > > > On Wed, 2015-12-16 at 08:35 +0000, Burakov, Anatoly wrote: > > > Hi Alex, > > > > > > > On Wed, 2015-12-16 at 04:04 +0000, Ferruh Yigit wrote: > > > > > On Tue, Dec 15, 2015 at 09:53:18AM -0700, Alex Williamson > > > > > wrote: > > > > > I tested the DPDK (HEAD of master) with the patch, with help > > > > > of > > > > > Anatoly, and DPDK works in no-iommu environment with a little > > > > > modification. > > > > > > > > > > Basically the only modification is adapt new group naming > > > > > (noiommu-$) > > > > > and > > > > > > > > Sorry, forgot to mention that one. The intention with the > > > > modified > > > > group name is that I want to be very certain that a user > > > > intending > > > > to only support properly iommu isolated devices doesn't > > > > accidentally > > > > need to deal with these no-iommu mode devices. > > > > > > > > > disable dma mapping (VFIO_IOMMU_MAP_DMA) > > > > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in > > > > > vfio > > > > > module, > > > > > container->noiommu is not set before doing a > > > > > vfio_group_set_container() > > > > > and vfio_for_each_iommu_driver selects wrong driver. > > > > > > > > Running CHECK_EXTENSION on a container without the group > > > > attached is > > > > only going to tell you what extensions vfio is capable of, not > > > > necessarily what extensions are available to you with that > > > > group. > > > > Is this just a general dpdk- vfio ordering bug? > > > > > > Yes, that is how VFIO was implemented in DPDK. I was under the > > > impression that checking extension before assigning devices was > > > the > > > correct way to do things, so as to not to try anything we know > > > would > > > fail anyway. Does this imply that CHECK_EXTENSION needs to be > > > called > > > on both container and groups (or just on groups)? > > > > Hmm, in Documentation/vfio.txt we do give the following algorithm: > > > > if (ioctl(container, VFIO_GET_API_VERSION) != > > VFIO_API_VERSION) > > /* Unknown API version */ > > > > if (!ioctl(container, VFIO_CHECK_EXTENSION, > > VFIO_TYPE1_IOMMU)) > > /* Doesn't support the IOMMU driver we want. */ > > ... > > > > That's just going to query each iommu driver and we can't yet say > > whether > > the group the user attaches to the container later will actually > > support that > > extension until we try to do it, that would come at VFIO_SET_IOMMU. > > So is > > it perhaps a vfio bug that we're not advertising no-iommu until the > > group is > > attached? After all, we are capable of it with just an empty > > container, just > > like we are with type1, but we're going to fail SET_IOMMU for the > > wrong > > combination. > > This is exactly the sort of thing that makes me glad we reverted > > it without > > feedback from a working user driver. Thanks, > > Whether it should be considered a "bug" in VFIO or "by design" is up > to you, of course, but at least according to the VFIO documentation, > we are meant to check for type 1 extension and then attach devices, > so it would be expected to get VFIO_NOIOMMU_IOMMU marked as supported > even without any devices attached to the container (just like we get > type 1 as supported without any devices attached). Having said that, > if it was meant to attach devices first and then check the > extensions, then perhaps the documentation should also point out that > fact (or perhaps I missed that detail in my readings of the docs, in > which case my apologies). Hi Anatoly, Does the below patch make it behave more like you'd expect. This applies to v4.4-rc4, I'd fold this into the base patch if we reincorporate it to a future kernel. Thanks, Alex commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94 Author: Alex Williamson <alex.williamson@redhat.com> Date: Wed Dec 16 19:02:01 2015 -0700 vfio: Fix no-iommu CHECK_EXTENSION Previously the no-iommu iommu driver was only visible when the container had an attached no-iommu group. This means that CHECK_EXTENSION on and empty container couldn't report the possibility of using VFIO_NOIOMMU_IOMMU. We report TYPE1 whether or not the user can make use of it with the group, so this is inconsistent. Add the no-iommu iommu to the list of iommu drivers when enabled via module option, but skip all the others if the container is attached to a no-iommu groups. Note that tainting is now done with the "unsafe" module callback rather than explictly within vfio. Also fixes module option and module description name inconsistency. Also make vfio_noiommu_ops const. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index de632da..d3a9432 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -99,9 +99,6 @@ struct vfio_device { #ifdef CONFIG_VFIO_NOIOMMU static bool noiommu __read_mostly; -module_param_named(enable_unsafe_noiommu_support, - noiommu, bool, S_IRUGO | S_IWUSR); -MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU mode. This mode provides no device isolation, no DMA translation, no host kernel protection, cannot be used for device assignment to virtual machines, requires RAWIO permissions, and will taint the kernel. If you do not know what this is for, step away. (default: false)"); #endif /* @@ -138,17 +135,6 @@ struct iommu_group *vfio_iommu_group_get(struct device *dev) iommu_group_put(group); if (ret) return NULL; - - /* - * Where to taint? At this point we've added an IOMMU group for a - * device that is not backed by iommu_ops, therefore any iommu_ - * callback using iommu_ops can legitimately Oops. So, while we may - * be about to give a DMA capable device to a user without IOMMU - * protection, which is clearly taint-worthy, let's go ahead and do - * it here. - */ - add_taint(TAINT_USER, LOCKDEP_STILL_OK); - dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n"); #endif return group; @@ -207,7 +193,7 @@ static void vfio_noiommu_detach_group(void *iommu_data, { } -static struct vfio_iommu_driver_ops vfio_noiommu_ops = { +static const struct vfio_iommu_driver_ops vfio_noiommu_ops = { .name = "vfio-noiommu", .owner = THIS_MODULE, .open = vfio_noiommu_open, @@ -217,24 +203,34 @@ static struct vfio_iommu_driver_ops vfio_noiommu_ops = { .detach_group = vfio_noiommu_detach_group, }; -static struct vfio_iommu_driver vfio_noiommu_driver = { - .ops = &vfio_noiommu_ops, +static int noiommu_param_set(const char *val, const struct kernel_param *kp) +{ + int ret; + + if (!val) + val = "1"; + + ret = strtobool(val, kp->arg); + if (ret) + return ret; + + if (noiommu) + ret = vfio_register_iommu_driver(&vfio_noiommu_ops); + else + vfio_unregister_iommu_driver(&vfio_noiommu_ops); + + return ret; +} + +static const struct kernel_param_ops noiommu_param_ops = { + .flags = KERNEL_PARAM_OPS_FL_NOARG, + .set = noiommu_param_set, + .get = param_get_bool, }; -/* - * Wrap IOMMU drivers, the noiommu driver is the one and only driver for - * noiommu groups (and thus containers) and not available for normal groups. - */ -#define vfio_for_each_iommu_driver(con, pos) \ - for (pos = con->noiommu ? &vfio_noiommu_driver : \ - list_first_entry(&vfio.iommu_drivers_list, \ - struct vfio_iommu_driver, vfio_next); \ - (con->noiommu ? pos != NULL : \ - &pos->vfio_next != &vfio.iommu_drivers_list); \ - pos = con->noiommu ? NULL : list_next_entry(pos, vfio_next)) -#else -#define vfio_for_each_iommu_driver(con, pos) \ - list_for_each_entry(pos, &vfio.iommu_drivers_list, vfio_next) +module_param_cb_unsafe(enable_unsafe_noiommu_mode, &noiommu_param_ops, + &noiommu, S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU mode. This mode provides no device isolation, no DMA translation, no host kernel protection, cannot be used for device assignment to virtual machines, requires RAWIO permissions, and will taint the kernel. If you do not know what this is for, step away. (default: false)"); #endif @@ -999,7 +995,12 @@ static long vfio_ioctl_check_extension(struct vfio_container *container, */ if (!driver) { mutex_lock(&vfio.iommu_drivers_lock); - vfio_for_each_iommu_driver(container, driver) { + list_for_each_entry(driver, &vfio.iommu_drivers_list, + vfio_next) { + if (container->noiommu && + driver->ops != &vfio_noiommu_ops) + continue; + if (!try_module_get(driver->ops->owner)) continue; @@ -1068,9 +1069,12 @@ static long vfio_ioctl_set_iommu(struct vfio_container *container, } mutex_lock(&vfio.iommu_drivers_lock); - vfio_for_each_iommu_driver(container, driver) { + list_for_each_entry(driver, &vfio.iommu_drivers_list, vfio_next) { void *data; + if (container->noiommu && driver->ops != &vfio_noiommu_ops) + continue; + if (!try_module_get(driver->ops->owner)) continue; ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-17 16:43 ` Alex Williamson @ 2015-12-18 10:43 ` Yigit, Ferruh 2015-12-18 14:38 ` Alex Williamson 0 siblings, 1 reply; 35+ messages in thread From: Yigit, Ferruh @ 2015-12-18 10:43 UTC (permalink / raw) To: Alex Williamson; +Cc: dev On Thu, Dec 17, 2015 at 09:43:59AM -0700, Alex Williamson wrote: <...> > > > > > > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in > > > > > > vfio > > > > > > module, > > > > > > container->noiommu is not set before doing a > > > > > > vfio_group_set_container() > > > > > > and vfio_for_each_iommu_driver selects wrong driver. > > > > > > > > > > Running CHECK_EXTENSION on a container without the group > > > > > attached is > > > > > only going to tell you what extensions vfio is capable of, not > > > > > necessarily what extensions are available to you with that > > > > > group. > > > > > Is this just a general dpdk- vfio ordering bug? > > > > > > > > Yes, that is how VFIO was implemented in DPDK. I was under the > > > > impression that checking extension before assigning devices was > > > > the > > > > correct way to do things, so as to not to try anything we know > > > > would > > > > fail anyway. Does this imply that CHECK_EXTENSION needs to be > > > > called > > > > on both container and groups (or just on groups)? > > > > > > Hmm, in Documentation/vfio.txt we do give the following algorithm: > > > > > > if (ioctl(container, VFIO_GET_API_VERSION) != > > > VFIO_API_VERSION) > > > /* Unknown API version */ > > > > > > if (!ioctl(container, VFIO_CHECK_EXTENSION, > > > VFIO_TYPE1_IOMMU)) > > > /* Doesn't support the IOMMU driver we want. */ > > > ... > > > > > > That's just going to query each iommu driver and we can't yet say > > > whether > > > the group the user attaches to the container later will actually > > > support that > > > extension until we try to do it, that would come at VFIO_SET_IOMMU. > > > So is > > > it perhaps a vfio bug that we're not advertising no-iommu until the > > > group is > > > attached? After all, we are capable of it with just an empty > > > container, just > > > like we are with type1, but we're going to fail SET_IOMMU for the > > > wrong > > > combination. > > > This is exactly the sort of thing that makes me glad we reverted > > > it without > > > feedback from a working user driver. Thanks, > > > > Whether it should be considered a "bug" in VFIO or "by design" is up > > to you, of course, but at least according to the VFIO documentation, > > we are meant to check for type 1 extension and then attach devices, > > so it would be expected to get VFIO_NOIOMMU_IOMMU marked as supported > > even without any devices attached to the container (just like we get > > type 1 as supported without any devices attached). Having said that, > > if it was meant to attach devices first and then check the > > extensions, then perhaps the documentation should also point out that > > fact (or perhaps I missed that detail in my readings of the docs, in > > which case my apologies). > > Hi Anatoly, > > Does the below patch make it behave more like you'd expect. This > applies to v4.4-rc4, I'd fold this into the base patch if we > reincorporate it to a future kernel. Thanks, > > Alex > > commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94 > Author: Alex Williamson <alex.williamson@redhat.com> > Date: Wed Dec 16 19:02:01 2015 -0700 > > vfio: Fix no-iommu CHECK_EXTENSION > > Previously the no-iommu iommu driver was only visible when the > container had an attached no-iommu group. This means that > CHECK_EXTENSION on and empty container couldn't report the possibility > of using VFIO_NOIOMMU_IOMMU. We report TYPE1 whether or not the user > can make use of it with the group, so this is inconsistent. Add the > no-iommu iommu to the list of iommu drivers when enabled via module > option, but skip all the others if the container is attached to a > no-iommu groups. Note that tainting is now done with the "unsafe" > module callback rather than explictly within vfio. > > Also fixes module option and module description name inconsistency. > > Also make vfio_noiommu_ops const. > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Hi Alex, I got following crash with this update: [ +0.046973] BUG: unable to handle kernel NULL pointer dereference at (null) [ +0.000031] IP: [<ffffffff813b92cf>] __list_add+V0x1f/0xc0 [ +0.000024] PGD 0 [ +0.000010] Oops: 0000 [#1] SMP ... [ +0.000028] Call Trace: [ +0.000014] [<ffffffff81777301>] __mutex_lock_slowpath+0x91/0x110 [ +0.000022] [<ffffffff817773a3>] mutex_lock+0x23/0x40 [ +0.000020] [<ffffffffa0370c00>] vfio_register_iommu_driver+0x40/0xc0 [vfio] [ +0.000025] [<ffffffffa0370cd1>] noiommu_param_set+0x51/0x60 [vfio] [ +0.000022] [<ffffffff810bbe3e>] parse_args+0x1be/0x4a0 [ +0.000021] [<ffffffff81121ea0>] load_module+0xe30/0x2730 [ +0.000942] [<ffffffff8111f500>] ? __symbol_put+0x60/0x60 [ +0.000921] [<ffffffff81223f60>] ? kernel_read+0x50/0x80 [ +0.000917] [<ffffffff811239f9>] SyS_finit_module+0xb9/0xf0 [ +0.000925] [<ffffffff817796ae>] entry_SYSCALL_64_fastpath+0x12/0x71 > -static struct vfio_iommu_driver vfio_noiommu_driver = { > - .ops = &vfio_noiommu_ops, > +static int noiommu_param_set(const char *val, const struct kernel_param *kp) > +{ > + int ret; > + > + if (!val) > + val = "1"; > + > + ret = strtobool(val, kp->arg); > + if (ret) > + return ret; > + > + if (noiommu) > + ret = vfio_register_iommu_driver(&vfio_noiommu_ops); Is it possible that kernel_param_ops->set() called before module_init() that initializes the mutex. https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/module.c?id=refs/tags/v4.4-rc5#n3517 > + else > + vfio_unregister_iommu_driver(&vfio_noiommu_ops); > + > + return ret; > +} > + > +static const struct kernel_param_ops noiommu_param_ops = { > + .flags = KERNEL_PARAM_OPS_FL_NOARG, > + .set = noiommu_param_set, > + .get = param_get_bool, > }; > Thanks, ferruh ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-18 10:43 ` Yigit, Ferruh @ 2015-12-18 14:38 ` Alex Williamson 2015-12-18 21:50 ` Alex Williamson 0 siblings, 1 reply; 35+ messages in thread From: Alex Williamson @ 2015-12-18 14:38 UTC (permalink / raw) To: Yigit, Ferruh; +Cc: dev On Fri, 2015-12-18 at 10:43 +0000, Yigit, Ferruh wrote: > On Thu, Dec 17, 2015 at 09:43:59AM -0700, Alex Williamson wrote: > <...> > > > > > > > > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, > > > > > > > because in > > > > > > > vfio > > > > > > > module, > > > > > > > container->noiommu is not set before doing a > > > > > > > vfio_group_set_container() > > > > > > > and vfio_for_each_iommu_driver selects wrong driver. > > > > > > > > > > > > Running CHECK_EXTENSION on a container without the group > > > > > > attached is > > > > > > only going to tell you what extensions vfio is capable of, > > > > > > not > > > > > > necessarily what extensions are available to you with that > > > > > > group. > > > > > > Is this just a general dpdk- vfio ordering bug? > > > > > > > > > > Yes, that is how VFIO was implemented in DPDK. I was under > > > > > the > > > > > impression that checking extension before assigning devices > > > > > was > > > > > the > > > > > correct way to do things, so as to not to try anything we > > > > > know > > > > > would > > > > > fail anyway. Does this imply that CHECK_EXTENSION needs to be > > > > > called > > > > > on both container and groups (or just on groups)? > > > > > > > > Hmm, in Documentation/vfio.txt we do give the following > > > > algorithm: > > > > > > > > if (ioctl(container, VFIO_GET_API_VERSION) != > > > > VFIO_API_VERSION) > > > > /* Unknown API version */ > > > > > > > > if (!ioctl(container, VFIO_CHECK_EXTENSION, > > > > VFIO_TYPE1_IOMMU)) > > > > /* Doesn't support the IOMMU driver we want. */ > > > > ... > > > > > > > > That's just going to query each iommu driver and we can't yet > > > > say > > > > whether > > > > the group the user attaches to the container later will > > > > actually > > > > support that > > > > extension until we try to do it, that would come at > > > > VFIO_SET_IOMMU. > > > > So is > > > > it perhaps a vfio bug that we're not advertising no-iommu until > > > > the > > > > group is > > > > attached? After all, we are capable of it with just an empty > > > > container, just > > > > like we are with type1, but we're going to fail SET_IOMMU for > > > > the > > > > wrong > > > > combination. > > > > This is exactly the sort of thing that makes me glad we > > > > reverted > > > > it without > > > > feedback from a working user driver. Thanks, > > > > > > Whether it should be considered a "bug" in VFIO or "by design" is > > > up > > > to you, of course, but at least according to the VFIO > > > documentation, > > > we are meant to check for type 1 extension and then attach > > > devices, > > > so it would be expected to get VFIO_NOIOMMU_IOMMU marked as > > > supported > > > even without any devices attached to the container (just like we > > > get > > > type 1 as supported without any devices attached). Having said > > > that, > > > if it was meant to attach devices first and then check the > > > extensions, then perhaps the documentation should also point out > > > that > > > fact (or perhaps I missed that detail in my readings of the docs, > > > in > > > which case my apologies). > > > > Hi Anatoly, > > > > Does the below patch make it behave more like you'd expect. This > > applies to v4.4-rc4, I'd fold this into the base patch if we > > reincorporate it to a future kernel. Thanks, > > > > Alex > > > > commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94 > > Author: Alex Williamson <alex.williamson@redhat.com> > > Date: Wed Dec 16 19:02:01 2015 -0700 > > > > vfio: Fix no-iommu CHECK_EXTENSION > > > > Previously the no-iommu iommu driver was only visible when the > > container had an attached no-iommu group. This means that > > CHECK_EXTENSION on and empty container couldn't report the > > possibility > > of using VFIO_NOIOMMU_IOMMU. We report TYPE1 whether or not > > the user > > can make use of it with the group, so this is > > inconsistent. Add the > > no-iommu iommu to the list of iommu drivers when enabled via > > module > > option, but skip all the others if the container is attached to > > a > > no-iommu groups. Note that tainting is now done with the > > "unsafe" > > module callback rather than explictly within vfio. > > > > Also fixes module option and module description name > > inconsistency. > > > > Also make vfio_noiommu_ops const. > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > Hi Alex, > > I got following crash with this update: > > [ +0.046973] BUG: unable to handle kernel NULL pointer dereference > at (null) > [ +0.000031] IP: [<ffffffff813b92cf>] __list_add+V0x1f/0xc0 > [ +0.000024] PGD 0 > [ +0.000010] Oops: 0000 [#1] SMP > ... > [ +0.000028] Call Trace: > [ +0.000014] [<ffffffff81777301>] __mutex_lock_slowpath+0x91/0x110 > [ +0.000022] [<ffffffff817773a3>] mutex_lock+0x23/0x40 > [ +0.000020] [<ffffffffa0370c00>] > vfio_register_iommu_driver+0x40/0xc0 [vfio] > [ +0.000025] [<ffffffffa0370cd1>] noiommu_param_set+0x51/0x60 > [vfio] > [ +0.000022] [<ffffffff810bbe3e>] parse_args+0x1be/0x4a0 > [ +0.000021] [<ffffffff81121ea0>] load_module+0xe30/0x2730 > [ +0.000942] [<ffffffff8111f500>] ? __symbol_put+0x60/0x60 > [ +0.000921] [<ffffffff81223f60>] ? kernel_read+0x50/0x80 > [ +0.000917] [<ffffffff811239f9>] SyS_finit_module+0xb9/0xf0 > [ +0.000925] [<ffffffff817796ae>] > entry_SYSCALL_64_fastpath+0x12/0x71 > > > -static struct vfio_iommu_driver vfio_noiommu_driver = { > > - .ops = &vfio_noiommu_ops, > > +static int noiommu_param_set(const char *val, const struct > > kernel_param *kp) > > +{ > > + int ret; > > + > > + if (!val) > > + val = "1"; > > + > > + ret = strtobool(val, kp->arg); > > + if (ret) > > + return ret; > > + > > + if (noiommu) > > + ret = > > vfio_register_iommu_driver(&vfio_noiommu_ops); > > Is it possible that kernel_param_ops->set() called before > module_init() that initializes the mutex. > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/ > kernel/module.c?id=refs/tags/v4.4-rc5#n3517 Yes, guess I didn't think that one through very well. I'll try again. Thank you for testing, sorry it was a dud. Thanks, Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-18 14:38 ` Alex Williamson @ 2015-12-18 21:50 ` Alex Williamson 2015-12-21 11:46 ` Yigit, Ferruh 0 siblings, 1 reply; 35+ messages in thread From: Alex Williamson @ 2015-12-18 21:50 UTC (permalink / raw) To: Yigit, Ferruh; +Cc: dev On Fri, 2015-12-18 at 07:38 -0700, Alex Williamson wrote: > On Fri, 2015-12-18 at 10:43 +0000, Yigit, Ferruh wrote: > > On Thu, Dec 17, 2015 at 09:43:59AM -0700, Alex Williamson wrote: > > <...> > > > > > > > > > > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, > > > > > > > > because in > > > > > > > > vfio > > > > > > > > module, > > > > > > > > container->noiommu is not set before doing a > > > > > > > > vfio_group_set_container() > > > > > > > > and vfio_for_each_iommu_driver selects wrong driver. > > > > > > > > > > > > > > Running CHECK_EXTENSION on a container without the group > > > > > > > attached is > > > > > > > only going to tell you what extensions vfio is capable > > > > > > > of, > > > > > > > not > > > > > > > necessarily what extensions are available to you with > > > > > > > that > > > > > > > group. > > > > > > > Is this just a general dpdk- vfio ordering bug? > > > > > > > > > > > > Yes, that is how VFIO was implemented in DPDK. I was under > > > > > > the > > > > > > impression that checking extension before assigning devices > > > > > > was > > > > > > the > > > > > > correct way to do things, so as to not to try anything we > > > > > > know > > > > > > would > > > > > > fail anyway. Does this imply that CHECK_EXTENSION needs to > > > > > > be > > > > > > called > > > > > > on both container and groups (or just on groups)? > > > > > > > > > > Hmm, in Documentation/vfio.txt we do give the following > > > > > algorithm: > > > > > > > > > > if (ioctl(container, VFIO_GET_API_VERSION) != > > > > > VFIO_API_VERSION) > > > > > /* Unknown API version */ > > > > > > > > > > if (!ioctl(container, VFIO_CHECK_EXTENSION, > > > > > VFIO_TYPE1_IOMMU)) > > > > > /* Doesn't support the IOMMU driver we want. > > > > > */ > > > > > ... > > > > > > > > > > That's just going to query each iommu driver and we can't yet > > > > > say > > > > > whether > > > > > the group the user attaches to the container later will > > > > > actually > > > > > support that > > > > > extension until we try to do it, that would come at > > > > > VFIO_SET_IOMMU. > > > > > So is > > > > > it perhaps a vfio bug that we're not advertising no-iommu > > > > > until > > > > > the > > > > > group is > > > > > attached? After all, we are capable of it with just an empty > > > > > container, just > > > > > like we are with type1, but we're going to fail SET_IOMMU for > > > > > the > > > > > wrong > > > > > combination. > > > > > This is exactly the sort of thing that makes me glad we > > > > > reverted > > > > > it without > > > > > feedback from a working user driver. Thanks, > > > > > > > > Whether it should be considered a "bug" in VFIO or "by design" > > > > is > > > > up > > > > to you, of course, but at least according to the VFIO > > > > documentation, > > > > we are meant to check for type 1 extension and then attach > > > > devices, > > > > so it would be expected to get VFIO_NOIOMMU_IOMMU marked as > > > > supported > > > > even without any devices attached to the container (just like > > > > we > > > > get > > > > type 1 as supported without any devices attached). Having said > > > > that, > > > > if it was meant to attach devices first and then check the > > > > extensions, then perhaps the documentation should also point > > > > out > > > > that > > > > fact (or perhaps I missed that detail in my readings of the > > > > docs, > > > > in > > > > which case my apologies). > > > > > > Hi Anatoly, > > > > > > Does the below patch make it behave more like you'd expect. This > > > applies to v4.4-rc4, I'd fold this into the base patch if we > > > reincorporate it to a future kernel. Thanks, > > > > > > Alex > > > > > > commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94 > > > Author: Alex Williamson <alex.williamson@redhat.com> > > > Date: Wed Dec 16 19:02:01 2015 -0700 > > > > > > vfio: Fix no-iommu CHECK_EXTENSION > > > > > > Previously the no-iommu iommu driver was only visible when > > > the > > > container had an attached no-iommu group. This means that > > > CHECK_EXTENSION on and empty container couldn't report the > > > possibility > > > of using VFIO_NOIOMMU_IOMMU. We report TYPE1 whether or not > > > the user > > > can make use of it with the group, so this is > > > inconsistent. Add the > > > no-iommu iommu to the list of iommu drivers when enabled via > > > module > > > option, but skip all the others if the container is attached > > > to > > > a > > > no-iommu groups. Note that tainting is now done with the > > > "unsafe" > > > module callback rather than explictly within vfio. > > > > > > Also fixes module option and module description name > > > inconsistency. > > > > > > Also make vfio_noiommu_ops const. > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > Hi Alex, > > > > I got following crash with this update: Let's try this one: commit 8ff839c6ffe9f3b50b50f1cc87e7afbf23171f05 Author: Alex Williamson <alex.williamson@redhat.com> Date: Fri Dec 18 14:45:55 2015 -0700 v2 vfio fix no-iommu CHECK_EXTENSION Register and unregister the no-iommu iommu backend at module initialization and exit, but disable it unless enabled via module option. Rather than modify the iommu driver walk, selectively skip combinations that aren't supported. CHECK_EXTENSION on a container without any groups attached exposes all possible extensions. Once a group is attached, the no-iommu backend is skipped for regular groups and regular iommu backends are skipped for no-iommu groups. This would be folded into a single patch to re-propose vfio no-iommu mode upstream. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index de632da..85a5793 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -99,7 +99,7 @@ struct vfio_device { #ifdef CONFIG_VFIO_NOIOMMU static bool noiommu __read_mostly; -module_param_named(enable_unsafe_noiommu_support, +module_param_named(enable_unsafe_noiommu_mode, noiommu, bool, S_IRUGO | S_IWUSR); MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU mode. This mode provides no device isolation, no DMA translation, no host kernel protection, cannot be used for device assignment to virtual machines, requires RAWIO permissions, and will taint the kernel. If you do not know what this is for, step away. (default: false)"); #endif @@ -185,7 +185,7 @@ static long vfio_noiommu_ioctl(void *iommu_data, unsigned int cmd, unsigned long arg) { if (cmd == VFIO_CHECK_EXTENSION) - return arg == VFIO_NOIOMMU_IOMMU ? 1 : 0; + return noiommu && (arg == VFIO_NOIOMMU_IOMMU) ? 1 : 0; return -ENOTTY; } @@ -207,7 +207,7 @@ static void vfio_noiommu_detach_group(void *iommu_data, { } -static struct vfio_iommu_driver_ops vfio_noiommu_ops = { +static const struct vfio_iommu_driver_ops vfio_noiommu_ops = { .name = "vfio-noiommu", .owner = THIS_MODULE, .open = vfio_noiommu_open, @@ -216,25 +216,6 @@ static struct vfio_iommu_driver_ops vfio_noiommu_ops = { .attach_group = vfio_noiommu_attach_group, .detach_group = vfio_noiommu_detach_group, }; - -static struct vfio_iommu_driver vfio_noiommu_driver = { - .ops = &vfio_noiommu_ops, -}; - -/* - * Wrap IOMMU drivers, the noiommu driver is the one and only driver for - * noiommu groups (and thus containers) and not available for normal groups. - */ -#define vfio_for_each_iommu_driver(con, pos) \ - for (pos = con->noiommu ? &vfio_noiommu_driver : \ - list_first_entry(&vfio.iommu_drivers_list, \ - struct vfio_iommu_driver, vfio_next); \ - (con->noiommu ? pos != NULL : \ - &pos->vfio_next != &vfio.iommu_drivers_list); \ - pos = con->noiommu ? NULL : list_next_entry(pos, vfio_next)) -#else -#define vfio_for_each_iommu_driver(con, pos) \ - list_for_each_entry(pos, &vfio.iommu_drivers_list, vfio_next) #endif @@ -999,7 +980,14 @@ static long vfio_ioctl_check_extension(struct vfio_container *container, */ if (!driver) { mutex_lock(&vfio.iommu_drivers_lock); - vfio_for_each_iommu_driver(container, driver) { + list_for_each_entry(driver, &vfio.iommu_drivers_list, + vfio_next) { + + if (!list_empty(&container->group_list) && + (container->noiommu != + (driver->ops == &vfio_noiommu_ops))) + continue; + if (!try_module_get(driver->ops->owner)) continue; @@ -1068,9 +1056,16 @@ static long vfio_ioctl_set_iommu(struct vfio_container *container, } mutex_lock(&vfio.iommu_drivers_lock); - vfio_for_each_iommu_driver(container, driver) { + list_for_each_entry(driver, &vfio.iommu_drivers_list, vfio_next) { void *data; + /* + * Only noiommu containers can use vfio-noiommu and noiommu + * containers can only use vfio-noiommu. + */ + if (container->noiommu != (driver->ops == &vfio_noiommu_ops)) + continue; + if (!try_module_get(driver->ops->owner)) continue; @@ -1799,6 +1794,9 @@ static int __init vfio_init(void) request_module_nowait("vfio_iommu_type1"); request_module_nowait("vfio_iommu_spapr_tce"); +#ifdef CONFIG_VFIO_NOIOMMU + vfio_register_iommu_driver(&vfio_noiommu_ops); +#endif return 0; err_cdev_add: @@ -1815,6 +1813,9 @@ static void __exit vfio_cleanup(void) { WARN_ON(!list_empty(&vfio.group_list)); +#ifdef CONFIG_VFIO_NOIOMMU + vfio_unregister_iommu_driver(&vfio_noiommu_ops); +#endif idr_destroy(&vfio.group_idr); cdev_del(&vfio.group_cdev); unregister_chrdev_region(vfio.group_devt, MINORMASK); ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-18 21:50 ` Alex Williamson @ 2015-12-21 11:46 ` Yigit, Ferruh 2015-12-21 12:18 ` [dpdk-dev] [PATCH] vfio: add no-iommu support Ferruh Yigit 2015-12-21 19:22 ` [dpdk-dev] VFIO no-iommu Alex Williamson 0 siblings, 2 replies; 35+ messages in thread From: Yigit, Ferruh @ 2015-12-21 11:46 UTC (permalink / raw) To: Alex Williamson; +Cc: dev On Fri, Dec 18, 2015 at 02:50:17PM -0700, Alex Williamson wrote: > On Fri, 2015-12-18 at 07:38 -0700, Alex Williamson wrote: > > On Fri, 2015-12-18 at 10:43 +0000, Yigit, Ferruh wrote: > > > On Thu, Dec 17, 2015 at 09:43:59AM -0700, Alex Williamson wrote: > > > <...> > > > > > > > > > > > > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, > > > > > > > > > because in > > > > > > > > > vfio > > > > > > > > > module, > > > > > > > > > container->noiommu is not set before doing a > > > > > > > > > vfio_group_set_container() > > > > > > > > > and vfio_for_each_iommu_driver selects wrong driver. > > > > > > > > > > > > > > > > Running CHECK_EXTENSION on a container without the group > > > > > > > > attached is > > > > > > > > only going to tell you what extensions vfio is capable > > > > > > > > of, > > > > > > > > not > > > > > > > > necessarily what extensions are available to you with > > > > > > > > that > > > > > > > > group. > > > > > > > > Is this just a general dpdk- vfio ordering bug? > > > > > > > > > > > > > > Yes, that is how VFIO was implemented in DPDK. I was under > > > > > > > the > > > > > > > impression that checking extension before assigning devices > > > > > > > was > > > > > > > the > > > > > > > correct way to do things, so as to not to try anything we > > > > > > > know > > > > > > > would > > > > > > > fail anyway. Does this imply that CHECK_EXTENSION needs to > > > > > > > be > > > > > > > called > > > > > > > on both container and groups (or just on groups)? > > > > > > > > > > > > Hmm, in Documentation/vfio.txt we do give the following > > > > > > algorithm: > > > > > > > > > > > > if (ioctl(container, VFIO_GET_API_VERSION) != > > > > > > VFIO_API_VERSION) > > > > > > /* Unknown API version */ > > > > > > > > > > > > if (!ioctl(container, VFIO_CHECK_EXTENSION, > > > > > > VFIO_TYPE1_IOMMU)) > > > > > > /* Doesn't support the IOMMU driver we want. > > > > > > */ > > > > > > ... > > > > > > > > > > > > That's just going to query each iommu driver and we can't yet > > > > > > say > > > > > > whether > > > > > > the group the user attaches to the container later will > > > > > > actually > > > > > > support that > > > > > > extension until we try to do it, that would come at > > > > > > VFIO_SET_IOMMU. > > > > > > So is > > > > > > it perhaps a vfio bug that we're not advertising no-iommu > > > > > > until > > > > > > the > > > > > > group is > > > > > > attached? After all, we are capable of it with just an empty > > > > > > container, just > > > > > > like we are with type1, but we're going to fail SET_IOMMU for > > > > > > the > > > > > > wrong > > > > > > combination. > > > > > > This is exactly the sort of thing that makes me glad we > > > > > > reverted > > > > > > it without > > > > > > feedback from a working user driver. Thanks, > > > > > > > > > > Whether it should be considered a "bug" in VFIO or "by design" > > > > > is > > > > > up > > > > > to you, of course, but at least according to the VFIO > > > > > documentation, > > > > > we are meant to check for type 1 extension and then attach > > > > > devices, > > > > > so it would be expected to get VFIO_NOIOMMU_IOMMU marked as > > > > > supported > > > > > even without any devices attached to the container (just like > > > > > we > > > > > get > > > > > type 1 as supported without any devices attached). Having said > > > > > that, > > > > > if it was meant to attach devices first and then check the > > > > > extensions, then perhaps the documentation should also point > > > > > out > > > > > that > > > > > fact (or perhaps I missed that detail in my readings of the > > > > > docs, > > > > > in > > > > > which case my apologies). > > > > > > > > Hi Anatoly, > > > > > > > > Does the below patch make it behave more like you'd expect. This > > > > applies to v4.4-rc4, I'd fold this into the base patch if we > > > > reincorporate it to a future kernel. Thanks, > > > > > > > > Alex > > > > > > > > commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94 > > > > Author: Alex Williamson <alex.williamson@redhat.com> > > > > Date: Wed Dec 16 19:02:01 2015 -0700 > > > > > > > > vfio: Fix no-iommu CHECK_EXTENSION > > > > > > > > Previously the no-iommu iommu driver was only visible when > > > > the > > > > container had an attached no-iommu group. This means that > > > > CHECK_EXTENSION on and empty container couldn't report the > > > > possibility > > > > of using VFIO_NOIOMMU_IOMMU. We report TYPE1 whether or not > > > > the user > > > > can make use of it with the group, so this is > > > > inconsistent. Add the > > > > no-iommu iommu to the list of iommu drivers when enabled via > > > > module > > > > option, but skip all the others if the container is attached > > > > to > > > > a > > > > no-iommu groups. Note that tainting is now done with the > > > > "unsafe" > > > > module callback rather than explictly within vfio. > > > > > > > > Also fixes module option and module description name > > > > inconsistency. > > > > > > > > Also make vfio_noiommu_ops const. > > > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > > > Hi Alex, > > > > > > I got following crash with this update: > > Let's try this one: > > commit 8ff839c6ffe9f3b50b50f1cc87e7afbf23171f05 > Author: Alex Williamson <alex.williamson@redhat.com> > Date: Fri Dec 18 14:45:55 2015 -0700 > > v2 vfio fix no-iommu CHECK_EXTENSION > > Register and unregister the no-iommu iommu backend at module > initialization and exit, but disable it unless enabled via module > option. Rather than modify the iommu driver walk, selectively skip > combinations that aren't supported. CHECK_EXTENSION on a container > without any groups attached exposes all possible extensions. Once a > group is attached, the no-iommu backend is skipped for regular groups > and regular iommu backends are skipped for no-iommu groups. > > This would be folded into a single patch to re-propose vfio no-iommu > mode upstream. > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > Hi Alex, Thank you for the update. I have tested this on both no-iommu and iommu environment and worked successfully. I believe this approach is better because it is simpler. >From DPDK point of view, only update to support vfio no-iommu is: to use new group names and disable DMA mapping. If VFIO module compiled with "CONFIG_VFIO_NOIOMMU=y" by default, that makes things easier for DPDK, in no-iommu environment inserting vfio module with proper parameter makes it available for DPDK. Thanks, ferruh ^ permalink raw reply [flat|nested] 35+ messages in thread
* [dpdk-dev] [PATCH] vfio: add no-iommu support 2015-12-21 11:46 ` Yigit, Ferruh @ 2015-12-21 12:18 ` Ferruh Yigit 2015-12-21 15:15 ` Burakov, Anatoly 2015-12-21 19:22 ` [dpdk-dev] VFIO no-iommu Alex Williamson 1 sibling, 1 reply; 35+ messages in thread From: Ferruh Yigit @ 2015-12-21 12:18 UTC (permalink / raw) To: dev This is based on patch from Alex Williamson: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=033291eccbdb plus http://dpdk.org/dev/patchwork/patch/9598/ This patch is intended to test above patches on DPDK rather than official patch to DPDK. Test result is DPDK successfully run on no-iommu environment. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> --- lib/librte_eal/linuxapp/eal/eal_pci_vfio.c | 28 +++++++++++++++++++++++++--- 1 file changed, 25 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c index 74f91ba..90bba4a 100644 --- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c +++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c @@ -61,6 +61,18 @@ #ifdef VFIO_PRESENT +/*#define VFIO_NOIOMMU*/ + +#ifndef VFIO_NOIOMMU_IOMMU +#define VFIO_NOIOMMU_IOMMU 8 +#endif + +#ifdef VFIO_NOIOMMU +#define VFIO_IOMMU_TYPE VFIO_NOIOMMU_IOMMU +#else +#define VFIO_IOMMU_TYPE VFIO_TYPE1_IOMMU +#endif + #define PAGE_SIZE (sysconf(_SC_PAGESIZE)) #define PAGE_MASK (~(PAGE_SIZE - 1)) @@ -71,7 +83,11 @@ EAL_REGISTER_TAILQ(rte_vfio_tailq) #define VFIO_DIR "/dev/vfio" #define VFIO_CONTAINER_PATH "/dev/vfio/vfio" +#ifdef VFIO_NOIOMMU +#define VFIO_GROUP_FMT "/dev/vfio/noiommu-%u" +#else #define VFIO_GROUP_FMT "/dev/vfio/%u" +#endif #define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL) /* per-process VFIO config */ @@ -212,17 +228,21 @@ pci_vfio_set_bus_master(int dev_fd) static int pci_vfio_setup_dma_maps(int vfio_container_fd) { +#ifndef VFIO_NOIOMMU const struct rte_memseg *ms = rte_eal_get_physmem_layout(); - int i, ret; + int i; +#endif + int ret; ret = ioctl(vfio_container_fd, VFIO_SET_IOMMU, - VFIO_TYPE1_IOMMU); + VFIO_IOMMU_TYPE); if (ret) { RTE_LOG(ERR, EAL, " cannot set IOMMU type, " "error %i (%s)\n", errno, strerror(errno)); return -1; } +#ifndef VFIO_NOIOMMU /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ for (i = 0; i < RTE_MAX_MEMSEG; i++) { struct vfio_iommu_type1_dma_map dma_map; @@ -245,6 +265,7 @@ pci_vfio_setup_dma_maps(int vfio_container_fd) return -1; } } +#endif return 0; } @@ -373,7 +394,8 @@ pci_vfio_get_container_fd(void) } /* check if we support IOMMU type 1 */ - ret = ioctl(vfio_container_fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU); + ret = ioctl(vfio_container_fd, VFIO_CHECK_EXTENSION, + VFIO_IOMMU_TYPE); if (ret != 1) { if (ret < 0) RTE_LOG(ERR, EAL, " could not get IOMMU type, " -- 2.5.0 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH] vfio: add no-iommu support 2015-12-21 12:18 ` [dpdk-dev] [PATCH] vfio: add no-iommu support Ferruh Yigit @ 2015-12-21 15:15 ` Burakov, Anatoly 2015-12-21 15:26 ` Yigit, Ferruh 0 siblings, 1 reply; 35+ messages in thread From: Burakov, Anatoly @ 2015-12-21 15:15 UTC (permalink / raw) To: Yigit, Ferruh, dev Hi Ferruh, > This is based on patch from Alex Williamson: > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=03 > 3291eccbdb > plus > http://dpdk.org/dev/patchwork/patch/9598/ > > This patch is intended to test above patches on DPDK rather than official > patch to DPDK. > > Test result is DPDK successfully run on no-iommu environment. > > Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> > --- > lib/librte_eal/linuxapp/eal/eal_pci_vfio.c | 28 > +++++++++++++++++++++++++--- > 1 file changed, 25 insertions(+), 3 deletions(-) > > diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c > b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c > index 74f91ba..90bba4a 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c > +++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c > @@ -61,6 +61,18 @@ > > #ifdef VFIO_PRESENT > > +/*#define VFIO_NOIOMMU*/ > + > +#ifndef VFIO_NOIOMMU_IOMMU > +#define VFIO_NOIOMMU_IOMMU 8 > +#endif > + > +#ifdef VFIO_NOIOMMU > +#define VFIO_IOMMU_TYPE VFIO_NOIOMMU_IOMMU #else #define > +VFIO_IOMMU_TYPE VFIO_TYPE1_IOMMU #endif > + > #define PAGE_SIZE (sysconf(_SC_PAGESIZE)) > #define PAGE_MASK (~(PAGE_SIZE - 1)) > > @@ -71,7 +83,11 @@ EAL_REGISTER_TAILQ(rte_vfio_tailq) > > #define VFIO_DIR "/dev/vfio" > #define VFIO_CONTAINER_PATH "/dev/vfio/vfio" > +#ifdef VFIO_NOIOMMU > +#define VFIO_GROUP_FMT "/dev/vfio/noiommu-%u" > +#else > #define VFIO_GROUP_FMT "/dev/vfio/%u" > +#endif > #define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL) > > /* per-process VFIO config */ > @@ -212,17 +228,21 @@ pci_vfio_set_bus_master(int dev_fd) static int > pci_vfio_setup_dma_maps(int vfio_container_fd) { > +#ifndef VFIO_NOIOMMU > const struct rte_memseg *ms = rte_eal_get_physmem_layout(); > - int i, ret; > + int i; > +#endif > + int ret; > > ret = ioctl(vfio_container_fd, VFIO_SET_IOMMU, > - VFIO_TYPE1_IOMMU); > + VFIO_IOMMU_TYPE); > if (ret) { > RTE_LOG(ERR, EAL, " cannot set IOMMU type, " > "error %i (%s)\n", errno, strerror(errno)); > return -1; > } > > +#ifndef VFIO_NOIOMMU > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > struct vfio_iommu_type1_dma_map dma_map; @@ -245,6 > +265,7 @@ pci_vfio_setup_dma_maps(int vfio_container_fd) > return -1; > } > } > +#endif > > return 0; > } > @@ -373,7 +394,8 @@ pci_vfio_get_container_fd(void) > } > > /* check if we support IOMMU type 1 */ > - ret = ioctl(vfio_container_fd, VFIO_CHECK_EXTENSION, > VFIO_TYPE1_IOMMU); > + ret = ioctl(vfio_container_fd, VFIO_CHECK_EXTENSION, > + VFIO_IOMMU_TYPE); > if (ret != 1) { > if (ret < 0) > RTE_LOG(ERR, EAL, " could not get IOMMU > type, " > -- > 2.5.0 This is one approach :) I was thinking of another, building some kind of more generic support for multiple VFIO drivers. It's a bit more code and probably overkill as a solution to this particular problem, but hopefully it'll make it easier to add new VFIO drivers down the line (with each driver having their own DMA mapping function), should we choose to do so. I'm still working on the patch, but if everyone is OK with this approach instead of a more general one, that's fine with me. Thanks, Anatoly ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH] vfio: add no-iommu support 2015-12-21 15:15 ` Burakov, Anatoly @ 2015-12-21 15:26 ` Yigit, Ferruh 2015-12-21 15:28 ` Burakov, Anatoly 0 siblings, 1 reply; 35+ messages in thread From: Yigit, Ferruh @ 2015-12-21 15:26 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: dev On Mon, Dec 21, 2015 at 03:15:46PM +0000, Burakov, Anatoly wrote: > > This is based on patch from Alex Williamson: > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=03 > > 3291eccbdb > > plus > > http://dpdk.org/dev/patchwork/patch/9598/ > > > > This patch is intended to test above patches on DPDK rather than official > > patch to DPDK. > > > > Test result is DPDK successfully run on no-iommu environment. > > > > This is one approach :) I was thinking of another, building some kind of more generic support for multiple VFIO drivers. It's a bit more code and probably overkill as a solution to this particular problem, but hopefully it'll make it easier to add new VFIO drivers down the line (with each driver having their own DMA mapping function), should we choose to do so. I'm still working on the patch, but if everyone is OK with this approach instead of a more general one, that's fine with me. > Hi Anatoly, This patch sent just to show what changes done to test VFIO no-iommu I mentioned, and to have a justification for the kernel patch, not sent as a final solution in DPDK, sorry for interrupting your work. Thanks, ferruh ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH] vfio: add no-iommu support 2015-12-21 15:26 ` Yigit, Ferruh @ 2015-12-21 15:28 ` Burakov, Anatoly 0 siblings, 0 replies; 35+ messages in thread From: Burakov, Anatoly @ 2015-12-21 15:28 UTC (permalink / raw) To: Yigit, Ferruh; +Cc: dev Hi Ferruh, > On Mon, Dec 21, 2015 at 03:15:46PM +0000, Burakov, Anatoly wrote: > > > This is based on patch from Alex Williamson: > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/comm > > > it/?id=03 > > > 3291eccbdb > > > plus > > > http://dpdk.org/dev/patchwork/patch/9598/ > > > > > > This patch is intended to test above patches on DPDK rather than > > > official patch to DPDK. > > > > > > Test result is DPDK successfully run on no-iommu environment. > > > > > > > This is one approach :) I was thinking of another, building some kind of > more generic support for multiple VFIO drivers. It's a bit more code and > probably overkill as a solution to this particular problem, but hopefully it'll > make it easier to add new VFIO drivers down the line (with each driver > having their own DMA mapping function), should we choose to do so. I'm still > working on the patch, but if everyone is OK with this approach instead of a > more general one, that's fine with me. > > > Hi Anatoly, > > This patch sent just to show what changes done to test VFIO no-iommu I > mentioned, and to have a justification for the kernel patch, not sent as a final > solution in DPDK, sorry for interrupting your work. > > Thanks, > Ferruh Ah OK, I misread the part where it said that it is not to be applied as-is. Thanks! Thanks, Anatoly ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-21 11:46 ` Yigit, Ferruh 2015-12-21 12:18 ` [dpdk-dev] [PATCH] vfio: add no-iommu support Ferruh Yigit @ 2015-12-21 19:22 ` Alex Williamson 2015-12-22 20:20 ` Alex Williamson 1 sibling, 1 reply; 35+ messages in thread From: Alex Williamson @ 2015-12-21 19:22 UTC (permalink / raw) To: Yigit, Ferruh; +Cc: dev On Mon, 2015-12-21 at 11:46 +0000, Yigit, Ferruh wrote: > On Fri, Dec 18, 2015 at 02:50:17PM -0700, Alex Williamson wrote: > > On Fri, 2015-12-18 at 07:38 -0700, Alex Williamson wrote: > > > On Fri, 2015-12-18 at 10:43 +0000, Yigit, Ferruh wrote: > > > > On Thu, Dec 17, 2015 at 09:43:59AM -0700, Alex Williamson > > > > wrote: > > > > <...> > > > > > > > > > > > > > > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, > > > > > > > > > > because in > > > > > > > > > > vfio > > > > > > > > > > module, > > > > > > > > > > container->noiommu is not set before doing a > > > > > > > > > > vfio_group_set_container() > > > > > > > > > > and vfio_for_each_iommu_driver selects wrong > > > > > > > > > > driver. > > > > > > > > > > > > > > > > > > Running CHECK_EXTENSION on a container without the > > > > > > > > > group > > > > > > > > > attached is > > > > > > > > > only going to tell you what extensions vfio is > > > > > > > > > capable > > > > > > > > > of, > > > > > > > > > not > > > > > > > > > necessarily what extensions are available to you with > > > > > > > > > that > > > > > > > > > group. > > > > > > > > > Is this just a general dpdk- vfio ordering bug? > > > > > > > > > > > > > > > > Yes, that is how VFIO was implemented in DPDK. I was > > > > > > > > under > > > > > > > > the > > > > > > > > impression that checking extension before assigning > > > > > > > > devices > > > > > > > > was > > > > > > > > the > > > > > > > > correct way to do things, so as to not to try anything > > > > > > > > we > > > > > > > > know > > > > > > > > would > > > > > > > > fail anyway. Does this imply that CHECK_EXTENSION needs > > > > > > > > to > > > > > > > > be > > > > > > > > called > > > > > > > > on both container and groups (or just on groups)? > > > > > > > > > > > > > > Hmm, in Documentation/vfio.txt we do give the following > > > > > > > algorithm: > > > > > > > > > > > > > > if (ioctl(container, VFIO_GET_API_VERSION) != > > > > > > > VFIO_API_VERSION) > > > > > > > /* Unknown API version */ > > > > > > > > > > > > > > if (!ioctl(container, VFIO_CHECK_EXTENSION, > > > > > > > VFIO_TYPE1_IOMMU)) > > > > > > > /* Doesn't support the IOMMU driver we > > > > > > > want. > > > > > > > */ > > > > > > > ... > > > > > > > > > > > > > > That's just going to query each iommu driver and we can't > > > > > > > yet > > > > > > > say > > > > > > > whether > > > > > > > the group the user attaches to the container later will > > > > > > > actually > > > > > > > support that > > > > > > > extension until we try to do it, that would come at > > > > > > > VFIO_SET_IOMMU. > > > > > > > So is > > > > > > > it perhaps a vfio bug that we're not advertising no-iommu > > > > > > > until > > > > > > > the > > > > > > > group is > > > > > > > attached? After all, we are capable of it with just an > > > > > > > empty > > > > > > > container, just > > > > > > > like we are with type1, but we're going to fail SET_IOMMU > > > > > > > for > > > > > > > the > > > > > > > wrong > > > > > > > combination. > > > > > > > This is exactly the sort of thing that makes me glad we > > > > > > > reverted > > > > > > > it without > > > > > > > feedback from a working user driver. Thanks, > > > > > > > > > > > > Whether it should be considered a "bug" in VFIO or "by > > > > > > design" > > > > > > is > > > > > > up > > > > > > to you, of course, but at least according to the VFIO > > > > > > documentation, > > > > > > we are meant to check for type 1 extension and then attach > > > > > > devices, > > > > > > so it would be expected to get VFIO_NOIOMMU_IOMMU marked as > > > > > > supported > > > > > > even without any devices attached to the container (just > > > > > > like > > > > > > we > > > > > > get > > > > > > type 1 as supported without any devices attached). Having > > > > > > said > > > > > > that, > > > > > > if it was meant to attach devices first and then check the > > > > > > extensions, then perhaps the documentation should also > > > > > > point > > > > > > out > > > > > > that > > > > > > fact (or perhaps I missed that detail in my readings of the > > > > > > docs, > > > > > > in > > > > > > which case my apologies). > > > > > > > > > > Hi Anatoly, > > > > > > > > > > Does the below patch make it behave more like you'd expect. > > > > > This > > > > > applies to v4.4-rc4, I'd fold this into the base patch if we > > > > > reincorporate it to a future kernel. Thanks, > > > > > > > > > > Alex > > > > > > > > > > commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94 > > > > > Author: Alex Williamson <alex.williamson@redhat.com> > > > > > Date: Wed Dec 16 19:02:01 2015 -0700 > > > > > > > > > > vfio: Fix no-iommu CHECK_EXTENSION > > > > > > > > > > Previously the no-iommu iommu driver was only visible > > > > > when > > > > > the > > > > > container had an attached no-iommu group. This means > > > > > that > > > > > CHECK_EXTENSION on and empty container couldn't report > > > > > the > > > > > possibility > > > > > of using VFIO_NOIOMMU_IOMMU. We report TYPE1 whether or > > > > > not > > > > > the user > > > > > can make use of it with the group, so this is > > > > > inconsistent. Add the > > > > > no-iommu iommu to the list of iommu drivers when enabled > > > > > via > > > > > module > > > > > option, but skip all the others if the container is > > > > > attached > > > > > to > > > > > a > > > > > no-iommu groups. Note that tainting is now done with the > > > > > "unsafe" > > > > > module callback rather than explictly within vfio. > > > > > > > > > > Also fixes module option and module description name > > > > > inconsistency. > > > > > > > > > > Also make vfio_noiommu_ops const. > > > > > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.co > > > > > m> > > > > > > > > Hi Alex, > > > > > > > > I got following crash with this update: > > > > Let's try this one: > > > > commit 8ff839c6ffe9f3b50b50f1cc87e7afbf23171f05 > > Author: Alex Williamson <alex.williamson@redhat.com> > > Date: Fri Dec 18 14:45:55 2015 -0700 > > > > v2 vfio fix no-iommu CHECK_EXTENSION > > > > Register and unregister the no-iommu iommu backend at module > > initialization and exit, but disable it unless enabled via > > module > > option. Rather than modify the iommu driver walk, selectively > > skip > > combinations that aren't supported. CHECK_EXTENSION on a > > container > > without any groups attached exposes all possible > > extensions. Once a > > group is attached, the no-iommu backend is skipped for regular > > groups > > and regular iommu backends are skipped for no-iommu groups. > > > > This would be folded into a single patch to re-propose vfio no- > > iommu > > mode upstream. > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > Hi Alex, > > Thank you for the update. I have tested this on both no-iommu and > iommu environment > and worked successfully. I believe this approach is better because it > is simpler. > > From DPDK point of view, only update to support vfio no-iommu is: to > use new group names > and disable DMA mapping. > > If VFIO module compiled with "CONFIG_VFIO_NOIOMMU=y" by default, that > makes things easier > for DPDK, in no-iommu environment inserting vfio module with proper > parameter makes it > available for DPDK. Thanks for the update Ferruh. Also note that the module option is dynamically settable so that it can support running with statically compiled modules where you may not know whether or not to enable no- iommu until after boot, or where unloading and re-loading a module might not be an option. It will be up to each distro to decide whether to enable the config option, but I think we at least highlight the existing support issue for non-iommu protected userspace drivers, which is something that was not at all clear with the uio driver approach. Thanks, Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-21 19:22 ` [dpdk-dev] VFIO no-iommu Alex Williamson @ 2015-12-22 20:20 ` Alex Williamson 2015-12-23 11:19 ` Burakov, Anatoly 0 siblings, 1 reply; 35+ messages in thread From: Alex Williamson @ 2015-12-22 20:20 UTC (permalink / raw) To: Yigit, Ferruh; +Cc: dev On Mon, 2015-12-21 at 12:22 -0700, Alex Williamson wrote: > On Mon, 2015-12-21 at 11:46 +0000, Yigit, Ferruh wrote: > > On Fri, Dec 18, 2015 at 02:50:17PM -0700, Alex Williamson wrote: > > > On Fri, 2015-12-18 at 07:38 -0700, Alex Williamson wrote: > > > > On Fri, 2015-12-18 at 10:43 +0000, Yigit, Ferruh wrote: > > > > > On Thu, Dec 17, 2015 at 09:43:59AM -0700, Alex Williamson > > > > > wrote: > > > > > <...> > > > > > > > > > > > > > > > > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION > > > > > > > > > > > ioctl, > > > > > > > > > > > because in > > > > > > > > > > > vfio > > > > > > > > > > > module, > > > > > > > > > > > container->noiommu is not set before doing a > > > > > > > > > > > vfio_group_set_container() > > > > > > > > > > > and vfio_for_each_iommu_driver selects wrong > > > > > > > > > > > driver. > > > > > > > > > > > > > > > > > > > > Running CHECK_EXTENSION on a container without the > > > > > > > > > > group > > > > > > > > > > attached is > > > > > > > > > > only going to tell you what extensions vfio is > > > > > > > > > > capable > > > > > > > > > > of, > > > > > > > > > > not > > > > > > > > > > necessarily what extensions are available to you > > > > > > > > > > with > > > > > > > > > > that > > > > > > > > > > group. > > > > > > > > > > Is this just a general dpdk- vfio ordering bug? > > > > > > > > > > > > > > > > > > Yes, that is how VFIO was implemented in DPDK. I was > > > > > > > > > under > > > > > > > > > the > > > > > > > > > impression that checking extension before assigning > > > > > > > > > devices > > > > > > > > > was > > > > > > > > > the > > > > > > > > > correct way to do things, so as to not to try > > > > > > > > > anything > > > > > > > > > we > > > > > > > > > know > > > > > > > > > would > > > > > > > > > fail anyway. Does this imply that CHECK_EXTENSION > > > > > > > > > needs > > > > > > > > > to > > > > > > > > > be > > > > > > > > > called > > > > > > > > > on both container and groups (or just on groups)? > > > > > > > > > > > > > > > > Hmm, in Documentation/vfio.txt we do give the following > > > > > > > > algorithm: > > > > > > > > > > > > > > > > if (ioctl(container, VFIO_GET_API_VERSION) != > > > > > > > > VFIO_API_VERSION) > > > > > > > > /* Unknown API version */ > > > > > > > > > > > > > > > > if (!ioctl(container, VFIO_CHECK_EXTENSION, > > > > > > > > VFIO_TYPE1_IOMMU)) > > > > > > > > /* Doesn't support the IOMMU driver we > > > > > > > > want. > > > > > > > > */ > > > > > > > > ... > > > > > > > > > > > > > > > > That's just going to query each iommu driver and we > > > > > > > > can't > > > > > > > > yet > > > > > > > > say > > > > > > > > whether > > > > > > > > the group the user attaches to the container later will > > > > > > > > actually > > > > > > > > support that > > > > > > > > extension until we try to do it, that would come at > > > > > > > > VFIO_SET_IOMMU. > > > > > > > > So is > > > > > > > > it perhaps a vfio bug that we're not advertising no- > > > > > > > > iommu > > > > > > > > until > > > > > > > > the > > > > > > > > group is > > > > > > > > attached? After all, we are capable of it with just an > > > > > > > > empty > > > > > > > > container, just > > > > > > > > like we are with type1, but we're going to fail > > > > > > > > SET_IOMMU > > > > > > > > for > > > > > > > > the > > > > > > > > wrong > > > > > > > > combination. > > > > > > > > This is exactly the sort of thing that makes me glad > > > > > > > > we > > > > > > > > reverted > > > > > > > > it without > > > > > > > > feedback from a working user driver. Thanks, > > > > > > > > > > > > > > Whether it should be considered a "bug" in VFIO or "by > > > > > > > design" > > > > > > > is > > > > > > > up > > > > > > > to you, of course, but at least according to the VFIO > > > > > > > documentation, > > > > > > > we are meant to check for type 1 extension and then > > > > > > > attach > > > > > > > devices, > > > > > > > so it would be expected to get VFIO_NOIOMMU_IOMMU marked > > > > > > > as > > > > > > > supported > > > > > > > even without any devices attached to the container (just > > > > > > > like > > > > > > > we > > > > > > > get > > > > > > > type 1 as supported without any devices attached). Having > > > > > > > said > > > > > > > that, > > > > > > > if it was meant to attach devices first and then check > > > > > > > the > > > > > > > extensions, then perhaps the documentation should also > > > > > > > point > > > > > > > out > > > > > > > that > > > > > > > fact (or perhaps I missed that detail in my readings of > > > > > > > the > > > > > > > docs, > > > > > > > in > > > > > > > which case my apologies). > > > > > > > > > > > > Hi Anatoly, > > > > > > > > > > > > Does the below patch make it behave more like you'd expect. > > > > > > This > > > > > > applies to v4.4-rc4, I'd fold this into the base patch if > > > > > > we > > > > > > reincorporate it to a future kernel. Thanks, > > > > > > > > > > > > Alex > > > > > > > > > > > > commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94 > > > > > > Author: Alex Williamson <alex.williamson@redhat.com> > > > > > > Date: Wed Dec 16 19:02:01 2015 -0700 > > > > > > > > > > > > vfio: Fix no-iommu CHECK_EXTENSION > > > > > > > > > > > > Previously the no-iommu iommu driver was only visible > > > > > > when > > > > > > the > > > > > > container had an attached no-iommu group. This means > > > > > > that > > > > > > CHECK_EXTENSION on and empty container couldn't report > > > > > > the > > > > > > possibility > > > > > > of using VFIO_NOIOMMU_IOMMU. We report TYPE1 whether > > > > > > or > > > > > > not > > > > > > the user > > > > > > can make use of it with the group, so this is > > > > > > inconsistent. Add the > > > > > > no-iommu iommu to the list of iommu drivers when > > > > > > enabled > > > > > > via > > > > > > module > > > > > > option, but skip all the others if the container is > > > > > > attached > > > > > > to > > > > > > a > > > > > > no-iommu groups. Note that tainting is now done with > > > > > > the > > > > > > "unsafe" > > > > > > module callback rather than explictly within vfio. > > > > > > > > > > > > Also fixes module option and module description name > > > > > > inconsistency. > > > > > > > > > > > > Also make vfio_noiommu_ops const. > > > > > > > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat. > > > > > > co > > > > > > m> > > > > > > > > > > Hi Alex, > > > > > > > > > > I got following crash with this update: > > > > > > Let's try this one: > > > > > > commit 8ff839c6ffe9f3b50b50f1cc87e7afbf23171f05 > > > Author: Alex Williamson <alex.williamson@redhat.com> > > > Date: Fri Dec 18 14:45:55 2015 -0700 > > > > > > v2 vfio fix no-iommu CHECK_EXTENSION > > > > > > Register and unregister the no-iommu iommu backend at module > > > initialization and exit, but disable it unless enabled via > > > module > > > option. Rather than modify the iommu driver walk, > > > selectively > > > skip > > > combinations that aren't supported. CHECK_EXTENSION on a > > > container > > > without any groups attached exposes all possible > > > extensions. Once a > > > group is attached, the no-iommu backend is skipped for > > > regular > > > groups > > > and regular iommu backends are skipped for no-iommu groups. > > > > > > This would be folded into a single patch to re-propose vfio > > > no- > > > iommu > > > mode upstream. > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > > Hi Alex, > > > > Thank you for the update. I have tested this on both no-iommu and > > iommu environment > > and worked successfully. I believe this approach is better because > > it > > is simpler. > > > > From DPDK point of view, only update to support vfio no-iommu is: > > to > > use new group names > > and disable DMA mapping. > > > > If VFIO module compiled with "CONFIG_VFIO_NOIOMMU=y" by default, > > that > > makes things easier > > for DPDK, in no-iommu environment inserting vfio module with proper > > parameter makes it > > available for DPDK. > > Thanks for the update Ferruh. Also note that the module option is > dynamically settable so that it can support running with statically > compiled modules where you may not know whether or not to enable no- > iommu until after boot, or where unloading and re-loading a module > might not be an option. It will be up to each distro to decide > whether > to enable the config option, but I think we at least highlight the > existing support issue for non-iommu protected userspace drivers, > which > is something that was not at all clear with the uio driver approach. Hi, I've re-posted the unified patch upstream and it should start showing up in the next linux-next build. I expect the dpdk code won't be merged until after this gets back into a proper kernel, but could we get the dpdk modifications posted as rfc for others looking to try it? Thanks, Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-22 20:20 ` Alex Williamson @ 2015-12-23 11:19 ` Burakov, Anatoly 2015-12-31 14:30 ` Santosh Shukla 0 siblings, 1 reply; 35+ messages in thread From: Burakov, Anatoly @ 2015-12-23 11:19 UTC (permalink / raw) To: Alex Williamson, Yigit, Ferruh; +Cc: dev Hi Alex, > I've re-posted the unified patch upstream and it should start showing up in > the next linux-next build. I expect the dpdk code won't be merged until > after this gets back into a proper kernel, but could we get the dpdk > modifications posted as rfc for others looking to try it? I have already posted a patch that should work with No-IOMMU. http://dpdk.org/dev/patchwork/patch/9619/ Apologies for not CC-ing you. I too would be interested to know if other people are having any issues with the patch. Thanks, Anatoly ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-23 11:19 ` Burakov, Anatoly @ 2015-12-31 14:30 ` Santosh Shukla 0 siblings, 0 replies; 35+ messages in thread From: Santosh Shukla @ 2015-12-31 14:30 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: dev, Alex Williamson On Wed, Dec 23, 2015 at 4:49 PM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote: > Hi Alex, > >> I've re-posted the unified patch upstream and it should start showing up in >> the next linux-next build. I expect the dpdk code won't be merged until >> after this gets back into a proper kernel, but could we get the dpdk >> modifications posted as rfc for others looking to try it? > > I have already posted a patch that should work with No-IOMMU. > > http://dpdk.org/dev/patchwork/patch/9619/ > > Apologies for not CC-ing you. I too would be interested to know if other people are having any issues with the patch. > I tried this patch for virtio-net pmd driver on arm64 and It worked for me. I didn't reviewed patch, but functionally nothing broke in my test environment, we'll review as time permit.. for now Tested-by: Santosh Shukla <sshukla@mvista.com> > Thanks, > Anatoly ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-16 4:38 ` Alex Williamson 2015-12-16 8:35 ` Burakov, Anatoly @ 2016-01-14 6:03 ` Jike Song 2016-01-14 6:52 ` Alex Williamson 1 sibling, 1 reply; 35+ messages in thread From: Jike Song @ 2016-01-14 6:03 UTC (permalink / raw) To: Alex Williamson; +Cc: dev On Wed, Dec 16, 2015 at 12:38 PM, Alex Williamson <alex.williamson@redhat.com> wrote: > > So it works. Is it acceptable? Useful? Sufficiently complete? Does > it imply deprecating the uio interface? I believe the feature that > started this discussion was support for MSI/X interrupts so that VFs > can support some kind of interrupt (uio only supports INTx since it > doesn't allow DMA). Implementing that would be the ultimate test of > whether this provides dpdk with not only a more consistent interface, > but the feature dpdk wants that's missing in uio. Thanks, > Hi Alex, Sorry for jumping in. Just being curious, how does VFIO No-IOMMU mode support DMA from userspace drivers? If I understand correctly, due to the absence of IOMMU, pcidev has to use physaddr to start a DMA transaction, but how it is supposed to get physaddr from userspace drivers, /proc/<pid>/pagemap or something else? -- Thanks, Jike ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2016-01-14 6:03 ` Jike Song @ 2016-01-14 6:52 ` Alex Williamson 2016-01-14 8:12 ` Jike Song 0 siblings, 1 reply; 35+ messages in thread From: Alex Williamson @ 2016-01-14 6:52 UTC (permalink / raw) To: Jike Song; +Cc: dev On Thu, 2016-01-14 at 14:03 +0800, Jike Song wrote: > On Wed, Dec 16, 2015 at 12:38 PM, Alex Williamson > <alex.williamson@redhat.com> wrote: > > > > So it works. Is it acceptable? Useful? Sufficiently complete? Does > > it imply deprecating the uio interface? I believe the feature that > > started this discussion was support for MSI/X interrupts so that VFs > > can support some kind of interrupt (uio only supports INTx since it > > doesn't allow DMA). Implementing that would be the ultimate test of > > whether this provides dpdk with not only a more consistent interface, > > but the feature dpdk wants that's missing in uio. Thanks, > > > Hi Alex, > > Sorry for jumping in. Just being curious, how does VFIO No-IOMMU mode > support DMA from userspace drivers? If I understand correctly, due to > the absence of IOMMU, pcidev has to use physaddr to start a DMA > transaction, but how it is supposed to get physaddr from userspace > drivers, /proc//pagemap or something else? Hi Jike, vfio no-iommu does nothing to facilitate DMA mappings, UIO didn't either and DPDK managed to work with that. vfio no-iommu is meant to be an enabler and provide a consistent vfio device model (with MSI/X interrupts), but fundamentally the idea of a non-iommu protected, user owned device capable of DMA is unsupportable. This is why vfio no- iommu taints the kernel. With that in mind, one of the design goals is to introduce as little code as possible for vfio no-iommu. A new vfio iommu backend with pinning and virt-to-bus translation goes against that design goal. I don't know the details of how DPDK did this with UIO, but the same lack of DMA mapping facilities is present with vfio no-iommu. It really just brings the vfio device model, nothing more. Thanks, Alex ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2016-01-14 6:52 ` Alex Williamson @ 2016-01-14 8:12 ` Jike Song 0 siblings, 0 replies; 35+ messages in thread From: Jike Song @ 2016-01-14 8:12 UTC (permalink / raw) To: Alex Williamson; +Cc: dev On Thu, Jan 14, 2016 at 2:52 PM, Alex Williamson <alex.williamson@redhat.com> wrote: > On Thu, 2016-01-14 at 14:03 +0800, Jike Song wrote: >> On Wed, Dec 16, 2015 at 12:38 PM, Alex Williamson >> <alex.williamson@redhat.com> wrote: >> > >> > So it works. Is it acceptable? Useful? Sufficiently complete? Does >> > it imply deprecating the uio interface? I believe the feature that >> > started this discussion was support for MSI/X interrupts so that VFs >> > can support some kind of interrupt (uio only supports INTx since it >> > doesn't allow DMA). Implementing that would be the ultimate test of >> > whether this provides dpdk with not only a more consistent interface, >> > but the feature dpdk wants that's missing in uio. Thanks, >> > >> Hi Alex, >> >> Sorry for jumping in. Just being curious, how does VFIO No-IOMMU mode >> support DMA from userspace drivers? If I understand correctly, due to >> the absence of IOMMU, pcidev has to use physaddr to start a DMA >> transaction, but how it is supposed to get physaddr from userspace >> drivers, /proc//pagemap or something else? > > Hi Jike, > > vfio no-iommu does nothing to facilitate DMA mappings, UIO didn't > either and DPDK managed to work with that. vfio no-iommu is meant to > be an enabler and provide a consistent vfio device model (with MSI/X > interrupts), but fundamentally the idea of a non-iommu protected, user > owned device capable of DMA is unsupportable. This is why vfio no- > iommu taints the kernel. With that in mind, one of the design goals is > to introduce as little code as possible for vfio no-iommu. A new vfio > iommu backend with pinning and virt-to-bus translation goes against > that design goal. I don't know the details of how DPDK did this with > UIO, but the same lack of DMA mapping facilities is present with vfio > no-iommu. It really just brings the vfio device model, nothing more. > Thanks, > > Alex Thanks! - that addressed my question :) By the way, my previous assumption(consulting /proc/<pid>/pagemap) apparently doesn't work: one cannot assume a usespace buffer is physically continuous. -- Thanks, Jike ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-11 16:28 [dpdk-dev] VFIO no-iommu Thomas Monjalon 2015-12-11 22:12 ` Vincent JARDIN @ 2015-12-11 23:20 ` Jan Viktorin 2015-12-15 11:20 ` Alejandro Lucero 1 sibling, 1 reply; 35+ messages in thread From: Jan Viktorin @ 2015-12-11 23:20 UTC (permalink / raw) To: Thomas Monjalon; +Cc: dev, Alex Williamson Hello, I am not involved in the vfio very much, however, I was watching some vfio-related code in last few weeks. It looks promising to me and IMHO it seems to the best way to bring a support of integrated Ethernet MACs into DPDK (related to many SoCs). Unfortunately, the ARMv7 SoCs (I know) lacks of an IOMMU... The only protection there is the TrustZone technology but I have no idea of its support in the kernel. It's also far from being a replacement of an IOMMU. When using FPGAs, it is possible to put an IOMMU engine there (I've got such a prototype somewhere in my VHDL library) but nobody will probably do use because of saving on-chip resources. The X-Gene SoC (ARM 64) contains 2x 10 Gbps EMACs on the chip. I have no idea about IOMMUs there. Thus, this platform can probably benefit of such driver as well. The question is whether there is some interest to have this kind of support in DPDK. Thus, I'd like to have the vfio/no-iommu to support the ARMv7 (otherwise it would be effectively dead in DPDK). Unfortunately, it's not my primary job at the moment. Regards Jan Note: as far as I know, it is discouraged to refer to lkml.org as it is often very slow - my case today :). On Fri, 11 Dec 2015 17:28:43 +0100 Thomas Monjalon <thomas.monjalon@6wind.com> wrote: > Recently there were some discussions to have an upstream replacement > for our igb_uio module. > Several solutions were discussed (new uio driver, uio_pci_generic, vfio): > https://lkml.org/lkml/2015/10/16/700 > > Alex Williamson (maintainer of VFIO driver), submitted a solution > and was waiting some feedback. Unfortunately, nobody caught it and > he has reverted his work: > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ae5515d > > It is an important challenge to remove our out-of-tree modules and > especially igb_uio. It is a long way to have a standard solution integrated > in every distributions. > The current cooking Linux kernel is 4.4 and will have a long term maintenance: > https://kernel.org/releases.html > So it is a pity to miss this opportunity. > > Stephen has fixed a bug to use the IOMMU group zero: > http://dpdk.org/browse/dpdk/commit/?id=22215f141b1 > > Is there someone interested to work on VFIO no-iommu and provide > some feedbacks? > We also need to prepare a documentation patch to explain its usage > compared to the standard VFIO mode. > > Thanks ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] VFIO no-iommu 2015-12-11 23:20 ` Jan Viktorin @ 2015-12-15 11:20 ` Alejandro Lucero 0 siblings, 0 replies; 35+ messages in thread From: Alejandro Lucero @ 2015-12-15 11:20 UTC (permalink / raw) To: Jan Viktorin; +Cc: dev, Alex Williamson Hi, I know a bit about VFIO implementation, have been debugging IOMMU (intel) problems, know how QEMU/KVM work about using legacy or vfio attached devices, and I'm the maintainer of a DPDK PMD recently accepted upstream which requires our particular UIO driver (not maintained upstream). So I guess I could help with this effort and testing the code with our card. Of course, I can not be full time on this but I will be happy to contribute. On Fri, Dec 11, 2015 at 11:20 PM, Jan Viktorin <viktorin@rehivetech.com> wrote: > Hello, > > I am not involved in the vfio very much, however, I was watching some > vfio-related code in last few weeks. It looks promising to me and > IMHO it seems to the best way to bring a support of integrated Ethernet > MACs into DPDK (related to many SoCs). Unfortunately, the ARMv7 SoCs (I > know) lacks of an IOMMU... The only protection there is the TrustZone > technology but I have no idea of its support in the kernel. It's also > far from being a replacement of an IOMMU. When using FPGAs, it is > possible to put an IOMMU engine there (I've got such a prototype > somewhere in my VHDL library) but nobody will probably do use because > of saving on-chip resources. > > The X-Gene SoC (ARM 64) contains 2x 10 Gbps EMACs on the chip. I have no > idea about IOMMUs there. Thus, this platform can probably benefit of > such driver as well. The question is whether there is some interest to > have this kind of support in DPDK. > > Thus, I'd like to have the vfio/no-iommu to support the ARMv7 (otherwise > it would be effectively dead in DPDK). Unfortunately, it's not my > primary job at the moment. > > Regards > Jan > > Note: as far as I know, it is discouraged to refer to lkml.org as > it is often very slow - my case today :). > > On Fri, 11 Dec 2015 17:28:43 +0100 > Thomas Monjalon <thomas.monjalon@6wind.com> wrote: > > > Recently there were some discussions to have an upstream replacement > > for our igb_uio module. > > Several solutions were discussed (new uio driver, uio_pci_generic, vfio): > > https://lkml.org/lkml/2015/10/16/700 > > > > Alex Williamson (maintainer of VFIO driver), submitted a solution > > and was waiting some feedback. Unfortunately, nobody caught it and > > he has reverted his work: > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ae5515d > > > > It is an important challenge to remove our out-of-tree modules and > > especially igb_uio. It is a long way to have a standard solution > integrated > > in every distributions. > > The current cooking Linux kernel is 4.4 and will have a long term > maintenance: > > https://kernel.org/releases.html > > So it is a pity to miss this opportunity. > > > > Stephen has fixed a bug to use the IOMMU group zero: > > http://dpdk.org/browse/dpdk/commit/?id=22215f141b1 > > > > Is there someone interested to work on VFIO no-iommu and provide > > some feedbacks? > > We also need to prepare a documentation patch to explain its usage > > compared to the standard VFIO mode. > > > > Thanks > > ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2016-01-14 8:12 UTC | newest] Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-12-11 16:28 [dpdk-dev] VFIO no-iommu Thomas Monjalon 2015-12-11 22:12 ` Vincent JARDIN 2015-12-11 23:02 ` Alex Williamson 2015-12-15 13:43 ` O'Driscoll, Tim 2015-12-15 16:53 ` Alex Williamson 2015-12-16 4:04 ` Ferruh Yigit 2015-12-16 4:38 ` Alex Williamson 2015-12-16 8:35 ` Burakov, Anatoly 2015-12-16 16:23 ` Burakov, Anatoly 2015-12-16 23:17 ` Thomas Monjalon 2015-12-17 9:52 ` Burakov, Anatoly 2015-12-17 10:09 ` Thomas Monjalon 2015-12-17 19:38 ` Jan Viktorin 2015-12-17 21:16 ` Vincent JARDIN 2015-12-17 23:29 ` Stephen Hemminger 2015-12-16 17:11 ` Alex Williamson 2015-12-16 17:22 ` Burakov, Anatoly 2015-12-17 16:43 ` Alex Williamson 2015-12-18 10:43 ` Yigit, Ferruh 2015-12-18 14:38 ` Alex Williamson 2015-12-18 21:50 ` Alex Williamson 2015-12-21 11:46 ` Yigit, Ferruh 2015-12-21 12:18 ` [dpdk-dev] [PATCH] vfio: add no-iommu support Ferruh Yigit 2015-12-21 15:15 ` Burakov, Anatoly 2015-12-21 15:26 ` Yigit, Ferruh 2015-12-21 15:28 ` Burakov, Anatoly 2015-12-21 19:22 ` [dpdk-dev] VFIO no-iommu Alex Williamson 2015-12-22 20:20 ` Alex Williamson 2015-12-23 11:19 ` Burakov, Anatoly 2015-12-31 14:30 ` Santosh Shukla 2016-01-14 6:03 ` Jike Song 2016-01-14 6:52 ` Alex Williamson 2016-01-14 8:12 ` Jike Song 2015-12-11 23:20 ` Jan Viktorin 2015-12-15 11:20 ` Alejandro Lucero
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).