* [dpdk-dev] vfio: failed to select IOMMU type
From: Andrew Rybchenko @ 2017-04-01 10:46 UTC
To: dev, Alejandro Lucero; +Cc: Anatoly Burakov

Hi,

after the following commit (it was picked up by dpdk-next-net recently), I
have problems with VFIO:
===
commit 94c0776b1badd1ee715d60f07391058f23494365
Author: Alejandro Lucero <alejandro.lucero@netronome.com>
Date:   Wed Mar 29 10:54:50 2017 +0100

    vfio: support hotplug

    Current device hotplug is just supported by UIO managed devices.
    This patch adds same functionality with VFIO.

    It has been validated through tests using IOMMU and also with
    VFIO and no-iommu mode.

    Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
    Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
===

The second PCI function fails to bind:
# testpmd -w 06:00.0 -w 06:00.1 -c 0xc -n 4 -- --rxd=512 --txd=512 --crc-strip --disable-hw-vlan-filter --disable-hw-vlan-strip
EAL: Detected 16 lcore(s)
EAL: 2048 hugepages of size 2097152 reserved, but no mounted hugetlbfs found for that size
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL:   probe driver: 1924:a03 net_sfc_efx
EAL:   using IOMMU type 1 (Type 1)
EAL: Ignore mapping IO port bar(0) addr: 2101
EAL: PCI device 0000:06:00.1 on NUMA socket 0
EAL:   probe driver: 1924:a03 net_sfc_efx
EAL:   0000:06:00.1 failed to select IOMMU type
EAL: Requested device 0000:06:00.1 cannot be used
EAL: Requested device 0000:7f:08.0 cannot be used
EAL: Requested device 0000:7f:08.2 cannot be used
EAL: Requested device 0000:7f:08.3 cannot be used
...

Also, I don't understand why it complains about many other PCI functions,
given that only 2 are specified in the whitelist.

I've bisected to find the commit where the problem first appears, but have
not found the root cause yet.

Andrew.

* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Burakov, Anatoly @ 2017-04-03 16:11 UTC
To: Andrew Rybchenko, dev, Alejandro Lucero

> From: Andrew Rybchenko [mailto:arybchenko@solarflare.com]
> Sent: Saturday, April 1, 2017 11:47 AM
> To: dev@dpdk.org; Alejandro Lucero <alejandro.lucero@netronome.com>
> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
> Subject: vfio: failed to select IOMMU type
>
> [...]
> EAL: PCI device 0000:06:00.1 on NUMA socket 0
> EAL:   probe driver: 1924:a03 net_sfc_efx
> EAL:   0000:06:00.1 failed to select IOMMU type
> EAL: Requested device 0000:06:00.1 cannot be used
> EAL: Requested device 0000:7f:08.0 cannot be used
> [...]
>
> Also, I don't understand why it complains about many other PCI functions,
> given that only 2 are specified in the whitelist.
>
> I've bisected to find the commit where the problem first appears, but have
> not found the root cause yet.
>
> Andrew.

Hi Andrew,

It would be interesting to know what was wrong there. The whitelist issue
is surprising, and from the logs it seems like EAL is trying to set up DMA
mappings multiple times. Posting a more detailed log would be very helpful
in tracking down the issue as well. I have tested that code with ixgbe
devices, so I'm not too sure what can go wrong there.

Thanks,
Anatoly

* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Andrew Rybchenko @ 2017-04-04 15:29 UTC
To: Burakov, Anatoly, dev, Alejandro Lucero

On 04/03/2017 07:11 PM, Burakov, Anatoly wrote:
>> [...]
>> Also, I don't understand why it complains about many other PCI functions,
>> given that only 2 are specified in the whitelist.
>> [...]
> Hi Andrew,
>
> It would be interesting to know what was wrong there. The whitelist issue
> is surprising, and from the logs it seems like EAL is trying to set up DMA
> mappings multiple times. Posting a more detailed log would be very helpful
> in tracking down the issue as well. I have tested that code with ixgbe
> devices, so I'm not too sure what can go wrong there.

Hi Anatoly,

I've sent a patch to fix the whitelist issue. It is the result of replacing
rte_exit with plain logging.

I think the key to the main problem is that the same IOMMU group is used for
both PCI functions. The code tries to set the IOMMU type twice using the same
file descriptor. The second set is a no-op in effect, since the same value
is set, but it still fails, I guess because the IOMMU type is already in use.
See the logs below, with debug enabled and a few extra log lines added:

EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL:   probe driver: 1924:a03 net_sfc_efx
EAL: vfio_get_group_fd:75: group-no=53
EAL: vfio_get_group_fd:135: group-no=53 fd=16 filename=/dev/vfio/53
EAL: vfio_setup_device:319: ps-type=0 groups=1
EAL:   using IOMMU type 1 (Type 1)
EAL: Ignore mapping IO port bar(0) addr: 2101
EAL:   PCI memory mapped at 0x7fffc0000000
EAL: Trying to map BAR 4 that contains the MSI-X table. Trying offsets: 0x40000000000:0x0000, 0x40000001000:0x3000
EAL:   PCI memory mapped at 0x7fffc0801000
PMD: sfc_efx 0000:06:00.0 #0: use ef10 Rx datapath
PMD: sfc_efx 0000:06:00.0 #0: use ef10_simple Tx datapath
EAL: PCI device 0000:06:00.1 on NUMA socket 0
EAL:   probe driver: 1924:a03 net_sfc_efx
EAL: vfio_get_group_fd:75: group-no=53
EAL: vfio_setup_device:319: ps-type=0 groups=1
EAL:   set IOMMU type 1 (Type 1) failed, error 22 (Invalid argument)
EAL:   set IOMMU type 7 (sPAPR) failed, error 22 (Invalid argument)
EAL:   set IOMMU type 8 (No-IOMMU) failed, error 22 (Invalid argument)
EAL:   0000:06:00.1 failed to select IOMMU type
EAL: Requested device 0000:06:00.1 cannot be used

Andrew.

* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Burakov, Anatoly @ 2017-04-04 15:52 UTC
To: Andrew Rybchenko, dev, Alejandro Lucero

Hi Andrew,

> I think the key to the main problem is that the same IOMMU group is used
> for both PCI functions.
> The code tries to set the IOMMU type twice using the same file descriptor.
> The second set is a no-op in effect, since the same value is set, but it
> still fails, I guess because the IOMMU type is already in use.
> See the logs below, with debug enabled and a few extra log lines added:

Yes, you're right. Specifically, eal_vfio.c:vfio_setup_device() at line 311
(where we check for the number of groups): the code always assumes that one
active group means we've just initialized a new group, which may not
necessarily be the case if there's more than one device per group.

Alejandro, please correct me if I'm wrong, but I think this raises another
issue: vfio_release_device() seems to attempt to close the group fd
unconditionally, which is probably a bad idea if there is more than one
device per group.

Would you be so kind as to come up with a patch to fix this oversight, or
should I do it? :)

Thanks,
Anatoly

* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Andrew Rybchenko @ 2017-04-04 16:10 UTC
To: Burakov, Anatoly, dev, Alejandro Lucero

On 04/04/2017 06:52 PM, Burakov, Anatoly wrote:
> Hi Andrew,
>
> [...]
>
> Alejandro, please correct me if I'm wrong, but I think this raises another
> issue: vfio_release_device() seems to attempt to close the group fd
> unconditionally, which is probably a bad idea if there is more than one
> device per group.

If that is true, please take care of it.

> Would you be so kind as to come up with a patch to fix this oversight, or
> should I do it? :)

Please take a look at http://dpdk.org/dev/patchwork/patch/23202/

Basically, it just moves the IOMMU type selection under the preceding
container-not-set condition. Note that my testing is very limited: no
hotplug and no devices in different IOMMU groups.

Andrew.

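Editorial illustration of the guard described in the message above. This is
a minimal, self-contained model and not the code from the referenced patch:
the types and functions (sketch_container, sketch_group, sketch_set_iommu,
sketch_setup_device) are hypothetical names invented here, and the "second
set fails with EINVAL" behaviour is modelled after the debug log earlier in
the thread rather than taken from kernel sources.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VFIO container; not a DPDK or kernel type. */
struct sketch_container {
    int groups;      /* groups attached to the container */
    int iommu_type;  /* 0 means "no IOMMU type selected yet" */
};

/* Hypothetical stand-in for a VFIO group. */
struct sketch_group {
    bool container_set;  /* true once the group is attached to a container */
};

/*
 * Models the behaviour seen in the debug log: selecting an IOMMU type on a
 * container that already has one fails with EINVAL, even for the same value.
 */
static int sketch_set_iommu(struct sketch_container *c, int type)
{
    assert(c != NULL);
    if (c->groups == 0 || c->iommu_type != 0)
        return -EINVAL;
    c->iommu_type = type;
    return 0;
}

/*
 * Guarded setup: attach the group and select the IOMMU type only when the
 * group has no container yet. A second device from the same group skips
 * both steps and therefore never triggers the EINVAL path.
 */
static int sketch_setup_device(struct sketch_container *c,
                               struct sketch_group *g, int type)
{
    assert(c != NULL && g != NULL);
    if (g->container_set)
        return 0;
    c->groups++;
    g->container_set = true;
    if (c->iommu_type == 0)
        return sketch_set_iommu(c, type);
    return 0;
}
```

Calling sketch_setup_device() twice for the same group models probing
06:00.0 and then 06:00.1: the second call returns early instead of trying
to select the IOMMU type again.
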
* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Burakov, Anatoly @ 2017-04-04 16:20 UTC
To: Andrew Rybchenko, dev, Alejandro Lucero

Hi Andrew,

> Please, take a look at http://dpdk.org/dev/patchwork/patch/23202/

I took a quick look. It should fix the problem (closing the group will
cause container detachment, so we can always assume that if there's no
container associated with the group, there are no devices within that
group). The issue of vfio_release_device() still remains. I think the fix
for that would closely follow what you did here: close the dev_fd, check if
the group still has a container associated with it, and if not, close the
group_fd and clear the group.

I cannot thoroughly test it right this moment, however I'll have time later
during the week.

Thanks,
Anatoly

* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Alejandro Lucero @ 2017-04-05 7:15 UTC
To: Burakov, Anatoly; +Cc: Andrew Rybchenko, dev

On Tue, Apr 4, 2017 at 5:20 PM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote:

> Hi Andrew,
>
> > Please, take a look at http://dpdk.org/dev/patchwork/patch/23202/
>
> I took a quick look. It should fix the problem (closing the group will
> cause container detachment, so we can always assume that if there's no
> container associated with the group, there are no devices within that
> group). The issue of vfio_release_device() still remains. I think the fix
> for that would closely follow what you did here: close the dev_fd, check if
> the group still has a container associated with it, and if not, close the
> group_fd and clear the group.

That could work. When implementing the hotplug support, I realized the
kernel could give more information about the status of a particular group.
I may propose some changes to facilitate this, even if we can fix the issue
just by checking whether there is a container there.

> I cannot thoroughly test it right this moment, however I'll have time
> later during the week.
>
> Thanks,
> Anatoly

* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Alejandro Lucero @ 2017-04-05 7:12 UTC
To: Burakov, Anatoly; +Cc: Andrew Rybchenko, dev

Hi Andrew, Anatoly,

On Tue, Apr 4, 2017 at 4:52 PM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote:

> Hi Andrew,
>
> > I think a key to the main problem is the same IOMMU group used for both
> > PCI functions.
> > It tries to set IOMMU type using the same file descriptor twice. The
> > second set is dummy, since the same value is set, but still fails, I
> > guess, because it is already in use.
> > See logs with debug enabled and few extra logs below:
>
> Yes you're right. Specifically, eal_vfio.c:vfio_setup_device() at line
> 311 (where we check for number of groups) - the code always assumes that one
> active group means we've just initialized a new group, which may not
> necessarily be the case if there's more than one device per group.

Yes, the code was not aware of that possibility. To be honest, I knew about
it, but it was not on my mind when implementing the code. I only have cards
where each PF, and even each VF, has its own VFIO group. This could be a
problem for testing.

Anatoly, do you have a system with that peculiarity?

> Alejandro, please correct me if I'm wrong, but I think this raises another
> issue: vfio_release_device() seems to attempt to close group fd
> unconditionally,
> which is probably a bad idea if there are more than one device per group.

Yes, I think so, but I'm not completely sure. From a quick look at the
kernel VFIO code, that does seem to be a problem, but I need more time to
study the code again. Again, not having hardware with this peculiarity will
make the process harder.

> Would you be so kind to come up with a patch to fix this oversight, or
> should I do it? :)

Yes, I will work on this as a priority and send a patch asap.

Andrew, could you test this hotplug option in some way with your systems?

Thanks

> Thanks,
> Anatoly

* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Burakov, Anatoly @ 2017-04-06 9:10 UTC
To: Alejandro Lucero; +Cc: Andrew Rybchenko, dev

Hi Alejandro,

> Yes, the code was not aware of that possibility. To be honest, I knew
> about it, but it was not on my mind when implementing the code. I only
> have cards where each PF, and even each VF, has its own VFIO group. This
> could be a problem for testing.
> Anatoly, do you have a system with that peculiarity?

I think if you compile an old kernel (like 3.6), you should be able to
reproduce the behavior (at least I am able to do this with an old kernel).

Thanks,
Anatoly

* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Alejandro Lucero @ 2017-04-18 11:22 UTC
To: Burakov, Anatoly; +Cc: Andrew Rybchenko, dev

On Thu, Apr 6, 2017 at 11:10 AM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote:

> Hi Alejandro,
>
> > Yes, the code was not aware of that possibility. Being honest, I knew
> > about that, but not in my mind when implementing the code. I have just
> > cards where each PF even VF have their own VFIO group. This could be a
> > problem for testing.
> > Anatoly, do you have a system with that peculiarity?
>
> I think if you compile an old kernel (like a 3.6), you should be able to
> reproduce the behavior. (at least I am able to do this with an old kernel)

It turns out the container is not removed while it has users, and having an
open group file descriptor implies an active user. So we need another
approach. The kernel does not give any information we can use (and it may
need a patch to fix a potential problem regarding group closing), but we
can easily track the number of devices in a group and close the group when
the last device is closed.

I have tested this approach and it works for the single-device-per-group
case, which is the only one I can test for now. But I'm not sure I
understood the comment about using an old kernel to test the
multiple-devices-per-group case. Can you confirm whether I understood this
correctly? If I use an old kernel, are devices like VFs created in the same
IOMMU group?

Thanks

> Thanks,
> Anatoly

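Editorial illustration of the device-counting approach described in the
message above. This is only a sketch under stated assumptions, not the patch
that was eventually sent: struct grp, grp_device_added(),
grp_device_released() and the simulated grp_close() are hypothetical names,
and the real close(2) call is replaced by a counter so the logic can be
followed without VFIO hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-group bookkeeping; not a DPDK structure. */
struct grp {
    int fd;       /* group file descriptor, -1 when closed */
    int devices;  /* devices currently set up in this group */
};

/* Counts simulated close() calls so the behaviour can be observed. */
static int grp_closed_fds;

/* Stand-in for close(2) on the group fd (assumption: no real fd here). */
static void grp_close(int fd)
{
    (void)fd;
    grp_closed_fds++;
}

/* Record one more device in the group; the first one stores the fd. */
static void grp_device_added(struct grp *g, int fd_if_new)
{
    assert(g != NULL);
    if (g->devices == 0)
        g->fd = fd_if_new;
    g->devices++;
}

/*
 * Release one device. Returns 1 if this was the last device and the group
 * fd was closed, 0 if other devices still use the group, and -1 if no
 * device was registered.
 */
static int grp_device_released(struct grp *g)
{
    assert(g != NULL);
    if (g->devices <= 0)
        return -1;
    if (--g->devices == 0) {
        grp_close(g->fd);
        g->fd = -1;
        return 1;
    }
    return 0;
}
```

With two devices in one group, the first release leaves the group fd open
and only the second one closes it, which is the property the unconditional
close in vfio_release_device() lacked.
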
* Re: [dpdk-dev] vfio: failed to select IOMMU type
From: Burakov, Anatoly @ 2017-04-24 9:09 UTC
To: Alejandro Lucero; +Cc: Andrew Rybchenko, dev

Hi Alejandro,

> I have tested this approach and it works for the single-device-per-group
> case, which is the only one I can test for now. But I'm not sure I
> understood the comment about using an old kernel to test the
> multiple-devices-per-group case. Can you confirm whether I understood this
> correctly? If I use an old kernel, are devices like VFs created in the
> same IOMMU group?

I cannot confirm this for every use case, but I've done some recent testing
with VFIO on an old kernel for an unrelated patch, and I noticed that with
kernel 3.6, all my physical ports were assigned to the same IOMMU group (as
opposed to different ones with a more recent kernel). So I think it's a
safe bet to at least try and test this on your side :)

Thanks,
Anatoly
