From: Ferruh Yigit <ferruh.yigit@intel.com>
To: Alejandro Lucero <alejandro.lucero@netronome.com>
Cc: dev <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] igb_uio: map dummy dma forcing iommu domain attachment
Date: Fri, 10 Feb 2017 19:03:17 +0000
Message-ID: <d499b34f-7a13-bcdb-cf26-bba5a0ada247@intel.com>
In-Reply-To: <CAD+H993AnUWkwpoBQf+nXpkHEXL6D_igkt3y4RAx4ixJ+47cjQ@mail.gmail.com>

On 2/8/2017 11:54 AM, Alejandro Lucero wrote:
> Hi Ferruh,
> 
> On Tue, Feb 7, 2017 at 3:59 PM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> 
>     Hi Alejandro,
> 
>     On 1/18/2017 12:27 PM, Alejandro Lucero wrote:
>     > Using a DPDK app when the IOMMU is enabled requires adding
>     > iommu=pt to the kernel command line. But using the igb_uio driver
>     > causes DMAR errors because the device has no IOMMU domain.
> 
>     Please help me understand the scope of the problem.
> 
> 
> After reading your reply, I realize I could have explained it better.
> First of all, this is related to SRIOV, specifically when the VFs are created.
>  
> 
>     1- How can you reproduce the problem?
> 
> 
> Using a VF from an Intel card with a DPDK app in the host and a kernel >=
> 3.15. Although VFs are usually assigned to VMs, using VFs in the host
> could also be an option.
>
> BTW, I did not try to reproduce the problem with an Intel card. I
> triggered this problem with an NFP, but given the underlying cause, I
> bet it is going to happen with an Intel one as well.

I was able to reproduce the problem with ixgbe, using a VF on the host.

I also verified that your patch fixes it; it causes the device to be
attached to a vfio group.

So I believe it is good to take this patch, but it is already too late
for the 17.02 release.
I suggest merging it early in 17.05, which gives more time to test.

> 
>  
> 
>     2- What happens when you get DMAR errors? Does it prevent the device
>     from working, or is it just some annoying error messages?
> 
> 
> A DMAR error implies the device cannot access the DMA address given
> by the host. I have seen several situations where the only effect is the
> device not being able to work at all, but it can also have more global
> implications, leaving the system unreliable until it is rebooted.
> I think it depends on how these DMAR errors are handled, but in any
> case, this is a bad thing.

In my test, the effect was that the device did not work.

>  
> 
> 
>     3- Can you please share the error messages?
> 
> 
> With this problem you can expect something like this:
> 
> [ 559.163874] DMAR: DRHD: handling fault status reg 2
> [ 559.165427] DMAR: DMAR:[DMA Read] Request device [82:08.0] fault addr e7b73b000
> [ 559.165427] DMAR:[fault reason 02] Present bit in context entry is clear
> [ 568.367417] DMAR: DRHD: handling fault status reg 102
> [ 568.369025] DMAR: DMAR:[DMA Read] Request device [82:08.1] fault addr ebb73b000
> [ 568.369025] DMAR:[fault reason 02] Present bit in context entry is clear
> [ 571.773944] DMAR: DRHD: handling fault status reg 202
> [ 571.775550] DMAR: DMAR:[DMA Read] Request device [82:08.2] fault addr efb73b000
> [ 571.775550] DMAR:[fault reason 02] Present bit in context entry is clear
> [ 575.039654] DMAR: DRHD: handling fault status reg 302
> [ 575.041259] DMAR: DMAR:[DMA Read] Request device [82:08.3] fault addr f3b73b000
> [ 575.041259] DMAR:[fault reason 02] Present bit in context entry is clear
> 
> There are different DMAR errors, sometimes referring to a specific
> address being wrong. In this case it is related to the device not having
> a context or an IOMMU domain.
> 
> Also note we got these errors for different devices/VFs. This was with a
> DPDK app using several VFs.
>  
> 
> 
> 
>     >
>     > Since kernel 3.15, iommu=pt requires using the internal kernel
>     > DMA API to attach the device to the IOMMU 1:1 mapping, aka
>     > si_domain. Previous versions attached the device to that
>     > domain when the intel iommu notifier was called.
> 
>     Again, what is not working since 3.15?
> 
> 
> This specific case, yes. With older kernels, when VFs are created, IOMMU
> code is executed (notifier chain callback) and if iommu=pt, the VF is
> attached to the si_domain, this is the 1:1 mapping. But this has changed
> with newer kernels, and after VFs are created they have no IOMMU domain
> at all. The kernel expects the driver to implicitly create such a domain
> when the kernel DMA API is used.

Thanks again for the clarification.
What will be the effect of your patch on kernels < 3.15? Should your
update be protected with a kernel version check, or is it safe for all
versions?
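
To make the question concrete, here is a rough sketch of what such a
version comparison looks like. In the module itself this would normally be
a compile-time guard, `#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 15, 0)`
from <linux/version.h>. The user-space code below only illustrates the
comparison; `KVER`, `kver_code` and `is_kernel_at_least_3_15` are
hypothetical names and are not part of the patch.

```c
#include <stdio.h>

/* Same encoding idea as the kernel's KERNEL_VERSION() macro. */
#define KVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical helper: turn a release string such as "4.4.0-62-generic"
 * into a comparable code. Returns -1 if major.minor cannot be parsed.
 */
static int kver_code(const char *release)
{
	int major = 0, minor = 0, patch = 0;

	if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
		return -1;
	if (patch > 255)	/* keep the sublevel inside its 8-bit field */
		patch = 255;
	return KVER(major, minor, patch);
}

/*
 * Hypothetical helper: returns 1 for kernels at or after 3.15, where
 * (per this thread) VFs no longer get an IOMMU domain at creation time,
 * and 0 for older kernels or unparsable strings.
 */
static int is_kernel_at_least_3_15(const char *release)
{
	int code = kver_code(release);

	return code >= 0 && code >= KVER(3, 15, 0);
}
```

The release string would typically come from uname(2). Whether the guard
is needed at all is exactly the open question above.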

>  
> 
> 
>     >
>     > This is not a problem if the driver later makes some call to the
>     > DMA API, because the mapping can be done then. But DPDK apps do
>     > not use that DMA API at all.
> 
>     Is this the same as, or similar to:
>     http://dpdk.org/dev/patchwork/patch/12654/
> 
>  
> That case was another issue regarding IOMMU and iommu=pt. The problem
> there was when you detach a VF from a VM, but the VF was initially
> attached to the si_domain because the kernel did so. The patch helped to
> attach the VF again to that domain when binding to the UIO.
> 
> Looking at that patch now (I did comment on it then), it only solved the
> problem if the VF was detached from the UIO, something that could be
> easily forgotten or simply not done because, apparently, it is not needed.

I was also able to reproduce this case. When the driver is switched from
igb_uio -> vfio_pci -> igb_uio, the device stops working, giving similar
DMAR errors.

Your patch also fixes this, at least in my test. When unbinding from
vfio_pci the iommu group is removed, but binding igb_uio adds it back.

> 
> What about using VFIO?
> 
> With that previous patch, it was not enough. I do not remember the
> details now, and I'm not sure if VFIO created another IOMMU domain if
> the device had one, but it could leave the device without an IOMMU
> domain after the first use.
> 
> In this particular case, VFIO would work, because the device gets its
> own IOMMU domain. But there are two main problems if this is not fixed
> when using UIO:
> 
> 1) UIO is one of the two options for working with IOMMU. We all agree
> VFIO is the right one for IOMMU, but as long as UIO is still an option,
> that should be fixed.
> 
> 2) Some installations need to work with and without IOMMU. Having same
> module for both cases makes things simpler and therefore they use UIO
> instead of VFIO.
> 
>  
> 
>     >
>     > Doing this dma map and unmap is harmless even when iommu is not
>     > enabled at all.
>     >
>     > Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>

Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>

>     <...>
> 
>     Thanks,
>     ferruh
> 
> 
> 

