DPDK patches and discussions
From: "Liang, Cunming" <cunming.liang@intel.com>
To: alex <alex@weka.io>, "Zhou, Danny" <danny.zhou@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Why do we need iommu=pt?
Date: Wed, 22 Oct 2014 08:53:00 +0000
Message-ID: <D0158A423229094DA7ABF71CF2FA0DA311851099@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <CAKfHP0WstpCYb0POKP_bOt36+qfcAGj-j=ij9BeCZifRjJ080g@mail.gmail.com>

I think it's a good point to use dma_addr rather than phys_addr.
Without an IOMMU the two values are the same; with an IOMMU the dma_addr is the IOVA.
But that alone is not enough for DPDK to work with an IOMMU that is not in pass-through mode.
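
To make the terminology concrete, here is a purely made-up sketch (not real
DPDK structures): the descriptor has to be programmed with whatever address
the device will actually use for DMA.

  #include <stdint.h>

  struct buf_info {
          uint64_t phys_addr;   /* host physical address of the buffer */
          uint64_t iova;        /* address the IOMMU domain maps for the device */
  };

  /* The dma_addr only equals phys_addr when there is no IOMMU or iommu=pt. */
  static uint64_t buf_dma_addr(const struct buf_info *b, int iommu_translating)
  {
          return iommu_translating ? b->iova : b->phys_addr;
  }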

Each IOVA belongs to one IOMMU domain, and each device attaches to one domain.
That means an IOVA is coupled to a specific domain/device.

Looking back at the DPDK descriptor ring, that is fine: it is already coupled to a device.
But an mbuf mempool is, in most cases, shared by multiple ports.
To keep that model, all of those ports/devices would have to be placed in the same IOMMU domain,
and the mempool would be attached to a specific domain, not just to a device.
At that point the IOMMU domain is no longer transparent to DPDK.
VFIO provides the verbs to control a domain, but we would still need a library to manage such domains together with mempools.
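
As a rough sketch of those verbs (the VFIO type1 API from <linux/vfio.h>; the
group number, IOVA and buffer size below are made up, and error handling is
omitted for brevity):

  #include <fcntl.h>
  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <sys/mman.h>
  #include <linux/vfio.h>

  /* Attach a device group to an IOMMU domain (container) and map one
   * 1 MB anonymous buffer at IOVA 0, which then becomes its dma_addr. */
  static int map_buffer_into_domain(void)
  {
          int container = open("/dev/vfio/vfio", O_RDWR);
          int group = open("/dev/vfio/26", O_RDWR);   /* hypothetical group */
          struct vfio_iommu_type1_dma_map dma_map = { .argsz = sizeof(dma_map) };
          void *buf;

          ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
          ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

          buf = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          dma_map.vaddr = (uintptr_t)buf;
          dma_map.iova  = 0;
          dma_map.size  = 1 << 20;
          dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
          return ioctl(container, VFIO_IOMMU_MAP_DMA, &dma_map);
  }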

All that overhead is only to make DPDK work with the IOMMU on the host, while pass-through (pt) always works.
Device isolation is mainly a security concern.
If it is not necessary, pt is definitely a good choice, with no performance impact.

A self-implemented PMD that uses the DMA kernel interface to set up its mappings appropriately
does not require "iommu=pt"; the default option "iommu=on" also works.
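
For reference, the usual kernel boot parameter combinations on an Intel
platform are:

  intel_iommu=on iommu=pt   # remapping enabled, but host devices use a 1:1
                            # (pass-through) mapping, so no per-DMA overhead
  intel_iommu=on            # full translation: every host DMA goes through
                            # the IOMMU page tables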

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of alex
> Sent: Wednesday, October 22, 2014 3:36 PM
> To: Zhou, Danny
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] Why do we need iommu=pt?
> 
> Shiva.
> The cost of disabling iommu=pt when intel_iommu=on is dire: DPDK won't work,
> as the RX/TX descriptors will be useless.
> Any DMA access by the device will be dropped, as no DMA mapping will exist.
> 
> Danny.
> The IOMMU hurts performance in kernel drivers, which perform a map and unmap
> operation for each egress/ingress packet.
> The cost of unmapping under strict protection limits a 10Gb+ NIC to 3Gb with
> the CPU maxed out at 100%. DPDK apps shouldn't feel any difference IFF the
> RX descriptors contain the IOVA and not the real physical addresses that are
> used currently.
> 
> 
> On Tue, Oct 21, 2014 at 10:10 PM, Zhou, Danny <danny.zhou@intel.com> wrote:
> 
> > IMHO, whether memory protection with an IOMMU is needed really depends on
> > how you use and deploy your DPDK based applications. Telco network middle
> > boxes, which adopt a "closed model" solution to achieve extremely high
> > performance, have the entire system, including HW and software in kernel
> > and userspace, controlled by the Telco vendor and assumed trustable, so
> > memory protection is not so important. Datacenters, which generally adopt
> > an "open model" solution, allow running user space applications (e.g.
> > tenant applications and VMs) which can directly access the NIC and the DMA
> > engine inside the NIC using a modified DPDK PMD; these are not trustable,
> > as they can potentially DMA to/from arbitrary memory regions using
> > physical addresses, so an IOMMU is needed to provide strict memory
> > protection, at the cost of a negative performance impact.
> >
> > So if you want to seek high performance, disable the IOMMU in the BIOS or
> > OS. If security is a major concern, turn it on and trade off between
> > performance and security. But I do NOT think it comes with an extremely
> > high performance cost according to our performance measurements, though
> > that is probably true for a 100G NIC.
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Shivapriya Hiremath
> > > Sent: Wednesday, October 22, 2014 12:54 AM
> > > To: Alex Markuze
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] Why do we need iommu=pt?
> > >
> > > Hi,
> > >
> > > Thank you for all the replies.
> > > I am trying to understand the impact of this on DPDK. What will be the
> > > repercussions of disabling "iommu=pt" on the DPDK performance?
> > >
> > >
> > > On Tue, Oct 21, 2014 at 12:32 AM, Alex Markuze <alex@weka.io> wrote:
> > >
> > > > DPDK uses a 1:1 mapping and doesn't support the IOMMU. The IOMMU allows
> > > > for simpler VM physical address translation.
> > > > The second role of the IOMMU is to allow protection from unwanted memory
> > > > access by an unsafe device that has DMA privileges. Unfortunately this
> > > > protection comes with an extremely high performance cost for high-speed
> > > > NICs.
> > > >
> > > > To your question iommu=pt disables IOMMU support for the hypervisor.
> > > >
> > > > On Tue, Oct 21, 2014 at 1:39 AM, Xie, Huawei <huawei.xie@intel.com>
> > > > wrote:
> > > >
> > > >>
> > > >>
> > > >> > -----Original Message-----
> > > >> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Shivapriya
> > > >> Hiremath
> > > >> > Sent: Monday, October 20, 2014 2:59 PM
> > > >> > To: dev@dpdk.org
> > > >> > Subject: [dpdk-dev] Why do we need iommu=pt?
> > > >> >
> > > >> > Hi,
> > > >> >
> > > >> > My question is: if the Poll Mode Driver used the DMA kernel interface
> > > >> > to set up its mappings appropriately, would it still require that
> > > >> > iommu=pt be set?
> > > >> > What is the purpose of setting iommu=pt ?
> > > >> PMD allocates memory through the hugetlbfs file system, and fills the
> > > >> physical address into the descriptor.
> > > >> pt is used to pass through IOTLB translation. Refer to the link below.
> > > >> http://lkml.iu.edu/hypermail/linux/kernel/0906.2/02129.html
> > > >> >
> > > >> > Thank you.
> > > >>
> > > >
> > > >
> >
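
The per-packet cost Alex mentions above comes from kernel drivers doing
roughly the following for every packet (a simplified kernel-side sketch;
under strict IOMMU protection each map/unmap has to update the IOMMU page
tables and invalidate the IOTLB):

  #include <linux/dma-mapping.h>
  #include <linux/errno.h>

  /* Sketch of the per-packet streaming-DMA map/unmap a kernel driver pays for. */
  static int xmit_one(struct device *dev, void *data, size_t len)
  {
          dma_addr_t dma = dma_map_single(dev, data, len, DMA_TO_DEVICE);

          if (dma_mapping_error(dev, dma))
                  return -ENOMEM;
          /* ...program 'dma' into a TX descriptor, kick the NIC, wait... */
          dma_unmap_single(dev, dma, len, DMA_TO_DEVICE);
          return 0;
  }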

