From: "Tan, Jianfeng" <jianfeng.tan@intel.com>
To: "Walker, Benjamin" <benjamin.walker@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Running DPDK as an unprivileged user
Date: Wed, 4 Jan 2017 19:39:18 +0800
Message-ID: <685186b4-e50e-c122-459b-e4635404c3f8@intel.com>
In-Reply-To: <1483044080.11975.1.camel@intel.com>

Hi Benjamin,


On 12/30/2016 4:41 AM, Walker, Benjamin wrote:
> Hi all,
>
> I've been digging into what it would take to run DPDK as an
> unprivileged user and I have some findings that I thought
> were worthy of discussion. The assumptions here are that I'm
> using a very recent Linux kernel (4.8.15 to be specific) and
> I'm using vfio with my IOMMU enabled. I'm only interested in
> making it possible to run as an unprivileged user in this
> type of environment.
>
> There are a few key things that DPDK needs to do in order to
> run as an unprivileged user:
>
> 1) Allocate hugepages
> 2) Map device resources
> 3) Map hugepage virtual addresses to DMA addresses.
>
> For #1 and #2, DPDK works just fine today. You simply chown
> the relevant resources in sysfs to the desired user and
> everything is happy.
>
> The problem is #3. This currently relies on looking up the
> mappings in /proc/self/pagemap, but the ability to get
> physical addresses in /proc/self/pagemap as an unprivileged
> user was removed from the kernel in the 4.x timeframe due to
> the Rowhammer vulnerability. At this time, it is not
> possible to run DPDK as an unprivileged user on a 4.x Linux
> kernel.
>
> There is a way to make this work though, which I'll outline
> now. Unfortunately, I think it is going to require some very
> significant changes to the initialization flow in the EAL.
> One bit of background before I go into how to fix this -
> there are three types of memory addresses - virtual
> addresses, physical addresses, and DMA addresses. Sometimes
> DMA addresses are called bus addresses or I/O addresses, but
> I'll call them DMA addresses because I think that's the
> clearest name. In a system without an IOMMU, DMA addresses
> and physical addresses are equivalent, but in a system with
> an IOMMU any arbitrary DMA address can be chosen by the user
> to map to a given physical address. For security reasons
> (rowhammer), it is no longer considered safe to expose
> physical addresses to userspace, but it is perfectly fine to
> expose DMA addresses when an IOMMU is present.
>
> DPDK today begins by allocating all of the required
> hugepages, then finds all of the physical addresses for
> those hugepages using /proc/self/pagemap, sorts the
> hugepages by physical address, then remaps the pages to
> contiguous virtual addresses. Later on, if vfio is
> enabled, it asks vfio to pin the hugepages and to set their
> DMA addresses in the IOMMU to be the physical addresses
> discovered earlier. Of course, running as an unprivileged
> user means all of the physical addresses in
> /proc/self/pagemap are just 0, so this doesn't end up
> working. Further, there is no real reason to choose the
> physical address as the DMA address in the IOMMU - it would
> be better to just count up starting at 0.

Why not just use the virtual address as the DMA address in this case, to 
avoid maintaining another kind of address?
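
To make the two options concrete, here is a minimal sketch (plain Python, 
not DPDK code) that compares "count up from 0" with "DMA address == 
virtual address". The 2 MB page size, the example addresses, and the 
function names are all assumptions made for illustration.

```python
HUGEPAGE_SIZE = 2 * 1024 * 1024  # assumed 2 MB hugepages


def iova_identity(vaddrs):
    """Policy A: the DMA address (IOVA) is simply the virtual address."""
    return {va: va for va in vaddrs}


def iova_sequential(vaddrs, page_size=HUGEPAGE_SIZE):
    """Policy B: IOVAs count up from 0, one page per step, in VA order."""
    return {va: i * page_size for i, va in enumerate(sorted(vaddrs))}


def to_iova(table, ptr, page_size=HUGEPAGE_SIZE):
    """Translate a pointer inside a mapped hugepage to its DMA address.

    Assumes every key in `table` is a hugepage-aligned virtual address.
    """
    base = ptr - (ptr % page_size)
    return table[base] + (ptr - base)


# Three hypothetical, hugepage-aligned virtual addresses.
vaddrs = [0x7F0000000000 + i * HUGEPAGE_SIZE for i in range(3)]
identity = iova_identity(vaddrs)
sequential = iova_sequential(vaddrs)

ptr = vaddrs[1] + 0x100  # some buffer inside the second hugepage
identity_iova = to_iova(identity, ptr)
sequential_iova = to_iova(sequential, ptr)
```

With the identity policy the lookup table is redundant, because the 
pointer is already the DMA address. With the sequential policy a 
per-page table has to be kept and consulted for every translation.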

>   Also, because the
> pages are pinned after the virtual to physical mapping is
> looked up, there is a window where a page could be moved.
> Hugepage mappings can be moved on more recent kernels (at
> least 4.x), and the reliability of hugepages having static
> mappings decreases with every kernel release.

Do you mean the kernel might take back a physical page after mapping it 
to a virtual page (perhaps copying the data to another physical page)? 
Could you please point to some links or kernel commits?

> Note that this
> probably means that using uio on recent kernels is subtly
> broken and cannot be supported going forward because there
> is no uio mechanism to pin the memory.
>
> The first open question I have is whether DPDK should allow
> uio at all on recent (4.x) kernels. My current understanding
> is that there is no way to pin memory and hugepages can now
> be moved around, so uio would be unsafe. What does the
> community think here?
>
> My second question is whether the user should be allowed to
> mix uio and vfio usage simultaneously. For vfio, the
> physical addresses are really DMA addresses and are best
> when arbitrarily chosen to appear sequential relative to
> their virtual addresses.

Why "sequential relative to their virtual addresses"? The IOMMU table 
holds the DMA addr -> physical addr mapping. So do we need DMA addresses 
that are "sequential relative to their physical addresses"? Based on your 
analysis above of how hugepages are initialized, wouldn't the virtual 
address be a good candidate for the DMA address?
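
As a rough illustration of the DMA addr -> physical addr mapping, the 
sketch below models the IOMMU table as a dictionary keyed by DMA address. 
It is a toy model in plain Python, not a VFIO call; the page size, the 
physical addresses, and the names are hypothetical. The keys are chosen 
equal to the virtual addresses to show that the device sees a contiguous 
range even when the physical pages are scattered.

```python
PAGE = 2 * 1024 * 1024  # assumed 2 MB hugepages


def translate(iommu_table, dma_addr, page_size=PAGE):
    """Toy IOMMU lookup: DMA address -> physical address."""
    base = dma_addr - (dma_addr % page_size)
    if base not in iommu_table:
        raise LookupError(f"no IOMMU mapping for DMA address {dma_addr:#x}")
    return iommu_table[base] + (dma_addr - base)


# Hypothetical physical pages, deliberately out of order and not adjacent.
phys_pages = [0x40000000, 0x10000000, 0x7FE00000]

# Hypothetical virtual base; DMA address == virtual address for each page.
virt_base = 0x7F0000000000
iommu_table = {virt_base + i * PAGE: pa for i, pa in enumerate(phys_pages)}

# The last byte of page 0 and the first byte of page 1 are adjacent for
# the device, although they are far apart in physical memory.
last_of_page0 = translate(iommu_table, virt_base + PAGE - 1)
first_of_page1 = translate(iommu_table, virt_base + PAGE)
```

In this model nothing requires the DMA addresses to follow the physical 
layout; what matters is that the DMA range handed to the device is 
contiguous wherever the application treats the memory as contiguous.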

Thanks,
Jianfeng
