From: "Tan, Jianfeng" <jianfeng.tan@intel.com>
To: "Walker, Benjamin" <benjamin.walker@intel.com>,
"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Running DPDK as an unprivileged user
Date: Wed, 4 Jan 2017 19:39:18 +0800
Message-ID: <685186b4-e50e-c122-459b-e4635404c3f8@intel.com>
In-Reply-To: <1483044080.11975.1.camel@intel.com>
Hi Benjamin,
On 12/30/2016 4:41 AM, Walker, Benjamin wrote:
> Hi all,
>
> I've been digging in to what it would take to run DPDK as an
> unprivileged user and I have some findings that I thought
> were worthy of discussion. The assumptions here are that I'm
> using a very recent Linux kernel (4.8.15 to be specific) and
> I'm using vfio with my IOMMU enabled. I'm only interested in
> making it possible to run as an unprivileged user in this
> type of environment.
>
> There are a few key things that DPDK needs to do in order to
> run as an unprivileged user:
>
> 1) Allocate hugepages
> 2) Map device resources
> 3) Map hugepage virtual addresses to DMA addresses.
>
> For #1 and #2, DPDK works just fine today. You simply chown
> the relevant resources in sysfs to the desired user and
> everything is happy.
>
> The problem is #3. This currently relies on looking up the
> mappings in /proc/self/pagemap, but the ability to get
> physical addresses in /proc/self/pagemap as an unprivileged
> user was removed from the kernel in the 4.x timeframe due to
> the Rowhammer vulnerability. At this time, it is not
> possible to run DPDK as an unprivileged user on a 4.x Linux
> kernel.
>
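For reference, the pagemap lookup described above can be sketched roughly
as follows. This is a minimal illustration, not the EAL code: the function
names and the PHYS_ADDR_UNKNOWN sentinel are made up for the example. It
relies on the documented pagemap layout (one 64-bit entry per virtual page,
bit 63 = present, bits 0-54 = page frame number). For an unprivileged
caller on a recent kernel the PFN bits read back as 0, so the lookup
reports "unknown".

```c
#define _POSIX_C_SOURCE 200809L  /* for pread() */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PAGEMAP_PRESENT   (1ULL << 63)        /* page is present in RAM */
#define PAGEMAP_PFN_MASK  ((1ULL << 55) - 1)  /* bits 0-54: page frame number */
#define PHYS_ADDR_UNKNOWN UINT64_MAX          /* hypothetical sentinel */

/* Decode one 64-bit pagemap entry into a physical address. */
uint64_t pagemap_entry_to_phys(uint64_t entry, uint64_t page_size,
                               uint64_t offset_in_page)
{
    uint64_t pfn = entry & PAGEMAP_PFN_MASK;

    assert(offset_in_page < page_size);
    /* A zero PFN is what an unprivileged reader gets back. */
    if (!(entry & PAGEMAP_PRESENT) || pfn == 0)
        return PHYS_ADDR_UNKNOWN;
    return pfn * page_size + offset_in_page;
}

/* Look up the physical address behind a virtual address of this process. */
uint64_t virt_to_phys(const void *va)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t vaddr = (uint64_t)(uintptr_t)va;
    uint64_t entry = 0;
    ssize_t n;
    int fd = open("/proc/self/pagemap", O_RDONLY);

    if (fd < 0)
        return PHYS_ADDR_UNKNOWN;
    /* Entries are indexed by virtual page number, 8 bytes each. */
    n = pread(fd, &entry, sizeof(entry),
              (off_t)(vaddr / page_size * sizeof(entry)));
    close(fd);
    if (n != (ssize_t)sizeof(entry))
        return PHYS_ADDR_UNKNOWN;
    return pagemap_entry_to_phys(entry, page_size, vaddr % page_size);
}
```

Calling virt_to_phys() on a touched buffer as root should yield a real
address; as a normal user it should come back as PHYS_ADDR_UNKNOWN, which
is the failure mode discussed here.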
> There is a way to make this work though, which I'll outline
> now. Unfortunately, I think it is going to require some very
> significant changes to the initialization flow in the EAL.
> One bit of background before I go into how to fix this -
> there are three types of memory addresses - virtual
> addresses, physical addresses, and DMA addresses. Sometimes
> DMA addresses are called bus addresses or I/O addresses, but
> I'll call them DMA addresses because I think that's the
> clearest name. In a system without an IOMMU, DMA addresses
> and physical addresses are equivalent, but in a system with
> an IOMMU any arbitrary DMA address can be chosen by the user
> to map to a given physical address. For security reasons
> (rowhammer), it is no longer considered safe to expose
> physical addresses to userspace, but it is perfectly fine to
> expose DMA addresses when an IOMMU is present.
>
> DPDK today begins by allocating all of the required
> hugepages, then finds all of the physical addresses for
> those hugepages using /proc/self/pagemap, sorts the
> hugepages by physical address, then remaps the pages to
> contiguous virtual addresses. Later on and if vfio is
> enabled, it asks vfio to pin the hugepages and to set their
> DMA addresses in the IOMMU to be the physical addresses
> discovered earlier. Of course, running as an unprivileged
> user means all of the physical addresses in
> /proc/self/pagemap are just 0, so this doesn't end up
> working. Further, there is no real reason to choose the
> physical address as the DMA address in the IOMMU - it would
> be better to just count up starting at 0.
Why not just use the virtual address as the DMA address in this case, to
avoid maintaining yet another kind of address?
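To make the suggestion concrete, here is a rough sketch of what I have in
mind, using the type1 VFIO ioctl from <linux/vfio.h>. The helper names are
hypothetical, and container_fd is assumed to be an already configured VFIO
container (type1 IOMMU set, group attached). It also assumes the IOMMU can
address the whole virtual address range the process uses, which I have not
verified for every platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Build a map request whose IOVA (DMA address) equals the virtual address. */
struct vfio_iommu_type1_dma_map make_map_va_as_iova(void *va, size_t len)
{
    struct vfio_iommu_type1_dma_map map;

    assert(len > 0);
    memset(&map, 0, sizeof(map));
    map.argsz = sizeof(map);
    map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    map.vaddr = (uint64_t)(uintptr_t)va;
    map.iova  = map.vaddr;   /* no separate address space to track */
    map.size  = len;
    return map;
}

/* Ask vfio to pin the region and program the IOMMU; returns 0 or -1. */
int map_va_as_iova(int container_fd, void *va, size_t len)
{
    struct vfio_iommu_type1_dma_map map = make_map_va_as_iova(va, len);

    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```

With this, a driver would hand the device the same pointer value the
application sees, and no physical address ever needs to be looked up.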
> Also, because the
> pages are pinned after the virtual to physical mapping is
> looked up, there is a window where a page could be moved.
> Hugepage mappings can be moved on more recent kernels (at
> least 4.x), and the reliability of hugepages having static
> mappings decreases with every kernel release.
Do you mean the kernel might take back a physical page after mapping it to
a virtual page (perhaps copying the data to another physical page)? Could
you please point to some links or kernel commits?
> Note that this
> probably means that using uio on recent kernels is subtly
> broken and cannot be supported going forward because there
> is no uio mechanism to pin the memory.
>
> The first open question I have is whether DPDK should allow
> uio at all on recent (4.x) kernels. My current understanding
> is that there is no way to pin memory and hugepages can now
> be moved around, so uio would be unsafe. What does the
> community think here?
>
> My second question is whether the user should be allowed to
> mix uio and vfio usage simultaneously. For vfio, the
> physical addresses are really DMA addresses and are best
> when arbitrarily chosen to appear sequential relative to
> their virtual addresses.
Why "sequential relative to their virtual addresses"? The IOMMU table
holds the DMA address -> physical address mapping, so wouldn't we need DMA
addresses that are "sequential relative to their physical addresses"?
And based on your analysis above of how hugepages are initialized, isn't
the virtual address a good candidate for the DMA address?
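To compare the two options side by side, a small sketch (all names here
are hypothetical, nothing below is existing EAL code): one scheme counts
DMA addresses up from 0 in virtual-address order, the other reuses the
virtual address directly. Both keep a virtually contiguous buffer
contiguous for the device; the second one just makes the translation the
identity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IOVA_NONE UINT64_MAX   /* hypothetical "not mapped" marker */

struct hugepage_seg {
    uint64_t va;    /* virtual address of the segment */
    uint64_t len;   /* length in bytes */
    uint64_t iova;  /* DMA address programmed into the IOMMU */
};

/* Scheme A: DMA addresses count up from 0, in the order given. */
void assign_iova_counting(struct hugepage_seg *segs, size_t n)
{
    uint64_t next = 0;

    assert(segs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        segs[i].iova = next;
        next += segs[i].len;
    }
}

/* Scheme B: DMA address is simply the virtual address. */
void assign_iova_identity(struct hugepage_seg *segs, size_t n)
{
    assert(segs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        segs[i].iova = segs[i].va;
}

/* Translate a virtual address to a DMA address via the segment table. */
uint64_t va_to_iova(const struct hugepage_seg *segs, size_t n, uint64_t va)
{
    for (size_t i = 0; i < n; i++) {
        if (va >= segs[i].va && va - segs[i].va < segs[i].len)
            return segs[i].iova + (va - segs[i].va);
    }
    return IOVA_NONE;
}
```

With scheme A every translation has to go through the table; with scheme B
va_to_iova() always returns its input for mapped memory, so the table is
only needed to know what is mapped.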
Thanks,
Jianfeng