DPDK patches and discussions
From: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
To: Aaron Conole <aconole@redhat.com>, dev@openvswitch.org
Cc: dev@dpdk.org, Flavio Leitner <fbl@sysclose.org>,
	Ansis Atteka <aatteka@ovn.org>
Subject: Re: [dpdk-dev] [ovs-dev] [PATCH v2 0/4] rhel/fedora: non-root OvS out of the box
Date: Fri, 14 Jul 2017 11:59:54 +0100
Message-ID: <2125110d-06ee-715e-e8f4-71e745ba310b@intel.com>
In-Reply-To: <f7tshi2eq5b.fsf@redhat.com>

On 11/07/2017 20:21, Aaron Conole wrote:
> Aaron Conole <aconole@redhat.com> writes:
>
>> Aaron Conole <aconole@redhat.com> writes:
>>
>>> This series attempts to introduce the ability to start and use
>>> Open vSwitch 'out of the box' as a non-root user.  It does this by
>>> modifying the service files to pass the recently introduced --ovs-user
>>> argument around, and by making some minor tweaks to the passwd, group,
>>> and filesystem information.
>>>
>>> I prefixed the packaging work with 'redhat', but if rpm or packaging
>>> is a preferred prefix for that work, I can respin.
>>>
>>> The more controversial changes are:
>>>
>>> * This modifies the /etc/sysconfig/ file on install.
>>> * The dpdk support directly modifies /dev/hugepages with a call to chmod
>>> * A new user 'openvswitch', and up to two new groups 'openvswitch', and
>>>    'hugetlbfs' are created
>>> * A change to soexpand.pl to allow conditional inclusion of dpdk-related
>>>    options
>>>
>> An interesting development has occurred while testing this series.
>>
>> It seems that as part of a rowhammer mitigation, access to
>> /proc/self/pagemap ends up being restricted.  This makes DPDK break in a
>> catastrophic way.
>>
>> One way of mitigating this is to keep the CAP_SYS_ADMIN capability when
>> DPDK is enabled (not sure whether it would be a runtime or compile
>> time change).  This means we end up keeping many root-user level
>> permissions that we probably shouldn't need or want.  I was thinking
>> that when DPDK is compiled in, we would keep the CAP_SYS_ADMIN for the
>> first iteration of DB synchronization, and then drop it after calling
>> DPDK-init.  That would prevent lazy loading, or being able to turn it
>> on without restarting the daemon (which I don't like).
>>
>> Another is to say that if DPDK is enabled at compile time, just don't
>> drop permissions at all.  That approach seems really wrong, but it's a
>> possibility.
>>
>> Not sure what else can be done from the OvS side for this.  I think it
>> could be possible to do something where before dropping privs, we cache
>> the pagemap and then feed it to DPDK during initialization, but that
>> will require work from DPDK side, and I'm not sure if it actually works
>> with DPDK (because I haven't looked into why the pagemap is being read
>> to begin with).
>>
>> So, I'm a bit stuck on this work, and asking for some opinions.
>>
> UPDATE: it seems that with DPDK 17.02+, this has been resolved.  I'll
> wait for resubmit until after the following commit has been applied:
>
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/334893.html
>

As you mentioned, DPDK needs CAP_SYS_ADMIN to be able to *read*
/proc/self/pagemap. The goal is to get the physical addresses of the
hugepages we are using.
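
To illustrate what that lookup involves, here is a minimal sketch (not the
actual EAL code) of how a pagemap entry is located and decoded. It assumes
the layout described in the kernel's pagemap documentation: one 64-bit entry
per base page, bit 63 meaning "page present" and bits 0-54 holding the page
frame number (PFN). The function and macro names are made up for this
example. Without the required privilege the kernel reports the PFN as zero,
which is the failure case we hit.

```c
#include <stdint.h>

#define PAGEMAP_ENTRY_SIZE 8ULL                /* one 64-bit entry per page */
#define PAGEMAP_PRESENT    (1ULL << 63)        /* bit 63: page is in RAM */
#define PAGEMAP_PFN_MASK   ((1ULL << 55) - 1)  /* bits 0-54: PFN */
#define PHYS_ADDR_INVALID  UINT64_MAX          /* hypothetical error value */

/* Byte offset in /proc/self/pagemap of the entry describing virt_addr.
 * page_size is the base page size (e.g. sysconf(_SC_PAGESIZE)). */
static uint64_t pagemap_offset(uint64_t virt_addr, uint64_t page_size)
{
    return (virt_addr / page_size) * PAGEMAP_ENTRY_SIZE;
}

/* Turn a raw pagemap entry into a physical address, or return
 * PHYS_ADDR_INVALID if the page is not present or the PFN is hidden. */
static uint64_t pagemap_entry_to_phys(uint64_t entry, uint64_t virt_addr,
                                      uint64_t page_size)
{
    uint64_t pfn;

    if (!(entry & PAGEMAP_PRESENT))
        return PHYS_ADDR_INVALID;

    pfn = entry & PAGEMAP_PFN_MASK;
    if (pfn == 0)  /* PFN zeroed: caller lacks the needed privilege */
        return PHYS_ADDR_INVALID;

    return pfn * page_size + (virt_addr % page_size);
}
```

A caller would pread() 8 bytes from /proc/self/pagemap at pagemap_offset()
and pass the value to pagemap_entry_to_phys().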

Since DPDK 17.05 (commit below) we are able to run as non-root if we have
an IOMMU, but we still need CAP_SYS_ADMIN if we do not have an IOMMU, as we
need those physical addresses for DMA.

commit cdc242f260e766bd95a658b5e0686a62ec04f5b0
Author: Ben Walker <benjamin.walker@intel.com>
Date:   Tue Jan 31 10:44:53 2017 -0700

     eal/linux: support running as unprivileged user

     For Linux kernel 4.0 and newer, the ability to obtain
     physical page frame numbers for unprivileged users from
     /proc/self/pagemap was removed. Instead, when an IOMMU
     is present, simply choose our own DMA addresses instead.

     Signed-off-by: Ben Walker <benjamin.walker@intel.com>


I have not tried it myself, but I think your suggestion of dropping
privileges after eal_init should work, given that this is currently the
only time we parse /proc/self/pagemap.
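
As a rough, untested sketch of that ordering: keep CAP_SYS_ADMIN until EAL
initialization has finished, then clear it. The helper names below are
hypothetical, and the raw capget(2)/capset(2) syscalls are used only to keep
the example self-contained; a real daemon would more likely go through
libcap or libcap-ng. Note that these syscalls act on the calling thread
only, and that the bounding set is left untouched here.

```c
#include <linux/capability.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fetch the calling thread's capability sets (pid 0 = calling thread). */
static int get_caps(struct __user_cap_header_struct *hdr,
                    struct __user_cap_data_struct *data)
{
    hdr->version = _LINUX_CAPABILITY_VERSION_3;
    hdr->pid = 0;
    return (int)syscall(SYS_capget, hdr, data);
}

/* Returns 1 if CAP_SYS_ADMIN is in the effective set, 0 if not, -1 on error. */
static int has_cap_sys_admin(void)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

    if (get_caps(&hdr, data) != 0)
        return -1;
    return (data[CAP_TO_INDEX(CAP_SYS_ADMIN)].effective &
            CAP_TO_MASK(CAP_SYS_ADMIN)) != 0;
}

/* Clear CAP_SYS_ADMIN from the effective, permitted and inheritable sets.
 * Intended to be called once EAL init (the only pagemap reader) is done. */
static int drop_cap_sys_admin(void)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];
    const unsigned int idx = CAP_TO_INDEX(CAP_SYS_ADMIN);
    const __u32 bit = CAP_TO_MASK(CAP_SYS_ADMIN);

    if (get_caps(&hdr, data) != 0)
        return -1;

    data[idx].effective   &= ~bit;
    data[idx].permitted   &= ~bit;
    data[idx].inheritable &= ~bit;

    return (int)syscall(SYS_capset, &hdr, data);
}
```

The intended call order would be: EAL init first, then
drop_cap_sys_admin(), then the normal datapath. As you noted, this rules out
initializing DPDK lazily later in the daemon's lifetime.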

Thanks,
Sergio
