Date: Fri, 17 Jun 2022 17:38:19 +0100
From: Bruce Richardson
To: Dmitry Kozlyuk
Cc: dev@dpdk.org, stable@dpdk.org, Anatoly Burakov
Subject: Re: [PATCH v2 3/4] doc: give specific instructions for running as non-root
References: <20220607234949.2311884-1-dkozlyuk@nvidia.com>
 <20220617112508.3823291-1-dkozlyuk@nvidia.com>
 <20220617112508.3823291-4-dkozlyuk@nvidia.com>
In-Reply-To: <20220617112508.3823291-4-dkozlyuk@nvidia.com>

On Fri, Jun 17, 2022 at 02:25:07PM +0300, Dmitry Kozlyuk wrote:
> The guide to run DPDK applications as non-root in Linux
> did not provide specific instructions to configure the required access
> and did not explain why each bit is needed.
> The latter is important because running as non-root
> is one of the ways to tighten security and grant minimal permissions.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Dmitry Kozlyuk

Thanks for this, some good changes here. Comments inline below.

/Bruce

> ---
>  doc/guides/linux_gsg/enable_func.rst          | 67 +++++++++++++++++--
>  .../prog_guide/env_abstraction_layer.rst      |  2 +
>  2 files changed, 63 insertions(+), 6 deletions(-)
>
> diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
> index 1df3ab0255..2f908e8b70 100644
> --- a/doc/guides/linux_gsg/enable_func.rst
> +++ b/doc/guides/linux_gsg/enable_func.rst
> @@ -13,13 +13,58 @@ Enabling Additional Functionality
>  Running DPDK Applications Without Root Privileges
>  -------------------------------------------------
>
> -In order to run DPDK as non-root, the following Linux filesystem objects'
> -permissions should be adjusted to ensure that the Linux account being used to
> -run the DPDK application has access to them:
> +The following sections describe generic requirements and configuration
> +for running DPDK applications as non-root.
> +There may be additional requirements documented for some drivers.
>
> -* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
> +Hugepages
> +~~~~~~~~~
>
> -* If the HPET is to be used, ``/dev/hpet``
> +Hugepages must be reserved as root before runing the application as non-root,
> +for example::
> +
> +  sudo dpdk-hugepages.py --reserve 1G
> +
> +If multi-process is not required, running with ``--in-memory``
> +bypasses the need to access hugepage mount point and files within it.
> +Otherwise, hugepage directory must be made accessible
> +for writing to the unprivileged user.
> +A good way for managing multiple applications using hugepages
> +is to mount the filesystem with group permissions
> +and add a supplementary group to each application or container.
> +
> +One option is to use the script provided by this project::
> +
> +  export HUGEDIR=$HOME/huge-1G
> +  mkdir -p $HUGEDIR
> +  sudo dpdk-hugepages.py --mount --directory $HUGEDIR --owner `id -u`:`id -g`
> +
> +In production environment, the OS can manage mount points
> +(`systemd example `_).
> +
> +The ``hugetlb`` filesystem has additional options to guarantee or limit
> +the amount of memory that is possible to allocate using the mount point.
> +Refer to the `documentation `_.
> +
> +If the driver requires using physical addresses (PA),
> +the executable file must be granted additional capabilities:
> +
> +* ``SYS_ADMIN`` to read ``/proc/self/pagemaps``
> +* ``IPC_LOCK`` to lock hugepages in memory

Are either of these necessary if using vfio-pci and VA mode? I have seen it
previously reported that IPC_LOCK is necessary for IOMMU memory mapping for
DMA - at least for docker containers - so I'd like it confirmed that we
don't need them in the in-memory case running on the host. If I get the
chance I'll try double-checking by testing myself.

> +
> +.. code-block:: console
> +
> +  setcap cap_ipc_lock,cap_sys_admin+ep
> +
> +If physical addresses are not accessible,
> +the following message will appear during EAL initialization::
> +
> +  EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
> +
> +It is harmless in case PA are not needed.
> +

While this is probably worth having in the doc, I think we should really
include a note here about using vfio-pci rather than uio and therefore not
needing physical addresses.

> +Resource Limits
> +~~~~~~~~~~~~~~~
>
>  When running as non-root user, there may be some additional resource limits
>  that are imposed by the system. Specifically, the following resource limits may
> @@ -34,7 +79,15 @@ need to be adjusted in order to ensure normal DPDK operation:
>  The above limits can usually be adjusted by editing
>  ``/etc/security/limits.conf`` file, and rebooting.
>
> -Additionally, depending on which kernel driver is in use, the relevant
> +See `Hugepage Mapping `_
> +secton to learn how these limits affect EAL.

Typo: s/secton/section/

> +
> +Device Control
> +~~~~~~~~~~~~~~
> +
> +If the HPET is to be used, ``/dev/hpet`` permissions must be adjusted.
> +

Given that HPET has been off by default for years, I think we can probably
remove this line. Anyone still using it likely already knows this.

> +Depending on which kernel driver is in use, the relevant
>  resources also should be accessible by the user running the DPDK application.
>
>  For ``vfio-pci`` kernel driver, the following Linux file system objects'
> @@ -64,6 +117,8 @@ system objects' permissions should be adjusted:
>     /sys/class/uio/uio0/device/config
>     /sys/class/uio/uio0/device/resource*
>

I think our minimum supported kernel version is now >4.0 so I believe this
uio section should be removed as it's only applicable for earlier kernel
versions.

> +For ``virtio`` PMD in legacy mode, ``SYS_RAWIO`` capability is required
> +for ``iopl()`` call to enable access to PCI IO ports.
>

How "legacy" is legacy-mode? Is it still likely in widespread use that we
need this?

>  Power Management and Power Saving Functionality
>  -----------------------------------------------
> diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
> index 5f0748fba1..70fa099d30 100644
> --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> @@ -228,6 +228,8 @@ Normally, these options do not need to be changed.
>     can later be mapped into that preallocated VA space (if dynamic memory mode
>     is enabled), and can optionally be mapped into it at startup.
>
> +.. _hugepage_mapping:
> +
>  Hugepage Mapping
>  ^^^^^^^^^^^^^^^^
>
> --
> 2.25.1
>
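
For reference on the physical-address point raised in this thread (the
``SYS_ADMIN`` capability and the pagemap message), here is a rough Python
sketch of how a process can check whether the kernel exposes physical frame
numbers to it. This is not DPDK code. The function names are hypothetical, and
the bit layout (bit 63 = page present, bits 0-54 = PFN) is assumed from the
Linux kernel pagemap documentation. On recent kernels an unprivileged read is
expected to succeed but report a PFN of zero, so the check looks at the PFN
and not only at whether the file can be opened.

```python
import ctypes
import os
import struct

PAGEMAP_ENTRY_SIZE = 8       # one 64-bit word per virtual page
PFN_MASK = (1 << 55) - 1     # bits 0-54: page frame number (assumed layout)
PRESENT_BIT = 1 << 63        # bit 63: page is present in RAM


def decode_pagemap_entry(entry):
    """Split a raw 64-bit pagemap entry into (present, pfn)."""
    present = bool(entry & PRESENT_BIT)
    pfn = entry & PFN_MASK if present else 0
    return present, pfn


def read_pagemap_entry(vaddr, pagemap_path="/proc/self/pagemap"):
    """Read the raw pagemap entry covering virtual address vaddr."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    offset = (vaddr // page_size) * PAGEMAP_ENTRY_SIZE
    with open(pagemap_path, "rb") as f:
        f.seek(offset)
        raw = f.read(PAGEMAP_ENTRY_SIZE)
    if len(raw) != PAGEMAP_ENTRY_SIZE:
        raise OSError("short read from pagemap")
    return struct.unpack("=Q", raw)[0]  # native byte order


def can_see_physical_addresses():
    """Return True if this process can obtain a non-zero PFN for its own memory."""
    buf = ctypes.create_string_buffer(os.sysconf("SC_PAGE_SIZE"))
    buf[0] = b"x"  # touch the page so that it is backed by RAM
    try:
        entry = read_pagemap_entry(ctypes.addressof(buf))
    except OSError:  # includes PermissionError on kernels that refuse the open
        return False
    present, pfn = decode_pagemap_entry(entry)
    return present and pfn != 0
```

Calling ``can_see_physical_addresses()`` as an unprivileged user and again as
root (or with ``cap_sys_admin`` granted) should show the difference. When IOVA
as VA is used with vfio-pci, a ``False`` result here would be expected to be
harmless, which is the case the patch describes as "PA are not needed".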