* [PATCH 3/4] doc: give specific instructions for running as non-root
From: Dmitry Kozlyuk @ 2022-06-07 23:49 UTC (permalink / raw)
To: dev; +Cc: Thomas Monjalon, stable, Anatoly Burakov
The guide to run DPDK applications as non-root in Linux
did not provide specific instructions to configure the required access
and did not explain why each bit is needed.
The latter is important because running as non-root
is one of the ways to tighten security and grant minimal permissions.
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
---
doc/guides/linux_gsg/enable_func.rst | 53 ++++++++++++++++---
.../prog_guide/env_abstraction_layer.rst | 2 +
2 files changed, 49 insertions(+), 6 deletions(-)
diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
index 1df3ab0255..c6975ce8bf 100644
--- a/doc/guides/linux_gsg/enable_func.rst
+++ b/doc/guides/linux_gsg/enable_func.rst
@@ -13,13 +13,46 @@ Enabling Additional Functionality
Running DPDK Applications Without Root Privileges
-------------------------------------------------
-In order to run DPDK as non-root, the following Linux filesystem objects'
-permissions should be adjusted to ensure that the Linux account being used to
-run the DPDK application has access to them:
+The following sections describe generic requirements and configuration
+for running DPDK applications as non-root.
+There may be additional requirements documented for some drivers.
-* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
+Hugepages
+~~~~~~~~~
-* If the HPET is to be used, ``/dev/hpet``
+Hugepages must be reserved as root before running the application as non-root,
+for example::
+
+ sudo dpdk-hugepages.py --reserve 1G
+
+If multi-process is not required, running with ``--in-memory``
+bypasses the need to access hugepage mount point and files within it.
+Otherwise, hugepage directory must be made accessible
+for writing to the unprivileged user, for example::
+
+ export HUGEDIR=$HOME/huge-1G
+ mkdir -p $HUGEDIR
+ sudo dpdk-hugepages.py --mount --directory $HUGEDIR --owner `id -u`:`id -g`
+
+If the driver requires using physical addresses (PA),
+the executable file must be granted additional capabilities:
+
+* ``SYS_ADMIN`` to read ``/proc/self/pagemap``
+* ``IPC_LOCK`` to lock hugepages in memory
+
+.. code-block:: console
+
+ setcap cap_ipc_lock,cap_sys_admin+ep <executable>
+
+If physical addresses are not accessible,
+the following message will appear during EAL initialization::
+
+ EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
+
+It is harmless in case PA are not needed.
+
+Resource Limits
+~~~~~~~~~~~~~~~
When running as non-root user, there may be some additional resource limits
that are imposed by the system. Specifically, the following resource limits may
@@ -34,7 +67,15 @@ need to be adjusted in order to ensure normal DPDK operation:
The above limits can usually be adjusted by editing
``/etc/security/limits.conf`` file, and rebooting.
-Additionally, depending on which kernel driver is in use, the relevant
+See `Hugepage Mapping <hugepage_mapping>`_
+secton to learn how these limits affect EAL.
+
+Device Control
+~~~~~~~~~~~~~~
+
+If the HPET is to be used, ``/dev/hpet`` permissions must be adjusted.
+
+Depending on which kernel driver is in use, the relevant
resources also should be accessible by the user running the DPDK application.
For ``vfio-pci`` kernel driver, the following Linux file system objects'
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 5f0748fba1..70fa099d30 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -228,6 +228,8 @@ Normally, these options do not need to be changed.
can later be mapped into that preallocated VA space (if dynamic memory mode
is enabled), and can optionally be mapped into it at startup.
+.. _hugepage_mapping:
+
Hugepage Mapping
^^^^^^^^^^^^^^^^
--
2.25.1
* [PATCH 4/4] doc: update instructions for running as non-root for MLX5
From: Dmitry Kozlyuk @ 2022-06-07 23:49 UTC (permalink / raw)
To: dev; +Cc: Thomas Monjalon, stable
Reference the common guide for generic setup.
Remove excessive capabilities from the recommended list.
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
---
doc/guides/platform/mlx5.rst | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/doc/guides/platform/mlx5.rst b/doc/guides/platform/mlx5.rst
index 64a4c5e76e..956a72fadf 100644
--- a/doc/guides/platform/mlx5.rst
+++ b/doc/guides/platform/mlx5.rst
@@ -404,25 +404,23 @@ The device can be bound again at this point.
Run as Non-Root
^^^^^^^^^^^^^^^
-In order to run as a non-root user,
-some capabilities must be granted to the application::
+Hugepage and resource limit setup is documented
+in the :ref:`common Linux guide <Running_Without_Root_Privileges>`.
+This PMD does not require physical addresses,
+so capability configuration is not needed to access hugepages.
+Note that physical addresses may be required by other drivers.
- setcap cap_sys_admin,cap_net_admin,cap_net_raw,cap_ipc_lock+ep <dpdk-app>
+Additional capabilities must be granted to the application::
-Below are the reasons for the need of each capability:
-
-``cap_sys_admin``
- When using physical addresses (PA mode), with Linux >= 4.0,
- for access to ``/proc/self/pagemap``.
+ setcap cap_net_raw,cap_net_admin,cap_sys_rawio+ep <executable>
-``cap_net_admin``
- For device configuration.
+Below are the reasons for the need of each capability:
``cap_net_raw``
For raw ethernet queue allocation through kernel driver.
-``cap_ipc_lock``
- For DMA memory pinning.
+``cap_net_admin``
+ For device configuration, like setting link status or MTU.
Windows Environment
--
2.25.1
* Re: [PATCH 3/4] doc: give specific instructions for running as non-root
From: Stephen Hemminger @ 2022-06-08 0:03 UTC (permalink / raw)
To: Dmitry Kozlyuk; +Cc: dev, Thomas Monjalon, stable, Anatoly Burakov
On Wed, 8 Jun 2022 02:49:48 +0300
Dmitry Kozlyuk <dkozlyuk@nvidia.com> wrote:
> The guide to run DPDK applications as non-root in Linux
> did not provide specific instructions to configure the required access
> and did not explain why each bit is needed.
> The latter is important because running as non-root
> is one of the ways to tighten security and grant minimal permissions.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
If running with multiple containers, it is often better to have the OS
take care of mounting hugepages:
https://github.com/systemd/systemd/blob/main/units/dev-hugepages.mount
A good way to manage multiple applications using hugepages
is to mount the filesystem with group permissions and add a supplementary
group to each container.
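As an illustration of the group-permission approach (a minimal sketch, not part of the original message): the ``hugetlbfs`` mount options ``gid=``, ``mode=`` and ``pagesize=`` are documented by the kernel, while the numeric GID ``1001`` and the mount point ``/mnt/huge-dpdk`` below are hypothetical placeholders. The helper only prints the command so that it can be reviewed before being run as root.

```shell
# Hypothetical mount point; adjust to the local layout.
HUGE_DIR=/mnt/huge-dpdk

# Build a hugetlbfs mount command that grants access through a group.
# gid=, mode= and pagesize= are hugetlbfs mount options.
hugetlbfs_mount_cmd() {
    # $1: numeric GID of the shared group, $2: mount point
    printf 'mount -t hugetlbfs -o gid=%s,mode=0770,pagesize=1G none %s\n' "$1" "$2"
}

# Print the command for review; run the printed line as root to apply it.
# GID 1001 is an assumed example value.
hugetlbfs_mount_cmd 1001 "$HUGE_DIR"
```

Each application or container would then be started with that group as a supplementary group, so that it can create files in the mount point without owning it.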
* Re: [PATCH 4/4] doc: update instructions for running as non-root for MLX5
From: Stephen Hemminger @ 2022-06-08 0:13 UTC (permalink / raw)
To: Dmitry Kozlyuk; +Cc: dev, Thomas Monjalon, stable
On Wed, 8 Jun 2022 02:49:49 +0300
Dmitry Kozlyuk <dkozlyuk@nvidia.com> wrote:
> Reference the common guide for generic setup.
> Remove excessive capabilities from the recommended list.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> ---
> doc/guides/platform/mlx5.rst | 22 ++++++++++------------
> 1 file changed, 10 insertions(+), 12 deletions(-)
This change needs additional changes to make it correct English grammar.
> diff --git a/doc/guides/platform/mlx5.rst b/doc/guides/platform/mlx5.rst
> index 64a4c5e76e..956a72fadf 100644
> --- a/doc/guides/platform/mlx5.rst
> +++ b/doc/guides/platform/mlx5.rst
> @@ -404,25 +404,23 @@ The device can be bound again at this point.
> Run as Non-Root
> ^^^^^^^^^^^^^^^
>
> -In order to run as a non-root user,
> -some capabilities must be granted to the application::
> +Hugepage and resource limit setup is documented
Subject is plural so verb must be plural => are documented
> +in the :ref:`common Linux guide <Running_Without_Root_Privileges>`.
> +This PMD does not require physical addresses,
> +so capability configuration is not needed to access hugepages.
In technical writing, "therefore" is preferred over "so",
and you need a preposition. Please reword to something like:
"This PMD can operate without direct physical memory and hugepages
are not required."
Often applications will keep using hugepages (makes them NIC independent)
and in that case they would still need permissions.
> +Note that physical addresses may be required by other drivers.
>
> - setcap cap_sys_admin,cap_net_admin,cap_net_raw,cap_ipc_lock+ep <dpdk-app>
> +Additional capabilities must be granted to the application::
>
> -Below are the reasons for the need of each capability:
> -
> -``cap_sys_admin``
> - When using physical addresses (PA mode), with Linux >= 4.0,
> - for access to ``/proc/self/pagemap``.
> + setcap cap_net_raw,cap_net_admin,cap_sys_rawio+ep <executable>
>
> -``cap_net_admin``
> - For device configuration.
> +Below are the reasons for the need of each capability:
>
> ``cap_net_raw``
> For raw ethernet queue allocation through kernel driver.
>
> -``cap_ipc_lock``
> - For DMA memory pinning.
> +``cap_net_admin``
> + For device configuration, like setting link status or MTU.
>
The most common use case for running as non-root is a container system.
In that case, capabilities are managed by the container service (systemd, Docker, etc.)
and not by setting filesystem capabilities.
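As a hedged illustration of the service-manager route (not part of the original message): systemd can grant ambient capabilities to a process running as an unprivileged user through ``AmbientCapabilities=``. The sketch below converts the ``setcap``-style list from the patch into that form and prints a ``systemd-run`` invocation; the user name ``dpdk`` and the application path are hypothetical placeholders.

```shell
# Convert a setcap-style list (cap_net_raw,cap_net_admin) into the
# space-separated upper-case form used by systemd's AmbientCapabilities=.
to_systemd_caps() {
    printf '%s\n' "$1" | tr ',' ' ' | tr '[:lower:]' '[:upper:]'
}

# Capability list taken from the patch under review.
CAPS=$(to_systemd_caps "cap_net_raw,cap_net_admin,cap_sys_rawio")

# Print a transient-unit invocation for review; it must be run as root.
# The user "dpdk" and the application path are assumed placeholders.
echo "systemd-run --uid=dpdk -p AmbientCapabilities=\"$CAPS\" /usr/local/bin/dpdk-app"
```

With this approach the executable file itself carries no capabilities; they exist only for the lifetime of the service.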
* [PATCH v2 3/4] doc: give specific instructions for running as non-root
From: Dmitry Kozlyuk @ 2022-06-17 11:25 UTC (permalink / raw)
To: dev; +Cc: stable, Anatoly Burakov
The guide to run DPDK applications as non-root in Linux
did not provide specific instructions to configure the required access
and did not explain why each bit is needed.
The latter is important because running as non-root
is one of the ways to tighten security and grant minimal permissions.
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
---
doc/guides/linux_gsg/enable_func.rst | 67 +++++++++++++++++--
.../prog_guide/env_abstraction_layer.rst | 2 +
2 files changed, 63 insertions(+), 6 deletions(-)
diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
index 1df3ab0255..2f908e8b70 100644
--- a/doc/guides/linux_gsg/enable_func.rst
+++ b/doc/guides/linux_gsg/enable_func.rst
@@ -13,13 +13,58 @@ Enabling Additional Functionality
Running DPDK Applications Without Root Privileges
-------------------------------------------------
-In order to run DPDK as non-root, the following Linux filesystem objects'
-permissions should be adjusted to ensure that the Linux account being used to
-run the DPDK application has access to them:
+The following sections describe generic requirements and configuration
+for running DPDK applications as non-root.
+There may be additional requirements documented for some drivers.
-* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
+Hugepages
+~~~~~~~~~
-* If the HPET is to be used, ``/dev/hpet``
+Hugepages must be reserved as root before running the application as non-root,
+for example::
+
+ sudo dpdk-hugepages.py --reserve 1G
+
+If multi-process is not required, running with ``--in-memory``
+bypasses the need to access hugepage mount point and files within it.
+Otherwise, hugepage directory must be made accessible
+for writing to the unprivileged user.
+A good way for managing multiple applications using hugepages
+is to mount the filesystem with group permissions
+and add a supplementary group to each application or container.
+
+One option is to use the script provided by this project::
+
+ export HUGEDIR=$HOME/huge-1G
+ mkdir -p $HUGEDIR
+ sudo dpdk-hugepages.py --mount --directory $HUGEDIR --owner `id -u`:`id -g`
+
+In production environment, the OS can manage mount points
+(`systemd example <https://github.com/systemd/systemd/blob/main/units/dev-hugepages.mount>`_).
+
+The ``hugetlb`` filesystem has additional options to guarantee or limit
+the amount of memory that is possible to allocate using the mount point.
+Refer to the `documentation <https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt>`_.
+
+If the driver requires using physical addresses (PA),
+the executable file must be granted additional capabilities:
+
+* ``SYS_ADMIN`` to read ``/proc/self/pagemap``
+* ``IPC_LOCK`` to lock hugepages in memory
+
+.. code-block:: console
+
+ setcap cap_ipc_lock,cap_sys_admin+ep <executable>
+
+If physical addresses are not accessible,
+the following message will appear during EAL initialization::
+
+ EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
+
+It is harmless in case PA are not needed.
+
+Resource Limits
+~~~~~~~~~~~~~~~
When running as non-root user, there may be some additional resource limits
that are imposed by the system. Specifically, the following resource limits may
@@ -34,7 +79,15 @@ need to be adjusted in order to ensure normal DPDK operation:
The above limits can usually be adjusted by editing
``/etc/security/limits.conf`` file, and rebooting.
-Additionally, depending on which kernel driver is in use, the relevant
+See `Hugepage Mapping <hugepage_mapping>`_
+secton to learn how these limits affect EAL.
+
+Device Control
+~~~~~~~~~~~~~~
+
+If the HPET is to be used, ``/dev/hpet`` permissions must be adjusted.
+
+Depending on which kernel driver is in use, the relevant
resources also should be accessible by the user running the DPDK application.
For ``vfio-pci`` kernel driver, the following Linux file system objects'
@@ -64,6 +117,8 @@ system objects' permissions should be adjusted:
/sys/class/uio/uio0/device/config
/sys/class/uio/uio0/device/resource*
+For ``virtio`` PMD in legacy mode, ``SYS_RAWIO`` capability is required
+for ``iopl()`` call to enable access to PCI IO ports.
Power Management and Power Saving Functionality
-----------------------------------------------
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 5f0748fba1..70fa099d30 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -228,6 +228,8 @@ Normally, these options do not need to be changed.
can later be mapped into that preallocated VA space (if dynamic memory mode
is enabled), and can optionally be mapped into it at startup.
+.. _hugepage_mapping:
+
Hugepage Mapping
^^^^^^^^^^^^^^^^
--
2.25.1
* [PATCH v2 4/4] doc: update instructions for running as non-root for MLX5
From: Dmitry Kozlyuk @ 2022-06-17 11:25 UTC (permalink / raw)
To: dev; +Cc: stable
Reference the common guide for generic setup.
Remove excessive capabilities from the recommended list.
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
---
doc/guides/platform/mlx5.rst | 31 ++++++++++++++++++-------------
1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/doc/guides/platform/mlx5.rst b/doc/guides/platform/mlx5.rst
index 64a4c5e76e..18d38f3488 100644
--- a/doc/guides/platform/mlx5.rst
+++ b/doc/guides/platform/mlx5.rst
@@ -404,25 +404,30 @@ The device can be bound again at this point.
Run as Non-Root
^^^^^^^^^^^^^^^
-In order to run as a non-root user,
-some capabilities must be granted to the application::
+Hugepage and resource limit setup are documented
+in the :ref:`common Linux guide <Running_Without_Root_Privileges>`.
+This PMD can operate without access to physical addresses,
+therefore it does not require ``SYS_ADMIN`` to access ``/proc/self/pagemap``.
+Note that this requirement may still come from other drivers.
- setcap cap_sys_admin,cap_net_admin,cap_net_raw,cap_ipc_lock+ep <dpdk-app>
+Below are additional capabilities that must be granted to the application
+with the reasons for the need of each capability:
-Below are the reasons for the need of each capability:
+``NET_RAW``
+ For raw Ethernet queue allocation through the kernel driver.
-``cap_sys_admin``
- When using physical addresses (PA mode), with Linux >= 4.0,
- for access to ``/proc/self/pagemap``.
+``NET_ADMIN``
+ For device configuration, like setting link status or MTU.
-``cap_net_admin``
- For device configuration.
+``SYS_RAWIO``
+ For using group 1 and above (software steering) in Flow API.
-``cap_net_raw``
- For raw ethernet queue allocation through kernel driver.
+They can be manually granted for a specific executable file::
-``cap_ipc_lock``
- For DMA memory pinning.
+ setcap cap_net_raw,cap_net_admin,cap_sys_rawio+ep <executable>
+
+Alternatively, a service manager or a container runtime
+may configure the capabilities for a process.
Windows Environment
--
2.25.1
* RE: [PATCH 4/4] doc: update instructions for running as non-root for MLX5
From: Dmitry Kozlyuk @ 2022-06-17 11:26 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, NBU-Contact-Thomas Monjalon (EXTERNAL), stable
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday, June 8, 2022 3:14 AM
> [...]
> This change needs additional changes to make it correct English
> grammar.
Thank you for the useful comments on this and other patches.
I hope you don't mind that I took some sentences almost verbatim for v2.
> [...]
> > +This PMD does not require physical addresses,
> > +so capability configuration is not needed to access hugepages.
>
> In technical writing, "therefore" is preferred over "so",
> and you need a preposition. Please reword to something like:
>
> "This PMD can operate without direct physical memory and hugepages
> are not required."
>
> Often applications will keep using hugepages (makes them NIC independent)
> and in that case they would still need permissions.
The MLX5 PMD uses hugepages but, unlike most other HW drivers,
does not use physical addresses.
I tried to make this clearer in v2.
* Re: [PATCH v2 3/4] doc: give specific instructions for running as non-root
From: Bruce Richardson @ 2022-06-17 16:38 UTC (permalink / raw)
To: Dmitry Kozlyuk; +Cc: dev, stable, Anatoly Burakov
On Fri, Jun 17, 2022 at 02:25:07PM +0300, Dmitry Kozlyuk wrote:
> The guide to run DPDK applications as non-root in Linux
> did not provide specific instructions to configure the required access
> and did not explain why each bit is needed.
> The latter is important because running as non-root
> is one of the ways to tighten security and grant minimal permissions.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Thanks for this, some good changes here. Comments inline below.
/Bruce
> ---
> doc/guides/linux_gsg/enable_func.rst | 67 +++++++++++++++++--
> .../prog_guide/env_abstraction_layer.rst | 2 +
> 2 files changed, 63 insertions(+), 6 deletions(-)
>
> diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
> index 1df3ab0255..2f908e8b70 100644
> --- a/doc/guides/linux_gsg/enable_func.rst
> +++ b/doc/guides/linux_gsg/enable_func.rst
> @@ -13,13 +13,58 @@ Enabling Additional Functionality
> Running DPDK Applications Without Root Privileges
> -------------------------------------------------
>
> -In order to run DPDK as non-root, the following Linux filesystem objects'
> -permissions should be adjusted to ensure that the Linux account being used to
> -run the DPDK application has access to them:
> +The following sections describe generic requirements and configuration
> +for running DPDK applications as non-root.
> +There may be additional requirements documented for some drivers.
>
> -* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
> +Hugepages
> +~~~~~~~~~
>
> -* If the HPET is to be used, ``/dev/hpet``
> +Hugepages must be reserved as root before running the application as non-root,
> +for example::
> +
> + sudo dpdk-hugepages.py --reserve 1G
> +
> +If multi-process is not required, running with ``--in-memory``
> +bypasses the need to access hugepage mount point and files within it.
> +Otherwise, hugepage directory must be made accessible
> +for writing to the unprivileged user.
> +A good way for managing multiple applications using hugepages
> +is to mount the filesystem with group permissions
> +and add a supplementary group to each application or container.
> +
> +One option is to use the script provided by this project::
> +
> + export HUGEDIR=$HOME/huge-1G
> + mkdir -p $HUGEDIR
> + sudo dpdk-hugepages.py --mount --directory $HUGEDIR --owner `id -u`:`id -g`
> +
> +In production environment, the OS can manage mount points
> +(`systemd example <https://github.com/systemd/systemd/blob/main/units/dev-hugepages.mount>`_).
> +
> +The ``hugetlb`` filesystem has additional options to guarantee or limit
> +the amount of memory that is possible to allocate using the mount point.
> +Refer to the `documentation <https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt>`_.
> +
> +If the driver requires using physical addresses (PA),
> +the executable file must be granted additional capabilities:
> +
> +* ``SYS_ADMIN`` to read ``/proc/self/pagemap``
> +* ``IPC_LOCK`` to lock hugepages in memory
Are either of these necessary if using vfio-pci and VA mode? I have seen it
previously reported that IPC_LOCK is necessary for IOMMU memory mapping for
DMA - at least for docker containers - so I'd like it confirmed that we
don't need them in the in-memory case running on the host. If I get the
chance I'll try double-checking by testing myself.
> +
> +.. code-block:: console
> +
> + setcap cap_ipc_lock,cap_sys_admin+ep <executable>
> +
> +If physical addresses are not accessible,
> +the following message will appear during EAL initialization::
> +
> + EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
> +
> +It is harmless in case PA are not needed.
> +
While this is probably worth having in the doc, I think we should really
include a note here about using vfio-pci rather than uio and therefore not
needing physical addresses.
> +Resource Limits
> +~~~~~~~~~~~~~~~
>
> When running as non-root user, there may be some additional resource limits
> that are imposed by the system. Specifically, the following resource limits may
> @@ -34,7 +79,15 @@ need to be adjusted in order to ensure normal DPDK operation:
> The above limits can usually be adjusted by editing
> ``/etc/security/limits.conf`` file, and rebooting.
>
> -Additionally, depending on which kernel driver is in use, the relevant
> +See `Hugepage Mapping <hugepage_mapping>`_
> +secton to learn how these limits affect EAL.
Typo: s/secton/section/
> +
> +Device Control
> +~~~~~~~~~~~~~~
> +
> +If the HPET is to be used, ``/dev/hpet`` permissions must be adjusted.
> +
Given that HPET has been off by default for years, I think we can probably
remove this line. Anyone still using it likely already knows this.
> +Depending on which kernel driver is in use, the relevant
> resources also should be accessible by the user running the DPDK application.
>
> For ``vfio-pci`` kernel driver, the following Linux file system objects'
> @@ -64,6 +117,8 @@ system objects' permissions should be adjusted:
> /sys/class/uio/uio0/device/config
> /sys/class/uio/uio0/device/resource*
>
I think our minimum supported kernel version is now >4.0 so I believe this
uio section should be removed as it's only applicable for earlier kernel
versions.
> +For ``virtio`` PMD in legacy mode, ``SYS_RAWIO`` capability is required
> +for ``iopl()`` call to enable access to PCI IO ports.
>
How "legacy" is legacy-mode? Is it still likely in widespread use that we
need this?
> Power Management and Power Saving Functionality
> -----------------------------------------------
> diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
> index 5f0748fba1..70fa099d30 100644
> --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> @@ -228,6 +228,8 @@ Normally, these options do not need to be changed.
> can later be mapped into that preallocated VA space (if dynamic memory mode
> is enabled), and can optionally be mapped into it at startup.
>
> +.. _hugepage_mapping:
> +
> Hugepage Mapping
> ^^^^^^^^^^^^^^^^
>
> --
> 2.25.1
>
* RE: [PATCH v2 3/4] doc: give specific instructions for running as non-root
From: Dmitry Kozlyuk @ 2022-06-20 6:10 UTC (permalink / raw)
To: Bruce Richardson; +Cc: dev, stable, Anatoly Burakov
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Friday, June 17, 2022 7:38 PM
> > [...]
> > +If the driver requires using physical addresses (PA),
> > +the executable file must be granted additional capabilities:
> > +
> > +* ``SYS_ADMIN`` to read ``/proc/self/pagemap``
> > +* ``IPC_LOCK`` to lock hugepages in memory
>
> Are either of these necessary if using vfio-pci and VA mode? I have
> seen it previously reported that IPC_LOCK is necessary for IOMMU
> memory mapping for DMA - at least for docker containers - so I'd
> like it confirmed that we don't need them in the in-memory case
> running on the host. If I get the chance I'll try double-checking
> by testing myself.
Sorry, I don't have a physical device using vfio-pci to check.
MLX5 that I have tested doesn't need these capabilities,
but it locks memory from the kernel side.
Note that --in-memory doesn't imply --iova-mode=va.
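To make that distinction concrete, here is a small sketch (added for illustration, not part of the original message) that passes both EAL options explicitly. ``dpdk-testpmd`` stands in for any DPDK application, and the helper name ``eal_cmdline`` is hypothetical. Whether VA mode can actually be used depends on the driver and IOMMU setup, so the command is only printed.

```shell
# Build a command line that requests in-memory mode and IOVA-as-VA
# explicitly, since --in-memory alone does not select the IOVA mode.
eal_cmdline() {
    # $1: application binary; remaining arguments are passed through.
    app=$1
    shift
    printf '%s\n' "$app --in-memory --iova-mode=va $*"
}

# Print the command for review instead of running it.
eal_cmdline dpdk-testpmd -- -i
```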
>
> > +
> > +.. code-block:: console
> > +
> > + setcap cap_ipc_lock,cap_sys_admin+ep <executable>
> > +
> > +If physical addresses are not accessible,
> > +the following message will appear during EAL initialization::
> > +
> > + EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
> > +
> > +It is harmless in case PA are not needed.
> > +
>
> While this is probably worth having in the doc, I think we should really
> include a note here about using vfio-pci rather than uio and therefore not
> needing physical addresses.
A note won't harm. There are also non-PCI devices, though.
> > +For ``virtio`` PMD in legacy mode, ``SYS_RAWIO`` capability is required
> > +for ``iopl()`` call to enable access to PCI IO ports.
> >
>
> How "legacy" is legacy-mode? Is it still likely in widespread use that
> we need this?
I don't really know.
The spec says that legacy support is optional
(2.2.3 Legacy Interface: A Note on Feature Bits) and it aims
to reduce the chance of a legacy driver attempting to drive the device
(4.1.2.1 Device Requirements: PCI Device Discovery).
OTOH, DPDK supports it and requirements must be documented.
I can add a line suggesting to use modern virtio,
but also don't mind removing this.
I'll address skipped comments in v3, thanks.
* Re: [PATCH v2 3/4] doc: give specific instructions for running as non-root
From: Bruce Richardson @ 2022-06-20 8:37 UTC (permalink / raw)
To: Dmitry Kozlyuk; +Cc: dev, stable, Anatoly Burakov
On Mon, Jun 20, 2022 at 06:10:37AM +0000, Dmitry Kozlyuk wrote:
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > Sent: Friday, June 17, 2022 7:38 PM
> > > [...]
> > > +If the driver requires using physical addresses (PA),
> > > +the executable file must be granted additional capabilities:
> > > +
> > > +* ``SYS_ADMIN`` to read ``/proc/self/pagemap``
> > > +* ``IPC_LOCK`` to lock hugepages in memory
> >
> > Are either of these necessary if using vfio-pci and VA mode? I have
> > seen it previously reported that IPC_LOCK is necessary for IOMMU
> > memory mapping for DMA - at least for docker containers - so I'd
> > like it confirmed that we don't need them in the in-memory case
> > running on the host. If I get the chance I'll try double-checking
> > by testing myself.
>
> Sorry, I don't have a physical device using vfio-pci to check.
> MLX5 that I have tested doesn't need these capabilities,
> but it locks memory from the kernel side.
> Note that --in-memory doesn't imply --iova-mode=va.
>
> >
> > > +
> > > +.. code-block:: console
> > > +
> > > + setcap cap_ipc_lock,cap_sys_admin+ep <executable>
> > > +
> > > +If physical addresses are not accessible,
> > > +the following message will appear during EAL initialization::
> > > +
> > > + EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
> > > +
> > > +It is harmless in case PA are not needed.
> > > +
> >
> > While this is probably worth having in the doc, I think we should really
> > include a note here about using vfio-pci rather than uio and therefore not
> > needing physical addresses.
>
> A note won't harm. There are also non-PCI devices, though.
>
> > > +For ``virtio`` PMD in legacy mode, ``SYS_RAWIO`` capability is required
> > > +for ``iopl()`` call to enable access to PCI IO ports.
> > >
> >
> > How "legacy" is legacy-mode? Is it still likely in widespread use that
> > we need this?
>
> I don't really know.
> The spec says that legacy support is optional
> (2.2.3 Legacy Interface: A Note on Feature Bits) and it aims
> to reduce the chance of a legacy driver attempting to drive the device
> (4.1.2.1 Device Requirements: PCI Device Discovery).
> OTOH, DPDK supports it and requirements must be documented.
> I can add a line suggesting to use modern virtio,
> but also don't mind removing this.
>
I suppose the main question for this legacy virtio bit is where it should
be documented, more than if it should be. Given this is a GSG, we should
try and avoid getting too deep into driver-specific issues, so I think we
should omit legacy virtio here, but have it documented in the relevant
virtio-specific doc. Does that seem reasonable?
* [PATCH v3 3/5] doc: give specific instructions for running as non-root
[not found] ` <20220624084817.63145-1-dkozlyuk@nvidia.com>
@ 2022-06-24 8:48 ` Dmitry Kozlyuk
2022-06-24 9:09 ` Bruce Richardson
2022-06-24 8:48 ` [PATCH v3 4/5] doc: update instructions for running as non-root for MLX5 Dmitry Kozlyuk
[not found] ` <20220624131956.75160-1-dkozlyuk@nvidia.com>
2 siblings, 1 reply; 16+ messages in thread
From: Dmitry Kozlyuk @ 2022-06-24 8:48 UTC (permalink / raw)
To: dev; +Cc: stable, Anatoly Burakov
The guide to run DPDK applications as non-root in Linux
did not provide specific instructions to configure the required access
and did not explain why each bit is needed.
The latter is important because running as non-root
is one of the ways to tighten security and grant minimal permissions.
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
---
doc/guides/linux_gsg/enable_func.rst | 89 +++++++++++++------
.../prog_guide/env_abstraction_layer.rst | 2 +
2 files changed, 64 insertions(+), 27 deletions(-)
diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
index 1df3ab0255..0b57417c94 100644
--- a/doc/guides/linux_gsg/enable_func.rst
+++ b/doc/guides/linux_gsg/enable_func.rst
@@ -13,13 +13,63 @@ Enabling Additional Functionality
Running DPDK Applications Without Root Privileges
-------------------------------------------------
-In order to run DPDK as non-root, the following Linux filesystem objects'
-permissions should be adjusted to ensure that the Linux account being used to
-run the DPDK application has access to them:
+The following sections describe generic requirements and configuration
+for running DPDK applications as non-root.
+There may be additional requirements documented for some drivers.
-* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
+Hugepages
+~~~~~~~~~
-* If the HPET is to be used, ``/dev/hpet``
+Hugepages must be reserved as root before running the application as non-root,
+for example::
+
+ sudo dpdk-hugepages.py --reserve 1G
+
+If multi-process is not required, running with ``--in-memory``
+bypasses the need to access the hugepage mount point and files within it.
+Otherwise, the hugepage directory must be made writable
+by the unprivileged user.
+A good way for managing multiple applications using hugepages
+is to mount the filesystem with group permissions
+and add a supplementary group to each application or container.
+
+One option is to use the script provided by this project::
+
+ export HUGEDIR=$HOME/huge-1G
+ mkdir -p $HUGEDIR
+ sudo dpdk-hugepages.py --mount --directory $HUGEDIR --owner `id -u`:`id -g`
+
+In a production environment, the OS can manage mount points
+(`systemd example <https://github.com/systemd/systemd/blob/main/units/dev-hugepages.mount>`_).
+
+The ``hugetlb`` filesystem has additional options to guarantee or limit
+the amount of memory that is possible to allocate using the mount point.
+Refer to the `documentation <https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt>`_.
+
+If the driver requires using physical addresses (PA),
+the executable file must be granted additional capabilities:
+
+* ``SYS_ADMIN`` to read ``/proc/self/pagemap``
+* ``IPC_LOCK`` to lock hugepages in memory
+
+.. code-block:: console
+
+ setcap cap_ipc_lock,cap_sys_admin+ep <executable>
+
+If physical addresses are not accessible,
+the following message will appear during EAL initialization::
+
+ EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
+
+It is harmless if physical addresses are not needed.
+
+.. note::
+
+ Using the ``vfio-pci`` kernel driver, if applicable, can eliminate the need
+ for physical addresses and therefore reduce the permission requirements.
+
+Resource Limits
+~~~~~~~~~~~~~~~
When running as non-root user, there may be some additional resource limits
that are imposed by the system. Specifically, the following resource limits may
@@ -34,8 +84,13 @@ need to be adjusted in order to ensure normal DPDK operation:
The above limits can usually be adjusted by editing
``/etc/security/limits.conf`` file, and rebooting.
-Additionally, depending on which kernel driver is in use, the relevant
-resources also should be accessible by the user running the DPDK application.
+See :ref:`Hugepage Mapping <hugepage_mapping>`
+section to learn how these limits affect EAL.
+
+Device Control
+~~~~~~~~~~~~~~
+
+If the HPET is to be used, ``/dev/hpet`` permissions must be adjusted.
For ``vfio-pci`` kernel driver, the following Linux file system objects'
permissions should be adjusted:
@@ -45,26 +100,6 @@ permissions should be adjusted:
* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of
devices intended to be used by DPDK, for example, ``/dev/vfio/50``
-.. note::
-
- The instructions below will allow running DPDK with ``igb_uio`` or
- ``uio_pci_generic`` drivers as non-root with older Linux kernel versions.
- However, since version 4.0, the kernel does not allow unprivileged processes
- to read the physical address information from the pagemaps file, making it
- impossible for those processes to be used by non-privileged users. In such
- cases, using the VFIO driver is recommended.
-
-For ``igb_uio`` or ``uio_pci_generic`` kernel drivers, the following Linux file
-system objects' permissions should be adjusted:
-
-* The userspace-io device files in ``/dev``, for example, ``/dev/uio0``, ``/dev/uio1``, and so on
-
-* The userspace-io sysfs config and resource files, for example for ``uio0``::
-
- /sys/class/uio/uio0/device/config
- /sys/class/uio/uio0/device/resource*
-
-
Power Management and Power Saving Functionality
-----------------------------------------------
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 5f0748fba1..70fa099d30 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -228,6 +228,8 @@ Normally, these options do not need to be changed.
can later be mapped into that preallocated VA space (if dynamic memory mode
is enabled), and can optionally be mapped into it at startup.
+.. _hugepage_mapping:
+
Hugepage Mapping
^^^^^^^^^^^^^^^^
--
2.25.1
* [PATCH v3 4/5] doc: update instructions for running as non-root for MLX5
[not found] ` <20220624084817.63145-1-dkozlyuk@nvidia.com>
2022-06-24 8:48 ` [PATCH v3 3/5] doc: give specific instructions for running as non-root Dmitry Kozlyuk
@ 2022-06-24 8:48 ` Dmitry Kozlyuk
[not found] ` <20220624131956.75160-1-dkozlyuk@nvidia.com>
2 siblings, 0 replies; 16+ messages in thread
From: Dmitry Kozlyuk @ 2022-06-24 8:48 UTC (permalink / raw)
To: dev; +Cc: stable
Reference the common guide for generic setup.
Remove excessive capabilities from the recommended list.
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
---
doc/guides/platform/mlx5.rst | 31 ++++++++++++++++++-------------
1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/doc/guides/platform/mlx5.rst b/doc/guides/platform/mlx5.rst
index 64a4c5e76e..18d38f3488 100644
--- a/doc/guides/platform/mlx5.rst
+++ b/doc/guides/platform/mlx5.rst
@@ -404,25 +404,30 @@ The device can be bound again at this point.
Run as Non-Root
^^^^^^^^^^^^^^^
-In order to run as a non-root user,
-some capabilities must be granted to the application::
+Hugepage and resource limit setup are documented
+in the :ref:`common Linux guide <Running_Without_Root_Privileges>`.
+This PMD can operate without access to physical addresses,
+therefore it does not require ``SYS_ADMIN`` to access ``/proc/self/pagemap``.
+Note that this requirement may still come from other drivers.
- setcap cap_sys_admin,cap_net_admin,cap_net_raw,cap_ipc_lock+ep <dpdk-app>
+Below are additional capabilities that must be granted to the application,
+with the reason each one is needed:
-Below are the reasons for the need of each capability:
+``NET_RAW``
+ For raw Ethernet queue allocation through the kernel driver.
-``cap_sys_admin``
- When using physical addresses (PA mode), with Linux >= 4.0,
- for access to ``/proc/self/pagemap``.
+``NET_ADMIN``
+ For device configuration, like setting link status or MTU.
-``cap_net_admin``
- For device configuration.
+``SYS_RAWIO``
+ For using group 1 and above (software steering) in Flow API.
-``cap_net_raw``
- For raw ethernet queue allocation through kernel driver.
+They can be manually granted for a specific executable file::
-``cap_ipc_lock``
- For DMA memory pinning.
+ setcap cap_net_raw,cap_net_admin,cap_sys_rawio+ep <executable>
+
+Alternatively, a service manager or a container runtime
+may configure the capabilities for a process.
Windows Environment
--
2.25.1
* RE: [PATCH v2 3/4] doc: give specific instructions for running as non-root
2022-06-20 8:37 ` Bruce Richardson
@ 2022-06-24 8:49 ` Dmitry Kozlyuk
0 siblings, 0 replies; 16+ messages in thread
From: Dmitry Kozlyuk @ 2022-06-24 8:49 UTC (permalink / raw)
To: Bruce Richardson; +Cc: dev, stable, Anatoly Burakov
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Monday, June 20, 2022 11:38 AM
> [...]
> > > > > +For ``virtio`` PMD in legacy mode, ``SYS_RAWIO`` capability is required
> > > > > +for ``iopl()`` call to enable access to PCI IO ports.
> > > >
> > >
> > > > How "legacy" is legacy-mode? Is it still likely in widespread use
> > > > that we need this?
> >
> > I don't really know.
> > The spec says that legacy support is optional
> > (2.2.3 Legacy Interface: A Note on Feature Bits) and it aims
> > > to reduce the chance of a legacy driver attempting to drive the device
> > > (4.1.2.1 Device Requirements: PCI Device Discovery).
> > OTOH, DPDK supports it and requirements must be documented.
> > I can add a line suggesting to use modern virtio,
> > but also don't mind removing this.
> >
>
> I suppose the main question for this legacy virtio bit
> is where it should be documented, more than if it should be.
> Given this is a GSG, we should try and avoid getting too deep
> into driver-specific issues, so I think we should omit legacy virtio here,
> but have it documented in the relevant virtio-specific doc.
> Does that seem reasonable?
Yes, moved to the virtio doc (it looks like it could use an update BTW).
I also chose to keep HPET line because there's an entire section on HPET
below in the document.
* Re: [PATCH v3 3/5] doc: give specific instructions for running as non-root
2022-06-24 8:48 ` [PATCH v3 3/5] doc: give specific instructions for running as non-root Dmitry Kozlyuk
@ 2022-06-24 9:09 ` Bruce Richardson
0 siblings, 0 replies; 16+ messages in thread
From: Bruce Richardson @ 2022-06-24 9:09 UTC (permalink / raw)
To: Dmitry Kozlyuk; +Cc: dev, stable, Anatoly Burakov
On Fri, Jun 24, 2022 at 11:48:15AM +0300, Dmitry Kozlyuk wrote:
> The guide to run DPDK applications as non-root in Linux
> did not provide specific instructions to configure the required access
> and did not explain why each bit is needed.
> The latter is important because running as non-root
> is one of the ways to tighten security and grant minimal permissions.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Good improvements. One small suggestion below, otherwise:
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
> doc/guides/linux_gsg/enable_func.rst | 89 +++++++++++++------
> .../prog_guide/env_abstraction_layer.rst | 2 +
> 2 files changed, 64 insertions(+), 27 deletions(-)
>
> diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
> index 1df3ab0255..0b57417c94 100644
> --- a/doc/guides/linux_gsg/enable_func.rst
> +++ b/doc/guides/linux_gsg/enable_func.rst
> @@ -13,13 +13,63 @@ Enabling Additional Functionality
> Running DPDK Applications Without Root Privileges
> -------------------------------------------------
>
> -In order to run DPDK as non-root, the following Linux filesystem objects'
> -permissions should be adjusted to ensure that the Linux account being used to
> -run the DPDK application has access to them:
> +The following sections describe generic requirements and configuration
> +for running DPDK applications as non-root.
> +There may be additional requirements documented for some drivers.
>
> -* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
> +Hugepages
> +~~~~~~~~~
>
> -* If the HPET is to be used, ``/dev/hpet``
> +Hugepages must be reserved as root before running the application as non-root,
> +for example::
> +
> + sudo dpdk-hugepages.py --reserve 1G
> +
> +If multi-process is not required, running with ``--in-memory``
> +bypasses the need to access hugepage mount point and files within it.
> +Otherwise, hugepage directory must be made accessible
> +for writing to the unprivileged user.
> +A good way for managing multiple applications using hugepages
> +is to mount the filesystem with group permissions
> +and add a supplementary group to each application or container.
> +
> +One option is to use the script provided by this project::
> +
> + export HUGEDIR=$HOME/huge-1G
> + mkdir -p $HUGEDIR
> + sudo dpdk-hugepages.py --mount --directory $HUGEDIR --owner `id -u`:`id -g`
> +
> +In production environment, the OS can manage mount points
> +(`systemd example <https://github.com/systemd/systemd/blob/main/units/dev-hugepages.mount>`_).
> +
> +The ``hugetlb`` filesystem has additional options to guarantee or limit
> +the amount of memory that is possible to allocate using the mount point.
> +Refer to the `documentation <https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt>`_.
> +
> +If the driver requires using physical addresses (PA),
> +the executable file must be granted additional capabilities:
> +
> +* ``SYS_ADMIN`` to read ``/proc/self/pagemap``
> +* ``IPC_LOCK`` to lock hugepages in memory
> +
> +.. code-block:: console
> +
> + setcap cap_ipc_lock,cap_sys_admin+ep <executable>
> +
> +If physical addresses are not accessible,
> +the following message will appear during EAL initialization::
> +
> + EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
> +
> +It is harmless in case PA are not needed.
> +
> +.. note::
> +
> + Using ``vfio-pci`` kernel driver, if applicable, can eliminate the need
> + for physical addresses and therefore reduce the permission requirements.
> +
Can we move this note up a bit, to immediately after the paragraph about
requiring physical addresses? It's better to tell the user straight away
that a section is not relevant to them than to tell them only at the end,
once they have already read it.
/Bruce
* [PATCH v4 3/5] doc: give specific instructions for running as non-root
[not found] ` <20220624131956.75160-1-dkozlyuk@nvidia.com>
@ 2022-06-24 13:19 ` Dmitry Kozlyuk
2022-06-24 13:19 ` [PATCH v4 4/5] doc: update instructions for running as non-root for MLX5 Dmitry Kozlyuk
1 sibling, 0 replies; 16+ messages in thread
From: Dmitry Kozlyuk @ 2022-06-24 13:19 UTC (permalink / raw)
To: dev; +Cc: stable, Bruce Richardson, Anatoly Burakov
The guide to run DPDK applications as non-root in Linux
did not provide specific instructions to configure the required access
and did not explain why each bit is needed.
The latter is important because running as non-root
is one of the ways to tighten security and grant minimal permissions.
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
doc/guides/linux_gsg/enable_func.rst | 90 +++++++++++++------
.../prog_guide/env_abstraction_layer.rst | 2 +
2 files changed, 65 insertions(+), 27 deletions(-)
diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
index 1df3ab0255..b15bfb2f9f 100644
--- a/doc/guides/linux_gsg/enable_func.rst
+++ b/doc/guides/linux_gsg/enable_func.rst
@@ -13,13 +13,64 @@ Enabling Additional Functionality
Running DPDK Applications Without Root Privileges
-------------------------------------------------
-In order to run DPDK as non-root, the following Linux filesystem objects'
-permissions should be adjusted to ensure that the Linux account being used to
-run the DPDK application has access to them:
+The following sections describe generic requirements and configuration
+for running DPDK applications as non-root.
+There may be additional requirements documented for some drivers.
-* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
+Hugepages
+~~~~~~~~~
-* If the HPET is to be used, ``/dev/hpet``
+Hugepages must be reserved as root before running the application as non-root,
+for example::
+
+ sudo dpdk-hugepages.py --reserve 1G
+
+If multi-process is not required, running with ``--in-memory``
+bypasses the need to access the hugepage mount point and files within it.
+Otherwise, the hugepage directory must be made writable
+by the unprivileged user.
+A good way for managing multiple applications using hugepages
+is to mount the filesystem with group permissions
+and add a supplementary group to each application or container.
+
+One option is to use the script provided by this project::
+
+ export HUGEDIR=$HOME/huge-1G
+ mkdir -p $HUGEDIR
+ sudo dpdk-hugepages.py --mount --directory $HUGEDIR --user `id -u` --group `id -g`
+
+In a production environment, the OS can manage mount points
+(`systemd example <https://github.com/systemd/systemd/blob/main/units/dev-hugepages.mount>`_).
+
+The ``hugetlb`` filesystem has additional options to guarantee or limit
+the amount of memory that is possible to allocate using the mount point.
+Refer to the `documentation <https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt>`_.
+
+.. note::
+
+ Using the ``vfio-pci`` kernel driver, if applicable, can eliminate the need
+ for physical addresses and therefore eliminate the permission requirements
+ described below.
+
+If the driver requires using physical addresses (PA),
+the executable file must be granted additional capabilities:
+
+* ``SYS_ADMIN`` to read ``/proc/self/pagemap``
+* ``IPC_LOCK`` to lock hugepages in memory
+
+.. code-block:: console
+
+ setcap cap_ipc_lock,cap_sys_admin+ep <executable>
+
+If physical addresses are not accessible,
+the following message will appear during EAL initialization::
+
+ EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
+
+It is harmless if physical addresses are not needed.
+
+Resource Limits
+~~~~~~~~~~~~~~~
When running as non-root user, there may be some additional resource limits
that are imposed by the system. Specifically, the following resource limits may
@@ -34,8 +85,13 @@ need to be adjusted in order to ensure normal DPDK operation:
The above limits can usually be adjusted by editing
``/etc/security/limits.conf`` file, and rebooting.
-Additionally, depending on which kernel driver is in use, the relevant
-resources also should be accessible by the user running the DPDK application.
+See :ref:`Hugepage Mapping <hugepage_mapping>`
+section to learn how these limits affect EAL.
+
+Device Control
+~~~~~~~~~~~~~~
+
+If the HPET is to be used, ``/dev/hpet`` permissions must be adjusted.
For ``vfio-pci`` kernel driver, the following Linux file system objects'
permissions should be adjusted:
@@ -45,26 +101,6 @@ permissions should be adjusted:
* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of
devices intended to be used by DPDK, for example, ``/dev/vfio/50``
-.. note::
-
- The instructions below will allow running DPDK with ``igb_uio`` or
- ``uio_pci_generic`` drivers as non-root with older Linux kernel versions.
- However, since version 4.0, the kernel does not allow unprivileged processes
- to read the physical address information from the pagemaps file, making it
- impossible for those processes to be used by non-privileged users. In such
- cases, using the VFIO driver is recommended.
-
-For ``igb_uio`` or ``uio_pci_generic`` kernel drivers, the following Linux file
-system objects' permissions should be adjusted:
-
-* The userspace-io device files in ``/dev``, for example, ``/dev/uio0``, ``/dev/uio1``, and so on
-
-* The userspace-io sysfs config and resource files, for example for ``uio0``::
-
- /sys/class/uio/uio0/device/config
- /sys/class/uio/uio0/device/resource*
-
-
Power Management and Power Saving Functionality
-----------------------------------------------
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 5f0748fba1..70fa099d30 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -228,6 +228,8 @@ Normally, these options do not need to be changed.
can later be mapped into that preallocated VA space (if dynamic memory mode
is enabled), and can optionally be mapped into it at startup.
+.. _hugepage_mapping:
+
Hugepage Mapping
^^^^^^^^^^^^^^^^
--
2.25.1
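
For anyone verifying the setup described in this patch, pagemap access can be probed before starting EAL. The following is a minimal Python sketch and is not part of the patch or of DPDK; the helper names ``decode_pagemap_entry`` and ``probe_pagemap`` are made up for illustration. It assumes the entry layout from the kernel pagemap documentation: bit 63 is the "present" flag and bits 0-54 hold the page frame number (PFN) of a present page.

```python
import os

PRESENT_BIT = 1 << 63        # bit 63: page is present in RAM
PFN_MASK = (1 << 55) - 1     # bits 0-54: page frame number when present


def decode_pagemap_entry(entry):
    """Return (present, pfn) for one 64-bit pagemap entry."""
    present = bool(entry & PRESENT_BIT)
    pfn = (entry & PFN_MASK) if present else 0
    return present, pfn


def probe_pagemap(path="/proc/self/pagemap"):
    """Return None if the file can be opened, else an error string."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return "cannot open %s: %s" % (path, os.strerror(exc.errno))
    os.close(fd)
    return None


if __name__ == "__main__":
    error = probe_pagemap()
    print(error or "pagemap can be opened")
```

A successful open does not by itself prove that physical addresses are usable: depending on the kernel version, an unprivileged reader may get entries whose PFN field is zeroed instead of an open failure, so a PFN of 0 for a present page is best treated as "PA not available".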
* [PATCH v4 4/5] doc: update instructions for running as non-root for MLX5
[not found] ` <20220624131956.75160-1-dkozlyuk@nvidia.com>
2022-06-24 13:19 ` [PATCH v4 3/5] doc: give specific instructions for running as non-root Dmitry Kozlyuk
@ 2022-06-24 13:19 ` Dmitry Kozlyuk
1 sibling, 0 replies; 16+ messages in thread
From: Dmitry Kozlyuk @ 2022-06-24 13:19 UTC (permalink / raw)
To: dev; +Cc: stable
Reference the common guide for generic setup.
Remove excessive capabilities from the recommended list.
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
---
doc/guides/platform/mlx5.rst | 31 ++++++++++++++++++-------------
1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/doc/guides/platform/mlx5.rst b/doc/guides/platform/mlx5.rst
index 64a4c5e76e..18d38f3488 100644
--- a/doc/guides/platform/mlx5.rst
+++ b/doc/guides/platform/mlx5.rst
@@ -404,25 +404,30 @@ The device can be bound again at this point.
Run as Non-Root
^^^^^^^^^^^^^^^
-In order to run as a non-root user,
-some capabilities must be granted to the application::
+Hugepage and resource limit setup are documented
+in the :ref:`common Linux guide <Running_Without_Root_Privileges>`.
+This PMD can operate without access to physical addresses,
+therefore it does not require ``SYS_ADMIN`` to access ``/proc/self/pagemap``.
+Note that this requirement may still come from other drivers.
- setcap cap_sys_admin,cap_net_admin,cap_net_raw,cap_ipc_lock+ep <dpdk-app>
+Below are additional capabilities that must be granted to the application,
+with the reason each one is needed:
-Below are the reasons for the need of each capability:
+``NET_RAW``
+ For raw Ethernet queue allocation through the kernel driver.
-``cap_sys_admin``
- When using physical addresses (PA mode), with Linux >= 4.0,
- for access to ``/proc/self/pagemap``.
+``NET_ADMIN``
+ For device configuration, like setting link status or MTU.
-``cap_net_admin``
- For device configuration.
+``SYS_RAWIO``
+ For using group 1 and above (software steering) in Flow API.
-``cap_net_raw``
- For raw ethernet queue allocation through kernel driver.
+They can be manually granted for a specific executable file::
-``cap_ipc_lock``
- For DMA memory pinning.
+ setcap cap_net_raw,cap_net_admin,cap_sys_rawio+ep <executable>
+
+Alternatively, a service manager or a container runtime
+may configure the capabilities for a process.
Windows Environment
--
2.25.1
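
The capability list in this patch can be checked against a running process. Below is a minimal Python sketch, not part of the patch; the helper names (``decode_caps``, ``effective_caps``, ``missing_caps``) are hypothetical. It decodes the ``CapEff`` mask from ``/proc/<pid>/status`` using the bit numbers defined in ``linux/capability.h`` and reports which of the capabilities named above are absent.

```python
# Capability bit numbers as defined in <linux/capability.h>.
CAP_BITS = {
    "cap_net_admin": 12,
    "cap_net_raw": 13,
    "cap_ipc_lock": 14,
    "cap_sys_rawio": 17,
    "cap_sys_admin": 21,
}


def decode_caps(mask_hex):
    """Return the set of known capability names set in a hex mask."""
    mask = int(mask_hex, 16)
    return {name for name, bit in CAP_BITS.items() if mask & (1 << bit)}


def effective_caps(status_path="/proc/self/status"):
    """Read the CapEff line of a status file and decode it."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return decode_caps(line.split()[1])
    return set()


def missing_caps(required, have):
    """Return the required capabilities not present in 'have', sorted."""
    return sorted(set(required) - set(have))


if __name__ == "__main__":
    # The set this patch lists for the MLX5 PMD.
    wanted = ["cap_net_raw", "cap_net_admin", "cap_sys_rawio"]
    print("missing:", missing_caps(wanted, effective_caps()))
```

Run from inside the application's environment (or point ``status_path`` at ``/proc/<pid>/status`` of the running process), this gives a quick view of whether ``setcap``, the service manager, or the container runtime actually delivered the expected set.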
Thread overview: 16+ messages
[not found] <20220607234949.2311884-1-dkozlyuk@nvidia.com>
2022-06-07 23:49 ` [PATCH 3/4] doc: give specific instructions for running as non-root Dmitry Kozlyuk
2022-06-08 0:03 ` Stephen Hemminger
2022-06-07 23:49 ` [PATCH 4/4] doc: update instructions for running as non-root for MLX5 Dmitry Kozlyuk
2022-06-08 0:13 ` Stephen Hemminger
2022-06-17 11:26 ` Dmitry Kozlyuk
[not found] ` <20220617112508.3823291-1-dkozlyuk@nvidia.com>
2022-06-17 11:25 ` [PATCH v2 3/4] doc: give specific instructions for running as non-root Dmitry Kozlyuk
2022-06-17 16:38 ` Bruce Richardson
2022-06-20 6:10 ` Dmitry Kozlyuk
2022-06-20 8:37 ` Bruce Richardson
2022-06-24 8:49 ` Dmitry Kozlyuk
2022-06-17 11:25 ` [PATCH v2 4/4] doc: update instructions for running as non-root for MLX5 Dmitry Kozlyuk
[not found] ` <20220624084817.63145-1-dkozlyuk@nvidia.com>
2022-06-24 8:48 ` [PATCH v3 3/5] doc: give specific instructions for running as non-root Dmitry Kozlyuk
2022-06-24 9:09 ` Bruce Richardson
2022-06-24 8:48 ` [PATCH v3 4/5] doc: update instructions for running as non-root for MLX5 Dmitry Kozlyuk
[not found] ` <20220624131956.75160-1-dkozlyuk@nvidia.com>
2022-06-24 13:19 ` [PATCH v4 3/5] doc: give specific instructions for running as non-root Dmitry Kozlyuk
2022-06-24 13:19 ` [PATCH v4 4/5] doc: update instructions for running as non-root for MLX5 Dmitry Kozlyuk