The current instructions are slightly out of date when it comes to
providing information about setting up the system for using DPDK as
non-root, so update them.

Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 doc/guides/linux_gsg/enable_func.rst | 54 ++++++++++++++++++++--------
 1 file changed, 39 insertions(+), 15 deletions(-)

diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
index b2bda80bb7..78b0f7c012 100644
--- a/doc/guides/linux_gsg/enable_func.rst
+++ b/doc/guides/linux_gsg/enable_func.rst
@@ -58,22 +58,34 @@ The application can then determine what action to take, if any, if the HPET is n
 if any, and on what is available on the system at runtime.

 Running DPDK Applications Without Root Privileges
---------------------------------------------------------
+-------------------------------------------------

-.. note::
+In order to run DPDK as non-root, the following Linux filesystem objects'
+permissions should be adjusted to ensure that the Linux account being used to
+run the DPDK application has access to them:

-    The instructions below will allow running DPDK as non-root with older
-    Linux kernel versions. However, since version 4.0, the kernel does not allow
-    unprivileged processes to read the physical address information from
-    the pagemaps file, making it impossible for those processes to use HW
-    devices which require physical addresses
+* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``

-Although applications using the DPDK use network ports and other hardware resources directly,
-with a number of small permission adjustments it is possible to run these applications as a user other than "root".
-To do so, the ownership, or permissions, on the following Linux file system objects should be adjusted to ensure that
-the Linux user account being used to run the DPDK application has access to them:
+* If the HPET is to be used, ``/dev/hpet``

-* All directories which serve as hugepage mount points, for example, ``/mnt/huge``
+When running as non-root user, there may be some additional resource limits
+that are imposed by the system. Specifically, the following resource limits may
+need to be adjusted in order to ensure normal DPDK operation:
+
+* RLIMIT_LOCKS (number of file locks that can be held by a process)
+
+* RLIMIT_NOFILE (number of open file descriptors that can be held open by a process)
+
+* RLIMIT_MEMLOCK (amount of pinned pages the process is allowed to have)
+
+The above limits can usually be adjusted by editing
+``/etc/security/limits.conf`` file, and rebooting.
+
+Additionally, depending on which kernel driver is in use, the relevant
+resources also should be accessible by the user running the DPDK application.
+
+For ``igb_uio`` or ``uio_pci_generic`` kernel drivers, the following Linux file
+system objects' permissions should be adjusted:

 * The userspace-io device files in ``/dev``, for example, ``/dev/uio0``, ``/dev/uio1``, and so on

@@ -82,11 +94,23 @@ the Linux user account being used to run the DPDK application has access to them
     /sys/class/uio/uio0/device/config
     /sys/class/uio/uio0/device/resource*

-* If the HPET is to be used, ``/dev/hpet``
-
 .. note::

-    On some Linux installations, ``/dev/hugepages`` is also a hugepage mount point created by default.
+    The instructions above will allow running DPDK with ``igb_uio`` driver as
+    non-root with older Linux kernel versions. However, since version 4.0, the
+    kernel does not allow unprivileged processes to read the physical address
+    information from the pagemaps file, making it impossible for those
+    processes to be used by non-privileged users. In such cases, using the VFIO
+    driver is recommended.
+
+For ``vfio-pci`` kernel driver, the following Linux file system objects'
+permissions should be adjusted:
+
+* The VFIO device file , ``/dev/vfio/vfio``
+
+* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of
+  devices intended to be used by DPDK, for example, ``/dev/vfio/50``
+

 Power Management and Power Saving Functionality
 -----------------------------------------------
--
2.17.1
Current information regarding hugepage usage is a little out of date.
Update it to include information on in-memory mode, as well as on
default mountpoints provided by systemd.

Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 doc/guides/linux_gsg/sys_reqs.rst | 39 +++++++++++++++++++------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
index a124656bcb..2ddd7ed667 100644
--- a/doc/guides/linux_gsg/sys_reqs.rst
+++ b/doc/guides/linux_gsg/sys_reqs.rst
@@ -155,8 +155,12 @@ Without hugepages, high TLB miss rates would occur with the standard 4k page siz
 Reserving Hugepages for DPDK Use
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-The allocation of hugepages should be done at boot time or as soon as possible after system boot
-to prevent memory from being fragmented in physical memory.
+The allocation of hugepages can be performed either at run time or at boot time.
+In the general case, reserving hugepages at run time is perfectly fine, but in
+use cases where having lots of physically contiguous memory is required, it is
+preferable to reserve hugepages at boot time, as that will help in preventing
+physical memory from becoming heavily fragmented.
+
 To reserve hugepages at boot time, a parameter is passed to the Linux kernel on the kernel command line.

 For 2 MB pages, just pass the hugepages option to the kernel. For example, to reserve 1024 pages of 2 MB, use::
@@ -187,9 +191,9 @@ See the Documentation/admin-guide/kernel-parameters.txt file in your Linux sourc

 **Alternative:**

-For 2 MB pages, there is also the option of allocating hugepages after the system has booted.
+There is also the option of allocating hugepages after the system has booted.
 This is done by echoing the number of hugepages required to a nr_hugepages file in the ``/sys/devices/`` directory.
-For a single-node system, the command to use is as follows (assuming that 1024 pages are required)::
+For a single-node system, the command to use is as follows (assuming that 1024 of 2MB pages are required)::

     echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

@@ -198,22 +202,27 @@ On a NUMA machine, pages should be allocated explicitly on separate nodes::
     echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
     echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages

-.. note::
-
-    For 1G pages, it is not possible to reserve the hugepage memory after the system has booted.
-
 Using Hugepages with the DPDK
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps::
+If secondary process support is not required, DPDK is able to use hugepages
+without any configuration by using "in-memory" mode. Please see
+:ref:`linux_eal_parameters` for more details.
+
+If secondary process support is required, mount points for hugepages need to be
+created. On modern Linux distributions, a default mount point for hugepages is provided
+by the system and is located at ``/dev/hugepages``. This mount point will use the
+default hugepage size set by the kernel parameters as described above.
+
+However, in order to use multiple hugepage sizes, it is necessary to manually
+create mount points for hugepage sizes that are not provided by the system
+(e.g. 1GB pages).
+
+To make the hugepages of size 1GB available for DPDK use, perform the following steps::

     mkdir /mnt/huge
-    mount -t hugetlbfs nodev /mnt/huge
+    mount -t hugetlbfs pagesize=1GB /mnt/huge

 The mount point can be made permanent across reboots, by adding the following line to the ``/etc/fstab`` file::

-    nodev /mnt/huge hugetlbfs defaults 0 0
-
-For 1GB pages, the page size must be specified as a mount option::
-
-    nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0
+    nodev /mnt/huge hugetlbfs pagesize=1GB 0 0
--
2.17.1
On Mon, Aug 24, 2020 at 04:45:00PM +0100, Anatoly Burakov wrote:
> The current instructions are slightly out of date when it comes to
> providing information about setting up the system for using DPDK as
> non-root, so update them.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> doc/guides/linux_gsg/enable_func.rst | 54 ++++++++++++++++++++--------
> 1 file changed, 39 insertions(+), 15 deletions(-)
>
> diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
> index b2bda80bb7..78b0f7c012 100644
> --- a/doc/guides/linux_gsg/enable_func.rst
> +++ b/doc/guides/linux_gsg/enable_func.rst
> @@ -58,22 +58,34 @@ The application can then determine what action to take, if any, if the HPET is n
> if any, and on what is available on the system at runtime.
>
> Running DPDK Applications Without Root Privileges
> ---------------------------------------------------------
> +-------------------------------------------------
>
> -.. note::
> +In order to run DPDK as non-root, the following Linux filesystem objects'
> +permissions should be adjusted to ensure that the Linux account being used to
> +run the DPDK application has access to them:
>
> - The instructions below will allow running DPDK as non-root with older
> - Linux kernel versions. However, since version 4.0, the kernel does not allow
> - unprivileged processes to read the physical address information from
> - the pagemaps file, making it impossible for those processes to use HW
> - devices which require physical addresses
> +* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
>
> -Although applications using the DPDK use network ports and other hardware resources directly,
> -with a number of small permission adjustments it is possible to run these applications as a user other than "root".
> -To do so, the ownership, or permissions, on the following Linux file system objects should be adjusted to ensure that
> -the Linux user account being used to run the DPDK application has access to them:
> +* If the HPET is to be used, ``/dev/hpet``
>
> -* All directories which serve as hugepage mount points, for example, ``/mnt/huge``
> +When running as non-root user, there may be some additional resource limits
> +that are imposed by the system. Specifically, the following resource limits may
> +need to be adjusted in order to ensure normal DPDK operation:
> +
> +* RLIMIT_LOCKS (number of file locks that can be held by a process)
> +
> +* RLIMIT_NOFILE (number of open file descriptors that can be held open by a process)
> +
> +* RLIMIT_MEMLOCK (amount of pinned pages the process is allowed to have)
> +
> +The above limits can usually be adjusted by editing
> +``/etc/security/limits.conf`` file, and rebooting.
> +
> +Additionally, depending on which kernel driver is in use, the relevant
> +resources also should be accessible by the user running the DPDK application.
> +
> +For ``igb_uio`` or ``uio_pci_generic`` kernel drivers, the following Linux file
> +system objects' permissions should be adjusted:
>
> * The userspace-io device files in ``/dev``, for example, ``/dev/uio0``, ``/dev/uio1``, and so on
>
> @@ -82,11 +94,23 @@ the Linux user account being used to run the DPDK application has access to them
> /sys/class/uio/uio0/device/config
> /sys/class/uio/uio0/device/resource*
>
> -* If the HPET is to be used, ``/dev/hpet``
> -
> .. note::
>
> - On some Linux installations, ``/dev/hugepages`` is also a hugepage mount point created by default.
> + The instructions above will allow running DPDK with ``igb_uio`` driver as
> + non-root with older Linux kernel versions. However, since version 4.0, the
> + kernel does not allow unprivileged processes to read the physical address
> + information from the pagemaps file, making it impossible for those
> + processes to be used by non-privileged users. In such cases, using the VFIO
> + driver is recommended.
> +
> +For ``vfio-pci`` kernel driver, the following Linux file system objects'
> +permissions should be adjusted:
> +
> +* The VFIO device file , ``/dev/vfio/vfio``
> +
> +* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of
> + devices intended to be used by DPDK, for example, ``/dev/vfio/50``
> +
>
Since we'd very much prefer in all cases people to use VFIO, I think the
VFIO instructions should come first.
Otherwise the text itself reads fine to me.
/Bruce
On Mon, Aug 24, 2020 at 04:45:01PM +0100, Anatoly Burakov wrote:
> Current information regarding hugepage usage is a little out of date.
> Update it to include information on in-memory mode, as well as on
> default mountpoints provided by systemd.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  doc/guides/linux_gsg/sys_reqs.rst | 39 +++++++++++++++++++------------
>  1 file changed, 24 insertions(+), 15 deletions(-)
>
> diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
> index a124656bcb..2ddd7ed667 100644
> --- a/doc/guides/linux_gsg/sys_reqs.rst
> +++ b/doc/guides/linux_gsg/sys_reqs.rst
> @@ -155,8 +155,12 @@ Without hugepages, high TLB miss rates would occur with the standard 4k page siz
>  Reserving Hugepages for DPDK Use
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> -The allocation of hugepages should be done at boot time or as soon as possible after system boot
> -to prevent memory from being fragmented in physical memory.
> +The allocation of hugepages can be performed either at run time or at boot time.
> +In the general case, reserving hugepages at run time is perfectly fine, but in
> +use cases where having lots of physically contiguous memory is required, it is
> +preferable to reserve hugepages at boot time, as that will help in preventing
> +physical memory from becoming heavily fragmented.
> +

Although we are removing the note about 1G pages requiring to be reserved
at boot time, I think we should still mention here that some older kernel
versions do not allow 1G reservations post-boot.

>  To reserve hugepages at boot time, a parameter is passed to the Linux kernel on the kernel command line.
>
>  For 2 MB pages, just pass the hugepages option to the kernel. For example, to reserve 1024 pages of 2 MB, use::
> @@ -187,9 +191,9 @@ See the Documentation/admin-guide/kernel-parameters.txt file in your Linux sourc
>
>  **Alternative:**
>
> -For 2 MB pages, there is also the option of allocating hugepages after the system has booted.
> +There is also the option of allocating hugepages after the system has booted.
>  This is done by echoing the number of hugepages required to a nr_hugepages file in the ``/sys/devices/`` directory.
> -For a single-node system, the command to use is as follows (assuming that 1024 pages are required)::
> +For a single-node system, the command to use is as follows (assuming that 1024 of 2MB pages are required)::
>
>      echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>
> @@ -198,22 +202,27 @@ On a NUMA machine, pages should be allocated explicitly on separate nodes::
>      echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
>      echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
>
> -.. note::
> -
> -    For 1G pages, it is not possible to reserve the hugepage memory after the system has booted.
> -
>  Using Hugepages with the DPDK
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> -Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps::
> +If secondary process support is not required, DPDK is able to use hugepages
> +without any configuration by using "in-memory" mode. Please see
> +:ref:`linux_eal_parameters` for more details.
> +
> +If secondary process support is required, mount points for hugepages need to be
> +created. On modern Linux distributions, a default mount point for hugepages is provided
> +by the system and is located at ``/dev/hugepages``. This mount point will use the
> +default hugepage size set by the kernel parameters as described above.
> +
> +However, in order to use multiple hugepage sizes, it is necessary to manually

Rather than multiple hugepage sizes, I'd suggest changing this to hugepage
sizes other than the default.

Do we also want to add a line somewhere explaining that the default size
can be set at boot using a kernel parameter?

> +create mount points for hugepage sizes that are not provided by the system
> +(e.g. 1GB pages).
> +
> +To make the hugepages of size 1GB available for DPDK use, perform the following steps::
>
>      mkdir /mnt/huge
> -    mount -t hugetlbfs nodev /mnt/huge
> +    mount -t hugetlbfs pagesize=1GB /mnt/huge
>
>  The mount point can be made permanent across reboots, by adding the following line to the ``/etc/fstab`` file::
>
> -    nodev /mnt/huge hugetlbfs defaults 0 0
> -
> -For 1GB pages, the page size must be specified as a mount option::
> -
> -    nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0
> +    nodev /mnt/huge hugetlbfs pagesize=1GB 0 0
> --
> 2.17.1
On 8/24/2020 4:45 PM, Anatoly Burakov wrote:
> The current instructions are slightly out of date when it comes to
> providing information about setting up the system for using DPDK as
> non-root, so update them.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

Thanks for the doc update, it is useful. I was able to run testpmd as non-root
using the vfio-pci module.

For series,
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
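
For the vfio-pci case mentioned above, the group directory under /dev/vfio that needs adjusting can be looked up through sysfs. The following is only a rough sketch and not part of the patch: the function names, the `dpdkuser` account and the PCI address `0000:01:00.0` are hypothetical, and it assumes the device is already bound to vfio-pci and that the IOMMU is enabled.

```shell
#!/bin/sh
# Sketch only: function names, the user name and the PCI address used in
# the example at the bottom are hypothetical placeholders.

# Print the IOMMU group number of a PCI device by resolving its
# iommu_group symlink in sysfs. The sysfs root is passed in as the first
# argument so that the lookup can also be tried against a fake tree.
iommu_group_of() {
    sysfs_root=$1
    pci_addr=$2
    link="$sysfs_root/bus/pci/devices/$pci_addr/iommu_group"
    # No symlink means no such device or no IOMMU group for it.
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}

# Hand ownership of one VFIO group file to one user. Must be run as root.
# /dev/vfio/vfio itself may need its permissions checked separately.
grant_vfio_group() {
    user=$1
    group=$2
    chown "$user" "/dev/vfio/$group"
}

# Example, as root:
#   group=$(iommu_group_of /sys 0000:01:00.0) && grant_vfio_group dpdkuser "$group"
```

Nothing is executed when the snippet is sourced; the privileged step only happens when `grant_vfio_group` is called explicitly.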
On 24-Aug-20 6:13 PM, Bruce Richardson wrote:
> On Mon, Aug 24, 2020 at 04:45:01PM +0100, Anatoly Burakov wrote:
>> Current information regarding hugepage usage is a little out of date.
>> Update it to include information on in-memory mode, as well as on
>> default mountpoints provided by systemd.
>>
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
>>  doc/guides/linux_gsg/sys_reqs.rst | 39 +++++++++++++++++++------------
>>  1 file changed, 24 insertions(+), 15 deletions(-)
>>
>> diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
>> index a124656bcb..2ddd7ed667 100644
>> --- a/doc/guides/linux_gsg/sys_reqs.rst
>> +++ b/doc/guides/linux_gsg/sys_reqs.rst
>> @@ -155,8 +155,12 @@ Without hugepages, high TLB miss rates would occur with the standard 4k page siz
>>  Reserving Hugepages for DPDK Use
>>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> -The allocation of hugepages should be done at boot time or as soon as possible after system boot
>> -to prevent memory from being fragmented in physical memory.
>> +The allocation of hugepages can be performed either at run time or at boot time.
>> +In the general case, reserving hugepages at run time is perfectly fine, but in
>> +use cases where having lots of physically contiguous memory is required, it is
>> +preferable to reserve hugepages at boot time, as that will help in preventing
>> +physical memory from becoming heavily fragmented.
>> +
>
> Although we are removing the note about 1G pages requiring to be reserved
> at boot time, I think we should still mention here that some older kernel
> versions do not allow 1G reservations post-boot.

Agreed, will fix in v2.

>
>>  To reserve hugepages at boot time, a parameter is passed to the Linux kernel on the kernel command line.
>>
>>  For 2 MB pages, just pass the hugepages option to the kernel. For example, to reserve 1024 pages of 2 MB, use::
>> @@ -187,9 +191,9 @@ See the Documentation/admin-guide/kernel-parameters.txt file in your Linux sourc
>>
>>  **Alternative:**
>>
>> -For 2 MB pages, there is also the option of allocating hugepages after the system has booted.
>> +There is also the option of allocating hugepages after the system has booted.
>>  This is done by echoing the number of hugepages required to a nr_hugepages file in the ``/sys/devices/`` directory.
>> -For a single-node system, the command to use is as follows (assuming that 1024 pages are required)::
>> +For a single-node system, the command to use is as follows (assuming that 1024 of 2MB pages are required)::
>>
>>      echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>>
>> @@ -198,22 +202,27 @@ On a NUMA machine, pages should be allocated explicitly on separate nodes::
>>      echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
>>      echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
>>
>> -.. note::
>> -
>> -    For 1G pages, it is not possible to reserve the hugepage memory after the system has booted.
>> -
>>  Using Hugepages with the DPDK
>>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> -Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps::
>> +If secondary process support is not required, DPDK is able to use hugepages
>> +without any configuration by using "in-memory" mode. Please see
>> +:ref:`linux_eal_parameters` for more details.
>> +
>> +If secondary process support is required, mount points for hugepages need to be
>> +created. On modern Linux distributions, a default mount point for hugepages is provided
>> +by the system and is located at ``/dev/hugepages``. This mount point will use the
>> +default hugepage size set by the kernel parameters as described above.
>> +
>> +However, in order to use multiple hugepage sizes, it is necessary to manually
>
> Rather than multiple hugepage sizes, I'd suggest changing this to hugepage
> sizes other than the default.

OK, will fix.

>
> Do we also want to add a line somewhere explaining that the default size
> can be set at boot using a kernel parameter?

It's already there, right above this :)

>
>> +create mount points for hugepage sizes that are not provided by the system
>> +(e.g. 1GB pages).
>> +
>> +To make the hugepages of size 1GB available for DPDK use, perform the following steps::
>>
>>      mkdir /mnt/huge
>> -    mount -t hugetlbfs nodev /mnt/huge
>> +    mount -t hugetlbfs pagesize=1GB /mnt/huge
>>
>>  The mount point can be made permanent across reboots, by adding the following line to the ``/etc/fstab`` file::
>>
>> -    nodev /mnt/huge hugetlbfs defaults 0 0
>> -
>> -For 1GB pages, the page size must be specified as a mount option::
>> -
>> -    nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0
>> +    nodev /mnt/huge hugetlbfs pagesize=1GB 0 0
>> --
>> 2.17.1

--
Thanks,
Anatoly
On 24-Aug-20 6:08 PM, Bruce Richardson wrote:
> On Mon, Aug 24, 2020 at 04:45:00PM +0100, Anatoly Burakov wrote:
>> The current instructions are slightly out of date when it comes to
>> providing information about setting up the system for using DPDK as
>> non-root, so update them.
>>
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
>>  doc/guides/linux_gsg/enable_func.rst | 54 ++++++++++++++++++++--------
>>  1 file changed, 39 insertions(+), 15 deletions(-)
>>
>> diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
>> index b2bda80bb7..78b0f7c012 100644
>> --- a/doc/guides/linux_gsg/enable_func.rst
>> +++ b/doc/guides/linux_gsg/enable_func.rst
>> @@ -58,22 +58,34 @@ The application can then determine what action to take, if any, if the HPET is n
>>  if any, and on what is available on the system at runtime.
>>
>>  Running DPDK Applications Without Root Privileges
>> ---------------------------------------------------------
>> +-------------------------------------------------
>>
>> -.. note::
>> +In order to run DPDK as non-root, the following Linux filesystem objects'
>> +permissions should be adjusted to ensure that the Linux account being used to
>> +run the DPDK application has access to them:
>>
>> -    The instructions below will allow running DPDK as non-root with older
>> -    Linux kernel versions. However, since version 4.0, the kernel does not allow
>> -    unprivileged processes to read the physical address information from
>> -    the pagemaps file, making it impossible for those processes to use HW
>> -    devices which require physical addresses
>> +* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
>>
>> -Although applications using the DPDK use network ports and other hardware resources directly,
>> -with a number of small permission adjustments it is possible to run these applications as a user other than "root".
>> -To do so, the ownership, or permissions, on the following Linux file system objects should be adjusted to ensure that
>> -the Linux user account being used to run the DPDK application has access to them:
>> +* If the HPET is to be used, ``/dev/hpet``
>>
>> -* All directories which serve as hugepage mount points, for example, ``/mnt/huge``
>> +When running as non-root user, there may be some additional resource limits
>> +that are imposed by the system. Specifically, the following resource limits may
>> +need to be adjusted in order to ensure normal DPDK operation:
>> +
>> +* RLIMIT_LOCKS (number of file locks that can be held by a process)
>> +
>> +* RLIMIT_NOFILE (number of open file descriptors that can be held open by a process)
>> +
>> +* RLIMIT_MEMLOCK (amount of pinned pages the process is allowed to have)
>> +
>> +The above limits can usually be adjusted by editing
>> +``/etc/security/limits.conf`` file, and rebooting.
>> +
>> +Additionally, depending on which kernel driver is in use, the relevant
>> +resources also should be accessible by the user running the DPDK application.
>> +
>> +For ``igb_uio`` or ``uio_pci_generic`` kernel drivers, the following Linux file
>> +system objects' permissions should be adjusted:
>>
>>  * The userspace-io device files in ``/dev``, for example, ``/dev/uio0``, ``/dev/uio1``, and so on
>>
>> @@ -82,11 +94,23 @@ the Linux user account being used to run the DPDK application has access to them
>>      /sys/class/uio/uio0/device/config
>>      /sys/class/uio/uio0/device/resource*
>>
>> -* If the HPET is to be used, ``/dev/hpet``
>> -
>>  .. note::
>>
>> -    On some Linux installations, ``/dev/hugepages`` is also a hugepage mount point created by default.
>> +    The instructions above will allow running DPDK with ``igb_uio`` driver as
>> +    non-root with older Linux kernel versions. However, since version 4.0, the
>> +    kernel does not allow unprivileged processes to read the physical address
>> +    information from the pagemaps file, making it impossible for those
>> +    processes to be used by non-privileged users. In such cases, using the VFIO
>> +    driver is recommended.
>> +
>> +For ``vfio-pci`` kernel driver, the following Linux file system objects'
>> +permissions should be adjusted:
>> +
>> +* The VFIO device file , ``/dev/vfio/vfio``
>> +
>> +* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of
>> +  devices intended to be used by DPDK, for example, ``/dev/vfio/50``
>> +
>>
> Since we'd very much prefer in all cases people to use VFIO, I think the
> VFIO instructions should come first.
> Otherwise the text itself reads fine to me.

OK, will fix in v2.

>
> /Bruce
>

--
Thanks,
Anatoly
The current instructions are slightly out of date when it comes to
providing information about setting up the system for using DPDK as
non-root, so update them.

Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
---

Notes:
    v2:
    - Moved VFIO description to be first

 doc/guides/linux_gsg/enable_func.rst | 54 ++++++++++++++++++++--------
 1 file changed, 39 insertions(+), 15 deletions(-)

diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
index b2bda80bb7..a000def6cc 100644
--- a/doc/guides/linux_gsg/enable_func.rst
+++ b/doc/guides/linux_gsg/enable_func.rst
@@ -58,22 +58,42 @@ The application can then determine what action to take, if any, if the HPET is n
 if any, and on what is available on the system at runtime.

 Running DPDK Applications Without Root Privileges
---------------------------------------------------------
+-------------------------------------------------

-.. note::
+In order to run DPDK as non-root, the following Linux filesystem objects'
+permissions should be adjusted to ensure that the Linux account being used to
+run the DPDK application has access to them:

-    The instructions below will allow running DPDK as non-root with older
-    Linux kernel versions. However, since version 4.0, the kernel does not allow
-    unprivileged processes to read the physical address information from
-    the pagemaps file, making it impossible for those processes to use HW
-    devices which require physical addresses
+* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``

-Although applications using the DPDK use network ports and other hardware resources directly,
-with a number of small permission adjustments it is possible to run these applications as a user other than "root".
-To do so, the ownership, or permissions, on the following Linux file system objects should be adjusted to ensure that
-the Linux user account being used to run the DPDK application has access to them:
+* If the HPET is to be used, ``/dev/hpet``

-* All directories which serve as hugepage mount points, for example, ``/mnt/huge``
+When running as non-root user, there may be some additional resource limits
+that are imposed by the system. Specifically, the following resource limits may
+need to be adjusted in order to ensure normal DPDK operation:
+
+* RLIMIT_LOCKS (number of file locks that can be held by a process)
+
+* RLIMIT_NOFILE (number of open file descriptors that can be held open by a process)
+
+* RLIMIT_MEMLOCK (amount of pinned pages the process is allowed to have)
+
+The above limits can usually be adjusted by editing
+``/etc/security/limits.conf`` file, and rebooting.
+
+Additionally, depending on which kernel driver is in use, the relevant
+resources also should be accessible by the user running the DPDK application.
+
+For ``vfio-pci`` kernel driver, the following Linux file system objects'
+permissions should be adjusted:
+
+* The VFIO device file, ``/dev/vfio/vfio``
+
+* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of
+  devices intended to be used by DPDK, for example, ``/dev/vfio/50``
+
+For ``igb_uio`` or ``uio_pci_generic`` kernel drivers, the following Linux file
+system objects' permissions should be adjusted:

 * The userspace-io device files in ``/dev``, for example, ``/dev/uio0``, ``/dev/uio1``, and so on

@@ -82,11 +102,15 @@ the Linux user account being used to run the DPDK application has access to them
     /sys/class/uio/uio0/device/config
     /sys/class/uio/uio0/device/resource*

-* If the HPET is to be used, ``/dev/hpet``
-
 .. note::

-    On some Linux installations, ``/dev/hugepages`` is also a hugepage mount point created by default.
+    The instructions above will allow running DPDK with ``igb_uio`` driver as
+    non-root with older Linux kernel versions. However, since version 4.0, the
+    kernel does not allow unprivileged processes to read the physical address
+    information from the pagemaps file, making it impossible for those
+    processes to be used by non-privileged users. In such cases, using the VFIO
+    driver is recommended.
+

 Power Management and Power Saving Functionality
 -----------------------------------------------
--
2.17.1
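
The three resource limits listed in this patch can be inspected on a running system through /proc/self/limits, where the kernel reports them as "Max file locks", "Max open files" and "Max locked memory". The helper below is a minimal sketch, not something the patch proposes: the function name and the `dpdkuser` entries in the comments are hypothetical, and the values are placeholders rather than recommendations.

```shell
#!/bin/sh
# Sketch only: show_dpdk_limits is a hypothetical helper name.

# Print the rows for RLIMIT_MEMLOCK, RLIMIT_LOCKS and RLIMIT_NOFILE from a
# limits file in /proc format. Defaults to the limits of the calling
# process tree; another file can be passed as the first argument.
show_dpdk_limits() {
    grep -E '^Max (locked memory|file locks|open files) ' "${1:-/proc/self/limits}"
}

# Example (Linux only):
#   show_dpdk_limits
#
# Hypothetical /etc/security/limits.conf entries for a user "dpdkuser",
# using the pam_limits item names memlock, locks and nofile:
#   dpdkuser  -  memlock  unlimited
#   dpdkuser  -  locks    unlimited
#   dpdkuser  -  nofile   65536
```

Running the helper as the target user before and after editing limits.conf is a quick way to see whether the new values have actually been picked up by that user's session.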
Current information regarding hugepage usage is a little out of date. Update it to include information on in-memory mode, as well as on default mountpoints provided by systemd. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> --- Notes: v2: - Reworked the description - Put runtime reservation first, and boot time as an alternative - Clarified wording and fixed typos - Mentioned that some kernel versions not supporting reserving 1G pages doc/guides/linux_gsg/sys_reqs.rst | 71 ++++++++++++++++++++----------- 1 file changed, 45 insertions(+), 26 deletions(-) diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst index a124656bcb..8782d05579 100644 --- a/doc/guides/linux_gsg/sys_reqs.rst +++ b/doc/guides/linux_gsg/sys_reqs.rst @@ -155,8 +155,35 @@ Without hugepages, high TLB miss rates would occur with the standard 4k page siz Reserving Hugepages for DPDK Use ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -The allocation of hugepages should be done at boot time or as soon as possible after system boot -to prevent memory from being fragmented in physical memory. +The reservation of hugepages can be performed at run time. This is done by +echoing the number of hugepages required to a ``nr_hugepages`` file in the +``/sys/kernel/`` directory corresponding to a specific page size (in +Kilobytes). For a single-node system, the command to use is as follows +(assuming that 1024 of 2MB pages are required):: + + echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages + +On a NUMA machine, the above command will usually divide the number of hugepages +equally across all NUMA nodes (assuming there is enough memory on all NUMA +nodes). 
However, pages can also be reserved explicitly on individual NUMA +nodes using a ``nr_hugepages`` file in the ``/sys/devices/`` directory:: + + echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages + echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages + +.. note:: + + Some kernel versions may not allow reserving 1 GB hugepages at run time, so + reserving them at boot time may be the only option. Please see below for + instructions. + +**Alternative:** + +In the general case, reserving hugepages at run time is perfectly fine, but in +use cases where having lots of physically contiguous memory is required, it is +preferable to reserve hugepages at boot time, as that will help in preventing +physical memory from becoming heavily fragmented. + To reserve hugepages at boot time, a parameter is passed to the Linux kernel on the kernel command line. For 2 MB pages, just pass the hugepages option to the kernel. For example, to reserve 1024 pages of 2 MB, use:: @@ -185,35 +212,27 @@ the number of hugepages reserved at boot time is generally divided equally betwe See the Documentation/admin-guide/kernel-parameters.txt file in your Linux source tree for further details of these and other kernel options. -**Alternative:** - -For 2 MB pages, there is also the option of allocating hugepages after the system has booted. -This is done by echoing the number of hugepages required to a nr_hugepages file in the ``/sys/devices/`` directory. -For a single-node system, the command to use is as follows (assuming that 1024 pages are required):: - - echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages - -On a NUMA machine, pages should be allocated explicitly on separate nodes:: - - echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages - echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages - -.. 
note:: - - For 1G pages, it is not possible to reserve the hugepage memory after the system has booted. - Using Hugepages with the DPDK ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps:: +If secondary process support is not required, DPDK is able to use hugepages +without any configuration by using "in-memory" mode. Please see +:ref:`linux_eal_parameters` for more details. + +If secondary process support is required, mount points for hugepages need to be +created. On modern Linux distributions, a default mount point for hugepages is provided +by the system and is located at ``/dev/hugepages``. This mount point will use the +default hugepage size set by the kernel parameters as described above. + +However, in order to use hugepage sizes other than default, it is necessary to +manually create mount points for hugepage sizes that are not provided by the +system (e.g. 1GB pages). + +To make the hugepages of size 1GB available for DPDK use, perform the following steps:: mkdir /mnt/huge - mount -t hugetlbfs nodev /mnt/huge + mount -t hugetlbfs pagesize=1GB /mnt/huge The mount point can be made permanent across reboots, by adding the following line to the ``/etc/fstab`` file:: - nodev /mnt/huge hugetlbfs defaults 0 0 - -For 1GB pages, the page size must be specified as a mount option:: - - nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0 + nodev /mnt/huge hugetlbfs pagesize=1GB 0 0 -- 2.17.1
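After writing to ``nr_hugepages`` it is worth reading the value back, since the kernel may reserve fewer pages than requested when memory is fragmented. The sketch below is an illustration under assumptions rather than anything taken from the patch: the helper names ``hugepage_dir`` and ``hugepage_count`` are hypothetical, and the optional third argument exists only so that the sysfs root can be replaced by a test directory. The ``nr_hugepages`` and ``free_hugepages`` files are the ones the kernel provides in each ``hugepages-<size>kB`` directory.

```shell
#!/bin/sh
# hugepage_dir SIZE_KB [NODE] [SYSFS_ROOT]
# Print the sysfs directory for a hugepage size, system-wide when NODE is
# empty, or for one NUMA node otherwise. Hypothetical helper.
hugepage_dir() {
    size_kb=$1
    node=${2:-}
    root=${3:-/sys}
    if [ -n "$node" ]; then
        printf '%s\n' "$root/devices/system/node/node$node/hugepages/hugepages-${size_kb}kB"
    else
        printf '%s\n' "$root/kernel/mm/hugepages/hugepages-${size_kb}kB"
    fi
}

# hugepage_count SIZE_KB [NODE] [SYSFS_ROOT]
# Print "<reserved> <free>" for that page size, or fail if the directory
# is not readable (for example, when the page size is unsupported).
hugepage_count() {
    dir=$(hugepage_dir "$@")
    [ -r "$dir/nr_hugepages" ] || return 1
    printf '%s %s\n' "$(cat "$dir/nr_hugepages")" "$(cat "$dir/free_hugepages")"
}
```

For example, ``hugepage_count 2048`` reports the system-wide 2 MB pool, and ``hugepage_count 2048 0`` reports the pool on NUMA node 0.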
On Tue, Aug 25, 2020 at 01:17:48PM +0100, Anatoly Burakov wrote:
> The current instructions are slightly out of date when it comes to
> providing information about setting up the system for using DPDK as
> non-root, so update them.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
On Tue, Aug 25, 2020 at 01:17:49PM +0100, Anatoly Burakov wrote:
> Current information regarding hugepage usage is a little out of date.
> Update it to include information on in-memory mode, as well as on
> default mountpoints provided by systemd.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>
> Notes:
>     v2:
>     - Reworked the description
>     - Put runtime reservation first, and boot time as an alternative
>     - Clarified wording and fixed typos
>     - Mentioned that some kernel versions not supporting reserving 1G pages
>
>  doc/guides/linux_gsg/sys_reqs.rst | 71 ++++++++++++++++++++-----------
>  1 file changed, 45 insertions(+), 26 deletions(-)
>
> diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
> index a124656bcb..8782d05579 100644
> --- a/doc/guides/linux_gsg/sys_reqs.rst
> +++ b/doc/guides/linux_gsg/sys_reqs.rst
> @@ -155,8 +155,35 @@ Without hugepages, high TLB miss rates would occur with the standard 4k page siz
>  Reserving Hugepages for DPDK Use
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> -The allocation of hugepages should be done at boot time or as soon as possible after system boot
> -to prevent memory from being fragmented in physical memory.
> +The reservation of hugepages can be performed at run time. This is done by
> +echoing the number of hugepages required to a ``nr_hugepages`` file in the
> +``/sys/kernel/`` directory corresponding to a specific page size (in
> +Kilobytes). For a single-node system, the command to use is as follows
> +(assuming that 1024 of 2MB pages are required)::
> +
> +    echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +On a NUMA machine, the above command will usually divide the number of hugepages
> +equally across all NUMA nodes (assuming there is enough memory on all NUMA
> +nodes). However, pages can also be reserved explicitly on individual NUMA
> +nodes using a ``nr_hugepages`` file in the ``/sys/devices/`` directory::
> +
> +    echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
> +    echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
> +
> +.. note::
> +
> +    Some kernel versions may not allow reserving 1 GB hugepages at run time, so
> +    reserving them at boot time may be the only option. Please see below for
> +    instructions.
> +
> +**Alternative:**
> +
> +In the general case, reserving hugepages at run time is perfectly fine, but in
> +use cases where having lots of physically contiguous memory is required, it is
> +preferable to reserve hugepages at boot time, as that will help in preventing
> +physical memory from becoming heavily fragmented.
> +
>  To reserve hugepages at boot time, a parameter is passed to the Linux kernel on the kernel command line.
>
>  For 2 MB pages, just pass the hugepages option to the kernel. For example, to reserve 1024 pages of 2 MB, use::
> @@ -185,35 +212,27 @@ the number of hugepages reserved at boot time is generally divided equally betwe
>
>  See the Documentation/admin-guide/kernel-parameters.txt file in your Linux source tree for further details of these and other kernel options.
>
> -**Alternative:**
> -
> -For 2 MB pages, there is also the option of allocating hugepages after the system has booted.
> -This is done by echoing the number of hugepages required to a nr_hugepages file in the ``/sys/devices/`` directory.
> -For a single-node system, the command to use is as follows (assuming that 1024 pages are required)::
> -
> -    echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> -
> -On a NUMA machine, pages should be allocated explicitly on separate nodes::
> -
> -    echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
> -    echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
> -
> -.. note::
> -
> -    For 1G pages, it is not possible to reserve the hugepage memory after the system has booted.
> -
>  Using Hugepages with the DPDK
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> -Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps::
> +If secondary process support is not required, DPDK is able to use hugepages
> +without any configuration by using "in-memory" mode. Please see
> +:ref:`linux_eal_parameters` for more details.
> +
> +If secondary process support is required, mount points for hugepages need to be
> +created. On modern Linux distributions, a default mount point for hugepages is provided
> +by the system and is located at ``/dev/hugepages``. This mount point will use the
> +default hugepage size set by the kernel parameters as described above.
> +
> +However, in order to use hugepage sizes other than default, it is necessary to
> +manually create mount points for hugepage sizes that are not provided by the
> +system (e.g. 1GB pages).

This reads a bit strangely, as it implies that the hugepage sizes are not
provided by the system, but I believe the intention is to say that the mount
points are not provided by the system, correct? Perhaps look to reword.

> +
> +To make the hugepages of size 1GB available for DPDK use, perform the following steps::
>
>      mkdir /mnt/huge
> -    mount -t hugetlbfs nodev /mnt/huge
> +    mount -t hugetlbfs pagesize=1GB /mnt/huge
>
>  The mount point can be made permanent across reboots, by adding the following line to the ``/etc/fstab`` file::
>
> -    nodev /mnt/huge hugetlbfs defaults 0 0
> -
> -For 1GB pages, the page size must be specified as a mount option::
> -
> -    nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0
> +    nodev /mnt/huge hugetlbfs pagesize=1GB 0 0
> --
> 2.17.1

Apart from the one note above, LGTM:

Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The current instructions are slightly out of date when it comes to providing information about setting up the system for using DPDK as non-root, so update them. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- Notes: v2: - Moved VFIO description to be first doc/guides/linux_gsg/enable_func.rst | 54 ++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 15 deletions(-) diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst index b2bda80bb7..a000def6cc 100644 --- a/doc/guides/linux_gsg/enable_func.rst +++ b/doc/guides/linux_gsg/enable_func.rst @@ -58,22 +58,42 @@ The application can then determine what action to take, if any, if the HPET is n if any, and on what is available on the system at runtime. Running DPDK Applications Without Root Privileges --------------------------------------------------------- +------------------------------------------------- -.. note:: +In order to run DPDK as non-root, the following Linux filesystem objects' +permissions should be adjusted to ensure that the Linux account being used to +run the DPDK application has access to them: - The instructions below will allow running DPDK as non-root with older - Linux kernel versions. However, since version 4.0, the kernel does not allow - unprivileged processes to read the physical address information from - the pagemaps file, making it impossible for those processes to use HW - devices which require physical addresses +* All directories which serve as hugepage mount points, for example, ``/dev/hugepages`` -Although applications using the DPDK use network ports and other hardware resources directly, -with a number of small permission adjustments it is possible to run these applications as a user other than "root". 
-To do so, the ownership, or permissions, on the following Linux file system objects should be adjusted to ensure that -the Linux user account being used to run the DPDK application has access to them: +* If the HPET is to be used, ``/dev/hpet`` -* All directories which serve as hugepage mount points, for example, ``/mnt/huge`` +When running as non-root user, there may be some additional resource limits +that are imposed by the system. Specifically, the following resource limits may +need to be adjusted in order to ensure normal DPDK operation: + +* RLIMIT_LOCKS (number of file locks that can be held by a process) + +* RLIMIT_NOFILE (number of open file descriptors that can be held open by a process) + +* RLIMIT_MEMLOCK (amount of pinned pages the process is allowed to have) + +The above limits can usually be adjusted by editing +``/etc/security/limits.conf`` file, and rebooting. + +Additionally, depending on which kernel driver is in use, the relevant +resources also should be accessible by the user running the DPDK application. + +For ``vfio-pci`` kernel driver, the following Linux file system objects' +permissions should be adjusted: + +* The VFIO device file, ``/dev/vfio/vfio`` + +* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of + devices intended to be used by DPDK, for example, ``/dev/vfio/50`` + +For ``igb_uio`` or ``uio_pci_generic`` kernel drivers, the following Linux file +system objects' permissions should be adjusted: * The userspace-io device files in ``/dev``, for example, ``/dev/uio0``, ``/dev/uio1``, and so on @@ -82,11 +102,15 @@ the Linux user account being used to run the DPDK application has access to them /sys/class/uio/uio0/device/config /sys/class/uio/uio0/device/resource* -* If the HPET is to be used, ``/dev/hpet`` - .. note:: - On some Linux installations, ``/dev/hugepages`` is also a hugepage mount point created by default. 
+ The instructions above will allow running DPDK with ``igb_uio`` driver as + non-root with older Linux kernel versions. However, since version 4.0, the + kernel does not allow unprivileged processes to read the physical address + information from the pagemaps file, making it impossible for those + processes to be used by non-privileged users. In such cases, using the VFIO + driver is recommended. + Power Management and Power Saving Functionality ----------------------------------------------- -- 2.17.1
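For the ``vfio-pci`` case, the entry under ``/dev/vfio`` whose permissions need adjusting can be found from the IOMMU group of the device. The following is a minimal sketch under assumptions, not something from the patch: the helper name ``vfio_group_node`` and the PCI address ``0000:01:00.0`` are hypothetical, and the optional second argument exists only so the sysfs root can be replaced by a test directory. It relies on the ``iommu_group`` symlink that the kernel exposes in each PCI device directory in sysfs.

```shell
#!/bin/sh
# vfio_group_node PCI_ADDR [SYSFS_ROOT]
# Print the /dev/vfio/<group> path for a PCI device, or fail when the
# device has no IOMMU group (for example, when the IOMMU is disabled).
# Hypothetical helper, for illustration only.
vfio_group_node() {
    link="${2:-/sys}/bus/pci/devices/$1/iommu_group"
    [ -L "$link" ] || return 1
    # The symlink target ends in the IOMMU group number.
    group=$(basename "$(readlink "$link")")
    printf '/dev/vfio/%s\n' "$group"
}
```

With that in place, ``ls -l "$(vfio_group_node 0000:01:00.0)" /dev/vfio/vfio`` shows the current ownership and mode of the two entries listed in the patch.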
Current information regarding hugepage usage is a little out of date. Update it to include information on in-memory mode, as well as on default mountpoints provided by systemd. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- Notes: v3: - Clarified wording around non-default hugepage sizes v2: - Reworked the description - Put runtime reservation first, and boot time as an alternative - Clarified wording and fixed typos - Mentioned that some kernel versions not supporting reserving 1G pages doc/guides/linux_gsg/sys_reqs.rst | 70 +++++++++++++++++++------------ 1 file changed, 44 insertions(+), 26 deletions(-) diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst index a124656bcb..587f9e85e5 100644 --- a/doc/guides/linux_gsg/sys_reqs.rst +++ b/doc/guides/linux_gsg/sys_reqs.rst @@ -155,8 +155,35 @@ Without hugepages, high TLB miss rates would occur with the standard 4k page siz Reserving Hugepages for DPDK Use ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -The allocation of hugepages should be done at boot time or as soon as possible after system boot -to prevent memory from being fragmented in physical memory. +The reservation of hugepages can be performed at run time. This is done by +echoing the number of hugepages required to a ``nr_hugepages`` file in the +``/sys/kernel/`` directory corresponding to a specific page size (in +Kilobytes). For a single-node system, the command to use is as follows +(assuming that 1024 of 2MB pages are required):: + + echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages + +On a NUMA machine, the above command will usually divide the number of hugepages +equally across all NUMA nodes (assuming there is enough memory on all NUMA +nodes). 
However, pages can also be reserved explicitly on individual NUMA +nodes using a ``nr_hugepages`` file in the ``/sys/devices/`` directory:: + + echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages + echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages + +.. note:: + + Some kernel versions may not allow reserving 1 GB hugepages at run time, so + reserving them at boot time may be the only option. Please see below for + instructions. + +**Alternative:** + +In the general case, reserving hugepages at run time is perfectly fine, but in +use cases where having lots of physically contiguous memory is required, it is +preferable to reserve hugepages at boot time, as that will help in preventing +physical memory from becoming heavily fragmented. + To reserve hugepages at boot time, a parameter is passed to the Linux kernel on the kernel command line. For 2 MB pages, just pass the hugepages option to the kernel. For example, to reserve 1024 pages of 2 MB, use:: @@ -185,35 +212,26 @@ the number of hugepages reserved at boot time is generally divided equally betwe See the Documentation/admin-guide/kernel-parameters.txt file in your Linux source tree for further details of these and other kernel options. -**Alternative:** - -For 2 MB pages, there is also the option of allocating hugepages after the system has booted. -This is done by echoing the number of hugepages required to a nr_hugepages file in the ``/sys/devices/`` directory. -For a single-node system, the command to use is as follows (assuming that 1024 pages are required):: - - echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages - -On a NUMA machine, pages should be allocated explicitly on separate nodes:: - - echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages - echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages - -.. 
note:: - - For 1G pages, it is not possible to reserve the hugepage memory after the system has booted. - Using Hugepages with the DPDK ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps:: +If secondary process support is not required, DPDK is able to use hugepages +without any configuration by using "in-memory" mode. Please see +:ref:`linux_eal_parameters` for more details. + +If secondary process support is required, mount points for hugepages need to be +created. On modern Linux distributions, a default mount point for hugepages is provided +by the system and is located at ``/dev/hugepages``. This mount point will use the +default hugepage size set by the kernel parameters as described above. + +However, in order to use hugepage sizes other than the default, it is necessary +to manually create mount points for those hugepage sizes (e.g. 1GB pages). + +To make the hugepages of size 1GB available for DPDK use, perform the following steps:: mkdir /mnt/huge - mount -t hugetlbfs nodev /mnt/huge + mount -t hugetlbfs pagesize=1GB /mnt/huge The mount point can be made permanent across reboots, by adding the following line to the ``/etc/fstab`` file:: - nodev /mnt/huge hugetlbfs defaults 0 0 - -For 1GB pages, the page size must be specified as a mount option:: - - nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0 + nodev /mnt/huge hugetlbfs pagesize=1GB 0 0 -- 2.17.1
The current instructions are slightly out of date when it comes to providing information about setting up the system for using DPDK as non-root, so update them. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- Notes: v2: - Moved VFIO description to be first doc/guides/linux_gsg/enable_func.rst | 58 ++++++++++++++++++++-------- 1 file changed, 41 insertions(+), 17 deletions(-) diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst index aab32252ea..29e1b90217 100644 --- a/doc/guides/linux_gsg/enable_func.rst +++ b/doc/guides/linux_gsg/enable_func.rst @@ -60,22 +60,51 @@ The application can then determine what action to take, if any, if the HPET is n if any, and on what is available on the system at runtime. Running DPDK Applications Without Root Privileges --------------------------------------------------------- +------------------------------------------------- + +In order to run DPDK as non-root, the following Linux filesystem objects' +permissions should be adjusted to ensure that the Linux account being used to +run the DPDK application has access to them: + +* All directories which serve as hugepage mount points, for example, ``/dev/hugepages`` + +* If the HPET is to be used, ``/dev/hpet`` + +When running as non-root user, there may be some additional resource limits +that are imposed by the system. Specifically, the following resource limits may +need to be adjusted in order to ensure normal DPDK operation: + +* RLIMIT_LOCKS (number of file locks that can be held by a process) + +* RLIMIT_NOFILE (number of open file descriptors that can be held open by a process) + +* RLIMIT_MEMLOCK (amount of pinned pages the process is allowed to have) + +The above limits can usually be adjusted by editing +``/etc/security/limits.conf`` file, and rebooting. 
+ +Additionally, depending on which kernel driver is in use, the relevant +resources also should be accessible by the user running the DPDK application. + +For ``vfio-pci`` kernel driver, the following Linux file system objects' +permissions should be adjusted: + +* The VFIO device file, ``/dev/vfio/vfio`` + +* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of + devices intended to be used by DPDK, for example, ``/dev/vfio/50`` .. note:: - The instructions below will allow running DPDK as non-root with older - Linux kernel versions. However, since version 4.0, the kernel does not allow - unprivileged processes to read the physical address information from - the pagemaps file, making it impossible for those processes to use HW - devices which require physical addresses + The instructions below will allow running DPDK with ``igb_uio`` or + ``uio_pci_generic`` drivers as non-root with older Linux kernel versions. + However, since version 4.0, the kernel does not allow unprivileged processes + to read the physical address information from the pagemaps file, making it + impossible for those processes to be used by non-privileged users. In such + cases, using the VFIO driver is recommended. -Although applications using the DPDK use network ports and other hardware resources directly, -with a number of small permission adjustments it is possible to run these applications as a user other than "root". 
-To do so, the ownership, or permissions, on the following Linux file system objects should be adjusted to ensure that -the Linux user account being used to run the DPDK application has access to them: - -* All directories which serve as hugepage mount points, for example, ``/mnt/huge`` +For ``igb_uio`` or ``uio_pci_generic`` kernel drivers, the following Linux file +system objects' permissions should be adjusted: * The userspace-io device files in ``/dev``, for example, ``/dev/uio0``, ``/dev/uio1``, and so on @@ -84,11 +113,6 @@ the Linux user account being used to run the DPDK application has access to them /sys/class/uio/uio0/device/config /sys/class/uio/uio0/device/resource* -* If the HPET is to be used, ``/dev/hpet`` - -.. note:: - - On some Linux installations, ``/dev/hugepages`` is also a hugepage mount point created by default. Power Management and Power Saving Functionality ----------------------------------------------- -- 2.17.1
Current information regarding hugepage usage is a little out of date. Update it to include information on in-memory mode, as well as on default mountpoints provided by systemd. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- Notes: v3: - Clarified wording around non-default hugepage sizes v2: - Reworked the description - Put runtime reservation first, and boot time as an alternative - Clarified wording and fixed typos - Mentioned that some kernel versions not supporting reserving 1G pages doc/guides/linux_gsg/sys_reqs.rst | 70 +++++++++++++++++++------------ 1 file changed, 44 insertions(+), 26 deletions(-) diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst index 6ecdc04aa9..3966c30456 100644 --- a/doc/guides/linux_gsg/sys_reqs.rst +++ b/doc/guides/linux_gsg/sys_reqs.rst @@ -147,8 +147,35 @@ Without hugepages, high TLB miss rates would occur with the standard 4k page siz Reserving Hugepages for DPDK Use ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -The allocation of hugepages should be done at boot time or as soon as possible after system boot -to prevent memory from being fragmented in physical memory. +The reservation of hugepages can be performed at run time. This is done by +echoing the number of hugepages required to a ``nr_hugepages`` file in the +``/sys/kernel/`` directory corresponding to a specific page size (in +Kilobytes). For a single-node system, the command to use is as follows +(assuming that 1024 of 2MB pages are required):: + + echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages + +On a NUMA machine, the above command will usually divide the number of hugepages +equally across all NUMA nodes (assuming there is enough memory on all NUMA +nodes). 
However, pages can also be reserved explicitly on individual NUMA +nodes using a ``nr_hugepages`` file in the ``/sys/devices/`` directory:: + + echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages + echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages + +.. note:: + + Some kernel versions may not allow reserving 1 GB hugepages at run time, so + reserving them at boot time may be the only option. Please see below for + instructions. + +**Alternative:** + +In the general case, reserving hugepages at run time is perfectly fine, but in +use cases where having lots of physically contiguous memory is required, it is +preferable to reserve hugepages at boot time, as that will help in preventing +physical memory from becoming heavily fragmented. + To reserve hugepages at boot time, a parameter is passed to the Linux kernel on the kernel command line. For 2 MB pages, just pass the hugepages option to the kernel. For example, to reserve 1024 pages of 2 MB, use:: @@ -177,35 +204,26 @@ the number of hugepages reserved at boot time is generally divided equally betwe See the Documentation/admin-guide/kernel-parameters.txt file in your Linux source tree for further details of these and other kernel options. -**Alternative:** - -For 2 MB pages, there is also the option of allocating hugepages after the system has booted. -This is done by echoing the number of hugepages required to a nr_hugepages file in the ``/sys/devices/`` directory. -For a single-node system, the command to use is as follows (assuming that 1024 pages are required):: - - echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages - -On a NUMA machine, pages should be allocated explicitly on separate nodes:: - - echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages - echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages - -.. 
note:: - - For 1G pages, it is not possible to reserve the hugepage memory after the system has booted. - Using Hugepages with the DPDK ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps:: +If secondary process support is not required, DPDK is able to use hugepages +without any configuration by using "in-memory" mode. Please see +:ref:`linux_eal_parameters` for more details. + +If secondary process support is required, mount points for hugepages need to be +created. On modern Linux distributions, a default mount point for hugepages is provided +by the system and is located at ``/dev/hugepages``. This mount point will use the +default hugepage size set by the kernel parameters as described above. + +However, in order to use hugepage sizes other than the default, it is necessary +to manually create mount points for those hugepage sizes (e.g. 1GB pages). + +To make the hugepages of size 1GB available for DPDK use, perform the following steps:: mkdir /mnt/huge - mount -t hugetlbfs nodev /mnt/huge + mount -t hugetlbfs pagesize=1GB /mnt/huge The mount point can be made permanent across reboots, by adding the following line to the ``/etc/fstab`` file:: - nodev /mnt/huge hugetlbfs defaults 0 0 - -For 1GB pages, the page size must be specified as a mount option:: - - nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0 + nodev /mnt/huge hugetlbfs pagesize=1GB 0 0 -- 2.17.1
On Thu, Nov 19, 2020 at 11:53 AM Anatoly Burakov
<anatoly.burakov@intel.com> wrote:
> -Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps::
> +If secondary process support is not required, DPDK is able to use hugepages
> +without any configuration by using "in-memory" mode. Please see
> +:ref:`linux_eal_parameters` for more details.

There is no such reference:

Found ninja-1.9.0 at /usr/bin/ninja
[3/4] Generating html_guides with a custom command.
Install the sphinx ReadTheDocs theme for improved html documentation
layout: https://sphinx-rtd-theme.readthedocs.io/
/home/dmarchan/dpdk/doc/guides/linux_gsg/sys_reqs.rst:210: WARNING:
undefined label: linux_eal_parameters (if the link has no caption the
label must precede a section header)
[3/4] Running external command doc.
Building docs: Doxygen_API HTML_Guides

Did you mean :doc: ?

--
David Marchand
On 19-Nov-20 9:03 PM, David Marchand wrote:
> On Thu, Nov 19, 2020 at 11:53 AM Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
>> -Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps::
>> +If secondary process support is not required, DPDK is able to use hugepages
>> +without any configuration by using "in-memory" mode. Please see
>> +:ref:`linux_eal_parameters` for more details.
>
> There is no such reference:
>
> Found ninja-1.9.0 at /usr/bin/ninja
> [3/4] Generating html_guides with a custom command.
> Install the sphinx ReadTheDocs theme for improved html documentation
> layout: https://sphinx-rtd-theme.readthedocs.io/
> /home/dmarchan/dpdk/doc/guides/linux_gsg/sys_reqs.rst:210: WARNING:
> undefined label: linux_eal_parameters (if the link has no caption the
> label must precede a section header)
> [3/4] Running external command doc.
> Building docs: Doxygen_API HTML_Guides
>
> Did you mean :doc: ?
>
>
Most probably yes, I did. Should this be fixed on apply, or should I respin?
--
Thanks,
Anatoly
19/11/2020 11:52, Anatoly Burakov:
> Current information regarding hugepage usage is a little out of date.
> Update it to include information on in-memory mode, as well as on
> default mountpoints provided by systemd.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Applied with small adjustments, thanks.
Note about doc writing:
The doc source is easier to read and update when lines are kept short and wrapped at logical breaks.