From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <stable-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id B75E0A04B2
	for <public@inbox.dpdk.org>; Tue, 25 Aug 2020 15:10:54 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 8F71B100C;
	Tue, 25 Aug 2020 15:10:54 +0200 (CEST)
Received: from mga05.intel.com (mga05.intel.com [192.55.52.43])
 by dpdk.org (Postfix) with ESMTP id 6613923D;
 Tue, 25 Aug 2020 15:10:50 +0200 (CEST)
IronPort-SDR: CwgJsNT03u4ih/qDgeNDKakM16Gku6F/C3YzcWz9WKRXxuIuq9qKUrsoWcdINFJv9lOQQ10vPb
 i+TEskTz9J+g==
X-IronPort-AV: E=McAfee;i="6000,8403,9723"; a="240920949"
X-IronPort-AV: E=Sophos;i="5.76,352,1592895600"; d="scan'208";a="240920949"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga005.jf.intel.com ([10.7.209.41])
 by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 25 Aug 2020 06:10:49 -0700
IronPort-SDR: dQBR0+vxrVk50S8ajJ4zmHY56A3Um39Br0tqkK/1bbaP3cfkr69r/O0BeL9eNCP97z7CK0i/1m
 PhYUugF90T1g==
X-IronPort-AV: E=Sophos;i="5.76,352,1592895600"; d="scan'208";a="474339785"
Received: from bricha3-mobl.ger.corp.intel.com ([10.252.1.150])
 by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-SHA;
 25 Aug 2020 06:10:47 -0700
Date: Tue, 25 Aug 2020 14:10:40 +0100
From: Bruce Richardson <bruce.richardson@intel.com>
To: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: dev@dpdk.org, John McNamara <john.mcnamara@intel.com>,
 Marko Kovacevic <marko.kovacevic@intel.com>, ferruh.yigit@intel.com,
 padraig.j.connolly@intel.com, stable@dpdk.org
Message-ID: <20200825131040.GB554@bricha3-MOBL.ger.corp.intel.com>
References: <aca9a5986871ecb3aba7f476fa906a34dabc9e7e.1598283570.git.anatoly.burakov@intel.com>
 <5f68d1f573f9edee2aed8c3d81b655416c18dff0.1598357863.git.anatoly.burakov@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5f68d1f573f9edee2aed8c3d81b655416c18dff0.1598357863.git.anatoly.burakov@intel.com>
Subject: Re: [dpdk-stable] [PATCH v2 2/2] doc/linux_gsg: update information
 on using hugepages
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches for DPDK stable branches <stable.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/stable>,
 <mailto:stable-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/stable/>
List-Post: <mailto:stable@dpdk.org>
List-Help: <mailto:stable-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/stable>,
 <mailto:stable-request@dpdk.org?subject=subscribe>
Errors-To: stable-bounces@dpdk.org
Sender: "stable" <stable-bounces@dpdk.org>

On Tue, Aug 25, 2020 at 01:17:49PM +0100, Anatoly Burakov wrote:
> Current information regarding hugepage usage is a little out of date.
> Update it to include information on in-memory mode, as well as on
> default mountpoints provided by systemd.
> 
> Cc: stable@dpdk.org
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> 
> Notes:
>     v2:
>     - Reworked the description
>     - Put runtime reservation first, and boot time as an alternative
>     - Clarified wording and fixed typos
>     - Mentioned that some kernel versions not supporting reserving 1G pages
> 
>  doc/guides/linux_gsg/sys_reqs.rst | 71 ++++++++++++++++++++-----------
>  1 file changed, 45 insertions(+), 26 deletions(-)
> 
> diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
> index a124656bcb..8782d05579 100644
> --- a/doc/guides/linux_gsg/sys_reqs.rst
> +++ b/doc/guides/linux_gsg/sys_reqs.rst
> @@ -155,8 +155,35 @@ Without hugepages, high TLB miss rates would occur with the standard 4k page siz
>  Reserving Hugepages for DPDK Use
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>  
> -The allocation of hugepages should be done at boot time or as soon as possible after system boot
> -to prevent memory from being fragmented in physical memory.
> +The reservation of hugepages can be performed at run time. This is done by
> +echoing the number of hugepages required to a ``nr_hugepages`` file in the
> +``/sys/kernel/`` directory corresponding to a specific page size (in
> +Kilobytes). For a single-node system, the command to use is as follows
> +(assuming that 1024 of 2MB pages are required)::
> +
> +    echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +On a NUMA machine, the above command will usually divide the number of hugepages
> +equally across all NUMA nodes (assuming there is enough memory on all NUMA
> +nodes). However, pages can also be reserved explicitly on individual NUMA
> +nodes using a ``nr_hugepages`` file in the ``/sys/devices/`` directory::
> +
> +    echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
> +    echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
> +
> +.. note::
> +
> +    Some kernel versions may not allow reserving 1 GB hugepages at run time, so
> +    reserving them at boot time may be the only option. Please see below for
> +    instructions.
> +
> +**Alternative:**
> +
> +In the general case, reserving hugepages at run time is perfectly fine, but in
> +use cases where having lots of physically contiguous memory is required, it is
> +preferable to reserve hugepages at boot time, as that will help in preventing
> +physical memory from becoming heavily fragmented.
> +
>  To reserve hugepages at boot time, a parameter is passed to the Linux kernel on the kernel command line.
>  
>  For 2 MB pages, just pass the hugepages option to the kernel. For example, to reserve 1024 pages of 2 MB, use::
> @@ -185,35 +212,27 @@ the number of hugepages reserved at boot time is generally divided equally betwe
>  
>  See the Documentation/admin-guide/kernel-parameters.txt file in your Linux source tree for further details of these and other kernel options.
>  
> -**Alternative:**
> -
> -For 2 MB pages, there is also the option of allocating hugepages after the system has booted.
> -This is done by echoing the number of hugepages required to a nr_hugepages file in the ``/sys/devices/`` directory.
> -For a single-node system, the command to use is as follows (assuming that 1024 pages are required)::
> -
> -    echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> -
> -On a NUMA machine, pages should be allocated explicitly on separate nodes::
> -
> -    echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
> -    echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
> -
> -.. note::
> -
> -    For 1G pages, it is not possible to reserve the hugepage memory after the system has booted.
> -
>  Using Hugepages with the DPDK
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>  
> -Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps::
> +If secondary process support is not required, DPDK is able to use hugepages
> +without any configuration by using "in-memory" mode. Please see
> +:ref:`linux_eal_parameters` for more details.
> +
> +If secondary process support is required, mount points for hugepages need to be
> +created. On modern Linux distributions, a default mount point for hugepages is provided
> +by the system and is located at ``/dev/hugepages``. This mount point will use the
> +default hugepage size set by the kernel parameters as described above.
> +
> +However, in order to use hugepage sizes other than default, it is necessary to
> +manually create mount points for hugepage sizes that are not provided by the
> +system (e.g. 1GB pages).

This reads a bit strangely, as it implies that the hugepage sizes are not
provided by the system, but I believe the intention is to say that the
mount points are not provided by the system, correct? Perhaps look to
reword.

> +
> +To make the hugepages of size 1GB available for DPDK use, perform the following steps::
>  
>      mkdir /mnt/huge
> -    mount -t hugetlbfs nodev /mnt/huge
> +    mount -t hugetlbfs pagesize=1GB /mnt/huge
>  
>  The mount point can be made permanent across reboots, by adding the following line to the ``/etc/fstab`` file::
>  
> -    nodev /mnt/huge hugetlbfs defaults 0 0
> -
> -For 1GB pages, the page size must be specified as a mount option::
> -
> -    nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0
> +    nodev /mnt/huge hugetlbfs pagesize=1GB 0 0
> -- 
> 2.17.1

Apart from the one note above, LGTM:

Acked-by: Bruce Richardson <bruce.richardson@intel.com>