DPDK patches and discussions
 help / color / mirror / Atom feed
From: Thomas Monjalon <thomas@monjalon.net>
To: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Cc: dev@dpdk.org, Anatoly Burakov <anatoly.burakov@intel.com>,
	david.marchand@redhat.com, bruce.richardson@intel.com
Subject: Re: [PATCH v1 1/6] doc: add hugepage mapping details
Date: Mon, 17 Jan 2022 10:20:59 +0100	[thread overview]
Message-ID: <3969190.6PsWsQAL7t@thomas> (raw)
In-Reply-To: <20220117080801.481568-2-dkozlyuk@nvidia.com>

Thanks for the nice addition to the documentation, this is really needed.
Some comments below.

17/01/2022 09:07, Dmitry Kozlyuk:
> --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> -    Memory reservations done using the APIs provided by rte_malloc are also backed by pages from the hugetlbfs filesystem.
> +    Memory reservations done using the APIs provided by rte_malloc are also backed by hugepages.

Should we mention except if --no-huge is used?

> +Hugepage Mapping
> +^^^^^^^^^^^^^^^^
> +
> +Below is an overview of methods used for each OS to obtain hugepages,
> +explaining why certain limitations and options exist in EAL.
> +See the user guide for a specific OS for configuration details.
> +
> +FreeBSD uses ``contigmem`` kernel module
> +to reserve a fixed number of hugepages at system start,
> +which are mapped by EAL at initialization using a specific ``sysctl()``.
> +
> +Windows EAL allocates hugepages from the OS as needed using Win32 API,
> +so available amount depends on the system load.
> +It uses ``virt2phys`` kernel module to obtain physical addresses,
> +unless running in IOVA-as-VA mode (e.g. forced with ``--iova-mode=va``).
> +
> +Linux implements a variety of methods:
> +
> +* mapping each hugepage from its own file in hugetlbfs;
> +* mapping multiple hugepages from a shared file in hugetlbfs;
> +* anonymous mapping.
> +
> +Mapping hugepages from files in hugetlbfs is essential for multi-process,
> +because secondary processes need to map the same hugepages.
> +EAL creates files like ``rtemap_0``
> +in directories specified with ``--huge-dir`` option
> +(or in the mount point for a specific hugepage size).
> +The ``rtemap_`` prefix can be changed using ``--file-prefix``.
> +This may be needed for running multiple primary processes
> +that share a hugetlbfs mount point.
> +Each backing file by default corresponds to one hugepage,
> +it is opened and locked for the entire time the hugepage is used.
> +See :ref:`segment-file-descriptors` section
> +on how the number of open backing file descriptors can be reduced.
> +
> +Backing files may persist after the corresponding hugepage is freed
> +and even after the application terminates,
> +reducing the number of hugepages available to other processes.
> +EAL removes existing files at startup
> +and can remove newly created files before mapping them with ``--huge-unlink``.

This sentence require more explanations, as it is not clear when and why.

> +However, since it disables multi-process anyway,
> +using anonymous mapping (``--in-memory``) is recommended instead.
> +
> +:ref:`EAL memory allocator <malloc>` relies on hugepages being zero-filled.
> +Hugepages are cleared by the kernel when a file in hugetlbfs or its part
> +is mapped for the first time system-wide
> +to prevent data leaks from previous users of the same hugepage.
> +EAL ensures this behavior by removing existing backing files at startup
> +and by recreating them before opening for mapping (as a precaution).
> +
> +Anonymous mapping does not allow multi-process architecture,
> +but it is free of filename conflicts and leftover files on hugetlbfs.

It is also easier to run as non-root.

> +If memfd_create(2) is supported both at build and run time,
> +DPDK memory manager can provide file descriptors for memory segments,
> +which are required for VirtIO with vhost-user backend.
> +This means open file descriptor issues may also affect this mode,
> +with the same solution.

This is not clear. Which issues? Which mode? Which solution?




  reply	other threads:[~2022-01-17  9:21 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-30 14:37 [RFC PATCH 0/6] Fast restart with many hugepages Dmitry Kozlyuk
2021-12-30 14:37 ` [RFC PATCH 1/6] doc: add hugepage mapping details Dmitry Kozlyuk
2021-12-30 14:37 ` [RFC PATCH 2/6] mem: add dirty malloc element support Dmitry Kozlyuk
2021-12-30 14:37 ` [RFC PATCH 3/6] eal: refactor --huge-unlink storage Dmitry Kozlyuk
2021-12-30 14:37 ` [RFC PATCH 4/6] eal/linux: allow hugepage file reuse Dmitry Kozlyuk
2021-12-30 14:48 ` [RFC PATCH 5/6] eal: allow hugepage file reuse with --huge-unlink Dmitry Kozlyuk
2021-12-30 14:49 ` [RFC PATCH 6/6] app/test: add allocator performance benchmark Dmitry Kozlyuk
2022-01-17  8:07 ` [PATCH v1 0/6] Fast restart with many hugepages Dmitry Kozlyuk
2022-01-17  8:07   ` [PATCH v1 1/6] doc: add hugepage mapping details Dmitry Kozlyuk
2022-01-17  9:20     ` Thomas Monjalon [this message]
2022-01-17  8:07   ` [PATCH v1 2/6] app/test: add allocator performance benchmark Dmitry Kozlyuk
2022-01-17 15:47     ` Bruce Richardson
2022-01-17 15:51       ` Bruce Richardson
2022-01-19 21:12         ` Dmitry Kozlyuk
2022-01-20  9:04           ` Bruce Richardson
2022-01-17 16:06     ` Aaron Conole
2022-01-17  8:07   ` [PATCH v1 3/6] mem: add dirty malloc element support Dmitry Kozlyuk
2022-01-17 14:07     ` Thomas Monjalon
2022-01-17  8:07   ` [PATCH v1 4/6] eal: refactor --huge-unlink storage Dmitry Kozlyuk
2022-01-17 14:10     ` Thomas Monjalon
2022-01-17  8:14   ` [PATCH v1 5/6] eal/linux: allow hugepage file reuse Dmitry Kozlyuk
2022-01-17 14:24     ` Thomas Monjalon
2022-01-17  8:14   ` [PATCH v1 6/6] eal: extend --huge-unlink for " Dmitry Kozlyuk
2022-01-17 14:27     ` Thomas Monjalon
2022-01-17 16:40   ` [PATCH v1 0/6] Fast restart with many hugepages Bruce Richardson
2022-01-19 21:12     ` Dmitry Kozlyuk
2022-01-20  9:05       ` Bruce Richardson
2022-01-19 21:09   ` [PATCH v2 " Dmitry Kozlyuk
2022-01-19 21:09     ` [PATCH v2 1/6] doc: add hugepage mapping details Dmitry Kozlyuk
2022-01-27 13:59       ` Bruce Richardson
2022-01-19 21:09     ` [PATCH v2 2/6] app/test: add allocator performance benchmark Dmitry Kozlyuk
2022-01-19 21:09     ` [PATCH v2 3/6] mem: add dirty malloc element support Dmitry Kozlyuk
2022-01-19 21:09     ` [PATCH v2 4/6] eal: refactor --huge-unlink storage Dmitry Kozlyuk
2022-01-19 21:11     ` [PATCH v2 5/6] eal/linux: allow hugepage file reuse Dmitry Kozlyuk
2022-01-19 21:11       ` [PATCH v2 6/6] eal: extend --huge-unlink for " Dmitry Kozlyuk
2022-01-27 12:07     ` [PATCH v2 0/6] Fast restart with many hugepages Bruce Richardson
2022-02-02 14:12     ` Thomas Monjalon
2022-02-02 21:54     ` David Marchand
2022-02-03 10:26       ` David Marchand
2022-02-03 18:13     ` [PATCH v3 " Dmitry Kozlyuk
2022-02-03 18:13       ` [PATCH v3 1/6] doc: add hugepage mapping details Dmitry Kozlyuk
2022-02-08 15:28         ` Burakov, Anatoly
2022-02-03 18:13       ` [PATCH v3 2/6] app/test: add allocator performance benchmark Dmitry Kozlyuk
2022-02-08 16:20         ` Burakov, Anatoly
2022-02-03 18:13       ` [PATCH v3 3/6] mem: add dirty malloc element support Dmitry Kozlyuk
2022-02-08 16:36         ` Burakov, Anatoly
2022-02-03 18:13       ` [PATCH v3 4/6] eal: refactor --huge-unlink storage Dmitry Kozlyuk
2022-02-08 16:39         ` Burakov, Anatoly
2022-02-03 18:13       ` [PATCH v3 5/6] eal/linux: allow hugepage file reuse Dmitry Kozlyuk
2022-02-08 17:05         ` Burakov, Anatoly
2022-02-03 18:13       ` [PATCH v3 6/6] eal: extend --huge-unlink for " Dmitry Kozlyuk
2022-02-08 17:14         ` Burakov, Anatoly
2022-02-08 20:40       ` [PATCH v3 0/6] Fast restart with many hugepages David Marchand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3969190.6PsWsQAL7t@thomas \
    --to=thomas@monjalon.net \
    --cc=anatoly.burakov@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=dkozlyuk@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).