From: Thomas Monjalon
To: Dmitry Kozlyuk
Cc: dev@dpdk.org, Anatoly Burakov, david.marchand@redhat.com, bruce.richardson@intel.com
Subject: Re: [PATCH v1 1/6] doc: add hugepage mapping details
Date: Mon, 17 Jan 2022 10:20:59 +0100
Message-ID: <3969190.6PsWsQAL7t@thomas>
In-Reply-To: <20220117080801.481568-2-dkozlyuk@nvidia.com>
References: <20211230143744.3550098-1-dkozlyuk@nvidia.com> <20220117080801.481568-1-dkozlyuk@nvidia.com> <20220117080801.481568-2-dkozlyuk@nvidia.com>
List-Id: DPDK patches and discussions

Thanks for the nice addition to the documentation,
this is really needed.

Some comments below.

17/01/2022 09:07, Dmitry Kozlyuk:
> --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> - Memory reservations done using the APIs provided by rte_malloc are also backed by pages from the hugetlbfs filesystem.
> + Memory reservations done using the APIs provided by rte_malloc are also backed by hugepages.

Should we mention the exception when --no-huge is used?

> +Hugepage Mapping
> +^^^^^^^^^^^^^^^^
> +
> +Below is an overview of methods used for each OS to obtain hugepages,
> +explaining why certain limitations and options exist in EAL.
> +See the user guide for a specific OS for configuration details.
> +
> +FreeBSD uses ``contigmem`` kernel module
> +to reserve a fixed number of hugepages at system start,
> +which are mapped by EAL at initialization using a specific ``sysctl()``.
> +
> +Windows EAL allocates hugepages from the OS as needed using Win32 API,
> +so available amount depends on the system load.
> +It uses ``virt2phys`` kernel module to obtain physical addresses,
> +unless running in IOVA-as-VA mode (e.g. forced with ``--iova-mode=va``).
> +
> +Linux implements a variety of methods:
> +
> +* mapping each hugepage from its own file in hugetlbfs;
> +* mapping multiple hugepages from a shared file in hugetlbfs;
> +* anonymous mapping.
> +
> +Mapping hugepages from files in hugetlbfs is essential for multi-process,
> +because secondary processes need to map the same hugepages.
> +EAL creates files like ``rtemap_0``
> +in directories specified with ``--huge-dir`` option
> +(or in the mount point for a specific hugepage size).
> +The ``rtemap_`` prefix can be changed using ``--file-prefix``.
> +This may be needed for running multiple primary processes
> +that share a hugetlbfs mount point.
> +Each backing file by default corresponds to one hugepage,
> +it is opened and locked for the entire time the hugepage is used.
> +See :ref:`segment-file-descriptors` section
> +on how the number of open backing file descriptors can be reduced.
> +
> +Backing files may persist after the corresponding hugepage is freed
> +and even after the application terminates,
> +reducing the number of hugepages available to other processes.
> +EAL removes existing files at startup
> +and can remove newly created files before mapping them with ``--huge-unlink``.

This sentence requires more explanation,
as it is not clear when and why the files are removed.

> +However, since it disables multi-process anyway,
> +using anonymous mapping (``--in-memory``) is recommended instead.
> +
> +:ref:`EAL memory allocator ` relies on hugepages being zero-filled.
> +Hugepages are cleared by the kernel when a file in hugetlbfs or its part
> +is mapped for the first time system-wide
> +to prevent data leaks from previous users of the same hugepage.
> +EAL ensures this behavior by removing existing backing files at startup
> +and by recreating them before opening for mapping (as a precaution).
> +
> +Anonymous mapping does not allow multi-process architecture,
> +but it is free of filename conflicts and leftover files on hugetlbfs.

It is also easier to run as non-root.

> +If memfd_create(2) is supported both at build and run time,
> +DPDK memory manager can provide file descriptors for memory segments,
> +which are required for VirtIO with vhost-user backend.
> +This means open file descriptor issues may also affect this mode,
> +with the same solution.

This is not clear. Which issues? Which mode? Which solution?
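
On the anonymous mapping paragraph quoted above, a small illustration might
help readers. Below is a minimal sketch of what an anonymous hugepage mapping
looks like at the system call level on Linux. It is not the EAL code:
anon_map() and is_zero_filled() are hypothetical helper names, and the
fallback to regular pages is an assumption made only so that the sketch also
runs on a machine without reserved hugepages.

```c
#define _GNU_SOURCE /* MAP_ANONYMOUS and MAP_HUGETLB under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not a DPDK API): map `len` bytes of anonymous memory.
 * Hugepages are tried first; *is_huge reports whether that attempt succeeded.
 * No file is created, so nothing is left behind in hugetlbfs.
 * Returns NULL if both attempts fail.
 */
void *
anon_map(size_t len, int *is_huge)
{
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	void *addr;

	assert(len > 0);
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    flags | MAP_HUGETLB, -1, 0);
	*is_huge = (addr != MAP_FAILED);
	if (addr == MAP_FAILED) {
		/* Sketch-only fallback: regular pages when no hugepage fits. */
		addr = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
	}
	return addr == MAP_FAILED ? NULL : addr;
}

/* Return 1 if every byte in [addr, addr + len) is zero, 0 otherwise. */
int
is_zero_filled(const void *addr, size_t len)
{
	const unsigned char *p = addr;
	size_t i;

	for (i = 0; i < len; i++) {
		if (p[i] != 0)
			return 0;
	}
	return 1;
}
```

Memory obtained this way has no name in any filesystem, which is why another
process cannot attach to it, and the kernel hands it out zero-filled.
Calling anon_map(2 << 20, &is_huge) followed by is_zero_filled() on the
result is enough to observe both points; the mapping is released with
munmap().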