From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 73FC7A058A; Fri, 17 Apr 2020 14:43:18 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 3388D1DBB4; Fri, 17 Apr 2020 14:43:17 +0200 (CEST) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id 209AB1DBA9 for ; Fri, 17 Apr 2020 14:43:14 +0200 (CEST) IronPort-SDR: TmG9Qv8euMhioQm0ltfyg95M8f+wAOt4TmnDbJ6fnilmjE2zG3z2UxoDtJbIqJ0Z3/mATnWBpX 9fcq65gQCDOQ== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Apr 2020 05:43:14 -0700 IronPort-SDR: qYMnMHo9k+xuQ4/fCVn27UsWTeNKlBvGaH9BLxNsDMdBRirJgJyvPXo0Qvl0s3ovKeN+NY70X0 tRsd8fPjacLw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.72,395,1580803200"; d="scan'208";a="278369703" Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.252.62.102]) ([10.252.62.102]) by fmsmga004.fm.intel.com with ESMTP; 17 Apr 2020 05:43:11 -0700 To: Dmitry Kozlyuk , dev@dpdk.org Cc: "Dmitry Malloy (MESHCHANINOV)" , Narcisa Ana Maria Vasile , Fady Bader , Tal Shnaiderman , Thomas Monjalon , Harini Ramakrishnan , Omar Cardona , Pallavi Kadam , Ranjit Menon References: <20200410164342.1194634-1-dmitry.kozliuk@gmail.com> <20200414194426.1640704-1-dmitry.kozliuk@gmail.com> <20200414194426.1640704-7-dmitry.kozliuk@gmail.com> From: "Burakov, Anatoly" Message-ID: Date: Fri, 17 Apr 2020 13:43:10 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0 MIME-Version: 1.0 In-Reply-To: <20200414194426.1640704-7-dmitry.kozliuk@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v3 06/10] eal: introduce memory management wrappers X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 14-Apr-20 8:44 PM, Dmitry Kozlyuk wrote: > System meory management is implemented differently for POSIX and > Windows. Introduce wrapper functions for operations used across DPDK: > > * rte_mem_map() > Create memory mapping for a regular file or a page file (swap). > This supports mapping to a reserved memory region even on Windows. > > * rte_mem_unmap() > Remove mapping created with rte_mem_map(). > > * rte_get_page_size() > Obtain default system page size. > > * rte_mem_lock() > Make arbitrary-sized memory region non-swappable. > > Wrappers follow POSIX semantics limited to DPDK tasks, but their > signatures deliberately differ from POSIX ones to be more safe and > expressive. > > Signed-off-by: Dmitry Kozlyuk > --- > +/** > + * Memory reservation flags. > + */ > +enum eal_mem_reserve_flags { > + /**< Reserve hugepages (support may be limited or missing). */ > + EAL_RESERVE_HUGEPAGES = 1 << 0, > + /**< Fail if requested address is not available. */ > + EAL_RESERVE_EXACT_ADDRESS = 1 << 1 I *really* don't like this terminology. In Linux et al., MAP_FIXED is not just "reserve at this exact address". MAP_FIXED is actually fairly dangerous if you don't know what you're doing, because it will unconditionally unmap any previously mapped memory. Also, to my knowledge, a call to MAP_FIXED cannot fail unless something went very wrong - it will *not* "fail if requested address is not available". We basically use MAP_FIXED because we have already mapped that area with MAP_ANONYMOUS previously, so we can guarantee that it's safe to call MAP_FIXED. I would greatly prefer if this was named to better reflect the above. EAL_FORCE_RESERVE perhaps? The comment also needs to be adjusted. > +}; > + > /** > * Get virtual area of specified size from the OS. > * > @@ -232,8 +243,8 @@ int rte_eal_check_module(const char *module_name); > #define EAL_VIRTUAL_AREA_UNMAP (1 << 2) > /**< immediately unmap reserved virtual area. */ > void * > -eal_get_virtual_area(void *requested_addr, size_t *size, > - size_t page_sz, int flags, int mmap_flags); > +eal_get_virtual_area(void *requested_addr, size_t *size, size_t page_sz, > + int flags, int mmap_flags); > > /** > > +/** > + * Reserve a region of virtual memory. > + * > + * Use eal_mem_free() to free reserved memory. > + * > + * @param requested_addr > + * A desired reservation address. The system may not respect it. > + * NULL means the address will be chosen by the system. > + * @param size > + * Reservation size. Must be a multiple of system page size. > + * @param flags > + * Reservation options. > + * @returns > + * Starting address of the reserved area on success, NULL on failure. > + * Callers must not access this memory until remapping it. > + */ > +void *eal_mem_reserve(void *requested_addr, size_t size, > + enum eal_mem_reserve_flags flags); This seems fairly suspect to me. I know that technically enum is an int, but semantically, IIRC an enum value should always contain exactly one value - you can't use an enum value like a set of flags. > + > +/** > + * Free memory obtained by eal_mem_reserve() or eal_mem_alloc(). > + * > + * If @code virt @endcode and @code size @endcode describe a part of the > + * reserved region, only this part of the region is freed (accurately > + * up to the system page size). If @code virt @endcode points to allocated > + * memory, @code size @endcode must match the one specified on allocation. > + * The behavior is undefined if the memory pointed by @code virt @endcode > + * is obtained from another source than listed above. > + * > + * @param virt > +/** > + * Memory mapping additional flags. > + * > + * In Linux and FreeBSD, each flag is semantically equivalent > + * to OS-specific mmap(3) flag with the same or similar name. > + * In Windows, POSIX and MAP_ANONYMOUS semantics are followed. > + */ > +enum rte_map_flags { > + /** Changes of mapped memory are visible to other processes. */ > + RTE_MAP_SHARED = 1 << 0, > + /** Mapping is not backed by a regular file. */ > + RTE_MAP_ANONYMOUS = 1 << 1, > + /** Copy-on-write mapping, changes are invisible to other processes. */ > + RTE_MAP_PRIVATE = 1 << 2, > + /** Fail if requested address cannot be taken. */ > + RTE_MAP_FIXED = 1 << 3 Again, MAP_FIXED does not behave the way you describe. See above comments. > +}; > + > +/** > + * OS-independent implementation of POSIX mmap(3) > + * with MAP_ANONYMOUS Linux/FreeBSD extension. > + */ > +__rte_experimental > +void *rte_mem_map(void *requested_addr, size_t size, enum rte_mem_prot prot, > + enum rte_map_flags flags, int fd, size_t offset); > + > +/** > + * OS-independent implementation of POSIX munmap(3). > + */ > +__rte_experimental > +int rte_mem_unmap(void *virt, size_t size); > + > +/** > + * Get system page size. This function never failes. > + * > + * @return > + * Positive page size in bytes. > + */ > +__rte_experimental > +int rte_get_page_size(void); uint32_t? or maybe uint64_t? > + > +/** > + * Lock region in physical memory and prevent it from swapping. > + * > + * @param virt > + * The virtual address. > + * @param size > + * Size of the region. > + * @return > + * 0 on success, negative on error. > + * > + * @note Implementations may require @p virt and @p size to be multiples > + * of system page size. > + * @see rte_get_page_size() > + * @see rte_mem_lock_page() > + */ > +__rte_experimental > +int rte_mem_lock(const void *virt, size_t size); > + > /** -- Thanks, Anatoly