From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from netronome.com (host-79-78-33-110.static.as9105.net [79.78.33.110])
	by dpdk.org (Postfix) with ESMTP id 286C458F6;
	Fri, 31 Aug 2018 14:51:54 +0200 (CEST)
Received: from netronome.com (localhost [127.0.0.1])
	by netronome.com (8.14.4/8.14.4/Debian-4.1ubuntu1) with ESMTP id w7VCp0N8019143;
	Fri, 31 Aug 2018 13:51:00 +0100
Received: (from alucero@localhost)
	by netronome.com (8.14.4/8.14.4/Submit) id w7VCp05l019142;
	Fri, 31 Aug 2018 13:51:00 +0100
From: Alejandro Lucero
To: dev@dpdk.org
Cc: stable@dpdk.org
Date: Fri, 31 Aug 2018 13:50:54 +0100
Message-Id: <1535719857-19092-3-git-send-email-alejandro.lucero@netronome.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1535719857-19092-1-git-send-email-alejandro.lucero@netronome.com>
References: <1535719857-19092-1-git-send-email-alejandro.lucero@netronome.com>
Subject: [dpdk-dev] [PATCH v2 2/5] mem: use address hint for mapping hugepages
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
X-List-Received-Date: Fri, 31 Aug 2018 12:51:54 -0000

The Linux kernel uses a really high address as the starting address for
serving mmap calls. If there are addressing limitations and IOVA mode is
VA, this starting address is likely too high for those devices. However,
it is possible to use a lower address in the process virtual address
space, as with 64 bits there is a lot of available space.

This patch adds an address hint as the starting address for 64-bit
systems.

Signed-off-by: Alejandro Lucero
---
 lib/librte_eal/common/eal_common_memory.c | 35 ++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index bdd8f44..97378b1 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -37,6 +37,23 @@
 static void *next_baseaddr;
 static uint64_t system_page_sz;
 
+#ifdef RTE_ARCH_64
+/*
+ * Linux kernel uses a really high address as starting address for serving
+ * mmaps calls. If there exists addressing limitations and IOVA mode is VA,
+ * this starting address is likely too high for those devices. However, it
+ * is possible to use a lower address in the process virtual address space
+ * as with 64 bits there is a lot of available space.
+ *
+ * Current known limitations are 39 or 40 bits. Setting the starting address
+ * at 4GB implies there are 508GB or 1020GB for mapping the available
+ * hugepages. This is likely enough for most systems, although a device with
+ * addressing limitations should call rte_eal_check_dma_mask for ensuring all
+ * memory is within supported range.
+ */
+static uint64_t baseaddr = 0x100000000;
+#endif
+
 void *
 eal_get_virtual_area(void *requested_addr, size_t *size,
 		size_t page_sz, int flags, int mmap_flags)
@@ -60,6 +77,11 @@
 			rte_eal_process_type() == RTE_PROC_PRIMARY)
 		next_baseaddr = (void *) internal_config.base_virtaddr;
 
+#ifdef RTE_ARCH_64
+	if (next_baseaddr == NULL && internal_config.base_virtaddr == 0 &&
+			rte_eal_process_type() == RTE_PROC_PRIMARY)
+		next_baseaddr = (void *) baseaddr;
+#endif
 	if (requested_addr == NULL && next_baseaddr != NULL) {
 		requested_addr = next_baseaddr;
 		requested_addr = RTE_PTR_ALIGN(requested_addr, page_sz);
@@ -89,9 +111,20 @@
 
 		mapped_addr = mmap(requested_addr, (size_t)map_sz, PROT_READ,
 				mmap_flags, -1, 0);
+
 		if (mapped_addr == MAP_FAILED && allow_shrink)
 			*size -= page_sz;
-	} while (allow_shrink && mapped_addr == MAP_FAILED && *size > 0);
+
+		if (mapped_addr != MAP_FAILED && addr_is_hint &&
+		    mapped_addr != requested_addr) {
+			/* hint was not used. Try with another offset */
+			munmap(mapped_addr, map_sz);
+			mapped_addr = MAP_FAILED;
+			next_baseaddr = RTE_PTR_ADD(next_baseaddr, 0x100000000);
+			requested_addr = next_baseaddr;
+		}
+	} while ((allow_shrink || addr_is_hint) &&
+		 mapped_addr == MAP_FAILED && *size > 0);
 
 	/* align resulting address - if map failed, we will ignore the value
 	 * anyway, so no need to add additional checks.
-- 
1.9.1
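
For readers following the thread, here is a minimal standalone sketch of the
hint-and-retry idea used in the last hunk, outside of EAL. It is not the DPDK
code: map_near_hint(), space_below_limit() and the max_tries bound are
hypothetical names introduced only for illustration, and it assumes a 64-bit
Linux process. It also reproduces the 508GB/1020GB arithmetic from the
comment in the first hunk.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Address space left below a device limit of addr_bits bits when mappings
 * start at base. With base = 4GB this gives 508GB for 39 bits and 1020GB
 * for 40 bits.
 */
static uint64_t
space_below_limit(unsigned int addr_bits, uint64_t base)
{
	return (UINT64_C(1) << addr_bits) - base;
}

/*
 * Hypothetical helper: map size bytes at hint without MAP_FIXED. If the
 * kernel places the mapping somewhere else, the hint was not honoured, so
 * unmap and retry at hint + step, up to max_tries attempts.
 * Returns the mapped address, or MAP_FAILED.
 */
static void *
map_near_hint(uintptr_t hint, size_t size, uintptr_t step, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		void *want = (void *)hint;
		void *got = mmap(want, size, PROT_READ,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (got == MAP_FAILED)
			return MAP_FAILED;
		if (got == want)
			return got;

		/* hint was not used. Try with another offset */
		munmap(got, size);
		hint += step;
	}
	return MAP_FAILED;
}
```

A caller would start with map_near_hint(0x100000000, len, 0x100000000, n) and
treat MAP_FAILED as "no low address available". Unlike the loop in the patch,
this sketch bounds the number of retries and does not shrink the requested
size on failure.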