Date: Wed, 5 Jun 2024 01:50:09 +0300
From: Dmitry Kozlyuk
To: Antonio Di Bacco
Cc: users@dpdk.org
Subject: Re: Failure while allocating 1GB hugepages
Message-ID: <20240605015009.7f660149@sovereign>
List-Id: DPDK usage discussions

2024-06-03 14:39 (UTC+0200), Antonio Di Bacco:
> Hi,
> I have the same behaviour with the code in this message.
>
> The first rte_memzone_reserve_aligned() call requesting 1.5GB
> contiguous memory always fails, while the second one is always
> successful.

Hi,

I can't explain the "always" part, but the unstable behavior comes from
the unpredictable IOVA (physical address) that DPDK gets from the kernel.

On the first try:

1. DPDK has no 1G hugepages mapped, so it needs 2 more 1G hugepages.
   alloc_pages_on_heap() -> eal_memalloc_alloc_seg_bulk()
2. DPDK asks the kernel for one 1G hugepage,
   the kernel maps the hugepage with IOVA = 0xFC0000000,
   DPDK stores it in memseg_arr[0].
   eal_memalloc_alloc_seg_bulk() -> alloc_seg()
3. Same for another hugepage and memseg_arr[1]->iova = 0xF80000000.
4. DPDK checks if the pages are contiguous.
   alloc_pages_on_heap() -> eal_memalloc_is_contig() = false
5. Since it's a failure, DPDK frees the newly allocated pages.
   alloc_pages_on_heap() -> rollback_expand_heap()

On the second try:

6. Steps 1 and 2 repeat, but now memseg_arr[0]->iova = 0xF80000000.
7. Step 3 repeats, but now memseg_arr[1]->iova = 0xFC0000000.
8. IOVAs are contiguous, success.

Just a wild guess why the second try may be likely to succeed:
memseg_arr[1] with IOVA = 0xF80000000 is freed last at step 5,
so maybe this is why the kernel is likely to reuse this page at step 6.

I'm afraid the simplest way to get PA-contiguous 1.5G reliably
is indeed to try several times.
The preferred way is to use IOMMU and IOVA-as-VA if the HW permits.

> It seems in eal_memalloc_is_contig() the 'msl->memseg_arr' items are inverted:
> when there is the sequence FC0000000, F80000000 the allocation fails,
> while the segments sequence F80000000, FC0000000 is fine.
> From my understanding 'msl->memseg_arr' comes from
> 'rte_eal_get_configuration()->mem_config;' which is rte_config
> declared in eal_common_config.c

Not quite, the msl->memseg_arr content is dynamic, see above.

P.S. One may say DPDK could do better.
It does have N hugepages occupying a contiguous range of IOVA.
DPDK could make them VA-contiguous by remapping.
But this would be more work, it still wouldn't be 100% reliable,
and it would still be insecure and inflexible compared to IOMMU.
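
To illustrate why the order of the pages matters in step 4, here is a
simplified sketch of an in-order contiguity check. It is not the actual
eal_memalloc_is_contig() code: iovas_are_contiguous() and PAGE_SZ_1G are
hypothetical names made up for this example, and the check is reduced to
comparing each page's IOVA with the previous one in array order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Size of one 1G hugepage (illustrative constant, not a DPDK macro). */
#define PAGE_SZ_1G (UINT64_C(1) << 30)

/*
 * Hypothetical helper, not a DPDK API.
 * Returns true only if every page starts exactly page_sz bytes after
 * the previous page in the array, i.e. the IOVAs ascend without gaps
 * in the same order as the array slots.
 */
static bool
iovas_are_contiguous(const uint64_t *iova, size_t n, uint64_t page_sz)
{
	for (size_t i = 1; i < n; i++) {
		if (iova[i] != iova[i - 1] + page_sz)
			return false;
	}
	return true;
}
```

With the two addresses from the report, the array { 0xFC0000000,
0xF80000000 } fails this check because the second page lies below the
first, while { 0xF80000000, 0xFC0000000 } passes. The same two physical
pages are involved in both cases, only the order in which the kernel
handed them out differs.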