From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20240510180743.53c37660@sovereign> <20240530180022.58cbc8a3@sovereign>
In-Reply-To: <20240530180022.58cbc8a3@sovereign>
From: Antonio Di Bacco
Date: Mon, 3 Jun 2024 14:39:56 +0200
Subject: Re: Failure while allocating 1GB hugepages
To: Dmitry Kozlyuk
Cc: users@dpdk.org
List-Id: DPDK usage discussions

Hi,

I have the same behaviour with the code in this message. The first
rte_memzone_reserve_aligned() call requesting 1.5 GB of contiguous
memory always fails, while the second one always succeeds.

It seems that in eal_memalloc_is_contig() the 'msl->memseg_arr' items
are inverted: when the segment sequence is FC0000000, F80000000 the
allocation fails, while the sequence F80000000, FC0000000 is fine.

From my understanding, 'msl->memseg_arr' comes from
'rte_eal_get_configuration()->mem_config', which is rte_config declared
in eal_common_config.c.

Is there an explanation for this swinging behaviour?
Br,

Here is the source code:

#include <stdio.h>
#include <inttypes.h>
#include <rte_eal.h>
#include <rte_errno.h>
#include <rte_lcore.h>
#include <rte_memzone.h>

int main(int argc, char **argv)
{
	const struct rte_memzone *mz;

	if (rte_eal_init(argc, argv) < 0)
		return -1;

	printf("Allocating : 1.5GB\n");
	mz = rte_memzone_reserve_aligned("my_huge_mem", 0x60000000,
			rte_socket_id(),
			RTE_MEMZONE_1GB | RTE_MEMZONE_IOVA_CONTIG,
			RTE_CACHE_LINE_SIZE);
	if (mz == NULL) {
		printf(" Fail(1): errno %s\n", rte_strerror(rte_errno));
		mz = rte_memzone_reserve_aligned("my_huge_mem", 0x60000000,
				rte_socket_id(),
				RTE_MEMZONE_1GB | RTE_MEMZONE_IOVA_CONTIG,
				RTE_CACHE_LINE_SIZE);
		if (mz == NULL) {
			printf(" Fail(2): errno %s\n", rte_strerror(rte_errno));
			return -2;
		}
	}
	/* mz->iova is an rte_iova_t (uint64_t), so print it with PRIx64
	 * rather than %p. */
	printf(" Success: phy[0x%" PRIx64 "] size[%zu]\n", mz->iova, mz->len);
	rte_memzone_free(mz);
	rte_eal_cleanup();
	return 0;
}

I added two RTE_LOG notices in eal_memalloc_is_contig() in
eal_common_memalloc.c, around line 324, after rte_fbarray_get(), to
print ms->iova:

	/* skip first iteration */
	ms = rte_fbarray_get(&msl->memseg_arr, start_seg);
	RTE_LOG(NOTICE, EAL, "memseg_arr[0] = %lX\n", ms->iova); // DEBUG
	cur = ms->iova;
	expected = cur + pgsz;

	/* if we can't access IOVA addresses, assume non-contiguous */
	if (cur == RTE_BAD_IOVA)
		return false;

	for (cur_seg = start_seg + 1; cur_seg < end_seg;
			cur_seg++, expected += pgsz) {
		ms = rte_fbarray_get(&msl->memseg_arr, cur_seg);
		RTE_LOG(NOTICE, EAL, "memseg_arr[%d] = %lX\n", cur_seg, ms->iova); // DEBUG
		if (ms->iova != expected)
			return false;
	}

The output is:

Allocating : 1.5GB
EAL: memseg_arr[0] = FC0000000
EAL: memseg_arr[1] = F80000000
 Fail(1): errno Cannot allocate memory
EAL: memseg_arr[0] = F80000000
EAL: memseg_arr[1] = FC0000000
EAL: memseg_arr[0] = F80000000
EAL: memseg_arr[1] = FC0000000
EAL: memseg_arr[0] = F80000000
EAL: memseg_arr[1] = FC0000000
EAL: memseg_arr[0] = F80000000
EAL: memseg_arr[1] = FC0000000
 Success: phy[0xfa0000000] size[1610612736]

On Thu, May 30, 2024 at 5:00 PM Dmitry Kozlyuk wrote:
>
>
> 2024-05-30 12:28 (UTC+0200), Antonio Di Bacco:
> > Just in case I need, let us say, 1.5 GB CONTIGUOUS memory zone,
> > would it be fine to use something like this as GRUB config in Linux?
> >
> > default_hugepagesz=2G hugepagesz=2G hugepages=4
>
> On x86, "hugepagesz" and "default_hugepagesz" may be either 2M or 1G.
> There is no way to *guarantee* that there will be
> two physically adjacent 1G hugepages forming 1.5 GB of contiguous space,
> but in practice these options, with the above correction, will do.
>
> Note that by default the kernel will spread hugepages between NUMA nodes.
> You can control this with a more elaborate form of the "hugepages" option:
>
> https://docs.kernel.org/admin-guide/mm/hugetlbpage.html
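For reference, a sketch of the per-node form described on that kernel documentation page; the node IDs and counts below are illustrative, and the `node:count` syntax requires a reasonably recent kernel:

```
# Kernel command line (e.g. GRUB_CMDLINE_LINUX): reserve two 1G pages
# on each of nodes 0 and 1 instead of letting the kernel spread them.
default_hugepagesz=1G hugepagesz=1G hugepages=0:2,1:2

# After boot, the per-node reservation can be checked via sysfs:
#   cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
```

Pinning the reservation to the node you pass to rte_memzone_reserve_aligned() via rte_socket_id() avoids the case where the pages land on the wrong NUMA node.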