From: Fengnan Chang
Date: Thu, 25 May 2023 20:49:29 +0800
Subject: Re: [External] Re: [PATCH v2] eal: fix EAL init failure when too many contiguous memsegs under legacy mode
To: "Burakov, Anatoly"
Cc: dev@dpdk.org, Lin Li
References: <20230522124107.99877-1-changfengnan@bytedance.com>
List-Id: DPDK patches and discussions

Burakov, Anatoly wrote on Mon, 22 May 2023 at 21:28:
>
> On 5/22/2023 1:41 PM, Fengnan Chang wrote:
> > Under legacy mode, if the number of contiguous memsegs is greater
> > than RTE_MAX_MEMSEG_PER_LIST, EAL init will fail even though
> > another memseg list is empty, because only one memseg list is
> > checked in remap_needed_hugepages.
> > Fix this by adding an argument that reports how many pages were
> > mapped in remap_segment: remap_segment maps as many pages as it
> > can, and if the run exceeds its capacity, remap_needed_hugepages
> > continues to map the remaining pages.
> >
> > For example:
> > hugepage configuration:
> > cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
> > 10241
> > 10239
> >
> > startup log:
> > EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> > EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
> > EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
> > EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
> > EAL: Requesting 13370 pages of size 2MB from socket 0
> > EAL: Requesting 7110 pages of size 2MB from socket 1
> > EAL: Attempting to map 14220M on socket 1
> > EAL: Allocated 14220M on socket 1
> > EAL: Attempting to map 26740M on socket 0
> > EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
> > configuration.
> > EAL: Couldn't remap hugepage files into memseg lists
> > EAL: FATAL: Cannot init memory
> > EAL: Cannot init memory
> >
> > Signed-off-by: Fengnan Chang
> > Signed-off-by: Lin Li
> > ---
> >   lib/eal/linux/eal_memory.c | 33 +++++++++++++++++++++------------
> >   1 file changed, 21 insertions(+), 12 deletions(-)
> >
> > diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
> > index 60fc8cc6ca..b2e6453fbe 100644
> > --- a/lib/eal/linux/eal_memory.c
> > +++ b/lib/eal/linux/eal_memory.c
> > @@ -657,12 +657,12 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
> >  }
> >
> >  static int
> > -remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
> > +remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end, int *mapped_seg_len)
> >  {
> >       struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
> >       struct rte_memseg_list *msl;
> >       struct rte_fbarray *arr;
> > -     int cur_page, seg_len;
> > +     int cur_page, seg_len, cur_len;
> >       unsigned int msl_idx;
> >       int ms_idx;
> >       uint64_t page_sz;
> > @@ -692,8 +692,9 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
> >
> >               /* leave space for a hole if array is not empty */
> >               empty = arr->count == 0;
> > +             cur_len = RTE_MIN((unsigned int)seg_len, arr->len - arr->count - (empty ? 0 : 1));
> >               ms_idx = rte_fbarray_find_next_n_free(arr, 0,
> > -                             seg_len + (empty ? 0 : 1));
> > +                             cur_len + (empty ? 0 : 1));
> >
> >               /* memseg list is full? */
> >               if (ms_idx < 0)
> >                       continue;
> > @@ -704,12 +705,12 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
> >                */
> >               if (!empty)
> >                       ms_idx++;
> > +             *mapped_seg_len = cur_len;
> >               break;
> >       }
> >       if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
> > -             RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
> > -                             RTE_STR(RTE_MAX_MEMSEG_PER_TYPE),
> > -                             RTE_STR(RTE_MAX_MEM_MB_PER_TYPE));
> > +             RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase RTE_MAX_MEMSEG_PER_LIST "
> > +                             "RTE_MAX_MEMSEG_PER_TYPE and/or RTE_MAX_MEM_MB_PER_TYPE in configuration.\n");
> >               return -1;
> >       }
> >
> > @@ -725,6 +726,8 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
> >               void *addr;
> >               int fd;
> >
> > +             if (cur_page - seg_start == *mapped_seg_len)
> > +                     break;
> >               fd = open(hfile->filepath, O_RDWR);
> >               if (fd < 0) {
> >                       RTE_LOG(ERR, EAL, "Could not open '%s': %s\n",
> > @@ -986,7 +989,7 @@ prealloc_segments(struct hugepage_file *hugepages, int n_pages)
> >  static int
> >  remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
> >  {
> > -     int cur_page, seg_start_page, new_memseg, ret;
> > +     int cur_page, seg_start_page, new_memseg, ret, mapped_seg_len = 0;
> >
> >       seg_start_page = 0;
> >       for (cur_page = 0; cur_page < n_pages; cur_page++) {
> > @@ -1023,21 +1026,27 @@ remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
> >                       /* if this isn't the first time, remap segment */
> >                       if (cur_page != 0) {
> >                               ret = remap_segment(hugepages, seg_start_page,
> > -                                             cur_page);
> > +                                             cur_page, &mapped_seg_len);
> >                               if (ret != 0)
> >                                       return -1;
> >                       }
> > +                     cur_page = seg_start_page + mapped_seg_len;
> >                       /* remember where we started */
> >                       seg_start_page = cur_page;
> > +                     mapped_seg_len = 0;
> >               }
> >               /* continuation of previous memseg */
> >       }
> >       /* we were stopped, but we didn't remap the last segment, do it now */
> >       if (cur_page != 0) {
> > -             ret = remap_segment(hugepages, seg_start_page,
> > -                             cur_page);
> > -             if (ret != 0)
> > -                     return -1;
> > +             while (seg_start_page < n_pages) {
> > +                     ret = remap_segment(hugepages, seg_start_page,
> > +                                     cur_page, &mapped_seg_len);
> > +                     if (ret != 0)
> > +                             return -1;
> > +                     seg_start_page = seg_start_page + mapped_seg_len;
> > +                     mapped_seg_len = 0;
> > +             }
> >       }
> >       return 0;
> >  }
>
> This works, but I feel like it's overcomplicated - the same logic you're
> trying to use can just be implemented using `find_biggest_free()` +
> `find_contig_free()` and returning `seg_len` from `remap_segment()`.
>
> Something like the following:
>
> ---
>
> diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
> index 60fc8cc6ca..08acc787fc 100644
> --- a/lib/eal/linux/eal_memory.c
> +++ b/lib/eal/linux/eal_memory.c
> @@ -681,6 +681,7 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
>
>        /* find free space in memseg lists */
>        for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
> +              int free_len;
>                bool empty;
>                msl = &mcfg->memsegs[msl_idx];
>                arr = &msl->memseg_arr;
> @@ -692,18 +693,27 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
>
>                /* leave space for a hole if array is not empty */
>                empty = arr->count == 0;
> -              ms_idx = rte_fbarray_find_next_n_free(arr, 0,
> -                              seg_len + (empty ? 0 : 1));
>
> -              /* memseg list is full? */
> -              if (ms_idx < 0)
> -                      continue;
> +              /* find start of the biggest contiguous block and its size */
> +              ms_idx = rte_fbarray_find_biggest_free(arr, 0);
> +              free_len = rte_fbarray_find_contig_free(arr, ms_idx);
>
>                /* leave some space between memsegs, they are not IOVA
>                 * contiguous, so they shouldn't be VA contiguous either.
>                 */
> -              if (!empty)
> +              if (!empty) {
>                        ms_idx++;
> +                      free_len--;
> +              }
> +
> +              /* memseg list is full? */
> +              if (free_len < 1)
> +                      continue;
> +
> +              /* we might not get all of the space we wanted */
> +              free_len = RTE_MIN(seg_len, free_len);
> +              seg_end = seg_start + free_len;
> +              seg_len = seg_end - seg_start;
>                break;
>        }
>        if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
> @@ -787,7 +797,7 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
>        }
>        RTE_LOG(DEBUG, EAL, "Allocated %" PRIu64 "M on socket %i\n",
>                        (seg_len * page_sz) >> 20, socket_id);
> -      return 0;
> +      return seg_len;
> }
>
> static uint64_t
> @@ -1022,10 +1032,17 @@ remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
>                if (new_memseg) {
>                        /* if this isn't the first time, remap segment */
>                        if (cur_page != 0) {
> -                              ret = remap_segment(hugepages, seg_start_page,
> -                                              cur_page);
> -                              if (ret != 0)
> -                                      return -1;
> +                              int n_remapped = 0;
> +                              int n_needed = cur_page - seg_start_page;
> +
> +                              while (n_remapped < n_needed) {
> +                                      ret = remap_segment(hugepages,
> +                                                      seg_start_page,
> +                                                      cur_page);
> +                                      if (ret < 0)
> +                                              return -1;
> +                                      n_remapped += ret;
> +                              }
>                        }
>                        /* remember where we started */
>                        seg_start_page = cur_page;
> @@ -1034,10 +1051,16 @@ remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
>        }
>        /* we were stopped, but we didn't remap the last segment, do it now */
>        if (cur_page != 0) {
> -              ret = remap_segment(hugepages, seg_start_page,
> -                              cur_page);
> -              if (ret != 0)
> -                      return -1;
> +              int n_remapped = 0;
> +              int n_needed = cur_page - seg_start_page;
> +
> +              while (n_remapped < n_needed) {
> +                      ret = remap_segment(
> +                              hugepages, seg_start_page, cur_page);
> +                      if (ret < 0)
> +                              return -1;
> +                      n_remapped += ret;
> +              }
>        }
>        return 0;
> }
>
> ---
>
> This should do the trick? Also, this probably needs to be duplicated for
> Windows and FreeBSD init as well, because AFAIK they follow the legacy
> mem init logic.

I took a quick look at the FreeBSD and Windows code. FreeBSD doesn't have
this problem, because it maps hugepages one at a time rather than mapping
multiple at once, and Windows calls eal_dynmem_hugepage_init when hugepages
are present. So it seems modifying only Linux is enough.

>
> --
> Thanks,
> Anatoly
>