DPDK patches and discussions
From: Fengnan Chang <changfengnan@bytedance.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Cc: dev@dpdk.org, Lin Li <lilintjpu@bytedance.com>
Subject: Re: [External] Re: [PATCH] eal: fix eal init may failed when too much continuous memsegs under legacy mode
Date: Mon, 22 May 2023 20:09:03 +0800	[thread overview]
Message-ID: <CAPFOzZvcwkbG42ymdNEZ+CToy6RhpkyOow5gsBF5eq8Q0dOZPw@mail.gmail.com> (raw)
In-Reply-To: <a52840a8-4056-279e-ed58-55ae6696da32@intel.com>

Burakov, Anatoly <anatoly.burakov@intel.com> wrote on Sat, May 20, 2023 at 23:03:
>
> Hi,
>
> On 5/16/2023 1:21 PM, Fengnan Chang wrote:
> > Under legacy mode, if the number of contiguous memsegs is greater
> > than RTE_MAX_MEMSEG_PER_LIST, EAL init will fail even though
> > another memseg list is empty, because only one memseg list is
> > checked in remap_needed_hugepages.
> >
> > For example:
> > hugepage configure:
> > 20480
> > 13370
> > 7110
> >
> > startup log:
> > EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> > EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
> > EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
> > EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
> > EAL: Requesting 13370 pages of size 2MB from socket 0
> > EAL: Requesting 7110 pages of size 2MB from socket 1
> > EAL: Attempting to map 14220M on socket 1
> > EAL: Allocated 14220M on socket 1
> > EAL: Attempting to map 26740M on socket 0
> > EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
> > configuration.
>
> Unrelated, but this is probably a wrong message, this should've called
> out the config options to change, not their values. Sounds like a log
> message needs fixing somewhere...

In the older version, the log was:
EAL: Could not find space for memseg. Please increase
CONFIG_RTE_MAX_MEMSEG_PER_TYPE and/or CONFIG_RTE_MAX_MEM_PER_TYPE in
configuration.
Maybe that would be better?

>
> > EAL: Couldn't remap hugepage files into memseg lists
> > EAL: FATAL: Cannot init memory
> > EAL: Cannot init memory
> >
> > Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
> > Signed-off-by: Lin Li <lilintjpu@bytedance.com>
> > ---
> >   lib/eal/linux/eal_memory.c | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
> > index 60fc8cc6ca..36b9e78f5f 100644
> > --- a/lib/eal/linux/eal_memory.c
> > +++ b/lib/eal/linux/eal_memory.c
> > @@ -1001,6 +1001,8 @@ remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
> >               if (cur->size == 0)
> >                       break;
> >
> > +             if (cur_page - seg_start_page >= RTE_MAX_MEMSEG_PER_LIST)
> > +                     new_memseg = 1;
>
> I don't think this is quite right, because technically,
> `RTE_MAX_MEMSEG_PER_LIST` is only applied to smaller page size segment
> lists - larger page sizes segment lists will hit their limits earlier.
> So, while this will work for 2MB pages, it won't work for page sizes
> whose segment lists are shorter than the maximum (such as 1GB pages).
>
> I think this solution could be improved upon by trying to break up the
> contiguous area instead. I suspect the core of the issue is not even the
> fact that we're exceeding limits of one memseg list, but that we're
> always attempting to map exactly N pages in `remap_hugepages`, which
> results in us leaving large contiguous zones inside memseg lists unused
> because we couldn't satisfy current allocation request and skipped to a
> new memseg list.

Correct, I didn't consider the 1GB page case; I get your point.
Thanks.
>
> For example, let's suppose we found a large contiguous area that
> would've exceeded limits of current memseg list. Sooner or later, this
> contiguous area will end, and we'll attempt to remap this virtual area
> into a memseg list. Whenever that happens, we call into the remap code,
> which will start with first segment, attempt to find exactly N number of
> free spots, fail to do so, and skip to the next segment list.
>
> Thus, sooner or later, if we get contiguous areas that are large enough,
> we will not populate our memseg lists but instead skip through them, and
> start with a new memseg list every time we need a large contiguous area.
> We prioritize having a large contiguous area over using up all of our
> memory map.
>
> If, instead, we could break up the allocation - that is, use
> `rte_fbarray_find_biggest_free()` instead of
> `rte_fbarray_find_next_n_free()`, and keep doing it until we run out of
> segment lists, we will achieve the same result your patch does, but have
> it work for all page sizes, because now we would be targeting the actual
> issue (under-utilization of memseg lists), not its symptoms (exceeding
> segment list limits for large allocations).
>
> This logic could either be inside `remap_hugepages`, or we could just
> return number of pages mapped from `remap_hugepages`, and have the
> calling code (`remap_needed_hugepages`) try again, this time with a
> different start segment, reflecting how many pages we actually mapped.
> IMO this would be easier to implement, as `remap_hugepages` is overly
> complex as it is!
>
> >               if (cur_page == 0)
> >                       new_memseg = 1;
> >               else if (cur->socket_id != prev->socket_id)
>
> --
> Thanks,
> Anatoly
>


Thread overview: 4+ messages
2023-05-16 12:21 Fengnan Chang
2023-05-20 15:03 ` Burakov, Anatoly
2023-05-22 12:09   ` Fengnan Chang [this message]
2023-05-22 12:43     ` [External] " Burakov, Anatoly
