DPDK patches and discussions
From: David Marchand <david.marchand@redhat.com>
To: zhoumin <zhoumin@loongson.cn>
Cc: dev@dpdk.org, olivier.matz@6wind.com, ferruh.yigit@amd.com,
	 kaisenx.you@intel.com,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	 Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [PATCH] malloc: enhance NUMA affinity heuristic
Date: Tue, 3 Jan 2023 11:56:27 +0100
Message-ID: <CAJFAV8yvm-sSkErhAH_zV5iQDy5fg8_nS=yf7egVrts8OGDv_g@mail.gmail.com>
In-Reply-To: <0572b450-609d-0053-6fe3-beab118e7020@loongson.cn>

On Tue, Dec 27, 2022 at 10:00 AM zhoumin <zhoumin@loongson.cn> wrote:
>
> Hi David,
>
>
> First of all, I sincerely apologize for the late reply.
>
> I have checked this issue carefully and have some useful findings.
>
> On Wed, Dec 21, 2022 at 22:57 PM, David Marchand wrote:
> > Hello Min,
> >
> > On Wed, Dec 21, 2022 at 11:49 AM David Marchand
> > <david.marchand@redhat.com> wrote:
> >> Trying to allocate memory on the first detected numa node is less
> >> likely to find memory actually available than trying on the main
> >> lcore numa node (especially when the DPDK application is started only
> >> on one numa node).
> >>
> >> Signed-off-by: David Marchand <david.marchand@redhat.com>
> > I see a failure in the loongarch CI.
> >
> > Running binary with
> > argv[]:'/home/zhoumin/dpdk/build/app/test/dpdk-test'
> > '--file-prefix=eal_flags_c_opt_autotest' '--proc-type=secondary'
> > '--lcores' '0-1,2@(5-7),(3-5)@(0,2),(0,6),7'
> > Error - process did not run ok with valid corelist value
> > Test Failed
> >
> > The logs don't give the full picture (though it is not the LoongArch CI's fault).
> >
> > I tried to read back on past mail exchanges about the loongarch
> > server, but I did not find the info.
> > I suspect cores 5 to 7 belong to different numa nodes, can you confirm?
>
> Cores 5 to 7 belong to the same NUMA node (NUMA node1) on the
> Loongson-3C5000LL CPU on which the LoongArch DPDK CI runs.
>
> >
> > I'll post a new revision to account for this case.
> >
>
> The LoongArch DPDK CI uses cores 0-7 to run all the DPDK unit tests
> by adding the arg '-l 0-7' to the meson test args. In the above test
> case, the arg '--lcores' '0-1,2@(5-7),(3-5)@(0,2),(0,6),7' makes
> lcores 0 and 6 run on core 0 or 6. The EAL logs make this clearer
> when I set the EAL log level to debug, as follows:
> EAL: Main lcore 0 is ready (tid=fff3ee18f0;cpuset=[0,6])

The syntax for this --lcores option is not obvious...
This log really helps.


> EAL: lcore 1 is ready (tid=fff2de4cf0;cpuset=[1])
> EAL: lcore 2 is ready (tid=fff25e0cf0;cpuset=[5,6,7])
> EAL: lcore 5 is ready (tid=fff0dd4cf0;cpuset=[0,2])
> EAL: lcore 4 is ready (tid=fff15d8cf0;cpuset=[0,2])
> EAL: lcore 3 is ready (tid=fff1ddccf0;cpuset=[0,2])
> EAL: lcore 7 is ready (tid=ffdb7f8cf0;cpuset=[7])
> EAL: lcore 6 is ready (tid=ffdbffccf0;cpuset=[0,6])
>
> However, cores 0 and 6 belong to different NUMA nodes on the
> Loongson-3C5000LL CPU. Core 0 belongs to NUMA node 0 and core 6
> belongs to NUMA node 1, as follows:
> $ lscpu
> Architecture:        loongarch64
> Byte Order:          Little Endian
> CPU(s):              32
> On-line CPU(s) list: 0-31
> Thread(s) per core:  1
> Core(s) per socket:  4
> Socket(s):           8
> NUMA node(s):        8
> ...
> NUMA node0 CPU(s):   0-3
> NUMA node1 CPU(s):   4-7
> NUMA node2 CPU(s):   8-11
> NUMA node3 CPU(s):   12-15
> NUMA node4 CPU(s):   16-19
> NUMA node5 CPU(s):   20-23
> NUMA node6 CPU(s):   24-27
> NUMA node7 CPU(s):   28-31
> ...
>
> So the socket_id for lcores 0 and 6 will be set to -1, as can be
> seen in thread_update_affinity(). I also printed out the
> socket_id for lcores 0 to RTE_MAX_LCORE - 1, as follows:
> lcore_config[*].socket_id: -1 0 1 0 0 0 -1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5
> 5 5 6 6 6 6 7 7 7 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0
>
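To spell out the rule described above (an lcore whose cpuset spans several NUMA nodes ends up with socket_id -1), here is a minimal sketch. It is not the EAL code: the sketch_* names are made up for illustration, and the CPU-to-node mapping is hard-coded from the lscpu output quoted above (4 CPUs per node).

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the value of DPDK's SOCKET_ID_ANY; the name is illustrative. */
#define SKETCH_SOCKET_ID_ANY (-1)
/* Assumption taken from the quoted lscpu output: node N owns CPUs 4N..4N+3. */
#define SKETCH_CPUS_PER_NODE 4

/* Hypothetical helper: NUMA node of a physical CPU on that machine. */
int sketch_cpu_to_node(unsigned int cpu)
{
	return (int)(cpu / SKETCH_CPUS_PER_NODE);
}

/*
 * Return the NUMA node shared by every CPU in the set, or
 * SKETCH_SOCKET_ID_ANY when the set is empty or spans several nodes.
 */
int sketch_cpuset_socket_id(const unsigned int *cpus, size_t n)
{
	int socket_id = SKETCH_SOCKET_ID_ANY;
	size_t i;

	assert(cpus != NULL || n == 0);
	for (i = 0; i < n; i++) {
		int node = sketch_cpu_to_node(cpus[i]);

		if (socket_id == SKETCH_SOCKET_ID_ANY)
			socket_id = node;	/* first CPU sets the candidate node */
		else if (socket_id != node)
			return SKETCH_SOCKET_ID_ANY;	/* mixed nodes */
	}
	return socket_id;
}
```

With the cpusets from the debug log, [0,6] gives -1 (lcores 0 and 6), [5,6,7] gives 1 (lcore 2) and [0,2] gives 0 (lcores 3 to 5), which matches the first entries of the lcore_config[*].socket_id dump.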
> In this test case, the modified malloc_get_numa_socket() will return -1,
> which causes a memory allocation failure.
> Is it acceptable in DPDK for the socket_id of an lcore to be -1?
> If it is, maybe we can check the socket_id of the main lcore before using
> it, for example:
> diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
> index d7c410b786..3ee19aee15 100644
> --- a/lib/eal/common/malloc_heap.c
> +++ b/lib/eal/common/malloc_heap.c
> @@ -717,6 +717,10 @@ malloc_get_numa_socket(void)
>                          return socket_id;
>          }
>
> +       socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
> +       if (socket_id != (unsigned int)SOCKET_ID_ANY)
> +               return socket_id;
> +
>          return rte_socket_id_by_idx(0);
>   }

Yep, this is what I had in mind before going off.
v2 incoming.
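
For the record, the difference between returning the main lcore socket unconditionally and the guarded fallback from the diff above can be sketched in isolation. This only models the tail of malloc_get_numa_socket(); the sketch_* names and the plain int parameters are stand-ins for the real EAL calls, not DPDK API.

```c
#include <assert.h>

/* Mirrors the value of DPDK's SOCKET_ID_ANY; the name is illustrative. */
#define SKETCH_ANY (-1)

/*
 * Unguarded variant: hand back the main lcore socket as-is.
 * If the main lcore cpuset spans NUMA nodes, this is SKETCH_ANY,
 * which is the -1 that made the allocation fail in the test above.
 */
int sketch_socket_unguarded(int main_lcore_socket)
{
	return main_lcore_socket;
}

/*
 * Guarded variant, following the diff quoted above: use the main
 * lcore socket only when it is a real node, otherwise fall back to
 * the first detected socket (rte_socket_id_by_idx(0) in the diff).
 */
int sketch_socket_guarded(int main_lcore_socket, int first_detected_socket)
{
	assert(first_detected_socket >= 0);
	if (main_lcore_socket != SKETCH_ANY)
		return main_lcore_socket;
	return first_detected_socket;
}
```

With the LoongArch setup (main lcore on cpuset [0,6], so socket -1), the unguarded variant yields -1 while the guarded one falls back to the first detected socket.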

Thanks Min!


-- 
David Marchand


Thread overview: 41+ messages
2022-12-21 10:48 David Marchand
2022-12-21 11:16 ` Bruce Richardson
2022-12-21 13:50   ` Ferruh Yigit
2022-12-21 14:57 ` David Marchand
2022-12-27  9:00   ` zhoumin
2023-01-03 10:56     ` David Marchand [this message]
2023-01-03 10:58 ` [PATCH v2] " David Marchand
2023-01-03 13:32 ` [PATCH v3] " David Marchand
2023-01-31  3:23   ` You, KaisenX
2023-01-31 15:05 ` [PATCH v4] net/iavf:enhance " Kaisen You
2023-01-31 16:05   ` Thomas Monjalon
2023-02-01  5:32     ` You, KaisenX
2023-02-01 12:20 ` [PATCH v5] enhance " Kaisen You
2023-02-01 10:52   ` Jiale, SongX
2023-02-15 14:22   ` Burakov, Anatoly
2023-02-15 14:47     ` Burakov, Anatoly
2023-02-16  2:50     ` You, KaisenX
2023-03-03 14:07       ` Thomas Monjalon
2023-03-09  1:58         ` You, KaisenX
2023-04-13  0:56           ` You, KaisenX
2023-04-19 12:16             ` Thomas Monjalon
2023-04-21  2:34               ` You, KaisenX
2023-04-21  8:12                 ` Thomas Monjalon
2023-04-23  6:52                   ` You, KaisenX
2023-04-23  8:57                     ` You, KaisenX
2023-04-23 13:19                       ` Thomas Monjalon
2023-04-25  5:16   ` [PATCH v6] " Kaisen You
2023-04-27  6:57     ` Thomas Monjalon
2023-05-16  5:19       ` You, KaisenX
2023-05-23  2:50     ` [PATCH v7] " Kaisen You
2023-05-23 10:44       ` Burakov, Anatoly
2023-05-26  6:44         ` You, KaisenX
2023-05-23 12:45       ` Burakov, Anatoly
2023-05-26  6:50       ` [PATCH v8] " Kaisen You
2023-05-26  8:45       ` Kaisen You
2023-05-26 14:44         ` Burakov, Anatoly
2023-05-26 17:50           ` Stephen Hemminger
2023-05-29 10:37             ` Burakov, Anatoly
2023-06-01 14:42         ` David Marchand
2023-06-06 14:04           ` Thomas Monjalon
2023-06-12  9:36           ` Burakov, Anatoly
