From: David Marchand
Date: Tue, 3 Jan 2023 11:56:27 +0100
Subject: Re: [PATCH] malloc: enhance NUMA affinity heuristic
To: zhoumin
Cc: dev@dpdk.org, olivier.matz@6wind.com, ferruh.yigit@amd.com, kaisenx.you@intel.com, Anatoly Burakov, Bruce Richardson
In-Reply-To: <0572b450-609d-0053-6fe3-beab118e7020@loongson.cn>
References: <20221221104858.296530-1-david.marchand@redhat.com> <0572b450-609d-0053-6fe3-beab118e7020@loongson.cn>
List-Id: DPDK patches and discussions

On Tue, Dec 27, 2022 at 10:00 AM zhoumin wrote:
>
> Hi David,
>
>
> First of all, I sincerely apologize for the late reply.
>
> I had checked this issue carefully and had some useful findings.
>
> On Wed, Dec 21, 2022 at 22:57 PM, David Marchand wrote:
> > Hello Min,
> >
> > On Wed, Dec 21, 2022 at 11:49 AM David Marchand
> > wrote:
> >> Trying to allocate memory on the first detected numa node has less
> >> chance to find some memory actually available rather than on the main
> >> lcore numa node (especially when the DPDK application is started only
> >> on one numa node).
> >>
> >> Signed-off-by: David Marchand
> > I see a failure in the loongarch CI.
> >
> > Running binary with
> > argv[]:'/home/zhoumin/dpdk/build/app/test/dpdk-test'
> > '--file-prefix=eal_flags_c_opt_autotest' '--proc-type=secondary'
> > '--lcores' '0-1,2@(5-7),(3-5)@(0,2),(0,6),7'
> > Error - process did not run ok with valid corelist value
> > Test Failed
> >
> > The logs don't give the full picture (though it is not LoongArch CI fault).
> >
> > I tried to read back on past mail exchanges about the loongarch
> > server, but I did not find the info.
> > I suspect cores 5 to 7 belong to different numa nodes, can you confirm?
>
> The cores 5 to 7 belong to the same numa node (NUMA node1) on the
> Loongson-3C5000LL CPU on which LoongArch DPDK CI runs.
> >
> >
> > I'll post a new revision to account for this case.
> >
>
> The LoongArch DPDK CI uses cores 0-7 to run all the DPDK unit tests
> by adding the arg '-l 0-7' in the meson test args. In the above test
> case, the arg '--lcores' '0-1,2@(5-7),(3-5)@(0,2),(0,6),7' makes
> lcores 0 and 6 run on core 0 or 6. The EAL logs make this
> clearer when I set the log level of EAL to debug, as follows:
> EAL: Main lcore 0 is ready (tid=fff3ee18f0;cpuset=[0,6])

The syntax for this --lcores option is not obvious...
This log really helps.

> EAL: lcore 1 is ready (tid=fff2de4cf0;cpuset=[1])
> EAL: lcore 2 is ready (tid=fff25e0cf0;cpuset=[5,6,7])
> EAL: lcore 5 is ready (tid=fff0dd4cf0;cpuset=[0,2])
> EAL: lcore 4 is ready (tid=fff15d8cf0;cpuset=[0,2])
> EAL: lcore 3 is ready (tid=fff1ddccf0;cpuset=[0,2])
> EAL: lcore 7 is ready (tid=ffdb7f8cf0;cpuset=[7])
> EAL: lcore 6 is ready (tid=ffdbffccf0;cpuset=[0,6])
>
> However, cores 0 and 6 belong to different numa nodes on the
> Loongson-3C5000LL CPU. Core 0 belongs to NUMA node 0 and core 6
> belongs to NUMA node 1, as follows:
> $ lscpu
> Architecture:          loongarch64
> Byte Order:            Little Endian
> CPU(s):                32
> On-line CPU(s) list:   0-31
> Thread(s) per core:    1
> Core(s) per socket:    4
> Socket(s):             8
> NUMA node(s):          8
> ...
> NUMA node0 CPU(s):     0-3
> NUMA node1 CPU(s):     4-7
> NUMA node2 CPU(s):     8-11
> NUMA node3 CPU(s):     12-15
> NUMA node4 CPU(s):     16-19
> NUMA node5 CPU(s):     20-23
> NUMA node6 CPU(s):     24-27
> NUMA node7 CPU(s):     28-31
> ...
>
> So the socket_id for lcores 0 and 6 will be set to -1, which can be
> seen from thread_update_affinity(). Meanwhile, I printed out the
> socket_id for lcores 0 to RTE_MAX_LCORE - 1 as follows:
> lcore_config[*].socket_id: -1 0 1 0 0 0 -1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5
> 5 5 6 6 6 6 7 7 7 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0
>
> In this test case, the modified malloc_get_numa_socket() will return -1,
> which causes a memory allocation failure.
> Is it acceptable in DPDK that the socket_id for an lcore is -1?
> If it's ok, maybe we can check the socket_id of the main lcore before using
> it, such as:
> diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
> index d7c410b786..3ee19aee15 100644
> --- a/lib/eal/common/malloc_heap.c
> +++ b/lib/eal/common/malloc_heap.c
> @@ -717,6 +717,10 @@ malloc_get_numa_socket(void)
>                  return socket_id;
>          }
>
> +        socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
> +        if (socket_id != (unsigned int)SOCKET_ID_ANY)
> +                return socket_id;
> +
>          return rte_socket_id_by_idx(0);
>  }

Yep, this is what I had in mind before going off.
v2 incoming.

Thanks Min!


-- 
David Marchand
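
For readers following the heuristic under discussion, here is a minimal
sketch of the fallback order that the snippet above would give: socket of
the calling thread, then socket of the main lcore, then the first detected
socket. The helper name numa_socket_for_alloc is illustrative only; the
real function, malloc_get_numa_socket() in lib/eal/common/malloc_heap.c,
also consults the EAL internal memory configuration, so treat this as a
sketch of the ordering rather than the applied patch.

#include <rte_lcore.h>
#include <rte_memory.h>

/*
 * Illustrative fallback order for picking a NUMA socket for allocations:
 *  1. socket of the calling thread, when its cpuset maps to one socket;
 *  2. socket of the main lcore (the proposed addition); this can still be
 *     SOCKET_ID_ANY when the main lcore cpuset spans sockets, as with
 *     cpuset=[0,6] on the LoongArch CI machine above;
 *  3. first detected socket, as a last resort.
 */
static int
numa_socket_for_alloc(void)
{
        unsigned int socket_id = rte_socket_id();

        if (socket_id != (unsigned int)SOCKET_ID_ANY)
                return socket_id;

        socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
        if (socket_id != (unsigned int)SOCKET_ID_ANY)
                return socket_id;

        return rte_socket_id_by_idx(0);
}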