patches for DPDK stable branches
From: Nick Connolly <nick.connolly@mayadata.io>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Cc: dev@dpdk.org, nicolas.dichtel@6wind.com, stable@dpdk.org
Subject: Re: [dpdk-stable] [PATCH] mem: fix allocation failure on non-NUMA kernel
Date: Thu, 17 Sep 2020 13:29:00 +0100
Message-ID: <8ebfbd1f-9a2b-3421-1d35-a3cf070ce8df@mayadata.io>
In-Reply-To: <be6c5f8d-aaca-5a73-3fff-03febb3533ad@intel.com>

Hi Anatoly,

Thanks for the response.  You are asking a good question - here's what I 
know:

The issue arose on a single-socket system running WSL2 (a full Linux 
kernel running as a lightweight VM under Windows).
The default kernel in this environment is built with CONFIG_NUMA=n, so 
get_mempolicy() fails with ENOSYS.
As a result, the check that the allocated memory is associated with the 
correct socket fails, and so does the allocation.

The change is to skip the allocation check if check_numa() indicates 
that NUMA-aware memory is not supported.
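
For illustration, the kind of probe that check_numa() stands for can be
sketched as below. This is a minimal sketch, not the EAL code:
kernel_has_mempolicy() is a hypothetical helper name, and it uses the
raw get_mempolicy system call so that it builds without libnuma. It
treats only ENOSYS as "no NUMA support"; any other outcome is assumed
to mean the system call exists.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DPDK): returns 0 if the kernel
 * answers get_mempolicy() with ENOSYS, as a CONFIG_NUMA=n kernel
 * does, and 1 in every other case.
 */
static int
kernel_has_mempolicy(void)
{
	/* Ask for the calling thread's policy; no output is requested. */
	long ret = syscall(SYS_get_mempolicy, NULL, NULL, 0UL, NULL, 0UL);

	if (ret < 0 && errno == ENOSYS)
		return 0;
	return 1;
}
```

A caller would skip the socket verification when the helper returns 0.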

Researching the meaning of CONFIG_NUMA, I found 
https://cateee.net/lkddb/web-lkddb/NUMA.html which says:
> Enable NUMA (Non-Uniform Memory Access) support.
> The kernel will try to allocate memory used by a CPU on the local 
> memory controller of the CPU and add some more NUMA awareness to the 
> kernel.

Clearly CONFIG_NUMA enables NUMA-aware memory allocation, but the 
description gives no indication whether information about the physical 
NUMA architecture is 'hidden' when it is disabled, or whether it is 
still exposed through /sys/devices/system/node* (which is used by the 
rte initialisation code to determine how many sockets there are). 
Unfortunately, I don't have ready access to a multi-socket Linux system 
that I can test this out on, so I took the conservative approach: I 
assumed that a kernel with CONFIG_NUMA disabled might still report more 
than one node, and coded the change to generate a debug message if this 
occurs.
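
To make the open question concrete, one way to see what a given kernel
exposes is to count the node<N> entries in sysfs. The sketch below is
an assumption-laden illustration, not the rte initialisation code:
count_sysfs_nodes() is a hypothetical helper, and the directory to
scan is passed in by the caller.

```c
#include <ctype.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of DPDK): counts directory entries
 * named "node<N>", where <N> is one or more decimal digits.
 * Returns -1 if the directory cannot be opened.
 */
static int
count_sysfs_nodes(const char *dir)
{
	DIR *d = opendir(dir);
	struct dirent *e;
	int count = 0;

	if (d == NULL)
		return -1;

	while ((e = readdir(d)) != NULL) {
		const char *p;

		if (strncmp(e->d_name, "node", 4) != 0)
			continue;
		p = e->d_name + 4;
		if (*p == '\0')
			continue;	/* bare "node", no index */
		while (isdigit((unsigned char)*p))
			p++;
		if (*p == '\0')
			count++;	/* the whole suffix was digits */
	}
	closedir(d);
	return count;
}
```

Calling count_sysfs_nodes("/sys/devices/system/node") on a kernel built
with CONFIG_NUMA=n and on one built with CONFIG_NUMA=y would show
whether the node information differs between the two.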

Do you know whether CONFIG_NUMA turns off all knowledge about the 
hardware architecture?  If it does, then I agree that the test for 
rte_socket_count() serves no purpose and should be removed.

Many thanks,
Nick


On 17/09/2020 12:31, Burakov, Anatoly wrote:
> On 05-Aug-20 1:26 PM, Nick Connolly wrote:
>> Running dpdk-helloworld on Linux with libnuma present,
>> but no kernel support for NUMA (CONFIG_NUMA=n) causes
>> rte_service_init() to fail with EAL: error allocating
>> rte services array.
>>
>> alloc_seg() calls get_mempolicy to verify that the allocation
>> has happened on the correct socket, but receives ENOSYS from
>> the kernel and fails the allocation.
>>
>> The allocated socket should only be verified if check_numa() is true.
>>
>> Fixes: 2a96c88be83e ("mem: ease init in a docker container")
>> Cc: nicolas.dichtel@6wind.com
>> Cc: stable@dpdk.org
>> Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
>> ---
>>   lib/librte_eal/linux/eal_memalloc.c | 28 +++++++++++++++++-----------
>>   1 file changed, 17 insertions(+), 11 deletions(-)
>>
>> diff --git a/lib/librte_eal/linux/eal_memalloc.c b/lib/librte_eal/linux/eal_memalloc.c
>> index db60e7997..179757809 100644
>> --- a/lib/librte_eal/linux/eal_memalloc.c
>> +++ b/lib/librte_eal/linux/eal_memalloc.c
>> @@ -610,17 +610,23 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>>       }
>>
>>   #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
>> -    ret = get_mempolicy(&cur_socket_id, NULL, 0, addr,
>> -                MPOL_F_NODE | MPOL_F_ADDR);
>> -    if (ret < 0) {
>> -        RTE_LOG(DEBUG, EAL, "%s(): get_mempolicy: %s\n",
>> -            __func__, strerror(errno));
>> -        goto mapped;
>> -    } else if (cur_socket_id != socket_id) {
>> -        RTE_LOG(DEBUG, EAL,
>> -                "%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
>> -            __func__, socket_id, cur_socket_id);
>> -        goto mapped;
>> +    if (check_numa()) {
>> +        ret = get_mempolicy(&cur_socket_id, NULL, 0, addr,
>> +                    MPOL_F_NODE | MPOL_F_ADDR);
>> +        if (ret < 0) {
>> +            RTE_LOG(DEBUG, EAL, "%s(): get_mempolicy: %s\n",
>> +                __func__, strerror(errno));
>> +            goto mapped;
>> +        } else if (cur_socket_id != socket_id) {
>> +            RTE_LOG(DEBUG, EAL,
>> +                    "%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
>> +                __func__, socket_id, cur_socket_id);
>> +            goto mapped;
>> +        }
>> +    } else {
>> +        if (rte_socket_count() > 1)
>> +            RTE_LOG(DEBUG, EAL, "%s(): not checking socket for allocation (wanted %d)\n",
>> +                    __func__, socket_id);
>>       }
>
> If there is no kernel support for NUMA, how would we end up with >1 
> socket count?
>
>>   #else
>>       if (rte_socket_count() > 1)
>>
>
>

