patches for DPDK stable branches
From: Nick Connolly <nick.connolly@mayadata.io>
To: nicolas.dichtel@6wind.com, Anatoly Burakov <anatoly.burakov@intel.com>
Cc: dev@dpdk.org, stable@dpdk.org
Subject: Re: [dpdk-stable] [PATCH] mem: fix allocation failure on non-NUMA kernel
Date: Wed, 5 Aug 2020 15:53:12 +0100
Message-ID: <e960495c-fc4d-3fb9-8d30-ad12f3047994@mayadata.io>
In-Reply-To: <906edd92-cda3-9e79-ebdd-29a944082b61@6wind.com>



On 05/08/2020 15:36, Nicolas Dichtel wrote:
> On 05/08/2020 16:20, Nick Connolly wrote:
> [snip]
>>>> Fixes: 2a96c88be83e ("mem: ease init in a docker container")
>>> I'm wondering if the bug existed before this commit.
>>>
>>> Before this commit, it was:
>>>          move_pages(getpid(), 1, &addr, NULL, &cur_socket_id, 0);
>>>          if (cur_socket_id != socket_id) {
>>>                  /* error */
>>>
>>> Isn't it possible to hit this error case if CONFIG_NUMA is unset in the kernel?
>> I've just run the previous code to test this out and you are right that
>> move_pages does indeed return -1 with errno set to ENOSYS, but nothing checks
>> this so execution carries on and compares cur_socket_id (which will be unchanged
>> from the zero initialization) with socket_id (which is presumably also zero),
>> thus allowing the allocation to succeed!
> I came to the same conclusion, but I didn't check whether socket_id could be != 0.
>
>>> [snip]
>>>> +    if (check_numa()) {
>>>> +        ret = get_mempolicy(&cur_socket_id, NULL, 0, addr,
>>>> +                    MPOL_F_NODE | MPOL_F_ADDR);
>>>> +        if (ret < 0) {
>>>> +            RTE_LOG(DEBUG, EAL, "%s(): get_mempolicy: %s\n",
>>>> +                __func__, strerror(errno));
>>>> +            goto mapped;
>>>> +        } else if (cur_socket_id != socket_id) {
>>>> +            RTE_LOG(DEBUG, EAL,
>>>> +                    "%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
>>>> +                __func__, socket_id, cur_socket_id);
>>>> +            goto mapped;
>>>> +        }
>>>> +    } else {
>>>> +        if (rte_socket_count() > 1)
>>>> +            RTE_LOG(DEBUG, EAL, "%s(): not checking socket for allocation (wanted %d)\n",
>>>> +                    __func__, socket_id);
>>> nit: maybe a higher log level like WARNING?
>> Open to guidance here - my concern was that this is going to be generated for
>> every call to alloc_seg() and I'm not sure what the frequency will be - I'm
>> cautious about flooding the log with warnings under 'normal running'.  Are the
>> implications of running on a multi socket system with NUMA support disabled in
>> the kernel purely performance related for the DPDK or is there a functional
>> correctness issue as well?
> Is it really 'normal running' to have CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES in
> DPDK and not CONFIG_NUMA in the kernel?

I'm not a DPDK expert, but I think it needs to be treated as 'normal
running', for the following reasons:

 1. The existing code in eal_memalloc_alloc_seg_bulk() is designed to
    work even if check_numa() indicates that NUMA support is not enabled:

    #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
    if (check_numa()) {
        oldmask = numa_allocate_nodemask();
        prepare_numa(&oldpolicy, oldmask, socket);
        have_numa = true;
    }
    #endif
 2. The DPDK application could be built with
    CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES and then the same binary run on
    different systems, some with NUMA support in the kernel and some without.

Regards,
Nick
