From: Ilya Maximets <i.maximets@samsung.com>
To: Anatoly Burakov <anatoly.burakov@intel.com>, dev@dpdk.org
Cc: solal.pirelli@gmail.com, stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] mem: fix undefined behavior in NUMA code
Date: Wed, 29 Aug 2018 16:02:07 +0300
Message-ID: <20180829130044eucas1p111b03c624b50c67a2ed0fe4aa038adda~PXHJmVLuc1480114801eucas1p1R@eucas1p1.samsung.com>
In-Reply-To: <2624b855f3454691212cdb244f04926631c391a2.1535544966.git.anatoly.burakov@intel.com>
Hi.
Thanks for the fix.
Comments inline.
Best regards, Ilya Maximets.
On 29.08.2018 15:21, Anatoly Burakov wrote:
> When NUMA-aware hugepages config option is set, we rely on
> libnuma to tell the kernel to allocate hugepages on a specific
> NUMA node. However, we allocate node mask before we check if
> NUMA is available in the first place, which, according to
> the manpage [1], causes undefined behaviour.
>
> Fix by only using nodemask when we have NUMA available.
>
> [1] https://linux.die.net/man/3/numa_alloc_onnode
>
> Bugzilla ID: 20
>
> Fixes: 1b72605d2416 ("mem: balanced allocation of hugepages")
> Cc: i.maximets@samsung.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> lib/librte_eal/linuxapp/eal/eal_memory.c | 28 ++++++++++++++----------
> 1 file changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index dbf19499e..4976eeacd 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -263,7 +263,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
> int node_id = -1;
> int essential_prev = 0;
> int oldpolicy;
> - struct bitmask *oldmask = numa_allocate_nodemask();
> + struct bitmask *oldmask = NULL;
> bool have_numa = true;
> unsigned long maxnode = 0;
>
> @@ -275,6 +275,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
>
> if (have_numa) {
> RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
> + oldmask = numa_allocate_nodemask();
> if (get_mempolicy(&oldpolicy, oldmask->maskp,
> oldmask->size + 1, 0, 0) < 0) {
> RTE_LOG(ERR, EAL,
> @@ -390,19 +391,22 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
>
> out:
> #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
> - if (maxnode) {
> - RTE_LOG(DEBUG, EAL,
> - "Restoring previous memory policy: %d\n", oldpolicy);
> - if (oldpolicy == MPOL_DEFAULT) {
> - numa_set_localalloc();
> - } else if (set_mempolicy(oldpolicy, oldmask->maskp,
> - oldmask->size + 1) < 0) {
> - RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
> - strerror(errno));
> - numa_set_localalloc();
> + if (have_numa) {
> + if (maxnode) {
> + RTE_LOG(DEBUG, EAL,
> + "Restoring previous memory policy: %d\n",
> + oldpolicy);
> + if (oldpolicy == MPOL_DEFAULT) {
> + numa_set_localalloc();
> + } else if (set_mempolicy(oldpolicy, oldmask->maskp,
> + oldmask->size + 1) < 0) {
> + RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
> + strerror(errno));
> + numa_set_localalloc();
> + }
> }
> + numa_free_cpumask(oldmask);
> }
> - numa_free_cpumask(oldmask);
The original intent was to avoid ugly nested 'if's as much as possible.
'maxnode' is only initialized in the NUMA case, so there is no need
to check 'have_numa'. 'numa_free_cpumask' has 'free' semantics:
it checks its argument, so it is safe to call it with NULL.
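
To illustrate what 'free' semantics means here, a minimal self-contained
sketch follows. It does not call libnuma: 'struct mask', 'mask_alloc()',
'mask_free()' and 'run()' are hypothetical stand-ins (for 'struct bitmask',
'numa_allocate_nodemask()', 'numa_free_cpumask()' and the function body),
written only to show a release function that tolerates NULL the way
free(3) does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for libnuma's struct bitmask. */
struct mask {
	unsigned long size;   /* number of bits */
	unsigned long *maskp; /* bit storage */
};

static int live_masks; /* masks allocated but not yet released */

/* Hypothetical stand-in for numa_allocate_nodemask(). */
static struct mask *mask_alloc(unsigned long nbits)
{
	const unsigned long bits_per_word = 8 * sizeof(unsigned long);
	struct mask *m = malloc(sizeof(*m));

	if (m == NULL)
		return NULL;
	m->size = nbits;
	m->maskp = calloc((nbits + bits_per_word - 1) / bits_per_word,
			sizeof(unsigned long));
	if (m->maskp == NULL) {
		free(m);
		return NULL;
	}
	live_masks++;
	return m;
}

/* free(3)-like release: a NULL argument is a no-op. */
static void mask_free(struct mask *m)
{
	if (m == NULL)
		return;
	assert(live_masks > 0);
	free(m->maskp);
	free(m);
	live_masks--;
}

/*
 * Mirrors the shape of the 'out:' path: the mask is allocated only
 * when NUMA is usable, but it is released unconditionally.
 * Returns the number of masks still allocated afterwards.
 */
static int run(int have_numa)
{
	struct mask *oldmask = NULL;

	if (have_numa)
		oldmask = mask_alloc(64);
	/* ... work that touches oldmask only when have_numa is set ... */
	mask_free(oldmask); /* safe in both cases */
	return live_masks;
}
```

With a release function of this shape, the unconditional call at the end
of 'run()' needs no surrounding 'if', whichever branch was taken.
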
If you want to be fully compliant with the man page, you may use a less
invasive change like this:
---
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index dbf19499e..d0b9f3a2f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -390,7 +390,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
out:
#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
- if (maxnode) {
+ if (have_numa && maxnode) {
RTE_LOG(DEBUG, EAL,
"Restoring previous memory policy: %d\n", oldpolicy);
if (oldpolicy == MPOL_DEFAULT) {
@@ -402,7 +402,8 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
numa_set_localalloc();
}
}
- numa_free_cpumask(oldmask);
+ if (oldmask)
+ numa_free_cpumask(oldmask);
#endif
return i;
}
---
But still, checking both 'have_numa && maxnode' is, IMHO, unnecessary.
As this change is cosmetic (the issue doesn't produce any real bug),
I'd like to avoid changing the functional code into something less readable.
This will also complicate the 'git blame' process.
What do you think?
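
The redundancy argument can be sketched in isolation, assuming (as stated
above) that 'maxnode' starts at 0 and is assigned only in the NUMA branch.
'struct numa_state', 'setup()', 'restore_nested()', 'restore_flat()' and
'check_equivalent()' are hypothetical names for this illustration, not
DPDK code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two variables discussed in this thread. */
struct numa_state {
	bool have_numa;
	unsigned long maxnode;
};

/* Assumption: maxnode stays 0 unless the NUMA branch is taken. */
static struct numa_state setup(bool numa_available, unsigned long nodes)
{
	struct numa_state s = { .have_numa = numa_available, .maxnode = 0 };

	if (s.have_numa)
		s.maxnode = nodes;
	return s;
}

/* Shape of the patch under review: nested checks. */
static bool restore_nested(const struct numa_state *s)
{
	if (s->have_numa) {
		if (s->maxnode)
			return true; /* the policy would be restored here */
	}
	return false;
}

/* Shape of the existing code: a single check on maxnode. */
static bool restore_flat(const struct numa_state *s)
{
	return s->maxnode != 0;
}

/* Both variants must take the same decision for a given setup. */
static void check_equivalent(bool numa_available, unsigned long nodes)
{
	struct numa_state s = setup(numa_available, nodes);

	assert(restore_nested(&s) == restore_flat(&s));
}
```

Under that assumption the two variants agree for every input, so the
choice between them is about readability rather than behaviour.
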
> #endif
> return i;
> }
>