* Re: [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 12:51 [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
@ 2018-06-01 11:00 ` Burakov, Anatoly
2018-06-01 12:22 ` Stojaczyk, DariuszX
2018-06-01 12:51 ` [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
` (2 subsequent siblings)
3 siblings, 1 reply; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 11:00 UTC (permalink / raw)
To: Dariusz Stojaczyk, dev
On 01-Jun-18 1:51 PM, Dariusz Stojaczyk wrote:
> EAL reserves a huge area in virtual address space
> to provide virtual address contiguity for e.g.
> future memory extensions (memory hotplug). During
> memory hotplug, if the hugepage mmap succeeds but
> doesn't suffice EAL's requiriments, the EAL would
> unmap this mapping straight away, leaving a hole in
> its virtual memory area and making it available
> to everyone. As EAL still thinks it owns the entire
> region, it may try to mmap it later with MAP_FIXED,
> possibly overriding a user's mapping that was made
> in the meantime.
>
> This patch ensures each hole is mapped back by EAL,
> so that it won't be available to anyone else.
>
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---
Generally, if the commit is a bugfix, reference to the original commit
that introduces the issue should be part of the commit message. See DPDK
contribution guidelines [1] and "git fixline" [2]. Also, this bug is
present in 18.05, so you should also Cc: stable@dpdk.org (same applies
for all of your other patches for memalloc).
[1] http://dpdk.org/doc/guides/contributing/patches.html
[2] http://dpdk.org/dev
> lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> index 8c11f98..b959ea5 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> @@ -39,6 +39,7 @@
> #include "eal_filesystem.h"
> #include "eal_internal_cfg.h"
> #include "eal_memalloc.h"
> +#include "eal_private.h"
>
> /*
> * not all kernel version support fallocate on hugetlbfs, so fall back to
> @@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
> int ret = 0;
> int fd;
> size_t alloc_sz;
> + int flags;
> + void *new_addr;
>
> /* takes out a read lock on segment or segment list */
> fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
> @@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>
> mapped:
> munmap(addr, alloc_sz);
> + flags = MAP_FIXED;
> +#ifdef RTE_ARCH_PPC_64
> + flags |= MAP_HUGETLB;
> +#endif
> + new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
Page size shouldn't be zero, should be alloc_sz.
> + if (new_addr != addr) {
> + if (new_addr != NULL) {
> + munmap(new_addr, alloc_sz);
> + }
> + /* We're leaving a hole in our virtual address space. If
> + * somebody else maps this hole now, we might accidentally
> + * override it in the future. */
> + rte_panic("can't mmap holes in our virtual address space");
I don't think rte_panic() here makes sense. First of all, we're now
striving to not panic inside DPDK libraries, especially once
initialization has already finished. But more importantly, when you
reach this panic, you're deep in the memory allocation process, which
means you hold a write lock to DPDK memory hotplug. If you crash now,
the lock will never be released and subsequent memory hotplug requests
will just deadlock.
What's worse is you may be inside a secondary process, synchronizing the
memory map with that of a primary, which means that even if you release
the lock here, you're basically releasing someone else's lock, so
behavior will be undefined at that point.
I think an error message with the highest possible log level should
suffice (CRIT?), as there's really nothing more we can do here.
That said, how likely is this scenario?
> + }
> resized:
> if (internal_config.single_file_segments) {
> resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
>
Generally, if the above is fixed, i'm OK with the patch.
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 11:00 ` Burakov, Anatoly
@ 2018-06-01 12:22 ` Stojaczyk, DariuszX
2018-06-01 14:59 ` Burakov, Anatoly
0 siblings, 1 reply; 16+ messages in thread
From: Stojaczyk, DariuszX @ 2018-06-01 12:22 UTC (permalink / raw)
To: Burakov, Anatoly, dev
> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Friday, June 1, 2018 1:00 PM
> On 01-Jun-18 1:51 PM, Dariusz Stojaczyk wrote:
> > EAL reserves a huge area in virtual address space to provide virtual
> > address contiguity for e.g.
> > future memory extensions (memory hotplug). During memory hotplug, if
> > the hugepage mmap succeeds but doesn't suffice EAL's requiriments, the
> > EAL would unmap this mapping straight away, leaving a hole in its
> > virtual memory area and making it available to everyone. As EAL still
> > thinks it owns the entire region, it may try to mmap it later with
> > MAP_FIXED, possibly overriding a user's mapping that was made in the
> > meantime.
> >
> > This patch ensures each hole is mapped back by EAL, so that it won't
> > be available to anyone else.
> >
> > Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> > ---
>
> Generally, if the commit is a bugfix, reference to the original commit that
> introduces the issue should be part of the commit message. See DPDK
> contribution guidelines [1] and "git fixline" [2]. Also, this bug is present in
> 18.05, so you should also Cc: stable@dpdk.org (same applies for all of your
> other patches for memalloc).
>
> [1] http://dpdk.org/doc/guides/contributing/patches.html
> [2] http://dpdk.org/dev
Ack, thanks.
>
> > lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
> > 1 file changed, 17 insertions(+)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > index 8c11f98..b959ea5 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > @@ -39,6 +39,7 @@
> > #include "eal_filesystem.h"
> > #include "eal_internal_cfg.h"
> > #include "eal_memalloc.h"
> > +#include "eal_private.h"
> >
> > /*
> > * not all kernel version support fallocate on hugetlbfs, so fall
> > back to @@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void
> *addr, int socket_id,
> > int ret = 0;
> > int fd;
> > size_t alloc_sz;
> > + int flags;
> > + void *new_addr;
> >
> > /* takes out a read lock on segment or segment list */
> > fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx); @@
> > -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int
> > socket_id,
> >
> > mapped:
> > munmap(addr, alloc_sz);
> > + flags = MAP_FIXED;
> > +#ifdef RTE_ARCH_PPC_64
> > + flags |= MAP_HUGETLB;
> > +#endif
> > + new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0,
> > +flags);
>
> Page size shouldn't be zero, should be alloc_sz.
The 0 above is for the `flags` parameter. Page size is set to alloc_sz.
```
void *
eal_get_virtual_area(void *requested_addr, size_t *size,
size_t page_sz, int flags, int mmap_flags);
```
I believe it's okay. Correct me if I'm wrong.
>
> > + if (new_addr != addr) {
> > + if (new_addr != NULL) {
> > + munmap(new_addr, alloc_sz);
> > + }
> > + /* We're leaving a hole in our virtual address space. If
> > + * somebody else maps this hole now, we might accidentally
> > + * override it in the future. */
> > + rte_panic("can't mmap holes in our virtual address space");
>
> I don't think rte_panic() here makes sense. First of all, we're now striving to
> not panic inside DPDK libraries, especially once initialization has already
> finished. But more importantly, when you reach this panic, you're deep in
> the memory allocation process, which means you hold a write lock to DPDK
> memory hotplug. If you crash now, the lock will never be released and
> subsequent memory hotplug requests will just deadlock.
>
> What's worse is you may be inside a secondary process, synchronizing the
> memory map with that of a primary, which means that even if you release
> the lock here, you're basically releasing someone else's lock, so behavior
> will be undefined at that point.
>
> I think an error message with the highest possible log level should suffice
> (CRIT?), as there's really nothing more we can do here.
Ok, I'll fallback to CRIT log for now. We could try to add some proper error handling in a separate patch later.
>
> That said, how likely is this scenario?
I can't think of any reason why that last mmap would fail, but it's still technically possible.
>
>
> > + }
> > resized:
> > if (internal_config.single_file_segments) {
> > resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
> >
>
> Generally, if the above is fixed, i'm OK with the patch.
>
> --
> Thanks,
> Anatoly
D.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 12:22 ` Stojaczyk, DariuszX
@ 2018-06-01 14:59 ` Burakov, Anatoly
0 siblings, 0 replies; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 14:59 UTC (permalink / raw)
To: Stojaczyk, DariuszX, dev
>> Page size shouldn't be zero, should be alloc_sz.
>
> The 0 above is for the `flags` parameter. Page size is set to alloc_sz.
>
> ```
> void *
> eal_get_virtual_area(void *requested_addr, size_t *size,
> size_t page_sz, int flags, int mmap_flags);
> ```
>
> I believe it's okay. Correct me if I'm wrong.
Apologies, you're right. I misread the code.
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
2018-06-01 12:51 [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
2018-06-01 11:00 ` Burakov, Anatoly
@ 2018-06-01 12:51 ` Dariusz Stojaczyk
2018-06-01 11:08 ` Burakov, Anatoly
2018-06-01 13:37 ` [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
2018-06-01 13:39 ` [dpdk-dev] [PATCH v2 " Dariusz Stojaczyk
3 siblings, 1 reply; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 12:51 UTC (permalink / raw)
To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk
This isn't documented in the manuals, but a failed
mmap(..., MAP_FIXED) may still unmap overlapping
regions. In such case, we need to remap these regions
back into our address space to ensure mem contiguity.
We do it unconditionally now on mmap failure just to
be safe.
Verified on Linux 4.9.0-4-amd64. I was getting
ENOMEM when trying to map in hugetlbfs with no space
left, but the previous anonymous mapping was still
being removed.
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
lib/librte_eal/linuxapp/eal/eal_memalloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index b959ea5..e59ad6e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -527,7 +527,13 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
if (va == MAP_FAILED) {
RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
strerror(errno));
- goto resized;
+ if (errno == ENOMEM) {
+ /* mmap failed, but the previous region might have
+ * been unmapped anyway; try to remap it */
+ goto unmapped;
+ } else {
+ goto resized;
+ }
}
if (va != addr) {
RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
@@ -588,6 +594,7 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
mapped:
munmap(addr, alloc_sz);
+unmapped:
flags = MAP_FIXED;
#ifdef RTE_ARCH_PPC_64
flags |= MAP_HUGETLB;
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
2018-06-01 12:51 ` [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
@ 2018-06-01 11:08 ` Burakov, Anatoly
2018-06-01 12:41 ` Stojaczyk, DariuszX
0 siblings, 1 reply; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 11:08 UTC (permalink / raw)
To: Dariusz Stojaczyk, dev
On 01-Jun-18 1:51 PM, Dariusz Stojaczyk wrote:
> This isn't documented in the manuals, but a failed
> mmap(..., MAP_FIXED) may still unmap overlapping
> regions. In such case, we need to remap these regions
> back into our address space to ensure mem contiguity.
> We do it unconditionally now on mmap failure just to
> be safe.
>
> Verified on Linux 4.9.0-4-amd64. I was getting
> ENOMEM when trying to map in hugetlbfs with no space
> left, but the previous anonymous mapping was still
> being removed.
>
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---
Does this also happen with other error values?
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
2018-06-01 11:08 ` Burakov, Anatoly
@ 2018-06-01 12:41 ` Stojaczyk, DariuszX
0 siblings, 0 replies; 16+ messages in thread
From: Stojaczyk, DariuszX @ 2018-06-01 12:41 UTC (permalink / raw)
To: Burakov, Anatoly, dev
> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Friday, June 1, 2018 1:09 PM
> To: Stojaczyk, DariuszX <dariuszx.stojaczyk@intel.com>; dev@dpdk.org
> Subject: Re: [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED
> mmap may still perform an unmap
>
> On 01-Jun-18 1:51 PM, Dariusz Stojaczyk wrote:
> > This isn't documented in the manuals, but a failed mmap(...,
> > MAP_FIXED) may still unmap overlapping regions. In such case, we need
> > to remap these regions back into our address space to ensure mem
> > contiguity.
> > We do it unconditionally now on mmap failure just to be safe.
> >
> > Verified on Linux 4.9.0-4-amd64. I was getting ENOMEM when trying to
> > map in hugetlbfs with no space left, but the previous anonymous
> > mapping was still being removed.
> >
> > Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> > ---
>
> Does this also happen with other error values?
The man pages aren't clear on this. It could. I know this patch only remaps the region if errrno == ENOMEM, but we could remap it unconditionally after each failed mmap just as well. It won't hurt if we remap a region unnecessarily. This was my original intent, but apparently I didn't fully commit my changes before publishing patches here. Uhh, sorry.
D.
>
> --
> Thanks,
> Anatoly
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 12:51 [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
2018-06-01 11:00 ` Burakov, Anatoly
2018-06-01 12:51 ` [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
@ 2018-06-01 13:37 ` Dariusz Stojaczyk
2018-06-01 13:39 ` [dpdk-dev] [PATCH v2 " Dariusz Stojaczyk
3 siblings, 0 replies; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 13:37 UTC (permalink / raw)
To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk
EAL reserves a huge area in virtual address space
to provide virtual address contiguity for e.g.
future memory extensions (memory hotplug). During
memory hotplug, if the hugepage mmap succeeds but
doesn't suffice EAL's requiriments, the EAL would
unmap this mapping straight away, leaving a hole in
its virtual memory area and making it available
to everyone. As EAL still thinks it owns the entire
region, it may try to mmap it later with MAP_FIXED,
possibly overriding a user's mapping that was made
in the meantime.
This patch ensures each hole is mapped back by EAL,
so that it won't be available to anyone else.
Changes from v1:
* checkpatch fixes
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8c11f98..2ab3d34 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,7 @@
#include "eal_filesystem.h"
#include "eal_internal_cfg.h"
#include "eal_memalloc.h"
+#include "eal_private.h"
/*
* not all kernel version support fallocate on hugetlbfs, so fall back to
@@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
int ret = 0;
int fd;
size_t alloc_sz;
+ int flags;
+ void *new_addr;
/* takes out a read lock on segment or segment list */
fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
@@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
mapped:
munmap(addr, alloc_sz);
+ flags = MAP_FIXED;
+#ifdef RTE_ARCH_PPC_64
+ flags |= MAP_HUGETLB;
+#endif
+ new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
+ if (new_addr != addr) {
+ if (new_addr != NULL)
+ munmap(new_addr, alloc_sz);
+ /* we're leaving a hole in our virtual address space. if
+ * somebody else maps this hole now, we could accidentally
+ * override it in the future.
+ */
+ rte_panic("can't mmap holes in our virtual address space");
+ }
resized:
if (internal_config.single_file_segments) {
resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 12:51 [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
` (2 preceding siblings ...)
2018-06-01 13:37 ` [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
@ 2018-06-01 13:39 ` Dariusz Stojaczyk
2018-06-01 10:24 ` Stojaczyk, DariuszX
2018-06-01 12:59 ` [dpdk-dev] [PATCH v3 " Dariusz Stojaczyk
3 siblings, 2 replies; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 13:39 UTC (permalink / raw)
To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk
EAL reserves a huge area in virtual address space
to provide virtual address contiguity for e.g.
future memory extensions (memory hotplug). During
memory hotplug, if the hugepage mmap succeeds but
doesn't suffice EAL's requiriments, the EAL would
unmap this mapping straight away, leaving a hole in
its virtual memory area and making it available
to everyone. As EAL still thinks it owns the entire
region, it may try to mmap it later with MAP_FIXED,
possibly overriding a user's mapping that was made
in the meantime.
This patch ensures each hole is mapped back by EAL,
so that it won't be available to anyone else.
Changes from v1:
* checkpatch fixes
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8c11f98..2ab3d34 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,7 @@
#include "eal_filesystem.h"
#include "eal_internal_cfg.h"
#include "eal_memalloc.h"
+#include "eal_private.h"
/*
* not all kernel version support fallocate on hugetlbfs, so fall back to
@@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
int ret = 0;
int fd;
size_t alloc_sz;
+ int flags;
+ void *new_addr;
/* takes out a read lock on segment or segment list */
fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
@@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
mapped:
munmap(addr, alloc_sz);
+ flags = MAP_FIXED;
+#ifdef RTE_ARCH_PPC_64
+ flags |= MAP_HUGETLB;
+#endif
+ new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
+ if (new_addr != addr) {
+ if (new_addr != NULL)
+ munmap(new_addr, alloc_sz);
+ /* we're leaving a hole in our virtual address space. if
+ * somebody else maps this hole now, we could accidentally
+ * override it in the future.
+ */
+ rte_panic("can't mmap holes in our virtual address space");
+ }
resized:
if (internal_config.single_file_segments) {
resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 13:39 ` [dpdk-dev] [PATCH v2 " Dariusz Stojaczyk
@ 2018-06-01 10:24 ` Stojaczyk, DariuszX
2018-06-01 11:14 ` Burakov, Anatoly
2018-06-01 12:59 ` [dpdk-dev] [PATCH v3 " Dariusz Stojaczyk
1 sibling, 1 reply; 16+ messages in thread
From: Stojaczyk, DariuszX @ 2018-06-01 10:24 UTC (permalink / raw)
To: dev, Burakov, Anatoly
The Intel SMTP server keeps dying to me. It's really not fun trying to resend these messages for half an hour without success. I'll resend this series from my private mail in the evening.
D.
> -----Original Message-----
> From: Stojaczyk, DariuszX
> Sent: Friday, June 1, 2018 3:39 PM
> To: dev@dpdk.org; Burakov, Anatoly <anatoly.burakov@intel.com>
> Cc: Stojaczyk, DariuszX <dariuszx.stojaczyk@intel.com>
> Subject: [PATCH v2 1/2] memalloc: do not leave unmapped holes in EAL
> virtual memory area
>
> EAL reserves a huge area in virtual address space to provide virtual address
> contiguity for e.g.
> future memory extensions (memory hotplug). During memory hotplug, if
> the hugepage mmap succeeds but doesn't suffice EAL's requiriments, the
> EAL would unmap this mapping straight away, leaving a hole in its virtual
> memory area and making it available to everyone. As EAL still thinks it owns
> the entire region, it may try to mmap it later with MAP_FIXED, possibly
> overriding a user's mapping that was made in the meantime.
>
> This patch ensures each hole is mapped back by EAL, so that it won't be
> available to anyone else.
>
> Changes from v1:
> * checkpatch fixes
>
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---
> lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> index 8c11f98..2ab3d34 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> @@ -39,6 +39,7 @@
> #include "eal_filesystem.h"
> #include "eal_internal_cfg.h"
> #include "eal_memalloc.h"
> +#include "eal_private.h"
>
> /*
> * not all kernel version support fallocate on hugetlbfs, so fall back to @@ -
> 490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int
> socket_id,
> int ret = 0;
> int fd;
> size_t alloc_sz;
> + int flags;
> + void *new_addr;
>
> /* takes out a read lock on segment or segment list */
> fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx); @@ -585,6
> +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>
> mapped:
> munmap(addr, alloc_sz);
> + flags = MAP_FIXED;
> +#ifdef RTE_ARCH_PPC_64
> + flags |= MAP_HUGETLB;
> +#endif
> + new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0,
> flags);
> + if (new_addr != addr) {
> + if (new_addr != NULL)
> + munmap(new_addr, alloc_sz);
> + /* we're leaving a hole in our virtual address space. if
> + * somebody else maps this hole now, we could accidentally
> + * override it in the future.
> + */
> + rte_panic("can't mmap holes in our virtual address space");
> + }
> resized:
> if (internal_config.single_file_segments) {
> resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
> --
> 2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 10:24 ` Stojaczyk, DariuszX
@ 2018-06-01 11:14 ` Burakov, Anatoly
0 siblings, 0 replies; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 11:14 UTC (permalink / raw)
To: Stojaczyk, DariuszX, dev
On 01-Jun-18 11:24 AM, Stojaczyk, DariuszX wrote:
> The Intel SMTP server keeps dying to me. It's really not fun trying to resend these messages for half an hour without success. I'll resend this series from my private mail in the evening.
>
> D.
>
Please fix your NTP server, your timestamps appear to be wrong (15:39 +2
hours).
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 13:39 ` [dpdk-dev] [PATCH v2 " Dariusz Stojaczyk
2018-06-01 10:24 ` Stojaczyk, DariuszX
@ 2018-06-01 12:59 ` Dariusz Stojaczyk
2018-06-01 12:59 ` [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
2018-06-01 15:06 ` [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Burakov, Anatoly
1 sibling, 2 replies; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 12:59 UTC (permalink / raw)
To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk, stable
EAL reserves a huge area in virtual address space
to provide virtual address contiguity for e.g.
future memory extensions (memory hotplug). During
memory hotplug, if the hugepage mmap succeeds but
doesn't suffice EAL's requiriments, the EAL would
unmap this mapping straight away, leaving a hole in
its virtual memory area and making it available
to everyone. As EAL still thinks it owns the entire
region, it may try to mmap it later with MAP_FIXED,
possibly overriding a user's mapping that was made
in the meantime.
This patch ensures each hole is mapped back by EAL,
so that it won't be available to anyone else.
Changes from v2:
* replaced rte_panic() with a CRIT log.
* added "git fixline" tags
Changes from v1:
* checkpatch fixes
Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Cc: anatoly.burakov@intel.com
Cc: stable@dpdk.org
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8c11f98..6be6680 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,7 @@
#include "eal_filesystem.h"
#include "eal_internal_cfg.h"
#include "eal_memalloc.h"
+#include "eal_private.h"
/*
* not all kernel version support fallocate on hugetlbfs, so fall back to
@@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
int ret = 0;
int fd;
size_t alloc_sz;
+ int flags;
+ void *new_addr;
/* takes out a read lock on segment or segment list */
fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
@@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
mapped:
munmap(addr, alloc_sz);
+ flags = MAP_FIXED;
+#ifdef RTE_ARCH_PPC_64
+ flags |= MAP_HUGETLB;
+#endif
+ new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
+ if (new_addr != addr) {
+ if (new_addr != NULL)
+ munmap(new_addr, alloc_sz);
+ /* we're leaving a hole in our virtual address space. if
+ * somebody else maps this hole now, we could accidentally
+ * override it in the future.
+ */
+ RTE_LOG(CRIT, EAL, "Can't mmap holes in our virtual address space\n");
+ }
resized:
if (internal_config.single_file_segments) {
resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
2018-06-01 12:59 ` [dpdk-dev] [PATCH v3 " Dariusz Stojaczyk
@ 2018-06-01 12:59 ` Dariusz Stojaczyk
2018-06-01 15:07 ` Burakov, Anatoly
2018-06-01 15:06 ` [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Burakov, Anatoly
1 sibling, 1 reply; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 12:59 UTC (permalink / raw)
To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk, stable
This isn't documented in the manuals, but a failed
mmap(..., MAP_FIXED) may still unmap overlapping
regions. In such case, we need to remap these regions
back into our address space to ensure mem contiguity.
We do it unconditionally now on mmap failure just to
be safe.
Verified on Linux 4.9.0-4-amd64. I was getting
ENOMEM when trying to map hugetlbfs with no space
left, and the previous anonymous mapping was still
being removed.
Changes from v2:
* added "git fixline" tags
Changes from v1:
* checkpatch fixes
* remapping is now done regardless of the mmap errno
Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Cc: anatoly.burakov@intel.com
Cc: stable@dpdk.org
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
lib/librte_eal/linuxapp/eal/eal_memalloc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 6be6680..81c94d5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -527,7 +527,10 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
if (va == MAP_FAILED) {
RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
strerror(errno));
- goto resized;
+ /* mmap failed, but the previous region might have been
+ * unmapped anyway. try to remap it
+ */
+ goto unmapped;
}
if (va != addr) {
RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
@@ -588,6 +591,7 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
mapped:
munmap(addr, alloc_sz);
+unmapped:
flags = MAP_FIXED;
#ifdef RTE_ARCH_PPC_64
flags |= MAP_HUGETLB;
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
2018-06-01 12:59 ` [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
@ 2018-06-01 15:07 ` Burakov, Anatoly
0 siblings, 0 replies; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 15:07 UTC (permalink / raw)
To: Dariusz Stojaczyk, dev; +Cc: stable
On 01-Jun-18 1:59 PM, Dariusz Stojaczyk wrote:
> This isn't documented in the manuals, but a failed
> mmap(..., MAP_FIXED) may still unmap overlapping
> regions. In such case, we need to remap these regions
> back into our address space to ensure mem contiguity.
> We do it unconditionally now on mmap failure just to
> be safe.
>
> Verified on Linux 4.9.0-4-amd64. I was getting
> ENOMEM when trying to map hugetlbfs with no space
> left, and the previous anonymous mapping was still
> being removed.
>
> Changes from v2:
> * added "git fixline" tags
>
> Changes from v1:
> * checkpatch fixes
> * remapping is now done regardless of the mmap errno
>
> Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> Cc: anatoly.burakov@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Version changes should not be part of commit message. Thomas, could you
please fix this on apply?
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 12:59 ` [dpdk-dev] [PATCH v3 " Dariusz Stojaczyk
2018-06-01 12:59 ` [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
@ 2018-06-01 15:06 ` Burakov, Anatoly
2018-07-12 21:42 ` [dpdk-dev] [dpdk-stable] " Thomas Monjalon
1 sibling, 1 reply; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 15:06 UTC (permalink / raw)
To: Dariusz Stojaczyk, dev; +Cc: stable
On 01-Jun-18 1:59 PM, Dariusz Stojaczyk wrote:
> EAL reserves a huge area in virtual address space
> to provide virtual address contiguity for e.g.
> future memory extensions (memory hotplug). During
> memory hotplug, if the hugepage mmap succeeds but
> doesn't suffice EAL's requiriments, the EAL would
> unmap this mapping straight away, leaving a hole in
> its virtual memory area and making it available
> to everyone. As EAL still thinks it owns the entire
> region, it may try to mmap it later with MAP_FIXED,
> possibly overriding a user's mapping that was made
> in the meantime.
>
> This patch ensures each hole is mapped back by EAL,
> so that it won't be available to anyone else.
>
> Changes from v2:
> * replaced rte_panic() with a CRIT log.
> * added "git fixline" tags
>
> Changes from v1:
> * checkpatch fixes
>
> Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> Cc: anatoly.burakov@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Thomas, could you please fix the commit message on apply?
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
2018-06-01 15:06 ` [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Burakov, Anatoly
@ 2018-07-12 21:42 ` Thomas Monjalon
0 siblings, 0 replies; 16+ messages in thread
From: Thomas Monjalon @ 2018-07-12 21:42 UTC (permalink / raw)
To: Dariusz Stojaczyk; +Cc: stable, Burakov, Anatoly, dev
01/06/2018 17:06, Burakov, Anatoly:
> On 01-Jun-18 1:59 PM, Dariusz Stojaczyk wrote:
> > EAL reserves a huge area in virtual address space
> > to provide virtual address contiguity for e.g.
> > future memory extensions (memory hotplug). During
> > memory hotplug, if the hugepage mmap succeeds but
> > doesn't suffice EAL's requiriments, the EAL would
> > unmap this mapping straight away, leaving a hole in
> > its virtual memory area and making it available
> > to everyone. As EAL still thinks it owns the entire
> > region, it may try to mmap it later with MAP_FIXED,
> > possibly overriding a user's mapping that was made
> > in the meantime.
> >
> > This patch ensures each hole is mapped back by EAL,
> > so that it won't be available to anyone else.
> >
> > Changes from v2:
> > * replaced rte_panic() with a CRIT log.
> > * added "git fixline" tags
> >
> > Changes from v1:
> > * checkpatch fixes
> >
> > Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> > Cc: anatoly.burakov@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> > ---
>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
> Thomas, could you please fix the commit message on apply?
Yes
Applied, thanks
^ permalink raw reply [flat|nested] 16+ messages in thread