DPDK patches and discussions
 help / color / mirror / Atom feed
* Re: [dpdk-dev] [PATCH v2 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 13:39 ` [dpdk-dev] [PATCH v2 " Dariusz Stojaczyk
@ 2018-06-01 10:24   ` Stojaczyk, DariuszX
  2018-06-01 11:14     ` Burakov, Anatoly
  2018-06-01 12:59   ` [dpdk-dev] [PATCH v3 " Dariusz Stojaczyk
  1 sibling, 1 reply; 16+ messages in thread
From: Stojaczyk, DariuszX @ 2018-06-01 10:24 UTC (permalink / raw)
  To: dev, Burakov, Anatoly

The Intel SMTP server keeps dying to me. It's really not fun trying to resend these messages for half an hour without success. I'll resend this series from my private mail in the evening.

D.

> -----Original Message-----
> From: Stojaczyk, DariuszX
> Sent: Friday, June 1, 2018 3:39 PM
> To: dev@dpdk.org; Burakov, Anatoly <anatoly.burakov@intel.com>
> Cc: Stojaczyk, DariuszX <dariuszx.stojaczyk@intel.com>
> Subject: [PATCH v2 1/2] memalloc: do not leave unmapped holes in EAL
> virtual memory area
> 
> EAL reserves a huge area in virtual address space to provide virtual address
> contiguity for e.g.
> future memory extensions (memory hotplug). During memory hotplug, if
> the hugepage mmap succeeds but doesn't suffice EAL's requiriments, the
> EAL would unmap this mapping straight away, leaving a hole in its virtual
> memory area and making it available to everyone. As EAL still thinks it owns
> the entire region, it may try to mmap it later with MAP_FIXED, possibly
> overriding a user's mapping that was made in the meantime.
> 
> This patch ensures each hole is mapped back by EAL, so that it won't be
> available to anyone else.
> 
> Changes from v1:
>  * checkpatch fixes
> 
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> index 8c11f98..2ab3d34 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> @@ -39,6 +39,7 @@
>  #include "eal_filesystem.h"
>  #include "eal_internal_cfg.h"
>  #include "eal_memalloc.h"
> +#include "eal_private.h"
> 
>  /*
>   * not all kernel version support fallocate on hugetlbfs, so fall back to @@ -
> 490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int
> socket_id,
>  	int ret = 0;
>  	int fd;
>  	size_t alloc_sz;
> +	int flags;
> +	void *new_addr;
> 
>  	/* takes out a read lock on segment or segment list */
>  	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx); @@ -585,6
> +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
> 
>  mapped:
>  	munmap(addr, alloc_sz);
> +	flags = MAP_FIXED;
> +#ifdef RTE_ARCH_PPC_64
> +	flags |= MAP_HUGETLB;
> +#endif
> +	new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0,
> flags);
> +	if (new_addr != addr) {
> +		if (new_addr != NULL)
> +			munmap(new_addr, alloc_sz);
> +		/* we're leaving a hole in our virtual address space. if
> +		 * somebody else maps this hole now, we could accidentally
> +		 * override it in the future.
> +		 */
> +		rte_panic("can't mmap holes in our virtual address space");
> +	}
>  resized:
>  	if (internal_config.single_file_segments) {
>  		resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
> --
> 2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 12:51 [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
@ 2018-06-01 11:00 ` Burakov, Anatoly
  2018-06-01 12:22   ` Stojaczyk, DariuszX
  2018-06-01 12:51 ` [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 11:00 UTC (permalink / raw)
  To: Dariusz Stojaczyk, dev

On 01-Jun-18 1:51 PM, Dariusz Stojaczyk wrote:
> EAL reserves a huge area in virtual address space
> to provide virtual address contiguity for e.g.
> future memory extensions (memory hotplug). During
> memory hotplug, if the hugepage mmap succeeds but
> doesn't suffice EAL's requiriments, the EAL would
> unmap this mapping straight away, leaving a hole in
> its virtual memory area and making it available
> to everyone. As EAL still thinks it owns the entire
> region, it may try to mmap it later with MAP_FIXED,
> possibly overriding a user's mapping that was made
> in the meantime.
> 
> This patch ensures each hole is mapped back by EAL,
> so that it won't be available to anyone else.
> 
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---

Generally, if the commit is a bugfix, reference to the original commit 
that introduces the issue should be part of the commit message. See DPDK 
contribution guidelines [1] and "git fixline" [2]. Also, this bug is 
present in 18.05, so you should also Cc: stable@dpdk.org (same applies 
for all of your other patches for memalloc).

[1] http://dpdk.org/doc/guides/contributing/patches.html
[2] http://dpdk.org/dev

>   lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
>   1 file changed, 17 insertions(+)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> index 8c11f98..b959ea5 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> @@ -39,6 +39,7 @@
>   #include "eal_filesystem.h"
>   #include "eal_internal_cfg.h"
>   #include "eal_memalloc.h"
> +#include "eal_private.h"
>   
>   /*
>    * not all kernel version support fallocate on hugetlbfs, so fall back to
> @@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>   	int ret = 0;
>   	int fd;
>   	size_t alloc_sz;
> +	int flags;
> +	void *new_addr;
>   
>   	/* takes out a read lock on segment or segment list */
>   	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
> @@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>   
>   mapped:
>   	munmap(addr, alloc_sz);
> +	flags = MAP_FIXED;
> +#ifdef RTE_ARCH_PPC_64
> +	flags |= MAP_HUGETLB;
> +#endif
> +	new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);

Page size shouldn't be zero, should be alloc_sz.

> +	if (new_addr != addr) {
> +		if (new_addr != NULL) {
> +			munmap(new_addr, alloc_sz);
> +		}
> +		/* We're leaving a hole in our virtual address space. If
> +		 * somebody else maps this hole now, we might accidentally
> +		 * override it in the future. */
> +		rte_panic("can't mmap holes in our virtual address space");

I don't think rte_panic() here makes sense. First of all, we're now 
striving to not panic inside DPDK libraries, especially once 
initialization has already finished. But more importantly, when you 
reach this panic, you're deep in the memory allocation process, which 
means you hold a write lock to DPDK memory hotplug. If you crash now, 
the lock will never be released and subsequent memory hotplug requests 
will just deadlock.

What's worse is you may be inside a secondary process, synchronizing the 
memory map with that of a primary, which means that even if you release 
the lock here, you're basically releasing someone else's lock, so 
behavior will be undefined at that point.

I think an error message with the highest possible log level should 
suffice (CRIT?), as there's really nothing more we can do here.

That said, how likely is this scenario?

> +	}
>   resized:
>   	if (internal_config.single_file_segments) {
>   		resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
> 

Generally, if the above is fixed, i'm OK with the patch.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
  2018-06-01 12:51 ` [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
@ 2018-06-01 11:08   ` Burakov, Anatoly
  2018-06-01 12:41     ` Stojaczyk, DariuszX
  0 siblings, 1 reply; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 11:08 UTC (permalink / raw)
  To: Dariusz Stojaczyk, dev

On 01-Jun-18 1:51 PM, Dariusz Stojaczyk wrote:
> This isn't documented in the manuals, but a failed
> mmap(..., MAP_FIXED) may still unmap overlapping
> regions. In such case, we need to remap these regions
> back into our address space to ensure mem contiguity.
> We do it unconditionally now on mmap failure just to
> be safe.
> 
> Verified on Linux 4.9.0-4-amd64. I was getting
> ENOMEM when trying to map in hugetlbfs with no space
> left, but the previous anonymous mapping was still
> being removed.
> 
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---

Does this also happen with other error values?

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 10:24   ` Stojaczyk, DariuszX
@ 2018-06-01 11:14     ` Burakov, Anatoly
  0 siblings, 0 replies; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 11:14 UTC (permalink / raw)
  To: Stojaczyk, DariuszX, dev

On 01-Jun-18 11:24 AM, Stojaczyk, DariuszX wrote:
> The Intel SMTP server keeps dying to me. It's really not fun trying to resend these messages for half an hour without success. I'll resend this series from my private mail in the evening.
> 
> D.
> 

Please fix your NTP server, your timestamps appear to be wrong (15:39 +2 
hours).

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 11:00 ` Burakov, Anatoly
@ 2018-06-01 12:22   ` Stojaczyk, DariuszX
  2018-06-01 14:59     ` Burakov, Anatoly
  0 siblings, 1 reply; 16+ messages in thread
From: Stojaczyk, DariuszX @ 2018-06-01 12:22 UTC (permalink / raw)
  To: Burakov, Anatoly, dev


> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Friday, June 1, 2018 1:00 PM
> On 01-Jun-18 1:51 PM, Dariusz Stojaczyk wrote:
> > EAL reserves a huge area in virtual address space to provide virtual
> > address contiguity for e.g.
> > future memory extensions (memory hotplug). During memory hotplug, if
> > the hugepage mmap succeeds but doesn't suffice EAL's requiriments, the
> > EAL would unmap this mapping straight away, leaving a hole in its
> > virtual memory area and making it available to everyone. As EAL still
> > thinks it owns the entire region, it may try to mmap it later with
> > MAP_FIXED, possibly overriding a user's mapping that was made in the
> > meantime.
> >
> > This patch ensures each hole is mapped back by EAL, so that it won't
> > be available to anyone else.
> >
> > Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> > ---
> 
> Generally, if the commit is a bugfix, reference to the original commit that
> introduces the issue should be part of the commit message. See DPDK
> contribution guidelines [1] and "git fixline" [2]. Also, this bug is present in
> 18.05, so you should also Cc: stable@dpdk.org (same applies for all of your
> other patches for memalloc).
> 
> [1] http://dpdk.org/doc/guides/contributing/patches.html
> [2] http://dpdk.org/dev

Ack, thanks.

> 
> >   lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
> >   1 file changed, 17 insertions(+)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > index 8c11f98..b959ea5 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > @@ -39,6 +39,7 @@
> >   #include "eal_filesystem.h"
> >   #include "eal_internal_cfg.h"
> >   #include "eal_memalloc.h"
> > +#include "eal_private.h"
> >
> >   /*
> >    * not all kernel version support fallocate on hugetlbfs, so fall
> > back to @@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void
> *addr, int socket_id,
> >   	int ret = 0;
> >   	int fd;
> >   	size_t alloc_sz;
> > +	int flags;
> > +	void *new_addr;
> >
> >   	/* takes out a read lock on segment or segment list */
> >   	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx); @@
> > -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int
> > socket_id,
> >
> >   mapped:
> >   	munmap(addr, alloc_sz);
> > +	flags = MAP_FIXED;
> > +#ifdef RTE_ARCH_PPC_64
> > +	flags |= MAP_HUGETLB;
> > +#endif
> > +	new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0,
> > +flags);
> 
> Page size shouldn't be zero, should be alloc_sz.

The 0 above is for the `flags` parameter. Page size is set to alloc_sz.

```
void *
eal_get_virtual_area(void *requested_addr, size_t *size,
		size_t page_sz, int flags, int mmap_flags);
```

I believe it's okay. Correct me if I'm wrong.

> 
> > +	if (new_addr != addr) {
> > +		if (new_addr != NULL) {
> > +			munmap(new_addr, alloc_sz);
> > +		}
> > +		/* We're leaving a hole in our virtual address space. If
> > +		 * somebody else maps this hole now, we might accidentally
> > +		 * override it in the future. */
> > +		rte_panic("can't mmap holes in our virtual address space");
> 
> I don't think rte_panic() here makes sense. First of all, we're now striving to
> not panic inside DPDK libraries, especially once initialization has already
> finished. But more importantly, when you reach this panic, you're deep in
> the memory allocation process, which means you hold a write lock to DPDK
> memory hotplug. If you crash now, the lock will never be released and
> subsequent memory hotplug requests will just deadlock.
> 
> What's worse is you may be inside a secondary process, synchronizing the
> memory map with that of a primary, which means that even if you release
> the lock here, you're basically releasing someone else's lock, so behavior
> will be undefined at that point.
> 
> I think an error message with the highest possible log level should suffice
> (CRIT?), as there's really nothing more we can do here.

Ok, I'll fallback to CRIT log for now. We could try to add some proper error handling in a separate patch later.

> 
> That said, how likely is this scenario?

I can't think of any reason why that last mmap would fail, but it's still technically possible.

>
> 
> > +	}
> >   resized:
> >   	if (internal_config.single_file_segments) {
> >   		resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
> >
> 
> Generally, if the above is fixed, i'm OK with the patch.
> 
> --
> Thanks,
> Anatoly

D.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
  2018-06-01 11:08   ` Burakov, Anatoly
@ 2018-06-01 12:41     ` Stojaczyk, DariuszX
  0 siblings, 0 replies; 16+ messages in thread
From: Stojaczyk, DariuszX @ 2018-06-01 12:41 UTC (permalink / raw)
  To: Burakov, Anatoly, dev


> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Friday, June 1, 2018 1:09 PM
> To: Stojaczyk, DariuszX <dariuszx.stojaczyk@intel.com>; dev@dpdk.org
> Subject: Re: [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED
> mmap may still perform an unmap
> 
> On 01-Jun-18 1:51 PM, Dariusz Stojaczyk wrote:
> > This isn't documented in the manuals, but a failed mmap(...,
> > MAP_FIXED) may still unmap overlapping regions. In such case, we need
> > to remap these regions back into our address space to ensure mem
> > contiguity.
> > We do it unconditionally now on mmap failure just to be safe.
> >
> > Verified on Linux 4.9.0-4-amd64. I was getting ENOMEM when trying to
> > map in hugetlbfs with no space left, but the previous anonymous
> > mapping was still being removed.
> >
> > Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> > ---
> 
> Does this also happen with other error values?

The man pages aren't clear on this. It could. I know this patch only remaps the region if errrno == ENOMEM, but we could remap it unconditionally after each failed mmap just as well.  It won't hurt if we remap a region unnecessarily. This was my original intent, but apparently I didn't fully commit my changes before publishing patches here. Uhh, sorry.

D.

> 
> --
> Thanks,
> Anatoly

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
@ 2018-06-01 12:51 Dariusz Stojaczyk
  2018-06-01 11:00 ` Burakov, Anatoly
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 12:51 UTC (permalink / raw)
  To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk

EAL reserves a huge area in virtual address space
to provide virtual address contiguity for e.g.
future memory extensions (memory hotplug). During
memory hotplug, if the hugepage mmap succeeds but
doesn't suffice EAL's requiriments, the EAL would
unmap this mapping straight away, leaving a hole in
its virtual memory area and making it available
to everyone. As EAL still thinks it owns the entire
region, it may try to mmap it later with MAP_FIXED,
possibly overriding a user's mapping that was made
in the meantime.

This patch ensures each hole is mapped back by EAL,
so that it won't be available to anyone else.

Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8c11f98..b959ea5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,7 @@
 #include "eal_filesystem.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
+#include "eal_private.h"
 
 /*
  * not all kernel version support fallocate on hugetlbfs, so fall back to
@@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	int ret = 0;
 	int fd;
 	size_t alloc_sz;
+	int flags;
+	void *new_addr;
 
 	/* takes out a read lock on segment or segment list */
 	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
@@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 
 mapped:
 	munmap(addr, alloc_sz);
+	flags = MAP_FIXED;
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+	new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
+	if (new_addr != addr) {
+		if (new_addr != NULL) {
+			munmap(new_addr, alloc_sz);
+		}
+		/* We're leaving a hole in our virtual address space. If
+		 * somebody else maps this hole now, we might accidentally
+		 * override it in the future. */
+		rte_panic("can't mmap holes in our virtual address space");
+	}
 resized:
 	if (internal_config.single_file_segments) {
 		resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
  2018-06-01 12:51 [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
  2018-06-01 11:00 ` Burakov, Anatoly
@ 2018-06-01 12:51 ` Dariusz Stojaczyk
  2018-06-01 11:08   ` Burakov, Anatoly
  2018-06-01 13:37 ` [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
  2018-06-01 13:39 ` [dpdk-dev] [PATCH v2 " Dariusz Stojaczyk
  3 siblings, 1 reply; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 12:51 UTC (permalink / raw)
  To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk

This isn't documented in the manuals, but a failed
mmap(..., MAP_FIXED) may still unmap overlapping
regions. In such case, we need to remap these regions
back into our address space to ensure mem contiguity.
We do it unconditionally now on mmap failure just to
be safe.

Verified on Linux 4.9.0-4-amd64. I was getting
ENOMEM when trying to map in hugetlbfs with no space
left, but the previous anonymous mapping was still
being removed.

Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index b959ea5..e59ad6e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -527,7 +527,13 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	if (va == MAP_FAILED) {
 		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
 			strerror(errno));
-		goto resized;
+		if (errno == ENOMEM) {
+			/* mmap failed, but the previous region might have
+			 * been unmapped anyway; try to remap it */
+			goto unmapped;
+		} else {
+			goto resized;
+		}
 	}
 	if (va != addr) {
 		RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
@@ -588,6 +594,7 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 
 mapped:
 	munmap(addr, alloc_sz);
+unmapped:
 	flags = MAP_FIXED;
 #ifdef RTE_ARCH_PPC_64
 	flags |= MAP_HUGETLB;
-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 13:39 ` [dpdk-dev] [PATCH v2 " Dariusz Stojaczyk
  2018-06-01 10:24   ` Stojaczyk, DariuszX
@ 2018-06-01 12:59   ` Dariusz Stojaczyk
  2018-06-01 12:59     ` [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
  2018-06-01 15:06     ` [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Burakov, Anatoly
  1 sibling, 2 replies; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 12:59 UTC (permalink / raw)
  To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk, stable

EAL reserves a huge area in virtual address space
to provide virtual address contiguity for e.g.
future memory extensions (memory hotplug). During
memory hotplug, if the hugepage mmap succeeds but
doesn't suffice EAL's requiriments, the EAL would
unmap this mapping straight away, leaving a hole in
its virtual memory area and making it available
to everyone. As EAL still thinks it owns the entire
region, it may try to mmap it later with MAP_FIXED,
possibly overriding a user's mapping that was made
in the meantime.

This patch ensures each hole is mapped back by EAL,
so that it won't be available to anyone else.

Changes from v2:
  * replaced rte_panic() with a CRIT log.
  * added "git fixline" tags

Changes from v1:
 * checkpatch fixes

Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Cc: anatoly.burakov@intel.com
Cc: stable@dpdk.org

Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8c11f98..6be6680 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,7 @@
 #include "eal_filesystem.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
+#include "eal_private.h"
 
 /*
  * not all kernel version support fallocate on hugetlbfs, so fall back to
@@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	int ret = 0;
 	int fd;
 	size_t alloc_sz;
+	int flags;
+	void *new_addr;
 
 	/* takes out a read lock on segment or segment list */
 	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
@@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 
 mapped:
 	munmap(addr, alloc_sz);
+	flags = MAP_FIXED;
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+	new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
+	if (new_addr != addr) {
+		if (new_addr != NULL)
+			munmap(new_addr, alloc_sz);
+		/* we're leaving a hole in our virtual address space. if
+		 * somebody else maps this hole now, we could accidentally
+		 * override it in the future.
+		 */
+		RTE_LOG(CRIT, EAL, "Can't mmap holes in our virtual address space\n");
+	}
 resized:
 	if (internal_config.single_file_segments) {
 		resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
  2018-06-01 12:59   ` [dpdk-dev] [PATCH v3 " Dariusz Stojaczyk
@ 2018-06-01 12:59     ` Dariusz Stojaczyk
  2018-06-01 15:07       ` Burakov, Anatoly
  2018-06-01 15:06     ` [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Burakov, Anatoly
  1 sibling, 1 reply; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 12:59 UTC (permalink / raw)
  To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk, stable

This isn't documented in the manuals, but a failed
mmap(..., MAP_FIXED) may still unmap overlapping
regions. In such case, we need to remap these regions
back into our address space to ensure mem contiguity.
We do it unconditionally now on mmap failure just to
be safe.

Verified on Linux 4.9.0-4-amd64. I was getting
ENOMEM when trying to map hugetlbfs with no space
left, and the previous anonymous mapping was still
being removed.

Changes from v2:
 * added "git fixline" tags

Changes from v1:
 * checkpatch fixes
 * remapping is now done regardless of the mmap errno

Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Cc: anatoly.burakov@intel.com
Cc: stable@dpdk.org

Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 6be6680..81c94d5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -527,7 +527,10 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	if (va == MAP_FAILED) {
 		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
 			strerror(errno));
-		goto resized;
+		/* mmap failed, but the previous region might have been
+		 * unmapped anyway. try to remap it
+		 */
+		goto unmapped;
 	}
 	if (va != addr) {
 		RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
@@ -588,6 +591,7 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 
 mapped:
 	munmap(addr, alloc_sz);
+unmapped:
 	flags = MAP_FIXED;
 #ifdef RTE_ARCH_PPC_64
 	flags |= MAP_HUGETLB;
-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 12:51 [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
  2018-06-01 11:00 ` Burakov, Anatoly
  2018-06-01 12:51 ` [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
@ 2018-06-01 13:37 ` Dariusz Stojaczyk
  2018-06-01 13:39 ` [dpdk-dev] [PATCH v2 " Dariusz Stojaczyk
  3 siblings, 0 replies; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 13:37 UTC (permalink / raw)
  To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk

EAL reserves a huge area in virtual address space
to provide virtual address contiguity for e.g.
future memory extensions (memory hotplug). During
memory hotplug, if the hugepage mmap succeeds but
doesn't suffice EAL's requiriments, the EAL would
unmap this mapping straight away, leaving a hole in
its virtual memory area and making it available
to everyone. As EAL still thinks it owns the entire
region, it may try to mmap it later with MAP_FIXED,
possibly overriding a user's mapping that was made
in the meantime.

This patch ensures each hole is mapped back by EAL,
so that it won't be available to anyone else.

Changes from v1:
 * checkpatch fixes

Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8c11f98..2ab3d34 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,7 @@
 #include "eal_filesystem.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
+#include "eal_private.h"
 
 /*
  * not all kernel version support fallocate on hugetlbfs, so fall back to
@@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	int ret = 0;
 	int fd;
 	size_t alloc_sz;
+	int flags;
+	void *new_addr;
 
 	/* takes out a read lock on segment or segment list */
 	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
@@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 
 mapped:
 	munmap(addr, alloc_sz);
+	flags = MAP_FIXED;
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+	new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
+	if (new_addr != addr) {
+		if (new_addr != NULL)
+			munmap(new_addr, alloc_sz);
+		/* we're leaving a hole in our virtual address space. if
+		 * somebody else maps this hole now, we could accidentally
+		 * override it in the future.
+		 */
+		rte_panic("can't mmap holes in our virtual address space");
+	}
 resized:
 	if (internal_config.single_file_segments) {
 		resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [dpdk-dev] [PATCH v2 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 12:51 [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
                   ` (2 preceding siblings ...)
  2018-06-01 13:37 ` [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
@ 2018-06-01 13:39 ` Dariusz Stojaczyk
  2018-06-01 10:24   ` Stojaczyk, DariuszX
  2018-06-01 12:59   ` [dpdk-dev] [PATCH v3 " Dariusz Stojaczyk
  3 siblings, 2 replies; 16+ messages in thread
From: Dariusz Stojaczyk @ 2018-06-01 13:39 UTC (permalink / raw)
  To: dev, Anatoly Burakov; +Cc: Dariusz Stojaczyk

EAL reserves a huge area in virtual address space
to provide virtual address contiguity for e.g.
future memory extensions (memory hotplug). During
memory hotplug, if the hugepage mmap succeeds but
doesn't suffice EAL's requiriments, the EAL would
unmap this mapping straight away, leaving a hole in
its virtual memory area and making it available
to everyone. As EAL still thinks it owns the entire
region, it may try to mmap it later with MAP_FIXED,
possibly overriding a user's mapping that was made
in the meantime.

This patch ensures each hole is mapped back by EAL,
so that it won't be available to anyone else.

Changes from v1:
 * checkpatch fixes

Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8c11f98..2ab3d34 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,7 @@
 #include "eal_filesystem.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
+#include "eal_private.h"
 
 /*
  * not all kernel version support fallocate on hugetlbfs, so fall back to
@@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	int ret = 0;
 	int fd;
 	size_t alloc_sz;
+	int flags;
+	void *new_addr;
 
 	/* takes out a read lock on segment or segment list */
 	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
@@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 
 mapped:
 	munmap(addr, alloc_sz);
+	flags = MAP_FIXED;
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+	new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
+	if (new_addr != addr) {
+		if (new_addr != NULL)
+			munmap(new_addr, alloc_sz);
+		/* we're leaving a hole in our virtual address space. if
+		 * somebody else maps this hole now, we could accidentally
+		 * override it in the future.
+		 */
+		rte_panic("can't mmap holes in our virtual address space");
+	}
 resized:
 	if (internal_config.single_file_segments) {
 		resize_hugefile(fd, path, list_idx, seg_idx, map_offset,
-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 12:22   ` Stojaczyk, DariuszX
@ 2018-06-01 14:59     ` Burakov, Anatoly
  0 siblings, 0 replies; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 14:59 UTC (permalink / raw)
  To: Stojaczyk, DariuszX, dev

>> Page size shouldn't be zero, should be alloc_sz.
> 
> The 0 above is for the `flags` parameter. Page size is set to alloc_sz.
> 
> ```
> void *
> eal_get_virtual_area(void *requested_addr, size_t *size,
> 		size_t page_sz, int flags, int mmap_flags);
> ```
> 
> I believe it's okay. Correct me if I'm wrong.

Apologies, you're right. I misread the code.


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 12:59   ` [dpdk-dev] [PATCH v3 " Dariusz Stojaczyk
  2018-06-01 12:59     ` [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
@ 2018-06-01 15:06     ` Burakov, Anatoly
  2018-07-12 21:42       ` [dpdk-dev] [dpdk-stable] " Thomas Monjalon
  1 sibling, 1 reply; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 15:06 UTC (permalink / raw)
  To: Dariusz Stojaczyk, dev; +Cc: stable

On 01-Jun-18 1:59 PM, Dariusz Stojaczyk wrote:
> EAL reserves a huge area in virtual address space
> to provide virtual address contiguity for e.g.
> future memory extensions (memory hotplug). During
> memory hotplug, if the hugepage mmap succeeds but
> doesn't suffice EAL's requiriments, the EAL would
> unmap this mapping straight away, leaving a hole in
> its virtual memory area and making it available
> to everyone. As EAL still thinks it owns the entire
> region, it may try to mmap it later with MAP_FIXED,
> possibly overriding a user's mapping that was made
> in the meantime.
> 
> This patch ensures each hole is mapped back by EAL,
> so that it won't be available to anyone else.
> 
> Changes from v2:
>    * replaced rte_panic() with a CRIT log.
>    * added "git fixline" tags
> 
> Changes from v1:
>   * checkpatch fixes
> 
> Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> Cc: anatoly.burakov@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

Thomas, could you please fix the commit message on apply?

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap
  2018-06-01 12:59     ` [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
@ 2018-06-01 15:07       ` Burakov, Anatoly
  0 siblings, 0 replies; 16+ messages in thread
From: Burakov, Anatoly @ 2018-06-01 15:07 UTC (permalink / raw)
  To: Dariusz Stojaczyk, dev; +Cc: stable

On 01-Jun-18 1:59 PM, Dariusz Stojaczyk wrote:
> This isn't documented in the manuals, but a failed
> mmap(..., MAP_FIXED) may still unmap overlapping
> regions. In such case, we need to remap these regions
> back into our address space to ensure mem contiguity.
> We do it unconditionally now on mmap failure just to
> be safe.
> 
> Verified on Linux 4.9.0-4-amd64. I was getting
> ENOMEM when trying to map hugetlbfs with no space
> left, and the previous anonymous mapping was still
> being removed.
> 
> Changes from v2:
>   * added "git fixline" tags
> 
> Changes from v1:
>   * checkpatch fixes
>   * remapping is now done regardless of the mmap errno
> 
> Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> Cc: anatoly.burakov@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> ---

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

Version changes should not be part of commit message. Thomas, could you 
please fix this on apply?

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area
  2018-06-01 15:06     ` [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Burakov, Anatoly
@ 2018-07-12 21:42       ` Thomas Monjalon
  0 siblings, 0 replies; 16+ messages in thread
From: Thomas Monjalon @ 2018-07-12 21:42 UTC (permalink / raw)
  To: Dariusz Stojaczyk; +Cc: stable, Burakov, Anatoly, dev

01/06/2018 17:06, Burakov, Anatoly:
> On 01-Jun-18 1:59 PM, Dariusz Stojaczyk wrote:
> > EAL reserves a huge area in virtual address space
> > to provide virtual address contiguity for e.g.
> > future memory extensions (memory hotplug). During
> > memory hotplug, if the hugepage mmap succeeds but
> > doesn't suffice EAL's requiriments, the EAL would
> > unmap this mapping straight away, leaving a hole in
> > its virtual memory area and making it available
> > to everyone. As EAL still thinks it owns the entire
> > region, it may try to mmap it later with MAP_FIXED,
> > possibly overriding a user's mapping that was made
> > in the meantime.
> > 
> > This patch ensures each hole is mapped back by EAL,
> > so that it won't be available to anyone else.
> > 
> > Changes from v2:
> >    * replaced rte_panic() with a CRIT log.
> >    * added "git fixline" tags
> > 
> > Changes from v1:
> >   * checkpatch fixes
> > 
> > Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> > Cc: anatoly.burakov@intel.com
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
> > ---
> 
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> Thomas, could you please fix the commit message on apply?

Yes

Applied, thanks

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-07-12 21:42 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-01 12:51 [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
2018-06-01 11:00 ` Burakov, Anatoly
2018-06-01 12:22   ` Stojaczyk, DariuszX
2018-06-01 14:59     ` Burakov, Anatoly
2018-06-01 12:51 ` [dpdk-dev] [PATCH 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
2018-06-01 11:08   ` Burakov, Anatoly
2018-06-01 12:41     ` Stojaczyk, DariuszX
2018-06-01 13:37 ` [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Dariusz Stojaczyk
2018-06-01 13:39 ` [dpdk-dev] [PATCH v2 " Dariusz Stojaczyk
2018-06-01 10:24   ` Stojaczyk, DariuszX
2018-06-01 11:14     ` Burakov, Anatoly
2018-06-01 12:59   ` [dpdk-dev] [PATCH v3 " Dariusz Stojaczyk
2018-06-01 12:59     ` [dpdk-dev] [PATCH v3 2/2] memalloc: keep in mind a failed MAP_FIXED mmap may still perform an unmap Dariusz Stojaczyk
2018-06-01 15:07       ` Burakov, Anatoly
2018-06-01 15:06     ` [dpdk-dev] [PATCH v3 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area Burakov, Anatoly
2018-07-12 21:42       ` [dpdk-dev] [dpdk-stable] " Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).