Since its introduction in 2018, the SIGBUS handler was never registered,
and all related functions were unused.

A SIGBUS can be received by the application when accessing hugepages
even if mmap() was successful. This happens especially when running
inside containers when there are not enough hugepages. In this case, we
need to recover. A similar scheme can be found in eal_memory.c.

Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/eal/linux/eal_memalloc.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index 0ec8542283..337f2bc739 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -107,7 +107,7 @@ static struct rte_memseg_list local_memsegs[RTE_MAX_MEMSEG_LISTS];

 static sigjmp_buf huge_jmpenv;

-static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
+static void huge_sigbus_handler(int signo __rte_unused)
 {
 	siglongjmp(huge_jmpenv, 1);
 }
@@ -116,7 +116,7 @@ static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
  * non-static local variable in the stack frame calling sigsetjmp might be
  * clobbered by a call to longjmp.
  */
-static int __rte_unused huge_wrap_sigsetjmp(void)
+static int huge_wrap_sigsetjmp(void)
 {
 	return sigsetjmp(huge_jmpenv, 1);
 }
@@ -124,7 +124,7 @@ static int __rte_unused huge_wrap_sigsetjmp(void)
 static struct sigaction huge_action_old;
 static int huge_need_recover;

-static void __rte_unused
+static void
 huge_register_sigbus(void)
 {
 	sigset_t mask;
@@ -139,7 +139,7 @@ huge_register_sigbus(void)
 	huge_need_recover = !sigaction(SIGBUS, &action, &huge_action_old);
 }

-static void __rte_unused
+static void
 huge_recover_sigbus(void)
 {
 	if (huge_need_recover) {
@@ -576,6 +576,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 		mmap_flags = MAP_SHARED | MAP_POPULATE | MAP_FIXED;
 	}

+	huge_register_sigbus();
+
 	/*
 	 * map the segment, and populate page tables, the kernel fills
 	 * this segment with zeros if it's a new page.
@@ -651,6 +653,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 				__func__);
 #endif

+	huge_recover_sigbus();
+
 	ms->addr = addr;
 	ms->hugepage_sz = alloc_sz;
 	ms->len = alloc_sz;
@@ -664,6 +668,7 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 mapped:
 	munmap(addr, alloc_sz);
 unmapped:
+	huge_recover_sigbus();
 	flags = EAL_RESERVE_FORCE_ADDRESS;
 	new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
 	if (new_addr != addr) {
--
2.30.2
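For readers who want to try the recovery scheme outside of EAL, here is a
minimal standalone sketch. It is not the DPDK code: probe_page() and
map_tmp_page() are hypothetical names, sigsetjmp() is called directly in the
probing function rather than through a wrapper, and a truncated temporary
file stands in for a hugepage that the kernel cannot back (assuming the Linux
behaviour where touching a mapped page beyond end-of-file raises SIGBUS).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf jmpenv;

static void sigbus_handler(int signo)
{
	(void)signo;
	siglongjmp(jmpenv, 1);
}

/*
 * Hypothetical helper: touch the first byte at addr with a temporary SIGBUS
 * handler installed. Returns 0 if the page is usable, -1 if the handler
 * could not be installed or if touching the page raised SIGBUS.
 */
static int probe_page(void *addr)
{
	struct sigaction action, action_old;
	volatile int ret = 0;

	assert(addr != NULL && addr != MAP_FAILED);

	memset(&action, 0, sizeof(action));
	sigemptyset(&action.sa_mask);
	action.sa_handler = sigbus_handler;
	if (sigaction(SIGBUS, &action, &action_old) != 0)
		return -1;

	if (sigsetjmp(jmpenv, 1) == 0)
		*(volatile char *)addr = *(volatile char *)addr;
	else
		ret = -1; /* reached through siglongjmp() from the handler */

	/* restore the previous handler on both paths */
	sigaction(SIGBUS, &action_old, NULL);
	return ret;
}

/*
 * Hypothetical demo helper: map one page of an unlinked temporary file.
 * If drop_backing is non-zero, the file is truncated to 0 after mmap(),
 * so the mapping exists but touching it is expected to raise SIGBUS.
 */
static void *map_tmp_page(int drop_backing)
{
	char path[] = "/tmp/sigbus_demo_XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	int fd = mkstemp(path);
	void *addr;

	if (fd < 0)
		return MAP_FAILED;
	unlink(path);
	if (ftruncate(fd, page) != 0) {
		close(fd);
		return MAP_FAILED;
	}
	addr = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (addr != MAP_FAILED && drop_backing && ftruncate(fd, 0) != 0) {
		munmap(addr, page);
		addr = MAP_FAILED;
	}
	close(fd);
	return addr;
}
```

With these helpers, probe_page(map_tmp_page(0)) is expected to return 0, and
probe_page(map_tmp_page(1)) to return -1 instead of killing the process, even
though mmap() succeeded in both cases.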
Hi Olivier,
On 10/29/21 11:53, Olivier Matz wrote:
> Since its introduction in 2018, the SIGBUS handler was never registered,
> and all related functions were unused.
>
> A SIGBUS can be received by the application when accessing hugepages
> even if mmap() was successful. This happens especially when running
> inside containers when there are not enough hugepages. In this case, we
> need to recover. A similar scheme can be found in eal_memory.c.
>
> Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> Cc: stable@dpdk.org
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
> lib/eal/linux/eal_memalloc.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
> index 0ec8542283..337f2bc739 100644
> --- a/lib/eal/linux/eal_memalloc.c
> +++ b/lib/eal/linux/eal_memalloc.c
> @@ -107,7 +107,7 @@ static struct rte_memseg_list local_memsegs[RTE_MAX_MEMSEG_LISTS];
>
>  static sigjmp_buf huge_jmpenv;
>
> -static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
> +static void huge_sigbus_handler(int signo __rte_unused)
>  {
>  	siglongjmp(huge_jmpenv, 1);
>  }
> @@ -116,7 +116,7 @@ static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
>   * non-static local variable in the stack frame calling sigsetjmp might be
>   * clobbered by a call to longjmp.
>   */
> -static int __rte_unused huge_wrap_sigsetjmp(void)
> +static int huge_wrap_sigsetjmp(void)
>  {
>  	return sigsetjmp(huge_jmpenv, 1);
>  }
> @@ -124,7 +124,7 @@ static int __rte_unused huge_wrap_sigsetjmp(void)
>  static struct sigaction huge_action_old;
>  static int huge_need_recover;
>
> -static void __rte_unused
> +static void
>  huge_register_sigbus(void)
>  {
>  	sigset_t mask;
> @@ -139,7 +139,7 @@ huge_register_sigbus(void)
>  	huge_need_recover = !sigaction(SIGBUS, &action, &huge_action_old);
>  }
>
> -static void __rte_unused
> +static void
>  huge_recover_sigbus(void)
>  {
>  	if (huge_need_recover) {
> @@ -576,6 +576,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>  		mmap_flags = MAP_SHARED | MAP_POPULATE | MAP_FIXED;
>  	}
>
> +	huge_register_sigbus();
> +
>  	/*
>  	 * map the segment, and populate page tables, the kernel fills
>  	 * this segment with zeros if it's a new page.
> @@ -651,6 +653,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>  				__func__);
>  #endif
>
> +	huge_recover_sigbus();
> +
>  	ms->addr = addr;
>  	ms->hugepage_sz = alloc_sz;
>  	ms->len = alloc_sz;
> @@ -664,6 +668,7 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>  mapped:
>  	munmap(addr, alloc_sz);
>  unmapped:
> +	huge_recover_sigbus();
>  	flags = EAL_RESERVE_FORCE_ADDRESS;
>  	new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags);
>  	if (new_addr != addr) {
>
I had almost the same series ready, except that I installed/removed the
handler before/after the loop calling alloc_seg(). I don't think this is
necessary though, so I'm fine with your patch, which is simpler:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
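To make the other placement concrete, here is a minimal standalone sketch of
a variant where the handler is installed once before a loop and removed once
after it. None of it is EAL code: touch_page(), count_usable_pages() and
map_tmp_page() are hypothetical names, and a truncated temporary file stands
in for a hugepage that cannot be backed (assuming Linux raises SIGBUS when a
mapped page beyond end-of-file is touched).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf jmpenv;
static struct sigaction action_old;
static int need_recover;

static void sigbus_handler(int signo)
{
	(void)signo;
	siglongjmp(jmpenv, 1);
}

/* Install the SIGBUS handler and remember whether it must be undone. */
static void register_sigbus(void)
{
	struct sigaction action;

	memset(&action, 0, sizeof(action));
	sigemptyset(&action.sa_mask);
	action.sa_handler = sigbus_handler;
	need_recover = !sigaction(SIGBUS, &action, &action_old);
}

/* Restore the previous SIGBUS handler if one was replaced. */
static void recover_sigbus(void)
{
	if (need_recover) {
		sigaction(SIGBUS, &action_old, NULL);
		need_recover = 0;
	}
}

/* Touch one page; the caller must have installed the handler. */
static int touch_page(void *addr)
{
	if (sigsetjmp(jmpenv, 1) != 0)
		return -1; /* reached through siglongjmp() */
	*(volatile char *)addr = *(volatile char *)addr;
	return 0;
}

/* Handler installed once around the whole loop, not once per page. */
static int count_usable_pages(void **pages, int n)
{
	int usable = 0;
	int i;

	assert(pages != NULL);
	register_sigbus();
	for (i = 0; i < n; i++)
		if (touch_page(pages[i]) == 0)
			usable++;
	recover_sigbus();
	return usable;
}

/* Hypothetical demo helper: one page of an unlinked temporary file,
 * optionally truncated after mmap() so that touching it faults. */
static void *map_tmp_page(int drop_backing)
{
	char path[] = "/tmp/sigbus_loop_XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	int fd = mkstemp(path);
	void *addr;

	if (fd < 0)
		return MAP_FAILED;
	unlink(path);
	if (ftruncate(fd, page) != 0) {
		close(fd);
		return MAP_FAILED;
	}
	addr = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (addr != MAP_FAILED && drop_backing && ftruncate(fd, 0) != 0) {
		munmap(addr, page);
		addr = MAP_FAILED;
	}
	close(fd);
	return addr;
}
```

With three pages of which the middle one has lost its backing,
count_usable_pages() is expected to return 2. The trade-off of this placement
is that the handler stays installed for the whole loop, so it covers more
code than the single touch it is meant to protect.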
On Fri, Oct 29, 2021 at 11:53 AM Olivier Matz <olivier.matz@6wind.com> wrote:
>
> Since its introduction in 2018, the SIGBUS handler was never registered,
> and all related functions were unused.
>
> A SIGBUS can be received by the application when accessing hugepages
> even if mmap() was successful. This happens especially when running
> inside containers when there are not enough hugepages. In this case, we
> need to recover. A similar scheme can be found in eal_memory.c.
>
> Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> Cc: stable@dpdk.org
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This patch lgtm.
The key point here is that in the "container context" (well, cgroups)
mmap succeeds regardless of availability of hugepages.
I would emphasize this in the title.
What do you think of:
mem: fix dynamic hugepage mapping in container
--
David Marchand
On Wed, Nov 03, 2021 at 09:03:19PM +0100, David Marchand wrote:
> On Fri, Oct 29, 2021 at 11:53 AM Olivier Matz <olivier.matz@6wind.com> wrote:
> >
> > Since its introduction in 2018, the SIGBUS handler was never registered,
> > and all related functions were unused.
> >
> > A SIGBUS can be received by the application when accessing hugepages
> > even if mmap() was successful. This happens especially when running
> > inside containers when there are not enough hugepages. In this case, we
> > need to recover. A similar scheme can be found in eal_memory.c.
> >
> > Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>
> This patch lgtm.
>
> The key point here is that in the "container context" (well, cgroups)
> mmap succeeds regardless of availability of hugepages.
> I would emphasize this in the title.
>
> What do you think of:
> mem: fix dynamic hugepage mapping in container
Yes, it's a better title.
On Fri, Oct 29, 2021 at 11:53 AM Olivier Matz <olivier.matz@6wind.com> wrote:
>
> Since its introduction in 2018, the SIGBUS handler was never registered,
> and all related functions were unused.
>
> A SIGBUS can be received by the application when accessing hugepages
> even if mmap() was successful. This happens especially when running
> inside containers when there are not enough hugepages. In this case, we
> need to recover. A similar scheme can be found in eal_memory.c.
>
> Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
> Cc: stable@dpdk.org
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Applied, thanks.
--
David Marchand