* Hugepage migration
@ 2023-05-28 20:07 Baruch Even
2023-05-30 1:35 ` Stephen Hemminger
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Baruch Even @ 2023-05-28 20:07 UTC (permalink / raw)
To: dpdk-dev
[-- Attachment #1: Type: text/plain, Size: 1191 bytes --]
Hi,
We found an issue with newer kernels (5.13+) that are found on newer OSes
(Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that was
allocated for DPDK was migrated (moved into another physical page) when a
1G page was allocated.
From our reading of the kernel commits this started with commit
ae37c7ff79f1f030e28ec76c46ee032f8fd07607
mm: make alloc_contig_range handle in-use hugetlb pages
This caused what looked like memory corruptions to us and cases where the
rings were moved from their physical location and communication was no
longer possible.
I wanted to ask if anyone else hit this issue and what mitigations are
available?
We are currently looking at using a kernel driver to pin the pages but I
expect that this issue will affect others and that a more general approach
is needed.
Thanks,
Baruch
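
One way to confirm from user space that a page has moved is to sample its
physical address over time. The following is a minimal sketch, assuming a
Linux host and a process with CAP_SYS_ADMIN (without it, kernels since 4.2
report the frame number as zero). The helper names pagemap_entry_to_pfn()
and virt_to_phys() are hypothetical and are not taken from this thread.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Layout of a /proc/<pid>/pagemap entry: bit 63 = page present in RAM,
 * bits 0-54 = page frame number (PFN) when the page is present. */
#define PAGEMAP_PRESENT  (1ULL << 63)
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)

/* Decode one pagemap entry. Returns 0 if the page is not present
 * (or if the kernel hid the PFN from an unprivileged reader). */
static uint64_t pagemap_entry_to_pfn(uint64_t entry)
{
	if (!(entry & PAGEMAP_PRESENT))
		return 0;
	return entry & PAGEMAP_PFN_MASK;
}

/* Return the physical address currently backing vaddr, or 0 if unknown. */
static uint64_t virt_to_phys(const void *vaddr)
{
	uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
	uint64_t va = (uint64_t)(uintptr_t)vaddr;
	uint64_t entry = 0;
	/* pagemap holds one 8-byte entry per base (not huge) page. */
	off_t offset = (off_t)((va / page_size) * sizeof(entry));
	ssize_t n;
	uint64_t pfn;
	int fd;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return 0;
	n = pread(fd, &entry, sizeof(entry), offset);
	close(fd);
	if (n != (ssize_t)sizeof(entry))
		return 0;

	pfn = pagemap_entry_to_pfn(entry);
	if (pfn == 0)
		return 0;
	return pfn * page_size + va % page_size;
}
```

Calling virt_to_phys() on a ring's base address before and after the 1G
page is allocated, and comparing the two nonzero results, would show
whether the backing page was migrated.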
--
Baruch Even
Platform Technical Lead, WEKA
E baruch@weka.io  W www.weka.io
* Re: Hugepage migration
2023-05-28 20:07 Hugepage migration Baruch Even
@ 2023-05-30 1:35 ` Stephen Hemminger
2023-05-30 13:51 ` Baruch Even
2023-05-30 3:11 ` Stephen Hemminger
2023-05-30 8:04 ` Bruce Richardson
2 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2023-05-30 1:35 UTC (permalink / raw)
To: Baruch Even; +Cc: dpdk-dev
On Sun, 28 May 2023 23:07:40 +0300
Baruch Even <baruch@weka.io> wrote:
> Hi,
>
> We found an issue with newer kernels (5.13+) that are found on newer OSes
> (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that was
> allocated for DPDK was migrated (moved into another physical page) when a
> 1G page was allocated.
>
> From our reading of the kernel commits this started with commit
> ae37c7ff79f1f030e28ec76c46ee032f8fd07607
> mm: make alloc_contig_range handle in-use hugetlb pages
>
> This caused what looked like memory corruptions to us and cases where the
> rings were moved from their physical location and communication was no
> longer possible.
>
> I wanted to ask if anyone else hit this issue and what mitigations are
> available?
>
> We are currently looking at using a kernel driver to pin the pages but I
> expect that this issue will affect others and that a more general approach
> is needed.
>
> Thanks,
> Baruch
>
Fix might be as simple as asking kernel to lock the mmap().
diff --git a/lib/eal/linux/eal_hugepage_info.c b/lib/eal/linux/eal_hugepage_info.c
index 581d9dfc91eb..989c69387233 100644
--- a/lib/eal/linux/eal_hugepage_info.c
+++ b/lib/eal/linux/eal_hugepage_info.c
@@ -48,7 +48,8 @@ map_shared_memory(const char *filename, const size_t mem_size, int flags)
 		return NULL;
 	}
 	retval = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
-			MAP_SHARED, fd, 0);
+			MAP_SHARED_VALIDATE | MAP_LOCKED, fd, 0);
+
 	close(fd);
 	return retval == MAP_FAILED ? NULL : retval;
 }
* Re: Hugepage migration
2023-05-28 20:07 Hugepage migration Baruch Even
2023-05-30 1:35 ` Stephen Hemminger
@ 2023-05-30 3:11 ` Stephen Hemminger
2023-05-30 8:04 ` Bruce Richardson
2 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2023-05-30 3:11 UTC (permalink / raw)
To: Baruch Even; +Cc: dpdk-dev
On Sun, 28 May 2023 23:07:40 +0300
Baruch Even <baruch@weka.io> wrote:
> Hi,
>
> We found an issue with newer kernels (5.13+) that are found on newer OSes
> (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that was
> allocated for DPDK was migrated (moved into another physical page) when a
> 1G page was allocated.
>
> From our reading of the kernel commits this started with commit
> ae37c7ff79f1f030e28ec76c46ee032f8fd07607
> mm: make alloc_contig_range handle in-use hugetlb pages
>
> This caused what looked like memory corruptions to us and cases where the
> rings were moved from their physical location and communication was no
> longer possible.
>
> I wanted to ask if anyone else hit this issue and what mitigations are
> available?
>
> We are currently looking at using a kernel driver to pin the pages but I
> expect that this issue will affect others and that a more general approach
> is needed.
>
> Thanks,
> Baruch
Report this to upstream kernel regressions, they probably care about it.
A kernel driver hack is overkill, and it creates a maintenance and long-term technical debt problem.
* Re: Hugepage migration
2023-05-28 20:07 Hugepage migration Baruch Even
2023-05-30 1:35 ` Stephen Hemminger
2023-05-30 3:11 ` Stephen Hemminger
@ 2023-05-30 8:04 ` Bruce Richardson
2023-05-30 13:53 ` Baruch Even
2 siblings, 1 reply; 7+ messages in thread
From: Bruce Richardson @ 2023-05-30 8:04 UTC (permalink / raw)
To: Baruch Even; +Cc: dpdk-dev
On Sun, May 28, 2023 at 11:07:40PM +0300, Baruch Even wrote:
> Hi,
> We found an issue with newer kernels (5.13+) that are found on newer
> OSes (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that
> was allocated for DPDK was migrated (moved into another physical page)
> when a 1G page was allocated.
> From our reading of the kernel commits this started with commit
> ae37c7ff79f1f030e28ec76c46ee032f8fd07607
> mm: make alloc_contig_range handle in-use hugetlb pages
> This caused what looked like memory corruptions to us and cases where
> the rings were moved from their physical location and communication was
> no longer possible.
> I wanted to ask if anyone else hit this issue and what mitigations are
> available?
> We are currently looking at using a kernel driver to pin the pages but
> I expect that this issue will affect others and that a more general
> approach is needed.
> Thanks,
> Baruch
> --
Hi,
what kernel driver was being used for the device I/O part? Was it a UIO
based driver or "vfio-pci"? When using vfio-pci and configuring IOMMU
mappings, the pages mapped should be pinned by the kernel, I would have
thought, since the kernel knows they are being used by devices.
/Bruce
* Re: Hugepage migration
2023-05-30 1:35 ` Stephen Hemminger
@ 2023-05-30 13:51 ` Baruch Even
0 siblings, 0 replies; 7+ messages in thread
From: Baruch Even @ 2023-05-30 13:51 UTC (permalink / raw)
To: stephen; +Cc: dpdk-dev
[-- Attachment #1: Type: text/plain, Size: 2321 bytes --]
I have tested the MAP_LOCKED, it doesn't help in this case. I do intend to
report to the kernel but was wondering if others have hit upon this first.
On Tue, May 30, 2023 at 4:35 AM Stephen Hemminger <
stephen@networkplumber.org> wrote:
> On Sun, 28 May 2023 23:07:40 +0300
> Baruch Even <baruch@weka.io> wrote:
>
> > Hi,
> >
> > We found an issue with newer kernels (5.13+) that are found on newer OSes
> > (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that was
> > allocated for DPDK was migrated (moved into another physical page) when a
> > 1G page was allocated.
> >
> > From our reading of the kernel commits this started with commit
> > ae37c7ff79f1f030e28ec76c46ee032f8fd07607
> > mm: make alloc_contig_range handle in-use hugetlb pages
> >
> > This caused what looked like memory corruptions to us and cases where the
> > rings were moved from their physical location and communication was no
> > longer possible.
> >
> > I wanted to ask if anyone else hit this issue and what mitigations are
> > available?
> >
> > We are currently looking at using a kernel driver to pin the pages but I
> > expect that this issue will affect others and that a more general approach
> > is needed.
> >
> > Thanks,
> > Baruch
> >
>
> Fix might be as simple as asking kernel to lock the mmap().
>
> diff --git a/lib/eal/linux/eal_hugepage_info.c b/lib/eal/linux/eal_hugepage_info.c
> index 581d9dfc91eb..989c69387233 100644
> --- a/lib/eal/linux/eal_hugepage_info.c
> +++ b/lib/eal/linux/eal_hugepage_info.c
> @@ -48,7 +48,8 @@ map_shared_memory(const char *filename, const size_t mem_size, int flags)
> return NULL;
> }
> retval = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
> - MAP_SHARED, fd, 0);
> + MAP_SHARED_VALIDATE | MAP_LOCKED, fd, 0);
> +
> close(fd);
> return retval == MAP_FAILED ? NULL : retval;
> }
>
* Re: Hugepage migration
2023-05-30 8:04 ` Bruce Richardson
@ 2023-05-30 13:53 ` Baruch Even
2023-05-30 15:33 ` Stephen Hemminger
0 siblings, 1 reply; 7+ messages in thread
From: Baruch Even @ 2023-05-30 13:53 UTC (permalink / raw)
To: Bruce Richardson; +Cc: dpdk-dev
[-- Attachment #1: Type: text/plain, Size: 1876 bytes --]
On Tue, May 30, 2023 at 11:04 AM Bruce Richardson <
bruce.richardson@intel.com> wrote:
> On Sun, May 28, 2023 at 11:07:40PM +0300, Baruch Even wrote:
> > Hi,
> > We found an issue with newer kernels (5.13+) that are found on newer
> > OSes (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that
> > was allocated for DPDK was migrated (moved into another physical page)
> > when a 1G page was allocated.
> > From our reading of the kernel commits this started with commit
> > ae37c7ff79f1f030e28ec76c46ee032f8fd07607
> > mm: make alloc_contig_range handle in-use hugetlb pages
> > This caused what looked like memory corruptions to us and cases where
> > the rings were moved from their physical location and communication was
> > no longer possible.
> > I wanted to ask if anyone else hit this issue and what mitigations are
> > available?
> > We are currently looking at using a kernel driver to pin the pages but
> > I expect that this issue will affect others and that a more general
> > approach is needed.
> > Thanks,
> > Baruch
> > --
>
> Hi,
>
> what kernel driver was being used for the device I/O part? Was it a UIO
> based driver or "vfio-pci"? When using vfio-pci and configuring IOMMU
> mappings, the pages mapped should be pinned by the kernel, I would have
> thought, since the kernel knows they are being used by devices.
>
> /Bruce
>
This was using igb_uio on an AWS instance with their ena driver.
Baruch
* Re: Hugepage migration
2023-05-30 13:53 ` Baruch Even
@ 2023-05-30 15:33 ` Stephen Hemminger
0 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2023-05-30 15:33 UTC (permalink / raw)
To: Baruch Even; +Cc: Bruce Richardson, dpdk-dev
On Tue, 30 May 2023 16:53:14 +0300
Baruch Even <baruch@weka.io> wrote:
> > what kernel driver was being used for the device I/O part? Was it a UIO
> > based driver or "vfio-pci"? When using vfio-pci and configuring IOMMU
> > mappings, the pages mapped should be pinned by the kernel, I would have
> > thought, since the kernel knows they are being used by devices.
> >
> > /Bruce
> >
>
> This was using igb_uio on an AWS instance with their ena driver.
>
> Baruch
Try VFIO. igb_uio is effectively an out-of-tree driver, and the kernel
maintainers are unlikely to give you much support for it.
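
Before switching, it can help to confirm which kernel driver a device is
currently bound to. A minimal sketch, assuming the standard Linux sysfs
layout where /sys/bus/pci/devices/<BDF>/driver is a symlink to the bound
driver; the function name pci_bound_driver() is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the name of the kernel driver bound to the PCI device `bdf`
 * (for example "0000:00:05.0") into out. Returns 0 on success, or -1 if
 * the device does not exist, has no driver, or out is too small. */
static int pci_bound_driver(const char *bdf, char *out, size_t out_len)
{
	char link[512];
	char target[512];
	const char *name;
	ssize_t len;
	int n;

	n = snprintf(link, sizeof(link),
		     "/sys/bus/pci/devices/%s/driver", bdf);
	if (n < 0 || (size_t)n >= sizeof(link))
		return -1;

	/* readlink() does not NUL-terminate, so reserve one byte. */
	len = readlink(link, target, sizeof(target) - 1);
	if (len < 0)
		return -1;
	target[len] = '\0';

	/* The driver name is the last path component of the link target. */
	name = strrchr(target, '/');
	name = name ? name + 1 : target;
	if (strlen(name) >= out_len)
		return -1;
	strcpy(out, name);
	return 0;
}
```

A result of "igb_uio" or "vfio-pci" shows which path the device is on; a
return value of -1 means no driver is bound or the address is wrong.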