* [dpdk-dev] DPDK hugepage memory fragmentation
From: Kamaraj P @ 2020-07-10 9:22 UTC
To: dev; +Cc: Burakov, Anatoly
Hello All,
We are trying to run a DPDK-based application in container mode.
When we do multiple start/stop cycles of our container application, the DPDK
initialization seems to be failing.
This is because the hugepage memory is fragmented and DPDK is not able to
find a contiguous allocation of memory to initialize the buffer during
dpdk init.
As part of the cleanup of the container, we do call rte_eal_cleanup() to
clean up the memory w.r.t. our application. However, after several
iterations we still see the memory allocation failure due to the
fragmentation issue.
We also tried passing "--huge-unlink" as an argument when we called
rte_eal_init(), and it did not help.
Could you please suggest whether there is an option or any existing patches
available to clean up the memory and avoid fragmentation issues in the
future?
Please advise.
Thanks.
Kamaraj
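For reference, a minimal C sketch of the init/cleanup sequence described
above, with "--huge-unlink" passed through the EAL argument vector; the
core list and the placement of the flag are illustrative assumptions, not
taken from the reporter's actual command line:

#include <stdio.h>
#include <rte_eal.h>

int main(int argc, char **argv)
{
	/* Illustrative EAL arguments only; the application's real command
	 * line is not shown in this thread. */
	char *eal_argv[] = { argv[0], "-l", "0-1", "--huge-unlink" };
	int eal_argc = sizeof(eal_argv) / sizeof(eal_argv[0]);

	(void)argc;

	if (rte_eal_init(eal_argc, eal_argv) < 0) {
		fprintf(stderr, "EAL initialization failed\n");
		return -1;
	}

	/* ... port and mempool setup, packet processing ... */

	/* Release EAL-held hugepage memory and other resources before the
	 * container is stopped. */
	rte_eal_cleanup();
	return 0;
}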
* Re: [dpdk-dev] DPDK hugepage memory fragmentation
From: Bruce Richardson @ 2020-07-10 10:28 UTC
To: Kamaraj P; +Cc: dev, Burakov, Anatoly
On Fri, Jul 10, 2020 at 02:52:16PM +0530, Kamaraj P wrote:
> Hello All,
>
> We are running to run DPDK based application in a container mode,
> When we do multiple start/stop of our container application, the DPDK
> initialization seems to be failing.
> This is because the hugepage memory fragementated and is not able to find
> the continuous allocation of the memory to initialize the buffer in the
> dpdk init.
>
> As part of the cleanup of the container, we do call rte_eal_cleanup() to
> cleanup the memory w.r.t our application. However after iterations we still
> see the memory allocation failure due to the fragmentation issue.
>
> We also tried to set the "--huge-unlink" as an argument before when we
> called the rte_eal_init() and it did not help.
>
> Could you please suggest if there is an option or any existing patches
> available to clean up the memory to avoid fragmentation issues in the
> future.
>
> Please advise.
>
What version of DPDK are you using, and what kernel driver for NIC
interfacing are you using?
DPDK versions since 18.05 should be more forgiving of fragmented memory,
especially if using the vfio-pci kernel driver.
Regards,
/Bruce
* Re: [dpdk-dev] DPDK hugepage memory fragmentation
From: Burakov, Anatoly @ 2020-07-10 15:44 UTC
To: Bruce Richardson, Kamaraj P; +Cc: dev
On 10-Jul-20 11:28 AM, Bruce Richardson wrote:
> On Fri, Jul 10, 2020 at 02:52:16PM +0530, Kamaraj P wrote:
>> Hello All,
>>
>> We are running to run DPDK based application in a container mode,
>> When we do multiple start/stop of our container application, the DPDK
>> initialization seems to be failing.
>> This is because the hugepage memory fragementated and is not able to find
>> the continuous allocation of the memory to initialize the buffer in the
>> dpdk init.
>>
>> As part of the cleanup of the container, we do call rte_eal_cleanup() to
>> cleanup the memory w.r.t our application. However after iterations we still
>> see the memory allocation failure due to the fragmentation issue.
>>
>> We also tried to set the "--huge-unlink" as an argument before when we
>> called the rte_eal_init() and it did not help.
>>
>> Could you please suggest if there is an option or any existing patches
>> available to clean up the memory to avoid fragmentation issues in the
>> future.
>>
>> Please advise.
>>
> What version of DPDK are you using, and what kernel driver for NIC
> interfacing are you using?
> DPDK versions since 18.05 should be more forgiving of fragmented memory,
> especially if using the vfio-pci kernel driver.
>
This sounds odd, to be honest.
Unless you're allocating huge chunks of IOVA-contiguous memory,
fragmentation shouldn't be an issue. How did you determine that this was
in fact due to fragmentation?
> Regards,
> /Bruce
>
--
Thanks,
Anatoly
* Re: [dpdk-dev] DPDK hugepage memory fragmentation
From: Kamaraj P @ 2020-07-11 7:51 UTC
To: Burakov, Anatoly; +Cc: Bruce Richardson, dev
Hello Anatoly/Bruce,
We are using DPDK version 18.11 and the igb_uio driver.
The way we observe the issue is that, after multiple iterations of
start/stop of the container application (which runs DPDK), we were not
able to allocate the memory for a port during init. We thought that it
fails because it could not get a contiguous allocation.
Is there an API where I can check whether the memory is fragmented before
we invoke an allocation?
Or do we have any mechanism to defragment the memory once we exit from the
application?
Please advise.
Thanks,
Kamaraj
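One possible way to gauge fragmentation before attempting an allocation
is to compare a heap's total free size against its largest free element
via rte_malloc_get_socket_stats(). A hedged sketch, assuming socket 0 and
an arbitrary illustrative threshold:

#include <stdio.h>
#include <stdbool.h>
#include <rte_malloc.h>

/* Rough fragmentation check: if plenty of memory is free but the largest
 * free element is much smaller than the total, the heap is fragmented. */
static bool heap_looks_fragmented(int socket_id)
{
	struct rte_malloc_socket_stats stats;

	if (rte_malloc_get_socket_stats(socket_id, &stats) < 0)
		return false; /* no heap on this socket */

	printf("socket %d: free=%zu largest_free=%zu\n",
	       socket_id, stats.heap_freesz_bytes, stats.greatest_free_size);

	/* Arbitrary illustrative threshold: the largest free chunk is less
	 * than half of the total free space. */
	return stats.greatest_free_size < stats.heap_freesz_bytes / 2;
}

For example, heap_looks_fragmented(0) could be called right before the
failing allocation to log the state of the socket 0 heap.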
On Fri, Jul 10, 2020 at 9:14 PM Burakov, Anatoly <anatoly.burakov@intel.com>
wrote:
> On 10-Jul-20 11:28 AM, Bruce Richardson wrote:
> > On Fri, Jul 10, 2020 at 02:52:16PM +0530, Kamaraj P wrote:
> >> Hello All,
> >>
> >> We are running to run DPDK based application in a container mode,
> >> When we do multiple start/stop of our container application, the DPDK
> >> initialization seems to be failing.
> >> This is because the hugepage memory fragementated and is not able to
> find
> >> the continuous allocation of the memory to initialize the buffer in the
> >> dpdk init.
> >>
> >> As part of the cleanup of the container, we do call rte_eal_cleanup() to
> >> cleanup the memory w.r.t our application. However after iterations we
> still
> >> see the memory allocation failure due to the fragmentation issue.
> >>
> >> We also tried to set the "--huge-unlink" as an argument before when we
> >> called the rte_eal_init() and it did not help.
> >>
> >> Could you please suggest if there is an option or any existing patches
> >> available to clean up the memory to avoid fragmentation issues in the
> >> future.
> >>
> >> Please advise.
> >>
> > What version of DPDK are you using, and what kernel driver for NIC
> > interfacing are you using?
> > DPDK versions since 18.05 should be more forgiving of fragmented memory,
> > especially if using the vfio-pci kernel driver.
> >
>
> This sounds odd, to be honest.
>
> Unless you're allocating huge chunks of IOVA-contiguous memory,
> fragmentation shouldn't be an issue. How did you determine that this was
> in fact due to fragmentation?
>
> > Regards,
> > /Bruce
> >
>
>
> --
> Thanks,
> Anatoly
>
* Re: [dpdk-dev] DPDK hugepage memory fragmentation
From: Burakov, Anatoly @ 2020-07-13 9:27 UTC
To: Kamaraj P; +Cc: Bruce Richardson, dev
On 11-Jul-20 8:51 AM, Kamaraj P wrote:
> Hello Anatoly/Bruce,
>
> We are using the 18_11 version of DPDK and we are using igb_uio.
> The way we observe an issue here is that, after we tried multiple
> iterations of start/stop of container application(which has DPDK),
> we were not able to allocate the memory for port during the init.
> We thought that it could be an issue of not getting continuous
> allocation hence it fails.
>
> Is there an API where I can check if the memory is fragmented before we
> invoke an allocation ?
> Or do we have any such mechanism to defragment the memory allocation
> once we exist from the application ?
> Please advise.
>
This is unlikely to be due to fragmentation, because the only ways for
18.11 to be affected by memory fragmentation are 1) if you're using legacy
mem mode, or 2) if you're using IOVA as PA mode and you need huge amounts
of contiguous memory (you are using igb_uio, so you would be in IOVA as PA
mode).
NICs very rarely, if ever, allocate more than a 2M page's worth of
contiguous memory, because their descriptor rings aren't that big, and
they'll usually get all the IOVA-contiguous space they need even in the
face of heavily fragmented memory. Similarly, while 18.11 mempools will
request IOVA-contiguous memory first, they have a fallback to using
non-contiguous memory, and thus they too work just fine in the face of
high memory fragmentation.
The nature of the 18.11 memory subsystem is such that IOVA layout is
decoupled from VA layout, so fragmentation does not affect DPDK as much
as it has for previous versions. The first thing i'd suggest is using
VFIO as opposed to igb_uio, as it's safer to use in a container
environment, and it's less susceptible to memory fragmentation issues
because it can remap memory to appear IOVA-contiguous.
Could you please provide detailed logs of the init process? You can add
'--log-level=eal,8' to the EAL command-line to enable debug logging in
the EAL.
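For reference, a hedged C sketch of raising EAL logging to debug level
from code; the '--log-level=eal,8' command-line option remains the safer
choice for capturing messages emitted during rte_eal_init() itself, since
it takes effect while EAL is still initializing:

#include <rte_log.h>

/* Sketch only: raise the EAL log type to debug level at runtime. */
static void enable_eal_debug_logging(void)
{
	rte_log_set_level(RTE_LOGTYPE_EAL, RTE_LOG_DEBUG);
}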
> Thanks,
> Kamaraj
>
>
>
> On Fri, Jul 10, 2020 at 9:14 PM Burakov, Anatoly
> <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>> wrote:
>
> On 10-Jul-20 11:28 AM, Bruce Richardson wrote:
> > On Fri, Jul 10, 2020 at 02:52:16PM +0530, Kamaraj P wrote:
> >> Hello All,
> >>
> >> We are running to run DPDK based application in a container mode,
> >> When we do multiple start/stop of our container application, the
> DPDK
> >> initialization seems to be failing.
> >> This is because the hugepage memory fragementated and is not
> able to find
> >> the continuous allocation of the memory to initialize the buffer
> in the
> >> dpdk init.
> >>
> >> As part of the cleanup of the container, we do call
> rte_eal_cleanup() to
> >> cleanup the memory w.r.t our application. However after
> iterations we still
> >> see the memory allocation failure due to the fragmentation issue.
> >>
> >> We also tried to set the "--huge-unlink" as an argument before
> when we
> >> called the rte_eal_init() and it did not help.
> >>
> >> Could you please suggest if there is an option or any existing
> patches
> >> available to clean up the memory to avoid fragmentation issues
> in the
> >> future.
> >>
> >> Please advise.
> >>
> > What version of DPDK are you using, and what kernel driver for NIC
> > interfacing are you using?
> > DPDK versions since 18.05 should be more forgiving of fragmented
> memory,
> > especially if using the vfio-pci kernel driver.
> >
>
> This sounds odd, to be honest.
>
> Unless you're allocating huge chunks of IOVA-contiguous memory,
> fragmentation shouldn't be an issue. How did you determine that this
> was
> in fact due to fragmentation?
>
> > Regards,
> > /Bruce
> >
>
>
> --
> Thanks,
> Anatoly
>
--
Thanks,
Anatoly
* Re: [dpdk-dev] DPDK hugepage memory fragmentation
From: Kamaraj P @ 2020-07-27 15:30 UTC
To: Burakov, Anatoly; +Cc: Bruce Richardson, dev, mmahmoud
Hi Anatoly,
Since we do not have driver support for SR-IOV with VFIO, we are using
igb_uio.
Basically, our application is crashing due to a buffer allocation failure.
I believe it did not get a contiguous memory location and therefore fails
to allocate the memory.
Is there any API I can use to dump the memory layout before our
application dies?
Please let me know.
Thanks,
Kamaraj
On Mon, Jul 13, 2020 at 2:57 PM Burakov, Anatoly <anatoly.burakov@intel.com>
wrote:
> On 11-Jul-20 8:51 AM, Kamaraj P wrote:
> > Hello Anatoly/Bruce,
> >
> > We are using the 18_11 version of DPDK and we are using igb_uio.
> > The way we observe an issue here is that, after we tried multiple
> > iterations of start/stop of container application(which has DPDK),
> > we were not able to allocate the memory for port during the init.
> > We thought that it could be an issue of not getting continuous
> > allocation hence it fails.
> >
> > Is there an API where I can check if the memory is fragmented before we
> > invoke an allocation ?
> > Or do we have any such mechanism to defragment the memory allocation
> > once we exist from the application ?
> > Please advise.
> >
>
> This is unlikely due to fragmentation, because the only way for 18.11 to
> be affected my memory fragmentation is 1) if you're using legacy mem
> mode, or 2) you're using IOVA as PA mode and you need huge amounts of
> contiguous memory. (you are using igb_uio, so you would be in IOVA as PA
> mode)
>
> NICs very rarely, if ever, allocate more than a 2M-page worth of
> contiguous memory, because their descriptor rings aren't that big, and
> they'll usually get all the IOVA-contiguous space they need even in the
> face of heavily fragmented memory. Similarly, while 18.11 mempools will
> request IOVA-contiguous memory first, they have a fallback to using
> non-contiguous memory and thus too work just fine in the face of high
> memory fragmentation.
>
> The nature of the 18.11 memory subsystem is such that IOVA layout is
> decoupled from VA layout, so fragmentation does not affect DPDK as much
> as it has for previous versions. The first thing i'd suggest is using
> VFIO as opposed to igb_uio, as it's safer to use in a container
> environment, and it's less susceptible to memory fragmentation issues
> because it can remap memory to appear IOVA-contiguous.
>
> Could you please provide detailed logs of the init process? You can add
> '--log-level=eal,8' to the EAL command-line to enable debug logging in
> the EAL.
>
> > Thanks,
> > Kamaraj
> >
> >
> >
> > On Fri, Jul 10, 2020 at 9:14 PM Burakov, Anatoly
> > <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>> wrote:
> >
> > On 10-Jul-20 11:28 AM, Bruce Richardson wrote:
> > > On Fri, Jul 10, 2020 at 02:52:16PM +0530, Kamaraj P wrote:
> > >> Hello All,
> > >>
> > >> We are running to run DPDK based application in a container mode,
> > >> When we do multiple start/stop of our container application, the
> > DPDK
> > >> initialization seems to be failing.
> > >> This is because the hugepage memory fragementated and is not
> > able to find
> > >> the continuous allocation of the memory to initialize the buffer
> > in the
> > >> dpdk init.
> > >>
> > >> As part of the cleanup of the container, we do call
> > rte_eal_cleanup() to
> > >> cleanup the memory w.r.t our application. However after
> > iterations we still
> > >> see the memory allocation failure due to the fragmentation issue.
> > >>
> > >> We also tried to set the "--huge-unlink" as an argument before
> > when we
> > >> called the rte_eal_init() and it did not help.
> > >>
> > >> Could you please suggest if there is an option or any existing
> > patches
> > >> available to clean up the memory to avoid fragmentation issues
> > in the
> > >> future.
> > >>
> > >> Please advise.
> > >>
> > > What version of DPDK are you using, and what kernel driver for NIC
> > > interfacing are you using?
> > > DPDK versions since 18.05 should be more forgiving of fragmented
> > memory,
> > > especially if using the vfio-pci kernel driver.
> > >
> >
> > This sounds odd, to be honest.
> >
> > Unless you're allocating huge chunks of IOVA-contiguous memory,
> > fragmentation shouldn't be an issue. How did you determine that this
> > was
> > in fact due to fragmentation?
> >
> > > Regards,
> > > /Bruce
> > >
> >
> >
> > --
> > Thanks,
> > Anatoly
> >
>
>
> --
> Thanks,
> Anatoly
>
* Re: [dpdk-dev] DPDK hugepage memory fragmentation
From: Burakov, Anatoly @ 2020-07-28 10:10 UTC
To: Kamaraj P; +Cc: Bruce Richardson, dev, mmahmoud
On 27-Jul-20 4:30 PM, Kamaraj P wrote:
> Hi Anatoly,
> Since we do not have the driver support of SRIOv with VFIO, we are using
> IGB_UIO .
I believe it's coming :)
> Basically our application is crashing due to the buffer allocation
> failure. I believe because it didn't get a contiguous memory location
> and fails to allocate the memory.
Again, "crashing due to buffer allocation failure" is not very
descriptive. When allocation fails, EAL will produce an error log, so if
your failures are indeed due to memory allocation failures, an EAL log
will tell you if it's actually the case (and enabling debug level
logging will tell you more).
By default, all memory allocations will *not* be contiguous and
therefore will not fail if the memory is not contiguous. In order for
such an allocation to fail, you actually have to run out of memory.
If there is indeed a place where you are specifically requesting
contiguous memory, it will be signified by a call to the memzone reserve
API with the RTE_MEMZONE_IOVA_CONTIG flag (or a call to
rte_eth_dma_zone_reserve(), if your driver makes use of that API). So if
you're not willing to provide any logs to help with debugging, i would
at least suggest you grep your codebase for the above two things, and
put GDB breakpoints right after the calls to either the memzone reserve
API or an ethdev DMA zone reserve API.
To summarize: a regular allocation *will not fail* if memory is
non-contiguous, so you can disregard those. If you find all places where
you're requesting *contiguous* memory (which should be at most one or
two), you'll be in a better position to determine whether this is what's
causing the failures.
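To illustrate what such a request looks like (the zone name and size
below are made up for the example, not taken from the reporter's code), a
reservation that explicitly demands IOVA-contiguous memory would resemble:

#include <stdio.h>
#include <rte_memory.h>
#include <rte_memzone.h>
#include <rte_errno.h>

/* Hypothetical example: only a reservation carrying
 * RTE_MEMZONE_IOVA_CONTIG can fail because of IOVA fragmentation. */
static const struct rte_memzone *
reserve_contig_example(void)
{
	const struct rte_memzone *mz;

	mz = rte_memzone_reserve("example_dma_zone",  /* made-up name */
				 8 * 1024 * 1024,     /* 8 MB, illustrative */
				 SOCKET_ID_ANY,
				 RTE_MEMZONE_IOVA_CONTIG);
	if (mz == NULL)
		fprintf(stderr, "contiguous reserve failed: %s\n",
			rte_strerror(rte_errno));
	return mz;
}

With 2 MB hugepages, an 8 MB request like this needs four physically
adjacent pages, which is exactly the kind of request fragmentation can
break.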
> Is there any API, I can use to dump before our application dies ?
> Please let me know.
Not sure what you mean by that, but you could use
the rte_dump_physmem_layout() function to dump your hugepage layout. That
said, i believe a debugger is, in most cases, a better way to diagnose
the issue.
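A hedged sketch of dumping the memory state at the failure point,
combining the physmem layout dump mentioned above with the malloc and
memzone dumps (any FILE * will do; stderr is used here for illustration):

#include <stdio.h>
#include <rte_malloc.h>
#include <rte_memory.h>
#include <rte_memzone.h>

/* Call this right after an allocation failure to capture the heap,
 * physical segment and memzone state before the application exits. */
static void dump_memory_state(void)
{
	rte_malloc_dump_stats(stderr, NULL); /* per-heap malloc statistics */
	rte_dump_physmem_layout(stderr);     /* hugepage segment layout */
	rte_memzone_dump(stderr);            /* reserved memzones */
}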
>
> Thanks,
> Kamaraj
>
>
> On Mon, Jul 13, 2020 at 2:57 PM Burakov, Anatoly
> <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>> wrote:
>
> On 11-Jul-20 8:51 AM, Kamaraj P wrote:
> > Hello Anatoly/Bruce,
> >
> > We are using the 18_11 version of DPDK and we are using igb_uio.
> > The way we observe an issue here is that, after we tried multiple
> > iterations of start/stop of container application(which has DPDK),
> > we were not able to allocate the memory for port during the init.
> > We thought that it could be an issue of not getting continuous
> > allocation hence it fails.
> >
> > Is there an API where I can check if the memory is fragmented
> before we
> > invoke an allocation ?
> > Or do we have any such mechanism to defragment the memory allocation
> > once we exist from the application ?
> > Please advise.
> >
>
> This is unlikely due to fragmentation, because the only way for
> 18.11 to
> be affected my memory fragmentation is 1) if you're using legacy mem
> mode, or 2) you're using IOVA as PA mode and you need huge amounts of
> contiguous memory. (you are using igb_uio, so you would be in IOVA
> as PA
> mode)
>
> NICs very rarely, if ever, allocate more than a 2M-page worth of
> contiguous memory, because their descriptor rings aren't that big, and
> they'll usually get all the IOVA-contiguous space they need even in the
> face of heavily fragmented memory. Similarly, while 18.11 mempools will
> request IOVA-contiguous memory first, they have a fallback to using
> non-contiguous memory and thus too work just fine in the face of high
> memory fragmentation.
>
> The nature of the 18.11 memory subsystem is such that IOVA layout is
> decoupled from VA layout, so fragmentation does not affect DPDK as much
> as it has for previous versions. The first thing i'd suggest is using
> VFIO as opposed to igb_uio, as it's safer to use in a container
> environment, and it's less susceptible to memory fragmentation issues
> because it can remap memory to appear IOVA-contiguous.
>
> Could you please provide detailed logs of the init process? You can add
> '--log-level=eal,8' to the EAL command-line to enable debug logging in
> the EAL.
>
> > Thanks,
> > Kamaraj
> >
> >
> >
> > On Fri, Jul 10, 2020 at 9:14 PM Burakov, Anatoly
> > <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>
> <mailto:anatoly.burakov@intel.com
> <mailto:anatoly.burakov@intel.com>>> wrote:
> >
> > On 10-Jul-20 11:28 AM, Bruce Richardson wrote:
> > > On Fri, Jul 10, 2020 at 02:52:16PM +0530, Kamaraj P wrote:
> > >> Hello All,
> > >>
> > >> We are running to run DPDK based application in a
> container mode,
> > >> When we do multiple start/stop of our container
> application, the
> > DPDK
> > >> initialization seems to be failing.
> > >> This is because the hugepage memory fragementated and is not
> > able to find
> > >> the continuous allocation of the memory to initialize the
> buffer
> > in the
> > >> dpdk init.
> > >>
> > >> As part of the cleanup of the container, we do call
> > rte_eal_cleanup() to
> > >> cleanup the memory w.r.t our application. However after
> > iterations we still
> > >> see the memory allocation failure due to the
> fragmentation issue.
> > >>
> > >> We also tried to set the "--huge-unlink" as an argument
> before
> > when we
> > >> called the rte_eal_init() and it did not help.
> > >>
> > >> Could you please suggest if there is an option or any
> existing
> > patches
> > >> available to clean up the memory to avoid fragmentation
> issues
> > in the
> > >> future.
> > >>
> > >> Please advise.
> > >>
> > > What version of DPDK are you using, and what kernel driver
> for NIC
> > > interfacing are you using?
> > > DPDK versions since 18.05 should be more forgiving of
> fragmented
> > memory,
> > > especially if using the vfio-pci kernel driver.
> > >
> >
> > This sounds odd, to be honest.
> >
> > Unless you're allocating huge chunks of IOVA-contiguous memory,
> > fragmentation shouldn't be an issue. How did you determine
> that this
> > was
> > in fact due to fragmentation?
> >
> > > Regards,
> > > /Bruce
> > >
> >
> >
> > --
> > Thanks,
> > Anatoly
> >
>
>
> --
> Thanks,
> Anatoly
>
--
Thanks,
Anatoly
* Re: [dpdk-dev] DPDK hugepage memory fragmentation
From: Kamaraj P @ 2020-09-16 4:32 UTC
To: Burakov, Anatoly; +Cc: Bruce Richardson, dev
Hi Anatoly,
We just dump the memory contents when the memory allocation fails:
rte_malloc_dump_stats();
rte_dump_physmem_layout();
rte_memzone_dump();
We could see that the memory is fragmented:
----------- MALLOC_STATS -----------
Heap id:0
Heap name:socket_0
Heap_size:134217728,
Free_size:133604096,
Alloc_size:613632,
Greatest_free_size:8388608,
Alloc_count:48,
Free_count:53,
Heap id:1 through Heap id:31 (all unused)
Heap name:
Heap_size:0,
Free_size:0,
Alloc_size:0,
Greatest_free_size:0,
Alloc_count:0,
Free_count:0,
--------- END_MALLOC_STATS ---------
----------- MEMORY_SEGMENTS -----------
Segment 4-0: IOVA:0x2a5e00000, len:2097152, virt:0x2200200000, socket_id:0,
hugepage_sz:2097152, nchannel:0, nrank:0 fd:15
Segment 4-1: IOVA:0x2a6000000, len:2097152, virt:0x2200400000, socket_id:0,
hugepage_sz:2097152, nchannel:0, nrank:0 fd:19
Segment 4-2: IOVA:0x2a6200000, len:2097152, virt:0x2200600000, socket_id:0,
hugepage_sz:2097152, nchannel:0, nrank:0 fd:20
Segment 4-4: IOVA:0x2a6600000, len:2097152, virt:0x2200a00000, socket_id:0,
hugepage_sz:2097152, nchannel:0, nrank:0 fd:21
Segment 4-6: IOVA:0x2a7000000, len:2097152, virt:0x2200e00000, socket_id:0,
hugepage_sz:2097152, nchannel:0, nrank:0 fd:22
Segment 4-8: IOVA:0x2a7600000, len:2097152, virt:0x2201200000, socket_id:0,
hugepage_sz:2097152, nchannel:0, nrank:0 fd:23
Segment 4-10: IOVA:0x2a8000000, len:2097152, virt:0x2201600000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:24
Segment 4-11: IOVA:0x2a8200000, len:2097152, virt:0x2201800000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:25
Segment 4-12: IOVA:0x2a8400000, len:2097152, virt:0x2201a00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:26
Segment 4-13: IOVA:0x2a8600000, len:2097152, virt:0x2201c00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:27
Segment 4-15: IOVA:0x2a8c00000, len:2097152, virt:0x2202000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:28
Segment 4-16: IOVA:0x2a8e00000, len:2097152, virt:0x2202200000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:29
Segment 4-18: IOVA:0x2a9400000, len:2097152, virt:0x2202600000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:30
Segment 4-20: IOVA:0x2aa400000, len:2097152, virt:0x2202a00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:31
Segment 4-21: IOVA:0x2aa600000, len:2097152, virt:0x2202c00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:32
Segment 4-23: IOVA:0x2aac00000, len:2097152, virt:0x2203000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:33
Segment 4-25: IOVA:0x2abc00000, len:2097152, virt:0x2203400000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:34
Segment 4-26: IOVA:0x2abe00000, len:2097152, virt:0x2203600000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:35
Segment 4-28: IOVA:0x2b0400000, len:2097152, virt:0x2203a00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:36
Segment 4-29: IOVA:0x2b0600000, len:2097152, virt:0x2203c00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:37
Segment 4-30: IOVA:0x2b0800000, len:2097152, virt:0x2203e00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:38
Segment 4-31: IOVA:0x2b0a00000, len:2097152, virt:0x2204000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:39
Segment 4-33: IOVA:0x2b1200000, len:2097152, virt:0x2204400000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:40
Segment 4-35: IOVA:0x2b2600000, len:2097152, virt:0x2204800000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:41
Segment 4-37: IOVA:0x2b2a00000, len:2097152, virt:0x2204c00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:42
Segment 4-39: IOVA:0x2b5000000, len:2097152, virt:0x2205000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:43
Segment 4-41: IOVA:0x2b5600000, len:2097152, virt:0x2205400000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:44
Segment 4-43: IOVA:0x2b6800000, len:2097152, virt:0x2205800000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:45
Segment 4-45: IOVA:0x2b7c00000, len:2097152, virt:0x2205c00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:46
Segment 4-47: IOVA:0x2b8e00000, len:2097152, virt:0x2206000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:47
Segment 4-49: IOVA:0x2b9a00000, len:2097152, virt:0x2206400000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:48
Segment 4-50: IOVA:0x2b9c00000, len:2097152, virt:0x2206600000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:49
Segment 4-51: IOVA:0x2b9e00000, len:2097152, virt:0x2206800000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:50
Segment 4-53: IOVA:0x2ba400000, len:2097152, virt:0x2206c00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:51
Segment 4-55: IOVA:0x2bbe00000, len:2097152, virt:0x2207000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:52
Segment 4-57: IOVA:0x2bc200000, len:2097152, virt:0x2207400000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:53
Segment 4-59: IOVA:0x2c1c00000, len:2097152, virt:0x2207800000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:54
Segment 4-60: IOVA:0x2c1e00000, len:2097152, virt:0x2207a00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:55
Segment 4-62: IOVA:0x2d1a00000, len:2097152, virt:0x2207e00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:56
Segment 4-64: IOVA:0x2d5000000, len:2097152, virt:0x2208200000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:57
Segment 4-130: IOVA:0x2d7000000, len:2097152, virt:0x2210600000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:58
Segment 4-132: IOVA:0x2d8000000, len:2097152, virt:0x2210a00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:59
Segment 4-134: IOVA:0x2d8400000, len:2097152, virt:0x2210e00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:60
Segment 4-136: IOVA:0x2db400000, len:2097152, virt:0x2211200000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:61
Segment 4-138: IOVA:0x2dc600000, len:2097152, virt:0x2211600000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:62
Segment 4-139: IOVA:0x2dc800000, len:2097152, virt:0x2211800000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:63
Segment 4-140: IOVA:0x2dca00000, len:2097152, virt:0x2211a00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:64
Segment 4-142: IOVA:0x2de800000, len:2097152, virt:0x2211e00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:65
Segment 4-143: IOVA:0x2dea00000, len:2097152, virt:0x2212000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:66
Segment 4-145: IOVA:0x3d8c00000, len:2097152, virt:0x2212400000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:67
Segment 4-147: IOVA:0x3d9400000, len:2097152, virt:0x2212800000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:68
Segment 4-149: IOVA:0x3d9c00000, len:2097152, virt:0x2212c00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:69
Segment 4-151: IOVA:0x3e2200000, len:2097152, virt:0x2213000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:70
Segment 4-153: IOVA:0x3e5a00000, len:2097152, virt:0x2213400000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:71
Segment 4-155: IOVA:0x3e6000000, len:2097152, virt:0x2213800000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:72
Segment 4-157: IOVA:0x3e9e00000, len:2097152, virt:0x2213c00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:73
Segment 4-159: IOVA:0x3f0a00000, len:2097152, virt:0x2214000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:74
Segment 4-160: IOVA:0x3f0c00000, len:2097152, virt:0x2214200000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:75
Segment 4-162: IOVA:0x3f1c00000, len:2097152, virt:0x2214600000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:76
Segment 4-164: IOVA:0x3f2400000, len:2097152, virt:0x2214a00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:77
Segment 4-166: IOVA:0x3f2e00000, len:2097152, virt:0x2214e00000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:78
Segment 4-168: IOVA:0x3f3200000, len:2097152, virt:0x2215200000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:79
Segment 4-169: IOVA:0x3f3400000, len:2097152, virt:0x2215400000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:80
Segment 4-171: IOVA:0x3f3a00000, len:2097152, virt:0x2215800000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0 fd:81
--------- END_MEMORY_SEGMENTS ---------
------------ MEMORY_ZONES -------------
Zone 0: name:<rte_eth_dev_data>, len:0x35840, virt:0x22159b4740,
socket_id:0, flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 1: name:<port0_vq0>, len:0x3000, virt:0x22159ac000, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 2: name:<port0_vq1>, len:0x3000, virt:0x22159a7000, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 3: name:<port0_vq1_hdr>, len:0x9000, virt:0x221599dfc0, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 4: name:<port0_vq2>, len:0x2000, virt:0x221599b000, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 5: name:<port0_vq2_hdr>, len:0x1000, virt:0x2215999fc0, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 6: name:<port1_vq0>, len:0x3000, virt:0x2215992000, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 7: name:<port1_vq1>, len:0x3000, virt:0x221598d000, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 8: name:<port1_vq1_hdr>, len:0x9000, virt:0x2215983fc0, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 9: name:<port1_vq2>, len:0x2000, virt:0x2215981000, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 10: name:<port1_vq2_hdr>, len:0x1000, virt:0x221597ffc0, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 11: name:<port2_vq0>, len:0x3000, virt:0x2215978000, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 12: name:<port2_vq1>, len:0x3000, virt:0x2215973000, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 13: name:<port2_vq1_hdr>, len:0x9000, virt:0x2215969fc0, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 14: name:<port2_vq2>, len:0x2000, virt:0x2215967000, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
Zone 15: name:<port2_vq2_hdr>, len:0x1000, virt:0x2215965fc0, socket_id:0,
flags:0
physical segments used:
addr: 0x2215800000 iova: 0x3f3a00000 len: 0x200000 pagesz: 0x200000
---------- END_MEMORY_ZONES -----------
Could you please suggest whether there is any rte_mem library API to align
the memory pages?
Thanks,
Kamaraj
On Tue, Jul 28, 2020 at 3:40 PM Burakov, Anatoly <anatoly.burakov@intel.com>
wrote:
> On 27-Jul-20 4:30 PM, Kamaraj P wrote:
> > Hi Anatoly,
> > Since we do not have the driver support of SRIOv with VFIO, we are using
> > IGB_UIO .
>
> I believe it's coming :)
>
> > Basically our application is crashing due to the buffer allocation
> > failure. I believe because it didn't get a contiguous memory location
> > and fails to allocate the memory.
>
> Again, "crashing due to buffer allocation failure" is not very
> descriptive. When allocation fails, EAL will produce an error log, so if
> your failures are indeed due to memory allocation failures, an EAL log
> will tell you if it's actually the case (and enabling debug level
> logging will tell you more).
>
> By default, all memory allocations will *not* be contiguous and
> therefore will not fail if the memory is not contiguous. In order for
> such an allocation to fail, you actually have to run out of memory.
>
> If there is indeed a place where you are specifically requesting
> contiguous memory, it will be signified by a call to memzone reserve API
> with a RTE_MEMZONE_IOVA_CONTIG flag (or a call to
> rte_eth_dma_zone_reserve(), if your driver makes use of that API). So if
> you're not willing to provide any logs to help with debugging, i would
> at least suggest you grep your codebase for the above two things, and
> put GDB breakpoints right after the calls to either memzone reserve API
> or a ethdev DMA zone reserve API.
>
> To summarize: a regular allocation *will not fail* if memory is non
> contiguous, so you can disregard those. If you find all places where
> you're requesting *contiguous* memory (which should be at most one or
> two), you'll be in a better position to determine whether this is what's
> causing the failures.
>
> > Is there any API, I can use to dump before our application dies ?
> > Please let me know.
>
> Not sure what you mean by that, but you could use
> rte_dump_physmem_layout() function to dump your hugepage layout. That
> said, i believe a debugger is, in most cases, a better way to diagnose
> the issue.
>
> >
> > Thanks,
> > Kamaraj
> >
> >
> > On Mon, Jul 13, 2020 at 2:57 PM Burakov, Anatoly
> > <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>> wrote:
> >
> > On 11-Jul-20 8:51 AM, Kamaraj P wrote:
> > > Hello Anatoly/Bruce,
> > >
> > > We are using the 18_11 version of DPDK and we are using igb_uio.
> > > The way we observe an issue here is that, after we tried multiple
> > > iterations of start/stop of container application(which has DPDK),
> > > we were not able to allocate the memory for port during the init.
> > > We thought that it could be an issue of not getting continuous
> > > allocation hence it fails.
> > >
> > > Is there an API where I can check if the memory is fragmented
> > before we
> > > invoke an allocation ?
> > > Or do we have any such mechanism to defragment the memory
> allocation
> > > once we exist from the application ?
> > > Please advise.
> > >
> >
> > This is unlikely due to fragmentation, because the only way for
> > 18.11 to
> > be affected my memory fragmentation is 1) if you're using legacy mem
> > mode, or 2) you're using IOVA as PA mode and you need huge amounts of
> > contiguous memory. (you are using igb_uio, so you would be in IOVA
> > as PA
> > mode)
> >
> > NICs very rarely, if ever, allocate more than a 2M-page worth of
> > contiguous memory, because their descriptor rings aren't that big,
> and
> > they'll usually get all the IOVA-contiguous space they need even in
> the
> > face of heavily fragmented memory. Similarly, while 18.11 mempools
> will
> > request IOVA-contiguous memory first, they have a fallback to using
> > non-contiguous memory and thus too work just fine in the face of high
> > memory fragmentation.
> >
> > The nature of the 18.11 memory subsystem is such that IOVA layout is
> > decoupled from VA layout, so fragmentation does not affect DPDK as
> much
> > as it has for previous versions. The first thing i'd suggest is using
> > VFIO as opposed to igb_uio, as it's safer to use in a container
> > environment, and it's less susceptible to memory fragmentation issues
> > because it can remap memory to appear IOVA-contiguous.
> >
> > Could you please provide detailed logs of the init process? You can
> add
> > '--log-level=eal,8' to the EAL command-line to enable debug logging
> in
> > the EAL.
> >
> > > Thanks,
> > > Kamaraj
> > >
> > >
> > >
> > > On Fri, Jul 10, 2020 at 9:14 PM Burakov, Anatoly
> > > <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>
> > <mailto:anatoly.burakov@intel.com
> > <mailto:anatoly.burakov@intel.com>>> wrote:
> > >
> > > On 10-Jul-20 11:28 AM, Bruce Richardson wrote:
> > > > On Fri, Jul 10, 2020 at 02:52:16PM +0530, Kamaraj P wrote:
> > > >> Hello All,
> > > >>
> > > >> We are running to run DPDK based application in a
> > container mode,
> > > >> When we do multiple start/stop of our container
> > application, the
> > > DPDK
> > > >> initialization seems to be failing.
> > > >> This is because the hugepage memory fragementated and is
> not
> > > able to find
> > > >> the continuous allocation of the memory to initialize the
> > buffer
> > > in the
> > > >> dpdk init.
> > > >>
> > > >> As part of the cleanup of the container, we do call
> > > rte_eal_cleanup() to
> > > >> cleanup the memory w.r.t our application. However after
> > > iterations we still
> > > >> see the memory allocation failure due to the
> > fragmentation issue.
> > > >>
> > > >> We also tried to set the "--huge-unlink" as an argument
> > before
> > > when we
> > > >> called the rte_eal_init() and it did not help.
> > > >>
> > > >> Could you please suggest if there is an option or any
> > existing
> > > patches
> > > >> available to clean up the memory to avoid fragmentation
> > issues
> > > in the
> > > >> future.
> > > >>
> > > >> Please advise.
> > > >>
> > > > What version of DPDK are you using, and what kernel driver
> > for NIC
> > > > interfacing are you using?
> > > > DPDK versions since 18.05 should be more forgiving of
> > fragmented
> > > memory,
> > > > especially if using the vfio-pci kernel driver.
> > > >
> > >
> > > This sounds odd, to be honest.
> > >
> > > Unless you're allocating huge chunks of IOVA-contiguous
> memory,
> > > fragmentation shouldn't be an issue. How did you determine
> > that this
> > > was
> > > in fact due to fragmentation?
> > >
> > > > Regards,
> > > > /Bruce
> > > >
> > >
> > >
> > > --
> > > Thanks,
> > > Anatoly
> > >
> >
> >
> > --
> > Thanks,
> > Anatoly
> >
>
>
> --
> Thanks,
> Anatoly
>
* Re: [dpdk-dev] DPDK hugepage memory fragmentation
From: Burakov, Anatoly @ 2020-09-16 11:19 UTC
To: Kamaraj P; +Cc: Bruce Richardson, dev
On 16-Sep-20 5:32 AM, Kamaraj P wrote:
> Hi Anatoly,
>
> We just dump the memory contents when it fails to allocate the memory.
>
Hi Kamaraj,
Yes, i can see that the memory is fragmented. That's not what i was
asking though, because memory fragmentation is *expected* if you're
using igb_uio. You're not using VFIO and IOVA as VA addressing, so
you're at the mercy of your kernel when it comes to getting
IOVA-contiguous addresses. We in DPDK cannot do anything about it, because
we don't control which pages we get and what addresses they get
assigned. There's nothing to fix on our side in this situation.
Here are the things you can do to avoid allocation failure in your case:
1) Drop the IOVA-contiguous allocation flag [1]
2) Use legacy mode [2] [3]
3) Switch to using VFIO [4]
4) Use bigger page size
The first point is crucial! Your allocation *wouldn't have failed* if
your code didn't require the allocation to be IOVA-contiguous. The fact
that it has failed means that your allocation has requested such memory,
so it's on you to check whether IOVA-contiguousness is actually *required*
for whatever you're allocating. I cannot decide that for you as it is your
code, so it is up to you to figure out whether whatever you're allocating
actually requires such memory, or whether you can safely remove this
allocation flag from your code.
[1]
http://doc.dpdk.org/api/rte__memzone_8h.html#a3ccbea77ccab608c6e683817a3eb170f
[2]
http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#memory-mapping-discovery-and-memory-reservation
[3] http://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#id3
[4] I understand the requirement for PF driver, but i think support for
PF in VFIO is coming in 20.11 release
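As an illustration of point 1 (with made-up names and sizes, not the
reporter's actual code), a common pattern is to treat IOVA-contiguity as
an optimization: try a contiguous reservation first and fall back to a
regular one, much like the mempool fallback described earlier in the
thread:

#include <rte_memzone.h>

/* Hypothetical sketch: request IOVA-contiguous memory only as an
 * optimization, then retry with a regular (possibly non-contiguous)
 * reservation if the contiguous one cannot be satisfied. */
static const struct rte_memzone *
reserve_with_fallback(const char *name, size_t len, int socket_id)
{
	const struct rte_memzone *mz;

	mz = rte_memzone_reserve(name, len, socket_id,
				 RTE_MEMZONE_IOVA_CONTIG);
	if (mz != NULL)
		return mz;

	/* Fragmented IOVA space: retry without demanding contiguousness. */
	return rte_memzone_reserve(name, len, socket_id, 0);
}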
--
Thanks,
Anatoly
* Re: [dpdk-dev] DPDK hugepage memory fragmentation
From: Burakov, Anatoly @ 2020-09-16 11:47 UTC
To: Kamaraj P; +Cc: Bruce Richardson, dev
On 16-Sep-20 12:19 PM, Burakov, Anatoly wrote:
> On 16-Sep-20 5:32 AM, Kamaraj P wrote:
>> Hi Anatoly,
>>
>> We just dump the memory contents when it fails to allocate the memory.
>>
>
> Hi Kamaraj,
>
> Yes, i can see that the memory is fragmented. That's not what i was
> asking though, because memory fragmentation is *expected* if you're
> using igb_uio. You're not using VFIO and IOVA as VA addressing, so
> you're at the mercy of your kernel when it comes to getting
> IOVA-contiguous address. We in DPDK cannot do anything about it, because
> we don't control which pages we get and what addresses they get
> assigned. There's nothing to fix on our side in this situation.
>
> Here are the things you can do to avoid allocation failure in your case:
>
> 1) Drop the IOVA-contiguous allocation flag [1]
> 2) Use legacy mode [2] [3]
> 3) Switch to using VFIO [4]
> 4) Use bigger page size
>
> The first point is crucial! Your allocation *wouldn't have failed* if
> your code didn't specify for the allocation to require
> IOVA-contiguousness. The fact that it has failed means that your
> allocation has requested such memory, so it's on you to ensure that
> whatever you're allocating, IOVA-contiguousness is *required*. I cannot
> decide that for you as it is your code, so it is up to you to figure out
> if whatever you're allocating actually requires such memory, or if you
> can safely remove this allocation flag from your code.
>
> [1]
> http://doc.dpdk.org/api/rte__memzone_8h.html#a3ccbea77ccab608c6e683817a3eb170f
>
> [2]
> http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#memory-mapping-discovery-and-memory-reservation
>
> [3] http://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#id3
> [4] I understand the requirement for PF driver, but i think support for
> PF in VFIO is coming in 20.11 release
>
Also, to add to this, we still haven't established that the allocation
error is in fact due to memory fragmentation. You *claim* that is the
case, but memory allocations can fail for other reasons, and you haven't
provided me with anything that would confirm this, as *merely* the fact
that the memory is fragmented will not, in and of itself, necessarily
cause any allocation failures *unless* you are trying to allocate a memory
area that is bigger than your page size *and* is required to be contiguous.
If you can't tell me *what* you're allocating, at least tell me the size
of the allocation, the memzone flags it is allocated with, and any debug
log messages around the allocation, so that we can be sure we know why
the allocation fails. You're forcing me to fly
blind here, i can't help you if i don't know what i'm dealing with.
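For completeness, a hedged sketch of the kind of failure-path
instrumentation that would capture that information (the zone name, length
and flags below are placeholders):

#include <stdio.h>
#include <rte_memzone.h>
#include <rte_memory.h>
#include <rte_errno.h>

/* Illustrative failure path: report what was requested and why it failed,
 * then dump the hugepage layout so it can be shared on the list. */
static const struct rte_memzone *
reserve_and_report(const char *name, size_t len, int socket_id,
		   unsigned int flags)
{
	const struct rte_memzone *mz;

	mz = rte_memzone_reserve(name, len, socket_id, flags);
	if (mz == NULL) {
		fprintf(stderr,
			"memzone '%s' failed: len=%zu flags=0x%x: %s\n",
			name, len, flags, rte_strerror(rte_errno));
		rte_dump_physmem_layout(stderr);
	}
	return mz;
}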
--
Thanks,
Anatoly