DPDK patches and discussions
* [dpdk-dev] mmap() hint address
@ 2014-06-13 21:02 Gooch, Stephen
  2014-06-13 21:27 ` Richardson, Bruce
  0 siblings, 1 reply; 5+ messages in thread
From: Gooch, Stephen @ 2014-06-13 21:02 UTC (permalink / raw)
  To: dev

Hello,

I have seen a case where a secondary DPDK process tries to map a uio resource, passing the corresponding virtual address from the primary process to mmap() as a hint address.  However, in some instances mmap() returns a virtual address that is not the hint address, which results in rte_panic() and the secondary process going defunct.

This happens from time to time on an embedded device when nr_hugepages is set to 128, but never when nr_hugepages is set to 256 on the same device.  My question is: if mmap() can find the correct memory regions when nr_hugepages is set to 256, would it not require fewer resources (and therefore be more likely to succeed) at a lower value such as 128?

Any ideas what would cause this mmap() behavior at a lower nr_hugepages value?
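
For context, the pattern in play is roughly the following (a minimal sketch, not the actual DPDK code; the function name and error handling are illustrative):

    #include <sys/mman.h>

    /* Secondary process: try to re-map a uio resource at the same
     * virtual address the primary used.  The first mmap() argument is
     * only a hint; the kernel is free to place the mapping elsewhere,
     * and pointers shared with the primary are only valid if it does
     * not. */
    int remap_at(void *primary_addr, size_t len, int fd, off_t off)
    {
        void *addr = mmap(primary_addr, len, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, off);
        if (addr == MAP_FAILED)
            return -1;
        if (addr != primary_addr) {
            /* hint not honored: something already occupies that
             * range in this process, fatal for a secondary */
            munmap(addr, len);
            return -1;
        }
        return 0;
    }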

- Stephen


* Re: [dpdk-dev] mmap() hint address
  2014-06-13 21:02 [dpdk-dev] mmap() hint address Gooch, Stephen
@ 2014-06-13 21:27 ` Richardson, Bruce
  2014-06-16  8:00   ` Burakov, Anatoly
  0 siblings, 1 reply; 5+ messages in thread
From: Richardson, Bruce @ 2014-06-13 21:27 UTC (permalink / raw)
  To: Gooch, Stephen (Wind River), dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gooch, Stephen
> Sent: Friday, June 13, 2014 2:03 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] mmap() hint address
> 
> Hello,
> 
> I have seen a case where a secondary DPDK process tries to map a uio
> resource, passing the corresponding virtual address from the primary
> process to mmap() as a hint address.  However, in some instances mmap()
> returns a virtual address that is not the hint address, which results in
> rte_panic() and the secondary process going defunct.
> 
> This happens from time to time on an embedded device when nr_hugepages
> is set to 128, but never when nr_hugepages is set to 256 on the same
> device.  My question is: if mmap() can find the correct memory regions
> when nr_hugepages is set to 256, would it not require fewer resources
> (and therefore be more likely to succeed) at a lower value such as 128?
> 
> Any ideas what would cause this mmap() behavior at a lower nr_hugepages
> value?
> 
> - Stephen

Hi Stephen,

That's a strange one!
I don't know for certain why this is happening, but here is one possible theory. :-)

It could be due to the size of the memory blocks that are getting mmapped. When you use 256 pages, the blocks of memory getting mapped may well be larger (depending on how fragmented the 2MB pages are in memory), and so may be getting mapped at a higher set of address ranges where there is more free memory. This set of address ranges is then also free in the secondary process, so it is similarly able to map the memory.
With 128 hugepages, you may be looking for smaller amounts of memory, and so the addresses get mapped at a different spot in the virtual address space, one that may be more heavily used. Then when the secondary process tries to duplicate the mappings, it already has memory in use in that region and the mapping fails.
In short: one theory is that having bigger blocks to map causes the memory to be placed at a different location in the virtual address space, one which is free from conflicts in the secondary process.

So, how to confirm or refute this, and generally debug this issue?
Well, in general we would need to look at the messages printed at startup by the primary process to see how big the blocks it is trying to map are in each case, and where they end up in the virtual address space.
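
As a rough illustration of what to look at (the PIDs are placeholders, and the failing secondary would need to be held at the failure point, e.g. under gdb, to inspect it):

    # where did the primary map its hugepage segments and uio BARs?
    grep -E 'huge|uio' /proc/<primary_pid>/maps
    # what already occupies those ranges in the secondary?
    cat /proc/<secondary_pid>/maps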

Regards,
/Bruce


* Re: [dpdk-dev] mmap() hint address
  2014-06-13 21:27 ` Richardson, Bruce
@ 2014-06-16  8:00   ` Burakov, Anatoly
  2014-06-20 14:36     ` Gooch, Stephen
  0 siblings, 1 reply; 5+ messages in thread
From: Burakov, Anatoly @ 2014-06-16  8:00 UTC (permalink / raw)
  To: Richardson, Bruce, Gooch, Stephen (Wind River), dev

Hi Bruce, Stephen,

> > Hello,
> >
> > I have seen a case where a secondary DPDK process tries to map a uio
> > resource, passing the corresponding virtual address from the primary
> > process to mmap() as a hint address.  However, in some instances
> > mmap() returns a virtual address that is not the hint address, which
> > results in rte_panic() and the secondary process going defunct.
> >
> > This happens from time to time on an embedded device when
> > nr_hugepages is set to 128, but never when nr_hugepages is set to 256
> > on the same device.  My question is: if mmap() can find the correct
> > memory regions when nr_hugepages is set to 256, would it not require
> > fewer resources (and therefore be more likely to succeed) at a lower
> > value such as 128?
> >
> > Any ideas what would cause this mmap() behavior at a lower
> > nr_hugepages value?
> >
> > - Stephen
> 
> Hi Stephen,
> 
> That's a strange one!
> I don't know for certain why this is happening, but here is one possible
> theory. :-)
> 
> It could be due to the size of the memory blocks that are getting
> mmapped. When you use 256 pages, the blocks of memory getting mapped may
> well be larger (depending on how fragmented the 2MB pages are in
> memory), and so may be getting mapped at a higher set of address ranges
> where there is more free memory. This set of address ranges is then also
> free in the secondary process, so it is similarly able to map the
> memory.
> With 128 hugepages, you may be looking for smaller amounts of memory,
> and so the addresses get mapped at a different spot in the virtual
> address space, one that may be more heavily used. Then when the
> secondary process tries to duplicate the mappings, it already has memory
> in use in that region and the mapping fails.
> In short: one theory is that having bigger blocks to map causes the
> memory to be placed at a different location in the virtual address
> space, one which is free from conflicts in the secondary process.
> 
> So, how to confirm or refute this, and generally debug this issue?
> Well, in general we would need to look at the messages printed at
> startup by the primary process to see how big the blocks it is trying to
> map are in each case, and where they end up in the virtual address
> space.

As I remember, the OVDK project had vaguely similar issues (only they were trying to map hugepages into space that QEMU had already occupied). This resulted in us adding a --base-virtaddr EAL command-line flag that specifies the start virtual address at which the primary process will begin mapping pages. I guess you can try that as well (just remember that it needs to be passed to the primary process, because the secondary one just copies the mappings and either succeeds or fails to do so).
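
For example (the application name, core mask, channel count and address are illustrative; the address just has to fall in a region that is free in both processes):

    ./primary_app -c 0x3 -n 2 --base-virtaddr=0x60000000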

Best regards,
Anatoly Burakov
DPDK SW Engineer


* Re: [dpdk-dev] mmap() hint address
  2014-06-16  8:00   ` Burakov, Anatoly
@ 2014-06-20 14:36     ` Gooch, Stephen
  2014-06-20 14:42       ` Burakov, Anatoly
  0 siblings, 1 reply; 5+ messages in thread
From: Gooch, Stephen @ 2014-06-20 14:36 UTC (permalink / raw)
  To: BURAKOV, ANATOLY, RICHARDSON, BRUCE, dev

Hello,

One item I should have included is that this device is running a 32-bit 2.6.27 kernel, quite old, and sharing 4GB of RAM with a number of applications.  We were able to find the issue.  In the failure case the vDSO is mapped lower (toward [heap]) than normal.  As a result, .rte_config was mapped into the pre-mapped pci uio resource virtual address range.

The fix: (1) move the uio mmap() out of the narrow range at the bottom of the memory map and (2) create spacing between the uio maps and the rte_config mmap().  It works with all hugepage settings tested.
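
For anyone chasing the same symptom, a quick way to see where the vDSO landed in a process is a sketch like this (getauxval() needs glibc >= 2.16; on older toolchains, grep for [vdso] in /proc/<pid>/maps instead):

    #include <stdio.h>
    #include <sys/auxv.h>

    int main(void)
    {
        /* AT_SYSINFO_EHDR holds the base address of the vDSO mapping */
        unsigned long vdso = getauxval(AT_SYSINFO_EHDR);
        printf("vdso mapped at %#lx\n", vdso);
        return 0;
    }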

- Stephen

-----Original Message-----
From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com] 
Sent: Monday, June 16, 2014 1:00 AM
To: RICHARDSON, BRUCE; Gooch, Stephen; dev@dpdk.org
Subject: RE: mmap() hint address

Hi Bruce, Stephen,

> > Hello,
> >
> > I have seen a case where a secondary DPDK process tries to map a uio
> > resource, passing the corresponding virtual address from the primary
> > process to mmap() as a hint address.  However, in some instances
> > mmap() returns a virtual address that is not the hint address, which
> > results in rte_panic() and the secondary process going defunct.
> >
> > This happens from time to time on an embedded device when
> > nr_hugepages is set to 128, but never when nr_hugepages is set to 256
> > on the same device.  My question is: if mmap() can find the correct
> > memory regions when nr_hugepages is set to 256, would it not require
> > fewer resources (and therefore be more likely to succeed) at a lower
> > value such as 128?
> >
> > Any ideas what would cause this mmap() behavior at a lower
> > nr_hugepages value?
> >
> > - Stephen
> 
> Hi Stephen,
> 
> That's a strange one!
> I don't know for certain why this is happening, but here is one possible
> theory. :-)
> 
> It could be due to the size of the memory blocks that are getting
> mmapped. When you use 256 pages, the blocks of memory getting mapped may
> well be larger (depending on how fragmented the 2MB pages are in
> memory), and so may be getting mapped at a higher set of address ranges
> where there is more free memory. This set of address ranges is then also
> free in the secondary process, so it is similarly able to map the
> memory.
> With 128 hugepages, you may be looking for smaller amounts of memory,
> and so the addresses get mapped at a different spot in the virtual
> address space, one that may be more heavily used. Then when the
> secondary process tries to duplicate the mappings, it already has memory
> in use in that region and the mapping fails.
> In short: one theory is that having bigger blocks to map causes the
> memory to be placed at a different location in the virtual address
> space, one which is free from conflicts in the secondary process.
> 
> So, how to confirm or refute this, and generally debug this issue?
> Well, in general we would need to look at the messages printed at
> startup by the primary process to see how big the blocks it is trying to
> map are in each case, and where they end up in the virtual address
> space.

As I remember, the OVDK project had vaguely similar issues (only they were trying to map hugepages into space that QEMU had already occupied). This resulted in us adding a --base-virtaddr EAL command-line flag that specifies the start virtual address at which the primary process will begin mapping pages. I guess you can try that as well (just remember that it needs to be passed to the primary process, because the secondary one just copies the mappings and either succeeds or fails to do so).

Best regards,
Anatoly Burakov
DPDK SW Engineer


* Re: [dpdk-dev] mmap() hint address
  2014-06-20 14:36     ` Gooch, Stephen
@ 2014-06-20 14:42       ` Burakov, Anatoly
  0 siblings, 0 replies; 5+ messages in thread
From: Burakov, Anatoly @ 2014-06-20 14:42 UTC (permalink / raw)
  To: Gooch, Stephen (Wind River), Richardson, Bruce, dev

Hi Stephen,

You may want to look at the local tailqs patchset I've made. This may fix your issues as well.

http://dpdk.org/ml/archives/dev/2014-June/003573.html

I'm planning to respin a v4 of it, adding the use of the --base-virtaddr flag to also control where rte_config is mapped, in addition to the hugepages.

Thanks,
Anatoly


> -----Original Message-----
> From: Gooch, Stephen [mailto:stephen.gooch@windriver.com]
> Sent: Friday, June 20, 2014 3:37 PM
> To: Burakov, Anatoly; Richardson, Bruce; dev@dpdk.org
> Subject: RE: mmap() hint address
> 
> Hello,
> 
> One item I should have included is that this device is running a 32-bit
> 2.6.27 kernel, quite old, and sharing 4GB of RAM with a number of
> applications.  We were able to find the issue.  In the failure case the
> vDSO is mapped lower (toward [heap]) than normal.  As a result,
> .rte_config was mapped into the pre-mapped pci uio resource virtual
> address range.
> 
> The fix: (1) move the uio mmap() out of the narrow range at the bottom
> of the memory map and (2) create spacing between the uio maps and the
> rte_config mmap().  It works with all hugepage settings tested.
> 
> - Stephen
> 
> [earlier quoted messages trimmed]

