From: Bruce Richardson <bruce.richardson@intel.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] A question about hugepage initialization time
Date: Wed, 10 Dec 2014 14:35:58 +0000
Message-ID: <20141210143558.GB1632@bricha3-MOBL3>
In-Reply-To: <20141210142926.GA17040@localhost.localdomain>
On Wed, Dec 10, 2014 at 09:29:26AM -0500, Neil Horman wrote:
> On Wed, Dec 10, 2014 at 10:32:25AM +0000, Bruce Richardson wrote:
> > On Tue, Dec 09, 2014 at 02:10:32PM -0800, Stephen Hemminger wrote:
> > > On Tue, 9 Dec 2014 11:45:07 -0800
> > > &rew <andras.kovacs@ericsson.com> wrote:
> > >
> > > > > Hey Folks,
> > > > >
> > > > > Our DPDK application deals with very large in memory data structures, and
> > > > > can potentially use tens or even hundreds of gigabytes of hugepage memory.
> > > > > During the course of development, we've noticed that as the number of huge
> > > > > pages increases, the memory initialization time during EAL init gets to be
> > > > > quite long, lasting several minutes at present. The growth in init time
> > > > > doesn't appear to be linear, which is concerning.
> > > > >
> > > > > This is a minor inconvenience for us and our customers, as memory
> > > > > initialization makes our boot times a lot longer than it would otherwise
> > > > > be. Also, my experience has been that really long operations often are
> > > > > hiding errors - what you think is merely a slow operation is actually a
> > > > > timeout of some sort, often due to misconfiguration. This leads to two
> > > > > questions:
> > > > >
> > > > > 1. Does the long initialization time suggest that there's an error
> > > > > happening under the covers?
> > > > > 2. If not, is there any simple way that we can shorten memory
> > > > > initialization time?
> > > > >
> > > > > Thanks in advance for your insights.
> > > > >
> > > > > --
> > > > > Matt Laswell
> > > > > laswell@infiniteio.com
> > > > > infinite io, inc.
> > > > >
> > > >
> > > > Hello,
> > > >
> > > > please find some quick comments on the questions:
> > > > 1.) In our experience, a long initialization time is normal with a
> > > > large amount of memory. However, the time depends on a few things:
> > > > - number of hugepages (pagefault handled by kernel is pretty expensive)
> > > > - size of hugepages (memset at initialization)
> > > >
> > > > 2.) Using 1G pages instead of 2M will reduce the initialization time
> > > > significantly. Using wmemset instead of memset gives an additional 20-30%
> > > > boost by our measurements. Alternatively, just touching the pages without
> > > > clearing them gives some further speedup, but in that case your layer or
> > > > the applications above need to do the clearing at allocation time
> > > > (e.g. by using rte_zmalloc).
> > > >
> > > > Cheers,
> > > > &rew
> > >
> > > I wonder if the whole rte_malloc code is even worth it with a modern kernel
> > > with transparent huge pages? rte_malloc adds very little value and is less safe
> > > and slower than glibc or other allocators. Plus you lose the ability to get
> > > all the benefit out of valgrind or electric fence.
> >
> > While I'd dearly love to not have our own custom malloc lib to maintain, for DPDK
> > multiprocess, rte_malloc will be hard to replace as we would need a replacement
> > solution that similarly guarantees that memory mapped in process A is also
> > available at the same address in process B. :-(
> >
> Just out of curiosity, why even bother with multiprocess support? What you're
> talking about above is a multithread model, and you're shoehorning multiple
> processes into it.
> Neil
>
Yep, that's pretty much what it is alright. However, this multiprocess support
is very widely used by our customers in building their applications, and has
been in place and supported since some of the earliest DPDK releases. If it
is to be removed, it needs to be replaced by something that provides equivalent
capabilities to application writers (perhaps something with more fine-grained
sharing etc.)
/Bruce
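
To illustrate the same-address requirement discussed above, here is a minimal
sketch in plain POSIX C. It is not DPDK's implementation. The names
same_address_demo, struct shared_hdr and REGION_SIZE are hypothetical, a
temporary file under /tmp stands in for a hugetlbfs file, and fork() followed
by munmap() stands in for a separately started second process. The first
process records where it mapped the region and stores a raw pointer inside it.
The second process asks for the same address as a hint and only follows the
pointer if the kernel actually placed the mapping there.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE (1 << 20) /* 1 MiB stand-in for a hugepage segment */

/* Hypothetical header stored at the start of the shared region. */
struct shared_hdr {
    void *self;    /* address at which the creator mapped the region */
    int  *payload; /* raw pointer to another location in the region */
};

/* Returns 0 if the second process could follow the raw pointer. */
int same_address_demo(void)
{
    char path[] = "/tmp/same_addr_demoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the open descriptor keeps the file alive */
    if (ftruncate(fd, REGION_SIZE) != 0) {
        close(fd);
        return -1;
    }

    /* "Process A": let the kernel pick the address, then record it. */
    struct shared_hdr *hdr = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
    if (hdr == MAP_FAILED) {
        close(fd);
        return -1;
    }
    hdr->self = hdr;
    hdr->payload = (int *)((char *)hdr + 4096);
    *hdr->payload = 42;
    void *wanted = hdr->self;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(hdr, REGION_SIZE);
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* "Process B": drop the inherited mapping so that it has to map
         * the file itself, as a separately started process would. */
        munmap(hdr, REGION_SIZE);
        struct shared_hdr *h2 = mmap(wanted, REGION_SIZE,
                                     PROT_READ | PROT_WRITE,
                                     MAP_SHARED, fd, 0);
        if (h2 == MAP_FAILED)
            _exit(2);
        if ((void *)h2 != wanted)
            _exit(3); /* hint not honoured: stored pointers are unusable */
        _exit(*h2->payload == 42 ? 0 : 4);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(hdr, REGION_SIZE);
    close(fd);
    if (waited < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

#ifdef SAME_ADDR_DEMO_MAIN
int main(void)
{
    assert(same_address_demo() == 0);
    puts("second process followed the raw pointer");
    return 0;
}
#endif
```

Building with -DSAME_ADDR_DEMO_MAIN gives a standalone program. The address
check in the second process is the important part: an allocator such as glibc
malloc gives no way to request that placement across processes, which is the
gap a replacement for rte_malloc would have to fill.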