Date: Wed, 10 Dec 2014 10:32:25 +0000
From: Bruce Richardson
To: Stephen Hemminger
Cc: "dev@dpdk.org"
Message-ID: <20141210103225.GA10056@bricha3-MOBL3>
In-Reply-To: <20141209141032.5fa2db0d@urahara>
Subject: Re: [dpdk-dev] A question about hugepage initialization time

On Tue, Dec 09, 2014 at 02:10:32PM -0800, Stephen Hemminger wrote:
> On Tue, 9 Dec 2014 11:45:07 -0800
> &rew wrote:
>
> > > Hey Folks,
> > >
> > > Our DPDK application deals with very large in-memory data structures,
> > > and can potentially use tens or even hundreds of gigabytes of hugepage
> > > memory. During the course of development, we've noticed that as the
> > > number of huge pages increases, the memory initialization time during
> > > EAL init gets to be quite long, lasting several minutes at present.
> > > The growth in init time doesn't appear to be linear, which is
> > > concerning.
> > >
> > > This is a minor inconvenience for us and our customers, as memory
> > > initialization makes our boot times a lot longer than they would
> > > otherwise be. Also, my experience has been that really long operations
> > > are often hiding errors - what you think is merely a slow operation is
> > > actually a timeout of some sort, often due to misconfiguration. This
> > > leads to two questions:
> > >
> > > 1. Does the long initialization time suggest that there's an error
> > > happening under the covers?
> > > 2. If not, is there any simple way that we can shorten memory
> > > initialization time?
> > >
> > > Thanks in advance for your insights.
> > >
> > > --
> > > Matt Laswell
> > > laswell@infiniteio.com
> > > infinite io, inc.
> >
> > Hello,
> >
> > please find some quick comments on the questions:
> > 1.) In our experience, long initialization times are normal with large
> > amounts of memory. The time depends on a few things:
> > - the number of hugepages (the page faults handled by the kernel are
> > pretty expensive)
> > - the size of the hugepages (the memset at initialization)
> >
> > 2.) Using 1G pages instead of 2M pages will reduce the initialization
> > time significantly. Using wmemset instead of memset gives an additional
> > 20-30% speedup in our measurements. Alternatively, just touching the
> > pages without cleaning them gives still more of a speedup, but in that
> > case your layer or the applications above need to do the cleanup at
> > allocation time (e.g. by using rte_zmalloc).
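
To make that trade-off concrete, here is a minimal sketch of the two
initialization strategies. This is not the actual EAL code - the 2M page
size and the already-mapped region are assumptions for illustration:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HUGE_PAGE_SZ (2UL * 1024 * 1024) /* assuming 2M hugepages */

/* Zero the whole region: one kernel page fault per page, plus the full
 * write cost of clearing every byte of every page. */
static void
init_zeroed(uint8_t *base, size_t len)
{
        memset(base, 0, len);
}

/* Touch one byte per page: the same number of page faults, but the bulk
 * writes are skipped. Whoever allocates from this region must then zero
 * the memory itself (e.g. via rte_zmalloc() in DPDK). */
static void
init_touch_only(volatile uint8_t *base, size_t len)
{
        size_t off;

        for (off = 0; off < len; off += HUGE_PAGE_SZ)
                base[off] = 0;
}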
> >
> > Cheers,
> > &rew
>
> I wonder if the whole rte_malloc code is even worth it with a modern
> kernel with transparent huge pages? rte_malloc adds very little value,
> and is less safe and slower than glibc or other allocators. Plus you
> lose the ability to get all the benefit out of valgrind or electric
> fence.

While I'd dearly love not to have our own custom malloc lib to maintain,
rte_malloc will be hard to replace for DPDK multiprocess: we would need a
replacement solution that similarly guarantees that memory mapped in
process A is also available at the same address in process B. :-(

/Bruce
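
P.S. For anyone unfamiliar with the multiprocess constraint above, here
is a rough sketch of what a secondary process has to do when attaching to
the primary's memory. This is a simplified illustration, not the actual
EAL attach logic - attach_segment(), the path argument and the error
handling are made up for the example:

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a shared hugepage-backed file at the exact virtual address the
 * primary process recorded, so that pointers stored inside the shared
 * memory remain valid in this process too. */
static void *
attach_segment(const char *path, void *primary_va, size_t len)
{
        void *va;
        int fd;

        fd = open(path, O_RDWR);
        if (fd < 0)
                return NULL;

        /* MAP_FIXED places the mapping exactly at primary_va, silently
         * replacing anything already mapped in that range - which is
         * why the address space has to be probed carefully first. */
        va = mmap(primary_va, len, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_FIXED, fd, 0);
        close(fd);
        return va == MAP_FAILED ? NULL : va;
}

A plain glibc malloc has no way to make this same-address promise across
processes, which is the gap rte_malloc fills for multiprocess apps.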