From: Stephen Hemminger
To: &rew
Cc: "dev@dpdk.org"
Date: Tue, 9 Dec 2014 14:10:32 -0800
Subject: Re: [dpdk-dev] A question about hugepage initialization time
Message-ID: <20141209141032.5fa2db0d@urahara>
List-Id: patches and discussions about DPDK

On Tue, 9 Dec 2014 11:45:07 -0800
&rew wrote:

> > Hey Folks,
> >
> > Our DPDK application deals with very large in memory data structures, and
> > can potentially use tens or even hundreds of gigabytes of hugepage memory.
> > During the course of development, we've noticed that as the number of huge
> > pages increases, the memory initialization time during EAL init gets to be
> > quite long, lasting several minutes at present. The growth in init time
> > doesn't appear to be linear, which is concerning.
> >
> > This is a minor inconvenience for us and our customers, as memory
> > initialization makes our boot times a lot longer than it would otherwise
> > be. Also, my experience has been that really long operations often are
> > hiding errors - what you think is merely a slow operation is actually a
> > timeout of some sort, often due to misconfiguration. This leads to two
> > questions:
> >
> > 1. Does the long initialization time suggest that there's an error
> > happening under the covers?
> > 2. If not, is there any simple way that we can shorten memory
> > initialization time?
> >
> > Thanks in advance for your insights.
> >
> > --
> > Matt Laswell
> > laswell@infiniteio.com
> > infinite io, inc.
> >
>
> Hello,
>
> please find some quick comments on the questions:
> 1.) By our experience long initialization time is normal in case of
> large amount of memory. However this time depends on some things:
> - number of hugepages (pagefault handled by kernel is pretty expensive)
> - size of hugepages (memset at initialization)
>
> 2.) Using 1G pages instead of 2M will reduce the initialization time
> significantly. Using wmemset instead of memset adds an additional 20-30%
> boost by our measurements. Or, just by touching the pages but not cleaning
> them you can have still some more speedup. But in this case your layer or
> the applications above need to do the cleanup at allocation time
> (e.g. by using rte_zmalloc).
>
> Cheers,
> &rew

I wonder if the whole rte_malloc code is even worth it with a modern kernel
with transparent huge pages? rte_malloc adds very little value and is less
safe and slower than glibc or other allocators. Plus you lose the ability to
get all the benefit out of valgrind or electric fence.
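
To put rough numbers on the page-count point in the quoted reply, here is a
minimal sketch in C. It is not DPDK code: pages_needed, touch_pages and
map_thp_region are hypothetical helper names, and the mapping is plain
anonymous memory with a madvise(MADV_HUGEPAGE) hint rather than hugetlbfs,
so the kernel is free to ignore the hint.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGE_2M (2ULL << 20)   /* 2 MiB hugepage */
#define PAGE_1G (1ULL << 30)   /* 1 GiB hugepage */

/* Hypothetical helper: pages (and so first-touch faults) needed for `bytes`. */
uint64_t pages_needed(uint64_t bytes, uint64_t page_size)
{
    assert(page_size > 0);
    return (bytes + page_size - 1) / page_size;   /* round up */
}

/* Hypothetical helper: fault a region in by writing one byte per page,
 * instead of memset()ing every byte. Returns the number of pages touched. */
size_t touch_pages(void *base, size_t len, size_t page_size)
{
    volatile unsigned char *p = base;
    size_t touched = 0;

    assert(page_size > 0);
    for (size_t off = 0; off < len; off += page_size) {
        p[off] = 0;
        touched++;
    }
    return touched;
}

/* Hypothetical helper: anonymous mapping with a transparent-hugepage hint.
 * Assumption: Linux; the hint is advisory and its result is ignored here. */
void *map_thp_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
#ifdef MADV_HUGEPAGE
    (void)madvise(p, len, MADV_HUGEPAGE);
#endif
    return p;
}
```

For 100 GiB, pages_needed gives 51200 pages at 2M and 100 pages at 1G, which
is the gap in first-touch faults the reply is pointing at. Note that the
kernel hands out fresh anonymous pages already zeroed, so touch_pages here
only forces the faults; it says nothing about the cost of an explicit memset.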