From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <anatoly.burakov@intel.com>
Received: from mga17.intel.com (mga17.intel.com [192.55.52.151])
 by dpdk.org (Postfix) with ESMTP id B939D4C77
 for <dev@dpdk.org>; Fri,  9 Nov 2018 15:03:28 +0100 (CET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga002.jf.intel.com ([10.7.209.21])
 by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 09 Nov 2018 06:03:27 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.54,483,1534834800"; d="scan'208";a="107266068"
Received: from aburakov-mobl1.ger.corp.intel.com (HELO [10.237.220.143])
 ([10.237.220.143])
 by orsmga002.jf.intel.com with ESMTP; 09 Nov 2018 06:03:26 -0800
To: jianmingfan <jianmingfan@126.com>, dev@dpdk.org
Cc: Jianming Fan <fanjianming@jd.com>
References: <20181109075830.27265-1-jianmingfan@126.com>
 <20181109092338.30097-1-jianmingfan@126.com>
 <6a7fcafc-5ac7-417a-fea6-2fd33b8b6c90@intel.com>
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Message-ID: <357b1b24-68f8-2c83-ff42-6ea1dce11b9c@intel.com>
Date: Fri, 9 Nov 2018 14:03:25 +0000
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <6a7fcafc-5ac7-417a-fea6-2fd33b8b6c90@intel.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH v2] mem: accelerate dpdk program startup by
 reuse page from page cache
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 09 Nov 2018 14:03:29 -0000

On 09-Nov-18 12:20 PM, Burakov, Anatoly wrote:
> On 09-Nov-18 9:23 AM, jianmingfan wrote:
>> --- fix coding style of the previous patch
>>
>> During process startup, dpdk invokes clear_hugedir() to unlink all
>> hugepage files under /dev/hugepages. Then in map_all_hugepages(),
>> it invokes mmap to allocate and zero all the huge pages as configured
>> in /sys/kernel/mm/hugepages/xxx/nr_hugepages.
>>
>> This makes process startup extremely slow when a large amount of
>> hugepage memory is configured.
>>
>> In our use case, we usually configure as much as 200GB of hugepages in
>> our router. Each time a dpdk process starts up, it takes more than 50s
>> to clear the pages.
>>
>> To address this issue, the user can turn on the --reuse-map switch.
>> With it, dpdk will check the validity of the existing page cache under
>> /dev/hugepages. If valid, the cache will be reused rather than deleted,
>> so that the OS doesn't need to zero the pages again.
>>
>> However, a lot of users, e.g. rte_kni_alloc, rely on the OS zero-page
>> behavior. To keep things working, I add a memset during
>> malloc_heap_alloc(). This makes sense for the following reasons.
>> 1) Users often configure far more hugepage memory than the program
>> actually uses.
>> In our router, 200GB is configured, but less than 2GB is actually used.
>> 2) dpdk users don't call heap allocation in the performance-critical
>> path. They allocate memory during process bootup.
>>
>> Signed-off-by: Jianming Fan <fanjianming@jd.com>
>> ---
> 
> I believe this issue is better solved by actually fixing all of the 
> memory that DPDK leaves behind. We already have rte_eal_cleanup() call 
> which will deallocate any EAL-allocated memory that has been reserved, 
> and an exited application should free any memory it was using so that 
> memory subsystem could free it back to the system, thereby not needing 
> any cleaning of hugepages at startup.
> 
> If your application does not e.g. free its mempools on exit, it should 
> :) Chances are, the problem will go away. The only circumstance where 
> this may not work is if you preallocated your memory using 
> -m/--socket-mem flag.
> 

To clarify - all of the above is only applicable to 18.05 and beyond. 
The map_all_hugepages() function only gets called in the legacy mem 
init, so this patch solves a problem that does not exist on recent DPDK 
versions in the first place - faster initialization is one of the key 
reasons why the new memory subsystem was developed.
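
For illustration only, the zero-on-allocation idea from the patch 
description (memset inside the allocator rather than zeroing every page 
at startup) can be sketched in plain C. This is not DPDK code: 
reuse_arena, reuse_arena_init() and reuse_arena_zalloc() are hypothetical 
names, the 64-byte alignment is an assumption, and an ordinary buffer 
stands in for a reused hugepage mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bump allocator over a region whose old contents are
 * unknown (a stand-in for a reused hugepage mapping; not a DPDK type). */
struct reuse_arena {
    uint8_t *base;  /* start of the reused region */
    size_t size;    /* total bytes in the region */
    size_t used;    /* bytes handed out so far */
};

static void reuse_arena_init(struct reuse_arena *a, void *mem, size_t size)
{
    assert(a != NULL && mem != NULL);
    a->base = mem;
    a->size = size;
    a->used = 0;    /* note: the region itself is NOT zeroed here */
}

/* Hand out len bytes, zeroing only the chunk being returned. */
static void *reuse_arena_zalloc(struct reuse_arena *a, size_t len)
{
    /* Round up to 64 bytes (assumed alignment for this sketch). */
    size_t aligned = (len + 63) & ~(size_t)63;
    void *p;

    /* Reject size overflow and requests that do not fit. */
    if (aligned < len || aligned > a->size - a->used)
        return NULL;

    p = a->base + a->used;
    memset(p, 0, aligned);  /* cost is paid per allocation, not at init */
    a->used += aligned;
    return p;
}
```

In this scheme, a process that maps 200GB but allocates under 2GB would 
only ever memset the part it actually hands out.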

-- 
Thanks,
Anatoly