From: Sergio Gonzalez Monroy
To: David Marchand, Jianfeng Tan
Cc: "dev@dpdk.org", Neil Horman
Date: Wed, 18 May 2016 08:56:13 +0100
Subject: Re: [dpdk-dev] [PATCH v4] eal: make hugetlb initialization more robust
Message-ID: <7e3e3aa7-4277-ac4f-433e-7d63c9eef78b@intel.com>
References: <1457089092-4128-1-git-send-email-jianfeng.tan@intel.com> <1463013881-27985-1-git-send-email-jianfeng.tan@intel.com>

On 17/05/2016 17:39, David Marchand wrote:
> Hello Jianfeng,
>
> On Thu, May 12, 2016 at 2:44 AM, Jianfeng Tan wrote:
>> This patch adds an option, --huge-trybest, to use a recover mechanism to
>> the case that there are not so many hugepages (declared in sysfs), which
>> can be used. It relies on a mem access to fault-in hugepages, and if fails
>> with SIGBUS, recover to previously saved stack environment with
>> siglongjmp().
>>
>> Besides, this solution fixes an issue when hugetlbfs is specified with an
>> option of size.
>> Currently DPDK does not respect the quota of a hugetlbfs
>> mount. It fails to init the EAL because it tries to map the number of free
>> hugepages in the system rather than using the number specified in the quota
>> for that mount.
>>
>> It's still an open issue with CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS. Under
>> this case (such as IVSHMEM target), having hugetlbfs mounts with quota will
>> fail to remap hugepages as it relies on having mapped all free hugepages
>> in the system.
> For such a case, maybe having some warning log message when it
> fails would help the user.
> + a known issue in the release notes ?
>
>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> index 5b9132c..8c77010 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> @@ -417,12 +434,33 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
>>  			hugepg_tbl[i].final_va = virtaddr;
>>  		}
>>
>> +		if (orig && internal_config.huge_trybest) {
>> +			/* In linux, hugetlb limitations, like cgroup, are
>> +			 * enforced at fault time instead of mmap(), even
>> +			 * with the option of MAP_POPULATE. Kernel will send
>> +			 * a SIGBUS signal. To avoid to be killed, save stack
>> +			 * environment here, if SIGBUS happens, we can jump
>> +			 * back here.
>> +			 */
>> +			if (wrap_sigsetjmp()) {
>> +				RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
>> +					"hugepages of size %u MB\n",
>> +					(unsigned)(hugepage_sz / 0x100000));
>> +				munmap(virtaddr, hugepage_sz);
>> +				close(fd);
>> +				unlink(hugepg_tbl[i].filepath);
>> +				return i;
>> +			}
>> +			*(int *)virtaddr = 0;
>> +		}
>> +
>> +
>>  		/* set shared flock on the file.
>>  		 */
>>  		if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
>> -			RTE_LOG(ERR, EAL, "%s(): Locking file failed:%s \n",
>> +			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed:%s \n",
>>  				__func__, strerror(errno));
>>  			close(fd);
>> -			return -1;
>> +			return i;
>>  		}
>>
>>  		close(fd);
> Maybe I missed something, but we are writing into some hugepage before
> the flock has been called.
> Are we sure there is nobody else using this hugepage ?
>
> Especially, can't this cause trouble to a primary process running if
> we start the exact same primary process ?
>

We lock the hugepage directory during eal_hugepage_info_init(), and we
do not unlock it until eal_memory_init has finished. I think that takes
care of that case.

Sergio
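
For illustration, the directory-lock argument above can be sketched as
follows. This is a minimal sketch, not the EAL code: lock_hugepage_dir()
and unlock_hugepage_dir() are hypothetical helpers, and the sketch assumes
the lock is an exclusive flock() taken on a descriptor for the hugepage
directory. While the first holder keeps that descriptor locked, any other
attempt to take the lock (from another process, or from a second open of
the same directory) fails with EWOULDBLOCK, so nobody else gets as far as
touching the hugepage files.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* expose flock() under strict -std= modes */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a DPDK function): open a directory and take an
 * exclusive, non-blocking flock() on it.
 *
 * Returns the descriptor that holds the lock, or -1 with errno set
 * (EWOULDBLOCK when another open file description already holds it).
 */
static int
lock_hugepage_dir(const char *dir)
{
	int fd = open(dir, O_RDONLY);

	if (fd < 0)
		return -1;

	if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
		int saved_errno = errno;

		close(fd);		/* do not leak the descriptor */
		errno = saved_errno;	/* report the flock() failure */
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: drop the lock taken by lock_hugepage_dir() and
 * close the descriptor.
 */
static void
unlock_hugepage_dir(int fd)
{
	flock(fd, LOCK_UN);
	close(fd);
}
```

Under these assumptions, a primary process would call lock_hugepage_dir()
before scanning the mount, keep the returned descriptor open while it maps
and faults in the pages, and call unlock_hugepage_dir() only once memory
init is done. A second primary started in that window would fail to get the
lock instead of racing on the same hugepage files.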