From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id 017F82B8B for ; Wed, 4 May 2016 15:15:27 +0200 (CEST) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga103.jf.intel.com with ESMTP; 04 May 2016 06:15:27 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.24,576,1455004800"; d="scan'208";a="696710308" Received: from smonroyx-mobl.ger.corp.intel.com (HELO [10.237.221.35]) ([10.237.221.35]) by FMSMGA003.fm.intel.com with ESMTP; 04 May 2016 04:07:06 -0700 To: Jianfeng Tan , dev@dpdk.org References: <1457089092-4128-1-git-send-email-jianfeng.tan@intel.com> <1457401359-132260-1-git-send-email-jianfeng.tan@intel.com> Cc: david.marchand@6wind.com, nhorman@tuxdriver.com, konstantin.ananyev@intel.com From: Sergio Gonzalez Monroy Message-ID: <5729D7D7.3020400@intel.com> Date: Wed, 4 May 2016 12:07:03 +0100 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2 MIME-Version: 1.0 In-Reply-To: <1457401359-132260-1-git-send-email-jianfeng.tan@intel.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v2] eal: make hugetlb initialization more robust X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 May 2016 13:15:28 -0000 On 08/03/2016 01:42, Jianfeng Tan wrote: > This patch adds an option, --huge-trybest, to use a recover mechanism to > the case that there are not so many hugepages (declared in sysfs), which > can be used. It relys on a mem access to fault-in hugepages, and if fails > with SIGBUS, recover to previously saved stack environment with > siglongjmp(). > > Test example: > a. cgcreate -g hugetlb:/test-subgroup > b. cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup > c. cgexec -g hugetlb:test-subgroup \ > ./examples/helloworld/build/helloworld -c 0x2 -n 4 --huge-trybest I think you should mention in the commit message that this option also covers the case of hugetlbfs mount with quota. > > +static sigjmp_buf jmpenv; > + > +static void sigbus_handler(int signo __rte_unused) > +{ > + siglongjmp(jmpenv, 1); > +} > + > +/* Put setjmp into a wrap method to avoid compiling error. Any non-volatile, > + * non-static local variable in the stack frame calling setjmp might be > + * clobbered by a call to longjmp. > + */ > +static int wrap_setjmp(void) > +{ > + return setjmp(jmpenv); > +} Use sigsetjmp instead of setjmp and restore the signal masks. > /* > * Mmap all hugepages of hugepage table: it first open a file in > * hugetlbfs, then mmap() hugepage_sz data in it. If orig is set, the > @@ -396,7 +413,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, > if (fd < 0) { > RTE_LOG(ERR, EAL, "%s(): open failed: %s\n", __func__, > strerror(errno)); > - return -1; > + return i; When using --try-best, we could get an error and still work as expected. It can be confusing for users to see an error when it is expected behavior. Any thoughts? > } > > /* map the segment, and populate page tables, > @@ -407,7 +424,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, > RTE_LOG(ERR, EAL, "%s(): mmap failed: %s\n", __func__, > strerror(errno)); > close(fd); > - return -1; > + return i; > } > Same comment as above > /* set shared flock on the file. */ > if (flock(fd, LOCK_SH | LOCK_NB) == -1) { > RTE_LOG(ERR, EAL, "%s(): Locking file failed:%s \n", > __func__, strerror(errno)); > close(fd); > - return -1; > + return i; Same comment as above > @@ -1137,10 +1206,24 @@ rte_eal_hugepage_init(void) > continue; > > /* map all hugepages available */ > - if (map_all_hugepages(&tmp_hp[hp_offset], hpi, 1) < 0){ > - RTE_LOG(DEBUG, EAL, "Failed to mmap %u MB hugepages\n", > - (unsigned)(hpi->hugepage_sz / 0x100000)); > - goto fail; > + pages_old = hpi->num_pages[0]; > + pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi, 1); > + if (pages_new < pages_old) { > + RTE_LOG(DEBUG, EAL, > + "%d not %d hugepages of size %u MB allocated\n", > + pages_new, pages_old, > + (unsigned)(hpi->hugepage_sz / 0x100000)); > + if (internal_config.huge_trybest) { > + int pages = pages_old - pages_new; > + > + internal_config.memory -= > + hpi->hugepage_sz * pages; > + nr_hugepages -= pages; > + hpi->num_pages[0] = pages_new; > + if (pages_new == 0) > + continue; > + } else > + goto fail; > } There is another call to map_all_hugepages that you are not updating the check of the return value. Sergio