From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jianfeng.tan@intel.com>
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
 by dpdk.org (Postfix) with ESMTP id 1B0F45954
 for <dev@dpdk.org>; Wed,  4 May 2016 13:28:07 +0200 (CEST)
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
 by fmsmga101.fm.intel.com with ESMTP; 04 May 2016 04:28:08 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.24,576,1455004800"; d="scan'208";a="972350760"
Received: from shwdeisgchi083.ccr.corp.intel.com (HELO [10.239.67.193])
 ([10.239.67.193])
 by fmsmga002.fm.intel.com with ESMTP; 04 May 2016 04:28:05 -0700
To: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>, dev@dpdk.org
References: <1457089092-4128-1-git-send-email-jianfeng.tan@intel.com>
 <1457401359-132260-1-git-send-email-jianfeng.tan@intel.com>
 <5729D7D7.3020400@intel.com>
Cc: david.marchand@6wind.com, nhorman@tuxdriver.com,
 konstantin.ananyev@intel.com
From: "Tan, Jianfeng" <jianfeng.tan@intel.com>
Message-ID: <d9c70f62-d293-3cd8-2f90-7e3804c79f54@intel.com>
Date: Wed, 4 May 2016 19:28:05 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.0
MIME-Version: 1.0
In-Reply-To: <5729D7D7.3020400@intel.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH v2] eal: make hugetlb initialization more
	robust
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 04 May 2016 11:28:08 -0000

Hi Sergio,


On 5/4/2016 7:07 PM, Sergio Gonzalez Monroy wrote:
> On 08/03/2016 01:42, Jianfeng Tan wrote:
>> This patch adds an option, --huge-trybest, to use a recover mechanism to
>> the case that there are not so many hugepages (declared in sysfs), which
>> can be used. It relys on a mem access to fault-in hugepages, and if 
>> fails
>> with SIGBUS, recover to previously saved stack environment with
>> siglongjmp().
>>
>> Test example:
>>    a. cgcreate -g hugetlb:/test-subgroup
>>    b. cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup
>>    c. cgexec -g hugetlb:test-subgroup \
>>       ./examples/helloworld/build/helloworld -c 0x2 -n 4 --huge-trybest
>
> I think you should mention in the commit message that this option also 
> covers the case
> of hugetlbfs mount with quota.

Yes, I should do that.

>
>>   +static sigjmp_buf jmpenv;
>> +
>> +static void sigbus_handler(int signo __rte_unused)
>> +{
>> +    siglongjmp(jmpenv, 1);
>> +}
>> +
>> +/* Put setjmp into a wrap method to avoid compiling error. Any 
>> non-volatile,
>> + * non-static local variable in the stack frame calling setjmp might be
>> + * clobbered by a call to longjmp.
>> + */
>> +static int wrap_setjmp(void)
>> +{
>> +    return setjmp(jmpenv);
>> +}
>
> Use sigsetjmp instead of setjmp and restore the signal masks.

The difference lies in whether signal mask will be saved for further 
restore. And you are right we should keep either sigsetjmp(xxx, 
1)/siglongjmp(xxx, 1) or setjmp()/longjmp. Nice catch! I'll go with the 
former.

>
>>   /*
>>    * Mmap all hugepages of hugepage table: it first open a file in
>>    * hugetlbfs, then mmap() hugepage_sz data in it. If orig is set, the
>> @@ -396,7 +413,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
>>           if (fd < 0) {
>>               RTE_LOG(ERR, EAL, "%s(): open failed: %s\n", __func__,
>>                       strerror(errno));
>> -            return -1;
>> +            return i;
>
> When using --try-best, we could get an error and still work as expected.
> It can be confusing for users to see an error when it is expected 
> behavior.
>
> Any thoughts?

Shall we remove those RTE_LOG complaints, because the failure here does 
not mean we cannot satisfy the requirements of applications?

...
> There is another call to map_all_hugepages that you are not updating 
> the check of the return value.

You are right, the second map_all_hugepages's return value check should 
be changed to compare with the total hugepages owned by the hpi. I'll 
fix this.

Thanks,
Jianfeng

>
> Sergio
>