From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: <dev-bounces@dpdk.org> Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 61646A04A2; Wed, 6 Nov 2019 11:37:49 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id DDEB11C07C; Wed, 6 Nov 2019 11:37:47 +0100 (CET) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id CFA901C038; Wed, 6 Nov 2019 11:37:45 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 06 Nov 2019 02:37:45 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.68,274,1569308400"; d="scan'208";a="212745620" Received: from aburakov-mobl1.ger.corp.intel.com (HELO [10.237.220.92]) ([10.237.220.92]) by fmsmga001.fm.intel.com with ESMTP; 06 Nov 2019 02:37:43 -0800 To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, David Marchand <david.marchand@redhat.com>, Yasufumi Ogawa <yasufum.o@gmail.com> Cc: dev <dev@dpdk.org>, dpdk stable <stable@dpdk.org>, Yasufumi Ogawa <ogawa.yasufumi@lab.ntt.co.jp> References: <20190724082031.45546-1-yasufum.o@gmail.com> <20191101090413.17997-1-yasufum.o@gmail.com> <20191101090413.17997-2-yasufum.o@gmail.com> <05605cac-693e-042a-a3cb-6506b1ec653e@intel.com> <CAJFAV8xPPu8kDNcjUycvWKjUBc3YuOT_ZQ_aDjrD4-yW3JgiEg@mail.gmail.com> <d7a0729c-eb4d-d87a-4f7a-0d94627010e4@intel.com> <2601191342CEEE43887BDE71AB97725801A8C800B7@IRSMSX104.ger.corp.intel.com> From: "Burakov, Anatoly" <anatoly.burakov@intel.com> Message-ID: <9c3fba36-9cb2-76ad-198f-c11a63f01a9a@intel.com> Date: Wed, 6 Nov 2019 10:37:42 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <2601191342CEEE43887BDE71AB97725801A8C800B7@IRSMSX104.ger.corp.intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v6 1/1] fbarray: fix duplicated fbarray file in secondary X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions <dev.dpdk.org> List-Unsubscribe: <https://mails.dpdk.org/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://mails.dpdk.org/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <https://mails.dpdk.org/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: dev-bounces@dpdk.org Sender: "dev" <dev-bounces@dpdk.org> On 05-Nov-19 11:41 AM, Ananyev, Konstantin wrote: > > >> -----Original Message----- >> From: Burakov, Anatoly <anatoly.burakov@intel.com> >> Sent: Tuesday, November 5, 2019 11:31 AM >> To: David Marchand <david.marchand@redhat.com>; Yasufumi Ogawa <yasufum.o@gmail.com> >> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dev <dev@dpdk.org>; dpdk stable <stable@dpdk.org>; Yasufumi Ogawa >> <ogawa.yasufumi@lab.ntt.co.jp> >> Subject: Re: [PATCH v6 1/1] fbarray: fix duplicated fbarray file in secondary >> >> On 05-Nov-19 10:13 AM, David Marchand wrote: >>> Hello Anatoly, Yasufumi, >>> >>> On Mon, Nov 4, 2019 at 11:20 AM Burakov, Anatoly >>> <anatoly.burakov@intel.com> wrote: >>>> >>>> On 01-Nov-19 9:04 AM, yasufum.o@gmail.com wrote: >>>>> From: Yasufumi Ogawa <ogawa.yasufumi@lab.ntt.co.jp> >>>>> >>>>> In secondary_msl_create_walk(), it creates a file for fbarrays with its >>>>> PID for reserving unique name among secondary processes. However, it >>>>> does not work if several secondaries run as app containers because each >>>>> of containerized secondary has PID 1, and failed to reserve unique name >>>>> other than first one. To reserve unique name in each of containers, use >>>>> hostname in addition to PID. >>>>> >>>>> Cc: stable@dpdk.org >>> >>> We can't backport this as is, see below. >>> >>> >>>>> >>>>> Signed-off-by: Yasufumi Ogawa <yasufum.o@gmail.com> >>>>> --- >>>>> lib/librte_eal/common/include/rte_fbarray.h | 2 +- >>>>> lib/librte_eal/linux/eal/eal_memalloc.c | 11 ++++++++--- >>>>> 2 files changed, 9 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/lib/librte_eal/common/include/rte_fbarray.h b/lib/librte_eal/common/include/rte_fbarray.h >>>>> index 6dccdbec9..5c2815093 100644 >>>>> --- a/lib/librte_eal/common/include/rte_fbarray.h >>>>> +++ b/lib/librte_eal/common/include/rte_fbarray.h >>>>> @@ -39,7 +39,7 @@ extern "C" { >>>>> #include <rte_compat.h> >>>>> #include <rte_rwlock.h> >>>>> >>>>> -#define RTE_FBARRAY_NAME_LEN 64 >>>>> +#define RTE_FBARRAY_NAME_LEN NAME_MAX >>> >>> The change on RTE_FBARRAY_NAME_LEN breaks the ABI, so we cannot >>> backport this as is. >>> For 19.11, we can allow this breakage, but we need an update of the >>> release notes. >>> >>> Besides, what is the impact in terms of memory consumption? >>> >>> >>>>> >>>>> struct rte_fbarray { >>>>> char name[RTE_FBARRAY_NAME_LEN]; /**< name associated with an array */ >>>>> diff --git a/lib/librte_eal/linux/eal/eal_memalloc.c b/lib/librte_eal/linux/eal/eal_memalloc.c >>>>> index af6d0d023..24f0275c9 100644 >>>>> --- a/lib/librte_eal/linux/eal/eal_memalloc.c >>>>> +++ b/lib/librte_eal/linux/eal/eal_memalloc.c >>>>> @@ -1365,6 +1365,7 @@ secondary_msl_create_walk(const struct rte_memseg_list *msl, >>>>> struct rte_memseg_list *primary_msl, *local_msl; >>>>> char name[PATH_MAX]; >>>>> int msl_idx, ret; >>>>> + char hostname[HOST_NAME_MAX] = { 0 }; >>>>> >>>>> if (msl->external) >>>>> return 0; >>>>> @@ -1373,9 +1374,13 @@ secondary_msl_create_walk(const struct rte_memseg_list *msl, >>>>> primary_msl = &mcfg->memsegs[msl_idx]; >>>>> local_msl = &local_memsegs[msl_idx]; >>>>> >>>>> - /* create distinct fbarrays for each secondary */ >>>>> - snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%i", >>>>> - primary_msl->memseg_arr.name, getpid()); >>>>> + /* Create distinct fbarrays for each secondary by using PID and >>>>> + * hostname. The reason why using hostname is because PID could be >>>>> + * duplicated among secondaries if it is launched in a container. >>>>> + */ >>>>> + gethostname(hostname, HOST_NAME_MAX); >>> >>> Personal preference, s/HOST_NAME_MAX/sizeof(hostname)/. >>> >>> >>> hostname[] is HOST_NAME_MAX bytes long. >>> In the worst case, we can get a non NULL terminated hostname string. >>> " >>> gethostname() returns the null-terminated hostname in the >>> character array name, which has a length of len bytes. If the >>> null-terminated hostname is too large to fit, then the name is >>> truncated, and >>> no error is returned (but see NOTES below). POSIX.1-2001 says >>> that if such truncation occurs, then it is unspecified whether the >>> returned buffer includes a terminating null byte. >>> ... >>> NOTES >>> SUSv2 guarantees that "Host names are limited to 255 bytes". >>> POSIX.1-2001 guarantees that "Host names (not including the >>> terminating null byte) are limited to HOST_NAME_MAX bytes". On >>> Linux, >>> HOST_NAME_MAX is defined with the value 64, which has been the >>> limit since Linux 1.0 (earlier kernels imposed a limit of 8 bytes). >>> " >>> >>> How about making hostname[] HOST_NAME_MAX+1 bytes long? >>> >>>>> + snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%s_%d", >>>>> + primary_msl->memseg_arr.name, hostname, (int)getpid()); >>>>> >>>>> ret = rte_fbarray_init(&local_msl->memseg_arr, name, >>>>> primary_msl->memseg_arr.len, >>>>> >>>> >>>> I think the order should be reversed. Both containers and non-containers >>>> can have their hostname set, and RTE_FBARRAY_NAME_LEN is of fairly >>>> limited length, so if the hostname is long enough, the PID never gets >>>> into the name string, resulting in duplicates. It is better have pid first. >>> >>> Anatoly, >>> >>> On the principle, it seems better, yes. >>> Just the comment on RTE_FBARRAY_NAME_LEN indicates that you missed the >>> change at the top of the patch. >>> What do you think of this change? >>> >> >> Yes, i did miss that, apologies. >> >> I don't have a strong opinion on this change, however the above comment >> would still be true if we make fbarray size to be hostname_max + 1 - we >> still potentially get no space for a pid. So if we're going to have pid >> in there as well, it should be hostname_max + pid_max (5 digits?) + >> whatever underscores we have + null terminator, to ensure it fits under >> any and all circumstances.# > > I think that at least on linux we have more than enough space here: > > $ find /usr/include -type f | xargs grep ' NAME_MAX' | grep define > /usr/include/linux/limits.h:#define NAME_MAX 255 /* # chars in a file name */ > > $ find /usr/include -type f | xargs grep ' HOST_NAME_MAX' | grep define > /usr/include/i386-linux-gnu/bits/local_lim.h:#define HOST_NAME_MAX 64 > /usr/include/x86_64-linux-gnu/bits/local_lim.h:#define HOST_NAME_MAX 64 > Okay, works for me :) -- Thanks, Anatoly