From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Tan, Jianfeng"
To: "Ananyev, Konstantin", Panu Matilainen, "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores
Date: Thu, 10 Mar 2016 01:45:58 +0800
Message-ID: <56E06156.2080400@intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB97725836B1A5FD@irsmsx105.ger.corp.intel.com>
References: <1453661393-85704-1-git-send-email-jianfeng.tan@intel.com>
 <1457085957-115339-1-git-send-email-jianfeng.tan@intel.com>
 <56DE9359.1090705@redhat.com>
 <56DF0E0A.8000108@intel.com>
 <56E01F94.2060906@redhat.com>
 <56E02AC2.7010704@intel.com>
 <2601191342CEEE43887BDE71AB97725836B1A536@irsmsx105.ger.corp.intel.com>
 <56E03078.3000501@intel.com>
 <2601191342CEEE43887BDE71AB97725836B1A5A2@irsmsx105.ger.corp.intel.com>
 <56E03977.7050103@intel.com>
 <2601191342CEEE43887BDE71AB97725836B1A5FD@irsmsx105.ger.corp.intel.com>
List-Id: patches and discussions about DPDK

Hi Konstantin,

On 3/9/2016 11:17 PM, Ananyev, Konstantin wrote:
> Hi Jianfeng,
>
>> -----Original Message-----
>> From: Tan, Jianfeng
>> Sent: Wednesday, March 09, 2016 2:56 PM
>> To: Ananyev, Konstantin; Panu Matilainen; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores
>>
>> Hi Konstantin,
>>
>> On 3/9/2016 10:44 PM, Ananyev, Konstantin wrote:
>>>> -----Original Message-----
>>>> From: Tan, Jianfeng
>>>> Sent: Wednesday, March 09, 2016 2:17 PM
>>>> To: Ananyev, Konstantin; Panu Matilainen; dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores
>>>>
>>>>
>>>>
>>>> On 3/9/2016 10:01 PM, Ananyev, Konstantin wrote:
>>>>>> -----Original Message-----
>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tan, Jianfeng
>>>>>> Sent: Wednesday, March 09, 2016 1:53 PM
>>>>>> To: Panu Matilainen; dev@dpdk.org
>>>>>> Subject: Re: [dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/9/2016 9:05 PM, Panu Matilainen wrote:
>>>>>>> On 03/08/2016 07:38 PM, Tan, Jianfeng wrote:
>>>>>>>> Hi Panu,
>>>>>>>>
>>>>>>>> On 3/8/2016 4:54 PM, Panu Matilainen wrote:
>>>>>>>>> On 03/04/2016 12:05 PM, Jianfeng Tan wrote:
>>>>>>>>>> This patch adds option, --avail-cores, to use lcores which are available
>>>>>>>>>> by calling pthread_getaffinity_np() to narrow down detected cores before
>>>>>>>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores).
>>>>>>>>>>
>>>>>>>>>> Test example:
>>>>>>>>>> $ taskset 0xc0000 ./examples/helloworld/build/helloworld \
>>>>>>>>>> --avail-cores -m 1024
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Jianfeng Tan
>>>>>>>>>> Acked-by: Neil Horman
>>>>>>>>> Hmm, to me this sounds like something that should be done always so
>>>>>>>>> there's no need for an option. Or if there's a chance it might do the
>>>>>>>>> wrong thing in some rare circumstance then perhaps there should be a
>>>>>>>>> disabler option instead?
>>>>>>>> Thanks for comments.
>>>>>>>>
>>>>>>>> Yes, there's a use case that we cannot handle.
>>>>>>>>
>>>>>>>> If we make it as default, DPDK applications may fail to start, when user
>>>>>>>> specifies a core in isolcpus and its parent process (say bash) has a
>>>>>>>> cpuset affinity that excludes isolcpus. Originally, DPDK applications
>>>>>>>> just blindly do pthread_setaffinity_np() and it always succeeds because
>>>>>>>> it always has root privilege to change any cpu affinity.
>>>>>>>>
>>>>>>>> Now, if we do the checking in rte_eal_cpu_init(), those lcores will be
>>>>>>>> flagged as undetected (in my older implementation) and leads to failure.
>>>>>>>> To make it correct, we would always add "taskset mask" (or other ways)
>>>>>>>> before DPDK application cmd lines.
>>>>>>>>
>>>>>>>> How do you think?
>>>>>>> I still think it sounds like something that should be done by default
>>>>>>> and maybe be overridable with some flag, rather than the other way
>>>>>>> around. Another alternative might be detecting the cores always but if
>>>>>>> running as root, override but with a warning.
>>>>>> For your second solution, only root can setaffinity to isolcpus?
>>>>>> Your first solution seems like a promising way for me.
>>>>>>
>>>>>>> But I dont know, just wondering. To look at it from another angle: why
>>>>>>> would somebody use this new --avail-cores option and in what
>>>>>>> situation, if things "just work" otherwise anyway?
>>>>>> For DPDK applications, the most common case to initialize DPDK is like
>>>>>> this: "$dpdk-app [options for DPDK] -- [options for app]", so users need
>>>>>> to specify which cores to run and how much hugepages are used. Suppose
>>>>>> we need this dpdk-app to run in a container, users already give those
>>>>>> information when they build up the cgroup for it to run inside, this
>>>>>> option or this patch is to make DPDK more smart to discover how much
>>>>>> resource will be used. Make sense?
>>>>> But then, all we need might be just a script that would extract this information from the system
>>>>> and form a proper cmdline parameter for DPDK?
>>>> Yes, a script will work. Or to construct (argc, argv) to call
>>>> rte_eal_init() in the application. But as Neil Horman once suggested, a
>>>> simple pthread_getaffinity_np() will get all things done. So if it worth
>>>> a patch here?
>>> Don't know...
>>> Personally I would prefer not to put extra logic inside EAL.
>>> For me - there are too many different options already.
>> Then how about make it default in rte_eal_cpu_init()? And it is already
>> known it will bring trouble to those use isolcpus users, they need to
>> add "taskset [mask]" before starting a DPDK app.
> As I said - provide a script?

Yes. But my point is that such a script is hard to get right when
different kinds of limitations apply. (That rarely happens, though :-) )

> Same might be for amount of hugepage memory available to the user?

Ditto. There are limitations such as hugetlbfs quota and cgroup hugetlb,
and some hugepages may already be used by the apps themselves (more of an
artificial argument) ...

>
>>> From other side looking at the patch itself:
>>> You are updating lcore_count and lcore_config[],based on physical cpu availability,
>>> but these days it is not always one-to-one mapping between EAL lcore and physical cpu.
>>> Shouldn't that be taken into account?
>> I have not see the problem so far, because this work is done before
>> parsing coremask (-c), corelist (-l), and coremap (--lcores). If a core
>> is disabled here, it's like it is not detected in rte_eal_cpu_init(). Or
>> could you please give more hints?
> I didn't test try changes, so probably I am missing something.
> Let say iuser allowed to use only cpus 0-3.
> If he would type with:
> --avail-cores --lcores='(1-7)@2',
> then only lcores 1-3 would be started.
> Again if user would specify '2@(1-7)' it would also be undetected
> that cpus 4-7 are note available to the user.
> Is that so?

After reading the code:

For case --lcores='(1-7)@2', lcores 1-7 would be started and bound to
pcore 2.
For case --lcores='2@(1-7)', this will fail with "core 4 unavailable".

This is because:
a. A 1:1 mapping is first built up, and an lcore is flagged as detected
   if its pcore is found in sysfs. (ROLE_RTE, cpuset, detected is true)
b. At the beginning of eal_parse_lcores(), the lcore config is reset.
   (ROLE_OFF, cpuset is empty, detected is still true)
c. The pcore cpuset is then checked by convert_to_cpuset() using the
   previous "detected" value.

I have tested it with the patch. The result matches the analysis above.

For case --lcores='(1-7)@2':

sudo taskset 0xf ./examples/helloworld/build/helloworld --avail-cores --lcores='(1-7)@2'
...
hello from core 2
hello from core 3
hello from core 4
hello from core 5
hello from core 6
hello from core 7
hello from core 1

For case --lcores='2@(1-7)':

sudo taskset 0xf ./examples/helloworld/build/helloworld --avail-cores --lcores='2@(1-7)'
...
EAL: core 4 unavailable
EAL: invalid parameter for --lcores
...

One thing may be worth mentioning: should "detected" be maintained in
struct lcore_config? Maybe we need to maintain a separate data structure
for pcores?

Thanks,
Jianfeng

>
> Konstantin
>
>> Thanks,
>> Jianfeng
>>
>>> Konstantin
>>>
>>>
>>>
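
For readers following the thread, the two ideas discussed in this message
(a helper script that derives EAL core options from the process affinity,
and the "detected" check applied to an --lcores group) can be sketched
roughly in Python. This is only an illustrative model, not the EAL code:
the names available_cpus(), corelist_option(), parse_set() and
check_lcores() are hypothetical, only a single 'lcores@cpus' group is
handled, and the set of detected pcores is assumed to be the affinity mask
(as with --avail-cores).

```python
import os


def available_cpus():
    """Return the CPUs this process may run on (Linux affinity mask)."""
    if hasattr(os, "sched_getaffinity"):
        return set(os.sched_getaffinity(0))
    # Assumption: on platforms without sched_getaffinity, treat all CPUs
    # as available.
    return set(range(os.cpu_count() or 1))


def corelist_option(cpus):
    """Format a CPU set as an EAL corelist argument, e.g. '-l 0,1,2,3'."""
    return "-l " + ",".join(str(c) for c in sorted(cpus))


def parse_set(text):
    """Parse '2', '1-7', '(1-7)' or '(0,2-3)' into a set of integers."""
    text = text.strip().strip("()")
    result = set()
    for part in text.split(","):
        lo, _, hi = part.partition("-")
        result.update(range(int(lo), int(hi or lo) + 1))
    return result


def check_lcores(spec, detected):
    """Simplified model of one --lcores group, 'lcores[@cpus]'.

    Returns {lcore_id: cpuset}. Raises ValueError if any physical CPU in
    the cpuset is not in 'detected', mirroring the "core N unavailable"
    failure reported above. Only pcores are checked; lcore ids are not.
    """
    lcores_txt, _, cpus_txt = spec.partition("@")
    lcores = parse_set(lcores_txt)
    mapping = {}
    for lcore in sorted(lcores):
        # Without '@', an lcore maps 1:1 onto the pcore of the same id.
        cpus = parse_set(cpus_txt) if cpus_txt else {lcore}
        for cpu in sorted(cpus):
            if cpu not in detected:
                raise ValueError(f"core {cpu} unavailable")
        mapping[lcore] = cpus
    return mapping


if __name__ == "__main__":
    cpus = available_cpus()
    print("suggested EAL option:", corelist_option(cpus))
    for spec in ("(1-7)@2", "2@(1-7)"):
        try:
            print(spec, "->", check_lcores(spec, cpus))
        except ValueError as err:
            print(spec, "-> error:", err)
```

Run under "taskset 0xf", the first group is accepted because only pcore 2
has to be available, while the second is rejected at the first pcore
outside the mask, which matches the two test runs quoted above.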