From: "Tan, Jianfeng" <jianfeng.tan@intel.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
Panu Matilainen <pmatilai@redhat.com>,
"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores
Date: Thu, 10 Mar 2016 09:36:33 +0800
Message-ID: <56E0CFA1.7030303@intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB97725836B1AA0C@irsmsx105.ger.corp.intel.com>
On 3/10/2016 3:33 AM, Ananyev, Konstantin wrote:
>
>>>>>>>>>> On 3/8/2016 4:54 PM, Panu Matilainen wrote:
>>>>>>>>>>> On 03/04/2016 12:05 PM, Jianfeng Tan wrote:
>>>>>>>>>>>> This patch adds option, --avail-cores, to use lcores which are
>>>>>>>>>>>> available
>>>>>>>>>>>> by calling pthread_getaffinity_np() to narrow down detected cores
>>>>>>>>>>>> before
>>>>>>>>>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores).
>>>>>>>>>>>>
>>>>>>>>>>>> Test example:
>>>>>>>>>>>> $ taskset 0xc0000 ./examples/helloworld/build/helloworld \
>>>>>>>>>>>> --avail-cores -m 1024
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>>>>>>>>>> Acked-by: Neil Horman <nhorman@tuxdriver.com>
>>>>>>>>>>> Hmm, to me this sounds like something that should be done always so
>>>>>>>>>>> there's no need for an option. Or if there's a chance it might do the
>>>>>>>>>>> wrong thing in some rare circumstance then perhaps there should be a
>>>>>>>>>>> disabler option instead?
>>>>>>>>>> Thanks for comments.
>>>>>>>>>>
>>>>>>>>>> Yes, there's a use case that we cannot handle.
>>>>>>>>>>
>>>>>>>>>> If we make it as default, DPDK applications may fail to start, when user
>>>>>>>>>> specifies a core in isolcpus and its parent process (say bash) has a
>>>>>>>>>> cpuset affinity that excludes isolcpus. Originally, DPDK applications
>>>>>>>>>> just blindly do pthread_setaffinity_np() and it always succeeds because
>>>>>>>>>> it always has root privilege to change any cpu affinity.
>>>>>>>>>>
>>>>>>>>>> Now, if we do the checking in rte_eal_cpu_init(), those lcores will be
>>>>>>>>>> flagged as undetected (in my older implementation) and leads to failure.
>>>>>>>>>> To make it correct, we would always add "taskset mask" (or other ways)
>>>>>>>>>> before DPDK application cmd lines.
>>>>>>>>>>
>>>>>>>>>> What do you think?
>>>>>>>>> I still think it sounds like something that should be done by default
>>>>>>>>> and maybe be overridable with some flag, rather than the other way
>>>>>>>>> around. Another alternative might be detecting the cores always but if
>>>>>>>>> running as root, override but with a warning.
>>>>>>>> For your second solution, only root can setaffinity to isolcpus?
>>>>>>>> Your first solution seems like a promising way for me.
>>>>>>>>
>>>>>>>>> But I dont know, just wondering. To look at it from another angle: why
>>>>>>>>> would somebody use this new --avail-cores option and in what
>>>>>>>>> situation, if things "just work" otherwise anyway?
>>>>>>>> For DPDK applications, the most common case to initialize DPDK is like
>>>>>>>> this: "$dpdk-app [options for DPDK] -- [options for app]", so users need
>>>>>>>> to specify which cores to run and how much hugepages are used. Suppose
>>>>>>>> we need this dpdk-app to run in a container, users already give those
>>>>>>>> information when they build up the cgroup for it to run inside, this
>>>>>>>> option or this patch is to make DPDK more smart to discover how much
>>>>>>>> resource will be used. Make sense?
>>>>>>> But then, all we need might be just a script that would extract this information from the system
>>>>>>> and form a proper cmdline parameter for DPDK?
>>>>>> Yes, a script will work. Or to construct (argc, argv) to call
>>>>>> rte_eal_init() in the application. But as Neil Horman once suggested, a
>>>>>> simple pthread_getaffinity_np() will get all things done. So is it worth
>>>>>> a patch here?
>>>>> Don't know...
>>>>> Personally I would prefer not to put extra logic inside EAL.
>>>>> For me - there are too many different options already.
>>>> Then how about making it the default in rte_eal_cpu_init()? It is already
>>>> known that it will bring trouble to isolcpus users: they need to
>>>> add "taskset [mask]" before starting a DPDK app.
>>> As I said - provide a script?
>> Yes. But what I want to say is that this script is hard to get right if there
>> are different kinds of limitations. (That rarely happens, though :-) )
> My thought was to keep the DPDK code untouched, i.e. let it still blindly call pthread_setaffinity_np()
> based on the input parameters, and in addition provide a script for those who want to run
> in '--avail-cores' mode.
> So it could do 'taskset -p $$' and then either form a -c parameter list for the app,
> or check the existing -c/-l/--lcores parameters and complain if a disallowed pcpu is detected.
> But ok, maybe it is easier and more convenient to have this logic inside EAL
> than in a separate script.
>
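To make the "form a -c/-l parameter list" idea concrete, here is a minimal
sketch in C of what such a helper (or the equivalent logic inside EAL) would
have to do. It assumes Linux with glibc; cpuset_to_corelist() and
current_corelist() are hypothetical names used only for illustration, not
functions from the patch or from DPDK.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the CPUs in 'set' as a comma-separated list such as "0,2,3",
 * which is the form accepted by the EAL corelist option (-l).
 * Returns the number of CPUs written, or -1 if 'buf' is too small.
 * (Hypothetical helper, for illustration only.)
 */
static int
cpuset_to_corelist(const cpu_set_t *set, char *buf, size_t len)
{
	int cpu, count = 0;
	size_t off = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		int n;

		if (!CPU_ISSET(cpu, set))
			continue;
		n = snprintf(buf + off, len - off, "%s%d",
			     count ? "," : "", cpu);
		if (n < 0 || (size_t)n >= len - off)
			return -1; /* output would be truncated */
		off += (size_t)n;
		count++;
	}
	return count;
}

/*
 * Query the affinity of the calling thread, as inherited from taskset
 * or a cpuset cgroup, and format it as a corelist.
 * Returns the number of allowed CPUs, or -1 on error.
 */
static int
current_corelist(char *buf, size_t len)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
		return -1;
	return cpuset_to_corelist(&set, buf, len);
}
```

A wrapper could call current_corelist() and pass the resulting string as the
-l argument when building (argc, argv) for rte_eal_init(). On older glibc
versions this needs to be linked with -pthread.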
>>> Same might be for amount of hugepage memory available to the user?
>> Ditto. Limitations like hugetlbfs quota, cgroup hugetlb, some are used
>> by app themself (more like an artificial argument) ...
>>>>> From other side looking at the patch itself:
>>>>> You are updating lcore_count and lcore_config[],based on physical cpu availability,
>>>>> but these days it is not always one-to-one mapping between EAL lcore and physical cpu.
>>>>> Shouldn't that be taken into account?
>>>> I have not seen a problem so far, because this work is done before
>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores). If a core
>>>> is disabled here, it's like it is not detected in rte_eal_cpu_init(). Or
>>>> could you please give more hints?
>>> I didn't try the changes, so probably I am missing something.
>>> Let's say a user is allowed to use only cpus 0-3.
>>> If he would type:
>>> --avail-cores --lcores='(1-7)@2',
>>> then only lcores 1-3 would be started.
>>> Again, if the user would specify '2@(1-7)', it would also go undetected
>>> that cpus 4-7 are not available to the user.
>>> Is that so?
>> After reading the code:
>> For case --lcores='(1-7)@2', lcores 1-7 would be started, and bind to
>> pcore 2.
>> For case --lcores='2@(1-7)', this will fail with "core 4 unavailable".
>>
>> It's because:
>> a. although 1:1 mapping is built-up and flagged as detected if pcore is
>> found in sysfs. (ROLE_RTE, cpuset, detected is true)
>> b. in the beginning of eal_parse_lcores(), "reset lcore config".
>> (ROLE_OFF, cpuset is empty, detected is still true)
>> c. pcore cpuset will be checked by convert_to_cpuset using the previous
>> "detected" value.
> Ok, my bad then - I misunderstood the code.
> Thanks for explanation.
> So if I get it right now - first inside lib/librte_eal/common/eal_common_lcore.c
> Both lcore_count and lcore_config relate to the pcpus.
> Then later, at lib/librte_eal/common/eal_common_options.c
> they are overwritten related to lcores information.
> Except lcore_config[].detected, which seems kept intact.
> Is that correct?
Yes, exactly. And I really appreciate you raising this question for
discussion.
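To illustrate the a/b/c flow quoted above, here is a simplified, standalone
model in C. It is only a sketch: struct lcore_cfg, detect(), reset() and
map_lcores() are hypothetical stand-ins, cpusets are reduced to 8-bit masks,
and the real code in eal_common_lcore.c and eal_common_options.c is more
involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LCORE 8 /* illustrative size, not RTE_MAX_LCORE */

enum role { ROLE_RTE, ROLE_OFF };

/* Simplified stand-in for struct lcore_config. */
struct lcore_cfg {
	bool detected;   /* pcpu with this index is usable */
	enum role role;
	uint32_t cpuset; /* bit i set: lcore may run on pcpu i */
};

/* Step a: build the 1:1 mapping; 'avail' models the affinity mask. */
static void
detect(struct lcore_cfg cfg[MAX_LCORE], uint32_t avail)
{
	for (int i = 0; i < MAX_LCORE; i++) {
		cfg[i].detected = (avail >> i) & 1u;
		cfg[i].role = cfg[i].detected ? ROLE_RTE : ROLE_OFF;
		cfg[i].cpuset = cfg[i].detected ? (1u << i) : 0;
	}
}

/* Step b: reset role and cpuset; 'detected' is left intact. */
static void
reset(struct lcore_cfg cfg[MAX_LCORE])
{
	for (int i = 0; i < MAX_LCORE; i++) {
		cfg[i].role = ROLE_OFF;
		cfg[i].cpuset = 0;
	}
}

/*
 * Step c: map the lcores in 'lcores' onto the pcpus in 'pcpus'.
 * Every requested pcpu is checked against the earlier 'detected' value.
 * Returns -1 on success, or the first unavailable pcpu.
 */
static int
map_lcores(struct lcore_cfg cfg[MAX_LCORE], uint32_t lcores, uint32_t pcpus)
{
	for (int cpu = 0; cpu < MAX_LCORE; cpu++)
		if (((pcpus >> cpu) & 1u) && !cfg[cpu].detected)
			return cpu;
	for (int i = 0; i < MAX_LCORE; i++) {
		if ((lcores >> i) & 1u) {
			cfg[i].role = ROLE_RTE;
			cfg[i].cpuset = pcpus;
		}
	}
	return -1;
}
```

With avail = 0xf (taskset 0xf), mapping lcores 1-7 onto pcpu 2 succeeds
because only pcpu 2 is checked, while mapping lcore 2 onto pcpus 1-7 stops at
pcpu 4, which corresponds to the "core 4 unavailable" message.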
>
>> I have tested it with the patch. The result matches the analysis above.
>> For case --lcores='(1-7)@2': sudo taskset 0xf
>> ./examples/helloworld/build/helloworld --avail-cores --lcores='(1-7)@2'
>> ...
>> hello from core 2
>> hello from core 3
>> hello from core 4
>> hello from core 5
>> hello from core 6
>> hello from core 7
>> hello from core 1
>>
>> For case --lcores='2@(1-7)': sudo taskset 0xf
>> ./examples/helloworld/build/helloworld --avail-cores --lcores='2@(1-7)'
>> ...
>> EAL: core 4 unavailable
>> EAL: invalid parameter for --lcores
>> ...
>>
>> One thing may be worth mentioning: should "detected" be maintained in struct
>> lcore_config? Maybe we need to maintain a data structure for pcores?
> Yes, it might be good to split pcpu and lcores information somehow,
> as it is a bit confusing right now.
> But I suppose this is a subject for another patch/discussion.
Yes, that is a separate topic.
Thanks,
Jianfeng
> Konstantin
>
>