DPDK patches and discussions
From: "Johan Källström" <johan.kallstrom@ericsson.com>
To: 'David Marchand' <david.marchand@redhat.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"anatoly.burakov@intel.com" <anatoly.burakov@intel.com>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] eal: fix ctrl thread affinity with --lcores
Date: Tue, 30 Jul 2019 16:32:02 +0000
Message-ID: <HE1PR0701MB2153B6E5EEAD04BE6217794D98DC0@HE1PR0701MB2153.eurprd07.prod.outlook.com>
In-Reply-To: <CAJFAV8xNUQSC65T=kR+xiFG4X4EzKv9j1ZaUYEkx+Frfqyx9PA@mail.gmail.com>

Hi, for the online check I referred to the check of "default_set" via the initial thread affinity.

I see that pthread_getaffinity_np returns an already ANDed mask; I was under the impression that it would return the same mask that was set with pthread_setaffinity_np.
Looking at the implementation, I see that it has been done this way on this line (https://github.com/torvalds/linux/blob/master/kernel/sched/core.c#L5242) for the last decade. I don't know how this is implemented on FreeBSD or Windows.
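
The round trip described above can be checked with a small Linux-only sketch. It uses sched_setaffinity/sched_getaffinity with pid 0 (the calling thread) rather than the pthread calls; as far as I understand, glibc builds the pthread variants on the same system calls, but treat that as an assumption. The helper names affinity_roundtrip and mask_is_subset are made up for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Apply `want` to the calling thread, then read the effective mask back.
 * Returns 0 on success, -1 on error (errno is set by the failing call). */
static int affinity_roundtrip(const cpu_set_t *want, cpu_set_t *got)
{
    assert(want != NULL && got != NULL);
    if (sched_setaffinity(0, sizeof(*want), want) != 0)
        return -1;
    return sched_getaffinity(0, sizeof(*got), got);
}

/* Return 1 if every CPU in `a` is also present in `b`, 0 otherwise. */
static int mask_is_subset(const cpu_set_t *a, const cpu_set_t *b)
{
    cpu_set_t both;

    CPU_AND(&both, a, b);
    return CPU_EQUAL(&both, a) ? 1 : 0;
}
```

If the mask read back is a subset of, but not equal to, the mask that was requested, the kernel has dropped some CPUs from it, which is the behaviour noted above.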

Below are some example runs without the online CPU check, run inside the exclusive cpuset 1-3,19,79 with CPU 79 offline.
I added a print statement after each consecutive calculation to verify the different steps.

Nice that you were able to reproduce the bug. Otherwise the fix looks good :)

= Example runs
echo 0 > /sys/bus/cpu/devices/cpu79/online
== 1. Ctrl threads via fallback
app# LD_LIBRARY_PATH=$PWD/../lib:$LD_LIBRARY_PATH taskset -c 19,79 ./testpmd --master-lcore 0 --lcores "(0,19)@(19,1,2,3)"
EAL: Detected 79 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: default_set: 19
EAL: cset_online: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78
EAL: cset_non_busy: 0,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127
EAL: cpuset: 
EAL: cpuset fallback: 1,2,3,19
...
^Z
app#  grep -HE '^(Cpus_allowed_list|Name):' /proc/48803/task/*/status 
/proc/48803/task/48803/status:Name:     testpmd
/proc/48803/task/48803/status:Cpus_allowed_list:        1-3,19
/proc/48803/task/48804/status:Name:     eal-intr-thread
/proc/48803/task/48804/status:Cpus_allowed_list:        1-3,19
/proc/48803/task/48805/status:Name:     rte_mp_handle
/proc/48803/task/48805/status:Cpus_allowed_list:        1-3,19
/proc/48803/task/48806/status:Name:     lcore-slave-19
/proc/48803/task/48806/status:Cpus_allowed_list:        1-3,19
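
To compare Cpus_allowed_list values like the ones above programmatically, the range-list text can be turned into a cpu_set_t. A minimal sketch; parse_cpu_list is a hypothetical helper, not something from DPDK or glibc.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdlib.h>

/* Parse a CPU list such as "1-3,19" into `set`.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int parse_cpu_list(const char *s, cpu_set_t *set)
{
    CPU_ZERO(set);
    while (*s != '\0') {
        char *end;
        long lo = strtol(s, &end, 10);
        long hi;

        if (end == s || lo < 0)
            return -1;
        hi = lo;
        if (*end == '-') {              /* range "lo-hi" */
            const char *p = end + 1;

            hi = strtol(p, &end, 10);
            if (end == p || hi < lo)
                return -1;
        }
        if (hi >= CPU_SETSIZE)
            return -1;
        for (long cpu = lo; cpu <= hi; cpu++)
            CPU_SET((int)cpu, set);

        if (*end == ',')
            s = end + 1;                /* next list element */
        else if (*end == '\0' || *end == '\n')
            break;                      /* end of list */
        else
            return -1;
    }
    return 0;
}
```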

== 2. Ctrl threads via default_set
app# LD_LIBRARY_PATH=$PWD/../lib:$LD_LIBRARY_PATH taskset -c 3,79 ./testpmd --master-lcore 0 --lcores "(0,19)@(19,1,2)"
EAL: Detected 79 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: default_set: 3
EAL: cset_online: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78
EAL: cset_non_busy: 0,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127
EAL: cpuset: 3
EAL: cpuset fallback: 3
...
^Z
app# grep -HE '^(Cpus_allowed_list|Name):' /proc/54032/task/*/status 
/proc/54032/task/54032/status:Name:     testpmd
/proc/54032/task/54032/status:Cpus_allowed_list:        1-2,19
/proc/54032/task/54033/status:Name:     eal-intr-thread
/proc/54032/task/54033/status:Cpus_allowed_list:        3
/proc/54032/task/54034/status:Name:     rte_mp_handle
/proc/54032/task/54034/status:Cpus_allowed_list:        3
/proc/54032/task/54035/status:Name:     lcore-slave-19
/proc/54032/task/54035/status:Cpus_allowed_list:        1-2,19
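
The steps printed in the two runs can be summarised as: take the initial affinity (default_set), remove every CPU used by an lcore, and fall back to the master lcore's affinity if nothing is left. The sketch below is inferred from the debug output above and is not the EAL code itself; ctrl_thread_affinity and its parameters are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Choose the affinity for control threads.
 *   default_set: affinity the process started with (e.g. from taskset)
 *   lcore_sets:  affinity of each configured lcore (n entries)
 *   master_set:  affinity of the master lcore, used as the fallback
 *   out:         resulting control-thread affinity */
static void ctrl_thread_affinity(const cpu_set_t *default_set,
                                 const cpu_set_t *lcore_sets, size_t n,
                                 const cpu_set_t *master_set,
                                 cpu_set_t *out)
{
    cpu_set_t busy;

    assert(default_set != NULL && master_set != NULL && out != NULL);

    /* Union of all CPUs claimed by lcores. */
    CPU_ZERO(&busy);
    for (size_t i = 0; i < n; i++)
        CPU_OR(&busy, &busy, &lcore_sets[i]);

    /* Keep the CPUs of default_set that no lcore uses. */
    CPU_ZERO(out);
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, default_set) && !CPU_ISSET(cpu, &busy))
            CPU_SET(cpu, out);

    /* Nothing left: fall back to the master lcore affinity. */
    if (CPU_COUNT(out) == 0)
        *out = *master_set;
}
```

With default_set = {19} and lcores on {1,2,3,19} (run 1) the intersection is empty and the result is the fallback {1,2,3,19}; with default_set = {3} and lcores on {1,2,19} (run 2) the result is {3}.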

BR
Johan

-----Original Message-----
From: David Marchand [mailto:david.marchand@redhat.com] 
Sent: July 30, 2019 15:48
To: Johan Källström <johan.kallstrom@ericsson.com>
Cc: dev@dpdk.org; anatoly.burakov@intel.com; olivier.matz@6wind.com; stable@dpdk.org
Subject: Re: [PATCH] eal: fix ctrl thread affinity with --lcores

On Tue, Jul 30, 2019 at 1:38 PM Johan Källström <johan.kallstrom@ericsson.com> wrote:
> The CPU failsafe is nice to have as you could set the thread affinity to offline cpus.

Created a "dpdk" cpuset and put cpus 4-7 into it (my system is mono numa with 8 cpus)
# cd /sys/fs/cgroup/cpuset/
# mkdir dpdk
# cd dpdk
# echo 4-7 > cpuset.cpus
# echo 0 > cpuset.mems

Disabled cpu 5.
# echo 0 > /sys/bus/cpu/devices/cpu5/online

Put my shell that starts testpmd in this dpdk cpuset
# echo 4439 > tasks


EAL refuses an offline core when parsing the thread affinities and this did not change.

$ ./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' --log-level *:debug --no-huge  --no-pci -m 512 -- -i --total-num-mbufs=2048
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 0 on socket 0
EAL: Detected lcore 6 as core 2 on socket 0
EAL: Detected lcore 7 as core 3 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 7 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: core 5 unavailable
EAL: invalid parameter for --lcores

What did I miss?


>
> Maybe also add the example I gave you to trigger the bug? 
> https://bugs.dpdk.org/show_bug.cgi?id=322#c12

I managed to reproduce your error with the setup above (without relying on the cset tool that is not available on rhel afaics), I can add it to the commitlog yes.


> This also shows how to set the default_affinity mask and proves that the calculation will result in threads inside the cpuset on Linux.
>
> /Johan
>
> On Tue, 2019-07-30 at 11:35 +0200, David Marchand wrote:
> > When using -l/-c options, each lcore is mapped to a physical cpu in a
> > 1:1 fashion.
> > On the contrary, when using --lcores, each lcore has its own cpuset
>
> Use "thread affinity" instead of cpuset when we talk about setting the thread affinity.
>
> I know that the term cpuset is used in the data structure, but it is not a cpuset as described by 'man cpuset' (on Linux). This comment can be seen as cosmetic, but I think it would be good to have clear definitions to minimize confusion.

Indeed, using cpuset is inappropriate.
I will update the commitlog and the comment.



--
David Marchand

Thread overview: 10+ messages
2019-07-30  9:35 David Marchand
2019-07-30  9:45 ` Jerin Jacob Kollanukkaran
2019-07-30  9:46   ` David Marchand
2019-07-30 11:38 ` Johan Källström
2019-07-30 13:47   ` David Marchand
2019-07-30 16:32     ` Johan Källström [this message]
2019-07-30 19:21       ` David Marchand
2019-07-31  8:12         ` Johan Källström
2019-07-30 15:05 ` [dpdk-dev] [PATCH v2] " David Marchand
2019-07-30 21:12   ` [dpdk-dev] [dpdk-stable] " Thomas Monjalon
