DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
To: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	"Mattias Rönnblom" <hofors@lysator.liu.se>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "harry.van.haaren@intel.com" <harry.van.haaren@intel.com>,
	nd <nd@arm.com>
Subject: Re: Service core statistics MT safety
Date: Tue, 28 Jun 2022 18:30:12 +0000	[thread overview]
Message-ID: <d88035e5-f9a7-bac1-052e-91d9871540b6@ericsson.com> (raw)
In-Reply-To: <DBAPR08MB5814C22F5B93785BFD4D840D98B89@DBAPR08MB5814.eurprd08.prod.outlook.com>

On 2022-06-28 17:24, Honnappa Nagarahalli wrote:
> <snip>
> 
>>>>>>>>
>>>>>>>>> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
>>>>>>>>> Sent: Monday, 27 June 2022 13.06
>>>>>>>>>
>>>>>>>>> Hi.
>>>>>>>>>
>>>>>>>>> Is it safe to enable stats on MT safe services?
>>>>>>>>>
>>>>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122-
>> 313273af
>>>>>>>>> -
>>>> 4
>>>>>>>>> 54445555731-6096fdb16385f88f&q=1&e=27b94605-d1e2-40b6-
>> af6d-
>>>> 9ebc54d
>>>>>>>>>
>>>>
>> 5db18&u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%
>>>> 2Flib
>>>>>>>>> %2Feal%2Fcommon%2Frte_service.c%23
>>>>>>>>> L3
>>>>>>>>> 6
>>>>>>>>> 6
>>>>>>>>>
>>>>>>>>> It seems to me this would have to be an __atomic_add for this
>>>>>>>>> code to produce deterministic results.
>>>>>>>>
>>>>>>>> I agree. The same goes for the 'calls' field.
>>>>>>> The calling function does the locking.
>>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af-
>>>> 454
>>>>>>> 445555731-5ce419f8bf9f9b76&q=1&e=27b94605-d1e2-40b6-af6d-
>>>> 9ebc54d5db1
>>>>>>>
>>>>
>> 8&u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib
>>>> %2Feal
>>>>>>> %2Fcommon%2Frte_service.c%23L3
>>>>>>> 98
>>>>>>>
>>>>>>> For more information you can look at:
>>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af-
>>>> 454
>>>>>>> 445555731-ba0d1416f08856f0&q=1&e=27b94605-d1e2-40b6-af6d-
>>>> 9ebc54d5db1
>>>>>>>
>>>>
>> 8&u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib
>>>> %2Feal
>>>>>>> %2Finclude%2Frte_service.h%23L
>>>>>>> 120
>>>>>>>
>>>>>>
>>>>>> What about the
>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af-
>>>> 4544
>>>>>> 45555731-b64334addc78c264&q=1&e=27b94605-d1e2-40b6-af6d-
>>>> 9ebc54d5db18&
>>>>>>
>>>>
>> u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib%2
>>>> Feal%2F
>>>>>> common%2Frte_service.c%23L404
>>>>>> call (for MT safe services)?
>>>>>>
>>>>>> There's no lock held there.
>>>>> Good point.
>>>>> This is the case where the service running in service cores is MT
>>>>> safe. However,
>>>> the stats are incremented outside of the MT Safety mechanism employed
>>>> by the service. So, yes, this and other updates in the function
>>>> 'service_runner_do_callback' need to be updated atomically.
>>>>
>>>> Maybe a better solution would be to move this to the core_state
>>>> struct (and eliminate the "calls" field since the same information is
>>>> already in the core_state struct). The contention on these cache
>>>> lines will be pretty crazy for services with short run time (say a
>>>> thousand cycles or less per call), assuming they are mapped to many cores.
>>> That's one option, the structures are internal as well. With this option stats
>> need to be aggregated which will not give an accurate view. But, that is the
>> nature of the statistics.
>>>
>>
>> Per-core counters is a very common pattern. Used for Linux MIB counters, for
>> example. I'm not sure I think it's much less accurate. I mean, you just load in
>> quick succession what's globally visible for the different per-lcore counters. If
>> you do a relaxed store on an ARM, how long time does it take until it's seen by
>> someone doing a relaxed load on a different core? Roughly.
> I think my explanation of the problem is not clear.
> 
> If a service is running on more than one core and the stats are per core, when we aggregate, the resulting statistics is not atomic. By making the stats per core, we will be taking out that feature which is present currently (even though it is implemented incorrectly). As we agree, the proposed change is a common pattern and taking away the atomicity of the stats might not be a problem.
> 

Isn't it just a push model, versus a pull one? Both give just an 
approximation, albeit a very good one, of how many cycles are spent 
"now" for a particular service. Isn't time a local phenomena in a SMP 
system, and there is no global "now"? Maybe you can achieve such with a 
transaction or handshake of some sort, but I don't see how the an 
__atomic_add would be enough.

I was fortunate to get some data from a real-world application, and 
enabling service core stats resulted in a 7% degradation of overall 
system capacity. I'm guessing atomic instructions would not make things 
better.

>>
>>> I am also wondering if these stats are of any use other than for debugging.
>> Adding a capability to disable stats might help as well.
>>>
>>
>> They could be used as a crude tool to determine service core utilization.
>> Comparing utilization between different services running on the same core
>> should be straight-forward, but lcore utilization is harder in absolute terms. If
>> you just look at "cycles", a completely idle core would look like it's very busy
>> (basically rdtsc latency added for every loop). I assume you'd have to do some
>> heuristic based on both "calls" and "cycles" to get an estimate.
>>
>> I think service core utilization would be very useful, although that would require
>> some changes in the service function signature, so the service can report back if
>> it did some useful work for a particular call.
>>
>> That would make for a DPDK 'top'. Just like 'top', it can't impose any serious
>> performance degradation when used, to be really useful, I think.
>>
>> Sure, it should be possible to turn it on and off. I thought that was the case
>> already?
> Thanks, yes, this exists already. Though the 'loops' counter is out of the stats enable check, looks like it is considered as an attribute for some reason.
> 
>>
>>>>
>>>> Idle service cores will basically do nothing else than stall waiting
>>>> for these lines, I suspect, hampering the progress of more busy cores.
> 


  reply	other threads:[~2022-06-28 18:30 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-27 11:05 Mattias Rönnblom
2022-06-27 12:31 ` Morten Brørup
2022-06-27 17:39   ` Honnappa Nagarahalli
2022-06-27 17:47     ` Mattias Rönnblom
2022-06-27 18:19       ` Honnappa Nagarahalli
2022-06-27 20:00         ` Mattias Rönnblom
2022-06-28  0:14           ` Honnappa Nagarahalli
2022-06-28  6:52             ` Mattias Rönnblom
2022-06-28 15:24               ` Honnappa Nagarahalli
2022-06-28 18:30                 ` Mattias Rönnblom [this message]
2022-06-28 19:15                   ` Honnappa Nagarahalli
2022-06-29  6:34                     ` Mattias Rönnblom
2022-06-29 20:07                       ` Honnappa Nagarahalli
2022-06-30  8:46                         ` Van Haaren, Harry
2022-07-01 18:37                           ` Honnappa Nagarahalli
2022-07-01 20:51                             ` Morten Brørup
2022-07-05 17:09                               ` Van Haaren, Harry
2022-07-07 17:26                                 ` Honnappa Nagarahalli
2022-07-08 12:58                                   ` Van Haaren, Harry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d88035e5-f9a7-bac1-052e-91d9871540b6@ericsson.com \
    --to=mattias.ronnblom@ericsson.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=dev@dpdk.org \
    --cc=harry.van.haaren@intel.com \
    --cc=hofors@lysator.liu.se \
    --cc=mb@smartsharesystems.com \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).