* Re: Using rdtsc to timestamp RTT of packets
From: Gabor LENCSE @ 2023-03-06 7:56 UTC (permalink / raw)
To: users
Please see my comments inline.
On 3/6/2023 2:01 AM, fwefew 4t4tg wrote:
> I convinced myself that a viable way to measure the RTT between a
> request packet and its response packet is the difference between
> two Intel rdtsc calls.
I think it is a good solution: computationally inexpensive and accurate.
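For illustration, a rough sketch of the idea in C (not my actual siitperf code; send_request()/wait_for_response() are placeholder helpers, and rte_eal_init() is assumed to have run already):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <rte_cycles.h>   /* rte_rdtsc(), rte_get_tsc_hz() */

    /* Placeholder application I/O; defined elsewhere. */
    extern void send_request(void);
    extern void wait_for_response(void);

    static void measure_rtt(void)
    {
        const uint64_t hz = rte_get_tsc_hz();  /* TSC ticks per second */

        const uint64_t start = rte_rdtsc();    /* just before tx of the request */
        send_request();
        wait_for_response();                   /* poll rx for the matching reply */
        const uint64_t end = rte_rdtsc();      /* read on the same core as start */

        const uint64_t cycles = end - start;
        printf("RTT: %" PRIu64 " cycles = %.3f us\n",
               cycles, (double)cycles * 1e6 / (double)hz);
    }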
>
> The restrictions to valid use include:
>
> * RTT (time difference) must be calculated on the same CORE
>
Yes, it is safe.
But I think it is also OK if the two cores used for taking the send
and receive TSC readings belong to the same physical CPU. At least it
works well for me in my measurement program called siitperf.
> * fencing instructions (lfence) could be required
>
But fencing can also decrease the performance of your program.
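If you do need the ordering guarantee, a fenced read can look like this (a sketch using GCC/Clang x86 intrinsics; the extra lfence is the performance cost mentioned above):

    #include <stdint.h>
    #include <x86intrin.h>   /* _mm_lfence(), __rdtsc() */

    /* Serialized TSC read: the lfence keeps rdtsc from executing
     * ahead of earlier instructions, at a cost of a few cycles. */
    static inline uint64_t rdtsc_fenced(void)
    {
        _mm_lfence();
        return __rdtsc();
    }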
> The time difference is OK provided,
>
> * it delivers at least microsecond resolution (rdtsc does)
> * the difference is always positive (end-start) or zero
> * the details of whether the clock runs or does not run at the
> processor speed is not material so long as there's sufficient
> resolution
> * DPDK gives me the frequency via rte_get_tsc_hz(); this way I can
> convert from a rdtsc difference to elapsed time
> * The OS doesn't reset the counter or pause it for interrupts or on
> halts
>
> I think rdtsc does all this. But then I read [1]:
>
> * The TSC is not always invariant
> * And of course context switches (if a thread is not pinned to a
> core) will invalidate any time difference
>
I start the packet sender and receiver threads with
rte_eal_remote_launch() calls on the appropriate cores, and it works
fine for me.
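A sketch of that launch pattern (the lcore ids and worker bodies below are placeholders, not the actual siitperf code):

    #include <rte_launch.h>  /* rte_eal_remote_launch(), rte_eal_wait_lcore() */

    /* Each worker stays pinned on its lcore for the whole measurement,
     * so both TSC readings for a packet are taken there. */
    static int sender_loop(void *arg)   { (void)arg; /* tx loop */ return 0; }
    static int receiver_loop(void *arg) { (void)arg; /* rx loop */ return 0; }

    static void start_workers(void)
    {
        /* lcore ids 1 and 2 are illustrative; take them from your coremask. */
        rte_eal_remote_launch(sender_loop,   NULL, 1);
        rte_eal_remote_launch(receiver_loop, NULL, 2);

        rte_eal_wait_lcore(1);   /* block until each worker returns */
        rte_eal_wait_lcore(2);
    }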
>
> * The TSC is not incremented when the processor enters a deep sleep.
> I don't care about this because I'll turn off the power saving
> modes anyway
>
> So I am not so sure.
I set the CPU clock frequency to a fixed value. (If I cannot do it
from the BIOS, I use the tlp Linux package.)
>
> Now, of course, Mellanox can report time stamps. Is it actually
> possible to get HW NIC timestamps reported for every packet sent and
> received without overburdening the NIC? Based on what I can see for my
> case (ConnectX-4 Lx), the resolution is nanoseconds. So I am tempted not to
> fool around with rdtsc and just use NIC timestamps.
>
> What is common practice in DPDK programming when one needs RTTs?
I have implemented both the Latency and PDV (Packet Delay Variation)
measurements of RFC 8219 in siitperf using RDTSC. I am satisfied with
the result.
If you are interested, you can find the source code here:
https://github.com/lencsegabor/siitperf
And its latest version is documented in my paper:
http://www.hit.bme.hu/~lencse/publications/ECC-2022-SFNATxy-Tester-published.pdf
Best regards,
Gábor
>
> [1]
> https://stackoverflow.com/questions/42189976/calculate-system-time-using-rdtsc
* RE: Using rdtsc to timestamp RTT of packets
From: Van Haaren, Harry @ 2023-03-06 9:46 UTC (permalink / raw)
To: fwefew 4t4tg, users; +Cc: Gábor LENCSE
> From: fwefew 4t4tg <7532yahoo@gmail.com>
> Sent: Monday, March 6, 2023 1:01 AM
> To: users@dpdk.org
> Subject: Using rdtsc to timestamp RTT of packets
> I convinced myself that a viable way to measure the RTT between a request packet and its response packet is the difference between two Intel rdtsc calls.
> The restrictions to valid use include:
> • RTT (time difference) must be calculated on the same CORE
For all CPU generations that you likely care about (check the output of lscpu | egrep "constant_tsc|nonstop_tsc"), all CPUs on the same socket share a single TSC, and it never stops.
> • DPDK gives me the frequency via rte_get_tsc_hz(); this way I can convert from a rdtsc difference to elapsed time
Correct, this gives the reference needed to convert "N TSC ticks" into real-world time, i.e. some fraction of a second.
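One practical note (a sketch; assumes rte_eal_init() has already run so rte_get_tsc_hz() is valid): compute the scale factor once at startup, so the per-packet conversion is a single multiply rather than a 64-bit division:

    #include <stdint.h>
    #include <rte_cycles.h>  /* rte_get_tsc_hz() */

    static double ns_per_cycle;  /* set once after rte_eal_init() */

    static void init_tsc_scale(void)
    {
        ns_per_cycle = 1e9 / (double)rte_get_tsc_hz();
    }

    /* Converts a TSC delta to nanoseconds with one multiply. */
    static inline double cycles_to_ns(uint64_t cycles)
    {
        return (double)cycles * ns_per_cycle;
    }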
<snip>
> • The TSC is not always invariant
The TSC is invariant/constant and non-stop when the lscpu check above passes - so you can confirm this on your platform.
> • And of course context switches (if a thread is not pinned to a core) will invalidate any time difference
Context switches will not invalidate your timestamp - but if you're reacting to an interrupt which *then* takes the "end" timestamp, then yes, the result is invalid/noisy. In other words, context-switch jitter will make the measurement much noisier - whether that counts as "invalid" depends on your requirements. Typically, as DPDK uses pinned & polling cores, this isn't a problem.
> Now, of course, Mellanox can report time stamps. Is it actually possible to get HW NIC timestamps reported for every packet sent and received without overburdening the NIC?
> Based on what I can see for my case (ConnectX-4 Lx), the resolution is nanoseconds. So I am tempted to not fool around with rdtsc and just use NIC timestamps.
> What is common practice in DPDK programming when one needs RTTs?
Some NICs have 1588 capabilities; these can timestamp *specific* packets on the wire at ns resolution. They typically cannot timestamp *every* packet, or even one in every ~million packets... but if measuring every Nth packet (with large N) is enough, it might be worth trying, provided the resolution and a "closest-to-the-wire" timestamp are really required.
I would recommend going with a SW + TSC approach for flexibility & ease of use; deploying in a cloud/virtio NIC environment means that PTP 1588 or HW timestamping features are not available, so TSC is the "only" way to go there.
PTP 1588: https://doc.dpdk.org/api/rte__ethdev_8h.html#ad7aaa6d846a6e71edb4f6c737cdf60c7 and examples/ptpclient; for reference, there is also a HW PTP timestamping paper by the "moongen" authors (section 6): https://arxiv.org/pdf/1410.3322.pdf
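For reference, the ethdev timesync calls that examples/ptpclient builds on look roughly like this (error handling trimmed; the port id and the rx "flags"/timestamp-register index are placeholders, and PMD support varies):

    #include <time.h>
    #include <rte_ethdev.h>  /* rte_eth_timesync_*() */

    /* Sketch: read HW timestamps for a 1588-timestamped packet pair.
     * Works only for packets the NIC actually timestamped (e.g. PTP
     * frames), not arbitrary traffic, as noted above. */
    static void read_hw_timestamps(uint16_t port_id)
    {
        struct timespec tx_ts, rx_ts;

        rte_eth_timesync_enable(port_id);

        /* ... send/receive the PTP-timestamped packets here ... */

        rte_eth_timesync_read_tx_timestamp(port_id, &tx_ts);
        rte_eth_timesync_read_rx_timestamp(port_id, &rx_ts, 0 /* flags */);
    }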
On SW traffic generation, latency, histograms etc., I presented at DPDK Userspace last September on the topic: https://www.youtube.com/watch?v=Djcjq59H1uo
My conclusion (in the Q&A at the end) is that for my use-cases, using TSC was the simpler/faster/easier option, and that its resolution was much higher (more than I required) for RTT in packet-processing applications.
Hope that helps! -Harry
PS: apologies for the formatting; it seems an HTML-like mail was sent originally? Please send plain-text emails to mailing lists.