DPDK usage discussions
* Accuracy of rte_get_tsc_hz() compared to linux
@ 2024-09-18 22:04 Isaac Boukris
  2024-09-18 23:27 ` Stephen Hemminger
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Isaac Boukris @ 2024-09-18 22:04 UTC
  To: users

Hi,

On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).

The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
timestamps to lag behind real time (roughly a sec per 10 min). I
noticed the kernel uses 2095082 KHz and in fact it gives much better
results.
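
For reference, the size of the lag follows from the two frequencies alone. The sketch below is only an illustration of that arithmetic; tsc_lag_seconds() is a hypothetical helper, not a DPDK API, and it assumes the kernel's 2095082 KHz is the true rate.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK API): seconds by which a TSC-derived
 * clock falls behind after `elapsed_s` real seconds, when cycles are
 * converted with `assumed_khz` while the counter really ticks at
 * `actual_khz`. A negative result means the clock runs ahead.
 */
static double
tsc_lag_seconds(uint64_t assumed_khz, uint64_t actual_khz, double elapsed_s)
{
	/* kilo-cycles actually counted during elapsed_s real seconds */
	double kcycles = (double)actual_khz * elapsed_s;
	/* time the application believes has passed */
	double believed_s = kcycles / (double)assumed_khz;

	return elapsed_s - believed_s;
}
```

With assumed_khz = 2100000, actual_khz = 2095082 and elapsed_s = 600, this gives a lag of about 1.4 seconds, in line with the "roughly a sec per 10 min" observed above.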

dmesg:
tsc: Detected 2095.082 MHz processor

tsc_freq_khz (custom kmod that exposes the kernel's tsc_khz):
cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
2095082

I changed the dpdk code to print more when it initializes
eal_tsc_resolution_hz and lowered the rounding of the estimations, as
follows:

git diff
diff --git a/lib/eal/common/eal_common_timer.c b/lib/eal/common/eal_common_timer.c
index c5c4703f15..faf1efc90f 100644
--- a/lib/eal/common/eal_common_timer.c
+++ b/lib/eal/common/eal_common_timer.c
@@ -38,7 +38,7 @@ rte_get_tsc_hz(void)
 static uint64_t
 estimate_tsc_freq(void)
 {
-#define CYC_PER_10MHZ 1E7
+#define CYC_PER_10MHZ 1E3
        EAL_LOG(WARNING, "WARNING: TSC frequency estimated roughly"
                " - clock timings may be less accurate.");
        /* assume that the rte_delay_us_sleep() will sleep for 1 second */
@@ -71,6 +71,10 @@ set_tsc_freq(void)
        if (!freq)
                freq = estimate_tsc_freq();

+       EAL_LOG(DEBUG, "TSC frequency arch ~%" PRIu64 " KHz", get_tsc_freq_arch() / 1000);
+       EAL_LOG(DEBUG, "TSC frequency linux ~%" PRIu64 " KHz", get_tsc_freq() / 1000);
+       EAL_LOG(DEBUG, "TSC frequency estimate ~%" PRIu64 " KHz", estimate_tsc_freq() / 1000);
+
        EAL_LOG(DEBUG, "TSC frequency is ~%" PRIu64 " KHz", freq / 1000);
        eal_tsc_resolution_hz = freq;
        mcfg->tsc_hz = freq;
diff --git a/lib/eal/linux/eal_timer.c b/lib/eal/linux/eal_timer.c
index 1cb1e92193..9254c901b8 100644
--- a/lib/eal/linux/eal_timer.c
+++ b/lib/eal/linux/eal_timer.c
@@ -192,9 +192,9 @@ get_tsc_freq(void)
 {
 #ifdef CLOCK_MONOTONIC_RAW
 #define NS_PER_SEC 1E9
-#define CYC_PER_10MHZ 1E7
+#define CYC_PER_10MHZ 1E3

-       struct timespec sleeptime = {.tv_nsec = NS_PER_SEC / 10 }; /* 1/10 second */
+       struct timespec sleeptime = {.tv_sec = 1 }; /* 1 second */

        struct timespec t_start, t_end;
        uint64_t tsc_hz;


I've run the helloworld application on an isolated cpu:
taskset -c 10 ./dpdk-helloworld --log-level=lib.eal:debug --no-huge

The results are:
EAL: TSC frequency arch ~2100000 KHz
EAL: TSC frequency linux ~2095082 KHz
EAL: TSC frequency estimate ~2095346 KHz

The arch one is picked, which seems rather wrong, any way to override that?
Should we lower the estimation rounding to 1MHz or even 1KHz in the linux one?

Thoughts? Thanks!

Kernel: 4.18.0-513.9.1.el8_9.x86_64

lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              64
On-line CPU(s) list: 0-63
Thread(s) per core:  2
Core(s) per socket:  16
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel(R) Corporation
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
BIOS Model name:     Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
Stepping:            4
CPU MHz:             2100.000
BogoMIPS:            4200.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            22528K
NUMA node0 CPU(s):   0-15,32-47
NUMA node1 CPU(s):   16-31,48-63
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts
rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq
dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm
pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes
xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3
cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp
tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1
hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq
rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl
xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total
cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d
arch_capabilities

* Re: Accuracy of rte_get_tsc_hz() compared to linux
  2024-09-18 22:04 Accuracy of rte_get_tsc_hz() compared to linux Isaac Boukris
@ 2024-09-18 23:27 ` Stephen Hemminger
  2024-09-19  9:37   ` Isaac Boukris
  2024-09-19 12:26   ` Isaac Boukris
  2024-09-19 21:53 ` Stephen Hemminger
  2024-09-19 22:02 ` Stephen Hemminger
  2 siblings, 2 replies; 8+ messages in thread
From: Stephen Hemminger @ 2024-09-18 23:27 UTC
  To: Isaac Boukris; +Cc: users

On Thu, 19 Sep 2024 01:04:40 +0300
Isaac Boukris <iboukris@gmail.com> wrote:

> I've run the helloworld application on an isolated cpu:
> taskset -c 10 ./dpdk-helloworld --log-level=lib.eal:debug --no-huge
> 
> The results are:
> EAL: TSC frequency arch ~2100000 KHz
> EAL: TSC frequency linux ~2095082 KHz
> EAL: TSC frequency estimate ~2095346 KHz
> 
> The arch one is picked, which seems rather wrong, any way to override that?
> Should we lower the estimation rounding to 1MHz or even 1KHz in the linux one?
> 
> Thoughts? Thanks!
> 
> Kernel: 4.18.0-513.9.1.el8_9.x86_64

Note: the 4.18 kernel was released 12 August 2018 and went end of life
later that year. I assume this is RHEL8, which does its own backports
and never changes the kernel version.

What is the kernel dmesg, why is it deciding on that value?


* Re: Accuracy of rte_get_tsc_hz() compared to linux
  2024-09-18 23:27 ` Stephen Hemminger
@ 2024-09-19  9:37   ` Isaac Boukris
  2024-09-19 12:26   ` Isaac Boukris
  1 sibling, 0 replies; 8+ messages in thread
From: Isaac Boukris @ 2024-09-19  9:37 UTC
  To: Stephen Hemminger; +Cc: users

On Thu, Sep 19, 2024 at 2:27 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Thu, 19 Sep 2024 01:04:40 +0300
> Isaac Boukris <iboukris@gmail.com> wrote:
>
> > I've run the helloworld application on an isolated cpu:
> > taskset -c 10 ./dpdk-helloworld --log-level=lib.eal:debug --no-huge
> >
> > The results are:
> > EAL: TSC frequency arch ~2100000 KHz
> > EAL: TSC frequency linux ~2095082 KHz
> > EAL: TSC frequency estimate ~2095346 KHz
> >
> > The arch one is picked, which seems rather wrong, any way to override that?
> > Should we lower the estimation rounding to 1MHz or even 1KHz in the linux one?
> >
> > Thoughts? Thanks!
> >
> > Kernel: 4.18.0-513.9.1.el8_9.x86_64
>
> Note: the 4.18 kernel was released 12 August 2018 and went end of life
> later that year. I assume this is RHEL8, which does its own backports
> and never changes the kernel version.

Indeed RHEL 8.9

> What is the kernel dmesg, why is it deciding on that value?

It comes from kernel's determine_cpu_tsc_frequencies() afaict (which
didn't change that much).

As a matter of fact, I got a similar behavior on my vmware VM on my
laptop (although smaller diff).

kernel: 4.18.0-553.8.1.el8_10.x86_64

lscpu:
Model name:          12th Gen Intel(R) Core(TM) i7-1260P
Stepping:            3
CPU MHz:             2495.994
BogoMIPS:            4991.98
Hypervisor vendor:   VMware

dmesg | grep -i tsc
[    0.000000] vmware: TSC freq read from hypervisor : 2495.994 MHz
[    0.000000] tsc: Detected 2495.994 MHz processor
[    0.000000] TSC deadline timer available
[    0.010000] clocksource: tsc-early: mask: 0xffffffffffffffff
max_cycles: 0x23fa717cb36, max_idle_ns: 440795237972 ns
[    0.905000] clocksource: Switched to clocksource tsc-early
[    3.104381] tsc: Refined TSC clocksource calibration: 2495.990 MHz
[    3.105473] clocksource: tsc: mask: 0xffffffffffffffff max_cycles:
0x23fa6db1dfc, max_idle_ns: 440795265852 ns
[    3.264297] clocksource: Switched to clocksource tsc

cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
2495990

sudo bpftrace -e 'BEGIN { printf("%u\n", *kaddr("tsc_khz")); exit(); }'
Attaching 1 probe...
2495990

My modified dpdk code above gives:
EAL: TSC frequency arch ~0 KHz
EAL: TSC frequency linux ~2495982 KHz
EAL: TSC frequency estimate ~2497263 KHz

Note that with the unmodified dpdk code which rounds to 10MHz both the
linux and the common estimation would give 2500000 KHz.
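
The effect of the rounding step on these measurements can be reproduced with a few lines of C. This is a minimal sketch, not the DPDK implementation; round_to_nearest() is a hypothetical stand-in for whatever rounding the EAL applies with CYC_PER_10MHZ.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the rounding applied to a measured TSC
 * frequency: round `hz` to the nearest multiple of `step` (both in Hz).
 * `step` must be non-zero.
 */
static uint64_t
round_to_nearest(uint64_t hz, uint64_t step)
{
	return ((hz + step / 2) / step) * step;
}
```

With a step of 1E7 (10 MHz), both 2495982000 Hz and 2497263000 Hz collapse to 2500000000 Hz, which hides the measured value. With a step of 1E3 (1 KHz), the measured value survives to KHz precision.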

* Re: Accuracy of rte_get_tsc_hz() compared to linux
  2024-09-18 23:27 ` Stephen Hemminger
  2024-09-19  9:37   ` Isaac Boukris
@ 2024-09-19 12:26   ` Isaac Boukris
  2024-09-19 13:04     ` Isaac Boukris
  1 sibling, 1 reply; 8+ messages in thread
From: Isaac Boukris @ 2024-09-19 12:26 UTC
  To: Stephen Hemminger; +Cc: users

On Thu, Sep 19, 2024 at 2:27 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Thu, 19 Sep 2024 01:04:40 +0300
> Isaac Boukris <iboukris@gmail.com> wrote:
>
> > I've run the helloworld application on an isolated cpu:
> > taskset -c 10 ./dpdk-helloworld --log-level=lib.eal:debug --no-huge
> >
> > The results are:
> > EAL: TSC frequency arch ~2100000 KHz
> > EAL: TSC frequency linux ~2095082 KHz
> > EAL: TSC frequency estimate ~2095346 KHz
> >
> > The arch one is picked, which seems rather wrong, any way to override that?
> > Should we lower the estimation rounding to 1MHz or even 1KHz in the linux one?
> >
> > Thoughts? Thanks!
> >
> > Kernel: 4.18.0-513.9.1.el8_9.x86_64
>
> Note: the 4.18 kernel was released 12 August 2018 and went end of life
> later that year. I assume this is RHEL8, which does its own backports
> and never changes the kernel version.
>
> What is the kernel dmesg, why is it deciding on that value?

Actually, this is the boot dmesg on the machine itself (the previous
log was from a kvm on that machine).

# journalctl -b --system | grep -i tsc
Sep 15 17:50:16 localhost kernel: tsc: Detected 2100.000 MHz processor
Sep 15 17:50:16 localhost kernel: TSC deadline timer available
Sep 15 17:50:16 localhost kernel: clocksource: tsc-early: mask:
0xffffffffffffffff max_cycles: 0x1e4530a99b6, max_idle_ns:
440795257976 ns
Sep 15 17:50:16 localhost kernel: clocksource: Switched to clocksource tsc-early
Sep 15 17:50:16 localhost kernel: tsc: Refined TSC clocksource
calibration: 2095.082 MHz
Sep 15 17:50:16 localhost kernel: clocksource: tsc: mask:
0xffffffffffffffff max_cycles: 0x1e330abbade, max_idle_ns:
440795251159 ns
Sep 15 17:50:16 localhost kernel: clocksource: Switched to clocksource tsc

So it looks like the kernel refines the value through calibration. We
could do the same by preferring the linux estimation result over the
arch one (and lowering the rounding to 1MHz). Alternatively, we could
find a way to read the kernel's value, or allow setting the value
manually at init (as an eal param).
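
The core of such a calibration is a single division: TSC cycles counted over an interval, divided by the length of that interval on a reference clock. The sketch below shows only that step, under the assumption that the caller has already sampled the TSC and a reference clock such as CLOCK_MONOTONIC_RAW before and after a sleep. tsc_hz_from_deltas() is a hypothetical name, not an existing DPDK function.

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: derive a TSC frequency in Hz from the number of
 * cycles counted during `elapsed_ns` nanoseconds of reference time.
 * Integer math avoids floating-point rounding; the intermediate product
 * stays within 64 bits for intervals up to roughly 18 seconds.
 * Returns 0 if the interval is empty.
 */
static uint64_t
tsc_hz_from_deltas(uint64_t cycles, uint64_t elapsed_ns)
{
	uint64_t whole, rem;

	if (elapsed_ns == 0)
		return 0;

	whole = cycles / elapsed_ns;	/* whole cycles per nanosecond */
	rem = cycles % elapsed_ns;	/* leftover cycles, < elapsed_ns */

	return whole * NS_PER_SEC + (rem * NS_PER_SEC) / elapsed_ns;
}
```

A longer interval reduces the relative weight of the sampling jitter at both ends, which is presumably why the 1 second sleep in the modified code tracks the kernel's value more closely than the rounded 1/10 second measurement.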

* Re: Accuracy of rte_get_tsc_hz() compared to linux
  2024-09-19 12:26   ` Isaac Boukris
@ 2024-09-19 13:04     ` Isaac Boukris
  2024-09-19 18:33       ` Isaac Boukris
  0 siblings, 1 reply; 8+ messages in thread
From: Isaac Boukris @ 2024-09-19 13:04 UTC
  To: Stephen Hemminger; +Cc: users

On an older laptop (i7-8650U), running fedora with kernel 6.8.9-100.fc38.x86_64

journalctl -b | grep -i tsc
Aug 29 12:34:00 localhost.localdomain kernel: tsc: Detected 2100.000
MHz processor
Aug 29 12:34:00 localhost.localdomain kernel: tsc: Detected 2099.944 MHz TSC
Aug 29 12:34:00 localhost.localdomain kernel: TSC deadline timer available
Aug 29 12:34:00 localhost.localdomain kernel: clocksource: tsc-early:
mask: 0xffffffffffffffff max_cycles: 0x1e44fb6c2ab, max_idle_ns:
440795206594 ns
Aug 29 12:34:00 localhost.localdomain kernel: clocksource: Switched to
clocksource tsc-early
Aug 29 12:34:00 localhost.localdomain kernel: tsc: Refined TSC
clocksource calibration: 2112.000 MHz

sudo bpftrace -e 'BEGIN { printf("%u\n", *kaddr("tsc_khz")); exit(); }'
Attaching 1 probe...
2112000

dpdk logs before lowering the rounding:

EAL: TSC frequency arch ~0 KHz
EAL: TSC frequency linux ~2110000 KHz
EAL: TSC frequency estimate ~2110000 KHz

after lowering the rounding to 1KHz:

EAL: TSC frequency arch ~0 KHz
EAL: TSC frequency linux ~2112000 KHz
EAL: TSC frequency estimate ~2112949 KHz

* Re: Accuracy of rte_get_tsc_hz() compared to linux
  2024-09-19 13:04     ` Isaac Boukris
@ 2024-09-19 18:33       ` Isaac Boukris
  0 siblings, 0 replies; 8+ messages in thread
From: Isaac Boukris @ 2024-09-19 18:33 UTC
  To: Stephen Hemminger; +Cc: users

Looking some more at the kernel code (tsc.c), it appears that the
kernel only trusts the arch frequency if the cpu 'tsc_known_freq' flag
is set; otherwise it calibrates it (hence the "Refined" in dmesg).
None of the machines I have access to has that flag, although on some
of them dpdk's get_tsc_freq_arch() does return a value. Perhaps dpdk
should do the same as the kernel.

        /*
         * When TSC frequency is known (retrieved via MSR or CPUID), we skip
         * the refined calibration and directly register it as a clocksource.
         */
        if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
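
From user space, one way to approximate that check is to look for the flag in the "flags" line of /proc/cpuinfo. The sketch below assumes the flag is spelled tsc_known_freq there, as in the message above; cpu_flags_contain() is a hypothetical helper, not part of DPDK, and reading the line from /proc/cpuinfo is left to the caller.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `flag` appears as a whole,
 * whitespace-delimited word in `flags_line` (for example the "flags"
 * line of /proc/cpuinfo). A substring match is not enough, because
 * "tsc" is also a suffix of "constant_tsc" and "nonstop_tsc".
 */
static bool
cpu_flags_contain(const char *flags_line, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = flags_line;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		bool starts = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return true;
		p++;	/* keep searching after this partial match */
	}
	return false;
}
```

A caller could then prefer the arch value only when cpu_flags_contain(line, "tsc_known_freq") is true and fall back to a measured value otherwise.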

* Re: Accuracy of rte_get_tsc_hz() compared to linux
  2024-09-18 22:04 Accuracy of rte_get_tsc_hz() compared to linux Isaac Boukris
  2024-09-18 23:27 ` Stephen Hemminger
@ 2024-09-19 21:53 ` Stephen Hemminger
  2024-09-19 22:02 ` Stephen Hemminger
  2 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2024-09-19 21:53 UTC
  To: Isaac Boukris; +Cc: users

On Thu, 19 Sep 2024 01:04:40 +0300
Isaac Boukris <iboukris@gmail.com> wrote:

> Hi,
> 
> On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).
> 
> The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
> timestamps to lag behind real time (roughly a sec per 10 min). I
> noticed the kernel uses 2095082 KHz and in fact it gives much better
> results.

FYI there is a bug from 2022 about this:

https://bugs.dpdk.org/show_bug.cgi?id=959

* Re: Accuracy of rte_get_tsc_hz() compared to linux
  2024-09-18 22:04 Accuracy of rte_get_tsc_hz() compared to linux Isaac Boukris
  2024-09-18 23:27 ` Stephen Hemminger
  2024-09-19 21:53 ` Stephen Hemminger
@ 2024-09-19 22:02 ` Stephen Hemminger
  2 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2024-09-19 22:02 UTC
  To: Isaac Boukris; +Cc: users

On Thu, 19 Sep 2024 01:04:40 +0300
Isaac Boukris <iboukris@gmail.com> wrote:

> Hi,
> 
> On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).
> 
> The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
> timestamps to lag behind real time (roughly a sec per 10 min). I
> noticed the kernel uses 2095082 KHz and in fact it gives much better
> results.
> 
> dmesg:
> tsc: Detected 2095.082 MHz processor
> 
> tsc_freq_khz (custom kmod that exposes the kernel's tsc_khz):
> cat /sys/devices/system/cpu/cpu0/tsc_freq_khz

Rather than going off into the weeds of cpuid and whether the
reported value is correct, perhaps DPDK should just look at the
kernel sysfs files?
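
Reading such a file is straightforward. The sketch below is only an illustration: read_tsc_hz_from_khz_file() is a hypothetical helper, and the path is supplied by the caller because the tsc_freq_khz file mentioned in this thread comes from a custom kernel module, not from a stock kernel.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: read a decimal KHz value from the text file at
 * `path` and return it in Hz. Returns 0 if the file cannot be opened
 * or does not start with a number, so the caller can fall back to
 * another method.
 */
static uint64_t
read_tsc_hz_from_khz_file(const char *path)
{
	char buf[32];
	char *end;
	unsigned long long khz;
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return 0;
	if (fgets(buf, sizeof(buf), f) == NULL) {
		fclose(f);
		return 0;
	}
	fclose(f);

	errno = 0;
	khz = strtoull(buf, &end, 10);
	if (errno != 0 || end == buf)
		return 0;

	return (uint64_t)khz * 1000;
}
```

The zero return mirrors the "if (!freq)" fallback visible in the set_tsc_freq() diff earlier in the thread, so a missing file would simply lead to the existing estimation path.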

Thread overview: 8+ messages
2024-09-18 22:04 Accuracy of rte_get_tsc_hz() compared to linux Isaac Boukris
2024-09-18 23:27 ` Stephen Hemminger
2024-09-19  9:37   ` Isaac Boukris
2024-09-19 12:26   ` Isaac Boukris
2024-09-19 13:04     ` Isaac Boukris
2024-09-19 18:33       ` Isaac Boukris
2024-09-19 21:53 ` Stephen Hemminger
2024-09-19 22:02 ` Stephen Hemminger
