* Accuracy of rte_get_tsc_hz() compared to linux
@ 2024-09-18 22:04 Isaac Boukris
2024-09-18 23:27 ` Stephen Hemminger
` (3 more replies)
0 siblings, 4 replies; 16+ messages in thread
From: Isaac Boukris @ 2024-09-18 22:04 UTC (permalink / raw)
To: users
Hi,
On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).
The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
timestamps to lag behind real time (roughly a sec per 10 min). I
noticed the kernel uses 2095082 KHz and in fact it gives much better
results.
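As a rough sanity check of the drift figure, here is a minimal C sketch. The helper name tsc_clock_lag() is made up for illustration and is not a DPDK function. It assumes timestamps are derived by dividing TSC cycles by the assumed frequency while the counter really ticks at the kernel's calibrated rate.

```c
/*
 * Seconds by which a TSC-derived clock falls behind after real_seconds
 * of wall time, when cycles are converted using assumed_khz but the
 * counter actually runs at actual_khz. A positive result means the
 * derived clock lags. Hypothetical helper, for illustration only.
 */
static double
tsc_clock_lag(double assumed_khz, double actual_khz, double real_seconds)
{
	/* derived time = real time * (actual / assumed) */
	return real_seconds * (1.0 - actual_khz / assumed_khz);
}
```

Calling tsc_clock_lag(2100000.0, 2095082.0, 600.0) gives about 1.4 seconds, which is in line with the lag of roughly a second per 10 minutes.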
dmesg:
tsc: Detected 2095.082 MHz processor
tsc_freq_khz (from a custom kmod that exposes the kernel's tsc_khz):
cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
2095082
I changed the dpdk code to print more when it initializes
eal_tsc_resolution_hz and lowered the rounding of the estimations, as
follows:
git diff
diff --git a/lib/eal/common/eal_common_timer.c b/lib/eal/common/eal_common_timer.c
index c5c4703f15..faf1efc90f 100644
--- a/lib/eal/common/eal_common_timer.c
+++ b/lib/eal/common/eal_common_timer.c
@@ -38,7 +38,7 @@ rte_get_tsc_hz(void)
static uint64_t
estimate_tsc_freq(void)
{
-#define CYC_PER_10MHZ 1E7
+#define CYC_PER_10MHZ 1E3
EAL_LOG(WARNING, "WARNING: TSC frequency estimated roughly"
" - clock timings may be less accurate.");
/* assume that the rte_delay_us_sleep() will sleep for 1 second */
@@ -71,6 +71,10 @@ set_tsc_freq(void)
if (!freq)
freq = estimate_tsc_freq();
+ EAL_LOG(DEBUG, "TSC frequency arch ~%" PRIu64 " KHz", get_tsc_freq_arch() / 1000);
+ EAL_LOG(DEBUG, "TSC frequency linux ~%" PRIu64 " KHz", get_tsc_freq() / 1000);
+ EAL_LOG(DEBUG, "TSC frequency estimate ~%" PRIu64 " KHz", estimate_tsc_freq() / 1000);
+
EAL_LOG(DEBUG, "TSC frequency is ~%" PRIu64 " KHz", freq / 1000);
eal_tsc_resolution_hz = freq;
mcfg->tsc_hz = freq;
diff --git a/lib/eal/linux/eal_timer.c b/lib/eal/linux/eal_timer.c
index 1cb1e92193..9254c901b8 100644
--- a/lib/eal/linux/eal_timer.c
+++ b/lib/eal/linux/eal_timer.c
@@ -192,9 +192,9 @@ get_tsc_freq(void)
{
#ifdef CLOCK_MONOTONIC_RAW
#define NS_PER_SEC 1E9
-#define CYC_PER_10MHZ 1E7
+#define CYC_PER_10MHZ 1E3
- struct timespec sleeptime = {.tv_nsec = NS_PER_SEC / 10 }; /* 1/10 second */
+ struct timespec sleeptime = {.tv_sec = 1 }; /* 1 second */
struct timespec t_start, t_end;
uint64_t tsc_hz;
I've run the helloworld application on an isolated cpu:
taskset -c 10 ./dpdk-helloworld --log-level=lib.eal:debug --no-huge
The results are:
EAL: TSC frequency arch ~2100000 KHz
EAL: TSC frequency linux ~2095082 KHz
EAL: TSC frequency estimate ~2095346 KHz
The arch one is picked, which seems rather wrong. Is there any way to override that?
Should we lower the estimation rounding to 1MHz or even 1KHz in the linux one?
Thoughts? Thanks!
Kernel: 4.18.0-513.9.1.el8_9.x86_64
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
BIOS Model name: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
Stepping: 4
CPU MHz: 2100.000
BogoMIPS: 4200.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 22528K
NUMA node0 CPU(s): 0-15,32-47
NUMA node1 CPU(s): 16-31,48-63
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts
rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq
dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm
pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes
xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3
cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp
tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1
hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq
rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl
xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total
cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d
arch_capabilities
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-18 22:04 Accuracy of rte_get_tsc_hz() compared to linux Isaac Boukris
@ 2024-09-18 23:27 ` Stephen Hemminger
2024-09-19 9:37 ` Isaac Boukris
2024-09-19 12:26 ` Isaac Boukris
2024-09-19 21:53 ` Stephen Hemminger
` (2 subsequent siblings)
3 siblings, 2 replies; 16+ messages in thread
From: Stephen Hemminger @ 2024-09-18 23:27 UTC (permalink / raw)
To: Isaac Boukris; +Cc: users
On Thu, 19 Sep 2024 01:04:40 +0300
Isaac Boukris <iboukris@gmail.com> wrote:
> I've run the helloworld application on an isolated cpu:
> taskset -c 10 ./dpdk-helloworld --log-level=lib.eal:debug --no-huge
>
> The results are:
> EAL: TSC frequency arch ~2100000 KHz
> EAL: TSC frequency linux ~2095082 KHz
> EAL: TSC frequency estimate ~2095346 KHz
>
> The arch one is picked, which seems rather wrong, any way to override that?
> Should we lower the estimation rounding to 1MHz or even 1KHz in the linux one?
>
> Thoughts? Thanks!
>
> Kernel: 4.18.0-513.9.1.el8_9.x86_64
Note: the upstream 4.18 kernel was released on 12 August 2018 and went end of life later that year. I assume
this is RHEL8, which does its own backports and never changes the kernel version.
What is the kernel dmesg, why is it deciding on that value?
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-18 23:27 ` Stephen Hemminger
@ 2024-09-19 9:37 ` Isaac Boukris
2024-09-19 12:26 ` Isaac Boukris
1 sibling, 0 replies; 16+ messages in thread
From: Isaac Boukris @ 2024-09-19 9:37 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
On Thu, Sep 19, 2024 at 2:27 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Thu, 19 Sep 2024 01:04:40 +0300
> Isaac Boukris <iboukris@gmail.com> wrote:
>
> > I've run the helloworld application on an isolated cpu:
> > taskset -c 10 ./dpdk-helloworld --log-level=lib.eal:debug --no-huge
> >
> > The results are:
> > EAL: TSC frequency arch ~2100000 KHz
> > EAL: TSC frequency linux ~2095082 KHz
> > EAL: TSC frequency estimate ~2095346 KHz
> >
> > The arch one is picked, which seems rather wrong, any way to override that?
> > Should we lower the estimation rounding to 1MHz or even 1KHz in the linux one?
> >
> > Thoughts? Thanks!
> >
> > Kernel: 4.18.0-513.9.1.el8_9.x86_64
>
> Note: 4.18 kernel was end of life 12 August 2018, I assume
> this is RHEL8 which does their own backports and never changes kernel version.
Indeed RHEL 8.9
> What is the kernel dmesg, why is it deciding on that value?
It comes from kernel's determine_cpu_tsc_frequencies() afaict (which
didn't change that much).
As a matter of fact, I got a similar behavior on my vmware VM on my
laptop (although smaller diff).
kernel: 4.18.0-553.8.1.el8_10.x86_64
lscpu:
Model name: 12th Gen Intel(R) Core(TM) i7-1260P
Stepping: 3
CPU MHz: 2495.994
BogoMIPS: 4991.98
Hypervisor vendor: VMware
dmesg | grep -i tsc
[ 0.000000] vmware: TSC freq read from hypervisor : 2495.994 MHz
[ 0.000000] tsc: Detected 2495.994 MHz processor
[ 0.000000] TSC deadline timer available
[ 0.010000] clocksource: tsc-early: mask: 0xffffffffffffffff
max_cycles: 0x23fa717cb36, max_idle_ns: 440795237972 ns
[ 0.905000] clocksource: Switched to clocksource tsc-early
[ 3.104381] tsc: Refined TSC clocksource calibration: 2495.990 MHz
[ 3.105473] clocksource: tsc: mask: 0xffffffffffffffff max_cycles:
0x23fa6db1dfc, max_idle_ns: 440795265852 ns
[ 3.264297] clocksource: Switched to clocksource tsc
cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
2495990
sudo bpftrace -e 'BEGIN { printf("%u\n", *kaddr("tsc_khz")); exit(); }'
Attaching 1 probe...
2495990
My modified dpdk code above gives:
EAL: TSC frequency arch ~0 KHz
EAL: TSC frequency linux ~2495982 KHz
EAL: TSC frequency estimate ~2497263 KHz
Note that with the unmodified dpdk code which rounds to 10MHz both the
linux and the common estimation would give 2500000 KHz.
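To make the effect of the rounding step concrete, here is a small C sketch. round_hz_nearest() is a stand-in helper written for this illustration, not DPDK's own rounding code. It rounds a frequency in Hz to the nearest multiple of a step.

```c
#include <stdint.h>

/*
 * Round hz to the nearest multiple of step (step must be non-zero).
 * Illustrative helper only; DPDK has its own rounding code.
 */
static uint64_t
round_hz_nearest(uint64_t hz, uint64_t step)
{
	return ((hz + step / 2) / step) * step;
}
```

With a 10 MHz step, the measured 2495982 KHz becomes 2500000 KHz, while a 1 KHz step leaves it at 2495982 KHz. The same helper turns 2095082 KHz into 2100000 KHz at 10 MHz granularity.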
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-18 23:27 ` Stephen Hemminger
2024-09-19 9:37 ` Isaac Boukris
@ 2024-09-19 12:26 ` Isaac Boukris
2024-09-19 13:04 ` Isaac Boukris
1 sibling, 1 reply; 16+ messages in thread
From: Isaac Boukris @ 2024-09-19 12:26 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
On Thu, Sep 19, 2024 at 2:27 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Thu, 19 Sep 2024 01:04:40 +0300
> Isaac Boukris <iboukris@gmail.com> wrote:
>
> > I've run the helloworld application on an isolated cpu:
> > taskset -c 10 ./dpdk-helloworld --log-level=lib.eal:debug --no-huge
> >
> > The results are:
> > EAL: TSC frequency arch ~2100000 KHz
> > EAL: TSC frequency linux ~2095082 KHz
> > EAL: TSC frequency estimate ~2095346 KHz
> >
> > The arch one is picked, which seems rather wrong, any way to override that?
> > Should we lower the estimation rounding to 1MHz or even 1KHz in the linux one?
> >
> > Thoughts? Thanks!
> >
> > Kernel: 4.18.0-513.9.1.el8_9.x86_64
>
> Note: 4.18 kernel was end of life 12 August 2018, I assume
> this is RHEL8 which does their own backports and never changes kernel version.
>
> What is the kernel dmesg, why is it deciding on that value?
Actually, this is the boot dmesg on the machine itself (the previous
log was from a kvm on that machine).
# journalctl -b --system | grep -i tsc
Sep 15 17:50:16 localhost kernel: tsc: Detected 2100.000 MHz processor
Sep 15 17:50:16 localhost kernel: TSC deadline timer available
Sep 15 17:50:16 localhost kernel: clocksource: tsc-early: mask:
0xffffffffffffffff max_cycles: 0x1e4530a99b6, max_idle_ns:
440795257976 ns
Sep 15 17:50:16 localhost kernel: clocksource: Switched to clocksource tsc-early
Sep 15 17:50:16 localhost kernel: tsc: Refined TSC clocksource
calibration: 2095.082 MHz
Sep 15 17:50:16 localhost kernel: clocksource: tsc: mask:
0xffffffffffffffff max_cycles: 0x1e330abbade, max_idle_ns:
440795251159 ns
Sep 15 17:50:16 localhost kernel: clocksource: Switched to clocksource tsc
So it looks like it is refined based on calibration, which we could do
by preferring the linux estimation results over the arch (and lowering
the rounding to 1MHz). Alternatively, maybe find a way to read the
linux values, or allow setting the value manually at init (as an eal
param).
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-19 12:26 ` Isaac Boukris
@ 2024-09-19 13:04 ` Isaac Boukris
2024-09-19 18:33 ` Isaac Boukris
0 siblings, 1 reply; 16+ messages in thread
From: Isaac Boukris @ 2024-09-19 13:04 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
On an older laptop (i7-8650U), running fedora with kernel 6.8.9-100.fc38.x86_64
journalctl -b | grep -i tsc
Aug 29 12:34:00 localhost.localdomain kernel: tsc: Detected 2100.000
MHz processor
Aug 29 12:34:00 localhost.localdomain kernel: tsc: Detected 2099.944 MHz TSC
Aug 29 12:34:00 localhost.localdomain kernel: TSC deadline timer available
Aug 29 12:34:00 localhost.localdomain kernel: clocksource: tsc-early:
mask: 0xffffffffffffffff max_cycles: 0x1e44fb6c2ab, max_idle_ns:
440795206594 ns
Aug 29 12:34:00 localhost.localdomain kernel: clocksource: Switched to
clocksource tsc-early
Aug 29 12:34:00 localhost.localdomain kernel: tsc: Refined TSC
clocksource calibration: 2112.000 MHz
sudo bpftrace -e 'BEGIN { printf("%u\n", *kaddr("tsc_khz")); exit(); }'
Attaching 1 probe...
2112000
dpdk logs before lowering the rounding:
EAL: TSC frequency arch ~0 KHz
EAL: TSC frequency linux ~2110000 KHz
EAL: TSC frequency estimate ~2110000 KHz
after lowering the rounding to 1KHz:
EAL: TSC frequency arch ~0 KHz
EAL: TSC frequency linux ~2112000 KHz
EAL: TSC frequency estimate ~2112949 KHz
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-19 13:04 ` Isaac Boukris
@ 2024-09-19 18:33 ` Isaac Boukris
0 siblings, 0 replies; 16+ messages in thread
From: Isaac Boukris @ 2024-09-19 18:33 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
Looking some more at the kernel code (tsc.c), it appears that it would
only trust the arch frequency if the cpu 'tsc_known_freq' flag is set
(which none of the machines I have access to has, although for some
the dpdk's get_tsc_freq_arch() does return a value), otherwise it
would calibrate it (hence the "Refined" in dmesg). Perhaps we should
do the same.
/*
* When TSC frequency is known (retrieved via MSR or CPUID), we skip
* the refined calibration and directly register it as a clocksource.
*/
if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-18 22:04 Accuracy of rte_get_tsc_hz() compared to linux Isaac Boukris
2024-09-18 23:27 ` Stephen Hemminger
@ 2024-09-19 21:53 ` Stephen Hemminger
2024-09-19 22:02 ` Stephen Hemminger
2024-09-19 22:21 ` Stephen Hemminger
3 siblings, 0 replies; 16+ messages in thread
From: Stephen Hemminger @ 2024-09-19 21:53 UTC (permalink / raw)
To: Isaac Boukris; +Cc: users
On Thu, 19 Sep 2024 01:04:40 +0300
Isaac Boukris <iboukris@gmail.com> wrote:
> Hi,
>
> On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).
>
> The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
> timestamps to lag behind real time (roughly a sec per 10 min). I
> noticed the kernel uses 2095082 KHz and in fact it gives much better
> results.
FYI there is a bug from 2022 about this:
https://bugs.dpdk.org/show_bug.cgi?id=959
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-18 22:04 Accuracy of rte_get_tsc_hz() compared to linux Isaac Boukris
2024-09-18 23:27 ` Stephen Hemminger
2024-09-19 21:53 ` Stephen Hemminger
@ 2024-09-19 22:02 ` Stephen Hemminger
2024-09-19 22:21 ` Stephen Hemminger
3 siblings, 0 replies; 16+ messages in thread
From: Stephen Hemminger @ 2024-09-19 22:02 UTC (permalink / raw)
To: Isaac Boukris; +Cc: users
On Thu, 19 Sep 2024 01:04:40 +0300
Isaac Boukris <iboukris@gmail.com> wrote:
> Hi,
>
> On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).
>
> The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
> timestamps to lag behind real time (roughly a sec per 10 min). I
> noticed the kernel uses 2095082 KHz and in fact it gives much better
> results.
>
> dmesg:
> tsc: Detected 2095.082 MHz processor
>
> tsc_freq_khz (custom kmod to exposes kernel's tsc_khz):
> cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
Rather than going off into the weeds of cpuid and whether
the value reported is correct, perhaps DPDK should just look at the
kernel sysfs files?
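A minimal sketch of what reading such a file could look like in C. The function name read_khz_file() is hypothetical. The path used in this thread, /sys/devices/system/cpu/cpu0/tsc_freq_khz, only exists when the custom module is loaded, so the helper returns 0 when the file is missing or unparsable and the caller has to fall back to another method.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Read a single decimal KHz value from a sysfs-style file.
 * Returns 0 if the file cannot be opened or parsed.
 * Hypothetical helper, not part of DPDK.
 */
static uint64_t
read_khz_file(const char *path)
{
	uint64_t khz = 0;
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return 0;
	if (fscanf(f, "%" SCNu64, &khz) != 1)
		khz = 0;
	fclose(f);
	return khz;
}
```

A caller would multiply the result by 1000 to get Hz and treat 0 as "not available".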
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-18 22:04 Accuracy of rte_get_tsc_hz() compared to linux Isaac Boukris
` (2 preceding siblings ...)
2024-09-19 22:02 ` Stephen Hemminger
@ 2024-09-19 22:21 ` Stephen Hemminger
2024-09-20 3:19 ` Isaac Boukris
2024-09-20 7:26 ` David Marchand
3 siblings, 2 replies; 16+ messages in thread
From: Stephen Hemminger @ 2024-09-19 22:21 UTC (permalink / raw)
To: Isaac Boukris; +Cc: users
On Thu, 19 Sep 2024 01:04:40 +0300
Isaac Boukris <iboukris@gmail.com> wrote:
> Hi,
>
> On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).
>
> The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
> timestamps to lag behind real time (roughly a sec per 10 min). I
> noticed the kernel uses 2095082 KHz and in fact it gives much better
> results.
>
> dmesg:
> tsc: Detected 2095.082 MHz processor
>
> tsc_freq_khz (custom kmod to exposes kernel's tsc_khz):
> cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
> 2095082
Sigh. exposing tsc frequency through sysfs is a Redhat extension
that never got merged upstream.
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-19 22:21 ` Stephen Hemminger
@ 2024-09-20 3:19 ` Isaac Boukris
2024-09-20 14:39 ` Stephen Hemminger
2024-09-20 7:26 ` David Marchand
1 sibling, 1 reply; 16+ messages in thread
From: Isaac Boukris @ 2024-09-20 3:19 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
On Fri, Sep 20, 2024 at 1:21 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Thu, 19 Sep 2024 01:04:40 +0300
> Isaac Boukris <iboukris@gmail.com> wrote:
>
> > Hi,
> >
> > On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).
> >
> > The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
> > timestamps to lag behind real time (roughly a sec per 10 min). I
> > noticed the kernel uses 2095082 KHz and in fact it gives much better
> > results.
> >
> > dmesg:
> > tsc: Detected 2095.082 MHz processor
> >
> > tsc_freq_khz (custom kmod to exposes kernel's tsc_khz):
> > cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
> > 2095082
>
>
> Sigh. exposing tsc frequency through sysfs is a Redhat extension
> that never got merged upstream.
Actually I think it is a Google thing. I got it from GitHub, where someone
implemented it for all (hence custom). It is a shame it was never
merged, as people have known about it for a long time. I mean the whole rdtsc
API is incomplete without it.
As it is, I think the low hanging fruit for now would be to:
- lower the rounding in our linux estimate from 10 to 1 MHz.
- ignore the arch result if the cpu doesn't have the tsc_known_freq flag.
On top of that we can consider:
- increase the time of our linux estimate code and lower the rounding
even further.
- lower the rounding of our common estimation code from 10 to 1 MHz.
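The selection order proposed in the list above could be sketched roughly as follows. choose_tsc_hz() and its parameters are hypothetical names, not existing DPDK code, and falling back to the arch value when no measurement is available is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick a TSC frequency (Hz) from the available candidates; 0 means
 * "not available". Hypothetical helper illustrating the proposal:
 * trust the arch (CPUID/MSR) value only when tsc_known_freq is set.
 */
static uint64_t
choose_tsc_hz(uint64_t arch_hz, bool known_freq,
	      uint64_t measured_hz, uint64_t rough_hz)
{
	if (arch_hz != 0 && known_freq)
		return arch_hz;		/* kernel would trust it as well */
	if (measured_hz != 0)
		return measured_hz;	/* calibrated against the OS clock */
	if (arch_hz != 0)
		return arch_hz;		/* assumption: better than nothing */
	return rough_hz;		/* sleep-based rough estimate */
}
```

On the Xeon Gold 6130 from the first message (arch 2100000 KHz, no tsc_known_freq, linux estimate 2095082 KHz) this would return the linux estimate.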
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-19 22:21 ` Stephen Hemminger
2024-09-20 3:19 ` Isaac Boukris
@ 2024-09-20 7:26 ` David Marchand
2024-09-20 14:36 ` Stephen Hemminger
1 sibling, 1 reply; 16+ messages in thread
From: David Marchand @ 2024-09-20 7:26 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Isaac Boukris, users
On Fri, Sep 20, 2024 at 12:22 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Sigh. exposing tsc frequency through sysfs is a Redhat extension
> that never got merged upstream.
Counter sigh :-).
Not sure where this assertion comes from.
I see no trace of this downstream: RH policy is "upstream first".
--
David Marchand
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-20 7:26 ` David Marchand
@ 2024-09-20 14:36 ` Stephen Hemminger
2024-09-20 15:11 ` Isaac Boukris
0 siblings, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2024-09-20 14:36 UTC (permalink / raw)
To: David Marchand; +Cc: Isaac Boukris, users
On Fri, 20 Sep 2024 09:26:05 +0200
David Marchand <david.marchand@redhat.com> wrote:
> On Fri, Sep 20, 2024 at 12:22 AM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > Sigh. exposing tsc frequency through sysfs is a Redhat extension
> > that never got merged upstream.
>
> Counter sight :-).
>
> Not sure where this assertion comes from.
> I see no trace of this downstream: RH policy is "upstream first".
>
I missed the driver stuff in original mail and assumed that
since it wasn't in upstream (or Debian) that it came from Redhat. Sorry.
It would be good if kernel exposed it, and there was a proposal
to do that, but it seemed to die from "no one should ever need or care about that"
https://lwn.net/Articles/388263/
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-20 3:19 ` Isaac Boukris
@ 2024-09-20 14:39 ` Stephen Hemminger
2024-09-20 15:06 ` Isaac Boukris
0 siblings, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2024-09-20 14:39 UTC (permalink / raw)
To: Isaac Boukris; +Cc: users
On Fri, 20 Sep 2024 06:19:35 +0300
Isaac Boukris <iboukris@gmail.com> wrote:
> On Fri, Sep 20, 2024 at 1:21 AM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Thu, 19 Sep 2024 01:04:40 +0300
> > Isaac Boukris <iboukris@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).
> > >
> > > The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
> > > timestamps to lag behind real time (roughly a sec per 10 min). I
> > > noticed the kernel uses 2095082 KHz and in fact it gives much better
> > > results.
> > >
> > > dmesg:
> > > tsc: Detected 2095.082 MHz processor
> > >
> > > tsc_freq_khz (custom kmod to exposes kernel's tsc_khz):
> > > cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
> > > 2095082
> >
> >
> > Sigh. exposing tsc frequency through sysfs is a Redhat extension
> > that never got merged upstream.
>
> Actually I think it is a google thing, I got it from github, someone
> implemented it for all (hence custom). It is a shame it was never
> merged as people know about it for a long time. I mean the whole rdtsc
> API is incomplete without it.
>
> As it is, I think the low hanging fruit for now would be to:
> - lower the rounding in our linux estimate from 10 to 1 MHz.
> - ignore the arch result if the cpu doesn't have the tsc_known_freq flag.
The known freq flag doesn't appear to be a CPU flags property, but more
of a property that kernel sets when it decides "this is good enough".
Really getting better value would require some sort of repeated check
(maybe an alarm callback), and using cpu value as a starting point.
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-20 14:39 ` Stephen Hemminger
@ 2024-09-20 15:06 ` Isaac Boukris
2024-09-21 6:36 ` Isaac Boukris
0 siblings, 1 reply; 16+ messages in thread
From: Isaac Boukris @ 2024-09-20 15:06 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
On Fri, Sep 20, 2024 at 5:39 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Fri, 20 Sep 2024 06:19:35 +0300
> Isaac Boukris <iboukris@gmail.com> wrote:
>
> > On Fri, Sep 20, 2024 at 1:21 AM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Thu, 19 Sep 2024 01:04:40 +0300
> > > Isaac Boukris <iboukris@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > On Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (see lscpu output at the end).
> > > >
> > > > The rte_get_tsc_hz() returns 2100000 KHz but using it causes our
> > > > timestamps to lag behind real time (roughly a sec per 10 min). I
> > > > noticed the kernel uses 2095082 KHz and in fact it gives much better
> > > > results.
> > > >
> > > > dmesg:
> > > > tsc: Detected 2095.082 MHz processor
> > > >
> > > > tsc_freq_khz (custom kmod to exposes kernel's tsc_khz):
> > > > cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
> > > > 2095082
> > >
> > >
> > > Sigh. exposing tsc frequency through sysfs is a Redhat extension
> > > that never got merged upstream.
> >
> > Actually I think it is a google thing, I got it from github, someone
> > implemented it for all (hence custom). It is a shame it was never
> > merged as people know about it for a long time. I mean the whole rdtsc
> > API is incomplete without it.
> >
> > As it is, I think the low hanging fruit for now would be to:
> > - lower the rounding in our linux estimate from 10 to 1 MHz.
> > - ignore the arch result if the cpu doesn't have the tsc_known_freq flag.
>
> The known freq flag doesn't appear to be a CPU flags property, but more
> of a property that kernel sets when it decides "this is good enough".
On linux we could just read /proc/cpuinfo and see what the kernel thinks.
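A rough C sketch of such a check, assuming the x86 layout of /proc/cpuinfo where each CPU has a line starting with "flags". Both function names are made up for illustration, and the whole-word match matters because "tsc" is also a suffix of flags like constant_tsc.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if the whitespace-separated list in flags contains exactly name. */
static bool
flag_in_list(const char *flags, const char *name)
{
	size_t len = strlen(name);
	const char *p = flags;

	if (len == 0)
		return false;
	while ((p = strstr(p, name)) != NULL) {
		bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return true;
		p += len;
	}
	return false;
}

/* Scan a cpuinfo-style file for name on a "flags" line (x86 only). */
static bool
cpuinfo_has_flag(const char *path, const char *name)
{
	char line[8192];	/* assumed large enough for one flags line */
	bool found = false;
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return false;
	while (!found && fgets(line, sizeof(line), f) != NULL) {
		if (strncmp(line, "flags", 5) == 0) {
			const char *colon = strchr(line, ':');

			if (colon != NULL)
				found = flag_in_list(colon + 1, name);
		}
	}
	fclose(f);
	return found;
}
```

The intended use would be cpuinfo_has_flag("/proc/cpuinfo", "tsc_known_freq"), with a false result meaning the arch value should not be trusted blindly.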
> Really getting better value would require some sort of repeated check
> (maybe an alarm callback), and using cpu value as a starting point.
In practically all my tests, on machines without tsc_known_freq, the
value determined by our linux estimation code with rounding lowered to
1KHz was much better (closer to the kernel value; in fact exactly the
kernel value, except in one case where they differed by a couple of KHz).
What would repeated tests give us? I don't think the kernel value
changes, does it?
In fact I think we should lower the rounding in our linux estimation
code to 1KHz (and its time from 100ms to 200ms, just to be on the safe
side, the kernel does a full second), as well as lower the rounding of
our common code to 1MHz. This will simply be more accurate.
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-20 14:36 ` Stephen Hemminger
@ 2024-09-20 15:11 ` Isaac Boukris
0 siblings, 0 replies; 16+ messages in thread
From: Isaac Boukris @ 2024-09-20 15:11 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Marchand, users
On Fri, Sep 20, 2024 at 5:36 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Fri, 20 Sep 2024 09:26:05 +0200
> David Marchand <david.marchand@redhat.com> wrote:
>
> > On Fri, Sep 20, 2024 at 12:22 AM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > > Sigh. exposing tsc frequency through sysfs is a Redhat extension
> > > that never got merged upstream.
> >
> > Counter sight :-).
> >
> > Not sure where this assertion comes from.
> > I see no trace of this downstream: RH policy is "upstream first".
> >
>
> I missed the driver stuff in original mail and assumed that
> since it wasn't in upstream (or Debian) that it came from Redhat. Sorry.
>
> It would be good if kernel exposed it, and there was a proposal
> to do that, but it seemed to die from "no one should ever need or care about that"
>
> https://lwn.net/Articles/388263/
>
Too bad it wasn't merged. The tsc_mult and tsc_shift would have been
even better than tsc_khz itself; we could have implemented an even more
efficient cycles_ts_ns. I don't get the point of making rdtsc
available in userspace without this.
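For context, a multiply-and-shift conversion replaces the division in cycles-to-nanoseconds with one multiplication and one shift. The sketch below derives its own multiplier from a KHz value, since the kernel's values are not exported. The shift of 22 and both function names are illustrative assumptions, and unsigned __int128 is a GCC/Clang extension on 64-bit targets.

```c
#include <stdint.h>

#define TSC_SHIFT 22	/* illustrative choice, not the kernel's value */

/* Multiplier such that ns ~= (cycles * mult) >> TSC_SHIFT. */
static uint64_t
tsc_mult_from_khz(uint64_t tsc_khz)
{
	/* ns per cycle is 1e6 / tsc_khz, scaled up by 2^TSC_SHIFT */
	return (UINT64_C(1000000) << TSC_SHIFT) / tsc_khz;
}

/* Convert cycles to nanoseconds without a division on the hot path. */
static uint64_t
cycles_to_ns(uint64_t cycles, uint64_t mult)
{
	/* 128-bit intermediate so that large cycle counts cannot overflow */
	return (uint64_t)(((unsigned __int128)cycles * mult) >> TSC_SHIFT);
}
```

With tsc_khz = 2095082, one second's worth of cycles (2095082000) converts to just under 1000000000 ns; the small shortfall comes from truncating the multiplier.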
* Re: Accuracy of rte_get_tsc_hz() compared to linux
2024-09-20 15:06 ` Isaac Boukris
@ 2024-09-21 6:36 ` Isaac Boukris
0 siblings, 0 replies; 16+ messages in thread
From: Isaac Boukris @ 2024-09-21 6:36 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
> > Really getting better value would require some sort of repeated check
> > (maybe an alarm callback), and using cpu value as a starting point.
>
> In practically all my tests, on machines without tsc_known_freq, the
> value determined by our linux estimation code with rounding lowered to
> 1KHz, was much better (closer to kernel value, actually exact the
> kernel value except one case where they differed in a couple of KHz).
> What would repeated tests give us? I don't think the kernel value
> changes, does it?
Looking at the kernel code, there is a mention of a watchdog but it
seems mostly disabled for TSC based on cpu flags (and it doesn't seem
to change on the systems I'm testing).
> In fact I think we should lower the rounding in our linux estimation
> code to 1KHz (and its time from 100ms to 200ms, just to be on the safe
> side, the kernel does a full second), as well as lower the rounding of
> our common code to 1MHz. This will simply be more accurate.
Increasing the test time to 200ms or even a full second doesn't seem
to provide any improvement, so I'll keep it at 100ms and round it to
10KHz (close to the margin of error).
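A self-contained sketch of this kind of measurement, for illustration only: it is not the DPDK implementation, and the function names are hypothetical. It samples the TSC around a 100ms sleep, uses CLOCK_MONOTONIC_RAW as the reference, and rounds the result to 10KHz. The measuring part is only built on x86-64 Linux.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__) && defined(__linux__)
#include <x86intrin.h>
#endif

#define NS_PER_SEC UINT64_C(1000000000)

/*
 * Frequency in Hz from a cycle count and the elapsed nanoseconds,
 * rounded to the nearest multiple of round_hz. Split into quotient
 * and remainder so the math stays in 64 bits (for windows under ~18s).
 */
static uint64_t
tsc_hz_from_sample(uint64_t cycles, uint64_t elapsed_ns, uint64_t round_hz)
{
	uint64_t hz = (cycles / elapsed_ns) * NS_PER_SEC +
		      (cycles % elapsed_ns) * NS_PER_SEC / elapsed_ns;

	return ((hz + round_hz / 2) / round_hz) * round_hz;
}

#if defined(__x86_64__) && defined(__linux__)
/* Measure the TSC rate over ~100ms against CLOCK_MONOTONIC_RAW. */
static uint64_t
measure_tsc_hz(void)
{
	struct timespec nap = { .tv_nsec = 100000000 };	/* 100ms */
	struct timespec t0, t1;
	uint64_t c0, c1;
	int64_t ns;

	clock_gettime(CLOCK_MONOTONIC_RAW, &t0);
	c0 = __rdtsc();
	nanosleep(&nap, NULL);
	clock_gettime(CLOCK_MONOTONIC_RAW, &t1);
	c1 = __rdtsc();

	ns = (int64_t)(t1.tv_sec - t0.tv_sec) * (int64_t)NS_PER_SEC +
	     (t1.tv_nsec - t0.tv_nsec);
	return tsc_hz_from_sample(c1 - c0, (uint64_t)ns, 10000);
}
#endif
```

For example, 209508200 cycles over exactly 100ms yields 2095082000 Hz before rounding and 2095080000 Hz after rounding to 10KHz.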