* [dpdk-dev] [PATCH] eal: generic counter based loop for CPU freq calculation
@ 2020-06-08 21:34 Honnappa Nagarahalli
2020-06-24 12:50 ` Jerin Jacob
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Honnappa Nagarahalli @ 2020-06-08 21:34 UTC (permalink / raw)
To: dev, honnappa.nagarahalli, jerinj, hemant.agrawal, akhil.goyal,
ogerlitz, ajit.khaparde
Cc: ruigeng.wang, dharmik.thakkar, phil.yang, stable
get_tsc_freq uses 'nanosleep' system call to calculate the CPU
frequency. However, 'nanosleep' results in the process getting
un-scheduled. The kernel saves and restores the PMU state. This
ensures that the PMU cycles are not counted towards a sleeping
process. When RTE_ARM_EAL_RDTSC_USE_PMU is defined, this results
in incorrect CPU frequency calculation. This logic is replaced
with a generic counter based loop.
Bugzilla ID: 450
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
lib/librte_eal/arm/include/rte_cycles_64.h | 45 +++++++++++++++++++---
lib/librte_eal/arm/rte_cycles.c | 24 ++++++++++--
2 files changed, 61 insertions(+), 8 deletions(-)
diff --git a/lib/librte_eal/arm/include/rte_cycles_64.h b/lib/librte_eal/arm/include/rte_cycles_64.h
index da557b6a1..6fc352036 100644
--- a/lib/librte_eal/arm/include/rte_cycles_64.h
+++ b/lib/librte_eal/arm/include/rte_cycles_64.h
@@ -11,6 +11,36 @@ extern "C" {
#include "generic/rte_cycles.h"
+/** Read generic counter frequency */
+static inline uint64_t
+__rte_rd_generic_cntr_freq(void)
+{
+ uint64_t freq;
+
+ asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
+ return freq;
+}
+
+/** Read generic counter */
+static inline uint64_t
+__rte_rd_generic_cntr(void)
+{
+ uint64_t tsc;
+
+ asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
+ return tsc;
+}
+
+static inline uint64_t
+__rte_rd_generic_cntr_precise(void)
+{
+ uint64_t tsc;
+
+ asm volatile("isb" : : : "memory");
+ asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
+ return tsc;
+}
+
/**
* Read the time base register.
*
@@ -25,10 +55,7 @@ extern "C" {
static inline uint64_t
rte_rdtsc(void)
{
- uint64_t tsc;
-
- asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
- return tsc;
+ return __rte_rd_generic_cntr();
}
#else
/**
@@ -49,14 +76,22 @@ rte_rdtsc(void)
* asm volatile("msr pmcr_el0, %0" : : "r" (val));
*
*/
+
+/** Read PMU cycle counter */
static inline uint64_t
-rte_rdtsc(void)
+__rte_rd_pmu_cycle_cntr(void)
{
uint64_t tsc;
asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
return tsc;
}
+
+static inline uint64_t
+rte_rdtsc(void)
+{
+ return __rte_rd_pmu_cycle_cntr();
+}
#endif
static inline uint64_t
diff --git a/lib/librte_eal/arm/rte_cycles.c b/lib/librte_eal/arm/rte_cycles.c
index 3500d523e..92c87a8a4 100644
--- a/lib/librte_eal/arm/rte_cycles.c
+++ b/lib/librte_eal/arm/rte_cycles.c
@@ -3,14 +3,32 @@
*/
#include "eal_private.h"
+#include "rte_cycles.h"
uint64_t
get_tsc_freq_arch(void)
{
#if defined RTE_ARCH_ARM64 && !defined RTE_ARM_EAL_RDTSC_USE_PMU
- uint64_t freq;
- asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
- return freq;
+ return __rte_rd_generic_cntr_freq();
+#elif defined RTE_ARCH_ARM64 && defined RTE_ARM_EAL_RDTSC_USE_PMU
+ /* Use the generic counter ticks to calculate the PMU
+ * cycle frequency.
+ */
+ uint64_t gcnt_ticks;
+ uint64_t start_ticks, cur_ticks;
+ uint64_t start_pmu_cycles, end_pmu_cycles;
+
+ /* Number of ticks for 1/10 second */
+ gcnt_ticks = __rte_rd_generic_cntr_freq() / 10;
+
+ start_ticks = __rte_rd_generic_cntr_precise();
+ start_pmu_cycles = rte_rdtsc_precise();
+ do {
+ cur_ticks = __rte_rd_generic_cntr();
+ } while ((cur_ticks - start_ticks) < gcnt_ticks);
+ end_pmu_cycles = rte_rdtsc_precise();
+
+ return ((end_pmu_cycles - start_pmu_cycles) * 10);
#else
return 0;
#endif
--
2.17.1
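
For readers following the thread, the measurement technique in this patch can be sketched in portable C. This is a minimal illustration, not the DPDK implementation: `counter_fn`, `estimate_counter_hz`, and the two fake counters are hypothetical names introduced here, and the fake counters only stand in for cntvct_el0 and pmccntr_el0 so that the sketch runs on any machine. The idea is to busy-wait (never sleep) for 1/10 second worth of reference-counter ticks and scale the cycles observed in that window by 10.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter-read callback; stands in for a register read. */
typedef uint64_t (*counter_fn)(void);

/*
 * Estimate the frequency of 'cycles_now' by spinning until the reference
 * counter has advanced by 1/10 second worth of ticks. The caller stays
 * scheduled for the whole window, so the measured counter keeps running.
 */
static uint64_t
estimate_counter_hz(counter_fn ref_now, uint64_t ref_hz, counter_fn cycles_now)
{
	uint64_t window, ref_start, cur, cycles_start, cycles_end;

	assert(ref_hz >= 10);
	window = ref_hz / 10;		/* reference ticks in 1/10 second */

	ref_start = ref_now();
	cycles_start = cycles_now();
	do {
		cur = ref_now();
	} while ((cur - ref_start) < window);
	cycles_end = cycles_now();

	return (cycles_end - cycles_start) * 10;
}

/* Fake counters for demonstration only: the cycle counter advances
 * 24 times faster than the reference counter. */
static uint64_t fake_ref_ticks;

static uint64_t
fake_ref_now(void)
{
	return ++fake_ref_ticks;
}

static uint64_t
fake_cycles_now(void)
{
	return fake_ref_ticks * 24;
}
```

With a fake 1000 Hz reference, `estimate_counter_hz(fake_ref_now, 1000, fake_cycles_now)` returns 24000, i.e. 24 times the reference frequency.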
* Re: [dpdk-dev] [PATCH] eal: generic counter based loop for CPU freq calculation
2020-06-08 21:34 [dpdk-dev] [PATCH] eal: generic counter based loop for CPU freq calculation Honnappa Nagarahalli
@ 2020-06-24 12:50 ` Jerin Jacob
2020-06-26 20:46 ` Honnappa Nagarahalli
2020-06-24 15:09 ` Pavan Nikhilesh Bhagavatula
2020-06-26 20:35 ` [dpdk-dev] [PATCH v2 1/2] eal/arm: " Honnappa Nagarahalli
2 siblings, 1 reply; 8+ messages in thread
From: Jerin Jacob @ 2020-06-24 12:50 UTC (permalink / raw)
To: Honnappa Nagarahalli
Cc: dpdk-dev, Jerin Jacob, Hemant Agrawal, Akhil Goyal, ogerlitz,
Ajit Khaparde, ruigeng.wang, Dharmik Thakkar, Phil Yang,
dpdk stable
On Tue, Jun 9, 2020 at 3:04 AM Honnappa Nagarahalli
<honnappa.nagarahalli@arm.com> wrote:
>
> get_tsc_freq uses 'nanosleep' system call to calculate the CPU
> frequency. However, 'nanosleep' results in the process getting
> un-scheduled. The kernel saves and restores the PMU state. This
> ensures that the PMU cycles are not counted towards a sleeping
> process. When RTE_ARM_EAL_RDTSC_USE_PMU is defined, this results
> in incorrect CPU frequency calculation. This logic is replaced
> with generic counter based loop.
>
> Bugzilla ID: 450
> Fixes: af75078fece3 ("first public release")
The fix looks good to me.
The Fixes tag is not correct. It should reference the patch where
RTE_ARM_EAL_RDTSC_USE_PMU was introduced.
> Cc: stable@dpdk.org
>
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
>
> ---
> lib/librte_eal/arm/include/rte_cycles_64.h | 45 +++++++++++++++++++---
> lib/librte_eal/arm/rte_cycles.c | 24 ++++++++++--
> 2 files changed, 61 insertions(+), 8 deletions(-)
>
> diff --git a/lib/librte_eal/arm/include/rte_cycles_64.h b/lib/librte_eal/arm/include/rte_cycles_64.h
> index da557b6a1..6fc352036 100644
> --- a/lib/librte_eal/arm/include/rte_cycles_64.h
> +++ b/lib/librte_eal/arm/include/rte_cycles_64.h
> @@ -11,6 +11,36 @@ extern "C" {
>
> #include "generic/rte_cycles.h"
>
> +/** Read generic counter frequency */
> +static inline uint64_t
I prefer to have __rte_always_inline
> +__rte_rd_generic_cntr_freq(void)
I think the "generic counter" naming is confusing. Since the symbol
is exposed by being placed in a header file, it is better to rename
it to __rte_arm64_cntfrq()
> +{
> + uint64_t freq;
> +
> + asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
> + return freq;
> +}
> +
> +/** Read generic counter */
> +static inline uint64_t
Likewise, __rte_arm64_cntvct()
> +__rte_rd_generic_cntr(void)
> +{
> + uint64_t tsc;
> +
> + asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> + return tsc;
> +}
> +
> +static inline uint64_t
> +__rte_rd_generic_cntr_precise(void)
__rte_arm64_cntvct_precise()
> +{
> + uint64_t tsc;
> +
> + asm volatile("isb" : : : "memory");
> + asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> + return tsc;
> +}
> +
> /**
> * Read the time base register.
> *
> @@ -25,10 +55,7 @@ extern "C" {
> static inline uint64_t
> rte_rdtsc(void)
> {
> - uint64_t tsc;
> -
> - asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> - return tsc;
> + return __rte_rd_generic_cntr();
> }
> #else
> /**
> @@ -49,14 +76,22 @@ rte_rdtsc(void)
> * asm volatile("msr pmcr_el0, %0" : : "r" (val));
> *
> */
> +
> +/** Read PMU cycle counter */
> static inline uint64_t
> -rte_rdtsc(void)
> +__rte_rd_pmu_cycle_cntr(void)
> {
> uint64_t tsc;
>
> asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
> return tsc;
> }
> +
> +static inline uint64_t
> +rte_rdtsc(void)
> +{
> + return __rte_rd_pmu_cycle_cntr();
> +}
> #endif
>
> static inline uint64_t
> diff --git a/lib/librte_eal/arm/rte_cycles.c b/lib/librte_eal/arm/rte_cycles.c
> index 3500d523e..92c87a8a4 100644
> --- a/lib/librte_eal/arm/rte_cycles.c
> +++ b/lib/librte_eal/arm/rte_cycles.c
> @@ -3,14 +3,32 @@
> */
>
> #include "eal_private.h"
> +#include "rte_cycles.h"
>
> uint64_t
> get_tsc_freq_arch(void)
> {
> #if defined RTE_ARCH_ARM64 && !defined RTE_ARM_EAL_RDTSC_USE_PMU
> - uint64_t freq;
> - asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
> - return freq;
> + return __rte_rd_generic_cntr_freq();
> +#elif defined RTE_ARCH_ARM64 && defined RTE_ARM_EAL_RDTSC_USE_PMU
> + /* Use the generic counter ticks to calculate the PMU
> + * cycle frequency.
> + */
> + uint64_t gcnt_ticks;
> + uint64_t start_ticks, cur_ticks;
> + uint64_t start_pmu_cycles, end_pmu_cycles;
> +
> + /* Number of ticks for 1/10 second */
> + gcnt_ticks = __rte_rd_generic_cntr_freq() / 10;
> +
> + start_ticks = __rte_rd_generic_cntr_precise();
> + start_pmu_cycles = rte_rdtsc_precise();
> + do {
> + cur_ticks = __rte_rd_generic_cntr();
> + } while ((cur_ticks - start_ticks) < gcnt_ticks);
> + end_pmu_cycles = rte_rdtsc_precise();
> +
> + return ((end_pmu_cycles - start_pmu_cycles) * 10);
Good thought. On the plus side, it will reduce the boot time by 0.9 sec.
> #else
> return 0;
With above changes:
Acked-by: Jerin Jacob <jerinj@marvell.com>
> #endif
> --
> 2.17.1
>
* Re: [dpdk-dev] [PATCH] eal: generic counter based loop for CPU freq calculation
2020-06-08 21:34 [dpdk-dev] [PATCH] eal: generic counter based loop for CPU freq calculation Honnappa Nagarahalli
2020-06-24 12:50 ` Jerin Jacob
@ 2020-06-24 15:09 ` Pavan Nikhilesh Bhagavatula
2020-06-26 20:35 ` [dpdk-dev] [PATCH v2 1/2] eal/arm: " Honnappa Nagarahalli
2 siblings, 0 replies; 8+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2020-06-24 15:09 UTC (permalink / raw)
To: Honnappa Nagarahalli, dev, Jerin Jacob Kollanukkaran,
hemant.agrawal, akhil.goyal, ogerlitz, ajit.khaparde
Cc: ruigeng.wang, dharmik.thakkar, phil.yang, stable
>Subject: [dpdk-dev] [PATCH] eal: generic counter based loop for CPU
>freq calculation
>
>get_tsc_freq uses 'nanosleep' system call to calculate the CPU
>frequency. However, 'nanosleep' results in the process getting
>un-scheduled. The kernel saves and restores the PMU state. This
>ensures that the PMU cycles are not counted towards a sleeping
>process. When RTE_ARM_EAL_RDTSC_USE_PMU is defined, this results
>in incorrect CPU frequency calculation. This logic is replaced
>with generic counter based loop.
>
>Bugzilla ID: 450
>Fixes: af75078fece3 ("first public release")
>Cc: stable@dpdk.org
>
>Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
>Reviewed-by: Phil Yang <phil.yang@arm.com>
>
>---
> lib/librte_eal/arm/include/rte_cycles_64.h | 45 +++++++++++++++++++---
> lib/librte_eal/arm/rte_cycles.c | 24 ++++++++++--
> 2 files changed, 61 insertions(+), 8 deletions(-)
>
<Snip>
>
> uint64_t
> get_tsc_freq_arch(void)
> {
> #if defined RTE_ARCH_ARM64 && !defined RTE_ARM_EAL_RDTSC_USE_PMU
>- uint64_t freq;
>- asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
>- return freq;
>+ return __rte_rd_generic_cntr_freq();
>+#elif defined RTE_ARCH_ARM64 && defined RTE_ARM_EAL_RDTSC_USE_PMU
>+ /* Use the generic counter ticks to calculate the PMU
>+ * cycle frequency.
>+ */
>+ uint64_t gcnt_ticks;
>+ uint64_t start_ticks, cur_ticks;
>+ uint64_t start_pmu_cycles, end_pmu_cycles;
>+
>+ /* Number of ticks for 1/10 second */
>+ gcnt_ticks = __rte_rd_generic_cntr_freq() / 10;
>+
>+ start_ticks = __rte_rd_generic_cntr_precise();
>+ start_pmu_cycles = rte_rdtsc_precise();
>+ do {
>+ cur_ticks = __rte_rd_generic_cntr();
>+ } while ((cur_ticks - start_ticks) < gcnt_ticks);
>+ end_pmu_cycles = rte_rdtsc_precise();
>+
>+ return ((end_pmu_cycles - start_pmu_cycles) * 10);
I think we need to round this off to the next multiple of 10.
Sometimes it is off by one:
EAL: TSC frequency is ~2399999 KHz
Similar to http://git.dpdk.org/dpdk/tree/lib/librte_eal/common/eal_common_timer.c#n54
Pavan.
> #else
> return 0;
> #endif
>--
>2.17.1
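
The rounding suggested in this reply can be illustrated with plain integer arithmetic. A minimal sketch, assuming that rounding to the nearest 10 MHz is the desired behaviour; `round_to_nearest_multiple` is a hypothetical helper written for this note and is not a DPDK macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round v to the nearest multiple of mul
 * (exact halves round up). */
static uint64_t
round_to_nearest_multiple(uint64_t v, uint64_t mul)
{
	assert(mul != 0);
	return ((v + mul / 2) / mul) * mul;
}
```

Applied to the reading above, `round_to_nearest_multiple(2399999000, 10000000)` gives 2400000000, so the off-by-one measurement is reported as 2.4 GHz.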
* [dpdk-dev] [PATCH v2 1/2] eal/arm: generic counter based loop for CPU freq calculation
2020-06-08 21:34 [dpdk-dev] [PATCH] eal: generic counter based loop for CPU freq calculation Honnappa Nagarahalli
2020-06-24 12:50 ` Jerin Jacob
2020-06-24 15:09 ` Pavan Nikhilesh Bhagavatula
@ 2020-06-26 20:35 ` Honnappa Nagarahalli
2020-06-26 20:35 ` [dpdk-dev] [PATCH v2 2/2] eal/arm: change inline functions to always inline Honnappa Nagarahalli
2020-07-07 11:16 ` [dpdk-dev] [PATCH v2 1/2] eal/arm: generic counter based loop for CPU freq calculation David Marchand
2 siblings, 2 replies; 8+ messages in thread
From: Honnappa Nagarahalli @ 2020-06-26 20:35 UTC (permalink / raw)
To: dev, honnappa.nagarahalli, jerinj, hemant.agrawal, akhil.goyal,
ogerlitz, ajit.khaparde, pbhagavatula
Cc: nd, stable
get_tsc_freq uses 'nanosleep' system call to calculate the CPU
frequency. However, 'nanosleep' results in the process getting
un-scheduled. The kernel saves and restores the PMU state. This
ensures that the PMU cycles are not counted towards a sleeping
process. When RTE_ARM_EAL_RDTSC_USE_PMU is defined, this results
in incorrect CPU frequency calculation. This logic is replaced
with a generic counter based loop.
Bugzilla ID: 450
Fixes: f91bcbb2d9a6 ("eal/arm: use high-resolution cycle counter")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
v2:
1) Renamed functions (Jerin)
2) Aligned the frequency to 1MHz ceiling (Pavan)
3) Changed all the inlines to always inline for consistency
lib/librte_eal/arm/include/rte_cycles_64.h | 45 +++++++++++++++++++---
lib/librte_eal/arm/rte_cycles.c | 27 +++++++++++--
2 files changed, 63 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/arm/include/rte_cycles_64.h b/lib/librte_eal/arm/include/rte_cycles_64.h
index da557b6a1..e41f9dbd6 100644
--- a/lib/librte_eal/arm/include/rte_cycles_64.h
+++ b/lib/librte_eal/arm/include/rte_cycles_64.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2015 Cavium, Inc
+ * Copyright(c) 2020 Arm Limited
*/
#ifndef _RTE_CYCLES_ARM64_H_
@@ -11,6 +12,33 @@ extern "C" {
#include "generic/rte_cycles.h"
+/** Read generic counter frequency */
+static __rte_always_inline uint64_t
+__rte_arm64_cntfrq(void)
+{
+ uint64_t freq;
+
+ asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
+ return freq;
+}
+
+/** Read generic counter */
+static __rte_always_inline uint64_t
+__rte_arm64_cntvct(void)
+{
+ uint64_t tsc;
+
+ asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
+ return tsc;
+}
+
+static __rte_always_inline uint64_t
+__rte_arm64_cntvct_precise(void)
+{
+ asm volatile("isb" : : : "memory");
+ return __rte_arm64_cntvct();
+}
+
/**
* Read the time base register.
*
@@ -25,10 +53,7 @@ extern "C" {
static inline uint64_t
rte_rdtsc(void)
{
- uint64_t tsc;
-
- asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
- return tsc;
+ return __rte_arm64_cntvct();
}
#else
/**
@@ -49,14 +74,22 @@ rte_rdtsc(void)
* asm volatile("msr pmcr_el0, %0" : : "r" (val));
*
*/
-static inline uint64_t
-rte_rdtsc(void)
+
+/** Read PMU cycle counter */
+static __rte_always_inline uint64_t
+__rte_arm64_pmccntr(void)
{
uint64_t tsc;
asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
return tsc;
}
+
+static inline uint64_t
+rte_rdtsc(void)
+{
+ return __rte_arm64_pmccntr();
+}
#endif
static inline uint64_t
diff --git a/lib/librte_eal/arm/rte_cycles.c b/lib/librte_eal/arm/rte_cycles.c
index 3500d523e..5bd29b24b 100644
--- a/lib/librte_eal/arm/rte_cycles.c
+++ b/lib/librte_eal/arm/rte_cycles.c
@@ -3,14 +3,35 @@
*/
#include "eal_private.h"
+#include "rte_cycles.h"
uint64_t
get_tsc_freq_arch(void)
{
#if defined RTE_ARCH_ARM64 && !defined RTE_ARM_EAL_RDTSC_USE_PMU
- uint64_t freq;
- asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
- return freq;
+ return __rte_arm64_cntfrq();
+#elif defined RTE_ARCH_ARM64 && defined RTE_ARM_EAL_RDTSC_USE_PMU
+#define CYC_PER_1MHZ 1E6
+ /* Use the generic counter ticks to calculate the PMU
+ * cycle frequency.
+ */
+ uint64_t ticks;
+ uint64_t start_ticks, cur_ticks;
+ uint64_t start_pmu_cycles, end_pmu_cycles;
+
+ /* Number of ticks for 1/10 second */
+ ticks = __rte_arm64_cntfrq() / 10;
+
+ start_ticks = __rte_arm64_cntvct_precise();
+ start_pmu_cycles = rte_rdtsc_precise();
+ do {
+ cur_ticks = __rte_arm64_cntvct();
+ } while ((cur_ticks - start_ticks) < ticks);
+ end_pmu_cycles = rte_rdtsc_precise();
+
+ /* Adjust the cycles to next 1Mhz */
+ return RTE_ALIGN_MUL_CEIL(end_pmu_cycles - start_pmu_cycles,
+ CYC_PER_1MHZ) * 10;
#else
return 0;
#endif
--
2.17.1
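
The return expression in v2 can be checked with a small worked example. This is a sketch under stated assumptions: `align_mul_ceil` is a hypothetical stand-in for what a round-up-to-multiple operation such as RTE_ALIGN_MUL_CEIL is expected to compute on integers, and `scale_tenth_second_sample` is an illustrative name, not a function from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: round v up to the next multiple of mul. */
static uint64_t
align_mul_ceil(uint64_t v, uint64_t mul)
{
	assert(mul != 0);
	return ((v + mul - 1) / mul) * mul;
}

/*
 * Convert a cycle count sampled over 1/10 second into Hz: round the
 * sample up to a multiple of 1,000,000 cycles, then scale by 10.
 */
static uint64_t
scale_tenth_second_sample(uint64_t cycles_in_tenth_sec)
{
	return align_mul_ceil(cycles_in_tenth_sec, 1000000) * 10;
}
```

A sample of 239999900 cycles yields 2400000000 Hz. Because the rounding is applied before the multiplication by 10, the result is always a multiple of 10 MHz; a sample of 240000001 cycles, for instance, yields 2410000000 Hz.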
* [dpdk-dev] [PATCH v2 2/2] eal/arm: change inline functions to always inline
2020-06-26 20:35 ` [dpdk-dev] [PATCH v2 1/2] eal/arm: " Honnappa Nagarahalli
@ 2020-06-26 20:35 ` Honnappa Nagarahalli
2020-07-07 2:05 ` Jerin Jacob
2020-07-07 11:16 ` [dpdk-dev] [PATCH v2 1/2] eal/arm: generic counter based loop for CPU freq calculation David Marchand
1 sibling, 1 reply; 8+ messages in thread
From: Honnappa Nagarahalli @ 2020-06-26 20:35 UTC (permalink / raw)
To: dev, honnappa.nagarahalli, jerinj, hemant.agrawal, akhil.goyal,
ogerlitz, ajit.khaparde, pbhagavatula
Cc: nd
Change the inline functions to use __rte_always_inline to be
consistent with the rest of the inline functions.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
lib/librte_eal/arm/include/rte_cycles_64.h | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/lib/librte_eal/arm/include/rte_cycles_64.h b/lib/librte_eal/arm/include/rte_cycles_64.h
index e41f9dbd6..029fdc435 100644
--- a/lib/librte_eal/arm/include/rte_cycles_64.h
+++ b/lib/librte_eal/arm/include/rte_cycles_64.h
@@ -50,7 +50,7 @@ __rte_arm64_cntvct_precise(void)
* This call is portable to any ARMv8 architecture, however, typically
* cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
*/
-static inline uint64_t
+static __rte_always_inline uint64_t
rte_rdtsc(void)
{
return __rte_arm64_cntvct();
@@ -85,22 +85,25 @@ __rte_arm64_pmccntr(void)
return tsc;
}
-static inline uint64_t
+static __rte_always_inline uint64_t
rte_rdtsc(void)
{
return __rte_arm64_pmccntr();
}
#endif
-static inline uint64_t
+static __rte_always_inline uint64_t
rte_rdtsc_precise(void)
{
asm volatile("isb" : : : "memory");
return rte_rdtsc();
}
-static inline uint64_t
-rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+static __rte_always_inline uint64_t
+rte_get_tsc_cycles(void)
+{
+ return rte_rdtsc();
+}
#ifdef __cplusplus
}
--
2.17.1
* Re: [dpdk-dev] [PATCH] eal: generic counter based loop for CPU freq calculation
2020-06-24 12:50 ` Jerin Jacob
@ 2020-06-26 20:46 ` Honnappa Nagarahalli
0 siblings, 0 replies; 8+ messages in thread
From: Honnappa Nagarahalli @ 2020-06-26 20:46 UTC (permalink / raw)
To: Jerin Jacob
Cc: dpdk-dev, jerinj, hemant.agrawal, Akhil.goyal@nxp.com, ogerlitz,
Ajit Khaparde (ajit.khaparde@broadcom.com),
ruigeng.wang, Dharmik Thakkar, Phil Yang, dpdk stable, nd,
Honnappa Nagarahalli, nd
Hi Jerin,
Thanks for the comments.
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Wednesday, June 24, 2020 7:51 AM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Cc: dpdk-dev <dev@dpdk.org>; jerinj@marvell.com;
> hemant.agrawal@nxp.com; Akhil.goyal@nxp.com; ogerlitz@mellanox.com;
> Ajit Khaparde (ajit.khaparde@broadcom.com)
> <ajit.khaparde@broadcom.com>; ruigeng.wang@arm.com; Dharmik Thakkar
> <Dharmik.Thakkar@arm.com>; Phil Yang <Phil.Yang@arm.com>; dpdk stable
> <stable@dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH] eal: generic counter based loop for CPU freq
> calculation
>
> On Tue, Jun 9, 2020 at 3:04 AM Honnappa Nagarahalli
> <honnappa.nagarahalli@arm.com> wrote:
> >
> > get_tsc_freq uses 'nanosleep' system call to calculate the CPU
> > frequency. However, 'nanosleep' results in the process getting
> > un-scheduled. The kernel saves and restores the PMU state. This
> > ensures that the PMU cycles are not counted towards a sleeping
> > process. When RTE_ARM_EAL_RDTSC_USE_PMU is defined, this results in
> > incorrect CPU frequency calculation. This logic is replaced with
> > generic counter based loop.
> >
> > Bugzilla ID: 450
> > Fixes: af75078fece3 ("first public release")
>
> The Fix looks good to me.
>
> The Fixes is not correct. It should be the patch where
> RTE_ARM_EAL_RDTSC_USE_PMU got introduced.
Ok, will dig that out.
>
>
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Phil Yang <phil.yang@arm.com>
> >
> > ---
> > lib/librte_eal/arm/include/rte_cycles_64.h | 45 +++++++++++++++++++---
> > lib/librte_eal/arm/rte_cycles.c | 24 ++++++++++--
> > 2 files changed, 61 insertions(+), 8 deletions(-)
> >
> > diff --git a/lib/librte_eal/arm/include/rte_cycles_64.h b/lib/librte_eal/arm/include/rte_cycles_64.h
> > index da557b6a1..6fc352036 100644
> > --- a/lib/librte_eal/arm/include/rte_cycles_64.h
> > +++ b/lib/librte_eal/arm/include/rte_cycles_64.h
> > @@ -11,6 +11,36 @@ extern "C" {
> >
> > #include "generic/rte_cycles.h"
> >
> > +/** Read generic counter frequency */
> > +static inline uint64_t
>
> I prefer to have __rte_always_inline
>
> > +__rte_rd_generic_cntr_freq(void)
>
> I think the "generic counter" naming is confusing. Since the symbol is exposed
> by being placed in a header file, it is better to rename it to __rte_arm64_cntfrq()
Ok, makes sense.
>
> > +{
> > + uint64_t freq;
> > +
> > + asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
> > + return freq;
> > +}
> > +
> > +/** Read generic counter */
> > +static inline uint64_t
>
> Likewise, __rte_arm64_cntvct()
>
>
> > +__rte_rd_generic_cntr(void)
> > +{
> > + uint64_t tsc;
> > +
> > + asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> > + return tsc;
> > +}
> > +
> > +static inline uint64_t
> > +__rte_rd_generic_cntr_precise(void)
>
> __rte_arm64_cntvct_precise()
>
> > +{
> > + uint64_t tsc;
> > +
> > + asm volatile("isb" : : : "memory");
> > + asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> > + return tsc;
> > +}
> > +
> > /**
> > * Read the time base register.
> > *
> > @@ -25,10 +55,7 @@ extern "C" {
> > static inline uint64_t
> > rte_rdtsc(void)
> > {
> > - uint64_t tsc;
> > -
> > - asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> > - return tsc;
> > + return __rte_rd_generic_cntr();
> > }
> > #else
> > /**
> > @@ -49,14 +76,22 @@ rte_rdtsc(void)
> > * asm volatile("msr pmcr_el0, %0" : : "r" (val));
> > *
> > */
> > +
> > +/** Read PMU cycle counter */
> > static inline uint64_t
> > -rte_rdtsc(void)
> > +__rte_rd_pmu_cycle_cntr(void)
I will change this to __rte_arm64_pmccntr
> > {
> > uint64_t tsc;
> >
> > asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
> > return tsc;
> > }
> > +
> > +static inline uint64_t
> > +rte_rdtsc(void)
> > +{
> > + return __rte_rd_pmu_cycle_cntr();
> > +}
> > #endif
> >
> > static inline uint64_t
> > diff --git a/lib/librte_eal/arm/rte_cycles.c b/lib/librte_eal/arm/rte_cycles.c
> > index 3500d523e..92c87a8a4 100644
> > --- a/lib/librte_eal/arm/rte_cycles.c
> > +++ b/lib/librte_eal/arm/rte_cycles.c
> > @@ -3,14 +3,32 @@
> > */
> >
> > #include "eal_private.h"
> > +#include "rte_cycles.h"
> >
> > uint64_t
> > get_tsc_freq_arch(void)
> > {
> > #if defined RTE_ARCH_ARM64 && !defined RTE_ARM_EAL_RDTSC_USE_PMU
> > - uint64_t freq;
> > - asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
> > - return freq;
> > + return __rte_rd_generic_cntr_freq();
> > +#elif defined RTE_ARCH_ARM64 && defined RTE_ARM_EAL_RDTSC_USE_PMU
> > + /* Use the generic counter ticks to calculate the PMU
> > + * cycle frequency.
> > + */
> > + uint64_t gcnt_ticks;
> > + uint64_t start_ticks, cur_ticks;
> > + uint64_t start_pmu_cycles, end_pmu_cycles;
> > +
> > + /* Number of ticks for 1/10 second */
> > + gcnt_ticks = __rte_rd_generic_cntr_freq() / 10;
> > +
> > + start_ticks = __rte_rd_generic_cntr_precise();
> > + start_pmu_cycles = rte_rdtsc_precise();
> > + do {
> > + cur_ticks = __rte_rd_generic_cntr();
> > + } while ((cur_ticks - start_ticks) < gcnt_ticks);
> > + end_pmu_cycles = rte_rdtsc_precise();
> > +
> > + return ((end_pmu_cycles - start_pmu_cycles) * 10);
>
> Good thought. On the plus side, it will reduce the boot time by .9 sec.
>
> > #else
> > return 0;
>
> With above changes:
>
> Acked-by: Jerin Jacob <jerinj@marvell.com>
>
>
>
> > #endif
> > --
> > 2.17.1
> >
* Re: [dpdk-dev] [PATCH v2 2/2] eal/arm: change inline functions to always inline
2020-06-26 20:35 ` [dpdk-dev] [PATCH v2 2/2] eal/arm: change inline functions to always inline Honnappa Nagarahalli
@ 2020-07-07 2:05 ` Jerin Jacob
0 siblings, 0 replies; 8+ messages in thread
From: Jerin Jacob @ 2020-07-07 2:05 UTC (permalink / raw)
To: Honnappa Nagarahalli
Cc: dpdk-dev, Jerin Jacob, Hemant Agrawal, Akhil Goyal, ogerlitz,
Ajit Khaparde, Pavan Nikhilesh, nd
On Sat, Jun 27, 2020 at 2:05 AM Honnappa Nagarahalli
<honnappa.nagarahalli@arm.com> wrote:
>
> Change the inline functions to use __rte_always_inline to be
> consistent with rest of the inline functions.
>
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
> ---
> lib/librte_eal/arm/include/rte_cycles_64.h | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/lib/librte_eal/arm/include/rte_cycles_64.h b/lib/librte_eal/arm/include/rte_cycles_64.h
> index e41f9dbd6..029fdc435 100644
> --- a/lib/librte_eal/arm/include/rte_cycles_64.h
> +++ b/lib/librte_eal/arm/include/rte_cycles_64.h
> @@ -50,7 +50,7 @@ __rte_arm64_cntvct_precise(void)
> * This call is portable to any ARMv8 architecture, however, typically
> * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> */
> -static inline uint64_t
> +static __rte_always_inline uint64_t
> rte_rdtsc(void)
> {
> return __rte_arm64_cntvct();
> @@ -85,22 +85,25 @@ __rte_arm64_pmccntr(void)
> return tsc;
> }
>
> -static inline uint64_t
> +static __rte_always_inline uint64_t
> rte_rdtsc(void)
> {
> return __rte_arm64_pmccntr();
> }
> #endif
>
> -static inline uint64_t
> +static __rte_always_inline uint64_t
> rte_rdtsc_precise(void)
> {
> asm volatile("isb" : : : "memory");
> return rte_rdtsc();
> }
>
> -static inline uint64_t
> -rte_get_tsc_cycles(void) { return rte_rdtsc(); }
> +static __rte_always_inline uint64_t
> +rte_get_tsc_cycles(void)
> +{
> + return rte_rdtsc();
> +}
>
> #ifdef __cplusplus
> }
> --
> 2.17.1
>
* Re: [dpdk-dev] [PATCH v2 1/2] eal/arm: generic counter based loop for CPU freq calculation
2020-06-26 20:35 ` [dpdk-dev] [PATCH v2 1/2] eal/arm: " Honnappa Nagarahalli
2020-06-26 20:35 ` [dpdk-dev] [PATCH v2 2/2] eal/arm: change inline functions to always inline Honnappa Nagarahalli
@ 2020-07-07 11:16 ` David Marchand
1 sibling, 0 replies; 8+ messages in thread
From: David Marchand @ 2020-07-07 11:16 UTC (permalink / raw)
To: Honnappa Nagarahalli
Cc: dev, Jerin Jacob Kollanukkaran, Hemant Agrawal, Akhil Goyal,
ogerlitz, Ajit Khaparde, Pavan Nikhilesh, nd, dpdk stable
On Fri, Jun 26, 2020 at 10:35 PM Honnappa Nagarahalli
<honnappa.nagarahalli@arm.com> wrote:
>
> get_tsc_freq uses 'nanosleep' system call to calculate the CPU
> frequency. However, 'nanosleep' results in the process getting
> un-scheduled. The kernel saves and restores the PMU state. This
> ensures that the PMU cycles are not counted towards a sleeping
> process. When RTE_ARM_EAL_RDTSC_USE_PMU is defined, this results
> in incorrect CPU frequency calculation. This logic is replaced
> with generic counter based loop.
>
> Bugzilla ID: 450
> Fixes: f91bcbb2d9a6 ("eal/arm: use high-resolution cycle counter")
> Cc: stable@dpdk.org
>
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
> Acked-by: Jerin Jacob <jerinj@marvell.com>
Series applied, thanks.
--
David Marchand