DPDK patches and discussions
* [dpdk-dev]  [PATCH] eal/armv8: high-resolution cycle counter
@ 2016-08-18 11:51 Jerin Jacob
  2016-08-19  9:43 ` Nipun Gupta
  2016-08-23 10:01 ` Hemant Agrawal
  0 siblings, 2 replies; 8+ messages in thread
From: Jerin Jacob @ 2016-08-18 11:51 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, jianbo.liu, viktorin, Jerin Jacob

The existing cntvct_el0 based rte_rdtsc() provides a portable
means to read a wall clock counter from user space. Typically
it runs at <= 100MHz.

An alternative way to get a high resolution wall clock counter
for rte_rdtsc() is through the armv8 PMU subsystem.
The PMU cycle counter runs at CPU frequency. However,
access to the PMU cycle counter from user space is not enabled
by default in the arm64 linux kernel.
It is possible to enable user space access to the cycle counter
by configuring the PMU from the privileged mode (kernel space).

By default, the rte_rdtsc() implementation uses the portable
cntvct_el0 scheme. An application can choose the PMU based
implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---

The PMU based scheme is useful for high accuracy performance profiling.
Below are example steps to configure the PMU based cycle counter on an
armv8 machine.

# git clone https://github.com/jerinjacobk/armv8_pmu_cycle_counter_el0
# cd armv8_pmu_cycle_counter_el0
# make
# sudo insmod pmu_el0_cycle_counter.ko
# cd $DPDK_DIR
# make config T=arm64-armv8a-linuxapp-gcc
# echo "CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU=y" >> build/.config
# make -j 4
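
For illustration only, here is a minimal sketch of how the selected counter
could be used to time a code region. sketch_rdtsc(), sketch_measure() and
busy_loop() are hypothetical names and are not part of this patch or of DPDK.
The non-arm64 fallback exists only so the sketch builds on other hosts, and the
PMU branch assumes the kernel module above has been loaded (otherwise the read
is expected to fault at EL0).

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for rte_rdtsc(); not the DPDK header itself. */
static inline uint64_t
sketch_rdtsc(void)
{
#if defined(__aarch64__) && defined(RTE_ARM_EAL_RDTSC_USE_PMU)
	uint64_t tsc;

	/* Needs EL0 access to the PMU enabled from kernel space. */
	asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
	return tsc;
#elif defined(__aarch64__)
	uint64_t tsc;

	/* Portable default: the generic timer virtual count. */
	asm volatile("mrs %0, cntvct_el0" : "=r"(tsc));
	return tsc;
#else
	/* Assumption: fallback for other hosts, in nanoseconds. */
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}

/* Return the number of counter ticks spent inside fn(arg). */
static uint64_t
sketch_measure(void (*fn)(void *), void *arg)
{
	uint64_t start = sketch_rdtsc();

	fn(arg);
	return sketch_rdtsc() - start;
}

/* Example workload: sum 0..99999 into the caller's accumulator. */
static void
busy_loop(void *arg)
{
	volatile uint64_t *sum = arg;
	uint64_t i;

	for (i = 0; i < 100000; i++)
		*sum += i;
}
```

Calling sketch_measure(busy_loop, &sum) returns the elapsed ticks in whatever
unit the selected counter uses, so results from the two schemes are not
directly comparable without converting by the counter frequency.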

---
 .../common/include/arch/arm/rte_cycles_64.h        | 33 ++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
index 14f2612..867a946 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
@@ -45,6 +45,11 @@ extern "C" {
  * @return
  *   The time base for this lcore.
  */
+#ifndef RTE_ARM_EAL_RDTSC_USE_PMU
+/**
+ * This call is portable to any ARMv8 architecture, however, typically
+ * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
+ */
 static inline uint64_t
 rte_rdtsc(void)
 {
@@ -53,6 +58,34 @@ rte_rdtsc(void)
 	asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
 	return tsc;
 }
+#else
+/**
+ * This is an alternative method to enable rte_rdtsc() with high resolution
+ * PMU cycles counter.The cycle counter runs at cpu frequency and this scheme
+ * uses ARMv8 PMU subsystem to get the cycle counter at userspace, However,
+ * access to PMU cycle counter from user space is not enabled by default in
+ * arm64 linux kernel.
+ * It is possible to enable cycle counter at user space access by configuring
+ * the PMU from the privileged mode (kernel space).
+ *
+ * asm volatile("msr pmintenset_el1, %0" : : "r" ((u64)(0 << 31)));
+ * asm volatile("msr pmcntenset_el0, %0" :: "r" BIT(31));
+ * asm volatile("msr pmuserenr_el0, %0" : : "r"(BIT(0) | BIT(2)));
+ * asm volatile("mrs %0, pmcr_el0" : "=r" (val));
+ * val |= (BIT(0) | BIT(2));
+ * isb();
+ * asm volatile("msr pmcr_el0, %0" : : "r" (val));
+ *
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	uint64_t tsc;
+
+	asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
+	return tsc;
+}
+#endif
 
 static inline uint64_t
 rte_rdtsc_precise(void)
-- 
2.5.5


* Re: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
  2016-08-18 11:51 [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter Jerin Jacob
@ 2016-08-19  9:43 ` Nipun Gupta
  2016-08-19 11:46   ` Jerin Jacob
  2016-08-23 10:01 ` Hemant Agrawal
  1 sibling, 1 reply; 8+ messages in thread
From: Nipun Gupta @ 2016-08-19  9:43 UTC (permalink / raw)
  To: Jerin Jacob, dev; +Cc: thomas.monjalon, jianbo.liu, viktorin, Hemant Agrawal

Hi Jerin,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
> Sent: Thursday, August 18, 2016 17:22
> To: dev@dpdk.org
> Cc: thomas.monjalon@6wind.com; jianbo.liu@linaro.org;
> viktorin@rehivetech.com; Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Subject: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
> 
> Existing cntvct_el0 based rte_rdtsc() provides portable
> means to get wall clock counter at user space. Typically
> it runs at <= 100MHz.
> 
> The alternative method to enable rte_rdtsc() for high resolution
> wall clock counter is through armv8 PMU subsystem.
> The PMU cycle counter runs at CPU frequency, However,
> access to PMU cycle counter from user space is not enabled
> by default in the arm64 linux kernel.
> It is possible to enable cycle counter at user space access
> by configuring the PMU from the privileged mode (kernel space).
> 
> by default rte_rdtsc() implementation uses portable
> cntvct_el0 scheme. Application can choose the PMU based
> implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
> 
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> ---
> 
> The PMU based scheme useful for high accuracy performance profiling.
> Find below the example steps to configure the PMU based cycle counter on an
> armv8 machine.
> 
> # git clone https://github.com/jerinjacobk/armv8_pmu_cycle_counter_el0
> # cd armv8_pmu_cycle_counter_el0
> # make
> # sudo insmod pmu_el0_cycle_counter.ko
> # cd $DPDK_DIR
> # make config T=arm64-armv8a-linuxapp-gcc
> # echo "CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU=y" >> build/.config
> # make -j 4

Can we make this kernel module a part of DPDK as well? Maybe under linuxapp, so that it is compiled along with DPDK?

> 
> ---
>  .../common/include/arch/arm/rte_cycles_64.h        | 33
> ++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
> 
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> index 14f2612..867a946 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> @@ -45,6 +45,11 @@ extern "C" {
>   * @return
>   *   The time base for this lcore.
>   */
> +#ifndef RTE_ARM_EAL_RDTSC_USE_PMU
> +/**
> + * This call is portable to any ARMv8 architecture, however, typically
> + * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> + */
>  static inline uint64_t
>  rte_rdtsc(void)
>  {
> @@ -53,6 +58,34 @@ rte_rdtsc(void)
>  	asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
>  	return tsc;
>  }
> +#else
> +/**
> + * This is an alternative method to enable rte_rdtsc() with high resolution
> + * PMU cycles counter.The cycle counter runs at cpu frequency and this scheme
> + * uses ARMv8 PMU subsystem to get the cycle counter at userspace, However,
> + * access to PMU cycle counter from user space is not enabled by default in
> + * arm64 linux kernel.
> + * It is possible to enable cycle counter at user space access by configuring
> + * the PMU from the privileged mode (kernel space).
> + *
> + * asm volatile("msr pmintenset_el1, %0" : : "r" ((u64)(0 << 31)));
> + * asm volatile("msr pmcntenset_el0, %0" :: "r" BIT(31));
> + * asm volatile("msr pmuserenr_el0, %0" : : "r"(BIT(0) | BIT(2)));
> + * asm volatile("mrs %0, pmcr_el0" : "=r" (val));
> + * val |= (BIT(0) | BIT(2));
> + * isb();
> + * asm volatile("msr pmcr_el0, %0" : : "r" (val));

In your git repo I see that the cycle counter is not disabled on cleanup (PMCNTENCLR_EL0). It would be better to disable the cycle counter as well at module exit.
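
To make the set/clear pairing concrete, here is a rough sketch that models the
register semantics in plain C. It is not kernel code and not taken from the
repo: struct pmu_model and the model_*() and write_*() helpers are hypothetical,
and clearing PMUSERENR_EL0 on exit is an assumption about what a cleanup path
might do. It only shows that PMCNTENSET_EL0 and PMCNTENCLR_EL0 are
write-one-to-set and write-one-to-clear views of the same enable bits.

```c
#include <stdint.h>

#define BIT(n) (UINT64_C(1) << (n))

/* Plain-C model of the PMU state touched here; not real registers. */
struct pmu_model {
	uint64_t pmcnten;   /* counter enables, bit 31 = cycle counter */
	uint64_t pmuserenr; /* EL0 access control: bit 0 EN, bit 2 CR */
	uint64_t pmcr;      /* control register: bit 0 E (enable) */
};

/* PMCNTENSET_EL0: bits written as 1 are set, 0 bits are ignored. */
static void
write_pmcntenset(struct pmu_model *p, uint64_t v)
{
	p->pmcnten |= v;
}

/* PMCNTENCLR_EL0: bits written as 1 are cleared, 0 bits are ignored. */
static void
write_pmcntenclr(struct pmu_model *p, uint64_t v)
{
	p->pmcnten &= ~v;
}

/* Mirrors the enable sequence quoted above (module init). */
static void
model_enable(struct pmu_model *p)
{
	write_pmcntenset(p, BIT(31));
	p->pmuserenr = BIT(0) | BIT(2);
	/* PMCR_EL0 bit 2 (C) only resets the counter, so only E is kept. */
	p->pmcr |= BIT(0);
}

/* Assumed cleanup at module exit: undo what model_enable() set up. */
static void
model_disable(struct pmu_model *p)
{
	write_pmcntenclr(p, BIT(31));
	p->pmuserenr = 0;
}
```

In a real module these accesses would be msr/mrs instructions and, since the
PMU registers are per CPU, would presumably have to run on every core.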

> + *
> + */
> +static inline uint64_t
> +rte_rdtsc(void)
> +{
> +	uint64_t tsc;
> +
> +	asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
> +	return tsc;
> +}
> +#endif
> 
>  static inline uint64_t
>  rte_rdtsc_precise(void)
> --
> 2.5.5

Do you also plan to support performance monitor event counters?

Regards,
Nipun


* Re: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
  2016-08-19  9:43 ` Nipun Gupta
@ 2016-08-19 11:46   ` Jerin Jacob
  2016-08-19 12:24     ` Jan Viktorin
  0 siblings, 1 reply; 8+ messages in thread
From: Jerin Jacob @ 2016-08-19 11:46 UTC (permalink / raw)
  To: Nipun Gupta; +Cc: dev, thomas.monjalon, jianbo.liu, viktorin, Hemant Agrawal

On Fri, Aug 19, 2016 at 09:43:36AM +0000, Nipun Gupta wrote:
> Hi Jerin,
> 

Hi Nipun,

> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
> > Sent: Thursday, August 18, 2016 17:22
> > To: dev@dpdk.org
> > Cc: thomas.monjalon@6wind.com; jianbo.liu@linaro.org;
> > viktorin@rehivetech.com; Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > Subject: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
> > 
> > Existing cntvct_el0 based rte_rdtsc() provides portable
> > means to get wall clock counter at user space. Typically
> > it runs at <= 100MHz.
> > 
> > The alternative method to enable rte_rdtsc() for high resolution
> > wall clock counter is through armv8 PMU subsystem.
> > The PMU cycle counter runs at CPU frequency, However,
> > access to PMU cycle counter from user space is not enabled
> > by default in the arm64 linux kernel.
> > It is possible to enable cycle counter at user space access
> > by configuring the PMU from the privileged mode (kernel space).
> > 
> > by default rte_rdtsc() implementation uses portable
> > cntvct_el0 scheme. Application can choose the PMU based
> > implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
> > 
> > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > ---
> > 
> > The PMU based scheme useful for high accuracy performance profiling.
> > Find below the example steps to configure the PMU based cycle counter on an
> > armv8 machine.
> > 
> > # git clone https://github.com/jerinjacobk/armv8_pmu_cycle_counter_el0
> > # cd armv8_pmu_cycle_counter_el0
> > # make
> > # sudo insmod pmu_el0_cycle_counter.ko
> > # cd $DPDK_DIR
> > # make config T=arm64-armv8a-linuxapp-gcc
> > # echo "CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU=y" >> build/.config
> > # make -j 4
> 
> Can we make this kernel module also a part of DPDK. May be in the linuxapp so that it is also compiled with DPDK?

I thought so too. Later I realized it may not be a good idea to add yet
another out-of-tree module to the DPDK repo, since DPDK is trying to get
rid of its existing out-of-tree modules.

> 
> > 
> > ---
> >  .../common/include/arch/arm/rte_cycles_64.h        | 33
> > ++++++++++++++++++++++
> >  1 file changed, 33 insertions(+)
> > 
> > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > index 14f2612..867a946 100644
> > --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > @@ -45,6 +45,11 @@ extern "C" {
> >   * @return
> >   *   The time base for this lcore.
> >   */
> > +#ifndef RTE_ARM_EAL_RDTSC_USE_PMU
> > +/**
> > + * This call is portable to any ARMv8 architecture, however, typically
> > + * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> > + */
> >  static inline uint64_t
> >  rte_rdtsc(void)
> >  {
> > @@ -53,6 +58,34 @@ rte_rdtsc(void)
> >  	asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> >  	return tsc;
> >  }
> > +#else
> > +/**
> > + * This is an alternative method to enable rte_rdtsc() with high resolution
> > + * PMU cycles counter.The cycle counter runs at cpu frequency and this scheme
> > + * uses ARMv8 PMU subsystem to get the cycle counter at userspace, However,
> > + * access to PMU cycle counter from user space is not enabled by default in
> > + * arm64 linux kernel.
> > + * It is possible to enable cycle counter at user space access by configuring
> > + * the PMU from the privileged mode (kernel space).
> > + *
> > + * asm volatile("msr pmintenset_el1, %0" : : "r" ((u64)(0 << 31)));
> > + * asm volatile("msr pmcntenset_el0, %0" :: "r" BIT(31));
> > + * asm volatile("msr pmuserenr_el0, %0" : : "r"(BIT(0) | BIT(2)));
> > + * asm volatile("mrs %0, pmcr_el0" : "=r" (val));
> > + * val |= (BIT(0) | BIT(2));
> > + * isb();
> > + * asm volatile("msr pmcr_el0, %0" : : "r" (val));
> 
> In your git repo I see that on cleanup the cycle count register is not disabled (PMCNTENCLR_EL0). It shall be better to disable the cycle count register too at module exit.

OK

> 
> > + *
> > + */
> > +static inline uint64_t
> > +rte_rdtsc(void)
> > +{
> > +	uint64_t tsc;
> > +
> > +	asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
> > +	return tsc;
> > +}
> > +#endif
> > 
> >  static inline uint64_t
> >  rte_rdtsc_precise(void)
> > --
> > 2.5.5
> 
> Do you also plan to support performance monitor event counters?

No. This patch was inspired by the armv7 PMU scheme, which is already part of
DPDK. The sole reason for adding this support is to catch performance
regressions through the app/test application. Other than that, I think the
existing cntvct_el0 based scheme is good enough for all the use cases.
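
To put rough numbers on the resolution difference, here is a small worked
sketch. The helper names are hypothetical and the 2 GHz CPU clock is only an
assumed example value: a 100 MHz counter ticks every 10 ns, which is 20 CPU
cycles at 2 GHz, so anything shorter than that cannot be resolved with
cntvct_el0.

```c
#include <stdint.h>

/* Nanoseconds covered by one tick of a counter running at freq_hz. */
static double
tick_ns(uint64_t freq_hz)
{
	return 1e9 / (double)freq_hz;
}

/* CPU cycles that elapse during one tick of a slower counter. */
static double
cycles_per_tick(uint64_t cpu_hz, uint64_t counter_hz)
{
	return (double)cpu_hz / (double)counter_hz;
}

/*
 * Convert ticks to nanoseconds (truncating). Whole seconds and the
 * remainder are scaled separately to avoid 64-bit overflow.
 */
static uint64_t
ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	uint64_t secs = ticks / freq_hz;
	uint64_t rem = ticks % freq_hz;

	return secs * 1000000000ULL + rem * 1000000000ULL / freq_hz;
}

#if defined(__aarch64__)
/* The generic timer frequency can be read from cntfrq_el0. */
static uint64_t
read_cntfrq(void)
{
	uint64_t freq;

	asm volatile("mrs %0, cntfrq_el0" : "=r"(freq));
	return freq;
}
#endif
```

For example, tick_ns(100000000) gives 10 ns per tick and
cycles_per_tick(2000000000, 100000000) gives 20 cycles per tick.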

> 
> Regards,
> Nipun
> 


* Re: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
  2016-08-19 11:46   ` Jerin Jacob
@ 2016-08-19 12:24     ` Jan Viktorin
  2016-08-19 12:52       ` Jerin Jacob
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Viktorin @ 2016-08-19 12:24 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: Nipun Gupta, dev, thomas.monjalon, jianbo.liu, Hemant Agrawal

On Fri, 19 Aug 2016 17:16:12 +0530
Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:

> On Fri, Aug 19, 2016 at 09:43:36AM +0000, Nipun Gupta wrote:
> > Hi Jerin,
> >   
> 
> Hi Nipun,
> 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
> > > Sent: Thursday, August 18, 2016 17:22
> > > To: dev@dpdk.org
> > > Cc: thomas.monjalon@6wind.com; jianbo.liu@linaro.org;
> > > viktorin@rehivetech.com; Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > Subject: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
> > > 
> > > Existing cntvct_el0 based rte_rdtsc() provides portable
> > > means to get wall clock counter at user space. Typically
> > > it runs at <= 100MHz.
> > > 
> > > The alternative method to enable rte_rdtsc() for high resolution
> > > wall clock counter is through armv8 PMU subsystem.
> > > The PMU cycle counter runs at CPU frequency, However,
> > > access to PMU cycle counter from user space is not enabled
> > > by default in the arm64 linux kernel.
> > > It is possible to enable cycle counter at user space access
> > > by configuring the PMU from the privileged mode (kernel space).
> > > 
> > > by default rte_rdtsc() implementation uses portable
> > > cntvct_el0 scheme. Application can choose the PMU based
> > > implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
> > > 
> > > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > ---
> > > 
> > > The PMU based scheme useful for high accuracy performance profiling.
> > > Find below the example steps to configure the PMU based cycle counter on an
> > > armv8 machine.
> > > 
> > > # git clone https://github.com/jerinjacobk/armv8_pmu_cycle_counter_el0
> > > # cd armv8_pmu_cycle_counter_el0
> > > # make
> > > # sudo insmod pmu_el0_cycle_counter.ko
> > > # cd $DPDK_DIR
> > > # make config T=arm64-armv8a-linuxapp-gcc
> > > # echo "CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU=y" >> build/.config
> > > # make -j 4  
> > 
> > Can we make this kernel module also a part of DPDK. May be in the linuxapp so that it is also compiled with DPDK?  
> 
> I thought so, Later I realized it may not be a good idea to add yet
> another out of tree module in DPDK repo and DPDK tries to get rid of
> existing out of tree modules.

This has also been my way of thinking. However, if we discover that such a
kernel module would be really useful, I think we can do it.

> 
> >   
> > > 
> > > ---
> > >  .../common/include/arch/arm/rte_cycles_64.h        | 33
> > > ++++++++++++++++++++++
> > >  1 file changed, 33 insertions(+)
> > > 
> > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > index 14f2612..867a946 100644
> > > --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > @@ -45,6 +45,11 @@ extern "C" {
> > >   * @return
> > >   *   The time base for this lcore.
> > >   */
> > > +#ifndef RTE_ARM_EAL_RDTSC_USE_PMU
> > > +/**
> > > + * This call is portable to any ARMv8 architecture, however, typically
> > > + * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> > > + */
> > >  static inline uint64_t
> > >  rte_rdtsc(void)
> > >  {
> > > @@ -53,6 +58,34 @@ rte_rdtsc(void)
> > >  	asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> > >  	return tsc;
> > >  }
> > > +#else
> > > +/**
> > > + * This is an alternative method to enable rte_rdtsc() with high resolution
> > > + * PMU cycles counter.The cycle counter runs at cpu frequency and this scheme
> > > + * uses ARMv8 PMU subsystem to get the cycle counter at userspace, However,
> > > + * access to PMU cycle counter from user space is not enabled by default in
> > > + * arm64 linux kernel.
> > > + * It is possible to enable cycle counter at user space access by configuring
> > > + * the PMU from the privileged mode (kernel space).
> > > + *
> > > + * asm volatile("msr pmintenset_el1, %0" : : "r" ((u64)(0 << 31)));
> > > + * asm volatile("msr pmcntenset_el0, %0" :: "r" BIT(31));
> > > + * asm volatile("msr pmuserenr_el0, %0" : : "r"(BIT(0) | BIT(2)));
> > > + * asm volatile("mrs %0, pmcr_el0" : "=r" (val));
> > > + * val |= (BIT(0) | BIT(2));
> > > + * isb();
> > > + * asm volatile("msr pmcr_el0, %0" : : "r" (val));  
> > 
> > In your git repo I see that on cleanup the cycle count register is not disabled (PMCNTENCLR_EL0). It shall be better to disable the cycle count register too at module exit.  
> 
> OK

+1

I have a private kernel driver that (hopefully) properly enables and disables
this for ARMv7. If we'd like to merge it, I'd like to have a single module,
or at least a single module with 2 implementations...

I can post it if it would be helpful.

Regards
Jan




-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic


* Re: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
  2016-08-19 12:24     ` Jan Viktorin
@ 2016-08-19 12:52       ` Jerin Jacob
  2016-10-04  8:42         ` Thomas Monjalon
  0 siblings, 1 reply; 8+ messages in thread
From: Jerin Jacob @ 2016-08-19 12:52 UTC (permalink / raw)
  To: Jan Viktorin
  Cc: Nipun Gupta, dev, thomas.monjalon, jianbo.liu, Hemant Agrawal

On Fri, Aug 19, 2016 at 02:24:58PM +0200, Jan Viktorin wrote:
> On Fri, 19 Aug 2016 17:16:12 +0530
> Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
> 
> 
> I've got a private kernel driver enabling and disabling (hopefully) properly
> this for ARMv7. If we'd like to merge it, I'd like to have a single module
> or at least single module with 2 implementations...
> 
> I can post it if it would be helpful.

I don't think we can use this in production, as it may alter the PMU state
used by 'perf' etc. I think we should keep it as a debug interface for armv7
and armv8 and disable it by default.




* Re: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
  2016-08-18 11:51 [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter Jerin Jacob
  2016-08-19  9:43 ` Nipun Gupta
@ 2016-08-23 10:01 ` Hemant Agrawal
  2016-10-04  8:46   ` Thomas Monjalon
  1 sibling, 1 reply; 8+ messages in thread
From: Hemant Agrawal @ 2016-08-23 10:01 UTC (permalink / raw)
  To: Jerin Jacob, dev; +Cc: thomas.monjalon, jianbo.liu, viktorin


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
> Sent: Thursday, August 18, 2016 5:22 PM
> To: dev@dpdk.org
> Cc: thomas.monjalon@6wind.com; jianbo.liu@linaro.org;
> viktorin@rehivetech.com; Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Subject: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
> 
> Existing cntvct_el0 based rte_rdtsc() provides portable means to get wall clock
> counter at user space. Typically it runs at <= 100MHz.
> 
> The alternative method to enable rte_rdtsc() for high resolution wall clock
> counter is through armv8 PMU subsystem.
> The PMU cycle counter runs at CPU frequency, However, access to PMU cycle
> counter from user space is not enabled by default in the arm64 linux kernel.
> It is possible to enable cycle counter at user space access by configuring the
> PMU from the privileged mode (kernel space).
> 
> by default rte_rdtsc() implementation uses portable
> cntvct_el0 scheme. Application can choose the PMU based implementation with
> CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
> 
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>


* Re: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
  2016-08-19 12:52       ` Jerin Jacob
@ 2016-10-04  8:42         ` Thomas Monjalon
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Monjalon @ 2016-10-04  8:42 UTC (permalink / raw)
  To: Jerin Jacob, Jan Viktorin; +Cc: Nipun Gupta, dev, jianbo.liu, Hemant Agrawal

2016-08-19 18:22, Jerin Jacob:
> On Fri, Aug 19, 2016 at 02:24:58PM +0200, Jan Viktorin wrote:
> > On Fri, 19 Aug 2016 17:16:12 +0530
> > Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
> > 
> > 
> > I've got a private kernel driver enabling and disabling (hopefully) properly
> > this for ARMv7. If we'd like to merge it, I'd like to have a single module
> > or at least single module with 2 implementations...
> > 
> > I can post it if it would be helpful.
> 
> I don't think we can use this in production as this may alter PMU state used
> by 'perf' etc.I think let it be a debug interface for armv7 and armv8
> and disable it by default.

Please could you document the use of PMU for debug and how it alters
usage of kernel counters?
A patch in doc/guides/prog_guide/profile_app.rst would be welcome.

Ideally, it would be a lot better to have a sysfs entry to enable PMU
counter with an upstream kernel.


* Re: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
  2016-08-23 10:01 ` Hemant Agrawal
@ 2016-10-04  8:46   ` Thomas Monjalon
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Monjalon @ 2016-10-04  8:46 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: Hemant Agrawal, dev, jianbo.liu, viktorin

> > Existing cntvct_el0 based rte_rdtsc() provides portable means to get wall clock
> > counter at user space. Typically it runs at <= 100MHz.
> > 
> > The alternative method to enable rte_rdtsc() for high resolution wall clock
> > counter is through armv8 PMU subsystem.
> > The PMU cycle counter runs at CPU frequency, However, access to PMU cycle
> > counter from user space is not enabled by default in the arm64 linux kernel.
> > It is possible to enable cycle counter at user space access by configuring the
> > PMU from the privileged mode (kernel space).
> > 
> > by default rte_rdtsc() implementation uses portable
> > cntvct_el0 scheme. Application can choose the PMU based implementation with
> > CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
> > 
> > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> 
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

Applied, thanks

Please do not forget documentation and upstreaming efforts.


end of thread, other threads:[~2016-10-04  8:46 UTC | newest]

Thread overview: 8+ messages
2016-08-18 11:51 [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter Jerin Jacob
2016-08-19  9:43 ` Nipun Gupta
2016-08-19 11:46   ` Jerin Jacob
2016-08-19 12:24     ` Jan Viktorin
2016-08-19 12:52       ` Jerin Jacob
2016-10-04  8:42         ` Thomas Monjalon
2016-08-23 10:01 ` Hemant Agrawal
2016-10-04  8:46   ` Thomas Monjalon
