Date: Fri, 19 Aug 2016 14:24:58 +0200
From: Jan Viktorin
To: Jerin Jacob
Cc: Nipun Gupta, "dev@dpdk.org", "thomas.monjalon@6wind.com", "jianbo.liu@linaro.org", Hemant Agrawal
Message-ID: <20160819142458.42dad72b@pcviktorin.fit.vutbr.cz>
Organization: RehiveTech
Subject: Re: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter

On Fri, 19 Aug 2016 17:16:12 +0530
Jerin Jacob wrote:

> On Fri, Aug 19, 2016 at 09:43:36AM +0000, Nipun Gupta wrote:
> > Hi Jerin,
>
> Hi Nipun,
>
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
> > > Sent: Thursday, August 18, 2016 17:22
> > > To: dev@dpdk.org
> > > Cc: thomas.monjalon@6wind.com; jianbo.liu@linaro.org;
> > > viktorin@rehivetech.com; Jerin Jacob
> > > Subject: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
> > >
> > > Existing cntvct_el0 based rte_rdtsc() provides portable
> > > means to get wall clock counter at user space. Typically
> > > it runs at <= 100MHz.
> > >
> > > The alternative method to enable rte_rdtsc() for high resolution
> > > wall clock counter is through armv8 PMU subsystem.
> > > The PMU cycle counter runs at CPU frequency, However,
> > > access to PMU cycle counter from user space is not enabled
> > > by default in the arm64 linux kernel.
> > > It is possible to enable cycle counter at user space access
> > > by configuring the PMU from the privileged mode (kernel space).
> > >
> > > by default rte_rdtsc() implementation uses portable
> > > cntvct_el0 scheme. Application can choose the PMU based
> > > implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
> > >
> > > Signed-off-by: Jerin Jacob
> > > ---
> > >
> > > The PMU based scheme useful for high accuracy performance profiling.
> > > Find below the example steps to configure the PMU based cycle counter on an
> > > armv8 machine.
> > >
> > > # git clone https://github.com/jerinjacobk/armv8_pmu_cycle_counter_el0
> > > # cd armv8_pmu_cycle_counter_el0
> > > # make
> > > # sudo insmod pmu_el0_cycle_counter.ko
> > > # cd $DPDK_DIR
> > > # make config T=arm64-armv8a-linuxapp-gcc
> > > # echo "CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU=y" >> build/.config
> > > # make -j 4
> >
> > Can we make this kernel module also a part of DPDK. May be in the linuxapp so that it is also compiled with DPDK?
>
> I thought so, Later I realized it may not be a good idea to add yet
> another out of tree module in DPDK repo and DPDK tries to get rid of
> existing out of tree modules.

This has also been my way of thinking. However, if we discover that such
a kernel module would be really useful, I think we can do it.
>
> >
> > >
> > > ---
> > >  .../common/include/arch/arm/rte_cycles_64.h        | 33
> > > ++++++++++++++++++++++
> > >  1 file changed, 33 insertions(+)
> > >
> > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > index 14f2612..867a946 100644
> > > --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > @@ -45,6 +45,11 @@ extern "C" {
> > >   * @return
> > >   *   The time base for this lcore.
> > >   */
> > > +#ifndef RTE_ARM_EAL_RDTSC_USE_PMU
> > > +/**
> > > + * This call is portable to any ARMv8 architecture, however, typically
> > > + * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> > > + */
> > >  static inline uint64_t
> > >  rte_rdtsc(void)
> > >  {
> > > @@ -53,6 +58,34 @@ rte_rdtsc(void)
> > >  	asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> > >  	return tsc;
> > >  }
> > > +#else
> > > +/**
> > > + * This is an alternative method to enable rte_rdtsc() with high resolution
> > > + * PMU cycles counter.The cycle counter runs at cpu frequency and this scheme
> > > + * uses ARMv8 PMU subsystem to get the cycle counter at userspace, However,
> > > + * access to PMU cycle counter from user space is not enabled by default in
> > > + * arm64 linux kernel.
> > > + * It is possible to enable cycle counter at user space access by configuring
> > > + * the PMU from the privileged mode (kernel space).
> > > + *
> > > + * asm volatile("msr pmintenset_el1, %0" : : "r" ((u64)(0 << 31)));
> > > + * asm volatile("msr pmcntenset_el0, %0" :: "r" BIT(31));
> > > + * asm volatile("msr pmuserenr_el0, %0" : : "r"(BIT(0) | BIT(2)));
> > > + * asm volatile("mrs %0, pmcr_el0" : "=r" (val));
> > > + * val |= (BIT(0) | BIT(2));
> > > + * isb();
> > > + * asm volatile("msr pmcr_el0, %0" : : "r" (val));
> >
> > In your git repo I see that on cleanup the cycle count register is not disabled (PMCNTENCLR_EL0).
> > It shall be better to disable the cycle count register too at module exit.
>
> OK

+1

I've got a private kernel driver that (hopefully) enables and disables
this properly for ARMv7. If we'd like to merge it, I'd like to have a
single module, or at least a single module with 2 implementations...

I can post it if it would be helpful.

Regards
Jan

>
> >
> > > + *
> > > + */
> > > +static inline uint64_t
> > > +rte_rdtsc(void)
> > > +{
> > > +	uint64_t tsc;
> > > +
> > > +	asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
> > > +	return tsc;
> > > +}
> > > +#endif
> > >
> > >  static inline uint64_t
> > >  rte_rdtsc_precise(void)
> > > --
> > > 2.5.5
> >
> > Do you also plan to support performance monitor event counters?
>
> No. This patch was inspired by armv7 PMU scheme and its part of DPDK.
> The sole reason to add this support to catch any performance regression
> through app/test application.Other than that, I think cntvct_el0 based
> existing scheme is good enough for all the use cases.
>
> >
> > Regards,
> > Nipun
> >

-- 
  Jan Viktorin                E-mail: Viktorin@RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic
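
The cleanup point raised in the thread (disabling the cycle counter at
module exit) can be sketched as a small user-space model. This is only an
illustration, not the code from the armv8_pmu_cycle_counter_el0 module:
struct pmu_model, pmu_model_enable() and pmu_model_disable() are
hypothetical names, and the registers are plain variables so the sketch
compiles on any machine. The bit positions are the ones used in the
register writes quoted in the patch comment. Clearing PMUSERENR_EL0 on
exit is an assumption about what a symmetric cleanup would do.

```c
#include <stdint.h>

#define PMU_BIT(n)     (UINT64_C(1) << (n))
#define PMCNTEN_CYCLE  PMU_BIT(31) /* cycle counter bit in PMCNTENSET/PMCNTENCLR_EL0 */
#define PMUSERENR_EN   PMU_BIT(0)  /* user space access enable */
#define PMUSERENR_CR   PMU_BIT(2)  /* user space cycle counter read enable */

/* Hypothetical software stand-in for the PMU state touched here. */
struct pmu_model {
	uint64_t pmcnten;   /* counter enables, as seen via PMCNTENSET/CLR_EL0 */
	uint64_t pmuserenr; /* PMUSERENR_EL0 */
};

/* Module init: mirrors the msr writes quoted in the patch comment. */
static void
pmu_model_enable(struct pmu_model *p)
{
	/* msr pmcntenset_el0, BIT(31): write-1-to-set, other bits untouched */
	p->pmcnten |= PMCNTEN_CYCLE;
	/* msr pmuserenr_el0, BIT(0) | BIT(2) */
	p->pmuserenr = PMUSERENR_EN | PMUSERENR_CR;
}

/* Module exit: the step the review asks for, undoing the enable. */
static void
pmu_model_disable(struct pmu_model *p)
{
	/* msr pmcntenclr_el0, BIT(31): write-1-to-clear, other bits untouched */
	p->pmcnten &= ~PMCNTEN_CYCLE;
	/* Assumption: also revoke user space access again. */
	p->pmuserenr = 0;
}
```

In a real kernel module the two helpers would be msr writes in the init
and exit paths. Event counter enable bits that something else has set
stay as they are, because only bit 31 is written through the set and
clear registers.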