From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from wes1-so2.wedos.net (wes1-so2.wedos.net [46.28.106.16])
	by dpdk.org (Postfix) with ESMTP id D94062B93
	for ; Sat, 19 Mar 2016 12:05:54 +0100 (CET)
Received: from jvn (172.215.broadband18.iol.cz [109.81.215.172])
	by wes1-so2.wedos.net (Postfix) with ESMTPSA id 3qRzm23G67zsj;
	Sat, 19 Mar 2016 12:05:54 +0100 (CET)
Date: Sat, 19 Mar 2016 12:05:59 +0100
From: Jan Viktorin
To: thomas.monjalon@6wind.com
Cc: jerin.jacob@caviumnetworks.com, tomaszx.kulasek@intel.com,
	jianbo.liu@linaro.org, dev@dpdk.org
Message-ID: <20160319120559.372e9088@jvn>
In-Reply-To: <1458379590-18618-1-git-send-email-viktorin@rehivetech.com>
References: <1458379590-18618-1-git-send-email-viktorin@rehivetech.com>
Organization: RehiveTech
X-Mailer: Claws Mail 3.13.0 (GTK+ 2.24.28; x86_64-unknown-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH v2] arm: detect NEON by RTE_MACHINE_CPUFLAG_NEON flag only
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
X-List-Received-Date: Sat, 19 Mar 2016 11:05:55 -0000

On Sat, 19 Mar 2016 10:26:30 +0100
Jan Viktorin wrote:

> The RTE_MACHINE_CPUFLAG_NEON was only a result of the gcc testing. However,
> the target CPU may not support NEON or the user can disable to use it (as it
> does not always improve the performance).
>
> The RTE_MACHINE_CPUFLAG_NEON detection is now based on both, the __ARM_NEON_FP
> feature from gcc and CONFIG_RTE_ARCH_ARM_NEON from the .config. The memcpy
> implemention is driven by RTE_MACHINE_CPUFLAG_NEON, so the reason to disable
> NEON is hidden for the actual code.

Unfortunately, I've overlooked a mistake. I have to remake the patch a bit, sorry.

I am a bit confused about the __ARM_NEON and __ARM_NEON_FP settings.
The arm_neon.h is available only when the __ARM_NEON is present. But...

$ arm-buildroot-linux-gnueabi-gcc -dM -E - < /dev/null | grep "_FP\|_NEON"
#define __ARM_FP 12
#define __ARM_NEON_FP 4
#define __VFP_FP__ 1

Without -mfpu=neon we don't have arm_neon.h. I consider this strange as we are
not interested in the FPU features but in the SIMD features...

$ arm-buildroot-linux-gnueabi-gcc -mfpu=neon -dM -E - < /dev/null | grep "_FP\|_NEON"
#define __ARM_FP 12
#define __ARM_NEON_FP 4
#define __ARM_NEON__ 1
#define __VFP_FP__ 1
#define __ARM_NEON 1

$ arm-buildroot-linux-gnueabi-gcc -mfpu=neon-vfpv4 -dM -E - < /dev/null | grep "_FP\|_NEON"
#define __ARM_FP 14
#define __ARM_NEON_FP 6
#define __FP_FAST_FMAF 1
#define __FP_FAST_FMAL 1
#define __ARM_NEON__ 1
#define __VFP_FP__ 1
#define __ARM_NEON 1
#define __FP_FAST_FMA 1

ARM64 is OK here...

$ aarch64-buildroot-linux-gnu-gcc -dM -E - < /dev/null | grep "NEON\|FP"
#define __FP_FAST_FMAF 1
#define __ARM_NEON 1
#define __FP_FAST_FMA 1

So...

>
> Signed-off-by: Jan Viktorin
> ---
> v2: fix l3fwm_em.c to refer RTE_MACHINE_CPUFLAG_NEON instead of __ARM_NEON
> ---
>  examples/l3fwd/l3fwd_em.c                              | 2 +-
>  lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 4 ++--
>  mk/machine/armv7a/rte.vars.mk                          | 2 +-
>  mk/rte.cpuflags.mk                                     | 2 ++
>  4 files changed, 6 insertions(+), 4 deletions(-)
>

[...]

>  #ifdef __cplusplus
>  }
> diff --git a/mk/machine/armv7a/rte.vars.mk b/mk/machine/armv7a/rte.vars.mk
> index 48d3979..7a167c1 100644
> --- a/mk/machine/armv7a/rte.vars.mk
> +++ b/mk/machine/armv7a/rte.vars.mk
> @@ -62,6 +62,6 @@ ifdef CONFIG_RTE_ARCH_ARM_TUNE
>  MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
>  endif
>
> -ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
> +ifdef $(RTE_MACHINE_CPUFLAG_NEON)
>  MACHINE_CFLAGS += -mfpu=neon
>  endif

RTE_MACHINE_CPUFLAG_NEON is not *yet* set here (cpuflags are detected later)...
So the -mfpu=neon is never configured and the build fails.
The MACHINE_CFLAGS should rather depend on the CONFIG_RTE_ARCH_ARM_NEON
telling the build-system "we want NEON".

> diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
> index 19a3e7e..1947511 100644
> --- a/mk/rte.cpuflags.mk
> +++ b/mk/rte.cpuflags.mk
> @@ -111,9 +111,11 @@ CPUFLAGS += VSX
>  endif
>
>  # ARM flags
> +ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
>  ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)

Here, we should check __ARM_NEON (to be ARM32/64 compatible) but we cannot see
__ARM_NEON without the -mfpu=neon flag. Jerin, does the current DPDK detect
NEON feature on ARM64? I'd say, it cannot. So, we should probably check both
__ARM_NEON and __ARM_NEON_FP here.

Another point, related to the original discussion:

http://dpdk.org/ml/archives/dev/2016-March/thread.html#35972

we should probably have a config option to enable memcpy optimizations
separated from the NEON support. The NEON support can then be detected only by
the __ARM_NEON flag. The ARMv7 would have the -mfpu=neon always set. If
somebody likes to customize this, she would do it by hand. The result is, we
correctly detect NEON during build time from the GCC.

>  CPUFLAGS += NEON
>  endif
> +endif
>
>  ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRC32),)
>  CPUFLAGS += CRC32

-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic
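
P.S. A minimal sketch of the "detect NEON only by __ARM_NEON" idea from above,
written as a shell filter over the gcc macro dump. It is only an illustration
under stated assumptions: detect_neon is a hypothetical helper name, not
anything that exists in DPDK, and the real check would live in
mk/rte.cpuflags.mk. The point it shows is that __ARM_NEON_FP alone must not
count as NEON, since it is defined on ARMv7 even without -mfpu=neon.

```shell
#!/bin/sh
# detect_neon (hypothetical helper): read a compiler macro dump on stdin,
# i.e. the output of `$CC $CFLAGS -dM -E - < /dev/null`, and print NEON
# when the NEON intrinsics (arm_neon.h) should be usable, NONE otherwise.
detect_neon() {
    # Accept __ARM_NEON or the older __ARM_NEON__ spelling.
    # The trailing whitespace in the pattern keeps __ARM_NEON_FP from matching.
    if grep -Eq '^#define __ARM_NEON(__)?[[:space:]]'; then
        echo NEON
    else
        echo NONE
    fi
}
```

Usage would be something like the following, where the compiler name and the
flag are taken from the experiments above:

$ arm-buildroot-linux-gnueabi-gcc -mfpu=neon -dM -E - < /dev/null | detect_neon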