From: Jan Viktorin <viktorin@rehivetech.com>
To: thomas.monjalon@6wind.com
Cc: jerin.jacob@caviumnetworks.com, tomaszx.kulasek@intel.com,
jianbo.liu@linaro.org, dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] arm: detect NEON by RTE_MACHINE_CPUFLAG_NEON flag only
Date: Sat, 19 Mar 2016 12:05:59 +0100
Message-ID: <20160319120559.372e9088@jvn>
In-Reply-To: <1458379590-18618-1-git-send-email-viktorin@rehivetech.com>
On Sat, 19 Mar 2016 10:26:30 +0100
Jan Viktorin <viktorin@rehivetech.com> wrote:
> The RTE_MACHINE_CPUFLAG_NEON was only a result of the gcc feature testing.
> However, the target CPU may not support NEON, or the user may want to
> disable it (as it does not always improve performance).
>
> The RTE_MACHINE_CPUFLAG_NEON detection is now based on both the __ARM_NEON_FP
> feature from gcc and CONFIG_RTE_ARCH_ARM_NEON from the .config. The memcpy
> implementation is driven by RTE_MACHINE_CPUFLAG_NEON, so the reason for
> disabling NEON is hidden from the actual code.
Unfortunately, I overlooked a mistake, so I have to rework the patch a
bit, sorry. I am a bit confused about the __ARM_NEON and __ARM_NEON_FP
settings.
The arm_neon.h header is available only when __ARM_NEON is defined. But...
$ arm-buildroot-linux-gnueabi-gcc -dM -E - < /dev/null | grep "_FP\|_NEON"
#define __ARM_FP 12
#define __ARM_NEON_FP 4
#define __VFP_FP__ 1
Without -mfpu=neon, __ARM_NEON is not defined, so we don't have arm_neon.h.
I consider this strange, as we are not interested in the FPU features but
in the SIMD features...
$ arm-buildroot-linux-gnueabi-gcc -mfpu=neon -dM -E - < /dev/null | grep "_FP\|_NEON"
#define __ARM_FP 12
#define __ARM_NEON_FP 4
#define __ARM_NEON__ 1
#define __VFP_FP__ 1
#define __ARM_NEON 1
$ arm-buildroot-linux-gnueabi-gcc -mfpu=neon-vfpv4 -dM -E - < /dev/null | grep "_FP\|_NEON"
#define __ARM_FP 14
#define __ARM_NEON_FP 6
#define __FP_FAST_FMAF 1
#define __FP_FAST_FMAL 1
#define __ARM_NEON__ 1
#define __VFP_FP__ 1
#define __ARM_NEON 1
#define __FP_FAST_FMA 1
ARM64 is OK here (__ARM_NEON is defined without any extra flag)...
$ aarch64-buildroot-linux-gnu-gcc -dM -E - < /dev/null | grep "NEON\|FP"
#define __FP_FAST_FMAF 1
#define __ARM_NEON 1
#define __FP_FAST_FMA 1
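What the dumps above mean for C code can be sketched roughly as follows.
This is only an illustration, not code from the patch: the names
HAVE_NEON_INTRINSICS, NEON_FP_MASK and the three functions are made up
for this example. The only assumptions taken from the dumps are that
arm_neon.h is usable when __ARM_NEON is defined, and that __ARM_NEON_FP
may be defined even when it is not.

```c
#include <stdio.h>

/* Include arm_neon.h only when the compiler advertises __ARM_NEON.
 * On the ARMv7 toolchain shown above this requires -mfpu=neon (or a
 * variant such as -mfpu=neon-vfpv4); on aarch64 it is set by default. */
#ifdef __ARM_NEON
#include <arm_neon.h>
#define HAVE_NEON_INTRINSICS 1 /* hypothetical name, for illustration */
#else
#define HAVE_NEON_INTRINSICS 0
#endif

/* __ARM_NEON_FP can be present without __ARM_NEON (see the first dump),
 * so it must not be used as a guard for including arm_neon.h. */
#ifdef __ARM_NEON_FP
#define NEON_FP_MASK __ARM_NEON_FP /* hypothetical name */
#else
#define NEON_FP_MASK 0
#endif

/* Returns 1 when the NEON intrinsics header was included, 0 otherwise. */
int neon_intrinsics_usable(void)
{
	return HAVE_NEON_INTRINSICS;
}

/* Returns the value of __ARM_NEON_FP, or 0 when it is not defined. */
int neon_fp_mask(void)
{
	return NEON_FP_MASK;
}

/* Prints what the compiler reported for the current set of flags. */
void report_neon_macros(void)
{
	printf("__ARM_NEON defined: %d, __ARM_NEON_FP value: %d\n",
	       neon_intrinsics_usable(), neon_fp_mask());
}
```

Calling report_neon_macros() from a small main() and building it once
with and once without -mfpu=neon should make the difference between the
two macros visible on a given toolchain.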
So...
>
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> ---
> v2: fix l3fwd_em.c to refer to RTE_MACHINE_CPUFLAG_NEON instead of __ARM_NEON
> ---
> examples/l3fwd/l3fwd_em.c | 2 +-
> lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 4 ++--
> mk/machine/armv7a/rte.vars.mk | 2 +-
> mk/rte.cpuflags.mk | 2 ++
> 4 files changed, 6 insertions(+), 4 deletions(-)
>
[...]
> #ifdef __cplusplus
> }
> diff --git a/mk/machine/armv7a/rte.vars.mk b/mk/machine/armv7a/rte.vars.mk
> index 48d3979..7a167c1 100644
> --- a/mk/machine/armv7a/rte.vars.mk
> +++ b/mk/machine/armv7a/rte.vars.mk
> @@ -62,6 +62,6 @@ ifdef CONFIG_RTE_ARCH_ARM_TUNE
> MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
> endif
>
> -ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
> +ifdef $(RTE_MACHINE_CPUFLAG_NEON)
> MACHINE_CFLAGS += -mfpu=neon
> endif
RTE_MACHINE_CPUFLAG_NEON is not *yet* set here (cpuflags are detected later)...
So -mfpu=neon is never added to MACHINE_CFLAGS and the build fails.
MACHINE_CFLAGS should rather depend on CONFIG_RTE_ARCH_ARM_NEON, which
tells the build system "we want NEON".
> diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
> index 19a3e7e..1947511 100644
> --- a/mk/rte.cpuflags.mk
> +++ b/mk/rte.cpuflags.mk
> @@ -111,9 +111,11 @@ CPUFLAGS += VSX
> endif
>
> # ARM flags
> +ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
> ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)
Here, we should check __ARM_NEON (to be ARM32/64 compatible), but on ARMv7
we cannot see __ARM_NEON without the -mfpu=neon flag.
Jerin, does the current DPDK detect the NEON feature on ARM64? I'd say it
cannot, as the aarch64 gcc does not define __ARM_NEON_FP (see above).
So, we should probably check both __ARM_NEON and __ARM_NEON_FP here.
Another point, related to the original discussion:
http://dpdk.org/ml/archives/dev/2016-March/thread.html#35972
We should probably have a config option that enables the memcpy
optimizations separately from the NEON support. The NEON support can then
be detected solely by the __ARM_NEON flag. ARMv7 would always have
-mfpu=neon set. If somebody likes to customize this, she would do it by
hand. As a result, we correctly detect NEON at build time from GCC.
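A rough sketch of how such a split could look in the C code, under the
assumption that NEON availability comes only from __ARM_NEON and the
memcpy optimization has its own switch. EXAMPLE_NEON_MEMCPY,
example_mov16() and example_uses_neon() are hypothetical names used only
for this illustration; they are not existing DPDK options or functions.

```c
#include <stdint.h>
#include <string.h>

#ifdef __ARM_NEON
#include <arm_neon.h>
#endif

/* Hypothetical switch standing in for a separate "NEON memcpy" config
 * option; it is independent of whether NEON itself is available. */
#ifndef EXAMPLE_NEON_MEMCPY
#define EXAMPLE_NEON_MEMCPY 1
#endif

#if defined(__ARM_NEON) && EXAMPLE_NEON_MEMCPY
/* NEON is available and the optimized copy was requested:
 * copy 16 bytes with one 128-bit load and one 128-bit store. */
void example_mov16(uint8_t *dst, const uint8_t *src)
{
	vst1q_u8(dst, vld1q_u8(src));
}

int example_uses_neon(void)
{
	return 1;
}
#else
/* Either the compiler does not report NEON, or the optimized copy was
 * switched off: fall back to the plain libc memcpy. */
void example_mov16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}

int example_uses_neon(void)
{
	return 0;
}
#endif
```

With this layout, building with -DEXAMPLE_NEON_MEMCPY=0 disables only the
optimized copy, while other NEON users keep relying on __ARM_NEON alone.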
> CPUFLAGS += NEON
> endif
> +endif
>
> ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRC32),)
> CPUFLAGS += CRC32
--
Jan Viktorin E-mail: Viktorin@RehiveTech.com
System Architect Web: www.RehiveTech.com
RehiveTech
Brno, Czech Republic