DPDK patches and discussions
 help / color / mirror / Atom feed
From: Jan Viktorin <viktorin@rehivetech.com>
To: thomas.monjalon@6wind.com
Cc: jerin.jacob@caviumnetworks.com, tomaszx.kulasek@intel.com,
	jianbo.liu@linaro.org, dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] arm: detect NEON by RTE_MACHINE_CPUFLAG_NEON flag only
Date: Sat, 19 Mar 2016 12:05:59 +0100	[thread overview]
Message-ID: <20160319120559.372e9088@jvn> (raw)
In-Reply-To: <1458379590-18618-1-git-send-email-viktorin@rehivetech.com>

On Sat, 19 Mar 2016 10:26:30 +0100
Jan Viktorin <viktorin@rehivetech.com> wrote:

> The RTE_MACHINE_CPUFLAG_NEON was only a result of the gcc testing. However,
> the target CPU may not support NEON or the user can disable to use it (as it
> does not always improve the performance).
> 
> The RTE_MACHINE_CPUFLAG_NEON detection is now based on both, the __ARM_NEON_FP
> feature from gcc and CONFIG_RTE_ARCH_ARM_NEON from the .config. The memcpy
> implemention is driven by RTE_MACHINE_CPUFLAG_NEON, so the reason to disable
> NEON is hidden for the actual code.

Unfortunately, I've overlooked a mistake. I have to remake the patch a
bit, sorry. I am a bit confused about the __ARM_NEON and __ARM_NEON_FP
settings.

The arm_neon.h is available only when the __ARM_NEON is present. But...

$ arm-buildroot-linux-gnueabi-gcc -dM -E - < /dev/null  | grep "_FP\|_NEON"
#define __ARM_FP 12
#define __ARM_NEON_FP 4
#define __VFP_FP__ 1

Without -mfpu=neon we don't have arm_neon.h. I consider this strange as
we are not interested in the FPU features but in the SIMD features...

$ arm-buildroot-linux-gnueabi-gcc -mfpu=neon -dM -E - < /dev/null  | grep "_FP\|_NEON"
#define __ARM_FP 12
#define __ARM_NEON_FP 4
#define __ARM_NEON__ 1
#define __VFP_FP__ 1
#define __ARM_NEON 1

$ arm-buildroot-linux-gnueabi-gcc -mfpu=neon-vfpv4 -dM -E - < /dev/null  | grep "_FP\|_NEON"
#define __ARM_FP 14
#define __ARM_NEON_FP 6
#define __FP_FAST_FMAF 1
#define __FP_FAST_FMAL 1
#define __ARM_NEON__ 1
#define __VFP_FP__ 1
#define __ARM_NEON 1
#define __FP_FAST_FMA 1

ARM64 is OK here...

$ aarch64-buildroot-linux-gnu-gcc -dM -E - < /dev/null | grep "NEON\|FP"
#define __FP_FAST_FMAF 1
#define __ARM_NEON 1
#define __FP_FAST_FMA 1

So...

> 
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> ---
> v2: fix l3fwm_em.c to refer RTE_MACHINE_CPUFLAG_NEON instead of __ARM_NEON
> ---
>  examples/l3fwd/l3fwd_em.c                              | 2 +-
>  lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 4 ++--
>  mk/machine/armv7a/rte.vars.mk                          | 2 +-
>  mk/rte.cpuflags.mk                                     | 2 ++
>  4 files changed, 6 insertions(+), 4 deletions(-)
> 
[...]
>  #ifdef __cplusplus
>  }
> diff --git a/mk/machine/armv7a/rte.vars.mk b/mk/machine/armv7a/rte.vars.mk
> index 48d3979..7a167c1 100644
> --- a/mk/machine/armv7a/rte.vars.mk
> +++ b/mk/machine/armv7a/rte.vars.mk
> @@ -62,6 +62,6 @@ ifdef CONFIG_RTE_ARCH_ARM_TUNE
>  MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
>  endif
>  
> -ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
> +ifdef $(RTE_MACHINE_CPUFLAG_NEON)
>  MACHINE_CFLAGS += -mfpu=neon
>  endif

RTE_MACHINE_CPUFLAG_NEON is not *yet* set here (cpuflags are detected later)...
So the -mfpu=neon is never configured and the build fails. The
MACHINE_CFLAGS should rather depend on the CONFIG_RTE_ARCH_ARM_NEON
telling the build-system "we want NEON".

> diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
> index 19a3e7e..1947511 100644
> --- a/mk/rte.cpuflags.mk
> +++ b/mk/rte.cpuflags.mk
> @@ -111,9 +111,11 @@ CPUFLAGS += VSX
>  endif
>  
>  # ARM flags
> +ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
>  ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)

Here, we should check __ARM_NEON (to be ARM32/64 compatible) but we
cannot see __ARM_NEON without the -mfpu=neon flag.

Jerin, does the current DPDK detect NEON feature on ARM64? I'd say, it
cannot.

So, we should probably check both __ARM_NEON and __ARM_NEON_FP here.

Another point, related to the original discussion:

http://dpdk.org/ml/archives/dev/2016-March/thread.html#35972

we should probably have a config option to enable memcpy optimizations
separated from the NEON support. The NEON support can then be detected
only by the __ARM_NEON flag. The ARMv7 would have the -mfpu=neon always
set. If somebody likes to customize this, she would do it by hand. The
result is, we correctly detect NEON during build time from the GCC.

>  CPUFLAGS += NEON
>  endif
> +endif
>  
>  ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRC32),)
>  CPUFLAGS += CRC32



-- 
  Jan Viktorin                E-mail: Viktorin@RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

  reply	other threads:[~2016-03-19 11:05 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-19  9:26 Jan Viktorin
2016-03-19 11:05 ` Jan Viktorin [this message]
2016-03-19 19:58 ` [dpdk-dev] [PATCH v3 0/4] " Jan Viktorin
2016-03-24 16:47   ` Thomas Monjalon
2016-03-19 19:58 ` [dpdk-dev] [PATCH v3 1/4] arm: remove CONFIG_RTE_ARCH_ARM_NEON Jan Viktorin
2016-03-19 19:58 ` [dpdk-dev] [PATCH v3 2/4] arm: detect NEON cpu feature by checking __ARM_NEON Jan Viktorin
2016-03-20 17:27   ` Jerin Jacob
2016-03-19 19:58 ` [dpdk-dev] [PATCH v3 3/4] arm: detect NEON by checking RTE_MACHINE_CPUFLAG_NEON Jan Viktorin
2016-03-19 19:58 ` [dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY Jan Viktorin
2016-03-19 20:14   ` Thomas Monjalon
2016-03-20  9:41     ` Jan Viktorin
2016-03-20  9:46       ` Jan Viktorin
2016-03-20 10:33         ` Thomas Monjalon
2016-03-20 10:29       ` Thomas Monjalon
2016-03-20 17:38         ` Jerin Jacob
2016-03-21  5:42   ` Jianbo Liu
2016-03-21 12:21     ` Jan Viktorin
2016-03-21 13:24       ` Thomas Monjalon
2016-03-21 14:01         ` Jan Viktorin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160319120559.372e9088@jvn \
    --to=viktorin@rehivetech.com \
    --cc=dev@dpdk.org \
    --cc=jerin.jacob@caviumnetworks.com \
    --cc=jianbo.liu@linaro.org \
    --cc=thomas.monjalon@6wind.com \
    --cc=tomaszx.kulasek@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).