DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture
@ 2015-10-26 16:37 Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 01/16] mk: Introduce " Jan Viktorin
                   ` (16 more replies)
  0 siblings, 17 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev

Hello DPDK community, Thomas, Dave,

here I propose the second version of the ARM support patch series. I've included
some ideas from Dave's patch. There are no big changes to the original series.

Important:

* The timer issue has now 2 solutions, the user may configure to use PMU counter
  or the clock_gettime API. The PMU counter may however break perf or other tools
  using the PMU Linux API. This is the reason why I did not make it the default.
  Also, I didn't include the Linux Kernel module that enables the PMU for userspace.
  There is a note in the rte_cycles.h about it. You should know what you are doing
  if you use that, so you may also write that simple driver or get from the Dave's
  patch. Later, we can integrate it, after we have some real PMD driver (and some
  supporting Linux Kernel module infra...).

* There is the NEON implementation of memcpy. It is faster then the native one
  (you can see stats in the patch), however, we must be sure, the target CPU contains
  the NEON co-processor. Also, for longer data lengths and ARM SoCs, the NEON memcpy
  implementation can be much slower then the native one. So this is again configurable.

* The cpuflags now contains the best from my and Dave's patchs.

* ACL build is broken. I've included a patch (16) that just prevents to pass -msse4.1
  into gcc if it does not support it. But that does not solve the whole issue.

* LPM build is broken unless you apply the patch 15. However, this is not the right
  solution and I provided just to have a workaround. I don't expect to merge it.

* I've added myself to the MAINTAINERS. Dave, would I like to be there as well?

* The Cortex A7, A8, A9 cores are non-LPAE (non Large Physical Address Extension)
  and thus there is no upstream support for huge pages in the Linux Kernel. It sounds
  like useless for devices with max 4 GB of RAM (usually 0.5-2 GB). However, our
  measurements have shown that it improve performance. A patch is somewhere deep in
  the kernel.org mailing lists.

* Only the GCC toolchain is considered at the moment.

Other details are included in each individual commit.

---

You can pull the changes from

  https://github.com/RehiveTech/dpdk.git arm-support-v2

since commit d08d304508a8a8caf255baf622ab65db1fec952c:

  eal/linux: make alarm not affected by system time jump (2015-10-21 17:01:24 +0200)

up to 57396c958571b651b4d14f90683b3d1b2d42a70e:

  acl: check for SSE 4.1 support (2015-10-26 17:29:36 +0100)

---

Regards
Jan Viktorin

Jan Viktorin (7):
  eal/arm: implement rdtsc by PMU or clock_gettime
  eal/arm: use vector memcpy only when NEON is enabled
  eal/arm: detect arm architecture in cpu flags
  eal/arm: rwlock support for ARM
  gcc/arm: avoid alignment errors to break build
  maintainers: claim responsibility for ARMv7
  acl: check for SSE 4.1 support

Vlastimil Kosar (9):
  mk: Introduce ARMv7 architecture
  eal/arm: atomic operations for ARM
  eal/arm: byte order operations for ARM
  eal/arm: cpu cycle operations for ARM
  eal/arm: prefetch operations for ARM
  eal/arm: spinlock operations for ARM (without HTM)
  eal/arm: vector memcpy for ARM
  eal/arm: cpu flag checks for ARM
  lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on
    for-x86

 MAINTAINERS                                        |   4 +
 app/test/test_cpuflags.c                           |   5 +
 config/defconfig_arm-armv7-a-linuxapp-gcc          |  75 +++++
 lib/librte_acl/Makefile                            |   4 +
 .../common/include/arch/arm/rte_atomic.h           | 256 ++++++++++++++++
 .../common/include/arch/arm/rte_byteorder.h        | 148 ++++++++++
 .../common/include/arch/arm/rte_cpuflags.h         | 192 ++++++++++++
 .../common/include/arch/arm/rte_cycles.h           | 121 ++++++++
 .../common/include/arch/arm/rte_memcpy.h           | 325 +++++++++++++++++++++
 .../common/include/arch/arm/rte_prefetch.h         |  61 ++++
 .../common/include/arch/arm/rte_rwlock.h           |  40 +++
 .../common/include/arch/arm/rte_spinlock.h         | 114 ++++++++
 lib/librte_lpm/rte_lpm.h                           |  71 +++++
 mk/arch/arm/rte.vars.mk                            |  39 +++
 mk/machine/armv7-a/rte.vars.mk                     |  60 ++++
 mk/rte.cpuflags.mk                                 |   6 +
 mk/toolchain/gcc/rte.vars.mk                       |   6 +
 17 files changed, 1527 insertions(+)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-28 13:34   ` David Marchand
  2015-10-28 13:39   ` David Marchand
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 02/16] eal/arm: atomic operations for ARM Jan Viktorin
                   ` (15 subsequent siblings)
  16 siblings, 2 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

Make DPDK run on ARMv7-A architecture. This patch assumes
ARM Cortex-A9. However, it is known to be working on Cortex-A7
and Cortex-A15.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v1 -> v2:
* the -mtune parameter of GCC is configurable now
* the -mfpu=neon can be turned off

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 config/defconfig_arm-armv7-a-linuxapp-gcc | 78 +++++++++++++++++++++++++++++++
 mk/arch/arm/rte.vars.mk                   | 39 ++++++++++++++++
 mk/machine/armv7-a/rte.vars.mk            | 67 ++++++++++++++++++++++++++
 3 files changed, 184 insertions(+)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
new file mode 100644
index 0000000..5b582a8
--- /dev/null
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -0,0 +1,78 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All right reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv7-a"
+
+CONFIG_RTE_ARCH="arm"
+CONFIG_RTE_ARCH_ARM=y
+CONFIG_RTE_ARCH_ARMv7=y
+CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9"
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+# ARM doesn't have support for vmware TSC map
+CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
+
+# avoids using i686/x86_64 SIMD instructions, nothing for ARM
+CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
+
+# KNI is not supported on 32-bit
+CONFIG_RTE_LIBRTE_KNI=n
+
+# PCI is usually not used on ARM
+CONFIG_RTE_EAL_IGB_UIO=n
+
+# missing rte_vect.h for ARM
+CONFIG_XMM_SIZE=16
+
+# fails to compile on ARM
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_LPM=n
+
+# cannot use those on ARM
+CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_EM_PMD=n
+CONFIG_RTE_LIBRTE_IGB_PMD=n
+CONFIG_RTE_LIBRTE_CXGBE_PMD=n
+CONFIG_RTE_LIBRTE_E1000_PMD=n
+CONFIG_RTE_LIBRTE_ENIC_PMD=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+CONFIG_RTE_LIBRTE_IXGBE_PMD=n
+CONFIG_RTE_LIBRTE_MLX4_PMD=n
+CONFIG_RTE_LIBRTE_MPIPE_PMD=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
+CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
+CONFIG_RTE_LIBRTE_PMD_BNX2X=n
diff --git a/mk/arch/arm/rte.vars.mk b/mk/arch/arm/rte.vars.mk
new file mode 100644
index 0000000..df0c043
--- /dev/null
+++ b/mk/arch/arm/rte.vars.mk
@@ -0,0 +1,39 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+ARCH  ?= arm
+CROSS ?=
+
+CPU_CFLAGS  ?= -marm -DRTE_CACHE_LINE_SIZE=64 -munaligned-access
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv7-a/rte.vars.mk b/mk/machine/armv7-a/rte.vars.mk
new file mode 100644
index 0000000..48d3979
--- /dev/null
+++ b/mk/machine/armv7-a/rte.vars.mk
@@ -0,0 +1,67 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+CPU_CFLAGS += -mfloat-abi=softfp
+
+MACHINE_CFLAGS += -march=armv7-a
+
+ifdef CONFIG_RTE_ARCH_ARM_TUNE
+MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
+endif
+
+ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
+MACHINE_CFLAGS += -mfpu=neon
+endif
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 02/16] eal/arm: atomic operations for ARM
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 01/16] mk: Introduce " Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 03/16] eal/arm: byte order " Jan Viktorin
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific atomic operation file
for ARM architecture. It utilizes compiler intrinsics only.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v1 -> v2:
* improve rte_wmb()
* use __atomic_* or __sync_*? (may affect the required GCC version)

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_atomic.h           | 256 +++++++++++++++++++++
 1 file changed, 256 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
new file mode 100644
index 0000000..1815766
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -0,0 +1,256 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_ARM_H_
+#define _RTE_ATOMIC_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define	rte_mb()  __sync_synchronize()
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define	rte_wmb() do { asm volatile ("dmb st" : : : "memory"); } while(0)
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define	rte_rmb() __sync_synchronize()
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, 0);
+	}
+}
+
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		/* replace the value by itself */
+		success = rte_atomic64_cmpset((volatile uint64_t *) &v->cnt,
+		                              tmp, tmp);
+	}
+	return tmp;
+}
+
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, new_value);
+	}
+}
+
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	__atomic_fetch_add(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	__atomic_fetch_sub(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	__atomic_fetch_add(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	__atomic_fetch_sub(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	return __atomic_add_fetch(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	return __atomic_sub_fetch(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	rte_atomic64_set(v, 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 03/16] eal/arm: byte order operations for ARM
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 01/16] mk: Introduce " Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 02/16] eal/arm: atomic operations for ARM Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 04/16] eal/arm: cpu cycle " Jan Viktorin
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific byte order operations
for ARM. The architecture supports both big and little endian.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_byteorder.h        | 148 +++++++++++++++++++++
 1 file changed, 148 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
new file mode 100644
index 0000000..04e7b87
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
@@ -0,0 +1,148 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_ARM_H_
+#define _RTE_BYTEORDER_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	register uint16_t x = _x;
+	asm volatile ("rev16 %[x1],%[x2]"
+		      : [x1] "=r" (x)
+		      : [x2] "r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	register uint32_t x = _x;
+	asm volatile ("rev %[x1],%[x2]"
+		      : [x1] "=r" (x)
+		      : [x2] "r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+	return  __builtin_bswap64(_x);
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+/* ARM architecture is bi-endian (both big and little). */
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#else /* RTE_BIG_ENDIAN */
+
+#define rte_cpu_to_le_16(x) rte_bswap16(x)
+#define rte_cpu_to_le_32(x) rte_bswap32(x)
+#define rte_cpu_to_le_64(x) rte_bswap64(x)
+
+#define rte_cpu_to_be_16(x) (x)
+#define rte_cpu_to_be_32(x) (x)
+#define rte_cpu_to_be_64(x) (x)
+
+#define rte_le_to_cpu_16(x) rte_bswap16(x)
+#define rte_le_to_cpu_32(x) rte_bswap32(x)
+#define rte_le_to_cpu_64(x) rte_bswap64(x)
+
+#define rte_be_to_cpu_16(x) (x)
+#define rte_be_to_cpu_32(x) (x)
+#define rte_be_to_cpu_64(x) (x)
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 04/16] eal/arm: cpu cycle operations for ARM
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (2 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 03/16] eal/arm: byte order " Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 05/16] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

ARM architecture doesn't have a suitable source of CPU cycles. This
patch uses clock_gettime instead. The implementation should be improved
in the future.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_cycles.h           | 85 ++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
new file mode 100644
index 0000000..ff66ae2
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -0,0 +1,85 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_ARM_H_
+#define _RTE_CYCLES_ARM_H_
+
+/* ARM v7 does not have suitable source of clock signals. The only clock counter
+   available in the core is 32 bit wide. Therefore it is unsuitable as the
+   counter overlaps every few seconds and probably is not accessible by
+   userspace programs. Therefore we use clock_gettime(CLOCK_MONOTONIC_RAW) to
+   simulate counter running at 1GHz.
+*/
+
+#include <time.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ *   The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	struct timespec val;
+	uint64_t v;
+
+	while (clock_gettime(CLOCK_MONOTONIC_RAW, &val) != 0)
+		/* no body */;
+
+	v  = (uint64_t) val.tv_sec * 1000000000LL;
+	v += (uint64_t) val.tv_nsec;
+	return v;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 05/16] eal/arm: implement rdtsc by PMU or clock_gettime
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (3 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 04/16] eal/arm: cpu cycle " Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 06/16] eal/arm: prefetch operations for ARM Jan Viktorin
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev

Enable to choose a preferred way to read timer based on the
configuration entry CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU.
It requires a kernel module that is not included to work.

Based on the patch by David Hunt and Armuta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_cycles.h           | 38 +++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index ff66ae2..5dcef25 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -54,8 +54,14 @@ extern "C" {
  * @return
  *   The time base for this lcore.
  */
+#ifndef CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
+
+/**
+ * This call is easily portable to any ARM architecture, however,
+ * it may be damn slow and inprecise for some tasks.
+ */
 static inline uint64_t
-rte_rdtsc(void)
+__rte_rdtsc_syscall(void)
 {
 	struct timespec val;
 	uint64_t v;
@@ -67,6 +73,36 @@ rte_rdtsc(void)
 	v += (uint64_t) val.tv_nsec;
 	return v;
 }
+#define rte_rdtsc __rte_rdtsc_syscall
+
+#else
+
+/**
+ * This function requires to configure the PMCCNTR and enable
+ * userspace access to it:
+ *
+ *      asm volatile("mcr p15, 0, %0, c9, c14, 0" : : "r"(1));
+ *      asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(29));
+ *      asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(0x8000000f));
+ *
+ * which is possible only from the priviledged mode (kernel space).
+ */
+static inline uint64_t
+__rte_rdtsc_pmccntr(void)
+{
+	unsigned tsc;
+	uint64_t final_tsc;
+
+	/* Read PMCCNTR */
+	asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(tsc));
+	/* 1 tick = 64 clocks */
+	final_tsc = ((uint64_t)tsc) << 6;
+
+	return (uint64_t)final_tsc;
+}
+#define rte_rdtsc __rte_rdtsc_pmccntr
+
+#endif /* RTE_ARM_EAL_RDTSC_USE_PMU */
 
 static inline uint64_t
 rte_rdtsc_precise(void)
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 06/16] eal/arm: prefetch operations for ARM
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (4 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 05/16] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 07/16] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific prefetch operations
for ARM architecture. It utilizes the pld instruction that
starts filling the appropriate cache line without blocking.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_prefetch.h         | 61 ++++++++++++++++++++++
 1 file changed, 61 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
new file mode 100644
index 0000000..8d75fe6
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -0,0 +1,61 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_H_
+#define _RTE_PREFETCH_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(const volatile void *p)
+{
+	asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+	asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+	asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 07/16] eal/arm: spinlock operations for ARM (without HTM)
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (5 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 06/16] eal/arm: prefetch operations for ARM Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 08/16] eal/arm: vector memcpy for ARM Jan Viktorin
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds spinlock operations for ARM architecture.
We do not support HTM in spinlocks on ARM.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_spinlock.h         | 114 +++++++++++++++++++++
 1 file changed, 114 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
new file mode 100644
index 0000000..cd5ab8b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
@@ -0,0 +1,114 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_ARM_H_
+#define _RTE_SPINLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include "generic/rte_spinlock.h"
+
+/* Intrinsics are used to implement the spinlock on ARM architecture */
+
+#ifndef RTE_FORCE_INTRINSICS
+
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	while (__sync_lock_test_and_set(&sl->locked, 1))
+		while (sl->locked)
+			rte_pause();
+}
+
+static inline void
+rte_spinlock_unlock(rte_spinlock_t *sl)
+{
+	__sync_lock_release(&sl->locked);
+}
+
+static inline int
+rte_spinlock_trylock(rte_spinlock_t *sl)
+{
+	return (__sync_lock_test_and_set(&sl->locked, 1) == 0);
+}
+
+#endif
+
+static inline int rte_tm_supported(void)
+{
+	return 0;
+}
+
+static inline void
+rte_spinlock_lock_tm(rte_spinlock_t *sl)
+{
+	rte_spinlock_lock(sl); /* fall-back */
+}
+
+static inline int
+rte_spinlock_trylock_tm(rte_spinlock_t *sl)
+{
+	return rte_spinlock_trylock(sl);
+}
+
+static inline void
+rte_spinlock_unlock_tm(rte_spinlock_t *sl)
+{
+	rte_spinlock_unlock(sl);
+}
+
+static inline void
+rte_spinlock_recursive_lock_tm(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_recursive_lock(slr); /* fall-back */
+}
+
+static inline void
+rte_spinlock_recursive_unlock_tm(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_recursive_unlock(slr);
+}
+
+static inline int
+rte_spinlock_recursive_trylock_tm(rte_spinlock_recursive_t *slr)
+{
+	return rte_spinlock_recursive_trylock(slr);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 08/16] eal/arm: vector memcpy for ARM
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (6 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 07/16] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 09/16] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

The SSE based memory copy in DPDK only support x86. This patch
adds ARM NEON based memory copy functions for ARM architecture.

The implementation improves memory copy of short or well aligned
data buffers. The following measurements show improvements over
the libc memcpy on Cortex CPUs.

               by X % faster
Length (B)   a15    a7     a9
   1         4.9  15.2    3.2
   7        56.9  48.2   40.3
   8        37.3  39.8   29.6
   9        69.3  38.7   33.9
  15        60.8  35.3   23.7
  16        50.6  35.9   35.0
  17        57.7  35.7   31.1
  31        16.0  23.3    9.0
  32        65.9  13.5   21.4
  33         3.9  10.3   -3.7
  63         2.0  12.9   -2.0
  64        66.5   0.0   16.5
  65         2.7   7.6  -35.6
 127         0.1   4.5  -18.9
 128        66.2   1.5  -51.4
 129        -0.8   3.2  -35.8
 255        -3.1  -0.9  -69.1
 256        67.9   1.2    7.2
 257        -3.6  -1.9  -36.9
 320        67.7   1.4    0.0
 384        66.8   1.4  -14.2
 511       -44.9  -2.3  -41.9
 512        67.3   1.4   -6.8
 513       -41.7  -3.0  -36.2
1023       -82.4  -2.8  -41.2
1024        68.3   1.4  -11.6
1025       -80.1  -3.3  -38.1
1518       -47.3  -5.0  -38.3
1522       -48.3  -6.0  -37.9
1600        65.4   1.3  -27.3
2048        59.5   1.5  -10.9
3072        52.3   1.5  -12.2
4096        45.3   1.4  -12.5
5120        40.6   1.5  -14.5
6144        35.4   1.4  -13.4
7168        32.9   1.4  -13.9
8192        28.2   1.4  -15.1

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_memcpy.h           | 270 +++++++++++++++++++++
 1 file changed, 270 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
new file mode 100644
index 0000000..ac885e9
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -0,0 +1,270 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_ARM_H_
+#define _RTE_MEMCPY_ARM_H_
+
+#include <stdint.h>
+#include <string.h>
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	vst1q_u8(dst, vld1q_u8(src));
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("vld1.8 {d0-d3}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3");
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+	              "vld1.8 {d4-d5}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+	              "vst1.8 {d4-d5}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5");
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+	              "vld1.8 {d4-d7}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+	              "vst1.8 {d4-d7}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7");
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("pld [%[src], #64]" : : [src] "r" (src));
+	asm volatile ("vld1.8 {d0-d3},   [%[src]]!\n\t"
+	              "vld1.8 {d4-d7},   [%[src]]!\n\t"
+	              "vld1.8 {d8-d11},  [%[src]]!\n\t"
+	              "vld1.8 {d12-d15}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3},   [%[dst]]!\n\t"
+	              "vst1.8 {d4-d7},   [%[dst]]!\n\t"
+	              "vst1.8 {d8-d11},  [%[dst]]!\n\t"
+	              "vst1.8 {d12-d15}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+	                  "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("pld [%[src], #64]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #128]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #192]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #256]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #320]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #384]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #448]" : : [src] "r" (src));
+	asm volatile ("vld1.8 {d0-d3},   [%[src]]!\n\t"
+	              "vld1.8 {d4-d7},   [%[src]]!\n\t"
+	              "vld1.8 {d8-d11},  [%[src]]!\n\t"
+	              "vld1.8 {d12-d15}, [%[src]]!\n\t"
+	              "vld1.8 {d16-d19}, [%[src]]!\n\t"
+	              "vld1.8 {d20-d23}, [%[src]]!\n\t"
+	              "vld1.8 {d24-d27}, [%[src]]!\n\t"
+	              "vld1.8 {d28-d31}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3},   [%[dst]]!\n\t"
+	              "vst1.8 {d4-d7},   [%[dst]]!\n\t"
+	              "vst1.8 {d8-d11},  [%[dst]]!\n\t"
+	              "vst1.8 {d12-d15}, [%[dst]]!\n\t"
+	              "vst1.8 {d16-d19}, [%[dst]]!\n\t"
+	              "vst1.8 {d20-d23}, [%[dst]]!\n\t"
+	              "vst1.8 {d24-d27}, [%[dst]]!\n\t"
+	              "vst1.8 {d28-d31}, [%[dst]]!\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+	                  "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15",
+	                  "d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23",
+	                  "d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31");
+}
+
+#define rte_memcpy(dst, src, n)              \
+	({ (__builtin_constant_p(n)) ?       \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)); })
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			/* ARMv7 can not handle unaligned access to long long
+			 * (uint64_t). Therefore two uint32_t operations are used.
+			 * TODO: use NEON too?
+			 */
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+			*(uint32_t *)dst = *(const uint32_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n,
+			(const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n,
+			(const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0)
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 09/16] eal/arm: use vector memcpy only when NEON is enabled
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (7 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 08/16] eal/arm: vector memcpy for ARM Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 10/16] eal/arm: cpu flag checks for ARM Jan Viktorin
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev

The GCC can be configured to avoid using NEON extensions.
For that purpose, we provide just the memcpy implementation
of the rte_memcpy.

Based on the patch by David Hunt and Armuta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_memcpy.h           | 59 +++++++++++++++++++++-
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index ac885e9..75e8bda 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -35,8 +35,6 @@
 
 #include <stdint.h>
 #include <string.h>
-/* ARM NEON Intrinsics are used to copy data */
-#include <arm_neon.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -44,6 +42,11 @@ extern "C" {
 
 #include "generic/rte_memcpy.h"
 
+#ifdef __ARM_NEON_FP
+
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
 static inline void
 rte_mov16(uint8_t *dst, const uint8_t *src)
 {
@@ -263,6 +266,58 @@ rte_memcpy_func(void *dst, const void *src, size_t n)
 	return ret;
 }
 
+#else
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 256);
+}
+
+static inline void *
+rte_memcpy(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+#endif /* __ARM_NEON_FP */
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 10/16] eal/arm: cpu flag checks for ARM
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (8 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 09/16] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 11/16] eal/arm: detect arm architecture in cpu flags Jan Viktorin
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This implementation is based on IBM POWER version of
rte_cpuflags. We use software emulation of HW capability
registers, because those are usually not directly accessible
from userspace on ARM.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v1 -> v2: check whether AT_HWCAP and AT_HWCAP2 exists
---
 app/test/test_cpuflags.c                           |   5 +
 .../common/include/arch/arm/rte_cpuflags.h         | 177 +++++++++++++++++++++
 mk/rte.cpuflags.mk                                 |   6 +
 3 files changed, 188 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 5b92061..557458f 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -115,6 +115,11 @@ test_cpuflags(void)
 	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
 #endif
 
+#if defined(RTE_ARCH_ARM)
+	printf("Check for NEON:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+#endif
+
 #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 	printf("Check for SSE:\t\t");
 	CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
new file mode 100644
index 0000000..1eadb33
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -0,0 +1,177 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_ARM_H_
+#define _RTE_CPUFLAGS_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <elf.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+
+#include "generic/rte_cpuflags.h"
+
+#ifndef AT_HWCAP
+#define AT_HWCAP 16
+#endif
+
+#ifndef AT_HWCAP2
+#define AT_HWCAP2 26
+#endif
+
+/* software based registers */
+enum cpu_register_t {
+	REG_HWCAP = 0,
+	REG_HWCAP2,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+	RTE_CPUFLAG_SWP = 0,
+	RTE_CPUFLAG_HALF,
+	RTE_CPUFLAG_THUMB,
+	RTE_CPUFLAG_A26BIT,
+	RTE_CPUFLAG_FAST_MULT,
+	RTE_CPUFLAG_FPA,
+	RTE_CPUFLAG_VFP,
+	RTE_CPUFLAG_EDSP,
+	RTE_CPUFLAG_JAVA,
+	RTE_CPUFLAG_IWMMXT,
+	RTE_CPUFLAG_CRUNCH,
+	RTE_CPUFLAG_THUMBEE,
+	RTE_CPUFLAG_NEON,
+	RTE_CPUFLAG_VFPv3,
+	RTE_CPUFLAG_VFPv3D16,
+	RTE_CPUFLAG_TLS,
+	RTE_CPUFLAG_VFPv4,
+	RTE_CPUFLAG_IDIVA,
+	RTE_CPUFLAG_IDIVT,
+	RTE_CPUFLAG_VFPD32,
+	RTE_CPUFLAG_LPAE,
+	RTE_CPUFLAG_EVTSTRM,
+	RTE_CPUFLAG_AES,
+	RTE_CPUFLAG_PMULL,
+	RTE_CPUFLAG_SHA1,
+	RTE_CPUFLAG_SHA2,
+	RTE_CPUFLAG_CRC32,
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(SWP,       0x00000001, 0, REG_HWCAP,  0)
+	FEAT_DEF(HALF,      0x00000001, 0, REG_HWCAP,  1)
+	FEAT_DEF(THUMB,     0x00000001, 0, REG_HWCAP,  2)
+	FEAT_DEF(A26BIT,    0x00000001, 0, REG_HWCAP,  3)
+	FEAT_DEF(FAST_MULT, 0x00000001, 0, REG_HWCAP,  4)
+	FEAT_DEF(FPA,       0x00000001, 0, REG_HWCAP,  5)
+	FEAT_DEF(VFP,       0x00000001, 0, REG_HWCAP,  6)
+	FEAT_DEF(EDSP,      0x00000001, 0, REG_HWCAP,  7)
+	FEAT_DEF(JAVA,      0x00000001, 0, REG_HWCAP,  8)
+	FEAT_DEF(IWMMXT,    0x00000001, 0, REG_HWCAP,  9)
+	FEAT_DEF(CRUNCH,    0x00000001, 0, REG_HWCAP,  10)
+	FEAT_DEF(THUMBEE,   0x00000001, 0, REG_HWCAP,  11)
+	FEAT_DEF(NEON,      0x00000001, 0, REG_HWCAP,  12)
+	FEAT_DEF(VFPv3,     0x00000001, 0, REG_HWCAP,  13)
+	FEAT_DEF(VFPv3D16,  0x00000001, 0, REG_HWCAP,  14)
+	FEAT_DEF(TLS,       0x00000001, 0, REG_HWCAP,  15)
+	FEAT_DEF(VFPv4,     0x00000001, 0, REG_HWCAP,  16)
+	FEAT_DEF(IDIVA,     0x00000001, 0, REG_HWCAP,  17)
+	FEAT_DEF(IDIVT,     0x00000001, 0, REG_HWCAP,  18)
+	FEAT_DEF(VFPD32,    0x00000001, 0, REG_HWCAP,  19)
+	FEAT_DEF(LPAE,      0x00000001, 0, REG_HWCAP,  20)
+	FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  21)
+	FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP2,  0)
+	FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP2,  1)
+	FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP2,  2)
+	FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP2,  3)
+	FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
+};
+
+/*
+ * Read AUXV software register and get cpu features for ARM
+ */
+static inline void
+rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
+	__attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
+{
+	int auxv_fd;
+	Elf32_auxv_t auxv;
+
+	auxv_fd = open("/proc/self/auxv", O_RDONLY);
+	assert(auxv_fd);
+	while (read(auxv_fd, &auxv,
+		sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+		if (auxv.a_type == AT_HWCAP)
+			out[REG_HWCAP] = auxv.a_un.a_val;
+		else if (auxv.a_type == AT_HWCAP2)
+			out[REG_HWCAP2] = auxv.a_un.a_val;
+	}
+}
+
+/*
+ * Checks if a particular flag is available on current machine.
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs = {0};
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_ARM_H_ */
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index f595cd0..bec7bdd 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -106,6 +106,12 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
 CPUFLAGS += VSX
 endif
 
+# ARM flags
+ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)
+CPUFLAGS += NEON
+endif
+
+
 MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
 
 # To strip whitespace
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 11/16] eal/arm: detect arm architecture in cpu flags
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (9 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 10/16] eal/arm: cpu flag checks for ARM Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 12/16] eal/arm: rwlock support for ARM Jan Viktorin
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev

Based on the patch by David Hunt and Armuta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 1eadb33..17e13fc 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -52,10 +52,15 @@ extern "C" {
 #define AT_HWCAP2 26
 #endif
 
+#ifndef AT_PLATFORM
+#define AT_PLATFORM 15
+#endif
+
 /* software based registers */
 enum cpu_register_t {
 	REG_HWCAP = 0,
 	REG_HWCAP2,
+	REG_PLATFORM,
 };
 
 /**
@@ -89,6 +94,8 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_SHA1,
 	RTE_CPUFLAG_SHA2,
 	RTE_CPUFLAG_CRC32,
+	RTE_CPUFLAG_AARCH32,
+	RTE_CPUFLAG_AARCH64,
 	/* The last item */
 	RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
 };
@@ -121,6 +128,8 @@ static const struct feature_entry cpu_feature_table[] = {
 	FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP2,  2)
 	FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP2,  3)
 	FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
+	FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
+	FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
 };
 
 /*
@@ -141,6 +150,12 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
 			out[REG_HWCAP] = auxv.a_un.a_val;
 		else if (auxv.a_type == AT_HWCAP2)
 			out[REG_HWCAP2] = auxv.a_un.a_val;
+		else if (auxv.a_type == AT_PLATFORM) {
+			if (!strcmp((const char *)auxv.a_un.a_val, "aarch32"))
+				out[REG_PLATFORM] = 0x0001;
+			else if (!strcmp((const char *)auxv.a_un.a_val, "aarch64"))
+				out[REG_PLATFORM] = 0x0002;
+		}
 	}
 }
 
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 12/16] eal/arm: rwlock support for ARM
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (10 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 11/16] eal/arm: detect arm architecture in cpu flags Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 13/16] gcc/arm: avoid alignment errors to break build Jan Viktorin
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev

Just a copy from PPC.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_rwlock.h           | 40 ++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_rwlock.h b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
new file mode 100644
index 0000000..664bec8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
@@ -0,0 +1,40 @@
+/* copied from ppc_64 */
+
+#ifndef _RTE_RWLOCK_ARM_H_
+#define _RTE_RWLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_rwlock.h"
+
+static inline void
+rte_rwlock_read_lock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_read_lock(rwl);
+}
+
+static inline void
+rte_rwlock_read_unlock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_read_unlock(rwl);
+}
+
+static inline void
+rte_rwlock_write_lock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_write_lock(rwl);
+}
+
+static inline void
+rte_rwlock_write_unlock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_write_unlock(rwl);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RWLOCK_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 13/16] gcc/arm: avoid alignment errors to break build
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (11 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 12/16] eal/arm: rwlock support for ARM Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 14/16] maintainers: claim responsibility for ARMv7 Jan Viktorin
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

There several issues with alignment when compiling for ARMv7.
They are not considered to be fatal (ARMv7 supports unaligned
access of 32b words), so we just leave them as warnings. They
should be solved later, however.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
---
 mk/toolchain/gcc/rte.vars.mk | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
index 0f51c66..8f9c396 100644
--- a/mk/toolchain/gcc/rte.vars.mk
+++ b/mk/toolchain/gcc/rte.vars.mk
@@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual
 WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
 WERROR_FLAGS += -Wundef -Wwrite-strings
 
+# There are many issues reported for ARMv7 architecture
+# which are not necessarily fatal. Report as warnings.
+ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
+WERROR_FLAGS += -Wno-error
+endif
+
 # process cpu flags
 include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk
 
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 14/16] maintainers: claim responsibility for ARMv7
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (12 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 13/16] gcc/arm: avoid alignment errors to break build Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86 Jan Viktorin
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 MAINTAINERS | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 080a8e8..a8933eb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -124,6 +124,10 @@ IBM POWER
 M: Chao Zhu <chaozhu@linux.vnet.ibm.com>
 F: lib/librte_eal/common/include/arch/ppc_64/
 
+ARM v7
+M: Jan Viktorin <viktorin@rehivetech.com>
+F: lib/librte_eal/common/include/arch/arm/
+
 Intel x86
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: Konstantin Ananyev <konstantin.ananyev@intel.com>
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (13 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 14/16] maintainers: claim responsibility for ARMv7 Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-27 15:31   ` Ananyev, Konstantin
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support Jan Viktorin
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
  16 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
the function is reimplemented using non-vector operations for non-x86
architectures. In the future, each architecture should have vectorized code.
This patch includes rudimentary emulation of intrinsic functions _mm_set_epi32(),
_mm_loadu_si128() and _mm_load_si128() for easy portability of existing
applications.

LPM builds now when on ARM.

FIXME: to be reworked

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
 lib/librte_lpm/rte_lpm.h                  | 71 +++++++++++++++++++++++++++++++
 2 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
index 5b582a8..33afb33 100644
--- a/config/defconfig_arm-armv7-a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -58,7 +58,6 @@ CONFIG_XMM_SIZE=16
 
 # fails to compile on ARM
 CONFIG_RTE_LIBRTE_ACL=n
-CONFIG_RTE_LIBRTE_LPM=n
 
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index c299ce2..4619992 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -47,7 +47,9 @@
 #include <rte_byteorder.h>
 #include <rte_memory.h>
 #include <rte_common.h>
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 #include <rte_vect.h>
+#endif
 
 #ifdef __cplusplus
 extern "C" {
@@ -358,6 +360,7 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
 	return 0;
 }
 
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 /* Mask four results. */
 #define	 RTE_LPM_MASKX4_RES	UINT64_C(0x00ff00ff00ff00ff)
 
@@ -472,6 +475,74 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
 	hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
 	hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
 }
+#else
+// TODO: this code should be reworked.
+
+typedef struct {
+	union uint128 {
+		uint8_t uint8[16];
+		uint32_t uint32[4];
+	} val;
+} __m128i;
+
+static inline __m128i
+_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
+{
+	__m128i res;
+	res.val.uint32[0] = v0;
+	res.val.uint32[1] = v1;
+	res.val.uint32[2] = v2;
+	res.val.uint32[3] = v3;
+	return res;
+}
+
+static inline __m128i
+_mm_loadu_si128(__m128i * v)
+{
+	__m128i res;
+	res = *v;
+	return res;
+}
+
+static inline __m128i
+_mm_load_si128(__m128i * v)
+{
+	__m128i res;
+	res = *v;
+	return res;
+}
+
+/**
+ * Lookup four IP addresses in an LPM table.
+ *
+ * @param lpm
+ *   LPM object handle
+ * @param ip
+ *   Four IPs to be looked up in the LPM table
+ * @param hop
+ *   Next hop of the most specific rule found for IP (valid on lookup hit only).
+ *   This is an 4 elements array of two byte values.
+ *   If the lookup was succesfull for the given IP, then least significant byte
+ *   of the corresponding element is the  actual next hop and the most
+ *   significant byte is zero.
+ *   If the lookup for the given IP failed, then corresponding element would
+ *   contain default value, see description of then next parameter.
+ * @param defv
+ *   Default value to populate into corresponding element of hop[] array,
+ *   if lookup would fail.
+ */
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+	uint16_t defv)
+{
+	rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
+
+	hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
+	hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
+	hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
+	hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
+}
+#endif
 
 #ifdef __cplusplus
 }
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (14 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86 Jan Viktorin
@ 2015-10-26 16:37 ` Jan Viktorin
  2015-10-27 15:55   ` Ananyev, Konstantin
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
  16 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-26 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, David Hunt, dev

The main goal of this check is to avoid passing the -msse4.1
option to the GCC that does not support it (like arm toolchains).

Anyway, the ACL library does not compile on ARM.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 lib/librte_acl/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
index 7a1cf8a..401fb8c 100644
--- a/lib/librte_acl/Makefile
+++ b/lib/librte_acl/Makefile
@@ -50,7 +50,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
 
+CC_SSE4_1_SUPPORT := $(shell $(CC) -msse4.1 -dM -E - < /dev/null >/dev/null 2>&1 && echo 1)
+
+ifeq ($(CC_SSE4_1_SUPPORT),1)
 CFLAGS_acl_run_sse.o += -msse4.1
+endif
 
 #
 # If the compiler supports AVX2 instructions,
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86 Jan Viktorin
@ 2015-10-27 15:31   ` Ananyev, Konstantin
  2015-10-27 15:38     ` Jan Viktorin
  0 siblings, 1 reply; 72+ messages in thread
From: Ananyev, Konstantin @ 2015-10-27 15:31 UTC (permalink / raw)
  To: Jan Viktorin, Thomas Monjalon, Hunt, David, dev; +Cc: Vlastimil Kosar

Hi Jan,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jan Viktorin
> Sent: Monday, October 26, 2015 4:38 PM
> To: Thomas Monjalon; Hunt, David; dev@dpdk.org
> Cc: Vlastimil Kosar
> Subject: [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86
> 
> From: Vlastimil Kosar <kosar@rehivetech.com>
> 
> LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
> the function is reimplemented using non-vector operations for non-x86
> architectures. In the future, each architecture should have vectorized code.
> This patch includes rudimentary emulation of intrinsic functions _mm_set_epi32(),
> _mm_loadu_si128() and _mm_load_si128() for easy portability of existing
> applications.
> 
> LPM builds now when on ARM.
> 
> FIXME: to be reworked
> 
> Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> ---
>  config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
>  lib/librte_lpm/rte_lpm.h                  | 71 +++++++++++++++++++++++++++++++
>  2 files changed, 71 insertions(+), 1 deletion(-)
> 
> diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
> index 5b582a8..33afb33 100644
> --- a/config/defconfig_arm-armv7-a-linuxapp-gcc
> +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> @@ -58,7 +58,6 @@ CONFIG_XMM_SIZE=16
> 
>  # fails to compile on ARM
>  CONFIG_RTE_LIBRTE_ACL=n
> -CONFIG_RTE_LIBRTE_LPM=n
> 
>  # cannot use those on ARM
>  CONFIG_RTE_KNI_KMOD=n
> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> index c299ce2..4619992 100644
> --- a/lib/librte_lpm/rte_lpm.h
> +++ b/lib/librte_lpm/rte_lpm.h
> @@ -47,7 +47,9 @@
>  #include <rte_byteorder.h>
>  #include <rte_memory.h>
>  #include <rte_common.h>
> +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
>  #include <rte_vect.h>
> +#endif
> 
>  #ifdef __cplusplus
>  extern "C" {
> @@ -358,6 +360,7 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
>  	return 0;
>  }
> 
> +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
>  /* Mask four results. */
>  #define	 RTE_LPM_MASKX4_RES	UINT64_C(0x00ff00ff00ff00ff)
> 
> @@ -472,6 +475,74 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
>  	hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
>  	hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
>  }
> +#else

Probably better to create an lib/librte_eal/common/include/arch/arm/rte_vect.h,
and move all these x86 vector support emulation there?
Konstantin

> +// TODO: this code should be reworked.
> +
> +typedef struct {
> +	union uint128 {
> +		uint8_t uint8[16];
> +		uint32_t uint32[4];
> +	} val;
> +} __m128i;
> +
> +static inline __m128i
> +_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
> +{
> +	__m128i res;
> +	res.val.uint32[0] = v0;
> +	res.val.uint32[1] = v1;
> +	res.val.uint32[2] = v2;
> +	res.val.uint32[3] = v3;
> +	return res;
> +}
> +
> +static inline __m128i
> +_mm_loadu_si128(__m128i * v)
> +{
> +	__m128i res;
> +	res = *v;
> +	return res;
> +}
> +
> +static inline __m128i
> +_mm_load_si128(__m128i * v)
> +{
> +	__m128i res;
> +	res = *v;
> +	return res;
> +}
> +
> +/**
> + * Lookup four IP addresses in an LPM table.
> + *
> + * @param lpm
> + *   LPM object handle
> + * @param ip
> + *   Four IPs to be looked up in the LPM table
> + * @param hop
> + *   Next hop of the most specific rule found for IP (valid on lookup hit only).
> + *   This is an 4 elements array of two byte values.
> + *   If the lookup was succesfull for the given IP, then least significant byte
> + *   of the corresponding element is the  actual next hop and the most
> + *   significant byte is zero.
> + *   If the lookup for the given IP failed, then corresponding element would
> + *   contain default value, see description of then next parameter.
> + * @param defv
> + *   Default value to populate into corresponding element of hop[] array,
> + *   if lookup would fail.
> + */
> +static inline void
> +rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
> +	uint16_t defv)
> +{
> +	rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
> +
> +	hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
> +	hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
> +	hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
> +	hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
> +}
> +#endif
> 
>  #ifdef __cplusplus
>  }
> --
> 2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86
  2015-10-27 15:31   ` Ananyev, Konstantin
@ 2015-10-27 15:38     ` Jan Viktorin
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 15:38 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Vlastimil Kosar

Hi Konstantin,

On Tue, 27 Oct 2015 15:31:44 +0000
"Ananyev, Konstantin" <konstantin.ananyev@intel.com> wrote:

> Hi Jan,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jan Viktorin
> > Sent: Monday, October 26, 2015 4:38 PM
> > To: Thomas Monjalon; Hunt, David; dev@dpdk.org
> > Cc: Vlastimil Kosar
> > Subject: [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86
> > 
> > From: Vlastimil Kosar <kosar@rehivetech.com>
> > 
> > LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
> > the function is reimplemented using non-vector operations for non-x86
> > architectures. In the future, each architecture should have vectorized code.
> > This patch includes rudimentary emulation of intrinsic functions _mm_set_epi32(),
> > _mm_loadu_si128() and _mm_load_si128() for easy portability of existing
> > applications.
> > 
> > LPM builds now when on ARM.
> > 
> > FIXME: to be reworked
> > 
> > Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
> > Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> > ---
> >  config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
> >  lib/librte_lpm/rte_lpm.h                  | 71 +++++++++++++++++++++++++++++++
> >  2 files changed, 71 insertions(+), 1 deletion(-)
> > 
> > diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > index 5b582a8..33afb33 100644
> > --- a/config/defconfig_arm-armv7-a-linuxapp-gcc
> > +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > @@ -58,7 +58,6 @@ CONFIG_XMM_SIZE=16
> > 
> >  # fails to compile on ARM
> >  CONFIG_RTE_LIBRTE_ACL=n
> > -CONFIG_RTE_LIBRTE_LPM=n
> > 
> >  # cannot use those on ARM
> >  CONFIG_RTE_KNI_KMOD=n
> > diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> > index c299ce2..4619992 100644
> > --- a/lib/librte_lpm/rte_lpm.h
> > +++ b/lib/librte_lpm/rte_lpm.h
> > @@ -47,7 +47,9 @@
> >  #include <rte_byteorder.h>
> >  #include <rte_memory.h>
> >  #include <rte_common.h>
> > +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
> >  #include <rte_vect.h>
> > +#endif
> > 
> >  #ifdef __cplusplus
> >  extern "C" {
> > @@ -358,6 +360,7 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
> >  	return 0;
> >  }
> > 
> > +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
> >  /* Mask four results. */
> >  #define	 RTE_LPM_MASKX4_RES	UINT64_C(0x00ff00ff00ff00ff)
> > 
> > @@ -472,6 +475,74 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
> >  	hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
> >  	hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
> >  }
> > +#else  
> 
> Probably better to create an lib/librte_eal/common/include/arch/arm/rte_vect.h,
> and move all these x86 vector support emulation there?
> Konstantin

Sure. This patch is terribly wrong and it's not to be merged. It is a
question whether to make it this way (with the refactoring as you
suggested) or to make some general abstraction of the SSE calls in DPDK.

Jan

> 
> > +// TODO: this code should be reworked.
> > +
> > +typedef struct {
> > +	union uint128 {
> > +		uint8_t uint8[16];
> > +		uint32_t uint32[4];
> > +	} val;
> > +} __m128i;
> > +
> > +static inline __m128i
> > +_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
> > +{
> > +	__m128i res;
> > +	res.val.uint32[0] = v0;
> > +	res.val.uint32[1] = v1;
> > +	res.val.uint32[2] = v2;
> > +	res.val.uint32[3] = v3;
> > +	return res;
> > +}
> > +
> > +static inline __m128i
> > +_mm_loadu_si128(__m128i * v)
> > +{
> > +	__m128i res;
> > +	res = *v;
> > +	return res;
> > +}
> > +
> > +static inline __m128i
> > +_mm_load_si128(__m128i * v)
> > +{
> > +	__m128i res;
> > +	res = *v;
> > +	return res;
> > +}
> > +
> > +/**
> > + * Lookup four IP addresses in an LPM table.
> > + *
> > + * @param lpm
> > + *   LPM object handle
> > + * @param ip
> > + *   Four IPs to be looked up in the LPM table
> > + * @param hop
> > + *   Next hop of the most specific rule found for IP (valid on lookup hit only).
> > + *   This is an 4 elements array of two byte values.
> > + *   If the lookup was succesfull for the given IP, then least significant byte
> > + *   of the corresponding element is the  actual next hop and the most
> > + *   significant byte is zero.
> > + *   If the lookup for the given IP failed, then corresponding element would
> > + *   contain default value, see description of then next parameter.
> > + * @param defv
> > + *   Default value to populate into corresponding element of hop[] array,
> > + *   if lookup would fail.
> > + */
> > +static inline void
> > +rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
> > +	uint16_t defv)
> > +{
> > +	rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
> > +
> > +	hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
> > +	hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
> > +	hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
> > +	hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
> > +}
> > +#endif
> > 
> >  #ifdef __cplusplus
> >  }
> > --
> > 2.6.1  
> 



-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support Jan Viktorin
@ 2015-10-27 15:55   ` Ananyev, Konstantin
  2015-10-27 17:10     ` Jan Viktorin
  0 siblings, 1 reply; 72+ messages in thread
From: Ananyev, Konstantin @ 2015-10-27 15:55 UTC (permalink / raw)
  To: Jan Viktorin, Thomas Monjalon, Hunt, David, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jan Viktorin
> Sent: Monday, October 26, 2015 4:38 PM
> To: Thomas Monjalon; Hunt, David; dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support
> 
> The main goal of this check is to avoid passing the -msse4.1
> option to the GCC that does not support it (like arm toolchains).
> 
> Anyway, the ACL library does not compile on ARM.
> 
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> ---
>  lib/librte_acl/Makefile | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
> index 7a1cf8a..401fb8c 100644
> --- a/lib/librte_acl/Makefile
> +++ b/lib/librte_acl/Makefile
> @@ -50,7 +50,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
>  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
>  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> 
> +CC_SSE4_1_SUPPORT := $(shell $(CC) -msse4.1 -dM -E - < /dev/null >/dev/null 2>&1 && echo 1)
> +
> +ifeq ($(CC_SSE4_1_SUPPORT),1)
>  CFLAGS_acl_run_sse.o += -msse4.1
> +endif

I don't think acl_run_sse.c would compile if SSE4_1 is not supported.
So, I think you need to do same thing, as is done for AVX2:
Compile in acl_run_sse.c  only if SSE41 is supported by the compiler:  

- SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
-
-CFLAGS_acl_run_sse.o += -msse4.1

+CC_SSE41_SUPPORT=$(shell $(CC) -msse4.1 -dM -E - </dev/null 2>&1 | \
grep -q  && echo 1)
+ifeq ($(CC_SSE41_SUPPORT), 1)
+        SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
+        CFLAGS_rte_acl.o += -DCC_SSE41_SUPPORT
+        CFLAGS_acl_run_sse.o += -msse4.1	
+endif

And then change rte_acl_init() accordingly.
Something like:

int __attribute__ ((weak))
rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
        __rte_unused const uint8_t **data,
        __rte_unused uint32_t *results,
        __rte_unused uint32_t num,
        __rte_unused uint32_t categories)
{
        return -ENOTSUP;
}

....

static void __attribute__((constructor))
rte_acl_init(void)
{
        enum rte_acl_classify_alg alg = RTE_ACL_CLASSIFY_DEFAULT;

#if defined(CC_AVX2_SUPPORT)
        if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
                alg = RTE_ACL_CLASSIFY_AVX2;
        else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
#elif defined (CC_SSE41_SUPPORT)
        if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
                alg = RTE_ACL_CLASSIFY_SSE;
#endif

        rte_acl_set_default_classify(alg);
}

After that, I suppose, you should be able to build and (probably use) librte_acl on arm.
Konstantin

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support
  2015-10-27 15:55   ` Ananyev, Konstantin
@ 2015-10-27 17:10     ` Jan Viktorin
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 17:10 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

Hello Konstantin,

thank you for those hints! I reworked the code a little bit -
refactored the LPM vector operations into rte_vect.h which helped to
the ACL library as well. I will sort it out a little and send the v3 of
the series. The LPM and ACL now seem to be much more sane on ARM.

Regards
Jan

On Tue, 27 Oct 2015 15:55:48 +0000
"Ananyev, Konstantin" <konstantin.ananyev@intel.com> wrote:

> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jan Viktorin
> > Sent: Monday, October 26, 2015 4:38 PM
> > To: Thomas Monjalon; Hunt, David; dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support
> > 
> > The main goal of this check is to avoid passing the -msse4.1
> > option to the GCC that does not support it (like arm toolchains).
> > 
> > Anyway, the ACL library does not compile on ARM.
> > 
> > Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> > ---
> >  lib/librte_acl/Makefile | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
> > index 7a1cf8a..401fb8c 100644
> > --- a/lib/librte_acl/Makefile
> > +++ b/lib/librte_acl/Makefile
> > @@ -50,7 +50,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> > 
> > +CC_SSE4_1_SUPPORT := $(shell $(CC) -msse4.1 -dM -E - < /dev/null >/dev/null 2>&1 && echo 1)
> > +
> > +ifeq ($(CC_SSE4_1_SUPPORT),1)
> >  CFLAGS_acl_run_sse.o += -msse4.1
> > +endif  
> 
> I don't think acl_run_sse.c would compile if SSE4_1 is not supported.
> So, I think you need to do same thing, as is done for AVX2:
> Compile in acl_run_sse.c  only if SSE41 is supported by the compiler:  
> 
> - SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> -
> -CFLAGS_acl_run_sse.o += -msse4.1
> 
> +CC_SSE41_SUPPORT=$(shell $(CC) -msse4.1 -dM -E - </dev/null 2>&1 | \
> grep -q  && echo 1)
> +ifeq ($(CC_SSE41_SUPPORT), 1)
> +        SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> +        CFLAGS_rte_acl.o += -DCC_SSE41_SUPPORT
> +        CFLAGS_acl_run_sse.o += -msse4.1	
> +endif
> 
> And then change rte_acl_init() accordingly.
> Something like:
> 
> int __attribute__ ((weak))
> rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
>         __rte_unused const uint8_t **data,
>         __rte_unused uint32_t *results,
>         __rte_unused uint32_t num,
>         __rte_unused uint32_t categories)
> {
>         return -ENOTSUP;
> }
> 
> ....
> 
> static void __attribute__((constructor))
> rte_acl_init(void)
> {
>         enum rte_acl_classify_alg alg = RTE_ACL_CLASSIFY_DEFAULT;
> 
> #if defined(CC_AVX2_SUPPORT)
>         if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
>                 alg = RTE_ACL_CLASSIFY_AVX2;
>         else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
> #elif defined (CC_SSE41_SUPPORT)
>         if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
>                 alg = RTE_ACL_CLASSIFY_SSE;
> #endif
> 
>         rte_acl_set_default_classify(alg);
> }
> 
> After that, I suppose, you should be able to build and (probably use) librte_acl on arm.
> Konstantin
> 
> 
> 



-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
  2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
                   ` (15 preceding siblings ...)
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 01/17] mk: Introduce " Jan Viktorin
                     ` (18 more replies)
  16 siblings, 19 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin

Hello DPDK community,

this is the third attempt to post support for ARMv7 into the DPDK.
There are changes related to the LPM and ACL libraries only:

* included rte_vect.h, however, it is more a placeholder
* rte_lpm.h was simplified due to the previous point
* ACL now compiles as we detect whether the compiler
  supports SSE 4.1

Regards
Jan

---

You can pull the changes from

  https://github.com/RehiveTech/dpdk.git arm-support-v3

since commit 67b38d93c960e3a650b49dec1cd8b53c12ee3e2d:

  e1000: fix total byte statistics (2015-10-27 18:39:44 +0100)

up to d84581e3681de076819d202b1f09f2751d28d5be:

acl: handle when SSE 4.1 is unsupported (2015-10-27 20:03:24 +0100)

---

Jan Viktorin (8):
  eal/arm: implement rdtsc by PMU or clock_gettime
  eal/arm: use vector memcpy only when NEON is enabled
  eal/arm: detect arm architecture in cpu flags
  eal/arm: rwlock support for ARM
  gcc/arm: avoid alignment errors to break build
  maintainers: claim responsibility for ARMv7
  eal/arm: add very incomplete rte_vect
  acl: handle when SSE 4.1 is unsupported

Vlastimil Kosar (9):
  mk: Introduce ARMv7 architecture
  eal/arm: atomic operations for ARM
  eal/arm: byte order operations for ARM
  eal/arm: cpu cycle operations for ARM
  eal/arm: prefetch operations for ARM
  eal/arm: spinlock operations for ARM (without HTM)
  eal/arm: vector memcpy for ARM
  eal/arm: cpu flag checks for ARM
  lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for
    non-x86

 MAINTAINERS                                        |   4 +
 app/test/test_cpuflags.c                           |   5 +
 config/defconfig_arm-armv7-a-linuxapp-gcc          |  74 +++++
 lib/librte_acl/Makefile                            |   7 +-
 lib/librte_acl/rte_acl.c                           |  19 +-
 .../common/include/arch/arm/rte_atomic.h           | 256 ++++++++++++++++
 .../common/include/arch/arm/rte_byteorder.h        | 148 ++++++++++
 .../common/include/arch/arm/rte_cpuflags.h         | 193 ++++++++++++
 .../common/include/arch/arm/rte_cycles.h           | 121 ++++++++
 .../common/include/arch/arm/rte_memcpy.h           | 325 +++++++++++++++++++++
 .../common/include/arch/arm/rte_prefetch.h         |  61 ++++
 .../common/include/arch/arm/rte_rwlock.h           |  40 +++
 .../common/include/arch/arm/rte_spinlock.h         | 114 ++++++++
 lib/librte_eal/common/include/arch/arm/rte_vect.h  |  81 +++++
 lib/librte_lpm/rte_lpm.h                           |  24 +-
 mk/arch/arm/rte.vars.mk                            |  39 +++
 mk/machine/armv7-a/rte.vars.mk                     |  67 +++++
 mk/rte.cpuflags.mk                                 |   6 +
 mk/toolchain/gcc/rte.vars.mk                       |   6 +
 19 files changed, 1584 insertions(+), 6 deletions(-)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-28 10:09     ` David Marchand
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 02/17] eal/arm: atomic operations for ARM Jan Viktorin
                     ` (17 subsequent siblings)
  18 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

Make DPDK run on ARMv7-A architecture. This patch assumes
ARM Cortex-A9. However, it is known to be working on Cortex-A7
and Cortex-A15.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v1 -> v2:
* the -mtune parameter of GCC is configurable now
* the -mfpu=neon can be turned off

v2 -> v3: XMM_SIZE is defined in rte_vect.h in a following patch
---
 config/defconfig_arm-armv7-a-linuxapp-gcc | 75 +++++++++++++++++++++++++++++++
 mk/arch/arm/rte.vars.mk                   | 39 ++++++++++++++++
 mk/machine/armv7-a/rte.vars.mk            | 67 +++++++++++++++++++++++++++
 3 files changed, 181 insertions(+)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
new file mode 100644
index 0000000..5a778cf
--- /dev/null
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -0,0 +1,75 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All right reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv7-a"
+
+CONFIG_RTE_ARCH="arm"
+CONFIG_RTE_ARCH_ARM=y
+CONFIG_RTE_ARCH_ARMv7=y
+CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9"
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+# ARM doesn't have support for vmware TSC map
+CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
+
+# avoids using i686/x86_64 SIMD instructions, nothing for ARM
+CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
+
+# KNI is not supported on 32-bit
+CONFIG_RTE_LIBRTE_KNI=n
+
+# PCI is usually not used on ARM
+CONFIG_RTE_EAL_IGB_UIO=n
+
+# fails to compile on ARM
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_LPM=n
+
+# cannot use those on ARM
+CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_EM_PMD=n
+CONFIG_RTE_LIBRTE_IGB_PMD=n
+CONFIG_RTE_LIBRTE_CXGBE_PMD=n
+CONFIG_RTE_LIBRTE_E1000_PMD=n
+CONFIG_RTE_LIBRTE_ENIC_PMD=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+CONFIG_RTE_LIBRTE_IXGBE_PMD=n
+CONFIG_RTE_LIBRTE_MLX4_PMD=n
+CONFIG_RTE_LIBRTE_MPIPE_PMD=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
+CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
+CONFIG_RTE_LIBRTE_PMD_BNX2X=n
diff --git a/mk/arch/arm/rte.vars.mk b/mk/arch/arm/rte.vars.mk
new file mode 100644
index 0000000..df0c043
--- /dev/null
+++ b/mk/arch/arm/rte.vars.mk
@@ -0,0 +1,39 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+ARCH  ?= arm
+CROSS ?=
+
+CPU_CFLAGS  ?= -marm -DRTE_CACHE_LINE_SIZE=64 -munaligned-access
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv7-a/rte.vars.mk b/mk/machine/armv7-a/rte.vars.mk
new file mode 100644
index 0000000..48d3979
--- /dev/null
+++ b/mk/machine/armv7-a/rte.vars.mk
@@ -0,0 +1,67 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+CPU_CFLAGS += -mfloat-abi=softfp
+
+MACHINE_CFLAGS += -march=armv7-a
+
+ifdef CONFIG_RTE_ARCH_ARM_TUNE
+MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
+endif
+
+ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
+MACHINE_CFLAGS += -mfpu=neon
+endif
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 02/17] eal/arm: atomic operations for ARM
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 01/17] mk: Introduce " Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 03/17] eal/arm: byte order " Jan Viktorin
                     ` (16 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific atomic operation file
for ARM architecture. It utilizes compiler intrinsics only.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v1 -> v2:
* improve rte_wmb()
* use __atomic_* or __sync_*? (may affect the required GCC version)
---
 .../common/include/arch/arm/rte_atomic.h           | 256 +++++++++++++++++++++
 1 file changed, 256 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
new file mode 100644
index 0000000..1815766
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -0,0 +1,256 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_ARM_H_
+#define _RTE_ATOMIC_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define	rte_mb()  __sync_synchronize()
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define	rte_wmb() do { asm volatile ("dmb st" : : : "memory"); } while(0)
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define	rte_rmb() __sync_synchronize()
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, 0);
+	}
+}
+
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		/* replace the value by itself */
+		success = rte_atomic64_cmpset((volatile uint64_t *) &v->cnt,
+		                              tmp, tmp);
+	}
+	return tmp;
+}
+
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, new_value);
+	}
+}
+
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	__atomic_fetch_add(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	__atomic_fetch_sub(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	__atomic_fetch_add(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	__atomic_fetch_sub(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	return __atomic_add_fetch(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	return __atomic_sub_fetch(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	rte_atomic64_set(v, 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 03/17] eal/arm: byte order operations for ARM
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 01/17] mk: Introduce " Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 02/17] eal/arm: atomic operations for ARM Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 04/17] eal/arm: cpu cycle " Jan Viktorin
                     ` (15 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific byte order operations
for ARM. The architecture supports both big and little endian.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_byteorder.h        | 148 +++++++++++++++++++++
 1 file changed, 148 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
new file mode 100644
index 0000000..04e7b87
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
@@ -0,0 +1,148 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_ARM_H_
+#define _RTE_BYTEORDER_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	register uint16_t x = _x;
+	asm volatile ("rev16 %[x1],%[x2]"
+		      : [x1] "=r" (x)
+		      : [x2] "r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	register uint32_t x = _x;
+	asm volatile ("rev %[x1],%[x2]"
+		      : [x1] "=r" (x)
+		      : [x2] "r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+	return  __builtin_bswap64(_x);
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+/* ARM architecture is bi-endian (both big and little). */
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#else /* RTE_BIG_ENDIAN */
+
+#define rte_cpu_to_le_16(x) rte_bswap16(x)
+#define rte_cpu_to_le_32(x) rte_bswap32(x)
+#define rte_cpu_to_le_64(x) rte_bswap64(x)
+
+#define rte_cpu_to_be_16(x) (x)
+#define rte_cpu_to_be_32(x) (x)
+#define rte_cpu_to_be_64(x) (x)
+
+#define rte_le_to_cpu_16(x) rte_bswap16(x)
+#define rte_le_to_cpu_32(x) rte_bswap32(x)
+#define rte_le_to_cpu_64(x) rte_bswap64(x)
+
+#define rte_be_to_cpu_16(x) (x)
+#define rte_be_to_cpu_32(x) (x)
+#define rte_be_to_cpu_64(x) (x)
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 04/17] eal/arm: cpu cycle operations for ARM
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (2 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 03/17] eal/arm: byte order " Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 05/17] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
                     ` (14 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

ARM architecture doesn't have a suitable source of CPU cycles. This
patch uses clock_gettime instead. The implementation should be improved
in the future.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_cycles.h           | 85 ++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
new file mode 100644
index 0000000..ff66ae2
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -0,0 +1,85 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_ARM_H_
+#define _RTE_CYCLES_ARM_H_
+
+/* ARM v7 does not have suitable source of clock signals. The only clock counter
+   available in the core is 32 bit wide. Therefore it is unsuitable as the
+   counter overlaps every few seconds and probably is not accessible by
+   userspace programs. Therefore we use clock_gettime(CLOCK_MONOTONIC_RAW) to
+   simulate counter running at 1GHz.
+*/
+
+#include <time.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ *   The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	struct timespec val;
+	uint64_t v;
+
+	while (clock_gettime(CLOCK_MONOTONIC_RAW, &val) != 0)
+		/* no body */;
+
+	v  = (uint64_t) val.tv_sec * 1000000000LL;
+	v += (uint64_t) val.tv_nsec;
+	return v;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 05/17] eal/arm: implement rdtsc by PMU or clock_gettime
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (3 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 04/17] eal/arm: cpu cycle " Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 06/17] eal/arm: prefetch operations for ARM Jan Viktorin
                     ` (13 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin

Enable to choose a preferred way to read timer based on the
configuration entry CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU.
It requires a kernel module that is not included to work.

Based on the patch by David Hunt and Armuta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_cycles.h           | 38 +++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index ff66ae2..5dcef25 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -54,8 +54,14 @@ extern "C" {
  * @return
  *   The time base for this lcore.
  */
+#ifndef CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
+
+/**
+ * This call is easily portable to any ARM architecture, however,
+ * it may be damn slow and inprecise for some tasks.
+ */
 static inline uint64_t
-rte_rdtsc(void)
+__rte_rdtsc_syscall(void)
 {
 	struct timespec val;
 	uint64_t v;
@@ -67,6 +73,36 @@ rte_rdtsc(void)
 	v += (uint64_t) val.tv_nsec;
 	return v;
 }
+#define rte_rdtsc __rte_rdtsc_syscall
+
+#else
+
+/**
+ * This function requires to configure the PMCCNTR and enable
+ * userspace access to it:
+ *
+ *      asm volatile("mcr p15, 0, %0, c9, c14, 0" : : "r"(1));
+ *      asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(29));
+ *      asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(0x8000000f));
+ *
+ * which is possible only from the priviledged mode (kernel space).
+ */
+static inline uint64_t
+__rte_rdtsc_pmccntr(void)
+{
+	unsigned tsc;
+	uint64_t final_tsc;
+
+	/* Read PMCCNTR */
+	asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(tsc));
+	/* 1 tick = 64 clocks */
+	final_tsc = ((uint64_t)tsc) << 6;
+
+	return (uint64_t)final_tsc;
+}
+#define rte_rdtsc __rte_rdtsc_pmccntr
+
+#endif /* RTE_ARM_EAL_RDTSC_USE_PMU */
 
 static inline uint64_t
 rte_rdtsc_precise(void)
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 06/17] eal/arm: prefetch operations for ARM
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (4 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 05/17] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 07/17] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
                     ` (12 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific prefetch operations
for ARM architecture. It utilizes the pld instruction that
starts filling the appropriate cache line without blocking.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_prefetch.h         | 61 ++++++++++++++++++++++
 1 file changed, 61 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
new file mode 100644
index 0000000..8d75fe6
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -0,0 +1,61 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_H_
+#define _RTE_PREFETCH_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(const volatile void *p)
+{
+	asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+	asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+	asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 07/17] eal/arm: spinlock operations for ARM (without HTM)
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (5 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 06/17] eal/arm: prefetch operations for ARM Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 08/17] eal/arm: vector memcpy for ARM Jan Viktorin
                     ` (11 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds spinlock operations for ARM architecture.
We do not support HTM in spinlocks on ARM.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_spinlock.h         | 114 +++++++++++++++++++++
 1 file changed, 114 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
new file mode 100644
index 0000000..cd5ab8b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
@@ -0,0 +1,114 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_ARM_H_
+#define _RTE_SPINLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include "generic/rte_spinlock.h"
+
+/* Intrinsics are used to implement the spinlock on ARM architecture */
+
+#ifndef RTE_FORCE_INTRINSICS
+
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	while (__sync_lock_test_and_set(&sl->locked, 1))
+		while (sl->locked)
+			rte_pause();
+}
+
+static inline void
+rte_spinlock_unlock(rte_spinlock_t *sl)
+{
+	__sync_lock_release(&sl->locked);
+}
+
+static inline int
+rte_spinlock_trylock(rte_spinlock_t *sl)
+{
+	return (__sync_lock_test_and_set(&sl->locked, 1) == 0);
+}
+
+#endif
+
+static inline int rte_tm_supported(void)
+{
+	return 0;
+}
+
+static inline void
+rte_spinlock_lock_tm(rte_spinlock_t *sl)
+{
+	rte_spinlock_lock(sl); /* fall-back */
+}
+
+static inline int
+rte_spinlock_trylock_tm(rte_spinlock_t *sl)
+{
+	return rte_spinlock_trylock(sl);
+}
+
+static inline void
+rte_spinlock_unlock_tm(rte_spinlock_t *sl)
+{
+	rte_spinlock_unlock(sl);
+}
+
+static inline void
+rte_spinlock_recursive_lock_tm(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_recursive_lock(slr); /* fall-back */
+}
+
+static inline void
+rte_spinlock_recursive_unlock_tm(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_recursive_unlock(slr);
+}
+
+static inline int
+rte_spinlock_recursive_trylock_tm(rte_spinlock_recursive_t *slr)
+{
+	return rte_spinlock_recursive_trylock(slr);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 08/17] eal/arm: vector memcpy for ARM
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (6 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 07/17] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 09/17] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
                     ` (10 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

The SSE based memory copy in DPDK only support x86. This patch
adds ARM NEON based memory copy functions for ARM architecture.

The implementation improves memory copy of short or well aligned
data buffers. The following measurements show improvements over
the libc memcpy on Cortex CPUs.

               by X % faster
Length (B)   a15    a7     a9
   1         4.9  15.2    3.2
   7        56.9  48.2   40.3
   8        37.3  39.8   29.6
   9        69.3  38.7   33.9
  15        60.8  35.3   23.7
  16        50.6  35.9   35.0
  17        57.7  35.7   31.1
  31        16.0  23.3    9.0
  32        65.9  13.5   21.4
  33         3.9  10.3   -3.7
  63         2.0  12.9   -2.0
  64        66.5   0.0   16.5
  65         2.7   7.6  -35.6
 127         0.1   4.5  -18.9
 128        66.2   1.5  -51.4
 129        -0.8   3.2  -35.8
 255        -3.1  -0.9  -69.1
 256        67.9   1.2    7.2
 257        -3.6  -1.9  -36.9
 320        67.7   1.4    0.0
 384        66.8   1.4  -14.2
 511       -44.9  -2.3  -41.9
 512        67.3   1.4   -6.8
 513       -41.7  -3.0  -36.2
1023       -82.4  -2.8  -41.2
1024        68.3   1.4  -11.6
1025       -80.1  -3.3  -38.1
1518       -47.3  -5.0  -38.3
1522       -48.3  -6.0  -37.9
1600        65.4   1.3  -27.3
2048        59.5   1.5  -10.9
3072        52.3   1.5  -12.2
4096        45.3   1.4  -12.5
5120        40.6   1.5  -14.5
6144        35.4   1.4  -13.4
7168        32.9   1.4  -13.9
8192        28.2   1.4  -15.1

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_memcpy.h           | 270 +++++++++++++++++++++
 1 file changed, 270 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
new file mode 100644
index 0000000..ac885e9
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -0,0 +1,270 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_ARM_H_
+#define _RTE_MEMCPY_ARM_H_
+
+#include <stdint.h>
+#include <string.h>
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	vst1q_u8(dst, vld1q_u8(src));
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("vld1.8 {d0-d3}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3");
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+	              "vld1.8 {d4-d5}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+	              "vst1.8 {d4-d5}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5");
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+	              "vld1.8 {d4-d7}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+	              "vst1.8 {d4-d7}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7");
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("pld [%[src], #64]" : : [src] "r" (src));
+	asm volatile ("vld1.8 {d0-d3},   [%[src]]!\n\t"
+	              "vld1.8 {d4-d7},   [%[src]]!\n\t"
+	              "vld1.8 {d8-d11},  [%[src]]!\n\t"
+	              "vld1.8 {d12-d15}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3},   [%[dst]]!\n\t"
+	              "vst1.8 {d4-d7},   [%[dst]]!\n\t"
+	              "vst1.8 {d8-d11},  [%[dst]]!\n\t"
+	              "vst1.8 {d12-d15}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+	                  "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("pld [%[src], #64]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #128]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #192]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #256]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #320]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #384]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #448]" : : [src] "r" (src));
+	asm volatile ("vld1.8 {d0-d3},   [%[src]]!\n\t"
+	              "vld1.8 {d4-d7},   [%[src]]!\n\t"
+	              "vld1.8 {d8-d11},  [%[src]]!\n\t"
+	              "vld1.8 {d12-d15}, [%[src]]!\n\t"
+	              "vld1.8 {d16-d19}, [%[src]]!\n\t"
+	              "vld1.8 {d20-d23}, [%[src]]!\n\t"
+	              "vld1.8 {d24-d27}, [%[src]]!\n\t"
+	              "vld1.8 {d28-d31}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3},   [%[dst]]!\n\t"
+	              "vst1.8 {d4-d7},   [%[dst]]!\n\t"
+	              "vst1.8 {d8-d11},  [%[dst]]!\n\t"
+	              "vst1.8 {d12-d15}, [%[dst]]!\n\t"
+	              "vst1.8 {d16-d19}, [%[dst]]!\n\t"
+	              "vst1.8 {d20-d23}, [%[dst]]!\n\t"
+	              "vst1.8 {d24-d27}, [%[dst]]!\n\t"
+	              "vst1.8 {d28-d31}, [%[dst]]!\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+	                  "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15",
+	                  "d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23",
+	                  "d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31");
+}
+
+#define rte_memcpy(dst, src, n)              \
+	({ (__builtin_constant_p(n)) ?       \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)); })
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			/* ARMv7 can not handle unaligned access to long long
+			 * (uint64_t). Therefore two uint32_t operations are used.
+			 * TODO: use NEON too?
+			 */
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+			*(uint32_t *)dst = *(const uint32_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n,
+			(const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n,
+			(const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0)
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 09/17] eal/arm: use vector memcpy only when NEON is enabled
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (7 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 08/17] eal/arm: vector memcpy for ARM Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 10/17] eal/arm: cpu flag checks for ARM Jan Viktorin
                     ` (9 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin

The GCC can be configured to avoid using NEON extensions.
For that purpose, we provide just the memcpy implementation
of the rte_memcpy.

Based on the patch by David Hunt and Armuta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_memcpy.h           | 59 +++++++++++++++++++++-
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index ac885e9..75e8bda 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -35,8 +35,6 @@
 
 #include <stdint.h>
 #include <string.h>
-/* ARM NEON Intrinsics are used to copy data */
-#include <arm_neon.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -44,6 +42,11 @@ extern "C" {
 
 #include "generic/rte_memcpy.h"
 
+#ifdef __ARM_NEON_FP
+
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
 static inline void
 rte_mov16(uint8_t *dst, const uint8_t *src)
 {
@@ -263,6 +266,58 @@ rte_memcpy_func(void *dst, const void *src, size_t n)
 	return ret;
 }
 
+#else
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 256);
+}
+
+static inline void *
+rte_memcpy(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+#endif /* __ARM_NEON_FP */
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 10/17] eal/arm: cpu flag checks for ARM
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (8 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 09/17] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 11/17] eal/arm: detect arm architecture in cpu flags Jan Viktorin
                     ` (8 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

This implementation is based on IBM POWER version of
rte_cpuflags. We use software emulation of HW capability
registers, because those are usually not directly accessible
from userspace on ARM.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 app/test/test_cpuflags.c                           |   5 +
 .../common/include/arch/arm/rte_cpuflags.h         | 177 +++++++++++++++++++++
 mk/rte.cpuflags.mk                                 |   6 +
 3 files changed, 188 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 5b92061..557458f 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -115,6 +115,11 @@ test_cpuflags(void)
 	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
 #endif
 
+#if defined(RTE_ARCH_ARM)
+	printf("Check for NEON:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+#endif
+
 #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 	printf("Check for SSE:\t\t");
 	CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
new file mode 100644
index 0000000..1eadb33
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -0,0 +1,177 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_ARM_H_
+#define _RTE_CPUFLAGS_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <elf.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+
+#include "generic/rte_cpuflags.h"
+
+#ifndef AT_HWCAP
+#define AT_HWCAP 16
+#endif
+
+#ifndef AT_HWCAP2
+#define AT_HWCAP2 26
+#endif
+
+/* software based registers */
+enum cpu_register_t {
+	REG_HWCAP = 0,
+	REG_HWCAP2,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+	RTE_CPUFLAG_SWP = 0,
+	RTE_CPUFLAG_HALF,
+	RTE_CPUFLAG_THUMB,
+	RTE_CPUFLAG_A26BIT,
+	RTE_CPUFLAG_FAST_MULT,
+	RTE_CPUFLAG_FPA,
+	RTE_CPUFLAG_VFP,
+	RTE_CPUFLAG_EDSP,
+	RTE_CPUFLAG_JAVA,
+	RTE_CPUFLAG_IWMMXT,
+	RTE_CPUFLAG_CRUNCH,
+	RTE_CPUFLAG_THUMBEE,
+	RTE_CPUFLAG_NEON,
+	RTE_CPUFLAG_VFPv3,
+	RTE_CPUFLAG_VFPv3D16,
+	RTE_CPUFLAG_TLS,
+	RTE_CPUFLAG_VFPv4,
+	RTE_CPUFLAG_IDIVA,
+	RTE_CPUFLAG_IDIVT,
+	RTE_CPUFLAG_VFPD32,
+	RTE_CPUFLAG_LPAE,
+	RTE_CPUFLAG_EVTSTRM,
+	RTE_CPUFLAG_AES,
+	RTE_CPUFLAG_PMULL,
+	RTE_CPUFLAG_SHA1,
+	RTE_CPUFLAG_SHA2,
+	RTE_CPUFLAG_CRC32,
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(SWP,       0x00000001, 0, REG_HWCAP,  0)
+	FEAT_DEF(HALF,      0x00000001, 0, REG_HWCAP,  1)
+	FEAT_DEF(THUMB,     0x00000001, 0, REG_HWCAP,  2)
+	FEAT_DEF(A26BIT,    0x00000001, 0, REG_HWCAP,  3)
+	FEAT_DEF(FAST_MULT, 0x00000001, 0, REG_HWCAP,  4)
+	FEAT_DEF(FPA,       0x00000001, 0, REG_HWCAP,  5)
+	FEAT_DEF(VFP,       0x00000001, 0, REG_HWCAP,  6)
+	FEAT_DEF(EDSP,      0x00000001, 0, REG_HWCAP,  7)
+	FEAT_DEF(JAVA,      0x00000001, 0, REG_HWCAP,  8)
+	FEAT_DEF(IWMMXT,    0x00000001, 0, REG_HWCAP,  9)
+	FEAT_DEF(CRUNCH,    0x00000001, 0, REG_HWCAP,  10)
+	FEAT_DEF(THUMBEE,   0x00000001, 0, REG_HWCAP,  11)
+	FEAT_DEF(NEON,      0x00000001, 0, REG_HWCAP,  12)
+	FEAT_DEF(VFPv3,     0x00000001, 0, REG_HWCAP,  13)
+	FEAT_DEF(VFPv3D16,  0x00000001, 0, REG_HWCAP,  14)
+	FEAT_DEF(TLS,       0x00000001, 0, REG_HWCAP,  15)
+	FEAT_DEF(VFPv4,     0x00000001, 0, REG_HWCAP,  16)
+	FEAT_DEF(IDIVA,     0x00000001, 0, REG_HWCAP,  17)
+	FEAT_DEF(IDIVT,     0x00000001, 0, REG_HWCAP,  18)
+	FEAT_DEF(VFPD32,    0x00000001, 0, REG_HWCAP,  19)
+	FEAT_DEF(LPAE,      0x00000001, 0, REG_HWCAP,  20)
+	FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  21)
+	FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP2,  0)
+	FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP2,  1)
+	FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP2,  2)
+	FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP2,  3)
+	FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
+};
+
+/*
+ * Read AUXV software register and get cpu features for ARM
+ */
+static inline void
+rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
+	__attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
+{
+	int auxv_fd;
+	Elf32_auxv_t auxv;
+
+	auxv_fd = open("/proc/self/auxv", O_RDONLY);
+	assert(auxv_fd);
+	while (read(auxv_fd, &auxv,
+		sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+		if (auxv.a_type == AT_HWCAP)
+			out[REG_HWCAP] = auxv.a_un.a_val;
+		else if (auxv.a_type == AT_HWCAP2)
+			out[REG_HWCAP2] = auxv.a_un.a_val;
+	}
+}
+
+/*
+ * Checks if a particular flag is available on current machine.
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs = {0};
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_ARM_H_ */
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index f595cd0..bec7bdd 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -106,6 +106,12 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
 CPUFLAGS += VSX
 endif
 
+# ARM flags
+ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)
+CPUFLAGS += NEON
+endif
+
+
 MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
 
 # To strip whitespace
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 11/17] eal/arm: detect arm architecture in cpu flags
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (9 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 10/17] eal/arm: cpu flag checks for ARM Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 12/17] eal/arm: rwlock support for ARM Jan Viktorin
                     ` (7 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin

Based on the patch by David Hunt and Armuta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
v2 -> v3: fixed forgotten include of string.h
---
 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 1eadb33..7ce9d14 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -41,6 +41,7 @@ extern "C" {
 #include <fcntl.h>
 #include <assert.h>
 #include <unistd.h>
+#include <string.h>
 
 #include "generic/rte_cpuflags.h"
 
@@ -52,10 +53,15 @@ extern "C" {
 #define AT_HWCAP2 26
 #endif
 
+#ifndef AT_PLATFORM
+#define AT_PLATFORM 15
+#endif
+
 /* software based registers */
 enum cpu_register_t {
 	REG_HWCAP = 0,
 	REG_HWCAP2,
+	REG_PLATFORM,
 };
 
 /**
@@ -89,6 +95,8 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_SHA1,
 	RTE_CPUFLAG_SHA2,
 	RTE_CPUFLAG_CRC32,
+	RTE_CPUFLAG_AARCH32,
+	RTE_CPUFLAG_AARCH64,
 	/* The last item */
 	RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
 };
@@ -121,6 +129,8 @@ static const struct feature_entry cpu_feature_table[] = {
 	FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP2,  2)
 	FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP2,  3)
 	FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
+	FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
+	FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
 };
 
 /*
@@ -141,6 +151,12 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
 			out[REG_HWCAP] = auxv.a_un.a_val;
 		else if (auxv.a_type == AT_HWCAP2)
 			out[REG_HWCAP2] = auxv.a_un.a_val;
+		else if (auxv.a_type == AT_PLATFORM) {
+			if (!strcmp((const char *)auxv.a_un.a_val, "aarch32"))
+				out[REG_PLATFORM] = 0x0001;
+			else if (!strcmp((const char *)auxv.a_un.a_val, "aarch64"))
+				out[REG_PLATFORM] = 0x0002;
+		}
 	}
 }
 
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 12/17] eal/arm: rwlock support for ARM
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (10 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 11/17] eal/arm: detect arm architecture in cpu flags Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build Jan Viktorin
                     ` (6 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin

Just a copy from PPC.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_rwlock.h           | 40 ++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_rwlock.h b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
new file mode 100644
index 0000000..664bec8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
@@ -0,0 +1,40 @@
+/* copied from ppc_64 */
+
+#ifndef _RTE_RWLOCK_ARM_H_
+#define _RTE_RWLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_rwlock.h"
+
+static inline void
+rte_rwlock_read_lock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_read_lock(rwl);
+}
+
+static inline void
+rte_rwlock_read_unlock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_read_unlock(rwl);
+}
+
+static inline void
+rte_rwlock_write_lock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_write_lock(rwl);
+}
+
+static inline void
+rte_rwlock_write_unlock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_write_unlock(rwl);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RWLOCK_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (11 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 12/17] eal/arm: rwlock support for ARM Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-28 12:16     ` David Marchand
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 14/17] maintainers: claim responsibility for ARMv7 Jan Viktorin
                     ` (5 subsequent siblings)
  18 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

There several issues with alignment when compiling for ARMv7.
They are not considered to be fatal (ARMv7 supports unaligned
access of 32b words), so we just leave them as warnings. They
should be solved later, however.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
---
 mk/toolchain/gcc/rte.vars.mk | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
index 0f51c66..8f9c396 100644
--- a/mk/toolchain/gcc/rte.vars.mk
+++ b/mk/toolchain/gcc/rte.vars.mk
@@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual
 WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
 WERROR_FLAGS += -Wundef -Wwrite-strings
 
+# There are many issues reported for ARMv7 architecture
+# which are not necessarily fatal. Report as warnings.
+ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
+WERROR_FLAGS += -Wno-error
+endif
+
 # process cpu flags
 include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk
 
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 14/17] maintainers: claim responsibility for ARMv7
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (12 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 15/17] eal/arm: add very incomplete rte_vect Jan Viktorin
                     ` (4 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 MAINTAINERS | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 080a8e8..a8933eb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -124,6 +124,10 @@ IBM POWER
 M: Chao Zhu <chaozhu@linux.vnet.ibm.com>
 F: lib/librte_eal/common/include/arch/ppc_64/
 
+ARM v7
+M: Jan Viktorin <viktorin@rehivetech.com>
+F: lib/librte_eal/common/include/arch/arm/
+
 Intel x86
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: Konstantin Ananyev <konstantin.ananyev@intel.com>
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 15/17] eal/arm: add very incomplete rte_vect
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (13 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 14/17] maintainers: claim responsibility for ARMv7 Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 16/17] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for non-x86 Jan Viktorin
                     ` (3 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin

This patch does not map x86 SIMD operations to the ARM ones.
It just fills the necessary gap between the platforms to enable
compilation of libraries LPM (includes rte_vect.h, lpm_test needs
those SIMD functions) and ACL (includes rte_vect.h).

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 lib/librte_eal/common/include/arch/arm/rte_vect.h | 81 +++++++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
new file mode 100644
index 0000000..b346c7d
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
@@ -0,0 +1,81 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_VECT_ARM_H_
+#define _RTE_VECT_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define XMM_SIZE 16
+#define XMM_MASK (XMM_MASK - 1)
+
+typedef struct {
+	union uint128 {
+		uint8_t uint8[16];
+		uint32_t uint32[4];
+	} val;
+} __m128i;
+
+static inline __m128i
+_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
+{
+	__m128i res;
+	res.val.uint32[0] = v0;
+	res.val.uint32[1] = v1;
+	res.val.uint32[2] = v2;
+	res.val.uint32[3] = v3;
+	return res;
+}
+
+static inline __m128i
+_mm_loadu_si128(__m128i * v)
+{
+	__m128i res;
+	res = *v;
+	return res;
+}
+
+static inline __m128i
+_mm_load_si128(__m128i * v)
+{
+	__m128i res;
+	res = *v;
+	return res;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 16/17] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for non-x86
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (14 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 15/17] eal/arm: add very incomplete rte_vect Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 17/17] acl: handle when SSE 4.1 is unsupported Jan Viktorin
                     ` (2 subsequent siblings)
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar

From: Vlastimil Kosar <kosar@rehivetech.com>

LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
the function is reimplemented using non-vector operations for non-x86
architectures.

LPM now builds for ARM.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v2 -> v3: as SIMD operations have been moved to rte_vect.h,
          this patch is now quite clear and just defines the
          non-x86 version of rte_lpm_lookupx4
---
 config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
 lib/librte_lpm/rte_lpm.h                  | 24 +++++++++++++++++++++---
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
index 5a778cf..a2c8b95 100644
--- a/config/defconfig_arm-armv7-a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -55,7 +55,6 @@ CONFIG_RTE_EAL_IGB_UIO=n
 
 # fails to compile on ARM
 CONFIG_RTE_LIBRTE_ACL=n
-CONFIG_RTE_LIBRTE_LPM=n
 
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index c299ce2..c02b355 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -358,9 +358,6 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
 	return 0;
 }
 
-/* Mask four results. */
-#define	 RTE_LPM_MASKX4_RES	UINT64_C(0x00ff00ff00ff00ff)
-
 /**
  * Lookup four IP addresses in an LPM table.
  *
@@ -382,6 +379,14 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
  */
 static inline void
 rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+	uint16_t defv);
+
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
+/* Mask four results. */
+#define	 RTE_LPM_MASKX4_RES	UINT64_C(0x00ff00ff00ff00ff)
+
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
 	uint16_t defv)
 {
 	__m128i i24;
@@ -472,6 +477,19 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
 	hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
 	hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
 }
+#else
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+	uint16_t defv)
+{
+	rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
+
+	hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
+	hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
+	hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
+	hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
+}
+#endif
 
 #ifdef __cplusplus
 }
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v3 17/17] acl: handle when SSE 4.1 is unsupported
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (15 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 16/17] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for non-x86 Jan Viktorin
@ 2015-10-27 19:13   ` Jan Viktorin
  2015-10-28 14:54   ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture David Marchand
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
  18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
  To: dev, David Hunt, David Marchand, Ananyev, Konstantin

The main goal of this check is to avoid passing the -msse4.1
option to the GCC that does not support it (like arm toolchains).

The ACL now builds for ARM.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v2 -> v3: handle missing SSE as suggested by K. Ananyev
---
 lib/librte_acl/Makefile  |  7 ++++++-
 lib/librte_acl/rte_acl.c | 19 +++++++++++++++++--
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
index 7a1cf8a..ed95f03 100644
--- a/lib/librte_acl/Makefile
+++ b/lib/librte_acl/Makefile
@@ -48,9 +48,14 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
-SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
 
+CC_SSE4_1_SUPPORT := $(shell $(CC) -msse4.1 -dM -E - < /dev/null >/dev/null 2>&1 && echo 1)
+
+ifeq ($(CC_SSE4_1_SUPPORT),1)
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
+CFLAGS_rte_acl.o += -DCC_SSE41_SUPPORT
 CFLAGS_acl_run_sse.o += -msse4.1
+endif
 
 #
 # If the compiler supports AVX2 instructions,
diff --git a/lib/librte_acl/rte_acl.c b/lib/librte_acl/rte_acl.c
index d60219f..e7822de 100644
--- a/lib/librte_acl/rte_acl.c
+++ b/lib/librte_acl/rte_acl.c
@@ -42,6 +42,20 @@ static struct rte_tailq_elem rte_acl_tailq = {
 EAL_REGISTER_TAILQ(rte_acl_tailq)
 
 /*
+ * If the compiler doesn't support SSE instructions,
+ * then the dummy one would be used instead for SSE classify method.
+ */
+int __attribute__ ((weak))
+rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
+	__rte_unused const uint8_t **data,
+	__rte_unused uint32_t *results,
+	__rte_unused uint32_t num,
+	__rte_unused uint32_t categories)
+{
+	return -ENOTSUP;
+}
+
+/*
  * If the compiler doesn't support AVX2 instructions,
  * then the dummy one would be used instead for AVX2 classify method.
  */
@@ -97,10 +111,11 @@ rte_acl_init(void)
 	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
 		alg = RTE_ACL_CLASSIFY_AVX2;
 	else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
-#else
+		alg = RTE_ACL_CLASSIFY_SSE;
+#elif defined (CC_SSE41_SUPPORT)
 	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
-#endif
 		alg = RTE_ACL_CLASSIFY_SSE;
+#endif
 
 	rte_acl_set_default_classify(alg);
 }
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 01/17] mk: Introduce " Jan Viktorin
@ 2015-10-28 10:09     ` David Marchand
  2015-10-28 10:56       ` Jan Viktorin
  0 siblings, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 10:09 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev, Vlastimil Kosar

Hello Jan,

On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:

>
> diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc
> b/config/defconfig_arm-armv7-a-linuxapp-gcc
> new file mode 100644
> index 0000000..5a778cf
> --- /dev/null
> +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> +
> +# avoids using i686/x86_64 SIMD instructions, nothing for ARM
> +CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
>

(<unrelated>yet another build flag which has to disappear, and bitmap
header should be moved from librte_sched to eal with arch-specific
implementations when applicable</unrelated>)

Well, I am a bit confused by this comment.
For me, gcc provides ctzll builtins.
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

And with your patchset applied, it builds fine with
RTE_BITMAP_OPTIMIZATIONS enabled using gcc 4.7.3 for arm on ubuntu 14.04.
Is there a dependency on gcc version ?


+# PCI is usually not used on ARM
> +CONFIG_RTE_EAL_IGB_UIO=n
>

Not sure "usually not used" is a good reason to disable something.
Is there a real issue on arm with igb_uio code (compilation, pci accesses) ?


Thanks.

-- 
David Marchand

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
  2015-10-28 10:09     ` David Marchand
@ 2015-10-28 10:56       ` Jan Viktorin
  2015-10-28 13:40         ` David Marchand
  2015-10-28 13:44         ` Hunt, David
  0 siblings, 2 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-28 10:56 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Vlastimil Kosar

On Wed, 28 Oct 2015 11:09:21 +0100
David Marchand <david.marchand@6wind.com> wrote:

> Hello Jan,
> 
> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
> wrote:
> 
> >
> > diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc
> > b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > new file mode 100644
> > index 0000000..5a778cf
> > --- /dev/null
> > +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > +
> > +# avoids using i686/x86_64 SIMD instructions, nothing for ARM
> > +CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
> >  
> 
> (<unrelated>yet another build flag which has to disappear, and bitmap
> header should be moved from librte_sched to eal with arch-specific
> implementations when applicable</unrelated>)
> 
> Well, I am a bit confused by this comment.
> For me, gcc provides ctzll builtins.
> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
> 
> And with your patchset applied, it builds fine with
> RTE_BITMAP_OPTIMIZATIONS enabled using gcc 4.7.3 for arm on ubuntu 14.04.
> Is there a dependency on gcc version ?

It seems, there is no need for this. I will remove it. DPDK compiles
well.

> 
> 
> +# PCI is usually not used on ARM
> > +CONFIG_RTE_EAL_IGB_UIO=n
> >  
> 
> Not sure "usually not used" is a good reason to disable something.
> Is there a real issue on arm with igb_uio code (compilation, pci accesses) ?
> 

Well, it requires to set some options in Linux Kernel (at least PCI
support) which are usually disabled by the in-kernel *arm*_defconfigs.
Moreover, it seems I cannot enable it for some ARM architectures (I've
tried Altera SoC FPGA). That's because you hardly find an ARMv7 system
with a PCI bus. I suppose that if somebody _really_ needs this, she would
enable it by hand.

At the moment, it breaks my common builds... The driver is mostly
useless on ARMv7 and just takes space in the filesystem.

> 
> Thanks.
> 

Regards
Jan



-- 
  Jan Viktorin                E-mail: Viktorin@RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build Jan Viktorin
@ 2015-10-28 12:16     ` David Marchand
  2015-10-28 17:34       ` Jan Viktorin
  0 siblings, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 12:16 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev, Vlastimil Kosar

On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:

> There several issues with alignment when compiling for ARMv7.
> They are not considered to be fatal (ARMv7 supports unaligned
> access of 32b words), so we just leave them as warnings. They
> should be solved later, however.
>
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
> ---
>  mk/toolchain/gcc/rte.vars.mk | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
> index 0f51c66..8f9c396 100644
> --- a/mk/toolchain/gcc/rte.vars.mk
> +++ b/mk/toolchain/gcc/rte.vars.mk
> @@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs
> -Wcast-qual
>  WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
>  WERROR_FLAGS += -Wundef -Wwrite-strings
>
> +# There are many issues reported for ARMv7 architecture
> +# which are not necessarily fatal. Report as warnings.
> +ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
> +WERROR_FLAGS += -Wno-error
> +endif
> +
>

Can we disable only "known" problems ?

Something like :
WERROR_FLAGS += -Wno-error=cast-align


-- 
David Marchand

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 01/16] mk: Introduce " Jan Viktorin
@ 2015-10-28 13:34   ` David Marchand
  2015-10-28 17:32     ` Jan Viktorin
  2015-10-28 13:39   ` David Marchand
  1 sibling, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 13:34 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev, Vlastimil Kosar

On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:

> From: Vlastimil Kosar <kosar@rehivetech.com>
>
> Make DPDK run on ARMv7-A architecture. This patch assumes
> ARM Cortex-A9. However, it is known to be working on Cortex-A7
> and Cortex-A15.
>
> Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> ---
> v1 -> v2:
> * the -mtune parameter of GCC is configurable now
> * the -mfpu=neon can be turned off
>
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> ---
>  config/defconfig_arm-armv7-a-linuxapp-gcc | 78
> +++++++++++++++++++++++++++++++
>  mk/arch/arm/rte.vars.mk                   | 39 ++++++++++++++++
>  mk/machine/armv7-a/rte.vars.mk            | 67 ++++++++++++++++++++++++++
>  3 files changed, 184 insertions(+)
>  create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
>  create mode 100644 mk/arch/arm/rte.vars.mk
>  create mode 100644 mk/machine/armv7-a/rte.vars.mk
>

This patch comes too early in the patchset, I would put it once compilation
is fine (more comment to come, btw), so once all headers are in place, not
before.

Besides, do we really need this -a suffix ?


-- 
David Marchand

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture
  2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 01/16] mk: Introduce " Jan Viktorin
  2015-10-28 13:34   ` David Marchand
@ 2015-10-28 13:39   ` David Marchand
  2015-10-28 17:32     ` Jan Viktorin
  1 sibling, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 13:39 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev, Vlastimil Kosar

On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:

>
> diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc
> b/config/defconfig_arm-armv7-a-linuxapp-gcc
> new file mode 100644
> index 0000000..5b582a8
> --- /dev/null
> +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> @@ -0,0 +1,78 @@
>
> +# fails to compile on ARM
> +CONFIG_RTE_LIBRTE_ACL=n
> +CONFIG_RTE_LIBRTE_LPM=n
>

librte_lpm is used by librte_table, used by librte_pipeline.

So until lpm is fixed (later in your patchset), this config file won't
build.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
  2015-10-28 10:56       ` Jan Viktorin
@ 2015-10-28 13:40         ` David Marchand
  2015-10-28 13:44         ` Hunt, David
  1 sibling, 0 replies; 72+ messages in thread
From: David Marchand @ 2015-10-28 13:40 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev, Vlastimil Kosar

On Wed, Oct 28, 2015 at 11:56 AM, Jan Viktorin <viktorin@rehivetech.com>
wrote:

> On Wed, 28 Oct 2015 11:09:21 +0100
> David Marchand <david.marchand@6wind.com> wrote:
>
> > +# PCI is usually not used on ARM
> > > +CONFIG_RTE_EAL_IGB_UIO=n
> > >
> >
> > Not sure "usually not used" is a good reason to disable something.
> > Is there a real issue on arm with igb_uio code (compilation, pci
> accesses) ?
> >
>
> Well, it requires to set some options in Linux Kernel (at least PCI
> support) which are usually disabled by the in-kernel *arm*_defconfigs.
> Moreover, it seems I cannot enable it for some ARM architectures (I've
> tried Altera SoC FPGA). That's because you hardly find an ARMv7 system
> with a PCI bus. I suppose that if somebody _really_ needs this, she would
> enable it by hand.
>
> At the moment, it breaks my common builds... The driver is mostly
> useless on ARMv7 and just takes space in the filesystem.
>
>
Ok, well, at the moment, you seem to be the only user :-)
Let's see what other people say.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
  2015-10-28 10:56       ` Jan Viktorin
  2015-10-28 13:40         ` David Marchand
@ 2015-10-28 13:44         ` Hunt, David
  1 sibling, 0 replies; 72+ messages in thread
From: Hunt, David @ 2015-10-28 13:44 UTC (permalink / raw)
  To: Jan Viktorin, David Marchand; +Cc: dev, Vlastimil Kosar

On 28/10/2015 10:56, Jan Viktorin wrote:
> On Wed, 28 Oct 2015 11:09:21 +0100
> David Marchand <david.marchand@6wind.com> wrote:
>
>> Hello Jan,
>>
>> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
>> wrote:
>>
>> +# PCI is usually not used on ARM
>>> +CONFIG_RTE_EAL_IGB_UIO=n
>>>
>>
>> Not sure "usually not used" is a good reason to disable something.
>> Is there a real issue on arm with igb_uio code (compilation, pci accesses) ?
>>
>
> Well, it requires to set some options in Linux Kernel (at least PCI
> support) which are usually disabled by the in-kernel *arm*_defconfigs.
> Moreover, it seems I cannot enable it for some ARM architectures (I've
> tried Altera SoC FPGA). That's because you hardly find an ARMv7 system
> with a PCI bus. I suppose that if somebody _really_ needs this, she would
> enable it by hand.
>
> At the moment, it breaks my common builds... The driver is mostly
> useless on ARMv7 and just takes space in the filesystem.

I have an ARMv8 board here that I've built a new kernel for the purposes 
of an ARMv8 port, and it took quite a while to get the PCI
functionality all working, including implementing a fix to the kernel 
PCI driver to expose the mmap resources in sysfs properly. But after 
that, igb_uio compiles fine (on the ARMv8 patch) and works with a 
Niantic to pass traffic between ports.

If the majority of ARMv7 boards don't have a PCI bus, then I'd suggest 
leaving igb_uio disabled. Those few boards with PCI will most likely 
have a correctly kernel (and source) ready to go, so enabling igb_uio 
for them will be easy, but disabling seems a more sensible default for 
the majority of ARMv7 users.

Rgds,
Dave.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (16 preceding siblings ...)
  2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 17/17] acl: handle when SSE 4.1 is unsupported Jan Viktorin
@ 2015-10-28 14:54   ` David Marchand
  2015-10-28 17:38     ` Jan Viktorin
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
  18 siblings, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 14:54 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

Hello Jan,

On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:

> Hello DPDK community,
>
> this is the third attempt to post support for ARMv7 into the DPDK.
> There are changes related to the LPM and ACL libraries only:
>
> * included rte_vect.h, however, it is more a placeholder
> * rte_lpm.h was simplified due to the previous point
> * ACL now compiles as we detect whether the compiler
>   supports SSE 4.1
>

This patchset looks good to me (with the minor comments I sent).
And armv8 support should fit quite well in this.

A last few things :
- checkpatch is not happy with some patches, can you have a look at this ?
- can you update the 2.2 release notes as part of this patchset to announce
armv7 support ?
- I am not really sure the acl et lpm fixes really belong to this patchset
as a more larger cleanup is necessary to have all libraries compile fine on
non-x86
- since you introduce a new architecture, do you intend to run daily build
checks and send reports to the test-report mailing list ?


Thanks.

-- 
David Marchand

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture
  2015-10-28 13:34   ` David Marchand
@ 2015-10-28 17:32     ` Jan Viktorin
  2015-10-28 17:36       ` Richardson, Bruce
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-28 17:32 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Vlastimil Kosar

On Wed, 28 Oct 2015 14:34:40 +0100
David Marchand <david.marchand@6wind.com> wrote:

> On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin <viktorin@rehivetech.com>
> wrote:
> 
> > From: Vlastimil Kosar <kosar@rehivetech.com>
> >
> > Make DPDK run on ARMv7-A architecture. This patch assumes
> > ARM Cortex-A9. However, it is known to be working on Cortex-A7
> > and Cortex-A15.
> >
> > Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
> > Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> > ---
> > v1 -> v2:
> > * the -mtune parameter of GCC is configurable now
> > * the -mfpu=neon can be turned off
> >
> > Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> > ---
> >  config/defconfig_arm-armv7-a-linuxapp-gcc | 78
> > +++++++++++++++++++++++++++++++
> >  mk/arch/arm/rte.vars.mk                   | 39 ++++++++++++++++
> >  mk/machine/armv7-a/rte.vars.mk            | 67 ++++++++++++++++++++++++++
> >  3 files changed, 184 insertions(+)
> >  create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
> >  create mode 100644 mk/arch/arm/rte.vars.mk
> >  create mode 100644 mk/machine/armv7-a/rte.vars.mk
> >  
> 
> This patch comes too early in the patchset, I would put it once compilation
> is fine (more comment to come, btw), so once all headers are in place, not
> before.

Agree, this was done by watching the Power 8 patchset. But this seems
quite logical.

> 
> Besides, do we really need this -a suffix ?

It is the full name of the ARM architecture - armv7, A profile. There
are 3 profiles: A - application, R - real time, M - microcontroller.
They differ in what MMU/MPU and other such things they support. But,
finally, we can omit. The M profile is unsuitable for DPDK anyway. The
R profile may be however used under certain circumstances (I believe, it
can run Linux).

Jan

-- 
  Jan Viktorin                E-mail: Viktorin@RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture
  2015-10-28 13:39   ` David Marchand
@ 2015-10-28 17:32     ` Jan Viktorin
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-28 17:32 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Vlastimil Kosar

On Wed, 28 Oct 2015 14:39:27 +0100
David Marchand <david.marchand@6wind.com> wrote:

> On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin <viktorin@rehivetech.com>
> wrote:
> 
> >
> > diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc
> > b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > new file mode 100644
> > index 0000000..5b582a8
> > --- /dev/null
> > +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > @@ -0,0 +1,78 @@
> >
> > +# fails to compile on ARM
> > +CONFIG_RTE_LIBRTE_ACL=n
> > +CONFIG_RTE_LIBRTE_LPM=n
> >  
> 
> librte_lpm is used by librte_table, used by librte_pipeline.
> 
> So until lpm is fixed (later in your patchset), this config file won't
> build.
> 
> 

So, you mean to disable all those? OK.

Jan


-- 
  Jan Viktorin                E-mail: Viktorin@RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build
  2015-10-28 12:16     ` David Marchand
@ 2015-10-28 17:34       ` Jan Viktorin
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-28 17:34 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Vlastimil Kosar

On Wed, 28 Oct 2015 13:16:24 +0100
David Marchand <david.marchand@6wind.com> wrote:

> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
> wrote:
> 
> > There several issues with alignment when compiling for ARMv7.
> > They are not considered to be fatal (ARMv7 supports unaligned
> > access of 32b words), so we just leave them as warnings. They
> > should be solved later, however.
> >
> > Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> > Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
> > ---
> >  mk/toolchain/gcc/rte.vars.mk | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
> > index 0f51c66..8f9c396 100644
> > --- a/mk/toolchain/gcc/rte.vars.mk
> > +++ b/mk/toolchain/gcc/rte.vars.mk
> > @@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs
> > -Wcast-qual
> >  WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
> >  WERROR_FLAGS += -Wundef -Wwrite-strings
> >
> > +# There are many issues reported for ARMv7 architecture
> > +# which are not necessarily fatal. Report as warnings.
> > +ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
> > +WERROR_FLAGS += -Wno-error
> > +endif
> > +
> >  
> 
> Can we disable only "known" problems ?
> 
> Something like :
> WERROR_FLAGS += -Wno-error=cast-align
> 
> 

Sure! That's better idea, I always forgot about this possibilities in
GCC...

Jan

-- 
  Jan Viktorin                E-mail: Viktorin@RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture
  2015-10-28 17:32     ` Jan Viktorin
@ 2015-10-28 17:36       ` Richardson, Bruce
  0 siblings, 0 replies; 72+ messages in thread
From: Richardson, Bruce @ 2015-10-28 17:36 UTC (permalink / raw)
  To: Jan Viktorin, David Marchand; +Cc: dev, Vlastimil Kosar



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jan Viktorin
> Sent: Wednesday, October 28, 2015 5:32 PM
> To: David Marchand <david.marchand@6wind.com>
> Cc: dev@dpdk.org; Vlastimil Kosar <kosar@rehivetech.com>
> Subject: Re: [dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture
> 
> On Wed, 28 Oct 2015 14:34:40 +0100
> David Marchand <david.marchand@6wind.com> wrote:
> 
> > On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin
> > <viktorin@rehivetech.com>
> > wrote:
> >
> > > From: Vlastimil Kosar <kosar@rehivetech.com>
> > >
> > > Make DPDK run on ARMv7-A architecture. This patch assumes ARM
> > > Cortex-A9. However, it is known to be working on Cortex-A7 and
> > > Cortex-A15.
> > >
> > > Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
> > > Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> > > ---
> > > v1 -> v2:
> > > * the -mtune parameter of GCC is configurable now
> > > * the -mfpu=neon can be turned off
> > >
> > > Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> > > ---
> > >  config/defconfig_arm-armv7-a-linuxapp-gcc | 78
> > > +++++++++++++++++++++++++++++++
> > >  mk/arch/arm/rte.vars.mk                   | 39 ++++++++++++++++
> > >  mk/machine/armv7-a/rte.vars.mk            | 67
> ++++++++++++++++++++++++++
> > >  3 files changed, 184 insertions(+)
> > >  create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
> > >  create mode 100644 mk/arch/arm/rte.vars.mk  create mode 100644
> > > mk/machine/armv7-a/rte.vars.mk
> > >
> >
> > This patch comes too early in the patchset, I would put it once
> > compilation is fine (more comment to come, btw), so once all headers
> > are in place, not before.
> 
> Agree, this was done by watching the Power 8 patchset. But this seems
> quite logical.
> 
> >
> > Besides, do we really need this -a suffix ?
> 
> It is the full name of the ARM architecture - armv7, A profile. There are
> 3 profiles: A - application, R - real time, M - microcontroller.
> They differ in what MMU/MPU and other such things they support. But,
> finally, we can omit. The M profile is unsuitable for DPDK anyway. The R
> profile may be however used under certain circumstances (I believe, it can
> run Linux).
> 
If you do want to include the "a" maybe just drop the "-" before it, please. 
The "-" is used to separate the elements of the RTE_TARGET and so it doesn't look right.

/Bruce

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
  2015-10-28 14:54   ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture David Marchand
@ 2015-10-28 17:38     ` Jan Viktorin
  2015-10-28 17:58       ` David Marchand
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-28 17:38 UTC (permalink / raw)
  To: David Marchand; +Cc: dev

On Wed, 28 Oct 2015 15:54:47 +0100
David Marchand <david.marchand@6wind.com> wrote:

> Hello Jan,
> 
> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
> wrote:
> 
> > Hello DPDK community,
> >
> > this is the third attempt to post support for ARMv7 into the DPDK.
> > There are changes related to the LPM and ACL libraries only:
> >
> > * included rte_vect.h, however, it is more a placeholder
> > * rte_lpm.h was simplified due to the previous point
> > * ACL now compiles as we detect whether the compiler
> >   supports SSE 4.1
> >  
> 
> This patchset looks good to me (with the minor comments I sent).
> And armv8 support should fit quite well in this.
> 
> A last few things :
> - checkpatch is not happy with some patches, can you have a look at this ?

I will check this.

> - can you update the 2.2 release notes as part of this patchset to announce
> armv7 support ?

Yes, but where?

> - I am not really sure the acl et lpm fixes really belong to this patchset
> as a more larger cleanup is necessary to have all libraries compile fine on
> non-x86

So, you mean to omit those and disable them all? The LPM and ACL fixes
will be then included in 2.3?

> - since you introduce a new architecture, do you intend to run daily build
> checks and send reports to the test-report mailing list ?

I think, this is possible, if I automate it somehow. Do you mean to
test every individual patch? I have no tools for this (some ideas?). If
its just about git pull && test_script.sh, then it is quite OK.

I'd appreciate some help, ideas, advices, experiences in this area...

> 
> 
> Thanks.
> 

Jan

-- 
  Jan Viktorin                E-mail: Viktorin@RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
  2015-10-28 17:38     ` Jan Viktorin
@ 2015-10-28 17:58       ` David Marchand
  2015-10-29 14:02         ` Thomas Monjalon
  0 siblings, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 17:58 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

On Wed, Oct 28, 2015 at 6:38 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:

>
>
> > - can you update the 2.2 release notes as part of this patchset to
> announce
> > armv7 support ?
>
> Yes, but where?
>

I would say "New Features" in doc/guides/rel_notes/release_2_2.rst.


> > - I am not really sure the acl et lpm fixes really belong to this
> patchset
> > as a more larger cleanup is necessary to have all libraries compile fine
> on
> > non-x86
>
> So, you mean to omit those and disable them all? The LPM and ACL fixes
> will be then included in 2.3?
>

This sounds more sane to me, rather than workarounds only for arm.


> > - since you introduce a new architecture, do you intend to run daily
> build
> > checks and send reports to the test-report mailing list ?
>
> I think, this is possible, if I automate it somehow. Do you mean to
> test every individual patch? I have no tools for this (some ideas?). If
> its just about git pull && test_script.sh, then it is quite OK.
>
> I'd appreciate some help, ideas, advices, experiences in this area...
>

I am pretty sure Thomas has some ideas about this.


Thanks.

-- 
David Marchand

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 00/15] Support ARMv7 architecture
  2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
                     ` (17 preceding siblings ...)
  2015-10-28 14:54   ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture David Marchand
@ 2015-10-29 12:43   ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 01/15] eal/arm: atomic operations for ARM Jan Viktorin
                       ` (14 more replies)
  18 siblings, 15 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: dev

Hello DPDK community,

This is the 4th series of the ARMv7 patchset. I've cleaned up
most checkpatch errors:

* Whitespaces were fixed.

* The asm volatile syntax (checkpatch didn't like the named "%xyz"
  parameters when listed in the InputOperands list as "[xyz]").

* There are still few complaints (documented in each patch) but
  I consider them as unimportant (volatile, couple of line overlaps,
  new typedef). If it still can be done better, please report me
  your ideas.

Other changes:

* The "introduction patch" was moved to be almost the last patch as
  suggested by D. Marchand.
  
* I've added a note into the release_2_2.rst.

* The RTE_BITMAP_OPTIMIZATIONS was removed.

* The ARMv7 defconfig was renamed as suggested by B. Richardson.

* I've removed the LPM and ACL fixes from the patchset for now.
  The libraries table and pipeline are disabled as well.

* The igb_uio driver is disabled.

* The -Wno-error was restricted to cast-align only.

To be answered (from rte_atomic.h patch):

* use __atomic_* or __sync_*? (may affect the required GCC version)

Regards
Jan

---

You can pull the changes from

  https://github.com/RehiveTech/dpdk.git arm-support-v4

since commit 82fb702077f67585d64a07de0080e5cb6a924a72:

  ixgbe: support new flow director modes for X550 (2015-10-29 00:06:01 +0100)

up to 437c85fd6d9c5f3bdd2411fb9ddf703dc4cba5a5:

  maintainers: claim responsibility for ARMv7 (2015-10-29 13:33:49 +0100)

---

Jan Viktorin (7):
  eal/arm: implement rdtsc by PMU or clock_gettime
  eal/arm: use vector memcpy only when NEON is enabled
  eal/arm: detect arm architecture in cpu flags
  eal/arm: rwlock support for ARM
  eal/arm: add very incomplete rte_vect
  gcc/arm: avoid alignment errors to break build
  maintainers: claim responsibility for ARMv7

Vlastimil Kosar (8):
  eal/arm: atomic operations for ARM
  eal/arm: byte order operations for ARM
  eal/arm: cpu cycle operations for ARM
  eal/arm: prefetch operations for ARM
  eal/arm: spinlock operations for ARM (without HTM)
  eal/arm: vector memcpy for ARM
  eal/arm: cpu flag checks for ARM
  mk: Introduce ARMv7 architecture

 MAINTAINERS                                        |   4 +
 app/test/test_cpuflags.c                           |   5 +
 config/defconfig_arm-armv7a-linuxapp-gcc           |  74 +++++
 doc/guides/rel_notes/release_2_2.rst               |   5 +
 .../common/include/arch/arm/rte_atomic.h           | 256 ++++++++++++++++
 .../common/include/arch/arm/rte_byteorder.h        | 150 +++++++++
 .../common/include/arch/arm/rte_cpuflags.h         | 193 ++++++++++++
 .../common/include/arch/arm/rte_cycles.h           | 121 ++++++++
 .../common/include/arch/arm/rte_memcpy.h           | 334 +++++++++++++++++++++
 .../common/include/arch/arm/rte_prefetch.h         |  61 ++++
 .../common/include/arch/arm/rte_rwlock.h           |  40 +++
 .../common/include/arch/arm/rte_spinlock.h         | 114 +++++++
 lib/librte_eal/common/include/arch/arm/rte_vect.h  |  84 ++++++
 mk/arch/arm/rte.vars.mk                            |  39 +++
 mk/machine/armv7-a/rte.vars.mk                     |  67 +++++
 mk/rte.cpuflags.mk                                 |   6 +
 mk/toolchain/gcc/rte.vars.mk                       |   6 +
 17 files changed, 1559 insertions(+)
 create mode 100644 config/defconfig_arm-armv7a-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 01/15] eal/arm: atomic operations for ARM
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 02/15] eal/arm: byte order " Jan Viktorin
                       ` (13 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific atomic operation file
for ARM architecture. It utilizes compiler intrinsics only.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v1 -> v2:
* improve rte_wmb()
* use __atomic_* or __sync_*? (may affect the required GCC version)

v4:
* checkpatch complaints about volatile keyword (but seems to be OK to me)
* checkpatch complaints about do { ... } while (0) for single statement
  with asm volatile (but I didn't find a way how to write it without
  the checkpatch complaints)
* checkpatch is now happy with whitespaces
---
 .../common/include/arch/arm/rte_atomic.h           | 256 +++++++++++++++++++++
 1 file changed, 256 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
new file mode 100644
index 0000000..ea1e485
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -0,0 +1,256 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_ARM_H_
+#define _RTE_ATOMIC_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define	rte_mb()  __sync_synchronize()
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define	rte_wmb() do { asm volatile ("dmb st" : : : "memory"); } while (0)
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define	rte_rmb() __sync_synchronize()
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset(
+				(volatile uint64_t *)&v->cnt, tmp, 0);
+	}
+}
+
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		/* replace the value by itself */
+		success = rte_atomic64_cmpset(
+				(volatile uint64_t *) &v->cnt, tmp, tmp);
+	}
+	return tmp;
+}
+
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset(
+				(volatile uint64_t *)&v->cnt, tmp, new_value);
+	}
+}
+
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	__atomic_fetch_add(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	__atomic_fetch_sub(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	__atomic_fetch_add(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	__atomic_fetch_sub(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	return __atomic_add_fetch(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	return __atomic_sub_fetch(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	rte_atomic64_set(v, 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 02/15] eal/arm: byte order operations for ARM
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 01/15] eal/arm: atomic operations for ARM Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 03/15] eal/arm: cpu cycle " Jan Viktorin
                       ` (12 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific byte order operations
for ARM. The architecture supports both big and little endian.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v4: fix passing params to asm volatile for checkpatch
---
 .../common/include/arch/arm/rte_byteorder.h        | 150 +++++++++++++++++++++
 1 file changed, 150 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
new file mode 100644
index 0000000..5776997
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
@@ -0,0 +1,150 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_ARM_H_
+#define _RTE_BYTEORDER_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	register uint16_t x = _x;
+
+	asm volatile ("rev16 %0,%1"
+		      : "=r" (x)
+		      : "r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	register uint32_t x = _x;
+
+	asm volatile ("rev %0,%1"
+		      : "=r" (x)
+		      : "r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+	return  __builtin_bswap64(_x);
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+/* ARM architecture is bi-endian (both big and little). */
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#else /* RTE_BIG_ENDIAN */
+
+#define rte_cpu_to_le_16(x) rte_bswap16(x)
+#define rte_cpu_to_le_32(x) rte_bswap32(x)
+#define rte_cpu_to_le_64(x) rte_bswap64(x)
+
+#define rte_cpu_to_be_16(x) (x)
+#define rte_cpu_to_be_32(x) (x)
+#define rte_cpu_to_be_64(x) (x)
+
+#define rte_le_to_cpu_16(x) rte_bswap16(x)
+#define rte_le_to_cpu_32(x) rte_bswap32(x)
+#define rte_le_to_cpu_64(x) rte_bswap64(x)
+
+#define rte_be_to_cpu_16(x) (x)
+#define rte_be_to_cpu_32(x) (x)
+#define rte_be_to_cpu_64(x) (x)
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 03/15] eal/arm: cpu cycle operations for ARM
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 01/15] eal/arm: atomic operations for ARM Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 02/15] eal/arm: byte order " Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 04/15] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
                       ` (11 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev

From: Vlastimil Kosar <kosar@rehivetech.com>

ARM architecture doesn't have a suitable source of CPU cycles. This
patch uses clock_gettime instead. The implementation should be improved
in the future.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_cycles.h           | 85 ++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
new file mode 100644
index 0000000..ff66ae2
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -0,0 +1,85 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_ARM_H_
+#define _RTE_CYCLES_ARM_H_
+
+/* ARM v7 does not have suitable source of clock signals. The only clock counter
+   available in the core is 32 bit wide. Therefore it is unsuitable as the
+   counter overlaps every few seconds and probably is not accessible by
+   userspace programs. Therefore we use clock_gettime(CLOCK_MONOTONIC_RAW) to
+   simulate counter running at 1GHz.
+*/
+
+#include <time.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ *   The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	struct timespec val;
+	uint64_t v;
+
+	while (clock_gettime(CLOCK_MONOTONIC_RAW, &val) != 0)
+		/* no body */;
+
+	v  = (uint64_t) val.tv_sec * 1000000000LL;
+	v += (uint64_t) val.tv_nsec;
+	return v;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 04/15] eal/arm: implement rdtsc by PMU or clock_gettime
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (2 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 03/15] eal/arm: cpu cycle " Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 05/15] eal/arm: prefetch operations for ARM Jan Viktorin
                       ` (10 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: dev

Enable to choose a preferred way to read timer based on the
configuration entry CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU.
It requires a kernel module that is not included to work.

Based on the patch by David Hunt and Armuta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_cycles.h           | 38 +++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index ff66ae2..5dcef25 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -54,8 +54,14 @@ extern "C" {
  * @return
  *   The time base for this lcore.
  */
+#ifndef CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
+
+/**
+ * This call is easily portable to any ARM architecture, however,
+ * it may be damn slow and inprecise for some tasks.
+ */
 static inline uint64_t
-rte_rdtsc(void)
+__rte_rdtsc_syscall(void)
 {
 	struct timespec val;
 	uint64_t v;
@@ -67,6 +73,36 @@ rte_rdtsc(void)
 	v += (uint64_t) val.tv_nsec;
 	return v;
 }
+#define rte_rdtsc __rte_rdtsc_syscall
+
+#else
+
+/**
+ * This function requires to configure the PMCCNTR and enable
+ * userspace access to it:
+ *
+ *      asm volatile("mcr p15, 0, %0, c9, c14, 0" : : "r"(1));
+ *      asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(29));
+ *      asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(0x8000000f));
+ *
+ * which is possible only from the priviledged mode (kernel space).
+ */
+static inline uint64_t
+__rte_rdtsc_pmccntr(void)
+{
+	unsigned tsc;
+	uint64_t final_tsc;
+
+	/* Read PMCCNTR */
+	asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(tsc));
+	/* 1 tick = 64 clocks */
+	final_tsc = ((uint64_t)tsc) << 6;
+
+	return (uint64_t)final_tsc;
+}
+#define rte_rdtsc __rte_rdtsc_pmccntr
+
+#endif /* RTE_ARM_EAL_RDTSC_USE_PMU */
 
 static inline uint64_t
 rte_rdtsc_precise(void)
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 05/15] eal/arm: prefetch operations for ARM
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (3 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 04/15] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 06/15] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
                       ` (9 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific prefetch operations
for ARM architecture. It utilizes the pld instruction that
starts filling the appropriate cache line without blocking.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v4:
* checkpatch does not like the syntax of naming params
    to asm volatile; switched to %0, %1 syntax
* checkpatch complatins about volatile (seems to be OK for me)
---
 .../common/include/arch/arm/rte_prefetch.h         | 61 ++++++++++++++++++++++
 1 file changed, 61 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
new file mode 100644
index 0000000..62c3991
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -0,0 +1,61 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_H_
+#define _RTE_PREFETCH_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(const volatile void *p)
+{
+	asm volatile ("pld [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+	asm volatile ("pld [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+	asm volatile ("pld [%0]" : : "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 06/15] eal/arm: spinlock operations for ARM (without HTM)
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (4 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 05/15] eal/arm: prefetch operations for ARM Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 07/15] eal/arm: vector memcpy for ARM Jan Viktorin
                       ` (8 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds spinlock operations for ARM architecture.
We do not support HTM in spinlocks on ARM.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_spinlock.h         | 114 +++++++++++++++++++++
 1 file changed, 114 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
new file mode 100644
index 0000000..cd5ab8b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
@@ -0,0 +1,114 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_ARM_H_
+#define _RTE_SPINLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include "generic/rte_spinlock.h"
+
+/* Intrinsics are used to implement the spinlock on ARM architecture */
+
+#ifndef RTE_FORCE_INTRINSICS
+
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	while (__sync_lock_test_and_set(&sl->locked, 1))
+		while (sl->locked)
+			rte_pause();
+}
+
+static inline void
+rte_spinlock_unlock(rte_spinlock_t *sl)
+{
+	__sync_lock_release(&sl->locked);
+}
+
+static inline int
+rte_spinlock_trylock(rte_spinlock_t *sl)
+{
+	return (__sync_lock_test_and_set(&sl->locked, 1) == 0);
+}
+
+#endif
+
+static inline int rte_tm_supported(void)
+{
+	return 0;
+}
+
+static inline void
+rte_spinlock_lock_tm(rte_spinlock_t *sl)
+{
+	rte_spinlock_lock(sl); /* fall-back */
+}
+
+static inline int
+rte_spinlock_trylock_tm(rte_spinlock_t *sl)
+{
+	return rte_spinlock_trylock(sl);
+}
+
+static inline void
+rte_spinlock_unlock_tm(rte_spinlock_t *sl)
+{
+	rte_spinlock_unlock(sl);
+}
+
+static inline void
+rte_spinlock_recursive_lock_tm(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_recursive_lock(slr); /* fall-back */
+}
+
+static inline void
+rte_spinlock_recursive_unlock_tm(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_recursive_unlock(slr);
+}
+
+static inline int
+rte_spinlock_recursive_trylock_tm(rte_spinlock_recursive_t *slr)
+{
+	return rte_spinlock_recursive_trylock(slr);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 07/15] eal/arm: vector memcpy for ARM
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (5 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 06/15] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 08/15] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
                       ` (7 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev

From: Vlastimil Kosar <kosar@rehivetech.com>

The SSE based memory copy in DPDK only support x86. This patch
adds ARM NEON based memory copy functions for ARM architecture.

The implementation improves memory copy of short or well aligned
data buffers. The following measurements show improvements over
the libc memcpy on Cortex CPUs.

               by X % faster
Length (B)   a15    a7     a9
   1         4.9  15.2    3.2
   7        56.9  48.2   40.3
   8        37.3  39.8   29.6
   9        69.3  38.7   33.9
  15        60.8  35.3   23.7
  16        50.6  35.9   35.0
  17        57.7  35.7   31.1
  31        16.0  23.3    9.0
  32        65.9  13.5   21.4
  33         3.9  10.3   -3.7
  63         2.0  12.9   -2.0
  64        66.5   0.0   16.5
  65         2.7   7.6  -35.6
 127         0.1   4.5  -18.9
 128        66.2   1.5  -51.4
 129        -0.8   3.2  -35.8
 255        -3.1  -0.9  -69.1
 256        67.9   1.2    7.2
 257        -3.6  -1.9  -36.9
 320        67.7   1.4    0.0
 384        66.8   1.4  -14.2
 511       -44.9  -2.3  -41.9
 512        67.3   1.4   -6.8
 513       -41.7  -3.0  -36.2
1023       -82.4  -2.8  -41.2
1024        68.3   1.4  -11.6
1025       -80.1  -3.3  -38.1
1518       -47.3  -5.0  -38.3
1522       -48.3  -6.0  -37.9
1600        65.4   1.3  -27.3
2048        59.5   1.5  -10.9
3072        52.3   1.5  -12.2
4096        45.3   1.4  -12.5
5120        40.6   1.5  -14.5
6144        35.4   1.4  -13.4
7168        32.9   1.4  -13.9
8192        28.2   1.4  -15.1

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v4:
* fix whitespace issues reported by checkpatch
* fix passing params to asm volatile for checkpatch
---
 .../common/include/arch/arm/rte_memcpy.h           | 279 +++++++++++++++++++++
 1 file changed, 279 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
new file mode 100644
index 0000000..3662b81
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -0,0 +1,279 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_ARM_H_
+#define _RTE_MEMCPY_ARM_H_
+
+#include <stdint.h>
+#include <string.h>
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	vst1q_u8(dst, vld1q_u8(src));
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile (
+		"vld1.8 {d0-d3}, [%0]\n\t"
+		"vst1.8 {d0-d3}, [%1]\n\t"
+		: "+r" (src), "+r" (dst)
+		: : "memory", "d0", "d1", "d2", "d3");
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile (
+		"vld1.8 {d0-d3}, [%0]!\n\t"
+		"vld1.8 {d4-d5}, [%0]\n\t"
+		"vst1.8 {d0-d3}, [%1]!\n\t"
+		"vst1.8 {d4-d5}, [%1]\n\t"
+		: "+r" (src), "+r" (dst)
+		:
+		: "memory", "d0", "d1", "d2", "d3", "d4", "d5");
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile (
+		"vld1.8 {d0-d3}, [%0]!\n\t"
+		"vld1.8 {d4-d7}, [%0]\n\t"
+		"vst1.8 {d0-d3}, [%1]!\n\t"
+		"vst1.8 {d4-d7}, [%1]\n\t"
+		: "+r" (src), "+r" (dst)
+		:
+		: "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7");
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("pld [%0, #64]" : : "r" (src));
+	asm volatile (
+		"vld1.8 {d0-d3},   [%0]!\n\t"
+		"vld1.8 {d4-d7},   [%0]!\n\t"
+		"vld1.8 {d8-d11},  [%0]!\n\t"
+		"vld1.8 {d12-d15}, [%0]\n\t"
+		"vst1.8 {d0-d3},   [%1]!\n\t"
+		"vst1.8 {d4-d7},   [%1]!\n\t"
+		"vst1.8 {d8-d11},  [%1]!\n\t"
+		"vst1.8 {d12-d15}, [%1]\n\t"
+		: "+r" (src), "+r" (dst)
+		:
+		: "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+		"d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("pld [%0,  #64]" : : "r" (src));
+	asm volatile ("pld [%0, #128]" : : "r" (src));
+	asm volatile ("pld [%0, #192]" : : "r" (src));
+	asm volatile ("pld [%0, #256]" : : "r" (src));
+	asm volatile ("pld [%0, #320]" : : "r" (src));
+	asm volatile ("pld [%0, #384]" : : "r" (src));
+	asm volatile ("pld [%0, #448]" : : "r" (src));
+	asm volatile (
+		"vld1.8 {d0-d3},   [%0]!\n\t"
+		"vld1.8 {d4-d7},   [%0]!\n\t"
+		"vld1.8 {d8-d11},  [%0]!\n\t"
+		"vld1.8 {d12-d15}, [%0]!\n\t"
+		"vld1.8 {d16-d19}, [%0]!\n\t"
+		"vld1.8 {d20-d23}, [%0]!\n\t"
+		"vld1.8 {d24-d27}, [%0]!\n\t"
+		"vld1.8 {d28-d31}, [%0]\n\t"
+		"vst1.8 {d0-d3},   [%1]!\n\t"
+		"vst1.8 {d4-d7},   [%1]!\n\t"
+		"vst1.8 {d8-d11},  [%1]!\n\t"
+		"vst1.8 {d12-d15}, [%1]!\n\t"
+		"vst1.8 {d16-d19}, [%1]!\n\t"
+		"vst1.8 {d20-d23}, [%1]!\n\t"
+		"vst1.8 {d24-d27}, [%1]!\n\t"
+		"vst1.8 {d28-d31}, [%1]!\n\t"
+		: "+r" (src), "+r" (dst)
+		:
+		: "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+		"d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15",
+		"d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23",
+		"d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31");
+}
+
+#define rte_memcpy(dst, src, n)              \
+	({ (__builtin_constant_p(n)) ?       \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)); })
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			/* ARMv7 can not handle unaligned access to long long
+			 * (uint64_t). Therefore two uint32_t operations are
+			 * used.
+			 */
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+			*(uint32_t *)dst = *(const uint32_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n,
+			(const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n,
+			(const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		break;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		break;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0)
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 08/15] eal/arm: use vector memcpy only when NEON is enabled
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (6 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 07/15] eal/arm: vector memcpy for ARM Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 09/15] eal/arm: cpu flag checks for ARM Jan Viktorin
                       ` (6 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: dev

The GCC can be configured to avoid using NEON extensions.
For that purpose, we provide just the memcpy implementation
of the rte_memcpy.

Based on the patch by David Hunt and Armuta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_memcpy.h           | 59 +++++++++++++++++++++-
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index 3662b81..f41648a 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -35,8 +35,6 @@
 
 #include <stdint.h>
 #include <string.h>
-/* ARM NEON Intrinsics are used to copy data */
-#include <arm_neon.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -44,6 +42,11 @@ extern "C" {
 
 #include "generic/rte_memcpy.h"
 
+#ifdef __ARM_NEON_FP
+
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
 static inline void
 rte_mov16(uint8_t *dst, const uint8_t *src)
 {
@@ -272,6 +275,58 @@ rte_memcpy_func(void *dst, const void *src, size_t n)
 	return ret;
 }
 
+#else
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 256);
+}
+
+static inline void *
+rte_memcpy(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+#endif /* __ARM_NEON_FP */
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 09/15] eal/arm: cpu flag checks for ARM
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (7 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 08/15] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 10/15] eal/arm: detect arm architecture in cpu flags Jan Viktorin
                       ` (5 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev

From: Vlastimil Kosar <kosar@rehivetech.com>

This implementation is based on IBM POWER version of
rte_cpuflags. We use software emulation of HW capability
registers, because those are usually not directly accessible
from userspace on ARM.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 app/test/test_cpuflags.c                           |   5 +
 .../common/include/arch/arm/rte_cpuflags.h         | 177 +++++++++++++++++++++
 mk/rte.cpuflags.mk                                 |   6 +
 3 files changed, 188 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 5b92061..557458f 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -115,6 +115,11 @@ test_cpuflags(void)
 	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
 #endif
 
+#if defined(RTE_ARCH_ARM)
+	printf("Check for NEON:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+#endif
+
 #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 	printf("Check for SSE:\t\t");
 	CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
new file mode 100644
index 0000000..1eadb33
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -0,0 +1,177 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_ARM_H_
+#define _RTE_CPUFLAGS_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <elf.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+
+#include "generic/rte_cpuflags.h"
+
+#ifndef AT_HWCAP
+#define AT_HWCAP 16
+#endif
+
+#ifndef AT_HWCAP2
+#define AT_HWCAP2 26
+#endif
+
+/* software based registers */
+enum cpu_register_t {
+	REG_HWCAP = 0,
+	REG_HWCAP2,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+	RTE_CPUFLAG_SWP = 0,
+	RTE_CPUFLAG_HALF,
+	RTE_CPUFLAG_THUMB,
+	RTE_CPUFLAG_A26BIT,
+	RTE_CPUFLAG_FAST_MULT,
+	RTE_CPUFLAG_FPA,
+	RTE_CPUFLAG_VFP,
+	RTE_CPUFLAG_EDSP,
+	RTE_CPUFLAG_JAVA,
+	RTE_CPUFLAG_IWMMXT,
+	RTE_CPUFLAG_CRUNCH,
+	RTE_CPUFLAG_THUMBEE,
+	RTE_CPUFLAG_NEON,
+	RTE_CPUFLAG_VFPv3,
+	RTE_CPUFLAG_VFPv3D16,
+	RTE_CPUFLAG_TLS,
+	RTE_CPUFLAG_VFPv4,
+	RTE_CPUFLAG_IDIVA,
+	RTE_CPUFLAG_IDIVT,
+	RTE_CPUFLAG_VFPD32,
+	RTE_CPUFLAG_LPAE,
+	RTE_CPUFLAG_EVTSTRM,
+	RTE_CPUFLAG_AES,
+	RTE_CPUFLAG_PMULL,
+	RTE_CPUFLAG_SHA1,
+	RTE_CPUFLAG_SHA2,
+	RTE_CPUFLAG_CRC32,
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(SWP,       0x00000001, 0, REG_HWCAP,  0)
+	FEAT_DEF(HALF,      0x00000001, 0, REG_HWCAP,  1)
+	FEAT_DEF(THUMB,     0x00000001, 0, REG_HWCAP,  2)
+	FEAT_DEF(A26BIT,    0x00000001, 0, REG_HWCAP,  3)
+	FEAT_DEF(FAST_MULT, 0x00000001, 0, REG_HWCAP,  4)
+	FEAT_DEF(FPA,       0x00000001, 0, REG_HWCAP,  5)
+	FEAT_DEF(VFP,       0x00000001, 0, REG_HWCAP,  6)
+	FEAT_DEF(EDSP,      0x00000001, 0, REG_HWCAP,  7)
+	FEAT_DEF(JAVA,      0x00000001, 0, REG_HWCAP,  8)
+	FEAT_DEF(IWMMXT,    0x00000001, 0, REG_HWCAP,  9)
+	FEAT_DEF(CRUNCH,    0x00000001, 0, REG_HWCAP,  10)
+	FEAT_DEF(THUMBEE,   0x00000001, 0, REG_HWCAP,  11)
+	FEAT_DEF(NEON,      0x00000001, 0, REG_HWCAP,  12)
+	FEAT_DEF(VFPv3,     0x00000001, 0, REG_HWCAP,  13)
+	FEAT_DEF(VFPv3D16,  0x00000001, 0, REG_HWCAP,  14)
+	FEAT_DEF(TLS,       0x00000001, 0, REG_HWCAP,  15)
+	FEAT_DEF(VFPv4,     0x00000001, 0, REG_HWCAP,  16)
+	FEAT_DEF(IDIVA,     0x00000001, 0, REG_HWCAP,  17)
+	FEAT_DEF(IDIVT,     0x00000001, 0, REG_HWCAP,  18)
+	FEAT_DEF(VFPD32,    0x00000001, 0, REG_HWCAP,  19)
+	FEAT_DEF(LPAE,      0x00000001, 0, REG_HWCAP,  20)
+	FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  21)
+	FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP2,  0)
+	FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP2,  1)
+	FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP2,  2)
+	FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP2,  3)
+	FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
+};
+
+/*
+ * Read AUXV software register and get cpu features for ARM
+ */
+static inline void
+rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
+	__attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
+{
+	int auxv_fd;
+	Elf32_auxv_t auxv;
+
+	auxv_fd = open("/proc/self/auxv", O_RDONLY);
+	assert(auxv_fd);
+	while (read(auxv_fd, &auxv,
+		sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+		if (auxv.a_type == AT_HWCAP)
+			out[REG_HWCAP] = auxv.a_un.a_val;
+		else if (auxv.a_type == AT_HWCAP2)
+			out[REG_HWCAP2] = auxv.a_un.a_val;
+	}
+}
+
+/*
+ * Checks if a particular flag is available on current machine.
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs = {0};
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_ARM_H_ */
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index f595cd0..bec7bdd 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -106,6 +106,12 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
 CPUFLAGS += VSX
 endif
 
+# ARM flags
+ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)
+CPUFLAGS += NEON
+endif
+
+
 MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
 
 # To strip whitespace
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 10/15] eal/arm: detect arm architecture in cpu flags
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (8 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 09/15] eal/arm: cpu flag checks for ARM Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 11/15] eal/arm: rwlock support for ARM Jan Viktorin
                       ` (4 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: dev

Based on the patch by David Hunt and Armuta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
v2 -> v3: fixed forgotten include of string.h
v4: checkpatch reports few characters over 80 for checking aarch64
---
 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 1eadb33..7ce9d14 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -41,6 +41,7 @@ extern "C" {
 #include <fcntl.h>
 #include <assert.h>
 #include <unistd.h>
+#include <string.h>
 
 #include "generic/rte_cpuflags.h"
 
@@ -52,10 +53,15 @@ extern "C" {
 #define AT_HWCAP2 26
 #endif
 
+#ifndef AT_PLATFORM
+#define AT_PLATFORM 15
+#endif
+
 /* software based registers */
 enum cpu_register_t {
 	REG_HWCAP = 0,
 	REG_HWCAP2,
+	REG_PLATFORM,
 };
 
 /**
@@ -89,6 +95,8 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_SHA1,
 	RTE_CPUFLAG_SHA2,
 	RTE_CPUFLAG_CRC32,
+	RTE_CPUFLAG_AARCH32,
+	RTE_CPUFLAG_AARCH64,
 	/* The last item */
 	RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
 };
@@ -121,6 +129,8 @@ static const struct feature_entry cpu_feature_table[] = {
 	FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP2,  2)
 	FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP2,  3)
 	FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
+	FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
+	FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
 };
 
 /*
@@ -141,6 +151,12 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
 			out[REG_HWCAP] = auxv.a_un.a_val;
 		else if (auxv.a_type == AT_HWCAP2)
 			out[REG_HWCAP2] = auxv.a_un.a_val;
+		else if (auxv.a_type == AT_PLATFORM) {
+			if (!strcmp((const char *)auxv.a_un.a_val, "aarch32"))
+				out[REG_PLATFORM] = 0x0001;
+			else if (!strcmp((const char *)auxv.a_un.a_val, "aarch64"))
+				out[REG_PLATFORM] = 0x0002;
+		}
 	}
 }
 
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 11/15] eal/arm: rwlock support for ARM
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (9 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 10/15] eal/arm: detect arm architecture in cpu flags Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 12/15] eal/arm: add very incomplete rte_vect Jan Viktorin
                       ` (3 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: dev

Just a copy from PPC.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_rwlock.h           | 40 ++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_rwlock.h b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
new file mode 100644
index 0000000..664bec8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
@@ -0,0 +1,40 @@
+/* copied from ppc_64 */
+
+#ifndef _RTE_RWLOCK_ARM_H_
+#define _RTE_RWLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_rwlock.h"
+
+static inline void
+rte_rwlock_read_lock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_read_lock(rwl);
+}
+
+static inline void
+rte_rwlock_read_unlock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_read_unlock(rwl);
+}
+
+static inline void
+rte_rwlock_write_lock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_write_lock(rwl);
+}
+
+static inline void
+rte_rwlock_write_unlock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_write_unlock(rwl);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RWLOCK_ARM_H_ */
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 12/15] eal/arm: add very incomplete rte_vect
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (10 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 11/15] eal/arm: rwlock support for ARM Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 13/15] gcc/arm: avoid alignment errors to break build Jan Viktorin
                       ` (2 subsequent siblings)
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: dev

This patch does not map x86 SIMD operations to the ARM ones.
It just fills the necessary gap between the platforms to enable
compilation of libraries LPM (includes rte_vect.h, lpm_test needs
those SIMD functions) and ACL (includes rte_vect.h).

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v4: checkpatch reports warning for the new typedef
---
 lib/librte_eal/common/include/arch/arm/rte_vect.h | 84 +++++++++++++++++++++++
 1 file changed, 84 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
new file mode 100644
index 0000000..7d5de97
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
@@ -0,0 +1,84 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_VECT_ARM_H_
+#define _RTE_VECT_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define XMM_SIZE 16
+#define XMM_MASK (XMM_MASK - 1)
+
+typedef struct {
+	union uint128 {
+		uint8_t uint8[16];
+		uint32_t uint32[4];
+	} val;
+} __m128i;
+
+static inline __m128i
+_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
+{
+	__m128i res;
+
+	res.val.uint32[0] = v0;
+	res.val.uint32[1] = v1;
+	res.val.uint32[2] = v2;
+	res.val.uint32[3] = v3;
+	return res;
+}
+
+static inline __m128i
+_mm_loadu_si128(__m128i *v)
+{
+	__m128i res;
+
+	res = *v;
+	return res;
+}
+
+static inline __m128i
+_mm_load_si128(__m128i *v)
+{
+	__m128i res;
+
+	res = *v;
+	return res;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 13/15] gcc/arm: avoid alignment errors to break build
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (11 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 12/15] eal/arm: add very incomplete rte_vect Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 14/15] mk: Introduce ARMv7 architecture Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 15/15] maintainers: claim responsibility for ARMv7 Jan Viktorin
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: dev, Vlastimil Kosar

There several issues with alignment when compiling for ARMv7.
They are not considered to be fatal (ARMv7 supports unaligned
access of 32b words), so we just leave them as warnings. They
should be solved later, however.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
---
v4: restrict -Wno-error to the cast-align only
---
 mk/toolchain/gcc/rte.vars.mk | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
index 0f51c66..c2c5255 100644
--- a/mk/toolchain/gcc/rte.vars.mk
+++ b/mk/toolchain/gcc/rte.vars.mk
@@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual
 WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
 WERROR_FLAGS += -Wundef -Wwrite-strings
 
+# There are many issues reported for ARMv7 architecture
+# which are not necessarily fatal. Report as warnings.
+ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
+WERROR_FLAGS += -Wno-error=cast-align
+endif
+
 # process cpu flags
 include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk
 
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 14/15] mk: Introduce ARMv7 architecture
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (12 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 13/15] gcc/arm: avoid alignment errors to break build Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 15/15] maintainers: claim responsibility for ARMv7 Jan Viktorin
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev

From: Vlastimil Kosar <kosar@rehivetech.com>

Make DPDK run on ARMv7-A architecture. This patch assumes
ARM Cortex-A9. However, it is known to be working on Cortex-A7
and Cortex-A15.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v2:
* the -mtune parameter of GCC is configurable now
* the -mfpu=neon can be turned off

v3: XMM_SIZE is defined in rte_vect.h in a following patch

v4:
* update release notes for 2.2
* get rid of CONFIG_RTE_BITMAP_OPTIMIZATIONS=0 setting
* rename arm defconfig: "armv7-a" -> "arvm7a"
* disable pipeline and table modules unless lpm is fixed
---
 config/defconfig_arm-armv7a-linuxapp-gcc | 74 ++++++++++++++++++++++++++++++++
 doc/guides/rel_notes/release_2_2.rst     |  5 +++
 mk/arch/arm/rte.vars.mk                  | 39 +++++++++++++++++
 mk/machine/armv7-a/rte.vars.mk           | 67 +++++++++++++++++++++++++++++
 4 files changed, 185 insertions(+)
 create mode 100644 config/defconfig_arm-armv7a-linuxapp-gcc
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
new file mode 100644
index 0000000..d623222
--- /dev/null
+++ b/config/defconfig_arm-armv7a-linuxapp-gcc
@@ -0,0 +1,74 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All right reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv7-a"
+
+CONFIG_RTE_ARCH="arm"
+CONFIG_RTE_ARCH_ARM=y
+CONFIG_RTE_ARCH_ARMv7=y
+CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9"
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+# ARM doesn't have support for vmware TSC map
+CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
+
+# KNI is not supported on 32-bit
+CONFIG_RTE_LIBRTE_KNI=n
+
+# PCI is usually not used on ARM
+CONFIG_RTE_EAL_IGB_UIO=n
+
+# fails to compile on ARM
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_TABLE=n
+CONFIG_RTE_LIBRTE_PIPELINE=n
+
+# cannot use those on ARM
+CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_EM_PMD=n
+CONFIG_RTE_LIBRTE_IGB_PMD=n
+CONFIG_RTE_LIBRTE_CXGBE_PMD=n
+CONFIG_RTE_LIBRTE_E1000_PMD=n
+CONFIG_RTE_LIBRTE_ENIC_PMD=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+CONFIG_RTE_LIBRTE_IXGBE_PMD=n
+CONFIG_RTE_LIBRTE_MLX4_PMD=n
+CONFIG_RTE_LIBRTE_MPIPE_PMD=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
+CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
+CONFIG_RTE_LIBRTE_PMD_BNX2X=n
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index be6f827..43a3a3c 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -23,6 +23,11 @@ New Features
 
 * **Added vhost-user multiple queue support.**
 
+* **Introduce ARMv7 architecture**
+
+  It is now possible to build DPDK for the ARMv7 platform and test with
+  virtual PMD drivers.
+
 
 Resolved Issues
 ---------------
diff --git a/mk/arch/arm/rte.vars.mk b/mk/arch/arm/rte.vars.mk
new file mode 100644
index 0000000..df0c043
--- /dev/null
+++ b/mk/arch/arm/rte.vars.mk
@@ -0,0 +1,39 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+ARCH  ?= arm
+CROSS ?=
+
+CPU_CFLAGS  ?= -marm -DRTE_CACHE_LINE_SIZE=64 -munaligned-access
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv7-a/rte.vars.mk b/mk/machine/armv7-a/rte.vars.mk
new file mode 100644
index 0000000..48d3979
--- /dev/null
+++ b/mk/machine/armv7-a/rte.vars.mk
@@ -0,0 +1,67 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+CPU_CFLAGS += -mfloat-abi=softfp
+
+MACHINE_CFLAGS += -march=armv7-a
+
+ifdef CONFIG_RTE_ARCH_ARM_TUNE
+MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
+endif
+
+ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
+MACHINE_CFLAGS += -mfpu=neon
+endif
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [dpdk-dev] [PATCH v4 15/15] maintainers: claim responsibility for ARMv7
  2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
                       ` (13 preceding siblings ...)
  2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 14/15] mk: Introduce ARMv7 architecture Jan Viktorin
@ 2015-10-29 12:43     ` Jan Viktorin
  14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
  To: David Hunt, David Marchand; +Cc: dev

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 MAINTAINERS | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 080a8e8..a8933eb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -124,6 +124,10 @@ IBM POWER
 M: Chao Zhu <chaozhu@linux.vnet.ibm.com>
 F: lib/librte_eal/common/include/arch/ppc_64/
 
+ARM v7
+M: Jan Viktorin <viktorin@rehivetech.com>
+F: lib/librte_eal/common/include/arch/arm/
+
 Intel x86
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: Konstantin Ananyev <konstantin.ananyev@intel.com>
-- 
2.6.1

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
  2015-10-28 17:58       ` David Marchand
@ 2015-10-29 14:02         ` Thomas Monjalon
  2015-10-29 14:09           ` Jan Viktorin
  0 siblings, 1 reply; 72+ messages in thread
From: Thomas Monjalon @ 2015-10-29 14:02 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

2015-10-28 18:58, David Marchand:
> > > - since you introduce a new architecture, do you intend to run daily
> > > build checks and send reports to the test-report mailing list ?
> >
> > I think, this is possible, if I automate it somehow. Do you mean to
> > test every individual patch? I have no tools for this (some ideas?). If
> > its just about git pull && test_script.sh, then it is quite OK.
> >
> > I'd appreciate some help, ideas, advices, experiences in this area...
> 
> I am pretty sure Thomas has some ideas about this.

In order to make sure new commits won't introduce a regression on ARM,
we need to run some tests before accepting the patch.
If those tests are not run, some daily tests could catch the recent
regressions.
The most basic test is the compilation, then there are some unit tests
and DTS.

Are you OK to start with a basic daily compilation test? It can be achieved
with a simple crontab job.
If you or someone else have some time and a machine to do more, it would
be great. I'm sure the DPDK/ARM will gain enough interest to attract some
volunteers for more automatic tests.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
  2015-10-29 14:02         ` Thomas Monjalon
@ 2015-10-29 14:09           ` Jan Viktorin
  2015-10-29 15:02             ` Thomas Monjalon
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 14:09 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Thu, 29 Oct 2015 15:02:03 +0100
Thomas Monjalon <thomas.monjalon@6wind.com> wrote:

> 2015-10-28 18:58, David Marchand:
> > > > - since you introduce a new architecture, do you intend to run daily
> > > > build checks and send reports to the test-report mailing list ?  
> > >
> > > I think, this is possible, if I automate it somehow. Do you mean to
> > > test every individual patch? I have no tools for this (some ideas?). If
> > > its just about git pull && test_script.sh, then it is quite OK.
> > >
> > > I'd appreciate some help, ideas, advices, experiences in this area...  
> > 
> > I am pretty sure Thomas has some ideas about this.  
> 
> In order to make sure new commits won't introduce a regression on ARM,
> we need to run some tests before accepting the patch.
> If those tests are not run, some daily tests could catch the recent
> regressions.

I understand the purpose.

> The most basic test is the compilation, then there are some unit tests
> and DTS.

The unit tests are a problem at the moment. Probably, I will later find
a way, how to perform some unit tests in QEMU.

> 
> Are you OK to start with a basic daily compilation test? It can be achieved
> with a simple crontab job.

I can do daily build test in Jenkins or a cron job.

Is it sufficient to check the master branch? Or you mean to pick patches from
patchwork? That seems quite complicated to me.

> If you or someone else have some time and a machine to do more, it would
> be great. I'm sure the DPDK/ARM will gain enough interest to attract some
> volunteers for more automatic tests.
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
  2015-10-29 14:09           ` Jan Viktorin
@ 2015-10-29 15:02             ` Thomas Monjalon
  0 siblings, 0 replies; 72+ messages in thread
From: Thomas Monjalon @ 2015-10-29 15:02 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

2015-10-29 15:09, Jan Viktorin:
> On Thu, 29 Oct 2015 15:02:03 +0100
> Thomas Monjalon <thomas.monjalon@6wind.com> wrote:
> > The most basic test is the compilation, then there are some unit tests
> > and DTS.
> 
> The unit tests are a problem at the moment. Probably, I will later find
> a way, how to perform some unit tests in QEMU.
> 
> > Are you OK to start with a basic daily compilation test? It can be achieved
> > with a simple crontab job.
> 
> I can do daily build test in Jenkins or a cron job.
> 
> Is it sufficient to check the master branch? Or you mean to pick patches from
> patchwork? That seems quite complicated to me.

Yes do what you can. A daily build test of the master branch is the minimum
and may be enough to start.
As I said below, we may hope having some contributions in test coverage.

> > If you or someone else have some time and a machine to do more, it would
> > be great. I'm sure the DPDK/ARM will gain enough interest to attract some
> > volunteers for more automatic tests.

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2015-10-29 15:04 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-26 16:37 [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 01/16] mk: Introduce " Jan Viktorin
2015-10-28 13:34   ` David Marchand
2015-10-28 17:32     ` Jan Viktorin
2015-10-28 17:36       ` Richardson, Bruce
2015-10-28 13:39   ` David Marchand
2015-10-28 17:32     ` Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 02/16] eal/arm: atomic operations for ARM Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 03/16] eal/arm: byte order " Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 04/16] eal/arm: cpu cycle " Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 05/16] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 06/16] eal/arm: prefetch operations for ARM Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 07/16] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 08/16] eal/arm: vector memcpy for ARM Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 09/16] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 10/16] eal/arm: cpu flag checks for ARM Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 11/16] eal/arm: detect arm architecture in cpu flags Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 12/16] eal/arm: rwlock support for ARM Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 13/16] gcc/arm: avoid alignment errors to break build Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 14/16] maintainers: claim responsibility for ARMv7 Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86 Jan Viktorin
2015-10-27 15:31   ` Ananyev, Konstantin
2015-10-27 15:38     ` Jan Viktorin
2015-10-26 16:37 ` [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support Jan Viktorin
2015-10-27 15:55   ` Ananyev, Konstantin
2015-10-27 17:10     ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 01/17] mk: Introduce " Jan Viktorin
2015-10-28 10:09     ` David Marchand
2015-10-28 10:56       ` Jan Viktorin
2015-10-28 13:40         ` David Marchand
2015-10-28 13:44         ` Hunt, David
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 02/17] eal/arm: atomic operations for ARM Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 03/17] eal/arm: byte order " Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 04/17] eal/arm: cpu cycle " Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 05/17] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 06/17] eal/arm: prefetch operations for ARM Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 07/17] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 08/17] eal/arm: vector memcpy for ARM Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 09/17] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 10/17] eal/arm: cpu flag checks for ARM Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 11/17] eal/arm: detect arm architecture in cpu flags Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 12/17] eal/arm: rwlock support for ARM Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build Jan Viktorin
2015-10-28 12:16     ` David Marchand
2015-10-28 17:34       ` Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 14/17] maintainers: claim responsibility for ARMv7 Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 15/17] eal/arm: add very incomplete rte_vect Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 16/17] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for non-x86 Jan Viktorin
2015-10-27 19:13   ` [dpdk-dev] [PATCH v3 17/17] acl: handle when SSE 4.1 is unsupported Jan Viktorin
2015-10-28 14:54   ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture David Marchand
2015-10-28 17:38     ` Jan Viktorin
2015-10-28 17:58       ` David Marchand
2015-10-29 14:02         ` Thomas Monjalon
2015-10-29 14:09           ` Jan Viktorin
2015-10-29 15:02             ` Thomas Monjalon
2015-10-29 12:43   ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 01/15] eal/arm: atomic operations for ARM Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 02/15] eal/arm: byte order " Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 03/15] eal/arm: cpu cycle " Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 04/15] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 05/15] eal/arm: prefetch operations for ARM Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 06/15] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 07/15] eal/arm: vector memcpy for ARM Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 08/15] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 09/15] eal/arm: cpu flag checks for ARM Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 10/15] eal/arm: detect arm architecture in cpu flags Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 11/15] eal/arm: rwlock support for ARM Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 12/15] eal/arm: add very incomplete rte_vect Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 13/15] gcc/arm: avoid alignment errors to break build Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 14/15] mk: Introduce ARMv7 architecture Jan Viktorin
2015-10-29 12:43     ` [dpdk-dev] [PATCH v4 15/15] maintainers: claim responsibility for ARMv7 Jan Viktorin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).