DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7)
@ 2015-10-03  8:58 Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 01/12] mk: Introduce ARMv7 architecture Jan Viktorin
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Jan Viktorin

Dear DPDK community,

I am proposing a patch series with support of the ARMv7 architecture
for DPDK. The patch series does not introduce any PMD driver. It is
possible to compile it, boot it and test it with some virtual PMD (eg.
pcap). It is rebased on top of v2.1.0.

All but the last two patches (11, 12) are quite staightforward
and usually based on the ppc_64 architecture. Notes:

* we test on Cortex-A9 (mostly Xilinx Zynq at the moment)
* atomic operations and spinlocks are implemented by (GCC) intrinsics
* cpu cycle is implemented by clock_gettime because there is no
  standard 64-bit counter available
* we have to set -Wno-error to pass the build process because there are
  quite a lot of alignment problems reported (we didn't find any real issues
  so far)

The last two patches (11, 12) are not to be merged into mainline. They
are just a temporary workaround for the two libraries (ACL, LPM) which
heavily utilizes the SSE... It is not possible to easily convert the
SSE calls to the NEON SIMD operations.

============

It is important to note that the current Linux Kernel does not contain
the support for huge tables for non-LPAE ARM architectures (Cortex-A9).
There is a patch available on the Internet but it is not going to be
merged for now (4/2014):

 http://thread.gmane.org/gmane.linux.kernel.mm/115788

We ported this patch to 3.18 and it can improve the performance. Here
follow results for our tests of several algorithms showing the execution
time reduction:

CPU median 3x3        -  0.2 %
NEON median 3x3       - 19.5 %
Random read           -  0.0 %
Random write          -  6.2 %
Matrix multiplication - 31.0 %
NEON copy             -  4.2 %

============

We are working on the PMD + kernel-support part. At the moment, we have
a working PMD for Xilinx Zynq's EMAC. However, it uses some dirty features.
We have to rethink it a bit before going to the mainline. We are facing some
problems during the implementation (some are already being solved in the
mailing-list):

* rte_eth_dev is defined as a PCI device. As ARMs are SoCs with integrated
  EMAC on the chip and an external phyter, we need a different approach.
  There can be an ARM computer with PCI-E but then you put there a network
  card and use a different kind of driver (but this is not very common
  at the moment).
* ARM does not have coherent memory for DMA transfers. It is possible to
  allocate non-cachable memory (DMA transfers can be as fast as possible)
  but it slows down the payload processing on CPU. For this purpose, we
  have to call dma_map/unmap_* in kernel. A custom kernel driver is needed
  and it should not be the UIO because it is quite limited (almost
  non-extendable mmap, no support for custom ioctl and write).
* We are not going to put the PHY layer into userspace, so it will stay
  in the kernel. There is also a need for the CLK control (clock gating)
  in the PMD.

Regards
Jan Viktorin


Jan Viktorin (2):
  eal/arm: rwlock support for ARM
  gcc/arm: avoid alignment errors to break build

Vlastimil Kosar (10):
  mk: Introduce ARMv7 architecture
  eal/arm: atomic operations for ARM
  eal/arm: byte order operations for ARM
  eal/arm: cpu cycle operations for ARM
  eal/arm: prefetch operations for ARM
  eal/arm: spinlock operations for ARM (without HTM)
  eal/arm: vector memcpy for ARM
  eal/arm: cpu flag checks for ARM
  lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on
    for-x86
  arm: Disable usage of SSE optimized code in librte_acl

 app/test/test_cpuflags.c                           |   5 +
 config/defconfig_arm-armv7-a-linuxapp-gcc          |  72 ++++++
 lib/librte_acl/acl.h                               |   2 +
 lib/librte_acl/rte_acl.c                           |   8 +-
 lib/librte_acl/rte_acl_osdep.h                     |   2 +
 .../common/include/arch/arm/rte_atomic.h           | 257 ++++++++++++++++++++
 .../common/include/arch/arm/rte_byteorder.h        | 148 +++++++++++
 .../common/include/arch/arm/rte_cpuflags.h         | 169 +++++++++++++
 .../common/include/arch/arm/rte_cycles.h           |  85 +++++++
 .../common/include/arch/arm/rte_memcpy.h           | 270 +++++++++++++++++++++
 .../common/include/arch/arm/rte_prefetch.h         |  61 +++++
 .../common/include/arch/arm/rte_rwlock.h           |  40 +++
 .../common/include/arch/arm/rte_spinlock.h         | 114 +++++++++
 lib/librte_lpm/rte_lpm.h                           |  71 ++++++
 mk/arch/arm/rte.vars.mk                            |  39 +++
 mk/machine/armv7-a/rte.vars.mk                     |  60 +++++
 mk/rte.cpuflags.mk                                 |   6 +
 mk/toolchain/gcc/rte.vars.mk                       |   6 +
 18 files changed, 1414 insertions(+), 1 deletion(-)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 01/12] mk: Introduce ARMv7 architecture
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 02/12] eal/arm: atomic operations for ARM Jan Viktorin
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

Make DPDK run on ARMv7-A architecture. This patch assumes
ARM Cortex-A9.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 config/defconfig_arm-armv7-a-linuxapp-gcc | 76 +++++++++++++++++++++++++++++++
 mk/arch/arm/rte.vars.mk                   | 39 ++++++++++++++++
 mk/machine/armv7-a/rte.vars.mk            | 60 ++++++++++++++++++++++++
 3 files changed, 175 insertions(+)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
new file mode 100644
index 0000000..a466d13
--- /dev/null
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -0,0 +1,76 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All right reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv7-a"
+
+CONFIG_RTE_ARCH="arm"
+CONFIG_RTE_ARCH_ARM=y
+CONFIG_RTE_ARCH_ARMv7=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+# ARM doesn't have support for vmware TSC map
+CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
+
+# avoids using i686/x86_64 SIMD instructions, nothing for ARM
+CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
+
+# KNI is not supported on 32-bit
+CONFIG_RTE_LIBRTE_KNI=n
+
+# PCI is usually not used on ARM
+CONFIG_RTE_EAL_IGB_UIO=n
+
+# missing rte_vect.h for ARM
+CONFIG_XMM_SIZE=16
+
+# fails to compile on ARM
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_LPM=n
+
+# cannot use those on ARM
+CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_EM_PMD=n
+CONFIG_RTE_LIBRTE_IGB_PMD=n
+CONFIG_RTE_LIBRTE_CXGBE_PMD=n
+CONFIG_RTE_LIBRTE_E1000_PMD=n
+CONFIG_RTE_LIBRTE_ENIC_PMD=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+CONFIG_RTE_LIBRTE_IXGBE_PMD=n
+CONFIG_RTE_LIBRTE_MLX4_PMD=n
+CONFIG_RTE_LIBRTE_MPIPE_PMD=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
+CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
+CONFIG_RTE_LIBRTE_PMD_BNX2X=n
diff --git a/mk/arch/arm/rte.vars.mk b/mk/arch/arm/rte.vars.mk
new file mode 100644
index 0000000..df0c043
--- /dev/null
+++ b/mk/arch/arm/rte.vars.mk
@@ -0,0 +1,39 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+ARCH  ?= arm
+CROSS ?=
+
+CPU_CFLAGS  ?= -marm -DRTE_CACHE_LINE_SIZE=64 -munaligned-access
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv7-a/rte.vars.mk b/mk/machine/armv7-a/rte.vars.mk
new file mode 100644
index 0000000..7764e4f
--- /dev/null
+++ b/mk/machine/armv7-a/rte.vars.mk
@@ -0,0 +1,60 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of RehiveTech nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+CPU_CFLAGS += -mfloat-abi=softfp
+
+MACHINE_CFLAGS  = -march=armv7-a -mtune=cortex-a9
+MACHINE_CFLAGS += -mfpu=neon
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 02/12] eal/arm: atomic operations for ARM
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 01/12] mk: Introduce ARMv7 architecture Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 03/12] eal/arm: byte order " Jan Viktorin
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific atomic operation file
for ARM architecture. It utilizes compiler intrinsics only.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_atomic.h           | 257 +++++++++++++++++++++
 1 file changed, 257 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
new file mode 100644
index 0000000..aac5b0a
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -0,0 +1,257 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_ARM_H_
+#define _RTE_ATOMIC_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define	rte_mb()  __sync_synchronize()
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define	rte_wmb() __sync_synchronize()
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define	rte_rmb() __sync_synchronize()
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+// FIXME: nechapu: We use intrinsics even if RTE_FORCE_INTRINSICS is not set.
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+		__ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, 0);
+	}
+}
+
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		/* replace the value by itself */
+		success = rte_atomic64_cmpset((volatile uint64_t *) &v->cnt,
+		                              tmp, tmp);
+	}
+	return tmp;
+}
+
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, new_value);
+	}
+}
+
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	__atomic_fetch_add(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	__atomic_fetch_sub(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	__atomic_fetch_add(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	__atomic_fetch_sub(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	return __atomic_add_fetch(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	return __atomic_sub_fetch(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	rte_atomic64_set(v, 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_ARM_H_ */
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 03/12] eal/arm: byte order operations for ARM
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 01/12] mk: Introduce ARMv7 architecture Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 02/12] eal/arm: atomic operations for ARM Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 04/12] eal/arm: cpu cycle " Jan Viktorin
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific byte order operations
for ARM. The architecture supports both big and little endian.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_byteorder.h        | 148 +++++++++++++++++++++
 1 file changed, 148 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
new file mode 100644
index 0000000..04e7b87
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
@@ -0,0 +1,148 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_ARM_H_
+#define _RTE_BYTEORDER_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	register uint16_t x = _x;
+	asm volatile ("rev16 %[x1],%[x2]"
+		      : [x1] "=r" (x)
+		      : [x2] "r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	register uint32_t x = _x;
+	asm volatile ("rev %[x1],%[x2]"
+		      : [x1] "=r" (x)
+		      : [x2] "r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+	return  __builtin_bswap64(_x);
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+/* ARM architecture is bi-endian (both big and little). */
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#else /* RTE_BIG_ENDIAN */
+
+#define rte_cpu_to_le_16(x) rte_bswap16(x)
+#define rte_cpu_to_le_32(x) rte_bswap32(x)
+#define rte_cpu_to_le_64(x) rte_bswap64(x)
+
+#define rte_cpu_to_be_16(x) (x)
+#define rte_cpu_to_be_32(x) (x)
+#define rte_cpu_to_be_64(x) (x)
+
+#define rte_le_to_cpu_16(x) rte_bswap16(x)
+#define rte_le_to_cpu_32(x) rte_bswap32(x)
+#define rte_le_to_cpu_64(x) rte_bswap64(x)
+
+#define rte_be_to_cpu_16(x) (x)
+#define rte_be_to_cpu_32(x) (x)
+#define rte_be_to_cpu_64(x) (x)
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_ARM_H_ */
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 04/12] eal/arm: cpu cycle operations for ARM
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
                   ` (2 preceding siblings ...)
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 03/12] eal/arm: byte order " Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 05/12] eal/arm: prefetch " Jan Viktorin
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

ARM architecture doesn't have a suitable source of CPU cycles. This
patch uses clock_gettime instead. The implementation should be improved
in the future.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_cycles.h           | 85 ++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
new file mode 100644
index 0000000..ff66ae2
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -0,0 +1,85 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_ARM_H_
+#define _RTE_CYCLES_ARM_H_
+
+/* ARM v7 does not have suitable source of clock signals. The only clock counter
+   available in the core is 32 bit wide. Therefore it is unsuitable as the
+   counter overlaps every few seconds and probably is not accessible by
+   userspace programs. Therefore we use clock_gettime(CLOCK_MONOTONIC_RAW) to
+   simulate counter running at 1GHz.
+*/
+
+#include <time.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ *   The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	struct timespec val;
+	uint64_t v;
+
+	while (clock_gettime(CLOCK_MONOTONIC_RAW, &val) != 0)
+		/* no body */;
+
+	v  = (uint64_t) val.tv_sec * 1000000000LL;
+	v += (uint64_t) val.tv_nsec;
+	return v;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM_H_ */
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 05/12] eal/arm: prefetch operations for ARM
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
                   ` (3 preceding siblings ...)
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 04/12] eal/arm: cpu cycle " Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 06/12] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds architecture specific prefetch operations
for ARM architecture. It utilizes the pld instruction that
starts filling the appropriate cache line without blocking.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_prefetch.h         | 61 ++++++++++++++++++++++
 1 file changed, 61 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
new file mode 100644
index 0000000..8d75fe6
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -0,0 +1,61 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_H_
+#define _RTE_PREFETCH_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(const volatile void *p)
+{
+	asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+	asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+	asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_H_ */
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 06/12] eal/arm: spinlock operations for ARM (without HTM)
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
                   ` (4 preceding siblings ...)
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 05/12] eal/arm: prefetch " Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 07/12] eal/arm: vector memcpy for ARM Jan Viktorin
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

This patch adds spinlock operations for ARM architecture.
We do not support HTM in spinlocks on ARM.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_spinlock.h         | 114 +++++++++++++++++++++
 1 file changed, 114 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
new file mode 100644
index 0000000..cd5ab8b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
@@ -0,0 +1,114 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_ARM_H_
+#define _RTE_SPINLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include "generic/rte_spinlock.h"
+
+/* Intrinsics are used to implement the spinlock on ARM architecture */
+
+#ifndef RTE_FORCE_INTRINSICS
+
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	while (__sync_lock_test_and_set(&sl->locked, 1))
+		while (sl->locked)
+			rte_pause();
+}
+
+static inline void
+rte_spinlock_unlock(rte_spinlock_t *sl)
+{
+	__sync_lock_release(&sl->locked);
+}
+
+static inline int
+rte_spinlock_trylock(rte_spinlock_t *sl)
+{
+	return (__sync_lock_test_and_set(&sl->locked, 1) == 0);
+}
+
+#endif
+
+static inline int rte_tm_supported(void)
+{
+	return 0;
+}
+
+static inline void
+rte_spinlock_lock_tm(rte_spinlock_t *sl)
+{
+	rte_spinlock_lock(sl); /* fall-back */
+}
+
+static inline int
+rte_spinlock_trylock_tm(rte_spinlock_t *sl)
+{
+	return rte_spinlock_trylock(sl);
+}
+
+static inline void
+rte_spinlock_unlock_tm(rte_spinlock_t *sl)
+{
+	rte_spinlock_unlock(sl);
+}
+
+static inline void
+rte_spinlock_recursive_lock_tm(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_recursive_lock(slr); /* fall-back */
+}
+
+static inline void
+rte_spinlock_recursive_unlock_tm(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_recursive_unlock(slr);
+}
+
+static inline int
+rte_spinlock_recursive_trylock_tm(rte_spinlock_recursive_t *slr)
+{
+	return rte_spinlock_recursive_trylock(slr);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_ARM_H_ */
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 07/12] eal/arm: vector memcpy for ARM
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
                   ` (5 preceding siblings ...)
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 06/12] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 08/12] eal/arm: cpu flag checks " Jan Viktorin
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

The SSE based memory copy in DPDK only support x86. This patch
adds ARM NEON based memory copy functions for ARM architecture.

The implementation improves memory copy of short or well aligned
data buffers. The following measurements show improvements over
the libc memcpy on Cortex CPUs.

Length (B)   a15    a7     a9
   1         4.9  15.2    3.2
   7        56.9  48.2   40.3
   8        37.3  39.8   29.6
   9        69.3  38.7   33.9
  15        60.8  35.3   23.7
  16        50.6  35.9   35.0
  17        57.7  35.7   31.1
  31        16.0  23.3    9.0
  32        65.9  13.5   21.4
  33         3.9  10.3   -3.7
  63         2.0  12.9   -2.0
  64        66.5   0.0   16.5
  65         2.7   7.6  -35.6
 127         0.1   4.5  -18.9
 128        66.2   1.5  -51.4
 129        -0.8   3.2  -35.8
 255        -3.1  -0.9  -69.1
 256        67.9   1.2    7.2
 257        -3.6  -1.9  -36.9
 320        67.7   1.4    0.0
 384        66.8   1.4  -14.2
 511       -44.9  -2.3  -41.9
 512        67.3   1.4   -6.8
 513       -41.7  -3.0  -36.2
1023       -82.4  -2.8  -41.2
1024        68.3   1.4  -11.6
1025       -80.1  -3.3  -38.1
1518       -47.3  -5.0  -38.3
1522       -48.3  -6.0  -37.9
1600        65.4   1.3  -27.3
2048        59.5   1.5  -10.9
3072        52.3   1.5  -12.2
4096        45.3   1.4  -12.5
5120        40.6   1.5  -14.5
6144        35.4   1.4  -13.4
7168        32.9   1.4  -13.9
8192        28.2   1.4  -15.1

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_memcpy.h           | 270 +++++++++++++++++++++
 1 file changed, 270 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
new file mode 100644
index 0000000..ac885e9
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -0,0 +1,270 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_ARM_H_
+#define _RTE_MEMCPY_ARM_H_
+
+#include <stdint.h>
+#include <string.h>
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	vst1q_u8(dst, vld1q_u8(src));
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("vld1.8 {d0-d3}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3");
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+	              "vld1.8 {d4-d5}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+	              "vst1.8 {d4-d5}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5");
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+	              "vld1.8 {d4-d7}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+	              "vst1.8 {d4-d7}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7");
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("pld [%[src], #64]" : : [src] "r" (src));
+	asm volatile ("vld1.8 {d0-d3},   [%[src]]!\n\t"
+	              "vld1.8 {d4-d7},   [%[src]]!\n\t"
+	              "vld1.8 {d8-d11},  [%[src]]!\n\t"
+	              "vld1.8 {d12-d15}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3},   [%[dst]]!\n\t"
+	              "vst1.8 {d4-d7},   [%[dst]]!\n\t"
+	              "vst1.8 {d8-d11},  [%[dst]]!\n\t"
+	              "vst1.8 {d12-d15}, [%[dst]]\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+	                  "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile ("pld [%[src], #64]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #128]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #192]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #256]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #320]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #384]" : : [src] "r" (src));
+	asm volatile ("pld [%[src], #448]" : : [src] "r" (src));
+	asm volatile ("vld1.8 {d0-d3},   [%[src]]!\n\t"
+	              "vld1.8 {d4-d7},   [%[src]]!\n\t"
+	              "vld1.8 {d8-d11},  [%[src]]!\n\t"
+	              "vld1.8 {d12-d15}, [%[src]]!\n\t"
+	              "vld1.8 {d16-d19}, [%[src]]!\n\t"
+	              "vld1.8 {d20-d23}, [%[src]]!\n\t"
+	              "vld1.8 {d24-d27}, [%[src]]!\n\t"
+	              "vld1.8 {d28-d31}, [%[src]]\n\t"
+	              "vst1.8 {d0-d3},   [%[dst]]!\n\t"
+	              "vst1.8 {d4-d7},   [%[dst]]!\n\t"
+	              "vst1.8 {d8-d11},  [%[dst]]!\n\t"
+	              "vst1.8 {d12-d15}, [%[dst]]!\n\t"
+	              "vst1.8 {d16-d19}, [%[dst]]!\n\t"
+	              "vst1.8 {d20-d23}, [%[dst]]!\n\t"
+	              "vst1.8 {d24-d27}, [%[dst]]!\n\t"
+	              "vst1.8 {d28-d31}, [%[dst]]!\n\t"
+	              : [src] "+r" (src), [dst] "+r" (dst)
+	              : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+	                  "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15",
+	                  "d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23",
+	                  "d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31");
+}
+
+#define rte_memcpy(dst, src, n)              \
+	({ (__builtin_constant_p(n)) ?       \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)); })
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			/* ARMv7 can not handle unaligned access to long long
+			 * (uint64_t). Therefore two uint32_t operations are used.
+			 * TODO: use NEON too?
+			 */
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+			*(uint32_t *)dst = *(const uint32_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n,
+			(const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n,
+			(const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0)
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_H_ */
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 08/12] eal/arm: cpu flag checks for ARM
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
                   ` (6 preceding siblings ...)
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 07/12] eal/arm: vector memcpy for ARM Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 09/12] eal/arm: rwlock support " Jan Viktorin
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

This implementation is based on IBM POWER version of
rte_cpuflags. We use software emulation of HW capability
registers, because those are usually not directly accessible
from userspace on ARM.

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 app/test/test_cpuflags.c                           |   5 +
 .../common/include/arch/arm/rte_cpuflags.h         | 169 +++++++++++++++++++++
 mk/rte.cpuflags.mk                                 |   6 +
 3 files changed, 180 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 5b92061..557458f 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -115,6 +115,11 @@ test_cpuflags(void)
 	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
 #endif
 
+#if defined(RTE_ARCH_ARM)
+	printf("Check for NEON:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+#endif
+
 #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 	printf("Check for SSE:\t\t");
 	CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
new file mode 100644
index 0000000..5a402a4
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -0,0 +1,169 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_ARM_H_
+#define _RTE_CPUFLAGS_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <elf.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+
+#include "generic/rte_cpuflags.h"
+
+/* software based registers */
+enum cpu_register_t {
+	REG_HWCAP = 0,
+	REG_HWCAP2,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+	RTE_CPUFLAG_SWP = 0,
+	RTE_CPUFLAG_HALF,
+	RTE_CPUFLAG_THUMB,
+	RTE_CPUFLAG_A26BIT,
+	RTE_CPUFLAG_FAST_MULT,
+	RTE_CPUFLAG_FPA,
+	RTE_CPUFLAG_VFP,
+	RTE_CPUFLAG_EDSP,
+	RTE_CPUFLAG_JAVA,
+	RTE_CPUFLAG_IWMMXT,
+	RTE_CPUFLAG_CRUNCH,
+	RTE_CPUFLAG_THUMBEE,
+	RTE_CPUFLAG_NEON,
+	RTE_CPUFLAG_VFPv3,
+	RTE_CPUFLAG_VFPv3D16,
+	RTE_CPUFLAG_TLS,
+	RTE_CPUFLAG_VFPv4,
+	RTE_CPUFLAG_IDIVA,
+	RTE_CPUFLAG_IDIVT,
+	RTE_CPUFLAG_VFPD32,
+	RTE_CPUFLAG_LPAE,
+	RTE_CPUFLAG_EVTSTRM,
+	RTE_CPUFLAG_AES,
+	RTE_CPUFLAG_PMULL,
+	RTE_CPUFLAG_SHA1,
+	RTE_CPUFLAG_SHA2,
+	RTE_CPUFLAG_CRC32,
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(SWP,       0x00000001, 0, REG_HWCAP,  0)
+	FEAT_DEF(HALF,      0x00000001, 0, REG_HWCAP,  1)
+	FEAT_DEF(THUMB,     0x00000001, 0, REG_HWCAP,  2)
+	FEAT_DEF(A26BIT,    0x00000001, 0, REG_HWCAP,  3)
+	FEAT_DEF(FAST_MULT, 0x00000001, 0, REG_HWCAP,  4)
+	FEAT_DEF(FPA,       0x00000001, 0, REG_HWCAP,  5)
+	FEAT_DEF(VFP,       0x00000001, 0, REG_HWCAP,  6)
+	FEAT_DEF(EDSP,      0x00000001, 0, REG_HWCAP,  7)
+	FEAT_DEF(JAVA,      0x00000001, 0, REG_HWCAP,  8)
+	FEAT_DEF(IWMMXT,    0x00000001, 0, REG_HWCAP,  9)
+	FEAT_DEF(CRUNCH,    0x00000001, 0, REG_HWCAP,  10)
+	FEAT_DEF(THUMBEE,   0x00000001, 0, REG_HWCAP,  11)
+	FEAT_DEF(NEON,      0x00000001, 0, REG_HWCAP,  12)
+	FEAT_DEF(VFPv3,     0x00000001, 0, REG_HWCAP,  13)
+	FEAT_DEF(VFPv3D16,  0x00000001, 0, REG_HWCAP,  14)
+	FEAT_DEF(TLS,       0x00000001, 0, REG_HWCAP,  15)
+	FEAT_DEF(VFPv4,     0x00000001, 0, REG_HWCAP,  16)
+	FEAT_DEF(IDIVA,     0x00000001, 0, REG_HWCAP,  17)
+	FEAT_DEF(IDIVT,     0x00000001, 0, REG_HWCAP,  18)
+	FEAT_DEF(VFPD32,    0x00000001, 0, REG_HWCAP,  19)
+	FEAT_DEF(LPAE,      0x00000001, 0, REG_HWCAP,  20)
+	FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  21)
+	FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP2,  0)
+	FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP2,  1)
+	FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP2,  2)
+	FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP2,  3)
+	FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
+};
+
+/*
+ * Read AUXV software register and get cpu features for ARM
+ */
+static inline void
+rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
+	__attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
+{
+	int auxv_fd;
+	Elf32_auxv_t auxv;
+
+	auxv_fd = open("/proc/self/auxv", O_RDONLY);
+	assert(auxv_fd);
+	while (read(auxv_fd, &auxv,
+		sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+		if (auxv.a_type == AT_HWCAP)
+			out[REG_HWCAP] = auxv.a_un.a_val;
+		else if (auxv.a_type == AT_HWCAP2)
+			out[REG_HWCAP2] = auxv.a_un.a_val;
+	}
+}
+
+/*
+ * Checks if a particular flag is available on current machine.
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs = {0};
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_ARM_H_ */
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index f595cd0..bec7bdd 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -106,6 +106,12 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
 CPUFLAGS += VSX
 endif
 
+# ARM flags
+ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)
+CPUFLAGS += NEON
+endif
+
+
 MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
 
 # To strip whitespace
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 09/12] eal/arm: rwlock support for ARM
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
                   ` (7 preceding siblings ...)
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 08/12] eal/arm: cpu flag checks " Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 10/12] gcc/arm: avoid alignment errors to break build Jan Viktorin
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Jan Viktorin

Just a copy from PPC.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 .../common/include/arch/arm/rte_rwlock.h           | 40 ++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_rwlock.h b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
new file mode 100644
index 0000000..664bec8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
@@ -0,0 +1,40 @@
+/* copied from ppc_64 */
+
+#ifndef _RTE_RWLOCK_ARM_H_
+#define _RTE_RWLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_rwlock.h"
+
+static inline void
+rte_rwlock_read_lock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_read_lock(rwl);
+}
+
+static inline void
+rte_rwlock_read_unlock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_read_unlock(rwl);
+}
+
+static inline void
+rte_rwlock_write_lock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_write_lock(rwl);
+}
+
+static inline void
+rte_rwlock_write_unlock_tm(rte_rwlock_t *rwl)
+{
+	rte_rwlock_write_unlock(rwl);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RWLOCK_ARM_H_ */
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 10/12] gcc/arm: avoid alignment errors to break build
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
                   ` (8 preceding siblings ...)
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 09/12] eal/arm: rwlock support " Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 11/12] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86 Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 12/12] arm: Disable usage of SSE optimized code in librte_acl Jan Viktorin
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

There several issues with alignment when compiling for ARMv7.
They are not considered to be fatal (ARMv7 supports unaligned
access of 32b words), so we just leave them as warnings. They
should be solved later, however.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
---
 mk/toolchain/gcc/rte.vars.mk | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
index 0f51c66..8f9c396 100644
--- a/mk/toolchain/gcc/rte.vars.mk
+++ b/mk/toolchain/gcc/rte.vars.mk
@@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual
 WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
 WERROR_FLAGS += -Wundef -Wwrite-strings
 
+# There are many issues reported for ARMv7 architecture
+# which are not necessarily fatal. Report as warnings.
+ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
+WERROR_FLAGS += -Wno-error
+endif
+
 # process cpu flags
 include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk
 
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 11/12] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
                   ` (9 preceding siblings ...)
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 10/12] gcc/arm: avoid alignment errors to break build Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 12/12] arm: Disable usage of SSE optimized code in librte_acl Jan Viktorin
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
the function is reimplemented using non-vector operations for non-x86
architectures. In the future, each architecture should have vectorized code.
This patch includes rudimentary emulation of intrinsic functions _mm_set_epi32(),
_mm_loadu_si128() and _mm_load_si128() for easy portability of existing
applications.

LPM builds now when on ARM.

FIXME: to be reworked

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
 lib/librte_lpm/rte_lpm.h                  | 71 +++++++++++++++++++++++++++++++
 2 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
index a466d13..c434429 100644
--- a/config/defconfig_arm-armv7-a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -56,7 +56,6 @@ CONFIG_XMM_SIZE=16
 
 # fails to compile on ARM
 CONFIG_RTE_LIBRTE_ACL=n
-CONFIG_RTE_LIBRTE_LPM=n
 
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index 11f0c04..81d23a2 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -47,7 +47,9 @@
 #include <rte_byteorder.h>
 #include <rte_memory.h>
 #include <rte_common.h>
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 #include <rte_vect.h>
+#endif
 
 #ifdef __cplusplus
 extern "C" {
@@ -369,6 +371,7 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
 	return 0;
 }
 
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 /* Mask four results. */
 #define	 RTE_LPM_MASKX4_RES	UINT64_C(0x00ff00ff00ff00ff)
 
@@ -483,6 +486,74 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
 	hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
 	hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
 }
+#else
+// TODO: this code should be reworked.
+
+typedef struct {
+	union uint128 {
+		uint8_t uint8[16];
+		uint32_t uint32[4];
+	} val;
+} __m128i;
+
+static inline __m128i
+_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
+{
+	__m128i res;
+	res.val.uint32[0] = v0;
+	res.val.uint32[1] = v1;
+	res.val.uint32[2] = v2;
+	res.val.uint32[3] = v3;
+	return res;
+}
+
+static inline __m128i
+_mm_loadu_si128(__m128i * v)
+{
+	__m128i res;
+	res = *v;
+	return res;
+}
+
+static inline __m128i
+_mm_load_si128(__m128i * v)
+{
+	__m128i res;
+	res = *v;
+	return res;
+}
+
+/**
+ * Lookup four IP addresses in an LPM table.
+ *
+ * @param lpm
+ *   LPM object handle
+ * @param ip
+ *   Four IPs to be looked up in the LPM table
+ * @param hop
+ *   Next hop of the most specific rule found for IP (valid on lookup hit only).
+ *   This is an 4 elements array of two byte values.
+ *   If the lookup was succesfull for the given IP, then least significant byte
+ *   of the corresponding element is the  actual next hop and the most
+ *   significant byte is zero.
+ *   If the lookup for the given IP failed, then corresponding element would
+ *   contain default value, see description of then next parameter.
+ * @param defv
+ *   Default value to populate into corresponding element of hop[] array,
+ *   if lookup would fail.
+ */
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+	uint16_t defv)
+{
+	rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
+
+	hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
+	hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
+	hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
+	hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
+}
+#endif
 
 #ifdef __cplusplus
 }
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v1 12/12] arm: Disable usage of SSE optimized code in librte_acl
  2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
                   ` (10 preceding siblings ...)
  2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 11/12] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86 Jan Viktorin
@ 2015-10-03  8:58 ` Jan Viktorin
  11 siblings, 0 replies; 13+ messages in thread
From: Jan Viktorin @ 2015-10-03  8:58 UTC (permalink / raw)
  To: dev; +Cc: Vlastimil Kosar, Jan Viktorin

From: Vlastimil Kosar <kosar@rehivetech.com>

The SSE optimized code is disabled for architectures other
than i686 and x86_64.

ACL works now on ARM.

FIXME: should be reworked to avoid if/endif mess

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
 config/defconfig_arm-armv7-a-linuxapp-gcc | 3 ---
 lib/librte_acl/acl.h                      | 2 ++
 lib/librte_acl/rte_acl.c                  | 8 +++++++-
 lib/librte_acl/rte_acl_osdep.h            | 2 ++
 4 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
index c434429..e091200 100644
--- a/config/defconfig_arm-armv7-a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -54,9 +54,6 @@ CONFIG_RTE_EAL_IGB_UIO=n
 # missing rte_vect.h for ARM
 CONFIG_XMM_SIZE=16
 
-# fails to compile on ARM
-CONFIG_RTE_LIBRTE_ACL=n
-
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
 CONFIG_RTE_LIBRTE_EM_PMD=n
diff --git a/lib/librte_acl/acl.h b/lib/librte_acl/acl.h
index eb4930c..cb4e6a8 100644
--- a/lib/librte_acl/acl.h
+++ b/lib/librte_acl/acl.h
@@ -222,9 +222,11 @@ int
 rte_acl_classify_scalar(const struct rte_acl_ctx *ctx, const uint8_t **data,
 	uint32_t *results, uint32_t num, uint32_t categories);
 
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 int
 rte_acl_classify_sse(const struct rte_acl_ctx *ctx, const uint8_t **data,
 	uint32_t *results, uint32_t num, uint32_t categories);
+#endif
 
 int
 rte_acl_classify_avx2(const struct rte_acl_ctx *ctx, const uint8_t **data,
diff --git a/lib/librte_acl/rte_acl.c b/lib/librte_acl/rte_acl.c
index a54d531..f173b31 100644
--- a/lib/librte_acl/rte_acl.c
+++ b/lib/librte_acl/rte_acl.c
@@ -60,8 +60,13 @@ rte_acl_classify_avx2(__rte_unused const struct rte_acl_ctx *ctx,
 static const rte_acl_classify_t classify_fns[] = {
 	[RTE_ACL_CLASSIFY_DEFAULT] = rte_acl_classify_scalar,
 	[RTE_ACL_CLASSIFY_SCALAR] = rte_acl_classify_scalar,
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 	[RTE_ACL_CLASSIFY_SSE] = rte_acl_classify_sse,
 	[RTE_ACL_CLASSIFY_AVX2] = rte_acl_classify_avx2,
+#else
+	[RTE_ACL_CLASSIFY_SSE] = rte_acl_classify_scalar,
+	[RTE_ACL_CLASSIFY_AVX2] = rte_acl_classify_scalar,
+#endif
 };
 
 /* by default, use always available scalar code path. */
@@ -95,6 +100,7 @@ rte_acl_init(void)
 {
 	enum rte_acl_classify_alg alg = RTE_ACL_CLASSIFY_DEFAULT;
 
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 #ifdef CC_AVX2_SUPPORT
 	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
 		alg = RTE_ACL_CLASSIFY_AVX2;
@@ -103,7 +109,7 @@ rte_acl_init(void)
 	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
 #endif
 		alg = RTE_ACL_CLASSIFY_SSE;
-
+#endif
 	rte_acl_set_default_classify(alg);
 }
 
diff --git a/lib/librte_acl/rte_acl_osdep.h b/lib/librte_acl/rte_acl_osdep.h
index 41f7e3d..05cf9a8 100644
--- a/lib/librte_acl/rte_acl_osdep.h
+++ b/lib/librte_acl/rte_acl_osdep.h
@@ -59,7 +59,9 @@
 #define DIM(x) RTE_DIM(x)
 
 #include <rte_common.h>
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 #include <rte_vect.h>
+#endif
 #include <rte_memory.h>
 #include <rte_log.h>
 #include <rte_memcpy.h>
-- 
2.5.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-10-03  8:58 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-03  8:58 [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7) Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 01/12] mk: Introduce ARMv7 architecture Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 02/12] eal/arm: atomic operations for ARM Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 03/12] eal/arm: byte order " Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 04/12] eal/arm: cpu cycle " Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 05/12] eal/arm: prefetch " Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 06/12] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 07/12] eal/arm: vector memcpy for ARM Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 08/12] eal/arm: cpu flag checks " Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 09/12] eal/arm: rwlock support " Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 10/12] gcc/arm: avoid alignment errors to break build Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 11/12] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86 Jan Viktorin
2015-10-03  8:58 ` [dpdk-dev] [PATCH v1 12/12] arm: Disable usage of SSE optimized code in librte_acl Jan Viktorin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).