* [dpdk-dev] [PATCH 00/12] DPDK armv8-a support
@ 2015-11-03 13:09 Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob
2015-11-03 14:17 ` [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Hunt, David
0 siblings, 2 replies; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
This is the v1 patchset for ARMv8 that now sits on top of the v6 patch
of the ARMv7 code by RehiveTech. It adds code into the same arm include
directory, reducing code duplication.
Tested on an ThunderX arm 64-bit arm server board, with PCI slots. Passes traffic
between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should
work with many other NICS as long as long as there is no unaligned access to
device memory but not yet untested.
Compiles igb_uio, kni and all the physical device PMDs.
An entry has been added to the Release notes.
NOTE:
Part of the work has been taken from David Hunt's v3 patch who was
initiated the armv8 port.
Notes on arm64 kernel configuration:
Tested on using Ubuntu 14.04 LTS with a 3.18 kernel and igb_uio.
ARM64 kernels does not have functional resource mapping of PCI memory
(PCI_MMAP), so the pci driver needs to be patched to enable this. The
symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is
missing the resource0...N files for mmapping the device memory.
Following patch fixes the PCI resource mapping issue om armv8.
Its not yet up streamed.We are in the process of up streaming it.
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358906.html
Jerin Jacob (12):
eal: arm64: add armv8-a version of rte_atomic_64.h
eal: arm64: add armv8-a version of rte_cpuflags_64.h
eal: arm64: add armv8-a version of rte_prefetch_64.h
eal: arm64: add armv8-a version of rte_cycles_64.h
eal: arm64: rte_memcpy_64.h version based libc memcpy
eal: arm: ret_vector.h improvements
app: test: added the new cpu flags of arm64 in test_cpuflags test case
acl: arm64: acl implementation using NEON gcc intrinsic
mk: add support for armv8 on top of armv7
mk: add support for thunderx machine target based armv8-a
updated release note for armv8 support for DPDK 2.2
maintainers: claim responsibility for ARMv8
MAINTAINERS | 5 +
app/test-acl/main.c | 4 +
app/test/test_cpuflags.c | 26 ++
config/defconfig_arm64-armv8a-linuxapp-gcc | 56 ++++
config/defconfig_arm64-thunderx-linuxapp-gcc | 56 ++++
doc/guides/rel_notes/release_2_2.rst | 7 +-
lib/librte_acl/Makefile | 5 +
lib/librte_acl/acl.h | 4 +
lib/librte_acl/acl_run_neon.c | 46 ++++
lib/librte_acl/acl_run_neon.h | 290 +++++++++++++++++++++
lib/librte_acl/rte_acl.c | 25 ++
lib/librte_acl/rte_acl.h | 1 +
.../common/include/arch/arm/rte_atomic.h | 4 +
.../common/include/arch/arm/rte_atomic_64.h | 88 +++++++
.../common/include/arch/arm/rte_cpuflags.h | 4 +
.../common/include/arch/arm/rte_cpuflags_64.h | 152 +++++++++++
.../common/include/arch/arm/rte_cycles.h | 4 +
.../common/include/arch/arm/rte_cycles_64.h | 71 +++++
.../common/include/arch/arm/rte_memcpy.h | 4 +
.../common/include/arch/arm/rte_memcpy_64.h | 93 +++++++
.../common/include/arch/arm/rte_prefetch.h | 4 +
.../common/include/arch/arm/rte_prefetch_64.h | 61 +++++
lib/librte_eal/common/include/arch/arm/rte_vect.h | 58 ++---
mk/arch/arm64/rte.vars.mk | 58 +++++
mk/machine/armv8a/rte.vars.mk | 58 +++++
mk/machine/thunderx/rte.vars.mk | 58 +++++
26 files changed, 1198 insertions(+), 44 deletions(-)
create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
create mode 100644 config/defconfig_arm64-thunderx-linuxapp-gcc
create mode 100644 lib/librte_acl/acl_run_neon.c
create mode 100644 lib/librte_acl/acl_run_neon.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
create mode 100644 mk/arch/arm64/rte.vars.mk
create mode 100644 mk/machine/armv8a/rte.vars.mk
create mode 100644 mk/machine/thunderx/rte.vars.mk
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h
2015-11-03 13:09 [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 02/12] eal: arm64: add armv8-a version of rte_cpuflags_64.h Jerin Jacob
2015-11-03 14:17 ` [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Hunt, David
1 sibling, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
except rte_?wb() functions other functions are used from
RTE_FORCE_INTRINSICS=y scheme
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
.../common/include/arch/arm/rte_atomic.h | 4 +
.../common/include/arch/arm/rte_atomic_64.h | 88 ++++++++++++++++++++++
2 files changed, 92 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
index f4f5783..f3f3b6e 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -33,6 +33,10 @@
#ifndef _RTE_ATOMIC_ARM_H_
#define _RTE_ATOMIC_ARM_H_
+#ifdef RTE_ARCH_64
+#include <rte_atomic_64.h>
+#else
#include <rte_atomic_32.h>
+#endif
#endif /* _RTE_ATOMIC_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
new file mode 100644
index 0000000..671caa7
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -0,0 +1,88 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (C) Cavium networks Ltd. 2015.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_ATOMIC_ARM64_H_
+#define _RTE_ATOMIC_ARM64_H_
+
+#ifndef RTE_FORCE_INTRINSICS
+# error Platform must be built with CONFIG_RTE_FORCE_INTRINSICS
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+#define dmb(opt) do { asm volatile("dmb " #opt : : : "memory"); } while (0)
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ * This function is architecture dependent.
+ */
+static inline void rte_mb(void)
+{
+ dmb(ish);
+}
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ * This function is architecture dependent.
+ */
+static inline void rte_wmb(void)
+{
+ dmb(ishst);
+}
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ * This function is architecture dependent.
+ */
+static inline void rte_rmb(void)
+{
+ dmb(ishld);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_ARM64_H_ */
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 02/12] eal: arm64: add armv8-a version of rte_cpuflags_64.h
2015-11-03 13:09 ` [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 03/12] eal: arm64: add armv8-a version of rte_prefetch_64.h Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
.../common/include/arch/arm/rte_cpuflags.h | 4 +
.../common/include/arch/arm/rte_cpuflags_64.h | 152 +++++++++++++++++++++
2 files changed, 156 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 8de78d2..b8f6288 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -33,6 +33,10 @@
#ifndef _RTE_CPUFLAGS_ARM_H_
#define _RTE_CPUFLAGS_ARM_H_
+#ifdef RTE_ARCH_64
+#include <rte_cpuflags_64.h>
+#else
#include <rte_cpuflags_32.h>
+#endif
#endif /* _RTE_CPUFLAGS_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h
new file mode 100644
index 0000000..7bcc12f
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h
@@ -0,0 +1,152 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (C) Cavium networks Ltd. 2015.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_ARM64_H_
+#define _RTE_CPUFLAGS_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <elf.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "generic/rte_cpuflags.h"
+
+#ifndef AT_HWCAP
+#define AT_HWCAP 16
+#endif
+
+#ifndef AT_HWCAP2
+#define AT_HWCAP2 26
+#endif
+
+#ifndef AT_PLATFORM
+#define AT_PLATFORM 15
+#endif
+
+/* software based registers */
+enum cpu_register_t {
+ REG_HWCAP = 0,
+ REG_HWCAP2,
+ REG_PLATFORM,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+ RTE_CPUFLAG_FP = 0,
+ RTE_CPUFLAG_NEON,
+ RTE_CPUFLAG_EVTSTRM,
+ RTE_CPUFLAG_AES,
+ RTE_CPUFLAG_PMULL,
+ RTE_CPUFLAG_SHA1,
+ RTE_CPUFLAG_SHA2,
+ RTE_CPUFLAG_CRC32,
+ RTE_CPUFLAG_AARCH64,
+ /* The last item */
+ RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+ FEAT_DEF(FP, 0x00000001, 0, REG_HWCAP, 0)
+ FEAT_DEF(NEON, 0x00000001, 0, REG_HWCAP, 1)
+ FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 2)
+ FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP, 3)
+ FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP, 4)
+ FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP, 5)
+ FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP, 6)
+ FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP, 7)
+ FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1)
+};
+
+/*
+ * Read AUXV software register and get cpu features for ARM
+ */
+static inline void
+rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
+ __attribute__((unused)) uint32_t subleaf,
+ cpuid_registers_t out)
+{
+ int auxv_fd;
+ Elf64_auxv_t auxv;
+
+ auxv_fd = open("/proc/self/auxv", O_RDONLY);
+ assert(auxv_fd);
+ while (read(auxv_fd, &auxv,
+ sizeof(Elf64_auxv_t)) == sizeof(Elf64_auxv_t)) {
+ if (auxv.a_type == AT_HWCAP) {
+ out[REG_HWCAP] = auxv.a_un.a_val;
+ } else if (auxv.a_type == AT_HWCAP2) {
+ out[REG_HWCAP2] = auxv.a_un.a_val;
+ } else if (auxv.a_type == AT_PLATFORM) {
+ if (!strcmp((const char *)auxv.a_un.a_val, "aarch64"))
+ out[REG_PLATFORM] = 0x0001;
+ }
+ }
+}
+
+/*
+ * Checks if a particular flag is available on current machine.
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+ const struct feature_entry *feat;
+ cpuid_registers_t regs = {0};
+
+ if (feature >= RTE_CPUFLAG_NUMFLAGS)
+ /* Flag does not match anything in the feature tables */
+ return -ENOENT;
+
+ feat = &cpu_feature_table[feature];
+
+ if (!feat->leaf)
+ /* This entry in the table wasn't filled out! */
+ return -EFAULT;
+
+ /* get the cpuid leaf containing the desired feature */
+ rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+ /* check if the feature is enabled */
+ return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_ARM64_H_ */
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 03/12] eal: arm64: add armv8-a version of rte_prefetch_64.h
2015-11-03 13:09 ` [dpdk-dev] [PATCH 02/12] eal: arm64: add armv8-a version of rte_cpuflags_64.h Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 04/12] eal: arm64: add armv8-a version of rte_cycles_64.h Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
.../common/include/arch/arm/rte_prefetch.h | 4 ++
.../common/include/arch/arm/rte_prefetch_64.h | 61 ++++++++++++++++++++++
2 files changed, 65 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
index 1f46697..aa37de5 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -33,6 +33,10 @@
#ifndef _RTE_PREFETCH_ARM_H_
#define _RTE_PREFETCH_ARM_H_
+#ifdef RTE_ARCH_64
+#include <rte_prefetch_64.h>
+#else
#include <rte_prefetch_32.h>
+#endif
#endif /* _RTE_PREFETCH_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
new file mode 100644
index 0000000..f9cc62e
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
@@ -0,0 +1,61 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (C) Cavium networks Ltd. 2015.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_64_H_
+#define _RTE_PREFETCH_ARM_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(const volatile void *p)
+{
+ asm volatile ("PRFM PLDL1KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+ asm volatile ("PRFM PLDL2KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+ asm volatile ("PRFM PLDL3KEEP, [%0]" : : "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_64_H_ */
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 04/12] eal: arm64: add armv8-a version of rte_cycles_64.h
2015-11-03 13:09 ` [dpdk-dev] [PATCH 03/12] eal: arm64: add armv8-a version of rte_prefetch_64.h Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 05/12] eal: arm64: rte_memcpy_64.h version based libc memcpy Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
cntcvt_el0 ticks are not based on cpu clk unlike rdtsc in x86.
Its a fixed clock running based at constant speed.
Though its a armv8-a implementer choice, typically it runs at 50 or 100 MHz
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
.../common/include/arch/arm/rte_cycles.h | 4 ++
.../common/include/arch/arm/rte_cycles_64.h | 71 ++++++++++++++++++++++
2 files changed, 75 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index b2372fa..a8009a0 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -33,6 +33,10 @@
#ifndef _RTE_CYCLES_ARM_H_
#define _RTE_CYCLES_ARM_H_
+#ifdef RTE_ARCH_64
+#include <rte_cycles_64.h>
+#else
#include <rte_cycles_32.h>
+#endif
#endif /* _RTE_CYCLES_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
new file mode 100644
index 0000000..14f2612
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
@@ -0,0 +1,71 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (C) Cavium networks Ltd. 2015.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_CYCLES_ARM64_H_
+#define _RTE_CYCLES_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ * The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+ uint64_t tsc;
+
+ asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
+ return tsc;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+ rte_mb();
+ return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM64_H_ */
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 05/12] eal: arm64: rte_memcpy_64.h version based libc memcpy
2015-11-03 13:09 ` [dpdk-dev] [PATCH 04/12] eal: arm64: add armv8-a version of rte_cycles_64.h Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 06/12] eal: arm: ret_vector.h improvements Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
.../common/include/arch/arm/rte_memcpy.h | 4 +
.../common/include/arch/arm/rte_memcpy_64.h | 93 ++++++++++++++++++++++
2 files changed, 97 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index d9f5bf1..1d562c3 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -33,6 +33,10 @@
#ifndef _RTE_MEMCPY_ARM_H_
#define _RTE_MEMCPY_ARM_H_
+#ifdef RTE_ARCH_64
+#include <rte_memcpy_64.h>
+#else
#include <rte_memcpy_32.h>
+#endif
#endif /* _RTE_MEMCPY_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
new file mode 100644
index 0000000..917cdc1
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
@@ -0,0 +1,93 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (C) Cavium networks Ltd. 2015.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_MEMCPY_ARM64_H_
+#define _RTE_MEMCPY_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <string.h>
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 256);
+}
+
+#define rte_memcpy(d, s, n) memcpy((d), (s), (n))
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+ return memcpy(dst, src, n);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_64_H_ */
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 06/12] eal: arm: ret_vector.h improvements
2015-11-03 13:09 ` [dpdk-dev] [PATCH 05/12] eal: arm64: rte_memcpy_64.h version based libc memcpy Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 07/12] app: test: added the new cpu flags of arm64 in test_cpuflags test case Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
added the definition of rte_xmm and xmm_t for acl noen implementation.
removed the emulated _mm_* functions
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_vect.h | 58 +++++++----------------
1 file changed, 17 insertions(+), 41 deletions(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
index 7d5de97..21cdb4d 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2015 RehiveTech. All rights reserved.
+ * Copyright(c) 2015 Cavium Networks. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
@@ -13,7 +13,7 @@
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
- * * Neither the name of RehiveTech nor the names of its
+ * * Neither the name of Cavium Networks nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
@@ -33,49 +33,25 @@
#ifndef _RTE_VECT_ARM_H_
#define _RTE_VECT_ARM_H_
+#include "arm_neon.h"
+
#ifdef __cplusplus
extern "C" {
#endif
-#define XMM_SIZE 16
-#define XMM_MASK (XMM_MASK - 1)
-
-typedef struct {
- union uint128 {
- uint8_t uint8[16];
- uint32_t uint32[4];
- } val;
-} __m128i;
-
-static inline __m128i
-_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
-{
- __m128i res;
-
- res.val.uint32[0] = v0;
- res.val.uint32[1] = v1;
- res.val.uint32[2] = v2;
- res.val.uint32[3] = v3;
- return res;
-}
-
-static inline __m128i
-_mm_loadu_si128(__m128i *v)
-{
- __m128i res;
-
- res = *v;
- return res;
-}
-
-static inline __m128i
-_mm_load_si128(__m128i *v)
-{
- __m128i res;
-
- res = *v;
- return res;
-}
+typedef int32x4_t xmm_t;
+
+#define XMM_SIZE (sizeof(xmm_t))
+#define XMM_MASK (XMM_SIZE - 1)
+
+typedef union rte_xmm {
+ xmm_t x;
+ uint8_t u8[XMM_SIZE / sizeof(uint8_t)];
+ uint16_t u16[XMM_SIZE / sizeof(uint16_t)];
+ uint32_t u32[XMM_SIZE / sizeof(uint32_t)];
+ uint64_t u64[XMM_SIZE / sizeof(uint64_t)];
+ double pd[XMM_SIZE / sizeof(double)];
+} __attribute__((aligned(16))) rte_xmm_t;
#ifdef __cplusplus
}
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 07/12] app: test: added the new cpu flags of arm64 in test_cpuflags test case
2015-11-03 13:09 ` [dpdk-dev] [PATCH 06/12] eal: arm: ret_vector.h improvements Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 08/12] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
app/test/test_cpuflags.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 557458f..e8d0ce7 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -120,6 +120,32 @@ test_cpuflags(void)
CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
#endif
+#if defined(RTE_ARCH_ARM64)
+ printf("Check for FP:\t\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_FP);
+
+ printf("Check for ASIMD:\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+
+ printf("Check for EVTSTRM:\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_EVTSTRM);
+
+ printf("Check for AES:\t\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_AES);
+
+ printf("Check for PMULL:\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_PMULL);
+
+ printf("Check for SHA1:\t\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_SHA1);
+
+ printf("Check for SHA2:\t\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_SHA2);
+
+ printf("Check for CRC32:\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_CRC32);
+#endif
+
#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
printf("Check for SSE:\t\t");
CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 08/12] acl: arm64: acl implementation using NEON gcc intrinsic
2015-11-03 13:09 ` [dpdk-dev] [PATCH 07/12] app: test: added the new cpu flags of arm64 in test_cpuflags test case Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 09/12] mk: add support for armv8 on top of armv7 Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
verified with testacl and acl_autotest applications on arm64 architecture.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
app/test-acl/main.c | 4 +
lib/librte_acl/Makefile | 5 +
lib/librte_acl/acl.h | 4 +
lib/librte_acl/acl_run_neon.c | 46 +++++++
lib/librte_acl/acl_run_neon.h | 290 ++++++++++++++++++++++++++++++++++++++++++
lib/librte_acl/rte_acl.c | 25 ++++
lib/librte_acl/rte_acl.h | 1 +
7 files changed, 375 insertions(+)
create mode 100644 lib/librte_acl/acl_run_neon.c
create mode 100644 lib/librte_acl/acl_run_neon.h
diff --git a/app/test-acl/main.c b/app/test-acl/main.c
index 72ce83c..0b0c093 100644
--- a/app/test-acl/main.c
+++ b/app/test-acl/main.c
@@ -101,6 +101,10 @@ static const struct acl_alg acl_alg[] = {
.name = "avx2",
.alg = RTE_ACL_CLASSIFY_AVX2,
},
+ {
+ .name = "neon",
+ .alg = RTE_ACL_CLASSIFY_NEON,
+ },
};
static struct {
diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
index 7a1cf8a..27f91d5 100644
--- a/lib/librte_acl/Makefile
+++ b/lib/librte_acl/Makefile
@@ -48,9 +48,14 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
+ifeq ($(CONFIG_RTE_ARCH_ARM64),y)
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_neon.c
+else
SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
+endif
CFLAGS_acl_run_sse.o += -msse4.1
+CFLAGS_acl_run_neon.o += -flax-vector-conversions -Wno-maybe-uninitialized
#
# If the compiler supports AVX2 instructions,
diff --git a/lib/librte_acl/acl.h b/lib/librte_acl/acl.h
index eb4930c..09d6784 100644
--- a/lib/librte_acl/acl.h
+++ b/lib/librte_acl/acl.h
@@ -230,6 +230,10 @@ int
rte_acl_classify_avx2(const struct rte_acl_ctx *ctx, const uint8_t **data,
uint32_t *results, uint32_t num, uint32_t categories);
+int
+rte_acl_classify_neon(const struct rte_acl_ctx *ctx, const uint8_t **data,
+ uint32_t *results, uint32_t num, uint32_t categories);
+
#ifdef __cplusplus
}
#endif /* __cplusplus */
diff --git a/lib/librte_acl/acl_run_neon.c b/lib/librte_acl/acl_run_neon.c
new file mode 100644
index 0000000..b014451
--- /dev/null
+++ b/lib/librte_acl/acl_run_neon.c
@@ -0,0 +1,46 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (C) Cavium networks Ltd. 2015.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#include "acl_run_neon.h"
+
+int
+rte_acl_classify_neon(const struct rte_acl_ctx *ctx, const uint8_t **data,
+ uint32_t *results, uint32_t num, uint32_t categories)
+{
+ if (likely(num >= 8))
+ return search_neon_8(ctx, data, results, num, categories);
+ else if (num >= 4)
+ return search_neon_4(ctx, data, results, num, categories);
+ else
+ return rte_acl_classify_scalar(ctx, data, results, num,
+ categories);
+}
diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
new file mode 100644
index 0000000..4579476
--- /dev/null
+++ b/lib/librte_acl/acl_run_neon.h
@@ -0,0 +1,290 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (C) Cavium networks Ltd. 2015.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#include "acl_run.h"
+#include "acl_vect.h"
+
+struct _neon_acl_const {
+ rte_xmm_t xmm_shuffle_input;
+ rte_xmm_t xmm_index_mask;
+ rte_xmm_t range_base;
+} neon_acl_const __attribute__((aligned(RTE_CACHE_LINE_SIZE))) = {
+ {
+ .u32 = {0x00000000, 0x04040404, 0x08080808, 0x0c0c0c0c}
+ },
+ {
+ .u32 = {RTE_ACL_NODE_INDEX, RTE_ACL_NODE_INDEX,
+ RTE_ACL_NODE_INDEX, RTE_ACL_NODE_INDEX}
+ },
+ {
+ .u32 = {0xffffff00, 0xffffff04, 0xffffff08, 0xffffff0c}
+ },
+};
+
+/*
+ * Resolve priority for multiple results (neon version).
+ * This consists comparing the priority of the current traversal with the
+ * running set of results for the packet.
+ * For each result, keep a running array of the result (rule number) and
+ * its priority for each category.
+ */
+static inline void
+resolve_priority_neon(uint64_t transition, int n, const struct rte_acl_ctx *ctx,
+ struct parms *parms,
+ const struct rte_acl_match_results *p,
+ uint32_t categories)
+{
+ uint32_t x;
+ int32x4_t results, priority, results1, priority1;
+ uint32x4_t selector;
+ int32_t *saved_results, *saved_priority;
+
+ for (x = 0; x < categories; x += RTE_ACL_RESULTS_MULTIPLIER) {
+ saved_results = (int32_t *)(&parms[n].cmplt->results[x]);
+ saved_priority = (int32_t *)(&parms[n].cmplt->priority[x]);
+
+ /* get results and priorities for completed trie */
+ results = vld1q_s32(
+ (const int32_t *)&p[transition].results[x]);
+ priority = vld1q_s32(
+ (const int32_t *)&p[transition].priority[x]);
+
+ /* if this is not the first completed trie */
+ if (parms[n].cmplt->count != ctx->num_tries) {
+ /* get running best results and their priorities */
+ results1 = vld1q_s32(saved_results);
+ priority1 = vld1q_s32(saved_priority);
+
+ /* select results that are highest priority */
+ selector = vcgtq_s32(priority1, priority);
+ results = vbslq_s32(selector, results1, results);
+ priority = vbslq_s32(selector, priority1, priority);
+ }
+
+ /* save running best results and their priorities */
+ vst1q_s32(saved_results, results);
+ vst1q_s32(saved_priority, priority);
+ }
+}
+
+/*
+ * Check for any match in 4 transitions
+ */
+static inline __attribute__((always_inline)) uint32_t
+check_any_match_x4(uint64_t val[])
+{
+ return ((val[0] | val[1] | val[2] | val[3]) & RTE_ACL_NODE_MATCH);
+}
+
+static inline __attribute__((always_inline)) void
+acl_match_check_x4(int slot, const struct rte_acl_ctx *ctx, struct parms *parms,
+ struct acl_flow_data *flows, uint64_t transitions[])
+{
+ while (check_any_match_x4(transitions)) {
+ transitions[0] = acl_match_check(transitions[0], slot, ctx,
+ parms, flows, resolve_priority_neon);
+ transitions[1] = acl_match_check(transitions[1], slot + 1, ctx,
+ parms, flows, resolve_priority_neon);
+ transitions[2] = acl_match_check(transitions[2], slot + 2, ctx,
+ parms, flows, resolve_priority_neon);
+ transitions[3] = acl_match_check(transitions[3], slot + 3, ctx,
+ parms, flows, resolve_priority_neon);
+ }
+}
+
+/*
+ * Process 4 transitions (in 2 NEON Q registers) in parallel
+ */
+static inline __attribute__((always_inline)) int32x4_t
+transition4(int32x4_t next_input, const uint64_t *trans, uint64_t transitions[])
+{
+ int32x4x2_t tr_hi_lo;
+ int32x4_t t, in, r;
+ uint32x4_t index_msk, node_type, addr;
+ uint32x4_t dfa_msk, mask, quad_ofs, dfa_ofs;
+
+ /* Move low 32 into tr_hi_lo.val[0] and high 32 into tr_hi_lo.val[1] */
+ tr_hi_lo = vld2q_s32((const int32_t *)transitions);
+
+ /* Calculate the address (array index) for all 4 transitions. */
+
+ index_msk = vld1q_u32((const uint32_t *)&neon_acl_const.xmm_index_mask);
+
+ /* Calc node type and node addr */
+ node_type = vbicq_s32(tr_hi_lo.val[0], index_msk);
+ addr = vandq_s32(tr_hi_lo.val[0], index_msk);
+
+ /* t = 0 */
+ t = veorq_s32(node_type, node_type);
+
+ /* mask for DFA type(0) nodes */
+ dfa_msk = vceqq_u32(node_type, t);
+
+ mask = vld1q_s32((const int32_t *)&neon_acl_const.xmm_shuffle_input);
+ in = vqtbl1q_u8((uint8x16_t)next_input, (uint8x16_t)mask);
+
+ /* DFA calculations. */
+ r = vshrq_n_u32(in, 30); /* div by 64 */
+ mask = vld1q_s32((const int32_t *)&neon_acl_const.range_base);
+ r = vaddq_u8(r, mask);
+ t = vshrq_n_u32(in, 24);
+ r = vqtbl1q_u8((uint8x16_t)tr_hi_lo.val[1], (uint8x16_t)r);
+ dfa_ofs = vsubq_s32(t, r);
+
+ /* QUAD/SINGLE calculations. */
+ t = vcgtq_s8(in, tr_hi_lo.val[1]);
+ t = vabsq_s8(t);
+ t = vpaddlq_u8(t);
+ quad_ofs = vpaddlq_u16(t);
+
+ /* blend DFA and QUAD/SINGLE. */
+ t = vbslq_u8(dfa_msk, dfa_ofs, quad_ofs);
+
+ /* calculate address for next transitions */
+ addr = vaddq_u32(addr, t);
+
+ /* Fill next transitions */
+ transitions[0] = trans[vgetq_lane_u32(addr, 0)];
+ transitions[1] = trans[vgetq_lane_u32(addr, 1)];
+ transitions[2] = trans[vgetq_lane_u32(addr, 2)];
+ transitions[3] = trans[vgetq_lane_u32(addr, 3)];
+
+ return vshrq_n_u32(next_input, CHAR_BIT);
+}
+
+/*
+ * Execute trie traversal with 8 traversals in parallel
+ */
+static inline int
+search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
+ uint32_t *results, uint32_t total_packets, uint32_t categories)
+{
+ int n;
+ struct acl_flow_data flows;
+ uint64_t index_array[8];
+ struct completion cmplt[8];
+ struct parms parms[8];
+ int32x4_t input0, input1;
+
+ acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+ total_packets, categories, ctx->trans_table);
+
+ for (n = 0; n < 8; n++) {
+ cmplt[n].count = 0;
+ index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+ }
+
+ /* Check for any matches. */
+ acl_match_check_x4(0, ctx, parms, &flows, &index_array[0]);
+ acl_match_check_x4(4, ctx, parms, &flows, &index_array[4]);
+
+ while (flows.started > 0) {
+ /* Gather 4 bytes of input data for each stream. */
+ input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
+ input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
+
+ input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
+ input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
+
+ input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input0, 2);
+ input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 6), input1, 2);
+
+ input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input0, 3);
+ input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 7), input1, 3);
+
+ /* Process the 4 bytes of input on each stream. */
+
+ input0 = transition4(input0, flows.trans, &index_array[0]);
+ input1 = transition4(input1, flows.trans, &index_array[4]);
+
+ input0 = transition4(input0, flows.trans, &index_array[0]);
+ input1 = transition4(input1, flows.trans, &index_array[4]);
+
+ input0 = transition4(input0, flows.trans, &index_array[0]);
+ input1 = transition4(input1, flows.trans, &index_array[4]);
+
+ input0 = transition4(input0, flows.trans, &index_array[0]);
+ input1 = transition4(input1, flows.trans, &index_array[4]);
+
+ /* Check for any matches. */
+ acl_match_check_x4(0, ctx, parms, &flows, &index_array[0]);
+ acl_match_check_x4(4, ctx, parms, &flows, &index_array[4]);
+ }
+
+ return 0;
+}
+
+/*
+ * Execute trie traversal with 4 traversals in parallel
+ */
+static inline int
+search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
+ uint32_t *results, int total_packets, uint32_t categories)
+{
+ int n;
+ struct acl_flow_data flows;
+ uint64_t index_array[4];
+ struct completion cmplt[4];
+ struct parms parms[4];
+ int32x4_t input;
+
+ acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+ total_packets, categories, ctx->trans_table);
+
+ for (n = 0; n < 4; n++) {
+ cmplt[n].count = 0;
+ index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+ }
+
+ /* Check for any matches. */
+ acl_match_check_x4(0, ctx, parms, &flows, index_array);
+
+ while (flows.started > 0) {
+ /* Gather 4 bytes of input data for each stream. */
+ input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
+ input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
+ input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
+ input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
+
+ /* Process the 4 bytes of input on each stream. */
+ input = transition4(input, flows.trans, index_array);
+ input = transition4(input, flows.trans, index_array);
+ input = transition4(input, flows.trans, index_array);
+ input = transition4(input, flows.trans, index_array);
+
+ /* Check for any matches. */
+ acl_match_check_x4(0, ctx, parms, &flows, index_array);
+ }
+
+ return 0;
+}
+
diff --git a/lib/librte_acl/rte_acl.c b/lib/librte_acl/rte_acl.c
index d60219f..e2fdebd 100644
--- a/lib/librte_acl/rte_acl.c
+++ b/lib/librte_acl/rte_acl.c
@@ -55,11 +55,32 @@ rte_acl_classify_avx2(__rte_unused const struct rte_acl_ctx *ctx,
return -ENOTSUP;
}
+int __attribute__ ((weak))
+rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
+ __rte_unused const uint8_t **data,
+ __rte_unused uint32_t *results,
+ __rte_unused uint32_t num,
+ __rte_unused uint32_t categories)
+{
+ return -ENOTSUP;
+}
+
+int __attribute__ ((weak))
+rte_acl_classify_neon(__rte_unused const struct rte_acl_ctx *ctx,
+ __rte_unused const uint8_t **data,
+ __rte_unused uint32_t *results,
+ __rte_unused uint32_t num,
+ __rte_unused uint32_t categories)
+{
+ return -ENOTSUP;
+}
+
static const rte_acl_classify_t classify_fns[] = {
[RTE_ACL_CLASSIFY_DEFAULT] = rte_acl_classify_scalar,
[RTE_ACL_CLASSIFY_SCALAR] = rte_acl_classify_scalar,
[RTE_ACL_CLASSIFY_SSE] = rte_acl_classify_sse,
[RTE_ACL_CLASSIFY_AVX2] = rte_acl_classify_avx2,
+ [RTE_ACL_CLASSIFY_NEON] = rte_acl_classify_neon,
};
/* by default, use always available scalar code path. */
@@ -93,6 +114,9 @@ rte_acl_init(void)
{
enum rte_acl_classify_alg alg = RTE_ACL_CLASSIFY_DEFAULT;
+#ifdef RTE_ARCH_ARM64
+ alg = RTE_ACL_CLASSIFY_NEON;
+#else
#ifdef CC_AVX2_SUPPORT
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
alg = RTE_ACL_CLASSIFY_AVX2;
@@ -102,6 +126,7 @@ rte_acl_init(void)
#endif
alg = RTE_ACL_CLASSIFY_SSE;
+#endif
rte_acl_set_default_classify(alg);
}
diff --git a/lib/librte_acl/rte_acl.h b/lib/librte_acl/rte_acl.h
index 98ef2fc..0979a09 100644
--- a/lib/librte_acl/rte_acl.h
+++ b/lib/librte_acl/rte_acl.h
@@ -270,6 +270,7 @@ enum rte_acl_classify_alg {
RTE_ACL_CLASSIFY_SCALAR = 1, /**< generic implementation. */
RTE_ACL_CLASSIFY_SSE = 2, /**< requires SSE4.1 support. */
RTE_ACL_CLASSIFY_AVX2 = 3, /**< requires AVX2 support. */
+ RTE_ACL_CLASSIFY_NEON = 4, /**< requires NEON support. */
RTE_ACL_CLASSIFY_NUM /* should always be the last one. */
};
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 09/12] mk: add support for armv8 on top of armv7
2015-11-03 13:09 ` [dpdk-dev] [PATCH 08/12] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 10/12] mk: add support for thunderx machine target based armv8-a Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++
mk/arch/arm64/rte.vars.mk | 58 ++++++++++++++++++++++++++++++
mk/machine/armv8a/rte.vars.mk | 58 ++++++++++++++++++++++++++++++
3 files changed, 172 insertions(+)
create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
create mode 100644 mk/arch/arm64/rte.vars.mk
create mode 100644 mk/machine/armv8a/rte.vars.mk
diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc
new file mode 100644
index 0000000..6ea38a5
--- /dev/null
+++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
@@ -0,0 +1,56 @@
+# BSD LICENSE
+#
+# Copyright (C) Cavium networks 2015. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Cavium networks nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv8a"
+
+CONFIG_RTE_ARCH="arm64"
+CONFIG_RTE_ARCH_ARM64=y
+CONFIG_RTE_ARCH_64=y
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_FORCE_INTRINSICS=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+CONFIG_RTE_CACHE_LINE_SIZE=64
+
+CONFIG_RTE_IXGBE_INC_VECTOR=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_IVSHMEM=n
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
+
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_TABLE=n
+CONFIG_RTE_LIBRTE_PIPELINE=n
+
diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk
new file mode 100644
index 0000000..32e3a5f
--- /dev/null
+++ b/mk/arch/arm64/rte.vars.mk
@@ -0,0 +1,58 @@
+# BSD LICENSE
+#
+# Copyright (C) Cavium networks 2015. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Cavium networks nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# arch:
+#
+# - define ARCH variable (overridden by cmdline or by previous
+# optional define in machine .mk)
+# - define CROSS variable (overridden by cmdline or previous define
+# in machine .mk)
+# - define CPU_CFLAGS variable (overridden by cmdline or previous
+# define in machine .mk)
+# - define CPU_LDFLAGS variable (overridden by cmdline or previous
+# define in machine .mk)
+# - define CPU_ASFLAGS variable (overridden by cmdline or previous
+# define in machine .mk)
+# - may override any previously defined variable
+#
+# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32
+#
+
+ARCH ?= arm64
+# common arch dir in eal headers
+ARCH_DIR := arm
+CROSS ?=
+
+CPU_CFLAGS ?=
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
new file mode 100644
index 0000000..bdf8c6b
--- /dev/null
+++ b/mk/machine/armv8a/rte.vars.mk
@@ -0,0 +1,58 @@
+# BSD LICENSE
+#
+# Copyright (C) Cavium networks 2015. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Cavium networks nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#
+# machine:
+#
+# - can define ARCH variable (overridden by cmdline value)
+# - can define CROSS variable (overridden by cmdline value)
+# - define MACHINE_CFLAGS variable (overridden by cmdline value)
+# - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+# - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+# - can define CPU_CFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+MACHINE_CFLAGS += -march=armv8-a -DRTE_CACHE_LINE_SIZE=64
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 10/12] mk: add support for thunderx machine target based armv8-a
2015-11-03 13:09 ` [dpdk-dev] [PATCH 09/12] mk: add support for armv8 on top of armv7 Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 11/12] updated release note for armv8 support for DPDK 2.2 Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
Created the new thunderx machine target to address difference
in "cache line size" and "-mcpu=thunderx" vs default armv8-a machine target
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
config/defconfig_arm64-thunderx-linuxapp-gcc | 56 +++++++++++++++++++++++++++
mk/machine/thunderx/rte.vars.mk | 58 ++++++++++++++++++++++++++++
2 files changed, 114 insertions(+)
create mode 100644 config/defconfig_arm64-thunderx-linuxapp-gcc
create mode 100644 mk/machine/thunderx/rte.vars.mk
diff --git a/config/defconfig_arm64-thunderx-linuxapp-gcc b/config/defconfig_arm64-thunderx-linuxapp-gcc
new file mode 100644
index 0000000..e8fccc7
--- /dev/null
+++ b/config/defconfig_arm64-thunderx-linuxapp-gcc
@@ -0,0 +1,56 @@
+# BSD LICENSE
+#
+# Copyright (C) Cavium networks 2015. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Cavium networks nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="thunderx"
+
+CONFIG_RTE_ARCH="arm64"
+CONFIG_RTE_ARCH_ARM64=y
+CONFIG_RTE_ARCH_64=y
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_FORCE_INTRINSICS=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+CONFIG_RTE_CACHE_LINE_SIZE=128
+
+CONFIG_RTE_IXGBE_INC_VECTOR=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_IVSHMEM=n
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
+
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_TABLE=n
+CONFIG_RTE_LIBRTE_PIPELINE=n
+
diff --git a/mk/machine/thunderx/rte.vars.mk b/mk/machine/thunderx/rte.vars.mk
new file mode 100644
index 0000000..e49f9e1
--- /dev/null
+++ b/mk/machine/thunderx/rte.vars.mk
@@ -0,0 +1,58 @@
+# BSD LICENSE
+#
+# Copyright (C) Cavium networks 2015. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Cavium networks nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#
+# machine:
+#
+# - can define ARCH variable (overridden by cmdline value)
+# - can define CROSS variable (overridden by cmdline value)
+# - define MACHINE_CFLAGS variable (overridden by cmdline value)
+# - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+# - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+# - can define CPU_CFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - may override any previously defined variable
+#
+
+# ARCH =
+CROSS ?= aarch64-thunderx-linux-gnu-
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+MACHINE_CFLAGS += -march=armv8-a -mcpu=thunderx -DRTE_CACHE_LINE_SIZE=128
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 11/12] updated release note for armv8 support for DPDK 2.2
2015-11-03 13:09 ` [dpdk-dev] [PATCH 10/12] mk: add support for thunderx machine target based armv8-a Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 12/12] maintainers: claim responsibility for ARMv8 Jerin Jacob
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
doc/guides/rel_notes/release_2_2.rst | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index 43a3a3c..a3587a2 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -23,10 +23,11 @@ New Features
* **Added vhost-user multiple queue support.**
-* **Introduce ARMv7 architecture**
+* **Introduce ARMv7 and ARMv8 architectures**
- It is now possible to build DPDK for the ARMv7 platform and test with
- virtual PMD drivers.
+ * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms.
+ * ARMv7 can be tested with virtual PMD drivers.
+ * ARMv8 can be tested with virtual and physical PMD drivers.
Resolved Issues
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 12/12] maintainers: claim responsibility for ARMv8
2015-11-03 13:09 ` [dpdk-dev] [PATCH 11/12] updated release note for armv8 support for DPDK 2.2 Jerin Jacob
@ 2015-11-03 13:09 ` Jerin Jacob
0 siblings, 0 replies; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw)
To: dev
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
MAINTAINERS | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index a8933eb..c44b328 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -128,6 +128,11 @@ ARM v7
M: Jan Viktorin <viktorin@rehivetech.com>
F: lib/librte_eal/common/include/arch/arm/
+ARM v8
+M: Jerin Jacob <jerin.jacob@caviumnetworks.com>
+F: lib/librte_eal/common/include/arch/arm/*_64.h
+F: lib/librte_acl/acl_run_neon.*
+
Intel x86
M: Bruce Richardson <bruce.richardson@intel.com>
M: Konstantin Ananyev <konstantin.ananyev@intel.com>
--
2.1.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 00/12] DPDK armv8-a support
2015-11-03 13:09 [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob
@ 2015-11-03 14:17 ` Hunt, David
2015-11-03 16:38 ` Jerin Jacob
1 sibling, 1 reply; 16+ messages in thread
From: Hunt, David @ 2015-11-03 14:17 UTC (permalink / raw)
To: Jerin Jacob, dev
On 03/11/2015 13:09, Jerin Jacob wrote:
> This is the v1 patchset for ARMv8 that now sits on top of the v6 patch
> of the ARMv7 code by RehiveTech. It adds code into the same arm include
> directory, reducing code duplication.
>
> Tested on an ThunderX arm 64-bit arm server board, with PCI slots. Passes traffic
> between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should
> work with many other NICS as long as long as there is no unaligned access to
> device memory but not yet untested.
I have your patchset building and running on an X-Gene based 8-core
MP30AR0 system, passing traffic between two ports on and 82599 also.
> Notes on arm64 kernel configuration:
>
> Tested on using Ubuntu 14.04 LTS with a 3.18 kernel and igb_uio.
> ARM64 kernels does not have functional resource mapping of PCI memory
> (PCI_MMAP), so the pci driver needs to be patched to enable this. The
> symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is
> missing the resource0...N files for mmapping the device memory.
>
> Following patch fixes the PCI resource mapping issue om armv8.
> Its not yet up streamed.We are in the process of up streaming it.
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358906.html
Good to see that there's a patch on the way for this. That fix looks
almost exactly the same as the hack I did to my kernel :)
I had a couple of small issues when patching/building:
1. Three of the files had an extra blank line at the end. Maybe worth
running checkpatch on the patches. 'git am' was complaining.
2. I had problems compiling two drivers because they were attempting to
include tmmintrin.h:
...dpdk/drivers/net/fm10k/fm10k_rxtx_vec.c:41:23: fatal error:
tmmintrin.h: No such file or directory
...dpdk/drivers/net/i40e/i40e_rxtx_vec.c:43:23: fatal error:
tmmintrin.h: No such file or directory
To avoid this, I added the following two lines into
defconfig_arm64-armv8a-linuxapp-gcc
CONFIG_RTE_LIBRTE_FM10K_PMD=n
CONFIG_RTE_LIBRTE_I40E_PMD=n
and then it built fine, and I can run testpmd with my 82599's and run
autotests.
Thanks for that.
Dave.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 00/12] DPDK armv8-a support
2015-11-03 14:17 ` [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Hunt, David
@ 2015-11-03 16:38 ` Jerin Jacob
2015-11-05 14:39 ` Hunt, David
0 siblings, 1 reply; 16+ messages in thread
From: Jerin Jacob @ 2015-11-03 16:38 UTC (permalink / raw)
To: Hunt, David; +Cc: dev
On Tue, Nov 03, 2015 at 02:17:38PM +0000, Hunt, David wrote:
> On 03/11/2015 13:09, Jerin Jacob wrote:
> >This is the v1 patchset for ARMv8 that now sits on top of the v6 patch
> >of the ARMv7 code by RehiveTech. It adds code into the same arm include
> >directory, reducing code duplication.
> >
> >Tested on an ThunderX arm 64-bit arm server board, with PCI slots. Passes traffic
> >between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should
> >work with many other NICS as long as long as there is no unaligned access to
> >device memory but not yet untested.
>
> I have your patchset building and running on an X-Gene based 8-core MP30AR0
> system, passing traffic between two ports on and 82599 also.
>
Thanks.
> >Notes on arm64 kernel configuration:
> >
> > Tested on using Ubuntu 14.04 LTS with a 3.18 kernel and igb_uio.
> > ARM64 kernels does not have functional resource mapping of PCI memory
> > (PCI_MMAP), so the pci driver needs to be patched to enable this. The
> > symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is
> > missing the resource0...N files for mmapping the device memory.
> >
> > Following patch fixes the PCI resource mapping issue om armv8.
> > Its not yet up streamed.We are in the process of up streaming it.
> >
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358906.html
>
> Good to see that there's a patch on the way for this. That fix looks almost
> exactly the same as the hack I did to my kernel :)
>
> I had a couple of small issues when patching/building:
>
> 1. Three of the files had an extra blank line at the end. Maybe worth
> running checkpatch on the patches. 'git am' was complaining.
I will fix it in next version.
>
> 2. I had problems compiling two drivers because they were attempting to
> include tmmintrin.h:
>
> ...dpdk/drivers/net/fm10k/fm10k_rxtx_vec.c:41:23: fatal error: tmmintrin.h:
> No such file or directory
>
> ...dpdk/drivers/net/i40e/i40e_rxtx_vec.c:43:23: fatal error: tmmintrin.h: No
> such file or directory
>
> To avoid this, I added the following two lines into
> defconfig_arm64-armv8a-linuxapp-gcc
>
> CONFIG_RTE_LIBRTE_FM10K_PMD=n
> CONFIG_RTE_LIBRTE_I40E_PMD=n
the patch was based on 82fb702077f67585d64a07de0080e5cb6a924a72 which
don't have these changes. I will add these in next version.
> and then it built fine, and I can run testpmd with my 82599's and run
> autotests.
I ran autotest, "Mbuf autotest" stress failure is due strong vs weak ordering
issue. I will send the next version based on new patch being discussed
on ml.
>
> Thanks for that.
> Dave.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 00/12] DPDK armv8-a support
2015-11-03 16:38 ` Jerin Jacob
@ 2015-11-05 14:39 ` Hunt, David
0 siblings, 0 replies; 16+ messages in thread
From: Hunt, David @ 2015-11-05 14:39 UTC (permalink / raw)
To: Jerin Jacob; +Cc: dev
On 03/11/2015 16:38, Jerin Jacob wrote:
> On Tue, Nov 03, 2015 at 02:17:38PM +0000, Hunt, David wrote:
--snip--
>> and then it built fine, and I can run testpmd with my 82599's and run
>> autotests.
>
> I ran autotest, "Mbuf autotest" stress failure is due strong vs weak ordering
> issue. I will send the next version based on new patch being discussed
> on ml.
Jerin,
I've marked my patch-set for the armv8 support as superseded in
PatchWork. I'm happy for your patch-set to take precedence.
If you're uploading another rev, I'll be sure to give it a test on my
X-Gene board.
Dave.
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2015-11-05 14:39 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-03 13:09 [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 02/12] eal: arm64: add armv8-a version of rte_cpuflags_64.h Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 03/12] eal: arm64: add armv8-a version of rte_prefetch_64.h Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 04/12] eal: arm64: add armv8-a version of rte_cycles_64.h Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 05/12] eal: arm64: rte_memcpy_64.h version based libc memcpy Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 06/12] eal: arm: ret_vector.h improvements Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 07/12] app: test: added the new cpu flags of arm64 in test_cpuflags test case Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 08/12] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 09/12] mk: add support for armv8 on top of armv7 Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 10/12] mk: add support for thunderx machine target based armv8-a Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 11/12] updated release note for armv8 support for DPDK 2.2 Jerin Jacob
2015-11-03 13:09 ` [dpdk-dev] [PATCH 12/12] maintainers: claim responsibility for ARMv8 Jerin Jacob
2015-11-03 14:17 ` [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Hunt, David
2015-11-03 16:38 ` Jerin Jacob
2015-11-05 14:39 ` Hunt, David
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).