* [dpdk-dev] [PATCH 00/12] DPDK armv8-a support @ 2015-11-03 13:09 Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob 2015-11-03 14:17 ` [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Hunt, David 0 siblings, 2 replies; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev This is the v1 patchset for ARMv8 that now sits on top of the v6 patch of the ARMv7 code by RehiveTech. It adds code into the same arm include directory, reducing code duplication. Tested on an ThunderX arm 64-bit arm server board, with PCI slots. Passes traffic between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should work with many other NICS as long as long as there is no unaligned access to device memory but not yet untested. Compiles igb_uio, kni and all the physical device PMDs. An entry has been added to the Release notes. NOTE: Part of the work has been taken from David Hunt's v3 patch who was initiated the armv8 port. Notes on arm64 kernel configuration: Tested on using Ubuntu 14.04 LTS with a 3.18 kernel and igb_uio. ARM64 kernels does not have functional resource mapping of PCI memory (PCI_MMAP), so the pci driver needs to be patched to enable this. The symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is missing the resource0...N files for mmapping the device memory. Following patch fixes the PCI resource mapping issue om armv8. Its not yet up streamed.We are in the process of up streaming it. http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358906.html Jerin Jacob (12): eal: arm64: add armv8-a version of rte_atomic_64.h eal: arm64: add armv8-a version of rte_cpuflags_64.h eal: arm64: add armv8-a version of rte_prefetch_64.h eal: arm64: add armv8-a version of rte_cycles_64.h eal: arm64: rte_memcpy_64.h version based libc memcpy eal: arm: ret_vector.h improvements app: test: added the new cpu flags of arm64 in test_cpuflags test case acl: arm64: acl implementation using NEON gcc intrinsic mk: add support for armv8 on top of armv7 mk: add support for thunderx machine target based armv8-a updated release note for armv8 support for DPDK 2.2 maintainers: claim responsibility for ARMv8 MAINTAINERS | 5 + app/test-acl/main.c | 4 + app/test/test_cpuflags.c | 26 ++ config/defconfig_arm64-armv8a-linuxapp-gcc | 56 ++++ config/defconfig_arm64-thunderx-linuxapp-gcc | 56 ++++ doc/guides/rel_notes/release_2_2.rst | 7 +- lib/librte_acl/Makefile | 5 + lib/librte_acl/acl.h | 4 + lib/librte_acl/acl_run_neon.c | 46 ++++ lib/librte_acl/acl_run_neon.h | 290 +++++++++++++++++++++ lib/librte_acl/rte_acl.c | 25 ++ lib/librte_acl/rte_acl.h | 1 + .../common/include/arch/arm/rte_atomic.h | 4 + .../common/include/arch/arm/rte_atomic_64.h | 88 +++++++ .../common/include/arch/arm/rte_cpuflags.h | 4 + .../common/include/arch/arm/rte_cpuflags_64.h | 152 +++++++++++ .../common/include/arch/arm/rte_cycles.h | 4 + .../common/include/arch/arm/rte_cycles_64.h | 71 +++++ .../common/include/arch/arm/rte_memcpy.h | 4 + .../common/include/arch/arm/rte_memcpy_64.h | 93 +++++++ .../common/include/arch/arm/rte_prefetch.h | 4 + .../common/include/arch/arm/rte_prefetch_64.h | 61 +++++ lib/librte_eal/common/include/arch/arm/rte_vect.h | 58 ++--- mk/arch/arm64/rte.vars.mk | 58 +++++ mk/machine/armv8a/rte.vars.mk | 58 +++++ mk/machine/thunderx/rte.vars.mk | 58 +++++ 26 files changed, 1198 insertions(+), 44 deletions(-) create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc create mode 100644 config/defconfig_arm64-thunderx-linuxapp-gcc create mode 100644 lib/librte_acl/acl_run_neon.c create mode 100644 lib/librte_acl/acl_run_neon.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic_64.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h create mode 100644 mk/arch/arm64/rte.vars.mk create mode 100644 mk/machine/armv8a/rte.vars.mk create mode 100644 mk/machine/thunderx/rte.vars.mk -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h 2015-11-03 13:09 [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 02/12] eal: arm64: add armv8-a version of rte_cpuflags_64.h Jerin Jacob 2015-11-03 14:17 ` [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Hunt, David 1 sibling, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev except rte_?wb() functions other functions are used from RTE_FORCE_INTRINSICS=y scheme Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- .../common/include/arch/arm/rte_atomic.h | 4 + .../common/include/arch/arm/rte_atomic_64.h | 88 ++++++++++++++++++++++ 2 files changed, 92 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h index f4f5783..f3f3b6e 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_atomic.h +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h @@ -33,6 +33,10 @@ #ifndef _RTE_ATOMIC_ARM_H_ #define _RTE_ATOMIC_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_atomic_64.h> +#else #include <rte_atomic_32.h> +#endif #endif /* _RTE_ATOMIC_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h new file mode 100644 index 0000000..671caa7 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h @@ -0,0 +1,88 @@ +/* + * BSD LICENSE + * + * Copyright (C) Cavium networks Ltd. 2015. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Cavium networks nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef _RTE_ATOMIC_ARM64_H_ +#define _RTE_ATOMIC_ARM64_H_ + +#ifndef RTE_FORCE_INTRINSICS +# error Platform must be built with CONFIG_RTE_FORCE_INTRINSICS +#endif + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_atomic.h" + +#define dmb(opt) do { asm volatile("dmb " #opt : : : "memory"); } while (0) + +/** + * General memory barrier. + * + * Guarantees that the LOAD and STORE operations generated before the + * barrier occur before the LOAD and STORE operations generated after. + * This function is architecture dependent. + */ +static inline void rte_mb(void) +{ + dmb(ish); +} + +/** + * Write memory barrier. + * + * Guarantees that the STORE operations generated before the barrier + * occur before the STORE operations generated after. + * This function is architecture dependent. + */ +static inline void rte_wmb(void) +{ + dmb(ishst); +} + +/** + * Read memory barrier. + * + * Guarantees that the LOAD operations generated before the barrier + * occur before the LOAD operations generated after. + * This function is architecture dependent. + */ +static inline void rte_rmb(void) +{ + dmb(ishld); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_ATOMIC_ARM64_H_ */ -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 02/12] eal: arm64: add armv8-a version of rte_cpuflags_64.h 2015-11-03 13:09 ` [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 03/12] eal: arm64: add armv8-a version of rte_prefetch_64.h Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- .../common/include/arch/arm/rte_cpuflags.h | 4 + .../common/include/arch/arm/rte_cpuflags_64.h | 152 +++++++++++++++++++++ 2 files changed, 156 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h index 8de78d2..b8f6288 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h +++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h @@ -33,6 +33,10 @@ #ifndef _RTE_CPUFLAGS_ARM_H_ #define _RTE_CPUFLAGS_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_cpuflags_64.h> +#else #include <rte_cpuflags_32.h> +#endif #endif /* _RTE_CPUFLAGS_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h new file mode 100644 index 0000000..7bcc12f --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h @@ -0,0 +1,152 @@ +/* + * BSD LICENSE + * + * Copyright (C) Cavium networks Ltd. 2015. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Cavium networks nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_CPUFLAGS_ARM64_H_ +#define _RTE_CPUFLAGS_ARM64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include <elf.h> +#include <fcntl.h> +#include <assert.h> +#include <unistd.h> +#include <string.h> + +#include "generic/rte_cpuflags.h" + +#ifndef AT_HWCAP +#define AT_HWCAP 16 +#endif + +#ifndef AT_HWCAP2 +#define AT_HWCAP2 26 +#endif + +#ifndef AT_PLATFORM +#define AT_PLATFORM 15 +#endif + +/* software based registers */ +enum cpu_register_t { + REG_HWCAP = 0, + REG_HWCAP2, + REG_PLATFORM, +}; + +/** + * Enumeration of all CPU features supported + */ +enum rte_cpu_flag_t { + RTE_CPUFLAG_FP = 0, + RTE_CPUFLAG_NEON, + RTE_CPUFLAG_EVTSTRM, + RTE_CPUFLAG_AES, + RTE_CPUFLAG_PMULL, + RTE_CPUFLAG_SHA1, + RTE_CPUFLAG_SHA2, + RTE_CPUFLAG_CRC32, + RTE_CPUFLAG_AARCH64, + /* The last item */ + RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */ +}; + +static const struct feature_entry cpu_feature_table[] = { + FEAT_DEF(FP, 0x00000001, 0, REG_HWCAP, 0) + FEAT_DEF(NEON, 0x00000001, 0, REG_HWCAP, 1) + FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 2) + FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP, 3) + FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP, 4) + FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP, 5) + FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP, 6) + FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP, 7) + FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1) +}; + +/* + * Read AUXV software register and get cpu features for ARM + */ +static inline void +rte_cpu_get_features(__attribute__((unused)) uint32_t leaf, + __attribute__((unused)) uint32_t subleaf, + cpuid_registers_t out) +{ + int auxv_fd; + Elf64_auxv_t auxv; + + auxv_fd = open("/proc/self/auxv", O_RDONLY); + assert(auxv_fd); + while (read(auxv_fd, &auxv, + sizeof(Elf64_auxv_t)) == sizeof(Elf64_auxv_t)) { + if (auxv.a_type == AT_HWCAP) { + out[REG_HWCAP] = auxv.a_un.a_val; + } else if (auxv.a_type == AT_HWCAP2) { + out[REG_HWCAP2] = auxv.a_un.a_val; + } else if (auxv.a_type == AT_PLATFORM) { + if (!strcmp((const char *)auxv.a_un.a_val, "aarch64")) + out[REG_PLATFORM] = 0x0001; + } + } +} + +/* + * Checks if a particular flag is available on current machine. + */ +static inline int +rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature) +{ + const struct feature_entry *feat; + cpuid_registers_t regs = {0}; + + if (feature >= RTE_CPUFLAG_NUMFLAGS) + /* Flag does not match anything in the feature tables */ + return -ENOENT; + + feat = &cpu_feature_table[feature]; + + if (!feat->leaf) + /* This entry in the table wasn't filled out! */ + return -EFAULT; + + /* get the cpuid leaf containing the desired feature */ + rte_cpu_get_features(feat->leaf, feat->subleaf, regs); + + /* check if the feature is enabled */ + return (regs[feat->reg] >> feat->bit) & 1; +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_CPUFLAGS_ARM64_H_ */ -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 03/12] eal: arm64: add armv8-a version of rte_prefetch_64.h 2015-11-03 13:09 ` [dpdk-dev] [PATCH 02/12] eal: arm64: add armv8-a version of rte_cpuflags_64.h Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 04/12] eal: arm64: add armv8-a version of rte_cycles_64.h Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- .../common/include/arch/arm/rte_prefetch.h | 4 ++ .../common/include/arch/arm/rte_prefetch_64.h | 61 ++++++++++++++++++++++ 2 files changed, 65 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h index 1f46697..aa37de5 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h @@ -33,6 +33,10 @@ #ifndef _RTE_PREFETCH_ARM_H_ #define _RTE_PREFETCH_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_prefetch_64.h> +#else #include <rte_prefetch_32.h> +#endif #endif /* _RTE_PREFETCH_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h new file mode 100644 index 0000000..f9cc62e --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h @@ -0,0 +1,61 @@ +/* + * BSD LICENSE + * + * Copyright (C) Cavium networks Ltd. 2015. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Cavium networks nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_PREFETCH_ARM_64_H_ +#define _RTE_PREFETCH_ARM_64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_prefetch.h" + +static inline void rte_prefetch0(const volatile void *p) +{ + asm volatile ("PRFM PLDL1KEEP, [%0]" : : "r" (p)); +} + +static inline void rte_prefetch1(const volatile void *p) +{ + asm volatile ("PRFM PLDL2KEEP, [%0]" : : "r" (p)); +} + +static inline void rte_prefetch2(const volatile void *p) +{ + asm volatile ("PRFM PLDL3KEEP, [%0]" : : "r" (p)); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PREFETCH_ARM_64_H_ */ -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 04/12] eal: arm64: add armv8-a version of rte_cycles_64.h 2015-11-03 13:09 ` [dpdk-dev] [PATCH 03/12] eal: arm64: add armv8-a version of rte_prefetch_64.h Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 05/12] eal: arm64: rte_memcpy_64.h version based libc memcpy Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev cntcvt_el0 ticks are not based on cpu clk unlike rdtsc in x86. Its a fixed clock running based at constant speed. Though its a armv8-a implementer choice, typically it runs at 50 or 100 MHz Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- .../common/include/arch/arm/rte_cycles.h | 4 ++ .../common/include/arch/arm/rte_cycles_64.h | 71 ++++++++++++++++++++++ 2 files changed, 75 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h index b2372fa..a8009a0 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h @@ -33,6 +33,10 @@ #ifndef _RTE_CYCLES_ARM_H_ #define _RTE_CYCLES_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_cycles_64.h> +#else #include <rte_cycles_32.h> +#endif #endif /* _RTE_CYCLES_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h new file mode 100644 index 0000000..14f2612 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h @@ -0,0 +1,71 @@ +/* + * BSD LICENSE + * + * Copyright (C) Cavium networks Ltd. 2015. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Cavium networks nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef _RTE_CYCLES_ARM64_H_ +#define _RTE_CYCLES_ARM64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_cycles.h" + +/** + * Read the time base register. + * + * @return + * The time base for this lcore. + */ +static inline uint64_t +rte_rdtsc(void) +{ + uint64_t tsc; + + asm volatile("mrs %0, cntvct_el0" : "=r" (tsc)); + return tsc; +} + +static inline uint64_t +rte_rdtsc_precise(void) +{ + rte_mb(); + return rte_rdtsc(); +} + +static inline uint64_t +rte_get_tsc_cycles(void) { return rte_rdtsc(); } + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_CYCLES_ARM64_H_ */ -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 05/12] eal: arm64: rte_memcpy_64.h version based libc memcpy 2015-11-03 13:09 ` [dpdk-dev] [PATCH 04/12] eal: arm64: add armv8-a version of rte_cycles_64.h Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 06/12] eal: arm: ret_vector.h improvements Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- .../common/include/arch/arm/rte_memcpy.h | 4 + .../common/include/arch/arm/rte_memcpy_64.h | 93 ++++++++++++++++++++++ 2 files changed, 97 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h index d9f5bf1..1d562c3 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h @@ -33,6 +33,10 @@ #ifndef _RTE_MEMCPY_ARM_H_ #define _RTE_MEMCPY_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_memcpy_64.h> +#else #include <rte_memcpy_32.h> +#endif #endif /* _RTE_MEMCPY_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h new file mode 100644 index 0000000..917cdc1 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h @@ -0,0 +1,93 @@ +/* + * BSD LICENSE + * + * Copyright (C) Cavium networks Ltd. 2015. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Cavium networks nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef _RTE_MEMCPY_ARM64_H_ +#define _RTE_MEMCPY_ARM64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include <stdint.h> +#include <string.h> + +#include "generic/rte_memcpy.h" + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 16); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 32); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 48); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 64); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 128); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 256); +} + +#define rte_memcpy(d, s, n) memcpy((d), (s), (n)) + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_MEMCPY_ARM_64_H_ */ -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 06/12] eal: arm: ret_vector.h improvements 2015-11-03 13:09 ` [dpdk-dev] [PATCH 05/12] eal: arm64: rte_memcpy_64.h version based libc memcpy Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 07/12] app: test: added the new cpu flags of arm64 in test_cpuflags test case Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev added the definition of rte_xmm and xmm_t for acl noen implementation. removed the emulated _mm_* functions Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- lib/librte_eal/common/include/arch/arm/rte_vect.h | 58 +++++++---------------- 1 file changed, 17 insertions(+), 41 deletions(-) diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h index 7d5de97..21cdb4d 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2015 RehiveTech. All rights reserved. + * Copyright(c) 2015 Cavium Networks. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -13,7 +13,7 @@ * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. - * * Neither the name of RehiveTech nor the names of its + * * Neither the name of Cavium Networks nor the names of its * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * @@ -33,49 +33,25 @@ #ifndef _RTE_VECT_ARM_H_ #define _RTE_VECT_ARM_H_ +#include "arm_neon.h" + #ifdef __cplusplus extern "C" { #endif -#define XMM_SIZE 16 -#define XMM_MASK (XMM_MASK - 1) - -typedef struct { - union uint128 { - uint8_t uint8[16]; - uint32_t uint32[4]; - } val; -} __m128i; - -static inline __m128i -_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3) -{ - __m128i res; - - res.val.uint32[0] = v0; - res.val.uint32[1] = v1; - res.val.uint32[2] = v2; - res.val.uint32[3] = v3; - return res; -} - -static inline __m128i -_mm_loadu_si128(__m128i *v) -{ - __m128i res; - - res = *v; - return res; -} - -static inline __m128i -_mm_load_si128(__m128i *v) -{ - __m128i res; - - res = *v; - return res; -} +typedef int32x4_t xmm_t; + +#define XMM_SIZE (sizeof(xmm_t)) +#define XMM_MASK (XMM_SIZE - 1) + +typedef union rte_xmm { + xmm_t x; + uint8_t u8[XMM_SIZE / sizeof(uint8_t)]; + uint16_t u16[XMM_SIZE / sizeof(uint16_t)]; + uint32_t u32[XMM_SIZE / sizeof(uint32_t)]; + uint64_t u64[XMM_SIZE / sizeof(uint64_t)]; + double pd[XMM_SIZE / sizeof(double)]; +} __attribute__((aligned(16))) rte_xmm_t; #ifdef __cplusplus } -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 07/12] app: test: added the new cpu flags of arm64 in test_cpuflags test case 2015-11-03 13:09 ` [dpdk-dev] [PATCH 06/12] eal: arm: ret_vector.h improvements Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 08/12] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- app/test/test_cpuflags.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c index 557458f..e8d0ce7 100644 --- a/app/test/test_cpuflags.c +++ b/app/test/test_cpuflags.c @@ -120,6 +120,32 @@ test_cpuflags(void) CHECK_FOR_FLAG(RTE_CPUFLAG_NEON); #endif +#if defined(RTE_ARCH_ARM64) + printf("Check for FP:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_FP); + + printf("Check for ASIMD:\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_NEON); + + printf("Check for EVTSTRM:\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_EVTSTRM); + + printf("Check for AES:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_AES); + + printf("Check for PMULL:\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_PMULL); + + printf("Check for SHA1:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_SHA1); + + printf("Check for SHA2:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_SHA2); + + printf("Check for CRC32:\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_CRC32); +#endif + #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) printf("Check for SSE:\t\t"); CHECK_FOR_FLAG(RTE_CPUFLAG_SSE); -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 08/12] acl: arm64: acl implementation using NEON gcc intrinsic 2015-11-03 13:09 ` [dpdk-dev] [PATCH 07/12] app: test: added the new cpu flags of arm64 in test_cpuflags test case Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 09/12] mk: add support for armv8 on top of armv7 Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev verified with testacl and acl_autotest applications on arm64 architecture. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- app/test-acl/main.c | 4 + lib/librte_acl/Makefile | 5 + lib/librte_acl/acl.h | 4 + lib/librte_acl/acl_run_neon.c | 46 +++++++ lib/librte_acl/acl_run_neon.h | 290 ++++++++++++++++++++++++++++++++++++++++++ lib/librte_acl/rte_acl.c | 25 ++++ lib/librte_acl/rte_acl.h | 1 + 7 files changed, 375 insertions(+) create mode 100644 lib/librte_acl/acl_run_neon.c create mode 100644 lib/librte_acl/acl_run_neon.h diff --git a/app/test-acl/main.c b/app/test-acl/main.c index 72ce83c..0b0c093 100644 --- a/app/test-acl/main.c +++ b/app/test-acl/main.c @@ -101,6 +101,10 @@ static const struct acl_alg acl_alg[] = { .name = "avx2", .alg = RTE_ACL_CLASSIFY_AVX2, }, + { + .name = "neon", + .alg = RTE_ACL_CLASSIFY_NEON, + }, }; static struct { diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile index 7a1cf8a..27f91d5 100644 --- a/lib/librte_acl/Makefile +++ b/lib/librte_acl/Makefile @@ -48,9 +48,14 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c +ifeq ($(CONFIG_RTE_ARCH_ARM64),y) +SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_neon.c +else SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c +endif CFLAGS_acl_run_sse.o += -msse4.1 +CFLAGS_acl_run_neon.o += -flax-vector-conversions -Wno-maybe-uninitialized # # If the compiler supports AVX2 instructions, diff --git a/lib/librte_acl/acl.h b/lib/librte_acl/acl.h index eb4930c..09d6784 100644 --- a/lib/librte_acl/acl.h +++ b/lib/librte_acl/acl.h @@ -230,6 +230,10 @@ int rte_acl_classify_avx2(const struct rte_acl_ctx *ctx, const uint8_t **data, uint32_t *results, uint32_t num, uint32_t categories); +int +rte_acl_classify_neon(const struct rte_acl_ctx *ctx, const uint8_t **data, + uint32_t *results, uint32_t num, uint32_t categories); + #ifdef __cplusplus } #endif /* __cplusplus */ diff --git a/lib/librte_acl/acl_run_neon.c b/lib/librte_acl/acl_run_neon.c new file mode 100644 index 0000000..b014451 --- /dev/null +++ b/lib/librte_acl/acl_run_neon.c @@ -0,0 +1,46 @@ +/* + * BSD LICENSE + * + * Copyright (C) Cavium networks Ltd. 2015. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Cavium networks nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#include "acl_run_neon.h" + +int +rte_acl_classify_neon(const struct rte_acl_ctx *ctx, const uint8_t **data, + uint32_t *results, uint32_t num, uint32_t categories) +{ + if (likely(num >= 8)) + return search_neon_8(ctx, data, results, num, categories); + else if (num >= 4) + return search_neon_4(ctx, data, results, num, categories); + else + return rte_acl_classify_scalar(ctx, data, results, num, + categories); +} diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h new file mode 100644 index 0000000..4579476 --- /dev/null +++ b/lib/librte_acl/acl_run_neon.h @@ -0,0 +1,290 @@ +/* + * BSD LICENSE + * + * Copyright (C) Cavium networks Ltd. 2015. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Cavium networks nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#include "acl_run.h" +#include "acl_vect.h" + +struct _neon_acl_const { + rte_xmm_t xmm_shuffle_input; + rte_xmm_t xmm_index_mask; + rte_xmm_t range_base; +} neon_acl_const __attribute__((aligned(RTE_CACHE_LINE_SIZE))) = { + { + .u32 = {0x00000000, 0x04040404, 0x08080808, 0x0c0c0c0c} + }, + { + .u32 = {RTE_ACL_NODE_INDEX, RTE_ACL_NODE_INDEX, + RTE_ACL_NODE_INDEX, RTE_ACL_NODE_INDEX} + }, + { + .u32 = {0xffffff00, 0xffffff04, 0xffffff08, 0xffffff0c} + }, +}; + +/* + * Resolve priority for multiple results (neon version). + * This consists comparing the priority of the current traversal with the + * running set of results for the packet. + * For each result, keep a running array of the result (rule number) and + * its priority for each category. + */ +static inline void +resolve_priority_neon(uint64_t transition, int n, const struct rte_acl_ctx *ctx, + struct parms *parms, + const struct rte_acl_match_results *p, + uint32_t categories) +{ + uint32_t x; + int32x4_t results, priority, results1, priority1; + uint32x4_t selector; + int32_t *saved_results, *saved_priority; + + for (x = 0; x < categories; x += RTE_ACL_RESULTS_MULTIPLIER) { + saved_results = (int32_t *)(&parms[n].cmplt->results[x]); + saved_priority = (int32_t *)(&parms[n].cmplt->priority[x]); + + /* get results and priorities for completed trie */ + results = vld1q_s32( + (const int32_t *)&p[transition].results[x]); + priority = vld1q_s32( + (const int32_t *)&p[transition].priority[x]); + + /* if this is not the first completed trie */ + if (parms[n].cmplt->count != ctx->num_tries) { + /* get running best results and their priorities */ + results1 = vld1q_s32(saved_results); + priority1 = vld1q_s32(saved_priority); + + /* select results that are highest priority */ + selector = vcgtq_s32(priority1, priority); + results = vbslq_s32(selector, results1, results); + priority = vbslq_s32(selector, priority1, priority); + } + + /* save running best results and their priorities */ + vst1q_s32(saved_results, results); + vst1q_s32(saved_priority, priority); + } +} + +/* + * Check for any match in 4 transitions + */ +static inline __attribute__((always_inline)) uint32_t +check_any_match_x4(uint64_t val[]) +{ + return ((val[0] | val[1] | val[2] | val[3]) & RTE_ACL_NODE_MATCH); +} + +static inline __attribute__((always_inline)) void +acl_match_check_x4(int slot, const struct rte_acl_ctx *ctx, struct parms *parms, + struct acl_flow_data *flows, uint64_t transitions[]) +{ + while (check_any_match_x4(transitions)) { + transitions[0] = acl_match_check(transitions[0], slot, ctx, + parms, flows, resolve_priority_neon); + transitions[1] = acl_match_check(transitions[1], slot + 1, ctx, + parms, flows, resolve_priority_neon); + transitions[2] = acl_match_check(transitions[2], slot + 2, ctx, + parms, flows, resolve_priority_neon); + transitions[3] = acl_match_check(transitions[3], slot + 3, ctx, + parms, flows, resolve_priority_neon); + } +} + +/* + * Process 4 transitions (in 2 NEON Q registers) in parallel + */ +static inline __attribute__((always_inline)) int32x4_t +transition4(int32x4_t next_input, const uint64_t *trans, uint64_t transitions[]) +{ + int32x4x2_t tr_hi_lo; + int32x4_t t, in, r; + uint32x4_t index_msk, node_type, addr; + uint32x4_t dfa_msk, mask, quad_ofs, dfa_ofs; + + /* Move low 32 into tr_hi_lo.val[0] and high 32 into tr_hi_lo.val[1] */ + tr_hi_lo = vld2q_s32((const int32_t *)transitions); + + /* Calculate the address (array index) for all 4 transitions. */ + + index_msk = vld1q_u32((const uint32_t *)&neon_acl_const.xmm_index_mask); + + /* Calc node type and node addr */ + node_type = vbicq_s32(tr_hi_lo.val[0], index_msk); + addr = vandq_s32(tr_hi_lo.val[0], index_msk); + + /* t = 0 */ + t = veorq_s32(node_type, node_type); + + /* mask for DFA type(0) nodes */ + dfa_msk = vceqq_u32(node_type, t); + + mask = vld1q_s32((const int32_t *)&neon_acl_const.xmm_shuffle_input); + in = vqtbl1q_u8((uint8x16_t)next_input, (uint8x16_t)mask); + + /* DFA calculations. */ + r = vshrq_n_u32(in, 30); /* div by 64 */ + mask = vld1q_s32((const int32_t *)&neon_acl_const.range_base); + r = vaddq_u8(r, mask); + t = vshrq_n_u32(in, 24); + r = vqtbl1q_u8((uint8x16_t)tr_hi_lo.val[1], (uint8x16_t)r); + dfa_ofs = vsubq_s32(t, r); + + /* QUAD/SINGLE calculations. */ + t = vcgtq_s8(in, tr_hi_lo.val[1]); + t = vabsq_s8(t); + t = vpaddlq_u8(t); + quad_ofs = vpaddlq_u16(t); + + /* blend DFA and QUAD/SINGLE. */ + t = vbslq_u8(dfa_msk, dfa_ofs, quad_ofs); + + /* calculate address for next transitions */ + addr = vaddq_u32(addr, t); + + /* Fill next transitions */ + transitions[0] = trans[vgetq_lane_u32(addr, 0)]; + transitions[1] = trans[vgetq_lane_u32(addr, 1)]; + transitions[2] = trans[vgetq_lane_u32(addr, 2)]; + transitions[3] = trans[vgetq_lane_u32(addr, 3)]; + + return vshrq_n_u32(next_input, CHAR_BIT); +} + +/* + * Execute trie traversal with 8 traversals in parallel + */ +static inline int +search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data, + uint32_t *results, uint32_t total_packets, uint32_t categories) +{ + int n; + struct acl_flow_data flows; + uint64_t index_array[8]; + struct completion cmplt[8]; + struct parms parms[8]; + int32x4_t input0, input1; + + acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results, + total_packets, categories, ctx->trans_table); + + for (n = 0; n < 8; n++) { + cmplt[n].count = 0; + index_array[n] = acl_start_next_trie(&flows, parms, n, ctx); + } + + /* Check for any matches. */ + acl_match_check_x4(0, ctx, parms, &flows, &index_array[0]); + acl_match_check_x4(4, ctx, parms, &flows, &index_array[4]); + + while (flows.started > 0) { + /* Gather 4 bytes of input data for each stream. */ + input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0); + input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0); + + input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1); + input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1); + + input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input0, 2); + input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 6), input1, 2); + + input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input0, 3); + input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 7), input1, 3); + + /* Process the 4 bytes of input on each stream. */ + + input0 = transition4(input0, flows.trans, &index_array[0]); + input1 = transition4(input1, flows.trans, &index_array[4]); + + input0 = transition4(input0, flows.trans, &index_array[0]); + input1 = transition4(input1, flows.trans, &index_array[4]); + + input0 = transition4(input0, flows.trans, &index_array[0]); + input1 = transition4(input1, flows.trans, &index_array[4]); + + input0 = transition4(input0, flows.trans, &index_array[0]); + input1 = transition4(input1, flows.trans, &index_array[4]); + + /* Check for any matches. */ + acl_match_check_x4(0, ctx, parms, &flows, &index_array[0]); + acl_match_check_x4(4, ctx, parms, &flows, &index_array[4]); + } + + return 0; +} + +/* + * Execute trie traversal with 4 traversals in parallel + */ +static inline int +search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data, + uint32_t *results, int total_packets, uint32_t categories) +{ + int n; + struct acl_flow_data flows; + uint64_t index_array[4]; + struct completion cmplt[4]; + struct parms parms[4]; + int32x4_t input; + + acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results, + total_packets, categories, ctx->trans_table); + + for (n = 0; n < 4; n++) { + cmplt[n].count = 0; + index_array[n] = acl_start_next_trie(&flows, parms, n, ctx); + } + + /* Check for any matches. */ + acl_match_check_x4(0, ctx, parms, &flows, index_array); + + while (flows.started > 0) { + /* Gather 4 bytes of input data for each stream. */ + input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0); + input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1); + input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2); + input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3); + + /* Process the 4 bytes of input on each stream. */ + input = transition4(input, flows.trans, index_array); + input = transition4(input, flows.trans, index_array); + input = transition4(input, flows.trans, index_array); + input = transition4(input, flows.trans, index_array); + + /* Check for any matches. */ + acl_match_check_x4(0, ctx, parms, &flows, index_array); + } + + return 0; +} + diff --git a/lib/librte_acl/rte_acl.c b/lib/librte_acl/rte_acl.c index d60219f..e2fdebd 100644 --- a/lib/librte_acl/rte_acl.c +++ b/lib/librte_acl/rte_acl.c @@ -55,11 +55,32 @@ rte_acl_classify_avx2(__rte_unused const struct rte_acl_ctx *ctx, return -ENOTSUP; } +int __attribute__ ((weak)) +rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx, + __rte_unused const uint8_t **data, + __rte_unused uint32_t *results, + __rte_unused uint32_t num, + __rte_unused uint32_t categories) +{ + return -ENOTSUP; +} + +int __attribute__ ((weak)) +rte_acl_classify_neon(__rte_unused const struct rte_acl_ctx *ctx, + __rte_unused const uint8_t **data, + __rte_unused uint32_t *results, + __rte_unused uint32_t num, + __rte_unused uint32_t categories) +{ + return -ENOTSUP; +} + static const rte_acl_classify_t classify_fns[] = { [RTE_ACL_CLASSIFY_DEFAULT] = rte_acl_classify_scalar, [RTE_ACL_CLASSIFY_SCALAR] = rte_acl_classify_scalar, [RTE_ACL_CLASSIFY_SSE] = rte_acl_classify_sse, [RTE_ACL_CLASSIFY_AVX2] = rte_acl_classify_avx2, + [RTE_ACL_CLASSIFY_NEON] = rte_acl_classify_neon, }; /* by default, use always available scalar code path. */ @@ -93,6 +114,9 @@ rte_acl_init(void) { enum rte_acl_classify_alg alg = RTE_ACL_CLASSIFY_DEFAULT; +#ifdef RTE_ARCH_ARM64 + alg = RTE_ACL_CLASSIFY_NEON; +#else #ifdef CC_AVX2_SUPPORT if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2)) alg = RTE_ACL_CLASSIFY_AVX2; @@ -102,6 +126,7 @@ rte_acl_init(void) #endif alg = RTE_ACL_CLASSIFY_SSE; +#endif rte_acl_set_default_classify(alg); } diff --git a/lib/librte_acl/rte_acl.h b/lib/librte_acl/rte_acl.h index 98ef2fc..0979a09 100644 --- a/lib/librte_acl/rte_acl.h +++ b/lib/librte_acl/rte_acl.h @@ -270,6 +270,7 @@ enum rte_acl_classify_alg { RTE_ACL_CLASSIFY_SCALAR = 1, /**< generic implementation. */ RTE_ACL_CLASSIFY_SSE = 2, /**< requires SSE4.1 support. */ RTE_ACL_CLASSIFY_AVX2 = 3, /**< requires AVX2 support. */ + RTE_ACL_CLASSIFY_NEON = 4, /**< requires NEON support. */ RTE_ACL_CLASSIFY_NUM /* should always be the last one. */ }; -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 09/12] mk: add support for armv8 on top of armv7 2015-11-03 13:09 ` [dpdk-dev] [PATCH 08/12] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 10/12] mk: add support for thunderx machine target based armv8-a Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++ mk/arch/arm64/rte.vars.mk | 58 ++++++++++++++++++++++++++++++ mk/machine/armv8a/rte.vars.mk | 58 ++++++++++++++++++++++++++++++ 3 files changed, 172 insertions(+) create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc create mode 100644 mk/arch/arm64/rte.vars.mk create mode 100644 mk/machine/armv8a/rte.vars.mk diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc new file mode 100644 index 0000000..6ea38a5 --- /dev/null +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc @@ -0,0 +1,56 @@ +# BSD LICENSE +# +# Copyright (C) Cavium networks 2015. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Cavium networks nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +# + +#include "common_linuxapp" + +CONFIG_RTE_MACHINE="armv8a" + +CONFIG_RTE_ARCH="arm64" +CONFIG_RTE_ARCH_ARM64=y +CONFIG_RTE_ARCH_64=y +CONFIG_RTE_ARCH_ARM_NEON=y + +CONFIG_RTE_FORCE_INTRINSICS=y + +CONFIG_RTE_TOOLCHAIN="gcc" +CONFIG_RTE_TOOLCHAIN_GCC=y + +CONFIG_RTE_CACHE_LINE_SIZE=64 + +CONFIG_RTE_IXGBE_INC_VECTOR=n +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n +CONFIG_RTE_LIBRTE_IVSHMEM=n +CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n + +CONFIG_RTE_LIBRTE_LPM=n +CONFIG_RTE_LIBRTE_TABLE=n +CONFIG_RTE_LIBRTE_PIPELINE=n + diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk new file mode 100644 index 0000000..32e3a5f --- /dev/null +++ b/mk/arch/arm64/rte.vars.mk @@ -0,0 +1,58 @@ +# BSD LICENSE +# +# Copyright (C) Cavium networks 2015. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Cavium networks nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# +# arch: +# +# - define ARCH variable (overridden by cmdline or by previous +# optional define in machine .mk) +# - define CROSS variable (overridden by cmdline or previous define +# in machine .mk) +# - define CPU_CFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - define CPU_LDFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - define CPU_ASFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - may override any previously defined variable +# +# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32 +# + +ARCH ?= arm64 +# common arch dir in eal headers +ARCH_DIR := arm +CROSS ?= + +CPU_CFLAGS ?= +CPU_LDFLAGS ?= +CPU_ASFLAGS ?= -felf + +export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk new file mode 100644 index 0000000..bdf8c6b --- /dev/null +++ b/mk/machine/armv8a/rte.vars.mk @@ -0,0 +1,58 @@ +# BSD LICENSE +# +# Copyright (C) Cavium networks 2015. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Cavium networks nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +# + +# +# machine: +# +# - can define ARCH variable (overridden by cmdline value) +# - can define CROSS variable (overridden by cmdline value) +# - define MACHINE_CFLAGS variable (overridden by cmdline value) +# - define MACHINE_LDFLAGS variable (overridden by cmdline value) +# - define MACHINE_ASFLAGS variable (overridden by cmdline value) +# - can define CPU_CFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_LDFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_ASFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - may override any previously defined variable +# + +# ARCH = +# CROSS = +# MACHINE_CFLAGS = +# MACHINE_LDFLAGS = +# MACHINE_ASFLAGS = +# CPU_CFLAGS = +# CPU_LDFLAGS = +# CPU_ASFLAGS = + +MACHINE_CFLAGS += -march=armv8-a -DRTE_CACHE_LINE_SIZE=64 -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 10/12] mk: add support for thunderx machine target based armv8-a 2015-11-03 13:09 ` [dpdk-dev] [PATCH 09/12] mk: add support for armv8 on top of armv7 Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 11/12] updated release note for armv8 support for DPDK 2.2 Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev Created the new thunderx machine target to address difference in "cache line size" and "-mcpu=thunderx" vs default armv8-a machine target Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- config/defconfig_arm64-thunderx-linuxapp-gcc | 56 +++++++++++++++++++++++++++ mk/machine/thunderx/rte.vars.mk | 58 ++++++++++++++++++++++++++++ 2 files changed, 114 insertions(+) create mode 100644 config/defconfig_arm64-thunderx-linuxapp-gcc create mode 100644 mk/machine/thunderx/rte.vars.mk diff --git a/config/defconfig_arm64-thunderx-linuxapp-gcc b/config/defconfig_arm64-thunderx-linuxapp-gcc new file mode 100644 index 0000000..e8fccc7 --- /dev/null +++ b/config/defconfig_arm64-thunderx-linuxapp-gcc @@ -0,0 +1,56 @@ +# BSD LICENSE +# +# Copyright (C) Cavium networks 2015. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Cavium networks nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +# + +#include "common_linuxapp" + +CONFIG_RTE_MACHINE="thunderx" + +CONFIG_RTE_ARCH="arm64" +CONFIG_RTE_ARCH_ARM64=y +CONFIG_RTE_ARCH_64=y +CONFIG_RTE_ARCH_ARM_NEON=y + +CONFIG_RTE_FORCE_INTRINSICS=y + +CONFIG_RTE_TOOLCHAIN="gcc" +CONFIG_RTE_TOOLCHAIN_GCC=y + +CONFIG_RTE_CACHE_LINE_SIZE=128 + +CONFIG_RTE_IXGBE_INC_VECTOR=n +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n +CONFIG_RTE_LIBRTE_IVSHMEM=n +CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n + +CONFIG_RTE_LIBRTE_LPM=n +CONFIG_RTE_LIBRTE_TABLE=n +CONFIG_RTE_LIBRTE_PIPELINE=n + diff --git a/mk/machine/thunderx/rte.vars.mk b/mk/machine/thunderx/rte.vars.mk new file mode 100644 index 0000000..e49f9e1 --- /dev/null +++ b/mk/machine/thunderx/rte.vars.mk @@ -0,0 +1,58 @@ +# BSD LICENSE +# +# Copyright (C) Cavium networks 2015. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Cavium networks nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +# + +# +# machine: +# +# - can define ARCH variable (overridden by cmdline value) +# - can define CROSS variable (overridden by cmdline value) +# - define MACHINE_CFLAGS variable (overridden by cmdline value) +# - define MACHINE_LDFLAGS variable (overridden by cmdline value) +# - define MACHINE_ASFLAGS variable (overridden by cmdline value) +# - can define CPU_CFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_LDFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_ASFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - may override any previously defined variable +# + +# ARCH = +CROSS ?= aarch64-thunderx-linux-gnu- +# MACHINE_CFLAGS = +# MACHINE_LDFLAGS = +# MACHINE_ASFLAGS = +# CPU_CFLAGS = +# CPU_LDFLAGS = +# CPU_ASFLAGS = + +MACHINE_CFLAGS += -march=armv8-a -mcpu=thunderx -DRTE_CACHE_LINE_SIZE=128 -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 11/12] updated release note for armv8 support for DPDK 2.2 2015-11-03 13:09 ` [dpdk-dev] [PATCH 10/12] mk: add support for thunderx machine target based armv8-a Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 12/12] maintainers: claim responsibility for ARMv8 Jerin Jacob 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- doc/guides/rel_notes/release_2_2.rst | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst index 43a3a3c..a3587a2 100644 --- a/doc/guides/rel_notes/release_2_2.rst +++ b/doc/guides/rel_notes/release_2_2.rst @@ -23,10 +23,11 @@ New Features * **Added vhost-user multiple queue support.** -* **Introduce ARMv7 architecture** +* **Introduce ARMv7 and ARMv8 architectures** - It is now possible to build DPDK for the ARMv7 platform and test with - virtual PMD drivers. + * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms. + * ARMv7 can be tested with virtual PMD drivers. + * ARMv8 can be tested with virtual and physical PMD drivers. Resolved Issues -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH 12/12] maintainers: claim responsibility for ARMv8 2015-11-03 13:09 ` [dpdk-dev] [PATCH 11/12] updated release note for armv8 support for DPDK 2.2 Jerin Jacob @ 2015-11-03 13:09 ` Jerin Jacob 0 siblings, 0 replies; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 13:09 UTC (permalink / raw) To: dev Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- MAINTAINERS | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index a8933eb..c44b328 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -128,6 +128,11 @@ ARM v7 M: Jan Viktorin <viktorin@rehivetech.com> F: lib/librte_eal/common/include/arch/arm/ +ARM v8 +M: Jerin Jacob <jerin.jacob@caviumnetworks.com> +F: lib/librte_eal/common/include/arch/arm/*_64.h +F: lib/librte_acl/acl_run_neon.* + Intel x86 M: Bruce Richardson <bruce.richardson@intel.com> M: Konstantin Ananyev <konstantin.ananyev@intel.com> -- 2.1.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 00/12] DPDK armv8-a support 2015-11-03 13:09 [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob @ 2015-11-03 14:17 ` Hunt, David 2015-11-03 16:38 ` Jerin Jacob 1 sibling, 1 reply; 16+ messages in thread From: Hunt, David @ 2015-11-03 14:17 UTC (permalink / raw) To: Jerin Jacob, dev On 03/11/2015 13:09, Jerin Jacob wrote: > This is the v1 patchset for ARMv8 that now sits on top of the v6 patch > of the ARMv7 code by RehiveTech. It adds code into the same arm include > directory, reducing code duplication. > > Tested on an ThunderX arm 64-bit arm server board, with PCI slots. Passes traffic > between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should > work with many other NICS as long as long as there is no unaligned access to > device memory but not yet untested. I have your patchset building and running on an X-Gene based 8-core MP30AR0 system, passing traffic between two ports on and 82599 also. > Notes on arm64 kernel configuration: > > Tested on using Ubuntu 14.04 LTS with a 3.18 kernel and igb_uio. > ARM64 kernels does not have functional resource mapping of PCI memory > (PCI_MMAP), so the pci driver needs to be patched to enable this. The > symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is > missing the resource0...N files for mmapping the device memory. > > Following patch fixes the PCI resource mapping issue om armv8. > Its not yet up streamed.We are in the process of up streaming it. > > http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358906.html Good to see that there's a patch on the way for this. That fix looks almost exactly the same as the hack I did to my kernel :) I had a couple of small issues when patching/building: 1. Three of the files had an extra blank line at the end. Maybe worth running checkpatch on the patches. 'git am' was complaining. 2. I had problems compiling two drivers because they were attempting to include tmmintrin.h: ...dpdk/drivers/net/fm10k/fm10k_rxtx_vec.c:41:23: fatal error: tmmintrin.h: No such file or directory ...dpdk/drivers/net/i40e/i40e_rxtx_vec.c:43:23: fatal error: tmmintrin.h: No such file or directory To avoid this, I added the following two lines into defconfig_arm64-armv8a-linuxapp-gcc CONFIG_RTE_LIBRTE_FM10K_PMD=n CONFIG_RTE_LIBRTE_I40E_PMD=n and then it built fine, and I can run testpmd with my 82599's and run autotests. Thanks for that. Dave. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 00/12] DPDK armv8-a support 2015-11-03 14:17 ` [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Hunt, David @ 2015-11-03 16:38 ` Jerin Jacob 2015-11-05 14:39 ` Hunt, David 0 siblings, 1 reply; 16+ messages in thread From: Jerin Jacob @ 2015-11-03 16:38 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Tue, Nov 03, 2015 at 02:17:38PM +0000, Hunt, David wrote: > On 03/11/2015 13:09, Jerin Jacob wrote: > >This is the v1 patchset for ARMv8 that now sits on top of the v6 patch > >of the ARMv7 code by RehiveTech. It adds code into the same arm include > >directory, reducing code duplication. > > > >Tested on an ThunderX arm 64-bit arm server board, with PCI slots. Passes traffic > >between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should > >work with many other NICS as long as long as there is no unaligned access to > >device memory but not yet untested. > > I have your patchset building and running on an X-Gene based 8-core MP30AR0 > system, passing traffic between two ports on and 82599 also. > Thanks. > >Notes on arm64 kernel configuration: > > > > Tested on using Ubuntu 14.04 LTS with a 3.18 kernel and igb_uio. > > ARM64 kernels does not have functional resource mapping of PCI memory > > (PCI_MMAP), so the pci driver needs to be patched to enable this. The > > symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is > > missing the resource0...N files for mmapping the device memory. > > > > Following patch fixes the PCI resource mapping issue om armv8. > > Its not yet up streamed.We are in the process of up streaming it. > > > > http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358906.html > > Good to see that there's a patch on the way for this. That fix looks almost > exactly the same as the hack I did to my kernel :) > > I had a couple of small issues when patching/building: > > 1. Three of the files had an extra blank line at the end. Maybe worth > running checkpatch on the patches. 'git am' was complaining. I will fix it in next version. > > 2. I had problems compiling two drivers because they were attempting to > include tmmintrin.h: > > ...dpdk/drivers/net/fm10k/fm10k_rxtx_vec.c:41:23: fatal error: tmmintrin.h: > No such file or directory > > ...dpdk/drivers/net/i40e/i40e_rxtx_vec.c:43:23: fatal error: tmmintrin.h: No > such file or directory > > To avoid this, I added the following two lines into > defconfig_arm64-armv8a-linuxapp-gcc > > CONFIG_RTE_LIBRTE_FM10K_PMD=n > CONFIG_RTE_LIBRTE_I40E_PMD=n the patch was based on 82fb702077f67585d64a07de0080e5cb6a924a72 which don't have these changes. I will add these in next version. > and then it built fine, and I can run testpmd with my 82599's and run > autotests. I ran autotest, "Mbuf autotest" stress failure is due strong vs weak ordering issue. I will send the next version based on new patch being discussed on ml. > > Thanks for that. > Dave. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH 00/12] DPDK armv8-a support 2015-11-03 16:38 ` Jerin Jacob @ 2015-11-05 14:39 ` Hunt, David 0 siblings, 0 replies; 16+ messages in thread From: Hunt, David @ 2015-11-05 14:39 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev On 03/11/2015 16:38, Jerin Jacob wrote: > On Tue, Nov 03, 2015 at 02:17:38PM +0000, Hunt, David wrote: --snip-- >> and then it built fine, and I can run testpmd with my 82599's and run >> autotests. > > I ran autotest, "Mbuf autotest" stress failure is due strong vs weak ordering > issue. I will send the next version based on new patch being discussed > on ml. Jerin, I've marked my patch-set for the armv8 support as superseded in PatchWork. I'm happy for your patch-set to take precedence. If you're uploading another rev, I'll be sure to give it a test on my X-Gene board. Dave. ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2015-11-05 14:39 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-11-03 13:09 [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 01/12] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 02/12] eal: arm64: add armv8-a version of rte_cpuflags_64.h Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 03/12] eal: arm64: add armv8-a version of rte_prefetch_64.h Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 04/12] eal: arm64: add armv8-a version of rte_cycles_64.h Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 05/12] eal: arm64: rte_memcpy_64.h version based libc memcpy Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 06/12] eal: arm: ret_vector.h improvements Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 07/12] app: test: added the new cpu flags of arm64 in test_cpuflags test case Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 08/12] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 09/12] mk: add support for armv8 on top of armv7 Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 10/12] mk: add support for thunderx machine target based armv8-a Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 11/12] updated release note for armv8 support for DPDK 2.2 Jerin Jacob 2015-11-03 13:09 ` [dpdk-dev] [PATCH 12/12] maintainers: claim responsibility for ARMv8 Jerin Jacob 2015-11-03 14:17 ` [dpdk-dev] [PATCH 00/12] DPDK armv8-a support Hunt, David 2015-11-03 16:38 ` Jerin Jacob 2015-11-05 14:39 ` Hunt, David
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).