* [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support @ 2015-10-29 17:29 David Hunt 2015-10-29 17:29 ` [dpdk-dev] [PATCH 1/5] eal: split arm rte_memcpy.h into 32-bit and 64-bit versions David Hunt ` (6 more replies) 0 siblings, 7 replies; 32+ messages in thread From: David Hunt @ 2015-10-29 17:29 UTC (permalink / raw) To: dev Hi DPDK Community. This is an updated patchset for ARMv8 that now sits on top of the previously submitted ARMv7 code by RehiveTech. It re-uses much of that code and splits several header files into 32-bit and 64-bit versions, so both architectures share the same arm include directory. Tested on an XGene 64-bit ARM server board with PCI slots. It passes traffic between two physical ports on an Intel 82599 dual-port 10GbE NIC. It should work with many other NICs, but these are as yet untested. Compiles igb_uio, kni and all the physical device PMDs. ACL and LPM are disabled due to compilation issues; a note has been added to the release notes. David Hunt (5): eal/arm: split arm rte_memcpy.h into 32 and 64 bit versions. 
eal/arm: split arm rte_prefetch.h into 32 and 64 bit versions eal/arm: fix 64-bit compilation for armv8 mk: Add makefile support for armv8 architecture test: add test for cpu flags on armv8 MAINTAINERS | 3 +- app/test/test_cpuflags.c | 13 +- config/defconfig_arm64-armv8a-linuxapp-gcc | 56 ++++ doc/guides/rel_notes/release_2_2.rst | 7 +- .../common/include/arch/arm/rte_cpuflags.h | 9 + .../common/include/arch/arm/rte_memcpy.h | 302 +------------------ .../common/include/arch/arm/rte_memcpy_32.h | 334 +++++++++++++++++++++ .../common/include/arch/arm/rte_memcpy_64.h | 322 ++++++++++++++++++++ .../common/include/arch/arm/rte_prefetch.h | 31 +- .../common/include/arch/arm/rte_prefetch_32.h | 61 ++++ .../common/include/arch/arm/rte_prefetch_64.h | 61 ++++ mk/arch/arm64/rte.vars.mk | 58 ++++ mk/machine/armv8a/rte.vars.mk | 57 ++++ 13 files changed, 986 insertions(+), 328 deletions(-) create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h create mode 100644 mk/arch/arm64/rte.vars.mk create mode 100644 mk/machine/armv8a/rte.vars.mk -- 1.9.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
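The header split described in the cover letter works by turning each shared arm header into a thin wrapper that selects the 32-bit or 64-bit implementation at compile time. A minimal sketch of that pattern follows; `RTE_ARCH_64` is normally set by the DPDK build configuration, and here a stand-in is derived from the compiler's pointer width purely for illustration (`rte_arch_variant()` is a hypothetical helper, not DPDK API):

```c
#include <string.h>

/* Stand-in for DPDK's RTE_ARCH_64 config macro, derived from the
 * compiler's pointer-width macros for illustration only. */
#if defined(__LP64__) || defined(_LP64)
#define RTE_ARCH_64 1
#endif

#ifdef RTE_ARCH_64
/* in the real tree this branch would be: #include "rte_memcpy_64.h" */
static const char *rte_arch_variant(void) { return "arm64"; }
#else
/* in the real tree this branch would be: #include "rte_memcpy_32.h" */
static const char *rte_arch_variant(void) { return "arm32"; }
#endif
```

The benefit of this layout is that existing code keeps including the same `arch/arm/rte_memcpy.h` path regardless of target width.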
* [dpdk-dev] [PATCH 1/5] eal: split arm rte_memcpy.h into 32-bit and 64-bit versions 2015-10-29 17:29 [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support David Hunt @ 2015-10-29 17:29 ` David Hunt 2015-10-29 17:29 ` [dpdk-dev] [PATCH 2/5] eal: split arm rte_prefetch.h " David Hunt ` (5 subsequent siblings) 6 siblings, 0 replies; 32+ messages in thread From: David Hunt @ 2015-10-29 17:29 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- .../common/include/arch/arm/rte_memcpy.h | 302 +------------------ .../common/include/arch/arm/rte_memcpy_32.h | 334 +++++++++++++++++++++ .../common/include/arch/arm/rte_memcpy_64.h | 308 +++++++++++++++++++ 3 files changed, 647 insertions(+), 297 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h index f41648a..19c98e1 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h @@ -1,7 +1,7 @@ /* * BSD LICENSE * - * Copyright(c) 2015 RehiveTech. All rights reserved. + * Copyright(c) 2015 Intel Corporation. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -13,7 +13,7 @@ * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. - * * Neither the name of RehiveTech nor the names of its + * * Neither the name of Intel Corporation nor the names of its * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. 
* @@ -33,302 +33,10 @@ #ifndef _RTE_MEMCPY_ARM_H_ #define _RTE_MEMCPY_ARM_H_ -#include <stdint.h> -#include <string.h> - -#ifdef __cplusplus -extern "C" { -#endif - -#include "generic/rte_memcpy.h" - -#ifdef __ARM_NEON_FP - -/* ARM NEON Intrinsics are used to copy data */ -#include <arm_neon.h> - -static inline void -rte_mov16(uint8_t *dst, const uint8_t *src) -{ - vst1q_u8(dst, vld1q_u8(src)); -} - -static inline void -rte_mov32(uint8_t *dst, const uint8_t *src) -{ - asm volatile ( - "vld1.8 {d0-d3}, [%0]\n\t" - "vst1.8 {d0-d3}, [%1]\n\t" - : "+r" (src), "+r" (dst) - : : "memory", "d0", "d1", "d2", "d3"); -} - -static inline void -rte_mov48(uint8_t *dst, const uint8_t *src) -{ - asm volatile ( - "vld1.8 {d0-d3}, [%0]!\n\t" - "vld1.8 {d4-d5}, [%0]\n\t" - "vst1.8 {d0-d3}, [%1]!\n\t" - "vst1.8 {d4-d5}, [%1]\n\t" - : "+r" (src), "+r" (dst) - : - : "memory", "d0", "d1", "d2", "d3", "d4", "d5"); -} - -static inline void -rte_mov64(uint8_t *dst, const uint8_t *src) -{ - asm volatile ( - "vld1.8 {d0-d3}, [%0]!\n\t" - "vld1.8 {d4-d7}, [%0]\n\t" - "vst1.8 {d0-d3}, [%1]!\n\t" - "vst1.8 {d4-d7}, [%1]\n\t" - : "+r" (src), "+r" (dst) - : - : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7"); -} - -static inline void -rte_mov128(uint8_t *dst, const uint8_t *src) -{ - asm volatile ("pld [%0, #64]" : : "r" (src)); - asm volatile ( - "vld1.8 {d0-d3}, [%0]!\n\t" - "vld1.8 {d4-d7}, [%0]!\n\t" - "vld1.8 {d8-d11}, [%0]!\n\t" - "vld1.8 {d12-d15}, [%0]\n\t" - "vst1.8 {d0-d3}, [%1]!\n\t" - "vst1.8 {d4-d7}, [%1]!\n\t" - "vst1.8 {d8-d11}, [%1]!\n\t" - "vst1.8 {d12-d15}, [%1]\n\t" - : "+r" (src), "+r" (dst) - : - : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7", - "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15"); -} - -static inline void -rte_mov256(uint8_t *dst, const uint8_t *src) -{ - asm volatile ("pld [%0, #64]" : : "r" (src)); - asm volatile ("pld [%0, #128]" : : "r" (src)); - asm volatile ("pld [%0, #192]" : : "r" (src)); - asm volatile ("pld [%0, #256]" : : "r" 
(src)); - asm volatile ("pld [%0, #320]" : : "r" (src)); - asm volatile ("pld [%0, #384]" : : "r" (src)); - asm volatile ("pld [%0, #448]" : : "r" (src)); - asm volatile ( - "vld1.8 {d0-d3}, [%0]!\n\t" - "vld1.8 {d4-d7}, [%0]!\n\t" - "vld1.8 {d8-d11}, [%0]!\n\t" - "vld1.8 {d12-d15}, [%0]!\n\t" - "vld1.8 {d16-d19}, [%0]!\n\t" - "vld1.8 {d20-d23}, [%0]!\n\t" - "vld1.8 {d24-d27}, [%0]!\n\t" - "vld1.8 {d28-d31}, [%0]\n\t" - "vst1.8 {d0-d3}, [%1]!\n\t" - "vst1.8 {d4-d7}, [%1]!\n\t" - "vst1.8 {d8-d11}, [%1]!\n\t" - "vst1.8 {d12-d15}, [%1]!\n\t" - "vst1.8 {d16-d19}, [%1]!\n\t" - "vst1.8 {d20-d23}, [%1]!\n\t" - "vst1.8 {d24-d27}, [%1]!\n\t" - "vst1.8 {d28-d31}, [%1]!\n\t" - : "+r" (src), "+r" (dst) - : - : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7", - "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15", - "d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23", - "d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31"); -} - -#define rte_memcpy(dst, src, n) \ - ({ (__builtin_constant_p(n)) ? \ - memcpy((dst), (src), (n)) : \ - rte_memcpy_func((dst), (src), (n)); }) - -static inline void * -rte_memcpy_func(void *dst, const void *src, size_t n) -{ - void *ret = dst; - - /* We can't copy < 16 bytes using XMM registers so do it manually. */ - if (n < 16) { - if (n & 0x01) { - *(uint8_t *)dst = *(const uint8_t *)src; - dst = (uint8_t *)dst + 1; - src = (const uint8_t *)src + 1; - } - if (n & 0x02) { - *(uint16_t *)dst = *(const uint16_t *)src; - dst = (uint16_t *)dst + 1; - src = (const uint16_t *)src + 1; - } - if (n & 0x04) { - *(uint32_t *)dst = *(const uint32_t *)src; - dst = (uint32_t *)dst + 1; - src = (const uint32_t *)src + 1; - } - if (n & 0x08) { - /* ARMv7 can not handle unaligned access to long long - * (uint64_t). Therefore two uint32_t operations are - * used. 
- */ - *(uint32_t *)dst = *(const uint32_t *)src; - dst = (uint32_t *)dst + 1; - src = (const uint32_t *)src + 1; - *(uint32_t *)dst = *(const uint32_t *)src; - } - return ret; - } - - /* Special fast cases for <= 128 bytes */ - if (n <= 32) { - rte_mov16((uint8_t *)dst, (const uint8_t *)src); - rte_mov16((uint8_t *)dst - 16 + n, - (const uint8_t *)src - 16 + n); - return ret; - } - - if (n <= 64) { - rte_mov32((uint8_t *)dst, (const uint8_t *)src); - rte_mov32((uint8_t *)dst - 32 + n, - (const uint8_t *)src - 32 + n); - return ret; - } - - if (n <= 128) { - rte_mov64((uint8_t *)dst, (const uint8_t *)src); - rte_mov64((uint8_t *)dst - 64 + n, - (const uint8_t *)src - 64 + n); - return ret; - } - - /* - * For large copies > 128 bytes. This combination of 256, 64 and 16 byte - * copies was found to be faster than doing 128 and 32 byte copies as - * well. - */ - for ( ; n >= 256; n -= 256) { - rte_mov256((uint8_t *)dst, (const uint8_t *)src); - dst = (uint8_t *)dst + 256; - src = (const uint8_t *)src + 256; - } - - /* - * We split the remaining bytes (which will be less than 256) into - * 64byte (2^6) chunks. - * Using incrementing integers in the case labels of a switch statement - * enourages the compiler to use a jump table. To get incrementing - * integers, we shift the 2 relevant bits to the LSB position to first - * get decrementing integers, and then subtract. 
- */ - switch (3 - (n >> 6)) { - case 0x00: - rte_mov64((uint8_t *)dst, (const uint8_t *)src); - n -= 64; - dst = (uint8_t *)dst + 64; - src = (const uint8_t *)src + 64; /* fallthrough */ - case 0x01: - rte_mov64((uint8_t *)dst, (const uint8_t *)src); - n -= 64; - dst = (uint8_t *)dst + 64; - src = (const uint8_t *)src + 64; /* fallthrough */ - case 0x02: - rte_mov64((uint8_t *)dst, (const uint8_t *)src); - n -= 64; - dst = (uint8_t *)dst + 64; - src = (const uint8_t *)src + 64; /* fallthrough */ - default: - break; - } - - /* - * We split the remaining bytes (which will be less than 64) into - * 16byte (2^4) chunks, using the same switch structure as above. - */ - switch (3 - (n >> 4)) { - case 0x00: - rte_mov16((uint8_t *)dst, (const uint8_t *)src); - n -= 16; - dst = (uint8_t *)dst + 16; - src = (const uint8_t *)src + 16; /* fallthrough */ - case 0x01: - rte_mov16((uint8_t *)dst, (const uint8_t *)src); - n -= 16; - dst = (uint8_t *)dst + 16; - src = (const uint8_t *)src + 16; /* fallthrough */ - case 0x02: - rte_mov16((uint8_t *)dst, (const uint8_t *)src); - n -= 16; - dst = (uint8_t *)dst + 16; - src = (const uint8_t *)src + 16; /* fallthrough */ - default: - break; - } - - /* Copy any remaining bytes, without going beyond end of buffers */ - if (n != 0) - rte_mov16((uint8_t *)dst - 16 + n, - (const uint8_t *)src - 16 + n); - return ret; -} - +#ifdef RTE_ARCH_64 +#include "rte_memcpy_64.h" #else - -static inline void -rte_mov16(uint8_t *dst, const uint8_t *src) -{ - memcpy(dst, src, 16); -} - -static inline void -rte_mov32(uint8_t *dst, const uint8_t *src) -{ - memcpy(dst, src, 32); -} - -static inline void -rte_mov48(uint8_t *dst, const uint8_t *src) -{ - memcpy(dst, src, 48); -} - -static inline void -rte_mov64(uint8_t *dst, const uint8_t *src) -{ - memcpy(dst, src, 64); -} - -static inline void -rte_mov128(uint8_t *dst, const uint8_t *src) -{ - memcpy(dst, src, 128); -} - -static inline void -rte_mov256(uint8_t *dst, const uint8_t *src) -{ - memcpy(dst, src, 
256); -} - -static inline void * -rte_memcpy(void *dst, const void *src, size_t n) -{ - return memcpy(dst, src, n); -} - -static inline void * -rte_memcpy_func(void *dst, const void *src, size_t n) -{ - return memcpy(dst, src, n); -} - -#endif /* __ARM_NEON_FP */ - -#ifdef __cplusplus -} +#include "rte_memcpy_32.h" #endif #endif /* _RTE_MEMCPY_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h new file mode 100644 index 0000000..f41648a --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h @@ -0,0 +1,334 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_MEMCPY_ARM32_H_ +#define _RTE_MEMCPY_ARM32_H_ + +#include <stdint.h> +#include <string.h> + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_memcpy.h" + +#ifdef __ARM_NEON_FP + +/* ARM NEON Intrinsics are used to copy data */ +#include <arm_neon.h> + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + vst1q_u8(dst, vld1q_u8(src)); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + asm volatile ( + "vld1.8 {d0-d3}, [%0]\n\t" + "vst1.8 {d0-d3}, [%1]\n\t" + : "+r" (src), "+r" (dst) + : : "memory", "d0", "d1", "d2", "d3"); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + asm volatile ( + "vld1.8 {d0-d3}, [%0]!\n\t" + "vld1.8 {d4-d5}, [%0]\n\t" + "vst1.8 {d0-d3}, [%1]!\n\t" + "vst1.8 {d4-d5}, [%1]\n\t" + : "+r" (src), "+r" (dst) + : + : "memory", "d0", "d1", "d2", "d3", "d4", "d5"); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + asm volatile ( + "vld1.8 {d0-d3}, [%0]!\n\t" + "vld1.8 {d4-d7}, [%0]\n\t" + "vst1.8 {d0-d3}, [%1]!\n\t" + "vst1.8 {d4-d7}, [%1]\n\t" + : "+r" (src), "+r" (dst) + : + : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7"); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + asm volatile ("pld [%0, #64]" : : "r" (src)); + asm volatile ( + "vld1.8 {d0-d3}, [%0]!\n\t" + "vld1.8 {d4-d7}, [%0]!\n\t" + "vld1.8 {d8-d11}, [%0]!\n\t" + "vld1.8 {d12-d15}, 
[%0]\n\t" + "vst1.8 {d0-d3}, [%1]!\n\t" + "vst1.8 {d4-d7}, [%1]!\n\t" + "vst1.8 {d8-d11}, [%1]!\n\t" + "vst1.8 {d12-d15}, [%1]\n\t" + : "+r" (src), "+r" (dst) + : + : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7", + "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15"); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + asm volatile ("pld [%0, #64]" : : "r" (src)); + asm volatile ("pld [%0, #128]" : : "r" (src)); + asm volatile ("pld [%0, #192]" : : "r" (src)); + asm volatile ("pld [%0, #256]" : : "r" (src)); + asm volatile ("pld [%0, #320]" : : "r" (src)); + asm volatile ("pld [%0, #384]" : : "r" (src)); + asm volatile ("pld [%0, #448]" : : "r" (src)); + asm volatile ( + "vld1.8 {d0-d3}, [%0]!\n\t" + "vld1.8 {d4-d7}, [%0]!\n\t" + "vld1.8 {d8-d11}, [%0]!\n\t" + "vld1.8 {d12-d15}, [%0]!\n\t" + "vld1.8 {d16-d19}, [%0]!\n\t" + "vld1.8 {d20-d23}, [%0]!\n\t" + "vld1.8 {d24-d27}, [%0]!\n\t" + "vld1.8 {d28-d31}, [%0]\n\t" + "vst1.8 {d0-d3}, [%1]!\n\t" + "vst1.8 {d4-d7}, [%1]!\n\t" + "vst1.8 {d8-d11}, [%1]!\n\t" + "vst1.8 {d12-d15}, [%1]!\n\t" + "vst1.8 {d16-d19}, [%1]!\n\t" + "vst1.8 {d20-d23}, [%1]!\n\t" + "vst1.8 {d24-d27}, [%1]!\n\t" + "vst1.8 {d28-d31}, [%1]!\n\t" + : "+r" (src), "+r" (dst) + : + : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7", + "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15", + "d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23", + "d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31"); +} + +#define rte_memcpy(dst, src, n) \ + ({ (__builtin_constant_p(n)) ? \ + memcpy((dst), (src), (n)) : \ + rte_memcpy_func((dst), (src), (n)); }) + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + void *ret = dst; + + /* We can't copy < 16 bytes using NEON registers so do it manually. 
*/ + if (n < 16) { + if (n & 0x01) { + *(uint8_t *)dst = *(const uint8_t *)src; + dst = (uint8_t *)dst + 1; + src = (const uint8_t *)src + 1; + } + if (n & 0x02) { + *(uint16_t *)dst = *(const uint16_t *)src; + dst = (uint16_t *)dst + 1; + src = (const uint16_t *)src + 1; + } + if (n & 0x04) { + *(uint32_t *)dst = *(const uint32_t *)src; + dst = (uint32_t *)dst + 1; + src = (const uint32_t *)src + 1; + } + if (n & 0x08) { + /* ARMv7 cannot handle unaligned access to long long + * (uint64_t). Therefore two uint32_t operations are + * used. + */ + *(uint32_t *)dst = *(const uint32_t *)src; + dst = (uint32_t *)dst + 1; + src = (const uint32_t *)src + 1; + *(uint32_t *)dst = *(const uint32_t *)src; + } + return ret; + } + + /* Special fast cases for <= 128 bytes */ + if (n <= 32) { + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; + } + + if (n <= 64) { + rte_mov32((uint8_t *)dst, (const uint8_t *)src); + rte_mov32((uint8_t *)dst - 32 + n, + (const uint8_t *)src - 32 + n); + return ret; + } + + if (n <= 128) { + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + rte_mov64((uint8_t *)dst - 64 + n, + (const uint8_t *)src - 64 + n); + return ret; + } + + /* + * For large copies > 128 bytes. This combination of 256, 64 and 16 byte + * copies was found to be faster than doing 128 and 32 byte copies as + * well. + */ + for ( ; n >= 256; n -= 256) { + rte_mov256((uint8_t *)dst, (const uint8_t *)src); + dst = (uint8_t *)dst + 256; + src = (const uint8_t *)src + 256; + } + + /* + * We split the remaining bytes (which will be less than 256) into + * 64byte (2^6) chunks. + * Using incrementing integers in the case labels of a switch statement + * encourages the compiler to use a jump table. To get incrementing + * integers, we shift the 2 relevant bits to the LSB position to first + * get decrementing integers, and then subtract. 
+ */ + switch (3 - (n >> 6)) { + case 0x00: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x01: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x02: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + default: + break; + } + + /* + * We split the remaining bytes (which will be less than 64) into + * 16byte (2^4) chunks, using the same switch structure as above. + */ + switch (3 - (n >> 4)) { + case 0x00: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x01: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x02: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + default: + break; + } + + /* Copy any remaining bytes, without going beyond end of buffers */ + if (n != 0) + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; +} + +#else + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 16); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 32); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 48); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 64); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 128); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 256); +} + +static inline void * 
+rte_memcpy(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +#endif /* __ARM_NEON_FP */ + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_MEMCPY_ARM32_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h new file mode 100644 index 0000000..6d85113 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h @@ -0,0 +1,308 @@ +/* + * BSD LICENSE + * + * Copyright (C) IBM Corporation 2014. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of IBM Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef _RTE_MEMCPY_ARM_64_H_ +#define _RTE_MEMCPY_ARM_64_H_ + +#include <stdint.h> +#include <string.h> + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_memcpy.h" + +#ifdef __ARM_NEON_FP + +/* ARM NEON Intrinsics are used to copy data */ +#include <arm_neon.h> + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP d0, d1, [%0]\n\t" + "STP d0, d1, [%1]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP d0, d1, [%0 , #32]\n\t" + "STP d0, d1, [%1 , #32]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP q0, q1, [%0 , #32]\n\t" + "STP q0, q1, [%1 , #32]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP q0, q1, [%0 , #32]\n\t" + "STP q0, q1, [%1 , #32]\n\t" + "LDP q0, q1, [%0 , #64]\n\t" + "STP q0, q1, [%1 , #64]\n\t" + "LDP q0, q1, [%0 , #96]\n\t" + "STP q0, q1, [%1 , #96]\n\t" + : : "r" (src), "r" (dst) : + ); +} + 
+static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP q0, q1, [%0 , #32]\n\t" + "STP q0, q1, [%1 , #32]\n\t" + "LDP q0, q1, [%0 , #64]\n\t" + "STP q0, q1, [%1 , #64]\n\t" + "LDP q0, q1, [%0 , #96]\n\t" + "STP q0, q1, [%1 , #96]\n\t" + "LDP q0, q1, [%0 , #128]\n\t" + "STP q0, q1, [%1 , #128]\n\t" + "LDP q0, q1, [%0 , #160]\n\t" + "STP q0, q1, [%1 , #160]\n\t" + "LDP q0, q1, [%0 , #192]\n\t" + "STP q0, q1, [%1 , #192]\n\t" + "LDP q0, q1, [%0 , #224]\n\t" + "STP q0, q1, [%1 , #224]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +#define rte_memcpy(dst, src, n) \ + ({ (__builtin_constant_p(n)) ? \ + memcpy((dst), (src), (n)) : \ + rte_memcpy_func((dst), (src), (n)); }) + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + void *ret = dst; + + /* We can't copy < 16 bytes using NEON registers so do it manually. */ + if (n < 16) { + if (n & 0x01) { + *(uint8_t *)dst = *(const uint8_t *)src; + dst = (uint8_t *)dst + 1; + src = (const uint8_t *)src + 1; + } + if (n & 0x02) { + *(uint16_t *)dst = *(const uint16_t *)src; + dst = (uint16_t *)dst + 1; + src = (const uint16_t *)src + 1; + } + if (n & 0x04) { + *(uint32_t *)dst = *(const uint32_t *)src; + dst = (uint32_t *)dst + 1; + src = (const uint32_t *)src + 1; + } + if (n & 0x08) + *(uint64_t *)dst = *(const uint64_t *)src; + return ret; + } + + /* Special fast cases for <= 128 bytes */ + if (n <= 32) { + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; + } + + if (n <= 64) { + rte_mov32((uint8_t *)dst, (const uint8_t *)src); + rte_mov32((uint8_t *)dst - 32 + n, + (const uint8_t *)src - 32 + n); + return ret; + } + + if (n <= 128) { + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + rte_mov64((uint8_t *)dst - 64 + n, + (const uint8_t *)src - 64 + n); + return ret; + } + + /* + * For large copies > 128 bytes. 
This combination of 256, 64 and 16 byte + copies was found to be faster than doing 128 and 32 byte copies as + well. + */ + for ( ; n >= 256; n -= 256) { + rte_mov256((uint8_t *)dst, (const uint8_t *)src); + dst = (uint8_t *)dst + 256; + src = (const uint8_t *)src + 256; + } + + /* + * We split the remaining bytes (which will be less than 256) into + * 64byte (2^6) chunks. + * Using incrementing integers in the case labels of a switch statement + * encourages the compiler to use a jump table. To get incrementing + * integers, we shift the 2 relevant bits to the LSB position to first + * get decrementing integers, and then subtract. + */ + switch (3 - (n >> 6)) { + case 0x00: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x01: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x02: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + default: + break; + } + + /* + * We split the remaining bytes (which will be less than 64) into + * 16byte (2^4) chunks, using the same switch structure as above. 
+ */ + switch (3 - (n >> 4)) { + case 0x00: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x01: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x02: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + default: + break; + } + + /* Copy any remaining bytes, without going beyond end of buffers */ + if (n != 0) + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; +} + +#else + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 16); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 32); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 48); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 64); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 128); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 256); +} + +static inline void * +rte_memcpy(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +#endif /* __ARM_NEON_FP */ + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_MEMCPY_ARM_64_H_ */ -- 1.9.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
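The least obvious part of `rte_memcpy_func()` in this patch is the tail handling: after the 256-byte main loop, the remainder is drained in 64-byte and then 16-byte chunks via fall-through switches whose incrementing case labels encourage a jump table, finishing with one overlapping 16-byte copy from the end of the buffers. A standalone sketch of that logic, with plain `memcpy()` standing in for the `rte_mov64()`/`rte_mov16()` NEON helpers and assuming the routine's precondition `16 <= n < 256`:

```c
#include <stdint.h>
#include <string.h>

/* Tail copy as in rte_memcpy_func(); caller guarantees 16 <= n < 256.
 * memcpy() stands in for the rte_mov64()/rte_mov16() NEON helpers. */
static void *copy_tail(void *dst, const void *src, size_t n)
{
	void *ret = dst;

	/* Drain 64-byte chunks: 3 - (n >> 6) yields the incrementing
	 * case labels 0,1,2 so the compiler can emit a jump table. */
	switch (3 - (n >> 6)) {
	case 0x00:
		memcpy(dst, src, 64);
		n -= 64; dst = (uint8_t *)dst + 64; src = (const uint8_t *)src + 64;
		/* fallthrough */
	case 0x01:
		memcpy(dst, src, 64);
		n -= 64; dst = (uint8_t *)dst + 64; src = (const uint8_t *)src + 64;
		/* fallthrough */
	case 0x02:
		memcpy(dst, src, 64);
		n -= 64; dst = (uint8_t *)dst + 64; src = (const uint8_t *)src + 64;
		/* fallthrough */
	default:
		break;
	}

	/* Same structure for the remaining (< 64) bytes, in 16-byte chunks. */
	switch (3 - (n >> 4)) {
	case 0x00:
		memcpy(dst, src, 16);
		n -= 16; dst = (uint8_t *)dst + 16; src = (const uint8_t *)src + 16;
		/* fallthrough */
	case 0x01:
		memcpy(dst, src, 16);
		n -= 16; dst = (uint8_t *)dst + 16; src = (const uint8_t *)src + 16;
		/* fallthrough */
	case 0x02:
		memcpy(dst, src, 16);
		n -= 16; dst = (uint8_t *)dst + 16; src = (const uint8_t *)src + 16;
		/* fallthrough */
	default:
		break;
	}

	/* One overlapping 16-byte copy covers the last n (< 16) bytes
	 * without running past the end of either buffer. */
	if (n != 0)
		memcpy((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n, 16);
	return ret;
}
```

The overlapping final copy re-writes up to `16 - n` already-copied bytes, which is harmless and avoids a byte-by-byte tail loop.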
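Both new headers also keep the constant-size dispatch macro: when the length is a compile-time constant, `rte_memcpy()` expands to plain `memcpy()` so the compiler can inline an optimal fixed-size copy, and only variable sizes reach the hand-written routine. A minimal sketch of the same trick (`copy_runtime`/`copy_dispatch` are stand-in names, not DPDK API; `__builtin_constant_p` is a GCC/Clang builtin):

```c
#include <string.h>

/* Runtime path; in DPDK this would be the NEON-accelerated
 * rte_memcpy_func(). memcpy() stands in here. */
static void *copy_runtime(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Compile-time-constant sizes go straight to memcpy(), which compilers
 * inline aggressively; only variable sizes pay for the function call. */
#define copy_dispatch(dst, src, n)          \
	(__builtin_constant_p(n) ?          \
		memcpy((dst), (src), (n)) : \
		copy_runtime((dst), (src), (n)))
```

Note that the macro evaluates its arguments like an expression, so the real header wraps the ternary in a GCC statement expression.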
* [dpdk-dev] [PATCH 2/5] eal: split arm rte_prefetch.h into 32-bit and 64-bit versions 2015-10-29 17:29 [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support David Hunt 2015-10-29 17:29 ` [dpdk-dev] [PATCH 1/5] eal: split arm rte_memcpy.h into 32-bit and 64-bit versions David Hunt @ 2015-10-29 17:29 ` David Hunt 2015-10-29 17:29 ` [dpdk-dev] [PATCH 3/5] eal: fix compilation for armv8 64-bit David Hunt ` (4 subsequent siblings) 6 siblings, 0 replies; 32+ messages in thread From: David Hunt @ 2015-10-29 17:29 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- .../common/include/arch/arm/rte_prefetch.h | 31 +++-------- .../common/include/arch/arm/rte_prefetch_32.h | 61 ++++++++++++++++++++++ .../common/include/arch/arm/rte_prefetch_64.h | 61 ++++++++++++++++++++++ 3 files changed, 128 insertions(+), 25 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h index 62c3991..0c6473a 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h @@ -1,7 +1,7 @@ /* * BSD LICENSE * - * Copyright(c) 2015 RehiveTech. All rights reserved. + * Copyright(c) 2015 Intel Corporation. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -13,7 +13,7 @@ * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. - * * Neither the name of RehiveTech nor the names of its + * * Neither the name of Intel Corporation nor the names of its * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. 
* @@ -33,29 +33,10 @@ #ifndef _RTE_PREFETCH_ARM_H_ #define _RTE_PREFETCH_ARM_H_ -#ifdef __cplusplus -extern "C" { -#endif - -#include "generic/rte_prefetch.h" - -static inline void rte_prefetch0(const volatile void *p) -{ - asm volatile ("pld [%0]" : : "r" (p)); -} - -static inline void rte_prefetch1(const volatile void *p) -{ - asm volatile ("pld [%0]" : : "r" (p)); -} - -static inline void rte_prefetch2(const volatile void *p) -{ - asm volatile ("pld [%0]" : : "r" (p)); -} - -#ifdef __cplusplus -} +#ifdef RTE_ARCH_64 +#include "rte_prefetch_64.h" +#else +#include "rte_prefetch_32.h" #endif #endif /* _RTE_PREFETCH_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h new file mode 100644 index 0000000..62c3991 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h @@ -0,0 +1,61 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_PREFETCH_ARM32_H_ +#define _RTE_PREFETCH_ARM32_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_prefetch.h" + +static inline void rte_prefetch0(const volatile void *p) +{ + asm volatile ("pld [%0]" : : "r" (p)); +} + +static inline void rte_prefetch1(const volatile void *p) +{ + asm volatile ("pld [%0]" : : "r" (p)); +} + +static inline void rte_prefetch2(const volatile void *p) +{ + asm volatile ("pld [%0]" : : "r" (p)); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PREFETCH_ARM32_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h new file mode 100644 index 0000000..b0d9170 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h @@ -0,0 +1,61 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2015 Intel Corporation. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. 
+ * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_PREFETCH_ARM_64_H_ +#define _RTE_PREFETCH_ARM_64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_prefetch.h" +/* May want to add PSTL1KEEP instructions for prefetch for ownership. */ +static inline void rte_prefetch0(const volatile void *p) +{ + asm volatile ("PRFM PLDL1KEEP, [%0]" : : "r" (p)); +} + +static inline void rte_prefetch1(const volatile void *p) +{ + asm volatile ("PRFM PLDL2KEEP, [%0]" : : "r" (p)); +} + +static inline void rte_prefetch2(const volatile void *p) +{ + asm volatile ("PRFM PLDL3KEEP, [%0]" : : "r" (p)); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PREFETCH_ARM_64_H_ */ -- 1.9.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH 3/5] eal: fix compilation for armv8 64-bit 2015-10-29 17:29 [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support David Hunt 2015-10-29 17:29 ` [dpdk-dev] [PATCH 1/5] eal: split arm rte_memcpy.h into 32-bit and 64-bit versions David Hunt 2015-10-29 17:29 ` [dpdk-dev] [PATCH 2/5] eal: split arm rte_prefetch.h " David Hunt @ 2015-10-29 17:29 ` David Hunt 2015-10-29 17:38 ` Jan Viktorin 2015-10-29 17:29 ` [dpdk-dev] [PATCH 4/5] mk: add support for armv8 on top of armv7 David Hunt ` (3 subsequent siblings) 6 siblings, 1 reply; 32+ messages in thread From: David Hunt @ 2015-10-29 17:29 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h index 7ce9d14..27d49c0 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h +++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h @@ -141,12 +141,21 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf, __attribute__((unused)) uint32_t subleaf, cpuid_registers_t out) { int auxv_fd; +#ifdef RTE_ARCH_64 + Elf64_auxv_t auxv; +#else Elf32_auxv_t auxv; +#endif auxv_fd = open("/proc/self/auxv", O_RDONLY); assert(auxv_fd); +#ifdef RTE_ARCH_64 + while (read(auxv_fd, &auxv, + sizeof(Elf64_auxv_t)) == sizeof(Elf64_auxv_t)) { +#else while (read(auxv_fd, &auxv, sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) { +#endif if (auxv.a_type == AT_HWCAP) out[REG_HWCAP] = auxv.a_un.a_val; else if (auxv.a_type == AT_HWCAP2) -- 1.9.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 3/5] eal: fix compilation for armv8 64-bit 2015-10-29 17:29 ` [dpdk-dev] [PATCH 3/5] eal: fix compilation for armv8 64-bit David Hunt @ 2015-10-29 17:38 ` Jan Viktorin 0 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-29 17:38 UTC (permalink / raw) To: David Hunt; +Cc: dev Hello Dave, On Thu, 29 Oct 2015 17:29:52 +0000 David Hunt <david.hunt@intel.com> wrote: > Signed-off-by: David Hunt <david.hunt@intel.com> > --- > lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h > index 7ce9d14..27d49c0 100644 > --- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h > +++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h > @@ -141,12 +141,21 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf, > __attribute__((unused)) uint32_t subleaf, cpuid_registers_t out) > { > int auxv_fd; > +#ifdef RTE_ARCH_64 > + Elf64_auxv_t auxv; > +#else > Elf32_auxv_t auxv; > +#endif > > auxv_fd = open("/proc/self/auxv", O_RDONLY); > assert(auxv_fd); > +#ifdef RTE_ARCH_64 > + while (read(auxv_fd, &auxv, > + sizeof(Elf64_auxv_t)) == sizeof(Elf64_auxv_t)) { > +#else > while (read(auxv_fd, &auxv, > sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) { > +#endif > if (auxv.a_type == AT_HWCAP) > out[REG_HWCAP] = auxv.a_un.a_val; > else if (auxv.a_type == AT_HWCAP2) I think, it might be better to do a typedef (or define) like #ifdef RTE_ARCH_64 typedef Elf64_auxv_t Elf_auxv_t; #else typedef Elf32_auxv_t Elf_auxv_t; #endif while leaving the above code almost untouched (just Elf32_auxv_t -> Elf_auxv_t). This is like spaghetti... :) Regards Jan -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH 4/5] mk: add support for armv8 on top of armv7 2015-10-29 17:29 [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support David Hunt ` (2 preceding siblings ...) 2015-10-29 17:29 ` [dpdk-dev] [PATCH 3/5] eal: fix compilation for armv8 64-bit David Hunt @ 2015-10-29 17:29 ` David Hunt 2015-10-29 17:39 ` Jan Viktorin 2015-10-29 17:42 ` Jan Viktorin 2015-10-29 17:29 ` [dpdk-dev] [PATCH 5/5] test: add checks for cpu flags on armv8 David Hunt ` (2 subsequent siblings) 6 siblings, 2 replies; 32+ messages in thread From: David Hunt @ 2015-10-29 17:29 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- MAINTAINERS | 3 +- config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++ doc/guides/rel_notes/release_2_2.rst | 7 ++-- mk/arch/arm64/rte.vars.mk | 58 ++++++++++++++++++++++++++++++ mk/machine/armv8a/rte.vars.mk | 57 +++++++++++++++++++++++++++++ 5 files changed, 177 insertions(+), 4 deletions(-) create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc create mode 100644 mk/arch/arm64/rte.vars.mk create mode 100644 mk/machine/armv8a/rte.vars.mk diff --git a/MAINTAINERS b/MAINTAINERS index a8933eb..4569f13 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -124,8 +124,9 @@ IBM POWER M: Chao Zhu <chaozhu@linux.vnet.ibm.com> F: lib/librte_eal/common/include/arch/ppc_64/ -ARM v7 +ARM M: Jan Viktorin <viktorin@rehivetech.com> +M: David Hunt <david.hunt@intel.com> F: lib/librte_eal/common/include/arch/arm/ Intel x86 diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc new file mode 100644 index 0000000..79a9533 --- /dev/null +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc @@ -0,0 +1,56 @@ +# BSD LICENSE +# +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# All rights reserved. 
+# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+# + +#include "common_linuxapp" + +CONFIG_RTE_MACHINE="armv8a" + +CONFIG_RTE_ARCH="arm64" +CONFIG_RTE_ARCH_ARM64=y +CONFIG_RTE_ARCH_64=y +CONFIG_RTE_ARCH_ARM_NEON=y + +CONFIG_RTE_TOOLCHAIN="gcc" +CONFIG_RTE_TOOLCHAIN_GCC=y + +CONFIG_RTE_IXGBE_INC_VECTOR=n +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n +CONFIG_RTE_LIBRTE_IVSHMEM=n +CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n + +CONFIG_RTE_LIBRTE_LPM=n +CONFIG_RTE_LIBRTE_ACL=n +CONFIG_RTE_LIBRTE_TABLE=n +CONFIG_RTE_LIBRTE_PIPELINE=n + +# This is used to adjust the generic arm timer to align with the cpu cycle count. +CONFIG_RTE_TIMER_MULTIPLIER=48 diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst index 43a3a3c..2b806f5 100644 --- a/doc/guides/rel_notes/release_2_2.rst +++ b/doc/guides/rel_notes/release_2_2.rst @@ -23,10 +23,11 @@ New Features * **Added vhost-user multiple queue support.** -* **Introduce ARMv7 architecture** +* **Introduce ARMv7 and ARMv8 architectures** - It is now possible to build DPDK for the ARMv7 platform and test with - virtual PMD drivers. + * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms. + * ARMv7 can be tested with virtual PMD drivers. + * ARMv8 can be tested with virtual and physicla PMD drivers. Resolved Issues diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk new file mode 100644 index 0000000..3aad712 --- /dev/null +++ b/mk/arch/arm64/rte.vars.mk @@ -0,0 +1,58 @@ +# BSD LICENSE +# +# Copyright(c) 2015 Intel Corporation. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. 
+# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +# +# arch: +# +# - define ARCH variable (overridden by cmdline or by previous +# optional define in machine .mk) +# - define CROSS variable (overridden by cmdline or previous define +# in machine .mk) +# - define CPU_CFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - define CPU_LDFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - define CPU_ASFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - may override any previously defined variable +# +# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32 +# + +ARCH ?= arm64 +# common arch dir in eal headers +ARCH_DIR := arm +CROSS ?= + +CPU_CFLAGS ?= -DRTE_CACHE_LINE_SIZE=64 +CPU_LDFLAGS ?= +CPU_ASFLAGS ?= -felf + +export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk new file mode 100644 index 0000000..b785062 --- /dev/null +++ b/mk/machine/armv8a/rte.vars.mk @@ -0,0 +1,57 @@ +# BSD LICENSE +# +# Copyright(c) 2015 Intel Corporation. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. 
+# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# +# machine: +# +# - can define ARCH variable (overridden by cmdline value) +# - can define CROSS variable (overridden by cmdline value) +# - define MACHINE_CFLAGS variable (overridden by cmdline value) +# - define MACHINE_LDFLAGS variable (overridden by cmdline value) +# - define MACHINE_ASFLAGS variable (overridden by cmdline value) +# - can define CPU_CFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_LDFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_ASFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - may override any previously defined variable +# + +# ARCH = +# CROSS = +# MACHINE_CFLAGS = +# MACHINE_LDFLAGS = +# MACHINE_ASFLAGS = +# CPU_CFLAGS = +# CPU_LDFLAGS = +# CPU_ASFLAGS = + +MACHINE_CFLAGS += -march=armv8-a -- 1.9.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 4/5] mk: add support for armv8 on top of armv7 2015-10-29 17:29 ` [dpdk-dev] [PATCH 4/5] mk: add support for armv8 on top of armv7 David Hunt @ 2015-10-29 17:39 ` Jan Viktorin 2015-10-29 17:42 ` Jan Viktorin 1 sibling, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-29 17:39 UTC (permalink / raw) To: David Hunt; +Cc: dev On Thu, 29 Oct 2015 17:29:53 +0000 David Hunt <david.hunt@intel.com> wrote: > +* **Introduce ARMv7 and ARMv8 architectures** > > - It is now possible to build DPDK for the ARMv7 platform and test with > - virtual PMD drivers. > + * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms. > + * ARMv7 can be tested with virtual PMD drivers. > + * ARMv8 can be tested with virtual and physicla PMD drivers. Typo "physical" > > > Resolved Issues > diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk > new file mode 100644 -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 4/5] mk: add support for armv8 on top of armv7 2015-10-29 17:29 ` [dpdk-dev] [PATCH 4/5] mk: add support for armv8 on top of armv7 David Hunt 2015-10-29 17:39 ` Jan Viktorin @ 2015-10-29 17:42 ` Jan Viktorin 1 sibling, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-29 17:42 UTC (permalink / raw) To: David Hunt; +Cc: dev On Thu, 29 Oct 2015 17:29:53 +0000 David Hunt <david.hunt@intel.com> wrote: > + > +CONFIG_RTE_LIBRTE_LPM=n > +CONFIG_RTE_LIBRTE_ACL=n > +CONFIG_RTE_LIBRTE_TABLE=n > +CONFIG_RTE_LIBRTE_PIPELINE=n > + > +# This is used to adjust the generic arm timer to align with the cpu cycle count. > +CONFIG_RTE_TIMER_MULTIPLIER=48 Where is the rte_cycles.h for ARMv8? Did you forget it? I could not find it in the patch set. Jan > diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst > index 43a3a3c..2b806f5 100644 > --- a/doc/guides/rel_notes/release_2_2.rst > +++ b/doc/guides/rel_notes/release_2_2.rst > @@ -23,10 +23,11 @@ New Features > > * **Added vhost-user multiple queue support.** > > -* **Introduce ARMv7 architecture** -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH 5/5] test: add checks for cpu flags on armv8 2015-10-29 17:29 [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support David Hunt ` (3 preceding siblings ...) 2015-10-29 17:29 ` [dpdk-dev] [PATCH 4/5] mk: add support for armv8 on top of armv7 David Hunt @ 2015-10-29 17:29 ` David Hunt 2015-10-29 18:27 ` [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support Thomas Monjalon 2015-10-30 0:17 ` [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support Jan Viktorin 6 siblings, 0 replies; 32+ messages in thread From: David Hunt @ 2015-10-29 17:29 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- app/test/test_cpuflags.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c index 557458f..1689048 100644 --- a/app/test/test_cpuflags.c +++ b/app/test/test_cpuflags.c @@ -1,4 +1,4 @@ -/*- +/* * BSD LICENSE * * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. @@ -115,9 +115,18 @@ test_cpuflags(void) CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP); #endif -#if defined(RTE_ARCH_ARM) +#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64) + printf("Checking for Floating Point:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_FPA); + printf("Check for NEON:\t\t"); CHECK_FOR_FLAG(RTE_CPUFLAG_NEON); + + printf("Checking for ARM32 mode:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH32); + + printf("Checking for ARM64 mode:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH64); #endif #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) -- 1.9.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support 2015-10-29 17:29 [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support David Hunt ` (4 preceding siblings ...) 2015-10-29 17:29 ` [dpdk-dev] [PATCH 5/5] test: add checks for cpu flags on armv8 David Hunt @ 2015-10-29 18:27 ` Thomas Monjalon 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin 2015-10-30 0:17 ` [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support Jan Viktorin 6 siblings, 1 reply; 32+ messages in thread From: Thomas Monjalon @ 2015-10-29 18:27 UTC (permalink / raw) To: David Hunt, Jan Viktorin; +Cc: dev Thanks David. 2015-10-29 17:29, David Hunt: > This is an updated patchset for ARMv8 that now sits on top of the previously > submitted ARMv7 code by RehiveTech. It re-uses a lot of that code, and splits > some header files into 32-bit and 64-bit versions, so uses the same arm include > directory. [...] > eal/arm: split arm rte_memcpy.h into 32 and 64 bit versions. > eal/arm: split arm rte_prefetch.h into 32 and 64 bit versions [...] > .../common/include/arch/arm/rte_memcpy.h | 302 +------------------ > .../common/include/arch/arm/rte_memcpy_32.h | 334 +++++++++++++++++++++ > .../common/include/arch/arm/rte_memcpy_64.h | 322 ++++++++++++++++++++ > .../common/include/arch/arm/rte_prefetch.h | 31 +- > .../common/include/arch/arm/rte_prefetch_32.h | 61 ++++ > .../common/include/arch/arm/rte_prefetch_64.h | 61 ++++ Jan, it would be easier to review if your patchset was creating the 32-bit versions of these files. Then David just has to add the 64-bit ones. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture 2015-10-29 18:27 ` [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support Thomas Monjalon @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 01/15] eal/arm: atomic operations for ARM Jan Viktorin ` (14 more replies) 0 siblings, 15 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: dev Hello, as Thomas M. suggested, I've made few changes to the ARMv7 code to make the ARMv8 inclusion easier. I can just say that it compiles, however, as there are no functional changes I would expect it is OK. Regards Jan --- You can pull the changes from https://github.com/RehiveTech/dpdk.git arm-support-v5 since commit 82fb702077f67585d64a07de0080e5cb6a924a72: ixgbe: support new flow director modes for X550 (2015-10-29 00:06:01 +0100) up to 285d29f6226d53c8af8035ebaf4c9edf635e2c56: maintainers: claim responsibility for ARMv7 (2015-10-30 01:13:26 +0100) --- Jan Viktorin (7): eal/arm: implement rdtsc by PMU or clock_gettime eal/arm: use vector memcpy only when NEON is enabled eal/arm: detect arm architecture in cpu flags eal/arm: rwlock support for ARM eal/arm: add very incomplete rte_vect gcc/arm: avoid alignment errors to break build maintainers: claim responsibility for ARMv7 Vlastimil Kosar (8): eal/arm: atomic operations for ARM eal/arm: byte order operations for ARM eal/arm: cpu cycle operations for ARM eal/arm: prefetch operations for ARM eal/arm: spinlock operations for ARM (without HTM) eal/arm: vector memcpy for ARM eal/arm: cpu flag checks for ARM mk: Introduce ARMv7 architecture MAINTAINERS | 4 + app/test/test_cpuflags.c | 5 + config/defconfig_arm-armv7a-linuxapp-gcc | 74 +++++ doc/guides/rel_notes/release_2_2.rst | 5 + .../common/include/arch/arm/rte_atomic.h | 256 ++++++++++++++++ .../common/include/arch/arm/rte_byteorder.h | 150 +++++++++ .../common/include/arch/arm/rte_cpuflags.h | 193 ++++++++++++ 
.../common/include/arch/arm/rte_cycles.h | 38 +++ .../common/include/arch/arm/rte_cycles_32.h | 121 ++++++++ .../common/include/arch/arm/rte_memcpy.h | 38 +++ .../common/include/arch/arm/rte_memcpy_32.h | 334 +++++++++++++++++++++ .../common/include/arch/arm/rte_prefetch.h | 38 +++ .../common/include/arch/arm/rte_prefetch_32.h | 61 ++++ .../common/include/arch/arm/rte_rwlock.h | 40 +++ .../common/include/arch/arm/rte_spinlock.h | 114 +++++++ lib/librte_eal/common/include/arch/arm/rte_vect.h | 84 ++++++ mk/arch/arm/rte.vars.mk | 39 +++ mk/machine/armv7-a/rte.vars.mk | 67 +++++ mk/rte.cpuflags.mk | 6 + mk/toolchain/gcc/rte.vars.mk | 6 + 20 files changed, 1673 insertions(+) create mode 100644 config/defconfig_arm-armv7a-linuxapp-gcc create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_32.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h create mode 100644 mk/arch/arm/rte.vars.mk create mode 100644 mk/machine/armv7-a/rte.vars.mk -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 01/15] eal/arm: atomic operations for ARM 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-11-02 5:53 ` Jerin Jacob 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 02/15] eal/arm: byte order " Jan Viktorin ` (13 subsequent siblings) 14 siblings, 1 reply; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: Vlastimil Kosar, dev From: Vlastimil Kosar <kosar@rehivetech.com> This patch adds architecture specific atomic operation file for ARM architecture. It utilizes compiler intrinsics only. Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- v1 -> v2: * improve rte_wmb() * use __atomic_* or __sync_*? (may affect the required GCC version) v4: * checkpatch complaints about volatile keyword (but seems to be OK to me) * checkpatch complaints about do { ... } while (0) for single statement with asm volatile (but I didn't find a way how to write it without the checkpatch complaints) * checkpatch is now happy with whitespaces --- .../common/include/arch/arm/rte_atomic.h | 256 +++++++++++++++++++++ 1 file changed, 256 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h new file mode 100644 index 0000000..ea1e485 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h @@ -0,0 +1,256 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_ATOMIC_ARM_H_ +#define _RTE_ATOMIC_ARM_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_atomic.h" + +/** + * General memory barrier. + * + * Guarantees that the LOAD and STORE operations generated before the + * barrier occur before the LOAD and STORE operations generated after. + */ +#define rte_mb() __sync_synchronize() + +/** + * Write memory barrier. + * + * Guarantees that the STORE operations generated before the barrier + * occur before the STORE operations generated after. + */ +#define rte_wmb() do { asm volatile ("dmb st" : : : "memory"); } while (0) + +/** + * Read memory barrier. + * + * Guarantees that the LOAD operations generated before the barrier + * occur before the LOAD operations generated after. 
+ */ +#define rte_rmb() __sync_synchronize() + +/*------------------------- 16 bit atomic operations -------------------------*/ + +#ifndef RTE_FORCE_INTRINSICS +static inline int +rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src) +{ + return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE, + __ATOMIC_ACQUIRE) ? 1 : 0; +} + +static inline int rte_atomic16_test_and_set(rte_atomic16_t *v) +{ + return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1); +} + +static inline void +rte_atomic16_inc(rte_atomic16_t *v) +{ + __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE); +} + +static inline void +rte_atomic16_dec(rte_atomic16_t *v) +{ + __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE); +} + +static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v) +{ + return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); +} + +static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v) +{ + return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); +} + +/*------------------------- 32 bit atomic operations -------------------------*/ + +static inline int +rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src) +{ + return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE, + __ATOMIC_ACQUIRE) ? 
1 : 0; +} + +static inline int rte_atomic32_test_and_set(rte_atomic32_t *v) +{ + return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1); +} + +static inline void +rte_atomic32_inc(rte_atomic32_t *v) +{ + __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE); +} + +static inline void +rte_atomic32_dec(rte_atomic32_t *v) +{ + __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE); +} + +static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v) +{ + return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); +} + +static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v) +{ + return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); +} + +/*------------------------- 64 bit atomic operations -------------------------*/ + +static inline int +rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src) +{ + return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE, + __ATOMIC_ACQUIRE) ? 1 : 0; +} + +static inline void +rte_atomic64_init(rte_atomic64_t *v) +{ + int success = 0; + uint64_t tmp; + + while (success == 0) { + tmp = v->cnt; + success = rte_atomic64_cmpset( + (volatile uint64_t *)&v->cnt, tmp, 0); + } +} + +static inline int64_t +rte_atomic64_read(rte_atomic64_t *v) +{ + int success = 0; + uint64_t tmp; + + while (success == 0) { + tmp = v->cnt; + /* replace the value by itself */ + success = rte_atomic64_cmpset( + (volatile uint64_t *) &v->cnt, tmp, tmp); + } + return tmp; +} + +static inline void +rte_atomic64_set(rte_atomic64_t *v, int64_t new_value) +{ + int success = 0; + uint64_t tmp; + + while (success == 0) { + tmp = v->cnt; + success = rte_atomic64_cmpset( + (volatile uint64_t *)&v->cnt, tmp, new_value); + } +} + +static inline void +rte_atomic64_add(rte_atomic64_t *v, int64_t inc) +{ + __atomic_fetch_add(&v->cnt, inc, __ATOMIC_ACQUIRE); +} + +static inline void +rte_atomic64_sub(rte_atomic64_t *v, int64_t dec) +{ + __atomic_fetch_sub(&v->cnt, dec, __ATOMIC_ACQUIRE); +} + +static inline void 
+rte_atomic64_inc(rte_atomic64_t *v) +{ + __atomic_fetch_add(&v->cnt, 1, __ATOMIC_ACQUIRE); +} + +static inline void +rte_atomic64_dec(rte_atomic64_t *v) +{ + __atomic_fetch_sub(&v->cnt, 1, __ATOMIC_ACQUIRE); +} + +static inline int64_t +rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc) +{ + return __atomic_add_fetch(&v->cnt, inc, __ATOMIC_ACQUIRE); +} + +static inline int64_t +rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec) +{ + return __atomic_sub_fetch(&v->cnt, dec, __ATOMIC_ACQUIRE); +} + +static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v) +{ + return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); +} + +static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v) +{ + return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); +} + +static inline int rte_atomic64_test_and_set(rte_atomic64_t *v) +{ + return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1); +} + +/** + * Atomically set a 64-bit counter to 0. + * + * @param v + * A pointer to the atomic counter. + */ +static inline void rte_atomic64_clear(rte_atomic64_t *v) +{ + rte_atomic64_set(v, 0); +} +#endif + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_ATOMIC_ARM_H_ */ -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v5 01/15] eal/arm: atomic operations for ARM 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 01/15] eal/arm: atomic operations for ARM Jan Viktorin @ 2015-11-02 5:53 ` Jerin Jacob 2015-11-02 13:00 ` Jan Viktorin 2015-11-02 13:10 ` Jan Viktorin 0 siblings, 2 replies; 32+ messages in thread From: Jerin Jacob @ 2015-11-02 5:53 UTC (permalink / raw) To: Jan Viktorin; +Cc: Vlastimil Kosar, dev On Fri, Oct 30, 2015 at 01:25:28AM +0100, Jan Viktorin wrote: > From: Vlastimil Kosar <kosar@rehivetech.com> > > This patch adds architecture specific atomic operation file > for ARM architecture. It utilizes compiler intrinsics only. > > Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> > Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> > --- > v1 -> v2: > * improve rte_wmb() > * use __atomic_* or __sync_*? (may affect the required GCC version) > > v4: > * checkpatch complaints about volatile keyword (but seems to be OK to me) > * checkpatch complaints about do { ... } while (0) for single statement > with asm volatile (but I didn't find a way how to write it without > the checkpatch complaints) > * checkpatch is now happy with whitespaces > --- > .../common/include/arch/arm/rte_atomic.h | 256 +++++++++++++++++++++ > 1 file changed, 256 insertions(+) > create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h > new file mode 100644 > index 0000000..ea1e485 > --- /dev/null > +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h > @@ -0,0 +1,256 @@ > +/*- > + * BSD LICENSE > + * > + * Copyright(c) 2015 RehiveTech. All rights reserved. 
> + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in > + * the documentation and/or other materials provided with the > + * distribution. > + * * Neither the name of RehiveTech nor the names of its > + * contributors may be used to endorse or promote products derived > + * from this software without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > + */ > + > +#ifndef _RTE_ATOMIC_ARM_H_ > +#define _RTE_ATOMIC_ARM_H_ > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +#include "generic/rte_atomic.h" > + > +/** > + * General memory barrier. > + * > + * Guarantees that the LOAD and STORE operations generated before the > + * barrier occur before the LOAD and STORE operations generated after. 
> + */ > +#define rte_mb() __sync_synchronize() > + > +/** > + * Write memory barrier. > + * > + * Guarantees that the STORE operations generated before the barrier > + * occur before the STORE operations generated after. > + */ > +#define rte_wmb() do { asm volatile ("dmb st" : : : "memory"); } while (0) > + > +/** > + * Read memory barrier. > + * > + * Guarantees that the LOAD operations generated before the barrier > + * occur before the LOAD operations generated after. > + */ > +#define rte_rmb() __sync_synchronize() > + #define dmb(opt) asm volatile("dmb " #opt : : : "memory") static inline void rte_mb(void) { dmb(ish); } static inline void rte_wmb(void) { dmb(ishst); } static inline void rte_rmb(void) { dmb(ishld); } For armv8, it makes sense to have the above definition for rte_*mb(). If it doesn't make sense for armv7 then we need to split this file into rte_atomic_32/64.h > +/*------------------------- 16 bit atomic operations -------------------------*/ > + > +#ifndef RTE_FORCE_INTRINSICS > +static inline int > +rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src) > +{ > + return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE, > + __ATOMIC_ACQUIRE) ? 1 : 0; > +} IMO, it should be __ATOMIC_SEQ_CST instead of __ATOMIC_ACQUIRE. __ATOMIC_ACQUIRE works in conjunction with __ATOMIC_RELEASE. AFAIK, the DPDK atomic API expects a full barrier. The C11 memory model is not yet used. So why can't we use the RTE_FORCE_INTRINSICS-based generic implementation? The same holds true for the spinlock implementation too (i.e., using RTE_FORCE_INTRINSICS). Am I missing something here?
> + > +static inline int rte_atomic16_test_and_set(rte_atomic16_t *v) > +{ > + return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1); > +} > + > +static inline void > +rte_atomic16_inc(rte_atomic16_t *v) > +{ > + __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE); > +} > + > +static inline void > +rte_atomic16_dec(rte_atomic16_t *v) > +{ > + __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE); > +} > + > +static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v) > +{ > + return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); > +} > + > +static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v) > +{ > + return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); > +} > + > +/*------------------------- 32 bit atomic operations -------------------------*/ > + > +static inline int > +rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src) > +{ > + return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE, > + __ATOMIC_ACQUIRE) ? 1 : 0; > +} > + > +static inline int rte_atomic32_test_and_set(rte_atomic32_t *v) > +{ > + return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1); > +} > + > +static inline void > +rte_atomic32_inc(rte_atomic32_t *v) > +{ > + __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE); > +} > + > +static inline void > +rte_atomic32_dec(rte_atomic32_t *v) > +{ > + __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE); > +} > + > +static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v) > +{ > + return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); > +} > + > +static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v) > +{ > + return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); > +} > + > +/*------------------------- 64 bit atomic operations -------------------------*/ > + > +static inline int > +rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src) > +{ > + return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE, > + __ATOMIC_ACQUIRE) ? 
1 : 0; > +} > + > +static inline void > +rte_atomic64_init(rte_atomic64_t *v) > +{ > + int success = 0; > + uint64_t tmp; > + > + while (success == 0) { > + tmp = v->cnt; > + success = rte_atomic64_cmpset( > + (volatile uint64_t *)&v->cnt, tmp, 0); > + } > +} > + > +static inline int64_t > +rte_atomic64_read(rte_atomic64_t *v) > +{ > + int success = 0; > + uint64_t tmp; > + > + while (success == 0) { > + tmp = v->cnt; > + /* replace the value by itself */ > + success = rte_atomic64_cmpset( > + (volatile uint64_t *) &v->cnt, tmp, tmp); > + } > + return tmp; > +} This will be overkill for arm64. The generic implementation has an __LP64__-based check for 64-bit platforms. > + > +static inline void > +rte_atomic64_set(rte_atomic64_t *v, int64_t new_value) > +{ > + int success = 0; > + uint64_t tmp; > + > + while (success == 0) { > + tmp = v->cnt; > + success = rte_atomic64_cmpset( > + (volatile uint64_t *)&v->cnt, tmp, new_value); > + } > +} > + > +static inline void > +rte_atomic64_add(rte_atomic64_t *v, int64_t inc) > +{ > + __atomic_fetch_add(&v->cnt, inc, __ATOMIC_ACQUIRE); > +} > + > +static inline void > +rte_atomic64_sub(rte_atomic64_t *v, int64_t dec) > +{ > + __atomic_fetch_sub(&v->cnt, dec, __ATOMIC_ACQUIRE); > +} > + __atomic_fetch_* operations on 64-bit values work only when the compiler supports them (__GCC_ATOMIC_LLONG_LOCK_FREE >= 2). If the DPDK APIs expect a full barrier rather than the C11-memory-model-based __ATOMIC_ACQUIRE, then it is better to use the generic implementation.
> +static inline void > +rte_atomic64_inc(rte_atomic64_t *v) > +{ > + __atomic_fetch_add(&v->cnt, 1, __ATOMIC_ACQUIRE); > +} > + > +static inline void > +rte_atomic64_dec(rte_atomic64_t *v) > +{ > + __atomic_fetch_sub(&v->cnt, 1, __ATOMIC_ACQUIRE); > +} > + > +static inline int64_t > +rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc) > +{ > + return __atomic_add_fetch(&v->cnt, inc, __ATOMIC_ACQUIRE); > +} > + > +static inline int64_t > +rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec) > +{ > + return __atomic_sub_fetch(&v->cnt, dec, __ATOMIC_ACQUIRE); > +} > + > +static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v) > +{ > + return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); > +} > + > +static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v) > +{ > + return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0); > +} > + > +static inline int rte_atomic64_test_and_set(rte_atomic64_t *v) > +{ > + return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1); > +} > + > +/** > + * Atomically set a 64-bit counter to 0. > + * > + * @param v > + * A pointer to the atomic counter. > + */ > +static inline void rte_atomic64_clear(rte_atomic64_t *v) > +{ > + rte_atomic64_set(v, 0); > +} > +#endif > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* _RTE_ATOMIC_ARM_H_ */ > -- > 2.6.1 > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v5 01/15] eal/arm: atomic operations for ARM 2015-11-02 5:53 ` Jerin Jacob @ 2015-11-02 13:00 ` Jan Viktorin 2015-11-02 13:10 ` Jan Viktorin 1 sibling, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-11-02 13:00 UTC (permalink / raw) To: Jerin Jacob; +Cc: Vlastimil Kosar, dev On Mon, 2 Nov 2015 11:23:05 +0530 Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote: --snip-- > > +/*------------------------- 16 bit atomic operations -------------------------*/ > > + > > +#ifndef RTE_FORCE_INTRINSICS > > +static inline int > > +rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src) > > +{ > > + return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE, > > + __ATOMIC_ACQUIRE) ? 1 : 0; > > +} > > IMO, it should be __ATOMIC_SEQ_CST be instead of __ATOMIC_ACQUIRE. > __ATOMIC_ACQUIRE works in conjunction with __ATOMIC_RELEASE. > AFAIK, DPDK atomic api expects full barrier. C11 memory model not yet > used. Seems to be reasonable, thanks. > So why can't we use RTE_FORCE_INTRINSICS based generic > implementation. Same holds true for spinlock implementation too(i.e using > RTE_FORCE_INTRINSICS). Am I missing something here ? True. This was done with the intention to rewrite as a platform-specific assembly. But it's never been done yet... If you mean to set RTE_FORCE_INTRINSICS=y in the defconfig and remove this code entirely (at least for ARMv7), I would agree. > > > > > + > > +static inline int rte_atomic16_test_and_set(rte_atomic16_t *v) > > +{ > > + return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1); > > +} --snip-- -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 32+ messages in thread
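For reference, enabling the intrinsics-based generic implementation as discussed would presumably amount to a single entry in the ARMv7 defconfig (the option itself already exists in DPDK's configuration system):

```
# Use the generic __atomic/__sync-based implementations from
# generic/rte_atomic.h instead of the hand-written ARM file
CONFIG_RTE_FORCE_INTRINSICS=y
```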
* Re: [dpdk-dev] [PATCH v5 01/15] eal/arm: atomic operations for ARM 2015-11-02 5:53 ` Jerin Jacob 2015-11-02 13:00 ` Jan Viktorin @ 2015-11-02 13:10 ` Jan Viktorin 1 sibling, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-11-02 13:10 UTC (permalink / raw) To: Jerin Jacob; +Cc: Vlastimil Kosar, dev On Mon, 2 Nov 2015 11:23:05 +0530 Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote: --snip-- > > +#ifndef _RTE_ATOMIC_ARM_H_ > > +#define _RTE_ATOMIC_ARM_H_ > > + > > +#ifdef __cplusplus > > +extern "C" { > > +#endif > > + > > +#include "generic/rte_atomic.h" > > + > > +/** > > + * General memory barrier. > > + * > > + * Guarantees that the LOAD and STORE operations generated before the > > + * barrier occur before the LOAD and STORE operations generated after. > > + */ > > +#define rte_mb() __sync_synchronize() > > + > > +/** > > + * Write memory barrier. > > + * > > + * Guarantees that the STORE operations generated before the barrier > > + * occur before the STORE operations generated after. > > + */ > > +#define rte_wmb() do { asm volatile ("dmb st" : : : "memory"); } while (0) > > + > > +/** > > + * Read memory barrier. > > + * > > + * Guarantees that the LOAD operations generated before the barrier > > + * occur before the LOAD operations generated after. > > + */ > > +#define rte_rmb() __sync_synchronize() > > + > > #define dmb(opt) asm volatile("dmb " #opt : : : "memory") > > static inline void rte_mb(void) > { > dmb(ish); > } > > static inline void rte_wmb(void) > { > dmb(ishst); > } > > static inline void rte_rmb(void) > { > dmb(ishld); I cannot see this option in the doc for ARMv7 (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0588b/CIHGHHIE.html). > } > > For armv8, it makes sense to have the above definition for rte_*mb(). If it is OK to restrict the barriers to the inner shareable domain then OK. Quite frankly, I don't know.
> If it doesn't make sense for armv7 then we need to split this file into rte_atomic_32/64.h > > -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 02/15] eal/arm: byte order operations for ARM 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 01/15] eal/arm: atomic operations for ARM Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 03/15] eal/arm: cpu cycle " Jan Viktorin ` (12 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: Vlastimil Kosar, dev From: Vlastimil Kosar <kosar@rehivetech.com> This patch adds architecture specific byte order operations for ARM. The architecture supports both big and little endian. Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- v4: fix passing params to asm volatile for checkpatch --- .../common/include/arch/arm/rte_byteorder.h | 150 +++++++++++++++++++++ 1 file changed, 150 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h new file mode 100644 index 0000000..5776997 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h @@ -0,0 +1,150 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. 
+ * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_BYTEORDER_ARM_H_ +#define _RTE_BYTEORDER_ARM_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_byteorder.h" + +/* + * An architecture-optimized byte swap for a 16-bit value. + * + * Do not use this function directly. The preferred function is rte_bswap16(). + */ +static inline uint16_t rte_arch_bswap16(uint16_t _x) +{ + register uint16_t x = _x; + + asm volatile ("rev16 %0,%1" + : "=r" (x) + : "r" (x) + ); + return x; +} + +/* + * An architecture-optimized byte swap for a 32-bit value. + * + * Do not use this function directly. The preferred function is rte_bswap32(). + */ +static inline uint32_t rte_arch_bswap32(uint32_t _x) +{ + register uint32_t x = _x; + + asm volatile ("rev %0,%1" + : "=r" (x) + : "r" (x) + ); + return x; +} + +/* + * An architecture-optimized byte swap for a 64-bit value. + * + * Do not use this function directly. The preferred function is rte_bswap64(). 
+ */ +/* 64-bit mode */ +static inline uint64_t rte_arch_bswap64(uint64_t _x) +{ + return __builtin_bswap64(_x); +} + +#ifndef RTE_FORCE_INTRINSICS +#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ? \ + rte_constant_bswap16(x) : \ + rte_arch_bswap16(x))) + +#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ? \ + rte_constant_bswap32(x) : \ + rte_arch_bswap32(x))) + +#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ? \ + rte_constant_bswap64(x) : \ + rte_arch_bswap64(x))) +#else +/* + * __builtin_bswap16 is only available gcc 4.8 and upwards + */ +#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8) +#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ? \ + rte_constant_bswap16(x) : \ + rte_arch_bswap16(x))) +#endif +#endif + +/* ARM architecture is bi-endian (both big and little). */ +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN + +#define rte_cpu_to_le_16(x) (x) +#define rte_cpu_to_le_32(x) (x) +#define rte_cpu_to_le_64(x) (x) + +#define rte_cpu_to_be_16(x) rte_bswap16(x) +#define rte_cpu_to_be_32(x) rte_bswap32(x) +#define rte_cpu_to_be_64(x) rte_bswap64(x) + +#define rte_le_to_cpu_16(x) (x) +#define rte_le_to_cpu_32(x) (x) +#define rte_le_to_cpu_64(x) (x) + +#define rte_be_to_cpu_16(x) rte_bswap16(x) +#define rte_be_to_cpu_32(x) rte_bswap32(x) +#define rte_be_to_cpu_64(x) rte_bswap64(x) + +#else /* RTE_BIG_ENDIAN */ + +#define rte_cpu_to_le_16(x) rte_bswap16(x) +#define rte_cpu_to_le_32(x) rte_bswap32(x) +#define rte_cpu_to_le_64(x) rte_bswap64(x) + +#define rte_cpu_to_be_16(x) (x) +#define rte_cpu_to_be_32(x) (x) +#define rte_cpu_to_be_64(x) (x) + +#define rte_le_to_cpu_16(x) rte_bswap16(x) +#define rte_le_to_cpu_32(x) rte_bswap32(x) +#define rte_le_to_cpu_64(x) rte_bswap64(x) + +#define rte_be_to_cpu_16(x) (x) +#define rte_be_to_cpu_32(x) (x) +#define rte_be_to_cpu_64(x) (x) +#endif + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_BYTEORDER_ARM_H_ */ -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 03/15] eal/arm: cpu cycle operations for ARM 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 01/15] eal/arm: atomic operations for ARM Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 02/15] eal/arm: byte order " Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 04/15] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin ` (11 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: Vlastimil Kosar, dev From: Vlastimil Kosar <kosar@rehivetech.com> ARM architecture doesn't have a suitable source of CPU cycles. This patch uses clock_gettime instead. The implementation should be improved in the future. Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- v5: prepare for applying ARMv8 --- .../common/include/arch/arm/rte_cycles.h | 38 ++++++++++ .../common/include/arch/arm/rte_cycles_32.h | 85 ++++++++++++++++++++++ 2 files changed, 123 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_32.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h new file mode 100644 index 0000000..b2372fa --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h @@ -0,0 +1,38 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_CYCLES_ARM_H_ +#define _RTE_CYCLES_ARM_H_ + +#include <rte_cycles_32.h> + +#endif /* _RTE_CYCLES_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_32.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_32.h new file mode 100644 index 0000000..755cc4a --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_32.h @@ -0,0 +1,85 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_CYCLES_ARM32_H_ +#define _RTE_CYCLES_ARM32_H_ + +/* ARM v7 does not have a suitable source of clock signals. The only clock counter + available in the core is 32 bits wide. Therefore it is unsuitable, as the + counter overflows every few seconds and probably is not accessible by + userspace programs. Therefore we use clock_gettime(CLOCK_MONOTONIC_RAW) to + simulate a counter running at 1 GHz. +*/ + +#include <time.h> + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_cycles.h" + +/** + * Read the time base register. + * + * @return + * The time base for this lcore.
+ */ +static inline uint64_t +rte_rdtsc(void) +{ + struct timespec val; + uint64_t v; + + while (clock_gettime(CLOCK_MONOTONIC_RAW, &val) != 0) + /* no body */; + + v = (uint64_t) val.tv_sec * 1000000000LL; + v += (uint64_t) val.tv_nsec; + return v; +} + +static inline uint64_t +rte_rdtsc_precise(void) +{ + rte_mb(); + return rte_rdtsc(); +} + +static inline uint64_t +rte_get_tsc_cycles(void) { return rte_rdtsc(); } + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_CYCLES_ARM32_H_ */ -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 04/15] eal/arm: implement rdtsc by PMU or clock_gettime 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (2 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 03/15] eal/arm: cpu cycle " Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 05/15] eal/arm: prefetch operations for ARM Jan Viktorin ` (10 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: dev Enable choosing the preferred way to read the timer, based on the configuration entry CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU. The PMU-based way requires a kernel module, which is not included, to work. Based on the patch by David Hunt and Amruta Zende: lib: added support for armv7 architecture Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> Signed-off-by: Amruta Zende <amruta.zende@intel.com> Signed-off-by: David Hunt <david.hunt@intel.com> --- .../common/include/arch/arm/rte_cycles_32.h | 38 +++++++++++++++++++++- 1 file changed, 37 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_32.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_32.h index 755cc4a..6c6098e 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_32.h +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_32.h @@ -54,8 +54,14 @@ extern "C" { * @return * The time base for this lcore. */ +#ifndef RTE_ARM_EAL_RDTSC_USE_PMU + +/** + * This call is easily portable to any ARM architecture, however, + * it may be damn slow and imprecise for some tasks.
+ */ static inline uint64_t -rte_rdtsc(void) +__rte_rdtsc_syscall(void) { struct timespec val; uint64_t v; @@ -67,6 +73,36 @@ rte_rdtsc(void) v += (uint64_t) val.tv_nsec; return v; } +#define rte_rdtsc __rte_rdtsc_syscall + +#else + +/** + * This function requires configuring the PMCCNTR and enabling + * userspace access to it: + * + * asm volatile("mcr p15, 0, %0, c9, c14, 0" : : "r"(1)); + * asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(29)); + * asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(0x8000000f)); + * + * which is possible only from privileged mode (kernel space). + */ +static inline uint64_t +__rte_rdtsc_pmccntr(void) +{ + unsigned tsc; + uint64_t final_tsc; + + /* Read PMCCNTR */ + asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(tsc)); + /* 1 tick = 64 clocks */ + final_tsc = ((uint64_t)tsc) << 6; + + return (uint64_t)final_tsc; +} +#define rte_rdtsc __rte_rdtsc_pmccntr + +#endif /* RTE_ARM_EAL_RDTSC_USE_PMU */ static inline uint64_t rte_rdtsc_precise(void) -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 05/15] eal/arm: prefetch operations for ARM 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (3 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 04/15] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 06/15] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin ` (9 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: Vlastimil Kosar, dev From: Vlastimil Kosar <kosar@rehivetech.com> This patch adds architecture-specific prefetch operations for the ARM architecture. It utilizes the pld instruction that starts filling the appropriate cache line without blocking. Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- v4: * checkpatch does not like the syntax of naming params to asm volatile; switched to %0, %1 syntax * checkpatch complains about volatile (seems to be OK for me) v5: prepare for applying ARMv8 --- .../common/include/arch/arm/rte_prefetch.h | 38 ++++++++++++++ .../common/include/arch/arm/rte_prefetch_32.h | 61 ++++++++++++++++++++++ 2 files changed, 99 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h new file mode 100644 index 0000000..1f46697 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h @@ -0,0 +1,38 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved.
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_PREFETCH_ARM_H_ +#define _RTE_PREFETCH_ARM_H_ + +#include <rte_prefetch_32.h> + +#endif /* _RTE_PREFETCH_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h new file mode 100644 index 0000000..b716384 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h @@ -0,0 +1,61 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#ifndef _RTE_PREFETCH_ARM32_H_ +#define _RTE_PREFETCH_ARM32_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_prefetch.h" + +static inline void rte_prefetch0(const volatile void *p) +{ + asm volatile ("pld [%0]" : : "r" (p)); +} + +static inline void rte_prefetch1(const volatile void *p) +{ + asm volatile ("pld [%0]" : : "r" (p)); +} + +static inline void rte_prefetch2(const volatile void *p) +{ + asm volatile ("pld [%0]" : : "r" (p)); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PREFETCH_ARM32_H_ */ -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 06/15] eal/arm: spinlock operations for ARM (without HTM) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (4 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 05/15] eal/arm: prefetch operations for ARM Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 07/15] eal/arm: vector memcpy for ARM Jan Viktorin ` (8 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: Vlastimil Kosar, dev From: Vlastimil Kosar <kosar@rehivetech.com> This patch adds spinlock operations for ARM architecture. We do not support HTM in spinlocks on ARM. Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- .../common/include/arch/arm/rte_spinlock.h | 114 +++++++++++++++++++++ 1 file changed, 114 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h new file mode 100644 index 0000000..cd5ab8b --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h @@ -0,0 +1,114 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. 
+ * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_SPINLOCK_ARM_H_ +#define _RTE_SPINLOCK_ARM_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include <rte_common.h> +#include "generic/rte_spinlock.h" + +/* Intrinsics are used to implement the spinlock on ARM architecture */ + +#ifndef RTE_FORCE_INTRINSICS + +static inline void +rte_spinlock_lock(rte_spinlock_t *sl) +{ + while (__sync_lock_test_and_set(&sl->locked, 1)) + while (sl->locked) + rte_pause(); +} + +static inline void +rte_spinlock_unlock(rte_spinlock_t *sl) +{ + __sync_lock_release(&sl->locked); +} + +static inline int +rte_spinlock_trylock(rte_spinlock_t *sl) +{ + return (__sync_lock_test_and_set(&sl->locked, 1) == 0); +} + +#endif + +static inline int rte_tm_supported(void) +{ + return 0; +} + +static inline void +rte_spinlock_lock_tm(rte_spinlock_t *sl) +{ + rte_spinlock_lock(sl); /* fall-back */ +} + +static inline int +rte_spinlock_trylock_tm(rte_spinlock_t *sl) +{ + return rte_spinlock_trylock(sl); +} + +static inline void 
+rte_spinlock_unlock_tm(rte_spinlock_t *sl) +{ + rte_spinlock_unlock(sl); +} + +static inline void +rte_spinlock_recursive_lock_tm(rte_spinlock_recursive_t *slr) +{ + rte_spinlock_recursive_lock(slr); /* fall-back */ +} + +static inline void +rte_spinlock_recursive_unlock_tm(rte_spinlock_recursive_t *slr) +{ + rte_spinlock_recursive_unlock(slr); +} + +static inline int +rte_spinlock_recursive_trylock_tm(rte_spinlock_recursive_t *slr) +{ + return rte_spinlock_recursive_trylock(slr); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_SPINLOCK_ARM_H_ */ -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 07/15] eal/arm: vector memcpy for ARM 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (5 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 06/15] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 08/15] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin ` (7 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: Vlastimil Kosar, dev From: Vlastimil Kosar <kosar@rehivetech.com> The SSE-based memory copy in DPDK only supports x86. This patch adds ARM NEON based memory copy functions for the ARM architecture. The implementation improves copies of short or well-aligned data buffers. The following measurements show the improvement over libc memcpy on Cortex CPUs, in percent faster:

Length (B)    a15     a7     a9
1             4.9    15.2    3.2
7            56.9    48.2   40.3
8            37.3    39.8   29.6
9            69.3    38.7   33.9
15           60.8    35.3   23.7
16           50.6    35.9   35.0
17           57.7    35.7   31.1
31           16.0    23.3    9.0
32           65.9    13.5   21.4
33            3.9    10.3   -3.7
63            2.0    12.9   -2.0
64           66.5     0.0   16.5
65            2.7     7.6  -35.6
127           0.1     4.5  -18.9
128          66.2     1.5  -51.4
129          -0.8     3.2  -35.8
255          -3.1    -0.9  -69.1
256          67.9     1.2    7.2
257          -3.6    -1.9  -36.9
320          67.7     1.4    0.0
384          66.8     1.4  -14.2
511         -44.9    -2.3  -41.9
512          67.3     1.4   -6.8
513         -41.7    -3.0  -36.2
1023        -82.4    -2.8  -41.2
1024         68.3     1.4  -11.6
1025        -80.1    -3.3  -38.1
1518        -47.3    -5.0  -38.3
1522        -48.3    -6.0  -37.9
1600         65.4     1.3  -27.3
2048         59.5     1.5  -10.9
3072         52.3     1.5  -12.2
4096         45.3     1.4  -12.5
5120         40.6     1.5  -14.5
6144         35.4     1.4  -13.4
7168         32.9     1.4  -13.9
8192         28.2     1.4  -15.1

Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- v4: * fix whitespace issues reported by checkpatch * fix passing params to asm volatile for checkpatch v5: prepare for applying ARMv8 --- .../common/include/arch/arm/rte_memcpy.h |
38 +++ .../common/include/arch/arm/rte_memcpy_32.h | 279 +++++++++++++++++++++ 2 files changed, 317 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h new file mode 100644 index 0000000..d9f5bf1 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h @@ -0,0 +1,38 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_MEMCPY_ARM_H_ +#define _RTE_MEMCPY_ARM_H_ + +#include <rte_memcpy_32.h> + +#endif /* _RTE_MEMCPY_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h new file mode 100644 index 0000000..11f8241 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h @@ -0,0 +1,279 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_MEMCPY_ARM32_H_ +#define _RTE_MEMCPY_ARM32_H_ + +#include <stdint.h> +#include <string.h> +/* ARM NEON Intrinsics are used to copy data */ +#include <arm_neon.h> + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_memcpy.h" + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + vst1q_u8(dst, vld1q_u8(src)); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + asm volatile ( + "vld1.8 {d0-d3}, [%0]\n\t" + "vst1.8 {d0-d3}, [%1]\n\t" + : "+r" (src), "+r" (dst) + : : "memory", "d0", "d1", "d2", "d3"); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + asm volatile ( + "vld1.8 {d0-d3}, [%0]!\n\t" + "vld1.8 {d4-d5}, [%0]\n\t" + "vst1.8 {d0-d3}, [%1]!\n\t" + "vst1.8 {d4-d5}, [%1]\n\t" + : "+r" (src), "+r" (dst) + : + : "memory", "d0", "d1", "d2", "d3", "d4", "d5"); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + asm volatile ( + "vld1.8 {d0-d3}, [%0]!\n\t" + "vld1.8 {d4-d7}, [%0]\n\t" + "vst1.8 {d0-d3}, [%1]!\n\t" + "vst1.8 {d4-d7}, [%1]\n\t" + : "+r" (src), "+r" (dst) + : + : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7"); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + asm volatile ("pld [%0, #64]" : : "r" (src)); + asm volatile ( + "vld1.8 {d0-d3}, [%0]!\n\t" + "vld1.8 {d4-d7}, [%0]!\n\t" + "vld1.8 {d8-d11}, [%0]!\n\t" + "vld1.8 {d12-d15}, [%0]\n\t" + "vst1.8 {d0-d3}, 
[%1]!\n\t" + "vst1.8 {d4-d7}, [%1]!\n\t" + "vst1.8 {d8-d11}, [%1]!\n\t" + "vst1.8 {d12-d15}, [%1]\n\t" + : "+r" (src), "+r" (dst) + : + : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7", + "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15"); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + asm volatile ("pld [%0, #64]" : : "r" (src)); + asm volatile ("pld [%0, #128]" : : "r" (src)); + asm volatile ("pld [%0, #192]" : : "r" (src)); + asm volatile ("pld [%0, #256]" : : "r" (src)); + asm volatile ("pld [%0, #320]" : : "r" (src)); + asm volatile ("pld [%0, #384]" : : "r" (src)); + asm volatile ("pld [%0, #448]" : : "r" (src)); + asm volatile ( + "vld1.8 {d0-d3}, [%0]!\n\t" + "vld1.8 {d4-d7}, [%0]!\n\t" + "vld1.8 {d8-d11}, [%0]!\n\t" + "vld1.8 {d12-d15}, [%0]!\n\t" + "vld1.8 {d16-d19}, [%0]!\n\t" + "vld1.8 {d20-d23}, [%0]!\n\t" + "vld1.8 {d24-d27}, [%0]!\n\t" + "vld1.8 {d28-d31}, [%0]\n\t" + "vst1.8 {d0-d3}, [%1]!\n\t" + "vst1.8 {d4-d7}, [%1]!\n\t" + "vst1.8 {d8-d11}, [%1]!\n\t" + "vst1.8 {d12-d15}, [%1]!\n\t" + "vst1.8 {d16-d19}, [%1]!\n\t" + "vst1.8 {d20-d23}, [%1]!\n\t" + "vst1.8 {d24-d27}, [%1]!\n\t" + "vst1.8 {d28-d31}, [%1]!\n\t" + : "+r" (src), "+r" (dst) + : + : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7", + "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15", + "d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23", + "d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31"); +} + +#define rte_memcpy(dst, src, n) \ + ({ (__builtin_constant_p(n)) ? \ + memcpy((dst), (src), (n)) : \ + rte_memcpy_func((dst), (src), (n)); }) + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + void *ret = dst; + + /* We can't copy < 16 bytes using NEON registers so do it manually.
*/ + if (n < 16) { + if (n & 0x01) { + *(uint8_t *)dst = *(const uint8_t *)src; + dst = (uint8_t *)dst + 1; + src = (const uint8_t *)src + 1; + } + if (n & 0x02) { + *(uint16_t *)dst = *(const uint16_t *)src; + dst = (uint16_t *)dst + 1; + src = (const uint16_t *)src + 1; + } + if (n & 0x04) { + *(uint32_t *)dst = *(const uint32_t *)src; + dst = (uint32_t *)dst + 1; + src = (const uint32_t *)src + 1; + } + if (n & 0x08) { + /* ARMv7 cannot handle unaligned access to long long + * (uint64_t). Therefore two uint32_t operations are + * used. + */ + *(uint32_t *)dst = *(const uint32_t *)src; + dst = (uint32_t *)dst + 1; + src = (const uint32_t *)src + 1; + *(uint32_t *)dst = *(const uint32_t *)src; + } + return ret; + } + + /* Special fast cases for <= 128 bytes */ + if (n <= 32) { + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; + } + + if (n <= 64) { + rte_mov32((uint8_t *)dst, (const uint8_t *)src); + rte_mov32((uint8_t *)dst - 32 + n, + (const uint8_t *)src - 32 + n); + return ret; + } + + if (n <= 128) { + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + rte_mov64((uint8_t *)dst - 64 + n, + (const uint8_t *)src - 64 + n); + return ret; + } + + /* + * For large copies > 128 bytes. This combination of 256, 64 and 16 byte + * copies was found to be faster than doing 128 and 32 byte copies as + * well. + */ + for ( ; n >= 256; n -= 256) { + rte_mov256((uint8_t *)dst, (const uint8_t *)src); + dst = (uint8_t *)dst + 256; + src = (const uint8_t *)src + 256; + } + + /* + * We split the remaining bytes (which will be less than 256) into + * 64byte (2^6) chunks. + * Using incrementing integers in the case labels of a switch statement + * encourages the compiler to use a jump table. To get incrementing + * integers, we shift the 2 relevant bits to the LSB position to first + * get decrementing integers, and then subtract.
+ */ + switch (3 - (n >> 6)) { + case 0x00: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x01: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x02: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + default: + break; + } + + /* + * We split the remaining bytes (which will be less than 64) into + * 16byte (2^4) chunks, using the same switch structure as above. + */ + switch (3 - (n >> 4)) { + case 0x00: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x01: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x02: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + default: + break; + } + + /* Copy any remaining bytes, without going beyond end of buffers */ + if (n != 0) + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_MEMCPY_ARM32_H_ */ -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 08/15] eal/arm: use vector memcpy only when NEON is enabled 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (6 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 07/15] eal/arm: vector memcpy for ARM Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 09/15] eal/arm: cpu flag checks for ARM Jan Viktorin ` (6 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: dev GCC can be configured to avoid using the NEON extensions. For that case, we provide a plain memcpy-based implementation of rte_memcpy. Based on the patch by David Hunt and Amruta Zende: lib: added support for armv7 architecture Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> Signed-off-by: Amruta Zende <amruta.zende@intel.com> Signed-off-by: David Hunt <david.hunt@intel.com> --- v5: prepare for applying ARMv8 --- .../common/include/arch/arm/rte_memcpy_32.h | 59 +++++++++++++++++++++- 1 file changed, 57 insertions(+), 2 deletions(-) diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h index 11f8241..df47c0d 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h @@ -35,8 +35,6 @@ #include <stdint.h> #include <string.h> -/* ARM NEON Intrinsics are used to copy data */ -#include <arm_neon.h> #ifdef __cplusplus extern "C" { @@ -44,6 +42,11 @@ extern "C" { #include "generic/rte_memcpy.h" +#ifdef __ARM_NEON_FP + +/* ARM NEON Intrinsics are used to copy data */ +#include <arm_neon.h> + static inline void rte_mov16(uint8_t *dst, const uint8_t *src) { @@ -272,6 +275,58 @@ rte_memcpy_func(void *dst, const void *src, size_t n) return ret; } +#else + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src,
16); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 32); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 48); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 64); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 128); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 256); +} + +static inline void * +rte_memcpy(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +#endif /* __ARM_NEON_FP */ + #ifdef __cplusplus } #endif -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 09/15] eal/arm: cpu flag checks for ARM 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (7 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 08/15] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 10/15] eal/arm: detect arm architecture in cpu flags Jan Viktorin ` (5 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: Vlastimil Kosar, dev From: Vlastimil Kosar <kosar@rehivetech.com> This implementation is based on IBM POWER version of rte_cpuflags. We use software emulation of HW capability registers, because those are usually not directly accessible from userspace on ARM. Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- app/test/test_cpuflags.c | 5 + .../common/include/arch/arm/rte_cpuflags.h | 177 +++++++++++++++++++++ mk/rte.cpuflags.mk | 6 + 3 files changed, 188 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c index 5b92061..557458f 100644 --- a/app/test/test_cpuflags.c +++ b/app/test/test_cpuflags.c @@ -115,6 +115,11 @@ test_cpuflags(void) CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP); #endif +#if defined(RTE_ARCH_ARM) + printf("Check for NEON:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_NEON); +#endif + #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) printf("Check for SSE:\t\t"); CHECK_FOR_FLAG(RTE_CPUFLAG_SSE); diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h new file mode 100644 index 0000000..1eadb33 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h @@ -0,0 +1,177 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 
RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#ifndef _RTE_CPUFLAGS_ARM_H_ +#define _RTE_CPUFLAGS_ARM_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include <elf.h> +#include <fcntl.h> +#include <assert.h> +#include <unistd.h> + +#include "generic/rte_cpuflags.h" + +#ifndef AT_HWCAP +#define AT_HWCAP 16 +#endif + +#ifndef AT_HWCAP2 +#define AT_HWCAP2 26 +#endif + +/* software based registers */ +enum cpu_register_t { + REG_HWCAP = 0, + REG_HWCAP2, +}; + +/** + * Enumeration of all CPU features supported + */ +enum rte_cpu_flag_t { + RTE_CPUFLAG_SWP = 0, + RTE_CPUFLAG_HALF, + RTE_CPUFLAG_THUMB, + RTE_CPUFLAG_A26BIT, + RTE_CPUFLAG_FAST_MULT, + RTE_CPUFLAG_FPA, + RTE_CPUFLAG_VFP, + RTE_CPUFLAG_EDSP, + RTE_CPUFLAG_JAVA, + RTE_CPUFLAG_IWMMXT, + RTE_CPUFLAG_CRUNCH, + RTE_CPUFLAG_THUMBEE, + RTE_CPUFLAG_NEON, + RTE_CPUFLAG_VFPv3, + RTE_CPUFLAG_VFPv3D16, + RTE_CPUFLAG_TLS, + RTE_CPUFLAG_VFPv4, + RTE_CPUFLAG_IDIVA, + RTE_CPUFLAG_IDIVT, + RTE_CPUFLAG_VFPD32, + RTE_CPUFLAG_LPAE, + RTE_CPUFLAG_EVTSTRM, + RTE_CPUFLAG_AES, + RTE_CPUFLAG_PMULL, + RTE_CPUFLAG_SHA1, + RTE_CPUFLAG_SHA2, + RTE_CPUFLAG_CRC32, + /* The last item */ + RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! 
*/ +}; + +static const struct feature_entry cpu_feature_table[] = { + FEAT_DEF(SWP, 0x00000001, 0, REG_HWCAP, 0) + FEAT_DEF(HALF, 0x00000001, 0, REG_HWCAP, 1) + FEAT_DEF(THUMB, 0x00000001, 0, REG_HWCAP, 2) + FEAT_DEF(A26BIT, 0x00000001, 0, REG_HWCAP, 3) + FEAT_DEF(FAST_MULT, 0x00000001, 0, REG_HWCAP, 4) + FEAT_DEF(FPA, 0x00000001, 0, REG_HWCAP, 5) + FEAT_DEF(VFP, 0x00000001, 0, REG_HWCAP, 6) + FEAT_DEF(EDSP, 0x00000001, 0, REG_HWCAP, 7) + FEAT_DEF(JAVA, 0x00000001, 0, REG_HWCAP, 8) + FEAT_DEF(IWMMXT, 0x00000001, 0, REG_HWCAP, 9) + FEAT_DEF(CRUNCH, 0x00000001, 0, REG_HWCAP, 10) + FEAT_DEF(THUMBEE, 0x00000001, 0, REG_HWCAP, 11) + FEAT_DEF(NEON, 0x00000001, 0, REG_HWCAP, 12) + FEAT_DEF(VFPv3, 0x00000001, 0, REG_HWCAP, 13) + FEAT_DEF(VFPv3D16, 0x00000001, 0, REG_HWCAP, 14) + FEAT_DEF(TLS, 0x00000001, 0, REG_HWCAP, 15) + FEAT_DEF(VFPv4, 0x00000001, 0, REG_HWCAP, 16) + FEAT_DEF(IDIVA, 0x00000001, 0, REG_HWCAP, 17) + FEAT_DEF(IDIVT, 0x00000001, 0, REG_HWCAP, 18) + FEAT_DEF(VFPD32, 0x00000001, 0, REG_HWCAP, 19) + FEAT_DEF(LPAE, 0x00000001, 0, REG_HWCAP, 20) + FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 21) + FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP2, 0) + FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP2, 1) + FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP2, 2) + FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP2, 3) + FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4) +}; + +/* + * Read AUXV software register and get cpu features for ARM + */ +static inline void +rte_cpu_get_features(__attribute__((unused)) uint32_t leaf, + __attribute__((unused)) uint32_t subleaf, cpuid_registers_t out) +{ + int auxv_fd; + Elf32_auxv_t auxv; + + auxv_fd = open("/proc/self/auxv", O_RDONLY); + assert(auxv_fd); + while (read(auxv_fd, &auxv, + sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) { + if (auxv.a_type == AT_HWCAP) + out[REG_HWCAP] = auxv.a_un.a_val; + else if (auxv.a_type == AT_HWCAP2) + out[REG_HWCAP2] = auxv.a_un.a_val; + } +} + +/* + * Checks if a particular flag is available on current machine. 
+ */ +static inline int +rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature) +{ + const struct feature_entry *feat; + cpuid_registers_t regs = {0}; + + if (feature >= RTE_CPUFLAG_NUMFLAGS) + /* Flag does not match anything in the feature tables */ + return -ENOENT; + + feat = &cpu_feature_table[feature]; + + if (!feat->leaf) + /* This entry in the table wasn't filled out! */ + return -EFAULT; + + /* get the cpuid leaf containing the desired feature */ + rte_cpu_get_features(feat->leaf, feat->subleaf, regs); + + /* check if the feature is enabled */ + return (regs[feat->reg] >> feat->bit) & 1; +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_CPUFLAGS_ARM_H_ */ diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk index f595cd0..bec7bdd 100644 --- a/mk/rte.cpuflags.mk +++ b/mk/rte.cpuflags.mk @@ -106,6 +106,12 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),) CPUFLAGS += VSX endif +# ARM flags +ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),) +CPUFLAGS += NEON +endif + + MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS)) # To strip whitespace -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v5 10/15] eal/arm: detect arm architecture in cpu flags 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (8 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 09/15] eal/arm: cpu flag checks for ARM Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 11/15] eal/arm: rwlock support for ARM Jan Viktorin ` (4 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: dev Based on the patch by David Hunt and Amruta Zende: lib: added support for armv7 architecture Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> Signed-off-by: Amruta Zende <amruta.zende@intel.com> Signed-off-by: David Hunt <david.hunt@intel.com> --- v2 -> v3: fixed forgotten include of string.h v4: checkpatch reports a few characters over 80 for checking aarch64 --- lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h index 1eadb33..7ce9d14 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h +++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h @@ -41,6 +41,7 @@ extern "C" { #include <fcntl.h> #include <assert.h> #include <unistd.h> +#include <string.h> #include "generic/rte_cpuflags.h" @@ -52,10 +53,15 @@ extern "C" { #define AT_HWCAP2 26 #endif +#ifndef AT_PLATFORM +#define AT_PLATFORM 15 +#endif + /* software based registers */ enum cpu_register_t { REG_HWCAP = 0, REG_HWCAP2, + REG_PLATFORM, }; /** @@ -89,6 +95,8 @@ enum rte_cpu_flag_t { RTE_CPUFLAG_SHA1, RTE_CPUFLAG_SHA2, RTE_CPUFLAG_CRC32, + RTE_CPUFLAG_AARCH32, + RTE_CPUFLAG_AARCH64, /* The last item */ RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! 
*/ }; @@ -121,6 +129,8 @@ static const struct feature_entry cpu_feature_table[] = { FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP2, 2) FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP2, 3) FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4) + FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0) + FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1) }; /* @@ -141,6 +151,12 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf, out[REG_HWCAP] = auxv.a_un.a_val; else if (auxv.a_type == AT_HWCAP2) out[REG_HWCAP2] = auxv.a_un.a_val; + else if (auxv.a_type == AT_PLATFORM) { + if (!strcmp((const char *)auxv.a_un.a_val, "aarch32")) + out[REG_PLATFORM] = 0x0001; + else if (!strcmp((const char *)auxv.a_un.a_val, "aarch64")) + out[REG_PLATFORM] = 0x0002; + } } } -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
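The AT_PLATFORM handling above encodes the platform string into a software register, bit 0 for "aarch32" and bit 1 for "aarch64", so that the generic FEAT_DEF bit test keeps working unchanged. The mapping in isolation, as a hedged sketch (the function name is ours, not DPDK's):

```c
#include <assert.h>
#include <string.h>

/* Encode the AT_PLATFORM string the way the patch does:
 * bit 0 set for "aarch32", bit 1 set for "aarch64", 0 otherwise. */
static unsigned int
platform_to_reg(const char *platform)
{
	if (strcmp(platform, "aarch32") == 0)
		return 0x0001;
	if (strcmp(platform, "aarch64") == 0)
		return 0x0002;
	return 0;
}
```

In the patch the string comes from `auxv.a_un.a_val`; on a modern glibc the same value is available via `getauxval(AT_PLATFORM)`.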
* [dpdk-dev] [PATCH v5 11/15] eal/arm: rwlock support for ARM 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (9 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 10/15] eal/arm: detect arm architecture in cpu flags Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 12/15] eal/arm: add very incomplete rte_vect Jan Viktorin ` (3 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: dev Just a copy from PPC. Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- .../common/include/arch/arm/rte_rwlock.h | 40 ++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_rwlock.h b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h new file mode 100644 index 0000000..664bec8 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h @@ -0,0 +1,40 @@ +/* copied from ppc_64 */ + +#ifndef _RTE_RWLOCK_ARM_H_ +#define _RTE_RWLOCK_ARM_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_rwlock.h" + +static inline void +rte_rwlock_read_lock_tm(rte_rwlock_t *rwl) +{ + rte_rwlock_read_lock(rwl); +} + +static inline void +rte_rwlock_read_unlock_tm(rte_rwlock_t *rwl) +{ + rte_rwlock_read_unlock(rwl); +} + +static inline void +rte_rwlock_write_lock_tm(rte_rwlock_t *rwl) +{ + rte_rwlock_write_lock(rwl); +} + +static inline void +rte_rwlock_write_unlock_tm(rte_rwlock_t *rwl) +{ + rte_rwlock_write_unlock(rwl); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_RWLOCK_ARM_H_ */ -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
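The pattern in this patch generalizes: on an architecture without hardware transactional memory, each `_tm` lock variant simply delegates to its plain counterpart, so callers written against the TM API still work. A self-contained sketch of the same fallback idea using a trivial C11 spin flag (names are illustrative; DPDK's rwlock is more elaborate):

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;
static int shared_value;

/* plain spin-based write lock */
static void write_lock(void)   { while (atomic_flag_test_and_set(&lock)) { } }
static void write_unlock(void) { atomic_flag_clear(&lock); }

/* TM variants: no hardware transactions here, so fall back to the lock */
static void write_lock_tm(void)   { write_lock(); }
static void write_unlock_tm(void) { write_unlock(); }

static void
update_shared(int v)
{
	write_lock_tm();
	shared_value = v;
	write_unlock_tm();
}
```

On platforms that do have TM (e.g. POWER8 or x86 TSX), the `_tm` variants would instead start a hardware transaction and only fall back to the lock on abort; the fallback version above is the lowest common denominator.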
* [dpdk-dev] [PATCH v5 12/15] eal/arm: add very incomplete rte_vect 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (10 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 11/15] eal/arm: rwlock support for ARM Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 13/15] gcc/arm: avoid alignment errors to break build Jan Viktorin ` (2 subsequent siblings) 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: dev This patch does not map x86 SIMD operations to the ARM ones. It just fills the necessary gap between the platforms to enable compilation of libraries LPM (includes rte_vect.h, lpm_test needs those SIMD functions) and ACL (includes rte_vect.h). Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- v4: checkpatch reports warning for the new typedef --- lib/librte_eal/common/include/arch/arm/rte_vect.h | 84 +++++++++++++++++++++++ 1 file changed, 84 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h new file mode 100644 index 0000000..7d5de97 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h @@ -0,0 +1,84 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2015 RehiveTech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. 
+ * * Neither the name of RehiveTech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_VECT_ARM_H_ +#define _RTE_VECT_ARM_H_ + +#include <stdint.h> + +#ifdef __cplusplus +extern "C" { +#endif + +#define XMM_SIZE 16 +#define XMM_MASK (XMM_SIZE - 1) + +typedef struct { + union uint128 { + uint8_t uint8[16]; + uint32_t uint32[4]; + } val; +} __m128i; + +static inline __m128i +_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3) +{ + __m128i res; + + res.val.uint32[0] = v0; + res.val.uint32[1] = v1; + res.val.uint32[2] = v2; + res.val.uint32[3] = v3; + return res; +} + +static inline __m128i +_mm_loadu_si128(__m128i *v) +{ + __m128i res; + + res = *v; + return res; +} + +static inline __m128i +_mm_load_si128(__m128i *v) +{ + __m128i res; + + res = *v; + return res; +} + +#ifdef __cplusplus +} +#endif + +#endif -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
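The emulated `__m128i` above is just a 16-byte union, and the load/set helpers are plain struct copies. A standalone sketch of the same idea (types renamed here to avoid clashing with the real intrinsics; note also that the patch stores `v0` into lane 0, whereas x86's `_mm_set_epi32` fills from the highest lane down, a difference callers would need to keep in mind):

```c
#include <assert.h>
#include <stdint.h>

/* minimal stand-in for the patch's emulated 128-bit vector type */
typedef struct {
	union {
		uint8_t  u8[16];
		uint32_t u32[4];
	} val;
} vec128;

/* fill the four 32-bit lanes, lowest lane first (the patch's ordering) */
static vec128
vec128_set_u32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
{
	vec128 r;

	r.val.u32[0] = v0;
	r.val.u32[1] = v1;
	r.val.u32[2] = v2;
	r.val.u32[3] = v3;
	return r;
}

static vec128
vec128_load(const vec128 *v)
{
	return *v; /* a plain struct copy; no SIMD and no alignment demands */
}
```

This is enough to let code that only needs the types and a handful of loads compile on ARM; it deliberately provides no vectorized arithmetic, which is why the patch is titled "very incomplete".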
* [dpdk-dev] [PATCH v5 13/15] gcc/arm: avoid alignment errors to break build 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (11 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 12/15] eal/arm: add very incomplete rte_vect Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 14/15] mk: Introduce ARMv7 architecture Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 15/15] maintainers: claim responsibility for ARMv7 Jan Viktorin 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: dev, Vlastimil Kosar There are several issues with alignment when compiling for ARMv7. They are not considered to be fatal (ARMv7 supports unaligned access of 32b words), so we just leave them as warnings. They should be solved later, however. Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> --- v4: restrict -Wno-error to the cast-align only --- mk/toolchain/gcc/rte.vars.mk | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk index 0f51c66..c2c5255 100644 --- a/mk/toolchain/gcc/rte.vars.mk +++ b/mk/toolchain/gcc/rte.vars.mk @@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual WERROR_FLAGS += -Wformat-nonliteral -Wformat-security WERROR_FLAGS += -Wundef -Wwrite-strings +# There are many issues reported for ARMv7 architecture +# which are not necessarily fatal. Report as warnings. +ifeq ($(CONFIG_RTE_ARCH_ARMv7),y) +WERROR_FLAGS += -Wno-error=cast-align +endif + # process cpu flags include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
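The deferred warnings come from casts that raise alignment requirements, e.g. casting a `uint8_t *` to `uint32_t *`. ARMv7 tolerates unaligned 32-bit loads, which is why the commit downgrades the error, but the portable fix when the code is eventually cleaned up is to go through `memcpy`. A hedged sketch of that fix:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable replacement for the cast that -Wcast-align complains about:
 * read a 32-bit word from a possibly unaligned byte pointer. */
static uint32_t
read_u32_unaligned(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v)); /* compilers lower this to a single load */
	return v;
}
```

Unlike the suppressed cast, this is well-defined on every architecture, including ones that fault on unaligned access.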
* [dpdk-dev] [PATCH v5 14/15] mk: Introduce ARMv7 architecture 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (12 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 13/15] gcc/arm: avoid alignment errors to break build Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 15/15] maintainers: claim responsibility for ARMv7 Jan Viktorin 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: Vlastimil Kosar, dev From: Vlastimil Kosar <kosar@rehivetech.com> Make DPDK run on ARMv7-A architecture. This patch assumes ARM Cortex-A9. However, it is known to work on Cortex-A7 and Cortex-A15. Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- v2: * the -mtune parameter of GCC is configurable now * the -mfpu=neon can be turned off v3: XMM_SIZE is defined in rte_vect.h in a following patch v4: * update release notes for 2.2 * get rid of CONFIG_RTE_BITMAP_OPTIMIZATIONS=0 setting * rename arm defconfig: "armv7-a" -> "armv7a" * disable pipeline and table modules unless lpm is fixed --- config/defconfig_arm-armv7a-linuxapp-gcc | 74 ++++++++++++++++++++++++++++++++ doc/guides/rel_notes/release_2_2.rst | 5 +++ mk/arch/arm/rte.vars.mk | 39 +++++++++++++++++ mk/machine/armv7-a/rte.vars.mk | 67 +++++++++++++++++++++++++++++ 4 files changed, 185 insertions(+) create mode 100644 config/defconfig_arm-armv7a-linuxapp-gcc create mode 100644 mk/arch/arm/rte.vars.mk create mode 100644 mk/machine/armv7-a/rte.vars.mk diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc new file mode 100644 index 0000000..d623222 --- /dev/null +++ b/config/defconfig_arm-armv7a-linuxapp-gcc @@ -0,0 +1,74 @@ +# BSD LICENSE +# +# Copyright (C) 2015 RehiveTech. All rights reserved. 
+# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of RehiveTech nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +#include "common_linuxapp" + +CONFIG_RTE_MACHINE="armv7-a" + +CONFIG_RTE_ARCH="arm" +CONFIG_RTE_ARCH_ARM=y +CONFIG_RTE_ARCH_ARMv7=y +CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9" +CONFIG_RTE_ARCH_ARM_NEON=y + +CONFIG_RTE_TOOLCHAIN="gcc" +CONFIG_RTE_TOOLCHAIN_GCC=y + +# ARM doesn't have support for vmware TSC map +CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n + +# KNI is not supported on 32-bit +CONFIG_RTE_LIBRTE_KNI=n + +# PCI is usually not used on ARM +CONFIG_RTE_EAL_IGB_UIO=n + +# fails to compile on ARM +CONFIG_RTE_LIBRTE_ACL=n +CONFIG_RTE_LIBRTE_LPM=n +CONFIG_RTE_LIBRTE_TABLE=n +CONFIG_RTE_LIBRTE_PIPELINE=n + +# cannot use those on ARM +CONFIG_RTE_KNI_KMOD=n +CONFIG_RTE_LIBRTE_EM_PMD=n +CONFIG_RTE_LIBRTE_IGB_PMD=n +CONFIG_RTE_LIBRTE_CXGBE_PMD=n +CONFIG_RTE_LIBRTE_E1000_PMD=n +CONFIG_RTE_LIBRTE_ENIC_PMD=n +CONFIG_RTE_LIBRTE_FM10K_PMD=n +CONFIG_RTE_LIBRTE_I40E_PMD=n +CONFIG_RTE_LIBRTE_IXGBE_PMD=n +CONFIG_RTE_LIBRTE_MLX4_PMD=n +CONFIG_RTE_LIBRTE_MPIPE_PMD=n +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n +CONFIG_RTE_LIBRTE_VMXNET3_PMD=n +CONFIG_RTE_LIBRTE_PMD_XENVIRT=n +CONFIG_RTE_LIBRTE_PMD_BNX2X=n diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst index be6f827..43a3a3c 100644 --- a/doc/guides/rel_notes/release_2_2.rst +++ b/doc/guides/rel_notes/release_2_2.rst @@ -23,6 +23,11 @@ New Features * **Added vhost-user multiple queue support.** +* **Introduce ARMv7 architecture** + + It is now possible to build DPDK for the ARMv7 platform and test with + virtual PMD drivers. + Resolved Issues --------------- diff --git a/mk/arch/arm/rte.vars.mk b/mk/arch/arm/rte.vars.mk new file mode 100644 index 0000000..df0c043 --- /dev/null +++ b/mk/arch/arm/rte.vars.mk @@ -0,0 +1,39 @@ +# BSD LICENSE +# +# Copyright (C) 2015 RehiveTech. All rights reserved. 
+# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of RehiveTech nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +ARCH ?= arm +CROSS ?= + +CPU_CFLAGS ?= -marm -DRTE_CACHE_LINE_SIZE=64 -munaligned-access +CPU_LDFLAGS ?= +CPU_ASFLAGS ?= -felf + +export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS diff --git a/mk/machine/armv7-a/rte.vars.mk b/mk/machine/armv7-a/rte.vars.mk new file mode 100644 index 0000000..48d3979 --- /dev/null +++ b/mk/machine/armv7-a/rte.vars.mk @@ -0,0 +1,67 @@ +# BSD LICENSE +# +# Copyright (C) 2015 RehiveTech. All rights reserved. 
+# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of RehiveTech nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# +# machine: +# +# - can define ARCH variable (overridden by cmdline value) +# - can define CROSS variable (overridden by cmdline value) +# - define MACHINE_CFLAGS variable (overridden by cmdline value) +# - define MACHINE_LDFLAGS variable (overridden by cmdline value) +# - define MACHINE_ASFLAGS variable (overridden by cmdline value) +# - can define CPU_CFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. 
+# - can define CPU_LDFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_ASFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - may override any previously defined variable +# + +# ARCH = +# CROSS = +# MACHINE_CFLAGS = +# MACHINE_LDFLAGS = +# MACHINE_ASFLAGS = +# CPU_CFLAGS = +# CPU_LDFLAGS = +# CPU_ASFLAGS = + +CPU_CFLAGS += -mfloat-abi=softfp + +MACHINE_CFLAGS += -march=armv7-a + +ifdef CONFIG_RTE_ARCH_ARM_TUNE +MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE) +endif + +ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y) +MACHINE_CFLAGS += -mfpu=neon +endif -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
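The machine file builds MACHINE_CFLAGS incrementally from the config knobs. The same conditional logic, replayed in plain shell with the values the defconfig sets (values assumed from the patch, not measured from an actual build):

```shell
# emulate the flag assembly from mk/machine/armv7-a/rte.vars.mk
CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9"
CONFIG_RTE_ARCH_ARM_NEON="y"

MACHINE_CFLAGS="-march=armv7-a"
if [ -n "$CONFIG_RTE_ARCH_ARM_TUNE" ]; then
	MACHINE_CFLAGS="$MACHINE_CFLAGS -mtune=$CONFIG_RTE_ARCH_ARM_TUNE"
fi
if [ "$CONFIG_RTE_ARCH_ARM_NEON" = "y" ]; then
	MACHINE_CFLAGS="$MACHINE_CFLAGS -mfpu=neon"
fi
echo "$MACHINE_CFLAGS"
```

With the defconfig defaults this yields `-march=armv7-a -mtune=cortex-a9 -mfpu=neon`; setting CONFIG_RTE_ARCH_ARM_NEON=n or clearing the tune variable drops the corresponding flag, which is exactly the configurability the v2 changelog describes.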
* [dpdk-dev] [PATCH v5 15/15] maintainers: claim responsibility for ARMv7 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 00/15] Support ARMv7 architecture Jan Viktorin ` (13 preceding siblings ...) 2015-10-30 0:25 ` [dpdk-dev] [PATCH v5 14/15] mk: Introduce ARMv7 architecture Jan Viktorin @ 2015-10-30 0:25 ` Jan Viktorin 14 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:25 UTC (permalink / raw) To: david.marchand, David Hunt, Thomas Monjalon; +Cc: dev Signed-off-by: Jan Viktorin <viktorin@rehivetech.com> --- MAINTAINERS | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 080a8e8..a8933eb 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -124,6 +124,10 @@ IBM POWER M: Chao Zhu <chaozhu@linux.vnet.ibm.com> F: lib/librte_eal/common/include/arch/ppc_64/ +ARM v7 +M: Jan Viktorin <viktorin@rehivetech.com> +F: lib/librte_eal/common/include/arch/arm/ + Intel x86 M: Bruce Richardson <bruce.richardson@intel.com> M: Konstantin Ananyev <konstantin.ananyev@intel.com> -- 2.6.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support 2015-10-29 17:29 [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support David Hunt ` (5 preceding siblings ...) 2015-10-29 18:27 ` [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support Thomas Monjalon @ 2015-10-30 0:17 ` Jan Viktorin 2015-10-30 8:52 ` Hunt, David 6 siblings, 1 reply; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 0:17 UTC (permalink / raw) To: David Hunt; +Cc: dev I've failed to compile kni/igb for ARMv8. Any ideas? Is it Linux 4.2 compatible? CC [M] /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.o /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c: In function ‘igb_ndo_bridge_getlink’: /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:2279:9: error: too few arguments to function ‘ndo_dflt_bridge_getlink’ return ndo_dflt_bridge_getlink(skb, pid, seq, dev, mode, 0, 0, nlflags); ^ In file included from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/net/dst.h:13:0, from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/net/sock.h:67, from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/linux/tcp.h:22, from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:34: /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/linux/rtnetlink.h:115:12: note: declared here extern int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq, ^ /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:2286:1: error: control reaches end of non-void function [-Werror=return-type] } ^ cc1: all warnings being treated as errors 
/home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/scripts/Makefile.build:258: recipe for target '/home/jviki/Projects/bu ildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.o' failed Regards Jan On Thu, 29 Oct 2015 17:29:49 +0000 David Hunt <david.hunt@intel.com> wrote: > Hi DPDK Community. > > This is an updated patchset for ARMv8 that now sits on top of the previously > submitted ARMv7 code by RehiveTech. It re-uses a lot of that code, and splits > some header files into 32-bit and 64-bit versions, so uses the same arm include > directory. > > Tested on an XGene 64-bit arm server board, with PCI slots. Passes traffic between > two physical ports on an Intel 82599 dual-port 10Gig NIC. Should work with many > other NICS, but these are as yet untested. > > Compiles igb_uio, kni and all the physical device PMDs. > > ACL and LPM are disabled due to compilation issues. > > Note added to the Release notes. > > > David Hunt (5): > eal/arm: split arm rte_memcpy.h into 32 and 64 bit versions. 
> eal/arm: split arm rte_prefetch.h into 32 and 64 bit versions > eal/arm: fix 64-bit compilation for armv8 > mk: Add makefile support for armv8 architecture > test: add test for cpu flags on armv8 > > MAINTAINERS | 3 +- > app/test/test_cpuflags.c | 13 +- > config/defconfig_arm64-armv8a-linuxapp-gcc | 56 ++++ > doc/guides/rel_notes/release_2_2.rst | 7 +- > .../common/include/arch/arm/rte_cpuflags.h | 9 + > .../common/include/arch/arm/rte_memcpy.h | 302 +------------------ > .../common/include/arch/arm/rte_memcpy_32.h | 334 +++++++++++++++++++++ > .../common/include/arch/arm/rte_memcpy_64.h | 322 ++++++++++++++++++++ > .../common/include/arch/arm/rte_prefetch.h | 31 +- > .../common/include/arch/arm/rte_prefetch_32.h | 61 ++++ > .../common/include/arch/arm/rte_prefetch_64.h | 61 ++++ > mk/arch/arm64/rte.vars.mk | 58 ++++ > mk/machine/armv8a/rte.vars.mk | 57 ++++ > 13 files changed, 986 insertions(+), 328 deletions(-) > create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc > create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h > create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h > create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h > create mode 100644 mk/arch/arm64/rte.vars.mk > create mode 100644 mk/machine/armv8a/rte.vars.mk > -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support 2015-10-30 0:17 ` [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support Jan Viktorin @ 2015-10-30 8:52 ` Hunt, David 2015-10-30 10:48 ` Jan Viktorin 0 siblings, 1 reply; 32+ messages in thread From: Hunt, David @ 2015-10-30 8:52 UTC (permalink / raw) To: Jan Viktorin; +Cc: dev On 30/10/2015 00:17, Jan Viktorin wrote: > I've failed to compile kni/igb for ARMv8. Any ideas? Is it Linux 4.2 > compatbile? > > CC [M] /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.o > /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c: In functi > on ‘igb_ndo_bridge_getlink’: > /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:2279:9: er > ror: too few arguments to function ‘ndo_dflt_bridge_getlink’ > return ndo_dflt_bridge_getlink(skb, pid, seq, dev, mode, 0, 0, nlflags); > ^ > In file included from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/net/dst.h:13:0, > from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/net/sock.h:67, > from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/linux/tcp.h:22, > from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:34: > /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/linux/rtnetlink.h:115:12: note: declared here > extern int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq, > ^ > /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:2286:1: error: control reaches end of non-void function [-Werror=return-type] > } > ^ > cc1: all warnings being treated as errors > 
/home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/scripts/Makefile.build:258: recipe for target '/home/jviki/Projects/bu > ildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.o' failed > > Regards > Jan Jan, To compile DPDK on kernels 4.2 and later, you need two patches submitted to the list last week. The IDs are 7518 - kni-rename-HAVE_NDO_BRIDGE_GETLINK_FILTER_MASK-macro 7519 - kni-fix-igb-build-with-kernel-4.2 And if you're on a 4.3 kernel: 8131 - fix igb_uio's access to pci_dev->msi_list for kernels >= 4.3 Regards, Dave. ^ permalink raw reply [flat|nested] 32+ messages in thread
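Those KNI fixes follow the usual out-of-tree module pattern: test LINUX_VERSION_CODE at compile time and define a feature macro so the driver calls `ndo_dflt_bridge_getlink()` with the arity the running kernel expects. A compile-time sketch of that shape (the macro name is hypothetical, the kernel's version macros are mocked so the snippet stands alone, and the 4.2 threshold mirrors the report above rather than an audited kernel history):

```c
#include <assert.h>

/* stand-ins for <linux/version.h>; a real module gets these from the kernel */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#define LINUX_VERSION_CODE KERNEL_VERSION(4, 2, 0)

/* newer kernels extended the ndo_dflt_bridge_getlink() signature */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 2, 0)
#define HAVE_NDO_BRIDGE_GETLINK_NLFLAGS 1
#else
#define HAVE_NDO_BRIDGE_GETLINK_NLFLAGS 0
#endif

/* the driver would branch on the macro when invoking the helper */
static int
bridge_getlink_compat(void)
{
	return HAVE_NDO_BRIDGE_GETLINK_NLFLAGS;
}
```

In the actual patches the `#if` guards the call site in igb_main.c, passing the extra arguments only when the macro says the kernel's prototype has them.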
* Re: [dpdk-dev] [PATCH 0/5] ARMv8 additions to ARMv7 support 2015-10-30 8:52 ` Hunt, David @ 2015-10-30 10:48 ` Jan Viktorin 0 siblings, 0 replies; 32+ messages in thread From: Jan Viktorin @ 2015-10-30 10:48 UTC (permalink / raw) To: Hunt, David; +Cc: dev Thanks for that hint. I am able to run it in qemu. I tried several tests from the test suite and it works. Jan On Fri, 30 Oct 2015 08:52:49 +0000 "Hunt, David" <david.hunt@intel.com> wrote: > On 30/10/2015 00:17, Jan Viktorin wrote: > > I've failed to compile kni/igb for ARMv8. Any ideas? Is it Linux 4.2 > > compatbile? > > > > CC [M] /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.o > > /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c: In functi > > on ‘igb_ndo_bridge_getlink’: > > /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:2279:9: er > > ror: too few arguments to function ‘ndo_dflt_bridge_getlink’ > > return ndo_dflt_bridge_getlink(skb, pid, seq, dev, mode, 0, 0, nlflags); > > ^ > > In file included from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/net/dst.h:13:0, > > from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/net/sock.h:67, > > from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/linux/tcp.h:22, > > from /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:34: > > /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/include/linux/rtnetlink.h:115:12: note: declared here > > extern int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq, > > ^ > > /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:2286:1: error: control reaches end of 
non-void function [-Werror=return-type] > > } > > ^ > > cc1: all warnings being treated as errors > > /home/jviki/Projects/buildroot-armv8/qemu-armv8/build/linux-4.2/scripts/Makefile.build:258: recipe for target '/home/jviki/Projects/bu > > ildroot-armv8/qemu-armv8/build/dpdk-armv8-hunt-v1/build/build/lib/librte_eal/linuxapp/kni/igb_main.o' failed > > > > Regards > > Jan > > Jan, > > To compile DPDK on kernels 4.2 and later, you need two patches submitted > to the list last week. The ID's are > > 7518 - kni-rename-HAVE_NDO_BRIDGE_GETLINK_FILTER_MASK-macro > 7519 - kni-fix-igb-build-with-kernel-4.2 > > And if you're on a 4.3 kernel: > > 8131 - fix igb_uio's access to pci_dev->msi_list for kernels >= 4.3 > > Regards, > Dave. > > -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 32+ messages in thread