* [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support @ 2015-10-30 13:49 David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt ` (5 more replies) 0 siblings, 6 replies; 18+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev This is th v3 patchset for ARMv8 that now sits on top of the v5 patch of the ARMv7 code by RehiveTech. It adds code into the same arm include directory, reducing code duplication. Tested on an XGene 64-bit arm server board, with PCI slots. Passes traffic between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should work with many other NICS, but these are as yet untested. Compiles igb_uio, kni and all the physical device PMDs. An entry has been added to the Release notes. We hope that this will encourage the ARM community to contribute PMDs for their SoCs to DPDK. For now, we've added some Intel engineers to the MAINTAINERS file. We would like to encourage the ARM community to take over maintenance of this area in future, and to further improve it. Notes on arm64 kernel configuration: Using Ubuntu 14.04 LTS with a 4.3.0-rc6 kernel (with modified PCI drivers), and uio_pci_generic. ARM64 kernels do not seem to have functional resource mapping of PCI memory (PCI_MMAP), so the pci driver needs to be patched to enable this. The symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is missing the resource0...N files for mmapping the device memory. Earlier kernels (3.13.x) had these files present, but mmap'ping resulted in a "Bus Error" when the NIC memory was accessed. However, during limited testing with a modified 4.3.0-rc6 kernel, we were able to mmap the NIC memory, and pass traffic between the two ports on a 82599 NIC connected via fibre cable. We have no plans to upstream a kernel patch for this and hope that someone more familiar with the arm architecture can create a proper patch and enable this functionality. Reviewed-by: Jan Viktorin <viktorin@rehivetech.com> David Hunt (6): eal/arm: add 64-bit armv8 version of rte_memcpy.h eal/arm: add 64-bit armv8 version of rte_prefetch.h eal/arm: add 64-bit armv8 version of rte_cycles.h eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h mk: add support for armv8 on top of armv7 test: add checks for cpu flags on armv8 MAINTAINERS | 3 +- app/test/test_cpuflags.c | 13 +- config/defconfig_arm64-armv8a-linuxapp-gcc | 56 ++++ doc/guides/rel_notes/release_2_2.rst | 7 +- .../common/include/arch/arm/rte_cpuflags.h | 6 +- .../common/include/arch/arm/rte_cycles.h | 4 + .../common/include/arch/arm/rte_cycles_64.h | 77 ++++++ .../common/include/arch/arm/rte_memcpy.h | 4 + .../common/include/arch/arm/rte_memcpy_64.h | 308 +++++++++++++++++++++ .../common/include/arch/arm/rte_prefetch.h | 4 + .../common/include/arch/arm/rte_prefetch_64.h | 61 ++++ mk/arch/arm64/rte.vars.mk | 58 ++++ mk/machine/armv8a/rte.vars.mk | 57 ++++ 13 files changed, 651 insertions(+), 7 deletions(-) create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h create mode 100644 mk/arch/arm64/rte.vars.mk create mode 100644 mk/machine/armv8a/rte.vars.mk -- 1.9.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt @ 2015-10-30 13:49 ` David Hunt 2015-11-02 4:57 ` Jerin Jacob 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt ` (4 subsequent siblings) 5 siblings, 1 reply; 18+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- .../common/include/arch/arm/rte_memcpy.h | 4 + .../common/include/arch/arm/rte_memcpy_64.h | 308 +++++++++++++++++++++ 2 files changed, 312 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h index d9f5bf1..1d562c3 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h @@ -33,6 +33,10 @@ #ifndef _RTE_MEMCPY_ARM_H_ #define _RTE_MEMCPY_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_memcpy_64.h> +#else #include <rte_memcpy_32.h> +#endif #endif /* _RTE_MEMCPY_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h new file mode 100644 index 0000000..6d85113 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h @@ -0,0 +1,308 @@ +/* + * BSD LICENSE + * + * Copyright (C) IBM Corporation 2014. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of IBM Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef _RTE_MEMCPY_ARM_64_H_ +#define _RTE_MEMCPY_ARM_64_H_ + +#include <stdint.h> +#include <string.h> + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_memcpy.h" + +#ifdef __ARM_NEON_FP + +/* ARM NEON Intrinsics are used to copy data */ +#include <arm_neon.h> + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP d0, d1, [%0]\n\t" + "STP d0, d1, [%1]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP d0, d1, [%0 , #32]\n\t" + "STP d0, d1, [%1 , #32]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP q0, q1, [%0 , #32]\n\t" + "STP q0, q1, [%1 , #32]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP q0, q1, [%0 , #32]\n\t" + "STP q0, q1, [%1 , #32]\n\t" + "LDP q0, q1, [%0 , #64]\n\t" + "STP q0, q1, [%1 , #64]\n\t" + "LDP q0, q1, [%0 , #96]\n\t" + "STP q0, q1, [%1 , #96]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP q0, q1, [%0 , #32]\n\t" + "STP q0, q1, [%1 , #32]\n\t" + "LDP q0, q1, [%0 , #64]\n\t" + "STP q0, q1, [%1 , #64]\n\t" + "LDP q0, q1, [%0 , #96]\n\t" + "STP q0, q1, [%1 , #96]\n\t" + "LDP q0, q1, [%0 , #128]\n\t" + "STP q0, q1, [%1 , #128]\n\t" + "LDP q0, q1, [%0 , #160]\n\t" + "STP q0, q1, [%1 , #160]\n\t" + "LDP q0, q1, [%0 , #192]\n\t" + "STP q0, q1, [%1 , #192]\n\t" + "LDP q0, q1, [%0 , #224]\n\t" + "STP q0, q1, [%1 , #224]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +#define rte_memcpy(dst, src, n) \ + ({ (__builtin_constant_p(n)) ? \ + memcpy((dst), (src), (n)) : \ + rte_memcpy_func((dst), (src), (n)); }) + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + void *ret = dst; + + /* We can't copy < 16 bytes using XMM registers so do it manually. */ + if (n < 16) { + if (n & 0x01) { + *(uint8_t *)dst = *(const uint8_t *)src; + dst = (uint8_t *)dst + 1; + src = (const uint8_t *)src + 1; + } + if (n & 0x02) { + *(uint16_t *)dst = *(const uint16_t *)src; + dst = (uint16_t *)dst + 1; + src = (const uint16_t *)src + 1; + } + if (n & 0x04) { + *(uint32_t *)dst = *(const uint32_t *)src; + dst = (uint32_t *)dst + 1; + src = (const uint32_t *)src + 1; + } + if (n & 0x08) + *(uint64_t *)dst = *(const uint64_t *)src; + return ret; + } + + /* Special fast cases for <= 128 bytes */ + if (n <= 32) { + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; + } + + if (n <= 64) { + rte_mov32((uint8_t *)dst, (const uint8_t *)src); + rte_mov32((uint8_t *)dst - 32 + n, + (const uint8_t *)src - 32 + n); + return ret; + } + + if (n <= 128) { + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + rte_mov64((uint8_t *)dst - 64 + n, + (const uint8_t *)src - 64 + n); + return ret; + } + + /* + * For large copies > 128 bytes. This combination of 256, 64 and 16 byte + * copies was found to be faster than doing 128 and 32 byte copies as + * well. + */ + for ( ; n >= 256; n -= 256) { + rte_mov256((uint8_t *)dst, (const uint8_t *)src); + dst = (uint8_t *)dst + 256; + src = (const uint8_t *)src + 256; + } + + /* + * We split the remaining bytes (which will be less than 256) into + * 64byte (2^6) chunks. + * Using incrementing integers in the case labels of a switch statement + * enourages the compiler to use a jump table. To get incrementing + * integers, we shift the 2 relevant bits to the LSB position to first + * get decrementing integers, and then subtract. + */ + switch (3 - (n >> 6)) { + case 0x00: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x01: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x02: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + default: + break; + } + + /* + * We split the remaining bytes (which will be less than 64) into + * 16byte (2^4) chunks, using the same switch structure as above. + */ + switch (3 - (n >> 4)) { + case 0x00: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x01: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x02: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + default: + break; + } + + /* Copy any remaining bytes, without going beyond end of buffers */ + if (n != 0) + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; +} + +#else + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 16); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 32); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 48); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 64); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 128); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 256); +} + +static inline void * +rte_memcpy(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +#endif /* __ARM_NEON_FP */ + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_MEMCPY_ARM_64_H_ */ -- 1.9.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt @ 2015-11-02 4:57 ` Jerin Jacob 2015-11-02 12:22 ` Hunt, David 0 siblings, 1 reply; 18+ messages in thread From: Jerin Jacob @ 2015-11-02 4:57 UTC (permalink / raw) To: David Hunt; +Cc: dev On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote: > Signed-off-by: David Hunt <david.hunt@intel.com> > --- > .../common/include/arch/arm/rte_memcpy.h | 4 + > .../common/include/arch/arm/rte_memcpy_64.h | 308 +++++++++++++++++++++ > 2 files changed, 312 insertions(+) > create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h > index d9f5bf1..1d562c3 100644 > --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h > +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h > @@ -33,6 +33,10 @@ > #ifndef _RTE_MEMCPY_ARM_H_ > #define _RTE_MEMCPY_ARM_H_ > > +#ifdef RTE_ARCH_64 > +#include <rte_memcpy_64.h> > +#else > #include <rte_memcpy_32.h> > +#endif > > #endif /* _RTE_MEMCPY_ARM_H_ */ > diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > new file mode 100644 > index 0000000..6d85113 > --- /dev/null > +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > @@ -0,0 +1,308 @@ > +/* > + * BSD LICENSE > + * > + * Copyright (C) IBM Corporation 2014. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in > + * the documentation and/or other materials provided with the > + * distribution. > + * * Neither the name of IBM Corporation nor the names of its > + * contributors may be used to endorse or promote products derived > + * from this software without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > +*/ > + > +#ifndef _RTE_MEMCPY_ARM_64_H_ > +#define _RTE_MEMCPY_ARM_64_H_ > + > +#include <stdint.h> > +#include <string.h> > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +#include "generic/rte_memcpy.h" > + > +#ifdef __ARM_NEON_FP SIMD is not optional in armv8 spec.So every armv8 machine will have SIMD instruction unlike armv7.More over LDP/STP instruction is not part of SIMD.So this check is not required or it can be replaced with a check that select memcpy from either libc or this specific implementation > + > +/* ARM NEON Intrinsics are used to copy data */ > +#include <arm_neon.h> > + > +static inline void > +rte_mov16(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP d0, d1, [%0]\n\t" > + "STP d0, d1, [%1]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} IMO, no need to hardcode registers used for the mem move(d0, d1). Let compiler schedule the registers for better performance. > + > +static inline void > +rte_mov32(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} > + > +static inline void > +rte_mov48(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + "LDP d0, d1, [%0 , #32]\n\t" > + "STP d0, d1, [%1 , #32]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} > + > +static inline void > +rte_mov64(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + "LDP q0, q1, [%0 , #32]\n\t" > + "STP q0, q1, [%1 , #32]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} > + > +static inline void > +rte_mov128(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + "LDP q0, q1, [%0 , #32]\n\t" > + "STP q0, q1, [%1 , #32]\n\t" > + "LDP q0, q1, [%0 , #64]\n\t" > + "STP q0, q1, [%1 , #64]\n\t" > + "LDP q0, q1, [%0 , #96]\n\t" > + "STP q0, q1, [%1 , #96]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} > + > +static inline void > +rte_mov256(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + "LDP q0, q1, [%0 , #32]\n\t" > + "STP q0, q1, [%1 , #32]\n\t" > + "LDP q0, q1, [%0 , #64]\n\t" > + "STP q0, q1, [%1 , #64]\n\t" > + "LDP q0, q1, [%0 , #96]\n\t" > + "STP q0, q1, [%1 , #96]\n\t" > + "LDP q0, q1, [%0 , #128]\n\t" > + "STP q0, q1, [%1 , #128]\n\t" > + "LDP q0, q1, [%0 , #160]\n\t" > + "STP q0, q1, [%1 , #160]\n\t" > + "LDP q0, q1, [%0 , #192]\n\t" > + "STP q0, q1, [%1 , #192]\n\t" > + "LDP q0, q1, [%0 , #224]\n\t" > + "STP q0, q1, [%1 , #224]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} > + > +#define rte_memcpy(dst, src, n) \ > + ({ (__builtin_constant_p(n)) ? \ > + memcpy((dst), (src), (n)) : \ > + rte_memcpy_func((dst), (src), (n)); }) > + > +static inline void * > +rte_memcpy_func(void *dst, const void *src, size_t n) > +{ > + void *ret = dst; > + > + /* We can't copy < 16 bytes using XMM registers so do it manually. */ > + if (n < 16) { > + if (n & 0x01) { > + *(uint8_t *)dst = *(const uint8_t *)src; > + dst = (uint8_t *)dst + 1; > + src = (const uint8_t *)src + 1; > + } > + if (n & 0x02) { > + *(uint16_t *)dst = *(const uint16_t *)src; > + dst = (uint16_t *)dst + 1; > + src = (const uint16_t *)src + 1; > + } > + if (n & 0x04) { > + *(uint32_t *)dst = *(const uint32_t *)src; > + dst = (uint32_t *)dst + 1; > + src = (const uint32_t *)src + 1; > + } > + if (n & 0x08) > + *(uint64_t *)dst = *(const uint64_t *)src; > + return ret; > + } > + > + /* Special fast cases for <= 128 bytes */ > + if (n <= 32) { > + rte_mov16((uint8_t *)dst, (const uint8_t *)src); > + rte_mov16((uint8_t *)dst - 16 + n, > + (const uint8_t *)src - 16 + n); > + return ret; > + } > + > + if (n <= 64) { > + rte_mov32((uint8_t *)dst, (const uint8_t *)src); > + rte_mov32((uint8_t *)dst - 32 + n, > + (const uint8_t *)src - 32 + n); > + return ret; > + } > + > + if (n <= 128) { > + rte_mov64((uint8_t *)dst, (const uint8_t *)src); > + rte_mov64((uint8_t *)dst - 64 + n, > + (const uint8_t *)src - 64 + n); > + return ret; > + } > + > + /* > + * For large copies > 128 bytes. This combination of 256, 64 and 16 byte > + * copies was found to be faster than doing 128 and 32 byte copies as > + * well. > + */ > + for ( ; n >= 256; n -= 256) { There is room for prefetching the next cacheline based on the cache line size. > + rte_mov256((uint8_t *)dst, (const uint8_t *)src); > + dst = (uint8_t *)dst + 256; > + src = (const uint8_t *)src + 256; > + } > + > + /* > + * We split the remaining bytes (which will be less than 256) into > + * 64byte (2^6) chunks. > + * Using incrementing integers in the case labels of a switch statement > + * enourages the compiler to use a jump table. To get incrementing > + * integers, we shift the 2 relevant bits to the LSB position to first > + * get decrementing integers, and then subtract. > + */ > + switch (3 - (n >> 6)) { > + case 0x00: > + rte_mov64((uint8_t *)dst, (const uint8_t *)src); > + n -= 64; > + dst = (uint8_t *)dst + 64; > + src = (const uint8_t *)src + 64; /* fallthrough */ > + case 0x01: > + rte_mov64((uint8_t *)dst, (const uint8_t *)src); > + n -= 64; > + dst = (uint8_t *)dst + 64; > + src = (const uint8_t *)src + 64; /* fallthrough */ > + case 0x02: > + rte_mov64((uint8_t *)dst, (const uint8_t *)src); > + n -= 64; > + dst = (uint8_t *)dst + 64; > + src = (const uint8_t *)src + 64; /* fallthrough */ > + default: > + break; > + } > + > + /* > + * We split the remaining bytes (which will be less than 64) into > + * 16byte (2^4) chunks, using the same switch structure as above. > + */ > + switch (3 - (n >> 4)) { > + case 0x00: > + rte_mov16((uint8_t *)dst, (const uint8_t *)src); > + n -= 16; > + dst = (uint8_t *)dst + 16; > + src = (const uint8_t *)src + 16; /* fallthrough */ > + case 0x01: > + rte_mov16((uint8_t *)dst, (const uint8_t *)src); > + n -= 16; > + dst = (uint8_t *)dst + 16; > + src = (const uint8_t *)src + 16; /* fallthrough */ > + case 0x02: > + rte_mov16((uint8_t *)dst, (const uint8_t *)src); > + n -= 16; > + dst = (uint8_t *)dst + 16; > + src = (const uint8_t *)src + 16; /* fallthrough */ > + default: > + break; > + } > + > + /* Copy any remaining bytes, without going beyond end of buffers */ > + if (n != 0) > + rte_mov16((uint8_t *)dst - 16 + n, > + (const uint8_t *)src - 16 + n); > + return ret; > +} > + > +#else > + > +static inline void > +rte_mov16(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 16); > +} > + > +static inline void > +rte_mov32(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 32); > +} > + > +static inline void > +rte_mov48(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 48); > +} > + > +static inline void > +rte_mov64(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 64); > +} > + > +static inline void > +rte_mov128(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 128); > +} > + > +static inline void > +rte_mov256(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 256); > +} > + > +static inline void * > +rte_memcpy(void *dst, const void *src, size_t n) > +{ > + return memcpy(dst, src, n); > +} > + > +static inline void * > +rte_memcpy_func(void *dst, const void *src, size_t n) > +{ > + return memcpy(dst, src, n); > +} > + > +#endif /* __ARM_NEON_FP */ > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* _RTE_MEMCPY_ARM_64_H_ */ > -- > 1.9.1 > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 4:57 ` Jerin Jacob @ 2015-11-02 12:22 ` Hunt, David 2015-11-02 12:45 ` Jan Viktorin 2015-11-02 12:57 ` Jerin Jacob 0 siblings, 2 replies; 18+ messages in thread From: Hunt, David @ 2015-11-02 12:22 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev On 02/11/2015 04:57, Jerin Jacob wrote: > On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote: >> Signed-off-by: David Hunt <david.hunt@intel.com> --snip-- >> +#ifndef _RTE_MEMCPY_ARM_64_H_ >> +#define _RTE_MEMCPY_ARM_64_H_ >> + >> +#include <stdint.h> >> +#include <string.h> >> + >> +#ifdef __cplusplus >> +extern "C" { >> +#endif >> + >> +#include "generic/rte_memcpy.h" >> + >> +#ifdef __ARM_NEON_FP > > SIMD is not optional in armv8 spec.So every armv8 machine will have > SIMD instruction unlike armv7.More over LDP/STP instruction is > not part of SIMD.So this check is not required or it can > be replaced with a check that select memcpy from either libc or this specific > implementation Jerin, I've just benchmarked the libc version against the hand-coded version of the memcpy routines, and the libc wins in most cases. This code was just an initial attempt at optimising the memccpy's, so I feel that with the current benchmark results, it would better just to remove the assembly versions, and use the libc version for the initial release on ARMv8. Then, in the future, the ARMv8 experts are free to submit an optimised version as a patch in the future. Does that sound reasonable to you? Rgds, Dave. --snip-- ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 12:22 ` Hunt, David @ 2015-11-02 12:45 ` Jan Viktorin 2015-11-02 12:57 ` Jerin Jacob 1 sibling, 0 replies; 18+ messages in thread From: Jan Viktorin @ 2015-11-02 12:45 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Mon, 2 Nov 2015 12:22:47 +0000 "Hunt, David" <david.hunt@intel.com> wrote: > On 02/11/2015 04:57, Jerin Jacob wrote: > > On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote: > >> Signed-off-by: David Hunt <david.hunt@intel.com> > --snip-- > >> +#ifndef _RTE_MEMCPY_ARM_64_H_ > >> +#define _RTE_MEMCPY_ARM_64_H_ > >> + > >> +#include <stdint.h> > >> +#include <string.h> > >> + > >> +#ifdef __cplusplus > >> +extern "C" { > >> +#endif > >> + > >> +#include "generic/rte_memcpy.h" > >> + > >> +#ifdef __ARM_NEON_FP > > > > SIMD is not optional in armv8 spec.So every armv8 machine will have > > SIMD instruction unlike armv7.More over LDP/STP instruction is > > not part of SIMD.So this check is not required or it can > > be replaced with a check that select memcpy from either libc or this specific > > implementation > > Jerin, > I've just benchmarked the libc version against the hand-coded > version of the memcpy routines, and the libc wins in most cases. This > code was just an initial attempt at optimising the memccpy's, so I feel > that with the current benchmark results, it would better just to remove > the assembly versions, and use the libc version for the initial release > on ARMv8. > Then, in the future, the ARMv8 experts are free to submit an optimised > version as a patch in the future. Does that sound reasonable to you? > Rgds, > Dave. As there is no use of NEON in the code, this optimization seems to be useless to me... Jan > > > --snip-- > > > -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 12:22 ` Hunt, David 2015-11-02 12:45 ` Jan Viktorin @ 2015-11-02 12:57 ` Jerin Jacob 2015-11-02 15:26 ` Hunt, David 1 sibling, 1 reply; 18+ messages in thread From: Jerin Jacob @ 2015-11-02 12:57 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote: > On 02/11/2015 04:57, Jerin Jacob wrote: > >On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote: > >>Signed-off-by: David Hunt <david.hunt@intel.com> > --snip-- > >>+#ifndef _RTE_MEMCPY_ARM_64_H_ > >>+#define _RTE_MEMCPY_ARM_64_H_ > >>+ > >>+#include <stdint.h> > >>+#include <string.h> > >>+ > >>+#ifdef __cplusplus > >>+extern "C" { > >>+#endif > >>+ > >>+#include "generic/rte_memcpy.h" > >>+ > >>+#ifdef __ARM_NEON_FP > > > >SIMD is not optional in armv8 spec.So every armv8 machine will have > >SIMD instruction unlike armv7.More over LDP/STP instruction is > >not part of SIMD.So this check is not required or it can > >be replaced with a check that select memcpy from either libc or this specific > >implementation > > Jerin, > I've just benchmarked the libc version against the hand-coded version of > the memcpy routines, and the libc wins in most cases. This code was just an > initial attempt at optimising the memccpy's, so I feel that with the current > benchmark results, it would better just to remove the assembly versions, and > use the libc version for the initial release on ARMv8. > Then, in the future, the ARMv8 experts are free to submit an optimised > version as a patch in the future. Does that sound reasonable to you? Make sense. Based on my understanding, other blocks are also not optimized for arm64. So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and libc for initial version. BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and "byteorder_autotest" is broken. I think existing arm64 code is not optimized beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified CONFIG_RTE_FORCE_INTRINSICS scheme. if you guys are OK with arm and arm64 as two different platform then I can summit the complete working patch for arm64.(as in my current source code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/) > Rgds, > Dave. > > > --snip-- > > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 12:57 ` Jerin Jacob @ 2015-11-02 15:26 ` Hunt, David 2015-11-02 15:36 ` Jan Viktorin 0 siblings, 1 reply; 18+ messages in thread From: Hunt, David @ 2015-11-02 15:26 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev On 02/11/2015 12:57, Jerin Jacob wrote: > On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote: >> Jerin, >> I've just benchmarked the libc version against the hand-coded version of >> the memcpy routines, and the libc wins in most cases. This code was just an >> initial attempt at optimising the memccpy's, so I feel that with the current >> benchmark results, it would better just to remove the assembly versions, and >> use the libc version for the initial release on ARMv8. >> Then, in the future, the ARMv8 experts are free to submit an optimised >> version as a patch in the future. Does that sound reasonable to you? > > Make sense. Based on my understanding, other blocks are also not optimized > for arm64. > So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and > libc for initial version. > > BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and > "byteorder_autotest" is broken. I think existing arm64 code is not optimized > beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified > CONFIG_RTE_FORCE_INTRINSICS scheme. Agreed. > if you guys are OK with arm and arm64 as two different platform then > I can summit the complete working patch for arm64.(as in my current source > code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/) Sure. That would be great. We initially started with two ARMv7 patch-sets, and Jan merged into one. Something similar could happen for the ARMv8 patch set. We just want to end up with the best implementation possible. :) Dave. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 15:26 ` Hunt, David @ 2015-11-02 15:36 ` Jan Viktorin 2015-11-02 15:49 ` Hunt, David 0 siblings, 1 reply; 18+ messages in thread From: Jan Viktorin @ 2015-11-02 15:36 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Mon, 2 Nov 2015 15:26:19 +0000 "Hunt, David" <david.hunt@intel.com> wrote: > On 02/11/2015 12:57, Jerin Jacob wrote: > > On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote: > >> Jerin, > >> I've just benchmarked the libc version against the hand-coded version of > >> the memcpy routines, and the libc wins in most cases. This code was just an > >> initial attempt at optimising the memccpy's, so I feel that with the current > >> benchmark results, it would better just to remove the assembly versions, and > >> use the libc version for the initial release on ARMv8. > >> Then, in the future, the ARMv8 experts are free to submit an optimised > >> version as a patch in the future. Does that sound reasonable to you? > > > > Make sense. Based on my understanding, other blocks are also not optimized > > for arm64. > > So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and > > libc for initial version. > > > > BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and > > "byteorder_autotest" is broken. I think existing arm64 code is not optimized > > beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified > > CONFIG_RTE_FORCE_INTRINSICS scheme. > > Agreed. > > > if you guys are OK with arm and arm64 as two different platform then > > I can summit the complete working patch for arm64.(as in my current source > > code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/) > > Sure. That would be great. We initially started with two ARMv7 > patch-sets, and Jan merged into one. Something similar could happen for > the ARMv8 patch set. We just want to end up with the best implementation > possible. :) > It was looking like we can share a lot of common code for both architectures. I didn't know how much different are the cpuflags. IMHO, it'd be better to have two directories arm and arm64. I thought to refer from arm64 to arm where possible. But I don't know whether is this possible with the DPDK build system. Jan > Dave. > > > > -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 15:36 ` Jan Viktorin @ 2015-11-02 15:49 ` Hunt, David 2015-11-02 16:29 ` Jerin Jacob 0 siblings, 1 reply; 18+ messages in thread From: Hunt, David @ 2015-11-02 15:49 UTC (permalink / raw) To: Jan Viktorin; +Cc: dev On 02/11/2015 15:36, Jan Viktorin wrote: > On Mon, 2 Nov 2015 15:26:19 +0000 --snip-- > It was looking like we can share a lot of common code for both > architectures. I didn't know how much different are the cpuflags. CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7 ones. static const struct feature_entry cpu_feature_table[] = { FEAT_DEF(FP, 0x00000001, 0, REG_HWCAP, 0) FEAT_DEF(ASIMD, 0x00000001, 0, REG_HWCAP, 1) FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 2) FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP, 3) FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP, 4) FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP, 5) FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP, 6) FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP, 7) FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0) FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1) }; > IMHO, it'd be better to have two directories arm and arm64. I thought > to refer from arm64 to arm where possible. But I don't know whether is > this possible with the DPDK build system. I think both methodologies have their pros and cons. However, I'd lean towards the common directory with the "filename_32/64.h" scheme, as that similar to the x86 methodology, and we don't need to tweak the include paths to pull files from multiple directories. Dave ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 15:49 ` Hunt, David @ 2015-11-02 16:29 ` Jerin Jacob 2015-11-02 17:29 ` Jan Viktorin 0 siblings, 1 reply; 18+ messages in thread From: Jerin Jacob @ 2015-11-02 16:29 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Mon, Nov 02, 2015 at 03:49:17PM +0000, Hunt, David wrote: > On 02/11/2015 15:36, Jan Viktorin wrote: > >On Mon, 2 Nov 2015 15:26:19 +0000 > --snip-- > >It was looking like we can share a lot of common code for both > >architectures. I didn't know how much different are the cpuflags. > > CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7 > ones. > > static const struct feature_entry cpu_feature_table[] = { > FEAT_DEF(FP, 0x00000001, 0, REG_HWCAP, 0) > FEAT_DEF(ASIMD, 0x00000001, 0, REG_HWCAP, 1) > FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 2) > FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP, 3) > FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP, 4) > FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP, 5) > FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP, 6) > FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP, 7) > FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0) > FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1) > }; > > >IMHO, it'd be better to have two directories arm and arm64. I thought > >to refer from arm64 to arm where possible. But I don't know whether is > >this possible with the DPDK build system. > > I think both methodologies have their pros and cons. However, I'd lean > towards the common directory with the "filename_32/64.h" scheme, as that > similar to the x86 methodology, and we don't need to tweak the include paths > to pull files from multiple directories. > I agree. Jan, could you please send the next version with filename_32/64.h for atomic and cpuflags(ie for all header files). I can re-base and send the complete arm64 patch based on your version. Thanks, Jerin > Dave > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 16:29 ` Jerin Jacob @ 2015-11-02 17:29 ` Jan Viktorin 0 siblings, 0 replies; 18+ messages in thread From: Jan Viktorin @ 2015-11-02 17:29 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev On Mon, 2 Nov 2015 21:59:12 +0530 Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote: > On Mon, Nov 02, 2015 at 03:49:17PM +0000, Hunt, David wrote: > > On 02/11/2015 15:36, Jan Viktorin wrote: > > >On Mon, 2 Nov 2015 15:26:19 +0000 > > --snip-- > > >It was looking like we can share a lot of common code for both > > >architectures. I didn't know how much different are the cpuflags. > > > > CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7 > > ones. > > > > static const struct feature_entry cpu_feature_table[] = { > > FEAT_DEF(FP, 0x00000001, 0, REG_HWCAP, 0) > > FEAT_DEF(ASIMD, 0x00000001, 0, REG_HWCAP, 1) > > FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 2) > > FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP, 3) > > FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP, 4) > > FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP, 5) > > FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP, 6) > > FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP, 7) > > FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0) > > FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1) > > }; > > > > >IMHO, it'd be better to have two directories arm and arm64. I thought > > >to refer from arm64 to arm where possible. But I don't know whether is > > >this possible with the DPDK build system. > > > > I think both methodologies have their pros and cons. However, I'd lean > > towards the common directory with the "filename_32/64.h" scheme, as that > > similar to the x86 methodology, and we don't need to tweak the include paths > > to pull files from multiple directories. > > > > I agree. Jan, could you please send the next version with > filename_32/64.h for atomic and cpuflags(ie for all header files). > I can re-base and send the complete arm64 patch based on your version. > I am working on it, however, after I've removed the unnecessary intrinsics code and set the RTE_FORCE_INTRINSICS=y, it doesn't build... So I'm figuring out what is wrong. Jan > Thanks, > Jerin > > > > > Dave > > -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 18+ messages in thread
* [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt @ 2015-10-30 13:49 ` David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt ` (3 subsequent siblings) 5 siblings, 0 replies; 18+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- .../common/include/arch/arm/rte_prefetch.h | 4 ++ .../common/include/arch/arm/rte_prefetch_64.h | 61 ++++++++++++++++++++++ 2 files changed, 65 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h index 1f46697..aa37de5 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h @@ -33,6 +33,10 @@ #ifndef _RTE_PREFETCH_ARM_H_ #define _RTE_PREFETCH_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_prefetch_64.h> +#else #include <rte_prefetch_32.h> +#endif #endif /* _RTE_PREFETCH_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h new file mode 100644 index 0000000..b0d9170 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h @@ -0,0 +1,61 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2015 Intel Corporation. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_PREFETCH_ARM_64_H_ +#define _RTE_PREFETCH_ARM_64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_prefetch.h" +/* May want to add PSTL1KEEP instructions for prefetch for ownership. */ +static inline void rte_prefetch0(const volatile void *p) +{ + asm volatile ("PRFM PLDL1KEEP, [%0]" : : "r" (p)); +} + +static inline void rte_prefetch1(const volatile void *p) +{ + asm volatile ("PRFM PLDL2KEEP, [%0]" : : "r" (p)); +} + +static inline void rte_prefetch2(const volatile void *p) +{ + asm volatile ("PRFM PLDL3KEEP, [%0]" : : "r" (p)); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PREFETCH_ARM_64_H_ */ -- 1.9.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt @ 2015-10-30 13:49 ` David Hunt 2015-11-02 5:15 ` Jerin Jacob 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt ` (2 subsequent siblings) 5 siblings, 1 reply; 18+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- .../common/include/arch/arm/rte_cycles.h | 4 ++ .../common/include/arch/arm/rte_cycles_64.h | 77 ++++++++++++++++++++++ 2 files changed, 81 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h index b2372fa..a8009a0 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h @@ -33,6 +33,10 @@ #ifndef _RTE_CYCLES_ARM_H_ #define _RTE_CYCLES_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_cycles_64.h> +#else #include <rte_cycles_32.h> +#endif #endif /* _RTE_CYCLES_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h new file mode 100644 index 0000000..148b9f4 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h @@ -0,0 +1,77 @@ +/* + * BSD LICENSE + * + * Copyright (C) IBM Corporation 2014. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of IBM Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef _RTE_CYCLES_ARM64_H_ +#define _RTE_CYCLES_ARM64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_cycles.h" + +/** + * Read the time base register. + * + * @return + * The time base for this lcore. + */ +static inline uint64_t +rte_rdtsc(void) +{ + uint64_t tsc; + + asm volatile("mrs %0, CNTVCT_EL0" : "=r" (tsc)); + +#ifdef RTE_TIMER_MULTIPLIER + return tsc * RTE_TIMER_MULTIPLIER; +#else + return tsc; +#endif + +} + +static inline uint64_t +rte_rdtsc_precise(void) +{ + asm volatile("isb sy" :::); + return rte_rdtsc(); +} + +static inline uint64_t +rte_get_tsc_cycles(void) { return rte_rdtsc(); } + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_CYCLES_ARM64_H_ */ -- 1.9.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt @ 2015-11-02 5:15 ` Jerin Jacob 0 siblings, 0 replies; 18+ messages in thread From: Jerin Jacob @ 2015-11-02 5:15 UTC (permalink / raw) To: David Hunt; +Cc: dev On Fri, Oct 30, 2015 at 01:49:16PM +0000, David Hunt wrote: > Signed-off-by: David Hunt <david.hunt@intel.com> > --- > .../common/include/arch/arm/rte_cycles.h | 4 ++ > .../common/include/arch/arm/rte_cycles_64.h | 77 ++++++++++++++++++++++ > 2 files changed, 81 insertions(+) > create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h > index b2372fa..a8009a0 100644 > --- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h > @@ -33,6 +33,10 @@ > #ifndef _RTE_CYCLES_ARM_H_ > #define _RTE_CYCLES_ARM_H_ > > +#ifdef RTE_ARCH_64 > +#include <rte_cycles_64.h> > +#else > #include <rte_cycles_32.h> > +#endif > > #endif /* _RTE_CYCLES_ARM_H_ */ > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h > new file mode 100644 > index 0000000..148b9f4 > --- /dev/null > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h > @@ -0,0 +1,77 @@ > +/* > + * BSD LICENSE > + * > + * Copyright (C) IBM Corporation 2014. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in > + * the documentation and/or other materials provided with the > + * distribution. > + * * Neither the name of IBM Corporation nor the names of its > + * contributors may be used to endorse or promote products derived > + * from this software without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > +*/ > + > +#ifndef _RTE_CYCLES_ARM64_H_ > +#define _RTE_CYCLES_ARM64_H_ > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +#include "generic/rte_cycles.h" > + > +/** > + * Read the time base register. > + * > + * @return > + * The time base for this lcore. > + */ > +static inline uint64_t > +rte_rdtsc(void) > +{ > + uint64_t tsc; > + > + asm volatile("mrs %0, CNTVCT_EL0" : "=r" (tsc)); > + > +#ifdef RTE_TIMER_MULTIPLIER > + return tsc * RTE_TIMER_MULTIPLIER; > +#else > + return tsc; > +#endif > + > +} > + > +static inline uint64_t > +rte_rdtsc_precise(void) > +{ > + asm volatile("isb sy" :::); IMO, it should be asm volatile("dmb ish" : : : "memory") to represent the data memory barrier(rte_mb()). > + return rte_rdtsc(); > +} > + > +static inline uint64_t > +rte_get_tsc_cycles(void) { return rte_rdtsc(); } > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* _RTE_CYCLES_ARM64_H_ */ > -- > 1.9.1 > ^ permalink raw reply [flat|nested] 18+ messages in thread
* [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt ` (2 preceding siblings ...) 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt @ 2015-10-30 13:49 ` David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt 5 siblings, 0 replies; 18+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h index 7ce9d14..5c5fd6a 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h +++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h @@ -141,12 +141,16 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf, __attribute__((unused)) uint32_t subleaf, cpuid_registers_t out) { int auxv_fd; +#ifdef RTE_ARCH_64 + Elf64_auxv_t auxv; +#else Elf32_auxv_t auxv; +#endif auxv_fd = open("/proc/self/auxv", O_RDONLY); assert(auxv_fd); while (read(auxv_fd, &auxv, - sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) { + sizeof(auxv)) == sizeof(auxv)) { if (auxv.a_type == AT_HWCAP) out[REG_HWCAP] = auxv.a_un.a_val; else if (auxv.a_type == AT_HWCAP2) -- 1.9.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt ` (3 preceding siblings ...) 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt @ 2015-10-30 13:49 ` David Hunt 2015-11-02 4:43 ` Jerin Jacob 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt 5 siblings, 1 reply; 18+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev The ARMv8 include files are in the arm directory in lib/librte_eal/common/include/arch/arm/ with the ARMv7 include files Signed-off-by: David Hunt <david.hunt@intel.com> --- MAINTAINERS | 3 +- config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++ doc/guides/rel_notes/release_2_2.rst | 7 ++-- mk/arch/arm64/rte.vars.mk | 58 ++++++++++++++++++++++++++++++ mk/machine/armv8a/rte.vars.mk | 57 +++++++++++++++++++++++++++++ 5 files changed, 177 insertions(+), 4 deletions(-) create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc create mode 100644 mk/arch/arm64/rte.vars.mk create mode 100644 mk/machine/armv8a/rte.vars.mk diff --git a/MAINTAINERS b/MAINTAINERS index a8933eb..4569f13 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -124,8 +124,9 @@ IBM POWER M: Chao Zhu <chaozhu@linux.vnet.ibm.com> F: lib/librte_eal/common/include/arch/ppc_64/ -ARM v7 +ARM M: Jan Viktorin <viktorin@rehivetech.com> +M: David Hunt <david.hunt@intel.com> F: lib/librte_eal/common/include/arch/arm/ Intel x86 diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc new file mode 100644 index 0000000..79a9533 --- /dev/null +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc @@ -0,0 +1,56 @@ +# BSD LICENSE +# +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +# + +#include "common_linuxapp" + +CONFIG_RTE_MACHINE="armv8a" + +CONFIG_RTE_ARCH="arm64" +CONFIG_RTE_ARCH_ARM64=y +CONFIG_RTE_ARCH_64=y +CONFIG_RTE_ARCH_ARM_NEON=y + +CONFIG_RTE_TOOLCHAIN="gcc" +CONFIG_RTE_TOOLCHAIN_GCC=y + +CONFIG_RTE_IXGBE_INC_VECTOR=n +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n +CONFIG_RTE_LIBRTE_IVSHMEM=n +CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n + +CONFIG_RTE_LIBRTE_LPM=n +CONFIG_RTE_LIBRTE_ACL=n +CONFIG_RTE_LIBRTE_TABLE=n +CONFIG_RTE_LIBRTE_PIPELINE=n + +# This is used to adjust the generic arm timer to align with the cpu cycle count. +CONFIG_RTE_TIMER_MULTIPLIER=48 diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst index 5b5bb4c..5aa523b 100644 --- a/doc/guides/rel_notes/release_2_2.rst +++ b/doc/guides/rel_notes/release_2_2.rst @@ -31,10 +31,11 @@ New Features * **Added vhost-user multiple queue support.** -* **Introduce ARMv7 architecture** +* **Introduce ARMv7 and ARMv8 architectures** - It is now possible to build DPDK for the ARMv7 platform and test with - virtual PMD drivers. + * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms. + * ARMv7 can be tested with virtual PMD drivers. + * ARMv8 can be tested with virtual and physical PMD drivers. Resolved Issues diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk new file mode 100644 index 0000000..3aad712 --- /dev/null +++ b/mk/arch/arm64/rte.vars.mk @@ -0,0 +1,58 @@ +# BSD LICENSE +# +# Copyright(c) 2015 Intel Corporation. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# +# arch: +# +# - define ARCH variable (overridden by cmdline or by previous +# optional define in machine .mk) +# - define CROSS variable (overridden by cmdline or previous define +# in machine .mk) +# - define CPU_CFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - define CPU_LDFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - define CPU_ASFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - may override any previously defined variable +# +# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32 +# + +ARCH ?= arm64 +# common arch dir in eal headers +ARCH_DIR := arm +CROSS ?= + +CPU_CFLAGS ?= -DRTE_CACHE_LINE_SIZE=64 +CPU_LDFLAGS ?= +CPU_ASFLAGS ?= -felf + +export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk new file mode 100644 index 0000000..b785062 --- /dev/null +++ b/mk/machine/armv8a/rte.vars.mk @@ -0,0 +1,57 @@ +# BSD LICENSE +# +# Copyright(c) 2015 Intel Corporation. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# +# machine: +# +# - can define ARCH variable (overridden by cmdline value) +# - can define CROSS variable (overridden by cmdline value) +# - define MACHINE_CFLAGS variable (overridden by cmdline value) +# - define MACHINE_LDFLAGS variable (overridden by cmdline value) +# - define MACHINE_ASFLAGS variable (overridden by cmdline value) +# - can define CPU_CFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_LDFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_ASFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - may override any previously defined variable +# + +# ARCH = +# CROSS = +# MACHINE_CFLAGS = +# MACHINE_LDFLAGS = +# MACHINE_ASFLAGS = +# CPU_CFLAGS = +# CPU_LDFLAGS = +# CPU_ASFLAGS = + +MACHINE_CFLAGS += -march=armv8-a -- 1.9.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt @ 2015-11-02 4:43 ` Jerin Jacob 0 siblings, 0 replies; 18+ messages in thread From: Jerin Jacob @ 2015-11-02 4:43 UTC (permalink / raw) To: David Hunt; +Cc: dev On Fri, Oct 30, 2015 at 01:49:18PM +0000, David Hunt wrote: > The ARMv8 include files are in the arm directory in > lib/librte_eal/common/include/arch/arm/ with the ARMv7 include files > > Signed-off-by: David Hunt <david.hunt@intel.com> > --- > MAINTAINERS | 3 +- > config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++ > doc/guides/rel_notes/release_2_2.rst | 7 ++-- > mk/arch/arm64/rte.vars.mk | 58 ++++++++++++++++++++++++++++++ > mk/machine/armv8a/rte.vars.mk | 57 +++++++++++++++++++++++++++++ > 5 files changed, 177 insertions(+), 4 deletions(-) > create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc > create mode 100644 mk/arch/arm64/rte.vars.mk > create mode 100644 mk/machine/armv8a/rte.vars.mk > > diff --git a/MAINTAINERS b/MAINTAINERS > index a8933eb..4569f13 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -124,8 +124,9 @@ IBM POWER > M: Chao Zhu <chaozhu@linux.vnet.ibm.com> > F: lib/librte_eal/common/include/arch/ppc_64/ > > -ARM v7 > +ARM > M: Jan Viktorin <viktorin@rehivetech.com> > +M: David Hunt <david.hunt@intel.com> > F: lib/librte_eal/common/include/arch/arm/ > > Intel x86 > diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc > new file mode 100644 > index 0000000..79a9533 > --- /dev/null > +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc > @@ -0,0 +1,56 @@ > +# BSD LICENSE > +# > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. > +# All rights reserved. > +# > +# Redistribution and use in source and binary forms, with or without > +# modification, are permitted provided that the following conditions > +# are met: > +# > +# * Redistributions of source code must retain the above copyright > +# notice, this list of conditions and the following disclaimer. > +# * Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in > +# the documentation and/or other materials provided with the > +# distribution. > +# * Neither the name of Intel Corporation nor the names of its > +# contributors may be used to endorse or promote products derived > +# from this software without specific prior written permission. > +# > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > +# > + > +#include "common_linuxapp" > + > +CONFIG_RTE_MACHINE="armv8a" > + > +CONFIG_RTE_ARCH="arm64" > +CONFIG_RTE_ARCH_ARM64=y > +CONFIG_RTE_ARCH_64=y > +CONFIG_RTE_ARCH_ARM_NEON=y > + > +CONFIG_RTE_TOOLCHAIN="gcc" > +CONFIG_RTE_TOOLCHAIN_GCC=y > + > +CONFIG_RTE_IXGBE_INC_VECTOR=n > +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n > +CONFIG_RTE_LIBRTE_IVSHMEM=n > +CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n > + > +CONFIG_RTE_LIBRTE_LPM=n > +CONFIG_RTE_LIBRTE_ACL=n > +CONFIG_RTE_LIBRTE_TABLE=n > +CONFIG_RTE_LIBRTE_PIPELINE=n > + > +# This is used to adjust the generic arm timer to align with the cpu cycle count. > +CONFIG_RTE_TIMER_MULTIPLIER=48 Introducing a build-time dependency with cpu clock parameter not a good idea. Either this parameter needs be removed or find out out the multiplier at run-time by introducing a machine specific hook > diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst > index 5b5bb4c..5aa523b 100644 > --- a/doc/guides/rel_notes/release_2_2.rst > +++ b/doc/guides/rel_notes/release_2_2.rst > @@ -31,10 +31,11 @@ New Features > > * **Added vhost-user multiple queue support.** > > -* **Introduce ARMv7 architecture** > +* **Introduce ARMv7 and ARMv8 architectures** > > - It is now possible to build DPDK for the ARMv7 platform and test with > - virtual PMD drivers. > + * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms. > + * ARMv7 can be tested with virtual PMD drivers. > + * ARMv8 can be tested with virtual and physical PMD drivers. > > > Resolved Issues > diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk > new file mode 100644 > index 0000000..3aad712 > --- /dev/null > +++ b/mk/arch/arm64/rte.vars.mk > @@ -0,0 +1,58 @@ > +# BSD LICENSE > +# > +# Copyright(c) 2015 Intel Corporation. All rights reserved. > +# > +# Redistribution and use in source and binary forms, with or without > +# modification, are permitted provided that the following conditions > +# are met: > +# > +# * Redistributions of source code must retain the above copyright > +# notice, this list of conditions and the following disclaimer. > +# * Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in > +# the documentation and/or other materials provided with the > +# distribution. > +# * Neither the name of Intel Corporation nor the names of its > +# contributors may be used to endorse or promote products derived > +# from this software without specific prior written permission. > +# > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > + > +# > +# arch: > +# > +# - define ARCH variable (overridden by cmdline or by previous > +# optional define in machine .mk) > +# - define CROSS variable (overridden by cmdline or previous define > +# in machine .mk) > +# - define CPU_CFLAGS variable (overridden by cmdline or previous > +# define in machine .mk) > +# - define CPU_LDFLAGS variable (overridden by cmdline or previous > +# define in machine .mk) > +# - define CPU_ASFLAGS variable (overridden by cmdline or previous > +# define in machine .mk) > +# - may override any previously defined variable > +# > +# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32 > +# > + > +ARCH ?= arm64 > +# common arch dir in eal headers > +ARCH_DIR := arm > +CROSS ?= > + > +CPU_CFLAGS ?= -DRTE_CACHE_LINE_SIZE=64 cache line size can be moved to MACHINE_CFLAGS as its more of machine parameter.so that if machine has different cache line size(based on arm64) can have new target like defconfig_arm64-xxxxxxx-linuxapp-gcc > +CPU_LDFLAGS ?= > +CPU_ASFLAGS ?= -felf > + > +export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS > diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk > new file mode 100644 > index 0000000..b785062 > --- /dev/null > +++ b/mk/machine/armv8a/rte.vars.mk > @@ -0,0 +1,57 @@ > +# BSD LICENSE > +# > +# Copyright(c) 2015 Intel Corporation. All rights reserved. > +# > +# Redistribution and use in source and binary forms, with or without > +# modification, are permitted provided that the following conditions > +# are met: > +# > +# * Redistributions of source code must retain the above copyright > +# notice, this list of conditions and the following disclaimer. > +# * Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in > +# the documentation and/or other materials provided with the > +# distribution. > +# * Neither the name of Intel Corporation nor the names of its > +# contributors may be used to endorse or promote products derived > +# from this software without specific prior written permission. > +# > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > + > +# > +# machine: > +# > +# - can define ARCH variable (overridden by cmdline value) > +# - can define CROSS variable (overridden by cmdline value) > +# - define MACHINE_CFLAGS variable (overridden by cmdline value) > +# - define MACHINE_LDFLAGS variable (overridden by cmdline value) > +# - define MACHINE_ASFLAGS variable (overridden by cmdline value) > +# - can define CPU_CFLAGS variable (overridden by cmdline value) that > +# overrides the one defined in arch. > +# - can define CPU_LDFLAGS variable (overridden by cmdline value) that > +# overrides the one defined in arch. > +# - can define CPU_ASFLAGS variable (overridden by cmdline value) that > +# overrides the one defined in arch. > +# - may override any previously defined variable > +# > + > +# ARCH = > +# CROSS = > +# MACHINE_CFLAGS = > +# MACHINE_LDFLAGS = > +# MACHINE_ASFLAGS = > +# CPU_CFLAGS = > +# CPU_LDFLAGS = > +# CPU_ASFLAGS = > + > +MACHINE_CFLAGS += -march=armv8-a > -- > 1.9.1 > ^ permalink raw reply [flat|nested] 18+ messages in thread
* [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt ` (4 preceding siblings ...) 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt @ 2015-10-30 13:49 ` David Hunt 5 siblings, 0 replies; 18+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- app/test/test_cpuflags.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c index 557458f..1689048 100644 --- a/app/test/test_cpuflags.c +++ b/app/test/test_cpuflags.c @@ -1,4 +1,4 @@ -/*- +/* * BSD LICENSE * * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. @@ -115,9 +115,18 @@ test_cpuflags(void) CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP); #endif -#if defined(RTE_ARCH_ARM) +#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64) + printf("Checking for Floating Point:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_FPA); + printf("Check for NEON:\t\t"); CHECK_FOR_FLAG(RTE_CPUFLAG_NEON); + + printf("Checking for ARM32 mode:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH32); + + printf("Checking for ARM64 mode:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH64); #endif #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) -- 1.9.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2015-11-02 17:31 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt 2015-11-02 4:57 ` Jerin Jacob 2015-11-02 12:22 ` Hunt, David 2015-11-02 12:45 ` Jan Viktorin 2015-11-02 12:57 ` Jerin Jacob 2015-11-02 15:26 ` Hunt, David 2015-11-02 15:36 ` Jan Viktorin 2015-11-02 15:49 ` Hunt, David 2015-11-02 16:29 ` Jerin Jacob 2015-11-02 17:29 ` Jan Viktorin 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt 2015-11-02 5:15 ` Jerin Jacob 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt 2015-11-02 4:43 ` Jerin Jacob 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).