* [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support @ 2015-10-30 13:49 David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt ` (5 more replies) 0 siblings, 6 replies; 28+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev

This is the v3 patchset for ARMv8, which now sits on top of the v5 patch of the ARMv7 code by RehiveTech. It adds code into the same arm include directory, reducing code duplication.

Tested on an XGene 64-bit ARM server board with PCI slots. Passes traffic between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should work with many other NICs, but these are as yet untested.

Compiles igb_uio, kni and all the physical device PMDs.

An entry has been added to the release notes.

We hope that this will encourage the ARM community to contribute PMDs for their SoCs to DPDK.

For now, we've added some Intel engineers to the MAINTAINERS file. We would like to encourage the ARM community to take over maintenance of this area in future, and to further improve it.

Notes on arm64 kernel configuration:

Using Ubuntu 14.04 LTS with a 4.3.0-rc6 kernel (with modified PCI drivers), and uio_pci_generic.

ARM64 kernels do not seem to have functional resource mapping of PCI memory (PCI_MMAP), so the PCI driver needs to be patched to enable this. The symptom is that the /sys/bus/pci/devices/0000:0X:00.Y directory is missing the resource0...N files used for mmapping the device memory. Earlier kernels (3.13.x) had these files present, but mmapping resulted in a "Bus Error" when the NIC memory was accessed. However, during limited testing with a modified 4.3.0-rc6 kernel, we were able to mmap the NIC memory and pass traffic between the two ports on an 82599 NIC connected via fibre cable. We have no plans to upstream a kernel patch for this, and hope that someone more familiar with the ARM architecture can create a proper patch and enable this functionality.
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com> David Hunt (6): eal/arm: add 64-bit armv8 version of rte_memcpy.h eal/arm: add 64-bit armv8 version of rte_prefetch.h eal/arm: add 64-bit armv8 version of rte_cycles.h eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h mk: add support for armv8 on top of armv7 test: add checks for cpu flags on armv8 MAINTAINERS | 3 +- app/test/test_cpuflags.c | 13 +- config/defconfig_arm64-armv8a-linuxapp-gcc | 56 ++++ doc/guides/rel_notes/release_2_2.rst | 7 +- .../common/include/arch/arm/rte_cpuflags.h | 6 +- .../common/include/arch/arm/rte_cycles.h | 4 + .../common/include/arch/arm/rte_cycles_64.h | 77 ++++++ .../common/include/arch/arm/rte_memcpy.h | 4 + .../common/include/arch/arm/rte_memcpy_64.h | 308 +++++++++++++++++++++ .../common/include/arch/arm/rte_prefetch.h | 4 + .../common/include/arch/arm/rte_prefetch_64.h | 61 ++++ mk/arch/arm64/rte.vars.mk | 58 ++++ mk/machine/armv8a/rte.vars.mk | 57 ++++ 13 files changed, 651 insertions(+), 7 deletions(-) create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h create mode 100644 mk/arch/arm64/rte.vars.mk create mode 100644 mk/machine/armv8a/rte.vars.mk -- 1.9.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt @ 2015-10-30 13:49 ` David Hunt 2015-11-02 4:57 ` Jerin Jacob 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt ` (4 subsequent siblings) 5 siblings, 1 reply; 28+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- .../common/include/arch/arm/rte_memcpy.h | 4 + .../common/include/arch/arm/rte_memcpy_64.h | 308 +++++++++++++++++++++ 2 files changed, 312 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h index d9f5bf1..1d562c3 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h @@ -33,6 +33,10 @@ #ifndef _RTE_MEMCPY_ARM_H_ #define _RTE_MEMCPY_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_memcpy_64.h> +#else #include <rte_memcpy_32.h> +#endif #endif /* _RTE_MEMCPY_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h new file mode 100644 index 0000000..6d85113 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h @@ -0,0 +1,308 @@ +/* + * BSD LICENSE + * + * Copyright (C) IBM Corporation 2014. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of IBM Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+*/ + +#ifndef _RTE_MEMCPY_ARM_64_H_ +#define _RTE_MEMCPY_ARM_64_H_ + +#include <stdint.h> +#include <string.h> + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_memcpy.h" + +#ifdef __ARM_NEON_FP + +/* ARM NEON Intrinsics are used to copy data */ +#include <arm_neon.h> + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP d0, d1, [%0]\n\t" + "STP d0, d1, [%1]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP d0, d1, [%0 , #32]\n\t" + "STP d0, d1, [%1 , #32]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP q0, q1, [%0 , #32]\n\t" + "STP q0, q1, [%1 , #32]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP q0, q1, [%0 , #32]\n\t" + "STP q0, q1, [%1 , #32]\n\t" + "LDP q0, q1, [%0 , #64]\n\t" + "STP q0, q1, [%1 , #64]\n\t" + "LDP q0, q1, [%0 , #96]\n\t" + "STP q0, q1, [%1 , #96]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + asm volatile("LDP q0, q1, [%0]\n\t" + "STP q0, q1, [%1]\n\t" + "LDP q0, q1, [%0 , #32]\n\t" + "STP q0, q1, [%1 , #32]\n\t" + "LDP q0, q1, [%0 , #64]\n\t" + "STP q0, q1, [%1 , #64]\n\t" + "LDP q0, q1, [%0 , #96]\n\t" + "STP q0, q1, [%1 , #96]\n\t" + "LDP q0, q1, [%0 , #128]\n\t" + "STP q0, q1, [%1 , #128]\n\t" + "LDP q0, q1, [%0 , #160]\n\t" + "STP q0, q1, [%1 , #160]\n\t" + "LDP q0, q1, [%0 , #192]\n\t" + "STP q0, q1, [%1 , #192]\n\t" + "LDP q0, q1, [%0 , #224]\n\t" + 
"STP q0, q1, [%1 , #224]\n\t" + : : "r" (src), "r" (dst) : + ); +} + +#define rte_memcpy(dst, src, n) \ + ({ (__builtin_constant_p(n)) ? \ + memcpy((dst), (src), (n)) : \ + rte_memcpy_func((dst), (src), (n)); }) + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + void *ret = dst; + + /* We can't copy < 16 bytes using XMM registers so do it manually. */ + if (n < 16) { + if (n & 0x01) { + *(uint8_t *)dst = *(const uint8_t *)src; + dst = (uint8_t *)dst + 1; + src = (const uint8_t *)src + 1; + } + if (n & 0x02) { + *(uint16_t *)dst = *(const uint16_t *)src; + dst = (uint16_t *)dst + 1; + src = (const uint16_t *)src + 1; + } + if (n & 0x04) { + *(uint32_t *)dst = *(const uint32_t *)src; + dst = (uint32_t *)dst + 1; + src = (const uint32_t *)src + 1; + } + if (n & 0x08) + *(uint64_t *)dst = *(const uint64_t *)src; + return ret; + } + + /* Special fast cases for <= 128 bytes */ + if (n <= 32) { + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; + } + + if (n <= 64) { + rte_mov32((uint8_t *)dst, (const uint8_t *)src); + rte_mov32((uint8_t *)dst - 32 + n, + (const uint8_t *)src - 32 + n); + return ret; + } + + if (n <= 128) { + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + rte_mov64((uint8_t *)dst - 64 + n, + (const uint8_t *)src - 64 + n); + return ret; + } + + /* + * For large copies > 128 bytes. This combination of 256, 64 and 16 byte + * copies was found to be faster than doing 128 and 32 byte copies as + * well. + */ + for ( ; n >= 256; n -= 256) { + rte_mov256((uint8_t *)dst, (const uint8_t *)src); + dst = (uint8_t *)dst + 256; + src = (const uint8_t *)src + 256; + } + + /* + * We split the remaining bytes (which will be less than 256) into + * 64byte (2^6) chunks. + * Using incrementing integers in the case labels of a switch statement + * enourages the compiler to use a jump table. 
To get incrementing + * integers, we shift the 2 relevant bits to the LSB position to first + * get decrementing integers, and then subtract. + */ + switch (3 - (n >> 6)) { + case 0x00: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x01: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + case 0x02: + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + n -= 64; + dst = (uint8_t *)dst + 64; + src = (const uint8_t *)src + 64; /* fallthrough */ + default: + break; + } + + /* + * We split the remaining bytes (which will be less than 64) into + * 16byte (2^4) chunks, using the same switch structure as above. + */ + switch (3 - (n >> 4)) { + case 0x00: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x01: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + case 0x02: + rte_mov16((uint8_t *)dst, (const uint8_t *)src); + n -= 16; + dst = (uint8_t *)dst + 16; + src = (const uint8_t *)src + 16; /* fallthrough */ + default: + break; + } + + /* Copy any remaining bytes, without going beyond end of buffers */ + if (n != 0) + rte_mov16((uint8_t *)dst - 16 + n, + (const uint8_t *)src - 16 + n); + return ret; +} + +#else + +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 16); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 32); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 48); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 64); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 
128); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 256); +} + +static inline void * +rte_memcpy(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +static inline void * +rte_memcpy_func(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} + +#endif /* __ARM_NEON_FP */ + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_MEMCPY_ARM_64_H_ */ -- 1.9.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
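A note on the 16..32 byte fast path in rte_memcpy_func() above: it copies the first 16 bytes and the last 16 bytes, and lets the two moves overlap when n is below 32, so no byte loop is needed and nothing is written past dst + n. The following is a minimal portable sketch of that idea only. It uses a fixed-size memcpy in place of the patch's LDP/STP asm, and the names copy16 and copy_16_to_32 are illustrative, not from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for rte_mov16(): copy exactly 16 bytes. */
static inline void
copy16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}

/*
 * Copy n bytes, 16 <= n <= 32, with two 16-byte moves.
 * The second move ends exactly at dst + n, so it overlaps the first
 * by (32 - n) bytes instead of running past the end of the buffers.
 */
static inline void *
copy_16_to_32(void *dst, const void *src, size_t n)
{
	assert(n >= 16 && n <= 32);
	copy16((uint8_t *)dst, (const uint8_t *)src);
	copy16((uint8_t *)dst + n - 16, (const uint8_t *)src + n - 16);
	return dst;
}
```

The same pattern, with 32 and 64 byte moves, covers the n <= 64 and n <= 128 cases in the patch. The overlapping bytes are simply written twice with the same values.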
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt @ 2015-11-02 4:57 ` Jerin Jacob 2015-11-02 12:22 ` Hunt, David 0 siblings, 1 reply; 28+ messages in thread From: Jerin Jacob @ 2015-11-02 4:57 UTC (permalink / raw) To: David Hunt; +Cc: dev On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote: > Signed-off-by: David Hunt <david.hunt@intel.com> > --- > .../common/include/arch/arm/rte_memcpy.h | 4 + > .../common/include/arch/arm/rte_memcpy_64.h | 308 +++++++++++++++++++++ > 2 files changed, 312 insertions(+) > create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h > index d9f5bf1..1d562c3 100644 > --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h > +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h > @@ -33,6 +33,10 @@ > #ifndef _RTE_MEMCPY_ARM_H_ > #define _RTE_MEMCPY_ARM_H_ > > +#ifdef RTE_ARCH_64 > +#include <rte_memcpy_64.h> > +#else > #include <rte_memcpy_32.h> > +#endif > > #endif /* _RTE_MEMCPY_ARM_H_ */ > diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > new file mode 100644 > index 0000000..6d85113 > --- /dev/null > +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > @@ -0,0 +1,308 @@ > +/* > + * BSD LICENSE > + * > + * Copyright (C) IBM Corporation 2014. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. 
> + * * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in > + * the documentation and/or other materials provided with the > + * distribution. > + * * Neither the name of IBM Corporation nor the names of its > + * contributors may be used to endorse or promote products derived > + * from this software without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
> +*/ > + > +#ifndef _RTE_MEMCPY_ARM_64_H_ > +#define _RTE_MEMCPY_ARM_64_H_ > + > +#include <stdint.h> > +#include <string.h> > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +#include "generic/rte_memcpy.h" > + > +#ifdef __ARM_NEON_FP SIMD is not optional in armv8 spec.So every armv8 machine will have SIMD instruction unlike armv7.More over LDP/STP instruction is not part of SIMD.So this check is not required or it can be replaced with a check that select memcpy from either libc or this specific implementation > + > +/* ARM NEON Intrinsics are used to copy data */ > +#include <arm_neon.h> > + > +static inline void > +rte_mov16(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP d0, d1, [%0]\n\t" > + "STP d0, d1, [%1]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} IMO, no need to hardcode registers used for the mem move(d0, d1). Let compiler schedule the registers for better performance. > + > +static inline void > +rte_mov32(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} > + > +static inline void > +rte_mov48(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + "LDP d0, d1, [%0 , #32]\n\t" > + "STP d0, d1, [%1 , #32]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} > + > +static inline void > +rte_mov64(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + "LDP q0, q1, [%0 , #32]\n\t" > + "STP q0, q1, [%1 , #32]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} > + > +static inline void > +rte_mov128(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + "LDP q0, q1, [%0 , #32]\n\t" > + "STP q0, q1, [%1 , #32]\n\t" > + "LDP q0, q1, [%0 , #64]\n\t" > + "STP q0, q1, [%1 , #64]\n\t" > + "LDP q0, q1, [%0 , #96]\n\t" > + "STP q0, q1, [%1 , #96]\n\t" > + : : "r" (src), "r" (dst) : > + ); > 
+} > + > +static inline void > +rte_mov256(uint8_t *dst, const uint8_t *src) > +{ > + asm volatile("LDP q0, q1, [%0]\n\t" > + "STP q0, q1, [%1]\n\t" > + "LDP q0, q1, [%0 , #32]\n\t" > + "STP q0, q1, [%1 , #32]\n\t" > + "LDP q0, q1, [%0 , #64]\n\t" > + "STP q0, q1, [%1 , #64]\n\t" > + "LDP q0, q1, [%0 , #96]\n\t" > + "STP q0, q1, [%1 , #96]\n\t" > + "LDP q0, q1, [%0 , #128]\n\t" > + "STP q0, q1, [%1 , #128]\n\t" > + "LDP q0, q1, [%0 , #160]\n\t" > + "STP q0, q1, [%1 , #160]\n\t" > + "LDP q0, q1, [%0 , #192]\n\t" > + "STP q0, q1, [%1 , #192]\n\t" > + "LDP q0, q1, [%0 , #224]\n\t" > + "STP q0, q1, [%1 , #224]\n\t" > + : : "r" (src), "r" (dst) : > + ); > +} > + > +#define rte_memcpy(dst, src, n) \ > + ({ (__builtin_constant_p(n)) ? \ > + memcpy((dst), (src), (n)) : \ > + rte_memcpy_func((dst), (src), (n)); }) > + > +static inline void * > +rte_memcpy_func(void *dst, const void *src, size_t n) > +{ > + void *ret = dst; > + > + /* We can't copy < 16 bytes using XMM registers so do it manually. 
*/ > + if (n < 16) { > + if (n & 0x01) { > + *(uint8_t *)dst = *(const uint8_t *)src; > + dst = (uint8_t *)dst + 1; > + src = (const uint8_t *)src + 1; > + } > + if (n & 0x02) { > + *(uint16_t *)dst = *(const uint16_t *)src; > + dst = (uint16_t *)dst + 1; > + src = (const uint16_t *)src + 1; > + } > + if (n & 0x04) { > + *(uint32_t *)dst = *(const uint32_t *)src; > + dst = (uint32_t *)dst + 1; > + src = (const uint32_t *)src + 1; > + } > + if (n & 0x08) > + *(uint64_t *)dst = *(const uint64_t *)src; > + return ret; > + } > + > + /* Special fast cases for <= 128 bytes */ > + if (n <= 32) { > + rte_mov16((uint8_t *)dst, (const uint8_t *)src); > + rte_mov16((uint8_t *)dst - 16 + n, > + (const uint8_t *)src - 16 + n); > + return ret; > + } > + > + if (n <= 64) { > + rte_mov32((uint8_t *)dst, (const uint8_t *)src); > + rte_mov32((uint8_t *)dst - 32 + n, > + (const uint8_t *)src - 32 + n); > + return ret; > + } > + > + if (n <= 128) { > + rte_mov64((uint8_t *)dst, (const uint8_t *)src); > + rte_mov64((uint8_t *)dst - 64 + n, > + (const uint8_t *)src - 64 + n); > + return ret; > + } > + > + /* > + * For large copies > 128 bytes. This combination of 256, 64 and 16 byte > + * copies was found to be faster than doing 128 and 32 byte copies as > + * well. > + */ > + for ( ; n >= 256; n -= 256) { There is room for prefetching the next cacheline based on the cache line size. > + rte_mov256((uint8_t *)dst, (const uint8_t *)src); > + dst = (uint8_t *)dst + 256; > + src = (const uint8_t *)src + 256; > + } > + > + /* > + * We split the remaining bytes (which will be less than 256) into > + * 64byte (2^6) chunks. > + * Using incrementing integers in the case labels of a switch statement > + * enourages the compiler to use a jump table. To get incrementing > + * integers, we shift the 2 relevant bits to the LSB position to first > + * get decrementing integers, and then subtract. 
> + */ > + switch (3 - (n >> 6)) { > + case 0x00: > + rte_mov64((uint8_t *)dst, (const uint8_t *)src); > + n -= 64; > + dst = (uint8_t *)dst + 64; > + src = (const uint8_t *)src + 64; /* fallthrough */ > + case 0x01: > + rte_mov64((uint8_t *)dst, (const uint8_t *)src); > + n -= 64; > + dst = (uint8_t *)dst + 64; > + src = (const uint8_t *)src + 64; /* fallthrough */ > + case 0x02: > + rte_mov64((uint8_t *)dst, (const uint8_t *)src); > + n -= 64; > + dst = (uint8_t *)dst + 64; > + src = (const uint8_t *)src + 64; /* fallthrough */ > + default: > + break; > + } > + > + /* > + * We split the remaining bytes (which will be less than 64) into > + * 16byte (2^4) chunks, using the same switch structure as above. > + */ > + switch (3 - (n >> 4)) { > + case 0x00: > + rte_mov16((uint8_t *)dst, (const uint8_t *)src); > + n -= 16; > + dst = (uint8_t *)dst + 16; > + src = (const uint8_t *)src + 16; /* fallthrough */ > + case 0x01: > + rte_mov16((uint8_t *)dst, (const uint8_t *)src); > + n -= 16; > + dst = (uint8_t *)dst + 16; > + src = (const uint8_t *)src + 16; /* fallthrough */ > + case 0x02: > + rte_mov16((uint8_t *)dst, (const uint8_t *)src); > + n -= 16; > + dst = (uint8_t *)dst + 16; > + src = (const uint8_t *)src + 16; /* fallthrough */ > + default: > + break; > + } > + > + /* Copy any remaining bytes, without going beyond end of buffers */ > + if (n != 0) > + rte_mov16((uint8_t *)dst - 16 + n, > + (const uint8_t *)src - 16 + n); > + return ret; > +} > + > +#else > + > +static inline void > +rte_mov16(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 16); > +} > + > +static inline void > +rte_mov32(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 32); > +} > + > +static inline void > +rte_mov48(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 48); > +} > + > +static inline void > +rte_mov64(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 64); > +} > + > +static inline void > +rte_mov128(uint8_t *dst, const uint8_t *src) > +{ > 
+ memcpy(dst, src, 128); > +} > + > +static inline void > +rte_mov256(uint8_t *dst, const uint8_t *src) > +{ > + memcpy(dst, src, 256); > +} > + > +static inline void * > +rte_memcpy(void *dst, const void *src, size_t n) > +{ > + return memcpy(dst, src, n); > +} > + > +static inline void * > +rte_memcpy_func(void *dst, const void *src, size_t n) > +{ > + return memcpy(dst, src, n); > +} > + > +#endif /* __ARM_NEON_FP */ > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* _RTE_MEMCPY_ARM_64_H_ */ > -- > 1.9.1 > ^ permalink raw reply [flat|nested] 28+ messages in thread
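One possible way to act on the review comment about hardcoded registers is sketched below. It is an assumption about what "let the compiler schedule the registers" could look like, not code from this thread: on AArch64 it uses the NEON intrinsics vld1q_u8/vst1q_u8 from arm_neon.h, and elsewhere it falls back to a fixed-size memcpy so the sketch builds on any host. The names mov16_sketch and mov32_sketch are hypothetical. Note also that the asm blocks in the patch list no clobbers for d0/d1/q0/q1 or for memory, which the intrinsic form avoids by construction.

```c
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Copy 16 bytes without naming any register explicitly. */
static inline void
mov16_sketch(uint8_t *dst, const uint8_t *src)
{
#if defined(__aarch64__)
	/* One 128-bit load and one 128-bit store; the compiler
	 * chooses the vector register and may pair adjacent accesses. */
	vst1q_u8(dst, vld1q_u8(src));
#else
	/* Portable fallback so the sketch compiles on non-ARM hosts. */
	memcpy(dst, src, 16);
#endif
}

/* Copy 32 bytes as two independent 16-byte moves. */
static inline void
mov32_sketch(uint8_t *dst, const uint8_t *src)
{
	mov16_sketch(dst, src);
	mov16_sketch(dst + 16, src + 16);
}
```

Whether this beats libc on a given core would still have to be measured.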
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 4:57 ` Jerin Jacob @ 2015-11-02 12:22 ` Hunt, David 2015-11-02 12:45 ` Jan Viktorin 2015-11-02 12:57 ` Jerin Jacob 0 siblings, 2 replies; 28+ messages in thread From: Hunt, David @ 2015-11-02 12:22 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev

On 02/11/2015 04:57, Jerin Jacob wrote:
> On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:
>> Signed-off-by: David Hunt <david.hunt@intel.com>
--snip--
>> +#ifndef _RTE_MEMCPY_ARM_64_H_
>> +#define _RTE_MEMCPY_ARM_64_H_
>> +
>> +#include <stdint.h>
>> +#include <string.h>
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include "generic/rte_memcpy.h"
>> +
>> +#ifdef __ARM_NEON_FP
>
> SIMD is not optional in armv8 spec.So every armv8 machine will have
> SIMD instruction unlike armv7.More over LDP/STP instruction is
> not part of SIMD.So this check is not required or it can
> be replaced with a check that select memcpy from either libc or this specific
> implementation

Jerin,
   I've just benchmarked the libc version against the hand-coded version of the memcpy routines, and libc wins in most cases. This code was just an initial attempt at optimising the memcpy's, so given the current benchmark results, I feel it would be better to remove the assembly versions and use the libc version for the initial release on ARMv8.
Then the ARMv8 experts are free to submit an optimised version as a patch in the future. Does that sound reasonable to you?
Rgds,
Dave.

--snip--
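The benchmark itself is not shown in the thread. As a rough sketch of how such a libc versus hand-written comparison might be set up, the harness below times repeated copies through a function pointer. Everything here is an assumption for illustration: time_copies and byte_loop_copy are hypothetical names, the byte loop merely stands in for a hand-written candidate, and clock() gives coarse CPU time, so a large iteration count is needed for meaningful numbers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Common signature so libc memcpy and a candidate can be swapped. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Hypothetical hand-written candidate: a plain byte loop. */
static void *
byte_loop_copy(void *dst, const void *src, size_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* Return CPU seconds spent on `iters` copies of `n` bytes (n > 0). */
static double
time_copies(copy_fn fn, void *dst, const void *src, size_t n, unsigned iters)
{
	volatile uint8_t sink = 0;
	clock_t start = clock();

	for (unsigned i = 0; i < iters; i++) {
		fn(dst, src, n);
		/* Read back one byte so the copy cannot be optimised away. */
		sink ^= ((volatile uint8_t *)dst)[i % n];
	}
	(void)sink;
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A comparison would call time_copies(memcpy, ...) and time_copies(byte_loop_copy, ...) over a range of sizes and alignments, since small and large copies often behave differently.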
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 12:22 ` Hunt, David @ 2015-11-02 12:45 ` Jan Viktorin 2015-11-02 12:57 ` Jerin Jacob 1 sibling, 0 replies; 28+ messages in thread From: Jan Viktorin @ 2015-11-02 12:45 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Mon, 2 Nov 2015 12:22:47 +0000 "Hunt, David" <david.hunt@intel.com> wrote: > On 02/11/2015 04:57, Jerin Jacob wrote: > > On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote: > >> Signed-off-by: David Hunt <david.hunt@intel.com> > --snip-- > >> +#ifndef _RTE_MEMCPY_ARM_64_H_ > >> +#define _RTE_MEMCPY_ARM_64_H_ > >> + > >> +#include <stdint.h> > >> +#include <string.h> > >> + > >> +#ifdef __cplusplus > >> +extern "C" { > >> +#endif > >> + > >> +#include "generic/rte_memcpy.h" > >> + > >> +#ifdef __ARM_NEON_FP > > > > SIMD is not optional in armv8 spec.So every armv8 machine will have > > SIMD instruction unlike armv7.More over LDP/STP instruction is > > not part of SIMD.So this check is not required or it can > > be replaced with a check that select memcpy from either libc or this specific > > implementation > > Jerin, > I've just benchmarked the libc version against the hand-coded > version of the memcpy routines, and the libc wins in most cases. This > code was just an initial attempt at optimising the memccpy's, so I feel > that with the current benchmark results, it would better just to remove > the assembly versions, and use the libc version for the initial release > on ARMv8. > Then, in the future, the ARMv8 experts are free to submit an optimised > version as a patch in the future. Does that sound reasonable to you? > Rgds, > Dave. As there is no use of NEON in the code, this optimization seems to be useless to me... Jan > > > --snip-- > > > -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 12:22 ` Hunt, David 2015-11-02 12:45 ` Jan Viktorin @ 2015-11-02 12:57 ` Jerin Jacob 2015-11-02 15:26 ` Hunt, David 1 sibling, 1 reply; 28+ messages in thread From: Jerin Jacob @ 2015-11-02 12:57 UTC (permalink / raw) To: Hunt, David; +Cc: dev

On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote:
> On 02/11/2015 04:57, Jerin Jacob wrote:
> >On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:
> >>Signed-off-by: David Hunt <david.hunt@intel.com>
> --snip--
> >>+#ifndef _RTE_MEMCPY_ARM_64_H_
> >>+#define _RTE_MEMCPY_ARM_64_H_
> >>+
> >>+#include <stdint.h>
> >>+#include <string.h>
> >>+
> >>+#ifdef __cplusplus
> >>+extern "C" {
> >>+#endif
> >>+
> >>+#include "generic/rte_memcpy.h"
> >>+
> >>+#ifdef __ARM_NEON_FP
> >
> >SIMD is not optional in armv8 spec.So every armv8 machine will have
> >SIMD instruction unlike armv7.More over LDP/STP instruction is
> >not part of SIMD.So this check is not required or it can
> >be replaced with a check that select memcpy from either libc or this specific
> >implementation
>
> Jerin,
> I've just benchmarked the libc version against the hand-coded version of
> the memcpy routines, and the libc wins in most cases. This code was just an
> initial attempt at optimising the memccpy's, so I feel that with the current
> benchmark results, it would better just to remove the assembly versions, and
> use the libc version for the initial release on ARMv8.
> Then, in the future, the ARMv8 experts are free to submit an optimised
> version as a patch in the future. Does that sound reasonable to you?

Makes sense. As far as I understand, the other blocks are not optimized for arm64 either, so it is better to revert to CONFIG_RTE_FORCE_INTRINSICS and libc for the initial version.

BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and "byteorder_autotest" is broken. I think the existing arm64 code is not optimized beyond CONFIG_RTE_FORCE_INTRINSICS, so it is better to use the verified CONFIG_RTE_FORCE_INTRINSICS scheme.

If you are OK with arm and arm64 as two different platforms, then I can submit the complete working patch for arm64 (in my current source code, "arm64" is a separate platform: lib/librte_eal/common/include/arch/arm64/).

> Rgds,
> Dave.
>
> --snip--
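To illustrate what relying on compiler intrinsics rather than hand-written assembly can mean for something like the byte-order helpers, here is a minimal sketch using the GCC/Clang builtins __builtin_bswap32 and __builtin_bswap64. This is an assumption for illustration only: the swapNN_sketch names are hypothetical, and this is not the DPDK implementation selected by CONFIG_RTE_FORCE_INTRINSICS.

```c
#include <stdint.h>

/* 16-bit swap written with shifts, which any C compiler accepts. */
static inline uint16_t
swap16_sketch(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* 32-bit swap via the compiler builtin (GCC and Clang). */
static inline uint32_t
swap32_sketch(uint32_t x)
{
	return __builtin_bswap32(x);
}

/* 64-bit swap via the compiler builtin (GCC and Clang). */
static inline uint64_t
swap64_sketch(uint64_t x)
{
	return __builtin_bswap64(x);
}
```

The appeal of this style for an initial port is that the compiler, not the port author, is responsible for picking the right instruction on each architecture.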
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 12:57 ` Jerin Jacob @ 2015-11-02 15:26 ` Hunt, David 2015-11-02 15:36 ` Jan Viktorin 0 siblings, 1 reply; 28+ messages in thread From: Hunt, David @ 2015-11-02 15:26 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev

On 02/11/2015 12:57, Jerin Jacob wrote:
> On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote:
>> Jerin,
>> I've just benchmarked the libc version against the hand-coded version of
>> the memcpy routines, and the libc wins in most cases. This code was just an
>> initial attempt at optimising the memccpy's, so I feel that with the current
>> benchmark results, it would better just to remove the assembly versions, and
>> use the libc version for the initial release on ARMv8.
>> Then, in the future, the ARMv8 experts are free to submit an optimised
>> version as a patch in the future. Does that sound reasonable to you?
>
> Make sense. Based on my understanding, other blocks are also not optimized
> for arm64.
> So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and
> libc for initial version.
>
> BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and
> "byteorder_autotest" is broken. I think existing arm64 code is not optimized
> beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified
> CONFIG_RTE_FORCE_INTRINSICS scheme.

Agreed.

> if you guys are OK with arm and arm64 as two different platform then
> I can summit the complete working patch for arm64.(as in my current source
> code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/)

Sure. That would be great. We initially started with two ARMv7 patch-sets, and Jan merged them into one. Something similar could happen for the ARMv8 patch set. We just want to end up with the best implementation possible. :)

Dave.
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h 2015-11-02 15:26 ` Hunt, David @ 2015-11-02 15:36 ` Jan Viktorin 2015-11-02 15:49 ` Hunt, David 0 siblings, 1 reply; 28+ messages in thread From: Jan Viktorin @ 2015-11-02 15:36 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Mon, 2 Nov 2015 15:26:19 +0000 "Hunt, David" <david.hunt@intel.com> wrote: > On 02/11/2015 12:57, Jerin Jacob wrote: > > On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote: > >> Jerin, > >> I've just benchmarked the libc version against the hand-coded version of > >> the memcpy routines, and the libc wins in most cases. This code was just an > >> initial attempt at optimising the memccpy's, so I feel that with the current > >> benchmark results, it would better just to remove the assembly versions, and > >> use the libc version for the initial release on ARMv8. > >> Then, in the future, the ARMv8 experts are free to submit an optimised > >> version as a patch in the future. Does that sound reasonable to you? > > > > Make sense. Based on my understanding, other blocks are also not optimized > > for arm64. > > So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and > > libc for initial version. > > > > BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and > > "byteorder_autotest" is broken. I think existing arm64 code is not optimized > > beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified > > CONFIG_RTE_FORCE_INTRINSICS scheme. > > Agreed. > > > if you guys are OK with arm and arm64 as two different platform then > > I can summit the complete working patch for arm64.(as in my current source > > code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/) > > Sure. That would be great. We initially started with two ARMv7 > patch-sets, and Jan merged into one. Something similar could happen for > the ARMv8 patch set. We just want to end up with the best implementation > possible. 
> :)
>

It was looking like we can share a lot of common code for both
architectures. I didn't know how different the cpuflags are.

IMHO, it'd be better to have two directories arm and arm64. I thought
to refer from arm64 to arm where possible. But I don't know whether
this is possible with the DPDK build system.

Jan

> Dave.
>
>
>

-- 
Jan Viktorin E-mail: Viktorin@RehiveTech.com
System Architect Web: www.RehiveTech.com
RehiveTech
Brno, Czech Republic

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 15:36 ` Jan Viktorin
@ 2015-11-02 15:49 ` Hunt, David
  2015-11-02 16:29 ` Jerin Jacob
  0 siblings, 1 reply; 28+ messages in thread
From: Hunt, David @ 2015-11-02 15:49 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

On 02/11/2015 15:36, Jan Viktorin wrote:
> On Mon, 2 Nov 2015 15:26:19 +0000
--snip--
> It was looking like we can share a lot of common code for both
> architectures. I didn't know how different the cpuflags are.

CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7
ones.

static const struct feature_entry cpu_feature_table[] = {
	FEAT_DEF(FP, 0x00000001, 0, REG_HWCAP, 0)
	FEAT_DEF(ASIMD, 0x00000001, 0, REG_HWCAP, 1)
	FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 2)
	FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP, 3)
	FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP, 4)
	FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP, 5)
	FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP, 6)
	FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP, 7)
	FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0)
	FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1)
};

> IMHO, it'd be better to have two directories arm and arm64. I thought
> to refer from arm64 to arm where possible. But I don't know whether
> this is possible with the DPDK build system.

I think both methodologies have their pros and cons. However, I'd lean
towards the common directory with the "filename_32/64.h" scheme, as that
is similar to the x86 methodology, and we don't need to tweak the include
paths to pull files from multiple directories.

Dave

^ permalink raw reply [flat|nested] 28+ messages in thread
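
For illustration, the REG_HWCAP rows of the table quoted above can be exercised with a small lookup helper. This is only a sketch: `struct hwcap_feature`, `armv8_hwcap_features` and `hwcap_has_feature` are hypothetical names that are not part of the patch, and the real FEAT_DEF macro carries more fields than the name and bit index kept here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a FEAT_DEF entry: a name plus a bit index. */
struct hwcap_feature {
	const char *name;
	unsigned int bit;
};

/* Bit positions copied from the REG_HWCAP rows of the quoted table. */
static const struct hwcap_feature armv8_hwcap_features[] = {
	{ "FP", 0 },    { "ASIMD", 1 }, { "EVTSTRM", 2 }, { "AES", 3 },
	{ "PMULL", 4 }, { "SHA1", 5 },  { "SHA2", 6 },    { "CRC32", 7 },
};

/* Returns 1 if the bit is set, 0 if clear, -1 if the name is unknown. */
static int
hwcap_has_feature(uint64_t hwcap, const char *name)
{
	size_t n = sizeof(armv8_hwcap_features) /
		sizeof(armv8_hwcap_features[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(name, armv8_hwcap_features[i].name) == 0)
			return (int)((hwcap >> armv8_hwcap_features[i].bit) & 1u);
	}
	return -1;
}
```

A caller would pass in the AT_HWCAP word obtained from the auxiliary vector, for example `hwcap_has_feature(hwcap, "CRC32")`.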
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 15:49 ` Hunt, David
@ 2015-11-02 16:29 ` Jerin Jacob
  2015-11-02 17:29 ` Jan Viktorin
  0 siblings, 1 reply; 28+ messages in thread
From: Jerin Jacob @ 2015-11-02 16:29 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, Nov 02, 2015 at 03:49:17PM +0000, Hunt, David wrote:
> On 02/11/2015 15:36, Jan Viktorin wrote:
> >On Mon, 2 Nov 2015 15:26:19 +0000
> --snip--
> >It was looking like we can share a lot of common code for both
> >architectures. I didn't know how different the cpuflags are.
>
> CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7
> ones.
>
> static const struct feature_entry cpu_feature_table[] = {
>	FEAT_DEF(FP, 0x00000001, 0, REG_HWCAP, 0)
>	FEAT_DEF(ASIMD, 0x00000001, 0, REG_HWCAP, 1)
>	FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 2)
>	FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP, 3)
>	FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP, 4)
>	FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP, 5)
>	FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP, 6)
>	FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP, 7)
>	FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0)
>	FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1)
> };
>
> >IMHO, it'd be better to have two directories arm and arm64. I thought
> >to refer from arm64 to arm where possible. But I don't know whether
> >this is possible with the DPDK build system.
>
> I think both methodologies have their pros and cons. However, I'd lean
> towards the common directory with the "filename_32/64.h" scheme, as that
> is similar to the x86 methodology, and we don't need to tweak the include
> paths to pull files from multiple directories.
>

I agree. Jan, could you please send the next version with
filename_32/64.h for atomic and cpuflags (i.e. for all header files).
I can re-base and send the complete arm64 patch based on your version.

Thanks,
Jerin

> Dave
>

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 16:29 ` Jerin Jacob
@ 2015-11-02 17:29 ` Jan Viktorin
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Viktorin @ 2015-11-02 17:29 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

On Mon, 2 Nov 2015 21:59:12 +0530
Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:

> On Mon, Nov 02, 2015 at 03:49:17PM +0000, Hunt, David wrote:
> > On 02/11/2015 15:36, Jan Viktorin wrote:
> > >On Mon, 2 Nov 2015 15:26:19 +0000
> > --snip--
> > >It was looking like we can share a lot of common code for both
> > >architectures. I didn't know how different the cpuflags are.
> >
> > CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7
> > ones.
> >
> > static const struct feature_entry cpu_feature_table[] = {
> >	FEAT_DEF(FP, 0x00000001, 0, REG_HWCAP, 0)
> >	FEAT_DEF(ASIMD, 0x00000001, 0, REG_HWCAP, 1)
> >	FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 2)
> >	FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP, 3)
> >	FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP, 4)
> >	FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP, 5)
> >	FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP, 6)
> >	FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP, 7)
> >	FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0)
> >	FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1)
> > };
> >
> > >IMHO, it'd be better to have two directories arm and arm64. I thought
> > >to refer from arm64 to arm where possible. But I don't know whether
> > >this is possible with the DPDK build system.
> >
> > I think both methodologies have their pros and cons. However, I'd lean
> > towards the common directory with the "filename_32/64.h" scheme, as that
> > is similar to the x86 methodology, and we don't need to tweak the include
> > paths to pull files from multiple directories.
> >
>
> I agree. Jan, could you please send the next version with
> filename_32/64.h for atomic and cpuflags (i.e. for all header files).
> I can re-base and send the complete arm64 patch based on your version.
>

I am working on it. However, after I removed the unnecessary intrinsics
code and set RTE_FORCE_INTRINSICS=y, it doesn't build... So I'm
figuring out what is wrong.

Jan

> Thanks,
> Jerin
>
> >
> > Dave
> >

-- 
Jan Viktorin E-mail: Viktorin@RehiveTech.com
System Architect Web: www.RehiveTech.com
RehiveTech
Brno, Czech Republic

^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt @ 2015-10-30 13:49 ` David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt ` (3 subsequent siblings) 5 siblings, 0 replies; 28+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- .../common/include/arch/arm/rte_prefetch.h | 4 ++ .../common/include/arch/arm/rte_prefetch_64.h | 61 ++++++++++++++++++++++ 2 files changed, 65 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h index 1f46697..aa37de5 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h @@ -33,6 +33,10 @@ #ifndef _RTE_PREFETCH_ARM_H_ #define _RTE_PREFETCH_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_prefetch_64.h> +#else #include <rte_prefetch_32.h> +#endif #endif /* _RTE_PREFETCH_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h new file mode 100644 index 0000000..b0d9170 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h @@ -0,0 +1,61 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2015 Intel Corporation. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_64_H_
+#define _RTE_PREFETCH_ARM_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+/* May want to add PSTL1KEEP instructions for prefetch for ownership. */
+static inline void rte_prefetch0(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL1KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL2KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL3KEEP, [%0]" : : "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_64_H_ */
-- 
1.9.1

^ permalink raw reply [flat|nested] 28+ messages in thread
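
As a rough illustration of how the three prefetch levels in the patch above might be used, the sketch below expresses them with the compiler builtin `__builtin_prefetch` rather than inline assembly, so it also builds on non-ARM hosts. The `sketch_*` names and the look-ahead distance of 16 elements are assumptions made for this example only. To the best of my knowledge GCC lowers locality 3, 2 and 1 to PLDL1KEEP, PLDL2KEEP and PLDL3KEEP on AArch64, but that mapping is a compiler detail and should be checked against the generated code.

```c
#include <stddef.h>
#include <stdint.h>

/* Read prefetch, keep in all cache levels (analogous to rte_prefetch0). */
static inline void
sketch_prefetch0(const volatile void *p)
{
	__builtin_prefetch((const void *)p, 0, 3);
}

/* Read prefetch aimed at L2 and beyond (analogous to rte_prefetch1). */
static inline void
sketch_prefetch1(const volatile void *p)
{
	__builtin_prefetch((const void *)p, 0, 2);
}

/* Read prefetch aimed at the last cache level (analogous to rte_prefetch2). */
static inline void
sketch_prefetch2(const volatile void *p)
{
	__builtin_prefetch((const void *)p, 0, 1);
}

/* Sums an array while prefetching a fixed distance ahead of the read. */
static uint64_t
sketch_sum_with_prefetch(const uint32_t *v, size_t n)
{
	const size_t ahead = 16; /* assumed look-ahead, not a tuned value */
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i + ahead < n)
			sketch_prefetch0(&v[i + ahead]);
		sum += v[i];
	}
	return sum;
}
```

A prefetch is only a hint, so the result of the loop is the same with or without it; only the timing may differ.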
* [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt @ 2015-10-30 13:49 ` David Hunt 2015-11-02 5:15 ` Jerin Jacob 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt ` (2 subsequent siblings) 5 siblings, 1 reply; 28+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- .../common/include/arch/arm/rte_cycles.h | 4 ++ .../common/include/arch/arm/rte_cycles_64.h | 77 ++++++++++++++++++++++ 2 files changed, 81 insertions(+) create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h index b2372fa..a8009a0 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h @@ -33,6 +33,10 @@ #ifndef _RTE_CYCLES_ARM_H_ #define _RTE_CYCLES_ARM_H_ +#ifdef RTE_ARCH_64 +#include <rte_cycles_64.h> +#else #include <rte_cycles_32.h> +#endif #endif /* _RTE_CYCLES_ARM_H_ */ diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h new file mode 100644 index 0000000..148b9f4 --- /dev/null +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h @@ -0,0 +1,77 @@ +/* + * BSD LICENSE + * + * Copyright (C) IBM Corporation 2014. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of IBM Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef _RTE_CYCLES_ARM64_H_ +#define _RTE_CYCLES_ARM64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_cycles.h" + +/** + * Read the time base register. + * + * @return + * The time base for this lcore. 
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	uint64_t tsc;
+
+	asm volatile("mrs %0, CNTVCT_EL0" : "=r" (tsc));
+
+#ifdef RTE_TIMER_MULTIPLIER
+	return tsc * RTE_TIMER_MULTIPLIER;
+#else
+	return tsc;
+#endif
+
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	asm volatile("isb sy" :::);
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM64_H_ */
-- 
1.9.1

^ permalink raw reply [flat|nested] 28+ messages in thread
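
The counter read in the patch above can be tried outside DPDK with a sketch along the following lines. It reads CNTVCT_EL0 on AArch64 and also queries CNTFRQ_EL0, the architected register that reports the counter frequency; on other hosts it falls back to the POSIX monotonic clock so that the example still compiles. The `sketch_*` names are hypothetical, and user-space access to both registers is assumed to be enabled by the kernel, which is not guaranteed on every system.

```c
/* Request POSIX declarations (clock_gettime) before any header is read. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdint.h>
#include <time.h>

/* Returns the current value of a free-running counter. */
static inline uint64_t
sketch_read_counter(void)
{
#if defined(__aarch64__)
	uint64_t ticks;

	/* The ISB orders the counter read after preceding instructions. */
	__asm__ __volatile__("isb\n\tmrs %0, cntvct_el0"
			     : "=r" (ticks) : : "memory");
	return ticks;
#else
	/* Fallback for non-ARM hosts: nanoseconds since an arbitrary epoch. */
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}

/* Returns the frequency, in Hz, of the counter read above. */
static inline uint64_t
sketch_counter_hz(void)
{
#if defined(__aarch64__)
	uint64_t hz;

	__asm__ __volatile__("mrs %0, cntfrq_el0" : "=r" (hz));
	return hz;
#else
	return 1000000000ULL; /* the fallback counts nanoseconds */
#endif
}
```

Dividing a tick difference by `sketch_counter_hz()` gives elapsed seconds without any build-time constant.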
* Re: [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt @ 2015-11-02 5:15 ` Jerin Jacob 0 siblings, 0 replies; 28+ messages in thread From: Jerin Jacob @ 2015-11-02 5:15 UTC (permalink / raw) To: David Hunt; +Cc: dev On Fri, Oct 30, 2015 at 01:49:16PM +0000, David Hunt wrote: > Signed-off-by: David Hunt <david.hunt@intel.com> > --- > .../common/include/arch/arm/rte_cycles.h | 4 ++ > .../common/include/arch/arm/rte_cycles_64.h | 77 ++++++++++++++++++++++ > 2 files changed, 81 insertions(+) > create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h > index b2372fa..a8009a0 100644 > --- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h > @@ -33,6 +33,10 @@ > #ifndef _RTE_CYCLES_ARM_H_ > #define _RTE_CYCLES_ARM_H_ > > +#ifdef RTE_ARCH_64 > +#include <rte_cycles_64.h> > +#else > #include <rte_cycles_32.h> > +#endif > > #endif /* _RTE_CYCLES_ARM_H_ */ > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h > new file mode 100644 > index 0000000..148b9f4 > --- /dev/null > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h > @@ -0,0 +1,77 @@ > +/* > + * BSD LICENSE > + * > + * Copyright (C) IBM Corporation 2014. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. 
> + * * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in > + * the documentation and/or other materials provided with the > + * distribution. > + * * Neither the name of IBM Corporation nor the names of its > + * contributors may be used to endorse or promote products derived > + * from this software without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > +*/ > + > +#ifndef _RTE_CYCLES_ARM64_H_ > +#define _RTE_CYCLES_ARM64_H_ > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +#include "generic/rte_cycles.h" > + > +/** > + * Read the time base register. > + * > + * @return > + * The time base for this lcore. > + */ > +static inline uint64_t > +rte_rdtsc(void) > +{ > + uint64_t tsc; > + > + asm volatile("mrs %0, CNTVCT_EL0" : "=r" (tsc)); > + > +#ifdef RTE_TIMER_MULTIPLIER > + return tsc * RTE_TIMER_MULTIPLIER; > +#else > + return tsc; > +#endif > + > +} > + > +static inline uint64_t > +rte_rdtsc_precise(void) > +{ > + asm volatile("isb sy" :::); IMO, it should be asm volatile("dmb ish" : : : "memory") to represent the data memory barrier(rte_mb()). 
> +	return rte_rdtsc();
> +}
> +
> +static inline uint64_t
> +rte_get_tsc_cycles(void) { return rte_rdtsc(); }
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_CYCLES_ARM64_H_ */
> -- 
> 1.9.1
>

^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
  ` (2 preceding siblings ...)
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt
  5 siblings, 0 replies; 28+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 7ce9d14..5c5fd6a 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -141,12 +141,16 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
 	__attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
 {
 	int auxv_fd;
+#ifdef RTE_ARCH_64
+	Elf64_auxv_t auxv;
+#else
 	Elf32_auxv_t auxv;
+#endif
 
 	auxv_fd = open("/proc/self/auxv", O_RDONLY);
 	assert(auxv_fd);
 	while (read(auxv_fd, &auxv,
-		sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+		sizeof(auxv)) == sizeof(auxv)) {
 		if (auxv.a_type == AT_HWCAP)
 			out[REG_HWCAP] = auxv.a_un.a_val;
 		else if (auxv.a_type == AT_HWCAP2)
-- 
1.9.1

^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt ` (3 preceding siblings ...) 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt @ 2015-10-30 13:49 ` David Hunt 2015-11-02 4:43 ` Jerin Jacob 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt 5 siblings, 1 reply; 28+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev The ARMv8 include files are in the arm directory in lib/librte_eal/common/include/arch/arm/ with the ARMv7 include files Signed-off-by: David Hunt <david.hunt@intel.com> --- MAINTAINERS | 3 +- config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++ doc/guides/rel_notes/release_2_2.rst | 7 ++-- mk/arch/arm64/rte.vars.mk | 58 ++++++++++++++++++++++++++++++ mk/machine/armv8a/rte.vars.mk | 57 +++++++++++++++++++++++++++++ 5 files changed, 177 insertions(+), 4 deletions(-) create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc create mode 100644 mk/arch/arm64/rte.vars.mk create mode 100644 mk/machine/armv8a/rte.vars.mk diff --git a/MAINTAINERS b/MAINTAINERS index a8933eb..4569f13 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -124,8 +124,9 @@ IBM POWER M: Chao Zhu <chaozhu@linux.vnet.ibm.com> F: lib/librte_eal/common/include/arch/ppc_64/ -ARM v7 +ARM M: Jan Viktorin <viktorin@rehivetech.com> +M: David Hunt <david.hunt@intel.com> F: lib/librte_eal/common/include/arch/arm/ Intel x86 diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc new file mode 100644 index 0000000..79a9533 --- /dev/null +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc @@ -0,0 +1,56 @@ +# BSD LICENSE +# +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# All rights reserved. 
+# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+# + +#include "common_linuxapp" + +CONFIG_RTE_MACHINE="armv8a" + +CONFIG_RTE_ARCH="arm64" +CONFIG_RTE_ARCH_ARM64=y +CONFIG_RTE_ARCH_64=y +CONFIG_RTE_ARCH_ARM_NEON=y + +CONFIG_RTE_TOOLCHAIN="gcc" +CONFIG_RTE_TOOLCHAIN_GCC=y + +CONFIG_RTE_IXGBE_INC_VECTOR=n +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n +CONFIG_RTE_LIBRTE_IVSHMEM=n +CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n + +CONFIG_RTE_LIBRTE_LPM=n +CONFIG_RTE_LIBRTE_ACL=n +CONFIG_RTE_LIBRTE_TABLE=n +CONFIG_RTE_LIBRTE_PIPELINE=n + +# This is used to adjust the generic arm timer to align with the cpu cycle count. +CONFIG_RTE_TIMER_MULTIPLIER=48 diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst index 5b5bb4c..5aa523b 100644 --- a/doc/guides/rel_notes/release_2_2.rst +++ b/doc/guides/rel_notes/release_2_2.rst @@ -31,10 +31,11 @@ New Features * **Added vhost-user multiple queue support.** -* **Introduce ARMv7 architecture** +* **Introduce ARMv7 and ARMv8 architectures** - It is now possible to build DPDK for the ARMv7 platform and test with - virtual PMD drivers. + * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms. + * ARMv7 can be tested with virtual PMD drivers. + * ARMv8 can be tested with virtual and physical PMD drivers. Resolved Issues diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk new file mode 100644 index 0000000..3aad712 --- /dev/null +++ b/mk/arch/arm64/rte.vars.mk @@ -0,0 +1,58 @@ +# BSD LICENSE +# +# Copyright(c) 2015 Intel Corporation. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. 
+# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +# +# arch: +# +# - define ARCH variable (overridden by cmdline or by previous +# optional define in machine .mk) +# - define CROSS variable (overridden by cmdline or previous define +# in machine .mk) +# - define CPU_CFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - define CPU_LDFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - define CPU_ASFLAGS variable (overridden by cmdline or previous +# define in machine .mk) +# - may override any previously defined variable +# +# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32 +# + +ARCH ?= arm64 +# common arch dir in eal headers +ARCH_DIR := arm +CROSS ?= + +CPU_CFLAGS ?= -DRTE_CACHE_LINE_SIZE=64 +CPU_LDFLAGS ?= +CPU_ASFLAGS ?= -felf + +export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk new file mode 100644 index 0000000..b785062 --- /dev/null +++ b/mk/machine/armv8a/rte.vars.mk @@ -0,0 +1,57 @@ +# BSD LICENSE +# +# Copyright(c) 2015 Intel Corporation. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. 
+# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# +# machine: +# +# - can define ARCH variable (overridden by cmdline value) +# - can define CROSS variable (overridden by cmdline value) +# - define MACHINE_CFLAGS variable (overridden by cmdline value) +# - define MACHINE_LDFLAGS variable (overridden by cmdline value) +# - define MACHINE_ASFLAGS variable (overridden by cmdline value) +# - can define CPU_CFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_LDFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - can define CPU_ASFLAGS variable (overridden by cmdline value) that +# overrides the one defined in arch. +# - may override any previously defined variable +# + +# ARCH = +# CROSS = +# MACHINE_CFLAGS = +# MACHINE_LDFLAGS = +# MACHINE_ASFLAGS = +# CPU_CFLAGS = +# CPU_LDFLAGS = +# CPU_ASFLAGS = + +MACHINE_CFLAGS += -march=armv8-a -- 1.9.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
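
The defconfig in this patch fixes CONFIG_RTE_TIMER_MULTIPLIER=48 at build time. One possible alternative, sketched below purely for illustration, is to compute the same scaling at run time from a CPU frequency and a timer frequency. The function name `sketch_ticks_to_cycles` is hypothetical, how the two frequencies would be discovered on a given platform is left open, and the 2.4 GHz and 50 MHz figures in the usage note are made-up values chosen only because their ratio is 48.

```c
#include <stdint.h>

/*
 * Converts generic-timer ticks to estimated CPU cycles.
 * The whole-second and sub-second parts are scaled separately so that
 * the intermediate product does not overflow 64 bits for realistic
 * frequencies.
 */
static uint64_t
sketch_ticks_to_cycles(uint64_t ticks, uint64_t cpu_hz, uint64_t timer_hz)
{
	if (timer_hz == 0)
		return ticks; /* no scaling information available */

	return (ticks / timer_hz) * cpu_hz +
	       (ticks % timer_hz) * cpu_hz / timer_hz;
}
```

With the illustrative values above, `sketch_ticks_to_cycles(1, 2400000000ULL, 50000000ULL)` yields 48, matching what the fixed multiplier would produce for a single tick.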
* Re: [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt @ 2015-11-02 4:43 ` Jerin Jacob 0 siblings, 0 replies; 28+ messages in thread From: Jerin Jacob @ 2015-11-02 4:43 UTC (permalink / raw) To: David Hunt; +Cc: dev On Fri, Oct 30, 2015 at 01:49:18PM +0000, David Hunt wrote: > The ARMv8 include files are in the arm directory in > lib/librte_eal/common/include/arch/arm/ with the ARMv7 include files > > Signed-off-by: David Hunt <david.hunt@intel.com> > --- > MAINTAINERS | 3 +- > config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++ > doc/guides/rel_notes/release_2_2.rst | 7 ++-- > mk/arch/arm64/rte.vars.mk | 58 ++++++++++++++++++++++++++++++ > mk/machine/armv8a/rte.vars.mk | 57 +++++++++++++++++++++++++++++ > 5 files changed, 177 insertions(+), 4 deletions(-) > create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc > create mode 100644 mk/arch/arm64/rte.vars.mk > create mode 100644 mk/machine/armv8a/rte.vars.mk > > diff --git a/MAINTAINERS b/MAINTAINERS > index a8933eb..4569f13 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -124,8 +124,9 @@ IBM POWER > M: Chao Zhu <chaozhu@linux.vnet.ibm.com> > F: lib/librte_eal/common/include/arch/ppc_64/ > > -ARM v7 > +ARM > M: Jan Viktorin <viktorin@rehivetech.com> > +M: David Hunt <david.hunt@intel.com> > F: lib/librte_eal/common/include/arch/arm/ > > Intel x86 > diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc > new file mode 100644 > index 0000000..79a9533 > --- /dev/null > +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc > @@ -0,0 +1,56 @@ > +# BSD LICENSE > +# > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. > +# All rights reserved. 
> +# > +# Redistribution and use in source and binary forms, with or without > +# modification, are permitted provided that the following conditions > +# are met: > +# > +# * Redistributions of source code must retain the above copyright > +# notice, this list of conditions and the following disclaimer. > +# * Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in > +# the documentation and/or other materials provided with the > +# distribution. > +# * Neither the name of Intel Corporation nor the names of its > +# contributors may be used to endorse or promote products derived > +# from this software without specific prior written permission. > +# > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
> +# > + > +#include "common_linuxapp" > + > +CONFIG_RTE_MACHINE="armv8a" > + > +CONFIG_RTE_ARCH="arm64" > +CONFIG_RTE_ARCH_ARM64=y > +CONFIG_RTE_ARCH_64=y > +CONFIG_RTE_ARCH_ARM_NEON=y > + > +CONFIG_RTE_TOOLCHAIN="gcc" > +CONFIG_RTE_TOOLCHAIN_GCC=y > + > +CONFIG_RTE_IXGBE_INC_VECTOR=n > +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n > +CONFIG_RTE_LIBRTE_IVSHMEM=n > +CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n > + > +CONFIG_RTE_LIBRTE_LPM=n > +CONFIG_RTE_LIBRTE_ACL=n > +CONFIG_RTE_LIBRTE_TABLE=n > +CONFIG_RTE_LIBRTE_PIPELINE=n > + > +# This is used to adjust the generic arm timer to align with the cpu cycle count. > +CONFIG_RTE_TIMER_MULTIPLIER=48

Introducing a build-time dependency on a CPU clock parameter is not a good idea. Either this parameter needs to be removed, or the multiplier should be found at run-time by introducing a machine-specific hook.

> diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst > index 5b5bb4c..5aa523b 100644 > --- a/doc/guides/rel_notes/release_2_2.rst > +++ b/doc/guides/rel_notes/release_2_2.rst > @@ -31,10 +31,11 @@ New Features > > * **Added vhost-user multiple queue support.** > > -* **Introduce ARMv7 architecture** > +* **Introduce ARMv7 and ARMv8 architectures** > > - It is now possible to build DPDK for the ARMv7 platform and test with > - virtual PMD drivers. > + * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms. > + * ARMv7 can be tested with virtual PMD drivers. > + * ARMv8 can be tested with virtual and physical PMD drivers. > > > Resolved Issues > diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk > new file mode 100644 > index 0000000..3aad712 > --- /dev/null > +++ b/mk/arch/arm64/rte.vars.mk > @@ -0,0 +1,58 @@ > +# BSD LICENSE > +# > +# Copyright(c) 2015 Intel Corporation. All rights reserved.
> +# > +# Redistribution and use in source and binary forms, with or without > +# modification, are permitted provided that the following conditions > +# are met: > +# > +# * Redistributions of source code must retain the above copyright > +# notice, this list of conditions and the following disclaimer. > +# * Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in > +# the documentation and/or other materials provided with the > +# distribution. > +# * Neither the name of Intel Corporation nor the names of its > +# contributors may be used to endorse or promote products derived > +# from this software without specific prior written permission. > +# > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
> + > +# > +# arch: > +# > +# - define ARCH variable (overridden by cmdline or by previous > +# optional define in machine .mk) > +# - define CROSS variable (overridden by cmdline or previous define > +# in machine .mk) > +# - define CPU_CFLAGS variable (overridden by cmdline or previous > +# define in machine .mk) > +# - define CPU_LDFLAGS variable (overridden by cmdline or previous > +# define in machine .mk) > +# - define CPU_ASFLAGS variable (overridden by cmdline or previous > +# define in machine .mk) > +# - may override any previously defined variable > +# > +# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32 > +# > + > +ARCH ?= arm64 > +# common arch dir in eal headers > +ARCH_DIR := arm > +CROSS ?= > + > +CPU_CFLAGS ?= -DRTE_CACHE_LINE_SIZE=64

The cache line size can be moved to MACHINE_CFLAGS, as it is more of a machine parameter. That way, a machine with a different cache line size (based on arm64) can have a new target like defconfig_arm64-xxxxxxx-linuxapp-gcc.

> +CPU_LDFLAGS ?= > +CPU_ASFLAGS ?= -felf > + > +export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS > diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk > new file mode 100644 > index 0000000..b785062 > --- /dev/null > +++ b/mk/machine/armv8a/rte.vars.mk > @@ -0,0 +1,57 @@ > +# BSD LICENSE > +# > +# Copyright(c) 2015 Intel Corporation. All rights reserved. > +# > +# Redistribution and use in source and binary forms, with or without > +# modification, are permitted provided that the following conditions > +# are met: > +# > +# * Redistributions of source code must retain the above copyright > +# notice, this list of conditions and the following disclaimer. > +# * Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in > +# the documentation and/or other materials provided with the > +# distribution.
> +# * Neither the name of Intel Corporation nor the names of its > +# contributors may be used to endorse or promote products derived > +# from this software without specific prior written permission. > +# > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > + > +# > +# machine: > +# > +# - can define ARCH variable (overridden by cmdline value) > +# - can define CROSS variable (overridden by cmdline value) > +# - define MACHINE_CFLAGS variable (overridden by cmdline value) > +# - define MACHINE_LDFLAGS variable (overridden by cmdline value) > +# - define MACHINE_ASFLAGS variable (overridden by cmdline value) > +# - can define CPU_CFLAGS variable (overridden by cmdline value) that > +# overrides the one defined in arch. > +# - can define CPU_LDFLAGS variable (overridden by cmdline value) that > +# overrides the one defined in arch. > +# - can define CPU_ASFLAGS variable (overridden by cmdline value) that > +# overrides the one defined in arch. 
> +# - may override any previously defined variable > +# > + > +# ARCH = > +# CROSS = > +# MACHINE_CFLAGS = > +# MACHINE_LDFLAGS = > +# MACHINE_ASFLAGS = > +# CPU_CFLAGS = > +# CPU_LDFLAGS = > +# CPU_ASFLAGS = > + > +MACHINE_CFLAGS += -march=armv8-a > -- > 1.9.1 > ^ permalink raw reply [flat|nested] 28+ messages in thread
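A minimal sketch of the run-time alternative suggested in the review above, offered only as an illustration. It assumes the ARMv8 generic timer frequency can be read from CNTFRQ_EL0 in user space, which depends on the kernel allowing EL0 access to the counter registers. The helper names generic_timer_hz() and timer_multiplier() are hypothetical and not part of the patch set, and the CPU frequency would still have to come from a machine-specific hook that is not shown here.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch set: read the frequency of the
 * ARMv8 generic timer. Assumes the kernel permits EL0 reads of CNTFRQ_EL0;
 * returns 0 when built for any other architecture.
 */
static inline uint64_t
generic_timer_hz(void)
{
#if defined(__aarch64__)
	uint64_t hz;

	__asm__ __volatile__("mrs %0, cntfrq_el0" : "=r" (hz));
	return hz;
#else
	return 0;
#endif
}

/*
 * Hypothetical helper: scaling factor between the generic timer and the CPU
 * clock, rounded to the nearest integer. An unknown timer rate (0) yields 1,
 * which means "do not scale".
 */
static inline uint64_t
timer_multiplier(uint64_t cpu_hz, uint64_t timer_hz)
{
	if (timer_hz == 0)
		return 1;
	return (cpu_hz + timer_hz / 2) / timer_hz;
}
```

With a 2.4 GHz core clock and a 50 MHz timer (illustrative figures, not taken from the thread), timer_multiplier() returns 48, so a value like the one hard-coded in the defconfig could in principle be derived at start-up instead of at build time.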
* [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt ` (4 preceding siblings ...) 2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt @ 2015-10-30 13:49 ` David Hunt 5 siblings, 0 replies; 28+ messages in thread From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw) To: dev Signed-off-by: David Hunt <david.hunt@intel.com> --- app/test/test_cpuflags.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c index 557458f..1689048 100644 --- a/app/test/test_cpuflags.c +++ b/app/test/test_cpuflags.c @@ -1,4 +1,4 @@ -/*- +/* * BSD LICENSE * * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. @@ -115,9 +115,18 @@ test_cpuflags(void) CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP); #endif -#if defined(RTE_ARCH_ARM) +#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64) + printf("Checking for Floating Point:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_FPA); + printf("Check for NEON:\t\t"); CHECK_FOR_FLAG(RTE_CPUFLAG_NEON); + + printf("Checking for ARM32 mode:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH32); + + printf("Checking for ARM64 mode:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH64); #endif #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) -- 1.9.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 [not found] ` <5633798B.2050708@intel.com> @ 2015-10-30 16:11 ` Jan Viktorin 2015-10-30 16:16 ` Thomas Monjalon 2015-10-30 16:28 ` Hunt, David 0 siblings, 2 replies; 28+ messages in thread From: Jan Viktorin @ 2015-10-30 16:11 UTC (permalink / raw) To: Hunt, David; +Cc: dev Hmm, I see. It's good to fix this in the generated e-mails between format-patch and send-email calls. I always review those to be sure they meet my expectations ;). Anyway, it is not clear, what has changed in the v3. Just the rte_cycles? You should explain that at least in the 0000 patch. Better to keep some history in each single commit (are there any rules in dpdk for this? Just look how they do in kernel). I'll test the patchset in qemu anyway... so will probably send tested-by. I've put this conversation to mailing list as I cannot see any reason why it is not CC'd there... Jan Viktorin RehiveTech Sent from a mobile device Original message From: Hunt, David Sent: Friday, 30 October 2015 15:07 To: Jan Viktorin Subject: Fwd: [PATCH v3 6/6] test: add checks for cpu flags on armv8 Jan, I had gone to the trouble of adding a "Reviewed-by" line in all the commit messages for each patch in the patch set, as well as addressing the comment about the armv8 files being in the arm dir. However, the 'git format-patch' seems to have stripped out the "Reviewed-by" line for some reason. If you are happy with the latest patch set, could you reply and maybe say something like "series Reviewed-by..."? Thanks for your help in this. Regards, Dave.
-------- Forwarded Message -------- Subject: [PATCH v3 6/6] test: add checks for cpu flags on armv8 Date: Fri, 30 Oct 2015 13:47:06 +0000 From: David Hunt <david.hunt@intel.com> To: david.hunt@intel.com Signed-off-by: David Hunt <david.hunt@intel.com> --- app/test/test_cpuflags.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c index 557458f..1689048 100644 --- a/app/test/test_cpuflags.c +++ b/app/test/test_cpuflags.c @@ -1,4 +1,4 @@ -/*- +/* * BSD LICENSE * * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. @@ -115,9 +115,18 @@ test_cpuflags(void) CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP); #endif -#if defined(RTE_ARCH_ARM) +#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64) + printf("Checking for Floating Point:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_FPA); + printf("Check for NEON:\t\t"); CHECK_FOR_FLAG(RTE_CPUFLAG_NEON); + + printf("Checking for ARM32 mode:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH32); + + printf("Checking for ARM64 mode:\t\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH64); #endif #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) -- 1.9.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 2015-10-30 16:11 ` Jan Viktorin @ 2015-10-30 16:16 ` Thomas Monjalon 2015-10-30 16:28 ` Hunt, David 1 sibling, 0 replies; 28+ messages in thread From: Thomas Monjalon @ 2015-10-30 16:16 UTC (permalink / raw) To: Jan Viktorin; +Cc: dev 2015-10-30 17:11, Jan Viktorin: > Anyway, it is not clear, what has changed in the v3. Just the rte_cycles? > You should explain that at least in the 0000 patch. > Better to keep some history in each single commit (are there any rules in > dpdk for this? Just look how they do in kernel). The rule is to help reviewers ;) History in the cover letter is good. If there is also some history in each patch, it's better. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 2015-10-30 16:11 ` Jan Viktorin 2015-10-30 16:16 ` Thomas Monjalon @ 2015-10-30 16:28 ` Hunt, David 2015-11-02 6:32 ` Jerin Jacob 1 sibling, 1 reply; 28+ messages in thread From: Hunt, David @ 2015-10-30 16:28 UTC (permalink / raw) To: Jan Viktorin; +Cc: dev On 30/10/2015 16:11, Jan Viktorin wrote: > Hmm, I see. It's good to fix this in the generated e-mails between format-patch > and send-email calls. I always review those to be sure they meet my expectations ;). > Anyway, it is not clear, what has changed in the v3. Just the rte_cycles? > You should explain that at least in the 0000 patch. Better to keep some history > in each single commit (are there any rules in dpdk for this? Just look how they do in kernel). --snip-- Sure, I'll keep that in mind for the next time. A list of changes for each revision, and also changes in each patch in the patch set. As Thomas says - whatever helps the reviewer :) For the moment there probably isn't a need to release a new patch set for these comments, so I'll just list them here: 1. v3 has just the additional comment in one of the patches to say that the armv8 header files are in the 'arm' include directory. 2. The rte_cycles is unchanged, the CONFIG_ is not needed. If there is a need to post another patch set I'll include the change notes. Otherwise do we all think that the patch is there (or there abouts)? :) Regards, Dave. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 2015-10-30 16:28 ` Hunt, David @ 2015-11-02 6:32 ` Jerin Jacob 2015-11-02 10:47 ` Hunt, David 0 siblings, 1 reply; 28+ messages in thread From: Jerin Jacob @ 2015-11-02 6:32 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Fri, Oct 30, 2015 at 04:28:25PM +0000, Hunt, David wrote: > On 30/10/2015 16:11, Jan Viktorin wrote: > >Hmm, I see. It's good to fix this in the generated e-mails between format-patch > > and send-email calls. I always review those to be sure they meet my > expectations ;). > >Anyway, it is not clear, what has changed in the v3. Just the rte_cycles? > >You should explain that at least in the 0000 patch. Better to keep some history > >in each single commit (are there any rules in dpdk for this? Just look how they do in kernel). > --snip-- > > Sure, I'll keep that in mind for the next time. A list of changes for each > revision, and also changes in each patch in the patch set. As Thomas says - > whatever helps the reviewer :) > > For the moment there probably isn't a need to release a new patch set for > these comments, so I'll just list them here: > 1. v3 has just the additional comment in one of the patches to say that the > armv8 header files are in the 'arm' include directory. > 2. The rte_cycles is unchanged, the CONFIG_ is not needed. > > If there is a need to post another patch set I'll include the change notes. > Otherwise do we all think that the patch is there (or there abouts)? :) Hi Jan and Dave, I have reviewed your patches for arm[64] support. Please check the review comments. Cavium would like to contribute on armv8 port and remaining libraries (ACL, LPM, HASH) implementation for armv8. Currently i am re-basing our ACL,HASH libraries implementation based on existing patches. Happy to work with you guys to have full fledged armv8 support for DPDK. 
Jerin

One other query, on rte_cpu_get_flag_enabled() for armv8: I have tried to run the existing patches on an armv8 ThunderX platform, but the application fails at startup due to a mismatch in the rte_cpu_get_flag_enabled() encoding. On my platform, rte_cpu_get_flag_enabled() works based on AT_HWCAP with the values in [1], which are different from those in the existing lib/librte_eal/common/include/arch/arm/rte_cpuflags.h [1] http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h In order to debug this, could you provide the following values from the tested armv8 platform? It looks like it is running in 32-bit compatible mode in your environment.

root@arm64:/export/dpdk-arm64# LD_SHOW_AUXV=1 sleep 1000 AT_SYSINFO_EHDR: 0x3ff859f0000 AT_??? (0x26): 0x430f0a10 AT_HWCAP: fb AT_PAGESZ: 65536 AT_CLKTCK: 100 AT_PHDR: 0x400040 AT_PHENT: 56 AT_PHNUM: 7 AT_BASE: 0x3ff85a00000 AT_FLAGS: 0x0 AT_ENTRY: 0x401900 AT_UID: 0 AT_EUID: 0 AT_GID: 0 AT_EGID: 0 AT_SECURE: 0 AT_RANDOM: 0x3ffef1c7988 AT_EXECFN: /bin/sleep AT_PLATFORM: aarch64 root@arm64:/export/dpdk-arm64# zcat /proc/config.gz | grep CONFIG_COMPAT # CONFIG_COMPAT_BRK is not set CONFIG_COMPAT_BINFMT_ELF=y CONFIG_COMPAT=y CONFIG_COMPAT_NETLINK_MESSAGES=y root@arm64:/export/dpdk-arm64# cat /proc/cpuinfo Processor : AArch64 Processor rev 0 (aarch64) processor : 0 processor : 1 processor : 2 processor : 3 processor : 4 processor : 5 processor : 6 processor : 7 processor : 8 processor : 9 processor : 10 processor : 11 processor : 12 processor : 13 processor : 14 processor : 15 processor : 16 processor : 17 processor : 18 processor : 19 processor : 20 processor : 21 processor : 22 processor : 23 processor : 24 processor : 25 processor : 26 processor : 27 processor : 28 processor : 29 processor : 30 processor : 31 processor : 32 processor : 33 processor : 34 processor : 35 processor : 36 processor : 37 processor : 38 processor : 39 processor : 40 processor : 41 processor : 42 processor : 43 processor : 44 processor : 45 processor : 46 processor : 47 Features : fp
asimd aes pmull sha1 sha2 crc32 CPU implementer : 0x43 CPU architecture: AArch64 CPU variant : 0x0 CPU part : 0x0a1 CPU revision : 0 > > Regards, > Dave. > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 2015-11-02 6:32 ` Jerin Jacob @ 2015-11-02 10:47 ` Hunt, David 2015-11-02 13:17 ` Jerin Jacob 2015-11-02 15:24 ` Jan Viktorin 0 siblings, 2 replies; 28+ messages in thread From: Hunt, David @ 2015-11-02 10:47 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev On 02/11/2015 06:32, Jerin Jacob wrote: > On Fri, Oct 30, 2015 at 04:28:25PM +0000, Hunt, David wrote: --snip-- > > Hi Jan and Dave, > > I have reviewed your patches for arm[64] support. Please check the > review comments. Hi Jerin, I'm looking at the comments now, and working on getting the suggested changes merged into the patch-set. > Cavium would like to contribute on armv8 port and remaining libraries > (ACL, LPM, HASH) implementation for armv8. Currently i am re-basing > our ACL,HASH libraries implementation based on existing patches. > Happy to work with you guys to have full fledged armv8 support for DPDK. > > Jerin Thanks for that, it's good news indeed. > other query on rte_cpu_get_flag_enabled for armv8, > I have tried to run the existing patches on armv8-thunderX platform. > But there application start failure due to mismatch in > rte_cpu_get_flag_enabled() encoding. > > In my platform rte_cpu_get_flag_enabled() works based on > AT_HWCAP with following values[1] which different from > existing lib/librte_eal/common/include/arch/arm/rte_cpuflags.h > > [1]http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h > > In order to debug this, Could provide the following > values in tested armv8 platform. Look like its running 32bit compatible > mode in your environment I'm using a Gigabyte MP30AR0 motherboard with an 8-core X-Gene, Running a 4.3.0-rc6 kernel. Here's the information on the cpu_flags issue you requested: > AT_SYSINFO_EHDR: 0x3ff859f0000 > AT_??? 
(0x26): 0x430f0a10 > AT_HWCAP: fb > AT_PAGESZ: 65536 > AT_CLKTCK: 100 > AT_PHDR: 0x400040 > AT_PHENT: 56 > AT_PHNUM: 7 > AT_BASE: 0x3ff85a00000 > AT_FLAGS: 0x0 > AT_ENTRY: 0x401900 > AT_UID: 0 > AT_EUID: 0 > AT_GID: 0 > AT_EGID: 0 > AT_SECURE: 0 > AT_RANDOM: 0x3ffef1c7988 > AT_EXECFN: /bin/sleep > AT_PLATFORM: aarch64 root@mp30ar0:~# LD_SHOW_AUXV=1 sleep 1000 AT_SYSINFO_EHDR: 0x7f7956d000 AT_HWCAP: 7 AT_PAGESZ: 4096 AT_CLKTCK: 100 AT_PHDR: 0x400040 AT_PHENT: 56 AT_PHNUM: 7 AT_BASE: 0x7f79543000 AT_FLAGS: 0x0 AT_ENTRY: 0x401900 AT_UID: 0 AT_EUID: 0 AT_GID: 0 AT_EGID: 0 AT_SECURE: 0 AT_RANDOM: 0x7ffcaf2e48 AT_EXECFN: /bin/sleep AT_PLATFORM: aarch64 > root@arm64:/export/dpdk-arm64# zcat /proc/config.gz | grep CONFIG_COMPAT > # CONFIG_COMPAT_BRK is not set > CONFIG_COMPAT_BINFMT_ELF=y > CONFIG_COMPAT=y > CONFIG_COMPAT_NETLINK_MESSAGES=y root@mp30ar0:~# zcat /proc/config.gz | grep CONFIG_COMPAT # CONFIG_COMPAT_BRK is not set CONFIG_COMPAT_OLD_SIGACTION=y CONFIG_COMPAT_BINFMT_ELF=y CONFIG_COMPAT=y > root@arm64:/export/dpdk-arm64# cat /proc/cpuinfo > Processor : AArch64 Processor rev 0 (aarch64) > processor : 0 > processor : 1 --snip-- > processor : 46 > processor : 47 > Features : fp asimd aes pmull sha1 sha2 crc32 > CPU implementer : 0x43 > CPU architecture: AArch64 > CPU variant : 0x0 > CPU part : 0x0a1 > CPU revision : 0 root@mp30ar0:~# cat /proc/cpuinfo processor : 0 Features : fp asimd evtstrm CPU implementer : 0x50 CPU architecture: 8 CPU variant : 0x0 CPU part : 0x000 CPU revision : 1 processor : 1 Features : fp asimd evtstrm CPU implementer : 0x50 CPU architecture: 8 CPU variant : 0x0 CPU part : 0x000 CPU revision : 1 processor : 2 Features : fp asimd evtstrm CPU implementer : 0x50 CPU architecture: 8 CPU variant : 0x0 CPU part : 0x000 CPU revision : 1 processor : 3 Features : fp asimd evtstrm CPU implementer : 0x50 CPU architecture: 8 CPU variant : 0x0 CPU part : 0x000 CPU revision : 1 processor : 4 Features : fp asimd evtstrm CPU implementer : 0x50 CPU 
architecture: 8 CPU variant : 0x0 CPU part : 0x000 CPU revision : 1 processor : 5 Features : fp asimd evtstrm CPU implementer : 0x50 CPU architecture: 8 CPU variant : 0x0 CPU part : 0x000 CPU revision : 1 processor : 6 Features : fp asimd evtstrm CPU implementer : 0x50 CPU architecture: 8 CPU variant : 0x0 CPU part : 0x000 CPU revision : 1 processor : 7 Features : fp asimd evtstrm CPU implementer : 0x50 CPU architecture: 8 CPU variant : 0x0 CPU part : 0x000 CPU revision : 1 root@mp30ar0:~# Hope this helps. Regards, Dave. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 2015-11-02 10:47 ` Hunt, David @ 2015-11-02 13:17 ` Jerin Jacob 2015-11-02 15:04 ` Hunt, David 2015-11-02 15:24 ` Jan Viktorin 1 sibling, 1 reply; 28+ messages in thread From: Jerin Jacob @ 2015-11-02 13:17 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Mon, Nov 02, 2015 at 10:47:53AM +0000, Hunt, David wrote: > On 02/11/2015 06:32, Jerin Jacob wrote: > >On Fri, Oct 30, 2015 at 04:28:25PM +0000, Hunt, David wrote: > > --snip-- > > > > >Hi Jan and Dave, > > > >I have reviewed your patches for arm[64] support. Please check the > >review comments. > > Hi Jerin, > > I'm looking at the comments now, and working on getting the suggested > changes merged into the patch-set. > > >Cavium would like to contribute on armv8 port and remaining libraries > >(ACL, LPM, HASH) implementation for armv8. Currently i am re-basing > >our ACL,HASH libraries implementation based on existing patches. > >Happy to work with you guys to have full fledged armv8 support for DPDK. > > > >Jerin > > Thanks for that, it's good news indeed. > > >other query on rte_cpu_get_flag_enabled for armv8, > >I have tried to run the existing patches on armv8-thunderX platform. > >But there application start failure due to mismatch in > >rte_cpu_get_flag_enabled() encoding. > > > >In my platform rte_cpu_get_flag_enabled() works based on > >AT_HWCAP with following values[1] which different from > >existing lib/librte_eal/common/include/arch/arm/rte_cpuflags.h > > > >[1]http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h > > > >In order to debug this, Could provide the following > >values in tested armv8 platform. Look like its running 32bit compatible > >mode in your environment > > I'm using a Gigabyte MP30AR0 motherboard with an 8-core X-Gene, Running a > 4.3.0-rc6 kernel. > Here's the information on the cpu_flags issue you requested: > > >AT_SYSINFO_EHDR: 0x3ff859f0000 > >AT_??? 
(0x26): 0x430f0a10 > >AT_HWCAP: fb > >AT_PAGESZ: 65536 > >AT_CLKTCK: 100 > >AT_PHDR: 0x400040 > >AT_PHENT: 56 > >AT_PHNUM: 7 > >AT_BASE: 0x3ff85a00000 > >AT_FLAGS: 0x0 > >AT_ENTRY: 0x401900 > >AT_UID: 0 > >AT_EUID: 0 > >AT_GID: 0 > >AT_EGID: 0 > >AT_SECURE: 0 > >AT_RANDOM: 0x3ffef1c7988 > >AT_EXECFN: /bin/sleep > >AT_PLATFORM: aarch64 > > root@mp30ar0:~# LD_SHOW_AUXV=1 sleep 1000 > AT_SYSINFO_EHDR: 0x7f7956d000 > AT_HWCAP: 7 > AT_PAGESZ: 4096 > AT_CLKTCK: 100 > AT_PHDR: 0x400040 > AT_PHENT: 56 > AT_PHNUM: 7 > AT_BASE: 0x7f79543000 > AT_FLAGS: 0x0 > AT_ENTRY: 0x401900 > AT_UID: 0 > AT_EUID: 0 > AT_GID: 0 > AT_EGID: 0 > AT_SECURE: 0 > AT_RANDOM: 0x7ffcaf2e48 > AT_EXECFN: /bin/sleep > AT_PLATFORM: aarch64 >

If I am not wrong, the existing rte_cpu_get_flag_enabled() implementation should also be broken on your platform for arm64, as I can see only AT_HWCAP and not AT_HWCAP2, and AT_HWCAP is 0x7. That means your platform also follows http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h whereas the implementation is

FEAT_DEF(SWP, 0x00000001, 0, REG_HWCAP, 0) // not correct for arm64 FEAT_DEF(HALF, 0x00000001, 0, REG_HWCAP, 1) // not correct for arm64 FEAT_DEF(THUMB, 0x00000001, 0, REG_HWCAP, 2) // not correct for arm64 FEAT_DEF(A26BIT, 0x00000001, 0, REG_HWCAP, 3) FEAT_DEF(FAST_MULT, 0x00000001, 0, REG_HWCAP, 4) FEAT_DEF(FPA, 0x00000001, 0, REG_HWCAP, 5) FEAT_DEF(VFP, 0x00000001, 0, REG_HWCAP, 6) FEAT_DEF(EDSP, 0x00000001, 0, REG_HWCAP, 7) FEAT_DEF(JAVA, 0x00000001, 0, REG_HWCAP, 8) FEAT_DEF(IWMMXT, 0x00000001, 0, REG_HWCAP, 9) FEAT_DEF(CRUNCH, 0x00000001, 0, REG_HWCAP, 10) FEAT_DEF(THUMBEE, 0x00000001, 0, REG_HWCAP, 11) FEAT_DEF(NEON, 0x00000001, 0, REG_HWCAP, 12) FEAT_DEF(VFPv3, 0x00000001, 0, REG_HWCAP, 13) FEAT_DEF(VFPv3D16, 0x00000001, 0, REG_HWCAP, 14) FEAT_DEF(TLS, 0x00000001, 0, REG_HWCAP, 15) FEAT_DEF(VFPv4, 0x00000001, 0, REG_HWCAP, 16) FEAT_DEF(IDIVA, 0x00000001, 0, REG_HWCAP, 17) FEAT_DEF(IDIVT, 0x00000001, 0, REG_HWCAP, 18) FEAT_DEF(VFPD32, 0x00000001,
0, REG_HWCAP, 19) FEAT_DEF(LPAE, 0x00000001, 0, REG_HWCAP, 20) FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 21) FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP2, 0) FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP2, 1) FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP2, 2) FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP2, 3) FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4) FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0) FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1) Am I missing something ? > >root@arm64:/export/dpdk-arm64# zcat /proc/config.gz | grep CONFIG_COMPAT > ># CONFIG_COMPAT_BRK is not set > >CONFIG_COMPAT_BINFMT_ELF=y > >CONFIG_COMPAT=y > >CONFIG_COMPAT_NETLINK_MESSAGES=y > > root@mp30ar0:~# zcat /proc/config.gz | grep CONFIG_COMPAT > # CONFIG_COMPAT_BRK is not set > CONFIG_COMPAT_OLD_SIGACTION=y > CONFIG_COMPAT_BINFMT_ELF=y > CONFIG_COMPAT=y > > > >root@arm64:/export/dpdk-arm64# cat /proc/cpuinfo > >Processor : AArch64 Processor rev 0 (aarch64) > >processor : 0 > >processor : 1 > --snip-- > >processor : 46 > >processor : 47 > >Features : fp asimd aes pmull sha1 sha2 crc32 > >CPU implementer : 0x43 > >CPU architecture: AArch64 > >CPU variant : 0x0 > >CPU part : 0x0a1 > >CPU revision : 0 > > root@mp30ar0:~# cat /proc/cpuinfo > processor : 0 > Features : fp asimd evtstrm > CPU implementer : 0x50 > CPU architecture: 8 > CPU variant : 0x0 > CPU part : 0x000 > CPU revision : 1 > > processor : 1 > Features : fp asimd evtstrm > CPU implementer : 0x50 > CPU architecture: 8 > CPU variant : 0x0 > CPU part : 0x000 > CPU revision : 1 > > processor : 2 > Features : fp asimd evtstrm > CPU implementer : 0x50 > CPU architecture: 8 > CPU variant : 0x0 > CPU part : 0x000 > CPU revision : 1 > > processor : 3 > Features : fp asimd evtstrm > CPU implementer : 0x50 > CPU architecture: 8 > CPU variant : 0x0 > CPU part : 0x000 > CPU revision : 1 > > processor : 4 > Features : fp asimd evtstrm > CPU implementer : 0x50 > CPU architecture: 8 > CPU variant : 0x0 > CPU part : 0x000 > CPU revision : 1 > > processor : 
5 > Features : fp asimd evtstrm > CPU implementer : 0x50 > CPU architecture: 8 > CPU variant : 0x0 > CPU part : 0x000 > CPU revision : 1 > > processor : 6 > Features : fp asimd evtstrm > CPU implementer : 0x50 > CPU architecture: 8 > CPU variant : 0x0 > CPU part : 0x000 > CPU revision : 1 > > processor : 7 > Features : fp asimd evtstrm > CPU implementer : 0x50 > CPU architecture: 8 > CPU variant : 0x0 > CPU part : 0x000 > CPU revision : 1 > > root@mp30ar0:~# > > Hope this helps. > > Regards, > Dave. > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 2015-11-02 13:17 ` Jerin Jacob @ 2015-11-02 15:04 ` Hunt, David 2015-11-02 15:13 ` Jan Viktorin 0 siblings, 1 reply; 28+ messages in thread From: Hunt, David @ 2015-11-02 15:04 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev On 02/11/2015 13:17, Jerin Jacob wrote: --snip-- > If am not wrong existing rte_cpu_get_flag_enabled() implementation > should be broken in your platform also for arm64. as I could see only AT_HWCAP > not AT_HWCAP2 and AT_HWCAP is 0x7 that means your platform also > follows > > http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h > > and the implmentation is > > FEAT_DEF(SWP, 0x00000001, 0, REG_HWCAP, 0) // not correct for arm64 > FEAT_DEF(HALF, 0x00000001, 0, REG_HWCAP, 1) // not correct for arm64 > FEAT_DEF(THUMB, 0x00000001, 0, REG_HWCAP, 2) // not correct for arm64 > FEAT_DEF(A26BIT, 0x00000001, 0, REG_HWCAP, 3) --snip-- > FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4) > FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0) > FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1) > > Am I missing something ? You are correct. I need to re-visit this. In merging the ARMv7 and ARMv8 code, I should have split the hardware capability flags into 32-bit and 64-bit versions. I'll do that in the next patch. Thanks, Dave. ^ permalink raw reply [flat|nested] 28+ messages in thread
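To illustrate the mismatch discussed above, here is a minimal sketch of an arm64-specific decoding of AT_HWCAP. It assumes the bit positions from the arm64 uapi hwcap.h linked in the thread (fp at bit 0 through crc32 at bit 7). The table arm64_hwcaps and the function arm64_hwcap_describe() are hypothetical names used only for illustration; this is not the layout that a later revision of the patch set adopts.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed arm64 AT_HWCAP layout, per the uapi hwcap.h referenced above. */
struct hwcap_name {
	unsigned long bit;
	const char *name;
};

static const struct hwcap_name arm64_hwcaps[] = {
	{ 1UL << 0, "fp" },
	{ 1UL << 1, "asimd" },
	{ 1UL << 2, "evtstrm" },
	{ 1UL << 3, "aes" },
	{ 1UL << 4, "pmull" },
	{ 1UL << 5, "sha1" },
	{ 1UL << 6, "sha2" },
	{ 1UL << 7, "crc32" },
};

/*
 * Hypothetical helper: write the space-separated names of the features set
 * in hwcap into buf (at most len bytes, always NUL-terminated when len > 0)
 * and return buf. Output stops early if buf is too small.
 */
static char *
arm64_hwcap_describe(unsigned long hwcap, char *buf, size_t len)
{
	size_t i, used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(arm64_hwcaps) / sizeof(arm64_hwcaps[0]); i++) {
		int n;

		if (!(hwcap & arm64_hwcaps[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", arm64_hwcaps[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break; /* out of space: keep what fits */
		used += (size_t)n;
	}
	return buf;
}
```

Decoding the two AT_HWCAP values reported in the thread with this table gives "fp asimd aes pmull sha1 sha2 crc32" for 0xfb and "fp asimd evtstrm" for 0x7, which agrees with the Features lines of the two /proc/cpuinfo dumps. On Linux with glibc, the value itself can be obtained with getauxval(AT_HWCAP) from <sys/auxv.h>.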
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 2015-11-02 15:04 ` Hunt, David @ 2015-11-02 15:13 ` Jan Viktorin 2015-11-02 15:20 ` Hunt, David 0 siblings, 1 reply; 28+ messages in thread From: Jan Viktorin @ 2015-11-02 15:13 UTC (permalink / raw) To: Hunt, David; +Cc: dev On Mon, 2 Nov 2015 15:04:14 +0000 "Hunt, David" <david.hunt@intel.com> wrote: > On 02/11/2015 13:17, Jerin Jacob wrote: > -snip-- > > If am not wrong existing rte_cpu_get_flag_enabled() implementation > > should be broken in your platform also for arm64. as I could see only AT_HWCAP > > not AT_HWCAP2 and AT_HWCAP is 0x7 that means your platform also > > follows > > > > http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h > > > > and the implmentation is > > > > FEAT_DEF(SWP, 0x00000001, 0, REG_HWCAP, 0) // not correct for arm64 > > FEAT_DEF(HALF, 0x00000001, 0, REG_HWCAP, 1) // not correct for arm64 > > FEAT_DEF(THUMB, 0x00000001, 0, REG_HWCAP, 2) // not correct for arm64 > > FEAT_DEF(A26BIT, 0x00000001, 0, REG_HWCAP, 3) > --snip-- > > FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4) > > FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0) > > FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1) > > > > Am I missing something ? > > You are correct. I need to re-visit this. In merging the ARMv7 and > ARVv8, I should have split the hardware capabilities flags into 32-but > and 64-bit versions. I'll do that in the next patch. > Thanks, > Dave. Should I split the rte_atomic.h and rte_cpuflags.h then? Jan > > > > > -- Jan Viktorin E-mail: Viktorin@RehiveTech.com System Architect Web: www.RehiveTech.com RehiveTech Brno, Czech Republic ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
From: Hunt, David @ 2015-11-02 15:20 UTC
To: Jan Viktorin; +Cc: dev

On 02/11/2015 15:13, Jan Viktorin wrote:
> On Mon, 2 Nov 2015 15:04:14 +0000
> "Hunt, David" <david.hunt@intel.com> wrote:
>
>> On 02/11/2015 13:17, Jerin Jacob wrote:
>> --snip--
>>> If I am not wrong, the existing rte_cpu_get_flag_enabled() implementation
>>> should be broken on your platform for arm64 as well, as I could see only AT_HWCAP,
>>> not AT_HWCAP2, and AT_HWCAP is 0x7. That means your platform also
>>> follows
>>>
>>> http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h
>>>
>>> and the implementation is
>>>
>>> FEAT_DEF(SWP, 0x00000001, 0, REG_HWCAP, 0) // not correct for arm64
>>> FEAT_DEF(HALF, 0x00000001, 0, REG_HWCAP, 1) // not correct for arm64
>>> FEAT_DEF(THUMB, 0x00000001, 0, REG_HWCAP, 2) // not correct for arm64
>>> FEAT_DEF(A26BIT, 0x00000001, 0, REG_HWCAP, 3)
>> --snip--
>>> FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4)
>>> FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0)
>>> FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1)
>>>
>>> Am I missing something?
>>
>> You are correct. I need to revisit this. In merging the ARMv7 and
>> ARMv8 code, I should have split the hardware capability flags into 32-bit
>> and 64-bit versions. I'll do that in the next patch.
>> Thanks,
>> Dave.
>
> Should I split rte_atomic.h and rte_cpuflags.h then?
>
> Jan

It looks like we're headed in that direction, so yes, I think that would
be a good idea.

Dave
* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
From: Jan Viktorin @ 2015-11-02 15:24 UTC
To: Hunt, David; +Cc: dev

On Mon, 2 Nov 2015 10:47:53 +0000
"Hunt, David" <david.hunt@intel.com> wrote:

> On 02/11/2015 06:32, Jerin Jacob wrote:
> > On Fri, Oct 30, 2015 at 04:28:25PM +0000, Hunt, David wrote:
> > --snip--
> > >
> > Hi Jan and Dave,
> >
> > I have reviewed your patches for arm[64] support. Please check the
> > review comments.
> --snip--
> > In order to debug this, could you provide the following
> > values on the tested armv8 platform? It looks like it's running in 32-bit compatible
> > mode in your environment.
>
> I'm using a Gigabyte MP30AR0 motherboard with an 8-core X-Gene, running
> a 4.3.0-rc6 kernel.
> Here's the information on the cpu_flags issue you requested:
> --snip--
>
> root@mp30ar0:~#
>
> Hope this helps.
>
> Regards,
> Dave.
>

Here are a few data points from ARMv7 for comparison. AT_PLATFORM reports
v7l there (and not aarch32); this probably needs to be fixed...
Altera SoC FPGA:

# LD_SHOW_AUXV=1 sleep 1
AT_HWCAP:    swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls
AT_PAGESZ:   4096
AT_CLKTCK:   100
AT_PHDR:     0x10034
AT_PHENT:    32
AT_PHNUM:    8
AT_BASE:     0x76fd3000
AT_FLAGS:    0x0
AT_ENTRY:    0x149d9
AT_UID:      0
AT_EUID:     0
AT_GID:      0
AT_EGID:     0
AT_SECURE:   0
AT_RANDOM:   0x7ebbcf2f
AT_EXECFN:   /bin/sleep
AT_PLATFORM: v7l

# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 0 (v7l)
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0xc09
CPU revision    : 0

processor       : 1
model name      : ARMv7 Processor rev 0 (v7l)
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0xc09
CPU revision    : 0

Hardware        : Altera SOCFPGA
Revision        : 0000
Serial          : 0000000000000000

Odroid XU4:

# LD_SHOW_AUXV=1 sleep 1
AT_HWCAP:    swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
AT_PAGESZ:   4096
AT_CLKTCK:   100
AT_PHDR:     0x10034
AT_PHENT:    32
AT_PHNUM:    9
AT_BASE:     0xb6f8c000
AT_FLAGS:    0x0
AT_ENTRY:    0x11191
AT_UID:      1000
AT_EUID:     1000
AT_GID:      1000
AT_EGID:     1000
AT_SECURE:   0
AT_RANDOM:   0xbec42ed6
AT_EXECFN:   /bin/sleep
AT_PLATFORM: v7l

# cat /proc/cpuinfo
Processor       : ARMv7 Processor rev 1 (v7l)
processor       : 0
BogoMIPS        : 3.07
processor       : 1
BogoMIPS        : 3.07
processor       : 2
BogoMIPS        : 3.07
processor       : 3
BogoMIPS        : 3.07
Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc05
CPU revision    : 1

Hardware        : ODROIDC
Revision        : 000a
Serial          : 1b00000000000000

--
Jan Viktorin         E-mail: Viktorin@RehiveTech.com
System Architect     Web: www.RehiveTech.com
RehiveTech
Brno, Czech Republic
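The values that LD_SHOW_AUXV prints can also be read from inside a program. The following is a minimal sketch, assuming Linux with a C library that provides getauxval() in <sys/auxv.h> (glibc 2.16 or later). The helper names `platform_is_armv7` and `print_auxv_summary` are hypothetical and are not DPDK functions; the "v7" prefix check is based only on the `v7l` strings in the dumps above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/auxv.h>

/* Return 1 when an AT_PLATFORM string looks like ARMv7 ("v7l" above). */
static int
platform_is_armv7(const char *platform)
{
	return platform != NULL && strncmp(platform, "v7", 2) == 0;
}

/* Print the auxiliary-vector entries that the cpu flags discussion uses. */
void
print_auxv_summary(void)
{
	/* AT_PLATFORM holds a pointer to a string; getauxval() returns 0
	 * when the kernel did not provide the entry. */
	const char *platform = (const char *)getauxval(AT_PLATFORM);

	printf("AT_HWCAP:    0x%lx\n", getauxval(AT_HWCAP));
#ifdef AT_HWCAP2
	printf("AT_HWCAP2:   0x%lx\n", getauxval(AT_HWCAP2));
#endif
	printf("AT_PLATFORM: %s\n", platform ? platform : "(not provided)");
	printf("looks like ARMv7: %s\n",
	       platform_is_armv7(platform) ? "yes" : "no");
}
```

Unlike the LD_SHOW_AUXV output, this prints AT_HWCAP as a raw bit mask, so the bits still have to be decoded with the table for the architecture the program is running on.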
Thread overview: 28+ messages

2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
2015-11-02 4:57 ` Jerin Jacob
2015-11-02 12:22 ` Hunt, David
2015-11-02 12:45 ` Jan Viktorin
2015-11-02 12:57 ` Jerin Jacob
2015-11-02 15:26 ` Hunt, David
2015-11-02 15:36 ` Jan Viktorin
2015-11-02 15:49 ` Hunt, David
2015-11-02 16:29 ` Jerin Jacob
2015-11-02 17:29 ` Jan Viktorin
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
2015-11-02 5:15 ` Jerin Jacob
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
2015-11-02 4:43 ` Jerin Jacob
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt
[not found] <1446212826-19425-7-git-send-email-david.hunt@intel.com>
[not found] ` <5633798B.2050708@intel.com>
2015-10-30 16:11 ` Jan Viktorin
2015-10-30 16:16 ` Thomas Monjalon
2015-10-30 16:28 ` Hunt, David
2015-11-02 6:32 ` Jerin Jacob
2015-11-02 10:47 ` Hunt, David
2015-11-02 13:17 ` Jerin Jacob
2015-11-02 15:04 ` Hunt, David
2015-11-02 15:13 ` Jan Viktorin
2015-11-02 15:20 ` Hunt, David
2015-11-02 15:24 ` Jan Viktorin