DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support
@ 2015-10-30 13:49 David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
                   ` (5 more replies)
  0 siblings, 6 replies; 28+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

This is th v3 patchset for ARMv8 that now sits on top of the v5 patch
of the ARMv7 code by RehiveTech. It adds code into the same arm include
directory, reducing code duplication.

Tested on an XGene 64-bit arm server board, with PCI slots. Passes traffic
between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should
work with many other NICS, but these are as yet untested.

Compiles igb_uio, kni and all the physical device PMDs.

An entry has been added to the Release notes.

We hope that this will encourage the ARM community to contribute PMDs
for their SoCs to DPDK.

For now, we've added some Intel engineers to the MAINTAINERS file. We would
like to encourage the ARM community to take over maintenance of this area
in future, and to further improve it.

Notes on arm64 kernel configuration:

  Using Ubuntu 14.04 LTS with a 4.3.0-rc6 kernel (with modified PCI drivers),
  and uio_pci_generic.
  ARM64 kernels do not seem to have functional resource mapping of PCI memory
  (PCI_MMAP), so the pci driver needs to be patched to enable this. The
  symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is
  missing the resource0...N files for mmapping the device memory. Earlier
  kernels (3.13.x) had these files present, but mmap'ping resulted in a
  "Bus Error" when the NIC memory was accessed.
  However, during limited testing with a modified 4.3.0-rc6 kernel, we were
  able to mmap the NIC memory, and pass traffic between the two ports on a
  82599 NIC connected via fibre cable.
  We have no plans to upstream a kernel patch for this and hope that
  someone more familiar with the arm architecture can create a proper patch
  and enable this functionality.

Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>

David Hunt (6):
  eal/arm: add 64-bit armv8 version of rte_memcpy.h
  eal/arm: add 64-bit armv8 version of rte_prefetch.h
  eal/arm: add 64-bit armv8 version of rte_cycles.h
  eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h
  mk: add support for armv8 on top of armv7
  test: add checks for cpu flags on armv8

 MAINTAINERS                                        |   3 +-
 app/test/test_cpuflags.c                           |  13 +-
 config/defconfig_arm64-armv8a-linuxapp-gcc         |  56 ++++
 doc/guides/rel_notes/release_2_2.rst               |   7 +-
 .../common/include/arch/arm/rte_cpuflags.h         |   6 +-
 .../common/include/arch/arm/rte_cycles.h           |   4 +
 .../common/include/arch/arm/rte_cycles_64.h        |  77 ++++++
 .../common/include/arch/arm/rte_memcpy.h           |   4 +
 .../common/include/arch/arm/rte_memcpy_64.h        | 308 +++++++++++++++++++++
 .../common/include/arch/arm/rte_prefetch.h         |   4 +
 .../common/include/arch/arm/rte_prefetch_64.h      |  61 ++++
 mk/arch/arm64/rte.vars.mk                          |  58 ++++
 mk/machine/armv8a/rte.vars.mk                      |  57 ++++
 13 files changed, 651 insertions(+), 7 deletions(-)
 create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
 create mode 100644 mk/arch/arm64/rte.vars.mk
 create mode 100644 mk/machine/armv8a/rte.vars.mk

-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-11-02  4:57   ` Jerin Jacob
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_memcpy.h           |   4 +
 .../common/include/arch/arm/rte_memcpy_64.h        | 308 +++++++++++++++++++++
 2 files changed, 312 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index d9f5bf1..1d562c3 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_MEMCPY_ARM_H_
 #define _RTE_MEMCPY_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_memcpy_64.h>
+#else
 #include <rte_memcpy_32.h>
+#endif
 
 #endif /* _RTE_MEMCPY_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
new file mode 100644
index 0000000..6d85113
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
@@ -0,0 +1,308 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_MEMCPY_ARM_64_H_
+#define _RTE_MEMCPY_ARM_64_H_
+
+#include <stdint.h>
+#include <string.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+#ifdef __ARM_NEON_FP
+
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP d0, d1, [%0]\n\t"
+		     "STP d0, d1, [%1]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     "LDP d0, d1, [%0 , #32]\n\t"
+		     "STP d0, d1, [%1 , #32]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     "LDP q0, q1, [%0 , #32]\n\t"
+		     "STP q0, q1, [%1 , #32]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     "LDP q0, q1, [%0 , #32]\n\t"
+		     "STP q0, q1, [%1 , #32]\n\t"
+		     "LDP q0, q1, [%0 , #64]\n\t"
+		     "STP q0, q1, [%1 , #64]\n\t"
+		     "LDP q0, q1, [%0 , #96]\n\t"
+		     "STP q0, q1, [%1 , #96]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     "LDP q0, q1, [%0 , #32]\n\t"
+		     "STP q0, q1, [%1 , #32]\n\t"
+		     "LDP q0, q1, [%0 , #64]\n\t"
+		     "STP q0, q1, [%1 , #64]\n\t"
+		     "LDP q0, q1, [%0 , #96]\n\t"
+		     "STP q0, q1, [%1 , #96]\n\t"
+		     "LDP q0, q1, [%0 , #128]\n\t"
+		     "STP q0, q1, [%1 , #128]\n\t"
+		     "LDP q0, q1, [%0 , #160]\n\t"
+		     "STP q0, q1, [%1 , #160]\n\t"
+		     "LDP q0, q1, [%0 , #192]\n\t"
+		     "STP q0, q1, [%1 , #192]\n\t"
+		     "LDP q0, q1, [%0 , #224]\n\t"
+		     "STP q0, q1, [%1 , #224]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+#define rte_memcpy(dst, src, n)              \
+	({ (__builtin_constant_p(n)) ?       \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)); })
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08)
+			*(uint64_t *)dst = *(const uint64_t *)src;
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n,
+			(const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n,
+			(const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		break;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		break;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0)
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+	return ret;
+}
+
+#else
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 256);
+}
+
+static inline void *
+rte_memcpy(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+#endif /* __ARM_NEON_FP */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_64_H_ */
-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 28+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_prefetch.h         |  4 ++
 .../common/include/arch/arm/rte_prefetch_64.h      | 61 ++++++++++++++++++++++
 2 files changed, 65 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
index 1f46697..aa37de5 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_PREFETCH_ARM_H_
 #define _RTE_PREFETCH_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_prefetch_64.h>
+#else
 #include <rte_prefetch_32.h>
+#endif
 
 #endif /* _RTE_PREFETCH_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
new file mode 100644
index 0000000..b0d9170
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
@@ -0,0 +1,61 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_64_H_
+#define _RTE_PREFETCH_ARM_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+/* May want to add PSTL1KEEP instructions for prefetch for ownership. */
+static inline void rte_prefetch0(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL1KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL2KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL3KEEP, [%0]" : : "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_64_H_ */
-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-11-02  5:15   ` Jerin Jacob
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_cycles.h           |  4 ++
 .../common/include/arch/arm/rte_cycles_64.h        | 77 ++++++++++++++++++++++
 2 files changed, 81 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index b2372fa..a8009a0 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_CYCLES_ARM_H_
 #define _RTE_CYCLES_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_cycles_64.h>
+#else
 #include <rte_cycles_32.h>
+#endif
 
 #endif /* _RTE_CYCLES_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
new file mode 100644
index 0000000..148b9f4
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
@@ -0,0 +1,77 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_CYCLES_ARM64_H_
+#define _RTE_CYCLES_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ *   The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	uint64_t tsc;
+
+	asm volatile("mrs %0, CNTVCT_EL0" : "=r" (tsc));
+
+#ifdef RTE_TIMER_MULTIPLIER
+	return tsc * RTE_TIMER_MULTIPLIER;
+#else
+	return tsc;
+#endif
+
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	asm volatile("isb sy" :::);
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM64_H_ */
-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
                   ` (2 preceding siblings ...)
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt
  5 siblings, 0 replies; 28+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 7ce9d14..5c5fd6a 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -141,12 +141,16 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
 	__attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
 {
 	int auxv_fd;
+#ifdef RTE_ARCH_64
+	Elf64_auxv_t auxv;
+#else
 	Elf32_auxv_t auxv;
+#endif
 
 	auxv_fd = open("/proc/self/auxv", O_RDONLY);
 	assert(auxv_fd);
 	while (read(auxv_fd, &auxv,
-		sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+		sizeof(auxv)) == sizeof(auxv)) {
 		if (auxv.a_type == AT_HWCAP)
 			out[REG_HWCAP] = auxv.a_un.a_val;
 		else if (auxv.a_type == AT_HWCAP2)
-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
                   ` (3 preceding siblings ...)
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-11-02  4:43   ` Jerin Jacob
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt
  5 siblings, 1 reply; 28+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

The ARMv8 include files are in the arm directory in
lib/librte_eal/common/include/arch/arm/ with the ARMv7 include files

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 MAINTAINERS                                |  3 +-
 config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++
 doc/guides/rel_notes/release_2_2.rst       |  7 ++--
 mk/arch/arm64/rte.vars.mk                  | 58 ++++++++++++++++++++++++++++++
 mk/machine/armv8a/rte.vars.mk              | 57 +++++++++++++++++++++++++++++
 5 files changed, 177 insertions(+), 4 deletions(-)
 create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
 create mode 100644 mk/arch/arm64/rte.vars.mk
 create mode 100644 mk/machine/armv8a/rte.vars.mk

diff --git a/MAINTAINERS b/MAINTAINERS
index a8933eb..4569f13 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -124,8 +124,9 @@ IBM POWER
 M: Chao Zhu <chaozhu@linux.vnet.ibm.com>
 F: lib/librte_eal/common/include/arch/ppc_64/
 
-ARM v7
+ARM
 M: Jan Viktorin <viktorin@rehivetech.com>
+M: David Hunt <david.hunt@intel.com>
 F: lib/librte_eal/common/include/arch/arm/
 
 Intel x86
diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc
new file mode 100644
index 0000000..79a9533
--- /dev/null
+++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv8a"
+
+CONFIG_RTE_ARCH="arm64"
+CONFIG_RTE_ARCH_ARM64=y
+CONFIG_RTE_ARCH_64=y
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+CONFIG_RTE_IXGBE_INC_VECTOR=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_IVSHMEM=n
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
+
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_TABLE=n
+CONFIG_RTE_LIBRTE_PIPELINE=n
+
+# This is used to adjust the generic arm timer to align with the cpu cycle count.
+CONFIG_RTE_TIMER_MULTIPLIER=48
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index 5b5bb4c..5aa523b 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -31,10 +31,11 @@ New Features
 
 * **Added vhost-user multiple queue support.**
 
-* **Introduce ARMv7 architecture**
+* **Introduce ARMv7 and ARMv8 architectures**
 
-  It is now possible to build DPDK for the ARMv7 platform and test with
-  virtual PMD drivers.
+  * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms.
+  * ARMv7 can be tested with virtual PMD drivers.
+  * ARMv8 can be tested with virtual and physical PMD drivers.
 
 
 Resolved Issues
diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk
new file mode 100644
index 0000000..3aad712
--- /dev/null
+++ b/mk/arch/arm64/rte.vars.mk
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2015 Intel Corporation. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# arch:
+#
+#   - define ARCH variable (overridden by cmdline or by previous
+#     optional define in machine .mk)
+#   - define CROSS variable (overridden by cmdline or previous define
+#     in machine .mk)
+#   - define CPU_CFLAGS variable (overridden by cmdline or previous
+#     define in machine .mk)
+#   - define CPU_LDFLAGS variable (overridden by cmdline or previous
+#     define in machine .mk)
+#   - define CPU_ASFLAGS variable (overridden by cmdline or previous
+#     define in machine .mk)
+#   - may override any previously defined variable
+#
+# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32
+#
+
+ARCH  ?= arm64
+# common arch dir in eal headers
+ARCH_DIR := arm
+CROSS ?=
+
+CPU_CFLAGS  ?= -DRTE_CACHE_LINE_SIZE=64
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
new file mode 100644
index 0000000..b785062
--- /dev/null
+++ b/mk/machine/armv8a/rte.vars.mk
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2015 Intel Corporation. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+MACHINE_CFLAGS += -march=armv8-a
-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
                   ` (4 preceding siblings ...)
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
@ 2015-10-30 13:49 ` David Hunt
  5 siblings, 0 replies; 28+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_cpuflags.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 557458f..1689048 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -1,4 +1,4 @@
-/*-
+/*
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
@@ -115,9 +115,18 @@ test_cpuflags(void)
 	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
 #endif
 
-#if defined(RTE_ARCH_ARM)
+#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64)
+	printf("Checking for Floating Point:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_FPA);
+
 	printf("Check for NEON:\t\t");
 	CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+
+	printf("Checking for ARM32 mode:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH32);
+
+	printf("Checking for ARM64 mode:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH64);
 #endif
 
 #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
@ 2015-11-02  4:43   ` Jerin Jacob
  0 siblings, 0 replies; 28+ messages in thread
From: Jerin Jacob @ 2015-11-02  4:43 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

On Fri, Oct 30, 2015 at 01:49:18PM +0000, David Hunt wrote:
> The ARMv8 include files are in the arm directory in
> lib/librte_eal/common/include/arch/arm/ with the ARMv7 include files
> 
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  MAINTAINERS                                |  3 +-
>  config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++
>  doc/guides/rel_notes/release_2_2.rst       |  7 ++--
>  mk/arch/arm64/rte.vars.mk                  | 58 ++++++++++++++++++++++++++++++
>  mk/machine/armv8a/rte.vars.mk              | 57 +++++++++++++++++++++++++++++
>  5 files changed, 177 insertions(+), 4 deletions(-)
>  create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
>  create mode 100644 mk/arch/arm64/rte.vars.mk
>  create mode 100644 mk/machine/armv8a/rte.vars.mk
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a8933eb..4569f13 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -124,8 +124,9 @@ IBM POWER
>  M: Chao Zhu <chaozhu@linux.vnet.ibm.com>
>  F: lib/librte_eal/common/include/arch/ppc_64/
>  
> -ARM v7
> +ARM
>  M: Jan Viktorin <viktorin@rehivetech.com>
> +M: David Hunt <david.hunt@intel.com>
>  F: lib/librte_eal/common/include/arch/arm/
>  
>  Intel x86
> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc
> new file mode 100644
> index 0000000..79a9533
> --- /dev/null
> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
> @@ -0,0 +1,56 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +#
> +
> +#include "common_linuxapp"
> +
> +CONFIG_RTE_MACHINE="armv8a"
> +
> +CONFIG_RTE_ARCH="arm64"
> +CONFIG_RTE_ARCH_ARM64=y
> +CONFIG_RTE_ARCH_64=y
> +CONFIG_RTE_ARCH_ARM_NEON=y
> +
> +CONFIG_RTE_TOOLCHAIN="gcc"
> +CONFIG_RTE_TOOLCHAIN_GCC=y
> +
> +CONFIG_RTE_IXGBE_INC_VECTOR=n
> +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
> +CONFIG_RTE_LIBRTE_IVSHMEM=n
> +CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
> +
> +CONFIG_RTE_LIBRTE_LPM=n
> +CONFIG_RTE_LIBRTE_ACL=n
> +CONFIG_RTE_LIBRTE_TABLE=n
> +CONFIG_RTE_LIBRTE_PIPELINE=n
> +
> +# This is used to adjust the generic arm timer to align with the cpu cycle count.
> +CONFIG_RTE_TIMER_MULTIPLIER=48

Introducing a build-time dependency with cpu clock parameter not a good
idea. Either this parameter needs be removed or find out out the
multiplier at run-time by introducing a machine specific hook


> diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
> index 5b5bb4c..5aa523b 100644
> --- a/doc/guides/rel_notes/release_2_2.rst
> +++ b/doc/guides/rel_notes/release_2_2.rst
> @@ -31,10 +31,11 @@ New Features
>  
>  * **Added vhost-user multiple queue support.**
>  
> -* **Introduce ARMv7 architecture**
> +* **Introduce ARMv7 and ARMv8 architectures**
>  
> -  It is now possible to build DPDK for the ARMv7 platform and test with
> -  virtual PMD drivers.
> +  * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms.
> +  * ARMv7 can be tested with virtual PMD drivers.
> +  * ARMv8 can be tested with virtual and physical PMD drivers.
>  
>  
>  Resolved Issues
> diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk
> new file mode 100644
> index 0000000..3aad712
> --- /dev/null
> +++ b/mk/arch/arm64/rte.vars.mk
> @@ -0,0 +1,58 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2015 Intel Corporation. All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +#
> +# arch:
> +#
> +#   - define ARCH variable (overridden by cmdline or by previous
> +#     optional define in machine .mk)
> +#   - define CROSS variable (overridden by cmdline or previous define
> +#     in machine .mk)
> +#   - define CPU_CFLAGS variable (overridden by cmdline or previous
> +#     define in machine .mk)
> +#   - define CPU_LDFLAGS variable (overridden by cmdline or previous
> +#     define in machine .mk)
> +#   - define CPU_ASFLAGS variable (overridden by cmdline or previous
> +#     define in machine .mk)
> +#   - may override any previously defined variable
> +#
> +# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32
> +#
> +
> +ARCH  ?= arm64
> +# common arch dir in eal headers
> +ARCH_DIR := arm
> +CROSS ?=
> +
> +CPU_CFLAGS  ?= -DRTE_CACHE_LINE_SIZE=64

cache line size can be moved to MACHINE_CFLAGS as its more of machine
parameter.so that if machine has different cache line size(based on
arm64) can have new target like  defconfig_arm64-xxxxxxx-linuxapp-gcc

> +CPU_LDFLAGS ?=
> +CPU_ASFLAGS ?= -felf
> +
> +export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
> diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
> new file mode 100644
> index 0000000..b785062
> --- /dev/null
> +++ b/mk/machine/armv8a/rte.vars.mk
> @@ -0,0 +1,57 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2015 Intel Corporation. All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +#
> +# machine:
> +#
> +#   - can define ARCH variable (overridden by cmdline value)
> +#   - can define CROSS variable (overridden by cmdline value)
> +#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
> +#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
> +#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
> +#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
> +#     overrides the one defined in arch.
> +#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
> +#     overrides the one defined in arch.
> +#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
> +#     overrides the one defined in arch.
> +#   - may override any previously defined variable
> +#
> +
> +# ARCH =
> +# CROSS =
> +# MACHINE_CFLAGS =
> +# MACHINE_LDFLAGS =
> +# MACHINE_ASFLAGS =
> +# CPU_CFLAGS =
> +# CPU_LDFLAGS =
> +# CPU_ASFLAGS =
> +
> +MACHINE_CFLAGS += -march=armv8-a
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
@ 2015-11-02  4:57   ` Jerin Jacob
  2015-11-02 12:22     ` Hunt, David
  0 siblings, 1 reply; 28+ messages in thread
From: Jerin Jacob @ 2015-11-02  4:57 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  .../common/include/arch/arm/rte_memcpy.h           |   4 +
>  .../common/include/arch/arm/rte_memcpy_64.h        | 308 +++++++++++++++++++++
>  2 files changed, 312 insertions(+)
>  create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> 
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
> index d9f5bf1..1d562c3 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
> @@ -33,6 +33,10 @@
>  #ifndef _RTE_MEMCPY_ARM_H_
>  #define _RTE_MEMCPY_ARM_H_
>  
> +#ifdef RTE_ARCH_64
> +#include <rte_memcpy_64.h>
> +#else
>  #include <rte_memcpy_32.h>
> +#endif
>  
>  #endif /* _RTE_MEMCPY_ARM_H_ */
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> new file mode 100644
> index 0000000..6d85113
> --- /dev/null
> +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> @@ -0,0 +1,308 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IBM Corporation 2014.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of IBM Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +*/
> +
> +#ifndef _RTE_MEMCPY_ARM_64_H_
> +#define _RTE_MEMCPY_ARM_64_H_
> +
> +#include <stdint.h>
> +#include <string.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include "generic/rte_memcpy.h"
> +
> +#ifdef __ARM_NEON_FP

SIMD is not optional in armv8 spec.So every armv8 machine will have
SIMD instruction unlike armv7.More over LDP/STP instruction is
not part of SIMD.So this check is not required or it can
be replaced with a check that select memcpy from either libc or this specific
implementation

> +
> +/* ARM NEON Intrinsics are used to copy data */
> +#include <arm_neon.h>
> +
> +static inline void
> +rte_mov16(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP d0, d1, [%0]\n\t"
> +		     "STP d0, d1, [%1]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}

IMO, no need to hardcode registers used for the mem move(d0, d1).
Let compiler schedule the registers for better performance.


> +
> +static inline void
> +rte_mov32(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +static inline void
> +rte_mov48(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     "LDP d0, d1, [%0 , #32]\n\t"
> +		     "STP d0, d1, [%1 , #32]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +static inline void
> +rte_mov64(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     "LDP q0, q1, [%0 , #32]\n\t"
> +		     "STP q0, q1, [%1 , #32]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +static inline void
> +rte_mov128(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     "LDP q0, q1, [%0 , #32]\n\t"
> +		     "STP q0, q1, [%1 , #32]\n\t"
> +		     "LDP q0, q1, [%0 , #64]\n\t"
> +		     "STP q0, q1, [%1 , #64]\n\t"
> +		     "LDP q0, q1, [%0 , #96]\n\t"
> +		     "STP q0, q1, [%1 , #96]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +static inline void
> +rte_mov256(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     "LDP q0, q1, [%0 , #32]\n\t"
> +		     "STP q0, q1, [%1 , #32]\n\t"
> +		     "LDP q0, q1, [%0 , #64]\n\t"
> +		     "STP q0, q1, [%1 , #64]\n\t"
> +		     "LDP q0, q1, [%0 , #96]\n\t"
> +		     "STP q0, q1, [%1 , #96]\n\t"
> +		     "LDP q0, q1, [%0 , #128]\n\t"
> +		     "STP q0, q1, [%1 , #128]\n\t"
> +		     "LDP q0, q1, [%0 , #160]\n\t"
> +		     "STP q0, q1, [%1 , #160]\n\t"
> +		     "LDP q0, q1, [%0 , #192]\n\t"
> +		     "STP q0, q1, [%1 , #192]\n\t"
> +		     "LDP q0, q1, [%0 , #224]\n\t"
> +		     "STP q0, q1, [%1 , #224]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +#define rte_memcpy(dst, src, n)              \
> +	({ (__builtin_constant_p(n)) ?       \
> +	memcpy((dst), (src), (n)) :          \
> +	rte_memcpy_func((dst), (src), (n)); })
> +
> +static inline void *
> +rte_memcpy_func(void *dst, const void *src, size_t n)
> +{
> +	void *ret = dst;
> +
> +	/* We can't copy < 16 bytes using XMM registers so do it manually. */
> +	if (n < 16) {
> +		if (n & 0x01) {
> +			*(uint8_t *)dst = *(const uint8_t *)src;
> +			dst = (uint8_t *)dst + 1;
> +			src = (const uint8_t *)src + 1;
> +		}
> +		if (n & 0x02) {
> +			*(uint16_t *)dst = *(const uint16_t *)src;
> +			dst = (uint16_t *)dst + 1;
> +			src = (const uint16_t *)src + 1;
> +		}
> +		if (n & 0x04) {
> +			*(uint32_t *)dst = *(const uint32_t *)src;
> +			dst = (uint32_t *)dst + 1;
> +			src = (const uint32_t *)src + 1;
> +		}
> +		if (n & 0x08)
> +			*(uint64_t *)dst = *(const uint64_t *)src;
> +		return ret;
> +	}
> +
> +	/* Special fast cases for <= 128 bytes */
> +	if (n <= 32) {
> +		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +		rte_mov16((uint8_t *)dst - 16 + n,
> +			(const uint8_t *)src - 16 + n);
> +		return ret;
> +	}
> +
> +	if (n <= 64) {
> +		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
> +		rte_mov32((uint8_t *)dst - 32 + n,
> +			(const uint8_t *)src - 32 + n);
> +		return ret;
> +	}
> +
> +	if (n <= 128) {
> +		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +		rte_mov64((uint8_t *)dst - 64 + n,
> +			(const uint8_t *)src - 64 + n);
> +		return ret;
> +	}
> +
> +	/*
> +	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
> +	 * copies was found to be faster than doing 128 and 32 byte copies as
> +	 * well.
> +	 */
> +	for ( ; n >= 256; n -= 256) {

There is room for prefetching the next cacheline based on the cache line
size.

> +		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
> +		dst = (uint8_t *)dst + 256;
> +		src = (const uint8_t *)src + 256;
> +	}
> +
> +	/*
> +	 * We split the remaining bytes (which will be less than 256) into
> +	 * 64byte (2^6) chunks.
> +	 * Using incrementing integers in the case labels of a switch statement
> +	 * enourages the compiler to use a jump table. To get incrementing
> +	 * integers, we shift the 2 relevant bits to the LSB position to first
> +	 * get decrementing integers, and then subtract.
> +	 */
> +	switch (3 - (n >> 6)) {
> +	case 0x00:
> +		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 64;
> +		dst = (uint8_t *)dst + 64;
> +		src = (const uint8_t *)src + 64;      /* fallthrough */
> +	case 0x01:
> +		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 64;
> +		dst = (uint8_t *)dst + 64;
> +		src = (const uint8_t *)src + 64;      /* fallthrough */
> +	case 0x02:
> +		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 64;
> +		dst = (uint8_t *)dst + 64;
> +		src = (const uint8_t *)src + 64;      /* fallthrough */
> +	default:
> +		break;
> +	}
> +
> +	/*
> +	 * We split the remaining bytes (which will be less than 64) into
> +	 * 16byte (2^4) chunks, using the same switch structure as above.
> +	 */
> +	switch (3 - (n >> 4)) {
> +	case 0x00:
> +		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 16;
> +		dst = (uint8_t *)dst + 16;
> +		src = (const uint8_t *)src + 16;      /* fallthrough */
> +	case 0x01:
> +		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 16;
> +		dst = (uint8_t *)dst + 16;
> +		src = (const uint8_t *)src + 16;      /* fallthrough */
> +	case 0x02:
> +		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 16;
> +		dst = (uint8_t *)dst + 16;
> +		src = (const uint8_t *)src + 16;      /* fallthrough */
> +	default:
> +		break;
> +	}
> +
> +	/* Copy any remaining bytes, without going beyond end of buffers */
> +	if (n != 0)
> +		rte_mov16((uint8_t *)dst - 16 + n,
> +			(const uint8_t *)src - 16 + n);
> +	return ret;
> +}
> +
> +#else
> +
> +static inline void
> +rte_mov16(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 16);
> +}
> +
> +static inline void
> +rte_mov32(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 32);
> +}
> +
> +static inline void
> +rte_mov48(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 48);
> +}
> +
> +static inline void
> +rte_mov64(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 64);
> +}
> +
> +static inline void
> +rte_mov128(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 128);
> +}
> +
> +static inline void
> +rte_mov256(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 256);
> +}
> +
> +static inline void *
> +rte_memcpy(void *dst, const void *src, size_t n)
> +{
> +	return memcpy(dst, src, n);
> +}
> +
> +static inline void *
> +rte_memcpy_func(void *dst, const void *src, size_t n)
> +{
> +	return memcpy(dst, src, n);
> +}
> +
> +#endif /* __ARM_NEON_FP */
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_MEMCPY_ARM_64_H_ */
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
@ 2015-11-02  5:15   ` Jerin Jacob
  0 siblings, 0 replies; 28+ messages in thread
From: Jerin Jacob @ 2015-11-02  5:15 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

On Fri, Oct 30, 2015 at 01:49:16PM +0000, David Hunt wrote:
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  .../common/include/arch/arm/rte_cycles.h           |  4 ++
>  .../common/include/arch/arm/rte_cycles_64.h        | 77 ++++++++++++++++++++++
>  2 files changed, 81 insertions(+)
>  create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> 
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
> index b2372fa..a8009a0 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
> @@ -33,6 +33,10 @@
>  #ifndef _RTE_CYCLES_ARM_H_
>  #define _RTE_CYCLES_ARM_H_
>  
> +#ifdef RTE_ARCH_64
> +#include <rte_cycles_64.h>
> +#else
>  #include <rte_cycles_32.h>
> +#endif
>  
>  #endif /* _RTE_CYCLES_ARM_H_ */
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> new file mode 100644
> index 0000000..148b9f4
> --- /dev/null
> +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> @@ -0,0 +1,77 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IBM Corporation 2014.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of IBM Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +*/
> +
> +#ifndef _RTE_CYCLES_ARM64_H_
> +#define _RTE_CYCLES_ARM64_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include "generic/rte_cycles.h"
> +
> +/**
> + * Read the time base register.
> + *
> + * @return
> + *   The time base for this lcore.
> + */
> +static inline uint64_t
> +rte_rdtsc(void)
> +{
> +	uint64_t tsc;
> +
> +	asm volatile("mrs %0, CNTVCT_EL0" : "=r" (tsc));
> +
> +#ifdef RTE_TIMER_MULTIPLIER
> +	return tsc * RTE_TIMER_MULTIPLIER;
> +#else
> +	return tsc;
> +#endif
> +
> +}
> +
> +static inline uint64_t
> +rte_rdtsc_precise(void)
> +{
> +	asm volatile("isb sy" :::);

IMO, it should be asm volatile("dmb ish" : : : "memory")
to represent the data memory barrier(rte_mb()).

> +	return rte_rdtsc();
> +}
> +
> +static inline uint64_t
> +rte_get_tsc_cycles(void) { return rte_rdtsc(); }
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_CYCLES_ARM64_H_ */
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02  4:57   ` Jerin Jacob
@ 2015-11-02 12:22     ` Hunt, David
  2015-11-02 12:45       ` Jan Viktorin
  2015-11-02 12:57       ` Jerin Jacob
  0 siblings, 2 replies; 28+ messages in thread
From: Hunt, David @ 2015-11-02 12:22 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

On 02/11/2015 04:57, Jerin Jacob wrote:
> On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:
>> Signed-off-by: David Hunt <david.hunt@intel.com>
--snip--
>> +#ifndef _RTE_MEMCPY_ARM_64_H_
>> +#define _RTE_MEMCPY_ARM_64_H_
>> +
>> +#include <stdint.h>
>> +#include <string.h>
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include "generic/rte_memcpy.h"
>> +
>> +#ifdef __ARM_NEON_FP
>
> SIMD is not optional in armv8 spec.So every armv8 machine will have
> SIMD instruction unlike armv7.More over LDP/STP instruction is
> not part of SIMD.So this check is not required or it can
> be replaced with a check that select memcpy from either libc or this specific
> implementation

Jerin,
    I've just benchmarked the libc version against the hand-coded 
version of the memcpy routines, and the libc wins in most cases. This 
code was just an initial attempt at optimising the memccpy's, so I feel 
that with the current benchmark results, it would better just to remove 
the assembly versions, and use the libc version for the initial release 
on ARMv8.
Then, in the future, the ARMv8 experts are free to submit an optimised 
version as a patch in the future. Does that sound reasonable to you?
Rgds,
Dave.


--snip--

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 12:22     ` Hunt, David
@ 2015-11-02 12:45       ` Jan Viktorin
  2015-11-02 12:57       ` Jerin Jacob
  1 sibling, 0 replies; 28+ messages in thread
From: Jan Viktorin @ 2015-11-02 12:45 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, 2 Nov 2015 12:22:47 +0000
"Hunt, David" <david.hunt@intel.com> wrote:

> On 02/11/2015 04:57, Jerin Jacob wrote:
> > On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:  
> >> Signed-off-by: David Hunt <david.hunt@intel.com>  
> --snip--
> >> +#ifndef _RTE_MEMCPY_ARM_64_H_
> >> +#define _RTE_MEMCPY_ARM_64_H_
> >> +
> >> +#include <stdint.h>
> >> +#include <string.h>
> >> +
> >> +#ifdef __cplusplus
> >> +extern "C" {
> >> +#endif
> >> +
> >> +#include "generic/rte_memcpy.h"
> >> +
> >> +#ifdef __ARM_NEON_FP  
> >
> > SIMD is not optional in armv8 spec.So every armv8 machine will have
> > SIMD instruction unlike armv7.More over LDP/STP instruction is
> > not part of SIMD.So this check is not required or it can
> > be replaced with a check that select memcpy from either libc or this specific
> > implementation  
> 
> Jerin,
>     I've just benchmarked the libc version against the hand-coded 
> version of the memcpy routines, and the libc wins in most cases. This 
> code was just an initial attempt at optimising the memccpy's, so I feel 
> that with the current benchmark results, it would better just to remove 
> the assembly versions, and use the libc version for the initial release 
> on ARMv8.
> Then, in the future, the ARMv8 experts are free to submit an optimised 
> version as a patch in the future. Does that sound reasonable to you?
> Rgds,
> Dave.

As there is no use of NEON in the code, this optimization seems to be
useless to me...

Jan

> 
> 
> --snip--
> 
> 
> 



-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 12:22     ` Hunt, David
  2015-11-02 12:45       ` Jan Viktorin
@ 2015-11-02 12:57       ` Jerin Jacob
  2015-11-02 15:26         ` Hunt, David
  1 sibling, 1 reply; 28+ messages in thread
From: Jerin Jacob @ 2015-11-02 12:57 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote:
> On 02/11/2015 04:57, Jerin Jacob wrote:
> >On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:
> >>Signed-off-by: David Hunt <david.hunt@intel.com>
> --snip--
> >>+#ifndef _RTE_MEMCPY_ARM_64_H_
> >>+#define _RTE_MEMCPY_ARM_64_H_
> >>+
> >>+#include <stdint.h>
> >>+#include <string.h>
> >>+
> >>+#ifdef __cplusplus
> >>+extern "C" {
> >>+#endif
> >>+
> >>+#include "generic/rte_memcpy.h"
> >>+
> >>+#ifdef __ARM_NEON_FP
> >
> >SIMD is not optional in armv8 spec.So every armv8 machine will have
> >SIMD instruction unlike armv7.More over LDP/STP instruction is
> >not part of SIMD.So this check is not required or it can
> >be replaced with a check that select memcpy from either libc or this specific
> >implementation
> 
> Jerin,
>    I've just benchmarked the libc version against the hand-coded version of
> the memcpy routines, and the libc wins in most cases. This code was just an
> initial attempt at optimising the memccpy's, so I feel that with the current
> benchmark results, it would better just to remove the assembly versions, and
> use the libc version for the initial release on ARMv8.
> Then, in the future, the ARMv8 experts are free to submit an optimised
> version as a patch in the future. Does that sound reasonable to you?

Make sense. Based on my understanding, other blocks are also not optimized 
for arm64.
So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and
libc for initial version.

BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and
"byteorder_autotest" is broken. I think existing arm64 code is not optimized
beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified
CONFIG_RTE_FORCE_INTRINSICS scheme.

if you guys are OK with arm and arm64 as two different platform then
I can summit the complete working patch for arm64.(as in my current source
code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/)


> Rgds,
> Dave.
> 
> 
> --snip--
> 
> 
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 12:57       ` Jerin Jacob
@ 2015-11-02 15:26         ` Hunt, David
  2015-11-02 15:36           ` Jan Viktorin
  0 siblings, 1 reply; 28+ messages in thread
From: Hunt, David @ 2015-11-02 15:26 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

On 02/11/2015 12:57, Jerin Jacob wrote:
> On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote:
>> Jerin,
>>     I've just benchmarked the libc version against the hand-coded version of
>> the memcpy routines, and the libc wins in most cases. This code was just an
>> initial attempt at optimising the memccpy's, so I feel that with the current
>> benchmark results, it would better just to remove the assembly versions, and
>> use the libc version for the initial release on ARMv8.
>> Then, in the future, the ARMv8 experts are free to submit an optimised
>> version as a patch in the future. Does that sound reasonable to you?
>
> Make sense. Based on my understanding, other blocks are also not optimized
> for arm64.
> So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and
> libc for initial version.
>
> BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and
> "byteorder_autotest" is broken. I think existing arm64 code is not optimized
> beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified
> CONFIG_RTE_FORCE_INTRINSICS scheme.

Agreed.

> if you guys are OK with arm and arm64 as two different platform then
> I can summit the complete working patch for arm64.(as in my current source
> code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/)

Sure. That would be great. We initially started with two ARMv7 
patch-sets, and Jan merged into one. Something similar could happen for 
the ARMv8 patch set. We just want to end up with the best implementation 
possible. :)

Dave.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 15:26         ` Hunt, David
@ 2015-11-02 15:36           ` Jan Viktorin
  2015-11-02 15:49             ` Hunt, David
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Viktorin @ 2015-11-02 15:36 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, 2 Nov 2015 15:26:19 +0000
"Hunt, David" <david.hunt@intel.com> wrote:

> On 02/11/2015 12:57, Jerin Jacob wrote:
> > On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote:  
> >> Jerin,
> >>     I've just benchmarked the libc version against the hand-coded version of
> >> the memcpy routines, and the libc wins in most cases. This code was just an
> >> initial attempt at optimising the memccpy's, so I feel that with the current
> >> benchmark results, it would better just to remove the assembly versions, and
> >> use the libc version for the initial release on ARMv8.
> >> Then, in the future, the ARMv8 experts are free to submit an optimised
> >> version as a patch in the future. Does that sound reasonable to you?  
> >
> > Make sense. Based on my understanding, other blocks are also not optimized
> > for arm64.
> > So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and
> > libc for initial version.
> >
> > BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and
> > "byteorder_autotest" is broken. I think existing arm64 code is not optimized
> > beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified
> > CONFIG_RTE_FORCE_INTRINSICS scheme.  
> 
> Agreed.
> 
> > if you guys are OK with arm and arm64 as two different platform then
> > I can summit the complete working patch for arm64.(as in my current source
> > code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/)  
> 
> Sure. That would be great. We initially started with two ARMv7 
> patch-sets, and Jan merged into one. Something similar could happen for 
> the ARMv8 patch set. We just want to end up with the best implementation 
> possible. :)
> 

It was looking like we can share a lot of common code for both
architectures. I didn't know how much different are the cpuflags.

IMHO, it'd be better to have two directories arm and arm64. I thought
to refer from arm64 to arm where possible. But I don't know whether is
this possible with the DPDK build system.

Jan

> Dave.
> 
> 
> 
> 



-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 15:36           ` Jan Viktorin
@ 2015-11-02 15:49             ` Hunt, David
  2015-11-02 16:29               ` Jerin Jacob
  0 siblings, 1 reply; 28+ messages in thread
From: Hunt, David @ 2015-11-02 15:49 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

On 02/11/2015 15:36, Jan Viktorin wrote:
> On Mon, 2 Nov 2015 15:26:19 +0000
--snip--
> It was looking like we can share a lot of common code for both
> architectures. I didn't know how much different are the cpuflags.

CPU flags for ARMv8 are looking like this now. Quite different to the 
ARMv7 ones.

static const struct feature_entry cpu_feature_table[] = {
         FEAT_DEF(FP,        0x00000001, 0, REG_HWCAP,  0)
         FEAT_DEF(ASIMD,     0x00000001, 0, REG_HWCAP,  1)
         FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  2)
         FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP,  3)
         FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP,  4)
         FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP,  5)
         FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP,  6)
         FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP,  7)
         FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
         FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
};

> IMHO, it'd be better to have two directories arm and arm64. I thought
> to refer from arm64 to arm where possible. But I don't know whether is
> this possible with the DPDK build system.

I think both methodologies have their pros and cons. However, I'd lean 
towards the common directory with the "filename_32/64.h" scheme, as that 
similar to the x86 methodology, and we don't need to tweak the include 
paths to pull files from multiple directories.

Dave

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 15:49             ` Hunt, David
@ 2015-11-02 16:29               ` Jerin Jacob
  2015-11-02 17:29                 ` Jan Viktorin
  0 siblings, 1 reply; 28+ messages in thread
From: Jerin Jacob @ 2015-11-02 16:29 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, Nov 02, 2015 at 03:49:17PM +0000, Hunt, David wrote:
> On 02/11/2015 15:36, Jan Viktorin wrote:
> >On Mon, 2 Nov 2015 15:26:19 +0000
> --snip--
> >It was looking like we can share a lot of common code for both
> >architectures. I didn't know how much different are the cpuflags.
> 
> CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7
> ones.
> 
> static const struct feature_entry cpu_feature_table[] = {
>         FEAT_DEF(FP,        0x00000001, 0, REG_HWCAP,  0)
>         FEAT_DEF(ASIMD,     0x00000001, 0, REG_HWCAP,  1)
>         FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  2)
>         FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP,  3)
>         FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP,  4)
>         FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP,  5)
>         FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP,  6)
>         FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP,  7)
>         FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
>         FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
> };
> 
> >IMHO, it'd be better to have two directories arm and arm64. I thought
> >to refer from arm64 to arm where possible. But I don't know whether is
> >this possible with the DPDK build system.
> 
> I think both methodologies have their pros and cons. However, I'd lean
> towards the common directory with the "filename_32/64.h" scheme, as that
> similar to the x86 methodology, and we don't need to tweak the include paths
> to pull files from multiple directories.
> 

I agree. Jan, could you please send the next version with
filename_32/64.h for atomic and cpuflags(ie for all header files).
I can re-base and send the complete arm64 patch based on your version.

Thanks,
Jerin



> Dave
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 16:29               ` Jerin Jacob
@ 2015-11-02 17:29                 ` Jan Viktorin
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Viktorin @ 2015-11-02 17:29 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

On Mon, 2 Nov 2015 21:59:12 +0530
Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:

> On Mon, Nov 02, 2015 at 03:49:17PM +0000, Hunt, David wrote:
> > On 02/11/2015 15:36, Jan Viktorin wrote:  
> > >On Mon, 2 Nov 2015 15:26:19 +0000  
> > --snip--  
> > >It was looking like we can share a lot of common code for both
> > >architectures. I didn't know how much different are the cpuflags.  
> > 
> > CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7
> > ones.
> > 
> > static const struct feature_entry cpu_feature_table[] = {
> >         FEAT_DEF(FP,        0x00000001, 0, REG_HWCAP,  0)
> >         FEAT_DEF(ASIMD,     0x00000001, 0, REG_HWCAP,  1)
> >         FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  2)
> >         FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP,  3)
> >         FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP,  4)
> >         FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP,  5)
> >         FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP,  6)
> >         FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP,  7)
> >         FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
> >         FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
> > };
> >   
> > >IMHO, it'd be better to have two directories arm and arm64. I thought
> > >to refer from arm64 to arm where possible. But I don't know whether is
> > >this possible with the DPDK build system.  
> > 
> > I think both methodologies have their pros and cons. However, I'd lean
> > towards the common directory with the "filename_32/64.h" scheme, as that
> > similar to the x86 methodology, and we don't need to tweak the include paths
> > to pull files from multiple directories.
> >   
> 
> I agree. Jan, could you please send the next version with
> filename_32/64.h for atomic and cpuflags(ie for all header files).
> I can re-base and send the complete arm64 patch based on your version.
> 

I am working on it, however, after I've removed the unnecessary
intrinsics code and set the RTE_FORCE_INTRINSICS=y, it doesn't
build... So I'm figuring out what is wrong.

Jan

> Thanks,
> Jerin
> 
> 
> 
> > Dave
> >   



-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-11-02 10:47         ` Hunt, David
  2015-11-02 13:17           ` Jerin Jacob
@ 2015-11-02 15:24           ` Jan Viktorin
  1 sibling, 0 replies; 28+ messages in thread
From: Jan Viktorin @ 2015-11-02 15:24 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, 2 Nov 2015 10:47:53 +0000
"Hunt, David" <david.hunt@intel.com> wrote:

> On 02/11/2015 06:32, Jerin Jacob wrote:
> > On Fri, Oct 30, 2015 at 04:28:25PM +0000, Hunt, David wrote:  
> 
> --snip--
> 
> >
> > Hi Jan and Dave,
> >
> > I have reviewed your patches for arm[64] support. Please check the
> > review comments.  
> 
--snip--
> > In order to debug this, Could provide the following
> > values in tested armv8 platform. Look like its running 32bit compatible
> > mode in your environment  
> 
> I'm using a Gigabyte MP30AR0 motherboard with an 8-core X-Gene, Running 
> a 4.3.0-rc6 kernel.
> Here's the information on the cpu_flags issue you requested:
> 
--snip--
> 
> root@mp30ar0:~#
> 
> Hope this helps.
> 
> Regards,
> Dave.
> 

My few bits to compare to ARMv7. There is AT_PLATFORM=v7l (and no
aarch32), this is probably to be fixed...

Altera SoC FPGA:

# LD_SHOW_AUXV=1 sleep 1
AT_HWCAP:    swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls
AT_PAGESZ:       4096
AT_CLKTCK:       100
AT_PHDR:         0x10034
AT_PHENT:        32
AT_PHNUM:        8
AT_BASE:         0x76fd3000
AT_FLAGS:        0x0
AT_ENTRY:        0x149d9
AT_UID:          0
AT_EUID:         0
AT_GID:          0
AT_EGID:         0
AT_SECURE:       0
AT_RANDOM:       0x7ebbcf2f
AT_EXECFN:       /bin/sleep
AT_PLATFORM:     v7l

# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 0 (v7l)
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3
tls vfpd32 CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0xc09
CPU revision    : 0

processor       : 1
model name      : ARMv7 Processor rev 0 (v7l)
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3
tls vfpd32 CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0xc09
CPU revision    : 0

Hardware        : Altera SOCFPGA
Revision        : 0000
Serial          : 0000000000000000


Odroid XU4:

# LD_SHOW_AUXV=1 sleep 1
AT_HWCAP:    swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
AT_PAGESZ:       4096
AT_CLKTCK:       100
AT_PHDR:         0x10034
AT_PHENT:        32
AT_PHNUM:        9
AT_BASE:         0xb6f8c000
AT_FLAGS:        0x0
AT_ENTRY:        0x11191
AT_UID:          1000
AT_EUID:         1000
AT_GID:          1000
AT_EGID:         1000
AT_SECURE:       0
AT_RANDOM:       0xbec42ed6
AT_EXECFN:       /bin/sleep
AT_PLATFORM:     v7l

# cat /proc/cpuinfo
Processor       : ARMv7 Processor rev 1 (v7l)
processor       : 0
BogoMIPS        : 3.07

processor       : 1
BogoMIPS        : 3.07

processor       : 2
BogoMIPS        : 3.07

processor       : 3
BogoMIPS        : 3.07

Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc05
CPU revision    : 1

Hardware        : ODROIDC
Revision        : 000a
Serial          : 1b00000000000000

-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-11-02 15:13               ` Jan Viktorin
@ 2015-11-02 15:20                 ` Hunt, David
  0 siblings, 0 replies; 28+ messages in thread
From: Hunt, David @ 2015-11-02 15:20 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

On 02/11/2015 15:13, Jan Viktorin wrote:
> On Mon, 2 Nov 2015 15:04:14 +0000
> "Hunt, David" <david.hunt@intel.com> wrote:
>
>> On 02/11/2015 13:17, Jerin Jacob wrote:
>> -snip--
>>> If am not wrong existing  rte_cpu_get_flag_enabled() implementation
>>> should be broken in your platform also for arm64. as I could see only AT_HWCAP
>>> not AT_HWCAP2 and AT_HWCAP is 0x7 that means your platform also
>>> follows
>>>
>>> http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h
>>>
>>> and the implmentation is
>>>
>>> FEAT_DEF(SWP,       0x00000001, 0, REG_HWCAP,  0) // not correct for arm64
>>> FEAT_DEF(HALF,      0x00000001, 0, REG_HWCAP,  1) // not correct for arm64
>>> FEAT_DEF(THUMB,     0x00000001, 0, REG_HWCAP,  2) // not correct for arm64
>>> FEAT_DEF(A26BIT,    0x00000001, 0, REG_HWCAP,  3)
>> --snip--
>>> FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
>>> FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
>>> FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
>>>
>>> Am I missing something ?
>>
>> You are correct. I need to re-visit this. In merging the ARMv7 and
>> ARVv8, I should have split the hardware capabilities flags into 32-but
>> and 64-bit versions. I'll do that in the next patch.
>> Thanks,
>> Dave.
>
> Should I split the rte_atomic.h and rte_cpuflags.h then?
>
> Jan

It looks like we're headed in that direction, so yes, I think that would 
be a good idea.

Dave

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-11-02 15:04             ` Hunt, David
@ 2015-11-02 15:13               ` Jan Viktorin
  2015-11-02 15:20                 ` Hunt, David
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Viktorin @ 2015-11-02 15:13 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, 2 Nov 2015 15:04:14 +0000
"Hunt, David" <david.hunt@intel.com> wrote:

> On 02/11/2015 13:17, Jerin Jacob wrote:
> -snip--
> > If am not wrong existing  rte_cpu_get_flag_enabled() implementation
> > should be broken in your platform also for arm64. as I could see only AT_HWCAP
> > not AT_HWCAP2 and AT_HWCAP is 0x7 that means your platform also
> > follows
> >
> > http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h
> >
> > and the implmentation is
> >
> > FEAT_DEF(SWP,       0x00000001, 0, REG_HWCAP,  0) // not correct for arm64
> > FEAT_DEF(HALF,      0x00000001, 0, REG_HWCAP,  1) // not correct for arm64
> > FEAT_DEF(THUMB,     0x00000001, 0, REG_HWCAP,  2) // not correct for arm64
> > FEAT_DEF(A26BIT,    0x00000001, 0, REG_HWCAP,  3)  
> --snip--
> > FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
> > FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
> > FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
> >
> > Am I missing something ?  
> 
> You are correct. I need to re-visit this. In merging the ARMv7 and 
> ARVv8, I should have split the hardware capabilities flags into 32-but 
> and 64-bit versions. I'll do that in the next patch.
> Thanks,
> Dave.

Should I split the rte_atomic.h and rte_cpuflags.h then?

Jan

> 
> 
> 
> 
> 



-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-11-02 13:17           ` Jerin Jacob
@ 2015-11-02 15:04             ` Hunt, David
  2015-11-02 15:13               ` Jan Viktorin
  0 siblings, 1 reply; 28+ messages in thread
From: Hunt, David @ 2015-11-02 15:04 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

On 02/11/2015 13:17, Jerin Jacob wrote:
-snip--
> If am not wrong existing  rte_cpu_get_flag_enabled() implementation
> should be broken in your platform also for arm64. as I could see only AT_HWCAP
> not AT_HWCAP2 and AT_HWCAP is 0x7 that means your platform also
> follows
>
> http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h
>
> and the implmentation is
>
> FEAT_DEF(SWP,       0x00000001, 0, REG_HWCAP,  0) // not correct for arm64
> FEAT_DEF(HALF,      0x00000001, 0, REG_HWCAP,  1) // not correct for arm64
> FEAT_DEF(THUMB,     0x00000001, 0, REG_HWCAP,  2) // not correct for arm64
> FEAT_DEF(A26BIT,    0x00000001, 0, REG_HWCAP,  3)
--snip--
> FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
> FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
> FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
>
> Am I missing something ?

You are correct. I need to re-visit this. In merging the ARMv7 and 
ARVv8, I should have split the hardware capabilities flags into 32-but 
and 64-bit versions. I'll do that in the next patch.
Thanks,
Dave.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-11-02 10:47         ` Hunt, David
@ 2015-11-02 13:17           ` Jerin Jacob
  2015-11-02 15:04             ` Hunt, David
  2015-11-02 15:24           ` Jan Viktorin
  1 sibling, 1 reply; 28+ messages in thread
From: Jerin Jacob @ 2015-11-02 13:17 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, Nov 02, 2015 at 10:47:53AM +0000, Hunt, David wrote:
> On 02/11/2015 06:32, Jerin Jacob wrote:
> >On Fri, Oct 30, 2015 at 04:28:25PM +0000, Hunt, David wrote:
>
> --snip--
>
> >
> >Hi Jan and Dave,
> >
> >I have reviewed your patches for arm[64] support. Please check the
> >review comments.
>
> Hi Jerin,
>
> I'm looking at the comments now, and working on getting the suggested
> changes merged into the patch-set.
>
> >Cavium would like to contribute on armv8 port and remaining libraries
> >(ACL, LPM, HASH) implementation for armv8. Currently i am re-basing
> >our ACL,HASH libraries implementation based on existing patches.
> >Happy to work with you guys to have full fledged armv8 support for DPDK.
> >
> >Jerin
>
> Thanks for that, it's good news indeed.
>
> >other query on rte_cpu_get_flag_enabled for armv8,
> >I have tried to run the existing patches on armv8-thunderX platform.
> >But there application start failure due to mismatch in
> >rte_cpu_get_flag_enabled() encoding.
> >
> >In my platform rte_cpu_get_flag_enabled() works based on
> >AT_HWCAP with following values[1] which different from
> >existing lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
> >
> >[1]http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h
> >
> >In order to debug this, Could provide the following
> >values in tested armv8 platform. Look like its running 32bit compatible
> >mode in your environment
>
> I'm using a Gigabyte MP30AR0 motherboard with an 8-core X-Gene, Running a
> 4.3.0-rc6 kernel.
> Here's the information on the cpu_flags issue you requested:
>
> >AT_SYSINFO_EHDR: 0x3ff859f0000
> >AT_??? (0x26): 0x430f0a10
> >AT_HWCAP:        fb
> >AT_PAGESZ:       65536
> >AT_CLKTCK:       100
> >AT_PHDR:         0x400040
> >AT_PHENT:        56
> >AT_PHNUM:        7
> >AT_BASE:         0x3ff85a00000
> >AT_FLAGS:        0x0
> >AT_ENTRY:        0x401900
> >AT_UID:          0
> >AT_EUID:         0
> >AT_GID:          0
> >AT_EGID:         0
> >AT_SECURE:       0
> >AT_RANDOM:       0x3ffef1c7988
> >AT_EXECFN:       /bin/sleep
> >AT_PLATFORM:     aarch64
>
> root@mp30ar0:~# LD_SHOW_AUXV=1 sleep 1000
> AT_SYSINFO_EHDR: 0x7f7956d000
> AT_HWCAP:        7
> AT_PAGESZ:       4096
> AT_CLKTCK:       100
> AT_PHDR:         0x400040
> AT_PHENT:        56
> AT_PHNUM:        7
> AT_BASE:         0x7f79543000
> AT_FLAGS:        0x0
> AT_ENTRY:        0x401900
> AT_UID:          0
> AT_EUID:         0
> AT_GID:          0
> AT_EGID:         0
> AT_SECURE:       0
> AT_RANDOM:       0x7ffcaf2e48
> AT_EXECFN:       /bin/sleep
> AT_PLATFORM:     aarch64
>

If am not wrong existing  rte_cpu_get_flag_enabled() implementation
should be broken in your platform also for arm64. as I could see only AT_HWCAP
not AT_HWCAP2 and AT_HWCAP is 0x7 that means your platform also
follows

http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h

and the implmentation is

FEAT_DEF(SWP,       0x00000001, 0, REG_HWCAP,  0) // not correct for arm64
FEAT_DEF(HALF,      0x00000001, 0, REG_HWCAP,  1) // not correct for arm64
FEAT_DEF(THUMB,     0x00000001, 0, REG_HWCAP,  2) // not correct for arm64
FEAT_DEF(A26BIT,    0x00000001, 0, REG_HWCAP,  3)
FEAT_DEF(FAST_MULT, 0x00000001, 0, REG_HWCAP,  4)
FEAT_DEF(FPA,       0x00000001, 0, REG_HWCAP,  5)
FEAT_DEF(VFP,       0x00000001, 0, REG_HWCAP,  6)
FEAT_DEF(EDSP,      0x00000001, 0, REG_HWCAP,  7)
FEAT_DEF(JAVA,      0x00000001, 0, REG_HWCAP,  8)
FEAT_DEF(IWMMXT,    0x00000001, 0, REG_HWCAP,  9)
FEAT_DEF(CRUNCH,    0x00000001, 0, REG_HWCAP,  10)
FEAT_DEF(THUMBEE,   0x00000001, 0, REG_HWCAP,  11)
FEAT_DEF(NEON,      0x00000001, 0, REG_HWCAP,  12)
FEAT_DEF(VFPv3,     0x00000001, 0, REG_HWCAP,  13)
FEAT_DEF(VFPv3D16,  0x00000001, 0, REG_HWCAP,  14)
FEAT_DEF(TLS,       0x00000001, 0, REG_HWCAP,  15)
FEAT_DEF(VFPv4,     0x00000001, 0, REG_HWCAP,  16)
FEAT_DEF(IDIVA,     0x00000001, 0, REG_HWCAP,  17)
FEAT_DEF(IDIVT,     0x00000001, 0, REG_HWCAP,  18)
FEAT_DEF(VFPD32,    0x00000001, 0, REG_HWCAP,  19)
FEAT_DEF(LPAE,      0x00000001, 0, REG_HWCAP,  20)
FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  21)
FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP2,  0)
FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP2,  1)
FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP2,  2)
FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP2,  3)
FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP2,  4)
FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)

Am I missing something ?


> >root@arm64:/export/dpdk-arm64# zcat /proc/config.gz  | grep CONFIG_COMPAT
> ># CONFIG_COMPAT_BRK is not set
> >CONFIG_COMPAT_BINFMT_ELF=y
> >CONFIG_COMPAT=y
> >CONFIG_COMPAT_NETLINK_MESSAGES=y
>
> root@mp30ar0:~# zcat /proc/config.gz  | grep CONFIG_COMPAT
> # CONFIG_COMPAT_BRK is not set
> CONFIG_COMPAT_OLD_SIGACTION=y
> CONFIG_COMPAT_BINFMT_ELF=y
> CONFIG_COMPAT=y
>
>
> >root@arm64:/export/dpdk-arm64# cat /proc/cpuinfo
> >Processor       : AArch64 Processor rev 0 (aarch64)
> >processor       : 0
> >processor       : 1
> --snip--
> >processor       : 46
> >processor       : 47
> >Features        : fp asimd aes pmull sha1 sha2 crc32
> >CPU implementer : 0x43
> >CPU architecture: AArch64
> >CPU variant     : 0x0
> >CPU part        : 0x0a1
> >CPU revision    : 0
>
> root@mp30ar0:~# cat /proc/cpuinfo
> processor       : 0
> Features        : fp asimd evtstrm
> CPU implementer : 0x50
> CPU architecture: 8
> CPU variant     : 0x0
> CPU part        : 0x000
> CPU revision    : 1
>
> processor       : 1
> Features        : fp asimd evtstrm
> CPU implementer : 0x50
> CPU architecture: 8
> CPU variant     : 0x0
> CPU part        : 0x000
> CPU revision    : 1
>
> processor       : 2
> Features        : fp asimd evtstrm
> CPU implementer : 0x50
> CPU architecture: 8
> CPU variant     : 0x0
> CPU part        : 0x000
> CPU revision    : 1
>
> processor       : 3
> Features        : fp asimd evtstrm
> CPU implementer : 0x50
> CPU architecture: 8
> CPU variant     : 0x0
> CPU part        : 0x000
> CPU revision    : 1
>
> processor       : 4
> Features        : fp asimd evtstrm
> CPU implementer : 0x50
> CPU architecture: 8
> CPU variant     : 0x0
> CPU part        : 0x000
> CPU revision    : 1
>
> processor       : 5
> Features        : fp asimd evtstrm
> CPU implementer : 0x50
> CPU architecture: 8
> CPU variant     : 0x0
> CPU part        : 0x000
> CPU revision    : 1
>
> processor       : 6
> Features        : fp asimd evtstrm
> CPU implementer : 0x50
> CPU architecture: 8
> CPU variant     : 0x0
> CPU part        : 0x000
> CPU revision    : 1
>
> processor       : 7
> Features        : fp asimd evtstrm
> CPU implementer : 0x50
> CPU architecture: 8
> CPU variant     : 0x0
> CPU part        : 0x000
> CPU revision    : 1
>
> root@mp30ar0:~#
>
> Hope this helps.
>
> Regards,
> Dave.
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-11-02  6:32       ` Jerin Jacob
@ 2015-11-02 10:47         ` Hunt, David
  2015-11-02 13:17           ` Jerin Jacob
  2015-11-02 15:24           ` Jan Viktorin
  0 siblings, 2 replies; 28+ messages in thread
From: Hunt, David @ 2015-11-02 10:47 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

On 02/11/2015 06:32, Jerin Jacob wrote:
> On Fri, Oct 30, 2015 at 04:28:25PM +0000, Hunt, David wrote:

--snip--

>
> Hi Jan and Dave,
>
> I have reviewed your patches for arm[64] support. Please check the
> review comments.

Hi Jerin,

I'm looking at the comments now, and working on getting the suggested 
changes merged into the patch-set.

> Cavium would like to contribute on armv8 port and remaining libraries
> (ACL, LPM, HASH) implementation for armv8. Currently i am re-basing
> our ACL,HASH libraries implementation based on existing patches.
> Happy to work with you guys to have full fledged armv8 support for DPDK.
>
> Jerin

Thanks for that, it's good news indeed.

> other query on rte_cpu_get_flag_enabled for armv8,
> I have tried to run the existing patches on armv8-thunderX platform.
> But there application start failure due to mismatch in
> rte_cpu_get_flag_enabled() encoding.
>
> In my platform rte_cpu_get_flag_enabled() works based on
> AT_HWCAP with following values[1] which different from
> existing lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
>
> [1]http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h
>
> In order to debug this, Could provide the following
> values in tested armv8 platform. Look like its running 32bit compatible
> mode in your environment

I'm using a Gigabyte MP30AR0 motherboard with an 8-core X-Gene, Running 
a 4.3.0-rc6 kernel.
Here's the information on the cpu_flags issue you requested:

> AT_SYSINFO_EHDR: 0x3ff859f0000
> AT_??? (0x26): 0x430f0a10
> AT_HWCAP:        fb
> AT_PAGESZ:       65536
> AT_CLKTCK:       100
> AT_PHDR:         0x400040
> AT_PHENT:        56
> AT_PHNUM:        7
> AT_BASE:         0x3ff85a00000
> AT_FLAGS:        0x0
> AT_ENTRY:        0x401900
> AT_UID:          0
> AT_EUID:         0
> AT_GID:          0
> AT_EGID:         0
> AT_SECURE:       0
> AT_RANDOM:       0x3ffef1c7988
> AT_EXECFN:       /bin/sleep
> AT_PLATFORM:     aarch64

root@mp30ar0:~# LD_SHOW_AUXV=1 sleep 1000
AT_SYSINFO_EHDR: 0x7f7956d000
AT_HWCAP:        7
AT_PAGESZ:       4096
AT_CLKTCK:       100
AT_PHDR:         0x400040
AT_PHENT:        56
AT_PHNUM:        7
AT_BASE:         0x7f79543000
AT_FLAGS:        0x0
AT_ENTRY:        0x401900
AT_UID:          0
AT_EUID:         0
AT_GID:          0
AT_EGID:         0
AT_SECURE:       0
AT_RANDOM:       0x7ffcaf2e48
AT_EXECFN:       /bin/sleep
AT_PLATFORM:     aarch64

> root@arm64:/export/dpdk-arm64# zcat /proc/config.gz  | grep CONFIG_COMPAT
> # CONFIG_COMPAT_BRK is not set
> CONFIG_COMPAT_BINFMT_ELF=y
> CONFIG_COMPAT=y
> CONFIG_COMPAT_NETLINK_MESSAGES=y

root@mp30ar0:~# zcat /proc/config.gz  | grep CONFIG_COMPAT
# CONFIG_COMPAT_BRK is not set
CONFIG_COMPAT_OLD_SIGACTION=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_COMPAT=y


> root@arm64:/export/dpdk-arm64# cat /proc/cpuinfo
> Processor       : AArch64 Processor rev 0 (aarch64)
> processor       : 0
> processor       : 1
--snip--
> processor       : 46
> processor       : 47
> Features        : fp asimd aes pmull sha1 sha2 crc32
> CPU implementer : 0x43
> CPU architecture: AArch64
> CPU variant     : 0x0
> CPU part        : 0x0a1
> CPU revision    : 0

root@mp30ar0:~# cat /proc/cpuinfo
processor       : 0
Features        : fp asimd evtstrm
CPU implementer : 0x50
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x000
CPU revision    : 1

processor       : 1
Features        : fp asimd evtstrm
CPU implementer : 0x50
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x000
CPU revision    : 1

processor       : 2
Features        : fp asimd evtstrm
CPU implementer : 0x50
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x000
CPU revision    : 1

processor       : 3
Features        : fp asimd evtstrm
CPU implementer : 0x50
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x000
CPU revision    : 1

processor       : 4
Features        : fp asimd evtstrm
CPU implementer : 0x50
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x000
CPU revision    : 1

processor       : 5
Features        : fp asimd evtstrm
CPU implementer : 0x50
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x000
CPU revision    : 1

processor       : 6
Features        : fp asimd evtstrm
CPU implementer : 0x50
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x000
CPU revision    : 1

processor       : 7
Features        : fp asimd evtstrm
CPU implementer : 0x50
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x000
CPU revision    : 1

root@mp30ar0:~#

Hope this helps.

Regards,
Dave.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-10-30 16:28     ` Hunt, David
@ 2015-11-02  6:32       ` Jerin Jacob
  2015-11-02 10:47         ` Hunt, David
  0 siblings, 1 reply; 28+ messages in thread
From: Jerin Jacob @ 2015-11-02  6:32 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Fri, Oct 30, 2015 at 04:28:25PM +0000, Hunt, David wrote:
> On 30/10/2015 16:11, Jan Viktorin wrote:
> >Hmm, I see. It's good to fix this in the generated e-mails between format-patch
> > and send-email calls. I always review those to be sure they meet my
> expectations ;).
> >Anyway, it is not clear, what has changed in the v3. Just the rte_cycles?
> >You should explain that at least in the 0000 patch. Better to keep some history
> >in each single commit (are there any rules in dpdk for this? Just look how they do in kernel).
> --snip--
> 
> Sure, I'll keep that in mind for the next time. A list of changes for each
> revision, and also changes in each patch in the patch set. As Thomas says -
> whatever helps the reviewer :)
> 
> For the moment there probably isn't a need to release a new patch set for
> these comments, so I'll just list them here:
> 1. v3 has just the additional comment in one of the patches to say that the
> armv8 header files are in the 'arm' include directory.
> 2. The rte_cycles is unchanged, the CONFIG_ is not needed.
> 
> If there is a need to post another patch set I'll include the change notes.
> Otherwise do we all think that the patch is there (or there abouts)? :)

Hi Jan and Dave,

I have reviewed your patches for arm[64] support. Please check the
review comments.

Cavium would like to contribute on armv8 port and remaining libraries
(ACL, LPM, HASH) implementation for armv8. Currently i am re-basing
our ACL,HASH libraries implementation based on existing patches.
Happy to work with you guys to have full fledged armv8 support for DPDK.

Jerin


other query on rte_cpu_get_flag_enabled for armv8,
I have tried to run the existing patches on armv8-thunderX platform.
But there application start failure due to mismatch in
rte_cpu_get_flag_enabled() encoding.

In my platform rte_cpu_get_flag_enabled() works based on
AT_HWCAP with following values[1] which different from
existing lib/librte_eal/common/include/arch/arm/rte_cpuflags.h

[1]http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/hwcap.h

In order to debug this, Could provide the following 
values in tested armv8 platform. Look like its running 32bit compatible 
mode in your environment


root@arm64:/export/dpdk-arm64# LD_SHOW_AUXV=1 sleep 1000
AT_SYSINFO_EHDR: 0x3ff859f0000
AT_??? (0x26): 0x430f0a10
AT_HWCAP:        fb
AT_PAGESZ:       65536
AT_CLKTCK:       100
AT_PHDR:         0x400040
AT_PHENT:        56
AT_PHNUM:        7
AT_BASE:         0x3ff85a00000
AT_FLAGS:        0x0
AT_ENTRY:        0x401900
AT_UID:          0
AT_EUID:         0
AT_GID:          0
AT_EGID:         0
AT_SECURE:       0
AT_RANDOM:       0x3ffef1c7988
AT_EXECFN:       /bin/sleep
AT_PLATFORM:     aarch64

root@arm64:/export/dpdk-arm64# zcat /proc/config.gz  | grep CONFIG_COMPAT
# CONFIG_COMPAT_BRK is not set
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_COMPAT=y
CONFIG_COMPAT_NETLINK_MESSAGES=y


root@arm64:/export/dpdk-arm64# cat /proc/cpuinfo
Processor       : AArch64 Processor rev 0 (aarch64)
processor       : 0
processor       : 1
processor       : 2
processor       : 3
processor       : 4
processor       : 5
processor       : 6
processor       : 7
processor       : 8
processor       : 9
processor       : 10
processor       : 11
processor       : 12
processor       : 13
processor       : 14
processor       : 15
processor       : 16
processor       : 17
processor       : 18
processor       : 19
processor       : 20
processor       : 21
processor       : 22
processor       : 23
processor       : 24
processor       : 25
processor       : 26
processor       : 27
processor       : 28
processor       : 29
processor       : 30
processor       : 31
processor       : 32
processor       : 33
processor       : 34
processor       : 35
processor       : 36
processor       : 37
processor       : 38
processor       : 39
processor       : 40
processor       : 41
processor       : 42
processor       : 43
processor       : 44
processor       : 45
processor       : 46
processor       : 47
Features        : fp asimd aes pmull sha1 sha2 crc32
CPU implementer : 0x43
CPU architecture: AArch64
CPU variant     : 0x0
CPU part        : 0x0a1
CPU revision    : 0




> 
> Regards,
> Dave.
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-10-30 16:11   ` Jan Viktorin
  2015-10-30 16:16     ` Thomas Monjalon
@ 2015-10-30 16:28     ` Hunt, David
  2015-11-02  6:32       ` Jerin Jacob
  1 sibling, 1 reply; 28+ messages in thread
From: Hunt, David @ 2015-10-30 16:28 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

On 30/10/2015 16:11, Jan Viktorin wrote:
> Hmm, I see. It's good to fix this in the generated e-mails between format-patch
 > and send-email calls. I always review those to be sure they meet my 
expectations ;).
> Anyway, it is not clear, what has changed in the v3. Just the rte_cycles?
> You should explain that at least in the 0000 patch. Better to keep some history
> in each single commit (are there any rules in dpdk for this? Just look how they do in kernel).
--snip--

Sure, I'll keep that in mind for the next time. A list of changes for 
each revision, and also changes in each patch in the patch set. As 
Thomas says - whatever helps the reviewer :)

For the moment there probably isn't a need to release a new patch set 
for these comments, so I'll just list them here:
1. v3 has just the additional comment in one of the patches to say that 
the armv8 header files are in the 'arm' include directory.
2. The rte_cycles is unchanged, the CONFIG_ is not needed.

If there is a need to post another patch set I'll include the change 
notes. Otherwise do we all think that the patch is there (or there 
abouts)? :)

Regards,
Dave.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-10-30 16:11   ` Jan Viktorin
@ 2015-10-30 16:16     ` Thomas Monjalon
  2015-10-30 16:28     ` Hunt, David
  1 sibling, 0 replies; 28+ messages in thread
From: Thomas Monjalon @ 2015-10-30 16:16 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

2015-10-30 17:11, Jan Viktorin:
> Anyway, it is not clear, what has changed in the v3. Just the rte_cycles?
> You should explain that at least in the 0000 patch.
> Better to keep some history in each single commit (are there any rules in
> dpdk for this? Just look how they do in kernel).

The rule is to help reviewers ;)
History in the cover letter is good.
If there are also some history in each patch, it's better.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
       [not found] ` <5633798B.2050708@intel.com>
@ 2015-10-30 16:11   ` Jan Viktorin
  2015-10-30 16:16     ` Thomas Monjalon
  2015-10-30 16:28     ` Hunt, David
  0 siblings, 2 replies; 28+ messages in thread
From: Jan Viktorin @ 2015-10-30 16:11 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

Hmm, I see. It's good to fix this in the generated e-mails between format-patch and send-email calls. I always review those to be sure they meet my expectations ;).

Anyway, it is not clear, what has changed in the v3. Just the rte_cycles? You should explain that at least in the 0000 patch. Better to keep some history in each single commit (are there any rules in dpdk for this? Just look how they do in kernel).

I'll test the patchset in qemu anyway... so will probably send tested-by.

I've put this conversation to mailing list as I cannot see any reason why it is not CC'd there...

Jan Viktorin
RehiveTech
Sent from a mobile device
  Původní zpráva  
Od: Hunt, David
Odesláno: pátek, 30. října 2015 15:07
Komu: Jan Viktorin
Předmět: Fwd: [PATCH v3 6/6] test: add checks for cpu flags on armv8

Jan,
I had gone to the trouble of adding a "Reviewed-by" line in all the 
commit messages for each patch in the patch set, as well as addressing 
the comment about the armv8 files being in the arm dir.
However, the 'git format-patch' seems to have stripped out the
"Reviewed-by" line for some reason.
If you are happy with the latest patch set, could you reply and maybe 
say something like "series Reviewed-by..."?
Thanks for your help in this.
Regards,
Dave.



-------- Forwarded Message --------
Subject: [PATCH v3 6/6] test: add checks for cpu flags on armv8
Date: Fri, 30 Oct 2015 13:47:06 +0000
From: David Hunt <david.hunt@intel.com>
To: david.hunt@intel.com

Signed-off-by: David Hunt <david.hunt@intel.com>
---
app/test/test_cpuflags.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 557458f..1689048 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -1,4 +1,4 @@
-/*-
+/*
* BSD LICENSE
*
* Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
@@ -115,9 +115,18 @@ test_cpuflags(void)
CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
#endif

-#if defined(RTE_ARCH_ARM)
+#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64)
+	printf("Checking for Floating Point:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_FPA);
+
printf("Check for NEON:\t\t");
CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+
+	printf("Checking for ARM32 mode:\t\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH32);
+
+	printf("Checking for ARM64 mode:\t\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH64);
#endif

#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2015-11-02 17:31 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
2015-11-02  4:57   ` Jerin Jacob
2015-11-02 12:22     ` Hunt, David
2015-11-02 12:45       ` Jan Viktorin
2015-11-02 12:57       ` Jerin Jacob
2015-11-02 15:26         ` Hunt, David
2015-11-02 15:36           ` Jan Viktorin
2015-11-02 15:49             ` Hunt, David
2015-11-02 16:29               ` Jerin Jacob
2015-11-02 17:29                 ` Jan Viktorin
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
2015-11-02  5:15   ` Jerin Jacob
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
2015-11-02  4:43   ` Jerin Jacob
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt
     [not found] <1446212826-19425-7-git-send-email-david.hunt@intel.com>
     [not found] ` <5633798B.2050708@intel.com>
2015-10-30 16:11   ` Jan Viktorin
2015-10-30 16:16     ` Thomas Monjalon
2015-10-30 16:28     ` Hunt, David
2015-11-02  6:32       ` Jerin Jacob
2015-11-02 10:47         ` Hunt, David
2015-11-02 13:17           ` Jerin Jacob
2015-11-02 15:04             ` Hunt, David
2015-11-02 15:13               ` Jan Viktorin
2015-11-02 15:20                 ` Hunt, David
2015-11-02 15:24           ` Jan Viktorin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).