DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support
@ 2015-10-30 13:49 David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

This is th v3 patchset for ARMv8 that now sits on top of the v5 patch
of the ARMv7 code by RehiveTech. It adds code into the same arm include
directory, reducing code duplication.

Tested on an XGene 64-bit arm server board, with PCI slots. Passes traffic
between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should
work with many other NICS, but these are as yet untested.

Compiles igb_uio, kni and all the physical device PMDs.

An entry has been added to the Release notes.

We hope that this will encourage the ARM community to contribute PMDs
for their SoCs to DPDK.

For now, we've added some Intel engineers to the MAINTAINERS file. We would
like to encourage the ARM community to take over maintenance of this area
in future, and to further improve it.

Notes on arm64 kernel configuration:

  Using Ubuntu 14.04 LTS with a 4.3.0-rc6 kernel (with modified PCI drivers),
  and uio_pci_generic.
  ARM64 kernels do not seem to have functional resource mapping of PCI memory
  (PCI_MMAP), so the pci driver needs to be patched to enable this. The
  symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is
  missing the resource0...N files for mmapping the device memory. Earlier
  kernels (3.13.x) had these files present, but mmap'ping resulted in a
  "Bus Error" when the NIC memory was accessed.
  However, during limited testing with a modified 4.3.0-rc6 kernel, we were
  able to mmap the NIC memory, and pass traffic between the two ports on a
  82599 NIC connected via fibre cable.
  We have no plans to upstream a kernel patch for this and hope that
  someone more familiar with the arm architecture can create a proper patch
  and enable this functionality.

Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>

David Hunt (6):
  eal/arm: add 64-bit armv8 version of rte_memcpy.h
  eal/arm: add 64-bit armv8 version of rte_prefetch.h
  eal/arm: add 64-bit armv8 version of rte_cycles.h
  eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h
  mk: add support for armv8 on top of armv7
  test: add checks for cpu flags on armv8

 MAINTAINERS                                        |   3 +-
 app/test/test_cpuflags.c                           |  13 +-
 config/defconfig_arm64-armv8a-linuxapp-gcc         |  56 ++++
 doc/guides/rel_notes/release_2_2.rst               |   7 +-
 .../common/include/arch/arm/rte_cpuflags.h         |   6 +-
 .../common/include/arch/arm/rte_cycles.h           |   4 +
 .../common/include/arch/arm/rte_cycles_64.h        |  77 ++++++
 .../common/include/arch/arm/rte_memcpy.h           |   4 +
 .../common/include/arch/arm/rte_memcpy_64.h        | 308 +++++++++++++++++++++
 .../common/include/arch/arm/rte_prefetch.h         |   4 +
 .../common/include/arch/arm/rte_prefetch_64.h      |  61 ++++
 mk/arch/arm64/rte.vars.mk                          |  58 ++++
 mk/machine/armv8a/rte.vars.mk                      |  57 ++++
 13 files changed, 651 insertions(+), 7 deletions(-)
 create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
 create mode 100644 mk/arch/arm64/rte.vars.mk
 create mode 100644 mk/machine/armv8a/rte.vars.mk

-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-11-02  4:57   ` Jerin Jacob
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_memcpy.h           |   4 +
 .../common/include/arch/arm/rte_memcpy_64.h        | 308 +++++++++++++++++++++
 2 files changed, 312 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index d9f5bf1..1d562c3 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_MEMCPY_ARM_H_
 #define _RTE_MEMCPY_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_memcpy_64.h>
+#else
 #include <rte_memcpy_32.h>
+#endif
 
 #endif /* _RTE_MEMCPY_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
new file mode 100644
index 0000000..6d85113
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
@@ -0,0 +1,308 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_MEMCPY_ARM_64_H_
+#define _RTE_MEMCPY_ARM_64_H_
+
+#include <stdint.h>
+#include <string.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+#ifdef __ARM_NEON_FP
+
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP d0, d1, [%0]\n\t"
+		     "STP d0, d1, [%1]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     "LDP d0, d1, [%0 , #32]\n\t"
+		     "STP d0, d1, [%1 , #32]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     "LDP q0, q1, [%0 , #32]\n\t"
+		     "STP q0, q1, [%1 , #32]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     "LDP q0, q1, [%0 , #32]\n\t"
+		     "STP q0, q1, [%1 , #32]\n\t"
+		     "LDP q0, q1, [%0 , #64]\n\t"
+		     "STP q0, q1, [%1 , #64]\n\t"
+		     "LDP q0, q1, [%0 , #96]\n\t"
+		     "STP q0, q1, [%1 , #96]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	asm volatile("LDP q0, q1, [%0]\n\t"
+		     "STP q0, q1, [%1]\n\t"
+		     "LDP q0, q1, [%0 , #32]\n\t"
+		     "STP q0, q1, [%1 , #32]\n\t"
+		     "LDP q0, q1, [%0 , #64]\n\t"
+		     "STP q0, q1, [%1 , #64]\n\t"
+		     "LDP q0, q1, [%0 , #96]\n\t"
+		     "STP q0, q1, [%1 , #96]\n\t"
+		     "LDP q0, q1, [%0 , #128]\n\t"
+		     "STP q0, q1, [%1 , #128]\n\t"
+		     "LDP q0, q1, [%0 , #160]\n\t"
+		     "STP q0, q1, [%1 , #160]\n\t"
+		     "LDP q0, q1, [%0 , #192]\n\t"
+		     "STP q0, q1, [%1 , #192]\n\t"
+		     "LDP q0, q1, [%0 , #224]\n\t"
+		     "STP q0, q1, [%1 , #224]\n\t"
+		     : : "r" (src), "r" (dst) :
+	);
+}
+
+#define rte_memcpy(dst, src, n)              \
+	({ (__builtin_constant_p(n)) ?       \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)); })
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08)
+			*(uint64_t *)dst = *(const uint64_t *)src;
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n,
+			(const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n,
+			(const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		break;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		break;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0)
+		rte_mov16((uint8_t *)dst - 16 + n,
+			(const uint8_t *)src - 16 + n);
+	return ret;
+}
+
+#else
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 256);
+}
+
+static inline void *
+rte_memcpy(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+#endif /* __ARM_NEON_FP */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_64_H_ */
-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_prefetch.h         |  4 ++
 .../common/include/arch/arm/rte_prefetch_64.h      | 61 ++++++++++++++++++++++
 2 files changed, 65 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
index 1f46697..aa37de5 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_PREFETCH_ARM_H_
 #define _RTE_PREFETCH_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_prefetch_64.h>
+#else
 #include <rte_prefetch_32.h>
+#endif
 
 #endif /* _RTE_PREFETCH_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
new file mode 100644
index 0000000..b0d9170
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
@@ -0,0 +1,61 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_64_H_
+#define _RTE_PREFETCH_ARM_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+/* May want to add PSTL1KEEP instructions for prefetch for ownership. */
+static inline void rte_prefetch0(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL1KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL2KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL3KEEP, [%0]" : : "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_64_H_ */
-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-11-02  5:15   ` Jerin Jacob
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 .../common/include/arch/arm/rte_cycles.h           |  4 ++
 .../common/include/arch/arm/rte_cycles_64.h        | 77 ++++++++++++++++++++++
 2 files changed, 81 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index b2372fa..a8009a0 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_CYCLES_ARM_H_
 #define _RTE_CYCLES_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_cycles_64.h>
+#else
 #include <rte_cycles_32.h>
+#endif
 
 #endif /* _RTE_CYCLES_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
new file mode 100644
index 0000000..148b9f4
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
@@ -0,0 +1,77 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_CYCLES_ARM64_H_
+#define _RTE_CYCLES_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ *   The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	uint64_t tsc;
+
+	asm volatile("mrs %0, CNTVCT_EL0" : "=r" (tsc));
+
+#ifdef RTE_TIMER_MULTIPLIER
+	return tsc * RTE_TIMER_MULTIPLIER;
+#else
+	return tsc;
+#endif
+
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	asm volatile("isb sy" :::);
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM64_H_ */
-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
                   ` (2 preceding siblings ...)
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt
  5 siblings, 0 replies; 18+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 7ce9d14..5c5fd6a 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -141,12 +141,16 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
 	__attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
 {
 	int auxv_fd;
+#ifdef RTE_ARCH_64
+	Elf64_auxv_t auxv;
+#else
 	Elf32_auxv_t auxv;
+#endif
 
 	auxv_fd = open("/proc/self/auxv", O_RDONLY);
 	assert(auxv_fd);
 	while (read(auxv_fd, &auxv,
-		sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+		sizeof(auxv)) == sizeof(auxv)) {
 		if (auxv.a_type == AT_HWCAP)
 			out[REG_HWCAP] = auxv.a_un.a_val;
 		else if (auxv.a_type == AT_HWCAP2)
-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
                   ` (3 preceding siblings ...)
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt
@ 2015-10-30 13:49 ` David Hunt
  2015-11-02  4:43   ` Jerin Jacob
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt
  5 siblings, 1 reply; 18+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

The ARMv8 include files are in the arm directory in
lib/librte_eal/common/include/arch/arm/ with the ARMv7 include files

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 MAINTAINERS                                |  3 +-
 config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++
 doc/guides/rel_notes/release_2_2.rst       |  7 ++--
 mk/arch/arm64/rte.vars.mk                  | 58 ++++++++++++++++++++++++++++++
 mk/machine/armv8a/rte.vars.mk              | 57 +++++++++++++++++++++++++++++
 5 files changed, 177 insertions(+), 4 deletions(-)
 create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
 create mode 100644 mk/arch/arm64/rte.vars.mk
 create mode 100644 mk/machine/armv8a/rte.vars.mk

diff --git a/MAINTAINERS b/MAINTAINERS
index a8933eb..4569f13 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -124,8 +124,9 @@ IBM POWER
 M: Chao Zhu <chaozhu@linux.vnet.ibm.com>
 F: lib/librte_eal/common/include/arch/ppc_64/
 
-ARM v7
+ARM
 M: Jan Viktorin <viktorin@rehivetech.com>
+M: David Hunt <david.hunt@intel.com>
 F: lib/librte_eal/common/include/arch/arm/
 
 Intel x86
diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc
new file mode 100644
index 0000000..79a9533
--- /dev/null
+++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv8a"
+
+CONFIG_RTE_ARCH="arm64"
+CONFIG_RTE_ARCH_ARM64=y
+CONFIG_RTE_ARCH_64=y
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+CONFIG_RTE_IXGBE_INC_VECTOR=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_IVSHMEM=n
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
+
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_TABLE=n
+CONFIG_RTE_LIBRTE_PIPELINE=n
+
+# This is used to adjust the generic arm timer to align with the cpu cycle count.
+CONFIG_RTE_TIMER_MULTIPLIER=48
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index 5b5bb4c..5aa523b 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -31,10 +31,11 @@ New Features
 
 * **Added vhost-user multiple queue support.**
 
-* **Introduce ARMv7 architecture**
+* **Introduce ARMv7 and ARMv8 architectures**
 
-  It is now possible to build DPDK for the ARMv7 platform and test with
-  virtual PMD drivers.
+  * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms.
+  * ARMv7 can be tested with virtual PMD drivers.
+  * ARMv8 can be tested with virtual and physical PMD drivers.
 
 
 Resolved Issues
diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk
new file mode 100644
index 0000000..3aad712
--- /dev/null
+++ b/mk/arch/arm64/rte.vars.mk
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2015 Intel Corporation. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# arch:
+#
+#   - define ARCH variable (overridden by cmdline or by previous
+#     optional define in machine .mk)
+#   - define CROSS variable (overridden by cmdline or previous define
+#     in machine .mk)
+#   - define CPU_CFLAGS variable (overridden by cmdline or previous
+#     define in machine .mk)
+#   - define CPU_LDFLAGS variable (overridden by cmdline or previous
+#     define in machine .mk)
+#   - define CPU_ASFLAGS variable (overridden by cmdline or previous
+#     define in machine .mk)
+#   - may override any previously defined variable
+#
+# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32
+#
+
+ARCH  ?= arm64
+# common arch dir in eal headers
+ARCH_DIR := arm
+CROSS ?=
+
+CPU_CFLAGS  ?= -DRTE_CACHE_LINE_SIZE=64
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
new file mode 100644
index 0000000..b785062
--- /dev/null
+++ b/mk/machine/armv8a/rte.vars.mk
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2015 Intel Corporation. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+MACHINE_CFLAGS += -march=armv8-a
-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8
  2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
                   ` (4 preceding siblings ...)
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
@ 2015-10-30 13:49 ` David Hunt
  5 siblings, 0 replies; 18+ messages in thread
From: David Hunt @ 2015-10-30 13:49 UTC (permalink / raw)
  To: dev

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_cpuflags.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 557458f..1689048 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -1,4 +1,4 @@
-/*-
+/*
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
@@ -115,9 +115,18 @@ test_cpuflags(void)
 	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
 #endif
 
-#if defined(RTE_ARCH_ARM)
+#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64)
+	printf("Checking for Floating Point:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_FPA);
+
 	printf("Check for NEON:\t\t");
 	CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+
+	printf("Checking for ARM32 mode:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH32);
+
+	printf("Checking for ARM64 mode:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_AARCH64);
 #endif
 
 #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
@ 2015-11-02  4:43   ` Jerin Jacob
  0 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-02  4:43 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

On Fri, Oct 30, 2015 at 01:49:18PM +0000, David Hunt wrote:
> The ARMv8 include files are in the arm directory in
> lib/librte_eal/common/include/arch/arm/ with the ARMv7 include files
> 
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  MAINTAINERS                                |  3 +-
>  config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++
>  doc/guides/rel_notes/release_2_2.rst       |  7 ++--
>  mk/arch/arm64/rte.vars.mk                  | 58 ++++++++++++++++++++++++++++++
>  mk/machine/armv8a/rte.vars.mk              | 57 +++++++++++++++++++++++++++++
>  5 files changed, 177 insertions(+), 4 deletions(-)
>  create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
>  create mode 100644 mk/arch/arm64/rte.vars.mk
>  create mode 100644 mk/machine/armv8a/rte.vars.mk
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a8933eb..4569f13 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -124,8 +124,9 @@ IBM POWER
>  M: Chao Zhu <chaozhu@linux.vnet.ibm.com>
>  F: lib/librte_eal/common/include/arch/ppc_64/
>  
> -ARM v7
> +ARM
>  M: Jan Viktorin <viktorin@rehivetech.com>
> +M: David Hunt <david.hunt@intel.com>
>  F: lib/librte_eal/common/include/arch/arm/
>  
>  Intel x86
> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc
> new file mode 100644
> index 0000000..79a9533
> --- /dev/null
> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
> @@ -0,0 +1,56 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +#
> +
> +#include "common_linuxapp"
> +
> +CONFIG_RTE_MACHINE="armv8a"
> +
> +CONFIG_RTE_ARCH="arm64"
> +CONFIG_RTE_ARCH_ARM64=y
> +CONFIG_RTE_ARCH_64=y
> +CONFIG_RTE_ARCH_ARM_NEON=y
> +
> +CONFIG_RTE_TOOLCHAIN="gcc"
> +CONFIG_RTE_TOOLCHAIN_GCC=y
> +
> +CONFIG_RTE_IXGBE_INC_VECTOR=n
> +CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
> +CONFIG_RTE_LIBRTE_IVSHMEM=n
> +CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
> +
> +CONFIG_RTE_LIBRTE_LPM=n
> +CONFIG_RTE_LIBRTE_ACL=n
> +CONFIG_RTE_LIBRTE_TABLE=n
> +CONFIG_RTE_LIBRTE_PIPELINE=n
> +
> +# This is used to adjust the generic arm timer to align with the cpu cycle count.
> +CONFIG_RTE_TIMER_MULTIPLIER=48

Introducing a build-time dependency with cpu clock parameter not a good
idea. Either this parameter needs be removed or find out out the
multiplier at run-time by introducing a machine specific hook


> diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
> index 5b5bb4c..5aa523b 100644
> --- a/doc/guides/rel_notes/release_2_2.rst
> +++ b/doc/guides/rel_notes/release_2_2.rst
> @@ -31,10 +31,11 @@ New Features
>  
>  * **Added vhost-user multiple queue support.**
>  
> -* **Introduce ARMv7 architecture**
> +* **Introduce ARMv7 and ARMv8 architectures**
>  
> -  It is now possible to build DPDK for the ARMv7 platform and test with
> -  virtual PMD drivers.
> +  * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms.
> +  * ARMv7 can be tested with virtual PMD drivers.
> +  * ARMv8 can be tested with virtual and physical PMD drivers.
>  
>  
>  Resolved Issues
> diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk
> new file mode 100644
> index 0000000..3aad712
> --- /dev/null
> +++ b/mk/arch/arm64/rte.vars.mk
> @@ -0,0 +1,58 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2015 Intel Corporation. All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +#
> +# arch:
> +#
> +#   - define ARCH variable (overridden by cmdline or by previous
> +#     optional define in machine .mk)
> +#   - define CROSS variable (overridden by cmdline or previous define
> +#     in machine .mk)
> +#   - define CPU_CFLAGS variable (overridden by cmdline or previous
> +#     define in machine .mk)
> +#   - define CPU_LDFLAGS variable (overridden by cmdline or previous
> +#     define in machine .mk)
> +#   - define CPU_ASFLAGS variable (overridden by cmdline or previous
> +#     define in machine .mk)
> +#   - may override any previously defined variable
> +#
> +# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32
> +#
> +
> +ARCH  ?= arm64
> +# common arch dir in eal headers
> +ARCH_DIR := arm
> +CROSS ?=
> +
> +CPU_CFLAGS  ?= -DRTE_CACHE_LINE_SIZE=64

cache line size can be moved to MACHINE_CFLAGS as its more of machine
parameter.so that if machine has different cache line size(based on
arm64) can have new target like  defconfig_arm64-xxxxxxx-linuxapp-gcc

> +CPU_LDFLAGS ?=
> +CPU_ASFLAGS ?= -felf
> +
> +export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
> diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
> new file mode 100644
> index 0000000..b785062
> --- /dev/null
> +++ b/mk/machine/armv8a/rte.vars.mk
> @@ -0,0 +1,57 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2015 Intel Corporation. All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +#
> +# machine:
> +#
> +#   - can define ARCH variable (overridden by cmdline value)
> +#   - can define CROSS variable (overridden by cmdline value)
> +#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
> +#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
> +#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
> +#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
> +#     overrides the one defined in arch.
> +#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
> +#     overrides the one defined in arch.
> +#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
> +#     overrides the one defined in arch.
> +#   - may override any previously defined variable
> +#
> +
> +# ARCH =
> +# CROSS =
> +# MACHINE_CFLAGS =
> +# MACHINE_LDFLAGS =
> +# MACHINE_ASFLAGS =
> +# CPU_CFLAGS =
> +# CPU_LDFLAGS =
> +# CPU_ASFLAGS =
> +
> +MACHINE_CFLAGS += -march=armv8-a
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
@ 2015-11-02  4:57   ` Jerin Jacob
  2015-11-02 12:22     ` Hunt, David
  0 siblings, 1 reply; 18+ messages in thread
From: Jerin Jacob @ 2015-11-02  4:57 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  .../common/include/arch/arm/rte_memcpy.h           |   4 +
>  .../common/include/arch/arm/rte_memcpy_64.h        | 308 +++++++++++++++++++++
>  2 files changed, 312 insertions(+)
>  create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> 
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
> index d9f5bf1..1d562c3 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
> @@ -33,6 +33,10 @@
>  #ifndef _RTE_MEMCPY_ARM_H_
>  #define _RTE_MEMCPY_ARM_H_
>  
> +#ifdef RTE_ARCH_64
> +#include <rte_memcpy_64.h>
> +#else
>  #include <rte_memcpy_32.h>
> +#endif
>  
>  #endif /* _RTE_MEMCPY_ARM_H_ */
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> new file mode 100644
> index 0000000..6d85113
> --- /dev/null
> +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> @@ -0,0 +1,308 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IBM Corporation 2014.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of IBM Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +*/
> +
> +#ifndef _RTE_MEMCPY_ARM_64_H_
> +#define _RTE_MEMCPY_ARM_64_H_
> +
> +#include <stdint.h>
> +#include <string.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include "generic/rte_memcpy.h"
> +
> +#ifdef __ARM_NEON_FP

SIMD is not optional in armv8 spec.So every armv8 machine will have
SIMD instruction unlike armv7.More over LDP/STP instruction is
not part of SIMD.So this check is not required or it can
be replaced with a check that select memcpy from either libc or this specific
implementation

> +
> +/* ARM NEON Intrinsics are used to copy data */
> +#include <arm_neon.h>
> +
> +static inline void
> +rte_mov16(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP d0, d1, [%0]\n\t"
> +		     "STP d0, d1, [%1]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}

IMO, no need to hardcode registers used for the mem move(d0, d1).
Let compiler schedule the registers for better performance.


> +
> +static inline void
> +rte_mov32(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +static inline void
> +rte_mov48(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     "LDP d0, d1, [%0 , #32]\n\t"
> +		     "STP d0, d1, [%1 , #32]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +static inline void
> +rte_mov64(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     "LDP q0, q1, [%0 , #32]\n\t"
> +		     "STP q0, q1, [%1 , #32]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +static inline void
> +rte_mov128(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     "LDP q0, q1, [%0 , #32]\n\t"
> +		     "STP q0, q1, [%1 , #32]\n\t"
> +		     "LDP q0, q1, [%0 , #64]\n\t"
> +		     "STP q0, q1, [%1 , #64]\n\t"
> +		     "LDP q0, q1, [%0 , #96]\n\t"
> +		     "STP q0, q1, [%1 , #96]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +static inline void
> +rte_mov256(uint8_t *dst, const uint8_t *src)
> +{
> +	asm volatile("LDP q0, q1, [%0]\n\t"
> +		     "STP q0, q1, [%1]\n\t"
> +		     "LDP q0, q1, [%0 , #32]\n\t"
> +		     "STP q0, q1, [%1 , #32]\n\t"
> +		     "LDP q0, q1, [%0 , #64]\n\t"
> +		     "STP q0, q1, [%1 , #64]\n\t"
> +		     "LDP q0, q1, [%0 , #96]\n\t"
> +		     "STP q0, q1, [%1 , #96]\n\t"
> +		     "LDP q0, q1, [%0 , #128]\n\t"
> +		     "STP q0, q1, [%1 , #128]\n\t"
> +		     "LDP q0, q1, [%0 , #160]\n\t"
> +		     "STP q0, q1, [%1 , #160]\n\t"
> +		     "LDP q0, q1, [%0 , #192]\n\t"
> +		     "STP q0, q1, [%1 , #192]\n\t"
> +		     "LDP q0, q1, [%0 , #224]\n\t"
> +		     "STP q0, q1, [%1 , #224]\n\t"
> +		     : : "r" (src), "r" (dst) :
> +	);
> +}
> +
> +#define rte_memcpy(dst, src, n)              \
> +	({ (__builtin_constant_p(n)) ?       \
> +	memcpy((dst), (src), (n)) :          \
> +	rte_memcpy_func((dst), (src), (n)); })
> +
> +static inline void *
> +rte_memcpy_func(void *dst, const void *src, size_t n)
> +{
> +	void *ret = dst;
> +
> +	/* We can't copy < 16 bytes using XMM registers so do it manually. */
> +	if (n < 16) {
> +		if (n & 0x01) {
> +			*(uint8_t *)dst = *(const uint8_t *)src;
> +			dst = (uint8_t *)dst + 1;
> +			src = (const uint8_t *)src + 1;
> +		}
> +		if (n & 0x02) {
> +			*(uint16_t *)dst = *(const uint16_t *)src;
> +			dst = (uint16_t *)dst + 1;
> +			src = (const uint16_t *)src + 1;
> +		}
> +		if (n & 0x04) {
> +			*(uint32_t *)dst = *(const uint32_t *)src;
> +			dst = (uint32_t *)dst + 1;
> +			src = (const uint32_t *)src + 1;
> +		}
> +		if (n & 0x08)
> +			*(uint64_t *)dst = *(const uint64_t *)src;
> +		return ret;
> +	}
> +
> +	/* Special fast cases for <= 128 bytes */
> +	if (n <= 32) {
> +		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +		rte_mov16((uint8_t *)dst - 16 + n,
> +			(const uint8_t *)src - 16 + n);
> +		return ret;
> +	}
> +
> +	if (n <= 64) {
> +		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
> +		rte_mov32((uint8_t *)dst - 32 + n,
> +			(const uint8_t *)src - 32 + n);
> +		return ret;
> +	}
> +
> +	if (n <= 128) {
> +		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +		rte_mov64((uint8_t *)dst - 64 + n,
> +			(const uint8_t *)src - 64 + n);
> +		return ret;
> +	}
> +
> +	/*
> +	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
> +	 * copies was found to be faster than doing 128 and 32 byte copies as
> +	 * well.
> +	 */
> +	for ( ; n >= 256; n -= 256) {

There is room for prefetching the next cacheline based on the cache line
size.

> +		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
> +		dst = (uint8_t *)dst + 256;
> +		src = (const uint8_t *)src + 256;
> +	}
> +
> +	/*
> +	 * We split the remaining bytes (which will be less than 256) into
> +	 * 64byte (2^6) chunks.
> +	 * Using incrementing integers in the case labels of a switch statement
> +	 * enourages the compiler to use a jump table. To get incrementing
> +	 * integers, we shift the 2 relevant bits to the LSB position to first
> +	 * get decrementing integers, and then subtract.
> +	 */
> +	switch (3 - (n >> 6)) {
> +	case 0x00:
> +		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 64;
> +		dst = (uint8_t *)dst + 64;
> +		src = (const uint8_t *)src + 64;      /* fallthrough */
> +	case 0x01:
> +		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 64;
> +		dst = (uint8_t *)dst + 64;
> +		src = (const uint8_t *)src + 64;      /* fallthrough */
> +	case 0x02:
> +		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 64;
> +		dst = (uint8_t *)dst + 64;
> +		src = (const uint8_t *)src + 64;      /* fallthrough */
> +	default:
> +		break;
> +	}
> +
> +	/*
> +	 * We split the remaining bytes (which will be less than 64) into
> +	 * 16byte (2^4) chunks, using the same switch structure as above.
> +	 */
> +	switch (3 - (n >> 4)) {
> +	case 0x00:
> +		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 16;
> +		dst = (uint8_t *)dst + 16;
> +		src = (const uint8_t *)src + 16;      /* fallthrough */
> +	case 0x01:
> +		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 16;
> +		dst = (uint8_t *)dst + 16;
> +		src = (const uint8_t *)src + 16;      /* fallthrough */
> +	case 0x02:
> +		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> +		n -= 16;
> +		dst = (uint8_t *)dst + 16;
> +		src = (const uint8_t *)src + 16;      /* fallthrough */
> +	default:
> +		break;
> +	}
> +
> +	/* Copy any remaining bytes, without going beyond end of buffers */
> +	if (n != 0)
> +		rte_mov16((uint8_t *)dst - 16 + n,
> +			(const uint8_t *)src - 16 + n);
> +	return ret;
> +}
> +
> +#else
> +
> +static inline void
> +rte_mov16(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 16);
> +}
> +
> +static inline void
> +rte_mov32(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 32);
> +}
> +
> +static inline void
> +rte_mov48(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 48);
> +}
> +
> +static inline void
> +rte_mov64(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 64);
> +}
> +
> +static inline void
> +rte_mov128(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 128);
> +}
> +
> +static inline void
> +rte_mov256(uint8_t *dst, const uint8_t *src)
> +{
> +	memcpy(dst, src, 256);
> +}
> +
> +static inline void *
> +rte_memcpy(void *dst, const void *src, size_t n)
> +{
> +	return memcpy(dst, src, n);
> +}
> +
> +static inline void *
> +rte_memcpy_func(void *dst, const void *src, size_t n)
> +{
> +	return memcpy(dst, src, n);
> +}
> +
> +#endif /* __ARM_NEON_FP */
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_MEMCPY_ARM_64_H_ */
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h
  2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
@ 2015-11-02  5:15   ` Jerin Jacob
  0 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-02  5:15 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

On Fri, Oct 30, 2015 at 01:49:16PM +0000, David Hunt wrote:
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  .../common/include/arch/arm/rte_cycles.h           |  4 ++
>  .../common/include/arch/arm/rte_cycles_64.h        | 77 ++++++++++++++++++++++
>  2 files changed, 81 insertions(+)
>  create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> 
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
> index b2372fa..a8009a0 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
> @@ -33,6 +33,10 @@
>  #ifndef _RTE_CYCLES_ARM_H_
>  #define _RTE_CYCLES_ARM_H_
>  
> +#ifdef RTE_ARCH_64
> +#include <rte_cycles_64.h>
> +#else
>  #include <rte_cycles_32.h>
> +#endif
>  
>  #endif /* _RTE_CYCLES_ARM_H_ */
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> new file mode 100644
> index 0000000..148b9f4
> --- /dev/null
> +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> @@ -0,0 +1,77 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IBM Corporation 2014.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of IBM Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +*/
> +
> +#ifndef _RTE_CYCLES_ARM64_H_
> +#define _RTE_CYCLES_ARM64_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include "generic/rte_cycles.h"
> +
> +/**
> + * Read the time base register.
> + *
> + * @return
> + *   The time base for this lcore.
> + */
> +static inline uint64_t
> +rte_rdtsc(void)
> +{
> +	uint64_t tsc;
> +
> +	asm volatile("mrs %0, CNTVCT_EL0" : "=r" (tsc));
> +
> +#ifdef RTE_TIMER_MULTIPLIER
> +	return tsc * RTE_TIMER_MULTIPLIER;
> +#else
> +	return tsc;
> +#endif
> +
> +}
> +
> +static inline uint64_t
> +rte_rdtsc_precise(void)
> +{
> +	asm volatile("isb sy" :::);

IMO, it should be asm volatile("dmb ish" : : : "memory")
to represent the data memory barrier(rte_mb()).

> +	return rte_rdtsc();
> +}
> +
> +static inline uint64_t
> +rte_get_tsc_cycles(void) { return rte_rdtsc(); }
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_CYCLES_ARM64_H_ */
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02  4:57   ` Jerin Jacob
@ 2015-11-02 12:22     ` Hunt, David
  2015-11-02 12:45       ` Jan Viktorin
  2015-11-02 12:57       ` Jerin Jacob
  0 siblings, 2 replies; 18+ messages in thread
From: Hunt, David @ 2015-11-02 12:22 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

On 02/11/2015 04:57, Jerin Jacob wrote:
> On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:
>> Signed-off-by: David Hunt <david.hunt@intel.com>
--snip--
>> +#ifndef _RTE_MEMCPY_ARM_64_H_
>> +#define _RTE_MEMCPY_ARM_64_H_
>> +
>> +#include <stdint.h>
>> +#include <string.h>
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include "generic/rte_memcpy.h"
>> +
>> +#ifdef __ARM_NEON_FP
>
> SIMD is not optional in armv8 spec.So every armv8 machine will have
> SIMD instruction unlike armv7.More over LDP/STP instruction is
> not part of SIMD.So this check is not required or it can
> be replaced with a check that select memcpy from either libc or this specific
> implementation

Jerin,
    I've just benchmarked the libc version against the hand-coded 
version of the memcpy routines, and the libc wins in most cases. This 
code was just an initial attempt at optimising the memccpy's, so I feel 
that with the current benchmark results, it would better just to remove 
the assembly versions, and use the libc version for the initial release 
on ARMv8.
Then, in the future, the ARMv8 experts are free to submit an optimised 
version as a patch in the future. Does that sound reasonable to you?
Rgds,
Dave.


--snip--

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 12:22     ` Hunt, David
@ 2015-11-02 12:45       ` Jan Viktorin
  2015-11-02 12:57       ` Jerin Jacob
  1 sibling, 0 replies; 18+ messages in thread
From: Jan Viktorin @ 2015-11-02 12:45 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, 2 Nov 2015 12:22:47 +0000
"Hunt, David" <david.hunt@intel.com> wrote:

> On 02/11/2015 04:57, Jerin Jacob wrote:
> > On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:  
> >> Signed-off-by: David Hunt <david.hunt@intel.com>  
> --snip--
> >> +#ifndef _RTE_MEMCPY_ARM_64_H_
> >> +#define _RTE_MEMCPY_ARM_64_H_
> >> +
> >> +#include <stdint.h>
> >> +#include <string.h>
> >> +
> >> +#ifdef __cplusplus
> >> +extern "C" {
> >> +#endif
> >> +
> >> +#include "generic/rte_memcpy.h"
> >> +
> >> +#ifdef __ARM_NEON_FP  
> >
> > SIMD is not optional in armv8 spec.So every armv8 machine will have
> > SIMD instruction unlike armv7.More over LDP/STP instruction is
> > not part of SIMD.So this check is not required or it can
> > be replaced with a check that select memcpy from either libc or this specific
> > implementation  
> 
> Jerin,
>     I've just benchmarked the libc version against the hand-coded 
> version of the memcpy routines, and the libc wins in most cases. This 
> code was just an initial attempt at optimising the memccpy's, so I feel 
> that with the current benchmark results, it would better just to remove 
> the assembly versions, and use the libc version for the initial release 
> on ARMv8.
> Then, in the future, the ARMv8 experts are free to submit an optimised 
> version as a patch in the future. Does that sound reasonable to you?
> Rgds,
> Dave.

As there is no use of NEON in the code, this optimization seems to be
useless to me...

Jan

> 
> 
> --snip--
> 
> 
> 



-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 12:22     ` Hunt, David
  2015-11-02 12:45       ` Jan Viktorin
@ 2015-11-02 12:57       ` Jerin Jacob
  2015-11-02 15:26         ` Hunt, David
  1 sibling, 1 reply; 18+ messages in thread
From: Jerin Jacob @ 2015-11-02 12:57 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote:
> On 02/11/2015 04:57, Jerin Jacob wrote:
> >On Fri, Oct 30, 2015 at 01:49:14PM +0000, David Hunt wrote:
> >>Signed-off-by: David Hunt <david.hunt@intel.com>
> --snip--
> >>+#ifndef _RTE_MEMCPY_ARM_64_H_
> >>+#define _RTE_MEMCPY_ARM_64_H_
> >>+
> >>+#include <stdint.h>
> >>+#include <string.h>
> >>+
> >>+#ifdef __cplusplus
> >>+extern "C" {
> >>+#endif
> >>+
> >>+#include "generic/rte_memcpy.h"
> >>+
> >>+#ifdef __ARM_NEON_FP
> >
> >SIMD is not optional in armv8 spec.So every armv8 machine will have
> >SIMD instruction unlike armv7.More over LDP/STP instruction is
> >not part of SIMD.So this check is not required or it can
> >be replaced with a check that select memcpy from either libc or this specific
> >implementation
> 
> Jerin,
>    I've just benchmarked the libc version against the hand-coded version of
> the memcpy routines, and the libc wins in most cases. This code was just an
> initial attempt at optimising the memccpy's, so I feel that with the current
> benchmark results, it would better just to remove the assembly versions, and
> use the libc version for the initial release on ARMv8.
> Then, in the future, the ARMv8 experts are free to submit an optimised
> version as a patch in the future. Does that sound reasonable to you?

Make sense. Based on my understanding, other blocks are also not optimized 
for arm64.
So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and
libc for initial version.

BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and
"byteorder_autotest" is broken. I think existing arm64 code is not optimized
beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified
CONFIG_RTE_FORCE_INTRINSICS scheme.

if you guys are OK with arm and arm64 as two different platform then
I can summit the complete working patch for arm64.(as in my current source
code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/)


> Rgds,
> Dave.
> 
> 
> --snip--
> 
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 12:57       ` Jerin Jacob
@ 2015-11-02 15:26         ` Hunt, David
  2015-11-02 15:36           ` Jan Viktorin
  0 siblings, 1 reply; 18+ messages in thread
From: Hunt, David @ 2015-11-02 15:26 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

On 02/11/2015 12:57, Jerin Jacob wrote:
> On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote:
>> Jerin,
>>     I've just benchmarked the libc version against the hand-coded version of
>> the memcpy routines, and the libc wins in most cases. This code was just an
>> initial attempt at optimising the memccpy's, so I feel that with the current
>> benchmark results, it would better just to remove the assembly versions, and
>> use the libc version for the initial release on ARMv8.
>> Then, in the future, the ARMv8 experts are free to submit an optimised
>> version as a patch in the future. Does that sound reasonable to you?
>
> Make sense. Based on my understanding, other blocks are also not optimized
> for arm64.
> So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and
> libc for initial version.
>
> BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and
> "byteorder_autotest" is broken. I think existing arm64 code is not optimized
> beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified
> CONFIG_RTE_FORCE_INTRINSICS scheme.

Agreed.

> if you guys are OK with arm and arm64 as two different platform then
> I can summit the complete working patch for arm64.(as in my current source
> code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/)

Sure. That would be great. We initially started with two ARMv7 
patch-sets, and Jan merged into one. Something similar could happen for 
the ARMv8 patch set. We just want to end up with the best implementation 
possible. :)

Dave.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 15:26         ` Hunt, David
@ 2015-11-02 15:36           ` Jan Viktorin
  2015-11-02 15:49             ` Hunt, David
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Viktorin @ 2015-11-02 15:36 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, 2 Nov 2015 15:26:19 +0000
"Hunt, David" <david.hunt@intel.com> wrote:

> On 02/11/2015 12:57, Jerin Jacob wrote:
> > On Mon, Nov 02, 2015 at 12:22:47PM +0000, Hunt, David wrote:  
> >> Jerin,
> >>     I've just benchmarked the libc version against the hand-coded version of
> >> the memcpy routines, and the libc wins in most cases. This code was just an
> >> initial attempt at optimising the memccpy's, so I feel that with the current
> >> benchmark results, it would better just to remove the assembly versions, and
> >> use the libc version for the initial release on ARMv8.
> >> Then, in the future, the ARMv8 experts are free to submit an optimised
> >> version as a patch in the future. Does that sound reasonable to you?  
> >
> > Make sense. Based on my understanding, other blocks are also not optimized
> > for arm64.
> > So better to revert back to CONFIG_RTE_FORCE_INTRINSICS and
> > libc for initial version.
> >
> > BTW: I just tested ./arm64-armv8a-linuxapp-gcc/app/test and
> > "byteorder_autotest" is broken. I think existing arm64 code is not optimized
> > beyond CONFIG_RTE_FORCE_INTRINSICS. So better to use verified
> > CONFIG_RTE_FORCE_INTRINSICS scheme.  
> 
> Agreed.
> 
> > if you guys are OK with arm and arm64 as two different platform then
> > I can summit the complete working patch for arm64.(as in my current source
> > code "arm64" is a different platform(lib/librte_eal/common/include/arch/arm64/)  
> 
> Sure. That would be great. We initially started with two ARMv7 
> patch-sets, and Jan merged into one. Something similar could happen for 
> the ARMv8 patch set. We just want to end up with the best implementation 
> possible. :)
> 

It was looking like we can share a lot of common code for both
architectures. I didn't know how much different are the cpuflags.

IMHO, it'd be better to have two directories arm and arm64. I thought
to refer from arm64 to arm where possible. But I don't know whether is
this possible with the DPDK build system.

Jan

> Dave.
> 
> 
> 
> 



-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 15:36           ` Jan Viktorin
@ 2015-11-02 15:49             ` Hunt, David
  2015-11-02 16:29               ` Jerin Jacob
  0 siblings, 1 reply; 18+ messages in thread
From: Hunt, David @ 2015-11-02 15:49 UTC (permalink / raw)
  To: Jan Viktorin; +Cc: dev

On 02/11/2015 15:36, Jan Viktorin wrote:
> On Mon, 2 Nov 2015 15:26:19 +0000
--snip--
> It was looking like we can share a lot of common code for both
> architectures. I didn't know how much different are the cpuflags.

CPU flags for ARMv8 are looking like this now. Quite different to the 
ARMv7 ones.

static const struct feature_entry cpu_feature_table[] = {
         FEAT_DEF(FP,        0x00000001, 0, REG_HWCAP,  0)
         FEAT_DEF(ASIMD,     0x00000001, 0, REG_HWCAP,  1)
         FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  2)
         FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP,  3)
         FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP,  4)
         FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP,  5)
         FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP,  6)
         FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP,  7)
         FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
         FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
};

> IMHO, it'd be better to have two directories arm and arm64. I thought
> to refer from arm64 to arm where possible. But I don't know whether is
> this possible with the DPDK build system.

I think both methodologies have their pros and cons. However, I'd lean 
towards the common directory with the "filename_32/64.h" scheme, as that 
similar to the x86 methodology, and we don't need to tweak the include 
paths to pull files from multiple directories.

Dave

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 15:49             ` Hunt, David
@ 2015-11-02 16:29               ` Jerin Jacob
  2015-11-02 17:29                 ` Jan Viktorin
  0 siblings, 1 reply; 18+ messages in thread
From: Jerin Jacob @ 2015-11-02 16:29 UTC (permalink / raw)
  To: Hunt, David; +Cc: dev

On Mon, Nov 02, 2015 at 03:49:17PM +0000, Hunt, David wrote:
> On 02/11/2015 15:36, Jan Viktorin wrote:
> >On Mon, 2 Nov 2015 15:26:19 +0000
> --snip--
> >It was looking like we can share a lot of common code for both
> >architectures. I didn't know how much different are the cpuflags.
> 
> CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7
> ones.
> 
> static const struct feature_entry cpu_feature_table[] = {
>         FEAT_DEF(FP,        0x00000001, 0, REG_HWCAP,  0)
>         FEAT_DEF(ASIMD,     0x00000001, 0, REG_HWCAP,  1)
>         FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  2)
>         FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP,  3)
>         FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP,  4)
>         FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP,  5)
>         FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP,  6)
>         FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP,  7)
>         FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
>         FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
> };
> 
> >IMHO, it'd be better to have two directories arm and arm64. I thought
> >to refer from arm64 to arm where possible. But I don't know whether is
> >this possible with the DPDK build system.
> 
> I think both methodologies have their pros and cons. However, I'd lean
> towards the common directory with the "filename_32/64.h" scheme, as that
> similar to the x86 methodology, and we don't need to tweak the include paths
> to pull files from multiple directories.
> 

I agree. Jan, could you please send the next version with
filename_32/64.h for atomic and cpuflags(ie for all header files).
I can re-base and send the complete arm64 patch based on your version.

Thanks,
Jerin



> Dave
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h
  2015-11-02 16:29               ` Jerin Jacob
@ 2015-11-02 17:29                 ` Jan Viktorin
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Viktorin @ 2015-11-02 17:29 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

On Mon, 2 Nov 2015 21:59:12 +0530
Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:

> On Mon, Nov 02, 2015 at 03:49:17PM +0000, Hunt, David wrote:
> > On 02/11/2015 15:36, Jan Viktorin wrote:  
> > >On Mon, 2 Nov 2015 15:26:19 +0000  
> > --snip--  
> > >It was looking like we can share a lot of common code for both
> > >architectures. I didn't know how much different are the cpuflags.  
> > 
> > CPU flags for ARMv8 are looking like this now. Quite different to the ARMv7
> > ones.
> > 
> > static const struct feature_entry cpu_feature_table[] = {
> >         FEAT_DEF(FP,        0x00000001, 0, REG_HWCAP,  0)
> >         FEAT_DEF(ASIMD,     0x00000001, 0, REG_HWCAP,  1)
> >         FEAT_DEF(EVTSTRM,   0x00000001, 0, REG_HWCAP,  2)
> >         FEAT_DEF(AES,       0x00000001, 0, REG_HWCAP,  3)
> >         FEAT_DEF(PMULL,     0x00000001, 0, REG_HWCAP,  4)
> >         FEAT_DEF(SHA1,      0x00000001, 0, REG_HWCAP,  5)
> >         FEAT_DEF(SHA2,      0x00000001, 0, REG_HWCAP,  6)
> >         FEAT_DEF(CRC32,     0x00000001, 0, REG_HWCAP,  7)
> >         FEAT_DEF(AARCH32,   0x00000001, 0, REG_PLATFORM, 0)
> >         FEAT_DEF(AARCH64,   0x00000001, 0, REG_PLATFORM, 1)
> > };
> >   
> > >IMHO, it'd be better to have two directories arm and arm64. I thought
> > >to refer from arm64 to arm where possible. But I don't know whether is
> > >this possible with the DPDK build system.  
> > 
> > I think both methodologies have their pros and cons. However, I'd lean
> > towards the common directory with the "filename_32/64.h" scheme, as that
> > similar to the x86 methodology, and we don't need to tweak the include paths
> > to pull files from multiple directories.
> >   
> 
> I agree. Jan, could you please send the next version with
> filename_32/64.h for atomic and cpuflags(ie for all header files).
> I can re-base and send the complete arm64 patch based on your version.
> 

I am working on it, however, after I've removed the unnecessary
intrinsics code and set the RTE_FORCE_INTRINSICS=y, it doesn't
build... So I'm figuring out what is wrong.

Jan

> Thanks,
> Jerin
> 
> 
> 
> > Dave
> >   



-- 
   Jan Viktorin                  E-mail: Viktorin@RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-11-02 17:31 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-30 13:49 [dpdk-dev] [PATCH v3 0/6] ARMv8 additions to ARMv7 support David Hunt
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 1/6] eal/arm: add 64-bit armv8 version of rte_memcpy.h David Hunt
2015-11-02  4:57   ` Jerin Jacob
2015-11-02 12:22     ` Hunt, David
2015-11-02 12:45       ` Jan Viktorin
2015-11-02 12:57       ` Jerin Jacob
2015-11-02 15:26         ` Hunt, David
2015-11-02 15:36           ` Jan Viktorin
2015-11-02 15:49             ` Hunt, David
2015-11-02 16:29               ` Jerin Jacob
2015-11-02 17:29                 ` Jan Viktorin
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 2/6] eal/arm: add 64-bit armv8 version of rte_prefetch.h David Hunt
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 3/6] eal/arm: add 64-bit armv8 version of rte_cycles.h David Hunt
2015-11-02  5:15   ` Jerin Jacob
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 4/6] eal/arm: fix 64-bit armv8 compilation of rte_cpuflags.h David Hunt
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 5/6] mk: add support for armv8 on top of armv7 David Hunt
2015-11-02  4:43   ` Jerin Jacob
2015-10-30 13:49 ` [dpdk-dev] [PATCH v3 6/6] test: add checks for cpu flags on armv8 David Hunt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).