DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev]  [PATCH v3 00/14] DPDK armv8-a support
@ 2015-11-06  9:40 Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 01/14] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob
                   ` (14 more replies)
  0 siblings, 15 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

This is the v3 patchset for ARMv8 that now sits on top of the v6 patch
(based on upstream 82fb702077f67585d64a07de0080e5cb6a924a72)
of the ARMv7 code by RehiveTech. It adds code into the same arm include
directory, reducing code duplication.

Tested on an ThunderX arm 64-bit arm server board, with PCI slots. Passes traffic
between two physical ports on an Intel 82599 dual-port 10Gig NIC. Should
work with many other NICS as long as there is no unaligned access to
device memory but not yet untested.

Compiles igb_uio, kni and all the physical device PMDs.

An entry has been added to the Release notes.

v1..v2

1) included  "Introduce rte_smp_*mb() for memory barriers to use between lcores"
patch ACK by Konstantin in v2 as
"eal: introduce rte_smp_*mb() for memory barriers to use between lcores"
please superseded the original one

2) defined rte_smp_mb(), rte_smp_wmb(), rte_smp_rmb() for arm
(1) and (2) fixes "Mbuf autotest" stress failure found in version 1

3) fixed white space issues with patch 8,9,10

4) disabled  CONFIG_RTE_LIBRTE_FM10K_PMD, CONFIG_RTE_LIBRTE_I40E_PMD
due to tmmintrin.h depenency or ARM

v2..v3

1) moved disabled  CONFIG_RTE_LIBRTE_FM10K_PMD, CONFIG_RTE_LIBRTE_I40E_PMD
to "mk: add support for armv8 on top of armv7" and
"mk: add support for thunderx machine target based on armv8-a"
to fix compilation issue per patch


NOTE:
Part of the work has been taken from David Hunt's v3 patch who was
initiated the armv8 port.


Notes on arm64 kernel configuration:

  Tested on using Ubuntu 14.04 LTS with a 3.18 kernel and igb_uio.
  ARM64 kernels does not have functional resource mapping of PCI memory
  (PCI_MMAP), so the pci driver needs to be patched to enable this. The
  symptom of this is when /sys/bus/pci/devices/0000:0X:00.Y directory is
  missing the resource0...N files for mmapping the device memory.

  Following patch fixes the PCI resource mapping issue om armv8.
  Its not yet up streamed.We are in the process of up streaming it.

  http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358906.html


Jerin Jacob (14):
  eal: arm64: add armv8-a version of rte_atomic_64.h
  eal: arm64: add armv8-a version of rte_cpuflags_64.h
  eal: arm64: add armv8-a version of rte_prefetch_64.h
  eal: arm64: add armv8-a version of rte_cycles_64.h
  eal: arm64: rte_memcpy_64.h version based on libc memcpy
  eal: arm: ret_vector.h improvements
  app: test_cpuflags: test the new cpu flags added for arm64
  acl: arm64: acl implementation using NEON gcc intrinsic
  mk: add support for armv8 on top of armv7
  mk: add support for thunderx machine target based on armv8-a
  updated release note for armv8 support for DPDK 2.2
  maintainers: claim responsibility for ARMv8
  eal: introduce rte_smp_*mb() for memory barriers to use between lcores
  eal: arm: define rte_smp_mb(), rte_smp_wmb(), rte_smp_rmb() for arm

 MAINTAINERS                                        |   5 +
 app/test-acl/main.c                                |   4 +
 app/test/test_cpuflags.c                           |  26 ++
 config/defconfig_arm64-armv8a-linuxapp-gcc         |  56 ++++
 config/defconfig_arm64-thunderx-linuxapp-gcc       |  56 ++++
 doc/guides/rel_notes/release_2_2.rst               |   7 +-
 drivers/net/virtio/virtqueue.h                     |   8 +-
 drivers/net/xenvirt/rte_eth_xenvirt.c              |   4 +-
 drivers/net/xenvirt/virtqueue.h                    |   2 +-
 lib/librte_acl/Makefile                            |   5 +
 lib/librte_acl/acl.h                               |   4 +
 lib/librte_acl/acl_run_neon.c                      |  46 ++++
 lib/librte_acl/acl_run_neon.h                      | 289 +++++++++++++++++++++
 lib/librte_acl/rte_acl.c                           |  25 ++
 lib/librte_acl/rte_acl.h                           |   1 +
 .../common/include/arch/arm/rte_atomic.h           |  10 +
 .../common/include/arch/arm/rte_atomic_64.h        |  88 +++++++
 .../common/include/arch/arm/rte_cpuflags.h         |   4 +
 .../common/include/arch/arm/rte_cpuflags_64.h      | 152 +++++++++++
 .../common/include/arch/arm/rte_cycles.h           |   4 +
 .../common/include/arch/arm/rte_cycles_64.h        |  71 +++++
 .../common/include/arch/arm/rte_memcpy.h           |   4 +
 .../common/include/arch/arm/rte_memcpy_64.h        |  93 +++++++
 .../common/include/arch/arm/rte_prefetch.h         |   4 +
 .../common/include/arch/arm/rte_prefetch_64.h      |  61 +++++
 lib/librte_eal/common/include/arch/arm/rte_vect.h  |  54 ++--
 .../common/include/arch/ppc_64/rte_atomic.h        |   6 +
 .../common/include/arch/tile/rte_atomic.h          |   6 +
 .../common/include/arch/x86/rte_atomic.h           |   6 +
 lib/librte_eal/common/include/generic/rte_atomic.h |  27 ++
 lib/librte_ring/rte_ring.h                         |   8 +-
 mk/arch/arm64/rte.vars.mk                          |  58 +++++
 mk/machine/armv8a/rte.vars.mk                      |  58 +++++
 mk/machine/thunderx/rte.vars.mk                    |  58 +++++
 34 files changed, 1256 insertions(+), 54 deletions(-)
 create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
 create mode 100644 config/defconfig_arm64-thunderx-linuxapp-gcc
 create mode 100644 lib/librte_acl/acl_run_neon.c
 create mode 100644 lib/librte_acl/acl_run_neon.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
 create mode 100644 mk/arch/arm64/rte.vars.mk
 create mode 100644 mk/machine/armv8a/rte.vars.mk
 create mode 100644 mk/machine/thunderx/rte.vars.mk

-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 01/14] eal: arm64: add armv8-a version of rte_atomic_64.h
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 02/14] eal: arm64: add armv8-a version of rte_cpuflags_64.h Jerin Jacob
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

except rte_?wb() functions other functions are used from
RTE_FORCE_INTRINSICS=y scheme

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 .../common/include/arch/arm/rte_atomic.h           |  4 +
 .../common/include/arch/arm/rte_atomic_64.h        | 88 ++++++++++++++++++++++
 2 files changed, 92 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
index f4f5783..f3f3b6e 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_ATOMIC_ARM_H_
 #define _RTE_ATOMIC_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_atomic_64.h>
+#else
 #include <rte_atomic_32.h>
+#endif
 
 #endif /* _RTE_ATOMIC_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
new file mode 100644
index 0000000..671caa7
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -0,0 +1,88 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) Cavium networks Ltd. 2015.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_ATOMIC_ARM64_H_
+#define _RTE_ATOMIC_ARM64_H_
+
+#ifndef RTE_FORCE_INTRINSICS
+#  error Platform must be built with CONFIG_RTE_FORCE_INTRINSICS
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+#define dmb(opt)  do { asm volatile("dmb " #opt : : : "memory"); } while (0)
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ * This function is architecture dependent.
+ */
+static inline void rte_mb(void)
+{
+	dmb(ish);
+}
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ * This function is architecture dependent.
+ */
+static inline void rte_wmb(void)
+{
+	dmb(ishst);
+}
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ * This function is architecture dependent.
+ */
+static inline void rte_rmb(void)
+{
+	dmb(ishld);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_ARM64_H_ */
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 02/14] eal: arm64: add armv8-a version of rte_cpuflags_64.h
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 01/14] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 03/14] eal: arm64: add armv8-a version of rte_prefetch_64.h Jerin Jacob
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 .../common/include/arch/arm/rte_cpuflags.h         |   4 +
 .../common/include/arch/arm/rte_cpuflags_64.h      | 152 +++++++++++++++++++++
 2 files changed, 156 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 8de78d2..b8f6288 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_CPUFLAGS_ARM_H_
 #define _RTE_CPUFLAGS_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_cpuflags_64.h>
+#else
 #include <rte_cpuflags_32.h>
+#endif
 
 #endif /* _RTE_CPUFLAGS_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h
new file mode 100644
index 0000000..7bcc12f
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h
@@ -0,0 +1,152 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) Cavium networks Ltd. 2015.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_ARM64_H_
+#define _RTE_CPUFLAGS_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <elf.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "generic/rte_cpuflags.h"
+
+#ifndef AT_HWCAP
+#define AT_HWCAP 16
+#endif
+
+#ifndef AT_HWCAP2
+#define AT_HWCAP2 26
+#endif
+
+#ifndef AT_PLATFORM
+#define AT_PLATFORM 15
+#endif
+
+/* software based registers */
+enum cpu_register_t {
+	REG_HWCAP = 0,
+	REG_HWCAP2,
+	REG_PLATFORM,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+	RTE_CPUFLAG_FP = 0,
+	RTE_CPUFLAG_NEON,
+	RTE_CPUFLAG_EVTSTRM,
+	RTE_CPUFLAG_AES,
+	RTE_CPUFLAG_PMULL,
+	RTE_CPUFLAG_SHA1,
+	RTE_CPUFLAG_SHA2,
+	RTE_CPUFLAG_CRC32,
+	RTE_CPUFLAG_AARCH64,
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(FP,		0x00000001, 0, REG_HWCAP,  0)
+	FEAT_DEF(NEON,		0x00000001, 0, REG_HWCAP,  1)
+	FEAT_DEF(EVTSTRM,	0x00000001, 0, REG_HWCAP,  2)
+	FEAT_DEF(AES,		0x00000001, 0, REG_HWCAP,  3)
+	FEAT_DEF(PMULL,		0x00000001, 0, REG_HWCAP,  4)
+	FEAT_DEF(SHA1,		0x00000001, 0, REG_HWCAP,  5)
+	FEAT_DEF(SHA2,		0x00000001, 0, REG_HWCAP,  6)
+	FEAT_DEF(CRC32,		0x00000001, 0, REG_HWCAP,  7)
+	FEAT_DEF(AARCH64,	0x00000001, 0, REG_PLATFORM, 1)
+};
+
+/*
+ * Read AUXV software register and get cpu features for ARM
+ */
+static inline void
+rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
+		     __attribute__((unused)) uint32_t subleaf,
+		     cpuid_registers_t out)
+{
+	int auxv_fd;
+	Elf64_auxv_t auxv;
+
+	auxv_fd = open("/proc/self/auxv", O_RDONLY);
+	assert(auxv_fd);
+	while (read(auxv_fd, &auxv,
+		    sizeof(Elf64_auxv_t)) == sizeof(Elf64_auxv_t)) {
+		if (auxv.a_type == AT_HWCAP) {
+			out[REG_HWCAP] = auxv.a_un.a_val;
+		} else if (auxv.a_type == AT_HWCAP2) {
+			out[REG_HWCAP2] = auxv.a_un.a_val;
+		} else if (auxv.a_type == AT_PLATFORM) {
+			if (!strcmp((const char *)auxv.a_un.a_val, "aarch64"))
+				out[REG_PLATFORM] = 0x0001;
+		}
+	}
+}
+
+/*
+ * Checks if a particular flag is available on current machine.
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs = {0};
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_ARM64_H_ */
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 03/14] eal: arm64: add armv8-a version of rte_prefetch_64.h
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 01/14] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 02/14] eal: arm64: add armv8-a version of rte_cpuflags_64.h Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 04/14] eal: arm64: add armv8-a version of rte_cycles_64.h Jerin Jacob
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 .../common/include/arch/arm/rte_prefetch.h         |  4 ++
 .../common/include/arch/arm/rte_prefetch_64.h      | 61 ++++++++++++++++++++++
 2 files changed, 65 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
index 1f46697..aa37de5 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_PREFETCH_ARM_H_
 #define _RTE_PREFETCH_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_prefetch_64.h>
+#else
 #include <rte_prefetch_32.h>
+#endif
 
 #endif /* _RTE_PREFETCH_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
new file mode 100644
index 0000000..f9cc62e
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h
@@ -0,0 +1,61 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) Cavium networks Ltd. 2015.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_64_H_
+#define _RTE_PREFETCH_ARM_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL1KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL2KEEP, [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+	asm volatile ("PRFM PLDL3KEEP, [%0]" : : "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_64_H_ */
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 04/14] eal: arm64: add armv8-a version of rte_cycles_64.h
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (2 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 03/14] eal: arm64: add armv8-a version of rte_prefetch_64.h Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 05/14] eal: arm64: rte_memcpy_64.h version based on libc memcpy Jerin Jacob
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

cntcvt_el0 ticks are not based on cpu clk unlike rdtsc in x86.
Its a fixed clock running based at constant speed.
Though its a armv8-a implementer choice, typically it runs at 50 or 100 MHz

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 .../common/include/arch/arm/rte_cycles.h           |  4 ++
 .../common/include/arch/arm/rte_cycles_64.h        | 71 ++++++++++++++++++++++
 2 files changed, 75 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index b2372fa..a8009a0 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_CYCLES_ARM_H_
 #define _RTE_CYCLES_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_cycles_64.h>
+#else
 #include <rte_cycles_32.h>
+#endif
 
 #endif /* _RTE_CYCLES_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
new file mode 100644
index 0000000..14f2612
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
@@ -0,0 +1,71 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) Cavium networks Ltd. 2015.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_CYCLES_ARM64_H_
+#define _RTE_CYCLES_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ *   The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	uint64_t tsc;
+
+	asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
+	return tsc;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM64_H_ */
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 05/14] eal: arm64: rte_memcpy_64.h version based on libc memcpy
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (3 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 04/14] eal: arm64: add armv8-a version of rte_cycles_64.h Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 06/14] eal: arm: ret_vector.h improvements Jerin Jacob
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 .../common/include/arch/arm/rte_memcpy.h           |  4 +
 .../common/include/arch/arm/rte_memcpy_64.h        | 93 ++++++++++++++++++++++
 2 files changed, 97 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index d9f5bf1..1d562c3 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -33,6 +33,10 @@
 #ifndef _RTE_MEMCPY_ARM_H_
 #define _RTE_MEMCPY_ARM_H_
 
+#ifdef RTE_ARCH_64
+#include <rte_memcpy_64.h>
+#else
 #include <rte_memcpy_32.h>
+#endif
 
 #endif /* _RTE_MEMCPY_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
new file mode 100644
index 0000000..917cdc1
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
@@ -0,0 +1,93 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) Cavium networks Ltd. 2015.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_MEMCPY_ARM64_H_
+#define _RTE_MEMCPY_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <string.h>
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	memcpy(dst, src, 256);
+}
+
+#define rte_memcpy(d, s, n)	memcpy((d), (s), (n))
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	return memcpy(dst, src, n);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_64_H_ */
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev]  [PATCH v3 06/14] eal: arm: ret_vector.h improvements
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (4 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 05/14] eal: arm64: rte_memcpy_64.h version based on libc memcpy Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-18 18:56   ` Thomas Monjalon
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 07/14] app: test_cpuflags: test the new cpu flags added for arm64 Jerin Jacob
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

added the definition of rte_xmm and xmm_t for acl noen implementation.
removed the emulated _mm_* functions

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 lib/librte_eal/common/include/arch/arm/rte_vect.h | 54 +++++++----------------
 1 file changed, 15 insertions(+), 39 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
index 7d5de97..21cdb4d 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *   Copyright(c) 2015 Cavium Networks. All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
  *   modification, are permitted provided that the following conditions
@@ -13,7 +13,7 @@
  *       notice, this list of conditions and the following disclaimer in
  *       the documentation and/or other materials provided with the
  *       distribution.
- *     * Neither the name of RehiveTech nor the names of its
+ *     * Neither the name of Cavium Networks nor the names of its
  *       contributors may be used to endorse or promote products derived
  *       from this software without specific prior written permission.
  *
@@ -33,49 +33,25 @@
 #ifndef _RTE_VECT_ARM_H_
 #define _RTE_VECT_ARM_H_
 
+#include "arm_neon.h"
+
 #ifdef __cplusplus
 extern "C" {
 #endif
 
-#define XMM_SIZE 16
-#define XMM_MASK (XMM_MASK - 1)
-
-typedef struct {
-	union uint128 {
-		uint8_t uint8[16];
-		uint32_t uint32[4];
-	} val;
-} __m128i;
-
-static inline __m128i
-_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
-{
-	__m128i res;
-
-	res.val.uint32[0] = v0;
-	res.val.uint32[1] = v1;
-	res.val.uint32[2] = v2;
-	res.val.uint32[3] = v3;
-	return res;
-}
+typedef int32x4_t xmm_t;
 
-static inline __m128i
-_mm_loadu_si128(__m128i *v)
-{
-	__m128i res;
+#define	XMM_SIZE	(sizeof(xmm_t))
+#define	XMM_MASK	(XMM_SIZE - 1)
 
-	res = *v;
-	return res;
-}
-
-static inline __m128i
-_mm_load_si128(__m128i *v)
-{
-	__m128i res;
-
-	res = *v;
-	return res;
-}
+typedef union rte_xmm {
+	xmm_t    x;
+	uint8_t  u8[XMM_SIZE / sizeof(uint8_t)];
+	uint16_t u16[XMM_SIZE / sizeof(uint16_t)];
+	uint32_t u32[XMM_SIZE / sizeof(uint32_t)];
+	uint64_t u64[XMM_SIZE / sizeof(uint64_t)];
+	double   pd[XMM_SIZE / sizeof(double)];
+} __attribute__((aligned(16))) rte_xmm_t;
 
 #ifdef __cplusplus
 }
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 07/14] app: test_cpuflags: test the new cpu flags added for arm64
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (5 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 06/14] eal: arm: ret_vector.h improvements Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 08/14] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 app/test/test_cpuflags.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 557458f..e8d0ce7 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -120,6 +120,32 @@ test_cpuflags(void)
 	CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
 #endif
 
+#if defined(RTE_ARCH_ARM64)
+	printf("Check for FP:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_FP);
+
+	printf("Check for ASIMD:\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+
+	printf("Check for EVTSTRM:\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_EVTSTRM);
+
+	printf("Check for AES:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_AES);
+
+	printf("Check for PMULL:\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_PMULL);
+
+	printf("Check for SHA1:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_SHA1);
+
+	printf("Check for SHA2:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_SHA2);
+
+	printf("Check for CRC32:\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_CRC32);
+#endif
+
 #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
 	printf("Check for SSE:\t\t");
 	CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 08/14] acl: arm64: acl implementation using NEON gcc intrinsic
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (6 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 07/14] app: test_cpuflags: test the new cpu flags added for arm64 Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:52   ` Ananyev, Konstantin
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 09/14] mk: add support for armv8 on top of armv7 Jerin Jacob
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

verified with testacl and acl_autotest applications on arm64 architecture.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 app/test-acl/main.c           |   4 +
 lib/librte_acl/Makefile       |   5 +
 lib/librte_acl/acl.h          |   4 +
 lib/librte_acl/acl_run_neon.c |  46 +++++++
 lib/librte_acl/acl_run_neon.h | 289 ++++++++++++++++++++++++++++++++++++++++++
 lib/librte_acl/rte_acl.c      |  25 ++++
 lib/librte_acl/rte_acl.h      |   1 +
 7 files changed, 374 insertions(+)
 create mode 100644 lib/librte_acl/acl_run_neon.c
 create mode 100644 lib/librte_acl/acl_run_neon.h

diff --git a/app/test-acl/main.c b/app/test-acl/main.c
index 72ce83c..0b0c093 100644
--- a/app/test-acl/main.c
+++ b/app/test-acl/main.c
@@ -101,6 +101,10 @@ static const struct acl_alg acl_alg[] = {
 		.name = "avx2",
 		.alg = RTE_ACL_CLASSIFY_AVX2,
 	},
+	{
+		.name = "neon",
+		.alg = RTE_ACL_CLASSIFY_NEON,
+	},
 };
 
 static struct {
diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
index 7a1cf8a..27f91d5 100644
--- a/lib/librte_acl/Makefile
+++ b/lib/librte_acl/Makefile
@@ -48,9 +48,14 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
+ifeq ($(CONFIG_RTE_ARCH_ARM64),y)
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_neon.c
+else
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
+endif
 
 CFLAGS_acl_run_sse.o += -msse4.1
+CFLAGS_acl_run_neon.o += -flax-vector-conversions -Wno-maybe-uninitialized
 
 #
 # If the compiler supports AVX2 instructions,
diff --git a/lib/librte_acl/acl.h b/lib/librte_acl/acl.h
index eb4930c..09d6784 100644
--- a/lib/librte_acl/acl.h
+++ b/lib/librte_acl/acl.h
@@ -230,6 +230,10 @@ int
 rte_acl_classify_avx2(const struct rte_acl_ctx *ctx, const uint8_t **data,
 	uint32_t *results, uint32_t num, uint32_t categories);
 
+int
+rte_acl_classify_neon(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t num, uint32_t categories);
+
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */
diff --git a/lib/librte_acl/acl_run_neon.c b/lib/librte_acl/acl_run_neon.c
new file mode 100644
index 0000000..b014451
--- /dev/null
+++ b/lib/librte_acl/acl_run_neon.c
@@ -0,0 +1,46 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) Cavium networks Ltd. 2015.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#include "acl_run_neon.h"
+
+int
+rte_acl_classify_neon(const struct rte_acl_ctx *ctx, const uint8_t **data,
+		      uint32_t *results, uint32_t num, uint32_t categories)
+{
+	if (likely(num >= 8))
+		return search_neon_8(ctx, data, results, num, categories);
+	else if (num >= 4)
+		return search_neon_4(ctx, data, results, num, categories);
+	else
+		return rte_acl_classify_scalar(ctx, data, results, num,
+			categories);
+}
diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
new file mode 100644
index 0000000..cf7c57f
--- /dev/null
+++ b/lib/librte_acl/acl_run_neon.h
@@ -0,0 +1,289 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) Cavium networks Ltd. 2015.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#include "acl_run.h"
+#include "acl_vect.h"
+
+struct _neon_acl_const {
+	rte_xmm_t xmm_shuffle_input;
+	rte_xmm_t xmm_index_mask;
+	rte_xmm_t range_base;
+} neon_acl_const  __attribute__((aligned(RTE_CACHE_LINE_SIZE))) = {
+	{
+		.u32 = {0x00000000, 0x04040404, 0x08080808, 0x0c0c0c0c}
+	},
+	{
+		.u32 = {RTE_ACL_NODE_INDEX, RTE_ACL_NODE_INDEX,
+		RTE_ACL_NODE_INDEX, RTE_ACL_NODE_INDEX}
+	},
+	{
+		.u32 = {0xffffff00, 0xffffff04, 0xffffff08, 0xffffff0c}
+	},
+};
+
+/*
+ * Resolve priority for multiple results (neon version).
+ * This consists comparing the priority of the current traversal with the
+ * running set of results for the packet.
+ * For each result, keep a running array of the result (rule number) and
+ * its priority for each category.
+ */
+static inline void
+resolve_priority_neon(uint64_t transition, int n, const struct rte_acl_ctx *ctx,
+		      struct parms *parms,
+		      const struct rte_acl_match_results *p,
+		      uint32_t categories)
+{
+	uint32_t x;
+	int32x4_t results, priority, results1, priority1;
+	uint32x4_t selector;
+	int32_t *saved_results, *saved_priority;
+
+	for (x = 0; x < categories; x += RTE_ACL_RESULTS_MULTIPLIER) {
+		saved_results = (int32_t *)(&parms[n].cmplt->results[x]);
+		saved_priority = (int32_t *)(&parms[n].cmplt->priority[x]);
+
+		/* get results and priorities for completed trie */
+		results = vld1q_s32(
+			(const int32_t *)&p[transition].results[x]);
+		priority = vld1q_s32(
+			(const int32_t *)&p[transition].priority[x]);
+
+		/* if this is not the first completed trie */
+		if (parms[n].cmplt->count != ctx->num_tries) {
+			/* get running best results and their priorities */
+			results1 = vld1q_s32(saved_results);
+			priority1 = vld1q_s32(saved_priority);
+
+			/* select results that are highest priority */
+			selector = vcgtq_s32(priority1, priority);
+			results = vbslq_s32(selector, results1, results);
+			priority = vbslq_s32(selector, priority1, priority);
+		}
+
+		/* save running best results and their priorities */
+		vst1q_s32(saved_results, results);
+		vst1q_s32(saved_priority, priority);
+	}
+}
+
+/*
+ * Check for any match in 4 transitions
+ */
+static inline __attribute__((always_inline)) uint32_t
+check_any_match_x4(uint64_t val[])
+{
+	return ((val[0] | val[1] | val[2] | val[3]) & RTE_ACL_NODE_MATCH);
+}
+
+static inline __attribute__((always_inline)) void
+acl_match_check_x4(int slot, const struct rte_acl_ctx *ctx, struct parms *parms,
+		   struct acl_flow_data *flows, uint64_t transitions[])
+{
+	while (check_any_match_x4(transitions)) {
+		transitions[0] = acl_match_check(transitions[0], slot, ctx,
+			parms, flows, resolve_priority_neon);
+		transitions[1] = acl_match_check(transitions[1], slot + 1, ctx,
+			parms, flows, resolve_priority_neon);
+		transitions[2] = acl_match_check(transitions[2], slot + 2, ctx,
+			parms, flows, resolve_priority_neon);
+		transitions[3] = acl_match_check(transitions[3], slot + 3, ctx,
+			parms, flows, resolve_priority_neon);
+	}
+}
+
+/*
+ * Process 4 transitions (in 2 NEON Q registers) in parallel
+ */
+static inline __attribute__((always_inline)) int32x4_t
+transition4(int32x4_t next_input, const uint64_t *trans, uint64_t transitions[])
+{
+	int32x4x2_t tr_hi_lo;
+	int32x4_t t, in, r;
+	uint32x4_t index_msk, node_type, addr;
+	uint32x4_t dfa_msk, mask, quad_ofs, dfa_ofs;
+
+	/* Move low 32 into tr_hi_lo.val[0] and high 32 into tr_hi_lo.val[1] */
+	tr_hi_lo = vld2q_s32((const int32_t *)transitions);
+
+	/* Calculate the address (array index) for all 4 transitions. */
+
+	index_msk = vld1q_u32((const uint32_t *)&neon_acl_const.xmm_index_mask);
+
+	/* Calc node type and node addr */
+	node_type = vbicq_s32(tr_hi_lo.val[0], index_msk);
+	addr = vandq_s32(tr_hi_lo.val[0], index_msk);
+
+	/* t = 0 */
+	t = veorq_s32(node_type, node_type);
+
+	/* mask for DFA type(0) nodes */
+	dfa_msk = vceqq_u32(node_type, t);
+
+	mask = vld1q_s32((const int32_t *)&neon_acl_const.xmm_shuffle_input);
+	in = vqtbl1q_u8((uint8x16_t)next_input, (uint8x16_t)mask);
+
+	/* DFA calculations. */
+	r = vshrq_n_u32(in, 30); /* div by 64 */
+	mask = vld1q_s32((const int32_t *)&neon_acl_const.range_base);
+	r = vaddq_u8(r, mask);
+	t = vshrq_n_u32(in, 24);
+	r = vqtbl1q_u8((uint8x16_t)tr_hi_lo.val[1], (uint8x16_t)r);
+	dfa_ofs = vsubq_s32(t, r);
+
+	/* QUAD/SINGLE calculations. */
+	t = vcgtq_s8(in, tr_hi_lo.val[1]);
+	t = vabsq_s8(t);
+	t = vpaddlq_u8(t);
+	quad_ofs = vpaddlq_u16(t);
+
+	/* blend DFA and QUAD/SINGLE. */
+	t = vbslq_u8(dfa_msk, dfa_ofs, quad_ofs);
+
+	/* calculate address for next transitions */
+	addr = vaddq_u32(addr, t);
+
+	/* Fill next transitions */
+	transitions[0] = trans[vgetq_lane_u32(addr, 0)];
+	transitions[1] = trans[vgetq_lane_u32(addr, 1)];
+	transitions[2] = trans[vgetq_lane_u32(addr, 2)];
+	transitions[3] = trans[vgetq_lane_u32(addr, 3)];
+
+	return vshrq_n_u32(next_input, CHAR_BIT);
+}
+
+/*
+ * Execute trie traversal with 8 traversals in parallel
+ */
+static inline int
+search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	      uint32_t *results, uint32_t total_packets, uint32_t categories)
+{
+	int n;
+	struct acl_flow_data flows;
+	uint64_t index_array[8];
+	struct completion cmplt[8];
+	struct parms parms[8];
+	int32x4_t input0, input1;
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+		     total_packets, categories, ctx->trans_table);
+
+	for (n = 0; n < 8; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	 /* Check for any matches. */
+	acl_match_check_x4(0, ctx, parms, &flows, &index_array[0]);
+	acl_match_check_x4(4, ctx, parms, &flows, &index_array[4]);
+
+	while (flows.started > 0) {
+		/* Gather 4 bytes of input data for each stream. */
+		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
+		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
+
+		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
+		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
+
+		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input0, 2);
+		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 6), input1, 2);
+
+		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input0, 3);
+		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 7), input1, 3);
+
+		/* Process the 4 bytes of input on each stream. */
+
+		input0 = transition4(input0, flows.trans, &index_array[0]);
+		input1 = transition4(input1, flows.trans, &index_array[4]);
+
+		input0 = transition4(input0, flows.trans, &index_array[0]);
+		input1 = transition4(input1, flows.trans, &index_array[4]);
+
+		input0 = transition4(input0, flows.trans, &index_array[0]);
+		input1 = transition4(input1, flows.trans, &index_array[4]);
+
+		input0 = transition4(input0, flows.trans, &index_array[0]);
+		input1 = transition4(input1, flows.trans, &index_array[4]);
+
+		 /* Check for any matches. */
+		acl_match_check_x4(0, ctx, parms, &flows, &index_array[0]);
+		acl_match_check_x4(4, ctx, parms, &flows, &index_array[4]);
+	}
+
+	return 0;
+}
+
+/*
+ * Execute trie traversal with 4 traversals in parallel
+ */
+static inline int
+search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	      uint32_t *results, int total_packets, uint32_t categories)
+{
+	int n;
+	struct acl_flow_data flows;
+	uint64_t index_array[4];
+	struct completion cmplt[4];
+	struct parms parms[4];
+	int32x4_t input;
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+		     total_packets, categories, ctx->trans_table);
+
+	for (n = 0; n < 4; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	/* Check for any matches. */
+	acl_match_check_x4(0, ctx, parms, &flows, index_array);
+
+	while (flows.started > 0) {
+		/* Gather 4 bytes of input data for each stream. */
+		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
+		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
+		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
+		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
+
+		/* Process the 4 bytes of input on each stream. */
+		input = transition4(input, flows.trans, index_array);
+		input = transition4(input, flows.trans, index_array);
+		input = transition4(input, flows.trans, index_array);
+		input = transition4(input, flows.trans, index_array);
+
+		/* Check for any matches. */
+		acl_match_check_x4(0, ctx, parms, &flows, index_array);
+	}
+
+	return 0;
+}
diff --git a/lib/librte_acl/rte_acl.c b/lib/librte_acl/rte_acl.c
index d60219f..e2fdebd 100644
--- a/lib/librte_acl/rte_acl.c
+++ b/lib/librte_acl/rte_acl.c
@@ -55,11 +55,32 @@ rte_acl_classify_avx2(__rte_unused const struct rte_acl_ctx *ctx,
 	return -ENOTSUP;
 }
 
+int __attribute__ ((weak))
+rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
+	__rte_unused const uint8_t **data,
+	__rte_unused uint32_t *results,
+	__rte_unused uint32_t num,
+	__rte_unused uint32_t categories)
+{
+	return -ENOTSUP;
+}
+
+int __attribute__ ((weak))
+rte_acl_classify_neon(__rte_unused const struct rte_acl_ctx *ctx,
+	__rte_unused const uint8_t **data,
+	__rte_unused uint32_t *results,
+	__rte_unused uint32_t num,
+	__rte_unused uint32_t categories)
+{
+	return -ENOTSUP;
+}
+
 static const rte_acl_classify_t classify_fns[] = {
 	[RTE_ACL_CLASSIFY_DEFAULT] = rte_acl_classify_scalar,
 	[RTE_ACL_CLASSIFY_SCALAR] = rte_acl_classify_scalar,
 	[RTE_ACL_CLASSIFY_SSE] = rte_acl_classify_sse,
 	[RTE_ACL_CLASSIFY_AVX2] = rte_acl_classify_avx2,
+	[RTE_ACL_CLASSIFY_NEON] = rte_acl_classify_neon,
 };
 
 /* by default, use always available scalar code path. */
@@ -93,6 +114,9 @@ rte_acl_init(void)
 {
 	enum rte_acl_classify_alg alg = RTE_ACL_CLASSIFY_DEFAULT;
 
+#ifdef RTE_ARCH_ARM64
+	alg =  RTE_ACL_CLASSIFY_NEON;
+#else
 #ifdef CC_AVX2_SUPPORT
 	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
 		alg = RTE_ACL_CLASSIFY_AVX2;
@@ -102,6 +126,7 @@ rte_acl_init(void)
 #endif
 		alg = RTE_ACL_CLASSIFY_SSE;
 
+#endif
 	rte_acl_set_default_classify(alg);
 }
 
diff --git a/lib/librte_acl/rte_acl.h b/lib/librte_acl/rte_acl.h
index 98ef2fc..0979a09 100644
--- a/lib/librte_acl/rte_acl.h
+++ b/lib/librte_acl/rte_acl.h
@@ -270,6 +270,7 @@ enum rte_acl_classify_alg {
 	RTE_ACL_CLASSIFY_SCALAR = 1,  /**< generic implementation. */
 	RTE_ACL_CLASSIFY_SSE = 2,     /**< requires SSE4.1 support. */
 	RTE_ACL_CLASSIFY_AVX2 = 3,    /**< requires AVX2 support. */
+	RTE_ACL_CLASSIFY_NEON = 4,    /**< requires NEON support. */
 	RTE_ACL_CLASSIFY_NUM          /* should always be the last one. */
 };
 
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 09/14] mk: add support for armv8 on top of armv7
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (7 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 08/14] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 10/14] mk: add support for thunderx machine target based on armv8-a Jerin Jacob
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 config/defconfig_arm64-armv8a-linuxapp-gcc | 56 +++++++++++++++++++++++++++++
 mk/arch/arm64/rte.vars.mk                  | 58 ++++++++++++++++++++++++++++++
 mk/machine/armv8a/rte.vars.mk              | 58 ++++++++++++++++++++++++++++++
 3 files changed, 172 insertions(+)
 create mode 100644 config/defconfig_arm64-armv8a-linuxapp-gcc
 create mode 100644 mk/arch/arm64/rte.vars.mk
 create mode 100644 mk/machine/armv8a/rte.vars.mk

diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc
new file mode 100644
index 0000000..49e7056
--- /dev/null
+++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright (C) Cavium networks 2015. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Cavium networks nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv8a"
+
+CONFIG_RTE_ARCH="arm64"
+CONFIG_RTE_ARCH_ARM64=y
+CONFIG_RTE_ARCH_64=y
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_FORCE_INTRINSICS=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+CONFIG_RTE_CACHE_LINE_SIZE=64
+
+CONFIG_RTE_IXGBE_INC_VECTOR=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_IVSHMEM=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_TABLE=n
+CONFIG_RTE_LIBRTE_PIPELINE=n
diff --git a/mk/arch/arm64/rte.vars.mk b/mk/arch/arm64/rte.vars.mk
new file mode 100644
index 0000000..32e3a5f
--- /dev/null
+++ b/mk/arch/arm64/rte.vars.mk
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright (C) Cavium networks 2015. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Cavium networks nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# arch:
+#
+#   - define ARCH variable (overridden by cmdline or by previous
+#     optional define in machine .mk)
+#   - define CROSS variable (overridden by cmdline or previous define
+#     in machine .mk)
+#   - define CPU_CFLAGS variable (overridden by cmdline or previous
+#     define in machine .mk)
+#   - define CPU_LDFLAGS variable (overridden by cmdline or previous
+#     define in machine .mk)
+#   - define CPU_ASFLAGS variable (overridden by cmdline or previous
+#     define in machine .mk)
+#   - may override any previously defined variable
+#
+# examples for CONFIG_RTE_ARCH: i686, x86_64, x86_64_32
+#
+
+ARCH  ?= arm64
+# common arch dir in eal headers
+ARCH_DIR := arm
+CROSS ?=
+
+CPU_CFLAGS  ?=
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
new file mode 100644
index 0000000..bdf8c6b
--- /dev/null
+++ b/mk/machine/armv8a/rte.vars.mk
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright (C) Cavium networks 2015. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Cavium networks nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+MACHINE_CFLAGS += -march=armv8-a -DRTE_CACHE_LINE_SIZE=64
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 10/14] mk: add support for thunderx machine target based on armv8-a
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (8 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 09/14] mk: add support for armv8 on top of armv7 Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 11/14] updated release note for armv8 support for DPDK 2.2 Jerin Jacob
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

Created the new thunderx machine target to address difference
in "cache line size" and "-mcpu=thunderx" vs default armv8-a machine target

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 config/defconfig_arm64-thunderx-linuxapp-gcc | 56 +++++++++++++++++++++++++++
 mk/machine/thunderx/rte.vars.mk              | 58 ++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+)
 create mode 100644 config/defconfig_arm64-thunderx-linuxapp-gcc
 create mode 100644 mk/machine/thunderx/rte.vars.mk

diff --git a/config/defconfig_arm64-thunderx-linuxapp-gcc b/config/defconfig_arm64-thunderx-linuxapp-gcc
new file mode 100644
index 0000000..6b2048b
--- /dev/null
+++ b/config/defconfig_arm64-thunderx-linuxapp-gcc
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright (C) Cavium networks 2015. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Cavium networks nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="thunderx"
+
+CONFIG_RTE_ARCH="arm64"
+CONFIG_RTE_ARCH_ARM64=y
+CONFIG_RTE_ARCH_64=y
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_FORCE_INTRINSICS=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+CONFIG_RTE_CACHE_LINE_SIZE=128
+
+CONFIG_RTE_IXGBE_INC_VECTOR=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_IVSHMEM=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_TABLE=n
+CONFIG_RTE_LIBRTE_PIPELINE=n
diff --git a/mk/machine/thunderx/rte.vars.mk b/mk/machine/thunderx/rte.vars.mk
new file mode 100644
index 0000000..e49f9e1
--- /dev/null
+++ b/mk/machine/thunderx/rte.vars.mk
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright (C) Cavium networks 2015. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Cavium networks nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+CROSS ?= aarch64-thunderx-linux-gnu-
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+MACHINE_CFLAGS += -march=armv8-a -mcpu=thunderx -DRTE_CACHE_LINE_SIZE=128
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 11/14] updated release note for armv8 support for DPDK 2.2
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (9 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 10/14] mk: add support for thunderx machine target based on armv8-a Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 12/14] maintainers: claim responsibility for ARMv8 Jerin Jacob
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 doc/guides/rel_notes/release_2_2.rst | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index 43a3a3c..a3587a2 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -23,10 +23,11 @@ New Features
 
 * **Added vhost-user multiple queue support.**
 
-* **Introduce ARMv7 architecture**
+* **Introduce ARMv7 and ARMv8 architectures**
 
-  It is now possible to build DPDK for the ARMv7 platform and test with
-  virtual PMD drivers.
+  * It is now possible to build DPDK for the ARMv7 and ARMv8 platforms.
+  * ARMv7 can be tested with virtual PMD drivers.
+  * ARMv8 can be tested with virtual and physical PMD drivers.
 
 
 Resolved Issues
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 12/14] maintainers: claim responsibility for ARMv8
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (10 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 11/14] updated release note for armv8 support for DPDK 2.2 Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 13/14] eal: introduce rte_smp_*mb() for memory barriers to use between lcores Jerin Jacob
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 MAINTAINERS | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index a8933eb..c44b328 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -128,6 +128,11 @@ ARM v7
 M: Jan Viktorin <viktorin@rehivetech.com>
 F: lib/librte_eal/common/include/arch/arm/
 
+ARM v8
+M: Jerin Jacob <jerin.jacob@caviumnetworks.com>
+F: lib/librte_eal/common/include/arch/arm/*_64.h
+F: lib/librte_acl/acl_run_neon.*
+
 Intel x86
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: Konstantin Ananyev <konstantin.ananyev@intel.com>
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev] [PATCH v3 13/14] eal: introduce rte_smp_*mb() for memory barriers to use between lcores
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (11 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 12/14] maintainers: claim responsibility for ARMv8 Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 14/14] eal: arm: define rte_smp_mb(), rte_smp_wmb(), rte_smp_rmb() for arm Jerin Jacob
  2015-11-18 21:48 ` [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Thomas Monjalon
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

This commit introduce rte_smp_mb(), rte_smp_wmb() and rte_smp_rmb(), in
order to enable memory barriers between lcores.
The patch does not provide any functional change for IA, the goal is to
have infrastructure for weakly ordered machines like ARM to work on DPDK.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 drivers/net/virtio/virtqueue.h                     |  8 +++----
 drivers/net/xenvirt/rte_eth_xenvirt.c              |  4 ++--
 drivers/net/xenvirt/virtqueue.h                    |  2 +-
 .../common/include/arch/ppc_64/rte_atomic.h        |  6 +++++
 .../common/include/arch/tile/rte_atomic.h          |  6 +++++
 .../common/include/arch/x86/rte_atomic.h           |  6 +++++
 lib/librte_eal/common/include/generic/rte_atomic.h | 27 ++++++++++++++++++++++
 lib/librte_ring/rte_ring.h                         |  8 +++----
 8 files changed, 55 insertions(+), 12 deletions(-)

diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 7789411..d233be6 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -53,12 +53,10 @@ struct rte_mbuf;
  *     accesses through relaxed memory I/O windows, so smp_mb() et al are
  *     sufficient.
  *
- * This driver is for virtio_pci on SMP and therefore can assume
- * weaker (compiler barriers)
  */
-#define virtio_mb()	rte_mb()
-#define virtio_rmb()	rte_compiler_barrier()
-#define virtio_wmb()	rte_compiler_barrier()
+#define virtio_mb()	rte_smp_mb()
+#define virtio_rmb()	rte_smp_rmb()
+#define virtio_wmb()	rte_smp_wmb()
 
 #ifdef RTE_PMD_PACKET_PREFETCH
 #define rte_packet_prefetch(p)  rte_prefetch1(p)
diff --git a/drivers/net/xenvirt/rte_eth_xenvirt.c b/drivers/net/xenvirt/rte_eth_xenvirt.c
index 73e8bce..8c33a02 100644
--- a/drivers/net/xenvirt/rte_eth_xenvirt.c
+++ b/drivers/net/xenvirt/rte_eth_xenvirt.c
@@ -99,7 +99,7 @@ eth_xenvirt_rx(void *q, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 	nb_used = VIRTQUEUE_NUSED(rxvq);
 
-	rte_compiler_barrier(); /* rmb */
+	rte_smp_rmb();
 	num = (uint16_t)(likely(nb_used <= nb_pkts) ? nb_used : nb_pkts);
 	num = (uint16_t)(likely(num <= VIRTIO_MBUF_BURST_SZ) ? num : VIRTIO_MBUF_BURST_SZ);
 	if (unlikely(num == 0)) return 0;
@@ -150,7 +150,7 @@ eth_xenvirt_tx(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts);
 	nb_used = VIRTQUEUE_NUSED(txvq);
 
-	rte_compiler_barrier();   /* rmb */
+	rte_smp_rmb();
 
 	num = (uint16_t)(likely(nb_used <= VIRTIO_MBUF_BURST_SZ) ? nb_used : VIRTIO_MBUF_BURST_SZ);
 	num = virtqueue_dequeue_burst(txvq, snd_pkts, len, num);
diff --git a/drivers/net/xenvirt/virtqueue.h b/drivers/net/xenvirt/virtqueue.h
index eff6208..6dcb0ef 100644
--- a/drivers/net/xenvirt/virtqueue.h
+++ b/drivers/net/xenvirt/virtqueue.h
@@ -151,7 +151,7 @@ vq_ring_update_avail(struct virtqueue *vq, uint16_t desc_idx)
 	 */
 	avail_idx = (uint16_t)(vq->vq_ring.avail->idx & (vq->vq_nentries - 1));
 	vq->vq_ring.avail->ring[avail_idx] = desc_idx;
-	rte_compiler_barrier();  /* wmb , for IA memory model barrier is enough*/
+	rte_smp_wmb();
 	vq->vq_ring.avail->idx++;
 }
 
diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
index fb7af2b..b8bc2c0 100644
--- a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
@@ -72,6 +72,12 @@ extern "C" {
  */
 #define	rte_rmb() {asm volatile("sync" : : : "memory"); }
 
+#define rte_smp_mb() rte_mb()
+
+#define rte_smp_wmb() rte_compiler_barrier()
+
+#define rte_smp_rmb() rte_compiler_barrier()
+
 /*------------------------- 16 bit atomic operations -------------------------*/
 /* To be compatible with Power7, use GCC built-in functions for 16 bit
  * operations */
diff --git a/lib/librte_eal/common/include/arch/tile/rte_atomic.h b/lib/librte_eal/common/include/arch/tile/rte_atomic.h
index 3dc8eb8..28825ff 100644
--- a/lib/librte_eal/common/include/arch/tile/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/tile/rte_atomic.h
@@ -79,6 +79,12 @@ static inline void rte_rmb(void)
 	__sync_synchronize();
 }
 
+#define rte_smp_mb() rte_mb()
+
+#define rte_smp_wmb() rte_compiler_barrier()
+
+#define rte_smp_rmb() rte_compiler_barrier()
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
index e93e8ee..41178c7 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
@@ -53,6 +53,12 @@ extern "C" {
 
 #define	rte_rmb() _mm_lfence()
 
+#define rte_smp_mb() rte_mb()
+
+#define rte_smp_wmb() rte_compiler_barrier()
+
+#define rte_smp_rmb() rte_compiler_barrier()
+
 /*------------------------- 16 bit atomic operations -------------------------*/
 
 #ifndef RTE_FORCE_INTRINSICS
diff --git a/lib/librte_eal/common/include/generic/rte_atomic.h b/lib/librte_eal/common/include/generic/rte_atomic.h
index 6c7581a..26d1f56 100644
--- a/lib/librte_eal/common/include/generic/rte_atomic.h
+++ b/lib/librte_eal/common/include/generic/rte_atomic.h
@@ -72,6 +72,33 @@ static inline void rte_wmb(void);
  */
 static inline void rte_rmb(void);
 
+/**
+ * General memory barrier between lcores
+ *
+ * Guarantees that the LOAD and STORE operations that precede the
+ * rte_smp_mb() call are globally visible across the lcores
+ * before the the LOAD and STORE operations that follows it.
+ */
+static inline void rte_smp_mb(void);
+
+/**
+ * Write memory barrier between lcores
+ *
+ * Guarantees that the STORE operations that precede the
+ * rte_smp_wmb() call are globally visible across the lcores
+ * before the the STORE operations that follows it.
+ */
+static inline void rte_smp_wmb(void);
+
+/**
+ * Read memory barrier between lcores
+ *
+ * Guarantees that the LOAD operations that precede the
+ * rte_smp_rmb() call are globally visible across the lcores
+ * before the the LOAD operations that follows it.
+ */
+static inline void rte_smp_rmb(void);
+
 #endif /* __DOXYGEN__ */
 
 /**
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index af68888..19ea1bb 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -457,7 +457,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 	/* write entries in ring */
 	ENQUEUE_PTRS();
-	rte_compiler_barrier();
+	rte_smp_wmb();
 
 	/* if we exceed the watermark */
 	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
@@ -552,7 +552,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 	/* write entries in ring */
 	ENQUEUE_PTRS();
-	rte_compiler_barrier();
+	rte_smp_wmb();
 
 	/* if we exceed the watermark */
 	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
@@ -643,7 +643,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 	/* copy in table */
 	DEQUEUE_PTRS();
-	rte_compiler_barrier();
+	rte_smp_rmb();
 
 	/*
 	 * If there are other dequeues in progress that preceded us,
@@ -727,7 +727,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 	/* copy in table */
 	DEQUEUE_PTRS();
-	rte_compiler_barrier();
+	rte_smp_rmb();
 
 	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [dpdk-dev]  [PATCH v3 14/14] eal: arm: define rte_smp_mb(), rte_smp_wmb(), rte_smp_rmb() for arm
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (12 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 13/14] eal: introduce rte_smp_*mb() for memory barriers to use between lcores Jerin Jacob
@ 2015-11-06  9:40 ` Jerin Jacob
  2015-11-18 21:48 ` [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Thomas Monjalon
  14 siblings, 0 replies; 18+ messages in thread
From: Jerin Jacob @ 2015-11-06  9:40 UTC (permalink / raw)
  To: dev

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 lib/librte_eal/common/include/arch/arm/rte_atomic.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
index f3f3b6e..454a12b 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -39,4 +39,10 @@
 #include <rte_atomic_32.h>
 #endif
 
+#define rte_smp_mb() rte_mb()
+
+#define rte_smp_wmb() rte_wmb()
+
+#define rte_smp_rmb() rte_rmb()
+
 #endif /* _RTE_ATOMIC_ARM_H_ */
-- 
1.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 08/14] acl: arm64: acl implementation using NEON gcc intrinsic
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 08/14] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob
@ 2015-11-06  9:52   ` Ananyev, Konstantin
  0 siblings, 0 replies; 18+ messages in thread
From: Ananyev, Konstantin @ 2015-11-06  9:52 UTC (permalink / raw)
  To: Jerin Jacob, dev



> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Friday, November 06, 2015 9:40 AM
> To: dev@dpdk.org
> Cc: thomas.monjalon@6wind.com; Hunt, David; viktorin@rehivetech.com; Ananyev, Konstantin; Jerin Jacob
> Subject: [dpdk-dev] [PATCH v3 08/14] acl: arm64: acl implementation using NEON gcc intrinsic
> 
> verified with testacl and acl_autotest applications on arm64 architecture.
> 
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> ---

Didn't test it on ARM, but from x86 perspective all seems ok. 

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 06/14] eal: arm: ret_vector.h improvements
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 06/14] eal: arm: ret_vector.h improvements Jerin Jacob
@ 2015-11-18 18:56   ` Thomas Monjalon
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Monjalon @ 2015-11-18 18:56 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

2015-11-06 15:10, Jerin Jacob:
> added the definition of rte_xmm and xmm_t for acl noen implementation.
> removed the emulated _mm_* functions

2 nits:
- noen -> neon
- this kind of change deserves more explanations

> +#include "arm_neon.h"

I think it should be <arm_neon.h>

It won't block the merge of this patch,
but as you are becoming the ARMv8 maintainer, please try to check
even minor things which make read of DPDK code and history easier.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support
  2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
                   ` (13 preceding siblings ...)
  2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 14/14] eal: arm: define rte_smp_mb(), rte_smp_wmb(), rte_smp_rmb() for arm Jerin Jacob
@ 2015-11-18 21:48 ` Thomas Monjalon
  14 siblings, 0 replies; 18+ messages in thread
From: Thomas Monjalon @ 2015-11-18 21:48 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev

> Jerin Jacob (14):
>   eal: arm64: add armv8-a version of rte_atomic_64.h
>   eal: arm64: add armv8-a version of rte_cpuflags_64.h
>   eal: arm64: add armv8-a version of rte_prefetch_64.h
>   eal: arm64: add armv8-a version of rte_cycles_64.h
>   eal: arm64: rte_memcpy_64.h version based on libc memcpy
>   eal: arm: ret_vector.h improvements
>   app: test_cpuflags: test the new cpu flags added for arm64
>   acl: arm64: acl implementation using NEON gcc intrinsic
>   mk: add support for armv8 on top of armv7
>   mk: add support for thunderx machine target based on armv8-a
>   updated release note for armv8 support for DPDK 2.2
>   maintainers: claim responsibility for ARMv8
>   eal: introduce rte_smp_*mb() for memory barriers to use between lcores
>   eal: arm: define rte_smp_mb(), rte_smp_wmb(), rte_smp_rmb() for arm

Applied, thanks

Welcome Jan and Jerin as maintainers of the ARM architectures for DPDK!

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-11-18 21:50 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-06  9:40 [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 01/14] eal: arm64: add armv8-a version of rte_atomic_64.h Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 02/14] eal: arm64: add armv8-a version of rte_cpuflags_64.h Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 03/14] eal: arm64: add armv8-a version of rte_prefetch_64.h Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 04/14] eal: arm64: add armv8-a version of rte_cycles_64.h Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 05/14] eal: arm64: rte_memcpy_64.h version based on libc memcpy Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 06/14] eal: arm: ret_vector.h improvements Jerin Jacob
2015-11-18 18:56   ` Thomas Monjalon
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 07/14] app: test_cpuflags: test the new cpu flags added for arm64 Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 08/14] acl: arm64: acl implementation using NEON gcc intrinsic Jerin Jacob
2015-11-06  9:52   ` Ananyev, Konstantin
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 09/14] mk: add support for armv8 on top of armv7 Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 10/14] mk: add support for thunderx machine target based on armv8-a Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 11/14] updated release note for armv8 support for DPDK 2.2 Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 12/14] maintainers: claim responsibility for ARMv8 Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 13/14] eal: introduce rte_smp_*mb() for memory barriers to use between lcores Jerin Jacob
2015-11-06  9:40 ` [dpdk-dev] [PATCH v3 14/14] eal: arm: define rte_smp_mb(), rte_smp_wmb(), rte_smp_rmb() for arm Jerin Jacob
2015-11-18 21:48 ` [dpdk-dev] [PATCH v3 00/14] DPDK armv8-a support Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).