DPDK patches and discussions
* [dpdk-dev] [PATCH 0/4] optimize and use armv8 CRC extensions for hash library
@ 2015-11-23 18:45 Jerin Jacob
  2015-11-23 18:45 ` [dpdk-dev] [PATCH 1/4] hash: replace libc memcmp with optimized memory compare functions for arm64 Jerin Jacob
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Jerin Jacob @ 2015-11-23 18:45 UTC
  To: dev

- This patch set contains the changes required to optimise hash library
  usage on arm64
- Tested on Juno and ThunderX boards
- Tested and verified the changes with the following DPDK unit test cases
        hash_functions_autotest
        hash_autotest
        hash_perf_autotest
        hash_scaling_autotest
- Created the new xgene1 machine target to address the difference
  in optional armv8-a CRC extension availability compared to the
  default armv8-a machine target (which enables the CRC extension by default)
- Supersedes the "[dpdk-dev] [PATCH] hash: replace libc memcmp with
  optimized memory compare functions for arm64" patch


Jerin Jacob (4):
  hash: replace libc memcmp with optimized memory compare functions for
    arm64
  hash: implement rte_hash_crc_* based on armv8-a CRC32 instructions
  hash: select hash function as CRC if armv8-a CRC extension available
  mk: add xgene1 machine target based on armv8-a

 app/test/test_hash.c                       |   7 ++
 config/defconfig_arm64-xgene1-linuxapp-gcc |  56 +++++++++++
 lib/librte_hash/Makefile                   |   3 +
 lib/librte_hash/rte_cmp_arm64.h            | 114 ++++++++++++++++++++++
 lib/librte_hash/rte_crc_arm64.h            | 151 +++++++++++++++++++++++++++++
 lib/librte_hash/rte_cuckoo_hash.c          |   9 +-
 lib/librte_hash/rte_fbk_hash.h             |   2 +-
 lib/librte_hash/rte_hash_crc.h             |   7 ++
 mk/machine/armv8a/rte.vars.mk              |   2 +-
 mk/machine/thunderx/rte.vars.mk            |   2 +-
 mk/machine/xgene1/rte.vars.mk              |  58 +++++++++++
 mk/rte.cpuflags.mk                         |   4 +
 mk/toolchain/gcc/rte.toolchain-compat.mk   |   6 +-
 13 files changed, 415 insertions(+), 6 deletions(-)
 create mode 100644 config/defconfig_arm64-xgene1-linuxapp-gcc
 create mode 100644 lib/librte_hash/rte_cmp_arm64.h
 create mode 100644 lib/librte_hash/rte_crc_arm64.h
 create mode 100644 mk/machine/xgene1/rte.vars.mk

-- 
2.1.0


* [dpdk-dev] [PATCH 1/4] hash: replace libc memcmp with optimized memory compare functions for arm64
  2015-11-23 18:45 [dpdk-dev] [PATCH 0/4] optimize and use armv8 CRC extensions for hash library Jerin Jacob
@ 2015-11-23 18:45 ` Jerin Jacob
  2015-11-23 18:45 ` [dpdk-dev] [PATCH 2/4] hash: implement rte_hash_crc_* based on armv8-a CRC32 instructions Jerin Jacob
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Jerin Jacob @ 2015-11-23 18:45 UTC
  To: dev

The following measurements show the improvement over the default
libc memcmp function

Key length (B)    Improvement over libc memcmp
      16                  149.57%
      32                  122.7%
      48                  104.96%
      64                  98.21%
      80                  93.75%
      96                  90.55%
     112                  110.48%
     128                  137.24%

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 lib/librte_hash/rte_cmp_arm64.h   | 114 ++++++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_cuckoo_hash.c |   7 ++-
 2 files changed, 120 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_hash/rte_cmp_arm64.h
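
Note for readers: the compare functions in this patch load each 16-byte
key as two 64-bit words, XOR the corresponding words, and report a
mismatch when any XOR result is non-zero. Larger keys are handled by
chaining the 16-byte compare. The portable C below is only a sketch of
that idea, not the code from the patch: it uses memcpy instead of the
ldp inline assembly, and the demo_* names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch. Returns 0 when the two
 * 16-byte keys are equal and 1 otherwise, which is the same convention
 * as rte_hash_k16_cmp_eq.
 */
static int
demo_k16_cmp_eq(const void *key1, const void *key2)
{
	uint64_t x[2], y[2];

	/* memcpy avoids alignment and aliasing problems on any target. */
	memcpy(x, key1, sizeof(x));
	memcpy(y, key2, sizeof(y));

	/* A word XORs to zero only when both inputs hold the same value. */
	return ((x[0] ^ y[0]) | (x[1] ^ y[1])) != 0;
}

/* 32-byte compare built from two 16-byte compares, as the patch does. */
static int
demo_k32_cmp_eq(const void *key1, const void *key2)
{
	return demo_k16_cmp_eq(key1, key2) ||
		demo_k16_cmp_eq((const char *)key1 + 16,
				(const char *)key2 + 16);
}
```

The || chaining stops at the first 16-byte block that differs, so keys
that differ early are rejected without reading the remaining blocks.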

diff --git a/lib/librte_hash/rte_cmp_arm64.h b/lib/librte_hash/rte_cmp_arm64.h
new file mode 100644
index 0000000..6fd937b
--- /dev/null
+++ b/lib/librte_hash/rte_cmp_arm64.h
@@ -0,0 +1,114 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 Cavium networks. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
+static int
+rte_hash_k16_cmp_eq(const void *key1, const void *key2,
+		    size_t key_len __rte_unused)
+{
+	uint64_t x0, x1, y0, y1;
+
+	asm volatile(
+		"ldp %x[x1], %x[x0], [%x[p1]]"
+		: [x1]"=r"(x1), [x0]"=r"(x0)
+		: [p1]"r"(key1)
+		);
+	asm volatile(
+		"ldp %x[y1], %x[y0], [%x[p2]]"
+		: [y1]"=r"(y1), [y0]"=r"(y0)
+		: [p2]"r"(key2)
+		);
+	x0 ^= y0;
+	x1 ^= y1;
+	return !(x0 == 0 && x1 == 0);
+}
+
+static int
+rte_hash_k32_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 16,
+				(const char *) key2 + 16, key_len);
+}
+
+static int
+rte_hash_k48_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 16,
+				(const char *) key2 + 16, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 32,
+				(const char *) key2 + 32, key_len);
+}
+
+static int
+rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k32_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k32_cmp_eq((const char *) key1 + 32,
+				(const char *) key2 + 32, key_len);
+}
+
+static int
+rte_hash_k80_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len);
+}
+
+static int
+rte_hash_k96_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k32_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len);
+}
+
+static int
+rte_hash_k112_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k32_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 96,
+				(const char *) key2 + 96, key_len);
+}
+
+static int
+rte_hash_k128_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k64_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len);
+}
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 1e970de..e6520dd 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -63,6 +63,10 @@
 #include "rte_cmp_x86.h"
 #endif
 
+#if defined(RTE_ARCH_ARM64)
+#include "rte_cmp_arm64.h"
+#endif
+
 TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
 
 static struct rte_tailq_elem rte_hash_tailq = {
@@ -280,7 +284,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
  * If x86 architecture is used, select appropriate compare function,
  * which may use x86 instrinsics, otherwise use memcmp
  */
-#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_X32)
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) ||\
+	 defined(RTE_ARCH_X86_X32) || defined(RTE_ARCH_ARM64)
 	/* Select function to compare keys */
 	switch (params->key_len) {
 	case 16:
-- 
2.1.0


* [dpdk-dev] [PATCH 2/4] hash: implement rte_hash_crc_* based on armv8-a CRC32 instructions
  2015-11-23 18:45 [dpdk-dev] [PATCH 0/4] optimize and use armv8 CRC extensions for hash library Jerin Jacob
  2015-11-23 18:45 ` [dpdk-dev] [PATCH 1/4] hash: replace libc memcmp with optimized memory compare functions for arm64 Jerin Jacob
@ 2015-11-23 18:45 ` Jerin Jacob
  2015-11-23 18:45 ` [dpdk-dev] [PATCH 3/4] hash: select hash function as CRC if armv8-a CRC extension available Jerin Jacob
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Jerin Jacob @ 2015-11-23 18:45 UTC
  To: dev

armv8-a has an optional CRC32 extension. -march=armv8-a+crc enables code
generation for the ARMv8-A architecture together with
the optional CRC32 extension.

Added RTE_MACHINE_CPUFLAG_CRC32 to detect the availability of the
CRC32 extension at compile time. At run time, RTE_CPUFLAG_CRC32
can be used to check its availability.

armv8-a+crc target support was added in GCC 4.9.
Inline assembly is used and __ARM_FEATURE_CRC32 is emulated so that
the code also works with tool-chains older than 4.9.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 app/test/test_hash.c                     |   7 ++
 lib/librte_hash/Makefile                 |   3 +
 lib/librte_hash/rte_crc_arm64.h          | 151 +++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash_crc.h           |   7 ++
 mk/machine/armv8a/rte.vars.mk            |   2 +-
 mk/machine/thunderx/rte.vars.mk          |   2 +-
 mk/rte.cpuflags.mk                       |   4 +
 mk/toolchain/gcc/rte.toolchain-compat.mk |   6 +-
 8 files changed, 179 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_hash/rte_crc_arm64.h
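
Note for readers: crc32cw and crc32cx fold 4 or 8 bytes of data into a
running CRC-32C (Castagnoli) value in one instruction. As far as I
understand the instruction, it applies no initial or final inversion, so
it matches the plain bit-serial update sketched below. This is a
reference sketch for checking results on machines without the
extension, not the code from the patch, and the demo_* names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the Castagnoli polynomial 0x1EDC6F41. */
#define DEMO_CRC32C_POLY 0x82F63B78u

/* Fold one byte into the running CRC, least significant bit first. */
static uint32_t
demo_crc32c_u8(uint8_t data, uint32_t crc)
{
	int i;

	crc ^= data;
	for (i = 0; i < 8; i++)
		crc = (crc >> 1) ^ (DEMO_CRC32C_POLY & (0u - (crc & 1u)));
	return crc;
}

/*
 * Fold one 32-bit word into the running CRC. Assumed to be the software
 * equivalent of what crc32c_arm64_u32() obtains from crc32cw.
 */
static uint32_t
demo_crc32c_u32(uint32_t data, uint32_t crc)
{
	int i;

	crc ^= data;
	for (i = 0; i < 32; i++)
		crc = (crc >> 1) ^ (DEMO_CRC32C_POLY & (0u - (crc & 1u)));
	return crc;
}

/* Conventional CRC-32C of a buffer: all-ones seed and final inversion. */
static uint32_t
demo_crc32c_buf(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;

	for (i = 0; i < len; i++)
		crc = demo_crc32c_u8(buf[i], crc);
	return crc ^ 0xFFFFFFFFu;
}
```

Folding one 32-bit word gives the same result as folding its four bytes
in little-endian order, which is why the word-sized instruction can
replace four byte-sized steps.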

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 4f2509d..2f3d884 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -217,6 +217,13 @@ test_crc32_hash_alg_equiv(void)
 			printf("Failed checking CRC32_SW against CRC32_SSE42_x64\n");
 			break;
 		}
+
+		/* Check against 8-byte-operand ARM64 CRC32 if available */
+		rte_hash_crc_set_alg(CRC32_ARM64);
+		if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
+			printf("Failed checking CRC32_SW against CRC32_ARM64\n");
+			break;
+		}
 	}
 
 	/* Resetting to best available algorithm */
diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
index 7902c2b..bb1ea99 100644
--- a/lib/librte_hash/Makefile
+++ b/lib/librte_hash/Makefile
@@ -48,6 +48,9 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
 SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_hash_crc.h
+ifeq ($(CONFIG_RTE_ARCH_ARM64),y)
+SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_crc_arm64.h
+endif
 SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
 SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
 SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
diff --git a/lib/librte_hash/rte_crc_arm64.h b/lib/librte_hash/rte_crc_arm64.h
new file mode 100644
index 0000000..02e26bc
--- /dev/null
+++ b/lib/librte_hash/rte_crc_arm64.h
@@ -0,0 +1,151 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 Cavium networks. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CRC_ARM64_H_
+#define _RTE_CRC_ARM64_H_
+
+/**
+ * @file
+ *
+ * RTE CRC arm64 Hash
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_cpuflags.h>
+#include <rte_branch_prediction.h>
+#include <rte_common.h>
+
+static inline uint32_t
+crc32c_arm64_u32(uint32_t data, uint32_t init_val)
+{
+	asm(".arch armv8-a+crc");
+	__asm__ volatile(
+			"crc32cw %w[crc], %w[crc], %w[value]"
+			: [crc] "+r" (init_val)
+			: [value] "r" (data));
+	return init_val;
+}
+
+static inline uint32_t
+crc32c_arm64_u64(uint64_t data, uint32_t init_val)
+{
+	asm(".arch armv8-a+crc");
+	__asm__ volatile(
+			"crc32cx %w[crc], %w[crc], %x[value]"
+			: [crc] "+r" (init_val)
+			: [value] "r" (data));
+	return init_val;
+}
+
+/**
+ * Allow or disallow use of arm64 SIMD intrinsics for CRC32 hash
+ * calculation.
+ *
+ * @param alg
+ *   An OR of following flags:
+ *   - (CRC32_SW) Don't use arm64 crc intrinsics
+ *   - (CRC32_ARM64) Use ARMv8 CRC intrinsic if available
+ *
+ */
+static inline void
+rte_hash_crc_set_alg(uint8_t alg)
+{
+	switch (alg) {
+	case CRC32_ARM64:
+		if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_CRC32))
+			alg = CRC32_SW;
+	case CRC32_SW:
+		crc32_alg = alg;
+	default:
+		break;
+	}
+}
+
+/* Setting the best available algorithm */
+static inline void __attribute__((constructor))
+rte_hash_crc_init_alg(void)
+{
+	rte_hash_crc_set_alg(CRC32_ARM64);
+}
+
+/**
+ * Use single crc32 instruction to perform a hash on a 4 byte value.
+ * Fall back to software crc32 implementation in case arm64 crc intrinsics is
+ * not supported
+ *
+ * @param data
+ *   Data to perform hash on.
+ * @param init_val
+ *   Value to initialise hash generator.
+ * @return
+ *   32bit calculated hash value.
+ */
+static inline uint32_t
+rte_hash_crc_4byte(uint32_t data, uint32_t init_val)
+{
+	if (likely(crc32_alg & CRC32_ARM64))
+		return crc32c_arm64_u32(data, init_val);
+
+	return crc32c_1word(data, init_val);
+}
+
+/**
+ * Use single crc32 instruction to perform a hash on a 8 byte value.
+ * Fall back to software crc32 implementation in case arm64 crc intrinsics is
+ * not supported
+ *
+ * @param data
+ *   Data to perform hash on.
+ * @param init_val
+ *   Value to initialise hash generator.
+ * @return
+ *   32bit calculated hash value.
+ */
+static inline uint32_t
+rte_hash_crc_8byte(uint64_t data, uint32_t init_val)
+{
+	if (likely(crc32_alg == CRC32_ARM64))
+		return crc32c_arm64_u64(data, init_val);
+
+	return crc32c_2words(data, init_val);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CRC_ARM64_H_ */
diff --git a/lib/librte_hash/rte_hash_crc.h b/lib/librte_hash/rte_hash_crc.h
index 1f6f5bf..78a34b7 100644
--- a/lib/librte_hash/rte_hash_crc.h
+++ b/lib/librte_hash/rte_hash_crc.h
@@ -407,9 +407,14 @@ crc32c_sse42_u64(uint64_t data, uint64_t init_val)
 #define CRC32_SSE42         (1U << 1)
 #define CRC32_x64           (1U << 2)
 #define CRC32_SSE42_x64     (CRC32_x64|CRC32_SSE42)
+#define CRC32_ARM64         (1U << 3)
 
 static uint8_t crc32_alg = CRC32_SW;
 
+#if defined(RTE_ARCH_ARM64)
+#include "rte_crc_arm64.h"
+#else
+
 /**
  * Allow or disallow use of SSE4.2 instrinsics for CRC32 hash
  * calculation.
@@ -498,6 +503,8 @@ rte_hash_crc_8byte(uint64_t data, uint32_t init_val)
 	return crc32c_2words(data, init_val);
 }
 
+#endif
+
 /**
  * Calculate CRC32 hash on user-supplied byte array.
  *
diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
index bdf8c6b..8c018a4 100644
--- a/mk/machine/armv8a/rte.vars.mk
+++ b/mk/machine/armv8a/rte.vars.mk
@@ -55,4 +55,4 @@
 # CPU_LDFLAGS =
 # CPU_ASFLAGS =
 
-MACHINE_CFLAGS += -march=armv8-a -DRTE_CACHE_LINE_SIZE=64
+MACHINE_CFLAGS += -march=armv8-a+crc -DRTE_CACHE_LINE_SIZE=64
diff --git a/mk/machine/thunderx/rte.vars.mk b/mk/machine/thunderx/rte.vars.mk
index e49f9e1..0bb6b3d 100644
--- a/mk/machine/thunderx/rte.vars.mk
+++ b/mk/machine/thunderx/rte.vars.mk
@@ -55,4 +55,4 @@ CROSS ?= aarch64-thunderx-linux-gnu-
 # CPU_LDFLAGS =
 # CPU_ASFLAGS =
 
-MACHINE_CFLAGS += -march=armv8-a -mcpu=thunderx -DRTE_CACHE_LINE_SIZE=128
+MACHINE_CFLAGS += -march=armv8-a+crc -mcpu=thunderx -DRTE_CACHE_LINE_SIZE=128
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index bec7bdd..0a340a9 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -111,6 +111,10 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)
 CPUFLAGS += NEON
 endif
 
+ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRC32),)
+CPUFLAGS += CRC32
+endif
+
 
 MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
 
diff --git a/mk/toolchain/gcc/rte.toolchain-compat.mk b/mk/toolchain/gcc/rte.toolchain-compat.mk
index 61bb5b7..e144216 100644
--- a/mk/toolchain/gcc/rte.toolchain-compat.mk
+++ b/mk/toolchain/gcc/rte.toolchain-compat.mk
@@ -54,7 +54,11 @@ else
 # GCC 4.5.x - added support for atom
 # GCC 4.6.x - added support for corei7, corei7-avx
 # GCC 4.7.x - added support for fsgsbase, rdrnd, f16c, core-avx-i, core-avx2
-
+# GCC 4.9.x - added support for armv8-a+crc
+#
+	ifeq ($(shell test $(GCC_VERSION) -le 49 && echo 1), 1)
+		MACHINE_CFLAGS := $(patsubst -march=armv8-a+crc,-march=armv8-a+crc -D__ARM_FEATURE_CRC32=1,$(MACHINE_CFLAGS))
+	endif
 	ifeq ($(shell test $(GCC_VERSION) -le 47 && echo 1), 1)
 		MACHINE_CFLAGS := $(patsubst -march=core-avx-i,-march=corei7-avx,$(MACHINE_CFLAGS))
 		MACHINE_CFLAGS := $(patsubst -march=core-avx2,-march=core-avx2,$(MACHINE_CFLAGS))
-- 
2.1.0


* [dpdk-dev] [PATCH 3/4] hash: select hash function as CRC if armv8-a CRC extension available
  2015-11-23 18:45 [dpdk-dev] [PATCH 0/4] optimize and use armv8 CRC extensions for hash library Jerin Jacob
  2015-11-23 18:45 ` [dpdk-dev] [PATCH 1/4] hash: replace libc memcmp with optimized memory compare functions for arm64 Jerin Jacob
  2015-11-23 18:45 ` [dpdk-dev] [PATCH 2/4] hash: implement rte_hash_crc_* based on armv8-a CRC32 instructions Jerin Jacob
@ 2015-11-23 18:45 ` Jerin Jacob
  2015-11-23 18:45 ` [dpdk-dev] [PATCH 4/4] mk: add xgene1 machine target based on armv8-a Jerin Jacob
  2015-11-25 21:11 ` [dpdk-dev] [PATCH 0/4] optimize and use armv8 CRC extensions for hash library Thomas Monjalon
  4 siblings, 0 replies; 6+ messages in thread
From: Jerin Jacob @ 2015-11-23 18:45 UTC
  To: dev

Select the CRC-based hash function as the default for the cuckoo hash
(rte_hash_crc) and the fbk hash (rte_hash_crc_4byte) if the armv8-a
CRC extension is available.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 lib/librte_hash/rte_cuckoo_hash.c | 2 +-
 lib/librte_hash/rte_fbk_hash.h    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
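
Note for readers: the default hash is chosen at compile time from the
RTE_MACHINE_CPUFLAG_* macros that the build system passes as -D flags.
The sketch below mirrors that preprocessor test in a standalone form.
The demo_* names are hypothetical, and the rte_jhash fallback in the
#else branch is an assumption, since that branch is not visible in the
diff context of this patch.

```c
/*
 * Same condition as the patched #if: CRC is the default when either
 * the x86 SSE4.2 or the armv8-a CRC32 machine flag is defined.
 */
#if defined(RTE_MACHINE_CPUFLAG_SSE4_2) || defined(RTE_MACHINE_CPUFLAG_CRC32)
#define DEMO_DEFAULT_HASH_NAME "rte_hash_crc"
#else
/* Assumed fallback; not shown in the diff context. */
#define DEMO_DEFAULT_HASH_NAME "rte_jhash"
#endif

/* Report which default hash this translation unit was built with. */
static const char *
demo_default_hash_name(void)
{
	return DEMO_DEFAULT_HASH_NAME;
}
```

Building the sketch with -DRTE_MACHINE_CPUFLAG_CRC32 selects the CRC
name, and building it without either flag selects the fallback.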

diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index e6520dd..88f77c3 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -85,7 +85,7 @@ EAL_REGISTER_TAILQ(rte_hash_tailq)
 #endif
 
 /* Hash function used if none is specified */
-#ifdef RTE_MACHINE_CPUFLAG_SSE4_2
+#if defined(RTE_MACHINE_CPUFLAG_SSE4_2) || defined(RTE_MACHINE_CPUFLAG_CRC32)
 #include <rte_hash_crc.h>
 #define DEFAULT_HASH_FUNC       rte_hash_crc
 #else
diff --git a/lib/librte_hash/rte_fbk_hash.h b/lib/librte_hash/rte_fbk_hash.h
index c9b5a6a..a430961 100644
--- a/lib/librte_hash/rte_fbk_hash.h
+++ b/lib/librte_hash/rte_fbk_hash.h
@@ -55,7 +55,7 @@ extern "C" {
 #include <string.h>
 
 #ifndef RTE_FBK_HASH_FUNC_DEFAULT
-#ifdef RTE_MACHINE_CPUFLAG_SSE4_2
+#if defined(RTE_MACHINE_CPUFLAG_SSE4_2) || defined(RTE_MACHINE_CPUFLAG_CRC32)
 #include <rte_hash_crc.h>
 /** Default four-byte key hash function if none is specified. */
 #define RTE_FBK_HASH_FUNC_DEFAULT		rte_hash_crc_4byte
-- 
2.1.0


* [dpdk-dev] [PATCH 4/4] mk: add xgene1 machine target based on armv8-a
  2015-11-23 18:45 [dpdk-dev] [PATCH 0/4] optimize and use armv8 CRC extensions for hash library Jerin Jacob
                   ` (2 preceding siblings ...)
  2015-11-23 18:45 ` [dpdk-dev] [PATCH 3/4] hash: select hash function as CRC if armv8-a CRC extension available Jerin Jacob
@ 2015-11-23 18:45 ` Jerin Jacob
  2015-11-25 21:11 ` [dpdk-dev] [PATCH 0/4] optimize and use armv8 CRC extensions for hash library Thomas Monjalon
  4 siblings, 0 replies; 6+ messages in thread
From: Jerin Jacob @ 2015-11-23 18:45 UTC
  To: dev

Created the new xgene1 machine target to address the difference
in optional armv8-a CRC extension availability compared to the
default armv8-a machine target (which enables the CRC extension by default).

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 config/defconfig_arm64-xgene1-linuxapp-gcc | 56 +++++++++++++++++++++++++++++
 mk/machine/xgene1/rte.vars.mk              | 58 ++++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+)
 create mode 100644 config/defconfig_arm64-xgene1-linuxapp-gcc
 create mode 100644 mk/machine/xgene1/rte.vars.mk

diff --git a/config/defconfig_arm64-xgene1-linuxapp-gcc b/config/defconfig_arm64-xgene1-linuxapp-gcc
new file mode 100644
index 0000000..d75f8f0
--- /dev/null
+++ b/config/defconfig_arm64-xgene1-linuxapp-gcc
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright (C) Cavium networks 2015. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Cavium networks nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="xgene1"
+
+CONFIG_RTE_ARCH="arm64"
+CONFIG_RTE_ARCH_ARM64=y
+CONFIG_RTE_ARCH_64=y
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_FORCE_INTRINSICS=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+CONFIG_RTE_CACHE_LINE_SIZE=64
+
+CONFIG_RTE_IXGBE_INC_VECTOR=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_IVSHMEM=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_TABLE=n
+CONFIG_RTE_LIBRTE_PIPELINE=n
diff --git a/mk/machine/xgene1/rte.vars.mk b/mk/machine/xgene1/rte.vars.mk
new file mode 100644
index 0000000..bdf8c6b
--- /dev/null
+++ b/mk/machine/xgene1/rte.vars.mk
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright (C) Cavium networks 2015. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Cavium networks nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+MACHINE_CFLAGS += -march=armv8-a -DRTE_CACHE_LINE_SIZE=64
-- 
2.1.0


* Re: [dpdk-dev] [PATCH 0/4] optimize and use armv8 CRC extensions for hash library
  2015-11-23 18:45 [dpdk-dev] [PATCH 0/4] optimize and use armv8 CRC extensions for hash library Jerin Jacob
                   ` (3 preceding siblings ...)
  2015-11-23 18:45 ` [dpdk-dev] [PATCH 4/4] mk: add xgene1 machine target based on armv8-a Jerin Jacob
@ 2015-11-25 21:11 ` Thomas Monjalon
  4 siblings, 0 replies; 6+ messages in thread
From: Thomas Monjalon @ 2015-11-25 21:11 UTC
  To: Jerin Jacob; +Cc: dev

2015-11-24 00:15, Jerin Jacob:
> - This patch set contains the changes required to optimise hash library
>   usage on arm64
> - Tested on Juno and ThunderX boards
> - Tested and verified the changes with the following DPDK unit test cases
>         hash_functions_autotest
>         hash_autotest
>         hash_perf_autotest
>         hash_scaling_autotest
> - Created the new xgene1 machine target to address the difference
>   in optional armv8-a CRC extension availability compared to the
>   default armv8-a machine target (which enables the CRC extension by default)
> - Supersedes the "[dpdk-dev] [PATCH] hash: replace libc memcmp with
>   optimized memory compare functions for arm64" patch
> 
> 
> Jerin Jacob (4):
>   hash: replace libc memcmp with optimized memory compare functions for
>     arm64
>   hash: implement rte_hash_crc_* based on armv8-a CRC32 instructions
>   hash: select hash function as CRC if armv8-a CRC extension available
>   mk: add xgene1 machine target based on armv8-a

Applied, thanks

