DPDK patches and discussions
 help / color / mirror / Atom feed
From: Zhiyong Yang <zhiyong.yang@intel.com>
To: dev@dpdk.org
Cc: yuanhan.liu@linux.intel.com, thomas.monjalon@6wind.com,
	bruce.richardson@intel.com, konstantin.ananyev@intel.com,
	pablo.de.lara.guarch@intel.com,
	Zhiyong Yang <zhiyong.yang@intel.com>
Subject: [dpdk-dev] [PATCH v2 1/4] eal/common: introduce rte_memset on IA platform
Date: Tue, 27 Dec 2016 18:04:55 +0800	[thread overview]
Message-ID: <1482833098-38096-2-git-send-email-zhiyong.yang@intel.com> (raw)
In-Reply-To: <1482833098-38096-1-git-send-email-zhiyong.yang@intel.com>

Performance drop has been caused in some cases when DPDK code calls glibc
function memset. please reference to discussions about memset in
http://dpdk.org/ml/archives/dev/2016-October/048628.html
It is necessary to introduce more high efficient function to fix it.
One important thing about rte_memset is that we can get clear control
on what instruction flow is used. This patch supports instruction sets
such as sse & avx(128 bits), avx2(256 bits) and avx512(512bits).
rte_memset makes full use of vectorization and inline function to improve
the perf on IA. In addition, cache line and memory alignment are fully
taken into consideration.

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
---

Changes in V2:

Rename rte_memset.h -> rte_memset_64.h and create a file rte_memset.h
for each arch.

 .../common/include/arch/arm/rte_memset.h           |  36 ++
 .../common/include/arch/ppc_64/rte_memset.h        |  36 ++
 .../common/include/arch/tile/rte_memset.h          |  36 ++
 .../common/include/arch/x86/rte_memset.h           |  51 +++
 .../common/include/arch/x86/rte_memset_64.h        | 378 +++++++++++++++++++++
 lib/librte_eal/common/include/generic/rte_memset.h |  52 +++
 6 files changed, 589 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memset.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_memset.h
 create mode 100644 lib/librte_eal/common/include/arch/tile/rte_memset.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memset.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memset_64.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_memset.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memset.h b/lib/librte_eal/common/include/arch/arm/rte_memset.h
new file mode 100644
index 0000000..6945f6d
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memset.h
@@ -0,0 +1,36 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of RehiveTech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMSET_ARM_H_
+#define _RTE_MEMSET_ARM_H_
+
+#define rte_memset memset
+
+#endif /* _RTE_MEMSET_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_memset.h b/lib/librte_eal/common/include/arch/ppc_64/rte_memset.h
new file mode 100644
index 0000000..0d73f05
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_memset.h
@@ -0,0 +1,36 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_MEMSET_PPC_64_H_
+#define _RTE_MEMSET_PPC_64_H_
+
+#define rte_memset memset
+
+#endif /* _RTE_MEMSET_PPC_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/tile/rte_memset.h b/lib/librte_eal/common/include/arch/tile/rte_memset.h
new file mode 100644
index 0000000..e8a1aa1
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/tile/rte_memset.h
@@ -0,0 +1,36 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of EZchip Semiconductor nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_MEMSET_TILE_H_
+#define _RTE_MEMSET_TILE_H_
+
+#define rte_memset memset
+
+#endif /* _RTE_MEMSET_TILE_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_memset.h b/lib/librte_eal/common/include/arch/x86/rte_memset.h
new file mode 100644
index 0000000..86e0812
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_memset.h
@@ -0,0 +1,51 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMSET_X86_H_
+#define _RTE_MEMSET_X86_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef RTE_ARCH_X86_64
+#include "rte_memset_64.h"
+#else
+#define rte_memset memset
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMSET_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_memset_64.h b/lib/librte_eal/common/include/arch/x86/rte_memset_64.h
new file mode 100644
index 0000000..f25d344
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_memset_64.h
@@ -0,0 +1,378 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMSET_X86_64_H_
+#define _RTE_MEMSET_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file
+ *
+ * Functions for vectorised implementation of memset().
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <string.h>
+#include <rte_vect.h>
+
+static inline void *
+rte_memset(void *dst, int a, size_t n) __attribute__((always_inline));
+
+static inline void
+rte_memset_less16(void *dst, int a, size_t n)
+{
+	uintptr_t dstu = (uintptr_t)dst;
+
+	if (n >= 8) {
+		uint16_t b = ((uint8_t)a | (((uint8_t)a) << 8));
+		uint16_t c = ((uint8_t)a | (((uint8_t)a) << 8));
+		uint32_t d = b | c << 16;
+		uint64_t e = d | ((uint64_t)d << 32);
+
+		*(uint64_t *)dstu = e;
+		*(uint64_t *)((uint8_t *)dstu + n - 8) = e;
+	} else {
+		if (n & 0x01) {
+			*(uint8_t *)dstu = (uint8_t)a;
+			dstu = (uintptr_t)((uint8_t *)dstu + 1);
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dstu = (uint8_t)a | (((uint8_t)a) << 8);
+			dstu = (uintptr_t)((uint16_t *)dstu + 1);
+		}
+		if (n & 0x04) {
+			uint16_t b = ((uint8_t)a | (((uint8_t)a) << 8));
+
+			*(uint32_t *)dstu = (uint32_t)(b | (b << 16));
+			dstu = (uintptr_t)((uint32_t *)dstu + 1);
+		}
+	}
+}
+
+static inline void
+rte_memset16(uint8_t *dst, int8_t a)
+{
+	__m128i xmm0;
+
+	xmm0 = _mm_set1_epi8(a);
+	_mm_storeu_si128((__m128i *)dst, xmm0);
+}
+
+static inline void
+rte_memset_17to32(void *dst, int a, size_t n)
+{
+	rte_memset16((uint8_t *)dst, a);
+	rte_memset16((uint8_t *)dst - 16 + n, a);
+}
+
+#ifdef RTE_MACHINE_CPUFLAG_AVX512
+
+/**
+ * AVX512 implementation below
+ */
+
+static inline void
+rte_memset32(uint8_t *dst, int8_t a)
+{
+	__m256i ymm0;
+
+	ymm0 = _mm256_set1_epi8(a);
+	_mm256_storeu_si256((__m256i *)dst, ymm0);
+}
+
+static inline void
+rte_memset64(uint8_t *dst, int8_t a)
+{
+	__m512i zmm0;
+
+	zmm0 = _mm512_set1_epi8(a);
+	_mm512_storeu_si512((void *)dst, zmm0);
+}
+
+static inline void
+rte_memset128blocks(uint8_t *dst, int8_t a, size_t n)
+{
+	__m512i zmm0;
+
+	zmm0 = _mm512_set1_epi8(a);
+	while (n >= 128) {
+		n -= 128;
+		_mm512_store_si512((void *)(dst + 0 * 64), zmm0);
+		_mm512_store_si512((void *)(dst + 1 * 64), zmm0);
+		dst = dst + 128;
+	}
+}
+
+static inline void *
+rte_memset(void *dst, int a, size_t n)
+{
+	void *ret = dst;
+	size_t dstofss;
+	size_t bits;
+
+	if (n < 16) {
+		rte_memset_less16(dst, a, n);
+		return ret;
+	} else if (n == 16) {
+		rte_memset16((uint8_t *)dst, a);
+		return ret;
+	}
+	if (n <= 32) {
+		rte_memset_17to32(dst, a, n);
+		return ret;
+	}
+	if (n <= 64) {
+		rte_memset32((uint8_t *)dst, a);
+		rte_memset32((uint8_t *)dst - 32 + n, a);
+		return ret;
+	}
+	if (n >= 256) {
+		dstofss = ((uintptr_t)dst & 0x3F);
+		if (dstofss > 0) {
+			dstofss = 64 - dstofss;
+			n -= dstofss;
+			rte_memset64((uint8_t *)dst, a);
+			dst = (uint8_t *)dst + dstofss;
+		}
+		rte_memset128blocks((uint8_t *)dst, a, n);
+		bits = n;
+		n = n & 127;
+		bits -= n;
+		dst = (uint8_t *)dst + bits;
+	}
+	if (n > 128) {
+		n -= 128;
+		rte_memset64((uint8_t *)dst, a);
+		rte_memset64((uint8_t *)dst + 64, a);
+		dst = (uint8_t *)dst + 128;
+	}
+	if (n > 64) {
+		rte_memset64((uint8_t *)dst, a);
+		rte_memset64((uint8_t *)dst - 64 + n, a);
+		return ret;
+	}
+	if (n > 0)
+		rte_memset64((uint8_t *)dst - 64 + n, a);
+	return ret;
+}
+
+#elif defined RTE_MACHINE_CPUFLAG_AVX2
+
+/**
+ *  AVX2 implementation below
+ */
+
+static inline void
+rte_memset32(uint8_t *dst, int8_t a)
+{
+	__m256i ymm0;
+
+	ymm0 = _mm256_set1_epi8(a);
+	_mm256_storeu_si256((__m256i *)dst, ymm0);
+}
+
+static inline void
+rte_memset_33to64(void *dst, int a, size_t n)
+{
+	rte_memset32((uint8_t *)dst, a);
+	rte_memset32((uint8_t *)dst - 32 + n, a);
+}
+
+static inline void
+rte_memset64blocks(uint8_t *dst, int8_t a, size_t n)
+{
+	__m256i ymm0;
+
+	ymm0 = _mm256_set1_epi8(a);
+	while (n >= 64) {
+		n -= 64;
+		_mm256_store_si256((__m256i *)((uint8_t *)dst + 0 * 32), ymm0);
+		_mm256_store_si256((__m256i *)((uint8_t *)dst + 1 * 32), ymm0);
+		dst = (uint8_t *)dst + 64;
+
+	}
+}
+
+static inline void *
+rte_memset(void *dst, int a, size_t n)
+{
+	void *ret = dst;
+	size_t dstofss;
+	size_t bits;
+
+	if (n < 16) {
+		rte_memset_less16(dst, a, n);
+		return ret;
+	} else if (n == 16) {
+		rte_memset16((uint8_t *)dst, a);
+		return ret;
+	}
+	if (n <= 32) {
+		rte_memset_17to32(dst, a, n);
+		return ret;
+	}
+	if (n <= 64) {
+		rte_memset_33to64(dst, a, n);
+		return ret;
+	}
+	if (n > 64) {
+		dstofss = (uintptr_t)dst & 0x1F;
+		if (dstofss > 0) {
+			dstofss = 32 - dstofss;
+			n -= dstofss;
+			rte_memset32((uint8_t *)dst, a);
+			dst = (uint8_t *)dst + dstofss;
+		}
+		rte_memset64blocks((uint8_t *)dst, a, n);
+		bits = n;
+		n = n & 63;
+		bits -= n;
+		dst = (uint8_t *)dst + bits;
+	}
+	if (n > 32) {
+		rte_memset_33to64(dst, a, n);
+		return ret;
+	}
+	if (n > 0)
+		rte_memset32((uint8_t *)dst - 32 + n, a);
+	return ret;
+}
+
+#else /* RTE_MACHINE_CPUFLAG */
+
+/**
+ * SSE && AVX implementation below
+ */
+
+static inline void
+rte_memset32(uint8_t *dst, int8_t a)
+{
+	__m128i xmm0 = _mm_set1_epi8(a);
+
+	_mm_storeu_si128((__m128i *)dst, xmm0);
+	_mm_storeu_si128((__m128i *)(dst + 16), xmm0);
+}
+
+static inline void
+rte_memset16blocks(uint8_t *dst, int8_t a, size_t n)
+{
+	__m128i xmm0 = _mm_set1_epi8(a);
+
+	while (n >= 16) {
+		n -= 16;
+		_mm_store_si128((__m128i *)(dst + 0 * 16), xmm0);
+		dst = (uint8_t *)dst + 16;
+	}
+}
+
+static inline void
+rte_memset64blocks(uint8_t *dst, int8_t a, size_t n)
+{
+	__m128i xmm0 = _mm_set1_epi8(a);
+
+	while (n >= 64) {
+		n -= 64;
+		_mm_store_si128((__m128i *)(dst + 0 * 16), xmm0);
+		_mm_store_si128((__m128i *)(dst + 1 * 16), xmm0);
+		_mm_store_si128((__m128i *)(dst + 2 * 16), xmm0);
+		_mm_store_si128((__m128i *)(dst + 3 * 16), xmm0);
+		dst = (uint8_t *)dst + 64;
+	}
+}
+
+static inline void *
+rte_memset(void *dst, int a, size_t n)
+{
+	void *ret = dst;
+	size_t dstofss;
+	size_t bits;
+
+	if (n < 16) {
+		rte_memset_less16(dst, a, n);
+		return ret;
+	} else if (n == 16) {
+		rte_memset16((uint8_t *)dst, a);
+		return ret;
+	}
+	if (n <= 32) {
+		rte_memset_17to32(dst, a, n);
+		return ret;
+	}
+	if (n <= 48) {
+		rte_memset32((uint8_t *)dst, a);
+		rte_memset16((uint8_t *)dst - 16 + n, a);
+		return ret;
+	}
+	if (n <= 64) {
+		rte_memset32((uint8_t *)dst, a);
+		rte_memset16((uint8_t *)dst + 32, a);
+		rte_memset16((uint8_t *)dst - 16 + n, a);
+		return ret;
+	}
+	if (n > 64) {
+		dstofss = (uintptr_t)dst & 0xF;
+		if (dstofss > 0) {
+			dstofss = 16 - dstofss;
+			n -= dstofss;
+			rte_memset16((uint8_t *)dst, a);
+			dst = (uint8_t *)dst + dstofss;
+		}
+		rte_memset64blocks((uint8_t *)dst, a, n);
+		bits = n;
+		n &= 63;
+		bits -= n;
+		dst = (uint8_t *)dst + bits;
+		rte_memset16blocks((uint8_t *)dst, a, n);
+		bits = n;
+		n &= 0xf;
+		bits -= n;
+		dst = (uint8_t *)dst + bits;
+		if (n > 0) {
+			rte_memset16((uint8_t *)dst - 16 + n, a);
+			return ret;
+		}
+	}
+	return ret;
+}
+
+#endif /* RTE_MACHINE_CPUFLAG */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMSET_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_memset.h b/lib/librte_eal/common/include/generic/rte_memset.h
new file mode 100644
index 0000000..b03a7d0
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_memset.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMSET_H_
+#define _RTE_MEMSET_H_
+
+/**
+ * @file
+ *
+ * Functions for vectorised implementation of memset().
+ */
+#ifdef _RTE_MEMSET_X86_64_H_
+
+static void *
+rte_memset(void *dst, int a, size_t n);
+
+#else
+
+#define rte_memset memset
+
+#endif
+#endif /* _RTE_MEMSET_H_ */
-- 
2.7.4

  reply	other threads:[~2016-12-27 10:05 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-05  8:26 [dpdk-dev] [PATCH 0/4] eal/common: introduce rte_memset and related test Zhiyong Yang
2016-12-02 10:00 ` Maxime Coquelin
2016-12-06  6:33   ` Yang, Zhiyong
2016-12-06  8:29     ` Maxime Coquelin
2016-12-07  9:28       ` Yang, Zhiyong
2016-12-07  9:37         ` Yuanhan Liu
2016-12-07  9:43           ` Yang, Zhiyong
2016-12-07  9:48             ` Yuanhan Liu
2016-12-05  8:26 ` [dpdk-dev] [PATCH 1/4] eal/common: introduce rte_memset on IA platform Zhiyong Yang
2016-12-02 10:25   ` Thomas Monjalon
2016-12-08  7:41     ` Yang, Zhiyong
2016-12-08  9:26       ` Ananyev, Konstantin
2016-12-08  9:53         ` Yang, Zhiyong
2016-12-08 10:27           ` Bruce Richardson
2016-12-08 10:30           ` Ananyev, Konstantin
2016-12-11 12:32             ` Yang, Zhiyong
2016-12-15  6:51               ` Yang, Zhiyong
2016-12-15 10:12                 ` Bruce Richardson
2016-12-16 10:19                   ` Yang, Zhiyong
2016-12-19  6:27                     ` Yuanhan Liu
2016-12-20  2:41                       ` Yao, Lei A
2016-12-15 10:53                 ` Ananyev, Konstantin
2016-12-16  2:15                   ` Yang, Zhiyong
2016-12-16 11:47                     ` Ananyev, Konstantin
2016-12-20  9:31                       ` Yang, Zhiyong
2016-12-08 15:09       ` Thomas Monjalon
2016-12-11 12:04         ` Yang, Zhiyong
2016-12-27 10:04   ` [dpdk-dev] [PATCH v2 0/4] eal/common: introduce rte_memset and related test Zhiyong Yang
2016-12-27 10:04     ` Zhiyong Yang [this message]
2016-12-27 10:04     ` [dpdk-dev] [PATCH v2 2/4] app/test: add functional autotest for rte_memset Zhiyong Yang
2016-12-27 10:04     ` [dpdk-dev] [PATCH v2 3/4] app/test: add performance " Zhiyong Yang
2016-12-27 10:04     ` [dpdk-dev] [PATCH v2 4/4] lib/librte_vhost: improve vhost perf using rte_memset Zhiyong Yang
2017-01-09  9:48     ` [dpdk-dev] [PATCH v2 0/4] eal/common: introduce rte_memset and related test Yang, Zhiyong
2017-01-17  6:24       ` Yang, Zhiyong
2017-01-17 20:14         ` Thomas Monjalon
2017-01-18  0:15           ` Vincent JARDIN
2017-01-18  2:42           ` Yang, Zhiyong
2017-01-18  7:42             ` Thomas Monjalon
2017-01-19  1:36               ` Yang, Zhiyong
2016-12-05  8:26 ` [dpdk-dev] [PATCH 2/4] app/test: add functional autotest for rte_memset Zhiyong Yang
2016-12-05  8:26 ` [dpdk-dev] [PATCH 3/4] app/test: add performance " Zhiyong Yang
2016-12-05  8:26 ` [dpdk-dev] [PATCH 4/4] lib/librte_vhost: improve vhost perf using rte_memset Zhiyong Yang
2016-12-02  9:46   ` Thomas Monjalon
2016-12-06  8:04     ` Yang, Zhiyong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1482833098-38096-2-git-send-email-zhiyong.yang@intel.com \
    --to=zhiyong.yang@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=pablo.de.lara.guarch@intel.com \
    --cc=thomas.monjalon@6wind.com \
    --cc=yuanhan.liu@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).