DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK
@ 2014-10-16 10:44 Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 1/7] Split atomic operations to architecture specific Chao Zhu
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Chao Zhu @ 2014-10-16 10:44 UTC (permalink / raw)
  To: dev

The set of patches split x86 architecture specific operations from DPDK and put them to the
arch directories of i686 and x86_64 architecture. This will make the adpotion of DPDK much easier
on other computer architecture. For a new architecture, just add an architecture specific
directory and necessary building configuration files, then DPDK eal library can support it. 
This is an upgrade version of the former patch.

Chao Zhu (7):
  Split atomic operations to architecture specific
  Split byte order operations to architecture specific
  Split CPU cycle operation to architecture specific
  Split prefetch operations to architecture specific
  Split spinlock operations to architecture specific
  Split memcpy operation to architecture specific
  Split CPU flags operations to architecture specific

 lib/librte_eal/common/Makefile                     |   21 +-
 lib/librte_eal/common/eal_common_cpuflags.c        |  190 ----
 .../common/include/arch/i686/rte_atomic.h          |  669 ++++++++++++
 .../common/include/arch/i686/rte_byteorder.h       |  194 ++++
 .../common/include/arch/i686/rte_cpuflags.h        |  364 +++++++
 .../common/include/arch/i686/rte_cycles.h          |  158 +++
 .../common/include/arch/i686/rte_memcpy.h          |  376 +++++++
 .../common/include/arch/i686/rte_prefetch.h        |   88 ++
 .../common/include/arch/i686/rte_spinlock.h        |  180 ++++
 .../common/include/arch/x86_64/rte_atomic.h        |  631 +++++++++++
 .../common/include/arch/x86_64/rte_byteorder.h     |  195 ++++
 .../common/include/arch/x86_64/rte_cpuflags.h      |  364 +++++++
 .../common/include/arch/x86_64/rte_cycles.h        |  158 +++
 .../common/include/arch/x86_64/rte_memcpy.h        |  376 +++++++
 .../common/include/arch/x86_64/rte_prefetch.h      |   88 ++
 .../common/include/arch/x86_64/rte_spinlock.h      |  180 ++++
 lib/librte_eal/common/include/generic/rte_atomic.h |  795 ++++++++++++++
 .../common/include/generic/rte_byteorder.h         |  124 +++
 lib/librte_eal/common/include/generic/rte_cycles.h |  190 ++++
 .../common/include/generic/rte_spinlock.h          |  169 +++
 .../common/include/i686/arch/rte_atomic.h          |  373 -------
 lib/librte_eal/common/include/rte_atomic.h         | 1133 --------------------
 lib/librte_eal/common/include/rte_byteorder.h      |  270 -----
 lib/librte_eal/common/include/rte_cpuflags.h       |  182 ----
 lib/librte_eal/common/include/rte_cycles.h         |  266 -----
 lib/librte_eal/common/include/rte_memcpy.h         |  376 -------
 lib/librte_eal/common/include/rte_prefetch.h       |   88 --
 lib/librte_eal/common/include/rte_spinlock.h       |  258 -----
 .../common/include/x86_64/arch/rte_atomic.h        |  335 ------
 29 files changed, 5311 insertions(+), 3480 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_byteorder.h
 delete mode 100644 lib/librte_eal/common/include/rte_cpuflags.h
 delete mode 100644 lib/librte_eal/common/include/rte_cycles.h
 delete mode 100644 lib/librte_eal/common/include/rte_memcpy.h
 delete mode 100644 lib/librte_eal/common/include/rte_prefetch.h
 delete mode 100644 lib/librte_eal/common/include/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic.h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v2 1/7] Split atomic operations to architecture specific
  2014-10-16 10:44 [dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK Chao Zhu
@ 2014-10-16 10:44 ` Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 2/7] Split byte order " Chao Zhu
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Chao Zhu @ 2014-10-16 10:44 UTC (permalink / raw)
  To: dev

This patch first add architecture specific directories to eal header
file directory. Then split the atomic operations to architecture specific and
generic files. Architecture specific files are put into the
corresponding architecture directory and common header are put into
generic directory.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
---
 lib/librte_eal/common/Makefile                     |   11 +-
 .../common/include/arch/i686/rte_atomic.h          |  669 ++++++++++++
 .../common/include/arch/x86_64/rte_atomic.h        |  631 +++++++++++
 lib/librte_eal/common/include/generic/rte_atomic.h |  795 ++++++++++++++
 .../common/include/i686/arch/rte_atomic.h          |  373 -------
 lib/librte_eal/common/include/rte_atomic.h         | 1133 --------------------
 .../common/include/x86_64/arch/rte_atomic.h        |  335 ------
 7 files changed, 2102 insertions(+), 1845 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 7f27966..8ab363b 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -31,7 +31,7 @@
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-INC := rte_atomic.h rte_branch_prediction.h rte_byteorder.h rte_common.h
+INC := rte_branch_prediction.h rte_byteorder.h rte_common.h
 INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
@@ -46,11 +46,14 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif
 
-ARCH_INC := rte_atomic.h
+GENERIC_INC := rte_atomic.h
+ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
-SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/arch := \
-	$(addprefix include/$(RTE_ARCH)/arch/,$(ARCH_INC))
+SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
+	$(addprefix include/arch/$(RTE_ARCH)/,$(ARCH_INC))
+SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/generic := \
+	$(addprefix include/generic/,$(GENERIC_INC))
 
 # add libc if configured
 DEPDIRS-$(CONFIG_RTE_LIBC) += lib/libc
diff --git a/lib/librte_eal/common/include/arch/i686/rte_atomic.h b/lib/librte_eal/common/include/arch/i686/rte_atomic.h
new file mode 100644
index 0000000..67efb19
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_atomic.h
@@ -0,0 +1,669 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Inspired from FreeBSD src/sys/i386/include/atomic.h
+ * Copyright (c) 1998 Doug Rabson
+ * All rights reserved.
+ */
+
+#ifndef _RTE_ATOMIC_I686_H_
+#define _RTE_ATOMIC_I686_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <emmintrin.h>
+#include "generic/rte_atomic.h"
+
+/**
+ * @file
+ * Atomic Operations on i686
+ */
+
+#if RTE_MAX_LCORE == 1
+#define MPLOCKED                        /**< No need to insert MP lock prefix. */
+#else
+#define MPLOCKED        "lock ; "       /**< Insert MP lock prefix. */
+#endif
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define	rte_mb() _mm_mfence()
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define	rte_wmb() _mm_sfence()
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define	rte_rmb() _mm_lfence()
+
+#ifndef RTE_FORCE_INTRINSICS
+ /*------------------------- 16 bit atomic operations -------------------------*/
+
+/**
+ * Atomic compare and set.
+ *
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 16-bit words)
+ *
+ * @param dst
+ *   The destination location into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgw %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+/**
+ * Atomically test and set a 16-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically increment a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically decrement a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically increment a 16-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the increment operation is 0; false otherwise.
+ */
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/**
+ * Atomically decrement a 16-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the decrement operation is 0; false otherwise.
+ */
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+/**
+ * Atomic compare and set.
+ *
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 32-bit words)
+ *
+ * @param dst
+ *   The destination location into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgl %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+/**
+ * Atomically test and set a 32-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically increment a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically decrement a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically increment a 32-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the increment operation is 0; false otherwise.
+ */
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/**
+ * Atomically decrement a 32-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the decrement operation is 0; false otherwise.
+ */
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+/**
+ * An atomic compare and set function used by the mutex functions.
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 64-bit words)
+ *
+ * @param dst
+ *   The destination into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	uint8_t res;
+	union {
+		struct {
+			uint32_t l32;
+			uint32_t h32;
+		};
+		uint64_t u64;
+	} _exp, _src;
+
+	_exp.u64 = exp;
+	_src.u64 = src;
+
+#ifndef __PIC__
+    asm volatile (
+            MPLOCKED
+            "cmpxchg8b (%[dst]);"
+            "setz %[res];"
+            : [res] "=a" (res)      /* result in eax */
+            : [dst] "S" (dst),      /* esi */
+             "b" (_src.l32),       /* ebx */
+             "c" (_src.h32),       /* ecx */
+             "a" (_exp.l32),       /* eax */
+             "d" (_exp.h32)        /* edx */
+			: "memory" );           /* no-clobber list */
+#else
+	asm volatile (
+            "mov %%ebx, %%edi\n"
+			MPLOCKED
+			"cmpxchg8b (%[dst]);"
+			"setz %[res];"
+            "xchgl %%ebx, %%edi;\n"
+			: [res] "=a" (res)      /* result in eax */
+			: [dst] "S" (dst),      /* esi */
+			  "D" (_src.l32),       /* ebx */
+			  "c" (_src.h32),       /* ecx */
+			  "a" (_exp.l32),       /* eax */
+			  "d" (_exp.h32)        /* edx */
+			: "memory" );           /* no-clobber list */
+#endif
+
+	return res;
+}
+
+/**
+ * Initialize the atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, 0);
+	}
+}
+
+/**
+ * Atomically read a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		/* replace the value by itself */
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp);
+	}
+	return tmp;
+}
+
+/**
+ * Atomically set a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value of the counter.
+ */
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, new_value);
+	}
+}
+
+/**
+ * Atomically add a 64-bit value to a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp + inc);
+	}
+}
+
+/**
+ * Atomically subtract a 64-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp - dec);
+	}
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	rte_atomic64_add(v, 1);
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	rte_atomic64_sub(v, 1);
+}
+
+/**
+ * Add a 64-bit value to an atomic counter and return the result.
+ *
+ * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
+ * returns the value of v after the addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp + inc);
+	}
+
+
+	return tmp + inc;
+}
+
+/**
+ * Subtract a 64-bit value from an atomic counter and return the result.
+ *
+ * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
+ * and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp - dec);
+	}
+
+	return tmp - dec;
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns
+ * true if the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the addition is 0; false otherwise.
+ */
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_add_return(v, 1) == 0;
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after subtraction is 0; false otherwise.
+ */
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_sub_return(v, 1) == 0;
+}
+
+/**
+ * Atomically test and set a 64-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	rte_atomic64_set(v, 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h b/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
new file mode 100644
index 0000000..7a3bc35
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
@@ -0,0 +1,631 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Inspired from FreeBSD src/sys/amd64/include/atomic.h
+ * Copyright (c) 1998 Doug Rabson
+ * All rights reserved.
+ */
+
+#ifndef _RTE_ATOMIC_X86_64_H_
+#define _RTE_ATOMIC_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <emmintrin.h>
+#include "generic/rte_atomic.h"
+
+#if RTE_MAX_LCORE == 1
+#define MPLOCKED                        /**< No need to insert MP lock prefix. */
+#else
+#define MPLOCKED        "lock ; "       /**< Insert MP lock prefix. */
+#endif
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define	rte_mb() _mm_mfence()
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define	rte_wmb() _mm_sfence()
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define	rte_rmb() _mm_lfence()
+
+#ifndef RTE_FORCE_INTRINSICS
+ /*------------------------- 16 bit atomic operations -------------------------*/
+
+/**
+ * Atomic compare and set.
+ *
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 16-bit words)
+ *
+ * @param dst
+ *   The destination location into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgw %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+/**
+ * Atomically test and set a 16-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically increment a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically decrement a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically increment a 16-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the increment operation is 0; false otherwise.
+ */
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/**
+ * Atomically decrement a 16-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the decrement operation is 0; false otherwise.
+ */
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+/**
+ * Atomic compare and set.
+ *
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 32-bit words)
+ *
+ * @param dst
+ *   The destination location into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgl %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+/**
+ * Atomically test and set a 32-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically increment a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically decrement a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically increment a 32-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the increment operation is 0; false otherwise.
+ */
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/**
+ * Atomically decrement a 32-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the decrement operation is 0; false otherwise.
+ */
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+/**
+ * An atomic compare and set function used by the mutex functions.
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 64-bit words)
+ *
+ * @param dst
+ *   The destination into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	uint8_t res;
+
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgq %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+
+	return res;
+}
+
+/**
+ * Initialize the atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	v->cnt = 0;
+}
+/**
+ * Atomically read a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	return v->cnt;
+}
+
+/**
+ * Atomically set a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value of the counter.
+ */
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	v->cnt = new_value;
+}
+
+/**
+ * Atomically add a 64-bit value to a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	asm volatile(
+			MPLOCKED
+			"addq %[inc], %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: [inc] "ir" (inc),     /* input */
+			  "m" (v->cnt)
+			);
+}
+
+/**
+ * Atomically subtract a 64-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	asm volatile(
+			MPLOCKED
+			"subq %[dec], %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: [dec] "ir" (dec),     /* input */
+			  "m" (v->cnt)
+			);
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incq %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decq %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Add a 64-bit value to an atomic counter and return the result.
+ *
+ * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
+ * returns the value of v after the addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	int64_t prev = inc;
+
+	asm volatile(
+			MPLOCKED
+			"xaddq %[prev], %[cnt]"
+			: [prev] "+r" (prev),   /* output */
+			  [cnt] "=m" (v->cnt)
+			: "m" (v->cnt)          /* input */
+			);
+	return prev + inc;
+}
+
+/**
+ * Subtract a 64-bit value from an atomic counter and return the result.
+ *
+ * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
+ * and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	return rte_atomic64_add_return(v, -dec);
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns
+ * true if the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the addition is 0; false otherwise.
+ */
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incq %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt), /* output */
+			  [ret] "=qm" (ret)
+			);
+
+	return ret != 0;
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after subtraction is 0; false otherwise.
+ */
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"decq %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return ret != 0;
+}
+
+/**
+ * Atomically test and set a 64-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	v->cnt = 0;
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_atomic.h b/lib/librte_eal/common/include/generic/rte_atomic.h
new file mode 100644
index 0000000..ff7bf7a
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_atomic.h
@@ -0,0 +1,795 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_H_
+#define _RTE_ATOMIC_H_
+
+/**
+ * @file
+ * Atomic Operations
+ *
+ * This file defines a generic API for atomic operations. 
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+/**
+ * Compiler barrier.
+ *
+ * Guarantees that operation reordering does not occur at compile time
+ * for operations directly before and after the barrier.
+ */
+#define	rte_compiler_barrier() do {		\
+	asm volatile ("" : : : "memory");	\
+} while(0)
+
+/**
+ * @file
+ * Atomic Operations on x86_64
+ */
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+/**
+ * The atomic counter structure.
+ */
+typedef struct {
+	volatile int16_t cnt; /**< An internal counter value. */
+} rte_atomic16_t;
+
+#ifdef RTE_FORCE_INTRINSICS
+/**
+ * Atomic compare and set.
+ *
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 16-bit words)
+ *
+ * @param dst
+ *   The destination location into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	return __sync_bool_compare_and_swap(dst, exp, src);
+}
+
+/**
+ * Atomically test and set a 16-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+#endif
+
+/**
+ * Static initializer for an atomic counter.
+ */
+#define RTE_ATOMIC16_INIT(val) { (val) }
+
+/**
+ * Initialize an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_init(rte_atomic16_t *v)
+{
+	v->cnt = 0;
+}
+
+/**
+ * Atomically read a 16-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int16_t
+rte_atomic16_read(const rte_atomic16_t *v)
+{
+	return v->cnt;
+}
+
+/**
+ * Atomically set a counter to a 16-bit value.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value for the counter.
+ */
+static inline void
+rte_atomic16_set(rte_atomic16_t *v, int16_t new_value)
+{
+	v->cnt = new_value;
+}
+
+/**
+ * Atomically add a 16-bit value to an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic16_add(rte_atomic16_t *v, int16_t inc)
+{
+	__sync_fetch_and_add(&v->cnt, inc);
+}
+
+/**
+ * Atomically subtract a 16-bit value from an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
+{
+	__sync_fetch_and_sub(&v->cnt, dec);
+}
+
+#ifdef RTE_FORCE_INTRINSICS
+/**
+ * Atomically increment a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	rte_atomic16_add(v, 1);
+}
+
+/**
+ * Atomically decrement a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	rte_atomic16_sub(v, 1);
+}
+#endif
+
+/**
+ * Atomically add a 16-bit value to a counter and return the result.
+ *
+ * Atomically adds the 16-bits value (inc) to the atomic counter (v) and
+ * returns the value of v after addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int16_t
+rte_atomic16_add_return(rte_atomic16_t *v, int16_t inc)
+{
+	return __sync_add_and_fetch(&v->cnt, inc);
+}
+
+/**
+ * Atomically subtract a 16-bit value from a counter and return
+ * the result.
+ *
+ * Atomically subtracts the 16-bit value (inc) from the atomic counter
+ * (v) and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int16_t
+rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
+{
+	return __sync_sub_and_fetch(&v->cnt, dec);
+}
+
+#ifdef RTE_FORCE_INTRINSICS
+/**
+ * Atomically increment a 16-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the increment operation is 0; false otherwise.
+ */
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	return (__sync_add_and_fetch(&v->cnt, 1) == 0);
+}
+
+/**
+ * Atomically decrement a 16-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the decrement operation is 0; false otherwise.
+ */
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	return (__sync_sub_and_fetch(&v->cnt, 1) == 0);
+}
+#endif
+
+/**
+ * Atomically set a 16-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic16_clear(rte_atomic16_t *v)
+{
+	v->cnt = 0;
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+/**
+ * The atomic counter structure.
+ */
+typedef struct {
+	volatile int32_t cnt; /**< An internal counter value. */
+} rte_atomic32_t;
+
+#ifdef RTE_FORCE_INTRINSICS
+/**
+ * Atomic compare and set.
+ *
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 32-bit words)
+ *
+ * @param dst
+ *   The destination location into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	return __sync_bool_compare_and_swap(dst, exp, src);
+}
+
+/**
+ * Atomically test and set a 32-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+#endif
+
+/**
+ * Static initializer for an atomic counter.
+ */
+#define RTE_ATOMIC32_INIT(val) { (val) }
+
+/**
+ * Initialize an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_init(rte_atomic32_t *v)
+{
+	v->cnt = 0;
+}
+
+/**
+ * Atomically read a 32-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int32_t
+rte_atomic32_read(const rte_atomic32_t *v)
+{
+	return v->cnt;
+}
+
+/**
+ * Atomically set a counter to a 32-bit value.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value for the counter.
+ */
+static inline void
+rte_atomic32_set(rte_atomic32_t *v, int32_t new_value)
+{
+	v->cnt = new_value;
+}
+
+/**
+ * Atomically add a 32-bit value to an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic32_add(rte_atomic32_t *v, int32_t inc)
+{
+	__sync_fetch_and_add(&v->cnt, inc);
+}
+
+/**
+ * Atomically subtract a 32-bit value from an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
+{
+	__sync_fetch_and_sub(&v->cnt, dec);
+}
+
+#ifdef RTE_FORCE_INTRINSICS
+/**
+ * Atomically increment a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	rte_atomic32_add(v, 1);
+}
+
+/**
+ * Atomically decrement a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	rte_atomic32_sub(v,1);
+}
+#endif
+
+/**
+ * Atomically add a 32-bit value to a counter and return the result.
+ *
+ * Atomically adds the 32-bits value (inc) to the atomic counter (v) and
+ * returns the value of v after addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int32_t
+rte_atomic32_add_return(rte_atomic32_t *v, int32_t inc)
+{
+	return __sync_add_and_fetch(&v->cnt, inc);
+}
+
+/**
+ * Atomically subtract a 32-bit value from a counter and return
+ * the result.
+ *
+ * Atomically subtracts the 32-bit value (inc) from the atomic counter
+ * (v) and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int32_t
+rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
+{
+	return __sync_sub_and_fetch(&v->cnt, dec);
+}
+
+#ifdef RTE_FORCE_INTRINSICS
+/**
+ * Atomically increment a 32-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the increment operation is 0; false otherwise.
+ */
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	return (__sync_add_and_fetch(&v->cnt, 1) == 0);
+}
+
+/**
+ * Atomically decrement a 32-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the decrement operation is 0; false otherwise.
+ */
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	return (__sync_sub_and_fetch(&v->cnt, 1) == 0);
+}
+#endif
+
+/**
+ * Atomically set a 32-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic32_clear(rte_atomic32_t *v)
+{
+	v->cnt = 0;
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+/**
+ * The atomic counter structure.
+ */
+typedef struct {
+	volatile int64_t cnt;  /**< Internal counter value. */
+} rte_atomic64_t;
+
+/**
+ * Static initializer for an atomic counter.
+ */
+#define RTE_ATOMIC64_INIT(val) { (val) }
+
+#ifdef RTE_FORCE_INTRINSICS
+/**
+ * An atomic compare and set function used by the mutex functions.
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 64-bit words)
+ *
+ * @param dst
+ *   The destination into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	return __sync_bool_compare_and_swap(dst, exp, src);
+}
+
+/**
+ * Initialize the atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+#ifdef __LP64__
+	v->cnt = 0;
+#else
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, 0);
+	}
+#endif
+}
+
+/**
+ * Atomically read a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+#ifdef __LP64__
+	return v->cnt;
+#else
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		/* replace the value by itself */
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp);
+	}
+	return tmp;
+#endif
+}
+
+/**
+ * Atomically set a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value of the counter.
+ */
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+#ifdef __LP64__
+	v->cnt = new_value;
+#else
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, new_value);
+	}
+#endif
+}
+
+/**
+ * Atomically add a 64-bit value to a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	__sync_fetch_and_add(&v->cnt, inc);
+}
+
+/**
+ * Atomically subtract a 64-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	__sync_fetch_and_sub(&v->cnt, dec);
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	rte_atomic64_add(v, 1);
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	rte_atomic64_sub(v, 1);
+}
+
+/**
+ * Add a 64-bit value to an atomic counter and return the result.
+ *
+ * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
+ * returns the value of v after the addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	return __sync_add_and_fetch(&v->cnt, inc);
+}
+
+/**
+ * Subtract a 64-bit value from an atomic counter and return the result.
+ *
+ * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
+ * and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	return __sync_sub_and_fetch(&v->cnt, dec);
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns
+ * true if the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the addition is 0; false otherwise.
+ */
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_add_return(v, 1) == 0;
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after subtraction is 0; false otherwise.
+ */
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_sub_return(v, 1) == 0;
+}
+
+/**
+ * Atomically test and set a 64-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	rte_atomic64_set(v, 0);
+}
+
+#endif /*RTE_FORCE_INTRINSICS */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_H_ */
diff --git a/lib/librte_eal/common/include/i686/arch/rte_atomic.h b/lib/librte_eal/common/include/i686/arch/rte_atomic.h
deleted file mode 100644
index 6956b87..0000000
--- a/lib/librte_eal/common/include/i686/arch/rte_atomic.h
+++ /dev/null
@@ -1,373 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-/*
- * Inspired from FreeBSD src/sys/i386/include/atomic.h
- * Copyright (c) 1998 Doug Rabson
- * All rights reserved.
- */
-
-#ifndef _RTE_ATOMIC_H_
-#error "don't include this file directly, please include generic <rte_atomic.h>"
-#endif
-
-#ifndef _RTE_I686_ATOMIC_H_
-#define _RTE_I686_ATOMIC_H_
-
-
-/**
- * @file
- * Atomic Operations on i686
- */
-
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
-	uint8_t res;
-	union {
-		struct {
-			uint32_t l32;
-			uint32_t h32;
-		};
-		uint64_t u64;
-	} _exp, _src;
-
-	_exp.u64 = exp;
-	_src.u64 = src;
-
-#ifndef __PIC__
-    asm volatile (
-            MPLOCKED
-            "cmpxchg8b (%[dst]);"
-            "setz %[res];"
-            : [res] "=a" (res)      /* result in eax */
-            : [dst] "S" (dst),      /* esi */
-             "b" (_src.l32),       /* ebx */
-             "c" (_src.h32),       /* ecx */
-             "a" (_exp.l32),       /* eax */
-             "d" (_exp.h32)        /* edx */
-			: "memory" );           /* no-clobber list */
-#else
-	asm volatile (
-            "mov %%ebx, %%edi\n"
-			MPLOCKED
-			"cmpxchg8b (%[dst]);"
-			"setz %[res];"
-            "xchgl %%ebx, %%edi;\n"
-			: [res] "=a" (res)      /* result in eax */
-			: [dst] "S" (dst),      /* esi */
-			  "D" (_src.l32),       /* ebx */
-			  "c" (_src.h32),       /* ecx */
-			  "a" (_exp.l32),       /* eax */
-			  "d" (_exp.h32)        /* edx */
-			: "memory" );           /* no-clobber list */
-#endif
-
-	return res;
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, 0);
-	}
-}
-
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		/* replace the value by itself */
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp);
-	}
-	return tmp;
-}
-
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, new_value);
-	}
-}
-
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp + inc);
-	}
-}
-
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp - dec);
-	}
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
-	rte_atomic64_add(v, 1);
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
-	rte_atomic64_sub(v, 1);
-}
-
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp + inc);
-	}
-
-	return tmp + inc;
-}
-
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp - dec);
-	}
-
-	return tmp - dec;
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_add_return(v, 1) == 0;
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
-	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
-	rte_atomic64_set(v, 0);
-}
-
-#endif /* _RTE_I686_ATOMIC_H_ */
diff --git a/lib/librte_eal/common/include/rte_atomic.h b/lib/librte_eal/common/include/rte_atomic.h
deleted file mode 100644
index a5b6eec..0000000
--- a/lib/librte_eal/common/include/rte_atomic.h
+++ /dev/null
@@ -1,1133 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_ATOMIC_H_
-#define _RTE_ATOMIC_H_
-
-/**
- * @file
- * Atomic Operations
- *
- * This file defines a generic API for atomic
- * operations. The implementation is architecture-specific.
- *
- * See lib/librte_eal/common/include/i686/arch/rte_atomic.h
- * See lib/librte_eal/common/include/x86_64/arch/rte_atomic.h
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <stdint.h>
-
-#if RTE_MAX_LCORE == 1
-#define MPLOCKED                        /**< No need to insert MP lock prefix. */
-#else
-#define MPLOCKED        "lock ; "       /**< Insert MP lock prefix. */
-#endif
-
-/**
- * General memory barrier.
- *
- * Guarantees that the LOAD and STORE operations generated before the
- * barrier occur before the LOAD and STORE operations generated after.
- */
-#define	rte_mb() _mm_mfence()
-
-/**
- * Write memory barrier.
- *
- * Guarantees that the STORE operations generated before the barrier
- * occur before the STORE operations generated after.
- */
-#define	rte_wmb() _mm_sfence()
-
-/**
- * Read memory barrier.
- *
- * Guarantees that the LOAD operations generated before the barrier
- * occur before the LOAD operations generated after.
- */
-#define	rte_rmb() _mm_lfence()
-
-/**
- * Compiler barrier.
- *
- * Guarantees that operation reordering does not occur at compile time
- * for operations directly before and after the barrier.
- */
-#define	rte_compiler_barrier() do {		\
-	asm volatile ("" : : : "memory");	\
-} while(0)
-
-#include <emmintrin.h>
-
-/**
- * @file
- * Atomic Operations on x86_64
- */
-
-/*------------------------- 16 bit atomic operations -------------------------*/
-
-/**
- * Atomic compare and set.
- *
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 16-bit words)
- *
- * @param dst
- *   The destination location into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgw %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-	return res;
-#else
-	return __sync_bool_compare_and_swap(dst, exp, src);
-#endif
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int16_t cnt; /**< An internal counter value. */
-} rte_atomic16_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC16_INIT(val) { (val) }
-
-/**
- * Initialize an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic16_init(rte_atomic16_t *v)
-{
-	v->cnt = 0;
-}
-
-/**
- * Atomically read a 16-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int16_t
-rte_atomic16_read(const rte_atomic16_t *v)
-{
-	return v->cnt;
-}
-
-/**
- * Atomically set a counter to a 16-bit value.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value for the counter.
- */
-static inline void
-rte_atomic16_set(rte_atomic16_t *v, int16_t new_value)
-{
-	v->cnt = new_value;
-}
-
-/**
- * Atomically add a 16-bit value to an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic16_add(rte_atomic16_t *v, int16_t inc)
-{
-	__sync_fetch_and_add(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 16-bit value from an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
-{
-	__sync_fetch_and_sub(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a counter by one.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	asm volatile(
-			MPLOCKED
-			"incw %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-#else
-	rte_atomic16_add(v, 1);
-#endif
-}
-
-/**
- * Atomically decrement a counter by one.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	asm volatile(
-			MPLOCKED
-			"decw %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-#else
-	rte_atomic16_sub(v, 1);
-#endif
-}
-
-/**
- * Atomically add a 16-bit value to a counter and return the result.
- *
- * Atomically adds the 16-bits value (inc) to the atomic counter (v) and
- * returns the value of v after addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int16_t
-rte_atomic16_add_return(rte_atomic16_t *v, int16_t inc)
-{
-	return __sync_add_and_fetch(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 16-bit value from a counter and return
- * the result.
- *
- * Atomically subtracts the 16-bit value (inc) from the atomic counter
- * (v) and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int16_t
-rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
-{
-	return __sync_sub_and_fetch(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a 16-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the increment operation is 0; false otherwise.
- */
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incw %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-#else
-	return (__sync_add_and_fetch(&v->cnt, 1) == 0);
-#endif
-}
-
-/**
- * Atomically decrement a 16-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the decrement operation is 0; false otherwise.
- */
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t ret;
-
-	asm volatile(MPLOCKED
-			"decw %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-#else
-	return (__sync_sub_and_fetch(&v->cnt, 1) == 0);
-#endif
-}
-
-/**
- * Atomically test and set a 16-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
-	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 16-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic16_clear(rte_atomic16_t *v)
-{
-	v->cnt = 0;
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-/**
- * Atomic compare and set.
- *
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 32-bit words)
- *
- * @param dst
- *   The destination location into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgl %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-	return res;
-#else
-	return __sync_bool_compare_and_swap(dst, exp, src);
-#endif
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int32_t cnt; /**< An internal counter value. */
-} rte_atomic32_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC32_INIT(val) { (val) }
-
-/**
- * Initialize an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic32_init(rte_atomic32_t *v)
-{
-	v->cnt = 0;
-}
-
-/**
- * Atomically read a 32-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int32_t
-rte_atomic32_read(const rte_atomic32_t *v)
-{
-	return v->cnt;
-}
-
-/**
- * Atomically set a counter to a 32-bit value.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value for the counter.
- */
-static inline void
-rte_atomic32_set(rte_atomic32_t *v, int32_t new_value)
-{
-	v->cnt = new_value;
-}
-
-/**
- * Atomically add a 32-bit value to an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic32_add(rte_atomic32_t *v, int32_t inc)
-{
-	__sync_fetch_and_add(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 32-bit value from an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
-{
-	__sync_fetch_and_sub(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a counter by one.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	asm volatile(
-			MPLOCKED
-			"incl %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-#else
-	rte_atomic32_add(v, 1);
-#endif
-}
-
-/**
- * Atomically decrement a counter by one.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	asm volatile(
-			MPLOCKED
-			"decl %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-#else
-	rte_atomic32_sub(v,1);
-#endif
-}
-
-/**
- * Atomically add a 32-bit value to a counter and return the result.
- *
- * Atomically adds the 32-bits value (inc) to the atomic counter (v) and
- * returns the value of v after addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int32_t
-rte_atomic32_add_return(rte_atomic32_t *v, int32_t inc)
-{
-	return __sync_add_and_fetch(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 32-bit value from a counter and return
- * the result.
- *
- * Atomically subtracts the 32-bit value (inc) from the atomic counter
- * (v) and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int32_t
-rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
-{
-	return __sync_sub_and_fetch(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a 32-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the increment operation is 0; false otherwise.
- */
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incl %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-#else
-	return (__sync_add_and_fetch(&v->cnt, 1) == 0);
-#endif
-}
-
-/**
- * Atomically decrement a 32-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the decrement operation is 0; false otherwise.
- */
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t ret;
-
-	asm volatile(MPLOCKED
-			"decl %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-#else
-	return (__sync_sub_and_fetch(&v->cnt, 1) == 0);
-#endif
-}
-
-/**
- * Atomically test and set a 32-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
-	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 32-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic32_clear(rte_atomic32_t *v)
-{
-	v->cnt = 0;
-}
-
-#ifndef RTE_FORCE_INTRINSICS
-/* any other functions are in arch specific files */
-#include "arch/rte_atomic.h"
-
-
-#ifdef __DOXYGEN__
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_init(rte_atomic64_t *v);
-
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v);
-
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v);
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v);
-
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
-static inline int
-rte_atomic64_inc_and_test(rte_atomic64_t *v);
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
-static inline int
-rte_atomic64_dec_and_test(rte_atomic64_t *v);
-
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int
-rte_atomic64_test_and_set(rte_atomic64_t *v);
-
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_clear(rte_atomic64_t *v);
-
-#endif /* __DOXYGEN__ */
-
-#else /*RTE_FORCE_INTRINSICS */
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
-	return __sync_bool_compare_and_swap(dst, exp, src);
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
-#ifdef __LP64__
-	v->cnt = 0;
-#else
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, 0);
-	}
-#endif
-}
-
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
-#ifdef __LP64__
-	return v->cnt;
-#else
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		/* replace the value by itself */
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp);
-	}
-	return tmp;
-#endif
-}
-
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
-#ifdef __LP64__
-	v->cnt = new_value;
-#else
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, new_value);
-	}
-#endif
-}
-
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
-	__sync_fetch_and_add(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
-	__sync_fetch_and_sub(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
-	rte_atomic64_add(v, 1);
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
-	rte_atomic64_sub(v, 1);
-}
-
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
-	return __sync_add_and_fetch(&v->cnt, inc);
-}
-
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
-	return __sync_sub_and_fetch(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_add_return(v, 1) == 0;
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
-	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
-	rte_atomic64_set(v, 0);
-}
-
-#endif /*RTE_FORCE_INTRINSICS */
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_ATOMIC_H_ */
diff --git a/lib/librte_eal/common/include/x86_64/arch/rte_atomic.h b/lib/librte_eal/common/include/x86_64/arch/rte_atomic.h
deleted file mode 100644
index 3ba7d3a..0000000
--- a/lib/librte_eal/common/include/x86_64/arch/rte_atomic.h
+++ /dev/null
@@ -1,335 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-/*
- * Inspired from FreeBSD src/sys/amd64/include/atomic.h
- * Copyright (c) 1998 Doug Rabson
- * All rights reserved.
- */
-
-#ifndef _RTE_ATOMIC_H_
-#error "don't include this file directly, please include generic <rte_atomic.h>"
-#endif
-
-#ifndef _RTE_X86_64_ATOMIC_H_
-#define _RTE_X86_64_ATOMIC_H_
-
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgq %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-
-	return res;
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
-	v->cnt = 0;
-}
-
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
-	return v->cnt;
-}
-
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
-	v->cnt = new_value;
-}
-
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
-	asm volatile(
-			MPLOCKED
-			"addq %[inc], %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: [inc] "ir" (inc),     /* input */
-			  "m" (v->cnt)
-			);
-}
-
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
-	asm volatile(
-			MPLOCKED
-			"subq %[dec], %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: [dec] "ir" (dec),     /* input */
-			  "m" (v->cnt)
-			);
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"incq %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"decq %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
-	int64_t prev = inc;
-
-	asm volatile(
-			MPLOCKED
-			"xaddq %[prev], %[cnt]"
-			: [prev] "+r" (prev),   /* output */
-			  [cnt] "=m" (v->cnt)
-			: "m" (v->cnt)          /* input */
-			);
-	return prev + inc;
-}
-
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
-	return rte_atomic64_add_return(v, -dec);
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incq %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt), /* output */
-			  [ret] "=qm" (ret)
-			);
-
-	return ret != 0;
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"decq %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return ret != 0;
-}
-
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
-	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
-	v->cnt = 0;
-}
-
-#endif /* _RTE_X86_64_ATOMIC_H_ */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v2 2/7] Split byte order operations to architecture specific
  2014-10-16 10:44 [dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 1/7] Split atomic operations to architecture specific Chao Zhu
@ 2014-10-16 10:44 ` Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 3/7] Split CPU cycle operation " Chao Zhu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Chao Zhu @ 2014-10-16 10:44 UTC (permalink / raw)
  To: dev

This patch splits the byte order operations from DPDK and push them to
architecture specific arch directories, so that other processor
architecture to support DPDK can be easily adopted.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_byteorder.h       |  194 ++++++++++++++
 .../common/include/arch/x86_64/rte_byteorder.h     |  195 ++++++++++++++
 .../common/include/generic/rte_byteorder.h         |  124 +++++++++
 lib/librte_eal/common/include/rte_byteorder.h      |  270 --------------------
 5 files changed, 515 insertions(+), 272 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_byteorder.h
 delete mode 100644 lib/librte_eal/common/include/rte_byteorder.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 8ab363b..62a39cd 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -31,7 +31,7 @@
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-INC := rte_branch_prediction.h rte_byteorder.h rte_common.h
+INC := rte_branch_prediction.h rte_common.h
 INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif
 
-GENERIC_INC := rte_atomic.h
+GENERIC_INC := rte_atomic.h rte_byteorder.h
 ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_byteorder.h b/lib/librte_eal/common/include/arch/i686/rte_byteorder.h
new file mode 100644
index 0000000..de5cc83
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_byteorder.h
@@ -0,0 +1,194 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_I686_H_
+#define _RTE_BYTEORDER_I686_H_
+
+/**
+ * @file
+ *
+ * Byte Swap Operations
+ *
+ * This file defines a architecture specific API for byte swap operations. 
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	register uint16_t x = _x;
+	asm volatile ("xchgb %b[x1],%h[x2]"
+		      : [x1] "=Q" (x)
+		      : [x2] "0" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	register uint32_t x = _x;
+	asm volatile ("bswap %[x]"
+		      : [x] "+r" (x)
+		      );
+	return x;
+} 
+ 
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* Compat./Leg. mode */
+static inline uint64_t rte_arch_bswap64(uint64_t x)
+{
+	uint64_t ret = 0;
+	ret |= ((uint64_t)rte_arch_bswap32(x & 0xffffffffUL) << 32);
+	ret |= ((uint64_t)rte_arch_bswap32((x >> 32) & 0xffffffffUL));
+	return ret;
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+/**
+ * Swap bytes in a 16-bit value.
+ */
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+/**
+ * Swap bytes in a 32-bit value.
+ */
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+/**
+ * Swap bytes in a 64-bit value.
+ */
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/**
+ * Swap bytes in a 16-bit value.
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+/**
+ * Convert a 16-bit value from CPU order to little endian.
+ */
+#define rte_cpu_to_le_16(x) (x)
+
+/**
+ * Convert a 32-bit value from CPU order to little endian.
+ */
+#define rte_cpu_to_le_32(x) (x)
+
+/**
+ * Convert a 64-bit value from CPU order to little endian.
+ */
+#define rte_cpu_to_le_64(x) (x)
+
+
+/**
+ * Convert a 16-bit value from CPU order to big endian.
+ */
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+
+/**
+ * Convert a 32-bit value from CPU order to big endian.
+ */
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+
+/**
+ * Convert a 64-bit value from CPU order to big endian.
+ */
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+
+/**
+ * Convert a 16-bit value from little endian to CPU order.
+ */
+#define rte_le_to_cpu_16(x) (x)
+
+/**
+ * Convert a 32-bit value from little endian to CPU order.
+ */
+#define rte_le_to_cpu_32(x) (x)
+
+/**
+ * Convert a 64-bit value from little endian to CPU order.
+ */
+#define rte_le_to_cpu_64(x) (x)
+
+
+/**
+ * Convert a 16-bit value from big endian to CPU order.
+ */
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+
+/**
+ * Convert a 32-bit value from big endian to CPU order.
+ */
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+
+/**
+ * Convert a 64-bit value from big endian to CPU order.
+ */
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_I686_H_ */
\ No newline at end of file
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h b/lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
new file mode 100644
index 0000000..089aeae
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
@@ -0,0 +1,195 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_X86_64_H_
+#define _RTE_BYTEORDER_X86_64_H_
+
+/**
+ * @file
+ *
+ * Byte Swap Operations
+ *
+ * This file defines a architecture specific API for byte swap operations. 
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	register uint16_t x = _x;
+	asm volatile ("xchgb %b[x1],%h[x2]"
+		      : [x1] "=Q" (x)
+		      : [x2] "0" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	register uint32_t x = _x;
+	asm volatile ("bswap %[x]"
+		      : [x] "+r" (x)
+		      );
+	return x;
+} 
+ 
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+	register uint64_t x = _x;
+	asm volatile ("bswap %[x]"
+		      : [x] "+r" (x)
+		      );
+	return x;
+} 
+
+#ifndef RTE_FORCE_INTRINSICS
+/**
+ * Swap bytes in a 16-bit value.
+ */
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+/**
+ * Swap bytes in a 32-bit value.
+ */
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+/**
+ * Swap bytes in a 64-bit value.
+ */
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/**
+ * Swap bytes in a 16-bit value.
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+/**
+ * Convert a 16-bit value from CPU order to little endian.
+ */
+#define rte_cpu_to_le_16(x) (x)
+
+/**
+ * Convert a 32-bit value from CPU order to little endian.
+ */
+#define rte_cpu_to_le_32(x) (x)
+
+/**
+ * Convert a 64-bit value from CPU order to little endian.
+ */
+#define rte_cpu_to_le_64(x) (x)
+
+
+/**
+ * Convert a 16-bit value from CPU order to big endian.
+ */
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+
+/**
+ * Convert a 32-bit value from CPU order to big endian.
+ */
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+
+/**
+ * Convert a 64-bit value from CPU order to big endian.
+ */
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+
+/**
+ * Convert a 16-bit value from little endian to CPU order.
+ */
+#define rte_le_to_cpu_16(x) (x)
+
+/**
+ * Convert a 32-bit value from little endian to CPU order.
+ */
+#define rte_le_to_cpu_32(x) (x)
+
+/**
+ * Convert a 64-bit value from little endian to CPU order.
+ */
+#define rte_le_to_cpu_64(x) (x)
+
+
+/**
+ * Convert a 16-bit value from big endian to CPU order.
+ */
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+
+/**
+ * Convert a 32-bit value from big endian to CPU order.
+ */
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+
+/**
+ * Convert a 64-bit value from big endian to CPU order.
+ */
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_X86_64_H_ */
\ No newline at end of file
diff --git a/lib/librte_eal/common/include/generic/rte_byteorder.h b/lib/librte_eal/common/include/generic/rte_byteorder.h
new file mode 100644
index 0000000..729d378
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_byteorder.h
@@ -0,0 +1,124 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_H_
+#define _RTE_BYTEORDER_H_
+
+/**
+ * @file
+ *
+ * Byte Swap Operations
+ *
+ * This file defines a generic API for byte swap operations. Part of
+ * the implementation is architecture-specific.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+/*
+ * An internal function to swap bytes in a 16-bit value.
+ *
+ * It is used by rte_bswap16() when the value is constant. Do not use
+ * this function directly; rte_bswap16() is preferred.
+ */
+static inline uint16_t
+rte_constant_bswap16(uint16_t x)
+{
+	return (uint16_t)(((x & 0x00ffU) << 8) |
+		((x & 0xff00U) >> 8));
+}
+
+/*
+ * An internal function to swap bytes in a 32-bit value.
+ *
+ * It is used by rte_bswap32() when the value is constant. Do not use
+ * this function directly; rte_bswap32() is preferred.
+ */
+static inline uint32_t
+rte_constant_bswap32(uint32_t x)
+{
+	return  ((x & 0x000000ffUL) << 24) |
+		((x & 0x0000ff00UL) << 8) |
+		((x & 0x00ff0000UL) >> 8) |
+		((x & 0xff000000UL) >> 24);
+}
+
+/*
+ * An internal function to swap bytes of a 64-bit value.
+ *
+ * It is used by rte_bswap64() when the value is constant. Do not use
+ * this function directly; rte_bswap64() is preferred.
+ */
+static inline uint64_t
+rte_constant_bswap64(uint64_t x)
+{
+	return  ((x & 0x00000000000000ffULL) << 56) |
+		((x & 0x000000000000ff00ULL) << 40) |
+		((x & 0x0000000000ff0000ULL) << 24) |
+		((x & 0x00000000ff000000ULL) <<  8) |
+		((x & 0x000000ff00000000ULL) >>  8) |
+		((x & 0x0000ff0000000000ULL) >> 24) |
+		((x & 0x00ff000000000000ULL) >> 40) |
+		((x & 0xff00000000000000ULL) >> 56);
+}
+
+#ifdef RTE_FORCE_INTRINSICS
+/**
+ * Swap bytes in a 16-bit value.
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8)
+#define rte_bswap16(x) __builtin_bswap16(x)
+#endif
+
+/**
+ * Swap bytes in a 32-bit value.
+ */
+#define rte_bswap32(x) __builtin_bswap32(x)
+
+/**
+ * Swap bytes in a 64-bit value.
+ */
+#define rte_bswap64(x) __builtin_bswap64(x)
+
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_H_ */
diff --git a/lib/librte_eal/common/include/rte_byteorder.h b/lib/librte_eal/common/include/rte_byteorder.h
deleted file mode 100644
index 30fbd56..0000000
--- a/lib/librte_eal/common/include/rte_byteorder.h
+++ /dev/null
@@ -1,270 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_BYTEORDER_H_
-#define _RTE_BYTEORDER_H_
-
-/**
- * @file
- *
- * Byte Swap Operations
- *
- * This file defines a generic API for byte swap operations. Part of
- * the implementation is architecture-specific.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <stdint.h>
-
-/*
- * An internal function to swap bytes in a 16-bit value.
- *
- * It is used by rte_bswap16() when the value is constant. Do not use
- * this function directly; rte_bswap16() is preferred.
- */
-static inline uint16_t
-rte_constant_bswap16(uint16_t x)
-{
-	return (uint16_t)(((x & 0x00ffU) << 8) |
-		((x & 0xff00U) >> 8));
-}
-
-/*
- * An internal function to swap bytes in a 32-bit value.
- *
- * It is used by rte_bswap32() when the value is constant. Do not use
- * this function directly; rte_bswap32() is preferred.
- */
-static inline uint32_t
-rte_constant_bswap32(uint32_t x)
-{
-	return  ((x & 0x000000ffUL) << 24) |
-		((x & 0x0000ff00UL) << 8) |
-		((x & 0x00ff0000UL) >> 8) |
-		((x & 0xff000000UL) >> 24);
-}
-
-/*
- * An internal function to swap bytes of a 64-bit value.
- *
- * It is used by rte_bswap64() when the value is constant. Do not use
- * this function directly; rte_bswap64() is preferred.
- */
-static inline uint64_t
-rte_constant_bswap64(uint64_t x)
-{
-	return  ((x & 0x00000000000000ffULL) << 56) |
-		((x & 0x000000000000ff00ULL) << 40) |
-		((x & 0x0000000000ff0000ULL) << 24) |
-		((x & 0x00000000ff000000ULL) <<  8) |
-		((x & 0x000000ff00000000ULL) >>  8) |
-		((x & 0x0000ff0000000000ULL) >> 24) |
-		((x & 0x00ff000000000000ULL) >> 40) |
-		((x & 0xff00000000000000ULL) >> 56);
-}
-
-/*
- * An architecture-optimized byte swap for a 16-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap16().
- */
-static inline uint16_t rte_arch_bswap16(uint16_t _x)
-{
-	register uint16_t x = _x;
-	asm volatile ("xchgb %b[x1],%h[x2]"
-		      : [x1] "=Q" (x)
-		      : [x2] "0" (x)
-		      );
-	return x;
-}
-
-/*
- * An architecture-optimized byte swap for a 32-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap32().
- */
-static inline uint32_t rte_arch_bswap32(uint32_t _x)
-{
-	register uint32_t x = _x;
-	asm volatile ("bswap %[x]"
-		      : [x] "+r" (x)
-		      );
-	return x;
-}
-
-/*
- * An architecture-optimized byte swap for a 64-bit value.
- *
-  * Do not use this function directly. The preferred function is rte_bswap64().
- */
-#ifdef RTE_ARCH_X86_64
-/* 64-bit mode */
-static inline uint64_t rte_arch_bswap64(uint64_t _x)
-{
-	register uint64_t x = _x;
-	asm volatile ("bswap %[x]"
-		      : [x] "+r" (x)
-		      );
-	return x;
-}
-#else /* ! RTE_ARCH_X86_64 */
-/* Compat./Leg. mode */
-static inline uint64_t rte_arch_bswap64(uint64_t x)
-{
-	uint64_t ret = 0;
-	ret |= ((uint64_t)rte_arch_bswap32(x & 0xffffffffUL) << 32);
-	ret |= ((uint64_t)rte_arch_bswap32((x >> 32) & 0xffffffffUL));
-	return ret;
-}
-#endif /* RTE_ARCH_X86_64 */
-
-
-#ifndef RTE_FORCE_INTRINSICS
-/**
- * Swap bytes in a 16-bit value.
- */
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap16(x) :		\
-				   rte_arch_bswap16(x)))
-
-/**
- * Swap bytes in a 32-bit value.
- */
-#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap32(x) :		\
-				   rte_arch_bswap32(x)))
-
-/**
- * Swap bytes in a 64-bit value.
- */
-#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap64(x) :		\
-				   rte_arch_bswap64(x)))
-
-#else
-
-/**
- * Swap bytes in a 16-bit value.
- * __builtin_bswap16 is only available gcc 4.8 and upwards
- */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8)
-#define rte_bswap16(x) __builtin_bswap16(x)
-#else
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap16(x) :		\
-				   rte_arch_bswap16(x)))
-#endif
-
-/**
- * Swap bytes in a 32-bit value.
- */
-#define rte_bswap32(x) __builtin_bswap32(x)
-
-/**
- * Swap bytes in a 64-bit value.
- */
-#define rte_bswap64(x) __builtin_bswap64(x)
-
-#endif
-
-/**
- * Convert a 16-bit value from CPU order to little endian.
- */
-#define rte_cpu_to_le_16(x) (x)
-
-/**
- * Convert a 32-bit value from CPU order to little endian.
- */
-#define rte_cpu_to_le_32(x) (x)
-
-/**
- * Convert a 64-bit value from CPU order to little endian.
- */
-#define rte_cpu_to_le_64(x) (x)
-
-
-/**
- * Convert a 16-bit value from CPU order to big endian.
- */
-#define rte_cpu_to_be_16(x) rte_bswap16(x)
-
-/**
- * Convert a 32-bit value from CPU order to big endian.
- */
-#define rte_cpu_to_be_32(x) rte_bswap32(x)
-
-/**
- * Convert a 64-bit value from CPU order to big endian.
- */
-#define rte_cpu_to_be_64(x) rte_bswap64(x)
-
-
-/**
- * Convert a 16-bit value from little endian to CPU order.
- */
-#define rte_le_to_cpu_16(x) (x)
-
-/**
- * Convert a 32-bit value from little endian to CPU order.
- */
-#define rte_le_to_cpu_32(x) (x)
-
-/**
- * Convert a 64-bit value from little endian to CPU order.
- */
-#define rte_le_to_cpu_64(x) (x)
-
-
-/**
- * Convert a 16-bit value from big endian to CPU order.
- */
-#define rte_be_to_cpu_16(x) rte_bswap16(x)
-
-/**
- * Convert a 32-bit value from big endian to CPU order.
- */
-#define rte_be_to_cpu_32(x) rte_bswap32(x)
-
-/**
- * Convert a 64-bit value from big endian to CPU order.
- */
-#define rte_be_to_cpu_64(x) rte_bswap64(x)
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_BYTEORDER_H_ */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v2 3/7] Split CPU cycle operation to architecture specific
  2014-10-16 10:44 [dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 1/7] Split atomic operations to architecture specific Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 2/7] Split byte order " Chao Zhu
@ 2014-10-16 10:44 ` Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 4/7] Split prefetch operations " Chao Zhu
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Chao Zhu @ 2014-10-16 10:44 UTC (permalink / raw)
  To: dev

This patch splits the CPU TSC read operations from DPDK and push them to
architecture specific arch directories, so that other processors that
don't have tsc register can be can implement its'own functions.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_cycles.h          |  158 ++++++++++++
 .../common/include/arch/x86_64/rte_cycles.h        |  158 ++++++++++++
 lib/librte_eal/common/include/generic/rte_cycles.h |  190 ++++++++++++++
 lib/librte_eal/common/include/rte_cycles.h         |  266 --------------------
 5 files changed, 508 insertions(+), 268 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cycles.h
 delete mode 100644 lib/librte_eal/common/include/rte_cycles.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 62a39cd..c6aedf9 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -32,7 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk
 
 INC := rte_branch_prediction.h rte_common.h
-INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
+INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
 INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif
 
-GENERIC_INC := rte_atomic.h rte_byteorder.h
+GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
 ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_cycles.h b/lib/librte_eal/common/include/arch/i686/rte_cycles.h
new file mode 100644
index 0000000..a813e9b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_cycles.h
@@ -0,0 +1,158 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+/*   BSD LICENSE
+ *
+ *   Copyright(c) 2013 6WIND.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_I686_H_
+#define _RTE_CYCLES_I686_H_
+
+/**
+ * @file
+ *
+ * Architecture Specific Time Reference Functions.
+ */
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the TSC register.
+ *
+ * @return
+ *   The TSC for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	union {
+		uint64_t tsc_64;
+		struct {
+			uint32_t lo_32;
+			uint32_t hi_32;
+		};
+	} tsc;
+
+#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
+	if (unlikely(rte_cycles_vmware_tsc_map)) {
+		/* ecx = 0x10000 corresponds to the physical TSC for VMware */
+		asm volatile("rdpmc" :
+		             "=a" (tsc.lo_32),
+		             "=d" (tsc.hi_32) :
+		             "c"(0x10000));
+		return tsc.tsc_64;
+	}
+#endif
+
+	asm volatile("rdtsc" :
+		     "=a" (tsc.lo_32),
+		     "=d" (tsc.hi_32));
+	return tsc.tsc_64;
+}
+
+/**
+ * Read the TSC register precisely where function is called.
+ *
+ * @return
+ *   The TSC for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+/**
+ * Return the number of TSC cycles since boot
+ *
+  * @return
+ *   the number of cycles
+ */
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+/**
+ * Get the number of cycles since boot from the default timer.
+ *
+ * @return
+ *   The number of cycles
+ */
+static inline uint64_t
+rte_get_timer_cycles(void)
+{
+	switch(eal_timer_source) {
+	case EAL_TIMER_TSC:
+		return rte_rdtsc();
+	case EAL_TIMER_HPET:
+#ifdef RTE_LIBEAL_USE_HPET
+		return rte_get_hpet_cycles();
+#endif
+	default: rte_panic("Invalid timer source specified\n");
+	}
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_cycles.h b/lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
new file mode 100644
index 0000000..4c1946e
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
@@ -0,0 +1,158 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+/*   BSD LICENSE
+ *
+ *   Copyright(c) 2013 6WIND.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_X86_64_H_
+#define _RTE_CYCLES_X86_64_H_
+
+/**
+ * @file
+ *
+ * Architecture Specific Time Reference Functions.
+ */
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the TSC register.
+ *
+ * @return
+ *   The TSC for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	union {
+		uint64_t tsc_64;
+		struct {
+			uint32_t lo_32;
+			uint32_t hi_32;
+		};
+	} tsc;
+
+#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
+	if (unlikely(rte_cycles_vmware_tsc_map)) {
+		/* ecx = 0x10000 corresponds to the physical TSC for VMware */
+		asm volatile("rdpmc" :
+		             "=a" (tsc.lo_32),
+		             "=d" (tsc.hi_32) :
+		             "c"(0x10000));
+		return tsc.tsc_64;
+	}
+#endif
+
+	asm volatile("rdtsc" :
+		     "=a" (tsc.lo_32),
+		     "=d" (tsc.hi_32));
+	return tsc.tsc_64;
+}
+
+/**
+ * Read the TSC register precisely where function is called.
+ *
+ * @return
+ *   The TSC for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+/**
+ * Return the number of TSC cycles since boot
+ *
+  * @return
+ *   the number of cycles
+ */
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+/**
+ * Get the number of cycles since boot from the default timer.
+ *
+ * @return
+ *   The number of cycles
+ */
+static inline uint64_t
+rte_get_timer_cycles(void)
+{
+	switch(eal_timer_source) {
+	case EAL_TIMER_TSC:
+		return rte_rdtsc();
+	case EAL_TIMER_HPET:
+#ifdef RTE_LIBEAL_USE_HPET
+		return rte_get_hpet_cycles();
+#endif
+	default: rte_panic("Invalid timer source specified\n");
+	}
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_cycles.h b/lib/librte_eal/common/include/generic/rte_cycles.h
new file mode 100644
index 0000000..41379ec
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_cycles.h
@@ -0,0 +1,190 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+/*   BSD LICENSE
+ *
+ *   Copyright(c) 2013 6WIND.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_H_
+#define _RTE_CYCLES_H_
+
+/**
+ * @file
+ *
+ * Simple Time Reference Functions (Cycles and HPET).
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_debug.h>
+#include <rte_atomic.h>
+
+#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
+/** Global switch to use VMWARE mapping of TSC instead of RDTSC */
+extern int rte_cycles_vmware_tsc_map;
+#include <rte_branch_prediction.h>
+#endif
+
+#define MS_PER_S 1000
+#define US_PER_S 1000000
+#define NS_PER_S 1000000000
+
+enum timer_source {
+	EAL_TIMER_TSC = 0,
+	EAL_TIMER_HPET
+};
+extern enum timer_source eal_timer_source;
+
+/**
+ * Get the measured frequency of the RDTSC counter
+ *
+ * @return
+ *   The TSC frequency for this lcore
+ */
+uint64_t
+rte_get_tsc_hz(void);
+
+#ifdef RTE_LIBEAL_USE_HPET
+/**
+ * Return the number of HPET cycles since boot
+ *
+ * This counter is global for all execution units. The number of
+ * cycles in one second can be retrieved using rte_get_hpet_hz().
+ *
+ * @return
+ *   the number of cycles
+ */
+uint64_t
+rte_get_hpet_cycles(void);
+
+/**
+ * Get the number of HPET cycles in one second.
+ *
+ * @return
+ *   The number of cycles in one second.
+ */
+uint64_t
+rte_get_hpet_hz(void);
+
+/**
+ * Initialise the HPET for use. This must be called before the rte_get_hpet_hz
+ * and rte_get_hpet_cycles APIs are called. If this function does not succeed,
+ * then the HPET functions are unavailable and should not be called.
+ *
+ * @param make_default
+ * 	If set, the hpet timer becomes the default timer whose values are
+ * 	returned by the rte_get_timer_hz/cycles API calls
+ *
+ * @return
+ * 	0 on success,
+ * 	-1 on error, and the make_default parameter is ignored.
+ */
+int rte_eal_hpet_init(int make_default);
+
+#endif
+
+/**
+ * Get the number of cycles in one second for the default timer.
+ *
+ * @return
+ *   The number of cycles in one second.
+ */
+static inline uint64_t
+rte_get_timer_hz(void)
+{
+	switch(eal_timer_source) {
+	case EAL_TIMER_TSC:
+		return rte_get_tsc_hz();
+	case EAL_TIMER_HPET:
+#ifdef RTE_LIBEAL_USE_HPET
+		return rte_get_hpet_hz();
+#endif
+	default: rte_panic("Invalid timer source specified\n");
+	}
+}
+
+/**
+ * Wait at least us microseconds.
+ *
+ * @param us
+ *   The number of microseconds to wait.
+ */
+void
+rte_delay_us(unsigned us);
+
+/**
+ * Wait at least ms milliseconds.
+ *
+ * @param ms
+ *   The number of milliseconds to wait.
+ */
+static inline void
+rte_delay_ms(unsigned ms)
+{
+	rte_delay_us(ms * 1000);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_H_ */
diff --git a/lib/librte_eal/common/include/rte_cycles.h b/lib/librte_eal/common/include/rte_cycles.h
deleted file mode 100644
index 9b4dbe1..0000000
--- a/lib/librte_eal/common/include/rte_cycles.h
+++ /dev/null
@@ -1,266 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-/*   BSD LICENSE
- *
- *   Copyright(c) 2013 6WIND.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of 6WIND S.A. nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_CYCLES_H_
-#define _RTE_CYCLES_H_
-
-/**
- * @file
- *
- * Simple Time Reference Functions (Cycles and HPET).
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <stdint.h>
-#include <rte_debug.h>
-#include <rte_atomic.h>
-
-#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
-/** Global switch to use VMWARE mapping of TSC instead of RDTSC */
-extern int rte_cycles_vmware_tsc_map;
-#include <rte_branch_prediction.h>
-#endif
-
-#define MS_PER_S 1000
-#define US_PER_S 1000000
-#define NS_PER_S 1000000000
-
-enum timer_source {
-	EAL_TIMER_TSC = 0,
-	EAL_TIMER_HPET
-};
-extern enum timer_source eal_timer_source;
-
-/**
- * Read the TSC register.
- *
- * @return
- *   The TSC for this lcore.
- */
-static inline uint64_t
-rte_rdtsc(void)
-{
-	union {
-		uint64_t tsc_64;
-		struct {
-			uint32_t lo_32;
-			uint32_t hi_32;
-		};
-	} tsc;
-
-#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
-	if (unlikely(rte_cycles_vmware_tsc_map)) {
-		/* ecx = 0x10000 corresponds to the physical TSC for VMware */
-		asm volatile("rdpmc" :
-		             "=a" (tsc.lo_32),
-		             "=d" (tsc.hi_32) :
-		             "c"(0x10000));
-		return tsc.tsc_64;
-	}
-#endif
-
-	asm volatile("rdtsc" :
-		     "=a" (tsc.lo_32),
-		     "=d" (tsc.hi_32));
-	return tsc.tsc_64;
-}
-
-/**
- * Read the TSC register precisely where function is called.
- *
- * @return
- *   The TSC for this lcore.
- */
-static inline uint64_t
-rte_rdtsc_precise(void)
-{
-	rte_mb();
-	return rte_rdtsc();
-}
-
-/**
- * Get the measured frequency of the RDTSC counter
- *
- * @return
- *   The TSC frequency for this lcore
- */
-uint64_t
-rte_get_tsc_hz(void);
-
-/**
- * Return the number of TSC cycles since boot
- *
-  * @return
- *   the number of cycles
- */
-static inline uint64_t
-rte_get_tsc_cycles(void) { return rte_rdtsc(); }
-
-#ifdef RTE_LIBEAL_USE_HPET
-/**
- * Return the number of HPET cycles since boot
- *
- * This counter is global for all execution units. The number of
- * cycles in one second can be retrieved using rte_get_hpet_hz().
- *
- * @return
- *   the number of cycles
- */
-uint64_t
-rte_get_hpet_cycles(void);
-
-/**
- * Get the number of HPET cycles in one second.
- *
- * @return
- *   The number of cycles in one second.
- */
-uint64_t
-rte_get_hpet_hz(void);
-
-/**
- * Initialise the HPET for use. This must be called before the rte_get_hpet_hz
- * and rte_get_hpet_cycles APIs are called. If this function does not succeed,
- * then the HPET functions are unavailable and should not be called.
- *
- * @param make_default
- * 	If set, the hpet timer becomes the default timer whose values are
- * 	returned by the rte_get_timer_hz/cycles API calls
- *
- * @return
- * 	0 on success,
- * 	-1 on error, and the make_default parameter is ignored.
- */
-int rte_eal_hpet_init(int make_default);
-
-#endif
-
-/**
- * Get the number of cycles since boot from the default timer.
- *
- * @return
- *   The number of cycles
- */
-static inline uint64_t
-rte_get_timer_cycles(void)
-{
-	switch(eal_timer_source) {
-	case EAL_TIMER_TSC:
-		return rte_rdtsc();
-	case EAL_TIMER_HPET:
-#ifdef RTE_LIBEAL_USE_HPET
-		return rte_get_hpet_cycles();
-#endif
-	default: rte_panic("Invalid timer source specified\n");
-	}
-}
-
-/**
- * Get the number of cycles in one second for the default timer.
- *
- * @return
- *   The number of cycles in one second.
- */
-static inline uint64_t
-rte_get_timer_hz(void)
-{
-	switch(eal_timer_source) {
-	case EAL_TIMER_TSC:
-		return rte_get_tsc_hz();
-	case EAL_TIMER_HPET:
-#ifdef RTE_LIBEAL_USE_HPET
-		return rte_get_hpet_hz();
-#endif
-	default: rte_panic("Invalid timer source specified\n");
-	}
-}
-
-/**
- * Wait at least us microseconds.
- *
- * @param us
- *   The number of microseconds to wait.
- */
-void
-rte_delay_us(unsigned us);
-
-/**
- * Wait at least ms milliseconds.
- *
- * @param ms
- *   The number of milliseconds to wait.
- */
-static inline void
-rte_delay_ms(unsigned ms)
-{
-	rte_delay_us(ms * 1000);
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_CYCLES_H_ */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v2 4/7] Split prefetch operations to architecture specific
  2014-10-16 10:44 [dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK Chao Zhu
                   ` (2 preceding siblings ...)
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 3/7] Split CPU cycle operation " Chao Zhu
@ 2014-10-16 10:44 ` Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 5/7] Split spinlock " Chao Zhu
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Chao Zhu @ 2014-10-16 10:44 UTC (permalink / raw)
  To: dev

This patch splits the prefetch operations from DPDK and push them to
architecture specific arch directories, so that other processor
architecture to support DPDK can implement their own functions.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_prefetch.h        |   88 ++++++++++++++++++++
 .../common/include/arch/x86_64/rte_prefetch.h      |   88 ++++++++++++++++++++
 lib/librte_eal/common/include/rte_prefetch.h       |   88 --------------------
 4 files changed, 178 insertions(+), 90 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
 delete mode 100644 lib/librte_eal/common/include/rte_prefetch.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index c6aedf9..6cf7505 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -34,7 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
-INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
+INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
 INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
 INC += rte_eal_memconfig.h rte_malloc_heap.h
@@ -47,7 +47,7 @@ INC += rte_warnings.h
 endif
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
-ARCH_INC := $(GENERIC_INC)
+ARCH_INC := $(GENERIC_INC) rte_prefetch.h
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
diff --git a/lib/librte_eal/common/include/arch/i686/rte_prefetch.h b/lib/librte_eal/common/include/arch/i686/rte_prefetch.h
new file mode 100644
index 0000000..2625512
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_prefetch.h
@@ -0,0 +1,88 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_I686_H_
+#define _RTE_PREFETCH_I686_H_
+
+/**
+ * @file
+ *
+ * Prefetch operations.
+ *
+ * This file defines an API for prefetch macros / inline-functions,
+ * which are architecture-dependent. Prefetching occurs when a
+ * processor requests an instruction or data from memory to cache
+ * before it is actually needed, potentially speeding up the execution of the
+ * program.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Prefetch a cache line into all cache levels.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch0(volatile void *p)
+{
+	asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+/**
+ * Prefetch a cache line into all cache levels except the 0th cache level.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch1(volatile void *p)
+{
+	asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+/**
+ * Prefetch a cache line into all cache levels except the 0th and 1th cache
+ * levels.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch2(volatile void *p)
+{
+	asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h b/lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
new file mode 100644
index 0000000..daaab2a
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
@@ -0,0 +1,88 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_X86_64_H_
+#define _RTE_PREFETCH_X86_64_H_
+
+/**
+ * @file
+ *
+ * Prefetch operations.
+ *
+ * This file defines an API for prefetch macros / inline-functions,
+ * which are architecture-dependent. Prefetching occurs when a
+ * processor requests an instruction or data from memory to cache
+ * before it is actually needed, potentially speeding up the execution of the
+ * program.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Prefetch a cache line into all cache levels.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch0(volatile void *p)
+{
+	asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+/**
+ * Prefetch a cache line into all cache levels except the 0th cache level.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch1(volatile void *p)
+{
+	asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+/**
+ * Prefetch a cache line into all cache levels except the 0th and 1th cache
+ * levels.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch2(volatile void *p)
+{
+	asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/rte_prefetch.h b/lib/librte_eal/common/include/rte_prefetch.h
deleted file mode 100644
index 8a691ef..0000000
--- a/lib/librte_eal/common/include/rte_prefetch.h
+++ /dev/null
@@ -1,88 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_PREFETCH_H_
-#define _RTE_PREFETCH_H_
-
-/**
- * @file
- *
- * Prefetch operations.
- *
- * This file defines an API for prefetch macros / inline-functions,
- * which are architecture-dependent. Prefetching occurs when a
- * processor requests an instruction or data from memory to cache
- * before it is actually needed, potentially speeding up the execution of the
- * program.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-/**
- * Prefetch a cache line into all cache levels.
- * @param p
- *   Address to prefetch
- */
-static inline void rte_prefetch0(volatile void *p)
-{
-	asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-/**
- * Prefetch a cache line into all cache levels except the 0th cache level.
- * @param p
- *   Address to prefetch
- */
-static inline void rte_prefetch1(volatile void *p)
-{
-	asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-/**
- * Prefetch a cache line into all cache levels except the 0th and 1th cache
- * levels.
- * @param p
- *   Address to prefetch
- */
-static inline void rte_prefetch2(volatile void *p)
-{
-	asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_PREFETCH_H_ */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v2 5/7] Split spinlock operations to architecture specific
  2014-10-16 10:44 [dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK Chao Zhu
                   ` (3 preceding siblings ...)
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 4/7] Split prefetch operations " Chao Zhu
@ 2014-10-16 10:44 ` Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 6/7] Split memcpy operation " Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 7/7] Split CPU flags operations " Chao Zhu
  6 siblings, 0 replies; 8+ messages in thread
From: Chao Zhu @ 2014-10-16 10:44 UTC (permalink / raw)
  To: dev

This patch splits the spinlock operations from DPDK and push them to
architecture specific arch directories, so that other processor
architecture to support DPDK can be easily adopted.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_spinlock.h        |  180 ++++++++++++++
 .../common/include/arch/x86_64/rte_spinlock.h      |  180 ++++++++++++++
 .../common/include/generic/rte_spinlock.h          |  169 +++++++++++++
 lib/librte_eal/common/include/rte_spinlock.h       |  258 --------------------
 5 files changed, 531 insertions(+), 260 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/rte_spinlock.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 6cf7505..9b9a73d 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -35,7 +35,7 @@ INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
-INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
+INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
 INC += rte_eal_memconfig.h rte_malloc_heap.h
 INC += rte_hexdump.h rte_devargs.h rte_dev.h
@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif
 
-GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
+GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_spinlock.h
 ARCH_INC := $(GENERIC_INC) rte_prefetch.h
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_spinlock.h b/lib/librte_eal/common/include/arch/i686/rte_spinlock.h
new file mode 100644
index 0000000..f61e31c
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_spinlock.h
@@ -0,0 +1,180 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_I686_H_
+#define _RTE_SPINLOCK_I686_H_
+
+/**
+ * @file
+ *
+ * RTE Spinlocks
+ *
+ * This file defines an API for read-write locks, which are implemented
+ * in an architecture-specific way. This kind of lock simply waits in
+ * a loop repeatedly checking until the lock becomes available.
+ *
+ * All locks must be initialised before use, and only initialised once.
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_spinlock.h"
+
+#ifndef RTE_FORCE_INTRINSICS
+/**
+ * Take the spinlock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	int lock_val = 1;
+	asm volatile (
+			"1:\n"
+			"xchg %[locked], %[lv]\n"
+			"test %[lv], %[lv]\n"
+			"jz 3f\n"
+			"2:\n"
+			"pause\n"
+			"cmpl $0, %[locked]\n"
+			"jnz 2b\n"
+			"jmp 1b\n"
+			"3:\n"
+			: [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
+			: "[lv]" (lock_val)
+			: "memory");
+}
+
+/**
+ * Release the spinlock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_unlock (rte_spinlock_t *sl)
+{
+	int unlock_val = 0;
+	asm volatile (
+			"xchg %[locked], %[ulv]\n"
+			: [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
+			: "[ulv]" (unlock_val)
+			: "memory");
+}
+
+/**
+ * Try to take the lock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ * @return
+ *   1 if the lock is successfully taken; 0 otherwise.
+ */
+static inline int
+rte_spinlock_trylock (rte_spinlock_t *sl)
+{
+	int lockval = 1;
+
+	asm volatile (
+			"xchg %[locked], %[lockval]"
+			: [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
+			: "[lockval]" (lockval)
+			: "memory");
+
+	return (lockval == 0);
+}
+#endif
+
+/**
+ * Take the recursive spinlock.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ */
+static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
+{
+	int id = rte_lcore_id();
+
+	if (slr->user != id) {
+		rte_spinlock_lock(&slr->sl);
+		slr->user = id;
+	}
+	slr->count++;
+}
+
+/**
+ * Release the recursive spinlock.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ */
+static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
+{
+	if (--(slr->count) == 0) {
+		slr->user = -1;
+		rte_spinlock_unlock(&slr->sl);
+	}
+
+}
+
+/**
+ * Try to take the recursive lock.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ * @return
+ *   1 if the lock is successfully taken; 0 otherwise.
+ */
+static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
+{
+	int id = rte_lcore_id();
+
+	if (slr->user != id) {
+		if (rte_spinlock_trylock(&slr->sl) == 0)
+			return 0;
+		slr->user = id;
+	}
+	slr->count++;
+	return 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_I686_H_ */
\ No newline at end of file
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h b/lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
new file mode 100644
index 0000000..5f5c4cb
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
@@ -0,0 +1,180 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_X86_64_H_
+#define _RTE_SPINLOCK_X86_64_H_
+
+/**
+ * @file
+ *
+ * RTE Spinlocks
+ *
+ * This file defines an API for read-write locks, which are implemented
+ * in an architecture-specific way. This kind of lock simply waits in
+ * a loop repeatedly checking until the lock becomes available.
+ *
+ * All locks must be initialised before use, and only initialised once.
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_spinlock.h"
+
+#ifndef RTE_FORCE_INTRINSICS
+/**
+ * Take the spinlock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	int lock_val = 1;
+	asm volatile (
+			"1:\n"
+			"xchg %[locked], %[lv]\n"
+			"test %[lv], %[lv]\n"
+			"jz 3f\n"
+			"2:\n"
+			"pause\n"
+			"cmpl $0, %[locked]\n"
+			"jnz 2b\n"
+			"jmp 1b\n"
+			"3:\n"
+			: [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
+			: "[lv]" (lock_val)
+			: "memory");
+}
+
+/**
+ * Release the spinlock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_unlock (rte_spinlock_t *sl)
+{
+	int unlock_val = 0;
+	asm volatile (
+			"xchg %[locked], %[ulv]\n"
+			: [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
+			: "[ulv]" (unlock_val)
+			: "memory");
+}
+
+/**
+ * Try to take the lock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ * @return
+ *   1 if the lock is successfully taken; 0 otherwise.
+ */
+static inline int
+rte_spinlock_trylock (rte_spinlock_t *sl)
+{
+	int lockval = 1;
+
+	asm volatile (
+			"xchg %[locked], %[lockval]"
+			: [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
+			: "[lockval]" (lockval)
+			: "memory");
+
+	return (lockval == 0);
+}
+#endif
+
+/**
+ * Take the recursive spinlock.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ */
+static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
+{
+	int id = rte_lcore_id();
+
+	if (slr->user != id) {
+		rte_spinlock_lock(&slr->sl);
+		slr->user = id;
+	}
+	slr->count++;
+}
+
+/**
+ * Release the recursive spinlock.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ */
+static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
+{
+	if (--(slr->count) == 0) {
+		slr->user = -1;
+		rte_spinlock_unlock(&slr->sl);
+	}
+
+}
+
+/**
+ * Try to take the recursive lock.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ * @return
+ *   1 if the lock is successfully taken; 0 otherwise.
+ */
+static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
+{
+	int id = rte_lcore_id();
+
+	if (slr->user != id) {
+		if (rte_spinlock_trylock(&slr->sl) == 0)
+			return 0;
+		slr->user = id;
+	}
+	slr->count++;
+	return 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_X86_64_H_ */
\ No newline at end of file
diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h b/lib/librte_eal/common/include/generic/rte_spinlock.h
new file mode 100644
index 0000000..fb0f464
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -0,0 +1,169 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_H_
+#define _RTE_SPINLOCK_H_
+
+/**
+ * @file
+ *
+ * RTE Spinlocks
+ *
+ * This file defines an API for read-write locks, which are implemented
+ * in an architecture-specific way. This kind of lock simply waits in
+ * a loop repeatedly checking until the lock becomes available.
+ *
+ * All locks must be initialised before use, and only initialised once.
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_lcore.h>
+#ifdef RTE_FORCE_INTRINSICS
+#include <rte_common.h>
+#endif
+
+/**
+ * The rte_spinlock_t type.
+ */
+typedef struct {
+	volatile int locked; /**< lock status 0 = unlocked, 1 = locked */
+} rte_spinlock_t;
+
+/**
+ * A static spinlock initializer.
+ */
+#define RTE_SPINLOCK_INITIALIZER { 0 }
+
+/**
+ * Initialize the spinlock to an unlocked state.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_init(rte_spinlock_t *sl)
+{
+	sl->locked = 0;
+}
+
+#ifdef RTE_FORCE_INTRINSICS
+/**
+ * Take the spinlock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	while (__sync_lock_test_and_set(&sl->locked, 1))
+		while(sl->locked)
+			rte_pause();
+}
+
+/**
+ * Release the spinlock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_unlock (rte_spinlock_t *sl)
+{
+	__sync_lock_release(&sl->locked);
+}
+
+/**
+ * Try to take the lock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ * @return
+ *   1 if the lock is successfully taken; 0 otherwise.
+ */
+static inline int
+rte_spinlock_trylock (rte_spinlock_t *sl)
+{
+	return (__sync_lock_test_and_set(&sl->locked,1) == 0);
+}
+#endif
+
+/**
+ * Test if the lock is taken.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ * @return
+ *   1 if the lock is currently taken; 0 otherwise.
+ */
+static inline int rte_spinlock_is_locked (rte_spinlock_t *sl)
+{
+	return sl->locked;
+}
+
+/**
+ * The rte_spinlock_recursive_t type.
+ */
+typedef struct {
+	rte_spinlock_t sl; /**< the actual spinlock */
+	volatile int user; /**< core id using lock, -1 for unused */
+	volatile int count; /**< count of time this lock has been called */
+} rte_spinlock_recursive_t;
+
+/**
+ * A static recursive spinlock initializer.
+ */
+#define RTE_SPINLOCK_RECURSIVE_INITIALIZER {RTE_SPINLOCK_INITIALIZER, -1, 0}
+
+/**
+ * Initialize the recursive spinlock to an unlocked state.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ */
+static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_init(&slr->sl);
+	slr->user = -1;
+	slr->count = 0;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_H_ */
diff --git a/lib/librte_eal/common/include/rte_spinlock.h b/lib/librte_eal/common/include/rte_spinlock.h
deleted file mode 100644
index 661908d..0000000
--- a/lib/librte_eal/common/include/rte_spinlock.h
+++ /dev/null
@@ -1,258 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_SPINLOCK_H_
-#define _RTE_SPINLOCK_H_
-
-/**
- * @file
- *
- * RTE Spinlocks
- *
- * This file defines an API for read-write locks, which are implemented
- * in an architecture-specific way. This kind of lock simply waits in
- * a loop repeatedly checking until the lock becomes available.
- *
- * All locks must be initialised before use, and only initialised once.
- *
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <rte_lcore.h>
-#ifdef RTE_FORCE_INTRINSICS
-#include <rte_common.h>
-#endif
-
-/**
- * The rte_spinlock_t type.
- */
-typedef struct {
-	volatile int locked; /**< lock status 0 = unlocked, 1 = locked */
-} rte_spinlock_t;
-
-/**
- * A static spinlock initializer.
- */
-#define RTE_SPINLOCK_INITIALIZER { 0 }
-
-/**
- * Initialize the spinlock to an unlocked state.
- *
- * @param sl
- *   A pointer to the spinlock.
- */
-static inline void
-rte_spinlock_init(rte_spinlock_t *sl)
-{
-	sl->locked = 0;
-}
-
-/**
- * Take the spinlock.
- *
- * @param sl
- *   A pointer to the spinlock.
- */
-static inline void
-rte_spinlock_lock(rte_spinlock_t *sl)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	int lock_val = 1;
-	asm volatile (
-			"1:\n"
-			"xchg %[locked], %[lv]\n"
-			"test %[lv], %[lv]\n"
-			"jz 3f\n"
-			"2:\n"
-			"pause\n"
-			"cmpl $0, %[locked]\n"
-			"jnz 2b\n"
-			"jmp 1b\n"
-			"3:\n"
-			: [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
-			: "[lv]" (lock_val)
-			: "memory");
-#else
-	while (__sync_lock_test_and_set(&sl->locked, 1))
-		while(sl->locked)
-			rte_pause();
-#endif
-}
-
-/**
- * Release the spinlock.
- *
- * @param sl
- *   A pointer to the spinlock.
- */
-static inline void
-rte_spinlock_unlock (rte_spinlock_t *sl)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	int unlock_val = 0;
-	asm volatile (
-			"xchg %[locked], %[ulv]\n"
-			: [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
-			: "[ulv]" (unlock_val)
-			: "memory");
-#else
-	__sync_lock_release(&sl->locked);
-#endif
-}
-
-/**
- * Try to take the lock.
- *
- * @param sl
- *   A pointer to the spinlock.
- * @return
- *   1 if the lock is successfully taken; 0 otherwise.
- */
-static inline int
-rte_spinlock_trylock (rte_spinlock_t *sl)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	int lockval = 1;
-
-	asm volatile (
-			"xchg %[locked], %[lockval]"
-			: [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
-			: "[lockval]" (lockval)
-			: "memory");
-
-	return (lockval == 0);
-#else
-	return (__sync_lock_test_and_set(&sl->locked,1) == 0);
-#endif
-}
-
-/**
- * Test if the lock is taken.
- *
- * @param sl
- *   A pointer to the spinlock.
- * @return
- *   1 if the lock is currently taken; 0 otherwise.
- */
-static inline int rte_spinlock_is_locked (rte_spinlock_t *sl)
-{
-	return sl->locked;
-}
-
-/**
- * The rte_spinlock_recursive_t type.
- */
-typedef struct {
-	rte_spinlock_t sl; /**< the actual spinlock */
-	volatile int user; /**< core id using lock, -1 for unused */
-	volatile int count; /**< count of time this lock has been called */
-} rte_spinlock_recursive_t;
-
-/**
- * A static recursive spinlock initializer.
- */
-#define RTE_SPINLOCK_RECURSIVE_INITIALIZER {RTE_SPINLOCK_INITIALIZER, -1, 0}
-
-/**
- * Initialize the recursive spinlock to an unlocked state.
- *
- * @param slr
- *   A pointer to the recursive spinlock.
- */
-static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
-{
-	rte_spinlock_init(&slr->sl);
-	slr->user = -1;
-	slr->count = 0;
-}
-
-/**
- * Take the recursive spinlock.
- *
- * @param slr
- *   A pointer to the recursive spinlock.
- */
-static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
-{
-	int id = rte_lcore_id();
-
-	if (slr->user != id) {
-		rte_spinlock_lock(&slr->sl);
-		slr->user = id;
-	}
-	slr->count++;
-}
-/**
- * Release the recursive spinlock.
- *
- * @param slr
- *   A pointer to the recursive spinlock.
- */
-static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
-{
-	if (--(slr->count) == 0) {
-		slr->user = -1;
-		rte_spinlock_unlock(&slr->sl);
-	}
-
-}
-
-/**
- * Try to take the recursive lock.
- *
- * @param slr
- *   A pointer to the recursive spinlock.
- * @return
- *   1 if the lock is successfully taken; 0 otherwise.
- */
-static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
-{
-	int id = rte_lcore_id();
-
-	if (slr->user != id) {
-		if (rte_spinlock_trylock(&slr->sl) == 0)
-			return 0;
-		slr->user = id;
-	}
-	slr->count++;
-	return 1;
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_SPINLOCK_H_ */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v2 6/7] Split memcpy operation to architecture specific
  2014-10-16 10:44 [dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK Chao Zhu
                   ` (4 preceding siblings ...)
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 5/7] Split spinlock " Chao Zhu
@ 2014-10-16 10:44 ` Chao Zhu
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 7/7] Split CPU flags operations " Chao Zhu
  6 siblings, 0 replies; 8+ messages in thread
From: Chao Zhu @ 2014-10-16 10:44 UTC (permalink / raw)
  To: dev

This patch splits the SSE based memory copy function from DPDK and push
them to architecture specific arch directories. Other processor
architecture can implement it's own vector based memory copy functions.
Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_memcpy.h          |  376 ++++++++++++++++++++
 .../common/include/arch/x86_64/rte_memcpy.h        |  376 ++++++++++++++++++++
 lib/librte_eal/common/include/rte_memcpy.h         |  376 --------------------
 4 files changed, 754 insertions(+), 378 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
 delete mode 100644 lib/librte_eal/common/include/rte_memcpy.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 9b9a73d..e09d509 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -33,7 +33,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
-INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
+INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
 INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
@@ -47,7 +47,7 @@ INC += rte_warnings.h
 endif
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_spinlock.h
-ARCH_INC := $(GENERIC_INC) rte_prefetch.h
+ARCH_INC := $(GENERIC_INC) rte_prefetch.h rte_memcpy.h
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
diff --git a/lib/librte_eal/common/include/arch/i686/rte_memcpy.h b/lib/librte_eal/common/include/arch/i686/rte_memcpy.h
new file mode 100644
index 0000000..ba750b1
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_memcpy.h
@@ -0,0 +1,376 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_I686_H_
+#define _RTE_MEMCPY_I686_H_
+
+/**
+ * @file
+ *
+ * Functions for SSE implementation of memcpy().
+ */
+
+#include <stdint.h>
+#include <string.h>
+#include <emmintrin.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef __INTEL_COMPILER
+#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
+#endif
+
+/**
+ * Copy 16 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		: [reg_a] "=x" (reg_a)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+/**
+ * Copy 32 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+/**
+ * Copy 48 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+/**
+ * Copy 64 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+/**
+ * Copy 128 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu 64(%[src]), %[reg_e]\n\t"
+		"movdqu 80(%[src]), %[reg_f]\n\t"
+		"movdqu 96(%[src]), %[reg_g]\n\t"
+		"movdqu 112(%[src]), %[reg_h]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		"movdqu %[reg_e], 64(%[dst])\n\t"
+		"movdqu %[reg_f], 80(%[dst])\n\t"
+		"movdqu %[reg_g], 96(%[dst])\n\t"
+		"movdqu %[reg_h], 112(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d),
+		  [reg_e] "=x" (reg_e),
+		  [reg_f] "=x" (reg_f),
+		  [reg_g] "=x" (reg_g),
+		  [reg_h] "=x" (reg_h)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+#ifdef __INTEL_COMPILER
+#pragma warning(enable:593)
+#endif
+
+/**
+ * Copy 256 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	rte_mov128(dst, src);
+	rte_mov128(dst + 128, src + 128);
+}
+
+/**
+ * Copy bytes from one location to another. The locations must not overlap.
+ *
+ * @note This is implemented as a macro, so it's address should not be taken
+ * and care is needed as parameter expressions may be evaluated multiple times.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ * @param n
+ *   Number of bytes to copy.
+ * @return
+ *   Pointer to the destination data.
+ */
+#define rte_memcpy(dst, src, n)              \
+	((__builtin_constant_p(n)) ?          \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)))
+
+/*
+ * memcpy() function used by rte_memcpy macro
+ */
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n) __attribute__((always_inline));
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			*(uint64_t *)dst = *(const uint64_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0) {
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+	}
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h b/lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
new file mode 100644
index 0000000..11c7b86
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
@@ -0,0 +1,376 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_X86_64_H_
+#define _RTE_MEMCPY_X86_64_H_
+
+/**
+ * @file
+ *
+ * Functions for SSE implementation of memcpy().
+ */
+
+#include <stdint.h>
+#include <string.h>
+#include <emmintrin.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef __INTEL_COMPILER
+#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
+#endif
+
+/**
+ * Copy 16 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		: [reg_a] "=x" (reg_a)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+/**
+ * Copy 32 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+/**
+ * Copy 48 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+/**
+ * Copy 64 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+/**
+ * Copy 128 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu 64(%[src]), %[reg_e]\n\t"
+		"movdqu 80(%[src]), %[reg_f]\n\t"
+		"movdqu 96(%[src]), %[reg_g]\n\t"
+		"movdqu 112(%[src]), %[reg_h]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		"movdqu %[reg_e], 64(%[dst])\n\t"
+		"movdqu %[reg_f], 80(%[dst])\n\t"
+		"movdqu %[reg_g], 96(%[dst])\n\t"
+		"movdqu %[reg_h], 112(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d),
+		  [reg_e] "=x" (reg_e),
+		  [reg_f] "=x" (reg_f),
+		  [reg_g] "=x" (reg_g),
+		  [reg_h] "=x" (reg_h)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+#ifdef __INTEL_COMPILER
+#pragma warning(enable:593)
+#endif
+
+/**
+ * Copy 256 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	rte_mov128(dst, src);
+	rte_mov128(dst + 128, src + 128);
+}
+
+/**
+ * Copy bytes from one location to another. The locations must not overlap.
+ *
+ * @note This is implemented as a macro, so it's address should not be taken
+ * and care is needed as parameter expressions may be evaluated multiple times.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ * @param n
+ *   Number of bytes to copy.
+ * @return
+ *   Pointer to the destination data.
+ */
+#define rte_memcpy(dst, src, n)              \
+	((__builtin_constant_p(n)) ?          \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)))
+
+/*
+ * memcpy() function used by rte_memcpy macro
+ */
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n) __attribute__((always_inline));
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			*(uint64_t *)dst = *(const uint64_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0) {
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+	}
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/rte_memcpy.h b/lib/librte_eal/common/include/rte_memcpy.h
deleted file mode 100644
index 131b196..0000000
--- a/lib/librte_eal/common/include/rte_memcpy.h
+++ /dev/null
@@ -1,376 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_MEMCPY_H_
-#define _RTE_MEMCPY_H_
-
-/**
- * @file
- *
- * Functions for SSE implementation of memcpy().
- */
-
-#include <stdint.h>
-#include <string.h>
-#include <emmintrin.h>
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#ifdef __INTEL_COMPILER
-#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
-#endif
-
-/**
- * Copy 16 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov16(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		: [reg_a] "=x" (reg_a)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-/**
- * Copy 32 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov32(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-/**
- * Copy 48 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov48(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-/**
- * Copy 64 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov64(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c, reg_d;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu 48(%[src]), %[reg_d]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		"movdqu %[reg_d], 48(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c),
-		  [reg_d] "=x" (reg_d)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-/**
- * Copy 128 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov128(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu 48(%[src]), %[reg_d]\n\t"
-		"movdqu 64(%[src]), %[reg_e]\n\t"
-		"movdqu 80(%[src]), %[reg_f]\n\t"
-		"movdqu 96(%[src]), %[reg_g]\n\t"
-		"movdqu 112(%[src]), %[reg_h]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		"movdqu %[reg_d], 48(%[dst])\n\t"
-		"movdqu %[reg_e], 64(%[dst])\n\t"
-		"movdqu %[reg_f], 80(%[dst])\n\t"
-		"movdqu %[reg_g], 96(%[dst])\n\t"
-		"movdqu %[reg_h], 112(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c),
-		  [reg_d] "=x" (reg_d),
-		  [reg_e] "=x" (reg_e),
-		  [reg_f] "=x" (reg_f),
-		  [reg_g] "=x" (reg_g),
-		  [reg_h] "=x" (reg_h)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-#ifdef __INTEL_COMPILER
-#pragma warning(enable:593)
-#endif
-
-/**
- * Copy 256 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov256(uint8_t *dst, const uint8_t *src)
-{
-	rte_mov128(dst, src);
-	rte_mov128(dst + 128, src + 128);
-}
-
-/**
- * Copy bytes from one location to another. The locations must not overlap.
- *
- * @note This is implemented as a macro, so it's address should not be taken
- * and care is needed as parameter expressions may be evaluated multiple times.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- * @param n
- *   Number of bytes to copy.
- * @return
- *   Pointer to the destination data.
- */
-#define rte_memcpy(dst, src, n)              \
-	((__builtin_constant_p(n)) ?          \
-	memcpy((dst), (src), (n)) :          \
-	rte_memcpy_func((dst), (src), (n)))
-
-/*
- * memcpy() function used by rte_memcpy macro
- */
-static inline void *
-rte_memcpy_func(void *dst, const void *src, size_t n) __attribute__((always_inline));
-
-static inline void *
-rte_memcpy_func(void *dst, const void *src, size_t n)
-{
-	void *ret = dst;
-
-	/* We can't copy < 16 bytes using XMM registers so do it manually. */
-	if (n < 16) {
-		if (n & 0x01) {
-			*(uint8_t *)dst = *(const uint8_t *)src;
-			dst = (uint8_t *)dst + 1;
-			src = (const uint8_t *)src + 1;
-		}
-		if (n & 0x02) {
-			*(uint16_t *)dst = *(const uint16_t *)src;
-			dst = (uint16_t *)dst + 1;
-			src = (const uint16_t *)src + 1;
-		}
-		if (n & 0x04) {
-			*(uint32_t *)dst = *(const uint32_t *)src;
-			dst = (uint32_t *)dst + 1;
-			src = (const uint32_t *)src + 1;
-		}
-		if (n & 0x08) {
-			*(uint64_t *)dst = *(const uint64_t *)src;
-		}
-		return ret;
-	}
-
-	/* Special fast cases for <= 128 bytes */
-	if (n <= 32) {
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
-		return ret;
-	}
-
-	if (n <= 64) {
-		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
-		return ret;
-	}
-
-	if (n <= 128) {
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
-		return ret;
-	}
-
-	/*
-	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
-	 * copies was found to be faster than doing 128 and 32 byte copies as
-	 * well.
-	 */
-	for ( ; n >= 256; n -= 256) {
-		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
-		dst = (uint8_t *)dst + 256;
-		src = (const uint8_t *)src + 256;
-	}
-
-	/*
-	 * We split the remaining bytes (which will be less than 256) into
-	 * 64byte (2^6) chunks.
-	 * Using incrementing integers in the case labels of a switch statement
-	 * enourages the compiler to use a jump table. To get incrementing
-	 * integers, we shift the 2 relevant bits to the LSB position to first
-	 * get decrementing integers, and then subtract.
-	 */
-	switch (3 - (n >> 6)) {
-	case 0x00:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	case 0x01:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	case 0x02:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	default:
-		;
-	}
-
-	/*
-	 * We split the remaining bytes (which will be less than 64) into
-	 * 16byte (2^4) chunks, using the same switch structure as above.
-	 */
-	switch (3 - (n >> 4)) {
-	case 0x00:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	case 0x01:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	case 0x02:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	default:
-		;
-	}
-
-	/* Copy any remaining bytes, without going beyond end of buffers */
-	if (n != 0) {
-		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
-	}
-	return ret;
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_MEMCPY_H_ */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH v2 7/7] Split CPU flags operations to architecture specific
  2014-10-16 10:44 [dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK Chao Zhu
                   ` (5 preceding siblings ...)
  2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 6/7] Split memcpy operation " Chao Zhu
@ 2014-10-16 10:44 ` Chao Zhu
  6 siblings, 0 replies; 8+ messages in thread
From: Chao Zhu @ 2014-10-16 10:44 UTC (permalink / raw)
  To: dev

This patch splits CPU flags related operations from DPDK and push them
to architecture specific arch directories, so that other processor
architecture can implement it's own CPU flag functions to support DPDK.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 lib/librte_eal/common/eal_common_cpuflags.c        |  190 ----------
 .../common/include/arch/i686/rte_cpuflags.h        |  364 ++++++++++++++++++++
 .../common/include/arch/x86_64/rte_cpuflags.h      |  364 ++++++++++++++++++++
 lib/librte_eal/common/include/rte_cpuflags.h       |  182 ----------
 5 files changed, 730 insertions(+), 374 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
 delete mode 100644 lib/librte_eal/common/include/rte_cpuflags.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index e09d509..79f378e 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -36,7 +36,7 @@ INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
 INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
-INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
+INC += rte_string_fns.h rte_version.h rte_tailq_elem.h
 INC += rte_eal_memconfig.h rte_malloc_heap.h
 INC += rte_hexdump.h rte_devargs.h rte_dev.h
 INC += rte_common_vect.h
@@ -47,7 +47,7 @@ INC += rte_warnings.h
 endif
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_spinlock.h
-ARCH_INC := $(GENERIC_INC) rte_prefetch.h rte_memcpy.h
+ARCH_INC := $(GENERIC_INC) rte_prefetch.h rte_memcpy.h rte_cpuflags.h
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
index 9e79179..6fd360c 100644
--- a/lib/librte_eal/common/eal_common_cpuflags.c
+++ b/lib/librte_eal/common/eal_common_cpuflags.c
@@ -30,10 +30,6 @@
  *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
-#include <stdlib.h>
-#include <stdio.h>
-#include <errno.h>
-#include <stdint.h>
 #include <rte_cpuflags.h>
 
 /*
@@ -50,192 +46,6 @@
 #endif
 
 /**
- * Enumeration of CPU registers
- */
-enum cpu_register_t {
-	REG_EAX = 0,
-	REG_EBX,
-	REG_ECX,
-	REG_EDX,
-};
-
-typedef uint32_t cpuid_registers_t[4];
-
-#define CPU_FLAG_NAME_MAX_LEN 64
-
-/**
- * Struct to hold a processor feature entry
- */
-struct feature_entry {
-	uint32_t leaf;				/**< cpuid leaf */
-	uint32_t subleaf;			/**< cpuid subleaf */
-	uint32_t reg;				/**< cpuid register */
-	uint32_t bit;				/**< cpuid register bit */
-	char name[CPU_FLAG_NAME_MAX_LEN];       /**< String for printing */
-};
-
-#define FEAT_DEF(name, leaf, subleaf, reg, bit) \
-	[RTE_CPUFLAG_##name] = {leaf, subleaf, reg, bit, #name },
-
-/**
- * An array that holds feature entries
- */
-static const struct feature_entry cpu_feature_table[] = {
-	FEAT_DEF(SSE3, 0x00000001, 0, REG_ECX,  0)
-	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, REG_ECX,  1)
-	FEAT_DEF(DTES64, 0x00000001, 0, REG_ECX,  2)
-	FEAT_DEF(MONITOR, 0x00000001, 0, REG_ECX,  3)
-	FEAT_DEF(DS_CPL, 0x00000001, 0, REG_ECX,  4)
-	FEAT_DEF(VMX, 0x00000001, 0, REG_ECX,  5)
-	FEAT_DEF(SMX, 0x00000001, 0, REG_ECX,  6)
-	FEAT_DEF(EIST, 0x00000001, 0, REG_ECX,  7)
-	FEAT_DEF(TM2, 0x00000001, 0, REG_ECX,  8)
-	FEAT_DEF(SSSE3, 0x00000001, 0, REG_ECX,  9)
-	FEAT_DEF(CNXT_ID, 0x00000001, 0, REG_ECX, 10)
-	FEAT_DEF(FMA, 0x00000001, 0, REG_ECX, 12)
-	FEAT_DEF(CMPXCHG16B, 0x00000001, 0, REG_ECX, 13)
-	FEAT_DEF(XTPR, 0x00000001, 0, REG_ECX, 14)
-	FEAT_DEF(PDCM, 0x00000001, 0, REG_ECX, 15)
-	FEAT_DEF(PCID, 0x00000001, 0, REG_ECX, 17)
-	FEAT_DEF(DCA, 0x00000001, 0, REG_ECX, 18)
-	FEAT_DEF(SSE4_1, 0x00000001, 0, REG_ECX, 19)
-	FEAT_DEF(SSE4_2, 0x00000001, 0, REG_ECX, 20)
-	FEAT_DEF(X2APIC, 0x00000001, 0, REG_ECX, 21)
-	FEAT_DEF(MOVBE, 0x00000001, 0, REG_ECX, 22)
-	FEAT_DEF(POPCNT, 0x00000001, 0, REG_ECX, 23)
-	FEAT_DEF(TSC_DEADLINE, 0x00000001, 0, REG_ECX, 24)
-	FEAT_DEF(AES, 0x00000001, 0, REG_ECX, 25)
-	FEAT_DEF(XSAVE, 0x00000001, 0, REG_ECX, 26)
-	FEAT_DEF(OSXSAVE, 0x00000001, 0, REG_ECX, 27)
-	FEAT_DEF(AVX, 0x00000001, 0, REG_ECX, 28)
-	FEAT_DEF(F16C, 0x00000001, 0, REG_ECX, 29)
-	FEAT_DEF(RDRAND, 0x00000001, 0, REG_ECX, 30)
-
-	FEAT_DEF(FPU, 0x00000001, 0, REG_EDX,  0)
-	FEAT_DEF(VME, 0x00000001, 0, REG_EDX,  1)
-	FEAT_DEF(DE, 0x00000001, 0, REG_EDX,  2)
-	FEAT_DEF(PSE, 0x00000001, 0, REG_EDX,  3)
-	FEAT_DEF(TSC, 0x00000001, 0, REG_EDX,  4)
-	FEAT_DEF(MSR, 0x00000001, 0, REG_EDX,  5)
-	FEAT_DEF(PAE, 0x00000001, 0, REG_EDX,  6)
-	FEAT_DEF(MCE, 0x00000001, 0, REG_EDX,  7)
-	FEAT_DEF(CX8, 0x00000001, 0, REG_EDX,  8)
-	FEAT_DEF(APIC, 0x00000001, 0, REG_EDX,  9)
-	FEAT_DEF(SEP, 0x00000001, 0, REG_EDX, 11)
-	FEAT_DEF(MTRR, 0x00000001, 0, REG_EDX, 12)
-	FEAT_DEF(PGE, 0x00000001, 0, REG_EDX, 13)
-	FEAT_DEF(MCA, 0x00000001, 0, REG_EDX, 14)
-	FEAT_DEF(CMOV, 0x00000001, 0, REG_EDX, 15)
-	FEAT_DEF(PAT, 0x00000001, 0, REG_EDX, 16)
-	FEAT_DEF(PSE36, 0x00000001, 0, REG_EDX, 17)
-	FEAT_DEF(PSN, 0x00000001, 0, REG_EDX, 18)
-	FEAT_DEF(CLFSH, 0x00000001, 0, REG_EDX, 19)
-	FEAT_DEF(DS, 0x00000001, 0, REG_EDX, 21)
-	FEAT_DEF(ACPI, 0x00000001, 0, REG_EDX, 22)
-	FEAT_DEF(MMX, 0x00000001, 0, REG_EDX, 23)
-	FEAT_DEF(FXSR, 0x00000001, 0, REG_EDX, 24)
-	FEAT_DEF(SSE, 0x00000001, 0, REG_EDX, 25)
-	FEAT_DEF(SSE2, 0x00000001, 0, REG_EDX, 26)
-	FEAT_DEF(SS, 0x00000001, 0, REG_EDX, 27)
-	FEAT_DEF(HTT, 0x00000001, 0, REG_EDX, 28)
-	FEAT_DEF(TM, 0x00000001, 0, REG_EDX, 29)
-	FEAT_DEF(PBE, 0x00000001, 0, REG_EDX, 31)
-
-	FEAT_DEF(DIGTEMP, 0x00000006, 0, REG_EAX,  0)
-	FEAT_DEF(TRBOBST, 0x00000006, 0, REG_EAX,  1)
-	FEAT_DEF(ARAT, 0x00000006, 0, REG_EAX,  2)
-	FEAT_DEF(PLN, 0x00000006, 0, REG_EAX,  4)
-	FEAT_DEF(ECMD, 0x00000006, 0, REG_EAX,  5)
-	FEAT_DEF(PTM, 0x00000006, 0, REG_EAX,  6)
-
-	FEAT_DEF(MPERF_APERF_MSR, 0x00000006, 0, REG_ECX,  0)
-	FEAT_DEF(ACNT2, 0x00000006, 0, REG_ECX,  1)
-	FEAT_DEF(ENERGY_EFF, 0x00000006, 0, REG_ECX,  3)
-
-	FEAT_DEF(FSGSBASE, 0x00000007, 0, REG_EBX,  0)
-	FEAT_DEF(BMI1, 0x00000007, 0, REG_EBX,  2)
-	FEAT_DEF(HLE, 0x00000007, 0, REG_EBX,  4)
-	FEAT_DEF(AVX2, 0x00000007, 0, REG_EBX,  5)
-	FEAT_DEF(SMEP, 0x00000007, 0, REG_EBX,  6)
-	FEAT_DEF(BMI2, 0x00000007, 0, REG_EBX,  7)
-	FEAT_DEF(ERMS, 0x00000007, 0, REG_EBX,  8)
-	FEAT_DEF(INVPCID, 0x00000007, 0, REG_EBX, 10)
-	FEAT_DEF(RTM, 0x00000007, 0, REG_EBX, 11)
-
-	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, REG_ECX,  0)
-	FEAT_DEF(LZCNT, 0x80000001, 0, REG_ECX,  4)
-
-	FEAT_DEF(SYSCALL, 0x80000001, 0, REG_EDX, 11)
-	FEAT_DEF(XD, 0x80000001, 0, REG_EDX, 20)
-	FEAT_DEF(1GB_PG, 0x80000001, 0, REG_EDX, 26)
-	FEAT_DEF(RDTSCP, 0x80000001, 0, REG_EDX, 27)
-	FEAT_DEF(EM64T, 0x80000001, 0, REG_EDX, 29)
-
-	FEAT_DEF(INVTSC, 0x80000007, 0, REG_EDX,  8)
-};
-
-/*
- * Execute CPUID instruction and get contents of a specific register
- *
- * This function, when compiled with GCC, will generate architecture-neutral
- * code, as per GCC manual.
- */
-static inline void
-rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out)
-{
-#if defined(__i386__) && defined(__PIC__)
-    /* %ebx is a forbidden register if we compile with -fPIC or -fPIE */
-    asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0"
-		 : "=r" (out[REG_EBX]),
-		   "=a" (out[REG_EAX]),
-		   "=c" (out[REG_ECX]),
-		   "=d" (out[REG_EDX])
-		 : "a" (leaf), "c" (subleaf));
-#else
-
-    asm volatile("cpuid"
-		 : "=a" (out[REG_EAX]),
-		   "=b" (out[REG_EBX]),
-		   "=c" (out[REG_ECX]),
-		   "=d" (out[REG_EDX])
-		 : "a" (leaf), "c" (subleaf));
-
-#endif
-}
-
-/*
- * Checks if a particular flag is available on current machine.
- */
-int
-rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
-{
-	const struct feature_entry *feat;
-	cpuid_registers_t regs;
-
-
-	if (feature >= RTE_CPUFLAG_NUMFLAGS)
-		/* Flag does not match anything in the feature tables */
-		return -ENOENT;
-
-	feat = &cpu_feature_table[feature];
-
-	if (!feat->leaf)
-		/* This entry in the table wasn't filled out! */
-		return -EFAULT;
-
-	rte_cpu_get_features(feat->leaf & 0xffff0000, 0, regs);
-	if (((regs[REG_EAX] ^ feat->leaf) & 0xffff0000) ||
-	      regs[REG_EAX] < feat->leaf)
-		return 0;
-
-	/* get the cpuid leaf containing the desired feature */
-	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
-
-	/* check if the feature is enabled */
-	return (regs[feat->reg] >> feat->bit) & 1;
-}
-
-/**
  * Checks if the machine is adequate for running the binary. If it is not, the
  * program exits with status 1.
  * The function attribute forces this function to be called before main(). But
diff --git a/lib/librte_eal/common/include/arch/i686/rte_cpuflags.h b/lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
new file mode 100644
index 0000000..9c43ffc
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
@@ -0,0 +1,364 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+ 
+#ifndef _RTE_CPUFLAGS_I686_H_
+#define _RTE_CPUFLAGS_I686_H_
+
+/**
+ * @file
+ * Architecture specific API to determine available CPU features at runtime. 
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <errno.h>
+#include <stdint.h>
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+	/* (EAX 01h) ECX features*/
+	RTE_CPUFLAG_SSE3 = 0,               /**< SSE3 */
+	RTE_CPUFLAG_PCLMULQDQ,              /**< PCLMULQDQ */
+	RTE_CPUFLAG_DTES64,                 /**< DTES64 */
+	RTE_CPUFLAG_MONITOR,                /**< MONITOR */
+	RTE_CPUFLAG_DS_CPL,                 /**< DS_CPL */
+	RTE_CPUFLAG_VMX,                    /**< VMX */
+	RTE_CPUFLAG_SMX,                    /**< SMX */
+	RTE_CPUFLAG_EIST,                   /**< EIST */
+	RTE_CPUFLAG_TM2,                    /**< TM2 */
+	RTE_CPUFLAG_SSSE3,                  /**< SSSE3 */
+	RTE_CPUFLAG_CNXT_ID,                /**< CNXT_ID */
+	RTE_CPUFLAG_FMA,                    /**< FMA */
+	RTE_CPUFLAG_CMPXCHG16B,             /**< CMPXCHG16B */
+	RTE_CPUFLAG_XTPR,                   /**< XTPR */
+	RTE_CPUFLAG_PDCM,                   /**< PDCM */
+	RTE_CPUFLAG_PCID,                   /**< PCID */
+	RTE_CPUFLAG_DCA,                    /**< DCA */
+	RTE_CPUFLAG_SSE4_1,                 /**< SSE4_1 */
+	RTE_CPUFLAG_SSE4_2,                 /**< SSE4_2 */
+	RTE_CPUFLAG_X2APIC,                 /**< X2APIC */
+	RTE_CPUFLAG_MOVBE,                  /**< MOVBE */
+	RTE_CPUFLAG_POPCNT,                 /**< POPCNT */
+	RTE_CPUFLAG_TSC_DEADLINE,           /**< TSC_DEADLINE */
+	RTE_CPUFLAG_AES,                    /**< AES */
+	RTE_CPUFLAG_XSAVE,                  /**< XSAVE */
+	RTE_CPUFLAG_OSXSAVE,                /**< OSXSAVE */
+	RTE_CPUFLAG_AVX,                    /**< AVX */
+	RTE_CPUFLAG_F16C,                   /**< F16C */
+	RTE_CPUFLAG_RDRAND,                 /**< RDRAND */
+
+	/* (EAX 01h) EDX features */
+	RTE_CPUFLAG_FPU,                    /**< FPU */
+	RTE_CPUFLAG_VME,                    /**< VME */
+	RTE_CPUFLAG_DE,                     /**< DE */
+	RTE_CPUFLAG_PSE,                    /**< PSE */
+	RTE_CPUFLAG_TSC,                    /**< TSC */
+	RTE_CPUFLAG_MSR,                    /**< MSR */
+	RTE_CPUFLAG_PAE,                    /**< PAE */
+	RTE_CPUFLAG_MCE,                    /**< MCE */
+	RTE_CPUFLAG_CX8,                    /**< CX8 */
+	RTE_CPUFLAG_APIC,                   /**< APIC */
+	RTE_CPUFLAG_SEP,                    /**< SEP */
+	RTE_CPUFLAG_MTRR,                   /**< MTRR */
+	RTE_CPUFLAG_PGE,                    /**< PGE */
+	RTE_CPUFLAG_MCA,                    /**< MCA */
+	RTE_CPUFLAG_CMOV,                   /**< CMOV */
+	RTE_CPUFLAG_PAT,                    /**< PAT */
+	RTE_CPUFLAG_PSE36,                  /**< PSE36 */
+	RTE_CPUFLAG_PSN,                    /**< PSN */
+	RTE_CPUFLAG_CLFSH,                  /**< CLFSH */
+	RTE_CPUFLAG_DS,                     /**< DS */
+	RTE_CPUFLAG_ACPI,                   /**< ACPI */
+	RTE_CPUFLAG_MMX,                    /**< MMX */
+	RTE_CPUFLAG_FXSR,                   /**< FXSR */
+	RTE_CPUFLAG_SSE,                    /**< SSE */
+	RTE_CPUFLAG_SSE2,                   /**< SSE2 */
+	RTE_CPUFLAG_SS,                     /**< SS */
+	RTE_CPUFLAG_HTT,                    /**< HTT */
+	RTE_CPUFLAG_TM,                     /**< TM */
+	RTE_CPUFLAG_PBE,                    /**< PBE */
+
+	/* (EAX 06h) EAX features */
+	RTE_CPUFLAG_DIGTEMP,                /**< DIGTEMP */
+	RTE_CPUFLAG_TRBOBST,                /**< TRBOBST */
+	RTE_CPUFLAG_ARAT,                   /**< ARAT */
+	RTE_CPUFLAG_PLN,                    /**< PLN */
+	RTE_CPUFLAG_ECMD,                   /**< ECMD */
+	RTE_CPUFLAG_PTM,                    /**< PTM */
+
+	/* (EAX 06h) ECX features */
+	RTE_CPUFLAG_MPERF_APERF_MSR,        /**< MPERF_APERF_MSR */
+	RTE_CPUFLAG_ACNT2,                  /**< ACNT2 */
+	RTE_CPUFLAG_ENERGY_EFF,             /**< ENERGY_EFF */
+
+	/* (EAX 07h, ECX 0h) EBX features */
+	RTE_CPUFLAG_FSGSBASE,               /**< FSGSBASE */
+	RTE_CPUFLAG_BMI1,                   /**< BMI1 */
+	RTE_CPUFLAG_HLE,                    /**< Hardware Lock elision */
+	RTE_CPUFLAG_AVX2,                   /**< AVX2 */
+	RTE_CPUFLAG_SMEP,                   /**< SMEP */
+	RTE_CPUFLAG_BMI2,                   /**< BMI2 */
+	RTE_CPUFLAG_ERMS,                   /**< ERMS */
+	RTE_CPUFLAG_INVPCID,                /**< INVPCID */
+	RTE_CPUFLAG_RTM,                    /**< Transactional memory */
+
+	/* (EAX 80000001h) ECX features */
+	RTE_CPUFLAG_LAHF_SAHF,              /**< LAHF_SAHF */
+	RTE_CPUFLAG_LZCNT,                  /**< LZCNT */
+
+	/* (EAX 80000001h) EDX features */
+	RTE_CPUFLAG_SYSCALL,                /**< SYSCALL */
+	RTE_CPUFLAG_XD,                     /**< XD */
+	RTE_CPUFLAG_1GB_PG,                 /**< 1GB_PG */
+	RTE_CPUFLAG_RDTSCP,                 /**< RDTSCP */
+	RTE_CPUFLAG_EM64T,                  /**< EM64T */
+
+	/* (EAX 80000007h) EDX features */
+	RTE_CPUFLAG_INVTSC,                 /**< INVTSC */
+
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
+};
+
+/**
+ * Enumeration of CPU registers
+ */
+enum cpu_register_t {
+	REG_EAX = 0,
+	REG_EBX,
+	REG_ECX,
+	REG_EDX,
+};
+
+typedef uint32_t cpuid_registers_t[4];
+
+#define CPU_FLAG_NAME_MAX_LEN 64
+
+/**
+ * Struct to hold a processor feature entry
+ */
+struct feature_entry {
+	uint32_t leaf;				/**< cpuid leaf */
+	uint32_t subleaf;			/**< cpuid subleaf */
+	uint32_t reg;				/**< cpuid register */
+	uint32_t bit;				/**< cpuid register bit */
+	char name[CPU_FLAG_NAME_MAX_LEN];       /**< String for printing */
+};
+
+#define FEAT_DEF(name, leaf, subleaf, reg, bit) \
+	[RTE_CPUFLAG_##name] = {leaf, subleaf, reg, bit, #name },
+
+/**
+ * An array that holds feature entries
+ */
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(SSE3, 0x00000001, 0, REG_ECX,  0)
+	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, REG_ECX,  1)
+	FEAT_DEF(DTES64, 0x00000001, 0, REG_ECX,  2)
+	FEAT_DEF(MONITOR, 0x00000001, 0, REG_ECX,  3)
+	FEAT_DEF(DS_CPL, 0x00000001, 0, REG_ECX,  4)
+	FEAT_DEF(VMX, 0x00000001, 0, REG_ECX,  5)
+	FEAT_DEF(SMX, 0x00000001, 0, REG_ECX,  6)
+	FEAT_DEF(EIST, 0x00000001, 0, REG_ECX,  7)
+	FEAT_DEF(TM2, 0x00000001, 0, REG_ECX,  8)
+	FEAT_DEF(SSSE3, 0x00000001, 0, REG_ECX,  9)
+	FEAT_DEF(CNXT_ID, 0x00000001, 0, REG_ECX, 10)
+	FEAT_DEF(FMA, 0x00000001, 0, REG_ECX, 12)
+	FEAT_DEF(CMPXCHG16B, 0x00000001, 0, REG_ECX, 13)
+	FEAT_DEF(XTPR, 0x00000001, 0, REG_ECX, 14)
+	FEAT_DEF(PDCM, 0x00000001, 0, REG_ECX, 15)
+	FEAT_DEF(PCID, 0x00000001, 0, REG_ECX, 17)
+	FEAT_DEF(DCA, 0x00000001, 0, REG_ECX, 18)
+	FEAT_DEF(SSE4_1, 0x00000001, 0, REG_ECX, 19)
+	FEAT_DEF(SSE4_2, 0x00000001, 0, REG_ECX, 20)
+	FEAT_DEF(X2APIC, 0x00000001, 0, REG_ECX, 21)
+	FEAT_DEF(MOVBE, 0x00000001, 0, REG_ECX, 22)
+	FEAT_DEF(POPCNT, 0x00000001, 0, REG_ECX, 23)
+	FEAT_DEF(TSC_DEADLINE, 0x00000001, 0, REG_ECX, 24)
+	FEAT_DEF(AES, 0x00000001, 0, REG_ECX, 25)
+	FEAT_DEF(XSAVE, 0x00000001, 0, REG_ECX, 26)
+	FEAT_DEF(OSXSAVE, 0x00000001, 0, REG_ECX, 27)
+	FEAT_DEF(AVX, 0x00000001, 0, REG_ECX, 28)
+	FEAT_DEF(F16C, 0x00000001, 0, REG_ECX, 29)
+	FEAT_DEF(RDRAND, 0x00000001, 0, REG_ECX, 30)
+
+	FEAT_DEF(FPU, 0x00000001, 0, REG_EDX,  0)
+	FEAT_DEF(VME, 0x00000001, 0, REG_EDX,  1)
+	FEAT_DEF(DE, 0x00000001, 0, REG_EDX,  2)
+	FEAT_DEF(PSE, 0x00000001, 0, REG_EDX,  3)
+	FEAT_DEF(TSC, 0x00000001, 0, REG_EDX,  4)
+	FEAT_DEF(MSR, 0x00000001, 0, REG_EDX,  5)
+	FEAT_DEF(PAE, 0x00000001, 0, REG_EDX,  6)
+	FEAT_DEF(MCE, 0x00000001, 0, REG_EDX,  7)
+	FEAT_DEF(CX8, 0x00000001, 0, REG_EDX,  8)
+	FEAT_DEF(APIC, 0x00000001, 0, REG_EDX,  9)
+	FEAT_DEF(SEP, 0x00000001, 0, REG_EDX, 11)
+	FEAT_DEF(MTRR, 0x00000001, 0, REG_EDX, 12)
+	FEAT_DEF(PGE, 0x00000001, 0, REG_EDX, 13)
+	FEAT_DEF(MCA, 0x00000001, 0, REG_EDX, 14)
+	FEAT_DEF(CMOV, 0x00000001, 0, REG_EDX, 15)
+	FEAT_DEF(PAT, 0x00000001, 0, REG_EDX, 16)
+	FEAT_DEF(PSE36, 0x00000001, 0, REG_EDX, 17)
+	FEAT_DEF(PSN, 0x00000001, 0, REG_EDX, 18)
+	FEAT_DEF(CLFSH, 0x00000001, 0, REG_EDX, 19)
+	FEAT_DEF(DS, 0x00000001, 0, REG_EDX, 21)
+	FEAT_DEF(ACPI, 0x00000001, 0, REG_EDX, 22)
+	FEAT_DEF(MMX, 0x00000001, 0, REG_EDX, 23)
+	FEAT_DEF(FXSR, 0x00000001, 0, REG_EDX, 24)
+	FEAT_DEF(SSE, 0x00000001, 0, REG_EDX, 25)
+	FEAT_DEF(SSE2, 0x00000001, 0, REG_EDX, 26)
+	FEAT_DEF(SS, 0x00000001, 0, REG_EDX, 27)
+	FEAT_DEF(HTT, 0x00000001, 0, REG_EDX, 28)
+	FEAT_DEF(TM, 0x00000001, 0, REG_EDX, 29)
+	FEAT_DEF(PBE, 0x00000001, 0, REG_EDX, 31)
+
+	FEAT_DEF(DIGTEMP, 0x00000006, 0, REG_EAX,  0)
+	FEAT_DEF(TRBOBST, 0x00000006, 0, REG_EAX,  1)
+	FEAT_DEF(ARAT, 0x00000006, 0, REG_EAX,  2)
+	FEAT_DEF(PLN, 0x00000006, 0, REG_EAX,  4)
+	FEAT_DEF(ECMD, 0x00000006, 0, REG_EAX,  5)
+	FEAT_DEF(PTM, 0x00000006, 0, REG_EAX,  6)
+
+	FEAT_DEF(MPERF_APERF_MSR, 0x00000006, 0, REG_ECX,  0)
+	FEAT_DEF(ACNT2, 0x00000006, 0, REG_ECX,  1)
+	FEAT_DEF(ENERGY_EFF, 0x00000006, 0, REG_ECX,  3)
+
+	FEAT_DEF(FSGSBASE, 0x00000007, 0, REG_EBX,  0)
+	FEAT_DEF(BMI1, 0x00000007, 0, REG_EBX,  2)
+	FEAT_DEF(HLE, 0x00000007, 0, REG_EBX,  4)
+	FEAT_DEF(AVX2, 0x00000007, 0, REG_EBX,  5)
+	FEAT_DEF(SMEP, 0x00000007, 0, REG_EBX,  6)
+	FEAT_DEF(BMI2, 0x00000007, 0, REG_EBX,  7)
+	FEAT_DEF(ERMS, 0x00000007, 0, REG_EBX,  8)
+	FEAT_DEF(INVPCID, 0x00000007, 0, REG_EBX, 10)
+	FEAT_DEF(RTM, 0x00000007, 0, REG_EBX, 11)
+
+	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, REG_ECX,  0)
+	FEAT_DEF(LZCNT, 0x80000001, 0, REG_ECX,  4)
+
+	FEAT_DEF(SYSCALL, 0x80000001, 0, REG_EDX, 11)
+	FEAT_DEF(XD, 0x80000001, 0, REG_EDX, 20)
+	FEAT_DEF(1GB_PG, 0x80000001, 0, REG_EDX, 26)
+	FEAT_DEF(RDTSCP, 0x80000001, 0, REG_EDX, 27)
+	FEAT_DEF(EM64T, 0x80000001, 0, REG_EDX, 29)
+
+	FEAT_DEF(INVTSC, 0x80000007, 0, REG_EDX,  8)
+};
+
+/*
+ * Execute CPUID instruction and get contents of a specific register
+ *
+ * This function, when compiled with GCC, will generate architecture-neutral
+ * code, as per GCC manual.
+ */
+static inline void
+rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out)
+{
+#if defined(__i386__) && defined(__PIC__)
+    /* %ebx is a forbidden register if we compile with -fPIC or -fPIE */
+    asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0"
+		 : "=r" (out[REG_EBX]),
+		   "=a" (out[REG_EAX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+#else
+
+    asm volatile("cpuid"
+		 : "=a" (out[REG_EAX]),
+		   "=b" (out[REG_EBX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+
+#endif
+}
+
+/**
+ * Function for checking a CPU flag availability
+ *
+ * @param flag
+ *     CPU flag to query CPU for
+ * @return
+ *     1 if flag is available
+ *     0 if flag is not available
+ *     -ENOENT if flag is invalid
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs;
+
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	rte_cpu_get_features(feat->leaf & 0xffff0000, 0, regs);
+	if (((regs[REG_EAX] ^ feat->leaf) & 0xffff0000) ||
+	      regs[REG_EAX] < feat->leaf)
+		return 0;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+/**
+ * This function checks that the currently used CPU supports the CPU features
+ * that were specified at compile time. It is called automatically within the
+ * EAL, so does not need to be used by applications.
+ */
+void
+rte_cpu_check_supported(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_I686_H_ */
\ No newline at end of file
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h b/lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
new file mode 100644
index 0000000..f54de66
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
@@ -0,0 +1,364 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+ 
+#ifndef _RTE_CPUFLAGS_X86_64_H_
+#define _RTE_CPUFLAGS_X86_64_H_
+
+/**
+ * @file
+ * Architecture specific API to determine available CPU features at runtime. 
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <errno.h>
+#include <stdint.h>
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+	/* (EAX 01h) ECX features*/
+	RTE_CPUFLAG_SSE3 = 0,               /**< SSE3 */
+	RTE_CPUFLAG_PCLMULQDQ,              /**< PCLMULQDQ */
+	RTE_CPUFLAG_DTES64,                 /**< DTES64 */
+	RTE_CPUFLAG_MONITOR,                /**< MONITOR */
+	RTE_CPUFLAG_DS_CPL,                 /**< DS_CPL */
+	RTE_CPUFLAG_VMX,                    /**< VMX */
+	RTE_CPUFLAG_SMX,                    /**< SMX */
+	RTE_CPUFLAG_EIST,                   /**< EIST */
+	RTE_CPUFLAG_TM2,                    /**< TM2 */
+	RTE_CPUFLAG_SSSE3,                  /**< SSSE3 */
+	RTE_CPUFLAG_CNXT_ID,                /**< CNXT_ID */
+	RTE_CPUFLAG_FMA,                    /**< FMA */
+	RTE_CPUFLAG_CMPXCHG16B,             /**< CMPXCHG16B */
+	RTE_CPUFLAG_XTPR,                   /**< XTPR */
+	RTE_CPUFLAG_PDCM,                   /**< PDCM */
+	RTE_CPUFLAG_PCID,                   /**< PCID */
+	RTE_CPUFLAG_DCA,                    /**< DCA */
+	RTE_CPUFLAG_SSE4_1,                 /**< SSE4_1 */
+	RTE_CPUFLAG_SSE4_2,                 /**< SSE4_2 */
+	RTE_CPUFLAG_X2APIC,                 /**< X2APIC */
+	RTE_CPUFLAG_MOVBE,                  /**< MOVBE */
+	RTE_CPUFLAG_POPCNT,                 /**< POPCNT */
+	RTE_CPUFLAG_TSC_DEADLINE,           /**< TSC_DEADLINE */
+	RTE_CPUFLAG_AES,                    /**< AES */
+	RTE_CPUFLAG_XSAVE,                  /**< XSAVE */
+	RTE_CPUFLAG_OSXSAVE,                /**< OSXSAVE */
+	RTE_CPUFLAG_AVX,                    /**< AVX */
+	RTE_CPUFLAG_F16C,                   /**< F16C */
+	RTE_CPUFLAG_RDRAND,                 /**< RDRAND */
+
+	/* (EAX 01h) EDX features */
+	RTE_CPUFLAG_FPU,                    /**< FPU */
+	RTE_CPUFLAG_VME,                    /**< VME */
+	RTE_CPUFLAG_DE,                     /**< DE */
+	RTE_CPUFLAG_PSE,                    /**< PSE */
+	RTE_CPUFLAG_TSC,                    /**< TSC */
+	RTE_CPUFLAG_MSR,                    /**< MSR */
+	RTE_CPUFLAG_PAE,                    /**< PAE */
+	RTE_CPUFLAG_MCE,                    /**< MCE */
+	RTE_CPUFLAG_CX8,                    /**< CX8 */
+	RTE_CPUFLAG_APIC,                   /**< APIC */
+	RTE_CPUFLAG_SEP,                    /**< SEP */
+	RTE_CPUFLAG_MTRR,                   /**< MTRR */
+	RTE_CPUFLAG_PGE,                    /**< PGE */
+	RTE_CPUFLAG_MCA,                    /**< MCA */
+	RTE_CPUFLAG_CMOV,                   /**< CMOV */
+	RTE_CPUFLAG_PAT,                    /**< PAT */
+	RTE_CPUFLAG_PSE36,                  /**< PSE36 */
+	RTE_CPUFLAG_PSN,                    /**< PSN */
+	RTE_CPUFLAG_CLFSH,                  /**< CLFSH */
+	RTE_CPUFLAG_DS,                     /**< DS */
+	RTE_CPUFLAG_ACPI,                   /**< ACPI */
+	RTE_CPUFLAG_MMX,                    /**< MMX */
+	RTE_CPUFLAG_FXSR,                   /**< FXSR */
+	RTE_CPUFLAG_SSE,                    /**< SSE */
+	RTE_CPUFLAG_SSE2,                   /**< SSE2 */
+	RTE_CPUFLAG_SS,                     /**< SS */
+	RTE_CPUFLAG_HTT,                    /**< HTT */
+	RTE_CPUFLAG_TM,                     /**< TM */
+	RTE_CPUFLAG_PBE,                    /**< PBE */
+
+	/* (EAX 06h) EAX features */
+	RTE_CPUFLAG_DIGTEMP,                /**< DIGTEMP */
+	RTE_CPUFLAG_TRBOBST,                /**< TRBOBST */
+	RTE_CPUFLAG_ARAT,                   /**< ARAT */
+	RTE_CPUFLAG_PLN,                    /**< PLN */
+	RTE_CPUFLAG_ECMD,                   /**< ECMD */
+	RTE_CPUFLAG_PTM,                    /**< PTM */
+
+	/* (EAX 06h) ECX features */
+	RTE_CPUFLAG_MPERF_APERF_MSR,        /**< MPERF_APERF_MSR */
+	RTE_CPUFLAG_ACNT2,                  /**< ACNT2 */
+	RTE_CPUFLAG_ENERGY_EFF,             /**< ENERGY_EFF */
+
+	/* (EAX 07h, ECX 0h) EBX features */
+	RTE_CPUFLAG_FSGSBASE,               /**< FSGSBASE */
+	RTE_CPUFLAG_BMI1,                   /**< BMI1 */
+	RTE_CPUFLAG_HLE,                    /**< Hardware Lock elision */
+	RTE_CPUFLAG_AVX2,                   /**< AVX2 */
+	RTE_CPUFLAG_SMEP,                   /**< SMEP */
+	RTE_CPUFLAG_BMI2,                   /**< BMI2 */
+	RTE_CPUFLAG_ERMS,                   /**< ERMS */
+	RTE_CPUFLAG_INVPCID,                /**< INVPCID */
+	RTE_CPUFLAG_RTM,                    /**< Transactional memory */
+
+	/* (EAX 80000001h) ECX features */
+	RTE_CPUFLAG_LAHF_SAHF,              /**< LAHF_SAHF */
+	RTE_CPUFLAG_LZCNT,                  /**< LZCNT */
+
+	/* (EAX 80000001h) EDX features */
+	RTE_CPUFLAG_SYSCALL,                /**< SYSCALL */
+	RTE_CPUFLAG_XD,                     /**< XD */
+	RTE_CPUFLAG_1GB_PG,                 /**< 1GB_PG */
+	RTE_CPUFLAG_RDTSCP,                 /**< RDTSCP */
+	RTE_CPUFLAG_EM64T,                  /**< EM64T */
+
+	/* (EAX 80000007h) EDX features */
+	RTE_CPUFLAG_INVTSC,                 /**< INVTSC */
+
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
+};
+
+/**
+ * Enumeration of CPU registers
+ */
+enum cpu_register_t {
+	REG_EAX = 0,
+	REG_EBX,
+	REG_ECX,
+	REG_EDX,
+};
+
+typedef uint32_t cpuid_registers_t[4];
+
+#define CPU_FLAG_NAME_MAX_LEN 64
+
+/**
+ * Struct to hold a processor feature entry
+ */
+struct feature_entry {
+	uint32_t leaf;				/**< cpuid leaf */
+	uint32_t subleaf;			/**< cpuid subleaf */
+	uint32_t reg;				/**< cpuid register */
+	uint32_t bit;				/**< cpuid register bit */
+	char name[CPU_FLAG_NAME_MAX_LEN];       /**< String for printing */
+};
+
+#define FEAT_DEF(name, leaf, subleaf, reg, bit) \
+	[RTE_CPUFLAG_##name] = {leaf, subleaf, reg, bit, #name },
+
+/**
+ * An array that holds feature entries
+ */
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(SSE3, 0x00000001, 0, REG_ECX,  0)
+	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, REG_ECX,  1)
+	FEAT_DEF(DTES64, 0x00000001, 0, REG_ECX,  2)
+	FEAT_DEF(MONITOR, 0x00000001, 0, REG_ECX,  3)
+	FEAT_DEF(DS_CPL, 0x00000001, 0, REG_ECX,  4)
+	FEAT_DEF(VMX, 0x00000001, 0, REG_ECX,  5)
+	FEAT_DEF(SMX, 0x00000001, 0, REG_ECX,  6)
+	FEAT_DEF(EIST, 0x00000001, 0, REG_ECX,  7)
+	FEAT_DEF(TM2, 0x00000001, 0, REG_ECX,  8)
+	FEAT_DEF(SSSE3, 0x00000001, 0, REG_ECX,  9)
+	FEAT_DEF(CNXT_ID, 0x00000001, 0, REG_ECX, 10)
+	FEAT_DEF(FMA, 0x00000001, 0, REG_ECX, 12)
+	FEAT_DEF(CMPXCHG16B, 0x00000001, 0, REG_ECX, 13)
+	FEAT_DEF(XTPR, 0x00000001, 0, REG_ECX, 14)
+	FEAT_DEF(PDCM, 0x00000001, 0, REG_ECX, 15)
+	FEAT_DEF(PCID, 0x00000001, 0, REG_ECX, 17)
+	FEAT_DEF(DCA, 0x00000001, 0, REG_ECX, 18)
+	FEAT_DEF(SSE4_1, 0x00000001, 0, REG_ECX, 19)
+	FEAT_DEF(SSE4_2, 0x00000001, 0, REG_ECX, 20)
+	FEAT_DEF(X2APIC, 0x00000001, 0, REG_ECX, 21)
+	FEAT_DEF(MOVBE, 0x00000001, 0, REG_ECX, 22)
+	FEAT_DEF(POPCNT, 0x00000001, 0, REG_ECX, 23)
+	FEAT_DEF(TSC_DEADLINE, 0x00000001, 0, REG_ECX, 24)
+	FEAT_DEF(AES, 0x00000001, 0, REG_ECX, 25)
+	FEAT_DEF(XSAVE, 0x00000001, 0, REG_ECX, 26)
+	FEAT_DEF(OSXSAVE, 0x00000001, 0, REG_ECX, 27)
+	FEAT_DEF(AVX, 0x00000001, 0, REG_ECX, 28)
+	FEAT_DEF(F16C, 0x00000001, 0, REG_ECX, 29)
+	FEAT_DEF(RDRAND, 0x00000001, 0, REG_ECX, 30)
+
+	FEAT_DEF(FPU, 0x00000001, 0, REG_EDX,  0)
+	FEAT_DEF(VME, 0x00000001, 0, REG_EDX,  1)
+	FEAT_DEF(DE, 0x00000001, 0, REG_EDX,  2)
+	FEAT_DEF(PSE, 0x00000001, 0, REG_EDX,  3)
+	FEAT_DEF(TSC, 0x00000001, 0, REG_EDX,  4)
+	FEAT_DEF(MSR, 0x00000001, 0, REG_EDX,  5)
+	FEAT_DEF(PAE, 0x00000001, 0, REG_EDX,  6)
+	FEAT_DEF(MCE, 0x00000001, 0, REG_EDX,  7)
+	FEAT_DEF(CX8, 0x00000001, 0, REG_EDX,  8)
+	FEAT_DEF(APIC, 0x00000001, 0, REG_EDX,  9)
+	FEAT_DEF(SEP, 0x00000001, 0, REG_EDX, 11)
+	FEAT_DEF(MTRR, 0x00000001, 0, REG_EDX, 12)
+	FEAT_DEF(PGE, 0x00000001, 0, REG_EDX, 13)
+	FEAT_DEF(MCA, 0x00000001, 0, REG_EDX, 14)
+	FEAT_DEF(CMOV, 0x00000001, 0, REG_EDX, 15)
+	FEAT_DEF(PAT, 0x00000001, 0, REG_EDX, 16)
+	FEAT_DEF(PSE36, 0x00000001, 0, REG_EDX, 17)
+	FEAT_DEF(PSN, 0x00000001, 0, REG_EDX, 18)
+	FEAT_DEF(CLFSH, 0x00000001, 0, REG_EDX, 19)
+	FEAT_DEF(DS, 0x00000001, 0, REG_EDX, 21)
+	FEAT_DEF(ACPI, 0x00000001, 0, REG_EDX, 22)
+	FEAT_DEF(MMX, 0x00000001, 0, REG_EDX, 23)
+	FEAT_DEF(FXSR, 0x00000001, 0, REG_EDX, 24)
+	FEAT_DEF(SSE, 0x00000001, 0, REG_EDX, 25)
+	FEAT_DEF(SSE2, 0x00000001, 0, REG_EDX, 26)
+	FEAT_DEF(SS, 0x00000001, 0, REG_EDX, 27)
+	FEAT_DEF(HTT, 0x00000001, 0, REG_EDX, 28)
+	FEAT_DEF(TM, 0x00000001, 0, REG_EDX, 29)
+	FEAT_DEF(PBE, 0x00000001, 0, REG_EDX, 31)
+
+	FEAT_DEF(DIGTEMP, 0x00000006, 0, REG_EAX,  0)
+	FEAT_DEF(TRBOBST, 0x00000006, 0, REG_EAX,  1)
+	FEAT_DEF(ARAT, 0x00000006, 0, REG_EAX,  2)
+	FEAT_DEF(PLN, 0x00000006, 0, REG_EAX,  4)
+	FEAT_DEF(ECMD, 0x00000006, 0, REG_EAX,  5)
+	FEAT_DEF(PTM, 0x00000006, 0, REG_EAX,  6)
+
+	FEAT_DEF(MPERF_APERF_MSR, 0x00000006, 0, REG_ECX,  0)
+	FEAT_DEF(ACNT2, 0x00000006, 0, REG_ECX,  1)
+	FEAT_DEF(ENERGY_EFF, 0x00000006, 0, REG_ECX,  3)
+
+	FEAT_DEF(FSGSBASE, 0x00000007, 0, REG_EBX,  0)
+	FEAT_DEF(BMI1, 0x00000007, 0, REG_EBX,  2)
+	FEAT_DEF(HLE, 0x00000007, 0, REG_EBX,  4)
+	FEAT_DEF(AVX2, 0x00000007, 0, REG_EBX,  5)
+	FEAT_DEF(SMEP, 0x00000007, 0, REG_EBX,  6)
+	FEAT_DEF(BMI2, 0x00000007, 0, REG_EBX,  7)
+	FEAT_DEF(ERMS, 0x00000007, 0, REG_EBX,  8)
+	FEAT_DEF(INVPCID, 0x00000007, 0, REG_EBX, 10)
+	FEAT_DEF(RTM, 0x00000007, 0, REG_EBX, 11)
+
+	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, REG_ECX,  0)
+	FEAT_DEF(LZCNT, 0x80000001, 0, REG_ECX,  4)
+
+	FEAT_DEF(SYSCALL, 0x80000001, 0, REG_EDX, 11)
+	FEAT_DEF(XD, 0x80000001, 0, REG_EDX, 20)
+	FEAT_DEF(1GB_PG, 0x80000001, 0, REG_EDX, 26)
+	FEAT_DEF(RDTSCP, 0x80000001, 0, REG_EDX, 27)
+	FEAT_DEF(EM64T, 0x80000001, 0, REG_EDX, 29)
+
+	FEAT_DEF(INVTSC, 0x80000007, 0, REG_EDX,  8)
+};
+
+/*
+ * Execute CPUID instruction and get contents of a specific register
+ *
+ * This function, when compiled with GCC, will generate architecture-neutral
+ * code, as per GCC manual.
+ */
+static inline void
+rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out)
+{
+#if defined(__i386__) && defined(__PIC__)
+    /* %ebx is a forbidden register if we compile with -fPIC or -fPIE */
+    asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0"
+		 : "=r" (out[REG_EBX]),
+		   "=a" (out[REG_EAX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+#else
+
+    asm volatile("cpuid"
+		 : "=a" (out[REG_EAX]),
+		   "=b" (out[REG_EBX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+
+#endif
+}
+
+/**
+ * Function for checking a CPU flag availability
+ *
+ * @param flag
+ *     CPU flag to query CPU for
+ * @return
+ *     1 if flag is available
+ *     0 if flag is not available
+ *     -ENOENT if flag is invalid
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs;
+
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	rte_cpu_get_features(feat->leaf & 0xffff0000, 0, regs);
+	if (((regs[REG_EAX] ^ feat->leaf) & 0xffff0000) ||
+	      regs[REG_EAX] < feat->leaf)
+		return 0;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+/**
+ * This function checks that the currently used CPU supports the CPU features
+ * that were specified at compile time. It is called automatically within the
+ * EAL, so does not need to be used by applications.
+ */
+void
+rte_cpu_check_supported(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_X86_64_H_ */
\ No newline at end of file
diff --git a/lib/librte_eal/common/include/rte_cpuflags.h b/lib/librte_eal/common/include/rte_cpuflags.h
deleted file mode 100644
index 5fa96db..0000000
--- a/lib/librte_eal/common/include/rte_cpuflags.h
+++ /dev/null
@@ -1,182 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_CPUFLAGS_H_
-#define _RTE_CPUFLAGS_H_
-
-/**
- * @file
- * Simple API to determine available CPU features at runtime.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-
-/**
- * Enumeration of all CPU features supported
- */
-enum rte_cpu_flag_t {
-	/* (EAX 01h) ECX features*/
-	RTE_CPUFLAG_SSE3 = 0,               /**< SSE3 */
-	RTE_CPUFLAG_PCLMULQDQ,              /**< PCLMULQDQ */
-	RTE_CPUFLAG_DTES64,                 /**< DTES64 */
-	RTE_CPUFLAG_MONITOR,                /**< MONITOR */
-	RTE_CPUFLAG_DS_CPL,                 /**< DS_CPL */
-	RTE_CPUFLAG_VMX,                    /**< VMX */
-	RTE_CPUFLAG_SMX,                    /**< SMX */
-	RTE_CPUFLAG_EIST,                   /**< EIST */
-	RTE_CPUFLAG_TM2,                    /**< TM2 */
-	RTE_CPUFLAG_SSSE3,                  /**< SSSE3 */
-	RTE_CPUFLAG_CNXT_ID,                /**< CNXT_ID */
-	RTE_CPUFLAG_FMA,                    /**< FMA */
-	RTE_CPUFLAG_CMPXCHG16B,             /**< CMPXCHG16B */
-	RTE_CPUFLAG_XTPR,                   /**< XTPR */
-	RTE_CPUFLAG_PDCM,                   /**< PDCM */
-	RTE_CPUFLAG_PCID,                   /**< PCID */
-	RTE_CPUFLAG_DCA,                    /**< DCA */
-	RTE_CPUFLAG_SSE4_1,                 /**< SSE4_1 */
-	RTE_CPUFLAG_SSE4_2,                 /**< SSE4_2 */
-	RTE_CPUFLAG_X2APIC,                 /**< X2APIC */
-	RTE_CPUFLAG_MOVBE,                  /**< MOVBE */
-	RTE_CPUFLAG_POPCNT,                 /**< POPCNT */
-	RTE_CPUFLAG_TSC_DEADLINE,           /**< TSC_DEADLINE */
-	RTE_CPUFLAG_AES,                    /**< AES */
-	RTE_CPUFLAG_XSAVE,                  /**< XSAVE */
-	RTE_CPUFLAG_OSXSAVE,                /**< OSXSAVE */
-	RTE_CPUFLAG_AVX,                    /**< AVX */
-	RTE_CPUFLAG_F16C,                   /**< F16C */
-	RTE_CPUFLAG_RDRAND,                 /**< RDRAND */
-
-	/* (EAX 01h) EDX features */
-	RTE_CPUFLAG_FPU,                    /**< FPU */
-	RTE_CPUFLAG_VME,                    /**< VME */
-	RTE_CPUFLAG_DE,                     /**< DE */
-	RTE_CPUFLAG_PSE,                    /**< PSE */
-	RTE_CPUFLAG_TSC,                    /**< TSC */
-	RTE_CPUFLAG_MSR,                    /**< MSR */
-	RTE_CPUFLAG_PAE,                    /**< PAE */
-	RTE_CPUFLAG_MCE,                    /**< MCE */
-	RTE_CPUFLAG_CX8,                    /**< CX8 */
-	RTE_CPUFLAG_APIC,                   /**< APIC */
-	RTE_CPUFLAG_SEP,                    /**< SEP */
-	RTE_CPUFLAG_MTRR,                   /**< MTRR */
-	RTE_CPUFLAG_PGE,                    /**< PGE */
-	RTE_CPUFLAG_MCA,                    /**< MCA */
-	RTE_CPUFLAG_CMOV,                   /**< CMOV */
-	RTE_CPUFLAG_PAT,                    /**< PAT */
-	RTE_CPUFLAG_PSE36,                  /**< PSE36 */
-	RTE_CPUFLAG_PSN,                    /**< PSN */
-	RTE_CPUFLAG_CLFSH,                  /**< CLFSH */
-	RTE_CPUFLAG_DS,                     /**< DS */
-	RTE_CPUFLAG_ACPI,                   /**< ACPI */
-	RTE_CPUFLAG_MMX,                    /**< MMX */
-	RTE_CPUFLAG_FXSR,                   /**< FXSR */
-	RTE_CPUFLAG_SSE,                    /**< SSE */
-	RTE_CPUFLAG_SSE2,                   /**< SSE2 */
-	RTE_CPUFLAG_SS,                     /**< SS */
-	RTE_CPUFLAG_HTT,                    /**< HTT */
-	RTE_CPUFLAG_TM,                     /**< TM */
-	RTE_CPUFLAG_PBE,                    /**< PBE */
-
-	/* (EAX 06h) EAX features */
-	RTE_CPUFLAG_DIGTEMP,                /**< DIGTEMP */
-	RTE_CPUFLAG_TRBOBST,                /**< TRBOBST */
-	RTE_CPUFLAG_ARAT,                   /**< ARAT */
-	RTE_CPUFLAG_PLN,                    /**< PLN */
-	RTE_CPUFLAG_ECMD,                   /**< ECMD */
-	RTE_CPUFLAG_PTM,                    /**< PTM */
-
-	/* (EAX 06h) ECX features */
-	RTE_CPUFLAG_MPERF_APERF_MSR,        /**< MPERF_APERF_MSR */
-	RTE_CPUFLAG_ACNT2,                  /**< ACNT2 */
-	RTE_CPUFLAG_ENERGY_EFF,             /**< ENERGY_EFF */
-
-	/* (EAX 07h, ECX 0h) EBX features */
-	RTE_CPUFLAG_FSGSBASE,               /**< FSGSBASE */
-	RTE_CPUFLAG_BMI1,                   /**< BMI1 */
-	RTE_CPUFLAG_HLE,                    /**< Hardware Lock elision */
-	RTE_CPUFLAG_AVX2,                   /**< AVX2 */
-	RTE_CPUFLAG_SMEP,                   /**< SMEP */
-	RTE_CPUFLAG_BMI2,                   /**< BMI2 */
-	RTE_CPUFLAG_ERMS,                   /**< ERMS */
-	RTE_CPUFLAG_INVPCID,                /**< INVPCID */
-	RTE_CPUFLAG_RTM,                    /**< Transactional memory */
-
-	/* (EAX 80000001h) ECX features */
-	RTE_CPUFLAG_LAHF_SAHF,              /**< LAHF_SAHF */
-	RTE_CPUFLAG_LZCNT,                  /**< LZCNT */
-
-	/* (EAX 80000001h) EDX features */
-	RTE_CPUFLAG_SYSCALL,                /**< SYSCALL */
-	RTE_CPUFLAG_XD,                     /**< XD */
-	RTE_CPUFLAG_1GB_PG,                 /**< 1GB_PG */
-	RTE_CPUFLAG_RDTSCP,                 /**< RDTSCP */
-	RTE_CPUFLAG_EM64T,                  /**< EM64T */
-
-	/* (EAX 80000007h) EDX features */
-	RTE_CPUFLAG_INVTSC,                 /**< INVTSC */
-
-	/* The last item */
-	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
-};
-
-
-/**
- * Function for checking a CPU flag availability
- *
- * @param flag
- *     CPU flag to query CPU for
- * @return
- *     1 if flag is available
- *     0 if flag is not available
- *     -ENOENT if flag is invalid
- */
-int
-rte_cpu_get_flag_enabled(enum rte_cpu_flag_t flag);
-
-/**
- * This function checks that the currently used CPU supports the CPU features
- * that were specified at compile time. It is called automatically within the
- * EAL, so does not need to be used by applications.
- */
-void
-rte_cpu_check_supported(void);
-
-#ifdef __cplusplus
-}
-#endif
-
-
-#endif /* _RTE_CPUFLAGS_H_ */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-10-16 10:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-16 10:44 [dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK Chao Zhu
2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 1/7] Split atomic operations to architecture specific Chao Zhu
2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 2/7] Split byte order " Chao Zhu
2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 3/7] Split CPU cycle operation " Chao Zhu
2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 4/7] Split prefetch operations " Chao Zhu
2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 5/7] Split spinlock " Chao Zhu
2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 6/7] Split memcpy operation " Chao Zhu
2014-10-16 10:44 ` [dpdk-dev] [PATCH v2 7/7] Split CPU flags operations " Chao Zhu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).