DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v3 00/10] split architecture specific operations
@ 2014-10-28 12:50 David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 01/10] eal: move rte_atomic.h header David Marchand
                   ` (11 more replies)
  0 siblings, 12 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

The set of patches split x86 architecture specific operations from DPDK and put
them to x86 arch directory.
This will make the adoption of DPDK much easier on other computer architecture.
For a new architecture, just add an architecture specific directory and
necessary building configuration files, then DPDK eal library can support it.


Reviewing patchset from Chao, I ended up modifying it along the way,
so here is a new iteration of this patchset.

Changes since Chao v2 patchset :

- added a preliminary patch for moving rte_atomic.h (for better readability)
- fixed documentation generation
- implemented a generic header for each arch specific header (cpuflags, memcpy,
  prefetch were missing)
- removed C++ stuff from generic headers
- centralised all doxygen stuff in generic headers (no need to have duplicates)
- refactored rte_cycles functions
- moved vmware tsc stuff to arch rte_cycles.h headers
- finished x86 factorisation


Little summary of current state :

- all applications continue to include the eal headers as before, these headers
  are the arch-specific ones
- the arch specific headers always include the generic ones. The generic headers
  contain the doxygen documentation and code common to all architectures
- a x86 architecture has been defined which handles both 32bits and 64bits
  peculiarities


It builds fine for 32/64 bits (w and w/o "force intrinsics"), but I really would
like a lot of eyes on this (and I would say, especially, rte_cycles, rte_memcpy
and rte_cpuflags).
I still have some concerns about the use of intrinsics for architecture != x86
but I think Chao will be the best to look at this.


-- 
David Marchand

Chao Zhu (7):
  eal: split atomic operations to architecture specific
  eal: split byte order operations to architecture specific
  eal: split CPU cycle operation to architecture specific
  eal: split prefetch operations to architecture specific
  eal: split spinlock operations to architecture specific
  eal: split memcpy operation to architecture specific
  eal: split CPU flags operations to architecture specific

David Marchand (3):
  eal: move rte_atomic.h header
  eal: install all arch headers
  eal: factorize x86 headers

 doc/api/doxy-api.conf                              |    1 +
 lib/librte_eal/common/Makefile                     |   24 +-
 lib/librte_eal/common/eal_common_cpuflags.c        |  190 ----
 .../common/include/arch/x86/rte_atomic.h           |  216 ++++
 .../common/include/arch/x86/rte_atomic_32.h        |  222 ++++
 .../common/include/arch/x86/rte_atomic_64.h        |  191 ++++
 .../common/include/arch/x86/rte_byteorder.h        |  121 +++
 .../common/include/arch/x86/rte_byteorder_32.h     |   51 +
 .../common/include/arch/x86/rte_byteorder_64.h     |   52 +
 .../common/include/arch/x86/rte_cpuflags.h         |  310 ++++++
 .../common/include/arch/x86/rte_cycles.h           |  121 +++
 .../common/include/arch/x86/rte_memcpy.h           |  297 +++++
 .../common/include/arch/x86/rte_prefetch.h         |   62 ++
 .../common/include/arch/x86/rte_spinlock.h         |   94 ++
 lib/librte_eal/common/include/generic/rte_atomic.h |  918 ++++++++++++++++
 .../common/include/generic/rte_byteorder.h         |  189 ++++
 .../common/include/generic/rte_cpuflags.h          |  110 ++
 lib/librte_eal/common/include/generic/rte_cycles.h |  209 ++++
 lib/librte_eal/common/include/generic/rte_memcpy.h |  144 +++
 .../common/include/generic/rte_prefetch.h          |   71 ++
 .../common/include/generic/rte_spinlock.h          |  226 ++++
 .../common/include/i686/arch/rte_atomic.h          |  373 -------
 lib/librte_eal/common/include/rte_atomic.h         | 1133 --------------------
 lib/librte_eal/common/include/rte_byteorder.h      |  270 -----
 lib/librte_eal/common/include/rte_cpuflags.h       |  182 ----
 lib/librte_eal/common/include/rte_cycles.h         |  266 -----
 lib/librte_eal/common/include/rte_memcpy.h         |  376 -------
 lib/librte_eal/common/include/rte_prefetch.h       |   88 --
 lib/librte_eal/common/include/rte_spinlock.h       |  258 -----
 .../common/include/x86_64/arch/rte_atomic.h        |  335 ------
 mk/arch/i686/rte.vars.mk                           |    2 +
 mk/arch/x86_64/rte.vars.mk                         |    2 +
 32 files changed, 3624 insertions(+), 3480 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic_32.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_byteorder_32.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_byteorder_64.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_byteorder.h
 delete mode 100644 lib/librte_eal/common/include/rte_cpuflags.h
 delete mode 100644 lib/librte_eal/common/include/rte_cycles.h
 delete mode 100644 lib/librte_eal/common/include/rte_memcpy.h
 delete mode 100644 lib/librte_eal/common/include/rte_prefetch.h
 delete mode 100644 lib/librte_eal/common/include/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic.h

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 01/10] eal: move rte_atomic.h header
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 02/10] eal: split atomic operations to architecture specific David Marchand
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

This breaks compilation but is a preliminary step for headers rework.

Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_atomic.h          |  373 ++++++++++++++++++++
 .../common/include/arch/x86_64/rte_atomic.h        |  335 ++++++++++++++++++
 .../common/include/i686/arch/rte_atomic.h          |  373 --------------------
 .../common/include/x86_64/arch/rte_atomic.h        |  335 ------------------
 5 files changed, 710 insertions(+), 710 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 7f27966..6603c61 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -49,8 +49,8 @@ endif
 ARCH_INC := rte_atomic.h
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
-SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/arch := \
-	$(addprefix include/$(RTE_ARCH)/arch/,$(ARCH_INC))
+SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
+	$(addprefix include/arch/$(RTE_ARCH)/,$(ARCH_INC))
 
 # add libc if configured
 DEPDIRS-$(CONFIG_RTE_LIBC) += lib/libc
diff --git a/lib/librte_eal/common/include/arch/i686/rte_atomic.h b/lib/librte_eal/common/include/arch/i686/rte_atomic.h
new file mode 100644
index 0000000..6956b87
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_atomic.h
@@ -0,0 +1,373 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Inspired from FreeBSD src/sys/i386/include/atomic.h
+ * Copyright (c) 1998 Doug Rabson
+ * All rights reserved.
+ */
+
+#ifndef _RTE_ATOMIC_H_
+#error "don't include this file directly, please include generic <rte_atomic.h>"
+#endif
+
+#ifndef _RTE_I686_ATOMIC_H_
+#define _RTE_I686_ATOMIC_H_
+
+
+/**
+ * @file
+ * Atomic Operations on i686
+ */
+
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+/**
+ * An atomic compare and set function used by the mutex functions.
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 64-bit words)
+ *
+ * @param dst
+ *   The destination into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	uint8_t res;
+	union {
+		struct {
+			uint32_t l32;
+			uint32_t h32;
+		};
+		uint64_t u64;
+	} _exp, _src;
+
+	_exp.u64 = exp;
+	_src.u64 = src;
+
+#ifndef __PIC__
+    asm volatile (
+            MPLOCKED
+            "cmpxchg8b (%[dst]);"
+            "setz %[res];"
+            : [res] "=a" (res)      /* result in eax */
+            : [dst] "S" (dst),      /* esi */
+             "b" (_src.l32),       /* ebx */
+             "c" (_src.h32),       /* ecx */
+             "a" (_exp.l32),       /* eax */
+             "d" (_exp.h32)        /* edx */
+			: "memory" );           /* no-clobber list */
+#else
+	asm volatile (
+            "mov %%ebx, %%edi\n"
+			MPLOCKED
+			"cmpxchg8b (%[dst]);"
+			"setz %[res];"
+            "xchgl %%ebx, %%edi;\n"
+			: [res] "=a" (res)      /* result in eax */
+			: [dst] "S" (dst),      /* esi */
+			  "D" (_src.l32),       /* ebx */
+			  "c" (_src.h32),       /* ecx */
+			  "a" (_exp.l32),       /* eax */
+			  "d" (_exp.h32)        /* edx */
+			: "memory" );           /* no-clobber list */
+#endif
+
+	return res;
+}
+
+/**
+ * The atomic counter structure.
+ */
+typedef struct {
+	volatile int64_t cnt;  /**< Internal counter value. */
+} rte_atomic64_t;
+
+/**
+ * Static initializer for an atomic counter.
+ */
+#define RTE_ATOMIC64_INIT(val) { (val) }
+
+/**
+ * Initialize the atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, 0);
+	}
+}
+
+/**
+ * Atomically read a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		/* replace the value by itself */
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp);
+	}
+	return tmp;
+}
+
+/**
+ * Atomically set a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value of the counter.
+ */
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, new_value);
+	}
+}
+
+/**
+ * Atomically add a 64-bit value to a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp + inc);
+	}
+}
+
+/**
+ * Atomically subtract a 64-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp - dec);
+	}
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	rte_atomic64_add(v, 1);
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	rte_atomic64_sub(v, 1);
+}
+
+/**
+ * Add a 64-bit value to an atomic counter and return the result.
+ *
+ * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
+ * returns the value of v after the addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp + inc);
+	}
+
+	return tmp + inc;
+}
+
+/**
+ * Subtract a 64-bit value from an atomic counter and return the result.
+ *
+ * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
+ * and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp - dec);
+	}
+
+	return tmp - dec;
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns
+ * true if the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the addition is 0; false otherwise.
+ */
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_add_return(v, 1) == 0;
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after subtraction is 0; false otherwise.
+ */
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_sub_return(v, 1) == 0;
+}
+
+/**
+ * Atomically test and set a 64-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	rte_atomic64_set(v, 0);
+}
+
+#endif /* _RTE_I686_ATOMIC_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h b/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
new file mode 100644
index 0000000..3ba7d3a
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
@@ -0,0 +1,335 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Inspired from FreeBSD src/sys/amd64/include/atomic.h
+ * Copyright (c) 1998 Doug Rabson
+ * All rights reserved.
+ */
+
+#ifndef _RTE_ATOMIC_H_
+#error "don't include this file directly, please include generic <rte_atomic.h>"
+#endif
+
+#ifndef _RTE_X86_64_ATOMIC_H_
+#define _RTE_X86_64_ATOMIC_H_
+
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+/**
+ * An atomic compare and set function used by the mutex functions.
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 64-bit words)
+ *
+ * @param dst
+ *   The destination into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgq %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+
+	return res;
+}
+
+/**
+ * The atomic counter structure.
+ */
+typedef struct {
+	volatile int64_t cnt;  /**< Internal counter value. */
+} rte_atomic64_t;
+
+/**
+ * Static initializer for an atomic counter.
+ */
+#define RTE_ATOMIC64_INIT(val) { (val) }
+
+/**
+ * Initialize the atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	v->cnt = 0;
+}
+
+/**
+ * Atomically read a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	return v->cnt;
+}
+
+/**
+ * Atomically set a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value of the counter.
+ */
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	v->cnt = new_value;
+}
+
+/**
+ * Atomically add a 64-bit value to a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	asm volatile(
+			MPLOCKED
+			"addq %[inc], %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: [inc] "ir" (inc),     /* input */
+			  "m" (v->cnt)
+			);
+}
+
+/**
+ * Atomically subtract a 64-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	asm volatile(
+			MPLOCKED
+			"subq %[dec], %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: [dec] "ir" (dec),     /* input */
+			  "m" (v->cnt)
+			);
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incq %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decq %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+/**
+ * Add a 64-bit value to an atomic counter and return the result.
+ *
+ * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
+ * returns the value of v after the addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	int64_t prev = inc;
+
+	asm volatile(
+			MPLOCKED
+			"xaddq %[prev], %[cnt]"
+			: [prev] "+r" (prev),   /* output */
+			  [cnt] "=m" (v->cnt)
+			: "m" (v->cnt)          /* input */
+			);
+	return prev + inc;
+}
+
+/**
+ * Subtract a 64-bit value from an atomic counter and return the result.
+ *
+ * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
+ * and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	return rte_atomic64_add_return(v, -dec);
+}
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns
+ * true if the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the addition is 0; false otherwise.
+ */
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incq %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt), /* output */
+			  [ret] "=qm" (ret)
+			);
+
+	return ret != 0;
+}
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after subtraction is 0; false otherwise.
+ */
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"decq %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return ret != 0;
+}
+
+/**
+ * Atomically test and set a 64-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	v->cnt = 0;
+}
+
+#endif /* _RTE_X86_64_ATOMIC_H_ */
diff --git a/lib/librte_eal/common/include/i686/arch/rte_atomic.h b/lib/librte_eal/common/include/i686/arch/rte_atomic.h
deleted file mode 100644
index 6956b87..0000000
--- a/lib/librte_eal/common/include/i686/arch/rte_atomic.h
+++ /dev/null
@@ -1,373 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-/*
- * Inspired from FreeBSD src/sys/i386/include/atomic.h
- * Copyright (c) 1998 Doug Rabson
- * All rights reserved.
- */
-
-#ifndef _RTE_ATOMIC_H_
-#error "don't include this file directly, please include generic <rte_atomic.h>"
-#endif
-
-#ifndef _RTE_I686_ATOMIC_H_
-#define _RTE_I686_ATOMIC_H_
-
-
-/**
- * @file
- * Atomic Operations on i686
- */
-
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
-	uint8_t res;
-	union {
-		struct {
-			uint32_t l32;
-			uint32_t h32;
-		};
-		uint64_t u64;
-	} _exp, _src;
-
-	_exp.u64 = exp;
-	_src.u64 = src;
-
-#ifndef __PIC__
-    asm volatile (
-            MPLOCKED
-            "cmpxchg8b (%[dst]);"
-            "setz %[res];"
-            : [res] "=a" (res)      /* result in eax */
-            : [dst] "S" (dst),      /* esi */
-             "b" (_src.l32),       /* ebx */
-             "c" (_src.h32),       /* ecx */
-             "a" (_exp.l32),       /* eax */
-             "d" (_exp.h32)        /* edx */
-			: "memory" );           /* no-clobber list */
-#else
-	asm volatile (
-            "mov %%ebx, %%edi\n"
-			MPLOCKED
-			"cmpxchg8b (%[dst]);"
-			"setz %[res];"
-            "xchgl %%ebx, %%edi;\n"
-			: [res] "=a" (res)      /* result in eax */
-			: [dst] "S" (dst),      /* esi */
-			  "D" (_src.l32),       /* ebx */
-			  "c" (_src.h32),       /* ecx */
-			  "a" (_exp.l32),       /* eax */
-			  "d" (_exp.h32)        /* edx */
-			: "memory" );           /* no-clobber list */
-#endif
-
-	return res;
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, 0);
-	}
-}
-
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		/* replace the value by itself */
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp);
-	}
-	return tmp;
-}
-
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, new_value);
-	}
-}
-
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp + inc);
-	}
-}
-
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp - dec);
-	}
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
-	rte_atomic64_add(v, 1);
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
-	rte_atomic64_sub(v, 1);
-}
-
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp + inc);
-	}
-
-	return tmp + inc;
-}
-
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp - dec);
-	}
-
-	return tmp - dec;
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_add_return(v, 1) == 0;
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
-	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
-	rte_atomic64_set(v, 0);
-}
-
-#endif /* _RTE_I686_ATOMIC_H_ */
diff --git a/lib/librte_eal/common/include/x86_64/arch/rte_atomic.h b/lib/librte_eal/common/include/x86_64/arch/rte_atomic.h
deleted file mode 100644
index 3ba7d3a..0000000
--- a/lib/librte_eal/common/include/x86_64/arch/rte_atomic.h
+++ /dev/null
@@ -1,335 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-/*
- * Inspired from FreeBSD src/sys/amd64/include/atomic.h
- * Copyright (c) 1998 Doug Rabson
- * All rights reserved.
- */
-
-#ifndef _RTE_ATOMIC_H_
-#error "don't include this file directly, please include generic <rte_atomic.h>"
-#endif
-
-#ifndef _RTE_X86_64_ATOMIC_H_
-#define _RTE_X86_64_ATOMIC_H_
-
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgq %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-
-	return res;
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
-	v->cnt = 0;
-}
-
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
-	return v->cnt;
-}
-
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
-	v->cnt = new_value;
-}
-
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
-	asm volatile(
-			MPLOCKED
-			"addq %[inc], %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: [inc] "ir" (inc),     /* input */
-			  "m" (v->cnt)
-			);
-}
-
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
-	asm volatile(
-			MPLOCKED
-			"subq %[dec], %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: [dec] "ir" (dec),     /* input */
-			  "m" (v->cnt)
-			);
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"incq %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"decq %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
-	int64_t prev = inc;
-
-	asm volatile(
-			MPLOCKED
-			"xaddq %[prev], %[cnt]"
-			: [prev] "+r" (prev),   /* output */
-			  [cnt] "=m" (v->cnt)
-			: "m" (v->cnt)          /* input */
-			);
-	return prev + inc;
-}
-
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
-	return rte_atomic64_add_return(v, -dec);
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incq %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt), /* output */
-			  [ret] "=qm" (ret)
-			);
-
-	return ret != 0;
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"decq %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return ret != 0;
-}
-
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
-	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
-	v->cnt = 0;
-}
-
-#endif /* _RTE_X86_64_ATOMIC_H_ */
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 02/10] eal: split atomic operations to architecture specific
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 01/10] eal: move rte_atomic.h header David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 03/10] eal: split byte order " David Marchand
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

From: Chao Zhu <bjzhuc@cn.ibm.com>

This patch first adds architecture specific directories to eal.
Then split the atomic operations to architecture specific and generic files.
Architecture specific files are put into the corresponding architecture
directory and common header are put into generic directory.

Update documentation generation with new generic/ directory.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 doc/api/doxy-api.conf                              |    1 +
 lib/librte_eal/common/Makefile                     |    7 +-
 .../common/include/arch/i686/rte_atomic.h          |  322 +++---
 .../common/include/arch/x86_64/rte_atomic.h        |  321 +++---
 lib/librte_eal/common/include/generic/rte_atomic.h |  918 ++++++++++++++++
 lib/librte_eal/common/include/rte_atomic.h         | 1133 --------------------
 6 files changed, 1269 insertions(+), 1433 deletions(-)
 create mode 100644 lib/librte_eal/common/include/generic/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_atomic.h

diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index fe3879f..27c782c 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -31,6 +31,7 @@
 PROJECT_NAME            = DPDK
 INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_eal/common/include \
+                          lib/librte_eal/common/include/generic \
                           lib/librte_acl \
                           lib/librte_distributor \
                           lib/librte_ether \
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 6603c61..8ab363b 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -31,7 +31,7 @@
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-INC := rte_atomic.h rte_branch_prediction.h rte_byteorder.h rte_common.h
+INC := rte_branch_prediction.h rte_byteorder.h rte_common.h
 INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
@@ -46,11 +46,14 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif
 
-ARCH_INC := rte_atomic.h
+GENERIC_INC := rte_atomic.h
+ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
 	$(addprefix include/arch/$(RTE_ARCH)/,$(ARCH_INC))
+SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/generic := \
+	$(addprefix include/generic/,$(GENERIC_INC))
 
 # add libc if configured
 DEPDIRS-$(CONFIG_RTE_LIBC) += lib/libc
diff --git a/lib/librte_eal/common/include/arch/i686/rte_atomic.h b/lib/librte_eal/common/include/arch/i686/rte_atomic.h
index 6956b87..8330250 100644
--- a/lib/librte_eal/common/include/arch/i686/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/i686/rte_atomic.h
@@ -37,37 +37,179 @@
  * All rights reserved.
  */
 
-#ifndef _RTE_ATOMIC_H_
-#error "don't include this file directly, please include generic <rte_atomic.h>"
+#ifndef _RTE_ATOMIC_I686_H_
+#define _RTE_ATOMIC_I686_H_
+
+#ifdef __cplusplus
+extern "C" {
 #endif
 
-#ifndef _RTE_I686_ATOMIC_H_
-#define _RTE_I686_ATOMIC_H_
+#include <emmintrin.h>
+#include "generic/rte_atomic.h"
 
+#if RTE_MAX_LCORE == 1
+#define MPLOCKED                        /**< No need to insert MP lock prefix. */
+#else
+#define MPLOCKED        "lock ; "       /**< Insert MP lock prefix. */
+#endif
 
-/**
- * @file
- * Atomic Operations on i686
- */
+#define	rte_mb() _mm_mfence()
+
+#define	rte_wmb() _mm_sfence()
+
+#define	rte_rmb() _mm_lfence()
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgw %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgl %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
 
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
 
 /*------------------------- 64 bit atomic operations -------------------------*/
 
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
 static inline int
 rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
 {
@@ -114,24 +256,6 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
 	return res;
 }
 
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
 static inline void
 rte_atomic64_init(rte_atomic64_t *v)
 {
@@ -145,14 +269,6 @@ rte_atomic64_init(rte_atomic64_t *v)
 	}
 }
 
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
 static inline int64_t
 rte_atomic64_read(rte_atomic64_t *v)
 {
@@ -168,14 +284,6 @@ rte_atomic64_read(rte_atomic64_t *v)
 	return tmp;
 }
 
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
 static inline void
 rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
 {
@@ -189,14 +297,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
 	}
 }
 
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
 static inline void
 rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
 {
@@ -210,14 +310,6 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
 	}
 }
 
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
 static inline void
 rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
 {
@@ -231,43 +323,18 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
 	}
 }
 
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
 static inline void
 rte_atomic64_inc(rte_atomic64_t *v)
 {
 	rte_atomic64_add(v, 1);
 }
 
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
 static inline void
 rte_atomic64_dec(rte_atomic64_t *v)
 {
 	rte_atomic64_sub(v, 1);
 }
 
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
 static inline int64_t
 rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
 {
@@ -283,19 +350,6 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
 	return tmp + inc;
 }
 
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
 static inline int64_t
 rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
 {
@@ -311,63 +365,29 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
 	return tmp - dec;
 }
 
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
 static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
 {
 	return rte_atomic64_add_return(v, 1) == 0;
 }
 
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
 static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
 {
 	return rte_atomic64_sub_return(v, 1) == 0;
 }
 
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
 static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
 {
 	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
 }
 
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
 static inline void rte_atomic64_clear(rte_atomic64_t *v)
 {
 	rte_atomic64_set(v, 0);
 }
+#endif
+
+#ifdef __cplusplus
+}
+#endif
 
-#endif /* _RTE_I686_ATOMIC_H_ */
+#endif /* _RTE_ATOMIC_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h b/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
index 3ba7d3a..9138328 100644
--- a/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
@@ -37,36 +37,185 @@
  * All rights reserved.
  */
 
-#ifndef _RTE_ATOMIC_H_
-#error "don't include this file directly, please include generic <rte_atomic.h>"
+#ifndef _RTE_ATOMIC_X86_64_H_
+#define _RTE_ATOMIC_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <emmintrin.h>
+#include "generic/rte_atomic.h"
+
+#if RTE_MAX_LCORE == 1
+#define MPLOCKED                        /**< No need to insert MP lock prefix. */
+#else
+#define MPLOCKED        "lock ; "       /**< Insert MP lock prefix. */
 #endif
 
-#ifndef _RTE_X86_64_ATOMIC_H_
-#define _RTE_X86_64_ATOMIC_H_
+#define	rte_mb() _mm_mfence()
+
+#define	rte_wmb() _mm_sfence()
+
+#define	rte_rmb() _mm_lfence()
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgw %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
 
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgl %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
 
 /*------------------------- 64 bit atomic operations -------------------------*/
 
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
 static inline int
 rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
 {
 	uint8_t res;
 
+
 	asm volatile(
 			MPLOCKED
 			"cmpxchgq %[src], %[dst];"
@@ -81,66 +230,24 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
 	return res;
 }
 
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
 static inline void
 rte_atomic64_init(rte_atomic64_t *v)
 {
 	v->cnt = 0;
 }
 
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
 static inline int64_t
 rte_atomic64_read(rte_atomic64_t *v)
 {
 	return v->cnt;
 }
 
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
 static inline void
 rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
 {
 	v->cnt = new_value;
 }
 
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
 static inline void
 rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
 {
@@ -153,14 +260,6 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
 			);
 }
 
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
 static inline void
 rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
 {
@@ -173,12 +272,6 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
 			);
 }
 
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
 static inline void
 rte_atomic64_inc(rte_atomic64_t *v)
 {
@@ -190,12 +283,6 @@ rte_atomic64_inc(rte_atomic64_t *v)
 			);
 }
 
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
 static inline void
 rte_atomic64_dec(rte_atomic64_t *v)
 {
@@ -207,19 +294,6 @@ rte_atomic64_dec(rte_atomic64_t *v)
 			);
 }
 
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
 static inline int64_t
 rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
 {
@@ -235,36 +309,12 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
 	return prev + inc;
 }
 
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
 static inline int64_t
 rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
 {
 	return rte_atomic64_add_return(v, -dec);
 }
 
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
 static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
 {
 	uint8_t ret;
@@ -280,17 +330,6 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
 	return ret != 0;
 }
 
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
 static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
 {
 	uint8_t ret;
@@ -305,31 +344,19 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
 	return ret != 0;
 }
 
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
 static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
 {
 	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
 }
 
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
 static inline void rte_atomic64_clear(rte_atomic64_t *v)
 {
 	v->cnt = 0;
 }
+#endif
+
+#ifdef __cplusplus
+}
+#endif
 
-#endif /* _RTE_X86_64_ATOMIC_H_ */
+#endif /* _RTE_ATOMIC_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_atomic.h b/lib/librte_eal/common/include/generic/rte_atomic.h
new file mode 100644
index 0000000..0160243
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_atomic.h
@@ -0,0 +1,918 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_H_
+#define _RTE_ATOMIC_H_
+
+/**
+ * @file
+ * Atomic Operations
+ *
+ * This file defines a generic API for atomic operations. 
+ */
+
+#include <stdint.h>
+
+#ifdef __DOXYGEN__
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ * This function is architecture dependent.
+ */
+static inline void rte_mb(void);
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ * This function is architecture dependent.
+ */
+static inline void rte_wmb(void);
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ * This function is architecture dependent.
+ */
+static inline void rte_rmb(void);
+
+#endif /* __DOXYGEN__ */
+
+/**
+ * Compiler barrier.
+ *
+ * Guarantees that operation reordering does not occur at compile time
+ * for operations directly before and after the barrier.
+ */
+#define	rte_compiler_barrier() do {		\
+	asm volatile ("" : : : "memory");	\
+} while(0)
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+/**
+ * Atomic compare and set.
+ *
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 16-bit words)
+ *
+ * @param dst
+ *   The destination location into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	return __sync_bool_compare_and_swap(dst, exp, src);
+}
+#endif
+
+/**
+ * The atomic counter structure.
+ */
+typedef struct {
+	volatile int16_t cnt; /**< An internal counter value. */
+} rte_atomic16_t;
+
+/**
+ * Static initializer for an atomic counter.
+ */
+#define RTE_ATOMIC16_INIT(val) { (val) }
+
+/**
+ * Initialize an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_init(rte_atomic16_t *v)
+{
+	v->cnt = 0;
+}
+
+/**
+ * Atomically read a 16-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int16_t
+rte_atomic16_read(const rte_atomic16_t *v)
+{
+	return v->cnt;
+}
+
+/**
+ * Atomically set a counter to a 16-bit value.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value for the counter.
+ */
+static inline void
+rte_atomic16_set(rte_atomic16_t *v, int16_t new_value)
+{
+	v->cnt = new_value;
+}
+
+/**
+ * Atomically add a 16-bit value to an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic16_add(rte_atomic16_t *v, int16_t inc)
+{
+	__sync_fetch_and_add(&v->cnt, inc);
+}
+
+/**
+ * Atomically subtract a 16-bit value from an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
+{
+	__sync_fetch_and_sub(&v->cnt, dec);
+}
+
+/**
+ * Atomically increment a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	rte_atomic16_add(v, 1);
+}
+#endif
+
+/**
+ * Atomically decrement a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	rte_atomic16_sub(v, 1);
+}
+#endif
+
+/**
+ * Atomically add a 16-bit value to a counter and return the result.
+ *
+ * Atomically adds the 16-bits value (inc) to the atomic counter (v) and
+ * returns the value of v after addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int16_t
+rte_atomic16_add_return(rte_atomic16_t *v, int16_t inc)
+{
+	return __sync_add_and_fetch(&v->cnt, inc);
+}
+
+/**
+ * Atomically subtract a 16-bit value from a counter and return
+ * the result.
+ *
+ * Atomically subtracts the 16-bit value (inc) from the atomic counter
+ * (v) and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int16_t
+rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
+{
+	return __sync_sub_and_fetch(&v->cnt, dec);
+}
+
+/**
+ * Atomically increment a 16-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the increment operation is 0; false otherwise.
+ */
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	return (__sync_add_and_fetch(&v->cnt, 1) == 0);
+}
+#endif
+
+/**
+ * Atomically decrement a 16-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the decrement operation is 0; false otherwise.
+ */
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	return (__sync_sub_and_fetch(&v->cnt, 1) == 0);
+}
+#endif
+
+/**
+ * Atomically test and set a 16-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+#endif
+
+/**
+ * Atomically set a 16-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic16_clear(rte_atomic16_t *v)
+{
+	v->cnt = 0;
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+/**
+ * Atomic compare and set.
+ *
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 32-bit words)
+ *
+ * @param dst
+ *   The destination location into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	return __sync_bool_compare_and_swap(dst, exp, src);
+}
+#endif
+
+/**
+ * The atomic counter structure.
+ */
+typedef struct {
+	volatile int32_t cnt; /**< An internal counter value. */
+} rte_atomic32_t;
+
+/**
+ * Static initializer for an atomic counter.
+ */
+#define RTE_ATOMIC32_INIT(val) { (val) }
+
+/**
+ * Initialize an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_init(rte_atomic32_t *v)
+{
+	v->cnt = 0;
+}
+
+/**
+ * Atomically read a 32-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int32_t
+rte_atomic32_read(const rte_atomic32_t *v)
+{
+	return v->cnt;
+}
+
+/**
+ * Atomically set a counter to a 32-bit value.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value for the counter.
+ */
+static inline void
+rte_atomic32_set(rte_atomic32_t *v, int32_t new_value)
+{
+	v->cnt = new_value;
+}
+
+/**
+ * Atomically add a 32-bit value to an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic32_add(rte_atomic32_t *v, int32_t inc)
+{
+	__sync_fetch_and_add(&v->cnt, inc);
+}
+
+/**
+ * Atomically subtract a 32-bit value from an atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
+{
+	__sync_fetch_and_sub(&v->cnt, dec);
+}
+
+/**
+ * Atomically increment a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	rte_atomic32_add(v, 1);
+}
+#endif
+
+/**
+ * Atomically decrement a counter by one.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	rte_atomic32_sub(v,1);
+}
+#endif
+
+/**
+ * Atomically add a 32-bit value to a counter and return the result.
+ *
+ * Atomically adds the 32-bits value (inc) to the atomic counter (v) and
+ * returns the value of v after addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int32_t
+rte_atomic32_add_return(rte_atomic32_t *v, int32_t inc)
+{
+	return __sync_add_and_fetch(&v->cnt, inc);
+}
+
+/**
+ * Atomically subtract a 32-bit value from a counter and return
+ * the result.
+ *
+ * Atomically subtracts the 32-bit value (inc) from the atomic counter
+ * (v) and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int32_t
+rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
+{
+	return __sync_sub_and_fetch(&v->cnt, dec);
+}
+
+/**
+ * Atomically increment a 32-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the increment operation is 0; false otherwise.
+ */
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	return (__sync_add_and_fetch(&v->cnt, 1) == 0);
+}
+#endif
+
+/**
+ * Atomically decrement a 32-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the decrement operation is 0; false otherwise.
+ */
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	return (__sync_sub_and_fetch(&v->cnt, 1) == 0);
+}
+#endif
+
+/**
+ * Atomically test and set a 32-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+#endif
+
+/**
+ * Atomically set a 32-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic32_clear(rte_atomic32_t *v)
+{
+	v->cnt = 0;
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+/**
+ * An atomic compare and set function used by the mutex functions.
+ * (atomic) equivalent to:
+ *   if (*dst == exp)
+ *     *dst = src (all 64-bit words)
+ *
+ * @param dst
+ *   The destination into which the value will be written.
+ * @param exp
+ *   The expected value.
+ * @param src
+ *   The new value.
+ * @return
+ *   Non-zero on success; 0 on failure.
+ */
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	return __sync_bool_compare_and_swap(dst, exp, src);
+}
+#endif
+
+/**
+ * The atomic counter structure.
+ */
+typedef struct {
+	volatile int64_t cnt;  /**< Internal counter value. */
+} rte_atomic64_t;
+
+/**
+ * Static initializer for an atomic counter.
+ */
+#define RTE_ATOMIC64_INIT(val) { (val) }
+
+/**
+ * Initialize the atomic counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_init(rte_atomic64_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+#ifdef __LP64__
+	v->cnt = 0;
+#else
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, 0);
+	}
+#endif
+}
+#endif
+
+/**
+ * Atomically read a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   The value of the counter.
+ */
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+#ifdef __LP64__
+	return v->cnt;
+#else
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		/* replace the value by itself */
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp);
+	}
+	return tmp;
+#endif
+}
+#endif
+
+/**
+ * Atomically set a 64-bit counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param new_value
+ *   The new value of the counter.
+ */
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+#ifdef __LP64__
+	v->cnt = new_value;
+#else
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, new_value);
+	}
+#endif
+}
+#endif
+
+/**
+ * Atomically add a 64-bit value to a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ */
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	__sync_fetch_and_add(&v->cnt, inc);
+}
+#endif
+
+/**
+ * Atomically subtract a 64-bit value from a counter.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ */
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	__sync_fetch_and_sub(&v->cnt, dec);
+}
+#endif
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	rte_atomic64_add(v, 1);
+}
+#endif
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	rte_atomic64_sub(v, 1);
+}
+#endif
+
+/**
+ * Add a 64-bit value to an atomic counter and return the result.
+ *
+ * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
+ * returns the value of v after the addition.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param inc
+ *   The value to be added to the counter.
+ * @return
+ *   The value of v after the addition.
+ */
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	return __sync_add_and_fetch(&v->cnt, inc);
+}
+#endif
+
+/**
+ * Subtract a 64-bit value from an atomic counter and return the result.
+ *
+ * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
+ * and returns the value of v after the subtraction.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @param dec
+ *   The value to be subtracted from the counter.
+ * @return
+ *   The value of v after the subtraction.
+ */
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	return __sync_sub_and_fetch(&v->cnt, dec);
+}
+#endif
+
+/**
+ * Atomically increment a 64-bit counter by one and test.
+ *
+ * Atomically increments the atomic counter (v) by one and returns
+ * true if the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after the addition is 0; false otherwise.
+ */
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_add_return(v, 1) == 0;
+}
+#endif
+
+/**
+ * Atomically decrement a 64-bit counter by one and test.
+ *
+ * Atomically decrements the atomic counter (v) by one and returns true if
+ * the result is 0, or false in all other cases.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   True if the result after subtraction is 0; false otherwise.
+ */
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_sub_return(v, 1) == 0;
+}
+#endif
+
+/**
+ * Atomically test and set a 64-bit atomic counter.
+ *
+ * If the counter value is already set, return 0 (failed). Otherwise, set
+ * the counter value to 1 and return 1 (success).
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ * @return
+ *   0 if failed; else 1, success.
+ */
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+#endif
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	rte_atomic64_set(v, 0);
+}
+#endif
+
+#endif /* _RTE_ATOMIC_H_ */
diff --git a/lib/librte_eal/common/include/rte_atomic.h b/lib/librte_eal/common/include/rte_atomic.h
deleted file mode 100644
index a5b6eec..0000000
--- a/lib/librte_eal/common/include/rte_atomic.h
+++ /dev/null
@@ -1,1133 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_ATOMIC_H_
-#define _RTE_ATOMIC_H_
-
-/**
- * @file
- * Atomic Operations
- *
- * This file defines a generic API for atomic
- * operations. The implementation is architecture-specific.
- *
- * See lib/librte_eal/common/include/i686/arch/rte_atomic.h
- * See lib/librte_eal/common/include/x86_64/arch/rte_atomic.h
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <stdint.h>
-
-#if RTE_MAX_LCORE == 1
-#define MPLOCKED                        /**< No need to insert MP lock prefix. */
-#else
-#define MPLOCKED        "lock ; "       /**< Insert MP lock prefix. */
-#endif
-
-/**
- * General memory barrier.
- *
- * Guarantees that the LOAD and STORE operations generated before the
- * barrier occur before the LOAD and STORE operations generated after.
- */
-#define	rte_mb() _mm_mfence()
-
-/**
- * Write memory barrier.
- *
- * Guarantees that the STORE operations generated before the barrier
- * occur before the STORE operations generated after.
- */
-#define	rte_wmb() _mm_sfence()
-
-/**
- * Read memory barrier.
- *
- * Guarantees that the LOAD operations generated before the barrier
- * occur before the LOAD operations generated after.
- */
-#define	rte_rmb() _mm_lfence()
-
-/**
- * Compiler barrier.
- *
- * Guarantees that operation reordering does not occur at compile time
- * for operations directly before and after the barrier.
- */
-#define	rte_compiler_barrier() do {		\
-	asm volatile ("" : : : "memory");	\
-} while(0)
-
-#include <emmintrin.h>
-
-/**
- * @file
- * Atomic Operations on x86_64
- */
-
-/*------------------------- 16 bit atomic operations -------------------------*/
-
-/**
- * Atomic compare and set.
- *
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 16-bit words)
- *
- * @param dst
- *   The destination location into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgw %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-	return res;
-#else
-	return __sync_bool_compare_and_swap(dst, exp, src);
-#endif
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int16_t cnt; /**< An internal counter value. */
-} rte_atomic16_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC16_INIT(val) { (val) }
-
-/**
- * Initialize an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic16_init(rte_atomic16_t *v)
-{
-	v->cnt = 0;
-}
-
-/**
- * Atomically read a 16-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int16_t
-rte_atomic16_read(const rte_atomic16_t *v)
-{
-	return v->cnt;
-}
-
-/**
- * Atomically set a counter to a 16-bit value.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value for the counter.
- */
-static inline void
-rte_atomic16_set(rte_atomic16_t *v, int16_t new_value)
-{
-	v->cnt = new_value;
-}
-
-/**
- * Atomically add a 16-bit value to an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic16_add(rte_atomic16_t *v, int16_t inc)
-{
-	__sync_fetch_and_add(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 16-bit value from an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
-{
-	__sync_fetch_and_sub(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a counter by one.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	asm volatile(
-			MPLOCKED
-			"incw %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-#else
-	rte_atomic16_add(v, 1);
-#endif
-}
-
-/**
- * Atomically decrement a counter by one.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	asm volatile(
-			MPLOCKED
-			"decw %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-#else
-	rte_atomic16_sub(v, 1);
-#endif
-}
-
-/**
- * Atomically add a 16-bit value to a counter and return the result.
- *
- * Atomically adds the 16-bits value (inc) to the atomic counter (v) and
- * returns the value of v after addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int16_t
-rte_atomic16_add_return(rte_atomic16_t *v, int16_t inc)
-{
-	return __sync_add_and_fetch(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 16-bit value from a counter and return
- * the result.
- *
- * Atomically subtracts the 16-bit value (inc) from the atomic counter
- * (v) and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int16_t
-rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
-{
-	return __sync_sub_and_fetch(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a 16-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the increment operation is 0; false otherwise.
- */
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incw %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-#else
-	return (__sync_add_and_fetch(&v->cnt, 1) == 0);
-#endif
-}
-
-/**
- * Atomically decrement a 16-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the decrement operation is 0; false otherwise.
- */
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t ret;
-
-	asm volatile(MPLOCKED
-			"decw %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-#else
-	return (__sync_sub_and_fetch(&v->cnt, 1) == 0);
-#endif
-}
-
-/**
- * Atomically test and set a 16-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
-	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 16-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic16_clear(rte_atomic16_t *v)
-{
-	v->cnt = 0;
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-/**
- * Atomic compare and set.
- *
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 32-bit words)
- *
- * @param dst
- *   The destination location into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgl %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-	return res;
-#else
-	return __sync_bool_compare_and_swap(dst, exp, src);
-#endif
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int32_t cnt; /**< An internal counter value. */
-} rte_atomic32_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC32_INIT(val) { (val) }
-
-/**
- * Initialize an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic32_init(rte_atomic32_t *v)
-{
-	v->cnt = 0;
-}
-
-/**
- * Atomically read a 32-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int32_t
-rte_atomic32_read(const rte_atomic32_t *v)
-{
-	return v->cnt;
-}
-
-/**
- * Atomically set a counter to a 32-bit value.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value for the counter.
- */
-static inline void
-rte_atomic32_set(rte_atomic32_t *v, int32_t new_value)
-{
-	v->cnt = new_value;
-}
-
-/**
- * Atomically add a 32-bit value to an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic32_add(rte_atomic32_t *v, int32_t inc)
-{
-	__sync_fetch_and_add(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 32-bit value from an atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
-{
-	__sync_fetch_and_sub(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a counter by one.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	asm volatile(
-			MPLOCKED
-			"incl %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-#else
-	rte_atomic32_add(v, 1);
-#endif
-}
-
-/**
- * Atomically decrement a counter by one.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	asm volatile(
-			MPLOCKED
-			"decl %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-#else
-	rte_atomic32_sub(v,1);
-#endif
-}
-
-/**
- * Atomically add a 32-bit value to a counter and return the result.
- *
- * Atomically adds the 32-bits value (inc) to the atomic counter (v) and
- * returns the value of v after addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int32_t
-rte_atomic32_add_return(rte_atomic32_t *v, int32_t inc)
-{
-	return __sync_add_and_fetch(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 32-bit value from a counter and return
- * the result.
- *
- * Atomically subtracts the 32-bit value (inc) from the atomic counter
- * (v) and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int32_t
-rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
-{
-	return __sync_sub_and_fetch(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a 32-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the increment operation is 0; false otherwise.
- */
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incl %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-#else
-	return (__sync_add_and_fetch(&v->cnt, 1) == 0);
-#endif
-}
-
-/**
- * Atomically decrement a 32-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the decrement operation is 0; false otherwise.
- */
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	uint8_t ret;
-
-	asm volatile(MPLOCKED
-			"decl %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-#else
-	return (__sync_sub_and_fetch(&v->cnt, 1) == 0);
-#endif
-}
-
-/**
- * Atomically test and set a 32-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
-	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 32-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic32_clear(rte_atomic32_t *v)
-{
-	v->cnt = 0;
-}
-
-#ifndef RTE_FORCE_INTRINSICS
-/* any other functions are in arch specific files */
-#include "arch/rte_atomic.h"
-
-
-#ifdef __DOXYGEN__
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_init(rte_atomic64_t *v);
-
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v);
-
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v);
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v);
-
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
-static inline int
-rte_atomic64_inc_and_test(rte_atomic64_t *v);
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
-static inline int
-rte_atomic64_dec_and_test(rte_atomic64_t *v);
-
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int
-rte_atomic64_test_and_set(rte_atomic64_t *v);
-
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_clear(rte_atomic64_t *v);
-
-#endif /* __DOXYGEN__ */
-
-#else /*RTE_FORCE_INTRINSICS */
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-/**
- * An atomic compare and set function used by the mutex functions.
- * (atomic) equivalent to:
- *   if (*dst == exp)
- *     *dst = src (all 64-bit words)
- *
- * @param dst
- *   The destination into which the value will be written.
- * @param exp
- *   The expected value.
- * @param src
- *   The new value.
- * @return
- *   Non-zero on success; 0 on failure.
- */
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
-	return __sync_bool_compare_and_swap(dst, exp, src);
-}
-
-/**
- * The atomic counter structure.
- */
-typedef struct {
-	volatile int64_t cnt;  /**< Internal counter value. */
-} rte_atomic64_t;
-
-/**
- * Static initializer for an atomic counter.
- */
-#define RTE_ATOMIC64_INIT(val) { (val) }
-
-/**
- * Initialize the atomic counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
-#ifdef __LP64__
-	v->cnt = 0;
-#else
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, 0);
-	}
-#endif
-}
-
-/**
- * Atomically read a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   The value of the counter.
- */
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
-#ifdef __LP64__
-	return v->cnt;
-#else
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		/* replace the value by itself */
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp);
-	}
-	return tmp;
-#endif
-}
-
-/**
- * Atomically set a 64-bit counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param new_value
- *   The new value of the counter.
- */
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
-#ifdef __LP64__
-	v->cnt = new_value;
-#else
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, new_value);
-	}
-#endif
-}
-
-/**
- * Atomically add a 64-bit value to a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- */
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
-	__sync_fetch_and_add(&v->cnt, inc);
-}
-
-/**
- * Atomically subtract a 64-bit value from a counter.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- */
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
-	__sync_fetch_and_sub(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
-	rte_atomic64_add(v, 1);
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
-	rte_atomic64_sub(v, 1);
-}
-
-/**
- * Add a 64-bit value to an atomic counter and return the result.
- *
- * Atomically adds the 64-bit value (inc) to the atomic counter (v) and
- * returns the value of v after the addition.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param inc
- *   The value to be added to the counter.
- * @return
- *   The value of v after the addition.
- */
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
-	return __sync_add_and_fetch(&v->cnt, inc);
-}
-
-/**
- * Subtract a 64-bit value from an atomic counter and return the result.
- *
- * Atomically subtracts the 64-bit value (dec) from the atomic counter (v)
- * and returns the value of v after the subtraction.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @param dec
- *   The value to be subtracted from the counter.
- * @return
- *   The value of v after the subtraction.
- */
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
-	return __sync_sub_and_fetch(&v->cnt, dec);
-}
-
-/**
- * Atomically increment a 64-bit counter by one and test.
- *
- * Atomically increments the atomic counter (v) by one and returns
- * true if the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after the addition is 0; false otherwise.
- */
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_add_return(v, 1) == 0;
-}
-
-/**
- * Atomically decrement a 64-bit counter by one and test.
- *
- * Atomically decrements the atomic counter (v) by one and returns true if
- * the result is 0, or false in all other cases.
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   True if the result after subtraction is 0; false otherwise.
- */
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-/**
- * Atomically test and set a 64-bit atomic counter.
- *
- * If the counter value is already set, return 0 (failed). Otherwise, set
- * the counter value to 1 and return 1 (success).
- *
- * @param v
- *   A pointer to the atomic counter.
- * @return
- *   0 if failed; else 1, success.
- */
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
-	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-/**
- * Atomically set a 64-bit counter to 0.
- *
- * @param v
- *   A pointer to the atomic counter.
- */
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
-	rte_atomic64_set(v, 0);
-}
-
-#endif /*RTE_FORCE_INTRINSICS */
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_ATOMIC_H_ */
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 03/10] eal: split byte order operations to architecture specific
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 01/10] eal: move rte_atomic.h header David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 02/10] eal: split atomic operations to architecture specific David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 04/10] eal: split CPU cycle operation " David Marchand
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

From: Chao Zhu <bjzhuc@cn.ibm.com>

This patch splits the byte order operations from DPDK and push them to
architecture specific arch directories, so that other processor
architecture to support DPDK can be easily adopted.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_byteorder.h       |  129 ++++++++++
 .../common/include/arch/x86_64/rte_byteorder.h     |  130 ++++++++++
 .../common/include/generic/rte_byteorder.h         |  189 ++++++++++++++
 lib/librte_eal/common/include/rte_byteorder.h      |  270 --------------------
 5 files changed, 450 insertions(+), 272 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_byteorder.h
 delete mode 100644 lib/librte_eal/common/include/rte_byteorder.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 8ab363b..62a39cd 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -31,7 +31,7 @@
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-INC := rte_branch_prediction.h rte_byteorder.h rte_common.h
+INC := rte_branch_prediction.h rte_common.h
 INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif
 
-GENERIC_INC := rte_atomic.h
+GENERIC_INC := rte_atomic.h rte_byteorder.h
 ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_byteorder.h b/lib/librte_eal/common/include/arch/i686/rte_byteorder.h
new file mode 100644
index 0000000..6d5b23e
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_byteorder.h
@@ -0,0 +1,129 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_I686_H_
+#define _RTE_BYTEORDER_I686_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	register uint16_t x = _x;
+	asm volatile ("xchgb %b[x1],%h[x2]"
+		      : [x1] "=Q" (x)
+		      : [x2] "0" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	register uint32_t x = _x;
+	asm volatile ("bswap %[x]"
+		      : [x] "+r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* Compat./Leg. mode */
+static inline uint64_t rte_arch_bswap64(uint64_t x)
+{
+	uint64_t ret = 0;
+	ret |= ((uint64_t)rte_arch_bswap32(x & 0xffffffffUL) << 32);
+	ret |= ((uint64_t)rte_arch_bswap32((x >> 32) & 0xffffffffUL));
+	return ret;
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h b/lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
new file mode 100644
index 0000000..825e576
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
@@ -0,0 +1,130 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_X86_64_H_
+#define _RTE_BYTEORDER_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	register uint16_t x = _x;
+	asm volatile ("xchgb %b[x1],%h[x2]"
+		      : [x1] "=Q" (x)
+		      : [x2] "0" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	register uint32_t x = _x;
+	asm volatile ("bswap %[x]"
+		      : [x] "+r" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+	register uint64_t x = _x;
+	asm volatile ("bswap %[x]"
+		      : [x] "+r" (x)
+		      );
+	return x;
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_byteorder.h b/lib/librte_eal/common/include/generic/rte_byteorder.h
new file mode 100644
index 0000000..9358136
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_byteorder.h
@@ -0,0 +1,189 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_H_
+#define _RTE_BYTEORDER_H_
+
+/**
+ * @file
+ *
+ * Byte Swap Operations
+ *
+ * This file defines a generic API for byte swap operations. Part of
+ * the implementation is architecture-specific.
+ */
+
+#include <stdint.h>
+
+/*
+ * An internal function to swap bytes in a 16-bit value.
+ *
+ * It is used by rte_bswap16() when the value is constant. Do not use
+ * this function directly; rte_bswap16() is preferred.
+ */
+static inline uint16_t
+rte_constant_bswap16(uint16_t x)
+{
+	return (uint16_t)(((x & 0x00ffU) << 8) |
+		((x & 0xff00U) >> 8));
+}
+
+/*
+ * An internal function to swap bytes in a 32-bit value.
+ *
+ * It is used by rte_bswap32() when the value is constant. Do not use
+ * this function directly; rte_bswap32() is preferred.
+ */
+static inline uint32_t
+rte_constant_bswap32(uint32_t x)
+{
+	return  ((x & 0x000000ffUL) << 24) |
+		((x & 0x0000ff00UL) << 8) |
+		((x & 0x00ff0000UL) >> 8) |
+		((x & 0xff000000UL) >> 24);
+}
+
+/*
+ * An internal function to swap bytes of a 64-bit value.
+ *
+ * It is used by rte_bswap64() when the value is constant. Do not use
+ * this function directly; rte_bswap64() is preferred.
+ */
+static inline uint64_t
+rte_constant_bswap64(uint64_t x)
+{
+	return  ((x & 0x00000000000000ffULL) << 56) |
+		((x & 0x000000000000ff00ULL) << 40) |
+		((x & 0x0000000000ff0000ULL) << 24) |
+		((x & 0x00000000ff000000ULL) <<  8) |
+		((x & 0x000000ff00000000ULL) >>  8) |
+		((x & 0x0000ff0000000000ULL) >> 24) |
+		((x & 0x00ff000000000000ULL) >> 40) |
+		((x & 0xff00000000000000ULL) >> 56);
+}
+
+
+#ifdef __DOXYGEN__
+
+/**
+ * Swap bytes in a 16-bit value.
+ */
+static uint16_t rte_bswap16(uint16_t _x);
+
+/**
+ * Swap bytes in a 32-bit value.
+ */
+static uint32_t rte_bswap32(uint32_t x);
+
+/**
+ * Swap bytes in a 64-bit value.
+ */
+static uint64_t rte_bswap64(uint64_t x);
+
+/**
+ * Convert a 16-bit value from CPU order to little endian.
+ */
+static uint16_t rte_cpu_to_le_16(uint16_t x);
+
+/**
+ * Convert a 32-bit value from CPU order to little endian.
+ */
+static uint32_t rte_cpu_to_le_32(uint32_t x);
+
+/**
+ * Convert a 64-bit value from CPU order to little endian.
+ */
+static uint64_t rte_cpu_to_le_64(uint64_t x);
+
+
+/**
+ * Convert a 16-bit value from CPU order to big endian.
+ */
+static uint16_t rte_cpu_to_be_16(uint16_t x);
+
+/**
+ * Convert a 32-bit value from CPU order to big endian.
+ */
+static uint32_t rte_cpu_to_be_32(uint32_t x);
+
+/**
+ * Convert a 64-bit value from CPU order to big endian.
+ */
+static uint64_t rte_cpu_to_be_64(uint64_t x);
+
+
+/**
+ * Convert a 16-bit value from little endian to CPU order.
+ */
+static uint16_t rte_le_to_cpu_16(uint16_t x);
+
+/**
+ * Convert a 32-bit value from little endian to CPU order.
+ */
+static uint32_t rte_le_to_cpu_32(uint32_t x);
+
+/**
+ * Convert a 64-bit value from little endian to CPU order.
+ */
+static uint64_t rte_le_to_cpu_64(uint64_t x);
+
+
+/**
+ * Convert a 16-bit value from big endian to CPU order.
+ */
+static uint16_t rte_be_to_cpu_16(uint16_t x);
+
+/**
+ * Convert a 32-bit value from big endian to CPU order.
+ */
+static uint32_t rte_be_to_cpu_32(uint32_t x);
+
+/**
+ * Convert a 64-bit value from big endian to CPU order.
+ */
+static uint64_t rte_be_to_cpu_64(uint64_t x);
+
+#endif /* __DOXYGEN__ */
+
+#ifdef RTE_FORCE_INTRINSICS
+#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8)
+#define rte_bswap16(x) __builtin_bswap16(x)
+#endif
+
+#define rte_bswap32(x) __builtin_bswap32(x)
+
+#define rte_bswap64(x) __builtin_bswap64(x)
+
+#endif
+
+#endif /* _RTE_BYTEORDER_H_ */
diff --git a/lib/librte_eal/common/include/rte_byteorder.h b/lib/librte_eal/common/include/rte_byteorder.h
deleted file mode 100644
index 30fbd56..0000000
--- a/lib/librte_eal/common/include/rte_byteorder.h
+++ /dev/null
@@ -1,270 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_BYTEORDER_H_
-#define _RTE_BYTEORDER_H_
-
-/**
- * @file
- *
- * Byte Swap Operations
- *
- * This file defines a generic API for byte swap operations. Part of
- * the implementation is architecture-specific.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <stdint.h>
-
-/*
- * An internal function to swap bytes in a 16-bit value.
- *
- * It is used by rte_bswap16() when the value is constant. Do not use
- * this function directly; rte_bswap16() is preferred.
- */
-static inline uint16_t
-rte_constant_bswap16(uint16_t x)
-{
-	return (uint16_t)(((x & 0x00ffU) << 8) |
-		((x & 0xff00U) >> 8));
-}
-
-/*
- * An internal function to swap bytes in a 32-bit value.
- *
- * It is used by rte_bswap32() when the value is constant. Do not use
- * this function directly; rte_bswap32() is preferred.
- */
-static inline uint32_t
-rte_constant_bswap32(uint32_t x)
-{
-	return  ((x & 0x000000ffUL) << 24) |
-		((x & 0x0000ff00UL) << 8) |
-		((x & 0x00ff0000UL) >> 8) |
-		((x & 0xff000000UL) >> 24);
-}
-
-/*
- * An internal function to swap bytes of a 64-bit value.
- *
- * It is used by rte_bswap64() when the value is constant. Do not use
- * this function directly; rte_bswap64() is preferred.
- */
-static inline uint64_t
-rte_constant_bswap64(uint64_t x)
-{
-	return  ((x & 0x00000000000000ffULL) << 56) |
-		((x & 0x000000000000ff00ULL) << 40) |
-		((x & 0x0000000000ff0000ULL) << 24) |
-		((x & 0x00000000ff000000ULL) <<  8) |
-		((x & 0x000000ff00000000ULL) >>  8) |
-		((x & 0x0000ff0000000000ULL) >> 24) |
-		((x & 0x00ff000000000000ULL) >> 40) |
-		((x & 0xff00000000000000ULL) >> 56);
-}
-
-/*
- * An architecture-optimized byte swap for a 16-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap16().
- */
-static inline uint16_t rte_arch_bswap16(uint16_t _x)
-{
-	register uint16_t x = _x;
-	asm volatile ("xchgb %b[x1],%h[x2]"
-		      : [x1] "=Q" (x)
-		      : [x2] "0" (x)
-		      );
-	return x;
-}
-
-/*
- * An architecture-optimized byte swap for a 32-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap32().
- */
-static inline uint32_t rte_arch_bswap32(uint32_t _x)
-{
-	register uint32_t x = _x;
-	asm volatile ("bswap %[x]"
-		      : [x] "+r" (x)
-		      );
-	return x;
-}
-
-/*
- * An architecture-optimized byte swap for a 64-bit value.
- *
-  * Do not use this function directly. The preferred function is rte_bswap64().
- */
-#ifdef RTE_ARCH_X86_64
-/* 64-bit mode */
-static inline uint64_t rte_arch_bswap64(uint64_t _x)
-{
-	register uint64_t x = _x;
-	asm volatile ("bswap %[x]"
-		      : [x] "+r" (x)
-		      );
-	return x;
-}
-#else /* ! RTE_ARCH_X86_64 */
-/* Compat./Leg. mode */
-static inline uint64_t rte_arch_bswap64(uint64_t x)
-{
-	uint64_t ret = 0;
-	ret |= ((uint64_t)rte_arch_bswap32(x & 0xffffffffUL) << 32);
-	ret |= ((uint64_t)rte_arch_bswap32((x >> 32) & 0xffffffffUL));
-	return ret;
-}
-#endif /* RTE_ARCH_X86_64 */
-
-
-#ifndef RTE_FORCE_INTRINSICS
-/**
- * Swap bytes in a 16-bit value.
- */
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap16(x) :		\
-				   rte_arch_bswap16(x)))
-
-/**
- * Swap bytes in a 32-bit value.
- */
-#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap32(x) :		\
-				   rte_arch_bswap32(x)))
-
-/**
- * Swap bytes in a 64-bit value.
- */
-#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap64(x) :		\
-				   rte_arch_bswap64(x)))
-
-#else
-
-/**
- * Swap bytes in a 16-bit value.
- * __builtin_bswap16 is only available gcc 4.8 and upwards
- */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8)
-#define rte_bswap16(x) __builtin_bswap16(x)
-#else
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap16(x) :		\
-				   rte_arch_bswap16(x)))
-#endif
-
-/**
- * Swap bytes in a 32-bit value.
- */
-#define rte_bswap32(x) __builtin_bswap32(x)
-
-/**
- * Swap bytes in a 64-bit value.
- */
-#define rte_bswap64(x) __builtin_bswap64(x)
-
-#endif
-
-/**
- * Convert a 16-bit value from CPU order to little endian.
- */
-#define rte_cpu_to_le_16(x) (x)
-
-/**
- * Convert a 32-bit value from CPU order to little endian.
- */
-#define rte_cpu_to_le_32(x) (x)
-
-/**
- * Convert a 64-bit value from CPU order to little endian.
- */
-#define rte_cpu_to_le_64(x) (x)
-
-
-/**
- * Convert a 16-bit value from CPU order to big endian.
- */
-#define rte_cpu_to_be_16(x) rte_bswap16(x)
-
-/**
- * Convert a 32-bit value from CPU order to big endian.
- */
-#define rte_cpu_to_be_32(x) rte_bswap32(x)
-
-/**
- * Convert a 64-bit value from CPU order to big endian.
- */
-#define rte_cpu_to_be_64(x) rte_bswap64(x)
-
-
-/**
- * Convert a 16-bit value from little endian to CPU order.
- */
-#define rte_le_to_cpu_16(x) (x)
-
-/**
- * Convert a 32-bit value from little endian to CPU order.
- */
-#define rte_le_to_cpu_32(x) (x)
-
-/**
- * Convert a 64-bit value from little endian to CPU order.
- */
-#define rte_le_to_cpu_64(x) (x)
-
-
-/**
- * Convert a 16-bit value from big endian to CPU order.
- */
-#define rte_be_to_cpu_16(x) rte_bswap16(x)
-
-/**
- * Convert a 32-bit value from big endian to CPU order.
- */
-#define rte_be_to_cpu_32(x) rte_bswap32(x)
-
-/**
- * Convert a 64-bit value from big endian to CPU order.
- */
-#define rte_be_to_cpu_64(x) rte_bswap64(x)
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_BYTEORDER_H_ */
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 04/10] eal: split CPU cycle operation to architecture specific
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
                   ` (2 preceding siblings ...)
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 03/10] eal: split byte order " David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 05/10] eal: split prefetch operations " David Marchand
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

From: Chao Zhu <bjzhuc@cn.ibm.com>

This patch splits the CPU TSC read operations from DPDK and push them to
architecture specific arch directories, so that other processors that
don't have tsc register can implement its own functions.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_cycles.h          |  121 +++++++++
 .../common/include/arch/x86_64/rte_cycles.h        |  121 +++++++++
 lib/librte_eal/common/include/generic/rte_cycles.h |  209 +++++++++++++++
 lib/librte_eal/common/include/rte_cycles.h         |  266 --------------------
 5 files changed, 453 insertions(+), 268 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cycles.h
 delete mode 100644 lib/librte_eal/common/include/rte_cycles.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 62a39cd..c6aedf9 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -32,7 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk
 
 INC := rte_branch_prediction.h rte_common.h
-INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
+INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
 INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif
 
-GENERIC_INC := rte_atomic.h rte_byteorder.h
+GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
 ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_cycles.h b/lib/librte_eal/common/include/arch/i686/rte_cycles.h
new file mode 100644
index 0000000..6e47040
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_cycles.h
@@ -0,0 +1,121 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+/*   BSD LICENSE
+ *
+ *   Copyright(c) 2013 6WIND.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_I686_H_
+#define _RTE_CYCLES_I686_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
+/* Global switch to use VMWARE mapping of TSC instead of RDTSC */
+extern int rte_cycles_vmware_tsc_map;
+#include <rte_branch_prediction.h>
+#endif
+
+static inline uint64_t
+rte_rdtsc(void)
+{
+	union {
+		uint64_t tsc_64;
+		struct {
+			uint32_t lo_32;
+			uint32_t hi_32;
+		};
+	} tsc;
+
+#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
+	if (unlikely(rte_cycles_vmware_tsc_map)) {
+		/* ecx = 0x10000 corresponds to the physical TSC for VMware */
+		asm volatile("rdpmc" :
+		             "=a" (tsc.lo_32),
+		             "=d" (tsc.hi_32) :
+		             "c"(0x10000));
+		return tsc.tsc_64;
+	}
+#endif
+
+	asm volatile("rdtsc" :
+		     "=a" (tsc.lo_32),
+		     "=d" (tsc.hi_32));
+	return tsc.tsc_64;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_cycles.h b/lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
new file mode 100644
index 0000000..6e3c7d8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
@@ -0,0 +1,121 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+/*   BSD LICENSE
+ *
+ *   Copyright(c) 2013 6WIND.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_X86_64_H_
+#define _RTE_CYCLES_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
+/* Global switch to use VMWARE mapping of TSC instead of RDTSC */
+extern int rte_cycles_vmware_tsc_map;
+#include <rte_branch_prediction.h>
+#endif
+
+static inline uint64_t
+rte_rdtsc(void)
+{
+	union {
+		uint64_t tsc_64;
+		struct {
+			uint32_t lo_32;
+			uint32_t hi_32;
+		};
+	} tsc;
+
+#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
+	if (unlikely(rte_cycles_vmware_tsc_map)) {
+		/* ecx = 0x10000 corresponds to the physical TSC for VMware */
+		asm volatile("rdpmc" :
+		             "=a" (tsc.lo_32),
+		             "=d" (tsc.hi_32) :
+		             "c"(0x10000));
+		return tsc.tsc_64;
+	}
+#endif
+
+	asm volatile("rdtsc" :
+		     "=a" (tsc.lo_32),
+		     "=d" (tsc.hi_32));
+	return tsc.tsc_64;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_cycles.h b/lib/librte_eal/common/include/generic/rte_cycles.h
new file mode 100644
index 0000000..c42b90b
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_cycles.h
@@ -0,0 +1,209 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+/*   BSD LICENSE
+ *
+ *   Copyright(c) 2013 6WIND.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_H_
+#define _RTE_CYCLES_H_
+
+/**
+ * @file
+ *
+ * Simple Time Reference Functions (Cycles and HPET).
+ */
+
+#include <stdint.h>
+#include <rte_debug.h>
+#include <rte_atomic.h>
+
+#define MS_PER_S 1000
+#define US_PER_S 1000000
+#define NS_PER_S 1000000000
+
+enum timer_source {
+	EAL_TIMER_TSC = 0,
+	EAL_TIMER_HPET
+};
+extern enum timer_source eal_timer_source;
+
+/**
+ * Get the measured frequency of the RDTSC counter
+ *
+ * @return
+ *   The TSC frequency for this lcore
+ */
+uint64_t
+rte_get_tsc_hz(void);
+
+/**
+ * Return the number of TSC cycles since boot
+ *
+  * @return
+ *   the number of cycles
+ */
+static inline uint64_t
+rte_get_tsc_cycles(void);
+
+#ifdef RTE_LIBEAL_USE_HPET
+/**
+ * Return the number of HPET cycles since boot
+ *
+ * This counter is global for all execution units. The number of
+ * cycles in one second can be retrieved using rte_get_hpet_hz().
+ *
+ * @return
+ *   the number of cycles
+ */
+uint64_t
+rte_get_hpet_cycles(void);
+
+/**
+ * Get the number of HPET cycles in one second.
+ *
+ * @return
+ *   The number of cycles in one second.
+ */
+uint64_t
+rte_get_hpet_hz(void);
+
+/**
+ * Initialise the HPET for use. This must be called before the rte_get_hpet_hz
+ * and rte_get_hpet_cycles APIs are called. If this function does not succeed,
+ * then the HPET functions are unavailable and should not be called.
+ *
+ * @param make_default
+ * 	If set, the hpet timer becomes the default timer whose values are
+ * 	returned by the rte_get_timer_hz/cycles API calls
+ *
+ * @return
+ * 	0 on success,
+ * 	-1 on error, and the make_default parameter is ignored.
+ */
+int rte_eal_hpet_init(int make_default);
+
+#endif
+
+/**
+ * Get the number of cycles since boot from the default timer.
+ *
+ * @return
+ *   The number of cycles
+ */
+static inline uint64_t
+rte_get_timer_cycles(void)
+{
+	switch(eal_timer_source) {
+	case EAL_TIMER_TSC:
+		return rte_get_tsc_cycles();
+	case EAL_TIMER_HPET:
+#ifdef RTE_LIBEAL_USE_HPET
+		return rte_get_hpet_cycles();
+#endif
+	default: rte_panic("Invalid timer source specified\n");
+	}
+}
+
+/**
+ * Get the number of cycles in one second for the default timer.
+ *
+ * @return
+ *   The number of cycles in one second.
+ */
+static inline uint64_t
+rte_get_timer_hz(void)
+{
+	switch(eal_timer_source) {
+	case EAL_TIMER_TSC:
+		return rte_get_tsc_hz();
+	case EAL_TIMER_HPET:
+#ifdef RTE_LIBEAL_USE_HPET
+		return rte_get_hpet_hz();
+#endif
+	default: rte_panic("Invalid timer source specified\n");
+	}
+}
+
+/**
+ * Wait at least us microseconds.
+ *
+ * @param us
+ *   The number of microseconds to wait.
+ */
+void
+rte_delay_us(unsigned us);
+
+/**
+ * Wait at least ms milliseconds.
+ *
+ * @param ms
+ *   The number of milliseconds to wait.
+ */
+static inline void
+rte_delay_ms(unsigned ms)
+{
+	rte_delay_us(ms * 1000);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_H_ */
diff --git a/lib/librte_eal/common/include/rte_cycles.h b/lib/librte_eal/common/include/rte_cycles.h
deleted file mode 100644
index 9b4dbe1..0000000
--- a/lib/librte_eal/common/include/rte_cycles.h
+++ /dev/null
@@ -1,266 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-/*   BSD LICENSE
- *
- *   Copyright(c) 2013 6WIND.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of 6WIND S.A. nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_CYCLES_H_
-#define _RTE_CYCLES_H_
-
-/**
- * @file
- *
- * Simple Time Reference Functions (Cycles and HPET).
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <stdint.h>
-#include <rte_debug.h>
-#include <rte_atomic.h>
-
-#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
-/** Global switch to use VMWARE mapping of TSC instead of RDTSC */
-extern int rte_cycles_vmware_tsc_map;
-#include <rte_branch_prediction.h>
-#endif
-
-#define MS_PER_S 1000
-#define US_PER_S 1000000
-#define NS_PER_S 1000000000
-
-enum timer_source {
-	EAL_TIMER_TSC = 0,
-	EAL_TIMER_HPET
-};
-extern enum timer_source eal_timer_source;
-
-/**
- * Read the TSC register.
- *
- * @return
- *   The TSC for this lcore.
- */
-static inline uint64_t
-rte_rdtsc(void)
-{
-	union {
-		uint64_t tsc_64;
-		struct {
-			uint32_t lo_32;
-			uint32_t hi_32;
-		};
-	} tsc;
-
-#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
-	if (unlikely(rte_cycles_vmware_tsc_map)) {
-		/* ecx = 0x10000 corresponds to the physical TSC for VMware */
-		asm volatile("rdpmc" :
-		             "=a" (tsc.lo_32),
-		             "=d" (tsc.hi_32) :
-		             "c"(0x10000));
-		return tsc.tsc_64;
-	}
-#endif
-
-	asm volatile("rdtsc" :
-		     "=a" (tsc.lo_32),
-		     "=d" (tsc.hi_32));
-	return tsc.tsc_64;
-}
-
-/**
- * Read the TSC register precisely where function is called.
- *
- * @return
- *   The TSC for this lcore.
- */
-static inline uint64_t
-rte_rdtsc_precise(void)
-{
-	rte_mb();
-	return rte_rdtsc();
-}
-
-/**
- * Get the measured frequency of the RDTSC counter
- *
- * @return
- *   The TSC frequency for this lcore
- */
-uint64_t
-rte_get_tsc_hz(void);
-
-/**
- * Return the number of TSC cycles since boot
- *
-  * @return
- *   the number of cycles
- */
-static inline uint64_t
-rte_get_tsc_cycles(void) { return rte_rdtsc(); }
-
-#ifdef RTE_LIBEAL_USE_HPET
-/**
- * Return the number of HPET cycles since boot
- *
- * This counter is global for all execution units. The number of
- * cycles in one second can be retrieved using rte_get_hpet_hz().
- *
- * @return
- *   the number of cycles
- */
-uint64_t
-rte_get_hpet_cycles(void);
-
-/**
- * Get the number of HPET cycles in one second.
- *
- * @return
- *   The number of cycles in one second.
- */
-uint64_t
-rte_get_hpet_hz(void);
-
-/**
- * Initialise the HPET for use. This must be called before the rte_get_hpet_hz
- * and rte_get_hpet_cycles APIs are called. If this function does not succeed,
- * then the HPET functions are unavailable and should not be called.
- *
- * @param make_default
- * 	If set, the hpet timer becomes the default timer whose values are
- * 	returned by the rte_get_timer_hz/cycles API calls
- *
- * @return
- * 	0 on success,
- * 	-1 on error, and the make_default parameter is ignored.
- */
-int rte_eal_hpet_init(int make_default);
-
-#endif
-
-/**
- * Get the number of cycles since boot from the default timer.
- *
- * @return
- *   The number of cycles
- */
-static inline uint64_t
-rte_get_timer_cycles(void)
-{
-	switch(eal_timer_source) {
-	case EAL_TIMER_TSC:
-		return rte_rdtsc();
-	case EAL_TIMER_HPET:
-#ifdef RTE_LIBEAL_USE_HPET
-		return rte_get_hpet_cycles();
-#endif
-	default: rte_panic("Invalid timer source specified\n");
-	}
-}
-
-/**
- * Get the number of cycles in one second for the default timer.
- *
- * @return
- *   The number of cycles in one second.
- */
-static inline uint64_t
-rte_get_timer_hz(void)
-{
-	switch(eal_timer_source) {
-	case EAL_TIMER_TSC:
-		return rte_get_tsc_hz();
-	case EAL_TIMER_HPET:
-#ifdef RTE_LIBEAL_USE_HPET
-		return rte_get_hpet_hz();
-#endif
-	default: rte_panic("Invalid timer source specified\n");
-	}
-}
-
-/**
- * Wait at least us microseconds.
- *
- * @param us
- *   The number of microseconds to wait.
- */
-void
-rte_delay_us(unsigned us);
-
-/**
- * Wait at least ms milliseconds.
- *
- * @param ms
- *   The number of milliseconds to wait.
- */
-static inline void
-rte_delay_ms(unsigned ms)
-{
-	rte_delay_us(ms * 1000);
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_CYCLES_H_ */
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 05/10] eal: split prefetch operations to architecture specific
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
                   ` (3 preceding siblings ...)
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 04/10] eal: split CPU cycle operation " David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 06/10] eal: split spinlock " David Marchand
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

From: Chao Zhu <bjzhuc@cn.ibm.com>

This patch splits the prefetch operations from DPDK and push them to
architecture specific arch directories, so that other processor
architecture to support DPDK can implement their own functions.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_prefetch.h        |   62 ++++++++++++++
 .../common/include/arch/x86_64/rte_prefetch.h      |   62 ++++++++++++++
 .../common/include/generic/rte_prefetch.h          |   71 ++++++++++++++++
 lib/librte_eal/common/include/rte_prefetch.h       |   88 --------------------
 5 files changed, 197 insertions(+), 90 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_prefetch.h
 delete mode 100644 lib/librte_eal/common/include/rte_prefetch.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index c6aedf9..9808c9f 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -34,7 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
-INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
+INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
 INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
 INC += rte_eal_memconfig.h rte_malloc_heap.h
@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif
 
-GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
+GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_prefetch.h b/lib/librte_eal/common/include/arch/i686/rte_prefetch.h
new file mode 100644
index 0000000..5fbd98e
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_prefetch.h
@@ -0,0 +1,62 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_I686_H_
+#define _RTE_PREFETCH_I686_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(volatile void *p)
+{
+	asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+static inline void rte_prefetch1(volatile void *p)
+{
+	asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+static inline void rte_prefetch2(volatile void *p)
+{
+	asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h b/lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
new file mode 100644
index 0000000..ec2454d
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
@@ -0,0 +1,62 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_X86_64_H_
+#define _RTE_PREFETCH_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(volatile void *p)
+{
+	asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+static inline void rte_prefetch1(volatile void *p)
+{
+	asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+static inline void rte_prefetch2(volatile void *p)
+{
+	asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_prefetch.h b/lib/librte_eal/common/include/generic/rte_prefetch.h
new file mode 100644
index 0000000..217f319
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_prefetch.h
@@ -0,0 +1,71 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_H_
+#define _RTE_PREFETCH_H_
+
+/**
+ * @file
+ *
+ * Prefetch operations.
+ *
+ * This file defines an API for prefetch macros / inline-functions,
+ * which are architecture-dependent. Prefetching occurs when a
+ * processor requests an instruction or data from memory to cache
+ * before it is actually needed, potentially speeding up the execution of the
+ * program.
+ */
+
+/**
+ * Prefetch a cache line into all cache levels.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch0(volatile void *p);
+
+/**
+ * Prefetch a cache line into all cache levels except the 0th cache level.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch1(volatile void *p);
+
+/**
+ * Prefetch a cache line into all cache levels except the 0th and 1th cache
+ * levels.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch2(volatile void *p);
+
+#endif /* _RTE_PREFETCH_H_ */
diff --git a/lib/librte_eal/common/include/rte_prefetch.h b/lib/librte_eal/common/include/rte_prefetch.h
deleted file mode 100644
index 8a691ef..0000000
--- a/lib/librte_eal/common/include/rte_prefetch.h
+++ /dev/null
@@ -1,88 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_PREFETCH_H_
-#define _RTE_PREFETCH_H_
-
-/**
- * @file
- *
- * Prefetch operations.
- *
- * This file defines an API for prefetch macros / inline-functions,
- * which are architecture-dependent. Prefetching occurs when a
- * processor requests an instruction or data from memory to cache
- * before it is actually needed, potentially speeding up the execution of the
- * program.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-/**
- * Prefetch a cache line into all cache levels.
- * @param p
- *   Address to prefetch
- */
-static inline void rte_prefetch0(volatile void *p)
-{
-	asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-/**
- * Prefetch a cache line into all cache levels except the 0th cache level.
- * @param p
- *   Address to prefetch
- */
-static inline void rte_prefetch1(volatile void *p)
-{
-	asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-/**
- * Prefetch a cache line into all cache levels except the 0th and 1th cache
- * levels.
- * @param p
- *   Address to prefetch
- */
-static inline void rte_prefetch2(volatile void *p)
-{
-	asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_PREFETCH_H_ */
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 06/10] eal: split spinlock operations to architecture specific
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
                   ` (4 preceding siblings ...)
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 05/10] eal: split prefetch operations " David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 07/10] eal: split memcpy operation " David Marchand
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

From: Chao Zhu <bjzhuc@cn.ibm.com>

This patch splits the spinlock operations from DPDK and push them to
architecture specific arch directories, so that other processor
architecture to support DPDK can be easily adopted.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/common/Makefile                     |    3 +-
 .../common/include/arch/i686/rte_spinlock.h        |   94 +++++++
 .../common/include/arch/x86_64/rte_spinlock.h      |   94 +++++++
 .../common/include/generic/rte_spinlock.h          |  226 +++++++++++++++++
 lib/librte_eal/common/include/rte_spinlock.h       |  258 --------------------
 5 files changed, 416 insertions(+), 259 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/rte_spinlock.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 9808c9f..2394443 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -35,7 +35,7 @@ INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
-INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
+INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
 INC += rte_eal_memconfig.h rte_malloc_heap.h
 INC += rte_hexdump.h rte_devargs.h rte_dev.h
@@ -47,6 +47,7 @@ INC += rte_warnings.h
 endif
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
+GENERIC_INC += rte_spinlock.h
 ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_spinlock.h b/lib/librte_eal/common/include/arch/i686/rte_spinlock.h
new file mode 100644
index 0000000..60cfd4d
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_spinlock.h
@@ -0,0 +1,94 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_I686_H_
+#define _RTE_SPINLOCK_I686_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_spinlock.h"
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	int lock_val = 1;
+	asm volatile (
+			"1:\n"
+			"xchg %[locked], %[lv]\n"
+			"test %[lv], %[lv]\n"
+			"jz 3f\n"
+			"2:\n"
+			"pause\n"
+			"cmpl $0, %[locked]\n"
+			"jnz 2b\n"
+			"jmp 1b\n"
+			"3:\n"
+			: [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
+			: "[lv]" (lock_val)
+			: "memory");
+}
+
+static inline void
+rte_spinlock_unlock (rte_spinlock_t *sl)
+{
+	int unlock_val = 0;
+	asm volatile (
+			"xchg %[locked], %[ulv]\n"
+			: [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
+			: "[ulv]" (unlock_val)
+			: "memory");
+}
+
+static inline int
+rte_spinlock_trylock (rte_spinlock_t *sl)
+{
+	int lockval = 1;
+
+	asm volatile (
+			"xchg %[locked], %[lockval]"
+			: [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
+			: "[lockval]" (lockval)
+			: "memory");
+
+	return (lockval == 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h b/lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
new file mode 100644
index 0000000..54fba95
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
@@ -0,0 +1,94 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_X86_64_H_
+#define _RTE_SPINLOCK_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_spinlock.h"
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	int lock_val = 1;
+	asm volatile (
+			"1:\n"
+			"xchg %[locked], %[lv]\n"
+			"test %[lv], %[lv]\n"
+			"jz 3f\n"
+			"2:\n"
+			"pause\n"
+			"cmpl $0, %[locked]\n"
+			"jnz 2b\n"
+			"jmp 1b\n"
+			"3:\n"
+			: [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
+			: "[lv]" (lock_val)
+			: "memory");
+}
+
+static inline void
+rte_spinlock_unlock (rte_spinlock_t *sl)
+{
+	int unlock_val = 0;
+	asm volatile (
+			"xchg %[locked], %[ulv]\n"
+			: [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
+			: "[ulv]" (unlock_val)
+			: "memory");
+}
+
+static inline int
+rte_spinlock_trylock (rte_spinlock_t *sl)
+{
+	int lockval = 1;
+
+	asm volatile (
+			"xchg %[locked], %[lockval]"
+			: [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
+			: "[lockval]" (lockval)
+			: "memory");
+
+	return (lockval == 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h b/lib/librte_eal/common/include/generic/rte_spinlock.h
new file mode 100644
index 0000000..dea885c
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -0,0 +1,226 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_H_
+#define _RTE_SPINLOCK_H_
+
+/**
+ * @file
+ *
+ * RTE Spinlocks
+ *
+ * This file defines an API for read-write locks, which are implemented
+ * in an architecture-specific way. This kind of lock simply waits in
+ * a loop repeatedly checking until the lock becomes available.
+ *
+ * All locks must be initialised before use, and only initialised once.
+ *
+ */
+
+#include <rte_lcore.h>
+#ifdef RTE_FORCE_INTRINSICS
+#include <rte_common.h>
+#endif
+
+/**
+ * The rte_spinlock_t type.
+ */
+typedef struct {
+	volatile int locked; /**< lock status 0 = unlocked, 1 = locked */
+} rte_spinlock_t;
+
+/**
+ * A static spinlock initializer.
+ */
+#define RTE_SPINLOCK_INITIALIZER { 0 }
+
+/**
+ * Initialize the spinlock to an unlocked state.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_init(rte_spinlock_t *sl)
+{
+	sl->locked = 0;
+}
+
+/**
+ * Take the spinlock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	while (__sync_lock_test_and_set(&sl->locked, 1))
+		while(sl->locked)
+			rte_pause();
+}
+#endif
+
+/**
+ * Release the spinlock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_unlock (rte_spinlock_t *sl);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline void
+rte_spinlock_unlock (rte_spinlock_t *sl)
+{
+	__sync_lock_release(&sl->locked);
+}
+#endif
+
+/**
+ * Try to take the lock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ * @return
+ *   1 if the lock is successfully taken; 0 otherwise.
+ */
+static inline int
+rte_spinlock_trylock (rte_spinlock_t *sl);
+
+#ifdef RTE_FORCE_INTRINSICS
+static inline int
+rte_spinlock_trylock (rte_spinlock_t *sl)
+{
+	return (__sync_lock_test_and_set(&sl->locked,1) == 0);
+}
+#endif
+
+/**
+ * Test if the lock is taken.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ * @return
+ *   1 if the lock is currently taken; 0 otherwise.
+ */
+static inline int rte_spinlock_is_locked (rte_spinlock_t *sl)
+{
+	return sl->locked;
+}
+
+/**
+ * The rte_spinlock_recursive_t type.
+ */
+typedef struct {
+	rte_spinlock_t sl; /**< the actual spinlock */
+	volatile int user; /**< core id using lock, -1 for unused */
+	volatile int count; /**< count of time this lock has been called */
+} rte_spinlock_recursive_t;
+
+/**
+ * A static recursive spinlock initializer.
+ */
+#define RTE_SPINLOCK_RECURSIVE_INITIALIZER {RTE_SPINLOCK_INITIALIZER, -1, 0}
+
+/**
+ * Initialize the recursive spinlock to an unlocked state.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ */
+static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
+{
+	rte_spinlock_init(&slr->sl);
+	slr->user = -1;
+	slr->count = 0;
+}
+
+/**
+ * Take the recursive spinlock.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ */
+static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
+{
+	int id = rte_lcore_id();
+
+	if (slr->user != id) {
+		rte_spinlock_lock(&slr->sl);
+		slr->user = id;
+	}
+	slr->count++;
+}
+/**
+ * Release the recursive spinlock.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ */
+static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
+{
+	if (--(slr->count) == 0) {
+		slr->user = -1;
+		rte_spinlock_unlock(&slr->sl);
+	}
+
+}
+
+/**
+ * Try to take the recursive lock.
+ *
+ * @param slr
+ *   A pointer to the recursive spinlock.
+ * @return
+ *   1 if the lock is successfully taken; 0 otherwise.
+ */
+static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
+{
+	int id = rte_lcore_id();
+
+	if (slr->user != id) {
+		if (rte_spinlock_trylock(&slr->sl) == 0)
+			return 0;
+		slr->user = id;
+	}
+	slr->count++;
+	return 1;
+}
+
+#endif /* _RTE_SPINLOCK_H_ */
diff --git a/lib/librte_eal/common/include/rte_spinlock.h b/lib/librte_eal/common/include/rte_spinlock.h
deleted file mode 100644
index 661908d..0000000
--- a/lib/librte_eal/common/include/rte_spinlock.h
+++ /dev/null
@@ -1,258 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_SPINLOCK_H_
-#define _RTE_SPINLOCK_H_
-
-/**
- * @file
- *
- * RTE Spinlocks
- *
- * This file defines an API for read-write locks, which are implemented
- * in an architecture-specific way. This kind of lock simply waits in
- * a loop repeatedly checking until the lock becomes available.
- *
- * All locks must be initialised before use, and only initialised once.
- *
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <rte_lcore.h>
-#ifdef RTE_FORCE_INTRINSICS
-#include <rte_common.h>
-#endif
-
-/**
- * The rte_spinlock_t type.
- */
-typedef struct {
-	volatile int locked; /**< lock status 0 = unlocked, 1 = locked */
-} rte_spinlock_t;
-
-/**
- * A static spinlock initializer.
- */
-#define RTE_SPINLOCK_INITIALIZER { 0 }
-
-/**
- * Initialize the spinlock to an unlocked state.
- *
- * @param sl
- *   A pointer to the spinlock.
- */
-static inline void
-rte_spinlock_init(rte_spinlock_t *sl)
-{
-	sl->locked = 0;
-}
-
-/**
- * Take the spinlock.
- *
- * @param sl
- *   A pointer to the spinlock.
- */
-static inline void
-rte_spinlock_lock(rte_spinlock_t *sl)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	int lock_val = 1;
-	asm volatile (
-			"1:\n"
-			"xchg %[locked], %[lv]\n"
-			"test %[lv], %[lv]\n"
-			"jz 3f\n"
-			"2:\n"
-			"pause\n"
-			"cmpl $0, %[locked]\n"
-			"jnz 2b\n"
-			"jmp 1b\n"
-			"3:\n"
-			: [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
-			: "[lv]" (lock_val)
-			: "memory");
-#else
-	while (__sync_lock_test_and_set(&sl->locked, 1))
-		while(sl->locked)
-			rte_pause();
-#endif
-}
-
-/**
- * Release the spinlock.
- *
- * @param sl
- *   A pointer to the spinlock.
- */
-static inline void
-rte_spinlock_unlock (rte_spinlock_t *sl)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	int unlock_val = 0;
-	asm volatile (
-			"xchg %[locked], %[ulv]\n"
-			: [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
-			: "[ulv]" (unlock_val)
-			: "memory");
-#else
-	__sync_lock_release(&sl->locked);
-#endif
-}
-
-/**
- * Try to take the lock.
- *
- * @param sl
- *   A pointer to the spinlock.
- * @return
- *   1 if the lock is successfully taken; 0 otherwise.
- */
-static inline int
-rte_spinlock_trylock (rte_spinlock_t *sl)
-{
-#ifndef RTE_FORCE_INTRINSICS
-	int lockval = 1;
-
-	asm volatile (
-			"xchg %[locked], %[lockval]"
-			: [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
-			: "[lockval]" (lockval)
-			: "memory");
-
-	return (lockval == 0);
-#else
-	return (__sync_lock_test_and_set(&sl->locked,1) == 0);
-#endif
-}
-
-/**
- * Test if the lock is taken.
- *
- * @param sl
- *   A pointer to the spinlock.
- * @return
- *   1 if the lock is currently taken; 0 otherwise.
- */
-static inline int rte_spinlock_is_locked (rte_spinlock_t *sl)
-{
-	return sl->locked;
-}
-
-/**
- * The rte_spinlock_recursive_t type.
- */
-typedef struct {
-	rte_spinlock_t sl; /**< the actual spinlock */
-	volatile int user; /**< core id using lock, -1 for unused */
-	volatile int count; /**< count of time this lock has been called */
-} rte_spinlock_recursive_t;
-
-/**
- * A static recursive spinlock initializer.
- */
-#define RTE_SPINLOCK_RECURSIVE_INITIALIZER {RTE_SPINLOCK_INITIALIZER, -1, 0}
-
-/**
- * Initialize the recursive spinlock to an unlocked state.
- *
- * @param slr
- *   A pointer to the recursive spinlock.
- */
-static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
-{
-	rte_spinlock_init(&slr->sl);
-	slr->user = -1;
-	slr->count = 0;
-}
-
-/**
- * Take the recursive spinlock.
- *
- * @param slr
- *   A pointer to the recursive spinlock.
- */
-static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
-{
-	int id = rte_lcore_id();
-
-	if (slr->user != id) {
-		rte_spinlock_lock(&slr->sl);
-		slr->user = id;
-	}
-	slr->count++;
-}
-/**
- * Release the recursive spinlock.
- *
- * @param slr
- *   A pointer to the recursive spinlock.
- */
-static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
-{
-	if (--(slr->count) == 0) {
-		slr->user = -1;
-		rte_spinlock_unlock(&slr->sl);
-	}
-
-}
-
-/**
- * Try to take the recursive lock.
- *
- * @param slr
- *   A pointer to the recursive spinlock.
- * @return
- *   1 if the lock is successfully taken; 0 otherwise.
- */
-static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
-{
-	int id = rte_lcore_id();
-
-	if (slr->user != id) {
-		if (rte_spinlock_trylock(&slr->sl) == 0)
-			return 0;
-		slr->user = id;
-	}
-	slr->count++;
-	return 1;
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_SPINLOCK_H_ */
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 07/10] eal: split memcpy operation to architecture specific
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
                   ` (5 preceding siblings ...)
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 06/10] eal: split spinlock " David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 08/10] eal: split CPU flags operations " David Marchand
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

From: Chao Zhu <bjzhuc@cn.ibm.com>

This patch splits the SSE based memory copy function from DPDK and push
them to architecture specific arch directories. Other processor
architecture can implement its own vector based memory copy functions.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 .../common/include/arch/i686/rte_memcpy.h          |  297 ++++++++++++++++
 .../common/include/arch/x86_64/rte_memcpy.h        |  297 ++++++++++++++++
 lib/librte_eal/common/include/generic/rte_memcpy.h |  144 ++++++++
 lib/librte_eal/common/include/rte_memcpy.h         |  376 --------------------
 5 files changed, 740 insertions(+), 378 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_memcpy.h
 delete mode 100644 lib/librte_eal/common/include/rte_memcpy.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 2394443..f9f98dc 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -33,7 +33,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
-INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
+INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
 INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
@@ -47,7 +47,7 @@ INC += rte_warnings.h
 endif
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
-GENERIC_INC += rte_spinlock.h
+GENERIC_INC += rte_spinlock.h rte_memcpy.h
 ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_memcpy.h b/lib/librte_eal/common/include/arch/i686/rte_memcpy.h
new file mode 100644
index 0000000..b8513f6
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_memcpy.h
@@ -0,0 +1,297 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_I686_H_
+#define _RTE_MEMCPY_I686_H_
+
+#include <stdint.h>
+#include <string.h>
+#include <emmintrin.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+#ifdef __INTEL_COMPILER
+#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
+#endif
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		: [reg_a] "=x" (reg_a)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu 64(%[src]), %[reg_e]\n\t"
+		"movdqu 80(%[src]), %[reg_f]\n\t"
+		"movdqu 96(%[src]), %[reg_g]\n\t"
+		"movdqu 112(%[src]), %[reg_h]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		"movdqu %[reg_e], 64(%[dst])\n\t"
+		"movdqu %[reg_f], 80(%[dst])\n\t"
+		"movdqu %[reg_g], 96(%[dst])\n\t"
+		"movdqu %[reg_h], 112(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d),
+		  [reg_e] "=x" (reg_e),
+		  [reg_f] "=x" (reg_f),
+		  [reg_g] "=x" (reg_g),
+		  [reg_h] "=x" (reg_h)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+#ifdef __INTEL_COMPILER
+#pragma warning(enable:593)
+#endif
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	rte_mov128(dst, src);
+	rte_mov128(dst + 128, src + 128);
+}
+
+#define rte_memcpy(dst, src, n)              \
+	((__builtin_constant_p(n)) ?          \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)))
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			*(uint64_t *)dst = *(const uint64_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0) {
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+	}
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h b/lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
new file mode 100644
index 0000000..290c5cd
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
@@ -0,0 +1,297 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_X86_64_H_
+#define _RTE_MEMCPY_X86_64_H_
+
+#include <stdint.h>
+#include <string.h>
+#include <emmintrin.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+#ifdef __INTEL_COMPILER
+#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
+#endif
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		: [reg_a] "=x" (reg_a)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu 64(%[src]), %[reg_e]\n\t"
+		"movdqu 80(%[src]), %[reg_f]\n\t"
+		"movdqu 96(%[src]), %[reg_g]\n\t"
+		"movdqu 112(%[src]), %[reg_h]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		"movdqu %[reg_e], 64(%[dst])\n\t"
+		"movdqu %[reg_f], 80(%[dst])\n\t"
+		"movdqu %[reg_g], 96(%[dst])\n\t"
+		"movdqu %[reg_h], 112(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d),
+		  [reg_e] "=x" (reg_e),
+		  [reg_f] "=x" (reg_f),
+		  [reg_g] "=x" (reg_g),
+		  [reg_h] "=x" (reg_h)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+#ifdef __INTEL_COMPILER
+#pragma warning(enable:593)
+#endif
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	rte_mov128(dst, src);
+	rte_mov128(dst + 128, src + 128);
+}
+
+#define rte_memcpy(dst, src, n)              \
+	((__builtin_constant_p(n)) ?          \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)))
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			*(uint64_t *)dst = *(const uint64_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0) {
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+	}
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_memcpy.h b/lib/librte_eal/common/include/generic/rte_memcpy.h
new file mode 100644
index 0000000..03e8477
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_memcpy.h
@@ -0,0 +1,144 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_H_
+#define _RTE_MEMCPY_H_
+
+/**
+ * @file
+ *
+ * Functions for vectorised implementation of memcpy().
+ */
+
+/**
+ * Copy 16 bytes from one location to another using optimised
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src);
+
+/**
+ * Copy 32 bytes from one location to another using optimised
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src);
+
+/**
+ * Copy 48 bytes from one location to another using optimised
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src);
+
+/**
+ * Copy 64 bytes from one location to another using optimised
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src);
+
+/**
+ * Copy 128 bytes from one location to another using optimised
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src);
+
+/**
+ * Copy 256 bytes from one location to another using optimised
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src);
+
+#ifdef __DOXYGEN__
+
+/**
+ * Copy bytes from one location to another. The locations must not overlap.
+ *
+ * @note This is implemented as a macro, so it's address should not be taken
+ * and care is needed as parameter expressions may be evaluated multiple times.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ * @param n
+ *   Number of bytes to copy.
+ * @return
+ *   Pointer to the destination data.
+ */
+static void *
+rte_memcpy(void *dst, const void *src, size_t n);
+
+#endif /* __DOXYGEN__ */
+
+/*
+ * memcpy() function used by rte_memcpy macro
+ */
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n) __attribute__((always_inline));
+
+
+#endif /* _RTE_MEMCPY_H_ */
diff --git a/lib/librte_eal/common/include/rte_memcpy.h b/lib/librte_eal/common/include/rte_memcpy.h
deleted file mode 100644
index 131b196..0000000
--- a/lib/librte_eal/common/include/rte_memcpy.h
+++ /dev/null
@@ -1,376 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_MEMCPY_H_
-#define _RTE_MEMCPY_H_
-
-/**
- * @file
- *
- * Functions for SSE implementation of memcpy().
- */
-
-#include <stdint.h>
-#include <string.h>
-#include <emmintrin.h>
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#ifdef __INTEL_COMPILER
-#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
-#endif
-
-/**
- * Copy 16 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov16(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		: [reg_a] "=x" (reg_a)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-/**
- * Copy 32 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov32(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-/**
- * Copy 48 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov48(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-/**
- * Copy 64 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov64(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c, reg_d;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu 48(%[src]), %[reg_d]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		"movdqu %[reg_d], 48(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c),
-		  [reg_d] "=x" (reg_d)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-/**
- * Copy 128 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov128(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu 48(%[src]), %[reg_d]\n\t"
-		"movdqu 64(%[src]), %[reg_e]\n\t"
-		"movdqu 80(%[src]), %[reg_f]\n\t"
-		"movdqu 96(%[src]), %[reg_g]\n\t"
-		"movdqu 112(%[src]), %[reg_h]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		"movdqu %[reg_d], 48(%[dst])\n\t"
-		"movdqu %[reg_e], 64(%[dst])\n\t"
-		"movdqu %[reg_f], 80(%[dst])\n\t"
-		"movdqu %[reg_g], 96(%[dst])\n\t"
-		"movdqu %[reg_h], 112(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c),
-		  [reg_d] "=x" (reg_d),
-		  [reg_e] "=x" (reg_e),
-		  [reg_f] "=x" (reg_f),
-		  [reg_g] "=x" (reg_g),
-		  [reg_h] "=x" (reg_h)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-#ifdef __INTEL_COMPILER
-#pragma warning(enable:593)
-#endif
-
-/**
- * Copy 256 bytes from one location to another using optimised SSE
- * instructions. The locations should not overlap.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- */
-static inline void
-rte_mov256(uint8_t *dst, const uint8_t *src)
-{
-	rte_mov128(dst, src);
-	rte_mov128(dst + 128, src + 128);
-}
-
-/**
- * Copy bytes from one location to another. The locations must not overlap.
- *
- * @note This is implemented as a macro, so it's address should not be taken
- * and care is needed as parameter expressions may be evaluated multiple times.
- *
- * @param dst
- *   Pointer to the destination of the data.
- * @param src
- *   Pointer to the source data.
- * @param n
- *   Number of bytes to copy.
- * @return
- *   Pointer to the destination data.
- */
-#define rte_memcpy(dst, src, n)              \
-	((__builtin_constant_p(n)) ?          \
-	memcpy((dst), (src), (n)) :          \
-	rte_memcpy_func((dst), (src), (n)))
-
-/*
- * memcpy() function used by rte_memcpy macro
- */
-static inline void *
-rte_memcpy_func(void *dst, const void *src, size_t n) __attribute__((always_inline));
-
-static inline void *
-rte_memcpy_func(void *dst, const void *src, size_t n)
-{
-	void *ret = dst;
-
-	/* We can't copy < 16 bytes using XMM registers so do it manually. */
-	if (n < 16) {
-		if (n & 0x01) {
-			*(uint8_t *)dst = *(const uint8_t *)src;
-			dst = (uint8_t *)dst + 1;
-			src = (const uint8_t *)src + 1;
-		}
-		if (n & 0x02) {
-			*(uint16_t *)dst = *(const uint16_t *)src;
-			dst = (uint16_t *)dst + 1;
-			src = (const uint16_t *)src + 1;
-		}
-		if (n & 0x04) {
-			*(uint32_t *)dst = *(const uint32_t *)src;
-			dst = (uint32_t *)dst + 1;
-			src = (const uint32_t *)src + 1;
-		}
-		if (n & 0x08) {
-			*(uint64_t *)dst = *(const uint64_t *)src;
-		}
-		return ret;
-	}
-
-	/* Special fast cases for <= 128 bytes */
-	if (n <= 32) {
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
-		return ret;
-	}
-
-	if (n <= 64) {
-		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
-		return ret;
-	}
-
-	if (n <= 128) {
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
-		return ret;
-	}
-
-	/*
-	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
-	 * copies was found to be faster than doing 128 and 32 byte copies as
-	 * well.
-	 */
-	for ( ; n >= 256; n -= 256) {
-		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
-		dst = (uint8_t *)dst + 256;
-		src = (const uint8_t *)src + 256;
-	}
-
-	/*
-	 * We split the remaining bytes (which will be less than 256) into
-	 * 64byte (2^6) chunks.
-	 * Using incrementing integers in the case labels of a switch statement
-	 * enourages the compiler to use a jump table. To get incrementing
-	 * integers, we shift the 2 relevant bits to the LSB position to first
-	 * get decrementing integers, and then subtract.
-	 */
-	switch (3 - (n >> 6)) {
-	case 0x00:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	case 0x01:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	case 0x02:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	default:
-		;
-	}
-
-	/*
-	 * We split the remaining bytes (which will be less than 64) into
-	 * 16byte (2^4) chunks, using the same switch structure as above.
-	 */
-	switch (3 - (n >> 4)) {
-	case 0x00:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	case 0x01:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	case 0x02:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	default:
-		;
-	}
-
-	/* Copy any remaining bytes, without going beyond end of buffers */
-	if (n != 0) {
-		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
-	}
-	return ret;
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_MEMCPY_H_ */
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 08/10] eal: split CPU flags operations to architecture specific
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
                   ` (6 preceding siblings ...)
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 07/10] eal: split memcpy operation " David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 09/10] eal: install all arch headers David Marchand
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

From: Chao Zhu <bjzhuc@cn.ibm.com>

This patch splits CPU flags related operations from DPDK and push them
to architecture specific arch directories, so that other processor
architecture can implement its own CPU flag functions to support DPDK.

Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/common/Makefile                     |    4 +-
 lib/librte_eal/common/eal_common_cpuflags.c        |  190 ------------
 .../common/include/arch/i686/rte_cpuflags.h        |  310 ++++++++++++++++++++
 .../common/include/arch/x86_64/rte_cpuflags.h      |  310 ++++++++++++++++++++
 .../common/include/generic/rte_cpuflags.h          |  110 +++++++
 lib/librte_eal/common/include/rte_cpuflags.h       |  182 ------------
 6 files changed, 732 insertions(+), 374 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cpuflags.h
 delete mode 100644 lib/librte_eal/common/include/rte_cpuflags.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index f9f98dc..ddf8b48 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -36,7 +36,7 @@ INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
 INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
-INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
+INC += rte_string_fns.h rte_version.h rte_tailq_elem.h
 INC += rte_eal_memconfig.h rte_malloc_heap.h
 INC += rte_hexdump.h rte_devargs.h rte_dev.h
 INC += rte_common_vect.h
@@ -47,7 +47,7 @@ INC += rte_warnings.h
 endif
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
-GENERIC_INC += rte_spinlock.h rte_memcpy.h
+GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h
 ARCH_INC := $(GENERIC_INC)
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
index 9e79179..6fd360c 100644
--- a/lib/librte_eal/common/eal_common_cpuflags.c
+++ b/lib/librte_eal/common/eal_common_cpuflags.c
@@ -30,10 +30,6 @@
  *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
-#include <stdlib.h>
-#include <stdio.h>
-#include <errno.h>
-#include <stdint.h>
 #include <rte_cpuflags.h>
 
 /*
@@ -50,192 +46,6 @@
 #endif
 
 /**
- * Enumeration of CPU registers
- */
-enum cpu_register_t {
-	REG_EAX = 0,
-	REG_EBX,
-	REG_ECX,
-	REG_EDX,
-};
-
-typedef uint32_t cpuid_registers_t[4];
-
-#define CPU_FLAG_NAME_MAX_LEN 64
-
-/**
- * Struct to hold a processor feature entry
- */
-struct feature_entry {
-	uint32_t leaf;				/**< cpuid leaf */
-	uint32_t subleaf;			/**< cpuid subleaf */
-	uint32_t reg;				/**< cpuid register */
-	uint32_t bit;				/**< cpuid register bit */
-	char name[CPU_FLAG_NAME_MAX_LEN];       /**< String for printing */
-};
-
-#define FEAT_DEF(name, leaf, subleaf, reg, bit) \
-	[RTE_CPUFLAG_##name] = {leaf, subleaf, reg, bit, #name },
-
-/**
- * An array that holds feature entries
- */
-static const struct feature_entry cpu_feature_table[] = {
-	FEAT_DEF(SSE3, 0x00000001, 0, REG_ECX,  0)
-	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, REG_ECX,  1)
-	FEAT_DEF(DTES64, 0x00000001, 0, REG_ECX,  2)
-	FEAT_DEF(MONITOR, 0x00000001, 0, REG_ECX,  3)
-	FEAT_DEF(DS_CPL, 0x00000001, 0, REG_ECX,  4)
-	FEAT_DEF(VMX, 0x00000001, 0, REG_ECX,  5)
-	FEAT_DEF(SMX, 0x00000001, 0, REG_ECX,  6)
-	FEAT_DEF(EIST, 0x00000001, 0, REG_ECX,  7)
-	FEAT_DEF(TM2, 0x00000001, 0, REG_ECX,  8)
-	FEAT_DEF(SSSE3, 0x00000001, 0, REG_ECX,  9)
-	FEAT_DEF(CNXT_ID, 0x00000001, 0, REG_ECX, 10)
-	FEAT_DEF(FMA, 0x00000001, 0, REG_ECX, 12)
-	FEAT_DEF(CMPXCHG16B, 0x00000001, 0, REG_ECX, 13)
-	FEAT_DEF(XTPR, 0x00000001, 0, REG_ECX, 14)
-	FEAT_DEF(PDCM, 0x00000001, 0, REG_ECX, 15)
-	FEAT_DEF(PCID, 0x00000001, 0, REG_ECX, 17)
-	FEAT_DEF(DCA, 0x00000001, 0, REG_ECX, 18)
-	FEAT_DEF(SSE4_1, 0x00000001, 0, REG_ECX, 19)
-	FEAT_DEF(SSE4_2, 0x00000001, 0, REG_ECX, 20)
-	FEAT_DEF(X2APIC, 0x00000001, 0, REG_ECX, 21)
-	FEAT_DEF(MOVBE, 0x00000001, 0, REG_ECX, 22)
-	FEAT_DEF(POPCNT, 0x00000001, 0, REG_ECX, 23)
-	FEAT_DEF(TSC_DEADLINE, 0x00000001, 0, REG_ECX, 24)
-	FEAT_DEF(AES, 0x00000001, 0, REG_ECX, 25)
-	FEAT_DEF(XSAVE, 0x00000001, 0, REG_ECX, 26)
-	FEAT_DEF(OSXSAVE, 0x00000001, 0, REG_ECX, 27)
-	FEAT_DEF(AVX, 0x00000001, 0, REG_ECX, 28)
-	FEAT_DEF(F16C, 0x00000001, 0, REG_ECX, 29)
-	FEAT_DEF(RDRAND, 0x00000001, 0, REG_ECX, 30)
-
-	FEAT_DEF(FPU, 0x00000001, 0, REG_EDX,  0)
-	FEAT_DEF(VME, 0x00000001, 0, REG_EDX,  1)
-	FEAT_DEF(DE, 0x00000001, 0, REG_EDX,  2)
-	FEAT_DEF(PSE, 0x00000001, 0, REG_EDX,  3)
-	FEAT_DEF(TSC, 0x00000001, 0, REG_EDX,  4)
-	FEAT_DEF(MSR, 0x00000001, 0, REG_EDX,  5)
-	FEAT_DEF(PAE, 0x00000001, 0, REG_EDX,  6)
-	FEAT_DEF(MCE, 0x00000001, 0, REG_EDX,  7)
-	FEAT_DEF(CX8, 0x00000001, 0, REG_EDX,  8)
-	FEAT_DEF(APIC, 0x00000001, 0, REG_EDX,  9)
-	FEAT_DEF(SEP, 0x00000001, 0, REG_EDX, 11)
-	FEAT_DEF(MTRR, 0x00000001, 0, REG_EDX, 12)
-	FEAT_DEF(PGE, 0x00000001, 0, REG_EDX, 13)
-	FEAT_DEF(MCA, 0x00000001, 0, REG_EDX, 14)
-	FEAT_DEF(CMOV, 0x00000001, 0, REG_EDX, 15)
-	FEAT_DEF(PAT, 0x00000001, 0, REG_EDX, 16)
-	FEAT_DEF(PSE36, 0x00000001, 0, REG_EDX, 17)
-	FEAT_DEF(PSN, 0x00000001, 0, REG_EDX, 18)
-	FEAT_DEF(CLFSH, 0x00000001, 0, REG_EDX, 19)
-	FEAT_DEF(DS, 0x00000001, 0, REG_EDX, 21)
-	FEAT_DEF(ACPI, 0x00000001, 0, REG_EDX, 22)
-	FEAT_DEF(MMX, 0x00000001, 0, REG_EDX, 23)
-	FEAT_DEF(FXSR, 0x00000001, 0, REG_EDX, 24)
-	FEAT_DEF(SSE, 0x00000001, 0, REG_EDX, 25)
-	FEAT_DEF(SSE2, 0x00000001, 0, REG_EDX, 26)
-	FEAT_DEF(SS, 0x00000001, 0, REG_EDX, 27)
-	FEAT_DEF(HTT, 0x00000001, 0, REG_EDX, 28)
-	FEAT_DEF(TM, 0x00000001, 0, REG_EDX, 29)
-	FEAT_DEF(PBE, 0x00000001, 0, REG_EDX, 31)
-
-	FEAT_DEF(DIGTEMP, 0x00000006, 0, REG_EAX,  0)
-	FEAT_DEF(TRBOBST, 0x00000006, 0, REG_EAX,  1)
-	FEAT_DEF(ARAT, 0x00000006, 0, REG_EAX,  2)
-	FEAT_DEF(PLN, 0x00000006, 0, REG_EAX,  4)
-	FEAT_DEF(ECMD, 0x00000006, 0, REG_EAX,  5)
-	FEAT_DEF(PTM, 0x00000006, 0, REG_EAX,  6)
-
-	FEAT_DEF(MPERF_APERF_MSR, 0x00000006, 0, REG_ECX,  0)
-	FEAT_DEF(ACNT2, 0x00000006, 0, REG_ECX,  1)
-	FEAT_DEF(ENERGY_EFF, 0x00000006, 0, REG_ECX,  3)
-
-	FEAT_DEF(FSGSBASE, 0x00000007, 0, REG_EBX,  0)
-	FEAT_DEF(BMI1, 0x00000007, 0, REG_EBX,  2)
-	FEAT_DEF(HLE, 0x00000007, 0, REG_EBX,  4)
-	FEAT_DEF(AVX2, 0x00000007, 0, REG_EBX,  5)
-	FEAT_DEF(SMEP, 0x00000007, 0, REG_EBX,  6)
-	FEAT_DEF(BMI2, 0x00000007, 0, REG_EBX,  7)
-	FEAT_DEF(ERMS, 0x00000007, 0, REG_EBX,  8)
-	FEAT_DEF(INVPCID, 0x00000007, 0, REG_EBX, 10)
-	FEAT_DEF(RTM, 0x00000007, 0, REG_EBX, 11)
-
-	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, REG_ECX,  0)
-	FEAT_DEF(LZCNT, 0x80000001, 0, REG_ECX,  4)
-
-	FEAT_DEF(SYSCALL, 0x80000001, 0, REG_EDX, 11)
-	FEAT_DEF(XD, 0x80000001, 0, REG_EDX, 20)
-	FEAT_DEF(1GB_PG, 0x80000001, 0, REG_EDX, 26)
-	FEAT_DEF(RDTSCP, 0x80000001, 0, REG_EDX, 27)
-	FEAT_DEF(EM64T, 0x80000001, 0, REG_EDX, 29)
-
-	FEAT_DEF(INVTSC, 0x80000007, 0, REG_EDX,  8)
-};
-
-/*
- * Execute CPUID instruction and get contents of a specific register
- *
- * This function, when compiled with GCC, will generate architecture-neutral
- * code, as per GCC manual.
- */
-static inline void
-rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out)
-{
-#if defined(__i386__) && defined(__PIC__)
-    /* %ebx is a forbidden register if we compile with -fPIC or -fPIE */
-    asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0"
-		 : "=r" (out[REG_EBX]),
-		   "=a" (out[REG_EAX]),
-		   "=c" (out[REG_ECX]),
-		   "=d" (out[REG_EDX])
-		 : "a" (leaf), "c" (subleaf));
-#else
-
-    asm volatile("cpuid"
-		 : "=a" (out[REG_EAX]),
-		   "=b" (out[REG_EBX]),
-		   "=c" (out[REG_ECX]),
-		   "=d" (out[REG_EDX])
-		 : "a" (leaf), "c" (subleaf));
-
-#endif
-}
-
-/*
- * Checks if a particular flag is available on current machine.
- */
-int
-rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
-{
-	const struct feature_entry *feat;
-	cpuid_registers_t regs;
-
-
-	if (feature >= RTE_CPUFLAG_NUMFLAGS)
-		/* Flag does not match anything in the feature tables */
-		return -ENOENT;
-
-	feat = &cpu_feature_table[feature];
-
-	if (!feat->leaf)
-		/* This entry in the table wasn't filled out! */
-		return -EFAULT;
-
-	rte_cpu_get_features(feat->leaf & 0xffff0000, 0, regs);
-	if (((regs[REG_EAX] ^ feat->leaf) & 0xffff0000) ||
-	      regs[REG_EAX] < feat->leaf)
-		return 0;
-
-	/* get the cpuid leaf containing the desired feature */
-	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
-
-	/* check if the feature is enabled */
-	return (regs[feat->reg] >> feat->bit) & 1;
-}
-
-/**
  * Checks if the machine is adequate for running the binary. If it is not, the
  * program exits with status 1.
  * The function attribute forces this function to be called before main(). But
diff --git a/lib/librte_eal/common/include/arch/i686/rte_cpuflags.h b/lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
new file mode 100644
index 0000000..fd27e8f
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
@@ -0,0 +1,310 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_I686_H_
+#define _RTE_CPUFLAGS_I686_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <errno.h>
+#include <stdint.h>
+
+#include "generic/rte_cpuflags.h"
+
+enum rte_cpu_flag_t {
+	/* (EAX 01h) ECX features*/
+	RTE_CPUFLAG_SSE3 = 0,               /**< SSE3 */
+	RTE_CPUFLAG_PCLMULQDQ,              /**< PCLMULQDQ */
+	RTE_CPUFLAG_DTES64,                 /**< DTES64 */
+	RTE_CPUFLAG_MONITOR,                /**< MONITOR */
+	RTE_CPUFLAG_DS_CPL,                 /**< DS_CPL */
+	RTE_CPUFLAG_VMX,                    /**< VMX */
+	RTE_CPUFLAG_SMX,                    /**< SMX */
+	RTE_CPUFLAG_EIST,                   /**< EIST */
+	RTE_CPUFLAG_TM2,                    /**< TM2 */
+	RTE_CPUFLAG_SSSE3,                  /**< SSSE3 */
+	RTE_CPUFLAG_CNXT_ID,                /**< CNXT_ID */
+	RTE_CPUFLAG_FMA,                    /**< FMA */
+	RTE_CPUFLAG_CMPXCHG16B,             /**< CMPXCHG16B */
+	RTE_CPUFLAG_XTPR,                   /**< XTPR */
+	RTE_CPUFLAG_PDCM,                   /**< PDCM */
+	RTE_CPUFLAG_PCID,                   /**< PCID */
+	RTE_CPUFLAG_DCA,                    /**< DCA */
+	RTE_CPUFLAG_SSE4_1,                 /**< SSE4_1 */
+	RTE_CPUFLAG_SSE4_2,                 /**< SSE4_2 */
+	RTE_CPUFLAG_X2APIC,                 /**< X2APIC */
+	RTE_CPUFLAG_MOVBE,                  /**< MOVBE */
+	RTE_CPUFLAG_POPCNT,                 /**< POPCNT */
+	RTE_CPUFLAG_TSC_DEADLINE,           /**< TSC_DEADLINE */
+	RTE_CPUFLAG_AES,                    /**< AES */
+	RTE_CPUFLAG_XSAVE,                  /**< XSAVE */
+	RTE_CPUFLAG_OSXSAVE,                /**< OSXSAVE */
+	RTE_CPUFLAG_AVX,                    /**< AVX */
+	RTE_CPUFLAG_F16C,                   /**< F16C */
+	RTE_CPUFLAG_RDRAND,                 /**< RDRAND */
+
+	/* (EAX 01h) EDX features */
+	RTE_CPUFLAG_FPU,                    /**< FPU */
+	RTE_CPUFLAG_VME,                    /**< VME */
+	RTE_CPUFLAG_DE,                     /**< DE */
+	RTE_CPUFLAG_PSE,                    /**< PSE */
+	RTE_CPUFLAG_TSC,                    /**< TSC */
+	RTE_CPUFLAG_MSR,                    /**< MSR */
+	RTE_CPUFLAG_PAE,                    /**< PAE */
+	RTE_CPUFLAG_MCE,                    /**< MCE */
+	RTE_CPUFLAG_CX8,                    /**< CX8 */
+	RTE_CPUFLAG_APIC,                   /**< APIC */
+	RTE_CPUFLAG_SEP,                    /**< SEP */
+	RTE_CPUFLAG_MTRR,                   /**< MTRR */
+	RTE_CPUFLAG_PGE,                    /**< PGE */
+	RTE_CPUFLAG_MCA,                    /**< MCA */
+	RTE_CPUFLAG_CMOV,                   /**< CMOV */
+	RTE_CPUFLAG_PAT,                    /**< PAT */
+	RTE_CPUFLAG_PSE36,                  /**< PSE36 */
+	RTE_CPUFLAG_PSN,                    /**< PSN */
+	RTE_CPUFLAG_CLFSH,                  /**< CLFSH */
+	RTE_CPUFLAG_DS,                     /**< DS */
+	RTE_CPUFLAG_ACPI,                   /**< ACPI */
+	RTE_CPUFLAG_MMX,                    /**< MMX */
+	RTE_CPUFLAG_FXSR,                   /**< FXSR */
+	RTE_CPUFLAG_SSE,                    /**< SSE */
+	RTE_CPUFLAG_SSE2,                   /**< SSE2 */
+	RTE_CPUFLAG_SS,                     /**< SS */
+	RTE_CPUFLAG_HTT,                    /**< HTT */
+	RTE_CPUFLAG_TM,                     /**< TM */
+	RTE_CPUFLAG_PBE,                    /**< PBE */
+
+	/* (EAX 06h) EAX features */
+	RTE_CPUFLAG_DIGTEMP,                /**< DIGTEMP */
+	RTE_CPUFLAG_TRBOBST,                /**< TRBOBST */
+	RTE_CPUFLAG_ARAT,                   /**< ARAT */
+	RTE_CPUFLAG_PLN,                    /**< PLN */
+	RTE_CPUFLAG_ECMD,                   /**< ECMD */
+	RTE_CPUFLAG_PTM,                    /**< PTM */
+
+	/* (EAX 06h) ECX features */
+	RTE_CPUFLAG_MPERF_APERF_MSR,        /**< MPERF_APERF_MSR */
+	RTE_CPUFLAG_ACNT2,                  /**< ACNT2 */
+	RTE_CPUFLAG_ENERGY_EFF,             /**< ENERGY_EFF */
+
+	/* (EAX 07h, ECX 0h) EBX features */
+	RTE_CPUFLAG_FSGSBASE,               /**< FSGSBASE */
+	RTE_CPUFLAG_BMI1,                   /**< BMI1 */
+	RTE_CPUFLAG_HLE,                    /**< Hardware Lock elision */
+	RTE_CPUFLAG_AVX2,                   /**< AVX2 */
+	RTE_CPUFLAG_SMEP,                   /**< SMEP */
+	RTE_CPUFLAG_BMI2,                   /**< BMI2 */
+	RTE_CPUFLAG_ERMS,                   /**< ERMS */
+	RTE_CPUFLAG_INVPCID,                /**< INVPCID */
+	RTE_CPUFLAG_RTM,                    /**< Transactional memory */
+
+	/* (EAX 80000001h) ECX features */
+	RTE_CPUFLAG_LAHF_SAHF,              /**< LAHF_SAHF */
+	RTE_CPUFLAG_LZCNT,                  /**< LZCNT */
+
+	/* (EAX 80000001h) EDX features */
+	RTE_CPUFLAG_SYSCALL,                /**< SYSCALL */
+	RTE_CPUFLAG_XD,                     /**< XD */
+	RTE_CPUFLAG_1GB_PG,                 /**< 1GB_PG */
+	RTE_CPUFLAG_RDTSCP,                 /**< RDTSCP */
+	RTE_CPUFLAG_EM64T,                  /**< EM64T */
+
+	/* (EAX 80000007h) EDX features */
+	RTE_CPUFLAG_INVTSC,                 /**< INVTSC */
+
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
+};
+
+enum cpu_register_t {
+	REG_EAX = 0,
+	REG_EBX,
+	REG_ECX,
+	REG_EDX,
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(SSE3, 0x00000001, 0, REG_ECX,  0)
+	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, REG_ECX,  1)
+	FEAT_DEF(DTES64, 0x00000001, 0, REG_ECX,  2)
+	FEAT_DEF(MONITOR, 0x00000001, 0, REG_ECX,  3)
+	FEAT_DEF(DS_CPL, 0x00000001, 0, REG_ECX,  4)
+	FEAT_DEF(VMX, 0x00000001, 0, REG_ECX,  5)
+	FEAT_DEF(SMX, 0x00000001, 0, REG_ECX,  6)
+	FEAT_DEF(EIST, 0x00000001, 0, REG_ECX,  7)
+	FEAT_DEF(TM2, 0x00000001, 0, REG_ECX,  8)
+	FEAT_DEF(SSSE3, 0x00000001, 0, REG_ECX,  9)
+	FEAT_DEF(CNXT_ID, 0x00000001, 0, REG_ECX, 10)
+	FEAT_DEF(FMA, 0x00000001, 0, REG_ECX, 12)
+	FEAT_DEF(CMPXCHG16B, 0x00000001, 0, REG_ECX, 13)
+	FEAT_DEF(XTPR, 0x00000001, 0, REG_ECX, 14)
+	FEAT_DEF(PDCM, 0x00000001, 0, REG_ECX, 15)
+	FEAT_DEF(PCID, 0x00000001, 0, REG_ECX, 17)
+	FEAT_DEF(DCA, 0x00000001, 0, REG_ECX, 18)
+	FEAT_DEF(SSE4_1, 0x00000001, 0, REG_ECX, 19)
+	FEAT_DEF(SSE4_2, 0x00000001, 0, REG_ECX, 20)
+	FEAT_DEF(X2APIC, 0x00000001, 0, REG_ECX, 21)
+	FEAT_DEF(MOVBE, 0x00000001, 0, REG_ECX, 22)
+	FEAT_DEF(POPCNT, 0x00000001, 0, REG_ECX, 23)
+	FEAT_DEF(TSC_DEADLINE, 0x00000001, 0, REG_ECX, 24)
+	FEAT_DEF(AES, 0x00000001, 0, REG_ECX, 25)
+	FEAT_DEF(XSAVE, 0x00000001, 0, REG_ECX, 26)
+	FEAT_DEF(OSXSAVE, 0x00000001, 0, REG_ECX, 27)
+	FEAT_DEF(AVX, 0x00000001, 0, REG_ECX, 28)
+	FEAT_DEF(F16C, 0x00000001, 0, REG_ECX, 29)
+	FEAT_DEF(RDRAND, 0x00000001, 0, REG_ECX, 30)
+
+	FEAT_DEF(FPU, 0x00000001, 0, REG_EDX,  0)
+	FEAT_DEF(VME, 0x00000001, 0, REG_EDX,  1)
+	FEAT_DEF(DE, 0x00000001, 0, REG_EDX,  2)
+	FEAT_DEF(PSE, 0x00000001, 0, REG_EDX,  3)
+	FEAT_DEF(TSC, 0x00000001, 0, REG_EDX,  4)
+	FEAT_DEF(MSR, 0x00000001, 0, REG_EDX,  5)
+	FEAT_DEF(PAE, 0x00000001, 0, REG_EDX,  6)
+	FEAT_DEF(MCE, 0x00000001, 0, REG_EDX,  7)
+	FEAT_DEF(CX8, 0x00000001, 0, REG_EDX,  8)
+	FEAT_DEF(APIC, 0x00000001, 0, REG_EDX,  9)
+	FEAT_DEF(SEP, 0x00000001, 0, REG_EDX, 11)
+	FEAT_DEF(MTRR, 0x00000001, 0, REG_EDX, 12)
+	FEAT_DEF(PGE, 0x00000001, 0, REG_EDX, 13)
+	FEAT_DEF(MCA, 0x00000001, 0, REG_EDX, 14)
+	FEAT_DEF(CMOV, 0x00000001, 0, REG_EDX, 15)
+	FEAT_DEF(PAT, 0x00000001, 0, REG_EDX, 16)
+	FEAT_DEF(PSE36, 0x00000001, 0, REG_EDX, 17)
+	FEAT_DEF(PSN, 0x00000001, 0, REG_EDX, 18)
+	FEAT_DEF(CLFSH, 0x00000001, 0, REG_EDX, 19)
+	FEAT_DEF(DS, 0x00000001, 0, REG_EDX, 21)
+	FEAT_DEF(ACPI, 0x00000001, 0, REG_EDX, 22)
+	FEAT_DEF(MMX, 0x00000001, 0, REG_EDX, 23)
+	FEAT_DEF(FXSR, 0x00000001, 0, REG_EDX, 24)
+	FEAT_DEF(SSE, 0x00000001, 0, REG_EDX, 25)
+	FEAT_DEF(SSE2, 0x00000001, 0, REG_EDX, 26)
+	FEAT_DEF(SS, 0x00000001, 0, REG_EDX, 27)
+	FEAT_DEF(HTT, 0x00000001, 0, REG_EDX, 28)
+	FEAT_DEF(TM, 0x00000001, 0, REG_EDX, 29)
+	FEAT_DEF(PBE, 0x00000001, 0, REG_EDX, 31)
+
+	FEAT_DEF(DIGTEMP, 0x00000006, 0, REG_EAX,  0)
+	FEAT_DEF(TRBOBST, 0x00000006, 0, REG_EAX,  1)
+	FEAT_DEF(ARAT, 0x00000006, 0, REG_EAX,  2)
+	FEAT_DEF(PLN, 0x00000006, 0, REG_EAX,  4)
+	FEAT_DEF(ECMD, 0x00000006, 0, REG_EAX,  5)
+	FEAT_DEF(PTM, 0x00000006, 0, REG_EAX,  6)
+
+	FEAT_DEF(MPERF_APERF_MSR, 0x00000006, 0, REG_ECX,  0)
+	FEAT_DEF(ACNT2, 0x00000006, 0, REG_ECX,  1)
+	FEAT_DEF(ENERGY_EFF, 0x00000006, 0, REG_ECX,  3)
+
+	FEAT_DEF(FSGSBASE, 0x00000007, 0, REG_EBX,  0)
+	FEAT_DEF(BMI1, 0x00000007, 0, REG_EBX,  2)
+	FEAT_DEF(HLE, 0x00000007, 0, REG_EBX,  4)
+	FEAT_DEF(AVX2, 0x00000007, 0, REG_EBX,  5)
+	FEAT_DEF(SMEP, 0x00000007, 0, REG_EBX,  6)
+	FEAT_DEF(BMI2, 0x00000007, 0, REG_EBX,  7)
+	FEAT_DEF(ERMS, 0x00000007, 0, REG_EBX,  8)
+	FEAT_DEF(INVPCID, 0x00000007, 0, REG_EBX, 10)
+	FEAT_DEF(RTM, 0x00000007, 0, REG_EBX, 11)
+
+	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, REG_ECX,  0)
+	FEAT_DEF(LZCNT, 0x80000001, 0, REG_ECX,  4)
+
+	FEAT_DEF(SYSCALL, 0x80000001, 0, REG_EDX, 11)
+	FEAT_DEF(XD, 0x80000001, 0, REG_EDX, 20)
+	FEAT_DEF(1GB_PG, 0x80000001, 0, REG_EDX, 26)
+	FEAT_DEF(RDTSCP, 0x80000001, 0, REG_EDX, 27)
+	FEAT_DEF(EM64T, 0x80000001, 0, REG_EDX, 29)
+
+	FEAT_DEF(INVTSC, 0x80000007, 0, REG_EDX,  8)
+};
+
+static inline void
+rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out)
+{
+#if defined(__i386__) && defined(__PIC__)
+    /* %ebx is a forbidden register if we compile with -fPIC or -fPIE */
+    asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0"
+		 : "=r" (out[REG_EBX]),
+		   "=a" (out[REG_EAX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+#else
+
+    asm volatile("cpuid"
+		 : "=a" (out[REG_EAX]),
+		   "=b" (out[REG_EBX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+
+#endif
+}
+
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs;
+
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	rte_cpu_get_features(feat->leaf & 0xffff0000, 0, regs);
+	if (((regs[REG_EAX] ^ feat->leaf) & 0xffff0000) ||
+	      regs[REG_EAX] < feat->leaf)
+		return 0;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h b/lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
new file mode 100644
index 0000000..98906c8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
@@ -0,0 +1,310 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+ 
+#ifndef _RTE_CPUFLAGS_X86_64_H_
+#define _RTE_CPUFLAGS_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <errno.h>
+#include <stdint.h>
+
+#include "generic/rte_cpuflags.h"
+
+enum rte_cpu_flag_t {
+	/* (EAX 01h) ECX features*/
+	RTE_CPUFLAG_SSE3 = 0,               /**< SSE3 */
+	RTE_CPUFLAG_PCLMULQDQ,              /**< PCLMULQDQ */
+	RTE_CPUFLAG_DTES64,                 /**< DTES64 */
+	RTE_CPUFLAG_MONITOR,                /**< MONITOR */
+	RTE_CPUFLAG_DS_CPL,                 /**< DS_CPL */
+	RTE_CPUFLAG_VMX,                    /**< VMX */
+	RTE_CPUFLAG_SMX,                    /**< SMX */
+	RTE_CPUFLAG_EIST,                   /**< EIST */
+	RTE_CPUFLAG_TM2,                    /**< TM2 */
+	RTE_CPUFLAG_SSSE3,                  /**< SSSE3 */
+	RTE_CPUFLAG_CNXT_ID,                /**< CNXT_ID */
+	RTE_CPUFLAG_FMA,                    /**< FMA */
+	RTE_CPUFLAG_CMPXCHG16B,             /**< CMPXCHG16B */
+	RTE_CPUFLAG_XTPR,                   /**< XTPR */
+	RTE_CPUFLAG_PDCM,                   /**< PDCM */
+	RTE_CPUFLAG_PCID,                   /**< PCID */
+	RTE_CPUFLAG_DCA,                    /**< DCA */
+	RTE_CPUFLAG_SSE4_1,                 /**< SSE4_1 */
+	RTE_CPUFLAG_SSE4_2,                 /**< SSE4_2 */
+	RTE_CPUFLAG_X2APIC,                 /**< X2APIC */
+	RTE_CPUFLAG_MOVBE,                  /**< MOVBE */
+	RTE_CPUFLAG_POPCNT,                 /**< POPCNT */
+	RTE_CPUFLAG_TSC_DEADLINE,           /**< TSC_DEADLINE */
+	RTE_CPUFLAG_AES,                    /**< AES */
+	RTE_CPUFLAG_XSAVE,                  /**< XSAVE */
+	RTE_CPUFLAG_OSXSAVE,                /**< OSXSAVE */
+	RTE_CPUFLAG_AVX,                    /**< AVX */
+	RTE_CPUFLAG_F16C,                   /**< F16C */
+	RTE_CPUFLAG_RDRAND,                 /**< RDRAND */
+
+	/* (EAX 01h) EDX features */
+	RTE_CPUFLAG_FPU,                    /**< FPU */
+	RTE_CPUFLAG_VME,                    /**< VME */
+	RTE_CPUFLAG_DE,                     /**< DE */
+	RTE_CPUFLAG_PSE,                    /**< PSE */
+	RTE_CPUFLAG_TSC,                    /**< TSC */
+	RTE_CPUFLAG_MSR,                    /**< MSR */
+	RTE_CPUFLAG_PAE,                    /**< PAE */
+	RTE_CPUFLAG_MCE,                    /**< MCE */
+	RTE_CPUFLAG_CX8,                    /**< CX8 */
+	RTE_CPUFLAG_APIC,                   /**< APIC */
+	RTE_CPUFLAG_SEP,                    /**< SEP */
+	RTE_CPUFLAG_MTRR,                   /**< MTRR */
+	RTE_CPUFLAG_PGE,                    /**< PGE */
+	RTE_CPUFLAG_MCA,                    /**< MCA */
+	RTE_CPUFLAG_CMOV,                   /**< CMOV */
+	RTE_CPUFLAG_PAT,                    /**< PAT */
+	RTE_CPUFLAG_PSE36,                  /**< PSE36 */
+	RTE_CPUFLAG_PSN,                    /**< PSN */
+	RTE_CPUFLAG_CLFSH,                  /**< CLFSH */
+	RTE_CPUFLAG_DS,                     /**< DS */
+	RTE_CPUFLAG_ACPI,                   /**< ACPI */
+	RTE_CPUFLAG_MMX,                    /**< MMX */
+	RTE_CPUFLAG_FXSR,                   /**< FXSR */
+	RTE_CPUFLAG_SSE,                    /**< SSE */
+	RTE_CPUFLAG_SSE2,                   /**< SSE2 */
+	RTE_CPUFLAG_SS,                     /**< SS */
+	RTE_CPUFLAG_HTT,                    /**< HTT */
+	RTE_CPUFLAG_TM,                     /**< TM */
+	RTE_CPUFLAG_PBE,                    /**< PBE */
+
+	/* (EAX 06h) EAX features */
+	RTE_CPUFLAG_DIGTEMP,                /**< DIGTEMP */
+	RTE_CPUFLAG_TRBOBST,                /**< TRBOBST */
+	RTE_CPUFLAG_ARAT,                   /**< ARAT */
+	RTE_CPUFLAG_PLN,                    /**< PLN */
+	RTE_CPUFLAG_ECMD,                   /**< ECMD */
+	RTE_CPUFLAG_PTM,                    /**< PTM */
+
+	/* (EAX 06h) ECX features */
+	RTE_CPUFLAG_MPERF_APERF_MSR,        /**< MPERF_APERF_MSR */
+	RTE_CPUFLAG_ACNT2,                  /**< ACNT2 */
+	RTE_CPUFLAG_ENERGY_EFF,             /**< ENERGY_EFF */
+
+	/* (EAX 07h, ECX 0h) EBX features */
+	RTE_CPUFLAG_FSGSBASE,               /**< FSGSBASE */
+	RTE_CPUFLAG_BMI1,                   /**< BMI1 */
+	RTE_CPUFLAG_HLE,                    /**< Hardware Lock elision */
+	RTE_CPUFLAG_AVX2,                   /**< AVX2 */
+	RTE_CPUFLAG_SMEP,                   /**< SMEP */
+	RTE_CPUFLAG_BMI2,                   /**< BMI2 */
+	RTE_CPUFLAG_ERMS,                   /**< ERMS */
+	RTE_CPUFLAG_INVPCID,                /**< INVPCID */
+	RTE_CPUFLAG_RTM,                    /**< Transactional memory */
+
+	/* (EAX 80000001h) ECX features */
+	RTE_CPUFLAG_LAHF_SAHF,              /**< LAHF_SAHF */
+	RTE_CPUFLAG_LZCNT,                  /**< LZCNT */
+
+	/* (EAX 80000001h) EDX features */
+	RTE_CPUFLAG_SYSCALL,                /**< SYSCALL */
+	RTE_CPUFLAG_XD,                     /**< XD */
+	RTE_CPUFLAG_1GB_PG,                 /**< 1GB_PG */
+	RTE_CPUFLAG_RDTSCP,                 /**< RDTSCP */
+	RTE_CPUFLAG_EM64T,                  /**< EM64T */
+
+	/* (EAX 80000007h) EDX features */
+	RTE_CPUFLAG_INVTSC,                 /**< INVTSC */
+
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
+};
+
+enum cpu_register_t {
+	REG_EAX = 0,
+	REG_EBX,
+	REG_ECX,
+	REG_EDX,
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(SSE3, 0x00000001, 0, REG_ECX,  0)
+	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, REG_ECX,  1)
+	FEAT_DEF(DTES64, 0x00000001, 0, REG_ECX,  2)
+	FEAT_DEF(MONITOR, 0x00000001, 0, REG_ECX,  3)
+	FEAT_DEF(DS_CPL, 0x00000001, 0, REG_ECX,  4)
+	FEAT_DEF(VMX, 0x00000001, 0, REG_ECX,  5)
+	FEAT_DEF(SMX, 0x00000001, 0, REG_ECX,  6)
+	FEAT_DEF(EIST, 0x00000001, 0, REG_ECX,  7)
+	FEAT_DEF(TM2, 0x00000001, 0, REG_ECX,  8)
+	FEAT_DEF(SSSE3, 0x00000001, 0, REG_ECX,  9)
+	FEAT_DEF(CNXT_ID, 0x00000001, 0, REG_ECX, 10)
+	FEAT_DEF(FMA, 0x00000001, 0, REG_ECX, 12)
+	FEAT_DEF(CMPXCHG16B, 0x00000001, 0, REG_ECX, 13)
+	FEAT_DEF(XTPR, 0x00000001, 0, REG_ECX, 14)
+	FEAT_DEF(PDCM, 0x00000001, 0, REG_ECX, 15)
+	FEAT_DEF(PCID, 0x00000001, 0, REG_ECX, 17)
+	FEAT_DEF(DCA, 0x00000001, 0, REG_ECX, 18)
+	FEAT_DEF(SSE4_1, 0x00000001, 0, REG_ECX, 19)
+	FEAT_DEF(SSE4_2, 0x00000001, 0, REG_ECX, 20)
+	FEAT_DEF(X2APIC, 0x00000001, 0, REG_ECX, 21)
+	FEAT_DEF(MOVBE, 0x00000001, 0, REG_ECX, 22)
+	FEAT_DEF(POPCNT, 0x00000001, 0, REG_ECX, 23)
+	FEAT_DEF(TSC_DEADLINE, 0x00000001, 0, REG_ECX, 24)
+	FEAT_DEF(AES, 0x00000001, 0, REG_ECX, 25)
+	FEAT_DEF(XSAVE, 0x00000001, 0, REG_ECX, 26)
+	FEAT_DEF(OSXSAVE, 0x00000001, 0, REG_ECX, 27)
+	FEAT_DEF(AVX, 0x00000001, 0, REG_ECX, 28)
+	FEAT_DEF(F16C, 0x00000001, 0, REG_ECX, 29)
+	FEAT_DEF(RDRAND, 0x00000001, 0, REG_ECX, 30)
+
+	FEAT_DEF(FPU, 0x00000001, 0, REG_EDX,  0)
+	FEAT_DEF(VME, 0x00000001, 0, REG_EDX,  1)
+	FEAT_DEF(DE, 0x00000001, 0, REG_EDX,  2)
+	FEAT_DEF(PSE, 0x00000001, 0, REG_EDX,  3)
+	FEAT_DEF(TSC, 0x00000001, 0, REG_EDX,  4)
+	FEAT_DEF(MSR, 0x00000001, 0, REG_EDX,  5)
+	FEAT_DEF(PAE, 0x00000001, 0, REG_EDX,  6)
+	FEAT_DEF(MCE, 0x00000001, 0, REG_EDX,  7)
+	FEAT_DEF(CX8, 0x00000001, 0, REG_EDX,  8)
+	FEAT_DEF(APIC, 0x00000001, 0, REG_EDX,  9)
+	FEAT_DEF(SEP, 0x00000001, 0, REG_EDX, 11)
+	FEAT_DEF(MTRR, 0x00000001, 0, REG_EDX, 12)
+	FEAT_DEF(PGE, 0x00000001, 0, REG_EDX, 13)
+	FEAT_DEF(MCA, 0x00000001, 0, REG_EDX, 14)
+	FEAT_DEF(CMOV, 0x00000001, 0, REG_EDX, 15)
+	FEAT_DEF(PAT, 0x00000001, 0, REG_EDX, 16)
+	FEAT_DEF(PSE36, 0x00000001, 0, REG_EDX, 17)
+	FEAT_DEF(PSN, 0x00000001, 0, REG_EDX, 18)
+	FEAT_DEF(CLFSH, 0x00000001, 0, REG_EDX, 19)
+	FEAT_DEF(DS, 0x00000001, 0, REG_EDX, 21)
+	FEAT_DEF(ACPI, 0x00000001, 0, REG_EDX, 22)
+	FEAT_DEF(MMX, 0x00000001, 0, REG_EDX, 23)
+	FEAT_DEF(FXSR, 0x00000001, 0, REG_EDX, 24)
+	FEAT_DEF(SSE, 0x00000001, 0, REG_EDX, 25)
+	FEAT_DEF(SSE2, 0x00000001, 0, REG_EDX, 26)
+	FEAT_DEF(SS, 0x00000001, 0, REG_EDX, 27)
+	FEAT_DEF(HTT, 0x00000001, 0, REG_EDX, 28)
+	FEAT_DEF(TM, 0x00000001, 0, REG_EDX, 29)
+	FEAT_DEF(PBE, 0x00000001, 0, REG_EDX, 31)
+
+	FEAT_DEF(DIGTEMP, 0x00000006, 0, REG_EAX,  0)
+	FEAT_DEF(TRBOBST, 0x00000006, 0, REG_EAX,  1)
+	FEAT_DEF(ARAT, 0x00000006, 0, REG_EAX,  2)
+	FEAT_DEF(PLN, 0x00000006, 0, REG_EAX,  4)
+	FEAT_DEF(ECMD, 0x00000006, 0, REG_EAX,  5)
+	FEAT_DEF(PTM, 0x00000006, 0, REG_EAX,  6)
+
+	FEAT_DEF(MPERF_APERF_MSR, 0x00000006, 0, REG_ECX,  0)
+	FEAT_DEF(ACNT2, 0x00000006, 0, REG_ECX,  1)
+	FEAT_DEF(ENERGY_EFF, 0x00000006, 0, REG_ECX,  3)
+
+	FEAT_DEF(FSGSBASE, 0x00000007, 0, REG_EBX,  0)
+	FEAT_DEF(BMI1, 0x00000007, 0, REG_EBX,  2)
+	FEAT_DEF(HLE, 0x00000007, 0, REG_EBX,  4)
+	FEAT_DEF(AVX2, 0x00000007, 0, REG_EBX,  5)
+	FEAT_DEF(SMEP, 0x00000007, 0, REG_EBX,  6)
+	FEAT_DEF(BMI2, 0x00000007, 0, REG_EBX,  7)
+	FEAT_DEF(ERMS, 0x00000007, 0, REG_EBX,  8)
+	FEAT_DEF(INVPCID, 0x00000007, 0, REG_EBX, 10)
+	FEAT_DEF(RTM, 0x00000007, 0, REG_EBX, 11)
+
+	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, REG_ECX,  0)
+	FEAT_DEF(LZCNT, 0x80000001, 0, REG_ECX,  4)
+
+	FEAT_DEF(SYSCALL, 0x80000001, 0, REG_EDX, 11)
+	FEAT_DEF(XD, 0x80000001, 0, REG_EDX, 20)
+	FEAT_DEF(1GB_PG, 0x80000001, 0, REG_EDX, 26)
+	FEAT_DEF(RDTSCP, 0x80000001, 0, REG_EDX, 27)
+	FEAT_DEF(EM64T, 0x80000001, 0, REG_EDX, 29)
+
+	FEAT_DEF(INVTSC, 0x80000007, 0, REG_EDX,  8)
+};
+
+static inline void
+rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out)
+{
+#if defined(__i386__) && defined(__PIC__)
+    /* %ebx is a forbidden register if we compile with -fPIC or -fPIE */
+    asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0"
+		 : "=r" (out[REG_EBX]),
+		   "=a" (out[REG_EAX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+#else
+
+    asm volatile("cpuid"
+		 : "=a" (out[REG_EAX]),
+		   "=b" (out[REG_EBX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+
+#endif
+}
+
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs;
+
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	rte_cpu_get_features(feat->leaf & 0xffff0000, 0, regs);
+	if (((regs[REG_EAX] ^ feat->leaf) & 0xffff0000) ||
+	      regs[REG_EAX] < feat->leaf)
+		return 0;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
new file mode 100644
index 0000000..7f04838
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
@@ -0,0 +1,110 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_H_
+#define _RTE_CPUFLAGS_H_
+
+/**
+ * @file
+ * Architecture specific API to determine available CPU features at runtime.
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <errno.h>
+#include <stdint.h>
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t;
+
+/**
+ * Enumeration of CPU registers
+ */
+enum cpu_register_t;
+
+typedef uint32_t cpuid_registers_t[4];
+
+#define CPU_FLAG_NAME_MAX_LEN 64
+
+/**
+ * Struct to hold a processor feature entry
+ */
+struct feature_entry {
+	uint32_t leaf;				/**< cpuid leaf */
+	uint32_t subleaf;			/**< cpuid subleaf */
+	uint32_t reg;				/**< cpuid register */
+	uint32_t bit;				/**< cpuid register bit */
+	char name[CPU_FLAG_NAME_MAX_LEN];       /**< String for printing */
+};
+
+#define FEAT_DEF(name, leaf, subleaf, reg, bit) \
+	[RTE_CPUFLAG_##name] = {leaf, subleaf, reg, bit, #name },
+
+/**
+ * An array that holds feature entries
+ */
+static const struct feature_entry cpu_feature_table[];
+
+/**
+ * Execute CPUID instruction and get contents of a specific register
+ *
+ * This function, when compiled with GCC, will generate architecture-neutral
+ * code, as per GCC manual.
+ */
+static inline void
+rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out);
+
+/**
+ * Function for checking a CPU flag availability
+ *
+ * @param flag
+ *     CPU flag to query CPU for
+ * @return
+ *     1 if flag is available
+ *     0 if flag is not available
+ *     -ENOENT if flag is invalid
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature);
+
+/**
+ * This function checks that the currently used CPU supports the CPU features
+ * that were specified at compile time. It is called automatically within the
+ * EAL, so does not need to be used by applications.
+ */
+void
+rte_cpu_check_supported(void);
+
+#endif /* _RTE_CPUFLAGS_H_ */
diff --git a/lib/librte_eal/common/include/rte_cpuflags.h b/lib/librte_eal/common/include/rte_cpuflags.h
deleted file mode 100644
index 5fa96db..0000000
--- a/lib/librte_eal/common/include/rte_cpuflags.h
+++ /dev/null
@@ -1,182 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_CPUFLAGS_H_
-#define _RTE_CPUFLAGS_H_
-
-/**
- * @file
- * Simple API to determine available CPU features at runtime.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-
-/**
- * Enumeration of all CPU features supported
- */
-enum rte_cpu_flag_t {
-	/* (EAX 01h) ECX features*/
-	RTE_CPUFLAG_SSE3 = 0,               /**< SSE3 */
-	RTE_CPUFLAG_PCLMULQDQ,              /**< PCLMULQDQ */
-	RTE_CPUFLAG_DTES64,                 /**< DTES64 */
-	RTE_CPUFLAG_MONITOR,                /**< MONITOR */
-	RTE_CPUFLAG_DS_CPL,                 /**< DS_CPL */
-	RTE_CPUFLAG_VMX,                    /**< VMX */
-	RTE_CPUFLAG_SMX,                    /**< SMX */
-	RTE_CPUFLAG_EIST,                   /**< EIST */
-	RTE_CPUFLAG_TM2,                    /**< TM2 */
-	RTE_CPUFLAG_SSSE3,                  /**< SSSE3 */
-	RTE_CPUFLAG_CNXT_ID,                /**< CNXT_ID */
-	RTE_CPUFLAG_FMA,                    /**< FMA */
-	RTE_CPUFLAG_CMPXCHG16B,             /**< CMPXCHG16B */
-	RTE_CPUFLAG_XTPR,                   /**< XTPR */
-	RTE_CPUFLAG_PDCM,                   /**< PDCM */
-	RTE_CPUFLAG_PCID,                   /**< PCID */
-	RTE_CPUFLAG_DCA,                    /**< DCA */
-	RTE_CPUFLAG_SSE4_1,                 /**< SSE4_1 */
-	RTE_CPUFLAG_SSE4_2,                 /**< SSE4_2 */
-	RTE_CPUFLAG_X2APIC,                 /**< X2APIC */
-	RTE_CPUFLAG_MOVBE,                  /**< MOVBE */
-	RTE_CPUFLAG_POPCNT,                 /**< POPCNT */
-	RTE_CPUFLAG_TSC_DEADLINE,           /**< TSC_DEADLINE */
-	RTE_CPUFLAG_AES,                    /**< AES */
-	RTE_CPUFLAG_XSAVE,                  /**< XSAVE */
-	RTE_CPUFLAG_OSXSAVE,                /**< OSXSAVE */
-	RTE_CPUFLAG_AVX,                    /**< AVX */
-	RTE_CPUFLAG_F16C,                   /**< F16C */
-	RTE_CPUFLAG_RDRAND,                 /**< RDRAND */
-
-	/* (EAX 01h) EDX features */
-	RTE_CPUFLAG_FPU,                    /**< FPU */
-	RTE_CPUFLAG_VME,                    /**< VME */
-	RTE_CPUFLAG_DE,                     /**< DE */
-	RTE_CPUFLAG_PSE,                    /**< PSE */
-	RTE_CPUFLAG_TSC,                    /**< TSC */
-	RTE_CPUFLAG_MSR,                    /**< MSR */
-	RTE_CPUFLAG_PAE,                    /**< PAE */
-	RTE_CPUFLAG_MCE,                    /**< MCE */
-	RTE_CPUFLAG_CX8,                    /**< CX8 */
-	RTE_CPUFLAG_APIC,                   /**< APIC */
-	RTE_CPUFLAG_SEP,                    /**< SEP */
-	RTE_CPUFLAG_MTRR,                   /**< MTRR */
-	RTE_CPUFLAG_PGE,                    /**< PGE */
-	RTE_CPUFLAG_MCA,                    /**< MCA */
-	RTE_CPUFLAG_CMOV,                   /**< CMOV */
-	RTE_CPUFLAG_PAT,                    /**< PAT */
-	RTE_CPUFLAG_PSE36,                  /**< PSE36 */
-	RTE_CPUFLAG_PSN,                    /**< PSN */
-	RTE_CPUFLAG_CLFSH,                  /**< CLFSH */
-	RTE_CPUFLAG_DS,                     /**< DS */
-	RTE_CPUFLAG_ACPI,                   /**< ACPI */
-	RTE_CPUFLAG_MMX,                    /**< MMX */
-	RTE_CPUFLAG_FXSR,                   /**< FXSR */
-	RTE_CPUFLAG_SSE,                    /**< SSE */
-	RTE_CPUFLAG_SSE2,                   /**< SSE2 */
-	RTE_CPUFLAG_SS,                     /**< SS */
-	RTE_CPUFLAG_HTT,                    /**< HTT */
-	RTE_CPUFLAG_TM,                     /**< TM */
-	RTE_CPUFLAG_PBE,                    /**< PBE */
-
-	/* (EAX 06h) EAX features */
-	RTE_CPUFLAG_DIGTEMP,                /**< DIGTEMP */
-	RTE_CPUFLAG_TRBOBST,                /**< TRBOBST */
-	RTE_CPUFLAG_ARAT,                   /**< ARAT */
-	RTE_CPUFLAG_PLN,                    /**< PLN */
-	RTE_CPUFLAG_ECMD,                   /**< ECMD */
-	RTE_CPUFLAG_PTM,                    /**< PTM */
-
-	/* (EAX 06h) ECX features */
-	RTE_CPUFLAG_MPERF_APERF_MSR,        /**< MPERF_APERF_MSR */
-	RTE_CPUFLAG_ACNT2,                  /**< ACNT2 */
-	RTE_CPUFLAG_ENERGY_EFF,             /**< ENERGY_EFF */
-
-	/* (EAX 07h, ECX 0h) EBX features */
-	RTE_CPUFLAG_FSGSBASE,               /**< FSGSBASE */
-	RTE_CPUFLAG_BMI1,                   /**< BMI1 */
-	RTE_CPUFLAG_HLE,                    /**< Hardware Lock elision */
-	RTE_CPUFLAG_AVX2,                   /**< AVX2 */
-	RTE_CPUFLAG_SMEP,                   /**< SMEP */
-	RTE_CPUFLAG_BMI2,                   /**< BMI2 */
-	RTE_CPUFLAG_ERMS,                   /**< ERMS */
-	RTE_CPUFLAG_INVPCID,                /**< INVPCID */
-	RTE_CPUFLAG_RTM,                    /**< Transactional memory */
-
-	/* (EAX 80000001h) ECX features */
-	RTE_CPUFLAG_LAHF_SAHF,              /**< LAHF_SAHF */
-	RTE_CPUFLAG_LZCNT,                  /**< LZCNT */
-
-	/* (EAX 80000001h) EDX features */
-	RTE_CPUFLAG_SYSCALL,                /**< SYSCALL */
-	RTE_CPUFLAG_XD,                     /**< XD */
-	RTE_CPUFLAG_1GB_PG,                 /**< 1GB_PG */
-	RTE_CPUFLAG_RDTSCP,                 /**< RDTSCP */
-	RTE_CPUFLAG_EM64T,                  /**< EM64T */
-
-	/* (EAX 80000007h) EDX features */
-	RTE_CPUFLAG_INVTSC,                 /**< INVTSC */
-
-	/* The last item */
-	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
-};
-
-
-/**
- * Function for checking a CPU flag availability
- *
- * @param flag
- *     CPU flag to query CPU for
- * @return
- *     1 if flag is available
- *     0 if flag is not available
- *     -ENOENT if flag is invalid
- */
-int
-rte_cpu_get_flag_enabled(enum rte_cpu_flag_t flag);
-
-/**
- * This function checks that the currently used CPU supports the CPU features
- * that were specified at compile time. It is called automatically within the
- * EAL, so does not need to be used by applications.
- */
-void
-rte_cpu_check_supported(void);
-
-#ifdef __cplusplus
-}
-#endif
-
-
-#endif /* _RTE_CPUFLAGS_H_ */
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 09/10] eal: install all arch headers
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
                   ` (7 preceding siblings ...)
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 08/10] eal: split CPU flags operations " David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 10/10] eal: factorize x86 headers David Marchand
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

Architecture can have their own specific headers, just install all headers from
arch directory.

Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/common/Makefile |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index ddf8b48..499ba4d 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -48,11 +48,13 @@ endif
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h
-ARCH_INC := $(GENERIC_INC)
+# defined in mk/arch/$(RTE_ARCH)/rte.vars.mk
+ARCH_DIR ?= $(RTE_ARCH)
+ARCH_INC := $(notdir $(wildcard $(RTE_SDK)/lib/librte_eal/common/include/arch/$(ARCH_DIR)/*.h))
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
-	$(addprefix include/arch/$(RTE_ARCH)/,$(ARCH_INC))
+	$(addprefix include/arch/$(ARCH_DIR)/,$(ARCH_INC))
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/generic := \
 	$(addprefix include/generic/,$(GENERIC_INC))
 
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v3 10/10] eal: factorize x86 headers
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
                   ` (8 preceding siblings ...)
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 09/10] eal: install all arch headers David Marchand
@ 2014-10-28 12:50 ` David Marchand
  2014-11-03  8:10 ` [dpdk-dev] [PATCH v3 00/10] split architecture specific operations Chao CH Zhu
  2014-11-05  2:39 ` Chao Zhu
  11 siblings, 0 replies; 14+ messages in thread
From: David Marchand @ 2014-10-28 12:50 UTC (permalink / raw)
  To: dev; +Cc: bjzhuc

No need to keep the same code duplicated for 32 and 64bits x86.

Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 .../common/include/arch/i686/rte_atomic.h          |  393 --------------------
 .../common/include/arch/i686/rte_byteorder.h       |  129 -------
 .../common/include/arch/i686/rte_cpuflags.h        |  310 ---------------
 .../common/include/arch/i686/rte_cycles.h          |  121 ------
 .../common/include/arch/i686/rte_memcpy.h          |  297 ---------------
 .../common/include/arch/i686/rte_prefetch.h        |   62 ---
 .../common/include/arch/i686/rte_spinlock.h        |   94 -----
 .../common/include/arch/x86/rte_atomic.h           |  216 +++++++++++
 .../common/include/arch/x86/rte_atomic_32.h        |  222 +++++++++++
 .../common/include/arch/x86/rte_atomic_64.h        |  191 ++++++++++
 .../common/include/arch/x86/rte_byteorder.h        |  121 ++++++
 .../common/include/arch/x86/rte_byteorder_32.h     |   51 +++
 .../common/include/arch/x86/rte_byteorder_64.h     |   52 +++
 .../common/include/arch/x86/rte_cpuflags.h         |  310 +++++++++++++++
 .../common/include/arch/x86/rte_cycles.h           |  121 ++++++
 .../common/include/arch/x86/rte_memcpy.h           |  297 +++++++++++++++
 .../common/include/arch/x86/rte_prefetch.h         |   62 +++
 .../common/include/arch/x86/rte_spinlock.h         |   94 +++++
 .../common/include/arch/x86_64/rte_atomic.h        |  362 ------------------
 .../common/include/arch/x86_64/rte_byteorder.h     |  130 -------
 .../common/include/arch/x86_64/rte_cpuflags.h      |  310 ---------------
 .../common/include/arch/x86_64/rte_cycles.h        |  121 ------
 .../common/include/arch/x86_64/rte_memcpy.h        |  297 ---------------
 .../common/include/arch/x86_64/rte_prefetch.h      |   62 ---
 .../common/include/arch/x86_64/rte_spinlock.h      |   94 -----
 mk/arch/i686/rte.vars.mk                           |    2 +
 mk/arch/x86_64/rte.vars.mk                         |    2 +
 27 files changed, 1741 insertions(+), 2782 deletions(-)
 delete mode 100644 lib/librte_eal/common/include/arch/i686/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/arch/i686/rte_byteorder.h
 delete mode 100644 lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
 delete mode 100644 lib/librte_eal/common/include/arch/i686/rte_cycles.h
 delete mode 100644 lib/librte_eal/common/include/arch/i686/rte_memcpy.h
 delete mode 100644 lib/librte_eal/common/include/arch/i686/rte_prefetch.h
 delete mode 100644 lib/librte_eal/common/include/arch/i686/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic_32.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_byteorder_32.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_byteorder_64.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
 delete mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
 delete mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
 delete mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
 delete mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
 delete mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h

diff --git a/lib/librte_eal/common/include/arch/i686/rte_atomic.h b/lib/librte_eal/common/include/arch/i686/rte_atomic.h
deleted file mode 100644
index 8330250..0000000
--- a/lib/librte_eal/common/include/arch/i686/rte_atomic.h
+++ /dev/null
@@ -1,393 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-/*
- * Inspired from FreeBSD src/sys/i386/include/atomic.h
- * Copyright (c) 1998 Doug Rabson
- * All rights reserved.
- */
-
-#ifndef _RTE_ATOMIC_I686_H_
-#define _RTE_ATOMIC_I686_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <emmintrin.h>
-#include "generic/rte_atomic.h"
-
-#if RTE_MAX_LCORE == 1
-#define MPLOCKED                        /**< No need to insert MP lock prefix. */
-#else
-#define MPLOCKED        "lock ; "       /**< Insert MP lock prefix. */
-#endif
-
-#define	rte_mb() _mm_mfence()
-
-#define	rte_wmb() _mm_sfence()
-
-#define	rte_rmb() _mm_lfence()
-
-/*------------------------- 16 bit atomic operations -------------------------*/
-
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgw %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-	return res;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
-	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"incw %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"decw %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incw %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(MPLOCKED
-			"decw %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgl %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-	return res;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
-	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"incl %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"decl %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incl %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(MPLOCKED
-			"decl %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-}
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
-	uint8_t res;
-	union {
-		struct {
-			uint32_t l32;
-			uint32_t h32;
-		};
-		uint64_t u64;
-	} _exp, _src;
-
-	_exp.u64 = exp;
-	_src.u64 = src;
-
-#ifndef __PIC__
-    asm volatile (
-            MPLOCKED
-            "cmpxchg8b (%[dst]);"
-            "setz %[res];"
-            : [res] "=a" (res)      /* result in eax */
-            : [dst] "S" (dst),      /* esi */
-             "b" (_src.l32),       /* ebx */
-             "c" (_src.h32),       /* ecx */
-             "a" (_exp.l32),       /* eax */
-             "d" (_exp.h32)        /* edx */
-			: "memory" );           /* no-clobber list */
-#else
-	asm volatile (
-            "mov %%ebx, %%edi\n"
-			MPLOCKED
-			"cmpxchg8b (%[dst]);"
-			"setz %[res];"
-            "xchgl %%ebx, %%edi;\n"
-			: [res] "=a" (res)      /* result in eax */
-			: [dst] "S" (dst),      /* esi */
-			  "D" (_src.l32),       /* ebx */
-			  "c" (_src.h32),       /* ecx */
-			  "a" (_exp.l32),       /* eax */
-			  "d" (_exp.h32)        /* edx */
-			: "memory" );           /* no-clobber list */
-#endif
-
-	return res;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, 0);
-	}
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		/* replace the value by itself */
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp);
-	}
-	return tmp;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, new_value);
-	}
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp + inc);
-	}
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp - dec);
-	}
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
-	rte_atomic64_add(v, 1);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
-	rte_atomic64_sub(v, 1);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp + inc);
-	}
-
-	return tmp + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
-	int success = 0;
-	uint64_t tmp;
-
-	while (success == 0) {
-		tmp = v->cnt;
-		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
-		                              tmp, tmp - dec);
-	}
-
-	return tmp - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_add_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
-	return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
-	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
-	rte_atomic64_set(v, 0);
-}
-#endif
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_ATOMIC_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/i686/rte_byteorder.h b/lib/librte_eal/common/include/arch/i686/rte_byteorder.h
deleted file mode 100644
index 6d5b23e..0000000
--- a/lib/librte_eal/common/include/arch/i686/rte_byteorder.h
+++ /dev/null
@@ -1,129 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_BYTEORDER_I686_H_
-#define _RTE_BYTEORDER_I686_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_byteorder.h"
-
-/*
- * An architecture-optimized byte swap for a 16-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap16().
- */
-static inline uint16_t rte_arch_bswap16(uint16_t _x)
-{
-	register uint16_t x = _x;
-	asm volatile ("xchgb %b[x1],%h[x2]"
-		      : [x1] "=Q" (x)
-		      : [x2] "0" (x)
-		      );
-	return x;
-}
-
-/*
- * An architecture-optimized byte swap for a 32-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap32().
- */
-static inline uint32_t rte_arch_bswap32(uint32_t _x)
-{
-	register uint32_t x = _x;
-	asm volatile ("bswap %[x]"
-		      : [x] "+r" (x)
-		      );
-	return x;
-}
-
-/*
- * An architecture-optimized byte swap for a 64-bit value.
- *
-  * Do not use this function directly. The preferred function is rte_bswap64().
- */
-/* Compat./Leg. mode */
-static inline uint64_t rte_arch_bswap64(uint64_t x)
-{
-	uint64_t ret = 0;
-	ret |= ((uint64_t)rte_arch_bswap32(x & 0xffffffffUL) << 32);
-	ret |= ((uint64_t)rte_arch_bswap32((x >> 32) & 0xffffffffUL));
-	return ret;
-}
-
-#ifndef RTE_FORCE_INTRINSICS
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap16(x) :		\
-				   rte_arch_bswap16(x)))
-
-#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap32(x) :		\
-				   rte_arch_bswap32(x)))
-
-#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap64(x) :		\
-				   rte_arch_bswap64(x)))
-#else
-/*
- * __builtin_bswap16 is only available gcc 4.8 and upwards
- */
-#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap16(x) :		\
-				   rte_arch_bswap16(x)))
-#endif
-#endif
-
-#define rte_cpu_to_le_16(x) (x)
-#define rte_cpu_to_le_32(x) (x)
-#define rte_cpu_to_le_64(x) (x)
-
-#define rte_cpu_to_be_16(x) rte_bswap16(x)
-#define rte_cpu_to_be_32(x) rte_bswap32(x)
-#define rte_cpu_to_be_64(x) rte_bswap64(x)
-
-#define rte_le_to_cpu_16(x) (x)
-#define rte_le_to_cpu_32(x) (x)
-#define rte_le_to_cpu_64(x) (x)
-
-#define rte_be_to_cpu_16(x) rte_bswap16(x)
-#define rte_be_to_cpu_32(x) rte_bswap32(x)
-#define rte_be_to_cpu_64(x) rte_bswap64(x)
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_BYTEORDER_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/i686/rte_cpuflags.h b/lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
deleted file mode 100644
index fd27e8f..0000000
--- a/lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
+++ /dev/null
@@ -1,310 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_CPUFLAGS_I686_H_
-#define _RTE_CPUFLAGS_I686_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <stdlib.h>
-#include <stdio.h>
-#include <errno.h>
-#include <stdint.h>
-
-#include "generic/rte_cpuflags.h"
-
-enum rte_cpu_flag_t {
-	/* (EAX 01h) ECX features*/
-	RTE_CPUFLAG_SSE3 = 0,               /**< SSE3 */
-	RTE_CPUFLAG_PCLMULQDQ,              /**< PCLMULQDQ */
-	RTE_CPUFLAG_DTES64,                 /**< DTES64 */
-	RTE_CPUFLAG_MONITOR,                /**< MONITOR */
-	RTE_CPUFLAG_DS_CPL,                 /**< DS_CPL */
-	RTE_CPUFLAG_VMX,                    /**< VMX */
-	RTE_CPUFLAG_SMX,                    /**< SMX */
-	RTE_CPUFLAG_EIST,                   /**< EIST */
-	RTE_CPUFLAG_TM2,                    /**< TM2 */
-	RTE_CPUFLAG_SSSE3,                  /**< SSSE3 */
-	RTE_CPUFLAG_CNXT_ID,                /**< CNXT_ID */
-	RTE_CPUFLAG_FMA,                    /**< FMA */
-	RTE_CPUFLAG_CMPXCHG16B,             /**< CMPXCHG16B */
-	RTE_CPUFLAG_XTPR,                   /**< XTPR */
-	RTE_CPUFLAG_PDCM,                   /**< PDCM */
-	RTE_CPUFLAG_PCID,                   /**< PCID */
-	RTE_CPUFLAG_DCA,                    /**< DCA */
-	RTE_CPUFLAG_SSE4_1,                 /**< SSE4_1 */
-	RTE_CPUFLAG_SSE4_2,                 /**< SSE4_2 */
-	RTE_CPUFLAG_X2APIC,                 /**< X2APIC */
-	RTE_CPUFLAG_MOVBE,                  /**< MOVBE */
-	RTE_CPUFLAG_POPCNT,                 /**< POPCNT */
-	RTE_CPUFLAG_TSC_DEADLINE,           /**< TSC_DEADLINE */
-	RTE_CPUFLAG_AES,                    /**< AES */
-	RTE_CPUFLAG_XSAVE,                  /**< XSAVE */
-	RTE_CPUFLAG_OSXSAVE,                /**< OSXSAVE */
-	RTE_CPUFLAG_AVX,                    /**< AVX */
-	RTE_CPUFLAG_F16C,                   /**< F16C */
-	RTE_CPUFLAG_RDRAND,                 /**< RDRAND */
-
-	/* (EAX 01h) EDX features */
-	RTE_CPUFLAG_FPU,                    /**< FPU */
-	RTE_CPUFLAG_VME,                    /**< VME */
-	RTE_CPUFLAG_DE,                     /**< DE */
-	RTE_CPUFLAG_PSE,                    /**< PSE */
-	RTE_CPUFLAG_TSC,                    /**< TSC */
-	RTE_CPUFLAG_MSR,                    /**< MSR */
-	RTE_CPUFLAG_PAE,                    /**< PAE */
-	RTE_CPUFLAG_MCE,                    /**< MCE */
-	RTE_CPUFLAG_CX8,                    /**< CX8 */
-	RTE_CPUFLAG_APIC,                   /**< APIC */
-	RTE_CPUFLAG_SEP,                    /**< SEP */
-	RTE_CPUFLAG_MTRR,                   /**< MTRR */
-	RTE_CPUFLAG_PGE,                    /**< PGE */
-	RTE_CPUFLAG_MCA,                    /**< MCA */
-	RTE_CPUFLAG_CMOV,                   /**< CMOV */
-	RTE_CPUFLAG_PAT,                    /**< PAT */
-	RTE_CPUFLAG_PSE36,                  /**< PSE36 */
-	RTE_CPUFLAG_PSN,                    /**< PSN */
-	RTE_CPUFLAG_CLFSH,                  /**< CLFSH */
-	RTE_CPUFLAG_DS,                     /**< DS */
-	RTE_CPUFLAG_ACPI,                   /**< ACPI */
-	RTE_CPUFLAG_MMX,                    /**< MMX */
-	RTE_CPUFLAG_FXSR,                   /**< FXSR */
-	RTE_CPUFLAG_SSE,                    /**< SSE */
-	RTE_CPUFLAG_SSE2,                   /**< SSE2 */
-	RTE_CPUFLAG_SS,                     /**< SS */
-	RTE_CPUFLAG_HTT,                    /**< HTT */
-	RTE_CPUFLAG_TM,                     /**< TM */
-	RTE_CPUFLAG_PBE,                    /**< PBE */
-
-	/* (EAX 06h) EAX features */
-	RTE_CPUFLAG_DIGTEMP,                /**< DIGTEMP */
-	RTE_CPUFLAG_TRBOBST,                /**< TRBOBST */
-	RTE_CPUFLAG_ARAT,                   /**< ARAT */
-	RTE_CPUFLAG_PLN,                    /**< PLN */
-	RTE_CPUFLAG_ECMD,                   /**< ECMD */
-	RTE_CPUFLAG_PTM,                    /**< PTM */
-
-	/* (EAX 06h) ECX features */
-	RTE_CPUFLAG_MPERF_APERF_MSR,        /**< MPERF_APERF_MSR */
-	RTE_CPUFLAG_ACNT2,                  /**< ACNT2 */
-	RTE_CPUFLAG_ENERGY_EFF,             /**< ENERGY_EFF */
-
-	/* (EAX 07h, ECX 0h) EBX features */
-	RTE_CPUFLAG_FSGSBASE,               /**< FSGSBASE */
-	RTE_CPUFLAG_BMI1,                   /**< BMI1 */
-	RTE_CPUFLAG_HLE,                    /**< Hardware Lock elision */
-	RTE_CPUFLAG_AVX2,                   /**< AVX2 */
-	RTE_CPUFLAG_SMEP,                   /**< SMEP */
-	RTE_CPUFLAG_BMI2,                   /**< BMI2 */
-	RTE_CPUFLAG_ERMS,                   /**< ERMS */
-	RTE_CPUFLAG_INVPCID,                /**< INVPCID */
-	RTE_CPUFLAG_RTM,                    /**< Transactional memory */
-
-	/* (EAX 80000001h) ECX features */
-	RTE_CPUFLAG_LAHF_SAHF,              /**< LAHF_SAHF */
-	RTE_CPUFLAG_LZCNT,                  /**< LZCNT */
-
-	/* (EAX 80000001h) EDX features */
-	RTE_CPUFLAG_SYSCALL,                /**< SYSCALL */
-	RTE_CPUFLAG_XD,                     /**< XD */
-	RTE_CPUFLAG_1GB_PG,                 /**< 1GB_PG */
-	RTE_CPUFLAG_RDTSCP,                 /**< RDTSCP */
-	RTE_CPUFLAG_EM64T,                  /**< EM64T */
-
-	/* (EAX 80000007h) EDX features */
-	RTE_CPUFLAG_INVTSC,                 /**< INVTSC */
-
-	/* The last item */
-	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
-};
-
-enum cpu_register_t {
-	REG_EAX = 0,
-	REG_EBX,
-	REG_ECX,
-	REG_EDX,
-};
-
-static const struct feature_entry cpu_feature_table[] = {
-	FEAT_DEF(SSE3, 0x00000001, 0, REG_ECX,  0)
-	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, REG_ECX,  1)
-	FEAT_DEF(DTES64, 0x00000001, 0, REG_ECX,  2)
-	FEAT_DEF(MONITOR, 0x00000001, 0, REG_ECX,  3)
-	FEAT_DEF(DS_CPL, 0x00000001, 0, REG_ECX,  4)
-	FEAT_DEF(VMX, 0x00000001, 0, REG_ECX,  5)
-	FEAT_DEF(SMX, 0x00000001, 0, REG_ECX,  6)
-	FEAT_DEF(EIST, 0x00000001, 0, REG_ECX,  7)
-	FEAT_DEF(TM2, 0x00000001, 0, REG_ECX,  8)
-	FEAT_DEF(SSSE3, 0x00000001, 0, REG_ECX,  9)
-	FEAT_DEF(CNXT_ID, 0x00000001, 0, REG_ECX, 10)
-	FEAT_DEF(FMA, 0x00000001, 0, REG_ECX, 12)
-	FEAT_DEF(CMPXCHG16B, 0x00000001, 0, REG_ECX, 13)
-	FEAT_DEF(XTPR, 0x00000001, 0, REG_ECX, 14)
-	FEAT_DEF(PDCM, 0x00000001, 0, REG_ECX, 15)
-	FEAT_DEF(PCID, 0x00000001, 0, REG_ECX, 17)
-	FEAT_DEF(DCA, 0x00000001, 0, REG_ECX, 18)
-	FEAT_DEF(SSE4_1, 0x00000001, 0, REG_ECX, 19)
-	FEAT_DEF(SSE4_2, 0x00000001, 0, REG_ECX, 20)
-	FEAT_DEF(X2APIC, 0x00000001, 0, REG_ECX, 21)
-	FEAT_DEF(MOVBE, 0x00000001, 0, REG_ECX, 22)
-	FEAT_DEF(POPCNT, 0x00000001, 0, REG_ECX, 23)
-	FEAT_DEF(TSC_DEADLINE, 0x00000001, 0, REG_ECX, 24)
-	FEAT_DEF(AES, 0x00000001, 0, REG_ECX, 25)
-	FEAT_DEF(XSAVE, 0x00000001, 0, REG_ECX, 26)
-	FEAT_DEF(OSXSAVE, 0x00000001, 0, REG_ECX, 27)
-	FEAT_DEF(AVX, 0x00000001, 0, REG_ECX, 28)
-	FEAT_DEF(F16C, 0x00000001, 0, REG_ECX, 29)
-	FEAT_DEF(RDRAND, 0x00000001, 0, REG_ECX, 30)
-
-	FEAT_DEF(FPU, 0x00000001, 0, REG_EDX,  0)
-	FEAT_DEF(VME, 0x00000001, 0, REG_EDX,  1)
-	FEAT_DEF(DE, 0x00000001, 0, REG_EDX,  2)
-	FEAT_DEF(PSE, 0x00000001, 0, REG_EDX,  3)
-	FEAT_DEF(TSC, 0x00000001, 0, REG_EDX,  4)
-	FEAT_DEF(MSR, 0x00000001, 0, REG_EDX,  5)
-	FEAT_DEF(PAE, 0x00000001, 0, REG_EDX,  6)
-	FEAT_DEF(MCE, 0x00000001, 0, REG_EDX,  7)
-	FEAT_DEF(CX8, 0x00000001, 0, REG_EDX,  8)
-	FEAT_DEF(APIC, 0x00000001, 0, REG_EDX,  9)
-	FEAT_DEF(SEP, 0x00000001, 0, REG_EDX, 11)
-	FEAT_DEF(MTRR, 0x00000001, 0, REG_EDX, 12)
-	FEAT_DEF(PGE, 0x00000001, 0, REG_EDX, 13)
-	FEAT_DEF(MCA, 0x00000001, 0, REG_EDX, 14)
-	FEAT_DEF(CMOV, 0x00000001, 0, REG_EDX, 15)
-	FEAT_DEF(PAT, 0x00000001, 0, REG_EDX, 16)
-	FEAT_DEF(PSE36, 0x00000001, 0, REG_EDX, 17)
-	FEAT_DEF(PSN, 0x00000001, 0, REG_EDX, 18)
-	FEAT_DEF(CLFSH, 0x00000001, 0, REG_EDX, 19)
-	FEAT_DEF(DS, 0x00000001, 0, REG_EDX, 21)
-	FEAT_DEF(ACPI, 0x00000001, 0, REG_EDX, 22)
-	FEAT_DEF(MMX, 0x00000001, 0, REG_EDX, 23)
-	FEAT_DEF(FXSR, 0x00000001, 0, REG_EDX, 24)
-	FEAT_DEF(SSE, 0x00000001, 0, REG_EDX, 25)
-	FEAT_DEF(SSE2, 0x00000001, 0, REG_EDX, 26)
-	FEAT_DEF(SS, 0x00000001, 0, REG_EDX, 27)
-	FEAT_DEF(HTT, 0x00000001, 0, REG_EDX, 28)
-	FEAT_DEF(TM, 0x00000001, 0, REG_EDX, 29)
-	FEAT_DEF(PBE, 0x00000001, 0, REG_EDX, 31)
-
-	FEAT_DEF(DIGTEMP, 0x00000006, 0, REG_EAX,  0)
-	FEAT_DEF(TRBOBST, 0x00000006, 0, REG_EAX,  1)
-	FEAT_DEF(ARAT, 0x00000006, 0, REG_EAX,  2)
-	FEAT_DEF(PLN, 0x00000006, 0, REG_EAX,  4)
-	FEAT_DEF(ECMD, 0x00000006, 0, REG_EAX,  5)
-	FEAT_DEF(PTM, 0x00000006, 0, REG_EAX,  6)
-
-	FEAT_DEF(MPERF_APERF_MSR, 0x00000006, 0, REG_ECX,  0)
-	FEAT_DEF(ACNT2, 0x00000006, 0, REG_ECX,  1)
-	FEAT_DEF(ENERGY_EFF, 0x00000006, 0, REG_ECX,  3)
-
-	FEAT_DEF(FSGSBASE, 0x00000007, 0, REG_EBX,  0)
-	FEAT_DEF(BMI1, 0x00000007, 0, REG_EBX,  2)
-	FEAT_DEF(HLE, 0x00000007, 0, REG_EBX,  4)
-	FEAT_DEF(AVX2, 0x00000007, 0, REG_EBX,  5)
-	FEAT_DEF(SMEP, 0x00000007, 0, REG_EBX,  6)
-	FEAT_DEF(BMI2, 0x00000007, 0, REG_EBX,  7)
-	FEAT_DEF(ERMS, 0x00000007, 0, REG_EBX,  8)
-	FEAT_DEF(INVPCID, 0x00000007, 0, REG_EBX, 10)
-	FEAT_DEF(RTM, 0x00000007, 0, REG_EBX, 11)
-
-	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, REG_ECX,  0)
-	FEAT_DEF(LZCNT, 0x80000001, 0, REG_ECX,  4)
-
-	FEAT_DEF(SYSCALL, 0x80000001, 0, REG_EDX, 11)
-	FEAT_DEF(XD, 0x80000001, 0, REG_EDX, 20)
-	FEAT_DEF(1GB_PG, 0x80000001, 0, REG_EDX, 26)
-	FEAT_DEF(RDTSCP, 0x80000001, 0, REG_EDX, 27)
-	FEAT_DEF(EM64T, 0x80000001, 0, REG_EDX, 29)
-
-	FEAT_DEF(INVTSC, 0x80000007, 0, REG_EDX,  8)
-};
-
-static inline void
-rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out)
-{
-#if defined(__i386__) && defined(__PIC__)
-    /* %ebx is a forbidden register if we compile with -fPIC or -fPIE */
-    asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0"
-		 : "=r" (out[REG_EBX]),
-		   "=a" (out[REG_EAX]),
-		   "=c" (out[REG_ECX]),
-		   "=d" (out[REG_EDX])
-		 : "a" (leaf), "c" (subleaf));
-#else
-
-    asm volatile("cpuid"
-		 : "=a" (out[REG_EAX]),
-		   "=b" (out[REG_EBX]),
-		   "=c" (out[REG_ECX]),
-		   "=d" (out[REG_EDX])
-		 : "a" (leaf), "c" (subleaf));
-
-#endif
-}
-
-static inline int
-rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
-{
-	const struct feature_entry *feat;
-	cpuid_registers_t regs;
-
-
-	if (feature >= RTE_CPUFLAG_NUMFLAGS)
-		/* Flag does not match anything in the feature tables */
-		return -ENOENT;
-
-	feat = &cpu_feature_table[feature];
-
-	if (!feat->leaf)
-		/* This entry in the table wasn't filled out! */
-		return -EFAULT;
-
-	rte_cpu_get_features(feat->leaf & 0xffff0000, 0, regs);
-	if (((regs[REG_EAX] ^ feat->leaf) & 0xffff0000) ||
-	      regs[REG_EAX] < feat->leaf)
-		return 0;
-
-	/* get the cpuid leaf containing the desired feature */
-	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
-
-	/* check if the feature is enabled */
-	return (regs[feat->reg] >> feat->bit) & 1;
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_CPUFLAGS_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/i686/rte_cycles.h b/lib/librte_eal/common/include/arch/i686/rte_cycles.h
deleted file mode 100644
index 6e47040..0000000
--- a/lib/librte_eal/common/include/arch/i686/rte_cycles.h
+++ /dev/null
@@ -1,121 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-/*   BSD LICENSE
- *
- *   Copyright(c) 2013 6WIND.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of 6WIND S.A. nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_CYCLES_I686_H_
-#define _RTE_CYCLES_I686_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_cycles.h"
-
-#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
-/* Global switch to use VMWARE mapping of TSC instead of RDTSC */
-extern int rte_cycles_vmware_tsc_map;
-#include <rte_branch_prediction.h>
-#endif
-
-static inline uint64_t
-rte_rdtsc(void)
-{
-	union {
-		uint64_t tsc_64;
-		struct {
-			uint32_t lo_32;
-			uint32_t hi_32;
-		};
-	} tsc;
-
-#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
-	if (unlikely(rte_cycles_vmware_tsc_map)) {
-		/* ecx = 0x10000 corresponds to the physical TSC for VMware */
-		asm volatile("rdpmc" :
-		             "=a" (tsc.lo_32),
-		             "=d" (tsc.hi_32) :
-		             "c"(0x10000));
-		return tsc.tsc_64;
-	}
-#endif
-
-	asm volatile("rdtsc" :
-		     "=a" (tsc.lo_32),
-		     "=d" (tsc.hi_32));
-	return tsc.tsc_64;
-}
-
-static inline uint64_t
-rte_rdtsc_precise(void)
-{
-	rte_mb();
-	return rte_rdtsc();
-}
-
-static inline uint64_t
-rte_get_tsc_cycles(void) { return rte_rdtsc(); }
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_CYCLES_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/i686/rte_memcpy.h b/lib/librte_eal/common/include/arch/i686/rte_memcpy.h
deleted file mode 100644
index b8513f6..0000000
--- a/lib/librte_eal/common/include/arch/i686/rte_memcpy.h
+++ /dev/null
@@ -1,297 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_MEMCPY_I686_H_
-#define _RTE_MEMCPY_I686_H_
-
-#include <stdint.h>
-#include <string.h>
-#include <emmintrin.h>
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_memcpy.h"
-
-#ifdef __INTEL_COMPILER
-#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
-#endif
-
-static inline void
-rte_mov16(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		: [reg_a] "=x" (reg_a)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-static inline void
-rte_mov32(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-static inline void
-rte_mov48(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-static inline void
-rte_mov64(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c, reg_d;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu 48(%[src]), %[reg_d]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		"movdqu %[reg_d], 48(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c),
-		  [reg_d] "=x" (reg_d)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-static inline void
-rte_mov128(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu 48(%[src]), %[reg_d]\n\t"
-		"movdqu 64(%[src]), %[reg_e]\n\t"
-		"movdqu 80(%[src]), %[reg_f]\n\t"
-		"movdqu 96(%[src]), %[reg_g]\n\t"
-		"movdqu 112(%[src]), %[reg_h]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		"movdqu %[reg_d], 48(%[dst])\n\t"
-		"movdqu %[reg_e], 64(%[dst])\n\t"
-		"movdqu %[reg_f], 80(%[dst])\n\t"
-		"movdqu %[reg_g], 96(%[dst])\n\t"
-		"movdqu %[reg_h], 112(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c),
-		  [reg_d] "=x" (reg_d),
-		  [reg_e] "=x" (reg_e),
-		  [reg_f] "=x" (reg_f),
-		  [reg_g] "=x" (reg_g),
-		  [reg_h] "=x" (reg_h)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-#ifdef __INTEL_COMPILER
-#pragma warning(enable:593)
-#endif
-
-static inline void
-rte_mov256(uint8_t *dst, const uint8_t *src)
-{
-	rte_mov128(dst, src);
-	rte_mov128(dst + 128, src + 128);
-}
-
-#define rte_memcpy(dst, src, n)              \
-	((__builtin_constant_p(n)) ?          \
-	memcpy((dst), (src), (n)) :          \
-	rte_memcpy_func((dst), (src), (n)))
-
-static inline void *
-rte_memcpy_func(void *dst, const void *src, size_t n)
-{
-	void *ret = dst;
-
-	/* We can't copy < 16 bytes using XMM registers so do it manually. */
-	if (n < 16) {
-		if (n & 0x01) {
-			*(uint8_t *)dst = *(const uint8_t *)src;
-			dst = (uint8_t *)dst + 1;
-			src = (const uint8_t *)src + 1;
-		}
-		if (n & 0x02) {
-			*(uint16_t *)dst = *(const uint16_t *)src;
-			dst = (uint16_t *)dst + 1;
-			src = (const uint16_t *)src + 1;
-		}
-		if (n & 0x04) {
-			*(uint32_t *)dst = *(const uint32_t *)src;
-			dst = (uint32_t *)dst + 1;
-			src = (const uint32_t *)src + 1;
-		}
-		if (n & 0x08) {
-			*(uint64_t *)dst = *(const uint64_t *)src;
-		}
-		return ret;
-	}
-
-	/* Special fast cases for <= 128 bytes */
-	if (n <= 32) {
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
-		return ret;
-	}
-
-	if (n <= 64) {
-		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
-		return ret;
-	}
-
-	if (n <= 128) {
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
-		return ret;
-	}
-
-	/*
-	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
-	 * copies was found to be faster than doing 128 and 32 byte copies as
-	 * well.
-	 */
-	for ( ; n >= 256; n -= 256) {
-		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
-		dst = (uint8_t *)dst + 256;
-		src = (const uint8_t *)src + 256;
-	}
-
-	/*
-	 * We split the remaining bytes (which will be less than 256) into
-	 * 64byte (2^6) chunks.
-	 * Using incrementing integers in the case labels of a switch statement
-	 * enourages the compiler to use a jump table. To get incrementing
-	 * integers, we shift the 2 relevant bits to the LSB position to first
-	 * get decrementing integers, and then subtract.
-	 */
-	switch (3 - (n >> 6)) {
-	case 0x00:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	case 0x01:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	case 0x02:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	default:
-		;
-	}
-
-	/*
-	 * We split the remaining bytes (which will be less than 64) into
-	 * 16byte (2^4) chunks, using the same switch structure as above.
-	 */
-	switch (3 - (n >> 4)) {
-	case 0x00:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	case 0x01:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	case 0x02:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	default:
-		;
-	}
-
-	/* Copy any remaining bytes, without going beyond end of buffers */
-	if (n != 0) {
-		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
-	}
-	return ret;
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_MEMCPY_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/i686/rte_prefetch.h b/lib/librte_eal/common/include/arch/i686/rte_prefetch.h
deleted file mode 100644
index 5fbd98e..0000000
--- a/lib/librte_eal/common/include/arch/i686/rte_prefetch.h
+++ /dev/null
@@ -1,62 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_PREFETCH_I686_H_
-#define _RTE_PREFETCH_I686_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_prefetch.h"
-
-static inline void rte_prefetch0(volatile void *p)
-{
-	asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-static inline void rte_prefetch1(volatile void *p)
-{
-	asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-static inline void rte_prefetch2(volatile void *p)
-{
-	asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_PREFETCH_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/i686/rte_spinlock.h b/lib/librte_eal/common/include/arch/i686/rte_spinlock.h
deleted file mode 100644
index 60cfd4d..0000000
--- a/lib/librte_eal/common/include/arch/i686/rte_spinlock.h
+++ /dev/null
@@ -1,94 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_SPINLOCK_I686_H_
-#define _RTE_SPINLOCK_I686_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_spinlock.h"
-
-#ifndef RTE_FORCE_INTRINSICS
-static inline void
-rte_spinlock_lock(rte_spinlock_t *sl)
-{
-	int lock_val = 1;
-	asm volatile (
-			"1:\n"
-			"xchg %[locked], %[lv]\n"
-			"test %[lv], %[lv]\n"
-			"jz 3f\n"
-			"2:\n"
-			"pause\n"
-			"cmpl $0, %[locked]\n"
-			"jnz 2b\n"
-			"jmp 1b\n"
-			"3:\n"
-			: [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
-			: "[lv]" (lock_val)
-			: "memory");
-}
-
-static inline void
-rte_spinlock_unlock (rte_spinlock_t *sl)
-{
-	int unlock_val = 0;
-	asm volatile (
-			"xchg %[locked], %[ulv]\n"
-			: [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
-			: "[ulv]" (unlock_val)
-			: "memory");
-}
-
-static inline int
-rte_spinlock_trylock (rte_spinlock_t *sl)
-{
-	int lockval = 1;
-
-	asm volatile (
-			"xchg %[locked], %[lockval]"
-			: [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
-			: "[lockval]" (lockval)
-			: "memory");
-
-	return (lockval == 0);
-}
-#endif
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_SPINLOCK_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
new file mode 100644
index 0000000..e93e8ee
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
@@ -0,0 +1,216 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_X86_H_
+#define _RTE_ATOMIC_X86_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <emmintrin.h>
+#include "generic/rte_atomic.h"
+
+#if RTE_MAX_LCORE == 1
+#define MPLOCKED                        /**< No need to insert MP lock prefix. */
+#else
+#define MPLOCKED        "lock ; "       /**< Insert MP lock prefix. */
+#endif
+
+#define	rte_mb() _mm_mfence()
+
+#define	rte_wmb() _mm_sfence()
+
+#define	rte_rmb() _mm_lfence()
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgw %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decw %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decw %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	uint8_t res;
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgl %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+	return res;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decl %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(MPLOCKED
+			"decl %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return (ret != 0);
+}
+#endif
+
+#ifdef RTE_ARCH_I686
+#include "rte_atomic_32.h"
+#else
+#include "rte_atomic_64.h"
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_X86_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic_32.h b/lib/librte_eal/common/include/arch/x86/rte_atomic_32.h
new file mode 100644
index 0000000..400d8a9
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic_32.h
@@ -0,0 +1,222 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Inspired from FreeBSD src/sys/i386/include/atomic.h
+ * Copyright (c) 1998 Doug Rabson
+ * All rights reserved.
+ */
+
+#ifndef _RTE_ATOMIC_I686_H_
+#define _RTE_ATOMIC_I686_H_
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	uint8_t res;
+	union {
+		struct {
+			uint32_t l32;
+			uint32_t h32;
+		};
+		uint64_t u64;
+	} _exp, _src;
+
+	_exp.u64 = exp;
+	_src.u64 = src;
+
+#ifndef __PIC__
+    asm volatile (
+            MPLOCKED
+            "cmpxchg8b (%[dst]);"
+            "setz %[res];"
+            : [res] "=a" (res)      /* result in eax */
+            : [dst] "S" (dst),      /* esi */
+             "b" (_src.l32),       /* ebx */
+             "c" (_src.h32),       /* ecx */
+             "a" (_exp.l32),       /* eax */
+             "d" (_exp.h32)        /* edx */
+			: "memory" );           /* no-clobber list */
+#else
+	asm volatile (
+            "mov %%ebx, %%edi\n"
+			MPLOCKED
+			"cmpxchg8b (%[dst]);"
+			"setz %[res];"
+            "xchgl %%ebx, %%edi;\n"
+			: [res] "=a" (res)      /* result in eax */
+			: [dst] "S" (dst),      /* esi */
+			  "D" (_src.l32),       /* ebx */
+			  "c" (_src.h32),       /* ecx */
+			  "a" (_exp.l32),       /* eax */
+			  "d" (_exp.h32)        /* edx */
+			: "memory" );           /* no-clobber list */
+#endif
+
+	return res;
+}
+
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, 0);
+	}
+}
+
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		/* replace the value by itself */
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp);
+	}
+	return tmp;
+}
+
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, new_value);
+	}
+}
+
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp + inc);
+	}
+}
+
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp - dec);
+	}
+}
+
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	rte_atomic64_add(v, 1);
+}
+
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	rte_atomic64_sub(v, 1);
+}
+
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp + inc);
+	}
+
+	return tmp + inc;
+}
+
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	int success = 0;
+	uint64_t tmp;
+
+	while (success == 0) {
+		tmp = v->cnt;
+		success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+		                              tmp, tmp - dec);
+	}
+
+	return tmp - dec;
+}
+
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_add_return(v, 1) == 0;
+}
+
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	return rte_atomic64_sub_return(v, 1) == 0;
+}
+
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	rte_atomic64_set(v, 0);
+}
+#endif
+
+#endif /* _RTE_ATOMIC_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h b/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
new file mode 100644
index 0000000..4de6600
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
@@ -0,0 +1,191 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Inspired from FreeBSD src/sys/amd64/include/atomic.h
+ * Copyright (c) 1998 Doug Rabson
+ * All rights reserved.
+ */
+
+#ifndef _RTE_ATOMIC_X86_64_H_
+#define _RTE_ATOMIC_X86_64_H_
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	uint8_t res;
+
+
+	asm volatile(
+			MPLOCKED
+			"cmpxchgq %[src], %[dst];"
+			"sete %[res];"
+			: [res] "=a" (res),     /* output */
+			  [dst] "=m" (*dst)
+			: [src] "r" (src),      /* input */
+			  "a" (exp),
+			  "m" (*dst)
+			: "memory");            /* no-clobber list */
+
+	return res;
+}
+
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	v->cnt = 0;
+}
+
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	return v->cnt;
+}
+
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	v->cnt = new_value;
+}
+
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	asm volatile(
+			MPLOCKED
+			"addq %[inc], %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: [inc] "ir" (inc),     /* input */
+			  "m" (v->cnt)
+			);
+}
+
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	asm volatile(
+			MPLOCKED
+			"subq %[dec], %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: [dec] "ir" (dec),     /* input */
+			  "m" (v->cnt)
+			);
+}
+
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"incq %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	asm volatile(
+			MPLOCKED
+			"decq %[cnt]"
+			: [cnt] "=m" (v->cnt)   /* output */
+			: "m" (v->cnt)          /* input */
+			);
+}
+
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	int64_t prev = inc;
+
+	asm volatile(
+			MPLOCKED
+			"xaddq %[prev], %[cnt]"
+			: [prev] "+r" (prev),   /* output */
+			  [cnt] "=m" (v->cnt)
+			: "m" (v->cnt)          /* input */
+			);
+	return prev + inc;
+}
+
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	return rte_atomic64_add_return(v, -dec);
+}
+
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"incq %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt), /* output */
+			  [ret] "=qm" (ret)
+			);
+
+	return ret != 0;
+}
+
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	uint8_t ret;
+
+	asm volatile(
+			MPLOCKED
+			"decq %[cnt] ; "
+			"sete %[ret]"
+			: [cnt] "+m" (v->cnt),  /* output */
+			  [ret] "=qm" (ret)
+			);
+	return ret != 0;
+}
+
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	v->cnt = 0;
+}
+#endif
+
+#endif /* _RTE_ATOMIC_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_byteorder.h b/lib/librte_eal/common/include/arch/x86/rte_byteorder.h
new file mode 100644
index 0000000..1aa6985
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_byteorder.h
@@ -0,0 +1,121 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_X86_H_
+#define _RTE_BYTEORDER_X86_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	register uint16_t x = _x;
+	asm volatile ("xchgb %b[x1],%h[x2]"
+		      : [x1] "=Q" (x)
+		      : [x2] "0" (x)
+		      );
+	return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	register uint32_t x = _x;
+	asm volatile ("bswap %[x]"
+		      : [x] "+r" (x)
+		      );
+	return x;
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#ifdef RTE_ARCH_I686
+#include "rte_byteorder_32.h"
+#else
+#include "rte_byteorder_64.h"
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_X86_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_byteorder_32.h b/lib/librte_eal/common/include/arch/x86/rte_byteorder_32.h
new file mode 100644
index 0000000..51c306f
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_byteorder_32.h
@@ -0,0 +1,51 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_I686_H_
+#define _RTE_BYTEORDER_I686_H_
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* Compat./Leg. mode */
+static inline uint64_t rte_arch_bswap64(uint64_t x)
+{
+	uint64_t ret = 0;
+	ret |= ((uint64_t)rte_arch_bswap32(x & 0xffffffffUL) << 32);
+	ret |= ((uint64_t)rte_arch_bswap32((x >> 32) & 0xffffffffUL));
+	return ret;
+}
+
+#endif /* _RTE_BYTEORDER_I686_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_byteorder_64.h b/lib/librte_eal/common/include/arch/x86/rte_byteorder_64.h
new file mode 100644
index 0000000..dda572b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_byteorder_64.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_X86_64_H_
+#define _RTE_BYTEORDER_X86_64_H_
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+	register uint64_t x = _x;
+	asm volatile ("bswap %[x]"
+		      : [x] "+r" (x)
+		      );
+	return x;
+}
+
+#endif /* _RTE_BYTEORDER_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_cpuflags.h b/lib/librte_eal/common/include/arch/x86/rte_cpuflags.h
new file mode 100644
index 0000000..98906c8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_cpuflags.h
@@ -0,0 +1,310 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+ 
+#ifndef _RTE_CPUFLAGS_X86_64_H_
+#define _RTE_CPUFLAGS_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <errno.h>
+#include <stdint.h>
+
+#include "generic/rte_cpuflags.h"
+
+enum rte_cpu_flag_t {
+	/* (EAX 01h) ECX features*/
+	RTE_CPUFLAG_SSE3 = 0,               /**< SSE3 */
+	RTE_CPUFLAG_PCLMULQDQ,              /**< PCLMULQDQ */
+	RTE_CPUFLAG_DTES64,                 /**< DTES64 */
+	RTE_CPUFLAG_MONITOR,                /**< MONITOR */
+	RTE_CPUFLAG_DS_CPL,                 /**< DS_CPL */
+	RTE_CPUFLAG_VMX,                    /**< VMX */
+	RTE_CPUFLAG_SMX,                    /**< SMX */
+	RTE_CPUFLAG_EIST,                   /**< EIST */
+	RTE_CPUFLAG_TM2,                    /**< TM2 */
+	RTE_CPUFLAG_SSSE3,                  /**< SSSE3 */
+	RTE_CPUFLAG_CNXT_ID,                /**< CNXT_ID */
+	RTE_CPUFLAG_FMA,                    /**< FMA */
+	RTE_CPUFLAG_CMPXCHG16B,             /**< CMPXCHG16B */
+	RTE_CPUFLAG_XTPR,                   /**< XTPR */
+	RTE_CPUFLAG_PDCM,                   /**< PDCM */
+	RTE_CPUFLAG_PCID,                   /**< PCID */
+	RTE_CPUFLAG_DCA,                    /**< DCA */
+	RTE_CPUFLAG_SSE4_1,                 /**< SSE4_1 */
+	RTE_CPUFLAG_SSE4_2,                 /**< SSE4_2 */
+	RTE_CPUFLAG_X2APIC,                 /**< X2APIC */
+	RTE_CPUFLAG_MOVBE,                  /**< MOVBE */
+	RTE_CPUFLAG_POPCNT,                 /**< POPCNT */
+	RTE_CPUFLAG_TSC_DEADLINE,           /**< TSC_DEADLINE */
+	RTE_CPUFLAG_AES,                    /**< AES */
+	RTE_CPUFLAG_XSAVE,                  /**< XSAVE */
+	RTE_CPUFLAG_OSXSAVE,                /**< OSXSAVE */
+	RTE_CPUFLAG_AVX,                    /**< AVX */
+	RTE_CPUFLAG_F16C,                   /**< F16C */
+	RTE_CPUFLAG_RDRAND,                 /**< RDRAND */
+
+	/* (EAX 01h) EDX features */
+	RTE_CPUFLAG_FPU,                    /**< FPU */
+	RTE_CPUFLAG_VME,                    /**< VME */
+	RTE_CPUFLAG_DE,                     /**< DE */
+	RTE_CPUFLAG_PSE,                    /**< PSE */
+	RTE_CPUFLAG_TSC,                    /**< TSC */
+	RTE_CPUFLAG_MSR,                    /**< MSR */
+	RTE_CPUFLAG_PAE,                    /**< PAE */
+	RTE_CPUFLAG_MCE,                    /**< MCE */
+	RTE_CPUFLAG_CX8,                    /**< CX8 */
+	RTE_CPUFLAG_APIC,                   /**< APIC */
+	RTE_CPUFLAG_SEP,                    /**< SEP */
+	RTE_CPUFLAG_MTRR,                   /**< MTRR */
+	RTE_CPUFLAG_PGE,                    /**< PGE */
+	RTE_CPUFLAG_MCA,                    /**< MCA */
+	RTE_CPUFLAG_CMOV,                   /**< CMOV */
+	RTE_CPUFLAG_PAT,                    /**< PAT */
+	RTE_CPUFLAG_PSE36,                  /**< PSE36 */
+	RTE_CPUFLAG_PSN,                    /**< PSN */
+	RTE_CPUFLAG_CLFSH,                  /**< CLFSH */
+	RTE_CPUFLAG_DS,                     /**< DS */
+	RTE_CPUFLAG_ACPI,                   /**< ACPI */
+	RTE_CPUFLAG_MMX,                    /**< MMX */
+	RTE_CPUFLAG_FXSR,                   /**< FXSR */
+	RTE_CPUFLAG_SSE,                    /**< SSE */
+	RTE_CPUFLAG_SSE2,                   /**< SSE2 */
+	RTE_CPUFLAG_SS,                     /**< SS */
+	RTE_CPUFLAG_HTT,                    /**< HTT */
+	RTE_CPUFLAG_TM,                     /**< TM */
+	RTE_CPUFLAG_PBE,                    /**< PBE */
+
+	/* (EAX 06h) EAX features */
+	RTE_CPUFLAG_DIGTEMP,                /**< DIGTEMP */
+	RTE_CPUFLAG_TRBOBST,                /**< TRBOBST */
+	RTE_CPUFLAG_ARAT,                   /**< ARAT */
+	RTE_CPUFLAG_PLN,                    /**< PLN */
+	RTE_CPUFLAG_ECMD,                   /**< ECMD */
+	RTE_CPUFLAG_PTM,                    /**< PTM */
+
+	/* (EAX 06h) ECX features */
+	RTE_CPUFLAG_MPERF_APERF_MSR,        /**< MPERF_APERF_MSR */
+	RTE_CPUFLAG_ACNT2,                  /**< ACNT2 */
+	RTE_CPUFLAG_ENERGY_EFF,             /**< ENERGY_EFF */
+
+	/* (EAX 07h, ECX 0h) EBX features */
+	RTE_CPUFLAG_FSGSBASE,               /**< FSGSBASE */
+	RTE_CPUFLAG_BMI1,                   /**< BMI1 */
+	RTE_CPUFLAG_HLE,                    /**< Hardware Lock elision */
+	RTE_CPUFLAG_AVX2,                   /**< AVX2 */
+	RTE_CPUFLAG_SMEP,                   /**< SMEP */
+	RTE_CPUFLAG_BMI2,                   /**< BMI2 */
+	RTE_CPUFLAG_ERMS,                   /**< ERMS */
+	RTE_CPUFLAG_INVPCID,                /**< INVPCID */
+	RTE_CPUFLAG_RTM,                    /**< Transactional memory */
+
+	/* (EAX 80000001h) ECX features */
+	RTE_CPUFLAG_LAHF_SAHF,              /**< LAHF_SAHF */
+	RTE_CPUFLAG_LZCNT,                  /**< LZCNT */
+
+	/* (EAX 80000001h) EDX features */
+	RTE_CPUFLAG_SYSCALL,                /**< SYSCALL */
+	RTE_CPUFLAG_XD,                     /**< XD */
+	RTE_CPUFLAG_1GB_PG,                 /**< 1GB_PG */
+	RTE_CPUFLAG_RDTSCP,                 /**< RDTSCP */
+	RTE_CPUFLAG_EM64T,                  /**< EM64T */
+
+	/* (EAX 80000007h) EDX features */
+	RTE_CPUFLAG_INVTSC,                 /**< INVTSC */
+
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
+};
+
+enum cpu_register_t {
+	REG_EAX = 0,
+	REG_EBX,
+	REG_ECX,
+	REG_EDX,
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(SSE3, 0x00000001, 0, REG_ECX,  0)
+	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, REG_ECX,  1)
+	FEAT_DEF(DTES64, 0x00000001, 0, REG_ECX,  2)
+	FEAT_DEF(MONITOR, 0x00000001, 0, REG_ECX,  3)
+	FEAT_DEF(DS_CPL, 0x00000001, 0, REG_ECX,  4)
+	FEAT_DEF(VMX, 0x00000001, 0, REG_ECX,  5)
+	FEAT_DEF(SMX, 0x00000001, 0, REG_ECX,  6)
+	FEAT_DEF(EIST, 0x00000001, 0, REG_ECX,  7)
+	FEAT_DEF(TM2, 0x00000001, 0, REG_ECX,  8)
+	FEAT_DEF(SSSE3, 0x00000001, 0, REG_ECX,  9)
+	FEAT_DEF(CNXT_ID, 0x00000001, 0, REG_ECX, 10)
+	FEAT_DEF(FMA, 0x00000001, 0, REG_ECX, 12)
+	FEAT_DEF(CMPXCHG16B, 0x00000001, 0, REG_ECX, 13)
+	FEAT_DEF(XTPR, 0x00000001, 0, REG_ECX, 14)
+	FEAT_DEF(PDCM, 0x00000001, 0, REG_ECX, 15)
+	FEAT_DEF(PCID, 0x00000001, 0, REG_ECX, 17)
+	FEAT_DEF(DCA, 0x00000001, 0, REG_ECX, 18)
+	FEAT_DEF(SSE4_1, 0x00000001, 0, REG_ECX, 19)
+	FEAT_DEF(SSE4_2, 0x00000001, 0, REG_ECX, 20)
+	FEAT_DEF(X2APIC, 0x00000001, 0, REG_ECX, 21)
+	FEAT_DEF(MOVBE, 0x00000001, 0, REG_ECX, 22)
+	FEAT_DEF(POPCNT, 0x00000001, 0, REG_ECX, 23)
+	FEAT_DEF(TSC_DEADLINE, 0x00000001, 0, REG_ECX, 24)
+	FEAT_DEF(AES, 0x00000001, 0, REG_ECX, 25)
+	FEAT_DEF(XSAVE, 0x00000001, 0, REG_ECX, 26)
+	FEAT_DEF(OSXSAVE, 0x00000001, 0, REG_ECX, 27)
+	FEAT_DEF(AVX, 0x00000001, 0, REG_ECX, 28)
+	FEAT_DEF(F16C, 0x00000001, 0, REG_ECX, 29)
+	FEAT_DEF(RDRAND, 0x00000001, 0, REG_ECX, 30)
+
+	FEAT_DEF(FPU, 0x00000001, 0, REG_EDX,  0)
+	FEAT_DEF(VME, 0x00000001, 0, REG_EDX,  1)
+	FEAT_DEF(DE, 0x00000001, 0, REG_EDX,  2)
+	FEAT_DEF(PSE, 0x00000001, 0, REG_EDX,  3)
+	FEAT_DEF(TSC, 0x00000001, 0, REG_EDX,  4)
+	FEAT_DEF(MSR, 0x00000001, 0, REG_EDX,  5)
+	FEAT_DEF(PAE, 0x00000001, 0, REG_EDX,  6)
+	FEAT_DEF(MCE, 0x00000001, 0, REG_EDX,  7)
+	FEAT_DEF(CX8, 0x00000001, 0, REG_EDX,  8)
+	FEAT_DEF(APIC, 0x00000001, 0, REG_EDX,  9)
+	FEAT_DEF(SEP, 0x00000001, 0, REG_EDX, 11)
+	FEAT_DEF(MTRR, 0x00000001, 0, REG_EDX, 12)
+	FEAT_DEF(PGE, 0x00000001, 0, REG_EDX, 13)
+	FEAT_DEF(MCA, 0x00000001, 0, REG_EDX, 14)
+	FEAT_DEF(CMOV, 0x00000001, 0, REG_EDX, 15)
+	FEAT_DEF(PAT, 0x00000001, 0, REG_EDX, 16)
+	FEAT_DEF(PSE36, 0x00000001, 0, REG_EDX, 17)
+	FEAT_DEF(PSN, 0x00000001, 0, REG_EDX, 18)
+	FEAT_DEF(CLFSH, 0x00000001, 0, REG_EDX, 19)
+	FEAT_DEF(DS, 0x00000001, 0, REG_EDX, 21)
+	FEAT_DEF(ACPI, 0x00000001, 0, REG_EDX, 22)
+	FEAT_DEF(MMX, 0x00000001, 0, REG_EDX, 23)
+	FEAT_DEF(FXSR, 0x00000001, 0, REG_EDX, 24)
+	FEAT_DEF(SSE, 0x00000001, 0, REG_EDX, 25)
+	FEAT_DEF(SSE2, 0x00000001, 0, REG_EDX, 26)
+	FEAT_DEF(SS, 0x00000001, 0, REG_EDX, 27)
+	FEAT_DEF(HTT, 0x00000001, 0, REG_EDX, 28)
+	FEAT_DEF(TM, 0x00000001, 0, REG_EDX, 29)
+	FEAT_DEF(PBE, 0x00000001, 0, REG_EDX, 31)
+
+	FEAT_DEF(DIGTEMP, 0x00000006, 0, REG_EAX,  0)
+	FEAT_DEF(TRBOBST, 0x00000006, 0, REG_EAX,  1)
+	FEAT_DEF(ARAT, 0x00000006, 0, REG_EAX,  2)
+	FEAT_DEF(PLN, 0x00000006, 0, REG_EAX,  4)
+	FEAT_DEF(ECMD, 0x00000006, 0, REG_EAX,  5)
+	FEAT_DEF(PTM, 0x00000006, 0, REG_EAX,  6)
+
+	FEAT_DEF(MPERF_APERF_MSR, 0x00000006, 0, REG_ECX,  0)
+	FEAT_DEF(ACNT2, 0x00000006, 0, REG_ECX,  1)
+	FEAT_DEF(ENERGY_EFF, 0x00000006, 0, REG_ECX,  3)
+
+	FEAT_DEF(FSGSBASE, 0x00000007, 0, REG_EBX,  0)
+	FEAT_DEF(BMI1, 0x00000007, 0, REG_EBX,  2)
+	FEAT_DEF(HLE, 0x00000007, 0, REG_EBX,  4)
+	FEAT_DEF(AVX2, 0x00000007, 0, REG_EBX,  5)
+	FEAT_DEF(SMEP, 0x00000007, 0, REG_EBX,  6)
+	FEAT_DEF(BMI2, 0x00000007, 0, REG_EBX,  7)
+	FEAT_DEF(ERMS, 0x00000007, 0, REG_EBX,  8)
+	FEAT_DEF(INVPCID, 0x00000007, 0, REG_EBX, 10)
+	FEAT_DEF(RTM, 0x00000007, 0, REG_EBX, 11)
+
+	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, REG_ECX,  0)
+	FEAT_DEF(LZCNT, 0x80000001, 0, REG_ECX,  4)
+
+	FEAT_DEF(SYSCALL, 0x80000001, 0, REG_EDX, 11)
+	FEAT_DEF(XD, 0x80000001, 0, REG_EDX, 20)
+	FEAT_DEF(1GB_PG, 0x80000001, 0, REG_EDX, 26)
+	FEAT_DEF(RDTSCP, 0x80000001, 0, REG_EDX, 27)
+	FEAT_DEF(EM64T, 0x80000001, 0, REG_EDX, 29)
+
+	FEAT_DEF(INVTSC, 0x80000007, 0, REG_EDX,  8)
+};
+
+static inline void
+rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out)
+{
+#if defined(__i386__) && defined(__PIC__)
+    /* %ebx is a forbidden register if we compile with -fPIC or -fPIE */
+    asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0"
+		 : "=r" (out[REG_EBX]),
+		   "=a" (out[REG_EAX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+#else
+
+    asm volatile("cpuid"
+		 : "=a" (out[REG_EAX]),
+		   "=b" (out[REG_EBX]),
+		   "=c" (out[REG_ECX]),
+		   "=d" (out[REG_EDX])
+		 : "a" (leaf), "c" (subleaf));
+
+#endif
+}
+
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs;
+
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	rte_cpu_get_features(feat->leaf & 0xffff0000, 0, regs);
+	if (((regs[REG_EAX] ^ feat->leaf) & 0xffff0000) ||
+	      regs[REG_EAX] < feat->leaf)
+		return 0;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_cycles.h b/lib/librte_eal/common/include/arch/x86/rte_cycles.h
new file mode 100644
index 0000000..6e3c7d8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_cycles.h
@@ -0,0 +1,121 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+/*   BSD LICENSE
+ *
+ *   Copyright(c) 2013 6WIND.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_X86_64_H_
+#define _RTE_CYCLES_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
+/* Global switch to use VMWARE mapping of TSC instead of RDTSC */
+extern int rte_cycles_vmware_tsc_map;
+#include <rte_branch_prediction.h>
+#endif
+
+static inline uint64_t
+rte_rdtsc(void)
+{
+	union {
+		uint64_t tsc_64;
+		struct {
+			uint32_t lo_32;
+			uint32_t hi_32;
+		};
+	} tsc;
+
+#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
+	if (unlikely(rte_cycles_vmware_tsc_map)) {
+		/* ecx = 0x10000 corresponds to the physical TSC for VMware */
+		asm volatile("rdpmc" :
+		             "=a" (tsc.lo_32),
+		             "=d" (tsc.hi_32) :
+		             "c"(0x10000));
+		return tsc.tsc_64;
+	}
+#endif
+
+	asm volatile("rdtsc" :
+		     "=a" (tsc.lo_32),
+		     "=d" (tsc.hi_32));
+	return tsc.tsc_64;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
new file mode 100644
index 0000000..290c5cd
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
@@ -0,0 +1,297 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_X86_64_H_
+#define _RTE_MEMCPY_X86_64_H_
+
+#include <stdint.h>
+#include <string.h>
+#include <emmintrin.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+#ifdef __INTEL_COMPILER
+#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
+#endif
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		: [reg_a] "=x" (reg_a)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;
+	asm volatile (
+		"movdqu (%[src]), %[reg_a]\n\t"
+		"movdqu 16(%[src]), %[reg_b]\n\t"
+		"movdqu 32(%[src]), %[reg_c]\n\t"
+		"movdqu 48(%[src]), %[reg_d]\n\t"
+		"movdqu 64(%[src]), %[reg_e]\n\t"
+		"movdqu 80(%[src]), %[reg_f]\n\t"
+		"movdqu 96(%[src]), %[reg_g]\n\t"
+		"movdqu 112(%[src]), %[reg_h]\n\t"
+		"movdqu %[reg_a], (%[dst])\n\t"
+		"movdqu %[reg_b], 16(%[dst])\n\t"
+		"movdqu %[reg_c], 32(%[dst])\n\t"
+		"movdqu %[reg_d], 48(%[dst])\n\t"
+		"movdqu %[reg_e], 64(%[dst])\n\t"
+		"movdqu %[reg_f], 80(%[dst])\n\t"
+		"movdqu %[reg_g], 96(%[dst])\n\t"
+		"movdqu %[reg_h], 112(%[dst])\n\t"
+		: [reg_a] "=x" (reg_a),
+		  [reg_b] "=x" (reg_b),
+		  [reg_c] "=x" (reg_c),
+		  [reg_d] "=x" (reg_d),
+		  [reg_e] "=x" (reg_e),
+		  [reg_f] "=x" (reg_f),
+		  [reg_g] "=x" (reg_g),
+		  [reg_h] "=x" (reg_h)
+		: [src] "r" (src),
+		  [dst] "r"(dst)
+		: "memory"
+	);
+}
+
+#ifdef __INTEL_COMPILER
+#pragma warning(enable:593)
+#endif
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	rte_mov128(dst, src);
+	rte_mov128(dst + 128, src + 128);
+}
+
+#define rte_memcpy(dst, src, n)              \
+	((__builtin_constant_p(n)) ?          \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)))
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using XMM registers so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			*(uint64_t *)dst = *(const uint64_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * enourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0) {
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+	}
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_prefetch.h b/lib/librte_eal/common/include/arch/x86/rte_prefetch.h
new file mode 100644
index 0000000..ec2454d
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_prefetch.h
@@ -0,0 +1,62 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_X86_64_H_
+#define _RTE_PREFETCH_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(volatile void *p)
+{
+	asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+static inline void rte_prefetch1(volatile void *p)
+{
+	asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+static inline void rte_prefetch2(volatile void *p)
+{
+	asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_spinlock.h b/lib/librte_eal/common/include/arch/x86/rte_spinlock.h
new file mode 100644
index 0000000..54fba95
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_spinlock.h
@@ -0,0 +1,94 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_X86_64_H_
+#define _RTE_SPINLOCK_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_spinlock.h"
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	int lock_val = 1;
+	asm volatile (
+			"1:\n"
+			"xchg %[locked], %[lv]\n"
+			"test %[lv], %[lv]\n"
+			"jz 3f\n"
+			"2:\n"
+			"pause\n"
+			"cmpl $0, %[locked]\n"
+			"jnz 2b\n"
+			"jmp 1b\n"
+			"3:\n"
+			: [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
+			: "[lv]" (lock_val)
+			: "memory");
+}
+
+static inline void
+rte_spinlock_unlock (rte_spinlock_t *sl)
+{
+	int unlock_val = 0;
+	asm volatile (
+			"xchg %[locked], %[ulv]\n"
+			: [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
+			: "[ulv]" (unlock_val)
+			: "memory");
+}
+
+static inline int
+rte_spinlock_trylock (rte_spinlock_t *sl)
+{
+	int lockval = 1;
+
+	asm volatile (
+			"xchg %[locked], %[lockval]"
+			: [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
+			: "[lockval]" (lockval)
+			: "memory");
+
+	return (lockval == 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h b/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
deleted file mode 100644
index 9138328..0000000
--- a/lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
+++ /dev/null
@@ -1,362 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-/*
- * Inspired from FreeBSD src/sys/amd64/include/atomic.h
- * Copyright (c) 1998 Doug Rabson
- * All rights reserved.
- */
-
-#ifndef _RTE_ATOMIC_X86_64_H_
-#define _RTE_ATOMIC_X86_64_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <emmintrin.h>
-#include "generic/rte_atomic.h"
-
-#if RTE_MAX_LCORE == 1
-#define MPLOCKED                        /**< No need to insert MP lock prefix. */
-#else
-#define MPLOCKED        "lock ; "       /**< Insert MP lock prefix. */
-#endif
-
-#define	rte_mb() _mm_mfence()
-
-#define	rte_wmb() _mm_sfence()
-
-#define	rte_rmb() _mm_lfence()
-
-/*------------------------- 16 bit atomic operations -------------------------*/
-
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgw %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-	return res;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
-	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"incw %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"decw %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incw %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(MPLOCKED
-			"decw %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
-	uint8_t res;
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgl %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-	return res;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
-	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"incl %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"decl %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incl %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(MPLOCKED
-			"decl %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return (ret != 0);
-}
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
-	uint8_t res;
-
-
-	asm volatile(
-			MPLOCKED
-			"cmpxchgq %[src], %[dst];"
-			"sete %[res];"
-			: [res] "=a" (res),     /* output */
-			  [dst] "=m" (*dst)
-			: [src] "r" (src),      /* input */
-			  "a" (exp),
-			  "m" (*dst)
-			: "memory");            /* no-clobber list */
-
-	return res;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
-	v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
-	return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
-	v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
-	asm volatile(
-			MPLOCKED
-			"addq %[inc], %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: [inc] "ir" (inc),     /* input */
-			  "m" (v->cnt)
-			);
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
-	asm volatile(
-			MPLOCKED
-			"subq %[dec], %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: [dec] "ir" (dec),     /* input */
-			  "m" (v->cnt)
-			);
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"incq %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
-	asm volatile(
-			MPLOCKED
-			"decq %[cnt]"
-			: [cnt] "=m" (v->cnt)   /* output */
-			: "m" (v->cnt)          /* input */
-			);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
-	int64_t prev = inc;
-
-	asm volatile(
-			MPLOCKED
-			"xaddq %[prev], %[cnt]"
-			: [prev] "+r" (prev),   /* output */
-			  [cnt] "=m" (v->cnt)
-			: "m" (v->cnt)          /* input */
-			);
-	return prev + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
-	return rte_atomic64_add_return(v, -dec);
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"incq %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt), /* output */
-			  [ret] "=qm" (ret)
-			);
-
-	return ret != 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
-	uint8_t ret;
-
-	asm volatile(
-			MPLOCKED
-			"decq %[cnt] ; "
-			"sete %[ret]"
-			: [cnt] "+m" (v->cnt),  /* output */
-			  [ret] "=qm" (ret)
-			);
-	return ret != 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
-	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
-	v->cnt = 0;
-}
-#endif
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_ATOMIC_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h b/lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
deleted file mode 100644
index 825e576..0000000
--- a/lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
+++ /dev/null
@@ -1,130 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_BYTEORDER_X86_64_H_
-#define _RTE_BYTEORDER_X86_64_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_byteorder.h"
-
-/*
- * An architecture-optimized byte swap for a 16-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap16().
- */
-static inline uint16_t rte_arch_bswap16(uint16_t _x)
-{
-	register uint16_t x = _x;
-	asm volatile ("xchgb %b[x1],%h[x2]"
-		      : [x1] "=Q" (x)
-		      : [x2] "0" (x)
-		      );
-	return x;
-}
-
-/*
- * An architecture-optimized byte swap for a 32-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap32().
- */
-static inline uint32_t rte_arch_bswap32(uint32_t _x)
-{
-	register uint32_t x = _x;
-	asm volatile ("bswap %[x]"
-		      : [x] "+r" (x)
-		      );
-	return x;
-}
-
-/*
- * An architecture-optimized byte swap for a 64-bit value.
- *
-  * Do not use this function directly. The preferred function is rte_bswap64().
- */
-/* 64-bit mode */
-static inline uint64_t rte_arch_bswap64(uint64_t _x)
-{
-	register uint64_t x = _x;
-	asm volatile ("bswap %[x]"
-		      : [x] "+r" (x)
-		      );
-	return x;
-}
-
-#ifndef RTE_FORCE_INTRINSICS
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap16(x) :		\
-				   rte_arch_bswap16(x)))
-
-#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap32(x) :		\
-				   rte_arch_bswap32(x)))
-
-#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap64(x) :		\
-				   rte_arch_bswap64(x)))
-#else
-/*
- * __builtin_bswap16 is only available gcc 4.8 and upwards
- */
-#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
-				   rte_constant_bswap16(x) :		\
-				   rte_arch_bswap16(x)))
-#endif
-#endif
-
-#define rte_cpu_to_le_16(x) (x)
-#define rte_cpu_to_le_32(x) (x)
-#define rte_cpu_to_le_64(x) (x)
-
-#define rte_cpu_to_be_16(x) rte_bswap16(x)
-#define rte_cpu_to_be_32(x) rte_bswap32(x)
-#define rte_cpu_to_be_64(x) rte_bswap64(x)
-
-#define rte_le_to_cpu_16(x) (x)
-#define rte_le_to_cpu_32(x) (x)
-#define rte_le_to_cpu_64(x) (x)
-
-#define rte_be_to_cpu_16(x) rte_bswap16(x)
-#define rte_be_to_cpu_32(x) rte_bswap32(x)
-#define rte_be_to_cpu_64(x) rte_bswap64(x)
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_BYTEORDER_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h b/lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
deleted file mode 100644
index 98906c8..0000000
--- a/lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
+++ /dev/null
@@ -1,310 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
- 
-#ifndef _RTE_CPUFLAGS_X86_64_H_
-#define _RTE_CPUFLAGS_X86_64_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <stdlib.h>
-#include <stdio.h>
-#include <errno.h>
-#include <stdint.h>
-
-#include "generic/rte_cpuflags.h"
-
-enum rte_cpu_flag_t {
-	/* (EAX 01h) ECX features*/
-	RTE_CPUFLAG_SSE3 = 0,               /**< SSE3 */
-	RTE_CPUFLAG_PCLMULQDQ,              /**< PCLMULQDQ */
-	RTE_CPUFLAG_DTES64,                 /**< DTES64 */
-	RTE_CPUFLAG_MONITOR,                /**< MONITOR */
-	RTE_CPUFLAG_DS_CPL,                 /**< DS_CPL */
-	RTE_CPUFLAG_VMX,                    /**< VMX */
-	RTE_CPUFLAG_SMX,                    /**< SMX */
-	RTE_CPUFLAG_EIST,                   /**< EIST */
-	RTE_CPUFLAG_TM2,                    /**< TM2 */
-	RTE_CPUFLAG_SSSE3,                  /**< SSSE3 */
-	RTE_CPUFLAG_CNXT_ID,                /**< CNXT_ID */
-	RTE_CPUFLAG_FMA,                    /**< FMA */
-	RTE_CPUFLAG_CMPXCHG16B,             /**< CMPXCHG16B */
-	RTE_CPUFLAG_XTPR,                   /**< XTPR */
-	RTE_CPUFLAG_PDCM,                   /**< PDCM */
-	RTE_CPUFLAG_PCID,                   /**< PCID */
-	RTE_CPUFLAG_DCA,                    /**< DCA */
-	RTE_CPUFLAG_SSE4_1,                 /**< SSE4_1 */
-	RTE_CPUFLAG_SSE4_2,                 /**< SSE4_2 */
-	RTE_CPUFLAG_X2APIC,                 /**< X2APIC */
-	RTE_CPUFLAG_MOVBE,                  /**< MOVBE */
-	RTE_CPUFLAG_POPCNT,                 /**< POPCNT */
-	RTE_CPUFLAG_TSC_DEADLINE,           /**< TSC_DEADLINE */
-	RTE_CPUFLAG_AES,                    /**< AES */
-	RTE_CPUFLAG_XSAVE,                  /**< XSAVE */
-	RTE_CPUFLAG_OSXSAVE,                /**< OSXSAVE */
-	RTE_CPUFLAG_AVX,                    /**< AVX */
-	RTE_CPUFLAG_F16C,                   /**< F16C */
-	RTE_CPUFLAG_RDRAND,                 /**< RDRAND */
-
-	/* (EAX 01h) EDX features */
-	RTE_CPUFLAG_FPU,                    /**< FPU */
-	RTE_CPUFLAG_VME,                    /**< VME */
-	RTE_CPUFLAG_DE,                     /**< DE */
-	RTE_CPUFLAG_PSE,                    /**< PSE */
-	RTE_CPUFLAG_TSC,                    /**< TSC */
-	RTE_CPUFLAG_MSR,                    /**< MSR */
-	RTE_CPUFLAG_PAE,                    /**< PAE */
-	RTE_CPUFLAG_MCE,                    /**< MCE */
-	RTE_CPUFLAG_CX8,                    /**< CX8 */
-	RTE_CPUFLAG_APIC,                   /**< APIC */
-	RTE_CPUFLAG_SEP,                    /**< SEP */
-	RTE_CPUFLAG_MTRR,                   /**< MTRR */
-	RTE_CPUFLAG_PGE,                    /**< PGE */
-	RTE_CPUFLAG_MCA,                    /**< MCA */
-	RTE_CPUFLAG_CMOV,                   /**< CMOV */
-	RTE_CPUFLAG_PAT,                    /**< PAT */
-	RTE_CPUFLAG_PSE36,                  /**< PSE36 */
-	RTE_CPUFLAG_PSN,                    /**< PSN */
-	RTE_CPUFLAG_CLFSH,                  /**< CLFSH */
-	RTE_CPUFLAG_DS,                     /**< DS */
-	RTE_CPUFLAG_ACPI,                   /**< ACPI */
-	RTE_CPUFLAG_MMX,                    /**< MMX */
-	RTE_CPUFLAG_FXSR,                   /**< FXSR */
-	RTE_CPUFLAG_SSE,                    /**< SSE */
-	RTE_CPUFLAG_SSE2,                   /**< SSE2 */
-	RTE_CPUFLAG_SS,                     /**< SS */
-	RTE_CPUFLAG_HTT,                    /**< HTT */
-	RTE_CPUFLAG_TM,                     /**< TM */
-	RTE_CPUFLAG_PBE,                    /**< PBE */
-
-	/* (EAX 06h) EAX features */
-	RTE_CPUFLAG_DIGTEMP,                /**< DIGTEMP */
-	RTE_CPUFLAG_TRBOBST,                /**< TRBOBST */
-	RTE_CPUFLAG_ARAT,                   /**< ARAT */
-	RTE_CPUFLAG_PLN,                    /**< PLN */
-	RTE_CPUFLAG_ECMD,                   /**< ECMD */
-	RTE_CPUFLAG_PTM,                    /**< PTM */
-
-	/* (EAX 06h) ECX features */
-	RTE_CPUFLAG_MPERF_APERF_MSR,        /**< MPERF_APERF_MSR */
-	RTE_CPUFLAG_ACNT2,                  /**< ACNT2 */
-	RTE_CPUFLAG_ENERGY_EFF,             /**< ENERGY_EFF */
-
-	/* (EAX 07h, ECX 0h) EBX features */
-	RTE_CPUFLAG_FSGSBASE,               /**< FSGSBASE */
-	RTE_CPUFLAG_BMI1,                   /**< BMI1 */
-	RTE_CPUFLAG_HLE,                    /**< Hardware Lock elision */
-	RTE_CPUFLAG_AVX2,                   /**< AVX2 */
-	RTE_CPUFLAG_SMEP,                   /**< SMEP */
-	RTE_CPUFLAG_BMI2,                   /**< BMI2 */
-	RTE_CPUFLAG_ERMS,                   /**< ERMS */
-	RTE_CPUFLAG_INVPCID,                /**< INVPCID */
-	RTE_CPUFLAG_RTM,                    /**< Transactional memory */
-
-	/* (EAX 80000001h) ECX features */
-	RTE_CPUFLAG_LAHF_SAHF,              /**< LAHF_SAHF */
-	RTE_CPUFLAG_LZCNT,                  /**< LZCNT */
-
-	/* (EAX 80000001h) EDX features */
-	RTE_CPUFLAG_SYSCALL,                /**< SYSCALL */
-	RTE_CPUFLAG_XD,                     /**< XD */
-	RTE_CPUFLAG_1GB_PG,                 /**< 1GB_PG */
-	RTE_CPUFLAG_RDTSCP,                 /**< RDTSCP */
-	RTE_CPUFLAG_EM64T,                  /**< EM64T */
-
-	/* (EAX 80000007h) EDX features */
-	RTE_CPUFLAG_INVTSC,                 /**< INVTSC */
-
-	/* The last item */
-	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
-};
-
-enum cpu_register_t {
-	REG_EAX = 0,
-	REG_EBX,
-	REG_ECX,
-	REG_EDX,
-};
-
-static const struct feature_entry cpu_feature_table[] = {
-	FEAT_DEF(SSE3, 0x00000001, 0, REG_ECX,  0)
-	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, REG_ECX,  1)
-	FEAT_DEF(DTES64, 0x00000001, 0, REG_ECX,  2)
-	FEAT_DEF(MONITOR, 0x00000001, 0, REG_ECX,  3)
-	FEAT_DEF(DS_CPL, 0x00000001, 0, REG_ECX,  4)
-	FEAT_DEF(VMX, 0x00000001, 0, REG_ECX,  5)
-	FEAT_DEF(SMX, 0x00000001, 0, REG_ECX,  6)
-	FEAT_DEF(EIST, 0x00000001, 0, REG_ECX,  7)
-	FEAT_DEF(TM2, 0x00000001, 0, REG_ECX,  8)
-	FEAT_DEF(SSSE3, 0x00000001, 0, REG_ECX,  9)
-	FEAT_DEF(CNXT_ID, 0x00000001, 0, REG_ECX, 10)
-	FEAT_DEF(FMA, 0x00000001, 0, REG_ECX, 12)
-	FEAT_DEF(CMPXCHG16B, 0x00000001, 0, REG_ECX, 13)
-	FEAT_DEF(XTPR, 0x00000001, 0, REG_ECX, 14)
-	FEAT_DEF(PDCM, 0x00000001, 0, REG_ECX, 15)
-	FEAT_DEF(PCID, 0x00000001, 0, REG_ECX, 17)
-	FEAT_DEF(DCA, 0x00000001, 0, REG_ECX, 18)
-	FEAT_DEF(SSE4_1, 0x00000001, 0, REG_ECX, 19)
-	FEAT_DEF(SSE4_2, 0x00000001, 0, REG_ECX, 20)
-	FEAT_DEF(X2APIC, 0x00000001, 0, REG_ECX, 21)
-	FEAT_DEF(MOVBE, 0x00000001, 0, REG_ECX, 22)
-	FEAT_DEF(POPCNT, 0x00000001, 0, REG_ECX, 23)
-	FEAT_DEF(TSC_DEADLINE, 0x00000001, 0, REG_ECX, 24)
-	FEAT_DEF(AES, 0x00000001, 0, REG_ECX, 25)
-	FEAT_DEF(XSAVE, 0x00000001, 0, REG_ECX, 26)
-	FEAT_DEF(OSXSAVE, 0x00000001, 0, REG_ECX, 27)
-	FEAT_DEF(AVX, 0x00000001, 0, REG_ECX, 28)
-	FEAT_DEF(F16C, 0x00000001, 0, REG_ECX, 29)
-	FEAT_DEF(RDRAND, 0x00000001, 0, REG_ECX, 30)
-
-	FEAT_DEF(FPU, 0x00000001, 0, REG_EDX,  0)
-	FEAT_DEF(VME, 0x00000001, 0, REG_EDX,  1)
-	FEAT_DEF(DE, 0x00000001, 0, REG_EDX,  2)
-	FEAT_DEF(PSE, 0x00000001, 0, REG_EDX,  3)
-	FEAT_DEF(TSC, 0x00000001, 0, REG_EDX,  4)
-	FEAT_DEF(MSR, 0x00000001, 0, REG_EDX,  5)
-	FEAT_DEF(PAE, 0x00000001, 0, REG_EDX,  6)
-	FEAT_DEF(MCE, 0x00000001, 0, REG_EDX,  7)
-	FEAT_DEF(CX8, 0x00000001, 0, REG_EDX,  8)
-	FEAT_DEF(APIC, 0x00000001, 0, REG_EDX,  9)
-	FEAT_DEF(SEP, 0x00000001, 0, REG_EDX, 11)
-	FEAT_DEF(MTRR, 0x00000001, 0, REG_EDX, 12)
-	FEAT_DEF(PGE, 0x00000001, 0, REG_EDX, 13)
-	FEAT_DEF(MCA, 0x00000001, 0, REG_EDX, 14)
-	FEAT_DEF(CMOV, 0x00000001, 0, REG_EDX, 15)
-	FEAT_DEF(PAT, 0x00000001, 0, REG_EDX, 16)
-	FEAT_DEF(PSE36, 0x00000001, 0, REG_EDX, 17)
-	FEAT_DEF(PSN, 0x00000001, 0, REG_EDX, 18)
-	FEAT_DEF(CLFSH, 0x00000001, 0, REG_EDX, 19)
-	FEAT_DEF(DS, 0x00000001, 0, REG_EDX, 21)
-	FEAT_DEF(ACPI, 0x00000001, 0, REG_EDX, 22)
-	FEAT_DEF(MMX, 0x00000001, 0, REG_EDX, 23)
-	FEAT_DEF(FXSR, 0x00000001, 0, REG_EDX, 24)
-	FEAT_DEF(SSE, 0x00000001, 0, REG_EDX, 25)
-	FEAT_DEF(SSE2, 0x00000001, 0, REG_EDX, 26)
-	FEAT_DEF(SS, 0x00000001, 0, REG_EDX, 27)
-	FEAT_DEF(HTT, 0x00000001, 0, REG_EDX, 28)
-	FEAT_DEF(TM, 0x00000001, 0, REG_EDX, 29)
-	FEAT_DEF(PBE, 0x00000001, 0, REG_EDX, 31)
-
-	FEAT_DEF(DIGTEMP, 0x00000006, 0, REG_EAX,  0)
-	FEAT_DEF(TRBOBST, 0x00000006, 0, REG_EAX,  1)
-	FEAT_DEF(ARAT, 0x00000006, 0, REG_EAX,  2)
-	FEAT_DEF(PLN, 0x00000006, 0, REG_EAX,  4)
-	FEAT_DEF(ECMD, 0x00000006, 0, REG_EAX,  5)
-	FEAT_DEF(PTM, 0x00000006, 0, REG_EAX,  6)
-
-	FEAT_DEF(MPERF_APERF_MSR, 0x00000006, 0, REG_ECX,  0)
-	FEAT_DEF(ACNT2, 0x00000006, 0, REG_ECX,  1)
-	FEAT_DEF(ENERGY_EFF, 0x00000006, 0, REG_ECX,  3)
-
-	FEAT_DEF(FSGSBASE, 0x00000007, 0, REG_EBX,  0)
-	FEAT_DEF(BMI1, 0x00000007, 0, REG_EBX,  2)
-	FEAT_DEF(HLE, 0x00000007, 0, REG_EBX,  4)
-	FEAT_DEF(AVX2, 0x00000007, 0, REG_EBX,  5)
-	FEAT_DEF(SMEP, 0x00000007, 0, REG_EBX,  6)
-	FEAT_DEF(BMI2, 0x00000007, 0, REG_EBX,  7)
-	FEAT_DEF(ERMS, 0x00000007, 0, REG_EBX,  8)
-	FEAT_DEF(INVPCID, 0x00000007, 0, REG_EBX, 10)
-	FEAT_DEF(RTM, 0x00000007, 0, REG_EBX, 11)
-
-	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, REG_ECX,  0)
-	FEAT_DEF(LZCNT, 0x80000001, 0, REG_ECX,  4)
-
-	FEAT_DEF(SYSCALL, 0x80000001, 0, REG_EDX, 11)
-	FEAT_DEF(XD, 0x80000001, 0, REG_EDX, 20)
-	FEAT_DEF(1GB_PG, 0x80000001, 0, REG_EDX, 26)
-	FEAT_DEF(RDTSCP, 0x80000001, 0, REG_EDX, 27)
-	FEAT_DEF(EM64T, 0x80000001, 0, REG_EDX, 29)
-
-	FEAT_DEF(INVTSC, 0x80000007, 0, REG_EDX,  8)
-};
-
-static inline void
-rte_cpu_get_features(uint32_t leaf, uint32_t subleaf, cpuid_registers_t out)
-{
-#if defined(__i386__) && defined(__PIC__)
-    /* %ebx is a forbidden register if we compile with -fPIC or -fPIE */
-    asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0"
-		 : "=r" (out[REG_EBX]),
-		   "=a" (out[REG_EAX]),
-		   "=c" (out[REG_ECX]),
-		   "=d" (out[REG_EDX])
-		 : "a" (leaf), "c" (subleaf));
-#else
-
-    asm volatile("cpuid"
-		 : "=a" (out[REG_EAX]),
-		   "=b" (out[REG_EBX]),
-		   "=c" (out[REG_ECX]),
-		   "=d" (out[REG_EDX])
-		 : "a" (leaf), "c" (subleaf));
-
-#endif
-}
-
-static inline int
-rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
-{
-	const struct feature_entry *feat;
-	cpuid_registers_t regs;
-
-
-	if (feature >= RTE_CPUFLAG_NUMFLAGS)
-		/* Flag does not match anything in the feature tables */
-		return -ENOENT;
-
-	feat = &cpu_feature_table[feature];
-
-	if (!feat->leaf)
-		/* This entry in the table wasn't filled out! */
-		return -EFAULT;
-
-	rte_cpu_get_features(feat->leaf & 0xffff0000, 0, regs);
-	if (((regs[REG_EAX] ^ feat->leaf) & 0xffff0000) ||
-	      regs[REG_EAX] < feat->leaf)
-		return 0;
-
-	/* get the cpuid leaf containing the desired feature */
-	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
-
-	/* check if the feature is enabled */
-	return (regs[feat->reg] >> feat->bit) & 1;
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_CPUFLAGS_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_cycles.h b/lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
deleted file mode 100644
index 6e3c7d8..0000000
--- a/lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
+++ /dev/null
@@ -1,121 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-/*   BSD LICENSE
- *
- *   Copyright(c) 2013 6WIND.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of 6WIND S.A. nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_CYCLES_X86_64_H_
-#define _RTE_CYCLES_X86_64_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_cycles.h"
-
-#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
-/* Global switch to use VMWARE mapping of TSC instead of RDTSC */
-extern int rte_cycles_vmware_tsc_map;
-#include <rte_branch_prediction.h>
-#endif
-
-static inline uint64_t
-rte_rdtsc(void)
-{
-	union {
-		uint64_t tsc_64;
-		struct {
-			uint32_t lo_32;
-			uint32_t hi_32;
-		};
-	} tsc;
-
-#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
-	if (unlikely(rte_cycles_vmware_tsc_map)) {
-		/* ecx = 0x10000 corresponds to the physical TSC for VMware */
-		asm volatile("rdpmc" :
-		             "=a" (tsc.lo_32),
-		             "=d" (tsc.hi_32) :
-		             "c"(0x10000));
-		return tsc.tsc_64;
-	}
-#endif
-
-	asm volatile("rdtsc" :
-		     "=a" (tsc.lo_32),
-		     "=d" (tsc.hi_32));
-	return tsc.tsc_64;
-}
-
-static inline uint64_t
-rte_rdtsc_precise(void)
-{
-	rte_mb();
-	return rte_rdtsc();
-}
-
-static inline uint64_t
-rte_get_tsc_cycles(void) { return rte_rdtsc(); }
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_CYCLES_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h b/lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
deleted file mode 100644
index 290c5cd..0000000
--- a/lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
+++ /dev/null
@@ -1,297 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_MEMCPY_X86_64_H_
-#define _RTE_MEMCPY_X86_64_H_
-
-#include <stdint.h>
-#include <string.h>
-#include <emmintrin.h>
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_memcpy.h"
-
-#ifdef __INTEL_COMPILER
-#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
-#endif
-
-static inline void
-rte_mov16(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		: [reg_a] "=x" (reg_a)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-static inline void
-rte_mov32(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-static inline void
-rte_mov48(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-static inline void
-rte_mov64(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c, reg_d;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu 48(%[src]), %[reg_d]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		"movdqu %[reg_d], 48(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c),
-		  [reg_d] "=x" (reg_d)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-static inline void
-rte_mov128(uint8_t *dst, const uint8_t *src)
-{
-	__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;
-	asm volatile (
-		"movdqu (%[src]), %[reg_a]\n\t"
-		"movdqu 16(%[src]), %[reg_b]\n\t"
-		"movdqu 32(%[src]), %[reg_c]\n\t"
-		"movdqu 48(%[src]), %[reg_d]\n\t"
-		"movdqu 64(%[src]), %[reg_e]\n\t"
-		"movdqu 80(%[src]), %[reg_f]\n\t"
-		"movdqu 96(%[src]), %[reg_g]\n\t"
-		"movdqu 112(%[src]), %[reg_h]\n\t"
-		"movdqu %[reg_a], (%[dst])\n\t"
-		"movdqu %[reg_b], 16(%[dst])\n\t"
-		"movdqu %[reg_c], 32(%[dst])\n\t"
-		"movdqu %[reg_d], 48(%[dst])\n\t"
-		"movdqu %[reg_e], 64(%[dst])\n\t"
-		"movdqu %[reg_f], 80(%[dst])\n\t"
-		"movdqu %[reg_g], 96(%[dst])\n\t"
-		"movdqu %[reg_h], 112(%[dst])\n\t"
-		: [reg_a] "=x" (reg_a),
-		  [reg_b] "=x" (reg_b),
-		  [reg_c] "=x" (reg_c),
-		  [reg_d] "=x" (reg_d),
-		  [reg_e] "=x" (reg_e),
-		  [reg_f] "=x" (reg_f),
-		  [reg_g] "=x" (reg_g),
-		  [reg_h] "=x" (reg_h)
-		: [src] "r" (src),
-		  [dst] "r"(dst)
-		: "memory"
-	);
-}
-
-#ifdef __INTEL_COMPILER
-#pragma warning(enable:593)
-#endif
-
-static inline void
-rte_mov256(uint8_t *dst, const uint8_t *src)
-{
-	rte_mov128(dst, src);
-	rte_mov128(dst + 128, src + 128);
-}
-
-#define rte_memcpy(dst, src, n)              \
-	((__builtin_constant_p(n)) ?          \
-	memcpy((dst), (src), (n)) :          \
-	rte_memcpy_func((dst), (src), (n)))
-
-static inline void *
-rte_memcpy_func(void *dst, const void *src, size_t n)
-{
-	void *ret = dst;
-
-	/* We can't copy < 16 bytes using XMM registers so do it manually. */
-	if (n < 16) {
-		if (n & 0x01) {
-			*(uint8_t *)dst = *(const uint8_t *)src;
-			dst = (uint8_t *)dst + 1;
-			src = (const uint8_t *)src + 1;
-		}
-		if (n & 0x02) {
-			*(uint16_t *)dst = *(const uint16_t *)src;
-			dst = (uint16_t *)dst + 1;
-			src = (const uint16_t *)src + 1;
-		}
-		if (n & 0x04) {
-			*(uint32_t *)dst = *(const uint32_t *)src;
-			dst = (uint32_t *)dst + 1;
-			src = (const uint32_t *)src + 1;
-		}
-		if (n & 0x08) {
-			*(uint64_t *)dst = *(const uint64_t *)src;
-		}
-		return ret;
-	}
-
-	/* Special fast cases for <= 128 bytes */
-	if (n <= 32) {
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
-		return ret;
-	}
-
-	if (n <= 64) {
-		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
-		return ret;
-	}
-
-	if (n <= 128) {
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
-		return ret;
-	}
-
-	/*
-	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
-	 * copies was found to be faster than doing 128 and 32 byte copies as
-	 * well.
-	 */
-	for ( ; n >= 256; n -= 256) {
-		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
-		dst = (uint8_t *)dst + 256;
-		src = (const uint8_t *)src + 256;
-	}
-
-	/*
-	 * We split the remaining bytes (which will be less than 256) into
-	 * 64byte (2^6) chunks.
-	 * Using incrementing integers in the case labels of a switch statement
-	 * enourages the compiler to use a jump table. To get incrementing
-	 * integers, we shift the 2 relevant bits to the LSB position to first
-	 * get decrementing integers, and then subtract.
-	 */
-	switch (3 - (n >> 6)) {
-	case 0x00:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	case 0x01:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	case 0x02:
-		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
-		n -= 64;
-		dst = (uint8_t *)dst + 64;
-		src = (const uint8_t *)src + 64;      /* fallthrough */
-	default:
-		;
-	}
-
-	/*
-	 * We split the remaining bytes (which will be less than 64) into
-	 * 16byte (2^4) chunks, using the same switch structure as above.
-	 */
-	switch (3 - (n >> 4)) {
-	case 0x00:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	case 0x01:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	case 0x02:
-		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		n -= 16;
-		dst = (uint8_t *)dst + 16;
-		src = (const uint8_t *)src + 16;      /* fallthrough */
-	default:
-		;
-	}
-
-	/* Copy any remaining bytes, without going beyond end of buffers */
-	if (n != 0) {
-		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
-	}
-	return ret;
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_MEMCPY_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h b/lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
deleted file mode 100644
index ec2454d..0000000
--- a/lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
+++ /dev/null
@@ -1,62 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_PREFETCH_X86_64_H_
-#define _RTE_PREFETCH_X86_64_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_prefetch.h"
-
-static inline void rte_prefetch0(volatile void *p)
-{
-	asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-static inline void rte_prefetch1(volatile void *p)
-{
-	asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-static inline void rte_prefetch2(volatile void *p)
-{
-	asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
-}
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_PREFETCH_X86_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h b/lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
deleted file mode 100644
index 54fba95..0000000
--- a/lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
+++ /dev/null
@@ -1,94 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_SPINLOCK_X86_64_H_
-#define _RTE_SPINLOCK_X86_64_H_
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include "generic/rte_spinlock.h"
-
-#ifndef RTE_FORCE_INTRINSICS
-static inline void
-rte_spinlock_lock(rte_spinlock_t *sl)
-{
-	int lock_val = 1;
-	asm volatile (
-			"1:\n"
-			"xchg %[locked], %[lv]\n"
-			"test %[lv], %[lv]\n"
-			"jz 3f\n"
-			"2:\n"
-			"pause\n"
-			"cmpl $0, %[locked]\n"
-			"jnz 2b\n"
-			"jmp 1b\n"
-			"3:\n"
-			: [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
-			: "[lv]" (lock_val)
-			: "memory");
-}
-
-static inline void
-rte_spinlock_unlock (rte_spinlock_t *sl)
-{
-	int unlock_val = 0;
-	asm volatile (
-			"xchg %[locked], %[ulv]\n"
-			: [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
-			: "[ulv]" (unlock_val)
-			: "memory");
-}
-
-static inline int
-rte_spinlock_trylock (rte_spinlock_t *sl)
-{
-	int lockval = 1;
-
-	asm volatile (
-			"xchg %[locked], %[lockval]"
-			: [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
-			: "[lockval]" (lockval)
-			: "memory");
-
-	return (lockval == 0);
-}
-#endif
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif /* _RTE_SPINLOCK_X86_64_H_ */
diff --git a/mk/arch/i686/rte.vars.mk b/mk/arch/i686/rte.vars.mk
index 8d56ca7..8ba9a23 100644
--- a/mk/arch/i686/rte.vars.mk
+++ b/mk/arch/i686/rte.vars.mk
@@ -48,6 +48,8 @@
 #
 
 ARCH  ?= i386
+# common arch dir in eal headers
+ARCH_DIR := x86
 CROSS ?=
 
 CPU_CFLAGS  ?= -m32
diff --git a/mk/arch/x86_64/rte.vars.mk b/mk/arch/x86_64/rte.vars.mk
index 51bd477..b986f04 100644
--- a/mk/arch/x86_64/rte.vars.mk
+++ b/mk/arch/x86_64/rte.vars.mk
@@ -48,6 +48,8 @@
 #
 
 ARCH  ?= x86_64
+# common arch dir in eal headers
+ARCH_DIR := x86
 CROSS ?=
 
 CPU_CFLAGS  ?= -m64
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/10] split architecture specific operations
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
                   ` (9 preceding siblings ...)
  2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 10/10] eal: factorize x86 headers David Marchand
@ 2014-11-03  8:10 ` Chao CH Zhu
  2014-11-05  2:39 ` Chao Zhu
  11 siblings, 0 replies; 14+ messages in thread
From: Chao CH Zhu @ 2014-11-03  8:10 UTC (permalink / raw)
  To: David Marchand; +Cc: dev

David,

I acked the set of patches.

Thanks a lot!

Best Regards!
------------------------------
Chao Zhu 




From:   David Marchand <david.marchand@6wind.com>
To:     dev@dpdk.org
Cc:     Chao CH Zhu/China/IBM@IBMCN
Date:   2014/10/28 20:50
Subject:        [PATCH v3 00/10] split architecture specific operations



The set of patches split x86 architecture specific operations from DPDK 
and put
them to x86 arch directory.
This will make the adoption of DPDK much easier on other computer 
architecture.
For a new architecture, just add an architecture specific directory and
necessary building configuration files, then DPDK eal library can support 
it.


Reviewing patchset from Chao, I ended up modifying it along the way,
so here is a new iteration of this patchset.

Changes since Chao v2 patchset :

- added a preliminary patch for moving rte_atomic.h (for better 
readability)
- fixed documentation generation
- implemented a generic header for each arch specific header (cpuflags, 
memcpy,
  prefetch were missing)
- removed C++ stuff from generic headers
- centralised all doxygen stuff in generic headers (no need to have 
duplicates)
- refactored rte_cycles functions
- moved vmware tsc stuff to arch rte_cycles.h headers
- finished x86 factorisation


Little summary of current state :

- all applications continue to include the eal headers as before, these 
headers
  are the arch-specific ones
- the arch specific headers always include the generic ones. The generic 
headers
  contain the doxygen documentation and code common to all architectures
- a x86 architecture has been defined which handles both 32bits and 64bits
  peculiarities


It builds fine for 32/64 bits (w and w/o "force intrinsics"), but I really 
would
like a lot of eyes on this (and I would say, especially, rte_cycles, 
rte_memcpy
and rte_cpuflags).
I still have some concerns about the use of intrinsics for architecture != 
x86
but I think Chao will be the best to look at this.


-- 
David Marchand

Chao Zhu (7):
  eal: split atomic operations to architecture specific
  eal: split byte order operations to architecture specific
  eal: split CPU cycle operation to architecture specific
  eal: split prefetch operations to architecture specific
  eal: split spinlock operations to architecture specific
  eal: split memcpy operation to architecture specific
  eal: split CPU flags operations to architecture specific

David Marchand (3):
  eal: move rte_atomic.h header
  eal: install all arch headers
  eal: factorize x86 headers

 doc/api/doxy-api.conf                              |    1 +
 lib/librte_eal/common/Makefile                     |   24 +-
 lib/librte_eal/common/eal_common_cpuflags.c        |  190 ----
 .../common/include/arch/x86/rte_atomic.h           |  216 ++++
 .../common/include/arch/x86/rte_atomic_32.h        |  222 ++++
 .../common/include/arch/x86/rte_atomic_64.h        |  191 ++++
 .../common/include/arch/x86/rte_byteorder.h        |  121 +++
 .../common/include/arch/x86/rte_byteorder_32.h     |   51 +
 .../common/include/arch/x86/rte_byteorder_64.h     |   52 +
 .../common/include/arch/x86/rte_cpuflags.h         |  310 ++++++
 .../common/include/arch/x86/rte_cycles.h           |  121 +++
 .../common/include/arch/x86/rte_memcpy.h           |  297 +++++
 .../common/include/arch/x86/rte_prefetch.h         |   62 ++
 .../common/include/arch/x86/rte_spinlock.h         |   94 ++
 lib/librte_eal/common/include/generic/rte_atomic.h |  918 
++++++++++++++++
 .../common/include/generic/rte_byteorder.h         |  189 ++++
 .../common/include/generic/rte_cpuflags.h          |  110 ++
 lib/librte_eal/common/include/generic/rte_cycles.h |  209 ++++
 lib/librte_eal/common/include/generic/rte_memcpy.h |  144 +++
 .../common/include/generic/rte_prefetch.h          |   71 ++
 .../common/include/generic/rte_spinlock.h          |  226 ++++
 .../common/include/i686/arch/rte_atomic.h          |  373 -------
 lib/librte_eal/common/include/rte_atomic.h         | 1133 
--------------------
 lib/librte_eal/common/include/rte_byteorder.h      |  270 -----
 lib/librte_eal/common/include/rte_cpuflags.h       |  182 ----
 lib/librte_eal/common/include/rte_cycles.h         |  266 -----
 lib/librte_eal/common/include/rte_memcpy.h         |  376 -------
 lib/librte_eal/common/include/rte_prefetch.h       |   88 --
 lib/librte_eal/common/include/rte_spinlock.h       |  258 -----
 .../common/include/x86_64/arch/rte_atomic.h        |  335 ------
 mk/arch/i686/rte.vars.mk                           |    2 +
 mk/arch/x86_64/rte.vars.mk                         |    2 +
 32 files changed, 3624 insertions(+), 3480 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic_32.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_byteorder.h
 create mode 100644 
lib/librte_eal/common/include/arch/x86/rte_byteorder_32.h
 create mode 100644 
lib/librte_eal/common/include/arch/x86/rte_byteorder_64.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_byteorder.h
 delete mode 100644 lib/librte_eal/common/include/rte_cpuflags.h
 delete mode 100644 lib/librte_eal/common/include/rte_cycles.h
 delete mode 100644 lib/librte_eal/common/include/rte_memcpy.h
 delete mode 100644 lib/librte_eal/common/include/rte_prefetch.h
 delete mode 100644 lib/librte_eal/common/include/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic.h

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/10] split architecture specific operations
  2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
                   ` (10 preceding siblings ...)
  2014-11-03  8:10 ` [dpdk-dev] [PATCH v3 00/10] split architecture specific operations Chao CH Zhu
@ 2014-11-05  2:39 ` Chao Zhu
  2014-11-05 21:57   ` Thomas Monjalon
  11 siblings, 1 reply; 14+ messages in thread
From: Chao Zhu @ 2014-11-05  2:39 UTC (permalink / raw)
  To: David Marchand, dev; +Cc: bjzhuc


> The set of patches split x86 architecture specific operations from DPDK and put
> them to x86 arch directory.
> This will make the adoption of DPDK much easier on other computer architecture.
> For a new architecture, just add an architecture specific directory and
> necessary building configuration files, then DPDK eal library can support it.
>
>
> Reviewing patchset from Chao, I ended up modifying it along the way,
> so here is a new iteration of this patchset.
>
> Changes since Chao v2 patchset :
>
> - added a preliminary patch for moving rte_atomic.h (for better readability)
> - fixed documentation generation
> - implemented a generic header for each arch specific header (cpuflags, memcpy,
>    prefetch were missing)
> - removed C++ stuff from generic headers
> - centralised all doxygen stuff in generic headers (no need to have duplicates)
> - refactored rte_cycles functions
> - moved vmware tsc stuff to arch rte_cycles.h headers
> - finished x86 factorisation
>
>
> Little summary of current state :
>
> - all applications continue to include the eal headers as before, these headers
>    are the arch-specific ones
> - the arch specific headers always include the generic ones. The generic headers
>    contain the doxygen documentation and code common to all architectures
> - a x86 architecture has been defined which handles both 32bits and 64bits
>    peculiarities
>
>
> It builds fine for 32/64 bits (w and w/o "force intrinsics"), but I really would
> like a lot of eyes on this (and I would say, especially, rte_cycles, rte_memcpy
> and rte_cpuflags).
> I still have some concerns about the use of intrinsics for architecture != x86
> but I think Chao will be the best to look at this.
>
>
Acked-by: Chao Zhu <bjzhuc@cn.ibm.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/10] split architecture specific operations
  2014-11-05  2:39 ` Chao Zhu
@ 2014-11-05 21:57   ` Thomas Monjalon
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Monjalon @ 2014-11-05 21:57 UTC (permalink / raw)
  To: Chao Zhu, David Marchand; +Cc: dev, bjzhuc

> > The set of patches split x86 architecture specific operations from DPDK and put
> > them to x86 arch directory.
> > This will make the adoption of DPDK much easier on other computer architecture.
> > For a new architecture, just add an architecture specific directory and
> > necessary building configuration files, then DPDK eal library can support it.
> >
> >
> > Reviewing patchset from Chao, I ended up modifying it along the way,
> > so here is a new iteration of this patchset.
> >
> > Changes since Chao v2 patchset :
> >
> > - added a preliminary patch for moving rte_atomic.h (for better readability)
> > - fixed documentation generation
> > - implemented a generic header for each arch specific header (cpuflags, memcpy,
> >    prefetch were missing)
> > - removed C++ stuff from generic headers
> > - centralised all doxygen stuff in generic headers (no need to have duplicates)
> > - refactored rte_cycles functions
> > - moved vmware tsc stuff to arch rte_cycles.h headers
> > - finished x86 factorisation
> >
> >
> > Little summary of current state :
> >
> > - all applications continue to include the eal headers as before, these headers
> >    are the arch-specific ones
> > - the arch specific headers always include the generic ones. The generic headers
> >    contain the doxygen documentation and code common to all architectures
> > - a x86 architecture has been defined which handles both 32bits and 64bits
> >    peculiarities
> >
> >
> > It builds fine for 32/64 bits (w and w/o "force intrinsics"), but I really would
> > like a lot of eyes on this (and I would say, especially, rte_cycles, rte_memcpy
> > and rte_cpuflags).
> > I still have some concerns about the use of intrinsics for architecture != x86
> > but I think Chao will be the best to look at this.
> >
> >
> Acked-by: Chao Zhu <bjzhuc@cn.ibm.com>

Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>

Applied

Thanks for the big clean-up.
Now we are ready to accept new CPUs.

-- 
Thomas

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-11-05 21:48 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-28 12:50 [dpdk-dev] [PATCH v3 00/10] split architecture specific operations David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 01/10] eal: move rte_atomic.h header David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 02/10] eal: split atomic operations to architecture specific David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 03/10] eal: split byte order " David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 04/10] eal: split CPU cycle operation " David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 05/10] eal: split prefetch operations " David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 06/10] eal: split spinlock " David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 07/10] eal: split memcpy operation " David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 08/10] eal: split CPU flags operations " David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 09/10] eal: install all arch headers David Marchand
2014-10-28 12:50 ` [dpdk-dev] [PATCH v3 10/10] eal: factorize x86 headers David Marchand
2014-11-03  8:10 ` [dpdk-dev] [PATCH v3 00/10] split architecture specific operations Chao CH Zhu
2014-11-05  2:39 ` Chao Zhu
2014-11-05 21:57   ` Thomas Monjalon

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git