DPDK patches and discussions
* [dpdk-dev] [PATCHv3 0/5] ACL library
@ 2014-06-13 11:26 Konstantin Ananyev
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 1/5] Add ACL library (librte_acl) into DPDK Konstantin Ananyev
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Konstantin Ananyev @ 2014-06-13 11:26 UTC (permalink / raw)
  To: dev, dev

The ACL library is used to perform an N-tuple search over a set of rules
with multiple categories and find the best match (highest priority)
for each category.
This code was previously released under a proprietary license,
but is now being released under a BSD license to allow its
integration with the rest of the Intel DPDK codebase.
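
For reference, basic usage of the new API looks like the sketch below
(assuming the declarations in rte_acl.h from patch 1; field definitions,
rule setup and error handling are omitted - see the test-acl and
l3fwd-acl applications in this series for complete code):

	struct rte_acl_param param = {
		.name = "acl_ctx",
		.socket_id = SOCKET_ID_ANY,
		.rule_size = RTE_ACL_RULE_SZ(num_fields),
		.max_rule_num = 0x1000,
	};
	struct rte_acl_ctx *ctx = rte_acl_create(&param);

	/* add caller-filled rte_acl_rule entries, then build the tries */
	rte_acl_add_rules(ctx, rules, num_rules);
	rte_acl_build(ctx, &cfg);

	/* search: one result per category for each input buffer */
	rte_acl_classify(ctx, data, results, num, categories);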

Konstantin Ananyev (5):
  Add ACL library (librte_acl) into DPDK
  acl: update UT to reflect latest changes in the librte_acl
  acl: New test-acl application
  acl: New sample l3fwd-acl
  acl: add doxygen configuration and start page

v2 fixes:
* Fixed several checkpatch.pl issues
* Added doxygen related changes

v3 fixes:
* Fixed even more checkpatch.pl issues
* fix for: rte_classify() did not work correctly when all rules are wildcards

 app/Makefile                         |    1 +
 app/test-acl/Makefile                |   45 +
 app/test-acl/main.c                  | 1029 ++++++++++++++++
 app/test-acl/main.h                  |   50 +
 app/test/test_acl.c                  |  216 +++--
 config/common_linuxapp               |    6 +
 doc/doxy-api-index.md                |    3 +-
 doc/doxy-api.conf                    |    3 +-
 examples/Makefile                    |    1 +
 examples/l3fwd-acl/Makefile          |   56 +
 examples/l3fwd-acl/main.c            | 2140 ++++++++++++++++++++++++++++++++++
 examples/l3fwd-acl/main.h            |   45 +
 lib/librte_acl/Makefile              |   60 +
 lib/librte_acl/acl.h                 |  182 +++
 lib/librte_acl/acl_bld.c             | 2001 +++++++++++++++++++++++++++++++
 lib/librte_acl/acl_gen.c             |  473 ++++++++
 lib/librte_acl/acl_run.c             |  944 +++++++++++++++
 lib/librte_acl/acl_vect.h            |  132 +++
 lib/librte_acl/rte_acl.c             |  413 +++++++
 lib/librte_acl/rte_acl.h             |  453 +++++++
 lib/librte_acl/rte_acl_osdep.h       |   92 ++
 lib/librte_acl/rte_acl_osdep_alone.h |  277 +++++
 lib/librte_acl/tb_mem.c              |  102 ++
 lib/librte_acl/tb_mem.h              |   73 ++
 24 files changed, 8714 insertions(+), 83 deletions(-)
 create mode 100644 app/test-acl/Makefile
 create mode 100644 app/test-acl/main.c
 create mode 100644 app/test-acl/main.h
 create mode 100644 examples/l3fwd-acl/Makefile
 create mode 100644 examples/l3fwd-acl/main.c
 create mode 100644 examples/l3fwd-acl/main.h
 create mode 100644 lib/librte_acl/Makefile
 create mode 100644 lib/librte_acl/acl.h
 create mode 100644 lib/librte_acl/acl_bld.c
 create mode 100644 lib/librte_acl/acl_gen.c
 create mode 100644 lib/librte_acl/acl_run.c
 create mode 100644 lib/librte_acl/acl_vect.h
 create mode 100644 lib/librte_acl/rte_acl.c
 create mode 100644 lib/librte_acl/rte_acl.h
 create mode 100644 lib/librte_acl/rte_acl_osdep.h
 create mode 100644 lib/librte_acl/rte_acl_osdep_alone.h
 create mode 100644 lib/librte_acl/tb_mem.c
 create mode 100644 lib/librte_acl/tb_mem.h

-- 
1.7.7.6


* [dpdk-dev] [PATCHv3 1/5] Add ACL library (librte_acl) into DPDK
  2014-06-13 11:26 [dpdk-dev] [PATCHv3 0/5] ACL library Konstantin Ananyev
@ 2014-06-13 11:26 ` Konstantin Ananyev
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 2/5] acl: update UT to reflect latest changes in the librte_acl Konstantin Ananyev
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Konstantin Ananyev @ 2014-06-13 11:26 UTC (permalink / raw)
  To: dev, dev

The ACL library is used to perform an N-tuple search over a set of rules with
multiple categories and find the best match for each category.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 config/common_linuxapp               |    6 +
 lib/librte_acl/Makefile              |   60 +
 lib/librte_acl/acl.h                 |  182 +++
 lib/librte_acl/acl_bld.c             | 2001 ++++++++++++++++++++++++++++++++++
 lib/librte_acl/acl_gen.c             |  473 ++++++++
 lib/librte_acl/acl_run.c             |  944 ++++++++++++++++
 lib/librte_acl/acl_vect.h            |  132 +++
 lib/librte_acl/rte_acl.c             |  413 +++++++
 lib/librte_acl/rte_acl.h             |  453 ++++++++
 lib/librte_acl/rte_acl_osdep.h       |   92 ++
 lib/librte_acl/rte_acl_osdep_alone.h |  277 +++++
 lib/librte_acl/tb_mem.c              |  102 ++
 lib/librte_acl/tb_mem.h              |   73 ++
 13 files changed, 5208 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_acl/Makefile
 create mode 100644 lib/librte_acl/acl.h
 create mode 100644 lib/librte_acl/acl_bld.c
 create mode 100644 lib/librte_acl/acl_gen.c
 create mode 100644 lib/librte_acl/acl_run.c
 create mode 100644 lib/librte_acl/acl_vect.h
 create mode 100644 lib/librte_acl/rte_acl.c
 create mode 100644 lib/librte_acl/rte_acl.h
 create mode 100644 lib/librte_acl/rte_acl_osdep.h
 create mode 100644 lib/librte_acl/rte_acl_osdep_alone.h
 create mode 100644 lib/librte_acl/tb_mem.c
 create mode 100644 lib/librte_acl/tb_mem.h

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 7c143eb..94a8242 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -279,6 +279,12 @@ CONFIG_RTE_LIBRTE_HASH_DEBUG=n
 CONFIG_RTE_LIBRTE_LPM=y
 CONFIG_RTE_LIBRTE_LPM_DEBUG=n
 
+# Compile librte_acl
+#
+CONFIG_RTE_LIBRTE_ACL=y
+CONFIG_RTE_LIBRTE_ACL_DEBUG=n
+CONFIG_RTE_LIBRTE_ACL_STANDALONE=n
+
 #
 # Compile librte_power
 #
diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
new file mode 100644
index 0000000..4fe4593
--- /dev/null
+++ b/lib/librte_acl/Makefile
@@ -0,0 +1,60 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_acl.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += tb_mem.c
+
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_ACL)-include := rte_acl_osdep.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_ACL)-include += rte_acl.h
+
+ifeq ($(CONFIG_RTE_LIBRTE_ACL_STANDALONE),y)
+# standalone build
+SYMLINK-$(CONFIG_RTE_LIBRTE_ACL)-include += rte_acl_osdep_alone.h
+else
+# this lib needs eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ACL) += lib/librte_eal lib/librte_malloc
+endif
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_acl/acl.h b/lib/librte_acl/acl.h
new file mode 100644
index 0000000..e6d7985
--- /dev/null
+++ b/lib/librte_acl/acl.h
@@ -0,0 +1,182 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef	_ACL_H_
+#define	_ACL_H_
+
+#ifdef __cplusplus
+extern"C" {
+#endif /* __cplusplus */
+
+#define RTE_ACL_QUAD_MAX	5
+#define RTE_ACL_QUAD_SIZE	4
+#define RTE_ACL_QUAD_SINGLE	UINT64_C(0x7f7f7f7f00000000)
+
+#define RTE_ACL_SINGLE_TRIE_SIZE	2000
+
+#define RTE_ACL_DFA_MAX		UINT8_MAX
+#define RTE_ACL_DFA_SIZE	(UINT8_MAX + 1)
+
+typedef int bits_t;
+
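+/* number of bits_t words needed to hold one bit per input byte value */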
+#define	RTE_ACL_BIT_SET_SIZE	((UINT8_MAX + 1) / (sizeof(bits_t) * CHAR_BIT))
+
+struct rte_acl_bitset {
+	bits_t             bits[RTE_ACL_BIT_SET_SIZE];
+};
+
+#define	RTE_ACL_NODE_DFA	(0 << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_SINGLE	(1U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_QEXACT	(2U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_QRANGE	(3U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_MATCH	(4U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_TYPE	(7U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_UNDEFINED	UINT32_MAX
+
+/*
+ * Structure of a node is a set of ptrs and each ptr has a bit map
+ * of values associated with this transition.
+ */
+struct rte_acl_ptr_set {
+	struct rte_acl_bitset values;	/* input values associated with ptr */
+	struct rte_acl_node  *ptr;	/* transition to next node */
+};
+
+struct rte_acl_classifier_results {
+	int results[RTE_ACL_MAX_CATEGORIES];
+};
+
+struct rte_acl_match_results {
+	uint32_t results[RTE_ACL_MAX_CATEGORIES];
+	int32_t priority[RTE_ACL_MAX_CATEGORIES];
+};
+
+struct rte_acl_node {
+	uint64_t node_index;  /* index for this node */
+	uint32_t level;       /* level 0-n in the trie */
+	uint32_t ref_count;   /* ref count for this node */
+	struct rte_acl_bitset  values;
+	/* set of all values that map to another node
+	 * (union of bits in each transition).
+	 */
+	uint32_t                num_ptrs; /* number of ptr_set in use */
+	uint32_t                max_ptrs; /* number of allocated ptr_set */
+	uint32_t                min_add;  /* number of ptr_set per allocation */
+	struct rte_acl_ptr_set *ptrs;     /* transitions array for this node */
+	int32_t                 match_flag;
+	int32_t                 match_index; /* index to match data */
+	uint32_t                node_type;
+	int32_t                 fanout;
+	/* number of ranges (transitions w/ consecutive bits) */
+	int32_t                 id;
+	struct rte_acl_match_results *mrt; /* only valid when match_flag != 0 */
+	char                         transitions[RTE_ACL_QUAD_SIZE];
+	/* boundaries for ranged node */
+	struct rte_acl_node     *next;
+	/* free list link or pointer to duplicate node during merge */
+	struct rte_acl_node     *prev;
+	/* points to node from which this node was duplicated */
+
+	uint32_t                subtree_id;
+	uint32_t                subtree_ref_count;
+
+};
+
+enum {
+	RTE_ACL_SUBTREE_NODE = 0x80000000
+};
+
+/*
+ * Types of tries used to generate runtime structure(s)
+ */
+enum {
+	RTE_ACL_FULL_TRIE = 0,
+	RTE_ACL_NOSRC_TRIE = 1,
+	RTE_ACL_NODST_TRIE = 2,
+	RTE_ACL_NOPORTS_TRIE = 4,
+	RTE_ACL_NOVLAN_TRIE = 8,
+	RTE_ACL_UNUSED_TRIE = 0x80000000
+};
+
+
+/** MAX number of tries per one ACL context.*/
+#define RTE_ACL_MAX_TRIES	8
+
+/** Max number of characters in PM name.*/
+#define RTE_ACL_NAMESIZE	32
+
+
+struct rte_acl_trie {
+	uint32_t        type;
+	uint32_t        count;
+	int32_t         smallest;  /* smallest rule in this trie */
+	uint32_t        root_index;
+	const uint32_t *data_index;
+	uint32_t        num_data_indexes;
+};
+
+struct rte_acl_bld_trie {
+	struct rte_acl_node *trie;
+};
+
+struct rte_acl_ctx {
+	TAILQ_ENTRY(rte_acl_ctx) next;    /**< Next in list. */
+	char                name[RTE_ACL_NAMESIZE];
+	/** Name of the ACL context. */
+	int32_t             socket_id;
+	/** Socket ID to allocate memory from. */
+	void               *rules;
+	uint32_t            max_rules;
+	uint32_t            rule_sz;
+	uint32_t            num_rules;
+	uint32_t            num_categories;
+	uint32_t            num_tries;
+	uint32_t            match_index;
+	uint64_t            no_match;
+	uint64_t            idle;
+	uint64_t           *trans_table;
+	uint32_t           *data_indexes;
+	struct rte_acl_trie trie[RTE_ACL_MAX_TRIES];
+	void               *mem;
+	size_t              mem_sz;
+	struct rte_acl_config config; /* copy of build config. */
+};
+
+int rte_acl_gen(struct rte_acl_ctx *ctx, struct rte_acl_trie *trie,
+	struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries,
+	uint32_t num_categories, uint32_t data_index_sz, int match_num);
+
+#ifdef __cplusplus
+}
+#endif /* __cplusplus */
+
+#endif /* _ACL_H_ */
diff --git a/lib/librte_acl/acl_bld.c b/lib/librte_acl/acl_bld.c
new file mode 100644
index 0000000..66dd847
--- /dev/null
+++ b/lib/librte_acl/acl_bld.c
@@ -0,0 +1,2001 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include "tb_mem.h"
+#include "acl.h"
+
+#define	ACL_POOL_ALIGN		8
+#define	ACL_POOL_ALLOC_MIN	0x800000
+
+/* number of pointers per alloc */
+#define ACL_PTR_ALLOC	32
+
+/* thresholds used when splitting a rule set across multiple tries */
+#define NODE_MAX	2500
+#define NODE_PERCENTAGE	(0.40)
+#define RULE_PERCENTAGE	(0.40)
+
+/* TALLY are statistics per field */
+enum {
+	TALLY_0 = 0,        /* number of rules that are 0% or more wild. */
+	TALLY_25,	    /* number of rules that are 25% or more wild. */
+	TALLY_50,
+	TALLY_75,
+	TALLY_100,
+	TALLY_DEACTIVATED, /* deactivated fields (100% wild in all rules). */
+	TALLY_DEPTH,
+	/* number of rules that are 100% wild for this field and higher. */
+	TALLY_NUM
+};
+
+static const uint32_t wild_limits[TALLY_DEACTIVATED] = {0, 25, 50, 75, 100};
+
+enum {
+	ACL_INTERSECT_NONE = 0,
+	ACL_INTERSECT_A = 1,    /* set A is a superset of the intersection */
+	ACL_INTERSECT_B = 2,    /* set B is a superset of the intersection */
+	ACL_INTERSECT = 4,	/* sets A and B intersect */
+};
+
+enum {
+	ACL_PRIORITY_EQUAL = 0,
+	ACL_PRIORITY_NODE_A = 1,
+	ACL_PRIORITY_NODE_B = 2,
+	ACL_PRIORITY_MIXED = 3
+};
+
+
+struct acl_mem_block {
+	uint32_t block_size;
+	void     *mem_ptr;
+};
+
+#define	MEM_BLOCK_NUM	16
+
+/* Single ACL rule, build representation.*/
+struct rte_acl_build_rule {
+	struct rte_acl_build_rule   *next;
+	struct rte_acl_config       *config;
+	/**< configuration for each field in the rule. */
+	const struct rte_acl_rule   *f;
+	uint32_t                    *wildness;
+};
+
+/* Context for build phase */
+struct acl_build_context {
+	const struct rte_acl_ctx *acx;
+	struct rte_acl_build_rule *build_rules;
+	struct rte_acl_config     cfg;
+	uint32_t                  node;
+	uint32_t                  num_nodes;
+	uint32_t                  category_mask;
+	uint32_t                  num_rules;
+	uint32_t                  node_id;
+	uint32_t                  src_mask;
+	uint32_t                  num_build_rules;
+	uint32_t                  num_tries;
+	struct tb_mem_pool        pool;
+	struct rte_acl_trie       tries[RTE_ACL_MAX_TRIES];
+	struct rte_acl_bld_trie   bld_tries[RTE_ACL_MAX_TRIES];
+	uint32_t            data_indexes[RTE_ACL_MAX_TRIES][RTE_ACL_MAX_FIELDS];
+
+	/* memory free lists for nodes and blocks used for node ptrs */
+	struct acl_mem_block      blocks[MEM_BLOCK_NUM];
+	struct rte_acl_node       *node_free_list;
+};
+
+static int acl_merge_trie(struct acl_build_context *context,
+	struct rte_acl_node *node_a, struct rte_acl_node *node_b,
+	uint32_t level, uint32_t subtree_id, struct rte_acl_node **node_c);
+
+static int acl_merge(struct acl_build_context *context,
+	struct rte_acl_node *node_a, struct rte_acl_node *node_b,
+	int move, int a_subset, int level);
+
+static void
+acl_deref_ptr(struct acl_build_context *context,
+	struct rte_acl_node *node, int index);
+
+static void *
+acl_build_alloc(struct acl_build_context *context, size_t n, size_t s)
+{
+	uint32_t m;
+	void *p;
+	size_t alloc_size = n * s;
+
+	/*
+	 * look for memory in free lists
+	 */
+	for (m = 0; m < RTE_DIM(context->blocks); m++) {
+		if (context->blocks[m].block_size ==
+		   alloc_size && context->blocks[m].mem_ptr != NULL) {
+			p = context->blocks[m].mem_ptr;
+			context->blocks[m].mem_ptr = *((void **)p);
+			memset(p, 0, alloc_size);
+			return (p);
+		}
+	}
+
+	/*
+	 * return allocation from memory pool
+	 */
+	p = tb_alloc(&context->pool, alloc_size);
+	return (p);
+}
+
+/*
+ * Free memory blocks (kept in context for reuse).
+ */
+static void
+acl_build_free(struct acl_build_context *context, size_t s, void *p)
+{
+	uint32_t n;
+
+	for (n = 0; n < RTE_DIM(context->blocks); n++) {
+		if (context->blocks[n].block_size == s) {
+			*((void **)p) = context->blocks[n].mem_ptr;
+			context->blocks[n].mem_ptr = p;
+			return;
+		}
+	}
+	for (n = 0; n < RTE_DIM(context->blocks); n++) {
+		if (context->blocks[n].block_size == 0) {
+			context->blocks[n].block_size = s;
+			*((void **)p) = NULL;
+			context->blocks[n].mem_ptr = p;
+			return;
+		}
+	}
+}
+
+/*
+ * Allocate and initialize a new node.
+ */
+static struct rte_acl_node *
+acl_alloc_node(struct acl_build_context *context, int level)
+{
+	struct rte_acl_node *node;
+
+	if (context->node_free_list != NULL) {
+		node = context->node_free_list;
+		context->node_free_list = node->next;
+		memset(node, 0, sizeof(struct rte_acl_node));
+	} else {
+		node = acl_build_alloc(context, sizeof(struct rte_acl_node), 1);
+	}
+
+	if (node != NULL) {
+		node->num_ptrs = 0;
+		node->level = level;
+		node->node_type = RTE_ACL_NODE_UNDEFINED;
+		node->node_index = RTE_ACL_NODE_UNDEFINED;
+		context->num_nodes++;
+		node->id = context->node_id++;
+	}
+	return (node);
+}
+
+/*
+ * Dereference all nodes to which this node points and
+ * return the node to the free list.
+ */
+static void
+acl_free_node(struct acl_build_context *context,
+	struct rte_acl_node *node)
+{
+	uint32_t n;
+
+	if (node->prev != NULL)
+		node->prev->next = NULL;
+	for (n = 0; n < node->num_ptrs; n++)
+		acl_deref_ptr(context, node, n);
+
+	/* free mrt if this is a match node */
+	if (node->mrt != NULL) {
+		acl_build_free(context, sizeof(struct rte_acl_match_results),
+			node->mrt);
+		node->mrt = NULL;
+	}
+
+	/* free transitions to other nodes */
+	if (node->ptrs != NULL) {
+		acl_build_free(context,
+			node->max_ptrs * sizeof(struct rte_acl_ptr_set),
+			node->ptrs);
+		node->ptrs = NULL;
+	}
+
+	/* put it on the free list */
+	context->num_nodes--;
+	node->next = context->node_free_list;
+	context->node_free_list = node;
+}
+
+
+/*
+ * Include src bitset in dst bitset
+ */
+static void
+acl_include(struct rte_acl_bitset *dst, struct rte_acl_bitset *src, bits_t mask)
+{
+	uint32_t n;
+
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		dst->bits[n] = (dst->bits[n] & mask) | src->bits[n];
+}
+
+/*
+ * Set dst to bits of src1 that are not in src2
+ */
+static int
+acl_exclude(struct rte_acl_bitset *dst,
+	struct rte_acl_bitset *src1,
+	struct rte_acl_bitset *src2)
+{
+	uint32_t n;
+	bits_t all_bits = 0;
+
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++) {
+		dst->bits[n] = src1->bits[n] & ~src2->bits[n];
+		all_bits |= dst->bits[n];
+	}
+	return (all_bits != 0);
+}
+
+/*
+ * Add a pointer (ptr) to a node.
+ */
+static int
+acl_add_ptr(struct acl_build_context *context,
+	struct rte_acl_node *node,
+	struct rte_acl_node *ptr,
+	struct rte_acl_bitset *bits)
+{
+	uint32_t n, num_ptrs;
+	struct rte_acl_ptr_set *ptrs = NULL;
+
+	/*
+	 * If there's already a pointer to the same node, just add to the bitset
+	 */
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL) {
+			if (node->ptrs[n].ptr == ptr) {
+				acl_include(&node->ptrs[n].values, bits, -1);
+				acl_include(&node->values, bits, -1);
+				return (0);
+			}
+		}
+	}
+
+	/* if there's no room for another pointer, make room */
+	if (node->num_ptrs >= node->max_ptrs) {
+		/* add room for more pointers */
+		num_ptrs = node->max_ptrs + ACL_PTR_ALLOC;
+		if ((ptrs = acl_build_alloc(context, num_ptrs,
+				sizeof(*ptrs))) == NULL)
+			return (-ENOMEM);
+
+		/* copy current pointers to new memory allocation */
+		if (node->ptrs != NULL) {
+			memcpy(ptrs, node->ptrs,
+				node->num_ptrs * sizeof(*ptrs));
+			acl_build_free(context, node->max_ptrs * sizeof(*ptrs),
+				node->ptrs);
+		}
+		node->ptrs = ptrs;
+		node->max_ptrs = num_ptrs;
+	}
+
+	/* Find available ptr and add a new pointer to this node */
+	for (n = node->min_add; n < node->max_ptrs; n++) {
+		if (node->ptrs[n].ptr == NULL) {
+			node->ptrs[n].ptr = ptr;
+			acl_include(&node->ptrs[n].values, bits, 0);
+			acl_include(&node->values, bits, -1);
+			if (ptr != NULL)
+				ptr->ref_count++;
+			if (node->num_ptrs <= n)
+				node->num_ptrs = n + 1;
+			return (0);
+		}
+	}
+
+	return (0);
+}
+
+/*
+ * Add a pointer for a range of values
+ */
+static int
+acl_add_ptr_range(struct acl_build_context *context,
+	struct rte_acl_node *root,
+	struct rte_acl_node *node,
+	uint8_t low,
+	uint8_t high)
+{
+	uint32_t n;
+	struct rte_acl_bitset bitset;
+
+	/* clear the bitset values */
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		bitset.bits[n] = 0;
+
+	/* for each bit in range, add bit to set */
+	for (n = 0; n < UINT8_MAX + 1; n++)
+		if (n >= low && n <= high)
+			bitset.bits[n / (sizeof(bits_t) * 8)] |=
+				1 << (n % (sizeof(bits_t) * 8));
+
+	return (acl_add_ptr(context, root, node, &bitset));
+}
+
+/*
+ * Generate a bitset from a byte value and mask.
+ */
+static int
+acl_gen_mask(struct rte_acl_bitset *bitset, uint32_t value, uint32_t mask)
+{
+	int range = 0;
+	uint32_t n;
+
+	/* clear the bitset values */
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		bitset->bits[n] = 0;
+
+	/* for each bit in value/mask, add bit to set */
+	for (n = 0; n < UINT8_MAX + 1; n++) {
+		if ((n & mask) == value) {
+			range++;
+			bitset->bits[n / (sizeof(bits_t) * 8)] |=
+				1 << (n % (sizeof(bits_t) * 8));
+		}
+	}
+	return (range);
+}
+
+/*
+ * Determine how A and B intersect.
+ * Determine if A and/or B are supersets of the intersection.
+ */
+static int
+acl_intersect_type(struct rte_acl_bitset *a_bits,
+	struct rte_acl_bitset *b_bits,
+	struct rte_acl_bitset *intersect)
+{
+	uint32_t n;
+	bits_t intersect_bits = 0;
+	bits_t a_superset = 0;
+	bits_t b_superset = 0;
+
+	/*
+	 * calculate and store intersection and check if A and/or B have
+	 * bits outside the intersection (superset)
+	 */
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++) {
+		intersect->bits[n] = a_bits->bits[n] & b_bits->bits[n];
+		a_superset |= a_bits->bits[n] ^ intersect->bits[n];
+		b_superset |= b_bits->bits[n] ^ intersect->bits[n];
+		intersect_bits |= intersect->bits[n];
+	}
+
+	n = (intersect_bits == 0 ? ACL_INTERSECT_NONE : ACL_INTERSECT) |
+		(b_superset == 0 ? 0 : ACL_INTERSECT_B) |
+		(a_superset == 0 ? 0 : ACL_INTERSECT_A);
+
+	return (n);
+}
+
+/*
+ * Check if all bits in the bitset are on
+ */
+static int
+acl_full(struct rte_acl_node *node)
+{
+	uint32_t n;
+	bits_t all_bits = -1;
+
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		all_bits &= node->values.bits[n];
+	return (all_bits == -1);
+}
+
+/*
+ * Check if all bits in the bitset are off
+ */
+static int
+acl_empty(struct rte_acl_node *node)
+{
+	uint32_t n;
+
+	if (node->ref_count == 0) {
+		for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++) {
+			if (0 != node->values.bits[n])
+				return 0;
+		}
+		return (1);
+	} else {
+		return (0);
+	}
+}
+
+/*
+ * Compute intersection of A and B
+ * return 1 if there is an intersection else 0.
+ */
+static int
+acl_intersect(struct rte_acl_bitset *a_bits,
+	struct rte_acl_bitset *b_bits,
+	struct rte_acl_bitset *intersect)
+{
+	uint32_t n;
+	bits_t all_bits = 0;
+
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++) {
+		intersect->bits[n] = a_bits->bits[n] & b_bits->bits[n];
+		all_bits |= intersect->bits[n];
+	}
+	return (all_bits != 0);
+}
+
+/*
+ * Duplicate a node
+ */
+static struct rte_acl_node *
+acl_dup_node(struct acl_build_context *context, struct rte_acl_node *node)
+{
+	uint32_t n;
+	struct rte_acl_node *next;
+
+	if ((next = acl_alloc_node(context, node->level)) == NULL)
+		return (NULL);
+
+	/* allocate the pointers */
+	if (node->num_ptrs > 0) {
+		next->ptrs = acl_build_alloc(context,
+			node->max_ptrs,
+			sizeof(struct rte_acl_ptr_set));
+		if (next->ptrs == NULL)
+			return (NULL);
+		next->max_ptrs = node->max_ptrs;
+	}
+
+	/* copy over the pointers */
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL) {
+			next->ptrs[n].ptr = node->ptrs[n].ptr;
+			next->ptrs[n].ptr->ref_count++;
+			acl_include(&next->ptrs[n].values,
+				&node->ptrs[n].values, -1);
+		}
+	}
+
+	next->num_ptrs = node->num_ptrs;
+
+	/* copy over node's match results */
+	if (node->match_flag == 0)
+		next->match_flag = 0;
+	else {
+		next->match_flag = -1;
+		next->mrt = acl_build_alloc(context, 1, sizeof(*next->mrt));
+		memcpy(next->mrt, node->mrt, sizeof(*next->mrt));
+	}
+
+	/* copy over node's bitset */
+	acl_include(&next->values, &node->values, -1);
+
+	node->next = next;
+	next->prev = node;
+
+	return (next);
+}
+
+/*
+ * Dereference a pointer from a node
+ */
+static void
+acl_deref_ptr(struct acl_build_context *context,
+	struct rte_acl_node *node, int index)
+{
+	struct rte_acl_node *ref_node;
+
+	/* De-reference the node at the specified pointer */
+	if (node != NULL && node->ptrs[index].ptr != NULL) {
+		ref_node = node->ptrs[index].ptr;
+		ref_node->ref_count--;
+		if (ref_node->ref_count == 0)
+			acl_free_node(context, ref_node);
+	}
+}
+
+/*
+ * Exclude bitset from a node pointer
+ * returns  0 if pointer was deref'd
+ *          1 otherwise.
+ */
+static int
+acl_exclude_ptr(struct acl_build_context *context,
+	struct rte_acl_node *node,
+	int index,
+	struct rte_acl_bitset *b_bits)
+{
+	int retval = 1;
+
+	/*
+	 * remove bitset from node pointer and deref
+	 * if the bitset becomes empty.
+	 */
+	if (!acl_exclude(&node->ptrs[index].values,
+			&node->ptrs[index].values,
+			b_bits)) {
+		acl_deref_ptr(context, node, index);
+		node->ptrs[index].ptr = NULL;
+		retval = 0;
+	}
+
+	/* exclude bits from the composite bits for the node */
+	acl_exclude(&node->values, &node->values, b_bits);
+	return retval;
+}
+
+/*
+ * Remove a bitset from src ptr and move remaining ptr to dst
+ */
+static int
+acl_move_ptr(struct acl_build_context *context,
+	struct rte_acl_node *dst,
+	struct rte_acl_node *src,
+	int index,
+	struct rte_acl_bitset *b_bits)
+{
+	int rc;
+
+	if (b_bits != NULL)
+		if (!acl_exclude_ptr(context, src, index, b_bits))
+			return (0);
+
+	/* add src pointer to dst node */
+	if ((rc = acl_add_ptr(context, dst, src->ptrs[index].ptr,
+			&src->ptrs[index].values)) < 0)
+		return (rc);
+
+	/* remove ptr from src */
+	acl_exclude_ptr(context, src, index, &src->ptrs[index].values);
+	return (1);
+}
+
+/*
+ * Exclude b_bits from the src pointer bitset and copy the remainder to dst.
+ */
+static int
+acl_copy_ptr(struct acl_build_context *context,
+	struct rte_acl_node *dst,
+	struct rte_acl_node *src,
+	int index,
+	struct rte_acl_bitset *b_bits)
+{
+	int rc;
+	struct rte_acl_bitset bits;
+
+	if (b_bits != NULL)
+		if (!acl_exclude(&bits, &src->ptrs[index].values, b_bits))
+			return (0);
+
+	if ((rc = acl_add_ptr(context, dst, src->ptrs[index].ptr, &bits)) < 0)
+		return (rc);
+	return (1);
+}
+
+/*
+ * Fill in gaps in ptrs list with the ptr at the end of the list
+ */
+static void
+acl_compact_node_ptrs(struct rte_acl_node *node_a)
+{
+	uint32_t n;
+	int min_add = node_a->min_add;
+
+	while (node_a->num_ptrs > 0  &&
+			node_a->ptrs[node_a->num_ptrs - 1].ptr == NULL)
+		node_a->num_ptrs--;
+
+	for (n = min_add; n + 1 < node_a->num_ptrs; n++) {
+
+		/* if this entry is empty */
+		if (node_a->ptrs[n].ptr == NULL) {
+
+			/* move the last pointer to this entry */
+			acl_include(&node_a->ptrs[n].values,
+				&node_a->ptrs[node_a->num_ptrs - 1].values,
+				0);
+			node_a->ptrs[n].ptr =
+				node_a->ptrs[node_a->num_ptrs - 1].ptr;
+
+			/*
+			 * mark the end as empty and adjust the number
+			 * of used pointer entries
+			 */
+			node_a->ptrs[node_a->num_ptrs - 1].ptr = NULL;
+			while (node_a->num_ptrs > 0  &&
+				node_a->ptrs[node_a->num_ptrs - 1].ptr == NULL)
+				node_a->num_ptrs--;
+		}
+	}
+}
+
+/*
+ * acl_merge helper routine.
+ */
+static int
+acl_merge_intersect(struct acl_build_context *context,
+	struct rte_acl_node *node_a, uint32_t idx_a,
+	struct rte_acl_node *node_b, uint32_t idx_b,
+	int next_move, int level,
+	struct rte_acl_bitset *intersect_ptr)
+{
+	struct rte_acl_node *node_c;
+
+	/* Duplicate A for intersection */
+	if ((node_c = acl_dup_node(context, node_a->ptrs[idx_a].ptr)) == NULL)
+		return (-1);
+
+	/* Remove intersection from A */
+	acl_exclude_ptr(context, node_a, idx_a, intersect_ptr);
+
+	/*
+	 * Added link from A to C for all transitions
+	 * in the intersection
+	 */
+	if (acl_add_ptr(context, node_a, node_c, intersect_ptr) < 0)
+		return (-1);
+
+	/* merge B->node into C */
+	return (acl_merge(context, node_c, node_b->ptrs[idx_b].ptr, next_move,
+		0, level + 1));
+}
+
+
+/*
+ * Merge the children of nodes A and B together.
+ *
+ * if match node
+ *	For each category
+ *		node A result = highest priority result
+ * if any pointers in A intersect with any in B
+ *	For each intersection
+ *		C = copy of node that A points to
+ *		remove intersection from A pointer
+ *		add a pointer to A that points to C for the intersection
+ *		Merge C and node that B points to
+ * Compact the pointers in A and B
+ * if move flag
+ *	If B has only one reference
+ *		Move B pointers to A
+ *	else
+ *		Copy B pointers to A
+ */
+static int
+acl_merge(struct acl_build_context *context,
+	struct rte_acl_node *node_a, struct rte_acl_node *node_b,
+	int move, int a_subset, int level)
+{
+	uint32_t n, m, ptrs_a, ptrs_b;
+	uint32_t min_add_a, min_add_b;
+	int intersect_type;
+	int node_intersect_type;
+	int b_full, next_move, rc;
+	struct rte_acl_bitset intersect_values;
+	struct rte_acl_bitset intersect_ptr;
+
+	min_add_a = 0;
+	min_add_b = 0;
+	intersect_type = 0;
+	node_intersect_type = 0;
+
+	if (level == 0)
+		a_subset = 1;
+
+	/*
+	 *  Resolve match priorities
+	 */
+	if (node_a->match_flag != 0 || node_b->match_flag != 0) {
+
+		if (node_a->match_flag == 0 || node_b->match_flag == 0)
+			RTE_LOG(ERR, ACL, "Not both matches\n");
+
+		if (node_b->match_flag < node_a->match_flag)
+			RTE_LOG(ERR, ACL, "Not same match\n");
+
+		for (n = 0; n < context->cfg.num_categories; n++) {
+			if (node_a->mrt->priority[n] <
+					node_b->mrt->priority[n]) {
+				node_a->mrt->priority[n] =
+					node_b->mrt->priority[n];
+				node_a->mrt->results[n] =
+					node_b->mrt->results[n];
+			}
+		}
+	}
+
+	/*
+	 * If the two node transitions intersect then merge the transitions.
+	 * Check intersection for entire node (all pointers)
+	 */
+	node_intersect_type = acl_intersect_type(&node_a->values,
+		&node_b->values,
+		&intersect_values);
+
+	if (node_intersect_type & ACL_INTERSECT) {
+
+		b_full = acl_full(node_b);
+
+		min_add_b = node_b->min_add;
+		node_b->min_add = node_b->num_ptrs;
+		ptrs_b = node_b->num_ptrs;
+
+		min_add_a = node_a->min_add;
+		node_a->min_add = node_a->num_ptrs;
+		ptrs_a = node_a->num_ptrs;
+
+		for (n = 0; n < ptrs_a; n++) {
+			for (m = 0; m < ptrs_b; m++) {
+
+				if (node_a->ptrs[n].ptr == NULL ||
+						node_b->ptrs[m].ptr == NULL ||
+						node_a->ptrs[n].ptr ==
+						node_b->ptrs[m].ptr)
+						continue;
+
+				intersect_type = acl_intersect_type(
+					&node_a->ptrs[n].values,
+					&node_b->ptrs[m].values,
+					&intersect_ptr);
+
+				/* If this node is not a 'match' node */
+				if ((intersect_type & ACL_INTERSECT) &&
+					(context->cfg.num_categories != 1 ||
+					!(node_a->ptrs[n].ptr->match_flag))) {
+
+					/*
+					 * next merge is a 'move' pointer,
+					 * if this one is and B is a
+					 * subset of the intersection.
+					 */
+					next_move = move &&
+						(intersect_type &
+						ACL_INTERSECT_B) == 0;
+
+					if (a_subset && b_full) {
+						rc = acl_merge(context,
+							node_a->ptrs[n].ptr,
+							node_b->ptrs[m].ptr,
+							next_move,
+							1, level + 1);
+						if (rc != 0)
+							return (rc);
+					} else {
+						rc = acl_merge_intersect(
+							context, node_a, n,
+							node_b, m, next_move,
+							level, &intersect_ptr);
+						if (rc != 0)
+							return (rc);
+					}
+				}
+			}
+		}
+	}
+
+	/* Compact pointers */
+	node_a->min_add = min_add_a;
+	acl_compact_node_ptrs(node_a);
+	node_b->min_add = min_add_b;
+	acl_compact_node_ptrs(node_b);
+
+	/*
+	 *  Either COPY or MOVE pointers from B to A
+	 */
+	acl_intersect(&node_a->values, &node_b->values, &intersect_values);
+
+	if (move && node_b->ref_count == 1) {
+		for (m = 0; m < node_b->num_ptrs; m++) {
+			if (node_b->ptrs[m].ptr != NULL &&
+					acl_move_ptr(context, node_a, node_b, m,
+					&intersect_values) < 0)
+				return (-1);
+		}
+	} else {
+		for (m = 0; m < node_b->num_ptrs; m++) {
+			if (node_b->ptrs[m].ptr != NULL &&
+					acl_copy_ptr(context, node_a, node_b, m,
+					&intersect_values) < 0)
+				return (-1);
+		}
+	}
+
+	/*
+	 *  Free node B if it's empty (no longer used)
+	 */
+	if (acl_empty(node_b)) {
+		acl_free_node(context, node_b);
+	}
+	return (0);
+}
+
+static int
+acl_resolve_leaf(struct acl_build_context *context,
+	struct rte_acl_node *node_a,
+	struct rte_acl_node *node_b,
+	struct rte_acl_node **node_c)
+{
+	uint32_t n;
+	int combined_priority = ACL_PRIORITY_EQUAL;
+
+	for (n = 0; n < context->cfg.num_categories; n++) {
+		if (node_a->mrt->priority[n] != node_b->mrt->priority[n]) {
+			combined_priority |= (node_a->mrt->priority[n] >
+				node_b->mrt->priority[n]) ?
+				ACL_PRIORITY_NODE_A : ACL_PRIORITY_NODE_B;
+		}
+	}
+
+	/*
+	 * if node a is higher or equal priority for all categories,
+	 * then return node_a.
+	 */
+	if (combined_priority == ACL_PRIORITY_NODE_A ||
+			combined_priority == ACL_PRIORITY_EQUAL) {
+		*node_c = node_a;
+		return 0;
+	}
+
+	/*
+	 * if node b is higher or equal priority for all categories,
+	 * then return node_b.
+	 */
+	if (combined_priority == ACL_PRIORITY_NODE_B) {
+		*node_c = node_b;
+		return 0;
+	}
+
+	/*
+	 * mixed priorities - create a new node with the highest priority
+	 * for each category.
+	 */
+
+	/* force new duplication. */
+	node_a->next = NULL;
+
+	*node_c = acl_dup_node(context, node_a);
+	for (n = 0; n < context->cfg.num_categories; n++) {
+		if ((*node_c)->mrt->priority[n] < node_b->mrt->priority[n]) {
+			(*node_c)->mrt->priority[n] = node_b->mrt->priority[n];
+			(*node_c)->mrt->results[n] = node_b->mrt->results[n];
+		}
+	}
+	return 0;
+}
+
+/*
+ * Within the existing trie structure, determine which nodes are
+ * part of the subtree of the trie to be merged.
+ *
+ * For these purposes, a subtree is defined as the set of nodes that
+ * are 1) not a superset of the intersection with the same level of
+ * the merging tree, and 2) do not have any references from a node
+ * outside of the subtree.
+ */
+static void
+mark_subtree(struct rte_acl_node *node,
+	struct rte_acl_bitset *level_bits,
+	uint32_t level,
+	uint32_t id)
+{
+	uint32_t n;
+
+	/* mark this node as part of the subtree */
+	node->subtree_id = id | RTE_ACL_SUBTREE_NODE;
+
+	for (n = 0; n < node->num_ptrs; n++) {
+
+		if (node->ptrs[n].ptr != NULL) {
+
+			struct rte_acl_bitset intersect_bits;
+			int intersect;
+
+			/*
+			 * Item 1):
+			 * check if this child pointer is not a superset of
+			 * the same level of the merging tree.
+			 */
+			intersect = acl_intersect_type(&node->ptrs[n].values,
+				&level_bits[level],
+				&intersect_bits);
+
+			if ((intersect & ACL_INTERSECT_A) == 0) {
+
+				struct rte_acl_node *child = node->ptrs[n].ptr;
+
+				/*
+				 * reset subtree reference if this is
+				 * the first visit by this subtree.
+				 */
+				if (child->subtree_id != id) {
+					child->subtree_id = id;
+					child->subtree_ref_count = 0;
+				}
+
+				/*
+				 * Item 2):
+				 * increment the subtree reference count and if
+				 * all references are from this subtree then
+				 * recurse to that child.
+				 */
+				child->subtree_ref_count++;
+				if (child->subtree_ref_count ==
+						child->ref_count)
+					mark_subtree(child, level_bits,
+						level + 1, id);
+			}
+		}
+	}
+}
+
+/*
+ * Build the set of bits that define the set of transitions
+ * for each level of a trie.
+ */
+static void
+build_subset_mask(struct rte_acl_node *node,
+	struct rte_acl_bitset *level_bits,
+	int level)
+{
+	uint32_t n;
+
+	/* Add this node's transitions to the set for this level */
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		level_bits[level].bits[n] &= node->values.bits[n];
+
+	/* For each child, add the transitions for the next level */
+	for (n = 0; n < node->num_ptrs; n++)
+		if (node->ptrs[n].ptr != NULL)
+			build_subset_mask(node->ptrs[n].ptr, level_bits,
+				level + 1);
+}
+
+
+/*
+ * Merge nodes A and B together,
+ *   returns a node that is the path for the intersection
+ *
+ * If match node (leaf on trie)
+ *	For each category
+ *		return node = highest priority result
+ *
+ * Create C as a duplicate of A to point to child intersections
+ * If any pointers in C intersect with any in B
+ *	For each intersection
+ *		merge children
+ *		remove intersection from C pointer
+ *		add a pointer from C to child intersection node
+ * Compact the pointers in A and B
+ * Copy any B pointers that are outside of the intersection to C
+ * If C has no references to the B trie
+ *   free C and return A
+ * Else If C has no references to the A trie
+ *   free C and return B
+ * Else
+ *   return C
+ */
+static int
+acl_merge_trie(struct acl_build_context *context,
+	struct rte_acl_node *node_a, struct rte_acl_node *node_b,
+	uint32_t level, uint32_t subtree_id, struct rte_acl_node **return_c)
+{
+	uint32_t n, m, ptrs_c, ptrs_b;
+	uint32_t min_add_c, min_add_b;
+	int node_intersect_type;
+	struct rte_acl_bitset node_intersect;
+	struct rte_acl_node *node_c;
+	struct rte_acl_node *node_a_next;
+	int node_b_refs;
+	int node_a_refs;
+
+	node_c = node_a;
+	node_a_next = node_a->next;
+	min_add_c = 0;
+	min_add_b = 0;
+	node_a_refs = node_a->num_ptrs;
+	node_b_refs = 0;
+	node_intersect_type = 0;
+
+	/* Resolve leaf nodes (matches) */
+	if (node_a->match_flag != 0) {
+		acl_resolve_leaf(context, node_a, node_b, return_c);
+		return 0;
+	}
+
+	/*
+	 * Create node C as a copy of node A if node A is not part of
+	 * a subtree of the merging tree (node B side). Otherwise,
+	 * just use node A.
+	 */
+	if (level > 0 &&
+			node_a->subtree_id !=
+			(subtree_id | RTE_ACL_SUBTREE_NODE)) {
+		node_c = acl_dup_node(context, node_a);
+		node_c->subtree_id = subtree_id | RTE_ACL_SUBTREE_NODE;
+	}
+
+	/*
+	 * If the two node transitions intersect then merge the transitions.
+	 * Check intersection for entire node (all pointers)
+	 */
+	node_intersect_type = acl_intersect_type(&node_c->values,
+		&node_b->values,
+		&node_intersect);
+
+	if (node_intersect_type & ACL_INTERSECT) {
+
+		min_add_b = node_b->min_add;
+		node_b->min_add = node_b->num_ptrs;
+		ptrs_b = node_b->num_ptrs;
+
+		min_add_c = node_c->min_add;
+		node_c->min_add = node_c->num_ptrs;
+		ptrs_c = node_c->num_ptrs;
+
+		for (n = 0; n < ptrs_c; n++) {
+			if (node_c->ptrs[n].ptr == NULL) {
+				node_a_refs--;
+				continue;
+			}
+			node_c->ptrs[n].ptr->next = NULL;
+			for (m = 0; m < ptrs_b; m++) {
+
+				struct rte_acl_bitset child_intersect;
+				int child_intersect_type;
+				struct rte_acl_node *child_node_c = NULL;
+
+				if (node_b->ptrs[m].ptr == NULL ||
+						node_c->ptrs[n].ptr ==
+						node_b->ptrs[m].ptr)
+						continue;
+
+				child_intersect_type = acl_intersect_type(
+					&node_c->ptrs[n].values,
+					&node_b->ptrs[m].values,
+					&child_intersect);
+
+				if ((child_intersect_type & ACL_INTERSECT) !=
+						0) {
+					if (acl_merge_trie(context,
+							node_c->ptrs[n].ptr,
+							node_b->ptrs[m].ptr,
+							level + 1, subtree_id,
+							&child_node_c))
+						return 1;
+
+					if (child_node_c != NULL &&
+							child_node_c !=
+							node_c->ptrs[n].ptr) {
+
+						node_b_refs++;
+
+						/*
+						 * Added link from C to
+						 * child_C for all transitions
+						 * in the intersection.
+						 */
+						acl_add_ptr(context, node_c,
+							child_node_c,
+							&child_intersect);
+
+						/*
+						 * inc refs if pointer is not
+						 * to node b.
+						 */
+						node_a_refs += (child_node_c !=
+							node_b->ptrs[m].ptr);
+
+						/*
+						 * Remove intersection from C
+						 * pointer.
+						 */
+						if (!acl_exclude(
+							&node_c->ptrs[n].values,
+							&node_c->ptrs[n].values,
+							&child_intersect)) {
+							acl_deref_ptr(context,
+								node_c, n);
+							node_c->ptrs[n].ptr =
+								NULL;
+							node_a_refs--;
+						}
+					}
+				}
+			}
+		}
+
+		/* Compact pointers */
+		node_c->min_add = min_add_c;
+		acl_compact_node_ptrs(node_c);
+		node_b->min_add = min_add_b;
+		acl_compact_node_ptrs(node_b);
+	}
+
+	/*
+	 *  Copy pointers outside of the intersection from B to C
+	 */
+	if ((node_intersect_type & ACL_INTERSECT_B) != 0) {
+		node_b_refs++;
+		for (m = 0; m < node_b->num_ptrs; m++)
+			if (node_b->ptrs[m].ptr != NULL)
+				acl_copy_ptr(context, node_c,
+					node_b, m, &node_intersect);
+	}
+
+	/*
+	 * Free node C if top of trie is contained in A or B
+	 *  if node C is a duplicate of node A &&
+	 *     node C was not an existing duplicate
+	 */
+	if (node_c != node_a && node_c != node_a_next) {
+
+		/*
+		 * if the intersection has no references to the
+		 * B side, then it is contained in A
+		 */
+		if (node_b_refs == 0) {
+			acl_free_node(context, node_c);
+			node_c = node_a;
+		} else {
+			/*
+			 * if the intersection has no references to the
+			 * A side, then it is contained in B.
+			 */
+			if (node_a_refs == 0) {
+				acl_free_node(context, node_c);
+				node_c = node_b;
+			}
+		}
+	}
+
+	if (return_c != NULL)
+		*return_c = node_c;
+
+	if (level == 0)
+		acl_free_node(context, node_b);
+
+	return 0;
+}
+
+/*
+ * Reset current runtime fields before next build:
+ *  - free allocated RT memory.
+ *  - reset all RT related fields to zero.
+ */
+static void
+acl_build_reset(struct rte_acl_ctx *ctx)
+{
+	rte_free(ctx->mem);
+	memset(&ctx->num_categories, 0,
+		sizeof(*ctx) - offsetof(struct rte_acl_ctx, num_categories));
+}
+
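+/*
+ * Generate a chain of nodes, one level per byte of the field,
+ * linking root to end with transitions for the range [lo, hi].
+ */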
+static void
+acl_gen_range(struct acl_build_context *context,
+	const uint8_t *hi, const uint8_t *lo, int size, int level,
+	struct rte_acl_node *root, struct rte_acl_node *end)
+{
+	struct rte_acl_node *node, *prev;
+	uint32_t n;
+
+	prev = root;
+	for (n = size - 1; n > 0; n--) {
+		node = acl_alloc_node(context, level++);
+		acl_add_ptr_range(context, prev, node, lo[n], hi[n]);
+		prev = node;
+	}
+	acl_add_ptr_range(context, prev, end, lo[0], hi[0]);
+}
+
+static struct rte_acl_node *
+acl_gen_range_trie(struct acl_build_context *context,
+	const void *min, const void *max,
+	int size, int level, struct rte_acl_node **pend)
+{
+	int32_t n;
+	struct rte_acl_node *root;
+	const uint8_t *lo = (const uint8_t *)min;
+	const uint8_t *hi = (const uint8_t *)max;
+
+	*pend = acl_alloc_node(context, level+size);
+	root = acl_alloc_node(context, level++);
+
+	if (lo[size - 1] == hi[size - 1]) {
+		acl_gen_range(context, hi, lo, size, level, root, *pend);
+	} else {
+		uint8_t limit_lo[64];
+		uint8_t limit_hi[64];
+		uint8_t hi_ff = UINT8_MAX;
+		uint8_t lo_00 = 0;
+
+		memset(limit_lo, 0, RTE_DIM(limit_lo));
+		memset(limit_hi, UINT8_MAX, RTE_DIM(limit_hi));
+
+		for (n = size - 2; n >= 0; n--) {
+			hi_ff = (uint8_t)(hi_ff & hi[n]);
+			lo_00 = (uint8_t)(lo_00 | lo[n]);
+		}
+
+		if (hi_ff != UINT8_MAX) {
+			limit_lo[size - 1] = hi[size - 1];
+			acl_gen_range(context, hi, limit_lo, size, level,
+				root, *pend);
+		}
+
+		if (lo_00 != 0) {
+			limit_hi[size - 1] = lo[size - 1];
+			acl_gen_range(context, limit_hi, lo, size, level,
+				root, *pend);
+		}
+
+		if (hi[size - 1] - lo[size - 1] > 1 ||
+				lo_00 == 0 ||
+				hi_ff == UINT8_MAX) {
+			limit_lo[size-1] = (uint8_t)(lo[size-1] + (lo_00 != 0));
+			limit_hi[size-1] = (uint8_t)(hi[size-1] -
+				(hi_ff != UINT8_MAX));
+			acl_gen_range(context, limit_hi, limit_lo, size,
+				level, root, *pend);
+		}
+	}
+	return (root);
+}
+
+static struct rte_acl_node *
+acl_gen_mask_trie(struct acl_build_context *context,
+	const void *value, const void *mask,
+	int size, int level, struct rte_acl_node **pend)
+{
+	int32_t n;
+	struct rte_acl_node *root;
+	struct rte_acl_node *node, *prev;
+	struct rte_acl_bitset bits;
+	const uint8_t *val = (const uint8_t *)value;
+	const uint8_t *msk = (const uint8_t *)mask;
+
+	root = acl_alloc_node(context, level++);
+	prev = root;
+
+	for (n = size - 1; n >= 0; n--) {
+		node = acl_alloc_node(context, level++);
+		acl_gen_mask(&bits, val[n] & msk[n], msk[n]);
+		acl_add_ptr(context, prev, node, &bits);
+		prev = node;
+	}
+
+	*pend = prev;
+	return (root);
+}
+
+static struct rte_acl_node *
+build_trie(struct acl_build_context *context, struct rte_acl_build_rule *head,
+	struct rte_acl_build_rule **last, uint32_t *count)
+{
+	uint32_t n, m;
+	int field_index, node_count;
+	struct rte_acl_node *trie;
+	struct rte_acl_build_rule *prev, *rule;
+	struct rte_acl_node *end, *merge, *root, *end_prev;
+	const struct rte_acl_field *fld;
+	struct rte_acl_bitset level_bits[RTE_ACL_MAX_LEVELS];
+
+	prev = head;
+	rule = head;
+
+	if ((trie = acl_alloc_node(context, 0)) == NULL)
+		return (NULL);
+
+	while (rule != NULL) {
+
+		if ((root = acl_alloc_node(context, 0)) == NULL)
+			return (NULL);
+
+		root->ref_count = 1;
+		end = root;
+
+		for (n = 0; n < rule->config->num_fields; n++) {
+
+			field_index = rule->config->defs[n].field_index;
+			fld = rule->f->field + field_index;
+			end_prev = end;
+
+			/* build a mini-trie for this field */
+			switch (rule->config->defs[n].type) {
+
+			case RTE_ACL_FIELD_TYPE_BITMASK:
+				merge = acl_gen_mask_trie(context,
+					&fld->value,
+					&fld->mask_range,
+					rule->config->defs[n].size,
+					end->level + 1,
+					&end);
+				break;
+
+			case RTE_ACL_FIELD_TYPE_MASK:
+			{
+				/*
+				 * set msb for the size of the field and
+				 * all higher bits.
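+				 * E.g. a 4-byte field with a 24-bit
+				 * mask gives 0xffffff00.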
+				 */
+				uint64_t mask;
+
+				if (fld->mask_range.u32 == 0) {
+					mask = 0;
+
+				/*
+				 * shift -1 left by the field width in
+				 * bits less the mask length, setting
+				 * the top mask_range bits.
+				 */
+				} else {
+					mask = -1 <<
+						(rule->config->defs[n].size *
+						CHAR_BIT - fld->mask_range.u32);
+				}
+
+				/* gen a mini-trie for this field */
+				merge = acl_gen_mask_trie(context,
+					&fld->value,
+					(char *)&mask,
+					rule->config->defs[n].size,
+					end->level + 1,
+					&end);
+			}
+			break;
+
+			case RTE_ACL_FIELD_TYPE_RANGE:
+				merge = acl_gen_range_trie(context,
+					&rule->f->field[field_index].value,
+					&rule->f->field[field_index].mask_range,
+					rule->config->defs[n].size,
+					end->level + 1,
+					&end);
+				break;
+
+			default:
+				RTE_LOG(ERR, ACL,
+					"Error in rule[%u] type - %hhu\n",
+					rule->f->data.userdata,
+					rule->config->defs[n].type);
+				return (NULL);
+			}
+
+			/* merge this field on to the end of the rule */
+			if (acl_merge_trie(context, end_prev, merge, 0,
+					0, NULL) != 0) {
+				return (NULL);
+			}
+		}
+
+		end->match_flag = ++context->num_build_rules;
+
+		/*
+		 * Setup the results for this rule.
+		 * The result and priority of each category.
+		 */
+		if (end->mrt == NULL &&
+				(end->mrt = acl_build_alloc(context, 1,
+				sizeof(*end->mrt))) == NULL)
+			return (NULL);
+
+		for (m = 0; m < context->cfg.num_categories; m++) {
+			if (rule->f->data.category_mask & (1 << m)) {
+				end->mrt->results[m] = rule->f->data.userdata;
+				end->mrt->priority[m] = rule->f->data.priority;
+			} else {
+				end->mrt->results[m] = 0;
+				end->mrt->priority[m] = 0;
+			}
+		}
+
+		node_count = context->num_nodes;
+
+		memset(&level_bits[0], UINT8_MAX, sizeof(level_bits));
+		build_subset_mask(root, &level_bits[0], 0);
+		mark_subtree(trie, &level_bits[0], 0, end->match_flag);
+		(*count)++;
+
+		/* merge this rule into the trie */
+		if (acl_merge_trie(context, trie, root, 0, end->match_flag,
+			NULL))
+			return NULL;
+
+		node_count = context->num_nodes - node_count;
+		if (node_count > NODE_MAX) {
+			*last = prev;
+			return trie;
+		}
+
+		prev = rule;
+		rule = rule->next;
+	}
+
+	*last = NULL;
+	return trie;
+}
+
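+/*
+ * For every rule, estimate how wild each field is as a percentage
+ * (0 = fully specified, 100 = fully wild) of its value space.
+ */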
+static int
+acl_calc_wildness(struct rte_acl_build_rule *head,
+	const struct rte_acl_config *config)
+{
+	uint32_t n;
+	struct rte_acl_build_rule *rule;
+
+	for (rule = head; rule != NULL; rule = rule->next) {
+
+		for (n = 0; n < config->num_fields; n++) {
+
+			double wild = 0;
+			double size = CHAR_BIT * config->defs[n].size;
+			int field_index = config->defs[n].field_index;
+			const struct rte_acl_field *fld = rule->f->field +
+				field_index;
+
+			switch (rule->config->defs[n].type) {
+			case RTE_ACL_FIELD_TYPE_BITMASK:
+				wild = (size -
+					_mm_popcnt_u32(fld->mask_range.u8)) /
+					size;
+				break;
+
+			case RTE_ACL_FIELD_TYPE_MASK:
+				wild = (size - fld->mask_range.u32) / size;
+				break;
+
+			case RTE_ACL_FIELD_TYPE_RANGE:
+				switch (rule->config->defs[n].size) {
+				case sizeof(uint8_t):
+					wild = ((double)fld->mask_range.u8 -
+						fld->value.u8) / UINT8_MAX;
+					break;
+				case sizeof(uint16_t):
+					wild = ((double)fld->mask_range.u16 -
+						fld->value.u16) / UINT16_MAX;
+					break;
+				case sizeof(uint32_t):
+					wild = ((double)fld->mask_range.u32 -
+						fld->value.u32) / UINT32_MAX;
+					break;
+				case sizeof(uint64_t):
+					wild = ((double)fld->mask_range.u64 -
+						fld->value.u64) / UINT64_MAX;
+					break;
+				default:
+					RTE_LOG(ERR, ACL,
+						"%s(rule: %u) invalid %u-th "
+						"field, type: %hhu, "
+						"unknown size: %hhu\n",
+						__func__,
+						rule->f->data.userdata,
+						n,
+						rule->config->defs[n].type,
+						rule->config->defs[n].size);
+					return (-EINVAL);
+				}
+				break;
+
+			default:
+				RTE_LOG(ERR, ACL,
+					"%s(rule: %u) invalid %u-th "
+					"field, unknown type: %hhu\n",
+					__func__,
+					rule->f->data.userdata,
+					n,
+					rule->config->defs[n].type);
+				return (-EINVAL);
+
+			}
+
+			rule->wildness[field_index] = (uint32_t)(wild * 100);
+		}
+	}
+
+	return (0);
+}
+
+static int
+acl_rule_stats(struct rte_acl_build_rule *head, struct rte_acl_config *config,
+	uint32_t *wild_limit)
+{
+	int min;
+	struct rte_acl_build_rule *rule;
+	uint32_t n, m, fields_deactivated = 0;
+	uint32_t start = 0, deactivate = 0;
+	int tally[RTE_ACL_MAX_LEVELS][TALLY_NUM];
+
+	memset(tally, 0, sizeof(tally));
+
+	for (rule = head; rule != NULL; rule = rule->next) {
+
+		for (n = 0; n < config->num_fields; n++) {
+			uint32_t field_index = config->defs[n].field_index;
+
+			tally[n][TALLY_0]++;
+			for (m = 1; m < RTE_DIM(wild_limits); m++) {
+				if (rule->wildness[field_index] >=
+						wild_limits[m])
+					tally[n][m]++;
+			}
+		}
+
+		for (n = config->num_fields - 1; n > 0; n--) {
+			uint32_t field_index = config->defs[n].field_index;
+
+			if (rule->wildness[field_index] == 100)
+				tally[n][TALLY_DEPTH]++;
+			else
+				break;
+		}
+	}
+
+	/*
+	 * Look for any field that is always wild and drop it from the config
+	 * Only deactivate if all fields for a given input loop are deactivated.
+	 */
+	for (n = 1; n < config->num_fields; n++) {
+		if (config->defs[n].input_index !=
+				config->defs[n - 1].input_index) {
+			for (m = start; m < n; m++)
+				tally[m][TALLY_DEACTIVATED] = deactivate;
+			fields_deactivated += deactivate;
+			start = n;
+			deactivate = 1;
+		}
+
+		/* if the field is not always completely wild */
+		if (tally[n][TALLY_100] != tally[n][TALLY_0])
+			deactivate = 0;
+	}
+
+	for (m = start; m < n; m++)
+		tally[m][TALLY_DEACTIVATED] = deactivate;
+
+	fields_deactivated += deactivate;
+
+	/* remove deactivated fields */
+	if (fields_deactivated) {
+		uint32_t k, l = 0;
+
+		for (k = 0; k < config->num_fields; k++) {
+			if (tally[k][TALLY_DEACTIVATED] == 0) {
+				memcpy(&tally[l][0], &tally[k][0],
+					TALLY_NUM * sizeof(tally[0][0]));
+				memcpy(&config->defs[l++],
+					&config->defs[k],
+					sizeof(struct rte_acl_field_def));
+			}
+		}
+		config->num_fields = l;
+	}
+
+	min = RTE_ACL_SINGLE_TRIE_SIZE;
+	if (config->num_fields == 2)
+		min *= 4;
+	else if (config->num_fields == 3)
+		min *= 3;
+	else if (config->num_fields == 4)
+		min *= 2;
+
+	if (tally[0][TALLY_0] < min)
+		return 0;
+	for (n = 0; n < config->num_fields; n++)
+		wild_limit[n] = 0;
+
+	/*
+	 * If trailing fields are 100% wild, group those together.
+	 * This allows the search length of the trie to be shortened.
+	 */
+	for (n = 1; n < config->num_fields; n++) {
+
+		double rule_percentage = (double)tally[n][TALLY_DEPTH] /
+			tally[n][0];
+
+		if (rule_percentage > RULE_PERCENTAGE) {
+			/* if it crosses an input boundary then round up */
+			while (config->defs[n - 1].input_index ==
+					config->defs[n].input_index)
+				n++;
+
+			/* set the limit for selecting rules */
+			while (n < config->num_fields)
+				wild_limit[n++] = 100;
+
+			if (wild_limit[n - 1] == 100)
+				return 1;
+		}
+	}
+
+	/* look for the highest wildness level met by 40%-80% of the rules */
+	for (n = 1; n < config->num_fields; n++) {
+		for (m = TALLY_100; m > 0; m--) {
+
+			double rule_percentage = (double)tally[n][m] /
+				tally[n][0];
+
+			if (tally[n][TALLY_DEACTIVATED] == 0 &&
+					tally[n][TALLY_0] >
+					RTE_ACL_SINGLE_TRIE_SIZE &&
+					rule_percentage > NODE_PERCENTAGE &&
+					rule_percentage < 0.80) {
+				wild_limit[n] = wild_limits[m];
+				return 1;
+			}
+		}
+	}
+	return 0;
+}
+
+static int
+order(struct rte_acl_build_rule **insert, struct rte_acl_build_rule *rule)
+{
+	uint32_t n;
+	struct rte_acl_build_rule *left = *insert;
+
+	if (left == NULL)
+		return (0);
+
+	for (n = 1; n < left->config->num_fields; n++) {
+		int field_index = left->config->defs[n].field_index;
+
+		if (left->wildness[field_index] != rule->wildness[field_index])
+			return (left->wildness[field_index] >=
+				rule->wildness[field_index]);
+	}
+	return (0);
+}
+
+static struct rte_acl_build_rule *
+ordered_insert_rule(struct rte_acl_build_rule *head,
+	struct rte_acl_build_rule *rule)
+{
+	struct rte_acl_build_rule **insert;
+
+	if (rule == NULL)
+		return head;
+
+	rule->next = head;
+	if (head == NULL)
+		return rule;
+
+	insert = &head;
+	while (order(insert, rule)) {
+		insert = &(*insert)->next;
+	}
+
+	rule->next = *insert;
+	*insert = rule;
+	return (head);
+}
+
+static struct rte_acl_build_rule *
+sort_rules(struct rte_acl_build_rule *head)
+{
+	struct rte_acl_build_rule *rule, *reordered_head = NULL;
+	struct rte_acl_build_rule *last_rule = NULL;
+
+	for (rule = head; rule != NULL; rule = rule->next) {
+		reordered_head = ordered_insert_rule(reordered_head, last_rule);
+		last_rule = rule;
+	}
+
+	if (last_rule != reordered_head) {
+		reordered_head = ordered_insert_rule(reordered_head, last_rule);
+	}
+
+	return reordered_head;
+}
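
For reference, sort_rules() above is a plain ordered insertion keyed on per-field
wildness, most-wild first, so more specific rules end up deeper in the list. A
minimal standalone sketch of the same pattern, using a hypothetical toy_rule type
that carries a single wildness value instead of one per field:

#include <stddef.h>

struct toy_rule {
	unsigned int wildness;		/* 0..100, as in rule->wildness[] */
	struct toy_rule *next;
};

/* Insert one rule, keeping the list ordered most-wild first. */
struct toy_rule *
toy_insert(struct toy_rule *head, struct toy_rule *rule)
{
	struct toy_rule **ins = &head;

	/* walk past entries that are at least as wild as the new rule */
	while (*ins != NULL && (*ins)->wildness >= rule->wildness)
		ins = &(*ins)->next;

	rule->next = *ins;
	*ins = rule;
	return head;
}
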
+
+static uint32_t
+acl_build_index(const struct rte_acl_config *config, uint32_t *data_index)
+{
+	uint32_t n, m;
+	int32_t last_header;
+
+	m = 0;
+	last_header = -1;
+
+	for (n = 0; n < config->num_fields; n++) {
+		if (last_header != config->defs[n].input_index) {
+			last_header = config->defs[n].input_index;
+			data_index[m++] = config->defs[n].offset;
+		}
+	}
+
+	return (m);
+}
+
+static int
+acl_build_tries(struct acl_build_context *context,
+	struct rte_acl_build_rule *head)
+{
+	int32_t rc;
+	uint32_t n, m, num_tries;
+	struct rte_acl_config *config;
+	struct rte_acl_build_rule *last, *rule;
+	uint32_t wild_limit[RTE_ACL_MAX_LEVELS];
+	struct rte_acl_build_rule *rule_sets[RTE_ACL_MAX_TRIES];
+
+	config = head->config;
+	rule = head;
+	rule_sets[0] = head;
+	num_tries = 1;
+
+	/* initialize tries */
+	for (n = 0; n < RTE_DIM(context->tries); n++) {
+		context->tries[n].type = RTE_ACL_UNUSED_TRIE;
+		context->bld_tries[n].trie = NULL;
+		context->tries[n].count = 0;
+		context->tries[n].smallest = INT32_MAX;
+	}
+
+	context->tries[0].type = RTE_ACL_FULL_TRIE;
+
+	/* calc wildness of each field of each rule */
+	if ((rc = acl_calc_wildness(head, config)) != 0)
+		return (rc);
+
+	n = acl_rule_stats(head, config, &wild_limit[0]);
+
+	/* put all rules that fit the wildness criteria into a separate trie */
+	while (n > 0 && num_tries < RTE_ACL_MAX_TRIES) {
+
+		struct rte_acl_config *new_config;
+		struct rte_acl_build_rule **prev = &rule_sets[num_tries - 1];
+		struct rte_acl_build_rule *next = head->next;
+
+		if ((new_config = acl_build_alloc(context, 1,
+				sizeof(*new_config))) == NULL) {
+			RTE_LOG(ERR, ACL,
+				"Failed to get space for new config\n");
+			return (-ENOMEM);
+		}
+
+		memcpy(new_config, config, sizeof(*new_config));
+		config = new_config;
+		rule_sets[num_tries] = NULL;
+
+		for (rule = head; rule != NULL; rule = next) {
+
+			int move = 1;
+
+			next = rule->next;
+			for (m = 0; m < config->num_fields; m++) {
+				int x = config->defs[m].field_index;
+				if (rule->wildness[x] < wild_limit[m]) {
+					move = 0;
+					break;
+				}
+			}
+
+			if (move) {
+				rule->config = new_config;
+				rule->next = rule_sets[num_tries];
+				rule_sets[num_tries] = rule;
+				*prev = next;
+			} else
+				prev = &rule->next;
+		}
+
+		head = rule_sets[num_tries];
+		n = acl_rule_stats(rule_sets[num_tries], config,
+			&wild_limit[0]);
+		num_tries++;
+	}
+
+	if (n > 0)
+		RTE_LOG(DEBUG, ACL,
+			"Number of tries(%d) exceeded.\n", RTE_ACL_MAX_TRIES);
+
+	for (n = 0; n < num_tries; n++) {
+
+		rule_sets[n] = sort_rules(rule_sets[n]);
+		context->tries[n].type = RTE_ACL_FULL_TRIE;
+		context->tries[n].count = 0;
+		context->tries[n].num_data_indexes =
+			acl_build_index(rule_sets[n]->config,
+			context->data_indexes[n]);
+		context->tries[n].data_index = context->data_indexes[n];
+
+		if ((context->bld_tries[n].trie =
+				build_trie(context, rule_sets[n],
+				&last, &context->tries[n].count)) == NULL) {
+			RTE_LOG(ERR, ACL, "Build of %u-th trie failed\n", n);
+			return (-ENOMEM);
+		}
+
+		if (last != NULL) {
+			rule_sets[num_tries++] = last->next;
+			last->next = NULL;
+			acl_free_node(context, context->bld_tries[n].trie);
+			context->tries[n].count = 0;
+
+			if ((context->bld_tries[n].trie =
+					build_trie(context,
+					rule_sets[n], &last,
+					&context->tries[n].count)) == NULL) {
+				RTE_LOG(ERR, ACL,
+					"Build of %u-th trie failed\n", n);
+				return (-ENOMEM);
+			}
+		}
+	}
+
+	context->num_tries = num_tries;
+	return (0);
+}
+
+static void
+acl_build_log(const struct acl_build_context *ctx)
+{
+	uint32_t n;
+
+	RTE_LOG(DEBUG, ACL, "Build phase for ACL \"%s\":\n"
+		"memory consumed: %zu\n",
+		ctx->acx->name,
+		ctx->pool.alloc);
+
+	for (n = 0; n < RTE_DIM(ctx->tries); n++) {
+		if (ctx->tries[n].count != 0)
+			RTE_LOG(DEBUG, ACL,
+				"trie %u: number of rules: %u\n",
+				n, ctx->tries[n].count);
+	}
+}
+
+static int
+acl_build_rules(struct acl_build_context *bcx)
+{
+	struct rte_acl_build_rule *br, *head;
+	const struct rte_acl_rule *rule;
+	uint32_t *wp;
+	uint32_t fn, i, n, num;
+	size_t ofs, sz;
+
+	fn = bcx->cfg.num_fields;
+	n = bcx->acx->num_rules;
+	ofs = n * sizeof(*br);
+	sz = ofs + n * fn * sizeof(*wp);
+
+	if ((br = tb_alloc(&bcx->pool, sz)) == NULL) {
+		RTE_LOG(ERR, ACL, "ACL context %s: failed to create a copy "
+			"of %u build rules (%zu bytes)\n",
+			bcx->acx->name, n, sz);
+		return (-ENOMEM);
+	}
+
+	wp = (uint32_t *)((uintptr_t)br + ofs);
+	num = 0;
+	head = NULL;
+
+	for (i = 0; i != n; i++) {
+		rule = (const struct rte_acl_rule *)
+			((uintptr_t)bcx->acx->rules + bcx->acx->rule_sz * i);
+		if ((rule->data.category_mask & bcx->category_mask) != 0) {
+			br[num].next = head;
+			br[num].config = &bcx->cfg;
+			br[num].f = rule;
+			br[num].wildness = wp;
+			wp += fn;
+			head = br + num;
+			num++;
+		}
+	}
+
+	bcx->num_rules = num;
+	bcx->build_rules = head;
+
+	return (0);
+}
+
+/*
+ * Copy the data_indexes for each trie into their runtime (RT) location.
+ */
+static void
+acl_set_data_indexes(struct rte_acl_ctx *ctx)
+{
+	uint32_t i, n, ofs;
+
+	ofs = 0;
+	for (i = 0; i != ctx->num_tries; i++) {
+		n = ctx->trie[i].num_data_indexes;
+		memcpy(ctx->data_indexes + ofs, ctx->trie[i].data_index,
+			n * sizeof(ctx->data_indexes[0]));
+		ctx->trie[i].data_index = ctx->data_indexes + ofs;
+		ofs += n;
+	}
+}
+
+int
+rte_acl_build(struct rte_acl_ctx *ctx, const struct rte_acl_config *cfg)
+{
+	int rc;
+	struct acl_build_context bcx;
+
+	if (ctx == NULL || cfg == NULL || cfg->num_categories == 0 ||
+			cfg->num_categories > RTE_ACL_MAX_CATEGORIES)
+		return -(EINVAL);
+
+	acl_build_reset(ctx);
+
+	memset(&bcx, 0, sizeof(bcx));
+	bcx.acx = ctx;
+	bcx.pool.alignment = ACL_POOL_ALIGN;
+	bcx.pool.min_alloc = ACL_POOL_ALLOC_MIN;
+	bcx.cfg = *cfg;
+	bcx.category_mask = LEN2MASK(bcx.cfg.num_categories);
+
+	/* Create a copy of the build rules. */
+	if ((rc = acl_build_rules(&bcx)) != 0)
+		return (rc);
+
+	/* No rules to build for that context+config */
+	if (bcx.build_rules == NULL) {
+		rc = -EINVAL;
+
+	/* build internal trie representation. */
+	} else if ((rc = acl_build_tries(&bcx, bcx.build_rules)) == 0) {
+
+		/* allocate and fill run-time structures. */
+		if ((rc = rte_acl_gen(ctx, bcx.tries, bcx.bld_tries,
+				bcx.num_tries, bcx.cfg.num_categories,
+				RTE_ACL_IPV4VLAN_NUM * RTE_DIM(bcx.tries),
+				bcx.num_build_rules)) == 0) {
+
+			/* set data indexes. */
+			acl_set_data_indexes(ctx);
+
+			/* copy in build config. */
+			ctx->config = *cfg;
+		}
+	}
+
+	acl_build_log(&bcx);
+
+	/* cleanup after build. */
+	tb_free_pool(&bcx.pool);
+	return (rc);
+}
diff --git a/lib/librte_acl/acl_gen.c b/lib/librte_acl/acl_gen.c
new file mode 100644
index 0000000..4b4862c
--- /dev/null
+++ b/lib/librte_acl/acl_gen.c
@@ -0,0 +1,473 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include "acl_vect.h"
+#include "acl.h"
+
+#define	QRANGE_MIN	((uint8_t)INT8_MIN)
+
+#define	RTE_ACL_VERIFY(exp)	do {                                          \
+	if (!(exp))                                                           \
+		rte_panic("line %d\tassert \"" #exp "\" failed\n", __LINE__); \
+} while (0)
+
+struct acl_node_counters {
+	int                match;
+	int                match_used;
+	int                single;
+	int                quad;
+	int                quad_vectors;
+	int                dfa;
+	int                smallest_match;
+};
+
+struct rte_acl_indices {
+	int                dfa_index;
+	int                quad_index;
+	int                single_index;
+	int                match_index;
+};
+
+static void
+acl_gen_log_stats(const struct rte_acl_ctx *ctx,
+	const struct acl_node_counters *counts)
+{
+	RTE_LOG(DEBUG, ACL, "Gen phase for ACL \"%s\":\n"
+		"runtime memory footprint on socket %d:\n"
+		"single nodes/bytes used: %d/%zu\n"
+		"quad nodes/bytes used: %d/%zu\n"
+		"DFA nodes/bytes used: %d/%zu\n"
+		"match nodes/bytes used: %d/%zu\n"
+		"total: %zu bytes\n",
+		ctx->name, ctx->socket_id,
+		counts->single, counts->single * sizeof(uint64_t),
+		counts->quad, counts->quad_vectors * sizeof(uint64_t),
+		counts->dfa, counts->dfa * RTE_ACL_DFA_SIZE * sizeof(uint64_t),
+		counts->match,
+		counts->match * sizeof(struct rte_acl_match_results),
+		ctx->mem_sz);
+}
+
+/*
+ * Counts the number of groups of sequential bits that are
+ * either 0 or 1, as specified by the zero_one parameter. This is used to
+ * calculate the number of ranges in a node to see if it fits in a quad range
+ * node.
+ */
+static int
+acl_count_sequential_groups(struct rte_acl_bitset *bits, int zero_one)
+{
+	int n, ranges, last_bit;
+
+	ranges = 0;
+	last_bit = zero_one ^ 1;
+
+	for (n = QRANGE_MIN; n < UINT8_MAX + 1; n++) {
+		if (bits->bits[n / (sizeof(bits_t) * 8)] &
+				(1 << (n % (sizeof(bits_t) * 8)))) {
+			if (zero_one == 1 && last_bit != 1)
+				ranges++;
+			last_bit = 1;
+		} else {
+			if (zero_one == 0 && last_bit != 0)
+				ranges++;
+			last_bit = 0;
+		}
+	}
+	for (n = 0; n < QRANGE_MIN; n++) {
+		if (bits->bits[n / (sizeof(bits_t) * 8)] &
+				(1 << (n % (sizeof(bits_t) * 8)))) {
+			if (zero_one == 1 && last_bit != 1)
+				ranges++;
+			last_bit = 1;
+		} else {
+			if (zero_one == 0 && last_bit != 0)
+				ranges++;
+			last_bit = 0;
+		}
+	}
+
+	return (ranges);
+}
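
For illustration, the same run-counting over a 256-entry bitmap can be written
with a single loop that remaps the scan order. A sketch that counts one-runs
only (the routine above also counts zero-runs via the zero_one parameter):

#include <stdint.h>

int
count_one_runs(const uint32_t bits[8])
{
	int runs = 0, last = 0;
	unsigned int i, n;

	for (i = 0; i < 256; i++) {
		n = (i + 0x80) & 0xff;	/* scan 0x80..0xff, then 0x00..0x7f */
		if (bits[n / 32] & (1u << (n % 32))) {
			runs += (last == 0);
			last = 1;
		} else
			last = 0;
	}
	return runs;
}
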
+
+/*
+ * Count number of ranges spanned by the node's pointers
+ */
+static int
+acl_count_fanout(struct rte_acl_node *node)
+{
+	uint32_t n;
+	int ranges;
+
+	if (node->fanout != 0)
+		return (node->fanout);
+
+	ranges = acl_count_sequential_groups(&node->values, 0);
+
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL)
+			ranges += acl_count_sequential_groups(
+				&node->ptrs[n].values, 1);
+	}
+
+	node->fanout = ranges;
+	return (node->fanout);
+}
+
+/*
+ * Determine the type of nodes and count each type
+ */
+static int
+acl_count_trie_types(struct acl_node_counters *counts,
+	struct rte_acl_node *node, int match, int force_dfa)
+{
+	uint32_t n;
+	int num_ptrs;
+
+	/* skip if this node has been counted */
+	if (node->node_type != (uint32_t)RTE_ACL_NODE_UNDEFINED)
+		return (match);
+
+	if (node->match_flag != 0 || node->num_ptrs == 0) {
+		counts->match++;
+		if (node->match_flag == -1)
+			node->match_flag = match++;
+		node->node_type = RTE_ACL_NODE_MATCH;
+		if (counts->smallest_match > node->match_flag)
+			counts->smallest_match = node->match_flag;
+		return match;
+	}
+
+	num_ptrs = acl_count_fanout(node);
+
+	/* Force type to dfa */
+	if (force_dfa)
+		num_ptrs = RTE_ACL_DFA_SIZE;
+
+	/* determine node type based on number of ranges */
+	if (num_ptrs == 1) {
+		counts->single++;
+		node->node_type = RTE_ACL_NODE_SINGLE;
+	} else if (num_ptrs <= RTE_ACL_QUAD_MAX) {
+		counts->quad++;
+		counts->quad_vectors += node->fanout;
+		node->node_type = RTE_ACL_NODE_QRANGE;
+	} else {
+		counts->dfa++;
+		node->node_type = RTE_ACL_NODE_DFA;
+	}
+
+	/*
+	 * recursively count the types of all children
+	 */
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL)
+			match = acl_count_trie_types(counts, node->ptrs[n].ptr,
+				match, 0);
+	}
+
+	return (match);
+}
+
+static void
+acl_add_ptrs(struct rte_acl_node *node, uint64_t *node_array, uint64_t no_match,
+	int resolved)
+{
+	uint32_t n, x;
+	int m, ranges, last_bit;
+	struct rte_acl_node *child;
+	struct rte_acl_bitset *bits;
+	uint64_t *node_a, index, dfa[RTE_ACL_DFA_SIZE];
+
+	ranges = 0;
+	last_bit = 0;
+
+	for (n = 0; n < RTE_DIM(dfa); n++)
+		dfa[n] = no_match;
+
+	for (x = 0; x < node->num_ptrs; x++) {
+
+		if ((child = node->ptrs[x].ptr) == NULL)
+			continue;
+
+		bits = &node->ptrs[x].values;
+		for (n = 0; n < RTE_DIM(dfa); n++) {
+
+			if (bits->bits[n / (sizeof(bits_t) * CHAR_BIT)] &
+				(1 << (n % (sizeof(bits_t) * CHAR_BIT)))) {
+
+				dfa[n] = resolved ? child->node_index : x;
+				ranges += (last_bit == 0);
+				last_bit = 1;
+			} else {
+				last_bit = 0;
+			}
+		}
+	}
+
+	/*
+	 * Rather than going from 0 to 256, the range count and
+	 * the layout run from 0x80-0xff and then 0x00-0x7f, due to the
+	 * signed compare used by SSE (cmpgt).
+	 */
+	if (node->node_type == RTE_ACL_NODE_QRANGE) {
+
+		m = 0;
+		node_a = node_array;
+		index = dfa[QRANGE_MIN];
+		*node_a++ = index;
+
+		for (x = QRANGE_MIN + 1; x < UINT8_MAX + 1; x++) {
+			if (dfa[x] != index) {
+				index = dfa[x];
+				*node_a++ = index;
+				node->transitions[m++] = (uint8_t)(x - 1);
+			}
+		}
+
+		for (x = 0; x < INT8_MAX + 1; x++) {
+			if (dfa[x] != index) {
+				index = dfa[x];
+				*node_a++ = index;
+				node->transitions[m++] = (uint8_t)(x - 1);
+			}
+		}
+
+		/* fill unused locations with max value - nothing is greater */
+		for (; m < RTE_ACL_QUAD_SIZE; m++)
+			node->transitions[m] = INT8_MAX;
+
+		RTE_ACL_VERIFY(m <= RTE_ACL_QUAD_SIZE);
+
+	} else if (node->node_type == RTE_ACL_NODE_DFA && resolved) {
+		for (n = 0; n < RTE_DIM(dfa); n++)
+			node_array[n] = dfa[n];
+	}
+}
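
To make the QRANGE layout concrete: the node keeps sorted signed byte
boundaries, and the child slot for an input byte is simply the number of
boundaries that compare (signed) less than it. A scalar sketch of that lookup:

#include <stdint.h>

/* bounds[] holds up to RTE_ACL_QUAD_SIZE sorted signed boundaries. */
unsigned int
qrange_slot(const int8_t *bounds, unsigned int nb, uint8_t input)
{
	unsigned int n, slot = 0;

	for (n = 0; n != nb; n++)
		if (bounds[n] < (int8_t)input)
			slot++;
	return slot;
}
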
+
+/*
+ * Routine that allocates space for this node and recursively calls itself
+ * to allocate space for each child. Once all the children are allocated,
+ * it resolves all transitions for this node.
+ */
+static void
+acl_gen_node(struct rte_acl_node *node, uint64_t *node_array,
+	uint64_t no_match, struct rte_acl_indices *index, int num_categories)
+{
+	uint32_t n, *qtrp;
+	uint64_t *array_ptr;
+	struct rte_acl_match_results *match;
+
+	if (node->node_index != RTE_ACL_NODE_UNDEFINED)
+		return;
+
+	array_ptr = NULL;
+
+	switch (node->node_type) {
+	case RTE_ACL_NODE_DFA:
+		node->node_index = index->dfa_index | node->node_type;
+		array_ptr = &node_array[index->dfa_index];
+		index->dfa_index += RTE_ACL_DFA_SIZE;
+		for (n = 0; n < RTE_ACL_DFA_SIZE; n++)
+			array_ptr[n] = no_match;
+		break;
+	case RTE_ACL_NODE_SINGLE:
+		node->node_index = RTE_ACL_QUAD_SINGLE | index->single_index |
+			node->node_type;
+		array_ptr = &node_array[index->single_index];
+		index->single_index += 1;
+		array_ptr[0] = no_match;
+		break;
+	case RTE_ACL_NODE_QRANGE:
+		array_ptr = &node_array[index->quad_index];
+		acl_add_ptrs(node, array_ptr, no_match,  0);
+		qtrp = (uint32_t *)node->transitions;
+		node->node_index = qtrp[0];
+		node->node_index <<= sizeof(index->quad_index) * CHAR_BIT;
+		node->node_index |= index->quad_index | node->node_type;
+		index->quad_index += node->fanout;
+		break;
+	case RTE_ACL_NODE_MATCH:
+		match = ((struct rte_acl_match_results *)
+			(node_array + index->match_index));
+		memcpy(match + node->match_flag, node->mrt, sizeof(*node->mrt));
+		node->node_index = node->match_flag | node->node_type;
+		break;
+	case RTE_ACL_NODE_UNDEFINED:
+		RTE_ACL_VERIFY(node->node_type !=
+			(uint32_t)RTE_ACL_NODE_UNDEFINED);
+		break;
+	}
+
+	/* recursively allocate space for all children */
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL)
+			acl_gen_node(node->ptrs[n].ptr,
+				node_array,
+				no_match,
+				index,
+				num_categories);
+	}
+
+	/* All children are resolved, resolve this node's pointers */
+	switch (node->node_type) {
+	case RTE_ACL_NODE_DFA:
+		acl_add_ptrs(node, array_ptr, no_match, 1);
+		break;
+	case RTE_ACL_NODE_SINGLE:
+		for (n = 0; n < node->num_ptrs; n++) {
+			if (node->ptrs[n].ptr != NULL)
+				array_ptr[0] = node->ptrs[n].ptr->node_index;
+		}
+		break;
+	case RTE_ACL_NODE_QRANGE:
+		acl_add_ptrs(node, array_ptr, no_match, 1);
+		break;
+	case RTE_ACL_NODE_MATCH:
+		break;
+	case RTE_ACL_NODE_UNDEFINED:
+		RTE_ACL_VERIFY(node->node_type !=
+			(uint32_t)RTE_ACL_NODE_UNDEFINED);
+		break;
+	}
+}
+
+static int
+acl_calc_counts_indicies(struct acl_node_counters *counts,
+	struct rte_acl_indices *indices, struct rte_acl_trie *trie,
+	struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries,
+	int match_num)
+{
+	uint32_t n;
+
+	memset(indices, 0, sizeof(*indices));
+	memset(counts, 0, sizeof(*counts));
+
+	/* Get stats on nodes */
+	for (n = 0; n < num_tries; n++) {
+		counts->smallest_match = INT32_MAX;
+		match_num = acl_count_trie_types(counts, node_bld_trie[n].trie,
+			match_num, 1);
+		trie[n].smallest = counts->smallest_match;
+	}
+
+	indices->dfa_index = RTE_ACL_DFA_SIZE + 1;
+	indices->quad_index = indices->dfa_index +
+		counts->dfa * RTE_ACL_DFA_SIZE;
+	indices->single_index = indices->quad_index + counts->quad_vectors;
+	indices->match_index = indices->single_index + counts->single + 1;
+	indices->match_index = RTE_ALIGN(indices->match_index,
+		(XMM_SIZE / sizeof(uint64_t)));
+
+	return (match_num);
+}
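
The resulting runtime array is laid out as: one all-no-match DFA plus the idle
node, then the DFA nodes, the quad vectors, the single nodes, and finally the
XMM-aligned match results. A sketch of the offset arithmetic above (a DFA size
of 256 and a two-slot XMM alignment are assumed values):

#include <stdint.h>

#define DFA_SIZE	256	/* assumed value of RTE_ACL_DFA_SIZE */
#define ALIGN_UP(v, a)	(((v) + (a) - 1) / (a) * (a))

void
layout(uint32_t n_dfa, uint32_t n_quad_vec, uint32_t n_single,
	uint32_t idx[4])
{
	idx[0] = DFA_SIZE + 1;			/* first real DFA node */
	idx[1] = idx[0] + n_dfa * DFA_SIZE;	/* first quad vector   */
	idx[2] = idx[1] + n_quad_vec;		/* first single node   */
	idx[3] = ALIGN_UP(idx[2] + n_single + 1, 2);	/* match results */
}
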
+
+/*
+ * Generate the runtime structure from the build structure.
+ */
+int
+rte_acl_gen(struct rte_acl_ctx *ctx, struct rte_acl_trie *trie,
+	struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries,
+	uint32_t num_categories, uint32_t data_index_sz, int match_num)
+{
+	void *mem;
+	size_t total_size;
+	uint64_t *node_array, no_match;
+	uint32_t n, match_index;
+	struct rte_acl_match_results *match;
+	struct acl_node_counters counts;
+	struct rte_acl_indices indices;
+
+	/* Fill the counts and indices arrays from the nodes. */
+	match_num = acl_calc_counts_indicies(&counts, &indices, trie,
+		node_bld_trie, num_tries, match_num);
+
+	/* Allocate runtime memory (align to cache boundary) */
+	total_size = RTE_ALIGN(data_index_sz, CACHE_LINE_SIZE) +
+		indices.match_index * sizeof(uint64_t) +
+		(match_num + 2) * sizeof(struct rte_acl_match_results) +
+		XMM_SIZE;
+
+	if ((mem = rte_zmalloc_socket(ctx->name, total_size, CACHE_LINE_SIZE,
+			ctx->socket_id)) == NULL) {
+		RTE_LOG(ERR, ACL,
+			"allocation of %zu bytes on socket %d for %s failed\n",
+			total_size, ctx->socket_id, ctx->name);
+		return (-ENOMEM);
+	}
+
+	/* Fill the runtime structure */
+	match_index = indices.match_index;
+	node_array = (uint64_t *)((uintptr_t)mem +
+		RTE_ALIGN(data_index_sz, CACHE_LINE_SIZE));
+
+	/*
+	 * Set up the NOMATCH node (a SINGLE at the
+	 * highest index that points to itself).
+	 */
+
+	node_array[RTE_ACL_DFA_SIZE] = RTE_ACL_DFA_SIZE | RTE_ACL_NODE_SINGLE;
+	no_match = RTE_ACL_NODE_MATCH;
+
+	for (n = 0; n < RTE_ACL_DFA_SIZE; n++)
+		node_array[n] = no_match;
+
+	match = ((struct rte_acl_match_results *)(node_array + match_index));
+	memset(match, 0, sizeof(*match));
+
+	for (n = 0; n < num_tries; n++) {
+
+		acl_gen_node(node_bld_trie[n].trie, node_array, no_match,
+			&indices, num_categories);
+
+		if (node_bld_trie[n].trie->node_index == no_match)
+			trie[n].root_index = 0;
+		else
+			trie[n].root_index = node_bld_trie[n].trie->node_index;
+	}
+
+	ctx->mem = mem;
+	ctx->mem_sz = total_size;
+	ctx->data_indexes = mem;
+	ctx->num_tries = num_tries;
+	ctx->num_categories = num_categories;
+	ctx->match_index = match_index;
+	ctx->no_match = no_match;
+	ctx->idle = node_array[RTE_ACL_DFA_SIZE];
+	ctx->trans_table = node_array;
+	memcpy(ctx->trie, trie, sizeof(ctx->trie));
+
+	acl_gen_log_stats(ctx, &counts);
+	return (0);
+}
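
The self-referencing idle node is what lets finished streams keep executing the
unrolled loop body as a no-op: following its transition always lands back on
itself. A tiny sketch of that behaviour (node-type flag bits omitted for
brevity):

#include <stdint.h>
#include <assert.h>

int
main(void)
{
	enum { DFA_SIZE = 256 };	/* assumed RTE_ACL_DFA_SIZE */
	static uint64_t table[DFA_SIZE + 1];
	uint64_t t = DFA_SIZE;
	int i;

	table[DFA_SIZE] = DFA_SIZE;	/* idle node points to itself */
	for (i = 0; i < 4; i++)
		t = table[t];		/* "traversal" never moves */
	assert(t == DFA_SIZE);
	return 0;
}
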
diff --git a/lib/librte_acl/acl_run.c b/lib/librte_acl/acl_run.c
new file mode 100644
index 0000000..79e6e76
--- /dev/null
+++ b/lib/librte_acl/acl_run.c
@@ -0,0 +1,944 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include "acl_vect.h"
+#include "acl.h"
+
+#define MAX_SEARCHES_SSE8	8
+#define MAX_SEARCHES_SSE4	4
+#define MAX_SEARCHES_SSE2	2
+#define MAX_SEARCHES_SCALAR	2
+
+#define GET_NEXT_4BYTES(prm, idx)	\
+	(*((const int32_t *)((prm)[(idx)].data + *(prm)[idx].data_index++)))
+
+#define RTE_ACL_NODE_INDEX	((uint32_t)~RTE_ACL_NODE_TYPE)
+
+#define	SCALAR_QRANGE_MULT	0x01010101
+#define	SCALAR_QRANGE_MASK	0x7f7f7f7f
+#define	SCALAR_QRANGE_MIN	0x80808080
+
+enum {
+	SHUFFLE32_SLOT1 = 0xe5,
+	SHUFFLE32_SLOT2 = 0xe6,
+	SHUFFLE32_SLOT3 = 0xe7,
+	SHUFFLE32_SWAP64 = 0x4e,
+};
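
Each immediate packs four 2-bit source-lane selectors, low lane first: 0xe5 is
11 10 01 01, so slot 0 receives element 1; 0xe6 puts element 2 in slot 0; 0xe7
puts element 3 there; and 0x4e (01 00 11 10) swaps the two 64-bit halves. A
tiny SSE2 check of the swap:

#include <emmintrin.h>
#include <stdio.h>

int
main(void)
{
	__m128i v = _mm_setr_epi32(10, 11, 12, 13);
	__m128i s = _mm_shuffle_epi32(v, 0x4e);	/* SHUFFLE32_SWAP64 */
	int out[4];

	_mm_storeu_si128((__m128i *)out, s);
	printf("%d %d %d %d\n", out[0], out[1], out[2], out[3]);
	/* prints: 12 13 10 11 */
	return 0;
}
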
+
+/*
+ * Structure to manage N parallel trie traversals.
+ * The runtime trie traversal routines can process 8, 4, or 2 tries
+ * in parallel. Each packet may require multiple trie traversals (up to 4).
+ * This structure is used to fill the slots (0 to n-1) for parallel processing
+ * with the trie traversals needed for each packet.
+ */
+struct acl_flow_data {
+	uint32_t            num_packets;
+	/* number of packets processed */
+	uint32_t            started;
+	/* number of trie traversals in progress */
+	uint32_t            trie;
+	/* current trie index (0 to N-1) */
+	uint32_t            cmplt_size;
+	uint32_t            total_packets;
+	/* maximum number of packets to process */
+	uint32_t            categories;
+	/* number of result categories per packet. */
+	const uint64_t     *trans;
+	const uint8_t     **data;
+	uint32_t           *results;
+	struct completion  *last_cmplt;
+	struct completion  *cmplt_array;
+};
+
+/*
+ * Structure to maintain running results for
+ * a single packet (up to 4 tries).
+ */
+struct completion {
+	uint32_t *results;                          /* running results. */
+	int32_t   priority[RTE_ACL_MAX_CATEGORIES]; /* running priorities. */
+	uint32_t  count;                            /* num of remaining tries */
+	/* count > 0 also marks the struct as allocated */
+} __attribute__((aligned(XMM_SIZE)));
+
+/*
+ * One parms structure for each slot in the search engine.
+ */
+struct parms {
+	const uint8_t              *data;
+	/* input data for this packet */
+	const uint32_t             *data_index;
+	/* data indirection for this trie */
+	struct completion          *cmplt;
+	/* completion data for this packet */
+};
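
The hand-out order implied by this scheme is packet-major: all tries of packet
0, then all tries of packet 1, and so on, with slots refilled as individual
traversals finish. A sketch that ignores the engine slots and just enumerates
that order:

#include <stdio.h>

int
main(void)
{
	unsigned int num_tries = 3, total_packets = 4;
	unsigned int pkt = 0, trie = 0;

	while (pkt < total_packets) {
		printf("next traversal: packet %u, trie %u\n", pkt, trie);
		if (++trie == num_tries) {
			trie = 0;
			pkt++;
		}
	}
	return 0;
}
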
+
+/*
+ * Define a global idle node for unused engine slots.
+ */
+static const uint32_t idle[UINT8_MAX + 1];
+
+static const rte_xmm_t mm_type_quad_range = {
+	.u32 = {
+		RTE_ACL_NODE_QRANGE,
+		RTE_ACL_NODE_QRANGE,
+		RTE_ACL_NODE_QRANGE,
+		RTE_ACL_NODE_QRANGE,
+	},
+};
+
+static const rte_xmm_t mm_type_quad_range64 = {
+	.u32 = {
+		RTE_ACL_NODE_QRANGE,
+		RTE_ACL_NODE_QRANGE,
+		0,
+		0,
+	},
+};
+
+static const rte_xmm_t mm_shuffle_input = {
+	.u32 = {0x00000000, 0x04040404, 0x08080808, 0x0c0c0c0c},
+};
+
+static const rte_xmm_t mm_shuffle_input64 = {
+	.u32 = {0x00000000, 0x04040404, 0x80808080, 0x80808080},
+};
+
+static const rte_xmm_t mm_ones_16 = {
+	.u16 = {1, 1, 1, 1, 1, 1, 1, 1},
+};
+
+static const rte_xmm_t mm_bytes = {
+	.u32 = {UINT8_MAX, UINT8_MAX, UINT8_MAX, UINT8_MAX},
+};
+
+static const rte_xmm_t mm_bytes64 = {
+	.u32 = {UINT8_MAX, UINT8_MAX, 0, 0},
+};
+
+static const rte_xmm_t mm_match_mask = {
+	.u32 = {
+		RTE_ACL_NODE_MATCH,
+		RTE_ACL_NODE_MATCH,
+		RTE_ACL_NODE_MATCH,
+		RTE_ACL_NODE_MATCH,
+	},
+};
+
+static const rte_xmm_t mm_match_mask64 = {
+	.u32 = {
+		RTE_ACL_NODE_MATCH,
+		0,
+		RTE_ACL_NODE_MATCH,
+		0,
+	},
+};
+
+static const rte_xmm_t mm_index_mask = {
+	.u32 = {
+		RTE_ACL_NODE_INDEX,
+		RTE_ACL_NODE_INDEX,
+		RTE_ACL_NODE_INDEX,
+		RTE_ACL_NODE_INDEX,
+	},
+};
+
+static const rte_xmm_t mm_index_mask64 = {
+	.u32 = {
+		RTE_ACL_NODE_INDEX,
+		RTE_ACL_NODE_INDEX,
+		0,
+		0,
+	},
+};
+
+/*
+ * Allocate a completion structure to manage the tries for a packet.
+ */
+static inline struct completion *
+alloc_completion(struct completion *p, uint32_t size, uint32_t tries,
+	uint32_t *results)
+{
+	uint32_t n;
+
+	for (n = 0; n < size; n++) {
+
+		if (p[n].count == 0) {
+
+			/* mark as allocated and set number of tries. */
+			p[n].count = tries;
+			p[n].results = results;
+			return &(p[n]);
+		}
+	}
+
+	/* should never get here */
+	return (NULL);
+}
+
+/*
+ * Resolve priority for a single result trie.
+ */
+static inline void
+resolve_single_priority(uint64_t transition, int n,
+	const struct rte_acl_ctx *ctx, struct parms *parms,
+	const struct rte_acl_match_results *p)
+{
+	if (parms[n].cmplt->count == ctx->num_tries ||
+			parms[n].cmplt->priority[0] <=
+			p[transition].priority[0]) {
+
+		parms[n].cmplt->priority[0] = p[transition].priority[0];
+		parms[n].cmplt->results[0] = p[transition].results[0];
+	}
+
+	parms[n].cmplt->count--;
+}
+
+/*
+ * Resolve priority for multiple results. This consists of comparing
+ * the priority of the current traversal with the running set of
+ * results for the packet. For each result, keep a running array of
+ * the result (rule number) and its priority for each category.
+ */
+static inline void
+resolve_priority(uint64_t transition, int n, const struct rte_acl_ctx *ctx,
+	struct parms *parms, const struct rte_acl_match_results *p,
+	uint32_t categories)
+{
+	uint32_t x;
+	xmm_t results, priority, results1, priority1, selector;
+	xmm_t *saved_results, *saved_priority;
+
+	for (x = 0; x < categories; x += RTE_ACL_RESULTS_MULTIPLIER) {
+
+		saved_results = (xmm_t *)(&parms[n].cmplt->results[x]);
+		saved_priority =
+			(xmm_t *)(&parms[n].cmplt->priority[x]);
+
+		/* get results and priorities for completed trie */
+		results = MM_LOADU((const xmm_t *)&p[transition].results[x]);
+		priority = MM_LOADU((const xmm_t *)&p[transition].priority[x]);
+
+		/* if this is not the first completed trie */
+		if (parms[n].cmplt->count != ctx->num_tries) {
+
+			/* get running best results and their priorities */
+			results1 = MM_LOADU(saved_results);
+			priority1 = MM_LOADU(saved_priority);
+
+			/* select results that are highest priority */
+			selector = MM_CMPGT32(priority1, priority);
+			results = MM_BLENDV8(results, results1, selector);
+			priority = MM_BLENDV8(priority, priority1, selector);
+		}
+
+		/* save running best results and their priorities */
+		MM_STOREU(saved_results, results);
+		MM_STOREU(saved_priority, priority);
+	}
+
+	/* Count down completed tries for this search request */
+	parms[n].cmplt->count--;
+}
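
The compare-and-blend step above can be exercised in isolation. A standalone
SSE4.1 sketch (compile with -msse4.1): per 32-bit lane, the saved result wins
only where its priority is strictly greater, so ties go to the newly completed
trie, matching the scalar path:

#include <smmintrin.h>
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	__m128i new_res = _mm_set_epi32(40, 30, 20, 10);
	__m128i new_pri = _mm_set_epi32(4, 3, 2, 1);
	__m128i old_res = _mm_set_epi32(44, 33, 22, 11);
	__m128i old_pri = _mm_set_epi32(1, 9, 2, 9);
	uint32_t out[4];
	__m128i sel, res;

	sel = _mm_cmpgt_epi32(old_pri, new_pri);   /* saved strictly higher? */
	res = _mm_blendv_epi8(new_res, old_res, sel);
	_mm_storeu_si128((__m128i *)out, res);
	printf("%u %u %u %u\n", out[0], out[1], out[2], out[3]);
	/* prints: 11 20 33 40 */
	return 0;
}
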
+
+/*
+ * Routine to fill a slot in the parallel trie traversal array (parms) from
+ * the list of packets (flows).
+ */
+static inline uint64_t
+acl_start_next_trie(struct acl_flow_data *flows, struct parms *parms, int n,
+	const struct rte_acl_ctx *ctx)
+{
+	uint64_t transition;
+
+	/* if there are any more packets to process */
+	if (flows->num_packets < flows->total_packets) {
+		parms[n].data = flows->data[flows->num_packets];
+		parms[n].data_index = ctx->trie[flows->trie].data_index;
+
+		/* if this is the first trie for this packet */
+		if (flows->trie == 0) {
+			flows->last_cmplt = alloc_completion(flows->cmplt_array,
+				flows->cmplt_size, ctx->num_tries,
+				flows->results +
+				flows->num_packets * flows->categories);
+		}
+
+		/* set completion parameters and starting index for this slot */
+		parms[n].cmplt = flows->last_cmplt;
+		transition =
+			flows->trans[parms[n].data[*parms[n].data_index++] +
+			ctx->trie[flows->trie].root_index];
+
+		/*
+		 * if this is the last trie for this packet,
+		 * then set up the next packet.
+		 */
+		flows->trie++;
+		if (flows->trie >= ctx->num_tries) {
+			flows->trie = 0;
+			flows->num_packets++;
+		}
+
+		/* keep track of number of active trie traversals */
+		flows->started++;
+
+	/* no more tries to process, set slot to an idle position */
+	} else {
+		transition = ctx->idle;
+		parms[n].data = (const uint8_t *)idle;
+		parms[n].data_index = idle;
+	}
+	return transition;
+}
+
+/*
+ * Detect matches. If a match node transition is found, then this trie
+ * traversal is complete and the slot is refilled with the next trie
+ * to be processed.
+ */
+static inline uint64_t
+acl_match_check_transition(uint64_t transition, int slot,
+	const struct rte_acl_ctx *ctx, struct parms *parms,
+	struct acl_flow_data *flows)
+{
+	const struct rte_acl_match_results *p;
+
+	p = (const struct rte_acl_match_results *)
+		(flows->trans + ctx->match_index);
+
+	if (transition & RTE_ACL_NODE_MATCH) {
+
+		/* Remove flags from index and decrement active traversals */
+		transition &= RTE_ACL_NODE_INDEX;
+		flows->started--;
+
+		/* Resolve priorities for this trie and running results */
+		if (flows->categories == 1)
+			resolve_single_priority(transition, slot, ctx,
+				parms, p);
+		else
+			resolve_priority(transition, slot, ctx, parms, p,
+				flows->categories);
+
+		/* Fill the slot with the next trie or idle trie */
+		transition = acl_start_next_trie(flows, parms, slot, ctx);
+
+	} else if (transition == ctx->idle) {
+		/* reset indirection table for idle slots */
+		parms[slot].data_index = idle;
+	}
+
+	return transition;
+}
+
+/*
+ * Extract transitions from an XMM register and check for any matches
+ */
+static void
+acl_process_matches(xmm_t *indicies, int slot, const struct rte_acl_ctx *ctx,
+	struct parms *parms, struct acl_flow_data *flows)
+{
+	uint64_t transition1, transition2;
+
+	/* extract transition from low 64 bits. */
+	transition1 = MM_CVT64(*indicies);
+
+	/* extract transition from high 64 bits. */
+	*indicies = MM_SHUFFLE32(*indicies, SHUFFLE32_SWAP64);
+	transition2 = MM_CVT64(*indicies);
+
+	transition1 = acl_match_check_transition(transition1, slot, ctx,
+		parms, flows);
+	transition2 = acl_match_check_transition(transition2, slot + 1, ctx,
+		parms, flows);
+
+	/* update indicies with new transitions. */
+	*indicies = MM_SET64(transition2, transition1);
+}
+
+/*
+ * Check for a match in 2 transitions (contained in SSE register)
+ */
+static inline void
+acl_match_check_x2(int slot, const struct rte_acl_ctx *ctx, struct parms *parms,
+	struct acl_flow_data *flows, xmm_t *indicies, xmm_t match_mask)
+{
+	xmm_t temp;
+
+	temp = MM_AND(match_mask, *indicies);
+	while (!MM_TESTZ(temp, temp)) {
+		acl_process_matches(indicies, slot, ctx, parms, flows);
+		temp = MM_AND(match_mask, *indicies);
+	}
+}
+
+/*
+ * Check for any match in 4 transitions (contained in 2 SSE registers)
+ */
+static inline void
+acl_match_check_x4(int slot, const struct rte_acl_ctx *ctx, struct parms *parms,
+	struct acl_flow_data *flows, xmm_t *indicies1, xmm_t *indicies2,
+	xmm_t match_mask)
+{
+	xmm_t temp;
+
+	/* put low 32 bits of each transition into one register */
+	temp = (xmm_t)MM_SHUFFLEPS((__m128)*indicies1, (__m128)*indicies2,
+		0x88);
+	/* test for match node */
+	temp = MM_AND(match_mask, temp);
+
+	while (!MM_TESTZ(temp, temp)) {
+		acl_process_matches(indicies1, slot, ctx, parms, flows);
+		acl_process_matches(indicies2, slot + 2, ctx, parms, flows);
+
+		temp = (xmm_t)MM_SHUFFLEPS((__m128)*indicies1,
+					(__m128)*indicies2,
+					0x88);
+		temp = MM_AND(match_mask, temp);
+	}
+}
+
+/*
+ * Calculate the address of the next transition for
+ * all types of nodes. Note that only DFA nodes and range
+ * nodes actually transition to another node. Match
+ * nodes don't move.
+ */
+static inline xmm_t
+acl_calc_addr(xmm_t index_mask, xmm_t next_input, xmm_t shuffle_input,
+	xmm_t ones_16, xmm_t bytes, xmm_t type_quad_range,
+	xmm_t *indicies1, xmm_t *indicies2)
+{
+	xmm_t addr, node_types, temp;
+
+	/*
+	 * Note that no transition is done for a match
+	 * node and therefore a stream freezes when
+	 * it reaches a match.
+	 */
+
+	/* Shuffle the low 32 bits into temp and the high 32 bits into indicies2 */
+	temp = (xmm_t)MM_SHUFFLEPS((__m128)*indicies1, (__m128)*indicies2,
+		0x88);
+	*indicies2 = (xmm_t)MM_SHUFFLEPS((__m128)*indicies1,
+		(__m128)*indicies2, 0xdd);
+
+	/* Calc node type and node addr */
+	node_types = MM_ANDNOT(index_mask, temp);
+	addr = MM_AND(index_mask, temp);
+
+	/*
+	 * Calc addr for DFAs - addr = dfa_index + input_byte
+	 */
+
+	/* mask for DFA type (0) nodes */
+	temp = MM_CMPEQ32(node_types, MM_XOR(node_types, node_types));
+
+	/* add input byte to DFA position */
+	temp = MM_AND(temp, bytes);
+	temp = MM_AND(temp, next_input);
+	addr = MM_ADD32(addr, temp);
+
+	/*
+	 * Calc addr for Range nodes -> range_index + range(input)
+	 */
+	node_types = MM_CMPEQ32(node_types, type_quad_range);
+
+	/*
+	 * Calculate number of range boundaries that are less than the
+	 * input value. Range boundaries for each node are in signed 8 bit,
+	 * ordered from -128 to 127 in the indicies2 register.
+	 * This is effectively a popcnt of bytes that are greater than the
+	 * input byte.
+	 */
+
+	/* shuffle input byte to all 4 positions of 32 bit value */
+	temp = MM_SHUFFLE8(next_input, shuffle_input);
+
+	/* check ranges */
+	temp = MM_CMPGT8(temp, *indicies2);
+
+	/* convert -1 to 1 (bytes greater than the input byte) */
+	temp = MM_SIGN8(temp, temp);
+
+	/* horizontal add pairs of bytes into words */
+	temp = MM_MADD8(temp, temp);
+
+	/* horizontal add pairs of words into dwords */
+	temp = MM_MADD16(temp, ones_16);
+
+	/* mask to range type nodes */
+	temp = MM_AND(temp, node_types);
+
+	/* add index into node position */
+	return (MM_ADD32(addr, temp));
+}
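
The boundary count is effectively a branch-free "how many sorted boundaries lie
below the input byte". A standalone SSSE3 sketch of the same cmpgt/sign/madd
reduction for one lane (compile with -mssse3):

#include <tmmintrin.h>
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	/* lane 0 boundaries: {-100, -10, 5, 50}; input byte: 7 */
	__m128i bounds = _mm_setr_epi8(-100, -10, 5, 50,
		0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	__m128i in = _mm_set1_epi8(7);
	__m128i ones16 = _mm_set1_epi16(1);
	uint32_t out[4];
	__m128i t;

	t = _mm_cmpgt_epi8(in, bounds);	/* 0xff where boundary < input */
	t = _mm_sign_epi8(t, t);	/* -1 -> +1 */
	t = _mm_maddubs_epi16(t, t);	/* add byte pairs into words */
	t = _mm_madd_epi16(t, ones16);	/* add word pairs into dwords */
	_mm_storeu_si128((__m128i *)out, t);
	printf("boundaries below input: %u\n", out[0]);	/* 3 */
	return 0;
}
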
+
+/*
+ * Process 4 transitions (in 2 SIMD registers) in parallel
+ */
+static inline xmm_t
+transition4(xmm_t index_mask, xmm_t next_input, xmm_t shuffle_input,
+	xmm_t ones_16, xmm_t bytes, xmm_t type_quad_range,
+	const uint64_t *trans, xmm_t *indicies1, xmm_t *indicies2)
+{
+	xmm_t addr;
+	uint64_t trans0, trans2;
+
+	 /* Calculate the address (array index) for all 4 transitions. */
+
+	addr = acl_calc_addr(index_mask, next_input, shuffle_input, ones_16,
+		bytes, type_quad_range, indicies1, indicies2);
+
+	 /* Gather 64 bit transitions and pack back into 2 registers. */
+
+	trans0 = trans[MM_CVT32(addr)];
+
+	/* get slot 2 */
+
+	/* {x0, x1, x2, x3} -> {x2, x1, x2, x3} */
+	addr = MM_SHUFFLE32(addr, SHUFFLE32_SLOT2);
+	trans2 = trans[MM_CVT32(addr)];
+
+	/* get slot 1 */
+
+	/* {x2, x1, x2, x3} -> {x1, x1, x2, x3} */
+	addr = MM_SHUFFLE32(addr, SHUFFLE32_SLOT1);
+	*indicies1 = MM_SET64(trans[MM_CVT32(addr)], trans0);
+
+	/* get slot 3 */
+
+	/* {x1, x1, x2, x3} -> {x3, x1, x2, x3} */
+	addr = MM_SHUFFLE32(addr, SHUFFLE32_SLOT3);
+	*indicies2 = MM_SET64(trans[MM_CVT32(addr)], trans2);
+
+	return (MM_SRL32(next_input, 8));
+}
+
+static inline void
+acl_set_flow(struct acl_flow_data *flows, struct completion *cmplt,
+	uint32_t cmplt_size, const uint8_t **data, uint32_t *results,
+	uint32_t data_num, uint32_t categories, const uint64_t *trans)
+{
+	flows->num_packets = 0;
+	flows->started = 0;
+	flows->trie = 0;
+	flows->last_cmplt = NULL;
+	flows->cmplt_array = cmplt;
+	flows->total_packets = data_num;
+	flows->categories = categories;
+	flows->cmplt_size = cmplt_size;
+	flows->data = data;
+	flows->results = results;
+	flows->trans = trans;
+}
+
+/*
+ * Execute trie traversal with 8 traversals in parallel
+ */
+static inline void
+search_sse_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t total_packets, uint32_t categories)
+{
+	int n;
+	struct acl_flow_data flows;
+	uint64_t index_array[MAX_SEARCHES_SSE8];
+	struct completion cmplt[MAX_SEARCHES_SSE8];
+	struct parms parms[MAX_SEARCHES_SSE8];
+	xmm_t input0, input1;
+	xmm_t indicies1, indicies2, indicies3, indicies4;
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+		total_packets, categories, ctx->trans_table);
+
+	for (n = 0; n < MAX_SEARCHES_SSE8; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	/*
+	 * indicies1 contains index_array[0,1]
+	 * indicies2 contains index_array[2,3]
+	 * indicies3 contains index_array[4,5]
+	 * indicies4 contains index_array[6,7]
+	 */
+
+	indicies1 = MM_LOADU((xmm_t *) &index_array[0]);
+	indicies2 = MM_LOADU((xmm_t *) &index_array[2]);
+
+	indicies3 = MM_LOADU((xmm_t *) &index_array[4]);
+	indicies4 = MM_LOADU((xmm_t *) &index_array[6]);
+
+	 /* Check for any matches. */
+	acl_match_check_x4(0, ctx, parms, &flows,
+		&indicies1, &indicies2, mm_match_mask.m);
+	acl_match_check_x4(4, ctx, parms, &flows,
+		&indicies3, &indicies4, mm_match_mask.m);
+
+	while (flows.started > 0) {
+
+		/* Gather 4 bytes of input data for each stream. */
+		input0 = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 0),
+			0);
+		input1 = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 4),
+			0);
+
+		input0 = MM_INSERT32(input0, GET_NEXT_4BYTES(parms, 1), 1);
+		input1 = MM_INSERT32(input1, GET_NEXT_4BYTES(parms, 5), 1);
+
+		input0 = MM_INSERT32(input0, GET_NEXT_4BYTES(parms, 2), 2);
+		input1 = MM_INSERT32(input1, GET_NEXT_4BYTES(parms, 6), 2);
+
+		input0 = MM_INSERT32(input0, GET_NEXT_4BYTES(parms, 3), 3);
+		input1 = MM_INSERT32(input1, GET_NEXT_4BYTES(parms, 7), 3);
+
+		 /* Process the 4 bytes of input on each stream. */
+
+		input0 = transition4(mm_index_mask.m, input0,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input1 = transition4(mm_index_mask.m, input1,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies3, &indicies4);
+
+		input0 = transition4(mm_index_mask.m, input0,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input1 = transition4(mm_index_mask.m, input1,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies3, &indicies4);
+
+		input0 = transition4(mm_index_mask.m, input0,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input1 = transition4(mm_index_mask.m, input1,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies3, &indicies4);
+
+		input0 = transition4(mm_index_mask.m, input0,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input1 = transition4(mm_index_mask.m, input1,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies3, &indicies4);
+
+		 /* Check for any matches. */
+		acl_match_check_x4(0, ctx, parms, &flows,
+			&indicies1, &indicies2, mm_match_mask.m);
+		acl_match_check_x4(4, ctx, parms, &flows,
+			&indicies3, &indicies4, mm_match_mask.m);
+	}
+}
+
+/*
+ * Execute trie traversal with 4 traversals in parallel
+ */
+static inline void
+search_sse_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	 uint32_t *results, int total_packets, uint32_t categories)
+{
+	int n;
+	struct acl_flow_data flows;
+	uint64_t index_array[MAX_SEARCHES_SSE4];
+	struct completion cmplt[MAX_SEARCHES_SSE4];
+	struct parms parms[MAX_SEARCHES_SSE4];
+	xmm_t input, indicies1, indicies2;
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+		total_packets, categories, ctx->trans_table);
+
+	for (n = 0; n < MAX_SEARCHES_SSE4; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	indicies1 = MM_LOADU((xmm_t *) &index_array[0]);
+	indicies2 = MM_LOADU((xmm_t *) &index_array[2]);
+
+	/* Check for any matches. */
+	acl_match_check_x4(0, ctx, parms, &flows,
+		&indicies1, &indicies2, mm_match_mask.m);
+
+	while (flows.started > 0) {
+
+		/* Gather 4 bytes of input data for each stream. */
+		input = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 0), 0);
+		input = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 1), 1);
+		input = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 2), 2);
+		input = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 3), 3);
+
+		/* Process the 4 bytes of input on each stream. */
+		input = transition4(mm_index_mask.m, input,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input = transition4(mm_index_mask.m, input,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input = transition4(mm_index_mask.m, input,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input = transition4(mm_index_mask.m, input,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		/* Check for any matches. */
+		acl_match_check_x4(0, ctx, parms, &flows,
+			&indicies1, &indicies2, mm_match_mask.m);
+	}
+}
+
+static inline xmm_t
+transition2(xmm_t index_mask, xmm_t next_input, xmm_t shuffle_input,
+	xmm_t ones_16, xmm_t bytes, xmm_t type_quad_range,
+	const uint64_t *trans, xmm_t *indicies1)
+{
+	uint64_t t;
+	xmm_t addr, indicies2;
+
+	indicies2 = MM_XOR(ones_16, ones_16);
+
+	addr = acl_calc_addr(index_mask, next_input, shuffle_input, ones_16,
+		bytes, type_quad_range, indicies1, &indicies2);
+
+	/* Gather 64 bit transitions and pack 2 per register. */
+
+	t = trans[MM_CVT32(addr)];
+
+	/* get slot 1 */
+	addr = MM_SHUFFLE32(addr, SHUFFLE32_SLOT1);
+	*indicies1 = MM_SET64(trans[MM_CVT32(addr)], t);
+
+	return (MM_SRL32(next_input, 8));
+}
+
+/*
+ * Execute trie traversal with 2 traversals in parallel.
+ */
+static inline void
+search_sse_2(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t total_packets, uint32_t categories)
+{
+	int n;
+	struct acl_flow_data flows;
+	uint64_t index_array[MAX_SEARCHES_SSE2];
+	struct completion cmplt[MAX_SEARCHES_SSE2];
+	struct parms parms[MAX_SEARCHES_SSE2];
+	xmm_t input, indicies;
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+		total_packets, categories, ctx->trans_table);
+
+	for (n = 0; n < MAX_SEARCHES_SSE2; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	indicies = MM_LOADU((xmm_t *) &index_array[0]);
+
+	/* Check for any matches. */
+	acl_match_check_x2(0, ctx, parms, &flows, &indicies, mm_match_mask64.m);
+
+	while (flows.started > 0) {
+
+		/* Gather 4 bytes of input data for each stream. */
+		input = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 0), 0);
+		input = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 1), 1);
+
+		/* Process the 4 bytes of input on each stream. */
+
+		input = transition2(mm_index_mask64.m, input,
+			mm_shuffle_input64.m, mm_ones_16.m,
+			mm_bytes64.m, mm_type_quad_range64.m,
+			flows.trans, &indicies);
+
+		input = transition2(mm_index_mask64.m, input,
+			mm_shuffle_input64.m, mm_ones_16.m,
+			mm_bytes64.m, mm_type_quad_range64.m,
+			flows.trans, &indicies);
+
+		input = transition2(mm_index_mask64.m, input,
+			mm_shuffle_input64.m, mm_ones_16.m,
+			mm_bytes64.m, mm_type_quad_range64.m,
+			flows.trans, &indicies);
+
+		input = transition2(mm_index_mask64.m, input,
+			mm_shuffle_input64.m, mm_ones_16.m,
+			mm_bytes64.m, mm_type_quad_range64.m,
+			flows.trans, &indicies);
+
+		/* Check for any matches. */
+		acl_match_check_x2(0, ctx, parms, &flows, &indicies,
+			mm_match_mask64.m);
+	}
+}
+
+/*
+ * When processing the transition, rather than using an if/else
+ * construct, the offset is calculated for both DFA and QRANGE and
+ * then conditionally added to the address based on node type.
+ * This is done to avoid branch mis-predictions. Since the
+ * offset is a rather simple calculation, it is more efficient
+ * to compute it and use a conditional move rather than
+ * a conditional branch to determine which calculation to do.
+ */
+static inline uint32_t
+scan_forward(uint32_t input, uint32_t max)
+{
+	return ((input == 0) ? max : rte_bsf32(input));
+}
+
+static inline uint64_t
+scalar_transition(const uint64_t *trans_table, uint64_t transition,
+	uint8_t input)
+{
+	uint32_t addr, index, ranges, x, a, b, c;
+
+	/* break transition into component parts */
+	ranges = transition >> (sizeof(index) * CHAR_BIT);
+
+	/* calc address for a QRANGE node */
+	c = input * SCALAR_QRANGE_MULT;
+	a = ranges | SCALAR_QRANGE_MIN;
+	index = transition & ~RTE_ACL_NODE_INDEX;
+	a -= (c & SCALAR_QRANGE_MASK);
+	b = c & SCALAR_QRANGE_MIN;
+	addr = transition ^ index;
+	a &= SCALAR_QRANGE_MIN;
+	a ^= (ranges ^ b) & (a ^ b);
+	x = scan_forward(a, 32) >> 3;
+	addr += (index == RTE_ACL_NODE_DFA) ? input : x;
+
+	/* pick up the next transition */
+	transition = *(trans_table + addr);
+	return (transition);
+}
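
The final ternary above compiles to a conditional move on x86. For
illustration, an explicit branch-free equivalent of the same select:

#include <stdint.h>

uint32_t
select_offset(uint32_t node_type, uint32_t dfa_type,
	uint32_t input, uint32_t qrange_off)
{
	/* all-ones when the node is a DFA node, else all-zeros */
	uint32_t is_dfa = (uint32_t)-(node_type == dfa_type);

	return (input & is_dfa) | (qrange_off & ~is_dfa);
}
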
+
+int
+rte_acl_classify_scalar(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t num, uint32_t categories)
+{
+	int n;
+	uint64_t transition0, transition1;
+	uint32_t input0, input1;
+	struct acl_flow_data flows;
+	uint64_t index_array[MAX_SEARCHES_SCALAR];
+	struct completion cmplt[MAX_SEARCHES_SCALAR];
+	struct parms parms[MAX_SEARCHES_SCALAR];
+
+	if (categories != 1 &&
+		((RTE_ACL_RESULTS_MULTIPLIER - 1) & categories) != 0)
+		return (-EINVAL);
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results, num,
+		categories, ctx->trans_table);
+
+	for (n = 0; n < MAX_SEARCHES_SCALAR; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	transition0 = index_array[0];
+	transition1 = index_array[1];
+
+	while (flows.started > 0) {
+
+		input0 = GET_NEXT_4BYTES(parms, 0);
+		input1 = GET_NEXT_4BYTES(parms, 1);
+
+		for (n = 0; n < 4; n++) {
+			if (likely((transition0 & RTE_ACL_NODE_MATCH) == 0))
+				transition0 = scalar_transition(flows.trans,
+					transition0, (uint8_t)input0);
+
+			input0 >>= CHAR_BIT;
+
+			if (likely((transition1 & RTE_ACL_NODE_MATCH) == 0))
+				transition1 = scalar_transition(flows.trans,
+					transition1, (uint8_t)input1);
+
+			input1 >>= CHAR_BIT;
+
+		}
+		if ((transition0 | transition1) & RTE_ACL_NODE_MATCH) {
+			transition0 = acl_match_check_transition(transition0,
+				0, ctx, parms, &flows);
+			transition1 = acl_match_check_transition(transition1,
+				1, ctx, parms, &flows);
+
+		}
+	}
+	return (0);
+}
+
+int
+rte_acl_classify(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t num, uint32_t categories)
+{
+	if (categories != 1 &&
+		((RTE_ACL_RESULTS_MULTIPLIER - 1) & categories) != 0)
+		return (-EINVAL);
+
+	if (likely(num >= MAX_SEARCHES_SSE8))
+		search_sse_8(ctx, data, results, num, categories);
+	else if (num >= MAX_SEARCHES_SSE4)
+		search_sse_4(ctx, data, results, num, categories);
+	else
+		search_sse_2(ctx, data, results, num, categories);
+
+	return (0);
+}
diff --git a/lib/librte_acl/acl_vect.h b/lib/librte_acl/acl_vect.h
new file mode 100644
index 0000000..d813600
--- /dev/null
+++ b/lib/librte_acl/acl_vect.h
@@ -0,0 +1,132 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ACL_VECT_H_
+#define _RTE_ACL_VECT_H_
+
+/**
+ * @file
+ *
+ * RTE ACL SSE/AVX related header.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define	MM_ADD16(a, b)		_mm_add_epi16(a, b)
+#define	MM_ADD32(a, b)		_mm_add_epi32(a, b)
+#define	MM_ALIGNR8(a, b, c)	_mm_alignr_epi8(a, b, c)
+#define	MM_AND(a, b)		_mm_and_si128(a, b)
+#define MM_ANDNOT(a, b)		_mm_andnot_si128(a, b)
+#define MM_BLENDV8(a, b, c)	_mm_blendv_epi8(a, b, c)
+#define MM_CMPEQ16(a, b)	_mm_cmpeq_epi16(a, b)
+#define MM_CMPEQ32(a, b)	_mm_cmpeq_epi32(a, b)
+#define	MM_CMPEQ8(a, b)		_mm_cmpeq_epi8(a, b)
+#define MM_CMPGT32(a, b)	_mm_cmpgt_epi32(a, b)
+#define MM_CMPGT8(a, b)		_mm_cmpgt_epi8(a, b)
+#define MM_CVT(a)		_mm_cvtsi32_si128(a)
+#define	MM_CVT32(a)		_mm_cvtsi128_si32(a)
+#define MM_CVTU32(a)		_mm_cvtsi32_si128(a)
+#define	MM_INSERT16(a, c, b)	_mm_insert_epi16(a, c, b)
+#define	MM_INSERT32(a, c, b)	_mm_insert_epi32(a, c, b)
+#define	MM_LOAD(a)		_mm_load_si128(a)
+#define	MM_LOADH_PI(a, b)	_mm_loadh_pi(a, b)
+#define	MM_LOADU(a)		_mm_loadu_si128(a)
+#define	MM_MADD16(a, b)		_mm_madd_epi16(a, b)
+#define	MM_MADD8(a, b)		_mm_maddubs_epi16(a, b)
+#define	MM_MOVEMASK8(a)		_mm_movemask_epi8(a)
+#define MM_OR(a, b)		_mm_or_si128(a, b)
+#define	MM_SET1_16(a)		_mm_set1_epi16(a)
+#define	MM_SET1_32(a)		_mm_set1_epi32(a)
+#define	MM_SET1_64(a)		_mm_set1_epi64(a)
+#define	MM_SET1_8(a)		_mm_set1_epi8(a)
+#define	MM_SET32(a, b, c, d)	_mm_set_epi32(a, b, c, d)
+#define	MM_SHUFFLE32(a, b)	_mm_shuffle_epi32(a, b)
+#define	MM_SHUFFLE8(a, b)	_mm_shuffle_epi8(a, b)
+#define	MM_SHUFFLEPS(a, b, c)	_mm_shuffle_ps(a, b, c)
+#define	MM_SIGN8(a, b)		_mm_sign_epi8(a, b)
+#define	MM_SLL64(a, b)		_mm_sll_epi64(a, b)
+#define	MM_SRL128(a, b)		_mm_srli_si128(a, b)
+#define MM_SRL16(a, b)		_mm_srli_epi16(a, b)
+#define	MM_SRL32(a, b)		_mm_srli_epi32(a, b)
+#define	MM_STORE(a, b)		_mm_store_si128(a, b)
+#define	MM_STOREU(a, b)		_mm_storeu_si128(a, b)
+#define	MM_TESTZ(a, b)		_mm_testz_si128(a, b)
+#define	MM_XOR(a, b)		_mm_xor_si128(a, b)
+
+#define	MM_SET16(a, b, c, d, e, f, g, h)	\
+	_mm_set_epi16(a, b, c, d, e, f, g, h)
+
+#define	MM_SET8(c0, c1, c2, c3, c4, c5, c6, c7,	\
+		c8, c9, cA, cB, cC, cD, cE, cF)	\
+	_mm_set_epi8(c0, c1, c2, c3, c4, c5, c6, c7,	\
+		c8, c9, cA, cB, cC, cD, cE, cF)
+
+#ifdef RTE_ARCH_X86_64
+
+#define	MM_CVT64(a)		_mm_cvtsi128_si64(a)
+
+#else
+
+#define	MM_CVT64(a)	({ \
+	rte_xmm_t m;       \
+	m.m = (a);         \
+	(m.u64[0]);        \
+})
+
+#endif /*RTE_ARCH_X86_64 */
+
+/*
+ * Prior to version 12.1 icc doesn't support _mm_set_epi64x.
+ */
+#if (defined(__ICC) && __ICC < 1210)
+
+#define	MM_SET64(a, b)	({ \
+	rte_xmm_t m;       \
+	m.u64[0] = b;      \
+	m.u64[1] = a;      \
+	(m.m);             \
+})
+
+#else
+
+#define	MM_SET64(a, b)		_mm_set_epi64x(a, b)
+
+#endif /* (defined(__ICC) && __ICC < 1210) */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACL_VECT_H_ */
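
For clarity, the union fallback above is equivalent to the following standalone
helper; note that the second argument lands in the low 64 bits, matching
_mm_set_epi64x():

#include <emmintrin.h>
#include <stdint.h>

typedef union {
	__m128i  m;
	uint64_t u64[2];
} xmm_u;

__m128i
set64_fallback(uint64_t a, uint64_t b)
{
	xmm_u v;

	v.u64[0] = b;	/* low half  */
	v.u64[1] = a;	/* high half */
	return v.m;
}
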
diff --git a/lib/librte_acl/rte_acl.c b/lib/librte_acl/rte_acl.c
new file mode 100644
index 0000000..932f2c9
--- /dev/null
+++ b/lib/librte_acl/rte_acl.c
@@ -0,0 +1,413 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include "acl.h"
+
+#define	BIT_SIZEOF(x)	(sizeof(x) * CHAR_BIT)
+
+TAILQ_HEAD(rte_acl_list, rte_acl_ctx);
+
+struct rte_acl_ctx *
+rte_acl_find_existing(const char *name)
+{
+	struct rte_acl_ctx *ctx;
+	struct rte_acl_list *acl_list;
+
+	/* check that we have an initialised tail queue */
+	if ((acl_list = RTE_TAILQ_LOOKUP_BY_IDX(RTE_TAILQ_ACL,
+			rte_acl_list)) == NULL) {
+		rte_errno = E_RTE_NO_TAILQ;
+		return NULL;
+	}
+
+	rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK);
+	TAILQ_FOREACH(ctx, acl_list, next) {
+		if (strncmp(name, ctx->name, sizeof(ctx->name)) == 0)
+			break;
+	}
+	rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	if (ctx == NULL)
+		rte_errno = ENOENT;
+	return (ctx);
+}
+
+void
+rte_acl_free(struct rte_acl_ctx *ctx)
+{
+	if (ctx == NULL)
+		return;
+
+	RTE_EAL_TAILQ_REMOVE(RTE_TAILQ_ACL, rte_acl_list, ctx);
+
+	rte_free(ctx->mem);
+	rte_free(ctx);
+}
+
+struct rte_acl_ctx *
+rte_acl_create(const struct rte_acl_param *param)
+{
+	size_t sz;
+	struct rte_acl_ctx *ctx;
+	struct rte_acl_list *acl_list;
+	char name[sizeof(ctx->name)];
+
+	/* check that we have an initialised tail queue */
+	if ((acl_list = RTE_TAILQ_LOOKUP_BY_IDX(RTE_TAILQ_ACL,
+			rte_acl_list)) == NULL) {
+		rte_errno = E_RTE_NO_TAILQ;
+		return NULL;
+	}
+
+	/* check that input parameters are valid. */
+	if (param == NULL || param->name == NULL) {
+		rte_errno = EINVAL;
+		return (NULL);
+	}
+
+	rte_snprintf(name, sizeof(name), "ACL_%s", param->name);
+
+	/* calculate amount of memory required for pattern set. */
+	sz = sizeof(*ctx) + param->max_rule_num * param->rule_size;
+
+	/* get EAL TAILQ lock. */
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+
+	/* if we already have one with that name */
+	TAILQ_FOREACH(ctx, acl_list, next) {
+		if (strncmp(param->name, ctx->name, sizeof(ctx->name)) == 0)
+			break;
+	}
+
+	/* if an ACL with this name doesn't exist, create a new one. */
+	if (ctx == NULL && (ctx = rte_zmalloc_socket(name, sz, CACHE_LINE_SIZE,
+			param->socket_id)) != NULL) {
+
+		/* init the newly allocated context. */
+		ctx->rules = ctx + 1;
+		ctx->max_rules = param->max_rule_num;
+		ctx->rule_sz = param->rule_size;
+		ctx->socket_id = param->socket_id;
+		rte_snprintf(ctx->name, sizeof(ctx->name), "%s", param->name);
+
+		TAILQ_INSERT_TAIL(acl_list, ctx, next);
+
+	} else if (ctx == NULL) {
+		RTE_LOG(ERR, ACL,
+			"allocation of %zu bytes on socket %d for %s failed\n",
+			sz, param->socket_id, name);
+	}
+
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+	return (ctx);
+}
+
+static int
+acl_add_rules(struct rte_acl_ctx *ctx, const void *rules, uint32_t num)
+{
+	uint8_t *pos;
+
+	if (num + ctx->num_rules > ctx->max_rules)
+		return (-ENOMEM);
+
+	pos = ctx->rules;
+	pos += ctx->rule_sz * ctx->num_rules;
+	memcpy(pos, rules, num * ctx->rule_sz);
+	ctx->num_rules += num;
+
+	return (0);
+}
+
+static int
+acl_check_rule(const struct rte_acl_rule_data *rd)
+{
+	if ((rd->category_mask & LEN2MASK(RTE_ACL_MAX_CATEGORIES)) == 0 ||
+			rd->priority > RTE_ACL_MAX_PRIORITY ||
+			rd->priority < RTE_ACL_MIN_PRIORITY ||
+			rd->userdata == RTE_ACL_INVALID_USERDATA)
+		return (-EINVAL);
+	return (0);
+}
+
+int
+rte_acl_add_rules(struct rte_acl_ctx *ctx, const struct rte_acl_rule *rules,
+	uint32_t num)
+{
+	const struct rte_acl_rule *rv;
+	uint32_t i;
+	int32_t rc;
+
+	if (ctx == NULL || rules == NULL || 0 == ctx->rule_sz)
+		return (-EINVAL);
+
+	for (i = 0; i != num; i++) {
+		rv = (const struct rte_acl_rule *)
+			((uintptr_t)rules + i * ctx->rule_sz);
+		if ((rc = acl_check_rule(&rv->data)) != 0) {
+			RTE_LOG(ERR, ACL, "%s(%s): rule #%u is invalid\n",
+				__func__, ctx->name, i + 1);
+			return (rc);
+		}
+	}
+
+	return (acl_add_rules(ctx, rules, num));
+}
+
+/*
+ * Reset all rules.
+ * Note that RT structures are not affected.
+ */
+void
+rte_acl_reset_rules(struct rte_acl_ctx *ctx)
+{
+	if (ctx != NULL)
+		ctx->num_rules = 0;
+}
+
+/*
+ * Reset all rules and destroys RT structures.
+ */
+void
+rte_acl_reset(struct rte_acl_ctx *ctx)
+{
+	if (ctx != NULL) {
+		rte_acl_reset_rules(ctx);
+		rte_acl_build(ctx, &ctx->config);
+	}
+}
+
+/*
+ * Dump ACL context to the stdout.
+ */
+void
+rte_acl_dump(const struct rte_acl_ctx *ctx)
+{
+	if (!ctx)
+		return;
+	printf("acl context <%s>@%p\n", ctx->name, ctx);
+	printf("  max_rules=%"PRIu32"\n", ctx->max_rules);
+	printf("  rule_size=%"PRIu32"\n", ctx->rule_sz);
+	printf("  num_rules=%"PRIu32"\n", ctx->num_rules);
+	printf("  num_categories=%"PRIu32"\n", ctx->num_categories);
+	printf("  num_tries=%"PRIu32"\n", ctx->num_tries);
+}
+
+/*
+ * Dump all ACL contexts to the stdout.
+ */
+void
+rte_acl_list_dump(void)
+{
+	struct rte_acl_ctx *ctx;
+	struct rte_acl_list *acl_list;
+
+	/* check that we have an initialised tail queue */
+	if ((acl_list = RTE_TAILQ_LOOKUP_BY_IDX(RTE_TAILQ_ACL,
+			rte_acl_list)) == NULL) {
+		rte_errno = E_RTE_NO_TAILQ;
+		return;
+	}
+
+	rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK);
+	TAILQ_FOREACH(ctx, acl_list, next) {
+		rte_acl_dump(ctx);
+	}
+	rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK);
+}
+
+/*
+ * Support for legacy ipv4vlan rules.
+ */
+
+RTE_ACL_RULE_DEF(acl_ipv4vlan_rule, RTE_ACL_IPV4VLAN_NUM_FIELDS);
+
+static int
+acl_ipv4vlan_check_rule(const struct rte_acl_ipv4vlan_rule *rule)
+{
+	if (rule->src_port_low > rule->src_port_high ||
+			rule->dst_port_low > rule->dst_port_high ||
+			rule->src_mask_len > BIT_SIZEOF(rule->src_addr) ||
+			rule->dst_mask_len > BIT_SIZEOF(rule->dst_addr))
+		return (-EINVAL);
+
+	return (acl_check_rule(&rule->data));
+}
+
+static void
+acl_ipv4vlan_convert_rule(const struct rte_acl_ipv4vlan_rule *ri,
+	struct acl_ipv4vlan_rule *ro)
+{
+	ro->data = ri->data;
+
+	ro->field[RTE_ACL_IPV4VLAN_PROTO_FIELD].value.u8 = ri->proto;
+	ro->field[RTE_ACL_IPV4VLAN_VLAN1_FIELD].value.u16 = ri->vlan;
+	ro->field[RTE_ACL_IPV4VLAN_VLAN2_FIELD].value.u16 = ri->domain;
+	ro->field[RTE_ACL_IPV4VLAN_SRC_FIELD].value.u32 = ri->src_addr;
+	ro->field[RTE_ACL_IPV4VLAN_DST_FIELD].value.u32 = ri->dst_addr;
+	ro->field[RTE_ACL_IPV4VLAN_SRCP_FIELD].value.u16 = ri->src_port_low;
+	ro->field[RTE_ACL_IPV4VLAN_DSTP_FIELD].value.u16 = ri->dst_port_low;
+
+	ro->field[RTE_ACL_IPV4VLAN_PROTO_FIELD].mask_range.u8 = ri->proto_mask;
+	ro->field[RTE_ACL_IPV4VLAN_VLAN1_FIELD].mask_range.u16 = ri->vlan_mask;
+	ro->field[RTE_ACL_IPV4VLAN_VLAN2_FIELD].mask_range.u16 =
+		ri->domain_mask;
+	ro->field[RTE_ACL_IPV4VLAN_SRC_FIELD].mask_range.u32 =
+		ri->src_mask_len;
+	ro->field[RTE_ACL_IPV4VLAN_DST_FIELD].mask_range.u32 = ri->dst_mask_len;
+	ro->field[RTE_ACL_IPV4VLAN_SRCP_FIELD].mask_range.u16 =
+		ri->src_port_high;
+	ro->field[RTE_ACL_IPV4VLAN_DSTP_FIELD].mask_range.u16 =
+		ri->dst_port_high;
+}
+
+int
+rte_acl_ipv4vlan_add_rules(struct rte_acl_ctx *ctx,
+	const struct rte_acl_ipv4vlan_rule *rules,
+	uint32_t num)
+{
+	int32_t rc;
+	uint32_t i;
+	struct acl_ipv4vlan_rule rv;
+
+	if (ctx == NULL || rules == NULL || ctx->rule_sz != sizeof(rv))
+		return (-EINVAL);
+
+	/* check input rules. */
+	for (i = 0; i != num; i++) {
+		if ((rc = acl_ipv4vlan_check_rule(rules + i)) != 0) {
+			RTE_LOG(ERR, ACL, "%s(%s): rule #%u is invalid\n",
+				__func__, ctx->name, i + 1);
+			return (rc);
+		}
+	}
+
+	if (num + ctx->num_rules > ctx->max_rules)
+		return (-ENOMEM);
+
+	/* perform conversion to the internal format and add to the context. */
+	for (i = 0, rc = 0; i != num && rc == 0; i++) {
+		acl_ipv4vlan_convert_rule(rules + i, &rv);
+		rc = acl_add_rules(ctx, &rv, 1);
+	}
+
+	return (rc);
+}
+
+static void
+acl_ipv4vlan_config(struct rte_acl_config *cfg,
+	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM],
+	uint32_t num_categories)
+{
+	static const struct rte_acl_field_def
+		ipv4_defs[RTE_ACL_IPV4VLAN_NUM_FIELDS] = {
+		{
+			.type = RTE_ACL_FIELD_TYPE_BITMASK,
+			.size = sizeof(uint8_t),
+			.field_index = RTE_ACL_IPV4VLAN_PROTO_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_PROTO,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_BITMASK,
+			.size = sizeof(uint16_t),
+			.field_index = RTE_ACL_IPV4VLAN_VLAN1_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_VLAN,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_BITMASK,
+			.size = sizeof(uint16_t),
+			.field_index = RTE_ACL_IPV4VLAN_VLAN2_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_VLAN,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_MASK,
+			.size = sizeof(uint32_t),
+			.field_index = RTE_ACL_IPV4VLAN_SRC_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_SRC,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_MASK,
+			.size = sizeof(uint32_t),
+			.field_index = RTE_ACL_IPV4VLAN_DST_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_DST,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_RANGE,
+			.size = sizeof(uint16_t),
+			.field_index = RTE_ACL_IPV4VLAN_SRCP_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_RANGE,
+			.size = sizeof(uint16_t),
+			.field_index = RTE_ACL_IPV4VLAN_DSTP_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		},
+	};
+
+	memcpy(&cfg->defs, ipv4_defs, sizeof(ipv4_defs));
+	cfg->num_fields = RTE_DIM(ipv4_defs);
+
+	cfg->defs[RTE_ACL_IPV4VLAN_PROTO_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_PROTO];
+	cfg->defs[RTE_ACL_IPV4VLAN_VLAN1_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_VLAN];
+	cfg->defs[RTE_ACL_IPV4VLAN_VLAN2_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_VLAN] +
+		cfg->defs[RTE_ACL_IPV4VLAN_VLAN1_FIELD].size;
+	cfg->defs[RTE_ACL_IPV4VLAN_SRC_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_SRC];
+	cfg->defs[RTE_ACL_IPV4VLAN_DST_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_DST];
+	cfg->defs[RTE_ACL_IPV4VLAN_SRCP_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_PORTS];
+	cfg->defs[RTE_ACL_IPV4VLAN_DSTP_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_PORTS] +
+		cfg->defs[RTE_ACL_IPV4VLAN_SRCP_FIELD].size;
+
+	cfg->num_categories = num_categories;
+}
+
+int
+rte_acl_ipv4vlan_build(struct rte_acl_ctx *ctx,
+	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM],
+	uint32_t num_categories)
+{
+	struct rte_acl_config cfg;
+
+	if (ctx == NULL || layout == NULL)
+		return (-EINVAL);
+
+	acl_ipv4vlan_config(&cfg, layout, num_categories);
+	return (rte_acl_build(ctx, &cfg));
+}
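
For reference, a minimal lifecycle sketch built from the functions in
this file (illustrative only; MAX_RULES and the rules/layout arrays are
placeholders):

	struct rte_acl_param param = {
		.name = "example",
		.socket_id = SOCKET_ID_ANY,
		.rule_size = RTE_ACL_IPV4VLAN_RULE_SZ,
		.max_rule_num = MAX_RULES,
	};
	struct rte_acl_ctx *ctx = rte_acl_create(&param);

	if (ctx == NULL ||
			rte_acl_ipv4vlan_add_rules(ctx, rules, num) != 0 ||
			rte_acl_ipv4vlan_build(ctx, layout,
				RTE_ACL_MAX_CATEGORIES) != 0) {
		rte_acl_free(ctx);	/* rte_acl_free(NULL) is a no-op */
		return -1;
	}
	/* ... classify traffic ... */
	rte_acl_free(ctx);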
diff --git a/lib/librte_acl/rte_acl.h b/lib/librte_acl/rte_acl.h
new file mode 100644
index 0000000..afc0f69
--- /dev/null
+++ b/lib/librte_acl/rte_acl.h
@@ -0,0 +1,453 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ACL_H_
+#define _RTE_ACL_H_
+
+/**
+ * @file
+ *
+ * RTE Classifier.
+ */
+
+#include <rte_acl_osdep.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define	RTE_ACL_MAX_CATEGORIES	16
+
+#define	RTE_ACL_RESULTS_MULTIPLIER	(XMM_SIZE / sizeof(uint32_t))
+
+#define RTE_ACL_MAX_LEVELS 64
+#define RTE_ACL_MAX_FIELDS 64
+
+union rte_acl_field_types {
+	uint8_t  u8;
+	uint16_t u16;
+	uint32_t u32;
+	uint64_t u64;
+};
+
+enum {
+	RTE_ACL_FIELD_TYPE_MASK = 0,
+	RTE_ACL_FIELD_TYPE_RANGE,
+	RTE_ACL_FIELD_TYPE_BITMASK
+};
+
+/**
+ * ACL Field definition.
+ * Each field in the ACL rule has an associated definition.
+ * It defines the type of field, its size, its offset in the input buffer,
+ * the field index, and the input index.
+ * For performance reasons, the inner loop of the search function is unrolled
+ * to process four input bytes at a time. This requires the input to be grouped
+ * into sets of 4 consecutive bytes. The loop processes the first input byte as
+ * part of the setup; the remaining bytes must then arrive in groups of 4
+ * consecutive bytes.
+ */
+struct rte_acl_field_def {
+	uint8_t  type;        /**< type - RTE_ACL_FIELD_TYPE_*. */
+	uint8_t	 size;        /**< size of field 1,2,4, or 8. */
+	uint8_t	 field_index; /**< index of field inside the rule. */
+	uint8_t  input_index; /**< 0-N input index. */
+	uint32_t offset;      /**< offset to start of field. */
+};
+
+/**
+ * ACL build configuration.
+ * Defines the fields of an ACL trie and number of categories to build with.
+ */
+struct rte_acl_config {
+	uint32_t num_categories; /**< Number of categories to build with. */
+	uint32_t num_fields;     /**< Number of field definitions. */
+	struct rte_acl_field_def defs[RTE_ACL_MAX_FIELDS];
+	/**< array of field definitions. */
+};
+
+/**
+ * Defines the value of a field for a rule.
+ */
+struct rte_acl_field {
+	union rte_acl_field_types value;
+	/**< a 1,2,4, or 8 byte value of the field. */
+	union rte_acl_field_types mask_range;
+	/**<
+	 * depending on field type:
+	 * mask -> 1.2.3.4/32 value=0x1020304, mask_range=32,
+	 * range -> 0 : 65535 value=0, mask_range=65535,
+	 * bitmask -> 0x06/0xff value=6, mask_range=0xff.
+	 */
+};
+
+enum {
+	RTE_ACL_TYPE_SHIFT = 29,
+	RTE_ACL_MAX_INDEX = LEN2MASK(RTE_ACL_TYPE_SHIFT),
+	RTE_ACL_MAX_PRIORITY = RTE_ACL_MAX_INDEX,
+	RTE_ACL_MIN_PRIORITY = 0,
+};
+
+#define	RTE_ACL_INVALID_USERDATA	0
+
+/**
+ * Miscellaneous data for ACL rule.
+ */
+struct rte_acl_rule_data {
+	uint32_t category_mask; /**< Mask of categories for that rule. */
+	int32_t  priority;      /**< Priority for that rule. */
+	uint32_t userdata;      /**< Associated with the rule user data. */
+};
+
+/**
+ * Defines single ACL rule.
+ * data - miscellaneous data for the rule.
+ * field[] - value and mask or range for each field.
+ */
+#define	RTE_ACL_RULE_DEF(name, fld_num)	struct name {\
+	struct rte_acl_rule_data data;               \
+	struct rte_acl_field field[fld_num];         \
+}
+
+RTE_ACL_RULE_DEF(rte_acl_rule, 0);
+
+#define	RTE_ACL_RULE_SZ(fld_num)	\
+	(sizeof(struct rte_acl_rule) + sizeof(struct rte_acl_field) * (fld_num))
+
+
+/** Max number of characters in name.*/
+#define	RTE_ACL_NAMESIZE		32
+
+/**
+ * Parameters used when creating the ACL context.
+ */
+struct rte_acl_param {
+	const char *name;         /**< Name of the ACL context. */
+	int         socket_id;    /**< Socket ID to allocate memory for. */
+	uint32_t    rule_size;    /**< Size of each rule. */
+	uint32_t    max_rule_num; /**< Maximum number of rules. */
+};
+
+
+/**
+ * Create a new ACL context.
+ *
+ * @param param
+ *   Parameters used to create and initialise the ACL context.
+ * @return
+ *   Pointer to ACL context structure that is used in future ACL
+ *   operations, or NULL on error, with error code set in rte_errno.
+ *   Possible rte_errno errors include:
+ *   - E_RTE_NO_TAILQ - no tailq list could be obtained for the ACL
+ *     context list
+ *   - EINVAL - invalid parameter passed to function
+ */
+struct rte_acl_ctx *
+rte_acl_create(const struct rte_acl_param *param);
+
+/**
+ * Find an existing ACL context object and return a pointer to it.
+ *
+ * @param name
+ *   Name of the ACL context as passed to rte_acl_create()
+ * @return
+ *   Pointer to ACL context or NULL if object not found
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - ENOENT - no ACL context with the given name was found
+ */
+struct rte_acl_ctx *
+rte_acl_find_existing(const char *name);
+
+/**
+ * De-allocate all memory used by ACL context.
+ *
+ * @param ctx
+ *   ACL context to free
+ */
+void
+rte_acl_free(struct rte_acl_ctx *ctx);
+
+/**
+ * Add rules to an existing ACL context.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to add patterns to.
+ * @param rules
+ *   Array of rules to add to the ACL context.
+ *   Note that all fields in rte_acl_rule structures are expected
+ *   to be in host byte order.
+ *   Each rule is expected to be in the same format and must not exceed
+ *   the size specified at ACL context creation time.
+ * @param num
+ *   Number of elements in the input array of rules.
+ * @return
+ *   - -ENOMEM if there is no space in the ACL context for these rules.
+ *   - -EINVAL if the parameters are invalid.
+ *   - Zero if operation completed successfully.
+ */
+int
+rte_acl_add_rules(struct rte_acl_ctx *ctx, const struct rte_acl_rule *rules,
+	uint32_t num);
+
+/**
+ * Delete all rules from the ACL context.
+ * This function is not multi-thread safe.
+ * Note that internal run-time structures are not affected.
+ *
+ * @param ctx
+ *   ACL context to delete rules from.
+ */
+void
+rte_acl_reset_rules(struct rte_acl_ctx *ctx);
+
+/**
+ * Analyze set of rules and build required internal run-time structures.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to build.
+ * @param cfg
+ *   Pointer to struct rte_acl_config - defines build parameters.
+ * @return
+ *   - -ENOMEM if couldn't allocate enough memory.
+ *   - -EINVAL if the parameters are invalid.
+ *   - Negative error code if operation failed.
+ *   - Zero if operation completed successfully.
+ */
+int
+rte_acl_build(struct rte_acl_ctx *ctx, const struct rte_acl_config *cfg);
+
+/**
+ * Delete all rules from the ACL context and
+ * destroy all internal run-time structures.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to reset.
+ */
+void
+rte_acl_reset(struct rte_acl_ctx *ctx);
+
+/**
+ * Search for a matching ACL rule for each input data buffer.
+ * Each input data buffer can have up to *categories* matches.
+ * That implies that results array should be big enough to hold
+ * (categories * num) elements.
+ * Also the categories parameter should be either one or a multiple of
+ * RTE_ACL_RESULTS_MULTIPLIER and can't be bigger than RTE_ACL_MAX_CATEGORIES.
+ * If more than one rule is applicable for a given input buffer and a given
+ * category, then the rule with the highest priority is returned as the match.
+ * Note that it is the caller's responsibility to ensure that the input
+ * parameters are valid and point to correct memory locations.
+ *
+ * @param ctx
+ *   ACL context to search with.
+ * @param data
+ *   Array of pointers to input data buffers to perform search.
+ *   Note that all fields in the input data buffers are expected to be in
+ *   network byte order (MSB first).
+ * @param results
+ *   Array of search results, *categories* results per each input data buffer.
+ * @param num
+ *   Number of elements in the input data buffers array.
+ * @param categories
+ *   Number of maximum possible matches for each input buffer, one possible
+ *   match per category.
+ * @return
+ *   zero on successful completion.
+ *   -EINVAL for incorrect arguments.
+ */
+int
+rte_acl_classify(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t num, uint32_t categories);
+
+/**
+ * Perform scalar search for a matching ACL rule for each input data buffer.
+ * Note that while the search itself avoids explicit use of SSE/AVX
+ * intrinsics, the code comparing match results/priorities still might use
+ * vector intrinsics (for categories > 1).
+ * Each input data buffer can have up to *categories* matches.
+ * That implies that results array should be big enough to hold
+ * (categories * num) elements.
+ * Also the categories parameter should be either one or a multiple of
+ * RTE_ACL_RESULTS_MULTIPLIER and can't be bigger than RTE_ACL_MAX_CATEGORIES.
+ * If more than one rule is applicable for a given input buffer and a given
+ * category, then the rule with the highest priority is returned as the match.
+ * Note that it is the caller's responsibility to ensure that the input
+ * parameters are valid and point to correct memory locations.
+ *
+ * @param ctx
+ *   ACL context to search with.
+ * @param data
+ *   Array of pointers to input data buffers to perform search.
+ *   Note that all fields in the input data buffers are expected to be in
+ *   network byte order (MSB first).
+ * @param results
+ *   Array of search results, *categories* results per each input data buffer.
+ * @param num
+ *   Number of elements in the input data buffers array.
+ * @param categories
+ *   Number of maximum possible matches for each input buffer, one possible
+ *   match per category.
+ * @return
+ *   zero on successful completion.
+ *   -EINVAL for incorrect arguments.
+ */
+int
+rte_acl_classify_scalar(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t num, uint32_t categories);
+
+/**
+ * Dump an ACL context structure to the console.
+ *
+ * @param ctx
+ *   ACL context to dump.
+ */
+void
+rte_acl_dump(const struct rte_acl_ctx *ctx);
+
+/**
+ * Dump all ACL context structures to the console.
+ */
+void
+rte_acl_list_dump(void);
+
+/**
+ * Legacy support for 7-tuple IPv4 and VLAN rule.
+ * This structure and the corresponding API are deprecated.
+ */
+struct rte_acl_ipv4vlan_rule {
+	struct rte_acl_rule_data data; /**< Miscellaneous data for the rule. */
+	uint8_t proto;                 /**< IPv4 protocol ID. */
+	uint8_t proto_mask;            /**< IPv4 protocol ID mask. */
+	uint16_t vlan;                 /**< VLAN ID. */
+	uint16_t vlan_mask;            /**< VLAN ID mask. */
+	uint16_t domain;               /**< VLAN domain. */
+	uint16_t domain_mask;          /**< VLAN domain mask. */
+	uint32_t src_addr;             /**< IPv4 source address. */
+	uint32_t src_mask_len;         /**< IPv4 source address mask length. */
+	uint32_t dst_addr;             /**< IPv4 destination address. */
+	uint32_t dst_mask_len;         /**< IPv4 destination address mask length. */
+	uint16_t src_port_low;         /**< L4 source port low. */
+	uint16_t src_port_high;        /**< L4 source port high. */
+	uint16_t dst_port_low;         /**< L4 destination port low. */
+	uint16_t dst_port_high;        /**< L4 destination port high. */
+};
+
+/**
+ * Specifies fields layout inside rte_acl_rule for rte_acl_ipv4vlan_rule.
+ */
+enum {
+	RTE_ACL_IPV4VLAN_PROTO_FIELD,
+	RTE_ACL_IPV4VLAN_VLAN1_FIELD,
+	RTE_ACL_IPV4VLAN_VLAN2_FIELD,
+	RTE_ACL_IPV4VLAN_SRC_FIELD,
+	RTE_ACL_IPV4VLAN_DST_FIELD,
+	RTE_ACL_IPV4VLAN_SRCP_FIELD,
+	RTE_ACL_IPV4VLAN_DSTP_FIELD,
+	RTE_ACL_IPV4VLAN_NUM_FIELDS
+};
+
+/**
+ * Macro to define rule size for rte_acl_ipv4vlan_rule.
+ */
+#define	RTE_ACL_IPV4VLAN_RULE_SZ	\
+	RTE_ACL_RULE_SZ(RTE_ACL_IPV4VLAN_NUM_FIELDS)
+
+/*
+ * That effectively defines order of IPV4VLAN classifications:
+ *  - PROTO
+ *  - VLAN (TAG and DOMAIN)
+ *  - SRC IP ADDRESS
+ *  - DST IP ADDRESS
+ *  - PORTS (SRC and DST)
+ */
+enum {
+	RTE_ACL_IPV4VLAN_PROTO,
+	RTE_ACL_IPV4VLAN_VLAN,
+	RTE_ACL_IPV4VLAN_SRC,
+	RTE_ACL_IPV4VLAN_DST,
+	RTE_ACL_IPV4VLAN_PORTS,
+	RTE_ACL_IPV4VLAN_NUM
+};
+
+/**
+ * Add ipv4vlan rules to an existing ACL context.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to add patterns to.
+ * @param rules
+ *   Array of rules to add to the ACL context.
+ *   Note that all fields in rte_acl_ipv4vlan_rule structures are expected
+ *   to be in host byte order.
+ * @param num
+ *   Number of elements in the input array of rules.
+ * @return
+ *   - -ENOMEM if there is no space in the ACL context for these rules.
+ *   - -EINVAL if the parameters are invalid.
+ *   - Zero if operation completed successfully.
+ */
+int
+rte_acl_ipv4vlan_add_rules(struct rte_acl_ctx *ctx,
+	const struct rte_acl_ipv4vlan_rule *rules,
+	uint32_t num);
+
+/**
+ * Analyze set of ipv4vlan rules and build required internal
+ * run-time structures.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to build.
+ * @param layout
+ *   Layout of input data to search through.
+ * @param num_categories
+ *   Maximum number of categories to use in that build.
+ * @return
+ *   - -ENOMEM if couldn't allocate enough memory.
+ *   - -EINVAL if the parameters are invalid.
+ *   - Negative error code if operation failed.
+ *   - Zero if operation completed successfully.
+ */
+int
+rte_acl_ipv4vlan_build(struct rte_acl_ctx *ctx,
+	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM],
+	uint32_t num_categories);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACL_H_ */
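
For reference, a sketch of a classify call as the comments above
describe it (illustrative only; ctx is assumed to be a successfully
built context and data[] filled with pointers to packet field buffers):

	enum { NUM_BUFS = 4, NUM_CATS = RTE_ACL_RESULTS_MULTIPLIER };

	const uint8_t *data[NUM_BUFS];
	uint32_t results[NUM_BUFS * NUM_CATS];	/* categories * num slots */

	if (rte_acl_classify(ctx, data, results, NUM_BUFS, NUM_CATS) == 0) {
		/* results[i * NUM_CATS + c] holds the userdata of the
		 * highest priority match for buffer i in category c;
		 * RTE_ACL_INVALID_USERDATA means no match. */
	}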
diff --git a/lib/librte_acl/rte_acl_osdep.h b/lib/librte_acl/rte_acl_osdep.h
new file mode 100644
index 0000000..046b22d
--- /dev/null
+++ b/lib/librte_acl/rte_acl_osdep.h
@@ -0,0 +1,92 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ACL_OSDEP_H_
+#define _RTE_ACL_OSDEP_H_
+
+/**
+ * @file
+ *
+ * RTE ACL DPDK/OS dependent file.
+ */
+
+#include <stdint.h>
+#include <stddef.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <ctype.h>
+#include <string.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdarg.h>
+#include <stdlib.h>
+#include <sys/queue.h>
+
+/*
+ * Common defines.
+ */
+
+#define	LEN2MASK(ln)	((uint32_t)(((uint64_t)1 << (ln)) - 1))
+
+#define DIM(x) RTE_DIM(x)
+
+/*
+ * To build ACL standalone.
+ */
+#ifdef RTE_LIBRTE_ACL_STANDALONE
+#include <rte_acl_osdep_alone.h>
+#else
+
+#include <rte_common.h>
+#include <rte_common_vect.h>
+#include <rte_memory.h>
+#include <rte_log.h>
+#include <rte_memcpy.h>
+#include <rte_prefetch.h>
+#include <rte_byteorder.h>
+#include <rte_branch_prediction.h>
+#include <rte_memzone.h>
+#include <rte_malloc.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_eal_memconfig.h>
+#include <rte_per_lcore.h>
+#include <rte_errno.h>
+#include <rte_string_fns.h>
+#include <rte_cpuflags.h>
+#include <rte_debug.h>
+
+#endif /* RTE_LIBRTE_ACL_STANDALONE */
+
+#endif /* _RTE_ACL_OSDEP_H_ */
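
For reference, what the LEN2MASK() helper above evaluates to; the
uint64_t intermediate keeps the 32-bit case well defined:

	uint32_t m3  = LEN2MASK(3);	/* == 0x00000007 */
	uint32_t m32 = LEN2MASK(32);	/* == 0xffffffff, without an
					 * undefined 32-bit shift */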
diff --git a/lib/librte_acl/rte_acl_osdep_alone.h b/lib/librte_acl/rte_acl_osdep_alone.h
new file mode 100644
index 0000000..cde7240
--- /dev/null
+++ b/lib/librte_acl/rte_acl_osdep_alone.h
@@ -0,0 +1,277 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ACL_OSDEP_ALONE_H_
+#define _RTE_ACL_OSDEP_ALONE_H_
+
+/**
+ * @file
+ *
+ * RTE ACL OS dependent file.
+ * An example of how to build/use the ACL library standalone
+ * (without the rest of DPDK).
+ * Don't include this file on its own; use <rte_acl_osdep.h> instead.
+ */
+
+#if (defined(__ICC) || (__GNUC__ == 4 &&  __GNUC_MINOR__ < 4))
+
+#ifdef __SSE__
+#include <xmmintrin.h>
+#endif
+
+#ifdef __SSE2__
+#include <emmintrin.h>
+#endif
+
+#if defined(__SSE4_2__) || defined(__SSE4_1__)
+#include <smmintrin.h>
+#endif
+
+#else
+
+#include <x86intrin.h>
+
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define	DUMMY_MACRO	do {} while (0)
+
+/*
+ * rte_common related.
+ */
+#define	__rte_unused	__attribute__((__unused__))
+
+#define RTE_PTR_ADD(ptr, x)	((typeof(ptr))((uintptr_t)(ptr) + (x)))
+
+#define	RTE_PTR_ALIGN_FLOOR(ptr, align) \
+	(typeof(ptr))((uintptr_t)(ptr) & ~((uintptr_t)(align) - 1))
+
+#define	RTE_PTR_ALIGN_CEIL(ptr, align) \
+	RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(ptr, (align) - 1), align)
+
+#define	RTE_PTR_ALIGN(ptr, align)	RTE_PTR_ALIGN_CEIL(ptr, align)
+
+#define	RTE_ALIGN_FLOOR(val, align) \
+	(typeof(val))((val) & (~((typeof(val))((align) - 1))))
+
+#define	RTE_ALIGN_CEIL(val, align) \
+	RTE_ALIGN_FLOOR(((val) + ((typeof(val))(align) - 1)), align)
+
+#define	RTE_ALIGN(ptr, align)	RTE_ALIGN_CEIL(ptr, align)
+
+#define	RTE_MIN(a, b)	({ \
+		typeof(a) _a = (a); \
+		typeof(b) _b = (b); \
+		_a < _b ? _a : _b;   \
+	})
+
+#define	RTE_DIM(a)		(sizeof(a) / sizeof((a)[0]))
+
+/**
+ * Searches the input parameter for the least significant set bit
+ * (starting from zero).
+ * If a least significant 1 bit is found, its bit index is returned.
+ * If the content of the input parameter is zero, then the content of the return
+ * value is undefined.
+ * @param v
+ *     input parameter, should not be zero.
+ * @return
+ *     least significant set bit in the input parameter.
+ */
+static inline uint32_t
+rte_bsf32(uint32_t v)
+{
+	asm("bsf %1,%0"
+		: "=r" (v)
+		: "rm" (v));
+	return (v);
+}
+
+/*
+ * rte_common_vect related.
+ */
+typedef __m128i xmm_t;
+
+#define	XMM_SIZE	(sizeof(xmm_t))
+#define	XMM_MASK	(XMM_SIZE - 1)
+
+typedef union rte_mmsse {
+	xmm_t    m;
+	uint8_t  u8[XMM_SIZE / sizeof(uint8_t)];
+	uint16_t u16[XMM_SIZE / sizeof(uint16_t)];
+	uint32_t u32[XMM_SIZE / sizeof(uint32_t)];
+	uint64_t u64[XMM_SIZE / sizeof(uint64_t)];
+	double   pd[XMM_SIZE / sizeof(double)];
+} rte_xmm_t;
+
+/*
+ * rte_cycles related.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	union {
+		uint64_t tsc_64;
+		struct {
+			uint32_t lo_32;
+			uint32_t hi_32;
+		};
+	} tsc;
+
+	asm volatile("rdtsc" :
+		"=a" (tsc.lo_32),
+		"=d" (tsc.hi_32));
+	return tsc.tsc_64;
+}
+
+/*
+ * rte_lcore related.
+ */
+#define rte_lcore_id()	(0)
+
+/*
+ * rte_errno related.
+ */
+#define	rte_errno	errno
+#define	E_RTE_NO_TAILQ	(-1)
+
+/*
+ * rte_rwlock related.
+ */
+#define	rte_rwlock_read_lock(x)		DUMMY_MACRO
+#define	rte_rwlock_read_unlock(x)	DUMMY_MACRO
+#define	rte_rwlock_write_lock(x)	DUMMY_MACRO
+#define	rte_rwlock_write_unlock(x)	DUMMY_MACRO
+
+/*
+ * rte_memory related.
+ */
+#define	SOCKET_ID_ANY	-1                  /**< Any NUMA socket. */
+#define	CACHE_LINE_SIZE	64                  /**< Cache line size. */
+#define	CACHE_LINE_MASK	(CACHE_LINE_SIZE-1) /**< Cache line mask. */
+
+/**
+ * Force alignment to cache line.
+ */
+#define	__rte_cache_aligned	__attribute__((__aligned__(CACHE_LINE_SIZE)))
+
+
+/*
+ * rte_byteorder related.
+ */
+#define	rte_le_to_cpu_16(x)	(x)
+#define	rte_le_to_cpu_32(x)	(x)
+
+#define rte_cpu_to_be_16(x)	\
+	(((x) & UINT8_MAX) << CHAR_BIT | ((x) >> CHAR_BIT & UINT8_MAX))
+#define rte_cpu_to_be_32(x)	__builtin_bswap32(x)
+
+/*
+ * rte_branch_prediction related.
+ */
+#ifndef	likely
+#define	likely(x)	__builtin_expect((x), 1)
+#endif	/* likely */
+
+#ifndef	unlikely
+#define	unlikely(x)	__builtin_expect((x), 0)
+#endif	/* unlikely */
+
+
+/*
+ * rte_tailq related.
+ */
+static inline void *
+rte_dummy_tailq(void)
+{
+	static __thread TAILQ_HEAD(rte_dummy_head, rte_dummy) dummy_head;
+	TAILQ_INIT(&dummy_head);
+	return (&dummy_head);
+}
+
+#define	RTE_TAILQ_LOOKUP_BY_IDX(idx, struct_name)	rte_dummy_tailq()
+
+#define RTE_EAL_TAILQ_REMOVE(idx, type, elm)	DUMMY_MACRO
+
+/*
+ * rte_string related
+ */
+#define	rte_snprintf(str, len, frmt, args...)	snprintf(str, len, frmt, ##args)
+
+/*
+ * rte_log related
+ */
+#define RTE_LOG(l, t, fmt, args...)	printf(fmt, ##args)
+
+/*
+ * rte_malloc related
+ */
+#define	rte_free(x)	free(x)
+
+static inline void *
+rte_zmalloc_socket(__rte_unused const char *type, size_t size, unsigned align,
+	__rte_unused int socket)
+{
+	void *ptr;
+	int rc;
+
+	if ((rc = posix_memalign(&ptr, align, size)) != 0) {
+		rte_errno = rc;
+		return (NULL);
+	}
+
+	memset(ptr, 0, size);
+	return (ptr);
+}
+
+/*
+ * rte_debug related
+ */
+#define	rte_panic(fmt, args...)	do {         \
+	RTE_LOG(CRIT, EAL, fmt, ##args);     \
+	abort();                             \
+} while (0)
+
+#define	rte_exit(err, fmt, args...)	do { \
+	RTE_LOG(CRIT, EAL, fmt, ##args);     \
+	exit(err);                           \
+} while (0)
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACL_OSDEP_ALONE_H_ */
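
For reference, worked examples for two of the shims above (illustrative
only):

	/* rte_bsf32(): index of the least significant set bit; for
	 * non-zero input this matches GCC's __builtin_ctz(). */
	uint32_t i = rte_bsf32(0x58);		/* == 3, 0x58 = 0b01011000 */

	/* rte_cpu_to_be_16(): a plain byte swap on little-endian hosts. */
	uint16_t b = rte_cpu_to_be_16(0x1234);	/* == 0x3412 */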
diff --git a/lib/librte_acl/tb_mem.c b/lib/librte_acl/tb_mem.c
new file mode 100644
index 0000000..817d0c8
--- /dev/null
+++ b/lib/librte_acl/tb_mem.c
@@ -0,0 +1,102 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "tb_mem.h"
+
+/*
+ *  Memory management routines for temporary memory.
+ *  This memory is used only during the build phase and is released
+ *  once the build is finished.
+ */
+
+static struct tb_mem_block *
+tb_pool(struct tb_mem_pool *pool, size_t sz)
+{
+	struct tb_mem_block *block;
+	uint8_t *ptr;
+	size_t size;
+
+	size = sz + pool->alignment - 1;
+	if ((block = calloc(1, size + sizeof(*pool->block))) == NULL) {
+		RTE_LOG(ERR, MALLOC, "%s(%zu) failed, currently allocated "
+			"by pool: %zu bytes\n", __func__, sz, pool->alloc);
+		return (NULL);
+	}
+
+	block->pool = pool;
+
+	block->next = pool->block;
+	pool->block = block;
+
+	pool->alloc += size;
+
+	ptr = (uint8_t *)(block + 1);
+	block->mem = RTE_PTR_ALIGN_CEIL(ptr, pool->alignment);
+	block->size = size - (block->mem - ptr);
+
+	return (block);
+}
+
+void *
+tb_alloc(struct tb_mem_pool *pool, size_t size)
+{
+	struct tb_mem_block *block;
+	void *ptr;
+	size_t new_sz;
+
+	size = RTE_ALIGN_CEIL(size, pool->alignment);
+
+	block = pool->block;
+	if (block == NULL || block->size < size) {
+		new_sz = (size > pool->min_alloc) ? size : pool->min_alloc;
+		if ((block = tb_pool(pool, new_sz)) == NULL)
+			return (NULL);
+	}
+	ptr = block->mem;
+	block->size -= size;
+	block->mem += size;
+	return (ptr);
+}
+
+void
+tb_free_pool(struct tb_mem_pool *pool)
+{
+	struct tb_mem_block *next, *block;
+
+	for (block = pool->block; block != NULL; block = next) {
+		next = block->next;
+		free(block);
+	}
+	pool->block = NULL;
+	pool->alloc = 0;
+}
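
For reference, a usage sketch for the pool above (illustrative only;
the alignment and min_alloc values are placeholders):

	struct tb_mem_pool pool = {
		.block = NULL,
		.alignment = 32,	/* tb_alloc() sizes are rounded
					 * up to this */
		.min_alloc = 64 * 1024,	/* grow the pool in blocks of
					 * at least this size */
		.alloc = 0,
	};

	uint32_t *p = tb_alloc(&pool, 100 * sizeof(uint32_t));
	/* ... use p during the build phase; there is no per-object
	 * free ... */
	tb_free_pool(&pool);	/* releases every block at once */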
diff --git a/lib/librte_acl/tb_mem.h b/lib/librte_acl/tb_mem.h
new file mode 100644
index 0000000..a3ed795
--- /dev/null
+++ b/lib/librte_acl/tb_mem.h
@@ -0,0 +1,73 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _TB_MEM_H_
+#define _TB_MEM_H_
+
+/**
+ * @file
+ *
+ * RTE ACL temporary (build phase) memory management.
+ * Contains structures and functions to manage temporary (build-only)
+ * memory. Memory is allocated in large blocks to speed up freeing when
+ * the trie is destroyed (at the end of the build phase).
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_acl_osdep.h>
+
+struct tb_mem_block {
+	struct tb_mem_block *next;
+	struct tb_mem_pool  *pool;
+	size_t               size;
+	uint8_t             *mem;
+};
+
+struct tb_mem_pool {
+	struct tb_mem_block *block;
+	size_t               alignment;
+	size_t               min_alloc;
+	size_t               alloc;
+};
+
+void *tb_alloc(struct tb_mem_pool *pool, size_t size);
+void tb_free_pool(struct tb_mem_pool *pool);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _TB_MEM_H_ */
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [dpdk-dev] [PATCHv3 2/5] acl: update UT to reflect latest changes in the librte_acl
  2014-06-13 11:26 [dpdk-dev] [PATCHv3 0/5] ACL library Konstantin Ananyev
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 1/5] Add ACL library (librte_acl) into DPDK Konstantin Ananyev
@ 2014-06-13 11:26 ` Konstantin Ananyev
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 3/5] acl: New test-acl application Konstantin Ananyev
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Konstantin Ananyev @ 2014-06-13 11:26 UTC (permalink / raw)
  To: dev, dev

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test/test_acl.c |  216 ++++++++++++++++++++++++++++++++-------------------
 1 files changed, 135 insertions(+), 81 deletions(-)

diff --git a/app/test/test_acl.c b/app/test/test_acl.c
index 97cf1fb..5abe36f 100644
--- a/app/test/test_acl.c
+++ b/app/test/test_acl.c
@@ -66,7 +66,7 @@ struct rte_acl_ipv4vlan_rule acl_rule = {
 
 /* byteswap to cpu or network order */
 static void
-bswap_test_data(struct ipv4_7tuple * data, int len, int to_be)
+bswap_test_data(struct ipv4_7tuple *data, int len, int to_be)
 {
 	int i;
 
@@ -80,8 +80,7 @@ bswap_test_data(struct ipv4_7tuple * data, int len, int to_be)
 			data[i].port_src = rte_cpu_to_be_16(data[i].port_src);
 			data[i].vlan = rte_cpu_to_be_16(data[i].vlan);
 			data[i].domain = rte_cpu_to_be_16(data[i].domain);
-		}
-		else {
+		} else {
 			data[i].ip_dst = rte_be_to_cpu_32(data[i].ip_dst);
 			data[i].ip_src = rte_be_to_cpu_32(data[i].ip_src);
 			data[i].port_dst = rte_be_to_cpu_16(data[i].port_dst);
@@ -96,46 +95,12 @@ bswap_test_data(struct ipv4_7tuple * data, int len, int to_be)
  * Test scalar and SSE ACL lookup.
  */
 static int
-test_classify(void)
+test_classify_run(struct rte_acl_ctx *acx)
 {
-	struct rte_acl_ctx * acx;
 	int ret, i;
 	uint32_t result, count;
-
 	uint32_t results[RTE_DIM(acl_test_data) * RTE_ACL_MAX_CATEGORIES];
-
-	const uint8_t * data[RTE_DIM(acl_test_data)];
-
-	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM] = {
-			offsetof(struct ipv4_7tuple, proto),
-			offsetof(struct ipv4_7tuple, vlan),
-			offsetof(struct ipv4_7tuple, ip_src),
-			offsetof(struct ipv4_7tuple, ip_dst),
-			offsetof(struct ipv4_7tuple, port_src),
-	};
-
-	acx = rte_acl_create(&acl_param);
-	if (acx == NULL) {
-		printf("Line %i: Error creating ACL context!\n", __LINE__);
-		return -1;
-	}
-
-	/* add rules to the context */
-	ret = rte_acl_ipv4vlan_add_rules(acx, acl_test_rules,
-			RTE_DIM(acl_test_rules));
-	if (ret != 0) {
-		printf("Line %i: Adding rules to ACL context failed!\n", __LINE__);
-		rte_acl_free(acx);
-		return -1;
-	}
-
-	/* try building the context */
-	ret = rte_acl_ipv4vlan_build(acx, layout, RTE_ACL_MAX_CATEGORIES);
-	if (ret != 0) {
-		printf("Line %i: Building ACL context failed!\n", __LINE__);
-		rte_acl_free(acx);
-		return -1;
-	}
+	const uint8_t *data[RTE_DIM(acl_test_data)];
 
 	/* swap all bytes in the data to network order */
 	bswap_test_data(acl_test_data, RTE_DIM(acl_test_data), 1);
@@ -158,12 +123,13 @@ test_classify(void)
 
 		/* check if we allow everything we should allow */
 		for (i = 0; i < (int) count; i++) {
-			result = results[i * RTE_ACL_MAX_CATEGORIES + ACL_ALLOW];
+			result =
+				results[i * RTE_ACL_MAX_CATEGORIES + ACL_ALLOW];
 			if (result != acl_test_data[i].allow) {
 				printf("Line %i: Error in allow results at %i "
-						"(expected %"PRIu32" got %"PRIu32")!\n",
-						__LINE__, i, acl_test_data[i].allow,
-						result);
+					"(expected %"PRIu32" got %"PRIu32")!\n",
+					__LINE__, i, acl_test_data[i].allow,
+					result);
 				goto err;
 			}
 		}
@@ -173,9 +139,9 @@ test_classify(void)
 			result = results[i * RTE_ACL_MAX_CATEGORIES + ACL_DENY];
 			if (result != acl_test_data[i].deny) {
 				printf("Line %i: Error in deny results at %i "
-						"(expected %"PRIu32" got %"PRIu32")!\n",
-						__LINE__, i, acl_test_data[i].deny,
-						result);
+					"(expected %"PRIu32" got %"PRIu32")!\n",
+					__LINE__, i, acl_test_data[i].deny,
+					result);
 				goto err;
 			}
 		}
@@ -183,7 +149,7 @@ test_classify(void)
 
 	/* make a quick check for scalar */
 	ret = rte_acl_classify_scalar(acx, data, results,
-					RTE_DIM(acl_test_data), RTE_ACL_MAX_CATEGORIES);
+			RTE_DIM(acl_test_data), RTE_ACL_MAX_CATEGORIES);
 	if (ret != 0) {
 		printf("Line %i: SSE classify failed!\n", __LINE__);
 		goto err;
@@ -213,21 +179,97 @@ test_classify(void)
 		}
 	}
 
-	/* free ACL context */
-	rte_acl_free(acx);
+	ret = 0;
 
+err:
 	/* swap data back to cpu order so that next time tests don't fail */
 	bswap_test_data(acl_test_data, RTE_DIM(acl_test_data), 0);
+	return (ret);
+}
 
-	return 0;
-err:
+static int
+test_classify_build(struct rte_acl_ctx *acx)
+{
+	int ret;
+	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM] = {
+			offsetof(struct ipv4_7tuple, proto),
+			offsetof(struct ipv4_7tuple, vlan),
+			offsetof(struct ipv4_7tuple, ip_src),
+			offsetof(struct ipv4_7tuple, ip_dst),
+			offsetof(struct ipv4_7tuple, port_src),
+	};
 
-	/* swap data back to cpu order so that next time tests don't fail */
-	bswap_test_data(acl_test_data, RTE_DIM(acl_test_data), 0);
+	/* add rules to the context */
+	ret = rte_acl_ipv4vlan_add_rules(acx, acl_test_rules,
+			RTE_DIM(acl_test_rules));
+	if (ret != 0) {
+		printf("Line %i: Adding rules to ACL context failed!\n",
+			__LINE__);
+		return (ret);
+	}
 
-	rte_acl_free(acx);
+	/* try building the context */
+	ret = rte_acl_ipv4vlan_build(acx, layout, RTE_ACL_MAX_CATEGORIES);
+	if (ret != 0) {
+		printf("Line %i: Building ACL context failed!\n", __LINE__);
+		return (ret);
+	}
 
-	return -1;
+	return (0);
+}
+
+#define	TEST_CLASSIFY_ITER	4
+
+/*
+ * Test scalar and SSE ACL lookup.
+ */
+static int
+test_classify(void)
+{
+	struct rte_acl_ctx *acx;
+	int i, ret;
+
+	acx = rte_acl_create(&acl_param);
+	if (acx == NULL) {
+		printf("Line %i: Error creating ACL context!\n", __LINE__);
+		return -1;
+	}
+
+	ret = 0;
+	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
+
+		if ((i & 1) == 0)
+			rte_acl_reset(acx);
+		else
+			rte_acl_reset_rules(acx);
+
+		ret = test_classify_build(acx);
+		if (ret != 0) {
+			printf("Line %i, iter: %d: "
+				"Adding rules to ACL context failed!\n",
+				__LINE__, i);
+			break;
+		}
+
+		ret = test_classify_run(acx);
+		if (ret != 0) {
+			printf("Line %i, iter: %d: %s failed!\n",
+				__LINE__, i, __func__);
+			break;
+		}
+
+		/* reset rules and make sure that classify still works ok. */
+		rte_acl_reset_rules(acx);
+		ret = test_classify_run(acx);
+		if (ret != 0) {
+			printf("Line %i, iter: %d: %s failed!\n",
+				__LINE__, i, __func__);
+			break;
+		}
+	}
+
+	rte_acl_free(acx);
+	return (ret);
 }
 
 /*
@@ -241,11 +283,11 @@ err:
 static int
 test_invalid_layout(void)
 {
-	struct rte_acl_ctx * acx;
+	struct rte_acl_ctx *acx;
 	int ret, i;
 
 	uint32_t results[RTE_DIM(invalid_layout_data)];
-	const uint8_t * data[RTE_DIM(invalid_layout_data)];
+	const uint8_t *data[RTE_DIM(invalid_layout_data)];
 
 	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM] = {
 			/* proto points to destination port's first byte */
@@ -257,7 +299,10 @@ test_invalid_layout(void)
 			offsetof(struct ipv4_7tuple, ip_dst),
 			offsetof(struct ipv4_7tuple, ip_src),
 
-			/* we can't swap ports here, so we will swap them in the data */
+			/*
+			 * we can't swap ports here, so we will swap
+			 * them in the data
+			 */
 			offsetof(struct ipv4_7tuple, port_src),
 	};
 
@@ -274,7 +319,8 @@ test_invalid_layout(void)
 		ret = rte_acl_ipv4vlan_add_rules(acx, invalid_layout_rules,
 				RTE_DIM(invalid_layout_rules));
 		if (ret != 0) {
-			printf("Line %i: Adding rules to ACL context failed!\n", __LINE__);
+			printf("Line %i: Adding rules to ACL context failed!\n",
+				__LINE__);
 			rte_acl_free(acx);
 			return -1;
 		}
@@ -307,8 +353,10 @@ test_invalid_layout(void)
 
 	for (i = 0; i < (int) RTE_DIM(results); i++) {
 		if (results[i] != invalid_layout_data[i].allow) {
-			printf("Line %i: Wrong results at %i (result=%u, should be %u)!\n",
-					__LINE__, i, results[i], invalid_layout_data[i].allow);
+			printf("Line %i: Wrong results at %i "
+				"(result=%u, should be %u)!\n",
+				__LINE__, i, results[i],
+				invalid_layout_data[i].allow);
 			goto err;
 		}
 	}
@@ -324,8 +372,10 @@ test_invalid_layout(void)
 
 	for (i = 0; i < (int) RTE_DIM(results); i++) {
 		if (results[i] != invalid_layout_data[i].allow) {
-			printf("Line %i: Wrong results at %i (result=%u, should be %u)!\n",
-					__LINE__, i, results[i], invalid_layout_data[i].allow);
+			printf("Line %i: Wrong results at %i "
+				"(result=%u, should be %u)!\n",
+				__LINE__, i, results[i],
+				invalid_layout_data[i].allow);
 			goto err;
 		}
 	}
@@ -353,13 +403,13 @@ static int
 test_create_find_add(void)
 {
 	struct rte_acl_param param;
-	struct rte_acl_ctx * acx, *acx2, *tmp;
+	struct rte_acl_ctx *acx, *acx2, *tmp;
 	struct rte_acl_ipv4vlan_rule rules[LEN];
 
 	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM] = {0};
 
-	const char * acx_name = "acx";
-	const char * acx2_name = "acx2";
+	const char *acx_name = "acx";
+	const char *acx2_name = "acx2";
 	int i, ret;
 
 	/* create two contexts */
@@ -385,8 +435,9 @@ test_create_find_add(void)
 	param.name = acx_name;
 	tmp = rte_acl_create(&param);
 	if (tmp != acx) {
-		printf("Line %i: Creating context with existing name test failed!\n",
-				__LINE__);
+		printf("Line %i: Creating context with existing name "
+			"test failed!\n",
+			__LINE__);
 		if (tmp)
 			rte_acl_free(tmp);
 		goto err;
@@ -395,8 +446,9 @@ test_create_find_add(void)
 	param.name = acx2_name;
 	tmp = rte_acl_create(&param);
 	if (tmp != acx2) {
-		printf("Line %i: Creating context with existing name test 2 failed!\n",
-				__LINE__);
+		printf("Line %i: Creating context with existing "
+			"name test 2 failed!\n",
+			__LINE__);
 		if (tmp)
 			rte_acl_free(tmp);
 		goto err;
@@ -442,9 +494,12 @@ test_create_find_add(void)
 
 	/* create dummy acl */
 	for (i = 0; i < LEN; i++) {
-		memcpy(&rules[i], &acl_rule, sizeof(struct rte_acl_ipv4vlan_rule));
-		rules[i].data.userdata = i + 1;       /* skip zero */
-		rules[i].data.category_mask = 1 << i; /* one rule per category */
+		memcpy(&rules[i], &acl_rule,
+			sizeof(struct rte_acl_ipv4vlan_rule));
+		/* skip zero */
+		rules[i].data.userdata = i + 1;
+		/* one rule per category */
+		rules[i].data.category_mask = 1 << i;
 	}
 
 	/* try filling up the context */
@@ -486,7 +541,7 @@ err:
 static int
 test_invalid_rules(void)
 {
-	struct rte_acl_ctx * acx;
+	struct rte_acl_ctx *acx;
 	int ret;
 
 	struct rte_acl_ipv4vlan_rule rule;
@@ -589,7 +644,7 @@ static int
 test_invalid_parameters(void)
 {
 	struct rte_acl_param param;
-	struct rte_acl_ctx * acx;
+	struct rte_acl_ctx *acx;
 	struct rte_acl_ipv4vlan_rule rule;
 	int result;
 
@@ -618,8 +673,7 @@ test_invalid_parameters(void)
 		printf("Line %i: ACL context creation with zero rule len "
 				"failed!\n", __LINE__);
 		return -1;
-	}
-	else
+	} else
 		rte_acl_free(acx);
 
 	/* zero max rule num */
@@ -631,8 +685,7 @@ test_invalid_parameters(void)
 		printf("Line %i: ACL context creation with zero rule num "
 				"failed!\n", __LINE__);
 		return -1;
-	}
-	else
+	} else
 		rte_acl_free(acx);
 
 	/* invalid NUMA node */
@@ -705,7 +758,8 @@ test_invalid_parameters(void)
 	/* zero count (should succeed) */
 	result = rte_acl_ipv4vlan_add_rules(acx, &rule, 0);
 	if (result != 0) {
-		printf("Line %i: Adding 0 rules to ACL context failed!\n", __LINE__);
+		printf("Line %i: Adding 0 rules to ACL context failed!\n",
+			__LINE__);
 		rte_acl_free(acx);
 		return -1;
 	}
@@ -835,7 +889,7 @@ static int
 test_misc(void)
 {
 	struct rte_acl_param param;
-	struct rte_acl_ctx * acx;
+	struct rte_acl_ctx *acx;
 
 	/* create context */
 	memcpy(&param, &acl_param, sizeof(param));
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 11+ messages in thread
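
For reference, the property this rework exercises (illustrative only;
acx, data, results, num and cats as in the test above):

	/* rte_acl_reset_rules() clears only the rule list; run-time
	 * structures from the last build stay intact, so classification
	 * keeps working: */
	rte_acl_reset_rules(acx);
	rte_acl_classify(acx, data, results, num, cats);	/* still ok */

	/* rte_acl_reset() also destroys the run-time structures, so a
	 * new build is required before the next classify. */
	rte_acl_reset(acx);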

* [dpdk-dev] [PATCHv3 3/5] acl: New test-acl application
  2014-06-13 11:26 [dpdk-dev] [PATCHv3 0/5] ACL library Konstantin Ananyev
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 1/5] Add ACL library (librte_acl) into DPDK Konstantin Ananyev
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 2/5] acl: update UT to reflect latest changes in the librte_acl Konstantin Ananyev
@ 2014-06-13 11:26 ` Konstantin Ananyev
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 4/5] acl: New sample l3fwd-acl Konstantin Ananyev
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Konstantin Ananyev @ 2014-06-13 11:26 UTC (permalink / raw)
  To: dev, dev

Introduce test-acl:
Usage example and main test application for the ACL library.
Provides IPv4/IPv6 5-tuple classification.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/Makefile          |    1 +
 app/test-acl/Makefile |   45 +++
 app/test-acl/main.c   | 1029 +++++++++++++++++++++++++++++++++++++++++++++++++
 app/test-acl/main.h   |   50 +++
 4 files changed, 1125 insertions(+), 0 deletions(-)
 create mode 100644 app/test-acl/Makefile
 create mode 100644 app/test-acl/main.c
 create mode 100644 app/test-acl/main.h

diff --git a/app/Makefile b/app/Makefile
index 04417d8..90557e5 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -35,5 +35,6 @@ DIRS-$(CONFIG_RTE_APP_TEST) += test
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += dump_cfg
+DIRS-$(CONFIG_RTE_LIBRTE_ACL) += test-acl
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/test-acl/Makefile b/app/test-acl/Makefile
new file mode 100644
index 0000000..00fa3b6
--- /dev/null
+++ b/app/test-acl/Makefile
@@ -0,0 +1,45 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+APP = testacl
+
+CFLAGS += $(WERROR_FLAGS)
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) := main.c
+
+# this application needs libraries first
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ACL) += lib
+
+include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/test-acl/main.c b/app/test-acl/main.c
new file mode 100644
index 0000000..78d9ae5
--- /dev/null
+++ b/app/test-acl/main.c
@@ -0,0 +1,1029 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include <getopt.h>
+#include <string.h>
+
+#ifndef RTE_LIBRTE_ACL_STANDALONE
+
+#include <rte_cycles.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_ip.h>
+
+#define	PRINT_USAGE_START	"%s [EAL options]\n"
+
+#else
+
+#define IPv4(a, b, c, d) ((uint32_t)(((a) & 0xff) << 24) | \
+				(((b) & 0xff) << 16) |     \
+				(((c) & 0xff) << 8)  |     \
+				((d) & 0xff))
+
+#define	RTE_LCORE_FOREACH_SLAVE(x)	while (((x) = 0))
+
+#define	rte_eal_remote_launch(a, b, c)	DUMMY_MACRO
+#define	rte_eal_mp_wait_lcore()		DUMMY_MACRO
+
+#define	rte_eal_init(c, v)	(0)
+
+#define	PRINT_USAGE_START	"%s\n"
+
+#endif /*RTE_LIBRTE_ACL_STANDALONE */
+
+#include "main.h"
+
+#define GET_CB_FIELD(in, fd, base, lim, dlm)	do {            \
+	unsigned long val;                                      \
+	char *end_fld;                                          \
+	errno = 0;                                              \
+	val = strtoul((in), &end_fld, (base));                  \
+	if (errno != 0 || end_fld[0] != (dlm) || val > (lim))   \
+		return (-EINVAL);                               \
+	(fd) = (typeof(fd))val;                                 \
+	(in) = end_fld + 1;                                     \
+} while (0)
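+/*
+ * Example (hypothetical input): with in = "80:1024", base 10 and
+ * dlm = ':', GET_CB_FIELD(in, port, 10, UINT16_MAX, ':') sets
+ * port = 80 and advances in to point at "1024".
+ */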
+
+#define	OPT_RULE_FILE		"rulesf"
+#define	OPT_TRACE_FILE		"tracef"
+#define	OPT_RULE_NUM		"rulenum"
+#define	OPT_TRACE_NUM		"tracenum"
+#define	OPT_TRACE_STEP		"tracestep"
+#define	OPT_SEARCH_SCALAR	"scalar"
+#define	OPT_BLD_CATEGORIES	"bldcat"
+#define	OPT_RUN_CATEGORIES	"runcat"
+#define	OPT_ITER_NUM		"iter"
+#define	OPT_VERBOSE		"verbose"
+#define	OPT_IPV6		"ipv6"
+
+#define	TRACE_DEFAULT_NUM	0x10000
+#define	TRACE_STEP_MAX		0x1000
+#define	TRACE_STEP_DEF		0x100
+
+#define	RULE_NUM		0x10000
+
+enum {
+	DUMP_NONE,
+	DUMP_SEARCH,
+	DUMP_PKT,
+	DUMP_MAX
+};
+
+static struct {
+	const char         *prgname;
+	const char         *rule_file;
+	const char         *trace_file;
+	uint32_t            bld_categories;
+	uint32_t            run_categories;
+	uint32_t            nb_rules;
+	uint32_t            nb_traces;
+	uint32_t            trace_step;
+	uint32_t            trace_sz;
+	uint32_t            iter_num;
+	uint32_t            verbose;
+	uint32_t            scalar;
+	uint32_t            used_traces;
+	void               *traces;
+	struct rte_acl_ctx *acx;
+	uint32_t            ipv6;
+} config = {
+	.bld_categories = 3,
+	.run_categories = 1,
+	.nb_rules = RULE_NUM,
+	.nb_traces = TRACE_DEFAULT_NUM,
+	.trace_step = TRACE_STEP_DEF,
+	.iter_num = 1,
+	.verbose = DUMP_MAX,
+	.ipv6 = 0
+};
+
+static struct rte_acl_param prm = {
+	.name = APP_NAME,
+	.socket_id = SOCKET_ID_ANY,
+};
+
+/*
+ * Rule and trace formats definitions.
+ */
+
+struct ipv4_5tuple {
+	uint8_t  proto;
+	uint32_t ip_src;
+	uint32_t ip_dst;
+	uint16_t port_src;
+	uint16_t port_dst;
+};
+
+enum {
+	PROTO_FIELD_IPV4,
+	SRC_FIELD_IPV4,
+	DST_FIELD_IPV4,
+	SRCP_FIELD_IPV4,
+	DSTP_FIELD_IPV4,
+	NUM_FIELDS_IPV4
+};
+
+struct rte_acl_field_def ipv4_defs[NUM_FIELDS_IPV4] = {
+	{
+		.type = RTE_ACL_FIELD_TYPE_BITMASK,
+		.size = sizeof(uint8_t),
+		.field_index = PROTO_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PROTO,
+		.offset = offsetof(struct ipv4_5tuple, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_SRC,
+		.offset = offsetof(struct ipv4_5tuple, ip_src),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_DST,
+		.offset = offsetof(struct ipv4_5tuple, ip_dst),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = SRCP_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		.offset = offsetof(struct ipv4_5tuple, port_src),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = DSTP_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		.offset = offsetof(struct ipv4_5tuple, port_dst),
+	},
+};
+
+#define	IPV6_ADDR_LEN	16
+#define	IPV6_ADDR_U16	(IPV6_ADDR_LEN / sizeof(uint16_t))
+#define	IPV6_ADDR_U32	(IPV6_ADDR_LEN / sizeof(uint32_t))
+
+struct ipv6_5tuple {
+	uint8_t  proto;
+	uint32_t ip_src[IPV6_ADDR_U32];
+	uint32_t ip_dst[IPV6_ADDR_U32];
+	uint16_t port_src;
+	uint16_t port_dst;
+};
+
+enum {
+	PROTO_FIELD_IPV6,
+	SRC1_FIELD_IPV6,
+	SRC2_FIELD_IPV6,
+	SRC3_FIELD_IPV6,
+	SRC4_FIELD_IPV6,
+	DST1_FIELD_IPV6,
+	DST2_FIELD_IPV6,
+	DST3_FIELD_IPV6,
+	DST4_FIELD_IPV6,
+	SRCP_FIELD_IPV6,
+	DSTP_FIELD_IPV6,
+	NUM_FIELDS_IPV6
+};
+
+struct rte_acl_field_def ipv6_defs[NUM_FIELDS_IPV6] = {
+	{
+		.type = RTE_ACL_FIELD_TYPE_BITMASK,
+		.size = sizeof(uint8_t),
+		.field_index = PROTO_FIELD_IPV6,
+		.input_index = PROTO_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC1_FIELD_IPV6,
+		.input_index = SRC1_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_src[0]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC2_FIELD_IPV6,
+		.input_index = SRC2_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_src[1]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC3_FIELD_IPV6,
+		.input_index = SRC3_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_src[2]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC4_FIELD_IPV6,
+		.input_index = SRC4_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_src[3]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST1_FIELD_IPV6,
+		.input_index = DST1_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_dst[0]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST2_FIELD_IPV6,
+		.input_index = DST2_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_dst[1]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST3_FIELD_IPV6,
+		.input_index = DST3_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_dst[2]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST4_FIELD_IPV6,
+		.input_index = DST4_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_dst[3]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = SRCP_FIELD_IPV6,
+		.input_index = SRCP_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, port_src),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = DSTP_FIELD_IPV6,
+		.input_index = SRCP_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, port_dst),
+	},
+};
+
+enum {
+	CB_FLD_SRC_ADDR,
+	CB_FLD_DST_ADDR,
+	CB_FLD_SRC_PORT_LOW,
+	CB_FLD_SRC_PORT_DLM,
+	CB_FLD_SRC_PORT_HIGH,
+	CB_FLD_DST_PORT_LOW,
+	CB_FLD_DST_PORT_DLM,
+	CB_FLD_DST_PORT_HIGH,
+	CB_FLD_PROTO,
+	CB_FLD_NUM,
+};
+
+enum {
+	CB_TRC_SRC_ADDR,
+	CB_TRC_DST_ADDR,
+	CB_TRC_SRC_PORT,
+	CB_TRC_DST_PORT,
+	CB_TRC_PROTO,
+	CB_TRC_NUM,
+};
+
+RTE_ACL_RULE_DEF(acl_rule, RTE_ACL_MAX_FIELDS);
+
+static const char cb_port_delim[] = ":";
+
+static char line[LINE_MAX];
+
+#define	dump_verbose(lvl, fh, fmt, args...)	do { \
+	if ((lvl) <= (int32_t)config.verbose)        \
+		fprintf(fh, fmt, ##args);            \
+} while (0)
+
+
+/*
+ * Parse ClassBench input trace (test vectors and expected results) file.
+ * Expected format:
+ * <src_ipv4_addr> <space> <dst_ipv4_addr> <space> \
+ * <src_port> <space> <dst_port> <space> <proto>
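+ * Example (hypothetical values): 0xc0a80101 0xc0a80202 1024 53 6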
+ */
+static int
+parse_cb_ipv4_trace(char *str, struct ipv4_5tuple *v)
+{
+	int i;
+	char *s, *sp, *in[CB_TRC_NUM];
+	static const char *dlm = " \t\n";
+
+	s = str;
+	for (i = 0; i != RTE_DIM(in); i++) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+		s = NULL;
+	}
+
+	GET_CB_FIELD(in[CB_TRC_SRC_ADDR], v->ip_src, 0, UINT32_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_DST_ADDR], v->ip_dst, 0, UINT32_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_SRC_PORT], v->port_src, 0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_DST_PORT], v->port_dst, 0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_PROTO], v->proto, 0, UINT8_MAX, 0);
+
+	/* convert to network byte order. */
+	v->ip_src = rte_cpu_to_be_32(v->ip_src);
+	v->ip_dst = rte_cpu_to_be_32(v->ip_dst);
+	v->port_src = rte_cpu_to_be_16(v->port_src);
+	v->port_dst = rte_cpu_to_be_16(v->port_dst);
+
+	return (0);
+}
+
+/*
+ * Parses an IPv6 address; expects the following format:
+ * XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX (where each X is a hexadecimal
+ * digit); zero compression ("::") is not supported.
+ */
+static int
+parse_ipv6_addr(const char *in, const char **end, uint32_t v[IPV6_ADDR_U32],
+	char dlm)
+{
+	uint32_t addr[IPV6_ADDR_U16];
+
+	GET_CB_FIELD(in, addr[0], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[1], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[2], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[3], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[4], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[5], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[6], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[7], 16, UINT16_MAX, dlm);
+
+	*end = in;
+
+	v[0] = (addr[0] << 16) + addr[1];
+	v[1] = (addr[2] << 16) + addr[3];
+	v[2] = (addr[4] << 16) + addr[5];
+	v[3] = (addr[6] << 16) + addr[7];
+
+	return (0);
+}
+
+static int
+parse_cb_ipv6_addr_trace(const char *in, uint32_t v[IPV6_ADDR_U32])
+{
+	int32_t rc;
+	const char *end;
+
+	if ((rc = parse_ipv6_addr(in, &end, v, 0)) != 0)
+		return (rc);
+
+	v[0] = rte_cpu_to_be_32(v[0]);
+	v[1] = rte_cpu_to_be_32(v[1]);
+	v[2] = rte_cpu_to_be_32(v[2]);
+	v[3] = rte_cpu_to_be_32(v[3]);
+
+	return (0);
+}
+
+/*
+ * Parse ClassBench input trace (test vectors and expected results) file.
+ * Expected format:
+ * <src_ipv6_addr> <space> <dst_ipv6_addr> <space> \
+ * <src_port> <space> <dst_port> <space> <proto>
+ */
+static int
+parse_cb_ipv6_trace(char *str, struct ipv6_5tuple *v)
+{
+	int32_t i, rc;
+	char *s, *sp, *in[CB_TRC_NUM];
+	static const char *dlm = " \t\n";
+
+	s = str;
+	for (i = 0; i != RTE_DIM(in); i++) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+		s = NULL;
+	}
+
+	/* get ip6 src address. */
+	if ((rc = parse_cb_ipv6_addr_trace(in[CB_TRC_SRC_ADDR],
+			v->ip_src)) != 0)
+		return (rc);
+
+	/* get ip6 dst address. */
+	if ((rc = parse_cb_ipv6_addr_trace(in[CB_TRC_DST_ADDR],
+			v->ip_dst)) != 0)
+		return (rc);
+
+	GET_CB_FIELD(in[CB_TRC_SRC_PORT], v->port_src, 0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_DST_PORT], v->port_dst, 0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_PROTO], v->proto, 0, UINT8_MAX, 0);
+
+	/* convert to network byte order. */
+	v->port_src = rte_cpu_to_be_16(v->port_src);
+	v->port_dst = rte_cpu_to_be_16(v->port_dst);
+
+	return (0);
+}
+
+static void
+tracef_init(void)
+{
+	static const char name[] = APP_NAME;
+	FILE *f;
+	size_t sz;
+	uint32_t n;
+	struct ipv4_5tuple *v;
+	struct ipv6_5tuple *w;
+
+	sz = config.nb_traces * (config.ipv6 ? sizeof(*w) : sizeof(*v));
+	if ((config.traces = rte_zmalloc_socket(name, sz, CACHE_LINE_SIZE,
+			SOCKET_ID_ANY)) == NULL)
+		rte_exit(EXIT_FAILURE, "Cannot allocate %zu bytes for "
+			"the %u requested trace records\n",
+			sz, config.nb_traces);
+
+	if ((f = fopen(config.trace_file, "r")) == NULL)
+		rte_exit(-EINVAL, "failed to open file: %s\n",
+			config.trace_file);
+
+	v = config.traces;
+	w = config.traces;
+	for (n = 0; n != config.nb_traces; n++) {
+
+		if (fgets(line, sizeof(line), f) == NULL)
+			break;
+
+		if (config.ipv6) {
+			if (parse_cb_ipv6_trace(line, w + n) != 0)
+				rte_exit(EXIT_FAILURE,
+					"%s: failed to parse ipv6 trace "
+					"record at line %u\n",
+					config.trace_file, n + 1);
+		} else {
+			if (parse_cb_ipv4_trace(line, v + n) != 0)
+				rte_exit(EXIT_FAILURE,
+					"%s: failed to parse ipv4 trace "
+					"record at line %u\n",
+					config.trace_file, n + 1);
+		}
+	}
+
+	config.used_traces = n;
+	fclose(f);
+}
+
+static int
+parse_ipv6_net(const char *in, struct rte_acl_field field[4])
+{
+	int32_t rc;
+	const char *mp;
+	uint32_t i, m, v[4];
+	const uint32_t nbu32 = sizeof(uint32_t) * CHAR_BIT;
+
+	/* get address. */
+	if ((rc = parse_ipv6_addr(in, &mp, v, '/')) != 0)
+		return (rc);
+
+	/* get mask. */
+	GET_CB_FIELD(mp, m, 0, CHAR_BIT * sizeof(v), 0);
+
+	/* put all together. */
+	for (i = 0; i != RTE_DIM(v); i++) {
+		if (m >= (i + 1) * nbu32)
+			field[i].mask_range.u32 = nbu32;
+		else
+			field[i].mask_range.u32 = m > (i * nbu32) ?
+				m - (i * nbu32) : 0;
+
+		field[i].value.u32 = v[i];
+	}
+
+	return (0);
+}
+
+static int
+parse_cb_ipv6_rule(char *str, struct acl_rule *v)
+{
+	int i, rc;
+	char *s, *sp, *in[CB_FLD_NUM];
+	static const char *dlm = " \t\n";
+
+	/*
+	 * Skip leading '@'
+	 */
+	if (strchr(str, '@') != str)
+		return (-EINVAL);
+
+	s = str + 1;
+
+	for (i = 0; i != RTE_DIM(in); i++) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+		s = NULL;
+	}
+
+	if ((rc = parse_ipv6_net(in[CB_FLD_SRC_ADDR],
+			v->field + SRC1_FIELD_IPV6)) != 0) {
+		RTE_LOG(ERR, TESTACL,
+			"failed to read source address/mask: %s\n",
+			in[CB_FLD_SRC_ADDR]);
+		return (rc);
+	}
+
+	if ((rc = parse_ipv6_net(in[CB_FLD_DST_ADDR],
+			v->field + DST1_FIELD_IPV6)) != 0) {
+		RTE_LOG(ERR, TESTACL,
+			"failed to read destination address/mask: %s\n",
+			in[CB_FLD_DST_ADDR]);
+		return (rc);
+	}
+
+	/* source port. */
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_LOW],
+		v->field[SRCP_FIELD_IPV6].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_HIGH],
+		v->field[SRCP_FIELD_IPV6].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_SRC_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	/* destination port. */
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_LOW],
+		v->field[DSTP_FIELD_IPV6].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_HIGH],
+		v->field[DSTP_FIELD_IPV6].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_DST_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV6].value.u8,
+		0, UINT8_MAX, '/');
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV6].mask_range.u8,
+		0, UINT8_MAX, 0);
+
+	return (0);
+}
+
+static int
+parse_ipv4_net(const char *in, uint32_t *addr, uint32_t *mask_len)
+{
+	uint8_t a, b, c, d, m;
+
+	GET_CB_FIELD(in, a, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, b, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, c, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, d, 0, UINT8_MAX, '/');
+	GET_CB_FIELD(in, m, 0, sizeof(uint32_t) * CHAR_BIT, 0);
+
+	addr[0] = IPv4(a, b, c, d);
+	mask_len[0] = m;
+
+	return (0);
+}
+/*
+ * Parse ClassBench rules file.
+ * Expected format:
+ * '@'<src_ipv4_addr>'/'<masklen> <space> \
+ * <dst_ipv4_addr>'/'<masklen> <space> \
+ * <src_port_low> <space> ":" <src_port_high> <space> \
+ * <dst_port_low> <space> ":" <dst_port_high> <space> \
+ * <proto>'/'<mask>
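+ * Example (hypothetical): @192.168.0.0/16 10.0.0.0/8 1024 : 65535 53 : 53 6/0xff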
+ */
+static int
+parse_cb_ipv4_rule(char *str, struct acl_rule *v)
+{
+	int i, rc;
+	char *s, *sp, *in[CB_FLD_NUM];
+	static const char *dlm = " \t\n";
+
+	/*
+	 * Skip leading '@'
+	 */
+	if (strchr(str, '@') != str)
+		return (-EINVAL);
+
+	s = str + 1;
+
+	for (i = 0; i != RTE_DIM(in); i++) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+		s = NULL;
+	}
+
+	if ((rc = parse_ipv4_net(in[CB_FLD_SRC_ADDR],
+			&v->field[SRC_FIELD_IPV4].value.u32,
+			&v->field[SRC_FIELD_IPV4].mask_range.u32)) != 0) {
+		RTE_LOG(ERR, TESTACL,
+			"failed to read source address/mask: %s\n",
+			in[CB_FLD_SRC_ADDR]);
+		return (rc);
+	}
+
+	if ((rc = parse_ipv4_net(in[CB_FLD_DST_ADDR],
+			&v->field[DST_FIELD_IPV4].value.u32,
+			&v->field[DST_FIELD_IPV4].mask_range.u32)) != 0) {
+		RTE_LOG(ERR, TESTACL,
+			"failed to read destination address/mask: %s\n",
+			in[CB_FLD_DST_ADDR]);
+		return (rc);
+	}
+
+	/* source port. */
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_LOW],
+		v->field[SRCP_FIELD_IPV4].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_HIGH],
+		v->field[SRCP_FIELD_IPV4].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_SRC_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	/* destination port. */
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_LOW],
+		v->field[DSTP_FIELD_IPV4].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_HIGH],
+		v->field[DSTP_FIELD_IPV4].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_DST_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV4].value.u8,
+		0, UINT8_MAX, '/');
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV4].mask_range.u8,
+		0, UINT8_MAX, 0);
+
+	return (0);
+}
+
+typedef int (*parse_5tuple)(char *text, struct acl_rule *rule);
+
+static int
+add_cb_rules(FILE *f, struct rte_acl_ctx *ctx)
+{
+	int rc;
+	uint32_t n;
+	struct acl_rule v;
+	parse_5tuple parser;
+
+	memset(&v, 0, sizeof(v));
+	parser = (config.ipv6 != 0) ? parse_cb_ipv6_rule : parse_cb_ipv4_rule;
+
+	for (n = 1; fgets(line, sizeof(line), f) != NULL; n++) {
+
+		if ((rc = parser(line, &v)) != 0) {
+			RTE_LOG(ERR, TESTACL, "line %u: failed to parse"
+				" rule record, error code: %d (%s)\n",
+				n, rc, strerror(-rc));
+			return (rc);
+		}
+
+		v.data.category_mask = LEN2MASK(RTE_ACL_MAX_CATEGORIES);
+		v.data.priority = RTE_ACL_MAX_PRIORITY - n;
+		v.data.userdata = n;
+
+		if ((rc = rte_acl_add_rules(ctx, (struct rte_acl_rule *)&v,
+				1)) != 0) {
+			RTE_LOG(ERR, TESTACL, "line %u: failed to add rules "
+				"into ACL context, error code: %d (%s)\n",
+				n, rc, strerror(-rc));
+			return (rc);
+		}
+	}
+
+	return (0);
+}
+
+static void
+acx_init(void)
+{
+	int ret;
+	FILE *f;
+	struct rte_acl_config cfg;
+
+	/* setup ACL build config. */
+	if (config.ipv6) {
+		cfg.num_fields = RTE_DIM(ipv6_defs);
+		memcpy(&cfg.defs, ipv6_defs, sizeof(ipv6_defs));
+	} else {
+		cfg.num_fields = RTE_DIM(ipv4_defs);
+		memcpy(&cfg.defs, ipv4_defs, sizeof(ipv4_defs));
+	}
+	cfg.num_categories = config.bld_categories;
+
+	/* setup ACL creation parameters. */
+	prm.rule_size = RTE_ACL_RULE_SZ(cfg.num_fields);
+	prm.max_rule_num = config.nb_rules;
+
+	if ((config.acx = rte_acl_create(&prm)) == NULL)
+		rte_exit(rte_errno, "failed to create ACL context\n");
+
+	/* add ACL rules. */
+	if ((f = fopen(config.rule_file, "r")) == NULL)
+		rte_exit(-EINVAL, "failed to open file %s\n",
+			config.rule_file);
+
+	if ((ret = add_cb_rules(f, config.acx)) != 0)
+		rte_exit(rte_errno, "failed to add rules into ACL context\n");
+
+	fclose(f);
+
+	/* perform build. */
+	ret = rte_acl_build(config.acx, &cfg);
+
+	dump_verbose(DUMP_NONE, stdout,
+		"rte_acl_build(%u) finished with %d\n",
+		config.bld_categories, ret);
+
+	rte_acl_dump(config.acx);
+
+	if (ret != 0)
+		rte_exit(ret, "failed to build search context\n");
+}
+
+static uint32_t
+search_ip5tuples_once(uint32_t categories, uint32_t step, int scalar)
+{
+	int ret;
+	uint32_t i, j, k, n, r;
+	const uint8_t *data[step], *v;
+	uint32_t results[step * categories];
+
+	v = config.traces;
+	for (i = 0; i != config.used_traces; i += n) {
+
+		n = RTE_MIN(step, config.used_traces - i);
+
+		for (j = 0; j != n; j++) {
+			data[j] = v;
+			v += config.trace_sz;
+		}
+
+		if (scalar != 0)
+			ret = rte_acl_classify_scalar(config.acx, data,
+				results, n, categories);
+
+		else
+			ret = rte_acl_classify(config.acx, data,
+				results, n, categories);
+
+		if (ret != 0)
+			rte_exit(ret, "classify for ipv%c_5tuples returns %d\n",
+				config.ipv6 ? '6' : '4', ret);
+
+		for (r = 0, j = 0; j != n; j++) {
+			for (k = 0; k != categories; k++, r++) {
+				dump_verbose(DUMP_PKT, stdout,
+					"ipv%c_5tuple: %u, category: %u, "
+					"result: %u\n",
+					config.ipv6 ? '6' : '4',
+					i + j + 1, k, results[r] - 1);
+			}
+		}
+	}
+
+	dump_verbose(DUMP_SEARCH, stdout,
+		"%s(%u, %u, %s) returns %u\n", __func__,
+		categories, step, scalar != 0 ? "scalar" : "sse", i);
+	return (i);
+}
+
+static int
+search_ip5tuples(__attribute__((unused)) void *arg)
+{
+	uint64_t pkt, start, tm;
+	uint32_t i, lcore;
+
+	lcore = rte_lcore_id();
+	start = rte_rdtsc();
+	pkt = 0;
+
+	for (i = 0; i != config.iter_num; i++) {
+		pkt += search_ip5tuples_once(config.run_categories,
+			config.trace_step, config.scalar);
+	}
+
+	tm = rte_rdtsc() - start;
+	dump_verbose(DUMP_NONE, stdout,
+		"%s  @lcore %u: %" PRIu32 " iterations, %" PRIu64 " pkts, %"
+		PRIu32 " categories, %" PRIu64 " cycles, %#Lf cycles/pkt\n",
+		__func__, lcore, i, pkt, config.run_categories,
+		tm, (long double)tm / pkt);
+
+	return (0);
+}
+
+static uint32_t
+get_uint32_opt(const char *opt, const char *name, uint32_t min, uint32_t max)
+{
+	unsigned long val;
+	char *end;
+
+	errno = 0;
+	val = strtoul(opt, &end, 0);
+	if (errno != 0 || end[0] != 0 || val > max || val < min)
+		rte_exit(-EINVAL, "invalid value: \"%s\" for option: %s\n",
+			opt, name);
+	return (val);
+}
+
+static void
+print_usage(const char *prgname)
+{
+	fprintf(stdout,
+		PRINT_USAGE_START
+		"--" OPT_RULE_FILE "=<rules set file>\n"
+		"[--" OPT_TRACE_FILE "=<input traces file>]\n"
+		"[--" OPT_RULE_NUM
+			"=<maximum number of rules for ACL context>]\n"
+		"[--" OPT_TRACE_NUM
+			"=<number of trace records to read from the trace file>]\n"
+		"[--" OPT_TRACE_STEP
+			"=<number of traces to classify per call>]\n"
+		"[--" OPT_BLD_CATEGORIES
+			"=<number of categories to build with>]\n"
+		"[--" OPT_RUN_CATEGORIES
+			"=<number of categories to run with> "
+			"should be either 1 or a multiple of %zu, "
+			"but not greater than %u]\n"
+		"[--" OPT_ITER_NUM "=<number of iterations to perform>]\n"
+		"[--" OPT_VERBOSE "=<verbose level>]\n"
+		"[--" OPT_SEARCH_SCALAR "=<use scalar version>]\n"
+		"[--" OPT_IPV6 "=<IPv6 rules and trace files>]\n",
+		prgname, RTE_ACL_RESULTS_MULTIPLIER,
+		(uint32_t)RTE_ACL_MAX_CATEGORIES);
+}
+
+static void
+dump_config(FILE *f)
+{
+	fprintf(f, "%s:\n", __func__);
+	fprintf(f, "%s:%s\n", OPT_RULE_FILE, config.rule_file);
+	fprintf(f, "%s:%s\n", OPT_TRACE_FILE, config.trace_file);
+	fprintf(f, "%s:%u\n", OPT_RULE_NUM, config.nb_rules);
+	fprintf(f, "%s:%u\n", OPT_TRACE_NUM, config.nb_traces);
+	fprintf(f, "%s:%u\n", OPT_TRACE_STEP, config.trace_step);
+	fprintf(f, "%s:%u\n", OPT_BLD_CATEGORIES, config.bld_categories);
+	fprintf(f, "%s:%u\n", OPT_RUN_CATEGORIES, config.run_categories);
+	fprintf(f, "%s:%u\n", OPT_ITER_NUM, config.iter_num);
+	fprintf(f, "%s:%u\n", OPT_VERBOSE, config.verbose);
+	fprintf(f, "%s:%u\n", OPT_SEARCH_SCALAR, config.scalar);
+	fprintf(f, "%s:%u\n", OPT_IPV6, config.ipv6);
+}
+
+static void
+check_config(void)
+{
+	if (config.rule_file == NULL) {
+		print_usage(config.prgname);
+		rte_exit(-EINVAL, "mandatory option %s is not specified\n",
+			OPT_RULE_FILE);
+	}
+}
+
+static void
+get_input_opts(int argc, char **argv)
+{
+	static struct option lgopts[] = {
+		{OPT_RULE_FILE, 1, 0, 0},
+		{OPT_TRACE_FILE, 1, 0, 0},
+		{OPT_TRACE_NUM, 1, 0, 0},
+		{OPT_RULE_NUM, 1, 0, 0},
+		{OPT_TRACE_STEP, 1, 0, 0},
+		{OPT_BLD_CATEGORIES, 1, 0, 0},
+		{OPT_RUN_CATEGORIES, 1, 0, 0},
+		{OPT_ITER_NUM, 1, 0, 0},
+		{OPT_VERBOSE, 1, 0, 0},
+		{OPT_SEARCH_SCALAR, 0, 0, 0},
+		{OPT_IPV6, 0, 0, 0},
+		{NULL, 0, 0, 0}
+	};
+
+	int opt, opt_idx;
+
+	while ((opt = getopt_long(argc, argv, "", lgopts,  &opt_idx)) != EOF) {
+
+		if (opt != 0) {
+			print_usage(config.prgname);
+			rte_exit(-EINVAL, "unknown option: %c", opt);
+		}
+
+		if (strcmp(lgopts[opt_idx].name, OPT_RULE_FILE) == 0) {
+			config.rule_file = optarg;
+		} else if (strcmp(lgopts[opt_idx].name, OPT_TRACE_FILE) == 0) {
+			config.trace_file = optarg;
+		} else if (strcmp(lgopts[opt_idx].name, OPT_RULE_NUM) == 0) {
+			config.nb_rules = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1, RTE_ACL_MAX_INDEX + 1);
+		} else if (strcmp(lgopts[opt_idx].name, OPT_TRACE_NUM) == 0) {
+			config.nb_traces = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1, UINT32_MAX);
+		} else if (strcmp(lgopts[opt_idx].name, OPT_TRACE_STEP) == 0) {
+			config.trace_step = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1, TRACE_STEP_MAX);
+		} else if (strcmp(lgopts[opt_idx].name,
+				OPT_BLD_CATEGORIES) == 0) {
+			config.bld_categories = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1,
+				RTE_ACL_MAX_CATEGORIES);
+		} else if (strcmp(lgopts[opt_idx].name,
+				OPT_RUN_CATEGORIES) == 0) {
+			config.run_categories = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1,
+				RTE_ACL_MAX_CATEGORIES);
+		} else if (strcmp(lgopts[opt_idx].name, OPT_ITER_NUM) == 0) {
+			config.iter_num = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1, UINT16_MAX);
+		} else if (strcmp(lgopts[opt_idx].name, OPT_VERBOSE) == 0) {
+			config.verbose = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, DUMP_NONE, DUMP_MAX);
+		} else if (strcmp(lgopts[opt_idx].name,
+				OPT_SEARCH_SCALAR) == 0) {
+			config.scalar = 1;
+		} else if (strcmp(lgopts[opt_idx].name, OPT_IPV6) == 0) {
+			config.ipv6 = 1;
+		}
+	}
+	config.trace_sz = config.ipv6 ? sizeof(struct ipv6_5tuple) :
+						sizeof(struct ipv4_5tuple);
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	uint32_t lcore;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	argc -= ret;
+	argv += ret;
+
+	config.prgname = argv[0];
+
+	get_input_opts(argc, argv);
+	dump_config(stdout);
+	check_config();
+
+	acx_init();
+
+	if (config.trace_file != NULL)
+		tracef_init();
+
+	RTE_LCORE_FOREACH_SLAVE(lcore)
+		 rte_eal_remote_launch(search_ip5tuples, NULL, lcore);
+
+	search_ip5tuples(NULL);
+
+	rte_eal_mp_wait_lcore();
+
+	rte_acl_free(config.acx);
+	return (0);
+}
diff --git a/app/test-acl/main.h b/app/test-acl/main.h
new file mode 100644
index 0000000..cec0408
--- /dev/null
+++ b/app/test-acl/main.h
@@ -0,0 +1,50 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+#define	RTE_LOGTYPE_TESTACL	RTE_LOGTYPE_USER1
+
+#define	APP_NAME	"TESTACL"
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [dpdk-dev] [PATCHv3 4/5] acl: New sample l3fwd-acl
  2014-06-13 11:26 [dpdk-dev] [PATCHv3 0/5] ACL library Konstantin Ananyev
                   ` (2 preceding siblings ...)
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 3/5] acl: New test-acl application Konstantin Ananyev
@ 2014-06-13 11:26 ` Konstantin Ananyev
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 5/5] acl: add doxygen configuration and start page Konstantin Ananyev
  2014-06-13 11:56 ` [dpdk-dev] [PATCHv3 0/5] ACL library Thomas Monjalon
  5 siblings, 0 replies; 11+ messages in thread
From: Konstantin Ananyev @ 2014-06-13 11:26 UTC (permalink / raw)
  To: dev, dev

Demonstrates the use of the ACL library in a DPDK application that
implements packet classification and L3 forwarding.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 examples/Makefile           |    1 +
 examples/l3fwd-acl/Makefile |   56 ++
 examples/l3fwd-acl/main.c   | 2140 +++++++++++++++++++++++++++++++++++++++++++
 examples/l3fwd-acl/main.h   |   45 +
 4 files changed, 2242 insertions(+), 0 deletions(-)
 create mode 100644 examples/l3fwd-acl/Makefile
 create mode 100644 examples/l3fwd-acl/main.c
 create mode 100644 examples/l3fwd-acl/main.h

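For reference, an invocation of the sample might look like the following
(the port mask, config tuple and rule file names here are hypothetical and
follow the conventions of the other l3fwd samples; the long options
correspond to the OPTION_* macros defined in examples/l3fwd-acl/main.c
below):

    ./build/l3fwd-acl -c 0x4 -n 4 -- -p 0x1 --config="(0,0,2)" \
        --rule_ipv4=./rule_ipv4.db --rule_ipv6=./rule_ipv6.db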
diff --git a/examples/Makefile b/examples/Makefile
index d6b08c2..f3d1726 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -64,5 +64,6 @@ DIRS-y += vhost
 DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen
 DIRS-y += vmdq
 DIRS-y += vmdq_dcb
+DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
 
 include $(RTE_SDK)/mk/rte.extsubdir.mk
diff --git a/examples/l3fwd-acl/Makefile b/examples/l3fwd-acl/Makefile
new file mode 100644
index 0000000..7ba7247
--- /dev/null
+++ b/examples/l3fwd-acl/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l3fwd-acl
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l3fwd-acl/main.c b/examples/l3fwd-acl/main.c
new file mode 100644
index 0000000..6c8ac26
--- /dev/null
+++ b/examples/l3fwd-acl/main.c
@@ -0,0 +1,2140 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <sys/types.h>
+#include <string.h>
+#include <sys/queue.h>
+#include <stdarg.h>
+#include <errno.h>
+#include <getopt.h>
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+#include <rte_memzone.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_launch.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_prefetch.h>
+#include <rte_lcore.h>
+#include <rte_per_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_interrupts.h>
+#include <rte_pci.h>
+#include <rte_random.h>
+#include <rte_debug.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+#include <rte_ring.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+#include <rte_string_fns.h>
+#include <rte_acl.h>
+
+#include "main.h"
+
+#define DO_RFC_1812_CHECKS
+
+#define RTE_LOGTYPE_L3FWD RTE_LOGTYPE_USER1
+
+#define MAX_JUMBO_PKT_LEN  9600
+
+#define MEMPOOL_CACHE_SIZE 256
+
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+
+/*
+ * This expression is used to calculate the number of mbufs needed
+ * depending on user input, taking into account memory for rx and tx hardware
+ * rings, cache per lcore and mtable per port per lcore.
+ * RTE_MAX is used to ensure that NB_MBUF never goes below a
+ * minimum value of 8192
+ */
+
+#define NB_MBUF	RTE_MAX(\
+	(nb_ports * nb_rx_queue*RTE_TEST_RX_DESC_DEFAULT +	\
+	nb_ports * nb_lcores * MAX_PKT_BURST +			\
+	nb_ports * n_tx_queue * RTE_TEST_TX_DESC_DEFAULT +	\
+	nb_lcores * MEMPOOL_CACHE_SIZE),			\
+	(unsigned)8192)
+
+/*
+ * RX and TX Prefetch, Host, and Write-back threshold values should be
+ * carefully set for optimal performance. Consult the network
+ * controller's datasheet and supporting DPDK documentation for guidance
+ * on how these parameters should be set.
+ */
+#define RX_PTHRESH 8 /**< Default values of RX prefetch threshold reg. */
+#define RX_HTHRESH 8 /**< Default values of RX host threshold reg. */
+#define RX_WTHRESH 4 /**< Default values of RX write-back threshold reg. */
+
+/*
+ * These default values are optimized for use with the Intel(R) 82599 10 GbE
+ * Controller and the DPDK ixgbe PMD. Consider using other values for other
+ * network controllers and/or network drivers.
+ */
+#define TX_PTHRESH 36 /**< Default values of TX prefetch threshold reg. */
+#define TX_HTHRESH 0  /**< Default values of TX host threshold reg. */
+#define TX_WTHRESH 0  /**< Default values of TX write-back threshold reg. */
+
+#define MAX_PKT_BURST 32
+#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */
+
+#define NB_SOCKETS 8
+
+/* Configure how many packets ahead to prefetch when reading packets */
+#define PREFETCH_OFFSET	3
+
+/*
+ * Configurable number of RX/TX ring descriptors
+ */
+#define RTE_TEST_RX_DESC_DEFAULT 128
+#define RTE_TEST_TX_DESC_DEFAULT 512
+static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
+static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
+
+/* ethernet addresses of ports */
+static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
+
+/* mask of enabled ports */
+static uint32_t enabled_port_mask;
+static int promiscuous_on; /**< Promiscuous mode disabled by default. */
+static int numa_on = 1; /**< NUMA is enabled by default. */
+
+struct mbuf_table {
+	uint16_t len;
+	struct rte_mbuf *m_table[MAX_PKT_BURST];
+};
+
+struct lcore_rx_queue {
+	uint8_t port_id;
+	uint8_t queue_id;
+} __rte_cache_aligned;
+
+#define MAX_RX_QUEUE_PER_LCORE 16
+#define MAX_TX_QUEUE_PER_PORT RTE_MAX_ETHPORTS
+#define MAX_RX_QUEUE_PER_PORT 128
+
+#define MAX_LCORE_PARAMS 1024
+struct lcore_params {
+	uint8_t port_id;
+	uint8_t queue_id;
+	uint8_t lcore_id;
+} __rte_cache_aligned;
+
+static struct lcore_params lcore_params_array[MAX_LCORE_PARAMS];
+static struct lcore_params lcore_params_array_default[] = {
+	{0, 0, 2},
+	{0, 1, 2},
+	{0, 2, 2},
+	{1, 0, 2},
+	{1, 1, 2},
+	{1, 2, 2},
+	{2, 0, 2},
+	{3, 0, 3},
+	{3, 1, 3},
+};
+
+static struct lcore_params *lcore_params = lcore_params_array_default;
+static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
+				sizeof(lcore_params_array_default[0]);
+
+static struct rte_eth_conf port_conf = {
+	.rxmode = {
+		.mq_mode	= ETH_MQ_RX_RSS,
+		.max_rx_pkt_len = ETHER_MAX_LEN,
+		.split_hdr_size = 0,
+		.header_split   = 0, /**< Header Split disabled */
+		.hw_ip_checksum = 1, /**< IP checksum offload enabled */
+		.hw_vlan_filter = 0, /**< VLAN filtering disabled */
+		.jumbo_frame    = 0, /**< Jumbo Frame Support disabled */
+		.hw_strip_crc   = 0, /**< CRC stripping by hardware disabled */
+	},
+	.rx_adv_conf = {
+		.rss_conf = {
+			.rss_key = NULL,
+			.rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV4_TCP
+				| ETH_RSS_IPV4_UDP
+				| ETH_RSS_IPV6 | ETH_RSS_IPV6_EX
+				| ETH_RSS_IPV6_TCP | ETH_RSS_IPV6_TCP_EX
+				| ETH_RSS_IPV6_UDP | ETH_RSS_IPV6_UDP_EX,
+		},
+	},
+	.txmode = {
+		.mq_mode = ETH_MQ_TX_NONE,
+	},
+};
+
+static const struct rte_eth_rxconf rx_conf = {
+	.rx_thresh = {
+		.pthresh = RX_PTHRESH,
+		.hthresh = RX_HTHRESH,
+		.wthresh = RX_WTHRESH,
+	},
+	.rx_free_thresh = 32,
+};
+
+static const struct rte_eth_txconf tx_conf = {
+	.tx_thresh = {
+		.pthresh = TX_PTHRESH,
+		.hthresh = TX_HTHRESH,
+		.wthresh = TX_WTHRESH,
+	},
+	.tx_free_thresh = 0, /* Use PMD default values */
+	.tx_rs_thresh = 0, /* Use PMD default values */
+	.txq_flags = 0x0,
+};
+
+static struct rte_mempool *pktmbuf_pool[NB_SOCKETS];
+
+/***********************start of ACL part******************************/
+#ifdef DO_RFC_1812_CHECKS
+static inline int
+is_valid_ipv4_pkt(struct ipv4_hdr *pkt, uint32_t link_len);
+#endif
+static inline int
+send_single_packet(struct rte_mbuf *m, uint8_t port);
+
+#define MAX_ACL_RULE_NUM	100000
+#define DEFAULT_MAX_CATEGORIES	1
+#define L3FWD_ACL_IPV4_NAME	"l3fwd-acl-ipv4"
+#define L3FWD_ACL_IPV6_NAME	"l3fwd-acl-ipv6"
+#define ACL_LEAD_CHAR		('@')
+#define ROUTE_LEAD_CHAR		('R')
+#define COMMENT_LEAD_CHAR	('#')
+#define OPTION_CONFIG		"config"
+#define OPTION_NONUMA		"no-numa"
+#define OPTION_ENBJMO		"enable-jumbo"
+#define OPTION_RULE_IPV4	"rule_ipv4"
+#define OPTION_RULE_IPV6	"rule_ipv6"
+#define OPTION_SCALAR		"scalar"
+#define ACL_DENY_SIGNATURE	0xf0000000
+#define RTE_LOGTYPE_L3FWDACL	RTE_LOGTYPE_USER3
+#define acl_log(format, ...)	RTE_LOG(ERR, L3FWDACL, format, ##__VA_ARGS__)
+#define uint32_t_to_char(ip, a, b, c, d) do {\
+		*a = (unsigned char)(ip >> 24 & 0xff);\
+		*b = (unsigned char)(ip >> 16 & 0xff);\
+		*c = (unsigned char)(ip >> 8 & 0xff);\
+		*d = (unsigned char)(ip & 0xff);\
+	} while (0)
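+/* e.g. uint32_t_to_char(0xc0a80101, &a, &b, &c, &d) yields 192.168.1.1 */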
+#define OFF_ETHHEAD	(sizeof(struct ether_hdr))
+#define OFF_IPV42PROTO (offsetof(struct ipv4_hdr, next_proto_id))
+#define OFF_IPV62PROTO (offsetof(struct ipv6_hdr, proto))
+#define MBUF_IPV4_2PROTO(m)	\
+	(rte_pktmbuf_mtod((m), uint8_t *) + OFF_ETHHEAD + OFF_IPV42PROTO)
+#define MBUF_IPV6_2PROTO(m)	\
+	(rte_pktmbuf_mtod((m), uint8_t *) + OFF_ETHHEAD + OFF_IPV62PROTO)
+
+#define GET_CB_FIELD(in, fd, base, lim, dlm)	do {            \
+	unsigned long val;                                      \
+	char *end;                                              \
+	errno = 0;                                              \
+	val = strtoul((in), &end, (base));                      \
+	if (errno != 0 || end[0] != (dlm) || val > (lim))       \
+		return (-EINVAL);                               \
+	(fd) = (typeof(fd))val;                                 \
+	(in) = end + 1;                                         \
+} while (0)
+
+#define CLASSIFY(context, data, res, num, cat) do {		\
+	if (scalar)						\
+		rte_acl_classify_scalar((context), (data),	\
+		(res), (num), (cat));				\
+	else							\
+		rte_acl_classify((context), (data),		\
+		(res), (num), (cat));				\
+} while (0)
+
+/*
+ * ACL rules should have higher priorities than route rules, so that an ACL
+ * rule is always found when an input packet matches multiple entries in the
+ * database.
+ * An exception is performance measurement, which can define route rules
+ * with higher priority; such route rules are then always returned by each
+ * lookup.
+ * The range from ACL_RULE_PRIORITY_MAX + 1 to RTE_ACL_MAX_PRIORITY is
+ * reserved for route entries used in performance measurement.
+ */
+#define ACL_RULE_PRIORITY_MAX 0x10000000
+
+/*
+ * Forwarding port numbers saved in the ACL lib start from 1,
+ * since the ACL library assumes that userdata 0 is invalid.
+ * So add 1 when saving a port and subtract 1 when forwarding a packet.
+ */
+#define FWD_PORT_SHIFT 1
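+/*
+ * e.g. a route rule forwarding to port 0 stores userdata 0 + FWD_PORT_SHIFT;
+ * on a match, send_one_packet() below forwards to (res - FWD_PORT_SHIFT).
+ */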
+
+/*
+ * Rule and trace formats definitions.
+ */
+
+enum {
+	PROTO_FIELD_IPV4,
+	SRC_FIELD_IPV4,
+	DST_FIELD_IPV4,
+	SRCP_FIELD_IPV4,
+	DSTP_FIELD_IPV4,
+	NUM_FIELDS_IPV4
+};
+
+struct rte_acl_field_def ipv4_defs[NUM_FIELDS_IPV4] = {
+	{
+		.type = RTE_ACL_FIELD_TYPE_BITMASK,
+		.size = sizeof(uint8_t),
+		.field_index = PROTO_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PROTO,
+		.offset = 0,
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_SRC,
+		.offset = offsetof(struct ipv4_hdr, src_addr) -
+			offsetof(struct ipv4_hdr, next_proto_id),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_DST,
+		.offset = offsetof(struct ipv4_hdr, dst_addr) -
+			offsetof(struct ipv4_hdr, next_proto_id),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = SRCP_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		.offset = sizeof(struct ipv4_hdr) -
+			offsetof(struct ipv4_hdr, next_proto_id),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = DSTP_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		.offset = sizeof(struct ipv4_hdr) -
+			offsetof(struct ipv4_hdr, next_proto_id) +
+			sizeof(uint16_t),
+	},
+};
+
+#define	IPV6_ADDR_LEN	16
+#define	IPV6_ADDR_U16	(IPV6_ADDR_LEN / sizeof(uint16_t))
+#define	IPV6_ADDR_U32	(IPV6_ADDR_LEN / sizeof(uint32_t))
+
+enum {
+	PROTO_FIELD_IPV6,
+	SRC1_FIELD_IPV6,
+	SRC2_FIELD_IPV6,
+	SRC3_FIELD_IPV6,
+	SRC4_FIELD_IPV6,
+	DST1_FIELD_IPV6,
+	DST2_FIELD_IPV6,
+	DST3_FIELD_IPV6,
+	DST4_FIELD_IPV6,
+	SRCP_FIELD_IPV6,
+	DSTP_FIELD_IPV6,
+	NUM_FIELDS_IPV6
+};
+
+struct rte_acl_field_def ipv6_defs[NUM_FIELDS_IPV6] = {
+	{
+		.type = RTE_ACL_FIELD_TYPE_BITMASK,
+		.size = sizeof(uint8_t),
+		.field_index = PROTO_FIELD_IPV6,
+		.input_index = PROTO_FIELD_IPV6,
+		.offset = 0,
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC1_FIELD_IPV6,
+		.input_index = SRC1_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, src_addr) -
+			offsetof(struct ipv6_hdr, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC2_FIELD_IPV6,
+		.input_index = SRC2_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, src_addr) -
+			offsetof(struct ipv6_hdr, proto) + sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC3_FIELD_IPV6,
+		.input_index = SRC3_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, src_addr) -
+			offsetof(struct ipv6_hdr, proto) + 2 * sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC4_FIELD_IPV6,
+		.input_index = SRC4_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, src_addr) -
+			offsetof(struct ipv6_hdr, proto) + 3 * sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST1_FIELD_IPV6,
+		.input_index = DST1_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, dst_addr)
+				- offsetof(struct ipv6_hdr, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST2_FIELD_IPV6,
+		.input_index = DST2_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, dst_addr) -
+			offsetof(struct ipv6_hdr, proto) + sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST3_FIELD_IPV6,
+		.input_index = DST3_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, dst_addr) -
+			offsetof(struct ipv6_hdr, proto) + 2 * sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST4_FIELD_IPV6,
+		.input_index = DST4_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, dst_addr) -
+			offsetof(struct ipv6_hdr, proto) + 3 * sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = SRCP_FIELD_IPV6,
+		.input_index = SRCP_FIELD_IPV6,
+		.offset = sizeof(struct ipv6_hdr) -
+			offsetof(struct ipv6_hdr, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = DSTP_FIELD_IPV6,
+		.input_index = SRCP_FIELD_IPV6,
+		.offset = sizeof(struct ipv6_hdr) -
+			offsetof(struct ipv6_hdr, proto) + sizeof(uint16_t),
+	},
+};
+
+enum {
+	CB_FLD_SRC_ADDR,
+	CB_FLD_DST_ADDR,
+	CB_FLD_SRC_PORT_LOW,
+	CB_FLD_SRC_PORT_DLM,
+	CB_FLD_SRC_PORT_HIGH,
+	CB_FLD_DST_PORT_LOW,
+	CB_FLD_DST_PORT_DLM,
+	CB_FLD_DST_PORT_HIGH,
+	CB_FLD_PROTO,
+	CB_FLD_USERDATA,
+	CB_FLD_NUM,
+};
+
+RTE_ACL_RULE_DEF(acl4_rule, RTE_DIM(ipv4_defs));
+RTE_ACL_RULE_DEF(acl6_rule, RTE_DIM(ipv6_defs));
+
+struct acl_search_t {
+	const uint8_t *data_ipv4[MAX_PKT_BURST];
+	struct rte_mbuf *m_ipv4[MAX_PKT_BURST];
+	uint32_t res_ipv4[MAX_PKT_BURST];
+	int num_ipv4;
+
+	const uint8_t *data_ipv6[MAX_PKT_BURST];
+	struct rte_mbuf *m_ipv6[MAX_PKT_BURST];
+	uint32_t res_ipv6[MAX_PKT_BURST];
+	int num_ipv6;
+};
+
+static struct {
+	char mapped[NB_SOCKETS];
+	struct rte_acl_ctx *acx_ipv4[NB_SOCKETS];
+	struct rte_acl_ctx *acx_ipv6[NB_SOCKETS];
+#ifdef L3FWDACL_DEBUG
+	struct acl4_rule *rule_ipv4;
+	struct acl6_rule *rule_ipv6;
+#endif
+} acl_config;
+
+static struct {
+	const char *rule_ipv4_name;
+	const char *rule_ipv6_name;
+	int scalar;
+} parm_config;
+
+const char cb_port_delim[] = ":";
+
+static inline void
+print_one_ipv4_rule(struct acl4_rule *rule, int extra)
+{
+	unsigned char a, b, c, d;
+
+	uint32_t_to_char(rule->field[SRC_FIELD_IPV4].value.u32,
+			&a, &b, &c, &d);
+	printf("%hhu.%hhu.%hhu.%hhu/%u ", a, b, c, d,
+			rule->field[SRC_FIELD_IPV4].mask_range.u32);
+	uint32_t_to_char(rule->field[DST_FIELD_IPV4].value.u32,
+			&a, &b, &c, &d);
+	printf("%hhu.%hhu.%hhu.%hhu/%u ", a, b, c, d,
+			rule->field[DST_FIELD_IPV4].mask_range.u32);
+	printf("%hu : %hu %hu : %hu 0x%hhx/0x%hhx ",
+		rule->field[SRCP_FIELD_IPV4].value.u16,
+		rule->field[SRCP_FIELD_IPV4].mask_range.u16,
+		rule->field[DSTP_FIELD_IPV4].value.u16,
+		rule->field[DSTP_FIELD_IPV4].mask_range.u16,
+		rule->field[PROTO_FIELD_IPV4].value.u8,
+		rule->field[PROTO_FIELD_IPV4].mask_range.u8);
+	if (extra)
+		printf("0x%x-0x%x-0x%x ",
+			rule->data.category_mask,
+			rule->data.priority,
+			rule->data.userdata);
+}
+
+static inline void
+print_one_ipv6_rule(struct acl6_rule *rule, int extra)
+{
+	unsigned char a, b, c, d;
+
+	uint32_t_to_char(rule->field[SRC1_FIELD_IPV6].value.u32,
+		&a, &b, &c, &d);
+	printf("%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[SRC2_FIELD_IPV6].value.u32,
+		&a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[SRC3_FIELD_IPV6].value.u32,
+		&a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[SRC4_FIELD_IPV6].value.u32,
+		&a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x/%u ", a, b, c, d,
+			rule->field[SRC1_FIELD_IPV6].mask_range.u32
+			+ rule->field[SRC2_FIELD_IPV6].mask_range.u32
+			+ rule->field[SRC3_FIELD_IPV6].mask_range.u32
+			+ rule->field[SRC4_FIELD_IPV6].mask_range.u32);
+
+	uint32_t_to_char(rule->field[DST1_FIELD_IPV6].value.u32,
+		&a, &b, &c, &d);
+	printf("%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[DST2_FIELD_IPV6].value.u32,
+		&a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[DST3_FIELD_IPV6].value.u32,
+		&a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[DST4_FIELD_IPV6].value.u32,
+		&a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x/%u ", a, b, c, d,
+			rule->field[DST1_FIELD_IPV6].mask_range.u32
+			+ rule->field[DST2_FIELD_IPV6].mask_range.u32
+			+ rule->field[DST3_FIELD_IPV6].mask_range.u32
+			+ rule->field[DST4_FIELD_IPV6].mask_range.u32);
+
+	printf("%hu : %hu %hu : %hu 0x%hhx/0x%hhx ",
+		rule->field[SRCP_FIELD_IPV6].value.u16,
+		rule->field[SRCP_FIELD_IPV6].mask_range.u16,
+		rule->field[DSTP_FIELD_IPV6].value.u16,
+		rule->field[DSTP_FIELD_IPV6].mask_range.u16,
+		rule->field[PROTO_FIELD_IPV6].value.u8,
+		rule->field[PROTO_FIELD_IPV6].mask_range.u8);
+	if (extra)
+		printf("0x%x-0x%x-0x%x ",
+			rule->data.category_mask,
+			rule->data.priority,
+			rule->data.userdata);
+}
+
+/* Bypass comment and empty lines */
+static inline int
+is_bypass_line(char *buff)
+{
+	int i = 0;
+
+	/* comment line */
+	if (buff[0] == COMMENT_LEAD_CHAR)
+		return 1;
+	/* empty line */
+	while (buff[i] != '\0') {
+		if (!isspace(buff[i]))
+			return 0;
+		i++;
+	}
+	return 1;
+}
+
+#ifdef L3FWDACL_DEBUG
+static inline void
+dump_acl4_rule(struct rte_mbuf *m, uint32_t sig)
+{
+	uint32_t offset = sig & ~ACL_DENY_SIGNATURE;
+	unsigned char a, b, c, d;
+	struct ipv4_hdr *ipv4_hdr = (struct ipv4_hdr *)
+					(rte_pktmbuf_mtod(m, unsigned char *) +
+					sizeof(struct ether_hdr));
+
+	uint32_t_to_char(rte_bswap32(ipv4_hdr->src_addr), &a, &b, &c, &d);
+	printf("Packet Src:%hhu.%hhu.%hhu.%hhu ", a, b, c, d);
+	uint32_t_to_char(rte_bswap32(ipv4_hdr->dst_addr), &a, &b, &c, &d);
+	printf("Dst:%hhu.%hhu.%hhu.%hhu ", a, b, c, d);
+
+	printf("Src port:%hu,Dst port:%hu ",
+			rte_bswap16(*(uint16_t *)(ipv4_hdr + 1)),
+			rte_bswap16(*((uint16_t *)(ipv4_hdr + 1) + 1)));
+	printf("hit ACL %u - ", offset);
+
+	print_one_ipv4_rule(acl_config.rule_ipv4 + offset, 1);
+
+	printf("\n\n");
+}
+
+static inline void
+dump_acl6_rule(struct rte_mbuf *m, uint32_t sig)
+{
+	unsigned i;
+	uint32_t offset = sig & ~ACL_DENY_SIGNATURE;
+	struct ipv6_hdr *ipv6_hdr = (struct ipv6_hdr *)
+					(rte_pktmbuf_mtod(m, unsigned char *) +
+					sizeof(struct ether_hdr));
+
+	printf("Packet Src");
+	for (i = 0; i < RTE_DIM(ipv6_hdr->src_addr); i += sizeof(uint16_t))
+		printf(":%.2x%.2x",
+			ipv6_hdr->src_addr[i], ipv6_hdr->src_addr[i + 1]);
+
+	printf("\nDst");
+	for (i = 0; i < RTE_DIM(ipv6_hdr->dst_addr); i += sizeof(uint16_t))
+		printf(":%.2x%.2x",
+			ipv6_hdr->dst_addr[i], ipv6_hdr->dst_addr[i + 1]);
+
+	printf("\nSrc port:%hu,Dst port:%hu ",
+			rte_bswap16(*(uint16_t *)(ipv6_hdr + 1)),
+			rte_bswap16(*((uint16_t *)(ipv6_hdr + 1) + 1)));
+	printf("hit ACL %u - ", offset);
+
+	print_one_ipv6_rule(acl_config.rule_ipv6 + offset, 1);
+
+	printf("\n\n");
+}
+#endif /* L3FWDACL_DEBUG */
+
+static inline void
+dump_ipv4_rules(struct acl4_rule *rule, int num, int extra)
+{
+	int i;
+
+	for (i = 0; i < num; i++, rule++) {
+		printf("\t%d:", i + 1);
+		print_one_ipv4_rule(rule, extra);
+		printf("\n");
+	}
+}
+
+static inline void
+dump_ipv6_rules(struct acl6_rule *rule, int num, int extra)
+{
+	int i;
+
+	for (i = 0; i < num; i++, rule++) {
+		printf("\t%d:", i + 1);
+		print_one_ipv6_rule(rule, extra);
+		printf("\n");
+	}
+}
+
+#ifdef DO_RFC_1812_CHECKS
+static inline void
+prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
+	int index)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct rte_mbuf *pkt = pkts_in[index];
+
+	int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);
+
+	if (type == PKT_RX_IPV4_HDR) {
+
+		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt,
+			unsigned char *) + sizeof(struct ether_hdr));
+
+		/* Check to make sure the packet is valid (RFC1812) */
+		if (is_valid_ipv4_pkt(ipv4_hdr, pkt->pkt.pkt_len) >= 0) {
+
+			/* Update time to live and header checksum */
+			--(ipv4_hdr->time_to_live);
+			++(ipv4_hdr->hdr_checksum);
+
+			/* Fill acl structure */
+			acl->data_ipv4[acl->num_ipv4] = MBUF_IPV4_2PROTO(pkt);
+			acl->m_ipv4[(acl->num_ipv4)++] = pkt;
+
+		} else {
+			/* Not a valid IPv4 packet */
+			rte_pktmbuf_free(pkt);
+		}
+
+	} else if (type == PKT_RX_IPV6_HDR) {
+
+		/* Fill acl structure */
+		acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
+		acl->m_ipv6[(acl->num_ipv6)++] = pkt;
+
+	} else {
+		/* Unknown type, drop the packet */
+		rte_pktmbuf_free(pkt);
+	}
+}
+
+#else
+static inline void
+prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
+	int index)
+{
+	struct rte_mbuf *pkt = pkts_in[index];
+
+	int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);
+
+	if (type == PKT_RX_IPV4_HDR) {
+
+		/* Fill acl structure */
+		acl->data_ipv4[acl->num_ipv4] = MBUF_IPV4_2PROTO(pkt);
+		acl->m_ipv4[(acl->num_ipv4)++] = pkt;
+
+	} else if (type == PKT_RX_IPV6_HDR) {
+
+		/* Fill acl structure */
+		acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
+		acl->m_ipv6[(acl->num_ipv6)++] = pkt;
+	} else {
+		/* Unknown type, drop the packet */
+		rte_pktmbuf_free(pkt);
+	}
+}
+#endif /* DO_RFC_1812_CHECKS */
+
+static inline void
+prepare_acl_parameter(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
+	int nb_rx)
+{
+	int i;
+
+	acl->num_ipv4 = 0;
+	acl->num_ipv6 = 0;
+
+	/* Prefetch first packets */
+	for (i = 0; i < PREFETCH_OFFSET && i < nb_rx; i++) {
+		rte_prefetch0(rte_pktmbuf_mtod(
+				pkts_in[i], void *));
+	}
+
+	for (i = 0; i < (nb_rx - PREFETCH_OFFSET); i++) {
+		rte_prefetch0(rte_pktmbuf_mtod(pkts_in[
+				i + PREFETCH_OFFSET], void *));
+		prepare_one_packet(pkts_in, acl, i);
+	}
+
+	/* Process left packets */
+	for (; i < nb_rx; i++)
+		prepare_one_packet(pkts_in, acl, i);
+}
+
+static inline void
+send_one_packet(struct rte_mbuf *m, uint32_t res)
+{
+	if (likely((res & ACL_DENY_SIGNATURE) == 0 && res != 0)) {
+		/* forward packets */
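+		/*
+		 * A route match stored userdata as port + FWD_PORT_SHIFT
+		 * in add_rules(), so subtract the shift to recover the
+		 * output port.
+		 */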
+		send_single_packet(m,
+			(uint8_t)(res - FWD_PORT_SHIFT));
+	} else {
+		/* in the ACL list, drop it */
+#ifdef L3FWDACL_DEBUG
+		if ((res & ACL_DENY_SIGNATURE) != 0) {
+			if (m->ol_flags & PKT_RX_IPV4_HDR)
+				dump_acl4_rule(m, res);
+			else
+				dump_acl6_rule(m, res);
+		}
+#endif
+		rte_pktmbuf_free(m);
+	}
+}
+
+
+
+static inline void
+send_packets(struct rte_mbuf **m, uint32_t *res, int num)
+{
+	int i;
+
+	/* Prefetch first packets */
+	for (i = 0; i < PREFETCH_OFFSET && i < num; i++) {
+		rte_prefetch0(rte_pktmbuf_mtod(
+				m[i], void *));
+	}
+
+	for (i = 0; i < (num - PREFETCH_OFFSET); i++) {
+		rte_prefetch0(rte_pktmbuf_mtod(m[
+				i + PREFETCH_OFFSET], void *));
+		send_one_packet(m[i], res[i]);
+	}
+
+	/* Process left packets */
+	for (; i < num; i++) {
+		send_one_packet(m[i], res[i]);
+	}
+
+}
+
+/*
+ * Parses an IPv6 address; expects the following format:
+ * XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX
+ * (where each X is a hexadecimal digit).
+ */
+static int
+parse_ipv6_addr(const char *in, const char **end, uint32_t v[IPV6_ADDR_U32],
+	char dlm)
+{
+	uint32_t addr[IPV6_ADDR_U16];
+
+	GET_CB_FIELD(in, addr[0], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[1], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[2], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[3], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[4], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[5], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[6], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[7], 16, UINT16_MAX, dlm);
+
+	*end = in;
+
+	v[0] = (addr[0] << 16) + addr[1];
+	v[1] = (addr[2] << 16) + addr[3];
+	v[2] = (addr[4] << 16) + addr[5];
+	v[3] = (addr[6] << 16) + addr[7];
+
+	return (0);
+}
+
+static int
+parse_ipv6_net(const char *in, struct rte_acl_field field[4])
+{
+	int32_t rc;
+	const char *mp;
+	uint32_t i, m, v[4];
+	const uint32_t nbu32 = sizeof(uint32_t) * CHAR_BIT;
+
+	/* get address. */
+	if ((rc = parse_ipv6_addr(in, &mp, v, '/')) != 0)
+		return (rc);
+
+	/* get mask. */
+	GET_CB_FIELD(mp, m, 0, CHAR_BIT * sizeof(v), 0);
+
+	/* put all together. */
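+	/*
+	 * Split the IPv6 prefix length across the four 32-bit fields:
+	 * fields fully covered by the prefix get a full 32-bit mask, the
+	 * field holding the boundary gets the remainder, the rest zero.
+	 */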
+	for (i = 0; i != RTE_DIM(v); i++) {
+		if (m >= (i + 1) * nbu32)
+			field[i].mask_range.u32 = nbu32;
+		else
+			field[i].mask_range.u32 = m > (i * nbu32) ?
+				m - (i * nbu32) : 0;
+
+		field[i].value.u32 = v[i];
+	}
+
+	return (0);
+}
+
+static int
+parse_cb_ipv6_rule(char *str, struct rte_acl_rule *v, int has_userdata)
+{
+	int i, rc;
+	char *s, *sp, *in[CB_FLD_NUM];
+	static const char *dlm = " \t\n";
+	int dim = has_userdata ? CB_FLD_NUM : CB_FLD_USERDATA;
+	s = str;
+
+	for (i = 0; i != dim; i++, s = NULL) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+	}
+
+	if ((rc = parse_ipv6_net(in[CB_FLD_SRC_ADDR],
+			v->field + SRC1_FIELD_IPV6)) != 0) {
+		acl_log("failed to read source address/mask: %s\n",
+			in[CB_FLD_SRC_ADDR]);
+		return (rc);
+	}
+
+	if ((rc = parse_ipv6_net(in[CB_FLD_DST_ADDR],
+			v->field + DST1_FIELD_IPV6)) != 0) {
+		acl_log("failed to read destination address/mask: %s\n",
+			in[CB_FLD_DST_ADDR]);
+		return (rc);
+	}
+
+	/* source port. */
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_LOW],
+		v->field[SRCP_FIELD_IPV6].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_HIGH],
+		v->field[SRCP_FIELD_IPV6].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_SRC_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	/* destination port. */
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_LOW],
+		v->field[DSTP_FIELD_IPV6].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_HIGH],
+		v->field[DSTP_FIELD_IPV6].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_DST_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	if (v->field[SRCP_FIELD_IPV6].mask_range.u16
+			< v->field[SRCP_FIELD_IPV6].value.u16
+			|| v->field[DSTP_FIELD_IPV6].mask_range.u16
+			< v->field[DSTP_FIELD_IPV6].value.u16)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV6].value.u8,
+		0, UINT8_MAX, '/');
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV6].mask_range.u8,
+		0, UINT8_MAX, 0);
+
+	if (has_userdata)
+		GET_CB_FIELD(in[CB_FLD_USERDATA], v->data.userdata,
+			0, UINT32_MAX, 0);
+
+	return (0);
+}
+
+/*
+ * Parse ClassBench rules file.
+ * Expected format:
+ * '@'<src_ipv4_addr>'/'<masklen> <space> \
+ * <dst_ipv4_addr>'/'<masklen> <space> \
+ * <src_port_low> <space> ":" <src_port_high> <space> \
+ * <dst_port_low> <space> ":" <dst_port_high> <space> \
+ * <proto>'/'<mask>
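+ *
+ * Example line (hypothetical values):
+ * @192.168.0.36/32 10.0.0.0/8 0 : 65535 80 : 80 6/0xff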
+ */
+static int
+parse_ipv4_net(const char *in, uint32_t *addr, uint32_t *mask_len)
+{
+	uint8_t a, b, c, d, m;
+
+	GET_CB_FIELD(in, a, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, b, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, c, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, d, 0, UINT8_MAX, '/');
+	GET_CB_FIELD(in, m, 0, sizeof(uint32_t) * CHAR_BIT, 0);
+
+	addr[0] = IPv4(a, b, c, d);
+	mask_len[0] = m;
+
+	return (0);
+}
+
+static int
+parse_cb_ipv4vlan_rule(char *str, struct rte_acl_rule *v, int has_userdata)
+{
+	int i, rc;
+	char *s, *sp, *in[CB_FLD_NUM];
+	static const char *dlm = " \t\n";
+	int dim = has_userdata ? CB_FLD_NUM : CB_FLD_USERDATA;
+	s = str;
+	for (i = 0; i != dim; i++, s = NULL) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+	}
+
+	if ((rc = parse_ipv4_net(in[CB_FLD_SRC_ADDR],
+			&v->field[SRC_FIELD_IPV4].value.u32,
+			&v->field[SRC_FIELD_IPV4].mask_range.u32)) != 0) {
+			acl_log("failed to read source address/mask: %s\n",
+			in[CB_FLD_SRC_ADDR]);
+		return (rc);
+	}
+
+	if ((rc = parse_ipv4_net(in[CB_FLD_DST_ADDR],
+			&v->field[DST_FIELD_IPV4].value.u32,
+			&v->field[DST_FIELD_IPV4].mask_range.u32)) != 0) {
+		acl_log("failed to read destination address/mask: %s\n",
+			in[CB_FLD_DST_ADDR]);
+		return (rc);
+	}
+
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_LOW],
+		v->field[SRCP_FIELD_IPV4].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_HIGH],
+		v->field[SRCP_FIELD_IPV4].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_SRC_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_LOW],
+		v->field[DSTP_FIELD_IPV4].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_HIGH],
+		v->field[DSTP_FIELD_IPV4].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_DST_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	if (v->field[SRCP_FIELD_IPV4].mask_range.u16
+			< v->field[SRCP_FIELD_IPV4].value.u16
+			|| v->field[DSTP_FIELD_IPV4].mask_range.u16
+			< v->field[DSTP_FIELD_IPV4].value.u16)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV4].value.u8,
+		0, UINT8_MAX, '/');
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV4].mask_range.u8,
+		0, UINT8_MAX, 0);
+
+	if (has_userdata)
+		GET_CB_FIELD(in[CB_FLD_USERDATA], v->data.userdata, 0,
+			UINT32_MAX, 0);
+
+	return (0);
+}
+
+static int
+add_rules(const char *rule_path,
+		struct rte_acl_rule **proute_base,
+		unsigned int *proute_num,
+		struct rte_acl_rule **pacl_base,
+		unsigned int *pacl_num, uint32_t rule_size,
+		int (*parser)(char *, struct rte_acl_rule*, int))
+{
+	uint8_t *acl_rules, *route_rules;
+	struct rte_acl_rule *next;
+	unsigned int acl_num = 0, route_num = 0, total_num = 0;
+	unsigned int acl_cnt = 0, route_cnt = 0;
+	char buff[LINE_MAX];
+	FILE *fh = fopen(rule_path, "rb");
+	unsigned int i = 0;
+
+	if (fh == NULL)
+		rte_exit(EXIT_FAILURE, "%s: Open %s failed\n", __func__,
+			rule_path);
+
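+	/* First pass: count route and ACL entries to size the arrays. */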
+	while ((fgets(buff, LINE_MAX, fh) != NULL)) {
+		if (buff[0] == ROUTE_LEAD_CHAR)
+			route_num++;
+		else if (buff[0] == ACL_LEAD_CHAR)
+			acl_num++;
+	}
+
+	if (0 == route_num)
+		rte_exit(EXIT_FAILURE, "Did not find any route entries in %s!\n",
+				rule_path);
+
+	fseek(fh, 0, SEEK_SET);
+
+	acl_rules = (uint8_t *)calloc(acl_num, rule_size);
+
+	if (NULL == acl_rules)
+		rte_exit(EXIT_FAILURE, "%s: failed to allocate memory\n",
+			__func__);
+
+	route_rules = (uint8_t *)calloc(route_num, rule_size);
+
+	if (NULL == route_rules)
+		rte_exit(EXIT_FAILURE, "%s: failed to allocate memory\n",
+			__func__);
+
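+	/* Second pass: parse each line into the pre-allocated rule arrays. */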
+	i = 0;
+	while (fgets(buff, LINE_MAX, fh) != NULL) {
+		i++;
+
+		if (is_bypass_line(buff))
+			continue;
+
+		char s = buff[0];
+
+		/* Route entry */
+		if (s == ROUTE_LEAD_CHAR)
+			next = (struct rte_acl_rule *)(route_rules +
+				route_cnt * rule_size);
+
+		/* ACL entry */
+		else if (s == ACL_LEAD_CHAR)
+			next = (struct rte_acl_rule *)(acl_rules +
+				acl_cnt * rule_size);
+
+		/* Illegal line */
+		else
+			rte_exit(EXIT_FAILURE,
+				"%s Line %u: should start with leading "
+				"char %c or %c\n",
+				rule_path, i, ROUTE_LEAD_CHAR, ACL_LEAD_CHAR);
+
+		if (parser(buff + 1, next, s == ROUTE_LEAD_CHAR) != 0)
+			rte_exit(EXIT_FAILURE,
+				"%s Line %u: parse rules error\n",
+				rule_path, i);
+
+		if (s == ROUTE_LEAD_CHAR) {
+			/* Check the forwarding port number */
+			if ((enabled_port_mask & (1 << next->data.userdata)) ==
+					0)
+				rte_exit(EXIT_FAILURE,
+					"%s Line %u: fwd number illegal:%u\n",
+					rule_path, i, next->data.userdata);
+			next->data.userdata += FWD_PORT_SHIFT;
+			route_cnt++;
+		} else {
+			next->data.userdata = ACL_DENY_SIGNATURE + acl_cnt;
+			acl_cnt++;
+		}
+
+		next->data.priority = RTE_ACL_MAX_PRIORITY - total_num;
+		next->data.category_mask = -1;
+		total_num++;
+	}
+
+	fclose(fh);
+
+	*pacl_base = (struct rte_acl_rule *)acl_rules;
+	*pacl_num = acl_num;
+	*proute_base = (struct rte_acl_rule *)route_rules;
+	*proute_num = route_cnt;
+
+	return 0;
+}
+
+static void
+dump_acl_config(void)
+{
+	printf("ACL options are:\n");
+	printf(OPTION_RULE_IPV4": %s\n", parm_config.rule_ipv4_name);
+	printf(OPTION_RULE_IPV6": %s\n", parm_config.rule_ipv6_name);
+	printf(OPTION_SCALAR": %d\n", parm_config.scalar);
+}
+
+static int
+check_acl_config(void)
+{
+	if (parm_config.rule_ipv4_name == NULL) {
+		acl_log("ACL IPv4 rule file not specified\n");
+		return -1;
+	} else if (parm_config.rule_ipv6_name == NULL) {
+		acl_log("ACL IPv6 rule file not specified\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static struct rte_acl_ctx*
+setup_acl(struct rte_acl_rule *route_base,
+		struct rte_acl_rule *acl_base, unsigned int route_num,
+		unsigned int acl_num, int ipv6, int socketid)
+{
+	char name[PATH_MAX];
+	struct rte_acl_param acl_param;
+	struct rte_acl_config acl_build_param;
+	struct rte_acl_ctx *context;
+	int dim = ipv6 ? RTE_DIM(ipv6_defs) : RTE_DIM(ipv4_defs);
+
+	/* Create ACL contexts */
+	rte_snprintf(name, sizeof(name), "%s%d",
+			ipv6 ? L3FWD_ACL_IPV6_NAME : L3FWD_ACL_IPV4_NAME,
+			socketid);
+
+	acl_param.name = name;
+	acl_param.socket_id = socketid;
+	acl_param.rule_size = RTE_ACL_RULE_SZ(dim);
+	acl_param.max_rule_num = MAX_ACL_RULE_NUM;
+
+	if ((context = rte_acl_create(&acl_param)) == NULL)
+		rte_exit(EXIT_FAILURE, "Failed to create ACL context\n");
+
+	if (rte_acl_add_rules(context, route_base, route_num) < 0)
+		rte_exit(EXIT_FAILURE, "add rules failed\n");
+
+	if (rte_acl_add_rules(context, acl_base, acl_num) < 0)
+		rte_exit(EXIT_FAILURE, "add rules failed\n");
+
+	/* Perform builds */
+	acl_build_param.num_categories = DEFAULT_MAX_CATEGORIES;
+
+	acl_build_param.num_fields = dim;
+	memcpy(&acl_build_param.defs, ipv6 ? ipv6_defs : ipv4_defs,
+		ipv6 ? sizeof(ipv6_defs) : sizeof(ipv4_defs));
+
+	if (rte_acl_build(context, &acl_build_param) != 0)
+		rte_exit(EXIT_FAILURE, "Failed to build ACL trie\n");
+
+	rte_acl_dump(context);
+
+	return context;
+}
+
+static int
+app_acl_init(void)
+{
+	unsigned lcore_id;
+	unsigned int i;
+	int socketid;
+	struct rte_acl_rule *acl_base_ipv4, *route_base_ipv4,
+		*acl_base_ipv6, *route_base_ipv6;
+	unsigned int acl_num_ipv4 = 0, route_num_ipv4 = 0,
+		acl_num_ipv6 = 0, route_num_ipv6 = 0;
+
+	if (check_acl_config() != 0)
+		rte_exit(EXIT_FAILURE, "Failed to get valid ACL options\n");
+
+	dump_acl_config();
+
+	/* Load  rules from the input file */
+	if (add_rules(parm_config.rule_ipv4_name, &route_base_ipv4,
+			&route_num_ipv4, &acl_base_ipv4, &acl_num_ipv4,
+			sizeof(struct acl4_rule), &parse_cb_ipv4vlan_rule) < 0)
+		rte_exit(EXIT_FAILURE, "Failed to add rules\n");
+
+	acl_log("IPv4 Route entries %u:\n", route_num_ipv4);
+	dump_ipv4_rules((struct acl4_rule *)route_base_ipv4, route_num_ipv4, 1);
+
+	acl_log("IPv4 ACL entries %u:\n", acl_num_ipv4);
+	dump_ipv4_rules((struct acl4_rule *)acl_base_ipv4, acl_num_ipv4, 1);
+
+	if (add_rules(parm_config.rule_ipv6_name, &route_base_ipv6,
+			&route_num_ipv6,
+			&acl_base_ipv6, &acl_num_ipv6,
+			sizeof(struct acl6_rule), &parse_cb_ipv6_rule) < 0)
+		rte_exit(EXIT_FAILURE, "Failed to add rules\n");
+
+	acl_log("IPv6 Route entries %u:\n", route_num_ipv6);
+	dump_ipv6_rules((struct acl6_rule *)route_base_ipv6, route_num_ipv6, 1);
+
+	acl_log("IPv6 ACL entries %u:\n", acl_num_ipv6);
+	dump_ipv6_rules((struct acl6_rule *)acl_base_ipv6, acl_num_ipv6, 1);
+
+	memset(&acl_config, 0, sizeof(acl_config));
+
+	/* Check sockets a context should be created on */
+	if (!numa_on)
+		acl_config.mapped[0] = 1;
+	else {
+		for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+			if (rte_lcore_is_enabled(lcore_id) == 0)
+				continue;
+
+			socketid = rte_lcore_to_socket_id(lcore_id);
+			if (socketid >= NB_SOCKETS) {
+				acl_log("Socket %d of lcore %u is out "
+					"of range %d\n",
+					socketid, lcore_id, NB_SOCKETS);
+				return -1;
+			}
+
+			acl_config.mapped[socketid] = 1;
+		}
+	}
+
+	for (i = 0; i < NB_SOCKETS; i++) {
+		if (acl_config.mapped[i]) {
+			acl_config.acx_ipv4[i] = setup_acl(route_base_ipv4,
+				acl_base_ipv4, route_num_ipv4, acl_num_ipv4,
+				0, i);
+
+			acl_config.acx_ipv6[i] = setup_acl(route_base_ipv6,
+				acl_base_ipv6, route_num_ipv6, acl_num_ipv6,
+				1, i);
+		}
+	}
+
+	free(route_base_ipv4);
+	free(route_base_ipv6);
+
+#ifdef L3FWDACL_DEBUG
+	acl_config.rule_ipv4 = (struct acl4_rule *)acl_base_ipv4;
+	acl_config.rule_ipv6 = (struct acl6_rule *)acl_base_ipv6;
+#else
+	free(acl_base_ipv4);
+	free(acl_base_ipv6);
+#endif
+
+	return 0;
+}
+
+/***********************end of ACL part******************************/
+
+struct lcore_conf {
+	uint16_t n_rx_queue;
+	struct lcore_rx_queue rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
+	uint16_t tx_queue_id[RTE_MAX_ETHPORTS];
+	struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+} __rte_cache_aligned;
+
+static struct lcore_conf lcore_conf[RTE_MAX_LCORE];
+
+/* Send burst of packets on an output interface */
+static inline int
+send_burst(struct lcore_conf *qconf, uint16_t n, uint8_t port)
+{
+	struct rte_mbuf **m_table;
+	int ret;
+	uint16_t queueid;
+
+	queueid = qconf->tx_queue_id[port];
+	m_table = (struct rte_mbuf **)qconf->tx_mbufs[port].m_table;
+
+	ret = rte_eth_tx_burst(port, queueid, m_table, n);
+	if (unlikely(ret < n)) {
+		do {
+			rte_pktmbuf_free(m_table[ret]);
+		} while (++ret < n);
+	}
+
+	return 0;
+}
+
+/* Enqueue a single packet, and send burst if queue is filled */
+static inline int
+send_single_packet(struct rte_mbuf *m, uint8_t port)
+{
+	uint32_t lcore_id;
+	uint16_t len;
+	struct lcore_conf *qconf;
+
+	lcore_id = rte_lcore_id();
+
+	qconf = &lcore_conf[lcore_id];
+	len = qconf->tx_mbufs[port].len;
+	qconf->tx_mbufs[port].m_table[len] = m;
+	len++;
+
+	/* enough pkts to be sent */
+	if (unlikely(len == MAX_PKT_BURST)) {
+		send_burst(qconf, MAX_PKT_BURST, port);
+		len = 0;
+	}
+
+	qconf->tx_mbufs[port].len = len;
+	return 0;
+}
+
+#ifdef DO_RFC_1812_CHECKS
+static inline int
+is_valid_ipv4_pkt(struct ipv4_hdr *pkt, uint32_t link_len)
+{
+	/* From http://www.rfc-editor.org/rfc/rfc1812.txt section 5.2.2 */
+	/*
+	 * 1. The packet length reported by the Link Layer must be large
+	 * enough to hold the minimum length legal IP datagram (20 bytes).
+	 */
+	if (link_len < sizeof(struct ipv4_hdr))
+		return -1;
+
+	/* 2. The IP checksum must be correct. */
+	/* this is checked in H/W */
+
+	/*
+	 * 3. The IP version number must be 4. If the version number is not 4
+	 * then the packet may be another version of IP, such as IPng or
+	 * ST-II.
+	 */
+	if (((pkt->version_ihl) >> 4) != 4)
+		return -3;
+	/*
+	 * 4. The IP header length field must be large enough to hold the
+	 * minimum length legal IP datagram (20 bytes = 5 words).
+	 */
+	if ((pkt->version_ihl & 0xf) < 5)
+		return -4;
+
+	/*
+	 * 5. The IP total length field must be large enough to hold the IP
+	 * datagram header, whose length is specified in the IP header length
+	 * field.
+	 */
+	if (rte_cpu_to_be_16(pkt->total_length) < sizeof(struct ipv4_hdr))
+		return -5;
+
+	return 0;
+}
+#endif
+
+/* main processing loop */
+static int
+main_loop(__attribute__((unused)) void *dummy)
+{
+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	unsigned lcore_id;
+	uint64_t prev_tsc, diff_tsc, cur_tsc;
+	int i, nb_rx;
+	uint8_t portid, queueid;
+	struct lcore_conf *qconf;
+	int socketid;
+	const uint64_t drain_tsc = (rte_get_tsc_hz() + US_PER_S - 1)
+			/ US_PER_S * BURST_TX_DRAIN_US;
+	int scalar = parm_config.scalar;
+
+	prev_tsc = 0;
+
+	lcore_id = rte_lcore_id();
+	qconf = &lcore_conf[lcore_id];
+	socketid = rte_lcore_to_socket_id(lcore_id);
+
+	if (qconf->n_rx_queue == 0) {
+		RTE_LOG(INFO, L3FWD, "lcore %u has nothing to do\n", lcore_id);
+		return 0;
+	}
+
+	RTE_LOG(INFO, L3FWD, "entering main loop on lcore %u\n", lcore_id);
+
+	for (i = 0; i < qconf->n_rx_queue; i++) {
+
+		portid = qconf->rx_queue_list[i].port_id;
+		queueid = qconf->rx_queue_list[i].queue_id;
+		RTE_LOG(INFO, L3FWD,
+			" -- lcoreid=%u portid=%hhu rxqueueid=%hhu\n",
+			lcore_id, portid, queueid);
+	}
+
+	while (1) {
+
+		cur_tsc = rte_rdtsc();
+
+		/*
+		 * TX burst queue drain
+		 */
+		diff_tsc = cur_tsc - prev_tsc;
+		if (unlikely(diff_tsc > drain_tsc)) {
+
+			/*
+			 * This could be optimized (use queueid instead of
+			 * portid), but it is not called so often
+			 */
+			for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+				if (qconf->tx_mbufs[portid].len == 0)
+					continue;
+				send_burst(&lcore_conf[lcore_id],
+					qconf->tx_mbufs[portid].len,
+					portid);
+				qconf->tx_mbufs[portid].len = 0;
+			}
+
+			prev_tsc = cur_tsc;
+		}
+
+		/*
+		 * Read packet from RX queues
+		 */
+		for (i = 0; i < qconf->n_rx_queue; ++i) {
+
+			portid = qconf->rx_queue_list[i].port_id;
+			queueid = qconf->rx_queue_list[i].queue_id;
+			nb_rx = rte_eth_rx_burst(portid, queueid,
+				pkts_burst, MAX_PKT_BURST);
+
+			if (nb_rx > 0) {
+				struct acl_search_t acl_search;
+
+				prepare_acl_parameter(pkts_burst, &acl_search,
+					nb_rx);
+
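+				/*
+				 * CLASSIFY is defined earlier in this file
+				 * (not shown in this hunk); presumably it
+				 * dispatches to the scalar or SSE
+				 * rte_acl_classify variant based on the
+				 * scalar flag read above.
+				 */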
+				if (acl_search.num_ipv4) {
+					CLASSIFY(acl_config.acx_ipv4[socketid],
+						acl_search.data_ipv4,
+						acl_search.res_ipv4,
+						acl_search.num_ipv4,
+						DEFAULT_MAX_CATEGORIES);
+
+					send_packets(acl_search.m_ipv4,
+						acl_search.res_ipv4,
+						acl_search.num_ipv4);
+				}
+
+				if (acl_search.num_ipv6) {
+					CLASSIFY(acl_config.acx_ipv6[socketid],
+						acl_search.data_ipv6,
+						acl_search.res_ipv6,
+						acl_search.num_ipv6,
+						DEFAULT_MAX_CATEGORIES);
+
+					send_packets(acl_search.m_ipv6,
+						acl_search.res_ipv6,
+						acl_search.num_ipv6);
+				}
+			}
+		}
+	}
+}
+
+static int
+check_lcore_params(void)
+{
+	uint8_t queue, lcore;
+	uint16_t i;
+	int socketid;
+
+	for (i = 0; i < nb_lcore_params; ++i) {
+		queue = lcore_params[i].queue_id;
+		if (queue >= MAX_RX_QUEUE_PER_PORT) {
+			printf("invalid queue number: %hhu\n", queue);
+			return -1;
+		}
+		lcore = lcore_params[i].lcore_id;
+		if (!rte_lcore_is_enabled(lcore)) {
+			printf("error: lcore %hhu is not enabled in "
+				"lcore mask\n", lcore);
+			return -1;
+		}
+		socketid = rte_lcore_to_socket_id(lcore);
+		if (socketid != 0 && numa_on == 0) {
+			printf("warning: lcore %hhu is on socket %d "
+				"with numa off\n",
+				lcore, socketid);
+		}
+	}
+	return 0;
+}
+
+static int
+check_port_config(const unsigned nb_ports)
+{
+	unsigned portid;
+	uint16_t i;
+
+	for (i = 0; i < nb_lcore_params; ++i) {
+		portid = lcore_params[i].port_id;
+
+		if ((enabled_port_mask & (1 << portid)) == 0) {
+			printf("port %u is not enabled in port mask\n", portid);
+			return -1;
+		}
+		if (portid >= nb_ports) {
+			printf("port %u is not present on the board\n", portid);
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static uint8_t
+get_port_n_rx_queues(const uint8_t port)
+{
+	int queue = -1;
+	uint16_t i;
+
+	for (i = 0; i < nb_lcore_params; ++i) {
+		if (lcore_params[i].port_id == port &&
+				lcore_params[i].queue_id > queue)
+			queue = lcore_params[i].queue_id;
+	}
+	return (uint8_t)(++queue);
+}
+
+static int
+init_lcore_rx_queues(void)
+{
+	uint16_t i, nb_rx_queue;
+	uint8_t lcore;
+
+	for (i = 0; i < nb_lcore_params; ++i) {
+		lcore = lcore_params[i].lcore_id;
+		nb_rx_queue = lcore_conf[lcore].n_rx_queue;
+		if (nb_rx_queue >= MAX_RX_QUEUE_PER_LCORE) {
+			printf("error: too many queues (%u) for lcore: %u\n",
+				(unsigned)nb_rx_queue + 1, (unsigned)lcore);
+			return -1;
+		} else {
+			lcore_conf[lcore].rx_queue_list[nb_rx_queue].port_id =
+				lcore_params[i].port_id;
+			lcore_conf[lcore].rx_queue_list[nb_rx_queue].queue_id =
+				lcore_params[i].queue_id;
+			lcore_conf[lcore].n_rx_queue++;
+		}
+	}
+	return 0;
+}
+
+/* display usage */
+static void
+print_usage(const char *prgname)
+{
+	printf("%s [EAL options] -- -p PORTMASK -P "
+		"--"OPTION_RULE_IPV4"=FILE "
+		"--"OPTION_RULE_IPV6"=FILE"
+		"  [--"OPTION_CONFIG" (port,queue,lcore)[,(port,queue,lcore)]]"
+		"  [--"OPTION_ENBJMO" [--max-pkt-len PKTLEN]]\n"
+		"  -p PORTMASK: hexadecimal bitmask of ports to configure\n"
+		"  -P: enable promiscuous mode\n"
+		"  --"OPTION_CONFIG": (port,queue,lcore): "
+		"rx queues configuration\n"
+		"  --"OPTION_NONUMA": optional, disable numa awareness\n"
+		"  --"OPTION_ENBJMO": enable jumbo frames; the max packet "
+		"length PKTLEN is given in decimal (64-9600)\n"
+		"  --"OPTION_RULE_IPV4"=FILE: specify the ipv4 rule entries "
+		"file. "
+		"Each rule occupies one line. "
+		"Two kinds of rules are supported: "
+		"an ACL entry, whose line leads with character '%c', "
+		"and a route entry, whose line leads with "
+		"character '%c'.\n"
+		"  --"OPTION_RULE_IPV6"=FILE: specify the ipv6 rule "
+		"entries file.\n"
+		"  --"OPTION_SCALAR": use the scalar function to do lookup\n",
+		prgname, ACL_LEAD_CHAR, ROUTE_LEAD_CHAR);
+}
+
+static int
+parse_max_pkt_len(const char *pktlen)
+{
+	char *end = NULL;
+	unsigned long len;
+
+	/* parse decimal string */
+	len = strtoul(pktlen, &end, 10);
+	if ((pktlen[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return -1;
+
+	if (len == 0)
+		return -1;
+
+	return len;
+}
+
+static int
+parse_portmask(const char *portmask)
+{
+	char *end = NULL;
+	unsigned long pm;
+
+	/* parse hexadecimal string */
+	pm = strtoul(portmask, &end, 16);
+	if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return -1;
+
+	if (pm == 0)
+		return -1;
+
+	return pm;
+}
+
+static int
+parse_config(const char *q_arg)
+{
+	char s[256];
+	const char *p, *p0 = q_arg;
+	char *end;
+	enum fieldnames {
+		FLD_PORT = 0,
+		FLD_QUEUE,
+		FLD_LCORE,
+		_NUM_FLD
+	};
+	unsigned long int_fld[_NUM_FLD];
+	char *str_fld[_NUM_FLD];
+	int i;
+	unsigned size;
+
+	nb_lcore_params = 0;
+
+	while ((p = strchr(p0, '(')) != NULL) {
+		++p;
+		if ((p0 = strchr(p, ')')) == NULL)
+			return -1;
+
+		size = p0 - p;
+		if (size >= sizeof(s))
+			return -1;
+
+		rte_snprintf(s, sizeof(s), "%.*s", size, p);
+		if (rte_strsplit(s, sizeof(s), str_fld, _NUM_FLD, ',') !=
+				_NUM_FLD)
+			return -1;
+		for (i = 0; i < _NUM_FLD; i++) {
+			errno = 0;
+			int_fld[i] = strtoul(str_fld[i], &end, 0);
+			if (errno != 0 || end == str_fld[i] || int_fld[i] > 255)
+				return -1;
+		}
+		if (nb_lcore_params >= MAX_LCORE_PARAMS) {
+			printf("exceeded max number of lcore params: %hu\n",
+				nb_lcore_params);
+			return -1;
+		}
+		lcore_params_array[nb_lcore_params].port_id =
+			(uint8_t)int_fld[FLD_PORT];
+		lcore_params_array[nb_lcore_params].queue_id =
+			(uint8_t)int_fld[FLD_QUEUE];
+		lcore_params_array[nb_lcore_params].lcore_id =
+			(uint8_t)int_fld[FLD_LCORE];
+		++nb_lcore_params;
+	}
+	lcore_params = lcore_params_array;
+	return 0;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+parse_args(int argc, char **argv)
+{
+	int opt, ret;
+	char **argvopt;
+	int option_index;
+	char *prgname = argv[0];
+	static struct option lgopts[] = {
+		{OPTION_CONFIG, 1, 0, 0},
+		{OPTION_NONUMA, 0, 0, 0},
+		{OPTION_ENBJMO, 0, 0, 0},
+		{OPTION_RULE_IPV4, 1, 0, 0},
+		{OPTION_RULE_IPV6, 1, 0, 0},
+		{OPTION_SCALAR, 0, 0, 0},
+		{NULL, 0, 0, 0}
+	};
+
+	argvopt = argv;
+
+	while ((opt = getopt_long(argc, argvopt, "p:P",
+				lgopts, &option_index)) != EOF) {
+
+		switch (opt) {
+		/* portmask */
+		case 'p':
+			enabled_port_mask = parse_portmask(optarg);
+			if (enabled_port_mask == 0) {
+				printf("invalid portmask\n");
+				print_usage(prgname);
+				return -1;
+			}
+			break;
+		case 'P':
+			printf("Promiscuous mode selected\n");
+			promiscuous_on = 1;
+			break;
+
+		/* long options */
+		case 0:
+			if (!strncmp(lgopts[option_index].name,
+					OPTION_CONFIG,
+					sizeof(OPTION_CONFIG))) {
+				ret = parse_config(optarg);
+				if (ret) {
+					printf("invalid config\n");
+					print_usage(prgname);
+					return -1;
+				}
+			}
+
+			if (!strncmp(lgopts[option_index].name,
+					OPTION_NONUMA,
+					sizeof(OPTION_NONUMA))) {
+				printf("numa is disabled\n");
+				numa_on = 0;
+			}
+
+			if (!strncmp(lgopts[option_index].name,
+					OPTION_ENBJMO, sizeof(OPTION_ENBJMO))) {
+				struct option lenopts = {
+					"max-pkt-len",
+					required_argument,
+					0,
+					0
+				};
+
+				printf("jumbo frame is enabled\n");
+				port_conf.rxmode.jumbo_frame = 1;
+
+				/*
+				 * if no max-pkt-len set, then use the
+				 * default value ETHER_MAX_LEN
+				 */
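+				/*
+				 * Re-invoking getopt_long() with a
+				 * one-entry option table consumes the
+				 * optional "--max-pkt-len" argument that
+				 * may follow on the same command line.
+				 */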
+				if (0 == getopt_long(argc, argvopt, "",
+						&lenopts, &option_index)) {
+					ret = parse_max_pkt_len(optarg);
+					if ((ret < 64) ||
+						(ret > MAX_JUMBO_PKT_LEN)) {
+						printf("invalid packet "
+							"length\n");
+						print_usage(prgname);
+						return -1;
+					}
+					port_conf.rxmode.max_rx_pkt_len = ret;
+				}
+				printf("set jumbo frame max packet length "
+					"to %u\n",
+					(unsigned int)
+					port_conf.rxmode.max_rx_pkt_len);
+			}
+
+			if (!strncmp(lgopts[option_index].name,
+					OPTION_RULE_IPV4,
+					sizeof(OPTION_RULE_IPV4)))
+				parm_config.rule_ipv4_name = optarg;
+
+			if (!strncmp(lgopts[option_index].name,
+					OPTION_RULE_IPV6,
+					sizeof(OPTION_RULE_IPV6))) {
+				parm_config.rule_ipv6_name = optarg;
+			}
+
+			if (!strncmp(lgopts[option_index].name,
+					OPTION_SCALAR, sizeof(OPTION_SCALAR)))
+				parm_config.scalar = 1;
+
+			break;
+
+		default:
+			print_usage(prgname);
+			return -1;
+		}
+	}
+
+	if (optind >= 0)
+		argv[optind-1] = prgname;
+
+	ret = optind-1;
+	optind = 0; /* reset getopt lib */
+	return ret;
+}
+
+static void
+print_ethaddr(const char *name, const struct ether_addr *eth_addr)
+{
+	printf("%s%02X:%02X:%02X:%02X:%02X:%02X", name,
+		eth_addr->addr_bytes[0],
+		eth_addr->addr_bytes[1],
+		eth_addr->addr_bytes[2],
+		eth_addr->addr_bytes[3],
+		eth_addr->addr_bytes[4],
+		eth_addr->addr_bytes[5]);
+}
+
+static int
+init_mem(unsigned nb_mbuf)
+{
+	int socketid;
+	unsigned lcore_id;
+	char s[64];
+
+	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		if (rte_lcore_is_enabled(lcore_id) == 0)
+			continue;
+
+		if (numa_on)
+			socketid = rte_lcore_to_socket_id(lcore_id);
+		else
+			socketid = 0;
+
+		if (socketid >= NB_SOCKETS) {
+			rte_exit(EXIT_FAILURE,
+				"Socket %d of lcore %u is out of range %d\n",
+				socketid, lcore_id, NB_SOCKETS);
+		}
+		if (pktmbuf_pool[socketid] == NULL) {
+			rte_snprintf(s, sizeof(s), "mbuf_pool_%d", socketid);
+			pktmbuf_pool[socketid] =
+				rte_mempool_create(s, nb_mbuf, MBUF_SIZE,
+					MEMPOOL_CACHE_SIZE,
+					sizeof(struct rte_pktmbuf_pool_private),
+					rte_pktmbuf_pool_init, NULL,
+					rte_pktmbuf_init, NULL,
+					socketid, 0);
+			if (pktmbuf_pool[socketid] == NULL)
+				rte_exit(EXIT_FAILURE,
+					"Cannot init mbuf pool on socket %d\n",
+					socketid);
+			else
+				printf("Allocated mbuf pool on socket %d\n",
+					socketid);
+		}
+	}
+	return 0;
+}
+
+/* Check the link status of all ports in up to 9s, and print them finally */
+static void
+check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+	uint8_t portid, count, all_ports_up, print_flag = 0;
+	struct rte_eth_link link;
+
+	printf("\nChecking link status");
+	fflush(stdout);
+	for (count = 0; count <= MAX_CHECK_TIME; count++) {
+		all_ports_up = 1;
+		for (portid = 0; portid < port_num; portid++) {
+			if ((port_mask & (1 << portid)) == 0)
+				continue;
+			memset(&link, 0, sizeof(link));
+			rte_eth_link_get_nowait(portid, &link);
+			/* print link status if flag set */
+			if (print_flag == 1) {
+				if (link.link_status)
+					printf("Port %d Link Up - speed %u "
+						"Mbps - %s\n", (uint8_t)portid,
+						(unsigned)link.link_speed,
+				(link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+					("full-duplex") : ("half-duplex"));
+				else
+					printf("Port %d Link Down\n",
+						(uint8_t)portid);
+				continue;
+			}
+			/* clear all_ports_up flag if any link down */
+			if (link.link_status == 0) {
+				all_ports_up = 0;
+				break;
+			}
+		}
+		/* after finally printing all link status, get out */
+		if (print_flag == 1)
+			break;
+
+		if (all_ports_up == 0) {
+			printf(".");
+			fflush(stdout);
+			rte_delay_ms(CHECK_INTERVAL);
+		}
+
+		/* set the print_flag if all ports up or timeout */
+		if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+			print_flag = 1;
+			printf("done\n");
+		}
+	}
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	struct lcore_conf *qconf;
+	int ret;
+	unsigned nb_ports;
+	uint16_t queueid;
+	unsigned lcore_id;
+	uint32_t n_tx_queue, nb_lcores;
+	uint8_t portid, nb_rx_queue, queue, socketid;
+
+	/* init EAL */
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid EAL parameters\n");
+	argc -= ret;
+	argv += ret;
+
+	/* parse application arguments (after the EAL ones) */
+	ret = parse_args(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid L3FWD parameters\n");
+
+	if (check_lcore_params() < 0)
+		rte_exit(EXIT_FAILURE, "check_lcore_params failed\n");
+
+	ret = init_lcore_rx_queues();
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "init_lcore_rx_queues failed\n");
+
+	if (rte_eal_pci_probe() < 0)
+		rte_exit(EXIT_FAILURE, "Cannot probe PCI\n");
+
+	nb_ports = rte_eth_dev_count();
+	if (nb_ports > RTE_MAX_ETHPORTS)
+		nb_ports = RTE_MAX_ETHPORTS;
+
+	if (check_port_config(nb_ports) < 0)
+		rte_exit(EXIT_FAILURE, "check_port_config failed\n");
+
+	/* Add ACL rules and route entries, build trie */
+	if (app_acl_init() < 0)
+		rte_exit(EXIT_FAILURE, "app_acl_init failed\n");
+
+	nb_lcores = rte_lcore_count();
+
+	/* initialize all ports */
+	for (portid = 0; portid < nb_ports; portid++) {
+		/* skip ports that are not enabled */
+		if ((enabled_port_mask & (1 << portid)) == 0) {
+			printf("\nSkipping disabled port %d\n", portid);
+			continue;
+		}
+
+		/* init port */
+		printf("Initializing port %d ... ", portid);
+		fflush(stdout);
+
+		nb_rx_queue = get_port_n_rx_queues(portid);
+		n_tx_queue = nb_lcores;
+		if (n_tx_queue > MAX_TX_QUEUE_PER_PORT)
+			n_tx_queue = MAX_TX_QUEUE_PER_PORT;
+		printf("Creating queues: nb_rxq=%d nb_txq=%u... ",
+			nb_rx_queue, (unsigned)n_tx_queue);
+		ret = rte_eth_dev_configure(portid, nb_rx_queue,
+					(uint16_t)n_tx_queue, &port_conf);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE,
+				"Cannot configure device: err=%d, port=%d\n",
+				ret, portid);
+
+		rte_eth_macaddr_get(portid, &ports_eth_addr[portid]);
+		print_ethaddr(" Address:", &ports_eth_addr[portid]);
+		printf(", ");
+
+		/* init memory */
+		ret = init_mem(NB_MBUF);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "init_mem failed\n");
+
+		/* init one TX queue per couple (lcore,port) */
+		queueid = 0;
+		for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+			if (rte_lcore_is_enabled(lcore_id) == 0)
+				continue;
+
+			if (numa_on)
+				socketid = (uint8_t)
+					rte_lcore_to_socket_id(lcore_id);
+			else
+				socketid = 0;
+
+			printf("txq=%u,%d,%d ", lcore_id, queueid, socketid);
+			fflush(stdout);
+			ret = rte_eth_tx_queue_setup(portid, queueid, nb_txd,
+						     socketid, &tx_conf);
+			if (ret < 0)
+				rte_exit(EXIT_FAILURE,
+					"rte_eth_tx_queue_setup: err=%d, "
+					"port=%d\n", ret, portid);
+
+			qconf = &lcore_conf[lcore_id];
+			qconf->tx_queue_id[portid] = queueid;
+			queueid++;
+		}
+		printf("\n");
+	}
+
+	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		if (rte_lcore_is_enabled(lcore_id) == 0)
+			continue;
+		qconf = &lcore_conf[lcore_id];
+		printf("\nInitializing rx queues on lcore %u ... ", lcore_id);
+		fflush(stdout);
+		/* init RX queues */
+		for (queue = 0; queue < qconf->n_rx_queue; ++queue) {
+			portid = qconf->rx_queue_list[queue].port_id;
+			queueid = qconf->rx_queue_list[queue].queue_id;
+
+			if (numa_on)
+				socketid = (uint8_t)
+					rte_lcore_to_socket_id(lcore_id);
+			else
+				socketid = 0;
+
+			printf("rxq=%d,%d,%d ", portid, queueid, socketid);
+			fflush(stdout);
+
+			ret = rte_eth_rx_queue_setup(portid, queueid, nb_rxd,
+					socketid, &rx_conf,
+					pktmbuf_pool[socketid]);
+			if (ret < 0)
+				rte_exit(EXIT_FAILURE,
+					"rte_eth_rx_queue_setup: err=%d,"
+					"port=%d\n", ret, portid);
+		}
+	}
+
+	printf("\n");
+
+	/* start ports */
+	for (portid = 0; portid < nb_ports; portid++) {
+		if ((enabled_port_mask & (1 << portid)) == 0) {
+			continue;
+		}
+		/* Start device */
+		ret = rte_eth_dev_start(portid);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE,
+				"rte_eth_dev_start: err=%d, port=%d\n",
+				ret, portid);
+
+		/*
+		 * If enabled, put device in promiscuous mode.
+		 * This allows IO forwarding mode to forward packets
+		 * to itself through 2 cross-connected  ports of the
+		 * target machine.
+		 */
+		if (promiscuous_on)
+			rte_eth_promiscuous_enable(portid);
+	}
+
+	check_all_ports_link_status((uint8_t)nb_ports, enabled_port_mask);
+
+	/* launch per-lcore init on every lcore */
+	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
+	RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+		if (rte_eal_wait_lcore(lcore_id) < 0)
+			return -1;
+	}
+
+	return 0;
+}
diff --git a/examples/l3fwd-acl/main.h b/examples/l3fwd-acl/main.h
new file mode 100644
index 0000000..f54938b
--- /dev/null
+++ b/examples/l3fwd-acl/main.h
@@ -0,0 +1,45 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [dpdk-dev] [PATCHv3 5/5] acl: add doxygen configuration and start page
  2014-06-13 11:26 [dpdk-dev] [PATCHv3 0/5] ACL library Konstantin Ananyev
                   ` (3 preceding siblings ...)
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 4/5] acl: New sample l3fwd-acl Konstantin Ananyev
@ 2014-06-13 11:26 ` Konstantin Ananyev
  2014-06-13 11:56 ` [dpdk-dev] [PATCHv3 0/5] ACL library Thomas Monjalon
  5 siblings, 0 replies; 11+ messages in thread
From: Konstantin Ananyev @ 2014-06-13 11:26 UTC (permalink / raw)
  To: dev, dev

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 doc/doxy-api-index.md |    3 ++-
 doc/doxy-api.conf     |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/doc/doxy-api-index.md b/doc/doxy-api-index.md
index 6e75a6e..83303a1 100644
--- a/doc/doxy-api-index.md
+++ b/doc/doxy-api-index.md
@@ -78,7 +78,8 @@ There are many libraries, so their headers may be grouped by topics:
   [SCTP]               (@ref rte_sctp.h),
   [TCP]                (@ref rte_tcp.h),
   [UDP]                (@ref rte_udp.h),
-  [LPM route]          (@ref rte_lpm.h)
+  [LPM route]          (@ref rte_lpm.h),
+  [ACL]                (@ref rte_acl.h)
 
 - **QoS**:
   [metering]           (@ref rte_meter.h),
diff --git a/doc/doxy-api.conf b/doc/doxy-api.conf
index 9df7356..2be4b1a 100644
--- a/doc/doxy-api.conf
+++ b/doc/doxy-api.conf
@@ -45,7 +45,8 @@ INPUT                   = doc/doxy-api-index.md \
                           lib/librte_power \
                           lib/librte_ring \
                           lib/librte_sched \
-                          lib/librte_timer
+                          lib/librte_timer \
+                          lib/librte_acl
 FILE_PATTERNS           = rte_*.h \
                           cmdline.h
 PREDEFINED              = __DOXYGEN__ \
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCHv3 0/5] ACL library
  2014-06-13 11:26 [dpdk-dev] [PATCHv3 0/5] ACL library Konstantin Ananyev
                   ` (4 preceding siblings ...)
  2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 5/5] acl: add doxygen configuration and start page Konstantin Ananyev
@ 2014-06-13 11:56 ` Thomas Monjalon
  2014-06-13 12:02   ` Ananyev, Konstantin
  5 siblings, 1 reply; 11+ messages in thread
From: Thomas Monjalon @ 2014-06-13 11:56 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev

Hi Konstantin,

> Konstantin Ananyev (5):
>   Add ACL library (librte_acl) into DPDK
>   acl: update UT to reflect latest changes in the librte_acl
>   acl: New test-acl application
>   acl: New sample l3fwd-acl
>   acl: add doxygen configuration and start page
> 
> v2 fixes:
> * Fixed several checkpatch.pl issues
> * Added doxygen related changes
> 
> v3 fixes:
> * Fixed even more checkpatch.pl issues

Sorry to bother you but after checking v3,
I think these errors should be avoided:
	ERROR: do not use assignment in if condition
	ERROR: return is not a function, parentheses are not required
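
For illustration, here is one instance from patch 4/5 (parse_ipv6_net)
and a sketch of the checkpatch-clean form:

	/* as written in the patch */
	if ((rc = parse_ipv6_addr(in, &mp, v, '/')) != 0)
		return (rc);

	/* checkpatch-clean equivalent */
	rc = parse_ipv6_addr(in, &mp, v, '/');
	if (rc != 0)
		return rc;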

Thanks
-- 
Thomas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCHv3 0/5] ACL library
  2014-06-13 11:56 ` [dpdk-dev] [PATCHv3 0/5] ACL library Thomas Monjalon
@ 2014-06-13 12:02   ` Ananyev, Konstantin
  2014-06-13 23:38     ` Thomas Monjalon
  0 siblings, 1 reply; 11+ messages in thread
From: Ananyev, Konstantin @ 2014-06-13 12:02 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Hi Thomas,
These changes would require a lot of changes inside the ACL library.
As I said in the cover letter, the ACL library was part of IPL code for a while (nearly a year).
I don't really want to make significant changes in it just before the release without a really good reason, such as bugs being found.
Thanks
Konstantin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCHv3 0/5] ACL library
  2014-06-13 12:02   ` Ananyev, Konstantin
@ 2014-06-13 23:38     ` Thomas Monjalon
  2014-06-14  0:07       ` Richardson, Bruce
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Monjalon @ 2014-06-13 23:38 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

2014-06-13 12:02, Ananyev, Konstantin:
> 2014-06-13 13:56, Thomas Monjalon:
> > > Konstantin Ananyev (5):
> > >   Add ACL library (librte_acl) into DPDK
> > >   acl: update UT to reflect latest changes in the librte_acl
> > >   acl: New test-acl application
> > >   acl: New sample l3fwd-acl
> > >   acl: add doxygen configuration and start page
> > > 
> > > v2 fixes:
> > > * Fixed several checkpatch.pl issues
> > > * Added doxygen related changes
> > > 
> > > v3 fixes:
> > > * Fixed even more checkpatch.pl issues
> > 
> > Sorry to bother you but after checking v3,
> > 
> > I think these errors should be avoided:
> > 	ERROR: do not use assignment in if condition
> > 	ERROR: return is not a function, parentheses are not required
> 
> These changes would require a lot of changes inside the ACL library.
> As I said in the cover letter, the ACL library was part of IPL code
> for a while (nearly a year).
> I don't really want to make significant changes in it just before the
> release without a really good reason, such as bugs being found.

I've made the code style changes, moved some configuration lines and added
it to the BSD build (not tested).
As it was previously acked and tested,
it is now applied for version 1.7.0.

Thanks
-- 
Thomas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCHv3 0/5] ACL library
  2014-06-13 23:38     ` Thomas Monjalon
@ 2014-06-14  0:07       ` Richardson, Bruce
  2014-06-14  7:55         ` Thomas Monjalon
  0 siblings, 1 reply; 11+ messages in thread
From: Richardson, Bruce @ 2014-06-14  0:07 UTC (permalink / raw)
  To: Thomas Monjalon, Ananyev, Konstantin; +Cc: dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Friday, June 13, 2014 4:38 PM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCHv3 0/5] ACL library
> 
> 2014-06-13 12:02, Ananyev, Konstantin:
> > 2014-06-13 13:56, Thomas Monjalon:
> > > > Konstantin Ananyev (5):
> > > >   Add ACL library (librte_acl) into DPDK
> > > >   acl: update UT to reflect latest changes in the librte_acl
> > > >   acl: New test-acl application
> > > >   acl: New sample l3fwd-acl
> > > >   acl: add doxygen configuration and start page
> > > >
> > > > v2 fixes:
> > > > * Fixed several checkpatch.pl issues
> > > > * Added doxygen related changes
> > > >
> > > > v3 fixes:
> > > > * Fixed even more checkpatch.pl issues
> > >
> > > Sorry to bother you but after checking v3,
> > >
> > > I think these errors should be avoided:
> > > 	ERROR: do not use assignment in if condition
> > > 	ERROR: return is not a function, parentheses are not required
> >
> > These changes would require a lot of changes inside ACL library.
> > As I said in cover letter, ACL library was part of IPL code for a while
> > (nearly a year).
> > I don't really want to make significant changes in it just before the
> > release without really good reason - bugs found.
> 
> I've made the code style changes, moved some configuration lines and added
> it to the BSD build (not tested).
> As it was previously acked and tested,
> it is now applied for version 1.7.0.
> 
I'd be a bit wary about adding it to the BSD build. I'm only running BSD in a VM here, but there GCC fails to recognise that the processor supports the SSE4 instruction sets when using "-march=native", and so fails to compile the vector code - at least the vectorized PMD functions. That's why I've explicitly disabled those in the latest version of the vector PMD patch, and I would suggest doing the same for the ACL code for now.

Interestingly enough, in the same VM clang does seem to recognise SSE4. Output from both compilers is below. Could anyone with a physical box running BSD confirm whether this issue is localised to VMs?

[bruce@BSD10-VM ~]$ gcc48 -dM -E -march=native - < /dev/null | grep SSE
#define __SSE2_MATH__ 1
#define __SSE_MATH__ 1
#define __SSE2__ 1
#define __SSSE3__ 1
#define __SSE__ 1
#define __SSE3__ 1
[bruce@BSD10-VM ~]$ clang -dM -E -march=native - < /dev/null | grep SSE
#define __SSE2_MATH__ 1
#define __SSE2__ 1
#define __SSE3__ 1
#define __SSE4_1__ 1
#define __SSE4_2__ 1
#define __SSE_MATH__ 1
#define __SSE__ 1
#define __SSSE3__ 1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCHv3 0/5] ACL library
  2014-06-14  0:07       ` Richardson, Bruce
@ 2014-06-14  7:55         ` Thomas Monjalon
  0 siblings, 0 replies; 11+ messages in thread
From: Thomas Monjalon @ 2014-06-14  7:55 UTC (permalink / raw)
  To: Richardson, Bruce; +Cc: dev

2014-06-14 00:07, Richardson, Bruce:
> From: Thomas Monjalon
> > I've made the code style changes, moved some configuration lines and added
> > it to the BSD build (not tested).
> > 
> I'd be a bit wary about adding it to the BSD build. I'm only running BSD in
> a VM here, but there GCC fails to recognise the processor supports SSE4
> instruction sets when using "-march=native", and so fails to compile the
> vector code - at least the vectorized PMD functions. That's why I've
> explicitly disabled those in the latest version of the vector PMD patch,
> and I would suggest doing the same for the ACL code for now.

Actually, the configuration option didn't exist for BSD so I added it.
It's better to explicitly disable it with an explanation in git history.
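
For the record, the disable would look something like this (the defconfig
file name and exact option spelling are from memory, so treat them as
assumptions):

	# config/defconfig_x86_64-native-bsdapp-gcc
	CONFIG_RTE_LIBRTE_ACL=n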

Thanks
-- 
Thomas

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-06-14  7:55 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-13 11:26 [dpdk-dev] [PATCHv3 0/5] ACL library Konstantin Ananyev
2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 1/5] Add ACL library (librte_acl) into DPDK Konstantin Ananyev
2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 2/5] acl: update UT to reflect latest changes in the librte_acl Konstantin Ananyev
2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 3/5] acl: New test-acl application Konstantin Ananyev
2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 4/5] acl: New sample l3fwd-acl Konstantin Ananyev
2014-06-13 11:26 ` [dpdk-dev] [PATCHv3 5/5] acl: add doxygen configuration and start page Konstantin Ananyev
2014-06-13 11:56 ` [dpdk-dev] [PATCHv3 0/5] ACL library Thomas Monjalon
2014-06-13 12:02   ` Ananyev, Konstantin
2014-06-13 23:38     ` Thomas Monjalon
2014-06-14  0:07       ` Richardson, Bruce
2014-06-14  7:55         ` Thomas Monjalon
